Mobile User Behavior Analysis
This project aims to analyze mobile user behavior by exploring various factors influencing app usage, device performance, and user demographics. The dataset utilized contains comprehensive information regarding users, including device models, operating systems, app usage time, screen on time, battery drain, data usage, age, gender, and user behavior classification.
See the full source code on GitHub: github.com/aelluminate/mobile-user-behavior-analysis
Objectives
The primary objectives of this project are:
To explore the distribution of key behavioral metrics among users.
To conduct comparative analysis between different device models and operating systems.
To analyze user behavior segmentation based on demographics, specifically age and gender.
To derive insights about the relationships between app usage, battery performance, and demographic features.
The Data
This dataset records mobile device usage patterns together with a user behavior classification. It contains 700 samples of user data, including metrics such as app usage time, screen-on time, battery drain, and data consumption. Each entry is categorized into one of five user behavior classes, ranging from light to extreme usage, which makes the data suitable for both descriptive analysis and modeling.
The dataset contains the following columns:
User ID: Unique identifier for each user.
Device Model: Model of the user's smartphone.
Operating System: The OS of the device (iOS or Android).
App Usage Time: Daily time spent on mobile applications, measured in minutes.
Screen On Time: Average hours per day the screen is active.
Battery Drain: Daily battery consumption in mAh.
Number of Apps Installed: Total apps available on the device.
Data Usage: Daily mobile data consumption in megabytes.
Age: Age of the user.
Gender: Gender of the user (Male or Female).
User Behavior Class: Classification of user behavior based on usage patterns (1 to 5).
This dataset is ideal for researchers, data scientists, and analysts interested in understanding mobile user behavior and developing predictive models in the realm of mobile technology and applications.
Methodology
Data Importation and Preprocessing: Libraries like Pandas, NumPy, Matplotlib, and Seaborn are utilized for data manipulation and visualization. Initial exploration includes reading the dataset, checking for missing values, and dropping unnecessary columns (e.g., User ID).
Exploratory Data Analysis (EDA):
The dataset is inspected to understand its structure and summarize statistics.
A distribution analysis is conducted for numerical features to visualize how metrics like app usage time, battery drain, and data usage vary.
Data Visualization:
Distribution Plots: Individual visualizations are created for the distributions of key metrics.
Correlation Analysis: A correlation matrix is generated and visualized to identify relationships between numeric variables.
Comparative Analysis: Bar plots illustrate the differences in metrics across device models and operating systems, providing insights into performance variations.
Segmentation Analysis: Metrics are grouped by age and gender to compare average app usage and battery drain across demographics.
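The import and preprocessing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the rows are made up, the column names with units are assumed from the plot titles in this document, and the CSV file name in the comment is an assumption.

```python
import pandas as pd

# In the real project the data would be read from disk, for example:
# df = pd.read_csv("user_behavior_dataset.csv")  # file name is an assumption
# Hypothetical rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Device Model": ["Model A", "Model B", "Model A"],
    "Operating System": ["Android", "iOS", "Android"],
    "App Usage Time (min/day)": [120, 310, 45],
    "Battery Drain (mAh/day)": [900, 2100, 400],
    "Age": [25, 41, 33],
    "Gender": ["Female", "Male", "Female"],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Report columns with missing values, then drop the identifier column."""
    missing_per_column = df.isna().sum()
    if missing_per_column.sum() > 0:
        # Show only the columns that actually contain missing values.
        print(missing_per_column[missing_per_column > 0])
    # "User ID" carries no behavioral information, so it is removed.
    return df.drop(columns=["User ID"], errors="ignore")


clean = preprocess(sample)
print(clean.describe())  # summary statistics for the numeric columns
```

Passing errors="ignore" keeps the function safe to call twice on the same frame, which is convenient when cells are re-run in a notebook.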
Visualizations
Distribution Analysis
Distribution of App Usage Time (min/day)
This histogram depicts the distribution of app usage time. The x-axis represents app usage time in minutes per day, ranging from 0 to 600, and the y-axis shows the number of users who fall within each range.
The densest region of the graph, where most of the data points are concentrated, is between 0 and 100 minutes. This indicates that a substantial portion of users spend less than 100 minutes per day in apps. The graph tapers off after 200 minutes, meaning that fewer users spend extended periods in apps.
Overall, this graph suggests that most users accumulate relatively little app time per day. Because the histogram shows daily totals, it cannot by itself tell us whether that time comes from short bursts or from longer continuous sessions.
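To make the axes concrete, the sketch below computes the numbers a histogram plots: bin edges along the x-axis and user counts along the y-axis. The usage values and the choice of six 100-minute bins are assumptions for illustration; they are not the real data or the binning used in the project's plot.

```python
import numpy as np

# Made-up daily app usage values in minutes, for illustration only.
usage_minutes = np.array([30, 45, 60, 95, 110, 180, 240, 320, 410, 590])

# Six equal-width bins covering 0 to 600 minutes (bin width is an assumption).
counts, edges = np.histogram(usage_minutes, bins=6, range=(0, 600))

# Each bar of the histogram is one (bin range, count) pair.
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>4.0f} to {right:<4.0f} min: {n} users")
```

With these sample values the first bin holds the most users, which is the same shape the text describes for the real distribution.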
Distribution of Screen On Time (hours/day)
This histogram provides a visual representation of the distribution of daily screen time among our users. The x-axis shows the screen time in hours per day, while the y-axis indicates the frequency, or the number of users who fall within each screen time range.
As you can see, the majority of our users spend between 2 and 4 hours on their screens each day. This suggests that a significant portion of our user base has a moderate level of screen time. However, there is a smaller group of users who spend significantly more time on their screens, with a few individuals exceeding 10 hours per day.
The overall shape of the distribution is somewhat skewed to the right, indicating that there is a tail of users with exceptionally high screen time. This might suggest that a small percentage of our user base is heavily reliant on screens for their daily activities.
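Right skew can also be checked numerically rather than read off the plot. The sketch below uses made-up screen-on values (not the real data) and the pandas sample skewness; a positive value, together with a mean above the median, is consistent with a tail of heavy users.

```python
import pandas as pd

# Made-up screen-on times in hours per day, for illustration only.
screen_hours = pd.Series([2, 2, 3, 3, 3, 4, 4, 5, 11])

skewness = screen_hours.skew()      # sample skewness; positive means right skew
mean_hours = screen_hours.mean()
median_hours = screen_hours.median()

print(f"skewness={skewness:.2f}, mean={mean_hours:.2f}, median={median_hours:.2f}")
```

The same check could be applied to any of the numeric columns discussed in this section.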
Distribution of Battery Drain (mAh/day)
This histogram depicts the distribution of battery drain among our devices, measured in milliampere-hours per day (mAh/day). The x-axis represents the battery drain, while the y-axis indicates the frequency, or the number of devices that fall within each battery drain range.
As we can see, the majority of our devices have a battery drain between 500 and 1000 mAh/day. This suggests that a significant portion of our devices have a moderate level of battery consumption. However, there is a smaller group of devices that drain significantly more battery, with a few exceeding 2500 mAh/day.
The overall shape of the distribution is slightly skewed to the right, indicating that there is a tail of devices with exceptionally high battery drain. This might suggest that a small percentage of our devices are using more power-intensive applications or have hardware issues that are leading to excessive battery consumption.
Distribution of Number of Apps Installed
This histogram illustrates the distribution of the number of apps installed on our users' devices. The x-axis represents the number of apps installed, while the y-axis indicates the frequency, or the number of devices that fall within each app installation range.
As we can see, the majority of our users have between 0 and 20 apps installed on their devices. This suggests that a significant portion of our user base has a relatively small number of apps. However, there is a smaller group of users with a much higher number of apps installed, with some devices having over 80 apps.
The overall shape of the distribution is slightly skewed to the right, indicating that there is a tail of devices with exceptionally high app installations. This might suggest that a small percentage of our user base is using a large number of apps, potentially for specific purposes or due to personal preferences.
Distribution of Data Usage (MB/day)
This histogram illustrates the distribution of daily data usage among our users, measured in megabytes per day (MB/day). The x-axis represents the data usage, while the y-axis indicates the frequency, or the number of users that fall within each data usage range.
As we can see, the majority of our users have a data usage between 0 and 500 MB/day. This suggests that a significant portion of our user base has a relatively low level of data consumption. However, there is a smaller group of users with significantly higher data usage, with some exceeding 2000 MB/day.
The overall shape of the distribution is skewed to the right, indicating that there is a tail of users with exceptionally high data usage. This might suggest that a small percentage of our user base is using data-intensive applications or services, such as streaming video or downloading large files.
Distribution of Age
This histogram depicts the distribution of age among our user base. The x-axis represents the age, while the y-axis indicates the frequency, or the number of users that fall within each age range.
As we can see, the majority of our users are between 20 and 30 years old. This suggests that our user base is primarily composed of younger adults. However, there is a significant portion of users who are older than 30, with a noticeable peak around 50 years old.
The overall shape of the distribution is somewhat skewed to the right, indicating that there is a tail of users who are older than the majority. This might suggest that our user base is not exclusively focused on a single age group, but rather includes a diverse range of age demographics.
Correlation Analysis
This correlation matrix visually represents the relationships between various user metrics, including app usage time, screen on time, battery drain, number of apps installed, data usage, age, and user behavior class. A positive correlation indicates that two variables tend to increase or decrease together, while a negative correlation suggests an inverse relationship. The strength of the correlation is measured by the absolute value of the coefficient, with values closer to 1 or -1 representing stronger correlations.
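A correlation matrix of this kind can be computed along the following lines. This is a sketch under stated assumptions: the rows are invented, the column names are assumed from the plot titles, and Pearson correlation (the pandas default) is assumed since the source does not name the method.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "App Usage Time (min/day)": [50, 120, 200, 310, 450, 520],
    "Battery Drain (mAh/day)": [400, 800, 1200, 1800, 2500, 2900],
    "Age": [30, 22, 45, 51, 19, 38],
    "Gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
})

# Keep numeric columns only; text columns such as Gender cannot be correlated.
numeric = df.select_dtypes(include="number")

# Pairwise Pearson correlation coefficients, each in the range [-1, 1].
corr = numeric.corr()
print(corr.round(2))
```

The resulting frame can be passed to a heatmap function such as Seaborn's heatmap to produce the visual form described here.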
Key Findings
Strong Positive Correlations
App Usage Time, Screen On Time, Battery Drain, Number of Apps Installed, and Data Usage: These metrics exhibit very strong positive correlations with each other. This suggests that users who spend more time on their devices tend to have higher battery drain, install more apps, and use more data.
Weak or No Correlation
Age with other metrics: Age shows no meaningful linear correlation with any of the other variables, indicating that device usage does not rise or fall systematically with age in this dataset.
User Behavior Class with Age: There is likewise no meaningful correlation between user behavior class and age, suggesting that usage intensity is not directly related to age.
Segmentation Analysis
Average App Usage Time (min/day) by Age and Gender
This bar plot illustrates the average daily app usage time across different age groups, segmented by gender. The x-axis represents ages from 18 to 59, while the y-axis shows the average usage time in minutes per day.
Looking at the distribution, we observe that usage patterns vary significantly across age groups and gender. The data reveals that daily engagement ranges from approximately 150 to 400 minutes (2.5 to 6.7 hours). Female users (shown in teal) demonstrate more consistent usage patterns across different age groups, while male users (shown in pink) exhibit more variable engagement levels.
Notably, there are several distinctive peaks in usage:
Early 20s show particularly high engagement for both genders
Age 34 marks a significant spike for female users, reaching approximately 400 minutes per day
Early 50s display another notable peak, particularly around age 53
The gender comparison reveals interesting patterns. Female users tend to maintain more stable usage levels across age groups, while male usage shows more pronounced fluctuations. This is particularly evident in the 47-50 age range, where male users consistently show higher engagement levels than their female counterparts.
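The averages behind a plot like this come from a two-level group-by. The sketch below uses invented rows and assumed column names to show the aggregation; it is not the project's own code.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "Age": [22, 22, 22, 34, 34, 34],
    "Gender": ["Female", "Male", "Female", "Female", "Male", "Male"],
    "App Usage Time (min/day)": [300, 250, 320, 400, 180, 220],
})

# Mean usage for every (age, gender) pair, pivoted so each gender is a column.
avg_usage = (
    df.groupby(["Age", "Gender"])["App Usage Time (min/day)"]
    .mean()
    .unstack("Gender")
)
print(avg_usage)
```

Each row of avg_usage corresponds to one age on the x-axis and each column to one bar color. With 700 samples spread over ages 18 to 59, some age and gender cells are likely to contain only a few users, so individual peaks should be read with caution.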
Average Battery Drain (mAh/day) by Age and Gender
This bar plot visualizes the average daily battery drain across different age groups, segmented by gender. The x-axis shows ages ranging from 18 to 59, while the y-axis displays the average battery drain in milliampere-hours (mAh) per day.
The distribution reveals that battery consumption patterns closely mirror the app usage patterns we observed earlier. Daily battery drain typically ranges from around 1,000 to 2,000 mAh across all segments. Female users (shown in teal) demonstrate more consistent battery consumption patterns, while male users (shown in pink) show more variable drain rates.
Several notable patterns emerge:
Peak battery consumption occurs in similar age brackets as peak app usage:
Early 20s show high battery drain for both genders
Age 34 shows maximum battery consumption for female users, reaching nearly 2,000 mAh per day
Early 50s display another significant peak in consumption
The lowest battery drain is generally seen in the mid-40s age range, dropping to around 850 mAh per day
Because battery drain tracks app usage time so closely, power consumption appears to scale fairly consistently with usage across segments. The data also points to opportunities for optimization, particularly for high-usage segments where battery life may be a crucial factor in user experience and retention.
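Statements such as "more consistent" or "more variable" can be quantified by measuring how much the age-level averages spread out for each gender. The sketch below does this with the standard deviation; the rows are invented, the column names are assumed, and the choice of standard deviation as the spread measure is an assumption rather than the project's stated method.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "Age": [20, 20, 34, 34, 45, 45],
    "Gender": ["Female", "Male", "Female", "Male", "Female", "Male"],
    "Battery Drain (mAh/day)": [1500, 1000, 1550, 1900, 1600, 1300],
})

# Average drain per (age, gender), one column per gender.
avg_drain = (
    df.groupby(["Age", "Gender"])["Battery Drain (mAh/day)"]
    .mean()
    .unstack("Gender")
)

# Standard deviation of the age-level averages: smaller means more consistent.
spread = avg_drain.std()
print(spread.round(1))
```

In this invented sample the Female column has the smaller spread; whether the same holds for the real data would need to be checked against the actual dataset.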
Results
This analysis can reveal significant insights:
User Usage Patterns: Understanding how different demographics interact with devices and apps.
Performance Insights: Identifying which devices perform better in terms of battery life and app efficiency.
Data-Driven Decision Making: Providing stakeholders with actionable insights that can guide product development, marketing strategies, and customer engagement efforts.
Conclusion
This project provides a comprehensive analysis of mobile user behavior, focusing on key metrics such as app usage time, battery drain, and data consumption. By exploring the dataset and visualizing the relationships between various features, we can derive valuable insights into user behavior patterns, device performance, and demographic segmentation. These insights can be leveraged to optimize app development, enhance user experience, and drive business growth in the mobile technology sector.