Results 1 - 8 of 8
Truth in Advertising: The Hidden Cost of Mobile Ads for Software Developers
"... Abstract—The “free app ” distribution model has been ex-tremely popular with end users and developers. Developers use mobile ads to generate revenue and cover the cost of developing these free apps. Although the apps are ostensibly free, they in fact do come with hidden costs. Our study of 21 real w ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
(Show Context)
Abstract—The “free app” distribution model has been extremely popular with end users and developers. Developers use mobile ads to generate revenue and cover the cost of developing these free apps. Although the apps are ostensibly free, they in fact do come with hidden costs. Our study of 21 real-world Android apps shows that the use of ads leads to mobile apps that consume significantly more network data, have increased energy consumption, and require repeated changes to ad-related code. We also found that complaints about these hidden costs are significant and can impact the ratings given to an app. Our results provide actionable information and guidance to software developers in weighing the tradeoffs of incorporating ads into their mobile apps. Index Terms—Mobile advertisements, mobile devices
The App Sampling Problem for App Store Mining
- In Working Conference on Mining Software Repositories
"... Abstract—Many papers on App Store Mining are susceptible to the App Sampling Problem, which exists when only a subset of apps are studied, resulting in potential sampling bias. We introduce the App Sampling Problem, and study its effects on sets of user review data. We investigate the effects of sam ..."
Abstract
-
Cited by 4 (2 self)
- Add to MetaCart
Abstract—Many papers on App Store Mining are susceptible to the App Sampling Problem, which exists when only a subset of apps are studied, resulting in potential sampling bias. We introduce the App Sampling Problem, and study its effects on sets of user review data. We investigate the effects of sampling bias, and techniques for its amelioration in App Store Mining and Analysis, where sampling bias is often unavoidable. We mine 106,891 requests from 2,729,103 user reviews and investigate the properties of apps and reviews from 3 different partitions: the sets with fully complete review data, partially complete review data, and no review data at all. We find that app metrics such as price, rating, and download rank are significantly different between the three completeness levels. We show that correlation analysis can find trends in the data that prevail across the partitions, offering one possible approach to App Store Analysis in the presence of sampling bias.
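To illustrate the kind of correlation analysis the abstract refers to, here is a minimal sketch that checks whether the relationship between two app metrics holds within each review-completeness partition. The partition names, metric names, and toy records are assumptions for illustration, not data from the study.

```python
# Minimal sketch (not the study's scripts): test whether a correlation between
# two app metrics prevails across the review-data completeness partitions.
# The partitions, metric names, and records below are hypothetical.
from scipy.stats import spearmanr

apps = [
    # (partition, rating, download_rank)
    ("complete", 4.5, 12), ("complete", 3.9, 85), ("complete", 4.1, 40),
    ("complete", 4.7, 8), ("complete", 3.6, 120),
    ("partial", 4.2, 30), ("partial", 3.5, 210), ("partial", 4.0, 65),
    ("partial", 3.2, 330), ("partial", 4.4, 25),
    ("none", 3.0, 540), ("none", 3.8, 160), ("none", 2.9, 700),
    ("none", 3.4, 410), ("none", 3.9, 140),
]

for partition in ("complete", "partial", "none"):
    ratings, ranks = zip(*[(r, d) for p, r, d in apps if p == partition])
    rho, p_value = spearmanr(ratings, ranks)
    print(f"{partition:9s} rho={rho:+.2f} p={p_value:.3f}")
```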
Mining Android App Usages for Generating Actionable GUI-Based Execution Scenarios
- In 12th Working Conference on Mining Software Repositories (MSR’15), 2015
"... Abstract—GUI-based models extracted from Android app execution traces, events, or source code can be extremely useful for challenging tasks such as the generation of scenarios or test cases. However, extracting effective models can be an expensive process. Moreover, existing approaches for automatic ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
(Show Context)
Abstract—GUI-based models extracted from Android app execution traces, events, or source code can be extremely useful for challenging tasks such as the generation of scenarios or test cases. However, extracting effective models can be an expensive process. Moreover, existing approaches for automatically deriving GUI-based models are not able to generate scenarios that include events which were not observed in execution (nor event) traces. In this paper, we address these and other major challenges in our novel hybrid approach, coined as MONKEYLAB. Our approach is based on the Record→Mine→Generate→Validate framework, which relies on recording app usages that yield execution (event) traces, mining those event traces and generating execution scenarios using statistical language modeling, static and dynamic analyses, and validating the resulting scenarios using an interactive execution of the app on a real device. The framework aims at mining models capable of generating feasible and fully replayable (i.e., actionable) scenarios reflecting either natural user behavior or uncommon usages (e.g., corner cases) for a given app. We evaluated MONKEYLAB in a case study involving several medium-to-large open-source Android apps. Our results demonstrate that MONKEYLAB is able to mine GUI-based models that can be used to generate actionable execution scenarios for both natural and unnatural sequences of events on Google Nexus 7 tablets. Index Terms—GUI models, mobile apps, mining execution traces and event logs, statistical language models
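As a rough illustration of the statistical-language-modeling idea behind scenario generation (a sketch only, not the MONKEYLAB implementation), the snippet below fits a bigram model over recorded GUI event tokens and samples a new event sequence from it. The event names and example traces are invented.

```python
# Minimal sketch (not MONKEYLAB itself): a bigram language model over recorded
# GUI event tokens, sampled to produce a new execution scenario.
# The event names and the example traces are hypothetical.
import random
from collections import Counter, defaultdict

traces = [
    ["open_app", "tap_login", "type_user", "type_pass", "tap_submit"],
    ["open_app", "tap_menu", "tap_settings", "toggle_wifi", "back"],
    ["open_app", "tap_login", "type_user", "tap_submit"],
]

# Count event bigrams, with <s> and </s> marking scenario start and end.
bigrams = defaultdict(Counter)
for trace in traces:
    for prev, nxt in zip(["<s>"] + trace, trace + ["</s>"]):
        bigrams[prev][nxt] += 1

def sample_scenario(max_len=10, seed=1):
    """Sample one event sequence from the bigram model."""
    rng = random.Random(seed)
    event, scenario = "<s>", []
    for _ in range(max_len):
        nxt_counts = bigrams[event]
        event = rng.choices(list(nxt_counts), weights=list(nxt_counts.values()))[0]
        if event == "</s>":
            break
        scenario.append(event)
    return scenario

print(sample_scenario())
```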
Revisiting the Impact of Classification Techniques on the Performance of Defect Prediction Models
"... Abstract—Defect prediction models help software quality as-surance teams to effectively allocate their limited resources to the most defect-prone software modules. A variety of classification techniques have been used to build defect prediction models ranging from simple (e.g., logistic regression) ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Abstract—Defect prediction models help software quality assurance teams to effectively allocate their limited resources to the most defect-prone software modules. A variety of classification techniques have been used to build defect prediction models, ranging from simple (e.g., logistic regression) to advanced techniques (e.g., Multivariate Adaptive Regression Splines (MARS)). Surprisingly, recent research on the NASA dataset suggests that the performance of a defect prediction model is not significantly impacted by the classification technique that is used to train it. However, the dataset that is used in the prior study is both (a) noisy, i.e., contains erroneous entries, and (b) biased, i.e., only contains software developed in one setting. Hence, we set out to replicate this prior study in two experimental settings. First, we apply the replicated procedure to the same (known-to-be noisy) NASA dataset, where we derive similar results to the prior study, i.e., the impact that classification techniques have appears to be minimal. Next, we apply the replicated procedure to two new datasets: (a) the cleaned version of the NASA dataset and (b) the PROMISE dataset, which contains open source software developed in a variety of settings (e.g., Apache, GNU). The results on these new datasets show a clear, statistically distinct separation of groups of techniques, i.e., the choice of classification technique has an impact on the performance of defect prediction models. Indeed, contrary to earlier research, our results suggest that some classification techniques tend to produce defect prediction models that outperform others.
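For readers who want the flavor of such a comparison, the following sketch trains two of the named techniques on a synthetic, imbalanced "defect" dataset and compares cross-validated AUC. This is a toy setup under assumed data; the actual replication uses the NASA and PROMISE datasets and a much larger set of learners.

```python
# Minimal sketch (not the replication study's scripts): compare two
# classification techniques on a synthetic defect dataset via 10-fold AUC.
# The dataset is generated, not real module metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in for module metrics (e.g., size, complexity) and defect labels,
# with roughly 15% defective modules.
X, y = make_classification(n_samples=500, n_features=20, weights=[0.85],
                           random_state=42)

learners = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

for name, clf in learners.items():
    aucs = cross_val_score(clf, X, y, cv=10, scoring="roc_auc")
    print(f"{name:20s} mean AUC = {aucs.mean():.3f} +/- {aucs.std():.3f}")
```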
Examining the Relationship between FindBugs Warnings and End User Ratings: A Case Study on 10,000 Android Apps
"... In the mobile app ecosystem, end user ratings of apps (a mea-sure of end user perception) are extremely important to study as they are highly correlated with downloads and hence revenues. In this study we examine the relationship between the app ratings (and associated review-comments) from end user ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
(Show Context)
In the mobile app ecosystem, end user ratings of apps (a measure of end user perception) are extremely important to study, as they are highly correlated with downloads and hence revenues. In this study we examine the relationship between the app ratings (and associated review-comments) from end users and the static analysis warnings (collected using FindBugs) from 10,000 free-to-download Android apps. In our case study, we find that specific categories of FindBugs warnings, such as the ‘Bad Practice’, ‘Internationalization’, and ‘Performance’ categories, are found significantly more in low-rated apps. We also find that there exists a correspondence between these three categories of warnings and the complaints in the review-comments of end users. These findings provide evidence that certain categories of warnings from FindBugs have a strong relationship with the rating of an app and hence are closely related to the user experience. Thus app developers can use static analysis tools such as FindBugs to potentially identify the culprit bugs behind the issues that users complain about, before they release the app.
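A minimal sketch of how one could test whether a FindBugs warning category occurs significantly more often in low-rated apps, assuming per-app warning counts are already collected; the counts below are invented, and the non-parametric test is an assumption rather than the paper's exact procedure.

```python
# Minimal sketch (not the study's pipeline): one-sided Mann-Whitney U test on
# per-app warning counts for a single FindBugs category.
# The counts are hypothetical.
from scipy.stats import mannwhitneyu

# 'Performance'-category warning counts per app (invented).
low_rated_apps = [14, 9, 22, 7, 18, 11, 25, 16]
high_rated_apps = [3, 5, 1, 8, 2, 6, 4, 7]

stat, p_value = mannwhitneyu(low_rated_apps, high_rated_apps,
                             alternative="greater")
print(f"U={stat:.1f}, p={p_value:.4f}")
```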
Studying the Consistency of Star Ratings and the Complaints in 1 & 2-Star User Reviews for Top Free Cross-Platform Android and iOS Apps
"... How users rate a mobile app via star ratings and user reviews is of utmost importance for the success of an app. Recent studies and surveys show that users rely heavily on star ratings and user reviews that are provided by other users, for deciding which app to download. However, understanding star ..."
Abstract
- Add to MetaCart
(Show Context)
How users rate a mobile app via star ratings and user reviews is of utmost importance for the success of an app. Recent studies and surveys show that users rely heavily on star ratings and user reviews that are provided by other users, for deciding which app to download. However, understanding star ratings and user reviews is a complicated matter, since they are influenced by many factors, such as the actual quality of the app and how the user perceives such quality relative to their expectations, which are in turn influenced by their prior experiences and expectations relative to other apps on the platform (e.g., iOS versus Android). Nevertheless, star ratings and user reviews provide developers with valuable information for improving the software quality of their app. In an effort to expand their revenue and reach more users, app developers commonly build cross-platform apps, i.e., apps that are available on multiple platforms. As star ratings and user reviews are of such importance in the mobile app industry, it is essential for developers of cross-platform apps to maintain a consistent level of star ratings and user reviews for their apps across the various platforms on which they are available. In this paper, we investigate whether cross-platform apps achieve a consistent level of star ratings and user reviews. We manually identify 19 cross-platform apps and conduct an empirical study on their star ratings and user reviews. By manually tagging 9,902 1 & 2-star reviews of the studied cross-platform apps, we discover that the distribution of the frequency of complaint types varies across platforms. Finally, we study the negative impact ratio of complaint types and find that for some apps, users have higher expectations on one platform. All our proposed techniques and our methodologies are generic and can be used for any app. Our findings show that at least 68% of the studied cross-platform apps do not have consistent star ratings, which suggests that different quality assurance efforts need to be considered by developers for the different platforms that they wish to support.
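As an illustration of comparing complaint-type frequency distributions across platforms (a sketch under assumed counts, not the paper's own analysis or its negative impact ratio metric), one could run a chi-square test of independence on a platform-by-complaint-type contingency table:

```python
# Minimal sketch: does the distribution of complaint types differ between
# the two platforms? The complaint types and counts are invented.
from scipy.stats import chi2_contingency

complaint_types = ["functional error", "crash", "feature request", "price"]
counts = [
    [120, 45, 60, 10],  # Android 1 & 2-star reviews per complaint type
    [80, 70, 30, 25],   # iOS 1 & 2-star reviews per complaint type
]

chi2, p_value, dof, _ = chi2_contingency(counts)
print(f"chi2={chi2:.1f}, dof={dof}, p={p_value:.4f}")
```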
Analyzing and Automatically Labelling the Types of User Issues that are Raised in Mobile App Reviews
"... the date of receipt and acceptance should be inserted later Abstract Mobile app reviews by users contain a wealth of information on the issues that users are experiencing. For example, a review might contain a feature request, a bug report, and/or a privacy complaint. Developers, users and app store ..."
Abstract
- Add to MetaCart
(Show Context)
Mobile app reviews by users contain a wealth of information on the issues that users are experiencing. For example, a review might contain a feature request, a bug report, and/or a privacy complaint. Developers, users and app store owners (e.g., Apple, Blackberry, Google, Microsoft) can benefit from a better understanding of these issues – developers can better understand users’ concerns, app store owners can spot anomalous apps, and users can compare similar apps to decide which ones to download or purchase. However, user reviews are not labelled, i.e., we do not know which types of issues are raised in a review. Hence, one must sift through potentially thousands of reviews with slang and abbreviations to understand the various types of issues. Moreover, the unstructured and informal nature of reviews complicates the automated labelling of such reviews. In this paper, we study the multi-labelled nature of reviews from 20 mobile apps in the Google Play Store and Apple App Store. We find that up to 30% of the reviews raise various types of issues in a single review (e.g., a review might contain a feature request and a bug report). We then propose an approach that can automatically assign multiple labels to reviews based on the raised issues, with a precision of 66% and a recall of 65%. Finally, we apply our approach to address three analytics proof-of-concept use case scenarios: (i) we compare competing apps to assist developers and users, (ii) we provide an overview of 601,221 reviews from 12,000 apps in the Google Play Store to assist app store owners and developers, and (iii) we detect anomalous apps in the Google Play Store to assist app store owners and users.
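The abstract does not spell out its labelling technique, so the sketch below only illustrates the general shape of multi-label review classification: a TF-IDF representation with one binary classifier per issue type. The reviews, labels, and the choice of logistic regression are assumptions for illustration, not the authors' approach.

```python
# Minimal sketch: multi-label classification of review text with TF-IDF
# features and one binary classifier per label. All data here is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

reviews = [
    "app crashes every time i open it, please fix",
    "would love a dark mode, otherwise great",
    "crashes after the update and now it asks for my contacts, creepy",
    "too many ads, and it keeps freezing",
]
labels = [
    {"bug report"},
    {"feature request"},
    {"bug report", "privacy complaint"},
    {"bug report", "ads complaint"},
]

binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(labels)  # one binary column per issue type

model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(reviews, y)

predicted = model.predict(["it crashed and deleted my data"])
print(binarizer.inverse_transform(predicted))
```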
What Are the Characteristics of High-Rated Apps? A Case Study on Free Android Applications
"... Abstract—The tremendous rate of growth in the mobile app market over the past few years has attracted many developers to build mobile apps. However, while there is no shortage of stories of how lone developers have made great fortunes from their apps, the majority of developers are struggling to bre ..."
Abstract
- Add to MetaCart
(Show Context)
Abstract—The tremendous rate of growth in the mobile app market over the past few years has attracted many developers to build mobile apps. However, while there is no shortage of stories of how lone developers have made great fortunes from their apps, the majority of developers are struggling to break even. For those struggling developers, knowing the “DNA” (i.e., characteristics) of high-rated apps is the first step towards successful development and evolution of their apps. In this paper, we investigate 28 factors along eight dimensions to understand how high-rated apps are different from low-rated apps. We also investigate which factors are the most influential by applying a random-forest classifier to identify high-rated apps. Through a case study on 1,492 high-rated and low-rated free apps mined from the Google Play store, we find that high-rated apps are statistically significantly different in 17 out of the 28 factors that we considered. Our experiment also shows that the size of an app, the number of promotional images that the app displays on its web store page, and the target SDK version of an app are the most influential factors.
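To make the random-forest step concrete, here is a minimal sketch that fits a classifier on three of the factors the abstract names and prints their importances. The values and the label rule are synthetic; the real study measures 28 factors across 1,492 apps.

```python
# Minimal sketch (not the paper's experiment): random-forest feature
# importances over a few app-level factors. All values are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200
factors = {
    "apk_size_mb": rng.uniform(1, 100, n),
    "num_promo_images": rng.integers(0, 9, n),
    "target_sdk_version": rng.integers(15, 26, n),
}
X = np.column_stack(list(factors.values()))
# Toy label: pretend larger, better-promoted apps tend to be high rated.
y = (factors["apk_size_mb"] / 100 + factors["num_promo_images"] / 8
     + rng.normal(0, 0.3, n)) > 1.0

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
for name, importance in zip(factors, clf.feature_importances_):
    print(f"{name:18s} {importance:.3f}")
```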