Predicting the popularity of online content
- Commun. ACM, 2010
Cited by 161 (7 self)
We present a method for accurately predicting the long-term popularity of online content from early measurements of user access. Using two content sharing portals, YouTube and Digg, we show that by modeling the accrual of views and votes on content offered by these services we can predict the long-term dynamics of individual submissions from initial data. In the case of Digg, measuring access to given stories during the first two hours allows us to forecast their popularity 30 days ahead with remarkable accuracy, while downloads of YouTube videos need to be followed for 10 days to attain the same performance. The differing time scales of the predictions are shown to be due to differences in how content is consumed on the two portals: Digg stories quickly become outdated, while YouTube videos are still found long after they are initially submitted to the portal. We show that predictions are more accurate for submissions for which attention decays quickly, whereas predictions for evergreen content will be prone to larger errors.
Statistics and Social Network of YouTube Videos
- in Proc. of IEEE IWQoS, 2008
Cited by 128 (11 self)
YouTube has become the most successful Internet website providing a new generation of short video sharing service since its establishment in early 2005. YouTube has a great impact on Internet traffic nowadays, yet it suffers from a severe scalability problem. Therefore, understanding the characteristics of YouTube and similar sites is essential to network traffic engineering and to their sustainable development. To this end, we have crawled the YouTube site for four months, collecting more than 3 million YouTube videos’ data. In this paper, we present a systematic and in-depth measurement study on the statistics of YouTube videos. We have found that YouTube videos have noticeably different statistics compared to traditional streaming videos, ranging from length and access pattern, to their growth trend and active life span. We investigate the social networking in YouTube videos, as this is a key driving force toward its success. In particular, we find that the links to related videos generated by uploaders’ choices have clear small-world characteristics. This indicates that the videos have strong correlations with each other, and creates opportunities for developing novel techniques to enhance the service quality.
Unveiling Facebook: A Measurement Study of Social Network Based Applications
Cited by 75 (3 self)
Online social networking sites such as Facebook and MySpace have become increasingly popular, with close to 500 million users as of August 2008. The introduction of the Facebook Developer Platform and OpenSocial allows third-party developers to launch their own applications for the existing massive user base. The viral growth of these social applications can potentially influence how content is produced and consumed in the future Internet. To gain a better understanding, we conducted a large-scale measurement study of the usage characteristics of online social network based applications. In particular, we developed and launched three Facebook applications, which have achieved a combined subscription base of over 8 million users. Using the rich dataset gathered through these applications, we analyze the aggregate workload characteristics (including temporal and geographical distributions) as well as the structure of user interactions. We explore the existence of ‘communities’, with a high degree of interaction within a community and limited interaction outside the community. We find that a small fraction of users account for the majority of activity within the context of our Facebook applications and a small number of applications account for the majority of users on Facebook. Furthermore, user response times for Facebook applications are independent of source/destination user locality. We also investigate distinguishing characteristics of social gaming applications. To the best of our knowledge, this is the first study analyzing user activities on online social applications.
Ostra: Leveraging trust to thwart unwanted communication
- In USENIX NSDI, 2008
Cited by 68 (6 self)
Online communication media such as email, instant messaging, bulletin boards, voice-over-IP, and social networking sites allow any sender to reach potentially millions of users at near zero marginal cost. This property enables information to be exchanged freely: anyone with Internet access can publish content. Unfortunately, the same property opens the door to unwanted communication, marketing, and propaganda. Examples include email spam, Web search engine spam, inappropriately labeled content on YouTube, and unwanted contact invitations in Skype. Unwanted communication wastes one of the most valuable resources in the information age: human attention. In this paper, we explore the use of trust relationships, such as social links, to thwart unwanted communication. Such relationships already exist in many application settings today. Our system, Ostra, bounds the total amount of unwanted communication a user can produce based on the number of trust relationships the user has, and relies on the fact that it is difficult for a user to create arbitrarily many trust relationships. Ostra is applicable to both messaging systems such as email and content-sharing systems such as YouTube. It does not rely on automatic classification of content, does not require global user authentication, respects each recipient’s idea of unwanted communication, and permits legitimate communication among parties who have not had prior contact. An evaluation based on data gathered from an online social networking site shows that Ostra effectively thwarts unwanted communication while not impeding legitimate communication.
Greening the Internet with Nano Data Centers
- Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, 2009
Cited by 68 (6 self)
Motivated by increased concern over energy consumption in modern data centers, we propose a new, distributed computing platform called Nano Data Centers (NaDa). NaDa uses ISP-controlled home gateways to provide computing and storage services and adopts a managed peer-to-peer model to form a distributed data center infrastructure. To evaluate the potential for energy savings in the NaDa platform we pick Video-on-Demand (VoD) services. We develop an energy consumption model for VoD in traditional and in NaDa data centers and evaluate this model using a large set of empirical VoD access data. We find that even under the most pessimistic scenarios, NaDa saves at least 20% to 30% of the energy compared to traditional data centers. These savings stem from energy-preserving properties inherent to NaDa such as the reuse of already committed baseline power on underutilized gateways, the avoidance of cooling costs, and the reduction of network energy consumption as a result of demand and service co-localization in NaDa.
YouTube everywhere: Impact of device and infrastructure synergies on user experience
- in Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, ser. IMC
Cited by 56 (2 self)
In this paper we present a complete measurement study that compares YouTube traffic generated by mobile devices (smartphones, tablets) with traffic generated by common PCs (desktops, notebooks, netbooks). We investigate the users’ behavior and correlate it with the system performance. Our measurements are performed using unique data sets which are collected from vantage points in nation-wide ISPs and University campuses from two countries in Europe and the U.S. Our results show that the user access patterns are similar across a wide range of user locations, access technologies and user devices. Users stick with default player configurations, e.g., not changing video resolution or rarely enabling full screen playback. Furthermore, it is very common that users abort video playback, with 60% of videos watched for no more than 20% of their duration. We show that the YouTube system is highly optimized for PC access and leverages aggressive buffering policies to guarantee excellent video playback. This, however, causes 25%-39% of data to be unnecessarily transferred, since users abort the playback very early. This waste of transferred data is even higher when mobile devices are considered. The limited storage offered by those devices makes the video download more complicated and overall less efficient, so that clients typically download more data than the actual video size. Overall, this result calls for better system optimization for both PC and mobile access.
Video stream quality impacts viewer behavior: Inferring causality using quasi-experimental designs
- in Proc. Internet Measurement Conf, 2012
Cited by 55 (3 self)
The distribution of videos over the Internet is drastically transforming how media is consumed and monetized. Content providers, such as media outlets and video subscription services, would like to ensure that their videos do not fail, start up quickly, and play without interruptions. In return for their investment in video stream quality, content providers expect less viewer abandonment, more viewer engagement, and a greater fraction of repeat viewers, resulting in greater revenues. The key question for a content provider or a CDN is whether and to what extent changes in video quality can cause changes in viewer behavior. Our work is the first to establish a causal relationship between video quality and viewer behavior, taking a step beyond purely correlational studies. To establish causality, we use Quasi-Experimental Designs, a novel technique adapted from the medical and social sciences. We study the impact of video stream quality on viewer behavior in a scientific data-driven manner by using extensive traces from Akamai's streaming network that include 23 million views from 6.7 million unique viewers. We show that viewers start to abandon a video if it takes more than 2 seconds to start up, with each incremental delay of 1 second resulting in a 5.8% increase in the abandonment rate. Further, we show that a moderate amount of interruptions can decrease the average play time of a viewer by a significant amount. A viewer who experiences a rebuffer delay equal to 1% of the video duration plays 5% less of the video in comparison to a similar viewer who experienced no rebuffering. Finally, we show that a viewer who experienced failure is 2.32% less likely to revisit the same site within a week than a similar viewer who did not experience a failure.
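As a rough illustration of the startup-delay figures quoted in this abstract, the sketch below extrapolates the added abandonment for a given startup delay. It assumes, purely for illustration, that the 5.8% figure means percentage points and that the trend stays linear beyond the 2-second threshold; the abstract does not say how far that trend holds, and the function name is hypothetical.

```python
# Figures taken from the abstract above.
THRESHOLD_S = 2.0      # viewers start abandoning after 2 seconds of startup delay
RATE_PER_SECOND = 5.8  # added abandonment per extra second (assumed percentage points)

def added_abandonment(startup_delay_s: float) -> float:
    """Added abandonment (in percentage points) under an assumed linear trend."""
    extra_seconds = max(0.0, startup_delay_s - THRESHOLD_S)
    return extra_seconds * RATE_PER_SECOND

# A 5-second startup is 3 seconds past the threshold: 3 * 5.8 points.
five_second_penalty = added_abandonment(5.0)
no_penalty = added_abandonment(1.5)
```

Under these assumptions a startup delay at or below 2 seconds adds nothing, and each further second adds the same fixed amount.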
Characteristics of YouTube network traffic at a campus network: Measurements, models, and implications
- Comput. Netw., 2009
Cited by 53 (2 self)
User-Generated Content has become very popular since new web services such as YouTube allow for the distribution of user-produced media content. YouTube-like services are different from existing traditional VoD services in that the service provider has only limited control over the creation of new content. We analyze how content distribution in YouTube is realized and then conduct a measurement study of YouTube traffic in a large university campus network. Based on these measurements, we analyzed the duration and the data rate of streaming sessions, the popularity of videos, and access patterns for video clips from the clients in the campus network. The analysis of the traffic shows that trace statistics are relatively stable over short-term periods while long-term trends can be observed. We demonstrate how synthetic traces can be generated from the measured traces and show how these synthetic traces can be used as inputs to trace-driven simulations. We also analyze the benefits of alternative distribution infrastructures to improve the performance of a YouTube-like VoD service. The results of these simulations show that P2P-based distribution and proxy caching can reduce network traffic significantly and allow for faster access to video clips.
Learning Social Tag Relevance by Neighbor Voting
Cited by 53 (10 self)
Social image analysis and retrieval is important for helping people organize and access the increasing amount of user-tagged multimedia. Since user tagging is known to be uncontrolled, ambiguous, and overly personalized, a fundamental problem is how to interpret the relevance of a user-contributed tag with respect to the visual content the tag is describing. Intuitively, if different persons label visually similar images using the same tags, these tags are likely to reflect objective aspects of the visual content. Starting from this intuition, we propose in this paper a neighbor voting algorithm which accurately and efficiently learns tag relevance by accumulating votes from visual neighbors. Under a set of well-defined and realistic assumptions, we prove that our algorithm is a good tag relevance measurement for both image ranking and tag ranking. Three experiments on 3.5 million Flickr photos demonstrate the general applicability of our algorithm in both social image retrieval and image tag suggestion. Our tag relevance learning algorithm substantially improves upon baselines for all the experiments. The results suggest that the proposed algorithm is promising for real-world applications. Index Terms: Social tagging, tag relevance learning, neighbor voting, multimedia indexing and retrieval.
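The neighbor voting idea in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy 2-D feature vectors, the Euclidean distance, the function name `tag_relevance`, and the subtraction of a random-neighbor prior are all assumptions made for the example.

```python
from math import dist

def tag_relevance(query_idx, features, tags, tag, k=3):
    """Votes for `tag` among the k visual nearest neighbors of one image,
    minus the votes expected if the k neighbors were drawn at random
    (the prior correction is an assumption of this sketch)."""
    query = features[query_idx]
    # Rank all other images by Euclidean distance to the query image.
    others = [i for i in range(len(features)) if i != query_idx]
    neighbors = sorted(others, key=lambda i: dist(query, features[i]))[:k]
    # Each neighbor that carries the tag casts one vote.
    votes = sum(1 for i in neighbors if tag in tags[i])
    # Expected votes from k images picked uniformly from the collection.
    prior = k * sum(1 for t in tags if tag in t) / len(tags)
    return votes - prior

# Hypothetical toy collection: two visual clusters with user-assigned tags.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
tags = [{"beach", "me"}, {"beach"}, {"beach", "sea"}, {"city", "me"}, {"city"}, {"city"}]

beach_score = tag_relevance(0, features, tags, "beach", k=2)
me_score = tag_relevance(0, features, tags, "me", k=2)
```

In this toy data the first image's visual neighbors all carry "beach" and none carry "me", so the objective tag scores higher than the personal one, which matches the intuition stated in the abstract.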
Privacy-preserving P2P data sharing with OneSwarm
- In ACM SIGCOMM, 2010
Cited by 51 (3 self)
Privacy, the protection of information from unauthorized disclosure, is increasingly scarce on the Internet. The lack of privacy is particularly true for popular peer-to-peer data sharing applications such as BitTorrent where user behavior is easily monitored by third parties. Anonymizing overlays such as Tor and Freenet can improve user privacy, but only at a cost of substantially reduced performance. Most users are caught in the middle, unwilling to sacrifice either privacy or performance. In this paper, we explore a new design point in this tradeoff between privacy and performance. We describe the design and implementation of a new P2P data sharing protocol, called OneSwarm, that provides users much better privacy than BitTorrent and much better performance than Tor or Freenet. A key aspect of the OneSwarm design is that users have explicit configurable control over the amount of trust they place in peers and in the sharing model for their data: the same data can be shared publicly, anonymously, or with access control, with both trusted and untrusted peers. OneSwarm’s novel lookup and transfer techniques yield a median factor of 3.4 improvement in download times relative to Tor and a factor of 6.9 improvement relative to Freenet. OneSwarm is publicly available and has been downloaded by hundreds of thousands of users since its release.