Characterizing and Modeling Popularity of User-generated Videos

Y. Borghol, S. Mitra, S. Ardon, N. Carlsson, D. Eager, and A. Mahanti, Proc. 29th IFIP WG 7.3 Int'l. Symp. on Computer Performance, Modeling, Measurements and Evaluation (IFIP Performance 2011), Amsterdam, Netherlands, Oct. 2011, to appear.

This paper develops a framework for studying the popularity dynamics of user-generated videos, presents a characterization of the popularity dynamics, and proposes a model that captures the key properties of these dynamics. We illustrate the biases that may be introduced in the analysis for some choices of the sampling technique used for collecting data; however, sampling from recently-uploaded videos provides a dataset that is seemingly unbiased. Using a dataset that tracks the views to a sample of recently-uploaded YouTube videos over the first eight months of their lifetime, we study the popularity dynamics. We find that the relative popularities of the videos within our dataset are highly non-stationary, owing primarily to large differences in the required time since upload until peak popularity is finally achieved, and secondly to popularity oscillation. We propose a model that can accurately capture the popularity dynamics of collections of recently-uploaded videos as they age, including key measures such as hot set churn statistics, and the evolution of the viewing rate and total views distributions over time.

Datasets

The datasets used in our paper are made available here for use by the wider research community. The datasets consist of publicly available metadata associated with videos from the YouTube Web site. Please refer to Section 3 of our paper for a description of the data collection methodology and a summary of the datasets. If you use our datasets in your research, please drop Anirban Mahanti a line at "anirban dot mahanti AT-SIGN gmail dot com", and include a reference to our paper in your work.

Recently-uploaded Videos

Keyword-search Videos