16 2. DATA COLLECTION
applied the matrix factorization technique [37] to factorize this original matrix into two latent
matrices with 100 latent features, such that the empirical error between the product of these
two latent matrices and the original matrix is as small as possible. The entries in the two latent
matrices are inferred from the observed values in the original matrix only, and over-fitting is
avoided through a regularized model.
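The factorization described above can be illustrated with a minimal sketch. It assumes a squared-error loss on the observed entries, L2 regularization, and plain gradient descent; the function names and the values of lam, lr, and epochs are hypothetical choices made for illustration, not the settings of [37] or of this book. Only the default of 100 latent features comes from the text.

```python
import numpy as np

def factorize(R, mask, k=100, lam=0.1, lr=0.01, epochs=200, seed=0):
    """Factorize R (n x m) into P (n x k) and Q (m x k) so that P @ Q.T
    approximates R on the observed entries (mask == 1) only.

    Minimizes 0.5 * ||mask * (R - P Q^T)||^2 + 0.5 * lam * (||P||^2 + ||Q||^2).
    lam, lr, and epochs are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    P = 0.1 * rng.standard_normal((n, k))
    Q = 0.1 * rng.standard_normal((m, k))
    for _ in range(epochs):
        E = mask * (R - P @ Q.T)      # residual on observed entries only
        grad_P = -E @ Q + lam * P     # data term plus L2 regularizer
        grad_Q = -E.T @ P + lam * Q
        P -= lr * grad_P
        Q -= lr * grad_Q
    return P, Q

def observed_rmse(R, mask, P, Q):
    """Root-mean-square reconstruction error over the observed entries."""
    err = mask * (R - P @ Q.T)
    return float(np.sqrt((err ** 2).sum() / mask.sum()))
```

Unobserved entries contribute nothing to the gradient because the mask zeroes their residuals, and the lam terms penalize large latent values, which is what limits over-fitting in this sketch.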
2.3 DATASET III FOR MICRO-VIDEO ROUTING
We evaluate our proposed multimodal sequential learning methods for the task of micro-video
routing on two public micro-video datasets, Dataset III-1 and Dataset III-2.
Dataset III-1. is dataset is released by the Kuaishou Competition in ChinaMM2018
conference,
3
which aims to infer users’ click probabilities for new micro-videos. In this dataset,
there are multiple interactions between users and micro-videos, such as click, not click, like,
and follow. Particularly, not click means the user did not click the micro-video after preview-
ing its thumbnail. Moreover, each behavior is associated with a timestamp, which records when
the behavior happens. We have to mention that the timestamp has been processed such that
the absolute time is unknown, but the sequential order can be obtained according to the times-
tamp. For each micro-video, the contest organizers have released its 2,048-d visual embedding
of its thumbnail. Among the large-scale dataset, we randomly selected 10,000 users and their
3,239,534 interacted micro-videos to construct the Dataset III-1.
Dataset III-2. is dataset is constructed by [4] for micro-video click-through predic-
tion. It consists of 10,986 users, 1,704,880 micro-videos, and 12,737,619 interactions. Differ-
ent from Dataset III-1, Dataset III-2 only contains the “click” and “not click” behaviors. Each
micro-video in Dataset III-2 is represented by a 512-d visual embedding vector extracted from
its thumbnail and associated with a category label, and each user’s behavior is linked with a
processed timestamp.
The statistics of the above two datasets are summarized in Table 2.3. The experimental results
of micro-video routing reported in this book are based on these two datasets. Specifically, for
Dataset III-1 we used the first 80% of each user’s historically accessed micro-videos as the
training set and the remaining 20% as the testing set. For Dataset III-2, we used the same
setting as [28]. It is worth mentioning that we adopted Principal Component Analysis
(PCA) [169] to reduce the visual embedding vector of each micro-video to 64 dimensions.
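The two preprocessing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the released code: interactions are assumed to be (user, item, timestamp) triples, the split is assumed to be taken per user in timestamp order, and PCA is computed with a plain SVD on whatever embedding matrix is passed in. The function names are hypothetical, and the text does not state on which subset of micro-videos the PCA was fitted.

```python
import numpy as np

def chronological_split(interactions, train_ratio=0.8):
    """Split each user's interactions by time: the earliest train_ratio
    share goes to the training set, the remainder to the testing set.

    interactions: iterable of (user_id, item_id, timestamp) triples
    (an assumed record layout).
    """
    by_user = {}
    for user, item, ts in interactions:
        by_user.setdefault(user, []).append((ts, item))
    train, test = [], []
    for user, events in by_user.items():
        events.sort(key=lambda e: e[0])          # order by timestamp only
        cut = int(len(events) * train_ratio)
        train += [(user, item, ts) for ts, item in events[:cut]]
        test += [(user, item, ts) for ts, item in events[cut:]]
    return train, test

def pca_reduce(X, n_components=64):
    """Project the rows of X onto the top n_components principal directions.

    Assumes X has at least n_components rows and columns.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```

Because the processed timestamps preserve only the order of behaviors, sorting on them is sufficient for this kind of split; the absolute times are never needed.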
Table 2.3: Statistics of the two datasets

Dataset                             Dataset III-1    Dataset III-2
# Users                                    10,000           10,986
# Items                                 3,239,534        1,704,880
# Interactions                         13,661,383       12,737,619
# Interaction types                             4                2
# Average interactions per user          1,366.14         1,159.44
# Average interactions per item              4.28             7.47
# Average clicked items per user              277              218
# Interactions in training set         10,931,092        8,970,310
# Interactions in test set              2,730,291        3,767,309

2.4 SUMMARY
To validate our proposed models in their three practical application scenarios, we introduce
three micro-video datasets in this chapter. In particular, we construct Dataset I and Dataset II
for popularity prediction and venue category estimation, respectively. In addition, we leverage
the publicly released micro-video datasets, namely Dataset III-1 and Dataset III-2, to evaluate
the task of micro-video routing. We have released all the code, parameter settings, and datasets
involved in this book to facilitate other researchers in the community of micro-video
understanding.⁴

³ http://mm.ccf.org.cn/chinamm/2018/
⁴ https://ilearn2019.wixsite.com/microvideo