CHAPTER 3
Multimodal Transductive
Learning for Micro-Video
Popularity Prediction
Despite their brevity, micro-videos generally present a relatively simple but complete story to
audiences. Within a limited time interval, producers attempt to condense and maximize
what they want to say, thereby creating more attractive stories. Compared with traditional long
videos such as those on YouTube, micro-videos are produced to satisfy a fast-paced modern
society, which makes them more socially oriented. Therefore, micro-videos
spread much more easily. Popular micro-videos have enormous commercial potential in many
ways, such as online marketing and brand tracking. In fact, the popularity prediction of
traditional UGCs, including tweets, web images, and long videos, rests on solid theoretical
underpinnings and has achieved great practical success. However, little research has thus far been conducted
to predict the popularity of such bite-sized videos. In this chapter, we work toward solving the
problem of popularity prediction of micro-videos posted on social networks.
Since micro-videos are produced with the aim of rapid spreading and sharing among users,
they have closer intrinsic ties to social networks than traditional long
videos do. This makes predicting the popularity of micro-videos a non-trivial task, for
the following reasons.
(1) Heterogeneous. Due to the short duration of micro-videos, each modality can provide only
limited information, the so-called modality limitation. Fortunately, micro-videos always
involve multiple modalities, namely, social, visual, acoustic, and textual modalities (micro-videos
are usually associated with certain textual data, such as video descriptions given by the video
owners). In a sense, these modalities are correlated rather than independent and essentially
characterize the same micro-videos. Therefore, the major challenge lies in how to effectively fuse
micro-videos’ heterogeneous clues from multiple modalities [162, 163, 189]. The most naive strategies are early
