literature and make a step forward in micro-video captioning tasks. (5) Unlike image
captioning, visual descriptions of micro-videos are relatively short and the number of
ground-truth descriptions is limited, which makes traditional captioning evaluation
metrics, e.g., BLEU, ROUGE, METEOR, and CIDEr, infeasible. Therefore, developing new evaluation
metrics tailored to micro-video captioning is a promising topic for future work.
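To illustrate why short captions with few references strain n-gram-based metrics, the following minimal sketch computes only the clipped n-gram precision at the core of BLEU-style scores (no brevity penalty or smoothing). The example captions and the helper names `ngrams` and `clipped_precision` are hypothetical and chosen purely for illustration; they are not drawn from any micro-video dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, keep the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference.
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return matched / sum(cand_counts.values())


# Hypothetical short captions (illustrative only).
candidate = "a dog runs on the beach".split()
single_ref = ["dog running at beach".split()]
several_refs = single_ref + [
    "a dog runs along the beach".split(),
    "puppy plays on the beach".split(),
]

for n in (1, 2, 3, 4):
    print(n,
          clipped_precision(candidate, single_ref, n),
          clipped_precision(candidate, several_refs, n))
```

With the single short reference, the candidate shares no bigram or longer n-gram with it, so every precision beyond unigrams is zero even though the caption is a reasonable description. Adding references recovers some matches, which is exactly what a limited pool of ground-truth descriptions prevents.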
To retain users, beyond improving the quality of micro-videos, micro-video platforms
and publishers have to draw users’ eyes quickly [63]. As the most representative snapshot, the
thumbnail summarizes a micro-video visually and provides the first impression to users, as
shown in Figure 7.4. Moreover, studies report that the thumbnail is a crucial factor
in deciding whether to watch a video or skip to another [13]. In other words, an appealing thumbnail
makes the micro-video more attractive. However, because of inconvenient operation on smartphones
or a lack of experience, selecting a good thumbnail is challenging for users. Therefore, we
suggest that an automatic thumbnail selection strategy is necessary to the micro-video sharing
Figure 7.4: Exemplar demonstration of the micro-video thumbnail.
Although several pioneering efforts [53, 76, 93, 104, 194] have jointly considered
quality and representativeness when selecting the thumbnail, they ignored the fact that
the thumbnail should reflect the publisher’s preference and match the interests of more users.
Taking this into account raises the following challenges: (1) how to measure the publisher’s
preferences over the different frames extracted from the micro-video; (2) how to calculate the
popularity of each frame according to the distribution of users’ interests on the platform; and
(3) to our knowledge, no suitable dataset exists for exploring micro-video thumbnail
selection. To address these challenges, a large number of micro-videos with associated side information
(e.g., comments, publishers’ profiles) are first collected to build a large-scale micro-video dataset