Micro-Video Thumbnail Selection



Figure 7.2: The statistics of the hashtag frequency distribution in Instagram. (Plot not reproduced: y-axis "Frequency", logarithmic from 100 to 100,000; x-axis "Hashtags Ordered by Frequency", from 2.0k to 14.0k; regions labeled "Frequent Hashtags" and "Long-tail Hashtags".)

existing approaches recommend hashtags while ignoring redundancy among them; therefore,
how to obtain relevant hashtags in consideration of their inter-dependencies is difficult.
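To make the notions of frequent hashtags, long-tail hashtags, and inter-dependencies concrete, the following minimal Python sketch ranks hashtags by frequency, splits them into a frequent head and a long tail, and scores pairwise dependence with a Jaccard overlap of co-occurrence counts. The toy posts, the head_size cutoff, the helper names, and the choice of Jaccard overlap are all assumptions made for illustration; they are not taken from any of the approaches discussed here.

```python
from collections import Counter
from itertools import combinations


def split_by_frequency(posts, head_size):
    """Rank hashtags by post frequency; the top `head_size` are 'frequent', the rest 'long-tail'."""
    counts = Counter(tag for tags in posts for tag in set(tags))
    # Sort by descending count, then alphabetically, so ties are resolved deterministically.
    ranked = sorted(counts, key=lambda tag: (-counts[tag], tag))
    return counts, set(ranked[:head_size]), set(ranked[head_size:])


def cooccurrence(posts):
    """Count how often each unordered pair of hashtags appears in the same post."""
    pairs = Counter()
    for tags in posts:
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs


def jaccard(a, b, counts, pairs):
    """Jaccard overlap of the post sets of two hashtags (1.0 means they always co-occur)."""
    both = pairs[tuple(sorted((a, b)))]
    union = counts[a] + counts[b] - both
    return both / union if union else 0.0


# Hypothetical toy data: each inner list holds the hashtags of one micro-video.
posts = [
    ["food", "pizza"],
    ["food", "pizza", "napoli"],
    ["food", "sushi"],
    ["travel", "napoli"],
    ["travel", "food"],
    ["travel", "hiking"],
]

counts, frequent, long_tail = split_by_frequency(posts, head_size=2)
pairs = cooccurrence(posts)

# Link every long-tail hashtag to the frequent hashtag it overlaps with most.
links = {
    tag: max(sorted(frequent), key=lambda f: jaccard(tag, f, counts, pairs))
    for tag in sorted(long_tail)
}
print(links)
```

On this toy data the long-tail tag "hiking" is linked to "travel" and "pizza" to "food". A high overlap between two candidate hashtags is also a simple signal of redundancy, which a recommender could use to avoid suggesting both.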

In the future, we will tackle this task from the following three directions. First, we plan
to construct a knowledge graph to explore hashtag correlations, and to leverage existing structural
knowledge to derive proper dependencies between frequent hashtags and long-tail hashtags.
Second, we will introduce a multi-level attention mechanism into the multimodal sequence model
to focus on important cues among the sequential features and multi-modality features. Lastly, we
expect to simulate how human annotators work and generate diverse and distinct micro-video
annotations.
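As a rough illustration of the second direction, the sketch below applies attention at two levels: first over the time steps of each modality's feature sequence, then over the pooled modality vectors. It is only a sketch under stated assumptions: dot-product scoring, a fixed query vector standing in for learned parameters, modalities already projected to a common dimension, and made-up feature values. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(vectors, query):
    """Dot-product attention: weight each vector by softmax(vector . query), then sum."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def multi_level_attention(modalities, query):
    """Level 1 pools each modality's sequence; level 2 pools across modalities."""
    names = sorted(modalities)
    level1 = {name: attend(modalities[name], query) for name in names}
    fused, modality_weights = attend([level1[name][0] for name in names], query)
    step_weights = {name: level1[name][1] for name in names}
    return fused, dict(zip(names, modality_weights)), step_weights


# Hypothetical features: one list of 2-d vectors per modality, one vector per time step.
modalities = {
    "visual": [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]],
    "acoustic": [[0.0, 0.5], [0.5, 0.0]],
}
query = [1.0, 0.0]  # assumption: in a trained model this would be learned

fused, modality_weights, step_weights = multi_level_attention(modalities, query)
print(modality_weights)
print(step_weights)
```

With this query, the third visual step receives the largest step-level weight, and the visual modality outweighs the acoustic one at the second level. In a real model both levels would typically use separate learned parameters.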

7.2 MICRO-VIDEO CAPTIONING

Micro-video captioning aims to automatically generate textual descriptions for micro-videos. Some
examples can be found in Figure 7.3. Owing to its representation capability, which involves both
computer vision and natural language processing techniques, micro-video captioning shows great
potential in helping visually impaired people better understand visual content. Moreover, it also
plays a vital role in searching micro-videos and answering questions regarding micro-video
content. As users tend to submit queries and ask questions about micro-video clips through
text-based keywords, a better content descriptor can improve user satisfaction as well as loyalty to


Figure 7.3: Example of micro-video caption.

micro-video platforms. However, current micro-video systems (e.g., Vine, Instagram, Kuaishou,
TikTok) lack such content descriptions, which degrades the performance of micro-video
retrieval and question-answering systems. Besides, some user-annotated captions are not
adequate to correctly describe the micro-video content. Therefore, it is crucial to develop
micro-video captioning approaches that automatically generate concise and accurate video descriptions.

Although micro-video captioning is an important research task in the literature, some
challenges remain:
(1) With the fast development of DNNs, employing more powerful network structures
(e.g., graph neural networks, reinforcement learning techniques) for micro-video captioning will
undoubtedly improve model performance. (2) Normally, the salient part of a micro-video
consists of a short video clip (e.g., 10 s), which fits well with the attention mechanism.
Considering this, how to utilize the attention mechanism to generate micro-video descriptions
will be an important research problem. (3) Traditional video captioning struggles with the
tedious description problem due to the limitations of the training corpus. Therefore, novel caption
generation will be a potential direction for the micro-video captioning task. (4) Since the
construction of datasets is a fundamental problem in machine learning and current micro-video datasets
lack such captioning information, more abundant datasets will benefit further related
