Component-Wise Evaluation of ALPINE

134 6. MULTIMODAL SEQUENTIAL LEARNING

• BPR [137]: This is a Bayesian personalized ranking model, which trains on pairwise items by maximizing the difference between the posterior probability of the positive samples and the negative ones.

• CNN-R: This model is a CNN-based recommendation system, which utilizes the CNN structure to model sequential information. In particular, it first applies different convolutional kernels to the sequential feature matrix. Explicitly, the window size varies from one to ten, and each kernel size has 32 linear filters. Thereafter, it feeds the obtained feature map into a max pooling layer followed by a fully connected layer to obtain the interest embedding. Finally, an MLP predicts the click probability.

• LSTM-R: This model utilizes the LSTM network to model the user's sequential information. Having obtained the hidden states, it feeds them into a fully connected layer to generate the interest representation, and then an MLP module is adopted to predict the click probability.

• ATRank [203]: It is an attention-based user behavior modeling framework, which captures the user's behavior interactions in multiple semantic spaces by the self-attention mechanism.

• NCF [60]: It is a collaborative filtering-based deep recommendation model, which learns the user embedding and the item embedding with a shallow network (element-wise product between user and item) and a deep network (concatenation of the user and item embedding followed by several MLP layers).

• THACIL [28]: It is a self-attention-based method for micro-video recommendation, which utilizes a multi-head self-attention layer to capture the long-term correlation within user behaviors, and a two-level attention layer over items and categories to model the fine-grained profiling of the user interest.
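As an illustration of the pairwise objective behind the BPR baseline, the following is a minimal sketch rather than the implementation used in the experiments. It assumes scores are dot products between a user vector and an item vector (a matrix-factorization assumption that the text does not state), and the names `dot` and `bpr_loss` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bpr_loss(user_vec, pos_item_vec, neg_item_vec):
    """Pairwise BPR loss for one (user, positive item, negative item) triple.

    Assumption: the preference score is a dot product. Minimizing
    -log(sigmoid(pos_score - neg_score)) pushes the positive item's score
    above the negative item's score.
    """
    margin = dot(user_vec, pos_item_vec) - dot(user_vec, neg_item_vec)
    # -log(sigmoid(m)) equals log(1 + exp(-m)); written in a stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    user = [1.0, 0.0]
    clicked = [2.0, 0.0]      # hypothetical positive item
    not_clicked = [0.0, 0.0]  # hypothetical negative item
    print(bpr_loss(user, clicked, not_clicked))   # small loss: correct order
    print(bpr_loss(user, not_clicked, clicked))   # larger loss: wrong order
```

Because the loss depends only on the score difference within a pair, it carries no notion of the order in which a user interacted with items, which is the limitation the comparison attributes to BPR.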

It is worth mentioning that THACIL and ATRank utilize the same click probability prediction layer as our model. As for the other methods, including CNN-R, LSTM-R, BPR, and NCF, we fed the interest representations and the embedding of the new micro-video into the MLP layer to predict the click probability.
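The shared prediction step described above can be sketched as follows. This is a minimal sketch under stated assumptions: a single ReLU hidden layer, concatenation of the two inputs, and a sigmoid output. The layer sizes, the weights, and the name `click_probability` are hypothetical and are not taken from the source.

```python
import math


def relu(x):
    """Rectified linear unit."""
    return x if x > 0.0 else 0.0


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def linear(weights, bias, inputs):
    """Affine map producing one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def click_probability(interest, item, w_hidden, b_hidden, w_out, b_out):
    """Predict a click probability from an interest representation and the
    embedding of a candidate micro-video (assumed: concatenate, one hidden
    ReLU layer, sigmoid output)."""
    features = list(interest) + list(item)
    hidden = [relu(h) for h in linear(w_hidden, b_hidden, features)]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return sigmoid(logit)


if __name__ == "__main__":
    interest = [1.0, 0.0]  # hypothetical interest representation
    item = [0.0, 2.0]      # hypothetical micro-video embedding
    w_hidden = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    b_hidden = [0.0, 0.0]
    print(click_probability(interest, item, w_hidden, b_hidden, [1.0, 1.0], 0.0))
```

Feeding every baseline's interest representation through the same kind of predictor keeps the comparison focused on how the interest representation is built.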

6.5.3 OVERALL COMPARISON

We conducted an empirical study to investigate whether our proposed model can achieve better recommendation performance. The results of all methods on the two datasets are summarized in Table 6.1. Several observations stand out.

• BPR performs worse than the other baselines since it overlooks the sequential characteristic of the users' interest information. It hence fails to exploit the user's dynamic interest, revealing the necessity of modeling the historical sequence.


Table 6.1: Performance comparison between our proposed model and several state-of-the-art baselines over Dataset III-1 and III-2. Statistical significance over AUC between ALPINE and the best baseline (i.e., THACIL) is determined by a t-test (△ denotes p-value < 0.01).

Methods    Dataset III-1                      Dataset III-2
           AUC     P@50   R@50   F@50         AUC     P@50   R@50   F@50
BPR        0.595   0.290  0.387  0.331        0.583   0.241  0.181  0.206
LSTM-R     0.713   0.316  0.420  0.360        0.641   0.277  0.205  0.236
CNN-R      0.719   0.312  0.413  0.356        0.650   0.287  0.214  0.245
ATRank     0.722   0.322  0.426  0.367        0.660   0.297  0.221  0.253
NCF        0.724   0.320  0.420  0.364        0.672   0.316  0.225  0.262
THACIL     0.727   0.325  0.429  0.369        0.684   0.324  0.234  0.269
ALPINE     0.739△  0.331  0.436  0.376        0.713△  0.300  0.460  0.362

• Sequential modeling methods, including LSTM-R, CNN-R, ATRank, and THACIL, surpass the BPR model. This verifies the effectiveness of sequence modeling. Moreover, the self-attention-based models, i.e., ATRank and THACIL, outperform CNN-R and LSTM-R, especially the latter. This reveals that simply utilizing the LSTM network is insufficient to capture the users' dynamic and diverse interest information from a very long sequence. The attention mechanism can implicitly reduce the memorization length by focusing on the key interest information, which is why ATRank and THACIL achieve better performance on both datasets.

• While NCF does not model the user's historical information as a sequence, it also achieves promising performance compared with the other baselines. This is probably because setting a user embedding matrix and updating it in the training stage can improve the interest representation. Moreover, its two operations, the element-wise product and several MLP layers, model the relationship between users and items better.

• ALPINE achieves the best performance, substantially surpassing all the baselines. Particularly, ALPINE presents consistent improvements over sequential models like ATRank and THACIL, verifying the importance of memorizing the prior interest information and employing the temporal graph-based LSTM network in enhancing the interest representation. In addition, our proposed ALPINE exceeds NCF, because NCF randomly initializes the user matrix rather than exploring the user's multi-level interest information. This justifies the effectiveness of our proposed multi-level interest modeling module. Moreover, ALPINE also characterizes the user's uninterested cues, which can further improve the recommendation performance.
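For readers who want to reproduce the kind of metrics reported in Table 6.1, the following is a minimal sketch using the standard textbook definitions of AUC, P@K, R@K, and F@K. The source does not spell out how the metrics were computed or averaged across users, so these definitions, the function names, and the toy data are assumptions.

```python
def precision_recall_f_at_k(ranked_items, relevant_items, k):
    """Return (P@K, R@K, F@K) for one user's ranked recommendation list."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    # F@K is taken here as the harmonic mean of P@K and R@K (assumption).
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score


def auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive sample
    receives the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d"]  # hypothetical ranking, best first
    clicked = {"a", "d", "e"}      # hypothetical ground-truth clicks
    print(precision_recall_f_at_k(ranked, clicked, k=2))
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable on full datasets.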
