consider the squared loss regarding the N unlabeled samples to guarantee the learning perfor-
mance. We ultimately reach our objective function as
$$\min_{f,\,L(S_0)} \; \sum_{i=1}^{N} (y_i - f_i)^2 \;+\; \lambda\, f^{T} L(S_0)\, f \;+\; \mu \sum_{k=1}^{K} \left\| \frac{1}{\operatorname{tr}\bigl(L(S_0)\bigr)}\, L(S_0) - \frac{1}{\operatorname{tr}\bigl(L(S_k)\bigr)}\, L(S_k) \right\|_F^2,$$
where $\lambda$ and $\mu$ are both nonnegative regularization parameters. To be more specific, $\mu$ penalizes the disagreement between the latent common space and the modalities, and $\lambda$ encourages similar micro-videos to be assigned similar popularity.
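To make the roles of the three terms concrete, the following minimal NumPy sketch evaluates this objective for given quantities; the array names, the assumption that $f$ stacks the videos with observed popularity first, and the function itself are illustrative only and are not part of the model's implementation.

import numpy as np

def popularity_objective(f, y, L0, L_list, lam, mu):
    # f      : (N+M,) current popularity estimates for all micro-videos
    # y      : (N,)   observed popularity scores y_i for the first N videos
    # L0     : (N+M, N+M) Laplacian L(S_0) of the latent common space
    # L_list : K modality Laplacians L(S_k), each (N+M, N+M)
    # lam, mu: the nonnegative regularization parameters lambda and mu
    N = len(y)
    fit = np.sum((y - f[:N]) ** 2)            # squared loss on the N videos with observed scores
    smooth = lam * f @ L0 @ f                 # similar videos receive similar popularity
    L0_bar = L0 / np.trace(L0)                # trace-normalized latent Laplacian
    disagree = mu * sum(
        np.linalg.norm(L0_bar - Lk / np.trace(Lk), "fro") ** 2
        for Lk in L_list
    )                                         # disagreement between latent space and each modality
    return fit + smooth + disagree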
3.6.2 OPTIMIZATION
To simplify the representation, we first define that
$$\begin{cases} \tilde{L} = \dfrac{1}{\operatorname{tr}\bigl(L(S_0)\bigr)}\, L(S_0), \\[6pt] \tilde{L}_k = \dfrac{1}{\operatorname{tr}\bigl(L(S_k)\bigr)}\, L(S_k). \end{cases} \tag{3.9}$$
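As an aside, a small sketch of how such a trace-normalized Laplacian might be built, assuming $L(S)$ is the standard unnormalized graph Laplacian $D - S$ of a symmetric similarity matrix $S$ (the construction of the per-modality similarity matrices follows the earlier parts of this chapter):

import numpy as np

def trace_normalized_laplacian(S):
    # S: (n, n) symmetric, nonnegative similarity matrix of one modality
    D = np.diag(S.sum(axis=1))   # degree matrix
    L = D - S                    # graph Laplacian L(S); assumed unnormalized form
    return L / np.trace(L)       # Eq. (3.9): divide by the trace so tr(L_tilde) = 1

After this normalization, every $\tilde{L}_k$ has unit trace, which is the property exploited in Eq. (3.11) below.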
Therefore, the objective function can be transformed to

$$\min_{f} \; \sum_{i=1}^{N} (y_i - f_i)^2 + \mu \sum_{k=1}^{K} \bigl\| \tilde{L} - \tilde{L}_k \bigr\|_F^2 + \lambda\, f^{T} \tilde{L}\, f, \quad \text{s.t.}\ \operatorname{tr}\bigl(L(S_0)\bigr) = 1. \tag{3.10}$$
Furthermore, to optimize $\tilde{L}$ more efficiently, inspired by the property that $\operatorname{tr}(\tilde{L}_k) = 1$, we let

$$L(S_0) = \sum_{k=1}^{K} \beta_k\, \tilde{L}_k, \quad \text{s.t.}\ \sum_{k=1}^{K} \beta_k = 1. \tag{3.11}$$
Consequently, we have

$$\tilde{L} = \frac{1}{\operatorname{tr}\bigl(L(S_0)\bigr)}\, L(S_0) = \sum_{k=1}^{K} \beta_k\, \tilde{L}_k, \quad \text{s.t.}\ \sum_{k=1}^{K} \beta_k = 1. \tag{3.12}$$
Interestingly, we find that $\beta_k$ can be treated as the correlation degree between the latent common space and each modality. It is worth noting that we do not impose the constraint $\boldsymbol{\beta} \geq 0$, since we want to keep both positive and negative correlations. A positive coefficient indicates a positive correlation between the modality space and the latent common space, while a negative coefficient reflects a negative correlation, which may be caused by noisy data in that modality. The larger $\beta_k$ is, the higher the correlation between the latent space and the $k$-th modality will
be. In the end, the final objective function can be written as:
$$\begin{aligned} \min_{f,\,\boldsymbol{\beta}} \;& \sum_{i=1}^{N} (y_i - f_i)^2 + \mu \sum_{k=1}^{K} \Bigl\| \sum_{i=1}^{K} \beta_i\, \tilde{L}_i - \tilde{L}_k \Bigr\|_F^2 + \lambda\, f^{T} \Bigl( \sum_{k=1}^{K} \beta_k\, \tilde{L}_k \Bigr) f + \eta\, \|\boldsymbol{\beta}\|^2, \\ & \text{s.t.}\ \mathbf{e}^{T} \boldsymbol{\beta} = 1, \end{aligned} \tag{3.13}$$
where $\boldsymbol{\beta} = [\beta_1, \beta_2, \ldots, \beta_K]^{T} \in \mathbb{R}^{K}$ and $\mathbf{e} = [1, 1, \ldots, 1]^{T} \in \mathbb{R}^{K}$, and $\eta$ is the regularization parameter introduced to avoid the overfitting problem. We denote the objective function of Eq. (3.13) as $\Gamma$. We adopt the alternating optimization strategy to solve for the two variables $f$ and $\boldsymbol{\beta}$ in $\Gamma$. In particular, we optimize one variable while fixing the other in each iteration, and we keep this iterative procedure going until $\Gamma$ converges.
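A generic driver for this alternating procedure can be sketched as follows; the two update callables stand for the closed-form sub-solvers (the $\boldsymbol{\beta}$-step of Eq. (3.18) below and the $f$-step derived afterward), and their names are placeholders rather than the chapter's notation:

def alternate(f0, beta0, update_f, update_beta, objective,
              max_iter=100, tol=1e-6):
    # Alternately refresh f and beta, each with the other fixed,
    # until the objective value stops decreasing.
    f, beta = f0, beta0
    prev = float("inf")
    for _ in range(max_iter):
        f = update_f(beta)        # fix beta, solve for f
        beta = update_beta(f)     # fix f, solve for beta
        cur = objective(f, beta)
        if prev - cur < tol:      # convergence of the objective
            break
        prev = cur
    return f, beta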
Computing $\boldsymbol{\beta}$ with $f$ Fixed
We first fix f and transform the objective function as
$$\min_{\boldsymbol{\beta}} \; \mu \sum_{k=1}^{K} \sum_{t=1}^{N+M} \bigl\| M^{(t)} \boldsymbol{\beta} - \tilde{\mathbf{l}}_k^{(t)} \bigr\|_F^2 + \mathbf{g}^{T} \boldsymbol{\beta} + \eta\, \|\boldsymbol{\beta}\|^2, \quad \text{s.t.}\ \mathbf{e}^{T} \boldsymbol{\beta} = 1, \tag{3.14}$$
where $\mathbf{g} = \lambda\,\bigl[f^{T}\tilde{L}_1 f,\; f^{T}\tilde{L}_2 f,\; \ldots,\; f^{T}\tilde{L}_K f\bigr]^{T} \in \mathbb{R}^{K}$, $M^{(t)} = \bigl[\tilde{\mathbf{l}}_1^{(t)}, \tilde{\mathbf{l}}_2^{(t)}, \ldots, \tilde{\mathbf{l}}_K^{(t)}\bigr] \in \mathbb{R}^{(N+M)\times K}$, and $\tilde{\mathbf{l}}_k^{(t)} \in \mathbb{R}^{N+M}$ denotes the $t$-th column of $\tilde{L}_k$. For simplicity, we replace $\tilde{\mathbf{l}}_k^{(t)}$ with $\tilde{\mathbf{l}}_k^{(t)} \mathbf{e}^{T} \boldsymbol{\beta}$, as $\mathbf{e}^{T} \boldsymbol{\beta} = 1$.
With the help of the Lagrangian, $\Gamma$ can be rewritten as follows:
$$\min_{\boldsymbol{\beta}} \; \mu \sum_{k=1}^{K} \sum_{t=1}^{N+M} \Bigl\| \bigl( M^{(t)} - \tilde{\mathbf{l}}_k^{(t)} \mathbf{e}^{T} \bigr) \boldsymbol{\beta} \Bigr\|_F^2 + \mathbf{g}^{T} \boldsymbol{\beta} + \delta \bigl( 1 - \mathbf{e}^{T} \boldsymbol{\beta} \bigr) + \eta\, \|\boldsymbol{\beta}\|^2, \tag{3.15}$$
where $\delta$ is a nonnegative Lagrange multiplier. Taking the derivative of Eq. (3.15) with respect to $\boldsymbol{\beta}$, we have

$$\frac{\partial \Gamma}{\partial \boldsymbol{\beta}} = H \boldsymbol{\beta} + \mathbf{g} - \delta\, \mathbf{e}, \tag{3.16}$$
where
$$H = 2 \left[ \mu \sum_{k=1}^{K} \sum_{t=1}^{N+M} \bigl( M^{(t)} - \tilde{\mathbf{l}}_k^{(t)} \mathbf{e}^{T} \bigr)^{T} \bigl( M^{(t)} - \tilde{\mathbf{l}}_k^{(t)} \mathbf{e}^{T} \bigr) + \eta\, I \right], \tag{3.17}$$
and $I$ is a $K \times K$ identity matrix. Setting Eq. (3.16) to zero, we have:

$$\boldsymbol{\beta} = H^{-1} \bigl( \delta\, \mathbf{e} - \mathbf{g} \bigr). \tag{3.18}$$
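Putting Eqs. (3.14)–(3.18) together, the $\boldsymbol{\beta}$-step admits the following NumPy sketch; the multiplier $\delta$ is resolved here by enforcing the constraint $\mathbf{e}^{T}\boldsymbol{\beta} = 1$, a step this excerpt does not show, and the function name and loop structure are assumptions for illustration:

import numpy as np

def update_beta(f, L_tilde_list, lam, mu, eta):
    # f            : (N+M,) current popularity estimates
    # L_tilde_list : K trace-normalized Laplacians, each (N+M, N+M)
    # lam, mu, eta : the regularization parameters lambda, mu, eta
    K = len(L_tilde_list)
    n = L_tilde_list[0].shape[0]                       # n = N + M
    e = np.ones(K)
    # Linear term of Eq. (3.14): g_k = lambda * f^T L_tilde_k f
    g = lam * np.array([f @ Lk @ f for Lk in L_tilde_list])
    # Quadratic term, Eq. (3.17): accumulate (M^(t) - l_k^(t) e^T)^T (M^(t) - l_k^(t) e^T)
    cols = np.stack(L_tilde_list, axis=2)              # cols[:, t, k] = t-th column of L_tilde_k
    acc = np.zeros((K, K))
    for t in range(n):
        Mt = cols[:, t, :]                             # M^(t), shape (n, K)
        for k in range(K):
            A = Mt - np.outer(cols[:, t, k], e)        # M^(t) - l_k^(t) e^T
            acc += A.T @ A
    H = 2.0 * (mu * acc + eta * np.eye(K))
    # Enforce e^T beta = 1 to fix the Lagrange multiplier delta (not shown in this excerpt)
    Hinv_g = np.linalg.solve(H, g)
    Hinv_e = np.linalg.solve(H, e)
    delta = (1.0 + e @ Hinv_g) / (e @ Hinv_e)
    return delta * Hinv_e - Hinv_g                     # Eq. (3.18): beta = H^{-1}(delta e - g)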