By combining the objective functions in Eqs. (3.26) and (3.30) with Eq. (3.28), we develop
our TLRMVR algorithm as follows:
\[
\begin{aligned}
\min_{w,\,Z,\,E,\,\hat{W}} \;\; & \|Z\|_{*} + \lambda\|E\|_{1} + \alpha\|w\|_{2}^{2} + \beta\,G(w, Z, \hat{W}) \\
\text{s.t.} \;\; & \hat{W}^{T}X = \hat{W}^{T}XZ + E, \qquad \hat{W}^{T}\hat{S}\hat{W} = I.
\end{aligned}
\tag{3.35}
\]
Here, we initialize Ŵ by the following trace ratio equation:
\[
\{W_{1}, W_{2}, \ldots, W_{K}\} = \arg\max_{W_{1},\ldots,W_{K}} \; \frac{\operatorname{tr}\big(\hat{W}^{T} S \hat{W}\big)}{\operatorname{tr}\big(\hat{W}^{T} \hat{S} \hat{W}\big)}.
\tag{3.36}
\]
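As a rough numerical illustration, the trace-ratio problem in Eq. (3.36) is commonly approximated by its ratio-trace relaxation, which reduces to a generalized eigenvalue problem. The Python sketch below assumes S and Ŝ are available as symmetric NumPy arrays (with Ŝ positive definite) and stacks the leading generalized eigenvectors; the function and variable names are illustrative and not part of the original method.

```python
import numpy as np
from scipy.linalg import eigh

def init_W_hat(S, S_hat, d):
    """Approximate the trace-ratio initialization of Eq. (3.36).

    Solves the ratio-trace relaxation max_W tr(W^T S W) / tr(W^T S_hat W)
    via the generalized eigenvalue problem S v = lambda * S_hat v and
    stacks the d leading eigenvectors as columns of W_hat.
    """
    # eigh returns eigenvalues in ascending order for the symmetric
    # generalized problem S v = lambda * S_hat v.
    eigvals, eigvecs = eigh(S, S_hat)
    return eigvecs[:, -d:]                    # d leading generalized eigenvectors

# Example with random symmetric surrogates (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
S = A @ A.T                                   # surrogate for the cross-view covariance
B = rng.standard_normal((20, 20))
S_hat = B @ B.T + 20 * np.eye(20)             # surrogate for the block-diagonal covariance
W0 = init_W_hat(S, S_hat, d=5)
print(W0.shape)                               # (20, 5)
```

In practice, the ratio-trace solution of this kind is often used only as a warm start for exact trace-ratio solvers, which is all the initialization here requires.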
When the low-rank representation from multi-view embedding learning and the regression analysis are performed jointly, the lowest-rank representation shared by all views not only captures the global structure of all modalities but also reflects the regression requirements.
3.7.2 OPTIMIZATION
The objective function in Eq. (3.35) can be solved by applying the alternating direction method of multipliers (ADMM), which divides a complex problem into subproblems, each of which is easier to handle within an iterative process. We first introduce two Lagrange multipliers Y_1 and Y_2 to obtain the so-called augmented Lagrangian function.
Then, we merge the last five terms into a single one,
\[
H(w, Z, E, \hat{W}, Y_{1}, Y_{2}, \mu) = \beta G(w, Z, \hat{W}) + \frac{\mu}{2}\Big\|\hat{W}^{T}X - \hat{W}^{T}XZ - E + \frac{Y_{1}}{\mu}\Big\|_{F}^{2} + \frac{\mu}{2}\Big\|\hat{W}^{T}\hat{S}\hat{W} - I + \frac{Y_{2}}{\mu}\Big\|_{F}^{2};
\]
thus, the augmented Lagrangian function of Eq. (3.35) can be reformulated as follows:
\[
L(w, Z, E, \hat{W}, Y_{1}, Y_{2}, \mu) = \|Z\|_{*} + \lambda\|E\|_{1} + \alpha\|w\|_{2}^{2} - \frac{1}{2\mu}\Big(\|Y_{1}\|_{F}^{2} + \|Y_{2}\|_{F}^{2}\Big) + H(w, Z, E, \hat{W}, Y_{1}, Y_{2}, \mu).
\tag{3.37}
\]
To better interpret the process, we introduce an iteration index t and define Z_t, E_t, Ŵ_t, w_t, Y_{1,t}, Y_{2,t}, and μ_t as the variables updated in the t-th iteration. Under the ADMM framework, the problem L is optimized with respect to each variable in the (t+1)-th iteration according to the following scheme.
For Z: We can update Z by dropping the terms independent of Z as follows:
\[
\begin{aligned}
Z_{t+1} &= \arg\min_{Z} \; \|Z\|_{*} + H(Z, E, \hat{W}, Y_{1}, Y_{2}, \mu) \\
&= \arg\min_{Z} \; \|Z\|_{*} + \frac{\mu_{t}\eta_{t}}{2}\|Z - Z_{t}\|_{F}^{2} + \big\langle \nabla_{Z}H,\; Z - Z_{t}\big\rangle \\
&= \arg\min_{Z} \; \frac{1}{\mu_{t}\eta_{t}}\|Z\|_{*} + \frac{1}{2}\Big\|Z - Z_{t} + \frac{\nabla_{Z}H}{\mu_{t}\eta_{t}}\Big\|_{F}^{2},
\end{aligned}
\tag{3.38}
\]
where
\[
\nabla_{Z}H = \beta w_{t}\big(w_{t}^{T}Z_{t}M - y^{T}\big)M^{T} + 2\beta w_{t}w_{t}^{T}Z_{t}L - \mu_{t}X^{T}\hat{W}_{t}\Big(\hat{W}_{t}^{T}X - \hat{W}_{t}^{T}XZ_{t} - E_{t} + \frac{1}{\mu_{t}}Y_{1,t}\Big)
\]
is the partial derivative of H(Z, E, Ŵ, Y_1, Y_2, μ) with respect to Z, and η_t = ||Ŵ_t^T X||_F^2.
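For reference, the gradient ∇_Z H and the scalar η_t translate almost directly into NumPy. The sketch below only mirrors the formulas above; the shapes assumed in the docstring and the variable names are illustrative rather than taken from the original implementation.

```python
import numpy as np

def grad_Z_H(X, M, L, y, w, Z, E, Y1, W_hat, beta, mu):
    """Gradient of H with respect to Z, mirroring the formula above.

    Assumed shapes (illustrative): X (D, N), W_hat (D, c), Z (N, N),
    E and Y1 (c, N), M (N, l), y (l,), L (N, N), w (N,).
    """
    reg_term = beta * np.outer(w, (w @ Z @ M - y) @ M.T)     # regression part
    graph_term = 2.0 * beta * np.outer(w, w) @ Z @ L         # graph-Laplacian part
    residual = W_hat.T @ X - W_hat.T @ X @ Z - E + Y1 / mu
    recon_term = -mu * (X.T @ W_hat) @ residual              # reconstruction part
    return reg_term + graph_term + recon_term

def eta(X, W_hat):
    """eta_t = ||W_hat^T X||_F^2."""
    return np.linalg.norm(W_hat.T @ X, 'fro') ** 2
```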
The problem in Eq. (3.38) is a standard nuclear norm minimization problem, which can be approximately solved by the singular value thresholding (SVT) algorithm [14]. Specifically, suppose that the singular value decomposition of Z_t − ∇_Z H/(μ_t η_t) of rank r is
\[
Z_{t} - \frac{\nabla_{Z}H}{\mu_{t}\eta_{t}} = P\,\Sigma\,Q^{T}, \qquad \Sigma = \operatorname{diag}\big(\{\delta_{i}\}_{i=1}^{r}\big),
\tag{3.39}
\]
where P and Q are the left- and right-singular matrices with orthogonal columns and Σ is a rectangular diagonal matrix with non-negative real numbers δ_i on the diagonal. Then, the optimal solution is Z_{t+1} = D_{1/(μ_t η_t)}(Z_t − ∇_Z H/(μ_t η_t)). For each threshold 1/(μ_t η_t) ≥ 0, the soft-thresholding operator D_{1/(μ_t η_t)}(·) is defined as [14]:
\[
\begin{cases}
\mathcal{D}_{1/(\mu_{t}\eta_{t})}\Big(Z_{t} - \dfrac{\nabla_{Z}H}{\mu_{t}\eta_{t}}\Big) = P\,\Sigma^{+}_{1/(\mu_{t}\eta_{t})}\,Q^{T}, \\[2mm]
\Sigma^{+}_{1/(\mu_{t}\eta_{t})} = \operatorname{diag}\Big(\big\{\big(\delta_{i} - \tfrac{1}{\mu_{t}\eta_{t}}\big)_{+}\big\}\Big),
\end{cases}
\tag{3.40}
\]
where (x)_+ denotes the positive part of x, namely, (x)_+ = max(0, x).
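A minimal NumPy sketch of this SVT step is given below. It assumes the matrix Z_t − ∇_Z H/(μ_t η_t) has already been formed and that the threshold 1/(μ_t η_t) is a non-negative scalar; the helper name svt is illustrative only.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding operator D_tau(A) of Eq. (3.40).

    Shrinks every singular value of A by tau and keeps only the positive
    part, which solves min_Z tau * ||Z||_* + 0.5 * ||Z - A||_F^2.
    """
    P, delta, Qt = np.linalg.svd(A, full_matrices=False)
    delta_shrunk = np.maximum(delta - tau, 0.0)   # (delta_i - tau)_+
    return (P * delta_shrunk) @ Qt

# Z update (illustrative):
# Z_next = svt(Z_t - grad_Z / (mu_t * eta_t), 1.0 / (mu_t * eta_t))
```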
For E: We can optimize E with fixed w, Z, and Ŵ as follows:
\[
\begin{aligned}
E_{t+1} &= \arg\min_{E} \; \frac{\mu_{t}}{2}\Big\|\hat{W}_{t}^{T}X - \hat{W}_{t}^{T}XZ_{t+1} - E\Big\|_{F}^{2} + \Big\langle Y_{1,t},\; \hat{W}_{t}^{T}X - \hat{W}_{t}^{T}XZ_{t+1} - E\Big\rangle + \lambda\|E\|_{1} \\
&= \arg\min_{E} \; \frac{\lambda}{\mu_{t}}\|E\|_{1} + \frac{1}{2}\Big\|E - \hat{W}_{t}^{T}U_{t+1} - \frac{Y_{1,t}}{\mu_{t}}\Big\|_{F}^{2},
\end{aligned}
\tag{3.41}
\]
where U_{t+1} = X − XZ_{t+1} is defined for simplicity. The problem in Eq. (3.41) can be solved using the shrinkage operator [90].
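Assuming the operator in [90] is the usual element-wise soft-thresholding, the E update of Eq. (3.41) can be sketched as follows; U_next, W_hat, Y1, lam, and mu are placeholders for U_{t+1}, Ŵ_t, Y_{1,t}, λ, and μ_t.

```python
import numpy as np

def shrink(A, tau):
    """Element-wise soft-thresholding: sign(A) * max(|A| - tau, 0)."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def update_E(W_hat, U_next, Y1, lam, mu):
    """E update of Eq. (3.41): soft-threshold W_hat^T U_{t+1} + Y1 / mu."""
    target = W_hat.T @ U_next + Y1 / mu
    return shrink(target, lam / mu)
```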
For w: We can optimize w with fixed E, Z, and Ŵ as follows:
\[
w_{t+1} = \arg\min_{w} \; \frac{1}{2}\big\|y - (Z_{t+1}M)^{T}w\big\|_{2}^{2} + w^{T}ZLZ^{T}w + \frac{\alpha}{\beta}\|w\|_{2}^{2}.
\tag{3.42}
\]
The above problem is the well-known ridge regression, whose optimal solution is
\[
w_{t+1} = \Big(Z_{t+1}MM^{T}Z_{t+1}^{T} + 2Z_{t+1}LZ_{t+1}^{T} + \frac{2\alpha}{\beta}I\Big)^{-1} Z_{t+1}My.
\]
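This closed form is a single regularized linear solve. The sketch below assumes Z_next, M, L, and y are NumPy arrays of compatible shapes and uses np.linalg.solve instead of forming an explicit inverse; the 2α/β factor follows from differentiating Eq. (3.42).

```python
import numpy as np

def update_w(Z_next, M, L, y, alpha, beta):
    """Closed-form ridge-regression update for w (solution of Eq. (3.42))."""
    ZM = Z_next @ M                                        # (N, l)
    A = ZM @ ZM.T + 2.0 * Z_next @ L @ Z_next.T \
        + (2.0 * alpha / beta) * np.eye(Z_next.shape[0])
    return np.linalg.solve(A, ZM @ y)                      # avoids A^{-1} explicitly
```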
For Ŵ: By setting the derivative of L with respect to Ŵ to zero, we have
\[
\hat{S}\hat{W}_{t+1}\big(Y_{2,t} + Y_{2,t}^{T}\big) - 2\delta\beta S\hat{W}_{t+1} + \mu_{t}U_{t+1}U_{t+1}^{T}\hat{W}_{t+1} = \mu_{t}U_{t+1}E_{t+1}^{T} - U_{t+1}Y_{1,t}^{T}.
\tag{3.43}
\]
Then, Ŵ_{t+1} can be obtained by solving the Lyapunov equation.
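One way to carry out this step numerically is to left-multiply Eq. (3.43) by the inverse of Ŝ (assuming Ŝ is invertible), which puts it in the standard Sylvester form A X + X B = Q handled by scipy.linalg.solve_sylvester. The sketch below follows that route and re-orthogonalizes the result to mimic the Orthogonal(·) step of Algorithm 3.1; it is an illustration under those assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def update_W_hat(S_hat, S, U_next, E_next, Y1, Y2, mu, delta, beta):
    """Solve Eq. (3.43) for W_hat_{t+1}.

    Rewrites  S_hat W B + C W = R   (B = Y2 + Y2^T, C = mu U U^T - 2 delta beta S)
    as the Sylvester equation (S_hat^{-1} C) W + W B = S_hat^{-1} R,
    assuming S_hat is invertible.
    """
    B = Y2 + Y2.T
    C = mu * U_next @ U_next.T - 2.0 * delta * beta * S
    R = mu * U_next @ E_next.T - U_next @ Y1.T
    S_hat_inv = np.linalg.inv(S_hat)
    W = solve_sylvester(S_hat_inv @ C, B, S_hat_inv @ R)
    # Re-orthogonalize the columns, mirroring Orthogonal(.) in Algorithm 3.1.
    Q, _ = np.linalg.qr(W)
    return Q
```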
Moreover, the Lagrange multipliers Y_1 and Y_2 are updated by the following scheme:
\[
\begin{cases}
Y_{1,t+1} = Y_{1,t} + \mu_{t}\big(\hat{W}_{t+1}^{T}U_{t+1} - E_{t+1}\big), \\
Y_{2,t+1} = Y_{2,t} + \mu_{t}\big(\hat{W}_{t+1}^{T}\hat{S}\hat{W}_{t+1} - I\big).
\end{cases}
\tag{3.44}
\]
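The dual updates in Eq. (3.44), together with the penalty schedule μ_{t+1} = min(ρ μ_t, μ_max) of Algorithm 3.1 below, amount to only a few lines of NumPy; the sketch assumes the primal variables of the current iteration have already been computed, and ρ and μ_max take the default values listed in the algorithm.

```python
import numpy as np

def update_multipliers(Y1, Y2, W_hat, X, Z, E, S_hat, mu, rho=1.3, mu_max=1e6):
    """Dual updates of Eq. (3.44) and the penalty update of Algorithm 3.1, step 8."""
    c = W_hat.shape[1]
    Y1_next = Y1 + mu * (W_hat.T @ X - W_hat.T @ X @ Z - E)
    Y2_next = Y2 + mu * (W_hat.T @ S_hat @ W_hat - np.eye(c))
    mu_next = min(rho * mu, mu_max)
    return Y1_next, Y2_next, mu_next
```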
Algorithm 3.1 Optimization of our proposed algorithm

Input: Feature matrices X, popularity score vector y, parameters λ, α, β, δ.
Initialize: Z_0 = E_0 = Y_{1,0} = Y_{2,0} = w = 0, t = 0, ρ = 1.3, μ_0 = 10^{-6}, μ_max = 10^{6}, t_max = 10^{3}.
1. Compute the covariance matrices S_{ij} by S_{ij} = X_i X_j^T;
2. Initialize Ŵ_0 by Ŵ_0 = arg max_{Ŵ} tr(Ŵ^T (S − Ŝ) Ŵ) / tr(Ŵ^T Ŝ Ŵ);
While not converged do
  3. Fix the other variables and update Z_{t+1}:
     Z_{t+1} = arg min_Z (1/(μ_t η_t)) ||Z||_* + (1/2) ||Z − Z_t + ∇_Z H/(μ_t η_t)||_F^2;
  4. Fix the other variables and update E_{t+1}:
     E_{t+1} = arg min_E (λ/μ_t) ||E||_1 + (1/2) ||E − Ŵ_t^T X + Ŵ_t^T X Z_{t+1} − Y_{1,t}/μ_t||_F^2;
  5. Fix the other variables and update w_{t+1}:
     w_{t+1} = (Z_{t+1} M M^T Z_{t+1}^T + 2 Z_{t+1} L Z_{t+1}^T + (2α/β) I)^{-1} Z_{t+1} M y;
  6. Fix the other variables and update Ŵ_{t+1} by solving
     Ŝ Ŵ_{t+1} (Y_{2,t} + Y_{2,t}^T) − 2δβ S Ŵ_{t+1} + μ_t U_{t+1} U_{t+1}^T Ŵ_{t+1} = μ_t U_{t+1} E_{t+1}^T − U_{t+1} Y_{1,t}^T,
     and then set Ŵ_{t+1} ← Orthogonal(Ŵ_{t+1});
  7. Update the multipliers Y_{1,t+1} and Y_{2,t+1}:
     Y_{1,t+1} = Y_{1,t} + μ_t (Ŵ_{t+1}^T X − Ŵ_{t+1}^T X Z_{t+1} − E_{t+1}),
     Y_{2,t+1} = Y_{2,t} + μ_t (Ŵ_{t+1}^T Ŝ Ŵ_{t+1} − I);
  8. Update the parameter μ_{t+1} by μ_{t+1} = min(ρ μ_t, μ_max);
  9. Check the convergence conditions;
End while
Output: Ŵ, E, Z, w.