where $\gamma_i$'s are the introduced variables that should satisfy $\sum_{i=1}^{M}\gamma_i = 1$, $\gamma_i > 0$, and the equality holds for $\gamma_i = |b_i| / \|\mathbf{b}\|_1$. Based on this preliminary, we can derive the following inequality:
\[
\Big(\sum_{v \in V} e_v \,\big\|\mathbf{W}_{G_v}\big\|\Big)^{2}
\;\le\;
\sum_{k=1}^{K}\sum_{v \in V}\frac{e_v^{2}\,\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}{q_{k,v}},
\tag{4.10}
\]
where $\sum_{k}\sum_{v} q_{k,v} = 1$, $q_{k,v} \ge 0$, $\forall k, v$, and $\mathbf{w}^{k}_{G_v}$ denotes the $k$-th row vector of the group matrix $\mathbf{W}_{G_v}$. It is worth noting that the equality holds when
\[
q_{k,v} \;=\; \frac{\sqrt{e_v^{2}\,\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}}{\displaystyle\sum_{k=1}^{K}\sum_{v \in V}\sqrt{e_v^{2}\,\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}}.
\tag{4.11}
\]
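To make the link to the preliminary explicit (assuming, as is standard for tree-guided group-lasso penalties, that $\|\mathbf{W}_{G_v}\|$ denotes the sum of the $\ell_2$ norms of its rows $\mathbf{w}^{k}_{G_v}$), we may index $\mathbf{b}$ by the pairs $(k,v)$ and set $b_{k,v} = e_v\|\mathbf{w}^{k}_{G_v}\|_2$, with $q_{k,v}$ playing the role of $\gamma_{k,v}$:
\[
\Big(\sum_{v\in V} e_v\big\|\mathbf{W}_{G_v}\big\|\Big)^{2}
=\Big(\sum_{k=1}^{K}\sum_{v\in V} e_v\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}\Big)^{2}
\;\le\;
\sum_{k=1}^{K}\sum_{v\in V}\frac{e_v^{2}\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}{q_{k,v}}.
\]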
Thus far, we have theoretically derived that minimizing the original objective with respect to $\mathbf{W}$ is equivalent to minimizing the following convex objective function:
\[
\min_{\mathbf{W},\,q_{k,v}}\;
\frac{1}{2}\big\|\mathbf{Y}-\mathbf{B}\mathbf{W}\big\|_{F}^{2}
+\frac{\lambda_1}{2}\sum_{s=1}^{S}\big\|\mathbf{X}_s-\mathbf{A}_s\mathbf{B}\big\|_{F}^{2}
+\frac{\lambda_2}{2}\sum_{s=1}^{S}\big\|\mathbf{A}_s\big\|_{F}^{2}
+\frac{\lambda_3}{2}\sum_{k=1}^{K}\sum_{v \in V}\frac{e_v^{2}\,\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}{q_{k,v}}.
\tag{4.12}
\]
To facilitate the computation of the derivative of the objective function with respect to $\mathbf{w}_t$ for the $t$-th task, we define a diagonal matrix $\mathbf{Q}^{t} \in \mathbb{R}^{K \times K}$ whose diagonal entries are given as follows:
\[
\mathbf{Q}^{t}_{kk} \;=\; \sum_{\{v \in V \,\mid\, t \,\in\, G_v\}} \frac{e_v^{2}}{q_{k,v}}.
\tag{4.13}
\]
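With this definition, the tree-guided term of Eq. (4.12) can be folded into per-task quadratic forms (here $\mathbf{w}^{k}_{G_v}$ is understood to stack the entries $w_{t,k}$ over the tasks $t \in G_v$), which is exactly the last term of the reformulated objective below:
\[
\sum_{t=1}^{T}\mathbf{w}_t^{\mathrm{T}}\mathbf{Q}^{t}\mathbf{w}_t
=\sum_{t=1}^{T}\sum_{k=1}^{K} w_{t,k}^{2}\sum_{\{v\in V\,\mid\, t\,\in\, G_v\}}\frac{e_v^{2}}{q_{k,v}}
=\sum_{k=1}^{K}\sum_{v\in V}\frac{e_v^{2}}{q_{k,v}}\sum_{t\in G_v} w_{t,k}^{2}
=\sum_{k=1}^{K}\sum_{v\in V}\frac{e_v^{2}\big\|\mathbf{w}^{k}_{G_v}\big\|_{2}^{2}}{q_{k,v}}.
\]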
We ultimately have the following objective function:
\[
\min_{\mathbf{W},\,\mathbf{Q}}\;
\frac{1}{2}\sum_{t=1}^{T}\big\|\mathbf{y}_t-\mathbf{B}\mathbf{w}_t\big\|_{2}^{2}
+\frac{\lambda_1}{2}\sum_{s=1}^{S}\big\|\mathbf{X}_s-\mathbf{A}_s\mathbf{B}\big\|_{F}^{2}
+\frac{\lambda_2}{2}\sum_{s=1}^{S}\big\|\mathbf{A}_s\big\|_{F}^{2}
+\frac{\lambda_3}{2}\sum_{t=1}^{T}\mathbf{w}_t^{\mathrm{T}}\mathbf{Q}^{t}\mathbf{w}_t.
\tag{4.14}
\]
The alternating optimization strategy is also applicable here. By fixing $\mathbf{Q}^{t}$, taking the derivative of the above formulation with respect to $\mathbf{w}_t$, and setting it to zero, we reach
\[
\mathbf{w}_t \;=\; \big(\mathbf{B}^{\mathrm{T}}\mathbf{B}+\lambda_3\,\mathbf{Q}^{t}\big)^{-1}\mathbf{B}^{\mathrm{T}}\mathbf{y}_t.
\tag{4.15}
\]
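For completeness, Eq. (4.15) follows from the per-task part of Eq. (4.14): the gradient with respect to $\mathbf{w}_t$,
\[
\frac{\partial}{\partial \mathbf{w}_t}\Big[\frac{1}{2}\big\|\mathbf{y}_t-\mathbf{B}\mathbf{w}_t\big\|_{2}^{2}+\frac{\lambda_3}{2}\,\mathbf{w}_t^{\mathrm{T}}\mathbf{Q}^{t}\mathbf{w}_t\Big]
=\mathbf{B}^{\mathrm{T}}\big(\mathbf{B}\mathbf{w}_t-\mathbf{y}_t\big)+\lambda_3\mathbf{Q}^{t}\mathbf{w}_t,
\]
vanishes exactly at this closed form, and $\mathbf{B}^{\mathrm{T}}\mathbf{B}+\lambda_3\mathbf{Q}^{t}$ is invertible whenever $\lambda_3 > 0$ and the diagonal entries of $\mathbf{Q}^{t}$ are strictly positive.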
Once we obtain all the $\mathbf{w}_t$, we can easily update $q_{k,v}$ via Eq. (4.11) and then recompute $\mathbf{Q}^{t}$ via Eq. (4.13).
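To make the alternating procedure concrete, the following is a minimal sketch in Python/NumPy under simplifying assumptions: the representation matrix $\mathbf{B}$ is treated as fixed (the $\mathbf{A}_s$ and $\mathbf{B}$ updates are omitted), and the names groups, e, and lam3 (for $\lambda_3$) are illustrative rather than taken from the reference implementation.

import numpy as np

# Minimal sketch of the alternating scheme in Eqs. (4.11)-(4.15).
# Assumptions (not from the original implementation): B is fixed, the
# A_s / dictionary updates are omitted, `groups` lists the task indices
# in each tree group G_v, `e` holds the node weights e_v, and `lam3`
# is the trade-off parameter lambda_3.
def alternating_w_update(B, Y, groups, e, lam3, n_iter=50):
    """B: (N, K) features, Y: (N, T) task labels, groups: list of task-index lists."""
    N, K = B.shape
    T = Y.shape[1]
    V = len(groups)
    W = np.zeros((K, T))                      # columns are the w_t
    q = np.full((K, V), 1.0 / (K * V))        # uniform init: sum_k sum_v q = 1
    BtB, BtY = B.T @ B, B.T @ Y
    for _ in range(n_iter):
        # Step 1: fix q (hence each Q^t) and solve Eq. (4.15) for every task t.
        for t in range(T):
            diag = np.zeros(K)
            for v, g in enumerate(groups):
                if t in g:                                  # nodes v with t in G_v
                    diag += e[v] ** 2 / q[:, v]             # Eq. (4.13)
            W[:, t] = np.linalg.solve(BtB + lam3 * np.diag(diag), BtY[:, t])
        # Step 2: fix W and update q via Eq. (4.11).
        norms = np.zeros((K, V))
        for v, g in enumerate(groups):
            norms[:, v] = e[v] * np.linalg.norm(W[:, g], axis=1)   # e_v * ||w^k_{G_v}||_2
        q = np.maximum(norms / max(norms.sum(), 1e-12), 1e-12)     # keep q strictly positive
    return W

The per-task solve mirrors Eq. (4.15) exactly; the small floor on q is only a numerical guard against zero rows and is not part of the derivation.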
4.4.2 TASK RELATEDNESS ESTIMATION
According to our assumption, the hierarchical tree structure of venue categories plays a pivotal role in boosting the learning performance of our model. Hence, the key issue is how to precisely