Restricted permutations generalized Bregman bound
Input: A,   Output: σ, n
1)  n ← 0
2)  Repeat
3)     n ← n + 1, σ_inv ← (0, 0, ..., 0), j ← 0
4)     Repeat
5)        j ← j + 1, C_j ← {i : A(i, j) > 0}
6)        If C_j = {i}, then σ_inv(j) ← i
7)        Else if there is i ∈ C_j where A(i, j) = r_i, then σ_inv(j) ← i
8)        Else
9)           Calculate M_3(A)
10)          For all i ∈ C_j, calculate M_3(A_{i,j})
11)          Draw U ← Unif([0, M_3(A)])
12)          If U > Σ_{i′=1}^{n} M_3(A_{i′,j}), then σ_inv ← reject
13)          Else let σ_inv(j) ← min{i : Σ_{i′=1}^{i} M_3(A_{i′,j}) > U}
14)     Until σ_inv = reject or j = n
15)  Until σ_inv ≠ reject
16)  Return σ ← σ_inv^{−1}, n
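To make the control flow concrete, here is a minimal Python sketch of the sampler. This is a sketch under our own names, not the book's code: the exact brute-force permanent stands in for the bound M_3, which makes step 12 accept every attempt; substituting any valid upper bound on the permanent, such as the generalized Bregman bound, turns the same code into the algorithm above. The forced-move shortcuts of steps 6-7 are omitted, and the entry weight A(i, j) multiplies each minor bound explicitly, which for a {0,1} matrix agrees with the sums in steps 12-13.

    import itertools
    import math
    import random

    def brute_permanent(A):
        # Exact permanent by enumerating permutations; only viable for tiny
        # matrices.  It stands in here for the upper bound M_3(A) of the text.
        n = len(A)
        return sum(math.prod(A[i][p[i]] for i in range(n))
                   for p in itertools.permutations(range(n)))

    def minor(A, i, j):
        # The matrix A_{i,j} of the text: A with row i and column j removed.
        return [[A[r][c] for c in range(len(A)) if c != j]
                for r in range(len(A)) if r != i]

    def sar_permutation(A, M=brute_permanent, rng=random):
        # One attempt of the inner loop (steps 5-13).  Returns sigma_inv,
        # mapping column j to its matched row, or None on rejection.
        n = len(A)
        rows = list(range(n))       # original indices of the unmatched rows
        B = [row[:] for row in A]   # working submatrix on columns j..n-1
        sigma_inv = [0] * n
        for j in range(n):
            # weights A(i, j) * M_3(A_{i,j}) over remaining rows (step 10)
            w = [B[r][0] * M(minor(B, r, 0)) for r in range(len(B))]
            U = rng.uniform(0, M(B))            # step 11
            if U > sum(w):                      # step 12: reject this attempt
                return None
            acc, pick = 0.0, 0
            for r, wr in enumerate(w):          # step 13: smallest prefix sum > U
                acc += wr
                if acc > U:
                    pick = r
                    break
            sigma_inv[j] = rows.pop(pick)
            B = minor(B, pick, 0)
        return sigma_inv

    A = [[1.0, 1.0, 0.0],
         [0.0, 1.0, 1.0],
         [1.0, 0.0, 1.0]]
    sigma_inv = None
    while sigma_inv is None:       # steps 2-15: retry until an attempt succeeds
        sigma_inv = sar_permutation(A)
    print(sigma_inv)               # uniform over permutations avoiding the zeros

The correctness hinges on the expansion per(A) = Σ_i A(i, j) per(A_{i,j}) along column j: choosing row i with probability proportional to A(i, j)M_3(A_{i,j}) and rejecting the excess mass below M_3(A) yields σ with probability proportional to ∏_i A(i, σ(i)).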
9.2.5 Lower bounds on permanents
To give an upper bound on the running time of the algorithm, it is necessary to show
that M(A)/per(A) is small. This requires a lower bound on the permanent.
In the 1920s, van der Waerden conjectured the following lower bound. This con-
jecture was independently shown to be true by Falikman [34] and Egorychev [33].
Theorem 9.2 (Falikman-Egorychev Theorem). Let A be a nonnegative matrix whose
row and column sums all equal 1. Then per(A) ≥ n!/n^n.
A matrix whose row and column sums are 1 is called doubly stochastic. The theo-
rem states that the doubly stochastic matrix with the smallest value of the permanent
is the one where every entry in the matrix is 1/n. That is, the mass of the entries is
spread out as far as possible.
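As a quick numerical check (reusing the brute_permanent helper from the sketch above), for n = 4 the all-(1/n) matrix attains the bound n!/n^n exactly:

    import math

    n = 4
    J = [[1.0 / n] * n for _ in range(n)]   # the conjectured (and true) minimizer
    print(brute_permanent(J))               # 0.09375
    print(math.factorial(n) / n**n)         # 0.09375 = 4!/4^4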
In [52], it was considered how to use this theorem to bound the running time of
SAR using the generalized Bregman bound.
Definition 9.2. Say that a matrix is regular if all the row and column sums are the
same.
For an n by n matrix A that is regular with common row sum r, dividing each row
by r gives a doubly stochastic matrix. Hence per(A) ≥ r^n · n!/n^n. This gave rise to the
following result.
Lemma 9.8. Suppose that each row and column sum of a {0,1} matrix A equals γn
for γ ∈ [1/n, 1]. Then Restricted permutations generalized Bregman bound(A)
takes on average at most O(n^{1.5+0.5/γ}) steps.
Proof. From our earlier AR theory, the number of trials will be bounded above by
   ∏_i h_1(r) / [r^n n!/n^n] = (n^n/n!)(h_1(r)/r)^n.

By Stirling's inequality, n! ≥ (n/e)^n √(2πn). Hence n^n/n! ≤ e^n (2πn)^{−1/2}. Since r ≥ 1, h_1(r) = e^{−1}[r + (1/2)ln(r) + exp(1) − 1], so

   h_1(r)/r = e^{−1}[1 + ((1/2)ln(r) + e − 1)/r] ≤ e^{−1} e^{[(1/2)ln(r)+e−1]/r}.

Therefore,

   (n^n/n!) · (h_1(r)/r)^n ≤ [e^n/√(2πn)] · e^{−n} exp([(1/2)ln(r) + e − 1](n/r))
      = (2πn)^{−1/2} (γn)^{0.5/γ} e^{(e−1)/γ}
      = O(n^{−0.5+0.5/γ}).

Each step requires at most n² operations to compute the bounds, making the
running time (on average) at most O(n^{1.5+0.5/γ}).
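The bound in the proof is easy to sanity-check numerically. The following sketch (function names ours) evaluates the exact expression (n^n/n!)(h_1(r)/r)^n in log space and compares it to the closed form (2πn)^{−1/2}(γn)^{0.5/γ}e^{(e−1)/γ} derived above; the first value should never exceed the second.

    import math

    def h1(r):
        # h_1(r) = e^{-1}[r + (1/2)ln(r) + e - 1], valid for r >= 1
        return math.exp(-1) * (r + 0.5 * math.log(r) + math.e - 1)

    def trial_bound_exact(n, gamma):
        # (n^n / n!) * (h_1(r)/r)^n with r = gamma * n, computed via logs
        r = gamma * n
        log_b = n * math.log(n) - math.lgamma(n + 1) + n * math.log(h1(r) / r)
        return math.exp(log_b)

    def trial_bound_closed(n, gamma):
        # (2 pi n)^{-1/2} * (gamma n)^{0.5/gamma} * e^{(e-1)/gamma}
        return ((2 * math.pi * n) ** -0.5
                * (gamma * n) ** (0.5 / gamma)
                * math.exp((math.e - 1) / gamma))

    for n in (10, 100, 1000):
        print(n, trial_bound_exact(n, 0.5), trial_bound_closed(n, 0.5))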
9.2.6 Sinkhorn balancing
So how does the lower bound help when the matrix is not regular? Recall that if
a row (or column) of the matrix is multiplied by a constant, then the permanent is
multiplied by the same constant.
Therefore, by dividing each row by its sum, it is possible to make a matrix whose
row sums are all 1, and whose permanent is related in a known way to that of the
original matrix.
However, the column sums might not be 1. So divide the columns by their sums.
This might have messed up the rows, however. So next divide the rows by their sum.
Continue moving back and forth between rows and columns until the matrix is almost
regular. This procedure is known as Sinkhorn balancing [116].
More precisely, if X and Y are diagonal matrices, then XAY is a scaling of A
where per(A) = per(XAY)/[per(X)per(Y)]. From the definition, the permanent of a
diagonal matrix is just the product of its diagonal entries. By keeping track of the
scaling done to rows and columns at each step, when Sinkhorn balancing is ended
the matrices X and Y will be known.
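For a small random example, the scaling identity can be confirmed by brute force (a sketch reusing brute_permanent from above; note that per(X) and per(Y) reduce to the products of the diagonal entries):

    import math
    import random

    n = 4
    A = [[random.random() for _ in range(n)] for _ in range(n)]
    x = [random.uniform(0.5, 2.0) for _ in range(n)]   # diagonal of X
    y = [random.uniform(0.5, 2.0) for _ in range(n)]   # diagonal of Y
    XAY = [[x[i] * A[i][j] * y[j] for j in range(n)] for i in range(n)]
    lhs = brute_permanent(A)
    rhs = brute_permanent(XAY) / (math.prod(x) * math.prod(y))
    assert math.isclose(lhs, rhs)    # per(A) = per(XAY)/[per(X)per(Y)]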
The question then becomes: how many steps must be taken before the resulting
matrix is close to balanced? To answer this question, it is necessary to have a notion
of how close the matrix is to being doubly stochastic.
Definition 9.3. Say that diagonal matrices X and Y scale A to accuracy α if both the
row and column sums of XAY all fall in [1 − α, 1 + α].
In [75], it was shown that the ellipsoid method could be used to obtain such a
scaling in O(n⁴ ln(ln(1/α)·n/α)) time. While a useful theoretical result, the ellipsoid
method is very complex to code and can suffer from numerical instability.
In practice, simply employing the method of rescaling the rows, then the
columns, then the rows again, and so on until accuracy α is reached, is much faster.
It can be shown to require only O((α^{−1} + ln(n))·√n·log(1/α)) steps, and each step
takes Θ(n²) time. This method is called Sinkhorn balancing.
Sinkhorn balancing
Input: A, α   Output: A, s
1)  s ← 1
2)  Repeat
3)     Let r_i be the sum of row i of A
4)     Divide row i of A by r_i, multiply s by r_i
5)     Let c_i be the sum of column i of A
6)     Divide column i of A by c_i, multiply s by c_i
7)  Until max_i {|r_i − 1|, |c_i − 1|} ≤ α
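A direct transcription of this pseudocode into Python might look as follows (a sketch: the function name is ours, and convergence requires per(A) > 0):

    import random

    def sinkhorn_balance(A, alpha):
        # Alternately rescale rows and columns until every row and column sum,
        # measured at the top of a pass, lies within alpha of 1.  Returns the
        # balanced matrix and the factor s with per(input A) = s * per(result).
        n = len(A)
        A = [row[:] for row in A]
        s = 1.0
        while True:
            r = [sum(A[i]) for i in range(n)]                       # row sums
            for i in range(n):
                A[i] = [v / r[i] for v in A[i]]
                s *= r[i]
            c = [sum(A[i][j] for i in range(n)) for j in range(n)]  # column sums
            for j in range(n):
                for i in range(n):
                    A[i][j] /= c[j]
                s *= c[j]
            if max(abs(v - 1.0) for v in r + c) <= alpha:
                return A, s

    A = [[random.uniform(0.1, 1.0) for _ in range(5)] for _ in range(5)]
    B, s = sinkhorn_balance(A, 1e-6)
    print([round(sum(row), 4) for row in B])   # every row sum close to 1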
Now for matrices B with row and column sums exactly 1, per(B) ≥ n!/n^n. When
the sums are only within additive accuracy α of 1, the following holds.
Lemma 9.9. Let B = XAY be A scaled to accuracy α ≤ 0.385/n. Then

   per(B) ≥ (n!/n^n) exp(−4n²α).    (9.27)
See [86] for the proof. So for α = 0.25/n², only a factor of e is lost in the running
time of the acceptance/rejection result, since exp(−4n²·(0.25/n²)) = e^{−1}. Sinkhorn
balancing takes O(n^{4.5} ln(n)) steps to accomplish this level of accuracy.
Now that the matrix has been made nearly regular, what is the best way to generate
samples? Well, Minc's inequality works best when the row sums are as large as
possible. To make them larger, simply multiply each row by the inverse of its largest
element. That makes the row sums as large as possible while keeping all of the
elements of the matrix in the set [0,1].
Next, the SAR method using the generalized Bregman bound can be applied to
the regularized matrix to obtain an estimate of per(A).
Estimate permanent
Input: A, k   Output: â
1)  Let n be the number of rows and columns of A
2)  (A, s) ← Sinkhorn balancing(A, 0.25/n²)
3)  For every row i of A
4)     c ← max_j A(i, j)
5)     Divide row i of A by c, multiply s by c
6)  R ← 0
7)  Repeat k times
8)     (σ, n) ← Restricted permutations generalized Bregman bound(A)
9)     Draw R_1, ..., R_n iid Exp(1)
10)    R ← R + R_1 + ··· + R_n
11) â ← M_3(A)·s·(k − 1)/R
Recall from Section 2.5 that the reason for drawing the exponentials at each step is
so that â/per(A) is k − 1 times an inverse gamma random variable with parameters k
and 1. Therefore E[â] = per(A), and the relative error of the estimate does not depend
on per(A) at all.
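Assembling the earlier sketches gives the following illustration of the estimator (with the exact permanent again standing in for M_3, so every attempt is accepted; the Exp(1) bookkeeping is identical either way, and k must be at least 2):

    import random

    def estimate_permanent(A, k, M=brute_permanent):
        # SAR-based estimate of per(A), following the pseudocode above.
        # M must be the same upper bound used inside sar_permutation.
        n = len(A)
        B, s = sinkhorn_balance(A, 0.25 / n**2)
        for i in range(n):                    # push each row maximum up to 1
            c = max(B[i])
            B[i] = [v / c for v in B[i]]
            s *= c
        R = 0.0
        for _ in range(k):
            attempts = 0                      # the n returned in step 8
            while True:
                attempts += 1
                if sar_permutation(B, M) is not None:
                    break
            R += sum(random.expovariate(1.0) for _ in range(attempts))
        return M(B) * s * (k - 1) / R

    A = [[1.0, 1.0, 0.0],
         [0.0, 1.0, 1.0],
         [1.0, 0.0, 1.0]]
    print(estimate_permanent(A, 20))   # fluctuates around per(A) = 2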
The balancing at the beginning only needs to be done once, so the running time
of the procedure will be the time to do the balancing plus the time to do the accep-
tance/rejection part. The first step in bounding this running time is to show that the
maximum element of each row in the scaled matrix is small when the original matrix
A is sufficiently dense.
Lemma 9.10. Let γ ∈ (1/2, 1], and let A be an n-by-n matrix with entries in [0,1] and
at least γn entries equal to 1 in every row and column. Let B = XAY with X and Y
diagonal matrices where B is doubly stochastic with accuracy α. Then

   max_{i,j} B(i, j) ≤ (1 + α)² / [(2γ − 1)n − 3nα − 2].    (9.28)
Proof. Fix i and j in {1, ..., n}. Introduce S_r = {j′ ≠ j : A(i, j′) = 1} and S_c = {i′ ≠
i : A(i′, j) = 1}. Given the density of A, #S_r and #S_c are at least γn − 1. Break A into
four submatrices based on S_r and S_c. For subsets S_1 and S_2 of {1, ..., n}, let A(S_1, S_2)
be the submatrix of A that contains the elements that are in rows of S_1 and columns of
S_2. Then

   A_1 = A(S_r, S_c),  A_2 = A(S_r, S_c^C),  A_3 = A(S_r^C, S_c),  A_4 = A(S_r^C, S_c^C),

and B_1, B_2, B_3, B_4 are the corresponding submatrices of B.
Let s(C) denote the sum of the entries of a matrix C. Then since B is doubly
stochastic to accuracy α, each row and column adds up to at least 1 − α. Hence

   s(B_1) + s(B_2) ≥ #S_c (1 − α)
   s(B_1) + s(B_3) ≥ #S_r (1 − α).

And since each row adds up to at most 1 + α:

   s(B_1) + s(B_2) + s(B_3) + s(B_4) ≤ n(1 + α)
   ⟹ #S_c (1 − α) + #S_r (1 − α) − s(B_1) + s(B_4) ≤ n(1 + α).
Since the submatrix B_4 includes B(i, j), B(i, j) ≤ s(B_4) and

   B(i, j) ≤ n(1 + α) + s(B_1) − (#S_c + #S_r)(1 − α).    (9.29)
From the density assumption, #S_c + #S_r ≥ 2γn − 2. To upper bound s(B_1), first
define x(i) = X(i, i) and y(i) = Y(i, i) so that B(i′, j′) = x(i′)A(i′, j′)y(j′). Then

   s(B_1) = Σ_{i′∈S_c} Σ_{j′∈S_r} x(i′)A(i′, j′)y(j′) ≤ [Σ_{i′∈S_c} x(i′)]·[Σ_{j′∈S_r} y(j′)].    (9.30)
Given B is doubly stochastic to within accuracy α,

   Σ_{i′=1}^{n} x(i′)A(i′, j)y(j) ≤ 1 + α  and  Σ_{j′=1}^{n} x(i)A(i, j′)y(j′) ≤ 1 + α.

This gives

   Σ_{i′∈S_c} x(i′) ≤ −B(i, j) + Σ_{i′=1}^{n} x(i′)A(i′, j) ≤ −B(i, j) + y(j)^{−1}(1 + α)
   Σ_{j′∈S_r} y(j′) ≤ −B(i, j) + Σ_{j′=1}^{n} A(i, j′)y(j′) ≤ −B(i, j) + x(i)^{−1}(1 + α).

So s(B_1) ≤ x(i)^{−1} y(j)^{−1} (1 + α)². Now when A(i, j) = 0, B(i, j) = x(i)A(i, j)y(j) =
0 as well. When A(i, j) = 1, B(i, j) = x(i)y(j). So s(B_1) ≤ B(i, j)^{−1}(1 + α)².
At this point we have

   0 ≤ B(i, j) ≤ n(1 + α) + B(i, j)^{−1}(1 + α)² − (2γn − 2)(1 − α).    (9.31)

Solving gives

   B(i, j) ≤ (1 + α)² / [(2γ − 1 − 2γα − α)n − 2].    (9.32)

Using 2γ + 1 ≤ 3 finishes the proof.
Lemma 9.11. Let B be an n-by-n nonnegative matrix with max_{i,j} B(i, j) ≤ [(2γ −
1)n]^{−1} + o(n^{−1}) that is doubly stochastic to accuracy 0.385/n². Then for C the matrix
formed by dividing each row of B by its maximum value,

   M_3(C)/per(C) ≤ 5(2πn)^{−1/2} (n(2γ − 1) + o(n))^{0.5/(2γ−1)} e^{(e−1)/(2γ−1)}.    (9.33)
Proof. By Lemma 9.9, per(B) ≥ n! n^{−n} exp(−4n²(0.385/n²)) ≥ 0.2 n! n^{−n}. Let m_i
denote the maximum element of row i of B. Then per(B) = per(C) ∏_i m_i, so

   per(C) ≥ 0.2 n! n^{−n} ∏_i m_i^{−1}.

For the upper bound, the rows of C add to at least [1 − 0.385/n²]/m_i. Since
for r ≥ 1, h_1(r) = e^{−1}[r + (1/2)ln(r) + e − 1], that gives the upper bound

   M_3(C) ≤ e^{−n} ∏_i (m_i^{−1} + (1/2)ln(m_i^{−1}) + e − 1).

On the other hand, 0.2 n! n^{−n} ≥ 0.2 e^{−n} √(2πn). Hence

   M_3(C)/per(C) ≤ [e^{−n} ∏_i (m_i^{−1} + (1/2)ln(m_i^{−1}) + e − 1)] / [0.2 √(2πn) e^{−n} ∏_i m_i^{−1}]
      = 5(2πn)^{−1/2} ∏_i [1 + ((1/2)ln(m_i^{−1}) + e − 1)/m_i^{−1}]
      ≤ 5(2πn)^{−1/2} exp(Σ_i [(1/2)ln(m_i^{−1}) + e − 1]/m_i^{−1})
      ≤ 5(2πn)^{−1/2} exp(Σ_i [(1/2)ln(n(2γ − 1) + o(n)) + e − 1]/(n(2γ − 1) + o(n)))
      ≤ 5(2πn)^{−1/2} exp((2γ − 1)^{−1}[(1/2)ln(n(2γ − 1) + o(n)) + e − 1])
      = 5(2πn)^{−1/2} (n(2γ − 1) + o(n))^{0.5/(2γ−1)} e^{(e−1)/(2γ−1)}.