Appendix A

Matrix Analysis

A.1 Vector Spaces and Hilbert Space

Finite-dimensional random vectors are the basic building blocks of many applications. Halmos (1958) [1472] is the standard reference; we take only the most elementary material from it.


Definition A.1 (Vector space)
A vector space is a set Ω of elements called vectors satisfying the following axioms. (A) To every pair x and y of vectors in Ω, there corresponds a vector x + y, called the sum of x and y, in such a way that
  • addition is commutative, x + y = y + x,
  • addition is associative, x + (y + z) = (x + y) + z,
  • there exists in Ω a unique vector 0 (called the origin) such that x + 0 = x for every x, and
  • to every vector x in Ω there corresponds a unique vector −x such that x + (−x) = 0.
(B) To every pair, α and x, where α is a scalar and x is a vector in Ω, there corresponds a vector αx in Ω, called the product of α and x, in such a way that
1. multiplication by scalars is associative, α(βx) = (αβ)x, and
2. 1x = x.
(C)
1. Multiplication by scalars is distributive with respect to vector addition, α(x + y) = αx + αy, and
2. Multiplication by vectors is distributive with respect to scalar addition, (α + β)x = αx + βx.

These axioms are not claimed to be logically independent.


Example A.1
1. Let ℂ be the set of all complex numbers. If we regard x + y and αx as ordinary complex numerical addition and multiplication, ℂ becomes a complex vector space.
2. Let ℂⁿ be the set of all n-tuples of complex numbers. If

x = (ξ1, ···, ξn),  y = (η1, ···, ηn)

are elements of ℂⁿ, we write, by definition,

x + y = (ξ1 + η1, ···, ξn + ηn),  αx = (αξ1, ···, αξn).

ℂⁿ is a vector space since all parts of our axioms are satisfied; it will be called n-dimensional complex coordinate space.
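As a concrete illustration of these axioms, the following sketch (using NumPy, an assumption of this example, not part of the text) performs componentwise addition and scalar multiplication in ℂ³:

```python
import numpy as np

# Two vectors in C^3 and a complex scalar (hypothetical example values)
x = np.array([1 + 2j, 0 + 1j, 3 + 0j])
y = np.array([2 - 1j, 1 + 1j, -1 + 4j])
alpha = 2 - 3j

# Componentwise sum and scalar product, as in the definition
s = x + y
p = alpha * x

# Commutativity and distributivity hold componentwise
assert np.allclose(x + y, y + x)
assert np.allclose(alpha * (x + y), alpha * x + alpha * y)
```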

 


Definition A.2 (Linear dependence and linear independence)
A finite set {xi} of vectors is linearly dependent if there exists a corresponding set {αi} of scalars, not all zero, such that

∑i αi xi = 0.

If, on the other hand, ∑i αi xi = 0 implies that αi = 0 for each i, the set {xi} is linearly independent.

 


Theorem A.1 (Linear combination)
The set of nonzero vectors x1, ···, xn is linearly dependent if and only if some xk, 2 ≤ k ≤ n, is a linear combination of the preceding ones.
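A practical test for linear dependence (a NumPy sketch, not from the source): a set of vectors is linearly independent exactly when the matrix having them as columns has full column rank.

```python
import numpy as np

# A set {x1, ..., xk} in C^n is linearly independent iff the matrix with
# the xi as columns has rank k (the example vectors are hypothetical).
def linearly_independent(vectors):
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == len(vectors)

x1 = np.array([1.0, 0.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0])
x3 = x1 + 2 * x2          # a linear combination of the preceding ones

assert linearly_independent([x1, x2])
assert not linearly_independent([x1, x2, x3])
```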

 


Definition A.3 (Finite-dimensional)
A (linear) basis (or a coordinate system) in a vector space Ω is the set Ξ of linearly independent vectors such that every vector in Ω is a linear combination of elements of Ξ. A vector space Ω is finite-dimensional if it has a finite basis.

Recall that the basic building block in “Big Data” is a random vector, which is defined in a finite-dimensional vector space. The dimension may be high, but it is still finite. High-dimensional data processing is critical to many modern applications.


Theorem A.2 (Basis)
If Ω is a finite-dimensional vector space and if {y1, ···, ym} is any set of linearly independent vectors in Ω, then, unless the y's already form a basis, we can find vectors ym+1, ···, ym+p so that the totality of the y's, that is, {y1, ···, ym, ym+1, ···, ym+p}, is a basis. In other words, every linearly independent set can be extended to a basis.

 


Theorem A.3 (Dimension)
The number of elements in any basis of a finite-dimensional vector space Ω is the same as in any other basis.

 


Definition A.4 (Isomorphism)
Two vector spaces Ω1 and Ω2 (over the same field) are isomorphic if there is a one-to-one correspondence between the vectors x of Ω1 and the vectors y of Ω2, say y = T(x), such that

T(α1x1 + α2x2) = α1T(x1) + α2T(x2).

In other words, Ω1 and Ω2 are isomorphic if there is an isomorphism (such as T) between them, where an isomorphism is a one-to-one correspondence that preserves all linear relations.

 


Theorem A.4 (Isomorphism)
Every n-dimensional vector space Ω over a field 𝔽 is isomorphic to 𝔽ⁿ.

 


Definition A.5 (Subspaces)
A nonempty subset Δ of a vector space Ω is a subspace or a linear manifold if along with every pair, x and y, of vectors contained in Δ, every linear combination αx + βy is also contained in Δ.

 


Theorem A.5 (Intersection of Subspaces)
The intersection of any collection of subspaces is a subspace.

 


Theorem A.6 (Dimension of Subspace)
A subspace Δ of an n-dimensional vector space Ω is a vector space of dimension ≤ n.

 


Definition A.6 (Linear functional)
A linear functional on a vector space Ω is a scalar-valued function y, defined for every vector x, with the property that (identically in the vectors x1 and x2 and the scalars α1 and α2)

y(α1x1 + α2x2) = α1y(x1) + α2y(x2).


If y1 and y2 are linear functionals on Ω and α1 and α2 are scalars, let us define the function y by

y(x) = α1y1(x) + α2y2(x).

It is easy to check that y is also a linear functional; we denote it by α1y1 + α2y2. With these definitions of the linear concepts (zero, addition, scalar multiplication), the set Ω′ of all linear functionals on Ω forms a vector space, the dual space of Ω.

A.2 Transformations


Definition A.7 (Linear transformation)
A linear transformation (or operator) A on a vector space Ω is a correspondence that assigns to every vector x in Ω a vector Ax in Ω, in such a way that

A(αx + βy) = αAx + βAy

identically in the vectors x and y and the scalars α and β.

 


Theorem A.7 (Linear transformation)
The set of all linear transformations on a vector space is itself a vector space.

Linear transformations can be regarded as vectors.


Definition A.8 (Matrices)
Let Ω be an n-dimensional vector space, let {x1, ···, xn} be any basis of Ω, and let A be a linear transformation on Ω. Since every vector is a linear combination of the xi, we have in particular

Axj = ∑i αij xi,  j = 1, ···, n.

The set (αij) of n² scalars, indexed with the double subscript i, j, is the matrix of A in the coordinate system {x1, ···, xn}. A matrix (αij) is usually written in the form of a square array:

[ α11  α12  ···  α1n ]
[ α21  α22  ···  α2n ]
[  ⋮     ⋮          ⋮  ]
[ αn1  αn2  ···  αnn ]

The scalars (α1j, ···, αnj) form a column of A, and (αi1, ···, αin) form a row.
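The defining relation Axj = ∑i αij xi can be checked numerically. The sketch below (NumPy assumed; the transformation and basis are hypothetical) computes the matrix of A in a non-standard basis:

```python
import numpy as np

# Matrix of a linear transformation A in a (non-standard) basis {x1, x2}:
# solve A xj = sum_i alpha_ij xi, i.e. alpha = X^{-1} (A X).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # A in the standard basis
X = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns are the basis vectors x1, x2

alpha = np.linalg.solve(X, A @ X)   # matrix of A in the basis {x1, x2}

# Check the defining relation A xj = sum_i alpha_ij xi for each j
for j in range(2):
    assert np.allclose(A @ X[:, j], X @ alpha[:, j])
```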

A.3 Trace

The trace function, Tr A = ∑i aii, satisfies the following properties [110] for matrices A, B, C, D, X, vector x, and scalar α:

Tr(A + B) = Tr A + Tr B,  Tr(αA) = α Tr A,  Tr(AB) = Tr(BA),
Tr(ABCD) = Tr(BCDA) = Tr(CDAB) = Tr(DABC),  xᴴXx = Tr(Xxxᴴ).   (A.1)

To prove the last property, note that since xᴴXx is a scalar, the left side of (A.1) is its own trace:

xᴴXx = Tr(xᴴXx) = Tr(Xxxᴴ),

where the last step uses Tr(AB) = Tr(BA).
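These trace identities are easy to verify numerically; the following NumPy sketch (the random test matrices are assumptions, not from the source) checks each one:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))
x = rng.standard_normal(n)
alpha = 2.5

# Linearity of the trace
assert np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))
assert np.isclose(np.trace(alpha * A), alpha * np.trace(A))
# Commutation under the trace, and cyclic invariance
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
assert np.isclose(np.trace(A @ B @ C @ D), np.trace(D @ A @ B @ C))
# A quadratic form written as a trace
assert np.isclose(x @ A @ x, np.trace(A @ np.outer(x, x)))
```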

A.4 Basics of C*-Algebra

C*-algebras (pronounced “C-star”) are an important area of research in functional analysis. The prototypical example of a C*-algebra is a complex algebra 𝒜 of linear operators on a complex Hilbert space with two additional properties:

1. 𝒜 is a topologically closed set in the norm topology of operators.
2. 𝒜 is closed under the operation of taking adjoints of operators.

C*-algebras [138, 1473] are now an important tool in the theory of unitary representations of locally compact groups, and are also used in algebraic formulations of quantum mechanics. Another active area of research is the program to obtain classification, or to determine the extent to which classification is possible. It is through this latter area that the connection with our interest in hypothesis detection is made.

A.5 Noncommutative Matrix-Valued Random Variables

We mainly follow [11] in this section, but with different notation that is convenient for our context. Random variables are functions defined on a measure space, and they are often identified with their distributions in probability theory [11]. In the simplest case, when the random variables are real valued, the distribution is a probability measure on the real line. In this appendix, probability distributions can be represented by means of linear Hilbert space operators as well. (In this appendix an operator is an infinite-dimensional matrix.) This observation is as old as quantum mechanics; the standard probabilistic interpretation of the quantum mechanical formalism is related.

In an algebraic generalization, elements of a typically noncommutative algebra, together with a linear functional on the algebra, are regarded as noncommutative random variables. The linear functional evaluated on such an element is its expectation value; applying the functional to powers of the element yields the moments of the noncommutative random variable. One does not distinguish between two random variables when they have the same moments. A genuinely new feature of this theory occurs when these noncommutative random variables are truly noncommuting with each other. Then one cannot have a joint distribution in the sense of classical probability theory, but a functional on the algebra of polynomials in noncommuting indeterminates may serve as an abstract concept of joint distribution [11]. Random matrices, with respect to the expectation of their trace, are natural noncommuting (matrix-valued) random variables.

Random variables over a probability space form an algebra. Indeed, they are measurable functions defined on a set Ω, and so are the product and sum of two of them, that is, AB and A + B. As mentioned before, the expectation value E[·] is a linear functional on this algebra. The algebraic approach to probability stresses this point. An algebra over a field is a vector space equipped with a bilinear vector product. That is to say, it is an algebraic structure consisting of a vector space together with an operation, usually called multiplication, that combines any two vectors to form a third vector; to qualify as an algebra, this multiplication must satisfy certain compatibility axioms with the given vector space structure, such as distributivity. In other words, an algebra over a field is a set together with operations of multiplication, addition, and scalar multiplication by elements of the field [1474].

If 𝒜 is a unital algebra (a vector space as defined above) over the complex numbers and φ is a linear functional on 𝒜 such that

φ(1) = 1,

then (𝒜, φ) will be called a noncommutative probability space, and an element A of 𝒜 will be called a noncommutative random variable. Of course, a random matrix is such a noncommutative random variable. The number φ(Aᵏ) is called the k-th moment of a noncommutative random variable.


Example A.2 (Bounded operators [11])
Let B(ℋ) denote the algebra of all bounded operators acting on a Hilbert space ℋ. If the linear functional φ is defined by means of a unit vector ξ ∈ ℋ as

φ(A) = ⟨ξ, Aξ⟩,

then any element of B(ℋ) is a noncommutative random variable.

If A ∈ B(ℋ) is, further, self-adjoint (or Hermitian in the finite-dimensional case), then a probability measure is associated with A and φ, as mentioned above. The algebra used in the definition of a noncommutative random variable is often replaced with a *-algebra. In fact, B(ℋ) is a *-algebra, if the operation A* stands for the adjoint of A. The most familiar example of a *-algebra is the field of complex numbers ℂ, where * is just complex conjugation. Another example is the algebra of n × n matrices over ℂ, with * given by the conjugate transpose. Its generalization, the Hermitian adjoint of a linear operator on a Hilbert space, also yields a *-algebra (or star-algebra).

A *-algebra is a unital algebra over the complex numbers which is equipped with an involution *. The involution recalls the adjoint operation of Hilbert space operators as follows:

1. A ↦ A* is conjugate linear,
2. (AB)* = B*A*,
3. A** = A.

When (𝒜, φ) is a noncommutative probability space over a *-algebra 𝒜, φ is always assumed to be a state on 𝒜, that is, a linear functional such that

1. φ(1) = 1,
2. φ(A*) is the complex conjugate of φ(A), and φ(A*A) ≥ 0 for every A ∈ 𝒜.

A matrix X whose entries are (classical) random variables on a (classical) probability space is called a random matrix; an example is a sample covariance matrix XXᴴ. Here ᴴ stands for the conjugate transpose (Hermitian adjoint) of a complex matrix.

Random matrices form a *-algebra. For example, consider X11, X12, X21, X22 to be four bounded (classical) scalar random variables on a probability space. Then

X = [ X11  X12 ]
    [ X21  X22 ]

is a bounded 2 × 2 random matrix. The set 𝒳 of all such matrices has a *-algebra structure under the usual matrix operations, and (𝒳, φ) is a noncommutative probability space when, for example,

φ(X) = E[½ Tr X].
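A minimal Monte Carlo sketch of this construction (NumPy assumed; the uniform entry distribution and the normalized-trace state φ(X) = E[½ Tr X] are illustrative assumptions, not prescribed by the text):

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(sample_fn, k, trials=20000):
    # Estimate the k-th moment phi(X^k) = E[(1/2) Tr X^k] by Monte Carlo
    acc = 0.0
    for _ in range(trials):
        X = sample_fn()
        acc += np.trace(np.linalg.matrix_power(X, k)).real / 2
    return acc / trials

def sample():
    # Entries uniform on [-1, 1]: bounded classical random variables
    return rng.uniform(-1, 1, size=(2, 2))

m1 = phi(sample, 1)   # first moment, near 0 by symmetry
m2 = phi(sample, 2)   # second moment, strictly positive
assert abs(m1) < 0.05
assert m2 > 0
```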

A C*-algebra is a *-algebra 𝒜 which is endowed with a norm such that

||A*A|| = ||A||²,

and, furthermore, 𝒜 is a Banach space with respect to this norm.

Gelfand and Naimark give two important theorems concerning the representation of C*-algebras.

1. A commutative, unital C*-algebra is isometrically isomorphic to the algebra of all continuous complex functions on a certain compact Hausdorff space, if the function space is equipped with the supremum norm and the involution of pointwise conjugation.
2. A general C*-algebra is isometrically isomorphic to an algebra of operators on a Hilbert space, equipped with the operator norm and the involution given by the adjoint.

Combination of the above two theorems yields a form of the spectral theorem. For a linear functional φ of a C*-algebra,

images

is equivalent to

images

A noncommutative probability space (𝒜, φ) will be called a C*-probability space when 𝒜 is a C*-algebra and φ is a state on 𝒜.

All real bounded classical scalar random variables may be considered as noncommutative random variables.

A.6 Distances and Projections

For projections, we freely use [1475]. Let B(ℋ) denote the algebra of linear operators acting on a finite-dimensional Hilbert space ℋ. The von Neumann entropy of a state ρ, that is, a positive operator of unit trace in B(ℋ), is given by S(ρ) = −Tr ρ log ρ.

For A ∈ B(ℋ), the absolute value |A| is defined as |A| = (AᴴA)^(1/2), and it is a positive matrix. The trace norm of A − B is defined as

||A − B||1 = Tr|A − B|.

This trace norm ||A − B||1 is a natural distance between complex n × n matrices A, B ∈ Mn. Similarly, the Hilbert-Schmidt norm

||A − B||2 = (Tr[(A − B)ᴴ(A − B)])^(1/2)

is also a natural distance. We can define the p-norm as

||A||p = (Tr|A|^p)^(1/p).

It was von Neumann who first showed that the Hölder inequality remains true in the matrix setting:

|Tr(AB)| ≤ ||A||p ||B||q,  1/p + 1/q = 1.

If A is self-adjoint and written as

A = ∑i λi ei eiᴴ,

where the vectors ei form an orthonormal basis, then the spectral projection {A ≥ 0} is defined as

{A ≥ 0} = ∑_{i: λi ≥ 0} ei eiᴴ.

Then A = A{A ≥ 0} + A{A < 0} = A₊ + A₋ and |A| = A{A ≥ 0} − A{A < 0} = A₊ − A₋. This decomposition is called the Jordan decomposition of A. Corresponding definitions apply for the other spectral projections {A < 0}, {A > 0}, and {A ≤ 0}, and, for two operators, {A < B}, {A > B}, and {A ≥ B} are defined through the difference A − B.
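The Jordan decomposition via spectral projections can be computed directly from an eigendecomposition. A NumPy sketch with a hypothetical 2 × 2 self-adjoint matrix:

```python
import numpy as np

# Jordan decomposition of a self-adjoint matrix A via its spectral
# projections {A >= 0} and {A < 0}.
A = np.array([[1.0, 2.0],
              [2.0, -1.0]])

lam, E = np.linalg.eigh(A)                    # A = E diag(lam) E^T
P_pos = E[:, lam >= 0] @ E[:, lam >= 0].T     # projection {A >= 0}
P_neg = E[:, lam < 0] @ E[:, lam < 0].T       # projection {A < 0}

A_plus = A @ P_pos                            # positive part A_+
A_minus = A @ P_neg                           # negative part A_-
abs_A = A_plus - A_minus                      # |A| = A_+ - A_-

assert np.allclose(A, A_plus + A_minus)
assert np.allclose(abs_A, E @ np.diag(np.abs(lam)) @ E.T)
```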

For self-adjoint operators A, B and any positive operator 0 ≤ P ≤ I we have

Tr[P(A − B)] ≤ Tr[{A ≥ B}(A − B)].

Identical statements hold for the strict-inequality spectral projections {A < B} and {A > B}.

The trace distance between operators A and B is given by

D(A, B) = ½ Tr|A − B| = ½||A − B||1.

The fidelity of states ρ and ρ′ is defined as

F(ρ, ρ′) = Tr[(ρ^(1/2) ρ′ ρ^(1/2))^(1/2)] = ||ρ^(1/2) ρ′^(1/2)||1.

The trace distance between two states is related to the fidelity as follows:

1 − F(ρ, ρ′) ≤ D(ρ, ρ′) ≤ √(1 − F(ρ, ρ′)²).
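A numerical sketch of the relation between trace distance and fidelity (NumPy assumed; the fidelity convention F = ||√ρ√σ||1 and the example states are assumptions of this sketch):

```python
import numpy as np

def psd_sqrt(M):
    # Square root of a positive semidefinite Hermitian matrix via eigh
    lam, E = np.linalg.eigh(M)
    return E @ np.diag(np.sqrt(np.clip(lam, 0.0, None))) @ E.conj().T

def trace_norm(M):
    # ||M||_1 = sum of singular values
    return np.sum(np.linalg.svd(M, compute_uv=False))

rho = np.diag([0.7, 0.3])     # hypothetical diagonal states
sigma = np.diag([0.4, 0.6])

D = 0.5 * trace_norm(rho - sigma)                 # trace distance
F = trace_norm(psd_sqrt(rho) @ psd_sqrt(sigma))   # fidelity

# 1 - F <= D <= sqrt(1 - F^2)
assert 1 - F <= D + 1e-10
assert D <= np.sqrt(1 - F**2) + 1e-10
```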

For self-adjoint operators A, B and any positive operator 0 ≤ P ≤ I, the inequality

images

for any ε > 0, implies that

images

The “gentle measurement” lemma is given here: For a state ρ and any positive operator 0 ≤ P ≤ I, if Tr(ρP) ≥ 1 − δ, then

||ρ − √P ρ √P||1 ≤ 2√δ.

The same holds if ρ is only a subnormalized density operator, that is, Trρ ≤ 1.

If ρ is a state and P is a projection operator such that Tr(Pρ) > 1 − δ for a given δ > 0, then

images

where

images

and images.

Consider a state ρ and a positive operator σ ∈ Bε(ρ), for some ε > 0. If πσ denotes the projection onto the support of σ, then

images


Lemma A.1 (Hoffman-Wielandt)
[16, 34] Let A and B be N × N self-adjoint matrices, with ordered eigenvalues λ1(A) ≥ ··· ≥ λN(A) and λ1(B) ≥ ··· ≥ λN(B). Then

∑_{i=1}^{N} |λi(A) − λi(B)|² ≤ Tr[(A − B)²].
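A quick numerical check of the Hoffman-Wielandt inequality (NumPy sketch; the randomly drawn self-adjoint test matrices are assumptions):

```python
import numpy as np

# Check: sum_i (lam_i(A) - lam_i(B))^2 <= Tr (A - B)^2,
# with both spectra sorted in the same order.
rng = np.random.default_rng(2)
N = 6
G = rng.standard_normal((N, N))
H = rng.standard_normal((N, N))
A = (G + G.T) / 2          # self-adjoint (real symmetric)
B = (H + H.T) / 2

lam_A = np.sort(np.linalg.eigvalsh(A))
lam_B = np.sort(np.linalg.eigvalsh(B))

lhs = np.sum((lam_A - lam_B) ** 2)
rhs = np.trace((A - B) @ (A - B))
assert lhs <= rhs + 1e-10
```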


The singular values of a matrix A ∈ Mn are the eigenvalues of its absolute value |A| = (AᴴA)^(1/2); we fix the notation s(A) = (s1(A), …, sn(A)) with s1(A) ≥ ··· ≥ sn(A). Singular values are closely related to the unitarily invariant norms. Singular value inequalities are weaker than Löwner partial-order inequalities and stronger than unitarily invariant norm inequalities in the following sense [133]:

0 ≤ A ≤ B ⇒ si(A) ≤ si(B), i = 1, 2, …, n ⇒ ||A|| ≤ ||B||

for all unitarily invariant norms. The norm ||A||1 = Tr|A| is unitarily invariant. Singular values are unitarily invariant: s(UAV) = s(A) for every A and all unitary U, V. A norm || · || is called unitarily invariant if

||UAV|| = ||A||   (A.2)

for all A ∈ Mn and all unitary U, V ∈ Mn. ||A|| ≤ ||B|| for all unitarily invariant norms if and only if s(A) is weakly majorized by s(B), that is,

∑_{i=1}^{k} si(A) ≤ ∑_{i=1}^{k} si(B),  k = 1, …, n.   (A.3)

The differences of two positive semidefinite matrices A, B ∈ Mn are often encountered. Denote the block diagonal matrix

[ A  0 ]
[ 0  B ]

by the notation A ⊕ B. Then [133]

1. si(A − B) ≤ si(A ⊕ B), i = 1, 2, …, n,
2. ||A − B|| ≤ ||A ⊕ B|| for all unitarily invariant norms,
3. images
4. ||A − |z|B|| ≤ ||A + zB|| ≤ ||A + |z|B||, for any complex number z.
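The first property, si(A − B) ≤ si(A ⊕ B), can be spot-checked numerically (NumPy sketch; the random positive semidefinite test matrices are assumptions):

```python
import numpy as np

# For positive semidefinite A, B, check s_i(A - B) <= s_i(A ⊕ B),
# where A ⊕ B is the block diagonal matrix diag(A, B).
rng = np.random.default_rng(3)
n = 4
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
A = X @ X.T                      # positive semidefinite
B = Y @ Y.T

direct_sum = np.block([[A, np.zeros((n, n))],
                       [np.zeros((n, n)), B]])

s_diff = np.linalg.svd(A - B, compute_uv=False)       # s_1 >= ... >= s_n
s_sum = np.linalg.svd(direct_sum, compute_uv=False)   # 2n values, sorted

assert np.all(s_diff <= s_sum[:n] + 1e-10)
```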

Note that the weak log-majorization s(A) ≺w(log) s(B) is stronger than the weak majorization s(A) ≺w s(B).

A.6.1 Matrix Inequalities

Let α : B(ℋ) → B(𝒦) be a linear mapping, where ℋ and 𝒦 are finite-dimensional Hilbert spaces. α is called positive if it sends positive (semidefinite) operators to positive (semidefinite) operators. Let α be a positive, unital linear mapping and f : ℝ → ℝ be a convex function. Then it follows [34, p. 189] that

Tr f(α(A)) ≤ Tr α(f(A))   (A.4)

for every A ∈ B(ℋ)sa. Here sa denotes the self-adjoint part.

Let A and B be positive operators; then for 0 ≤ s ≤ 1,

Tr(A^s B^(1−s)) ≥ ½ Tr(A + B − |A − B|).

The triangle inequality for the matrix absolute value |A + B| is [114, p. 237]

|A + B| ≤ U|A|Uᴴ + V|B|Vᴴ,   (A.5)

where A and B are any square complex matrices of the same size, and U and V are unitary matrices. Taking the trace of (A.5) leads to the following:

Tr|A + B| ≤ Tr|A| + Tr|B|.   (A.6)

Replacing B in (A.6) with B + C leads to

Tr|A + B + C| ≤ Tr|A| + Tr|B| + Tr|C|.

Similarly, we have

Tr|A − B| ≥ Tr|A| − Tr|B|.   (A.7)

For positive operators A and B,

Tr|A − B| ≤ Tr(A + B).

An n-tuple of real numbers may be regarded as a diagonal matrix, and majorization can be extended to self-adjoint matrices. Suppose that A, B ∈ Mn are self-adjoint. Then A ≺ B means that the n-tuple of eigenvalues of A is majorized by the n-tuple of eigenvalues of B; similarly for the weak majorization. Since the majorization depends only on the spectra, A ≺ B holds if and only if UAUᴴ ≺ VBVᴴ for some unitaries U and V. It follows from Birkhoff's theorem [34] that A ≺ B implies that

A = ∑i pi Ui B Uiᴴ

for some pi > 0 with ∑i pi = 1 and for some unitaries Ui.


Theorem A.8
Let ρ1 and ρ2 be states. Then the following statements are equivalent.
1. ρ1 ≺ ρ2.
2. ρ1 is more mixed than ρ2.
3. ρ1 = ∑i λi Ui ρ2 Uiᴴ for some convex combination λi and for some unitaries Ui.
4. Tr f(ρ1) ≤ Tr f(ρ2) for any convex function f.

A.6.2 Partial Ordering of Positive Semidefinite Matrices

Let A ≥ 0 and B ≥ 0 be of the same size. Then

1. A + B ≥ B,
2. images,
3. Tr(AB) ≤ Tr(A)Tr(B),
4. det(A + B) ≥ det A + det B, when n > 1,
5. the eigenvalues of AB are all nonnegative, λi(AB) ≥ 0,
6. AB is positive semidefinite, AB ≥ 0, if and only if AB = BA; in general, AB need not even be Hermitian.

If A ≥ B ≥ 0, then

1. rank(A) ≥ rank(B),
2. det A ≥ det B,
3. B⁻¹ ≤ A⁻¹ is false and in fact B⁻¹ ≥ A⁻¹ if A and B are nonsingular,
4. Tr A ≥ Tr B.
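Several consequences of A ≥ B ≥ 0 can be spot-checked numerically (NumPy sketch; the construction A = B + CCᵀ guarantees A ≥ B):

```python
import numpy as np

# Checks for A >= B > 0: determinant and trace monotonicity,
# and reversal of the ordering under inversion.
rng = np.random.default_rng(4)
n = 4
Y = rng.standard_normal((n, n))
B = Y @ Y.T + 0.1 * np.eye(n)        # B > 0
C = rng.standard_normal((n, n))
A = B + C @ C.T                      # A - B >= 0, so A >= B

assert np.linalg.det(A) >= np.linalg.det(B) - 1e-10
assert np.trace(A) >= np.trace(B) - 1e-10
# B^{-1} - A^{-1} should be positive semidefinite
diff = np.linalg.inv(B) - np.linalg.inv(A)
assert np.all(np.linalg.eigvalsh(diff) >= -1e-8)
```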

Let A, B ∈ Mn be positive semidefinite. Then for any complex number z and any unitarily invariant norm [133],

||A − |z|B|| ≤ ||A + zB|| ≤ ||A + |z|B||.

A.6.3 Partial Ordering of Hermitian Matrices

We follow [126, p. 273] for a short review; [115] has the most exhaustive collection. Positive definite and semidefinite matrices are important since the covariance matrix and the sample covariance matrix (used in practice) are positive semidefinite. A Hermitian matrix A ∈ Mn is called positive definite if

xᴴAx > 0 for all nonzero x ∈ ℂⁿ,

and positive semidefinite if the weaker condition xᴴAx ≥ 0 holds. A Hermitian matrix is positive definite if and only if all of its eigenvalues are positive, and positive semidefinite if and only if all of its eigenvalues are nonnegative. For Hermitian A, B ∈ Mn, we write A > B when A − B is positive definite, and A ≥ B when A − B is positive semidefinite. This is a partial ordering of the set of n × n Hermitian matrices. It is partial because we may have neither A ≥ B nor B ≥ A.

Let A > 0 and B ≥ 0 be of the same size and let C be nonsingular. We have

1. CᴴAC > 0,
2. CᴴBC ≥ 0,
3. A⁻¹ > 0,
4. A + B ≥ B,
5. images,
6. Tr(AB) ≤ Tr A Tr B,
7. A ≥ B ⇒ λi(A) ≥ λi(B), where the λi are the eigenvalues (sorted in decreasing order),
8. A ≥ B ⇒ det(A) ≥ det(B) and Tr A ≥ Tr B,
9. the eigenvalues of AB are all nonnegative; furthermore, AB is positive semidefinite if and only if AB = BA.

The partitioned Hermitian matrix

[ A   B ]
[ Bᴴ  D ]

with square blocks A and D, is positive definite if and only if A > 0 and its Schur complement D − BᴴA⁻¹B > 0, that is, D > BᴴA⁻¹B.
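The Schur-complement criterion can be verified numerically; in this NumPy sketch the blocks A, B, D are hypothetical:

```python
import numpy as np

# [[A, B], [B^H, D]] is positive definite iff A > 0 and the
# Schur complement D - B^H A^{-1} B > 0.
def is_pd(M):
    return np.all(np.linalg.eigvalsh(M) > 0)

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[0.3, 0.1], [0.0, 0.2]])
D = np.array([[1.5, 0.2], [0.2, 1.0]])

M = np.block([[A, B], [B.T, D]])
schur = D - B.T @ np.linalg.solve(A, B)

# Both sides of the equivalence hold for this example
assert is_pd(M) and is_pd(A) and is_pd(schur)
```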

The Hadamard determinant inequality for a positive semidefinite A ∈ Mn is

det A ≤ a11 a22 ··· ann.

The Minkowski determinant inequality for positive definite A, B ∈ Mn is

[det(A + B)]^(1/n) ≥ (det A)^(1/n) + (det B)^(1/n),

with equality if and only if B = cA for some constant c.
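Both determinant inequalities are easy to spot-check (NumPy sketch; the random positive definite test matrices are assumptions):

```python
import numpy as np

# Hadamard: det A <= product of diagonal entries (A psd);
# Minkowski: det(A+B)^(1/n) >= det(A)^(1/n) + det(B)^(1/n) (A, B pd).
rng = np.random.default_rng(5)
n = 3
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
A = X @ X.T + 0.1 * np.eye(n)        # positive definite
B = Y @ Y.T + 0.1 * np.eye(n)

assert np.linalg.det(A) <= np.prod(np.diag(A)) + 1e-10

lhs = np.linalg.det(A + B) ** (1 / n)
rhs = np.linalg.det(A) ** (1 / n) + np.linalg.det(B) ** (1 / n)
assert lhs >= rhs - 1e-10
```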

If f is convex, then

images

and

A.8 A.8
