Random Matrix Theory: Lecture Review
By Zhenyu LIAO at CentraleSupelec
Based on the lecture on RMT of Prof. Romain Couillet
@19/Feb/2016
Part 1: Motivation
If we have $x_1, \dots, x_n \in \mathbb{R}^p$ i.i.d. with $E[x_i] = 0$ and $E[x_i x_i^T] = C$, then from the law of large numbers (LLN), the sample covariance matrix (SCM)
$$\hat{C} = \frac{1}{n} \sum_{i=1}^n x_i x_i^T$$
satisfies $\hat{C} \to C$ when $n \to \infty$ (with $p$ fixed): a convergence almost surely (a.s.).
But in fact, correctly speaking, the condition is not really $n \to \infty$, but should be $n \gg p$! Which means $n$ should be much larger than $p$! (In fact we will see later that this "much larger" means at least 100 times!)
We can write this SCM in a different way: with $X = [x_1, \dots, x_n] \in \mathbb{R}^{p \times n}$, we have
$$\hat{C} = \frac{1}{n} X X^T.$$
So something interesting: if $C = I_p$ (identity), and $p, n \to \infty$ with $p/n \to c > 0$, we will still have the entrywise convergence: $\hat{C}_{ij} \to 1$ if $i = j$ and $\hat{C}_{ij} \to 0$ else.
But we can also see at the same time that the matrix $\hat{C}$ is of size $p \times p$ while being built from at most $n$ independent vectors $x_i$, so that
$$\mathrm{rank}(\hat{C}) \le \min(p, n),$$
which indicates that when $p > n$, $\hat{C}$ will have at least $p - n$ zero eigenvalues!
Which means that if we evaluate $\|\hat{C} - C\|$ (in spectral norm), the result may not go to zero! (In fact it is due to the slow convergence of the individual entries.) Conclusion: our original statistics break down when the condition $n \gg p$ is not satisfied. Or, in other words, when $p$ goes up, we cannot use our traditional strategy to approximate some matrix, e.g. the covariance matrix.
But with RMT, we know what happens in the regime where $p/n \to c \in (0, \infty)$: that is the Marchenko–Pastur law, as follows.
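The gap between entrywise and spectral convergence is easy to see numerically; a minimal sketch (the dimensions, the ratio $c = 1/4$ and the seed are arbitrary choices):

```python
import numpy as np

# Sketch: eigenvalues of the SCM (1/n) X X^T for C = I_p, in the regime
# p/n = c fixed. Entries of the SCM are close to those of I_p, yet the
# eigenvalues spread over the whole interval [(1-sqrt(c))^2, (1+sqrt(c))^2].
rng = np.random.default_rng(0)
p, n = 200, 800                        # c = p/n = 1/4
X = rng.standard_normal((p, n))
scm = X @ X.T / n

eigs = np.linalg.eigvalsh(scm)
c = p / n
a, b = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2

print(abs(scm[0, 0] - 1.0))            # small: entrywise convergence holds
print(eigs.min(), eigs.max())          # close to a and b, far from 1
```

The diagonal entry is near 1, but the extreme eigenvalues sit near the edges $a = 0.25$ and $b = 2.25$ predicted by the Marchenko–Pastur law below.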
Part 2: The Marchenko-Pastur law and the Stieltjes Transform
The MP law
DEF: The empirical spectral distribution (ESD) $\mu_p$ of a Hermitian matrix $A \in \mathbb{C}^{p \times p}$ with eigenvalues $\lambda_1, \dots, \lambda_p$ is defined as the following measure:
$$\mu_p = \frac{1}{p} \sum_{i=1}^p \delta_{\lambda_i}.$$
Notice that for every $p$, we have $\mu_p(\mathbb{R}) = 1$ (and it is a probability distribution, isn't it?).
THEO (the Marchenko–Pastur distribution for large-dimensional covariance matrices): let $X \in \mathbb{R}^{p \times n}$ have i.i.d. entries with zero mean and unit variance. Then as $p, n \to \infty$ with $p/n \to c \in (0, \infty)$, we have, almost surely,
$$\mu_{\frac{1}{n} X X^T} \Rightarrow \mu_c,$$
with $\mu_c$ the Marchenko–Pastur distribution of density
$$f_c(x) = \frac{1}{2\pi c x} \sqrt{(b - x)(x - a)}, \quad a = (1 - \sqrt{c})^2, \; b = (1 + \sqrt{c})^2,$$
for all $x \in [a, b]$ (plus a mass $(1 - 1/c)\delta_0$ when $c > 1$).
Clearly we can see that this distribution is supported on $[(1 - \sqrt{c})^2, (1 + \sqrt{c})^2]$, whose width tends to 0 when $c \to 0$, which means we recover the classical regime of the LLN.
#####Historical recall:
The first RMT result dates from 1953: let $X \in \mathbb{R}^{N \times N}$ be Hermitian with i.i.d. entries of zero mean and unit variance. Then:
$$\mu_{X/\sqrt{N}} \Rightarrow \mu_{sc}, \quad \text{where } \mu_{sc}(dx) = \frac{1}{2\pi} \sqrt{(4 - x^2)^+} \, dx$$
is the semicircle law.
Method: by calculating its successive moments (more details later).
Remark: This convergence is a little subtle: it is a convergence in law for the measures $\mu_p$ (weak convergence), but it holds a.s. with respect to the randomness of the matrix.
####The Stieltjes Transform
DEF: let $\mu$ be a finite measure with support $\mathrm{supp}(\mu) \subset \mathbb{R}$. Then the Stieltjes Transform of $\mu$ is defined, for $z \in \mathbb{C} \setminus \mathrm{supp}(\mu)$, as:
$$m_\mu(z) = \int \frac{\mu(d\lambda)}{\lambda - z}.$$
Important remark: if $\mu_p$ is the ESD of a Hermitian matrix $A$ (see its definition in the previous section), then
$$m_{\mu_p}(z) = \frac{1}{p} \sum_{i=1}^p \frac{1}{\lambda_i - z} = \frac{1}{p} \mathrm{tr}\, (A - zI_p)^{-1}$$
(only by exchanging integral and sum).
So we have, with $U$ unitary in the spectral decomposition $A = U \Lambda U^*$, and ==a trick here==:
$$(A - zI_p)^{-1} = U (\Lambda - zI_p)^{-1} U^*,$$
and particularly if we take $u_i$ an eigenvector of $A$, we will have $(A - zI_p)^{-1} u_i = \frac{1}{\lambda_i - z} u_i$. We will call this important matrix
$$Q(z) = (A - zI_p)^{-1}$$
the resolvent of $A$.
Idea: bring the computation of eigenvalues back to the calculus of matrix inverses, for which we have many more known results.
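The identity between the ESD's Stieltjes transform and the normalized trace of the resolvent can be checked directly (a small sketch; the matrix and the point $z$ are arbitrary examples):

```python
import numpy as np

# Sketch: for a Hermitian A, m_{mu_p}(z) = (1/p) sum_i 1/(lambda_i - z)
# equals (1/p) tr (A - z I)^{-1}, the normalized trace of the resolvent.
rng = np.random.default_rng(1)
p = 50
G = rng.standard_normal((p, p))
A = (G + G.T) / 2                              # Hermitian (real symmetric)
lam = np.linalg.eigvalsh(A)

z = 1.0 + 1.0j                                 # any z off the real axis
m_from_eigs = np.mean(1.0 / (lam - z))
Q = np.linalg.inv(A - z * np.eye(p))           # resolvent Q(z)
m_from_trace = np.trace(Q) / p
```

The two quantities agree to machine precision, which is exactly the "exchange integral and sum" step above.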
PROP (inverse Stieltjes transform): the ST characterizes a real-supported finite measure $\mu$: for all continuity points $a < b$ of $\mu$, we have:
$$\mu([a, b]) = \lim_{y \to 0^+} \frac{1}{\pi} \int_a^b \Im\, m_\mu(x + iy) \, dx.$$
Proof: by a little trick of change of variable.
Attention! Use with care, because $m_\mu$ is not defined on $\mathrm{supp}(\mu)$, but this limit can exist anyway!
Remark ("discrete case"): a direct result is:
$$\mu(\{x\}) = \lim_{y \to 0^+} -iy \, m_\mu(x + iy).$$
Proof: use the definition directly with the DCT (dominated convergence theorem) and the fact that the integrand stays bounded.
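The atom-recovery formula can be illustrated on a small discrete measure (a sketch; the two-atom measure is an arbitrary example):

```python
import numpy as np

# Sketch: for mu = (1/2) delta_1 + (1/2) delta_3, the mass at a point x
# is recovered as the limit of -i y m_mu(x + i y) as y -> 0+.
atoms = np.array([1.0, 3.0])
weights = np.array([0.5, 0.5])

def m_mu(z):
    # Stieltjes transform of the discrete measure mu
    return np.sum(weights / (atoms - z))

y = 1e-8
mass_at_1 = (-1j * y * m_mu(1.0 + 1j * y)).real   # atom of mass 1/2
mass_at_2 = (-1j * y * m_mu(2.0 + 1j * y)).real   # no atom at x = 2
```

At the atom the limit recovers the mass $1/2$; away from the atoms it vanishes.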
####Proof of the MP theorem
Take $X \in \mathbb{R}^{p \times n}$ with i.i.d. entries of zero mean and unit variance, and write $\hat{C} = \frac{1}{n} X X^T$ and $Q(z) = (\hat{C} - zI_p)^{-1}$.
Let us start with the first entry of the matrix, $Q_{11}(z)$. Let
$$X = \begin{pmatrix} y^T \\ Y \end{pmatrix}$$
with $y \in \mathbb{R}^n$ the first row of $X$ and $Y \in \mathbb{R}^{(p-1) \times n}$ the remaining rows, so we can decompose our matrix in the following block form:
$$\hat{C} - zI_p = \begin{pmatrix} \frac{1}{n} y^T y - z & \frac{1}{n} y^T Y^T \\ \frac{1}{n} Y y & \frac{1}{n} Y Y^T - zI_{p-1} \end{pmatrix}.$$
With the known result of matrix inversion by blocks (Schur complement), we have our first entry:
$$Q_{11}(z) = \left( \frac{1}{n} y^T y - z - \frac{1}{n^2} y^T Y^T \left( \frac{1}{n} Y Y^T - zI_{p-1} \right)^{-1} Y y \right)^{-1},$$
with ==a useful trick== (make-up to simplify the huge inverse term): writing $\tilde{Q}(z) = \left( \frac{1}{n} Y^T Y - zI_n \right)^{-1}$ and using $Y^T (Y Y^T - nzI_{p-1})^{-1} Y = (Y^T Y - nzI_n)^{-1} Y^T Y$, the whole denominator collapses and we get:
$$Q_{11}(z) = \frac{-1}{z \left( 1 + \frac{1}{n} y^T \tilde{Q}(z) y \right)}.$$
==An interesting trick!== With the matrix we work on, $M$, of bounded spectral norm ($\|M\| \le K$), independent of the random vector $y$, we can estimate $\frac{1}{n} y^T M y$ by $\frac{1}{n} \mathrm{tr}\, M$, with a convergence a.s. This idea comes from the following result (known as the ==trace lemma==):
$$\frac{1}{n} y^T M y - \frac{1}{n} \mathrm{tr}\, M \xrightarrow{a.s.} 0.$$
Behind this formula: under a rank-1 perturbation, and after the mean over $y$, the behaviour of the matrix (e.g. its trace) will not change. With the ==lemma of rank-1 perturbation==, we can say that removing one row (a rank-one modification) changes the normalized trace of the resolvent by at most $O(1/n)$, so $\frac{1}{n} \mathrm{tr}\, \tilde{Q}(z)$ can be replaced by the normalized trace of the full co-resolvent $\left( \frac{1}{n} X^T X - zI_n \right)^{-1}$.
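The trace lemma is easy to observe numerically (a sketch; the diagonal $M$ with uniform entries is an arbitrary choice of bounded-norm matrix independent of $y$):

```python
import numpy as np

# Sketch of the trace lemma: for y with i.i.d. zero-mean unit-variance
# entries, independent of a bounded-spectral-norm M,
# (1/n) y^T M y - (1/n) tr M -> 0 almost surely.
rng = np.random.default_rng(5)
n = 4000
M = np.diag(rng.uniform(0.0, 1.0, size=n))   # ||M|| <= 1
y = rng.standard_normal(n)

quad = y @ M @ y / n
tr = np.trace(M) / n
print(abs(quad - tr))                        # small, O(1/sqrt(n))
```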
Finally, we have just treated the first entry, but it works the same way for the other diagonal entries, so we have:
$$Q_{ii}(z) \approx \frac{-1}{z \left( 1 + \frac{1}{n} \mathrm{tr}\, \tilde{Q}(z) \right)}.$$
Then, with the funny fact that the trace of a matrix is the sum of its diagonal entries, we are able to ==connect to the eigenvalues== by (always in the regime $p/n \to c$):
$$m_p(z) = \frac{1}{p} \mathrm{tr}\, Q(z) = \frac{1}{p} \sum_{i=1}^p Q_{ii}(z),$$
so we have, in the limit:
$$m(z) = \frac{1}{1 - c - z - c z m(z)},$$
which means $m(z)$ is the ==solution of the equation== $c z m^2(z) - (1 - c - z) m(z) + 1 = 0$. Obviously there are two possible solutions, and we should choose the one of finite norm (with $\Im\, m(z) > 0$ for $\Im z > 0$). And with the inverse Stieltjes transform formula, we obtain the MP distribution!
QED
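The final quadratic can be solved directly and compared with the closed-form MP density (a sketch; the ratio $c$ and the evaluation points are arbitrary choices):

```python
import numpy as np

# Sketch: solve c z m^2 - (1-c-z) m + 1 = 0 at z = x + i*eps, keep the
# root with Im m > 0 (a Stieltjes transform maps the upper half-plane to
# itself), and read the density as (1/pi) Im m.
def mp_density(x, c, eps=1e-6):
    z = x + 1j * eps
    roots = np.roots([c * z, -(1 - c - z), 1.0])
    m = roots[np.argmax(roots.imag)]
    return m.imag / np.pi

c = 0.25
a, b = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2
x = 1.0
approx = mp_density(x, c)
exact = np.sqrt((b - x) * (x - a)) / (2 * np.pi * c * x)   # MP density
outside = mp_density(3.0, c)                               # x = 3 > b
```

Inside the support the two values agree; outside the support the recovered density is essentially zero.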
@26/Feb/2016
###Part 3: The sample covariance matrix
THEO: let $Z \in \mathbb{C}^{p \times n}$ with i.i.d. $(0, 1)$ entries and $C_p \in \mathbb{C}^{p \times p}$ deterministic, non-negative definite Hermitian, so that its ESD satisfies $\mu_{C_p} \Rightarrow \nu$. Then as $p, n \to \infty$ with $p/n \to c \in (0, \infty)$, we have, almost surely:
$$\mu_{\frac{1}{n} C_p^{1/2} Z Z^* C_p^{1/2}} \Rightarrow \mu,$$
where $\mu$ is a probability measure with ST $m(z)$ the unique solution ($\Im\, m(z) > 0$ for $\Im z > 0$) of:
$$m(z) = \int \frac{\nu(dt)}{t \left( 1 - c - c z m(z) \right) - z}$$
(an implicit relation: there may be NO analytical solution).
E.X.: take $\nu = \frac{1}{3}(\delta_1 + \delta_3 + \delta_7)$. So we have 3 peaks at eigenvalues 1, 3 and 7 in the population spectrum. The problem is how to plot the density of $\mu$, which is the asymptotic distribution of the sample eigenvalues?
The idea is to compute $m(x + i\epsilon)$ with $\epsilon$ rather small, say $\epsilon = 10^{-6}$, by taking $z = x + i\epsilon$ and iterating the fixed-point equation, with $t$ the iteration index; obviously here we are using an iterative way to solve this problem, or in a more formal way:
$$m_{t+1}(z) = \int \frac{\nu(d\lambda)}{\lambda \left( 1 - c - c z m_t(z) \right) - z},$$
initialized with $m_0(z) = -1/z$. The approximated density of $\mu$ at the point $x$ is then given by $\frac{1}{\pi} \Im\, m(x + i\epsilon)$, for every $x$.
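A sketch of this iterative scheme for the three-mass example above (the damping factor $1/2$, the value of $c$, and the evaluation points are arbitrary choices):

```python
import numpy as np

# Sketch: damped fixed-point iteration of
#   m(z) = int nu(dt) / ( t (1 - c - c z m(z)) - z )
# for nu = (delta_1 + delta_3 + delta_7)/3, at z = x + i*eps; the density
# of mu at x is then approximately (1/pi) Im m(x + i*eps).
masses = np.array([1.0, 3.0, 7.0])
weights = np.array([1.0, 1.0, 1.0]) / 3.0
c = 0.1

def solve_m(z, iters=5000):
    m = -1.0 / z                              # initialization m_0(z)
    for _ in range(iters):
        f = np.sum(weights / (masses * (1.0 - c - c * z * m) - z))
        m = 0.5 * m + 0.5 * f                 # damped update
    return m

eps = 1e-3
z3 = 3.0 + 1j * eps
m3 = solve_m(z3)
density_at_3 = m3.imag / np.pi                # inside the bulk near t = 3
density_at_25 = solve_m(25.0 + 1j * eps).imag / np.pi   # far outside
```

The damping is a practical choice: the plain iteration is only marginally stable close to the real axis, while the averaged update contracts.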
Remark (relation to the SCM, sample covariance matrix): let $x_1, \dots, x_n$ be independent vectors in $\mathbb{C}^p$, so that $x_i = C_p^{1/2} z_i$, with $z_i$ having i.i.d. $(0, 1)$ entries. So we have $E[x_i] = 0$ and $E[x_i x_i^*] = C_p$. Then the SCM estimator of $C_p$ is given by:
$$\hat{C} = \frac{1}{n} \sum_{i=1}^n x_i x_i^* = \frac{1}{n} C_p^{1/2} Z Z^* C_p^{1/2}, \quad \text{with } Z = [z_1, \dots, z_n].$$
(The idea behind is to use the eigenvalues of $\hat{C}$ to estimate/say something about $C_p$.)
And we have $m_{\mu_p}(z) = \frac{1}{p} \mathrm{tr}\, (\hat{C} - zI_p)^{-1}$. So with the limiting measure $\nu$ of $\mu_{C_p}$, we get the same implicit equation as in the theorem; so with our notation of $\hat{C}$, we can rewrite the result directly in terms of the SCM.
Application: in wireless communication, for a MIMO system with $p$ Tx and $n$ Rx antennas, when trying to calculate the ergodic capacity.
###Part 4: More elaborate model and ==deterministic equivalents==
THEO: let $R \in \mathbb{C}^{p \times p}$ and $T \in \mathbb{C}^{n \times n}$ be non-negative definite, and $X \in \mathbb{C}^{p \times n}$ with i.i.d. $(0, 1)$ entries, such that $\|R\|$ and $\|T\|$ stay bounded (with $\|\cdot\|$ the spectral norm here). Consider
$$B = \frac{1}{n} R^{1/2} X T X^* R^{1/2}.$$
Then as $p, n \to \infty$ with $p/n \to c$, $\mu_B$ is such that $\mu_B - \bar{\mu}_p \Rightarrow 0$ almost surely, where $\bar{\mu}_p$ is a (deterministic) probability measure, with its ST
$$m_p(z) = \frac{1}{p} \mathrm{tr} \left( -z \left( I_p + \tilde{\delta}(z) R \right) \right)^{-1},$$
with $(\delta(z), \tilde{\delta}(z))$ the unique solution pair (of positive imaginary part for $\Im z > 0$) of:
$$\delta(z) = \frac{1}{n} \mathrm{tr}\, R \left( -z \left( I_p + \tilde{\delta}(z) R \right) \right)^{-1}, \qquad \tilde{\delta}(z) = \frac{1}{n} \mathrm{tr}\, T \left( -z \left( I_n + \delta(z) T \right) \right)^{-1},$$
which are also called the ==fundamental equations==.
Remark: the value of $m_{\mu_B}(z)$ itself will never converge, but its difference with a deterministic function will tend to 0. It is the same type of theorem as before, but more general, and with equations whose solution is not explicit.
Proof: the Bai–Silverstein approach: the idea is to guess the solution iteratively. A constructive, iterative finding of the solution: with $Q(z) = (B - zI_p)^{-1}$, we start from comparing $Q(z)$ with a deterministic candidate $\left( -z (I_p + D) \right)^{-1}$, with $D$ a matrix to be designed.
Remark (==resolvent identity==): for invertible $A$ and $B$:
$$A^{-1} - B^{-1} = A^{-1} (B - A) B^{-1}.$$
Proof: multiply by $A$ from the left and $B$ from the right.
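A quick numerical check of this identity (a sketch; the two well-conditioned random matrices are arbitrary examples):

```python
import numpy as np

# Sketch: check the resolvent identity
#   A^{-1} - B^{-1} = A^{-1} (B - A) B^{-1}
# on two random invertible matrices.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # shifted: safely invertible
B = rng.standard_normal((4, 4)) + 4 * np.eye(4)

lhs = np.linalg.inv(A) - np.linalg.inv(B)
rhs = np.linalg.inv(A) @ (B - A) @ np.linalg.inv(B)
```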
So with a good design of the matrix $D$ we can solve this problem iteratively. First improvement of $D$: compare $Q(z)$ with $\left( -z (I_p + D) \right)^{-1}$ via the resolvent identity, so that we have to control the difference between $\frac{1}{n} X T X^*$ and a deterministic counterpart. Since $X$ is something with random i.i.d. $(0, 1)$ entries, the value of $\frac{1}{n} X T X^*$ tends to be something uniform: its expectation is $\frac{1}{n} \mathrm{tr}(T)\, I_p$. So let us try $D = \frac{1}{n} \mathrm{tr}(T)\, R$, and we get a first (rough) approximation.
Let us assume $T$ diagonal; this condition is needed for our proof, but not necessary with more advanced techniques. Then we can decompose
$$\frac{1}{n} X T X^* = \frac{1}{n} \sum_{j=1}^n t_j x_j x_j^*,$$
with $x_j$ the columns of $X$ and $t_j$ the diagonal entries of $T$. So we have a sum of rank-one terms, each depending on $x_j$ through only one term, while all the others carry different indices! So we can of course separate this long expression into two parts: the term of index $j$ and the rest.
Lemma (==useful trick==): for invertible $A$ and $t \ge 0$:
$$(A + t v v^*)^{-1} v = \frac{A^{-1} v}{1 + t v^* A^{-1} v}.$$
Proof: multiply from the left by the matrix $A + t v v^*$.
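The lemma can be checked numerically as well (a sketch; the positive definite $A$ and the scalar $t$ are arbitrary examples):

```python
import numpy as np

# Sketch: check (A + t v v^T)^{-1} v = A^{-1} v / (1 + t v^T A^{-1} v)
# on a random positive definite A.
rng = np.random.default_rng(4)
p, t = 6, 2.5
G = rng.standard_normal((p, p))
A = G @ G.T + np.eye(p)                   # positive definite, invertible
v = rng.standard_normal(p)

Ai = np.linalg.inv(A)
lhs = np.linalg.inv(A + t * np.outer(v, v)) @ v
rhs = Ai @ v / (1 + t * v @ Ai @ v)
```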
So, applying the lemma to the term of index $j$, we get quadratic forms of the type:
$$\frac{\frac{1}{n} x_j^* Q_{(j)}(z) x_j}{1 + t_j \frac{1}{n} x_j^* Q_{(j)}(z) x_j},$$
with $Q_{(j)}(z)$ the resolvent with the $j$-th rank-one term removed (and the trace of a number is the number itself). And with this quadratic-form shape on both numerator and denominator, we are thinking about using the ==trace lemma== to get rid of the random part, with $M$ a bounded-norm matrix. So we will have $\frac{1}{n} x_j^* Q_{(j)}(z) x_j \approx \frac{1}{n} \mathrm{tr}\, R\, Q_{(j)}(z)$. So once more, in this form, we are thinking of the ==rank-1 perturbation lemma==: $\frac{1}{n} \mathrm{tr}\, R\, Q_{(j)}(z) \approx \frac{1}{n} \mathrm{tr}\, R\, Q(z)$, so we are happy, because we get the same factor at left and right. So finally, we can say that all these quadratic forms concentrate around $\delta_p(z) = \frac{1}{n} \mathrm{tr}\, R\, Q(z)$.
Conclusion:
$$Q(z) \approx \left( -z \left( I_p + \tilde{\delta}_p(z) R \right) \right)^{-1}, \quad \text{with } \tilde{\delta}_p(z) = \frac{1}{n} \mathrm{tr}\, T \left( -z \left( I_n + \delta_p(z) T \right) \right)^{-1} \text{ and } \delta_p(z) = \frac{1}{n} \mathrm{tr}\, R\, Q(z),$$
but obviously this loop is not closed: $\delta_p(z)$ still involves the random $Q(z)$!
So by doing the same for $\delta_p(z) = \frac{1}{n} \mathrm{tr}\, R\, Q(z)$, we can get $\delta_p(z) \approx \frac{1}{n} \mathrm{tr}\, R \left( -z \left( I_p + \tilde{\delta}_p(z) R \right) \right)^{-1}$. And we are happy to see this loop closed.
Final conclusion: the pair $(\delta(z), \tilde{\delta}(z))$ satisfies the two coupled ==fundamental equations==, and simply by replacing $Q(z)$ by its deterministic equivalent in $m_p(z) = \frac{1}{p} \mathrm{tr}\, Q(z)$, we will get exactly the same thing as in our theorem.
@1/Mar/2016
With $X \in \mathbb{C}^{p \times n}$ having i.i.d. entries, we can come up with:
- white noise model, MP law: $\frac{1}{n} X X^*$ — application in signal processing: detection;
- sample covariance matrix: $\frac{1}{n} C^{1/2} X X^* C^{1/2}$, with $C$ the population covariance — application in signal processing, subspace methods, e.g. MUSIC;
- time-dependent SCM / wireless communication (MIMO channel model): $\frac{1}{n} R^{1/2} X T X^* R^{1/2}$, with $R$ the Rx covariance matrix and $T$ the Tx one.
Part 5: Application to wireless communication
MIMO ergodic capacity
As shown in the following figure, a MIMO system is naturally connected to RMT. The capacity is given by
$$C = E \left[ \log \det \left( I_p + \frac{1}{\sigma^2} H H^* \right) \right].$$
So in order to solve this problem, a recap of our ==fundamental equation== is as follows: with $\bar{\mu}_p$ the deterministic equivalent for $\mu_{\frac{1}{n} R^{1/2} X T X^* R^{1/2}}$, its ST is
$$m_p(z) = \frac{1}{p} \mathrm{tr} \left( -z \left( I_p + \tilde{\delta}(z) R \right) \right)^{-1},$$
with $(\delta(z), \tilde{\delta}(z))$ the unique solution pair (of positive imaginary part for $\Im z > 0$) of the fundamental equations of Part 4.
We want to evaluate $E\left[ \log \det \left( I_p + \frac{1}{\sigma^2} H H^* \right) \right]$, with the following lemma (the ==Shannon transform== of $\mu$):
Lemma: for $\mu$ some probability measure with support in $\mathbb{R}^+$, we have:
$$\int_0^\infty \log(1 + x t) \, \mu(dt) = \int_0^x \left( \frac{1}{w} - \frac{1}{w^2} \, m_\mu\!\left( -\frac{1}{w} \right) \right) dw.$$
Proof: change variables, exchange the integrals, and try to make up the $m_\mu$ term; pay attention when changing variables within integrals.
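A numerical check of the lemma for a single point mass (a sketch; $\mu = \delta_{t_0}$ with $t_0 = 2$ and the value of $x$ are arbitrary choices; here $m_\mu(z) = 1/(t_0 - z)$):

```python
import numpy as np

# Sketch: verify  int log(1 + x t) mu(dt) = int_0^x ( 1/w - m_mu(-1/w)/w^2 ) dw
# for mu = delta_{t0}, whose Stieltjes transform is m_mu(z) = 1/(t0 - z).
t0, x = 2.0, 1.5
lhs = np.log(1.0 + x * t0)                            # Shannon transform at x

w = np.linspace(1e-9, x, 200001)
integrand = 1.0 / w - (1.0 / w**2) / (t0 + 1.0 / w)   # 1/w - m(-1/w)/w^2
rhs = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(w))
```

The apparent $1/w$ singularities cancel (the integrand simplifies to $t_0/(1 + t_0 w)$), so a plain trapezoid rule already matches the left-hand side closely.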
THEO: for $\sigma^2 > 0$, we have:
$$\frac{1}{p} \log \det \left( I_p + \frac{1}{\sigma^2} H H^* \right) - \bar{V}_p(\sigma^2) \xrightarrow{a.s.} 0,$$
where $\bar{V}_p(\sigma^2)$ is a deterministic quantity built from $\delta$ and $\tilde{\delta}$ evaluated at $z = -\sigma^2$ (two log-det terms plus a correction term in $\delta \tilde{\delta}$).
Proof: by differentiation of $\bar{V}_p$ with respect to $\sigma^2$ and the chain rule, e.g. through $\delta(\sigma^2)$ and $\tilde{\delta}(\sigma^2)$. Important to notice that the partial derivatives of $\bar{V}_p$ with respect to $\delta$ and $\tilde{\delta}$ vanish at the solution of the fundamental equations, which makes the $\frac{\partial \delta}{\partial \sigma^2}$ term vanish, and the same holds for the term in $\frac{\partial \tilde{\delta}}{\partial \sigma^2}$; so finally only the explicit derivative in $\sigma^2$ remains, which is equal to the derivative of the Shannon transform (with a little trick using ==a constant getting in the trace== and make-up elimination). Then, starting to work on the original $V_p$: also by differentiation, while using the above lemma, we can figure out that the derivatives of $V_p$ and $\bar{V}_p$ asymptotically coincide, so next is to find the possible constant difference, which turns out to be 0.
Then replace our $H$ by $H P^{1/2}$ in order to get the power-allocation variable $P$ involved. So our ==fundamental equations== become more complex, but we work always with the same idea, by differentiating with respect to $P$, hoping to get the optimal power allocation.
Then things turn out to be a well-known problem: the waterfilling problem, which can be solved by the algorithm of iterative waterfilling; see more details here.
Part 6: Spiked models
DEF: in a particular case, we take $C = I_p + \omega e_p e_p^T$: only the last diagonal element is $1 + \omega$ instead of 1. In this case, the measure
$$\mu_C = \frac{p-1}{p} \delta_1 + \frac{1}{p} \delta_{1+\omega} \Rightarrow \delta_1.$$
So the limiting measure of the SCM will be like the M–P law, but possibly with an outlier eigenvalue because of the perturbation $\omega$.
Eigenvalues of a finite-rank perturbation (multiple version of spiked models): take $C = I_p + A$, with $A = U \Omega U^*$, $\Omega = \mathrm{diag}(\omega_1, \dots, \omega_k)$ and $U \in \mathbb{C}^{p \times k}$ with orthonormal columns ($U^* U = I_k$), in the regime where $A$ is of small (fixed) rank $k$ and $p, n \to \infty$ with $p/n \to c$, while $k$ stays unchanged.
The goal is to identify the existence and position of the isolated eigenvalues of $\hat{C} = \frac{1}{n} C^{1/2} X X^* C^{1/2}$, with the notation $Q(z) = \left( \frac{1}{n} X X^* - zI_p \right)^{-1}$.
So we solve the following equation in $\lambda$, for $\lambda$ outside the bulk support:
$$\det\left( \hat{C} - \lambda I_p \right) = 0,$$
with of course $\lambda$ not an eigenvalue of $\frac{1}{n} X X^*$. With the useful ==determinant property== $\det(AB) = \det(A)\det(B)$ and the ==resolvent identity==, we can factor out the unperturbed part; for the remaining low-rank factor we use ==Sylvester's determinant identity==:
$$\det(I_p + A B) = \det(I_k + B A).$$
Lemma: for deterministic vectors $u, v$ of bounded norm and $X$ with i.i.d. entries:
$$u^* Q(z) v - m(z)\, u^* v \xrightarrow{a.s.} 0,$$
with $m(z)$ the S.T. of the M–P law. (Remark: with $u = v = x$, a vector with i.i.d. entries independent of $Q(z)$, it is the ==trace lemma==.)
Applying the lemma, the $k \times k$ determinant equation reduces to scalar equations in $m(\lambda)$ only: asymptotically the solutions $\lambda$ are such that, for each spike $\omega_i$, a scalar equation involving $m(\lambda)$ and $\omega_i$ holds. So we have to study $x \mapsto m(x)$ outside the support. First we should check that candidate solutions are well located outside the support: there, $m$ is real, of positive derivative, so we can read off its two limiting values at the edges of the support and at infinity.
So it is not difficult to imagine that $m(x)$ turns out to be some function of $x$ that looks like a hyperbola outside the support. So for $x$ beyond the right edge $(1 + \sqrt{c})^2$, the attainable values of the scalar equation are limited, and in this limit case we only have a solution in the regime $\omega > \sqrt{c}$. This implies the fact that only in the situation where the eigenvalue has enough energy, $\omega > \sqrt{c}$, can we have these isolated eigenvalues. This phenomenon is often referred to as the phase transition in the literature.
When we suppose that we have $\omega > \sqrt{c}$, we can go on with our calculation and obtain the position of the outlier:
$$\lambda \to 1 + \omega + c\, \frac{1 + \omega}{\omega}.$$
We consider two limit cases: as $\omega \to \infty$, $\lambda \approx 1 + \omega + c$, so the outlier is driven by the spike itself; and when $\omega \downarrow \sqrt{c}$, $\lambda \to (1 + \sqrt{c})^2$, the right edge of the bulk, which signifies kind of a continuous behavior.
Study of the eigenvectors: what can we say about the projections of the sample eigenvectors onto the population ones? (These are commonly used in array processing.) Idea: we can estimate quantities of the type $a^* \hat{u} \hat{u}^* b$, with $a, b$ deterministic vectors.
==Results==: let $\hat{u}_1, \dots, \hat{u}_k$ be the eigenvectors of $\hat{C}$ associated with the $k$ largest eigenvalues; then as $p, n \to \infty$ with $p/n \to c$, for each spike $\omega_i > \sqrt{c}$ with population eigenvector $u_i$, we have:
$$|\hat{u}_i^* u_i|^2 \to \frac{1 - c\,\omega_i^{-2}}{1 + c\,\omega_i^{-1}},$$
and also $\hat{u}_i^* u_j \to 0$ for $i \neq j$; in particular, the alignment vanishes as $\omega_i \downarrow \sqrt{c}$.
The idea of the proof comes from the fact that $\hat{u}_i \hat{u}_i^*$ is a function of the ==resolvent== via the Cauchy integral formula.
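These alignment results can be checked in simulation too (a sketch; same spiked model as before, with the standard limit $(1 - c/\omega^2)/(1 + c/\omega)$ for the squared projection):

```python
import numpy as np

# Sketch: alignment between the top SCM eigenvector u_hat and the true
# spike direction u = e_1, for C = I_p + omega e_1 e_1^T with omega > sqrt(c);
# |u_hat^* u|^2 should approach (1 - c/omega^2) / (1 + c/omega).
rng = np.random.default_rng(7)
p, n = 400, 1600
c, omega = p / n, 2.0

C_sqrt = np.eye(p)
C_sqrt[0, 0] = np.sqrt(1.0 + omega)
X = C_sqrt @ rng.standard_normal((p, n))
_, V = np.linalg.eigh(X @ X.T / n)          # eigh sorts eigenvalues ascending
u_hat = V[:, -1]                            # eigenvector of the top eigenvalue

alignment = u_hat[0] ** 2                   # |u_hat^* e_1|^2
predicted = (1 - c / omega**2) / (1 + c / omega)
```

Even though the top eigenvalue isolates cleanly, the eigenvector is never fully aligned with the spike: the alignment stays strictly below 1 for finite $\omega$.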