By Zhenyu LIAO, at CentraleSupelec.

Based on the lectures on RMT of Prof. Romain Couillet.

@19/Feb/2016

### Part 1: Motivation

Suppose we have $x_1,x_2,\cdots,x_n\in \mathbb{R}^p$, with $x_i$ i.i.d. and $x_i \sim \mathcal{N}(0,C_p)$.

From the law of large numbers (LLN), the sample covariance matrix (SCM) satisfies:

$$\hat{C}_p = \frac{1}{n}\sum_{i=1}^{n}{x_ix_i^*}\rightarrow C_p$$

when $n\rightarrow \infty$, with almost sure (a.s.) convergence.

But in fact, strictly speaking, the condition is not really $n\rightarrow\infty$, but should be $n \gg p$: $n$ must be much larger than $p$! (In fact, we will see later that this "much larger" means at least 100 times!)

We can write this SCM in a different way: with $X = [x_1,x_2,\cdots,x_n]$, we have

$$\hat{C}_p = \frac{1}{n}XX^*$$

Now something interesting: if $C_p = I_p$ (identity), even with $p>n$, we still have the entry-wise convergence

$$[\hat{C}_p]_{i,j}\rightarrow \delta_{ij}$$

i.e. $1$ if $i=j$ and $0$ otherwise.

But we can also see that, while the matrix $\hat{C}_p$ is of size $p\times p$, it is built from only $n$ independent vectors $x_i$, so:

$$\mathrm{rank}(\hat{C}_p)\le n<p$$

which indicates that $\hat{C}_p$ has at least $p-n$ zero eigenvalues.

This means that if we evaluate $\|\hat{C}_p-C_p\|$ (in spectral norm), the result may not go to zero! (In fact this is due to the slow convergence rate of $\frac{1}{\sqrt{n}}$.)

Conclusion: our classical statistics break down when the condition $n \gg p$ is not satisfied. In other words, when $p$ grows, we cannot use the traditional strategy to approximate matrices such as the covariance matrix.
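A quick numerical illustration of this breakdown (a minimal sketch in Python; the case $C_p = I_p$ and the sizes chosen are arbitrary, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def scm_error(p, n):
    """Spectral-norm error of the SCM when C_p = I_p."""
    X = rng.standard_normal((p, n))      # columns are x_i ~ N(0, I_p)
    C_hat = X @ X.T / n
    return np.linalg.norm(C_hat - np.eye(p), 2)

print(scm_error(p=10, n=10000))    # n >> p : small error (classical regime)
print(scm_error(p=1000, n=2000))   # n ~ p  : the error stays O(1), it does not vanish
```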

But with RMT, we know what happens in the regime where $p,n\rightarrow \infty$ with $\frac pn \rightarrow c\in (0,+\infty)$: this is the Marchenko-Pastur law, presented next.

### Part 2: The Marchenko-Pastur law and the Stieltjes Transform

#### The MP law

**DEF**: The empirical spectral distribution (ESD) $\mu_A$ of a Hermitian matrix $A\in \mathbb{R}^{p\times p}$ is defined as the following measure:

$$\mu_A = \frac 1p \sum_{i=1}^{p}\delta_{\lambda_i(A)}$$

Notice that for $a\le b \in \mathbb{R}$, we have:

$$\mu_A([a,b])=\int_{a}^{b}\frac 1p \sum_{i=1}^{p}\delta_{\lambda_i(A)}(x)\,dx = \frac 1p \#\{\lambda_i(A)\in [a,b]\}$$

(so it is indeed a kind of probability distribution, isn't it?)

**THEO** (the Marchenko-Pastur distribution for large-dimensional covariance matrices): let $X = (X_{i,j})_{1\le i\le p,1\le j \le n}$ have i.i.d. entries with zero mean and unit variance. Then:

$$\mu_{\frac 1n XX^*}\rightarrow \mu_c$$

as $p,n\rightarrow \infty$ with $\frac pn \rightarrow c\in (0,+\infty)$, where:

$$\mu_c(\{0\}) = \left(1-\frac 1c\right)^+,\qquad \mu_c(dx) = f(x)\,dx$$

with

$$f(x) = \frac{1}{2\pi cx}\sqrt{\left[x-(1-\sqrt{c})^2\right]^+\cdot \left[(1+\sqrt{c})^2-x\right]^+}$$

for all $x \neq 0$.

Clearly this density is supported on $[(1-\sqrt{c})^2,(1+\sqrt{c})^2]$, a support that shrinks to the single point $1$ when $c\rightarrow 0$, i.e. when $n \gg p$.
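This is easy to observe numerically (a minimal sketch; the dimensions and the number of histogram bins are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
p, n = 1000, 2000
c = p / n   # c = 0.5 < 1 here, so no atom at 0

X = rng.standard_normal((p, n))
eigs = np.linalg.eigvalsh(X @ X.T / n)

a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
x = np.linspace(a, b, 500)
f = np.sqrt(np.maximum(x - a, 0) * np.maximum(b - x, 0)) / (2 * np.pi * c * x)

plt.hist(eigs, bins=60, density=True, alpha=0.5, label="eigenvalues of $XX^*/n$")
plt.plot(x, f, "r", label="Marchenko-Pastur density")
plt.legend()
plt.show()
```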

##### Historical recall

The first RMT result dates from the 1950s (Wigner's semicircle law): let $X \in \mathbb{R}^{p\times p}$ be Hermitian with i.i.d. entries of zero mean and unit variance. Then:

$$\mu_{\frac{1}{\sqrt{p}}X}\rightarrow \mu_{s.c.} \quad\text{where}\quad \mu_{s.c.}(dx) = \frac{1}{2\pi}\sqrt{(4-x^2)^+}\,dx$$

**Method**: by calculating its moments (more details later).

**Remark**: this convergence is a little subtle: it is a convergence in law for the measure $\mu$, holding almost surely (a.s.) with respect to the random matrix $X$.

#### The Stieltjes Transform

**DEF**: let $\mu$ be a finite measure with $\mathrm{Supp}(\mu)\subset \mathbb{R}$. Then the **Stieltjes transform** $m_\mu$ of $\mu$ is defined as:

$$m_\mu(z) = \int \frac{\mu(dt)}{t-z} \quad \text{for all }z \notin \mathrm{Supp}(\mu)$$

**Important remark**: if $\mu= \mu_{A}$ (see its definition in the previous section), then, just by exchanging integral and sum,

$$m_{\mu_A}(z) = \int \frac{1}{t-z}\,\frac 1p \sum_{i=1}^{p}\delta_{\lambda_i(A)}(t)\,dt = \frac 1p \sum_{i=1}^{p}\frac{1}{\lambda_i(A) - z} = \frac 1p \operatorname{tr}(\Lambda - zI_p)^{-1}$$

with $\Lambda = \operatorname{diag}(\lambda_1(A),\ldots,\lambda_p(A))$.

Moreover, for any unitary $U$ ($UU^* = U^*U = I_p$), using ==a trick here==, $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, we have:

$$\frac 1p \operatorname{tr}[(\Lambda - zI_p)^{-1}UU^*] = \frac 1p \operatorname{tr}[U^*(\Lambda - zI_p)^{-1}U] = \frac 1p \operatorname{tr}(U^*\Lambda U - zI_p)^{-1}$$

and in particular, if we take $U$ to be the eigenvector matrix of $A$, we recover $\frac 1p \operatorname{tr}(A - zI_p)^{-1}$. We call this important matrix $(A - zI_p)^{-1}$ the **resolvent** of $A$.

**Idea**: reduce the computation of eigenvalues to that of a matrix inverse, for which many more results are known.

**PROP**: let $m_\mu(z)$ be the ST of a real-supported finite measure $\mu$. Then for all $a,b$ continuity points of $\mu$, we have:

$$\mu([a,b]) = \lim_{y\rightarrow 0}\frac 1\pi \int_{a}^{b}\Im[m_\mu(x+iy)]\,dx$$

**Proof**: by the little trick $\frac{d}{dx}{\arctan(x)} = \frac{1}{x^2+1}$, a change of variable, and then taking $\lim_{y\to 0}$.

**Attention**! Use with care: $m_\mu(z)$ is not defined on $\mathrm{Supp}(\mu)$, but this limit can exist anyway!

**Remark**: where $\mu$ has a density $f$, it is retrieved as

$$f(x) = \lim_{y\rightarrow 0}\frac 1\pi \Im[m_\mu(x+iy)]$$

and in the "discrete case" (an atom at $x$), a direct result is:

$$\mu(\{x\}) = \lim_{y\rightarrow 0}-iy\cdot m_\mu(x+iy)$$

**Proof**: use the definition directly with the DCT (dominated convergence theorem) and the fact that $\lim_{y\to 0}\frac{-iy}{(t-x)-iy}=\mathbf{1}_{\{t=x\}}$.
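A small numerical illustration of these inversion formulas on a toy ESD (a minimal sketch; the eigenvalues below are arbitrary choices):

```python
import numpy as np

lam = np.array([0.0, 0.0, 1.0, 2.0])   # eigenvalues of some Hermitian A (p = 4)

def m(z):
    """Stieltjes transform of the ESD of A."""
    return np.mean(1.0 / (lam - z))

# the ESD has an atom at x = 0 of mass 2/4 = 0.5
for y in (1e-2, 1e-4, 1e-6):
    print(y, np.real(-1j * y * m(0.0 + 1j * y)))   # -> 0.5 as y -> 0
```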

#### Proof of the MP theorem

Take $X = (X_{i,j})_{1\le i\le p,1\le j \le n}$ with i.i.d. entries of zero mean and unit variance. Then:

$$\mu_{\frac 1n XX^*} = \frac 1p \sum_{i=1}^{p}{\delta_{\lambda_i(\frac 1n XX^*)}}$$

and

$$m_{\mu_{\frac 1n XX^*}}(z) = \frac 1p \operatorname{tr}\left(\frac 1n XX^*-zI_p\right)^{-1},\quad \text{with}\ z \in \mathbb{C}^+ = \{z\,|\,\Im(z)>0\}$$

Let us start with the first entry of the resolvent, $[(\frac 1n XX^*-zI_p)^{-1}]_{11}$. Write $X = \begin{bmatrix} x^* \\ \tilde{X}\end{bmatrix}$ with $x^* \in \mathbb{R}^{1\times n}$ the first row and $\tilde{X}\in \mathbb{R}^{(p-1)\times n}$, so that we can decompose our matrix in the following form:

$$\frac 1n XX^*-zI_p = \begin{bmatrix}\frac 1n x^*x-z & \frac 1n x^*\tilde{X}^* \\ \frac 1n \tilde{X}x & \frac 1n \tilde{X}\tilde{X}^*-zI_{p-1}\end{bmatrix}$$

With the known result on matrix inversion by blocks:

$$\begin{bmatrix}A & B\\ C& D\end{bmatrix}^{-1} = \begin{bmatrix}[A-BD^{-1}C]^{-1} & \cdots \\\cdots & \cdots \end{bmatrix}$$

the first entry reads:

$$\left[\left(\tfrac 1n XX^*-zI_p\right)^{-1}\right]_{11} = \frac{1}{-z-\frac zn x^*(\frac 1n\tilde{X}^*\tilde{X}-zI_n)^{-1}x}$$

where ==a useful trick==, $A(I_m+BA)^{-1} = (I_n+AB)^{-1}A$, was used to simplify the huge inverse term.

==An interesting trick!== When the matrix $A$ we work with has bounded spectral norm ($\max_i|\lambda_{i}(A)|<K$, with $K$ independent of $n$), and $x$ is as always the random vector, we can estimate $\frac 1n x^*Ax$ by $\frac 1n \operatorname{tr}(A)$, with a.s. convergence. This comes from the following computation:

$$E[x^*Ax] = E[\operatorname{tr}(x^*Ax)] = E[\operatorname{tr}(xx^*A)] = \operatorname{tr}[E(xx^*)A] = \operatorname{tr}(A)$$

(known as the ==trace lemma==).

(known as ==trace lemma==)

behind this formula: with a

rank 1

perturbation, and after the mean by

\frac 1n

, the behaviour of matrix(e。x。 trace) will not change。

With the ==lemma of rank 1 perturbation==, we can say that:

\frac 1n tr(X^*X)^{-1} \simeq \frac 1n tr(\tilde{X}^*\tilde{X})^{-1}
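Both lemmas are easy to observe numerically (a minimal sketch; the point $z=-1$ is an arbitrary choice lying safely outside the positive spectrum, and the normalization is by the dimension of the resolvent):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, z = 1000, 2000, -1.0   # z = -1 is outside the spectrum of XX^*/n >= 0

X = rng.standard_normal((p, n))
x = rng.standard_normal(p)
Q = np.linalg.inv(X @ X.T / n - z * np.eye(p))   # resolvent, bounded spectral norm

# trace lemma: (1/p) x^* Q x ~ (1/p) tr Q
print(x @ Q @ x / p, np.trace(Q) / p)

# rank-1 perturbation: dropping one column of X (a rank-1 change of XX^*/n)
# barely moves the normalized trace of the resolvent
X1 = X[:, 1:]
Q1 = np.linalg.inv(X1 @ X1.T / n - z * np.eye(p))
print(np.trace(Q) / p, np.trace(Q1) / p)
```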

We have just treated the first entry, but the same works for all the others, so:

$$m_{\mu_{\frac 1n XX^*}}(z) = \frac 1p \operatorname{tr}\left(\frac 1n XX^*-zI_p\right)^{-1}\simeq \frac{1}{-z-z\frac 1n \operatorname{tr}(\frac 1n X^*X-zI_n)^{-1}}$$

Then, with the elementary fact that $\frac 1n XX^*$ and $\frac 1n X^*X$ share the same non-zero eigenvalues, we can ==connect to the eigenvalues matrix== (written here in the regime $p > n$):

$$\operatorname{tr}\left(\frac 1n XX^*-zI_p\right)^{-1} = \operatorname{tr}(\Lambda-zI_p)^{-1} = \sum_{i=1}^{p}\frac{1}{\lambda_i(\frac 1n XX^*)-z} =\sum_{i=1}^{n}\frac{1}{\lambda_i(\frac 1n X^*X)-z} + (p-n)\frac{1}{0-z}$$

In the limit we thus obtain:

$$m_\mu(z) = \frac{1}{-z-zc\,m_\mu(z)-c +1}$$

which means $m_\mu$ is a ==solution of equation==

$$-zc\,m^2(z)+(1-c-z)m(z)-1=0$$

Obviously there are two possible roots, and we should choose the one of finite norm, i.e. the one that is a valid Stieltjes transform ($\Im\, m(z)>0$ for $z\in\mathbb{C}^+$). Finally, with $\lim_{y \to 0}$ and the inversion formula above, we obtain the MP distribution!

QED

@26/Feb/2016

### Part 3: The sample covariance matrix

**THEO**: let $X\in \mathbb{R}^{p\times n}$ have i.i.d. $(0,1)$ entries, and let $R\in \mathbb{R}^{p\times p}$ be deterministic, non-negative definite Hermitian, such that:

$$\nu_p \overset{\Delta}{=} \frac 1p \sum_{i=1}^{p}{\delta_{\lambda_i(R)}}\rightarrow \nu$$

Then as $p,n\to \infty$ with $\frac pn\to c\in (0,\infty)$, we have:

$$\mu_p \overset{\Delta}{=} \frac 1n \sum_{i=1}^{n}{\delta_{\lambda_i(\frac 1n X^*RX)}}\to \mu$$

where $\mu$ is a probability measure whose ST $m_\mu(z)$ is the unique solution in $\mathbb{C}^+$, for $z \in \mathbb{C}^+ \overset{\Delta}{=} \{z\in \mathbb{C},\Im(z)>0\}$, of:

$$m_\mu(z) = \left(-z+c\int\frac{t\cdot\nu(dt)}{1+t\cdot m_\mu(z)}\right)^{-1}$$

(an implicit relation: in general there may be NO analytical solution).

**E.X.**: take

$$R = \begin{bmatrix}I_{\frac p3} & 0 & 0 \\0 & 3I_{\frac p3} & 0 \\0 & 0 & 7I_{\frac p3}\end{bmatrix}$$

so that $\nu = \frac 13(\delta_1+\delta_3+\delta_7)$, which means 3 peaks at eigenvalues 1, 3 and 7.

The problem is how to plot $\mu$, the asymptotic version of $\mu_p$. The idea: compute $\frac 1\pi\Im[m_\mu(x+i\epsilon)]$ with $\epsilon$ rather small, say $10^{-6}$, by taking $m_\mu(x+i\epsilon) = \lim_{k\to \infty}m^{(k)}_\mu(x+i\epsilon)$, with $(k)$ the iteration index. We are obviously solving the implicit equation iteratively, or, in a more formal way:

$$m^{(k)}_\mu(z) = \left(-z+c\int\frac{t\cdot\nu(dt)}{1+t\cdot m^{(k-1)}_\mu(z)}\right)^{-1}$$

initialized with $m^{(0)}_\mu(z) = 0$.

The approximated density of $\mu$ at the point $x$ is then given by $\frac 1\pi\Im[m_\mu(x+i\epsilon)]$, for every $x \in \mathbb{R}$.
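A minimal sketch of this recipe in Python, for the 3-peak example above ($c$, $\epsilon$, the grid and the damping are arbitrary numerical choices; a little damping helps the fixed point converge):

```python
import numpy as np

c = 0.1
atoms = np.array([1.0, 3.0, 7.0])   # support of nu
wts = np.ones(3) / 3                # nu = (1/3)(delta_1 + delta_3 + delta_7)
eps = 1e-6

def m_mu(x, n_iter=2000, damp=0.5):
    """Damped fixed-point iteration for m_mu(x + i*eps), initialized at 0."""
    z = x + 1j * eps
    m = 0.0j
    for _ in range(n_iter):
        m_new = 1.0 / (-z + c * np.sum(wts * atoms / (1.0 + atoms * m)))
        m = damp * m + (1 - damp) * m_new
    return m

xs = np.linspace(0.05, 12.0, 600)
density = [m_mu(x).imag / np.pi for x in xs]   # approximate density of mu on the grid
```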

**Remark** (relation to the SCM, sample covariance matrix): let $y_1,\cdots,y_n$ be independent vectors in $\mathbb{R}^p$ such that $y_i = R^{\frac 12}x_i$, with $x_i\in \mathbb{R}^p$ having i.i.d. $(0,1)$ entries.

Then $E[y_i] = 0$ and $E[y_iy^*_i] = R$, and the SCM estimator of $R$ is given by:

$$\hat{R} \overset{\Delta}{=} \frac 1n \sum_{i=1}^{n}{y_iy^*_i} = \frac 1n R^{\frac 12}{XX^*}R^{\frac 12}$$

with $X = [x_1,x_2,\cdots,x_n]$.

(The idea behind is to use the eigenvalues of $\hat{R}$ to estimate/say something about $R$.) Since the $p\times p$ matrix $\frac 1n R^{\frac 12}XX^*R^{\frac 12}$ and the $n\times n$ matrix $\frac 1n X^*RX$ share the same non-zero eigenvalues, we have:

$$\tilde{\mu}_p \overset{\Delta}{=} \frac 1p \sum_{i=1}^{p}{\delta_{\lambda_i(\frac 1n R^{\frac 12}XX^*R^{\frac 12})}} = \frac np \mu_p + \frac{p-n}{p}\delta_0$$

and in terms of Stieltjes transforms:

$$m_{\tilde{\mu}_p}(z) = \frac np m_{\mu_p}(z) + \frac{n-p}{p}\frac 1z$$

With our notation $c = \frac pn$, we can rewrite this, in the limit, in the following form:

$$\tilde{\mu}_p \to \tilde\mu = \frac 1c \mu + \left(1-\frac 1c\right)\delta_0$$

$$m_{\tilde{\mu}_p}(z) \to m_{\tilde\mu}(z) = \frac 1c m_{\mu}(z) + \left(\frac 1c -1\right)\frac 1z$$

**Application** in wireless communication: in a MIMO system with $n$ transmit and $p$ receive antennas, when trying to calculate the ergodic capacity.

### Part 4: More elaborate models and ==deterministic equivalents==

**THEO**: let $R\in \mathbb{R}^{p\times p}$ and $T\in \mathbb{R}^{n\times n}$ be non-negative definite, and $X\in \mathbb{R}^{p\times n}$ with i.i.d. $(0,1)$ entries, such that:

$$\begin{cases}\limsup_{p}\|R\|<\infty \\ \limsup_{n}\|T\| < \infty\end{cases}$$

(with $\|\cdot\|$ the spectral norm here).

Then as $n,p\to \infty$ with $0<\liminf\frac pn \le \limsup \frac pn <\infty$,

$$\mu_p\overset{\Delta}{=}\frac 1p \sum_{i = 1}^{p}\delta_{\lambda_i (\frac 1n R^{\frac 12}XTX^*R^{\frac 12})}$$

is such that

$$\mu_p - \bar{\mu}_p \overset{a.s.}{\rightarrow}0$$

where $\bar{\mu}_p$ is a (deterministic) probability measure, with ST

$$m_{\bar{\mu}_p}(z) = \frac 1p \operatorname{tr}\left(-z[\tilde{e}_p(z)R+I_p]\right)^{-1}$$

with $(\tilde{e}_p(z),{e}_p(z))$ the unique solution pair in $\mathbb{C}^+$ for $z\in\mathbb{C}^+$ of:

$$\begin{cases}\tilde{e}_p(z) = \frac 1n \operatorname{tr} T\left(-z[{e}_p(z)T+I_n]\right)^{-1}\\{e}_p(z) = \frac 1n \operatorname{tr} R\left(-z[\tilde{e}_p(z)R+I_p]\right)^{-1}\end{cases}$$

which are also called the ==fundamental equations==.
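A minimal sketch of solving these fundamental equations by (damped) fixed-point iteration, checking the deterministic equivalent against one random realization; the diagonal $R$ and $T$, the point $z$ and all sizes are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 300, 600
r = rng.uniform(0.5, 2.0, p)   # eigenvalues of a diagonal R
t = rng.uniform(0.5, 2.0, n)   # eigenvalues of a diagonal T

def fundamental(z, n_iter=5000, damp=0.5):
    """Fixed point of the fundamental equations at z (diagonal R, T)."""
    e, et = 0.0j, 0.0j   # e_p(z) and tilde{e}_p(z)
    for _ in range(n_iter):
        et_new = np.sum(t / (-z * (e * t + 1.0))) / n
        e_new = np.sum(r / (-z * (et * r + 1.0))) / n
        e = damp * e + (1 - damp) * e_new
        et = damp * et + (1 - damp) * et_new
    return e, et

z = 2.0 + 0.1j
e, et = fundamental(z)
m_bar = np.sum(1.0 / (-z * (et * r + 1.0))) / p   # m_{bar mu_p}(z)

# empirical m_{mu_p}(z) for one realization of X
X = rng.standard_normal((p, n))
A = np.sqrt(r)[:, None] * X                       # R^{1/2} X
B = (A * t) @ A.T / n                             # (1/n) R^{1/2} X T X^* R^{1/2}
m_emp = np.trace(np.linalg.inv(B - z * np.eye(p))) / p

print(m_bar, m_emp)   # the two values are close
```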

**Remark**:

- the value of $m_{\mu_p}(z)$ itself never converges, but its difference with the deterministic function $m_{\bar{\mu}_p}(z)$ tends to 0;
- this is the same kind of theorem as before, but more general, and with equations whose solution is not explicit.

**Proof**: the Bai-Silverstein approach: the idea of guessing the solution iteratively.

A constructive, iterative way of finding the solution: with

$$m_{\mu_p}(z) = \frac 1p \operatorname{tr}\left(-zI_p +\frac 1n R^{\frac 12}XTX^*R^{\frac 12}\right)^{-1}$$

we start from (writing $B = \frac 1n R^{\frac 12}XTX^*R^{\frac 12}$, and $D$ a deterministic matrix to be designed):

$$\frac 1p \operatorname{tr}(-zI_p +B)^{-1} - \frac 1p \operatorname{tr} D^{-1} =\frac 1p \operatorname{tr}\left[(-zI_p+B)^{-1}(D+zI_p-B)D^{-1}\right]$$

**Remark** (==resolvent identity==):

$$A^{-1}-B^{-1} = -A^{-1}(A-B)B^{-1}$$

**Proof**: multiply by $A$ from the left and by $B$ from the right.

So, with a good design of the matrix $D$, we can solve this problem iteratively. First guess for $D$: $D = -zI_p+ R^{\frac 12}FR^{\frac 12}$, so that the difference above becomes:

$$\frac 1p \operatorname{tr}(-zI_p+B)^{-1}R^{\frac 12}\left(F-\frac 1n XTX^*\right)R^{\frac 12}D^{-1}$$

Since $X$ has random i.i.d. $(0,1)$ entries, $\frac 1n XTX^*$ tends to average out to something *uniform*. So let us try $F = f(z)I_p$; we get:

$$\frac 1p \operatorname{tr}(-zI_p+B)^{-1}R^{\frac 12}\left(f(z)I_p-\frac 1n XTX^*\right)R^{\frac 12}D^{-1}\\= f(z)\frac 1p \operatorname{tr}\left[(-zI_p+B)^{-1}RD^{-1}\right]-\frac 1p \operatorname{tr}\left[(-zI_p+B)^{-1}R^{\frac 12}\tfrac 1n XTX^* R^{\frac 12}D^{-1}\right]$$

Let us assume $T$ *diagonal*; this condition simplifies our proof but is not necessary with more advanced techniques. Then we can decompose $\frac 1n XTX^* = \frac 1n\sum_{i=1}^{n}t_i x_ix_i^*$, so that:

$$\frac 1p \operatorname{tr}(-zI_p +B)^{-1} - \frac 1p \operatorname{tr} D^{-1}\\ = f(z)\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]-\frac 1n \sum_{i=1}^{n}t_i \frac 1p \operatorname{tr}[(-zI_p+B)^{-1}R^{\frac 12}x_ix_i^* R^{\frac 12}D^{-1}]\\ =f(z)\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]-\frac 1n \sum_{i=1}^{n}t_i \frac 1p x_i^*R^{\frac 12}D^{-1}(-zI_p+B)^{-1}R^{\frac 12}x_i$$

(using $\operatorname{tr}(AB)=\operatorname{tr}(BA)$ and the fact that the trace of a scalar is the scalar itself).

Here

$$B = \frac 1n R^{\frac 12}XTX^*R^{\frac 12}=\frac 1n \sum_{j=1}^{n}t_jR^{\frac 12}x_jx_j^*R^{\frac 12}$$

depends on $x_i$, but only through one term; all the other terms carry different indices! So we can of course separate this sum into the two parts $j=i$ and $j\neq i$.

**Lemma** (==useful trick==):

$$(A+tvv^*)^{-1}v = \frac{A^{-1}v}{1+tv^*A^{-1}v}$$

**Proof**: multiply from the left by the matrix $(A+tvv^*)$.

So, applying it with

$$\begin{cases}A = -zI_p + \frac 1n\sum_{j\neq i}t_jR^{\frac 12}x_jx_j^*R^{\frac 12}\\ t = \frac 1n t_{i}\\ v = R^{\frac 12}x_{i}\end{cases}$$

we get:

$$= f(z)\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]-\frac 1n \sum_{i=1}^{n}t_i \frac 1p \frac{x_{i}^*R^{\frac 12}D^{-1}\left(-zI_p + \frac 1n\sum_{j\neq i}t_jR^{\frac 12}x_jx_j^*R^{\frac 12}\right)^{-1}R^{\frac 12}x_{i}}{1+\frac 1n t_{i}\,x_{i}^*R^{\frac 12}A^{-1}R^{\frac 12}x_{i}}$$

With this $x_i^*Mx_i$ form in both the numerator and the denominator, we are led to use the ==trace lemma== to get rid of the random part ($M$ having bounded spectral norm). So we will have:

$$\approx f(z)\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]-\frac 1n \sum_{i=1}^{n}t_i \frac 1p \frac{\operatorname{tr}\left[\left(-zI_p + \frac 1n\sum_{j\neq i}t_jR^{\frac 12}x_jx_j^*R^{\frac 12}\right)^{-1}RD^{-1}\right]}{1+\frac 1n t_{i}\operatorname{tr}\left[R\left(-zI_p + \frac 1n\sum_{j\neq i}t_jR^{\frac 12}x_jx_j^*R^{\frac 12}\right)^{-1}\right]}$$

Once more, this form calls for the ==rank 1 perturbation lemma==,

$$\left|\frac 1n \operatorname{tr}[A(B+vv^*-zI)^{-1}] - \frac 1n \operatorname{tr}[A(B-zI)^{-1}]\right|\le \frac{\|A\|}{n\,\Im[z]}$$

so we have:

$$\approx f(z)\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]-\frac 1n \sum_{i=1}^{n}t_i \frac 1p \frac{\operatorname{tr}[(-zI_p + B)^{-1}RD^{-1}]}{1+\frac 1n t_{i}\operatorname{tr}[R(-zI_p + B)^{-1}]}$$

We are happy now, because the same factor $\frac 1p \operatorname{tr}[(-zI_p+B)^{-1}RD^{-1}]$ appears on the left and on the right. So finally, for the difference to vanish, we can take

$$f(z)\approx \frac 1n \sum_{i=1}^{n}t_i \frac{1}{1+\frac 1n t_{i}\operatorname{tr}[R(-zI_p + B)^{-1}]}$$

and, setting $\tilde{f}(z) = \frac 1n \operatorname{tr}[R(-zI_p + B)^{-1}]$, we get:

$$f(z) = \frac 1n \operatorname{tr}[T(I_n+T\tilde{f}(z))^{-1}]$$

**Conclusion**:

$$\frac 1p \operatorname{tr}(-zI_p+B)^{-1} - \frac 1p \operatorname{tr}(-zI_p+Rf(z))^{-1}\approx 0$$

with $D = -zI_p+Rf(z)$ and

$$\begin{cases}f(z) = \frac 1n \operatorname{tr}[T(I_n+T\tilde{f}(z))^{-1}]\\ \tilde{f}(z) = \frac 1n \operatorname{tr}[R(-zI_p + B)^{-1}]\end{cases}$$

but obviously this loop is not closed: the random matrix $B$ still appears in $\tilde f$!

So, by doing the same for $\frac 1n \operatorname{tr}[R(-zI_p + B)^{-1}]$, we can get:

$$\frac 1n \operatorname{tr}[R(-zI_p + B)^{-1}]\approx \frac 1n \operatorname{tr}[R(-zI_p + Rf(z))^{-1}]$$

and we are happy to see this loop closed.

**Final conclusion**:

$$m_{\mu_p}(z)\approx \frac 1p \operatorname{tr}(-zI_p+f(z)R)^{-1}$$

where

$$\begin{cases}f(z) = \frac 1n \operatorname{tr}[T(I_n+T\tilde{f}(z))^{-1}]\\ \tilde{f}(z)= \frac 1n \operatorname{tr}[R(-zI_p + Rf(z))^{-1}]\end{cases}$$

and simply by substituting

$$\begin{cases}\tilde{e}_p(z) = -\frac{f(z)}{z}\\ e_p(z) = \tilde{f}(z)\end{cases}$$

we get exactly the statement of the theorem.

@1/Mar/2016

With $X\in\mathbb{R}^{p\times n}$ and $X_{i,j}\sim (0,1)$ i.i.d. entries, we have come across:

- the white noise model $\Rightarrow$ MP law: $\frac 1n XX^*$; application in signal processing: detection;
- the sample covariance matrix: $\frac 1n R^{\frac 12}XX^*R^{\frac 12}$, or $\frac 1n \sum_{i=1}^{n}y_iy_i^*$ with $y_i = R^{\frac 12}x_i$; application in signal processing: subspace methods, e.g. MUSIC;
- the time-dependent SCM / wireless communication (MIMO channel model): $H = \frac{1}{\sqrt{n}}R^{\frac 12}XT^{\frac 12}$, with $R$ the Rx covariance matrix and $T$ the Tx covariance matrix.

### Part 5: Application to wireless communication

#### MIMO ergodic capacity

A MIMO system is naturally connected to RMT. The ergodic capacity is given by

$$C(\sigma^2) = \sup_{P\ge 0,\ \operatorname{tr}(P)\le 1,\ P\in \mathbb{R}^{n\times n}} \mathbb{E}\left[\log\det\left(I_p+\frac{1}{\sigma^2}HPH^*\right)\right]$$

where the expectation is over the random channel $H$.

In order to solve this problem, let us first recall our ==fundamental equations==: with $\bar{\mu}_p$ the deterministic equivalent for

$$m_{\mu_p}(z) = \frac 1p \operatorname{tr}\left(-zI_p +\frac 1n R^{\frac 12}XTX^*R^{\frac 12}\right)^{-1}$$

its ST is

$$m_{\bar{\mu}_p}(z) = \frac 1p \operatorname{tr}(-z[\tilde{e}_p(z)R+I_p])^{-1}$$

with $(\tilde{e}_p(z),{e}_p(z))$ the unique solution pair in $\mathbb{C}^+$ for $z\in\mathbb{C}^+$:

$$\begin{cases}\tilde{e}_p(z) = \frac 1n \operatorname{tr} T(-z[{e}_p(z)T+I_n])^{-1}\\{e}_p(z) = \frac 1n \operatorname{tr} R(-z[\tilde{e}_p(z)R+I_p])^{-1}\end{cases}$$

We want to evaluate $\int\log(1+\frac{1}{\sigma^2}t)\,\mu_{\frac 1n R^{\frac 12}XTX^*R^{\frac 12}}(dt)$, with the following lemma (the ==Shannon transform== of $\mu$):

**Lemma**: for $\mu$ a probability measure with support in $\mathbb{R}^+$, we have:

$$\int\log\left(1+\frac 1x t\right)\mu(dt) = \int_{w=x}^{\infty}\left(\frac 1w - m_{\mu}(-w)\right)dw$$

**Proof**:

- change variables with $y = \frac 1x$;
- exchange the integrals;
- try to make the $m_{\mu}$ term appear;
- pay attention when changing variables within integrals.
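A quick numerical check of the lemma on the discrete measure $\nu = \frac 13(\delta_1+\delta_3+\delta_7)$ from Part 3 (a minimal sketch; the grid and cutoff of the numerical integration are arbitrary):

```python
import numpy as np

atoms, wts, x = np.array([1.0, 3.0, 7.0]), np.ones(3) / 3, 2.0

# left-hand side: integral of log(1 + t/x) against mu
lhs = np.sum(wts * np.log(1.0 + atoms / x))

# right-hand side: integral over [x, inf) of 1/w - m_mu(-w)
w = np.logspace(np.log10(x), 8, 200001)
m = np.sum(wts[None, :] / (atoms[None, :] + w[:, None]), axis=1)   # m_mu(-w)
rhs = np.trapz(1.0 / w - m, w)

print(lhs, rhs)   # the two values agree
```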

**THEO**: for $H = \frac{1}{\sqrt{n}}R^{\frac 12}XT^{\frac 12}$, we have:

$$C(\sigma^2,I_p)-\bar{C}(\sigma^2,I_p) \overset{a.s.}{\rightarrow}0$$

where

$$\bar{C}(\sigma^2,I_p)=\frac 1p \log\det(I_p + \tilde{e}_p(-\sigma^2)R) + \frac 1p \log\det(I_n + e_p(-\sigma^2)T) - \frac np \sigma^2 e_p(-\sigma^2)\tilde{e}_p(-\sigma^2)$$

**Proof**:

1. differentiate $\bar{C}(\sigma^2,I_p)$ using the chain rule, i.e.

$$\frac{d\bar{C}}{d\sigma^2} = \frac{\partial\bar{C}}{\partial e_p}\cdot\frac{de_p}{d\sigma^2} + \frac{\partial\bar{C}}{\partial\tilde{e}_p}\cdot\frac{d\tilde{e}_p}{d\sigma^2}+ \frac{\partial\bar{C}}{\partial\sigma^2}$$

2. it is important to notice that

$$\frac{\partial}{\partial e_p}\frac 1p \log\det(I+e_p T) = \frac{\partial}{\partial e_p}\frac 1p \sum_{i = 1}^{n}\log[1+e_p\lambda_i(T)]=\frac 1p \operatorname{tr}[T(I+e_pT)^{-1}]$$

which later makes the $\frac{\partial\bar{C}}{\partial e_p}$ term vanish; the same holds for the $\frac{\partial\bar{C}}{\partial \tilde{e}_p}$ term;

3. so finally only $\frac{\partial\bar{C}}{\partial \sigma^2}$ remains, which is equal to $m_{\bar{\mu}}(-\sigma^2)-\frac{1}{\sigma^2}$ (with a little trick: ==a constant getting into the trace== and a make-up/elimination step);

4. now work on the original $C(\sigma^2)$: also by differentiation, using the lemma above:

$$\frac{d}{d\sigma^2} \int \log\left(1+\frac{t}{\sigma^2}\right)\bar{\mu}(dt) = \frac{d}{d\sigma^2}\int_{w=\sigma^2}^{\infty}\left(\frac 1w -m_{\bar{\mu}}(-w)\right)dw = -\frac{1}{\sigma^2} + m_{\bar{\mu}}(-\sigma^2)$$

5. from steps 3 and 4, the derivatives of $C$ and $\bar{C}$ coincide asymptotically; it remains to find the possible constant difference, which turns out to be 0.
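A Monte-Carlo sanity check of this theorem for the uniform power allocation (a minimal sketch; the diagonal $R$, $T$, the noise level $\sigma^2$ and all sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, s2 = 200, 400, 0.5
r = rng.uniform(0.5, 2.0, p)   # eigenvalues of a diagonal R
t = rng.uniform(0.5, 2.0, n)   # eigenvalues of a diagonal T

# fundamental equations at z = -sigma^2 (real fixed point, damped iteration)
e, et = 0.0, 0.0
for _ in range(2000):
    et = 0.5 * et + 0.5 * np.sum(t / (s2 * (e * t + 1.0))) / n
    e = 0.5 * e + 0.5 * np.sum(r / (s2 * (et * r + 1.0))) / n

C_bar = (np.sum(np.log(1.0 + et * r)) / p
         + np.sum(np.log(1.0 + e * t)) / p
         - (n / p) * s2 * e * et)

# Monte-Carlo estimate of (1/p) E[log det(I_p + H H^*/sigma^2)]
vals = []
for _ in range(20):
    X = rng.standard_normal((p, n))
    H = np.sqrt(r)[:, None] * X * np.sqrt(t)[None, :] / np.sqrt(n)   # R^{1/2} X T^{1/2}/sqrt(n)
    vals.append(np.linalg.slogdet(np.eye(p) + H @ H.T / s2)[1] / p)

print(C_bar, np.mean(vals))   # the two values are close for large p, n
```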

Then replace $T$ by $T^{\frac 12}PT^{\frac 12}$, in order to get the variable $P$ involved.

Our ==fundamental equations== then become more complex, but we always work with the same idea, differentiating with respect to $P$ and hoping to get $0$.

Things then turn out to be a well-known problem, the **waterfilling problem**, which can be solved by the **iterative waterfilling algorithm** (see more details here).

### Part 6: Spiked models

**DEF**: consider again the model $\frac 1n R^{\frac 12} X X^* R^{\frac 12}$, and take the particular case where $R= I_p$ except that the last diagonal element equals $\alpha$.

In this case, the measure

$$\nu = \lim_{p\to\infty}\frac 1p\sum_{i=1}^{p}\delta_{\lambda_i (R)} = \lim_{p\to\infty}\frac 1p[(p-1)\delta_1 + \delta_{\alpha}] = \delta_1$$

so the limiting measure $\mu$ will look like the M-P law, but with an outlier caused by $\alpha$.

**Eigenvalues of a finite-rank perturbation**

Multiple versions of spiked models:

- $\frac 1n(I+V\Omega V^*)^{\frac 12}XX^*(I+V\Omega V^*)^{\frac 12}$, with $V\in \mathbb{R}^{p\times k}$ and $\Omega\in \mathbb{R}^{k\times k}$ diagonal, in the regime where $P = V\Omega V^*$ is of small rank, i.e. $n,p\to\infty$ while $k$ stays unchanged;
- $\frac 1n XX^* + V\Omega V^*$;
- $\frac 1n (X+U\Omega U^*)(X+U \Omega U^*)^*$, with $U\in \mathbb{R}^{n\times k}$ and unitary ($U^*U = I_k$).

The goal is to identify the existence and position of the **isolated eigenvalues**. With the notation $P = V\Omega V^*$, we solve the following equation:

$$\det\left( \frac 1n (I_p+P)^{\frac 12}XX^*(I_p+P)^{\frac 12} - \lambda I_p\right) = 0$$

for $\lambda\notin [(1-\sqrt{c})^2,(1+\sqrt{c})^2]$, with of course $c = \lim\frac pn$.

With the useful ==determinant property== $\det(AB) = \det(A)\det(B)$ and the ==resolvent identity==, and noting that $\det(I_p+P)\neq 0$, we get:

$$0 = \det(I_p + P)\cdot\det\left(\frac 1n XX^* - \lambda(I_p +P)^{-1}\right) = \det\left(\frac 1n XX^*-\lambda I_p\right)\det\left[I_p + \left(\frac 1n XX^* - \lambda I_p\right)^{-1}\lambda P(I_p+P)^{-1}\right]$$

For $P(I_p+P)^{-1}$, we have (using the push-through identity and $V^*V = I_k$):

$$P(I_p+P)^{-1} = V\Omega V^*(I_p + V\Omega V^*)^{-1} = V\Omega(I_k + V^*V\Omega)^{-1}V^* = V\Omega (I_k + \Omega )^{-1} V^*$$

So finally, with ==Sylvester's determinant identity== $\det(I_p + AB) = \det(I_k + BA)$, we get:

$$0 = \det\left[I_p + \left(\frac 1n XX^* - \lambda I_p \right)^{-1} \lambda V\Omega (I_k + \Omega)^{-1}V^*\right] = \det\left[I_k + \lambda V^* \left(\frac 1n XX^* - \lambda I_p\right)^{-1}V\Omega(I_k+\Omega)^{-1}\right]$$

**Lemma**: for $a,b\in \mathbb{R}^{p}$ deterministic with $\|a\|,\|b\|=O(1)$, and $X$ with i.i.d. $(0,1)$ entries:

$$a^*\left(\frac 1n XX^* - zI_p\right)^{-1}b - a^*b\, m(z)\overset{a.s.}{\rightarrow}0$$

with $m(z)$ the S.T. of the M-P law and $z\notin [(1-\sqrt{c})^2,(1+\sqrt{c})^2]$.

(Remark: with $a=b$ random with i.i.d. entries, this is the ==trace lemma==.)

Applying the lemma we get:

$$0 = \det\left(I_k + \lambda m(\lambda)\cdot \Omega(I_k+\Omega)^{-1}\right)+ o(1)$$

Asymptotically, since

$$0 = \prod_{i=1}^{k}\left(1+\lambda\cdot m(\lambda)\frac{w_i}{1+w_i }\right)+o(1)$$

the solutions $\rho$ are such that, for some $i$:

$$1+\rho\cdot m(\rho)\frac{w_i}{1+w_i}=0 \iff \rho \cdot m(\rho)=-\frac{1+w_i}{w_i}$$

Since we know from the MP equation that

$$z\cdot m(z) = \frac{z}{-z+1-c-cz\cdot m(z)}, \quad z\in \mathbb{C}\setminus[(1-\sqrt{c})^2,(1+\sqrt{c})^2]$$

we should first check where such solutions can be located relative to the support. Note that

$$\rho\, m(\rho)= \int \frac{\rho}{t-\rho}\mu(dt) = -1 +\int\frac{t}{t-\rho}\mu(dt)$$

which has positive derivative in $\rho$ outside the support.

And we can see that the two limits are:

$$\rho\cdot m(\rho)\overset{\rho\to-\infty}{\longrightarrow}-1, \qquad \rho\cdot m(\rho)\overset{\rho\to+\infty}{\longrightarrow}-1$$

So it is not difficult to imagine that $\rho\cdot m(\rho)$, as a function of $\rho$, looks like a hyperbola outside the support $[(1-\sqrt{c})^2,(1+\sqrt{c})^2]$.

For $\rho \to (1+\sqrt{c})^2$ (from outside the support), we have $m(\rho)\to m^+$, with

$$m^+ = \frac{1}{-(1+\sqrt{c})^2+1-c-c(1+\sqrt{c})^2 m^+}=-\frac{1}{\sqrt{c}(1+\sqrt{c})}$$

In this limit case, we have:

$$\rho\cdot m(\rho)\to-\frac{1+\sqrt{c}}{\sqrt{c}}= -1-\frac{1}{\sqrt{c}}$$

So, beyond the right edge, $\rho\cdot m(\rho)$ only takes values in $\left(-1-\frac{1}{\sqrt{c}},\ -1 \right)$, which means a solution exists only when

$$\rho\cdot m(\rho)=-1-\frac{1}{w_i}>-1-\frac{1}{\sqrt{c}}\iff w_i > \sqrt{c}$$

This implies that only when the perturbation has enough energy, $w_i>\sqrt{c}$, do these isolated eigenvalues appear. This phenomenon is often referred to as a **phase transition** in the literature.

Supposing now that $w_i > \sqrt{c}$, we can go on with the calculation:

$$-1-\frac{1}{w_i} = \rho\cdot m(\rho)=\frac{\rho}{-\rho +1-c-c\left(-1-\frac{1}{w_i}\right)}\implies \rho = \frac{c(1+w_i)}{w_i}+1+w_i$$
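A numerical check of this spike location (a minimal sketch; $p$, $n$ and $w$ are arbitrary choices, with $w>\sqrt{c}$ so that the eigenvalue isolates):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, w = 500, 1000, 2.0
c = p / n                       # c = 0.5 and w = 2 > sqrt(c)

d = np.ones(p)
d[-1] = 1.0 + w                 # I_p + P with a single spike of energy w
X = rng.standard_normal((p, n))
A = np.sqrt(d)[:, None] * X     # (I_p + P)^{1/2} X
eigs = np.linalg.eigvalsh(A @ A.T / n)

rho = 1.0 + w + c * (1.0 + w) / w
print(eigs[-1], rho, (1 + np.sqrt(c)) ** 2)
# the largest eigenvalue ~ rho = 3.75, clearly outside the bulk edge ~ 2.91
```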

We consider two limit cases:

- $\frac 1n(I+V\Omega V^*)^{\frac 12}XX^*(I+V\Omega V^*)^{\frac 12}\overset{c\to 0}{\longrightarrow}I+V\Omega V^*$, whose eigenvalues are of course $1+w_i$ (and indeed $\rho\to 1+w_i$);
- when $w_i \to \sqrt{c}$, $\rho \to (1+\sqrt{c})^2$: the isolated eigenvalue reaches the bulk edge, which signifies a kind of continuous behavior.

**Study of the eigenvectors**

What can we say about the matrix $V$ (whose columns are commonly used in array processing)? Idea: we can estimate quantities of the type $a^*v_iv_i^*b$, with $V = [v_1,\ldots,v_k]$, for $a,b$ deterministic vectors.

==Results==: let $w_1>w_2>\ldots>w_{\tilde{k}}>\sqrt{c}\ge w_{\tilde{k}+1}\ge \ldots \ge w_k$, and let $\hat{v}_1,\ldots,\hat{v}_{\tilde{k}}$ be the eigenvectors of $\frac 1n (I+P)^{1/2}XX^*(I+P)^{1/2}$ associated with the $\tilde{k}$ largest eigenvalues. Then, as $n,p\to\infty$ with $\frac pn\to c\in(0,\infty)$, we have:

$$a^*\hat{v}_i\hat{v}_i^*b - \frac{1-cw_i^{-2}}{1+cw_i^{-1}}a^*v_iv_i^*b\overset{a.s.}{\to}0,\ \forall i\le \tilde{k}$$

and also:

$$a^*\hat{v}_i\hat{v}_i^*b\overset{a.s.}{\to}0,\ \forall i>\tilde{k}$$

In particular, we have:

$$\left|\hat{v}_i^*v_i\right|^2 \to\begin{cases}\frac{1-cw_i^{-2}}{1+cw_i^{-1}}, & \forall \ i \le \tilde{k} \\0 & \text{otherwise}\end{cases}$$
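An empirical check of the alignment $|\hat{v}_i^*v_i|^2$ in the single-spike case (a minimal sketch, continuing the setting of the previous snippet; $v$ is taken as the last canonical basis vector, and $p$, $n$, $w$ are arbitrary with $w>\sqrt{c}$):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, w = 1000, 2000, 2.0
c = p / n

d = np.ones(p)
d[-1] = 1.0 + w                     # spike along v = e_p, the last canonical vector
X = rng.standard_normal((p, n))
A = np.sqrt(d)[:, None] * X
vals, vecs = np.linalg.eigh(A @ A.T / n)

v_hat = vecs[:, -1]                 # eigenvector of the largest eigenvalue
print(v_hat[-1] ** 2, (1 - c / w**2) / (1 + c / w))   # ~ 0.70 for these values
```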

The idea of the proof comes from the fact that $a^*\hat{v}_i\hat{v}_i^*b$ is a function of the ==resolvent==, combined with the Cauchy integral formula.