Guaranteed Higher Grade!

Free Quote
Answered

Q1.The model is as following

X = WHT , X ∈ R m×n , W ∈ R m×r + , H ∈ R n×r + ,

where r ≤ min{m, n}, and rank(X) = rank(W) = rank(H) = r. Here, W ∈ R m×r

+ means that W is a m × n matrix whose elements are all nonnegative and real-valued. The goal of NMF is to “recover” W and H from the observed data X, given the knowledge of r. The NMF technique serves as the foundation of many unsupervised machine learning tasks, e.g., topic modeling, community detection, and soft clustering.

Hint: You may want to use the following fact: Assume a1, . . . , aN are all rea-valued scalars and nonnegative. In addition, assume that θi ∈ R satisfies θi ≥ 0 for i = 1, . . . , N and PN

i=1 θi = 1.

Then, we have X N i=1 θiai ≤ max i=1,...,Nai

. (1) (a) Assume X(:, `) 6= 0 for all ` = 1, . . . , n. Consider a normalized model Xf where Xf(:, `) = X(:, `)

kX(:, `)k1,

where X(:, `) means the `th column of X (i.e., the Matlab notation). Show that Xf can be represented as

Xf = WfHfT , where we have Hf(`, :)1 = 1, Hf ≥ 0, i.e., the rows of Hf ∈ R n×r are sum-to-one and nonnegative. In addition, show that Wf = WΣ1,

where Σ1 ∈ R r×r is a diagonal matrix. (10%)

(b) Assume that H = Π

D B

where D is a full-rank diagonal matrix and Π ∈ R n×n is a permutation matrix. Show that there exists an index set Λ = {`1, . . . , `r} such that Hf(`i

, :) = e

T

i

,

where ei is the ith unit vector. (5%)

(c) Define `b1 = arg max `∈{1,...,n}

Xf(:, `)

2,i.e., `b1 is the index of the column of Xf that gives the largest vector 2-norm. Show that Xf(:, `b1) ∈ {Wf(:, 1), . . . ,Wf(:, r)}. (Hint: This is equivalent to showing that `b1 ∈ Λ.) (5%)

(d) Assume that you have have a matrix defined as Wc1:k Wc1:k = [Wf(: 1), . . . ,Wf(:, k)], k < r.

The next step is as follows:`b

k+1 = arg max

`∈{1,...,n}

P

⊥

Wc1:k

Xf(:, `)

2.

Show that Xf(:, `b k+1) ∈ {Wf(:, k + 1), . . . ,Wf(:, r)}, i.e., you will find a new column in Wf. (Hint: This is equivalent to showing that `b k+1 ∈ {`k+1, . . . , `r}.) (10%)

(e) Bonus: The above procedure will give you an index set {`b1, . . . , `br} using only r steps. Comment on how will you use this set to find W from X? Can you find the exact W, or will

there be some unsolvable ambiguities?

Q2. (MUSIC (30%)) The MUSIC algorithm that we have learned in the class can be applied for DOA estimation. Assume N ≥ K. Let us first form

x(t) = [x1(t), . . . , xN (t)]T ∈ C

N , t = 1, 2, . . . , T.

(a) We know that x(t) = As(t) + w(t), where s(t) = [s1(t), . . . , sK(t)]T ∈ C K and w(t) = [w1(t), . . . , wN (t)]T ∈ C

N . Explicitly write out the expression of A. Under what conditions rank(A) = K?

(b) To apply MUSIC, we construct

Rx = E[x(t)x(t)

H] = AE[s(t)s(t)

H]AH + σ

2

I,

under the conditions that wn(t) is zero mean Gaussian noise with variance σ 2

, that wn(t) is spatially white, and that wn(t) and sk(t) are uncorrelated. This is critical step in MUSICbased DOA estimation. Assume that the source signals are wide-sense stationary, and that si(t) and sj (t) are uncorrelated (i.e., E[si(t)s ∗

j

(t)] = 0 for i 6= j), and that E[|si(t)|

2

] = α

2

k

. Show that R = AE[s(t)s(t) H]AH = ADiag(α 2 1 , . . . , α2 K)AH, where Diag(y) is a diagonal matrix whose diagonal is y. What is the rank of R? Is R positive definite (PD) when N > K? Justify your answer. What is the rank of Rx? Assume that the noise has nonzero variance σ 2. Is Rx PD? Justify your answer. (10%)

(c) Denote the eigendecomposition of Rx as Rx = [V1,V2]

Λ1 0 0 Λ2 V H 1V H 2.

In addition, denote the eigendecomposition of R as R = [U1, U2]

Σ1 0 0 Σ2 U H 1 U H 2 .

Under what conditions of N, K and θ1, . . . , θK do we have R(V1) = R(A)? Can we assume V1 = U1 and V2 = U2? If yes, what is the relationship between Λ1 and Σ1? What is the relationship between Λ2 and Σ2? Explain. (5%)(d) Assume we construct a(θ) = [1, e−j

2πd λ sin(θ), e−j 2π2d λ sin(θ) , . . . , e−j 2πd(N−1) λ

sin(θ) ]T. Do we have the following holds in general when N ≥ K? a(θ) ∈ R(V1) ⇐⇒ θ ∈ {θ1, . . . , θK}.If not, specify a condition in terms of K and N such that the above holds. (5%)

(e) To apply MUSIC, we construct an estimator

SMUSIC(θ) = 1 kV H 2 a(θ)k 2 2.

In practice, we approximate E[x(t)x(t)

H] by Rx = 1 T X T t=1 x(t)x(t) H.

When T < K, does MUSIC work? Explain. (5%)

Q3. (ESPRIT (30%)) There is another classic algorithm handling the same problem, namely, the ESPRIT algorithm. We study the insights of ESPRIT in this question. We assume N ≥ K + 1 in this question.

(a) Let us consider A1 = [IN−1, 0]A, and A2 = [0, IN−1]A. Show that A1 = A2D with a certain diagonal matrix D (specify D). (10%)

(b) Consider W1 = [IN−1, 0]V1 and W2 = [0, IN−1]V1 where R(V1) = R(A) as we used in MUSIC. Show that there exists a full rank Q ∈ R K×K such that

W1 = A1Q

W2 = A2Q

W2 = W1Q−1DQ

for a certain full rank Q ∈ R K×K if N ≥ K + 1. (10%)

(c) The procedure of ESPRIT is as follows: 1) Solve Ψ = arg min

G kW2 − W1Gk 2F and then take eigendecomposition of Ψ. The claim is that λk = e

−j 2πd λ sin(θk) , k = 1, . . . , K, if θ1 6= θ2 6= . . . 6= θK. Prove the above claim. (10%)

Q4. (Coding) Write a Matlab (Python is allowed if you prefer) code for MUSIC and ESPRIT and run them with the provided code. You may set d/λ = 1/2.

Solved by qualified expert