FastICA

From HandWiki
Revision as of 19:05, 6 February 2024 by imported>MedAI (link)

FastICA is an efficient and popular algorithm for independent component analysis invented by Aapo Hyvärinen at Helsinki University of Technology.[1][2] Like most ICA algorithms, FastICA seeks an orthogonal rotation of prewhitened data, through a fixed-point iteration scheme, that maximizes a measure of non-Gaussianity of the rotated components. Non-Gaussianity serves as a proxy for statistical independence, which is a very strong condition and would require infinite data to verify exactly. FastICA can alternatively be derived as an approximate Newton iteration.

Algorithm

Prewhitening the data

Let 𝐗 := (x_{ij}) ∈ ℝ^{N×M} denote the input data matrix, where M is the number of columns, corresponding to the number of samples of mixed signals, and N is the number of rows, corresponding to the number of independent source signals. The input data matrix 𝐗 must be prewhitened, that is, centered and whitened, before applying the FastICA algorithm to it.

  • Centering the data entails demeaning each component of the input data 𝐗, that is,
        x_{ij} ← x_{ij} − (1/M) ∑_{j′=1}^{M} x_{ij′}
for each i = 1, …, N and j = 1, …, M. After centering, each row of 𝐗 has an expected value of 0.
  • Whitening the data requires a linear transformation 𝐋 : ℝ^{N×M} → ℝ^{N×M} of the centered data so that the components of 𝐋(𝐗) are uncorrelated and have variance one. More precisely, if 𝐗 is a centered data matrix, the covariance of 𝐋_𝐱 := 𝐋(𝐗) is the (N×N)-dimensional identity matrix, that is,
        E{𝐋_𝐱 𝐋_𝐱^T} = 𝐈_N
A common method for whitening is to perform an eigenvalue decomposition of the covariance matrix of the centered data 𝐗, E{𝐗𝐗^T} = 𝐄𝐃𝐄^T, where 𝐄 is the matrix of eigenvectors and 𝐃 is the diagonal matrix of eigenvalues. The whitened data matrix is then defined by
        𝐗 ← 𝐃^{−1/2} 𝐄^T 𝐗.
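As a concrete illustration, the centering and eigenvalue-based whitening steps above can be sketched in Python with NumPy. This is only a sketch, not a reference implementation; the function and variable names are our own:

```python
import numpy as np

def prewhiten(X):
    """Center each row of X, then whiten via eigenvalue decomposition.

    X is an (N, M) array: N mixed signals, M samples.
    Returns the whitened data, whose sample covariance is the identity.
    """
    # Centering: subtract the mean of each row (each mixed signal).
    X = X - X.mean(axis=1, keepdims=True)
    # Sample covariance of the centered data, E{X X^T} ~ (1/M) X X^T.
    cov = (X @ X.T) / X.shape[1]
    # Eigenvalue decomposition: cov = E D E^T.
    d, E = np.linalg.eigh(cov)
    # Whitening transform: X <- D^{-1/2} E^T X.
    return (E / np.sqrt(d)).T @ X

rng = np.random.default_rng(0)
# Three signals with very different scales.
X = rng.standard_normal((3, 10000)) * np.array([[1.0], [5.0], [0.2]])
Z = prewhiten(X)
# Covariance of the whitened data: approximately the identity matrix.
print(np.round(Z @ Z.T / Z.shape[1], 6))
```

Note that `(E / np.sqrt(d)).T` divides each eigenvector (column of 𝐄) by the square root of its eigenvalue, which equals 𝐃^{−1/2} 𝐄^T.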

Single component extraction

The iterative algorithm finds the direction of the weight vector 𝐰 ∈ ℝ^N that maximizes a measure of non-Gaussianity of the projection 𝐰^T 𝐗, with 𝐗 ∈ ℝ^{N×M} denoting a prewhitened data matrix as described above. Note that 𝐰 is a column vector. To measure non-Gaussianity, FastICA relies on a nonquadratic nonlinear function f(u), its first derivative g(u) = f′(u), and its second derivative g′(u) = f″(u). Hyvärinen states that the functions

    f(u) = log cosh(u),   g(u) = tanh(u),   and   g′(u) = 1 − tanh²(u),

are useful for general purposes, while

    f(u) = −e^{−u²/2},   g(u) = u e^{−u²/2},   and   g′(u) = (1 − u²) e^{−u²/2}

may be highly robust.[1] The steps for extracting the weight vector 𝐰 for a single component in FastICA are the following:

  1. Randomize the initial weight vector 𝐰
  2. Let 𝐰⁺ ← E{𝐗 g(𝐰^T 𝐗)^T} − E{g′(𝐰^T 𝐗)} 𝐰, where E{·} denotes averaging over all column vectors of the matrix 𝐗
  3. Let 𝐰 ← 𝐰⁺ / ‖𝐰⁺‖
  4. If not converged, go back to 2
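The one-unit fixed-point iteration above can be sketched in Python with NumPy as follows, using the tanh nonlinearity from the log cosh pair. This is a sketch (names are our own) that assumes X is already prewhitened:

```python
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-8, seed=0):
    """Estimate one weight vector w by the FastICA fixed-point iteration.

    X is an (N, M) prewhitened data matrix. Uses g = tanh, g' = 1 - tanh^2.
    """
    N, M = X.shape
    rng = np.random.default_rng(seed)
    # 1. Randomize the initial weight vector and normalize it.
    w = rng.standard_normal(N)
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        wx = w @ X                               # projections w^T X, shape (M,)
        g = np.tanh(wx)
        g_prime = 1.0 - np.tanh(wx) ** 2
        # 2. w+ <- E{X g(w^T X)} - E{g'(w^T X)} w, averaging over columns.
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w
        # 3. Normalize.
        w_new /= np.linalg.norm(w_new)
        # 4. Converged when the direction stops changing (w and -w are
        #    the same direction, hence the inner absolute value).
        if abs(abs(w_new @ w) - 1.0) < tol:
            return w_new
        w = w_new
    return w
```

The estimated source signal is then the projection `w @ X`.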

Multiple component extraction

The single-unit iterative algorithm estimates only one weight vector, which extracts a single component. Estimating additional components that are mutually "independent" requires repeating the algorithm to obtain linearly independent projection vectors; note that the notion of independence here refers to maximizing non-Gaussianity in the estimated components. Hyvärinen provides several ways of extracting multiple components, the simplest being the following. Here, 𝟏_M is a column vector of ones of dimension M.

Algorithm FastICA

Input: C, the number of desired components
Input: 𝐗 ∈ ℝ^{N×M}, a prewhitened matrix, where each column represents an N-dimensional sample and C ≤ N
Output: 𝐖 ∈ ℝ^{N×C}, the un-mixing matrix, where each column projects 𝐗 onto an independent component
Output: 𝐒 ∈ ℝ^{C×M}, the independent components matrix, with each of the M columns representing a sample with C dimensions
 for p in 1 to C:
    𝐰_p ← random vector of length N
    while 𝐰_p changes:
        𝐰_p ← (1/M) 𝐗 g(𝐰_p^T 𝐗)^T − (1/M) g′(𝐰_p^T 𝐗) 𝟏_M 𝐰_p
        𝐰_p ← 𝐰_p − ∑_{j=1}^{p−1} (𝐰_p^T 𝐰_j) 𝐰_j
        𝐰_p ← 𝐰_p / ‖𝐰_p‖
 output 𝐖 ← [𝐰_1, …, 𝐰_C]
 output 𝐒 ← 𝐖^T 𝐗
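The deflation scheme above translates fairly directly into NumPy. The following is a sketch under the same assumptions (prewhitened input, tanh nonlinearity); the function name is our own:

```python
import numpy as np

def fastica(X, C, max_iter=200, tol=1e-8, seed=0):
    """Deflation-based FastICA: extract C components from prewhitened X.

    X is (N, M); returns W of shape (N, C) and S = W^T X of shape (C, M).
    """
    N, M = X.shape
    rng = np.random.default_rng(seed)
    W = np.zeros((N, C))
    for p in range(C):
        w = rng.standard_normal(N)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            wx = w @ X
            # Fixed-point update with g = tanh, g' = 1 - tanh^2.
            w_new = (X * np.tanh(wx)).mean(axis=1) \
                    - (1.0 - np.tanh(wx) ** 2).mean() * w
            # Gram-Schmidt-style deflation against earlier components.
            w_new -= W[:, :p] @ (W[:, :p].T @ w_new)
            w_new /= np.linalg.norm(w_new)
            if abs(abs(w_new @ w) - 1.0) < tol:
                break
            w = w_new
        W[:, p] = w_new
    S = W.T @ X
    return W, S
```

Because each new vector is orthogonalized against the previously found ones before normalization, the columns of W come out orthonormal, matching the orthogonal-rotation view of FastICA on whitened data.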

References