Principal Component Analysis on MEA Data

 

Julie Arnold

9.29 Midterm Project

4/1/04

 

 

 

 

This project applies principal component analysis to spike rate data collected using a multi-electrode array. Principal component analysis is a method for reducing complex, possibly redundant data to a lower dimension, both to reveal hidden underlying factors controlling the dynamics and to identify spatial relationships within the array. Results show that the first principal component captures approximately 16% of the total variance of the data. This first principal component also reveals the most important pattern in the dynamics: approximately one-quarter of the analyzed array grid’s electrodes covary together. Finally, difficulties in interpreting the principal components are discussed, and one possible path for future work is suggested.

 

 

Introduction

           

            The data used in this project was generated using a multi-electrode array (MEA), made up of sixty-four electrodes and shown in Figure 1. In this case the MEA was used to make extracellular recordings from multiple points within a system of cultured hippocampal neurons.

Figure 1: 64-Electrode MEA

 

These extracellular recordings are noisy, and data filtering was necessary to extract spiking and spike rate information for each electrode. Principal component analysis was then conducted on the spike rate data.

 

            Principal component analysis (PCA) is a method for reducing complex, possibly redundant data to a lower dimension. The MEA spike rate data is certainly complex, but at the outset of this investigation it is difficult to determine how much redundancy the data contains. By reducing the dimensionality of the data, PCA may reveal hidden, underlying factors governing how the electrode measurements covary with each other, and identify spatial relationships between electrodes. For example, imagine the MEA array and the cultured neurons are bathed in a solution, and the concentration of this solution directly affects the spike rate seen on the electrodes. This is one example of a factor that would govern how the electrode measurements covary with each other. While PCA cannot name the factor as “concentration of the bath solution,” it can identify that such a factor exists.

 

Methods

 

            Spike rate information was extracted from one minute of spiking activity on the electrodes, calculated using a 30-millisecond time window. While the physical MEA consists of 64 electrodes arranged in eight rows, the analysis uses spike rate data from only 40 electrodes. Electrodes 1 through 24 are disregarded because many of them showed near-simultaneous spiking, which is likely a product of the external environment rather than of the spiking of the cultured neurons.
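To make the rate-extraction step concrete, here is a minimal NumPy sketch. The input format (`spike_times` as one array of spike times per electrode, in seconds) and the function name are assumptions for illustration; the project's actual filtering pipeline is not reproduced here.

```python
import numpy as np

def spike_rates(spike_times, n_electrodes, duration_s=60.0, window_s=0.030):
    """Bin spike times into firing rates (spikes/s), one row per electrode.

    spike_times: list of 1-D arrays, one per electrode, of spike times in
    seconds (a hypothetical input format, assumed for this sketch).
    Returns an (n_electrodes x n_windows) spike rate matrix.
    """
    n_windows = int(round(duration_s / window_s))
    edges = np.linspace(0.0, duration_s, n_windows + 1)
    rates = np.zeros((n_electrodes, n_windows))
    for i, times in enumerate(spike_times):
        counts, _ = np.histogram(times, bins=edges)  # spikes per window
        rates[i] = counts / window_s                 # convert to spikes/s
    return rates
```

Each column of the resulting matrix is one 30-ms time sample; these columns are the observations on which PCA then operates.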

 

            Principal component analysis was then conducted on the spike rate data. Three assumptions underlie PCA: (1) linearity, (2) the data follows a Gaussian distribution, and (3) the data has a high signal-to-noise ratio (SNR). Linearity assumes that interpolation between data points is meaningful, and imposes the constraint that PCA must re-express the data as a linear combination of its basis vectors. The second assumption, that the data is Gaussian, implies that the mean and variance completely describe the probability distribution of the data. Finally, a large SNR implies that large variances in the data represent important dynamics of the system.

 

            To explain how PCA is derived and implemented, consider a two-dimensional example of the MEA data, shown in Figure 2.

                               Figure 2: Two-Dimensional Illustration of PCA

 

Plotted on the x-axis are spike rate measurements recorded on Electrode X, and plotted on the y-axis are spike rate measurements recorded on Electrode Y. In this case, each data point is a time sample.

 

As the scatter plot shows, the data points fall within the shape of an oval. The more elongated this oval, the more redundancy there is between the measurements on Electrodes X and Y. Imagine the extreme case in which all the data points fall along a single line: Electrodes X and Y would then be completely redundant, and it would be more meaningful to record a single variable that is a linear combination of the original two. In this case the data does not fall onto a single line, but we would still like to pick a set of variables that covary as little as possible with one another. These variables are the principal components. The first principal component corresponds to the direction of greatest variance; the second principal component is orthogonal to the first and corresponds to the direction of the next greatest variance. By reducing the data to only the first principal component, you may be able to describe the most important dynamics of the system; to fully capture the original data, both principal components are required.
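This two-electrode picture can be reproduced numerically. In the sketch below the data is synthetic (the coupling between the hypothetical Electrodes X and Y is assumed), and the elongation of the oval shows up as the first eigenvalue dominating the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, 500)              # Electrode X rates (synthetic)
y = 0.9 * x + rng.normal(0.0, 0.5, 500)     # Electrode Y largely follows X
data = np.vstack([x, y])                    # 2 x 500: rows are electrodes

centered = data - data.mean(axis=1, keepdims=True)
cov = centered @ centered.T / (centered.shape[1] - 1)

evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
order = np.argsort(evals)[::-1]             # re-sort descending
evals, evecs = evals[order], evecs[:, order]

# Fraction of the total variance captured by the first principal component;
# for strongly correlated electrodes this is close to one.
first_pc_fraction = evals[0] / evals.sum()
```

The columns of `evecs` are the two principal components; their orthogonality is guaranteed because the covariance matrix is symmetric.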

 

Principal components are found by re-expressing the data in a basis in which each electrode measurement does not covary with the others. Another way of saying this is that the covariance between separate electrodes should be zero; essentially, the goal is to diagonalize the covariance matrix. Eigenvectors diagonalize symmetric matrices, and the covariance matrix is symmetric. Thus, the principal components of the data are the eigenvectors of the covariance matrix.

 

In mathematical terms, the problem is to find an orthonormal matrix P, with Y = PX, such that S = YYᵀ/(n-1) is diagonal, where X is the (m x n) data matrix, P is the matrix of principal components of X (one per row), and S is the diagonalized covariance matrix. To show that the principal components of the data are the eigenvectors of the covariance matrix, substitute Y = PX:

S = YYᵀ/(n-1) = P(XXᵀ)Pᵀ/(n-1) = PAPᵀ/(n-1), where A = XXᵀ.

Because A is symmetric, use the theorem A = EDEᵀ, where E is a matrix whose columns are the eigenvectors of A and D is a diagonal matrix. Now, if you assume that the principal components are the eigenvectors of the covariance matrix, P = Eᵀ (equivalently E = Pᵀ), then the covariance matrix is indeed diagonalized:

S = P(PᵀDP)Pᵀ/(n-1) = D/(n-1).

Another algebraic solution, not described here, is singular value decomposition (SVD).
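The derivation above can be checked directly in NumPy on a toy data matrix (its size and contents are arbitrary): choosing P as the transposed eigenvector matrix of A = XXᵀ does diagonalize the covariance of Y = PX.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 200))               # toy data: m = 5, n = 200
X = X - X.mean(axis=1, keepdims=True)       # PCA assumes zero-mean rows
n = X.shape[1]

A = X @ X.T                                 # A = X X^T, symmetric
D, E = np.linalg.eigh(A)                    # A = E D E^T
P = E.T                                     # principal components as rows

Y = P @ X
S = Y @ Y.T / (n - 1)                       # should equal D/(n-1), i.e. diagonal
off_diag = S - np.diag(np.diag(S))          # everything left over is ~0
```

The off-diagonal entries of S vanish up to floating-point error, confirming that the transformed electrode measurements no longer covary with one another.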

 

            Once the principal components of the data have been identified, it is possible to collapse the data onto any number of principal components and then reconstruct the data. If all principal components are used in the reconstruction, the reconstructed data is identical to the original. If fewer than all the principal components are used, there is an error between the reconstructed data and the original data. For the purposes of this project, this error is quantified by subtracting the reconstructed data matrix from the original data matrix and taking the infinity norm of the resulting error matrix, yielding a single-value error metric.
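A sketch of this reconstruction and the single-value error metric follows; the function names are mine, and the project's own code is not reproduced.

```python
import numpy as np

def pca_reconstruct(X, k):
    """Project data X (m x n) onto its top-k principal components and
    reconstruct it in the original coordinates."""
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    D, E = np.linalg.eigh(Xc @ Xc.T)
    E = E[:, np.argsort(D)[::-1]]           # eigenvectors, variance-descending
    P = E[:, :k].T                          # top-k principal components as rows
    return P.T @ (P @ Xc) + mean

def recon_error(X, k):
    """Infinity norm of (original - reconstruction): the single-value metric."""
    return np.linalg.norm(X - pca_reconstruct(X, k), ord=np.inf)
```

Using all m components reproduces X up to floating-point error, so the metric drops to essentially zero; with fewer components it measures what was lost.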

 

 

Results and Discussion

 

            The first investigation looked into how well PCA conducted on one half of the data set is able to predict the second half of the data set. The results are shown in Figures 3 and 4.

Figure 3: Variance Associated with Each PC           Figure 4: Error vs. Number of PC Used

 

PCA conducted on the first half of the data set yields Figure 3, which shows the variance of the data associated with each principal component, normalized so that the variances sum to one. For example, the first principal component captures approximately 16% of the total variance of the data, while the second captures only about 6%.
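The normalized variance spectrum plotted in Figure 3 corresponds to the eigenvalues of the covariance matrix divided by their sum. A minimal sketch (the function name is mine) for an electrodes-by-time data matrix:

```python
import numpy as np

def variance_fractions(X):
    """Fraction of total variance captured by each PC (descending, sums to 1)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals = np.linalg.eigvalsh(Xc @ Xc.T / (Xc.shape[1] - 1))
    evals = np.sort(evals)[::-1]            # eigvalsh returns ascending order
    return evals / evals.sum()
```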

 

Figure 4 shows the results when these principal components are used to reconstruct the second half of the data set. Ideally, if the principal components predicted the second half of the data set as well as the first, Figures 3 and 4 would look nearly identical. In fact, applying PCA to the whole data set and regenerating these plots does produce nearly identical figures. That they differ here suggests that the principal components of the first half of the data set do not accurately predict the second half. However, one interesting feature appears in both graphs: the sharp drop between the first and second principal components. This suggests that in both cases the first principal component captures significantly more of the data than any other principal component.
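The cross-prediction test can be sketched as follows: fit principal components on the first half, then reconstruct the held-out second half using the top k of them. The data below is synthetic and the split sizes are arbitrary; only the procedure mirrors the one described.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 400))               # synthetic stand-in for rate data
first, second = X[:, :200], X[:, 200:]      # fit on half, test on the rest

# Principal components from the first half only
Fc = first - first.mean(axis=1, keepdims=True)
D, E = np.linalg.eigh(Fc @ Fc.T)
E = E[:, np.argsort(D)[::-1]]               # variance-descending eigenvectors

# Reconstruct the (centered) second half with the top-k first-half PCs
k = 3
P = E[:, :k].T
second_c = second - second.mean(axis=1, keepdims=True)
recon = P.T @ (P @ second_c)
err = np.linalg.norm(second_c - recon, ord=np.inf)   # Figure 4's error metric
```

Repeating this for k = 1 … m traces out the error curve of Figure 4.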

 

            The second investigation involved studying the spatial relationships of the principal components. The first investigation suggested that the first principal component likely provides insight into the most important patterns in the spike rate data, because it captures the most variance. Figure 5 shows a visualization of the first principal component overlaid on the physical form of the multi-electrode array.

            Figure 5: Spatial Relationships in the First PC

 

The most interesting feature of this visualization has been highlighted in red. Nearly one-quarter of the analyzed MEA electrodes have a first-principal-component element shown in blue, suggesting that they covary with each other! Another interesting feature is that other small clusters of electrodes appear to covary with each other in this principal component, as well as in the other principal components.

 

Another interesting principal component to study is principal component 7, obtained by conducting PCA on the whole spike rate data set rather than just the first half. This visualization is shown in Figure 6 below.

 

Figure 6: Spatial Relationship in the 7th PC of Whole Data Set

 

In this case, Electrode 30 has a principal component element completely different from anything else found on the array. Unfortunately, by looking at the principal component alone, one cannot determine whether this island signals that Electrode 30 is super-active or super-depressed. This is because, while elements in the principal component can be either positive or negative, so can elements in the reduced-dimensionality data used to form the reconstruction. Future work to overcome this difficulty may include using non-negative matrix factorization, which forces both the components and the reduced-dimensionality data to be non-negative. This tool may provide a more effective framework for interpreting these principal components.
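As a sketch of this suggested future direction, the multiplicative updates of Lee and Seung (1999) factor a non-negative matrix V into non-negative W and H with V ≈ WH; because every element is non-negative, a component like Electrode 30's could only ever add activity, never subtract it. This toy implementation is mine, not part of the project.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0):
    """Non-negative matrix factorization, V ~= W @ H, via the
    Frobenius-norm multiplicative updates of Lee & Seung (1999)."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + 1e-4                # random non-negative start
    H = rng.random((k, n)) + 1e-4
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)   # update keeps H >= 0
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)   # update keeps W >= 0
    return W, H
```

Because the updates only multiply non-negative quantities, W and H remain non-negative throughout, which is what makes the resulting components interpretable as purely additive parts.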

 

References

 

Lee, Daniel D., and H. Sebastian Seung. “Learning the Parts of Objects by Non-negative Matrix Factorization.” Nature, Vol. 401, October 1999.

 

Shlens, Jonathon. “A Tutorial on Principal Component Analysis: Derivation, Discussion, and Singular Value Decomposition.” http://www.snl.salk.edu/~shlens/pub/notes/pca.pdf. Accessed 3/17/04.

 

Smith, Lindsay. “A Tutorial on Principal Components Analysis.” http://dsg.harvard.edu/courses/hst951/spring04/ho/principal_components.pdf. Accessed 3/17/04.