CSP

What is a Dimensionality Reduction Algorithm?

Basically, a dimensionality reduction algorithm is a method that transforms high-dimensional data into low-dimensional data. It reduces the number of features (dimensions) while trying to preserve as much of the useful structure of the original data as possible, which makes the data easier to visualize, store, and feed into later processing steps.
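As a small illustration (a toy example added here, not taken from the original text), dimensionality reduction can be as simple as multiplying the data by a projection matrix that keeps fewer directions than the original space; real methods such as PCA or CSP choose that matrix from the data itself. The sketch below assumes 3-dimensional data projected down to 2 dimensions with a fixed, hand-picked projection matrix.

import numpy as np

# Hypothetical example: 200 samples of 3-dimensional data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))

# A fixed projection matrix that keeps only the first two axes
P = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Project the 3-D data down to 2-D
X_low = X @ P                      # shape (200, 2)
print(X.shape, '->', X_low.shape)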
import numpy as np
import matplotlib.pyplot as plt

# Generate data
N = 500
mu = [0, 0]
sigma = [6, 1]
theta = 15 * np.pi / 180                      # Angle of rotation for data2
rot1 = np.eye(2)                              # Rotation for data1
rot2 = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
data1 = (np.random.randn(N, 2) @ np.diag(sigma) + mu) @ rot1
data2 = (np.random.randn(N, 2) @ np.diag(sigma) + mu) @ rot2
d1 = rot1.T @ np.array([1, 0])                # Main-variance direction of data1
d2 = rot2.T @ np.array([1, 0])                # Main-variance direction of data2

# Plot the generated data and their directions
scale1, scale2 = np.max(data1), np.max(data2)
plt.subplot(1, 2, 1)
plt.scatter(data1[:, 0], data1[:, 1])
plt.scatter(data2[:, 0], data2[:, 1])
plt.plot([0, d1[0] * scale1], [0, d1[1] * scale1], linewidth=2)
plt.plot([0, d2[0] * scale2], [0, d2[1] * scale2], linewidth=2)
plt.legend(['class 1', 'class 2', 'd_1', 'd_2'])
plt.grid()
plt.axis('equal')
plt.title('Before CSP filtering')
plt.xlabel('Channel 1')
plt.ylabel('Channel 2')

# CSP
X1 = data1.T                                  # Positive class data: X1 ~ [C x T]
X2 = data2.T                                  # Negative class data: X2 ~ [C x T]
cov1 = np.cov(X1)
cov2 = np.cov(X2)

# Whitening matrix cov1^(-1/2) via eigendecomposition
# (a matrix square root, not an element-wise np.sqrt)
d, U = np.linalg.eigh(cov1)
whiten = U @ np.diag(1.0 / np.sqrt(d)) @ U.T

# Eigendecompose the whitened cov2 and build the CSP spatial filter matrix W
l, B = np.linalg.eigh(whiten @ cov2 @ whiten)
W = whiten @ B
X1_CSP = W.T @ X1
X2_CSP = W.T @ X2

# Plot the results
plt.subplot(1, 2, 2)
plt.scatter(X1_CSP[0, :], X1_CSP[1, :])
plt.scatter(X2_CSP[0, :], X2_CSP[1, :])
plt.legend(['class 1', 'class 2'])
plt.axis('equal')
plt.grid()
plt.title('After CSP filtering')
plt.xlabel('Channel 1')
plt.ylabel('Channel 2')
plt.show()
The Common Spatial Pattern (CSP) algorithm is used for feature extraction and dimensionality reduction in multi-channel EEG signal processing. It finds spatial filters (linear combinations of channels) such that the variance of the filtered signal is as large as possible for one class and as small as possible for the other, so the two classes can be told apart by their variance.
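In terms of a single spatial filter w and the two class covariance matrices C_1 and C_2 (notation introduced here for illustration, not taken from the original text), this variance-ratio objective and the eigenvalue problem it leads to can be written as:

J(\mathbf{w}) = \frac{\mathbf{w}^\top C_1 \mathbf{w}}{\mathbf{w}^\top C_2 \mathbf{w}},
\qquad
\max_{\mathbf{w}} J(\mathbf{w}) \;\Longrightarrow\; C_1 \mathbf{w} = \lambda\, C_2 \mathbf{w}.

This is why the code below ends up computing eigenvectors of (combinations of) the class covariance matrices.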
In the code above, the CSP algorithm is applied to two sets of data, "data1" and "data2", with equal numbers of observations. First, the covariance matrices of the two datasets are calculated. The data are then whitened with the inverse matrix square root of the first class's covariance matrix, the whitened covariance of the second class is eigendecomposed, and the resulting eigenvectors are combined with the whitening matrix to form the spatial filter matrix "W". Projecting the original data onto W gives the transformed data, "X1_CSP" and "X2_CSP", in which each component has a large variance for one class and a small variance for the other; the eigenvalues indicate how discriminative each filter is. In this two-channel demo all filters are kept, but on real EEG data only the filters at the two ends of the eigenvalue spectrum are usually retained, which is where the dimensionality reduction happens.
Covariance Matrix
A covariance matrix is a matrix that describes the relationship between variables in a dataset. It measures how the variables change together. In the case of two variables, the covariance is a single number that describes the strength and direction of the linear relationship between them. With more than two variables, the pairwise covariances are collected into a matrix, with the variances of the individual variables on the diagonal.
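As a small illustration (a toy example added here, not taken from the original text), the covariance matrix of two channels can be computed directly with np.cov; the diagonal entries are the per-channel variances and the off-diagonal entry is the covariance between the channels.

import numpy as np

rng = np.random.default_rng(0)
channel1 = rng.standard_normal(1000)
channel2 = 0.8 * channel1 + 0.2 * rng.standard_normal(1000)  # correlated with channel1

# np.cov expects variables in rows and observations in columns
C = np.cov(np.vstack([channel1, channel2]))
print(C)   # C[0,0], C[1,1]: variances; C[0,1] = C[1,0]: covariance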
Eigenvectors and Eigenvalues
Eigenvectors and eigenvalues are concepts from linear algebra. An eigenvector is a non-zero vector that when multiplied by a given matrix results in a scalar multiple of the vector. The scalar value is called the eigenvalue. Eigenvectors and eigenvalues have important applications in many areas including image processing, computer graphics, machine learning, and more. In the case of the CSP algorithm, eigenvectors are used to describe the new transformed feature space in which the two classes of signals are maximally differentiated.
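To make the definition concrete, here is a minimal check (a toy example added for illustration) that the vectors returned by np.linalg.eig satisfy A v = lambda v:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eig_vals, eig_vecs = np.linalg.eig(A)

# Each column of eig_vecs is an eigenvector; A @ v equals its eigenvalue times v
for i in range(len(eig_vals)):
    v = eig_vecs[:, i]
    print(np.allclose(A @ v, eig_vals[i] * v))   # True, True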
import numpy as np
from scipy.linalg import eigh

def csp(data1, data2):
    # Calculate covariance matrices for both data sets (variables in columns)
    cov1 = np.cov(data1.T)
    cov2 = np.cov(data2.T)
    # Solve the generalized eigenvalue problem cov1 @ w = lambda * cov2 @ w
    eigen_values, eigen_vectors = eigh(cov1, cov2)
    # Sort the eigenvectors based on their corresponding eigenvalues, in decreasing order
    sorted_indices = np.argsort(eigen_values)[::-1]
    eigen_vectors = eigen_vectors[:, sorted_indices]
    # Select a subset of the eigenvectors, typically the largest or smallest k eigenvectors
    k = 2
    W = eigen_vectors[:, :k]
    return W

# Use the generated data as input to the CSP algorithm
W = csp(data1, data2)

# Apply the obtained spatial filter to both data sets
filtered_data1 = np.dot(data1, W)
filtered_data2 = np.dot(data2, W)
Explanation:
1. The function csp takes two input data sets, data1 and data2, and returns the spatial filter matrix W.
2. The covariance matrices for both data sets are calculated using the np.cov function, which takes the transpose of each data set (data1.T or data2.T) as input so that the variables (channels) are in rows.
3. The generalized eigenvectors and eigenvalues of the two covariance matrices are calculated with scipy.linalg.eigh(cov1, cov2).
4. The eigenvectors are sorted based on their corresponding eigenvalues, in decreasing order.
5. A subset of the eigenvectors, typically the largest or smallest k eigenvectors, is selected. In this example, k is set to 2.
6. The obtained spatial filter matrix W is used to filter both data sets by multiplying each data set with W. The resulting filtered data sets are stored in filtered_data1 and filtered_data2; a sketch of how such filtered signals are typically turned into classifier features follows below.
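In EEG pipelines, the CSP-filtered signals are usually summarized by their (log-)variance before being passed to a classifier. The snippet below is a minimal sketch of that step, assuming the filtered_data1 and filtered_data2 arrays from the code above; the normalized log-variance feature is a common convention, not something specified in the original text.

import numpy as np

def log_variance_features(filtered_data):
    # Variance of each CSP component over time, normalized and log-transformed
    var = np.var(filtered_data, axis=0)
    return np.log(var / np.sum(var))

features1 = log_variance_features(filtered_data1)  # features for class 1
features2 = log_variance_features(filtered_data2)  # features for class 2
print(features1, features2)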
Give me an easy & specific explanation of covariance matrices and eigenvectors.
A covariance matrix is a square matrix that describes the covariance between multiple variables. The elements in the matrix represent the covariance between each pair of variables, and the diagonal elements represent the variance of each individual variable.
Eigenvectors and eigenvalues are mathematical concepts that can be used to analyze the behavior of a matrix. An eigenvector is a non-zero vector that, when multiplied by a matrix, results in a scalar multiple of the vector. The scalar is called the eigenvalue and is associated with the eigenvector. When working with covariance matrices, the eigenvectors represent the directions in which the data has the largest variance, and the eigenvalues represent the amount of variance in each direction. This information can be used in a variety of applications, such as dimensionality reduction and feature extraction.
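To tie the two ideas together, here is a small illustrative example (added here, not part of the original answer): the leading eigenvector of a covariance matrix points along the direction in which the data vary the most.

import numpy as np

rng = np.random.default_rng(0)
# Data stretched along the 45-degree direction
z = rng.standard_normal((1000, 2)) * [5.0, 1.0]
angle = np.pi / 4
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
X = z @ R.T

C = np.cov(X.T)                          # 2 x 2 covariance matrix
eig_vals, eig_vecs = np.linalg.eigh(C)   # eigenvalues in ascending order
print(eig_vals)        # roughly [1, 25]: variances along the two principal directions
print(eig_vecs[:, -1]) # roughly [0.71, 0.71] (up to sign): the 45-degree direction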
import numpy as np

def CSP(data1, data2):
    """
    Common Spatial Pattern (CSP) algorithm

    Parameters:
        data1 (ndarray): 2D array representing first data with shape (num_samples, num_features)
        data2 (ndarray): 2D array representing second data with shape (num_samples, num_features)

    Returns:
        filters (ndarray): 2D array of spatial filters with shape (num_features, num_features),
                           one filter per column
    """
    # Mean center the data
    mean1 = np.mean(data1, axis=0)
    mean2 = np.mean(data2, axis=0)
    data1_centered = data1 - mean1
    data2_centered = data2 - mean2

    # Calculate covariance matrices
    cov1 = np.dot(data1_centered.T, data1_centered) / data1_centered.shape[0]
    cov2 = np.dot(data2_centered.T, data2_centered) / data2_centered.shape[0]

    # Whiten with the composite covariance: P @ (cov1 + cov2) @ P.T = I
    eig_val_c, eig_vec_c = np.linalg.eigh(cov1 + cov2)
    P = np.dot(np.diag(1.0 / np.sqrt(eig_val_c)), eig_vec_c.T)

    # Eigendecompose the whitened class-1 covariance and sort eigenvalues in descending order
    eig_val1, eig_vec1 = np.linalg.eigh(np.dot(P, np.dot(cov1, P.T)))
    sorted_indices = np.argsort(eig_val1)[::-1]
    eig_vec1 = eig_vec1[:, sorted_indices]

    # Calculate the spatial filters (one per column);
    # the first and last columns are the most discriminative
    filters = np.dot(P.T, eig_vec1)
    return filters

# Calculate the filters
filters = CSP(data1, data2)

# Apply the filters to the data
data1_filtered = np.dot(data1, filters)
data2_filtered = np.dot(data2, filters)