In machine learning and statistics, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration, and can be divided
into feature selection and feature extraction.
1. Feature selection
Feature selection approaches try to find a subset of the original variables (also called features or
attributes). Two strategies are the filter approach (e.g. information gain) and the wrapper approach (e.g. a search guided by model accuracy). In some cases, data analysis such as regression or classification can be done in the reduced space more accurately than in the original space.
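As a minimal sketch of the filter strategy, the snippet below ranks features by the absolute Pearson correlation with the target and keeps the top k. (Correlation is used here as a simple stand-in for filter scores such as information gain; the function name and toy data are illustrative, not from any particular library.)

```python
import numpy as np

def filter_select(X, y, k):
    """Filter-style feature selection: score each feature by its
    absolute Pearson correlation with the target, keep the top k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = Xc.std(axis=0) * yc.std() * len(y)
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1, denom)
    keep = np.argsort(scores)[::-1][:k]
    return np.sort(keep)  # indices of the selected original features

# toy data: features 0 and 2 are informative, feature 1 is pure noise
rng = np.random.default_rng(0)
y = rng.normal(size=200)
X = np.column_stack([y + 0.1 * rng.normal(size=200),
                     rng.normal(size=200),
                     -2 * y + 0.1 * rng.normal(size=200)])
selected = filter_select(X, y, k=2)
print(selected)
```

Note that, unlike feature extraction, the result is a subset of the original columns, so the selected features keep their original meaning.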
2. Feature extraction
Feature extraction transforms the data in the high-dimensional space to a space of fewer dimensions. The data transformation may be linear, as in principal component analysis (PCA), but many nonlinear dimensionality reduction techniques also exist. For multidimensional data, tensor representation can be
used in dimensionality reduction through multilinear subspace learning.
The main linear technique for dimensionality reduction, principal component analysis, performs a linear
mapping of the data to a lower-dimensional space in such a way
that the variance of the data in the low-dimensional
representation is maximized. In practice, the correlation matrix of the data is constructed and the eigenvectors of this matrix are computed. The eigenvectors that correspond to the largest eigenvalues (the principal components) can now be used to reconstruct a large fraction of the variance of the original data. Moreover, the first few eigenvectors can often be interpreted in terms of the large-scale physical behavior of the system. The original space (with dimension equal to the number of points) has been reduced (with data loss, but hopefully retaining the most important variance) to the space spanned by a few eigenvectors.
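The procedure above can be sketched in a few lines of numpy: center the data, build the covariance matrix, eigendecompose it, and project onto the eigenvectors with the largest eigenvalues. (The `pca` function and the toy data set are illustrative assumptions, not a library API.)

```python
import numpy as np

def pca(X, n_components):
    """PCA via eigendecomposition of the covariance matrix."""
    Xc = X - X.mean(axis=0)
    C = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(C)   # returned in ascending order
    order = np.argsort(eigvals)[::-1]      # largest eigenvalues first
    components = eigvecs[:, order[:n_components]]
    # project the centered data onto the principal components
    return Xc @ components, components, eigvals[order]

rng = np.random.default_rng(1)
# 3-D data that mostly varies along a single direction
t = rng.normal(size=300)
X = np.column_stack([t,
                     2 * t + 0.05 * rng.normal(size=300),
                     0.05 * rng.normal(size=300)])
Z, comps, lam = pca(X, n_components=1)
print("fraction of variance in first component:", lam[0] / lam.sum())
```

Because the toy data lie almost on a line, the first eigenvalue accounts for nearly all of the variance, which is exactly the "large fraction of the variance" the paragraph refers to.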
A different approach to nonlinear dimensionality reduction is through the use of autoencoders, a special kind of feed-forward neural network with a bottleneck hidden layer. The training of deep encoders is typically performed using greedy layer-wise pre-training (e.g., using a stack of restricted Boltzmann machines), followed by a fine-tuning stage based on backpropagation.
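The bottleneck idea can be illustrated with a deliberately minimal sketch: a one-hidden-layer linear autoencoder trained by plain gradient descent on the reconstruction error, in numpy only. (This omits the pre-training and the nonlinearities used in practice; the data, learning rate, and iteration count are illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 500, 4, 2

# Toy data lying on a 2-D plane inside R^4 (the rows of A are orthonormal),
# so a 2-unit bottleneck can in principle reconstruct it perfectly.
A = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.]]) / np.sqrt(2)
X = rng.normal(size=(n, k)) @ A

W_enc = 0.1 * rng.normal(size=(d, k))   # encoder: input -> bottleneck
W_dec = 0.1 * rng.normal(size=(k, d))   # decoder: bottleneck -> output
lr = 0.05

mse0 = np.mean((X @ W_enc @ W_dec - X) ** 2)  # error before training
for _ in range(5000):
    H = X @ W_enc                 # bottleneck code (linear activation)
    err = H @ W_dec - X           # reconstruction error
    # backpropagation: gradients of the mean squared reconstruction error
    g_dec = H.T @ err / n
    g_enc = X.T @ (err @ W_dec.T) / n
    W_dec -= lr * g_dec
    W_enc -= lr * g_enc
mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
print(f"reconstruction MSE: {mse0:.3f} -> {mse:.5f}")
```

Because both layers are linear, this network can only recover the same subspace PCA would find; the point of real autoencoders is that nonlinear activations let the bottleneck capture curved low-dimensional structure.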