NLP : AI and ML

Feature Selection

Filter Methods

Wrapper Methods

Embedded Methods

Feature Selection Checklist

  1. Do you have domain knowledge?
  2. Are your features commensurate?
  3. Do you suspect interdependence of features? If so, expand your feature set by constructing conjunctive features or products of features, as much as your computer resources allow.

Reference for the information above: https://machinelearningmastery.com/an-introduction-to-feature-selection/
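To make the filter / wrapper / embedded distinction above concrete, here is a minimal scikit-learn sketch. The dataset, the choice of keeping 10 features, and the hyper-parameters are illustrative assumptions, not taken from the referenced article.

# A minimal sketch of the three feature selection families with scikit-learn.
# Dataset and hyper-parameters below are illustrative choices only.
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)          # put features on a comparable scale

# Filter: rank features by a univariate statistic (ANOVA F-score), keep the top 10.
filter_sel = SelectKBest(score_func=f_classif, k=10).fit(X, y)

# Wrapper: recursive feature elimination, repeatedly refitting an estimator.
wrapper_sel = RFE(LogisticRegression(max_iter=2000), n_features_to_select=10).fit(X, y)

# Embedded: selection happens inside model training, here via an L1 penalty.
embedded_sel = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
).fit(X, y)

for name, sel in [("filter", filter_sel), ("wrapper", wrapper_sel), ("embedded", embedded_sel)]:
    print(name, "keeps", sel.get_support().sum(), "of", X.shape[1], "features")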

Feature Selection with Neural Networks

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.4570&rep=rep1&type=pdf

Manifold learning

"Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high"
"linear dimensionality reduction frameworks have been designed, such as Principal Component Analysis (PCA), Independent Component Analysis, Linear Discriminant Analysis"
"Manifold Learning can be thought of as an attempt to generalize linear frameworks like PCA to be sensitive to non-linear structure in data."

Manifold learning approaches:

Isomap: nearest neighbor search, shortest-path graph search, partial eigenvalue decomposition.

Locally Linear Embedding: nearest neighbors search, weight matrix construction, partial eigenvalue decomposition.

Modified Locally Linear Embedding

Hessian Eigenmapping

Spectral Embedding

Local Tangent Space Alignment

Multi-dimensional Scaling (MDS)

t-distributed Stochastic Neighbor Embedding (t-SNE)

References: https://scikit-learn.org/stable/modules/manifold.html
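A minimal scikit-learn sketch of a few of the approaches listed above (Isomap, LLE, t-SNE), with linear PCA as a baseline; the dataset and parameter values are illustrative assumptions.

# Compare a few manifold learning methods from the scikit-learn page above.
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap, LocallyLinearEmbedding, TSNE
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)   # 64-dimensional inputs

embeddings = {
    "PCA (linear baseline)": PCA(n_components=2).fit_transform(X),
    "Isomap": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
    "LLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2).fit_transform(X),
    "t-SNE": TSNE(n_components=2, init="pca", perplexity=30).fit_transform(X),
}

for name, emb in embeddings.items():
    print(f"{name}: {X.shape} -> {emb.shape}")   # each method maps 64-D data to 2-D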

Nonlinear Principal Component Analysis

Might not be that helpful: https://www.image.ucar.edu/pub/toyIV/monahan_5_16.pdf

"

Nonlinear PCA

Nonlinear PCA[42] (NLPCA) uses backpropagation to train a multi-layer perceptron (MLP) to fit to a manifold. Unlike typical MLP training, which only updates the weights, NLPCA updates both the weights and the inputs. That is, both the weights and inputs are treated as latent values. After training, the latent inputs are a low-dimensional representation of the observed vectors, and the MLP maps from that low-dimensional representation to the high-dimensional observation space."
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Nonlinear_PCA

"Nonlinear principal component analysis (NLPCA) is commonly seen as a nonlinear generalization of standard principal component analysis (PCA). It generalizes the principal components from straight lines to curves (nonlinear). Thus, the subspace in the original data space which is described by all nonlinear components is also curved.
Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture also known as autoencoder, replicator network, bottleneck or sandglass type network. Such autoassociative neural network is a multi-layer perceptron that performs an identity mapping, meaning that the output of the network is required to be identical to the input. However, in the middle of the network is a layer that works as a bottleneck in which a reduction of the dimension of the data is enforced. This bottleneck-layer provides the desired component values (scores)."
http://www.nlpca.org/
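As a rough illustration of the bottleneck (autoassociative) idea described above, here is a small sketch using scikit-learn's MLPRegressor trained as an identity mapping. The layer sizes, solver, and dataset are illustrative assumptions; real NLPCA implementations (such as the one at nlpca.org) differ in detail.

# A bottleneck (autoassociative) network for nonlinear PCA, trained to
# reproduce its own input. The 2-unit middle layer supplies the "scores".
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

X = StandardScaler().fit_transform(load_iris().data)   # 4 input features

# Encoder 4 -> 16 -> 2, decoder 2 -> 16 -> 4; the 2-unit layer is the bottleneck.
autoencoder = MLPRegressor(hidden_layer_sizes=(16, 2, 16), activation="tanh",
                           solver="lbfgs", max_iter=5000, random_state=0)
autoencoder.fit(X, X)          # identity mapping: the output should reproduce the input

# Manual forward pass up to the bottleneck layer to read off the component values.
def encode(model, X, bottleneck_layer=2):
    h = X
    for W, b in zip(model.coefs_[:bottleneck_layer], model.intercepts_[:bottleneck_layer]):
        h = np.tanh(h @ W + b)
    return h

codes = encode(autoencoder, X)
reconstruction = autoencoder.predict(X)
print("bottleneck codes:", codes.shape)                        # (150, 2)
print("reconstruction MSE:", np.mean((X - reconstruction) ** 2))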

Principal Component Analysis

https://medium.com/maheshkkumar/principal-component-analysis-2d11043ff324

Eigenvectors and Eigenvalues
https://medium.com/@dareyadewumi650/understanding-the-role-of-eigenvectors-and-eigenvalues-in-pca-dimensionality-reduction-10186dad0c5c
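A short sketch of the eigenvector/eigenvalue view of PCA described in the links above: diagonalize the covariance matrix, project onto the top eigenvectors, and check against scikit-learn's PCA. The dataset is just an illustrative choice.

# PCA from the eigendecomposition of the covariance matrix, checked against sklearn.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
Xc = X - X.mean(axis=0)                       # center the data

cov = np.cov(Xc, rowvar=False)                # covariance matrix (4x4)
eigvals, eigvecs = np.linalg.eigh(cov)        # eigh: for symmetric matrices
order = np.argsort(eigvals)[::-1]             # sort by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores_manual = Xc @ eigvecs[:, :2]           # project onto the top-2 eigenvectors
scores_sklearn = PCA(n_components=2).fit_transform(X)

# The columns agree up to sign, which is the usual PCA ambiguity.
print("explained variance (manual):", eigvals[:2])
print("max |difference|:", np.max(np.abs(np.abs(scores_manual) - np.abs(scores_sklearn))))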

"Singular value decomposition

[Figure: the singular value decomposition UΣV* of a real 2×2 matrix M, shown as the action of V* (a rotation) on the unit disc D and the canonical unit vectors e1 and e2, then Σ (a scaling by the singular values σ1 horizontally and σ2 vertically), then U (another rotation).]

In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix that generalizes the eigendecomposition of a square normal matrix to any m×n matrix via an extension of the polar decomposition.

Specifically, the singular value decomposition of an m×n real or complex matrix M is a factorization of the form M = UΣV*, where U is an m×m real or complex unitary matrix, Σ is an m×n rectangular diagonal matrix with non-negative real numbers on the diagonal, and V is an n×n real or complex unitary matrix. If M is real, U and V are real orthogonal matrices, and V* is simply the transpose of V.






"
Ref: https://en.wikipedia.org/wiki/Singular_value_decomposition
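A small NumPy check of the definition above: factor a 3×2 real matrix as M = UΣV*, and verify that U and V are orthogonal and the singular values are non-negative. The matrix is an arbitrary illustrative example.

# Verify the SVD factorization M = U Σ V* with NumPy.
import numpy as np

M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])                    # a 3x2 real matrix

U, s, Vt = np.linalg.svd(M, full_matrices=True)
Sigma = np.zeros_like(M)
Sigma[:len(s), :len(s)] = np.diag(s)          # rectangular diagonal Σ

print("singular values (non-negative):", s)
print("U orthogonal:", np.allclose(U @ U.T, np.eye(3)))
print("V orthogonal:", np.allclose(Vt @ Vt.T, np.eye(2)))
print("M == U Σ V*:", np.allclose(U @ Sigma @ Vt, M))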

Graph Neural Networks
"Graph Neural Network is a type of Neural Network which directly operates on the Graph structure. A typical application of GNN is node classification. Essentially, every node in the graph is associated with a label, and we want to predict the label of the nodes without ground-truth."
https://towardsdatascience.com/a-gentle-introduction-to-graph-neural-network-basics-deepwalk-and-graphsage-db5d540d50b3

"How do Graph neural networks work?
Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth."

Reference: Graph Neural Networks - arXiv

Deep Reinforcement Learning meets Graph Neural Networks: An optical network routing use case

https://arxiv.org/pdf/1910.07421.pdf
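To make the message-passing idea quoted above concrete, here is a tiny NumPy sketch of one GCN-style propagation step, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). The toy graph, features, and weights are made up for illustration and are not from the referenced papers.

# One message-passing / graph-convolution step on a tiny toy graph.
import numpy as np

# Adjacency matrix of a 4-node undirected graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)

A_hat = A + np.eye(4)                               # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt            # symmetric normalization

H = np.random.default_rng(0).normal(size=(4, 3))    # node features (4 nodes, 3 dims)
W = np.random.default_rng(1).normal(size=(3, 2))    # learnable weights (3 -> 2)

H_next = np.maximum(0, A_norm @ H @ W)              # aggregate neighbors, transform, ReLU
print(H_next)                                       # each row is the updated state of one node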

Bin Counting and Text Analysis






***. ***. *** ***
Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada

*** . *** *** . *** . *** . ***

Sayed Ahmed

BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
LinkedIn: https://ca.linkedin.com/in/sayedjustetc

Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com
Online and Offline Training: http://Training.SitesTree.com (courses are sometimes free or low cost)

Facebook Group/Forum for discussion (Q & A): https://www.facebook.com/banglasalearningschool

Our free or paid training events: https://www.facebook.com/justetcsocial

Get access to courses on Big Data, Data Science, AI, Cloud, Linux, System Admin, Web Development, and related topics. Also, create your own course to sell to others. http://sitestree.com/training/

If you want to support occasional free and/or low-cost online/offline training, or charitable/non-profit work in the education/health/social service sector, you can contribute financially to safoundation at salearningschool.com using PayPal or Credit Card (on http://sitestree.com/training/enrol/index.php?id=114 ).