Misc. : Classifier Performance and Model Selection

Cross Validation:



Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into. As such, the procedure is often called k-fold cross-validation.

A Gentle Introduction to k-fold Cross-Validation: https://machinelearningmastery.com/k-fold-cross-validation/
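
To make the procedure concrete, here is a minimal k-fold cross-validation sketch in base R (my own illustration, not from the article; it uses the built-in mtcars data and a linear model):

# 5-fold cross-validation of a linear model on the built-in mtcars data
k <- 5
set.seed(1)
folds <- sample(rep(1:k, length.out = nrow(mtcars)))     # random fold labels
cv_mse <- sapply(1:k, function(i) {
  fit <- lm(mpg ~ wt + hp, data = mtcars[folds != i, ])  # train on k-1 folds
  pred <- predict(fit, mtcars[folds == i, ])             # predict the held-out fold
  mean((mtcars$mpg[folds == i] - pred)^2)
})
mean(cv_mse)  # cross-validated estimate of prediction error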

***

Model selection – Wikipedia

Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is well-suited to the problem of model selection. Given candidate models of similar predictive or explanatory power, the simplest model is most likely to be the best choice (Occam’s razor).

Model Selection
http://statweb.stanford.edu/~jtaylo/courses/stats203/notes/selection.pdf

Machine Learning Model Evaluation

Evaluation methods: Holdout, Cross-Validation

Classification Metrics:

  • Classification Accuracy
  • Confusion matrix
  • Logarithmic Loss
  • Area under curve (AUC)
  • F-Measure

Regression Metrics

Root Mean Squared Error and Mean Absolute Error.

https://heartbeat.fritz.ai/introduction-to-machine-learning-model-evaluation-fa859e1b2d7f
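
As a rough illustration of these metrics (a sketch, not taken from the article), each one can be computed by hand in R from a vector of predicted probabilities:

# toy labels and predicted probabilities for a binary classifier
y <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
p <- c(0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3, 0.55, 0.45)
pred <- as.integer(p >= 0.5)              # threshold at 0.5

table(Predicted = pred, Actual = y)       # confusion matrix
mean(pred == y)                           # classification accuracy
-mean(y * log(p) + (1 - y) * log(1 - p))  # logarithmic loss
mean(outer(p[y == 1], p[y == 0], ">"))    # AUC: P(a positive outranks a negative)

prec <- sum(pred == 1 & y == 1) / sum(pred == 1)  # precision
rec  <- sum(pred == 1 & y == 1) / sum(y == 1)     # recall
2 * prec * rec / (prec + rec)             # F-measure (F1)

# regression metrics, on a small numeric example
y_num <- c(3.0, 2.5, 4.1); y_hat <- c(2.8, 2.9, 3.9)
sqrt(mean((y_num - y_hat)^2))             # Root Mean Squared Error
mean(abs(y_num - y_hat))                  # Mean Absolute Error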

Model Assessment and Selection:
AIC BIC SRM
http://people.stat.sfu.ca/~dean/labmtgs/Fall2010/HZ-ModelAsssessmentandSelection-Ch7-1.pdf

Training Error


“Training error is the error that you get when you run the trained model back on the training data. Remember that this data has already been used to train the model, and this doesn’t necessarily mean that the model, once trained, will accurately perform when applied back on the training data itself.”


“Test error is the error that you get when you run the trained model on a set of data that it has previously never been exposed to. This data is often used to measure the accuracy of the model before it is shipped to production.”

What is a training and test error? – Quora: https://www.quora.com/What-is-a-training-and-test-error
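
A small holdout sketch in base R (illustrative, using the built-in mtcars data) showing that training error is typically optimistic compared to test error:

set.seed(7)
idx <- sample(nrow(mtcars), floor(0.7 * nrow(mtcars)))  # 70/30 split
train <- mtcars[idx, ]; test <- mtcars[-idx, ]
fit <- lm(mpg ~ ., data = train)
mean(residuals(fit)^2)                   # training error (MSE)
mean((test$mpg - predict(fit, test))^2)  # test error (MSE), usually larger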

***

Curse of dimensionality – Wikipedia

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.
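
One such phenomenon is distance concentration: in high dimensions the nearest and farthest points become nearly equidistant. A quick simulation sketch in R:

set.seed(3)
for (d in c(2, 10, 100, 1000)) {
  X <- matrix(runif(100 * d), nrow = 100)   # 100 random points in d dimensions
  dd <- dist(X)                             # all pairwise distances
  cat(sprintf("d = %4d: (max - min)/min distance = %.2f\n",
      d, (max(dd) - min(dd)) / min(dd)))    # contrast shrinks as d grows
}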

Bias–variance tradeoff – Wikipedia

“The bias–variance dilemma or bias–variance problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set: the bias error is an error from erroneous assumptions in the learning algorithm, while the variance is an error from sensitivity to small fluctuations in the training set.”

“Bias Variance Dilemma”

https://towardsdatascience.com/understanding-the-bias-variance-tradeoff-165e6942b229


What is bias and variance?

Bias is the simplifying assumptions made by the model to make the target function easier to approximate. Variance is the amount that the estimate of the target function will change given different training data. The trade-off is the tension between the error introduced by the bias and the error introduced by the variance.

machinelearningmastery.com › gentle-introduction-to-the-bias-variance-…

Gentle Introduction to the Bias-Variance Trade-Off in Machine …
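
A simulation sketch (my own illustration, not from the cited posts): as model complexity grows, training error keeps falling while test error eventually rises, which is the trade-off in action:

set.seed(42)
x <- runif(30); y <- sin(2 * pi * x) + rnorm(30, sd = 0.3)               # training data
x_new <- runif(200); y_new <- sin(2 * pi * x_new) + rnorm(200, sd = 0.3) # test data
for (d in c(1, 3, 9, 15)) {
  fit <- lm(y ~ poly(x, d))                     # polynomial of degree d
  train_mse <- mean(residuals(fit)^2)
  test_mse  <- mean((y_new - predict(fit, data.frame(x = x_new)))^2)
  cat(sprintf("degree %2d: train MSE %.3f, test MSE %.3f\n", d, train_mse, test_mse))
}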

ROC Curve


A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings.
Ref: https://en.wikipedia.org/wiki/Receiver_operating_characteristic
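
The plot itself is easy to reproduce; a base-R sketch (with illustrative simulated data) that sweeps the threshold and traces the curve:

set.seed(2)
y <- rbinom(200, 1, 0.5)              # true labels
score <- y + rnorm(200)               # classifier score, higher for positives
th <- sort(unique(score), decreasing = TRUE)
tpr <- sapply(th, function(t) mean(score[y == 1] >= t))  # true positive rate
fpr <- sapply(th, function(t) mean(score[y == 0] >= t))  # false positive rate
plot(fpr, tpr, type = "l", xlab = "False positive rate",
     ylab = "True positive rate", main = "ROC curve")
abline(0, 1, lty = 2)                 # chance-level diagonal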

What is MLE
“In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable.”
Maximum likelihood estimation – Wikipedia: https://en.wikipedia.org/wiki/Maximum_likelihood_estimation

MLE conceptually

https://medium.com/analytics-vidhya/maximum-likelihood-estimation-conceptual-understanding-using-an-example-28367a464486
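
A minimal numeric sketch of MLE in R (illustrative): estimate the mean and standard deviation of a normal sample by minimizing the negative log-likelihood with optim; the standard deviation is optimized on the log scale so it stays positive:

set.seed(4)
x <- rnorm(500, mean = 10, sd = 2)
negloglik <- function(par) -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
fit <- optim(c(0, 0), negloglik)
c(mean = fit$par[1], sd = exp(fit$par[2]))  # close to the true values (10, 2)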

Important Basic Concepts: Statistics for Big Data

http://bangla.salearningschool.com/recent-posts/important-basic-concepts-statistics-for-big-data/


NLP : AI and ML

Feature Selection
"

Filter Methods

Wrapper Methods

Embedded Methods

Feature Selection Checklist

  1. Do you have domain knowledge?
  2. Are your features commensurate?
  3. Do you suspect interdependence of features?

Reference for the information above: https://machinelearningmastery.com/an-introduction-to-feature-selection/
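
To give one concrete instance of a filter method (a sketch, not from the article): rank each feature by the absolute value of its correlation with the target and keep the top-scoring ones:

target   <- mtcars$mpg                        # built-in example data
features <- mtcars[, -1]                      # everything except the target
scores   <- sapply(features, function(f) abs(cor(f, target)))
sort(scores, decreasing = TRUE)[1:3]          # the three strongest features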

Feature Selection with Neural Networks

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.4570&rep=rep1&type=pdf

Manifold learning

"Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high"
"linear dimensionality reduction frameworks have been designed, such as Principal Component Analysis (PCA), Independent Component Analysis, Linear Discriminant Analysis"
"Manifold Learning can be thought of as an attempt to generalize linear frameworks like PCA to be sensitive to non-linear structure in data."

Manifold learning approaches:

  • Isomap: nearest-neighbor search, shortest-path graph search, partial eigenvalue decomposition
  • Locally Linear Embedding (LLE): nearest-neighbor search, weight matrix construction, partial eigenvalue decomposition
  • Modified Locally Linear Embedding
  • Hessian Eigenmapping
  • Spectral Embedding
  • Local Tangent Space Alignment
  • Multi-dimensional Scaling (MDS)
  • t-distributed Stochastic Neighbor Embedding (t-SNE)

References: https://scikit-learn.org/stable/modules/manifold.html
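
Of the approaches above, classical MDS ships with base R, so it makes an easy sketch (the nonlinear methods such as Isomap, LLE, or t-SNE need add-on packages or scikit-learn, as in the reference):

d <- dist(scale(USArrests))     # pairwise distances on standardized built-in data
coords <- cmdscale(d, k = 2)    # embed the points in 2 dimensions
plot(coords, xlab = "Dimension 1", ylab = "Dimension 2", main = "Classical MDS")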

Nonlinear Principal Component Analysis

Might not be that helpful: https://www.image.ucar.edu/pub/toyIV/monahan_5_16.pdf

"

Nonlinear PCA[edit]

Nonlinear PCA[42] (NLPCA) uses backpropagation to train a multi-layer perceptron (MLP) to fit to a manifold. Unlike typical MLP training, which only updates the weights, NLPCA updates both the weights and the inputs. That is, both the weights and inputs are treated as latent values. After training, the latent inputs are a low-dimensional representation of the observed vectors, and the MLP maps from that low-dimensional representation to the high-dimensional observation space."
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Nonlinear_PCA

"Nonlinear principal component analysis (NLPCA) is commonly seen as a nonlinear generalization of standard principal component analysis (PCA). It generalizes the principal components from straight lines to curves (nonlinear). Thus, the subspace in the original data space which is described by all nonlinear components is also curved.
Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture also known as autoencoder, replicator network, bottleneck or sandglass type network. Such autoassociative neural network is a multi-layer perceptron that performs an identity mapping, meaning that the output of the network is required to be identical to the input. However, in the middle of the network is a layer that works as a bottleneck in which a reduction of the dimension of the data is enforced. This bottleneck-layer provides the desired component values (scores)."
http://www.nlpca.org/

Principal Component Analysis

https://medium.com/maheshkkumar/principal-component-analysis-2d11043ff324

EigenVector and EigenValues
https://medium.com/@dareyadewumi650/understanding-the-role-of-eigenvectors-and-eigenvalues-in-pca-dimensionality-reduction-10186dad0c5c

"Singular value decomposition

From Wikipedia, the free encyclopedia

Jump to navigationJump to search
Illustration of the singular value decomposition UΣV* of a real 2×2 matrix M.

  • Top: The action of M, indicated by its effect on the unit disc D and the two canonical unit vectors e1 and e2.
  • Left: The action of V*, a rotation, on D, e1, and e2.
  • Bottom: The action of Σ, a scaling by the singular values σ1 horizontally and σ2 vertically.
  • Right: The action of U, another rotation.

In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix that generalizes the eigendecomposition of a square normal matrix to any {\displaystyle m\times n}m\times n matrix via an extension of the polar decomposition.

Specifically, the singular value decomposition of an {\displaystyle m\times n}m\times n real or complex matrix {\displaystyle \mathbf {M} }\mathbf {M} is a factorization of the form {\displaystyle \mathbf {U\Sigma V^{*}} }{\displaystyle \mathbf {U\Sigma V^{*}} }, where {\displaystyle \mathbf {U} }\mathbf {U} is an {\displaystyle m\times m}m\times m real or complex unitary matrix, {\displaystyle \mathbf {\Sigma } }\mathbf{\Sigma} is an {\displaystyle m\times n}m\times n rectangular diagonal matrix with non-negative real numbers on the diagonal, and {\displaystyle \mathbf {V} }\mathbf {V} is an {\displaystyle n\times n}n\times n real or complex unitary matrix. If {\displaystyle \mathbf {M} }\mathbf {M} is real, {\displaystyle \mathbf {U} }\mathbf {U} and {\displaystyle \mathbf {V} =\mathbf {V^{*}} }{\displaystyle \mathbf {V} =\mathbf {V^{*}} } are real orthonormal matrices.






"
Ref: https://en.wikipedia.org/wiki/Singular_value_decomposition
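
In R, the decomposition and the reconstruction M = UΣV* can be checked directly (a small sketch):

M <- matrix(c(3, 1, 1, 2, 2, 0), nrow = 2)   # a real 2x3 matrix
s <- svd(M)
s$d                                          # singular values (diagonal of Sigma)
all.equal(M, s$u %*% diag(s$d) %*% t(s$v))   # TRUE: reconstruction matches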

Graph Neural Networks
"Graph Neural Network is a type of Neural Network which directly operates on the Graph structure. A typical application of GNN is node classification. Essentially, every node in the graph is associated with a label, and we want to predict the label of the nodes without ground-truth .Feb 10, 2019"
https://towardsdatascience.com/a-gentle-introduction-to-graph-neural-network-basics-deepwalk-and-graphsage-db5d540d50b3

"How do Graph neural networks work?
Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth.

arxiv.org › pdf

Graph Neural Networks – arXiv

"

Deep Reinforcement Learning meets Graph Neural Networks: An optical network routing use case

https://arxiv.org/pdf/1910.07421.pdf

Bin Counting and Text Analysis


Misc. Optimization:

"Linear programming (LP, also called linear optimization) is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships.
en.wikipedia.org › wiki › Linear_programming

Linear programming – Wikipedia

"

"Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm."
https://en.wikipedia.org/wiki/Branch_and_bound
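
A compact branch-and-bound sketch in base R for a tiny 0/1 knapsack (my own illustration; the pruning bound comes from the fractional LP relaxation, in the spirit of the description above):

v <- c(60, 100, 120); w <- c(10, 20, 30); cap_total <- 50
ord <- order(v / w, decreasing = TRUE)   # sort by value density for the bound
v <- v[ord]; w <- w[ord]

bound <- function(i, cap, val) {         # optimistic bound: fill the rest fractionally
  while (i <= length(v) && w[i] <= cap) {
    cap <- cap - w[i]; val <- val + v[i]; i <- i + 1
  }
  if (i <= length(v)) val <- val + v[i] * cap / w[i]
  val
}

best <- 0
branch <- function(i, cap, val) {
  if (val > best) best <<- val                                          # new incumbent
  if (i > length(v) || bound(i, cap, val) <= best) return(invisible())  # prune branch
  if (w[i] <= cap) branch(i + 1, cap - w[i], val + v[i])                # take item i
  branch(i + 1, cap, val)                                               # skip item i
}
branch(1, cap_total, 0)
best   # 220: take the items weighing 20 and 30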

Convex Optimization Branch and Bound Methods

https://people.orie.cornell.edu/mru8/orie6326/lectures/sp.pdf

Semidefinite Programming and Max-Cut

https://www.cs.cmu.edu/~anupamg/adv-approx/lecture14.pdf

Relating max-cut problems and binary linear feasibility problems

http://www.optimization-online.org/DB_FILE/2009/02/2237.pdf

"Branch and cut[1] is a method of combinatorial optimization for solving integer linear programs (ILPs), that is, linear programming (LP) problems where some or all the unknowns are restricted to integer values.[2] Branch and cut involves running a branch and bound algorithm and using cutting planes to tighten the linear programming relaxations. Note that if cuts are only used to tighten the initial LP relaxation, the algorithm is called cut and branch."
https://en.wikipedia.org/wiki/Branch_and_cut

Integer Programming

http://web.mit.edu/15.053/www/AMP-Chapter-09.pdf

Bang–bang solutions in optimal control

"In optimal control problems, it is sometimes the case that a control is restricted to be between a lower and an upper bound. If the optimal control switches from one extreme to the other (i.e., is strictly never in between the bounds), then that control is referred to as a bang-bang solution.
Bang–bang controls frequently arise in minimum-time problems. For example, if it is desired to stop a car in the shortest possible time at a certain position ahead of the car, the solution is to apply maximum acceleration until the unique switching point, and then apply maximum braking to come to rest exactly at the desired position."
https://en.wikipedia.org/wiki/Bang%E2%80%93bang_control


Misc Basic Statistics for Data Science

Hypergeometric Distribution

“In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of k successes (random draws for which the object drawn has a specified feature) in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature, wherein each draw is either a success or a failure. In contrast, the binomial distribution describes the probability of k successes in n draws with replacement.
In statistics, the hypergeometric test uses the hypergeometric distribution to calculate the statistical significance of having drawn a specific k successes (out of n total draws) from the aforementioned population. The test is often used to identify which sub-populations are over- or under-represented in a sample. This test has a wide range of applications. For example, a marketing group could use the test to understand their customer base by testing a set of known customers for over-representation of various demographic subgroups (e.g., women, people under 30).”
https://en.wikipedia.org/wiki/Hypergeometric_distribution
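
R's dhyper implements this directly; a small sketch contrasting it with the with-replacement (binomial) case:

# urn with 10 marked objects among N = 50; draw n = 5 without replacement
dhyper(x = 2, m = 10, n = 40, k = 5)   # P(exactly 2 marked in the sample)
# the same draw *with* replacement would be binomial:
dbinom(2, size = 5, prob = 10 / 50)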

Binomial Distribution
“In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own boolean-valued outcome: success/yes/true/one (with probability p) or failure/no/false/zero (with probability q = 1 − p). A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution remains a good approximation, and is widely used.”
https://en.wikipedia.org/wiki/Binomial_distribution
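
A quick sketch with R's built-in binomial functions:

dbinom(0:10, size = 10, prob = 0.5)                   # P(k heads) for 10 fair flips
sum(dbinom(8:10, size = 10, prob = 0.5))              # P(8 or more heads)
pbinom(7, size = 10, prob = 0.5, lower.tail = FALSE)  # the same tail probability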

Negative Binomial Distribution
“In probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of failures (denoted r) occurs. For example, we can define that when we throw a dice and get a 6 it is a failure while rolling any other number is considered a success, and also choose r to be 3. We then throw the dice repeatedly until the third time the number 6 appears. In such a case, the probability distribution of the number of non-6s that appeared will be a negative binomial distribution.

The Pascal distribution (after Blaise Pascal) and Polya distribution (for George Pólya) are special cases of the negative binomial distribution. A convention among engineers, climatologists, and others is to use “negative binomial” or “Pascal” for the case of an integer-valued stopping-time parameter r, and use “Polya” for the real-valued case.”
https://en.wikipedia.org/wiki/Negative_binomial_distribution
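
The dice example maps onto R's dnbinom, which counts failures before the size-th success; here a "success" in R's convention is rolling a 6 (probability 1/6), so the non-6s are the counted failures (note the labels are flipped relative to the quote's convention):

dnbinom(x = 5, size = 3, prob = 1 / 6)       # P(exactly 5 non-6s before the 3rd six)
mean(rnbinom(1e5, size = 3, prob = 1 / 6))   # ~ 3 * (5/6) / (1/6) = 15 on average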

Probability and Counting

“To decide “how likely” an event is, we need to count the number of times an event could occur and compare it to the total number of possible events. Such a comparison is called the probability of the particular event occurring. The mathematical theory of counting is known as combinatorial analysis”
https://www.intmath.com/counting-probability/counting-probability-intro.php

Principle of Counting

“The Fundamental Counting Principle (also called the counting rule) is a way to figure out the number of outcomes in a probability problem. Basically, you multiply the events together to get the total number of outcomes. The formula is:
If you have an event “a” and another event “b”, then the number of different outcomes for the two events together is a × b.”
https://www.statisticshowto.datasciencecentral.com/fundamental-counting-principle/

Combinatorics
https://mathigon.org/world/Combinatorics

fundamental principle of counting

“The Fundamental Counting Principle states that if one event has m possible outcomes and a second independent event has n possible outcomes, then there are m x n total possible outcomes for the two events together.”
https://www.mathgoodies.com/glossary/term/Fundamental%20Counting%20Principle
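
In R the principle is just multiplication, and the related counting functions are built in (a trivial sketch):

3 * 4           # 3 shirts x 4 pairs of pants -> 12 outfits
factorial(5)    # 5! = 120 orderings of five distinct items
choose(5, 2)    # 10 ways to pick 2 of 5 when order does not matter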

Factorial
“In mathematics, the factorial of a positive integer n, denoted by n!, is the product of all positive integers less than or equal to n:

n! = n × (n−1) × (n−2) × (n−3) × ⋯ × 3 × 2 × 1.”
https://en.wikipedia.org/wiki/Factorial

Factorial with Identical Numbers
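
Counting arrangements when some of the items are identical usually means dividing n! by the factorial of each repeat count; for example, the letters of MISSISSIPPI (four I's, four S's, two P's), checked in R:

factorial(11) / (factorial(4) * factorial(4) * factorial(2))  # 34650 arrangements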

Bayes’ theorem


In probability theory and statistics, Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. (Wikipedia)

Formula
P(A|B) = P(B|A) · P(A) / P(B)

A, B = events
P(A|B) = probability of A given B is true
P(B|A) = probability of B given A is true
P(A), P(B) = the independent probabilities of A and B
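
A worked sketch in R with assumed numbers (a rare disease and an imperfect test):

prior <- 0.01   # P(disease)
sens  <- 0.95   # P(positive | disease)
fpr   <- 0.05   # P(positive | no disease)
p_pos <- sens * prior + fpr * (1 - prior)   # P(positive), by total probability
sens * prior / p_pos                        # P(disease | positive), about 0.16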


The Kalman Filter: Theory : Example: Equations: Applications

The Kalman Filter: An algorithm for making sense of fused sensor insight

“The Kalman filter is relatively quick and easy to implement and, under certain conditions, provides an optimal estimate of the state for normally distributed noisy sensor values. Kalman was so convinced of his algorithm that he was able to inspire a friendly engineer at NASA, and so this filter helped for the first time in the Apollo Guidance Computer at the moon landings.”

https://towardsdatascience.com/kalman-filter-an-algorithm-for-making-sense-from-the-insights-of-various-sensors-fused-together-ddf67597f35e

Kalman Filter

“In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory.”
https://en.wikipedia.org/wiki/Kalman_filter
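
A one-dimensional sketch in base R (my own illustration, not from the cited sources): the filter tracks a constant hidden value from noisy measurements, alternating predict and update steps:

set.seed(1)
z <- rnorm(50, mean = 5, sd = 2)  # noisy measurements of a constant state (5)
Q <- 1e-4; R <- 4                 # assumed process / measurement noise variances
x <- 0; P <- 100                  # initial estimate and its variance
for (k in seq_along(z)) {
  P <- P + Q                      # predict: state assumed constant, uncertainty grows
  K <- P / (P + R)                # Kalman gain
  x <- x + K * (z[k] - x)         # update with the innovation z[k] - x
  P <- (1 - K) * P
}
x                                 # final estimate, close to the true value 5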

Kalman Filter in Two Dimensions

https://www.researchgate.net/publication/3082925_Kalman_filtering_in_two_dimensions

Understanding and Applying Kalman Filtering

http://biorobotics.ri.cmu.edu/papers/sbp_papers/integrated3/kleeman_kalman_basics.pdf

Related:

https://arxiv.org/pdf/1910.03558.pdf

https://www.cse.sc.edu/~terejanu/files/tutorialEKF.pdf

https://statweb.stanford.edu/~candes/teaching/acm116/Handouts/Kalman.pdf


Why You Don’t Need to Be Bezos to Worry About Spyware


"

5. Could that happen to me?

Yes, but the likelihood of that varies greatly. If you are a lawyer, journalist, activist or politician in possession of sensitive data, or an enemy of a regime that has little regard for human rights, you could be especially vulnerable to this kind of digital attack."

https://www.bloomberg.com/news/articles/2020-02-02/why-you-don-t-need-to-be-bezos-to-worry-about-spyware-quicktake

China to inject $174 billion of liquidity on Monday as markets reopen

"Chinese authorities have pledged to use various monetary policy tools to ensure liquidity remains reasonably ample and to support firms affected by the virus epidemic, which has so far claimed 305 lives, all but one in China."
https://www.reuters.com/article/us-china-health-cenbank/china-to-inject-174-billion-of-liquidity-on-monday-as-markets-reopen-idUSKBN1ZW074?il=0


Misc. Math. Data Science. Machine Learning. Optimization. Vector, PCA, Basis, Covariance


Orthonormality: Orthonormal Vectors

“In linear algebra, two vectors in an inner product space are orthonormal if they are orthogonal and unit vectors. A set of vectors form an orthonormal set if all vectors in the set are mutually orthogonal and all of unit length. An orthonormal set which forms a basis is called an orthonormal basis.”
https://en.wikipedia.org/wiki/Orthonormality
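
A quick check in R: the Q factor of a QR decomposition has orthonormal columns, so QᵀQ is the identity (a sketch):

A <- matrix(rnorm(9), 3, 3)
Q <- qr.Q(qr(A))           # columns form an orthonormal basis of R^3
round(crossprod(Q), 10)    # t(Q) %*% Q = identity matrix, up to rounding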

Basis for a Vector Space
“A vector space’s basis is a subset of vectors within the space that are linearly independent and span the space. A basis is linearly independent because the vectors in it cannot be defined as a linear combination of any of the other vectors in the basis.”

https://study.com/academy/lesson/finding-the-basis-of-a-vector-space.html

Vector Space
“In linear algebra, you might find yourself working with a set of vectors. When the operations of scalar multiplication and vector addition hold for a set of vectors, we call it a vector space.”
https://study.com/academy/lesson/finding-the-basis-of-a-vector-space.html

Covariance matrices and the shape of data

“The covariance matrix defines the shape of the data. Diagonal spread is captured by the covariance, while axis-aligned spread is captured by the variance.”

https://www.visiondummy.com/2014/04/geometric-interpretation-covariance-matrix/

https://www.cs.rutgers.edu/~elgammal/classes/cs536/lectures/i2ml-chap6.pdf

https://pathmind.com/wiki/eigenvector
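
A sketch of the geometric picture in R: correlated 2-D data has off-diagonal covariance, and the eigenvectors of the covariance matrix give the axes of the data cloud:

set.seed(5)
x <- rnorm(500)
y <- 0.8 * x + rnorm(500, sd = 0.4)  # diagonally spread data
S <- cov(cbind(x, y))
S                                    # off-diagonal entries capture the tilt
eigen(S)$vectors                     # principal axes of the cloud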

How to derive variance-covariance matrix of coefficients in linear regression

https://stats.stackexchange.com/questions/68151/how-to-derive-variance-covariance-matrix-of-coefficients-in-linear-regression

“The matrix K_YX K_XX^{-1} is known as the matrix of regression coefficients, while in linear algebra K_{Y|X} is the Schur complement of K_XX in Σ.
The matrix of regression coefficients may often be given in transpose form, K_XX^{-1} K_XY, suitable for post-multiplying a row vector of explanatory variables X^T rather than pre-multiplying a column vector X. In this form they correspond to the coefficients obtained by inverting the matrix of the normal equations of ordinary least squares (OLS).”
https://en.wikipedia.org/wiki/Covariance_matrix

Statistics 512: Applied Linear Models Topic 3

https://www.stat.purdue.edu/~boli/stat512/lectures/topic3.pdf


Misc Math, Data Science, Machine Learning, PCA, FA

“In mathematics, a set B of elements (vectors) in a vector space V is called a basis, if every element of V may be written in a unique way as a (finite) linear combination of elements of B. The coefficients of this linear combination are referred to as components or coordinates on B of the vector. The elements of a basis are called basis vectors.”

Equivalently B is a basis if its elements are linearly independent and every element of V is a linear combination of elements of B.[1] In more general terms, a basis is a linearly independent spanning set.

A vector space can have several bases; however all the bases have the same number of elements, called the dimension of the vector space.

https://en.wikipedia.org/wiki/Basis_(linear_algebra)

Positive Semidefinite Matrix
“A positive semidefinite matrix is a Hermitian matrix all of whose eigenvalues are nonnegative.”

http://mathworld.wolfram.com/PositiveSemidefiniteMatrix.html
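
A sketch of the eigenvalue criterion in R, using the fact that any matrix of the form XᵀX is positive semidefinite:

X <- matrix(rnorm(12), nrow = 4)
S <- crossprod(X)                    # t(X) %*% X, symmetric PSD by construction
eigen(S, symmetric = TRUE)$values    # all nonnegative, up to rounding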

Hermitian Matrix

A square matrix is called Hermitian if it is self-adjoint. Therefore, a Hermitian matrix A = (a_ij) is defined as one for which

A = A^H,

where A^H denotes the conjugate transpose. This is equivalent to the condition

a_ij = conj(a_ji),

i.e., each entry equals the complex conjugate of the mirrored entry.

http://mathworld.wolfram.com/HermitianMatrix.html

Definiteness of a matrix

“In linear algebra, a symmetric n×n real matrix M is said to be positive definite if the scalar z^T M z is strictly positive for every non-zero column vector z of n real numbers. Here z^T denotes the transpose of z.[1] When interpreting Mz as the output of an operator M acting on an input z, the property of positive definiteness implies that the output always has a positive inner product with the input, as often observed in physical processes.”
https://en.wikipedia.org/wiki/Definiteness_of_a_matrix


PCA using Python (scikit-learn)

https://towardsdatascience.com/pca-using-python-scikit-learn-e653f8989e60

Random R code in relation to PCA (using the built-in USArrests data as an example):

# center and scale the example data
pca_data <- scale(USArrests)

# calculate the covariance matrix
cov_mat <- cov(pca_data)

# calculate eigenvalues/eigenvectors with the built-in eigen function
# (no need here to implement our own eigen solver)
eig <- eigen(cov_mat)

# verify with prcomp from R (principal components function)
prcomp(pca_data)

eig$vectors      # principal directions; match prcomp's rotation up to sign
t(eig$vectors)   # transpose: rows are the principal directions

Some more information on PCA and FA (Factor Analysis)
https://www.cs.rutgers.edu/~elgammal/classes/cs536/lectures/i2ml-chap6.pdf


‘Scam’ fundraisers reap millions in the name of heart-tugging causes

"The call centers in Alabama, along with others in Nevada, New Jersey, and Florida, raise money on behalf of “scam PACs,” slang among critics for political action committees that purport to support worthy causes but in reality hand over little of the money for political – or charitable – purposes. Instead, the bulk of the money is kept by fundraising firms or the people running the PACs."
https://www.reuters.com/investigates/special-report/usa-fundraisers-scampacs/



Optimization, Data Science, Math

Optimization Problem:

Advances in Missile Guidance, Control, and Estimation

Preview:
https://play.google.com/books/reader?id=A2PMBQAAQBAJ&hl=en_GB&pg=GBS.PR14

https://books.google.ca/books?id=A2PMBQAAQBAJ&pg=PA595&lpg=PA595&dq=force+moment+interaction+with+thrusters&source=bl&ots=BruxnXwLzp&sig=ACfU3U39G-l3xDzbotOBJHcMV5uR7DkciQ&hl=en&sa=X&ved=2ahUKEwjZpsT44afnAhXRJt8KHfPYCroQ6AEwCnoECAoQAQ#v=onepage&q=force%20moment%20interaction%20with%20thrusters&f=false

"What is the difference between affine and linear?
4 Answers. A linear function fixes the origin, whereas an affine function need not do so. An affine function is the composition of a linear function with a translation, so while the linear part fixes the origin, the translation can map it somewhere else.Sep 15, 2014"

"If you choose bases for vector spaces ? and ? of dimensions ? and ? respectively, and consider functions ?:?→?, then ? is linear if ?(?)=?? for some ?×? matrix ? and ? is affine if ?(?)=??+? for some matrix ? and vector ?, where coordinate representations are used with respect to the bases chosen."

https://math.stackexchange.com/questions/275310/what-is-the-difference-between-linear-and-affine-function/275327
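
The distinction in a few lines of R (a sketch): the linear map sends the origin to the origin, while the affine map moves it:

A <- matrix(c(2, 0, 0, 3), 2, 2)       # linear part
b <- c(1, -1)                          # translation
linear_f <- function(v) A %*% v
affine_f <- function(v) A %*% v + b
linear_f(c(0, 0))   # (0, 0): origin fixed
affine_f(c(0, 0))   # (1, -1): origin translated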



*** *** ***
Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada

*** . *** *** . *** . *** . ***

Sayed Ahmed

BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
Linkedin: https://ca.linkedin.com/in/sayedjustetc

Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com
Online and Offline Training: http://Training.SitesTree.com (Also, can be free and low cost sometimes)

Facebook Group/Form to discuss (Q & A): https://www.facebook.com/banglasalearningschool

Our free or paid training events: https://www.facebook.com/justetcsocial

Get access to courses on Big Data, Data Science, AI, Cloud, Linux, System Admin, Web Development and Misc. related. Also, create your own course to sell to others. http://sitestree.com/training/

If you want to contribute to occasional free and/or low cost online/offline training or charitable/non-profit work in the education/health/social service sector, you can financially contribute to: safoundation at salearningschool.com using Paypal or Credit Card (on http://sitestree.com/training/enrol/index.php?id=114 ).