Misc. Data Science: Clustering

"Modelbased clustering assumes that the data were generated by a model and tries to recover the original model from the data. The model that we recover from the data then defines clusters and an assignment of documents to clusters. A commonly used criterion for estimating the model parameters is maximum likelihood.nlp.stanford.edu › IR-book › html › htmledition › model-based-clusteri…

Model-based clustering – Stanford NLP Group

"

"
Mixture models are also known as modelbased clustering. Modelbased clustering is a broad family of algorithms designed for modelling an unknown distribution as a mixture of simpler distributions, sometimes called basis distributions.

www.sciencedirect.com › topics › medicine-and-dentistry › model-based…

Model-Based Clustering – an overview | ScienceDirect Topics

"

"
Mixture model

Description

In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Wikipedia

"

Introduction to Mixture Models
https://stephens999.github.io/fiveMinuteStats/intro_to_mixture_models.html
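To make this concrete, here is a minimal sketch of model-based clustering with a two-component Gaussian mixture in scikit-learn. The synthetic data and all parameter values are illustrative assumptions, not from the sources above; the fit itself uses maximum likelihood (the EM algorithm), as the Stanford NLP quote describes.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data: two Gaussian subpopulations (illustrative parameters)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=5.0, scale=1.5, size=(200, 2)),
])

# Fit a 2-component mixture by maximum likelihood (EM algorithm)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

labels = gmm.predict(X)          # hard cluster assignment per point
probs = gmm.predict_proba(X)     # soft (probabilistic) assignments
print(gmm.means_)                # recovered component means
```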

***. ***. ***
Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada

*** . *** *** . *** . *** . ***

Sayed Ahmed

BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
Linkedin: https://ca.linkedin.com/in/sayedjustetc

Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com
Online and Offline Training: http://Training.SitesTree.com (Also, can be free and low cost sometimes)

Facebook group/forum for discussion (Q&A): https://www.facebook.com/banglasalearningschool

Our free or paid training events: https://www.facebook.com/justetcsocial

Get access to courses on Big Data, Data Science, AI, Cloud, Linux, System Admin, Web Development and Misc. related. Also, create your own course to sell to others. http://sitestree.com/training/

If you want to contribute to occasional free and/or low cost online/offline training or charitable/non-profit work in the education/health/social service sector, you can financially contribute to: safoundation at salearningschool.com using Paypal or Credit Card (on http://sitestree.com/training/enrol/index.php?id=114 ).

Misc. Optimization Resources

L0 Norm, L1 Norm, L2 Norm & L-Infinity Norm

https://medium.com/@montjoile/l0-norm-l1-norm-l2-norm-l-infinity-norm-7a7d18a4f40c
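A quick numeric sketch of these norms with NumPy (the vector is an arbitrary example, not from the linked article):

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0, 1.0])

l0 = np.count_nonzero(x)       # L0 "norm": number of nonzero entries
l1 = np.sum(np.abs(x))         # L1 norm: sum of absolute values
l2 = np.sqrt(np.sum(x ** 2))   # L2 norm: Euclidean length, sqrt(26) here
linf = np.max(np.abs(x))       # L-infinity norm: largest absolute value

print(l0, l1, l2, linf)        # 3, 8.0, ~5.099, 4.0
```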

***

Iterative Solutions of Linear Systems

https://www.math.uh.edu/~jingqiu/math4364/iterative_linear_system.pdf
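For context, a minimal sketch of one classic iterative solver for Ax = b, Jacobi iteration; the matrix is a small diagonally dominant example chosen so the iteration converges, and is not taken from the linked notes.

```python
import numpy as np

def jacobi(A, b, iterations=50):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (A - D) x_k)."""
    D = np.diag(A)                 # diagonal entries of A
    R = A - np.diagflat(D)         # off-diagonal remainder
    x = np.zeros_like(b, dtype=float)
    for _ in range(iterations):
        x = (b - R @ x) / D        # elementwise divide by the diagonal
    return x

A = np.array([[4.0, 1.0], [2.0, 5.0]])   # diagonally dominant example
b = np.array([1.0, 2.0])
print(jacobi(A, b))                      # close to np.linalg.solve(A, b)
```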

***

How Statistical Norms Improve Modeling
https://towardsdatascience.com/norms-penalties-and-multitask-learning-2f1db5f97c1f

Project Examples: Optimization:
http://www.cs.cmu.edu/~aarti/Class/10725_Fall17/past_projects.html

https://web.stanford.edu/class/ee392o/#projects

https://ece.uwaterloo.ca/~ece602/Projects/2017/Project21/main.html

Area and Project Example:
http://www.ece.tufts.edu/ee/194CO/project_14.pdf

Sensor and Optimization: Could be a good read.
http://homepages.rpi.edu/~mitchj/phdtheses/daryn/ramsdd.pdf


Part 1: Bootstrapping, Bagging, Random Forests

What is a Classification Tree:

"A Classification tree labels, records, and assigns variables to discrete classes. A Classification tree can also provide a measure of confidence that the classification is correct. A Classification tree is built through a process known as binary recursive partitioning."

Classification Tree | solver (www.solver.com)

Pros and Cons of Classification Trees

Advantages:

  1. Requires less effort for data preparation
  2. Normalization is not required
  3. Scaling of data is not required
  4. Missing values do not affect tree building very much
  5. Easy to explain

Disadvantages:

  1. A small change in the data can cause a large change in the decision tree
  2. Calculations can sometimes become far more complex than with other algorithms
  3. Training the model can take more time
  4. Training can be relatively expensive

https://medium.com/@dhiraj8899/top-5-advantages-and-disadvantages-of-decision-tree-algorithm-428ebd199d9a

What is Ensemble Learning?

"In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone." Wikipedia

"Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), bias (boosting), or improve predictions (stacking)."

Ensemble Learning to Improve Machine Learning Results (blog.statsbot.co)

What is Bootstrapping?

"In statistics, bootstrapping is any test or metric that relies on random sampling with replacement. Bootstrapping allows assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates."

en.wikipedia.org › wiki › Bootstrapping_(statistics)

Bootstrapping (statistics) – Wikipedia

Bagging Steps:

"Suppose there are N observations and M features in the training data set. A sample from the training data set is taken randomly with replacement. A subset of the M features is selected randomly, and whichever feature gives the best split is used to split the node iteratively. The tree is grown to the largest extent."

Bagging and Boosting – Analytics India Magazine (analyticsindiamag.com)
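A minimal sketch of bagged trees versus a random forest with scikit-learn (the dataset and hyperparameters are illustrative assumptions; BaggingClassifier's default base learner is a decision tree):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: many trees, each fit on a bootstrap sample of the training data
bag = BaggingClassifier(n_estimators=100, random_state=0)
bag.fit(X_train, y_train)
print("bagging accuracy:", bag.score(X_test, y_test))

# Random forest: bagging plus a random feature subset at each split
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("random forest accuracy:", rf.score(X_test, y_test))
```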


KL Divergence, Entropy, Cross-Entropy: Example Use Cases and Equations

KL Divergence in Pictures and Examples

"Kullback–Leibler divergence is the difference between the cross-entropy H(P, Q) and the true entropy H(P)."

[Figure: the relationship between entropy, cross-entropy, and KL divergence; image from the reference below.]

“And this is what we use as a loss function while training Neural Networks. When we have an image classification problem, the training data and corresponding correct labels represent P, the true distribution. The NN predictions are our estimations Q.”

Reference for the above (including the image): https://towardsdatascience.com/entropy-cross-entropy-kl-divergence-binary-cross-entropy-cb8f72e72e65
The above URL is a great read.

****
Everything below is from the Internet, including images and equations, especially from [1].

What’s the KL Divergence?

"The Kullback–Leibler divergence (hereafter written as KL divergence) is a measure of how a probability distribution differs from another probability distribution. The KL divergence measures the distance from the approximate distribution Q to the true distribution P."

KL divergence from Q to P [1]:

D_{KL}(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}

Note: KL divergence is not a distance metric; in particular, it is not symmetric: D_{KL}(P \| Q) \neq D_{KL}(Q \| P).

It can also be written as [1]:

D_{KL}(P \| Q) = H(P, Q) - H(P)

The first term is the cross-entropy between P and Q; the second term is the entropy of P.
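A small numeric sketch of these quantities for two discrete distributions (the probabilities are arbitrary illustrative values):

```python
import numpy as np

P = np.array([0.7, 0.2, 0.1])   # "true" distribution (illustrative)
Q = np.array([0.5, 0.3, 0.2])   # approximating distribution

entropy_P = -np.sum(P * np.log(P))        # H(P)
cross_entropy = -np.sum(P * np.log(Q))    # H(P, Q)
kl_pq = np.sum(P * np.log(P / Q))         # D_KL(P || Q)

# KL divergence equals cross-entropy minus entropy
print(kl_pq)                                         # ~0.085
print(np.isclose(kl_pq, cross_entropy - entropy_P))  # True
```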

Forward and Reverse KL

Forward KL: mean-seeking behaviour. Wherever P(·) has high probability, Q(·) must also have high probability. The fitted Q tends to average across the modes of P: if P has two peaks, Q spreads its mass around their mean [1].

Reverse KL: mode-seeking behaviour. Wherever Q(·) has high probability, P(·) must also have high probability, so the fitted Q tends to lock onto a single mode of P [1].

References:
[1] https://dibyaghosh.com/blog/probability/kldivergence.html
[2] https://towardsdatascience.com/light-on-math-machine-learning-intuitive-guide-to-understanding-kl-divergence-2b382ca2b2a8

*** ***

"What is KL divergence used for? Very often in Probability and Statistics we'll replace observed data or a complex distribution with a simpler, approximating distribution. KL divergence helps us to measure just how much information we lose when we choose an approximation."

Kullback-Leibler Divergence Explained — Count Bayesie (www.countbayesie.com)


Misc.: Classifier Performance and Model Selection

Cross-Validation:

"Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into. As such, the procedure is often called k-fold cross-validation."

A Gentle Introduction to k-fold Cross-Validation (machinelearningmastery.com)
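A minimal sketch of k-fold cross-validation with scikit-learn (the dataset and model choices are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation: train on 4 folds, score on the held-out fold,
# and repeat so every fold serves once as the test set
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean(), scores.std())
```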

***

Model selection – Wikipedia

Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is well-suited to the problem of model selection. Given candidate models of similar predictive or explanatory power, the simplest model is most likely to be the best choice (Occam’s razor).

Model Selection
http://statweb.stanford.edu/~jtaylo/courses/stats203/notes/selection.pdf

Machine Learning Model Evaluation

"Holdout Cross-Validation"

Classification metrics:

  • Classification accuracy
  • Confusion matrix
  • Logarithmic loss
  • Area under the curve (AUC)
  • F-measure

Regression metrics:

  • Root mean squared error (RMSE)
  • Mean absolute error (MAE)

https://heartbeat.fritz.ai/introduction-to-machine-learning-model-evaluation-fa859e1b2d7f
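A short sketch computing several of the metrics listed above with scikit-learn; the labels, scores, and regression targets are made-up examples.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, log_loss,
                             roc_auc_score, f1_score,
                             mean_squared_error, mean_absolute_error)

# Classification: made-up true labels, predicted labels, and scores
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])
y_score = np.array([0.2, 0.6, 0.9, 0.7, 0.4, 0.1])  # predicted P(class=1)

print(accuracy_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred))
print(log_loss(y_true, y_score))
print(roc_auc_score(y_true, y_score))
print(f1_score(y_true, y_pred))

# Regression: made-up targets and predictions
t_true = np.array([3.0, 5.0, 2.5])
t_pred = np.array([2.8, 5.4, 2.0])
print(np.sqrt(mean_squared_error(t_true, t_pred)))  # RMSE
print(mean_absolute_error(t_true, t_pred))          # MAE
```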

Model Assessment and Selection:
AIC, BIC, SRM
http://people.stat.sfu.ca/~dean/labmtgs/Fall2010/HZ-ModelAsssessmentandSelection-Ch7-1.pdf

Training Error

"Training error is the error that you get when you run the trained model back on the training data. Remember that this data has already been used to train the model, and this doesn't necessarily mean that the model, once trained, will accurately perform when applied back on the training data itself."

"Test error is the error that you get when you run the trained model on a set of data that it has previously never been exposed to. This data is often used to measure the accuracy of the model before it is shipped to production."

What is a training and test error? – Quora (www.quora.com)

***

Curse of dimensionality – Wikipedia

"The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience."

Bias–variance tradeoff – Wikipedia

"The bias–variance dilemma or bias–variance problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set. The bias error is an error from erroneous assumptions in the learning algorithm."

“Bias Variance Dilemma”

https://towardsdatascience.com/understanding-the-bias-variance-tradeoff-165e6942b229


What is bias and variance?

"Bias is the simplifying assumptions made by the model to make the target function easier to approximate. Variance is the amount that the estimate of the target function will change given different training data. Trade-off is tension between the error introduced by the bias and the variance."

Gentle Introduction to the Bias-Variance Trade-Off in Machine … (machinelearningmastery.com)

ROC Curve

"A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings."
Ref: https://en.wikipedia.org/wiki/Receiver_operating_characteristic
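A minimal sketch of computing an ROC curve from predicted scores with scikit-learn (the labels and scores are made-up):

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5])

# True positive rate vs. false positive rate at each score threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(auc(fpr, tpr))   # area under the ROC curve
```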

What is MLE?

"In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable."

Maximum likelihood estimation – Wikipedia (en.wikipedia.org)

MLE conceptually

https://medium.com/analytics-vidhya/maximum-likelihood-estimation-conceptual-understanding-using-an-example-28367a464486
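As a concrete sketch, MLE for the mean and standard deviation of a normal distribution by numerically maximizing the log-likelihood. The data is synthetic; for a Gaussian the closed-form answers are simply the sample mean and sample standard deviation, which the optimizer should reproduce.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=500)   # synthetic sample

def neg_log_likelihood(params):
    mu, sigma = params
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

# Maximizing the likelihood = minimizing its negative
result = minimize(neg_log_likelihood, x0=[0.0, 1.0],
                  bounds=[(None, None), (1e-6, None)])
print(result.x)                   # ~ [2.0, 1.5]
print(data.mean(), data.std())    # closed-form MLE agrees
```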

Important Basic Concepts: Statistics for Big Data

http://bangla.salearningschool.com/recent-posts/important-basic-concepts-statistics-for-big-data/


NLP: AI and ML

Feature Selection

Approaches:

  • Filter methods
  • Wrapper methods
  • Embedded methods

Feature Selection Checklist:

  1. Do you have domain knowledge?
  2. Are your features commensurate?
  3. Do you suspect interdependence of features? If so, expand your feature set by constructing conjunctive features or products of features.

Reference for the information above: https://machinelearningmastery.com/an-introduction-to-feature-selection/
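A minimal sketch of filter-style feature selection with scikit-learn (the dataset, scoring function, and k are illustrative choices, not from the linked article):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Filter method: score each feature independently (ANOVA F-test),
# then keep the k highest-scoring features
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)                 # per-feature scores
print(X.shape, "->", X_selected.shape)  # (150, 4) -> (150, 2)
```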

Feature Selection with Neural Networks

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.4570&rep=rep1&type=pdf

Manifold learning

"Manifold learning is an approach to non-linear dimensionality reduction. Algorithms for this task are based on the idea that the dimensionality of many data sets is only artificially high"
"linear dimensionality reduction frameworks have been designed, such as Principal Component Analysis (PCA), Independent Component Analysis, Linear Discriminant Analysis"
"Manifold Learning can be thought of as an attempt to generalize linear frameworks like PCA to be sensitive to non-linear structure in data."

Manifold learning approaches:

  • Isomap: nearest-neighbor search, shortest-path graph search, partial eigenvalue decomposition
  • Locally Linear Embedding (LLE): nearest-neighbor search, weight matrix construction, partial eigenvalue decomposition
  • Modified Locally Linear Embedding
  • Hessian Eigenmapping
  • Spectral Embedding
  • Local Tangent Space Alignment
  • Multi-dimensional Scaling (MDS)
  • t-distributed Stochastic Neighbor Embedding (t-SNE)

References: https://scikit-learn.org/stable/modules/manifold.html
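A short sketch reducing a nonlinear dataset with Isomap via scikit-learn; the S-curve is a standard toy dataset (3-D points lying on a curved 2-D surface), and the neighbor count is an illustrative choice.

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap

# 3-D points lying on a curved 2-D surface (an "S"-shaped manifold)
X, color = make_s_curve(n_samples=1000, random_state=0)

# Unroll the manifold into 2 dimensions
embedding = Isomap(n_neighbors=10, n_components=2)
X_2d = embedding.fit_transform(X)
print(X.shape, "->", X_2d.shape)   # (1000, 3) -> (1000, 2)
```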

Nonlinear Principal Component Analysis

Might not be that helpful: https://www.image.ucar.edu/pub/toyIV/monahan_5_16.pdf

"

Nonlinear PCA[edit]

Nonlinear PCA[42] (NLPCA) uses backpropagation to train a multi-layer perceptron (MLP) to fit to a manifold. Unlike typical MLP training, which only updates the weights, NLPCA updates both the weights and the inputs. That is, both the weights and inputs are treated as latent values. After training, the latent inputs are a low-dimensional representation of the observed vectors, and the MLP maps from that low-dimensional representation to the high-dimensional observation space."
https://en.wikipedia.org/wiki/Nonlinear_dimensionality_reduction#Nonlinear_PCA

"Nonlinear principal component analysis (NLPCA) is commonly seen as a nonlinear generalization of standard principal component analysis (PCA). It generalizes the principal components from straight lines to curves (nonlinear). Thus, the subspace in the original data space which is described by all nonlinear components is also curved.
Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture also known as autoencoder, replicator network, bottleneck or sandglass type network. Such autoassociative neural network is a multi-layer perceptron that performs an identity mapping, meaning that the output of the network is required to be identical to the input. However, in the middle of the network is a layer that works as a bottleneck in which a reduction of the dimension of the data is enforced. This bottleneck-layer provides the desired component values (scores)."
http://www.nlpca.org/

Principal Component Analysis

https://medium.com/maheshkkumar/principal-component-analysis-2d11043ff324

Eigenvectors and Eigenvalues
https://medium.com/@dareyadewumi650/understanding-the-role-of-eigenvectors-and-eigenvalues-in-pca-dimensionality-reduction-10186dad0c5c
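A compact sketch of PCA via the eigendecomposition of the covariance matrix, using NumPy; the data is synthetic, and in practice a library implementation such as sklearn.decomposition.PCA would normally be used.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.0, 0.0, 0.1]])

# 1. Center the data; 2. form the covariance matrix
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# 3. Eigenvectors of the covariance matrix are the principal directions;
#    eigenvalues give the variance along each direction
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
components = eigvecs[:, order[:2]]       # top-2 principal components

X_reduced = Xc @ components              # project onto the components
print(X_reduced.shape)                   # (200, 2)
```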

"Singular value decomposition

From Wikipedia, the free encyclopedia

Jump to navigationJump to search
Illustration of the singular value decomposition UΣV* of a real 2×2 matrix M.

  • Top: The action of M, indicated by its effect on the unit disc D and the two canonical unit vectors e1 and e2.
  • Left: The action of V*, a rotation, on D, e1, and e2.
  • Bottom: The action of Σ, a scaling by the singular values σ1 horizontally and σ2 vertically.
  • Right: The action of U, another rotation.

In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix that generalizes the eigendecomposition of a square normal matrix to any {\displaystyle m\times n}m\times n matrix via an extension of the polar decomposition.

Specifically, the singular value decomposition of an {\displaystyle m\times n}m\times n real or complex matrix {\displaystyle \mathbf {M} }\mathbf {M} is a factorization of the form {\displaystyle \mathbf {U\Sigma V^{*}} }{\displaystyle \mathbf {U\Sigma V^{*}} }, where {\displaystyle \mathbf {U} }\mathbf {U} is an {\displaystyle m\times m}m\times m real or complex unitary matrix, {\displaystyle \mathbf {\Sigma } }\mathbf{\Sigma} is an {\displaystyle m\times n}m\times n rectangular diagonal matrix with non-negative real numbers on the diagonal, and {\displaystyle \mathbf {V} }\mathbf {V} is an {\displaystyle n\times n}n\times n real or complex unitary matrix. If {\displaystyle \mathbf {M} }\mathbf {M} is real, {\displaystyle \mathbf {U} }\mathbf {U} and {\displaystyle \mathbf {V} =\mathbf {V^{*}} }{\displaystyle \mathbf {V} =\mathbf {V^{*}} } are real orthonormal matrices.






"
Ref: https://en.wikipedia.org/wiki/Singular_value_decomposition
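A quick NumPy sketch verifying the factorization on a small example matrix:

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])          # a 3x2 example matrix

U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Rebuild M from the factors: M = U @ diag(s) @ V*
M_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(M, M_rebuilt))    # True
print(s)                            # singular values: non-negative, descending
```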

Graph Neural Networks
"Graph Neural Network is a type of Neural Network which directly operates on the Graph structure. A typical application of GNN is node classification. Essentially, every node in the graph is associated with a label, and we want to predict the label of the nodes without ground-truth .Feb 10, 2019"
https://towardsdatascience.com/a-gentle-introduction-to-graph-neural-network-basics-deepwalk-and-graphsage-db5d540d50b3

"How do Graph neural networks work?
Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth.

arxiv.org › pdf

Graph Neural Networks – arXiv

"

Deep Reinforcement Learning meets Graph Neural Networks: An optical network routing use case

https://arxiv.org/pdf/1910.07421.pdf
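To make message passing concrete, here is a minimal NumPy sketch of one graph-convolution-style propagation step, H' = σ(Â H W), where Â is the adjacency matrix with self-loops, symmetrically normalized. The graph, features, and weights are all made-up illustrative values.

```python
import numpy as np

# A tiny 4-node undirected graph; made-up features and layer weights
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(4, 3))   # node features
W = np.random.default_rng(1).normal(size=(3, 2))   # layer weights

# Add self-loops and normalize: A_hat = D^{-1/2} (A + I) D^{-1/2}
A_loop = A + np.eye(4)
d_inv_sqrt = 1.0 / np.sqrt(A_loop.sum(axis=1))
A_hat = A_loop * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# One message-passing step: aggregate neighbors, transform, apply ReLU
H_next = np.maximum(A_hat @ H @ W, 0.0)
print(H_next.shape)   # (4, 2): a new 2-dimensional state per node
```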

Bin Counting and Text Analysis







Misc. Optimization:

"Linear programming (LP, also called linear optimization) is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships.
en.wikipedia.org › wiki › Linear_programming

Linear programming – Wikipedia

"

"Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores branches of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated bounds on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm."
https://en.wikipedia.org/wiki/Branch_and_bound
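A compact sketch of the branch-and-bound pattern on a tiny 0/1 knapsack problem; the values, weights, and capacity are made-up toy data, and the bound is the classic fractional-knapsack relaxation rather than anything from the sources above.

```python
# Items as (value, weight) pairs; knapsack capacity (made-up toy data)
items = [(60, 10), (100, 20), (120, 30)]
capacity = 50

# Sort by value density so the fractional bound is easy to compute
items.sort(key=lambda vw: vw[0] / vw[1], reverse=True)

def bound(i, value, weight):
    """Optimistic bound: fill the remaining capacity fractionally."""
    for v, w in items[i:]:
        if weight + w <= capacity:
            value, weight = value + v, weight + w
        else:
            return value + v * (capacity - weight) / w
    return value

best = 0

def branch(i, value, weight):
    """Explore include/exclude decisions, pruning branches by the bound."""
    global best
    if weight > capacity:
        return                      # infeasible branch
    best = max(best, value)
    if i == len(items) or bound(i, value, weight) <= best:
        return                      # prune: cannot beat the incumbent
    v, w = items[i]
    branch(i + 1, value + v, weight + w)   # include item i
    branch(i + 1, value, weight)           # exclude item i

branch(0, 0, 0)
print(best)   # 220 for this toy instance
```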

Convex Optimization Branch and Bound Methods

https://people.orie.cornell.edu/mru8/orie6326/lectures/sp.pdf

Semidefinite Programming and Max-Cut

https://www.cs.cmu.edu/~anupamg/adv-approx/lecture14.pdf

Relating max-cut problems and binary linear feasibility problems

http://www.optimization-online.org/DB_FILE/2009/02/2237.pdf

"Branch and cut[1] is a method of combinatorial optimization for solving integer linear programs (ILPs), that is, linear programming (LP) problems where some or all the unknowns are restricted to integer values.[2] Branch and cut involves running a branch and bound algorithm and using cutting planes to tighten the linear programming relaxations. Note that if cuts are only used to tighten the initial LP relaxation, the algorithm is called cut and branch."
https://en.wikipedia.org/wiki/Branch_and_cut

Integer Programming

http://web.mit.edu/15.053/www/AMP-Chapter-09.pdf

Bang–bang solutions in optimal control

"In optimal control problems, it is sometimes the case that a control is restricted to be between a lower and an upper bound. If the optimal control switches from one extreme to the other (i.e., is strictly never in between the bounds), then that control is referred to as a bang-bang solution.
Bang–bang controls frequently arise in minimum-time problems. For example, if it is desired to stop a car in the shortest possible time at a certain position ahead of the car, the solution is to apply maximum acceleration until the unique switching point, and then apply maximum braking to come to rest exactly at the desired position."
https://en.wikipedia.org/wiki/Bang%E2%80%93bang_control


Misc. Basic Statistics for Data Science

Hypergeometric Distribution

“In probability theory and statistics, the hypergeometric distribution is a discrete probability distribution that describes the probability of k successes (random draws for which the object drawn has a specified feature) in n draws, without replacement, from a finite population of size N that contains exactly K objects with that feature, wherein each draw is either a success or a failure. In contrast, the binomial distribution describes the probability of k successes in n draws with replacement.
In statistics, the hypergeometric test uses the hypergeometric distribution to calculate the statistical significance of having drawn a specific k successes (out of n total draws) from the aforementioned population. The test is often used to identify which sub-populations are over- or under-represented in a sample. This test has a wide range of applications. For example, a marketing group could use the test to understand their customer base by testing a set of known customers for over-representation of various demographic subgroups (e.g., women, people under 30).” https://en.wikipedia.org/wiki/Hypergeometric_distribution

Binomial Distribution
“In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own boolean-valued outcome: success/yes/true/one (with probability p) or failure/no/false/zero (with probability q = 1 − p). A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution remains a good approximation, and is widely used.”
https://en.wikipedia.org/wiki/Binomial_distribution

Negative Binomial Distribution
“In probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of failures (denoted r) occurs. For example, we can define that when we throw a die and get a 6 it is a failure, while rolling any other number is considered a success, and we choose r to be 3. We then throw the die repeatedly until the third time the number 6 appears. In such a case, the probability distribution of the number of non-6s that appeared will be a negative binomial distribution.

The Pascal distribution (after Blaise Pascal) and Polya distribution (for George Pólya) are special cases of the negative binomial distribution. A convention among engineers, climatologists, and others is to use “negative binomial” or “Pascal” for the case of an integer-valued stopping-time parameter r, and use “Polya” for the real-valued case.”
https://en.wikipedia.org/wiki/Negative_binomial_distribution
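A quick sketch evaluating all three distributions with scipy.stats (the parameter values are arbitrary illustrative choices):

```python
from scipy.stats import hypergeom, binom, nbinom

# Hypergeometric: population M=50, of which n=10 are marked; N=5 draws
# without replacement. P(exactly 2 marked items in the sample):
print(hypergeom.pmf(2, 50, 10, 5))

# Binomial: n=10 independent trials, success probability p=0.3.
# P(exactly 4 successes):
print(binom.pmf(4, 10, 0.3))

# Negative binomial: number of failures before the 3rd success, p=0.5.
# P(exactly 2 failures occur first):
print(nbinom.pmf(2, 3, 0.5))
```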

Probability and Counting

“To decide “how likely” an event is, we need to count the number of times an event could occur and compare it to the total number of possible events. Such a comparison is called the probability of the particular event occurring. The mathematical theory of counting is known as combinatorial analysis”
https://www.intmath.com/counting-probability/counting-probability-intro.php

Principle of Counting

“The Fundamental Counting Principle (also called the counting rule) is a way to figure out the number of outcomes in a probability problem. Basically, you multiply the events together to get the total number of outcomes. The formula is:
If you have an event “a” and another event “b” then all the different outcomes for the events is a * b.”
https://www.statisticshowto.datasciencecentral.com/fundamental-counting-principle/

Combinatorics
https://mathigon.org/world/Combinatorics

Fundamental Principle of Counting

“The Fundamental Counting Principle states that if one event has m possible outcomes and a second independent event has n possible outcomes, then there are m x n total possible outcomes for the two events together.”
https://www.mathgoodies.com/glossary/term/Fundamental%20Counting%20Principle

Factorial
“In mathematics, the factorial of a positive integer n, denoted by n!, is the product of all positive integers less than or equal to n:

n! = n × (n−1) × (n−2) × (n−3) × ⋯ × 3 × 2 × 1”
https://en.wikipedia.org/wiki/Factorial
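A few of these counting quantities using only Python's standard library (the numbers are arbitrary examples):

```python
import math

print(math.factorial(5))   # 5! = 120 arrangements of 5 distinct items
print(math.perm(5, 2))     # ordered choices of 2 from 5 = 20
print(math.comb(5, 2))     # unordered choices of 2 from 5 = 10

# Fundamental counting principle: m outcomes then n outcomes -> m * n
shirts, pants = 4, 3
print(shirts * pants)      # 12 possible outfits
```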

Factorial with Identical Numbers

When arranging n items where some are identical, divide by the factorial of each repeat count: the number of distinct arrangements is n! / (n1! × n2! × ⋯). For example, the letters of "MISSISSIPPI" (4 I's, 4 S's, 2 P's) can be arranged in 11! / (4! × 4! × 2!) = 34,650 distinct ways.

Bayes’ theorem

“In probability theory and statistics, Bayes’ theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.” (Wikipedia)

Formula:

P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}

where:
A, B = events
P(A|B) = probability of A given B is true
P(B|A) = probability of B given A is true
P(A), P(B) = the independent probabilities of A and B
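A small worked example, the classic medical-test calculation; all the numbers are made-up for illustration.

```python
# Prior P(disease) = 1%; test sensitivity 95%; false positive rate 5%
p_disease = 0.01
p_pos_given_disease = 0.95
p_pos_given_healthy = 0.05

# Total probability of a positive test, P(B)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(disease | positive) = P(pos | disease) P(disease) / P(pos)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(p_disease_given_pos)   # ~0.161: most positives are false positives
```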


The Kalman Filter: Theory, Examples, Equations, Applications

The Kalman Filter: An algorithm for making sense of fused sensor insight

“The Kalman filter is relatively quick and easy to implement and, under certain conditions, provides an optimal estimate of the state for normally distributed noisy sensor values. Kalman was so convinced of his algorithm that he was able to inspire a friendly engineer at NASA, and so this filter helped for the first time in the Apollo Guidance Computer during the moon landings.”

https://towardsdatascience.com/kalman-filter-an-algorithm-for-making-sense-from-the-insights-of-various-sensors-fused-together-ddf67597f35e

Kalman Filter

“In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory.”
https://en.wikipedia.org/wiki/Kalman_filter
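A minimal sketch of a one-dimensional Kalman filter tracking a constant value from noisy measurements; all noise parameters and the constant-state model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0
measurements = true_value + rng.normal(scale=2.0, size=50)  # noisy sensor

x, P = 0.0, 1000.0      # initial state estimate and its variance
Q, R = 1e-5, 4.0        # process noise and measurement noise variances

for z in measurements:
    # Predict: constant model, so uncertainty only grows by process noise
    P = P + Q
    # Update: blend prediction and measurement via the Kalman gain
    K = P / (P + R)
    x = x + K * (z - x)
    P = (1 - K) * P

print(x)   # close to 10.0, with far less spread than any single reading
```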

Kalman Filter in Two Dimensions

https://www.researchgate.net/publication/3082925_Kalman_filtering_in_two_dimensions

Understanding and Applying Kalman Filtering

http://biorobotics.ri.cmu.edu/papers/sbp_papers/integrated3/kleeman_kalman_basics.pdf

Related:

https://arxiv.org/pdf/1910.03558.pdf

https://www.cse.sc.edu/~terejanu/files/tutorialEKF.pdf

https://statweb.stanford.edu/~candes/teaching/acm116/Handouts/Kalman.pdf


Why You Don’t Need to Be Bezos to Worry About Spyware

“5. Could that happen to me?

Yes, but the likelihood of that varies greatly. If you are a lawyer, journalist, activist or politician in possession of sensitive data, or an enemy of a regime that has little regard for human rights, you could be especially vulnerable to this kind of digital attack.”

https://www.bloomberg.com/news/articles/2020-02-02/why-you-don-t-need-to-be-bezos-to-worry-about-spyware-quicktake

China to inject $174 billion of liquidity on Monday as markets reopen

"Chinese authorities have pledged to use various monetary policy tools to ensure liquidity remains reasonably ample and to support firms affected by the virus epidemic, which has so far claimed 305 lives, all but one in China."
https://www.reuters.com/article/us-china-health-cenbank/china-to-inject-174-billion-of-liquidity-on-monday-as-markets-reopen-idUSKBN1ZW074?il=0
