Ref: https://www.wolframalpha.com/
Feb 11
Misc. Plot – 4
Feb 11
Misc Plots
Ref: https://www.wolframalpha.com/
***. ***. ***
Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada
*** . *** *** . *** . *** . ***
Sayed Ahmed
BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
Linkedin: https://ca.linkedin.com/in/sayedjustetc
Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com
Online and Offline Training: http://Training.SitesTree.com (Also, can be free and low cost sometimes)
Facebook Group/Form to discuss (Q & A): https://www.facebook.com/banglasalearningschool
Our free or paid training events: https://www.facebook.com/justetcsocial
Get access to courses on Big Data, Data Science, AI, Cloud, Linux, System Admin, Web Development and Misc. related. Also, create your own course to sell to others. http://sitestree.com/training/
If you want to contribute to occasional free and/or low cost online/offline training or charitable/non-profit work in the education/health/social service sector, you can financially contribute to: safoundation at salearningschool.com using Paypal or Credit Card (on http://sitestree.com/training/enrol/index.php?id=114 ).
Feb 11
Misc. Plots
Ref: https://www.wolframalpha.com/
—-
Feb 10
Euclidean Norm of a Matrix
Ref: http://mathworld.wolfram.com/FrobeniusNorm.html
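The referenced page covers the Frobenius norm, which is sometimes also called the Euclidean norm of a matrix: the square root of the sum of the squared absolute values of all entries. A minimal sketch follows; the function name `frobenius_norm` and the sample matrix are illustrative choices, not taken from the reference.

```python
import math

def frobenius_norm(matrix):
    """Square root of the sum of squared absolute values of all entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in matrix for x in row))

A = [[1, 2], [3, 4]]
# 1 + 4 + 9 + 16 = 30, so the norm is sqrt(30)
print(frobenius_norm(A))
```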
***. ***. ***
Feb 09
Misc. Data Science: Clustering
"Model-based clustering assumes that the data were generated by a model and tries to recover the original model from the data. The model that we recover from the data then defines clusters and an assignment of documents to clusters. A commonly used criterion for estimating the model parameters is maximum likelihood."
Ref: Model-based clustering – Stanford NLP Group, nlp.stanford.edu › IR-book › html › htmledition › model-based-clusteri…
"Mixture models are also known as model-based clustering. Model-based clustering is a broad family of algorithms designed for modelling an unknown distribution as a mixture of simpler distributions, sometimes called basis distributions."
Ref: Model-Based Clustering – an overview | ScienceDirect Topics, www.sciencedirect.com › topics › medicine-and-dentistry › model-based…
Mixture model
"In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs."
Ref: Wikipedia
Introduction to Mixture Models
https://stephens999.github.io/fiveMinuteStats/intro_to_mixture_models.html
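To make the idea of recovering a model from data by maximum likelihood concrete, here is a small sketch of expectation-maximization (EM) for a mixture of two one-dimensional Gaussians. The choice of EM, the function names, the synthetic data, and the crude initialization are all assumptions made for illustration; none of them come from the references above.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_two_gaussians(data, iters=50):
    """EM for a two-component 1D Gaussian mixture (illustrative sketch)."""
    # Crude initial guess: place the two means at the extremes of the data.
    mu = [min(data), max(data)]
    sigma = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point.
        resp = []
        for x in data:
            p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
            p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
        # M-step: weighted maximum-likelihood update of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1 - v for v in resp]
            total = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, data)) / total
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, data)) / total
            sigma[k] = max(math.sqrt(var), 1e-6)  # guard against collapse
            weight[k] = total / len(data)
    return mu, sigma, weight, resp

# Synthetic data: two subpopulations centred at 0 and 5 (assumed for the demo).
random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + [random.gauss(5, 1) for _ in range(200)]
mu, sigma, weight, resp = fit_two_gaussians(data)

# The fitted model defines the clusters: assign each point to the likelier component.
clusters = [0 if r > 0.5 else 1 for r in resp]
print(mu, weight)
```

The recovered means should land near the two true centres, and the responsibilities give the cluster assignment without the data ever stating which subpopulation a point came from.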
***. ***. ***
Feb 09
Misc. Optimization Resources
L0 Norm, L1 Norm, L2 Norm & L-Infinity Norm
https://medium.com/@montjoile/l0-norm-l1-norm-l2-norm-l-infinity-norm-7a7d18a4f40c
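As a quick illustration of the four norms named in the title above, here is a short sketch for a plain list of numbers. The function names are illustrative. Note that the L0 "norm" simply counts non-zero entries and is not a true norm in the mathematical sense.

```python
import math

def l0_norm(v):
    """Number of non-zero entries."""
    return sum(1 for x in v if x != 0)

def l1_norm(v):
    """Sum of absolute values."""
    return sum(abs(x) for x in v)

def l2_norm(v):
    """Euclidean length."""
    return math.sqrt(sum(x * x for x in v))

def linf_norm(v):
    """Largest absolute value."""
    return max(abs(x) for x in v)

v = [3, 0, -4]
print(l0_norm(v), l1_norm(v), l2_norm(v), linf_norm(v))  # 2 7 5.0 4
```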
***
Iterative Solutions of Linear Systems
https://www.math.uh.edu/~jingqiu/math4364/iterative_linear_system.pdf
***
How statistical Norms improve modeling
https://towardsdatascience.com/norms-penalties-and-multitask-learning-2f1db5f97c1f
Project Example: Optimization:
http://www.cs.cmu.edu/~aarti/Class/10725_Fall17/past_projects.html
https://web.stanford.edu/class/ee392o/#projects
https://ece.uwaterloo.ca/~ece602/Projects/2017/Project21/main.html
Area and Project Example:
http://www.ece.tufts.edu/ee/194CO/project_14.pdf
Sensor and Optimization: Could be a good read.
http://homepages.rpi.edu/~mitchj/phdtheses/daryn/ramsdd.pdf
***. ***. ***
Feb 07
Part 1: Bootstrapping, Bagging, Random Forests
What is a Classification Tree?
"A Classification tree labels, records, and assigns variables to discrete classes. A Classification tree can also provide a measure of confidence that the classification is correct. A Classification tree is built through a process known as binary recursive partitioning."
Ref: Classification Tree | solver, www.solver.com › classification-tree
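One step of binary recursive partitioning can be sketched as follows: try every threshold on a feature and keep the one that leaves the two sides as pure as possible. Using Gini impurity as the purity measure is an assumption made here for illustration (the quoted source does not name a criterion), and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, weighted_gini) for the best split of the form x <= t."""
    best = (None, gini(ys))  # baseline: no split at all
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighbouring values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_split(xs, ys))  # (6.5, 0.0)
```

A full tree would apply the same search again to each side of the split until a stopping rule is met.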
Pros and Cons of Classification Trees
Advantages:
- Requires less effort for data preparation
- Normalization is not required
- Scaling of data is not required
- Missing values in the data do not affect tree building much
- Easy to explain
Disadvantages:
- A small change in the data can cause a large change in the tree
- Calculations can sometimes become far more complex
- Training the model can take longer
- Relatively expensive to train
What is Ensemble Learning?
"In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone." Wikipedia
blog.statsbot.co › ensemble-learning-d1dcd548e936
Ensemble Learning to Improve Machine Learning Results
"Aug 22, 2017 – Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), bias (boosting), or improve predictions (stacking)."
What is Bootstrapping?
"In statistics, bootstrapping is any test or metric that relies on random sampling with replacement. Bootstrapping allows assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates."
en.wikipedia.org › wiki › Bootstrapping_(statistics)
Bootstrapping (statistics) – Wikipedia
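As a sketch of how random sampling with replacement yields a measure of accuracy, the following computes a percentile confidence interval for a sample mean. The percentile method, the function name `bootstrap_ci`, the parameter defaults, and the sample values are assumptions made for illustration.

```python
import random
import statistics

def bootstrap_ci(sample, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement n_boot times and recompute the statistic.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_boot))
    lo = estimates[int((alpha / 2) * n_boot)]
    hi = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

sample = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 2.2, 3.0]
lo, hi = bootstrap_ci(sample)
print(statistics.mean(sample), (lo, hi))
```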
Bagging Steps:
"Suppose there are N observations and M features in training data set. A sample from training data set is taken randomly with replacement. A subset of M features are selected randomly and whichever feature gives the best split is used to split the node iteratively. The tree is grown to the largest." (Feb 19, 2018)
Ref: Bagging and Boosting – Analytics India Magazine, analyticsindiamag.com › primer-ensemble-learning-bagging-boosting
Note: selecting a random subset of the M features at each split is the random forest refinement; plain bagging only resamples the observations.
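A minimal sketch of the bagging part of these steps: train one simple model per bootstrap sample and combine them by majority vote. To keep it short, each model is a one-level tree (a decision stump) on a single feature, so the random feature-subset step is omitted. The function names and toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Pick the threshold (and a label per side) with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        l_label = Counter(left).most_common(1)[0][0]
        r_label = Counter(right).most_common(1)[0][0] if right else l_label
        errors = sum(y != l_label for y in left) + sum(y != r_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_label, r_label)
    _, t, l_label, r_label = best
    return lambda x: l_label if x <= t else r_label

def bagging_fit(xs, ys, n_models=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: n indices drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    # Majority vote across all models.
    return Counter(m(x) for m in models).most_common(1)[0][0]

xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 2.5), bagging_predict(models, 7.5))
```

Individual stumps can differ from one bootstrap sample to the next; the vote averages out that variation, which is the variance reduction mentioned earlier.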
***. ***. ***
Feb 06
KL Divergence: Entropy: Cross Entropy: Example Use Cases. Equations as well.
KL Divergence in Picture and Examples
“Kullback–Leibler divergence is the difference between the cross entropy H(P, Q) and the true entropy H(P).”
[Image: KL divergence]
“And this is what we use as a loss function while training Neural Networks. When we have an image classification problem, the training data and corresponding correct labels represent P, the true distribution. The NN predictions are our estimations Q.”
Reference for the above (including image) : https://towardsdatascience.com/entropy-cross-entropy-kl-divergence-binary-cross-entropy-cb8f72e72e65
The above URL is a pretty great read.
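The relationship between the three quantities can be checked numerically for discrete distributions. The sketch below uses natural logarithms and assumes Q is non-zero wherever P is non-zero; the function names and example distributions are illustrative.

```python
import math

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# General case: KL(P || Q) = H(P, Q) - H(P)
p = [0.5, 0.3, 0.2]
q = [0.7, 0.2, 0.1]
print(kl_divergence(p, q), cross_entropy(p, q) - entropy(p))

# Classification case: a one-hot label has zero entropy, so the
# cross-entropy loss equals the KL divergence.
label = [1.0, 0.0, 0.0]
prediction = [0.7, 0.2, 0.1]
print(cross_entropy(label, prediction), kl_divergence(label, prediction))
```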
****
Everything below is from the Internet including images and equations esp. from [1]
“What’s the KL Divergence?
The Kullback-Leibler divergence (hereafter written as KL divergence) is a measure of how a probability distribution differs from another probability distribution.
The KL divergence measures the distance from the approximate distribution Q to the true distribution P.”
[Image: KL divergence from Q to P] [1]
KL divergence is not a distance metric and is not symmetric: in general, KL(P || Q) differs from KL(Q || P).
Can be written as:
$$D_{KL}(P \,\|\, Q) = \mathbb{E}_{x \sim P}\left[-\log Q(x)\right] - \mathcal{H}(P)$$
[1]
The first term is the cross entropy between P and Q. The second term is the entropy of P.
Forward and Reverse KL
Forward KL: mean-seeking behaviour. Wherever P(.) has high probability, Q(.) must also have high probability, so Q spreads out around the mean. In the figure, P is the distribution with two peaks and Q settles around their mean.
[Image: forward KL fit] [1]
Reverse KL: mode-seeking behaviour. Wherever Q(.) has high probability, P(.) must also have high probability, so Q concentrates on a single peak of P.
[Image: reverse KL fit] [1]
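The mean-seeking and mode-seeking behaviour can be seen numerically by discretizing a two-peaked P on a grid and comparing two candidate approximations: a wide Q centred between the peaks and a narrow Q sitting on one peak. The grid, the distribution parameters, and all names below are assumptions chosen for this sketch.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

grid = [i / 10 for i in range(-80, 81)]

# P has two peaks, at -3 and +3.
p = normalize([0.5 * normal_pdf(x, -3, 0.5) + 0.5 * normal_pdf(x, 3, 0.5) for x in grid])
q_wide = normalize([normal_pdf(x, 0, 3) for x in grid])    # covers both peaks
q_mode = normalize([normal_pdf(x, 3, 0.5) for x in grid])  # sits on one peak

# Forward KL(P || Q) penalizes Q for missing mass where P is large.
forward_wide, forward_mode = kl(p, q_wide), kl(p, q_mode)
# Reverse KL(Q || P) penalizes Q for placing mass where P is small.
reverse_wide, reverse_mode = kl(q_wide, p), kl(q_mode, p)

print("forward prefers wide:", forward_wide < forward_mode)
print("reverse prefers mode:", reverse_mode < reverse_wide)
```

The same pair of candidates is ranked in opposite order by the two directions, which also shows the asymmetry of KL divergence.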
References:
[1] https://dibyaghosh.com/blog/probability/kldivergence.html
[2] https://towardsdatascience.com/light-on-math-machine-learning-intuitive-guide-to-understanding-kl-divergence-2b382ca2b2a8
*** ***
“What is KL divergence used for?
Very often in Probability and Statistics we’ll replace observed data or a complex distribution with a simpler, approximating distribution. KL Divergence helps us to measure just how much information we lose when we choose an approximation.” (May 10, 2017)
Ref: Kullback-Leibler Divergence Explained (Count Bayesie), www.countbayesie.com › blog › kullback-leibler-divergence-explained
***. ***. ***
Feb 06
Misc. : Classifier Performance and Model Selection
Cross Validation:
"Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter called k that refers to the number of groups that a given data sample is to be split into. As such, the procedure is often called k-fold cross-validation." (May 23, 2018)
Ref: A Gentle Introduction to k-fold Cross-Validation, machinelearningmastery.com › k-fold-cross-validation
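The splitting step can be sketched as a generator that yields train and test indices for each of the k groups. Shuffling the data first is common practice but is left out here for brevity; the function name is illustrative.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not (start <= i < start + size)]
        yield train_idx, test_idx
        start += size

for train_idx, test_idx in k_fold_indices(10, 3):
    print(test_idx)
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once and for training k - 1 times.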
***
Model selection – Wikipedia
“Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is well-suited to the problem of model selection. Given candidate models of similar predictive or explanatory power, the simplest model is most likely to be the best choice (Occam’s razor).”
Model Selection
http://statweb.stanford.edu/~jtaylo/courses/stats203/notes/selection.pdf
Machine Learning Model Evaluation
Evaluation methods: Holdout, Cross-Validation
Classification Metrics
- Classification Accuracy
- Confusion matrix
- Logarithmic Loss
- Area under curve (AUC)
- F-Measure
Regression Metrics
- Root Mean Squared Error
- Mean Absolute Error
Ref: https://heartbeat.fritz.ai/introduction-to-machine-learning-model-evaluation-fa859e1b2d7f
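Several of the metrics listed above are short enough to write out directly. The sketch below covers the binary case with labels 0 and 1; the function names and sample values are illustrative, and AUC is left out.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def log_loss(y_true, probs, eps=1e-15):
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # (2, 1, 1, 2)
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```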
Model Assessment and Selection:
AIC (Akaike information criterion), BIC (Bayesian information criterion), SRM (structural risk minimization)
http://people.stat.sfu.ca/~dean/labmtgs/Fall2010/HZ-ModelAsssessmentandSelection-Ch7-1.pdf
Training Error
Training error is the error that you get when you run the trained model back on the training data. Because this data has already been used to train the model, a low training error does not necessarily mean that the model will perform accurately on new data.
Test Error
Test error is the error that you get when you run the trained model on a set of data that it has never been exposed to. This data is often used to measure the accuracy of the model before it is shipped to production.
Ref: What is a training and test error? – Quora, www.quora.com › What-is-a-training-and-test-error
***
Curse of dimensionality – Wikipedia
“The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.”
Bias–variance tradeoff – Wikipedia
“The bias–variance dilemma or bias–variance problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set: The bias error is an error from erroneous assumptions in the learning algorithm.”
“Bias Variance Dilemma”
https://towardsdatascience.com/understanding-the-bias-variance-tradeoff-165e6942b229
What is bias and variance?
“Bias is the simplifying assumptions made by the model to make the target function easier to approximate. Variance is the amount that the estimate of the target function will change given different training data. Trade-off is tension between the error introduced by the bias and the variance.” (Mar 18, 2016)
Ref: Gentle Introduction to the Bias-Variance Trade-Off in Machine …, machinelearningmastery.com › gentle-introduction-to-the-bias-variance-…
ROC Curve
“A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings.”
Ref: https://en.wikipedia.org/wiki/Receiver_operating_characteristic
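The construction described above can be sketched by sorting the classifier scores from high to low and recording the true positive rate and false positive rate as the threshold moves down. The area under the resulting curve (AUC) is then computed with the trapezoid rule. Function names and sample data are illustrative, and the sketch assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs as the threshold sweeps from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all items tied at this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoid-rule area under a list of (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
print(points)
print(auc(points))  # 8 of the 9 positive/negative pairs are ranked correctly
```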
What is MLE?
“In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable.”
Ref: Maximum likelihood estimation – Wikipedia, en.wikipedia.org › wiki › Maximum_likelihood_estimation
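A small worked example may help: for coin-flip (Bernoulli) data, the parameter that maximizes the likelihood is the observed fraction of successes. The sketch below finds it by a simple grid search over candidate values; the data, the grid, and the function name are assumptions made for illustration.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-probability of the observed 0/1 data under success probability p."""
    return sum(math.log(p) if x == 1 else math.log(1 - p) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # 7 successes out of 10
grid = [i / 100 for i in range(1, 100)]  # candidate values 0.01 .. 0.99

# The MLE is the candidate under which the observed data is most probable.
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))
print(p_hat)  # 0.7, the sample proportion
```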
MLE conceptually
Important Basic Concepts: Statistics for Big Data
http://bangla.salearningschool.com/recent-posts/important-basic-concepts-statistics-for-big-data/
***. ***. ***. ***