Cross Validation:
Cross-validation is a resampling procedure used to evaluate machine learning models on a limited data sample. The procedure has a single parameter, k, which is the number of groups that the data sample is split into. As such, the procedure is often called k-fold cross-validation.
Ref: A Gentle Introduction to k-fold Cross-Validation (machinelearningmastery.com › k-fold-cross-validation, May 23, 2018)
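The splitting step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names k_fold_indices and cross_val_mse are hypothetical, and the "model" is a toy predictor that always outputs the mean of the training targets.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so the folds are random
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

def cross_val_mse(y, k):
    """Average held-out squared error of a toy model that predicts the training mean."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        prediction = sum(y[j] for j in train_idx) / len(train_idx)
        mse = sum((y[j] - prediction) ** 2 for j in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k

# Hypothetical toy targets, used only to exercise the functions.
y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
score = cross_val_mse(y, k=5)
print(f"5-fold cross-validated MSE: {score:.3f}")
```

Each sample lands in the held-out fold exactly once, so every observation is used for both training and evaluation, which is the point of the procedure when data is limited.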
***
Model selection - Wikipedia
Model selection is the task of selecting a statistical model from a set of candidate models, given data. In the simplest cases, a pre-existing set of data is considered. However, the task can also involve the design of experiments such that the data collected is well-suited to the problem of model selection. Given candidate models of similar predictive or explanatory power, the simplest model is most likely to be the best choice (Occam's razor).
Model Selection
http://statweb.stanford.edu/~jtaylo/courses/stats203/notes/selection.pdf
Machine Learning Model Evaluation
Evaluation Techniques
- Holdout
- Cross-Validation
Classification Metrics
- Classification Accuracy
- Confusion matrix
- Logarithmic Loss
- Area under curve (AUC)
- F-Measure
Regression Metrics
- Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
Ref: https://heartbeat.fritz.ai/introduction-to-machine-learning-model-evaluation-fa859e1b2d7f
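Most of the metrics listed above reduce to short formulas. The following is a rough sketch in plain Python, assuming binary labels coded as 0 and 1; all function names and the toy data are hypothetical and chosen only for illustration. AUC is left out here because it needs the full ROC curve.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall (assumes TP > 0)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical classifier output: probabilities thresholded at 0.5.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]
print(confusion_counts(y_true, y_pred), accuracy(y_true, y_pred), f_measure(y_true, y_pred))
print(log_loss(y_true, y_prob))

# Hypothetical regression output.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
print(rmse(actual, predicted), mae(actual, predicted))
```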
Model Assessment and Selection:
AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), SRM (Structural Risk Minimization)
http://people.stat.sfu.ca/~dean/labmtgs/Fall2010/HZ-ModelAsssessmentandSelection-Ch7-1.pdf
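As a rough illustration of how AIC and BIC are computed, the sketch below uses the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of fitted parameters, and n the number of observations. The Gaussian-error assumption, the function names, and the residual values are assumptions made for this example and do not come from the linked notes.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood, with the error variance set to its MLE (RSS / n)."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_lik, k):
    """Akaike Information Criterion: lower is better."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: lower is better; penalizes parameters by ln(n)."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical residuals from a simple model (k=2) and a more flexible one (k=5).
simple_residuals = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6, 0.5, -0.5]
flexible_residuals = [0.45, -0.4, 0.55, -0.5, 0.4, -0.55, 0.45, -0.5]
n = len(simple_residuals)

for name, residuals, k in [("simple", simple_residuals, 2), ("flexible", flexible_residuals, 5)]:
    ll = gaussian_log_likelihood(residuals)
    print(name, round(aic(ll, k), 3), round(bic(ll, k, n), 3))
```

Both criteria trade goodness of fit against the number of parameters; BIC penalizes extra parameters more heavily than AIC once n exceeds about 7.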
Training Error
Training error is the error that you get when you run the trained model back on the training data. Remember that this data has already been used to train the model, and this does not necessarily mean that the model, once trained, will perform accurately when applied back to the training data itself.
Test Error
Test error is the error that you get when you run the trained model on a set of data that it has never been exposed to before. This data is often used to measure the accuracy of the model before it is shipped to production.
Ref: What is a training and test error? - Quora (www.quora.com › What-is-a-training-and-test-error)
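A small sketch can make the difference between training error and test error concrete. It assumes a 1-nearest-neighbor classifier on one-dimensional inputs, chosen here only because it memorizes its training data; the function names and the toy data are hypothetical.

```python
def predict_1nn(train, x):
    """Predict the label of the training point closest to x (1-nearest neighbor)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def error_rate(train, data):
    """Fraction of points in `data` that a 1-NN model built on `train` misclassifies."""
    wrong = sum(1 for x, label in data if predict_1nn(train, x) != label)
    return wrong / len(data)

# Hypothetical (x, label) pairs; the test points were not seen during training.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (5.0, 1), (6.0, 1)]
test = [(1.5, 0), (2.8, 0), (3.6, 0), (4.4, 1), (5.5, 1), (6.5, 1)]

training_error = error_rate(train, train)  # model evaluated on data it was built from
test_error = error_rate(train, test)       # model evaluated on unseen data
print(training_error, test_error)
```

Here the training error is zero because every training point is its own nearest neighbor, while the test error is not, which is why training error alone is an optimistic measure of how the model will behave in production.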
"
***
Curse of dimensionality - Wikipedia
The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.
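One such phenomenon is that distances between random points become nearly equal as the number of dimensions grows, which weakens nearest-neighbor search. The sketch below is a small simulation, assuming points drawn uniformly from the unit hypercube; the function name relative_contrast and the sample sizes are hypothetical choices for illustration.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)

for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

In low dimensions the nearest point is much closer than the farthest one, so the ratio is large; in high dimensions the ratio shrinks toward zero, meaning the "nearest" neighbor is barely nearer than any other point.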
"
Bias–variance tradeoff - Wikipedia
The bias–variance dilemma or bias–variance problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set: The bias error is an error from erroneous assumptions in the learning algorithm."
"Bias Variance Dilemma"
https://towardsdatascience.com/understanding-the-bias-variance-tradeoff-165e6942b229
What is bias and variance?
Bias refers to the simplifying assumptions made by the model to make the target function easier to approximate. Variance is the amount that the estimate of the target function will change given different training data. The trade-off is the tension between the error introduced by the bias and the error introduced by the variance.
Ref: Gentle Introduction to the Bias-Variance Trade-Off in Machine ... (machinelearningmastery.com › gentle-introduction-to-the-bias-variance-..., Mar 18, 2016)
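For squared-error loss, the trade-off can be written as the standard decomposition of expected prediction error. The notation is an assumption made for this note and does not come from the sources above: the data follow y = f(x) + ε with noise of mean zero and variance σ², f-hat is the model fitted on a random training set, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^{2}\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^{2}}_{\text{bias}^{2}}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^{2}\right]}_{\text{variance}}
  + \underbrace{\sigma^{2}}_{\text{irreducible error}}
```

Simpler models tend to have a larger first term and a smaller second term, and more flexible models the reverse; the third term cannot be reduced by any choice of model.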
ROC Curve
"A receiver operating characteristic curve, or ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate against the false positive rate at various threshold settings."
Ref: https://en.wikipedia.org/wiki/Receiver_operating_characteristic
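The construction described above (true positive rate against false positive rate at each threshold) can be sketched directly. This is a minimal plain-Python illustration, assuming labels coded as 0 and 1 and a score where higher means "more likely positive"; the function names and data are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, from the highest score down."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical labels and classifier scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]

points = roc_points(y_true, scores)
print(points)
print(auc(points))
```

The area under this curve is the AUC metric: 1.0 means the scores separate the two classes perfectly, and 0.5 corresponds to random ranking.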
What is MLE?
"In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable."
Ref: https://en.wikipedia.org/wiki/Maximum_likelihood_estimation
MLE conceptually
Important Basic Concepts: Statistics for Big Data
http://bangla.salearningschool.com/recent-posts/important-basic-concepts-statistics-for-big-data/
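To make the idea of MLE concrete, here is a small sketch for the simplest case, assuming the data are independent Bernoulli (coin-flip) outcomes with unknown success probability p. The data and names are hypothetical. The closed-form MLE is the sample proportion, and a grid search over the log-likelihood is included only to show that maximizing the likelihood gives the same answer.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of 0/1 observations under success probability p (0 < p < 1)."""
    successes = sum(data)
    failures = len(data) - successes
    return successes * math.log(p) + failures * math.log(1 - p)

# Hypothetical observations: 7 successes out of 10 trials.
data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# Closed-form MLE for a Bernoulli parameter: the sample proportion.
closed_form = sum(data) / len(data)

# Numerical check: pick the p on a fine grid that maximizes the log-likelihood.
grid = [i / 1000 for i in range(1, 1000)]
grid_estimate = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

print(closed_form, grid_estimate)
```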
***
Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada
***
Sayed Ahmed
BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
Linkedin: https://ca.linkedin.com/in/sayedjustetc
Blog: http://Bangla.SaLearningSchool.com, http://SitesTree.com