REF: Internet and Gregory S. Nelson, The Analytics Lifecycle Toolkit: A Practical Guide for an Effective Analytics Capability, John Wiley & Sons © 2018, Chapter 6 – Problem Framing.
Data Analytics, Machine Learning, Data Science
Category: Analytics and Machine Learning Project Development
May 21
Model Selection
• Optimizations/Machine Learning/Data Mining/Deep Learning/Reinforcement Learning/Graph Mining/NLP/Genetic Algorithms
• Regression
  • Linear
  • Non-Linear
• Classification
  • Logistic Regression
    • Sigmoid: Binary
    • Softmax: Multi-Class
  • Bayes Classifier
  • SVM
• Bayesian: Regression/Classification
• Clustering
  • K-NN
  • KNN+
  • K-means, Hierarchical, Density
• Machine Learning/Data Mining/Deep Learning/Reinforcement Learning/Graph Mining/NLP
• Time Series Analysis
• Decision (Regression, …
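The sigmoid (binary) and softmax (multi-class) entries listed under logistic regression can be illustrated with a minimal sketch. The function names and the standard-library-only implementation are assumptions made for illustration; they are not taken from the original post.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1) for a binary label."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def softmax(scores):
    """Map a list of class scores to probabilities that sum to 1."""
    # Subtracting the maximum score leaves the result unchanged
    # and avoids overflow in math.exp.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(sigmoid(0.0))              # probability of the positive class
    print(softmax([1.0, 2.0, 3.0]))  # one probability per class
```

With two classes, softmax over the scores `[0, z]` gives the same positive-class probability as `sigmoid(z)`, which is why the binary case is usually written with the sigmoid alone.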
May 21
Model Selection for your Project
Potential Models
• Statistical Models
  • Parametric and Non-Parametric
• Mathematical Model (Optimization)
• Machine Learning
• Data Mining
• Deep Learning
• Reinforcement Learning
• Graph Mining
• NLP
• Optimization
  • Genetic Algorithm
• Association
  • Basket Association
  • Apriori Algorithm
• Supervised
  • Classification
  • Regression
• Unsupervised
  • Clustering/Customer Segmentation
• Reinforcement
  • Learn a policy (interactively)
  • Game Playing
  • Robot in …
May 21
Possible Data Analytics Project Goals
• Examine relations
• Test Hypothesis
• Validate
• Find groups/classes/rules
• Learn a policy
• Maximize Reward interactively
• Predict (Class or Value)
• Forecast (numeric, sales)
• Compare
• Classify
• Cluster
May 21
Evaluating Your Data Analytics Project Outcome
Regression Projects
• R-squared (goodness of fit)
• RMSE (root mean squared error)
Classification Projects
• Confusion Matrix
• ROC
• Accuracy, Recall, Precision
Reinforcement Learning (RL) Projects
• Cumulative reward
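The regression and classification measures above can be computed directly from their definitions. The following is a minimal sketch using only the Python standard library; the function names and the choice of `1` as the positive class are assumptions for illustration, and in practice a library such as scikit-learn would normally be used instead.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when a class is never predicted/seen.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    print(r_squared([1, 2, 3], [1.1, 1.9, 3.2]))
    print(rmse([1, 2, 3], [1.1, 1.9, 3.2]))
    print(classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]))
```

Accuracy alone can mislead when classes are imbalanced, which is one reason precision and recall are listed alongside it.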
May 21
Dimensionality Reduction
May 21
Initial Analysis of Text and Image Data (Data Analytics and ML Projects)
Initial Analysis of Text Data
• Stop word filter
• Lemmatization
• Part-of-speech (POS) tagging
• Vocabulary Analysis
Image Data: Initial Analysis
• Fix image size, ratios
• Image Scaling
• Convert to grayscale
• Standardize
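Two of the text steps (stop-word filtering and vocabulary analysis) and one of the image steps (grayscale conversion) are simple enough to sketch without external libraries. Everything here is illustrative: the tiny `STOP_WORDS` set, the regular-expression tokenizer, and the nested-list image representation are assumptions, and the grayscale weights are the common ITU-R BT.601 luma coefficients rather than anything specified in the post. Lemmatization and POS tagging need a linguistic library and are left out.

```python
import re
from collections import Counter

# Illustrative stop-word list only; real projects use a much larger one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def vocabulary_stats(tokens, top_n=3):
    """Basic vocabulary analysis: token count, vocabulary size, top terms."""
    counts = Counter(tokens)
    return {
        "n_tokens": len(tokens),
        "vocab_size": len(counts),
        "top": counts.most_common(top_n),
    }


def to_gray(image):
    """Convert rows of (r, g, b) pixels to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


if __name__ == "__main__":
    tokens = remove_stop_words(tokenize("The cat and the hat are in the house of the cat"))
    print(vocabulary_stats(tokens))
    print(to_gray([[(255, 0, 0), (0, 255, 0)]]))
```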
May 21
Data Requirements for Data Analytics Projects
Data
• Dataset Characteristics
  • Large scale, real, representative, relevant features, balanced classes, unit relevant
• Adapting data/dataset for the project
  • Clean, normalize/standardize, collect more data, and in particular collect more data of the missing type
• Data Suitability for the project
  • Check the R-squared measure
  • Check for bias and variance
  • Do Exploratory Analysis
  • …
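The "normalize/standardize" and "balanced classes" items above can be made concrete with a short sketch. It assumes numeric features held in plain Python lists and uses z-score standardization and min-max normalization as the two usual readings of those terms; the function names are hypothetical.

```python
import statistics
from collections import Counter


def standardize(values):
    """Z-score standardization: zero mean, unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def class_balance(labels):
    """Share of each class label, for spotting imbalanced classes."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
    print(min_max_normalize([10, 20, 30]))
    print(class_balance(["a", "a", "a", "b"]))
```

A class share far from the others in the `class_balance` output is a signal to collect more data of that class before modeling.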
May 21
Initial and Exploratory Analysis for Data Analytics Projects
The goal is a thorough understanding of the data. There are two types of analysis:
• Initial Analysis
• Exploratory Analysis
Initial Analysis: Univariate Analysis
• Determine the dependent (target) variable
• Assign correct data types and appropriate column names
• Address inconsistencies, missing values, and outliers
• Address categorical variables with too many levels
• Understand the distributions of …
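Three of the univariate checks above (missing values, outliers, and categorical variables with too many levels) can be sketched per column. This is only one possible approach: the 1.5 × IQR outlier rule, the `None` marker for missing values, and the `max_levels=20` threshold are assumptions chosen for illustration, not rules stated in the post.

```python
import statistics


def count_missing(column):
    """Count entries marked as missing (None is assumed as the marker)."""
    return sum(1 for v in column if v is None)


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (a common convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def has_too_many_levels(column, max_levels=20):
    """True if a categorical column has more distinct levels than max_levels."""
    levels = {v for v in column if v is not None}
    return len(levels) > max_levels


if __name__ == "__main__":
    print(count_missing([1, None, 3, None]))
    print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
    print(has_too_many_levels(["red", "green", "blue"], max_levels=2))
```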