Initial and Exploratory Analysis for Data Analytics Projects

Goal: to gain a thorough understanding of the data.

Two Types:

• Initial Analysis

• Exploratory Analysis

Initial Analysis:

  • Univariate
  • Bivariate
  • Multivariate

Univariate Analysis

• Determine the dependent (target) variable

• Assign correct data types and appropriate column names

• Address inconsistencies, missing values, and outliers

• Handle categorical variables with too many levels (for example, by grouping rare levels)

• Understand the distribution of each variable and whether it is a good fit for the project

• Check for imbalance in the dependent variable

• Handle time variables (dates, timestamps)

• Univariate visualizations

• Build a detailed data dictionary

• Apply a low variance filter
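
Several of the univariate checks above can be sketched with pandas. This is a minimal illustration, not a prescribed workflow: the toy data, the column names (age, city, signup, constant, churned), the 1.5 * IQR outlier rule, and the zero-variance threshold are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical toy data; every column name here is illustrative only.
df = pd.DataFrame({
    "age": ["25", "31", None, "47", "29", "120"],
    "city": ["NY", "ny", "LA", "LA", "SF", "NY"],
    "signup": ["2024-01-05", "2024-02-11", "2024-02-20",
               "2024-03-02", "2024-03-15", "2024-04-01"],
    "constant": [1, 1, 1, 1, 1, 1],
    "churned": [0, 0, 0, 0, 1, 0],  # assumed target variable
})

# Assign correct data types and fix inconsistent category labels.
df["age"] = pd.to_numeric(df["age"], errors="coerce")
df["city"] = df["city"].str.upper().astype("category")
df["signup"] = pd.to_datetime(df["signup"])  # time variable

# Missing values per column.
missing = df.isna().sum()

# Outliers flagged with the 1.5 * IQR rule (one common convention).
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)]

# Number of levels in a categorical variable.
n_levels = df["city"].nunique()

# Imbalance in the dependent variable: share of each class.
class_share = df["churned"].value_counts(normalize=True)

# Low variance filter: numeric columns whose variance is zero.
numeric = df.select_dtypes(include="number")
low_variance = [c for c in numeric.columns if numeric[c].var() == 0]

print(missing, outliers, n_levels, class_share, low_variance, sep="\n\n")
```

In a real project the thresholds (IQR multiplier, variance cutoff, acceptable number of levels) would be chosen per dataset rather than fixed as they are here.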

Bivariate Analysis

• Pairwise relations

• Pairwise visualizations

• Correlation analysis
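
As a rough sketch of the pairwise checks listed above, the following pandas snippet computes a correlation matrix, a group-wise mean, and a contingency table. The data and the column names (hours, score, segment, passed) are hypothetical, and Pearson correlation is only one possible choice of measure.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6],
    "score": [52, 55, 61, 64, 70, 75],
    "segment": ["a", "a", "b", "b", "a", "b"],
    "passed": [0, 0, 1, 1, 1, 1],
})

# Numeric vs. numeric: pairwise Pearson correlation matrix.
corr = df[["hours", "score", "passed"]].corr(method="pearson")

# Numeric vs. categorical: mean of a numeric variable within each level.
mean_by_segment = df.groupby("segment")["score"].mean()

# Categorical vs. categorical: contingency table of counts.
table = pd.crosstab(df["segment"], df["passed"])

print(corr, mean_by_segment, table, sep="\n\n")
```

Pairwise plots are left out here to keep the sketch free of plotting dependencies.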

Multivariate Analysis

• Multivariate relations

• Statistical tools
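
The outline does not name specific statistical tools. As one possible example, the variance inflation factor (VIF) measures how well each variable is explained by all the others together, which a pairwise correlation can miss. The sketch below uses only NumPy; the function name vif and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of a 2-D array X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other columns
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()  # R^2 of column j on the others
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Synthetic data: x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.1, size=200)

vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

A VIF close to 1 suggests a variable is not explained by the others; large values point to a multivariate dependency worth investigating.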

Exploratory Analysis

• Normalizing

• Subsetting the data

• Clustering
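
The three steps above can be sketched together: normalize the features, cluster the rows, then subset the data by cluster. This is a minimal NumPy illustration under stated assumptions: z-score scaling stands in for normalizing, a plain Lloyd's k-means with a deterministic farthest-point initialization stands in for clustering, and the data is synthetic. The function names standardize and kmeans are hypothetical.

```python
import numpy as np

def standardize(X):
    """Z-score each column: zero mean, unit standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    # Deterministic farthest-point initialization, chosen for reproducibility.
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Synthetic data: two well-separated groups on very different scales.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=[0, 0], scale=[1, 100], size=(50, 2))
group_b = rng.normal(loc=[20, 2000], scale=[1, 100], size=(50, 2))
X = standardize(np.vstack([group_a, group_b]))

labels, centroids = kmeans(X, k=2)

# Subsetting: keep only the rows in the same cluster as the first row.
subset = X[labels == labels[0]]
print(len(subset), centroids)
```

Without the normalizing step, the second feature's larger scale would dominate the distance calculation, which is one reason scaling usually precedes clustering.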

Others

• Decision rules, association rules, n-grams

• Time series analysis
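
To make these techniques concrete, here is a small standard-library sketch of n-gram counting, the support and confidence of a single association rule, and a moving average as a very basic time series smoother. The helper names (ngrams, rule_metrics, moving_average) and the sample data are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of consecutive n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if (a | c) <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence

def moving_average(series, window):
    """Moving average; output has len(series) - window + 1 values."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

# n-grams: count bigrams in a token sequence.
tokens = "the cat sat on the cat mat".split()
bigram_counts = Counter(ngrams(tokens, 2))

# Association rule: how often does "bread" imply "milk"?
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = rule_metrics(baskets, {"bread"}, {"milk"})

# Time series: smooth a short series with a window of 3.
smoothed = moving_average([10, 12, 14, 16, 18], window=3)

print(bigram_counts.most_common(1), support, confidence, smoothed)
```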

Data Analytics, Machine Learning, Data Science