Important Basic Concepts: Statistics for Big Data
Graphical Exploratory Data Analysis (EDA) Methods
EDA is about exploring the data and judging whether it is suitable for the experiment or study. Graphs and plots make patterns in the data easy to see, whereas patterns can be difficult to spot in the raw data …
Category: Statistics for Big Data
Sep 15
Questions Answered by Exploratory Data Analysis (EDA)
What are the key properties of a dataset (center, spread, skew, probability distribution, correlation, outliers)?
1. What is the center of the data? (Mean, median, mode)
2. How much spread is there in the data? (Variance, standard deviation, quartiles, interquartile range (IQR); example: IQR = Q3 - Q1) …
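As a rough illustration of the center and spread questions above, the measures can be computed with Python's standard `statistics` module. This is only a sketch: the sample values and the `summarize` helper are made up for the example, and the quartiles depend on the interpolation method used (here the module's default, "exclusive").

```python
import statistics


def summarize(data):
    """Return basic center and spread measures for a list of numbers."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (the median), and Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "std_dev": statistics.stdev(data),      # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                         # IQR = Q3 - Q1
    }


# Hypothetical sample, used only to exercise the function.
sample = [2, 4, 4, 5, 7, 9, 10, 12]
summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Other tools (for example spreadsheet software or NumPy) may use a different quartile convention, so small differences in Q1, Q3, and the IQR are expected on short datasets.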
Sep 15
Best Practices in Data Preparation
1. Check data formats (image, CSV, PC, Mac, mainframe, text, structured, unstructured)
2. Verify data types (numbers, text, floats, currencies, nominal, ordinal, interval, range)
3. Graph your data (scatter, histogram, bar, line)
4. Verify the data (data accuracy, data makes sense)
5. Identify outliers (examples: very large or very …
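One possible way to approach the outlier step is sketched below. The excerpt does not say which rule the post uses, so this example assumes the common 1.5 × IQR fence (Tukey's rule) as a stand-in; the `flag_outliers` helper, its `k` parameter, and the sample readings are hypothetical.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the values lying outside the fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: the 1.5 * IQR rule is used here for illustration only;
    it is not necessarily the method the original post describes.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep the original order so flagged values are easy to trace back.
    return [v for v in values if v < lower or v > upper]


# Hypothetical readings with one very large and one very small value.
readings = [10, 12, 11, 13, 12, 11, 250, 12, 10, -90]
outliers = flag_outliers(readings)
print(outliers)
```

Flagged values are candidates for review rather than automatic deletion: checking them against the source is part of verifying that the data makes sense.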