Category: Statistics for Big Data


Important Basic Concepts: Statistics for Big Data

What are the graphical Exploratory Data Analysis (EDA) methods? First of all, EDA is about exploring the data and judging whether it is suitable for the experiment or study. Graphs and plots make patterns in the data easy to see. Patterns can be difficult to spot in the raw data …
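As a rough illustration of why a plot shows patterns that raw numbers hide, here is a minimal sketch of a text histogram in Python. The function name text_histogram, the bin width, and the sample values are all made up for this example and do not come from the article; it also assumes integer data.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Return the lines of a crude text histogram, one per non-empty bin."""
    # Map each integer value to the lower edge of the bin it falls into.
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    for low in sorted(counts):
        high = low + bin_width - 1
        lines.append(f"{low:>3}-{high:<3} {'#' * counts[low]}")
    return lines


# Hypothetical sample, invented for illustration only.
sample = [12, 15, 17, 21, 22, 23, 24, 25, 26, 28, 31, 33, 58]

for line in text_histogram(sample, bin_width=10):
    print(line)
```

In the printed bars, the cluster of values in the twenties and the lone value far from the rest are visible at a glance, which is harder to see in the raw list.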


Permanent link to this article: http://bangla.sitestree.com/important-basic-concepts-statistics-for-big-data/

Questions Answered by Exploratory Data Analysis (EDA)

What are the key properties of a dataset (center, spread, skew, probability distribution, correlation, outliers)?
1. What is the center of the data? (Mean, median, mode)
2. How much spread is there in the data? (Variance, standard deviation, quartiles, interquartile range (IQR); for example, IQR = Q3 - Q1) …
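The center and spread measures named in this excerpt can be computed with Python's standard statistics module. This is only a sketch: the data list is hypothetical, and the quartiles use the module's default "exclusive" method, which is one of several common quartile conventions and may differ from the one the article has in mind.

```python
import statistics

# Hypothetical data, invented for illustration only.
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Center of the data.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Spread of the data (sample variance and sample standard deviation).
variance = statistics.variance(data)
stdev = statistics.stdev(data)

# Quartiles: n=4 returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # IQR = Q3 - Q1

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.3f}, stdev={stdev:.3f}")
print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
```

For population rather than sample figures, statistics.pvariance and statistics.pstdev could be swapped in.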


Permanent link to this article: http://bangla.sitestree.com/questions-answered-by-exploratory-data-analysis-eda/

Best Practices in Data Preparation

1. Check data formats (image, CSV, PC, Mac, mainframe, text, structured, unstructured)
2. Verify data types (numbers, text, floats, currencies, nominal, ordinal, interval, ratio)
3. Graph your data (scatter, histogram, bar, line)
4. Verify the data (is it accurate, and does it make sense?)
5. Identify outliers (examples: very large or very …
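Two of the checks listed here, verifying data types and identifying outliers, can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the helper names non_numeric and iqr_outliers, the sample column, and the 1.5 x IQR fence, which is a common rule of thumb and not necessarily the method the article goes on to describe.

```python
import statistics


def non_numeric(values):
    """Return the entries that cannot be parsed as a number."""
    bad = []
    for v in values:
        try:
            float(v)
        except (TypeError, ValueError):
            bad.append(v)
    return bad


def iqr_outliers(numbers, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (rule-of-thumb fence)."""
    q1, _, q3 = statistics.quantiles(numbers, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in numbers if x < low or x > high]


# Hypothetical raw column as it might arrive from a CSV file.
raw = ["12", "15", "n/a", "14", "13", "16", "15", "14", "250"]

bad = non_numeric(raw)                              # type check
numbers = [float(v) for v in raw if v not in bad]   # keep parseable values
outliers = iqr_outliers(numbers)                    # outlier check

print("non-numeric entries:", bad)
print("possible outliers:", outliers)
```

A flagged value is only a candidate for review; whether it is an error or a genuine extreme observation still has to be decided by looking at the data.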


Permanent link to this article: http://bangla.sitestree.com/best-practices-in-data-preparation/