Best Practices in Data Preparation
1. Check data formats (image, CSV, text; PC, Mac, or mainframe origin; structured or unstructured)
2. Verify data types (numbers, text, floats, currencies) and measurement levels (nominal, ordinal, interval, ratio)
3. Graph your data (scatter plot, histogram, bar chart, line chart)
4. Verify the data (check that values are accurate and make sense)
5. Identify outliers (for example, values much larger or much smaller than the rest)
6. Deal with missing values
7. Check your assumptions about the data distribution (normal, Poisson)
8. Back up and document everything that you do
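
As a rough illustration of steps 1, 2, 5, and 6, here is a minimal Python sketch that uses only the standard library. The sample data, the column names (id, age, income), and the helper names (load_rows, to_float, column_report) are all made up for this example, and the 1.5 * IQR rule is just one common convention for flagging outliers, not the only one.

```python
import csv
import io
import statistics

# Hypothetical sample data: column names and values are invented for illustration.
RAW_CSV = """id,age,income
1,34,52000
2,29,48000
3,,51000
4,41,50500
5,38,49000
6,36,950000
7,abc,47000
8,33,53000
"""


def load_rows(text):
    """Step 1: confirm the data parses as CSV; returns a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def to_float(value):
    """Steps 2 and 6: return a float, or None if the value is missing or not numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def column_report(rows, column):
    """Count missing/invalid entries and flag outliers with the 1.5 * IQR rule (step 5)."""
    parsed = [to_float(row[column]) for row in rows]
    valid = [v for v in parsed if v is not None]
    q1, _, q3 = statistics.quantiles(valid, n=4)  # quartiles of the valid values
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "missing_or_invalid": len(parsed) - len(valid),
        "outliers": [v for v in valid if v < low or v > high],
    }


rows = load_rows(RAW_CSV)
age_report = column_report(rows, "age")
income_report = column_report(rows, "income")
print("age:", age_report)
print("income:", income_report)
```

In this sample, the age column has one empty and one non-numeric entry, and the income column contains one value far above the rest. What to do with such entries (drop, impute, or investigate) depends on the data, so the sketch only reports them.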
Reference: Anderson A., Semmelroth D.
Sayed Ahmed
LinkedIn: https://ca.linkedin.com/in/sayedjustetc
Blog: http://sitestree.com, http://bangla.salearningschool.com