Important Basic Concepts: Statistics for Big Data

Graphical Exploratory Data Analysis (EDA) Methods
EDA is about exploring the data and judging whether it is suitable for the experiment or study. Patterns and fitness are hard to see in raw data, whereas graphs and plots reveal them quickly.

Graphical Methods can be as follows:
1. Scatter Plots
2. Histograms
3. Box Plots
4. Normal Probability plots

Quantitative Exploratory Data Analysis Techniques:

1. Interval Estimation (Ranges)
2. Hypothesis testing (Null Hypothesis, Alternate Hypothesis)

1. Interval Estimation (Ranges): Create a range of values within which a variable is likely to fall. A confidence interval for the mean is an interval estimate: a range that is expected, at a stated confidence level, to contain the population mean.

2. Hypothesis Testing: Test a proposition about a population using sample data.

Example: Test whether the mean age of the Canadian population is 53.

It is a multi-step process. The steps can be as follows:

1. Null Hypothesis: The statement being tested; it is assumed to be true unless the data give strong evidence against it
2. Alternate Hypothesis: The hypothesis that will be accepted if the null hypothesis is rejected
3. Significance Level: The probability of rejecting the null hypothesis when it is actually true, chosen before the test (e.g. 0.05, which corresponds to a 95% confidence level)
4. Test Statistic: A numerical measure of how consistent the sample data is with the null hypothesis
5. Critical Value: If the test statistic is more extreme than the critical value, the null hypothesis is rejected
6. Decision: The decision is made by comparing the test statistic with the critical value
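
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ages are made-up numbers, the population standard deviation (sigma = 12) is assumed to be known so that a z-test applies, and the function name z_test_mean is hypothetical. With a small sample and unknown standard deviation, a t-test would be the usual choice instead.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_mean(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    x_bar = mean(sample)
    std_err = sigma / sqrt(len(sample))
    z = (x_bar - mu0) / std_err                    # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for alpha
    reject = abs(z) > z_crit                       # decision
    # Matching confidence interval for the mean (an interval estimate)
    ci = (x_bar - z_crit * std_err, x_bar + z_crit * std_err)
    return {"mean": x_bar, "z": z, "critical": z_crit,
            "reject_null": reject, "ci": ci}

# Made-up sample of ages; H0: mean age is 53, H1: mean age is not 53
ages = [48, 55, 61, 39, 52, 58, 44, 50, 47, 56]
result = z_test_mean(ages, mu0=53, sigma=12)
print(result)
```

In this made-up sample the test statistic is less extreme than the critical value, so the null hypothesis is not rejected, and 53 lies inside the 95% confidence interval.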

Some Basic Probability Distributions:

Binomial Distribution: Describes the number of successes in a fixed number of independent trials, where each trial can have only one of two outcomes

Poisson Distribution: Describes the likelihood of a given number of events occurring during a time interval (e.g. customers arriving at your shop in an hour)

Normal Distribution: Symmetrical, bell-shaped data. The probability that a variable falls a given distance below the mean equals the probability that it falls the same distance above the mean.

t Distribution: Similar to the normal distribution but with heavier tails, so extremely large or extremely small values are more likely than under the normal distribution. Useful when the sample size is small and the population standard deviation is unknown.

Chi Square Test: A test to see whether a population follows a particular distribution, such as the normal distribution.

The F Distribution: Used to test whether two datasets come from the same population by comparing their variances.
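
As a small illustration of the first three distributions, the sketch below evaluates their standard formulas with the Python standard library. The helper names (binomial_pmf, poisson_pmf) and all numbers are assumptions chosen for the example, not values from the reference.

```python
from math import comb, exp, factorial
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval when lam events are expected on average)."""
    return exp(-lam) * lam**k / factorial(k)

# Binomial: exactly 3 heads in 10 fair coin tosses
p_heads = binomial_pmf(3, 10, 0.5)

# Poisson: exactly 2 customers in an hour when 4 arrive per hour on average
p_customers = poisson_pmf(2, 4.0)

# Normal: symmetry around the mean (hypothetical mean 100, std. dev. 15)
scores = NormalDist(mu=100, sigma=15)
below = scores.cdf(85)         # P(value <= mean - 15)
above = 1 - scores.cdf(115)    # P(value >= mean + 15)

print(p_heads, p_customers, below, above)
```

The last two numbers come out equal, which is the symmetry property described for the normal distribution.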

Related Concepts:

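What is a Z Score?
The number of standard deviations by which a value lies above or below the mean of its distribution: z = (x - mean) / standard deviation.
Helps to compare two values that come from two different normal distributions. For a normal distribution, the z score can also be converted into the probability of observing a value that small or smaller.

Another definition: it is a measure of how a value is related to the mean.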
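
A minimal sketch of how a z score lets two values from different normal distributions be compared. The exam marks, means, and standard deviations are invented for the example, and z_score is a hypothetical helper name.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical marks on two exams graded on different scales
z_math = z_score(82, mean=70, std_dev=8)
z_physics = z_score(75, mean=60, std_dev=12)

# For normally distributed marks, the z score maps to a cumulative probability
share_below_math = NormalDist().cdf(z_math)

print(z_math, z_physics, share_below_math)
```

Here the math mark is the stronger result relative to its own distribution, even though both marks are above their means.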

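Chi Square test for Normal Distribution:
Null Hypothesis (goodness of fit): the data follows the specified distribution, here the normal distribution. If the hypothesis is not rejected, the data is consistent with a normal distribution. The Chi Square test is also used to test independence, where the null hypothesis is that no relation exists between two categorical variables, i.e. they are independent.

What is the p value in a Chi Square test:
The p value is the probability of getting a result at least as extreme as the observed one if the null hypothesis is true. It helps to understand the significance of the result. A small p value means strong evidence against the null hypothesis.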
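
To make the statistic and its p value concrete, here is a hedged sketch of the Chi Square test of independence for a 2x2 table. The counts are invented, chi_square_2x2 is a hypothetical name, no continuity correction is applied, and the p value formula used is valid only for one degree of freedom (which is what a 2x2 table has).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Chi Square test of independence for observed counts [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a Chi Square variable with 1 degree of freedom
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Made-up counts: rows = two groups, columns = outcome yes / no
observed = [[30, 10], [20, 40]]
chi2, p_value = chi_square_2x2(observed)
print(chi2, p_value)
```

With these counts the p value is far below 0.05, so the null hypothesis of independence would be rejected at the 5% significance level.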

Reference: Anderson A., Semmelroth D., Statistics for Big Data

Sayed Ahmed

Linkedin: https://ca.linkedin.com/in/sayedjustetc

Blog: http://sitestree.com, http://bangla.salearningschool.com

Questions Answered by Exploratory Data Analysis (EDA)

What are the key properties of a dataset? (Center, spread, skew, probability distribution, correlation, outliers)

1. What is the center of the data? (mean, median, mode)
2. How much spread is there in the data? (variance, standard deviation, quartiles, interquartile range (IQR); example: IQR = Q3 – Q1)
3. Is the data skewed? Mean > median: positively skewed; mean = median: symmetrical; mean < median: negatively skewed
4. What distribution does the data follow? Is the data normally distributed?
5. Are the elements in the dataset uncorrelated? i.e. do two variables move together, positively or negatively, and is the relationship linear or non-linear?
6. Does the center of the data change over time? Example: for time series data, does the mean change over time?
7. Does the spread of the dataset change over time? Example: for time series data, does the variance change over time?
8. Are there outliers in the data?
9. Does the data conform to your assumptions? Example assumptions: normally or close to normally distributed, constant parameters, no outliers, members are independent or nearly independent. Or, on the contrary, variance increases over time, or there are several outliers in the data.
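
Several of these questions (center, spread, skew, correlation) can be answered with the Python standard library. The sketch below is illustrative only: the data is made up, summarize and pearson are hypothetical helper names, and the mean-versus-median comparison is the rule of thumb from item 3 rather than a formal skewness statistic.

```python
import statistics as st
from math import sqrt

def summarize(values):
    """Answer the center / spread / skew questions for one numeric variable."""
    q1, _, q3 = st.quantiles(values, n=4)   # quartile cut points
    mean, median = st.mean(values), st.median(values)
    if mean > median:
        skew = "positive"
    elif mean < median:
        skew = "negative"
    else:
        skew = "symmetrical"
    return {
        "mean": mean, "median": median, "modes": st.multimode(values),
        "variance": st.variance(values), "std_dev": st.stdev(values),
        "iqr": q3 - q1, "skew": skew,
    }

def pearson(xs, ys):
    """Pearson correlation: +1 / -1 for a perfect linear relation, near 0 for none."""
    mx, my = st.mean(xs), st.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / spread

daily_orders = [12, 15, 14, 10, 13, 15, 16, 11, 14, 60]   # made-up data
summary = summarize(daily_orders)
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])

print(summary)
print(r)
```

The single large value (60) pulls the mean above the median, so the rule of thumb reports positive skew.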

Reference: Anderson A., Semmelroth D., Statistics for Big Data


Best Practices in Data Preparation

1. Check data formats (image, CSV, PC, Mac, mainframe, text, structured, unstructured)
2. Verify data types (numbers, text, floats, currencies; nominal, ordinal, interval, ratio)
3. Graph your data (scatter, histogram, bar, line)
4. Verify the data (the data is accurate and makes sense)
5. Identify outliers (e.g. values that are very large or very small compared to the rest)
6. Deal with missing values
7. Check your assumptions about the data distribution (normal, Poisson)
8. Back up and document everything that you do
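
Steps 2, 5, and 6 can be sketched as follows. Everything here is an assumption made for illustration: the inline CSV text, the column names, the helper names load_ages and iqr_outliers, and the common 1.5 x IQR fence for outliers. How missing values are actually handled (drop, impute, flag) depends on the study.

```python
import csv
import io
import statistics as st

# Hypothetical raw data with one missing age, one bad type, and one extreme value
RAW = """id,age,city
1,34,Toronto
2,,Ottawa
3,29,Toronto
4,31,Montreal
5,abc,Toronto
6,33,Ottawa
7,30,Montreal
8,32,Toronto
9,290,Ottawa
"""

def load_ages(text):
    """Return (ages, problems); problems lists rows with a missing or non-numeric age."""
    ages, problems = [], []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row["age"].strip()
        if raw == "":
            problems.append((row["id"], "missing"))
        elif not raw.isdigit():
            problems.append((row["id"], "not a number"))
        else:
            ages.append(int(raw))
    return ages, problems

def iqr_outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = st.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

ages, problems = load_ages(RAW)
outliers = iqr_outliers(ages)
print(problems)
print(outliers)
```

Whether 290 is a typing error or a real value is exactly the kind of question step 4 (verify the data) is meant to settle, so the sketch only flags it.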

Reference: Anderson A., Semmelroth D.

Machine Learning and Security Basics

Objective: Define and describe key concepts on what Machine Learning can do for Security, and how

First of all, what are security threats?
Malware, Worm, Trojan, Spyware, Adware, Ransomware, Rootkit, Backdoor, Bot, Botnet, Exploit, Scanning (port scanning), Sniffing (silently observing and recording), Keylogger, Spam, Login Attack, Account Takeover, Phishing (masquerading), Spear Phishing, Social Engineering, Incendiary Speech, Denial of Service, Distributed Denial of Service, Advanced Persistent Threats (APTs), Zero-Day Vulnerability.

Cyber Threat Taxonomy:
Information gathering (Scan, Sniff, Social engineering)
Intrusion Attempts
Intrusions (Account Takeover, Privilege escalation, bot, application compromise)
Fraud (Unauthorized use of Resources)
Abusive Content (Spam, …)
Malware (Virus, Trojan)
Availability Attacks (DoS)

What is the motivation behind cyber attacks?
There can be many reasons, including monetary gain, power, and political gain or control.

What is Machine Learning?
Simply put, programs that learn from data, adapt as the data changes, and build models that apply that learning toward a goal (such as preventing security attacks).

How and where can Machine Learning help in Security?
Some examples: pattern recognition and anomaly detection, malware and botnet detection and analysis, and spam fighting.

How does Machine Learning help in Security?
For example, by using classification and clustering to sort events, incidents, and content into security categories, classes, or levels, and then taking proper action to mitigate the effect or to prevent future incidents.

Where and how can Machine Learning help with Security?
Machine Learning can help in anomaly detection, malware analysis, network traffic analysis, and protecting the consumer web. A Machine Learning system also has to protect itself against attacks and adapt to them.

What are the Machine Learning Approaches and Concepts that can help with Security?
These are primarily classification, clustering, and prediction approaches and algorithms. Machine Learning analyzes past logs, emails, login attempts, and inbound and outbound requests, finds patterns, and then builds models based on those patterns (i.e. find, customize, and apply the proper ML algorithms). The model is then fine-tuned using training data and test data. The following algorithms can be applied to Security as well: Logistic Regression, Decision Trees, Decision Forests, Support Vector Machines, Naive Bayes, KNN, Neural Networks.

However, feature selection and feature engineering, along with selection of the proper model and algorithms, will be critical.
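
As one concrete instance of classification applied to spam fighting, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written with the standard library only. The class name NaiveBayesText, the six training messages, and the whitespace tokenization are all assumptions made for the example; a real system would use far more data, engineered features, and a held-out test set.

```python
from collections import Counter
from math import log

class NaiveBayesText:
    """Multinomial Naive Bayes over words, with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = log(doc_count / total_docs)          # log prior
            for word in text.lower().split():
                if word in self.vocabulary:              # skip unseen words
                    score += log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

# Tiny made-up training set
train_texts = [
    "win free money now", "free prize claim now",
    "urgent verify your account password",
    "meeting agenda for monday", "lunch at noon tomorrow",
    "project report attached for review",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("claim your free money"))
print(model.predict("agenda for the project meeting"))
```

The features here are raw word counts; this is where the feature selection and feature engineering mentioned above would have the largest effect.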

Will continue…

Hacking Basics: Short Overview: A Short Breadth

Audience: Ethical hackers, aspiring security researchers, and anyone interested in the basics of security

Objective: Identify the key hacking concepts. Primarily just the names.

Key Topics: Password Cracking, Keyloggers, Trojans, Viruses, Worms, Rootkits, Social Engineering, Privilege Escalation, Denial of Service Attack, Botnet, Alternate Data Streams, Steganography, Covering Tracks.

Password cracking techniques that hackers use, or that can be used as part of ethical hacking:

Dictionary Attack, Brute Force Attack (tries combinations of letters, digits, and special characters rather than dictionary words), Hybrid Attack (builds on a dictionary attack by adding digits or symbols to words, such as world123), Syllable Attack (combines dictionary and brute force attacks), Rule-Based Attack (when some hints about the password are known, rules are used to generate candidate passwords), Rainbow Table Attack (uses pre-generated hashes of combinations/permutations of a character set), Distributed Password Attack (uses multiple systems to crack the password), Non-Technical Attack (social engineering, shoulder surfing to watch the typing)
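
To show why a password such as world123 is weak, the sketch below audits a password hash you own against a dictionary plus hybrid variants. It is a toy under stated assumptions: the four-word list, the suffix list, the helper names, and the use of unsalted SHA-256 are all chosen for illustration. Real systems should store passwords with a salted, deliberately slow hash, which also makes pre-generated rainbow tables ineffective.

```python
import hashlib
from itertools import product

WORDLIST = ["password", "world", "dragon", "letmein"]   # tiny illustrative list
SUFFIXES = ["", "1", "123", "2020", "!"]                # hybrid additions

def sha256_hex(text):
    """Unsalted SHA-256 hex digest (used here only to keep the demo short)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def candidates(words, suffixes):
    """Yield each dictionary word and its hybrid variants (word + suffix)."""
    for word, suffix in product(words, suffixes):
        yield word + suffix                  # empty suffix = plain dictionary word
        yield word.capitalize() + suffix

def audit_hash(target_hash, words=WORDLIST, suffixes=SUFFIXES):
    """Return the guess matching target_hash, or None if no candidate matches."""
    for guess in candidates(words, suffixes):
        if sha256_hex(guess) == target_hash:
            return guess
    return None

weak = audit_hash(sha256_hex("world123"))
strong = audit_hash(sha256_hex("T7#qLz!92vKp"))
print(weak, strong)
```

The first hash is recovered after a handful of guesses, while the random password is not in the candidate space at all, which is the practical difference between the dictionary/hybrid attacks and a full brute force attack.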

Keyloggers: Used to steal passwords. Types: BIOS-embedded (sits in the BIOS), keylogger keyboard (the keyboard is replaced with one that has internal memory), external keyloggers (fit into the PS/2 or serial keyboard port to intercept keystrokes), software keyloggers (most common, installed on the target system)

Trojans: Hide inside trusted applications. They can give the attacker full access to the system and can steal passwords and any other data. Two types: overt channel (uses a genuine channel to communicate), covert channel (uses a back/secret door)

Types of Trojans: Command shell Trojans (give the attacker command-line access), document Trojans (hide inside Word or PDF files), email Trojans (the attacker sends commands through email messages), botnet Trojans (use bots to attack; bots are usually used for DoS attacks)

Viruses: A malicious program in itself. The life cycle of a virus: design/development, infection and replication (it replicates itself on the target system to attack), detection (it can be detected by anti-virus software), anti-virus signature development, eradication

Types of Viruses: System/boot virus, file virus, macro virus (e.g. Word and Excel macros), polymorphic virus (changes its own code and behavior).

Computer Worms: Similar to viruses, but they self-replicate aggressively and try to infect as many systems as possible, as quickly as possible.

Rootkits: Affect the operating system and change the kernel. Hard to remove. They change system-level code to remain undetected by anti-virus software.

About Anti-virus Software: Not all anti-virus software can detect all viruses. VirusTotal is a service that scans submitted files using multiple anti-virus engines to detect viruses.

Social Engineering: The deceptive art of engaging in communication with others to collect sensitive/valuable information, such as the CRA scam or the lottery winner scam. Steps of social engineering: information gathering, choosing the victim, establishing trust, exploiting the relationship.

Types of Social Engineering: Human based, computer based, mobile based

Privilege Escalation: The attacker first gets access to the system, which can be basic access, and then tries to increase privileges, for example through programming or by exploiting a misconfiguration in the system.

DoS Attack: Overwhelms the system resources so that even legitimate users cannot access or use the system and its services.

Botnet: A group of systems, controlled by the attackers, that work together to shut the target system down. Can be used for DoS attacks.

Alternate Data Streams: Data attached to a file, where the attachment is invisible to the user. For example, a hidden.txt stream (hidden, and holding the data) can be attached to file.txt (visible). The StreamArmor application can detect such hidden streams.

Steganography: Techniques where data is hidden inside other messages/data. Only the related parties have the keys to see the hidden data. Example purposes: copy prevention, hiding metadata, covert communication, authenticity testing.
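
One classic way to hide data inside other data is least-significant-bit (LSB) embedding, sketched below. The technique is named here as an example and is not taken from the reference; the function names hide and reveal are hypothetical, and the cover is a plain byte string standing in for pixel or audio samples. No key is used in this sketch: the only shared secrets are the method and the message length.

```python
def hide(cover: bytes, message: bytes) -> bytes:
    """Write each bit of message into the lowest bit of successive cover bytes."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the lowest bit, then set it
    return bytes(stego)

def reveal(stego: bytes, length: int) -> bytes:
    """Read length bytes back out of the lowest bits of stego."""
    out = bytearray()
    for start in range(0, length * 8, 8):
        value = 0
        for byte in stego[start:start + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(200, 256)) * 2   # 112 bytes standing in for sample data
secret = b"meet at 9"
stego = hide(cover, secret)
recovered = reveal(stego, len(secret))
print(recovered)
```

Each cover byte changes by at most 1, which is why such changes are hard to notice in image or audio data.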

Covering Tracks: After attacking the target systems and collecting information, the attacker can try to clear all traces. Clearing might involve: clearing the browser cache, removing all files it created, clearing audit and event logs, closing open ports, stopping the service processes, restoring the registry, deleting user accounts it created, creating temp files.

Cleaner Programs: CCleaner, File Shredder

References: Sagar Ajay, CEH, Apress


Where to get Malware Samples and Labels?

This information is intended for Security and Malware researchers.

VirusTotal: https://www.virustotal.com/gui/

Malware-Traffic-Analysis.net: http://malware-traffic-analysis.net/

VirusShare: https://virusshare.com/

VX Heaven: https://vxer.org

Kaggle, Microsoft Virus/Malware database: https://www.kaggle.com/c/microsoft-malware-prediction/data

Reference: Clarence C, and David F.

Visualization of Multivariate Charts

Dataset Types
Tables, Networks, Spatial
https://www.cs.ubc.ca/~tmm/talks/minicourse14/vad16nasa.pdf

Attribute Types
Categorical, Ordered, Quantitative

Multivariate Charts
Scatter Plot, Heat Map, Bubble Chart, Parallel Coordinates, Radar Plot

Scatter Plot
https://en.wikipedia.org/wiki/Scatter_plot

Scatter Plot : Details: Patterns
https://mste.illinois.edu/courses/ci330ms/youtsey/scatterinfo.html

Scatter Plot and Trend Line
https://www.mathsisfun.com/data/scatter-xy-plots.html

Scatterplot Details
https://www150.statcan.gc.ca/n1/edu/power-pouvoir/ch9/scatter-nuages/5214827-eng.htm

GRAPHICS: COMBINING TWOWAY SCATTERPLOTS | STATA LEARNING MODULES
https://stats.idre.ucla.edu/stata/modules/graph8/twoway-scatter-combine/

Multiple overlaid scatterplots
https://www.stata.com/support/faqs/graphics/gph/graphdocs/multiple-overlaid-scatterplots/index.html

Visualization in Data Science: What is it for?
https://cds.nyu.edu/wp-content/uploads/2014/04/bertini_datascience_showcase_May12_2014.pdf

Scatterplots and Outliers. Scatterplot Bad Examples
http://faculty.virginia.edu/ASTR3130/lablinks/GuidePlots.html

Heatmap
https://www.highcharts.com/demo/heatmap

SPSS: Heatmap
https://www.ibm.com/support/knowledgecenter/en/SS3RA7_15.0.0/com.ibm.spss.modeler.help/graphboard_creating_examples_heatmap.htm

Heat Map Color Gradients
https://docs.microsoft.com/en-us/bingmaps/v8-web-control/map-control-concepts/heat-map-module-examples/heat-map-color-gradients?redirectedfrom=MSDN

Heatmap and Correlation Map
https://blogs.sas.com/content/sasdummy/2013/06/12/correlations-matrix-heatmap-with-sas/

Heatmap: Biclusters
https://www.researchgate.net/figure/Heatmap-visualization-of-biclusters-a-Typical-heatmap-with-bright-colors-representing_fig1_5344500

A visual analytics approach for understanding biclustering results from microarray data
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-247

Bubble Chart:
https://www.fusioncharts.com/resources/chart-primers/bubble-chart

How to Design Bubble Charts (a kind of scatterplot)
https://visage.co/data-visualization-101-bubble-charts/

Ted Talk: A Famous Bubble Chart
https://www.ted.com/talks/hans_rosling_reveals_new_insights_on_poverty?language=en

Junk Charts
https://junkcharts.typepad.com/junk_charts/2013/03/blowing-the-whistle-at-bubble-charts.html

Parallel Coordinates
https://en.wikipedia.org/wiki/Parallel_coordinates

Polygonal chain
https://en.wikipedia.org/wiki/Polygonal_chain

Patterns: Parallel Coordinates
https://eagereyes.org/techniques/parallel-coordinates

More on Parallel Coordinates: Includes bad examples of Parallel Coordinates
https://ldld.samizdat.cc/2016/parallel/

Radar Plots
https://www.fusioncharts.com/resources/chart-primers/radar-chart

Radar Plots. Also, what does the area mean?
https://ncva.itn.liu.se/education-geovisual-analytics/parallel-coordinates-and-radar-chart?l=en

3 Things to Think About – A Warning Label for Radar Charts
http://www.verghisgroup.com/wp-content/uploads/2012/04/3-Things-to-Think-About-A-Warning-Label-for-Radar-Charts.pdf

PERCEPTION and Data Visualization: Concepts, Examples, Code, Projects

PERCEPTION

Data Visualization for Human Perception
"Data visualization is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication. … In other words, to visualize data effectively, we must follow design principles that are derived from an understanding of human perception."
https://www.interaction-design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed/data-visualization-for-human-perception

The Role of Visual Perception in Data Visualization
https://www.sciencedirect.com/science/article/pii/S1045926X02902351

Pre-attentive Processing
https://www.alleydog.com/glossary/definition.php?term=Preattentive%2520Processing

3 Expert Data Visualization Tips for Grabbing Readers’ Attention
https://visme.co/blog/data-visualization-tips/

Attention and Visual Memory in Visualization and Computer Graphics
https://www.csc2.ncsu.edu/faculty/healey/download/tvcg.11.pdf

Preattentive Visual Properties and How to Use Them in Information Visualization
https://www.interaction-design.org/literature/article/preattentive-visual-properties-and-how-to-use-them-in-information-visualization

Choosing Colors for Data Visualization
http://www.b-eye-network.com/newsletters/ben/2235

Exploring Preattentive Attributes (Beta)
https://learnforeverlearn.com/preattentive/

Perception
http://www.ifs.tuwien.ac.at/~silvia/wien/vu-infovis/articles/03_Perception.pdf

Color value refers to the lightness or darkness of the hue. … Adding black to a hue produces a low-value color, often called a shade. Intensity. Intensity, also called chroma or saturation, refers to the brightness of a color.

Elements of Design: Value & Color
https://etad.usask.ca/skaalid/theory/cgdt/color.htm

Color in Visualization
https://www.ics.uci.edu/~majumder/vispercep/colviz.pdf

Colors as Hue, Saturation and Brightness
http://www.georeference.org/doc/colors_as_hue_saturation_and_brightness.htm

Choosing colors for your data visualization
https://cambridge-intelligence.com/choosing-colors-for-your-data-visualization/

Perception in Visualization
https://www.csc2.ncsu.edu/faculty/healey/PP/index.html

Target detection in scientific visualization
https://psycnet.apa.org/record/2001-00202-002
Check here as well: https://www.csc2.ncsu.edu/faculty/healey/PP/index.html

Object detection via color-based image segmentation using python
https://towardsdatascience.com/object-detection-via-color-based-image-segmentation-using-python-e9b7c72f0e11
Check here as well: https://www.csc2.ncsu.edu/faculty/healey/PP/index.html

A systematic comparison between visual cues for boundary detection
https://www.sciencedirect.com/science/article/pii/S0042698915003685
Check here as well: https://www.csc2.ncsu.edu/faculty/healey/PP/index.html

Perception and Boundary Detection
https://cslu.ohsu.edu/~bedricks/courses/conj_610/pdf/lec_2_part_2.pdf

Why rainbow colors aren’t the best option for data visualizations
https://www.poynter.org/archive/2013/why-rainbow-colors-arent-always-the-best-options-for-data-visualizations/

Color and Counting
Find on: http://www.ifs.tuwien.ac.at/~silvia/wien/vu-infovis/articles/03_Perception.pdf

Color and Perception
https://makingmaps.net/2007/08/28/perceptual-scaling-of-map-symbols/

Color Blindness by Inheritance
https://www.colour-blindness.com/general/prevalence/

Form and Color. Form and Boundary Detections
Find on: https://cslu.ohsu.edu/~bedricks/courses/conj_610/pdf/lec_2_part_2.pdf
Find on: http://www.ifs.tuwien.ac.at/~silvia/wien/vu-infovis/articles/03_Perception.pdf

Perceptual Scaling
Perceptual Scaling of Map Symbols
https://makingmaps.net/2007/08/28/perceptual-scaling-of-map-symbols/

MAJOR CITIES OF INDIA: PERCEPTUAL VS. ABSOLUTE SCALING
https://blogs.ubc.ca/katerynabaranovasgis/tag/perceptual-scaling/

Ebbinghaus illusion
https://en.wikipedia.org/wiki/Ebbinghaus_illusion
https://en.wikipedia.org/wiki/Ebbinghaus_illusion#/media/File:Mond-vergleich.svg

Flicker, Motion for Preattentive
https://www.csc2.ncsu.edu/faculty/healey/PP/
https://learnforeverlearn.com/preattentive/

Principles of Data Visualization
http://saravanan-thirumuruganathan.github.io/cse5334Spring2015/slides/03_PrinciplesOfViz/03_PrinciplesOfViz_final.pdf

Humans Detect Patterns
http://www.mrc-cbu.cam.ac.uk/people/matt-davis/cmabridge/

Gestalt Principles
https://www.interaction-design.org/literature/topics/gestalt-principles

Principles of grouping. Similarity, Closure
https://en.wikipedia.org/wiki/Principles_of_grouping

Symmetry: Gestalt Principles: Simplicity, symmetry and more: Gestalt theory and the design principles it gave birth to
https://www.canva.com/learn/gestalt-theory/

Continuity:
find on: https://cslu.ohsu.edu/~bedricks/courses/conj_610/pdf/lec_2_part_2.pdf
https://www.freepik.com/free-icon/olympic-games-logo_733186.htm

Misc
https://d3ui957tjb5bqd.cloudfront.net/uploads/2016/03/Gestalt-Psychology-.jpg

Projects

Will be added later in this article or in separate posts


Text Visualization: Concepts, Examples, Code (Python, R, Matlab), Example Projects

Text Visualization

Data Visualization Of President Obama’s Inauguration Speech
https://blog.capitalogix.com/public/2009/01/data-visualization-of-president-obamas-inauguration-speech.html

Text Visualization Course at the University of Washington
https://courses.cs.washington.edu/courses/cse512/16sp/lectures/CSE512-Text.pdf

Why Visualize Text?
https://www.teachervision.com/reading-comprehension/visualizing

You can visualize text to find key concepts in a speech.
Obama’s Speech: https://www.nytimes.com/2009/09/10/us/politics/10obama.text.html

Example: Speech/Article Visualizations
https://economix.blogs.nytimes.com/2009/09/09/obama-in-09-vs-clinton-in-93/?mtrref=undefined&gwh=EA5A091DC9E658A2DCC9513805FEAC3F&gwt=pay&assetType=REGIWALL

1993 Speech: Bill Clinton
https://economix.blogs.nytimes.com/2009/09/09/bill-clinton-on-health-care-1993/
1993 vs 2009. https://economix.blogs.nytimes.com/2009/09/09/obama-in-09-vs-clinton-in-93/

Wordtree
https://www.jasondavies.com/wordtree/?source=8bd3e76f64f2d614410435434e2fb1fd&prefix

Text Visualization Overview
https://courses.cs.washington.edu/courses/cse512/16sp/lectures/CSE512-Text.pdf

Introduction to Text Analysis: Cleaning/Parsing
http://guides.library.duke.edu/c.php?g=289707&p=1930855

Tokenization
https://graphics.cs.wisc.edu/WP/vep/tokenization/

Wordcount
http://www.wordcount.org/main.php

Wordcloud
https://www.jasondavies.com/wordcloud/

See through your text
https://voyant-tools.org/

Bubble Cloud
http://vallandingham.me/bubble_cloud/?sherlock

Comparison Cloud: Comparative Word Cloud
https://sites.google.com/site/miningtwitter/questions/talking-about/wordclouds/comparison-cloud

N-Grams
http://text-analytics101.rxnlp.com/2014/11/what-are-n-grams.html

Wordtree
https://developers.google.com/chart/interactive/docs/gallery/wordtree

Pattern Searching using Suffix Tree
https://www.geeksforgeeks.org/pattern-searching-using-suffix-tree/

Wordtree
https://www.jasondavies.com/wordtree/

Clustered Word Cloud
http://neoformix.com/2011/WordClusterDiagram.html

Document Contrast Diagrams
http://neoformix.com/2008/DocumentContrastDiagrams.html

Phrase Net
https://www.betterevaluation.org/en/evaluation-options/phrase_net

Analyze Obama’s Speech from 2004: Code Example
https://hackernoon.com/analysing-obama-speeches-since-2004-7f08797f7078

A Complete Exploratory Data Analysis and Visualization for Text Data
https://towardsdatascience.com/a-complete-exploratory-data-analysis-and-visualization-for-text-data-29fb1b96fb6a

Basics of Text Analysis & Visualization
https://itnext.io/basics-of-text-analysis-visualization-1978de48af47

The Data Visualization Design Process: A Step-by-Step Guide for Beginners
https://depictdatastudio.com/data-visualization-design-process-step-by-step-guide-for-beginners/

Wordnet Visualization
http://vialab.science.uoit.ca/portfolio/wordnet-visualization

Visual Wordnet with D3JS
https://www.visual-thesaurus.com/wordnet.php

Visualizing Word Vectors with t-SNE
https://www.kaggle.com/jeffd23/visualizing-word-vectors-with-t-sne

Embedding Projector
https://projector.tensorflow.org/

Visualizing Tweets with Word2Vec and t-SNE, in Python
https://leightley.com/visualizing-tweets-with-word2vec-and-t-sne-in-python/

Text Analysis for Visualizations
http://vallandingham.me/textvis-talk/#1

visualization-and-sentiment-analysis
https://www.kaggle.com/shaliniyaramada/visualization-and-sentiment-analysis

Word Cloud. Bag of Words
https://www.mathworks.com/help/textanalytics/ref/ldamodel.wordcloud.html


Time Series: Concepts, Visualizations, Example Code, Example Projects


What is Time Series
https://en.wikipedia.org/wiki/Time_series

A comprehensive beginner’s guide to create a Time Series Forecast (with Codes in Python and R)
https://www.analyticsvidhya.com/blog/2016/02/time-series-forecasting-codes-python/


A Complete Tutorial on Time Series Modeling in R
https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/


Creating Time Series Forecast using Python
https://courses.analyticsvidhya.com/courses/creating-time-series-forecast-using-python?utm_source=blog&utm_medium=TimeSeriesForecastComprehensivearticle


6.4.1. Definitions, Applications and Techniques
https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc41.htm


Forecast and Trend
https://www.excel-easy.com/examples/forecast-trend.html


Time Series Analysis: Seasonality
http://slideplayer.com/slide/1507153/


Cyclic and seasonal time series
https://robjhyndman.com/hyndsight/cyclicts/


Time Series and Forecasting
https://www.slideshare.net/bmcfad01/chapter-16-2700149


Stationary and non-stationary Time Series
https://www.analyticsvidhya.com/blog/2015/12/complete-tutorial-time-series-modeling/


Metric graphs 101: Timeseries graphs: Representing Time Series
Line Graph, Bar Graph, Heatmap, Area Graph, Stacked Area Graphs
https://www.datadoghq.com/blog/timeseries-metric-graphs-101/


Stock Market Graphs
https://ca.finance.yahoo.com/quote/%5EGSPTSE?p=%5EGSPTSE&guccounter=1


Half Life
https://en.wikipedia.org/wiki/Exponential_decay#Half-life


Decomposing Time Series
https://otexts.com/fpp2/classical-decomposition.html


Decomposition of time series
https://en.wikipedia.org/wiki/Decomposition_of_time_series


Area Chart
https://en.wikipedia.org/wiki/Area_chart


Area Graph
https://datavizcatalogue.com/methods/area_graph.html

Represent Time Series with Heatmap
https://datavizcatalogue.com/methods/heatmap.html


Time Series Visualizations – An overview
http://complexdatavisualized.com/time-series-visualizations-an-overview/


Visualization: The Ebb and Flow of Movies: Box Office Receipts 1986 — 2008
http://archive.nytimes.com/www.nytimes.com/interactive/2008/02/23/movies/20080223_REVENUE_GRAPHIC.html

Why we need frequency domain?
https://math.stackexchange.com/questions/55310/why-we-need-frequency-domain


Time–frequency analysis for music signals
https://en.wikipedia.org/wiki/Time%E2%80%93frequency_analysis_for_music_signals


What is the difference between Time domain and frequency domain
https://www.researchgate.net/post/What_is_the_difference_between_Time_domain_and_frequency_domain10


Frequency Domain and Fourier Transforms
http://www.princeton.edu/~cuff/ele201/kulkarni_text/frequency.pdf


Moving Average (MA)
https://www.investopedia.com/terms/m/movingaverage.asp


Calculation of Trend by Moving Average Method
https://www.toppr.com/guides/business-mathematics-and-statistics/time-series-analysis/moving-average-method/


Basic Feature Engineering With Time Series Data in Python
https://machinelearningmastery.com/basic-feature-engineering-time-series-data-python/


Time Series Feature Extraction for industrial big data (IIoT) applications
https://towardsdatascience.com/time-series-feature-extraction-for-industrial-big-data-iiot-applications-5243c84aaf0e


Taxonomy of Time Series Forecasting Problems
https://machinelearningmastery.com/taxonomy-of-time-series-forecasting-problems/


How To Identify Patterns in Time Series Data: Time Series Analysis
http://www.statsoft.com/textbook/time-series-analysis


An Introduction to Stationary and Non-Stationary Processes
https://www.investopedia.com/articles/trading/07/stationary.asp


A Gentle Introduction to Handling a Non-Stationary Time Series in Python
https://www.analyticsvidhya.com/blog/2018/09/non-stationary-time-series-python/