{"id":78251,"date":"2025-05-22T20:49:34","date_gmt":"2025-05-22T20:49:34","guid":{"rendered":"http:\/\/bangla.sitestree.com\/?p=78251"},"modified":"2025-06-08T21:34:01","modified_gmt":"2025-06-08T21:34:01","slug":"threat-to-validity-for-your-data-analytics-projects-2","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=78251","title":{"rendered":"Threat To Validity for Your Data Analytics Projects"},"content":{"rendered":"\n<p><\/p>\n\n\n\n<p>\u2022 Internal<\/p>\n\n\n\n<p>\u2022 External<\/p>\n\n\n\n<p>\u2022 Construct<\/p>\n\n\n\n<p>\u2022 Statistical Conclusion<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>\u2022 <strong>Internal<\/strong>: An informative variable is missing. Bring in data from other sources.<\/p>\n\n\n\n<p>\u2022 <strong>External<\/strong>: A fixation variable makes the result perfect. The model may not generalize.<\/p>\n\n\n\n<p>\u2022 <strong>Construct<\/strong>: Class imbalance badly affects the outcome.<\/p>\n\n\n\n<p>\u2022 <strong>Statistical Conclusion<\/strong>: Depending on the statistical measure used, the conclusion can be incorrect.<\/p>\n\n\n\n<p>\u2022 Data Mining: Association: Support, Confidence, and Lift<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Internal Validity<\/strong><\/p>\n\n\n\n<p>Is your experiment (and model) internally valid?<\/p>\n\n\n\n<p>What is the threat that the experiment (model and outcome) is internally invalid?<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Example: Reasons why an inference that the relationship between two variables is causal may be incorrect. 
<\/strong><\/p>\n\n\n\n<p><strong>Cause: Lack of informative variables<\/strong><\/p>\n\n\n\n<p><strong>Solution: Bring in data from other sources<\/strong><\/p>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"286\" src=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?resize=750%2C286\" alt=\"\" class=\"wp-image-78252\" srcset=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?resize=1024%2C390 1024w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?resize=300%2C114 300w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?resize=768%2C293 768w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?resize=750%2C286 750w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-39.png?w=1055 1055w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"185\" src=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?resize=750%2C185\" alt=\"\" class=\"wp-image-78253\" srcset=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?resize=1024%2C252 1024w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?resize=300%2C74 300w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?resize=768%2C189 768w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?resize=750%2C185 750w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-40.png?w=1262 1262w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" 
\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>External Validity<\/strong><\/p>\n\n\n\n<p>Is your experiment (and model) externally valid?<\/p>\n\n\n\n<p>What is the threat that the experiment (model and outcome) is externally invalid?<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>\u201cStudy results may not apply to other groups.\u201d<\/p>\n\n\n\n<p><strong>Cause<\/strong>: A fixation variable<\/p>\n\n\n\n<p><strong>Solution<\/strong>: Exclude the fixation variable from the study<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"141\" src=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?resize=750%2C141\" alt=\"\" class=\"wp-image-78254\" srcset=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?resize=1024%2C192 1024w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?resize=300%2C56 300w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?resize=768%2C144 768w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?resize=750%2C141 750w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-41.png?w=1158 1158w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"64\" src=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=750%2C64\" alt=\"\" class=\"wp-image-78255\" srcset=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=1024%2C87 1024w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=300%2C25 300w, 
https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=768%2C65 768w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=1536%2C130 1536w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?resize=750%2C63 750w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-42.png?w=1702 1702w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Ref: https:\/\/en.wikipedia.org\/wiki\/External_validity<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Construct Validity<\/strong><\/p>\n\n\n\n<p>Is your experiment (and model) valid by construction?<\/p>\n\n\n\n<p>What is the threat that the experiment (model and outcome) is invalid by construction?<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Example<\/strong>: In classification, if the data is imbalanced, the variables\u2019 effect on the outcome can be invalid.<\/p>\n\n\n\n<p><strong>Cause<\/strong>: Construction\/balance problem<\/p>\n\n\n\n<p><strong>Solution<\/strong>: Treat the data for imbalance (for example, by resampling or class weighting)<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Statistical Conclusion Validity<\/strong><\/p>\n\n\n\n<p>Is your conclusion (from the experiment and the model) statistically valid, even when it is based on statistical analysis?<\/p>\n\n\n\n<p>What is the threat that the conclusion (from the experiment and the model) is invalid?<\/p>\n\n\n\n<p>Example: In data mining, you considered only the association. 
But that does not give the full picture.<\/p>\n\n\n\n<p>Solution: Include support, confidence, and lift. Support is how often the items occur together, confidence is how often the rule holds when its antecedent occurs, and lift is the confidence relative to how often the consequent occurs on its own.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" width=\"656\" height=\"325\" src=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-43.png?resize=656%2C325\" alt=\"\" class=\"wp-image-78256\" srcset=\"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-43.png?w=656 656w, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-43.png?resize=300%2C149 300w\" sizes=\"auto, (max-width: 656px) 100vw, 656px\" \/><\/figure>\n\n\n\n<p>Ref: https:\/\/www.analyticsvidhya.com\/<\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>Data Analytics, Machine Learning, Data Science<\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u2022 Internal \u2022 External \u2022 Construct \u2022 Statistical Conclusion \u2022 Internal: An informative variable is missing. Bring in data from other sources. \u2022 External: A fixation variable makes the result perfect. The model may not generalize. \u2022 Construct: Class imbalance badly affects the outcome. \u2022 Statistical Conclusion: Depending on the statistical measure used, the conclusion can be incorrect. 
\u2022Data Mining: Association: Support, Confidence, and Lift Internal Validity Is your &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=78251\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1974,1],"tags":[],"class_list":["post-78251","post","type-post","status-publish","format-standard","hentry","category-analytics-and-machine-learning-project-development","category-root","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":78248,"url":"http:\/\/bangla.sitestree.com\/?p=78248","url_meta":{"origin":78251,"position":0},"title":"Threat To Validity for Your Data Analytics Projects","author":"Sayed","date":"May 22, 2025","format":false,"excerpt":"\u2022 Internal \u2022 External \u2022 Construct \u2022 Statistical Conclusion \u2022 Internal: Informative variable missing. Bring data from other sources \u2022 External: The Fixation variable makes the result perfect. The model may not generalize \u2022 Construct: Class imbalance affects the outcome badly \u2022 Statistical Conclusion: Based on the statistical measure used,\u2026","rel":"","context":"In &quot;Analytics and Machine Learning Project Development&quot;","block_context":{"text":"Analytics and Machine Learning Project Development","link":"http:\/\/bangla.sitestree.com\/?cat=1974"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":78189,"url":"http:\/\/bangla.sitestree.com\/?p=78189","url_meta":{"origin":78251,"position":1},"title":"Initial and Exploratory Analysis for Data Analytics Projects","author":"Sayed","date":"May 21, 2025","format":false,"excerpt":"To have a thorough understanding of the data. 
Two Types: \u2022 Initial Analysis \u2022 Exploratory Analysis Initial Analysis: Univariate Bi-Variate Multi-Variate Univariate Analysis \u2022 Deciding\/Determining the dependent (target) variable \u2022 Assigning the correct data types, appropriate column names \u2022 Address: Inconsistencies, missing values, outliers \u2022 Categorical variables with too many\u2026","rel":"","context":"In &quot;Analytics and Machine Learning Project Development&quot;","block_context":{"text":"Analytics and Machine Learning Project Development","link":"http:\/\/bangla.sitestree.com\/?cat=1974"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16434,"url":"http:\/\/bangla.sitestree.com\/?p=16434","url_meta":{"origin":78251,"position":2},"title":"Stochastic Processes and Related Terms","author":"Sayed","date":"November 27, 2019","format":false,"excerpt":"What is a Random Variable? Ans: \"In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is described informally as a variable whose values depend on outcomes of a random phenomenon.\" In probability theory, \"a random variable is understood as a measurable function defined on a\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":78178,"url":"http:\/\/bangla.sitestree.com\/?p=78178","url_meta":{"origin":78251,"position":3},"title":"How to Report (or Present) the outcome of your Analytics\/ML Project","author":"Sayed","date":"May 18, 2025","format":false,"excerpt":"Reporting and Analysis \u2022Examples \u2022Results section: Page 51: STOCK MARKET PREDICTION USING ENSEMBLE OF GRAPHTHEORY, MACHINE LEARNING AND DEEP LEARNING MODELS \u2022https:\/\/scholarworks.sjsu.edu\/cgi\/viewcontent.cgi?article=1692&context=etd_projects 
\u2022Check Results and Discussion sections \u2022https:\/\/arxiv.org\/ftp\/arxiv\/papers\/2203\/2203.06848.pdf \u2022A Comparative Study on Forecasting of Retail Sales May be complicated: Learning Context-Aware Classifier for Semantic Segmentation \u2022https:\/\/arxiv.org\/pdf\/2303.11633.pdf \u2022Learning Context-Aware Classifier for\u2026","rel":"","context":"In &quot;Analytics and Machine Learning Project Development&quot;","block_context":{"text":"Analytics and Machine Learning Project Development","link":"http:\/\/bangla.sitestree.com\/?cat=1974"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-13.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-13.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-13.png?resize=525%2C300 1.5x"},"classes":[]},{"id":78210,"url":"http:\/\/bangla.sitestree.com\/?p=78210","url_meta":{"origin":78251,"position":4},"title":"Model Selection for your Project","author":"Sayed","date":"May 21, 2025","format":false,"excerpt":"Potential Models \u2022 Statistical Models \u2022 Parametric and Non-Parametric \u2022 Mathematical Model (Optimization) \u2022 Machine Learning \u2022 Data Mining \u2022 Deep Learning \u2022 Reinforcement Learning \u2022 Graph Mining \u2022 NLP \u2022 Optimization \u2022 Genetic Algorithm \u2022Association \u2022Basket Association \u2022Apriori Algorithm \u2022Supervised \u2022Classification \u2022Regression \u2022Unsupervised \u2022Clustering\/Customer Segmentation \u2022Reinforcement \u2022Learn a policy\u2026","rel":"","context":"In &quot;Analytics and Machine Learning Project Development&quot;","block_context":{"text":"Analytics and Machine Learning Project 
Development","link":"http:\/\/bangla.sitestree.com\/?cat=1974"},"img":{"alt_text":"","src":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-22.png?resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-22.png?resize=350%2C200 1x, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-22.png?resize=525%2C300 1.5x, https:\/\/i0.wp.com\/bangla.sitestree.com\/wp-content\/uploads\/2025\/05\/image-22.png?resize=700%2C400 2x"},"classes":[]},{"id":17441,"url":"http:\/\/bangla.sitestree.com\/?p=17441","url_meta":{"origin":78251,"position":5},"title":"MISC STATistic PROBability LINEAR ALGebra MATRIX","author":"Sayed","date":"September 14, 2020","format":false,"excerpt":"MISC STAT PROB LINEAR ALG MATRIX PDF AND Stock and Bell Curve: https:\/\/www.investopedia.com\/terms\/p\/pdf.asp PDF in Khan Academy: https:\/\/www.khanacademy.org\/math\/statistics-probability\/random-variables-stats-library\/random-variables-continuous\/v\/probability-density-functions Mixed Random Variable https:\/\/www.youtube.com\/watch?v=ZXJjuRAXMhE \"The variance and the standard deviation are measures of the spread of the data around the mean. 
They summarise how close each observed data value is to the\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/78251","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=78251"}],"version-history":[{"count":2,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/78251\/revisions"}],"predecessor-version":[{"id":78270,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/78251\/revisions\/78270"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=78251"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=78251"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=78251"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}