{"id":14825,"date":"2019-06-22T23:12:45","date_gmt":"2019-06-23T03:12:45","guid":{"rendered":"http:\/\/bangla.salearningschool.com\/recent-posts\/machine-learning-apply-pca-on-datasets-and-related\/"},"modified":"2020-02-08T09:28:00","modified_gmt":"2020-02-08T14:28:00","slug":"machine-learning-apply-pca-on-datasets-and-related","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=14825","title":{"rendered":"Machine Learning: Apply PCA on Datasets and Related"},"content":{"rendered":"<p>FactorAnalyzer<br \/>\n<a href=\"https:\/\/github.com\/EducationalTestingService\/factor_analyzer\">https:\/\/github.com\/EducationalTestingService\/factor_analyzer<\/a><br \/>\n&#8212;<br \/>\nsruti-jain\/Marketing-Analysis-for-Hotel-Chain-website<br \/>\n<a href=\"https:\/\/github.com\/sruti-jain\/Marketing-Analysis-for-Hotel-Chain-website\">https:\/\/github.com\/sruti-jain\/Marketing-Analysis-for-Hotel-Chain-website<\/a><br \/>\n&#8212;<\/p>\n<p>Understanding PCA (Principal Component Analysis) with Python<br \/>\n<a href=\"https:\/\/towardsdatascience.com\/dive-into-pca-principal-component-analysis-with-python-43ded13ead21\">https:\/\/towardsdatascience.com\/dive-into-pca-principal-component-analysis-with-python-43ded13ead21<\/a><\/p>\n<p>Most of the code in that article still works, although it was written for an earlier version of Python.<br \/>\nYou will need the StandardScaler class from sklearn.preprocessing.<br \/>\nOtherwise, you might find the code below useful:<br \/>\n[# ref: <a href=\"https:\/\/python-for-multivariate-analysis.readthedocs.io\/a_little_book_of_python_for_multivariate_analysis.html\">https:\/\/python-for-multivariate-analysis.readthedocs.io\/a_little_book_of_python_for_multivariate_analysis.html<\/a>]<\/p>\n<p>import numpy as np<br \/>\nimport pandas as pd<br \/>\nimport matplotlib.pyplot as plt<br \/>\nimport sklearn<br \/>\nfrom sklearn import preprocessing<br \/>\nfrom sklearn.datasets import load_breast_cancer<\/p>\n<p># Assumption: cancer is the scikit-learn breast cancer dataset used in the article above<br \/>\ncancer = load_breast_cancer()<\/p>\n<p># Standardise every feature to zero mean and unit variance<br \/>\nstandardisedX = sklearn.preprocessing.scale(cancer.data)<br \/>\nstandardisedX = pd.DataFrame(standardisedX, columns=cancer.feature_names)<br 
\/>\nstandardisedX.apply(np.mean)<\/p>\n<p>X_scaled = standardisedX<\/p>\n<p>from sklearn import decomposition<br \/>\n# Fit once and keep the fitted model, then project the data onto the 3 components<br \/>\npca = decomposition.PCA(n_components=3).fit(standardisedX)<br \/>\nX_pca = pca.transform(standardisedX)<\/p>\n<p># Share of variance among the 3 retained components only<br \/>\n# (pca.explained_variance_ratio_ gives each component&#039;s share of the total variance)<br \/>\nex_variance=np.var(X_pca,axis=0)<br \/>\nex_variance_ratio = ex_variance\/np.sum(ex_variance)<br \/>\nex_variance_ratio<\/p>\n<p>Xax=X_pca[:,0]<br \/>\nYax=X_pca[:,1]<br \/>\nlabels=cancer.target<br \/>\ncdict={0:&#039;red&#039;,1:&#039;green&#039;}<br \/>\nlabl={0:&#039;Malignant&#039;,1:&#039;Benign&#039;}<br \/>\nmarker={0:&#039;*&#039;,1:&#039;o&#039;}<br \/>\nalpha={0:.3, 1:.5}<br \/>\nfig,ax=plt.subplots(figsize=(7,5))<br \/>\nfig.patch.set_facecolor(&#039;white&#039;)<br \/>\n# Plot each class separately so it gets its own colour, marker and legend entry<br \/>\nfor l in np.unique(labels):<br \/>\n    ix=np.where(labels==l)<br \/>\n    ax.scatter(Xax[ix],Yax[ix],c=cdict[l],s=40,<br \/>\n        label=labl[l],marker=marker[l],alpha=alpha[l])<\/p>\n<p># for loop ends<br \/>\nplt.xlabel(&quot;First Principal Component&quot;,fontsize=14)<br \/>\nplt.ylabel(&quot;Second Principal Component&quot;,fontsize=14)<br \/>\nplt.legend()<br \/>\nplt.show()<\/p>\n<p>Sayed Ahmed<\/p>\n<p>LinkedIn: <a href=\"https:\/\/ca.linkedin.com\/in\/sayedjustetc\">https:\/\/ca.linkedin.com\/in\/sayedjustetc<\/a><\/p>\n<p>Blog: <a href=\"http:\/\/sitestree.com\">http:\/\/sitestree.com<\/a>, <a href=\"http:\/\/bangla.salearningschool.com\">http:\/\/bangla.salearningschool.com<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>FactorAnalyzer https:\/\/github.com\/EducationalTestingService\/factor_analyzer &#8212; sruti-jain\/Marketing-Analysis-for-Hotel-Chain-website https:\/\/github.com\/sruti-jain\/Marketing-Analysis-for-Hotel-Chain-website &#8212; Understanding PCA (Principal Component Analysis) with Python https:\/\/towardsdatascience.com\/dive-into-pca-principal-component-analysis-with-python-43ded13ead21 The code for the most part will work though it used an earlier version of Python 
You will need module: StandardScaler Otherwise you might find the code below to be useful: [# ref: https:\/\/python-for-multivariate-analysis.readthedocs.io\/a_little_book_of_python_for_multivariate_analysis.html] import sklearn from sklearn import preprocessing standardisedX &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=14825\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1910,182],"tags":[],"class_list":["post-14825","post","type-post","status-publish","format-standard","hentry","category-ai-ml-ds-rl-dl-nn-nlp-data-mining-optimization","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":14860,"url":"http:\/\/bangla.sitestree.com\/?p=14860","url_meta":{"origin":14825,"position":0},"title":"PCA: Understand the affecting Factors for your portfolio, or credit score, or food pattern (causing diseases)","author":"Sayed","date":"July 5, 2019","format":false,"excerpt":"Overview for Principal Components Analysis https:\/\/support.minitab.com\/en-us\/minitab\/18\/help-and-how-to\/modeling-statistics\/multivariate\/how-to\/principal-components\/before-you-start\/overview\/ \" The goal of principal components analysis is to explain the maximum amount of variance with the fewest number of principal components.\" Interpret the key results for Principal Components Analysis https:\/\/support.minitab.com\/en-us\/minitab\/18\/help-and-how-to\/modeling-statistics\/multivariate\/how-to\/principal-components\/interpret-the-results\/key-results\/ Interpret all statistics and graphs for Principal Components Analysis 
https:\/\/support.minitab.com\/en-us\/minitab\/18\/help-and-how-to\/modeling-statistics\/multivariate\/how-to\/principal-components\/interpret-the-results\/all-statistics-and-graphs\/ Methods and formulas\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":76257,"url":"http:\/\/bangla.sitestree.com\/?p=76257","url_meta":{"origin":14825,"position":1},"title":"Can you answer these questions on Data Science Project Development","author":"Sayed","date":"August 24, 2024","format":false,"excerpt":"Can you answer these questions on Data Science Project Development Questions to answer 1. What does a data science project usually involve? What is the common theme across data science projects? 2. Does industry projects and research projects differ? Why and to what extent? 3. What are the some dataset\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16371,"url":"http:\/\/bangla.sitestree.com\/?p=16371","url_meta":{"origin":14825,"position":2},"title":"Can you answer these random questions on Data Science Project Development","author":"Sayed","date":"November 9, 2019","format":false,"excerpt":"Questions to answer 1. What does a data science project usually involve? What is the common theme across data science projects? 2. Does industry projects and research projects differ? Why and to what extent? 3. What are the some dataset repositories? Where can you get them? 4. 
Are all public\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":26252,"url":"http:\/\/bangla.sitestree.com\/?p=26252","url_meta":{"origin":14825,"position":3},"title":"Can you answer these random questions on Data Science Project Development #Root","author":"Author-Check- Article-or-Video","date":"April 21, 2021","format":false,"excerpt":"Questions to answer 1. What does a data science project usually involve? What is the common theme across data science projects? 2. Does industry projects and research projects differ? Why and to what extent? 3. What are the some dataset repositories? Where can you get them? 4. Are all public\u2026","rel":"","context":"In &quot;FromSitesTree.com&quot;","block_context":{"text":"FromSitesTree.com","link":"http:\/\/bangla.sitestree.com\/?cat=1917"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":16743,"url":"http:\/\/bangla.sitestree.com\/?p=16743","url_meta":{"origin":14825,"position":4},"title":"NLP : AI and ML","author":"Sayed","date":"February 6, 2020","format":false,"excerpt":"Feature Selection \" Filter Methods Wrapper Methods Embedded Methods Feature Selection Checklist Do you have domain knowledge? Are your features commensurate? Do you suspect interdependence of features? If \"Weka: \u201cFeature Selection to Improve Accuracy and Decrease Training Time\u201c. Scikit-Learn: \u201cFeature Selection in Python with Scikit-Learn\u201c. 
R: \u201cFeature Selection with the\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"","src":"https:\/\/upload.wikimedia.org\/wikipedia\/commons\/thumb\/b\/bb\/Singular-Value-Decomposition.svg\/220px-Singular-Value-Decomposition.svg.png","width":350,"height":200},"classes":[]},{"id":16698,"url":"http:\/\/bangla.sitestree.com\/?p=16698","url_meta":{"origin":14825,"position":5},"title":"Misc Math, Data Science, Machine Learning, PCA, FA","author":"Sayed","date":"January 29, 2020","format":false,"excerpt":"\"In mathematics, a set B of elements (vectors) in a vector space V is called a basis, if every element of V may be written in a unique way as a (finite) linear combination of elements of B. The coefficients of this linear combination are referred to as components or\u2026","rel":"","context":"In &quot;AI ML DS RL DL NN NLP Data Mining Optimization&quot;","block_context":{"text":"AI ML DS RL DL NN NLP Data Mining 
Optimization","link":"http:\/\/bangla.sitestree.com\/?cat=1910"},"img":{"alt_text":"A=(a_(ij))","src":"https:\/\/i0.wp.com\/mathworld.wolfram.com\/images\/equations\/HermitianMatrix\/Inline1.gif?resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14825","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=14825"}],"version-history":[{"count":1,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14825\/revisions"}],"predecessor-version":[{"id":16778,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/14825\/revisions\/16778"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=14825"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=14825"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=14825"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}