{"id":17001,"date":"2020-04-18T23:38:48","date_gmt":"2020-04-19T03:38:48","guid":{"rendered":"http:\/\/bangla.salearningschool.com\/recent-posts\/python-ecommerce-part%e2%80%8a-%e2%80%8a1-merge-multiple-supplier-data-files-into-one-file\/"},"modified":"2020-04-19T06:49:54","modified_gmt":"2020-04-19T10:49:54","slug":"python-ecommerce-part%e2%80%8a-%e2%80%8a1-merge-multiple-supplier-data-files-into-one-file","status":"publish","type":"post","link":"http:\/\/bangla.sitestree.com\/?p=17001","title":{"rendered":"Python: Ecommerce: Part\u200a\u2014\u200a1: Merge Multiple Supplier Data Files into One File"},"content":{"rendered":"<h3>Python: Ecommerce: Part\u200a\u2014\u200a1: Merge Multiple Supplier Data Files into One File<\/h3>\n<h3>Section: Merge multiple Supplier Data Files<\/h3>\n<p> <strong>All code in one block<\/strong><\/p>\n<pre>#!\/usr\/bin\/env python\n# coding: utf-8<\/pre>\n<pre># # Section: Merge multiple Supplier Data Files\n#<\/pre>\n<pre># In[1]:<\/pre>\n<pre># if there is a need to merge multiple files -- use this block\nimport os\nimport glob\nimport pandas as pd<\/pre>\n<pre># supplier data files\/feeds are kept here\ndata_folder = 'data-supplier-2019-04-14\/supplier-raw-data\/'\nos.chdir(data_folder)<\/pre>\n<pre># In[6]:<\/pre>\n<pre># show all data feed file names\n# file extension for supplier data files\nextension = 'csv'\nall_filenames = [i for i in glob.glob('*.{}'.format(extension))]\nall_filenames<\/pre>\n<pre># In[7]:<\/pre>\n<pre># total number of rows across all data files\/feeds\nrow_total_count = 0\nfor f in all_filenames:\n    df_s = pd.read_csv(f)\n    print(df_s.shape, f)\n    row_total_count += df_s.shape[0]\nrow_total_count # print(row_total_count)<\/pre>\n<pre># In[8]:<\/pre>\n<pre># combine all files in the list\ncombined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])\ncombined_csv.shape<\/pre>\n<pre># In[10]:<\/pre>\n<pre># export combined data to a csv file\ncombined_csv.to_csv( 
'..\/all_supplier_products_2019_04_14.csv', index=False, encoding='utf-8-sig')<\/pre>\n<pre># In[13]:<\/pre>\n<pre># read csv data file and show data on the screen\ndf = pd.read_csv('..\/all_supplier_products_2019_04_14.csv')\ndf.head()<\/pre>\n<p> <strong>The following is from Jupyter Notebook: Cell By Cell Display. Output data are also shown<\/strong><\/p>\n<p>In [1]:<\/p>\n<pre><em># if there is a need to merge multiple files -- use this block<\/em><\/pre>\n<pre><strong>import<\/strong> os<\/pre>\n<pre><strong>import<\/strong> glob<\/pre>\n<pre><strong>import<\/strong> pandas <strong>as<\/strong> pd<\/pre>\n<pre>\n<\/pre>\n<pre><em># supplier data files\/feeds are kept here<\/em><\/pre>\n<pre>data_folder <strong>=<\/strong> 'data-supplier-2019-04-14\/supplier-raw-data\/'<\/pre>\n<pre>os.chdir(data_folder)<\/pre>\n<p> In [6]:<\/p>\n<pre><em># show all data feed file names<\/em><\/pre>\n<pre><em># file extension for supplier data files<\/em><\/pre>\n<pre>extension <strong>=<\/strong> 'csv'<\/pre>\n<pre>all_filenames <strong>=<\/strong> [i <strong class=\"gmail-markup--strong gmail-markup--pre-strong\">for<\/strong> i <strong class=\"gmail-markup--strong gmail-markup--pre-strong\">in<\/strong> glob.glob(&#39;*.{}&#39;.format(extension))]<\/pre>\n<pre>all_filenames<\/pre>\n<p> Out[6]:<\/p>\n<pre>[&#39;data_feeds_5e95c25a1f7f6.csv&#39;,<br \/> &#39;data_feeds_5e95c2962d471.csv&#39;,<br \/> &#39;data_feeds_5e95c2d255409.csv&#39;,<br \/> &#39;data_feeds_5e95c30e63423.csv&#39;,<br \/> &#39;data_feeds_5e95c38646478.csv&#39;,<br \/> &#39;data_feeds_5e95c5dd76370.csv&#39;]<\/pre>\n<p> In [7]:<\/p>\n<pre><em># total number of rows across all data files\/feeds<\/em><\/pre>\n<pre>row_total_count <strong>=<\/strong> 0<\/pre>\n<pre><strong>for<\/strong> f <strong>in<\/strong> all_filenames:<\/pre>\n<pre>    df_s <strong>=<\/strong> pd.read_csv(f)<\/pre>\n<pre>    print(df_s.shape, f)<\/pre>\n<pre>    row_total_count <strong>+=<\/strong> 
df_s.shape[0]<\/pre>\n<pre>row_total_count <em># print(row_total_count)<\/em><\/pre>\n<pre>(8058, 40) data_feeds_5e95c25a1f7f6.csv\n(7, 40) data_feeds_5e95c2962d471.csv\n(1, 40) data_feeds_5e95c2d255409.csv<\/pre>\n<pre>... ....\n(1072, 40) data_feeds_5e95c565d6e30.csv\n(4833, 40) data_feeds_5e95c5dd76370.csv<\/pre>\n<p> Out[7]:<\/p>\n<pre>55690<\/pre>\n<p> In [8]:<\/p>\n<pre><em># combine all files in the list<\/em><\/pre>\n<pre>combined_csv <strong>=<\/strong> pd.concat([pd.read_csv(f) <strong class=\"gmail-markup--strong gmail-markup--pre-strong\">for<\/strong> f <strong class=\"gmail-markup--strong gmail-markup--pre-strong\">in<\/strong> all_filenames])<\/pre>\n<pre>combined_csv.shape<\/pre>\n<p> Out[8]:<\/p>\n<pre>(55690, 40)<\/pre>\n<p> In [10]:<\/p>\n<pre><em># export combined data to a csv file<\/em><\/pre>\n<pre>combined_csv.to_csv( &quot;..\/all_supplier_products_2019_04_14.csv&quot;, index<strong>=False<\/strong>, encoding<strong>=<\/strong>'utf-8-sig')<\/pre>\n<p> In [13]:<\/p>\n<pre>df <strong>=<\/strong> pd.read_csv('..\/all_supplier_products_2019_04_14.csv')<\/pre>\n<pre>df.head()<\/pre>\n<p> Out[13]:<\/p>\n<pre>Columns: Product ID | Model Code | Full Product Name | Short Product Name | Product URL | Category Name | Category URL | Subcategory Name | Subcategory URL | Date Product Was Launched | \u2026 | Related Products | Related Accessories | Weight Kg | Height mm | Width mm | Depth mm | Video link | Retail Price | Stock status | Date Back\nRow 0: 107890 | POU_0850GV7Y | Pull Rope Fitness Exercises Resistance Bands L\u2026 | Pull Rope Fitness<\/pre>\n<p><em><strong>***. ***. ***<\/strong><\/em><br \/>\n<em><strong>Note: Older short-notes from this site are posted on Medium: <\/strong><\/em><a href=\"https:\/\/medium.com\/@SayedAhmedCanada\">https:\/\/medium.com\/@SayedAhmedCanada<\/a><\/p>\n<p>*** . *** *** . *** . *** . ***<br \/>\n<em><strong>Sayed Ahmed<\/strong><br \/>\n<\/em><br \/>\n<em><strong>BSc. Eng. in Comp. Sc. &amp; Eng. (BUET)<\/strong><\/em><br \/>\n<em><strong>MSc. in Comp. Sc. 
(U of Manitoba, Canada)<\/strong><\/em><br \/>\n<em><strong>MSc. in Data Science and Analytics (Ryerson University, Canada)<\/strong><\/em><br \/>\n<em><strong>Linkedin<\/strong>: <a href=\"https:\/\/ca.linkedin.com\/in\/sayedjustetc\">https:\/\/ca.linkedin.com\/in\/sayedjustetc<\/a><br \/>\n<\/em><\/p>\n<p><em><strong>Blog<\/strong>: <a href=\"http:\/\/bangla.salearningschool.com\/\">http:\/\/Bangla.SaLearningSchool.com<\/a>, <a href=\"http:\/\/sitestree.com\">http:\/\/SitesTree.com<\/a><\/em><br \/>\n<em><strong>Training Courses: <\/strong><a href=\"http:\/\/training.SitesTree.com\">http:\/\/Training.SitesTree.com<\/a> <\/em><br \/>\n<em><strong>8112223 Canada Inc\/Justetc<\/strong>: <a href=\"http:\/\/JustEtc.net\">http:\/\/JustEtc.net<\/a><\/em><\/p>\n<p><em><strong>Facebook Groups\/Forums to discuss (Q &amp; A): <\/strong><\/em><br \/>\n<a href=\"https:\/\/www.facebook.com\/banglasalearningschool\">https:\/\/www.facebook.com\/banglasalearningschool<\/a><br \/>\n<a href=\"https:\/\/www.facebook.com\/justetcsocial\">https:\/\/www.facebook.com\/justetcsocial<\/a><\/p>\n<p><em>Get access to courses on Big Data, Data Science, AI, Cloud, Linux, System Admin, Web Development and Misc. related. Also, create your own course to sell to others. 
<\/em><a href=\"http:\/\/sitestree.com\/training\/\">http:\/\/sitestree.com\/training\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Python: Ecommerce: Part\u200a\u2014\u200a1: Merge Multiple Supplier Data Files into One File Section: Merge multiple Supplier Data Files All code in one block #!\/usr\/bin\/env python # coding: utf-8 # # Section: Merge multiple Supplier Data Files # # In[1]: # if there is a need to merge multiple files \u2014 use this block import os; import &hellip; <\/p>\n<p><a class=\"more-link btn\" href=\"http:\/\/bangla.sitestree.com\/?p=17001\">Continue reading<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1912,182],"tags":[],"class_list":["post-17001","post","type-post","status-publish","format-standard","hentry","category-build-ecommerce-software","category---blog","item-wrap"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack-related-posts":[{"id":17002,"url":"http:\/\/bangla.sitestree.com\/?p=17002","url_meta":{"origin":17001,"position":0},"title":"Python: Ecommerce: Part \u2014 1: Merge Multiple Supplier Data Files into One File","author":"Sayed","date":"April 19, 2020","format":false,"excerpt":"Section: Merge multiple Supplier Data Files All code in one block #!\/usr\/bin\/env python # coding: utf-8# # Section: Merge multiple Supplier Data Files ## In[1]:# if there is a need to merge multiple files \u2014 use this block import os; import glob; import pandas as pd;# supplier data files\/feeds are\u2026","rel":"","context":"In &quot;Build Ecommerce Software&quot;","block_context":{"text":"Build Ecommerce Software","link":"http:\/\/bangla.sitestree.com\/?cat=1912"},"img":{"alt_text":"8112223 Canada Inc. 
(Justetc)","src":"https:\/\/miro.medium.com\/fit\/c\/80\/80\/0*P_esmjKoJnHlNjFX","width":350,"height":200},"classes":[]},{"id":16964,"url":"http:\/\/bangla.sitestree.com\/?p=16964","url_meta":{"origin":17001,"position":1},"title":"Python: Merge Multiple csv files into one to facilitate reporting on transaction data over time","author":"Sayed","date":"March 21, 2020","format":false,"excerpt":"Python: Merge Multiple csv files into one to facilitate reporting on transaction data over time By Sayed Ahmed Merge multiple transaction files into one. This is an extension to the article: Python: Read RBC Canada: Mastercard PDF Statement Transaction Data into CSV file By Sayed Ahmedmedium.com The Code for Merging\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":17008,"url":"http:\/\/bangla.sitestree.com\/?p=17008","url_meta":{"origin":17001,"position":2},"title":"Python: Ecommerce: Part \u2014 7: Partition a Data File (with product information) into Multiple Files.","author":"Sayed","date":"April 19, 2020","format":false,"excerpt":"In previous steps, we created a product data file to be uploaded to Magento 2 (from supplier data). However, there is a limit, how big a file can be uploaded to Magento 2 for product data import. Hence, this code will divide the data file into multiple files. This data\u2026","rel":"","context":"In &quot;Build Ecommerce Software&quot;","block_context":{"text":"Build Ecommerce Software","link":"http:\/\/bangla.sitestree.com\/?cat=1912"},"img":{"alt_text":"8112223 Canada Inc. 
(Justetc)","src":"https:\/\/miro.medium.com\/fit\/c\/80\/80\/0*P_esmjKoJnHlNjFX","width":350,"height":200},"classes":[]},{"id":16966,"url":"http:\/\/bangla.sitestree.com\/?p=16966","url_meta":{"origin":17001,"position":3},"title":"Python: Generate Expense Report from RBC MasterCard Transaction Data","author":"Sayed","date":"March 21, 2020","format":false,"excerpt":"Python: Generate Expense Report from RBC MasterCard Transaction Data For the context read the following two first Python: Read RBC Canada: Mastercard PDF Statement Transaction Data into CSV file By Sayed Ahmed medium.com Python: Merge Multiple csv files into one to facilitate reporting on transaction data over time By Sayed\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":17003,"url":"http:\/\/bangla.sitestree.com\/?p=17003","url_meta":{"origin":17001,"position":4},"title":"Python: Ecommerce: Part \u2014 2: Drop Duplicates, Sort, and Take Only Unique Products After Merging All Supplier D ata Files into One File","author":"Sayed","date":"April 19, 2020","format":false,"excerpt":"All code in One Block # # Section: Verify, and Process Supplier Data Before Sending products to # # your retail (Magento 2) or marketplace (Amazon, Walmart)# In[7]:# combined_csv.sort_values(\u201cModel Code\u201d, inplace = True) # dropping ALL duplicte values based on Product SKU = Model Codeno_duplicates_combined_csv = combined_csv.drop_duplicates(subset = \u201cModel Code\u201d,\u2026","rel":"","context":"In &quot;Build Ecommerce Software&quot;","block_context":{"text":"Build Ecommerce Software","link":"http:\/\/bangla.sitestree.com\/?cat=1912"},"img":{"alt_text":"8112223 Canada Inc. 
(Justetc)","src":"https:\/\/miro.medium.com\/fit\/c\/80\/80\/0*P_esmjKoJnHlNjFX","width":350,"height":200},"classes":[]},{"id":16968,"url":"http:\/\/bangla.sitestree.com\/?p=16968","url_meta":{"origin":17001,"position":5},"title":"Python: Read Amex (American Express) Canada: Mastercard PDF Statement Transaction Data into CSV file","author":"Sayed","date":"March 26, 2020","format":false,"excerpt":"Python: Read Amex (American Express) Canada: PDF Statement Transaction Data into CSV file By Sayed Ahmed In general, you will get CSV files from your Amex card; However, sometimes you need to download in every 3 or 6 months. Otherwise, you might only get\/request pdf files for transactions (depends on\u2026","rel":"","context":"In &quot;\u09ac\u09cd\u09b2\u0997 \u0964 Blog&quot;","block_context":{"text":"\u09ac\u09cd\u09b2\u0997 \u0964 Blog","link":"http:\/\/bangla.sitestree.com\/?cat=182"},"img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/17001","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=17001"}],"version-history":[{"count":1,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/17001\/revisions"}],"predecessor-version":[{"id":17020,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=\/wp\/v2\/posts\/17001\/revisions\/17020"}],"wp:attachment":[{"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=17001"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/inde
x.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=17001"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/bangla.sitestree.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=17001"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}