Python: Merge Multiple CSV Files into One to Facilitate Reporting on Transaction Data over Time

By Sayed Ahmed

Merge multiple transaction files into one. This is an extension to the article:

Python: Read RBC Canada: Mastercard PDF Statement Transaction Data into CSV file
By Sayed Ahmed, medium.com

The Code for Merging

#!/usr/bin/env python
# coding: utf-8
# In[1]:
import os
import glob
import pandas as pd
# Folder that holds the CSV files to merge
data_folder = './'
os.chdir(data_folder)
# In[2]:
extension = 'csv'
# Sort the names so the files are always combined in the same order
all_filenames = sorted(glob.glob('*.{}'.format(extension)))
# In[ ]:
all_filenames
# In[3]:
# Test: check the data in each CSV file to be combined.
# Do the columns align well with each other?
row_total_count = 0
for f in all_filenames:
    # header=None matches how the files are read for the merge below
    df_s = pd.read_csv(f, header=None)
    print(df_s.shape, f)
    row_total_count += df_s.shape[0]
    print(df_s.head())
# In[10]:
# Keep track of the total rows in all files so that you can compare
# the total with the shape of the final combined data file
row_total_count = 0
for f in all_filenames:
    df_s = pd.read_csv(f, header=None)
    print(df_s.shape, f)
    row_total_count += df_s.shape[0]
row_total_count
# In[15]:
# Combine all files in the list; axis=0 stacks them one after another
combined_csv = pd.concat([pd.read_csv(f, header=None) for f in all_filenames], axis=0)
# combined_csv.sort_values("Model Code", inplace=True)
# dropping ALL duplicate values
# combined_csv.drop_duplicates(subset="Model Code", keep=False, inplace=True)
# Export to CSV. The output name has no .csv extension, so the '*.csv'
# glob above does not pick it up if the script is run again.
combined_csv.to_csv("rbc_mastercard_data_combined", index=False, encoding='utf-8-sig')
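# In[16]:
combined_csv.shape
# In[17]:
# Sanity check: the combined row count should equal the per-file total
row_total_count == combined_csv.shape[0]
# In[19]:
# Read the combined file back to confirm that it was written correctly
df = pd.read_csv('rbc_mastercard_data_combined')
df.head(100)
# In[ ]:
df.shape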
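
The notebook cells above can also be wrapped into one reusable function. The following is only a minimal sketch, not the code from the original notebook: the function name merge_csv_files and its parameters folder, output_path and pattern are hypothetical. It assumes the input files have no header row (as with header=None above), and, unlike the cells above, it writes the output without a header row and skips the output file if it happens to match the pattern.

```python
import glob
import os

import pandas as pd


def merge_csv_files(folder, output_path, pattern="*.csv"):
    """Stack all header-less CSV files in `folder` and write them to `output_path`.

    Hypothetical helper for illustration. Returns the combined DataFrame.
    Raises ValueError if no file matches the pattern or if the combined
    row count differs from the sum of the per-file row counts.
    """
    output_abs = os.path.abspath(output_path)
    # Sort for a deterministic order, and skip the output file itself in
    # case it lives in the same folder and matches the pattern.
    filenames = sorted(
        f for f in glob.glob(os.path.join(folder, pattern))
        if os.path.abspath(f) != output_abs
    )
    if not filenames:
        raise ValueError("no files matching {!r} in {!r}".format(pattern, folder))

    # header=None: every line is treated as data (assumption: no header rows).
    frames = [pd.read_csv(f, header=None) for f in filenames]
    expected_rows = sum(frame.shape[0] for frame in frames)

    # axis=0 stacks the frames one after another; ignore_index renumbers rows.
    combined = pd.concat(frames, axis=0, ignore_index=True)

    # Same sanity check as in the notebook: no rows lost or added.
    if combined.shape[0] != expected_rows:
        raise ValueError("row count mismatch after concatenation")

    # header=False keeps the output header-less, like the inputs.
    combined.to_csv(output_path, index=False, header=False, encoding="utf-8-sig")
    return combined
```

For the folder used in this article, a call could look like merge_csv_files('./', 'rbc_mastercard_data_combined'). Because the output has no header row here, read it back with pd.read_csv(..., header=None).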

Final Output

[Image: final output]

Posted On:

https://medium.com/@SayedAhmedCanada/python-merge-multiple-csv-files-into-one-to-facilitate-reporting-on-transaction-data-over-time-2e4524355d43

Note: Older short-notes from this site are posted on Medium: https://medium.com/@SayedAhmedCanada

Sayed Ahmed

BSc. Eng. in Comp. Sc. & Eng. (BUET)

MSc. in Comp. Sc. (U of Manitoba, Canada)

MSc. in Data Science and Analytics (Ryerson University, Canada)
