Python: Read RBC Canada Mastercard Statement Data into a CSV File

Tools such as Adobe Acrobat can convert PDF statements to CSV/XLS files. This post shows a simple approach that you can use when you do not have access to any such tools.

How can this help? It is useful when you want to run calculations on the transactions, for example to total your business or personal expenses. It can also be useful if you are writing tax or financial software. There are, of course, tools and programming languages that can read PDF files directly, instead of the copy-and-paste approach described here.

Steps:

Download the statement.
Copy the text (only the transactions) into a text file.
Then use the code provided here, or similar code, to read the text file and create a CSV file.
You can write another Python script to merge multiple CSV files.
Using the combined CSV file, you can calculate, for example, your yearly categorized expenses.
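The merge step is not shown in this post, so here is a minimal sketch of how it might look. The function name merge_csv_files and the file names in the usage note are assumptions made for illustration; the only thing taken from the post is that each statement produces one CSV file with one transaction per row.

```python
import csv
import glob
import os


def merge_csv_files(pattern, merged_path):
    """Copy the rows of every CSV file matching `pattern` into `merged_path`.

    Returns the number of rows written. (Hypothetical helper, not part of
    the original script.)
    """
    merged_abs = os.path.abspath(merged_path)
    rows_written = 0
    with open(merged_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        # sorted() keeps the merge order predictable (alphabetical by name).
        for path in sorted(glob.glob(pattern)):
            if os.path.abspath(path) == merged_abs:
                continue  # never read the file we are writing to
            with open(path, newline="") as in_file:
                for row in csv.reader(in_file):
                    if row:  # skip blank lines
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

For example, merge_csv_files("*.out.txt.csv", "all-statements.csv") would combine every per-statement file in the current folder into one file, assuming you kept the naming used in the code below.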

Sample Transaction Data
'DEC 21 DEC 24 PAYMENT – THANK YOU / PAIEMENT – MERCI\n75105398358619984366202\n-$60.00\nJAN 10 JAN 10 PAYMENT – THANK YOU / PAIEMENT – MERCI\n75105399010619980541202\n-$21.82\nJAN 10 JAN 10 PURCHASE INTEREST 19.99% \n12312312312\n-$0.82\n\n\nDEC 10 DEC 12 PRESTO VIC PARK RELOAD EAST YORK ON\n55134428345800179063337\n$10.25\nDEC 13 DEC 14 TIM HORTONS 0670 QTH TORONTO ON\n55419218347200311042837\n$1.80\n

Note:
The code assumes that each transaction spans three lines when you copy it from the PDF file into a text file: the dates and description, then the reference number, then the amount. A line with an interest charge might span only one line. Convert it into three lines, as in the example below, before running the code on it.

DEC 21 DEC 24 PAYMENT – THANK YOU / PAIEMENT – MERCI
75105398358619984366202
-$60.00
JAN 10 JAN 10 PAYMENT – THANK YOU / PAIEMENT – MERCI
75105399010619980541202
-$21.82
JAN 10 JAN 10 PURCHASE INTEREST 19.99%
12312312312
-$0.82
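If you would rather not fix the interest lines by hand, something like the following sketch could do it. It rests on an assumption that the post does not confirm: that a one-line entry holds the two dates, the description, and the amount on the same line, with no reference number. The name expand_one_line_entries and the placeholder reference "0" are made up for illustration.

```python
import re

# Assumed shape of a one-line entry: two dates, a description, and an amount
# at the end, e.g. "JAN 10 JAN 10 PURCHASE INTEREST 19.99% -$0.82".
ONE_LINE_ENTRY = re.compile(
    r"^([A-Z]{3} \d{1,2} [A-Z]{3} \d{1,2} .*?)\s+(-?\$[\d,]+\.\d{2})$"
)


def expand_one_line_entries(lines, placeholder_ref="0"):
    """Return `lines` with every one-line entry expanded to three lines.

    `placeholder_ref` stands in for the missing reference number.
    Lines that do not match the assumed shape are passed through unchanged.
    """
    expanded = []
    for line in lines:
        match = ONE_LINE_ENTRY.match(line.strip())
        if match:
            expanded.extend([match.group(1), placeholder_ref, match.group(2)])
        else:
            expanded.append(line.rstrip("\n"))
    return expanded
```

You would run this on the lines of the text file and save the result before using the main script, so that the three-lines-per-transaction assumption holds.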

The Code:
#!/usr/bin/env python
# coding: utf-8

# In[5]:

import os.path
import sys

# Open the statement text file for reading; stop early if it is missing.
path = 'dec2018-jan2019.txt'
if os.path.exists(path):
    file = open(path, 'r')
else:
    print('Fatal error: file does not exist')
    sys.exit(1)

# In[6]:

fileContent = file.read()
fileContent

# In[ ]:

#f = open("dec2018-jan2019.out.txt", "a");
#f.write("Now the file has more content!")
#f.close()

# In[3]:

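one_line = ''
count = 0
# Mode "a" appends, so delete the output file before re-running the script.
f = open("dec2018-jan2019.out.txt.csv", "a")
file = open('dec2018-jan2019.txt', 'r')
for line in file:
    line = line.rstrip("\n")
    if line != '':
        count = count + 1

        # Line 1 of 3: transaction date, posting date, description.
        if count % 3 == 1:
            # The trailing comma keeps the description and the reference
            # number in separate columns.
            line = line[0:6] + "," + line[6:13] + "," + line[13:] + ","

        # Line 2 of 3: reference number.
        if count % 3 == 2:
            line = line + ","

        # Line 3 of 3: amount; drop the minus sign on payments and credits.
        if count % 3 == 0:
            if line[0] == '-':
                line = line[1:]
            if line[0] != '$':
                print('Warning possible mistake')

        one_line = one_line + line

        # After the third line, write the finished CSV row and start over.
        if count % 3 == 0:
            # Text mode already translates "\n" to the platform line ending.
            one_line = one_line + "\n"
            print(one_line)
            f.write(one_line)
            one_line = ''

file.close()
f.close()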
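The last step in the list above, calculating categorized expenses, could be sketched roughly as follows. The keyword-to-category map and the function names are assumptions made for illustration, not part of the original script. The sketch also assumes that the description is in the third column, that the amount is in the last column, and that amounts have no thousands separator (a comma inside an amount would split it into two columns).

```python
import csv
from collections import defaultdict
from decimal import Decimal

# Hypothetical keyword-to-category map; adjust it to your own merchants.
CATEGORY_KEYWORDS = {
    "PRESTO": "Transit",
    "TIM HORTONS": "Food",
    "PAYMENT": "Payment",
    "INTEREST": "Interest",
}


def categorize(description):
    """Return the category of the first keyword found in the description."""
    upper = description.upper()
    for keyword, category in CATEGORY_KEYWORDS.items():
        if keyword in upper:
            return category
    return "Other"


def totals_by_category(csv_path):
    """Sum the amount column of the CSV file per category."""
    totals = defaultdict(Decimal)
    with open(csv_path, newline="") as csv_file:
        for row in csv.reader(csv_file):
            if len(row) < 4:
                continue  # skip blank or malformed rows
            description = row[2]
            # Amounts look like "$10.25"; strip any sign and the dollar sign.
            amount = Decimal(row[-1].strip().lstrip("-").lstrip("$"))
            totals[categorize(description)] += amount
    return dict(totals)
```

Because the script above drops the minus sign, payments show up as positive totals under their own category, so keep them separate from purchases when you add things up.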

Sayed Ahmed

BSc. Eng. in Comp. Sc. & Eng. (BUET)
MSc. in Comp. Sc. (U of Manitoba, Canada)
MSc. in Data Science and Analytics (Ryerson University, Canada)
Linkedin: https://ca.linkedin.com/in/sayedjustetc
