Justetc Social Services (non-profit) · Jan 31 · 29 min read
Indicator visualizations are in a separate notebook (visualize-indicators-final.ipynb).
Visualizations are created through the UI: select the options, then execute the code block that follows to produce the plots.
A separate section at the end of this file reproduces the research questions and plots from my documents and presentations; the code there generates the plots for that section.
For the research question section, the corresponding health status measure needs to be selected and the data reloaded before re-executing (the other UI selections do not matter).
# !conda install basemap
from __future__ import print_function  # must be the first statement in its cell
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# it is important
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
# from geopy.geocoders import Nominatim
# for geo maps
# !conda install basemap
import conda
# point PROJ_LIB at the projection data that basemap needs
if 'PROJ_DIR' in os.environ:
    pyproj_datadir = os.environ['PROJ_DIR']
else:
    conda_file_dir = conda.CONDA_PACKAGE_ROOT
    conda_dir = conda_file_dir.split('lib')[0]
    pyproj_datadir = os.path.join(os.path.join(conda_dir, 'Library'), 'share')
os.environ['PROJ_LIB'] = pyproj_datadir
# important
# from mpl_toolkits.basemap import Basemap
Location of the Data Files Folder
The data folder holds one data file per aspect of healthcare performance. I load the file names (aspects) into a drop-down and act on the selection. Hence, new data files with a similar structure can likely be integrated into the visualizations easily.
import os
data_folder = './data/'
measure_files = os.listdir(data_folder)
measure_files
['.~lock.health-status.xls#',
 'access-to-care.xls',
 'health-status.xls',
 'non-med-determinants.xls',
 'patient-safety.xls',
 'prescribing-primary.xls',
 'quality-of-care.xls']
Select a measure/aspect to visualize
If the list contains '.~lock.health-status.xls#' or a similar lock-file entry, ignore it.
Note: when the health aspect (i.e. measure) is changed in the drop-down below, some code needs to be re-executed to load the related data.
I mark that code with START-RELOAD-DATA and END-RELOAD-DATA.
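The lock-file entry mentioned above could also be filtered out before the names reach the drop-down. The following is a minimal sketch, not the notebook's own code: `list_measure_files` and its `extensions` parameter are hypothetical names, and the `.~lock.` prefix is assumed from the one entry seen in the listing.

```python
def list_measure_files(names, extensions=(".xls", ".xlsx")):
    """Return sorted data-file names, skipping lock files and unrelated files."""
    keep = []
    for name in names:
        if name.startswith(".~lock."):
            # editor lock file such as '.~lock.health-status.xls#'
            continue
        if not name.lower().endswith(extensions):
            # anything that is not a spreadsheet
            continue
        keep.append(name)
    return sorted(keep)


example = ['.~lock.health-status.xls#', 'health-status.xls', 'access-to-care.xls']
print(list_measure_files(example))
```

In the notebook this would presumably be called as `list_measure_files(os.listdir(data_folder))`, so the drop-down only ever offers loadable files.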
print('Select a health measure/aspect to visualize\n')
# create the interactive interface
def f(measure):
    return measure
print('Select a measure:')
measure_file = interactive(f, measure=measure_files)
display(measure_file)
Select a health measure/aspect to visualize
Select a measure:
interactive(children=(Dropdown(description='measure', options=('.~lock.health-status.xls#', 'access-to-care.xl…
Test what aspect/measure we have selected
START-RELOAD-DATA
'Selected: ' + measure_file.result
'Selected: access-to-care.xls'
if (measure_file.result == ''):
    measure_file.result = 'health-status.xls'
Load Data and Display
data_file = data_folder + measure_file.result
measure_data = pd.read_excel(data_file)
measure_data.head()
Find all the performance indicators under this aspect/measure,
regardless of whether we have data for Canada
# find details on Indicators
# find all indicators
# note: set_index returns a new frame; the result is not kept, so this line has no effect
measure_data.set_index(['Indicator'])
indicators = pd.Index(measure_data['Indicator']).unique()
indicators
Index(['Inability to Pay for Medical Bills', 'Poor Weekend/Evening Care',
'Regular Doctor', 'Same or Next Day Appt',
'Wait Time: Cataract Surgery', 'Wait Time: Hip Replacement',
'Wait Time: Knee Replacement', 'Wait Time: Specialist'],
dtype='object', name='Indicator')
Find the performance indicators under this aspect/measure for which Canada has data
# indicators that must have Canadian data
# first, the rows where Canada exists
data_with_canada = measure_data[measure_data['Region'] == 'Canada']
# indicators for which Canada also has data
indicators_with_canada = data_with_canada['Indicator'].unique()
print('All indicators\n', indicators)
print('Indicators where canada also has Data\n', indicators_with_canada)
'Indicator Counts', len(indicators_with_canada), len(indicators)
All indicators
Index(['Inability to Pay for Medical Bills', 'Poor Weekend/Evening Care',
'Regular Doctor', 'Same or Next Day Appt',
'Wait Time: Cataract Surgery', 'Wait Time: Hip Replacement',
'Wait Time: Knee Replacement', 'Wait Time: Specialist'],
dtype='object', name='Indicator')
Indicators where canada also has Data
['Inability to Pay for Medical Bills' 'Poor Weekend/Evening Care'
'Regular Doctor' 'Same or Next Day Appt' 'Wait Time: Cataract Surgery'
'Wait Time: Hip Replacement' 'Wait Time: Knee Replacement'
'Wait Time: Specialist']
('Indicator Counts', 8, 8)
Find the years that appear in the data
All years, and separately the years for which we have data for Canada
# find all years
years = measure_data['Data year'].dropna().unique()
years_with_canada = data_with_canada['Data year'].dropna().unique()
print('Years we have data\n', years)
print('Years where canada also has data\n', years_with_canada)
Years we have data
['2016' 2016 'Not applicable' '2015' '2013']
Years where canada also has data
[2016]
Years when we have data for Canada
# sort years for all data
years = [ int(aYear) for aYear in years if (aYear != 'Not applicable') and len( str(aYear).split(' ')) <= 1 ]
years = sorted(years)
years
all_years = [0] + years
all_years
[0, 2013, 2015, 2016, 2016]
Sort the years so that the drop-down can show them in ascending order
0 means all years are selected
# sort years
years_with_canada = [ int(aYear) for aYear in years_with_canada if (aYear != 'Not applicable') and len( str(aYear).split(' ')) <= 1 ]
years_with_canada = sorted(years_with_canada)
years_with_canada
all_years_canada = [0] + years_with_canada
all_years_canada
[0, 2016]
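The list of all years above contains 2016 twice because the raw column holds both the string '2016' and the integer 2016. A small helper can normalise and de-duplicate the values in one step. This is a sketch under assumptions: `clean_years` is a hypothetical name, and it keeps only values whose text is purely digits, so 'Not applicable' and multi-word entries are dropped (float years such as 2016.0 would be dropped too).

```python
def clean_years(raw_years):
    """Return the sorted, de-duplicated integer years found in raw_years."""
    years = set()
    for value in raw_years:
        text = str(value).strip()
        if text.isdigit():  # skips 'Not applicable' and multi-word entries
            years.add(int(text))
    return sorted(years)


raw = ['2016', 2016, 'Not applicable', '2015', '2013']
all_years = [0] + clean_years(raw)  # 0 still stands for "all years"
print(all_years)  # [0, 2013, 2015, 2016]
```

Note that a filter such as `measure_data['Data year'] == 2016` only matches rows that store the integer, so the column itself would likely need the same normalisation before filtering by year.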
All countries
countries = measure_data[ (measure_data['Type of region'] == 'Country') | (measure_data['Type of region'] == 'Canada') ]['Region']
countries = sorted(countries.unique())
#print(countries)
# bring Canada to the top, without listing it twice
if 'Canada' in countries:
    countries = ['Canada'] + [c for c in countries if c != 'Canada']
countries[:1]
['Canada']
Unique color code for each country
Color will be used as a third dimension in some plots
# countries unique color code
all_regions = measure_data['Region'].unique()
region_colors = []
region_colors_dict = {}
import random
random.seed(0)  # note: this seeds only the stdlib generator; np.random below is unaffected, so the codes change between runs
for aRegion in all_regions:
    region_colors_dict[aRegion] = np.random.randint(0, 255)
list(region_colors_dict.keys())[:5], list(region_colors_dict.values())[:5]
(['United Kingdom',
'Saskatchewan',
'Germany',
'British Columbia',
'Australia'],
[227, 16, 56, 193, 232])
# Province unique color code
all_provinces = measure_data[ measure_data['Type of region'] == 'Province' ]['Region'].unique()
province_colors = []
province_colors_dict = {}
import random
random.seed(0)
for aProvince in all_provinces:
    province_colors_dict[aProvince] = np.random.randint(0, 255)
list(province_colors_dict.keys()), list(province_colors_dict.values())
(['Saskatchewan',
'British Columbia',
'Newfoundland and Labrador',
'Manitoba',
'Alberta',
'Quebec',
'New Brunswick',
'Ontario',
'Nova Scotia',
'Prince Edward Island'],
[172, 161, 91, 156, 3, 216, 153, 234, 191, 96])
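The color codes above are random draws, so two regions can receive the same code and the codes differ from run to run. One alternative is to spread the codes evenly over the range, as in the rough sketch below. `assign_color_codes` is a hypothetical helper, not part of the notebook, and it assumes the same 0 to 255 integer range used above.

```python
def assign_color_codes(regions, low=0, high=255):
    """Map each region to an evenly spaced integer code in [low, high]."""
    ordered = sorted(set(regions))  # a stable order gives stable codes
    if len(ordered) == 1:
        return {ordered[0]: low}
    # with n regions, consecutive codes are (high - low) / (n - 1) apart
    step = (high - low) / (len(ordered) - 1)
    return {region: round(low + i * step) for i, region in enumerate(ordered)}


codes = assign_color_codes(['Canada', 'Germany', 'Australia'])
print(codes)
```

The codes stay distinct as long as there are no more regions than integers in the range (256 here), and the same input always produces the same mapping.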
Latitude and longitude for all regions (countries, provinces)
These will be used in the map/geo plots
# latitude and longitude information
from geopy.geocoders import Nominatim  # this import was commented out above, which raised a NameError here
geolocator = Nominatim(user_agent="chrome")
lats, lons = [], []
lats_dict = {}
lons_dict = {}
#for aCountry in countries:
for aRegion in all_regions:
    try:
        location = geolocator.geocode(aRegion)
        lats.append(location.latitude)
        lons.append(location.longitude)
        lats_dict[aRegion] = location.latitude
        lons_dict[aRegion] = location.longitude
    except Exception:
        # skip regions that cannot be geocoded (no match, or the request failed)
        continue
lats_dict['Canada']
#list(zip(lats, lons))
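The geocoding loop above silently skips any region it cannot resolve. A variant that records the misses makes it easier to see which regions will be absent from the map. This is a sketch under assumptions: `geocode_regions` is a hypothetical helper that takes the lookup function as an argument, and `fake_geocode` with its placeholder coordinates stands in for a real geocoder so the example runs offline.

```python
from collections import namedtuple

# Stand-in for the object a geocoder returns; only these two attributes are used.
Location = namedtuple("Location", ["latitude", "longitude"])


def geocode_regions(regions, geocode):
    """Return ({region: (lat, lon)}, [regions that could not be resolved])."""
    coords, missing = {}, []
    for region in regions:
        try:
            location = geocode(region)
        except Exception:
            location = None  # treat a failed request like "not found"
        if location is None:
            missing.append(region)
            continue
        coords[region] = (location.latitude, location.longitude)
    return coords, missing


# Hypothetical offline geocoder with placeholder coordinates, for this demo only.
_FAKE = {"Canada": Location(56.0, -96.0)}


def fake_geocode(name):
    return _FAKE.get(name)


coords, missing = geocode_regions(["Canada", "OECD average"], fake_geocode)
print(coords, missing)
```

With geopy installed, the same function could presumably be called as `geocode_regions(all_regions, geolocator.geocode)`; entries such as 'OECD average' would then be expected to land in `missing`.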
Test to check the OECD data (benchmark data)
for anIndicator in indicators:
    oecd_data = measure_data.loc[ (measure_data['Indicator'] == anIndicator) & (measure_data['Region'] == 'OECD average') ]
    #oecd_data['Value'].tolist()[0]
    print(oecd_data[ ['Indicator', 'Value', 'Data year' ]])
#print(all_years)
for aYear in all_years:
    indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicators[0]) & ( (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ) | \
        ( (measure_data['Indicator'] == indicators[0]) & (measure_data['Region'] == 'OECD average')) ]
    #print(indicator_data[ ['Indicator', 'Value', 'Data year' ]])
    #print(indicator_data['Value'])
Indicator Value Data year
17 Inability to Pay for Medical Bills 8.7 Not applicable
Indicator Value Data year
27 Poor Weekend/Evening Care 54.1 Not applicable
Indicator Value Data year
57 Regular Doctor 94.6 Not applicable
Indicator Value Data year
84 Same or Next Day Appt 59.2 Not applicable
Indicator Value Data year
110 Wait Time: Cataract Surgery 98.0 Not applicable
Indicator Value Data year
128 Wait Time: Hip Replacement 115.2 Not applicable
Indicator Value Data year
161 Wait Time: Knee Replacement 204.6 Not applicable
Indicator Value Data year
174 Wait Time: Specialist 41.9 Not applicable
END-RELOAD-DATA
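The 'OECD average' rows checked above serve as the benchmark, and the bubble plot over years later expresses each value as a multiple of it. The idea can be sketched without pandas as follows. `relative_to_benchmark` is a hypothetical helper, the rows are plain dicts standing in for `measure_data`, and the Canada and Germany values are made up for illustration; only the 8.7 benchmark comes from the output above.

```python
def relative_to_benchmark(rows, indicator, benchmark_region="OECD average"):
    """Return {region: value / benchmark} for one indicator."""
    benchmark = None
    for row in rows:
        if row["Indicator"] == indicator and row["Region"] == benchmark_region:
            benchmark = row["Value"]
            break
    if benchmark is None:
        # fail with a clear message instead of an IndexError on an empty list
        raise LookupError("no benchmark row for " + indicator)
    return {
        row["Region"]: row["Value"] / benchmark
        for row in rows
        if row["Indicator"] == indicator and row["Region"] != benchmark_region
    }


name = "Inability to Pay for Medical Bills"
rows = [
    {"Indicator": name, "Region": "OECD average", "Value": 8.7},
    {"Indicator": name, "Region": "Canada", "Value": 17.4},   # made-up value
    {"Indicator": name, "Region": "Germany", "Value": 4.35},  # made-up value
]
ratios = relative_to_benchmark(rows, name)
print(ratios)
```

A ratio above 1 means the region's value is higher than the OECD average; whether that is good or bad depends on the indicator.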
Method to plot over years or only for a year.
# Note: not all combinations of UI selections may work; covering them all would require extensive testing and adjustment (against the data and real-life use)
# Country may be represented by color when applicable
# Supports plot types: Bubble, Line, Bar, Hor Bar, Pie
# Other plot types can also be added. Once implemented, they will work for all measures and indicators (given data in the proper format)
# Note: I know the code here can be improved; similar code appears in multiple places and could be reduced
#plt.rcParams['figure.figsize'] = [10, 15]
def plot_measure_by_years(year, indicator, bubble_scale, chart_type = '', ratios=[10, 1], animate=False, provincial_only=False, all_years=all_years, fig_size=[10, 10], sec_fig=True):
plt.rcParams['figure.figsize'] = fig_size
# redundant code to address a last-minute bug
# countries unique color code
all_regions = measure_data['Region'].unique()
region_colors = []
region_colors_dict = {}
import random
random.seed(0)
for aRegion in all_regions:
region_colors_dict[aRegion] = np.random.randint(0, 255)
# print('year, indicator', year, indicator)
# keep these, I might need this
"""
plt.ion()
f = plt.figure()
ax = f.gca()
#top box
f.show()
"""
# benchmark data
oecd_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Region'] == 'OECD average') ]
#benchmark_value = oecd_data['Value']
#print(type(benchmark_value), benchmark_value)
benchmark_value = oecd_data['Value'].tolist()[0]
# one improvement that can be made: usually I kept, two plots side by side. where the right one shows country and color
# as in other cases we do not need the right one, code to hide the right one or just to create and use one subplot is more
# appropriate
# if (sec_fig == True):
fig, axs = plt.subplots(1, 2, figsize=fig_size, sharey=False, gridspec_kw = {'width_ratios':ratios})
#else:
#axs = plt.subplots(1, 1, figsize=fig_size )
#plt.xticks(rotation=90)
# for one year
if ( year > 1 ):
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & ( (measure_data['Data year'] == year) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ) | \
( (measure_data['Indicator'] == indicator) & (measure_data['Region'] == 'OECD average')) ]
if (provincial_only == True):
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & \
(measure_data['Data year'] == year) & (measure_data['Type of region'].isin(['Province']))]
#print(indicator_data)
# for country color codes
if (provincial_only == False):
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
c_code = [ x[0:3] for x in indicator_data['Region'] ]
else:
#province_colors_dict
c = [ province_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
c_code = [ x[0:3] for x in indicator_data['Region'] ]
region_colors_dict = province_colors_dict
# this block might not apply to anything for one single year plot
color_as_a_dimension = False
if color_as_a_dimension == True:
axs[1].scatter([1]*len(c), [i*5 for i in range(len(c))], s=300, c=c, marker='^')
count = list(region_colors_dict.keys())
#print(count)
for j in range(len(c)):
axs[1].annotate(count[j], (1.0001, j*5))
axs[1].set_xlabel('Country and Colors')
axs[1].set_ylabel('')
axs[1].set_xticks([])
axs[1].set_yticks([])
if chart_type == 'Line':
# best to use only for one year
axs[0].plot(indicator_data['Value'], indicator_data['Region']) #, c=c
axs[0].set_xticklabels(indicator_data['Value'], rotation=90) # can be turned off
# https://stackoverflow.com/questions/10998621/rotate-axis-text-in-python-matplotlib
plt.suptitle(indicator + ' For a Year')
axs[0].set_xlabel('Values')
axs[0].set_ylabel('Countries/Regions')
fig.savefig('./saved_images_from_visualizations/' + 'line_' +indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
fig, axs = plt.subplots(1, 2, figsize=fig_size, sharey=False, gridspec_kw = {'width_ratios':ratios})
axs[0].plot(indicator_data['Region'], indicator_data['Value']) #, c=c
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
plt.suptitle(indicator + ' Over a Year')
axs[0].set_xlabel('Regions/Countries')
axs[0].set_ylabel('Values')
#fig.savefig('./saved_images_from_visualizations/cancer_mortality_2017_country_region_x.png')
fig.savefig('./saved_images_from_visualizations/' + 'line_' + indicator.replace(' ', '_')[0:12] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
elif chart_type == 'Bar':
#fig, axs = plt.subplots(1, 2, figsize=(10, 8), sharey=False, gridspec_kw = {'width_ratios':ratios})
axs[0].bar(indicator_data['Region'], indicator_data['Value']) #, c=c
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
plt.suptitle(indicator + ' Over a Year')
axs[0].set_xlabel('Regions/Countries')
axs[0].set_ylabel('Values')
#fig.savefig('./saved_images_from_visualizations/cancer_mortality_2017_country_region_x.png')
fig.savefig('./saved_images_from_visualizations/' + 'bar_' + indicator.replace(' ', '_')[0:12] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
elif chart_type == 'Hor Bar':
#fig, axs = plt.subplots(1, 2, figsize=(10, 8), sharey=False, gridspec_kw = {'width_ratios':ratios})
axs[0].barh(indicator_data['Region'], indicator_data['Value']) #, c=c
#axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
plt.suptitle(indicator + ' Over a Year')
axs[0].set_xlabel('Values')
axs[0].set_ylabel('Regions/Countries')
#fig.savefig('./saved_images_from_visualizations/cancer_mortality_2017_country_region_x.png')
fig.savefig('./saved_images_from_visualizations/' + 'hor_bar' + indicator.replace(' ', '_')[0:12] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
elif chart_type == 'Pie':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
#ax.pie(indicator_data['Value'], labels=indicator_data['Data year'], autopct="%1.1f%%")
axs[0].pie(indicator_data['Value'], labels=indicator_data['Region']) #, c=c
axs[0].set_xticklabels(indicator_data['Value'], rotation=90) # can be turned off
# https://stackoverflow.com/questions/10998621/rotate-axis-text-in-python-matplotlib
plt.suptitle(indicator + ' For a Year')
axs[0].set_xlabel('Values')
axs[0].set_ylabel('Countries/Regions')
fig.savefig('./saved_images_from_visualizations/' + 'pie_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
else:
#print(indicator_data)
# for country color codes
if (provincial_only == False):
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
else:
#province_colors_dict
c = [ province_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
#c_code = [ x[0:3] for x in indicator_data['Region'] ]
region_colors_dict = province_colors_dict
#ax.scatter(indicator_data['Data year'], indicator_data['Region'], s=indicator_data['Value'] * bubble_scale, c=c )
axs[0].scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale, c=c )
#axs[0].set_xticks([year])
#axs[0].set_yticks(indicator_data['Value'])
plt.suptitle(indicator + ' Over a Year ' + str(year) )
axs[0].set_xlabel('Regions')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
axs[1].scatter([1]*len(c), [i*5 for i in range(len(c))], s=300, c=c, marker='^')
count = list(region_colors_dict.keys())
#print(count)
for j in range(len(c)):
axs[1].annotate(count[j], (1.0001, j*5))
axs[1].set_xlabel('Country and Colors')
axs[1].set_ylabel('')
axs[1].set_xticks([])
axs[1].set_yticks([])
fig.savefig('./saved_images_from_visualizations/' + 'bubble_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
#plt.show()
else:
# for multiple years
# though this method is primarily to use for one year, unless in some specific cases
"""
plt.ion()
f = plt.figure()
#ax = f.gca()
#top box
f.show()
"""
# redundant code to address a last-minute bug
# countries unique color code
all_regions = measure_data['Region'].unique()
region_colors = []
region_colors_dict = {}
import random
random.seed(0)
for aRegion in all_regions:
region_colors_dict[aRegion] = np.random.randint(0, 255)
#list(region_colors_dict.keys())[:5], list(region_colors_dict.values())[:5]
#print(all_years)
for aYear in all_years:
if aYear == 0:
continue
if aYear == 'Not applicable':
continue
# older : original #
#indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) ]
"""
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & ( (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ) | \
( (measure_data['Indicator'] == indicator) & (measure_data['Region'] == 'OECD average')) ]
"""
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if (provincial_only == True):
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & \
(measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Province']))]
#print(indicator_data)
#print(indicator_data)
# for country color codes
#print(region_colors_dict)
if (provincial_only == False):
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
else:
#province_colors_dict
c = [ province_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
#c_code = [ x[0:3] for x in indicator_data['Region'] ]
region_colors_dict = province_colors_dict
#c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
#m = [ x for x in indicator_data['Region'] ]
#print( list(zip(m,c)))
if chart_type == 'Line':
# plt.plot(indicator_data['Data year'], indicator_data['Value'], c=c)
# plt.xticks(indicator_data['Value'])
# was here axs[0].plot(indicator_data['Data year'], indicator_data['Value']) #, c=c
# was here axs[0].set_xticks(indicator_data['Value'])
# brought from one year
# best to use only for one year
#plt.plot(indicator_data['Data year'], indicator_data['Value'], c=c)
#plt.xticks(indicator_data['Value'])
#axs[0].plot(indicator_data['Data year'], indicator_data['Value']) #, c=c
axs[0].plot(indicator_data['Data year'], indicator_data['Value']) #, c=c
#axs[0].plot(165, 'OECD', color='Red')
#axs[0].set_xticklabels(indicator_data['Data year'], rotation=90) # can be turned off
# https://stackoverflow.com/questions/10998621/rotate-axis-text-in-python-matplotlib
#plt.suptitle(indicator + ' For a Year')
#axs[0].set_xlabel('Years')
#axs[0].set_ylabel('Values')
fig.savefig('./saved_images_from_visualizations/cancer_mortality_years_country.png')
#plt.show()
"""
fig, axs = plt.subplots(1, 2, figsize=(10, 8), sharey=False, gridspec_kw = {'width_ratios':ratios})
axs[0].plot(indicator_data['Region'], indicator_data['Value']) #, c=c
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
plt.suptitle(indicator + ' Over a Year')
axs[0].set_xlabel('Regions/Countries')
axs[0].set_ylabel('Values')
fig.savefig('./saved_images_from_visualizations/cancer_mortality_2017_country_region_x.png')
plt.show()
"""
# end of brought from one year
#f.canvas.draw()
elif chart_type == 'Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
plt.bar(indicator_data['Data year'], indicator_data['Value'])
elif chart_type == 'Hor Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
plt.barh(indicator_data['Region'], indicator_data['Value'])
elif chart_type == 'Pie':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
axs[0].pie(indicator_data['Value'], labels=indicator_data['Data year'], autopct="%1.1f%%")  # 'ax' is not defined in this function; draw on the left subplot
else:
indicator_data = indicator_data.assign(Value=indicator_data['Value'].div(benchmark_value))  # assign() avoids writing into a slice of measure_data
# for country color codes
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
axs[0].scatter(indicator_data['Data year'], indicator_data['Value'], s = indicator_data['Value'] * bubble_scale, c=c)
plt.suptitle(indicator + ' Over Years \n Values are in multiples of Benchmark (' + str(benchmark_value) + ')' )
axs[0].set_xlabel('Years')
axs[0].set_ylabel('Values/Benchmark Value (' + str(benchmark_value) + ')' )
fig.savefig('./saved_images_from_visualizations/transport_mortality_over_years.png')
#plt.show()
# show country colors
axs[1].scatter([1]*len(c), [i*5 for i in range(len(c))], s=300, c=c, marker='o')
count = list(m) #list(region_colors_dict.keys())
#print(count)
for j in range(len(c)):
axs[1].annotate(count[j], (1, j*5))
axs[1].set_xlabel('Country and Colors')
axs[1].set_ylabel('')
axs[1].set_xticks([])
axs[1].set_yticks([])
if (animate):
plt.pause(0.01)
#f.canvas.draw()
#plt.show()
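The function above builds each output filename inline, and indicator names such as 'Wait Time: Hip Replacement' leave a colon in the name, which some file systems reject. A small helper could centralise this. The sketch below is an assumption-laden alternative, not the notebook's code: `image_path` is a hypothetical name, and it drops the random suffix, so repeated plots of the same indicator overwrite each other.

```python
import os
import re


def image_path(chart_type, indicator, folder="./saved_images_from_visualizations", max_len=12):
    """Build '<folder>/<chart_type>_<slug>.png' with a file-system-safe slug."""
    # collapse every run of non-alphanumeric characters (spaces, ':', '/') into '_'
    slug = re.sub(r"[^A-Za-z0-9]+", "_", indicator).strip("_")[:max_len]
    return os.path.join(folder, chart_type + "_" + slug + ".png")


print(image_path("line", "Wait Time: Hip Replacement"))
```

A call such as `fig.savefig(image_path('bar', indicator))` would then replace the repeated string concatenation; the folder still has to exist, for example via `os.makedirs(folder, exist_ok=True)`.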
Plot over a region for different countries
Some of the plots this method generates can also be generated with the method above. This method is mostly meant for averaging over years. However, the dataset might not have data for every year; hence, for the visualization to be meaningful, the selected years need to make sense in real-world terms. Ideally, I would let users select multiple individual years and show them which data exist and which selections are appropriate (I consider that out of scope for now).
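Because not every region has data for every year, it can help to report which years actually went into each average. The sketch below illustrates this with plain tuples instead of the data frame. `mean_by_region` is a hypothetical helper, and the sample values are made up.

```python
from collections import defaultdict


def mean_by_region(rows, years=None):
    """rows: iterable of (region, year, value).

    Returns {region: (mean value, sorted list of years that contributed)}.
    If years is given, only those years are averaged.
    """
    pairs_by_region = defaultdict(list)
    for region, year, value in rows:
        if value is None:
            continue  # no measurement for this region and year
        if years is not None and year not in years:
            continue
        pairs_by_region[region].append((year, value))
    return {
        region: (sum(v for _, v in pairs) / len(pairs), sorted(y for y, _ in pairs))
        for region, pairs in pairs_by_region.items()
    }


sample = [  # made-up values
    ("Canada", 2015, 10.0),
    ("Canada", 2016, 20.0),
    ("Germany", 2016, 8.0),
    ("France", 2016, None),
]
print(mean_by_region(sample))
```

Here Canada's mean covers 2015 and 2016 while Germany's covers 2016 only, which is exactly the kind of mismatch a reader of the averaged plot should know about.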
plt.rcParams['figure.figsize'] = [10, 10]
def plot_measure_by_regions(year, indicator, bubble_scale, chart_type = '', ratios=[3,1], provincial_only=False, all_years=all_years):
#print('year, indicator', year, indicator)
plt.ion()
# now f = plt.figure()
# now ax = f.gca()
#plt.xticks([])
# now f.show()
# indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicators[0] ) ]
# (20, 10)
fig, axs = plt.subplots(1, 2, figsize=(20, 10), sharey=False, gridspec_kw = {'width_ratios':ratios})
#plt.xticks(rotation=90)
if ( year > 1 ):
#indicator_data = indicator_data [ indicator_data['Data year'] == year]
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == year) ]
# for country color codes
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
c_code = [ x[0:3] for x in indicator_data['Region'] ]
#print(indicator_data)
if chart_type == 'Line':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.plot(indicator_data['Region'], indicator_data['Value'])
axs[0].plot(indicator_data['Region'], indicator_data['Value']) #, s=indicator_data['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'line_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.bar(indicator_data['Region'], indicator_data['Value'])
indicator_data = indicator_data.sort_values(by=['Value'])
axs[0].bar(indicator_data['Region'], indicator_data['Value']) #, s=indicator_data['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'bar_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Hor Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
indicator_data = indicator_data.sort_values(by=['Value'])
axs[0].barh(indicator_data['Region'], indicator_data['Value']) #, s=indicator_data['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
#axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'hor_bar_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Pie':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
#ax.pie(indicator_data['Value'], labels=indicator_data['Region'], autopct="%1.1f%%")
axs[0].pie(indicator_data['Value'], labels = indicator_data['Region'], ) #, s=indicator_data['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'pie_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
else:
axs[0].scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
# plt.show()
# show country colors
axs[1].scatter([1]*len(c), [i*5 for i in range(len(c))], s=300, c=c, marker='o')
count = list(m) #list(region_colors_dict.keys())
#print(count)
for j in range(len(c)):
axs[1].annotate(count[j], (1, j*5))
axs[1].set_xlabel('Country and Colors')
axs[1].set_ylabel('')
axs[1].set_xticks([])
axs[1].set_yticks([])
fig.savefig('./saved_images_from_visualizations/' + 'bubble_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
# now plt.show()
# multiple year selected
else:
#indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if (provincial_only == True):
# no year filter here: this branch averages over all years (aYear is not defined inside this function)
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & \
(measure_data['Type of region'].isin(['Province']))]
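# for country color codes: one entry per region, in the same sorted order that groupby produces below
regions_sorted = sorted(indicator_data['Region'].unique())
c = [ region_colors_dict[r] for r in regions_sorted ]
m = list(regions_sorted)
c_code = [ r[0:3] for r in regions_sorted ]
indicator_data = indicator_data.set_index(['Region'])
# average only the numeric Value column; 'Data year' mixes strings and numbers
x = indicator_data.groupby(['Region'])[['Value']].mean()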
#print(x.index, x['Value'])
### for aYear in all_years:
# now indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) ]
"""
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if (provincial_only == True):
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & \
(measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Province']))]
indicator_data = indicator_data.set_index(['Region'])
indicator_data['mean'] = indicator_data.groupby(['Region']).mean()
print(indicator_data)
"""
if chart_type == 'Line':
axs[0].plot(x.index, x['Value']) #, s = x['Value'] * bubble_scale, c=c )
# though I am repeating this block of code - this can be just placed at the end of the else block
# just trying to save on debug time
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(x.index, rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'line_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
# now plt.bar(indicator_data['Region'], indicator_data['Value'])
x = x.sort_values(by=['Value'])
axs[0].bar(x.index, x['Value']) #, s = x['Value'] * bubble_scale, c=c )
# though I am repeating this block of code - this can be just placed at the end of the else block
# just trying to save on debug time
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
axs[0].set_xticklabels(x.index, rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'bar_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Hor Bar':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
x = x.sort_values(by=['Value'])
axs[0].barh(x.index, x['Value']) #, s = x['Value'] * bubble_scale, c=c )
# though I am repeating this block of code - this can be just placed at the end of the else block
# just trying to save on debug time
plt.suptitle(indicator + ' Over Regions')
#axs[0].set_xlabel('Regions and Countries')
axs[0].set_xlabel('Values')
axs[0].set_ylabel('Regions and Countries')
#axs[0].set_xticklabels(x.index, rotation=90) # can be turned off
fig.savefig('./saved_images_from_visualizations/' + 'hor_bar_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
elif chart_type == 'Pie':
#ax.scatter(indicator_data['Region'], indicator_data['Value'], s=indicator_data['Value'] * bubble_scale )
#plt.barh(indicator_data['Region'], indicator_data['Value'])
# now ax.pie(indicator_data['Value'], labels=indicator_data['Region'], autopct="%1.1f%%")
axs[0].pie(x['Value'], labels=x.index) #, c=c
axs[0].set_xticklabels(x['Value'], rotation=90) # can be turned off
# https://stackoverflow.com/questions/10998621/rotate-axis-text-in-python-matplotlib
plt.suptitle(indicator + ' Over Regions')
#axs[0].set_xlabel('Values')
axs[0].set_ylabel('Countries/Regions')
fig.savefig('./saved_images_from_visualizations/' + 'pie_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
else:
#axs[0].scatter(indicator_data['Region'], indicator_data['Value'], s = indicator_data['Value'] * bubble_scale, c=c )
axs[0].scatter(x.index, x['Value'], s = x['Value'] * bubble_scale, c=c )
plt.suptitle(indicator + ' Over Regions')
axs[0].set_xlabel('Regions and Countries')
axs[0].set_ylabel('Values')
#axs[0].set_xticklabels(indicator_data['Region'], rotation=90) # can be turned off
axs[0].set_xticklabels(x.index, rotation=90) # can be turned off
#plt.show()
# show country colors
axs[1].scatter([1]*len(c), [i*5 for i in range(len(c))], s=300, c=c, marker='o')
count = list(m) #list(region_colors_dict.keys())
#print(count)
for j in range(len(c)):
axs[1].annotate(count[j], (1, j*5))
axs[1].set_xlabel('Country and Colors')
axs[1].set_ylabel('')
axs[1].set_xticks([])
axs[1].set_yticks([])
fig.savefig('./saved_images_from_visualizations/' + 'bubble_' + indicator.replace(' ', '_')[0:10] + '_' + str(np.random.randint(0, 99999)) + '.png')
#now f.canvas.draw()
#plt.suptitle(indicator + 'Over Regions and Years')
#plt.xlabel('Regions and Countries')
#plt.ylabel('Values')
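Each chart branch above builds its output file name the same way: a chart prefix, the first ten characters of the indicator with spaces replaced by underscores, and a random integer. As a minimal sketch, that pattern could be factored into one helper. `build_image_path` and its parameters are hypothetical names that do not exist in the notebook, and the standard-library `random` module stands in for `np.random.randint`.

```python
import random

def build_image_path(prefix, indicator, folder='./saved_images_from_visualizations/',
                     max_chars=10, rng=random):
    """Hypothetical helper: mirror the file-name pattern used by the chart branches."""
    # Same transformation as indicator.replace(' ', '_')[0:10] in the notebook.
    short_name = indicator.replace(' ', '_')[0:max_chars]
    # np.random.randint(0, 99999) excludes the upper bound; randrange does too.
    suffix = rng.randrange(0, 99999)
    return folder + prefix + '_' + short_name + '_' + str(suffix) + '.png'

print(build_image_path('bar', 'Transport Accident Mortality (M)'))
```

A branch could then call something like `fig.savefig(build_image_path('bar', indicator))`, which keeps the naming rule in one place.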
Geo Plot, Map Plot on a World Map
"""
magnitudes = measure_data[['Region', 'Value']]
magnitudes['Region'][1]
year = 2015
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicators[0]) & (measure_data['Data year'] == 2015) ]
#print(indicator_data)
magnitudes = indicator_data[['Region', 'Value', 'Data year']]
for r, v in zip(magnitudes['Region'], magnitudes['Value']):
print(r, v)
"""
def plot_map_measure_by_regions(year, indicator, bubble_scale, chart_type = '', provincial_only=False, all_years=all_years):
#print('year, indicator', year, indicator)
if ( year > 1 ):
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == year) ]
#print(indicator_data)
magnitudes = indicator_data[['Region', 'Value']]
# Make this plot larger.
plt.figure(figsize=(16,12))
eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))
min_marker_size = 2.5 #* bubble_scale
#for lon, lat, mag in zip(lons, lats, magnitudes):
#for i in range(indicator_data.shape[0]):
for reg, val in zip(magnitudes['Region'], magnitudes['Value']):
#try:
#reg = magnitudes[i:i+1]['Region']
#reg = magnitudes['Region'][i]
#print(reg)
lat = lats_dict[reg]
lon = lons_dict[reg]
#print(lat, lon)
x, y = eq_map(lon, lat)
mag = val #magnitudes['Value'][i] #magnitudes[i:i+1]['Value']
#print(mag)
msize = mag * min_marker_size/bubble_scale
#print(msize, msize)
#marker_string = get_marker_color(mag)
#eq_map.plot(x, y, marker_string, markersize=msize)
eq_map.plot(x, y, marker='o', markersize=msize)
#x, y = eq_map(0, 0)
#eq_map.plot(x, y, marker='D',color='m')
#plt.show()
#except:
#print('hello')
#continue
title_string = indicator
title_string += ' for year ' + str(year)
plt.title(title_string)
#plt.show()
plt.savefig('./saved_images_from_visualizations/' + 'geo_plot_for_a_year_' +indicator.replace(' ', '_')[0:5] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
#plt.suptitle(indicator + 'Over Regions and Years')
#plt.xlabel('Regions and Countries')
#plt.ylabel('Values')
#plt.show()
else:
# this is to plot over multiple years
# average taken over multiple years will be plotted
# I could make this more user friendly and data friendly
# by giving users the option to select years, countries; also by informing to what extent data are available (skipping as that will extend the work much)
#indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == aYear) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if (provincial_only == True):
# no year filter here: this branch averages over all available years (aYear is not defined in this function)
indicator_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & \
(measure_data['Type of region'].isin(['Province']))]
#print(indicator_data)
# for country color codes
"""
c = [ region_colors_dict[x] for x in indicator_data['Region'] ]
m = [ x for x in indicator_data['Region'] ]
c_code = [ x[0:3] for x in indicator_data['Region'] ]
"""
indicator_data = indicator_data.set_index(['Region'])
x = indicator_data.groupby(['Region']).mean()
#print(x)
#print(x['Value'][0])
#print(indicator_data)
# not used
magnitudes = pd.DataFrame()
magnitudes['Region'] = x.index
magnitudes['Value'] = x['Value']
#print(magnitudes)
# Make this plot larger.
plt.figure(figsize=(16,12))
eq_map = Basemap(projection='robin', resolution = 'l', area_thresh = 1000.0,
lat_0=0, lon_0=-130)
eq_map.drawcoastlines()
eq_map.drawcountries()
eq_map.fillcontinents(color = 'gray')
eq_map.drawmapboundary()
eq_map.drawmeridians(np.arange(0, 360, 30))
eq_map.drawparallels(np.arange(-90, 90, 30))
min_marker_size = 2.5 #* bubble_scale
#for lon, lat, mag in zip(lons, lats, magnitudes):
#for i in range(indicator_data.shape[0]):
#for reg, val in zip(magnitudes['Region'], magnitudes['Value']):
for reg, val in zip(x.index, x['Value']):
try:
#reg = magnitudes[i:i+1]['Region']
#reg = magnitudes['Region'][i]
#print(reg, val)
lat = lats_dict[reg]
lon = lons_dict[reg]
#print(lat, lon)
x, y = eq_map(lon, lat)
mag = val #magnitudes['Value'][i] #magnitudes[i:i+1]['Value']
#print(mag)
msize = mag * min_marker_size/bubble_scale
#print(msize, msize)
#marker_string = get_marker_color(mag)
#eq_map.plot(x, y, marker_string, markersize=msize)
eq_map.plot(x, y, marker='o', markersize=msize)
#x, y = eq_map(0, 0)
#eq_map.plot(x, y, marker='D',color='m')
#plt.show()
except Exception:
# most likely a region with no entry in lats_dict/lons_dict; skip it
continue
title_string = indicator
title_string += ' Average for selected years '
plt.title(title_string)
#plt.show()
plt.savefig('./saved_images_from_visualizations/' + 'geo_plot_average_over_multiple_years_' +indicator.replace(' ', '_')[0:5] + '_' + str(np.random.randint(0, 99999)) + '.png')
plt.show()
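In both branches above the marker size is `mag * min_marker_size/bubble_scale`, so a larger `bubble_scale` gives smaller markers, and a `bubble_scale` of 0 (which the slider range allows) would raise a ZeroDivisionError. The following is a minimal sketch of a guarded version. `marker_size` is a hypothetical helper, and both guards (skipping missing values, treating a non-positive scale as 1) are assumptions rather than behaviour of the original code.

```python
import math

def marker_size(value, bubble_scale, min_marker_size=2.5):
    """Hypothetical helper: marker size as computed in the geo plot, with guards."""
    # Assumption: a missing value (None or NaN) should produce no marker at all.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    # Assumption: treat a non-positive scale as 1 instead of dividing by zero.
    scale = bubble_scale if bubble_scale > 0 else 1
    return value * min_marker_size / scale

print(marker_size(20.0, 2))   # 20.0 * 2.5 / 2
print(marker_size(20.0, 10))  # larger scale, smaller marker
```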
Heatmap to compare across indicators and countries
Options implemented:
Heatmap for one year, one indicator of a health aspect, across countries
Heatmap for one year, all indicators of a health aspect, across countries
Heatmap for one year, one indicator of a health aspect, across Canadian provinces
Heatmap for one year, all indicators of a health aspect, across Canadian provinces
Heatmap for all years with mean values, one indicator of a health aspect, across countries
Heatmap for all years with mean values, all indicators of a health aspect, across countries
Heatmap for all years with mean values, one indicator of a health aspect, across Canadian provinces
Heatmap for all years with mean values, all indicators of a health aspect, across Canadian provinces
Is taking the mean over multiple years a pragmatic approach? It can be an effective measure provided data exist for all of those years; otherwise, the years that do have data will dominate the mean. In my case, data are not available for all indicators for all years. We can then plot a single year instead, or I could provide an interface to select years, indicators, and countries for custom comparison, but that would be a long task. Hence, I am providing a tool that can be extended in different ways.
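To make the caveat about multi-year means concrete, here is a small sketch that reports, next to each mean, how many years actually contributed to it. The rows are made-up illustrative values, not taken from the data files, and `mean_with_coverage` is a hypothetical helper written with the standard library only.

```python
from collections import defaultdict
from statistics import mean

# Made-up illustrative rows: (region, year, value). Not taken from the data files.
rows = [
    ('A', 2015, 10.0), ('A', 2016, 12.0), ('A', 2017, 14.0),
    ('B', 2017, 30.0),  # region B only has data for one year
]

def mean_with_coverage(rows, years):
    """Return {region: (mean value, number of years with data)} over the given years."""
    values = defaultdict(list)
    for region, year, value in rows:
        if year in years:
            values[region].append(value)
    return {region: (mean(vals), len(vals)) for region, vals in values.items()}

summary = mean_with_coverage(rows, years=[2015, 2016, 2017])
for region, (avg, n_years) in sorted(summary.items()):
    print(region, avg, 'from', n_years, 'year(s)')
```

Region B's "mean" here rests on a single year, which is exactly the situation in which a multi-year heatmap can mislead.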
# ref : https://cmdlinetips.com/2019/01/how-to-make-heatmap-with-seaborn-in-python/
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
def plot_heatmap_across_indicators(year, indicator = '', ratios = [3,1], provincial_only = False, all_years = all_years, fig_size = [10, 10]):
if ( year > 1 ):
heatmap_data = measure_data.loc[ (measure_data['Data year'] == year) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if ( provincial_only == True ):
heatmap_data = measure_data.loc[ (measure_data['Data year'] == year) & (measure_data['Type of region'].isin(['Province']) ) ]
if ( indicator != ''):
heatmap_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == year) & ( measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if ( provincial_only == True ):
heatmap_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Data year'] == year) & (measure_data['Type of region'].isin(['Province']) ) ]
#print(heatmap_data)
indicator_data_heatmap = heatmap_data[ ['Region', 'Value', 'Indicator', 'Data year'] ]
#print(indicator_data_heatmap)
heatmap1_data = pd.pivot_table(indicator_data_heatmap, values='Value', index=['Region'], columns='Indicator')
plt.figure(figsize=fig_size)
ax = sns.heatmap(heatmap1_data, cmap="YlGnBu")
# https://stackoverflow.com/questions/48470251/move-tick-marks-at-the-top-of-the-seaborn-plot?noredirect=1&lq=1
ax.xaxis.set_ticks_position('top')
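# label the columns of the pivoted table; the raw 'Indicator' column has one entry per data row, not per heatmap column
ax.set_xticklabels(heatmap1_data.columns, rotation=90) # can be turned off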
#plt.show()
title_string = measure_file.result[0:len(measure_file.result)-4] + ':' + indicator
title_string += ' for year ' + str(year)
plt.title(title_string)
#plt.show()
plt.savefig('./saved_images_from_visualizations/' + 'heatmap_' + measure_file.result[0:len(measure_file.result)-4] + indicator.replace(' ', '_')[0:5] + '_' + str(np.random.randint(0, 99999)) + '.png')
#plt.show()
else:
# this is to plot over multiple years
# average taken over multiple years will be plotted
# I could make this more user friendly and data friendly
# by giving users the option to select years, countries; also by informing to what extent data are available (skipping as that will extend the work much)
heatmap_data = measure_data.loc[ (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if ( provincial_only == True ):
heatmap_data = measure_data.loc[ (measure_data['Type of region'].isin(['Province']) ) ]
if ( indicator != '' ):
heatmap_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Type of region'].isin(['Country', 'Canada']) ) ]
if ( provincial_only == True ):
heatmap_data = measure_data.loc[ (measure_data['Indicator'] == indicator) & (measure_data['Type of region'].isin(['Province']) ) ]
#print(heatmap_data)
indicator_data_heatmap = heatmap_data[ ['Region', 'Value', 'Indicator', 'Data year'] ]
indicator_data = indicator_data_heatmap.set_index(['Region'])
# x is not used, mean is calculated by seaborn
#x = indicator_data.groupby(['Region', 'Indicator']).mean()
#print(x.index)
#print(x)
#heatmap1_data = pd.pivot_table(indicator_data_heatmap, values='Value', index=['Region'], columns='Indicator')
#heatmap1_data = pd.pivot_table(x, values='Value', index=x.index, columns='Indicator')
heatmap1_data = pd.pivot_table(indicator_data, index = indicator_data.index, columns='Indicator', values='Value', aggfunc = 'mean')
plt.figure(figsize=fig_size)
#ax =
sns.heatmap(heatmap1_data, cmap="YlGnBu")
#ax.xaxis.set_ticks_position('top')
#ax.set_xticklabels(indicator_data_heatmap['Indicator'], rotation=90) # can be turned off
#plt.show()
title_string = measure_file.result[0:len(measure_file.result)-4] + ':' + indicator
all_years_str = ''
for aYear in all_years:
if (aYear > 0):
all_years_str += str(aYear) + ', '
year_str = ' for year ' + str(year) if year > 0 else ' Mean over years \n' + all_years_str
title_string += year_str
plt.title(title_string)
#plt.show()
plt.savefig('./saved_images_from_visualizations/' + 'heatmap_over_years_' + measure_file.result[0:len(measure_file.result)-4] + indicator.replace(' ', '_')[0:5] + '_' + str(np.random.randint(0, 99999)) + '.png')
#plt.show()
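The title loop above appends `', '` after every year, which leaves a trailing comma and space in the title. A minimal sketch of the same label built with `str.join` follows; `years_label` is a hypothetical helper, and it assumes, as the loop does, that non-positive entries such as the 0 placeholder should be skipped.

```python
def years_label(all_years):
    """Hypothetical helper: comma-separated selected years, skipping the 0 placeholder."""
    return ', '.join(str(a_year) for a_year in all_years if a_year > 0)

print(' Mean over years \n' + years_label([0, 2013, 2014, 2015]))
```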
Create the components for the UI
Users will interact with these controls to generate custom visualizations.
START-RELOAD-DATA
Re-running this UI component creation may not be required as part of the data reload.
# create the interactive interface
def f(indicator):
return indicator
#print ('Measure' + measure_file.result)
#print('Select parameters\n')
#print('Select Indicator with or without Canadian data')
indicator_country = interactive(f, indicator=indicators);
#display(indicator_country)
indicator_country.result
def f(canada_indicator):
return canada_indicator
#print('Select Indicator with Canadian data')
indicator_canada = interactive(f, canada_indicator=indicators_with_canada);
#display(indicator_canada)
indicator_canada.result
def f(year):
return year
#print('Select Year: 0 = all years')
year_country = interactive(f, year=all_years);
#display(year_country)
year_country.result
#print('Select Year: 0 = all years')
year_canada = interactive(f, year=all_years_canada);
#display(year_canada)
year_canada.result
def f(what_to_plot):
return what_to_plot
#print('Select what to plt:')
what_to_plot = {}
# this can come from an excel file as well
what_to_plot['health-status.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
what_to_plot['access-to-care.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
#what_to_plot['indicator-methodology.xls'] = ['Indicator Values over years', 'Indicator Values over countries']
what_to_plot['non-med-determinants.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
what_to_plot['patient-safety.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
what_to_plot['prescribing-primary.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
what_to_plot['quality-of-care.xls'] = ['Indicator Values over years', 'Indicator Values over countries', 'Geo plot', 'Heatmap']
plots = interactive(f, what_to_plot=what_to_plot[measure_file.result]);
#display(plots)
plots.result
def f(chart_type):
return chart_type
#print('Select chart type:')
chart_types = ['Bubble', 'Bar', 'Hor Bar', 'Pie', 'Line']
chart = interactive(f, chart_type=chart_types);
#display(chart)
chart.result
def f(scale):
return scale
#print('Select bubble size')
bubble_scale_country = interactive(f, scale=(0, 100, 1));
#display(bubble_scale_country)
#bubble_scale_country.result = 10
def f(x):
return x
#print('Max country count to compare with')
country_count = interactive(f, x=range(1, 20));
#display(country_count)
country_count.result
#print('Select countries')
#print('Compare with Canada')
all_countries = interactive(f, x=True);
#display(all_countries)
all_countries.result
country_0 = interactive(f, x = countries);
country_1 = interactive(f, x = countries);
#country_str = "var%d = interactive(f, x = countries)"
#display(country_0)
#display(country_1)
country_0.result
def f(com_with_cdn):
return com_with_cdn
#print('Compare with Canada')
compare_canada = interactive(f, com_with_cdn=False);
#display(compare_canada)
compare_canada.result
def f(use_data_with_canada):
return use_data_with_canada
use_data_with_canada = interactive(f, use_data_with_canada=False);
#display(use_data_with_canada)
use_data_with_canada.result
def f(provincial_only):
return provincial_only
provincial_only_plot = interactive(f, provincial_only=False);
#display(use_data_with_canada)
provincial_only_plot.result
# animate over time
def f(animate):
return animate
#print('Compare with Canada')
animate_ = interactive(f, animate=False);
#display(compare_canada)
animate_.result
# for heatmap, do we consider the indicator or not
def f(heatmap_consider_indicator):
return heatmap_consider_indicator
heatmap_consider_indicator_ = interactive(f, heatmap_consider_indicator=False);
#display(heatmap_consider_indicator_)
heatmap_consider_indicator_.result
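Every data file above gets the same list of plot options, written out by hand. As a sketch, the `what_to_plot` dictionary could instead be built from the folder listing. The file names are copied from the listing shown earlier; `default_plots` and the `.xls` suffix filter (used to skip the lock file) are assumptions made for this illustration.

```python
# File names as listed from the data folder earlier in the notebook.
measure_files = ['.~lock.health-status.xls#', 'access-to-care.xls', 'health-status.xls',
                 'non-med-determinants.xls', 'patient-safety.xls',
                 'prescribing-primary.xls', 'quality-of-care.xls']

default_plots = ['Indicator Values over years', 'Indicator Values over countries',
                 'Geo plot', 'Heatmap']

# One independent copy of the list per data file, so a single entry can be changed later.
# The suffix check skips the lock file, whose name ends with '#'.
what_to_plot = {name: list(default_plots)
                for name in measure_files if name.endswith('.xls')}

print(what_to_plot['health-status.xls'])
```

A new data file dropped into the folder would then get the default options without any further edits.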
Render the UI controls
Note: use_data_with_canada indicates that only data for which Canadian values exist should be used. When it is checked, the Canada_Indicator and Year controls on the right are used.
The com_with_cdn and animate controls are not used so far.
This cell must be run as part of the data reload.
# Reference: https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Styling.html
print('Indicator and year list on the right, represent where Canadian data exist')
from ipywidgets import Button, GridBox, Layout, ButtonStyle
GridBox(children=[
use_data_with_canada, compare_canada,
indicator_country, indicator_canada,
year_country, year_canada,
plots, chart,
bubble_scale_country, provincial_only_plot,
heatmap_consider_indicator_, animate_
],
layout=Layout(
width='100%',
grid_template_rows='auto auto',
grid_template_columns='50%50%',
)
)
Indicator and year list on the right, represent where Canadian data exist
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-28-5b8e7f56abb0> in <module>()
1 # Reference: https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Styling.html
2 print('Indicator and year list on the right, represent where Canadian data exist')
----> 3 from ipywidgets import Button, GridBox, Layout, ButtonStyle
4 GridBox(children=[
5 use_data_with_canada, compare_canada,
ImportError: cannot import name 'GridBox'
END-RELOAD-DATA
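The traceback above shows that `GridBox` could not be imported from the installed ipywidgets. One possible fallback, sketched here, is to group the controls into rows of two and render them with `HBox` and `VBox`. `rows_of` is a hypothetical helper, and strings stand in for the widget objects so the sketch runs without ipywidgets.

```python
def rows_of(items, n_cols=2):
    """Hypothetical helper: split a flat list into consecutive rows of n_cols items."""
    return [items[i:i + n_cols] for i in range(0, len(items), n_cols)]

# Strings stand in for the widget objects so the sketch runs without ipywidgets.
controls = ['use_data_with_canada', 'compare_canada',
            'indicator_country', 'indicator_canada',
            'year_country', 'year_canada']

rows = rows_of(controls)
print(rows)

# With the real widgets, the rows could then be rendered as, for example:
#   from ipywidgets import HBox, VBox
#   VBox([HBox(row) for row in rows])
```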
# assign a default value
bubble_scale_country.result = 10
Create the Plot Based on User Selections
plt.rcParams['figure.figsize'] = [10, 8]
#print(chart.result)
#provincial_only = False
heatmap_consider_indicator = heatmap_consider_indicator_.result
# Use all data, whether or not Canada has data
if use_data_with_canada.result == False:
indicator = indicator_country.result
year = year_country.result
bubble_scale = bubble_scale_country.result
# Use only the indicators and years for which Canadian data exist
else:
indicator = indicator_canada.result
year = year_canada.result
bubble_scale = bubble_scale_country.result
#provincial_only = provincial_only_plot.result
#print(indicator, year, bubble_scale)
if plots.result == 'Indicator Values over years':
plot_measure_by_years(year, indicator, bubble_scale, chart_type=chart.result, ratios=[3,1], animate = animate_.result,
provincial_only=provincial_only_plot.result, fig_size=[20,10], sec_fig=False)
elif plots.result == 'Geo plot':
plot_map_measure_by_regions(year, indicator, bubble_scale, chart_type=chart.result, \
provincial_only=provincial_only_plot.result)
elif plots.result == 'Heatmap':
if ( heatmap_consider_indicator == False ):
indicator = ''
plot_heatmap_across_indicators(year, indicator, provincial_only=provincial_only_plot.result, fig_size=[10, 10])
else: # Indicator Values over countries
plot_measure_by_regions(year, indicator, bubble_scale, chart_type=chart.result, \
ratios=[3,1], provincial_only=provincial_only_plot.result)
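The if/elif chain above selects a plotting function by the name chosen in the UI, with 'Indicator Values over countries' as the fallback. The same selection can be sketched as a lookup table. `choose_plot` is a hypothetical helper, and the lambdas are stand-ins that only return the name of the notebook function they would wrap.

```python
def choose_plot(selection, plotters, default='Indicator Values over countries'):
    """Hypothetical helper: return the plotting callable for the selected plot name."""
    return plotters.get(selection, plotters[default])

# Stand-in callables; in the notebook these would wrap the real plot_* functions.
plotters = {
    'Indicator Values over years': lambda: 'plot_measure_by_years',
    'Indicator Values over countries': lambda: 'plot_measure_by_regions',
    'Geo plot': lambda: 'plot_map_measure_by_regions',
    'Heatmap': lambda: 'plot_heatmap_across_indicators',
}

print(choose_plot('Geo plot', plotters)())
print(choose_plot('something else', plotters)())  # falls back to the default
```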
Section: Research Questions and Answers
Please select the related measure and reload all data (the cells marked START-RELOAD-DATA to END-RELOAD-DATA). Otherwise, the following visualizations might not work unless they belong to the currently selected measure.
A better solution could be to place the measure selection here and re-execute all the data reload code.
These cells independently reproduce the visualizations used in the detailed presentation document.
All of them can be generated using the UI; I am just showing the specific cases that I plotted using the UI and included in my report.
How does Canada compare on a health status indicator such as Cancer Mortality (F) for 2017 (per 100k)? Example: Cancer Mortality, 2017:
year = 2017
indicator = 'Cancer Mortality (F)'
# does not matter
bubble_scale = 11
chart_type = 'Line'
# subplot ratios
ratios = [1000, 1]
# not implemented
animate = False
# if for Provinces - Canada
provincial_only = False
# figure size
fig_size = [5, 5]
# not important
sec_fig = True
print('The right line on the plot does not count; comes from the right subplot; that is not relevant for this case')
plot_measure_by_years(year, indicator, bubble_scale, chart_type, ratios, animate, provincial_only, fig_size=fig_size)
The right line on the plot does not count; comes from the right subplot; that is not relevant for this case
Research Question: How has an indicator such as Transport Accident Mortality changed over time for different countries?
year = 0
indicator = 'Transport Accident Mortality (M)'
# does not matter
bubble_scale = 100
chart_type = 'Bubble'
# subplot ratios
# as we will show countries by colors, ratios are useful
ratios = [3, 1]
# not implemented
animate = False
# if for Provinces - Canada
provincial_only = False
# figure size
fig_size = [20, 10]
# not important
sec_fig = True
print('Note: Y axis values need to be multiplied by Benchmark value to get actual values')
plot_measure_by_years(year, indicator, bubble_scale, chart_type, ratios, animate, provincial_only, fig_size=fig_size)
Note: Y axis values need to be multiplied by Benchmark value to get actual values
C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py:346: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Research Question: How do Canadian provinces perform for Transport Mortality (M) for 2017?
year = 2017
indicator = 'Transport Accident Mortality (M)'
# does not matter
bubble_scale = 100
# will plot for other chart types as well
chart_type = 'Line'
# subplot ratios
# as we will show countries by colors, ratios are useful
ratios = [10, 1]
# not implemented
animate = False
# Matters as we are plotting for Canadian provinces
provincial_only = True
# figure size
fig_size = [10, 8]
# not important
sec_fig = True
print('Note: Y axis values need to be multiplied by Benchmark value to get actual values')
plot_measure_by_years(year, indicator, bubble_scale, chart_type, ratios, animate, provincial_only, fig_size=fig_size)
Note: Y axis values need to be multiplied by Benchmark value to get actual values
Research, Analysis, and Visualization Concern:
How does transport mortality compare across countries in 2017, based on the data we have? Visualized in different chart formats.
chart_types = ['Bubble', 'Bar', 'Hor Bar', 'Pie', 'Line']
year = 2017
indicator = 'Transport Accident Mortality (M)'
# does not matter
bubble_scale = 32
# will plot for other chart types as well
chart_type = 'Line'
# subplot ratios
# as we will show countries by colors, ratios are useful
ratios = [10, 1]
# not implemented
animate = False
# False here: we are plotting for countries, not Canadian provinces
provincial_only = False
# figure size
fig_size = [5, 5]
# not important
sec_fig = True
for chart_type in chart_types:
plot_measure_by_years(year, indicator, bubble_scale, chart_type, ratios, animate, provincial_only, fig_size=fig_size)
How does transport mortality compare across Canadian provinces in 2017, based on the data we have? Visualized in different chart formats.
Note: the code from the cell above needs to be executed first, as I am reusing some of its variables.
# Matters as we are plotting for Canadian provinces
provincial_only = True
# figure size
fig_size = [10, 8]
# not important
sec_fig = True
ratios = [3, 1]
for chart_type in chart_types:
plot_measure_by_years(year, indicator, bubble_scale, chart_type, ratios, animate, provincial_only, fig_size=fig_size)
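Because the provincial cell above reuses variables from the country-level cell, it only works when that cell has been run first. One way to make the dependency explicit is to keep the shared settings in a dictionary and override only what differs. This is a sketch: the values are copied from the two cells, while `base_params` and `provincial_params` are hypothetical names, and passing them as keyword arguments assumes that the parameters of `plot_measure_by_years` carry these names.

```python
# Values copied from the country-level cell above.
base_params = {
    'year': 2017,
    'indicator': 'Transport Accident Mortality (M)',
    'bubble_scale': 32,
    'ratios': [10, 1],
    'animate': False,
    'provincial_only': False,
    'fig_size': [5, 5],
}

# The provincial variant overrides only what differs.
provincial_params = {**base_params, 'provincial_only': True,
                     'fig_size': [10, 8], 'ratios': [3, 1]}

print(provincial_params)

# In the notebook this could then be used as, for example (assumed parameter names):
#   for chart_type in chart_types:
#       plot_measure_by_years(chart_type=chart_type, **provincial_params)
```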
Research Question: What is the average alcohol consumption across countries over the last few years?
# you can add or remove years, to get average measures over those years
year = 0 # 0 indicates all years for the list all_years
all_years = [2013, 2014, 2015, 2016, 2017]
indicator = 'Alcohol Consumption: Adults'
bubble_scale = 91 # NA
chart_type='Bar' # change to Line, Bubble, Pie, 'Hor Bar'
provincial_only = False # if you set true only provincial data will be plotted
# plot_measure_by_regions(year, indicator, bubble_scale, chart_type = '', ratios=[3,1], provincial_only=False, all_years=all_years):
plot_measure_by_regions(year, indicator, bubble_scale, chart_type, ratios=[3,1], provincial_only=provincial_only, all_years=all_years)
The same case as above, for the year 2015 only
#year = 0 # 0 indicates all years for the list all_years
#all_years = [2013, 2014, 2015, 2016, 2017]
indicator = 'Alcohol Consumption: Adults'
#bubble_scale = 91 # NA
#chart_type='Bar' # change to Line, Bubble, Pie, 'Hor Bar'
provincial_only = False # if you set true only provincial data will be plotted
print('if you want for only one year change as follows')
print('For Research Question: What are average alcohol consumption across countries for 2015')
# if you want for only one year change as follows
year = 2015
# other chart types, such as Bar, will work but may have issues
chart_type='Bar'
#plot_measure_by_regions(year, indicator, bubble_scale, chart_type, ratios=[3,1], provincial_only)
plot_measure_by_regions(year, indicator, bubble_scale, chart_type, ratios=[3,1], provincial_only=provincial_only, all_years=all_years)
if you want for only one year change as follows
For Research Question: What are average alcohol consumption across countries for 2015
Research Question: For 2015, which country smoked the most? A geo plot is used here; however, plots like those in the sections above could also be used.
chart_type
'Bar'
year = 2015
indicator = 'Smoking: Adults (M)'
# the plot on the presentation used 2
# note : this is inverse size of the bubble
bubble_scale = 3
chart_type = ''
# are not relevant
# chart_type = '', all_years=all_years, also provincial_only=False
plot_map_measure_by_regions(year, indicator, bubble_scale, chart_type = '', provincial_only=False, all_years=all_years)
Research Question: Where in the world is obesity more common?
year = 0
indicator = 'Obesity Reported: Adults'
# the plot on the presentation used 2
# note : this is inverse size of the bubble
bubble_scale = 2
chart_type = ''
# are not relevant
# chart_type = '', all_years=all_years, also provincial_only=False
plot_map_measure_by_regions(year, indicator, bubble_scale, chart_type = '', provincial_only=False, all_years=all_years)
Research Question: Using a heatmap, how do different countries compare on the Non Medical Determinants aspect for 2014?
Note: the current selection needs to be Non Medical Determinants.
You can change the health measure; all data then have to be reloaded by executing the sections marked START-RELOAD-DATA to END-RELOAD-DATA.
indicator = ''
year = 2014
indicator = ''
provincial_only_plot = False
plot_heatmap_across_indicators(year, indicator, provincial_only=provincial_only_plot, fig_size=[10, 10])
Research question: How do different countries compare on health status indicators over the years?
Note: the current selection needs to be Health status, and the code from START-RELOAD-DATA to END-RELOAD-DATA needs to be executed.
indicator = ''
year = 0
indicator = ''
provincial_only_plot = False
plot_heatmap_across_indicators(year, indicator, provincial_only=provincial_only_plot, fig_size=[10, 10])
Research question: How do Canadian provinces compare on health status indicators over the years?
Note: the current selection needs to be Health status (otherwise, the plots will use the currently selected health aspect/measure, which I may or may not have tested).
indicator = ''
year = 0
indicator = ''
provincial_only_plot = True
fig_size_=[7, 5]
plot_heatmap_across_indicators(year, indicator, provincial_only=provincial_only_plot, fig_size=fig_size_)
year, indicator
(2017, 'Transport Accident Mortality (F)')
Research Question: How does transport mortality (Female) compare across countries for 2017?
This is just an example to show heatmap plot when ‘heatmap_consider_indicator’ option is selected and an indicator is selected
year = 2017
indicator = 'Transport Accident Mortality (F)'
fig_size_=[3, 5]
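# provincial_only_plot was reassigned to a plain bool in the cells above, so it no longer has .result; pass the flag directly
plot_heatmap_across_indicators(year, indicator, provincial_only=False, fig_size=fig_size_)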
Access to Care
Please select Access to Care Measure, and reload all data
Wait time for specialists in Days
year = 0
indicator = 'Wait Time: Specialist'
bubble_scale = 10
chart_type = 'Bubble'
provincial_only=False
ratios=[3,1]
plot_measure_by_regions(year, indicator, bubble_scale, chart_type=chart_type, ratios=ratios, provincial_only=False)
Access to Care: same or next day appointment
year, indicator, bubble_scale
(0, 'Same or Next Day Appt', 10)
year = 0
indicator = 'Same or Next Day Appt'
bubble_scale = 10
chart_type = 'Bubble'
provincial_only=True
ratios=[3,1]
plot_measure_by_regions(year, indicator, bubble_scale, chart_type=chart_type, ratios=ratios, provincial_only=True)
Heatmap: Access to Care Indicators
year,indicator
(0, '')
year = 0
indicator = ''
plot_heatmap_across_indicators(year, indicator, provincial_only=False, fig_size=[10, 10])
The following code is supposed to be removed in the final step
Code reused from lab 06 (the geo plot)
# references
# https://pypi.org/project/geopy/