
Feb 11
Installing Talend in MAC
Talend Open Studio for Data Integration.
If you have intel based Mac, then the OSX installer will work.
Else, you may install a Virtual Machine such as Parallels, then have Windows in it (Parallels may come with Windows), then have Talend under Windows
Another option can be use Linux Versions.
Download and install
Download from: https://www.qlik.com/us/trial/talend-data-fabric
(Talend download from https://www.talend.com/products/talend-open-studio/ leads to the above URL) Instructions are also on this site.
You may see which one works for you. You will need Java to install Linux Version on Mac. Make sure, you have the correct version of Java in your system.
Feb 06
Python: MatPlotLib: Plot Data
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
columns = ["J_ID", "P_ID", "D_ID", "Year", "JOB_TITLE", "Department_Name", "Total Salary", "Commission"]
df = pd.read_csv("datamart.csv", usecols=columns)
# Create a 3D plot
fig = plt.figure(figsize=(20, 15))
ax = fig.add_subplot(111, projection='3d')
# Scatter plot
ax.scatter(df['Year'], df['J_ID'], df['D_ID'], c=df['Total Salary'], cmap='viridis', marker='X')
# Labels
ax.set_xlabel('Year')
ax.set_ylabel('Job Title (Categorized)')
ax.set_zlabel('Departments')
# Title
ax.set_title('3D Plot of Total Salary by Year and Job Title and Department')
for i in range(df.shape[0]):
if i%7 == 0:
ax.text(df['Year'][i], df['J_ID'][1], df['D_ID'][i], '(' + str(df['Year'][i]) + ',' + str(df["JOB_TITLE"][i]) + ',' + str(df["Department_Name"][i]) + ',' + str(df['Total Salary'][i]) + ")")
# Show plot
plt.show()


Sample Data:

Jan 30
Dynamic SQL
“Dynamic SQL is the SQL statement that is constructed and executed at runtime based on input parameters passed”
https://www.sqlshack.com/dynamic-sql-in-sql-server/
Example: ChatGPT
DECLARE @sql AS NVARCHAR(MAX)
SET @sql = ‘SELECT * FROM Employees WHERE Department = ”’ + @department + ””
EXEC sp_executesql @sql
Another Example: Oracle
table_name := ‘HR.Employees’;
dyn_sql := ‘Select * from ‘ || table_name || ‘ FETCH FIRST 10 ROWS ONLY ‘;
EXECUTE IMMEDIATE dyn_sql;
DECLARE
table_name VARCHAR2(100):= '';
dyn_sql VARCHAR2(100):= '';
BEGIN
table_name := 'HR.Employees';
dyn_sql := 'Select * from ' || table_name || ' FETCH FIRST 10 ROWS ONLY ';
EXECUTE IMMEDIATE dyn_sql;
END;
A simple dynamic SQL in a stored Procedure (Oracle)
CREATE OR REPLACE PROCEDURE Retrieve_10_ROWS
IS
dyn_sql VARCHAR2(200);
table_name VARCHAR2(50);
BEGIN
table_name := 'HR.Employees';
dyn_sql := 'Select * from ' || table_name || ' FETCH FIRST 10 ROWS ONLY ';
EXECUTE IMMEDIATE dyn_sql;
END;
EXEC Retrieve_10_ROWS;
Dynamic/Reference Cursor and Dynamic SQL
CREATE OR REPLACE PROCEDURE Retrieve_10_ROWS_dyn_cur(p_table_to_query VARCHAR2) IS
dyn_sql VARCHAR2(200);
JOB_ID VARCHAR2(50);
JOB_TITLE VARCHAR2(100);
TYPE cur_typ IS REF CURSOR;
c cur_typ;
BEGIN
dyn_sql := 'Select JOB_ID, JOB_TITLE from ' || p_table_to_query || ' FETCH FIRST 10 ROWS ONLY ';
OPEN c FOR dyn_sql;
--USING table_name;
LOOP
FETCH c INTO JOB_ID, JOB_TITLE;
EXIT WHEN c%NOTFOUND;
-- process row here
DBMS_OUTPUT.PUT_LINE(JOB_ID || ' ' || JOB_TITLE);
END LOOP;
CLOSE c;
EXECUTE IMMEDIATE dyn_sql;
END;
Jan 29
What is a Dynamic Cursor in Oracle PL/SQL
Dynamic/Ref Cursor
CREATE OR REPLACE PROCEDURE query_invoice(
month VARCHAR2,
year VARCHAR2) IS
TYPE cur_typ IS REF CURSOR;
c cur_typ;
query_str VARCHAR2(200);
inv_num NUMBER;
inv_cust VARCHAR2(20);
inv_amt NUMBER;
BEGIN
query_str := 'SELECT num, cust, amt FROM inv_' || month ||'_'|| year
|| ' WHERE invnum = :id';
OPEN c FOR query_str USING inv_num;
LOOP
FETCH c INTO inv_num, inv_cust, inv_amt;
EXIT WHEN c%NOTFOUND;
-- process row here
END LOOP;
CLOSE c;
END;
/
Ref: https://docs.oracle.com/cd/B13789_01/appdev.101/b10795/adfns_dy.htm
cursor FOR loop

Cursor For Loop for an Explicit Cursor

Jan 19
Case Statement in Oracle PL/SQL
Write a SQL block that will categorize Employee Salaries. If the Salary is higher than average, it will show ‘Above Average’ to the output. If the Salary is lower than the average it will show ‘Below average’ to the output. Use HR.Employees Table. Also, utilize CASE statement.
SELECT employee_id,
CASE
WHEN salary > (SELECT AVG(salary) FROM hr.employees)
THEN 'Above Average'
WHEN salary < (SELECT AVG(salary) FROM hr.employees)
THEN 'Below Average'
ELSE 'Average'
END AS salary_comparison
FROM hr.employees
order by salary_comparison;
-- Ref: with a bit of information from the Internet
Jan 12
3D Scatter Plot in Python
Visualizing 3-D numeric data with Scatter Plots
length, breadth and depth
Ref: https://towardsdatascience.com/the-art-of-effective-visualization-of-multi-dimensional-data-6c7202990c57
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib as mpl
import numpy as np
import seaborn as sns
%matplotlib inline
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection=’3d’)
xs = wines[‘residual sugar’]
ys = wines[‘fixed acidity’]
zs = wines[‘alcohol’]
ax.scatter(xs, ys, zs, s=50, alpha=0.6, edgecolors=’w’)
ax.set_xlabel(‘Residual Sugar’)
ax.set_ylabel(‘Fixed Acidity’)
ax.set_zlabel(‘Alcohol’)

Other Visualizations of Wine Data
Univariate Analysis
tight_layout, 15 bins

Correlation Maps: Multivariate Analysis
Seaborn – Heatmaps


plot parallel coordinates

BOX Plot

Violin plot: sns.violinplot(x=”quality”, y=”sulphates”, data=wines, ax=ax) : Seaborn

3d Pairwise Scatter Plot

Images are taken from my own executions.
Jan 12