Jason Sjafrudin

Data Scientist - Machine Learning | NLP | Analytics

ABOUT ME

I am a data-driven storyteller who is passionate about optimizing processes with artificial intelligence. With expertise spanning data engineering, data science, and data analytics, I bridge engineering and product understanding for any business. My proficiency includes ETL, data warehousing, data modeling, exploratory data analysis, statistical testing, NLP, and machine learning.

At 19 years old, I completed my bachelor's degree in Econometrics and Quantitative Economics at the University of California, Berkeley. I have also acquired a master's degree in Applied Data Science from the University of Chicago. Over my 4+ years of work experience in the data landscape, I have been a part of numerous exciting projects! The portfolio on this website is most likely outdated; please check out my GitHub if you want to see my portfolio.

My goal is to optimize real-world processes using data. If you are interested in connecting with me, feel free to contact me.

My portfolio/projects (shown below) on this website are most likely outdated. Please check out my GitHub if you want to see my up-to-date portfolio/projects!

2019-2022

Data Engineering Projects

(30 mins read)

My experience has allowed me to work with a variety of technologies for building data pipelines across many enterprises. Here, I outline some of my favorite data engineering projects, along with the technologies and solutions used to perform ETL.

2021

Predictive Analysis Model to Forecast Quantity Demand

(8 mins read)

For most companies that sell goods, supply chain disruptions happen mainly because of inventory shortages. These shortages lead to the shipping and manufacturing delays we often experience with our orders. This project aims to build a predictive analysis model that forecasts quantity demand at one of the most well-known apparel companies, to help deal with this type of supply chain disruption.

2022

Anomaly Detection Model (for Data Quality Checks)

(8 mins read)


In the world of data analytics, poor-quality data is one of the biggest challenges we face. I implemented statistical methods to build an anomaly detection model that performs data quality checks within the digital/web environment.

2021-2022

Phishing Email Prediction using Machine Learning with Python

(10 mins read)

While data confidentiality is important everywhere, it is especially vital for a healthcare organization to keep its patients' information safe and secure. Phishing remains one of the major causes of breaches in this industry. We use machine learning to detect phishing emails in an attempt to address this issue.