Applied Data Science II: Machine Learning & Statistical Analysis (with honors)

Issued by WorldQuant University

Earners of this badge can build machine learning models that make predictions on real-world data. They understand how to treat, clean, and encode data and how to choose an appropriate machine learning model for the task. They can tune a model so that it generalizes, performing well on both the training set and out-of-sample data. They can build models using text and time series data. Earners are also proficient in using Python’s scikit-learn package.

Type: Learning
Level: Intermediate
Time: Weeks
Cost: Free

Additional Details

Earning Criteria

Earners of this badge have previously earned the badge "Applied Data Science I: Scientific Computing & Python." Additionally, they have successfully completed two mini projects and maintained a cumulative average score of 90% or above. The projects and the skills needed to complete them are described below.
In mini project 1, earners of this badge worked with nursing home inspection data from the United States, predicting which providers may be fined and for how much. They used the scikit-learn Python package to construct progressively more complex machine learning models. To do so, they had to impute missing values, apply feature engineering, and encode categorical data.
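The preprocessing steps named in mini project 1 (imputing missing values, engineering features, and encoding categorical data) can be sketched as a single scikit-learn pipeline. This is a minimal illustration only, not the project's actual solution: the toy data, the column names (`beds`, `ownership`, `deficiencies`, `fined`), and the engineered `deficiencies_per_bed` feature are all hypothetical, and only the classification half of the task (fined or not) is shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data; every column name and value is invented for illustration.
df = pd.DataFrame({
    "beds": [120, 80, np.nan, 200, 60, 150, np.nan, 90],
    "ownership": ["for_profit", "non_profit", "government", "for_profit",
                  np.nan, "for_profit", "non_profit", "government"],
    "deficiencies": [12, 2, 5, 20, 1, 15, 3, 4],
    "fined": [1, 0, 0, 1, 0, 1, 0, 0],
})

# Feature engineering (assumed example): deficiencies per bed.
# Rows with a missing bed count yield NaN here and are imputed below.
df["deficiencies_per_bed"] = df["deficiencies"] / df["beds"]

numeric_cols = ["beds", "deficiencies", "deficiencies_per_bed"]
categorical_cols = ["ownership"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("classify", LogisticRegression()),
])

X = df[numeric_cols + categorical_cols]
y = df["fined"]
model.fit(X, y)

predictions = model.predict(X)          # predicted class per provider
probabilities = model.predict_proba(X)  # class probabilities per provider
print(predictions)
```

Keeping imputation and encoding inside the pipeline means they are fitted only on the data passed to `fit`, which helps avoid leaking information from held-out data during cross-validation.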
In mini project 2, earners of this badge used natural language processing to train various machine learning models to predict an Amazon review rating from the text of the review. They then used one of the trained models to gain insight into the reviews by identifying highly polar words, that is, the words that most strongly influence the model’s prediction.