This assessment gives you an opportunity to explore the impact of applying various Machine Learning techniques to a dataset in a sandbox environment. You will complete a Programming Exercise from Google that introduces you to modelling in Machine Learning. Although the exercise is limited to Linear Regression, it explores that technique in detail, including the feature engineering, transformations and hyperparameter tuning involved in applying different implementations of the regression algorithm. There is also an emphasis on understanding the impact of various feature transformations. Although the dataset provided for this task is simple, there will be opportunities to apply normalisation techniques.
You should follow the task instructions set out in the Google lab to set up and run the various libraries and environments and to load the dataset. The instructions will take you through various tasks, including identifying applicable ML models and exploring appropriate hyperparameters and feature transformations. While writing your own models, think outside the box and see whether a custom ML model can be made. As there is no “one right answer” to this task, the assessment seeks to help you explore the impact of the various possible options and so further your own understanding. The task instructions and rubric outline in detail what work at each grade level is expected to demonstrate.
The assessment also requires you to write a manual of approximately 500 words that explains the models and ML techniques used, describes the impact they had on the data exploration and visualisation task, and evaluates their efficiency. Once again, this is an exploration task: the marking criteria focus on your analysis of, and conclusions about, the effectiveness of the models you have investigated.
This assessment presents an opportunity to study Linear Regression in more detail. The models available for solving regression problems form a significant part of a data scientist’s toolset, as many use cases in academia and industry require knowledge of how to process datasets where the outcome to be predicted lies on a continuous range. This task will help you explore the different models available for such problems and use their wide variety of properties, including hyperparameters, together with techniques such as feature engineering, so that you understand the impact of each change. You will also have the opportunity to visualise and analyse the impact of your changes.
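As a rough illustration of how normalisation and a hyperparameter such as the learning rate interact in Linear Regression, consider the minimal sketch below. It is not taken from the Google lab: the synthetic data, the helper names (`z_score`, `fit_linear_regression`, `rmse`) and the chosen values are all assumptions made for illustration, and the model is fitted with plain batch gradient descent in NumPy rather than with the libraries used in the lab.

```python
import numpy as np


def z_score(x):
    """Normalise each column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def fit_linear_regression(x, y, learning_rate=0.1, epochs=500):
    """Fit y ~ x @ w + b by batch gradient descent on mean squared error."""
    n_samples, n_features = x.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        error = x @ w + b - y
        # Gradients of the mean squared error with respect to w and b.
        w -= learning_rate * (2.0 / n_samples) * (x.T @ error)
        b -= learning_rate * (2.0 / n_samples) * error.sum()
    return w, b


def rmse(x, y, w, b):
    """Root mean squared error of the fitted line on (x, y)."""
    return float(np.sqrt(np.mean((x @ w + b - y) ** 2)))


# Synthetic data (an assumption, not the lab's dataset):
# two features on very different scales plus a little noise.
rng = np.random.default_rng(0)
x_raw = np.column_stack([rng.uniform(0, 1, 200), rng.uniform(0, 1000, 200)])
y = 3.0 * x_raw[:, 0] + 0.01 * x_raw[:, 1] + rng.normal(0, 0.1, 200)

# Normalise the features, then fit with a fixed learning rate.
x_norm = z_score(x_raw)
w, b = fit_linear_regression(x_norm, y, learning_rate=0.1, epochs=500)
print("RMSE on normalised features:", rmse(x_norm, y, w, b))
```

In this sketch, passing `x_raw` instead of `x_norm` with the same learning rate would be expected to make gradient descent diverge, because the second feature is roughly a thousand times larger than the first. Comparing the two runs, and then varying `learning_rate` and `epochs`, is one simple way to observe the kind of impact this assessment asks you to analyse.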
You need a Google account to complete this assessment. You can create a free Google account here:
https://myaccount.google.com/
Once you have an account, navigate to the Google-created lab, “Intro to Modelling”, here:
https://colab.research.google.com/github/google/eng-edu/blob/master/ml/fe/exercises/intro_to_modeling.ipynb?utm_source=ss-data-prep&utm_campaign=colab-external&utm_medium=referral&utm_content=intro_to_modeling
In addition to following the instructions outlined in the lab, you must: