Week 15 - Machine Learning - (Supervised Learning) Introduction to Regression from Principles of M.L. Python by Microsoft Learning

Liaw Bei Le · June 28, 2021

  • Linear regression using scikit-learn
  • Overview of linear regression
  • Data preparation
    • Split the dataset
      • create independently sampled training and test datasets
      • .train_test_split( ) to perform Bernoulli sampling
    • Scale numeric features
      • Min-Max normalization
      • Z-Score normalization
        • StandardScaler( ), then .fit( ) on the training features and .transform( ) on both the training and test features
  • Train the regression model
    • Instantiate and fit the model
      • .LinearRegression( ) & .fit( )
      • print the model coefficients
    • Plot the predicted values computed from the training features
      • The predict method is applied to the model with the training data
  • Evaluate model performance
    • Using the test dataset
    • Choose & compute performance metric
      • Mean squared error or MSE
      • Root mean squared error or RMSE
      • Mean absolute error or MAE
      • Median absolute error
      • R squared or 𝑅², also known as the coefficient of determination
      • Adjusted R squared
    • Residuals (errors) of the model should be Normally distributed
      • plot a kernel density plot and histogram of the residuals of the regression model
      • Quantile-Quantile Normal plot, or Q-Q Normal plot
        • If the residuals are approximately Normally distributed, the points will fall close to a straight line
      • Residual plot
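
The outline above can be sketched end to end as follows. This is only a minimal sketch on synthetic data, not the course notebook: the dataset, the helper name `run_linear_regression`, the 70/30 split and the random seed are all assumptions made for illustration. The Q-Q check uses `scipy.stats.probplot` without drawing anything and only reports how closely the residual quantiles follow a straight line.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             median_absolute_error, r2_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def run_linear_regression(seed=0):
    """Hypothetical helper: split, scale, fit and evaluate on synthetic data."""
    # Synthetic data (an assumption): two features, a linear signal plus Normal noise.
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 10.0, size=(200, 2))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0.0, 1.0, size=200)

    # Split into independently sampled training and test sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)

    # Z-Score scaling: fit the scaler on the training features only,
    # then apply the same transformation to both sets.
    scaler = StandardScaler().fit(X_train)
    X_train_s = scaler.transform(X_train)
    X_test_s = scaler.transform(X_test)

    # Instantiate and fit the model.
    model = LinearRegression().fit(X_train_s, y_train)

    # Evaluate on the test set.
    y_pred = model.predict(X_test_s)
    residuals = y_test - y_pred
    n, p = X_test_s.shape
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    metrics = {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": mean_absolute_error(y_test, y_pred),
        "Median AE": median_absolute_error(y_test, y_pred),
        "R2": r2,
        "Adjusted R2": 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1),
    }

    # Q-Q Normal check: qq_r is the correlation between the ordered residuals
    # and the theoretical Normal quantiles (close to 1 means nearly a straight line).
    (_, _), (_, _, qq_r) = stats.probplot(residuals, dist="norm")
    return model, metrics, residuals, qq_r


if __name__ == "__main__":
    model, metrics, residuals, qq_r = run_linear_regression()
    print("intercept:", model.intercept_, "coefficients:", model.coef_)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
    print(f"Q-Q correlation: {qq_r:.4f}")
```

The returned `residuals` array can be passed to a histogram, kernel density plot or residual plot in whichever plotting library is at hand.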

Model complexity and Generalisation Performance (Regression):

  • Plot a scatterplot of data points in the training and test sets
  • Write a function that fits a polynomial LinearRegression model on training data for varying degrees
  • Find the polynomial degrees that correspond to underfitting, overfitting, and good generalization performance on the dataset, based on R squared scores
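
The steps above might look roughly like this. It is a sketch under stated assumptions rather than the exercise's actual solution: the function names `r2_by_degree` and `demo`, the synthetic sine-shaped data and the degree range 1 to 9 are all chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures


def r2_by_degree(X_train, y_train, X_test, y_test, degrees=range(1, 10)):
    """Fit a polynomial LinearRegression per degree; return {degree: (train R2, test R2)}."""
    scores = {}
    for degree in degrees:
        # Expand the features into polynomial terms up to the given degree.
        poly = PolynomialFeatures(degree=degree)
        Xp_train = poly.fit_transform(X_train)
        Xp_test = poly.transform(X_test)
        model = LinearRegression().fit(Xp_train, y_train)
        scores[degree] = (
            r2_score(y_train, model.predict(Xp_train)),
            r2_score(y_test, model.predict(Xp_test)),
        )
    return scores


def demo(seed=0):
    # Synthetic one-feature dataset (an assumption): a sine curve plus noise.
    rng = np.random.default_rng(seed)
    X = rng.uniform(-3.0, 3.0, size=(100, 1))
    y = np.sin(2.0 * X[:, 0]) + rng.normal(0.0, 0.2, size=100)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)
    return r2_by_degree(X_train, y_train, X_test, y_test)


if __name__ == "__main__":
    for degree, (train_r2, test_r2) in demo().items():
        print(f"degree {degree}: train R2 = {train_r2:.3f}, test R2 = {test_r2:.3f}")
```

Reading the scores: low R squared on both sets suggests underfitting, a high training score with a much lower test score suggests overfitting, and a degree where both are high is a reasonable candidate for good generalization.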

These are my notes taken from Microsoft Learning’s Principles of Machine Learning in Python - Module 4.

What I learnt:

Introduction to Regression

Method: Train the model to minimize the difference between the predicted values ŷ and the known label values y.
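
One common way to make "the difference" concrete is ordinary least squares, which is the objective that scikit-learn's LinearRegression fits. The symbols w (coefficients), b (intercept), x_i (feature vector of case i) and n (number of training cases) are notation introduced here for illustration, not taken from the course material.

```latex
\hat{y}_i = \mathbf{w}^\top \mathbf{x}_i + b,
\qquad
\min_{\mathbf{w},\, b} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```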

Goal of regression: To produce a model that represents the ‘best fit’ to some observed data. Find a function of some features X which predicts the label value y.

I have reviewed and worked through Introduction to Regression at this link.
