Week 15 - Machine Learning - (Supervised Learning) Applying Linear Regression for automobile prices from Principles of M.L. Python by Microsoft Learning

Liaw Bei Le · July 1, 2021

Linear regression using scikit-learn
Use categorical data
Apply transformations to features and labels to improve model performance
Compare regression models to improve model performance
Prepare the model matrix
- Create dummy variables from categorical features
- Concatenate the numeric features
Data preparation
- Split the dataset
  - create independently sampled training dataset and test data set
  - .train_test_split( ) to perform Bernoulli sampling
- Scale numeric features
  - Z-Score normalization
    - .StandardScaler( ) then .transform( )
Construct the regression model
- Instantiate and fit the model
  - .LinearRegression( ) & .fit( )
  - print the intercept term and model coefficients
Evaluate model performance
- Plot the predicted values computed from the training features - The predict method is applied to the model with the training data
- Using the test dataset
- Choose & compute performance metric
  - Mean squared error or MSE
  - Root mean squred error or RMSE
  - Mean absolute error or MAE
  - Median absolute error
  - R squared or 𝑅2 , also known as the coefficient of determination
  - Adjusted R squared
- Residuals (errors) of the model should be Normally distributed
  - plot a kernel density plot and histogram of the residuals of the regression model
  - Quantile-Quantile Normal plot, or Q-Q Normal plot
    - If the residuals have a distribution which is approximately Normal, points will nearly fall along the straight line
  - Residual plot

These are my notes taken from Microsoft Learning’s Principles of Machine Learning in Python - Module 4.

What I learnt:

Applying Linear Regression

Goal: To construct a linear regression model to predict the price of automobiles from their characteristics

I have reviewed and went through Applying Linear Regression for automobile prices in this link.

Share: Twitter, Facebook