- Multivariate Linear Regression
I am learning about Machine Learning via the course offered by Coursera named: Machine Learning by Stanford University. This post covers Week 2 materials.
What I learnt:
Linear regression with multiple variables
Linear regression with multiple variables (or features) is also known as “multivariate linear regression”.
Multiple variables = multiple features
In original version we had:
X = house size, use this to predict
y = house price
If in another version, we have more variables (such as number of bedrooms, number floors, age of the home):
- m: number of training examples (i.e. number of rows in a table)
- xi: vector of the input for an example (so a vector of the four parameters for the ith input example)
- i is an index into the training set
So, x is an n-dimensional feature vector.
x3 is, for example, the 3rd house, and contains the four features associated with that house.
So x3(2) is, for example, the number of bedrooms in the third house.
Now that we have multiple features, what is the form of our hypothesis?
- An example of a hypothesis which is trying to predict the price of a house
- Parameters are still determined through a cost function
Here’s the form of the hypothesis, rewritten. The equation below is same as the above as x0=1:
For convenience of notation, x0 = 1.
- For every example i you have an additional 0th feature for each example.
- So now, our feature vector is n + 1 dimensional feature vector indexed from 0.
- This is a column vector called x.
- Each example has a column vector associated with it.
- So let’s say we have a new example called “X”.
- Parameters are also in a 0 indexed n+1 dimensional vector.
- This is also a column vector called θ.
- This vector is the same for each example.
If hθ(x) = θT X :
- θT is an 1 by (n+1) matrix.
- In other words, because θ is a column vector, the transposition operation transforms it into a row vector
- So before, θ was a matrix (n+1) by 1
- Now, θT is a matrix 1 by (n+1)
- Which means the inner dimensions of θT and X match, so they can be multiplied together as 1 by (n+1) * (n+1) by 1 = hθ(x).
- So, in other words, the transpose of our parameter vector * an input example X gives you a predicted hypothesis which is 1 by 1 dimensions (i.e. a single value).
- This x0 = 1 lets us write this like this.