Linear Regression in Go - Part 2
Fri, Nov 13, 2015 | In the previous post we covered the hypothesis function, the function that predicts a value for a new, unseen case given its features. In this post we're going to build the cost function, a way to measure the error of the hypothesis with a specific set of weights.
For convenience, here is the function we discussed earlier:
$$h_\theta(x) = \theta^T x$$
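As a quick refresher, $\theta^T x$ is just a dot product. Here is a minimal sketch with plain slices (the names and the example prices are invented for illustration; the real code later in the post uses gonum):

```go
package main

import "fmt"

// hypothesis computes h_theta(x) = theta^T x as a plain dot product.
func hypothesis(theta, x []float64) float64 {
	var h float64
	for i := range theta {
		h += theta[i] * x[i]
	}
	return h
}

func main() {
	// theta = [50, 0.5]: a base price of 50 plus 0.5 per square meter,
	// with x[0] = 1 acting as the usual bias/intercept term.
	theta := []float64{50, 0.5}
	x := []float64{1, 100} // a 100 m^2 house
	fmt.Println(hypothesis(theta, x)) // 100
}
```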
As we said, linear regression is a supervised algorithm: we need to train
it with a list of examples in order to find the values of the vector $\theta$
that minimize the average error (we'll discuss a common pitfall called overfitting later).
The cost function calculates the error of a given $\vec{\theta}$ on
a training set. The following graph shows a subset of the previous examples, just 5 houses,
plus a green line that is the plot of the hypothesis function; the thin red lines
are the differences between a value in the training set and the value predicted by our hypothesis.
To calculate the error we’ll use the following function:
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}(h_\theta(x_i) - y_i)^2$$
It's basically half the mean of the squares of the differences between the predicted value $h_\theta(x)$
and the actual value $y$.
Consider that now $X$ is an $m \times n$ matrix, where $m$ is the number of examples in our training set (in the graph plotted here we have 5 houses) and $n$ is the number of features (like house size, # of bathrooms, # of bedrooms and so on).
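Before the matrix version, it can help to compute $J(\theta)$ by hand for a tiny example. This sketch uses plain slices and made-up numbers (the `cost` helper here is illustrative, not the function from the post):

```go
package main

import "fmt"

// cost computes J(theta) given precomputed predictions h and targets y:
// the sum of squared errors divided by 2m.
func cost(h, y []float64) float64 {
	m := float64(len(y))
	var sum float64
	for i := range y {
		d := h[i] - y[i]
		sum += d * d
	}
	return sum / (2 * m)
}

func main() {
	h := []float64{2.5, 3.5} // predicted values
	y := []float64{2.0, 4.0} // actual values
	// errors are 0.5 and -0.5, squares 0.25 each,
	// so J = (0.25 + 0.25) / (2 * 2)
	fmt.Println(cost(h, y)) // 0.125
}
```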
So here is our Go implementation using the gonum matrix package:
func Cost(x *mat64.Dense, y, theta *mat64.Vector) float64 {
	// initialize the receivers
	m, _ := x.Dims()
	h := mat64.NewDense(m, 1, make([]float64, m))
	squaredErrors := mat64.NewDense(m, 1, make([]float64, m))

	// h = X * theta, then square the element-wise differences with y
	h.Mul(x, theta)
	squaredErrors.Apply(func(r, c int, v float64) float64 {
		return math.Pow(h.At(r, c)-y.At(r, c), 2)
	}, h)
	j := mat64.Sum(squaredErrors) / (2.0 * float64(m))

	return j
}
As usual, the full code is here and a test is here.
In part 3 we're going to build the method that minimizes the error by choosing the proper $\theta$
values.
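To see why such a method is needed, here is a toy sweep over a few candidate $\theta$ values for a single feature with $h_\theta(x) = \theta x$ (data and helper invented for illustration): the cost dips to zero at the right $\theta$ and grows on either side.

```go
package main

import "fmt"

// j evaluates the cost J(theta) for a single-feature hypothesis
// h_theta(x) = theta * x over a toy training set.
func j(theta float64, xs, ys []float64) float64 {
	m := float64(len(xs))
	var sum float64
	for i := range xs {
		d := theta*xs[i] - ys[i]
		sum += d * d
	}
	return sum / (2 * m)
}

func main() {
	// toy data where y is exactly 2*x, so theta = 2 is the minimum
	xs := []float64{1, 2, 3}
	ys := []float64{2, 4, 6}
	for _, t := range []float64{1, 2, 3} {
		fmt.Println(t, j(t, xs, ys))
	}
}
```

Finding that minimum automatically, instead of guessing values, is exactly what part 3 covers.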
You can find part 3 here