
Lecture 2

Linear Regression Models

Cat Hearts example:

Experience, $E$

Learning Task, $T$

Linear Regression Model

Performance Measure, $P$
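
To make the $E$/$T$/$P$ framing concrete, here is a minimal sketch of the setup. The toy body-weight/heart-weight numbers, the model $h(x) = wx + b$, and the choice of mean squared error as $P$ are illustrative assumptions, not taken from the lecture:

```python
import numpy as np

# Toy stand-in for the cat-hearts data: body weight (kg) -> heart weight (g).
# The numbers are illustrative, not the lecture's actual dataset.
x = np.array([2.0, 2.4, 2.9, 3.0, 3.4])    # experience E: inputs
y = np.array([7.0, 8.1, 9.6, 10.1, 11.2])  # experience E: targets

def h(x, w, b):
    """Linear regression model for the task T: predict heart weight."""
    return w * x + b

def mse(w, b):
    """Performance measure P: mean squared error of the model on E."""
    return np.mean((h(x, w, b) - y) ** 2)

print(mse(3.0, 1.0))  # cost of one candidate setting of (w, b)
```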

Unconstrained Optimisation (Minimisation)

Given a continuous function $f: \R \rightarrow \R$, we want to find an $x$ that minimises $f(x)$.


Theorem:
For any differentiable function $f: \R \rightarrow \R$, if $x$ is a local optimum, then $f'(x) = 0$.


Definition:
The $1^{st}$ derivative of a function $f: \R \rightarrow \R$ is
f'(x) = \lim_{\Delta x \rightarrow 0}\frac{f(x+\Delta x) - f(x)}{\Delta x}
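
As a quick numerical check of this definition, the limit can be approximated with a small but finite $\Delta x$. A sketch; the example function and step size are my own choices:

```python
def derivative(f, x, dx=1e-6):
    # Approximate the limit with a small but finite step dx.
    return (f(x + dx) - f(x)) / dx

f = lambda x: x ** 3           # example function (illustrative, not the lecture's)
print(derivative(f, 2.0))      # ~= 12.0, matching f'(x) = 3x^2 at x = 2
```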

Differentiation Rules
  1. $(cf(x))' = cf'(x)$
  2. $(x^k)' = kx^{k-1}$, if $k \neq 0$
  3. $(f(x)+g(x))' = f'(x) + g'(x)$
  4. $(f(g(x)))' = f'(g(x)) \cdot g'(x)$ $\leftarrow$ chain rule
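
As a worked example combining rules 1, 2 and the chain rule, take a single squared-error term (this particular form of $J$ is my illustration; the lecture's cost may differ):

J(w) = (wx - y)^2
J'(w) = 2(wx - y) \cdot (wx - y)' = 2(wx - y) \cdot x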
Approach 1: Ordinary least squares
Approach 2: Gradient descent
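
For Approach 1 in the 1-D case, a minimal sketch that solves the normal equations with NumPy (the toy data is the same illustrative set as above, and framing OLS as $X^\top X \beta = X^\top y$ is an assumption about where the lecture is heading); Approach 2 is developed below:

```python
import numpy as np

x = np.array([2.0, 2.4, 2.9, 3.0, 3.4])    # same toy data as above
y = np.array([7.0, 8.1, 9.6, 10.1, 11.2])

# Design matrix with a column of ones so the intercept b is learned too.
X = np.column_stack([x, np.ones_like(x)])

# Ordinary least squares: solve the normal equations  X^T X [w, b]^T = X^T y.
w, b = np.linalg.solve(X.T @ X, X.T @ y)
print(w, b)
```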

Idea: repeatedly nudge $w$ in the direction that decreases the cost $J(w)$, using the sign of $J'(w)$.

Attempt 1 (failed)

$w \leftarrow$ initial weight
repeat:
       if $J'(w) < 0$:
             $w \leftarrow w + \epsilon$
       else if $J'(w) > 0$:
             $w \leftarrow w - \epsilon$
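
A runnable sketch of Attempt 1 (the example cost, step size and iteration count are illustrative assumptions, not from the lecture):

```python
def attempt1(J_prime, w, eps=0.1, steps=50):
    # Fixed-size steps: move right if the slope is negative, left if positive.
    for _ in range(steps):
        if J_prime(w) < 0:
            w = w + eps
        elif J_prime(w) > 0:
            w = w - eps
    return w

# Example cost J(w) = (w - 3)^2, so J'(w) = 2(w - 3); the minimum is at w = 3.
print(attempt1(lambda w: 2 * (w - 3), w=0.0))  # ends up hovering around 3
```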

Issue with this attempt: the step size $\epsilon$ is fixed, so $w$ overshoots and oscillates around the optimum instead of converging, and the update ignores how steep $J$ is.

Attempt 2: Gradient Descent (1D)

$w \leftarrow$ initial weight
repeat:
       $w \leftarrow w - \epsilon \cdot J'(w)$

Because $J'(w)$ carries its own sign, this single update moves $w$ in the right direction in both cases, and the step shrinks as the slope flattens near the optimum.
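
A runnable sketch of the 1-D gradient descent update (again with an illustrative cost and learning rate; the tolerance-based stopping rule is my assumption):

```python
def gradient_descent_1d(J_prime, w, eps=0.1, tol=1e-8, max_steps=10_000):
    # The step is proportional to |J'(w)|, so updates shrink near the optimum.
    for _ in range(max_steps):
        step = eps * J_prime(w)
        w = w - step
        if abs(step) < tol:    # near-zero gradient: (local) optimum reached
            break
    return w

# Same example cost J(w) = (w - 3)^2: converges to w = 3.
print(gradient_descent_1d(lambda w: 2 * (w - 3), w=0.0))
```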