Imagine you’re house-hunting and see a listing that reads “3-bed, 2-bath, 1 850 ft², built in 2010”. Your first instinct is to guess the price. That guess is exactly what linear regression tries to formalize: find a straight-line rule that turns facts (features) into the number we care about (price).
We hypothesize that the relationship between inputs and output is linear.

Let’s begin with a simple, classic example, house-price prediction, and try to fit a model to the data below:
| Living area (ft²) | #Bedrooms | Age (yrs) | … | Price (\$k) |
|---|---|---|---|---|
| 1 650 | 3 | 10 | … | 510 |
| 890 | 2 | 36 | … | 239 |
| … | … | … | … | … |
We want a function that maps the features to the price.
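To keep things concrete, here is a minimal sketch, assuming NumPy, of holding the two complete rows from the table above as a feature matrix `X` and a target vector `y`; the variable names are purely illustrative.

```python
import numpy as np

# Each row is one house: [living area (ft²), #bedrooms, age (yrs)];
# the values are the two complete rows from the table above.
X = np.array([
    [1650.0, 3.0, 10.0],
    [890.0, 2.0, 36.0],
])

# Target we want to predict: price in $k, one entry per house.
y = np.array([510.0, 239.0])

print(X.shape)  # (2, 3): 2 samples, 3 features
print(y.shape)  # (2,)
```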
Real houses have many features. Let $x_j^{(i)}$ denote the $j^{th}$ feature of the $i^{th}$ sample (the parenthesized superscript is an index, not an exponent).
So the equation becomes:
$$ y = m_1 x_1 + m_2 x_2 + \dots + m_n x_n + c $$
Where:
- $x_1, x_2, \dots, x_n$ are the feature values (living area, number of bedrooms, age, …),
- $m_1, m_2, \dots, m_n$ are the weights (slopes) learned for each feature,
- $c$ is the intercept (bias) term,
- $y$ is the predicted price.
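As a quick sanity check of the formula, here is a small sketch that plugs made-up weights into it; the values of `m` and `c` are purely hypothetical, not learned from the data.

```python
# Purely hypothetical weights -- a real model would learn these from data.
m = [0.25, 20.0, -1.0]   # $k per ft², per bedroom, per year of age
c = 50.0                 # intercept, in $k

# Features of the first house from the table, in the same order as the weights.
x = [1650.0, 3.0, 10.0]

# y = m_1*x_1 + m_2*x_2 + ... + m_n*x_n + c
price = sum(m_j * x_j for m_j, x_j in zip(m, x)) + c
print(f"Predicted price: {price:.1f} $k")
```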
We can also write this equation in vectorized form. Stack the parameters into a single vector

$$ \theta = [m_1, m_2, \dots, m_n, c]^\top $$

and append a constant $1$ to the feature vector, $x = [x_1, x_2, \dots, x_n, 1]^\top$, so that the whole prediction collapses to a single dot product:

$$ y = \theta^\top x $$
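Here is a minimal sketch, assuming NumPy, of that vectorized prediction as a matrix product; the `theta` values are again hypothetical, and a column of ones is appended so the intercept $c$ is handled by the same dot product.

```python
import numpy as np

# Feature matrix: 2 houses, 3 features each (same rows as the earlier sketch).
X = np.array([
    [1650.0, 3.0, 10.0],
    [890.0, 2.0, 36.0],
])

# theta = [m_1, ..., m_n, c], with hypothetical values for illustration.
theta = np.array([0.25, 20.0, -1.0, 50.0])

# Append a column of ones so the intercept c multiplies a constant 1.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])   # shape (2, 4)

# Vectorized prediction: one theta^T x per row, computed in a single matrix product.
y_hat = X_aug @ theta
print(y_hat)   # predicted price ($k) for every house at once
```

Writing the prediction this way pays off later: the same `X_aug @ theta` expression scores thousands of houses at once, and it is the form most fitting procedures (normal equations, gradient descent) operate on.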