Imagine you’re house-hunting and see a listing that reads “3-bed, 2-bath, 1 850 ft², built in 2010”. Your first instinct is to guess the price. That guess is exactly what linear regression tries to formalize: find a straight-line rule that turns facts (features) into the number we care about (price).
We hypothesize that the relationship between inputs and output is linear.

Let’s begin with a simple, classic example, house-price prediction, and try to fit a model to the data below:
| Living area (ft²) | #Bedrooms | Age (yrs) | … | Price (\$k) |
|---|---|---|---|---|
| 1 650 | 3 | 10 | … | 510 |
| 890 | 2 | 36 | … | 239 |
| … | … | … | … | … |
We want a function that maps the features to the price.
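To keep things concrete, here is a minimal sketch, assuming NumPy, of holding the two complete rows from the table above as a feature matrix `X` and a target vector `y`; the variable names are purely illustrative.

```python
import numpy as np

# Each row is one house: [living area (ft²), #bedrooms, age (yrs)];
# the values are the two complete rows from the table above.
X = np.array([
    [1650.0, 3.0, 10.0],
    [890.0, 2.0, 36.0],
])

# Target we want to predict: price in $k, one entry per house.
y = np.array([510.0, 239.0])

print(X.shape)  # (2, 3): 2 samples, 3 features
print(y.shape)  # (2,)
```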
Real houses have many features. Let $x_j^{(i)}$ denote the $j^{th}$ feature of the $i^{th}$ sample (the parenthesized superscript is an index, not an exponent).
So the equation becomes:
$$ y = m_1 x_1 + m_2 x_2 + \dots + m_n x_n + c $$
Where:
- $x_1, x_2, \dots, x_n$ are the feature values (living area, number of bedrooms, age, …),
- $m_1, m_2, \dots, m_n$ are the weights (slopes) learned for each feature,
- $c$ is the intercept (bias) term,
- $y$ is the predicted price.
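As a quick sanity check of the formula, here is a small sketch that plugs made-up weights into it; the values of `m` and `c` are purely hypothetical, not learned from the data.

```python
# Purely hypothetical weights -- a real model would learn these from data.
m = [0.25, 20.0, -1.0]   # $k per ft², per bedroom, per year of age
c = 50.0                 # intercept, in $k

# Features of the first house from the table, in the same order as the weights.
x = [1650.0, 3.0, 10.0]

# y = m_1*x_1 + m_2*x_2 + ... + m_n*x_n + c
price = sum(m_j * x_j for m_j, x_j in zip(m, x)) + c
print(f"Predicted price: {price:.1f} $k")
```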
We can also write this equation in vectorized form. Stack the parameters into a single vector

$$ \theta = [m_1, m_2, \dots, m_n, c]^\top $$

and append a constant $1$ to the feature vector, $x = [x_1, x_2, \dots, x_n, 1]^\top$, so that the whole prediction collapses to a single dot product:

$$ y = \theta^\top x $$
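Here is a minimal sketch, assuming NumPy, of that vectorized prediction as a matrix product; the `theta` values are again hypothetical, and a column of ones is appended so the intercept $c$ is handled by the same dot product.

```python
import numpy as np

# Feature matrix: 2 houses, 3 features each (same rows as the earlier sketch).
X = np.array([
    [1650.0, 3.0, 10.0],
    [890.0, 2.0, 36.0],
])

# theta = [m_1, ..., m_n, c], with hypothetical values for illustration.
theta = np.array([0.25, 20.0, -1.0, 50.0])

# Append a column of ones so the intercept c multiplies a constant 1.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])   # shape (2, 4)

# Vectorized prediction: one theta^T x per row, computed in a single matrix product.
y_hat = X_aug @ theta
print(y_hat)   # predicted price ($k) for every house at once
```

Writing the prediction this way pays off later: the same `X_aug @ theta` expression scores thousands of houses at once, and it is the form most fitting procedures (normal equations, gradient descent) operate on.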