Lecture 25: Linear Regression

Linear Regression is one of the most fundamental and widely used algorithms in machine learning and statistics. It models the relationship between a dependent variable (target) and one or more independent variables (features) by fitting a straight line (or hyperplane) through the data.

1) Mathematical Foundation

y = w0 + w1x1 + w2x2 + ... + wnxn + ε

The goal is to estimate coefficients w that minimize the squared error:

J(w) = (1/m) ∑ (yi - ŷi)²
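The cost above has a closed-form minimizer, the normal equation w = (XᵀX)⁻¹Xᵀy. A minimal NumPy sketch on synthetic data (the toy slope/intercept values are illustrative assumptions):

```python
import numpy as np

# Toy data: y ≈ 2 + 3x plus a little Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 + 3 * x + rng.normal(0, 0.5, size=x.shape)

# Design matrix with a column of ones for the intercept w0
X = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (XᵀX) w = Xᵀy to minimize the squared-error cost
w = np.linalg.solve(X.T @ X, X.T @ y)

# Cost J(w) = (1/m) ∑ (yi - ŷi)²
y_hat = X @ w
J = np.mean((y - y_hat) ** 2)
print("w0, w1:", w)
print("J(w):", J)
```

The recovered coefficients land close to the true (2, 3) used to generate the data, and J(w) hovers near the noise variance.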

2) Assumptions of Linear Regression

| Assumption | Description |
| --- | --- |
| Linearity | Relationship between predictors and target is linear. |
| Independence | Observations are independent of each other. |
| Homoscedasticity | Constant variance of errors across values of predictors. |
| No multicollinearity | Predictors should not be highly correlated with each other. |
| Normality of errors | Residuals are normally distributed. |
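Several of these assumptions can be checked directly from the fitted residuals. A sketch on synthetic data (the data-generating coefficients here are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

# Synthetic data satisfying the assumptions by construction
rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(0, 1, size=n)

# Fit by least squares and compute residuals
X = np.column_stack([np.ones(n), x1, x2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ w

# Normality of errors: Shapiro-Wilk test (large p-value → no evidence against normality)
stat, p = stats.shapiro(residuals)
print("Shapiro-Wilk p-value:", p)

# Multicollinearity check: correlation between predictors should be low
print("corr(x1, x2):", np.corrcoef(x1, x2)[0, 1])
```

On real data you would also plot residuals against fitted values to eyeball homoscedasticity.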

3) Graphical Representation

Simple Linear Regression

A straight line fitted to data points in 2D (X vs Y).

[Figure: Simple Linear Regression Graph]
Multiple Linear Regression

A hyperplane fitted in higher dimensions.

[Figure: Multiple Regression Graph]
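A plot like the 2D one above can be reproduced in a few lines of matplotlib; this sketch generates its own toy data and saves the figure to a file (filename and noise level are arbitrary choices):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Toy 2D data around a known line
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 30)
y = 1.5 * x + 4 + rng.normal(0, 2, size=x.shape)

# Least-squares line through the scatter
slope, intercept = np.polyfit(x, y, deg=1)

plt.scatter(x, y, label="data")
plt.plot(x, slope * x + intercept, color="red", label="fitted line")
plt.xlabel("X")
plt.ylabel("Y")
plt.legend()
plt.savefig("simple_linear_regression.png")
print("slope:", slope, "intercept:", intercept)
```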

4) Applications

Linear regression is widely used for tasks such as house-price estimation, sales and demand forecasting, financial risk modeling, and quantifying trends in experimental data.

5) Hands-on Example 1: Diabetes Progression (Regression Baseline)

Despite the name, scikit-learn's diabetes dataset poses a regression problem: the target is a continuous measure of disease progression one year after baseline. Linear regression therefore applies directly and makes a good first baseline model.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the 10 features and the continuous disease-progression target
X, y = load_diabetes(return_X_y=True)

# Hold out 20% of the data for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, preds))
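A raw MSE number is hard to judge on its own. One common sanity check, sketched here with scikit-learn's DummyRegressor, is to compare against a model that always predicts the training-set mean:

```python
from sklearn.datasets import load_diabetes
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Baseline: always predict the mean of y_train
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
model = LinearRegression().fit(X_train, y_train)

mse_base = mean_squared_error(y_test, baseline.predict(X_test))
mse_lr = mean_squared_error(y_test, model.predict(X_test))
print("baseline MSE:", mse_base)
print("linear regression MSE:", mse_lr)
```

If the fitted model does not beat this constant predictor, the features carry little linear signal.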

6) Hands-on Example 2: Sales Forecasting (Continuous Target)

We predict monthly sales using advertising spend on TV, radio, and newspaper.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Example Sales Data
sales = pd.DataFrame({
    "TV":[230.1,44.5,17.2,151.5,180.8],
    "Radio":[37.8,39.3,45.9,41.3,10.8],
    "Newspaper":[69.2,45.1,69.3,58.5,58.4],
    "Sales":[22.1,10.4,9.3,18.5,12.9]
})

X = sales[["TV","Radio","Newspaper"]]
y = sales["Sales"]

from sklearn.metrics import mean_squared_error

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)

preds = model.predict(X_test)

# With only 5 rows, the 20% test split holds a single sample, so R² is
# undefined there (it needs at least two test samples); report MSE instead,
# or use a larger dataset for a meaningful R².
print("MSE:", mean_squared_error(y_test, preds))

By interpreting coefficients, we can see which medium (TV, Radio, Newspaper) contributes most to sales.
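One way to do that inspection, sketched on the same five-row toy table (far too small for reliable estimates, so the coefficients are illustrative only):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same toy advertising data as above
sales = pd.DataFrame({
    "TV": [230.1, 44.5, 17.2, 151.5, 180.8],
    "Radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "Newspaper": [69.2, 45.1, 69.3, 58.5, 58.4],
    "Sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

X = sales[["TV", "Radio", "Newspaper"]]
y = sales["Sales"]

# Fit on all five rows (too few for a split) and print one weight per medium
model = LinearRegression().fit(X, y)
for name, coef in zip(X.columns, model.coef_):
    print(f"{name}: {coef:.4f}")
```

Because the media spends are on different scales, standardizing the features first makes the coefficient magnitudes directly comparable.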

7) Regularization (Ridge, Lasso, ElasticNet)

Ridge

Penalizes large coefficients using L2 norm.

J = RSS + λ ∑ w²

Lasso

Penalizes absolute values of coefficients (L1 norm) → feature selection.

J = RSS + λ ∑ |w|

ElasticNet

Combination of Ridge + Lasso.

J = RSS + λ1∑ w² + λ2∑ |w|
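The feature-selection effect of the L1 penalty is easy to demonstrate on synthetic data where only a few features matter (the dataset sizes and alpha value below are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data where only 3 of 10 features are informative
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

# Ridge shrinks coefficients toward zero; Lasso drives irrelevant ones exactly to zero
print("Ridge zero coefficients:", np.sum(ridge.coef_ == 0))
print("Lasso zero coefficients:", np.sum(lasso.coef_ == 0))
```

Lasso zeroes out most of the uninformative features, while Ridge keeps all ten coefficients nonzero.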

8) Practical Playbook

# Pipeline with scaling + regularization
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("ridge", Ridge(alpha=1.0))
])
pipe.fit(X_train, y_train)
Try This: swap Ridge for Lasso in the pipeline and compare the coefficients, then tune alpha with cross-validation.
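One way to tune alpha is GridSearchCV over the pipeline; this sketch uses synthetic data, and note that pipeline parameters are addressed as `<step>__<param>`:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for X_train, y_train
X, y = make_regression(n_samples=150, n_features=5, noise=10.0, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("ridge", Ridge()),
])

# The "ridge__" prefix routes the parameter to the pipeline's Ridge step
grid = GridSearchCV(pipe, {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}, cv=5)
grid.fit(X, y)
print("best alpha:", grid.best_params_["ridge__alpha"])
print("best CV R²:", grid.best_score_)
```

Scaling inside the pipeline (rather than before the split) keeps the cross-validation folds free of data leakage.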