When features are continuous (like height, weight, or age), Naive Bayes can't use frequency counts. Instead, we model the likelihood $P(x_i \mid y)$ with a probability distribution, most commonly a Gaussian (normal) distribution.
🔸 Gaussian Naive Bayes Assumption
For each feature $x_i$, conditioned on class $y = c$, we assume the data follows a normal distribution:

$$P(x_i \mid y = c) = \frac{1}{\sqrt{2\pi\sigma_c^2}} \cdot \exp\left( -\frac{(x_i - \mu_c)^2}{2\sigma_c^2} \right)$$
Where:
- $\mu_c$: mean of feature $x_i$ for class $y = c$
- $\sigma_c^2$: variance of feature $x_i$ for class $y = c$
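As a quick illustration, the Gaussian PDF above can be evaluated directly with the standard library (a minimal sketch; the function name is ours):

```python
import math

def gaussian_pdf(x, mu, var):
    """Likelihood of x under a normal distribution with mean mu and variance var."""
    coeff = 1.0 / math.sqrt(2 * math.pi * var)
    return coeff * math.exp(-((x - mu) ** 2) / (2 * var))

# e.g. likelihood of a 172 cm height given class mean 175 and variance 25
print(gaussian_pdf(172, 175, 25))
```

Note that this returns a probability *density*, not a probability, so values can exceed 1 for small variances.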
🔹 Steps to Estimate Probabilities
- Split the training data by class $y = c$
- For each feature $x_i$:
  - Compute the mean $\mu_{ic}$ and variance $\sigma^2_{ic}$ from the training data for each class
  - Use the Gaussian PDF to compute likelihoods $P(x_i \mid y = c)$ at prediction time
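The estimation steps above can be sketched in plain Python (a hedged sketch; the function and variable names are illustrative, and we use the population variance to match the worked example below):

```python
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class mean and variance for each feature.

    X: list of feature vectors, y: list of class labels.
    Returns {class: [(mean, variance), ...]}, one pair per feature.
    """
    # Step 1: split the training data by class
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)

    # Step 2: compute mean and variance of each feature per class
    params = {}
    for c, rows in by_class.items():
        n = len(rows)
        stats = []
        for j in range(len(rows[0])):
            col = [r[j] for r in rows]
            mu = sum(col) / n
            var = sum((v - mu) ** 2 for v in col) / n  # population variance
            stats.append((mu, var))
        params[c] = stats
    return params

# Fit on the height example below: two Male and two Female samples
params = fit_gaussian_nb([[170], [180], [160], [165]], ["Male", "Male", "Female", "Female"])
print(params)
```

At prediction time, these stored `(mean, variance)` pairs are plugged into the Gaussian PDF to get each feature's likelihood.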
✅ Example
Suppose you have a feature "height" and two classes: Male and Female.
Training data:
| Height (cm) | Gender |
| --- | --- |
| 170 | Male |
| 180 | Male |
| 160 | Female |
| 165 | Female |
Step 1: Compute mean and variance per class
- Male:
  - Mean: $\mu = \frac{170 + 180}{2} = 175$
  - Variance: $\sigma^2 = \frac{(170-175)^2 + (180-175)^2}{2} = 25$