Bayes' Theorem – Introduction

Bayes' Theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

Bayes' Theorem Formula:

P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}

Where:

- P(A|B) is the posterior probability: the probability of A given that B has occurred.
- P(B|A) is the likelihood: the probability of B given that A has occurred.
- P(A) is the prior probability of A.
- P(B) is the marginal probability of B (the evidence), which normalizes the result.
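
To make the formula concrete, here is a minimal numeric sketch in Python. The prior, likelihood, and evidence values below are made up purely for illustration.

```python
# Hypothetical values, chosen only to illustrate the formula.
p_a = 0.01         # P(A): prior probability of the event A
p_b_given_a = 0.9  # P(B|A): likelihood of observing B when A holds
p_b = 0.05         # P(B): overall (marginal) probability of B

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)  # ≈ 0.18
```
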
Naive Bayes Assumption

"Naive" refers to the assumption of feature independence, meaning:

P(x_1, x_2, ..., x_n | y) = P(x_1|y) \cdot P(x_2|y) \cdot ... \cdot P(x_n|y)

This greatly simplifies the computation: instead of estimating a potentially high-dimensional joint probability $P(x_1, x_2, ..., x_n | y)$, we only need to estimate each individual conditional probability $P(x_i | y)$ and multiply them together.
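
The following short Python sketch illustrates the factorization with hypothetical per-feature probabilities (in practice these would be estimated from training data, e.g., by counting feature frequencies per class).

```python
import math

# Hypothetical conditional probabilities P(x_i | y) for one class y,
# one entry per feature of a single example.
p_xi_given_y = [0.8, 0.3, 0.6]

# Naive Bayes assumption: the joint conditional probability factorizes
# into the product of the individual conditional probabilities.
p_x_given_y = math.prod(p_xi_given_y)
print(p_x_given_y)  # ≈ 0.144
```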

Impact on the Equation:

Applied to a class $y$ and features $x_1, x_2, ..., x_n$, Bayes' Theorem reads:

P(y|x_1, x_2, ..., x_n) = \frac{P(y) \cdot P(x_1, x_2, ..., x_n | y)}{P(x_1, x_2, ..., x_n)}

Under the Naive Bayes assumption, the likelihood in the numerator factorizes, and the posterior becomes:

P(y|x_1, ..., x_n) \propto P(y) \cdot \prod_{i=1}^n P(x_i | y)

The denominator $P(x_1, ..., x_n)$ can be ignored for classification since it is the same for every class; we simply predict the class $y$ that maximizes the right-hand side.
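
Putting the pieces together, here is a minimal classification sketch, assuming the priors and per-feature likelihoods have already been estimated (the class names and probability values are hypothetical). It works in log space, a common trick to avoid numerical underflow when many small probabilities are multiplied.

```python
import math

# Hypothetical, pre-estimated parameters for two classes.
priors = {"spam": 0.4, "ham": 0.6}   # P(y)
likelihoods = {                       # P(x_i | y) for the features of one example
    "spam": [0.7, 0.2, 0.9],
    "ham":  [0.1, 0.5, 0.3],
}

def classify(priors, likelihoods):
    scores = {}
    for y in priors:
        # log P(y) + sum_i log P(x_i | y); the denominator P(x_1, ..., x_n)
        # is dropped because it is identical for every class.
        scores[y] = math.log(priors[y]) + sum(math.log(p) for p in likelihoods[y])
    # Predict the class with the highest (unnormalized) log-posterior score.
    return max(scores, key=scores.get)

print(classify(priors, likelihoods))  # "spam" for these particular numbers
```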