Naive Bayes is widely used for text classification tasks such as spam detection, sentiment analysis, and topic categorization, thanks to its simplicity, speed, and strong baseline accuracy.
🧠 Why Naive Bayes Works Well for Text
- In text, features are words (tokens).
- A document is treated as a bag of words: order is ignored and only word counts matter (see the sketch after this list).
- The Naive Bayes assumption (words are conditionally independent given the class) rarely holds exactly, yet the classifier still performs well in practice.
- It’s fast and scalable for large text datasets.
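To make the bag-of-words idea concrete, here is a minimal sketch using scikit-learn's `CountVectorizer` (the two example documents are invented for illustration):

```python
# A minimal bag-of-words sketch using scikit-learn's CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "free prize money now",
    "meeting agenda for tomorrow",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)  # sparse matrix of word counts

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.toarray())  # one row per document, one column per word
```

Each row of the resulting matrix is a document and each column counts one vocabulary word, which is exactly the representation Multinomial Naive Bayes consumes.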
🏷️ Multinomial Naive Bayes for Text Classification
This is the most commonly used version of Naive Bayes for text classification.
1. Model Assumptions:
- Features are word counts (i.e., how many times each word appears in a document).
- Documents are assumed to be generated by a multinomial distribution over the vocabulary (see the sketch below).
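Under those assumptions, a typical end-to-end setup pairs a count vectorizer with `MultinomialNB`. The sketch below uses scikit-learn; the tiny labeled dataset is invented for illustration:

```python
# Multinomial Naive Bayes for spam detection, sketched with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "win free money now",
    "claim your free prize",
    "meeting rescheduled to friday",
    "please review the attached report",
]
train_labels = ["spam", "spam", "ham", "ham"]

# The pipeline learns the vocabulary and the per-class word probabilities.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_docs, train_labels)

print(model.predict(["free money prize"]))               # likely ['spam']
print(model.predict(["report for the friday meeting"]))  # likely ['ham']
```

Using `make_pipeline` ensures the vocabulary learned at fit time is reused at prediction time, so new documents are counted against the same columns.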
🔢 Multinomial Naive Bayes Formula
For class $y$ and document $d$ with words $x_1, x_2, \ldots, x_n$:

$$P(y \mid d) \propto P(y) \cdot \prod_{i=1}^{n} P(x_i \mid y)^{\text{count}(x_i, d)}$$
Where:
- $P(y)$ is the prior (probability of class $y$)
- $P(x_i \mid y)$ is the probability of word $x_i$ given class $y$
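Worked in code, the scoring rule above can be evaluated directly. The sketch below computes the log of the right-hand side (logs avoid numeric underflow from multiplying many small probabilities); the toy counts, priors, and the add-one (Laplace) smoothing constant `alpha` are assumptions for illustration:

```python
# From-scratch evaluation of the multinomial NB scoring rule, in log space.
import math
from collections import Counter

# Toy training documents per class (invented for illustration).
class_docs = {
    "spam": ["free money free prize", "win money now"],
    "ham":  ["meeting agenda today", "review the report"],
}
priors = {"spam": 0.5, "ham": 0.5}

# Word counts per class, plus the vocabulary for Laplace smoothing.
word_counts = {c: Counter(" ".join(docs).split()) for c, docs in class_docs.items()}
vocab = {w for counts in word_counts.values() for w in counts}

def log_posterior(doc, c, alpha=1.0):
    """log P(c) + sum over words of count(w, doc) * log P(w | c)."""
    counts = word_counts[c]
    total = sum(counts.values())
    score = math.log(priors[c])
    for w, n in Counter(doc.split()).items():
        p = (counts.get(w, 0) + alpha) / (total + alpha * len(vocab))
        score += n * math.log(p)
    return score

doc = "free prize money"
print(max(priors, key=lambda c: log_posterior(doc, c)))  # likely 'spam'
```

Because $\propto$ hides a normalizing constant shared by every class, comparing log scores across classes is enough to pick the most probable one.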