1. Introduction to RNNs

Recurrent Neural Networks (RNNs) are specialized neural networks designed to work with sequential data like text, speech, or time-series. Unlike standard neural networks, which process inputs independently, RNNs maintain a "memory" of past inputs using hidden states. This makes them ideal for tasks where context matters, such as predicting the next word in a sentence or understanding a sentence’s sentiment.

Why RNNs?


2. RNN Architecture

Imagine reading a book page by page while trying to remember the story. RNNs work similarly: they process sequences step-by-step while carrying forward a "summary" of past information.

Core Mechanism

Example:

For the sentence "The cat sat on the mat", the RNN processes each word while updating its hidden state:

"the" → updates hidden state → "cat" → updates hidden state → ... → "mat"

Unrolling the RNN

Visualize the RNN as a chain of repeated cells:

Input: [x₁] → [Cell] → [h₁] → [x₂] → [Cell] → [h₂] → ...

Each cell shares the same weights, ensuring consistency across time steps.