## LSTM Neural Network and Autoencoder
Long Short-Term Memory (LSTM) networks belong to the family of recurrent neural network (RNN) architectures and are well suited to **capture long-range dependencies in sequential data**. The main limitation of plain RNNs is their tendency to forget older context due to the vanishing gradient problem; LSTMs address this with **gating mechanisms** that control information flow, allowing them to remember patterns over hundreds of timesteps. This makes them a good fit for anomaly detection on time-series data such as dynamical system measurements, where previous behaviour influences future behaviour.
At timestep $t$, the LSTM updates its **hidden state $h_t$** and **cell state $c_t$** using the **current input $x_t$** and **previous states $h_{t-1}, c_{t-1}$**. The main components of LSTMs are:
- **Forget gate $f_t$**: governs how much of the previous cell state to retain
- **Input gate $i_t$ and candidate state $\tilde{c}_t$**: decide how much new information to add
- **Output gate $o_t$**: controls which parts of the cell state influence the hidden state
Using sigmoid $\sigma$ and $\tanh$ activations, with learnable weights $W$ and biases $b$, the canonical equations for an LSTM network are:

$$
\begin{aligned}
f_t &= \sigma(W_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(W_i [h_{t-1}, x_t] + b_i) \\
\tilde{c}_t &= \tanh(W_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \cdot c_{t-1} + i_t \cdot \tilde{c}_t \\
o_t &= \sigma(W_o [h_{t-1}, x_t] + b_o) \\
h_t &= o_t \cdot \tanh(c_t)
\end{aligned}
$$

where $\cdot$ is elementwise multiplication. The cell state $c_t$ carries the long-term memory, while the hidden state $h_t$ conveys the output at the end of each step.
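
To make the update rule concrete, here is a minimal NumPy sketch of a single LSTM cell step; the stacked weight layout, shapes, and function names are illustrative assumptions, not the project's actual implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell update following the equations above.

    Assumes W has shape (4H, H + D) and b has shape (4H,), where H is the
    hidden size and D the input size; the gate pre-activations are stacked
    in the order forget, input, output, candidate (an illustrative choice).
    """
    z = W @ np.concatenate([h_prev, x_t]) + b   # stacked gate pre-activations
    f, i, o, g = np.split(z, 4)
    f_t = sigmoid(f)                            # forget gate
    i_t = sigmoid(i)                            # input gate
    o_t = sigmoid(o)                            # output gate
    c_tilde = np.tanh(g)                        # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde          # long-term memory update
    h_t = o_t * np.tanh(c_t)                    # hidden state / output
    return h_t, c_t
```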
For an **LSTM Autoencoder**:
- **Encoder**: receives every window $X \in \mathbb{R}^{W \times d}$ (with $d = 1$ feature per timestep) and compresses it into a **latent (lower-dimensional) vector** (a minimal model sketch follows this list)
- **Decoder**: reconstructs the sequence from the latent representation.
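
A minimal Keras sketch of this architecture, assuming a window length `W = 50`, `d = 1` feature, and a 16-dimensional latent vector (these hyperparameters are illustrative, not the project's actual settings):

```python
from tensorflow import keras
from tensorflow.keras import layers

W, d, latent_dim = 50, 1, 16  # assumed window length, features per timestep, latent size

model = keras.Sequential([
    layers.Input(shape=(W, d)),
    # Encoder: compress the whole window into a single latent vector
    layers.LSTM(latent_dim),
    # Decoder: repeat the latent vector W times and unroll it back into a sequence
    layers.RepeatVector(W),
    layers.LSTM(latent_dim, return_sequences=True),
    layers.TimeDistributed(layers.Dense(d)),  # one reconstructed value per timestep
])
model.compile(optimizer="adam", loss="mse")
```

The `RepeatVector`/`TimeDistributed` pattern is one common way to build a sequence-to-sequence autoencoder; an encoder/decoder pair with explicit state passing works equally well.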
During training on normal data only, the LSTM autoencoder learns to reconstruct normal dynamics; anomalous windows deviate from the learned patterns and therefore yield a higher reconstruction error $\left\| X - \hat{X}\right\|^2$.
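
Continuing the hypothetical `model` from the sketch above, anomaly scoring then reduces to thresholding the per-window reconstruction error (the array names and the 99th-percentile cutoff are assumptions for illustration):

```python
import numpy as np

# X_train, X_test: arrays of shape (n_windows, W, d); names are illustrative
model.fit(X_train, X_train, epochs=50, batch_size=32)    # train on normal windows only

train_errors = np.mean((X_train - model.predict(X_train)) ** 2, axis=(1, 2))
threshold = np.quantile(train_errors, 0.99)               # assumed cutoff on normal errors

test_errors = np.mean((X_test - model.predict(X_test)) ** 2, axis=(1, 2))
anomalous = test_errors > threshold                        # True where reconstruction fails
```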