Commit 0b158fc

committed
wrote LSTM NN and AE section, proceeding to MSE loss and Physics-informed loss section
1 parent 52f6a45 commit 0b158fc

2 files changed (+33 −9 lines)


index.md

Lines changed: 33 additions & 1 deletion
@@ -78,11 +78,43 @@ The plot below shows the displacement $x(t)$ with respect to $t$ (already contai
## LSTM Neural Network and Autoencoder

Long Short-Term Memory (LSTM) networks belong to the family of recurrent neural network (RNN) architectures and are well suited to **capture long-range dependencies in sequential data**. The main limitation of plain RNNs is their tendency to forget older context due to the vanishing gradient problem; LSTMs address this with **gating mechanisms** that control information flow, allowing them to remember patterns over hundreds of timesteps. This makes them a good fit for anomaly detection on time-series data such as dynamical system measurements, where past behaviour influences future behaviour.

At timestep $t$, the LSTM updates its **hidden state $h_t$** and **cell state $c_t$** using the **current input $x_t$** and the **previous states $h_{t-1}, c_{t-1}$**. The main components of an LSTM are:

- **Forget gate $f_t$**: governs how much of the previous cell state to retain
- **Input gate $i_t$ and candidate state $\tilde{c}_t$**: decide how much new information to add
- **Output gate $o_t$**: controls which parts of the cell state influence the hidden state

Using sigmoid $\sigma$ and $\tanh$ activations, with learnable weights $W$ and biases $b$, the canonical equations of an LSTM are:

$$
\begin{align*}
f_t &= \sigma(W_f[h_{t-1}, x_t] + b_f), && \text{(forget gate)} \\
i_t &= \sigma(W_i[h_{t-1}, x_t] + b_i), && \text{(input gate)} \\
\tilde{c}_t &= \tanh(W_c[h_{t-1}, x_t] + b_c), && \text{(candidate state)} \\
c_t &= f_t \cdot c_{t-1} + i_t \cdot \tilde{c}_t, && \text{(cell update)} \\
o_t &= \sigma(W_o[h_{t-1}, x_t] + b_o), && \text{(output gate)} \\
h_t &= o_t \cdot \tanh(c_t), && \text{(hidden state)}
\end{align*}
$$

where $\cdot$ denotes elementwise multiplication. The cell state $c_t$ carries the long-term memory, while the hidden state $h_t$ serves as the output at each step.
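
As a concrete illustration, below is a minimal NumPy sketch of a single LSTM step that mirrors these equations one-to-one. The function name `lstm_step` and the dictionary layout for `W` and `b` are illustrative choices, not part of the project; real frameworks fuse the four gate multiplications into one matrix product for efficiency.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # W: dict of weight matrices of shape (hidden_dim, hidden_dim + input_dim),
    # b: dict of bias vectors of shape (hidden_dim,), keyed by 'f', 'i', 'c', 'o'.
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f_t = sigmoid(W['f'] @ z + b['f'])       # forget gate
    i_t = sigmoid(W['i'] @ z + b['i'])       # input gate
    c_tilde = np.tanh(W['c'] @ z + b['c'])   # candidate state
    c_t = f_t * c_prev + i_t * c_tilde       # cell update
    o_t = sigmoid(W['o'] @ z + b['o'])       # output gate
    h_t = o_t * np.tanh(c_t)                 # hidden state
    return h_t, c_t
```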
For an **LSTM Autoencoder**:
- **Encoder**: takes each window $X \in \mathbb{R}^{W \times d}$ (with $d = 1$ feature per timestep) and compresses it into a **latent (lower-dimensional) vector**.
- **Decoder**: reconstructs the sequence from the latent representation.

During training on normal data, the LSTM autoencoder learns to reconstruct normal dynamics; anomalous windows deviate from the learned patterns and therefore yield a higher reconstruction error $\left\| X - \hat{X} \right\|^2$.
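
A minimal sketch of how such an LSTM autoencoder could be wired up is shown below, assuming PyTorch; the class name `LSTMAutoencoder`, the single-layer encoder/decoder and the `latent_dim=16` choice are illustrative assumptions, not the project's actual implementation.

```python
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    """Sketch: encodes and reconstructs windows of shape (batch, W, d)."""

    def __init__(self, n_features=1, latent_dim=16):
        super().__init__()
        self.encoder = nn.LSTM(n_features, latent_dim, batch_first=True)
        self.decoder = nn.LSTM(latent_dim, latent_dim, batch_first=True)
        self.output_layer = nn.Linear(latent_dim, n_features)

    def forward(self, x):
        # Encode: keep the final hidden state as the latent vector.
        _, (h_n, _) = self.encoder(x)                  # h_n: (1, batch, latent_dim)
        z = h_n[-1]                                    # (batch, latent_dim)
        # Decode: repeat the latent vector across the window length W.
        z_seq = z.unsqueeze(1).repeat(1, x.size(1), 1)
        dec_out, _ = self.decoder(z_seq)
        return self.output_layer(dec_out)              # reconstruction X_hat

# Per-window reconstruction error, usable as an anomaly score:
# score = ((x - model(x)) ** 2).mean(dim=(1, 2))
```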

## MSE loss vs. Physics-Informed loss
....

## Implementation
...

## Results
...
## Future work
...

readme.md

Lines changed: 0 additions & 8 deletions
This file was deleted.
