Decoder Model

- [ ] Implement `compute_recall`
  - Currently this requires a separate forward pass of the model. We should be able to combine it with `evaluation_loss`.
  - `model.generate` isn't working at the moment.
- [x] Flash-attention implementation
- [ ] Improve logging format: `log_predictions`
  - Can we make this interactive via [Gradio](https://www.gradio.app/)?
- [ ] Plot learning rate
- [ ] Early stopping
- [ ] Two 'yes' tokens
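
For the early stopping item, a minimal framework-agnostic sketch might look like the following. It assumes we stop on the evaluation loss and that lower is better. The names `EarlyStopping`, `patience`, `min_delta`, and `stopping_index` are hypothetical and not taken from the existing code.

```python
class EarlyStopping:
    """Signal a stop once the eval loss has not improved for `patience` evaluations.

    Hypothetical helper; assumes a lower evaluation loss is better.
    """

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # evaluations to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_evals = 0

    def step(self, eval_loss):
        """Record one evaluation result; return True if training should stop."""
        if eval_loss < self.best_loss - self.min_delta:
            self.best_loss = eval_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


def stopping_index(losses, patience=3, min_delta=0.0):
    """Return the index of the evaluation at which training would stop, or None."""
    stopper = EarlyStopping(patience=patience, min_delta=min_delta)
    for i, loss in enumerate(losses):
        if stopper.step(loss):
            return i
    return None
```

In the training loop, `step` would be called once per evaluation with the evaluation loss, breaking out of the loop when it returns `True`. Saving a checkpoint whenever `best_loss` improves would let us restore the best model afterwards.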

Decoder Model #9
