**Context:** 8k-32k+ CoT responses require robust memory management.

**Requirements:**
- [ ] Optimize the online-softmax logp kernel for long sequences.
- [ ] Implement proactive workspace sizing and strict memory-safety boundary checks to prevent OOM/segfaults.
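
For reference, the online-softmax idea behind the logp kernel can be sketched in plain Python. This is a minimal illustration, not the project's kernel: the function name `online_logp`, the `chunk` parameter, and the assumption that all logits are finite are hypothetical choices made for this example. It streams over the logits in fixed-size blocks, keeping only a running maximum and a rescaled running sum, so the extra state stays constant regardless of how long the input is. The explicit argument checks stand in for the boundary checks the second requirement asks for.

```python
import math
from typing import Sequence


def online_logp(logits: Sequence[float], target: int, chunk: int = 4) -> float:
    """Return log softmax(logits)[target] using a single streaming pass.

    Hypothetical helper for illustration; assumes every logit is finite.
    """
    # Boundary checks: fail loudly instead of reading out of range.
    if len(logits) == 0:
        raise ValueError("logits must be non-empty")
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    if not 0 <= target < len(logits):
        raise IndexError("target index out of range")

    running_max = -math.inf  # largest logit seen so far
    running_sum = 0.0        # sum of exp(x - running_max) over logits seen so far

    for start in range(0, len(logits), chunk):
        block = logits[start:start + chunk]
        new_max = max(running_max, max(block))
        # Rescale the old sum to the new maximum, then add this block's terms.
        running_sum = running_sum * math.exp(running_max - new_max) + sum(
            math.exp(x - new_max) for x in block
        )
        running_max = new_max

    log_normalizer = running_max + math.log(running_sum)
    return logits[target] - log_normalizer
```

Because each block only rescales the running sum, the result matches a two-pass log-softmax up to floating-point rounding while never materializing the full vector of exponentials.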