Read full report
Typing speed has traditionally been estimated with simple heuristic models. This project replaces them with machine learning models trained on ≈4.8M keystrokes of typing data to predict Inter-Keystroke Intervals (IKI). Multi-Layer Perceptron and LightGBM models achieve a mean absolute error (MAE) of 0.57 on a QWERTY validation set. These models generalize to the Dvorak layout with an MAE of 0.59 despite not being trained on it, the first zero-shot evaluation of a typing-speed model on a different layout. The models are used to evaluate four typing system optimizations. A one-shot shift key (tapped before a letter to capitalize it) yields at least a 0.71% gain, and a repeat key (tapped after a key to repeat it) a 0.26% gain, each displacing the semicolon. Optimized auto-expanding abbreviation dictionaries increase speed by up to 13% for a 160-entry dictionary, assuming a typist recalls each mapping instantly. The model indicates a 0.7% speed benefit for Dvorak over QWERTY. While the LightGBM model exhibits no statistically significant change in systematic error (bias shift) between layouts, the lack of data for layouts other than QWERTY and Dvorak limits the ability to evaluate arbitrary keyboard layouts. More non-QWERTY typing data would most improve the models.
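The metrics quoted above can be illustrated with a minimal sketch. It is not the project's evaluation code. The function names are hypothetical, the IKI unit is left unspecified because the summary does not state it, and the speed-gain formula (time saved on the same text, expressed as a speed ratio) is an assumption about how the percentages are defined.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and measured IKIs."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def mean_bias(predicted, actual):
    """Average signed error; a change in this value between layouts
    is one way to express a systematic error (bias shift)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p - a for p, a in zip(predicted, actual)) / len(predicted)


def speed_gain_percent(baseline_time, optimized_time):
    """Percent speed gain for typing the same text in less total time.

    Assumption: speed is text length divided by total predicted time,
    so the gain is baseline_time / optimized_time - 1.
    """
    if baseline_time <= 0 or optimized_time <= 0:
        raise ValueError("times must be positive")
    return (baseline_time / optimized_time - 1.0) * 100.0


# Toy values for illustration only, not data from the project.
predicted_iki = [1.0, 2.0, 3.0]
measured_iki = [1.5, 2.0, 2.0]
mae = mean_absolute_error(predicted_iki, measured_iki)
bias = mean_bias(predicted_iki, measured_iki)
gain = speed_gain_percent(113.0, 100.0)
```

MAE measures the typical size of the error, while the signed bias shows whether the model systematically over- or under-predicts. That is why a model can have similar MAE on two layouts and still need a separate check for bias shift.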
See HOWTORUN.md