Commit 471a5f4
feat: text-prefix matching — bypass BPE re-tokenization in chat reuse (#50)
Follow-up to PR #49. The token-level LCP path in tq_generate_continue
has a fundamental limitation: model-generated tokens (sample_topp) and
text-encoded tokens (tq_encode of the response in the next turn) can
diverge due to BPE merge non-roundtripping. This caps per-turn LCP at
the prompt boundary (~10 tokens), so longer histories still incur
mostly-full reprefill.
Fix: tq_generate_chat_text() — text-level prefix matching.
How it works:
1. Each session stores the entire prompt+response text from the
previous call (cached_text).
2. On a new request, check if the new prompt starts with cached_text
byte-for-byte. If yes, the cached state is byte-equivalent valid.
3. Tokenize ONLY the suffix (new_prompt[strlen(cached_text):]) and
prefill those tokens at positions [n_cached..n_cached + n_suffix).
4. Run generation. The accumulated output text gets appended to
cached_text via a tee callback for the next call.
5. If text prefix doesn't match, fall back to tq_generate_continue
(token LCP path).
Bug fix bundled: json_find_key("user") was matching the value in
{"role":"user"} instead of the top-level "user" key. Result: every
request used the "default" session, so multi-session was effectively
broken (cross-pollution). The fix scans for "key": (with colon) to
disambiguate from value matches.
Measured (SmolLM2-135M, single thread, real chat replay):
Single user, 10-turn accumulation:
PR #48 (token LCP only): turn 10 → 3700 ms
PR #49 (above + multi-session): turn 10 → 3700 ms (LCP still capped)
This PR (text-prefix path): turn 10 → 739 ms (5x)
alice + bob interleaved, 5 turns each (real assistant replay):
PR #49: alice 5 = 2412 ms, bob 5 = 2357 ms
Now: alice 5 = 498 ms, bob 5 = 462 ms (5x)
The growth that remains (~50ms/turn) is the unavoidable O(n) cost of
the attention computation over the full context — KV prefill is now
truly O(new tokens per turn), not O(full history per turn).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>1 parent 361751d commit 471a5f4
2 files changed
Lines changed: 295 additions & 19 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
802 | 802 | | |
803 | 803 | | |
804 | 804 | | |
| 805 | + | |
| 806 | + | |
| 807 | + | |
| 808 | + | |
| 809 | + | |
| 810 | + | |
| 811 | + | |
| 812 | + | |
| 813 | + | |
| 814 | + | |
| 815 | + | |
| 816 | + | |
| 817 | + | |
| 818 | + | |
| 819 | + | |
| 820 | + | |
| 821 | + | |
| 822 | + | |
| 823 | + | |
| 824 | + | |
| 825 | + | |
| 826 | + | |
| 827 | + | |
| 828 | + | |
| 829 | + | |
| 830 | + | |
| 831 | + | |
| 832 | + | |
| 833 | + | |
| 834 | + | |
| 835 | + | |
| 836 | + | |
| 837 | + | |
| 838 | + | |
| 839 | + | |
| 840 | + | |
| 841 | + | |
| 842 | + | |
| 843 | + | |
| 844 | + | |
| 845 | + | |
| 846 | + | |
| 847 | + | |
| 848 | + | |
| 849 | + | |
| 850 | + | |
| 851 | + | |
| 852 | + | |
| 853 | + | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| 858 | + | |
| 859 | + | |
| 860 | + | |
| 861 | + | |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
| 867 | + | |
| 868 | + | |
| 869 | + | |
| 870 | + | |
| 871 | + | |
| 872 | + | |
| 873 | + | |
| 874 | + | |
| 875 | + | |
| 876 | + | |
| 877 | + | |
| 878 | + | |
| 879 | + | |
| 880 | + | |
| 881 | + | |
| 882 | + | |
| 883 | + | |
| 884 | + | |
| 885 | + | |
| 886 | + | |
| 887 | + | |
| 888 | + | |
| 889 | + | |
| 890 | + | |
| 891 | + | |
| 892 | + | |
| 893 | + | |
| 894 | + | |
| 895 | + | |
| 896 | + | |
| 897 | + | |
| 898 | + | |
| 899 | + | |
| 900 | + | |
| 901 | + | |
| 902 | + | |
| 903 | + | |
| 904 | + | |
| 905 | + | |
| 906 | + | |
| 907 | + | |
| 908 | + | |
| 909 | + | |
| 910 | + | |
| 911 | + | |
| 912 | + | |
| 913 | + | |
| 914 | + | |
| 915 | + | |
| 916 | + | |
| 917 | + | |
| 918 | + | |
| 919 | + | |
| 920 | + | |
| 921 | + | |
| 922 | + | |
| 923 | + | |
| 924 | + | |
| 925 | + | |
| 926 | + | |
| 927 | + | |
| 928 | + | |
| 929 | + | |
| 930 | + | |
| 931 | + | |
| 932 | + | |
| 933 | + | |
| 934 | + | |
| 935 | + | |
| 936 | + | |
| 937 | + | |
| 938 | + | |
| 939 | + | |
| 940 | + | |
| 941 | + | |
| 942 | + | |
| 943 | + | |
| 944 | + | |
| 945 | + | |
| 946 | + | |
| 947 | + | |
| 948 | + | |
| 949 | + | |
| 950 | + | |
| 951 | + | |
| 952 | + | |
| 953 | + | |
| 954 | + | |
| 955 | + | |
| 956 | + | |
| 957 | + | |
| 958 | + | |
| 959 | + | |
| 960 | + | |
| 961 | + | |
| 962 | + | |
| 963 | + | |
| 964 | + | |
| 965 | + | |
| 966 | + | |
| 967 | + | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
| 976 | + | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
| 980 | + | |
| 981 | + | |
| 982 | + | |
| 983 | + | |
| 984 | + | |
| 985 | + | |
| 986 | + | |
| 987 | + | |
| 988 | + | |
| 989 | + | |
| 990 | + | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
| 994 | + | |
| 995 | + | |
| 996 | + | |
| 997 | + | |
| 998 | + | |
| 999 | + | |
| 1000 | + | |
| 1001 | + | |
| 1002 | + | |
| 1003 | + | |
| 1004 | + | |
| 1005 | + | |
| 1006 | + | |
| 1007 | + | |
| 1008 | + | |
| 1009 | + | |
| 1010 | + | |
| 1011 | + | |
| 1012 | + | |
| 1013 | + | |
| 1014 | + | |
| 1015 | + | |
| 1016 | + | |
| 1017 | + | |
| 1018 | + | |
| 1019 | + | |
| 1020 | + | |
| 1021 | + | |
| 1022 | + | |
| 1023 | + | |
| 1024 | + | |
| 1025 | + | |
| 1026 | + | |
| 1027 | + | |
| 1028 | + | |
| 1029 | + | |
| 1030 | + | |
| 1031 | + | |
| 1032 | + | |
| 1033 | + | |
| 1034 | + | |
| 1035 | + | |
| 1036 | + | |
| 1037 | + | |
| 1038 | + | |
| 1039 | + | |
| 1040 | + | |
| 1041 | + | |
| 1042 | + | |
| 1043 | + | |
| 1044 | + | |
| 1045 | + | |
| 1046 | + | |
| 1047 | + | |
| 1048 | + | |
| 1049 | + | |
| 1050 | + | |
| 1051 | + | |
| 1052 | + | |
| 1053 | + | |
| 1054 | + | |
| 1055 | + | |
| 1056 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
22 | | - | |
23 | | - | |
| 22 | + | |
| 23 | + | |
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
| |||
30 | 30 | | |
31 | 31 | | |
32 | 32 | | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
33 | 45 | | |
34 | 46 | | |
35 | 47 | | |
| |||
95 | 107 | | |
96 | 108 | | |
97 | 109 | | |
| 110 | + | |
98 | 111 | | |
99 | 112 | | |
100 | 113 | | |
| |||
144 | 157 | | |
145 | 158 | | |
146 | 159 | | |
| 160 | + | |
147 | 161 | | |
148 | 162 | | |
149 | 163 | | |
| |||
237 | 251 | | |
238 | 252 | | |
239 | 253 | | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
240 | 258 | | |
241 | 259 | | |
242 | | - | |
243 | | - | |
244 | | - | |
245 | | - | |
246 | | - | |
247 | | - | |
248 | | - | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
249 | 270 | | |
250 | 271 | | |
251 | 272 | | |
| |||
758 | 779 | | |
759 | 780 | | |
760 | 781 | | |
761 | | - | |
762 | | - | |
763 | | - | |
764 | | - | |
765 | | - | |
| 782 | + | |
| 783 | + | |
| 784 | + | |
| 785 | + | |
| 786 | + | |
| 787 | + | |
766 | 788 | | |
767 | 789 | | |
768 | 790 | | |
| |||
795 | 817 | | |
796 | 818 | | |
797 | 819 | | |
798 | | - | |
799 | | - | |
800 | | - | |
801 | | - | |
802 | | - | |
| 820 | + | |
| 821 | + | |
| 822 | + | |
| 823 | + | |
| 824 | + | |
| 825 | + | |
803 | 826 | | |
804 | 827 | | |
805 | 828 | | |
| |||
1260 | 1283 | | |
1261 | 1284 | | |
1262 | 1285 | | |
| 1286 | + | |
1263 | 1287 | | |
1264 | 1288 | | |
1265 | 1289 | | |
| |||
0 commit comments