path: root/unicode.h
Age         Commit message                                      Author
2024-03-01  unicode : switch to multimap based nfd_map (#5799)  Douglas Hanley
2024-02-28  llama : improve BERT tokenization (#5740)           Douglas Hanley
2024-02-26  unicode : reuse iterator (#5726)                    Georgi Gerganov
2024-02-13  tests : multi-thread the tokenizer tests (#5474)    Georgi Gerganov
2024-01-21  add `#include <string>` to unicode.h (#5051)        bobqianic
2023-10-03  Work on the BPE tokenizer (#3252)                   goerch