Age | Commit message | Author |
---|---|---|
2024-03-01 | unicode : switch to multimap based nfd_map (#5799) | Douglas Hanley |
2024-02-28 | llama : improve BERT tokenization (#5740) | Douglas Hanley |
2024-02-26 | unicode : reuse iterator (#5726) | Georgi Gerganov |
2024-02-13 | tests : multi-thread the tokenizer tests (#5474) | Georgi Gerganov |
2024-01-21 | add `#include <string>` to unicode.h (#5051) | bobqianic |
2023-10-03 | Work on the BPE tokenizer (#3252) | goerch |