Releases: stephantul/skeletoken
0.3.3
What's Changed
- feat: add normalizer and pretokenizer instantiation by @stephantul in #73
- fix: bug with remapped added token ids by @stephantul in #74
- chore: up version by @stephantul in #75
Full Changelog: 0.3.1...0.3.3
0.3.2
What's Changed
- feat: add normalizer and pretokenizer instantiation by @stephantul in #73
Full Changelog: 0.3.1...0.3.2
0.3.1
What's Changed
- feat: add citation information by @stephantul in #61
- fix: bug with preprocessor by @stephantul in #62
- feat: add test for copy by @stephantul in #63
- feat: add flag for adding tokens by @stephantul in #64
- fix: monitor flaky test by @stephantul in #65
- fix: forgot to pass argument by @stephantul in #66
- fix: revert token addition, fix bugs by @stephantul in #67
- feat: add batch merges for speedups by @stephantul in #68
- fix: Remove token mapper by @stephantul in #69
- fix: batch merging included incorrect early exit by @stephantul in #70
- fix: types, token removal with unk and pad by @stephantul in #71
Full Changelog: 0.3.0...0.3.1
0.3.0
What's Changed
- Fix merges when decasing by @stephantul in #37
- refactor decasing to be more robust by @stephantul in #38
- fix: no longer use casefold by @stephantul in #39
- feat: add pylate and sentence-transformers integrations by @stephantul in #40
- feat: add unk and pad token id by @stephantul in #41
- feat: add original class when initializing from a transformers tokenizer by @stephantul in #42
- add tutorials by @stephantul in #43
- feat: fix padding token post init by @stephantul in #44
- add additional tutorial by @stephantul in #45
- add README example by @stephantul in #46
- Type check by @stephantul in #47
- feat: add make commands and ty by @stephantul in #48
- Add docs for conversion by @stephantul in #49
- make everything functional by @stephantul in #50
- add in-place functions for normalization by @stephantul in #51
- fix logging message by @stephantul in #52
- Typing fixes by @stephantul in #53
- feat: Add token prep by @stephantul in #54
- add id for bos/eos by @stephantul in #55
- feat: add addedtokens helper function by @stephantul in #56
- fix: addedtokens bug in decasing by @stephantul in #57
- feat: add prefix space property by @stephantul in #58
- fix: crash when tokenizer is empty by @stephantul in #59
Full Changelog: 0.2.2...0.3.0
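Note on #39 (casefold): the release notes do not say why casefold was dropped. As general background, Python's `str.casefold` is more aggressive than `str.lower` and can change the characters and length of a string, which may matter when lowercased tokens have to line up with existing vocabulary entries. The sketch below uses only the Python standard library; the `compare` helper is hypothetical and is not part of skeletoken.

```python
# Illustration only: plain Python, not skeletoken's API.
# `compare` is a hypothetical helper introduced for this example.


def compare(text: str) -> tuple[str, str]:
    """Return the (lower, casefold) forms of `text`."""
    return text.lower(), text.casefold()


for sample in ["HELLO", "Straße"]:
    lowered, folded = compare(sample)
    # For plain ASCII both forms agree; for "ß" casefold expands to "ss".
    print(f"{sample!r}: lower={lowered!r} casefold={folded!r} equal={lowered == folded}")
```

For ASCII input the two forms are identical. They diverge for characters such as the German sharp s, where `casefold` rewrites the character while `lower` leaves it alone.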
0.2.2
What's Changed
- feat: add top-level vocabulary properties by @stephantul in #28
- feat: add preprocessor by @stephantul in #29
- feat: add helper methods for preprocessor by @stephantul in #30
- add preprocessor by @stephantul in #31
- add preprocessor by @stephantul in #33
- feat: add batch removal method by @stephantul in #32
- feat: add trainer by @stephantul in #34
- add tests for trainer by @stephantul in #35
- bump version by @stephantul in #36
Full Changelog: 0.2.1...0.2.2
0.2.1
What's Changed
- feat: Add model delta by @stephantul in #20
- feat: make pattern creation nicer by @stephantul in #21
- feat: make greedy tokenizer better by @stephantul in #23
- feat: constrain template post processor to be more precise by @stephantul in #24
- feat: add tokens to ids and ids to tokens roundtrip by @stephantul in #25
- feat: add transformers saving & loading by @stephantul in #26
- fix: remove merges when removing tokens by @stephantul in #27
Full Changelog: 0.2.0...0.2.1
0.2.0
What's Changed
- feat: new helpers by @stephantul in #1
- Add docstrings by @stephantul in #2
- Characteristics by @stephantul in #3
- Rename by @stephantul in #4
- fix: more docstrings by @stephantul in #5
- Vocab mutation by @stephantul in #6
- fix: mutable returning the model itself by @stephantul in #7
- fix mutable returning the model itself by @stephantul in #8
- feat: add github action by @stephantul in #9
- feat: add eos bos detection by @stephantul in #10
- add prefix and suffix token markers by @stephantul in #11
- feat: remove collisions by @stephantul in #12
- add truncation and padding by @stephantul in #13
- feat: Unk pad token setter by @stephantul in #15
- add docs, add merges, add tests by @stephantul in #16
- Docs by @stephantul in #17
- add transformers integration by @stephantul in #18
- fix: template issues by @stephantul in #19
New Contributors
- @stephantul made their first contribution in #1
Full Changelog: https://github.com/stephantul/skeletoken/commits/0.2.0