The `set_tokenizer` API seems a bit suspect here, given that it can be replaced with
`const tokenize = WordTokenizers.nltk_tokenize`
and likewise for RevTok etc., without bringing in multiple packages just to define an alias :)
I also think it's generally a good idea to expose people to higher-order functions and the like; people might not realise that you can just pass a custom tokenize function into a constructor, for example, rather than setting and unsetting it globally.
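To make that concrete, here is a rough sketch of what I mean by passing the tokenizer in. `Corpus`, `tokens`, and `lower_tokenize` are hypothetical names made up for illustration, not anything from this package or from WordTokenizers, and the default of `split` is only a stand-in.

```julia
# Hypothetical consumer type: it stores whatever tokenizer function it is given.
struct Corpus{F}
    tokenize::F            # any callable mapping a string to a vector of tokens
    docs::Vector{String}
end

# Default to a plain whitespace split; callers can override it per instance.
Corpus(docs::Vector{String}; tokenize = split) = Corpus(tokenize, docs)

# Tokenize every document with the tokenizer held by this particular corpus.
tokens(c::Corpus) = [c.tokenize(d) for d in c.docs]

# A custom tokenizer is just a function; no global state is set or unset.
lower_tokenize(s) = split(lowercase(s))

docs = ["Hello World", "Pass Functions Around"]
default_corpus = Corpus(docs)
custom_corpus = Corpus(docs; tokenize = lower_tokenize)
```

Two corpora with different tokenizers can then coexist, which a global setter can't give you. Presumably a tokenizer from WordTokenizers or RevTok could be passed as the `tokenize` keyword in the same way.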