
Suggested functionality: Estimate by model_type #1

@Somerandomguy10111

Description


First off: great tool, it saved me the headache of trying to trace the functions' token usage myself.
A final touch could be to introduce an option for a token estimator class (tokenizer class?) which takes the model type as an attribute and then uses the tiktoken.encoding_for_model() function to retrieve the matching encoding.

That way, if OpenAI ever changes the encoding or uses a different encoding for newer models, the package can stay up to date.
On a side note, I think the following functions would also be useful, e.g. to prevent logging huge inputs to the model:

def get_string_tokens(self, the_str: str) -> int:
    return len(self.encode(the_str))


def get_limited_string(self, the_str: str, max_tokens: int) -> str:
    encoded_str = self.encode(the_str)
    return self.decode(encoded_str[:max_tokens])

Best
Somerandomguy10111
