The Token Company (YC W26) offers "bear-1", a compression model that preprocesses LLM inputs to cut costs, reduce latency, and fit more context into prompts while improving model performance and accuracy. Compression is useful for teams building LLM-powered products who want to maintain a large context, reduce inference costs, and improve model performance. In benchmarks, bear-1 reduced input tokens by 66% with a +1.1% improvement in accuracy by removing the least significant tokens from prompts. The Token Company offers the model via an API: the prompt is first passed to the compression API, and the compressed result is then given to an LLM of choice. This process enables significant token savings alongside improved context handling and model understanding.
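The two-step flow described above (compress first, then call the LLM) can be sketched as below. This is a minimal illustration only: the bear-1 API's actual endpoints, payloads, and client library are not described here, so the `compress` and `llm` callables are hypothetical stand-ins, and the stub compressor is a toy that drops filler words rather than a model-based compressor.

```python
from typing import Callable

def run_pipeline(prompt: str,
                 compress: Callable[[str], str],
                 llm: Callable[[str], str]) -> str:
    """Compress the prompt first, then send the compressed text to the LLM."""
    compressed = compress(prompt)
    return llm(compressed)

# Toy stand-in for the compression API: drops common filler words.
# A real compressor (like bear-1) would rank and remove the least
# significant tokens instead.
FILLER = {"the", "a", "an", "of", "to", "in"}

def stub_compress(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w.lower() not in FILLER)

# Toy stand-in for an LLM call.
def stub_llm(prompt: str) -> str:
    return f"answer based on: {prompt}"

if __name__ == "__main__":
    result = run_pipeline("Summarize the history of the company",
                          stub_compress, stub_llm)
    print(result)  # the compressed prompt reaches the stub LLM
```

With the quoted 66% reduction, a 10,000-token prompt would shrink to roughly 3,400 tokens before it reaches the downstream model, which is where the cost and latency savings come from.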