According to monitoring by 1M AI News, the open-source vector database Chroma has released Context-1, a 20-billion-parameter agentic search model specifically designed for multi-turn retrieval tasks. The model weights are open-sourced under the Apache 2.0 license, and the synthetic data generation pipeline code has been released simultaneously.
Context-1 is positioned as a retrieval subagent: it does not answer questions directly, but instead performs multi-round search and returns a set of supporting documents to a downstream inference model. Its core technique is "self-editing context": during search, the model actively discards irrelevant document snippets, freeing space in the limited context window for subsequent searches and avoiding the performance degradation caused by context bloat.
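The loop above can be sketched as follows. This is an illustrative approximation, not Chroma's implementation: the `search` and `score_relevance` callables stand in for the model's tool calls and relevance judgments, and the character-based budget is a placeholder for a real token budget.

```python
# Illustrative sketch of a "self-editing context" retrieval loop.
# All names (search, score_relevance) and the budget are hypothetical;
# Context-1 makes these pruning decisions natively during generation.

def self_editing_search(query, search, score_relevance,
                        max_rounds=4, context_budget=8000):
    """Run multi-round retrieval, discarding low-relevance snippets
    after each round so the working context never overflows."""
    context = []  # (snippet, relevance score) pairs kept so far
    for round_idx in range(max_rounds):
        for snippet in search(query, round_idx):
            context.append((snippet, score_relevance(query, snippet)))
        # Self-edit: keep the most relevant snippets, drop the rest
        # until the retained context fits the budget again.
        context.sort(key=lambda pair: pair[1], reverse=True)
        while sum(len(s) for s, _ in context) > context_budget:
            context.pop()  # discard the least relevant snippet
    # Return only the supporting documents for the downstream answerer.
    return [snippet for snippet, _ in context]
```

The key point is that pruning happens inside the search loop, so every new round starts with a compacted context rather than an ever-growing one.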
Training proceeds in two stages: first, large models such as Kimi K2.5 generate SFT trajectories for a supervised fine-tuning warm-up; then the model is trained with reinforcement learning (using the CISPO algorithm) on more than 8,000 synthetic tasks. The reward design follows a curriculum: early training weights recall heavily to encourage broad exploration, then gradually shifts toward precision, encouraging the model to selectively retain relevant content. The base model is gpt-oss-20b, adapted with LoRA; at inference time it runs on a B200 GPU with MXFP4 quantization, achieving a throughput of 400–500 tokens/s.
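The recall-to-precision curriculum can be sketched as an annealed weighting of the two metrics. The linear schedule below is an assumption for illustration, not Chroma's published reward function:

```python
# Sketch of a recall-to-precision curriculum reward, per the article:
# early RL training rewards broad recall, later training rewards
# precision. The linear annealing schedule is an illustrative
# assumption, not the actual reward used to train Context-1.

def curriculum_reward(retrieved, relevant, progress):
    """Score one retrieval episode.

    progress: fraction of RL training completed, in [0, 1].
    Early (progress ~ 0) the reward is dominated by recall;
    late (progress ~ 1) it is dominated by precision.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    w = progress  # anneal weight from recall toward precision
    return (1 - w) * recall + w * precision
```

Under this schedule, a policy that retrieves everything scores well early (perfect recall) but poorly late, pushing it toward retaining only the relevant documents.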
On Chroma’s four in-house domain benchmarks (webpages, finance, law, email) and on public benchmarks (BrowseComp-Plus, SealQA, FRAMES, HotpotQA), Context-1’s four-way parallel version matches or approaches state-of-the-art models such as GPT-5.2, Opus 4.5, and Sonnet 4.5 on the "final answer hit rate" metric, for example reaching 0.96 on BrowseComp-Plus versus 0.87 for Opus 4.5 and 0.82 for GPT-5.2, while its cost and latency are only a fraction of those models'. Notably, the model was trained only on web, legal, and financial data, yet it also shows significant improvement in the email domain, which was held out of training, demonstrating that its search capability transfers across domains.