
Harness-1, a 20B-parameter retrieval subagent, is trained with reinforcement learning on top of gpt-oss-20b. The model operates inside a stateful search harness that manages evidence tracking and claim verification. Researchers from the University of Illinois Urbana-Champaign, UC Berkeley, and Chroma developed it to separate search decisions from bookkeeping, an approach that reduces optimization conflicts in search agent training.
Summary by ByteBrief