An engineer built a dataset Q&A system and found that increasing the context window from 4k to 128k tokens produced longer, more detailed, and more confidently wrong answers. The fix was to route computation queries away from RAG entirely. The evaluation measured performance across 7 query types and 5 context sizes on 100,000 rows.
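
To make the routing idea concrete, here is a minimal sketch of sending computation queries to exact code instead of RAG. It is not the engineer's implementation: the keyword pattern, the `route` and `average_column` helpers, and the sample rows are all hypothetical, and a real system would likely use a more robust classifier than a regular expression.

```python
import re
from statistics import mean

# Hypothetical keyword list; the source does not say how queries were classified.
COMPUTE_PATTERN = re.compile(
    r"\b(average|mean|sum|total|count|how many|max|maximum|min|minimum|median)\b",
    re.IGNORECASE,
)


def route(query: str) -> str:
    """Return "compute" for aggregate-style questions, otherwise "rag"."""
    return "compute" if COMPUTE_PATTERN.search(query) else "rag"


def average_column(rows, column):
    """Exact aggregation over every row; no retrieval or context window involved."""
    return mean(row[column] for row in rows)


# Illustrative data only, standing in for the full dataset.
rows = [{"price": 10.0}, {"price": 20.0}, {"price": 60.0}]

if route("What is the average price?") == "compute":
    answer = average_column(rows, "price")
```

The point of the design is that an aggregate over the whole table is computed from all rows directly, so its correctness does not depend on how many rows happen to fit in the model's context.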