
Microsoft Research released Lens, a 3.8-billion-parameter text-to-image model that matches larger rivals on benchmarks. The key is 800 million detailed captions generated by GPT-4.1, which replace vague web alt-text. The code and weights are open source, and the results suggest that caption quality can matter more than raw scale.
Summary by ByteBrief