ByteBrief

Huawei's Zurich lab released SINQ, an open-source quantization method that cuts LLM memory needs by up to 70%. This lets workloads that would otherwise require Nvidia A100 or H100 GPUs run on consumer cards like the RTX 4090. The Apache 2.0-licensed project is available on GitHub and Hugging Face.
Summary by ByteBrief