ByteBrief: Distilling the feed
Which tokens does a hybrid model predict better?