Datalab released lift, a 9B-parameter open-weights vision model that extracts structured JSON from PDFs and images according to a user-supplied JSON schema. It scores 90.2% field accuracy on Datalab's 225-document benchmark. The model is Datalab's first built purely for extraction, extending its open-source OCR tools chandra, marker, and surya.
Summary by ByteBrief