meet lfm2.5-vl-450m-extract
what it does: extracts user-defined fields from images and returns structured json (a standard text format for organized data that computers can easily read). you define what to extract via a yaml schema (a simple, human-readable list of instructions) in the system prompt; the model outputs a matching json object.
example: you give it a photo of a wood surface and ask for color, texture, and pattern – it returns {"wood_color": "light brown", "wood_texture": "smooth", "wood_pattern": "wavy"}
architecture:
• 450m total parameters (350m language model + 100m vision encoder – the part that "looks at" the image)
• combines two processing techniques: conv (scans the image locally, patch by patch) and attention (looks at relationships across the whole image)
• supports dynamic resolution and a 128k context window (how much text it can consider at once)
enum support: normally the model writes its own answer freely. with enums, you give it a fixed list of choices and it must pick one – e.g. texture can only be "smooth", "rough", or "grainy". think of it like a multiple-choice question instead of an open one
performance:
• 98.9% json validity – output is properly formatted nearly every time
• 98.8% schema f1 – it returns exactly the fields you asked for (100 is perfect)
• 84.5 vlm judge score – a separate ai rates how well the answers match the actual image
• beats all models under 1b parameters, matches models up to 4× its size
use cases:
• safety event detection (e.g. fallen person, fire, leakage)
• video frame analytics
• e-commerce product auto-tagging
limitations:
• single image only – no multi-image reasoning
• flat json output only – no nested or complex structures
• not suited for open-ended questions about images

Introducing LFM2.5-VL-1.6B-Extract and LFM2.5-VL-450M-Extract: Vision-language models that return structured JSON, not free-form text.
— Liquid AI (@liquidai) June 5, 2026
Pass in an image and a list of fields. Get back a clean JSON object.
> Two sizes: 1.6B parameters and 450M
> open-weight
> run on any device… pic.twitter.com/exCqprVbVK
Nick Trenkler