liquid ai just dropped a tiny model that reads images and pulls out exactly the data you need

Nick Trenkler 5 Jun 2026 2 min read

Illustrated man holding a glowing JSON panel as glitchy image fragments swirl around him in an industrial setting

meet lfm2.5-vl-450m-extract

what it does: extracts user-defined fields from images and returns structured json (a standard text format for organized data that computers can easily read). you define what to extract via a yaml schema (a simple, human-readable list of instructions) in the system prompt; the model outputs a matching json object.

example: you give it a photo of a wood surface and ask for color, texture, and pattern – it returns {"wood_color": "light brown", "wood_texture": "smooth", "wood_pattern": "wavy"}

architecture:
• 450m total parameters (350m language model + 100m vision encoder – the part that "looks at" the image)
• combines two processing techniques: conv (scans the image locally, patch by patch) and attention (looks at relationships across the whole image)
• supports dynamic resolution and a 128k context window (how much text it can consider at once)

enum support: normally the model writes its own answer freely. with enums, you give it a fixed list of choices and it must pick one – e.g. texture can only be "smooth", "rough", or "grainy". think of it like a multiple-choice question instead of an open one

performance:
• 98.9% json validity – output is properly formatted nearly every time
• 98.8% schema f1 – it returns exactly the fields you asked for (100 is perfect)
• 84.5 vlm judge score – a separate ai rates how well the answers match the actual image
• beats all models under 1b parameters, matches models up to 4× its size

use cases:
• safety event detection (e.g. fallen person, fire, leakage)
• video frame analytics
• e-commerce product auto-tagging

limitations:
• single image only – no multi-image reasoning
• flat json output only – no nested or complex structures
• not suited for open-ended questions about images

GitHub repository for LFM2.5-VL-450M-Extract showing model files, 1.04 GB size, and config files for Liquid AI's vision model

Introducing LFM2.5-VL-1.6B-Extract and LFM2.5-VL-450M-Extract: Vision-language models that return structured JSON, not free-form text.

Pass in an image and a list of fields. Get back a clean JSON object.

> Two sizes: 1.6B parameters and 450M
> open-weight
> run on any device… pic.twitter.com/exCqprVbVK
— Liquid AI (@liquidai) June 5, 2026

liquid ai just dropped a tiny model that reads images and pulls out exactly the data you need

Read next

minimax m3 surges 198% – still priciest in a chinese-dominated top 4

the stack that eats them: why most agentic ai startups will fail

openai is about to join the price war started by deepseek

Stay in the loop

Read next

minimax m3 surges 198% – still priciest in a chinese-dominated top 4

the stack that eats them: why most agentic ai startups will fail

openai is about to join the price war started by deepseek