macaron-v1-preview-749b released today. nex-n2-pro open-sourced today. macaron comes from mindlab research and is post-trained from glm-5.1. nex-n2-pro comes from nex-agi and is post-trained from qwen3.5
two big agent-focused model drops in the same news cycle, and they're aiming at almost completely different problems
Nex-N2 is now open source! An agentic model series from Nex AGI built for coding, tool use, deep research, and long-horizon workflows.
https://t.co/wQZptCLpS0
https://t.co/LIqNNjtyPD
Models: Nex-N2-Pro 397B total, 17B active; Nex-N2-mini 35B total, 3B active
… pic.twitter.com/bDwovClrFc
- ModelScope (@ModelScope2022), June 8, 2026
macaron is pitched as a personal agent: the kind of model that helps you pick a restaurant, reschedule an errand, or render a comparison card for a booking decision. nex-n2-pro is pitched as a productivity agent: the kind that closes prs, drives a terminal for hours, and runs long-horizon coding loops. macaron leans hard on generative ui and a custom protocol called a2ui; nex-n2-pro leans hard on agentic coding, deep research, and tool calling
Macaron-V1-Preview-749B: a Mixture-of-LoRA personal agent model from MindLab
✨ 744B base + 5 specialist LoRAs
✨ Generative UI as a core skill
✨ Personal agent focused
✨ 202K context
✨ MIT license pic.twitter.com/OkjxKEThzZ
- Adina Yakup (@AdinaYakup), June 8, 2026
here's how they actually stack up
size and architecture
macaron-v1-preview-749b is a 749b-class mixture-of-lora: a 744b base plus five ~1b lora adapters (l0 default, l1 to l4 specialists for tool use, coding, computer-agent, generative ui). routing happens via an explicit "router tool" rather than a learned gating network: the harness decides which adapter handles each turn. bfloat16, 202,752-token context, mit license
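
to make the "router tool" idea concrete, here's a minimal sketch of harness-side adapter selection. everything in it is an assumption for illustration: the `RouterCall` shape, the skill names, and the l1 to l4 mapping (taken from the order of the list above) are not macaron's actual interface.

```python
from dataclasses import dataclass
from typing import Optional

# hypothetical adapter ids; the l1..l4 order is assumed from the list above
ADAPTERS = {
    "default": "l0",
    "tool_use": "l1",
    "coding": "l2",
    "computer_agent": "l3",
    "generative_ui": "l4",
}


@dataclass
class RouterCall:
    """a made-up router tool call naming the skill family for this turn."""
    skill: str


def pick_adapter(call: Optional[RouterCall]) -> str:
    """return the adapter id for this turn, falling back to the default."""
    if call is None:
        return ADAPTERS["default"]
    return ADAPTERS.get(call.skill, ADAPTERS["default"])


turns = [None, RouterCall("coding"), RouterCall("generative_ui"), RouterCall("unknown")]
print([pick_adapter(t) for t in turns])  # ['l0', 'l2', 'l4', 'l0']
```

the point of the design is that the choice is an explicit, inspectable step in the harness rather than something buried in learned gating weights.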

nex-n2-pro is a 396.8b moe with ~17b active params, post-trained on qwen3.5-397b-a17b-base. no adapter routing β one model, one weight set. its "agentic thinking" framework is a training-time framing: adaptive thinking (decide reasoning depth per step) plus coherent thinking (one reasoning style across task types). apache-2.0. a smaller nex-n2-mini (35b moe, 3b active) ships alongside

macaron is ~1.9x the total params but uses sparse specialists; nex activates ~17b per token from a smaller pool. macaron bets on interference-avoidance between skill families; nex bets on transfer between them
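
the size comparison is just arithmetic on the figures quoted above. a quick sketch, using only those numbers (macaron's per-token active count isn't stated, so it isn't computed here):

```python
# figures quoted above, in billions of parameters
macaron_total = 749.0
macaron_base = 744.0
nex_total = 396.8
nex_active = 17.0

total_ratio = macaron_total / nex_total        # macaron total vs nex total
adapter_budget = macaron_total - macaron_base  # spread across five adapters
nex_active_share = nex_active / nex_total      # fraction of nex used per token

print(f"total params ratio: {total_ratio:.2f}x")                  # 1.89x
print(f"adapter params: {adapter_budget:.0f}b total, "
      f"{adapter_budget / 5:.0f}b each")                          # 5b total, 1b each
print(f"nex active share per token: {nex_active_share:.1%}")      # 4.3%
```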
what each is built for
macaron targets daily-life decisions where state changes between turns β where to eat, how to reroute, scheduling errands. its distinctive capability is a2ui: emitting protocol actions that render as cards, forms, sliders, dashboards instead of text walls. a2ui-bench scores three layers (protocol correctness, task construction, ux lift) with rendered visual checks for overflow and broken layouts
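
to give a feel for what a "protocol correctness" check might look like, here's a toy validator. this is not the real a2ui schema: the action types, field names, and rules are invented for illustration, and it only gestures at the first of the three layers (task construction and ux lift with rendered visual checks aren't modeled).

```python
# NOT the real a2ui schema: types and field names are invented for illustration
REQUIRED_FIELDS = {
    "card": {"title", "body"},
    "form": {"title", "fields"},
    "slider": {"label", "min", "max"},
}


def check_action(action):
    """return a list of problems with a ui action; empty means it passes."""
    kind = action.get("type")
    if kind not in REQUIRED_FIELDS:
        return [f"unknown action type: {kind!r}"]
    missing = REQUIRED_FIELDS[kind] - set(action)
    errors = [f"missing field: {name}" for name in sorted(missing)]
    # a range check only makes sense once both bounds are present
    if kind == "slider" and not missing and action["min"] >= action["max"]:
        errors.append("slider min must be below max")
    return errors


restaurant_card = {"type": "card", "title": "dinner options", "body": "two places near you"}
bad_slider = {"type": "slider", "label": "budget", "min": 50, "max": 10}

print(check_action(restaurant_card))  # []
print(check_action(bad_slider))       # ['slider min must be below max']
```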
nex-n2-pro targets agentic coding, deep research, tool calling, terminal execution. the framing is closing the loop between requirement understanding, planning, code implementation, environment feedback, debugging, and iteration. no ui generation story
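
the "closing the loop" framing boils down to propose, run, read the feedback, retry. a minimal sketch of that loop, with a stub standing in for the model (`stub_model`, `run_tests`, and `run_loop` are hypothetical names, not part of any nex tooling):

```python
from typing import Callable, Optional


def run_loop(propose: Callable[[str, Optional[str]], str],
             check: Callable[[str], Optional[str]],
             requirement: str,
             max_steps: int = 5) -> Optional[str]:
    """propose code, run it in the environment, feed any error back in."""
    feedback = None
    for _ in range(max_steps):
        candidate = propose(requirement, feedback)
        feedback = check(candidate)  # None means the environment accepted it
        if feedback is None:
            return candidate
    return None


def stub_model(requirement, feedback):
    # stand-in for a model call: the first attempt is buggy, the retry fixes it
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def run_tests(code):
    # stand-in for environment feedback: execute the code and test one case
    scope = {}
    exec(code, scope)
    result = scope["add"](2, 3)
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


solution = run_loop(stub_model, run_tests, "write add(a, b)")
print(solution)
```

the stub passes on its second attempt; a real agent would replace `stub_model` with a model call and `run_tests` with an actual terminal or test runner.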
how the new models (macaron & nex-n2-pro) compare
nex-n2-pro is a direct competitor to the top tier. its numbers put it shoulder-to-shoulder with deepseek v4-pro and kimi k2.6 on the benchmarks they share:

so on the established coding/agent benchmarks, nex-n2-pro is the new leader on terminal-bench 2.1 among open weights (75.3 vs minimax m3's 66.0), basically tied with the field on swe-bench verified and pro, and competitive on reasoning. apache-2.0, clean sglang deploy. it's a legitimate new top-tier option: if the numbers hold up under independent testing, it joins deepseek/kimi/glm/minimax as a first-class agentic-coding pick.
macaron-v1-preview-749b is not really comparable to the rest. it doesn't try to compete on swe-bench pro, terminal-bench, or browsecomp. it's the only model in this whole list specifically post-trained for personal-agent tasks (calendar, restaurants, routing, daily-life decisions) plus generative ui via the a2ui protocol. nothing else on the list does generative ui. so the question isn't "is it better than kimi" but "do you need the thing it does?" if you're building a consumer personal assistant that renders cards and forms, macaron is alone in its category. if you're building a coding agent, ignore it
bottom line
pick nex-n2-pro for code, terminals, repos, or long-running tool chains: better numbers where they overlap, cleaner deploy path, apache-2.0. pick macaron if you need generative ui output or are building a consumer personal-agent product where livingbench/vitabench/pinchbench-style tasks are the actual workload, and you're willing to run their harness. they're not really competing for the same deployment.
Nick Trenkler