MAS-Architect: Designing Agents That Redesign Themselves
MAS-Architect: Declarative Multi-Agent System Design via Separation of Concerns
Jing Huang, Lidong Zhang, Mutian Bao, Yadong Li, Xingzhong Xu, Jinjian Zhang, Jie Liu, Ming Kong, Qiang Zhu
MAS-Architect reframes Auto-MAS as architectural generation rather than template selection. Its key move is a code-based declarative MAS paradigm: topology planning says what the collaboration graph is, node implementation says how each agent executes, and a shared State Schema keeps the two layers coupled only through typed state. With Distill-then-Explore training, the Meta-Agent learns to synthesize query-specific multi-agent systems from scratch, reaching 78.7% average accuracy while establishing a stronger efficiency-performance frontier.
In Brief
1. The paper's central design choice is Separation of Concerns: topology, node behavior, and runtime state are made explicit instead of being entangled in one imperative code stream.
2. MAS-Architect generates task-adaptive MAS from scratch, deciding node count, connections, conditional routing, roles, tools, and reasoning patterns at query level.
3. Distill-then-Explore gives the Meta-Agent a warm start from validated teacher architectures, then lets RL discover architectures beyond imitation.
4. The strongest Lifelong Agents signal is emergent organization: parallel search streams, recursive audit loops, and role-specialized agents appear without fixed templates.
Problem
Existing Auto-MAS methods fall into two unsatisfying extremes. Graph-based methods are readable and verifiable, but are often restricted to DAGs, predefined roles, operator pools, and static routing. Imperative MAS-as-code methods are expressive, but topology, control flow, prompts, tools, and state passing become tangled inside procedural code. The paper argues that this coupling makes architectures hard to search, inspect, optimize, and adapt per query.
Declarative MAS
MAS-Architect introduces a declarative representation with two layers. The Topology Layer declares the graph, dynamic branches, loops, and routing rules: what the architecture is. The Implementation Layer realizes each node: how an agent reasons, acts, calls tools, and updates state. A standardized State Schema is the interface between them, carrying task context, intermediate results, and trajectories without hiding topology inside execution code.
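To make the two-layer split concrete, here is a minimal sketch in plain Python. It is not the paper's actual specification format: the names `State`, `TOPOLOGY`, `NODES`, `run`, and the toy `solver`/`auditor` agents are all hypothetical, and the "agents" are arithmetic stand-ins rather than LLM calls. The point is only that routing lives in one declarative structure, node behavior lives in another, and the two communicate solely through a typed state.

```python
from typing import Callable, Dict, List, TypedDict

END = "END"  # hypothetical sentinel marking the end of execution


class State(TypedDict, total=False):
    """Shared state schema: the only interface between the two layers."""
    query: str
    draft: str
    verified: bool
    attempts: int
    trajectory: List[str]


# --- Implementation layer: HOW each node executes (toy stand-ins for LLM agents) ---
def solver(state: State) -> State:
    attempts = state.get("attempts", 0)
    a, b = (int(x) for x in state["query"].split("+"))
    # The first attempt is deliberately off by one so the audit loop has work to do.
    answer = a + b + (1 if attempts == 0 else 0)
    return {"draft": str(answer), "attempts": attempts + 1}


def auditor(state: State) -> State:
    a, b = (int(x) for x in state["query"].split("+"))
    return {"verified": int(state["draft"]) == a + b}


NODES: Dict[str, Callable[[State], State]] = {"solver": solver, "auditor": auditor}

# --- Topology layer: WHAT the graph is (entry point plus routing rules over state) ---
TOPOLOGY = {
    "entry": "solver",
    "routes": {
        "solver": lambda s: "auditor",
        # Conditional edge: loop back to the solver until the audit passes.
        "auditor": lambda s: END if s["verified"] else "solver",
    },
}


def run(topology: dict, nodes: dict, state: State, max_steps: int = 10) -> State:
    """Generic executor: it knows nothing about roles, only nodes, routes, and state."""
    current, steps = topology["entry"], 0
    while current != END and steps < max_steps:
        state = {**state, **nodes[current](state)}
        state["trajectory"] = state.get("trajectory", []) + [current]
        current = topology["routes"][current](state)
        steps += 1
    return state


final = run(TOPOLOGY, NODES, {"query": "2+3"})
print(final["draft"], final["trajectory"])
```

In this sketch, changing the collaboration pattern (for example, adding a second auditor or a parallel branch) would only touch `TOPOLOGY`, while changing how an agent reasons would only touch its function in `NODES`.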
Training
The training pipeline first distills architectural patterns from a large teacher model. Candidate MAS specifications are compiled and executed, and only successful, correct architectures survive execution-based rejection sampling for SFT. The second stage uses RL with verified rewards: invalid or shortcut-like architectures get zero reward, valid but incorrect attempts receive a small base reward, and correct valid executions receive full reward. This makes exploration target architecture quality rather than surface-form imitation.
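The two filters described above can be sketched as follows. This is an illustrative reading, not the paper's implementation: the `Rollout` fields, the function names, and the numeric values `BASE_REWARD = 0.1` and `FULL_REWARD = 1.0` are assumptions, since the summary only says the base reward is "small" and does not describe how shortcut-like architectures are detected.

```python
from dataclasses import dataclass

# Assumed values: the text says only "zero", "small base", and "full" reward.
BASE_REWARD = 0.1
FULL_REWARD = 1.0


@dataclass
class Rollout:
    """Outcome of compiling and executing one generated MAS (hypothetical fields)."""
    valid: bool        # the specification compiled and executed successfully
    is_shortcut: bool  # flagged as shortcut-like by some external check
    correct: bool      # the final answer matched the reference


def keep_for_sft(r: Rollout) -> bool:
    """Stage 1: execution-based rejection sampling keeps only successful, correct runs."""
    return r.valid and r.correct


def verified_reward(r: Rollout) -> float:
    """Stage 2: tiered reward for RL."""
    if not r.valid or r.is_shortcut:
        return 0.0  # invalid or shortcut-like architectures earn nothing
    return FULL_REWARD if r.correct else BASE_REWARD


rollouts = [
    Rollout(valid=False, is_shortcut=False, correct=False),
    Rollout(valid=True, is_shortcut=True, correct=True),
    Rollout(valid=True, is_shortcut=False, correct=False),
    Rollout(valid=True, is_shortcut=False, correct=True),
]
rewards = [verified_reward(r) for r in rollouts]
print(rewards)
```

Note that under this reading a shortcut-like architecture scores zero even when its answer is correct, which is what keeps the policy from collapsing into trivial designs that happen to answer well.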
Evidence
On GSM8K, GSM-Hard, MATH, MMLU, and GPQA, MAS-Architect with Qwen3-4B reaches 78.7% average accuracy, 6.4 points above the vanilla baseline and 3.0 points above the second-best method in the table. The GSM8K efficiency plot is especially telling: 94.4% accuracy with 2,533 prompt tokens per query, and the appendix reports 23.8% fewer total tokens than MAS-GPT on GSM8K. Ablations show that SFT gives small but stable gains, while RL drives most of the improvement, especially on HotpotQA, where Qwen3-4B rises by 16.0 points.
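The reported margins imply the comparison averages directly. The short calculation below derives them from the numbers quoted above. The derived values are implied by subtraction rather than quoted from the paper's table, so they may differ from the table by rounding.

```python
# Figures quoted in the text (percent average accuracy and point gaps).
MAS_ARCHITECT_AVG = 78.7
GAIN_OVER_VANILLA = 6.4
GAIN_OVER_SECOND_BEST = 3.0

# Implied averages, rounded to one decimal to match the reported precision.
vanilla_avg = round(MAS_ARCHITECT_AVG - GAIN_OVER_VANILLA, 1)
second_best_avg = round(MAS_ARCHITECT_AVG - GAIN_OVER_SECOND_BEST, 1)

print(f"implied vanilla average:     {vanilla_avg}%")
print(f"implied second-best average: {second_best_avg}%")
```

The token figures cannot be cross-checked the same way: the 2,533 figure counts prompt tokens, while the 23.8% reduction refers to total tokens.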
Why Lifelong Agents
The Lifelong Agents angle is not simply that agents collaborate. It is that the Meta-Agent learns an architectural policy: given a new task, it can create a new organization, assign roles, route information, audit failures, and specialize agents. That is a step toward agents whose competence grows at the level of system design, not only at the level of single-model answers.