#1 SEMA-RAG: A Self-Evolving Multi-Agent Retrieval-Augmented Generation Framework for Medical Reasoning
Score: 29.6
Matched keywords: agent, llm, multi-agent, rag, reasoning, retrieval-augmented
Categories: cs.CL, cs.AI
Compressed abstract: Retrieval-Augmented Generation (RAG) is widely employed to mitigate risks such as hallucinations and knowledge obsolescence in medical question answering, yet its predominantly single-round, static retrieval paradigm misaligns with the multi-stage process of clinical reasoning. This compressed workflow induces two structural deficiencies: question-to-query translation often lacks clinically grounded semantic interpr…
#2 PROTEA: Offline Evaluation and Iterative Refinement for Multi-Agent LLM Workflows
Score: 32.0
Matched keywords: agent, llm, multi-agent, prompt
Categories: cs.CL, cs.AI, cs.HC, cs.SE
Compressed abstract: Multi-agent LLM workflows -- systems composed of multiple role-specific LLM calls -- often outperform single-prompt baselines, but they remain difficult to debug and refine. Failures can originate from subtle errors in intermediate outputs that propagate to downstream nodes, requiring developers to inspect long traces and infer which agent to modify.
#3 LLM-Guided Communication for Cooperative Multi-Agent Reinforcement Learning
Score: 31.7
Matched keywords: agent, llm, multi-agent, reasoning
Categories: cs.AI, cs.LG, cs.MA
Compressed abstract: Communication is a key component in multi-agent reinforcement learning (MARL) for mitigating partial observability, yet prior approaches often rely on inefficient information exchange or fail to transmit sufficient state information. To address this, we propose LLM-driven Multi-Agent Communication (LMAC), which leverages an LLM's reasoning capability to design a communication protocol that enables all agents to reco…
#4 VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems
Score: 31.2
Matched keywords: agent, large language model, llm, multi-agent
Categories: cs.CL
Compressed abstract: Large language model-driven multi-agent systems (LLM-MAS) excel at complex tasks, yet unreliable agents remain a key bottleneck to system-level reliability. Automatic failure attribution is therefore critical, but existing approaches, such as direct prediction of agent-error pairs and agent-first failure attribution, rely on local logs of agents and miss global failures that only manifest over full interaction traje…
#5 Learning Transferable Topology Priors for Multi-Agent LLM Collaboration Across Domains
Score: 42.6
Matched keywords: agent, alignment, large language model, llm, multi-agent, reasoning, token
Categories: cs.CL
Compressed abstract: Large language model (LLM)-based multi-agent systems have shown strong potential for complex reasoning by coordinating specialized agents through structured communication. However, existing topology-evolution methods typically construct or optimize a collaboration topology for each query from scratch, leading to substantial online search overhead, high inference-time token consumption, and limited scalability in mul…
#6 MetaCogAgent: A Metacognitive Multi-Agent LLM Framework with Self-Aware Task Delegation
Score: 34.2
Matched keywords: agent, alignment, benchmark, large language model, llm, multi-agent
Categories: cs.AI, cs.MA
Compressed abstract: Multi-agent large language model (LLM) systems have shown promise for solving complex tasks through agent collaboration. However, existing frameworks assign tasks based on predefined roles without considering whether an agent can accurately assess its own competence boundaries, leading to overconfident execution on tasks beyond its expertise.
#7 Agent Bazaar: Enabling Economic Alignment in Multi-Agent Marketplaces
Score: 24.0
Matched keywords: agent, alignment, large language models, multi-agent
Categories: cs.LG, cs.MA
Compressed abstract: The deployment of Large Language Models (LLMs) as autonomous economic agents introduces systemic risks that extend beyond individual capability failures. As agents transition to directly interacting with marketplaces, their collective behavior can amplify volatility and mask deception at scale.
#8 Alignment Dynamics in LLM Fine-Tuning
Score: 29.2
Matched keywords: alignment, fine-tuning, large language models, llm
Categories: cs.LG, cs.AI
Compressed abstract: Although Large Language Models (LLMs) achieve strong alignment through supervised fine-tuning and reinforcement learning from human feedback, the alignment is often fragile under subsequent fine-tuning. Existing explanations either attribute alignment fragility to gradient geometry or characterize it as a distributional shift in model outputs, yet few provide a unified account that bridges parameter-space learning d…
#9 Event-B Agent: Towards LLM Agent for Formal Model Synthesis and Repair
Score: 23.0
Matched keywords: agent, large language models, llm
Categories: cs.SE
Compressed abstract: Building software that is correct by construction is a long-standing goal in software engineering, as it ensures reliability during design and development rather than after deployment. Formal methods realize this vision by enabling the expression of system behavior and requirements in mathematics, thereby guaranteeing correctness through formal verification, including theorem proving and model checking.
#10 Whispers in the Noise: Surrogate-Guided Concept Awakening via a Multi-Agent Framework
Score: 29.2
Matched keywords: agent, agent framework, alignment, diffusion, multi-agent
Categories: cs.AI
Compressed abstract: Diffusion models (DMs) are widely used for text-to-image generation, but their strong generative capabilities also raise concerns about unsafe or undesirable content. Concept erasure aims to mitigate these risks by removing specific concepts from pretrained models.