#1 Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition
Score: 35.4
Matched keywords: agent, large language model, llm, multi-agent, reasoning
Categories: cs.LG, cs.AI, cs.CL
Compressed abstract: While Large Language Model-based Multi-Agent Systems (MAS) consistently outperform single-agent systems on complex tasks, their intricate interactions introduce critical reliability challenges arising from communication dynamics and role dependencies. Existing Uncertainty Quantification methods, typically designed for single-turn outputs, fail to address the unique complexities of the MAS.
#2 Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction
Score: 44.7
Matched keywords: agent, autonomous coding, benchmark, code generation, large language model, large language models, llm, multi-agent
Categories: cs.AI, cs.MA
Compressed abstract: Human cognitive development is shaped not only by individual effort but by structured social interaction, where role-based exchanges, such as those between a tutor and a learner, enable solutions that neither could achieve alone. Inspired by these developmental principles, we ask whether a tutor-student multi-agent system can create a synergistic effect by pushing Large Language Model (LLM) beyond what i…
#3 Camera Artist: A Multi-Agent Framework for Cinematic Language Storytelling Video Generation
Score: 25.2
Matched keywords: agent, agent framework, multi-agent
Categories: cs.AI
Compressed abstract: We propose Camera Artist, a multi-agent framework that models a real-world filmmaking workflow to generate narrative videos with explicit cinematic language. While recent multi-agent systems have made substantial progress in automating filmmaking workflows from scripts to videos, they often lack explicit mechanisms to structure narrative progression across adjacent shots and deliberate use of cinematic language, res…
#4 QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation
Score: 24.7
Matched keywords: benchmark, code generation, large language models, llm, reasoning
Categories: cs.LG, cs.AI, cs.PL, cs.SE, quant-ph
Compressed abstract: Large Language Models (LLMs) are increasingly used for code generation, yet quantum code generation is still evaluated mostly within single frameworks, making it difficult to separate quantum reasoning from framework familiarity. We introduce QuanBench+, a unified benchmark spanning Qiskit, PennyLane, and Cirq, with 42 aligned tasks covering quantum algorithms, gate decomposition, and state preparation.
#5 LLM-Rosetta: A Hub-and-Spoke Intermediate Representation for Cross-Provider LLM API Translation
Score: 19.7
Matched keywords: large language model, llm, reasoning
Categories: cs.SE, cs.AI
Compressed abstract: The rapid proliferation of Large Language Model (LLM) providers--each exposing proprietary API formats--has created a fragmented ecosystem where applications become tightly coupled to individual vendors. Switching or bridging providers requires O(N^2) bilateral adapters, impeding portability and multi-provider architectures.
#6 Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines
Score: 17.2
Matched keywords: agent, ai, llm, multi-agent
Categories: cs.CR, cs.AI, cs.LG
Compressed abstract: We introduce Semantic Intent Fragmentation (SIF), an attack class against LLM orchestration systems where a single, legitimately phrased request causes an orchestrator to decompose a task into subtasks that are individually benign but jointly violate security policy. Current safety mechanisms operate at the subtask level, so each step clears existing classifiers -- the violation only emerges at the composed plan.
#7 Generative AI Agent Empowered Power Allocation for HAP Propulsion and Communication Systems
Score: 23.5
Matched keywords: agent, ai, ai agent, artificial intelligence
Categories: cs.NI, cs.IT
Compressed abstract: High altitude platforms (HAPs) are emerging as a key enabler for 6G coverage, yet limited energy must be split between propulsion and communications. Most prior HAP studies ignore propulsion power or rely on surrogates that miss hull-propeller interference, leading to misestimated communication power budgets and degraded beamforming.
#8 MAG-3D: Multi-Agent Grounded Reasoning for 3D Understanding
Score: 34.7
Matched keywords: agent, agent framework, coding agent, multi-agent, multimodal, reasoning
Categories: cs.CV, cs.MA
Compressed abstract: Vision-language models (VLMs) have achieved strong performance in multimodal understanding and reasoning, yet grounded reasoning in 3D scenes remains underexplored. Effective 3D reasoning hinges on accurate grounding: to answer open-ended queries, a model must first identify query-relevant objects and regions in a complex scene, and then reason about their spatial and geometric relationships.
#9 Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models
Score: 24.4
Matched keywords: benchmark, large language model, large language models, llm, reasoning
Categories: cs.CL, cs.AI, cs.LG
Compressed abstract: Extended reasoning models represent a transformative shift in Large Language Model (LLM) capabilities by enabling explicit test-time computation for complex problem solving. However, the optimal configuration of sampling temperature and prompting strategy for these systems remains largely underexplored.
#10 Semantic Rate-Distortion for Bounded Multi-Agent Communication: Capacity-Derived Semantic Spaces and the Communication Cost of Alignment
Score: 18.7
Matched keywords: agent, alignment, benchmark, multi-agent
Categories: cs.IT, cs.AI
Compressed abstract: When two agents of different computational capacities interact with the same environment, they need not compress a common semantic alphabet differently; they can induce different semantic alphabets altogether. We show that the quotient POMDP Q_{m,T}(M), the unique coarsest abstraction consistent with an agent's capacity, serves as a capacity-derived semantic space for any bounded agent, and that communication betw…