#1 TurboAgent: An LLM-Driven Autonomous Multi-Agent Framework for Turbomachinery Aerodynamic Design
Score: 40.2
Matched keywords: agent, agent framework, large language model, llm, multi-agent
Categories: cs.AI
Compressed abstract: The aerodynamic design of turbomachinery is a complex and tightly coupled multi-stage process involving geometry generation, performance prediction, optimization, and high-fidelity physical validation. Existing intelligent design approaches typically focus on individual stages or rely on loosely coupled pipelines, making fully autonomous end-to-end design challenging. To address this issue, this study proposes TurboA…
#2 Fighting AI with AI: AI-Agent Augmented DNS Blocking of LLM Services during Student Evaluations
Score: 29.2
Matched keywords: agent, ai, large language models, llm
Categories: cs.NI, cs.AI, cs.CY, cs.ET, cs.LG
Compressed abstract: The transformative potential of large language models (LLMs) in education, such as improving accessibility and personalized learning, is being eclipsed by significant challenges. These challenges stem from concerns that LLMs undermine academic assessment by enabling bypassing of critical thinking, leading to increased cognitive offloading.
#3 SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
Score: 23.8
Matched keywords: llm, reasoning, token
Categories: cs.LG, cs.AI, cs.CL
Compressed abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capabilities and unresolved token inefficiency. To address this, we propose Stage-aware Hierarchical Advantage via Potential Estimation (SHAPE), a framework that formalizes reasoning as a trajectory through a state space of…
#4 SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills
Score: 29.3
Matched keywords: agent, ai, ai agent, alignment, benchmark, llm, prompt
Categories: cs.CR, cs.AI
Compressed abstract: OpenClaw's ClawHub marketplace hosts over 13,000 community-contributed agent skills, and between 13% and 26% of them contain security vulnerabilities according to recent audits. Regex scanners miss obfuscated payloads; formal static analyzers cannot read the natural language instructions in SKILL.md files where prompt injection and social engineering attacks hide.
#5 Qualixar OS: A Universal Operating System for AI Agent Orchestration
Score: 42.2
Matched keywords: agent, ai, ai agent, alignment, llm, multi-agent
Categories: cs.AI, cs.MA, cs.SE
Compressed abstract: We present Qualixar OS, the first application-layer operating system for universal AI agent orchestration. Unlike kernel-level approaches (AIOS) or single-framework tools (AutoGen, CrewAI), Qualixar OS provides a complete runtime for heterogeneous multi-agent systems spanning 10 LLM providers, 8+ agent frameworks, and 7 transports.
#6 How Much LLM Does a Self-Revising Agent Actually Need?
Score: 21.4
Matched keywords: agent, llm
Categories: cs.AI, cs.CL
Compressed abstract: Recent LLM-based agents often place world modeling, planning, and reflection inside a single language model loop. This can produce capable behavior, but it makes a basic scientific question difficult to answer: which part of the agent's competence actually comes from the LLM, and which part comes from explicit structure around it?
#7 AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
Score: 23.7
Matched keywords: agent, ai, ai agents, benchmark, llm
Categories: cs.LG, cs.AI, cs.MA, cs.SE
Compressed abstract: AI agents are increasingly deployed in real-world applications, including systems such as Manus, OpenClaw, and coding agents. Existing research has primarily focused on server-side efficiency, proposing methods such as caching, speculative execution, traffic scheduling, and load balancing to reduce the cost of serving agentic workloads.
#8 MAT-Cell: A Multi-Agent Tree-Structured Reasoning Framework for Batch-Level Single-Cell Annotation
Score: 29.3
Matched keywords: agent, large language models, multi-agent, rag, reasoning, retrieval-augmented
Categories: q-bio.QM, cs.AI
Compressed abstract: Automated cellular reasoning faces a core dichotomy: supervised methods fall into the Reference Trap and fail to generalize to out-of-distribution cell states, while large language models (LLMs), without grounded biological priors, suffer from a Signal-to-Noise Paradox that produces spurious associations. We propose MAT-Cell, a neuro-symbolic reasoning framework that reframes single-cell analysis from black-box clas…
#9 ReCodeAgent: A Multi-Agent Workflow for Language-agnostic Translation and Validation of Large-scale Repositories
Score: 26.5
Matched keywords: agent, agent workflow, multi-agent, repository-level
Categories: cs.SE
Compressed abstract: Most repository-level code translation and validation techniques have been evaluated on a single source-target programming language (PL) pair, owing to the complex engineering effort required to adapt new PL pairs. Programming agents can enable PL-agnosticism in repository-level code translation and validation: they can synthesize code across many PLs and autonomously use existing tools specific to each PL's analysi…
#10 On the Step Length Confounding in LLM Reasoning Data Selection
Score: 29.6
Matched keywords: fine-tuning, large language models, llm, reasoning, token
Categories: cs.CL, cs.AI
Compressed abstract: Large reasoning models have recently demonstrated strong performance on complex tasks that require long chain-of-thought reasoning, through supervised fine-tuning on large-scale and high-quality datasets. To construct such datasets, existing pipelines generate long reasoning data from more capable Large Language Models (LLMs) and apply manually heuristic or naturalness-based selection methods to filter high-quality…