#1 Dynamic Attentional Context Scoping: Agent-Triggered Focus Sessions for Isolated Per-Agent Steering in Multi-Agent LLM Orchestration
Score: 28.2
Matched keywords: agent, llm, multi-agent
Categories: cs.MA, cs.AI, cs.LG
Compressed abstract: Multi-agent LLM orchestration systems suffer from context pollution: when N concurrent agents compete for the orchestrator's context window, each agent's task state, partial outputs, and pending questions contaminate the steering interactions of every other agent, degrading decision quality. We introduce Dynamic Attentional Context Scoping (DACS), a mechanism in which the orchestrator operates in two asymmetric mode…
#2 Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
Score: 43.0
Matched keywords: agent, harness, harness engineering, large language model, llm
Categories: cs.SE, cs.MA
Compressed abstract: Large language model (LLM) agents are increasingly built less by changing model weights than by reorganizing the runtime around them. Capabilities that earlier systems expected the model to recover internally are now externalized into memory stores, reusable skills, interaction protocols, and the surrounding harness that makes these modules reliable in practice.
#3 ReRec: Reasoning-Augmented LLM-based Recommendation Assistant via Reinforcement Fine-tuning
Score: 31.2
Matched keywords: alignment, fine-tuning, llm, reasoning
Categories: cs.IR, cs.AI
Compressed abstract: With the rise of LLMs, there is an increasing need for intelligent recommendation assistants that can handle complex queries and provide personalized, reasoning-driven recommendations. LLM-based recommenders show potential but face challenges in multi-step reasoning, underscoring the need for reasoning-augmented systems.
#4 Multimodal Reasoning with LLM for Encrypted Traffic Interpretation: A Benchmark
Score: 25.6
Matched keywords: benchmark, llm, multimodal, reasoning
Categories: cs.CR, cs.AI, cs.MM, cs.NI
Compressed abstract: Network traffic, as a key media format, is crucial for ensuring security and communications in modern internet infrastructure. While existing methods offer excellent performance, they face two key bottlenecks: (1) They fail to capture multidimensional semantics beyond unimodal sequence patterns.
#5 Verify Before You Commit: Towards Faithful Reasoning in LLM Agents via Self-Auditing
Score: 29.4
Matched keywords: agent, benchmark, large language model, llm, reasoning
Categories: cs.AI, cs.CL
Compressed abstract: In large language model (LLM) agents, reasoning trajectories are treated as reliable internal beliefs for guiding actions and updating memory. However, coherent reasoning can still violate logical or evidential constraints, allowing unsupported beliefs to be repeatedly stored and propagated across decision steps and leading to systematic behavioral drift in long-horizon agentic systems.
#6 GRASS: Gradient-based Adaptive Layer-wise Importance Sampling for Memory-efficient Large Language Model Fine-tuning
Score: 19.2
Matched keywords: fine-tuning, large language model, large language models
Categories: cs.CL, cs.LG
Compressed abstract: Full-parameter fine-tuning of large language models is constrained by substantial GPU memory requirements. Low-rank adaptation methods mitigate this challenge by updating only a subset of parameters.
#7 Lightweight LLM Agent Memory with Small Language Models
Score: 20.2
Matched keywords: agent, llm
Categories: cs.AI
Compressed abstract: Although LLM agents can leverage tools for complex tasks, they still need memory to maintain cross-turn consistency and accumulate reusable information in long-horizon interactions. Retrieval-based external memory systems incur low online overhead but suffer from unstable accuracy due to limited query construction and candidate filtering.
#8 Program Analysis Guided LLM Agent for Proof-of-Concept Generation
Score: 14.0
Matched keywords: agent, llm
Categories: cs.SE
Compressed abstract: Software developers frequently receive vulnerability reports that require them to reproduce the vulnerability in a reliable manner by generating a proof-of-concept (PoC) input that triggers it. Given the source code for a software project and a specific code location for a potential vulnerability, automatically generating a PoC for the given vulnerability has been a challenging research problem.
#9 ClawBench: Can AI Agents Complete Everyday Online Tasks?
Score: 16.4
Matched keywords: ai, ai agents
Categories: cs.CL, cs.AI
Compressed abstract: AI agents may be able to automate your inbox, but can they automate other routine aspects of your life? Everyday online tasks offer a realistic yet unsolved testbed for evaluating the next generation of AI agents.
#10 SeLaR: Selective Latent Reasoning in Large Language Models
Score: 19.8
Matched keywords: large language models, reasoning, token
Categories: cs.CL, cs.AI
Compressed abstract: Chain-of-Thought (CoT) has become a cornerstone of reasoning in large language models, yet its effectiveness is constrained by the limited expressiveness of discrete token sampling. Recent latent reasoning approaches attempt to alleviate this limitation by replacing discrete tokens with soft embeddings (probability-weighted mixtures of token embeddings) or hidden states, but they commonly suffer from two issues: (1)…
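The SeLaR abstract describes soft embeddings as probability-weighted mixtures of token embeddings. The sketch below illustrates only that generic definition and is not SeLaR's method; the three-token vocabulary, the two-dimensional embedding table, the logits, and the helper names `softmax` and `soft_embedding` are all hypothetical values chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw logits into a probability distribution over the vocabulary."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(probs, embedding_table):
    """Return the probability-weighted mixture of the rows of embedding_table."""
    dim = len(embedding_table[0])
    mixed = [0.0] * dim
    for p, row in zip(probs, embedding_table):
        for j in range(dim):
            mixed[j] += p * row[j]
    return mixed


# Hypothetical toy setup: 3 tokens, 2-dimensional embeddings.
embedding_table = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [2.0, 1.0, 0.1]

probs = softmax(logits)

# Discrete decoding commits to one token and feeds back its embedding.
hard = embedding_table[probs.index(max(probs))]

# A soft embedding instead keeps the whole distribution as a blended vector.
soft = soft_embedding(probs, embedding_table)

print("probabilities:", probs)
print("hard embedding:", hard)
print("soft embedding:", soft)
```

A one-hot distribution reduces the soft embedding to the ordinary embedding of the chosen token, so discrete token selection can be viewed as a special case of the mixture.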