arXiv daily keyword digest · 2026-05-07

#1 AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use

Score: 34.2

Matched keywords: agent, ai, ai agent, ai agents, benchmark, llm, tool use

Categories: cs.AI, cs.CR

Compressed abstract: Modern AI agents execute real-world side effects through tool calls such as file operations, shell commands, HTTP requests, and database queries. A single unsafe action, including accidental deletion, credential exposure, or data exfiltration, can cause irreversible harm.

#2 Stabilizing LLM Supervised Fine-Tuning via Explicit Distributional Control

Score: 21.4

Matched keywords: fine-tuning, large language models, llm

Categories: cs.LG, cs.AI, cs.CL

Compressed abstract: Post-training large language models (LLMs) often suffers from catastrophic forgetting, where improvements on a target objective degrade previously acquired capabilities. Recent evidence suggests that this phenomenon is primarily driven by excessive distributional drift during optimization.

#3 DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

Score: 25.0

Matched keywords: agent, ai, ai agents, prompt

Categories: cs.AI

Compressed abstract: AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due to their high capability and flexibility, such agents raise significant security and safety concerns.

#4 Strat-Reasoner: Reinforcing Strategic Reasoning of LLMs in Multi-Agent Games

Score: 32.2

Matched keywords: agent, large language models, llm, multi-agent, reasoning

Categories: cs.AI

Compressed abstract: While Large Language Models (LLMs) excel in certain reasoning tasks, they struggle in multi-agent games where the final outcome depends on the joint strategies of all agents. In multi-agent games, the non-stationarity of other agents poses significant challenges for evaluating the reasoning process and for assigning credit across multiple reasoning steps.

#5 From Parameter Dynamics to Risk Scoring: Quantifying Sample-Level Safety Degradation in LLM Fine-tuning

Score: 22.2

Matched keywords: alignment, fine-tuning, large language models, llm

Categories: cs.AI, cs.LG

Compressed abstract: Safety alignment of Large Language Models (LLMs) is extremely fragile, as fine-tuning on a small number of benign samples can erase safety behaviors learned from millions of preference examples. Existing studies attempt to explain this phenomenon by comparing parameters and hidden states before and after fine-tuning, but overlook their dynamic evolution during fine-tuning.

#6 Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

Score: 19.2

Matched keywords: large language models, llm, reasoning

Categories: cs.CL, cs.ET, cs.LG

Compressed abstract: Reinforcement Learning with Verifiable Rewards (RLVR) is an essential paradigm that enhances the reasoning capabilities of Large Language Models (LLMs). However, existing methods typically rely on static policy optimization schemes that misalign with the model's evolving reasoning capabilities.

#7 Towards Robust LLM Post-Training: Automatic Failure Management for Reinforcement Fine-Tuning

Score: 18.2

Matched keywords: benchmark, fine-tuning, large language models, llm

Categories: cs.SE, cs.AI

Compressed abstract: Reinforcement fine-tuning (RFT) has become a core paradigm for post-training large language models, yet its training process remains highly fragile. Existing efforts mainly improve reliability at the system level or address specific issues in individual subproblems by modifying RFT algorithms.

#8 SWE-WebDevBench: Evaluating Coding Agent Application Platforms as Virtual Software Agencies

Score: 20.0

Matched keywords: agent, ai, ai agents, benchmark, coding agent

Categories: cs.MA, cs.SE

Compressed abstract: The emergence of "vibe coding" platforms, where users describe applications in natural language and AI agents autonomously generate full-stack software, has created a need for rigorous evaluation beyond code-level benchmarks. In order to assess them as virtual software development agencies on understanding business requirements, making architectural decisions, writing production code, handling iterative modification…

#9 RaguTeam at SemEval-2026 Task 8: Meno and Friends in a Judge-Orchestrated LLM Ensemble for Faithful Multi-Turn Response Generation

Score: 9.4

Matched keywords: llm

Categories: cs.CL, cs.AI, cs.LG

Compressed abstract: We present our winning system for Task B (generation with reference passages) in SemEval-2026 Task 8: MTRAGEval. Our method is a heterogeneous ensemble of seven LLMs with two prompting variants, where a GPT-4o-mini judge selects the best candidate per instance.

#10 Coral: Cost-Efficient Multi-LLM Serving over Heterogeneous Cloud GPUs

Score: 22.4

Matched keywords: harness, large language models, llm

Categories: cs.DC, cs.AI, cs.CL, cs.LG

Compressed abstract: The usage of large language models (LLMs) has grown increasingly fragmented, with no single model dominating. Meanwhile, cloud providers offer a wide range of mid-tier and older-generation GPUs that enjoy better availability and deliver comparable performance per dollar to top-tier hardware.

2026-05-07 · arXiv Daily Keyword Digest (Top 10 of 662)
