arXiv daily keyword digest · 2026-05-01

#1 Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents

Score: 37.7

Matched keywords: agent, ai, ai agents, large language model, llm, reasoning

Categories: cs.AI

Compressed abstract: We present Collaborative Agent Reasoning Engineering (CARE), a disciplined methodology for engineering Large Language Model (LLM) agents in scientific domains. Unlike ad-hoc trial-and-error approaches, CARE specifies behavior, grounding, tool orchestration, and verification through reusable artifacts and systematic, stage-gated phases.

#2 SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

Score: 31.8

Matched keywords: code generation, fine-tuning, large language models, llm, prompt

Categories: cs.CR, cs.AR

Compressed abstract: As large language models (LLMs) are increasingly fine-tuned for hardware tasks like RTL code generation, the scarcity of high-quality datasets often leads to the use of rapidly assembled or generated training data. These datasets frequently lack security verification and are highly susceptible to data poisoning attacks.

#3 MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection

Score: 38.9

Matched keywords: agent, agent framework, multi-agent, multimodal, reasoning, retrieval-augmented

Categories: cs.AI, cs.CL

Compressed abstract: Multimodal Stance Detection (MSD) is crucial for understanding public discourse, yet effectively fusing text and image, especially with conflicting signals, remains challenging. Existing methods often face difficulties with contextual grounding, cross-modal interpretation ambiguity, and single-pass reasoning fragility.

#4 End-to-End Evaluation and Governance of an EHR-Embedded AI Agent for Clinicians

Score: 12.2

Matched keywords: agent, ai, ai agent

Categories: cs.AI

Compressed abstract: Clinical AI systems require not just point-in-time evaluation but continuous governance: the ongoing practice of monitoring, evaluating, iterating, and re-evaluating performance throughout deployment. We present an end-to-end framework of governance that integrates rubric validation, live deployment feedback, technical performance monitoring, and cost tracking, with controlled experimentation gating system changes b…

#5 Kernelized Advantage Estimation: From Nonparametric Statistics to LLM Reasoning

Score: 27.4

Matched keywords: large language models, llm, prompt, reasoning

Categories: cs.LG, stat.ML

Compressed abstract: Recent advances in large language models (LLMs) have increasingly relied on reinforcement learning (RL) to improve their reasoning capabilities. Three approaches have been widely adopted: (i) Proximal policy optimization and advantage actor-critic rely on a deep neural network to estimate the value function of the learning policy in order to reduce the variance of the policy gradient.

#6 Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors

Score: 16.4

Matched keywords: fine-tuning, llm, token

Categories: cs.CR, cs.AI

Compressed abstract: Local fine-tuning datasets routinely contain sensitive secrets such as API keys, personal identifiers, and financial records. Although "local offline fine-tuning" is often viewed as a privacy boundary, we reveal that compromised model code is sufficient to steal them.

#7 Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models

Score: 22.4

Matched keywords: large language models, llm, reasoning

Categories: cs.CL, cs.AI

Compressed abstract: Large Language Models (LLMs) are known to acquire reasoning capabilities through shared inference patterns in pre-training data, which are further elicited via Chain-of-Thought (CoT) practices. However, whether fundamental reasoning patterns, such as induction, deduction, and abduction, can be decoupled from specific problem instances remains a critical challenge for model controllability, and for shedding light on…

#8 Sentiment Analysis of AI Adoption in Indonesian Higher Education Using Machine Learning and Transformer-Based Models

Score: 17.8

Matched keywords: ai, artificial intelligence, deep learning, machine learning, transformer

Categories: cs.CL

Compressed abstract: This study analyzes Indonesian student opinions on the adoption of artificial intelligence in higher education using two approaches: TF-IDF-based machine learning and Transformer-based deep learning. The dataset consists of 2,295 labeled samples, combining 1,154 student opinions with additional lexical sentiment data.

#9 Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents

Score: 27.6

Matched keywords: agent, large language model, llm, token

Categories: cs.CL, cs.AI, cs.LG

Compressed abstract: Large language model (LLM)-based coding agents increasingly rely on external memory to reuse prior debugging experience, repair traces, and repository-local operational knowledge. However, retrieved memory is useful only when the current failure is genuinely compatible with a previous one; superficial similarity in stack traces, terminal errors, paths, or configuration symptoms can lead to unsafe memory injection.

#10 RoadMapper: A Multi-Agent System for Roadmap Generation of Solving Complex Research Problems

Score: 26.2

Matched keywords: agent, benchmark, large language models, llm, multi-agent

Categories: cs.CL, cs.MA

Compressed abstract: People commonly leverage structured content to accelerate knowledge acquisition and research problem solving. Among these, roadmaps guide researchers through hierarchical subtasks to solve complex research problems step by step.

2026-05-01 · arXiv Daily Keyword Digest (Top 10 of 677)
