#1 Doctorina MedBench: End-to-End Evaluation of Agent-Based Medical AI
Score: 17.2
Matched keywords: agent, ai, reasoning
Categories: cs.CL, cs.AI, cs.LG, cs.MA
Score: 16.4
Matched keywords: large language model, large language models, llm
Categories: cs.CL, cs.AI, cs.LG
Score: 23.8
Matched keywords: large language model, large language models, llm, multimodal, token
Categories: cs.DC, cs.AI
Score: 13.2
Matched keywords: llm, reasoning
Categories: cs.LG, cs.AI, cs.CL
Score: 20.0
Matched keywords: benchmark, large language models, llm, reasoning, token
Categories: cs.RO, cs.AI
Score: 21.6
Matched keywords: ai, benchmark, large language models, llm, rag
Categories: cs.CL
Score: 15.4
Matched keywords: agent, ai, benchmark, large language models
Categories: cs.SE, cs.LG
Score: 14.8
Matched keywords: artificial intelligence, diffusion, transformer
Categories: cs.CV, cs.AI, cs.LG, eess.IV
Score: 10.2
Matched keywords: benchmark, reasoning
Categories: cs.CV, cs.AI, cs.CL, cs.LG
Score: 10.2
Matched keywords: machine learning
Categories: cs.CR, cs.AI, cs.LG