#1 Can Large Multimodal Models Inspect Buildings? A Hierarchical Benchmark for Structural Pathology Reasoning
Score: 20.6
Matched keywords: ai, ai agents, benchmark, foundation models, multimodal, reasoning
Categories: cs.CV
Score: 20.0
Matched keywords: benchmark, large language models, llm, reasoning
Categories: cs.CL
Score: 17.0
Matched keywords: alignment, large language models, llm, prompt
Categories: cs.CR, cs.AI
Score: 16.4
Matched keywords: alignment, fine-tuning, multimodal, reasoning
Categories: cs.CV, cs.AI
Score: 14.8
Matched keywords: llm, reasoning
Categories: cs.CL, cs.AI, cs.LG
Score: 14.0
Matched keywords: benchmark, large language model, llm
Categories: cs.RO, cs.MA
Score: 13.8
Matched keywords: large language models, token
Categories: cs.CL, cs.AI, cs.LG
Score: 13.4
Matched keywords: alignment, benchmark, diffusion, large language models, multimodal
Categories: cs.CV, cs.AI
Score: 11.6
Matched keywords: agent, reasoning
Categories: cs.CV, cs.AI, cs.CL
Score: 11.2
Matched keywords: benchmark, multimodal, reasoning
Categories: cs.CV