
#9 Brain-LLM Alignment Tracks Training Data, Not Typology

Score: 17.4 | Matched keywords: alignment, llm

Categories: cs.CL, cs.AI, q-bio.NC

Abstract Snapshot

Compressed abstract

Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages. Does alignment also generalize cross-linguistically, and what governs the variation?

Main idea

Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages.

Method signal

Does alignment also generalize cross-linguistically, and what governs the variation? We test this using fMRI data from 112 participants across English, Chinese, and French (the Le Petit Prince corpus) and seven LLMs spanning English-dominant, Chinese-dominant, and multilingual architectures.

Contribution signal

Our central finding is that training-language dominance, not an inherent property of English, drives the alignment pattern: a Chinese-dominant model (Baichuan2-7B), architecture-matched to LLaMA-2-7B, reverses the gradient entirely, aligning best with Chinese brains and worst with English. Beyond training dominance, formal typological distance independently covaries with alignment degradation, syntax-associated brain regions (IFG) show 2.3× steeper typological gradients than lexico-semantic regions (PTL), and tokenization fertility accounts for 60% of a cross-linguistic shift in optimal encoding layer.
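One way to read the "steeper typological gradient" comparison is as a ratio of regression slopes: for each brain region, regress alignment on typological distance, then divide the IFG slope by the PTL slope. The sketch below assumes that reading, which the abstract does not confirm. The function name, the distance values, and the alignment scores are all made up for illustration and are not taken from the paper.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical typological distances from a model's dominant training language
# (illustrative units, not the paper's measure).
distances = [0.0, 1.0, 2.0]

# Hypothetical alignment scores per region at each distance (not paper data).
ifg_scores = [0.30, 0.24, 0.18]  # syntax-associated region
ptl_scores = [0.30, 0.27, 0.24]  # lexico-semantic region

ifg_slope = ols_slope(distances, ifg_scores)
ptl_slope = ols_slope(distances, ptl_scores)

# A ratio above 1 means alignment in IFG degrades faster with distance than in PTL.
ratio = ifg_slope / ptl_slope
print(f"IFG slope={ifg_slope:.3f}, PTL slope={ptl_slope:.3f}, ratio={ratio:.2f}")
```

With these toy numbers the IFG gradient is about twice as steep as the PTL gradient; the 2.3× figure in the abstract comes from the authors' own data and analysis.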

Original Abstract

Brain-LLM alignment is well established in English, yet the brain's language network is neuroanatomically universal across languages. Does alignment also generalize cross-linguistically, and what governs the variation? We test this using fMRI data from 112 participants across English, Chinese, and French (the Le Petit Prince corpus) and seven LLMs spanning English-dominant, Chinese-dominant, and multilingual architectures. Our central finding is that training-language dominance, not an inherent property of English, drives the alignment pattern: a Chinese-dominant model (Baichuan2-7B), architecture-matched to LLaMA-2-7B, reverses the gradient entirely, aligning best with Chinese brains and worst with English. Beyond training dominance, formal typological distance independently covaries with alignment degradation, syntax-associated brain regions (IFG) show 2.3× steeper typological gradients than lexico-semantic regions (PTL), and tokenization fertility accounts for 60% of a cross-linguistic shift in optimal encoding layer. These results reveal that the apparent "English advantage" in brain-LLM alignment is an artifact of training data composition, while the remaining variation reflects genuine typological structure concentrated in syntactic processing.
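Tokenization fertility is commonly defined as the average number of subword tokens a tokenizer produces per word. A minimal sketch of that definition follows; whether the paper computes it exactly this way is an assumption. `toy_tokenize` is a hypothetical stand-in that cuts words into fixed-size pieces and is not any real model's tokenizer.

```python
def fertility(words, tokenize):
    """Average number of subword tokens per word."""
    if not words:
        raise ValueError("need at least one word")
    return sum(len(tokenize(word)) for word in words) / len(words)


def toy_tokenize(word, piece_len=3):
    # Hypothetical tokenizer for illustration only: fixed-size character chunks.
    return [word[i:i + piece_len] for i in range(0, len(word), piece_len)]


words = "the little prince lived on a small planet".split()

# 13 pieces over 8 words
print(fertility(words, toy_tokenize))  # 1.625
```

In practice `tokenize` would be the tokenizer of the LLM under study. A higher fertility for one language means the same text is spread over more tokens, which is one plausible reason the best-fitting encoding layer could shift across languages.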