Abstract
How does the brain transform words into meaning? By aligning insights from linguistics, neuroscience, and Large Language Models (LLMs), we observe that AI models and the human brain surprisingly converge on similar representational principles. Using neuroimaging and electrophysiology, we find that as LLMs improve at language tasks, their internal activations increasingly mirror cortical activity, to the point that meaning can be decoded directly from brain signals. Building on these results, we outline a roadmap to uncover the neural code of language: (1) a benchmark dataset of brain recordings to build a "Rosetta Stone" across humans and models, (2) unique intracranial data from young children to characterize the computational principles of language development, and (3) a mathematical framework to understand the geometry of neural representations of symbolic structures. Together, this research program moves us closer to deciphering how the human brain learns, represents, and manipulates the structures of language.