Unlocking Out-of-Distribution Generalization in Transformers via Recursive Latent Space Reasoning Paper • 2510.14095 • Published Oct 15
Disentangling and Integrating Relational and Sensory Information in Transformer Architectures Paper • 2405.16727 • Published May 26, 2024