The overall goal of this line of research is to develop a new family of language models that can process any written language by rendering text as images, allowing the models to learn from the visual similarities between writing systems. Realizing this goal involves designing and implementing tokenization-free multilingual language models, collecting and curating visually diverse language data, training models at both small and large scale, developing techniques for effective model quantization and compression, and building models that jointly process natural images and rendered text through a single interface. The project is currently funded by a grant from the Villum Foundation.
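To make the core idea concrete, the sketch below renders a string into a grayscale image and slices it into fixed-size patches, the "token-free" input sequence a pixel language model would consume in place of subword tokens. It is a minimal illustration under stated assumptions (PIL-based rendering, a 16-pixel patch size, and hypothetical helper names `render_text` and `to_patches`), not the project's actual rendering pipeline.

```python
# Minimal sketch, assuming a PIL-based renderer; the project's real pipeline,
# font, and patch size may differ. render_text and to_patches are hypothetical
# helper names introduced for illustration only.
from PIL import Image, ImageDraw, ImageFont
import numpy as np

PATCH_SIZE = 16  # assumed square patch size in pixels

def render_text(text: str, width: int = 512) -> np.ndarray:
    """Render one line of text onto a grayscale canvas, normalized to [0, 1]."""
    img = Image.new("L", (width, PATCH_SIZE), color=255)  # white background
    draw = ImageDraw.Draw(img)
    draw.text((0, 0), text, fill=0, font=ImageFont.load_default())
    return np.asarray(img, dtype=np.float32) / 255.0

def to_patches(pixels: np.ndarray) -> np.ndarray:
    """Slice the rendered line into a sequence of flattened square patches."""
    height, width = pixels.shape
    n = width // PATCH_SIZE
    patches = pixels[:, : n * PATCH_SIZE].reshape(height, n, PATCH_SIZE)
    return patches.transpose(1, 0, 2).reshape(n, -1)

# The patch sequence plays the role that subword tokens play in a
# conventional language model: shape (num_patches, PATCH_SIZE * PATCH_SIZE).
patches = to_patches(render_text("language models without tokenizers"))
print(patches.shape)  # (32, 256)
```

Because the input is pixels rather than vocabulary indices, the same model can in principle encode any script a font can render, which is what lets visually similar writing systems share representations.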
2023. Text Rendering Strategies for Pixel Language Models. Proceedings of EMNLP.
2023. PHD: Pixel-Based Language Modeling of Historical Documents. Proceedings of EMNLP.
2023. Language Modelling with Pixels. Proceedings of ICLR. Notable Top 5% Paper.