Pixel Language Models

Desmond Elliott
Associate Professor
Jonas F. Lotz
(co-advised with Christian Igel)
Emanuele Bugliarello
Ph.D. 2023. Now Research Scientist at Google

The overall goal of this line of research is to develop a new family of language models that can process any written language by rendering text as images, which allows the models to learn from the visual similarities between written languages. Realizing this goal involves creating and implementing tokenization-free multilingual language models, collecting and curating visually diverse language data, training small-scale and large-scale models, developing techniques for effective model quantization and compression, and creating models that jointly process natural images and rendered text through a single interface. The project is currently funded by a grant from the Villum Foundation.

Related Publications

2023
Text Rendering Strategies for Pixel Language Models.
Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, and Desmond Elliott.
Proceedings of EMNLP.
2023
PHD: Pixel-Based Language Modeling of Historical Documents.
Nadav Borenstein, Phillip Rust, Desmond Elliott, and Isabelle Augenstein.
Proceedings of EMNLP.
2023
Language Modelling with Pixels.
Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, and Desmond Elliott.
Proceedings of ICLR. Notable Top 5% Paper.