Inconsistencies in TeX-produced Documents

Abstract

TeX is a widely used typesetting system adopted by most publishers and professional societies due to its versatility and formatting capabilities. Although TeX is used to produce a large number of documents, irregularities in the TeX ecosystem may produce inconsistent documents, resulting in failures to adhere to formatting specifications, or in the same document rendering differently for different authors. In this work, we investigate and quantify the robustness of the TeX ecosystem through a large-scale study of 432 documents. We developed an automated pipeline to evaluate cross-engine and cross-version compatibility. We found significant inconsistencies in the outputs of different TeX engines: only 0.2% of documents compiled to identical output with XeTeX and pdfTeX, owing to a lack of cross-engine support in popular LaTeX packages and in document classes used by academic conferences. A smaller, but still significant, degree of inconsistency was found across different TeX Live distributions, with only 42.1% of documents producing the same output from 2020 to 2023. From a sample of 10 unique root causes of inconsistencies, we identified two new bugs in LaTeX packages and five existing bugs that were fixed independently of this study. We also observed potentially unintended inconsistencies across different versions of the TeX Live distribution, beyond the updates listed in changelogs. We expect that this study will help authors of TeX documents understand the often undocumented differences between TeX engines and how their documents may be affected by updates in the TeX ecosystem, thus avoiding unexpected outcomes. This work may also benefit developers by improving their understanding of how different implementations lead to unintended differences, and of the typical inconsistencies that may occur.

Date
Apr 9, 2024 2:00 PM — 3:00 PM
Event
Weekly Talk
Location
NUS SoC
Jovyn Tan
Undergraduate Student

Jovyn is working on the automated testing of TeX engines as a final-year project.