Detecting Logic Bugs in Database Engines via Equivalent Expression Transformation

Abstract

Database management systems (DBMSs) are crucial for storing and retrieving data. To improve their reliability, prior approaches detect logic bugs, which cause a DBMS to process data incorrectly, by manipulating queries and checking whether the results the DBMS returns match expectations. However, such query-level manipulation cannot handle complex query semantics, so these approaches must restrict the patterns of the queries they generate, which degrades testing effectiveness. In this paper, we tackle the problem with a finer-grained methodology, expression-level manipulation, which makes the proposed approach applicable to arbitrary queries. To find logic bugs in DBMSs, we design a novel and general approach, equivalent expression transformation (EET). Our core idea is that manipulating the expressions of a query in a semantics-preserving manner also preserves the semantics of the entire query, independent of query patterns. EET validates DBMSs by checking whether the transformed queries still produce the same results as the corresponding original queries. We implement our approach and evaluate it on five widely used and extensively tested DBMSs: MySQL, PostgreSQL, SQLite, ClickHouse, and TiDB. In total, EET found 66 unique bugs, 35 of which are logic bugs. We expect the generality and effectiveness of EET to inspire follow-up research and benefit the reliability of many DBMSs.
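
As a rough illustration of the core idea, the sketch below rewrites expressions inside a WHERE clause into equivalent forms and checks that SQLite returns the same rows for the original and the transformed query. The two rewrite rules, the helper names (`run`, `wrap_in_case`, `and_with_tautology`, `same_result`), the table `t`, and the sample queries are all assumptions made for this example; they are not taken from the abstract and should not be read as the EET implementation.

```python
import sqlite3
from collections import Counter


def run(conn, sql):
    """Return the result of a query as a multiset of rows (order-insensitive)."""
    return Counter(conn.execute(sql).fetchall())


def wrap_in_case(expr, pred):
    """Hypothetical rule: both CASE branches yield `expr`, so the value
    is the same whether `pred` evaluates to true, false, or NULL."""
    return f"(CASE WHEN {pred} THEN {expr} ELSE {expr} END)"


def and_with_tautology(bool_expr, pred):
    """Hypothetical rule: (p OR NOT p OR p IS NULL) is true under SQL's
    three-valued logic, and TRUE AND e has the same value as e."""
    always_true = f"(({pred}) OR (NOT ({pred})) OR (({pred}) IS NULL))"
    return f"{always_true} AND ({bool_expr})"


def same_result(conn, original_sql, transformed_sql):
    """Oracle: equivalent queries must return the same multiset of rows."""
    return run(conn, original_sql) == run(conn, transformed_sql)


# Small in-memory database with NULLs to exercise three-valued logic.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t(a INTEGER, b INTEGER);
    INSERT INTO t VALUES (1, 10), (2, NULL), (NULL, 30), (4, 40);
    """
)

original = "SELECT a, b FROM t WHERE a + 1 > 2"

# Transform the sub-expression `a + 1`, then the whole boolean predicate.
predicate = wrap_in_case("a + 1", "b > 15") + " > 2"
transformed = "SELECT a, b FROM t WHERE " + and_with_tautology(predicate, "a IS NULL")

print(same_result(conn, original, transformed))
```

If the two result sets ever differed, the pair of queries would be a candidate logic-bug report for the engine under test. Note that the rewrites operate on individual expressions, so the same check applies regardless of the surrounding query shape.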

Date
Apr 30, 2024 2:00 PM — 3:00 PM
Event
Weekly Talk
Location
NUS SoC