Equivalent Query Rewriting in Graph Database without Hard-code Rules

Abstract

Graph Database Management Systems (GDBMS), which utilize graph models for data storage and execute queries via graph traversals, have seen ubiquitous usage in real-world scenarios such as recommendation systems, knowledge graphs, and social networks. Much like Relational Database Management Systems (RDBMS), GDBMS are not immune to bugs. These bugs typically manifest as logic errors that yield incorrect results (e.g., omitting a node that should be included), performance bugs that lead to sub-optimal performance (e.g., long execution time caused by redundant graph scanning), and exception issues (e.g., unexpected or missing exceptions). This paper adapts Equivalent Query Rewriting (EQR), a high-level, widely applicable concept, to GDBMS testing.} Its core idea is to rewrite a GDBMS query into equivalent ones that trigger distinct query plans, and check whether they exhibit discrepancies in system behaviors. To facilitate the realization of EQR, we propose a general concept called Abstract Syntax Graph (ASG), which embeds the semantics of a query in the paths of a graph. Given a base query, an ASG will be constructed and then an equivalent query can be generated by finding paths collectively carrying the complete semantics of the base query. To this end, we further design Random Walk Covering (RWC), a simple yet effective path-covering algorithm. As a practical implementation of these ideas, we develop a tool GRev, which has successfully detected 22 previously unknown bugs across 5 popular GDBMS, with 15 of them being confirmed. In particular, 14 detected bugs are related to improper implementation of graph data retrieval in GDBMS, which is challenging to identify for existing techniques.

Date
Jan 16, 2024 2:00 PM — 3:00 PM
Event
Weekly Talk
Location
NUS SoC
Qiuyang Mang
Qiuyang Mang
Undergraduate Intern

Qiuyang Mang is an undergraduate student who was doing internship at the lab for half a year.