Weekly Talk

A Benchmark Harness for Query Execution Correctness Verification and Query Optimizer Evaluation of Database Systems

Query engines, which comprise query optimizers and query executors, are the cornerstone of any relational database. Database developers need a tool to detect query execution bugs and evaluate the query optimizer …

Efficient and Scalable Distributed LLM Training: Hiding Communication Overhead

Training Large Language Models (LLMs) is often inefficient due to high communication overhead, resulting in sub-50% Model FLOPS Utilization (MFU). In this talk, I will discuss how to build a cost-efficient and scalable machine learning system, using …

Type Systems for Query Languages

In this talk, I will introduce type systems for query languages, with a focus on SQL and GQL. Practical SQL engines exhibit subtle differences in their handling of typing constraints and implicit type casts, often overlooked in formal accounts of …

SGL: Deriving Test Case Generators using Domain-Specific Language to Test Database Engines

Various automated testing approaches have been proposed for Database Management Systems (DBMS), which can automatically detect different kinds of bugs such as logic and performance bugs. Such approaches typically compare the results of executing two …

Automated Test Case Reduction in Query-Specific Language(s)

Database testing tools like SQLsmith and SQLancer generate lengthy test cases to identify several categories of database bugs. While these tools are effective in identifying issues, the resulting test cases are usually large and complex, making it difficult …

Improving the Extensibility of SQLancer

SQLancer, an open-source tool for testing database management systems (DBMS), is instrumental in uncovering bugs within real-world applications. However, maintaining SQLancer has become increasingly challenging due to tightly coupled components, …

CodeGRITS: A Research Toolkit for Developer Behavior and Eye Tracking in IDE

Traditional methodologies for exploring programmers’ behaviors have primarily focused on capturing their actions within the Integrated Development Environment (IDE), offering a limited view into their cognitive processes. Recent emerging work has started …

Automatically Generating an Abstract Interpretation-Based Optimizer from a DSL (SPLASH SRC Practice)

Just-in-Time (JIT) compilers can gain information at run time that is not available to Ahead-of-Time (AOT) compilers. As such, abstract interpretation baseline JIT compilers are common in many dynamic language implementations. Yet the reference …

Are Deep Reinforcement Learning Implementations Really Interchangeable?

Deep Reinforcement Learning (DRL) is a paradigm of artificial intelligence where an agent uses a neural network to learn which actions to take in a given environment. DRL has recently gained traction for its ability to solve complex environments like …

Detecting Build Dependency Errors in Incremental Builds

Incremental and parallel builds performed by build tools such as Make are at the heart of modern C/C++ software projects. Their correct and efficient execution depends on build scripts. However, build scripts are prone to errors. The most prevalent …