Gibbon is an experimental compiler that transforms high-level functional programs to operate on serialized data. See the README on GitHub to get started with Gibbon.
Typically, programs that process tree-like data represent trees using pointer-based data structures in memory (one heap object per-leaf and per-node) because such a layout is convenient to manipulate in a high-level programming language. This is also generally distinct from the representation of the data in serialized form on disk, which means that a program must perform some sort or marshaling when working with serialized data. Gibbon unifies the in-memory and serialized formats, transforming recursive functions to operate directly on serialized data.
Additionally, while the pointer-based structure is efficient for random access and shape-changing modifications, it can be inefficient for traversals that process most or all of a tree in bulk. The Gibbon project aims to explore optimizations of recursive tree transforms by changing how trees are stored in memory.
Currently, the Gibbon compiler has multiple front-ends: an S-expression syntax similar to Typed Racket, and a small subset of Haskell.
Publications
Peer-reviewed papers
| ECOOP’17 | Compiling Tree Transforms to Operate on Packed Representations: Michael Vollmer, Sarah Spall, Buddhika Chamith, Laith Sakka, Chaitanya Koparkar, Milind Kulkarni, Sam Tobin-Hochstadt, Ryan Newton [PDF] |
| PLDI’19 | LoCal: A Language for Programs Operating on Serialized Data: Michael Vollmer, Chaitanya Koparkar, Mike Rainey, Laith Sakka, Milind Kulkarni, Ryan R. Newton [PDF] [extended version] |
| ICFP’21 | Efficient Tree-Traversals: Reconciling Parallelism and Dense Data Representations: Chaitanya Koparkar, Mike Rainey, Michael Vollmer, Milind Kulkarni, Ryan R. Newton [PDF] |
| ISMM’24 | Garbage Collection for Mostly Serialized Heaps: Chaitanya Koparkar, Vidush Singhal, Aditya Gupta, Mike Rainey, Michael Vollmer, Artem Pelenitsyn, Sam Tobin-Hochstadt, Milind Kulkarni, Ryan R. Newton [PDF] |
| ECOOP’24 | Optimizing Layout of Recursive Datatypes with Marmoset: Vidush Singhal, Chaitanya Koparkar, Joseph Zullo, Artem Pelenitsyn, Michael Vollmer, Mike Rainey, Ryan Newton, Milind Kulkarni [PDF] [extended version] |
Dissertations
-
A Language-based Approach to Programming with Serialized Data by Michael Vollmer (2021)
-
Mostly-Serialized Data Structures for Parallel and General-Purpose Programming by Chaitanya Koparkar (2023)
Videos
-
ECOOP 2017: Compiling tree transforms to operate on packed representations (Michael Vollmer)
-
FHPC 2018: Gibbon: A Compiler for Recursive Functions on mostly Serialized Data (Chaitanya Koparkar)
-
PLDI 2019: LoCal: A Language for Programs Operating on Serialized Data (Michael Vollmer)
-
ICFP 2021: Efficient Tree-Traversals: Reconciling Parallelism and Dense Data Representations (Chaitanya Koparkar)
-
ISMM 2024: Garbage Collection for Mostly Serialized Heaps (Ryan Newton)