Hello, intrepid hackers! This code lab is designed to help you learn some of the ins and outs of extending various parts of CockroachDB's SQL layer. It's made for new and old employees alike. It will require a few hours of your time to complete, start to finish.
This lab will wend its way through all of the major parts of CockroachDB's SQL subsystem, touching the lexer, parser, AST, optimizer, plan tree, distributed runtime, as well as the column-oriented runtime.
By the time you complete it, you will understand the basic programming APIs of many of these subsystems, as well as how they fit together. Buckle up - we're about to write an entirely new (and plausibly useful) relational operator from scratch! It's called STEP.
Problem Statement: Implement STEP
The ultimate goal of this lab will be to add a completely novel SQL statement modifier, called `STEP n`, that changes a query to return every nth row. Here's what it would look like, from the perspective of a SQL client:
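For instance, given a hypothetical table t holding the integers 1 through 6 in a column a, and assuming STEP keeps the first row of each group of n (the table and output shape here are illustrative - this feature doesn't exist yet):

```sql
> SELECT a FROM t ORDER BY a STEP 2;
  a
-----
  1
  3
  5
(3 rows)
```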
For obvious contrast, here's what it looks like today:
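Today, a statement containing STEP doesn't parse at all - CockroachDB rejects it with something like this (the exact message text varies by version):

```sql
> SELECT a FROM t ORDER BY a STEP 2;
ERROR: at or near "step": syntax error
```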
If you've played with SQL before, you'll notice that our brand new STEP operator is similar to LIMIT and OFFSET - it comes at the end of a statement and modifies which rows of that statement are returned. This is a good thing! It means we can follow some of the patterns of how LIMIT and OFFSET are implemented throughout the codebase, and that's exactly what we'll be doing during this lab.
Parsing: Adding STEP to the Grammar and AST
The first thing we need to do is add our new keyword to the grammar that CockroachDB uses to parse statements. The lack of this keyword is what caused the syntax error we saw above: until now, STEP wasn't a keyword that CockroachDB understood at all.
CockroachDB's grammar is a YACC grammar. It's defined in
pkg/sql/parser/sql.y, so go ahead and open that file up in your editor - I'll wait.
Like we discussed, we're going to be following the pattern of LIMIT and OFFSET, so go ahead and search for those keywords in the file. Do you see what's going on? We've defined LIMIT and OFFSET as tokens, or parseable words, added rules for them, defining where the tokens can go in a statement, and finally associated those rules with actions, telling the YACC parser generator what code to generate when it sees them. For OFFSET, the rule looks like this:
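Reconstructed here in rough form (the exact action bodies in sql.y vary by version, so treat this as a sketch):

```yacc
offset_clause:
  OFFSET a_expr
  {
    $$.val = &tree.Limit{Offset: $2.expr()}
  }
| OFFSET c_expr row_or_rows
  {
    /* SQL:2008-style syntax: OFFSET <expr> { ROW | ROWS } */
    $$.val = &tree.Limit{Offset: $2.expr()}
  }
```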
This is saying that an offset_clause can either be the word OFFSET followed by an a_expr, the rule for a general relational expression in this grammar, or the word OFFSET followed by a c_expr (a restricted form of a_expr) followed by row_or_rows, a rule that expands to either the word ROW or the word ROWS - a testament to just how flexible SQL's syntax can sometimes be.
$$ represents the output value of the current rule, $1 the value of the first symbol in the current rule, $2 the value of the second symbol, etc.
- STEP only needs to be added to SELECT statements, not DELETE or UPDATE.
- Remember to add the STEP keyword to one of the keyword category lists at the end of the grammar (this has to be done whenever a new keyword is added).
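Putting those pieces together, a sketch of the new grammar additions might look something like this - the rule name step_clause and the Step field on the AST node are our own inventions here; you'll pick the real names as you wire things up:

```yacc
%token <str> STEP  /* declare the new token */

step_clause:
  STEP a_expr
  {
    /* hypothetical: store the STEP expression on the statement's AST */
    $$.val = &tree.Limit{Step: $2.expr()}
  }
```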
Type-Checking: Ensuring STEP's arguments are typed as integers
Planning: Adding STEP to the Optimizer, so that it can produce the correct query plan
Execution: Implementing the STEP algorithm in a DistSQL Processor
Now, it's time to implement the STEP algorithm in a DistSQL processor. DistSQL is what we call the row-at-a-time, distributed SQL execution engine in CockroachDB. Implementing a DistSQL processor for a relational operator like STEP allows the system to schedule SQL execution on a node that's closest to where the data lives.
LIMIT and OFFSET are both somewhat special in DistSQL - they're implemented as part of the post-process spec for a processor. Post-processing is a phase that does rendering (calculation of scalar expressions), projections (shuffling or removing of columns), and limit and offset calculations. Because of this, we won't be following the LIMIT and OFFSET template too closely.
Instead, to implement STEP, you should try to copy the noopProcessor, a simple DistSQL processor that passes rows unmodified from its parent, and modify it to implement the algorithm. The noopProcessor is located in pkg/sql/distsqlrun/noop.go. As you can see, the basic loop of a DistSQL processor is implemented in its Next method - and that's the method we'll need to copy, with a few modifications, to implement STEP.
Go ahead and make a new file in the same directory called step.go. You can copy what's in noop.go to start, renaming the noopProcessor to a stepProcessor. You'll also need to extend the constructor to include the argument to STEP - for now, it's okay to just make it an integer. Later, when we implement DistSQL physical planning for STEP, we'll change that integer to be a full protobuf message called StepSpec, but there's no need to do that for now.
Now, implement the algorithm! You'll need to add a little bit of state, since
Next is called once per output row, like a generator. You can put the state that you need into the
stepProcessor struct you added.
Testing the Processor
How'd it go? The STEP algorithm is relatively simple, but as we all know, writing software is hard! We should write some unit tests for our new processor. To unit test the processor, add a new file called step_test.go in the same distsqlrun directory that you put step.go into. Write some tests there, following the model of another processor's test suite in that directory. Try to think of some corner cases (what if n is bigger than the total number of rows in the input? Smaller? What if the input has no rows? etc.) and do your best to thoroughly test the processor.
Benchmarking the Processor
No processor implementation is complete without a microbenchmark - an isolated test that evaluates the throughput of the processor when acting on in-memory data. The purpose of such a test is to gather the "speed-of-light" throughput for a processor - in other words, the amount of data that it can process on a single thread without any overhead from other processors, network or disk latency.
For this, you should copy and modify the example benchmark in noop_test.go. Add benchmarks with a couple of different values of n, so it's possible to reason about what n might do to the performance of our processor.
Columnar Execution: Implementing the STEP algorithm in a columnar operator
Next, let's add an implementation of
STEP as a columnar operator in the
pkg/sql/exec package. You might be wondering - why do we need to add two implementations of
STEP? The reason is that (as of Jan 2019) the columnar execution engine is still an experimental feature, so it's implemented as a wholly separate engine from the default DistSQL engine. The columnar model does have the potential to make operators much more efficient, though, so we'll implement
STEP in this model to showcase the potential speed increases.
Implement STEP in a new file in pkg/sql/exec/, the directory that contains the columnar execution engine, called (surprise!) step.go. In the columnar model, LIMIT and OFFSET are implemented as normal operators, so you can clone them to start. Go ahead and start by cloning limit.go and changing the names as you go.
You'll notice that the structure of
limitOp is somewhat similar to the DistSQL processor we wrote earlier - it consumes something from a parent operator, does some work, and returns something else. The difference is that, instead of consuming rows one at a time from a parent, and returning a single row per invocation, the operators in
exec consume something called a
ColBatch and return another
ColBatch. The idea here is that, by processing a batch of rows at a time, oriented by column, we can waste a lot less processor time doing repeated interface lookups, type switches, and method calls. If you're not yet convinced, that's okay! Hopefully, you'll become more convinced after we write and run a benchmark for our new operator.
Go ahead and implement the operator now. One important detail that you'll need to learn about is the selection vector, which is how we filter out rows of data in the columnar model. The selection vector is briefly documented on the
ColBatch interface, but the idea is that, since we're processing a batch of data at once, it would be great if we could avoid having to mutate that batch if we're just filtering out rows - preventing an
O(n) re-ordering of the data if we decided to remove a row in the middle of the batch. We get around this by writing to this selection vector - a densely-packed vector of integers that contains the indexes of the rows that have not been filtered out already.
You'll also need to have a switch in your operator's Next method, unlike limit's, that does something different depending on whether or not a previous operator set a selection vector. Finally, you'll also need to keep some state, like before, to keep track of the sequence number of the rows you've seen across batches - go ahead and stick that state into your operator's struct.
Just like before, please write some tests for your new operator, following the framework example seen in
limit_test.go. The test framework does some nice stuff for us, like automatically running all of your cases with several batch sizes, to avoid bugs caused by inconvenient batch sizes.
Now, finally, it's time for some payoff! Write a benchmark for the operator, copying the example in
distinct_test.go. Running it and comparing to the benchmark you made earlier should demonstrate some of the potential gains that the columnar model makes available. Note that the gains in this comparison are limited to what you get by batching data and using the selection vector - we don't get any performance gains from the power of type templating to avoid type switches, since
STEP never has to introspect any columns.