Jargon: CockroachDB words for newcomers

saasSEO: glossary, terms, vocabulary, dictionary

This page is https://go.crdb.dev/jargon

This page focuses on the words and acronyms you may encounter in meetings, documents, or Slack threads, but won't necessarily find explained on our blog. Why might you need it?

  • you want to understand the chatter between devs

  • you’re in a meeting and somebody uses an acronym you’re not familiar with

  • you're an external contributor to CockroachDB and want to ensure you're not missing anything

  • you're just starting at Cockroach Labs

  • you've been at Cockroach Labs for a while but somehow missed an explanation and you're too afraid to ask

  • you’re a tech person who’s getting exposed to a lot of business jargon you haven’t encountered before, or a salesperson getting exposed to a lot of tech jargon

If you wish to have some more words explained, just ask!

The definitions should be given in Simple English. If you find them difficult to understand, ask for clarification!

  • 2DC: two data centers, a common disaster recovery (DR) or high availability (HA) configuration in the days before distributed systems

  • 2FA: two-factor authentication

  • ACID: the 4 guarantees of a transactional database are Atomicity, Consistency, Isolation, Durability

  • ACV: Annual Contract Value, a customer metric

  • aggregation: an operation that a client app can perform in SQL to simplify a lot of data into a simple result (e.g. counting)

  • AOST: AS OF SYSTEM TIME, a SQL clause in CockroachDB for reading data as it existed at a past timestamp

  • Aphyr: the usual name Kyle Kingsbury goes by

  • Attrition: employees leaving the company. “regretted attrition” is employees leaving of their own accord; “unregretted attrition” is “a label assigned to former employees that the company does not wish to rehire”

  • ARR: Annual Recurring Revenue (also see CARR) – the subscription revenue of a given period expressed as an annual run rate for all contracts with revenue recognition dates prior to the period close date.

  • ASP: Average Selling Price

  • AWS (Amazon Web Services): Amazon's Cloud hosting

  • Azure: Microsoft's Cloud hosting

  • BDR: Business Development Rep, typically characterised as “outbound / prospecting”; related to (but not the same as) SDR

  • Bikeshed: many engineers spending a lot of time debating a minor issue - see the story here

  • BSL: Business Source License, a license applied to part of our codebase. Code licensed under the BSL becomes automatically re-licensed under the Apache License (open source) after 3 years. See also CCL.

  • CAGR: Compound Annual Growth Rate, the annualized average rate of revenue growth between two given years, assuming growth takes place at an exponentially compounded rate

  • CalVer: calendar versioning scheme for CockroachDB. See here for details.

  • CAP Theorem: a theorem that says a distributed database can only deliver two out of three of Consistency, Availability, Partition tolerance

  • CARR: Contracted Annual Recurring Revenue. CARR includes the ARR of new customers that are not yet live because the customer onboarding process is not yet complete.

  • Cassandra: another DB product we hear about often

  • CBO: Cost-based Optimizer

  • CC: CockroachCloud

  • CCL: Cockroach Community License, a license applied to part of our codebase that corresponds to “Enterprise” features. See also BSL.

  • CDC: acronym for “Change Data Capture”. CockroachDB’s version of CDC is called change feeds, see definition below.

  • CEA: Cockroach Enterprise Architect

  • Change feeds: a way for a user to ask the database to ping the user (or an app, i.e. a 3rd party) back when some data changes. The 3rd party is (usually) notified of the changed data asynchronously, i.e. possibly not atomically with the transaction where the change occurs. See also "Trigger".

  • Chaos: testing method that stops nodes in a test cluster unpredictably

  • Chaos monkey: program that performs chaos testing

  • CI (Continuous integration): program that runs tests and produces reports automatically in the background

  • Cloud: someone else's computer

  • Cluster: An ambiguous term, especially in the context of CockroachCloud. Generally, one of:

    • [CRDB/Cockroach/CockroachDB] Cluster: one deployment of CockroachDB, a group of one or more nodes (servers)

    • [Kubernetes] Cluster: A group of one or more Kubernetes nodes, usually including a leader node.

    • [CC/CockroachCloud] Cluster: One or more Kubernetes Clusters running a single deployment of CockroachDB

    • [Host/Serverless] Cluster: One deployment of CockroachDB, usually on Kubernetes, that contains many Tenants.

  • CNI: Container Network Interface

  • CockroachCloud: a hosting service provided by CRL, where CRL runs CockroachDB clusters on behalf of customers.

  • Code review: A process by which a second engineer (or more) reviews code before it is merged into the main codebase. At CRL, every code change must be reviewed and approved (see LGTM).

  • Code yellow: moving an issue to top company priority (idea comes from Google). During a code yellow, any task pertaining to the code yellow takes precedence over tasks unrelated to it.

  • CQL: Conversation Qualified Lead (as opposed, say, to a Sales Qualified Lead)

  • CRL: acronym for Cockroach Labs

  • CRUD: Create, Read, Update, Delete … basic operations on a database object

  • CSM: Customer Success Management

  • CTAS: SQL programmer shorthand for CREATE TABLE AS SELECT

  • CTE: Common Table Expression, a named subquery introduced with the SQL keyword WITH

  • Cutting the release: selecting one particular version of the product to publish

  • Data sovereignty: the requirement for some apps/companies to keep data in specific geographic locations; for example, data about EU citizens must be hosted in the EU

  • DDL (Data Definition Language): the part of SQL that apps can use to manage tables and indexes (the schema), e.g. create/rename/delete them. This includes e.g. "CREATE TABLE" but does not include "SELECT". See also "DML."

  • Delta: an incremental change, such as between git commits or incremental backups.

  • Denormalization: An explicit copy of some normalized data in a different format, in order to enable faster access. "Denormalized data" = indexes, materialized views, etc. --- all the stuff that copies "base" data into a different format for speed on operations that aren't by primary key.

  • DLQ: Dead Letter Queue

  • DML (Data Manipulation Language): the part of SQL that apps can use to read and write data, e.g. query tables or update table rows. This includes e.g. "SELECT" or "INSERT" but does not include "CREATE TABLE". See also "DDL."

  • ELA: Enterprise License Agreement

  • EMEA: Europe, Middle East, and Africa

  • Encryption at rest: keeping the data encrypted while it is stored on disk, not only while it travels between clients and the database

  • ETL: Extract / Transform / Load … a type of data integration

  • Experimental: a label indicating a CRDB feature or command is still under active development, and its behavior and/or UI is still subject to change

  • FCF: Free Cash Flow, the cash a company generates after accounting for cash outflows to support operations and maintain its capital assets

  • FDW: Foreign Data Wrapper, a PostgreSQL extension for accessing a table or schema in one database from another

  • FHMP: Forever hold my peace, a neutral way to bow out of a debate / code review

  • FMEA: Failure Mode & Effects Analysis

  • FTS: Full Table Scan

  • GC: garbage collection, the process of actually deleting and cleaning up items that have been marked for deletion

  • GCE: Google Compute Engine, the compute service (VMs) offered by GCP

  • GCP: Google Cloud Platform, Google's Cloud services

  • GDPR: General Data Protection Regulation, the EU's data protection and privacy law

  • Geospatial index: An index that is efficient for storing 2d coordinates (such as lat/long) such that two points on the coordinate system that are close on the (lat/long) map are stored relatively close together in the index ordering. The uniqueness of the Geospatial index is in maintaining the "closeness" when going down from 2 dimensions (lat/long) to one dimension (the index). Usually achieved with a space filling curve

  • GIS (Geographical Information System): A system optimized for geographical data, makes heavy use of geospatial indexes, but also spatial-temporal indexes (combination of geospatial data and time series data), and also can understand and digest data stored in standard formats used for geographic data.

  • Git: a tool and database to store and share source code

  • HLC: Hybrid Logical Clock

  • HTAP: Hybrid Transactional/Analytical Processing. I.e. OLAP + OLTP.

  • IHAC: I Have A Customer, often used as an intro to a question posted to Slack

  • ILM: Information Lifecycle Management, “a wide-ranging set of strategies for administering storage systems on computing devices”

  • Index: A copy of some parts of a database table, ordered to make lookups very quick according to the index columns. There is always a "primary index", ordered by primary key, making lookups of a row if you know the primary key very fast. Other indexes are called "secondary indexes", and are ordered by some other criteria (could be some other columns, or even combinations of columns, or even combinations of columns from different tables). An index is a denormalization.

  • Jepsen: a tool that tests databases in a harsh way, made by Aphyr; also the name of Aphyr's blog about database testing

  • K8s: shorthand for Kubernetes

  • KMS: (AWS) Key Management Service

  • LB: Load Balancer (see also NLB)

  • LGTM: Short for "looks good to me", the typical way one says they approve of a PR at the end of a code review, okaying it for merge. LGTM doesn't necessarily mean you can hit the merge button right away: for instance, "LGTM, if you fix XYZ" still means XYZ should be done. The reviewer is just trusting you to do so, and doesn't necessarily need to verify it (if XYZ is trivial).

  • LOQ: Loss of Quorum. In the Raft consensus protocol, “quorum” is the number of voters necessary to make a decision. Loss of Quorum is when there aren’t enough voters available, and no consensus can be reached.

  • LSM: Log Structured Merge Tree, a data structure used by Pebble/CRDB (as well as RocksDB and others) for organizing storage on disk. Alternative to the B+ trees traditionally used in databases. A large topic, read more here or on Wikipedia.

  • Materialized view: a SQL view where the data of the view is duplicated from the original table (as opposed to a simple/non-materialized view, where a query on the view is automatically translated to a query on the underlying table). Materialized views are useful when the query that creates the view is complex and the view is read more often than the data is changed, because it then saves SQL execution time for the clients using the view.

  • MBO: Management By Objectives, a management model that clearly defines objectives that are agreed to by both management and employees

  • Merge: the action of accepting a PR to the main product

  • Merge skew: An error condition encountered by the entire team when two changes on the git repository get merged concurrently and the result of both changes together causes CockroachDB to break. We use the ‘bors’ merge automation bot to prevent concurrent merges and thus merge skews. Merge skews can also be avoided by a healthy regimen of regularly rebasing, i.e. re-creating the patches on top of the latest git revision.

  • Mongo: short for MongoDB, another DB product we hear about often

  • MoSCoW: a hierarchy of requirements … Must have, Should have, Could have, Won't have (this time)

  • MPP: Massively Parallel Processing

  • MRC: Monthly Recurring Charge

  • MSA: Master Service Agreement

  • Multi-tenant CockroachDB: an extension of the base CockroachDB architecture where multiple customers (“tenants”) can use the same CockroachDB cluster safely. A Multi-tenant cluster has two kinds of “nodes”: KV nodes that store the data for all tenants, and SQL nodes that are specific to each tenant.

  • MVCC: Multi-Version Concurrency Control. We use it in CockroachDB. A large topic in database concurrency models, see our docs or read more on Wikipedia.

  • MVP: Minimum Viable Product, an early version of a product with just enough features to be usable. Used at Cockroach Labs to refer to the initial release of a new CockroachDB capability, still considered “beta” or “experimental”.

  • NDR: Net Dollar Retention, a SaaS metric that measures how much monthly or annual recurring revenue has grown or shrunk (aka “churned”) from existing customers

  • NLB: Network Load Balancer

  • Node: (CockroachDB node) one instance of a CockroachDB server process, that stores part of the data in an entire CockroachDB cluster. A single cluster can have many nodes. In a serverless (multi-tenant) cluster, we don’t use the word “node” alone and instead specify “KV node” or “SQL pod”.

  • Node.js: a JavaScript runtime that makes it possible to write server programs in JavaScript. Completely unrelated to the notion of CockroachDB node.

  • Normalization: the process of reducing copies of data as much as possible so that there aren't too many logical copies of the same information (as that would increase the possibility of errors if some copies are updated without updating all copies). See Wikipedia. Usually contrasted with explicit denormalization.

  • NPS: Net Promoter Score, a metric indicating customer satisfaction / loyalty

  • NRC: Non-Recurring Charge

  • ODS: Operational Data Store

  • OIDC: OpenID Connect, an authentication protocol built on top of OAuth 2.0

  • OKRAs: Objectives, Key Results & Actions

  • OLAP (Online Analytical Processing): a class of applications where the most common queries are long and touch most of the data at a time with complex computations -- contrast with OLTP

  • OLTP (Online Transaction Processing): a class of applications where the most common queries are short and touch a bit of data at a time with simple computations -- contrast with OLAP

  • OOM: Out Of Memory … an undesirable condition for a CockroachDB node

  • ORM (Object-Relational Mapping): a piece of software used by an app to access a DB

  • Pebble: CockroachDB’s storage engine, an embedded key-value store written in Go and inspired by RocksDB. See our blog post.

  • P2S: Problems to Solve

  • PMF: Product Market Fit

  • PR (Pull Request): a proposal for a change to the source code submitted for review to colleagues. See "merge"

  • PRD: product requirements document

  • PQ: a PostgreSQL driver written in Go that we use. For more background on database drivers, check out this Quora post.

  • PTAL: Please Take Another Look, occasionally shows up in Slack discussions

  • PTS: Protected Time Stamp

  • RACI matrix: the list of who’s Responsible, Accountable, Consulted, and Informed

  • Range: a logical portion of the data in a DB. In other distributed databases called a "shard", "chunk", or "tablet". Each range can have multiple physical copies, on different nodes. Each copy is called a “replica”.

  • RBAC: Role-based access control

  • RCA: Root Cause Analysis

  • RDS: (Amazon’s) Relational Database Service

  • Rebase: Take a git commit (the set of line-by-line changes) and apply those deltas to a different commit. Usually done because "master" moved on while you were working on a change, and now your commit won't merge cleanly. Usually a good habit to do before merging anyway, as not all conflicts are caught by git. See "merge skew".

  • Reg cluster: short for "registration cluster", a CockroachDB cluster run by CRL internally to store telemetry data sent by customers.

  • Reg server: an internet microservice run by CRL that receives telemetry data sent by customers.

  • Replication factor: how many copies there are of each Range in a DB or Zone. Default is 3.

  • Replica: one of the copies of some Range in a DB or Zone. Each range has as many replicas across the cluster as its replication factor.

  • RFAL: ready for another look; typically refers to a PR which has been revised after review comment(s).

  • RFC: Short for "Request for comments". An RFC is a description of a larger change, written before the code is written, that outlines why the change is needed, what it will consist of, etc. Used to build consensus around a design before spending weeks writing code. This lets issues and ideas surface early, so that time is not wasted writing code only to discover in a code review that there was a much better way to do things.

  • RPO / RTO: Recovery Point Objective / Recovery Time Objective … how much recent data you can afford to lose (RPO) and how long restoring service may take (RTO), two key measures in a Disaster Recovery plan

  • SDR: Sales Development Rep, a role (though not necessarily a job title) in Sales, typically characterized as “inbound / follow-up”; related to (but not the same as) BDR

  • SemVer: semantic versioning scheme, which we used previously for CockroachDB. We now use CalVer instead.

  • Server: something running in the backend of a client-server architecture. We don’t like the word “server” too much at CRL because it is very ambiguous. It can designate too many things! We have CockroachDB “servers” (KV nodes and SQL pods), CockroachCloud “servers” (monitoring, console, orchestration), testing “servers” etc.

  • SH: often refers to self-hosted customers

  • SIAM: Service Integration and Management

  • SIEM: Security Information and Event Management (sometimes Security Incident and Event Management)

  • SIG: Severe Incidents Group

  • SLA: Service Level Agreement, the agreement you make with clients/users

  • SLI: Service Level Indicator, the measure of service performance

  • SLO: Service Level Objective, the objective you have to hit to meet the SLA

  • Spanner (Google product): another project we get inspiration from

  • SQL: usually means Structured Query Language, but salesfolk also use it to mean Sales Qualified Lead

  • SRE: Site Reliability Engineering (or Engineer), a discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems, with the goal of creating scalable and reliable software systems

  • SSOT: Single Source of Truth

  • SST: Sorted String Table, the immutable on-disk file format used by LSM storage engines such as Pebble and RocksDB

  • TAM: Total Addressable Market, shows up mostly in marketing docs and in reports to the Board

  • TAM: Technical Account Manager, what a lot of other companies call the role of CEA

  • TBI: Time-Bound Iterator

  • TCO: Total Cost of Ownership, a measure of particular interest to purchasers. TCO typically includes one-time fees and recurring fees for licensing, maintenance, support, etc.

  • TeamCity: one of our continuous integration tools

  • TFTR: thanks for the review; typically refers to thanking a PR reviewer for their time.

  • Time series: a way to organize data in a DB where the data is organized primarily by time; commonly used to store events over time; sometimes subject to OLAP applications

  • TLS: Transport Layer Security (successor to Secure Sockets Layer)

  • TPC-C: a database benchmark developed in 1992 to simulate a traditional OLTP application (i.e. wholesale distributor w/ warehouses and retail locations). See also TPC-E.

  • TPC-E: a database benchmark developed in 2007 to simulate a more modern OLTP application (i.e. a stock brokerage w/ fluctuating stock prices and many concurrent orders). See also TPC-C.

  • Trigger: a way for a user to ask the database to ping the user (or an app) back when some data changes, or run a stored procedure on the server. The trigger (usually) happens before the transaction commits, so that the consumer can see the changes atomically. See also "Change feed".

  • TRN: Technical Roadmap Narrative

  • TSE: Technical Support Engineer

  • TTL: Time To Live, generally the configurable longevity for an object. Most frequently encountered in CRDB as a GC TTL, the time for old MVCC values to live before being garbage collected

  • UAT: User Acceptance Testing

  • WAL: Write-Ahead Logging

  • XA Transactions: distributed transactions that use the two-phase commit protocol defined by the XA (eXtended Architecture) standard

  • YCSB: Yahoo Cloud Serving Benchmark, designed to measure NoSQL and other cloud service databases.

  • Zone config/zones: CockroachDB's way to apply different configuration parameters to different parts of a cluster. Can be used to set constraints on replication (e.g. at least one copy must be on a different continent) or for data sovereignty (e.g. no copies of this data should reside outside EU territory)
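
The arithmetic behind the "LOQ" and "Replication factor" entries above can be sketched in a few lines of Python. This is only an illustration that assumes simple majority voting, as in Raft; the function names are hypothetical and are not CockroachDB APIs.

```python
def quorum_size(voters: int) -> int:
    """Smallest majority among `voters` voting replicas (assumes majority voting)."""
    if voters < 1:
        raise ValueError("a range needs at least one voter")
    return voters // 2 + 1


def has_quorum(voters: int, available: int) -> bool:
    """True if enough voters are reachable for consensus to be possible."""
    return available >= quorum_size(voters)


def tolerated_failures(voters: int) -> int:
    """How many voters can be unavailable before Loss of Quorum occurs."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Compare the default replication factor (3) with a larger one (5).
    for rf in (3, 5):
        print(rf, quorum_size(rf), tolerated_failures(rf))
```

With the default replication factor of 3, a range needs 2 available voters, so it survives the loss of 1 replica; losing 2 of the 3 means Loss of Quorum.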