Our customers routinely send crash reports and log files to us, but they want confidence that this data does not contain confidential information. How do we achieve this?

For this purpose, the CockroachDB source code uses redactability. This is a crdb-specific combination of data types and APIs on top of Go’s string manipulation, logging and errors APIs.

Redactability makes it possible to remove sensitive information from a string after the string has been constructed.
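
To make the idea concrete, here is a toy model of what "removing sensitive information after the string has been constructed" can look like. It uses only the Go standard library and is not the real implementation. It assumes the convention, visible in the examples further down this page, that unsafe spans are enclosed in ‹ and › markers. The helper names redactMarked and stripMarkers are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unsafeSpan matches one unsafe span enclosed in ‹…› markers.
var unsafeSpan = regexp.MustCompile(`‹[^‹›]*›`)

// redactMarked replaces every unsafe span with the placeholder ‹×›.
// The safe text around the spans is left untouched.
func redactMarked(s string) string {
	return unsafeSpan.ReplaceAllString(s, "‹×›")
}

// stripMarkers removes the markers but keeps the enclosed text,
// producing a "flat" string for local, trusted use.
func stripMarkers(s string) string {
	return strings.NewReplacer("‹", "", "›", "").Replace(s)
}

func main() {
	entry := "user ‹alice› connected to n1"
	fmt.Println(redactMarked(entry)) // user ‹×› connected to n1
	fmt.Println(stripMarkers(entry)) // user alice connected to n1
}
```

The point of the sketch is that the decision to redact is taken after the string exists: the same entry can be shown in full to the operator and redacted before it is shared.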

This wiki page explains how to maintain redactability when adding or modifying CockroachDB’s source code. For more details, see the section References at the bottom.

Main concepts / definition

We use the word “sensitive” or “unsafe” to designate information that's potentially PII-laden or confidential, and “safe” for information that's certainly known to not be unsafe.

Notice the “priority order” in this definition: information is unsafe by default, until proven safe. For example, the basic string type in Go will be considered unsafe.

A confidentiality leak occurs when unsafe information is incorrectly marked as safe.

The APIs discussed below make it possible to annotate information with proofs/promises that things are safe.

In summary:

Where can users observe redactable information?

Redactability (i.e. information where sensitive and safe bits can be separated from each other) can be found:

How to make information redactable?

Any data inside CockroachDB’s source code that may be included in an error message or a log entry should be made redactable.

Otherwise, it will be considered as unsafe by our tooling and removed when customers send log entries / errors to technical support.

More redactability = more observability + more troubleshootability.

The various APIs try to minimize the work needed by CockroachDB programmers, but sometimes extra care must be taken.

Simple cases

Redactability is mostly noticeable when emitting a log entry. A good way to check the redactability properties of an object is thus to log it and see what happens.

Here is what CockroachDB’s APIs already provide for you:

To make more information redactable, the CockroachDB programmer must therefore annotate as safe or redactable any information that would otherwise be considered unsafe.

This is especially the case with struct types and other Go types that alias basic types.

API Basics

As you start learning about these mechanisms, you will slowly start noticing that Go’s native fmt.Stringer interface (and the String() method) becomes less and less relevant in your code — none of the logging or error code ever uses it if your objects implement SafeFormatter or SafeValue. In fact, we are likely to slowly phase out String() methods over time.

Examples

Before

After

// type MetricSnap does not implement SafeFormat and its representation
// as string is thus considered fully unsafe by default.

func (m MetricSnap) String() string {
        suffix := ""
        if m.ConnsRefused > 0 {
                suffix = fmt.Sprintf(", refused %d conns", m.ConnsRefused)
        }
        return fmt.Sprintf("infos %d/%d sent/received, bytes %dB/%dB sent/received%s",
                m.InfosSent, m.InfosReceived,
                m.BytesSent, m.BytesReceived,
                suffix)
}
// SafeFormat implements the redact.SafeFormatter interface.
func (m MetricSnap) SafeFormat(w redact.SafePrinter, _ rune) {
        // Notice how similar the code below is to the original String()
        // method. The SafePrinter API has been designed to make it easy
        // to “migrate” existing String() methods into SafeFormat().
        //
        // Why this “does the right thing” without special annotations:
        // - The format string for w.Printf() is a literal constant and considered safe.
        // - The numeric arguments are simple integers and thus considered safe.
        // As a result, the entire string produced is automatically considered
        // safe. No special “this is safe” annotations are needed.
        w.Printf("infos %d/%d sent/received, bytes %dB/%dB sent/received",
                m.InfosSent, m.InfosReceived,
                m.BytesSent, m.BytesReceived)
        if m.ConnsRefused > 0 {
                w.Printf(", refused %d conns", m.ConnsRefused)
        }
}

func (m MetricSnap) String() string {
        // StringWithoutMarkers applies the SafeFormat method
        // then removes the redaction markers to produce a “flat” string.
        // This helps avoid code duplication between String()
        // and SafeFormat().
        //
        // Note: The resulting String() method is only rarely
        // called, since most relevant uses of MetricSnap
        // will now use .SafeFormat() directly.
        return redact.StringWithoutMarkers(m)
}
// type OutgoingConnStatus does not implement SafeFormat and its representation
// as string is thus considered fully unsafe by default.

func (c OutgoingConnStatus) String() string {
        return fmt.Sprintf("%d: %s (%s: %s)",
                c.NodeID, c.Address,
                roundSecs(time.Duration(c.AgeNanos)), c.MetricSnap)
}
// SafeFormat implements the redact.SafeFormatter interface.
func (c OutgoingConnStatus) SafeFormat(w redact.SafePrinter, _ rune) {
        // Notice how similar the code below is to the original String()
        // method. The SafePrinter API has been designed to make it easy
        // to “migrate” existing String() methods into SafeFormat().
        //
        // Why this “does the right thing” without special annotations:
        // - The format argument is a literal constant and considered safe.
        // - c.NodeID is a roachpb.NodeID,
        //   which aliases a basic integer type and implements SafeValue() and is considered safe.
        // - c.Address is a string and is unsafe.
        // - roundSecs() returns a time.Duration and this type has been registered as safe.
        // - c.MetricSnap implements a SafeFormat method, which is called implicitly to "do the right thing".
        // The resulting string contains a mix of safe/unsafe information:
        // the address is marked as unsafe, the rest is safe.
        w.Printf("%d: %s (%s: %s)",
                c.NodeID, c.Address,
                roundSecs(time.Duration(c.AgeNanos)), c.MetricSnap)
}

// This String() method is defined via SafeFormat(). See explanation in the other example above.
func (c OutgoingConnStatus) String() string { return redact.StringWithoutMarkers(c) }
type Gossip struct {
   ...
   // lastConnectivity remembers the connectivity details
   // across calls to the LogStatus() method.
   lastConnectivity string
}
// LogStatus logs the current status of gossip such as the incoming and
// outgoing connections.
func (g *Gossip) LogStatus() {
        // The log call below should only report the connectivity
        // if it is different from the last call to LogStatus().
        var connectivity string
        if s := g.Connectivity().String(); s != g.lastConnectivity {
                g.lastConnectivity = s
                connectivity = s
        }

        log.Infof(ctx, "gossip status: %s", connectivity)
}

type Gossip struct {
   ...
   // lastConnectivity remembers the connectivity details
   // across calls to the LogStatus() method.
   lastConnectivity redact.RedactableString
}
// LogStatus logs the current status of gossip such as the incoming and
// outgoing connections.
func (g *Gossip) LogStatus() {
        // g.Connectivity() returns an object that implements SafeFormat().
        // Its redactable representation contains a mix of safe and unsafe information.
        //
        // (Again, notice how the code below is similar to the original version.)
        var connectivity redact.RedactableString
        if s := redact.Sprint(g.Connectivity()); s != g.lastConnectivity {
                g.lastConnectivity = s
                connectivity = s
        }

        log.Infof(ctx, "gossip status: %s", connectivity)
}

When to use SafeFormatter vs SafeValue

When in doubt, implement the SafeFormatter interface, i.e. a SafeFormat method. This creates redactable strings that provably do not leak confidential information.

The SafeValue marker interface is reserved for “leaf” data types that are so simple that one can argue, just by looking at the source code, that they can never contain sensitive information. We do this e.g. for roachpb.NodeID, descpb.DescID and other such integer types.

Generally, avoid using the SafeValue interface for non-simple types. The main problem this general rule solves is that nothing prevents a programmer from later adding more data to values of that type, which then start leaking confidential information without anyone noticing.

For the same reason, generally avoid using redact.Safe and its aliases errors.Safe / log.Safe. The promise made at the time the call is introduced that its argument is safe can be too easily broken “at a distance” by someone else later, for example by changing the type definition of the argument to start leaking unsafe information.
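
The hazard described above can be illustrated with a toy model that uses only the Go standard library. It is a sketch, not the redact package: the safeValue interface below merely mirrors the shape of a marker interface, and render, ConnInfo and its fields are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// safeValue mirrors the shape of a marker interface: a method with no
// behavior whose only role is to carry the promise "values of this
// type are safe". (Hypothetical stand-in for this sketch.)
type safeValue interface{ SafeValue() }

// render is a toy formatter: values are unsafe by default and get
// enclosed in ‹…› markers unless their type carries the promise.
func render(v interface{}) string {
	s := fmt.Sprint(v)
	if _, ok := v.(safeValue); ok {
		return s
	}
	return "‹" + s + "›"
}

// NodeID is a "leaf" type: it can only ever hold a number, so the
// promise can be checked by reading these two lines.
type NodeID int32

func (NodeID) SafeValue() {}

// ConnInfo was marked safe when it only held an ID. The Addr field
// was added later, and the promise silently became false.
type ConnInfo struct {
	ID   NodeID
	Addr string
}

func (ConnInfo) SafeValue() {}

func main() {
	fmt.Println(render(NodeID(3)))                               // 3
	fmt.Println(render("10.0.0.1:26257"))                        // ‹10.0.0.1:26257›
	fmt.Println(render(ConnInfo{ID: 3, Addr: "10.0.0.1:26257"})) // {3 10.0.0.1:26257}
}
```

The last line is the leak: the address comes out without markers, and nothing in the compiler or in the call site signals that the promise attached to ConnInfo no longer holds.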

General rules

Proofs vs promises

A promise is when a person (e.g. a member of the CockroachDB team) expresses in the source code that some information is safe or redactable according to their opinion or understanding.

A proof is a function or algorithm that takes a combination of safe/unsafe information and is guaranteed, by construction (and as long as it compiles without type errors), to avoid confidentiality leaks.

Whenever we have a choice between a “proof API” or a “promise API”, we always prefer the proof, because it ensures that the code is not sensitive to human mistakes.

An axiom is an argument expressed in the code that a bit of information is safe or unsafe, in a way that is provably always true regardless of which data is processed by CockroachDB. Axioms thus have the same general quality as proofs and are superior to promises. We prefer axioms whose argument can be verified locally, at the position in the code where it is made, without relying on knowledge pulled from elsewhere.

For example:

Bad

Good

func foo(s string) redact.RedactableString {
  // Casting an arbitrary string to RedactableString
  // is a PROMISE: only the programmer knows
  // that s does not contain redaction markers
  // and that the string concatenation is guaranteed
  // not to leak information.
  //
  // The promise can easily be broken “by accident” if
  // a new call is made to foo() with a broken
  // string as input.
  return redact.RedactableString("hello ‹" + s + "›")
}
func foo(s string) redact.RedactableString {
  // The redact.Sprintf function is a PROOF:
  // its algorithm guarantees that the unsafe
  // information in s will be properly annotated
  // in the result, without confidentiality leak.
  return redact.Sprintf("hello %s", s) 
}

type myStruct struct { s string }

func (m *myStruct) foo(v string) {
   m.s = string(redact.Sprintf("hello %s", v))
}

func (m *myStruct) bar() string {
  // PROMISE: m.s has not been modified since foo() was called,
  // and is thus known (only to the programmer) to still be
  // properly redactable.
  //
  // The promise can easily be broken “by accident” if
  // another programmer adds a separate method that modifies
  // m.s.
  return redact.RedactableString(m.s).Redact().StripMarkers()
}
type myStruct struct { s redact.RedactableString }

func (m *myStruct) foo(v string) {
   m.s = redact.Sprintf("hello %s", v)
}

func (m *myStruct) bar() string {
  // PROOF: as long as the type rules are obeyed, m.s
  // will have remained redactable ever since it was
  // constructed.
  return m.s.Redact().StripMarkers()
}
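
Why the cast in the first “Bad” example is only a promise can be seen with a toy model that uses only the Go standard library. This is a sketch under stated assumptions, not the redact package: redactMarked, promiseFoo and escapedFoo are hypothetical helpers, and replacing stray markers with "?" is just one possible way to neutralize them.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unsafeSpan matches one unsafe span enclosed in ‹…› markers.
var unsafeSpan = regexp.MustCompile(`‹[^‹›]*›`)

// redactMarked replaces each unsafe span with the placeholder ‹×›.
func redactMarked(s string) string {
	return unsafeSpan.ReplaceAllString(s, "‹×›")
}

// promiseFoo mirrors the "Bad" foo(): it trusts that s has no markers.
func promiseFoo(s string) string {
	return "hello ‹" + s + "›"
}

// escapedFoo neutralizes markers found in s before enclosing it, so
// the result has exactly one unsafe span whatever the input is.
func escapedFoo(s string) string {
	s = strings.NewReplacer("‹", "?", "›", "?").Replace(s)
	return "hello ‹" + s + "›"
}

func main() {
	// An input that happens to contain redaction markers.
	hostile := "alice› my password ‹bob"
	fmt.Println(redactMarked(promiseFoo(hostile))) // hello ‹×› my password ‹×›
	fmt.Println(redactMarked(escapedFoo(hostile))) // hello ‹×›
}
```

With plain concatenation, markers inside the input close the unsafe span early and part of the input survives redaction. A formatting function that owns the enclosing step can rule this out for every caller, which is what makes it a proof rather than a promise.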

Redactability in error objects

The default error constructors from CockroachDB’s error library (Newf, New, AssertionFailedf, etc.) automatically implement redactability:

The points above emphasize “constant literal string”. The fact a string is a constant literal (i.e. statically embedded in the CockroachDB executable) is what makes it safe. We enforce this property using a linter.

Redactability in log entries

CockroachDB’s log functions first transform their parameters into an error object internally, as per the rules above. Then, that error object is formatted into a log entry.

Therefore, the rules for error objects described above apply equally when constructing log entries:

The redactability properties of that error object are preserved throughout the logging system.

Redactability through the redact package

Marking things as safe or unsafe

The best way to mark things as safe or unsafe is to implement the SafeFormatter interface (a SafeFormat method) or to use redact.Sprint / redact.Sprintf.

The other ways below are detailed only for reference but are error-prone.

Combining things together

Recursive rules during printf-like formatting

The future of redactability in CockroachDB

See the last section in the blog post: both our external customers and our internal product team want to introduce a new separation inside log and error data:

We wish to introduce this distinction because it would enable us to ingest operational data into telemetry without redaction; most of our customers have expressed willingness to share operational data (but not application data) with us.

Until we make this distinction in the code, the two kinds of data cannot be told apart, so any sensitive data must be treated as application-sensitive data by default.

If/when we study this distinction further, we will need to be extra careful about the following:

References