Error Message Guidelines

This guide is a reference for composing standardized and actionable error messages in Apache Spark.

Include What, Why, and How

Exceptions thrown from Spark should answer the Five W’s and How:

  • Who encountered the problem?
  • What was the problem?
  • When did the problem happen?
  • Where did the problem happen?
  • Why did the problem happen?
  • How can the problem be solved?

The context provided by exceptions can help answer who (usually the user), when (usually included in the log via log4j), and where (usually included in the stack trace). However, these answers alone are often insufficient for the user to solve the problem. An error message that answers the remaining questions (what, why, and how) minimizes user frustration.

Explicitly answer What, Why, and How

In many cases, the error message should explicitly answer what, why, and how.

Example 1

Unable to generate an encoder for inner class {} without access to the scope that this class was defined in. Try moving this class out of its parent class.

  • What: Unable to generate an encoder for the inner class.
  • Why: Did not have access to the scope that the class was defined in.
  • How: Try moving this class out of its parent class.

Example 2

If the proposed fix (how) feels arbitrary, providing an explanation for why the error occurred can reduce user frustration.

Before

Unsupported function name {}.

  • What: Unsupported function name.
  • Why: Unclear.
  • How: Unclear.

After

Function name {} is invalid. Temporary functions cannot belong to a catalog. Specify a function name with one or two parts.

  • What: Invalid function name.
  • Why: Temporary functions cannot belong to a catalog.
  • How: Specify a function name with one or two parts.
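
The What, Why, and How structure above can be sketched as a small helper that joins the three parts into one message. This is only an illustration under stated assumptions: `compose_error_message`, `_as_sentence`, and the function name `cat.db.fn` are hypothetical and are not part of Spark.

```python
def _as_sentence(text):
    """Strip surrounding whitespace and make sure the text ends with a period."""
    text = text.strip()
    return text if text.endswith(".") else text + "."


def compose_error_message(what, why=None, how=None):
    """Join the What, Why, and How parts into a single message.

    `why` and `how` are optional so that they can be left out when they
    are already implied by the other parts.
    """
    parts = [what, why, how]
    return " ".join(_as_sentence(part) for part in parts if part)


# "cat.db.fn" is a made-up three-part function name used for illustration.
message = compose_error_message(
    what="Function name {} is invalid".format("cat.db.fn"),
    why="Temporary functions cannot belong to a catalog",
    how="Specify a function name with one or two parts",
)
print(message)
```

Keeping the three parts as separate arguments makes it easy to see at the call site which of the questions a message answers and which it leaves implied.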

Implicitly answer How

Not all error messages should be this verbose. Sometimes, explicitly explaining how to resolve the problem would be redundant; you may skip an explicit explanation in this case.

Example 1

Invalid pivot column {}. Pivot columns must be comparable.

  • What: Invalid pivot column.
  • Why: Pivot columns must be comparable.
  • How (implied by Why): Use comparable pivot columns.

Example 2

Before

Cannot specify window frame for {} function

  • What: Cannot specify window frame for the function.
  • Why: Unclear.
  • How: Unclear.

After

Cannot specify frame for window expression {}. Window expression contains mismatch between function frame {} and specification frame {}.

  • What: Cannot specify the frame for the window expression.
  • Why: Window expression contains mismatch between function frame and specification frame.
  • How (implied by Why): Match the function frame and specification frame.

Example 3

Before

Cannot parse any decimal.

  • What: Cannot parse decimal.
  • Why: Unclear.
  • How: Unclear.

After

Invalid decimal {}; encountered error while parsing at position {}.

  • What: Invalid decimal.
  • Why: The decimal parser encountered an error at the specified position.
  • How (implied by Why): Fix the error at the specified position.

Implicitly answer Why and How

Sometimes, even explicitly explaining why the problem happened would be redundant; you may skip an explicit explanation in this case.

Path does not exist: {}

  • What: Path does not exist.
  • Why (implied by What): User specified an invalid path.
  • How (implied by What): Use a different path.
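
As a minimal sketch of a What-only message, the check below raises an exception whose text states just the problem and leaves Why and How implied. `PathNotFoundError`, `require_existing_path`, and the example path are hypothetical names chosen for this illustration; they are not Spark APIs.

```python
import os


class PathNotFoundError(Exception):
    """Hypothetical exception type used only for this illustration."""


def require_existing_path(path):
    """Return `path` unchanged if it exists; otherwise raise a What-only error."""
    if not os.path.exists(path):
        # The message states only What; Why and How are implied by it.
        raise PathNotFoundError("Path does not exist: {}".format(path))
    return path


try:
    # A made-up path that is assumed not to exist on the machine running this.
    require_existing_path("/hypothetical/missing/input.parquet")
except PathNotFoundError as error:
    print(error)
```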

Use clear language

Diction guide

Unsupported
  • When to use: The user may reasonably assume that the operation is supported, but it is not. This error may go away in the future if developers add support for the operation.
  • Example: Data type {} is unsupported.

Invalid / Not allowed / Unexpected
  • When to use: The user made a mistake when specifying an operation. The message should inform the user of how to resolve the error.
  • Example: Array has size {}, index {} is invalid.
  • Example: Found {} generators for the clause {}. Only one generator is allowed.
  • Example: Found an unexpected state format version {}. Expected versions 1 or 2.

Failed to
  • When to use: The system encountered an unexpected error that cannot be reasonably attributed to user error.
  • Example: Failed to compile {}.

Cannot
  • When to use: Any time, preferably only if one of the above alternatives does not apply.
  • Example: Cannot generate code for unsupported type {}.
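
One way to read the diction guide is as a short decision procedure. The sketch below encodes that reading; the function `suggest_phrase`, its parameter names, and the order in which the conditions are checked are assumptions made for illustration, not rules stated by the guide.

```python
def suggest_phrase(expected_to_be_supported=False, user_mistake=False, system_failure=False):
    """Suggest a leading phrase for an error message.

    The precedence of the checks is an assumption; the guide only says
    that "Cannot" is preferred when none of the alternatives applies.
    """
    if expected_to_be_supported:
        # The user could reasonably assume the operation works, but it does not.
        return "Unsupported"
    if user_mistake:
        # The user made a mistake when specifying the operation.
        return "Invalid / Not allowed / Unexpected"
    if system_failure:
        # An unexpected error that cannot reasonably be blamed on the user.
        return "Failed to"
    # Fallback when none of the more specific phrases applies.
    return "Cannot"


print(suggest_phrase(user_mistake=True))
print(suggest_phrase())
```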

Wording guide

Use active voice
  • Before: DataType {} is not supported by {}.
  • After: {} does not support datatype {}.

Avoid time-based statements, such as promises of future support
  • Before: Pandas UDF aggregate expressions are currently not supported in pivot.
  • After: Pivot does not support Pandas UDF aggregate expressions.
  • Before: Parquet type not yet supported: {}.
  • After: {} does not support Parquet type.

Use the present tense to describe the error and provide suggestions
  • Before: Couldn't find the reference column for {} at {}.
  • After: Cannot find the reference column for {} at {}.
  • Before: Join strategy hint parameter should be an identifier or string but was {}.
  • After: Cannot use join strategy hint parameter {}. Use a table name or identifier to specify the parameter.

Provide concrete examples if the resolution is unclear
  • Before: {} Hint expects a partition number as a parameter.
  • After: {} Hint expects a partition number as a parameter. For example, specify 3 partitions with {}(3).

Avoid sounding accusatory, judgmental, or insulting
  • Before: You must specify an amount for {}.
  • After: {} cannot be empty. Specify an amount for {}.

Be direct
  • Before: LEGACY store assignment policy is disallowed in Spark data source V2. Please set the configuration spark.sql.storeAssignmentPolicy to other values.
  • After: Spark data source V2 does not allow the LEGACY store assignment policy. Set the configuration spark.sql.storeAssignmentPolicy to ANSI or STRICT.

Do not use programming jargon in user-facing errors
  • Before: RENAME TABLE source and destination databases do not match: '{}' != '{}'.
  • After: RENAME TABLE source and destination databases do not match. The source database is {}, but the destination database is {}.
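
Some of the wording practices above can be checked mechanically. The following is a rough sketch of such a check, assuming a handful of regular expressions derived from the "Before" examples; `DISCOURAGED_PATTERNS` and `check_wording` are hypothetical names, and the patterns are far from exhaustive (for instance, they do not detect passive voice).

```python
import re

# Hypothetical patterns, each taken from a "Before" example in the wording guide.
DISCOURAGED_PATTERNS = {
    "Avoid time-based statements": re.compile(r"\b(currently|not yet)\b", re.IGNORECASE),
    "Use the present tense": re.compile(r"\bcouldn['’]t\b", re.IGNORECASE),
    "Avoid sounding accusatory": re.compile(r"\byou must\b", re.IGNORECASE),
    "Be direct": re.compile(r"\bplease\b", re.IGNORECASE),
    "Do not use programming jargon": re.compile(r"!=|=="),
}


def check_wording(message):
    """Return the names of the practices that the message appears to violate."""
    return [
        practice
        for practice, pattern in DISCOURAGED_PATTERNS.items()
        if pattern.search(message)
    ]


before = "Pandas UDF aggregate expressions are currently not supported in pivot."
after = "Pivot does not support Pandas UDF aggregate expressions."
print(check_wording(before))
print(check_wording(after))
```

A check like this can only flag candidates for review; whether a message actually reads well still needs human judgment.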