| name | audit-comet-expression |
| description | Audit an existing Comet expression for correctness and test coverage. Studies the Spark implementation across versions 3.4.3, 3.5.8, and 4.0.1, reviews the Comet and DataFusion implementations, identifies missing test coverage, and offers to implement additional tests. |
| argument-hint | <expression-name> |
Audit the Comet implementation of the $ARGUMENTS expression for correctness and test coverage.
Overview
This audit covers:
- Spark implementation across versions 3.4.3, 3.5.8, and 4.0.1
- Comet Scala serde implementation
- Comet Rust / DataFusion implementation
- Existing test coverage (Comet SQL Tests and Comet Scala Tests)
- Gap analysis and test recommendations
Step 1: Locate the Spark Implementations
Clone specific Spark version tags (use shallow clones to avoid polluting the workspace). Only clone a version if it is not already present.
set -eu -o pipefail
for tag in v3.4.3 v3.5.8 v4.0.1; do
dir="/tmp/spark-${tag}"
if [ ! -d "$dir" ]; then
git clone --depth 1 --branch "$tag" https://github.com/apache/spark.git "$dir"
fi
done
Find the expression class in each Spark version
Search the Catalyst SQL expressions source:
for tag in v3.4.3 v3.5.8 v4.0.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql/catalyst/src/main/scala" -name "*.scala" | \
xargs grep -l "case class $ARGUMENTS\b\|object $ARGUMENTS\b" 2>/dev/null
done
If the expression is not found in catalyst, also check core:
for tag in v3.4.3 v3.5.8 v4.0.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql" -name "*.scala" | \
xargs grep -l "case class $ARGUMENTS\b\|object $ARGUMENTS\b" 2>/dev/null
done
Read the Spark source for each version
For each Spark version, read the expression file and note:
- The
eval, nullSafeEval, and doGenCode / doGenCodeSafe methods
- The
inputTypes and dataType fields (accepted input types, return type)
- Null handling strategy (
nullable, nullSafeEval)
- ANSI mode behavior (
ansiEnabled, failOnError)
- Special cases, guards,
require assertions, and runtime exceptions
- Any constants or configuration the expression reads
Compare across Spark versions
Produce a concise diff summary of what changed between:
- 3.4.3 → 3.5.8
- 3.5.8 → 4.0.1
Pay attention to:
- New input types added or removed
- Behavior changes for edge cases (null, overflow, empty, boundary)
- New ANSI mode branches
- New parameters or configuration
- Breaking API changes that Comet must shim
Step 2: Locate the Spark Tests
for tag in v3.4.3 v3.5.8 v4.0.1; do
dir="/tmp/spark-${tag}"
echo "=== $tag ==="
find "$dir/sql" -name "*.scala" -path "*/test/*" | \
xargs grep -l "$ARGUMENTS" 2>/dev/null
done
Read the relevant Spark test files and produce a list of:
- Input types covered
- Edge cases exercised (null, empty, overflow, negative, boundary values, special characters, etc.)
- ANSI mode tests
- Error cases
This list will be the reference for the coverage gap analysis in Step 5.
Step 3: Locate the Comet Implementation
Scala serde
grep -r "$ARGUMENTS" spark/src/main/scala/org/apache/comet/serde/ --include="*.scala" -l
grep -r "$ARGUMENTS" spark/src/main/scala/org/apache/comet/ --include="*.scala" -l
Read the serde implementation and check:
- Which Spark versions the serde handles
- Whether
getSupportLevel is implemented and accurate
- Whether all input types are handled
- Whether any types are explicitly marked
Unsupported
- Whether
getIncompatibleReasons() and getUnsupportedReasons() are overridden.
getSupportLevel controls runtime fallback, but GenerateDocs reads these two
methods to build the Compatibility Guide. If getSupportLevel returns
Incompatible(Some(...)) or Unsupported(Some(...)) but the corresponding
get*Reasons() method is not overridden, the reason will not appear in the
published docs. Expect both to return a Seq[String] containing the same
reason text used in getSupportLevel. Follow the pattern in
spark/src/main/scala/org/apache/comet/serde/structs.scala::CometStructsToJson
or spark/src/main/scala/org/apache/comet/serde/datetime.scala::CometHour:
extract the reason as a private val and reference it from both places.
Shims
find spark/src/main -name "CometExprShim.scala" | xargs grep -l "$ARGUMENTS" 2>/dev/null
If shims exist, read them and note any version-specific handling.
Rust / DataFusion implementation
grep -r "$ARGUMENTS" native/spark-expr/src/ --include="*.rs" -l
grep -r "$ARGUMENTS" native/core/src/ --include="*.rs" -l
If the expression delegates to DataFusion, find it there too. Set $DATAFUSION_SRC to a local DataFusion checkout, or fall back to searching the cargo registry:
if [ -n "${DATAFUSION_SRC:-}" ]; then
grep -r "$ARGUMENTS" "$DATAFUSION_SRC" --include="*.rs" -l 2>/dev/null | head -10
else
grep -r "$ARGUMENTS" ~/.cargo/registry/src/*/datafusion* --include="*.rs" -l 2>/dev/null | head -10
fi
Read the Rust implementation and check:
- Null handling (does it propagate nulls correctly?)
- Overflow and underflow handling (returns
Err vs panics)
- Type dispatch (does it handle all types that Spark supports?)
- ANSI / fail-on-error mode
Step 4: Locate Existing Comet Tests
Comet SQL Tests
find spark/src/test/resources/sql-tests/expressions/ -name "*.sql" | \
xargs grep -l "$ARGUMENTS" 2>/dev/null
find spark/src/test/resources/sql-tests/expressions/ -name "*$(echo $ARGUMENTS | tr '[:upper:]' '[:lower:]')*"
Read every SQL test file found and list:
- Table schemas and data values used
- Queries exercised
- Query modes used (
query, spark_answer_only, tolerance, ignore, expect_error)
- Any ConfigMatrix directives
Comet Scala Tests
grep -r "$ARGUMENTS" spark/src/test/scala/ --include="*.scala" -l
Read the relevant Comet Scala Tests and list:
- Input types covered
- Edge cases exercised
- Whether constant folding is disabled for literal tests
Step 5: Gap Analysis
Compare the Spark test coverage (Step 2) against the Comet test coverage (Step 4). Produce a structured gap report:
Coverage matrix
For each of the following dimensions, note whether it is covered in Comet tests or missing:
| Dimension | Spark tests it | Comet SQL Test | Comet Scala Test | Gap? |
|---|
| Column reference argument(s) | | | | |
| Literal argument(s) | | | | |
| NULL input | | | | |
| Empty string / empty array / empty map | | | | |
| Array/map with NULL elements | | | | |
| Zero, negative zero, negative values (numeric) | | | | |
| Underflow, overflow | | | | |
| Boundary values (INT_MIN, INT_MAX, Long.MinValue, minimum positive, etc.) | | | | |
| NaN, Infinity, -Infinity, subnormal (float/double) | | | | |
Multibyte / special UTF-8 (composed vs decomposed, e.g. é U+00E9 vs e + U+0301, non-Latin scripts) | | | | |
| ANSI mode (failOnError=true) | | | | |
| Non-ANSI mode (failOnError=false) | | | | |
| All supported input types | | | | |
| Parquet dictionary encoding (ConfigMatrix) | | | | |
| Cross-version behavior differences | | | | |
Implementation gaps
Also review the Comet implementation (Step 3) against the Spark behavior (Step 1):
- Are there input types that Spark supports but
getSupportLevel returns Unsupported without comment?
- Are there behavioral differences that are NOT marked
Incompatible but should be?
- Are there behavioral differences between Spark versions that the Comet implementation does not account for (missing shim)?
- Does the Rust implementation match the Spark behavior for all edge cases?
- If
getSupportLevel returns Incompatible or Unsupported with a reason, are getIncompatibleReasons() / getUnsupportedReasons() also overridden so the reason is picked up by GenerateDocs and appears in the Compatibility Guide?
Step 6: Recommendations
Summarize findings as a prioritized list.
High priority
Issues where Comet may silently produce wrong results compared to Spark.
Medium priority
Missing test coverage for edge cases that could expose bugs.
Low priority
Minor gaps, cosmetic improvements, or nice-to-have tests.
Step 7: Offer to Implement Missing Tests
After presenting the gap analysis, ask the user:
I found the following missing test cases. Would you like me to implement them?
- [list each missing test case]
I can add them as Comet SQL Tests in spark/src/test/resources/sql-tests/expressions/<category>/$ARGUMENTS.sql
(or as Comet Scala Tests in CometExpressionSuite for cases that require programmatic setup).
If the user says yes, implement the missing tests following the Comet SQL Tests format described in
docs/source/contributor-guide/sql-file-tests.md. Prefer Comet SQL Tests over Comet Scala Tests.
Comet SQL Tests template
statement
CREATE TABLE test_$ARGUMENTS(...) USING parquet
statement
INSERT INTO test_$ARGUMENTS VALUES
(...),
(NULL)
query
SELECT $ARGUMENTS(col) FROM test_$ARGUMENTS
query
SELECT $ARGUMENTS('value'), $ARGUMENTS(''), $ARGUMENTS(NULL)
Verify the tests pass
After implementing tests, tell the user how to run them:
./mvnw test -DwildcardSuites="CometSqlFileTestSuite" -Dsuites="org.apache.comet.CometSqlFileTestSuite $ARGUMENTS" -Dtest=none
Step 8: Update the Expression Support Doc
After completing the audit (whether or not tests were added), add sub-bullets under the expression's
entry in docs/source/contributor-guide/spark_expressions_support.md.
Add one sub-bullet per Spark version checked, each including:
- Spark version (e.g. 3.4.3, 3.5.8, 4.0.1)
- Today's date
- A brief note for any version-specific finding (behavioral difference, known incompatibility); omit if nothing notable
Output Format
Present the audit as:
- Expression Summary - Brief description of what
$ARGUMENTS does, its input/output types, and null behavior
- Spark Version Differences - Summary of any behavioral or API differences across Spark 3.4.3, 3.5.8, and 4.0.1
- Comet Implementation Notes - Summary of how Comet implements this expression and any concerns
- Coverage Gap Analysis - The gap table from Step 5, plus implementation gaps
- Recommendations - Prioritized list from Step 6
- Offer to add tests - The prompt from Step 7
Tone and Style
- Write in clear, concise prose
- Use backticks around code references (function names, file paths, class names, types, config keys)
- Avoid robotic or formulaic language
- Be constructive and acknowledge what is already well-covered before raising gaps
- Avoid em dashes and semicolons; use separate sentences instead