GH-44: Clear signature columns before re-populating in ArrowFlightStatement#executeFlightInfoQuery #1135

Open
hkad98 wants to merge 1 commit into apache:main from hkad98:fix/gh-44-jdbc-column-duplication

@hkad98 hkad98 commented May 6, 2026

Rationale

ArrowFlightStatement#executeFlightInfoQuery appends the dataset schema columns to the statement's reused Meta.Signature without first clearing them, so the column list doubles on every invocation. When the FlightInfo has at least one endpoint, ArrowFlightJdbcVectorSchemaRootResultSet#populateData overwrites signature.columns from the actual stream schema and hides the duplication. With an empty endpoint list — the case reported by @mingnuj in the original issue against a Rust-based Flight SQL server, and independently reproducible against Denodo Express 9.4.2 — that overwrite never runs, and ResultSetMetaData#getColumnCount() reports 2× columnCount. Calcite/Avatica then refuses the metadata with Cannot have more columns with the same name.
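The mechanism described above can be modelled in a few lines. This is a toy sketch, not the driver's code: `Signature`, `execute`, and `columnCountAfterExecute` are hypothetical stand-ins, and it assumes the reused signature already holds the schema columns when execution starts.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the duplication; every name here is a hypothetical stand-in. */
class SignatureDoublingSketch {

    /** Stand-in for the reused Meta.Signature: only the mutable column list matters. */
    static final class Signature {
        final List<String> columns = new ArrayList<>();
    }

    /**
     * Mimics one execution: append the schema columns to the shared signature,
     * then overwrite them from the stream schema only if there is an endpoint.
     */
    static int execute(Signature signature, List<String> schemaColumns, boolean hasEndpoints) {
        signature.columns.addAll(schemaColumns); // no clear() first: the bug
        if (hasEndpoints) {
            // Stand-in for populateData replacing the columns from the stream schema.
            signature.columns.clear();
            signature.columns.addAll(schemaColumns);
        }
        return signature.columns.size();
    }

    /** Builds a signature that already carries the schema (assumption), then executes once. */
    static int columnCountAfterExecute(List<String> schemaColumns, boolean hasEndpoints) {
        Signature signature = new Signature();
        signature.columns.addAll(schemaColumns);
        return execute(signature, schemaColumns, hasEndpoints);
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id");
        System.out.println("with endpoints:    " + columnCountAfterExecute(schema, true));  // 1
        System.out.println("without endpoints: " + columnCountAfterExecute(schema, false)); // 2
    }
}
```

With an endpoint the overwrite masks the extra entries; without one the doubled list is what the metadata reports.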

This is a regression introduced in 15.0.0 by the prepared-statement parameter binding work in GH-33475, which made handle.signature mutable and shared across the prepareAndExecute → executeFlightInfoQuery path.

Fix

One line: signature.columns.clear() before the existing addAll(...). The fresh schema returned at execute time is authoritative; pre-existing entries on the signature are stale by definition. This is exactly the workaround @mingnuj proposed in the issue and confirmed working.
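As a rough illustration of why clearing first makes repeated executions idempotent, here is a minimal sketch on plain lists. `repopulate` and `columnCountAfter` are hypothetical helpers, not driver methods, and the shared list is assumed to be pre-populated before the first execution.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the one-line fix; names are hypothetical, not the driver's classes. */
class ClearBeforeAddSketch {

    /** Replaces whatever the reused column list held with the fresh schema columns. */
    static void repopulate(List<String> signatureColumns, List<String> freshSchemaColumns) {
        signatureColumns.clear();                    // drop stale entries first
        signatureColumns.addAll(freshSchemaColumns); // then add the authoritative schema
    }

    /** Runs repopulate the given number of times against one shared list and returns its size. */
    static int columnCountAfter(int executions, List<String> schemaColumns) {
        List<String> shared = new ArrayList<>(schemaColumns); // assumed pre-populated
        for (int i = 0; i < executions; i++) {
            repopulate(shared, schemaColumns);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(columnCountAfter(3, List.of("id"))); // prints 1
    }
}
```

One caveat of this pattern in general: it only works when the fresh schema list is a different object from the list being cleared, since clearing an aliased list would empty the source as well.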

Test

Adds ResultSetMetadataTest#testShouldNotDuplicateColumnsWhenFlightInfoHasNoEndpoints plus a LEGACY_REGULAR_NO_ENDPOINTS_SQL_CMD fixture in CoreMockedSqlProducers (registered with an empty result-provider list, so getFlightInfoStatement returns a FlightInfo with zero endpoints — the bug-triggering shape). The test asserts getColumnCount() == 1, fails on main with Expected: <1> but was <2>, and passes after the fix.

Module test run (the Maven equivalent of a ./gradlew test run):

mvn -pl flight/flight-sql-jdbc-core test

Result: 1234 tests run, 0 failures, 0 errors, 44 skipped.

Cross-server reproductions

The bug surfaces against any Flight SQL server that returns an empty endpoint list:

  • Rust-based server in @mingnuj's original report.
  • Denodo Express 9.4.2 (independent reproduction over JDBC).

At the wire level, the same FlightInfo is parsed correctly as 1 column by pyarrow.flight, by ADBC C++ / Python, and by Denodo's own native JDBC driver. Only flight-sql-jdbc-driver 18.x / 19.0.0 doubles the column count, which confirms the issue is Java-side.
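For anyone checking another server, a small client-side probe can flag the symptom from standard JDBC metadata alone. The sketch below is an illustration, not part of this PR: `DuplicateColumnCheck` and `duplicatedNames` are hypothetical names, and only `ResultSetMetaData#getColumnCount` and `#getColumnName` from `java.sql` are used.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical diagnostic helper: reports column names that occur more than once. */
class DuplicateColumnCheck {

    /** Returns the names that appear more than once in the metadata, in first-seen order. */
    static Set<String> duplicatedNames(ResultSetMetaData metaData) throws SQLException {
        Set<String> seen = new HashSet<>();
        Set<String> duplicated = new LinkedHashSet<>();
        for (int i = 1; i <= metaData.getColumnCount(); i++) { // JDBC columns are 1-based
            String name = metaData.getColumnName(i);
            if (!seen.add(name)) {
                duplicated.add(name); // add() returned false, so the name was already seen
            }
        }
        return duplicated;
    }
}
```

Calling `DuplicateColumnCheck.duplicatedNames(resultSet.getMetaData())` on a query that returns no rows should give an empty set on an unaffected driver. A legitimate query can also select two columns with the same name, so a non-empty result is a hint to compare against the server's schema rather than proof of this bug.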

Closes #44.

…ghtStatement#executeFlightInfoQuery

ArrowFlightStatement#executeFlightInfoQuery appended the dataset schema
columns to the statement's reused Meta.Signature without first clearing
them, doubling the column list on every invocation. When the FlightInfo
has at least one endpoint, ArrowFlightJdbcVectorSchemaRootResultSet
overwrites signature.columns from the actual stream schema and hides
the duplication. With an empty endpoint list — reported against both
Rust- and Denodo-based Flight SQL servers — that overwrite never runs
and ResultSetMetaData#getColumnCount() reports 2x the schema width.

Regression introduced in 15.0.0 (GH-33475 prepared-statement parameter
binding) when handle.signature became mutable across executions.

Adds a regression test that registers a mock query with no endpoints
and asserts ResultSetMetaData#getColumnCount() matches the schema.

Closes apache#44.

@lidavidm lidavidm added the bug-fix PRs that fix a bug. label May 6, 2026
@github-actions github-actions Bot added this to the 20.0.0 milestone May 6, 2026

Development

Successfully merging this pull request may close these issues.

[Java][FlightSQL] Column Duplication When Selecting from no result record in Arrow Flight SQL JDBC Driver
