branch-4.0: [fix](fe) Return unknown stats for system tables #62913 #63009
Open
github-actions[bot] wants to merge 1 commit into branch-4.0 from
### What problem does this PR solve?

Related PR: introduced by #41790

Problem Summary:

We hit a case where manually dropping column stats tablet files caused many internal queries against `__internal_schema.column_statistics`, including queries whose target stats IDs belong to the statistics table itself.

The visible symptom is a self-amplifying internal-query storm such as:

```sql
SELECT * FROM __internal_schema.column_statistics
WHERE id IN (... column_statistics own column stats ids ...)
```

The problematic call chain is:

```text
ColumnStatisticsCacheLoader.doLoad
  -> StatisticsRepository.loadColStats
  -> StatisticsUtil.execStatisticQuery
  -> StmtExecutor.executeInternalQuery
  -> NereidsPlanner.optimize
  -> InitJoinOrder
  -> StatsCalculator.disableJoinReorderIfStatsInvalid
  -> StatsCalculator.checkNdvValidation
  -> StatisticsCache.OlapTableStatistics.getColumnStatistics
```

When the internal stats-loading SQL scans `__internal_schema.column_statistics`, join-reorder's pre-validation path runs before `computeOlapScan()` derives UNKNOWN stats for system tables. That pre-validation can request column statistics for `column_statistics` itself. If the stats load fails, the async stats cache does not retain a useful value for the failed key, so every subsequent planning pass can reload the same system-table stats and amplify the internal query volume.

This is separate from the audit `State=OK` display issue: those audit rows can look successful even when the internal stats query failed. The bug fixed here is the recursive system-table stats lookup during planning.

This PR returns UNKNOWN from the `StatisticsCache.OlapTableStatistics` column and partition-column accessors for system tables. That keeps the system-table behavior consistent with `computeOlapScan()` and prevents future callers of these accessors from accidentally loading system-table stats before the normal scan-statistics guard.

The fix intentionally does not skip all internal queries. Internal jobs that scan normal user tables, such as import, MV refresh, or internal insert tasks, can still use normal stats validation and optimization.
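The effect of the guard can be illustrated with a small, self-contained simulation. This is only a sketch and not the Doris code: the `Table`, `ColumnStat`, and `FailingLoader` types, the `guardEnabled` flag, and the rule that a table counts as a system table when its database is `__internal_schema` are all hypothetical stand-ins chosen for illustration. The loader models the failure mode described above, where a failed load leaves nothing in the cache, so each planning pass triggers another internal query.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;

public class SystemTableStatsGuardSketch {
    /** Hypothetical stand-in for a column statistic; UNKNOWN means "no usable stats". */
    public enum ColumnStat { UNKNOWN, LOADED }

    /** Hypothetical table descriptor; only the database name matters in this sketch. */
    public static final class Table {
        final String db;
        final String name;

        public Table(String db, String name) {
            this.db = db;
            this.name = name;
        }

        // Assumption for illustration: system tables live in __internal_schema.
        public boolean isSystemTable() {
            return "__internal_schema".equals(db);
        }
    }

    /** Simulates a stats loader whose loads always fail and are never cached. */
    public static final class FailingLoader {
        public final AtomicInteger internalQueries = new AtomicInteger();

        public Optional<ColumnStat> load(Table table, String column) {
            internalQueries.incrementAndGet(); // each attempt issues one internal query
            return Optional.empty();           // failure: nothing is retained for this key
        }
    }

    /** Accessor sketch: with the guard on, system tables short-circuit to UNKNOWN. */
    public static ColumnStat getColumnStatistics(
            Table table, String column, FailingLoader loader, boolean guardEnabled) {
        if (guardEnabled && table.isSystemTable()) {
            return ColumnStat.UNKNOWN; // never touches the loader
        }
        return loader.load(table, column).orElse(ColumnStat.UNKNOWN);
    }

    /** Runs the accessor once per simulated planning pass and returns the query count. */
    public static int countQueries(Table table, int planningPasses, boolean guardEnabled) {
        FailingLoader loader = new FailingLoader();
        for (int i = 0; i < planningPasses; i++) {
            getColumnStatistics(table, "id", loader, guardEnabled);
        }
        return loader.internalQueries.get();
    }

    public static void main(String[] args) {
        Table stats = new Table("__internal_schema", "column_statistics");
        Table user = new Table("demo_db", "orders");
        System.out.println("system table, no guard: " + countQueries(stats, 3, false));
        System.out.println("system table, guard:    " + countQueries(stats, 3, true));
        System.out.println("user table, guard:      " + countQueries(user, 3, true));
    }
}
```

In this model the unguarded system-table lookup issues one internal query per planning pass, the guarded lookup issues none, and user tables still reach the loader, which mirrors the intent that normal stats validation remains available for internal jobs over user tables.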
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
run buildall
Cherry-picked from #62913