Commit 7512f30

yujun777 authored and Your Name committed
[opt](staticstis) use count(1) for rowCount when scan full table (#58153)
### What problem does this PR solve?

When sampling, the analysis task uses table.getRowCount() as rowsCount. That value can be stale because it depends on the BE's report, so rowsCount can end up smaller than ndv. If 10 * rowsCount < ndv, the analyze SQL fails. This makes the regression test statistics/analyze_stats.groovy unstable and causes this error:

```
Exception: java.sql.SQLException: errCode = 2, detailMessage = Failed to analyze following columns:[id] Reasons: java.lang.RuntimeException: ColStatsData is invalid, skip analyzing. ('1763112020393--1-id',0,1763112019723,1763112020393,-1,'id',null,1,16,0,'1','201',64,'2025-11-14 17:41:14','105 :0.06 ;104 :0.06 ;103 :0.06 ;102 :0.06 ;101 :0.06 ;10 :0.06 ;9 :0.06 ;8 :0.06 ;7 :0.06 ;6 :0.06')
    at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
    at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:371)
    at org.codehaus.groovy.vmplugin.v8.IndyInterface.fromCache(IndyInterface.java:321)
    at org.apache.doris.regression.util.JdbcUtils$_executeToList_closure1.doCall(JdbcUtils.groovy:47)
    at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:343)
    at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:328)
    at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:279)
```

So when a sample analysis ends up scanning the whole table, we use count(1) as rowsCount. Note that this replacement does not increase the execution cost, because the statistics SQL already contains `count(1)`.
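The failure condition described above can be illustrated with a minimal sketch. It assumes the rule is exactly the one stated in the commit message (statistics are rejected when 10 * rowsCount < ndv). The class and method names are hypothetical and are not taken from the Doris code base. The sample values 1 and 16 assume that the eighth and ninth fields of the tuple in the error message are the row count and the NDV.

```java
// Hypothetical illustration, not the Doris implementation.
final class RowCountSanityCheck {

    // Assumed rule from the commit message: the statistics are treated as
    // invalid when 10 * rowCount < ndv.
    static boolean isPlausible(long rowCount, long ndv) {
        return ndv <= 10 * rowCount;
    }

    public static void main(String[] args) {
        // Stale reported row count (1) against an observed NDV of 16.
        System.out.println("stale rowCount=1, ndv=16  -> " + isPlausible(1, 16));
        // A row count computed by the same scan can never be below the NDV.
        System.out.println("COUNT(1)=16,      ndv=16  -> " + isPlausible(16, 16));
    }
}
```

Because COUNT(1) and NDV are computed over the same rows in a full scan, the row count is always at least the number of distinct values, so a check of this shape cannot fail for that reason.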
1 parent 1d63c19 commit 7512f30

2 files changed: 7 additions, 0 deletions

fe/fe-core/src/main/java/org/apache/doris/statistics/OlapAnalysisTask.java

Lines changed: 2 additions & 0 deletions
@@ -271,6 +271,8 @@ protected void getSampleParams(Map<String, String> params, long tableRowCount) {
     params.put("scaleFactor", "1");
     params.put("sampleHints", "");
     params.put("ndvFunction", "ROUND(NDV(`${colName}`) * ${scaleFactor})");
+    // For full table scan, use COUNT(1) for table row count.
+    params.put("rowCount", "COUNT(1)");
     params.put("rowCount2", "(SELECT COUNT(1) FROM cte1 WHERE `${colName}` IS NOT NULL)");
     scanFullTable = true;
     return;
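A rough sketch of what the new parameter changes in the generated SQL follows. The template string, the class, and the helper methods are all hypothetical; the real statistics SQL template in Doris is longer and is not shown in this commit. Only the parameter names and values (`rowCount`, `ndvFunction`, `scaleFactor`, `colName`) come from the diff, and the placeholder substitution is done by hand here rather than with whatever library the project uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of how the "rowCount" parameter ends up in the SQL.
final class RowCountParamSketch {

    // Assumed, simplified template; not the real Doris statistics SQL.
    static final String TEMPLATE =
            "SELECT ${rowCount} AS row_count, ${ndvFunction} AS ndv FROM cte1";

    // Replace ${key} placeholders repeatedly so that placeholders nested
    // inside parameter values (as in ndvFunction) are resolved as well.
    static String render(String template, Map<String, String> params) {
        String current = template;
        for (int pass = 0; pass < 10; pass++) {
            String next = current;
            for (Map.Entry<String, String> e : params.entrySet()) {
                next = next.replace("${" + e.getKey() + "}", e.getValue());
            }
            if (next.equals(current)) {
                return next;
            }
            current = next;
        }
        return current;
    }

    // Parameters shared by both cases, using the values from the diff.
    static Map<String, String> baseParams(String colName) {
        Map<String, String> params = new HashMap<>();
        params.put("colName", colName);
        params.put("scaleFactor", "1");
        params.put("ndvFunction", "ROUND(NDV(`${colName}`) * ${scaleFactor})");
        return params;
    }

    // After this commit: the full-scan branch lets the query count the rows.
    static String renderFullScan(String colName) {
        Map<String, String> params = baseParams(colName);
        params.put("rowCount", "COUNT(1)");
        return render(TEMPLATE, params);
    }

    // The reported table row count is embedded as a literal instead.
    static String renderWithReportedCount(String colName, long tableRowCount) {
        Map<String, String> params = baseParams(colName);
        params.put("rowCount", String.valueOf(tableRowCount));
        return render(TEMPLATE, params);
    }

    public static void main(String[] args) {
        System.out.println(renderFullScan("id"));
        System.out.println(renderWithReportedCount("id", 10000L));
    }
}
```

With the literal form, the row count is fixed before the query runs and can disagree with the NDV that the query computes. With `COUNT(1)`, both values come from the same scan.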

fe/fe-core/src/test/java/org/apache/doris/statistics/OlapAnalysisTaskTest.java

Lines changed: 5 additions & 0 deletions
@@ -375,6 +375,11 @@ public KeysType getKeysType() {
     Assertions.assertEquals("", params.get("sampleHints"));
     Assertions.assertEquals("ROUND(NDV(`${colName}`) * ${scaleFactor})", params.get("ndvFunction"));
     Assertions.assertNull(params.get("preAggHint"));
+    Assertions.assertEquals("COUNT(1)", params.get("rowCount"));
+    params.clear();
+
+    task.getSampleParams(params, 10000);
+    Assertions.assertEquals("10000", params.get("rowCount"));
     params.clear();
 
     new MockUp<OlapTable>() {
