@uchenily
Contributor

What problem does this PR solve?

  1. Set the eof flag based on _line_reader_eof rather than on rows == 0, so that the caller does not need one extra, empty call after the end of the file is reached during batch reading.
  2. Refactor CsvReader::get_next_block by extracting a _process_one_line method to reduce code duplication.
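
The effect of item 1 can be illustrated with a small, self-contained sketch. This is not the Doris implementation: ToyBatchReader, EofPolicy, and calls_until_eof are hypothetical names invented for this example, and the reader works on an in-memory vector of lines instead of a file. It only assumes that the reader tracks an internal end-of-input flag (standing in for _line_reader_eof) and compares the two ways of reporting eof to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical policies for reporting end of input to the caller.
enum class EofPolicy {
    kRowsZero,  // report eof only when a call produced zero rows
    kReaderEof  // report eof as soon as the underlying reader is exhausted
};

// Toy stand-in for a batch reader; NOT the Doris CsvReader.
class ToyBatchReader {
public:
    ToyBatchReader(std::vector<std::string> lines, std::size_t batch_size, EofPolicy policy)
            : _lines(std::move(lines)), _batch_size(batch_size), _policy(policy) {}

    // Fills *out with up to batch_size lines and sets *eof according to the policy.
    void get_next_block(std::vector<std::string>* out, bool* eof) {
        out->clear();
        while (out->size() < _batch_size && !_line_reader_eof) {
            if (_pos == _lines.size()) {
                // The underlying input is exhausted; remember it for later calls.
                _line_reader_eof = true;
                break;
            }
            out->push_back(_lines[_pos++]);
        }
        *eof = (_policy == EofPolicy::kRowsZero) ? out->empty() : _line_reader_eof;
    }

private:
    std::vector<std::string> _lines;
    std::size_t _batch_size;
    EofPolicy _policy;
    std::size_t _pos = 0;
    bool _line_reader_eof = false;  // assumed analogue of the flag named in the PR
};

// Drives the reader until it reports eof; returns the number of calls made
// and stores the total number of rows read in *total_rows.
std::size_t calls_until_eof(const std::vector<std::string>& lines, std::size_t batch_size,
                            EofPolicy policy, std::size_t* total_rows) {
    ToyBatchReader reader(lines, batch_size, policy);
    std::vector<std::string> block;
    std::size_t calls = 0;
    bool eof = false;
    *total_rows = 0;
    while (!eof) {
        reader.get_next_block(&block, &eof);
        ++calls;
        *total_rows += block.size();
    }
    return calls;
}
```

In this toy model, with five input lines and a batch size of four, the rows == 0 policy needs three calls (the third returns an empty block only to signal eof), while the reader-eof policy finishes in two calls and returns the same five rows. A caller under the second policy must still consume the rows returned in the same call that reports eof.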

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Nov 13, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@uchenily
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34266 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 744050cdbf3292e7865395d0d6a29b745e210151, data reload: false

------ Round 1 ----------------------------------
q1	17618	5155	4964	4964
q2	2019	315	207	207
q3	10250	1289	743	743
q4	10236	914	369	369
q5	7567	2423	2320	2320
q6	186	176	141	141
q7	921	779	642	642
q8	9365	1320	1037	1037
q9	6961	5087	5222	5087
q10	6829	2221	1834	1834
q11	500	314	294	294
q12	332	364	224	224
q13	17767	3698	3079	3079
q14	243	231	217	217
q15	567	517	498	498
q16	1022	1002	950	950
q17	583	861	356	356
q18	7441	7216	7216	7216
q19	1078	973	556	556
q20	363	344	242	242
q21	4061	2552	2323	2323
q22	1065	1064	967	967
Total cold run time: 106974 ms
Total hot run time: 34266 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5117	5100	5078	5078
q2	253	325	242	242
q3	2209	2682	2324	2324
q4	1357	1765	1315	1315
q5	4201	4407	4541	4407
q6	216	174	134	134
q7	2060	1998	1864	1864
q8	2646	2567	2598	2567
q9	7398	7343	7560	7343
q10	2972	3227	2884	2884
q11	612	552	523	523
q12	730	784	665	665
q13	3546	4003	3243	3243
q14	307	309	282	282
q15	549	498	483	483
q16	1036	1145	1187	1145
q17	1182	1525	1443	1443
q18	7929	7856	7616	7616
q19	794	803	868	803
q20	2004	2141	1931	1931
q21	5096	4706	4470	4470
q22	1123	1046	1004	1004
Total cold run time: 53337 ms
Total hot run time: 51766 ms

@doris-robot

TPC-DS: Total hot run time: 188089 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 744050cdbf3292e7865395d0d6a29b745e210151, data reload: false

query1	1021	399	419	399
query2	6557	1709	1721	1709
query3	6763	223	223	223
query4	26529	24155	23549	23549
query5	4978	618	477	477
query6	325	242	218	218
query7	4666	497	296	296
query8	308	253	255	253
query9	8746	2606	2561	2561
query10	533	337	289	289
query11	15543	15018	14786	14786
query12	180	123	116	116
query13	1698	571	449	449
query14	10805	9330	9238	9238
query15	225	198	174	174
query16	7583	708	575	575
query17	1189	758	600	600
query18	2025	423	318	318
query19	205	198	169	169
query20	128	122	125	122
query21	214	131	118	118
query22	4013	4082	4077	4077
query23	34038	33006	33034	33006
query24	8483	2395	2407	2395
query25	581	511	448	448
query26	1241	273	161	161
query27	2751	514	345	345
query28	4414	2231	2213	2213
query29	791	610	522	522
query30	309	230	198	198
query31	901	848	739	739
query32	84	74	69	69
query33	587	383	323	323
query34	790	852	527	527
query35	828	840	754	754
query36	942	999	909	909
query37	125	111	86	86
query38	3573	3489	3429	3429
query39	1465	1405	1395	1395
query40	218	128	116	116
query41	61	60	62	60
query42	126	120	121	120
query43	508	505	471	471
query44	1227	742	732	732
query45	192	188	172	172
query46	883	991	646	646
query47	1753	1761	1696	1696
query48	392	415	332	332
query49	794	498	421	421
query50	635	690	402	402
query51	3891	3977	3916	3916
query52	108	113	105	105
query53	244	265	205	205
query54	301	304	280	280
query55	85	86	86	86
query56	330	326	312	312
query57	1179	1182	1112	1112
query58	285	272	269	269
query59	2559	2680	2503	2503
query60	349	348	331	331
query61	161	153	160	153
query62	812	710	675	675
query63	231	196	197	196
query64	4435	1164	839	839
query65	4029	3939	3939	3939
query66	1136	438	332	332
query67	15164	15197	14855	14855
query68	4621	899	604	604
query69	506	337	300	300
query70	1338	1296	1280	1280
query71	429	347	322	322
query72	6065	5110	5117	5110
query73	621	591	366	366
query74	8910	9013	8649	8649
query75	3321	3343	2793	2793
query76	3275	1164	715	715
query77	523	405	318	318
query78	9600	9945	8935	8935
query79	2092	831	620	620
query80	1687	576	508	508
query81	591	273	237	237
query82	405	156	128	128
query83	367	268	251	251
query84	255	112	94	94
query85	932	485	438	438
query86	467	311	285	285
query87	3682	3743	3656	3656
query88	2903	2287	2265	2265
query89	389	330	291	291
query90	1932	222	218	218
query91	168	163	132	132
query92	80	71	64	64
query93	1844	1007	645	645
query94	774	456	356	356
query95	390	329	315	315
query96	493	583	287	287
query97	2907	2988	2863	2863
query98	232	221	213	213
query99	1314	1429	1320	1320
Total cold run time: 271338 ms
Total hot run time: 188089 ms

@doris-robot

ClickBench: Total hot run time: 27.77 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 744050cdbf3292e7865395d0d6a29b745e210151, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.08	0.08
query4	1.60	0.11	0.11
query5	0.26	0.25	0.25
query6	1.19	0.65	0.63
query7	0.03	0.03	0.02
query8	0.06	0.04	0.04
query9	0.58	0.54	0.51
query10	0.58	0.57	0.57
query11	0.16	0.12	0.11
query12	0.15	0.12	0.12
query13	0.62	0.60	0.59
query14	1.00	1.00	1.01
query15	0.84	0.82	0.83
query16	0.40	0.39	0.40
query17	1.02	1.04	1.03
query18	0.22	0.20	0.20
query19	1.92	1.85	1.87
query20	0.02	0.01	0.02
query21	15.42	0.21	0.13
query22	4.97	0.07	0.05
query23	15.69	0.27	0.10
query24	2.74	0.52	0.50
query25	0.08	0.07	0.05
query26	0.14	0.13	0.15
query27	0.08	0.06	0.05
query28	4.46	1.13	0.94
query29	12.66	3.94	3.39
query30	0.28	0.14	0.11
query31	2.82	0.60	0.38
query32	3.23	0.55	0.46
query33	3.03	3.11	3.04
query34	15.71	5.15	4.55
query35	4.59	4.61	4.61
query36	0.69	0.51	0.49
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.04	0.03	0.03
query40	0.19	0.14	0.14
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.03
Total cold run time: 98.19 s
Total hot run time: 27.77 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/41) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.80% (18280/34619)
Line Coverage 38.17% (166198/435373)
Region Coverage 33.16% (129180/389536)
Branch Coverage 33.90% (55439/163541)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 87.80% (36/41) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.55% (24340/34019)
Line Coverage 58.01% (252970/436049)
Region Coverage 53.38% (210880/395047)
Branch Coverage 54.67% (89988/164616)

@uchenily
Contributor Author

run p0

4 participants