Open
Labels
enhancement (New feature or request), medium (Medium priority), testing (Tests)
Description
To make performance improvements measurable, add benchmarks that can be run on demand.
This requires benchmark programs (see https://github.com/apache/arrow-rs/tree/master/parquet/benches for an example).
It also requires large data files, ideally covering all supported data types.
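As a rough illustration of what a benchmark program could look like, here is a minimal std-only timing sketch. The names `bench` and `BenchResult` are hypothetical and not part of any existing crate. The linked arrow-rs benches use the Criterion crate, which handles warm-up and statistics far more carefully, so this only shows the general shape.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Outcome of one benchmark run (hypothetical type, for illustration only).
pub struct BenchResult {
    pub name: String,
    pub iterations: u32,
    pub total: Duration,
}

impl BenchResult {
    /// Mean wall-clock time per iteration.
    pub fn mean(&self) -> Duration {
        self.total / self.iterations.max(1)
    }
}

/// Runs `f` once as a warm-up, then times `iterations` further calls.
/// `black_box` keeps the optimizer from discarding the measured work.
pub fn bench<T>(name: &str, iterations: u32, mut f: impl FnMut() -> T) -> BenchResult {
    black_box(f()); // warm-up call, not timed
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(f());
    }
    BenchResult {
        name: name.to_string(),
        iterations,
        total: start.elapsed(),
    }
}
```

A call such as `bench("decode_ints", 100, || decode(&bytes))` would then report the mean time per call, where `decode` and `bytes` stand in for whatever reader is being measured.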
Note that completely random data may not be sufficient for the data files: some encodings take advantage of patterns in the data (e.g. integer RLE v2), so any data generated for the benchmarks should include such patterns.
Existing datasets such as TPC-H, TPC-DS, or the NYC taxi data could also be used for more variety.
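If data is generated rather than taken from an existing dataset, one option is to produce several value distributions from a fixed seed, so that run-length and delta-style encodings are exercised as well as the random case. The sketch below uses only the standard library. `Pattern`, `generate`, and `XorShift64` are hypothetical names, and the value ranges are arbitrary assumptions rather than anything the issue specifies.

```rust
/// Small deterministic PRNG (xorshift64*), so generated data is reproducible
/// without external crates. Not suitable for anything security-related.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // the internal state must never be zero
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }
}

/// Shapes of integer data to benchmark against (illustrative, not exhaustive).
pub enum Pattern {
    /// Uniformly random values: the worst case for pattern-based encodings.
    Random,
    /// Repeated values in runs of 1..=max_run: favours run-length encoding.
    Runs { max_run: usize },
    /// Fixed-step sequence: favours delta encoding.
    Delta { step: i64 },
}

/// Generates `len` values following `pattern`, deterministically from `seed`.
pub fn generate(pattern: Pattern, len: usize, seed: u64) -> Vec<i64> {
    let mut rng = XorShift64::new(seed);
    let mut out = Vec::with_capacity(len);
    match pattern {
        Pattern::Random => {
            for _ in 0..len {
                out.push(rng.next_u64() as i64);
            }
        }
        Pattern::Runs { max_run } => {
            while out.len() < len {
                let value = (rng.next_u64() % 1000) as i64;
                let run = 1 + (rng.next_u64() as usize % max_run.max(1));
                let take = run.min(len - out.len());
                out.extend(std::iter::repeat(value).take(take));
            }
        }
        Pattern::Delta { step } => {
            let start = (rng.next_u64() % 1000) as i64;
            for i in 0..len {
                out.push(start.wrapping_add(step.wrapping_mul(i as i64)));
            }
        }
    }
    out
}
```

Because the seed is fixed, the same columns can be regenerated on every run instead of checking large files into the repository. Real datasets would still be useful for distributions that a simple generator like this does not capture.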