Commit 34e7e04
[feature](catalog) support varbinary type mapping in hive/iceberg/paimon table (apache#57821)
Problem Summary:
Support the varbinary type in Hive/Iceberg/Paimon tables. Binary columns
can now be mapped directly to the Doris varbinary type instead of the
string type. The catalog property enable.mapping.varbinary controls this
mapping and defaults to false.
TVFs (e.g. HDFS) also have a parameter that controls this mapping; it
defaults to false as well.
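The toggle described above can be illustrated with a minimal sketch. This is not the Doris FE code: the function name `map_external_type`, the set `BINARY_SOURCE_TYPES`, and the source type spellings are hypothetical and chosen only for illustration. The only details taken from the commit message are that the mapping is opt-in and that the default is false.

```python
# Hypothetical helper: a sketch of the opt-in mapping, not the Doris FE code.
# The source type spellings below are assumptions made for illustration.
BINARY_SOURCE_TYPES = {"binary", "varbinary", "bytes"}


def map_external_type(source_type: str, enable_mapping_varbinary: bool = False) -> str:
    """Return the Doris-side type name chosen for an external column type."""
    name = source_type.strip().lower()
    if name in BINARY_SOURCE_TYPES:
        # With the property off (the default), binary data is still mapped to string.
        return "varbinary" if enable_mapping_varbinary else "string"
    # Non-binary types are unaffected by the property in this sketch.
    return name


print(map_external_type("BINARY"))                                 # string
print(map_external_type("BINARY", enable_mapping_varbinary=True))  # varbinary
```

Keeping the default at false means existing catalogs continue to see string columns until the property is set explicitly.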
1. When a Parquet column's physical type is tparquet::Type::BYTE_ARRAY and
neither logicalType nor converted_type is set, read it into
column_varbinary directly, so the physical and logical conversions are
consistent.
If the column is tparquet::Type::BYTE_ARRAY and has a logicalType set
(e.g. String), it is read as column_string; if the table column was
created as a binary column, VarBinaryConverter then converts
column_string to column_varbinary.
2. When an ORC column is of binary type, it is also mapped directly to the
varbinary type, and StringVectorBatch can be reused.
3. Add casts between the string and varbinary types.
4. Map UUID to the binary type instead of string in Iceberg.
5. Change the signature of bool safe_cast_string(**const char\* startptr,
size_t buffer_size**, xxx) to safe_cast_string(**const StringRef&
str_ref**, xxx).
6. Add **const** to the read_date_text_impl function.
7. Add some tests that exercise varbinary through the Paimon catalog. More
cases for Hive/Iceberg and a doc update will follow.
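The Parquet rule in item 1 can be sketched as a small decision function. This is only a model of the behavior described in the commit message, not the BE reader code: `byte_array_read_target` and `needs_varbinary_converter` are hypothetical names, and annotations are represented as plain strings or `None`.

```python
from typing import Optional


def byte_array_read_target(logical_type: Optional[str],
                           converted_type: Optional[str]) -> str:
    """Pick the column kind a Parquet BYTE_ARRAY column is first read into.

    Hypothetical model of item 1: an unannotated BYTE_ARRAY goes straight
    to column_varbinary; an annotated one (e.g. String) goes to column_string.
    """
    if logical_type is None and converted_type is None:
        return "column_varbinary"
    return "column_string"


def needs_varbinary_converter(read_target: str, table_column_is_binary: bool) -> bool:
    """Per item 1, a second conversion step is needed only when the data was
    read as column_string but the table column was created as binary."""
    return read_target == "column_string" and table_column_is_binary


target = byte_array_read_target("STRING", None)
print(target, needs_varbinary_converter(target, table_column_is_binary=True))
```

In this model the unannotated case never needs VarBinaryConverter, which is the sense in which the physical and logical conversions stay consistent.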
```
mysql> show create table binary_demo3;
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| binary_demo3 | CREATE TABLE `binary_demo3` (
`id` int NULL,
`record_name` char(10) NULL,
`vrecord_name` text NULL,
`bin` varbinary(10) NULL,
`varbin` varbinary(2147483647) NULL
) ENGINE=PAIMON_EXTERNAL_TABLE
LOCATION 'file:/mnt/disk2/zhangsida/test_paimon/demo.db/binary_demo3'
PROPERTIES (
"path" = "file:/mnt/disk2/zhangsida/test_paimon/demo.db/binary_demo3",
"primary-key" = "id"
); |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql> select *, length(record_name),length(vrecord_name),length(bin),length(varbin) from binary_demo3;
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
| id | record_name | vrecord_name | bin | varbin | length(record_name) | length(vrecord_name) | length(bin) | length(varbin) |
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
| 1 | AAAA | AAAA | 0xAAAA0000000000000000 | 0xAAAA | 10 | 4 | 10 | 2 |
| 2 | 6161 | 6161 | 0x61610000000000000000 | 0x6161 | 10 | 4 | 10 | 2 |
| 3 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+------+-------------+--------------+------------------------+----------------+---------------------+----------------------+-------------+----------------+
```
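In the query output above, the fixed-width `bin` column (varbinary(10)) shows ten bytes with trailing zeros, while `varbin` keeps only the two stored bytes. The sketch below reproduces those displayed values under the assumption that the fixed-width value is right-padded with 0x00 bytes. That assumption is inferred from the output, not confirmed by the commit, and the helper names are hypothetical.

```python
def pad_fixed_binary(value: bytes, width: int) -> bytes:
    """Right-pad a value with zero bytes to a fixed width (assumed behavior)."""
    if len(value) > width:
        raise ValueError("value is longer than the fixed width")
    return value.ljust(width, b"\x00")


def to_hex_literal(value: bytes) -> str:
    """Render bytes in the 0x... form used by the mysql client output above."""
    return "0x" + value.hex().upper()


raw = bytes.fromhex("AAAA")
padded = pad_fixed_binary(raw, 10)
print(to_hex_literal(padded), len(padded))  # 0xAAAA0000000000000000 10
print(to_hex_literal(raw), len(raw))        # 0xAAAA 2
```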
1 parent 34be474 commit 34e7e04
File tree
94 files changed, +2274 -208 lines changed
- be
- src
- util/arrow
- vec
- columns
- data_types/serde
- exec/format
- orc
- parquet
- functions/cast
- io
- runtime
- test
- exec/test_data/parquet_scanner
- vec
- columns
- data_types/serde
- exec/format/parquet
- docker/thirdparties/docker-compose
- hive/scripts
- create_preinstalled_scripts
- paimon1/db1.db
- binary_demo3
- bucket-0
- index
- manifest
- schema
- snapshot
- binary_size_test
- bucket-0
- index
- manifest
- schema
- snapshot
- iceberg/scripts/create_preinstalled_scripts/iceberg
- fe
- fe-common/src/main/java/org/apache/doris/catalog
- fe-core/src
- main/java/org/apache/doris
- common/util
- datasource
- hive
- iceberg
- paimon
- source
- property/fileformat
- nereids
- rules/expression
- check
- rules
- trees/expressions/literal
- types
- tablefunction
- test/java/org/apache/doris/datasource/paimon
- gensrc/thrift
- regression-test
- data
- external_table_p0
- hive
- iceberg
- paimon
- tvf
- query_p0/sql_functions/binary_functions
- suites
- external_table_p0
- hive
- iceberg
- paimon
- query_p0/sql_functions/binary_functions