Core, Data, Flink: Moving Flink to use the new FormatModel API#15329
Core, Data, Flink: Moving Flink to use the new FormatModel API#15329pvary merged 4 commits intoapache:mainfrom
Conversation
| abstract class FlinkSchemaVisitor<T> { | ||
|
|
||
| static <T> T visit(RowType flinkType, Schema schema, FlinkSchemaVisitor<T> visitor) { | ||
| public static <T> T visit(RowType flinkType, Schema schema, FlinkSchemaVisitor<T> visitor) { |
There was a problem hiding this comment.
is this public change required?
There was a problem hiding this comment.
Nope. Remained from old versions.
Reverted
| visitor.beforeStruct(struct.asStructType()); | ||
|
|
||
| LogicalType fieldFlinkType = rowType.getTypeAt(fieldIndex); | ||
| try { |
There was a problem hiding this comment.
why do we need to switch the visit impl to try-finally? if there is any exception, it would just fail. is it important to call the after methods?
There was a problem hiding this comment.
We have the same pattern for every field, that the after method is called in the finally. Since I have added the afterStruct, I have decided to follow the same pattern here as well.
| .split(task.start(), task.length()) | ||
| .caseSensitive(caseSensitive) | ||
| .filter(task.residual()) | ||
| .reuseContainers() |
There was a problem hiding this comment.
the old newOrcIterable method doesn't set this boolean flag since Orc ReadBuilder doesn't support it. I forgot how the new API handles the difference
There was a problem hiding this comment.
We check that the value is always set to the expected value:
iceberg/orc/src/main/java/org/apache/iceberg/orc/ORCFormatModel.java
Lines 290 to 291 in ebaafde
| public void testPromotedFlinkDataType() throws Exception { | ||
| Schema iSchema = | ||
| new Schema( | ||
| Types.NestedField.required( |
There was a problem hiding this comment.
can you help me understand the purpose of expanded types in this test?
There was a problem hiding this comment.
I remember that in one of the versions of the code I needed to test embedded objects. It's kind of late here, but it was somewhat related to the change with the FlinkSchemaVisitor, but I can't find the usage anymore. So it might not be needed in the end.
Let me revert the changes and see if any tests fail. That will help me in the morning to remember the history
|
Merged to main. |
Part of: #12298
Implementation of the new API: #12774
FlinkFormatModel and related changes