Support protobuf encoding default AnyValues from missing optional columns
#1447
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1447      +/-   ##
==========================================
+ Coverage   83.98%   84.00%   +0.01%
==========================================
  Files         400      400
  Lines      110838   110897      +59
==========================================
+ Hits        93083    93154      +71
+ Misses      17221    17209      -12
  Partials      534      534
        .unwrap_or_default();
    result_buf.encode_bytes(ANY_VALUE_BYTES_VALUE, val);
}
AttributeValueType::Map | AttributeValueType::Slice => {
Do we not need to do the same for these types?
I suspect yes.
I would say this follows from https://protobuf.dev/programming-guides/proto3/, because the value in AnyValue is a oneof field:
"If you set a oneof field to the default value (such as setting an int32 oneof field to 0), the “case” of that oneof field will be set, and the value will be serialized on the wire."
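To make the quoted rule concrete, here is a minimal sketch of what "serialized on the wire" means for a default oneof value. It is a hand-rolled encoder for illustration only, not the project's encoder; the helper names `put_varint` and `encode_any_value_int` are hypothetical, and the field number 3 for `int_value` is taken from the OTLP `AnyValue` definition.

```rust
// Illustrative protobuf wire encoding. All names here are hypothetical
// and are not part of the project under review.

/// Append a base-128 varint to `buf`.
fn put_varint(buf: &mut Vec<u8>, mut v: u64) {
    while v >= 0x80 {
        buf.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    buf.push(v as u8);
}

/// Field number of `int_value` inside the `AnyValue.value` oneof.
const ANY_VALUE_INT_VALUE: u64 = 3;
/// Protobuf wire type for varint-encoded scalars.
const WIRE_TYPE_VARINT: u64 = 0;

/// Encode an AnyValue whose oneof case is `int_value`.
/// Because the field belongs to a oneof, the tag and value are written
/// even when `v` is the default value 0.
fn encode_any_value_int(v: i64) -> Vec<u8> {
    let mut buf = Vec::new();
    put_varint(&mut buf, (ANY_VALUE_INT_VALUE << 3) | WIRE_TYPE_VARINT);
    put_varint(&mut buf, v as u64);
    buf
}
```

With this sketch, `encode_any_value_int(0)` yields the two bytes `18 00` rather than an empty buffer, which is the difference between "the int case is set to 0" and "no value is set".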
I don't think we want to handle this the same way we do for the primitive types.
This attr_ser column is an optional binary column, which OTAP encoding will omit if either all values are null, or they're all zero-length byte arrays.
But for maps and lists, what ends up in the attr_ser array is CBOR-encoded. Even a default (empty) map or list is still serialized as a non-zero-length value under this encoding scheme.
For example:
let log_rec = LogsData {
    resource_logs: vec![ResourceLogs {
        scope_logs: vec![ScopeLogs {
            log_records: vec![
                LogRecord {
                    body: Some(AnyValue::new_array(ArrayValue::default())),
                    ..Default::default()
                },
                LogRecord {
                    body: Some(AnyValue::new_kvlist(KeyValueList::default())),
                    ..Default::default()
                }
            ],
            ..Default::default()
        }],
        ..Default::default()
    }],
    ..Default::default()
};
let otap_batch = encode_logs(&log_rec);
let logs_rb = otap_batch.get(ArrowPayloadType::Logs).unwrap();
arrow::util::pretty::print_batches(&[logs_rb.clone()]).unwrap();
prints:
+----------+---------+---------------------+-------------------------+----------------------+
| resource | scope | time_unix_nano | observed_time_unix_nano | body |
+----------+---------+---------------------+-------------------------+----------------------+
| {id: 0} | {id: 0} | 1970-01-01T00:00:00 | 1970-01-01T00:00:00 | {type: 6, ser: 9fff} |
| {id: 0} | {id: 0} | 1970-01-01T00:00:00 | 1970-01-01T00:00:00 | {type: 5, ser: bfff} |
+----------+---------+---------------------+-------------------------+----------------------+
It would actually be unusual if the attribute type column indicated that the value type should be map/list and the column then wasn't present in the attributes record batch, so I think we should just keep the current logic for these types and let it return null if that happens.
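For readers unfamiliar with CBOR, the `9fff` and `bfff` values in the printed batch are an empty indefinite-length array and an empty indefinite-length map: a start byte followed by the break byte. The sketch below builds those two byte strings by hand to show why an empty container is never a zero-length value; the constant and function names are hypothetical and this is not how the project produces its CBOR.

```rust
// CBOR initial bytes for indefinite-length containers (RFC 8949).
// Names are illustrative and not taken from the project.
const CBOR_ARRAY_INDEFINITE: u8 = 0x9f;
const CBOR_MAP_INDEFINITE: u8 = 0xbf;
const CBOR_BREAK: u8 = 0xff;

/// Bytes of an empty indefinite-length CBOR array.
fn empty_cbor_array() -> Vec<u8> {
    vec![CBOR_ARRAY_INDEFINITE, CBOR_BREAK]
}

/// Bytes of an empty indefinite-length CBOR map.
fn empty_cbor_map() -> Vec<u8> {
    vec![CBOR_MAP_INDEFINITE, CBOR_BREAK]
}

/// Lowercase hex rendering, matching how the bytes appear in the printed batch.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

Both values are two bytes long, so a column holding only empty maps or lists does not meet the "all zero-length byte arrays" condition for being omitted.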
3081202
fixes: #1445
We have a handful of optional columns that represent AnyValue's value. If such a column is full of default values, we omit it. However, there was a bug in the encoding of these fields into OTLP protobuf: we treated the missing column as a null value, which is not correct if the value_type column is not ValueType::Empty. These values must be encoded in OTLP, despite their default value, because they are protobuf oneof fields.
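The decision described here can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this PR: the `ValueType` and `Encoded` enums and the `value_for_missing_column` function are hypothetical stand-ins, and the Map/Slice arm reflects the review discussion's conclusion that those types keep returning null.

```rust
/// Hypothetical stand-in for the value type column.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ValueType {
    Empty,
    Str,
    Int,
    Double,
    Bool,
    Map,
    Slice,
    Bytes,
}

/// Hypothetical stand-in for what gets written into the AnyValue oneof.
#[derive(Debug, PartialEq)]
enum Encoded {
    Null,
    Str(String),
    Int(i64),
    Double(f64),
    Bool(bool),
    Bytes(Vec<u8>),
}

/// What to encode for a row whose optional value column was omitted.
/// Primitive types fall back to their default so the oneof case is still set;
/// Empty stays null, and Map/Slice stay null because their serialized form is
/// never zero-length, so a missing column is not expected for them.
fn value_for_missing_column(value_type: ValueType) -> Encoded {
    match value_type {
        ValueType::Str => Encoded::Str(String::new()),
        ValueType::Int => Encoded::Int(0),
        ValueType::Double => Encoded::Double(0.0),
        ValueType::Bool => Encoded::Bool(false),
        ValueType::Bytes => Encoded::Bytes(Vec::new()),
        ValueType::Empty | ValueType::Map | ValueType::Slice => Encoded::Null,
    }
}
```

Matching exhaustively on the value type, rather than on the presence of the column, keeps the "missing column" case from silently collapsing into null for every type.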