Skip to content

Conversation

@albertlockett
Copy link
Member

@albertlockett albertlockett commented Nov 18, 2025

fixes: #1445

We have a handful of columns that that represent AnyValue's value that are optional columns. If the column is full of default values, we omit these columns. However there was a bug in the encoding of these fields into OTLP protobuf, we treated the missing column as a null value which is not correct if the value_type column is not ValueType::Empty.

These values must be encoded in OTLP, despite their default value, because they are protobuf oneof fields.

@albertlockett albertlockett requested a review from a team as a code owner November 18, 2025 17:57
@github-actions github-actions bot added the rust Pull requests that update Rust code label Nov 18, 2025
@codecov
Copy link

codecov bot commented Nov 18, 2025

Codecov Report

❌ Patch coverage is 93.82716% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.00%. Comparing base (0fe75aa) to head (b18f9e7).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1447      +/-   ##
==========================================
+ Coverage   83.98%   84.00%   +0.01%     
==========================================
  Files         400      400              
  Lines      110838   110897      +59     
==========================================
+ Hits        93083    93154      +71     
+ Misses      17221    17209      -12     
  Partials      534      534              
Components Coverage Δ
otap-dataflow 85.83% <93.82%> (+0.03%) ⬆️
query_abstraction 80.61% <ø> (ø)
query_engine 90.61% <ø> (ø)
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 53.50% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

.unwrap_or_default();
result_buf.encode_bytes(ANY_VALUE_BYTES_VALUE, val);
}
AttributeValueType::Map | AttributeValueType::Slice => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to do the same for these types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect yes.

I would say this follows from https://protobuf.dev/programming-guides/proto3/
because AnyValue is a oneof field

If you set a oneof field to the default value (such as setting an int32 oneof field to 0), the “case” of that oneof field will be set, and the value will be serialized on the wire.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to handle this the same way we do for the primitive types.

This attr_ser column is an optional binary column, which OTAP encoding will omit if either all values are null, or they're all zero-length byte arrays.

But for maps and lists, what ends up in the attr_ser array is cbor encoded. Even a default (empty) map or list will still be serialized as a non zero length value with this encoding scheme.

For example:

let log_rec = LogsData {
    resource_logs: vec![ResourceLogs {
        scope_logs: vec![ScopeLogs {
            log_records: vec![
                LogRecord {
                    body: Some(AnyValue::new_array(ArrayValue::default())),
                    ..Default::default()
                },
                LogRecord {
                    body: Some(AnyValue::new_kvlist(KeyValueList::default())),
                    ..Default::default()
                }
            ],
            ..Default::default()
        }],
        ..Default::default()
    }],
    ..Default::default()
};

let otap_batch = encode_logs(&log_rec);
let logs_rb = otap_batch.get(ArrowPayloadType::Logs).unwrap();
arrow::util::pretty::print_batches(&[logs_rb.clone()]).unwrap();

prints:

+----------+---------+---------------------+-------------------------+----------------------+
| resource | scope   | time_unix_nano      | observed_time_unix_nano | body                 |
+----------+---------+---------------------+-------------------------+----------------------+
| {id: 0}  | {id: 0} | 1970-01-01T00:00:00 | 1970-01-01T00:00:00     | {type: 6, ser: 9fff} |
| {id: 0}  | {id: 0} | 1970-01-01T00:00:00 | 1970-01-01T00:00:00     | {type: 5, ser: bfff} |
+----------+---------+---------------------+-------------------------+----------------------+

It would actually be unusual if the attribute type column indicated the the value type should be map/list and then the column wasn't present in the attributes record batch, so I think we should just keep the current logic for these types and let it return null if it happens.

@jmacd jmacd enabled auto-merge November 18, 2025 21:49
@jmacd jmacd added this pull request to the merge queue Nov 18, 2025
Merged via the queue into open-telemetry:main with commit 3081202 Nov 18, 2025
33 of 56 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Roundtrip conversion of AnyValues not correct if attribute values are all default value

3 participants