-
Notifications
You must be signed in to change notification settings - Fork 59
Batch processor testing, comments #1428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| (_, _, Some(id)) => { | ||
| // Comment reproduced from the call site, explaining why the parent_id | ||
| // column may be Some(_) or None: | ||
| // | ||
| // When `parent` has both ID and PARENT_ID columns, resort by ID. Why? Because | ||
| // the reindexing code requires that the input be sorted. For all these cases, | ||
| // we've already reindexed by PARENT_ID in an earlier iteration of this loop. | ||
| let id_values = columns[id].clone(); | ||
| smallvec::smallvec![SortColumn { | ||
| values: id_values, | ||
| options, | ||
| }] | ||
| } | ||
| _ => unreachable!(), | ||
| (_, _, None) => { | ||
| // No id or parent_id columns, no sorting required. | ||
| return RecordBatch::try_new(schema, columns) | ||
| .map_err(|e| Error::Batching { source: e }); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am suspicious of the changes here. I had uncovered a way to hit the unreachable! below, and after making these changes find myself a bit confused by the comment reproduced below (from above).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure to understand the comment neither.
| // All the present columns should have the same Field definition, so just pick the first | ||
| // one arbitrarily; we know there has to be at least one because if there were none, we | ||
| // wouldn't have a mismatch to begin with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dubious comment here.
| /// Generate test OTLP metrics data at a timestamp offset | ||
| #[must_use] | ||
| pub fn generate_metrics(&mut self) -> MetricsData { | ||
| // TODO: @@@ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where I'm not finished. I was working to exercise the point-count-over-limit case, which isn't tested but clearly needs it. I would be glad to finish the testing fixture work here.
lquerel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick review. I know you know this area inside out and I don't want to slow you down, so I'm approving it.
a69d99d
This contains several edge-case improvements in the crates/pdata batching logic.
Adds new batch round-trip testing framework and a small number of new tests. More testing with synthetic data and specific test fixtures will be necessary.
Adds otap_df_pdata::testing::DataGenerator as a placeholder for future work. This enables the batching tests to obtain multiple payloads with distinct timestamps, presently. This should be expanded to cover more variation in the data, maybe following the approach used in Phase 1's go/pkg/datagen, maybe using OTel-Weaver.
Modifies the public grouping API to take SignalType and restrict to one signal per call. This was already being handled in the crates/otap batch_processor. The code maintains the same structure, just returns an error if there is mixed telemetry data. We can revisit this later, otherwise the batching logic was treating the signals as logically separate anyway, so the wider API was not providing any benefit.
Note: This PR previously included refactoring in the batch processor, but it was not required. The component is modified only for the changed API: one #[ignore] test was removed (had mixed signals), one test was added to maintain coverage (for metrics).
The changes in groups.rs mainly address a degenerate case in logs (no ID b/c no LogsAttributes) and defects in metric batching: