ci: exclusion of metrics added #7671
base: main
Conversation
Signed-off-by: Tushar Anand <[email protected]>
Force-pushed from bd4ceed to 8c4232c
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main    #7671   +/-   ##
=======================================
  Coverage   96.54%   96.54%
=======================================
  Files         384      384
  Lines       19502    19502
=======================================
  Hits        18828    18828
  Misses        489      489
  Partials      185      185
=======================================

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Review comment on:

    should_exclude = should_exclude_metric(sample.name, labels)
    if should_exclude:

Suggested change:

    if should_exclude_metric(sample.name, labels):
it would be good to count these exclusions in stats
> it would be good to count these exclusions in stats

Like the ones posted in the PR, if some stats are excluded?
yes (although I am not sure excluding full metrics is a good thing, per the other comment).
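As a rough illustration of the "count these exclusions in stats" idea, here is a minimal sketch; the should_exclude_metric stub, filter_samples, and exclusion_counts below are illustrative names for this note, not code from this PR:

    from collections import Counter

    def should_exclude_metric(name, labels):
        # Stand-in for the rule-based helper this PR introduces.
        return False

    # Tally of samples dropped per metric name, so exclusions are reported
    # in the comparison output instead of disappearing silently.
    exclusion_counts = Counter()

    def filter_samples(samples):
        kept = []
        for sample, labels in samples:
            if should_exclude_metric(sample.name, labels):
                exclusion_counts[sample.name] += 1
                continue
            kept.append((sample, labels))
        return kept

    def print_exclusion_stats():
        for name, count in sorted(exclusion_counts.items()):
            print(f"excluded {count} sample(s) of {name}")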
Review comment on the added exclusion rules:

    METRIC_EXCLUSION_RULES = {
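For context, a sketch of what such a rules table and lookup might look like; only the METRIC_EXCLUSION_RULES and should_exclude_metric names appear in the diff, while the rule contents and matching logic below are assumptions for illustration:

    # Hypothetical rule shape: metric name -> label -> label values to drop.
    METRIC_EXCLUSION_RULES = {
        "example_requests_total": {"status_code": {"503"}},  # placeholder metric and labels
    }

    def should_exclude_metric(name, labels):
        # Exclude a sample only if one of its labels matches an exclusion rule.
        rules = METRIC_EXCLUSION_RULES.get(name)
        if not rules:
            return False
        return any(labels.get(label) in values for label, values in rules.items())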
Rather than excluding, could we add a curl to the CI scripts that always fails (e.g. looking for a trace ID we know doesn't exist)? The main point of the metrics regression checks is to make sure all metrics are covered (while ignoring the actual values).
> Rather than excluding, could we add a curl to the CI scripts that always fails (e.g. looking for a trace ID we know doesn't exist)?

I might be misunderstanding the approach, so could you clarify how the curl is meant to work? Would the idea be to make a request to an endpoint that we know will fail, so that a 503 is generated and included in the comparison whenever it is missing from one of the metrics snapshots?
yes.
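For reference, a sketch of the always-failing request discussed above, written in Python rather than as a curl one-liner; the query port, API path, and trace ID are illustrative assumptions, not values taken from the CI scripts:

    import urllib.error
    import urllib.request

    def poke_missing_trace(base_url="http://localhost:16686"):
        # Request a trace ID that is known not to exist so the query service
        # records an error sample, making that metric series appear in every
        # CI run rather than only in flaky ones.
        url = f"{base_url}/api/traces/ffffffffffffffff"
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                print("unexpected success:", resp.status)
        except urllib.error.HTTPError as err:
            # The error response is the point: it should populate the failure metric.
            print("expected failure status:", err.code)
        except urllib.error.URLError as err:
            print("request did not reach the service:", err.reason)

    if __name__ == "__main__":
        poke_missing_trace()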
Hi @yurishkuro, I tried to implement this approach to generate the missing metrics, but I observed a few inconsistencies.

First, it is hard to pin down the exact URL or route that produces the flaky metrics. I tried curl -v http://localhost:13133/health and curl -v http://localhost:13133/ on my local system, assuming the metrics came from those endpoints, but the generated metrics didn't exactly match. Sometimes a 503 was generated that didn't match the ones seen in CI, and sometimes the request returned 200. In CI, I believe these metrics come from a small timing mismatch between the service start-up and the OTEL Collector, which is difficult to replicate.

Second, it is hard to decide which metrics are genuinely different and which are just flaky. We can use a rule-based approach for that, like the one in this PR, but if we generate the metrics manually they have to match the expected format exactly.

Given this, I feel that displaying the excluded metrics, as you suggested, would be more appropriate.

Could you please take a look and suggest how I should move forward, or whether I should try a different approach?
Which problem is this PR solving?
Description of the changes
How was this change tested?
Before changes
(screenshot)
After changes
(screenshot)
Checklist
jaeger: make lint test
jaeger-ui: npm run lint and npm run test