-
Notifications
You must be signed in to change notification settings - Fork 35
Improve logging of rebalancer and recovery #586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve logging of rebalancer and recovery #586
Conversation
6b1057d to
64cc837
Compare
Serpentian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the comments for the first two commits, more comments are coming later) Thank you for working on this, good logging is crucial and allows us to investigate, what happened during incidents
5a8b3f8 to
f5c25f7
Compare
Serpentian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, shit. I forgot to send the last message of review, I'm very sorry
04c506f to
ccff54f
Compare
46add65 to
a1c095b
Compare
Gerold103
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch! Great that you have taken this task. These logs really needed some polishing.
9623c34 to
62c138e
Compare
Gerold103
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💪✍️
62c138e to
2ccab27
Compare
40bbd59 to
cc07222
Compare
Before this patch "Finish bucket recovery step ..." logs were printed at the end of recovery even if no buckets were successfully recovered. It led to unnecessary log records. This patch fixes the issue by adding an additional check for the number of recovered buckets. Part of tarantool#212 NO_DOC=bugfix
This patch introduces logging of buckets' ids which were recovered during recovery stage of storage. Part of tarantool#212 NO_DOC=bugfix
cc07222 to
b91d290
Compare
This patch adds rebalancer routes' logging. The log file now includes information about the source storage, the number of buckets, and the destination storage where the buckets will be moved. Since the rebalancer service has changed logging of routes that were sent, we change the `rebalancer/rebalancer.test.lua` and `rebalancer/stress_add_remove_several_rs.test.lua` tests. Part of tarantool#212 NO_DOC=bugfix
Before this patch the function `rebalancer_request_state` returned only nil in case of errors (e.g. presence of SENDING, RECEIVING, GARBAGE buckets, not active rebalancer). This makes it inconsistent compared to other rebalancer functions. Now, in case of errors we return `(nil, err)` instead of `nil`. It can help us to propagate a meaningful error in rebalancer service. In addition we throw a new vshard error for cases when the storage has some non-active buckets during rebalancing and for case when rebalancer is not active or in progress. Also we remove "Some buckets are not active" log from `rebalancer_service_f` because a meaningful error about non-active buckets is already returned from `rebalancer_download_states`. NO_TEST=refactoring NO_DOC=refactoring
Before this patch the function `rebalancer_download_states` didn't return information about replicaset from which the states could not be downloaded. As a result, the errors returned from `rebalancer_download_states` lack of valuable information about unhealthy replicaset. Now, we add replicaset.id into error which is returned from the `rebalancer_download_states`. It can help us to propagate replicaset.id to rebalancer service. Closes tarantool#212 NO_DOC=bugfix
b91d290 to
14d9c65
Compare
Gerold103
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀, lets also see if @Serpentian still has any comments unsolved.
Serpentian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work!
Before this patch "Finish bucket recovery step ..." logs were printed at
the end of recovery even if no buckets were successfully recovered, it led
to unnecessary log entries. This patch fixes the issue by adding an
additional check for the number of recovered buckets.
Closes #212
NO_DOC=bugfix