test: fix flaky router-luatest/router_2_2_test #625
This commit fixes the flakiness of the router_2_2_test. Most failures were caused by low timeouts. In order to fix this, timeouts are increased, and `t.helpers.retrying` is used in test_replicaset_services_do_not_spam_same_errors and test_log_ratelimiter_is_dropped_on_replica_delete to fix other failures.

Fixes tarantool#535

NO_DOC=testfix
| end, {g.replica_2_a:replicaset_uuid()}) | ||
| msg = "Error during master search" | ||
| g.router:wait_log_exactly_once(msg, {timeout = 0.1, on_yield = function() | ||
| g.router:wait_log_exactly_once(msg, {timeout = 0.5, on_yield = function() |
How does increasing the wait timeout make the test less flaky? wait_log_exactly_once waits for `timeout` seconds and keeps checking the log the whole time to make sure the same message does not appear a second time. With a greater timeout we only make the tests run longer; we do not make them less flaky.
The same comment applies to all the other places.
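To illustrate the point above, here is a minimal sketch in plain Lua. It is a simplified model of an "exactly once" log check, not the actual `wait_log_exactly_once` implementation: the names `count_occurrences`, `wait_exactly_once` and `events` and the tick-based clock are assumptions made for illustration only. In this model a longer observation window can only catch more duplicates; it cannot turn a failing check into a passing one.

```lua
local msg = "Error during master search"

-- Hypothetical helper: count plain-text occurrences of msg in a list of lines.
local function count_occurrences(log_lines, text)
    local count = 0
    for _, line in ipairs(log_lines) do
        if line:find(text, 1, true) then
            count = count + 1
        end
    end
    return count
end

-- Simplified stand-in for an "exactly once" wait. events[tick] is the line
-- appended to the log at that tick, if any. The check fails as soon as the
-- message has been seen more than once within the observed window.
local function wait_exactly_once(events, text, ticks)
    local log = {}
    for tick = 1, ticks do
        if events[tick] ~= nil then
            table.insert(log, events[tick])
        end
        if count_occurrences(log, text) > 1 then
            return false, tick
        end
    end
    return count_occurrences(log, text) == 1
end

-- The message is logged at tick 1 and duplicated at tick 3.
local events = {
    [1] = "E> " .. msg,
    [3] = "E> " .. msg,
}

-- A short window ends before the duplicate arrives, so the check passes.
local short_ok = wait_exactly_once(events, msg, 2)
-- A longer window observes the duplicate and fails at tick 3.
local long_ok, failed_tick = wait_exactly_once(events, msg, 5)
```

Under these assumptions `short_ok` is `true`, while `long_ok` is `false` with `failed_tick` equal to 3.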
| g.router:wait_log_exactly_once(msg, {timeout = 0.1, on_yield = function() | ||
| _G.failover_wakeup() | ||
| end}) | ||
| t.helpers.retrying({}, function() |
We cannot retry here: the whole point of wait_log_exactly_once is that log messages are not duplicated. And if it fails once, it will fail on every iteration, since the whole log since the restart of the instance is checked.
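A minimal sketch of why retrying does not help, assuming the check scans the whole log on every attempt. The `retry` helper below is a hypothetical stand-in written for this example, not `t.helpers.retrying` from luatest, and the log lines are made up.

```lua
local msg = "Error during master search"

-- Hypothetical helper: count plain-text occurrences of msg in a list of lines.
local function count_occurrences(log_lines, text)
    local count = 0
    for _, line in ipairs(log_lines) do
        if line:find(text, 1, true) then
            count = count + 1
        end
    end
    return count
end

-- Hypothetical retry loop: call fn up to `attempts` times until it stops
-- raising an error.
local function retry(attempts, fn)
    local last_err
    for _ = 1, attempts do
        local ok, err = pcall(fn)
        if ok then
            return true
        end
        last_err = err
    end
    return false, last_err
end

-- The log already contains the message twice. Log lines are never removed,
-- so every later scan of the whole log sees the same duplicate.
local log = {
    "E> " .. msg,
    "I> unrelated line",
    "E> " .. msg,
}

local calls = 0
local function check()
    calls = calls + 1
    if count_occurrences(log, msg) ~= 1 then
        error("message is duplicated in the log")
    end
end

local ok = retry(5, check)
```

In this model `ok` is `false` and `check` runs all 5 times: once the duplicate is in the log, every attempt fails in the same way.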
| _G.sleep_cond:wait() | ||
| end | ||
| end) | ||
| g.router:exec(function(bid, uuid) |
Please update issue #535 with the failures you encountered and are fixing in the current patch. Otherwise, it's impossible to understand what we're fixing here.