-
Notifications
You must be signed in to change notification settings - Fork 65
apollo_consensus: SM now consumes events and emits requests; update SHC and tests #10339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
apollo_consensus: SM now consumes events and emits requests; update SHC and tests #10339
Conversation
87290c0 to
db77f88
Compare
ea47287 to
2705c59
Compare
db77f88 to
4bb3c97
Compare
2705c59 to
0de6018
Compare
0de6018 to
a50bb75
Compare
47d26a6 to
869a963
Compare
a50bb75 to
030b969
Compare
869a963 to
f153d27
Compare
d0d314d to
358cb47
Compare
f153d27 to
03e5d13
Compare
03e5d13 to
98c9447
Compare
358cb47 to
83d5968
Compare
98c9447 to
887093f
Compare
2ff8b27 to
0de1088
Compare
acffcd6 to
4c6e2c4
Compare
0de1088 to
a32cd90
Compare
5ec5c37 to
416dd69
Compare
asmaastarkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 2 of 4 files reviewed, 20 unresolved discussions (waiting on @dafnamatsry and @matanl-starkware)
crates/apollo_consensus/src/single_height_consensus.rs line 241 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Please add TODO to reduce the amount of duplicated code between the Prevote and Precommit arms
in this PR I tried not to break the old code (easy to review)
Donr in the upcoming PRs
crates/apollo_consensus/src/single_height_consensus.rs line 272 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Please keep the debug log. It's essential to understand what's happening with Tendermint...
Done.
crates/apollo_consensus/src/single_height_consensus.rs line 410 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
let timeout = match vote.vote_type {
}
let task = ShcTask::Rebroadcast {timeout, vote.clone()};
in this PR I tried not to break the old code (easy to review)
but its done in #10341
crates/apollo_consensus/src/single_height_consensus.rs line 445 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
This TODO was lost...
Done.
crates/apollo_consensus/src/single_height_consensus.rs line 320 at r2 (raw file):
Previously, dafnamatsry wrote…
IIUC Prevote and Precommit only represent peer votes and are not like the other events that indicates a completed task.
Consider completely removing these event types, and simply call state machine handle_prevote and handle_precommit directly from handle_vote.
added TODO
crates/apollo_consensus/src/single_height_consensus.rs line 474 at r2 (raw file):
Previously, dafnamatsry wrote…
What is this for?
will be used next
crates/apollo_consensus/src/single_height_consensus.rs line 477 at r2 (raw file):
Previously, dafnamatsry wrote…
And this?
will be used next
crates/apollo_consensus/src/state_machine.rs line 37 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
What does this parameter mean?
valid_round
crates/apollo_consensus/src/state_machine.rs line 38 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Now Events are only
tothe state machine IIUC
Done.
crates/apollo_consensus/src/state_machine.rs line 56 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
We usually don't use struct-style definition for enums.
Why did you decide to do it here?
Anyhow, I prefer not mixing (some struct-style, some tuple-style)
Done.
crates/apollo_consensus/src/state_machine.rs line 58 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
How come it's dead code?
It will be used when the SHC asks the manager to start validation—currently, the SHC does this itself
crates/apollo_consensus/src/state_machine.rs line 211 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
if self.is_observer {
return output;
}rest of flow here...
Done.
crates/apollo_consensus/src/state_machine.rs line 238 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Eventually, you'll return a single request from this function.
I think it's OK to add a comment explaining why you chose to do so (instead of simply return Option)
Done.
crates/apollo_consensus/src/state_machine.rs line 313 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
What about the
if self.is_observercheck that was under previous arms?
Done.
crates/apollo_consensus/src/state_machine.rs line 490 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
I thought that for case of Repropose you already saved the Init somewhere
We don't save the init, we save the round and the proposal_id
(it's a new round)
crates/apollo_consensus/src/state_machine.rs line 39 at r2 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Still sends or only receives now?
Done.
crates/apollo_consensus/src/state_machine.rs line 50 at r2 (raw file):
Previously, dafnamatsry wrote…
These are only for incoming votes no? If so, fix the documentation.
Done.
crates/apollo_consensus/src/state_machine.rs line 56 at r2 (raw file):
Previously, dafnamatsry wrote…
These are now mean the timeout is completed right?
Maybe call themTimeoutExpiredorTimeoutCompleted? and also fix the documentaion.
yes.. if it's not clear that the timeout expired, I will rename it in another PR
crates/apollo_consensus/src/state_machine.rs line 231 at r2 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
log vote_type as well
Done.
crates/apollo_consensus/src/state_machine.rs line 505 at r2 (raw file):
Previously, dafnamatsry wrote…
I think the
ProposalInitcan be constructed by whom will be performing the repropose (the manager?).
TheSMRquestcan include the necessary information (proposal_id, round and valid_round)
Currently, the SHC is the one who reproposes..
It needs to get the data (H, R, Id) from the SM, that's why I prefer to prepare it here
a32cd90 to
7e72528
Compare
416dd69 to
3d2c110
Compare
3d2c110 to
528df95
Compare
asmaastarkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 4 files reviewed, 20 unresolved discussions (waiting on @dafnamatsry and @matanl-starkware)
crates/apollo_consensus/src/single_height_consensus.rs line 474 at r2 (raw file):
Previously, asmaastarkware (asmaa-starkware) wrote…
will be used next
deleted as discussed in huddle
crates/apollo_consensus/src/state_machine.rs line 58 at r2 (raw file):
Previously, dafnamatsry wrote…
What does this event mean? It sounds more like a task.
In this PR, I left the name as "task" because the SHC is still handling all the logic.
In #10341, once the manager became responsible for broadcasting and the doing the hard work, it was renamed to VoteBroadcasted
matanl-starkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matanl-starkware reviewed 3 of 3 files at r4, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @asmaastarkware and @dafnamatsry)
crates/apollo_consensus/src/state_machine.rs line 313 at r1 (raw file):
Previously, asmaastarkware (asmaa-starkware) wrote…
Done.
Do you need to check under the ScheduleTimeout* that we are not observer?
Previously:
if self.is_observer {
continue;
}
crates/apollo_consensus/src/state_machine.rs line 72 at r4 (raw file):
BroadcastVote(Vote), /// Request to schedule a TimeoutPropose. ScheduleTimeoutPropose(Round),
Consider merging all Timeout requests into a single enum:
ScheduleTimeout(Step, Round)
Code quote:
ScheduleTimeoutPropose
asmaastarkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @dafnamatsry and @matanl-starkware)
crates/apollo_consensus/src/state_machine.rs line 313 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
Do you need to check under the
ScheduleTimeout*that we are not observer?
Previously:
if self.is_observer {
continue;
}
was it?
observer should act normally, without voting/proposing
528df95 to
5acfb94
Compare
matanl-starkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @asmaastarkware and @dafnamatsry)
crates/apollo_consensus/src/state_machine.rs line 313 at r1 (raw file):
Previously, asmaastarkware (asmaa-starkware) wrote…
was it?
observer should act normally, without voting/proposing
IDK
In original code, we had these arms:
StateMachineEvent::Proposal(_, , )
| StateMachineEvent::Prevote()
| StateMachineEvent::Precommit() => {
if self.is_observer {
continue;
}
Is it all handled the same way now?
asmaastarkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @dafnamatsry and @matanl-starkware)
crates/apollo_consensus/src/state_machine.rs line 313 at r1 (raw file):
Previously, matanl-starkware (Matan Lior) wrote…
IDK
In original code, we had these arms:
StateMachineEvent::Proposal(_, , )
| StateMachineEvent::Prevote()
| StateMachineEvent::Precommit() => {
if self.is_observer {
continue;
}Is it all handled the same way now?
Yes, make_self_vote and advance_to_round prevent voting/proposing when the node is observer
matanl-starkware
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @dafnamatsry)
crates/apollo_consensus/src/state_machine.rs line 525 at r4 (raw file):
} fn advance_to_step(&mut self, step: Step) -> VecDeque<SMRequest> {
Please define as a type (in future PR)
Code quote:
VecDeque<SMRequest>
dafnamatsry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dafnamatsry reviewed 1 of 3 files at r2, 3 of 3 files at r4, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @asmaastarkware)
crates/apollo_consensus/src/state_machine_test.rs line 449 at r4 (raw file):
); assert_eq!(wrapper.next_request().unwrap(), SMRequest::ScheduleTimeoutPrevote(ROUND)); // Nil Prevote quorum, so we broadcast a nil Precommit.
Why didn't we have the vote broadcasting before?
crates/apollo_consensus/src/state_machine.rs line 51 at r4 (raw file):
/// Precommit message, sent from the SHC to the state machine. Precommit(Vote), /// TimeoutPropose event, sent from the state machine to the SHC.
And also below
Suggestion:
from the SHC to the state machine
No description provided.