[move compiler] [CSE Step 2] common subexpression elimination #17989
Open: junxzm1990 wants to merge 1 commit into jun/reach-def from jun/cse-opt
+7,295
−447
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking.

Description
This PR implements the "common subexpression elimination" (CSE) transformation. This is the second PR on the stack to introduce CSE optimizations to Move.
Motivating Example:
At the stackless bytecode level, `data.x` is translated into a sequence of `BorrowLoc` + `BorrowField` + `ReadRef` instructions.
Without CSE, every occurrence of `data.x` (line 2, line 3, line 5) is translated into the sequence above, even though `data.x` at line 3 and line 5 yields the same result as at line 2, so the repeated computations are unnecessary.
CSE aims to eliminate such redundant computations by reusing the result of previous computations.
Specifically, in the example above, assuming the `BorrowLoc` + `BorrowField` + `ReadRef` sequence at line 2 is assigned to temp `t1`, then the occurrences at line 3 and line 5 can both be replaced by `t1`, eliminating the redundant computations.
The optimized bytecode would look like:
============================ Implementation Details ============================
Step 1: Build the Control Flow Graph (CFG) and the dominator tree of the target function.
Step 2: Traverse the dominator tree in preorder, and for each instruction in each basic block:
- Build an `ExprKey` structure
  - `ExprKey` contains the operation and its arguments, represented as `ExpArg`
  - `ExpArg` can be a constant, a variable (temp), or another `ExprKey`, so that expressions nest recursively
  - For example, for `ReadRef(BorrowField(BorrowLoc(x)))`, we want to represent it as a single expression rather than three separate ones, so that we can eliminate the entire sequence at once.
- We nest `t1 = Op1(t0); t2 = Op2(t1);` as `Op2(Op1(t0))` only if:
  - `Op1` is the only definition of `t1` that can reach the instruction of `Op2`
  - `t1` is used only once, and exactly by `Op2`, hence not missing opportunities for replacement
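The nesting rule in Step 2 can be illustrated with a small Python sketch. This is not the PR's Rust implementation: the `Inst` type and the `build_keys` helper are hypothetical, and "only reaching definition" is approximated by "defined exactly once in a straight-line block", which is a much weaker analysis than a real reaching-definitions pass.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ExprKey:
    """An operation plus its arguments; hashable so it can key a lookup table."""
    op: str
    args: Tuple["ExpArg", ...]


# An argument is a constant (int), a temp (str), or a nested expression.
ExpArg = Union[int, str, ExprKey]


@dataclass(frozen=True)
class Inst:
    """Hypothetical instruction shape: dest = op(args...)."""
    dest: str
    op: str
    args: Tuple[Union[int, str], ...]


def build_keys(block):
    """Return {dest temp: ExprKey} for a straight-line list of instructions."""
    def_count = Counter(inst.dest for inst in block)
    use_count = Counter(a for inst in block for a in inst.args if isinstance(a, str))
    defs = {}  # temp -> ExprKey of the instruction that defined it
    keys = {}
    for inst in block:
        args = []
        for a in inst.args:
            nestable = (
                isinstance(a, str)
                and a in defs
                and def_count[a] == 1  # stand-in for "only reaching definition"
                and use_count[a] == 1  # the temp is used once, by this instruction
            )
            args.append(defs[a] if nestable else a)
        key = ExprKey(inst.op, tuple(args))
        defs[inst.dest] = key
        keys[inst.dest] = key
    return keys


# data.x lowered to three instructions, as in the motivating example.
field_read = [
    Inst("t1", "BorrowLoc", ("data",)),
    Inst("t2", "BorrowField", ("t1",)),
    Inst("t3", "ReadRef", ("t2",)),
]
keys = build_keys(field_read)

# t1 is used twice here, so it must stay a plain temp in both keys.
shared = [
    Inst("t1", "Op1", ("t0",)),
    Inst("t2", "Op2", ("t1",)),
    Inst("t3", "Op3", ("t1",)),
]
shared_keys = build_keys(shared)
```

In the first block the key for `t3` is the fully nested `ReadRef(BorrowField(BorrowLoc(data)))`, so a later identical sequence would match as a whole. In the second block `t1` has two uses, so it is left un-nested and `Op1(t0)` stays visible as its own candidate.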
Step 3: Check if the `ExprKey` from Step 2 has been seen before in a dominating block.
Given a seen-before `ExprKey` (annotated as `src_expr`) for the current expression (annotated as `dest_expr`), and assuming the two expressions have the following formats:
- `src_expr`: `(src_temp1, src_temp2, ...) = src_op(src_ope1, src_ope2, ...)` defined at `src_inst`, where `src_ope1` and `src_ope2` can be nested expressions.
- `dest_expr`: `(dest_temp1, dest_temp2, ...) = dest_op(dest_ope1, dest_ope2, ...)` defined at `dest_inst`, where `dest_ope1` and `dest_ope2` can be nested expressions.

we take a set of conservative conditions to check safety of the replacement:

Condition 1: `src_expr` dominates `dest_expr`
- `src_expr` is always executed before `dest_expr`

Condition 2: type safety
- `src_temps` and `dest_temps` share the same types
- […] `src_temp` to `dest_temp`
- `src_temp` is not mutably borrowed
- […] `src_temp` is mutably borrowed

Condition 3: `src_temps` are copyable
- Copying `src_temps` to `dest_temps` does not violate ability constraints

Condition 4: `src_temps` at `src_expr` are the only definitions of `src_temps` that can reach `dest_expr`:
- […] `dest_temps`

Condition 5: Resources used in `src_expr` are not changed at `dest_expr`:
- […] `BorrowGlobal` and `Exists` operations are safe to reuse at `dest_expr`
- […] `BorrowGlobal` and `Exists` are involved in `src_expr` and `dest_expr`

Condition 6: Operands used in `src_expr` are safe to reuse at `dest_expr`:
- Operands used in `src_expr` are identical to those used in `dest_expr`
- […] `src_expr` are possibly re-defined in a path between `src_expr` and `dest_expr` (without going through `src_expr` again)
- […] `src_inst` remain unchanged when reaching `dest_inst`
- […] `src_expr` are mutably borrowed elsewhere

Condition 7: The replacement will bring performance gains! See comments above `gain_perf` for details

Step 4: For each `src_expr` passing the conditions to replace `dest_expr` in Step 3, we gather the information needed to perform the replacement, like below:
Example:
==>
Step 5: After processing all basic blocks, we perform the recorded replacements and eliminate the marked code.
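The overall record-then-rewrite flow of Steps 2 to 5 can be sketched in Python as below. This is a deliberately simplified illustration, not the PR's implementation: it assumes every temp is defined once, every operation is pure, and the dominator tree is supplied by the caller, so it only models Condition 1 and none of Conditions 2 to 7. The tuple instruction format, the `cse` function, and the use of an `Assign` copy as the replacement are all assumptions made for the example.

```python
def cse(blocks, dom_children, entry):
    """Dominator-tree CSE over pure, single-definition instructions.

    blocks:       {label: [(dest, op, args), ...]}
    dom_children: {label: [labels immediately dominated by it]}
    """
    replacements = {}  # (label, index) -> temp whose value can be reused

    def visit(label, available):
        # `available` maps an expression key to the temp that holds its value
        # at a dominating point. Copy it so that sibling subtrees of the
        # dominator tree never see each other's expressions.
        available = dict(available)
        for i, (dest, op, args) in enumerate(blocks[label]):
            key = (op, args)
            if key in available:
                replacements[(label, i)] = available[key]  # record only
            else:
                available[key] = dest
        for child in dom_children.get(label, []):
            visit(child, available)

    visit(entry, {})

    # Apply the recorded replacements after all blocks have been processed.
    optimized = {}
    for label, insts in blocks.items():
        out = []
        for i, inst in enumerate(insts):
            if (label, i) in replacements:
                out.append((inst[0], "Assign", (replacements[(label, i)],)))
            else:
                out.append(inst)
        optimized[label] = out
    return optimized


# Diamond CFG: B0 branches to B1 and B2, which both join at B3.
# B0 dominates all three; neither B1 nor B2 dominates B3.
blocks = {
    "B0": [("t1", "Add", ("a", "b"))],
    "B1": [("t2", "Add", ("a", "b"))],
    "B2": [("t3", "Mul", ("a", "b"))],
    "B3": [("t4", "Mul", ("a", "b")), ("t5", "Add", ("a", "b"))],
}
dom_children = {"B0": ["B1", "B2", "B3"]}
optimized = cse(blocks, dom_children, "B0")
```

Here `Add(a, b)` in `B1` and `B3` is rewritten to reuse `t1` from the dominating block `B0`, while `Mul(a, b)` in `B3` is left alone because its earlier occurrence sits in `B2`, which does not dominate `B3`.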
============================ Extensions ============================
In principle, the algorithm above is designed to handle PURE instructions, defined as below:
- […] memory (including writes via references), control flow (including `abort`), or external state (global storage)

Yet, we found that some non-pure instructions can be safely handled under certain conditions.
Group 1: operations that are pure if no arithmetic errors like overflows happen (`+`, `-`, `*`, `/`, `%`, etc):
- […] `aggressive` mode
- […] `src_inst`

Group 2: operations that are pure if no type errors happen (`UnpackVariant`):
- […] `aggressive` mode
- […] `src_inst`

Group 3: `BorrowLoc`, `BorrowField`, `BorrowVariantField`
- […] `src_inst` and `dst_inst`, we can treat them as pure.
- By `operands`, we mean the most deeply nested operands, e.g., in `BorrowField(BorrowLoc(x))`, `x` is the operand for `BorrowField`.

Group 4: `Assign`
- […] `Copy` or `Inferred`

Group 5: `readref`
- `readref` is not pure as it depends on memory states.
- […] `src_inst` and `dst_inst`, we can treat them as pure.
- By `operands`, we mean the most deeply nested operands, e.g., in `ReadRef(BorrowField(BorrowLoc(x)))`, `x` is the operand for `ReadRef`.

Group 6: `Function` calls

Group 7: `BorrowGlobal` and `Exists`
- […] `src_inst` and `dst_inst`

All TODO items are marked with `TODO(#18203)`.

How Has This Been Tested?
Expected Result Changes
Type of Change
Which Components or Systems Does This Change Impact?