Heuristically choose and preserve rdx across mulx#210
Heuristically choose and preserve rdx across mulx#210andres-erbsen wants to merge 4 commits into0xADE1A1DE:devfrom
Conversation
| byReg[r64] = varname; | ||
| } | ||
| if (matchArg(varname)) { | ||
| if (matchArg(varname) && isMem(store)) { |
There was a problem hiding this comment.
I do not understand why this is necessary. Are we checking if a value is in an register and a memory location at the same time?
There was a problem hiding this comment.
I do not understand why the entire conditional is necessary, but without the && the assertion failed with all my other changes. I believe the old conditional asks whether the variable is an input-array cell, and the new conditional asks whether it is stored in memory. The difference would be an input variable that is cached in a register, for example rdx.
| ((allocationReq.allocationFlags ?? AllocationFlags.NONE) & flagToCheck) === flagToCheck; | ||
| const inAllocationsTemp = allocationReq.in.map((readVariable) => { | ||
| const currentLocation = this._allocations[readVariable]; | ||
| if (currentLocation && isRegister(currentLocation.store)) return currentLocation.store; |
There was a problem hiding this comment.
why do we need this condition up here?
I see that for the most part, the remaining parts are checking if it is not a register and then do something. But I am a bit unsure if there is really no side-effects.
There was a problem hiding this comment.
If arg1[3] is also stored in a register, reading it from the register is more flexible than loading it from memory again. Without this new short-circuit, the next conditional would take the memory path.
There was a problem hiding this comment.
So instead of
mov rdx, [rsi]
mulx r8 , r9, [rsi]it will now emit
mov rdx, [rsi]
mulx r8 , r9, rdx?
There was a problem hiding this comment.
Then I'd like to write a test case for that.
There was a problem hiding this comment.
No wait. This would be a square operation, which is handled differently anyway.
This emits
mov rax, [rsi]
...
mulx r8 , r9, raxinstead of
mov rax, [rsi]
...
mulx r8 , r9, [rsi]| } | ||
|
|
||
| if (caf(AllocationFlags.ONE_IN_MUST_BE_IN_RDX) && !inAllocations.includes(Register.rdx)) { | ||
| if (caf(AllocationFlags.ONE_IN_MUST_BE_IN_RDX) && !allocationReq.in.some((i) => this._allocations[i].store === Register.rdx)) { |
There was a problem hiding this comment.
Why do we checkthis._allocations[i] instead of inAllocations?
There was a problem hiding this comment.
I do not remember for sure, but I think it was a difference between [rax+4] and arg1[1]. You're right to question this code as I was working from examples, not deep understanding.
There was a problem hiding this comment.
Thing is, I could see this working but also failing. inAllocations is being used and modified in this function, whereas _allocations is modified by the getW() function if I remember correctly. I'd want to not touch anything if we don't know if its necessary. This whole RegisterAllocator needs a rewrite, but I've never had the time.
|
is that little orange block other the brown block (y=50, x=22000) the reason why we think this heuristic is better? |
|
I am not sure we want the heuristic. What do you think would be a good test? If you'd be more comfortable landing this change minus the heuristic, I'd be enthusiastic for that as well. |
|
Hm. So lets summarise. The heuristic is only impacting Additionally, the effect of preferring loading |
Closes #199
Experiment with p256_mul, this patch versus an alternative that always uses
chooseArg.