-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Support ref tags on arguments to ref params
#6312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ref tags on arguments to ref params
#6312
Conversation
zygoloid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How likely do you think it is that we retain the ref_tags map in the longer term? If we do, I think we should be thinking about ways to remove elements from it so it doesn't grow without bound, and also maybe thinking about avoiding adding all operands of operators to it -- currently I expect it'll get pretty large due to the number of operator invocations in a typical source file.
We could probably track whether we're in an "operand of an operator" context from call handling through into pattern matching and conversion in order to suppress the diagnostics for missing ref.
Co-authored-by: Richard Smith <[email protected]>
toolchain/check/convert.cpp
Outdated
| case ConversionTarget::Value: | ||
| return category == SemIR::ExprCategory::Value; | ||
| case ConversionTarget::ValueOrRef: | ||
| case ConversionTarget::RefParam: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's what I did in the initial version of #6283, but in order to do that I had to model var parameters as VarStorage insts rather than TemporaryStorage insts. That means the textual SemIR names them as if they were local variables (e.g. if a function has a parameter var y: i32, the caller SemIR gets an inst named %y.var) and lowering no longer appends temp to their names in the LLVM IR. @chandlerc raised that as a concern in the review, but I haven't figured out how to fix it without introducing a separate inst kind or plumbing the necessary context through both formatting and lowering, which both seem excessive given that this problem only affects the ergonomics of the IRs.
The other issue is that restricting this to durable refs prevents us from invoking a ref self method on an initializing expression. I think we should disallow that anyway, and I've proposed that in #5545, but that proposal hasn't gotten any leads feedback since I made that change, which makes me wary of investing too much engineering time in it.
| // source code, or because they appear in a position where that tag is not | ||
| // required, such as an operator operand (the RefTag value indicates which | ||
| // of those is the case). | ||
| Map<SemIR::InstId, RefTag> ref_tags_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(@zygoloid I'm replying to your top-level comment here, for threading purposes)
How likely do you think it is that we retain the
ref_tagsmap in the longer term? If we do, I think we should be thinking about ways to remove elements from it so it doesn't grow without bound, and also maybe thinking about avoiding adding all operands of operators to it -- currently I expect it'll get pretty large due to the number of operator invocations in a typical source file.We could probably track whether we're in an "operand of an operator" context from call handling through into pattern matching and conversion in order to suppress the diagnostics for missing
ref.
We'd have to plumb it through a lot of layers, but yes, that's an option. The main problem I see is that an operator usage and an explicit call to the corresponding interface method will emit identical SemIR, but the former cannot use a ref tag, whereas the latter might sometimes be required to use a ref tag (unless we're willing to categorically forbid ref parameters in the explicit parameter lists of operator methods, or say that explicit calls to operator methods don't need to (can't?) use ref tags).
So if ref_tags only tracks the places where ref tags are actually written in the code, as you suggest, we'd have to stop using the SemIR to validate ref_tags, as @chandlerc suggested, because it won't be possible to use the SemIR to reconstruct where the ref tags must have been.
I think this just brings us back to the question of whether we're modeling a syntactic property ("the user wrote a ref tag here", as you've been advocating for) or a semantic one ("this inst is used as the argument to a ref parameter", as @chandlerc has been advocating for), because the operator problem seems to rule out trying to finesse the difference. As you point out, the syntactic property would probably be substantially cheaper to represent, since it's much sparser in the code, which seems like evidence in favor of the syntactic approach, but I'm not sure I really understand the arguments in the other direction.
zygoloid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my remaining comments are largely about places in which this PR is preserving a pre-existing behavior rather than making a behavior change. (In particular, the code currently uses ConversionTarget::ValueOrRef for ref parameters.) So while I'm uncomfortable about introducing a new special-purpose conversion target that behaves in a way I find unexpected, I'd be OK with not addressing that concern in this PR if you prefer.
I would appreciate your thoughts on my other high-level comment -- it seems that this PR will build a map containing every inst that is an operand of an operator in the file, which sounds like quite a lot of data to be storing, if this is our long-term approach.
toolchain/check/convert.cpp
Outdated
| case ConversionTarget::Value: | ||
| return category == SemIR::ExprCategory::Value; | ||
| case ConversionTarget::ValueOrRef: | ||
| case ConversionTarget::RefParam: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following something here -- whether we allow ref x: T to bind to an ephemeral reference seems like it would be a design question with observable semantic effects, not only a question of how we represent things in SemIR. I thought we didn't currently intend RefParams to allow ephemeral references as arguments.
Also, if I'm understanding correctly, we don't use RefParam for var parameters any more after this PR, so I don't understand how the modeling of var parameters affects this.
If we do allow EphemeralRef here then allowing Initializing also seems logical, since we can materialize an Initializing expression into an EphemeralRef. But I don't understand why we'd allow Value here -- a value expression doesn't seem like it should ever be a valid argument for a reference parameter.
Co-authored-by: Richard Smith <[email protected]>
I responded to that here. |
zygoloid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I still have a lingering concern that the map here will get undesirably large, but I don't think that's a blocking concern; we can iterate on that later.
toolchain/check/convert.cpp
Outdated
| case ConversionTarget::Value: | ||
| return category == SemIR::ExprCategory::Value; | ||
| case ConversionTarget::ValueOrRef: | ||
| case ConversionTarget::RefParam: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following something here -- whether we allow
ref x: Tto bind to an ephemeral reference seems like it would be a design question with observable semantic effects, not only a question of how we represent things in SemIR. I thought we didn't currently intendRefParams to allow ephemeral references as arguments.I don't think that's right. See e.g. here: the currently-adopted design is that a
refbinding can bind to an ephemeral reference if it's aselfbinding. I'm proposing to change that in #5545 (see the "Support bindingref selfto ephemeral references" alternative-considered for the rationale), but I haven't gotten any indication of whether the leads support that aspect of the proposal.
OK, I see. It looks like we currently (before this PR) allow all parameter ref bindings to bind to ephemeral references, not only self bindings (example), but after this PR, the ref call-side tag will require a durable reference, which seems to effectively implement the rest of the rule described in the design. I've suggested a change to the comment on RefParam to explain how this works.
Co-authored-by: Richard Smith <[email protected]>
toolchain/check/convert.cpp
Outdated
| // Don't diagnose a non-reference scrutinee if it has a user-written | ||
| // `ref` tag, because that's diagnosed in `Convert`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're in Convert :)
| // Don't diagnose a non-reference scrutinee if it has a user-written | |
| // `ref` tag, because that's diagnosed in `Convert`. | |
| // Don't diagnose a non-reference scrutinee if it has a user-written | |
| // `ref` tag, because that's diagnosed in `CheckRefTag`. |
(Github stripped the whitespace off the suggested text again, so I predict applying this suggestion will go wrong again. The diff formatting is not consistently working for me either; if it breaks for you too, the change here is replacing Convert with CheckRefTag.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks!
The issue of whether/how to include
reftags in the textual and in-memory SemIR (see discussion here) is left as future work.