Skip to content

Conversation

@geoffromer
Copy link
Contributor

The issue of whether/how to include ref tags in the textual and in-memory SemIR (see discussion here) is left as future work.

@geoffromer geoffromer requested a review from a team as a code owner November 1, 2025 00:34
@geoffromer geoffromer requested review from zygoloid and removed request for a team November 1, 2025 00:34
Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How likely do you think it is that we retain the ref_tags map in the longer term? If we do, I think we should be thinking about ways to remove elements from it so it doesn't grow without bound, and also maybe thinking about avoiding adding all operands of operators to it -- currently I expect it'll get pretty large due to the number of operator invocations in a typical source file.

We could probably track whether we're in an "operand of an operator" context from call handling through into pattern matching and conversion in order to suppress the diagnostics for missing ref.

case ConversionTarget::Value:
return category == SemIR::ExprCategory::Value;
case ConversionTarget::ValueOrRef:
case ConversionTarget::RefParam:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I did in the initial version of #6283, but in order to do that I had to model var parameters as VarStorage insts rather than TemporaryStorage insts. That means the textual SemIR names them as if they were local variables (e.g. if a function has a parameter var y: i32, the caller SemIR gets an inst named %y.var) and lowering no longer appends temp to their names in the LLVM IR. @chandlerc raised that as a concern in the review, but I haven't figured out how to fix it without introducing a separate inst kind or plumbing the necessary context through both formatting and lowering, which both seem excessive given that this problem only affects the ergonomics of the IRs.

The other issue is that restricting this to durable refs prevents us from invoking a ref self method on an initializing expression. I think we should disallow that anyway, and I've proposed that in #5545, but that proposal hasn't gotten any leads feedback since I made that change, which makes me wary of investing too much engineering time in it.

// source code, or because they appear in a position where that tag is not
// required, such as an operator operand (the RefTag value indicates which
// of those is the case).
Map<SemIR::InstId, RefTag> ref_tags_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@zygoloid I'm replying to your top-level comment here, for threading purposes)

How likely do you think it is that we retain the ref_tags map in the longer term? If we do, I think we should be thinking about ways to remove elements from it so it doesn't grow without bound, and also maybe thinking about avoiding adding all operands of operators to it -- currently I expect it'll get pretty large due to the number of operator invocations in a typical source file.

We could probably track whether we're in an "operand of an operator" context from call handling through into pattern matching and conversion in order to suppress the diagnostics for missing ref.

We'd have to plumb it through a lot of layers, but yes, that's an option. The main problem I see is that an operator usage and an explicit call to the corresponding interface method will emit identical SemIR, but the former cannot use a ref tag, whereas the latter might sometimes be required to use a ref tag (unless we're willing to categorically forbid ref parameters in the explicit parameter lists of operator methods, or say that explicit calls to operator methods don't need to (can't?) use ref tags).

So if ref_tags only tracks the places where ref tags are actually written in the code, as you suggest, we'd have to stop using the SemIR to validate ref_tags, as @chandlerc suggested, because it won't be possible to use the SemIR to reconstruct where the ref tags must have been.

I think this just brings us back to the question of whether we're modeling a syntactic property ("the user wrote a ref tag here", as you've been advocating for) or a semantic one ("this inst is used as the argument to a ref parameter", as @chandlerc has been advocating for), because the operator problem seems to rule out trying to finesse the difference. As you point out, the syntactic property would probably be substantially cheaper to represent, since it's much sparser in the code, which seems like evidence in favor of the syntactic approach, but I'm not sure I really understand the arguments in the other direction.

Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my remaining comments are largely about places in which this PR is preserving a pre-existing behavior rather than making a behavior change. (In particular, the code currently uses ConversionTarget::ValueOrRef for ref parameters.) So while I'm uncomfortable about introducing a new special-purpose conversion target that behaves in a way I find unexpected, I'd be OK with not addressing that concern in this PR if you prefer.

I would appreciate your thoughts on my other high-level comment -- it seems that this PR will build a map containing every inst that is an operand of an operator in the file, which sounds like quite a lot of data to be storing, if this is our long-term approach.

case ConversionTarget::Value:
return category == SemIR::ExprCategory::Value;
case ConversionTarget::ValueOrRef:
case ConversionTarget::RefParam:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following something here -- whether we allow ref x: T to bind to an ephemeral reference seems like it would be a design question with observable semantic effects, not only a question of how we represent things in SemIR. I thought we didn't currently intend RefParams to allow ephemeral references as arguments.

Also, if I'm understanding correctly, we don't use RefParam for var parameters any more after this PR, so I don't understand how the modeling of var parameters affects this.

If we do allow EphemeralRef here then allowing Initializing also seems logical, since we can materialize an Initializing expression into an EphemeralRef. But I don't understand why we'd allow Value here -- a value expression doesn't seem like it should ever be a valid argument for a reference parameter.

@geoffromer
Copy link
Contributor Author

I would appreciate your thoughts on my other high-level comment -- it seems that this PR will build a map containing every inst that is an operand of an operator in the file, which sounds like quite a lot of data to be storing, if this is our long-term approach.

I responded to that here.

@geoffromer geoffromer requested a review from zygoloid November 5, 2025 22:04
Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I still have a lingering concern that the map here will get undesirably large, but I don't think that's a blocking concern; we can iterate on that later.

case ConversionTarget::Value:
return category == SemIR::ExprCategory::Value;
case ConversionTarget::ValueOrRef:
case ConversionTarget::RefParam:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following something here -- whether we allow ref x: T to bind to an ephemeral reference seems like it would be a design question with observable semantic effects, not only a question of how we represent things in SemIR. I thought we didn't currently intend RefParams to allow ephemeral references as arguments.

I don't think that's right. See e.g. here: the currently-adopted design is that a ref binding can bind to an ephemeral reference if it's a self binding. I'm proposing to change that in #5545 (see the "Support binding ref self to ephemeral references" alternative-considered for the rationale), but I haven't gotten any indication of whether the leads support that aspect of the proposal.

OK, I see. It looks like we currently (before this PR) allow all parameter ref bindings to bind to ephemeral references, not only self bindings (example), but after this PR, the ref call-side tag will require a durable reference, which seems to effectively implement the rest of the rule described in the design. I've suggested a change to the comment on RefParam to explain how this works.

@geoffromer geoffromer requested a review from zygoloid November 11, 2025 00:51
Comment on lines 1635 to 1636
// Don't diagnose a non-reference scrutinee if it has a user-written
// `ref` tag, because that's diagnosed in `Convert`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're in Convert :)

Suggested change
// Don't diagnose a non-reference scrutinee if it has a user-written
// `ref` tag, because that's diagnosed in `Convert`.
// Don't diagnose a non-reference scrutinee if it has a user-written
// `ref` tag, because that's diagnosed in `CheckRefTag`.

(Github stripped the whitespace off the suggested text again, so I predict applying this suggestion will go wrong again. The diff formatting is not consistently working for me either; if it breaks for you too, the change here is replacing Convert with CheckRefTag.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks!

@geoffromer geoffromer enabled auto-merge November 11, 2025 19:56
@geoffromer geoffromer added this pull request to the merge queue Nov 11, 2025
Merged via the queue into carbon-language:trunk with commit 43ffd72 Nov 11, 2025
8 checks passed
@geoffromer geoffromer deleted the ref-tag-hash-table branch November 11, 2025 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants