Passing both reference and value to job #870
Open
gpetretto wants to merge 1 commit into materialsproject:main from
This PR introduces the option to pass both the `OutputReference` and the corresponding value as job inputs when an input comes from the output of another job. It is activated by setting `resolve_references="both"` in `JobConfig`.
One example use case: imagine multiple jobs that each produce an output, followed by a final job that compares the outputs and selects one or more of them. Currently the only option is to pass the values and store the selected value(s) as the output of the final job. This has two downsides: 1) the data in the DB is duplicated, and 2) the reference to the job that executed the calculation is lost.
With the "both" option, the final job can instead make the comparison on the values and return the reference of the selected job as its output.
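
To make the use case concrete, here is a minimal pure-Python sketch of the idea. It does not use jobflow: `Ref`, `RefWithValue`, `pick_lowest` and the `store` dictionary are hypothetical stand-ins chosen for illustration, and the attribute names are assumptions, not the API added by this PR.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a reference to another job's output."""
    job_uuid: str


@dataclass(frozen=True)
class RefWithValue:
    """Hypothetical stand-in for a reference passed together with its resolved value."""
    ref: Ref
    value: Any


def pick_lowest(candidates: list[RefWithValue]) -> Ref:
    """Compare the resolved values, but return only the winning reference."""
    best = min(candidates, key=lambda c: c.value)
    return best.ref


# Toy "database" of outputs produced by three upstream jobs.
store = {"job-a": -3.2, "job-b": -4.7, "job-c": -4.1}

# What the final job would receive if each input carried both pieces.
inputs = [RefWithValue(Ref(uuid), value) for uuid, value in store.items()]

selected = pick_lowest(inputs)
print(selected)  # Ref(job_uuid='job-b')
```

Because the final job returns a reference and not the value, nothing is copied in the store and the link to the job that produced the selected output is preserved.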
A few notes on the implementation:
- `ResolvedReference` does not explicitly contain the `OutputReference`, because otherwise it gets re-resolved at runtime in jobflow-remote.
- With `resolve_references="both"` the `ResolvedReference` is always used, even with `OnMissing.NONE` or `OnMissing.PASS`.
- If `resolve_references="both"` is used in combination with `OnMissing.PASS`, it can still cause problems for jobflow-remote. However, I have discovered that `OnMissing.PASS` is actually not supported in jobflow-remote (I will address this). Is `OnMissing.PASS` used somewhere?
- An alternative would have been for `resolve_references` to be a list of the variable names that need to be resolved. The previous example could then be solved by passing both the list of values and the list of references. While this could have given more flexibility, the implementation is trickier and I believe less reliable (e.g. when the inputs are `*args` and `**kwargs`). It would also require passing the same reference twice in the example above, which does not seem very intuitive.
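
The per-argument alternative from the last note can be sketched in the same toy style, to show why the same reference would have to be passed twice. Again this is only an illustration under stated assumptions: `Ref`, `resolve_selected` and `pick_lowest` are hypothetical names, and the resolver is a simplified model handling keyword arguments only, not jobflow's resolution logic.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a reference to another job's output."""
    job_uuid: str


def resolve_selected(
    kwargs: dict[str, Any], store: dict[str, Any], names: list[str]
) -> dict[str, Any]:
    """Resolve references only for the keyword arguments listed in `names`."""

    def resolve(obj: Any) -> Any:
        if isinstance(obj, Ref):
            return store[obj.job_uuid]
        if isinstance(obj, list):
            return [resolve(item) for item in obj]
        return obj

    return {k: resolve(v) if k in names else v for k, v in kwargs.items()}


def pick_lowest(values: list[float], refs: list[Ref]) -> Ref:
    """Compare the values and return the reference at the same position."""
    index = min(range(len(values)), key=values.__getitem__)
    return refs[index]


store = {"job-a": -3.2, "job-b": -4.7, "job-c": -4.1}
all_refs = [Ref(uuid) for uuid in store]

# The same references are supplied twice: once to be resolved, once to be kept.
call_kwargs = resolve_selected(
    {"values": all_refs, "refs": all_refs}, store, names=["values"]
)
chosen = pick_lowest(**call_kwargs)
print(chosen)  # Ref(job_uuid='job-b')
```

The caller has to keep the two lists aligned by position, and a name-based selection like `names=["values"]` has no obvious counterpart for inputs collected through `*args`.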