Currently, when using an LLMEnsemble, we do not know which LLM generated which program.
It would be nice to record the generating model in each program's metadata so that the performance of the individual LLMs can be compared.
NB: The model selected at sampling time does appear in the log, but because iterations run in parallel, a log entry cannot be matched to an iteration number.
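
A minimal sketch of the idea, not the project's actual code: the ensemble returns the name of the sampled model together with its output, so each iteration can store it in the program's metadata instead of relying on the log. Everything here is hypothetical and for illustration only: `TrackedEnsemble`, `Program`, `run_iteration`, and the `llm_model` metadata key are assumed names, and the "models" are plain callables standing in for real LLM clients.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Program:
    """Hypothetical program record with a free-form metadata dict."""
    code: str
    metadata: Dict[str, object] = field(default_factory=dict)


class TrackedEnsemble:
    """Hypothetical ensemble: samples a model by weight and reports which one it used."""

    def __init__(self, models: Dict[str, Callable[[str], str]],
                 weights: List[float], seed: Optional[int] = None):
        self.models = models
        self.names = list(models)
        self.weights = weights
        self.rng = random.Random(seed)

    def generate(self, prompt: str) -> Tuple[str, str]:
        # Weighted sampling of one model name, then a call to that model.
        name = self.rng.choices(self.names, weights=self.weights, k=1)[0]
        return self.models[name](prompt), name


def run_iteration(ensemble: TrackedEnsemble, prompt: str, iteration: int) -> Program:
    # The model name travels with the result, so it stays tied to the
    # iteration number even when iterations run in parallel.
    code, model_name = ensemble.generate(prompt)
    return Program(code=code,
                   metadata={"iteration": iteration, "llm_model": model_name})


# Stand-in "models": each one just tags its output with its own name.
fake_models = {
    name: (lambda prompt, n=name: f"# generated by {n}\n")
    for name in ("model-a", "model-b")
}
ensemble = TrackedEnsemble(fake_models, weights=[0.7, 0.3], seed=0)

# Run several iterations in parallel; each result carries its own model name.
with ThreadPoolExecutor(max_workers=4) as pool:
    programs = list(pool.map(lambda i: run_iteration(ensemble, "improve this", i),
                             range(8)))

for p in programs:
    print(p.metadata["iteration"], p.metadata["llm_model"])
```

With the model name stored per program, comparing LLMs becomes a matter of grouping programs by `metadata["llm_model"]` and aggregating their scores.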