Skip to content

Conversation

@mkouba
Copy link
Contributor

@mkouba mkouba commented Nov 28, 2025

  • optimize the case where the item/failure is known and subscription/synchronization is not needed

Background

In Quarkus, we often design non-blocking APIs with Mutiny. We also provide a blocking variant of the API that usually defaults to Uni.await().indefinitely(). However, UniBlockingAwait is fairly expensive in terms of performance.

It's not unusual that some parts of the API return Uni.createFrom().item(); i.e. if the default implementation is blocking.

In this pull request, we try to optimize this path and bypass the subscription/synchronization unless really needed.

I've created a simple benchmark for quarkus-flags: https://github.com/mkouba/quarkus-flags-benchmarks and the results are really good.

@codecov
Copy link

codecov bot commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.16%. Comparing base (95dfed7) to head (215e727).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
.../main/java/io/smallrye/mutiny/groups/UniAwait.java 94.44% 1 Missing ⚠️
...mallrye/mutiny/operators/uni/UniBlockingAwait.java 50.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##               main    #2019      +/-   ##
============================================
- Coverage     89.18%   89.16%   -0.03%     
- Complexity     3112     3122      +10     
============================================
  Files           412      412              
  Lines         13274    13295      +21     
  Branches       1684     1688       +4     
============================================
+ Hits          11838    11854      +16     
  Misses          812      812              
- Partials        624      629       +5     
Files with missing lines Coverage Δ
...rators/uni/builders/UniCreateFromKnownFailure.java 100.00% <100.00%> (ø)
...operators/uni/builders/UniCreateFromKnownItem.java 100.00% <100.00%> (ø)
.../main/java/io/smallrye/mutiny/groups/UniAwait.java 96.00% <94.44%> (-4.00%) ⬇️
...mallrye/mutiny/operators/uni/UniBlockingAwait.java 86.04% <50.00%> (-9.20%) ⬇️

... and 10 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jponge
Copy link
Member

jponge commented Nov 28, 2025

Hi @mkouba, and thanks for looking into this.

Do you have any benchmark / test result to share here?

@jponge jponge self-assigned this Nov 28, 2025
@mkouba
Copy link
Contributor Author

mkouba commented Dec 1, 2025

Hi @mkouba, and thanks for looking into this.

Do you have any benchmark / test result to share here?

Yes, so there's a simple benchmark that basically tests Uni.createFrom().item(foo).await().indefinitely() (which is exactly the use case where a blocking implementation of an API returns Uni) and the results (for benchmark with throughput mode) look like:

RESULTS SUMMARY          |999-SNAPSHOT_2.9.5 (Base)|999-SNAPSHOT_999-SNAPSHOT
=========================|Score    |Error  |Diff   |Score    |Error  |Diff   
-------------------------|-------------------------|-------------------------
FlagComputeBenchmark     |     2569|     78|       |   118068|   3616| +4496%

999-SNAPSHOT_2.9.5 is using Mutiny 2.9.5. 999-SNAPSHOT_999-SNAPSHOT is using this patch.

Ofc, you should take these results with a grain of salt. I didn't use a dedicated machine etc. On the other hand, it's clear that if we avoid the allocations and synchronization in UniBlockingAwait#await() it should be much faster.

@jponge
Copy link
Member

jponge commented Dec 1, 2025

Thanks @mkouba (BTW I had a look at your JMH tests, they're correct).

My only real concern is that we'd be introducing a special case, but any extra operator like .map() will not get any boost, even if it's all a CPU-bound workload. Of course we could imagine further optimizations like RxJava or Reactor do, but we discussed that in https://inria.hal.science/hal-03409277/file/paper-author-version.pdf 😃

@jponge
Copy link
Member

jponge commented Dec 1, 2025

Just so I understand fully, can you please point me to some impacting places where we need that boost?

@jponge
Copy link
Member

jponge commented Dec 1, 2025

On the commit message, I'd use perf: as a prefix, see https://www.conventionalcommits.org/en/v1.0.0/#summary (the proposed set of prefixes here is not mandatory, but they are widespread and the tool we use to check the commit assumes these conventions)

@mkouba
Copy link
Contributor Author

mkouba commented Dec 1, 2025

Thanks @mkouba (BTW I had a look at your JMH tests, they're correct).

My only real concern is that we'd be introducing a special case, but any extra operator like .map() will not get any boost, even if it's all a CPU-bound workload. Of course we could imagine further optimizations like RxJava or Reactor do, but we discussed that in https://inria.hal.science/hal-03409277/file/paper-author-version.pdf 😃

I do understand your concern, it's not nice at all 🤷.

Speaking of the referenced paper - I think that this optimization is more related to the reactive/imperative integration which can be perceived as a misuse, but it's IMO necessary for practical APIs.

Just so I understand fully, can you please point me to some impacting places where we need that boost?

Yes, so typically the API provides a method that returns Uni<Foo> ping(), but also a convenient method like Foo pingAndAwait(), or the users are instructed to call ping().await() if not using a reactive stream. However, ping() implementations are often blocking and so Uni.createFrom().item() is used instead. Now, this optimization is exactly about the use case where pingAndAwait() is called and the implementation is blocking, i.e. Uni.createFrom().item() is used.

If you're curious about real APIs, then for example HttpUpgradeCheck#perform() from quarkus-websockets-next, or InitialCheck#perform from the MCP server, or Flag#compute() that was used in the benchmark.

@mkouba mkouba changed the title optimization: Uni.await() perf: optimize Uni.await() for known item/failure Dec 1, 2025
@mkouba
Copy link
Contributor Author

mkouba commented Dec 1, 2025

I cannot reproduce the TCK failure locally 🤔.

@jponge
Copy link
Member

jponge commented Dec 1, 2025

I cannot reproduce the TCK failure locally 🤔.

Weird, it sometimes happens that CI runners get stuck, so timing checks get broken, etc

@jponge
Copy link
Member

jponge commented Dec 1, 2025

These CI failures do not make sense, especially as you haven't touched anything relevant to these.

@jponge
Copy link
Member

jponge commented Dec 1, 2025

@ozangunalp it looks like we have TCK failures in CI about pausable Multi 😉

@ozangunalp
Copy link
Collaborator

Oh well

@ozangunalp
Copy link
Collaborator

It is still the same NPE in queue.clear() . Maybe we need to clear inside the drain loop.

@jponge
Copy link
Member

jponge commented Dec 1, 2025

Return of the infamous drain loop 🤣

More seriously can I let you check and open a PR? There's no urgency, this isn't in a release yet.

Copy link
Member

@jponge jponge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this, but I will always be cautious about such kind of protocol-bypass 😄

Let's merge after the TCK fix for pausable Multi is here.

@mkouba
Copy link
Contributor Author

mkouba commented Dec 2, 2025

I approve this, but I will always be cautious about such kind of protocol-bypass 😄

Makes sense.

Let's merge after the TCK fix for pausable Multi is here.

👍

@jponge
Copy link
Member

jponge commented Dec 2, 2025

@mkouba it looks like your commit is not signed or the signature is not verified, and we need this (https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification#ssh-commit-signature-verification)

- optimize the case where the item/failure is known and
subscription/synchronization is not needed
@jponge jponge merged commit 245cffe into smallrye:main Dec 2, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants