Skip to content

Avoid 500x500 hash shuffles #30015

@lll-phill-lll

Description

@lll-phill-lll

cluster: https://nda.ya.ru/t/BDBskrlm7P3mCa

query: https://paste.yandex-team.ru/0c9fa072-4f68-44fb-9e75-316176fef304

Problem

In this query, 560 early aggregation tasks and 420 final aggregation tasks are created. This results in a hash shuffle of 420 × 560 -> about 250,000 channels.

Since we have almost no control over the memory consumed by channels, we hit ooms almost immediately.

I added a filter to reduce the amount of data being processed, and the query succeeded. Here is the plan of the successful run:

Image

query with filter:
https://paste.yandex-team.ru/3c535fd8-33f1-47e1-8c32-9e59da269d3b

!Note:
The number of tasks may we overwritten for every stage manually using this pragma:

#pragma ydb.OverridePlanner = @@ [
                        { "tx": 0, "stage": 1, "tasks": 100 }
                    ] @@;

Proposal

When planning the number of tasks, we should consider for the number of channels that will be created. And plan the number of tasks so that we can fit all the channels into memory.

reference:
#28520 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions