Performance issue for GreaterFilter #148

@lyiu18

We use GreaterFilter combined with some other filters to query data.

An example use case:
In an order cache, each order belongs to an account and has an entry date.
We create an unsorted index on account id and a sorted index on entry date.

The query finds the orders of a given account (equal filter on account id) whose entry date falls within the last 7 days (greater filter on entry date).
This is similar to what InvocableMapHelper.query does; the partition index is already applied for the query.

The query usually applies the equal filter first, which is great. The problems can appear when it gets to the greater filter.

As I understand the code, GreaterFilter uses the inverse map of the index and compares it against the remaining setKeys. It does optimize so that only the head map or the tail map is used.

However, at this point setKeys contains only one account's orders (possibly just a few hundred), while the head/tail map, even if it is only half of the index, can contain orders from many accounts, and going through it may not be fast enough.
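To make the cost difference concrete, here is a minimal sketch in plain Java. It is a toy model of a sorted index, not Coherence code: the class `GreaterFilterCostModel`, its two maps, and the "visited" counters are all made up for illustration, and the real GreaterFilter implementation differs in detail. The point is only that the inverse-map strategy does work proportional to the size of the head map, while a forward lookup does work proportional to the size of setKeys.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of a sorted index (hypothetical, for illustration only). */
class GreaterFilterCostModel {
    // Inverse index: extracted value (entry timestamp) -> keys with that value.
    final NavigableMap<Integer, Set<String>> inverse = new TreeMap<>();
    // Forward index: key -> extracted value.
    final Map<String, Integer> forward = new HashMap<>();

    void put(String key, int entryTime) {
        forward.put(key, entryTime);
        inverse.computeIfAbsent(entryTime, t -> new HashSet<>()).add(key);
    }

    /**
     * Inverse-map strategy: walk every bucket with value <= threshold and
     * remove its keys from setKeys. Returns the number of buckets visited.
     */
    long filterViaInverse(Set<String> setKeys, int threshold) {
        long bucketsVisited = 0;
        for (Set<String> bucket : inverse.headMap(threshold, true).values()) {
            bucketsVisited++;
            setKeys.removeAll(bucket);
        }
        return bucketsVisited;
    }

    /**
     * Forward-map strategy: one lookup per candidate key. Assumes every
     * candidate is present in the forward map. Returns the number of lookups.
     */
    long filterViaForward(Set<String> setKeys, int threshold) {
        long lookups = setKeys.size();
        setKeys.removeIf(key -> forward.get(key) <= threshold);
        return lookups;
    }
}
```

With 10,000 orders spread over 100 accounts and a threshold that excludes 70% of them, the inverse strategy in this model walks 7,000 buckets to filter 100 candidate keys, while the forward strategy performs 100 lookups. Both produce the same result set.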

What we saw is that if we don't apply GreaterFilter to the index and instead manually deserialize those few hundred orders and check the date, in some cases it is still (much) faster than applying the index.

We currently use Coherence 14.1.2.0.1. I think there is a bug fix for the ChainedCollection usage in a later patch, but I am not sure that is the problem we have, since we saw the slowness in the headMap/removeAll case as well.

So, questions:

  1. Are we using the index wrong, or is there some other simple solution for this case?
  2. I am trying to understand why the code uses only the inverse map and not the forward map. Is there any concern that the forward map could cause issues, side effects, etc.?

Using the forward map, the code might look like this:

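```java
if (fHeadHeavy || index.isPartial())
    {
    if (mapTail.size() <= setKeys.size())
        {
        // original code
        }
    else
        {
        // far fewer candidate keys than index entries: check each key directly
        loopThroughInput(setKeys, index);
        }
    }
else
    {
    if (mapHead.size() <= setKeys.size())
        {
        // original code
        }
    else
        {
        loopThroughInput(setKeys, index);
        }
    }

// Look up each remaining key in the forward map and drop it if the
// extracted value does not satisfy the filter.
private void loopThroughInput(Set setKeys, MapIndex index)
    {
    setKeys.removeIf(o -> !evaluateExtracted((E) index.get(o)));
    }
```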

The <= comparison is rough, but I think you can see where I am going with this.

Presumably, when setKeys has few items compared to the tail/head map, looping through it should be quicker, shouldn't it? Or have I missed something?
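One possible concern with the forward-map route is a key whose forward value is not available, for example when an index does not keep a forward map. The following is a minimal sketch of a guarded forward lookup in plain Java, again not Coherence code: `ForwardLookupSketch`, `tryForwardFilter`, and the local `NO_VALUE` marker are hypothetical names defined only for this example. It leaves the candidate set untouched and reports failure when any value is missing, so a caller could fall back to the existing inverse-map path.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Guarded forward lookup (hypothetical sketch, not Coherence code). */
class ForwardLookupSketch {
    // Stand-in marker for "the forward map has no value for this key".
    static final Object NO_VALUE = new Object();

    /**
     * Keeps only the candidates whose forward value satisfies `keep`.
     * Returns false, leaving candidates unchanged, if any candidate has
     * no forward value; the caller can then use another strategy.
     */
    static boolean tryForwardFilter(Set<String> candidates,
                                    Map<String, Object> forward,
                                    Predicate<Object> keep) {
        Set<String> rejected = new HashSet<>();
        for (String key : candidates) {
            Object value = forward.getOrDefault(key, NO_VALUE);
            if (value == NO_VALUE) {
                return false; // cannot decide from the forward map alone
            }
            if (!keep.test(value)) {
                rejected.add(key);
            }
        }
        candidates.removeAll(rejected);
        return true;
    }
}
```

Collecting the rejected keys first, instead of removing them during the loop, keeps the candidate set intact if the lookup has to be abandoned halfway through.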
