Huge harvester hangs #549

@ziorick

Hi all!
I have installed CKAN 2.10.3. I'm trying to harvest a very large remote CKAN portal (data.gov, about 296k datasets) using the ckan-harvester plugin. I don't need to import "remote_orgs", so my configuration only sets "clear_tags" to true.
The gather process starts successfully and requests the correct API path and row numbers from the remote portal; all of that works well. After the first read stage, the gather process starts logging "Creating HarvestObject for ..." for each dataset, but it never writes the line "xxxxxx datasets sent to fetch queue" (or similar) that appears in other harvest processes. This instance runs on 32 GB of DDR4 RAM and a 40c/40t Xeon CPU. The result is a CKAN process that uses about 10% CPU and 35% RAM, and the Postgres harvest_object table keeps growing, but no fetch is ever started.
Can you help me?
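
For reference, the source configuration described above could be written out roughly as follows. This is only a sketch: `build_source_config` is a hypothetical helper, and the key is spelled `clean_tags` on the assumption that this is the option meant by "clear_tags" (as far as I know, `clean_tags` is the spelling documented for the CKAN harvester in ckanext-harvest). `remote_orgs` is deliberately left out, as in the post.

```python
import json


def build_source_config(clean_tags=True):
    """Hypothetical helper: return the JSON text for the harvest source
    configuration field, setting only the tag-cleaning option."""
    # Assumption: "clean_tags" is the option the post calls "clear_tags".
    # "remote_orgs" is intentionally omitted, matching the description.
    config = {"clean_tags": clean_tags}
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Print the JSON so it can be compared with the configured source.
    print(build_source_config())
```

Comparing this output with what is actually saved on the harvest source is a quick way to rule out a misspelled key before looking at the gather stage itself.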
