A super-fast in-process lookup service for canonical names, backed by tantivy.
juditha exists to tame the noise that follows from Named Entity Recognition: given a huge list of known names (company registries, persons of interest, sanctions lists), it tells you whether a span produced by your NER pipeline corresponds to one of them, even when the casing, accents, token order, or spelling differs.
The implementation relies on a pre-populated names database and index. The input data is either FollowTheMoney entities or a plain list of names.
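To make the matching idea concrete, here is a minimal sketch of how name variants can be folded onto one canonical entry. It is an illustration only and not juditha's API or implementation: juditha is backed by tantivy, whereas this sketch uses only the Python standard library, and the names `normalize`, `build_index`, and `lookup` as well as the `cutoff` value are hypothetical.

```python
import unicodedata
from difflib import get_close_matches


def normalize(name: str) -> str:
    """Casefold, strip accents, and sort tokens so variants share one key."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks (accents) left over after decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = stripped.casefold().split()
    return " ".join(sorted(tokens))


def build_index(names):
    """Map each normalized key to its canonical name (hypothetical helper)."""
    return {normalize(name): name for name in names}


def lookup(index, span, cutoff=0.85):
    """Return the canonical name for an NER span, or None if nothing is close."""
    key = normalize(span)
    if key in index:
        return index[key]
    # Fall back to approximate matching to tolerate small spelling differences.
    close = get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[close[0]] if close else None


index = build_index(["Johann Pachelbel", "Juditha Dommer"])
print(lookup(index, "PACHELBEL johann"))    # casing and token order differ
print(lookup(index, "Judítha Dommer"))      # accent differs
print(lookup(index, "Johan Pachelbel"))     # spelling differs
print(lookup(index, "Dietrich Buxtehude"))  # not a known name
```

A linear scan with `difflib` would not scale to the large name lists described above; a dedicated search index is what makes this kind of lookup fast in practice.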
Documentation: https://docs.investigraph.dev/lib/juditha
Juditha Dommer was the daughter of a coppersmith and raised seven children, while her husband Johann Pachelbel wrote a canon.
To signal compatibility with followthemoney, juditha follows the same major version, which is currently 4.x.x.
juditha, (C) 2024 investigativedata.io. (C) 2025, 2026 Data and Research Center – DARC. Licensed under AGPLv3 or later. See NOTICE and LICENSE.