@etj
Contributor

@etj etj commented Dec 10, 2025

Improve language recognition when filtering by index.

Checklist

For all pull requests:

  • Confirm you have read the contribution guidelines
  • You have sent a Contribution Licence Agreement (CLA) as necessary (not required for small changes, e.g., fixing typos in the documentation)
  • Make sure the first PR targets the master branch; any backports will be handled later. This can be ignored if the PR fixes an issue that only happens in a specific branch, but not in newer ones.

The following are required only for core and extension modules (they are welcomed, but not required, for contrib modules):

  • There is a ticket in https://github.com/GeoNode/geonode/issues describing the issue/improvement/feature (a notable exception is changes not visible to end users)
  • The issue connected to the PR must have Labels and Milestone assigned
  • PRs for bug fixes and small new features are presented as a single commit
  • PR title must be in the form "[Fixes #<issue_number>] Title of the PR"
  • New unit tests have been added covering the changes, unless there is an explanation on why the tests are not necessary/implemented

Submitting the PR does not require you to check all items, but by the time it gets merged, they should be either satisfied or inapplicable.

@etj etj added this to the 5.1.0 milestone Dec 10, 2025
@etj etj self-assigned this Dec 10, 2025
@cla-bot cla-bot bot added the cla-signed CLA Bot: community license agreement signed label Dec 10, 2025
@gemini-code-assist
Contributor

Summary of Changes

Hello @etj, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where search results were incomplete due to problems with language recognition in the new indexing system. It refines the language detection mechanism by introducing a robust fallback to the default language and ensures that language-specific filtering is applied only when relevant multi-language fields are present in the search index. This change aims to prevent missing entries and improve the overall accuracy and reliability of search results.

Highlights

  • Improved Language Recognition: The language detection logic for indexed searches has been significantly improved to ensure more accurate filtering.
  • Default Language Fallback: A new fallback mechanism has been introduced to use the default system language if no specific search language is provided or detected from the request.
  • Conditional Language Filtering: The application of language-specific filtering is now conditional, only engaging when the search index contains multi-language fields, preventing incorrect filtering for non-multilingual data.
  • Refined Language Filter: The lang_filter construction has been adjusted to correctly handle both multi-language fields (filtering by specific language) and non-multi-language fields (filtering for null language entries).
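
As a rough illustration of the fallback order described in the highlights above, the sketch below picks the first language that is actually available. The function name and its parameters are hypothetical and are not taken from the PR diff; `default_lang` stands in for whatever the system default language resolves to.

```python
from typing import Optional


def resolve_search_lang(
    query_lang: Optional[str],
    request_lang: Optional[str],
    default_lang: str,
) -> str:
    """Return the first non-empty language among the candidates.

    Order (as described in the summary): explicit search language,
    then the language detected from the request, then the default.
    All names here are illustrative assumptions.
    """
    for candidate in (query_lang, request_lang):
        # Treat None and the empty string as "not provided".
        if candidate:
            return candidate
    return default_lang


if __name__ == "__main__":
    print(resolve_search_lang(None, None, "en"))
    print(resolve_search_lang(None, "it", "en"))
    print(resolve_search_lang("fr", "it", "en"))
```

Skipping empty candidates rather than only `None` means a blank query parameter also falls through to the next source.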

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request improves language recognition when filtering by index, fixing an issue with missing entries in search results. The changes correctly separate the logic for multilang and non-multilang indexes, which is a solid improvement. Instead of mixing language-specific and language-neutral results, the new logic filters for a specific language if the index is multilang, and for language-neutral entries otherwise. I've identified a potential TypeError in the language detection logic that could lead to a server error and have provided a suggestion to fix it while also improving the code's readability.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@etj etj linked an issue Dec 10, 2025 that may be closed by this pull request
@codecov

codecov bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 13.33333% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.13%. Comparing base (7c402fd) to head (f774af6).
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #13808      +/-   ##
==========================================
- Coverage   74.13%   74.13%   -0.01%     
==========================================
  Files         944      944              
  Lines       56291    56315      +24     
  Branches     7617     7626       +9     
==========================================
+ Hits        41730    41747      +17     
- Misses      12897    12900       +3     
- Partials     1664     1668       +4     

@etj etj requested a review from giohappy December 10, 2025 18:05
@etj
Contributor Author

etj commented Dec 10, 2025

/gemini summary
/gemini review

@gemini-code-assist
Contributor

Hello @etj, I see you've requested a summary and a review. I provided a summary in my initial comment on this pull request. Below is my detailed review of the changes.

Review of Pull Request

This pull request effectively addresses the issue of incomplete search results by refining the language detection and filtering mechanisms within the indexing system. The changes introduce a more robust and context-aware approach to handling multilingual and non-multilingual content, which is a significant improvement for search accuracy and reliability.

geonode/indexing/api/filters.py

Changes:

  • Import get_default_language: This import is necessary to support the new fallback mechanism for language detection.
  • Conditional Language Filtering Logic: The most substantial change is the introduction of a conditional block that differentiates between indexes containing multi-language fields and those that do not. This is a critical improvement.
    • For multilingual indexes: The system now prioritizes search_lang from query parameters, then from the request headers/cookies, and finally falls back to the system's default language. The lang_filter is then set to Q(lang=search_lang), ensuring that only entries matching the detected language are returned. This is more precise than the previous Q(lang__isnull=True) | Q(lang=search_lang) which could lead to irrelevant results.
    • For non-multilingual indexes: The lang_filter is correctly set to Q(lang__isnull=True), meaning it will only search for entries where no specific language is assigned. This prevents incorrect language-specific filtering on content that isn't designed to be multilingual.
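
The difference between the previous and the new filter can be sketched with plain Python predicates over a few in-memory rows. This is only an approximation of the behavior described above, not the code in filters.py: the row layout, function names, and sample data are assumptions, and the real implementation builds Django `Q` objects instead.

```python
from typing import List, Optional

# Hypothetical index rows: lang is None for language-neutral entries.
ROWS = [
    {"id": 1, "lang": None},
    {"id": 2, "lang": "en"},
    {"id": 3, "lang": "it"},
]


def old_match(row: dict, search_lang: Optional[str]) -> bool:
    # Mirrors Q(lang__isnull=True) | Q(lang=search_lang).
    return row["lang"] is None or row["lang"] == search_lang


def new_match(row: dict, search_lang: Optional[str], index_is_multilang: bool) -> bool:
    if index_is_multilang:
        # Mirrors Q(lang=search_lang).
        return row["lang"] == search_lang
    # Mirrors Q(lang__isnull=True).
    return row["lang"] is None


def ids(rows: List[dict]) -> List[int]:
    return [row["id"] for row in rows]


if __name__ == "__main__":
    print(ids([r for r in ROWS if old_match(r, "en")]))
    print(ids([r for r in ROWS if new_match(r, "en", True)]))
    print(ids([r for r in ROWS if new_match(r, "en", False)]))
```

With the old predicate a search in "en" returns both the neutral and the English row; the new one returns only one kind of row, depending on whether the index is multilingual.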

Assessment:

  • Correctness: critical - The previous logic had a potential flaw where lang__isnull=True was always included, which could lead to incorrect results for multilingual searches. The new conditional logic correctly separates the handling of multilingual and non-multilingual indexes, directly addressing the core problem of missing entries due to misapplied language filters. The fallback to get_default_language() is a sensible approach when no explicit language is provided.
  • Maintainability: medium - The code is now more explicit and easier to understand regarding how language filtering is applied based on the index's characteristics. The logger.info message for the language fallback is helpful for debugging and monitoring.

geonode/indexing/manager.py

Changes:

  • pg_lang for non-localized indexes: The pg_lang for non-localized indexes is now set to multi.get_pg_language(None) instead of multi.get_pg_language(multi.get_default_language()).

Assessment:

  • Correctness: critical - This change is essential for consistency with the new filtering logic in filters.py. By setting pg_lang to None for non-localized indexes, content is correctly indexed as language-agnostic. This ensures that these entries will be found by the Q(lang__isnull=True) filter applied to non-multilingual searches, preventing them from being missed. If it were indexed with a default language, it would not be found by the lang__isnull=True filter, leading to missing results.
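
The write-side reasoning in this assessment can be illustrated with a small simulation. It is a sketch of the consistency argument as the review states it, not of manager.py itself: `index_entry`, the `legacy` flag, the row layout, and the "en" default are all hypothetical.

```python
from typing import List, Optional

DEFAULT_LANG = "en"  # assumed system default, for illustration only


def index_entry(text: str, localized: bool, lang: Optional[str], legacy: bool = False) -> dict:
    """Build a hypothetical index row.

    legacy=True imitates tagging non-localized content with the
    default language; legacy=False stores it as language-neutral.
    """
    if localized:
        stored_lang = lang
    else:
        stored_lang = DEFAULT_LANG if legacy else None
    return {"text": text, "lang": stored_lang}


def find_language_neutral(rows: List[dict]) -> List[dict]:
    # Counterpart of the lang__isnull=True filter on the query side.
    return [row for row in rows if row["lang"] is None]


legacy_rows = [index_entry("roads", localized=False, lang=None, legacy=True)]
fixed_rows = [index_entry("roads", localized=False, lang=None)]

if __name__ == "__main__":
    print(len(find_language_neutral(legacy_rows)))
    print(len(find_language_neutral(fixed_rows)))
```

Under these assumptions the legacy row is invisible to a language-neutral query, while the row stored without a language is found, which is the mismatch the review attributes to the old behavior.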

Conclusion

Overall, these changes are well-implemented and significantly improve the accuracy and reliability of search results by correctly handling language detection and filtering for both multilingual and non-multilingual content. The logic is clearer and more robust.

Labels

cla-signed CLA Bot: community license agreement signed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Missing entries in searching with the new indexing

2 participants