Skip to content

lib: more robust signed polygon size#7366

Open
metzm wants to merge 7 commits intoOSGeo:mainfrom
metzm:polygon_size
Open

lib: more robust signed polygon size#7366
metzm wants to merge 7 commits intoOSGeo:mainfrom
metzm:polygon_size

Conversation

@metzm
Copy link
Copy Markdown
Contributor

@metzm metzm commented May 4, 2026

There are two identical simple algorithms in the GRASS libs to calculate the signed size of a polygon in a projected CRS: G_planimetric_polygon_area() in https://github.com/OSGeo/grass/blob/main/lib/gis/area_poly2.c#L25 and dig_find_area_poly() in https://github.com/OSGeo/grass/blob/main/lib/vector/diglib/poly.c#L97.

In the current implementation, the sign of the calculated area becomes unstable for very small polygons. The sign of the area size is needed to determine the orientation of the polygon, which in turn is needed to decide if a polygon is a topological area or isle. This PR makes calculation of the signed area size more robust.

Example polygon with x|y coordinates:

357641.7000523223|5703067.799947679
357641.7000000004|5703067.800000001
357641.6999988449|5703067.800001157
357641.7000523223|5703067.799947679

The calculated size for this polygon is about 5.5e-14 which is on sub-atomar level and in a geographical context not different from zero. However, such tiny polygons can occur with a combination of certain vector operations, e.g. v.overlay. In these cases, it is important to get the correct sign of the area size, not matter how large it actually is, otherwise rather cryptic topological errors can occur later on.

This fix also makes area size calculation for larger polygons more robust if (some of the) adjacent vertices are very close to each other, e.g. a circle approximated with a large number of points.

Backport also to G84?

@metzm metzm added this to the 8.6.0 milestone May 4, 2026
@metzm metzm added bug Something isn't working vector Related to vector data processing C Related code is in C libraries backport to 8.5 PR needs to be backported to release branch 8.5 labels May 4, 2026
@echoix
Copy link
Copy Markdown
Member

echoix commented May 4, 2026

Since you have some test data here, would it make sense to create a unit test using that to ensure it is kept non-broken in the long-run?

@github-actions github-actions Bot added Python Related code is in Python module tests Related to Test Suite labels May 5, 2026
@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 5, 2026

Since you have some test data here, would it make sense to create a unit test using that to ensure it is kept non-broken in the long-run?

Done

Copy link
Copy Markdown
Member

@wenzeslaus wenzeslaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please turn the comments into sentences with punctuation? It will be much easier to determine what is the intended meaning there. While we don't have that in style guidelines, I recently added that to AGENTS.md.

Also, linking the two places in a comment would mitigate the duplication.

@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 5, 2026

Can you please turn the comments into sentences with punctuation? It will be much easier to determine what is the intended meaning there. While we don't have that in style guidelines, I recently added that to AGENTS.md.

Also, linking the two places in a comment would mitigate the duplication.

got used to diglib code comment style, short and precise
better like this? b55a971

Comment thread lib/gis/area_poly2.c Outdated
@wenzeslaus
Copy link
Copy Markdown
Member

If anyone is interested in my thinking on the comment style:

got used to diglib code comment style, short and precise

Short and precise is something to keep, but since the primary audience of comments is humans and we typically read and otherwise communicate in full sentences, I would go with full sentences. Your comment here is an exception to the rule I would say. It is still perfectly readable because there is clear context. Code comments often lack some of the context we have in our conversations (speaker, chronological ordering of thoughts).

Copy link
Copy Markdown
Member

@wenzeslaus wenzeslaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new comment in the test is strange, but otherwise we may indeed need a more specialized value comparison code. On the other hand, pytest or NumPy may already have something.

Comment thread vector/v.to.db/testsuite/test_v_to_db.py Outdated
Comment thread vector/v.to.db/testsuite/test_v_to_db.py Outdated
Comment thread lib/vector/diglib/poly.c
wenzeslaus
wenzeslaus previously approved these changes May 6, 2026
Copy link
Copy Markdown
Member

@wenzeslaus wenzeslaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, put the xshift comment somewhere - code or commit message. Otherwise, it looks good to go.

@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 6, 2026

The xshift comment is now in the code.

BTW, a comment about your usage of Claude: even though I use coding agents myself sometimes, this PR was done by me without any AI assistance, not even PR review, the changes to the code are too minimal and too simple ;-) The amount of new code comment is now a multiple of the new code...

Backport also to G84?

Comment thread lib/gis/area_poly2.c Outdated
Comment thread lib/vector/diglib/poly.c Outdated
@wenzeslaus
Copy link
Copy Markdown
Member

Here is a raw re-review by Claude if you are interested (the suggestion it refers to is in #7366 (comment)):

PR 7366 — Updated Review

The author moved in the opposite direction from the comment-shortening suggestion: both code comments are now substantially longer, adding a paragraph explaining why the shift is applied to x as well as y. I'd push back on this — there are also two real bugs in the new content.

Bugs introduced in this revision

  1. Wrong formula in the comment (both files)
  • Considering the basic formula
  • area += (x2 - x1) * (y2 + 1))

Two errors:

  • (y2 + 1) should be (y2 + y1) — pretty visible mistake given it's the formula the whole comment is justifying.
  • Stray closing paren: (y2 + 1)).

Same typo in both area_poly2.c and poly.c. Worth fixing before merge — a wrong formula in a comment about numerical correctness is an unfortunate signal.

  1. Wrong variable in the new failure message

vector/v.to.db/testsuite/test_v_to_db.py:

msg=f"Comparing mantissas of {d1[k1]} and {d1[k1]}",

Both interpolations are d1[k1] — the second should be d2[k1]. As written, a failure prints the same number twice, which defeats the point of the message.

The expanded comment

Per project guidance (AGENTS.md): "Don't add features, refactor, or introduce abstractions beyond what the task requires. ... Don't design for hypothetical future requirements." The new paragraph —

▎ If for some reason the formula would be changed in the future by adding x and subtracting y, the current shift for both x and y would guarantee that the results stay correct (with reversed sign).

— is exactly that: justifying present-day code by a hypothetical future edit. If someone later swaps which axis is summed, they'll re-derive what to shift on the spot; the comment doesn't help them and bloats the file today.

The whole "considering the basic formula … in theory only needed for addition" digression also reads as design notes that belong in the PR description, not the source. My concrete suggestion remains close to last time:

  /* Shift coordinates toward a point near the polygon so intermediate
   * sums stay small; this preserves the sign of the signed area for
   * very small or detail-dense polygons under floating-point rounding.
   * The midpoint between vertex 0 and the middle vertex is a cheap
   * approximation of a centroid and good enough as an origin.
   * Keep in sync with dig_find_area_poly() in lib/vector/diglib/poly.c. */

If the author feels strongly that the symmetric x-shift in poly.c deserves a note (since it cancels algebraically), one short line is enough, e.g.:

  /* x-shift cancels in (x[i] - x[i-1]); kept for symmetry with the y-shift. */

That states the fact without inventing a future scenario.

Other items unchanged from last review

  • Tests still in testsuite/ (gunittest) rather than tests/ (pytest). The added test_areas_islands does not require NC SPM; it would fit pytest. I noted this last round; the author may have left it intentionally for cohesion with the existing file. Worth a one-line answer in the PR thread either
    way.
  • Duplicated implementations: now duplicated with the same long comment. If the two functions must stay in sync, that is an argument for one calling the other (or a static helper) — not for keeping two long comments synchronized by hand. Probably out of scope for this PR; flag it for a follow-up.
  • tiny_polygons.pack is still opaque; a one-line note in the test about its provenance / how to regenerate would help future maintainers.
  • No author / \author update in the doxygen header of area_poly2.c. Optional.

Summary

The numerical fix and tests are still correct and worthwhile. Before merge:

  1. Fix (y2 + 1)) → (y2 + y1) in both comments (or, better, drop the formula-restating paragraph entirely and trim the comment).
  2. Fix {d1[k1]} and {d1[k1]} → {d1[k1]} and {d2[k1]} in the failure message.
  3. Decide on test placement (testsuite/ vs. tests/) and reply in the thread.

Copy link
Copy Markdown
Member

@wenzeslaus wenzeslaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Very effective and complete comment for xshift issue.

@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 6, 2026

  1. comments are fixed and shortened
  2. the variable in the test message is corrected
  3. I'm not deciding on testsuite/ (gunittest) or tests/ (pytest) in this PR and use the existing tests. Syncing tests is for another PR.

@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 6, 2026

A last comment on code duplication: there is an important difference, dig_find_area_poly() returns the signed area size whereas G_planimetric_polygon_area() returns the unsigned area size which is an important difference. Ideally there would be only one fn in lib/gis to get the signed area size. dig_find_area_poly() would then return exactly that while G_planimetric_polygon_area() returns the fabs of the signed area size. For another PR maybe.

@metzm
Copy link
Copy Markdown
Contributor Author

metzm commented May 6, 2026

Backport to G84 would not be possible with simple cherry-picking because v.build and v.to.db don't have tests in G84.

@wenzeslaus
Copy link
Copy Markdown
Member

Thanks for the additional comments here and in the code. Well-targeted, non-verbose comments make a world of difference for maintainability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport to 8.5 PR needs to be backported to release branch 8.5 bug Something isn't working C Related code is in C libraries module Python Related code is in Python tests Related to Test Suite vector Related to vector data processing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants