Feature added: Get WSI at mpp#7574
Conversation
a65ef4a to
3264079
Compare
|
@NikolasSchmitz, I resolved the conflicts and updated the PR. Let's focus on the functionality and make this PR ready for reviewing. We can take care of any other issue once the PR is ready and please feel free to reach out to me or the working group if you still have any questions. Thanks for taking on this feature. |
|
Thank you @drbeh ! I now marked it as ready for review. |
|
@drbeh would you be able to review this? I think you're best qualified here. I made some minor comments about print and code duplication but otherwise it seems fine to me without having tested it. Thanks! |
JHancox
left a comment
There was a problem hiding this comment.
Thanks for the effort @NikolasSchmitz. A welcome addition! I agree with @ericspod about the refactoring to remove duplication - that would be great.
|
Thank you for the feedback, @JHancox . Unfortunately, I couldn't join the meeting because I was on my way home from ICLR. |
|
I factored out the actual resizing of the image into _resize_to_mpp_res() and kept the tolerance checks in get_wsi_at_mpp(). I had some problems with specific tiff files and the TiffFile backend. Some files had zero as X/Y Resolution, which led to a division by zero error in the function get_mpp (added a check for that in the new commit). |
|
There are lines added/removed to two other files that don't need to be changed in this PR, tests/test_regularization.py is causing a conflict for example. I would remove those and we can use black here through Github to resolve formatting issues if we need to. |
Thanks for the info. Interesting though, haven't noticed that these files were changed. Perhaps through one of the coding style checks. |
|
There's a few changes to do but it looks good in general. We do need tests for these new methods as well so please look at the existing WSI tests to see what needs to be added to cover new functionality. The flake8 and DCO fixes can be done at the end. Thanks! |
I'm also seeing some very different information in the two header from these files I've included below. I don't know how the tiff was generated but maybe @bhashemian would and know what's the discrepancy is. DetailsIn [11]: dict(openslide.open_slide("CMU-1.tiff").properties) In [12]: dict(openslide.open_slide("Aperio-CMU-1.svs").properties) |
|
Hi @NikolasSchmitz and @ericspod, Your observation is spot-on! The TIFF file lacks the correct MPP. We’ve been using these files from OpenSlide test cases without making any modifications. Previously, we simply assumed the files were correct and adjusted our test cases to read them with those metadata. For instance, we would read TIFF files at a specific MPP of 1000 and SVS files at a MPP of 0.4999. You can continue with the same approach. However, if you can convert the SVS file to TIFF with the correct metadata and update the test cases accordingly, that would be fantastic. Thanks for your contribution! |
|
Hi @bhashemian, thanks for the feedback! I have converted the SVS file to TIFF with the correct metadata. According to the Contributing document, the testing data is stored in the Project-MONAI/MONAI-extra-test-data repository. However, I'm unsure about the process for uploading files there—I've noticed that others have opened issues for similar cases. Should I follow the same approach? In the meantime, I’ll proceed with updating the other test cases accordingly. Thanks again for your guidance! |
|
Hi @NikolasSchmitz we can upload the test data that you have to MONAI-extra-test-data, it's actually stored in the contents of a release in that repo. Please do get your tests, DCO, and data sorted out, we can then upload the data for you if you could share it somewhere, and then verify the tests are working here. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds MPP-based reading to WSIReader and its backends. BaseWSIReader gains helpers to compute target pixel dimensions and MPP tolerance checks. WSIReader exposes get_wsi_at_mpp and delegates to backend readers. CuCIMWSIReader, OpenSlideWSIReader, and TiffFileWSIReader implement backend-specific _resize_to_mpp_res to read the closest/higher-resolution level and, when needed, resize to the requested MPP. TiffFile get_mpp validates X/Y resolutions. Tests and test data for MPP-based reads were added. A single whitespace edit was made in a transform file. Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
monai/data/wsi_reader.py (2)
1103-1103: Fix hardcoded tolerance values.Same issue as in CuCIMWSIReader - the method uses hardcoded tolerance values instead of the actual parameters.
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1340-1340: Fix hardcoded tolerance values.Consistent with other readers, this method also uses hardcoded tolerance values.
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
🧹 Nitpick comments (1)
monai/data/wsi_reader.py (1)
823-864: Consider further reducing code duplication in get_wsi_at_mpp methods.The three
get_wsi_at_mppimplementations share identical logic except for backend-specific calls. Consider moving the common algorithm toBaseWSIReaderwith abstract methods for backend-specific operations.Example refactoring approach:
- Move the tolerance checking and level selection logic to
BaseWSIReader.get_wsi_at_mpp- Define abstract methods like
_read_region_at_leveland_convert_to_numpyfor backend-specific operations- Override only these backend-specific methods in each reader class
Also applies to: 1084-1124, 1321-1359
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
monai/data/wsi_reader.py(9 hunks)monai/transforms/regularization/array.py(0 hunks)
💤 Files with no reviewable changes (1)
- monai/transforms/regularization/array.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.4.1)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: build-docs
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: packaging
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: flake8-py3 (codeformat)
🔇 Additional comments (6)
monai/data/wsi_reader.py (6)
434-456: LGTM! Well-structured helper method for MPP target resolution calculation.The method correctly computes target dimensions by calculating separate downsampling factors for x and y dimensions, with proper rounding.
457-490: LGTM! Robust tolerance checking with clear logic.The method properly handles both absolute and relative tolerances, and correctly determines if the closest level has sufficient resolution.
663-681: LGTM! Clean delegation pattern.The method properly delegates to the underlying reader implementation while maintaining a consistent interface.
949-971: LGTM! Proper GPU-accelerated resizing implementation.The method correctly uses CuCIM's resize function with bilinear interpolation (order=1) as suggested in past reviews.
1195-1216: LGTM! Appropriate PIL-based resizing for OpenSlide backend.The method correctly uses PIL's resize with BILINEAR interpolation, which is consistent with the backend's capabilities.
1437-1458: LGTM! Consistent PIL-based implementation for TiffFile backend.The method properly converts the TIFF array to PIL Image before resizing.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/utils/enums/test_wsireader.py (1)
473-499: Well-structured test with appropriate skip logic.The new test method effectively validates the
get_wsi_at_mppfunctionality with:
- Proper parameterized testing approach
- Appropriate skip conditions for unsupported backend/format combinations
- Clear shape validation
Consider enhancing the test coverage by also validating:
- Actual image content/pixel values (not just shape)
- Tolerance parameter behavior
- Error conditions and edge cases
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
monai/data/wsi_reader.py(9 hunks)tests/utils/enums/test_wsireader.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- monai/data/wsi_reader.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/utils/enums/test_wsireader.py (1)
monai/data/wsi_reader.py (9)
WSIReader(504-723)read(711-723)read(885-908)read(1146-1167)read(1379-1400)get_wsi_at_mpp(663-680)get_wsi_at_mpp(823-863)get_wsi_at_mpp(1089-1128)get_wsi_at_mpp(1326-1363)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.4.1)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: packaging
- GitHub Check: build-docs
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-py3 (3.11)
🔇 Additional comments (3)
tests/utils/enums/test_wsireader.py (3)
40-45: LGTM! Clear naming and helpful documentation.The new constants for the TIFF file with corrected MPP metadata are well-named and the comment clearly explains the distinction from the existing file with incorrect MPP values.
262-309: Comprehensive test coverage for MPP-based WSI reading.The new test cases provide excellent coverage across:
- Multiple MPP resolutions (1.5, 3.0, 4.0, 8.0)
- Both SVS and TIFF file formats
- All three backends (openslide, cucim, tifffile)
- Tolerance parameter testing
- Different expected shapes per backend (accounting for RGBA vs RGB channel differences)
461-472: Clarify the intent behind commenting out the existing test.The
test_read_whole_imagemethod has been commented out. Is this intentional for the long term, or should this test be re-enabled? If it's permanently replaced by the new MPP-based tests, consider removing the commented code to keep the codebase clean.
|
Hi all, the feature and a few associated tests have been implemented. I ran into an issue with the resizing functions, as they expected different channel orders for the target resolution. That’s been resolved. I’ve also uploaded the updated TIFF file here: https://gigamove.rwth-aachen.de/en/download/722a2879de053fd9a0ddd7f6e4874624 |
|
Hi @NikolasSchmitz thanks for getting back to this. There's a few tests which still aren't passing, a few say just that a timeout occurred so it might be internal but the code formatting test did find something. As for DCO, clicking on that brings up advice on how to resolve, specifically doing an empty remedial commit. If that doesn't work we will set the DCO to pass just before merging. Please have a look at the tests and see if we can get that through. |
|
We also are putting more test data on Huggingface now so you can propose to upload your image to here https://huggingface.co/datasets/MONAI/testing_data if you like. We're planning to adjust our tests to pull from this dataset. |
|
Your new tiff file is added to the hf dataset now, so you need to change @skipUnless(has_cucim or has_osl or has_tiff, "Requires cucim, openslide, or tifffile!")
def setUpModule():
download_url_or_skip_test(
testing_data_config("images", WSI_GENERIC_TIFF_KEY, "url"),
WSI_GENERIC_TIFF_PATH,
hash_type=testing_data_config("images", WSI_GENERIC_TIFF_KEY, "hash_type"),
hash_val=testing_data_config("images", WSI_GENERIC_TIFF_KEY, "hash_val"),
)
download_url_or_skip_test(
testing_data_config("images", WSI_GENERIC_TIFF_CORRECT_MPP_KEY, "url"),
WSI_GENERIC_TIFF_CORRECT_MPP_PATH,
hash_type=testing_data_config("images", WSI_GENERIC_TIFF_CORRECT_MPP_KEY, "hash_type"),
hash_val=testing_data_config("images", WSI_GENERIC_TIFF_CORRECT_MPP_KEY, "hash_val"),
)
download_url_or_skip_test(
testing_data_config("images", WSI_APERIO_SVS_KEY, "url"),
WSI_APERIO_SVS_PATH,
hash_type=testing_data_config("images", WSI_APERIO_SVS_KEY, "hash_type"),
hash_val=testing_data_config("images", WSI_APERIO_SVS_KEY, "hash_val"),
)Add the following to "wsi_generic_tiff_corrected": {
"url": "https://huggingface.co/datasets/MONAI/testing_data/resolve/main/CMU-1_correct_mpp.tiff",
"hash_type": "sha256",
"hash_val": "65306e3f8f7f5282d19d942dadc525cd06a80d5fd8268053939751365226c65f"
},As you can see, our process for downloading data needs to be revamped to improved this process! Thanks. |
|
@NikolasSchmitz thanks for your contribution. Please let me know if you need help fixing the DCO issues. |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
monai/data/wsi_reader.py (1)
663-681: Validate mpp input early.Reject non-tuples, wrong length, or non-positive values to avoid undefined behavior downstream.
def get_wsi_at_mpp(self, wsi, mpp: tuple, atol: float = 0.00, rtol: float = 0.05) -> np.ndarray: @@ - return self.reader.get_wsi_at_mpp(wsi, mpp, atol, rtol) + # basic validation + if not isinstance(mpp, tuple) or len(mpp) != 2: + raise ValueError("mpp must be a tuple (x, y).") + if any((not isinstance(v, (int, float))) or v <= 0 for v in mpp): + raise ValueError("mpp values must be positive numbers.") + return self.reader.get_wsi_at_mpp(wsi, mpp, atol, rtol)
🧹 Nitpick comments (8)
tests/utils/enums/test_wsireader.py (2)
479-505: Strengthen assertions for cross-backend robustness.Exact H×W can vary a few pixels due to rounding/interp differences. Also assert:
- dtype and channel count per backend,
- monotonicity of area vs requested MPP,
- that returned MPP (via metadata or helper) is within tolerance.
Consider computing expected size from level-0 dims and requested mpp rather than hard-coding tuples to reduce brittleness.
467-478: Remove commented-out test or re-enable behind a marker.Dead/commented code adds noise. Delete or gate with a feature flag/xfail.
monai/data/wsi_reader.py (5)
843-846: Don’t pass hard-coded tolerances to _find_closest_level; avoid unintended behavior.The 0,5 literals silently decouple user-provided tolerances. Either:
- use the provided (atol, rtol), or
- bypass tolerance checking entirely when only the index of the closest level is needed.
Apply one of the following:
Option A (simple):
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)Option B (preferred: no early raising):
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + # Pick closest level without tolerance checks; actual tolerances are evaluated below. + closest_lvl = min(range(len(mpp_list)), + key=lambda i: abs(mpp_list[i][0]-mpp[0]) + abs(mpp_list[i][1]-mpp[1]))Also applies to: 1108-1111, 1345-1348
949-975: Normalize dimension handling to reduce fragility.closest_lvl_dim is backend-native (often (width, height)); _compute_mpp_target_res expects (height, width), leading to implicit swaps. Unify by passing self.get_size(wsi, closest_lvl) to _compute_mpp_target_res, and use explicit width/height when calling backend resizers.
Example for CuCIM (apply similarly to OpenSlide/TiffFile as appropriate):
- closest_lvl_dim = wsi.resolutions["level_dimensions"][closest_lvl] - target_res_x, target_res_y = self._compute_mpp_target_res(closest_lvl, closest_lvl_dim, mpp_list, user_mpp) - wsi_arr = cp.array(wsi.read_region((0, 0), level=closest_lvl, size=closest_lvl_dim, num_workers=self.num_workers)) + # Consistent dims: (h, w) + h, w = self.get_size(wsi, closest_lvl) + target_w, target_h = self._compute_mpp_target_res(closest_lvl, (h, w), mpp_list, user_mpp) + # cuCIM read_region expects (w, h) + wsi_arr = cp.array(wsi.read_region((0, 0), level=closest_lvl, size=(w, h), num_workers=self.num_workers)) - closest_lvl_wsi = cucim_resize( - wsi_arr, - (target_res_x, target_res_y), + # cucim.resize expects (rows, cols) == (h, w) + closest_lvl_wsi = cucim_resize( + wsi_arr, + (target_h, target_w), order=1, preserve_range=True, anti_aliasing=False).astype(cp.uint8)For OpenSlide’s PIL.resize, pass (target_w, target_h).
Also applies to: 1200-1221, 1442-1462
1314-1324: Normalize ResolutionUnit and tighten error handling.Lowercase the unit to match ConvertUnits expectations; use shorter messages (TRY003).
- convert_to_micron = ConvertUnits(unit, "micrometer") + convert_to_micron = ConvertUnits(str(unit).lower(), "micrometer") @@ - if xres[0] and yres[0]: + if xres[0] and yres[0]: return convert_to_micron(yres[1] / yres[0]), convert_to_micron(xres[1] / xres[0]) else: - raise ValueError("The `XResolution` and/or `YResolution` property of the image is zero, " - "which is needed to obtain `mpp` for this file. Please use `level` instead.") + raise ValueError("Invalid TIFF X/YResolution (zero). Cannot compute mpp; use `level` instead.")
960-974: Consider anti-aliasing when downscaling.Downscaling without anti_aliasing can introduce artifacts. Enable it when target size is smaller than source.
- closest_lvl_wsi = cucim_resize( + downscale = (target_h < h) and (target_w < w) + closest_lvl_wsi = cucim_resize( wsi_arr, (target_h, target_w), order=1, preserve_range=True, - anti_aliasing=False).astype(cp.uint8) + anti_aliasing=downscale).astype(cp.uint8)
1099-1105: Docstring: clarify tolerance interplay and behavior on miss.Explicitly state: closest level is selected ignoring tolerances; tolerances only decide “return as-is vs resize”. Document behavior when requested MPP is finer than level 0 (error vs upsample).
Also applies to: 1336-1342, 833-839
tests/testing_data/data_config.json (1)
8-12: Consider pinning to a specific commit for reproducibility.Hash verified ✓ and artifact accessible. However,
resolve/mainremains mutable; prefer pinning to a commit hash (e.g.,resolve/abc1234) for CI stability. The SHA256 guard mitigates risk but doesn't prevent future mutations.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (4)
monai/data/wsi_reader.py(9 hunks)monai/transforms/regularization/array.py(0 hunks)tests/testing_data/data_config.json(1 hunks)tests/utils/enums/test_wsireader.py(4 hunks)
💤 Files with no reviewable changes (1)
- monai/transforms/regularization/array.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/data/wsi_reader.pytests/utils/enums/test_wsireader.py
🪛 Ruff (0.14.1)
monai/data/wsi_reader.py
1322-1323: Avoid specifying long messages outside the exception class
(TRY003)
1324-1324: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: packaging
- GitHub Check: quick-py3 (ubuntu-latest)
|
@NikolasSchmitz please let me know if I can help to finalize and merge this PR. Thanks |
|
Hey Bruce, I will finalize this PR in the next days and will let you know whether I need help. Thanks! |
|
/build |
|
Hi @bhashemian, I’ve just pushed my current progress which includes two additions suggested by coderabbitai. Thanks! |
There was a problem hiding this comment.
Actionable comments posted: 9
♻️ Duplicate comments (5)
monai/data/wsi_reader.py (5)
843-843: Fix hardcoded tolerance values.Line uses hardcoded
(0, 5)instead of theatolandrtolparameters passed to the function. This bug was flagged in previous reviews.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1110-1110: Fix hardcoded tolerance values.Line uses hardcoded
(0, 5)instead of theatolandrtolparameters.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1124-1127: Guard against level underflow.Decrementing
closest_lvlwithout checking if it's 0 can cause underflow to -1.Apply this diff:
else: # If both checks fail, increase resolution (i.e., decrement level) and then downsample it. + if closest_lvl == 0: + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + else: - closest_lvl = closest_lvl - 1 - closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + closest_lvl = closest_lvl - 1 + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp)
1347-1347: Fix hardcoded tolerance values.Line uses hardcoded
(0, 5)instead of theatolandrtolparameters.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1359-1362: Guard against level underflow.Decrementing
closest_lvlwithout checking if it's 0 can cause underflow to -1.Apply this diff:
else: # If both checks fail, increase resolution (i.e., decrement level) and then downsample it. + if closest_lvl == 0: + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + else: - closest_lvl = closest_lvl - 1 - closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + closest_lvl = closest_lvl - 1 + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
monai/data/wsi_reader.py(9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/data/wsi_reader.py
🪛 Ruff (0.14.4)
monai/data/wsi_reader.py
1324-1325: Avoid specifying long messages outside the exception class
(TRY003)
1326-1326: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: packaging
- GitHub Check: build-docs
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: flake8-py3 (codeformat)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-os (macOS-latest)
🔇 Additional comments (1)
monai/data/wsi_reader.py (1)
1317-1326: Validation logic for zero resolution is correct.The check prevents division-by-zero when X/YResolution numerators are zero. Static analysis flagged the long exception message (TRY003), but this is acceptable for clarity.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (5)
monai/data/wsi_reader.py (5)
843-843: Fix hardcoded tolerance values.Uses hardcoded
(0, 5)instead of the function'satolandrtolparameters. This bug was flagged in past reviews but remains unfixed.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1110-1110: Fix hardcoded tolerance values.Uses hardcoded
(0, 5)instead of the function'satolandrtolparameters. Same bug as in CuCIMWSIReader.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1359-1362: Guard against level underflow.If
closest_lvlis 0, line 1361 decrements it to -1, causing negative indexing.Apply this diff:
else: # If both checks fail, increase resolution (i.e., decrement level) and then downsample it. - closest_lvl = closest_lvl - 1 - closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + if closest_lvl == 0: + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + else: + closest_lvl = closest_lvl - 1 + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp)
1347-1347: Fix hardcoded tolerance values.Uses hardcoded
(0, 5)instead of the function'satolandrtolparameters. Same bug as in other backends.Apply this diff:
- closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5) + closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, atol, rtol)
1124-1127: Guard against level underflow.If
closest_lvlis 0, line 1126 decrements it to -1, causing negative indexing into the level list.Apply this diff:
else: # If both checks fail, increase resolution (i.e., decrement level) and then downsample it. - closest_lvl = closest_lvl - 1 - closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + if closest_lvl == 0: + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp) + else: + closest_lvl = closest_lvl - 1 + closest_lvl_wsi = self._resize_to_mpp_res(wsi, closest_lvl, mpp_list, mpp)
🧹 Nitpick comments (3)
monai/data/wsi_reader.py (3)
434-455: Document return type in docstring.The method returns
tuple[int, int](target width, height) but this isn't specified in the docstring.Add to docstring:
Args: closest_lvl: Whole slide image level closest to user-provided MPP resolution. closest_lvl_dim: Dimensions (height, width) of the image at the closest level. mpp_list: List of MPP values for all levels of the whole slide image. mpp: The MPP resolution at which the whole slide image representation should be extracted. + + Returns: + tuple[int, int]: Target dimensions (width, height) for the resized image.
457-489: Document return type in docstring.The method returns
tuple[bool, bool]but this isn't specified in the docstring.Add to docstring:
Args: closest_lvl: Whole slide image level closest to user-provided MPP resolution. mpp_list: List of MPP values for all levels of the whole slide image. mpp: The MPP resolution at which the whole slide image representation should be extracted. atol: Absolute tolerance for MPP comparison. rtol: Relative tolerance for MPP comparison. + + Returns: + tuple[bool, bool]: (is_within_tolerance, closest_level_is_bigger)
951-977: Minor formatting: add space before method call.Line 975 missing space before
.astype.Apply this diff:
(target_res_x, target_res_y), order=1, preserve_range=True, - anti_aliasing=False).astype(cp.uint8) + anti_aliasing=False + ).astype(cp.uint8)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
monai/data/wsi_reader.py(9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/data/wsi_reader.py
🪛 Ruff (0.14.4)
monai/data/wsi_reader.py
1324-1325: Avoid specifying long messages outside the exception class
(TRY003)
1326-1326: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: quick-py3 (ubuntu-latest)
🔇 Additional comments (3)
monai/data/wsi_reader.py (3)
1202-1222: LGTM.Resizing logic is correct and uses appropriate PIL interpolation.
1317-1326: LGTM - division-by-zero guard added.The validation correctly prevents division by zero when TIFF resolution tags contain zero numerators. The static analysis hint (TRY003) about exception message length is pedantic and can be ignored here.
1444-1464: LGTM.Resizing logic is correct and consistent with OpenSlide implementation.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
monai/data/wsi_reader.py (1)
460-496: Fix return type annotation.
_compute_mpp_tolerancesactually returns a pair of booleans; the annotation-> boolis misleading and confuses type checkers. Update it totuple[bool, bool]to match the real contract.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
monai/data/wsi_reader.py(9 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
⚙️ CodeRabbit configuration file
Review the Python code for quality and correctness. Ensure variable names adhere to PEP8 style guides, are sensible and informative in regards to their function, though permitting simple names for loop and comprehension variables. Ensure routine names are meaningful in regards to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings. Examine code for logical error or inconsistencies, and suggest what may be changed to addressed these. Suggest any enhancements for code improving efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.
Files:
monai/data/wsi_reader.py
🪛 Ruff (0.14.4)
monai/data/wsi_reader.py
1350-1351: Avoid specifying long messages outside the exception class
(TRY003)
1352-1352: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: min-dep-py3 (3.11)
- GitHub Check: min-dep-pytorch (2.8.0)
- GitHub Check: min-dep-py3 (3.10)
- GitHub Check: min-dep-py3 (3.9)
- GitHub Check: min-dep-pytorch (2.6.0)
- GitHub Check: min-dep-os (ubuntu-latest)
- GitHub Check: min-dep-pytorch (2.7.1)
- GitHub Check: min-dep-pytorch (2.5.1)
- GitHub Check: min-dep-py3 (3.12)
- GitHub Check: min-dep-os (windows-latest)
- GitHub Check: min-dep-os (macOS-latest)
- GitHub Check: flake8-py3 (pytype)
- GitHub Check: packaging
- GitHub Check: quick-py3 (windows-latest)
- GitHub Check: build-docs
- GitHub Check: quick-py3 (ubuntu-latest)
- GitHub Check: flake8-py3 (mypy)
- GitHub Check: quick-py3 (macOS-latest)
- GitHub Check: flake8-py3 (codeformat)
|
Hey, cool PR! Super useful! Would love to see this incorporated into MONAI soon. Could it be that the method doesn't respect the default RGB mode set for the reader? >>> from monai.data.wsi_reader import WSIReader
>>> reader = WSIReader("openslide", mode="RGB")
>>> image = reader.read("CMU-3.tiff")
>>> reader.get_wsi_at_mpp(image, (10, 10)).shape
(2307, 3272, 4)Also, I hit my feet against something pointed out in this coderabbit comment. This indeed didn't work while users might expect it to work as the exception is not obvious what to change. >>> from monai.data.wsi_reader import WSIReader
>>> reader = WSIReader("openslide", mode="RGB")
>>> image = reader.read("CMU-3.tiff")
>>> reader.get_wsi_at_mpp(image, 10).shape
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/siemdejong/segmenteer/.venv/lib/python3.10/site-packages/monai/data/wsi_reader.py", line 690, in get_wsi_at_mpp
return self.reader.get_wsi_at_mpp(wsi, mpp, atol, rtol)
File "/home/siemdejong/segmenteer/.venv/lib/python3.10/site-packages/monai/data/wsi_reader.py", line 1130, in get_wsi_at_mpp
closest_lvl = self._find_closest_level("mpp", mpp, mpp_list, 0, 5)
File "/home/siemdejong/segmenteer/.venv/lib/python3.10/site-packages/monai/data/wsi_reader.py", line 161, in _find_closest_level
closest_value = min(value_at_levels, key=lambda a_value: sum([abs(x - y) for x, y in zip(a_value, value)]))
File "/home/siemdejong/segmenteer/.venv/lib/python3.10/site-packages/monai/data/wsi_reader.py", line 161, in <lambda>
closest_value = min(value_at_levels, key=lambda a_value: sum([abs(x - y) for x, y in zip(a_value, value)]))
TypeError: 'int' object is not iterable |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@monai/data/wsi_reader.py`:
- Around line 880-881: get_wsi_at_mpp currently returns a raw HxWxC array (e.g.,
RGBA) without applying the configured image mode and channel order, which makes
its output inconsistent with _get_patch/get_data; update get_wsi_at_mpp to honor
the instance's mode and channel_dim settings by converting the raw array to the
configured mode (e.g., drop/convert alpha, convert to RGB/GRAY as appropriate)
and then reorder axes according to channel_dim before returning; do the same fix
for the equivalent code paths that produce raw arrays in the OpenSlide, CuCIM,
and TiffFile handlers (the same pattern appears in the methods referenced around
get_wsi_at_mpp and the other two locations), using the same internal helpers or
logic used by _get_patch/get_data to ensure consistent channel semantics.
- Around line 462-476: The _compute_mpp_tolerances function currently declares a
bool return type but actually returns a 2-tuple; update its signature to return
a tuple of two booleans (e.g., Tuple[bool, bool] or tuple[bool, bool]) and add
the corresponding typing import (from typing import Tuple) if not present;
ensure the docstring and any callers expecting a bool are adjusted/checked to
handle the (is_within_tolerance, closest_level_is_bigger) tuple from
_compute_mpp_tolerances.
🧹 Nitpick comments (1)
monai/data/wsi_reader.py (1)
835-855: AddRaisessections to the newget_wsi_at_mppdocstrings.
They can raiseValueError(andOptionalImportErrorin CuCIM), but the docstrings omit this.As per coding guidelines, Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings.
Also applies to: 1110-1128, 1356-1374
|
Hi @NikolasSchmitz we still want to get this merged into MONAI, there's a few things to resolve flagged by Coderabbit and users as well as the DCO. Would it be possible to get back to this? Thanks! |
Thanks for the feedback and the testing! Both issues are fixed now: The RGB mode is properly enforced before returning the array, and I've added a guard (ensure_tuple_rep(mpp, 2)) to safely handle scalar inputs for mpp without throwing a TypeError. |
2b4441c to
8988563
Compare
…, fix mpp scalar inputs, and update tests Signed-off-by: Nikolas Schmitz <nikolas.schmitz@rwth-aachen.de>
for more information, see https://pre-commit.ci
Hi @ericspod! Yes, I'm still on board. I just force-pushed an update addressing all feedback, including the DCO issue. Could you please trigger a new CodeRabbit review from your end? Just to make sure I didn't forget anything. Let me know if anything else is needed! |


Fixes: #4980
Description
In this pull request, the feature to retrieve a whole slide image at a given mpp (microns per pixel) resolution was implemented for every WSIReader class in the function
get_wsi_at_mpp.While the implementations in the
OpenslideWSIReaderandCuCIMWSIReaderclasses were tested thoroughly, I could not find a suitable TIFF file for testing with theTiffFileWSIReaderclass.For resizing, I have used
PIL.Image.resizefor Openslide and TiffFile, andcucim.sklearn.transform.resizefor CuCIM. Originally, I usedcv2.resize, but since the package isn't listed in requirements-dev.txt, I explored alternative solutions."Types of changes
./runtests.sh -f -u --net --coverage../runtests.sh --quick --unittests --disttests.make htmlcommand in thedocs/folder.Please let me know what you think and how I can improve this feature.
Best
Niko
(Last PR was closed, because I changed the branch name to include the ticket id)