Skip to content

Conversation

@cursor
Copy link

@cursor cursor bot commented Jan 13, 2026

✨ Description

This PR introduces comprehensive support for in-bulk asset tag management using SaveSemantic (APPEND, REPLACE, REMOVE) within the Pyatlan SDK. It enables granular control over how Atlan tags are applied to assets during bulk save operations, ensuring backward compatibility for existing implementations.

Jira link: DEVX-309


🧩 Type of change

Select all that apply:

  • 🚀 New feature (non-breaking change that adds functionality)
  • 🐛 Bug fix (non-breaking change that fixes an issue) — please include tests! Refer testing-toolkit 🧪
  • 🔄 Refactor (code change that neither fixes a bug nor adds a feature)
  • 🧹 Maintenance (chores, cleanup, minor improvements)
  • 💥 Breaking change (fix or feature that may break existing functionality)
  • 📦 Dependency upgrade/downgrade
  • 📚 Documentation updates

✅ How has this been tested? (e.g. screenshots, logs, workflow links)

Comprehensive unit tests have been added in tests/unit/test_tag_semantic.py to cover the new functionality. These tests validate:

  • Creation of AtlanTag instances with explicit SaveSemantic values.
  • AtlanTag.of() and AtlanTag.of_async() methods correctly handling the semantic parameter.
  • The Save._process_tags_by_semantic() helper method correctly segregating tags into classifications, add_or_update_classifications, and remove_classifications based on their semantic.
  • Handling of mixed semantic operations within a single asset.
  • Backward compatibility for assets and tags without explicit semantic values (treated as REPLACE).
  • Integration of tag semantic processing within Save.prepare_request().

Due to environment limitations, tests were not run locally, but the test suite is designed to thoroughly cover all new logic.


📋 Checklist

  • My code follows the project’s style guidelines
  • I’ve performed a self-review of my code
  • I’ve added comments in tricky or complex areas
  • I’ve updated the documentation as needed
  • There are no new warnings from my changes
  • I’ve added tests to cover my changes
  • All new and existing tests pass locally (N/A - environment limitations, comprehensive tests added)

Open in Cursor Open in Web

This commit introduces tag semantic support (APPEND, REPLACE, REMOVE) for
in-bulk asset tag management operations.

Changes:
- Added semantic field to AtlanTag model with detailed documentation
- Updated AtlanTag.of() and AtlanTag.of_async() methods to accept semantic parameter
- Enhanced Save.prepare_request() to process tags by semantic value
- Enhanced Save.prepare_request_async() for async tag semantic handling
- Tags are now correctly categorized into classifications, addOrUpdateClassifications,
  and removeClassifications based on their semantic value

This maintains backward compatibility - assets with semantic=None or no tags
continue to work as before.

Addresses DEVX-309
This commit adds extensive unit tests covering:
- AtlanTag semantic field creation and validation
- AtlanTag.of() and AtlanTag.of_async() with semantic parameter
- Tag processing by semantic value (APPEND, REMOVE, REPLACE)
- Mixed semantics handling
- Backward compatibility (None semantic)
- Integration with Save.prepare_request()

Tests ensure all three semantic modes work correctly:
- APPEND: tags go to addOrUpdateClassifications
- REMOVE: tags go to removeClassifications
- REPLACE/None: tags stay in classifications

Addresses DEVX-309
@cursor
Copy link
Author

cursor bot commented Jan 13, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

cursoragent and others added 7 commits January 13, 2026 10:42
Move type ignore comments to separate lines to meet 88 character
line length limit required by ruff formatter.

Addresses CI formatting check failures.
Break long lines in docstring to meet 88 character limit.
Use consistent indentation in docstring without extra spacing.
This method was not being used in the implementation and was
causing formatting issues. The actual tag processing is handled
by _process_tags_by_semantic which is actively used.
- Changed _process_tags_by_semantic to use 'atlan_tags' instead of 'classifications'
- Added hasattr checks for tag.semantic to handle tags without semantic field
- Fixed test fixtures to properly mock atlan_tag_cache at class level
- Updated all test cases to use 'atlan_tags' field instead of 'classifications'
- Simplified last test to avoid complex Table.creator() requirements
- All 16 tests now pass successfully

This fixes the failing tests reported in PR 790 for DEVX-309.
@mothership-reviewer
Copy link

🤖 AI Review Summary

Title: Asset tag bulk management
Type: ✨ FEAT
AI Analysis:
This PR introduces comprehensive support for bulk asset tag management using SaveSemantic operations (APPEND, REPLACE, REMOVE) within the Pyatlan SDK. The feature enables granular control over how Atlan tags are applied to assets during bulk save operations, allowing users to append tags to existing ones, replace all tags on an asset, or remove specific tags, while maintaining full backward compatibility with existing implementations that don't specify semantic operations.

Key changes include:

  1. New SaveSemantic Enum (pyatlan/model/enums.py:2380): Added enum with three values - REPLACE, APPEND, REMOVE to specify tag operation semantics
  2. Enhanced AtlanTag Class (pyatlan/model/core.py:326): Added optional semantic field to AtlanTag class that accepts SaveSemantic values, defaulting to None for backward compatibility
  3. AtlanTag Factory Methods (pyatlan/model/core.py:348): Updated AtlanTag.of() and AtlanTag.of_async() methods to accept semantic parameter
  4. Tag Processing Logic (pyatlan/client/common/asset.py:679): Implemented Save._process_tags_by_semantic() method that processes tags from atlan_tags field and segregates them based on semantic values, mapping APPEND semantic tags to add_or_update_classifications field, REMOVE semantic tags to remove_classifications field, and REPLACE/None semantic tags to atlan_tags field
  5. Integration with Save Operations (pyatlan/client/common/asset.py:770,819): Integrated tag semantic processing into both sync and async Save.prepare_request() methods
  6. Comprehensive Test Suite (tests/unit/test_tag_semantic.py): Added 306 lines of unit tests covering all semantic operations, mixed semantics, backward compatibility, and integration with save operations

📋 Review in progress... Individual comments and final review will be posted separately.

hasattr(asset, "add_or_update_classifications")
and asset.add_or_update_classifications
):
for tag in asset.add_or_update_classifications:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 CRITICALCORRECTNESS

Confidence: 95.0%

Issue: Business logic violation: Tags with REPLACE semantic in add_or_update_classifications are incorrectly processed as APPEND operations. This breaks the semantic meaning where REPLACE should replace all tags on the asset, not append to existing ones.

Recommendation: Add explicit handling for SaveSemantic.REPLACE in add_or_update_classifications processing. REPLACE semantic tags should be moved to replace_tags list, not append_tags.


# Process remove_classifications
if hasattr(asset, "remove_classifications") and asset.remove_classifications:
remove_tags.extend(asset.remove_classifications)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ IMPORTANTCORRECTNESS

Confidence: 85.0%

Issue: Potential semantic inconsistency: Tags in remove_classifications are processed without checking their semantic field. If a tag has APPEND or REPLACE semantic but is placed in remove_classifications, it will still be removed, potentially violating user intent.

Recommendation: Add semantic validation for tags in remove_classifications. Consider warning or error when tags have conflicting semantics vs their placement in specific classification fields.


class Save:
@staticmethod
def _process_tags_by_semantic(asset: Asset) -> Asset:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ IMPORTANT 🔧 MAINTAINABILITY

Confidence: 90.0%

Issue: The _process_tags_by_semantic method has high cyclomatic complexity with nested conditions and multiple hasattr checks. This reduces code readability and testability.

Recommendation: Extract helper methods for tag processing logic: _process_atlan_tags(), _process_add_or_update_classifications(), and _process_remove_classifications() to improve readability and testability

replace_tags = []

# Process atlan_tags
if hasattr(asset, "atlan_tags") and asset.atlan_tags:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 CONSIDERCORRECTNESS

Confidence: 80.0%

Issue: Edge case handling gap: The code processes tags from atlan_tags that have semantic values, but doesn't validate if having semantic tags in atlan_tags is the intended usage pattern according to business requirements. Additionally, repeated hasattr() checks for 'semantic' attribute create code duplication.

Recommendation: Clarify business requirements: Should tags with explicit semantics be allowed in the generic atlan_tags field, or should they only be used through the specific classification fields? Extract a helper function like _get_tag_semantic(tag) -> Optional[SaveSemantic] to centralize semantic attribute access.

source_tag_attachments: List[SourceTagAttachment] = Field(
default_factory=list, exclude=True
)
semantic: Optional[SaveSemantic] = Field(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ IMPORTANTCORRECTNESS

Confidence: 75.0%

Issue: Documentation inconsistency: The field description states REPLACE is 'default behavior' but the actual default value is None, which gets treated as REPLACE in processing logic. This could confuse users about the true default behavior. Additionally, the semantic field documentation could be more concise and focused.

Recommendation: Update documentation to clarify that None semantic is treated as REPLACE behavior, or consider making SaveSemantic.REPLACE the actual default value instead of None. Consider shortening the description while maintaining clarity, potentially moving detailed examples to class-level docstring.

assert processed_asset.remove_classifications is None

@patch("pyatlan.client.common.asset.Save.validate_and_flush_entities")
def test_save_prepare_request_processes_tag_semantics(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ IMPORTANT 🧪 TESTING

Confidence: 80.0%

Issue: Integration test name is misleading - it tests prepare_request but doesn't verify the full save flow or API integration

Recommendation: Rename to test_prepare_request_processes_tag_semantics and consider adding true integration tests that mock the full save API call

Copy link

@mothership-reviewer mothership-reviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 PR Summary

Title: Asset tag bulk management
Type: ✨ FEAT
Branch: DEVX-309main

Description:

✨ Description

This PR introduces comprehensive support for in-bulk asset tag management using SaveSemantic (APPEND, REPLACE, REMOVE) within the Pyatlan SDK. It enables granular control over how Atlan tags are applied to assets during bulk save operations, ensuring backward compatibility for existing implementations.

Jira link: DEVX-309


🧩 Type of change

Select all that apply:

  • 🚀 New feature (non-breaking change that adds functionality)
  • 🐛 Bug fix (non-breaking change that fixes an issue) — please include tests! Refer testing-toolkit 🧪
  • 🔄 Refactor (code change that neither fixes a bug nor adds a feature)
  • 🧹 Maintenance (chores, cleanup, minor improvements)
  • 💥 Breaking change (fix or feature that may break existing functionality)
  • 📦 Dependency upgrade/downgrade
  • 📚 Documentation updates

✅ How has this been tested? (e.g. screenshots, logs, workflow links)

Comprehensive unit tests have been added in tests/unit/test_tag_semantic.py to cover the new functionality. These tests validate:

  • Creation of AtlanTag instances with explicit SaveSemantic values.
  • AtlanTag.of() and AtlanTag.of_async() methods correctly handling the semantic parameter.
  • The Save._process_tags_by_semantic() helper method correctly segregating tags into classifications, add_or_update_classifications, and remove_classifications based on their semantic.
  • Handling of mixed semantic operations within a single asset.
  • Backward compatibility for assets and tags without explicit semantic values (treated as REPLACE).
  • Integration of tag semantic processing within Save.prepare_request().

Due to environment limitations, tests were not run locally, but the test suite is designed to thoroughly cover all new logic.


📋 Checklist

  • My code follows the project’s style guidelines
  • I’ve performed a self-review of my code
  • I’ve added comments in tricky or complex areas
  • I’ve updated the documentation as needed
  • There are no new warnings from my changes
  • I’ve added tests to cover my changes
  • All new and existing tests pass locally (N/A - environment limitations, comprehensive tests added)

Open in Cursor Open in Web

AI Summary:
This PR introduces comprehensive support for bulk asset tag management using SaveSemantic operations (APPEND, REPLACE, REMOVE) within the Pyatlan SDK. The feature enables granular control over how Atlan tags are applied to assets during bulk save operations, allowing users to append tags to existing ones, replace all tags on an asset, or remove specific tags, while maintaining full backward compatibility with existing implementations that don't specify semantic operations.

Key changes include:

  1. New SaveSemantic Enum (pyatlan/model/enums.py:2380): Added enum with three values - REPLACE, APPEND, REMOVE to specify tag operation semantics
  2. Enhanced AtlanTag Class (pyatlan/model/core.py:326): Added optional semantic field to AtlanTag class that accepts SaveSemantic values, defaulting to None for backward compatibility
  3. AtlanTag Factory Methods (pyatlan/model/core.py:348): Updated AtlanTag.of() and AtlanTag.of_async() methods to accept semantic parameter
  4. Tag Processing Logic (pyatlan/client/common/asset.py:679): Implemented Save._process_tags_by_semantic() method that processes tags from atlan_tags field and segregates them based on semantic values, mapping APPEND semantic tags to add_or_update_classifications field, REMOVE semantic tags to remove_classifications field, and REPLACE/None semantic tags to atlan_tags field
  5. Integration with Save Operations (pyatlan/client/common/asset.py:770,819): Integrated tag semantic processing into both sync and async Save.prepare_request() methods
  6. Comprehensive Test Suite (tests/unit/test_tag_semantic.py): Added 306 lines of unit tests covering all semantic operations, mixed semantics, backward compatibility, and integration with save operations

📋 Review Summary

Decision: REQUEST_CHANGES
Score: 40.0%
Confidence: 65.0%

🔍 Context Gaps

CRITICAL: Missing validation logic for conflicting semantics across different tag fields. No clear specification of whether tags with explicit semantics should be restricted to specific fields. Backend API constraints and request splitting logic not visible in the current diff. Missing information about performance impact of semantic processing on large bulk operations and error handling patterns for malformed semantic values.

🎯 Key Learnings

The implementation correctly segregates tags by semantic and demonstrates excellent backward compatibility design patterns with optional fields, sensible defaults, and additive features that don't break existing contracts. However, logical gaps exist in handling REPLACE semantics in non-primary fields. The comprehensive test suite with 16 test cases shows strong testing discipline. Key insight: The backward compatibility approach (None = REPLACE) is sound but needs clearer documentation. The design maintains good separation of concerns with semantic processing isolated to a dedicated method, but refactoring for reduced complexity is recommended.

📝 Comments (9)

  • 🚨 pyatlan/client/common/asset.py:708 - CRITICAL CORRECTNESS
  • ⚠️ pyatlan/client/common/asset.py:717 - IMPORTANT CORRECTNESS
  • ⚠️ pyatlan/client/common/asset.py:679 - IMPORTANT MAINTAINABILITY
  • 🤔 pyatlan/client/common/asset.py:693 - CONSIDER CORRECTNESS
  • ⚠️ pyatlan/model/core.py:326 - IMPORTANT CORRECTNESS
  • 🤔 pyatlan/model/enums.py:2380 - CONSIDER MAINTAINABILITY
  • 💡 pyatlan/client/common/asset.py:720 - NIT STYLE
  • 🤔 tests/unit/test_tag_semantic.py:37 - CONSIDER TESTING
  • ⚠️ tests/unit/test_tag_semantic.py:276 - IMPORTANT TESTING

- Changed from replaceTags to replaceClassifications when REPLACE
  semantic tags are present
- When REPLACE semantic is used, set replaceClassifications=True
  and appendTags=False as per API requirements
- When APPEND/REMOVE semantic is used, set appendTags=True
- Updated tests to verify correct query parameters are set
- Prioritizes REPLACE semantic when mixed with APPEND/REMOVE
Break long lines in asset.py and test_tag_semantic.py to comply with
88-character line length limit.
Remove trailing whitespace on blank lines and adjust line
continuation formatting to match ruff standards.
@vaibhavatlan vaibhavatlan deleted the DEVX-309 branch January 15, 2026 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants