diff --git a/CHANGELOG.MD b/CHANGELOG.MD
index 381e1af9..ac2d1927 100644
--- a/CHANGELOG.MD
+++ b/CHANGELOG.MD
@@ -12,7 +12,9 @@ _released 04--2026
 
 ### Added
  - **AI Evaluation Template Support**: Uploading test result support for TestRail's AI Evaluation Template with multi-dimensional quality ratings. See README "AI Evaluation Template Support" section for complete examples.
+ - **Multi-Step AI Evaluation Workflows**: Support for combining step-level execution tracking (`testrail_result_step`) with overall quality ratings in AI Evaluation tests. See README "Multi-Step AI Evaluation Workflows" section.
  - **Global Quality Rating via `--result-fields`**: Added support for applying quality ratings to all test results using `--result-fields quality_rating:'{"category": value}'`. Test-specific quality ratings in XML/JSON properties take precedence over CLI global ratings.
+ - **Automatic AI Evaluation Template Detection**: When using `-y` (auto-creation mode), TRCLI now automatically detects and creates test cases with the AI Evaluation template. See README "Automatic Case Creation for AI Evaluation Template" section.
 
 ## [1.14.1]
 
diff --git a/README.md b/README.md
index e7abcc68..272bb1e5 100644
--- a/README.md
+++ b/README.md
@@ -690,6 +690,235 @@ trcli parse_robot \
   --suite-id 100
 ```
 
+### Multi-Step AI Evaluation Workflows
+
+For complex AI systems with multiple pipeline stages (like RAG, multi-agent systems, or sequential AI workflows), you can combine **step-level execution tracking** with **overall quality assessment** in your AI Evaluation tests. quality_rating result field can be added to to Test Case (Steps)
+
+#### How It Works
+
+**Step-Level Tracking:**
+- Each step has its own **status** (passed, failed, skipped, untested)
+- See exactly where in the pipeline the failure occurred
+
+**Overall Quality Rating:**
+- One **quality_rating** applies to the entire test result 
+- Assess the final output quality across multiple dimensions
+
+#### JUnit XML Example
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="RAG Pipeline Tests" tests="1" failures="1" time="10.5">
+  <testsuite name="Document QA" tests="1" failures="1" time="10.5">
+
+    <testcase classname="ai.rag.DocumentQA" name="C1000_test_rag_pipeline" time="10.5">
+      <properties>
+        <property name="test_id" value="C1000"/>
+
+        <!-- Step-Level Execution Tracking -->
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="passed:Step 2 Document Retrieval"/>
+        <property name="testrail_result_step" value="failed:Step 3 Answer Generation"/>
+        <property name="testrail_result_step" value="untested:Step 4 Response Validation"/>
+
+        <!-- Overall Quality Rating -->
+        <property name="quality_rating" value='{"factual_accuracy": 2, "coherence": 3, "completeness": 1}'/>
+
+        <!-- AI Context Fields (not applicable to Test Case (Steps) -->
+        <property name="testrail_result_field" value="custom_ai_input:What programming language is used for machine learning?"/>
+        <property name="testrail_result_field" value="custom_ai_output:JavaScript is the primary language for machine learning."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/rag-001"/>
+        <property name="testrail_result_field" value="custom_ai_latency:10.5 seconds"/>
+      </properties>
+      <failure message="Answer generation produced factually incorrect response"/>
+    </testcase>
+
+  </testsuite>
+</testsuites>
+```
+
+**Upload Command:**
+```bash
+trcli parse_junit \
+  -f rag_pipeline_results.xml \
+  --project-id 1 \
+  --suite-id 100
+```
+
+#### Important Notes
+
+1. **Quality Rating Scope**: The `quality_rating` applies to the **entire test result**, not individual steps. It represents the overall quality of the AI system's final output.
+
+2. **Step Status Format**: Use `status:description` format for step-level tracking:
+   - `passed:Step 1 Query Understanding`
+   - `failed:Step 3 Answer Generation`
+   - `skipped:Optional Enhancement`
+   - `untested:Step 4 Response Validation`
+
+3. **Available Step Statuses**:
+   - `passed` (status_id: 1) - Step completed successfully
+   - `untested` (status_id: 3) - Step not executed
+   - `skipped` (status_id: 4) - Step intentionally skipped
+   - `failed` (status_id: 5) - Step failed
+
+4. **Test Status Aggregation**: The overall test status follows **fail-fast** logic - if any step fails, the entire test fails.
+
+### Automatic Case Creation for AI Evaluation Template
+
+When using the `-y` flag (auto-creation mode), TRCLI can automatically detect and create test cases with the **AI Evaluation template**. This eliminates the need to manually select templates or pre-create cases.
+
+#### How Auto-Detection Works
+
+TRCLI detects AI Evaluation indicators through three methods:
+
+1. **Quality Rating in Test Results**: When `quality_rating` is present in any test result
+2. **AI Case Fields in CLI**: When `--case-fields` includes `custom_ai_type` or `custom_ai_model`
+3. **AI Case Fields in XML Properties**: When `testrail_case_field` properties include AI fields
+
+If any of these indicators are detected, TRCLI will validate that the AI Evaluation template exists in your project or exit with an error if the template is not found.
+
+#### Example: Auto-Create with Quality Rating
+
+```bash
+trcli -y \
+  -h https://your-instance.testrail.io \
+  --project "AI Testing" \
+  -n \
+  --title "RAG Pipeline Tests" \
+  -f junit_results.xml
+```
+
+**junit_results.xml:**
+```xml
+<testsuites name="RAG Tests">
+  <testsuite name="Document QA">
+    <testcase name="test_rag_pipeline">
+      <properties>
+        <!-- Automation ID for case matching -->
+        <property name="test_id" value="ai.rag.test_rag_pipeline"/>
+
+        <!-- Quality rating triggers AI Evaluation template -->
+        <property name="quality_rating" value='{"factual_accuracy": 5, "coherence": 4}'/>
+
+        <!-- AI context fields for observability -->
+        <property name="testrail_result_field" value="custom_ai_input:What is ML?"/>
+        <property name="testrail_result_field" value="custom_ai_output:ML is a subset of AI..."/>
+      </properties>
+    </testcase>
+  </testsuite>
+</testsuites>
+```
+
+#### Example: Auto-Create with AI Case Fields
+
+You can specify AI case fields either via CLI or in XML properties:
+
+**Via CLI `--case-fields`:**
+```bash
+trcli -y \
+  -h https://your-instance.testrail.io \
+  --project "AI Testing" \
+  --case-fields custom_ai_type:1 custom_ai_model:2 \
+  -f test_results.xml
+```
+
+**Via XML Properties:**
+```xml
+<testcase name="test_llm_chatbot">
+  <properties>
+    <property name="test_id" value="ai.llm.test_chatbot"/>
+
+    <!-- AI case fields trigger AI Evaluation template -->
+    <!-- custom_ai_type: 1=RAG, 2=ML, 3=LLM -->
+    <property name="testrail_case_field" value="custom_ai_type:3"/>
+    <!-- custom_ai_model: 1=GPT-5, 2=Gemini 3, 3=Sonnet 3.5 -->
+    <property name="testrail_case_field" value="custom_ai_model:1"/>
+
+    <!-- Optional: Add quality rating -->
+    <property name="quality_rating" value='{"factual_accuracy": 4}'/>
+  </properties>
+</testcase>
+```
+
+#### AI Case Field Values
+
+The AI Evaluation template includes two dropdown case fields:
+
+**`custom_ai_type`** - Type of AI system:
+- `1` = RAG (Retrieval-Augmented Generation)
+- `2` = ML (Machine Learning)
+- `3` = LLM (Large Language Model)
+
+**`custom_ai_model`** - AI model used:
+- `1` = GPT-5
+- `2` = Gemini 3
+- `3` = Sonnet 3.5
+
+**Note:** Values must be integers (1-3), not strings.
+
+#### Combining Auto-Creation with Multi-Step Results
+
+Auto-creation works seamlessly with step-level results for Test Case (Steps) template. Simply include both `quality_rating` and `testrail_result_step` properties:
+
+```xml
+<testcase name="test_rag_full_pipeline">
+  <properties>
+    <property name="test_id" value="ai.rag.test_full_pipeline"/>
+
+    <!-- Step-level execution tracking -->
+    <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+    <property name="testrail_result_step" value="passed:Step 2 Vector Search"/>
+    <property name="testrail_result_step" value="failed:Step 3 Answer Generation"/>
+
+    <!-- Overall quality rating (applies to entire test) -->
+    <property name="quality_rating" value='{"factual_accuracy": 2, "coherence": 4}'/>
+
+    <!-- AI case fields for metadata -->
+    <property name="testrail_case_field" value="custom_ai_type:1"/>
+    <property name="testrail_case_field" value="custom_ai_model:3"/>
+  </properties>
+</testcase>
+```
+
+#### Template Validation
+
+Before creating cases, TRCLI validates that the AI Evaluation template exists in your project. If the template is not found, you'll see:
+
+```
+ERROR: Cannot auto-create cases with AI Evaluation template.
+AI Evaluation template not found in project (ID: 1).
+
+Please enable the AI Evaluation template in your TestRail project:
+1. Go to Administration > Customizations > Templates
+2. Enable 'AI Evaluation' template for your project
+```
+
+#### Robot Framework Support
+
+Robot Framework tests also support auto-creation with AI Evaluation template:
+
+```robot
+*** Test Cases ***
+Test RAG Pipeline
+    [Documentation]    - testrail_case_field:custom_ai_type:1
+    ...                - testrail_case_field:custom_ai_model:3
+    ...                - quality_rating:{"factual_accuracy": 5, "relevance": 4}
+    ...                - testrail_result_field:custom_ai_input:What is quantum computing?
+    ...                - testrail_result_field:custom_ai_output:Quantum computing uses...
+    [Tags]    ai-evaluation
+
+    # Test steps here
+    Should Be Equal    ${status}    success
+```
+
+#### Important Notes
+
+1. **Template Requirement**: The AI Evaluation template must be enabled in your TestRail project
+2. **Global vs. Test-Specific**: AI case fields can be specified globally via `--case-fields` or per-test via XML properties
+3. **Field Type**: AI case field values are dropdown IDs (integers 1-3), not strings
+4. **Detection Scope**: Detection checks ALL test cases in the file - if any test has AI indicators, ALL auto-created cases will use the AI Evaluation template
+5. **Compatible with BDD**: Auto-creation is NOT supported for BDD workflows (Cucumber/Gherkin), which have their own template assignment logic
+
 ## Behavior-Driven Development (BDD) Support
 
 The TestRail CLI provides comprehensive support for Behavior-Driven Development workflows using Gherkin syntax. The BDD features enable you to manage test cases written in Gherkin format, execute BDD tests with various frameworks (Cucumber, Behave, pytest-bdd, etc.), and seamlessly upload results to TestRail.
diff --git a/tests/test_ai_evaluation_auto_creation.py b/tests/test_ai_evaluation_auto_creation.py
new file mode 100644
index 00000000..e7fd4814
--- /dev/null
+++ b/tests/test_ai_evaluation_auto_creation.py
@@ -0,0 +1,456 @@
+"""
+Unit tests for AI Evaluation Template auto-creation feature
+
+Tests verify that when using -y flag (auto-creation mode), TRCLI automatically:
+1. Detects AI Evaluation indicators (quality_rating, AI case fields)
+2. Validates AI Evaluation template exists in project
+3. Applies template_id=5 to auto-created test cases
+"""
+
+from pathlib import Path
+from unittest.mock import Mock, MagicMock
+import pytest
+
+from trcli.data_classes.dataclass_testrail import TestRailSuite, TestRailSection, TestRailCase, TestRailResult
+from trcli.data_classes.data_parsers import FieldsParser
+
+
+class TestFieldsParserIntegerConversion:
+    """Test that FieldsParser converts numeric strings to integers"""
+
+    def test_convert_ai_dropdown_fields_to_int(self):
+        """Test that AI dropdown fields are converted to integers"""
+        fields = ["custom_ai_type:1", "custom_ai_model:2"]
+
+        result, error = FieldsParser.resolve_fields(fields)
+
+        assert error is None
+        assert result["custom_ai_type"] == 1  # Should be integer, not string
+        assert result["custom_ai_model"] == 2
+        assert isinstance(result["custom_ai_type"], int)
+        assert isinstance(result["custom_ai_model"], int)
+
+    def test_keep_non_ai_numeric_strings_as_strings(self):
+        """Test that non-AI numeric strings remain as strings"""
+        fields = ["custom_automation_id:1234", "custom_steps:5"]
+
+        result, error = FieldsParser.resolve_fields(fields)
+
+        assert error is None
+        assert result["custom_automation_id"] == "1234"  # Should remain string
+        assert result["custom_steps"] == "5"  # Should remain string
+        assert isinstance(result["custom_automation_id"], str)
+        assert isinstance(result["custom_steps"], str)
+
+    def test_mixed_ai_and_regular_fields(self):
+        """Test that AI fields are converted but regular fields remain strings"""
+        fields = ["custom_ai_type:3", "custom_preconds:AI setup", "custom_ai_model:1", "custom_automation_id:999"]
+
+        result, error = FieldsParser.resolve_fields(fields)
+
+        assert error is None
+        assert result["custom_ai_type"] == 3  # AI field -> integer
+        assert isinstance(result["custom_ai_type"], int)
+        assert result["custom_preconds"] == "AI setup"  # Text field -> string
+        assert isinstance(result["custom_preconds"], str)
+        assert result["custom_ai_model"] == 1  # AI field -> integer
+        assert isinstance(result["custom_ai_model"], int)
+        assert result["custom_automation_id"] == "999"  # Regular numeric field -> string
+        assert isinstance(result["custom_automation_id"], str)
+
+    def test_list_values_remain_lists(self):
+        """Test that list values (using ast.literal_eval) are preserved"""
+        fields = ["custom_steps:[1, 2, 3]", 'custom_tags:["ai", "evaluation"]']
+
+        result, error = FieldsParser.resolve_fields(fields)
+
+        assert error is None
+        assert result["custom_steps"] == [1, 2, 3]
+        assert isinstance(result["custom_steps"], list)
+        assert result["custom_tags"] == ["ai", "evaluation"]
+
+
+class TestAIEvaluationFieldParsing:
+    """Test parsing of AI case fields - integration tests are in test_junit_quality_rating.py"""
+
+    def test_fields_parser_handles_ai_case_fields(self):
+        """Test that FieldsParser correctly processes AI case fields"""
+        # This test validates the core parsing logic that powers XML/Robot parsing
+        case_fields_list = ["custom_ai_type:1", "custom_ai_model:2", "custom_preconds:Setup AI environment"]
+
+        result, error = FieldsParser.resolve_fields(case_fields_list)
+
+        assert error is None
+        assert result["custom_ai_type"] == 1  # Integer conversion
+        assert isinstance(result["custom_ai_type"], int)
+        assert result["custom_ai_model"] == 2  # Integer conversion
+        assert isinstance(result["custom_ai_model"], int)
+        assert result["custom_preconds"] == "Setup AI environment"  # String preserved
+        assert isinstance(result["custom_preconds"], str)
+
+
+class TestAIEvaluationDetection:
+    """Test _should_use_ai_evaluation_template() detection logic"""
+
+    def test_detect_quality_rating_in_results(self):
+        """Test detection when quality_rating is present"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite with quality_rating
+        result = TestRailResult(status_id=1, quality_rating={"factual_accuracy": 5})
+        case = TestRailCase(title="Test", result=result)
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader with mock env and api_request_handler
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        result = uploader._should_use_ai_evaluation_template()
+
+        assert result is True
+        env.vlog.assert_called_with("Detected quality_rating in test results - will use AI Evaluation template")
+
+    def test_detect_ai_case_fields_in_cli(self):
+        """Test detection when AI case fields are in CLI --case-fields"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite without quality_rating
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="Test", result=result)
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader with AI case fields in CLI
+        env = Mock()
+        env.case_fields = {"custom_ai_type": 1, "custom_ai_model": 2}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        result = uploader._should_use_ai_evaluation_template()
+
+        assert result is True
+        env.vlog.assert_called_with("Detected AI case fields in --case-fields - will use AI Evaluation template")
+
+    def test_detect_ai_case_fields_in_xml(self):
+        """Test detection when AI case fields are in XML properties"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite with AI case fields in test case
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="Test", case_fields={"custom_ai_type": 1, "custom_ai_model": 2}, result=result)
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        result = uploader._should_use_ai_evaluation_template()
+
+        assert result is True
+        env.vlog.assert_called_with("Detected AI case fields in XML properties - will use AI Evaluation template")
+
+    def test_no_detection_without_indicators(self):
+        """Test no detection when no AI indicators present"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite without any AI indicators
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="Test", result=result)
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        result = uploader._should_use_ai_evaluation_template()
+
+        assert result is False
+
+
+class TestSelectiveTemplateApplication:
+    """Test that AI Evaluation template is applied selectively per test case"""
+
+    def test_apply_template_only_to_cases_with_quality_rating(self):
+        """Test that only cases with quality_rating get AI template"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite with mixed cases
+        result_with_rating = TestRailResult(status_id=1, quality_rating={"factual_accuracy": 5})
+        result_without_rating = TestRailResult(status_id=1)
+
+        case_with_rating = TestRailCase(title="AI Test", result=result_with_rating)
+        case_without_rating = TestRailCase(title="Regular Test", result=result_without_rating)
+
+        section = TestRailSection(name="Section")
+        section.testcases = [case_with_rating, case_without_rating]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+        env.log = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Test per-case logic
+        assert uploader._test_case_needs_ai_template(case_with_rating) is True
+        assert uploader._test_case_needs_ai_template(case_without_rating) is False
+
+    def test_ai_case_fields_do_not_require_ai_template(self):
+        """Test that AI case fields do NOT require AI template - they work with any template"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite with AI case fields but NO quality_rating in result
+        result = TestRailResult(status_id=1)  # No quality_rating
+
+        case_with_ai_fields = TestRailCase(
+            title="AI Test", case_fields={"custom_ai_type": 1, "custom_ai_model": 2}, result=result
+        )
+
+        section = TestRailSection(name="Section")
+        section.testcases = [case_with_ai_fields]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # AI case fields are just metadata - they do NOT require AI template
+        # Only quality_rating requires AI Evaluation template
+        assert uploader._test_case_needs_ai_template(case_with_ai_fields) is False
+
+    def test_ai_case_fields_with_quality_rating_gets_template(self):
+        """Test that cases with BOTH AI case fields AND quality_rating get AI template"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create case with both AI case fields AND quality_rating
+        result_with_rating = TestRailResult(status_id=1, quality_rating={"factual_accuracy": 5})
+        case_with_both = TestRailCase(
+            title="AI Test", case_fields={"custom_ai_type": 1, "custom_ai_model": 2}, result=result_with_rating
+        )
+
+        section = TestRailSection(name="Section")
+        section.testcases = [case_with_both]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Should need AI template due to quality_rating
+        assert uploader._test_case_needs_ai_template(case_with_both) is True
+
+    def test_mixed_report_selective_template_application(self):
+        """Test full workflow: mixed report with selective template application"""
+        from trcli.api.results_uploader import ResultsUploader
+
+        # Create suite with 3 cases: 2 with quality_rating, 1 without
+        result1 = TestRailResult(status_id=1, quality_rating={"factual_accuracy": 5})
+        result2 = TestRailResult(status_id=1, quality_rating={"coherence": 4})
+        result3 = TestRailResult(status_id=1)  # No quality_rating
+
+        case1 = TestRailCase(title="AI Test 1", result=result1)
+        case2 = TestRailCase(title="AI Test 2", result=result2)
+        case3 = TestRailCase(title="Regular Test", result=result3)
+
+        section = TestRailSection(name="Section")
+        section.testcases = [case1, case2, case3]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create uploader and mock project
+        env = Mock()
+        env.case_fields = {}
+        env.vlog = Mock()
+        env.log = Mock()
+
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+        api_handler.validate_ai_evaluation_template = Mock(return_value=(True, "", 10))
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+        uploader.project = Mock()
+        uploader.project.project_id = 1
+
+        # Apply template
+        uploader._apply_ai_evaluation_template()
+
+        # Verify: cases 1 and 2 should have template_id=10, case 3 should not
+        assert case1.template_id == 10
+        assert case2.template_id == 10
+        assert case3.template_id is None  # No template set
+
+        # Verify log message
+        env.log.assert_any_call(
+            "Using AI Evaluation template (ID: 10) for 2 test case(s), 1 test case(s) will use default template"
+        )
+
+
+class TestValidateAIEvaluationTemplate:
+    """Test validate_ai_evaluation_template API method"""
+
+    def test_validate_template_exists_by_id(self):
+        """Test validation succeeds when template ID 5 exists"""
+        from trcli.api.api_request_handler import ApiRequestHandler
+
+        mock_client = Mock()
+        mock_response = Mock()
+        mock_response.status_code = 200
+        mock_response.error_message = None
+        mock_response.response_text = [
+            {"id": 1, "name": "Test Case (Text)"},
+            {"id": 5, "name": "AI Evaluation", "i18n_custom_id": "templates_ai_evaluation"},
+            {"id": 2, "name": "Test Case (Steps)"},
+        ]
+        mock_client.send_get.return_value = mock_response
+
+        # Create handler using __new__ to bypass __init__
+        handler = ApiRequestHandler.__new__(ApiRequestHandler)
+        handler.client = mock_client
+        handler.environment = Mock()
+        handler.environment.vlog = Mock()
+
+        exists, error, template_id = handler.validate_ai_evaluation_template(project_id=1)
+
+        assert exists is True
+        assert error == ""
+        assert template_id == 5
+        mock_client.send_get.assert_called_once_with("get_templates/1")
+
+    def test_validate_template_exists_by_i18n(self):
+        """Test validation succeeds when template has i18n_custom_id with non-standard ID"""
+        from trcli.api.api_request_handler import ApiRequestHandler
+
+        mock_client = Mock()
+        mock_response = Mock()
+        mock_response.status_code = 200
+        mock_response.error_message = None
+        mock_response.response_text = [
+            {"id": 10, "name": "AI Evaluation Custom", "i18n_custom_id": "templates_ai_evaluation"}
+        ]
+        mock_client.send_get.return_value = mock_response
+
+        handler = ApiRequestHandler.__new__(ApiRequestHandler)
+        handler.client = mock_client
+        handler.environment = Mock()
+        handler.environment.vlog = Mock()
+
+        exists, error, template_id = handler.validate_ai_evaluation_template(project_id=1)
+
+        assert exists is True
+        assert error == ""
+        assert template_id == 10  # Returns actual ID, not hardcoded 5
+
+    def test_validate_template_not_found(self):
+        """Test validation fails when template doesn't exist"""
+        from trcli.api.api_request_handler import ApiRequestHandler
+
+        mock_client = Mock()
+        mock_response = Mock()
+        mock_response.status_code = 200
+        mock_response.error_message = None
+        mock_response.response_text = [{"id": 1, "name": "Test Case (Text)"}, {"id": 2, "name": "Test Case (Steps)"}]
+        mock_client.send_get.return_value = mock_response
+
+        handler = ApiRequestHandler.__new__(ApiRequestHandler)
+        handler.client = mock_client
+        handler.environment = Mock()
+        handler.environment.vlog = Mock()
+
+        exists, error, template_id = handler.validate_ai_evaluation_template(project_id=1)
+
+        assert exists is False
+        assert "AI Evaluation template" in error
+        assert "not enabled" in error
+        assert "To enable AI Evaluation template" in error
+        assert template_id == 0  # Returns 0 when not found
+
+    def test_validate_template_api_error(self):
+        """Test validation handles API errors gracefully"""
+        from trcli.api.api_request_handler import ApiRequestHandler
+
+        mock_client = Mock()
+        mock_response = Mock()
+        mock_response.status_code = 403
+        mock_response.error_message = "Insufficient permissions"
+        mock_response.response_text = None
+        mock_client.send_get.return_value = mock_response
+
+        handler = ApiRequestHandler.__new__(ApiRequestHandler)
+        handler.client = mock_client
+        handler.environment = Mock()
+        handler.environment.vlog = Mock()
+
+        exists, error, template_id = handler.validate_ai_evaluation_template(project_id=1)
+
+        assert exists is False
+        assert "Insufficient permissions" in error
+        assert template_id == 0  # Returns 0 on API error
diff --git a/tests/test_data/XML/ai_eval_auto_create.xml b/tests/test_data/XML/ai_eval_auto_create.xml
new file mode 100644
index 00000000..41f0160e
--- /dev/null
+++ b/tests/test_data/XML/ai_eval_auto_create.xml
@@ -0,0 +1,84 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+Sample JUnit XML demonstrating AI Evaluation Template auto-creation with -y flag
+
+This file shows how to trigger automatic case creation with AI Evaluation template by including:
+1. quality_rating field in test results (overall AI quality assessment)
+2. AI case fields (custom_ai_type, custom_ai_model) in properties
+3. Step-level results for multi-step AI workflows
+
+When uploaded with -y flag, TRCLI will:
+- Detect AI Evaluation indicators
+- Validate AI Evaluation template exists (ID: 5)
+- Auto-create cases with template_id=5
+- Support both quality_rating AND step-level results
+-->
+<testsuites name="RAG Pipeline - Auto Creation Demo" tests="3" failures="1" errors="0" time="45.0">
+
+  <testsuite name="Document QA Pipeline" tests="3" failures="1" errors="0" time="45.0">
+
+    <!-- Example 1: Auto-create case with quality_rating only -->
+    <testcase classname="ai.rag.document_qa" name="test_simple_qa_with_quality" time="12.0">
+      <properties>
+        <!-- Automation ID for matching (supports -y flag) -->
+        <property name="test_id" value="ai.rag.document_qa.test_simple_qa_with_quality"/>
+
+        <!-- Overall quality rating triggers AI Evaluation template -->
+        <property name="quality_rating" value='{"factual_accuracy": 5, "coherence": 5, "completeness": 4, "relevance": 5}'/>
+
+        <!-- AI result fields for observability -->
+        <property name="testrail_result_field" value="custom_ai_input:What is machine learning?"/>
+        <property name="testrail_result_field" value="custom_ai_output:Machine learning is a subset of AI that enables systems to learn from data."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/001"/>
+      </properties>
+    </testcase>
+
+    <!-- Example 2: Auto-create case with AI case fields (dropdown values) -->
+    <testcase classname="ai.llm.chatbot" name="test_chatbot_with_case_fields" time="8.5">
+      <properties>
+        <property name="test_id" value="ai.llm.chatbot.test_chatbot_with_case_fields"/>
+
+        <!-- AI case fields trigger AI Evaluation template -->
+        <!-- custom_ai_type: 1=RAG, 2=ML, 3=LLM -->
+        <property name="testrail_case_field" value="custom_ai_type:3"/>
+        <!-- custom_ai_model: 1=GPT-5, 2=Gemini 3, 3=Sonnet 3.5 -->
+        <property name="testrail_case_field" value="custom_ai_model:1"/>
+
+        <!-- Optional: Add quality rating for this test -->
+        <property name="quality_rating" value='{"factual_accuracy": 4, "coherence": 5}'/>
+
+        <property name="testrail_result_field" value="custom_ai_input:Hello, how are you?"/>
+        <property name="testrail_result_field" value="custom_ai_output:I'm doing well, thank you for asking!"/>
+      </properties>
+    </testcase>
+
+    <!-- Example 3: Auto-create case with quality_rating AND multi-step results -->
+    <testcase classname="ai.rag.document_retrieval" name="test_rag_full_pipeline_multistep" time="24.5">
+      <properties>
+        <property name="test_id" value="ai.rag.document_retrieval.test_rag_full_pipeline_multistep"/>
+
+        <!-- Step-level execution tracking (works with AI Evaluation template) -->
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="passed:Step 2 Vector Search"/>
+        <property name="testrail_result_step" value="passed:Step 3 Reranking"/>
+        <property name="testrail_result_step" value="passed:Step 4 Answer Generation"/>
+        <property name="testrail_result_step" value="passed:Step 5 Response Validation"/>
+
+        <!-- Overall quality rating (applies to entire test, not per-step) -->
+        <property name="quality_rating" value='{"factual_accuracy": 5, "coherence": 5, "completeness": 5, "relevance": 5}'/>
+
+        <!-- AI case fields for metadata -->
+        <property name="testrail_case_field" value="custom_ai_type:1"/>
+        <property name="testrail_case_field" value="custom_ai_model:3"/>
+
+        <!-- AI result fields -->
+        <property name="testrail_result_field" value="custom_ai_input:Explain quantum entanglement in simple terms"/>
+        <property name="testrail_result_field" value="custom_ai_output:Quantum entanglement is a phenomenon where two particles become connected..."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/002"/>
+        <property name="testrail_result_field" value="custom_ai_latency:24.5 seconds"/>
+      </properties>
+    </testcase>
+
+  </testsuite>
+
+</testsuites>
diff --git a/tests/test_data/XML/sample_ai_eval_multistep_workflow.xml b/tests/test_data/XML/sample_ai_eval_multistep_workflow.xml
new file mode 100644
index 00000000..6f8220be
--- /dev/null
+++ b/tests/test_data/XML/sample_ai_eval_multistep_workflow.xml
@@ -0,0 +1,90 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="RAG Pipeline - AI Evaluation" tests="3" failures="2" errors="0" time="35.2">
+
+  <!-- Suite 1: Document QA Tests -->
+  <testsuite name="Document QA RAG Pipeline" tests="3" failures="2" errors="0" time="35.2">
+
+    <!-- Test 1: Successful RAG Pipeline (All Steps Pass) -->
+    <testcase classname="ai.rag.DocumentQA" name="C1000_test_rag_pipeline_success" time="12.5">
+      <properties>
+        <property name="test_id" value="C1000"/>
+
+        <!-- Step-Level Execution Tracking -->
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="passed:Step 2 Document Retrieval"/>
+        <property name="testrail_result_step" value="passed:Step 3 Answer Generation"/>
+        <property name="testrail_result_step" value="passed:Step 4 Response Validation"/>
+
+        <!-- Overall Quality Rating (High Quality) -->
+        <property name="quality_rating" value='{"factual_accuracy": 5, "coherence": 5, "completeness": 4, "relevance": 5}'/>
+
+        <!-- AI Context Fields -->
+        <property name="testrail_result_field" value="custom_ai_input:What is the capital of France?"/>
+        <property name="testrail_result_field" value="custom_ai_output:The capital of France is Paris. Paris is the largest city in France and has been the capital since 987 AD."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/rag-success-001"/>
+        <property name="testrail_result_field" value="custom_ai_latency:12.5 seconds"/>
+      </properties>
+    </testcase>
+
+    <!-- Test 2: Failed Answer Generation (Step 3 Fails) -->
+    <testcase classname="ai.rag.DocumentQA" name="C1001_test_rag_pipeline_factual_error" time="10.5">
+      <properties>
+        <property name="test_id" value="C1001"/>
+
+        <!-- Step-Level Execution Tracking -->
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="passed:Step 2 Document Retrieval"/>
+        <property name="testrail_result_step" value="failed:Step 3 Answer Generation"/>
+        <property name="testrail_result_step" value="untested:Step 4 Response Validation"/>
+
+        <!-- Overall Quality Rating (Low Due to Factual Error) -->
+        <property name="quality_rating" value='{"factual_accuracy": 1, "coherence": 3, "completeness": 2, "relevance": 2}'/>
+
+        <!-- AI Context Fields -->
+        <property name="testrail_result_field" value="custom_ai_input:What programming language is primarily used for machine learning?"/>
+        <property name="testrail_result_field" value="custom_ai_output:JavaScript is the primary language for machine learning, widely used in neural networks and deep learning."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/rag-failure-001"/>
+        <property name="testrail_result_field" value="custom_ai_latency:10.5 seconds"/>
+      </properties>
+      <failure message="Answer generation produced factually incorrect response">
+        Expected: Python is the primary language for machine learning
+        Actual: JavaScript is the primary language for machine learning
+
+        Issue: Model hallucinated incorrect information despite correct document retrieval
+        Impact: Users receive misleading information that could affect decision-making
+      </failure>
+    </testcase>
+
+    <!-- Test 3: Document Retrieval Failure (Step 2 Fails) -->
+    <testcase classname="ai.rag.DocumentQA" name="C1002_test_rag_pipeline_retrieval_failure" time="12.2">
+      <properties>
+        <property name="test_id" value="C1002"/>
+
+        <!-- Step-Level Execution Tracking -->
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="failed:Step 2 Document Retrieval"/>
+        <property name="testrail_result_step" value="untested:Step 3 Answer Generation"/>
+        <property name="testrail_result_step" value="untested:Step 4 Response Validation"/>
+
+        <!-- Overall Quality Rating (Low Due to No Relevant Documents) -->
+        <property name="quality_rating" value='{"factual_accuracy": 0, "coherence": 1, "completeness": 0, "relevance": 1}'/>
+
+        <!-- AI Context Fields -->
+        <property name="testrail_result_field" value="custom_ai_input:Explain the Heisenberg uncertainty principle in quantum mechanics"/>
+        <property name="testrail_result_field" value="custom_ai_output:I don't have enough information to answer your question about the Heisenberg uncertainty principle."/>
+        <property name="testrail_result_field" value="custom_ai_traces:https://logs.example.com/trace/rag-retrieval-failure-001"/>
+        <property name="testrail_result_field" value="custom_ai_latency:12.2 seconds"/>
+      </properties>
+      <failure message="Document retrieval failed to find relevant sources">
+        Expected: Retrieved at least 3 relevant documents about quantum mechanics
+        Actual: Retrieved 0 relevant documents (only found documents about classical physics)
+
+        Issue: Vector search embeddings failed to capture semantic meaning of quantum mechanics query
+        Impact: System cannot provide accurate answers for domain-specific questions
+        Recommendation: Retrain embedding model with physics-domain knowledge or use specialized vector database
+      </failure>
+    </testcase>
+
+  </testsuite>
+
+</testsuites>
diff --git a/tests/test_junit_quality_rating.py b/tests/test_junit_quality_rating.py
index 7555e78a..116694db 100644
--- a/tests/test_junit_quality_rating.py
+++ b/tests/test_junit_quality_rating.py
@@ -259,3 +259,253 @@ def test_backward_compatibility_no_quality_rating(self, env, tmp_path):
         assert "case_id" in result_dict
         assert "status_id" in result_dict
         assert "custom_field" in result_dict
+
+    # ========== Step-Level Results with Quality Rating ==========
+
+    def test_step_level_results_with_quality_rating(self, env, tmp_path):
+        """Test AI Evaluation with step-level results and overall quality rating"""
+        xml_content = """<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="AI Tests" tests="1" failures="1" errors="0" time="10.0">
+  <testsuite name="Multi-Step AI Workflow" tests="1" failures="1" errors="0" time="10.0">
+    <testcase classname="ai.RAGPipeline" name="C500_test_rag_pipeline" time="10.0">
+      <properties>
+        <property name="test_id" value="C500"/>
+        <property name="testrail_result_step" value="passed:Step 1 Query Understanding"/>
+        <property name="testrail_result_step" value="passed:Step 2 Document Retrieval"/>
+        <property name="testrail_result_step" value="failed:Step 3 Answer Generation"/>
+        <property name="testrail_result_step" value="untested:Step 4 Response Validation"/>
+        <property name="quality_rating" value='{"factual_accuracy": 2, "coherence": 3, "completeness": 1}'/>
+        <property name="testrail_result_field" value="custom_ai_input:What is Python?"/>
+        <property name="testrail_result_field" value="custom_ai_output:Python is a snake..."/>
+      </properties>
+      <failure message="Answer generation produced factually incorrect response"/>
+    </testcase>
+  </testsuite>
+</testsuites>"""
+
+        xml_file = tmp_path / "test_step_level_quality.xml"
+        xml_file.write_text(xml_content)
+
+        env.file = xml_file
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        test_case = suites[0].testsections[0].testcases[0]
+        result = test_case.result
+
+        # Verify step-level results
+        assert len(result.custom_step_results) == 4
+        assert result.custom_step_results[0].content == "Step 1 Query Understanding"
+        assert result.custom_step_results[0].status_id == 1  # Passed
+        assert result.custom_step_results[1].content == "Step 2 Document Retrieval"
+        assert result.custom_step_results[1].status_id == 1  # Passed
+        assert result.custom_step_results[2].content == "Step 3 Answer Generation"
+        assert result.custom_step_results[2].status_id == 5  # Failed
+        assert result.custom_step_results[3].content == "Step 4 Response Validation"
+        assert result.custom_step_results[3].status_id == 3  # Untested
+
+        # Verify overall quality rating
+        assert result.quality_rating == {"factual_accuracy": 2, "coherence": 3, "completeness": 1}
+
+        # Verify overall test status is failed
+        assert result.status_id == 5
+
+    def test_step_level_serialization_with_quality_rating(self, env, tmp_path):
+        """Test that step-level results and quality rating serialize correctly"""
+        xml_content = """<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="AI Tests" tests="1" failures="0" errors="0" time="5.0">
+  <testsuite name="Success Flow" tests="1" failures="0" errors="0" time="5.0">
+    <testcase classname="ai.ChatBot" name="C501_test_chatbot_steps" time="5.0">
+      <properties>
+        <property name="test_id" value="C501"/>
+        <property name="testrail_result_step" value="passed:Intent Detection"/>
+        <property name="testrail_result_step" value="passed:Response Generation"/>
+        <property name="testrail_result_step" value="passed:Quality Check"/>
+        <property name="quality_rating" value='{"accuracy": 5, "relevance": 5, "tone": 4}'/>
+      </properties>
+    </testcase>
+  </testsuite>
+</testsuites>"""
+
+        xml_file = tmp_path / "test_step_serialization.xml"
+        xml_file.write_text(xml_content)
+
+        env.file = xml_file
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        test_case = suites[0].testsections[0].testcases[0]
+        result_dict = test_case.result.to_dict()
+
+        # Verify custom_step_results serialization
+        assert "custom_step_results" in result_dict
+        assert len(result_dict["custom_step_results"]) == 3
+        assert result_dict["custom_step_results"][0]["content"] == "Intent Detection"
+        assert result_dict["custom_step_results"][0]["status_id"] == 1
+        assert result_dict["custom_step_results"][1]["content"] == "Response Generation"
+        assert result_dict["custom_step_results"][1]["status_id"] == 1
+        assert result_dict["custom_step_results"][2]["content"] == "Quality Check"
+        assert result_dict["custom_step_results"][2]["status_id"] == 1
+
+        # Verify quality_rating at root level
+        assert "quality_rating" in result_dict
+        assert result_dict["quality_rating"] == {"accuracy": 5, "relevance": 5, "tone": 4}
+
+    def test_step_level_mixed_statuses(self, env, tmp_path):
+        """Test step-level results with various status combinations"""
+        xml_content = """<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="AI Tests" tests="1" failures="0" errors="0" time="3.0">
+  <testsuite name="Partial Success" tests="1" failures="0" errors="0" time="3.0">
+    <testcase classname="ai.Pipeline" name="C502_test_mixed_steps" time="3.0">
+      <properties>
+        <property name="test_id" value="C502"/>
+        <property name="testrail_result_step" value="passed:Pre-processing"/>
+        <property name="testrail_result_step" value="skipped:Optional Enhancement"/>
+        <property name="testrail_result_step" value="passed:Final Output"/>
+        <property name="quality_rating" value='{"quality": 4}'/>
+      </properties>
+    </testcase>
+  </testsuite>
+</testsuites>"""
+
+        xml_file = tmp_path / "test_mixed_steps.xml"
+        xml_file.write_text(xml_content)
+
+        env.file = xml_file
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        test_case = suites[0].testsections[0].testcases[0]
+        result = test_case.result
+
+        # Verify all step statuses
+        assert len(result.custom_step_results) == 3
+        assert result.custom_step_results[0].status_id == 1  # Passed
+        assert result.custom_step_results[1].status_id == 4  # Skipped
+        assert result.custom_step_results[2].status_id == 1  # Passed
+
+        # Overall test should pass (no failures)
+        assert result.status_id == 1
+
+        # Quality rating should be preserved
+        assert result.quality_rating == {"quality": 4}
+
+    def test_step_level_without_quality_rating(self, env, tmp_path):
+        """Test that step-level results work without quality rating (backward compatibility)"""
+        xml_content = """<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="Tests" tests="1" failures="0" errors="0" time="2.0">
+  <testsuite name="Basic Steps" tests="1" failures="0" errors="0" time="2.0">
+    <testcase classname="test.Steps" name="C503_test_steps_only" time="2.0">
+      <properties>
+        <property name="test_id" value="C503"/>
+        <property name="testrail_result_step" value="passed:Step 1"/>
+        <property name="testrail_result_step" value="passed:Step 2"/>
+      </properties>
+    </testcase>
+  </testsuite>
+</testsuites>"""
+
+        xml_file = tmp_path / "test_steps_no_rating.xml"
+        xml_file.write_text(xml_content)
+
+        env.file = xml_file
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        test_case = suites[0].testsections[0].testcases[0]
+        result_dict = test_case.result.to_dict()
+
+        # Should have steps
+        assert "custom_step_results" in result_dict
+        assert len(result_dict["custom_step_results"]) == 2
+
+        # Should NOT have quality_rating
+        assert "quality_rating" not in result_dict
+
+    def test_quality_rating_without_steps(self, env, tmp_path):
+        """Test that quality rating works without step-level results"""
+        xml_content = """<?xml version="1.0" encoding="UTF-8"?>
+<testsuites name="Tests" tests="1" failures="0" errors="0" time="1.0">
+  <testsuite name="No Steps" tests="1" failures="0" errors="0" time="1.0">
+    <testcase classname="test.Simple" name="C504_test_quality_only" time="1.0">
+      <properties>
+        <property name="test_id" value="C504"/>
+        <property name="quality_rating" value='{"accuracy": 5}'/>
+      </properties>
+    </testcase>
+  </testsuite>
+</testsuites>"""
+
+        xml_file = tmp_path / "test_rating_no_steps.xml"
+        xml_file.write_text(xml_content)
+
+        env.file = xml_file
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        test_case = suites[0].testsections[0].testcases[0]
+        result_dict = test_case.result.to_dict()
+
+        # Should have quality_rating
+        assert "quality_rating" in result_dict
+        assert result_dict["quality_rating"] == {"accuracy": 5}
+
+        # Should NOT have custom_step_results (empty list skipped by serialization)
+        assert "custom_step_results" not in result_dict or result_dict["custom_step_results"] == []
+
+    def test_parse_sample_multistep_workflow(self, env):
+        """Test parsing the sample multi-step AI evaluation workflow file"""
+        env.file = Path(__file__).parent / "test_data/XML/sample_ai_eval_multistep_workflow.xml"
+        parser = JunitParser(env)
+        suites = parser.parse_file()
+
+        assert len(suites) == 1
+        suite = suites[0]
+        assert len(suite.testsections) == 1
+        section = suite.testsections[0]
+        assert len(section.testcases) == 3
+
+        # Test 1: All steps pass
+        test1 = section.testcases[0]
+        assert test1.result.case_id == 1000
+        assert test1.result.status_id == 1  # Passed
+        assert len(test1.result.custom_step_results) == 4
+        assert all(step.status_id == 1 for step in test1.result.custom_step_results)  # All passed
+        assert test1.result.quality_rating == {
+            "factual_accuracy": 5,
+            "coherence": 5,
+            "completeness": 4,
+            "relevance": 5,
+        }
+
+        # Test 2: Step 3 fails
+        test2 = section.testcases[1]
+        assert test2.result.case_id == 1001
+        assert test2.result.status_id == 5  # Failed
+        assert len(test2.result.custom_step_results) == 4
+        assert test2.result.custom_step_results[0].status_id == 1  # Step 1 passed
+        assert test2.result.custom_step_results[1].status_id == 1  # Step 2 passed
+        assert test2.result.custom_step_results[2].status_id == 5  # Step 3 failed
+        assert test2.result.custom_step_results[3].status_id == 3  # Step 4 untested
+        assert test2.result.quality_rating == {
+            "factual_accuracy": 1,
+            "coherence": 3,
+            "completeness": 2,
+            "relevance": 2,
+        }
+
+        # Test 3: Step 2 fails
+        test3 = section.testcases[2]
+        assert test3.result.case_id == 1002
+        assert test3.result.status_id == 5  # Failed
+        assert len(test3.result.custom_step_results) == 4
+        assert test3.result.custom_step_results[0].status_id == 1  # Step 1 passed
+        assert test3.result.custom_step_results[1].status_id == 5  # Step 2 failed
+        assert test3.result.custom_step_results[2].status_id == 3  # Step 3 untested
+        assert test3.result.custom_step_results[3].status_id == 3  # Step 4 untested
+        assert test3.result.quality_rating == {
+            "factual_accuracy": 0,
+            "coherence": 1,
+            "completeness": 0,
+            "relevance": 1,
+        }
diff --git a/tests/test_update_existing_cases_case_fields.py b/tests/test_update_existing_cases_case_fields.py
new file mode 100644
index 00000000..a1f9ee67
--- /dev/null
+++ b/tests/test_update_existing_cases_case_fields.py
@@ -0,0 +1,183 @@
+"""
+Unit tests for updating existing cases with case fields via --update-existing-cases yes
+"""
+
+from unittest.mock import Mock
+import pytest
+
+from trcli.api.results_uploader import ResultsUploader
+from trcli.data_classes.dataclass_testrail import TestRailSuite, TestRailSection, TestRailCase, TestRailResult
+
+
+class TestUpdateExistingCasesWithCaseFields:
+    """Test that --update-existing-cases yes properly updates case fields"""
+
+    def test_global_case_fields_applied_to_existing_cases(self):
+        """Test that global --case-fields are applied before updating existing cases"""
+        # Create suite with existing case (has case_id)
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="Existing Test", case_id=1234, result=result)  # Existing case
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create environment with global case fields
+        env = Mock()
+        env.case_fields = {"custom_ai_type": 1, "custom_ai_model": 2}
+        env.update_existing_cases = "yes"
+        env.vlog = Mock()
+        env.log = Mock()
+        env.elog = Mock()
+
+        # Create uploader
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+        api_handler.update_existing_case_references = Mock(
+            return_value=(True, None, [], [], ["custom_ai_type", "custom_ai_model"])
+        )
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Call update method
+        update_results, failed_cases = uploader.update_existing_cases_with_junit_refs(added_test_cases=None)
+
+        # Verify global case fields were applied
+        assert case.case_fields["custom_ai_type"] == 1
+        assert case.case_fields["custom_ai_model"] == 2
+
+        # Verify update was called with the case fields
+        api_handler.update_existing_case_references.assert_called_once()
+        call_args = api_handler.update_existing_case_references.call_args
+        assert call_args[0][0] == 1234  # case_id
+        assert call_args[0][2]["custom_ai_type"] == 1  # case_fields
+        assert call_args[0][2]["custom_ai_model"] == 2
+
+        # Verify results
+        assert len(update_results["updated_cases"]) == 1
+        assert update_results["updated_cases"][0]["case_id"] == 1234
+        assert "custom_ai_type" in update_results["updated_cases"][0]["updated_fields"]
+        assert "custom_ai_model" in update_results["updated_cases"][0]["updated_fields"]
+
+    def test_xml_case_fields_override_global(self):
+        """Test that XML case fields override global CLI case fields"""
+        # Create suite with existing case that has XML case fields
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(
+            title="Existing Test",
+            case_id=5678,
+            case_fields={"custom_ai_type": 3},  # XML specifies type=3
+            result=result,
+        )
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create environment with global case fields
+        env = Mock()
+        env.case_fields = {"custom_ai_type": 1, "custom_ai_model": 2}  # CLI specifies type=1
+        env.update_existing_cases = "yes"
+        env.vlog = Mock()
+        env.log = Mock()
+        env.elog = Mock()
+
+        # Create uploader
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+        api_handler.update_existing_case_references = Mock(
+            return_value=(True, None, [], [], ["custom_ai_type", "custom_ai_model"])
+        )
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Call update method
+        update_results, failed_cases = uploader.update_existing_cases_with_junit_refs(added_test_cases=None)
+
+        # Verify XML value (3) takes precedence over global CLI value (1)
+        assert case.case_fields["custom_ai_type"] == 3  # Should be 3 from XML, not 1 from CLI
+        assert case.case_fields["custom_ai_model"] == 2  # Should be 2 from CLI (not in XML)
+
+        # Verify update was called with merged case fields
+        call_args = api_handler.update_existing_case_references.call_args
+        assert call_args[0][2]["custom_ai_type"] == 3  # XML value
+        assert call_args[0][2]["custom_ai_model"] == 2  # CLI value
+
+    def test_newly_created_cases_excluded_from_update(self):
+        """Test that newly created cases are excluded from update"""
+        # Create suite with a newly created case
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="New Test", case_id=9999, result=result)  # This case was just created
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create environment
+        env = Mock()
+        env.case_fields = {"custom_ai_type": 1}
+        env.update_existing_cases = "yes"
+        env.vlog = Mock()
+        env.log = Mock()
+        env.elog = Mock()
+
+        # Create uploader
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+        api_handler.update_existing_case_references = Mock()
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Call update method with case 9999 in added_test_cases (newly created)
+        added_test_cases = [{"case_id": 9999}]
+        update_results, failed_cases = uploader.update_existing_cases_with_junit_refs(added_test_cases=added_test_cases)
+
+        # Verify update was NOT called (case was excluded)
+        api_handler.update_existing_case_references.assert_not_called()
+
+        # Verify no cases were updated (newly created cases are silently excluded)
+        assert len(update_results["updated_cases"]) == 0
+        assert len(failed_cases) == 0
+
+    def test_no_case_fields_skips_update(self):
+        """Test that cases without case fields or refs are skipped"""
+        # Create suite with existing case but no case fields
+        result = TestRailResult(status_id=1)
+        case = TestRailCase(title="Existing Test", case_id=1111, result=result)
+        section = TestRailSection(name="Section")
+        section.testcases = [case]
+        suite = TestRailSuite(name="Suite")
+        suite.testsections = [section]
+
+        # Create environment with NO global case fields
+        env = Mock()
+        env.case_fields = {}  # No global case fields
+        env.update_existing_cases = "yes"
+        env.vlog = Mock()
+        env.log = Mock()
+        env.elog = Mock()
+
+        # Create uploader
+        api_handler = Mock()
+        api_handler.suites_data_from_provider = suite
+        api_handler.update_existing_case_references = Mock()
+
+        uploader = ResultsUploader.__new__(ResultsUploader)
+        uploader.environment = env
+        uploader.api_request_handler = api_handler
+
+        # Call update method
+        update_results, failed_cases = uploader.update_existing_cases_with_junit_refs(added_test_cases=None)
+
+        # Verify update was NOT called (no case fields to update)
+        api_handler.update_existing_case_references.assert_not_called()
+
+        # Verify no cases were updated
+        assert len(update_results["updated_cases"]) == 0
+        assert len(update_results["skipped_cases"]) == 0
diff --git a/trcli/api/api_request_handler.py b/trcli/api/api_request_handler.py
index cf44c017..524fd501 100644
--- a/trcli/api/api_request_handler.py
+++ b/trcli/api/api_request_handler.py
@@ -1072,3 +1072,78 @@ def add_case_bdd(
         self, section_id: int, title: str, bdd_content: str, template_id: int, tags: List[str] = None
     ) -> Tuple[int, str]:
         return self.bdd_handler.add_case_bdd(section_id, title, bdd_content, template_id, tags)
+
+    def validate_ai_evaluation_template(self, project_id: int) -> Tuple[bool, str, int]:
+        """
+        Validate that AI Evaluation template exists in the project
+
+        Args:
+            project_id: TestRail project ID
+
+        Returns:
+            Tuple of (exists, error_message, template_id)
+            - exists: True if AI Evaluation template is enabled, False otherwise
+            - error_message: Empty string on success, error details on failure
+            - template_id: The actual template ID from TestRail (0 if not found)
+
+        Note:
+            The AI Evaluation template is identified by i18n_custom_id "templates_ai_evaluation".
+            We check only by i18n_custom_id (not template ID) because the ID can vary depending
+            on when custom templates were created in the instance.
+        """
+        self.environment.vlog(f"Validating AI Evaluation template for project {project_id}")
+        response = self.client.send_get(f"get_templates/{project_id}")
+
+        if response.status_code == 200:
+            templates = response.response_text
+            if isinstance(templates, list):
+                self.environment.vlog(f"Retrieved {len(templates)} template(s) from TestRail")
+
+                # Log all available templates for debugging
+                if templates:
+                    self.environment.vlog("Available templates:")
+                    for template in templates:
+                        template_id = template.get("id")
+                        template_name = template.get("name", "")
+                        template_i18n = template.get("i18n_custom_id", "")
+                        self.environment.vlog(f"  - ID {template_id}: '{template_name}' ({template_i18n})")
+
+                # Look for AI Evaluation template by i18n_custom_id (system identifier)
+                for template in templates:
+                    template_id = template.get("id")
+                    template_name = template.get("name", "")
+                    template_i18n = template.get("i18n_custom_id", "")
+
+                    if template_i18n == "templates_ai_evaluation":
+                        self.environment.vlog(
+                            f"  ✓ MATCH: Found AI Evaluation template '{template_name}' (ID: {template_id})"
+                        )
+                        self.environment.log(f"AI Evaluation template is enabled in this project.")
+                        return True, "", template_id
+
+                # Build detailed error message
+                error_parts = [
+                    "AI Evaluation template is not enabled in this project.",
+                    "This feature requires the AI Evaluation template to be enabled in TestRail.",
+                ]
+                if templates:
+                    template_list = ", ".join([f"'{t.get('name', 'Unknown')}' (ID: {t.get('id')})" for t in templates])
+                    error_parts.append(f"Available templates: {template_list}")
+                    error_parts.append(
+                        "\nTo enable AI Evaluation template:\n"
+                        "1. Go to TestRail Administration > Customizations > Templates\n"
+                        "2. Enable 'AI Evaluation' template for your project"
+                    )
+                else:
+                    error_parts.append("No templates are available in this project.")
+
+                self.environment.elog("\n".join(error_parts))
+                return False, "\n".join(error_parts), 0
+            else:
+                error_msg = "Unexpected response format from get_templates"
+                self.environment.elog(error_msg)
+                return False, error_msg, 0
+        else:
+            error_msg = response.error_message or f"Failed to get templates (HTTP {response.status_code})"
+            self.environment.elog(error_msg)
+            return False, error_msg, 0
diff --git a/trcli/api/results_uploader.py b/trcli/api/results_uploader.py
index fdb4b579..487c529f 100644
--- a/trcli/api/results_uploader.py
+++ b/trcli/api/results_uploader.py
@@ -80,7 +80,12 @@ def upload_results(self):
                 self.environment.log("\n".join(revert_logs))
                 exit(1)
 
+            # Detect if AI Evaluation template should be used for auto-created cases
             if missing_test_cases:
+                use_ai_evaluation = self._should_use_ai_evaluation_template()
+                if use_ai_evaluation:
+                    self._apply_ai_evaluation_template()
+
                 added_test_cases, result_code = self.add_missing_test_cases()
             else:
                 result_code = 1
@@ -127,13 +132,12 @@ def upload_results(self):
         case_update_results = None
         case_update_failed = []
         if hasattr(self.environment, "update_existing_cases") and self.environment.update_existing_cases == "yes":
-            self.environment.log("Updating existing cases with JUnit references...")
+            self.environment.log("Updating existing cases...")
             case_update_results, case_update_failed = self.update_existing_cases_with_junit_refs(added_test_cases)
 
             if case_update_results.get("updated_cases"):
-                self.environment.log(
-                    f"Updated {len(case_update_results['updated_cases'])} existing case(s) with references."
-                )
+                updated_count = len(case_update_results["updated_cases"])
+                self.environment.log(f"Updated {updated_count} existing case(s).")
             if case_update_results.get("failed_cases"):
                 self.environment.elog(f"Failed to update {len(case_update_results['failed_cases'])} case(s).")
 
@@ -264,6 +268,16 @@ def update_existing_cases_with_junit_refs(self, added_test_cases: List[Dict] = N
 
         strategy = getattr(self.environment, "update_strategy", "append")
 
+        # Apply global case fields from CLI to all test cases
+        # This ensures --case-fields values are merged into test case objects
+        global_case_fields = getattr(self.environment, "case_fields", {}) or {}
+        if global_case_fields:
+            self.environment.vlog(f"Applying global case fields: {global_case_fields}")
+            for section in self.api_request_handler.suites_data_from_provider.testsections:
+                for test_case in section.testcases:
+                    if test_case.case_id:  # Only for existing cases
+                        test_case.add_global_case_fields(global_case_fields)
+
         # Process all test cases in all sections
         for section in self.api_request_handler.suites_data_from_provider.testsections:
             for test_case in section.testcases:
@@ -441,3 +455,114 @@ def rollback_changes(
             else:
                 returned_log.append(RevertMessages.suite_deleted)
         return returned_log
+
+    def _should_use_ai_evaluation_template(self) -> bool:
+        """
+        Determine if AI Evaluation template should be used for auto-created test cases.
+
+        Checks for:
+        1. presence of quality_rating in any test result
+        2. AI case fields (custom_ai_type, custom_ai_model) in CLI --case-fields
+        3. AI case fields in XML properties (testrail_case_field)
+
+        Returns:
+            True if AI Evaluation template should be used, False otherwise
+        """
+        suite_data = self.api_request_handler.suites_data_from_provider
+
+        # Check 1: quality_rating in any test result
+        has_quality_rating = any(
+            test_case.result.quality_rating is not None
+            for section in suite_data.testsections
+            for test_case in section.testcases
+        )
+
+        if has_quality_rating:
+            self.environment.vlog("Detected quality_rating in test results - will use AI Evaluation template")
+            return True
+
+        # Check 2: AI case fields in CLI --case-fields
+        case_fields_cli = getattr(self.environment, "case_fields", {}) or {}
+        has_ai_case_fields_cli = any(field in case_fields_cli for field in ["custom_ai_type", "custom_ai_model"])
+
+        if has_ai_case_fields_cli:
+            self.environment.vlog("Detected AI case fields in --case-fields - will use AI Evaluation template")
+            return True
+
+        # Check 3: AI case fields in XML properties (testrail_case_field)
+        has_ai_case_fields_xml = any(
+            any(field in (test_case.case_fields or {}) for field in ["custom_ai_type", "custom_ai_model"])
+            for section in suite_data.testsections
+            for test_case in section.testcases
+        )
+
+        if has_ai_case_fields_xml:
+            self.environment.vlog("Detected AI case fields in XML properties - will use AI Evaluation template")
+            return True
+
+        return False
+
+    def _test_case_needs_ai_template(self, test_case) -> bool:
+        """
+        Determine if a specific test case needs AI Evaluation template.
+
+        IMPORTANT: A test case needs AI Evaluation template ONLY if it has quality_rating
+        in the test result, because quality_rating is a required field for AI Evaluation template.
+
+        AI case fields (custom_ai_type, custom_ai_model) are metadata that can be used with
+        ANY template and do NOT require AI Evaluation template.
+
+        Args:
+            test_case: The test case to check
+
+        Returns:
+            True if test case has quality_rating in result, False otherwise
+        """
+        # ONLY check for quality_rating in test result
+        # AI case fields do NOT require AI Evaluation template
+        if test_case.result and test_case.result.quality_rating is not None:
+            return True
+
+        return False
+
+    def _apply_ai_evaluation_template(self):
+        """
+        Validate AI Evaluation template and apply its template_id to test cases that need it.
+
+        Calls the API to validate that AI Evaluation template exists in the project.
+        If validation succeeds, applies the template_id selectively to test cases based on:
+        - Test-specific quality_rating in results
+        - Test-specific AI case fields in XML properties
+        - Global AI case fields from CLI --case-fields
+
+        If validation fails, logs error and exits.
+        """
+        self.environment.log("AI Evaluation indicators detected. Validating AI Evaluation template...")
+
+        # Validate template exists via API and get its actual ID
+        template_exists, error_message, template_id = self.api_request_handler.validate_ai_evaluation_template(
+            self.project.project_id
+        )
+
+        if not template_exists:
+            self.environment.elog("ERROR: Cannot auto-create cases with AI Evaluation template.")
+            self.environment.elog(error_message)
+            exit(1)
+
+        # Apply template_id selectively to test cases that need it
+        suite_data = self.api_request_handler.suites_data_from_provider
+        ai_cases_count = 0
+        regular_cases_count = 0
+
+        for section in suite_data.testsections:
+            for test_case in section.testcases:
+                if self._test_case_needs_ai_template(test_case):
+                    test_case.template_id = template_id
+                    ai_cases_count += 1
+                else:
+                    regular_cases_count += 1
+
+        self.environment.log(
+            f"Using AI Evaluation template (ID: {template_id}) for {ai_cases_count} test case(s), "
+            f"{regular_cases_count} test case(s) will use default template"
+        )
diff --git a/trcli/data_classes/data_parsers.py b/trcli/data_classes/data_parsers.py
index 8905d8e5..ef88f26a 100644
--- a/trcli/data_classes/data_parsers.py
+++ b/trcli/data_classes/data_parsers.py
@@ -147,6 +147,9 @@ class FieldsParser:
     def resolve_fields(fields: Union[List[str], Dict]) -> Tuple[Dict, str]:
         error = None
         fields_dictionary = {}
+        # AI case fields that should be converted to integers (dropdown IDs)
+        AI_DROPDOWN_FIELDS = {"custom_ai_type", "custom_ai_model"}
+
         try:
             if isinstance(fields, list) or isinstance(fields, tuple):
                 for field in fields:
@@ -156,6 +159,13 @@ def resolve_fields(fields: Union[List[str], Dict]) -> Tuple[Dict, str]:
                             value = ast.literal_eval(value)
                         except Exception:
                             pass
+                    elif field in AI_DROPDOWN_FIELDS:
+                        # Convert AI dropdown fields to integers
+                        try:
+                            value = int(value)
+                        except (ValueError, TypeError):
+                            # Keep as string if not a valid integer
+                            pass
                     fields_dictionary[field] = value
             elif isinstance(fields, dict):
                 fields_dictionary = fields
diff --git a/trcli/data_providers/api_data_provider.py b/trcli/data_providers/api_data_provider.py
index 9570c135..787ba474 100644
--- a/trcli/data_providers/api_data_provider.py
+++ b/trcli/data_providers/api_data_provider.py
@@ -132,10 +132,18 @@ def add_run(
         return body
 
     def add_results_for_cases(self, bulk_size, user_ids=None):
-        """Return bodies for adding results for cases. Returns bodies for results that already have case ID."""
+        """Return bodies for adding results for cases. Returns bodies for results that already have case ID.
+
+        Splits results into separate batches:
+        1. Results WITHOUT quality_rating (for Text template cases)
+        2. Results WITH quality_rating (for AI Evaluation template cases)
+
+        This is necessary because TestRail validates each batch and rejects mixed batches.
+        """
         testcases = [sections.testcases for sections in self.suites_input.testsections]
 
-        bodies = []
+        bodies_without_quality_rating = []
+        bodies_with_quality_rating = []
         user_index = 0
         assigned_count = 0
         total_failed_count = 0
@@ -155,17 +163,39 @@ def add_results_for_cases(self, bulk_size, user_ids=None):
                             user_index += 1
                             assigned_count += 1
 
-                    bodies.append(case.result.to_dict())
+                    result_dict = case.result.to_dict()
+
+                    # Split results based on presence of quality_rating
+                    # This prevents TestRail validation errors when mixing template types
+                    if "quality_rating" in result_dict and result_dict["quality_rating"] is not None:
+                        bodies_with_quality_rating.append(result_dict)
+                    else:
+                        bodies_without_quality_rating.append(result_dict)
 
         # Store counts for logging (we'll access this from the api_request_handler)
         self._assigned_count = assigned_count if user_ids else 0
         self._total_failed_count = total_failed_count
 
-        result_bulks = ApiDataProvider.divide_list_into_bulks(
-            bodies,
-            bulk_size=bulk_size,
-        )
-        return [{"results": result_bulk} for result_bulk in result_bulks]
+        # Create separate batches for results with and without quality_rating
+        result_batches = []
+
+        # Add batches for results WITHOUT quality_rating (Text template cases)
+        if bodies_without_quality_rating:
+            result_bulks_without = ApiDataProvider.divide_list_into_bulks(
+                bodies_without_quality_rating,
+                bulk_size=bulk_size,
+            )
+            result_batches.extend([{"results": result_bulk} for result_bulk in result_bulks_without])
+
+        # Add batches for results WITH quality_rating (AI Evaluation template cases)
+        if bodies_with_quality_rating:
+            result_bulks_with = ApiDataProvider.divide_list_into_bulks(
+                bodies_with_quality_rating,
+                bulk_size=bulk_size,
+            )
+            result_batches.extend([{"results": result_bulk} for result_bulk in result_bulks_with])
+
+        return result_batches
 
     def update_data(
         self,