
[Bug]: DeepSeek V3.1 Tool Parser: Leading whitespace accumulation in multi-turn tool calling conversations #28804

@momaek

Description


Your current environment

  • vLLM Version: nightly (commit: 0b25498990f01ea2553c02731d6e2ce2d550156a)
  • Model: DeepSeek-V3.1-Terminus
  • Tool Parser: deepseek_v31
  • Chat Template: tool_chat_template_deepseekv31.jinja
  • Request Mode: Non-streaming

🐛 Describe the bug

Docker Configuration

command: >
  --model /models/DeepSeek-V3.1-Terminus
  --served-model-name DeepSeek-V3.1
  --trust-remote-code
  --host 0.0.0.0
  --port 8000
  --tensor-parallel-size 8
  --reasoning-parser deepseek_v3
  --max-model-len 128000
  --gpu-memory-utilization 0.9
  --enable-auto-tool-choice 
  --tool-call-parser deepseek_v31 
  --chat-template /models/DeepSeek-V3.1-Terminus/tool_chat_template_deepseekv31.jinja

Steps to Reproduce

  1. Start vLLM with DeepSeek V3.1 model and deepseek_v31 tool parser
  2. Create a multi-turn conversation with tool calls (10+ rounds recommended)
  3. Alternate between different tool calls in each round
  4. Observe the leading whitespace in assistant responses

Expected Behavior

Assistant responses should have no leading whitespace regardless of the number of conversation rounds. The content should be clean and consistent across all turns.
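
A minimal check for this expectation (the helper name is illustrative, not taken from vLLM's test suite):

def assert_no_leading_whitespace(content: str) -> None:
    """Fail if an assistant message's content starts with whitespace."""
    assert content == content.lstrip(), (
        f"unexpected leading whitespace: {content[:30]!r}"
    )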

Actual Behavior

Leading whitespace progressively accumulates in assistant responses:

Round 1: ✓ No leading spaces

{
  "content": "我来帮您查询北京今天的天气情况。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "北京今天天气很好,是晴天,温度25°C,比较舒适的温度。"
}

Round 2: ⚠️ 6 leading spaces after tool execution

{
  "content": "我来帮您计算15乘以23等于多少。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "      15乘以23等于345。"
}

Round 4: ⚠️ 10 leading spaces before tool call, 12 after

{
  "content": "          我来帮您查询上海的天气情况。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "            上海今天也是晴天,温度25°C,和北京的天气情况很相似。"
}

Round 10: ⚠️ 30 leading spaces before tool call, 48 after

{
  "content": "                              我来帮您计算100除以4的结果。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "                                                100除以4等于25。"
}

Whitespace Accumulation Pattern

Round | Leading Spaces (Before Tool Call) | Leading Spaces (After Tool Result)
------|-----------------------------------|------------------------------------
1     | 0 ✓                               | 0 ✓
2     | 0 ✓                               | 6 ⚠️
3     | 0 ✓                               | 6 ⚠️
4     | 10 ⚠️                             | 12 ⚠️
5     | 0 ✓                               | 12 ⚠️
6     | 10 ⚠️                             | 18 ⚠️
7     | 10 ⚠️                             | 18 ⚠️
8     | 10 ⚠️                             | 24 ⚠️
9     | 20 ⚠️                             | 42 ⚠️
10    | 30 ⚠️                             | 48 ⚠️

Test Script

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Multi-turn Tool Call Test Script
Tests for whitespace accumulation in DeepSeek V3.1
"""

import json
from openai import OpenAI

# Tool definitions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "获取指定城市的天气信息",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "城市名称"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "执行数学计算",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "数学表达式"}
                },
                "required": ["expression"]
            }
        }
    }
]

def check_leading_spaces(text, round_num):
    """Check and report leading spaces"""
    if not text:
        return 0
    
    leading_spaces = len(text) - len(text.lstrip(' '))
    if leading_spaces > 0:
        print(f"⚠️  Round {round_num}: Detected {leading_spaces} leading spaces!")
        print(f"   First 50 chars: {repr(text[:50])}")
    else:
        print(f"✓ Round {round_num}: No leading spaces")
    
    return leading_spaces

def test_multi_round_toolcall(base_url="http://localhost:8000", rounds=10):
    client = OpenAI(api_key="sk-test", base_url=f"{base_url}/v1")
    
    test_questions = [
        "北京今天天气怎么样?",
        "计算一下 15 乘以 23 等于多少?",
        "上海的天气如何?",
        "帮我计算 100 除以 4 的结果",
    ]
    
    messages = []
    space_counts = []
    
    for round_num in range(1, rounds + 1):
        print(f"\n{'='*60}")
        print(f"Round {round_num}")
        print(f"{'='*60}")
        
        question = test_questions[(round_num - 1) % len(test_questions)]
        messages.append({"role": "user", "content": question})
        
        # First call - tool selection
        response = client.chat.completions.create(
            model="DeepSeek-V3.1",
            messages=messages,
            tools=tools,
            stream=False,
        )
        
        message = response.choices[0].message
        
        # Check for leading spaces
        if message.content:
            space_count = check_leading_spaces(message.content, f"{round_num}-before-tool")
            space_counts.append(space_count)
        
        # Handle tool calls
        if message.tool_calls:
            messages.append({
                "role": "assistant",
                "content": message.content,
                "tool_calls": [{
                    "id": tc.id,
                    "type": "function",
                    "function": {
                        "name": tc.function.name,
                        "arguments": tc.function.arguments
                    }
                } for tc in message.tool_calls]
            })
            
            # Simulate tool execution
            for tc in message.tool_calls:
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": f"Tool {tc.function.name} executed successfully"
                })
            
            # Get final response
            final_response = client.chat.completions.create(
                model="DeepSeek-V3.1",
                messages=messages,
                tools=tools,
                stream=False,
            )
            
            final_message = final_response.choices[0].message
            if final_message.content:
                space_count = check_leading_spaces(final_message.content, f"{round_num}-after-tool")
                space_counts.append(space_count)
            
            messages.append({
                "role": "assistant",
                "content": final_message.content
            })
    
    # Summary
    print(f"\n{'='*60}")
    print("Summary")
    print(f"{'='*60}")
    print(f"Total rounds: {rounds}")
    print(f"Rounds with leading spaces: {sum(1 for c in space_counts if c > 0)}/{len(space_counts)}")
    print(f"Min spaces: {min(space_counts)}")
    print(f"Max spaces: {max(space_counts)}")
    print(f"Average spaces: {sum(space_counts)/len(space_counts):.2f}")

if __name__ == "__main__":
    test_multi_round_toolcall(base_url="http://localhost:8000", rounds=10)
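
Assuming the server from the Docker configuration above is reachable at http://localhost:8000 and the openai Python package is installed, running the script prints per-round leading-space counts such as those shown in the accumulation table above.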

Impact

This issue causes:

  1. Poor user experience: Responses contain visible leading whitespace
  2. Context pollution: Whitespace is echoed back into the conversation history and becomes part of the model's prompt on later turns
  3. Cumulative degradation: The amount of leading whitespace grows with each turn, potentially affecting model behavior
  4. Inconsistent output: Some rounds have spaces, others don't

Additional Context

  • The issue appears to be related to how the DeepSeekV31ToolParser extracts content from model output
  • The whitespace is preserved when the content is returned to the client and is then added back to the conversation history on the next request
  • The chat template (tool_chat_template_deepseekv31.jinja) may also contribute by not stripping whitespace when rendering assistant messages
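
Until the parser and/or chat template is fixed, a possible client-side mitigation is to strip leading whitespace from assistant content before echoing it back into the history, which breaks the feedback loop. A sketch using the same OpenAI client objects as the test script above (the helper name is illustrative; this is a workaround, not a vLLM-side fix):

def sanitized_assistant_message(message) -> dict:
    """Build an assistant history entry with leading whitespace stripped
    from the content, leaving any tool_calls untouched."""
    content = message.content.lstrip() if message.content else message.content
    entry = {"role": "assistant", "content": content}
    if message.tool_calls:
        entry["tool_calls"] = [
            {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments,
                },
            }
            for tc in message.tool_calls
        ]
    return entry

# Usage in the test loop: messages.append(sanitized_assistant_message(message))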

