
[Bug]: DeepSeek V3.1 Tool Parser: Leading whitespace accumulation in multi-turn tool calling conversations #28804

@momaek

Description


Your current environment

  • vLLM Version: nightly (commit: 0b25498990f01ea2553c02731d6e2ce2d550156a)
  • Model: DeepSeek-V3.1-Terminus
  • Tool Parser: deepseek_v31
  • Chat Template: tool_chat_template_deepseekv31.jinja
  • Request Mode: Non-streaming

🐛 Describe the bug

Docker Configuration

command: >
  --model /models/DeepSeek-V3.1-Terminus
  --served-model-name DeepSeek-V3.1
  --trust-remote-code
  --host 0.0.0.0
  --port 8000
  --tensor-parallel-size 8
  --reasoning-parser deepseek_v3
  --max-model-len 128000
  --gpu-memory-utilization 0.9
  --enable-auto-tool-choice 
  --tool-call-parser deepseek_v31 
  --chat-template /models/DeepSeek-V3.1-Terminus/tool_chat_template_deepseekv31.jinja

Steps to Reproduce

  1. Start vLLM with DeepSeek V3.1 model and deepseek_v31 tool parser
  2. Create a multi-turn conversation with tool calls (10+ rounds recommended)
  3. Alternate between different tool calls in each round
  4. Observe the leading whitespace in assistant responses

Expected Behavior

Assistant responses should have no leading whitespace regardless of the number of conversation rounds. The content should be clean and consistent across all turns.
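
A minimal check for this expectation (the helper name is illustrative, not taken from vLLM's test suite):

def assert_no_leading_whitespace(content: str) -> None:
    """Fail if an assistant message's content starts with whitespace."""
    assert content == content.lstrip(), (
        f"unexpected leading whitespace: {content[:30]!r}"
    )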

Actual Behavior

Leading whitespace progressively accumulates in assistant responses:

Round 1: ✓ No leading spaces

{
  "content": "我来帮您查询北京今天的天气情况。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "北京今天天气很好,是晴天,温度25°C,比较舒适的温度。"
}

Round 2: ⚠️ 6 leading spaces after tool execution

{
  "content": "我来帮您计算15乘以23等于多少。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "      15乘以23等于345。"
}

Round 4: ⚠️ 10 leading spaces before tool call, 12 after

{
  "content": "          我来帮您查询上海的天气情况。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "            上海今天也是晴天,温度25°C,和北京的天气情况很相似。"
}

Round 10: ⚠️ 30 leading spaces before tool call, 48 after

{
  "content": "                              我来帮您计算100除以4的结果。",
  "tool_calls": [...]
}
// After tool execution:
{
  "content": "                                                100除以4等于25。"
}

Whitespace Accumulation Pattern

Round | Leading Spaces (Before Tool Call) | Leading Spaces (After Tool Result)
------|-----------------------------------|------------------------------------
1     | 0 ✓                               | 0 ✓
2     | 0 ✓                               | 6 ⚠️
3     | 0 ✓                               | 6 ⚠️
4     | 10 ⚠️                             | 12 ⚠️
5     | 0 ✓                               | 12 ⚠️
6     | 10 ⚠️                             | 18 ⚠️
7     | 10 ⚠️                             | 18 ⚠️
8     | 10 ⚠️                             | 24 ⚠️
9     | 20 ⚠️                             | 42 ⚠️
10    | 30 ⚠️                             | 48 ⚠️

Test Script

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Multi-turn Tool Call Test Script
Tests for whitespace accumulation in DeepSeek V3.1
"""

import json
from openai import OpenAI

# Tool definitions
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "获取指定城市的天气信息",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "城市名称"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["city"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "执行数学计算",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string", "description": "数学表达式"}
                },
                "required": ["expression"]
            }
        }
    }
]

def check_leading_spaces(text, round_num):
    """Check and report leading spaces"""
    if not text:
        return 0
    
    leading_spaces = len(text) - len(text.lstrip(' '))
    if leading_spaces > 0:
        print(f"⚠️  Round {round_num}: Detected {leading_spaces} leading spaces!")
        print(f"   First 50 chars: {repr(text[:50])}")
    else:
        print(f"✓ Round {round_num}: No leading spaces")
    
    return leading_spaces

def test_multi_round_toolcall(base_url="http://localhost:8000", rounds=10):
    client = OpenAI(api_key="sk-test", base_url=f"{base_url}/v1")
    
    test_questions = [
        "北京今天天气怎么样?",
        "计算一下 15 乘以 23 等于多少?",
        "上海的天气如何?",
        "帮我计算 100 除以 4 的结果",
    ]
    
    messages = []
    space_counts = []
    
    for round_num in range(1, rounds + 1):
        print(f"\n{'='*60}")
        print(f"Round {round_num}")
        print(f"{'='*60}")
        
        question = test_questions[(round_num - 1) % len(test_questions)]
        messages.append({"role": "user", "content": question})
        
        # First call - tool selection
        response = client.chat.completions.create(
            model="DeepSeek-V3.1",
            messages=messages,
            tools=tools,
            stream=False,
        )
        
        message = response.choices[0].message
        
        # Check for leading spaces
        if message.content:
            space_count = check_leading_spaces(message.content, f"{round_num}-before-tool")
            space_counts.append(space_count)
        
        # Handle tool calls
        if message.tool_calls:
            messages.append({
                "role": "assistant",
                "content": message.content,
                "tool_calls": [{
                    "id": tc.id,
                    "type": "function",
                    "function": {
                        "name": tc.function.name,
                        "arguments": tc.function.arguments
                    }
                } for tc in message.tool_calls]
            })
            
            # Simulate tool execution
            for tc in message.tool_calls:
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": f"Tool {tc.function.name} executed successfully"
                })
            
            # Get final response
            final_response = client.chat.completions.create(
                model="DeepSeek-V3.1",
                messages=messages,
                tools=tools,
                stream=False,
            )
            
            final_message = final_response.choices[0].message
            if final_message.content:
                space_count = check_leading_spaces(final_message.content, f"{round_num}-after-tool")
                space_counts.append(space_count)
            
            messages.append({
                "role": "assistant",
                "content": final_message.content
            })
    
    # Summary
    print(f"\n{'='*60}")
    print("Summary")
    print(f"{'='*60}")
    print(f"Total rounds: {rounds}")
    print(f"Rounds with leading spaces: {sum(1 for c in space_counts if c > 0)}/{len(space_counts)}")
    print(f"Min spaces: {min(space_counts)}")
    print(f"Max spaces: {max(space_counts)}")
    print(f"Average spaces: {sum(space_counts)/len(space_counts):.2f}")

if __name__ == "__main__":
    test_multi_round_toolcall(base_url="http://localhost:8000", rounds=10)
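
Assuming the server from the Docker configuration above is reachable at http://localhost:8000 and the openai Python package is installed, running the script prints per-round leading-space counts such as those shown in the accumulation table above.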

Impact

This issue causes:

  1. Poor user experience: Responses contain visible leading whitespace
  2. Context pollution: Whitespace is echoed back into the conversation history and becomes part of the model's prompt on later turns
  3. Cumulative degradation: The amount of leading whitespace grows with each turn, potentially affecting model behavior
  4. Inconsistent output: Some rounds have spaces, others don't

Additional Context

  • The issue appears to be related to how the DeepSeekV31ToolParser extracts content from model output
  • The whitespace is preserved when the content is returned to the client and is then added back to the conversation history on the next request
  • The chat template (tool_chat_template_deepseekv31.jinja) may also contribute by not stripping whitespace when rendering assistant messages
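
Until the parser and/or chat template is fixed, a possible client-side mitigation is to strip leading whitespace from assistant content before echoing it back into the history, which breaks the feedback loop. A sketch using the same OpenAI client objects as the test script above (the helper name is illustrative; this is a workaround, not a vLLM-side fix):

def sanitized_assistant_message(message) -> dict:
    """Build an assistant history entry with leading whitespace stripped
    from the content, leaving any tool_calls untouched."""
    content = message.content.lstrip() if message.content else message.content
    entry = {"role": "assistant", "content": content}
    if message.tool_calls:
        entry["tool_calls"] = [
            {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments,
                },
            }
            for tc in message.tool_calls
        ]
    return entry

# Usage in the test loop: messages.append(sanitized_assistant_message(message))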

