#44131 [CI] Stabilize OpenAI schema fuzzing for malformed structural tags

原始 PR 作者 AndreasKaratzas 合并时间 2026-06-02 10:56 文件变更 6 提交数 6 评论 2 代码增减 +201 / -3

执行摘要

在 API 层提前校验 structural_tag 格式

根据 PR 描述，Schemathesis 生成的请求 {"response_format": {"type": "structural_tag", "format": null}} 导致服务器返回 500，而应当拒绝返回 400。此举也消除了 CI fuzzing 测试的不稳定性。

推荐阅读，特别是 validate_structural_tag_response_format 的实现，展示了一种将深层引擎错误转化为 API 层校验错误的模式，有助于保持 API 的错误分类清晰。

讨论亮点

review 中仅有一轮讨论：DarkLight1337 提问是否应该使用 VLLMValidationError 进行测试，作者回应直接用 ValidationError 更直接，因为测试触发的是 Pydantic 校验器而非手动调用函数。最终 reviewer 认可并通过。

实现拆解

在 vllm/entrypoints/openai/engine/protocol.py 中新增 validate_structural_tag_response_format、validate_structural_tag_payload、validate_structured_outputs_structural_tag 三个函数。其中 validate_structural_tag_response_format 负责将 dict 类型的 response_format 转换为 Pydantic 模型，然后调用 validate_structural_tag_payload 进行 xgrammar 检查。validate_structured_outputs_structural_tag 解构 structured_outputs 中的 structural_tag 字段，调用相同 payload 验证。
在 ChatCompletionRequest 的 validate_response_format 校验器中插入代码：当 rf_type == 'structural_tag' 时调用 validate_structural_tag_response_format。同时在其 check_structured_outputs_count 和 BatchChatCompletionRequest.check_batch_mode 中增加对应的 validate_structured_outputs_structural_tag 调用。
在 CompletionRequest 的对应校验器中进行完全相同的修改，确保 completion 端点也受保护。
为以上三个路径编写针对性测试：新增 test_structural_tag_response_format_invalid 和 test_batch_structural_tag_response_format_invalid 覆盖 chat completion 和 batch，test_structured_outputs_structural_tag_invalid 覆盖直接 structured_outputs 发送的方式。completion 端点测试同理。
调整 tests/tool_parsers/test_mistral_tool_parser.py 中的测试 fixture，将其原先使用的部分有效（但存在歧义）的 structural_tag 格式改为明确的合法结构，以免被新验证拦截。

文件	模块	状态	重要度
`vllm/entrypoints/openai/engine/protocol.py`	协议层	modified	8.22
`vllm/entrypoints/openai/chat_completion/protocol.py`	协议层	modified	6.35
`tests/entrypoints/openai/chat_completion/test_chat_error.py`	聊天错误测试	modified	6.44
`vllm/entrypoints/openai/completion/protocol.py`	协议层	modified	5.59
`tests/entrypoints/openai/completion/test_completion_error.py`	完成错误测试	modified	5.67
`tests/tool_parsers/test_mistral_tool_parser.py`	工具解析	modified	4.61

关键符号

validate_structural_tag_response_format validate_structural_tag_payload validate_structured_outputs_structural_tag

关键源码片段

vllm/entrypoints/openai/engine/protocol.py dependency-wiring

核心验证函数所在文件，新增三个验证函数，是整个 PR 的基石

def validate_structural_tag_response_format(
    response_format: AnyStructuralTagResponseFormat | dict[str, Any],
) -> None:
    """Validate structural tags before they are sent to the engine.

    Engine-side validation reports malformed structural tags as generation
    failures. OpenAI request parsing should classify them as bad requests.
    """
    import json

    from pydantic import TypeAdapter, ValidationError

    # 如果 response_format 是 dict，先转换为 Pydantic 模型以获得字段校验
    if isinstance(response_format, dict):
        try:
            response_format = TypeAdapter(
                AnyStructuralTagResponseFormat
            ).validate_python(response_format)
        except ValidationError as exc:
            raise VLLMValidationError(
                "Invalid response_format structural_tag specification.",
                parameter="response_format",
            ) from exc

    try:
        # 序列化后交给下层 payload 验证
        payload = json.dumps(response_format.model_dump(by_alias=True))
        validate_structural_tag_payload(payload, parameter="response_format")
    except (TypeError, ValueError) as exc:
        raise VLLMValidationError(
            "Invalid response_format structural_tag specification.",
            parameter="response_format",
        ) from exc


def validate_structural_tag_payload(payload: Any, *, parameter: str) -> None:
    from vllm.sampling_params import SamplingParams, StructuredOutputsParams
    from vllm.v1.structured_output.backend_xgrammar import validate_xgrammar_grammar

    # 空字符串直接拒绝，避免下层引擎当作缺失处理
    if isinstance(payload, str) and not payload:
        raise VLLMValidationError(
            f"Invalid {parameter} structural_tag specification.",
            parameter=parameter,
        )

    try:
        # 复用 xgrammar 的 grammar 验证逻辑
        validate_xgrammar_grammar(
            SamplingParams(
                structured_outputs=StructuredOutputsParams(structural_tag=payload)
            )
        )
    except (TypeError, ValueError) as exc:
        raise VLLMValidationError(
            f"Invalid {parameter} structural_tag specification.",
            parameter=parameter,
        ) from exc

tests/entrypoints/openai/chat_completion/test_chat_error.py test-coverage

为聊天完成请求添加三种 structural_tag 拒绝测试

@pytest.mark.parametrize("format_value", [None, {}])
def test_structural_tag_response_format_invalid(format_value):
    # 验证当 response_format 的 format 字段为 None 或空 dict 时，
    # 请求构建会抛出 ValidationError，匹配提示信息
    with pytest.raises(
        ValidationError,
        match="Invalid response_format structural_tag",
    ):
        ChatCompletionRequest(
            model=MODEL_NAME,
            messages=[{"role": "user", "content": "hello"}],
            response_format={"type": "structural_tag", "format": format_value},
        )

评论区精华

测试中应使用 ValidationError 还是 VLLMValidationError 设计

DarkLight1337 提议使用 VLLMValidationError 以保持一致性，但 AndreasKaratzas 指出由于测试构造的是 Pydantic 请求类，ValidationError 更直接，且能同时覆盖三个不同的请求类（ChatCompletionRequest、BatchChatCompletionRequest、CompletionRequest）。

结论：决定保留 ValidationError，reviewer 认可并批准。 · 已解决

风险与影响

新加的 validate_structural_tag_payload 内部调用 validate_xgrammar_grammar，如果 xgrammar 版本更迭导致验证规则变化，可能影响有效请求。但由于该函数本就在引擎内部使用，风险保持一致。另外，空字符串被提前拒绝，可能影响一些边缘用例，但空 structural_tag 本无意义，所以影响很小。

对用户：无效 structural_tag 请求得到明确的 400 错误响应而非 500；对系统：无额外性能开销；对团队：新增的验证函数可被其他需要校验 structural_tag 的路径复用，降低重复代码。

依赖 xgrammar 验证行为新增验证路径可能误拒有效请求涉及三个 API 路径（chat、completion、batch）

关联 Issue

未识别关联 Issue

当前没有检测到明确关联的 Issue 链接，后续同步到相关引用后会出现在这里。

完整报告

执行摘要

一句话：在API层提前校验structural_tag格式
推荐动作：推荐阅读，特别是 validate_structural_tag_response_format 的实现，展示了一种将深层引擎错误转化为 API 层校验错误的模式，有助于保持 API 的错误分类清晰。

功能与动机

实现拆解

在 vllm/entrypoints/openai/engine/protocol.py 中新增 validate_structural_tag_response_format、validate_structural_tag_payload、validate_structured_outputs_structural_tag 三个函数。其中 validate_structural_tag_response_format 负责将 dict 类型的 response_format 转换为 Pydantic 模型，然后调用 validate_structural_tag_payload 进行 xgrammar 检查。validate_structured_outputs_structural_tag 解构 structured_outputs 中的 structural_tag 字段，调用相同 payload 验证。
在 ChatCompletionRequest 的 validate_response_format 校验器中插入代码：当 rf_type == 'structural_tag' 时调用 validate_structural_tag_response_format。同时在其 check_structured_outputs_count 和 BatchChatCompletionRequest.check_batch_mode 中增加对应的 validate_structured_outputs_structural_tag 调用。
在 CompletionRequest 的对应校验器中进行完全相同的修改，确保 completion 端点也受保护。
为以上三个路径编写针对性测试：新增 test_structural_tag_response_format_invalid 和 test_batch_structural_tag_response_format_invalid 覆盖 chat completion 和 batch，test_structured_outputs_structural_tag_invalid 覆盖直接 structured_outputs 发送的方式。completion 端点测试同理。
调整 tests/tool_parsers/test_mistral_tool_parser.py 中的测试 fixture，将其原先使用的部分有效（但存在歧义）的 structural_tag 格式改为明确的合法结构，以免被新验证拦截。

关键文件：

vllm/entrypoints/openai/engine/protocol.py（模块协议层；类别 source；类型 dependency-wiring；符号 validate_structural_tag_response_format, validate_structural_tag_payload, validate_structured_outputs_structural_tag）: 核心验证函数所在文件，新增三个验证函数，是整个PR的基石
vllm/entrypoints/openai/chat_completion/protocol.py（模块协议层；类别 source；类型 core-logic）: ChatCompletionRequest调用验证函数，覆盖主要入口
tests/entrypoints/openai/chat_completion/test_chat_error.py（模块聊天错误测试；类别 test；类型 test-coverage；符号 test_structural_tag_response_format_invalid, test_batch_structural_tag_response_format_invalid, test_structured_outputs_structural_tag_invalid）: 为聊天完成请求添加三种structural_tag拒绝测试
vllm/entrypoints/openai/completion/protocol.py（模块协议层；类别 source；类型 core-logic）: CompletionRequest调用验证函数，覆盖完成请求路径
tests/entrypoints/openai/completion/test_completion_error.py（模块完成错误测试；类别 test；类型 test-coverage；符号 test_structural_tag_response_format_invalid, test_structured_outputs_structural_tag_invalid）: 为完成请求添加structural_tag拒绝测试
tests/tool_parsers/test_mistral_tool_parser.py（模块工具解析；类别 test；类型 test-coverage）: 调整测试数据以通过新的验证

关键符号：validate_structural_tag_response_format, validate_structural_tag_payload, validate_structured_outputs_structural_tag

关键源码片段

`vllm/entrypoints/openai/engine/protocol.py`

核心验证函数所在文件，新增三个验证函数，是整个PR的基石

def validate_structural_tag_response_format(
    response_format: AnyStructuralTagResponseFormat | dict[str, Any],
) -> None:
    """Validate structural tags before they are sent to the engine.

    Engine-side validation reports malformed structural tags as generation
    failures. OpenAI request parsing should classify them as bad requests.
    """
    import json

    from pydantic import TypeAdapter, ValidationError

    # 如果 response_format 是 dict，先转换为 Pydantic 模型以获得字段校验
    if isinstance(response_format, dict):
        try:
            response_format = TypeAdapter(
                AnyStructuralTagResponseFormat
            ).validate_python(response_format)
        except ValidationError as exc:
            raise VLLMValidationError(
                "Invalid response_format structural_tag specification.",
                parameter="response_format",
            ) from exc

    try:
        # 序列化后交给下层 payload 验证
        payload = json.dumps(response_format.model_dump(by_alias=True))
        validate_structural_tag_payload(payload, parameter="response_format")
    except (TypeError, ValueError) as exc:
        raise VLLMValidationError(
            "Invalid response_format structural_tag specification.",
            parameter="response_format",
        ) from exc


def validate_structural_tag_payload(payload: Any, *, parameter: str) -> None:
    from vllm.sampling_params import SamplingParams, StructuredOutputsParams
    from vllm.v1.structured_output.backend_xgrammar import validate_xgrammar_grammar

    # 空字符串直接拒绝，避免下层引擎当作缺失处理
    if isinstance(payload, str) and not payload:
        raise VLLMValidationError(
            f"Invalid {parameter} structural_tag specification.",
            parameter=parameter,
        )

    try:
        # 复用 xgrammar 的 grammar 验证逻辑
        validate_xgrammar_grammar(
            SamplingParams(
                structured_outputs=StructuredOutputsParams(structural_tag=payload)
            )
        )
    except (TypeError, ValueError) as exc:
        raise VLLMValidationError(
            f"Invalid {parameter} structural_tag specification.",
            parameter=parameter,
        ) from exc

`tests/entrypoints/openai/chat_completion/test_chat_error.py`

为聊天完成请求添加三种structural_tag拒绝测试

@pytest.mark.parametrize("format_value", [None, {}])
def test_structural_tag_response_format_invalid(format_value):
    # 验证当 response_format 的 format 字段为 None 或空 dict 时，
    # 请求构建会抛出 ValidationError，匹配提示信息
    with pytest.raises(
        ValidationError,
        match="Invalid response_format structural_tag",
    ):
        ChatCompletionRequest(
            model=MODEL_NAME,
            messages=[{"role": "user", "content": "hello"}],
            response_format={"type": "structural_tag", "format": format_value},
        )

评论区精华

测试中应使用 ValidationError 还是 VLLMValidationError (design): 决定保留 ValidationError，reviewer 认可并批准。

风险与影响

风险：新加的 validate_structural_tag_payload 内部调用 validate_xgrammar_grammar，如果 xgrammar 版本更迭导致验证规则变化，可能影响有效请求。但由于该函数本就在引擎内部使用，风险保持一致。另外，空字符串被提前拒绝，可能影响一些边缘用例，但空 structural_tag 本无意义，所以影响很小。
影响：对用户：无效 structural_tag 请求得到明确的 400 错误响应而非 500；对系统：无额外性能开销；对团队：新增的验证函数可被其他需要校验 structural_tag 的路径复用，降低重复代码。
风险标记：依赖 xgrammar 验证行为, 新增验证路径可能误拒有效请求, 涉及三个API路径（chat、completion、batch）

关联脉络

暂无明显关联 PR

#44131 [CI] Stabilize OpenAI schema fuzzing for malformed structural tags

执行摘要

在 API 层提前校验 structural_tag 格式

实现拆解

评论区精华

风险与影响

关联 Issue

未识别关联 Issue

完整报告

参与讨论