#25100 [model] Apertus Tool/Function and Reasoning parser

Original PR · Author: EduardDurech · Merged: 2026-06-06 15:04 · Files changed: 12 · Commits: 6 · Comments: 13 · Lines: +555 / -7

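Executive summary

In one sentence: adds a tool-call parser and a reasoning-block parser for Apertus2509.

Recommended action: the PR has a clear design and follows the extension pattern of the existing parser framework, which makes it a good example of how to add a new model format. Reading the source of apertus2509_detector.py and reasoning_parser.py is recommended for seeing how tool-call parsing and reasoning parsing are integrated. Developers who need to support a similar custom format can reuse the pattern.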

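Discussion highlights

During review, JustinTong0323 raised four key issues:

skip_special_tokens: ordinary chat requests still strip special tokens during decoding, so the reasoning parser never sees its markers; skip_special_tokens=False has to be forced. → Fixed (_patch_reasoning_skip_special_tokens sets it to False for apertus2509).
Documentation path: updates under the legacy docs/ directory had to move to docs_new/ so that lint would not reject them. → Fixed.
Leftover text after a streamed tool block: parse_streaming_increment might not promptly flush trailing normal text once a tool block completes. → Fixed (in the 'Fix review' commit).
Reasoning detector argument forwarding: the Apertus2509Detector constructor needs to accept kwargs such as continue_final_message and pass them to the base class. → Fixed.
Two reviewers ultimately approved the PR for merge.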

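Implementation breakdown

New tool-call detector: python/sglang/srt/function_call/apertus2509_detector.py implements Apertus2509Detector, which inherits from BaseFormatDetector. It supports one-shot parsing (detect_and_parse) and incremental streaming parsing (parse_streaming_increment), recognizes the <|tools_prefix|>[{...}]<|tools_suffix|> format, and converts each JSON object into a ToolCallItem.
New reasoning-block parser: python/sglang/srt/parser/reasoning_parser.py gains an Apertus2509Detector (inheriting from BaseReasoningFormatDetector). It implements detect_and_parse_block_sequence, which splits text into ordered ('reasoning', ...) and ('text', ...) blocks, and supports incremental streaming parsing (parse_streaming_increment). It also provides _split_inner_reasoning to handle tool calls that may be embedded inside a reasoning block.
Auto-detection and routing: python/sglang/srt/managers/template_detection.py adds the _is_apertus2509 detection function (based on whether the vocabulary contains <|inner_prefix|>) and registers apertus2509 in both REASONING_PARSER_RULES and TOOL_CALL_PARSER_RULES, so the model is recognized automatically.
Serving-path adaptation: in python/sglang/srt/entrypoints/openai/serving_chat.py the existing _patch_mistral_skip_special_tokens is generalized into _patch_reasoning_skip_special_tokens, which forces skip_special_tokens=False for the apertus2509 parser so that reasoning markers are not stripped during decoding.
API extension: the /separate_reasoning endpoint in python/sglang/srt/entrypoints/http_server.py gains a return_blocks parameter. When a request includes it, the response carries structured block lists (blocks, reasoning_blocks, text_blocks), so clients can preserve interleaved reasoning and tool-call segments.
Parser registration and tests: python/sglang/srt/function_call/function_call_parser.py registers Apertus2509Detector in FunctionCallParser.ToolCallParserEnum; the test file test/registered/unit/managers/test_template_manager.py is updated to cover the new detection rule. The documentation and notebook examples under docs_new are updated as well.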
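
The vocabulary-based routing in the auto-detection step can be illustrated with a small sketch. This is not the PR's code: `is_apertus2509_vocab` and `pick_parser` are hypothetical names, and the vocabulary is assumed to be a plain token-to-id mapping (for example, what a Hugging Face tokenizer's `get_vocab()` returns).

```python
from typing import Mapping, Optional

# Token the summary says detection is keyed on.
APERTUS_MARKER = "<|inner_prefix|>"


def is_apertus2509_vocab(vocab: Mapping[str, int]) -> bool:
    """Hypothetical stand-in for _is_apertus2509: true when the marker token exists."""
    return APERTUS_MARKER in vocab


def pick_parser(vocab: Mapping[str, int]) -> Optional[str]:
    """Return the parser name used for both reasoning and tool calls, or None."""
    return "apertus2509" if is_apertus2509_vocab(vocab) else None


# Toy vocabularies, invented for illustration only.
apertus_vocab = {"<s>": 0, "<|inner_prefix|>": 1, "<|inner_suffix|>": 2}
other_vocab = {"<s>": 0, "<think>": 1}

print(pick_parser(apertus_vocab))
print(pick_parser(other_vocab))
```

Because the check relies on a single token, any vocabulary that contains `<|inner_prefix|>` would be routed to the same parser name in this sketch.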

File	Module	Status	Importance
`python/sglang/srt/function_call/apertus2509_detector.py`	Tool calling	added	9.08
`python/sglang/srt/parser/reasoning_parser.py`	Reasoning parsing	modified	8.62
`python/sglang/srt/entrypoints/openai/serving_chat.py`	Serving entrypoint	modified	6.64
`python/sglang/srt/entrypoints/http_server.py`	Serving entrypoint	modified	5.59
`python/sglang/srt/managers/template_detection.py`	Model detection	modified	5.41
`python/sglang/srt/function_call/function_call_parser.py`	Tool calling	modified	4.66
`python/sglang/srt/managers/io_struct.py`	Data structures	modified	3.95
`test/registered/unit/managers/test_template_manager.py`	Template manager tests	modified	3.76
`docs_new/docs/advanced_features/separate_reasoning.mdx`	Docs	modified	2.79
`docs_new/docs/advanced_features/tool_parser.mdx`	Docs	modified	2.65
`docs_new/docs/advanced_features/separate_reasoning.ipynb`	Docs	modified	2.32
`docs_new/docs/advanced_features/tool_parser.ipynb`	Docs	modified	2.0

Key symbols

Apertus2509Detector.__init__, Apertus2509Detector.detect_and_parse, Apertus2509Detector.parse_streaming_increment, Apertus2509Detector.detect_and_parse_block_sequence, Apertus2509Detector._split_inner_reasoning, _patch_reasoning_skip_special_tokens, _is_apertus2509

Key source snippets

python/sglang/srt/function_call/apertus2509_detector.py core-logic

Core new file; implements detection and parsing of the Apertus tool-call format.

# python/sglang/srt/function_call/apertus2509_detector.py
class Apertus2509Detector(BaseFormatDetector):
    def __init__(self):
        super().__init__()
        self.bot = '<|tools_prefix|>['
        self.suffix = '<|tools_suffix|>'
        self._in_tools_block: bool = False

    def detect_and_parse(self, text: str, tools: List[Tool]) -> StreamingParseResult:
        '''One-shot parse: extract every tool block and parse its JSON payload.'''
        if not self.has_tool_call(text):
            return StreamingParseResult(normal_text=text, calls=[])
        calls: List[ToolCallItem] = []
        normal_parts: List[str] = []
        cursor = 0
        while True:
            if (start := text.find(self.bot, cursor)) == -1:
                normal_parts.append(text[cursor:])
                break
            normal_parts.append(text[cursor:start])
            tool_part = text[start:]
            parsed_arr, json_end = self._try_parse_json_array(tool_part)
            if parsed_arr is None:
                normal_parts.append(tool_part)
                break
            if (suffix_pos := tool_part.find(self.suffix, json_end)) == -1:
                normal_parts.append(tool_part)
                break
            calls.extend(self._parse_apertus_call_list(parsed_arr, tools, tool_index_offset=len(calls)))
            cursor = start + suffix_pos + len(self.suffix)
        return StreamingParseResult(normal_text=''.join(normal_parts).strip(), calls=calls)

    # parse_streaming_increment and other methods omitted
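
The excerpt calls `_try_parse_json_array` without showing it. The following is a minimal sketch of how such a helper could work, assuming it returns the parsed list plus the offset just past the closing bracket. The function name, the return convention on failure, and the sample payload shape are illustrative assumptions, not the PR's actual implementation.

```python
import json
from typing import Any, List, Optional, Tuple

TOOLS_PREFIX = "<|tools_prefix|>"


def try_parse_json_array(tool_part: str) -> Tuple[Optional[List[Any]], int]:
    """Parse the JSON array that directly follows TOOLS_PREFIX.

    Returns (array, end_index), where end_index is the offset in tool_part
    just past the closing bracket, or (None, -1) if no complete array exists.
    """
    if not tool_part.startswith(TOOLS_PREFIX):
        return None, -1
    array_start = len(TOOLS_PREFIX)
    try:
        # raw_decode parses one JSON value and reports where it stopped,
        # so trailing text such as the suffix token is tolerated.
        value, end = json.JSONDecoder().raw_decode(tool_part, array_start)
    except json.JSONDecodeError:
        return None, -1
    if not isinstance(value, list):
        return None, -1
    return value, end


# Illustrative payload; the real per-call JSON schema is not shown in the excerpt.
sample = '<|tools_prefix|>[{"name": "get_weather", "arguments": {"city": "Bern"}}]<|tools_suffix|> done'
arr, end = try_parse_json_array(sample)
print(arr)
print(sample[end:])
```

Returning the end offset lets the caller search for `<|tools_suffix|>` starting from that position, which matches how `json_end` is used in the excerpt.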

python/sglang/srt/parser/reasoning_parser.py core-logic

Core modified file; adds the Apertus reasoning-block parser and support for block-sequence parsing.

# python/sglang/srt/parser/reasoning_parser.py
class Apertus2509Detector(BaseReasoningFormatDetector):
    def __init__(self, stream_reasoning=True, force_reasoning=False,
                 continue_final_message=False, previous_content='',
                 force_nonempty_content=False):
        super().__init__(
            '<|inner_prefix|>',  # reasoning-block start marker
            '<|inner_suffix|>',  # reasoning-block end marker
            force_reasoning=False,
            stream_reasoning=stream_reasoning,
            continue_final_message=continue_final_message,
            previous_content=previous_content,
        )
        self._force_reasoning = force_reasoning
        self._force_nonempty_content = force_nonempty_content
        self._tool_start_token = '<|tools_prefix|>['
        self._tool_end_token = '<|tools_suffix|>'
        self._reasoning_acc = ''
        self._in_inner_tool = False

    def detect_and_parse_block_sequence(self, text: str) -> list[tuple[str, str]]:
        '''Return the ordered block sequence: [('reasoning'|'text', content), ...]'''
        start_tok = self.think_start_token
        end_tok = self.think_end_token
        blocks = []
        cursor = 0
        if self._in_reasoning:  # a continued message may already be inside a reasoning block
            if (e := text.find(end_tok, cursor)) == -1:
                blocks.extend(self._split_inner_reasoning(text[cursor:]))
                blocks.append(('text', ''))
                return blocks
            blocks.extend(self._split_inner_reasoning(text[cursor:e]))
            cursor = e + len(end_tok)
        while True:
            if (s := text.find(start_tok, cursor)) == -1:
                blocks.append(('text', text[cursor:]))
                break
            if s > cursor:
                blocks.append(('text', text[cursor:s]))
            cursor = s + len(start_tok)
            if (e := text.find(end_tok, cursor)) == -1:
                blocks.extend(self._split_inner_reasoning(text[cursor:]))
                blocks.append(('text', ''))
                break
            blocks.extend(self._split_inner_reasoning(text[cursor:e]))
            cursor = e + len(end_tok)
        last_idx = len(blocks) - 1
        blocks = [(k, t) for i, (k, t) in enumerate(blocks)
                  if not (k == 'text' and t == '' and i != last_idx)]
        return blocks
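
To make the shape of the returned block sequence concrete, here is a simplified, stateless sketch of the same splitting loop. It assumes no continued message and no inner tool calls, so it labels everything between the markers as reasoning instead of calling `_split_inner_reasoning`; `split_blocks` is a hypothetical name used only for illustration.

```python
from typing import List, Tuple

START, END = "<|inner_prefix|>", "<|inner_suffix|>"


def split_blocks(text: str) -> List[Tuple[str, str]]:
    """Split text into ordered ('reasoning' | 'text', content) blocks."""
    blocks: List[Tuple[str, str]] = []
    cursor = 0
    while True:
        s = text.find(START, cursor)
        if s == -1:
            # No further reasoning block: the remainder is normal text.
            blocks.append(("text", text[cursor:]))
            break
        if s > cursor:
            blocks.append(("text", text[cursor:s]))
        cursor = s + len(START)
        e = text.find(END, cursor)
        if e == -1:
            # Unterminated reasoning block: everything left is reasoning.
            blocks.append(("reasoning", text[cursor:]))
            blocks.append(("text", ""))
            break
        blocks.append(("reasoning", text[cursor:e]))
        cursor = e + len(END)
    # Drop empty text blocks except a trailing one, as the excerpt does.
    last = len(blocks) - 1
    return [(k, t) for i, (k, t) in enumerate(blocks)
            if not (k == "text" and t == "" and i != last)]


print(split_blocks("Hi <|inner_prefix|>think<|inner_suffix|> answer"))
# → [('text', 'Hi '), ('reasoning', 'think'), ('text', ' answer')]
```

Note that a trailing empty `('text', '')` block is kept, so the sequence always ends with a text block in this sketch.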

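Comment highlights

skip_special_tokens strips reasoning markers (correctness)

JustinTong0323 pointed out that ordinary chat requests still strip special tokens during decoding, so the reasoning parser never sees its markers; `skip_special_tokens=False` has to be forced.

Conclusion: fixed. `_patch_reasoning_skip_special_tokens` sets `skip_special_tokens=False` for the `apertus2509` parser. · Resolved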
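
The idea behind the fix can be sketched as follows. This is a hedged illustration only: `ChatRequest`, `KEEP_SPECIAL_TOKENS`, and `patch_skip_special_tokens` are hypothetical stand-ins, and the real `_patch_reasoning_skip_special_tokens` may take different arguments and cover more parsers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatRequest:
    """Hypothetical stand-in for the chat request object."""
    skip_special_tokens: bool = True


# Parsers whose markers are special tokens and must survive decoding.
# Only "apertus2509" is confirmed by the summary.
KEEP_SPECIAL_TOKENS = {"apertus2509"}


def patch_skip_special_tokens(request: ChatRequest,
                              reasoning_parser: Optional[str]) -> ChatRequest:
    """Force skip_special_tokens=False when the parser needs to see its markers."""
    if reasoning_parser in KEEP_SPECIAL_TOKENS:
        request.skip_special_tokens = False
    return request


patched = patch_skip_special_tokens(ChatRequest(), "apertus2509")
untouched = patch_skip_special_tokens(ChatRequest(), None)
print(patched.skip_special_tokens, untouched.skip_special_tokens)
```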

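Docs path must move from docs/ to docs_new/ (documentation)

JustinTong0323 asked for the updates under the legacy `docs/` directory to be moved to `docs_new/`, because the lint process rejects changes to `docs/` that are not allowlisted.

Conclusion: fixed. A commit moved the documentation updates to `docs_new/`. · Resolved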

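Normal text left behind after a tool block in streaming parsing (correctness)

JustinTong0323 pointed out that `parse_streaming_increment` might not promptly flush trailing normal text after a tool block completes, leaving `tail` buffered.

Conclusion: fixed (in the 'Fix review' commit). · Resolved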
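
One common way to decide how much trailing text is safe to flush in a streaming parser is to hold back only the part of the buffer that could still turn into a marker. The sketch below shows that general technique; it is not the PR's fix, and `split_flushable` is a hypothetical helper.

```python
from typing import Tuple

TOOL_START = "<|tools_prefix|>["


def split_flushable(buffer: str, marker: str = TOOL_START) -> Tuple[str, str]:
    """Return (flush_now, keep_buffered).

    keep_buffered is the longest suffix of buffer that is a proper prefix
    of marker, i.e. text that might still grow into the marker.
    """
    max_len = min(len(buffer), len(marker) - 1)
    for n in range(max_len, 0, -1):
        if marker.startswith(buffer[-n:]):
            return buffer[:-n], buffer[-n:]
    # Nothing at the end of the buffer can start a marker: flush everything.
    return buffer, ""


print(split_flushable("done. <|tools"))
print(split_flushable("plain text"))
```

With this approach, trailing normal text after a completed tool block is emitted immediately unless its tail could be the start of another tool block.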

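Reasoning detector constructor must forward common kwargs (design)

JustinTong0323 asked for `Apertus2509Detector` to accept and forward parameters such as `continue_final_message` and `previous_content`; otherwise requests that use these options would fail.

Conclusion: fixed. `__init__` now accepts the full set of kwargs and passes them to the base class. · Resolved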

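Risks and impact

Auto-detection false-positive risk: _is_apertus2509 only checks whether the vocabulary contains <|inner_prefix|>; another model whose vocabulary happens to include that token could be misidentified, though this is very unlikely.
skip_special_tokens impact: forcing special-token stripping off may let other special tokens appear in decoded output, but this applies only to the apertus2509 parser, so the scope is limited.
Streaming buffer boundaries: if the buffer management in parse_streaming_increment mishandles token boundaries, tool calls may be parsed incompletely or reasoning blocks may be split.
API compatibility: return_blocks is a new parameter and does not affect existing client calls.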
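
The compatibility point can be illustrated with a sketch of an opt-in response builder. Everything here is an assumption made for illustration: `build_response` is a hypothetical function, the default keys `reasoning_text` and `text` are assumed, and the per-block dictionary shape is invented; only the field names blocks, reasoning_blocks, and text_blocks come from the summary.

```python
from typing import Any, Dict, List, Tuple

Block = Tuple[str, str]  # ('reasoning' | 'text', content)


def build_response(blocks: List[Block], return_blocks: bool = False) -> Dict[str, Any]:
    """Build a response; structured block fields are added only on request."""
    reasoning = [t for k, t in blocks if k == "reasoning"]
    text = [t for k, t in blocks if k == "text"]
    response: Dict[str, Any] = {
        "reasoning_text": "".join(reasoning),
        "text": "".join(text),
    }
    if return_blocks:
        # Extra fields preserve the original interleaving for clients that opt in.
        response["blocks"] = [{"type": k, "content": t} for k, t in blocks]
        response["reasoning_blocks"] = reasoning
        response["text_blocks"] = text
    return response


demo = [("text", "Hi "), ("reasoning", "think"), ("text", "answer")]
print(sorted(build_response(demo)))
print(sorted(build_response(demo, return_blocks=True)))
```

Clients that never send the flag see the same keys as before in this sketch, which is the sense in which an added optional parameter stays backward compatible.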

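Impact: for users of the Apertus 2509 model, the PR makes tool calls and reasoning content separate and stream correctly; for the system, it adds about 550 lines of code (+555 / -7) with minimal overall performance impact; for the team, the parsing logic for this model format will need ongoing maintenance, but the code structure matches the existing framework, so the learning cost is low.

Risk tags: special-token stripping risk, auto-detection false positives, streaming boundary cases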

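Related issues

No related issues identified

No explicitly linked issue has been detected so far; any that are found once related references are synced will appear here.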

Full report


Key files:

python/sglang/srt/function_call/apertus2509_detector.py (module: tool calling; category: source; type: core-logic; symbols: Apertus2509Detector, __init__, has_tool_call, detect_and_parse): Core new file; implements detection and parsing of the Apertus tool-call format.
python/sglang/srt/parser/reasoning_parser.py (module: reasoning parsing; category: source; type: core-logic; symbols: Apertus2509Detector, __init__, _ends_with_partial_token, detect_and_parse): Core modified file; adds the Apertus reasoning-block parser and support for block-sequence parsing.
python/sglang/srt/entrypoints/openai/serving_chat.py (module: serving entrypoint; category: source; type: core-logic; symbols: _patch_mistral_skip_special_tokens, _patch_reasoning_skip_special_tokens): Generalizes the skip_special_tokens patch so that apertus2509 special markers are not stripped.
python/sglang/srt/entrypoints/http_server.py (module: serving entrypoint; category: source; type: core-logic): Extends the /separate_reasoning API to support structured return_blocks output.
python/sglang/srt/managers/template_detection.py (module: model detection; category: source; type: core-logic; symbols: _is_apertus2509): Adds the vocabulary-based auto-detection rule for apertus2509.
python/sglang/srt/function_call/function_call_parser.py (module: tool calling; category: source; type: dependency-wiring): Registers Apertus2509Detector in ToolCallParserEnum.
python/sglang/srt/managers/io_struct.py (module: data structures; category: source; type: core-logic): Adds Apertus-related configuration (the details are not shown in the patch excerpt).
test/registered/unit/managers/test_template_manager.py (module: template manager tests; category: test; type: test-coverage): Adds unit tests for the apertus2509 detection rule.
docs_new/docs/advanced_features/separate_reasoning.mdx (module: docs; category: other; type: documentation): Documentation update adding usage notes for the Apertus 2509 model.
docs_new/docs/advanced_features/tool_parser.mdx (module: docs; category: other; type: documentation): Documentation update adding notes on Apertus 2509 tool parsing.
docs_new/docs/advanced_features/separate_reasoning.ipynb (module: docs; category: other; type: documentation): Notebook example update.
docs_new/docs/advanced_features/tool_parser.ipynb (module: docs; category: other; type: documentation): Notebook example update.


Related context

No clearly related PRs at present.
