#26948 Improve type annotations in unified radix cache

原始 PR 作者 ispobock 合并时间 2026-06-02 21:17 文件变更 3 提交数 3 评论 5 代码增减 +31 / -15

执行摘要

改进 unified radix cache 类型注解

PR body 指出 unified radix cache 和 base hicache controller 中的多个属性缺少类型或类型松散，导致常见链式调用（如 self.cache_controller.write_storage(...) 或 operation.id）无法解析类型。此 PR 添加缺失注解以改善代码可读性和 IDE 支持，无行为变更。

建议合并。此 PR 是纯类型注解改进，有助于减少未来开发中的类型错误和 IDE 提示缺失，且已通过 CI 测试。对于关注代码质量和可维护性的团队，值得精读以了解如何逐步加强现有代码的类型覆盖。

讨论亮点

此 PR 无 review 评论，无讨论线程。

实现拆解

在 unified_radix_cache.py 中：
- 导入 Queue、Iterator、TypeVar 和 PrefetchOperation。
- 定义 T = TypeVar("T") 用于泛型 _drain_queue 方法。
- 将 self.cache_controller 注解为 Optional[HybridCacheController]。
- 将 self.ongoing_prefetch 和 self.ongoing_backup 从 dict 改为具体嵌套元组类型，明确各字段语义。
- 为 cache_unfinished_req 的 chunked 参数添加 bool 注解。
- 为 tracker 参数在 _evict_component_and_detach_lru 和 _evict_to_host 中添加 Optional[dict[ComponentType, int]]。
- 为 _prefetch_timeout_check_linear_func 添加 PrefetchOperation 参数和 bool 返回注解。
- 将 _drain_queue 泛型化，返回 Iterator[T] 而非裸 Queue。
在 cache_controller.py 中：
- 为 prefetch_revoke_queue、ack_backup_queue、host_mem_release_queue 添加元素类型注解，分别指定为 str、StorageOperation、torch.Tensor。
在 hybrid_cache_controller.py 中：
- 为 extra_host_mem_release_queues 的值类型添加 Queue[torch.Tensor]。

文件	模块	状态	重要度
`python/sglang/srt/mem_cache/unified_radix_cache.py`	缓存层	modified	8.08
`python/sglang/srt/managers/cache_controller.py`	调度器	modified	5.31
`python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py`	缓存层	modified	4.82

关键符号

cache_unfinished_req _prefetch_timeout_check_linear_func _drain_queue _evict_to_host _evict_component_and_detach_lru

关键源码片段

python/sglang/srt/mem_cache/unified_radix_cache.py core-logic

核心变更文件，包含最多的类型注解添加（27 行新增），涉及缓存控制器、预取 / 备份字典、drain 队列等关键结构的类型完善。

# python/sglang/srt/mem_cache/unified_radix_cache.py

from __future__ import annotations

from queue import Empty, Queue
from typing import TYPE_CHECKING, Any, Iterator, Optional, TypeVar

# 导入 PrefetchOperation 用于注解预取操作参数
if TYPE_CHECKING:
    from sglang.srt.mem_cache.hybrid_cache.hybrid_cache_controller import (
        PrefetchOperation,
    )

T = TypeVar("T") # 用于泛化 _drain_queue 方法

class UnifiedRadixCache:
    def __init__(self, ...):
        # ...
        # 之前是 self.cache_controller = None，现在显式注解为 Optional[HybridCacheController]
        self.cache_controller: Optional[HybridCacheController] = None

        # ...

    def _reset_full(self) -> None:
        # ...
        # 之前是空 dict，现在具体指定键值类型，提升可读性和静态检查
        self.ongoing_prefetch: dict[
            str,
            tuple[
                UnifiedTreeNode,
                RadixKey,
                torch.Tensor,
                PrefetchOperation,
                DecLockRefParams,
                dict[ComponentType, list[PoolTransfer]],
            ],
        ] = {}
        self.ongoing_backup: dict[int, tuple[UnifiedTreeNode, DecLockRefParams]] = {}
        # ...

    def cache_unfinished_req(self, req: Req, chunked: bool = False, **kwargs) -> None:
        # chunked 参数之前无类型，现在明确为 bool
        ...

    # _drain_queue 方法现在使用泛型 T，返回 Iterator[T] 而非原始 Queue
    def _drain_queue(self, q: Queue[T]) -> Iterator[T]:
        while True:
            try:
                item = q.get_nowait()
                yield item
            except Empty:
                break

    # 预取超时检查函数现在明确参数和返回类型
    def _prefetch_timeout_check_linear_func(
        self, operation: PrefetchOperation
    ) -> bool:
        ...

    def _evict_to_host(
        self,
        node: UnifiedTreeNode,
        tracker: Optional[dict[ComponentType, int]] = None, # 之前缺少 Optional
    ) -> None:
        ...

python/sglang/srt/managers/cache_controller.py entrypoint

为存储控制线程队列添加元素类型注解，确保队列中的元素类型被正确推断。

# python/sglang/srt/managers/cache_controller.py

from queue import Queue

class CacheController:
    def _start_storage_threads(self):
        # ...
        # 之前为裸 Queue()，现在指定元素类型，提升静态检查准确性
        self.prefetch_revoke_queue: Queue[str] = Queue()
        self.ack_backup_queue: Queue[StorageOperation] = Queue()
        self.host_mem_release_queue: Queue[torch.Tensor] = Queue()
        # ...

python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py entrypoint

为 extra_host_mem_release_queues 字典值添加 Queue[torch.Tensor] 注解，与其他队列保持一致。

# python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py

from queue import Queue

class HybridCacheController(BaseHiCacheController):
    def __init__(self, ...):
        # ...
        # 之前为 dict[PoolName, Queue]，现在指定元素类型
        self.extra_host_mem_release_queues: dict[PoolName, Queue[torch.Tensor]] = {}
        # ...

评论区精华

没有提炼出高价值讨论线程

当前评论区没有形成足够清晰的争议点或结论，后续有更多讨论时会体现在这里。

风险与影响

风险极低。所有变更均为类型注解，且使用 from __future__ import annotations 延迟求值，不会在运行时引入额外检查或影响。唯一可能的风险是如果类型注解存在错误（如 OngoingPrefetchEntry 类型与实际使用的元组结构不符），可能导致静态类型检查工具误报，但不会影响运行行为。

直接影响开发者体验：IDE 类型提示和静态检查（如 mypy）将能正确解析 cache_controller 的方法调用和 operation.id 等表达式。对用户无影响，对系统无性能/功能影响。影响范围限于三个源文件，约 31 行新增、15 行删除。

类型注解变更低风险

关联 Issue

未识别关联 Issue

当前没有检测到明确关联的 Issue 链接，后续同步到相关引用后会出现在这里。

完整报告

执行摘要

一句话：改进 unified radix cache 类型注解
推荐动作：建议合并。此 PR 是纯类型注解改进，有助于减少未来开发中的类型错误和 IDE 提示缺失，且已通过 CI 测试。对于关注代码质量和可维护性的团队，值得精读以了解如何逐步加强现有代码的类型覆盖。

功能与动机

实现拆解

在 unified_radix_cache.py 中：
- 导入 Queue、Iterator、TypeVar 和 PrefetchOperation。
- 定义 T = TypeVar("T") 用于泛型 _drain_queue 方法。
- 将 self.cache_controller 注解为 Optional[HybridCacheController]。
- 将 self.ongoing_prefetch 和 self.ongoing_backup 从 dict 改为具体嵌套元组类型，明确各字段语义。
- 为 cache_unfinished_req 的 chunked 参数添加 bool 注解。
- 为 tracker 参数在 _evict_component_and_detach_lru 和 _evict_to_host 中添加 Optional[dict[ComponentType, int]]。
- 为 _prefetch_timeout_check_linear_func 添加 PrefetchOperation 参数和 bool 返回注解。
- 将 _drain_queue 泛型化，返回 Iterator[T] 而非裸 Queue。
在 cache_controller.py 中：
- 为 prefetch_revoke_queue、ack_backup_queue、host_mem_release_queue 添加元素类型注解，分别指定为 str、StorageOperation、torch.Tensor。
在 hybrid_cache_controller.py 中：
- 为 extra_host_mem_release_queues 的值类型添加 Queue[torch.Tensor]。

关键文件：

python/sglang/srt/mem_cache/unified_radix_cache.py（模块缓存层；类别 source；类型 core-logic；符号 cache_unfinished_req, _prefetch_timeout_check_linear_func, can_terminate_prefetch, _drain_queue）: 核心变更文件，包含最多的类型注解添加（27 行新增），涉及缓存控制器、预取/备份字典、drain 队列等关键结构的类型完善。
python/sglang/srt/managers/cache_controller.py（模块调度器；类别 source；类型 entrypoint）: 为存储控制线程队列添加元素类型注解，确保队列中的元素类型被正确推断。
python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py（模块缓存层；类别 source；类型 entrypoint）: 为 extra_host_mem_release_queues 字典值添加 Queue[torch.Tensor] 注解，与其他队列保持一致。

关键符号：cache_unfinished_req, _prefetch_timeout_check_linear_func, _drain_queue, _evict_to_host, _evict_component_and_detach_lru

关键源码片段

`python/sglang/srt/mem_cache/unified_radix_cache.py`

核心变更文件，包含最多的类型注解添加（27 行新增），涉及缓存控制器、预取/备份字典、drain 队列等关键结构的类型完善。

# python/sglang/srt/mem_cache/unified_radix_cache.py

from __future__ import annotations

from queue import Empty, Queue
from typing import TYPE_CHECKING, Any, Iterator, Optional, TypeVar

# 导入 PrefetchOperation 用于注解预取操作参数
if TYPE_CHECKING:
    from sglang.srt.mem_cache.hybrid_cache.hybrid_cache_controller import (
        PrefetchOperation,
    )

T = TypeVar("T") # 用于泛化 _drain_queue 方法

class UnifiedRadixCache:
    def __init__(self, ...):
        # ...
        # 之前是 self.cache_controller = None，现在显式注解为 Optional[HybridCacheController]
        self.cache_controller: Optional[HybridCacheController] = None

        # ...

    def _reset_full(self) -> None:
        # ...
        # 之前是空 dict，现在具体指定键值类型，提升可读性和静态检查
        self.ongoing_prefetch: dict[
            str,
            tuple[
                UnifiedTreeNode,
                RadixKey,
                torch.Tensor,
                PrefetchOperation,
                DecLockRefParams,
                dict[ComponentType, list[PoolTransfer]],
            ],
        ] = {}
        self.ongoing_backup: dict[int, tuple[UnifiedTreeNode, DecLockRefParams]] = {}
        # ...

    def cache_unfinished_req(self, req: Req, chunked: bool = False, **kwargs) -> None:
        # chunked 参数之前无类型，现在明确为 bool
        ...

    # _drain_queue 方法现在使用泛型 T，返回 Iterator[T] 而非原始 Queue
    def _drain_queue(self, q: Queue[T]) -> Iterator[T]:
        while True:
            try:
                item = q.get_nowait()
                yield item
            except Empty:
                break

    # 预取超时检查函数现在明确参数和返回类型
    def _prefetch_timeout_check_linear_func(
        self, operation: PrefetchOperation
    ) -> bool:
        ...

    def _evict_to_host(
        self,
        node: UnifiedTreeNode,
        tracker: Optional[dict[ComponentType, int]] = None, # 之前缺少 Optional
    ) -> None:
        ...

`python/sglang/srt/managers/cache_controller.py`

为存储控制线程队列添加元素类型注解，确保队列中的元素类型被正确推断。

# python/sglang/srt/managers/cache_controller.py

from queue import Queue

class CacheController:
    def _start_storage_threads(self):
        # ...
        # 之前为裸 Queue()，现在指定元素类型，提升静态检查准确性
        self.prefetch_revoke_queue: Queue[str] = Queue()
        self.ack_backup_queue: Queue[StorageOperation] = Queue()
        self.host_mem_release_queue: Queue[torch.Tensor] = Queue()
        # ...

`python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py`

为 extra_host_mem_release_queues 字典值添加 Queue[torch.Tensor] 注解，与其他队列保持一致。

# python/sglang/srt/mem_cache/hybrid_cache/hybrid_cache_controller.py

from queue import Queue

class HybridCacheController(BaseHiCacheController):
    def __init__(self, ...):
        # ...
        # 之前为 dict[PoolName, Queue]，现在指定元素类型
        self.extra_host_mem_release_queues: dict[PoolName, Queue[torch.Tensor]] = {}
        # ...

评论区精华

此 PR 无 review 评论，无讨论线程。

暂无高价值评论线程

风险与影响

风险：风险极低。所有变更均为类型注解，且使用 from __future__ import annotations 延迟求值，不会在运行时引入额外检查或影响。唯一可能的风险是如果类型注解存在错误（如 OngoingPrefetchEntry 类型与实际使用的元组结构不符），可能导致静态类型检查工具误报，但不会影响运行行为。
影响：直接影响开发者体验：IDE 类型提示和静态检查（如 mypy）将能正确解析 cache_controller 的方法调用和 operation.id 等表达式。对用户无影响，对系统无性能/功能影响。影响范围限于三个源文件，约 31 行新增、15 行删除。
风险标记：类型注解变更, 低风险

关联脉络

PR #27064 Fix stale import after kl_nightly rename: 两者均涉及 unified radix cache 相关的文件修改，且本 PR 也影响了 radix_cache 相关的类型定义。
PR #26927 [UnifiedTree]: Add HiCache Nightly CI For GLM5: 同样属于 unified radix cache（UnifiedTree）的维护工作，本 PR 改进了该模块的类型系统。

#26948 Improve type annotations in unified radix cache

执行摘要

改进 unified radix cache 类型注解

实现拆解

评论区精华

没有提炼出高价值讨论线程

风险与影响

关联 Issue

未识别关联 Issue

完整报告

参与讨论