#39750 [Refactor] Remove unused param

原始 PR 作者 yewentao256 合并时间 2026-04-22 05:59 文件变更 5 提交数 2 评论 1 代码增减 +0 / -11

执行摘要

移除多个 Qwen 模型和引擎类中未使用的缓存与哈希参数。

PR body 中明确说明目的是“Remove unused param”。从review评论中，DarkLight1337 提问“why did we have these in the first place?”，表明这些参数的存在原因不明且未被使用，因此进行清理。

该PR是一个简单的清理重构，不值得精读。工程师可以快速浏览以了解哪些未使用参数被移除，但无需深入分析设计决策。关注点在于确认这些参数确实未被使用，以避免未来类似冗余。

讨论亮点

review 中仅有一次讨论：DarkLight1337 在 qwen2_5_omni_thinker.py 的变更处提问“@ywang96 why did we have these in the first place?”，但未得到回复。这暗示这些参数可能是历史遗留或未实现的特性，但当前已确认无用。gemini-code-assist[bot] 的自动审查确认了移除的是未使用参数，且无其他反馈。讨论简单，无重大争议。

实现拆解

清理Qwen模型的多模态处理函数：在三个模型文件（qwen2_5_omni_thinker.py、qwen3_asr.py、qwen3_omni_moe_thinker.py）中，移除了 _process_audio_input 和 _process_video_input 方法中未使用的 audio_hashes、cached_audio_features、video_hashes、cached_video_embeds 参数。这些参数在函数签名中声明但从未在函数体内被引用，移除后不影响核心处理逻辑。
清理引擎初始化参数：在 async_llm.py 和 llm_engine.py 中，移除了 AsyncLLM.__init__ 和 LLMEngine.__init__ 方法中的 use_cached_outputs 参数及其相关文档。该参数在构造函数中被接收但未在类内部使用，移除后简化了引擎的配置接口。
无测试或配置配套改动：本次变更仅涉及源码清理，没有修改测试文件、配置文件或部署脚本，表明这是一个纯粹的内部重构。

文件	模块	状态	重要度
`vllm/model_executor/models/qwen2_5_omni_thinker.py`	模型执行器	modified	5.28
`vllm/v1/engine/async_llm.py`	引擎层	modified	4.7
`vllm/model_executor/models/qwen3_asr.py`	模型执行器	modified	4.7
`vllm/model_executor/models/qwen3_omni_moe_thinker.py`	模型执行器	modified	4.7
`vllm/v1/engine/llm_engine.py`	引擎层	modified	4.56

关键符号

_process_audio_input _process_video_input __init__

关键源码片段

vllm/model_executor/models/qwen2_5_omni_thinker.py data-contract

移除了音频和视频处理函数中未使用的哈希和缓存参数，涉及多模态模型的核心数据处理逻辑。

def _process_audio_input(
    self,
    audio_input: Qwen2_5OmniAudioFeatureInputs,
    # 移除了未使用的参数：audio_hashes 和 cached_audio_features
    # 这些参数在函数体内从未被引用，因此安全删除
) -> torch.Tensor:
    input_features = audio_input["input_features"]
    audio_feature_lengths = audio_input["audio_feature_lengths"]
    # ... 剩余处理逻辑保持不变
    audio_feat_lengths, audio_output_lengths = (
        self.audio_tower._get_feat_extract_output_lengths(audio_feature_lengths)
    )
    audio_outputs = self.audio_tower(
        input_features.to(self.audio_tower.dtype),
        feature_lens=audio_feature_lengths,
        aftercnn_lens=audio_feat_lengths,
    )
    return audio_outputs.last_hidden_state.split(audio_output_lengths.tolist())

vllm/v1/engine/async_llm.py core-logic

移除了 AsyncLLM 构造函数中未使用的 `use_cached_outputs` 参数，影响引擎初始化配置。

def __init__(
    self,
    vllm_config: VllmConfig,
    executor_class: type[Executor],
    log_stats: bool,
    usage_context: UsageContext = UsageContext.ENGINE_CONTEXT,
    mm_registry: MultiModalRegistry = MULTIMODAL_REGISTRY,
    # 移除了未使用的参数：use_cached_outputs
    # 该参数在构造函数中未被使用，因此安全删除
    log_requests: bool = True,
    start_engine_loop: bool = True,
    stat_loggers: list[StatLoggerFactory] | None = None,
    aggregate_engine_logging: bool = False,
    client_addresses: dict[str, str] | None = None,
    client_count: int = 1,
    client_index: int = 0,
) -> None:
    # ... 初始化逻辑保持不变
    self.vllm_config = vllm_config
    self.model_config = vllm_config.model_config
    self.observability_config = vllm_config.observability_config
    # ...

评论区精华

未使用参数的历史原因 question

DarkLight1337 提问为什么这些参数最初存在，但未得到回答。

结论：参数被确认为未使用，因此移除。 · 已解决

风险与影响

技术风险极低：

回归风险：移除的参数在代码中未被实际使用，因此不会影响现有功能。但需确保没有其他代码（如子类覆盖、动态调用）依赖这些参数签名。从上下文看，这些是私有方法或内部构造函数，外部依赖可能性小。
兼容性风险：由于是移除未使用的参数，不改变现有API的行为，因此向后兼容。但若下游用户代码（可能性极低）传入了这些参数，调用将失败。鉴于这些是内部模块，风险可控。
性能与安全：无影响。

影响范围有限：

对用户：无直接影响，因为变更涉及内部模型处理和引擎初始化接口，不暴露给外部API。
对系统：简化了代码结构，减少了不必要的参数传递，可能轻微提升代码可读性和维护性。
对团队：清理了技术债务，为后续开发提供了更干净的代码库。

接口签名变更潜在下游依赖

关联 Issue

未识别关联 Issue

当前没有检测到明确关联的 Issue 链接，后续同步到相关引用后会出现在这里。

完整报告

执行摘要

一句话：移除多个Qwen模型和引擎类中未使用的缓存与哈希参数。
推荐动作：该PR是一个简单的清理重构，不值得精读。工程师可以快速浏览以了解哪些未使用参数被移除，但无需深入分析设计决策。关注点在于确认这些参数确实未被使用，以避免未来类似冗余。

功能与动机

实现拆解

清理Qwen模型的多模态处理函数：在三个模型文件（qwen2_5_omni_thinker.py、qwen3_asr.py、qwen3_omni_moe_thinker.py）中，移除了 _process_audio_input 和 _process_video_input 方法中未使用的 audio_hashes、cached_audio_features、video_hashes、cached_video_embeds 参数。这些参数在函数签名中声明但从未在函数体内被引用，移除后不影响核心处理逻辑。
清理引擎初始化参数：在 async_llm.py 和 llm_engine.py 中，移除了 AsyncLLM.__init__ 和 LLMEngine.__init__ 方法中的 use_cached_outputs 参数及其相关文档。该参数在构造函数中被接收但未在类内部使用，移除后简化了引擎的配置接口。
无测试或配置配套改动：本次变更仅涉及源码清理，没有修改测试文件、配置文件或部署脚本，表明这是一个纯粹的内部重构。

关键文件：

vllm/model_executor/models/qwen2_5_omni_thinker.py（模块模型执行器；类别 source；类型 data-contract；符号 _process_audio_input, _process_video_input）: 移除了音频和视频处理函数中未使用的哈希和缓存参数，涉及多模态模型的核心数据处理逻辑。
vllm/v1/engine/async_llm.py（模块引擎层；类别 source；类型 core-logic；符号 init）: 移除了 AsyncLLM 构造函数中未使用的 use_cached_outputs 参数，影响引擎初始化配置。
vllm/model_executor/models/qwen3_asr.py（模块模型执行器；类别 source；类型 data-contract；符号 _process_audio_input）: 移除了音频处理函数中未使用的哈希和缓存参数，保持与同类模型的一致性。
vllm/model_executor/models/qwen3_omni_moe_thinker.py（模块模型执行器；类别 source；类型 data-contract；符号 _process_audio_input）: 移除了MoE变体模型中音频处理函数未使用的参数，统一清理模式。
vllm/v1/engine/llm_engine.py（模块引擎层；类别 source；类型 core-logic；符号 init）: 移除了 LLMEngine 构造函数中未使用的 use_cached_outputs 参数，与 AsyncLLM 保持同步。

关键符号：_process_audio_input, _process_video_input, init

关键源码片段

`vllm/model_executor/models/qwen2_5_omni_thinker.py`

移除了音频和视频处理函数中未使用的哈希和缓存参数，涉及多模态模型的核心数据处理逻辑。

def _process_audio_input(
    self,
    audio_input: Qwen2_5OmniAudioFeatureInputs,
    # 移除了未使用的参数：audio_hashes 和 cached_audio_features
    # 这些参数在函数体内从未被引用，因此安全删除
) -> torch.Tensor:
    input_features = audio_input["input_features"]
    audio_feature_lengths = audio_input["audio_feature_lengths"]
    # ... 剩余处理逻辑保持不变
    audio_feat_lengths, audio_output_lengths = (
        self.audio_tower._get_feat_extract_output_lengths(audio_feature_lengths)
    )
    audio_outputs = self.audio_tower(
        input_features.to(self.audio_tower.dtype),
        feature_lens=audio_feature_lengths,
        aftercnn_lens=audio_feat_lengths,
    )
    return audio_outputs.last_hidden_state.split(audio_output_lengths.tolist())

`vllm/v1/engine/async_llm.py`

移除了 AsyncLLM 构造函数中未使用的 use_cached_outputs 参数，影响引擎初始化配置。

def __init__(
    self,
    vllm_config: VllmConfig,
    executor_class: type[Executor],
    log_stats: bool,
    usage_context: UsageContext = UsageContext.ENGINE_CONTEXT,
    mm_registry: MultiModalRegistry = MULTIMODAL_REGISTRY,
    # 移除了未使用的参数：use_cached_outputs
    # 该参数在构造函数中未被使用，因此安全删除
    log_requests: bool = True,
    start_engine_loop: bool = True,
    stat_loggers: list[StatLoggerFactory] | None = None,
    aggregate_engine_logging: bool = False,
    client_addresses: dict[str, str] | None = None,
    client_count: int = 1,
    client_index: int = 0,
) -> None:
    # ... 初始化逻辑保持不变
    self.vllm_config = vllm_config
    self.model_config = vllm_config.model_config
    self.observability_config = vllm_config.observability_config
    # ...

评论区精华

未使用参数的历史原因 (question): 参数被确认为未使用，因此移除。

风险与影响

风险：技术风险极低：
- 回归风险：移除的参数在代码中未被实际使用，因此不会影响现有功能。但需确保没有其他代码（如子类覆盖、动态调用）依赖这些参数签名。从上下文看，这些是私有方法或内部构造函数，外部依赖可能性小。
- 兼容性风险：由于是移除未使用的参数，不改变现有API的行为，因此向后兼容。但若下游用户代码（可能性极低）传入了这些参数，调用将失败。鉴于这些是内部模块，风险可控。
- 性能与安全：无影响。
影响：影响范围有限：
- 对用户：无直接影响，因为变更涉及内部模型处理和引擎初始化接口，不暴露给外部API。
- 对系统：简化了代码结构，减少了不必要的参数传递，可能轻微提升代码可读性和维护性。
- 对团队：清理了技术债务，为后续开发提供了更干净的代码库。
- 风险标记：接口签名变更, 潜在下游依赖

关联脉络

PR #40445 [MM][CG] Optimize default max_frames_per_batch auto-infer for ViT CUDA graph video inference: 同样涉及多模态模型（Qwen）的优化和清理，但本PR是参数清理而非功能优化。
PR #37114 [Bugfix] LoRA: extend expert base_layer loading to Qwen3.5 and Step3.x: 都涉及Qwen模型系列的代码调整，但本PR是重构而非bug修复。

#39750 [Refactor] Remove unused param

执行摘要

移除多个 Qwen 模型和引擎类中未使用的缓存与哈希参数。

实现拆解

评论区精华

风险与影响

关联 Issue

未识别关联 Issue

完整报告

参与讨论