#40715 [BE][Bugfix] Respect TORCH_COMPILE_DISABLE env var at the vLLM config level for torch 2.12

Original PR · Author: Lucaskabela · Merged: 2026-04-25 07:25 · Files changed: 2 · Commits: 2 · Comments: 4 · Lines: +23 / -0

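Executive summary

Fixes vLLM's config layer not respecting the TORCH_COMPILE_DISABLE environment variable.

PyTorch 2.12 merged pytorch/pytorch#177809, which adds a runtime check that raises an error when torch.compile is called with fullgraph=True and no frames end up compiled. As a result, with TORCH_COMPILE_DISABLE=1 set, vLLM still called torch.compile(..., fullgraph=True) and hit a RuntimeError. The linked PyTorch Issue #181247 and vLLM's CPU compatibility CI tests (for example under Intel SDE) failed for this reason. The author summarized in a comment: 'at the very least it seems when compile is disabled we should not be running torch.compile on models'.

This PR is a typical case of a downstream project adapting to a behavior change in an upstream dependency, and is worth attention. Although the change is small, it shows a quick response to upstream PyTorch changes and sound judgment about correctness. The strict == "1" check for TORCH_COMPILE_DISABLE is good practice and worth applying to similar environment variable handling elsewhere. Recommendation: merge.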

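Discussion highlights

The discussion focused on how exactly the environment variable should be checked.

High-priority comment from gemini-code-assist[bot]: the initial code, os.environ.get("TORCH_COMPILE_DISABLE", "0") != "0", was too broad and inconsistent with PyTorch, which internally checks only for "1". It could cause vLLM to wrongly disable compilation when the variable is set to "false" or to an empty string. The bot suggested a strict == "1" check, and the final code adopted this suggestion.
Question from zou3519: asked for the concrete context in which the problem was hit. Lucaskabela replied that it came from the CI test scenario in PyTorch Issue #181247, and stressed that torch.compile should not be called at all when compilation is disabled.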

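Implementation breakdown

Add an environment variable check to vLLM's config post-processing: in the __post_init__ method in vllm/config/vllm.py, a new block is inserted after the enforce_eager check. If the environment variable TORCH_COMPILE_DISABLE equals '1', it logs a warning and sets self.compilation_config.mode to CompilationMode.NONE. This ensures the compile wrapper is never instantiated.
Add a test case: test_torch_compile_disable is added to tests/compile/test_config.py. The test uses monkeypatch to set VLLM_ENABLE_V1_MULTIPROCESSING=0, TORCH_COMPILE_DISABLE=1, and VLLM_DISABLE_COMPILE_CACHE=1, loads a model with vllm_runner, and uses compilation_counter.expect to assert that no compilation was triggered (num_graphs_seen=0, stock_torch_compile_count=0). It is marked with @pytest.mark.forked to avoid environment variable conflicts with other tests.
Tighten the value check after review: the initial implementation used os.environ.get("TORCH_COMPILE_DISABLE", "0") != "0". After gemini-code-assist[bot] flagged it, it was changed to == "1", consistent with PyTorch, so that "false", "no", or an empty string are not mistaken for a disable request.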
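The ordering described above (the enforce_eager check first, then the environment variable) can be sketched in isolation. This is only an illustration under stated assumptions: `ConfigSketch`, `CompilationConfig`, the reduced `CompilationMode` enum, and `resolve_mode` are hypothetical stand-ins written for this example, not vLLM's actual classes.

```python
import enum
import logging
import os
from dataclasses import dataclass, field
from unittest import mock

logger = logging.getLogger("config_sketch")


class CompilationMode(enum.Enum):
    # Hypothetical stand-in: only the two members this sketch needs.
    NONE = enum.auto()
    VLLM_COMPILE = enum.auto()


@dataclass
class CompilationConfig:
    # Hypothetical stand-in for the real compilation config.
    mode: CompilationMode = CompilationMode.VLLM_COMPILE


@dataclass
class ConfigSketch:
    enforce_eager: bool = False
    compilation_config: CompilationConfig = field(default_factory=CompilationConfig)

    def __post_init__(self) -> None:
        # The explicit user setting is handled first.
        if self.enforce_eager:
            self.compilation_config.mode = CompilationMode.NONE
        # Then the environment variable: only the exact string "1" disables.
        if os.environ.get("TORCH_COMPILE_DISABLE") == "1":
            logger.warning("TORCH_COMPILE_DISABLE is set, disabling torch.compile.")
            self.compilation_config.mode = CompilationMode.NONE


def resolve_mode(env_value, enforce_eager=False):
    """Build a config with TORCH_COMPILE_DISABLE temporarily set to env_value."""
    # patch.dict restores os.environ to its prior state when the block exits.
    with mock.patch.dict(os.environ):
        os.environ.pop("TORCH_COMPILE_DISABLE", None)
        if env_value is not None:
            os.environ["TORCH_COMPILE_DISABLE"] = env_value
        return ConfigSketch(enforce_eager=enforce_eager).compilation_config.mode
```

In this sketch, `resolve_mode("1")` yields `CompilationMode.NONE`, while an unset variable or a value such as "true" leaves the default mode in place. Resolving the mode at config time, rather than at the call site, is what keeps any compile wrapper from being constructed at all.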

File	Module	Status	Importance
`vllm/config/vllm.py`	Config layer	modified	5.8
`tests/compile/test_config.py`	Config layer	modified	5.22

Key symbols

test_torch_compile_disable

Key source snippets

vllm/config/vllm.py core-logic

Adds the environment variable check to VllmConfig's __post_init__ method; this is the core of the fix.

# vllm/config/vllm.py (excerpt from __post_init__)

        # Handle enforce_eager: an explicit user setting
        if self.model_config is not None and self.model_config.enforce_eager:
            logger.warning(
                "Enforce eager set, disabling torch.compile and CUDAGraphs. "
                "This is equivalent to setting -cc.mode=none -cc.cudagraph_mode=none"
            )
            self.compilation_config.mode = CompilationMode.NONE
            self.compilation_config.cudagraph_mode = CUDAGraphMode.NONE

        # New: respect the TORCH_COMPILE_DISABLE env var (strict match on "1").
        # Works around the PyTorch 2.12 check that raises when fullgraph=True
        # is used and no frames are compiled.
        # See: https://github.com/pytorch/pytorch/issues/181247
        if os.environ.get("TORCH_COMPILE_DISABLE") == "1":
            logger.warning(
                "TORCH_COMPILE_DISABLE is set, disabling torch.compile. "
                "This is equivalent to setting -cc.mode=none"
            )
            self.compilation_config.mode = CompilationMode.NONE

        # Existing backend / mode check, unchanged
        if self.compilation_config.backend == "eager" or (
            self.compilation_config.mode is not None
            and self.compilation_config.mode != CompilationMode.VLLM_COMPILE
        ):
            logger.warning(
                "Inductor compilation was disabled by user settings, "
                "optimizations settings that are only active during "
                "inductor compilation will be ignored."
            )

        # ... remaining initialization logic

tests/compile/test_config.py test-coverage

Adds the test_torch_compile_disable test case, which verifies that compilation is disabled when TORCH_COMPILE_DISABLE=1.

# tests/compile/test_config.py (new test function)

# Forked to isolate environment variables from other tests.
# Related issue: https://github.com/vllm-project/vllm/issues/21073
@pytest.mark.forked
def test_torch_compile_disable(vllm_runner, monkeypatch):
    # Disable multiprocessing so the compilation counter lives in this process
    monkeypatch.setenv("VLLM_ENABLE_V1_MULTIPROCESSING", "0")
    # The variable under test: disable torch.compile
    monkeypatch.setenv("TORCH_COMPILE_DISABLE", "1")
    # Disable the compile cache so every run starts from a clean state
    monkeypatch.setenv("VLLM_DISABLE_COMPILE_CACHE", "1")

    with (
        # Expect no compilation at all (0 graphs, 0 stock torch.compile calls)
        compilation_counter.expect(num_graphs_seen=0, stock_torch_compile_count=0),
        vllm_runner(
            "facebook/opt-125m",
            gpu_memory_utilization=0.4,
        ) as _,
    ):
        pass  # no compilation should happen while the model loads
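To make the `compilation_counter.expect(...)` assertion in the test easier to read, here is a minimal sketch of how a delta-checking counter context manager could work. `CounterSketch` is a hypothetical stand-in written for this example; it is assumed to resemble the real counter only in spirit, and the actual vLLM implementation may differ.

```python
import copy
import dataclasses
from contextlib import contextmanager


@dataclasses.dataclass
class CounterSketch:
    # Hypothetical counter with the two fields named in the test above.
    num_graphs_seen: int = 0
    stock_torch_compile_count: int = 0

    @contextmanager
    def expect(self, **expected_deltas):
        # Snapshot the counters, run the body, then compare the change.
        before = copy.copy(self)
        yield
        for name, expected in expected_deltas.items():
            actual = getattr(self, name) - getattr(before, name)
            assert actual == expected, (
                f"{name}: expected delta {expected}, got {actual}"
            )


counter = CounterSketch()

# Passes: nothing inside the block increments the counters.
with counter.expect(num_graphs_seen=0, stock_torch_compile_count=0):
    pass
```

Because the check compares deltas rather than absolute values, an expectation of 0 means "nothing was compiled inside this block", regardless of what ran earlier in the process.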

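Comment highlights

The environment variable check should strictly match '1' rather than test for not-'0' correctness

gemini-code-assist[bot] pointed out that the initial implementation, `os.environ.get("TORCH_COMPILE_DISABLE", "0") != "0"`, was too broad and inconsistent with PyTorch, which internally checks only for `'1'`; it could mistake `"false"`, `"no"`, or an empty string for a disable request.

Conclusion: the code was changed to use `== "1"`, aligning with PyTorch's behavior and removing the ambiguity. · Resolved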
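The difference between the two checks is easy to see by running both against the values mentioned in the review. The helper names `loose_check` and `strict_check` are made up for this illustration, and a plain dict stands in for `os.environ`.

```python
from typing import Mapping

VAR = "TORCH_COMPILE_DISABLE"


def loose_check(env: Mapping[str, str]) -> bool:
    # Initial implementation: anything other than "0" counts as "disabled".
    return env.get(VAR, "0") != "0"


def strict_check(env: Mapping[str, str]) -> bool:
    # Final implementation: only the exact string "1" counts as "disabled".
    return env.get(VAR) == "1"


if __name__ == "__main__":
    for value in (None, "0", "1", "false", "no", ""):
        env = {} if value is None else {VAR: value}
        print(f"{value!r:>8}  loose={loose_check(env)!s:<5}  strict={strict_check(env)}")
```

The two checks agree for an unset variable, "0", and "1". They disagree for "false", "no", and the empty string, where the loose check reports "disabled" and the strict check does not.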

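Concrete context in which the problem was hit question

zou3519 asked about the scenario and context that triggered the bug.

Conclusion: Lucaskabela replied that it was found in the CI tests described in PyTorch Issue #181247, and stressed that torch.compile should not be called when compilation is disabled. · Resolved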

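Risk and impact

Regression risk (low): the change only adds one conditional in __post_init__, the logic is simple, and a dedicated test covers it. Note that only the exact value '1' is honored: users who set TORCH_COMPILE_DISABLE to other values (such as 'true') expecting compilation to be disabled will see no effect at the vLLM config level and may need to adjust their scripts.
Compatibility risk (low): the change aligns vLLM with PyTorch's runtime check and avoids crashes from stricter compilation errors in future PyTorch versions.
Test isolation risk (low): the test uses monkeypatch and @pytest.mark.forked, so its environment variable changes are isolated and do not leak into other tests.

Scope: mainly affects setups that set TORCH_COMPILE_DISABLE=1, such as the CPU compatibility tests run under Intel SDE emulation. Users who do not set this variable are unaffected.
Severity: minor. The PR fixes a bug that caused startup failure under specific conditions (PyTorch 2.12 plus TORCH_COMPILE_DISABLE=1), improving robustness and compatibility with upstream PyTorch changes.
Team impact: removes a blocker for the torch 2.12 upgrade (related: vllm-project/vllm#40077), so the team can upgrade PyTorch in more environments.

Risk flag: environment variable semantics change

Related issues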

#181247 [vllm] [2.12 regression][CPU] torch.compile fullgraph=True raises "found no compiled frames" under Intel SDE (TORCH_COMPILE_DISABLE=1 not honored?)

Related context

PR #40077 Upgrade torch to 2.12.0: this PR is part of, or clears the way for, the torch 2.12 upgrade; it fixes a compatibility issue introduced by torch 2.12.
