#38844 [Gemma4][Bugfix]: Enable Gemma4ForCasualLM to load lora adapters correctly

vllm-project/vllm · Author: ShubyM · Merged: 2026-04-11 17:06


Files changed: 2 · Commits: 7 · Comments: 7

Lines changed: +40 / -0

bugfix model v1 lora

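Executive Summary

Fixes a name-mapping problem that kept Gemma4ForCausalLM from loading LoRA adapters correctly, restoring compatibility.

The PR body states: 'Gemma4ForConditionalGeneration names the text stack under model.language_model., while the text-only Gemma4ForCausalLM path exposes the same layers under model.. That means LoRA adapters produced against the conditional wrapper naming do not line up when we load them against Gemma4ForCausalLM'. A mapper is therefore needed to align the LoRA keys.

Engineers working on Gemma4 models or on the LoRA loading path will find this PR worth a close read for its weight-mapping design. Focus on the implementation of hf_to_vllm_mapper and on how WeightsMapper reconciles the naming conventions of different model classes.

Discussion Highlights

During review, gemini-code-assist[bot] pointed out that the initial mapper might be incomplete because it lacked mappings for the MoE components, and suggested using orig_to_new_substr to handle paths more robustly. The author, ShubyM, replied that per_expert_scale is not a LoRA target and therefore needs no mapping, and added the MoE mappings through orig_to_new_substr. Reviewer jeejeelee then approved the PR, indicating the discussion was resolved.

Implementation Breakdown

The change touches two files: 1) In vllm/model_executor/models/gemma4.py, the Gemma4ForCausalLM class gains an hf_to_vllm_mapper, a WeightsMapper whose orig_to_new_prefix maps 'model.language_model.' to 'model.' and whose orig_to_new_substr handles MoE component paths (for example '.moe.experts.gate_up_proj' to '.moe.gate_up_proj'). 2) tests/lora/test_lora_checkpoints.py gains two unit tests that verify the LoRA weight mapping for regular layers and for MoE layers.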

File	Module	Status	Importance
`vllm/model_executor/models/gemma4.py`	model_executor	modified	8.0
`tests/lora/test_lora_checkpoints.py`	tests	modified	6.0


Key Symbols

Gemma4ForCausalLM.hf_to_vllm_mapper test_gemma4_lora_weights_mapping test_gemma4_moe_lora_weights_mapping

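Review Highlights

Completeness of the MoE component mapping · Design

In review, gemini-code-assist[bot] pointed out that hf_to_vllm_mapper might be missing mappings for MoE components (such as '.router.per_expert_scale' to '.moe.per_expert_scale') and suggested using orig_to_new_substr.

Conclusion: the author, ShubyM, replied that per_expert_scale is not a LoRA target and therefore needs no mapping, and added mappings for the MoE components through orig_to_new_substr (such as '.moe.experts.gate_up_proj' to '.moe.gate_up_proj'), which settled the issue. · Resolved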

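Risk and Impact

Risk is low: the mapping logic is simple and scoped to one model. However, edge-case paths that the rules do not cover, or future changes to the model structure, could cause LoRA loading to fail. The tests cover the main scenarios but depend on the mapping rules being correct; a wrong rule could lead to weight-loading errors or abnormal model behavior.

The change directly affects users of Gemma4 models with LoRA adapters by fixing the loading problem and restoring compatibility. It is confined to the model loading layer and does not affect other parts of the system. The scope is moderate: it targets one model-specific feature, but it improves the usability of the LoRA ecosystem.

Model-specific fix · Depends on mapping correctness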

Linked Issues

No linked issue identified


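Full Report

Executive Summary

This PR fixes the failure of Gemma4ForCausalLM to load LoRA adapters, which was caused by inconsistent module path naming. It adds a weight mapper (hf_to_vllm_mapper) that aligns the key names. The impact is limited to LoRA on Gemma4 models, the risk is low, and the change is covered by tests. Engineers working in this area should look at the mapping design.

Purpose and Motivation

The PR addresses the difference in module path naming between two Gemma4 variants, Gemma4ForConditionalGeneration and Gemma4ForCausalLM. As the PR body explains, Gemma4ForConditionalGeneration uses model.language_model.* paths while the text-only Gemma4ForCausalLM uses model.* paths, so LoRA adapters trained against the former cannot be loaded correctly on the latter. A mapping mechanism is therefore needed to rename the LoRA keys.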
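To make the mismatch concrete, the following minimal sketch compares two sets of module paths. The paths are illustrative assumptions chosen to match the naming described in the PR body; they are not copied from a real checkpoint, and the snippet does not use any vLLM API.

```python
# Illustrative module paths only; they are not copied from a real checkpoint.
# Naming used by adapters produced against Gemma4ForConditionalGeneration.
adapter_modules = {
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.self_attn.v_proj",
}
# Naming exposed by the text-only Gemma4ForCausalLM path.
causal_lm_modules = {
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.v_proj",
}

# Without any renaming, none of the adapter's modules exist in the model.
matched = adapter_modules & causal_lm_modules
print(f"matched without renaming: {len(matched)}")

# The only difference is the extra "language_model." segment after "model.".
extra = "language_model."
differs_only_by_prefix = all(
    name.replace(extra, "", 1) in causal_lm_modules for name in adapter_modules
)
print(f"differs only by prefix: {differs_only_by_prefix}")
```

Because the two naming schemes differ only by that one path segment, a rename rule is enough and no change to the adapter weights themselves is required.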

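Implementation Breakdown

The key changes are in two files:

vllm/model_executor/models/gemma4.py: adds an hf_to_vllm_mapper attribute to the Gemma4ForCausalLM class, using WeightsMapper to define the mapping rules.
- orig_to_new_prefix: maps "model.language_model." to "model.".
- orig_to_new_substr: handles MoE components, for example mapping ".moe.experts.gate_up_proj" to ".moe.gate_up_proj".
tests/lora/test_lora_checkpoints.py: adds two test functions:
- test_gemma4_lora_weights_mapping: verifies the mapping for regular layers.
- test_gemma4_moe_lora_weights_mapping: verifies the mapping for MoE layers.

Code example (from gemma4.py):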

```python
hf_to_vllm_mapper = WeightsMapper(
    orig_to_new_prefix={
        "model.language_model.": "model.",
    },
    orig_to_new_substr={
        ".moe.experts.gate_up_proj": ".moe.gate_up_proj",
        ".moe.experts.down_proj": ".moe.down_proj",
    },
)
```
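The effect of these rules on individual names can be sketched in plain Python. The `remap_name` helper below is a hypothetical stand-in written for this report, not vLLM's WeightsMapper; it assumes that substring rules replace the first occurrence and that prefix rules apply only at the start of a name. The example paths are also illustrative.

```python
from typing import Mapping

# Rules copied from the excerpt above.
PREFIX_RULES = {"model.language_model.": "model."}
SUBSTR_RULES = {
    ".moe.experts.gate_up_proj": ".moe.gate_up_proj",
    ".moe.experts.down_proj": ".moe.down_proj",
}


def remap_name(
    name: str,
    prefix_rules: Mapping[str, str] = PREFIX_RULES,
    substr_rules: Mapping[str, str] = SUBSTR_RULES,
) -> str:
    """Hypothetical stand-in for the renaming; not vLLM's implementation."""
    # Substring rules: rewrite the first occurrence anywhere in the name.
    for old, new in substr_rules.items():
        if old in name:
            name = name.replace(old, new, 1)
    # Prefix rules: rewrite only when the name starts with the old prefix.
    for old, new in prefix_rules.items():
        if name.startswith(old):
            name = new + name[len(old):]
    return name


# Illustrative paths, not taken from a real adapter.
examples = [
    "model.language_model.layers.0.self_attn.q_proj",
    "model.language_model.layers.0.moe.experts.gate_up_proj",
    "model.layers.0.moe.experts.down_proj",
]
remapped = [remap_name(n) for n in examples]
for before, after in zip(examples, remapped):
    print(f"{before} -> {after}")
```

Names that already follow the model.* convention pass through the prefix rule unchanged, so the sketch treats adapters in either naming scheme the same way.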

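Review Highlights

The review discussion centered on the completeness of the mapping:

gemini-code-assist[bot]: pointed out that the initial mapper might be missing MoE component mappings and suggested using orig_to_new_substr to handle paths more robustly.
ShubyM (author): replied that per_expert_scale is not a LoRA target and therefore needs no mapping, and added the MoE mappings.
Reviewer jeejeelee then approved the PR, indicating the discussion was resolved.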

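Risk and Impact

Risk: the mapping logic is simple, but future changes to the model structure or paths not covered by the rules could cause LoRA loading to fail. The tests cover the main scenarios but depend on the mapping rules being correct.
Impact: directly affects users of Gemma4 with LoRA by fixing the loading problem and improving compatibility. Other parts of the system are unaffected; this is a model-specific fix.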
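One way to guard against uncovered paths is a check for names that still carry a pattern the rules were meant to rewrite. The sketch below is a heuristic written for this report and is not part of the PR; `STALE_PATTERNS`, `find_stale_names`, and the sample names are all hypothetical.

```python
# Patterns that should no longer appear once the renaming has been applied.
# The tuple mirrors the rules quoted in this report; extend it if they change.
STALE_PATTERNS = ("language_model.", ".moe.experts.")


def find_stale_names(names):
    """Return names that still contain a pattern the mapper should have rewritten."""
    return [n for n in names if any(p in n for p in STALE_PATTERNS)]


# Hypothetical post-mapping names; the last one simulates a path the rules missed.
mapped_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.moe.gate_up_proj",
    "model.layers.1.moe.experts.up_proj",
]
stale = find_stale_names(mapped_names)
print(stale)
```

A non-empty result would point at a path the rules do not cover yet, which is the failure mode described in the risk note above.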

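Related Context

Looking at the PR history, related PRs such as #39450 (adding Gemma4 Eagle3 support) also modify the Gemma4 model, which suggests the repository is steadily extending Gemma4 functionality. As a bug fix, this PR fills a gap in LoRA support and is consistent with the recent direction of model feature work.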
