# PR #5732 Full Report

- Repository: `verl-project/verl`
- Title: [model] chore: Corrected the description of errors related to the 235b script and fixed the error in running the sft script.
- Merged: 2026-03-24 23:06
- Original link: http://prhub.com.cn/verl-project/verl/pull/5732

---

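## Executive Summary
This PR fixes configuration errors in two training scripts (235b and sft) so that both run correctly. However, a parameter newly added to the sft script may bypass a critical data validation check and introduce training risk, so users should treat it with caution.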

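## Purpose and Motivation
The PR sets out to correct the erroneous description related to the 235b script and to fix a runtime error in the sft script. Judging from the PR body, it is not linked to a specific issue; it targets problems hit during actual execution so that the training workflow runs smoothly.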

## Implementation Breakdown
The change touches two shell scripts:
- `examples/grpo_trainer/run_qwen3_235b_megatron_npu.sh`: removes `actor_rollout_ref.rollout.enable_expert_parallel=True`, changes `actor_rollout_ref.rollout.enforce_eager=True` to `False`, and adds compilation settings such as `cudagraph_capture_sizes` and `cudagraph_mode`.
- `examples/sft/gsm8k/run_qwen3_8b_sft_peft_sp2_npu.sh`: adds the `data.messages_key=messages` and `data.ignore_input_ids_mismatch=True` parameters to resolve the execution error.
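The overrides listed above can be pictured as plain `key=value` arguments appended to each script's launch command. The following is a minimal sketch, not the actual script contents: the function names `sft_overrides` and `rollout_overrides` are hypothetical, the launcher command and all other arguments are omitted, and the compilation settings (`cudagraph_capture_sizes`, `cudagraph_mode`) are left out because the report does not give their full key paths or values.

```shell
#!/bin/sh
# Hypothetical helper: prints the two overrides this PR adds to the sft script.
sft_overrides() {
  printf '%s\n' \
    "data.messages_key=messages" \
    "data.ignore_input_ids_mismatch=True"
}

# Hypothetical helper: prints the rollout override this PR changes in the
# 235b script (enforce_eager flipped from True to False).
rollout_overrides() {
  printf '%s\n' \
    "actor_rollout_ref.rollout.enforce_eager=False"
}

# Print both groups so they can be compared against the real scripts.
sft_overrides
rollout_overrides
```

Keeping each group in its own function makes it easy to diff the intended overrides against what a script actually passes.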

## Review Highlights
In the review discussion, both gemini-code-assist[bot] and wucong25 warned about `data.ignore_input_ids_mismatch=True`:
> "Setting `data.ignore_input_ids_mismatch=True` silences a critical data processing validation check... By ignoring this mismatch, you are likely proceeding with incorrectly tokenized training data."

The PR was merged despite this warning, which suggests the flag may be a stopgap; the underlying tokenization inconsistency in the data remains unresolved.

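## Risks and Impact
- **Risk**: the `ignore_input_ids_mismatch` parameter in the sft script disables a data validation check, which may let incorrectly tokenized training data through and degrade model performance. The configuration adjustments in the 235b script carry lower risk and are meant to correct existing errors.
- **Impact**: with the fix, both scripts run correctly, which eases deployment. Users of the sft script should stay aware of the data risk; a follow-up fix or monitoring of training results may be needed.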
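One lightweight way to keep the sft risk visible is a pre-flight check that flags any launch script disabling the mismatch validation. This is a sketch under assumptions: the function name `warn_if_mismatch_ignored` is hypothetical and not part of verl, and the check only matches the literal override string in the script text.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of verl): returns 1 and prints a
# warning to stderr when the given script sets
# data.ignore_input_ids_mismatch=True; returns 0 otherwise.
warn_if_mismatch_ignored() {
  script_path="$1"
  if grep -q 'data\.ignore_input_ids_mismatch=True' "$script_path"; then
    echo "WARNING: $script_path sets data.ignore_input_ids_mismatch=True;" \
         "inspect tokenized samples before trusting training results." >&2
    return 1
  fi
  return 0
}
```

For example, calling `warn_if_mismatch_ignored examples/sft/gsm8k/run_qwen3_8b_sft_peft_sp2_npu.sh` before launching would surface the warning for the script changed in this PR.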

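## Related Context
Looking at the PR history, PR #5740 likewise involves fixes to NPU training scripts and dependency adjustments, which indicates ongoing work on NPU support in the repository. This PR continues that trend and focuses on correcting specific script errors. The risky parameter in the sft script, however, points to a deeper data processing problem that deserves follow-up attention.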