verl-project/verl

verl: Volcano Engine Reinforcement Learning for LLMs

Monitoring: enabled · Last sync: 2026-04-18 19:13 · Sync status: idle · Next scheduled sync: 2026-04-18 20:13

PR list

Merged 143 · Analyzed 143

2026-03-26

#5748 [ci] chore: delete install current repository for npu ci

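Author: yyyy2000 · Merged: 2026-03-26 02:43

Infrastructure · Importance 4.00 · Insight 3.00

Optimizes the NPU CI pipeline by removing the dependency installation step and adding a local mirror configuration to speed up execution.

Skim this PR, focusing on the added mirror configuration and the changed install commands, to understand the CI optimization strategy. Engineers should note the implicit assumptions in the documentation and install dependencies manually when needed. For technical managers, the change reflects a common approach to CI performance optimization, though the risk needs monitoring.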

Tags: ci, npu, misc

2026-03-25

#5742 [ckpt] fix: handle string task_type in LoRA model merger

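Author: FrankHo-Hwc · Merged: 2026-03-25 19:32

Bug fix · Importance 4.00 · Insight 4.00

Fixes an AttributeError in the LoRA model merger triggered by a string task_type.

This PR is worth a close read because it shows how to handle type checking and error handling in a compatibility fix. Pay particular attention to the review discussion of falsy-value behavior, which is a useful reference for design decisions in similar cases.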

Tags: lora
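
To illustrate the kind of fix #5742 describes, here is a minimal sketch. It assumes the AttributeError came from treating task_type as an enum member when it may also arrive as a plain string. `TaskType` and `normalize_task_type` are hypothetical names chosen for illustration and are not taken from verl's code.

```python
from enum import Enum
from typing import Optional, Union


class TaskType(Enum):
    # Hypothetical stand-in for the enum a LoRA config might use.
    CAUSAL_LM = "CAUSAL_LM"
    SEQ_CLS = "SEQ_CLS"


def normalize_task_type(task_type: Union[TaskType, str, None]) -> Optional[str]:
    """Return task_type as a plain string, or None when it is unset."""
    # Explicit None check: an empty string is falsy but is not "unset".
    if task_type is None:
        return None
    # Enum members carry the string in .value.
    if isinstance(task_type, Enum):
        return str(task_type.value)
    # Plain strings have no .value attribute, so pass them through.
    if isinstance(task_type, str):
        return task_type
    raise TypeError(
        f"unsupported task_type: {task_type!r} ({type(task_type).__name__})"
    )
```

Checking `is None` rather than truthiness keeps an empty string distinguishable from a missing value, which is one possible answer to a falsy-value question of the sort the review raised. Whether the merged PR made the same choice is not stated here.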

#5740 [misc] fix: supplement the dependencies that are missing in the requirements-npu.txt

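Author: nuerxiati · Merged: 2026-03-25 14:07

Bug fix · Importance 4.00 · Insight 3.00

Fixes missing NPU dependencies and adjusts checkpoint engine parameters to resolve a large-weight error.

Skim this PR, checking whether the dependency updates and parameter adjustments are reasonable. Engineers should note the review comment that the fix is incomplete and consider whether it needs to be extended to other NPU scripts.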

Tags: misc, deps

#5723 [1/2][rollout,trainer] refactor: Teacher colocate mode -- Move teacher logprob computation to `AsyncTeacherLLMServerManager`

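Author: JacobHelwig · Merged: 2026-03-25 10:14

Refactor · Importance 6.00 · Insight 5.00

Refactors teacher-model log-probability computation by moving it into a dedicated manager to improve modularity.

Technical managers and engineers should read this PR closely, paying attention to design decisions such as separation of concerns, handling of circular dependencies, and the initialization-order fix. Focus on the new class implemented in verl/experimental/teacher_loop/teacher_manager.py and the modified logic in agent_loop.py to understand the modularity gains and potential risks of the refactor.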

Tags: teacher, distillation

#5736 [ci] fix: fix circular import in ci

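Author: vermouth1992 · Merged: 2026-03-25 10:10

Bug fix · Importance 2.00 · Insight 2.00

Fixes a circular import in megatron_utils.py and improves CI testing.

The change is simple and worth a quick skim to see the circular-import fix pattern. The design decision is to use a local import to break the dependency cycle, which can serve as a reference for similar cases.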

Tags: ci, misc
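
The local-import pattern mentioned for #5736 can be sketched in a self-contained way. The two modules below, `cycle_demo_a` and `cycle_demo_b`, are hypothetical and are written to a temporary directory only for demonstration. They are not the modules touched by the PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module A: imports B at module level.
MODULE_A = '''
from cycle_demo_b import helper_b

def helper_a():
    return "a"

def use_b():
    return helper_b()
'''

# Hypothetical module B: needs A, but imports it inside the function,
# so the import runs at call time, after A has finished loading.
MODULE_B = '''
def helper_b():
    from cycle_demo_a import helper_a  # local import breaks the cycle
    return helper_a() + "b"
'''


def run_demo() -> str:
    """Write both modules to a temp directory, import A, and call through the cycle."""
    tmp_dir = Path(tempfile.mkdtemp())
    (tmp_dir / "cycle_demo_a.py").write_text(MODULE_A)
    (tmp_dir / "cycle_demo_b.py").write_text(MODULE_B)
    sys.path.insert(0, str(tmp_dir))
    try:
        module_a = importlib.import_module("cycle_demo_a")
        return module_a.use_b()
    finally:
        sys.path.remove(str(tmp_dir))
```

If module B instead imported `helper_a` at module level, importing A would fail with an ImportError, because B would run while A is only partially initialized and `helper_a` is not yet defined. Deferring the import to call time avoids that, at the cost of hiding the dependency inside the function body.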

#5735 [misc] fix: make the assert user-friendly for `get_tensordict`

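Author: stas00 · Merged: 2026-03-25 08:58

Bug fix · Importance 4.00 · Insight 3.00

Adds a detailed assertion error message to the `get_tensordict` function to improve the debugging experience.

The change is simple, but it illustrates maintaining code-style consistency and a latent design decision (assert vs. exception). Developers should value small fixes like this for code quality, and be aware of the risks of relying on assert in production.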

Tags: misc
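
The assert-versus-exception trade-off noted for #5735 can be shown with a small sketch. The `get_field_assert` and `get_field_raise` functions are hypothetical helpers operating on a plain dict. They do not reproduce the signature or behavior of verl's `get_tensordict`.

```python
def get_field_assert(batch: dict, key: str):
    """Look up a key, guarding with an assert that carries a detailed message."""
    # Helpful while debugging, but the check is removed entirely
    # when Python runs with the -O flag.
    assert key in batch, f"key {key!r} not found; available keys: {sorted(batch)}"
    return batch[key]


def get_field_raise(batch: dict, key: str):
    """Look up a key, guarding with an explicit exception that is always enforced."""
    if key not in batch:
        raise KeyError(f"key {key!r} not found; available keys: {sorted(batch)}")
    return batch[key]
```

Both variants give the caller the same diagnostic text. The difference is that the assert version stops validating under `python -O`, which is the production risk the recommendation refers to.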

2026-03-24

#5732 [model] chore: Corrected the description of errors related to the 235b script and fixed the error in running the sft script.

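Author: autbuster · Merged: 2026-03-24 23:06

Bug fix · Importance 4.00 · Insight 3.00

Fixes configuration errors in two training scripts, including a risky parameter that could bypass data validation.

Note the risk of `data.ignore_input_ids_mismatch=True` in the sft script. The code change is simple, but the discussion surfaces an important data-validation issue, and developers should understand the trade-offs involved.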

Tags: misc, examples

#5733 [model] fix: An end-to-end script for the 235b model is provided for the 256k long sequence

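Author: autbuster · Merged: 2026-03-24 23:04

Bug fix · Importance 6.00 · Insight 5.00

Adds an end-to-end script for the Qwen3-235B model with 256k long sequences and fixes related configuration errors.

Follow this PR to learn how to write robust training scripts and manage configuration, especially best practices for long sequences and distributed training. The bug fixes and data-validation warnings raised in the review comments are worth borrowing and help improve script quality.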

Tags: examples, misc, model
