Tag aggregation

verl-project/verl · Tag view

Tag list

misc · 79

trainer · 64

rollout · 35

megatron · 32

npu · 31

ci · 27

model · 24

perf · 17

vllm · 16

doc · 14

examples · 14

fsdp · 12

config · 11

worker · 10

docker · 9

distillation · 7

experimental · 7

fully_async · 7

lora · 6

algo · 5

deps · 5

quantization · 5

sglang · 5

tool · 5

ckpt · 4

diffusion · 4

reward · 4

agent_loop · 3

trtllm · 3

veomni · 3

data · 2

teacher · 2

one_step_off · 1

transferqueue · 1

Aggregated results

PRs tagged config

2026-04-08

#5718 [ckpt, trainer] feat: Add plugin hooks for custom CheckpointEngineManager and CheckpointEngine

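Author: NaomiEisen · Merged: 2026-04-08 13:50

Feature · Importance 6.00 · Insight 6.00

Adds plugin hooks for the checkpoint engine, supporting a custom weight-sync backend manager and the import of backend modules.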

ckpt trainer config misc

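This PR is worth a close read, particularly for its plugin-hook design pattern and its security considerations. Engineers can study how it reuses the existing `agent_loop_manager_class` hook pattern, and should review the use of `import_external_libs` to assess the security risk.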
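
To make the security concern concrete, here is a minimal sketch of a plugin hook that resolves a class from a dotted path and refuses modules outside an allowlist. It is not verl's implementation: `load_plugin_class` and `ALLOWED_PREFIXES` are hypothetical names, and the standard-library `collections` package stands in for a real plugin package.

```python
import importlib

# Hypothetical allowlist of package prefixes that plugin classes may come from.
ALLOWED_PREFIXES = ("collections",)


def load_plugin_class(dotted_path, allowed_prefixes=ALLOWED_PREFIXES):
    """Resolve 'package.module.ClassName' to a class, refusing unknown packages."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    # Check the allowlist before importing: importing a module runs its code.
    if not any(
        module_name == prefix or module_name.startswith(prefix + ".")
        for prefix in allowed_prefixes
    ):
        raise PermissionError(f"module {module_name!r} is not on the allowlist")
    module = importlib.import_module(module_name)
    obj = getattr(module, class_name)
    if not isinstance(obj, type):
        raise TypeError(f"{dotted_path!r} does not name a class")
    return obj
```

Performing the allowlist check before `importlib.import_module` matters because importing an arbitrary module already executes its top-level code.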

2026-04-07

#5885 [cfg] fix: sync strategy from ActorConfig/CriticConfig to EngineConfig

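Author: yifannnwu · Merged: 2026-04-07 10:46

Bug fix · Importance 5.00 · Insight 4.00

Fixes the `strategy` field in the FSDP Actor/Critic configs not being synced to EngineConfig, which caused FSDP2 training to fall back to FSDP1.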

trainer config fsdp misc

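This PR is worth a close read because it exposes a field-synchronization problem between the config layer and the engine layer, in particular the design decision to use `object.__setattr__` to bypass BaseConfig's freeze logic. Points to examine: why only `strategy` is synced while the suggestion to also sync `ulysses_sequence_parallel_size` was not adopted, and how the FSDP1/FSDP2 backend is selected.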
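
The `object.__setattr__` technique can be illustrated with a minimal sketch. It uses standard-library frozen dataclasses as a stand-in for verl's BaseConfig, whose actual freeze logic is not shown in this digest; the class and field layout below is an assumption made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EngineConfig:
    # Stand-in for the engine-level config; defaults to the older backend.
    strategy: str = "fsdp"


@dataclass(frozen=True)
class ActorConfig:
    # Stand-in for the actor-level config that the user actually sets.
    strategy: str = "fsdp"
    engine: EngineConfig = field(default_factory=EngineConfig)

    def __post_init__(self):
        # Plain assignment on a frozen dataclass raises FrozenInstanceError,
        # so the sync goes through object.__setattr__ to bypass the guard.
        if self.engine.strategy != self.strategy:
            object.__setattr__(self.engine, "strategy", self.strategy)
```

Without the sync in `__post_init__`, `ActorConfig(strategy="fsdp2").engine.strategy` would silently stay at the default, which is the kind of fallback the PR summary describes.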

2026-04-03

#5870 [megatron] fix: support critic model

Author: wuxibin89 · Merged: 2026-04-03 22:07

Bug fix · Importance 6.00 · Insight 6.00

Fixes Megatron critic model configuration and training issues, and unifies the configuration under HFModelConfig.

megatron trainer config model

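Technical managers and engineers should read this PR closely, focusing on how the decision to unify configuration simplifies the system architecture, the details of the critic warmup fix, and the key technical trade-offs in the Megatron engine. Users should check and update their existing scripts to avoid configuration incompatibilities.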

#5874 [megatron, cfg] feat: add Qwen3.5-122B Megatron launch script

Author: none0663 · Merged: 2026-04-03 14:20

Feature · Importance 5.00 · Insight 5.00

Adds a Qwen3.5-122B Megatron launch script that supports large-scale GRPO training on 32 GPUs.

megatron config examples trainer

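This PR is a useful reference for engineers who need to run Qwen3.5-122B or similarly large models, especially for its Megatron parallelism settings (such as TP, PP, and CP) and its model-specific limitations (such as the GDN attention format). Reading the configuration comments in the script closely shows the architectural trade-offs and future optimization directions.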
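
As a worked illustration of how TP, PP, and CP settings interact with a fixed GPU count, the sketch below computes the data-parallel size left over once the model-parallel dimensions are chosen. The helper name and the TP/PP/CP values in the usage note are assumptions for illustration, not values taken from the launch script.

```python
def data_parallel_size(world_size, tp, pp, cp=1):
    """Return the data-parallel size implied by world_size = tp * pp * cp * dp."""
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel
```

For example, with 32 GPUs and hypothetical settings of TP=4, PP=4, CP=1, `data_parallel_size(32, tp=4, pp=4)` gives 2 data-parallel replicas.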

2026-04-02

#5848 [cfg] refactor: unify ppo_trainer and ppo_megatron_trainer config

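Author: wuxibin89 · Merged: 2026-04-02 22:58

Refactor · Importance 6.00 · Insight 5.00

Unifies the PPO trainer configuration, replacing the separate Megatron config file with a `model_engine` parameter.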

trainer config megatron lora

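This PR is worth a close read because it is a major refactor of the configuration system, involving design decisions such as the use of the `model_engine` parameter and config layering. Pay attention to the risks raised in review, check for a config migration guide or documentation updates, and verify that Megatron workflows remain compatible.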

2026-03-27

#5556 [rollout, tool] feat: support multi-step in skip_rollout v2

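Author: zyang6 · Merged: 2026-03-27 11:44

Feature · Importance 6.00 · Insight 6.00

Extends skip rollout to V2, supporting multi-step data caching and three reuse strategies to speed up RL training.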

rollout trainer config

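This PR is worth a close read, focusing on the design decisions in the RolloutSkip class (such as how the three action types are implemented and the step-tracking logic) and on the config migration strategy. Engineers should review the security risks and the CACHE action logic, make sure a safe directory is configured in production, and consider alternative serialization approaches.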
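
The two recommendations above (a safe directory and an alternative serialization format) can be sketched as a small per-step cache. This is not the RolloutSkip implementation: `StepCache` is a hypothetical class, and it assumes the cached batch is JSON-serializable, which real rollout tensors are not without conversion.

```python
import json
from pathlib import Path


class StepCache:
    """Hypothetical per-step cache that stores JSON files under one root directory."""

    def __init__(self, root):
        self.root = Path(root).resolve()
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, step):
        # Only non-negative integers are accepted, so the file name cannot
        # contain path separators or '..' components.
        if not isinstance(step, int) or step < 0:
            raise ValueError(f"step must be a non-negative int, got {step!r}")
        return self.root / f"step_{step}.json"

    def save(self, step, batch):
        self._path(step).write_text(json.dumps(batch))

    def load(self, step):
        """Return the cached batch for this step, or None on a cache miss."""
        path = self._path(step)
        if not path.exists():
            return None
        return json.loads(path.read_text())
```

JSON is used here because loading it cannot execute code, unlike `pickle`; the trade-off is that only plain data types round-trip.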

2026-03-26

#5763 [doc] refactor: add constraints on the use of vpp and mbridge parameters

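Author: zjchenn · Merged: 2026-03-26 20:18

Docs · Importance 2.00 · Insight 1.00

Updates the Ascend backend documentation to state explicitly that the VPP and mbridge parameters are incompatible.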

misc megatron config

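This PR is a simple documentation update. Engineers can skim it to learn the new constraint, especially when using the Ascend backend with Megatron. It needs no deep technical reading, but it is worth noting to keep configurations correct.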

2026-03-24

#5722 [algo] feat: Implement IcePop in rollout correction

Author: HollowMan6 · Merged: 2026-03-24 20:49

Feature · Importance 6.00 · Insight 6.00

Implements the IcePop algorithm in rollout correction, reusing the threshold field to support range truncation.

algo config

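Technical managers and engineers should read this PR closely, focusing on the IcePop implementation details (such as the `_parse_rollout_is_threshold` parsing logic) and the config extension design (reusing a field to avoid a breaking change). These decisions illustrate compatibility trade-offs and modular design, and are worth borrowing when adding similar features.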
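
The idea of reusing one threshold field for both a single bound and a range can be sketched as follows. The actual format accepted by `_parse_rollout_is_threshold` is not shown in this digest, so the `"lower_upper"` string format, the function names, and the list-based weights are all assumptions; the sketch only illustrates a backward-compatible parse plus range-based masking of importance ratios.

```python
def parse_threshold(value):
    """Return (lower, upper); lower is None when only an upper bound is given."""
    if isinstance(value, (int, float)):
        # Legacy form: a bare number is an upper bound only.
        return None, float(value)
    # Assumed range form: "lower_upper", for example "0.5_2.0".
    lower, upper = (float(part) for part in str(value).split("_"))
    if lower > upper:
        raise ValueError(f"lower bound {lower} exceeds upper bound {upper}")
    return lower, upper


def apply_correction(ratios, threshold):
    """Clamp ratios to an upper bound, or zero out ratios outside a range."""
    lower, upper = parse_threshold(threshold)
    if lower is None:
        # Single bound: truncate each ratio at the upper bound.
        return [min(r, upper) for r in ratios]
    # Range: keep in-range ratios unchanged and mask the rest to zero.
    return [r if lower <= r <= upper else 0.0 for r in ratios]
```

Because a bare number keeps its old meaning, existing configs continue to work, while the range form opts into the masking behavior.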

Page 1 of 2 · 11 entries in total