sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Monitoring: enabled · Last sync: 2026-06-07 11:28 · Sync status: idle · Next scheduled sync: 2026-06-07 12:28

PR list

2026-06-03

#26937 Add per-rank staggered weight loading for improved TP I/O concurrency

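Original PR · Author: power-more · Merged: 2026-06-03 11:25

Performance optimization · Importance 6.55 · Insight 5.00

TP weight-loading order and staggered I/O optimization

Read the staggering logic in `loader.py` closely, and confirm that the change in default behavior has been widely announced. Consider adding a loading test under `test/registered` that covers k=-1, 0, 1, 2 and similar cases so that regressions are caught.

performance · diffusion · infra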
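
As a rough illustration of what per-rank staggering can look like, the sketch below rotates a sorted list of weight shard files so that each tensor-parallel rank starts reading from a different file. It is an assumption-laden toy, not the code in `loader.py`: the function name `staggered_order`, the file names, and the rotation rule are all hypothetical, and the `k` parameter mentioned in the review is left out because its semantics are not described here.

```python
from typing import List


def staggered_order(files: List[str], rank: int, tp_size: int) -> List[str]:
    """Return the shard files in a rank-specific rotated order (hypothetical helper).

    Every rank still visits every file exactly once; only the starting
    point differs, so ranks are less likely to hit the same file at once.
    """
    if tp_size <= 0 or not 0 <= rank < tp_size:
        raise ValueError("rank must satisfy 0 <= rank < tp_size")
    ordered = sorted(files)  # deterministic base order shared by all ranks
    if not ordered:
        return []
    # Spread the starting offsets evenly across the file list.
    start = (rank * len(ordered)) // tp_size
    return ordered[start:] + ordered[:start]


# Hypothetical shard names, for illustration only.
shards = ["w-00003.bin", "w-00001.bin", "w-00004.bin", "w-00002.bin"]
order_rank0 = staggered_order(shards, rank=0, tp_size=2)
order_rank1 = staggered_order(shards, rank=1, tp_size=2)
```

A regression test in this style would check that every rank sees the same set of files and that the starting files differ between ranks.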

#26106 Support Command A plus

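Original PR · Author: zijiexia · Merged: 2026-06-03 11:23

Feature · Importance 9.18 · Insight 6.00

Inference support for the Cohere Command A Plus model

This PR deserves a close read, especially the custom Centered LayerNorm, the sigmoid top-k routing, the hybrid MoE backend dispatch strategy, and the state-machine design of the reasoning and tool-call parsers. It is a good reference pattern for developers who want to add support for new models. Follow-up test coverage is recommended.

feature · multimodal · moe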
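
For readers unfamiliar with sigmoid top-k routing, here is a minimal pure-Python sketch of the general technique: each expert logit is scored independently with a sigmoid (rather than a softmax over all experts), and the k highest-scoring experts are selected. The function name `sigmoid_topk_route` and the optional renormalization step are assumptions for illustration; the PR's actual routing code may differ, including in whether it renormalizes the selected weights.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_topk_route(
    logits: List[float], k: int, renormalize: bool = True
) -> Tuple[List[int], List[float]]:
    """Pick the k experts with the highest sigmoid scores (illustrative only)."""
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    scores = [sigmoid(v) for v in logits]
    # Indices of the k largest scores, highest first.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    weights = [scores[i] for i in top]
    if renormalize:
        # Assumption: rescale the selected weights so they sum to 1.
        total = sum(weights)
        weights = [w / total for w in weights]
    return top, weights


# Router logits for one token over four hypothetical experts.
experts, weights = sigmoid_topk_route([2.0, -1.0, 0.5, 3.0], k=2)
```

Because each score depends only on its own logit, adding or removing experts does not change the scores of the others, which is one reason this style of routing is used in some MoE designs.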

#27080 [diffusion] Fix LingBot realtime consistency GT pin

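Original PR · Author: mickqian · Merged: 2026-06-03 10:29

Bug fix · Importance 4.74 · Insight 3.00

Update the LingBot realtime consistency GT data and fix test cases

This PR is a routine test-data sync and cleanup with limited technical value. Maintainers of the related tests should check the associated PR #13 in the ci-data repository to make sure the GT data versions match.

diffusion · bugfix · test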

#27084 [diffusion] Optimize Cosmos3 i2v latent prep

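Original PR · Author: qimcis · Merged: 2026-06-03 10:23

Performance optimization · Importance 5.61 · Insight 4.00

Optimize Cosmos3 I2V latent preprocessing, cutting the stage's time by 70%

Recommended to merge. This is a clean, small optimization: the change is well scoped, the performance data is solid, and the risk is very low. Maintainers may want to check for similar redundant operations elsewhere (for example, I2V preprocessing in other diffusion models).

diffusion · performance · refactor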

#27086 [diffusion] Clamp WanVAE decode output in place

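Original PR · Author: mickqian · Merged: 2026-06-03 10:16

Refactor · Importance 4.89 · Insight 3.00

Clamp WanVAE decode output in place to reduce FP32 allocations

The change is simple but worth generalizing: similar post-processing clamp operations in other SGLang VAEs or generative models could also use the in-place version to reduce GPU memory overhead. Consider adding a guideline to the coding conventions: 'prefer in-place operations to avoid redundant allocations'.

diffusion · performance · refactor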
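
The difference between the two clamp styles can be sketched in a few lines of PyTorch. This is a generic illustration, not the WanVAE code: the tensor `decoded` and the `[-1, 1]` range are assumptions. `Tensor.clamp` returns a new tensor, while `Tensor.clamp_` overwrites the existing storage and so avoids the extra allocation.

```python
import torch

# Stand-in for a decoder output; values and range are illustrative only.
decoded = torch.tensor([-2.0, -0.5, 0.3, 1.7], dtype=torch.float32)

# Out-of-place: allocates a second tensor of the same size.
clamped_copy = decoded.clamp(-1.0, 1.0)

# In-place: reuses the storage of `decoded`, no second buffer.
clamped_same = decoded.clamp_(-1.0, 1.0)

copy_is_new_buffer = clamped_copy.data_ptr() != decoded.data_ptr()
in_place_reuses_buffer = clamped_same.data_ptr() == decoded.data_ptr()
```

In-place operations are only safe when nothing else needs the unclamped values, for example when autograd is not tracking the tensor during inference.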

#27081 [diffusion] Use Conv2d width padding in WanVAE

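Original PR · Author: mickqian · Merged: 2026-06-03 10:14

Refactor · Importance 5.84 · Insight 4.00

WanVAE uses Conv2d's native width padding

Worth a close read as a case study in replacing manual padding with a native framework feature. The performance gain is limited, but the code is noticeably simpler.

diffusion · performance · refactor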
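
The general pattern can be sketched as follows, assuming symmetric zero padding of one pixel along the width only; the layer sizes are made up and this is not the WanVAE implementation. A manual `F.pad` followed by an unpadded convolution gives the same result as passing `padding=(0, 1)` to `nn.Conv2d`, which pads the width inside the convolution call.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Two convolutions with identical weights; only the padding handling differs.
conv_manual = nn.Conv2d(4, 8, kernel_size=3, padding=0)
conv_native = nn.Conv2d(4, 8, kernel_size=3, padding=(0, 1))  # (pad_h, pad_w)
conv_native.load_state_dict(conv_manual.state_dict())

x = torch.randn(1, 4, 16, 16)

with torch.no_grad():
    # F.pad takes (left, right, top, bottom) for the last two dimensions.
    y_manual = conv_manual(F.pad(x, (1, 1, 0, 0)))
    # Native path: no intermediate padded tensor is materialized by the caller.
    y_native = conv_native(x)
```

The equivalence holds for zero padding; other padding modes or asymmetric (for example causal) padding would need a different treatment.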

#24870 Support NextN = 2/4 in DSV32

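Original PR · Author: b8zhong · Merged: 2026-06-03 10:06

Feature · Importance 7.52 · Insight 6.00

Support NextN=2/4 in DSV32, using the deep-gemm native path to improve MTP performance

Read the conditionals in `_build_paged_mqa_schedule_2d_ctx_lens` and `_get_topk_paged` closely to understand the design trade-offs between the native path and the fallback path. Also watch how test failures are handled in any follow-up revert or fix PRs.

performance · deepseek · kv-cache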

#27094 docs: tag new diffusion cookbooks

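Original PR · Author: mickqian · Merged: 2026-06-03 09:50

Documentation · Importance 2.59 · Insight 1.00

Add NEW tags to the diffusion model cookbooks

Can be merged as is. The frontend docs site should make sure it supports rendering the `tag` field.

documentation · diffusion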
