
sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Monitoring: enabled · Last sync: 2026-06-07 11:28 · Sync status: idle · Next scheduled sync: 2026-06-07 12:28

PR list


2026-05-20

#23925 [NPU]use triton split_qkvgate_gemma_rmsnorm_rope for Qwen3.5 and Qwen3_next

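Original PR · Author: Liwansi · Merged: 2026-05-20 20:22

Performance optimization · Importance 8.36 · Insight 6.00

Introduces a fused Triton attention-preprocessing kernel for Qwen3.5/Next on NPU

This PR is worth a close read, especially the design of the new Triton kernel in `mrope.py` (how it selects interleaved data through conditional masks) and the way `forward_prepare_npu` calls the fused kernel. The stride-fix discussion in the review comments is also worth following, since it helps clarify how to use multi-dimensional strides correctly in Triton. Consider this design as a reference for similar fusion work.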

npu · performance · feature
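
The mask-based interleaved selection mentioned above can be illustrated with a small NumPy sketch. This is not the PR's kernel: the function name, the three inputs `t`, `h`, `w`, and the column layout are all assumptions chosen for illustration. In a Triton kernel the same kind of selection would typically be written with `tl.where`.

```python
import numpy as np


def interleave_by_mask(t, h, w, section_h, section_w):
    """Pick each column from t, h, or w using boolean masks.

    Assumed (hypothetical) layout: column i takes h when i % 3 == 1 and
    i < 3 * section_h, takes w when i % 3 == 2 and i < 3 * section_w,
    and takes t otherwise.
    """
    idx = np.arange(t.shape[-1])
    mask_h = (idx % 3 == 1) & (idx < 3 * section_h)
    mask_w = (idx % 3 == 2) & (idx < 3 * section_w)
    # Nested where: h wins where mask_h is set, then w, then the t default.
    return np.where(mask_h, h, np.where(mask_w, w, t))


t = np.zeros(8)
h = np.ones(8)
w = np.full(8, 2.0)
mixed = interleave_by_mask(t, h, w, section_h=2, section_w=1)
```

Because the masks are computed from the column index alone, the selection needs no gather or scatter step, which is what makes it convenient to fuse into a larger kernel.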

#25818 spec_v2: consolidate seq_lens_cpu/sum maintenance into helper

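Original PR · Author: hnyls2002 · Merged: 2026-05-20 19:42

Refactor · Importance 6.61 · Insight 5.00

Consolidates seq_lens_cpu/sum maintenance into a single helper method

After merging, run the test plan in the PR body. The design decisions in this PR (deferred computation plus a single synchronization point) are worth borrowing for similar state-maintenance code.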

speculative-decoding · refactor · scheduling
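
A minimal sketch of the "deferred computation plus a single synchronization point" idea, assuming a hypothetical `SeqLensTracker` class; none of these names come from the PR. Updates only invalidate the cached total, and the total is recomputed in one place, the first time it is read.

```python
from typing import List, Optional


class SeqLensTracker:
    """Hypothetical holder for per-request sequence lengths."""

    def __init__(self, seq_lens: List[int]) -> None:
        self._seq_lens = list(seq_lens)
        self._cached_sum: Optional[int] = None
        self.sync_count = 0  # how many times the total was recomputed

    def advance(self, accepted: List[int]) -> None:
        # Update lengths and invalidate the cache; no recomputation here.
        self._seq_lens = [n + a for n, a in zip(self._seq_lens, accepted)]
        self._cached_sum = None

    def seq_lens_sum(self) -> int:
        # The only place the total is computed (the "sync point").
        if self._cached_sum is None:
            self._cached_sum = sum(self._seq_lens)
            self.sync_count += 1
        return self._cached_sum


tracker = SeqLensTracker([3, 5])
tracker.advance([1, 2])
tracker.advance([1, 1])
total = tracker.seq_lens_sum()
```

Two updates followed by two reads trigger a single recomputation, so callers cannot forget to keep the derived value in step with the lengths.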

#25819 disagg prebuilt: drop dead prepare_for_extend shift

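Original PR · Author: hnyls2002 · Merged: 2026-05-20 19:39

Refactor · Importance 6.57 · Insight 4.00

Removes dead code in disagg decode and extracts the EAGLE prefill rotation function

The refactor goes in the right direction, and its code style (vectorization, type hints) is a good reference. Read the implementation of `apply_eagle_prefill_input_rotation` closely, along with the decoupling approach behind it. Community reviewers may want to watch for dead-code cleanups like this one.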

refactor · speculative-decoding · scheduling

#25872 pr-test: schedule 3x -> 2x; fix extra gate skipped on schedule

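Original PR · Author: hnyls2002 · Merged: 2026-05-20 18:58

Infrastructure · Importance 3.47 · Insight 4.00

Reduces scheduled CI runs from 3 to 2 and fixes extra tests being skipped

Worth merging; this is a CI operations optimization. Watch the CI runs after the merge to confirm that the cancellation rate drops.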

ci · infra

#25886 [Test] Add fwd_occupancy sanity kit

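Original PR · Author: hnyls2002 · Merged: 2026-05-20 18:34

Testing · Importance 7.71 · Insight 6.00

Adds a GPU forward-occupancy test kit and refactors the hellaswag tests

Worth a close read, especially the way FwdOccupancyMixin makes system-level assertions through a Prometheus gauge, and how HellaswagMixin was extracted. In follow-up PRs the team could adopt the review suggestions (min, Event.wait, locking) to further improve stability.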

test · feature · performance
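
The pattern of asserting on a Prometheus gauge can be sketched as follows. This is an illustration only, not the PR's mixin: the metric name `demo_fwd_occupancy`, the sample payload, and the 0.9 threshold are all made up. The parser is deliberately simplified and assumes each sample line ends with its value (no trailing timestamp).

```python
from typing import List

# Made-up payload in the Prometheus text exposition format.
SAMPLE_METRICS = """\
# HELP demo_fwd_occupancy Fraction of time spent in forward passes.
# TYPE demo_fwd_occupancy gauge
demo_fwd_occupancy{engine="0"} 0.93
demo_other_metric 7
"""


def read_gauge(metrics_text: str, name: str) -> List[float]:
    """Return every sample value for the metric `name`."""
    values = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        metric, _, value = line.rpartition(" ")
        base = metric.split("{", 1)[0]  # drop the label set, if any
        if base == name:
            values.append(float(value))
    return values


occupancy = read_gauge(SAMPLE_METRICS, "demo_fwd_occupancy")
assert occupancy and min(occupancy) >= 0.9, "forward occupancy too low"
```

In a real test the text would come from the server's metrics endpoint rather than a constant; the assertion then checks a property of the whole running system instead of a single function's return value.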

#25697 [diffusion] Fix GLM-Image /v1/images/edits support

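Original PR · Author: qimcis · Merged: 2026-05-20 17:11

Bug fix · Importance 7.29 · Insight 5.00

Fixes /v1/images/edits support for GLM-Image

Worth reading, particularly for readers interested in how editing is wired into diffusion models. The abstractions in `image_path_to_list` and `pooled_image_features_to_tensor`, and the up-front guard against empty tensors, are good examples of robust coding. The cross-source token ID upsampling logic in `generate_prior_tokens` also shows how to adapt to multi-image input.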

diffusion · bugfix · jit-kernel

#25831 [Test] Stage-a sanity kits; consolidate core/ + models_e2e/ tests

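Original PR · Author: hnyls2002 · Merged: 2026-05-20 16:58

Testing · Importance 8.18 · Insight 6.00

Introduces stage-a sanity kits and reorganizes the test directories

Recommended close reading for test-infrastructure maintainers; its mixin-based design and shared-resource strategy are worth adopting in large projects.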

test · refactor · run-ci

#24070 [BugFix] Fix rid_to_state leak for aborted queued requests

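Original PR · Author: guoyuhong · Merged: 2026-05-20 16:32

Bug fix · Importance 6.94 · Insight 4.00

Fixes a rid_to_state leak after a queued request is aborted, which caused duplicate request ID errors

Worth a close read. This is a typical resource-leak fix: the code is small, but the impact is clear. Pay particular attention to the test design: mocking by using `__new__` to bypass `__init__`, and classifying fields with frozenset to avoid hard-coding, are both good practices.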

bugfix · scheduling · test
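
The `__new__`-based test setup mentioned above can be sketched like this. The `Manager` class and its `abort` method are hypothetical stand-ins, not the project's real classes; only the attribute name `rid_to_state` is taken from the PR title.

```python
class Manager:
    """Hypothetical stand-in with a constructor too heavy for unit tests."""

    def __init__(self) -> None:
        raise RuntimeError("would need sockets, model config, and so on")

    def abort(self, rid: str) -> None:
        # Drop the state of an aborted request so its ID can be reused.
        self.rid_to_state.pop(rid, None)


def make_bare_manager() -> Manager:
    # Allocate the object without running __init__, then set only the
    # attributes that the method under test actually touches.
    manager = Manager.__new__(Manager)
    manager.rid_to_state = {}
    return manager


manager = make_bare_manager()
manager.rid_to_state["req-1"] = object()
manager.abort("req-1")
```

The trade-off is that the test must know which attributes the method reads; if `abort` later starts using another attribute, the test fails with an `AttributeError`, which is usually an acceptable signal.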
