vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

PR list

2026-04-29

#41149 [CI/Build] Auto-detect manylinux ABI tag for nightly wheels

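Original PR · Author: Harry-Chen · Merged: 2026-04-29 15:37

Infrastructure · Importance 7.58 · Insight 7.00

Auto-detects the manylinux tag for wheels, eliminating drift after upgrades.

Worth a close read, especially how `detect-manylinux-tag.py` works around the limitations of `auditwheel repair` by calling its internal API directly, and the trade-off in `lib/manylinux.sh` of pinning the environment through containerization. This PR is a good example of infrastructure reliability work.

ci/build · refactor · cleanup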
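
To make the idea behind #41149 concrete, here is a minimal sketch of deriving a manylinux tag from the glibc symbol versions a wheel's shared libraries require. It assumes those versions have already been extracted, and it does not use auditwheel or reproduce the PR's script; `manylinux_tag` is a hypothetical helper. The `manylinux_{major}_{minor}_{arch}` format comes from PEP 600.

```python
import re


def manylinux_tag(symbol_versions, arch="x86_64"):
    """Return the PEP 600 tag for the highest GLIBC version referenced.

    `symbol_versions` is an iterable of strings such as "GLIBC_2.17".
    This helper is hypothetical and only illustrates the tag arithmetic.
    """
    versions = []
    for sym in symbol_versions:
        # Accept GLIBC_X.Y and GLIBC_X.Y.Z; ignore anything else.
        m = re.fullmatch(r"GLIBC_(\d+)\.(\d+)(?:\.\d+)?", sym)
        if m:
            versions.append((int(m.group(1)), int(m.group(2))))
    if not versions:
        raise ValueError("no GLIBC symbol versions found")
    # Compare as integer tuples so that 2.28 sorts above 2.5.
    major, minor = max(versions)
    return f"manylinux_{major}_{minor}_{arch}"


tag = manylinux_tag(["GLIBC_2.17", "GLIBC_2.28", "GLIBC_2.2.5"])
```

Real policy checks are stricter than this sketch: they also look at other versioned symbols and at which external libraries are allowed, so the result here should be read as a lower bound on the tag rather than a full compliance check.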

#41202 [CI] fix test_rotary_embedding_opcheck format error

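Original PR · Author: chaunceyjiang · Merged: 2026-04-29 15:32

Test · Importance 3.54 · Insight 2.00

Fixes the format of a test's dtype parameter.

A simple format fix; no close read needed. It can still serve as a reference for consistent code style.

test · cleanup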

#34668 [Reasoning][Feature] Support for speculative decoding with thinking budget

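Original PR · Author: rishitdholakia13 · Merged: 2026-04-29 14:14

Feature · Importance 9.18 · Insight 6.00

Makes thinking budgets compatible with speculative decoding.

Worth a close read, especially the design pattern of migrating from a LogitsProcessor to a standalone state manager, which sets an example for extending the vLLM v1 sampling architecture. The review discussion of the trade-off between performance and async scheduling is also worth noting.

feature · speculative-decoding · refactor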

#39186 [KV Offload] Per-job store completion for CPU offloading connector

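Original PR · Author: Etelis · Merged: 2026-04-29 13:52

Feature · Importance 9.36 · Insight 7.00

Signals KV offload completion per job, speeding up prefix cache reuse.

This PR deserves a thorough understanding, particularly the fencing mechanism of `TransferJobStatus` and the aggregation pattern of `OffloadingWorkerMetadata`. The team is advised to reuse the `build_connector_worker_meta` pattern in later work to collect async transfer completion status. The multiple iterations between the author and reviewers show good engineering trade-offs, and a design document could be distilled from them.

kv-connector · performance · feature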
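
The aggregation idea mentioned for #39186 can be sketched roughly as follows: each worker reports the job IDs whose store has finished, and a job counts as complete only once every worker has reported it. This is an illustration under assumptions, not the PR's code; `JobCompletionAggregator` and its methods are hypothetical names and do not correspond to `TransferJobStatus` or `OffloadingWorkerMetadata`, and the fencing mechanism is not modeled.

```python
from collections import defaultdict


class JobCompletionAggregator:
    """Hypothetical aggregator of per-worker completion reports."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        # job_id -> set of worker ranks that have reported completion
        self._done_by = defaultdict(set)

    def update(self, worker_rank, completed_job_ids):
        """Record one worker's report.

        Returns the job IDs that are now complete on all workers.
        """
        finished = []
        for job_id in completed_job_ids:
            ranks = self._done_by[job_id]
            ranks.add(worker_rank)
            if len(ranks) == self.num_workers:
                finished.append(job_id)
                # Drop bookkeeping once the job is fully complete.
                del self._done_by[job_id]
        return finished


agg = JobCompletionAggregator(num_workers=2)
first = agg.update(0, [1, 2])   # only worker 0 has reported so far
second = agg.update(1, [2])     # job 2 is now done on both workers
third = agg.update(1, [1])      # job 1 follows later
```

Reporting per job rather than per request means a finished job can be made visible for reuse without waiting on unrelated jobs, which is the benefit the summary above attributes to the PR.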

#41113 [Bugfix] Fix rope

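Original PR · Author: jeejeelee · Merged: 2026-04-29 13:42

Bug fix · Importance 5.08 · Insight 5.00

Fixes the cos/sin cache type being hard-coded to float32 in the RoPE kernel.

Recommend prioritizing review and merge of this PR: it fixes a real CI OOM problem, and the implementation is well considered (it limits the template combinations). Developers can look at the template dispatch pattern in csrc/pos_encoding_kernels.cu, which can be reused in other kernels.

bugfix · performance · kernel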

#40734 [Bugfix] Fix max_num_batched_token not captured in cuda graph

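Original PR · Author: wzhao18 · Merged: 2026-04-29 12:33

Bug fix · Importance 5.76 · Insight 5.00

Fixes max_num_batched_tokens not being captured in the CUDA graph when it is not a multiple of the stride.

Recommend merging. This bugfix affects real production performance, and the change is small and clearly reasoned. As a follow-up, consider whether to let users choose a strategy of padding to the next stride, though the current approach is enough to solve the problem.

bugfix · cudagraph · performance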
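
The failure mode described for #40734 can be illustrated with a small sketch, assuming capture sizes are generated at fixed strides. All function names here are hypothetical and this is not vLLM's actual size-generation code; appending the maximum is an assumed shape of the fix, and `pad_to_stride` shows the padding alternative raised in the review for comparison.

```python
def strided_sizes(max_tokens, stride):
    """Every multiple of `stride` up to `max_tokens` (the problematic list)."""
    return list(range(stride, max_tokens + 1, stride))


def capture_sizes(max_tokens, stride):
    """Strided sizes, plus `max_tokens` itself when it is not a multiple."""
    sizes = strided_sizes(max_tokens, stride)
    if not sizes or sizes[-1] != max_tokens:
        sizes.append(max_tokens)
    return sizes


def pad_to_stride(num_tokens, stride):
    """Alternative strategy: round a batch size up to the next multiple."""
    return -(-num_tokens // stride) * stride


buggy = strided_sizes(100, 16)
fixed = capture_sizes(100, 16)
padded = pad_to_stride(100, 16)
```

With a maximum of 100 and a stride of 16, the strided list stops at 96, so a batch of 97 to 100 tokens has no captured size to run under. Appending 100 closes that gap without exceeding the configured maximum, whereas padding would round up to 112.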

#41135 [Bugfix] fix inductor error for dpsk v4

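Original PR · Author: ZJY0516 · Merged: 2026-04-29 12:18

Bug fix · Importance 6.31 · Insight 5.00

Fixes an AssertionError for DeepSeek V4 under Inductor.

Worth reading; it shows how a custom op wrapper can work around Inductor's restrictions on Triton kernels, and is a useful reference for other teams hitting similar Inductor errors. Design pattern: use `direct_register_custom_op` to provide an opaque boundary.

bugfix · deepseek · torch.compile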

#41034 [BugFix][CPU] fix error on CPU runner shutdown

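Original PR · Author: fadara01 · Merged: 2026-04-29 12:08

Bug fix · Importance 6.79 · Insight 5.00

Fixes a crash from accelerator calls during CPU backend shutdown.

The fix is well-defined and concise (+11 lines), and worth merging immediately. As a follow-up, add a corresponding shutdown control-flow test for the CPU backend to make sure the monkey-patch does not miss accelerator APIs added in the future. Developers can use this pattern as a reference for other cross-backend compatibility issues.

bugfix · cpu · cleanup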
