vllm-project/vllm · Tag view

Tag list

bugfix · 911

performance · 486

refactor · 476

test · 423

feature · 397

cleanup · 359

model · 305

v1 · 299

rocm · 271

quantization · 264

kernel · 212

ci/build · 208

frontend · 204

documentation · 187

kv-connector · 187

nvidia · 186

multi-modality · 162

moe · 145

deepseek · 134

ci · 132

attention · 129

speculative-decoding · 115

infra · 103

tool-calling · 101

qwen · 96

cpu · 86

compilation · 74

intel-gpu · 57

cudagraph · 56

xpu · 53

fp8 · 33

scheduler · 32

torch.compile · 26

lora · 25

parser · 25

responses-api · 25

gpt-oss · 18

pooling · 17

structured-output · 13

mistral · 11

security · 7

mamba · 6

llama · 5

metrics · 4

dflash · 3

gemma · 3

gemma4 · 3

reasoning · 3

rust · 3

tpu · 3

v2 · 3

bug · 2

configuration · 2

distributed · 2

jais · 2

ray · 2

benchmark · 1

config · 1

dependencies · 1

dependency-wiring · 1

docker · 1

eplb · 1

fips · 1

gdn · 1

hy_v3 · 1

hybrid · 1

mla · 1

mtp · 1

nccl · 1

nixl · 1

numa · 1

observability · 1

pipeline-parallelism · 1

platform · 1

pluggablelayer · 1

prefix-caching · 1

riscv · 1

sampling · 1

setup · 1

vllm · 1

Aggregated results

attention-related PRs

2026-06-03

2026-06-02

2026-05-30

2026-05-29