执行摘要
- 一句话:迁移 generative scoring 入口并重命名服务类
- 推荐动作:值得关注,该 PR 展示了如何正确进行入口点重构,包括安全导入处理(相对导入)、状态初始化统一管理,以及 CI 配置同步。适合作为类似重构的参照。
功能与动机
Generative scoring 入口不是 OpenAI 官方 API,将其从 openai 文件夹中移出,归入 generate 入口集合(来自 PR body)。
实现拆解
- 目录搬迁:将
vllm/entrypoints/openai/generative_scoring/ 下所有文件搬至 vllm/entrypoints/generate/generative_scoring/,测试目录同步移动。
- 类重命名:
OpenAIServingGenerativeScoring → ServingGenerativeScoring,更新所有引用。
- 入口整合:在
api_server.py 中移除对 generative scoring 路由和状态的直接注册,改为由 generate/api_router.py 统一管理。
- 测试与 CI 适配:更新测试导入和类名,修改
.buildkite/test-amd.yaml 添加新测试覆盖。
关键文件:
vllm/entrypoints/generate/generative_scoring/api_router.py(模块 入口层;类别 source;类型 rename-or-move;符号 generative_scoring, init_generative_scoring_state): 核心路由文件,从 openai 目录移入 generate 目录,导入路径和服务类引用全部更新。
vllm/entrypoints/generate/api_router.py(模块 入口层;类别 source;类型 rename-or-move): 整合了 generative scoring 路由注册和状态初始化,是入口统一的体现。
vllm/entrypoints/openai/api_server.py(模块 入口层;类别 source;类型 entrypoint): 移除了单独的 generative scoring 注册和初始化,改为通过 generate/api_router.py 统一处理。
vllm/entrypoints/generate/generative_scoring/serving.py(模块 入口层;类别 source;类型 rename-or-move;符号 OpenAIServingGenerativeScoring, ServingGenerativeScoring): 服务类重命名,从 OpenAIServingGenerativeScoring 改为 ServingGenerativeScoring。
tests/entrypoints/generate/generative_scoring/test_generative_scoring.py(模块 测试;类别 test;类型 rename-or-move;符号 _create_serving): 测试文件迁移并更新类名引用。
关键符号:generative_scoring, create_generative_scoring, init_generative_scoring_state, ServingGenerativeScoring.init, ServingGenerativeScoring.create_generative_scoring, _create_serving, register_generative_scoring_api_router
关键源码片段
vllm/entrypoints/generate/generative_scoring/api_router.py
核心路由文件,从 openai 目录移入 generate 目录,导入路径和服务类引用全部更新。
# vllm/entrypoints/generate/generative_scoring/api_router.py
# 导入路径从旧的 openai 目录更新为 generate 目录
from vllm.entrypoints.generate.generative_scoring.serving import (
GenerativeScoringResponse,
ServingGenerativeScoring,
)
from vllm.entrypoints.openai.engine.protocol import ErrorResponse
from vllm.entrypoints.openai.utils import validate_json_request
from vllm.entrypoints.utils import load_aware_call, with_cancellation
router = APIRouter()
logger = init_logger(__name__)
def generative_scoring(request: Request) -> ServingGenerativeScoring | None:
return request.app.state.serving_generative_scoring
@router.post("/generative_scoring", dependencies=[Depends(validate_json_request)])
@with_cancellation
@load_aware_call
async def create_generative_scoring(raw_request: Request):
handler = generative_scoring(raw_request)
if handler is None:
raise NotImplementedError(
"The model does not support the Generative Scoring API"
)
raw_body = await raw_request.json()
# 延迟导入避免循环引用
from vllm.entrypoints.generate.generative_scoring.serving import (
GenerativeScoringRequest,
)
gen_request = GenerativeScoringRequest(**raw_body)
result = await handler.create_generative_scoring(gen_request, raw_request)
if isinstance(result, ErrorResponse):
return JSONResponse(
content=result.model_dump(), status_code=result.error.code
)
elif isinstance(result, GenerativeScoringResponse):
return JSONResponse(content=result.model_dump())
raise ValueError(f"Unexpected response type: {type(result)}")
vllm/entrypoints/generate/api_router.py
整合了 generative scoring 路由注册和状态初始化,是入口统一的体现。
# vllm/entrypoints/generate/api_router.py
# 在 register_generate_api_routers 中添加 generative scoring 路由注册
def register_generate_api_routers(app: FastAPI):
# ... 其他路由注册
from .generative_scoring.api_router import (
register_generative_scoring_api_router,
)
register_generative_scoring_api_router(app)
# 在 init_generate_state 末尾添加 ServingGenerativeScoring 初始化
async def init_generate_state(
engine_client, state, args, request_logger, supported_tasks
):
# ... 其他状态初始化
from .generative_scoring.serving import ServingGenerativeScoring
state.serving_generative_scoring = ServingGenerativeScoring(
engine_client,
state.openai_serving_models,
request_logger=request_logger,
)
vllm/entrypoints/generate/generative_scoring/serving.py
服务类重命名,从 OpenAIServingGenerativeScoring 改为 ServingGenerativeScoring。
# vllm/entrypoints/generate/generative_scoring/serving.py
# 类名从 OpenAIServingGenerativeScoring 改为 ServingGenerativeScoring
# 由于该 endpoint 不属于 OpenAI 官方 API,去掉前缀避免误导
class ServingGenerativeScoring(OpenAIServing):
"""Serving class for generative scoring computation.
This class handles computing the probability of specified token IDs
appearing as the next token after concatenating query and item prompts.
"""
def __init__(
self,
engine_client: EngineClient,
models: OpenAIServingModels,
*,
request_logger: RequestLogger | None,
) -> None:
super().__init__(
engine_client=engine_client,
models=models,
request_logger=request_logger,
)
async def create_generative_scoring(
self,
request: GenerativeScoringRequest,
raw_request: Request | None = None,
) -> GenerativeScoringResponse | ErrorResponse:
# 业务逻辑保持不变
...
评论区精华
风险与影响
- 风险:风险较低,但存在以下潜在问题:外部代码若直接引用旧导入路径会断裂(属于有意 break);新路径下的导入必须正确;需要回归确认 api endpoint
/generative_scoring 行为不变。
- 影响:HTTP API 用户无感知;Python 开发者需更新导入路径;团队代码组织更清晰,符合非 OpenAI API 分离原则。
- 风险标记:路径变更影响外部导入, 端点回归风险, CI 配置正确性
关联脉络
参与讨论