docs: add vLLM 开发入门 guide to vllm/ by Copilot · Pull Request #33 · DaoCloud-OpenSource/docs

Copilot · 2026-03-03T09:21:31Z

Adds a new Chinese-language developer onboarding guide for vLLM contributors under vllm/vLLM-dev-guide.md, based on a community contribution by noooop.

Environment setup — uv venv + two install paths (Python-only with VLLM_USE_PRECOMPILED=1 vs full C++/CUDA)
Debugging — VS Code launch.json config for single-GPU vLLM server with debugpy
Profiling — --profiler-config flag, /start_profile / /stop_profile endpoints, perfetto.dev visualization
Linting — pre-commit setup and manual invocation (pre-commit run -a)
Docs & testing — MkDocs local preview, pytest setup with CUDA/CPU caveats
PR process — DCO/sign-off, title prefix taxonomy, code quality bar, RFC threshold (>500 LOC), review cadence

Original prompt

This section describes the original issue you should resolve.

<issue_title>vLLM 开发入门</issue_title>
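<issue_description>Author: noooop

Hi everyone! If you are interested in the vLLM project, here is a contribution guide from a community member that we hope will help you get started smoothly.

vLLM welcomes help of any kind! For example:

Find and report issues: did you run into a bug or an inconsistency while using vLLM?
Support new models: want vLLM to support your favorite model? File a request, or implement it yourself.
Propose ideas or new features: have a good idea? Share it, or go ahead and add the feature yourself.
Improve docs and tutorials: found part of the documentation hard to follow? Help make it clearer.
Help other community members: answering questions in the community or helping review code are both great contributions.
Spread the word: if you find vLLM useful, share it on your blog or social media, or give our GitHub repository a star. All of this is a big help!

If you don't know where to start:
Take a look at the project's Job Board, which lists many beginner-friendly tasks and new-model support tasks. Pick one that interests you!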

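🚀 Step 1: Set up your development environment

Get the code locally:
git clone https://github.com/vllm-project/vllm.git
cd vllm
Set up the Python environment:
We recommend uv for managing the environment; it is lighter and faster. After installing uv, create the environment with one command:
uv venv --python 3.12 --seed
source .venv/bin/activate
Note: Python 3.12 is recommended. vLLM's main testing and compatibility work targets this version, which reduces mismatches between your local and test environments.
Install vLLM (two options):

If you only change Python code, you can use precompiled binaries to speed up installation:
VLLM_USE_PRECOMPILED=1 uv pip install -e .
If you also need to modify the underlying C++/CUDA kernel code, use the standard installation:
uv pip install -e .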
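The choice between the two install paths can be sketched as a small helper. This is only an illustration of the decision described above: the function names (`install_plan`, `python_matches_recommendation`) are hypothetical and not part of vLLM, and the helper only builds the command rather than running it.

```python
import shlex
import sys

RECOMMENDED_PYTHON = (3, 12)  # version recommended by the guide


def install_plan(modifies_native_code):
    """Return (extra environment variables, command) for an editable install."""
    command = ["uv", "pip", "install", "-e", "."]
    # Python-only work can reuse precompiled binaries;
    # C++/CUDA kernel work needs the standard from-source install.
    extra_env = {} if modifies_native_code else {"VLLM_USE_PRECOMPILED": "1"}
    return extra_env, command


def python_matches_recommendation(version_info=sys.version_info):
    """True when the interpreter is the recommended major.minor version."""
    return tuple(version_info[:2]) == RECOMMENDED_PYTHON


# Show the command line for a Python-only change without executing it.
env, cmd = install_plan(modifies_native_code=False)
prefix = " ".join(f"{key}={value}" for key, value in env.items())
print((prefix + " " + shlex.join(cmd)).strip())
```

To actually run the plan, one option is to pass `cmd` to `subprocess.run` with `env={**os.environ, **env}` from inside the activated virtual environment.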
🔧 Development and debugging tips
Debugging:
If you use VS Code, you can copy the launch.json configuration below to start a vLLM server under the debugger with one click.
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "vllm server single",
            "type": "debugpy",
            "request": "launch",
            "module": "vllm.entrypoints.cli.main",
            "env": {
                "VLLM_LOGGING_LEVEL": "DEBUG",
                // "VLLM_USE_MODELSCOPE": "True",
                // "MODELSCOPE_DOWNLOAD_PARALLELS": "10",
            },
            "args": [
                "serve",
                "Qwen/Qwen3-0.6B",
                "--reasoning-parser",
                "qwen3",
                "--gpu-memory-utilization",
                "0.8",
                "--port",
                "8000",
                "--enforce-eager",
                "--max-model-len",
                "5120",
                "-tp",
                "1",
            ],
        },
    ]
}
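For reproducing a debugging session outside VS Code, the same module, environment, and arguments can be assembled by hand. The sketch below mirrors the values in the launch.json above; the helper names (`build_command`, `build_env`) are hypothetical, and the snippet only prints the command rather than starting a server.

```python
import os
import shlex
import sys

MODULE = "vllm.entrypoints.cli.main"   # "module" in launch.json
ENV = {"VLLM_LOGGING_LEVEL": "DEBUG"}  # "env" in launch.json
ARGS = [                               # "args" in launch.json
    "serve", "Qwen/Qwen3-0.6B",
    "--reasoning-parser", "qwen3",
    "--gpu-memory-utilization", "0.8",
    "--port", "8000",
    "--enforce-eager",
    "--max-model-len", "5120",
    "-tp", "1",
]


def build_command(python=sys.executable):
    """Command line equivalent to launching the module with the given args."""
    return [python, "-m", MODULE, *ARGS]


def build_env(base=None):
    """Copy of the base environment with the debug settings applied on top."""
    merged = dict(os.environ if base is None else base)
    merged.update(ENV)
    return merged


print(shlex.join(build_command("python")))
```

Passing `build_command()` and `env=build_env()` to `subprocess.run` would start the server without the debugger attached, which can help tell debugger-related problems apart from server problems.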
Profiling:
- Enable profiling when starting the server with the --profiler-config option:
  vllm serve Qwen/Qwen3-0.6B --profiler-config '{"profiler": "torch", "torch_profiler_dir": "./vllm_profile"}'
- Once started, the server exposes the /start_profile and /stop_profile endpoints; use curl to start and stop recording:

First, call the /start_profile API to start profiling.

$ curl -X POST http://localhost:8000/start_profile

Then send a generation request to the model.

$ curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "Qwen/Qwen3-0.6B",
      "messages": [
        {
          "role": "user",
          "content": "9.11 and 9.8, which is greater?"
        }
      ]
    }'

Finally, call the /stop_profile API to stop profiling.

$ curl -X POST http://localhost:8000/stop_profile
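The same start, generate, stop sequence can be scripted with the Python standard library. This is a minimal sketch that assumes the server from the commands above is listening on localhost:8000; the helper names are hypothetical, and nothing is sent until `profile_one_request` is called.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl commands
MODEL = "Qwen/Qwen3-0.6B"


def post_request(path, payload=None):
    """Build a POST request; a JSON body is attached only when payload is given."""
    data = None
    headers = {}
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method="POST"
    )


def chat_request(prompt):
    """Build the chat completion request used to exercise the model."""
    payload = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    return post_request("/v1/chat/completions", payload)


def profile_one_request(prompt, timeout=600):
    """Start profiling, run one generation, and always stop profiling afterwards."""
    urllib.request.urlopen(post_request("/start_profile"), timeout=timeout).close()
    try:
        with urllib.request.urlopen(chat_request(prompt), timeout=timeout) as resp:
            return json.load(resp)
    finally:
        urllib.request.urlopen(post_request("/stop_profile"), timeout=timeout).close()
```

Calling `profile_one_request("9.11 and 9.8, which is greater?")` reproduces the three curl commands. The `finally` clause is a design choice so that a failed generation does not leave the profiler recording.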

The generated .pt.trace.json.gz file can be dragged into https://ui.perfetto.dev/ for visualization, showing function calls and timing details.
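For a quick look at a trace without opening a browser, the file can also be summarized in a few lines of Python. This sketch assumes the trace is in the Chrome trace event format that such viewers read: a JSON object with a top-level "traceEvents" list whose complete events have "ph" equal to "X" and a "dur" field in microseconds. The function name `top_events` is hypothetical.

```python
import gzip
import json
from collections import Counter


def top_events(trace_path, limit=10):
    """Return the `limit` event names with the largest total duration."""
    with gzip.open(trace_path, "rt", encoding="utf-8") as f:
        trace = json.load(f)
    totals = Counter()
    for event in trace.get("traceEvents", []):
        # "X" marks a complete event that carries its own duration.
        if event.get("ph") == "X" and "dur" in event:
            totals[event.get("name", "<unnamed>")] += event["dur"]
    return totals.most_common(limit)
```

Point `top_events` at one of the files under the directory given in `torch_profiler_dir` to list the names that account for the most recorded time. Nested events are counted independently, so the totals can overlap.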


Linting:
vLLM uses pre-commit to check code formatting automatically. Once installed, the checks run on every git commit.
uv pip install pre-commit
pre-commit install
You can also check all files manually: pre-commit run -a
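📖 Writing docs and running tests
Docs
- vLLM's documentation is built with MkDocs, and the sources are Markdown. After installing the dependencies, run mkdocs serve for a live local preview.
Tests
- vLLM's tests are run mainly with pytest. After installing the test dependencies, you can run the whole suite or a single test file:
  pytest tests/
📬 The formal contribution process: filing Issues and PRs
Filing an Issue (bug or feature request)
Found a bug or have a new idea? First search GitHub Issues to see whether something similar already exists.
If not, open a new Issue and describe it in as much detail as possible (for example, reproduction steps and environment information).
Important: if you find a security issue, report it privately following the security guide; do not disclose it in a public Issue.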

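Submitting a Pull Request (PR)
Once your code is ready, you can submit a PR. To keep the process smooth, please follow these conventions:

Agree to the Developer Certificate of Origin (DCO)
Each commit message must include a line of the form Signed-off-by: Your Name <email>.
Use git commit -s to add it automatically, or configure VS Code or PyCharm to add it for you.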
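A local check for the sign-off line can catch a missing trailer before pushing. The sketch below is a simple pattern match, not the check the DCO bot itself performs; the names `SIGNOFF_RE` and `has_signoff` are hypothetical.

```python
import re

# A "Signed-off-by: Name <email>" trailer on a line of its own.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE
)


def has_signoff(commit_message):
    """True when the commit message contains a well-formed sign-off line."""
    return SIGNOFF_RE.search(commit_message) is not None
```

One way to use it is to feed in the output of `git log -1 --format=%B`, which prints the full message of the latest commit.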
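Write a clear PR title
Prefix the title with a clear tag so the type of change is easy to identify, for example:
- [Bugfix]: bug fixes
- [Doc]: documentation updates
- [Model] Qwen2: model support or improvements
- [Kernel]: CUDA kernel changes
- [Core]: changes to core engine logic
- [Hardware][AMD]: hardware-related changes
- [Misc]: other small changes (use sparingly)
  If a change spans several areas, you can combine the relevant prefixes.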
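The prefix convention above can be checked mechanically. This is a rough sketch: `PRIMARY_PREFIXES` contains only the examples listed in this guide and is not a complete list of accepted prefixes, and the function names are hypothetical.

```python
import re

# Only the example prefixes from this guide; the real set may be larger.
PRIMARY_PREFIXES = {"Bugfix", "Doc", "Model", "Kernel", "Core", "Hardware", "Misc"}
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def split_title(title):
    """Split '[A][B] text' into (['A', 'B'], 'text')."""
    tags = []
    rest = title.strip()
    while True:
        match = TAG_RE.match(rest)
        if match is None:
            break
        tags.append(match.group(1))
        rest = rest[match.end():].lstrip()
    return tags, rest


def title_problems(title):
    """Return human-readable problems; an empty list means the title looks fine."""
    tags, rest = split_title(title)
    problems = []
    if not tags:
        problems.append("missing [Prefix]")
    elif tags[0] not in PRIMARY_PREFIXES:
        # Later tags such as [AMD] act as qualifiers, so only the first is checked.
        problems.append(f"unrecognized prefix [{tags[0]}]")
    if not rest:
        problems.append("missing description after prefix")
    return problems
```

For example, `title_problems("[Hardware][AMD] Fix ROCm build")` reports no problems, while a title with no bracketed tag is flagged.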
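Ensure code quality
- Follow Google's Python and C++ style guides.
- Pass all pre-commit formatting checks.
- Write clear comments and documentation.
- Remember to add tests so the change is reliable.
- If the change affects users (for example, new parameters or features), update the documentation under docs/ accordingly.
About kernel code
- If you modify the underlying C++/CUDA kernels, consider learning the "incremental compilation workflow" to speed up builds.
- When adding a custom operator, register it in PyTorch following the conventions and write corresponding tests.
Discuss large changes first (RFC)
If the change is large (more than 500 lines of code, excluding data, configuration, and tests), please...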
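A rough way to estimate whether a change crosses the 500-line threshold is to total the output of `git diff --numstat` while skipping excluded files. The sketch below rests on several assumptions that the guide does not specify: added and deleted lines are both counted, tests are taken to live under tests/, and data and configuration files are recognized by the suffixes listed in `EXCLUDED_SUFFIXES`. All names are hypothetical.

```python
RFC_THRESHOLD = 500
EXCLUDED_DIRS = ("tests/",)  # assumed location of tests
EXCLUDED_SUFFIXES = (".json", ".yaml", ".yml", ".toml", ".csv")  # assumed data/config


def changed_code_lines(numstat):
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        if path.startswith(EXCLUDED_DIRS) or path.endswith(EXCLUDED_SUFFIXES):
            continue
        total += int(added) + int(deleted)
    return total


def needs_rfc(numstat):
    """True when the counted change is larger than the threshold."""
    return changed_code_lines(numstat) > RFC_THRESHOLD
```

The input could come from something like `git diff --numstat main...HEAD`. Renamed files appear with a combined "old => new" path, which this sketch does not treat specially.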

Fixes vLLM 开发入门 #32


Co-authored-by: pacoxu <2010320+pacoxu@users.noreply.github.com>

pacoxu · 2026-03-03T09:26:24Z

@noooop a copy has been synced here as well.

Initial plan

6221357

Copilot AI assigned Copilot and pacoxu Mar 3, 2026

Copilot started work on behalf of pacoxu March 3, 2026 09:21

add vLLM 开发入门 article to vllm/ folder

4dffb18

Co-authored-by: pacoxu <2010320+pacoxu@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Update vLLM contribution guidelines~~ docs: add vLLM 开发入门 guide to vllm/ Mar 3, 2026

Copilot finished work on behalf of pacoxu March 3, 2026 09:25

pacoxu marked this pull request as ready for review March 3, 2026 09:25

docs: add vLLM 开发入门 guide to vllm/ #33
Copilot wants to merge 2 commits into main from copilot/improve-vllm-contribution-guide
