Checked other resources
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
Example Code
The following code:
from langchain_community.llms import VLLM

llm = VLLM(
    model="./models/Qwen2.5-14B-Instruct",
    trust_remote_code=True,
    max_new_tokens=128,
    top_k=10,
    top_p=0.95,
    temperature=0.8,
)

for chunk in llm.stream("What is the capital of France ?"):
    print(chunk, end="|", flush=True)
Error Message and Stack Trace (if applicable)
Output from LangChain (not streamed; the full answer arrives at once):
Correct! The capital of France is Paris. It is known for its iconic landmarks like the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral, among many others. Do you have any other questions about France or Paris? I'd be happy to help!
Is there something specific you would like to know about Paris or France in general? For example:
1. History
2. Culture
3. Cuisine
4. Attractions
5. Transportation
6. Language
7. Weather
8. Population
9. Economy
10. Education
Let me know if you have any particular interests|
[rank0]:[W126 15:42:23.470267241 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
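(Editor's note: the trailing [rank0] message is a PyTorch NCCL shutdown warning printed at interpreter exit, not part of the model output, and it is unrelated to the streaming question. A minimal cleanup sketch, following the warning's own advice, that can be placed at the end of the script; assumes torch.distributed was initialized by vLLM:

import torch.distributed as dist

# Destroy the process group before exit so any pending NCCL
# operations finish and the warning is not printed.
if dist.is_initialized():
    dist.destroy_process_group()
)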
Description
This code does not stream: the entire completion arrives as a single chunk (note the single trailing "|" in the output above). I know vLLM can stream with Qwen2.5 (I have done it outside LangChain), so is something misconfigured in my LangChain code, or does this class simply not support streaming? See the diagnostic sketch below. I want to run vLLM with Qwen2.5 and stream the output from inside LangChain.
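(Editor's note: a minimal diagnostic sketch, based on an assumption rather than the original report: if the VLLM class never overrides the base _stream method, BaseLLM.stream() falls back to yielding the whole completion as one chunk, which would match the output shown above. You can check with:

from langchain_community.llms import VLLM

# If VLLM does not define its own _stream/_astream, .stream() falls
# back to the BaseLLM default, which yields one chunk with everything.
print("_stream" in VLLM.__dict__)
print("_astream" in VLLM.__dict__)
)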
I have searched the whole web, and none of the code I found works.
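(Editor's note: one possible workaround, a sketch under assumptions rather than a confirmed fix: serve the model through vLLM's OpenAI-compatible server and use VLLMOpenAI, which does support token streaming. The localhost URL, port, and "EMPTY" API key below are assumptions about a default local vllm serve setup:

from langchain_community.llms import VLLMOpenAI

# Assumes an OpenAI-compatible vLLM server is already running, e.g.:
#   vllm serve ./models/Qwen2.5-14B-Instruct --port 8000
llm = VLLMOpenAI(
    openai_api_key="EMPTY",  # vLLM's server ignores the key by default
    openai_api_base="http://localhost:8000/v1",
    model_name="./models/Qwen2.5-14B-Instruct",
    max_tokens=128,
    temperature=0.8,
    top_p=0.95,
)

# Tokens should now arrive incrementally, one chunk per iteration.
for chunk in llm.stream("What is the capital of France ?"):
    print(chunk, end="|", flush=True)
)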
System Info
System Information
OS: Linux
OS Version: #1 SMP Fri Mar 24 10:04:47 CST 2023
Python Version: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0]
Package Information
Optional packages not installed
Other Dependencies