Cerebras Inference Integration #265
One thing I am unsure about is whether "Agent" support is available out of the box just from implementing "Inference". I did notice other API vendors advertising in README.md that they have Agent support, but I could not find the corresponding implementation in code.
@ashwinb friendly bump on this PR :) Please allow the CI to run for this PR.
Which other vendors? If you mean Fireworks and Together, that is because they both have Llama Stack distribution endpoints, so they make the entirety of the Llama Stack APIs available on their ends. That includes Agents, Memory, etc.
Ah I see, I didn't know about the distribution endpoints. I'll take out the ✅ in the Agents column.
Thanks for sharing instructions for reproducing. Well, here's what the server is returning:
This is clearly a malformed message. Why is it doing that? Because you aren't stopping generation on
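For readers following the stop-handling point: the comment above is cut off before it names the token, so the following is only a minimal sketch under stated assumptions. It uses a hypothetical stop marker `<|stop|>` and a hypothetical helper `stream_until_stop` (neither is taken from this PR or from the Cerebras or Llama Stack code) to show one way an adapter could cut streamed text at the first stop string, even when that string is split across chunks.

```python
from typing import Iterable, Iterator, Sequence


def stream_until_stop(chunks: Iterable[str], stop: Sequence[str]) -> Iterator[str]:
    """Yield text from `chunks`, ending at the first occurrence of any stop string.

    Hypothetical helper for illustration only; not part of any real SDK.
    """
    # Hold back enough characters that a stop string split across
    # two chunks is still detected before its prefix is emitted.
    holdback = max((len(s) for s in stop), default=1) - 1
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Positions of every stop string currently present in the buffer.
        hits = [i for i in (buffer.find(s) for s in stop) if i != -1]
        if hits:
            cut = min(hits)
            if cut > 0:
                yield buffer[:cut]
            return  # Drop the stop string and everything after it.
        # Emit everything except a tail that could begin a stop string.
        safe = len(buffer) - holdback
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    # The stream ended without a stop string: flush what is left.
    if buffer:
        yield buffer


# "<|stop|>" is a made-up marker; the real token is not stated in the thread.
pieces = ["Hello", " wor", "ld<|st", "op|>garbage"]
text = "".join(stream_until_stop(pieces, ["<|stop|>"]))
```

Here `text` contains only the content before the marker. Server-side stop parameters, where a provider supports them, are usually preferable to client-side truncation because generation actually halts instead of being discarded.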
This looks good, happy to get this in. Could you just fix the conflicts with the documentation files now?
@ashwinb please have another look. I just rebased to latest. Also, this might be a bug, but it seems like that
As a result, I haven't been able to reverify that the E2E tests for this integration are still passing.
Let's get this in. It has been forever. Thank you for all the iteration!
@henrytwo We will check why tests are suddenly not getting picked up.
Adding Cerebras Inference as an API provider.
Testing
Conda
Chat Completion
Non-Streaming Response
Streaming Response
Completion
Non-Streaming Response
Streaming Response
Pre-Commit Checks
Testing with test_inference.py
I ran python llama_stack/scripts/distro_codegen.py to run codegen.