Add support for OSS models via HuggingFace endpoints #22
Comments
Yup, the only thing missing is passing the base URL into the OpenAI client. If it did that, this extension would be compatible with open source endpoints like vLLM.
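For readers following along, a minimal sketch of what passing a base URL into the OpenAI Python client could look like. The helper names `client_kwargs` and `make_client` are hypothetical and are not taken from this extension's code; only `OpenAI(api_key=..., base_url=...)` is the client's own constructor.

```python
from typing import Optional


def client_kwargs(api_key: str, base_url: Optional[str] = None) -> dict:
    """Build keyword arguments for openai.OpenAI(...).

    base_url is included only when it is set, so the client keeps its
    default endpoint otherwise. (Hypothetical helper, for illustration.)
    """
    kwargs = {"api_key": api_key}
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs


def make_client(api_key: str, base_url: Optional[str] = None):
    """Create an OpenAI client that can point at a compatible endpoint."""
    # Imported lazily so client_kwargs() works without the openai package.
    from openai import OpenAI

    return OpenAI(**client_kwargs(api_key, base_url))
```

Assuming a vLLM OpenAI-compatible server is listening locally, a call such as `make_client("EMPTY", "http://localhost:8000/v1")` would target it instead of the default endpoint; the URL and placeholder key here are assumptions about a local setup.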
@jgpruitt I whipped up an implementation tonight that gets the OpenAI endpoints all working with a base_url variable and setting (similar to api_key). Tested and working with vLLM's endpoint. One thing I think I still need to do is support sending extra parameters to the client through a JSON payload (or something similar). This would allow users to pass in any custom settings the open source endpoints may support (e.g. custom sampler settings like beam search). After I get that going and add some tests, I'll open a PR.
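One possible shape for the extra-parameters idea, sketched under assumptions: the function name `build_request` and the `extra_json` argument are hypothetical, and the example key `use_beam_search` is only illustrative of a server-specific setting. The `extra_body` keyword is the OpenAI Python client's pass-through for fields outside the official API.

```python
import json
from typing import Any, Optional


def build_request(model: str, messages: list,
                  extra_json: Optional[str] = None) -> dict:
    """Assemble kwargs for client.chat.completions.create(...).

    extra_json is an optional JSON object (as text) holding endpoint-specific
    settings. (Hypothetical helper, for illustration.)
    """
    request: dict[str, Any] = {"model": model, "messages": messages}
    if extra_json:
        extra = json.loads(extra_json)
        if not isinstance(extra, dict):
            raise ValueError("extra parameters must be a JSON object")
        # Forwarded untouched in the request body by the openai client.
        request["extra_body"] = extra
    return request
```

The result would be used as `client.chat.completions.create(**build_request(...))`. Keeping the custom settings under `extra_body` rather than merging them into the top-level arguments avoids clashing with parameters the client already validates.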
@jgpruitt I'd like to pick your brain about the parameters and return types for pgai, if that's alright. I am trying to get the Postgres wrapper to fully comply with the OpenAI Python client as specified (other than things like streaming, which can't be supported), and I see some differences I'd like clarification on. For example, why do the embedding functions return text/table rather than consistent JSON based on the input parameters? Why did you not follow the JSON return type for the embed endpoint when you did for the chat completions endpoint? I'd really like to get some consistency here, and would rather upstream the changes than create a fork/new project to get it... but I also understand not breaking compatibility if that is a requirement at this point.
@jgpruitt Oh... just dug into some of the other branches before submitting a PR. Looks like you already handled this, as well as a few of the other things I was working on. If I wanted to migrate the non-conflicting changes to a new branch that will be used for the 0.4.0 release, which should I use?
Hey @Tostino, we'd also very much appreciate being able to use HuggingFace models here. Do you have any updates on where this is at? Cheers!
@Tostino Same here. It would be nice to have OpenAI-compatible APIs work, such as https://github.com/michaelfeil/infinity
Here is where it sits, @nicoscordialo and @BW-Projects: #219. There is currently a known performance issue that is blocking things from getting merged. If anyone wants to attempt to fix that, it would push things along. I am out of free time for a while, and when I tried to spend some time looking at the problem today, the project's build system had changed enough to break the build for me.