This repository has been archived by the owner on Jun 21, 2024. It is now read-only.

Triton inference server support #26

Open
wants to merge 10 commits into
base: main

cat-state
Collaborator

This PR adds support for Triton Inference Server. It might be worth having @dmahan93 or @conceptofmind try it out to verify. It may also be worth adding a non-Triton infer_model function that just loads the .pt file and runs it in-process.
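
A minimal sketch of what such a non-Triton path might look like, assuming the .pt file is a TorchScript archive saved with `torch.jit.save` (the PR does not say which format it uses). `LocalModel`, `load_torchscript`, and the `loader` parameter are hypothetical names for illustration and are not part of this PR; the call is also synchronous here, whereas the Triton path in this PR is awaited.

```python
from typing import Any, Callable


def load_torchscript(path: str) -> Callable[..., Any]:
    """Load a TorchScript archive and return a callable that runs it.

    Assumption: the .pt file was written with torch.jit.save.
    """
    import torch  # imported lazily so this module loads without PyTorch

    model = torch.jit.load(path, map_location="cpu")
    model.eval()

    def run(*inputs: Any) -> Any:
        # Inference only, so gradient tracking is disabled.
        with torch.no_grad():
            return model(*inputs)

    return run


class LocalModel:
    """Hypothetical in-process stand-in for the Triton-backed model."""

    def __init__(
        self,
        path: str,
        loader: Callable[[str], Callable[..., Any]] = load_torchscript,
    ) -> None:
        self.path = path
        self._loader = loader
        self._model = None  # loaded on first use

    def infer_model(self, *inputs: Any) -> Any:
        # Load the weights once, then reuse them for every call.
        if self._model is None:
            self._model = self._loader(self.path)
        return self._model(*inputs)
```

Passing a different `loader` lets the same class be exercised without PyTorch or a real checkpoint, which could be handy for quick local checks.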

from langchain.chains import LLMChain
from langchain import Cohere, PromptTemplate

# Optional imports
from googleapiclient.discovery import build


class Tool:
Collaborator

Tool needs an update function that can be called; the retrieval and calendar tools will need this.

Collaborator Author

What should the update do?

Collaborator

Pass in the data; the update will use the text, URL, etc. to set the tool up for the generation.
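
A rough sketch of the update hook described above, assuming the data arrives as a dict with an optional "text" key. The `update` signature, the `RetrievalTool` class, and the word-overlap scoring are all hypothetical and only illustrate the shape of the idea; the actual `Tool` class in this PR may look different.

```python
from typing import Any, Dict, List


class Tool:
    """Hypothetical base class; the real Tool in this PR may differ."""

    def update(self, data: Dict[str, Any]) -> None:
        # Default: tools that need no per-example state ignore the data.
        pass

    def __call__(self, query: str) -> str:
        raise NotImplementedError


class RetrievalTool(Tool):
    """Toy retrieval tool that searches only the text it was updated with."""

    def __init__(self) -> None:
        self.sentences: List[str] = []

    def update(self, data: Dict[str, Any]) -> None:
        # Rebuild the searchable sentences from the current example's text.
        text = data.get("text", "")
        self.sentences = [s.strip() for s in text.split(".") if s.strip()]

    def __call__(self, query: str) -> str:
        # Return the sentence sharing the most words with the query.
        words = set(query.lower().split())
        return max(
            self.sentences,
            key=lambda s: len(words & set(s.lower().split())),
            default="",
        )
```

Under this sketch, the sampling loop would call `tool.update(data)` once per example before generating API calls, so a calendar tool could set its current date in the same way.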

selected_start_tokens = probs[:, insert_api_at, start_tokens].argmax().item()

for i in range(m):
    _, api_calls = await model(
Collaborator

We need to add the tool in here before sending, e.g. [Calendar(

Collaborator Author

hmm, shouldn't that already be in the annotation prompt?


async def sample_and_filter_api_calls(tool, text, top_k, n_gen):
    async for tool_use in sample_api_calls(
Collaborator

We should chunk this; otherwise we're cutting off everything past 512(?) tokens.
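
One way the chunking could look, as a sketch only: split the token ids into windows of at most `max_len` tokens and run sampling on each window. The 512 limit is the unconfirmed figure from the comment above, and `chunk_tokens` and its `overlap` parameter are hypothetical names that do not appear in this PR.

```python
from typing import Iterator, List, Sequence


def chunk_tokens(
    tokens: Sequence[int], max_len: int = 512, overlap: int = 0
) -> Iterator[List[int]]:
    """Yield consecutive windows of at most max_len tokens.

    overlap is the number of tokens shared between neighbouring windows,
    so an API call near a boundary still has some left context.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    for start in range(0, len(tokens), step):
        yield list(tokens[start:start + max_len])
        # Stop once this window has reached the end of the sequence.
        if start + max_len >= len(tokens):
            break
```

Each chunk could then be decoded back to text and passed to `sample_api_calls` separately, which keeps every request under the context limit at the cost of more model calls.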

Needs a tool update function for current data

@dmahan93
Collaborator

I'm cool with putting this into a dev branch and working on the stuff I found separately, if that sounds good, or we can update this PR.
