-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pinecone integration #189
base: main
Are you sure you want to change the base?
Pinecone integration #189
Conversation
27729fc
to
f6c01c8
Compare
Thanks for the PR! I'll review it properly later. The first reaction was that we won't need a custom index (to avoid duplication), but you might be right that it's the easiest way. Questions for you:
(some notes for me)
|
The chunks and embeddings get uploaded to Pinecone and then Pinecone takes care of sorting/clustering and all that when retrieving.
Well, no, but it's possible I think: https://docs.pinecone.io/guides/inference/rerank |
"_type": "metadatamessage" | ||
}, | ||
{ | ||
"content": "Act as a world-class Julia language programmer with access to the latest Julia-related knowledge via Context Information. \n\n**Instructions:**\n- Answer the question based only on the provided Context.\n- Be precise and answer only when you're confident in the high quality of your answer.\n- Be brief and concise.\n\n**Context Information:**\n---\n{{context}}\n---\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be further down the line but I'd recommend adding stronger wording about refusing to answer when the context is bad / wrong to avoid weak answer. Knowing when to refuse is a key for hallucination minimization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be removed from here altogether. It has nothing to do with Pinecone.
Project.toml
Outdated
@@ -7,15 +7,18 @@ version = "0.45.0" | |||
AbstractTrees = "1520ce14-60c1-5f80-bbc7-55ef81b5835c" | |||
Base64 = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" | |||
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a" | |||
DotEnv = "4dc1fcf4-5e3b-5448-94ab-0c38ec0385c1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is just a Draft, but these deps would have to go out and Pinecone would be an extension
Apologies for scattering comments all over! One thing, Pinecone should be relevant for "retrieve" step. It should need very little changed for Eg, If you say:
Option 1) The the simplest path should be to have a lightweight PineconeIndex + method for retrieve (manually writing everything) + method for build_context (to avoid creating the overlaps which are not possible with 1:1 retrieval). Option 2) (this also promotes more re-use and optional addition of other methods as the RAG paradigm progresses)
Option 3) (I would recommend this) The last option requires also some methods for find_closest (your "query"). And a few extensions to support the No-Op with new Index (eg, in find_tags). What do you think? |
Sorry for butchering your API! I didn't mean to actually hit click on creating this PR, it is definitely far from "ready"! Thanks for all the comments. I would go for your recommendation (option 3), but I am still unsure how to go about that. @pankgeorg what do you say? |
I integrated all feedback, I hope. There's still some TODOs left and some choices left to make, as we discussed. I'll try to include everything below: Comments
|
indexid(cc::CandidateWithChunks) = cc.index_id | ||
positions(cc::CandidateWithChunks) = cc.positions | ||
scores(cc::CandidateWithChunks) = cc.scores | ||
chunks(cc::CandidateWithChunks) = cc.chunks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we will need these accessors but we can clean it up later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good so far! We can clean it up later.
As you pointed out, the build_index and the SubManagedIndex are index (for the view op).
It would be amazing to add a minimal example (you can grab some example documents from the docstrings of airag) to show how it works end-to-end
TODOs:
|
I got to the cleaning up point. There are also a bunch of TODOs scattered around, minor aspects that need fixing. I will do the end-to-end example once those are finished. @svilupp would you mind having another look at this, please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good so far! A few minor questions around the Pinecone index and its fields.
Btw do you have like a minimal running example? I have a feeling this wouldnt work for airag
|
||
Dispatch for `AbstractManagedIndex`. | ||
""" | ||
function answer!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a separate method? I believe the index isnt used so maybe we just need to allow the new type in the definition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was an ambiguity issue with this, that's why I added a different function instead of providing the index type as a Union
.
end | ||
HasKeywords(::PineconeIndex) = false | ||
HasEmbeddings(::PineconeIndex) = true | ||
embeddings(index::PineconeIndex) = index.embeddings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we have these accessors and fields if we know that the data isn’t there? we hold the data in the individual candidatechunk objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are removed, then getting the embeddings in retrieve
should be changed.
I integrated your feedback @svilupp. I hope I understood your comments well enough. I left some TODOs and also added a MWE for |
Thank you for taking it thus far, Iulia! I hope to take it over and merge it, but I'm short on time these days, so it might take a while before I get to it. |
Trying to finish this up but I am facing issues with the Pinecone.jl package. The last released version was v1.0. However, the version that gets installed is Am I missing something? Where is this version |
Thank you for looking into it! My understanding that Iulia was working on some changes to Pinecone as well, but I don't know any details. If we don't hear from Iulia, we could probably target Pinecone v1 compatibility? All we need is to save the objects/configs needed to make queries in the Index and then save the chunks we get back in the Candidates. |
It's coming from this PR most likely tullytim/Pinecone.jl#19; @vdayanand can say more!
v1 no longer works with the free tier of Pinecone, which is serverless :( |
This is an attempt to integrate Pinecone into PromptingTools. The preparation and retrieval steps are not necessary with Pinecone. The result of the Pinecone query serves as context in the
RAGResult
.Questions
PT
prefix such as to not cause confusion withPinecone.jl
(there's already aPineconeIndex
there, which is a different thing). That is not great.NoChunker
type such thatPTPineconeIndexer
can still have a chunker. Is that best? Probably not.find_tags
might not be necessary.AbstractPTPineconeIndex
)? Probably not.JuliaRAGAssistant
, and I put that underpersona-task
. I'm not sure that's the best place.