Commit

Update RAGTools docstrings
add names of the corresponding methods to various structs (same as using methodswith...)
svilupp authored Aug 5, 2024
1 parent 20a8447 commit e85e4ab
Showing 3 changed files with 20 additions and 20 deletions.
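
The commit message compares the new docstring notes to what `methodswith` reports. A minimal sketch of that lookup follows; the `*Demo` types and `refine_demo` are hypothetical stand-ins and not the RAGTools definitions:

```julia
using InteractiveUtils  # provides `methodswith` outside the REPL

# Hypothetical stand-ins for the RAGTools types; not the package's own definitions.
abstract type AbstractRefinerDemo end
struct NoRefinerDemo <: AbstractRefinerDemo end
struct SimpleRefinerDemo <: AbstractRefinerDemo end

# Each empty struct only selects a method via multiple dispatch.
refine_demo(::NoRefinerDemo, answer::AbstractString) = String(answer)
refine_demo(::SimpleRefinerDemo, answer::AbstractString) =
    String(strip(answer)) * " (refined)"

# List the methods of `refine_demo` whose signature mentions `SimpleRefinerDemo`.
found = methodswith(SimpleRefinerDemo, refine_demo)
refined = refine_demo(SimpleRefinerDemo(), "  draft answer ")
```

Stating the method name in the struct docstring saves the reader this lookup.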
8 changes: 4 additions & 4 deletions src/Experimental/RAGTools/generation.jl
@@ -150,14 +150,14 @@ struct NoRefiner <: AbstractRefiner end
"""
SimpleRefiner <: AbstractRefiner
-Refines the answer using the same context previously provided via the provided prompt template.
+Refines the answer using the same context previously provided via the provided prompt template. A method for `refine!`.
"""
struct SimpleRefiner <: AbstractRefiner end

"""
TavilySearchRefiner <: AbstractRefiner
-Refines the answer by executing a web search using the Tavily API. This method aims to enhance the answer's accuracy and relevance by incorporating information retrieved from the web.
+Refines the answer by executing a web search using the Tavily API. This method aims to enhance the answer's accuracy and relevance by incorporating information retrieved from the web. A method for `refine!`.
"""
struct TavilySearchRefiner <: AbstractRefiner end

@@ -172,7 +172,7 @@ end
refiner::NoRefiner, index::AbstractChunkIndex, result::AbstractRAGResult;
kwargs...)
-Simple no-op function for `refine`. It simply copies the `result.answer` and `result.conversations[:answer]` without any changes.
+Simple no-op function for `refine!`. It simply copies the `result.answer` and `result.conversations[:answer]` without any changes.
"""
function refine!(
refiner::NoRefiner, index::AbstractDocumentIndex, result::AbstractRAGResult;
@@ -511,7 +511,7 @@ end
"""
RAGConfig <: AbstractRAGConfig
-Default configuration for RAG. It uses `SimpleIndexer`, `SimpleRetriever`, and `SimpleGenerator` as default components.
+Default configuration for RAG. It uses `SimpleIndexer`, `SimpleRetriever`, and `SimpleGenerator` as default components. Provided as the first argument in `airag`.
To customize the components, replace corresponding fields for each step of the RAG pipeline (eg, use `subtypes(AbstractIndexBuilder)` to find the available options).
"""
6 changes: 3 additions & 3 deletions src/Experimental/RAGTools/preparation.jl
@@ -52,7 +52,7 @@ struct NoProcessor <: AbstractProcessor end
"""
BinaryBatchEmbedder <: AbstractEmbedder
-Same as `BatchEmbedder` but reduces the embeddings matrix to a binary form (eg, `BitMatrix`).
+Same as `BatchEmbedder` but reduces the embeddings matrix to a binary form (eg, `BitMatrix`). Defines a method for `get_embeddings`.
Reference: [HuggingFace: Embedding Quantization](https://huggingface.co/blog/embedding-quantization#binary-quantization-in-vector-databases).
"""
@@ -61,7 +61,7 @@ struct BinaryBatchEmbedder <: AbstractEmbedder end
"""
BitPackedBatchEmbedder <: AbstractEmbedder
-Same as `BatchEmbedder` but reduces the embeddings matrix to a binary form packed in UInt64 (eg, `BitMatrix.chunks`).
+Same as `BatchEmbedder` but reduces the embeddings matrix to a binary form packed in UInt64 (eg, `BitMatrix.chunks`). Defines a method for `get_embeddings`.
See also utilities `pack_bits` and `unpack_bits` to move between packed/non-packed binary forms.
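
The two binary forms named in the `BinaryBatchEmbedder` and `BitPackedBatchEmbedder` docstrings above can be illustrated with plain Julia. This is only a sketch, assuming sign-based quantization (positive values map to `true`). `pack_column_demo` is a hypothetical helper, not the package's `pack_bits`:

```julia
# Toy embeddings: 4 dimensions (rows) x 2 documents (columns).
emb = Float32[0.3 -0.1; -0.7 0.2; 0.1 0.9; -0.2 -0.4]

# Binary quantization (assumed rule): keep only the sign of each value.
binary = emb .> 0                      # BitMatrix, 1 bit per value

# Bit-packing: store each document's bits in UInt64 words.
# Hypothetical helper for illustration; not the package's `pack_bits`.
pack_column_demo(bits::AbstractMatrix{Bool}, j::Integer) =
    BitVector(bits[:, j]).chunks
packed = reduce(hcat, [pack_column_demo(binary, j) for j in axes(binary, 2)])

# Hamming distance between the two documents, computed on the packed words.
hamming = sum(count_ones.(xor.(packed[:, 1], packed[:, 2])))
```

Working on packed `UInt64` words lets one `xor` plus `count_ones` compare 64 dimensions at a time.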
@@ -146,7 +146,7 @@ function _normalize end
load_text(chunker::AbstractChunker, input;
kwargs...)
-Load text from `input` using the provided `chunker`
+Load text from `input` using the provided `chunker`. Called by `get_chunks`.
Available chunkers:
- `FileChunker`: The function opens each file in `input` and reads its contents.
26 changes: 13 additions & 13 deletions src/Experimental/RAGTools/retrieval.jl
@@ -10,14 +10,14 @@ struct NoRephraser <: AbstractRephraser end
"""
SimpleRephraser <: AbstractRephraser
-Rephraser implemented using the provided AI Template (eg, `...`) and standard chat model.
+Rephraser implemented using the provided AI Template (eg, `...`) and standard chat model. A method for `rephrase`.
"""
struct SimpleRephraser <: AbstractRephraser end

"""
HyDERephraser <: AbstractRephraser
-Rephraser implemented using the provided AI Template (eg, `...`) and standard chat model.
+Rephraser implemented using the provided AI Template (eg, `...`) and standard chat model. A method for `rephrase`.
It uses a prompt-based rephrasing method called HyDE (Hypothetical Document Embedding), where instead of looking for an embedding of the question,
we look for the documents most similar to a synthetic passage that _would be_ a good answer to our question.
@@ -29,14 +29,14 @@ struct HyDERephraser <: AbstractRephraser end
"""
CosineSimilarity <: AbstractSimilarityFinder
-Finds the closest chunks to a query embedding by measuring the cosine similarity between the query and the chunks' embeddings.
+Finds the closest chunks to a query embedding by measuring the cosine similarity between the query and the chunks' embeddings. A method for `find_closest` (see the docstring for more details and usage example).
"""
struct CosineSimilarity <: AbstractSimilarityFinder end

"""
BinaryCosineSimilarity <: AbstractSimilarityFinder
-Finds the closest chunks to a query embedding by measuring the Hamming distance AND cosine similarity between the query and the chunks' embeddings in binary form.
+Finds the closest chunks to a query embedding by measuring the Hamming distance AND cosine similarity between the query and the chunks' embeddings in binary form. A method for `find_closest`.
It follows the two-pass approach:
- First pass: Hamming distance in binary form to get the `top_k * rescore_multiplier` (ie, more than top_k) candidates.
@@ -49,7 +49,7 @@ struct BinaryCosineSimilarity <: AbstractSimilarityFinder end
"""
BitPackedCosineSimilarity <: AbstractSimilarityFinder
-Finds the closest chunks to a query embedding by measuring the Hamming distance AND cosine similarity between the query and the chunks' embeddings in binary form.
+Finds the closest chunks to a query embedding by measuring the Hamming distance AND cosine similarity between the query and the chunks' embeddings in binary form. A method for `find_closest`.
The difference to `BinaryCosineSimilarity` is that the binary values are packed into UInt64, which is more efficient.
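
The two-pass search described for `BinaryCosineSimilarity` and `BitPackedCosineSimilarity` above can be sketched as follows. This is a minimal illustration, not the package's `find_closest` implementation. The function name `two_pass_closest_demo` is hypothetical. The second pass (rescoring the candidates by cosine similarity against the float query) is an assumption, since the diff shows only the first pass:

```julia
using LinearAlgebra: dot, norm

# Hypothetical sketch of a two-pass search; not the package's `find_closest`.
function two_pass_closest_demo(docs_bin::AbstractMatrix{Bool},
        query::AbstractVector{<:Real};
        top_k::Int = 2, rescore_multiplier::Int = 2)
    query_bin = query .> 0  # binarize the query with the same sign rule

    # First pass: Hamming distance in binary form, keep more than `top_k` candidates.
    hamming = [count(xor.(docs_bin[:, j], query_bin)) for j in axes(docs_bin, 2)]
    n_candidates = min(top_k * rescore_multiplier, size(docs_bin, 2))
    candidates = sortperm(hamming)[1:n_candidates]

    # Second pass (assumed): cosine similarity of each candidate with the float query.
    scores = map(candidates) do j
        d = Float64.(docs_bin[:, j])
        denom = norm(d) * norm(query)
        denom == 0 ? 0.0 : dot(d, query) / denom
    end
    keep = sortperm(scores; rev = true)[1:min(top_k, length(scores))]
    return candidates[keep], scores[keep]
end

# Toy data: 4 dimensions (rows) x 3 documents (columns).
docs_bin = Bool[1 0 1; 0 1 0; 1 1 1; 0 0 1]
query = [0.9, -0.2, 0.4, -0.1]
ids, scores = two_pass_closest_demo(docs_bin, query; top_k = 1, rescore_multiplier = 2)
```

The cheap Hamming pass narrows the field, so the more expensive float arithmetic runs only on `top_k * rescore_multiplier` columns.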
@@ -61,7 +61,7 @@ struct BitPackedCosineSimilarity <: AbstractSimilarityFinder end
"""
BM25Similarity <: AbstractSimilarityFinder
-Finds the closest chunks to a query embedding by measuring the BM25 similarity between the query and the chunks' embeddings in binary form.
+Finds the closest chunks to a query embedding by measuring the BM25 similarity between the query and the chunks' embeddings in binary form. A method for `find_closest`.
Reference: [Wikipedia: BM25](https://en.wikipedia.org/wiki/Okapi_BM25).
Implementation follows: [The Next Generation of Lucene Relevance](https://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/).
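
For orientation, the BM25 score referenced in the `BM25Similarity` docstring above can be sketched with the standard Okapi formula. This is an illustration only: `bm25_scores_demo`, the defaults `k1 = 1.2` and `b = 0.75`, and the pre-tokenized input are assumptions, not the package's implementation:

```julia
# Hypothetical sketch of Okapi BM25 over pre-tokenized documents.
function bm25_scores_demo(docs::AbstractVector{<:AbstractVector{<:AbstractString}},
        query::AbstractVector{<:AbstractString};
        k1::Float64 = 1.2, b::Float64 = 0.75)
    n_docs = length(docs)
    avg_len = sum(length, docs) / n_docs
    scores = zeros(Float64, n_docs)
    for term in unique(query)
        # Number of documents containing the term at least once.
        doc_freq = count(doc -> term in doc, docs)
        doc_freq == 0 && continue
        # Inverse document frequency with +1 inside the log (never negative).
        idf = log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1)
        for (i, doc) in enumerate(docs)
            tf = count(==(term), doc)  # term frequency in this document
            len_norm = 1 - b + b * length(doc) / avg_len
            scores[i] += idf * tf * (k1 + 1) / (tf + k1 * len_norm)
        end
    end
    return scores
end

docs = [["julia", "rag", "tools"], ["python", "web", "server"], ["julia", "julia", "plots"]]
scores = bm25_scores_demo(docs, ["julia"])
best = argmax(scores)
```

Term frequency saturates through `k1`, while `b` controls how strongly longer documents are penalized.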
@@ -71,7 +71,7 @@ struct BM25Similarity <: AbstractSimilarityFinder end
"""
MultiFinder <: AbstractSimilarityFinder
-Composite finder for `MultiIndex` where we want to set multiple finders for each index.
+Composite finder for `MultiIndex` where we want to set multiple finders for each index. A method for `find_closest`.
Positions correspond to `indexes(::MultiIndex)`.
"""
struct MultiFinder <: AbstractSimilarityFinder
@@ -91,14 +91,14 @@ struct NoTagFilter <: AbstractTagFilter end
"""
AnyTagFilter <: AbstractTagFilter
-Finds the chunks that have ANY OF the specified tag(s).
+Finds the chunks that have ANY OF the specified tag(s). A method for `find_tags`.
"""
struct AnyTagFilter <: AbstractTagFilter end

"""
AllTagFilter <: AbstractTagFilter
-Finds the chunks that have ALL OF the specified tag(s).
+Finds the chunks that have ALL OF the specified tag(s). A method for `find_tags`.
"""
struct AllTagFilter <: AbstractTagFilter end
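
The ANY OF versus ALL OF semantics of the two filters above can be sketched with dispatch on empty structs. The `*Demo` names and the vector-of-tags layout are hypothetical, and the package's `find_tags` operates on an index instead:

```julia
# Hypothetical stand-ins; not the RAGTools types or the real `find_tags` signature.
abstract type AbstractTagFilterDemo end
struct AnyTagFilterDemo <: AbstractTagFilterDemo end
struct AllTagFilterDemo <: AbstractTagFilterDemo end

# Keep chunks that carry at least one of the wanted tags.
find_tags_demo(::AnyTagFilterDemo, chunk_tags, wanted) =
    findall(tags -> any(in(tags), wanted), chunk_tags)

# Keep chunks that carry every wanted tag.
find_tags_demo(::AllTagFilterDemo, chunk_tags, wanted) =
    findall(tags -> all(in(tags), wanted), chunk_tags)

chunk_tags = [["julia", "rag"], ["julia"], ["python", "rag"]]
any_hits = find_tags_demo(AnyTagFilterDemo(), chunk_tags, ["julia", "rag"])
all_hits = find_tags_demo(AllTagFilterDemo(), chunk_tags, ["julia", "rag"])
```

With these inputs, the ANY filter keeps all three chunks and the ALL filter keeps only the first.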

@@ -622,14 +622,14 @@ struct NoReranker <: AbstractReranker end
"""
CohereReranker <: AbstractReranker
-Rerank strategy using the Cohere Rerank API. Requires an API key.
+Rerank strategy using the Cohere Rerank API. Requires an API key. A method for `rerank`.
"""
struct CohereReranker <: AbstractReranker end

"""
FlashRanker <: AbstractReranker
-Rerank strategy using the package FlashRank.jl and local models.
+Rerank strategy using the package FlashRank.jl and local models. A method for `rerank`.
You must first import the FlashRank.jl package.
To automatically download any required models, set your
@@ -659,7 +659,7 @@ end
"""
RankGPTReranker <: AbstractReranker
-Rerank strategy using the RankGPT algorithm (calling LLMs).
+Rerank strategy using the RankGPT algorithm (calling LLMs). A method for `rerank`.
# Reference
[1] [Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents by W. Sun et al.](https://arxiv.org/abs/2304.09542)
@@ -869,7 +869,7 @@ end
"""
SimpleRetriever <: AbstractRetriever
-Default implementation for `retrieve`. It does a simple similarity search via `CosineSimilarity` and returns the results.
+Default implementation for `retrieve` function. It does a simple similarity search via `CosineSimilarity` and returns the results.
Make sure to use consistent `embedder` and `tagger` with the Preparation Stage (`build_index`)!
Expand Down

0 comments on commit e85e4ab

Please sign in to comment.