fix: sync frequency and sync routes example
jamescalam committed Nov 10, 2024
1 parent ca34d21 commit 8683f68
Showing 3 changed files with 472 additions and 6 deletions.
391 changes: 391 additions & 0 deletions docs/indexes/pinecone-sync-routes.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,391 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#!pip install -qU \"semantic-router[pinecone]==0.0.73\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Syncing Routes with Pinecone Index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When using the `PineconeIndex`, our `RouteLayer` is stored in two places:\n",
"\n",
"* We keep route layer metadata locally.\n",
"* Vectors, alongside a backup of our metadata, are stored remotely in Pinecone.\n",
"\n",
"By storing some data locally and some remotely, we gain better persistence and the ability to recover our local state if it is lost. However, this comes with the challenge of keeping our local and remote instances synchronized. Fortunately, we have [several synchronization options](https://docs.aurelio.ai/semantic-router/route_layer/sync.html). In this example, we'll see how to use these options to keep our local and remote Pinecone instances synchronized."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from semantic_router import Route\n",
"\n",
"# we could use this as a guide for our chatbot to avoid political conversations\n",
"politics = Route(\n",
" name=\"politics\",\n",
" utterances=[\n",
" \"isn't politics the best thing ever\",\n",
" \"why don't you tell me about your political opinions\",\n",
" \"don't you just love the president\",\n",
" \"don't you just hate the president\",\n",
" \"they're going to destroy this country!\",\n",
" \"they will save the country!\",\n",
" ],\n",
")\n",
"\n",
"# this could be used as an indicator to our chatbot to switch to a more\n",
"# conversational prompt\n",
"chitchat = Route(\n",
" name=\"chitchat\",\n",
" utterances=[\n",
" \"how's the weather today?\",\n",
" \"how are things going?\",\n",
" \"lovely weather today\",\n",
" \"the weather is horrendous\",\n",
" \"let's go to the chippy\",\n",
" ],\n",
")\n",
"\n",
"# we place both of our decisions together into a single list\n",
"routes = [politics, chitchat]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"from semantic_router.encoders import OpenAIEncoder\n",
"\n",
"# get at platform.openai.com\n",
"os.environ[\"OPENAI_API_KEY\"] = os.environ.get(\"OPENAI_API_KEY\") or getpass(\"Enter OpenAI API key: \")\n",
"\n",
"encoder = OpenAIEncoder(name=\"text-embedding-3-small\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For our `PineconeIndex` we do the same thing, i.e. we initialize it as usual:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"from semantic_router.index.pinecone import PineconeIndex\n",
"\n",
"# get at app.pinecone.io\n",
"os.environ[\"PINECONE_API_KEY\"] = os.environ.get(\"PINECONE_API_KEY\") or getpass(\"Enter Pinecone API key: \")\n",
"\n",
"pc_index = PineconeIndex(\n",
" dimensions=1536,\n",
" init_async_index=True, # enables asynchronous methods, it's optional\n",
" sync=None, # defines whether we sync between local and remote route layers\n",
" # when sync is None, no sync is performed\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## RouteLayer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `RouteLayer` class supports both synchronous and asynchronous operations by default, so we initialize it as usual:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from semantic_router.layer import RouteLayer\n",
"\n",
"rl = RouteLayer(encoder=encoder, routes=routes, index=pc_index)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can check our route layer and index information as usual:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['politics', 'chitchat']"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rl.list_route_names()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(rl.index)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see if our local and remote instances are synchronized..."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hash_id: sr_hash#\n"
]
},
{
"data": {
"text/plain": [
"ConfigParameter(field='sr_hash', value='f8f04794014c855bd68e283e64c57d8cc7a92f2ecd143386105de98f57c55e04', namespace='', created_at='2024-11-10T21:41:35.991948')"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import time\n",
"\n",
"# due to pinecone indexing latency we wait 3 seconds\n",
"time.sleep(3)\n",
"rl.index._read_hash()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hash_id: sr_hash#\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rl.is_synced()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It looks like everything is synced! Let's try deleting our local route layer, initializing it with just the politics route, and checking again."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hash_id: sr_hash#\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[33m2024-11-10 22:41:41 WARNING semantic_router.utils.logger Local and remote route layers were not aligned. Remote hash not updated. Use `RouteLayer.get_utterance_diff()` to see details.\u001b[0m\n"
]
}
],
"source": [
"del rl\n",
"\n",
"rl = RouteLayer(encoder=encoder, routes=[politics], index=pc_index)\n",
"time.sleep(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's try `rl.is_synced()` again:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hash_id: sr_hash#\n"
]
},
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rl.is_synced()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can use the `get_utterance_diff` method to see exactly _why_ our local and remote are not synced."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['chitchat: how are things going?', \"chitchat: how's the weather today?\", \"chitchat: let's go to the chippy\", 'chitchat: lovely weather today', 'chitchat: the weather is horrendous', \"politics: don't you just hate the president\", \"politics: don't you just love the president\", \"politics: isn't politics the best thing ever\", 'politics: they will save the country!', \"politics: they're going to destroy this country!\", \"politics: why don't you tell me about your political opinions\"]\n",
"[\"politics: don't you just hate the president\", \"politics: don't you just love the president\", \"politics: isn't politics the best thing ever\", 'politics: they will save the country!', \"politics: they're going to destroy this country!\", \"politics: why don't you tell me about your political opinions\"]\n"
]
},
{
"data": {
"text/plain": [
"['+ chitchat: how are things going?',\n",
" \"+ chitchat: how's the weather today?\",\n",
" \"+ chitchat: let's go to the chippy\",\n",
" '+ chitchat: lovely weather today',\n",
" '+ chitchat: the weather is horrendous',\n",
" \" politics: don't you just hate the president\",\n",
" \" politics: don't you just love the president\",\n",
" \" politics: isn't politics the best thing ever\",\n",
" ' politics: they will save the country!',\n",
" \" politics: they're going to destroy this country!\",\n",
" \" politics: why don't you tell me about your political opinions\"]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rl.get_utterance_diff()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "semantic_router_1",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
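
The sync check and diff shown in the notebook can be approximated with the standard library alone. The sketch below is not semantic-router's implementation: `layer_hash` and `utterance_diff` are hypothetical helpers, and hashing a canonical JSON dump with SHA-256 is an assumption made purely for illustration. It shows why a layer built from only the `politics` route no longer matches a remote that also holds `chitchat`, and how `difflib.Differ` produces lines in the same `+ `-prefixed style as the `get_utterance_diff()` output.

```python
import difflib
import hashlib
import json


def utterance_lines(routes):
    """Flatten {route_name: [utterances]} into sorted 'route: utterance' strings."""
    return sorted(
        f"{name}: {utterance}"
        for name, utterances in routes.items()
        for utterance in utterances
    )


def layer_hash(routes):
    """Hash a canonical JSON dump of the routes (assumed scheme, for illustration)."""
    payload = json.dumps(routes, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def utterance_diff(local, remote):
    """Compare local against remote; '+ ' marks lines found only in remote."""
    differ = difflib.Differ()
    return list(differ.compare(utterance_lines(local), utterance_lines(remote)))


# Hypothetical data mirroring the notebook: remote holds both routes,
# while the re-created local layer holds only politics.
remote_routes = {
    "politics": ["don't you just love the president"],
    "chitchat": ["lovely weather today"],
}
local_routes = {"politics": ["don't you just love the president"]}

# Differing hashes signal that local and remote are out of sync.
in_sync = layer_hash(local_routes) == layer_hash(remote_routes)
print(in_sync)

for line in utterance_diff(local_routes, remote_routes):
    print(line)
```

Comparing one short hash is cheap, so it suits a quick yes/no check; the line-by-line diff is only needed once the hashes disagree.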