Merge remote-tracking branch 'origin/main' into vNext-Dev

microsoft · Jul 2, 2024 · b2bc8dc
2 parents e8ffbdb + 8ec9cbe

Showing 4 changed files with 24 additions and 11 deletions.
diff --git a/app/backend/approaches/tabulardataassistant.py b/app/backend/approaches/tabulardataassistant.py
@@ -106,7 +106,13 @@ def process_agent_scratch_pad(question, df):
     deployment_name=OPENAI_DEPLOYMENT_NAME)  
 
     question = save_chart(question)
-    pdagent = create_pandas_dataframe_agent(chat, df, verbose=True,agent_type=AgentType.OPENAI_FUNCTIONS)
+    # This agent relies on access to a python repl tool which can execute arbitrary code.
+    # This can be dangerous and requires a specially sandboxed environment to be safely used.
+    # Failure to properly sandbox this class can lead to arbitrary code execution vulnerabilities,
+    # which can lead to data breaches, data loss, or other security incidents. You must opt in
+    # to use this functionality by setting allow_dangerous_code=True.
+    # https://api.python.langchain.com/en/latest/agents/langchain_experimental.agents.agent_toolkits.pandas.base.create_pandas_dataframe_agent.html
+    pdagent = create_pandas_dataframe_agent(chat, df, verbose=True,agent_type=AgentType.OPENAI_FUNCTIONS,allow_dangerous_code=True , handle_parsing_errors=True )
     for chunk in pdagent.stream({"input": question}):
         if "actions" in chunk:
             for action in chunk["actions"]:
@@ -134,8 +140,8 @@ def process_agent_response(question):
     deployment_name=OPENAI_DEPLOYMENT_NAME)  
 
 
-    pdagent = create_pandas_dataframe_agent(chat, dffinal, verbose=True,handle_parsing_errors=True,agent_type=AgentType.OPENAI_FUNCTIONS)
+    pdagent = create_pandas_dataframe_agent(chat, dffinal, verbose=True,handle_parsing_errors=True,agent_type=AgentType.OPENAI_FUNCTIONS, allow_dangerous_code=True)
     for chunk in pdagent.stream({"input": question}):
         if "output" in chunk:
             output = f'Final Output: ```{chunk["output"]}```'
-            return output
+            return output
diff --git a/app/backend/requirements.txt b/app/backend/requirements.txt
@@ -1,9 +1,9 @@
 #### Any version change made here should also be made and tested for the enrichment and function apps in /functions and /app/enrichment
-azure-identity==1.12.0
+azure-identity==1.16.1
 Flask==2.3.2
-langchain==0.1.16
+langchain==0.2.5
 azure-mgmt-cognitiveservices==13.5.0
-openai==1.17.0
+openai==1.24.0
 # azure-search-documents==11.4.0
 azure-search-documents==11.4.0b11
 azure-storage-blob==12.16.0
@@ -13,7 +13,7 @@ fastapi == 0.109.1
 fastapi-utils == 0.2.1
 uvicorn == 0.23.2
 numexpr == 2.10.0
-langchain-experimental==0.0.52
+langchain-experimental==0.0.61
 microsoft-bing-websearch==1.0.0
 tabulate==0.9.0
 matplotlib==3.8.3
@@ -22,6 +22,7 @@ pandas==2.2.1
 python-multipart==0.0.9
 Pillow==10.3.0
 wikipedia==1.4.0
-langchain-openai == 0.1.3
+langchain-openai == 0.1.7
 pytest==8.2.1
-python-dotenv==1.0.1
+python-dotenv==1.0.1
+langchain-community==0.2.5 
diff --git a/docs/deployment/deployment.md b/docs/deployment/deployment.md
@@ -56,9 +56,9 @@ AZURE_ENVIRONMENT | Yes | This will determine the Azure cloud environment the de
 SECURE_MODE | Yes | Defaults to `false`. This feature flag will determine if the Information Assistant deploys it's Azure Infrastructure in a secure mode or not.</br>:warning: Before enabling secure mode please read the extra instructions on [Enabling Secure Deployment](/docs/secure_deployment/secure_deployment.md)
 ENABLE_WEB_CHAT | Yes | Defaults to `false`. This feature flag will enable the ability to use Web Search results as a data source for generating answers from the LLM. This feature will also deploy a Bing v7 Search instance in Azure to retrieve web results from, however Bing v7 Search is not available in AzureUSGovernment regions, so this feature flag is **NOT** compatible with `AZURE_ENVIRONMENT=AzureUSGovernment`.
 ENABLE_BING_SAFE_SEARCH | No | Defaults to `true`. If you are using the `ENABLE_WEB_CHAT`feature you can set the following values to enable safe search on the Bing v7 Search APIs.
-ENABLE_UNGROUNDED_CHAT | Defaults to `false`. This feature flag will enable the ability to interact directly with an LLM. This experience will be similar to the Azure OpenAI Playground.
+ENABLE_UNGROUNDED_CHAT | Yes | Defaults to `false`. This feature flag will enable the ability to interact directly with an LLM. This experience will be similar to the Azure OpenAI Playground
 ENABLE_MATH_ASSISTANT | Yes | Defaults to `true`. This feature flag will enable the Math Assistant tab in the Information Assistant website. Read more information on the [Math Assistant](/docs/features/features.md)
-ENABLE_TABULAR_DATA_ASSISTANT | Yes | Defaults to `true`. This feature flag will enable the Tabular Data Assistant tab in the Information Assistant website. Read more information about the [Tabular Data Assistant](/docs/features/features.md)
+ENABLE_TABULAR_DATA_ASSISTANT | Yes | Defaults to `true`. This feature flag will enable the Tabular Data Assistant tab in the Information Assistant website. Read more information about the [Tabular Data Assistant](/docs/features/features.md). Read the security warnings on the Tabular Data Assistant feature page when deploying this feature.
 ENABLE_SHAREPOINT_CONNECTOR | Yes | Defaults to `false`. This feature flag enabled the ability to ingest data from SharePoint document stores into the Information Assistant. When enabled, be sure to set the `SHAREPOINT_TO_SYNC` parameter for your SharePoint sites. Read more about configuring the [SharePoint Connector](/docs/features/sharepoint.md). This feature flag is **NOT** compatible with `AZURE_ENVIRONMENT=AzureUSGovernment`.
 SHAREPOINT_TO_SYNC | No | This is a JSON Array of Objects for SharePoint Sites and their entry folders. The app will crawl down from the folder specified for each site. Specifying "/Shared Documents" will crawl all the documents in your SharePoint. `[{"url": "https://SharePoint.com/", "folder": "/Shared Documents"}]` This will **overwrite** any prior changes you've made to config.json. Information on setting up SharePoint Ingestion can be found here [SharePoint Connector](/docs/features/sharepoint.md)
 REQUIRE_WEBSITE_SECURITY_MEMBERSHIP | Yes | Use this setting to determine whether a user needs to be granted explicit access to the website via an Azure AD Enterprise Application membership (true) or allow the website to be available to anyone in the Azure tenant (false). Defaults to false. If set to true, A tenant level administrator will be required to grant the implicit grant workflow for the Azure AD App Registration manually.

diff --git a/docs/features/features.md b/docs/features/features.md
@@ -97,6 +97,12 @@ To learn more, please visit the [Cognitive Search](/docs/features/cognitive_sear
 
 We are rolling out the Math Assistant and Tabular Data Assistant in a preview mode. The Math Assistant combines natural language understanding with robust mathematical reasoning, enabling users to express mathematical queries in plain language and receive step-by-step solutions and insights. The Tabular Data Assistant allows users to ask natural language questions about tabular data stored in CSV files and extract insights from structured datasets with the ability to filter, aggregate, and perform computations on CSV data. The key strength of Agents lies in their ability to autonomously reason about tasks, decompose them into steps, and determine the appropriate tools and data sources to leverage, all without the need for predefined task definitions or rigid workflows. The Math Assistant and Tabular Data Assistant are being released in preview mode as we continue to evaluate and mitigate the potential risks associated with autonomous reasoning Agents, such as misuse of external tools, lack of transparency, biased outputs, privacy concerns, and remote code execution vulnerabilities. In future releases we plan to enhance the safety and robustness of these autonomous reasoning capabilities.
 
+### :warning: Security Notice
+
+The Tabular Data Assistant relies on access to a python repl tool which can execute arbitrary code. This can be dangerous and requires a specially sandboxed environment to be safely used. Failure to run this code in a properly sandboxed environment can lead to arbitrary code execution vulnerabilities, which can lead to data breaches, data loss, or other security incidents.
+
+Do not use this code with untrusted inputs, with elevated permissions, or without consulting your security team about proper sandboxing!
+
 ## Customization and Personalization
 
 **User-Selectable Options:** Users can fine-tune their interactions by adjusting settings such as temperature and persona, tailoring the AI experience to their specific needs.
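
Review note: both `create_pandas_dataframe_agent` call sites in this commit hard-code `allow_dangerous_code=True`. One possible way to keep that opt-in tied to the deployment flag is sketched below. This is only an illustration, not code from the commit: the helper name `tabular_agent_kwargs` is hypothetical, and it assumes the `ENABLE_TABULAR_DATA_ASSISTANT` setting from `deployment.md` reaches the backend as an environment variable, which the diff does not confirm.

```python
import os


def tabular_agent_kwargs(env=None):
    """Hypothetical helper: build keyword arguments for the pandas agent.

    Assumes ENABLE_TABULAR_DATA_ASSISTANT is visible to the backend as an
    environment variable (an assumption, not shown in the commit).
    """
    env = os.environ if env is None else env
    # deployment.md documents the flag as defaulting to `true`.
    flag = env.get("ENABLE_TABULAR_DATA_ASSISTANT", "true")
    if flag.strip().lower() != "true":
        # Refuse to build an agent that can execute model-generated code.
        raise PermissionError(
            "Tabular Data Assistant is disabled for this deployment"
        )
    return {
        "verbose": True,
        "handle_parsing_errors": True,
        # Opt-in required by langchain-experimental; sandboxing is still
        # the deployer's responsibility.
        "allow_dangerous_code": True,
    }


# Intended use at a call site (not executed here; needs langchain installed):
# pdagent = create_pandas_dataframe_agent(
#     chat, df, agent_type=AgentType.OPENAI_FUNCTIONS, **tabular_agent_kwargs()
# )
```

Centralizing the keyword arguments would also keep the two call sites consistent, since they currently pass the same options in different orders. It does not replace the sandboxing that the security notice calls for.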