Agent

PandasAI Agent Overview

While the pai.chat() method is meant for exploratory data analysis within a single session, an agent supports multi-turn conversations. To instantiate an agent, use the following code:

import pandas as pd
from pandasai import Agent

# Sample DataFrames
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    "deals_opened": [142, 80, 70, 90, 60, 50, 40, 30, 110, 120],
    "deals_closed": [120, 70, 60, 80, 50, 40, 30, 20, 100, 110]
})

agent = Agent(sales_by_country)
agent.chat('Which are the top 5 countries by sales?')
# Output: China, United States, Japan, Germany, United Kingdom

Unlike the pai.chat() method, an agent keeps track of the state of the conversation, so it can answer follow-up questions that depend on earlier turns. For example:

agent.chat('And which one has the most deals?')
# Output: United States has the most deals
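The agent's answers to the two questions above can be cross-checked with plain pandas, without calling any model. This is a minimal sketch that rebuilds the sample DataFrame; it assumes "deals" may mean either deals_opened or deals_closed and checks both readings.

```python
import pandas as pd

# Same sample data as in the agent example above
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
    "deals_opened": [142, 80, 70, 90, 60, 50, 40, 30, 110, 120],
    "deals_closed": [120, 70, 60, 80, 50, 40, 30, 20, 100, 110],
})

# Top 5 countries by sales, highest first
top5 = sales_by_country.nlargest(5, "sales")["country"].tolist()
print(top5)

# Country with the most deals, under both readings of "deals"
most_opened = sales_by_country.loc[sales_by_country["deals_opened"].idxmax(), "country"]
most_closed = sales_by_country.loc[sales_by_country["deals_closed"].idxmax(), "country"]
print(most_opened, most_closed)
```

A check like this is useful when evaluating an agent, because the expected result is computed independently of the generated code.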

Clarification questions

An agent can also ask clarification questions when it does not have enough information to answer the query. For example:

agent.clarification_questions('What is the GDP of the United States?')

This will return up to 3 clarification questions that the agent can ask the user to get more information to answer the query.

Explanation

An agent can also explain the answer it gave to the user. For example:

response = agent.chat('What is the GDP of the United States?')
explanation = agent.explain()

print("The answer is", response)
print("The explanation is", explanation)

Rephrase Question

Rephrase a question to get a more accurate and comprehensive response from the model. For example:

rephrased_query = agent.rephrase_query('What is the GDP of the United States?')

print("The rephrased query is", rephrased_query)

Using the Agent in a Sandbox Environment

To enhance security and protect against malicious code introduced through prompt injection, PandasAI provides a sandbox environment for code execution. The sandbox runs the code in an isolated Docker container, so that potentially harmful operations are contained.

The sandbox works offline and adds a layer of security to code execution. It is particularly useful when working with untrusted data or when code execution must be isolated from your main system.

Installation

Before using the sandbox, install Docker on your machine and make sure it is running. Then install the sandbox package:

pip install pandasai-docker

Basic Usage

Here’s how to use the sandbox with your PandasAI agent:

import pandasai as pai
from pandasai import Agent
from pandasai_docker import DockerSandbox

# Initialize the sandbox
sandbox = DockerSandbox()
sandbox.start()

# Create an agent with the sandbox
df = pai.read_csv("data.csv")
agent = Agent([df], sandbox=sandbox)

# Chat with the agent - code will run in the sandbox
response = agent.chat("Calculate the average sales")

# Don't forget to stop the sandbox when done
sandbox.stop()
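If agent.chat() raises an exception, the final sandbox.stop() call in the example above is never reached. One possible way to guard against that is a small context manager. The sketch below assumes only that the sandbox object exposes start() and stop() as shown above; running and FakeSandbox are hypothetical names used for illustration and are not part of pandasai-docker.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Start the sandbox, yield it, and always stop it afterwards."""
    sandbox.start()
    try:
        yield sandbox
    finally:
        # Runs on normal exit and when the body raises
        sandbox.stop()


class FakeSandbox:
    """Hypothetical stand-in for DockerSandbox so the sketch runs without Docker."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


sandbox = FakeSandbox()
with running(sandbox):
    # With a real sandbox, this is where Agent([df], sandbox=sandbox) would be used
    sandbox.events.append("chat")

print(sandbox.events)
```

With a real DockerSandbox in place of FakeSandbox, the container would be stopped even when the code inside the with block fails.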

Customizing the Sandbox

You can customize the sandbox environment by specifying a custom name and Dockerfile:

sandbox = DockerSandbox(
    "custom-sandbox-name",
    "/path/to/custom/Dockerfile"
)

Training the Agent with local Vector stores

Training agents with local vector stores requires a PandasAI Enterprise license. See Enterprise Features for more details or contact us for production use.

PandasAI can also be used as a few-shot learning agent through the “train with local vector store” enterprise feature. To train the agent with a local vector store, you can use the local ChromaDB, Qdrant, LanceDB, or Pinecone vector stores. Here’s how to do it:

from pandasai import Agent
from pandasai.ee.vectorstores import ChromaDB
from pandasai.ee.vectorstores import Qdrant
from pandasai.ee.vectorstores import Pinecone
from pandasai.ee.vectorstores import LanceDB

# Instantiate the vector store
vector_store = ChromaDB()
# or with Qdrant
# vector_store = Qdrant()
# or with LanceDB
# vector_store = LanceDB()
# or with Pinecone
# vector_store = Pinecone(
#     api_key="*****",
#     embedding_function=embedding_function,
#     dimensions=384, # dimension of your embedding model
# )

# Instantiate the agent with the custom vector store
agent = Agent("data.csv", vectorstore=vector_store)

# Train the model
query = "What is the total sales for the current fiscal year?"
# The following code is passed as a string to the response variable
response = '\n'.join([
    'import pandas as pd',
    '',
    'df = dfs[0]',
    '',
    '# Calculate the total sales for the current fiscal year',
    'total_sales = df[df[\'date\'] >= pd.to_datetime(\'today\').replace(month=4, day=1)][\'sales\'].sum()',
    'result = { "type": "number", "value": total_sales }'
])

agent.train(queries=[query], codes=[response])

response = agent.chat("What is the total sales for the last fiscal year?")
print(response)
# The model will use the information provided in the training to generate a response
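Note that the training snippet builds its cutoff with replace(month=4, day=1), which uses April 1 of the current calendar year. Before April 1 that date lies in the future, so the filter would match no rows, and the timestamp also appears to keep the current time of day. The following is a minimal sketch of a more robust cutoff, assuming the fiscal year starts on April 1 as in the snippet above; fiscal_year_start and the sample data are hypothetical and not part of PandasAI.

```python
import pandas as pd


def fiscal_year_start(today, start_month=4):
    """Return midnight on the first day of the fiscal year that contains `today`."""
    today = pd.Timestamp(today).normalize()
    # Before the start month, the fiscal year began in the previous calendar year
    year = today.year if today.month >= start_month else today.year - 1
    return pd.Timestamp(year=year, month=start_month, day=1)


# Hypothetical sample data with the same column names as the training code
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-03-31", "2023-04-01", "2023-12-15", "2024-02-10"]),
    "sales": [100, 200, 300, 400],
})

# A fixed reference date keeps the example reproducible
start = fiscal_year_start("2024-02-15")
total_sales = df[df["date"] >= start]["sales"].sum()
result = {"type": "number", "value": total_sales}
print(start.date(), result)
```

The same logic could be inlined into the code string passed to agent.train() if training examples need to stay valid all year round.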
