Here are some examples of how to use PandasAI.
More examples are included in the repository along with samples of data.
Working with pandas dataframes
Using PandasAI with a Pandas DataFrame
import os
from pandasai import SmartDataframe
import pandas as pd
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe(sales_by_country)
response = sdf.chat('Which are the top 5 countries by sales?')
print(response)
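Because the answer is produced by an LLM, it can be worth checking it against a direct computation. The sketch below recomputes the expected top five in plain Python from the same data as the example above. The name `expected_top5` is introduced here for illustration and is not part of PandasAI.

```python
# Same data as in the example above, kept as plain Python lists.
countries = ["United States", "United Kingdom", "France", "Germany", "Italy",
             "Spain", "Canada", "Australia", "Japan", "China"]
sales = [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]

# Pair each country with its sales figure, sort by sales in descending
# order, and keep the first five rows.
expected_top5 = sorted(zip(countries, sales), key=lambda row: row[1], reverse=True)[:5]

for country, amount in expected_top5:
    print(f"{country}: {amount}")
```

With a pandas DataFrame, `sales_by_country.nlargest(5, "sales")` selects the same rows.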
Working with CSVs
Example of using PandasAI with a CSV file
import os
from pandasai import SmartDataframe
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Loan payments data.csv")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
Working with Excel files
Example of using PandasAI with an Excel file. In order to use Excel files as a data source, you need to install the pandasai[excel] extra dependency.
pip install pandasai[excel]
Then, you can use PandasAI with an Excel file as follows:
import os
from pandasai import SmartDataframe
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Loan payments data.xlsx")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
Working with Parquet files
Example of using PandasAI with a Parquet file
import os
from pandasai import SmartDataframe
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Loan payments data.parquet")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
Working with Google Sheets
Example of using PandasAI with a Google Sheet. In order to use Google Sheets as a data source, you need to install the pandasai[google-sheet] extra dependency.
pip install pandasai[google-sheet]
Then, you can use PandasAI with a Google Sheet as follows:
import os
from pandasai import SmartDataframe
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("https://docs.google.com/spreadsheets/d/fake/edit#gid=0")
response = sdf.chat("How many loans are from men and have been paid off?")
print(response)
Note that, at the moment, the Google Sheet must be public.
Working with Modin dataframes
Example of using PandasAI with a Modin DataFrame. In order to use Modin dataframes as a data source, you need to install the pandasai[modin] extra dependency.
pip install pandasai[modin]
Then, you can use PandasAI with a Modin DataFrame as follows:
import os
import pandasai
from pandasai import SmartDataframe
import modin.pandas as pd
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
pandasai.set_pd_engine("modin")
sdf = SmartDataframe(sales_by_country)
response = sdf.chat('Which are the top 5 countries by sales?')
print(response)
Working with Polars dataframes
Example of using PandasAI with a Polars DataFrame (still in beta). In order to use Polars dataframes as a data source, you need to install the pandasai[polars] extra dependency.
pip install pandasai[polars]
Then, you can use PandasAI with a Polars DataFrame as follows:
import os
from pandasai import SmartDataframe
import polars as pl
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sales_by_country = pl.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})
sdf = SmartDataframe(sales_by_country)
response = sdf.chat("Which are the top 5 countries by sales?")
print(response)
Plotting
Example of using PandasAI to plot a chart from data loaded from a CSV file
import os
from pandasai import SmartDataframe
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Countries.csv")
response = sdf.chat(
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)
print(response)
Saving Plots with User Defined Path
You can pass a custom path where the charts are saved. The path must be a valid global path (that is, an absolute path).
The example below saves charts to a user-defined location.
import os
from pandasai import SmartDataframe
user_defined_path = os.getcwd()
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
sdf = SmartDataframe("data/Countries.csv", config={
    "save_charts": True,
    "save_charts_path": user_defined_path,
})
response = sdf.chat(
    "Plot the histogram of countries showing for each the gdp,"
    " using different colors for each bar",
)
print(response)
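Since the path has to be a valid global path, it can help to normalize it and make sure the directory exists before building the config. The following is a minimal sketch using only the standard library. The helper `resolve_chart_dir` and the `pandasai_charts` directory name are hypothetical and chosen for illustration; only the `save_charts` and `save_charts_path` keys come from the example above.

```python
import tempfile
from pathlib import Path


def resolve_chart_dir(path_like):
    """Return an absolute directory path as a string, creating it if needed."""
    chart_dir = Path(path_like).expanduser().resolve()
    chart_dir.mkdir(parents=True, exist_ok=True)
    return str(chart_dir)


# Hypothetical location inside the system temp directory; any writable
# directory would work equally well.
charts_path = resolve_chart_dir(Path(tempfile.gettempdir()) / "pandasai_charts")

# The same config keys used in the example above.
config = {
    "save_charts": True,
    "save_charts_path": charts_path,
}
print(config["save_charts_path"])
```

The resulting `config` dictionary can then be passed as the `config` argument of `SmartDataframe`, as shown earlier.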
Working with multiple dataframes (using the SmartDatalake)
Example of using PandasAI with multiple dataframes. In order to use multiple dataframes as a data source, you need to use a SmartDatalake instead of a SmartDataframe. You can instantiate a SmartDatalake as follows:
import os
from pandasai import SmartDatalake
import pandas as pd
employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}
salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
lake = SmartDatalake([employees_df, salaries_df])
response = lake.chat("Who gets paid the most?")
print(response)
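Answering this question requires combining both dataframes, because names and salaries live in separate tables linked by EmployeeID. The sketch below performs that join in plain Python, which can be useful for checking the response. The variable names are illustrative only and not part of PandasAI.

```python
# Same data as in the example above.
employees = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
}
salaries = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

# Index both tables by EmployeeID so they can be joined on that key.
name_by_id = dict(zip(employees["EmployeeID"], employees["Name"]))
salary_by_id = dict(zip(salaries["EmployeeID"], salaries["Salary"]))

# Find the ID with the largest salary, then look up the matching name.
top_id = max(salary_by_id, key=salary_by_id.get)
top_earner = (name_by_id[top_id], salary_by_id[top_id])
print(top_earner)
```

With pandas, the equivalent join is `employees_df.merge(salaries_df, on="EmployeeID")`.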
Working with Agent
With the chat agent, you can engage in dynamic conversations where the agent retains context throughout the discussion. This enables you to have more interactive and meaningful exchanges.
Key Features
- Context Retention: The agent remembers the conversation history, allowing for seamless, context-aware interactions.
- Clarification Questions: You can use the clarification_questions method to request clarification on any aspect of the conversation. This helps ensure you fully understand the information provided.
- Explanation: The explain method is available to obtain detailed explanations of how the agent arrived at a particular solution or response. It offers transparency and insights into the agent’s decision-making process.
Feel free to initiate conversations, seek clarifications, and explore explanations to enhance your interactions with the chat agent!
import os
import pandas as pd
from pandasai import Agent
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent([employees_df, salaries_df], memory_size=10)
query = "Who gets paid the most?"
response = agent.chat(query)
print(response)
questions = agent.clarification_questions(query)
for question in questions:
    print(question)
response = agent.explain()
print(response)
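The memory_size argument suggests that the agent keeps a bounded window of the conversation. How PandasAI implements this is not described here. The sketch below only illustrates the general idea of bounded context with a hypothetical `BoundedMemory` class, assuming that the oldest messages are dropped once the limit is reached. The agent reply in it is made up for illustration.

```python
from collections import deque


class BoundedMemory:
    """Toy conversation memory that keeps only the most recent messages."""

    def __init__(self, memory_size):
        # A deque with maxlen drops the oldest entry when a new one is
        # appended to a full buffer.
        self._messages = deque(maxlen=memory_size)

    def add(self, role, text):
        self._messages.append((role, text))

    def history(self):
        return list(self._messages)


memory = BoundedMemory(memory_size=2)
memory.add("user", "Who gets paid the most?")
memory.add("agent", "Olivia gets paid the most.")  # illustrative reply
memory.add("user", "Which department does she work in?")

# Only the two most recent messages remain; the first question was dropped.
print(memory.history())
```

A follow-up such as "Which department does she work in?" can only be resolved if the earlier exchange is still inside the window, which is why the size of the window matters for longer conversations.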
Description for an Agent
When you instantiate an agent, you can provide a description of the agent. This description will be used to describe the agent in the chat and to provide more context for the LLM about how to respond to queries.
Some examples of descriptions can be:
- You are a data analysis agent. Your main goal is to help non-technical users to analyze data
- Act as a data analyst. Every time I ask you a question, you should provide the code to visualize the answer using plotly
import os
from pandasai import Agent
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent(
    "data.csv",
    description="You are a data analysis agent. Your main goal is to help non-technical users to analyze data",
)
Add Skills to the Agent
You can add custom functions for the agent to use, allowing the agent to expand its capabilities. These custom functions can be seamlessly integrated with the agent’s skills, enabling a wide range of user-defined operations.
import os
import pandas as pd
from pandasai import Agent
from pandasai.skills import skill
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}
employees_df = pd.DataFrame(employees_data)
salaries_df = pd.DataFrame(salaries_data)
@skill
def plot_salaries(merged_df: pd.DataFrame):
    """
    Plots a bar chart with names on the x-axis and salaries on the y-axis using matplotlib and saves it to temp_chart.png
    """
    import matplotlib.pyplot as plt
    plt.bar(merged_df["Name"], merged_df["Salary"])
    plt.xlabel("Employee Name")
    plt.ylabel("Salary")
    plt.title("Employee Salaries")
    plt.xticks(rotation=45)
    plt.savefig("temp_chart.png")
    plt.close()
os.environ["PANDASAI_API_KEY"] = "YOUR_API_KEY"
agent = Agent([employees_df, salaries_df], memory_size=10)
agent.add_skills(plot_salaries)
response = agent.chat("Plot the employee salaries against names")
print(response)
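The docstring of a skill matters because it is a natural place for an agent to learn what the function does. The sketch below shows one general way a decorator can register a function together with its signature and docstring. It is an illustration of the pattern only, not PandasAI's implementation: `simple_skill`, `SKILL_REGISTRY`, `describe_skills`, and `highest_salary` are all hypothetical names.

```python
import inspect

# Hypothetical registry of skills; not part of PandasAI.
SKILL_REGISTRY = {}


def simple_skill(func):
    """Register func along with its signature and docstring."""
    SKILL_REGISTRY[func.__name__] = {
        "function": func,
        "signature": str(inspect.signature(func)),
        "description": inspect.getdoc(func) or "",
    }
    return func


@simple_skill
def highest_salary(salaries):
    """Return the largest value in a list of salaries."""
    return max(salaries)


def describe_skills():
    """Build a text summary that could be shown to an LLM to advertise skills."""
    lines = []
    for name, info in SKILL_REGISTRY.items():
        lines.append(f"{name}{info['signature']}: {info['description']}")
    return "\n".join(lines)


print(describe_skills())
```

In this pattern, a precise docstring gives the model a better chance of picking the right function for a request, which is a good reason to keep skill docstrings accurate and specific.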