PandasAI

PandasAI is a Python library that makes it easy to ask questions of your data (CSV, XLSX, PostgreSQL, MySQL, BigQuery, Databricks, Snowflake, etc.) in natural language. It helps you explore, clean, and analyze your data using generative AI.

Beyond querying, PandasAI can visualize data as graphs, clean datasets by handling missing values, and improve data quality through feature generation, making it a comprehensive tool for data scientists and analysts.

Features

  • Natural language querying: Ask questions of your data in natural language.
  • Data visualization: Generate graphs and charts to visualize your data.
  • Data cleansing: Clean datasets by handling missing values.
  • Feature generation: Improve data quality through feature generation.
  • Data connectors: Connect to data sources such as CSV, XLSX, PostgreSQL, MySQL, BigQuery, Databricks, and Snowflake.

How does PandasAI work?

PandasAI uses a generative AI model to interpret natural language queries and translate them into Python code and SQL queries. It then runs that code against the data and returns the results to the user.
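
The translate-then-execute flow described above can be pictured with a toy sketch. This is not PandasAI's actual implementation: `fake_llm` and `answer` are hypothetical names, the "model" returns a hard-coded snippet, and plain Python lists stand in for a real dataframe.

```python
# Toy illustration only. fake_llm is a hypothetical stand-in for a real
# generative model, and none of this is PandasAI's internal code.

def fake_llm(question, columns):
    """Pretend to be a model: return Python source that answers the question."""
    # A real model would generate this text from the question and the schema.
    return (
        "ranked = sorted(rows, key=lambda r: r['sales'], reverse=True)\n"
        "result = [r['country'] for r in ranked[:3]]\n"
    )

def answer(question, rows):
    """Ask the 'model' for code, run it against the data, return the result."""
    code = fake_llm(question, list(rows[0].keys()))
    scope = {"rows": rows}   # the only data the generated code can see
    exec(code, scope)        # execute the generated code
    return scope["result"]   # by convention, the code stores its answer here

rows = [
    {"country": "China", "sales": 7000},
    {"country": "France", "sales": 2900},
    {"country": "Japan", "sales": 4500},
    {"country": "Spain", "sales": 2100},
]
print(answer("Which are the top 3 countries by sales?", rows))
# prints ['China', 'Japan', 'France']
```

Because the generated code is executed, it is safest to treat it as untrusted and to review or sandbox it; the bare `exec` call here is only for illustration.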

Who should use PandasAI?

PandasAI is designed for data scientists, analysts, and engineers who want to interact with their data in a more natural way. It is particularly useful for those who are not familiar with SQL or Python, or who want to save time and effort when working with data. It is also useful for those who do know SQL and Python, as it lets them ask questions of their data without writing complex code.

How to get started with PandasAI?

To get started with PandasAI, you first need to install it. You can do this by running one of the following commands:

# Using poetry (recommended)
poetry add pandasai

# Using pip
pip install pandasai

Once you have installed PandasAI, you can start using it by importing the SmartDataframe class and instantiating it with your data. You can then use the chat method to ask questions of your data in natural language.

import pandas as pd
from pandasai import SmartDataframe

# Sample DataFrame
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})

# Instantiate an LLM
from pandasai.llm import OpenAI
llm = OpenAI(api_token="YOUR_API_TOKEN")

df = SmartDataframe(sales_by_country, config={"llm": llm})
df.chat('Which are the top 5 countries by sales?')
# Output:
# China, United States, Japan, Germany, United Kingdom
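
Rather than hard-coding the token as in the snippet above, you may prefer to read it from the environment. The following is a minimal sketch: the helper `read_api_token` and the variable name `OPENAI_API_KEY` are assumptions made for illustration, not something PandasAI requires.

```python
import os

def read_api_token(env_var="OPENAI_API_KEY"):
    """Return the API token stored in the given environment variable.

    Both the helper and the default variable name are hypothetical;
    use whatever name your deployment already sets.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token

# Usage with the example above (requires pandasai to be installed):
# llm = OpenAI(api_token=read_api_token())
```

This keeps the secret out of source control and lets the same script run unchanged across machines.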

Demo

Try out PandasAI yourself in your browser:

Open in Colab

Support

If you have any questions or need help, please join our Discord server.

License

PandasAI is available under the MIT Expat license, except for the pandasai/ee directory, which has its own license (where applicable).

If you are interested in the managed PandasAI Cloud or the self-hosted Enterprise offering, take a look at our website or book a meeting with us.