This documentation page is for backwards compatibility. For new projects, we recommend using the new semantic dataframes.
Migrate SmartDatalakes
To migrate from SmartDatalake to the new semantic dataframes:
- Use the new multiple-dataframe syntax with pai.chat()
- Update configuration to use pai.config.set()
Example migration:
# Old code
from pandasai import SmartDatalake, SmartDataframe
lake = SmartDatalake([
    SmartDataframe(customers_df),
    SmartDataframe(orders_df),
])
response = lake.chat("Query across dataframes")
# New code
import pandasai as pai
df_customers = pai.DataFrame(customers_df)
df_orders = pai.DataFrame(orders_df)
response = pai.chat("Query across dataframes", df_customers, df_orders)
SmartDatalakes (Legacy)
The SmartDatalake class was the previous way to work with multiple dataframes in PandasAI. It let you combine several SmartDataframes and ask questions that span all of them.
Basic Usage
from pandasai import SmartDatalake, SmartDataframe
import pandas as pd
# Create sample dataframes
customers_df = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['John', 'Emma', 'Alex']
})
orders_df = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [1, 2, 1],
    'amount': [100, 200, 150]
})
# Convert to SmartDataframes
smart_customers = SmartDataframe(customers_df)
smart_orders = SmartDataframe(orders_df)
# Create SmartDatalake
lake = SmartDatalake([smart_customers, smart_orders])
# Ask questions spanning multiple dataframes
response = lake.chat("What is the total order amount for each customer?")
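Because the answer to a natural-language query comes from generated code, it can help to check it against a hand-written computation. The following is a minimal sketch in plain pandas (not part of PandasAI) that answers the same question for the sample data above. The left join and the choice to report 0 for customers without orders are assumptions about how the question should be interpreted.

```python
import pandas as pd

# Same sample data as in the Basic Usage example
customers_df = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["John", "Emma", "Alex"],
})
orders_df = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 2, 1],
    "amount": [100, 200, 150],
})

# A left join keeps customers with no orders (Alex); their amount becomes NaN.
merged = customers_df.merge(orders_df, on="customer_id", how="left")

# sum() skips NaN, so a customer without orders ends up with a total of 0.
totals = (
    merged.groupby("name", sort=False)["amount"]
    .sum()
    .astype(int)
    .to_dict()
)
print(totals)
```

With an inner join instead (how="inner"), Alex would be dropped from the result rather than listed with a total of 0, so the two interpretations are worth distinguishing when comparing against the chat response.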
Configuration
SmartDatalake inherited configuration from its SmartDataframes, but you could also set config directly:
# llm is assumed to be an LLM instance created earlier
lake = SmartDatalake(
    [smart_customers, smart_orders],
    config={
        "llm": llm,
        "verbose": True
    }
)