[Pandas] (https://pandas.pydata.org/) is the most famous Python Data Analysis library.

For my internal tests, I use a dataset downloaded from Kaggle: the historical data on avocado prices.

I will run my tests from the Conda notebook: a web-based computing environment.

Loading a CSV file in a Pandas DataFrame:

import pandas as pd

# df is a Pandas DataFrame structure
df = df.read_csv('avocado.csv')

We can now extract from this data structure only the lines related to one specific country.

Filtering:

albany_df = pf [ pf["region"] == "Albany"]

The new dataframe will only contains records related to “Albany”.

To be continued…