Pandas: data analysis with Python
[Pandas](https://pandas.pydata.org/) is one of the most widely used Python data analysis libraries.
For my internal tests, I use a dataset downloaded from Kaggle: the historical data on avocado prices.
I will run my tests from a Jupyter notebook started from Conda: a web-based interactive computing environment.
Loading a CSV file in a Pandas DataFrame:
import pandas as pd
# df is a Pandas DataFrame structure
df = pd.read_csv('avocado.csv')
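To check that a load worked as expected, it usually helps to look at the shape, the first rows, and the column types. The sketch below is only an illustration: it reads a tiny in-memory CSV instead of `avocado.csv`, and the `Date` and `AveragePrice` columns are assumptions made up for the example (only `region` is used in this post).

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for avocado.csv.
# Only the "region" column is taken from this post; the others are assumed.
csv_text = """Date,AveragePrice,region
2015-12-27,1.33,Albany
2015-12-20,1.35,Albany
2015-12-27,0.99,Atlanta
"""

# read_csv accepts a file path or any file-like object, such as StringIO.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df.shape)   # number of rows and columns
print(sample_df.head())  # first rows of the DataFrame
print(sample_df.dtypes)  # inferred type of each column
```

The same three calls can be applied to the real `df` once the file is loaded.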
We can now extract from this data structure only the rows related to one specific region.
Filtering:
albany_df = df[df["region"] == "Albany"]
The new DataFrame will contain only the records related to “Albany”.
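The filter works in two steps: the comparison builds a boolean Series (a mask), and indexing with that mask keeps only the rows where it is True. Here is a minimal sketch on a small hand-made DataFrame; the values and the `AveragePrice` column are hypothetical and are not taken from the Kaggle file.

```python
import pandas as pd

# Hypothetical sample data; only the "region" column name comes from this post.
sample_df = pd.DataFrame({
    "region": ["Albany", "Atlanta", "Albany", "Boston"],
    "AveragePrice": [1.33, 0.99, 1.35, 1.10],
})

# Step 1: the comparison returns one boolean per row.
mask = sample_df["region"] == "Albany"

# Step 2: indexing with the mask keeps the rows where it is True.
albany_sample = sample_df[mask]

# .loc with the same mask is an equivalent, more explicit spelling.
albany_loc = sample_df.loc[mask]

print(mask.tolist())
print(albany_sample)
```

Note that the filtered rows keep their original index labels; `reset_index(drop=True)` can renumber them if needed.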
To be continued…