Getting Started with Pandas

Before we get started, you may need to install Pandas, depending on how and where you are running Python. This guide is intended for people running a local installation of Python and using Jupyter Notebook or JupyterLab.

There are many ways to install Pandas. Ours is just one. You can read more here.


Installing (or upgrading) Pandas from PyPI

Pandas can be installed via pip from PyPI (the Python Package Index).

The following command uses pip, Python’s package installer, to install the pandas library.
If you have not installed Pandas before, it downloads and installs the latest version of Pandas along with its dependencies.

  pip install pandas

pip commands are most often run through a shell such as the Terminal on macOS or the Command Prompt on Windows.
However, you can run shell commands directly from Jupyter notebooks by adding an exclamation mark ! in front of the command:

! pip install pandas
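One caveat with the exclamation-mark form: the shell runs whichever pip it finds first, and that pip may belong to a different Python environment than the one your notebook kernel is using. A minimal sketch of one way around this is to call pip as a module of the running interpreter. The helper name `pip_command` is made up for this illustration and is not part of pip or Jupyter.

```python
import sys


def pip_command(*args):
    """Build a pip invocation tied to the interpreter running this code.

    Hypothetical helper for illustration only.
    """
    # sys.executable is the path of the Python interpreter behind this kernel.
    return [sys.executable, "-m", "pip", *args]


# The same install as above, but aimed at the current kernel's environment.
command = pip_command("install", "pandas")
print(" ".join(command))

# To actually run it, pass the list to subprocess:
# import subprocess
# subprocess.run(command, check=True)
```

The block only builds and prints the command, so it is safe to run as-is. Uncomment the last two lines if you want it to perform the installation.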

The --upgrade flag installs pandas if it is missing and updates it to the latest version if it is already installed. This is useful for making sure you have the newest features and bug fixes.

pip install --upgrade pandas

Again, we can run this shell command directly in Jupyter notebooks:

! pip install --upgrade pandas
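After installing or upgrading, it can be reassuring to check what actually ended up in your environment. The sketch below assumes Python 3.8 or newer, where `importlib.metadata` is part of the standard library. The function name `installed_version` is our own choice for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


pandas_version = installed_version("pandas")
if pandas_version is None:
    print("pandas is not installed in this environment")
else:
    print(f"pandas {pandas_version} is installed")
```

Because this reads package metadata rather than importing pandas, it works even when the installation failed and reports that case instead of raising an error.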

Importing Pandas

Even though we have now installed Pandas, we still need to import it whenever we want to use it in our code.
This is conventionally done at the top of the script (together with any other imports) so that future readers (including ourselves!) can easily see which packages are used.

Aliasing pandas as pd is a widely adopted convention that shortens the syntax for accessing its functionality.
After the import statement, you can use pd to access everything the pandas library provides.

# Import the pandas library and alias it as 'pd'.
import pandas as pd
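To see the alias in action, here is a small sketch that builds a table and asks two simple questions about it. The city names and temperatures are invented for illustration and do not come from this guide.

```python
import pandas as pd

# Hypothetical example data, made up for this illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Pune"],
        "temperature_c": [4.5, 19.0, 28.5],
    }
)

print(pd.__version__)               # the pandas version that was imported
print(df.shape)                     # (number of rows, number of columns)
print(df["temperature_c"].mean())   # average of the numeric column
```

Everything pandas offers is reached through the pd prefix, so pd.DataFrame here is the same object as pandas.DataFrame would be under a plain import.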