🐼 Pandas & Data Analysis

This part builds on the foundations from Part 1 and Part 2.

In the Python for not quite Absolute Beginners courses, we expand on previous learning to write more structured and reliable programs. Before you attend these courses, make sure you have completed Parts 1 and 2 or are familiar with their topics.

Pandas & Data Analysis will teach you how to import, analyse, and visualise data using the pandas library.

pandas is a Python library designed for working with tabular data. It is widely used alongside other Python libraries, including the machine-learning package scikit-learn and the mathematics-oriented NumPy. In pandas, tabular data take the form of DataFrames, which are roughly Python’s counterpart to an Excel spreadsheet: rows of records arranged under named columns. The DataFrame object and its associated methods will be the central focus of this course.
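To make the idea of a DataFrame concrete, here is a minimal sketch that builds a small table by hand and queries it. The column names and values are invented purely for illustration and do not come from the course data.

```python
import pandas as pd

# Toy values made up for illustration only; not taken from any real dataset.
df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus"],
        "age": [36, 45, 28],
        "city": ["London", "New York", "Helsinki"],
    }
)

# A DataFrame knows its own dimensions and column labels.
print(df.shape)             # (number of rows, number of columns)
print(df.columns.tolist())  # the column names as a plain list

# Selecting one column gives a Series, which has its own methods.
print(df["age"].mean())

# Boolean filtering keeps only the rows that satisfy a condition.
older = df[df["age"] > 30]
print(older)
```

Each column behaves much like a spreadsheet column, while methods such as `mean()` and boolean filtering replace the formulas and filters you might otherwise apply by hand.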

Download the Titanic data

For much of this course, we will be working with the Titanic dataset. The full dataset can be found on Kaggle, but we will only be using the training data, which we have made available here.

To follow the exercises and use the code exactly as written below, you will need to download the data and place it in a folder called ‘data’ in the same directory as your Jupyter notebook. In other words, the folder where your notebook is located should contain a subfolder named ‘data’, which holds the file ‘titanic.csv’.
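As a quick check that the file is where the exercises expect it, the following sketch loads the CSV from the `data` subfolder using a path relative to the notebook. The helper name `load_titanic` and the variable `DATA_PATH` are hypothetical names chosen for this example, not part of the course material.

```python
from pathlib import Path

import pandas as pd

# Relative path described above: a 'data' folder next to the notebook.
DATA_PATH = Path("data") / "titanic.csv"


def load_titanic(path=DATA_PATH):
    """Read the Titanic CSV into a DataFrame (hypothetical helper)."""
    path = Path(path)
    if not path.exists():
        # Fail early with a hint about the expected folder layout.
        raise FileNotFoundError(
            f"Could not find {path}. Place 'titanic.csv' in a folder named "
            "'data' next to your notebook."
        )
    return pd.read_csv(path)


if DATA_PATH.exists():
    titanic = load_titanic()
    print(titanic.head())  # first five rows, as a sanity check
else:
    print(f"{DATA_PATH} not found; check your folder layout.")
```

If the final branch prints the "not found" message, the notebook's working directory most likely does not contain the `data` subfolder.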


Content