{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "513cfe15-50b1-4385-a0a9-16b337d970bc",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "# Subsets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f9ddee47-e8ab-4f68-b283-2e6c71b6aa52",
   "metadata": {},
   "source": [
    "```{admonition} Learning Objectives\n",
    "\n",
    "Questions:\n",
    "* How can we select and filter specific columns and rows in a `DataFrame`?\n",
    "\n",
    "Objectives:\n",
    "* Select specific columns from a `DataFrame` using square bracket notation (`[]`).\n",
    "* Use the `loc` property for label-based indexing in a `DataFrame`.\n",
    "* Use the `iloc` property for position-based indexing in a `DataFrame`.\n",
    "* Filter rows based on the presence or absence of missing values.\n",
    "* Understand the differences and complementary uses of `.loc[]` and `.iloc[]`. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b90cc6fd-2f57-4f94-bff7-cf6b0aace5f5",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Import Pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75fa3e37-d901-4cdc-8822-5b6b0102ce4e",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "remove-cell"
    ]
   },
   "source": [
    "Aliasing pandas as `pd` is a widely adopted convention that simplifies the syntax for accessing its functionalities.\\\n",
    "After this statement, you can use `pd` to access all the functionalities provided by the pandas library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "765b4040-4b8a-420d-bad5-219654bc6174",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [],
   "source": [
    "# This line imports the pandas library and aliases it as 'pd'.\n",
    "\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4a72659e-005b-4390-a8fe-4dd61b356945",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4f7a487-146c-4573-904e-45601338891f",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Creating a `DataFrame` from a CSV file"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f1e434ec-fdb0-4c8a-89a4-7281de89308f",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "The `pd.read_csv()` function is used to read the data from the file 'data/titanic.csv'."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "fe1c7a91-0caf-4eae-bf55-0938bcdaa851",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [],
   "source": [
    "# Load the Titanic dataset from a CSV file into a DataFrame named 'titanic'.\n",
    "\n",
    "titanic = pd.read_csv('data/titanic.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "9d5a2159-bb3b-4a20-94be-21ea37755071",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>113803</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>886</th>\n",
       "      <td>887</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>Montvila, Rev. Juozas</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>211536</td>\n",
       "      <td>13.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>887</th>\n",
       "      <td>888</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Graham, Miss. Margaret Edith</td>\n",
       "      <td>female</td>\n",
       "      <td>19.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>112053</td>\n",
       "      <td>30.0000</td>\n",
       "      <td>B42</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>888</th>\n",
       "      <td>889</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
       "      <td>female</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>W./C. 6607</td>\n",
       "      <td>23.4500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>889</th>\n",
       "      <td>890</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Behr, Mr. Karl Howell</td>\n",
       "      <td>male</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>111369</td>\n",
       "      <td>30.0000</td>\n",
       "      <td>C148</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>890</th>\n",
       "      <td>891</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Dooley, Mr. Patrick</td>\n",
       "      <td>male</td>\n",
       "      <td>32.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>370376</td>\n",
       "      <td>7.7500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>891 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived  Pclass  \\\n",
       "0              1         0       3   \n",
       "1              2         1       1   \n",
       "2              3         1       3   \n",
       "3              4         1       1   \n",
       "4              5         0       3   \n",
       "..           ...       ...     ...   \n",
       "886          887         0       2   \n",
       "887          888         1       1   \n",
       "888          889         0       3   \n",
       "889          890         1       1   \n",
       "890          891         0       3   \n",
       "\n",
       "                                                  Name     Sex   Age  SibSp  \\\n",
       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
       "..                                                 ...     ...   ...    ...   \n",
       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
       "\n",
       "     Parch            Ticket     Fare Cabin Embarked  \n",
       "0        0         A/5 21171   7.2500   NaN        S  \n",
       "1        0          PC 17599  71.2833   C85        C  \n",
       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
       "3        0            113803  53.1000  C123        S  \n",
       "4        0            373450   8.0500   NaN        S  \n",
       "..     ...               ...      ...   ...      ...  \n",
       "886      0            211536  13.0000   NaN        S  \n",
       "887      0            112053  30.0000   B42        S  \n",
       "888      2        W./C. 6607  23.4500   NaN        S  \n",
       "889      0            111369  30.0000  C148        C  \n",
       "890      0            370376   7.7500   NaN        Q  \n",
       "\n",
       "[891 rows x 12 columns]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Display the DataFrame 'titanic'.\n",
    "\n",
    "titanic"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13331c76-ef8d-4881-9229-b6dbf6a34447",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "688e5d6e-f20e-41ac-b44f-aaea410cd399",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Selecting specific columns\n",
    "\n",
    "![Selecting specific columns from a DataFrame](images/03_subset_columns.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13a500a8-9996-4478-8b37-496110b587d1",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "Selecting specific columns is useful when you want to work with specific subsets of the dataset e.g. you might want to perform statistical calculations based on only the colum `Age` or later on when you want to access and manipulate your data in more complex ways. \n",
    "\n",
    "Access the 'Age' column from the `DataFrame` 'titanic' to return a `Series` object containing all the data in the 'Age' column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "6752e1f0-6227-4128-b2f5-323c429ffe5c",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      22.0\n",
       "1      38.0\n",
       "2      26.0\n",
       "3      35.0\n",
       "4      35.0\n",
       "       ... \n",
       "886    27.0\n",
       "887    19.0\n",
       "888     NaN\n",
       "889    26.0\n",
       "890    32.0\n",
       "Name: Age, Length: 891, dtype: float64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "titanic['Age']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "0e3d6cdf-0b96-489c-9271-b0dcf887682d",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.series.Series"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check the type of the 'Age' column in 'titanic' using the 'type()' function.\n",
    "\n",
    "type(titanic['Age'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f2c9d264-6190-49dd-96c5-ea5ffb177a41",
   "metadata": {},
   "source": [
    "Use the `shape` attribute to determine the dimensions of the `Series`.\\\n",
    "It returns a tuple representing the number of rows and columns (rows, columns)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "ddcdfad9-cfb3-413c-bf2a-f2b485ec0d50",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(891,)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "titanic['Age'].shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b48c4ac-45cf-449b-bcf1-da1ef4408e41",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "54af87e3-2151-4b56-9cca-6fb72cada392",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "### Calling multiple `Series`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14237564-b732-47a6-a52d-c2af0c00694a",
   "metadata": {},
   "source": [
    "Select columns 'Age' and 'Sex' from the 'titanic' `DataFrame` using double square brackets:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "dc8709c5-0b04-4869-8668-cc5144571fd5",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Age</th>\n",
       "      <th>Sex</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>22.0</td>\n",
       "      <td>male</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>38.0</td>\n",
       "      <td>female</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>26.0</td>\n",
       "      <td>female</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>35.0</td>\n",
       "      <td>female</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>35.0</td>\n",
       "      <td>male</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>886</th>\n",
       "      <td>27.0</td>\n",
       "      <td>male</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>887</th>\n",
       "      <td>19.0</td>\n",
       "      <td>female</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>888</th>\n",
       "      <td>NaN</td>\n",
       "      <td>female</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>889</th>\n",
       "      <td>26.0</td>\n",
       "      <td>male</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>890</th>\n",
       "      <td>32.0</td>\n",
       "      <td>male</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>891 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Age     Sex\n",
       "0    22.0    male\n",
       "1    38.0  female\n",
       "2    26.0  female\n",
       "3    35.0  female\n",
       "4    35.0    male\n",
       "..    ...     ...\n",
       "886  27.0    male\n",
       "887  19.0  female\n",
       "888   NaN  female\n",
       "889  26.0    male\n",
       "890  32.0    male\n",
       "\n",
       "[891 rows x 2 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select the columns 'Age' and 'Sex' from the 'titanic' DataFrame.\n",
    "\n",
    "titanic[['Age', 'Sex']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f8e7d7d-4ae1-4542-8caf-31787d7d8081",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "```{note}\n",
    "The inner square brackets define a Python `list` with column names, whereas the outer square brackets are used to select the data from a pandas `DataFrame` as seen in the previous example."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8190374b-8681-48c0-a0cc-028dd9e1f936",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "Select columns 'Age' and 'Sex' from the 'titanic' `DataFrame` using double square brackets, and then apply the `type()` function to the resulting `DataFrame` subset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "41856f83-9f98-4b7a-9800-4acf1c222588",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pandas.core.frame.DataFrame"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use the 'type()' function to determine the data type of the DataFrame subset.\n",
    "\n",
    "type(titanic[['Age', 'Sex']])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6c073f3-f58f-462c-98f1-d027cb0dfc41",
   "metadata": {},
   "source": [
    "Select columns 'Age' and 'Sex' from the 'titanic' `DataFrame` using double square brackets, and then apply the `shape` attribute to the resulting `DataFrame` subset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "b0388ea7-0e3e-4ebe-b9e7-dad3e8271928",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(891, 2)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Use the 'shape' attribute to determine the dimensions of the DataFrame subset.\n",
    "\n",
    "titanic[['Age', 'Sex']].shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6cb6bcc8-8a2a-428f-9102-394ee6c333d3",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6035c0e3-bd8f-4b93-bb6a-65ed95db22ea",
   "metadata": {},
   "source": [
    "## Introducing `.loc[]` and `.iloc[]`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "975963cb-241a-4036-b90e-c3c1290511a4",
   "metadata": {},
   "source": [
    "Now that you’ve learned how to select specific columns from a `DataFrame`, it’s time to explore more advanced and flexible ways to access and manipulate your data. As you start filtering rows based on conditions or need to select both rows and columns simultaneously, using `.loc[]` and `.iloc[]` will give you greater control and clarity in your code.\n",
    "\n",
    "These powerful tools allow you to:\n",
    "\n",
    "* Safely and explicitly select rows and columns by labels with `.loc[]`.\n",
    "* Access rows and columns by their integer positions using `.iloc[]`.\n",
    "* Avoid potential issues that can arise with more complex operations, ensuring your data manipulations are clear, consistent, and reliable.\n",
    "\n",
    "With these methods, you’ll be better equipped to handle more complex data tasks efficiently."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b4d65d36-9d60-41a2-adcf-ba6ac1c5e2e3",
   "metadata": {},
   "source": [
    "```{admonition} What exactly are loc and iloc?\n",
    ":class: tip dropdown\n",
    "\n",
    "`loc` and `iloc` in Pandas are technically properties that return a specialized indexer object. These indexer objects allow for label-based (`loc`) and integer-based (`iloc`) indexing. Although they behave similarly to methods in that you use them to perform actions (i.e., selecting or slicing data), they are properties because they return an indexing object rather than directly executing a function.\n",
    "\n",
    "When you access `loc` or `iloc` on a `DataFrame`, what you are getting is not a simple value but an object that has its own methods and behaviors, specifically designed to handle indexing operations. This design allows for the concise and powerful data selection syntax that Pandas is known for.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0f2bd5c-8ea4-4fd9-9d4a-2c527ab072ea",
   "metadata": {},
   "source": [
    "```{admonition} Avoid ambiguous operations\n",
    ":class: warning dropdown\n",
    "\n",
    "Some of the operations we’ll be covering can technically be performed without using `.loc[]` or `.iloc[]`. For instance, you might see data selections or modifications done with simple square brackets like `df[df['column'] > value]`. While this can work for basic tasks, it can lead to issues in more complex scenarios, such as unexpected behavior, ambiguity, or performance inefficiencies.\n",
    "\n",
    "**Why is this a bad idea?**\n",
    "\n",
    "- **Ambiguity**: It’s not always clear if you're modifying a copy of your data or the original `DataFrame`, which can lead to unintended side effects.\n",
    "- **Potential errors**: Using square brackets for complex selections might trigger warnings like `SettingWithCopyWarning`, indicating that your code might not be doing what you think it is.\n",
    "- **Lack of flexibility**: As your data operations become more complex, relying on square brackets can limit your ability to write clear and maintainable code.\n",
    "\n",
    "For these reasons, we recommend using `.loc[]` and `.iloc[]` as they provide a more reliable and explicit way to handle data selection and modification in Pandas.\n",
    "\n",
    "If you’re interested in learning more about the potential pitfalls of using square brackets and the benefits of `.loc[]` and `.iloc[]`, you can read more in the [Pandas documentation on indexing and selecting data](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html).\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99663d1a-3f33-44fa-9387-1e9b1766748d",
   "metadata": {},
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "961bdfdf-11b0-474a-bd74-4500fad73448",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Filtering rows based on conditional expressions\n",
    "\n",
    "![Filtering specific rows from a DataFrame](images/03_subset_rows.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ece850f-2344-49d6-8dfb-6fed8c58f18c",
   "metadata": {},
   "source": [
    "With `.loc[]` and `.iloc[]`, you have more control and flexibility when filtering rows and selecting columns in a `DataFrame`.\n",
    "\n",
    "Let’s start by exploring how you can use these tools to filter rows based on conditions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "903513f6-4ba8-4a19-850f-918b5dd97b21",
   "metadata": {},
   "source": [
    "### Filtering specific rows with `.loc[]`\n",
    "\n",
    "To filter rows based on a condition, you can place the condition inside the `.loc[]` selector.\n",
    "\n",
    "Here’s how you can filter rows where the `Age` column is greater than 35:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "60ff0a11-3cbe-48e0-8ee0-28fd8d492d45",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>McCarthy, Mr. Timothy J</td>\n",
       "      <td>male</td>\n",
       "      <td>54.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>17463</td>\n",
       "      <td>51.8625</td>\n",
       "      <td>E46</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>12</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Bonnell, Miss. Elizabeth</td>\n",
       "      <td>female</td>\n",
       "      <td>58.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>113783</td>\n",
       "      <td>26.5500</td>\n",
       "      <td>C103</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>14</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Andersson, Mr. Anders Johan</td>\n",
       "      <td>male</td>\n",
       "      <td>39.0</td>\n",
       "      <td>1</td>\n",
       "      <td>5</td>\n",
       "      <td>347082</td>\n",
       "      <td>31.2750</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>16</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>Hewlett, Mrs. (Mary D Kingcome)</td>\n",
       "      <td>female</td>\n",
       "      <td>55.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>248706</td>\n",
       "      <td>16.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>865</th>\n",
       "      <td>866</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>Bystrom, Mrs. (Karolina)</td>\n",
       "      <td>female</td>\n",
       "      <td>42.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>236852</td>\n",
       "      <td>13.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>871</th>\n",
       "      <td>872</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Beckwith, Mrs. Richard Leonard (Sallie Monypeny)</td>\n",
       "      <td>female</td>\n",
       "      <td>47.0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>11751</td>\n",
       "      <td>52.5542</td>\n",
       "      <td>D35</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>873</th>\n",
       "      <td>874</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Vander Cruyssen, Mr. Victor</td>\n",
       "      <td>male</td>\n",
       "      <td>47.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>345765</td>\n",
       "      <td>9.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>879</th>\n",
       "      <td>880</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)</td>\n",
       "      <td>female</td>\n",
       "      <td>56.0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>11767</td>\n",
       "      <td>83.1583</td>\n",
       "      <td>C50</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>885</th>\n",
       "      <td>886</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Rice, Mrs. William (Margaret Norton)</td>\n",
       "      <td>female</td>\n",
       "      <td>39.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>382652</td>\n",
       "      <td>29.1250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>217 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived  Pclass  \\\n",
       "1              2         1       1   \n",
       "6              7         0       1   \n",
       "11            12         1       1   \n",
       "13            14         0       3   \n",
       "15            16         1       2   \n",
       "..           ...       ...     ...   \n",
       "865          866         1       2   \n",
       "871          872         1       1   \n",
       "873          874         0       3   \n",
       "879          880         1       1   \n",
       "885          886         0       3   \n",
       "\n",
       "                                                  Name     Sex   Age  SibSp  \\\n",
       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "6                              McCarthy, Mr. Timothy J    male  54.0      0   \n",
       "11                            Bonnell, Miss. Elizabeth  female  58.0      0   \n",
       "13                         Andersson, Mr. Anders Johan    male  39.0      1   \n",
       "15                    Hewlett, Mrs. (Mary D Kingcome)   female  55.0      0   \n",
       "..                                                 ...     ...   ...    ...   \n",
       "865                           Bystrom, Mrs. (Karolina)  female  42.0      0   \n",
       "871   Beckwith, Mrs. Richard Leonard (Sallie Monypeny)  female  47.0      1   \n",
       "873                        Vander Cruyssen, Mr. Victor    male  47.0      0   \n",
       "879      Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)  female  56.0      0   \n",
       "885               Rice, Mrs. William (Margaret Norton)  female  39.0      0   \n",
       "\n",
       "     Parch    Ticket     Fare Cabin Embarked  \n",
       "1        0  PC 17599  71.2833   C85        C  \n",
       "6        0     17463  51.8625   E46        S  \n",
       "11       0    113783  26.5500  C103        S  \n",
       "13       5    347082  31.2750   NaN        S  \n",
       "15       0    248706  16.0000   NaN        S  \n",
       "..     ...       ...      ...   ...      ...  \n",
       "865      0    236852  13.0000   NaN        S  \n",
       "871      1     11751  52.5542   D35        S  \n",
       "873      0    345765   9.0000   NaN        S  \n",
       "879      1     11767  83.1583   C50        C  \n",
       "885      5    382652  29.1250   NaN        Q  \n",
       "\n",
       "[217 rows x 12 columns]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows in the 'titanic' DataFrame where the 'Age' column is greater than 35.\n",
    "\n",
    "titanic.loc[titanic['Age'] > 35]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38ac84cf-9f52-4c01-bca9-e46fb87d66c1",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "This command filters the rows where the condition `titanic['Age'] > 35` is `True`, and returns the entire `DataFrame` for those rows.\n",
    "\n",
    "If you only want to select specific columns (e.g., `Name` and `Pclass`) for those filtered rows, you can add the column names after the comma:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "13936ecd-9c28-45bd-9edb-e4727a81d081",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>Pclass</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>McCarthy, Mr. Timothy J</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Bonnell, Miss. Elizabeth</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Andersson, Mr. Anders Johan</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Hewlett, Mrs. (Mary D Kingcome)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>865</th>\n",
       "      <td>Bystrom, Mrs. (Karolina)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>871</th>\n",
       "      <td>Beckwith, Mrs. Richard Leonard (Sallie Monypeny)</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>873</th>\n",
       "      <td>Vander Cruyssen, Mr. Victor</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>879</th>\n",
       "      <td>Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>885</th>\n",
       "      <td>Rice, Mrs. William (Margaret Norton)</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>217 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  Name  Pclass\n",
       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...       1\n",
       "6                              McCarthy, Mr. Timothy J       1\n",
       "11                            Bonnell, Miss. Elizabeth       1\n",
       "13                         Andersson, Mr. Anders Johan       3\n",
       "15                    Hewlett, Mrs. (Mary D Kingcome)        2\n",
       "..                                                 ...     ...\n",
       "865                           Bystrom, Mrs. (Karolina)       2\n",
       "871   Beckwith, Mrs. Richard Leonard (Sallie Monypeny)       1\n",
       "873                        Vander Cruyssen, Mr. Victor       3\n",
       "879      Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)       1\n",
       "885               Rice, Mrs. William (Margaret Norton)       3\n",
       "\n",
       "[217 rows x 2 columns]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows where 'Age' is greater than 35 and select only the 'Name' and 'Pclass' columns.\n",
    "\n",
    "titanic.loc[titanic['Age'] > 35, ['Name', 'Pclass']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "346705f6-3f5a-4f50-a4c2-23e05aa9f353",
   "metadata": {},
   "source": [
    "```{admonition} When using loc/iloc\n",
    ":class: tip\n",
    "\n",
    "The part *before the comma* is the *rows* you want to select.\n",
    "\n",
    "The part *after the comma* is the *columns* you want to select.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39b84234-a6f9-48c8-82a6-e9515d0afa7e",
   "metadata": {},
   "source": [
    "### Using `.iloc[]` for positional indexing\n",
    "\n",
    "The `.iloc[]` method works similarly to `.loc[]`, but it selects rows and columns based on their integer positions rather than labels.\n",
    "\n",
    "For example, to select the first 5 rows and the first 2 columns, you can use:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "7dad5185-69b3-41ce-9de0-626ea02ef6b4",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived\n",
       "0            1         0\n",
       "1            2         1\n",
       "2            3         1\n",
       "3            4         1\n",
       "4            5         0"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select the first 5 rows and the first 2 columns using .iloc[].\n",
    "\n",
    "titanic.iloc[:5, :2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3762ceb-ae60-4877-a117-03260cbe3c7a",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "### Inclusive vs. Exclusive Slicing\n",
    "\n",
    "When slicing with `.loc[]`, the slicing is **inclusive** of the start and end labels. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "f1d37a3e-c4c1-4687-99b1-6366fb1b83bb",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>Pclass</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                Name  Pclass\n",
       "0                            Braund, Mr. Owen Harris       3\n",
       "1  Cumings, Mrs. John Bradley (Florence Briggs Th...       1\n",
       "2                             Heikkinen, Miss. Laina       3\n",
       "3       Futrelle, Mrs. Jacques Heath (Lily May Peel)       1"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select rows 0 through 3 (inclusive) and columns 'Name' and 'Pclass'.\n",
    "\n",
    "titanic.loc[0:3, ['Name', 'Pclass']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47842350-bee8-4683-91f0-da8905771656",
   "metadata": {},
   "source": [
    "This command returns rows 0, 1, 2, and 3, including both the start and end indices.\n",
    "\n",
    "When slicing with `.iloc[]`, the slicing is **exclusive** of the end position:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "319db576-5646-4549-83dc-67d4c1b9c2c7",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   PassengerId  Survived\n",
       "0            1         0\n",
       "1            2         1\n",
       "2            3         1"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Select rows 0 through 2 (exclusive of 3) and the first two columns using .iloc[].\n",
    "\n",
    "titanic.iloc[0:3, 0:2]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e0be5523-1168-4a3d-be4c-0cea81a4a2cc",
   "metadata": {},
   "source": [
    "### Using `.loc[]` and `.iloc[]` for complex conditions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b48d6163-81a7-4bea-9fa8-cbdab2288cb5",
   "metadata": {},
   "source": [
    "You can use both `.loc[]` and `.iloc[]` to filter rows based on more complex conditions, combining multiple criteria with logical operators.\n",
    "\n",
    "For instance, to filter rows where `Pclass` is either 2 or 3 using `.loc[]`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "0e5b553f-8987-4f9a-90af-30a3ac194a88",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>6</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Moran, Mr. James</td>\n",
       "      <td>male</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>330877</td>\n",
       "      <td>8.4583</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>8</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Palsson, Master. Gosta Leonard</td>\n",
       "      <td>male</td>\n",
       "      <td>2.0</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>349909</td>\n",
       "      <td>21.0750</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>884</th>\n",
       "      <td>885</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Sutehall, Mr. Henry Jr</td>\n",
       "      <td>male</td>\n",
       "      <td>25.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>SOTON/OQ 392076</td>\n",
       "      <td>7.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>885</th>\n",
       "      <td>886</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Rice, Mrs. William (Margaret Norton)</td>\n",
       "      <td>female</td>\n",
       "      <td>39.0</td>\n",
       "      <td>0</td>\n",
       "      <td>5</td>\n",
       "      <td>382652</td>\n",
       "      <td>29.1250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>886</th>\n",
       "      <td>887</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>Montvila, Rev. Juozas</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>211536</td>\n",
       "      <td>13.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>888</th>\n",
       "      <td>889</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
       "      <td>female</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>W./C. 6607</td>\n",
       "      <td>23.4500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>890</th>\n",
       "      <td>891</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Dooley, Mr. Patrick</td>\n",
       "      <td>male</td>\n",
       "      <td>32.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>370376</td>\n",
       "      <td>7.7500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>675 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived  Pclass                                      Name  \\\n",
       "0              1         0       3                   Braund, Mr. Owen Harris   \n",
       "2              3         1       3                    Heikkinen, Miss. Laina   \n",
       "4              5         0       3                  Allen, Mr. William Henry   \n",
       "5              6         0       3                          Moran, Mr. James   \n",
       "7              8         0       3            Palsson, Master. Gosta Leonard   \n",
       "..           ...       ...     ...                                       ...   \n",
       "884          885         0       3                    Sutehall, Mr. Henry Jr   \n",
       "885          886         0       3      Rice, Mrs. William (Margaret Norton)   \n",
       "886          887         0       2                     Montvila, Rev. Juozas   \n",
       "888          889         0       3  Johnston, Miss. Catherine Helen \"Carrie\"   \n",
       "890          891         0       3                       Dooley, Mr. Patrick   \n",
       "\n",
       "        Sex   Age  SibSp  Parch            Ticket     Fare Cabin Embarked  \n",
       "0      male  22.0      1      0         A/5 21171   7.2500   NaN        S  \n",
       "2    female  26.0      0      0  STON/O2. 3101282   7.9250   NaN        S  \n",
       "4      male  35.0      0      0            373450   8.0500   NaN        S  \n",
       "5      male   NaN      0      0            330877   8.4583   NaN        Q  \n",
       "7      male   2.0      3      1            349909  21.0750   NaN        S  \n",
       "..      ...   ...    ...    ...               ...      ...   ...      ...  \n",
       "884    male  25.0      0      0   SOTON/OQ 392076   7.0500   NaN        S  \n",
       "885  female  39.0      0      5            382652  29.1250   NaN        Q  \n",
       "886    male  27.0      0      0            211536  13.0000   NaN        S  \n",
       "888  female   NaN      1      2        W./C. 6607  23.4500   NaN        S  \n",
       "890    male  32.0      0      0            370376   7.7500   NaN        Q  \n",
       "\n",
       "[675 rows x 12 columns]"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows where 'Pclass' is either 2 or 3 using .loc[].\n",
    "\n",
    "titanic.loc[titanic['Pclass'].isin([2, 3])]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3db35580-9be7-4c17-9d1d-975da8faab2f",
   "metadata": {},
   "source": [
    "Similarly, if you want to perform position-based filtering using `.iloc[]`, you would first need to identify the positions of the relevant rows and columns. However, `.iloc[]` is typically less useful for condition-based filtering because it requires knowledge of the exact row and column positions rather than labels."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b2120f5-0c03-41a4-8b98-4fbfb43ba993",
   "metadata": {},
   "source": [
    "You can also combine multiple conditions using logical operators like `&` (AND) and `|` (OR):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "a6ec973c-37e0-4aa7-bd61-ad4ef508206b",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>Pclass</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Andersson, Mr. Anders Johan</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Hewlett, Mrs. (Mary D Kingcome)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25</th>\n",
       "      <td>Asplund, Mrs. Carl Oscar (Selma Augusta Emilia...</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>Wheadon, Mr. Edward H</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40</th>\n",
       "      <td>Ahlin, Mrs. Johan (Johanna Persdotter Larsson)</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>854</th>\n",
       "      <td>Carter, Mrs. Ernest Courtenay (Lilian Hughes)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>860</th>\n",
       "      <td>Hansen, Mr. Claus Peter</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>865</th>\n",
       "      <td>Bystrom, Mrs. (Karolina)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>873</th>\n",
       "      <td>Vander Cruyssen, Mr. Victor</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>885</th>\n",
       "      <td>Rice, Mrs. William (Margaret Norton)</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>113 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  Name  Pclass\n",
       "13                         Andersson, Mr. Anders Johan       3\n",
       "15                    Hewlett, Mrs. (Mary D Kingcome)        2\n",
       "25   Asplund, Mrs. Carl Oscar (Selma Augusta Emilia...       3\n",
       "33                               Wheadon, Mr. Edward H       2\n",
       "40      Ahlin, Mrs. Johan (Johanna Persdotter Larsson)       3\n",
       "..                                                 ...     ...\n",
       "854      Carter, Mrs. Ernest Courtenay (Lilian Hughes)       2\n",
       "860                            Hansen, Mr. Claus Peter       3\n",
       "865                           Bystrom, Mrs. (Karolina)       2\n",
       "873                        Vander Cruyssen, Mr. Victor       3\n",
       "885               Rice, Mrs. William (Margaret Norton)       3\n",
       "\n",
       "[113 rows x 2 columns]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows where ('Pclass' is either 2 or 3) AND ('Age' is greater than 35),\n",
    "# and then select the 'Name' and 'Pclass' columns.\n",
    "\n",
    "titanic.loc[((titanic['Pclass'] == 2) | (titanic['Pclass'] == 3)) & (titanic['Age'] > 35), ['Name', 'Pclass']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da9cb04b-4246-4c5f-be72-bf8f85fa0a5b",
   "metadata": {},
   "source": [
    "### Handling missing values with `.loc[]` and `.iloc[]`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "406599e3-1a91-4e45-b609-cf6f3e1f7cc0",
   "metadata": {},
   "source": [
    "Both `.loc[]` and `.iloc[]` allow you to filter rows based on the presence or absence of missing values.\\\n",
    "(However, `.loc[]` is generally more convenient when working with labeled data, while `.iloc[]` requires knowing the exact positions.)\n",
    "\n",
    "For example, to filter rows where the `Embarked` column is not null using `.loc[]`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "6ac6403c-8cf1-4161-bd7c-13607d4dfe90",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Braund, Mr. Owen Harris</td>\n",
       "      <td>male</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>A/5 21171</td>\n",
       "      <td>7.2500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>PC 17599</td>\n",
       "      <td>71.2833</td>\n",
       "      <td>C85</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>Heikkinen, Miss. Laina</td>\n",
       "      <td>female</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>STON/O2. 3101282</td>\n",
       "      <td>7.9250</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
       "      <td>female</td>\n",
       "      <td>35.0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>113803</td>\n",
       "      <td>53.1000</td>\n",
       "      <td>C123</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Allen, Mr. William Henry</td>\n",
       "      <td>male</td>\n",
       "      <td>35.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>373450</td>\n",
       "      <td>8.0500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>886</th>\n",
       "      <td>887</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>Montvila, Rev. Juozas</td>\n",
       "      <td>male</td>\n",
       "      <td>27.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>211536</td>\n",
       "      <td>13.0000</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>887</th>\n",
       "      <td>888</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Graham, Miss. Margaret Edith</td>\n",
       "      <td>female</td>\n",
       "      <td>19.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>112053</td>\n",
       "      <td>30.0000</td>\n",
       "      <td>B42</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>888</th>\n",
       "      <td>889</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Johnston, Miss. Catherine Helen \"Carrie\"</td>\n",
       "      <td>female</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>W./C. 6607</td>\n",
       "      <td>23.4500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>S</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>889</th>\n",
       "      <td>890</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Behr, Mr. Karl Howell</td>\n",
       "      <td>male</td>\n",
       "      <td>26.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>111369</td>\n",
       "      <td>30.0000</td>\n",
       "      <td>C148</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>890</th>\n",
       "      <td>891</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>Dooley, Mr. Patrick</td>\n",
       "      <td>male</td>\n",
       "      <td>32.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>370376</td>\n",
       "      <td>7.7500</td>\n",
       "      <td>NaN</td>\n",
       "      <td>Q</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>889 rows × 12 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived  Pclass  \\\n",
       "0              1         0       3   \n",
       "1              2         1       1   \n",
       "2              3         1       3   \n",
       "3              4         1       1   \n",
       "4              5         0       3   \n",
       "..           ...       ...     ...   \n",
       "886          887         0       2   \n",
       "887          888         1       1   \n",
       "888          889         0       3   \n",
       "889          890         1       1   \n",
       "890          891         0       3   \n",
       "\n",
       "                                                  Name     Sex   Age  SibSp  \\\n",
       "0                              Braund, Mr. Owen Harris    male  22.0      1   \n",
       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...  female  38.0      1   \n",
       "2                               Heikkinen, Miss. Laina  female  26.0      0   \n",
       "3         Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  35.0      1   \n",
       "4                             Allen, Mr. William Henry    male  35.0      0   \n",
       "..                                                 ...     ...   ...    ...   \n",
       "886                              Montvila, Rev. Juozas    male  27.0      0   \n",
       "887                       Graham, Miss. Margaret Edith  female  19.0      0   \n",
       "888           Johnston, Miss. Catherine Helen \"Carrie\"  female   NaN      1   \n",
       "889                              Behr, Mr. Karl Howell    male  26.0      0   \n",
       "890                                Dooley, Mr. Patrick    male  32.0      0   \n",
       "\n",
       "     Parch            Ticket     Fare Cabin Embarked  \n",
       "0        0         A/5 21171   7.2500   NaN        S  \n",
       "1        0          PC 17599  71.2833   C85        C  \n",
       "2        0  STON/O2. 3101282   7.9250   NaN        S  \n",
       "3        0            113803  53.1000  C123        S  \n",
       "4        0            373450   8.0500   NaN        S  \n",
       "..     ...               ...      ...   ...      ...  \n",
       "886      0            211536  13.0000   NaN        S  \n",
       "887      0            112053  30.0000   B42        S  \n",
       "888      2        W./C. 6607  23.4500   NaN        S  \n",
       "889      0            111369  30.0000  C148        C  \n",
       "890      0            370376   7.7500   NaN        Q  \n",
       "\n",
       "[889 rows x 12 columns]"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows where 'Embarked' is not null using .loc[].\n",
    "\n",
    "titanic.loc[titanic['Embarked'].notna()]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af3013f4-ed33-4052-b5eb-c9c52d7156da",
   "metadata": {},
   "source": [
    "You can then compare the number of extracted rows with the original DataFrame to identify how many rows have missing data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "10cd8859-5c33-45c9-b327-3708c4e0b05d",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 891 entries, 0 to 890\n",
      "Data columns (total 12 columns):\n",
      " #   Column       Non-Null Count  Dtype  \n",
      "---  ------       --------------  -----  \n",
      " 0   PassengerId  891 non-null    int64  \n",
      " 1   Survived     891 non-null    int64  \n",
      " 2   Pclass       891 non-null    int64  \n",
      " 3   Name         891 non-null    object \n",
      " 4   Sex          891 non-null    object \n",
      " 5   Age          714 non-null    float64\n",
      " 6   SibSp        891 non-null    int64  \n",
      " 7   Parch        891 non-null    int64  \n",
      " 8   Ticket       891 non-null    object \n",
      " 9   Fare         891 non-null    float64\n",
      " 10  Cabin        204 non-null    object \n",
      " 11  Embarked     889 non-null    object \n",
      "dtypes: float64(2), int64(5), object(5)\n",
      "memory usage: 83.7+ KB\n"
     ]
    }
   ],
   "source": [
    "# Display information about the DataFrame to compare row counts.\n",
    "\n",
    "titanic.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47535c74-e158-4a89-9bc9-a1cea1aa70df",
   "metadata": {},
   "source": [
    "Comparing the two number of rows (889 and 891) tells us, that two rows are missing embarkation data.\n",
    "\n",
    "As we saw in the exercise on the previous page on [reading and writing tabular data](02_pandas_tabular_data), the two rows missing embarkation data can be found like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "e4a29062-fc16-4519-bde3-ad8f802a0b14",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>PassengerId</th>\n",
       "      <th>Survived</th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "      <th>SibSp</th>\n",
       "      <th>Parch</th>\n",
       "      <th>Ticket</th>\n",
       "      <th>Fare</th>\n",
       "      <th>Cabin</th>\n",
       "      <th>Embarked</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>61</th>\n",
       "      <td>62</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Icard, Miss. Amelie</td>\n",
       "      <td>female</td>\n",
       "      <td>38.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>113572</td>\n",
       "      <td>80.0</td>\n",
       "      <td>B28</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>829</th>\n",
       "      <td>830</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>Stone, Mrs. George Nelson (Martha Evelyn)</td>\n",
       "      <td>female</td>\n",
       "      <td>62.0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>113572</td>\n",
       "      <td>80.0</td>\n",
       "      <td>B28</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     PassengerId  Survived  Pclass                                       Name  \\\n",
       "61            62         1       1                        Icard, Miss. Amelie   \n",
       "829          830         1       1  Stone, Mrs. George Nelson (Martha Evelyn)   \n",
       "\n",
       "        Sex   Age  SibSp  Parch  Ticket  Fare Cabin Embarked  \n",
       "61   female  38.0      0      0  113572  80.0   B28      NaN  \n",
       "829  female  62.0      0      0  113572  80.0   B28      NaN  "
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "titanic[titanic['Embarked'].isna()]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c50b13ad-3a6a-4162-ac04-fc0aa86da086",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3975713-1cbe-4c43-a97a-14c6d6b18554",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Selecting rows and columns\n",
    "\n",
    "![Selecting specific rows and columns from a DataFrame](images/03_subset_columns_rows.svg)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "55aa6a2d-a5bb-42c9-84fa-ab55160ccf39",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "The true power of `.loc[]` becomes evident when you want to filter rows and simultaneously select specific columns. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "8fe0feca-1a72-4596-a9a7-d9e30d8fed0f",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Name</th>\n",
       "      <th>Pclass</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>McCarthy, Mr. Timothy J</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Bonnell, Miss. Elizabeth</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>Andersson, Mr. Anders Johan</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>Hewlett, Mrs. (Mary D Kingcome)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>865</th>\n",
       "      <td>Bystrom, Mrs. (Karolina)</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>871</th>\n",
       "      <td>Beckwith, Mrs. Richard Leonard (Sallie Monypeny)</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>873</th>\n",
       "      <td>Vander Cruyssen, Mr. Victor</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>879</th>\n",
       "      <td>Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>885</th>\n",
       "      <td>Rice, Mrs. William (Margaret Norton)</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>217 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  Name  Pclass\n",
       "1    Cumings, Mrs. John Bradley (Florence Briggs Th...       1\n",
       "6                              McCarthy, Mr. Timothy J       1\n",
       "11                            Bonnell, Miss. Elizabeth       1\n",
       "13                         Andersson, Mr. Anders Johan       3\n",
       "15                    Hewlett, Mrs. (Mary D Kingcome)        2\n",
       "..                                                 ...     ...\n",
       "865                           Bystrom, Mrs. (Karolina)       2\n",
       "871   Beckwith, Mrs. Richard Leonard (Sallie Monypeny)       1\n",
       "873                        Vander Cruyssen, Mr. Victor       3\n",
       "879      Potter, Mrs. Thomas Jr (Lily Alexenia Wilson)       1\n",
       "885               Rice, Mrs. William (Margaret Norton)       3\n",
       "\n",
       "[217 rows x 2 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Filter rows where 'Age' is greater than 35 and select only the 'Name' and 'Pclass' columns.\n",
    "\n",
    "titanic.loc[titanic['Age'] > 35, ['Name', 'Pclass']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ee5aef8-30b9-4e07-ad18-d938822d9ad2",
   "metadata": {},
   "source": [
    "### Combining `loc` and `iloc`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cc456912-4e0f-4245-856b-06e2d6286456",
   "metadata": {},
   "source": [
    "We can use `iloc` to select:\n",
    "\n",
    "* Rows from index position 9 to 25 (exclusive).\n",
    "* Columns from index position 2 to 6 (exclusive).\n",
    "\n",
    "Then further filter these rows using `loc` based on the condition where:\n",
    "\n",
    "* 'Age' is greater than 35."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "40868771-b1f1-491b-9cdf-3397b731ada7",
   "metadata": {
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Pclass</th>\n",
       "      <th>Name</th>\n",
       "      <th>Sex</th>\n",
       "      <th>Age</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>1</td>\n",
       "      <td>Bonnell, Miss. Elizabeth</td>\n",
       "      <td>female</td>\n",
       "      <td>58.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>3</td>\n",
       "      <td>Andersson, Mr. Anders Johan</td>\n",
       "      <td>male</td>\n",
       "      <td>39.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>2</td>\n",
       "      <td>Hewlett, Mrs. (Mary D Kingcome)</td>\n",
       "      <td>female</td>\n",
       "      <td>55.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Pclass                              Name     Sex   Age\n",
       "11       1          Bonnell, Miss. Elizabeth  female  58.0\n",
       "13       3       Andersson, Mr. Anders Johan    male  39.0\n",
       "15       2  Hewlett, Mrs. (Mary D Kingcome)   female  55.0"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "titanic.iloc[9:25, 2:6].loc[titanic['Age'] > 35]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d06d4633-9550-4978-a275-0d6920770a9c",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "___"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4114093e-a6fd-40f0-ac86-d0b2493f9682",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": []
   },
   "source": [
    "## Key points\n",
    "\n",
    "* Access a single column or multiple columns from a `DataFrame` using square bracket notation (`[]`).\n",
    "\n",
    "* Use label-based indexing (`.loc[]`) and position-based indexing (`.iloc[]`) to select specific rows and columns.\n",
    "\n",
    "* Slicing with `.loc[]` is inclusive, while slicing with `.iloc[]` is exclusive.\n",
    "\n",
    "* Filter rows based on conditions and apply additional conditions or select specific columns within the same operation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "95ec0280-28be-43f5-9c20-5db2f795bf04",
   "metadata": {
    "editable": true,
    "slideshow": {
     "slide_type": ""
    },
    "tags": [
     "hide-output"
    ]
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}