Dictionary for Pandas#
Attributes#
Properties associated with an object in Python. In the pandas library, attributes give access to specific information about pandas objects such as DataFrames and Series. Unlike methods, attributes are accessed without parentheses.
Example:
df.shape
df.columns
df.index
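To see what these attributes return, here is a minimal sketch. The small DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical two-row, two-column DataFrame used only for illustration
df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

shape = df.shape            # tuple of (number of rows, number of columns)
columns = list(df.columns)  # column labels
index = list(df.index)      # row labels

print(shape)    # (2, 2)
print(columns)  # ['A', 'B']
print(index)    # [0, 1]
```

Note that none of the three attributes is followed by parentheses.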
CSV#
CSV stands for Comma-Separated Values. It is a plain text file format used to store tabular data, where each line of the file represents a row in the table, and each value in the row is separated by a comma.
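As a rough illustration, the sketch below reads CSV text into a DataFrame with pd.read_csv and writes it back with to_csv. The sample text and column names are invented, and io.StringIO stands in for a real file so the example runs without touching the disk.

```python
import io

import pandas as pd

# Invented CSV content: the first line holds the column names,
# each following line is one row of the table
csv_text = "name,age\nAda,36\nLinus,28\n"

# read_csv accepts a file path or a file-like object
people = pd.read_csv(io.StringIO(csv_text))
print(people.shape)  # (2, 2)

# Without a path, to_csv returns the CSV text as a string
csv_out = people.to_csv(index=False)
print(csv_out)
```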
DataFrame#
A 2-dimensional labeled data structure with columns of potentially different types. It is often named df in examples, but it can be given any name.
Example:
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
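The phrase "columns of potentially different types" can be illustrated with a small sketch. The data and the name people are hypothetical.

```python
import pandas as pd

# Hypothetical data: one text column, one integer column, one float column
people = pd.DataFrame(
    {
        "name": ["Ada", "Linus"],
        "age": [36, 28],
        "height_m": [1.65, 1.80],
    }
)

# dtypes lists the data type of each column
print(people.dtypes)
```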
Index#
In pandas, an index is a label or set of labels used to identify and access elements in data structures like Series and DataFrames. It provides a way to reference rows and columns by name or number.
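The difference between referencing by name and by number might look like this. It is a sketch with made-up labels: loc selects by label and iloc selects by integer position.

```python
import pandas as pd

# Series with explicit, made-up labels instead of the default 0, 1, 2
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = s.loc["b"]    # select by label
by_position = s.iloc[1]  # select by position (counting from 0)
print(by_label, by_position)  # 20 20

# A DataFrame has an index for its rows and labels for its columns
df = pd.DataFrame({"A": [1, 2]}, index=["x", "y"])
cell = df.loc["y", "A"]  # row label "y", column label "A"
print(cell)  # 2
```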
NaN#
NaN stands for “Not a Number.” It is a special floating-point value used to represent missing or undefined data in numerical computations. It is often used in datasets to indicate that a value is missing.
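A short sketch of how NaN behaves in practice. It assumes NumPy is available (pandas depends on it), and the values are invented.

```python
import numpy as np
import pandas as pd

# The middle value is missing
s = pd.Series([1.0, np.nan, 3.0])

# isna() marks which entries are missing
missing = s.isna().tolist()
print(missing)  # [False, True, False]

# Many pandas reductions skip NaN by default
total = s.sum()
print(total)  # 4.0

# NaN is not equal to anything, including itself,
# so use isna() rather than == to test for it
nan_equal = np.nan == np.nan
print(nan_equal)  # False
```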
pd#
pd is the conventional alias for the pandas library. Importing pandas as pd is a widely adopted convention that shortens the syntax for accessing its functionality. After the import statement shown in the example, you can use pd to access everything the pandas library provides.
Example:
import pandas as pd
Series#
A Series in pandas is like a list of values in a single column, where each value has a label. Together, the labels form the index of the Series. It is a simple way to store and manage a sequence of data.
More formally, it is a 1-dimensional labeled array capable of holding any data type.
Example:
s = pd.Series([1, 2, 3, 4])
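When no labels are given, pandas numbers the values from 0. Custom labels can be passed with the index argument, as in this sketch (temps and its day labels are hypothetical).

```python
import pandas as pd

# Default labels: 0, 1, 2, 3
s = pd.Series([1, 2, 3, 4])
default_labels = list(s.index)
print(default_labels)  # [0, 1, 2, 3]

# Made-up example with custom labels
temps = pd.Series([21.5, 19.0], index=["mon", "tue"])
tuesday = temps["tue"]  # look a value up by its label
print(tuesday)  # 19.0
```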
Subset#
In pandas, a subset is a selection of a portion of a DataFrame or Series based on specific criteria. This can involve selecting particular rows, columns, or both from the original data structure.
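The three kinds of subset (columns, rows, or both) could be sketched as follows. The DataFrame and the condition A > 1 are chosen only for illustration.

```python
import pandas as pd

# Made-up DataFrame with three rows and three columns
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# Columns: pass a list of column names
cols = df[["A", "C"]]

# Rows: pass a boolean condition
rows = df[df["A"] > 1]

# Both: loc takes a row condition and a list of columns
both = df.loc[df["A"] > 1, ["B"]]

print(cols.shape, rows.shape, both.shape)  # (3, 2) (2, 3) (2, 1)
```

Each selection returns a new object and leaves df unchanged.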
Transpose#
transpose() is a method for swapping rows and columns in a DataFrame. Transposing is useful for reshaping data, making it easier to compare rows or apply certain operations that are typically column-based.
Example:
df.transpose()
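A minimal sketch of the effect, using an invented DataFrame with three rows and two columns so that the change in shape is visible:

```python
import pandas as pd

# Made-up DataFrame: 3 rows, 2 columns
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# transpose() returns a new DataFrame; df itself is not modified
t = df.transpose()

print(df.shape)  # (3, 2)
print(t.shape)   # (2, 3)

# The old column labels become the row labels
print(list(t.index))  # ['A', 'B']

# df.T is a shorthand attribute for the same operation
same = df.T.equals(t)
print(same)  # True
```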