
Using Pandas

Using Datetimes as a DataFrame Index

Convert a column of string entries to datetimes by calling pd.to_datetime, passing a format string that matches how the timestamps are written.

data['datetime'] = pd.to_datetime(data['timestamp'], format='%b %d, %Y %H:%M:%S')

Next, set the new datetime column as the DataFrame's index. This gives the DataFrame a DatetimeIndex, which enables time-based slicing and resampling.

data = data.set_index('datetime')
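To show what a DatetimeIndex makes possible, here is a minimal sketch that runs both steps on a small made-up table. The sample rows and the 'value' column are assumptions for illustration; only the 'timestamp' and 'datetime' column names come from the snippets above.

```python
import pandas as pd

# Hypothetical sample data; the 'timestamp' strings match '%b %d, %Y %H:%M:%S'.
data = pd.DataFrame({
    'timestamp': [
        'Jan 01, 2020 08:00:00',
        'Jan 01, 2020 20:30:00',
        'Jan 02, 2020 09:15:00',
    ],
    'value': [1.0, 2.0, 3.0],
})

# Parse the strings, then use the parsed column as the index.
data['datetime'] = pd.to_datetime(data['timestamp'], format='%b %d, %Y %H:%M:%S')
data = data.set_index('datetime')

# Partial-string indexing: select every row that falls on one day.
jan_first = data.loc['2020-01-01']

# Resampling: aggregate the 'value' column into daily totals.
daily_totals = data['value'].resample('D').sum()
```

Neither the date-string lookup nor resample() would work on the original string column, which is the practical reason for setting the index.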

Essential DataFrame Methods

drop_duplicates()

Returns a copy with duplicate entries dropped. Called on a DataFrame, it drops duplicate rows; called on a single column, e.g., people['heights'].drop_duplicates(), it returns a Series containing only the unique heights.
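The difference between the Series and DataFrame forms can be sketched as follows. The people table and its contents are invented for illustration.

```python
import pandas as pd

# Hypothetical data for illustration.
people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy', 'Bob'],
    'heights': [170, 182, 170, 182],
})

# On a Series: keeps the first occurrence of each distinct value.
unique_heights = people['heights'].drop_duplicates()

# On a DataFrame: drops rows that are identical across all columns.
unique_rows = people.drop_duplicates()

# With subset=...: rows count as duplicates if they match in those columns only.
unique_by_height = people.drop_duplicates(subset='heights')
```

By default the first occurrence is kept; the keep parameter ('first', 'last', or False) changes that.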

nlargest()

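Returns the n rows with the largest values in the given column(s), sorted in descending order. It is often useful in combination with drop_duplicates(), so that tied values do not take up several of the n slots.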
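As a rough illustration of why the two methods pair well, the sketch below compares nlargest() with and without dropping duplicate heights first. The data is made up.

```python
import pandas as pd

# Hypothetical data for illustration; two people share the tallest height.
people = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cy', 'Dee', 'Eve'],
    'heights': [170, 182, 182, 165, 175],
})

# The two tallest rows; the tie at 182 fills both slots.
tallest = people.nlargest(2, 'heights')

# The two largest distinct heights: drop repeated heights first.
top_distinct = people.drop_duplicates(subset='heights').nlargest(2, 'heights')
```

Here tallest contains Bob and Cy (both 182), while top_distinct contains Bob (182) and Eve (175).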

.iloc[]

Selects rows (and optionally columns) by integer position rather than by label. DF.iloc[-1] returns the last row, for example.
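A few common .iloc patterns, sketched on a small invented DataFrame. The string index labels are there only to show that .iloc ignores labels and counts positions.

```python
import pandas as pd

# Hypothetical data for illustration, with non-integer index labels.
df = pd.DataFrame(
    {'name': ['Ann', 'Bob', 'Cy'], 'heights': [170, 182, 175]},
    index=['a', 'b', 'c'],
)

last_row = df.iloc[-1]    # Series holding the last row
first_two = df.iloc[:2]   # DataFrame with positions 0 and 1 (end is exclusive)
cell = df.iloc[1, 1]      # single value at row position 1, column position 1
```

For selection by label instead of position, the counterpart is .loc, e.g. df.loc['b', 'heights'].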

ADDITIONAL RESOURCES