Pandas Basics Cheat Sheet



Pandas is one of the most popular packages in Python. It is widely used for data manipulation, data cleaning, and wrangling. The package comes with many feature-rich functions and options, which can be overwhelming at first.

Pandas cheat sheet

Pandas is a Python data analysis library. Series and DataFrames are its major data structures. Pandas is built on top of NumPy arrays.

ToC

  • Series
  • DataFrames
    • Slicing and dicing DataFrames
    • Conditional selection
    • Operations on DataFrames
    • DataFrame index

Series

A Series is a one-dimensional data structure. It is similar to a NumPy array, but each data point has a label in place of a positional index.

Create a series

A Series can be created from a list, a NumPy array, or a dictionary, and it can hold values of different datatypes.
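
The original code cell for this step is not preserved, so the following is a minimal sketch of a few common ways to build a Series. The variable names and sample values are made up for illustration.

```python
import pandas as pd

# Without explicit labels, pandas assigns the integer labels 0, 1, 2, ...
s_default = pd.Series([10, 20, 30])

# Explicit labels are passed through the index argument
s_labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A dictionary supplies labels (keys) and values in one step
s_from_dict = pd.Series({"a": 10, "b": 20, "c": 30})

# Mixed Python types are allowed; the Series falls back to the object dtype
s_mixed = pd.Series([1, "two", 3.0], index=["a", "b", "c"])

print(s_labeled["b"])
print(s_mixed.dtype)
```

Values are looked up by label, as in `s_labeled["b"]`, much like keys in a dictionary.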

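Operations on series

You can add, multiply, and perform other numerical operations on Series just like on NumPy arrays.

Operations between two Series are aligned by label. When a label is present in only one of them, the result holds NaN for that label. Thus, when two Series are added, the result is indexed by the union of both sets of labels and may have more elements than either input.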
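
A short sketch of label alignment, using two made-up Series whose labels only partly overlap:

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Scalar arithmetic is applied element by element, as with NumPy arrays
doubled = s1 * 2

# Series-to-Series arithmetic is aligned on labels, not on position.
# "a" and "d" each exist in only one Series, so their sum is NaN.
total = s1 + s2

print(doubled)
print(total)
```

Here `total` has four elements even though each input has three, and the two unmatched labels hold NaN.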

DataFrames

Creating DataFrames

Pandas DataFrames are built on top of Series. A DataFrame looks similar to a two-dimensional NumPy array, but has labels for both columns and rows.

      reliability      cost  competition  halflife
Car1     0.134302  0.625207     0.970981  0.717605
Car2     0.713766  0.773182     0.059689  0.450899
Car3     0.058990  0.904301     0.431487  0.087683
Car4     0.509891  0.501037     0.244279  0.763135
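
The code that produced the table above is not preserved. One way to build an equivalent DataFrame is sketched below; it hard-codes the values shown in the table, whereas the original may well have generated them randomly.

```python
import numpy as np
import pandas as pd

# Values copied from the table above
values = np.array([
    [0.134302, 0.625207, 0.970981, 0.717605],
    [0.713766, 0.773182, 0.059689, 0.450899],
    [0.058990, 0.904301, 0.431487, 0.087683],
    [0.509891, 0.501037, 0.244279, 0.763135],
])

# index labels the rows, columns labels the columns
df = pd.DataFrame(
    values,
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

print(df)
print(df.shape)
```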

Slicing and dicing DataFrames

You can access a DataFrame much like a Series and slice it much like a NumPy array.

Access columns

Accessing using index number

If you don't know the labels but know the integer position, as in an array, use iloc and pass the index number.
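
A minimal sketch of column access and position-based row access. The DataFrame reuses the values from the first table in this sheet; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.134302, 0.625207, 0.970981, 0.717605],
     [0.713766, 0.773182, 0.059689, 0.450899],
     [0.058990, 0.904301, 0.431487, 0.087683],
     [0.509891, 0.501037, 0.244279, 0.763135]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# A single column label returns a Series
cost = df["cost"]

# A list of column labels returns a DataFrame
subset = df[["cost", "competition"]]

# Rows: loc looks up by label, iloc by integer position
row_by_label = df.loc["Car2"]
row_by_position = df.iloc[1]  # second row, the same data as "Car2"

print(cost)
print(row_by_position)
```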

Dicing DataFrames

To dice using labels, use DataFrameObj.loc[[row_labels], [col_labels]].

          cost  competition
Car2  0.935368     0.719570
Car3  0.659950     0.605077
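
The dicing call itself is not preserved in this sheet. The sketch below shows the label-based form described above and, as an assumption about what the original also demonstrated, the equivalent position-based form with iloc. It reuses the values from the first table, so the printed numbers differ from the output shown above, which came from different data.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.134302, 0.625207, 0.970981, 0.717605],
     [0.713766, 0.773182, 0.059689, 0.450899],
     [0.058990, 0.904301, 0.431487, 0.087683],
     [0.509891, 0.501037, 0.244279, 0.763135]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# Dice by label: rows Car2 and Car3, columns cost and competition
by_label = df.loc[["Car2", "Car3"], ["cost", "competition"]]

# Dice by integer position: rows 1 and 2, columns 1 and 2
by_position = df.iloc[[1, 2], [1, 2]]

print(by_label)
```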

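Conditional selection

When you apply a condition to a DataFrame, you get back a boolean DataFrame of the same shape. Passing a boolean Series back into the DataFrame keeps only the rows where the condition is True.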

      reliability      cost  competition  halflife
Car1     0.776415  0.435083     0.236151  0.169087
Car2     0.790403  0.987459     0.370570  0.734146
Car3     0.884783  0.233803     0.691639  0.725398
Car4     0.693038  0.716824     0.766937  0.490821

      reliability      cost  competition  halflife
Car3     0.884783  0.233803     0.691639  0.725398
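
The condition used to produce the single-row result above is not preserved. The sketch below uses the values from the full table and an assumed condition, reliability > 0.8, chosen only because it selects the same row, Car3.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.776415, 0.435083, 0.236151, 0.169087],
     [0.790403, 0.987459, 0.370570, 0.734146],
     [0.884783, 0.233803, 0.691639, 0.725398],
     [0.693038, 0.716824, 0.766937, 0.490821]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# A condition on the whole DataFrame gives a boolean DataFrame
mask_df = df > 0.5

# A condition on one column gives a boolean Series, one value per row
row_mask = df["reliability"] > 0.8  # assumed threshold

# Indexing with the boolean Series keeps the rows where it is True
selected = df[row_mask]

print(mask_df)
print(selected)
```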

Chaining conditions

In a Pythonic way, you can chain conditions: because a conditional selection returns a DataFrame, you can select from that result again.
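
The original chaining example is not preserved, so this is only a sketch of one plausible reading: filter the rows with a condition, then pick a column from the filtered result. The threshold 0.7 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.776415, 0.435083, 0.236151, 0.169087],
     [0.790403, 0.987459, 0.370570, 0.734146],
     [0.884783, 0.233803, 0.691639, 0.725398],
     [0.693038, 0.716824, 0.766937, 0.490821]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# Step 1 filters rows, step 2 selects a column from the filtered frame
reliable_costs = df[df["reliability"] > 0.7]["cost"]

# The same selection in a single loc call (row condition, column label)
same_with_loc = df.loc[df["reliability"] > 0.7, "cost"]

print(reliable_costs)
```

Chaining is fine for reading data. For assigning values, the single loc call is the safer form, since assigning through a chained selection may act on a temporary copy.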

Multiple conditions

You can select DataFrame elements with multiple conditions. Note that you cannot use the Python keywords and and or here. Instead, use the operators & and |, and wrap each condition in parentheses.

      reliability      cost  competition  halflife
Car1     0.776415  0.435083     0.236151  0.169087
Car2     0.790403  0.987459     0.370570  0.734146

      reliability      cost  competition  halflife
Car1     0.776415  0.435083     0.236151  0.169087
Car2     0.790403  0.987459     0.370570  0.734146
Car3     0.884783  0.233803     0.691639  0.725398
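
The conditions behind the two results above are not preserved. The sketch below uses assumed thresholds, picked only because they return the same rows as the two tables: an & condition that keeps Car1 and Car2, and an | condition that keeps Car1, Car2, and Car3.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.776415, 0.435083, 0.236151, 0.169087],
     [0.790403, 0.987459, 0.370570, 0.734146],
     [0.884783, 0.233803, 0.691639, 0.725398],
     [0.693038, 0.716824, 0.766937, 0.490821]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# & means "both conditions hold"; the parentheses are required because
# & binds more tightly than the comparison operators
both = df[(df["reliability"] > 0.7) & (df["competition"] < 0.5)]

# | means "at least one condition holds"
either = df[(df["reliability"] > 0.8) | (df["competition"] < 0.5)]

print(both)
print(either)
```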

Operations on DataFrames

Adding new columns

Create new columns just like adding a key-value pair to a dictionary.

      reliability      cost  competition  halflife  full_life
Car1     0.134302  0.625207     0.970981  0.717605   1.435210
Car2     0.713766  0.773182     0.059689  0.450899   0.901799
Car3     0.058990  0.904301     0.431487  0.087683   0.175366
Car4     0.509891  0.501037     0.244279  0.763135   1.526270
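
In the table above, full_life appears to be twice halflife. Assuming that is how the column was computed (the original cell is not preserved), the assignment can be sketched as:

```python
import pandas as pd

df = pd.DataFrame(
    [[0.134302, 0.625207, 0.970981, 0.717605],
     [0.713766, 0.773182, 0.059689, 0.450899],
     [0.058990, 0.904301, 0.431487, 0.087683],
     [0.509891, 0.501037, 0.244279, 0.763135]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)

# Assigning to a new column label creates the column,
# just like assigning to a new key in a dictionary
df["full_life"] = df["halflife"] * 2

print(df)
```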

Dropping rows and columns

Use drop to remove rows or columns. Rows are axis = 0 and columns are axis = 1.

      reliability      cost  competition  halflife
Car1     0.134302  0.625207     0.970981  0.717605
Car2     0.713766  0.773182     0.059689  0.450899
Car3     0.058990  0.904301     0.431487  0.087683
Car4     0.509891  0.501037     0.244279  0.763135

      reliability      cost  competition  halflife  full_life
Car1     0.134302  0.625207     0.970981  0.717605   1.435210
Car2     0.713766  0.773182     0.059689  0.450899   0.901799
Car4     0.509891  0.501037     0.244279  0.763135   1.526270

      reliability      cost  competition  halflife  full_life
Car1     0.134302  0.625207     0.970981  0.717605    1.43521
Car4     0.509891  0.501037     0.244279  0.763135    1.52627
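
The drop calls behind the three tables above are not preserved. The following sketch shows calls that would produce the same shapes: dropping the full_life column, dropping the Car3 row, and dropping the Car2 and Car3 rows together. By default drop returns a new DataFrame and leaves the original unchanged, which would explain why full_life reappears in the second table.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.134302, 0.625207, 0.970981, 0.717605],
     [0.713766, 0.773182, 0.059689, 0.450899],
     [0.058990, 0.904301, 0.431487, 0.087683],
     [0.509891, 0.501037, 0.244279, 0.763135]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)
df["full_life"] = df["halflife"] * 2

# axis=1 drops a column; df itself is not modified
without_full_life = df.drop("full_life", axis=1)

# axis=0 drops a row by its label
without_car3 = df.drop("Car3", axis=0)

# A list of labels drops several rows at once
car1_and_car4 = df.drop(["Car2", "Car3"], axis=0)

print(without_full_life)
print(without_car3)
print(car1_and_car4)
```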

DataFrame Index

So far, Car1, Car2, ... have been the row index. To use a different column as the index, use set_index. To turn the index back into a regular column and use integers as the index instead, use reset_index.

Set index
      reliability      cost  competition  halflife  car_names
Car1     0.776415  0.435083     0.236151  0.169087     altima
Car2     0.790403  0.987459     0.370570  0.734146    outback
Car3     0.884783  0.233803     0.691639  0.725398     taurus
Car4     0.693038  0.716824     0.766937  0.490821    mustang

           reliability      cost  competition  halflife  car_names
car_names
altima        0.776415  0.435083     0.236151  0.169087     altima
outback       0.790403  0.987459     0.370570  0.734146    outback
taurus        0.884783  0.233803     0.691639  0.725398     taurus
mustang       0.693038  0.716824     0.766937  0.490821    mustang

   index  reliability      cost  competition  halflife  car_names
0   Car1     0.776415  0.435083     0.236151  0.169087     altima
1   Car2     0.790403  0.987459     0.370570  0.734146    outback
2   Car3     0.884783  0.233803     0.691639  0.725398     taurus
3   Car4     0.693038  0.716824     0.766937  0.490821    mustang
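
The three tables above can be reproduced with the sketch below. The exact calls are not preserved in this sheet. Because car_names still appears as a column after it becomes the index, the sketch assumes set_index was called with drop=False; the default, drop=True, would remove the column.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.776415, 0.435083, 0.236151, 0.169087],
     [0.790403, 0.987459, 0.370570, 0.734146],
     [0.884783, 0.233803, 0.691639, 0.725398],
     [0.693038, 0.716824, 0.766937, 0.490821]],
    index=["Car1", "Car2", "Car3", "Car4"],
    columns=["reliability", "cost", "competition", "halflife"],
)
df["car_names"] = ["altima", "outback", "taurus", "mustang"]

# Use the car_names column as the row index, keeping the column as well
by_name = df.set_index("car_names", drop=False)

# Move the current index (Car1, Car2, ...) into a column named "index"
# and replace it with the integers 0, 1, 2, ...
flat = df.reset_index()

print(by_name)
print(flat)
```

Both methods return a new DataFrame by default, so df keeps its Car1 to Car4 index unless the result is assigned back.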