Python Pandas Library for Beginners

Epythonguru
3 min readApr 10, 2022

A Practical Guide to Getting Started and Ditching Spreadsheets. Pandas makes it easy to compute descriptive statistics, such as counts, means, and medians, with built-in methods.

Use .count() to find the number of rows (non-null values) in each column
Use .mean() to find the mean sodium value
Use .median() to find the median sodium value
Use .mode() to find the mode sodium value
Use .min() to find the minimum sodium value
Use .max() to find the maximum sodium value
# Use .count() to find the number of rows
df[['brand', 'flavor']].count()
# Use .mean() to find the mean sodium value
df['sodium'].mean()
# Use .median() to find the median sodium value
df['sodium'].median()  # 470.0
# Use .mode() to find the mode sodium value
df['sodium'].mode()  # 520
# Use .min() to find the minimum sodium value
df['sodium'].min()  # 140
# Use .max() to find the maximum sodium value
df['sodium'].max()  # 650
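To try these methods without the original dataset, here is a minimal sketch on a small made-up table. The column names brand, flavor, and sodium follow the article, but the values are invented for illustration, so the results differ from the numbers in the comments above.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame({
    "brand": ["A", "B", "C", "D", "E"],
    "flavor": ["chicken", "beef", "shrimp", "chicken", None],
    "sodium": [140, 470, 520, 520, 650],
})

# count() reports non-null values per column, so the missing flavor is skipped.
counts = df[["brand", "flavor"]].count()

mean_sodium = df["sodium"].mean()
median_sodium = df["sodium"].median()
# mode() returns a Series because several values can tie for most frequent.
mode_sodium = df["sodium"].mode()
min_sodium = df["sodium"].min()
max_sodium = df["sodium"].max()

print(counts)
print(mean_sodium, median_sodium, mode_sodium.tolist(), min_sodium, max_sodium)
```

Note that .mode() gives back a Series rather than a single number, so use .mode()[0] or .tolist() when a plain value is needed.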
Adding a new column is easy. This is the basic syntax I use:

DataFrame['column name'] = something

Before modifying the DataFrame, I think it is a good idea to copy it. That way I can always refer back to the original if I make a mistake. Create a copy using DataFrame.copy().

Create a new copy of the dataframe named df_copy.
Create a column and set each row to a value of 1.
Display the top 3 rows — brand, flavor, and the new column.
# Make a copy
df_copy = df.copy()
# Add a new column
df_copy['My_New_Column'] = 1
# Display the new column
df_copy[['brand', 'flavor', 'My_New_Column']].head(3)
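As a quick check that the copy really protects the original, the sketch below adds a column to the copy and then compares the column lists. The sample rows are hypothetical; only the column names come from the article.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame({
    "brand": ["A", "B", "C"],
    "flavor": ["chicken", "beef", "shrimp"],
})

# copy() makes a deep copy by default, so edits to df_copy do not touch df.
df_copy = df.copy()

# Assigning a scalar sets every row of the new column to that value.
df_copy["My_New_Column"] = 1

print(list(df.columns))       # original columns only
print(list(df_copy.columns))  # original columns plus My_New_Column
```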
Adding new columns can be very useful. For example, suppose I wanted to compare each sodium value with the mean amount.
# Create a new column that displays the mean value of sodium
df_copy['sodium_mean'] = df['sodium'].mean()
df_copy[['brand', 'flavor', 'sodium', 'sodium_mean']].head(3)
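Once the mean sits next to each row, the comparison itself is one subtraction. This is a small sketch with invented values; the sodium_diff column is a hypothetical name used here for illustration and does not appear in the original.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame({
    "brand": ["A", "B", "C"],
    "sodium": [140, 470, 650],
})
df_copy = df.copy()

# mean() returns a single number, which pandas repeats on every row.
df_copy["sodium_mean"] = df["sodium"].mean()

# Hypothetical extra column: how far each row is from the mean.
df_copy["sodium_diff"] = df_copy["sodium"] - df_copy["sodium_mean"]

print(df_copy)
```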

Create an above_average_calories column that is 1 if the calories value is greater than the average, and 0 otherwise.
Create a word_count column that outputs the number of words in the manufactureDescription column.
There are probably multiple ways to approach the problems, but I’ll show you two. First, I will show how it can be done using a simple for loop.

# Use a for loop to create a new column
average_calories = df_copy['calories'].mean()
above_average = []
for calories in df_copy['calories']:
    if calories > average_calories:
        above_average.append(1)
    else:
        above_average.append(0)
df_copy['above_average_calories'] = above_average

I would consider using a loop if readability is very important, but generally I'll take the Pythonic route and use a list comprehension.

# Use list comprehensions to create new columns
# Create a new column that is 1 if the calories value is higher than the average, else 0
df_copy['above_average_calories'] = [1 if n > average_calories else 0 for n in df_copy['calories']]
# Create a word_count column that outputs the number of words in the manufactureDescription column
df_copy['word_count'] = [len(str(words).split(" ")) for words in df_copy['manufactureDescription']]
# Display the top 3 rows
df_copy[['brand', 'flavor', 'above_average_calories', 'word_count']].head(3)
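The list-comprehension approach can be tried end to end on a tiny table. The rows below are hypothetical, and the vectorized comparison near the end is an assumed alternative that the original does not cover; it is included only to show that it produces the same flags.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df_copy = pd.DataFrame({
    "calories": [100, 250, 400],
    "manufactureDescription": ["crispy salted chips", "spicy  noodles", None],
})

average_calories = df_copy["calories"].mean()

# 1 if the row is strictly above the average, else 0.
flags = [1 if n > average_calories else 0 for n in df_copy["calories"]]
df_copy["above_average_calories"] = flags

# split(" ") keeps an empty string between doubled spaces, split() does not,
# so split() is used here to avoid overcounting.
# str() turns a missing description into a one-word string, so it counts as 1.
df_copy["word_count"] = [
    len(str(words).split()) for words in df_copy["manufactureDescription"]
]

# Assumed alternative, not from the article: a vectorized comparison.
vectorized_flags = (df_copy["calories"] > average_calories).astype(int).tolist()

print(df_copy)
```

If missing descriptions should count as zero words rather than one, they would need to be handled separately before counting.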
The word_count column can also be created another way, using a pandas method:

DataFrame['column'].apply()
It might feel a little tricky at first, but apply lets me apply a function along an axis of a DataFrame. What that means is I can apply a function to each column (axis=0, the default) or to each row (axis=1).
In the example below, I call .apply() on a single column, which is a Series, so the lambda function runs once for each value in that column.
# Create a new column that outputs the number of words in the manufactureDescription column
df_copy['word_count'] = df_copy['manufactureDescription'].apply(lambda x: len(str(x).split(" ")))
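To make the difference between applying a function to values, columns, and rows concrete, here is a minimal sketch. The data and the column_max and row_sum names are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df_copy = pd.DataFrame({
    "calories": [100, 250, 400],
    "sodium": [140, 470, 650],
    "manufactureDescription": ["crispy salted chips", "spicy noodles", "plain"],
})

# Series.apply: the function receives one value at a time.
df_copy["word_count"] = df_copy["manufactureDescription"].apply(
    lambda x: len(str(x).split(" "))
)

numeric = df_copy[["calories", "sodium"]]

# DataFrame.apply with axis=0 (the default): the function receives each column.
column_max = numeric.apply(lambda col: col.max())

# DataFrame.apply with axis=1: the function receives each row.
row_sum = numeric.apply(lambda row: row["calories"] + row["sodium"], axis=1)

print(df_copy["word_count"].tolist())
print(column_max.to_dict())
print(row_sum.tolist())
```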

Originally published at https://www.epythonguru.com.
