Data transformation is a crucial step in any data analysis pipeline. It involves modifying and restructuring data to facilitate further analysis or produce more meaningful insights. In this tutorial, we will explore the various techniques for applying functions and mapping in Pandas, which are essential for transforming data in DataFrames.

## Applying Functions with *apply(), applymap()*, and *map()*

Pandas provides three main methods for applying functions to DataFrames and Series: `apply()`, `applymap()`, and `map()`. Each method serves a specific purpose and works with different types of data.

### Using *apply()*

The `apply()` method can be used on both Series and DataFrames. When used on a Series, it applies a function to each element in the Series. When used on a DataFrame, it applies a function along a specified axis (either rows or columns).

For example, let’s create a simple DataFrame:

```
import pandas as pd

# Sample data: one row per fruit
data = {
    'Fruit': ['Apple', 'Banana', 'Orange', 'Grape', 'Watermelon'],
    'Price': [1.2, 0.5, 0.75, 2.0, 3.0],
    'Quantity': [50, 100, 75, 30, 10],
    'Discount': [0.1, 0.05, 0.2, 0.15, 0.25]
}
df = pd.DataFrame(data)
print(df)
```

Now, let’s apply a function that takes 10% off each element in the ‘Price’ column:

```
df['Discounted_Price'] = df['Price'].apply(lambda x: x * 0.9)
print(df)
```

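| | Fruit | Price | Quantity | Discount | Discounted_Price |
|---|---|---|---|---|---|
| 0 | Apple | 1.20 | 50 | 0.10 | 1.080 |
| 1 | Banana | 0.50 | 100 | 0.05 | 0.450 |
| 2 | Orange | 0.75 | 75 | 0.20 | 0.675 |
| 3 | Grape | 2.00 | 30 | 0.15 | 1.800 |
| 4 | Watermelon | 3.00 | 10 | 0.25 | 2.700 |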

To apply a function along the rows (axis=1), you can do the following:

```
numeric_columns = ['Price', 'Quantity', 'Discount']
row_sums = df[numeric_columns].apply(lambda x: x.sum(), axis=1)
print(row_sums)
```

| Index | Sum |
|---|---|
| 0 | 51.30 |
| 1 | 100.55 |
| 2 | 75.95 |
| 3 | 32.15 |
| 4 | 13.25 |

dtype: float64

This returns a Series containing, for each row, the sum of the values in the numeric columns.
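To make the role of the `axis` argument concrete, here is a minimal sketch on a small illustrative DataFrame (a subset of the columns used above). With the default `axis=0`, the function receives each column as a Series; with `axis=1`, it receives each row. The names `column_max` and `row_total` are chosen for illustration only.

```python
import pandas as pd

# Small illustrative frame (a subset of the tutorial's columns).
df = pd.DataFrame({
    'Price': [1.2, 0.5, 0.75],
    'Quantity': [50, 100, 75],
})

# axis=0 (the default): the function is called once per column.
column_max = df.apply(lambda col: col.max())

# axis=1: the function is called once per row.
row_total = df.apply(lambda row: row['Price'] * row['Quantity'], axis=1)

print(column_max)
print(row_total)
```

The first result is indexed by column name, the second by the original row index.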

Check out our tutorial on selecting and filtering data in Pandas to learn more about working with DataFrames and Series.

### Using *applymap()*

The `applymap()` method applies a function element-wise to every element in a DataFrame. This method is particularly useful when you need to apply a transformation to the entire DataFrame.

Let’s apply a function that multiplies every element in the ‘Price’ and ‘Discount’ columns by 0.9:

```
df_discounted = df[['Price', 'Discount']].applymap(lambda x: x * 0.9)
print(df_discounted)
```

| | Price | Discount |
|---|---|---|
| 0 | 1.080 | 0.090 |
| 1 | 0.450 | 0.045 |
| 2 | 0.675 | 0.180 |
| 3 | 1.800 | 0.135 |
| 4 | 2.700 | 0.225 |
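As of pandas 2.1, `DataFrame.applymap()` is deprecated in favor of `DataFrame.map()`, which performs the same element-wise operation. If your code has to run on versions from both sides of that change, one option is a small wrapper. The sketch below uses a hypothetical helper named `elementwise` (not part of pandas) that picks whichever method the installed version provides.

```python
import pandas as pd

prices = pd.DataFrame({'Price': [1.2, 0.5], 'Discount': [0.1, 0.05]})

def elementwise(frame, func):
    """Hypothetical helper: apply func to every element of a DataFrame."""
    # DataFrame.map exists from pandas 2.1 onward; check on the class so a
    # column that happens to be named 'map' cannot confuse the test.
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)
    return frame.applymap(func)

reduced = elementwise(prices, lambda x: x * 0.9)
print(reduced)
```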

### Using *map()*

The `map()` method applies a function or a mapping (a dictionary or another Series) to each element in a Pandas Series. It is similar to the `apply()` method but works only on Series.

Here’s an example of using `map()` to replace the elements in a Series:

```
fruit_colors = {
'Apple': 'Red',
'Banana': 'Yellow',
'Orange': 'Orange',
'Grape': 'Purple',
'Watermelon': 'Green'
}
df['Color'] = df['Fruit'].map(fruit_colors)
print(df)
```

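| | Fruit | Price | Quantity | Discount | Discounted_Price | Color |
|---|---|---|---|---|---|---|
| 0 | Apple | 1.20 | 50 | 0.10 | 1.080 | Red |
| 1 | Banana | 0.50 | 100 | 0.05 | 0.450 | Yellow |
| 2 | Orange | 0.75 | 75 | 0.20 | 0.675 | Orange |
| 3 | Grape | 2.00 | 30 | 0.15 | 1.800 | Purple |
| 4 | Watermelon | 3.00 | 10 | 0.25 | 2.700 | Green |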

In this example, we use the `map()` method to translate the fruit names in the ‘Fruit’ column into their corresponding colors. The `fruit_colors` dictionary holds the mapping: its keys are the fruit names and its values are the colors. Calling `map()` on the ‘Fruit’ column with that dictionary produces a new ‘Color’ column containing the color for each fruit.
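One detail worth knowing when mapping with a dictionary: values that have no matching key become `NaN`. The sketch below illustrates this with a standalone Series that includes ‘Kiwi’, a fruit deliberately left out of the dictionary, and shows `fillna()` as one possible way to supply a default. The ‘Unknown’ label is an arbitrary choice for illustration.

```python
import pandas as pd

fruits = pd.Series(['Apple', 'Banana', 'Kiwi'])
fruit_colors = {'Apple': 'Red', 'Banana': 'Yellow'}

# 'Kiwi' is not a key in the dictionary, so its mapped value is NaN.
colors = fruits.map(fruit_colors)

# Replace the missing values with a default label.
colors_filled = colors.fillna('Unknown')

print(colors)
print(colors_filled)
```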

## Lambda Functions in Pandas

Lambda functions are anonymous functions in Python that can be defined using the `lambda` keyword. They are particularly useful in Pandas when you need to apply a simple function to a Series or DataFrame without defining a separate function.
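A lambda is interchangeable with a named function here. The sketch below compares the two on a standalone Series. `apply_discount` and its `rate` parameter are hypothetical names introduced for illustration. It also relies on `Series.apply()` forwarding extra keyword arguments to the function.

```python
import pandas as pd

prices = pd.Series([1.2, 0.5, 0.75])

def apply_discount(price, rate=0.1):
    """Hypothetical helper: reduce a price by the given rate."""
    return price * (1 - rate)

# The lambda and the named function compute the same thing.
with_lambda = prices.apply(lambda x: x * 0.9)
with_named = prices.apply(apply_discount)

# Extra keyword arguments are passed through to the function.
with_custom_rate = prices.apply(apply_discount, rate=0.2)

print(with_named)
print(with_custom_rate)
```

A named function tends to pay off once the logic needs a docstring, parameters, or reuse in more than one place.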

Here’s an example of using a lambda function to take 10% off each element in the ‘Price’ Series:

```
df['Discounted_Price'] = df['Price'].apply(lambda x: x * 0.9)
print(df)
```

You can also use lambda functions with conditional expressions. The following example uses `apply()` with a lambda function to create a new column called ‘Price_Rounded’. The lambda checks each value in the ‘Price’ column: if the value is less than 1, it is rounded to the nearest integer; otherwise, it is left unchanged.

```
df['Price_Rounded'] = df['Price'].apply(lambda x: round(x) if x < 1 else x)
print(df)
```
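The same condition can also be written without a lambda. The sketch below is one possible alternative using `Series.where()` and `Series.round()` on a standalone Series named `prices`, an illustrative stand-in for the ‘Price’ column. It assumes that rounding exact halves to the nearest even integer is acceptable, which is also how Python's built-in `round()` behaves.

```python
import pandas as pd

prices = pd.Series([1.2, 0.5, 0.75, 2.0, 3.0])

# Element-by-element version, as in the example above.
with_apply = prices.apply(lambda x: round(x) if x < 1 else x)

# Series.where keeps values where the condition holds (price >= 1)
# and takes the replacement (the rounded price) everywhere else.
with_where = prices.where(prices >= 1, prices.round())

print(with_where)
```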

Lambda functions can be combined with other Pandas methods to perform more complex operations. The following example squares each element in the ‘Price’ column with a lambda function, then calls `mean()` to calculate the average of the squared prices.

```
avg_squared = df['Price'].apply(lambda x: x**2).mean()
print(avg_squared)
```

## Vectorized Operations

In addition to the `apply()`, `applymap()`, and `map()` methods, Pandas supports vectorized operations, which enable you to perform element-wise operations on Series and DataFrames without using explicit loops or functions. Vectorized operations in Pandas are built on top of NumPy’s array operations, providing high performance and ease of use.

Here’s an example of multiplying two columns in a DataFrame using a vectorized operation:

```
df['Total_Price'] = df['Price'] * df['Quantity']
print(df)
```

You can also perform arithmetic operations, such as addition, subtraction, multiplication, and division, directly on DataFrames and Series:

```
df['Total_Price_with_Discount'] = df['Price'] * (1 - df['Discount']) * df['Quantity']
print(df)
```
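To show how the two styles relate, the sketch below computes the same discounted total twice on a small illustrative DataFrame: once with column arithmetic and once row by row with `apply(axis=1)`. The results agree, but the vectorized form avoids calling a Python function for every row.

```python
import pandas as pd

df = pd.DataFrame({
    'Price': [1.2, 0.5],
    'Quantity': [50, 100],
    'Discount': [0.1, 0.05],
})

# Vectorized: whole columns are combined in one expression.
vectorized = df['Price'] * (1 - df['Discount']) * df['Quantity']

# Row-wise: the lambda is called once per row.
row_wise = df.apply(
    lambda r: r['Price'] * (1 - r['Discount']) * r['Quantity'],
    axis=1,
)

# Compare with a tolerance rather than exact float equality.
print(((vectorized - row_wise).abs() < 1e-9).all())
```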

Learn more about working with DataFrames in our comprehensive guide on understanding Pandas DataFrames.

## Conclusion

In this tutorial, we explored the various techniques for applying functions and mapping in Pandas, which are essential for transforming data in DataFrames. We learned how to use the `apply()`, `applymap()`, and `map()` methods, as well as how to work with lambda functions and perform vectorized operations.

With this knowledge, you can now effectively transform and manipulate your data using Pandas. To further expand your Pandas skillset, consider diving into topics like **sorting, renaming, and merging DataFrames**, **grouping and aggregating data using GroupBy**, and **handling missing data in Pandas**.