
Pandas DataFrame duplicated() Method



Check which rows are duplicates of an earlier row:

import pandas as pd

data = {
  "name": ["Sally", "Mary", "John", "Mary"],
  "age": [50, 40, 30, 40]
}

df = pd.DataFrame(data)

# True for each row that repeats an earlier row, False otherwise
s = df.duplicated()

print(s)

Definition and Usage

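The duplicated() method returns a Series of True and False values, one per row, indicating whether each row is a duplicate of another row in the DataFrame. By default, the first occurrence of a row is marked False and every later repeat is marked True.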

Use the subset parameter to restrict the comparison to specific columns. By default, all columns are compared.
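
As a minimal sketch of how subset changes the result, the example below uses a small made-up DataFrame (the values are illustrative, not from the example above) in which the two "Mary" rows differ in age. Comparing whole rows flags nothing, while comparing only the "name" column flags the second "Mary".

```python
import pandas as pd

# Illustrative data: the two "Mary" rows have different ages.
df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 41],
})

# All columns compared: no row is an exact repeat of an earlier row.
all_cols = df.duplicated()

# Only "name" compared: the second "Mary" repeats the first.
by_name = df.duplicated(subset=["name"])

print(all_cols.tolist())  # [False, False, False, False]
print(by_name.tolist())   # [False, False, False, True]
```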


Syntax

dataframe.duplicated(subset, keep)


Parameters

Both parameters are optional and can be passed as keyword arguments.

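Parameter   Value                    Description
subset      column label(s)          Optional, default None (all columns). A column label, or a list of labels, naming the columns to compare when looking for duplicates.
keep        'first', 'last', False   Optional, default 'first'. Specifies which occurrence is NOT marked as a duplicate: 'first' marks every occurrence except the first as True, 'last' marks every occurrence except the last as True, and False marks ALL occurrences as True.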
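
The effect of each keep value can be sketched with the data from the example at the top of this page, where rows 1 and 3 are both ("Mary", 40). The variable names below are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

# 'first': the first "Mary" row is left unmarked, the repeat is marked.
first = df.duplicated(keep="first")

# 'last': the last "Mary" row is left unmarked, the earlier one is marked.
last = df.duplicated(keep="last")

# False: every row that has a duplicate is marked.
every = df.duplicated(keep=False)

print(first.tolist())  # [False, False, False, True]
print(last.tolist())   # [False, True, False, False]
print(every.tolist())  # [False, True, False, True]
```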

Return Value

A Series with a boolean value for each row in the DataFrame.
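
Because the result is a boolean Series aligned with the DataFrame's index, one common way to use it is as a row mask. The sketch below assumes the same example data as above; mask, duplicates and unique_rows are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sally", "Mary", "John", "Mary"],
    "age": [50, 40, 30, 40],
})

mask = df.duplicated()

# Rows flagged as repeats of an earlier row
duplicates = df[mask]

# Rows that are not flagged (first occurrences and unique rows)
unique_rows = df[~mask]

print(duplicates)
print(unique_rows)
```

With the default keep='first', selecting df[~mask] gives the same rows as df.drop_duplicates().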
