Dropping duplicate values in a column

February 2019

I have a DataFrame like this:

df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})

I tried converting each value to a list, a set, etc., but couldn't get it to work.

How can I drop the duplicate values within each cell?

2 answers

Use a custom function with split and set:

df['America'] = df['America'].apply(lambda x: set(x.split(',')))

Another solution is to use a list comprehension:

df['America'] = [set(x.split(',')) for x in df['America']]

print(df)
     America
0   {23, 24}
1       {10}
2  {AA,  XY}
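
Note that the third row contains " XY" with a leading space, because split(',') keeps any whitespace around the items. A minimal sketch, assuming surrounding whitespace is not meaningful in your data, strips each item before building the set:

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

# Strip whitespace around each item before building the set,
# so " XY" and "XY" would be treated as the same value.
df['America'] = df['America'].apply(
    lambda x: {item.strip() for item in x.split(',')}
)

print(df)
```

The values are still strings ('24', not 24), and a set has no guaranteed order.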

Here is one approach using str.split.

Example:

import pandas as pd

df = pd.DataFrame({'America':["24,23,24,24","10","AA,AA, XY"]})
print(df["America"].str.split(",").apply(set))

Output:

0     {24, 23}
1         {10}
2    {AA,  XY}
Name: America, dtype: object
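
A set does not keep the original order, and the column ends up holding set objects rather than strings. If you would rather keep a comma-separated string with the first occurrence of each value in its original order, one possible sketch uses dict.fromkeys, which preserves insertion order in Python 3.7+. The helper name dedupe_cell is hypothetical, and stripping whitespace around items is an assumption about the data:

```python
import pandas as pd

df = pd.DataFrame({'America': ["24,23,24,24", "10", "AA,AA, XY"]})

def dedupe_cell(cell):
    """Hypothetical helper: drop repeated items from a comma-separated string."""
    # Assumption: whitespace around items is not significant.
    items = [item.strip() for item in cell.split(',')]
    # dict.fromkeys keeps only the first occurrence of each key, in order.
    return ','.join(dict.fromkeys(items))

df['America'] = df['America'].apply(dedupe_cell)

print(df['America'].tolist())
# ['24,23', '10', 'AA,XY']
```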