I have a dataset containing 3 columns.
- Personal ID number
- field of employment
- months of working experience in the given field of employment
I have a large number of duplicates where a person has experience in two or more fields of employment. I want to narrow down my dataset so that no personal ID has a duplicate. The criteria being the field of employment where the individual has the most experience.
My dataset looks like this:
a b c
teacher ID99999 38
teachers assistant ID99999 6 2 Answers
You can do this using a helper column in Column D.
- Enter this array formula in D2:
=IF($C2=MAX(IF($A2=$A$2:$A$20,$C$2:$C$20,-1)),"Remain","Remove")
Note: Enter the formula with Ctrl+Shift+Enter and fill down.
Filter column D by "Remove".
Finally, delete the filtered rows.
This formula will find the maximum months of experience in Column C for duplicate ID's, and mark those rows with "Remain". Other rows with duplicate ID's are marked with "Remove".
4Sort C column from largest to smallest.
Now remove duplicate from Data Tab-> Remove Duplicates based on B Column.