Removing duplicate data
To locate duplicates in a worksheet, follow these steps:
- First, open the workbook named Duplicates.xlsx. Always make a copy of your dataset before removing duplicate data.
- Use the Conditional Formatting tool to locate duplicates in the worksheet, or locate the Remove Duplicates tool.
-
Select the data in the worksheet, then go to Home | Conditional Formatting |
Highlight Cells Rules | Duplicate Values….
- This method will highlight all the duplicate data in the range that you have specified. Note that you can change the Format cells that contain dropdown to Unique if you wish.
- Change the Custom Format option if you do not want to use the default color scheme.
- Click OK to identify duplicate values in the worksheet.
To remove duplicates, follow these steps:
- Select the column or table you wish to remove duplicates from (or click inside the data range).
- Click Data | Remove Duplicates:
Figure 9.52 – The Remove Duplicates feature
- Click to indicate whether your data has column headers or not.
- Uncheck any columns you do not want to include in the range. Click OK to remove the duplicates:
Figure 9.53 – The Remove Duplicates dialog box
- Another screen will populate, specifying the number of duplicate values located on the worksheet, as well as the number of unique values that remain. Don’t forget to save the workbook:
Figure 9.54 – The results are displayed in the notification popup
- Click OK to view the results without any duplicates.
Another way to remove duplicates will be explained in the next section.