Identifying Duplicate Values in Excel Listsby
In this article, David Ringstrom, CPA compares Conditional Formatting and Advanced Filter to identify duplicate items in a list. He then takes things a step further by showing accountants how to eliminate duplicates by using the Remove Duplicates feature and the UNIQUE function in Excel 2021 and Microsoft 365.
A sure-fire way to make an accounting professional sigh is to hand over an Excel list peppered with duplicate values. Fortunately, there are several ways that you can identify duplicate values in your spreadsheets.
Conditionally Formatting Duplicate Values
Let’s say that you’d like to determine if any duplicates appear in a list of invoice numbers, as shown in Figure 1:
1. Select the range of cells that may contain duplicate values.
2. Choose Conditional Formatting on Excel’s Home menu.
3. Choose Highlight Cells Rules.
4. Choose Duplicate Values.
5. The Conditional Formatting feature offers a live preview, meaning any duplicate values will be color-coded. You can optionally use the second drop-down list in the Duplicate Values dialog box to customize the formatting to you liking. Click OK to identify the duplicate values.
Figure 1: You can identify duplicates in a list with Conditional Formatting.
At this point you can use the Filter feature to display only the duplicate values:
1. Select any cell within the list that you applied conditional formatting to
2. Activate the Data menu
3. Choose the Filter command
4. Click the filter arrow for the column that has conditional formatting
5. Choose Filter by Color
6. Choose a color from the list
7. As shown in Figure 2 the worksheet is collapsed down to just the duplicate values
8. Click the filter arrow for the column that has conditional formatting
9. Choose Clear Filter to show the entire list or click the Clear command on the Data menu
Figure 2: You can filter lists based upon color.
Keep in mind that Conditional Formatting is a live layer of formatting that you’ve added to that part of the worksheet. If you were to add an extra character to one of the duplicate invoice numbers you’d see that the color coding would vanish from that cell and the previous counterpart duplicate cell.
When clearing up duplicates in a list in this fashion I like to click the Reapply command on the Data menu, which reapplies the filter. This hides any unique values so that the list then only shows the remaining duplicate values.
Identifying First Instances with Advanced Filter
Circumstances could arise where you want to color code the first instance of a given value:
1. As shown in Figure 3, select any cell within your list
2. Activate the Data menu in Excel
3. Click Advanced in the Sort & Filter group
4. Use the List Range field within the Advanced Filter dialog box to select the range of cells that you wish to identify unique values within.
5. Click the Unique Records Only checkbox within the Advanced Filter dialog box.
6. Click OK.
7. At this point only the unique records will be shown, and Excel’s Status Bar reflects the number of unique items. Optionally apply any formatting that you wish, such as a fill color.
8. Activate the Data menu again.
9. Click the Clear command to display the entire list again. If you applied formatting only the first instance of any value will be color-coded, the second instance and beyond will not.
Figure 3: The Advanced Filter allow you to hide duplicate values and display unique records only.
Now that you have a sense of how to identify duplicates, let’s look at how to remove duplicates from a list. The operative term in this first technique is “remove”, meaning that the feature physically removes the duplicate values. Accordingly make sure that you first make a copy of the list you’re going to process if you need an intact set of all the values:
1. Select any cell within your list.
2. Activate the Data menu.
3. Click Remove Duplicates.
4. Click OK in the Remove Duplicates dialog box.
5. Click OK on the prompt that informs you of how many duplicate values where found and how many remain.
Click Undo or press Ctrl-Z if you realize that you jumped the gun on removing duplicates and need your original list back. As shown in Figure 4, if your list has two or more columns Remove Duplicates treats each row as a single record.
Figure 4: Be careful as Remove Duplicates physically removes non-unique items from a list.
If you’re using Excel 2021 or Microsoft 365 then you can also remove duplicates with a worksheet function. Unlike the manual nature of Remove Duplicates the UNIQUE function works in real time and dynamically presents a list of unique items. Further your original list remains intact, as UNIQUE provides a list of unique values in a separate location in your spreadsheet. As shown in Figure 5 the UNIQUE function has three arguments:
• Array – the range of cells that you wish to display unique values from.
• By_col – this argument defaults to TRUE which means that it returns unique values from a column. Enter FALSE if you wish to return unique values from across a row.
• Exactly_once – Enter TRUE here to display the items that appear in a list exactly once, or FALSE to display one of each unique value no matter how many times it appears on the list.
Keep in mind that by_col and exactly_once are optional arguments that you only need to provide when needed. Further UNIQUE will return #CALC! if you specify TRUE for the exactly_once argument but no items appear in your list exactly once.
The UNIQUE function is a dynamic array function, which means that you only enter the formula in a single cell, while the results will spill into as many additional cells as needed. If there is any data in the way, or if you overwrite the results of the UNIQUE function then a #SPILL! error will arise, which you can clear by removing the extraneous data from the area where UNIQUE wants to return its results.
A border will appear around this area when you click on the #SPILL! error. As shown in Figure 5 you can wrap the SORT function around the UNIQUE function to automatically sort your list of unique values.
Figure 5: The UNIQUE function in Microsoft 365 and Excel 2021 enables you remove duplicates by way of a formula.
David H. Ringstrom, CPA, is an author and nationally recognized instructor who teaches scores of webinars each year. His Excel courses are based on over 25 years of consulting and teaching experience. His mantra is “Either you work Excel, or it works you.” David offers spreadsheet and database consulting services nationwide.