How to Resolve Duplicate Data within Excel Pivot Tables

An attendee from my recent pivot table webinar posed a question that I hadn’t encountered before.

Pamela had an issue where some, but not all, of the items in her pivot table appeared twice, each with a different total. If you’re new to pivot tables, you can catch up by watching a free recording of the webinar.

In this article, I’ll explain how I helped Pamela track down and resolve a quirk in her data. Although Pamela’s data set was much larger, I only need the two columns of data shown in Figure 1 to illustrate what she was experiencing. There’s a hidden aspect to this data that I’ll reveal in a moment, but let’s first create a pivot table from this data:

About David Ringstrom

David H. Ringstrom, CPA, is an author and nationally recognized instructor who teaches scores of webinars each year. His Excel courses are based on over 25 years of consulting and teaching experience. His mantra is “Either you work Excel, or it works you.” David offers spreadsheet and database consulting services nationwide.

Replies

By William
Jun 26th 2015 01:11

Wow, great info about resolving duplicate data within Excel pivot tables, thanks :)

http://www.education-institute...

By Ralph Pawne
Jun 26th 2015 01:11

I've come across a great tool called DataMatch by Data Ladder (http://DataLadder.com), which is an excellent fuzzy matching and deduplication tool used across business and would work really well for this situation. They offer a complimentary trial for new users.

In fact, an independent, verified evaluation compared the software with major tools from IBM and SAS. A study at the Curtin University Centre for Data Linkage in Australia simulated the matching of 4.4 million records and rated each provider on accuracy (the number of matches found versus the number available, and the number of false matches):

1. DataMatch Enterprise: highest accuracy (>95%), very fast, low cost

2. IBM Quality Stage: high accuracy (>90%), very fast, high cost (>$100K)

3. SAS Data Flux: medium accuracy (>85%), fast, high cost (>$100K)

By Anselm
Apr 10th 2017 17:44

My pivot table seems to split the same data into two columns arbitrarily. In the column labelled "Faculty" in the data, for example, the value "All" appears 22 times, but the pivot table splits these into two columns, with 20 appearances in one and two in the other. I've used the =ISNUMBER function to check every cell with that value in it. All return "FALSE", meaning, if I understand correctly, that the value in each cell is stored as text, which it should be.

Any ideas what's going wrong?
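
A split like the one described above is consistent with text values that look the same on screen but differ by an invisible character, such as a trailing space or a non-breaking space. That is only an assumption about this workbook, not a confirmed diagnosis. The Python sketch below uses made-up data (20 copies of "All" and 2 copies of "All " with a trailing space) to show how such values group separately, how to list the characters behind a cell value, and how normalizing whitespace merges the groups. The `normalize` and `code_points` helpers are hypothetical names introduced for this illustration.

```python
from collections import Counter

# Hypothetical stand-in for the "Faculty" column: 20 clean values and
# 2 values with a trailing space that is invisible on screen.
faculty = ["All"] * 20 + ["All "] * 2


def code_points(value: str) -> list[str]:
    """List the Unicode code point of every character, exposing hidden ones."""
    return [f"U+{ord(ch):04X}" for ch in value]


def normalize(value: str) -> str:
    """Turn non-breaking spaces into ordinary spaces, trim both ends,
    and collapse runs of inner whitespace to a single space."""
    return " ".join(value.replace("\u00a0", " ").split())


# Grouping on the raw text yields two separate items, much like the pivot table.
raw_counts = Counter(faculty)

# Grouping on the normalized text yields a single item.
clean_counts = Counter(normalize(v) for v in faculty)

for value, count in raw_counts.items():
    print(repr(value), count, code_points(value))
print(dict(clean_counts))
```

Within Excel itself, comparing `=LEN(A2)` across the suspect cells is a quick way to spot a stray character, and `=TRIM(A2)` removes leading and trailing ordinary spaces. TRIM does not remove non-breaking spaces, so those may need `SUBSTITUTE` first. The cell reference `A2` is a placeholder.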
