Have you even been overwhelmed by an extensive Microsoft Excel spreadsheet? If yes, you’re not alone. Upon receiving thousands of lines of complex data you may think, “It has got to be easier than this.” Chances are, there are Excel tools that can simplify your data project and shrink your workload. Better yet, these tools are easy to use and often hide in plain sight. To uncover these helpful tools, all you need is Microsoft Excel help from an expert.
In this blog post, “The Office Guy” is helping us learn more about a common Excel feature that hides in plain sight: the remove duplicates tool.
Microsoft Excel Help – Tools Hiding in Plain Sight
More notes from “The Office Guy”…
A big part of my job as a Microsoft Office instructor is showing students where to find the right buttons to click in the Ribbons at the top of the screen. And many times when I show someone how to use a certain feature in Microsoft Office they will ask “how long has this feature been there?” Well, sometime I am showing them a feature that is new to a particular version of the software (like Office 2013), but often my answer is something like “they put that button up there in the late 1990’s”.
I call this situation “hiding in plain sight”, because one of the things I learned long ago as an instructor is that most users have “tunnel vision”. Meaning that they only focus on the middle of the screen where they are entering, editing or just looking at their data, while ignoring the tools displayed at the edges of the screen. So, this month’s blog is devoted to a feature that has been “hiding in plain sight” in Excel for a long time: the Remove Duplicates Tool.
Microsoft Excel Help – How to Shrink 60,000 Rows of Data
The reason I chose to review this feature is a true story from my own workplace. One of my colleagues had received a large list of potential business contacts from many different companies, but what she really wanted to find out how many unique company names were represented in the listing and there were literally thousands of entries where the company names appeared on multiple rows. The list was close to 60,000 rows of data representing almost 5,000 companies and narrowing down the list to just one occurrence per company looked like it was going to be a very time consuming task. Looking for help, she asked me if there was a way to do this in Excel automatically, so I showed her how to use (and where to find) a tool that has part of Excel for as long I’ve been an instructor (18 years). So I thought it would be a good idea to show all of you how to use a tool that has been “hiding in plain sight”.
Microsoft Excel Help – Using “Remove Duplicates” Tool
First, let’s look at an example of the type of data where this tool could be used – a list with a lot of rows of duplicate information:
Note how in this list there are several names that are duplicated on multiple rows Now imagine that these names are not always listed right next to each other, and that this list is several hundreds or thousands of row of data and your mission is to eliminated the Duplicate rows (or records) of data.
So, where is this tool located? It’s on the Data Tab, in the Data Tools group (and it has been since 2007 when Microsoft introduced the Ribbon-based interface:
So, after you select a cell location in the list you’re working on, click this button and the following dialog box will appear:
Noticed that all of the column names (fields) in your list are selected by default. You may not care if there is duplicate information in some of these fields, so first click the button marked “Unselect All” to clear all the checkboxes as shown below:
Then check (select) the columns (fields) that you want to examine for duplicate data, then click OK: Excel will then remove all but one row (records) in the data that contain duplicate information, leaving you with no duplicates. Excel will also tell you how many rows (records) of data have been removed from the list:
When you complete this operation by clicking the OK button only one unique occurrence of the fields you chose will remain in the list. Now in this example I was working with a small set of sample data that we use to teach our classes – it may not seem like a big deal that it only found 3 duplicate values.
So, to finish my story about my colleague with almost 60,000 rows of data, this tool found and eliminated over 45,000 rows of duplicated company name, leaving behind a list of 15,000 unique company names. Not bad for a tool that takes a few seconds to use – once you know where to look. It’s been “hiding in plain sight” for the last 8 years…