General Question

pixiequeen12's avatar

Is there a free feature for removing duplicate columns/rows in Microsoft Excel?

Asked by pixiequeen12 (137points) May 13th, 2008

I have already used one from AppleBits.com, which was only a free trial that has now expired. Does anybody know of a permanent duplicate removal feature (that doesn’t cost money)? It really should already be part of the software, don’t you all agree?!?

Observing members: 0 Composing members: 0

7 Answers

ccatron's avatar

you can do this without adding software.

http://support.microsoft.com/kb/262277

I think this should work with all versions of Excel. if you’re using a mac, the menus might be a little different, but it should be there somewhere. It is a little tricky if you need to get rid of duplicate data in a specific column, but it can be done.

pixiequeen12's avatar

i tried that out and it doesnt seem to be working. it deletes some, but not all of the duplicate rows… I’ll have to keep playing with it, I guess… But it’s also not as useful because it makes a copy of the entire spreadsheet—so when you’re trying to dedup, say 3000 emails, names and addresses, it creates a pretty bulky document!

ccatron's avatar

yeah, its tricky..i know with Excel 2007 the process is a lot easier, but again, I don’t know about the mac version

Michael's avatar

The workaround that I use is as follows:

1) Sort your data by the column you wish to de-dup (so, for example, with a list of e-mails, sort by e-mail)
2) In the second row of the first empty column, type in the following formula: =if(X1=X2,1,0) where X is whatever column letter the e-mail addresses are in. This essentially puts a 1 in that cell if the e-mail in that row is the same as the one above it.
3) Copy and paste that formula into the rest of the rows in that column (excel will automatically change the row numbers in your formula, so don’t worry about that)
4) Select and copy your new column of 1’s and 0’s and then paste it back over itself using the paste special – values option. This will prevent the 1’s and 0’s from changing when you resort, which you will do in step…
5) Resort the spreadsheet by your new column. This will lump all the 1’s (duplicates) together which will make it really easy to…
6) delete all the rows with 1’s.

That will leave you with only unique e-mails.

pixiequeen12's avatar

actually—i don’t use excel as much on my mac. i need it for a pc at work.

pixiequeen12's avatar

Michael—you’re awesome. I actually know that formula already, but am completely impressed that you do as well! I’m thinking of starting an Excel Club…It’s like Fight Club, but for that rare excel-nerd extraordinaire (I, myself, am just a wanna-be, but I’m sure these people exist), and you can be our honorary member! :)
The trouble with that formula is that I’m trying to de-dup for rows that match entirely—not just in a single column; hope that makes some sort of sense…

CyanoticWasp's avatar

@pixiequeen12, then try this:

Let’s say that you have, oh, 10 columns of data that might be duplicated, and you want to check all 10 columns. So you want to check columns A:J for duplication across all 10 at once. This isn’t so difficult.

Use the CONCATENATE() function to group all of the values from A1:J1 into a single cell:
=CONCATENATE( A1, B1, C1, D1, E1, F1, G1, H1, I1, J1)
and copy that formula down through all the rows of your data. (It’s just possible that this formula will create duplicates that aren’t really dups. Don’t worry about that for now. For example: “Joe Blow 1 10” and “JoeBlow 110” will both concatenate to “JoeBlow110”, making an apparent duplicate. We’ll address that later.)

Sort on this column of concatenated values and apply the formula above from @Michael to flag the “apparent” dups.

If you have thousands of rows of data to review, then I can offer more suggestions to automate the rest of the process as well, but if all you have is a few hundred rows, then this will help you to find those rows that are duplicated completely (and maybe some that “should be” duplicates, but aren’t).

For example, after you sort on the concatenated values, then you can apply @Michael‘s formula to the concatenated values, as explained, AND to some of the other ‘main’ values, as well. The flags that show up in both places would be the ones to concentrate on.

Answer this question

Login

or

Join

to answer.

This question is in the General Section. Responses must be helpful and on-topic.

Your answer will be saved while you login or join.

Have a question? Ask Fluther!

What do you know more about?
or
Knowledge Networking @ Fluther