# Create a Unique List in Excel based on Criteria

Nathan is working on a spreadsheet that contains a list of car models and owners. He needs to create a unique list of owners per car. “Maybe I can use an `IF()` formula,” he thinks to himself and decides to give it a go. However, after several failed attempts he decides to search the web to see how people create a unique list in Excel based on a specific condition. He comes across a formula that looks pretty complicated, but others had claimed it worked for them, so Nathan decides to give it a whirl. Then he copy/pastes the formula in a cell and adjusts the ranges for his worksheet. After that he copies the formula down to create the list. It works! “Perfect,” Nathan excitedly thinks to himself, “now I’ll just copy that to the right…” and when he does, the formula breaks. He tries modifying the formula, but it’s too advanced for him. One small change upsets the formula and returns errors. Frustrated and tired, he reluctantly decides he’ll just create the list by hand.

Does this sound familiar? We’ve all been here at one point or another. The problem with these complicated formulas is that most people don’t understand what they’re copying / pasting into Excel. This leads to many tries and retries, only to lead to frustration. Excel has literally hundreds of formulas and it’s easy to get lost if you have a lot of nested functions to work with. It’s great that people share their clever formulas online and the internet is full of helpful advice, but if you find a formula online and simply copy / paste it without understanding how it works, you limit yourself from the power it has, and therefore, you limit the power that you have.

In this post, I’d like to walk us through how to understand one of these complicated formulas: how to create a unique list in Excel based on criteria. Let’s get started.

# The Problem

Let’s say you have this basic info:

On the left is an Excel Table named Cars (if you’re not familiar with Excel Tables, click here). This will be our data source we want to create a list from.

On the right side of the worksheet is where we will make that list, with each car in its own column. Note that there are duplicate cars owned by different people in the data source – this can make it difficult to create a unique list of car owners, which is why we need a more advanced formula.

We also want the formula needs to be dynamic, meaning that we can copy it down and to the right to create the unique lists for each car model.

How do you think you would construct this kind of formula?

# The Solution

Here’s what the resulting formula will look like:

`=IFERROR(INDEX(Cars,SMALL(IF(Cars[Car]=E\$1,ROW(Cars)-1),ROW(1:1)),2),"")`

Please note that this is an array formula and must be entered by pressing `Ctrl+Shift+Enter`. Click here to learn more about array formulas.

When we enter this in `E2`, we get Susan as the first answer.

Then copy down and to the right and we have our solution:

## Before You Simply Copy / Paste!

Remember what we discussed in the intro? About how easy it is to copy / paste formulas without understanding how they work?

How easy is it to copy / paste answers like these?

Very easy.

And how much power does doing that have?

Very little.

Don’t you want to harness the power of building complex formulas?

## How does this formula work?

Let’s break this formula down piece-by-piece to understand what’s really going on.

First, the main `IFERROR()` function is there to handle any errors and replaces them with a blank cell. Let’s now focus on the part of the formula that’s doing the real work.

### The Index() Function’s Role

The `INDEX()` formula works like so:
`INDEX(array, row, column)`

Here is our `INDEX()` formula for the unique list we are trying to create:

```INDEX(Cars,SMALL(IF(Cars[Car]=E\$1,ROW(Cars)-1),ROW(1:1)),2)
|__| |___________________________________________| |
|                       |                         |
array                   row                     column```

If you’re not familiar with the `INDEX()` function, check out this link to learn more.

Basically, the `INDEX()` function will look at a table (the `array`), then based on the `ROW` and `COLUMN` you give it, a single value will be returned. This single value will be one of the items in our unique list. The secret here is to manipulate the `ROW` portion, which we will dig into shortly.

The `array` that we reference is the Cars table.

The `ROW` portion of the formula has a more complicated formula which allows us to get a list of non-repeating values from the data set. This is actually where the Array Formula (Ctrl+Shift+Enter) is required and where the unique list comes from.

The `COLUMN` portion simply says `2`, which means “use the 2nd column from the array (table) I’ve given you.” While the 2nd column in our case is column `B`, the formula actually refers to the 2nd column of the array you gave it. If we had our table starting in cell `E5`, the 2nd column of the table would be in column `F`.

### The Small() Function’s Role

Let’s break down the `ROW` portion of the `INDEX()` function a bit further.

The `SMALL()` function simply says:

`SMALL(data_set, k-th_smallest_item)`

Here is the formula:

```SMALL(IF(Cars[Car]=E\$1,ROW(Cars)-1),ROW(1:1))
|___________________________| |______|
|                  |
data_set        k-th_smallest_item```

Here’s a quick sample formula to better understand how the SMALL() function works:

`=SMALL({4,5,1,3}, 2)`

The data set is an array of 4`, 5, 1, and 3`. The `k-th_smallest_item` of the formula is set to `2`. This returns the second smallest item of the set, which is `3`.

Both portions of our SMALL() function have another formula in them. Let’s dig into this even further (we’re almost there, I promise).

#### The If() Function’s Role in the data_set Portion

`IF(Cars[Car]=E\$1,ROW(Cars)-1)`

This is the part that needs the array formula (Ctrl+Shift+Enter) and this is also where things get a little tricky, but it’s not as bad as you might think.

This function is looking in the Car column of the Cars table and looking for a match from cell `E1`, which is the Car’s make and model. If `True`, we want to return the row number of where that Car was found. We use the `ROW()` function to accomplish that. Right now, we’re looking to find the owners of the Honda Civic, which are Susan and Miranda. To get that, we first need to find all of the rows that match the car that we’re looking for, then we will simply move 1 cell to the right to get the name of the owner. All matching rows of cars will have the corresponding owners in the next cell and this is how we will get our unique list.

###### A Note about the word 'Rows'
I’d like to take a second and explain something about the word “Rows”.

When working with Excel Tables (or any “data array”) it’s important to realize that the first “row of data” is actually considered Row 1 regardless of whether the data starts at A1 or Z35. When using the `INDEX()` function and your data is in `C5:D10` and you have headers in `C5` and `D5`, then your data begins at `C6` and ends at `D10`. Even though on the worksheet it’s the 6th row for where the data begins, it’s still considered Row `1` in your `INDEX()` function (and other functions that work with arrays).

Because of this confusion, I will refer to rows in an array as “data rows” and rows in the worksheet as “worksheet rows”.

The reason for the `-1` is because if you take the `Honda Civic` as an example, that text first appears on row `2` of the worksheet. While the Excel Table’s data begins on Row `2`, the first “row of data” is still considered “row 1” when working with the table (these are data rows). The `ROW()` function only returns rows of the worksheet, regardless of any data you may be referencing. And since `ROW()` will return `2` for `Honda Civic`, we need to adjust for how `INDEX()` is working with the data rows.

For finding `Susan` and `Miranda`, those would be data rows `1` and `4` because the data starts in `A2` and ends in `B8`.

Here’s how this `IF()` function is broken down:

`IF(Cars[Car]=E\$1,ROW(Cars)-1)`

`Cars[Car]` becomes:

`IF({"Honda Civic";"Toyota Corolla";"Ford Mustang";"Honda Civic";"Honda CRV";"Toyota Corolla";"Honda CRV"}=E\$1,ROW(Cars)-1)`

E\$1 becomes:

`IF({"Honda Civic";"Toyota Corolla";"Ford Mustang";"Honda Civic";"Honda CRV";"Toyota Corolla";"Honda CRV"}="Honda Civic",ROW(Cars)-1)`

Notice that I highlighted the matches that will be made. This will correspond to how the condition section is evaluated in the `IF()` function:

`IF({TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE},ROW(Cars)-1)`

Now, the `ROW(Cars)-1` part gets evaluated:

`IF({TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE},{1;2;3;4;5;6;7})`

Also, since we didn’t include a `value_if_false` section for the `IF` statement, Excel gives us `FALSE` be default. The resulting `IF` formula is actually:

`IF({TRUE;FALSE;FALSE;TRUE;FALSE;FALSE;FALSE},{1;2;3;4;5;6;7},{FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE})`

The `TRUE` matches in the first parameter of the `IF` statement will correspond to the same item in the `value_if_true` section, whereas if `FALSE` is found in the `IF` statement, we simply get `FALSE` as the `IF` formula is building up the resulting array. The resulting formula after the `IF` statement is evaluated is:

`{1;FALSE;FALSE;4;FALSE;FALSE;FALSE}`

What this says is for the Honda Civic, we found Honda Civic in data rows 1 and 4 (which correspond to the data rows where Susan and Miranda are).

#### The Row() Function’s Role in the k-th_smallest_item Portion

Coming back to our `SMALL()` function:

`SMALL({1;FALSE;FALSE;4;FALSE;FALSE;FALSE},ROW(1:1))`

The part:

`ROW(1:1)`

Will simply return `1` when we put the formula in cell `E2`, like we are. When we copy the formula down, it will become `ROW(2:2)` which will return `2`.

Because this formula is sitting in the `k-th_smallest_item` portion of the `SMALL()` function, when we copy the formula down, instead of using the first match of who has a `Honda Civic` (i.e. `Susan`) we will use the 2nd owner (i.e. `Miranda`).

`SMALL({1;FALSE;FALSE;4;FALSE;FALSE;FALSE},1)`

This will now simply pick the 1st smallest item in the set, which is `1`. Note that it ignores the `FALSE` items in the set.

The result for the SMALL() function is simply:

`1`

## Putting it all back together

Coming back to the formula:

`INDEX(Cars,SMALL(IF(Cars[Car]=E\$1,ROW(Cars)-1),ROW(1:1)),2)`

When this is in cell `E2`, the evaluation is:

`INDEX(Cars,1,2)`

Which means to say, “in the Cars table, return data row 1 and data column 2,” corresponding to `Susan`. That’s what we get for the first formula.

When we copy the formula down by one cell (in `E3`), here is the breakdown (with changes highlighted):

`INDEX(Cars,SMALL(IF(Cars[Car]=E\$1,ROW(Cars)-1),ROW(2:2)),2)`
`INDEX(Cars,SMALL({1;FALSE;FALSE;4;FALSE;FALSE;FALSE},2)),2)`
`INDEX(Cars,4,2)`
`Miranda`

Notice here that since `ROW(1:1)` changed to `ROW(2:2)` that it’s simply adjusting our `k-th_smallest_item` in the `SMALL()` function, which is how we can avoid creating duplicates.

Also notice that when you copy to the right, the ONLY thing that changes is the column in the `IF()` function:

`INDEX(Cars,SMALL(IF(Cars[Car]=F\$1,ROW(Cars)-1),ROW(1:1)),2)`

I didn’t lock this column so we can simply copy to the right and have the rest of the worksheet adapt correctly.

### What happens if you copy down or to the right too far?

You’ll get a `#NUM!` error, which is why we put the `IFERROR()` portion in there (to make any error just show a blank string…essentially hiding the errors).

### Are you ready to master Excel Dashboards?

Still not convinced? Check out my review of the course!

• Amar Najal

Hello Joseph. I am confused in the formula “IF(Cars[Car]=E\$1,ROW(Cars)-1)”, there’s no condition set if the statement is false, how come it did not give a “FALSE” value in return? =)

• Joseph

Hi Amar,

You’re right. I should have included that as part of the evaluation. I’ll update the post to show the full evaluation of the formula. Thanks for pointing this out!

Joseph

• Scottie

Hi Joseph,
I’ve found your article to be very good. But i was wanting to confirm if there is a way to add a secondary or even a third criteria to help define a more distinct list.

• Scottie

After doing some more research looks like it was because i was trying to do “and” in the array changing to * to simulate “and” has allowed me to define multiple criteria and return the correct results.

• Hunter

Can you provide more information on this? I also would like to do multiple criteria and am having trouble inserting the “AND” formula into the IF statement. Any help would be much appreciated.

• Joseph

Hi Hunter,

Could you please provide the formula you are trying to use and at which part you are stuck?

Thank you,
Joseph

• Deborah Good

Is there a way to modify this formula so that the resulting list skips blanks?

• Joseph

Hi Deborah,

Sorry I missed this, but in case anyone else is wondering, here’s how to do it:

=IFERROR(INDEX(Cars,SMALL(IF(Cars[Car]=E\$1,IF(Cars[Person]””,ROW(Cars)-1)),ROW(1:1)),2),””)

Note that this will skip blanks in the second column of `Cars[Person]`. You could change this to `Cars[Car]` to skip entries that don’t have a car.

Initially, I thought that I could use `AND(...)` in the `IF` statement inside the `SMALL()` function, but the `AND()` function doesn’t work well when comparing two arrays. That’s why you have to use a nested `IF` statement here (or at least, that’s one solution to this).

• Tim Newman

Thank you for this, Joseph. Very informative. Is there to change this to only return unique values? For instance, if cell C5 of your sheet was also “Susan”, your formulas in column E would only return Susan one time (followed by blanks).

Currently I am using =IFERROR(INDEX(TB!\$I\$2:\$I\$92,MATCH(0,INDEX((LEN(TB!\$I\$2:\$I\$92)=0)+COUNTIF(\$B\$7:B7,TB!\$I\$2:\$I\$92),0),0)),””) to create a dynamic list of unique values, but I’m not sure how to add in a lookup criteria for column H. I was trying to merge some of your logic in this post with my formula, but have been unsuccessful.

Any help would be greatly appreciated.

• Joseph

Hi Tim,

I’m glad you liked the post and thanks for the feedback. I do appreciate it.

I think I understand your question. Using the example that you gave where we have 2 Susans in the list, here’s how you can change the formula (entered in E2 as an array formula):

=IFERROR(INDEX(Cars[Person],MATCH(0,IF(Cars[Car]=\$E\$1,COUNTIF(\$E\$1:\$E1,Cars[Person])),0)),””)

It’s very similar to what you had, but the main difference is that the COUNTIF() portion is wrapped around an IF statement that holds the criteria.

I hope this helps.
Joseph

• Tim Newman

Thank you, Joseph! I had to toy with it a bit to get it to work on my spreadsheet but was able to get it working. Thanks again!

• Joseph

Sure thing, Tim 🙂 glad to help!

• Edd

Hi, I’ve appropriated this formula to create a list of case numbers with an action due on a particular date. I’ve successfully done it for today or tomorrow (based on a cell with that value in), but would like to create a list of case actions coming up in the next few working days. I’ve tried inserting an AND(>2days time,D\$14),(‘Complaints Log’!\$AC\$2:\$AC\$100<\$P\$14)),ROW('Complaints Log'!\$A\$2:\$AJ\$100)-1),ROW(1:1)),4)," ")

It seems to only return the first case on the list (which actually has an action due today in my test scenario).

If anyone can point out where I have gone wrong I would be very grateful.

Kind regards,

Edd

• Edd

=IFERROR(INDEX(‘Complaints Log’!\$A\$2:\$AJ\$500,SMALL(IF(((‘Complaints Log’!\$AC\$2:\$AC\$500>D\$14)*(‘Complaints Log’!\$AC\$2:\$AC\$500<\$P\$14)),ROW('Complaints Log'!\$A\$2:\$AJ\$500)-1),ROW(1:1)),4)," ")

Just in case anyone reads and could follow my problem above, I have solved it. I'm sure I had tried this before (a few times) without success, but it now works. First time I got it to work I'd reduced the data range to A2:A10, thinking it would help me follow the evaluate tool, but it worked. I used find and replace to change it to A2:AJ500 and it still worked. The range is large enough, fortunately we dont get THAT many complaints!

• Joseph

Hi Edd,

Glad to see you got it sorted out. I think I know why you were running into those issues. I believe it’s because when you do:

AND(bool1,bool2)

It won’t work with arrays, so this would work return `True`:

AND(TRUE,TRUE)

While this would return just `False` instead of `{TRUE;FALSE}`

AND({TRUE;FALSE},{TRUE;FALSE})

The `AND()` function does not seem to work well with arrays. It probably evaluates the first array constant and when it finds false, it calls the whole thing false.

Also, if you use this:

={TRUE;FALSE}*{TRUE;FALSE}

And put that in cells A1 and A2 as an array formula, it will result in:

1
1

But if you surround each array constant with parenthesis like this:

=({TRUE;FALSE})*({TRUE;FALSE})

A1 and A2 will have:

1
0

Which is what I would expect. So it seems that operator precedence is making a difference in the array formula here, which is how I think you solved your problem, by using the boolean workaround with *, but adding the parenthesis.

Pretty interesting stuff. I should probably write a post on this topic 🙂

• Kazh

Hi, I wanted to get a list (column) from headers in another Excel table, and it’s only thanks to this very detailed article that I could understand how to tweak it to my liking.

I case it could help: (From an Excel Table to another Excel Table)