Avoid Excel vlookup: use index+match instead

This is I’m sure old news to Excel gurus out there (of which I’m reluctantly, seemingly, turning into).

I started having a little more appreciation for Excel when I began using vlookup more seriously; this allows a semi-structured way of dealing with linked data in Excel. E.g., when a list of grades has a different set of students than a class list, you can still sensibly lookup values without manually copy/pasting information from one to the other.

I suppose an actual example is in order:

Screen Shot 2014-12-02 at 5.18.22 pm

I hope this is fairly obvious; we have an extremely basic lookup-style functionality (hence the name) where the function looks up "c" in the first column and outputs the corresponding entry in row 2. You’d obviously usually use a reference to another cell rather than hard-code the "c" entry in there.

So this is great, but comes with some problems.

First of all, vlookup of course has the incredibly frustrating default behaviour of not exactly matching entries; in my own experience, things never work out unless you write it in the form vlookup(•,•,•,FALSE) — and if I’m rusty I never remember whether I should be writing true or false as the argument in there. So that’s one reason vlookup is dumb.

Second of all, vlookup breaks in a completely transparent manner if you ever alter the columns in your original query. Perhaps I am re-arranging the spreadsheet so that there is a grade column as well:

Screen Shot 2014-12-02 at 5.22.53 pm

Unlike what happens in a lot of other Excel cases, inserting the new column hasn’t updated the vlookup equation (not saying it should, though). And our lookup now obviously returns different data. This can bite you pretty bad unless you have stringent error checking somewhere along the line.

On the other hand, the useful replacement for vlookup does NOT have this same problem. This solution is called something like index+match instead, and you should not be intimidated from using it because it’s slightly more complex. In fact, I find it easier to use in most cases:

Screen Shot 2014-12-02 at 5.28.47 pm

Translating that formula into prose produces something like this: “index into column D the row that matches “c” in column C”. Note again Excel’s insistence that you need a dumb suffix to tell match to do its thing properly; don’t forget the match(•,•,0).

Why is index+match better? I’m glad you asked:

Screen Shot 2014-12-02 at 5.32.07 pm

As soon as we inserted the extra column, the match function automatically updated and our formula is still correct. You can argue more tenuous advantages about index+match (e.g., that vlookup can’t have a negative column index), but automatic updating is the killer.

Advertisements


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s