How may I retrieve data from an Excel table based on a variable number of criteria?

I have the following salary data for example:

Country State 2012 2013 -> 2027
======= ===== ==== ====
China Other 1000 1100
China Shanghai 1310 1400
China Tianjin 1450 1500
India Orissa 1500 1600

So now in another Excel sheet I would want an answer to one of the following questions:

  1. What is the salary in Shanghai for 2013? (Answer would be 1400)
  2. What is the salary in Hubei province for 2012? (Since it is not listed, use "Other" - 1000)
  3. What is the average salary in China for 2013? (Answer would be 1333)
  4. What is the highest salary in China for 2012? (Answer is Tianjin)

So as in the above order of priority, I would like those numbers in another Excel sheet using some form of query. I considered PivotTables but I was wondering if there is another much better more efficient way of doing this?

I imagine SQL is suited for this but I am not clued up on that. Some Excel functionality is much rather preferred. Also suggestions on an appropriate format of data for such queries would be appreciated.

4

5 Answers

Your use case is perfect for Excel's database functions. To use them, you'll first need to unpivot your data so that each row corresponds to one observation in your data. This means you will have 4 columns: Country, State, Year and Salary.

Then you can use DGET() to get a specific entry in the database and DAVERAGE() to get an average. The criteria for database functions need to be laid out in pairs of cells where the filtering field goes on top and the value to be filtered goes underneath it. Here's a working example:

enter image description here

The formula for cell I10 is the same as the one for I9, except it uses DAVERAGE instead.

  • In I9 enter =IFERROR(DGET($B$4:$E$12,$E$4,$G$4:$I$5),"")
  • In I10 enter =IFERROR(DAVERAGE($B$4:$E$12,$E$4,$G$4:$I$5),"")

The IFERROR() around it ensures that DGET doesn't show #NUM! when you haven't provided values in all three inputs in row 6. This is because it can't find a single value for the salary if, say, you're not telling it which year to filter on.

1

I think one way is to name the column - For example, right click on the cell with the value 1400 (salary for Shanghai for 2013) and define name

Then, in another cell or just refer to that cell - the beauty of this is it doesn't matter if that cell is moved, it will remain with the content you want.

With some preparation, this can also be easily done in Excel too. (Although might be easier and more natural with SQL).

The preparation: use named ranges as @Adam suggested - country for the Country, state for the State, and I also named each of the year in this format year2012.

Then you can create your answers sheet

 A B C D
1 salary in Shanghai 2013 =IFERROR(INDEX(INDIRECT("year"&C1),MATCH(B1,state,0)),INDEX(INDIRECT("year"&C1),MATCH("Other",state,0)))
2 salary in Hubei 2012 =IFERROR(INDEX(INDIRECT("year"&C2),MATCH(B2,state,0)),INDEX(INDIRECT("year"&C2),MATCH("Other",state,0)))
3 average China 2013 =AVERAGEIF(country,B3,INDIRECT("year" & C3))
4 highest China 2012 {=INDEX(state,MATCH(MAX(IF(country=B4,INDIRECT("year"&C4))),INDIRECT("year"&C4),0))}

Each of these lines are dynamic, so you can change your 'questions' as required. The important thing to notice is that the 4th (highest) formula is Array Formula which requires the use of CTRL+SHIFT+ENTER when entering it.

p.s. Your answer to the 3rd question is wrong, the average salary in China in 2013 should be 1333.

0

I agree with Dave Rook, that it is the place where you should use SQL.

It is also possible to be done entirely with Excel formulas (what isn't?). This is how:

First of all you will need a way of addressing each column. You have two options for that.

  1. First, (like Dave Rook suggested) straightforward and more "beautiful", is to define named ranges. Just be sure, that each named range is big enough to contain all future records (or use some sort of dynamic named range). Unfortunately it require some clicking for each column, so it doesn't scale when you have 100 columns.

  2. The other option (which I prefer for heavy-duty computations) is to generate the column range addresses strings using e.g. =ADDRESS(ROW(C3);COLUMN(C3);4;;"sheet1") & ":" & ADDRESS(ROW(C3)-1+$A$2;COLUMN(C3);4) for accessing data on column C with number of records stored in cell A2. Then you address the range by use of INDIRECT() formula everywhere, when you would normally paste the named range from method 1.

Once all your columns (fields) are named ranges, you can perform all sort of lookup and reference tasks on them.

For instance: to identify the number of record which contains the word "Shanghai" you would use =MATCH("Shanghai";State;0), where State is the named range for the column "State".

Once you have the record number (let's say it is in cell X10, you can get information about Salary with =INDEX(Salary;X10).

All sorts of tasks are possible in Excel formulas; e.g. you can create a list of unique values (or values that match certain criteria), you can compute any form of sum/average. You can even sort them (yes!).

Please note, that in my locale I use semicolon ; as argument separator (not comma). English-speaking countries usually use ,, and in that case to use my formulas you would need to replace all ; -> ,.

1

Go for this vba aproach, It should work without too much tweaking. Error handling could be updated, tho:

Private Sub get_money(byval country as string, byval city as string, byval year as string) if country == "" then country == "nope" end if if city == "" then city =="nope" end if if year == "" then year =="nope" end if if autofiltermode = true then autofiltermode = false dim all_columns as double all_columns = thisworkbook.activesheet.range("A1").currentregion.columns.count with thisworkbook.activesheet for cell_position = 3 to all_columns if .cells(1,cell_position) == year year_to_filter = cell_position cell_position = all_columns end if next cell_position .Range(.cells(1,1), .cells(1,all_columns)).AutoFilter Field:=1, Criteria1:=country, Operator:=xlFilterValues .Range(.cells(1,1), .cells(1,all_columns)).AutoFilter Field:=2, Criteria1:=city, Operator:=xlFilterValues .Range(.cells(1,1), .cells(1,all_columns)).AutoFilter Field:=cell_position, Criteria1:=year, Operator:=xlFilterValues end with
end sub

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

You Might Also Like