In the
A challenge for 2016....... accepted! post I said that I would write a post once every month that would explain very basic stuff about SQL Server. Today is the second post, in this post we will take a look at how you can return the row with the maximum value for a group.
In order to explain the code in this post, I decided to grab the top largest cities in the world as the data. You can find the data here
List of cities proper by population
Below is the SQL code to create a table and also the data to populate this table
create table #Temp( Rank int not null,
City varchar(100) not null,
Population int not null,
Density int not null,
Country varchar(100))
INSERT #Temp
SELECT 1,'Shanghai',24256800,3826,'China' union all
SELECT 2,'Karachi',23500000,6663,'Pakistan' union all
SELECT 3,'Beijing',21516000,1311,'China' union all
SELECT 4,'Lagos',21324000,18206,'Nigeria' union all
SELECT 5,'Delhi',16787941,11320,'India' union all
SELECT 6,'Tianjin',15200000,1293,'China' union all
SELECT 7,'Istanbul',14160467,2593,'Turkey' union all
SELECT 8,'Tokyo',13297629,6075,'Japan' union all
SELECT 9,'Guangzhou',13080500,1759,'China' union all
SELECT 10,'Mumbai',12478447,20680,'India' union all
SELECT 11,'Moscow',12197596,4859,'Russia' union all
SELECT 12,'São Paulo',11895893,7821,'Brazil' union all
SELECT 13,'Shenzhen',10467400,5256,'China' union all
SELECT 14,'Jakarta',10075310,15171,'Indonesia' union all
SELECT 15,'Seoul',10048850,17134,'South Korea'
Now that we have the data in our table, let's run a simple query to return the row with the most populous city in the world. You can accomplish this in SQL Server by using the TOP operator and ordering the data by population in descending order.
SELECT TOP 1 * FROM #Temp
ORDER BY population DESC
Here is the output from that query
Rank City Population Density Country
1 Shanghai 24256800 3826 China
What we want to do next is return one row the most populous city per country.
So let's take it a step at a time, the query below will give you the maximum population by country
SELECT MAX(population) as BiggestPopulation,Country
FROM #Temp
GROUP BY Country
ORDER BY BiggestPopulation DESC
Here is the output
BiggestPopulation Country
24256800 China
23500000 Pakistan
21324000 Nigeria
16787941 India
14160467 Turkey
13297629 Japan
12197596 Russia
11895893 Brazil
10075310 Indonesia
10048850 South Korea
What we can do is join the query we have above with the table itself on the country and population columns. The query will look like this
SELECT T.* FROM(
SELECT MAX(population) as BiggestPopulation,Country
FROM #Temp
GROUP BY Country) X
JOIN #Temp t on x.BiggestPopulation = t.Population
AND t.Country = x.Country
ORDER BY t.Population DESC
Here is the output
+------+-----------+------------+---------+-------------+
| Rank | City | Population | Density | Country |
+------+-----------+------------+---------+-------------+
| 1 | Shanghai | 24256800 | 3826 | China |
+------+-----------+------------+---------+-------------+
| 2 | Karachi | 23500000 | 6663 | Pakistan |
+------+-----------+------------+---------+-------------+
| 4 | Lagos | 21324000 | 18206 | Nigeria |
+------+-----------+------------+---------+-------------+
| 5 | Delhi | 16787941 | 11320 | India |
+------+-----------+------------+---------+-------------+
| 7 | Istanbul | 14160467 | 2593 | Turkey |
+------+-----------+------------+---------+-------------+
| 8 | Tokyo | 13297629 | 6075 | Japan |
+------+-----------+------------+---------+-------------+
| 11 | Moscow | 12197596 | 4859 | Russia |
+------+-----------+------------+---------+-------------+
| 12 | São Paulo | 11895893 | 7821 | Brazil |
+------+-----------+------------+---------+-------------+
| 14 | Jakarta | 10075310 | 15171 | Indonesia |
+------+-----------+------------+---------+-------------+
| 15 | Seoul | 10048850 | 17134 | South Korea |
+------+-----------+------------+---------+-------------+
SQL Server 2005 introduced a bunch of ranking/windowing functions which make writing these queries a little simpler. We are going to use ROW_NUMBER for this post. Here is what the query will look like
SELECT ROW_NUMBER() OVER (PARTITION By [what do you want it grouped/partitioned by?] ORDER BY [what do you want it ordered by?]) AS Row,*
FROM Table
In our case, we want to partition the result set by country, we also want to order each partition by population descending. The query looks like this
SELECT ROW_NUMBER() OVER (PARTITION By Country ORDER BY population DESC) AS Row,*
FROM #Temp
ORDER BY Row,Population DESC
Here is what that query returns
+-----+------+-----------+------------+---------+-------------+
| Row | Rank | City | Population | Density | Country |
+-----+------+-----------+------------+---------+-------------+
| 1 | 1 | Shanghai | 24256800 | 3826 | China |
+-----+------+-----------+------------+---------+-------------+
| 1 | 2 | Karachi | 23500000 | 6663 | Pakistan |
+-----+------+-----------+------------+---------+-------------+
| 1 | 4 | Lagos | 21324000 | 18206 | Nigeria |
+-----+------+-----------+------------+---------+-------------+
| 1 | 5 | Delhi | 16787941 | 11320 | India |
+-----+------+-----------+------------+---------+-------------+
| 1 | 7 | Istanbul | 14160467 | 2593 | Turkey |
+-----+------+-----------+------------+---------+-------------+
| 1 | 8 | Tokyo | 13297629 | 6075 | Japan |
+-----+------+-----------+------------+---------+-------------+
| 1 | 11 | Moscow | 12197596 | 4859 | Russia |
+-----+------+-----------+------------+---------+-------------+
| 1 | 12 | São Paulo | 11895893 | 7821 | Brazil |
+-----+------+-----------+------------+---------+-------------+
| 1 | 14 | Jakarta | 10075310 | 15171 | Indonesia |
+-----+------+-----------+------------+---------+-------------+
| 1 | 15 | Seoul | 10048850 | 17134 | South Korea |
+-----+------+-----------+------------+---------+-------------+
| 2 | 3 | Beijing | 21516000 | 1311 | China |
+-----+------+-----------+------------+---------+-------------+
| 2 | 10 | Mumbai | 12478447 | 20680 | India |
+-----+------+-----------+------------+---------+-------------+
| 3 | 6 | Tianjin | 15200000 | 1293 | China |
+-----+------+-----------+------------+---------+-------------+
| 4 | 9 | Guangzhou | 13080500 | 1759 | China |
+-----+------+-----------+------------+---------+-------------+
| 5 | 13 | Shenzhen | 10467400 | 5256 | China |
+-----+------+-----------+------------+---------+-------------+
What we really want is the top 10 rows. You might think that you can just add WHERE Row = 1 to the query. This is not allowed since the WHERE clause gets executed before the ROW_Number function as part of the SELECT. What you can do is use a common table expression and then specify WHERE Row = 1 when selecting from the CTE
Here is what it looks like
;WITH CTE AS(
SELECT ROW_NUMBER() OVER (PARTITION By Country ORDER BY population DESC) AS Row,*
FROM #Temp)
SELECT Rank, City,Population,Density,Country
FROM CTE
WHERE Row = 1
ORDER BY Population DESC
And here is our output, for each country we have only one row, you can also see that 4 cities from the top 10 are not in this list, those cities are in either China or India
+------+-----------+------------+---------+-------------+--+
| Rank | City | Population | Density | Country | |
+------+-----------+------------+---------+-------------+--+
| 1 | Shanghai | 24256800 | 3826 | China | |
+------+-----------+------------+---------+-------------+--+
| 2 | Karachi | 23500000 | 6663 | Pakistan | |
+------+-----------+------------+---------+-------------+--+
| 4 | Lagos | 21324000 | 18206 | Nigeria | |
+------+-----------+------------+---------+-------------+--+
| 5 | Delhi | 16787941 | 11320 | India | |
+------+-----------+------------+---------+-------------+--+
| 7 | Istanbul | 14160467 | 2593 | Turkey | |
+------+-----------+------------+---------+-------------+--+
| 8 | Tokyo | 13297629 | 6075 | Japan | |
+------+-----------+------------+---------+-------------+--+
| 11 | Moscow | 12197596 | 4859 | Russia | |
+------+-----------+------------+---------+-------------+--+
| 12 | São Paulo | 11895893 | 7821 | Brazil | |
+------+-----------+------------+---------+-------------+--+
| 14 | Jakarta | 10075310 | 15171 | Indonesia | |
+------+-----------+------------+---------+-------------+--+
| 15 | Seoul | 10048850 | 17134 | South Korea | |
+------+-----------+------------+---------+-------------+--+
That is all for this month, see you next month. In the meantime, make sure to explore windowing/ranking functions as well as Common Table Expressions