Skip navigation

Category Archives: MySQL

Abacus

Abacus *

Recently I ran into an interesting problem, while using an aggregate function in a SELECT SQL statement. Initially the behavior seems to defy common sense, so I thought I’d share.

Consider this hypothetical database table called “items”:

id | category_id | amount | title
-----------------------------------
1  | 1           | 1      | Item 1
2  | 1           | 2      | Item 2

My goal was to retrieve all items with the maximum amount in each category.  Naturally, I wrote the following SQL statement:

SELECT id, MAX(amount) AS amount, title FROM items GROUP BY category_id;

The result was definitely unexpected. What I got was:

id | amount | title
--------------------
1  | 2      | Item 1

So I did get the maximum amount, but no matching data from the corresponding row. It appears that MySQL just grabbed the first row in the group and returned it along with the maximum amount. After some searching, this behavior actually makes sense. Just consider what would have to be the result if we had more than one row with the same maximum amount, or if we had both MAX() and MIN() functions in our statement.

There are a couple ways to still accomplish this in a single query:

Option 1:

SELECT id, amount, title
FROM items AS items1
WHERE amount = (
    SELECT MAX( amount )
    FROM items AS items2
    WHERE items2.category_id = items1.category_id
)

Option 2:

SELECT items1.id, items1.amount, items1.title
FROM items AS items1, (
    SELECT category_id, MAX( amount ) AS amount
    FROM items
    GROUP BY category_id
) AS items2
WHERE items1.category_id = items2.category_id
AND items1.amount = items2.amount

I’m not sure which approach is better. Option 1 seems to be a bit faster and appears to perform less sub-queries, according to EXPLAIN, but that might entirely depend on your database structure and indexes (MySQL calls them that). I would be happy to hear an explanation of which way is better and why.

In my own application, for the sake of scalability, I decided to keep an additional table with only the highest amounts in each category. I’m planning to update that “tally” table every time an item is added or edited.

Photo by cyberdees

Follow

Get every new post delivered to your Inbox.