Wednesday, April 12, 2006

A little (?) brain exercise

This blog post has moved. Please find it at:

http://www.mpopp.net/2006/04/a-little-brain-exercise/.

9 comments:

Scott Noyes said...

Here's a nasty, ugly solution. I think you could improve on the third subquery a bit using count() and a having clause, removing the need for joining inner3.

(
SELECT *
FROM tt outerTable
WHERE d2 = (
SELECT MAX(d2) FROM tt WHERE outerTable.d1 = tt.d1
)
)
UNION (
SELECT *
FROM tt outerTable
WHERE d2 = (
SELECT max(inner1.d2)
FROM
tt inner1
JOIN tt inner2 ON (inner1.d1 = inner2.d1 AND inner1.d2 < inner2.d2)
WHERE inner1.d1 = outerTable.d1
)
)
UNION (
SELECT *
FROM tt outerTable
WHERE d2 = (
SELECT max(inner1.d2)
FROM
tt inner1
JOIN tt inner2 ON (inner1.d1 = inner2.d1 AND inner1.d2 < inner2.d2)
JOIN tt inner3 ON (inner2.d1 = inner3.d1 AND inner2.d2 < inner3.d2)
WHERE inner1.d1 = outerTable.d1
)
)
ORDER BY d1, d2 DESC;

Robin Vickery said...

How about this one:

SELECT a.* FROM tt a, tt b
WHERE a.d1 = b.d1
GROUP BY a.id
HAVING SUM(a.d1 = b.d1 AND b.d2 > a.d2) < 3
ORDER BY a.d1 ASC, a.d2 DESC;

Or is there something horrible I'm missing?

-robin

Roland Bouman said...

Hi Markus,

you want the '...top 3 values..'. Well, if we could assume that (d1, d2) is unique, the solution is simple:

select t1.id, t1.d1, t1.d2
from tt as t1
where 3 > (
select count(id)
from tt as t2
where t2.d1 = t1.d1
and t2.d2 > t1.d2
)
order by t1.d1, t1.d2 desc
;

+-------+----+--------+
| id | d1 | d2 |
+-------+----+--------+
| 4369 | 1 | 995815 |
| 6471 | 1 | 994064 |
| 7358 | 1 | 993948 |
| 5973 | 2 | 997576 |
| 9513 | 2 | 992569 |
| 8241 | 2 | 987857 |
| 10793 | 3 | 995166 |
| 9687 | 3 | 992792 |
| 12900 | 3 | 992669 |
..
..

Of course, this query only returns the 3 rows with the highest value which is not the same as the three top values (2 or even 3 rows having the highest value might all have the same value)

if course, I don't think it's fair to assume unique values in d2 for a given d1. So, I will think a little more and try to come up with a query that literally does what you want.

Roland Bouman said...

LOL! nice one robin...but it has the same flaw as my subquery solution: It returns the three rows that have the highest value - not the three highest values.

Of course, your JOIN solution is nicer than my subquery. Props to you!

Scott Noyes said...

Could you fix both the suggestions by using >= to compare d2, and add in a t1.id != t2.id?

Markus Popp said...

Great job, guys :-)!

Roland Bouman said...

Just curious Markus,

in retrospect, I realize that you procedure does the same thing as Robin's query (and mine as well), that is: find the three rows that have the highest values - NOT find the three highest values, regardless of the rows that happen to have that value.

What was your requirement really? Finding the 3 rows with the highest value, or finding the 3 distinct highest values?

Markus Popp said...

Hi Roland,

it seems, you got it right - the three highest (not only distinct) values from d2 for each of the 20 d1 values (if there are e.g. two values with 999,999, they should show up twice).

If there are two equal values on 3rd (and 4th) position, I don't mind whether both (so in total four values) should show up or only three. It's more about the connection, how to get all d1 values, but only the top 3 of the d2 values (for each of the d1's).

masQuerade said...

Probably not for this problem but a logical solution for a mystical problem in mysql older versions..


http://formyhelp.blogspot.com/2008/03/error1235-this-version-of-mysql-doesnt.html