Monday, January 02, 2006

Hail to GROUP_CONCAT!

This blog post has moved. Please find it at:

http://www.mpopp.net/2006/01/hail-to-group_concat/.

11 comments:

gmax said...

I have had GROUP_CONCAT in my bag of tricks for quite a while, and I agree with your enthusiasm.
Be careful with GROUP_CONCAT, though, because it can bite you if you don't take into account its limits.

Try this:

select group_concat(column_name) from information_schema.columns where table_schema='mysql'\G

That will return a partial string and a warning.

show warnings\G *************************** 1. row ***************************
Level: Warning
Code: 1260
Message: 1 line(s) were cut by GROUP_CONCAT()


The reason of the warning is in GROUP_CONCAT maximum length.

select @@group_concat_max_len\G
*************************** 1. row ***************************
@@group_concat_max_len: 1024

If you set @@group_concat_max_len to a large enough value, it will accomodate your data. You have just to keep this in mind.

Giuseppe

David Fetter said...

As is often the case, PostgreSQL has a more flexible solution with
fewer gotchas, but it takes more work and/or specialized knowlege to
find it.

Let's say you've got your array_accum_test table. Now, do
CREATE AGGREGATE array_accum (
sfunc = array_append,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);

per http://www.postgresql.org/docs/current/static/xaggr.html

You only need to do this once.

Next,

SELECT
_key,
count(*) AS "cnt",
array_to_string( /* built-in */
array_accum(_value)
, ', '
)
FROM
array_accum_test
GROUP BY _key
ORDER BY _key;

If you insist on doing this only using built-ins (But why? PostgreSQL
is built to be extensible for a reason!), and you're willing to take a small speed penalty, you can do it like this:


SELECT
t._key,
count(t.*) AS "cnt",
array_to_string( /* built-in */
ARRAY(
SELECT _value
FROM array_accum_test
WHERE _key = t._key
ORDER BY _value /* pwnz0r3d!
Try that with group_concat! */
)
, ', '
)
FROM
array_accum_test AS "t"
GROUP BY t._key
ORDER BY t._key;

Markus Popp said...

Thank you guys for the additional infos :-).

David Fetter said...

Stop in at irc://irc.freenode.net/postgresql if you like. There are lots of friendly, helpful people there. :)

Paul Carey said...

A litte problem I had with the group_concat function was when selecting integers only I was getting a blob instead of a string to solve this I used a cast to convert the integers first:

SELECT GROUP_CONCAT(CAST(myInt as CHAR)) myInts FROM aTable;

Anonymous said...

However, I have found that this function actually omits values when aggregating them. For example, I would like to aggregate the sequence of status changes (like "created","in process", "on-hold" etc) for a purchase order. Now I have discovered that sometimes it simply omits values. There seems to be no pattern to which values are lost - it really looks random. Trawling through the bug reports from MySQL has not really helped. Shame really, since th function is just great.

Artem Russakovskii said...

Anonymous, are you sure you're using a LEFT JOIN rather than an INNER JOIN? This could be why some values are omitted.

Tudor Olariu said...

For David Fetter:

I beg to differ: Your postgres solution is much less versatile. All your Postgres code (including your aggregate function and the two selects) can be easily replaced by:

select group_concat(name order by surname) from persons;

GaneshXL said...

Thanks a lot for sharing this. I've discovered this post after googling for "crosstab"; glad I found it. This will indeed make lots of things easier in the future :)

Anonymous said...

I like the idea of GROUP_CONCAT(), but unfortunately its implementation is filled with bugs. See the MySQL Changelogs. Garbled text (!!!), mixed charsets, missing values, segfaults, etc. I've been bitten by bugs in GROUP_CONCAT(), and even as I write this the 5.0.32 version that comes in Debian stable still has serious bugs in it.

Vishy said...

This is my function

CREATE OR REPLACE FUNCTION _group_concat(text, text)
RETURNS text AS $$
SELECT CASE
WHEN $2 IS NULL THEN $1
WHEN $1 IS NULL THEN $2
ELSE $1 operator(pg_catalog.||) '::::' operator(pg_catalog.||) $2
END
$$ IMMUTABLE LANGUAGE SQL;


CREATE AGGREGATE group_concat(
BASETYPE = text,
SFUNC = _group_concat,
STYPE = text
);


On doing select array_to_string(group_concat(value), ',') from data where id = 368467 group by id;

I'm getting an error saying
"No function matches the given name and argument types. You may need to add explicit type casts."


Please let me know what i'm doing wrong here...

Thanks,
Vishy