How to group by “nothing” in SQL
Here is a Masterclass special from your SQL trainer, Lukas Eder. What’s the point of the empty grouping set, GROUP BY (), which the SQL standard offers as part of GROUPING SETS? Eder demonstrates a pretty subtle effect of using this feature.
SELECT count(*) FROM film GROUP BY ()
This will yield:
count |
------|
 1000 |
What’s the point, you’re asking? Can’t we just omit the GROUP BY clause? Of course, this will yield the same result:
SELECT count(*) FROM film
Yet, the two versions of the query are subtly different. The latter will always return exactly one row. The former will perform grouping and return all the groups. How is this different? Just add a predicate!
SELECT count(*) FROM film WHERE 1 = 0 GROUP BY ();
SELECT count(*) FROM film WHERE 1 = 0;
Now, the first query will produce nothing!
count |
------|
Whereas the second one produces:
count |
------|
    0 |
Subtle, eh? Note that DB2, Oracle and SQL Server expose the above behavior, whereas PostgreSQL does not: it seems to implement the SQL standard and always produces a row, even for empty input, as shown by Markus Winand:
How to Group By “Nothing” in SQL https://t.co/EljZPVMHZ5
— jOOQ (@JavaOOQ) May 25, 2018
In SQL:1999 (when it was introduced), the <empty grouping set> was called <grand total>, akin to a grand total that can be calculated in a Microsoft Excel Pivot Table. It does make more sense for grand totals to always be present in the result, despite the absence of any input data.
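The difference between the two queries can be pictured with a small model in plain Python. This is only a sketch of the behavior described above for DB2, Oracle and SQL Server, not of how any database actually evaluates the query. The function names and the `films` list are hypothetical stand-ins for illustration.

```python
from collections import Counter

def count_without_group_by(rows):
    """Model of SELECT count(*) FROM t: always exactly one result row."""
    return [len(rows)]

def count_group_by_empty_set(rows):
    """Model of GROUP BY () as described for DB2, Oracle and SQL Server:
    every row lands in the single group keyed by the empty tuple, and only
    groups that actually contain rows are returned."""
    groups = Counter(() for _ in rows)
    return list(groups.values())

# Hypothetical stand-in for the film table (1000 rows).
films = [{"film_id": i} for i in range(1, 1001)]
# Equivalent of WHERE 1 = 0: no row survives the filter.
no_films = [row for row in films if False]

print(count_without_group_by(films), count_group_by_empty_set(films))
print(count_without_group_by(no_films), count_group_by_empty_set(no_films))
```

With 1000 rows both functions return a single count. With no rows, the ungrouped version still returns one row containing 0, while the grouped version returns nothing at all, because there is no group to report. A model of the PostgreSQL behavior would return one row in both cases.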
What if your database doesn’t support grouping sets?
Not all databases support the awesome GROUPING SETS feature. Among the ones supported by jOOQ, these do:
- DB2 LUW
- PostgreSQL 9.5+
- SQL Server
- Sybase SQL Anywhere
Note that some other databases only support a vendor-specific syntax for ROLLUP (MySQL’s WITH ROLLUP, for example), which doesn’t help with the empty grouping set.
So, can we emulate it for the other databases?
Of course. There are two ways to emulate the empty grouping set:
By using a constant
You could try using a constant literal:
SELECT count(*) FROM film WHERE 1 = 0 GROUP BY 'a';
Sometimes, the database will not accept a constant literal there, and you’ll have to trick it into thinking the expression is not constant:
SELECT count(*) FROM film WHERE 1 = 0 GROUP BY 'a' || 'b';
And if that’s also not supported, try wrapping the literal in a subquery:
SELECT count(*) FROM film WHERE 1 = 0 GROUP BY (SELECT 1);
One of the above three syntaxes is usually accepted by databases that lack the empty grouping set.
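To try the constant-literal variant without installing a database server, here is a minimal sketch using Python's built-in `sqlite3` module. SQLite is not one of the databases discussed in this article; it is used here only as an assumption-laden stand-in that happens to accept `GROUP BY 'a'`. The one-column `film` table is a hypothetical placeholder for the real one.

```python
import sqlite3

# In-memory database with a hypothetical, minimal film table (1000 rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO film (film_id) VALUES (?)",
    [(i,) for i in range(1, 1001)],
)

# Grouping by a constant puts every row into the same single group.
grouped_all = conn.execute(
    "SELECT count(*) FROM film GROUP BY 'a'"
).fetchall()

# With an always-false predicate there are no rows, hence no groups.
grouped_none = conn.execute(
    "SELECT count(*) FROM film WHERE 1 = 0 GROUP BY 'a'"
).fetchall()

# Without GROUP BY, the aggregate still yields exactly one row.
ungrouped_none = conn.execute(
    "SELECT count(*) FROM film WHERE 1 = 0"
).fetchall()

print(grouped_all, grouped_none, ungrouped_none)
conn.close()
```

In SQLite, the grouped query over an empty input returns no rows, while the ungrouped one returns a single row with a count of 0, which matches the emulated behavior described above. Other databases may need one of the other two syntaxes.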
By using a dummy table
In rare cases, none of the above works as the database’s SQL parser tries to be “clever” and rejects my silly attempts to fool it. But no one can fool me!
Again, Microsoft SQL Data Warehouse – you cannot fool me with your lack of functionality. I want to GROUP BY () (the empty grouping set), and I will! pic.twitter.com/cYePMpL58I
— Lukas Eder (@lukaseder) May 25, 2018
I’ll just cross join whatever is in the FROM clause with a dummy table (akin to an emulation of table dee) and then group by the dummy table’s column:
SELECT count(*) FROM film, (SELECT 1 x) dummy WHERE 1 = 0 GROUP BY dummy.x;
This is guaranteed to work, including on these databases:
- SQL Data Warehouse
- Sybase ASE
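The dummy table emulation can be tried out the same way. The sketch below again uses Python's built-in `sqlite3` module purely as a convenient test bed (an assumption of this example, since SQLite is not among the databases listed above), with a hypothetical one-column `film` table.

```python
import sqlite3

# In-memory database with a hypothetical, minimal film table (1000 rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO film (film_id) VALUES (?)",
    [(i,) for i in range(1, 1001)],
)

# Cross join with a one-row derived table and group by its only column.
# The join does not change the row count because dummy has exactly one row.
dummy_all = conn.execute(
    "SELECT count(*) FROM film, (SELECT 1 x) dummy GROUP BY dummy.x"
).fetchall()

# With an always-false predicate, no joined rows remain, so no group exists.
dummy_none = conn.execute(
    "SELECT count(*) FROM film, (SELECT 1 x) dummy "
    "WHERE 1 = 0 GROUP BY dummy.x"
).fetchall()

print(dummy_all, dummy_none)
conn.close()
```

Because the derived table contributes exactly one row, the cross join leaves the count untouched, and grouping by `dummy.x` is a real column reference that a parser has no reason to reject as a constant.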
Needless to say, jOOQ supports this emulation. You can play around with it here.