Skip to content Skip to sidebar Skip to footer

How To Group Distinct Values And Calculate Fields In One SQL Query

This is only my second time using Stack Overflow so I'm open to any constructive criticism on how to better format my questions. I have a list of orders and I'm wanting to create a

Solution 1:

Below is for BigQuery Standard SQL

If you know in advance product names (like '1', '2', '3' in your example) and there are just few - you can use below simple version

#standardSQL
SELECT name,
  MAX(product = '1') AS has1,
  MAX(product = '2') AS has2,
  MAX(product = '3') AS has3
FROM `project.dataset.table`
GROUP BY name   

If to apply to sample data from your question (I assume your product are of string data type here)

WITH `project.dataset.table` AS (
  SELECT 'Andy' name, '1'  product UNION ALL
  SELECT 'Bill', '2' UNION ALL
  SELECT 'Cole', '2' UNION ALL
  SELECT 'Andy', '2' UNION ALL
  SELECT 'Bill', '1' UNION ALL
  SELECT 'Cole', '2' UNION ALL
  SELECT 'Dave', '3' 
)

result is

Row name    has1    has2    has3     
1   Andy    true    true    false    
2   Bill    true    true    false    
3   Cole    false   true    false    
4   Dave    false   false   true       

In case if product names are not known in advance and/or number of products more than just few - below version can be handy

EXECUTE IMMEDIATE '''
SELECT name,''' || (
  SELECT STRING_AGG(DISTINCT "MAX(product = '" || product || "') AS has" || product)
  FROM `project.dataset.table`
) || '''  
FROM `project.dataset.table`
GROUP BY name
'''   

with exact same output

As you can see here - whole query is assembled dynamically so you don't need to worry about number of products and their names

Below version is identical to above, but easier to manage/read

EXECUTE IMMEDIATE FORMAT('''
SELECT name, %s 
FROM `project.dataset.table`
GROUP BY name
''', (
  SELECT STRING_AGG(DISTINCT "MAX(product = '" || product || "') AS has" || product)
  FROM `project.dataset.table`
))

Solution 2:

You can use aggregation and simple logic:

select name,
       countif(product = 1) > 0 as has_1,
       countif(product = 2) > 0 as has_2,
       countif(product = 3) > 0 as has_3
from t
group by name;

You can also use logical_or():

select name,
       logic_or(product = 1) as has_1,
       logic_or(product = 2) as has_2,
       logic_or(product = 3) as has_3
from t
group by name;

Post a Comment for "How To Group Distinct Values And Calculate Fields In One SQL Query"