Skip to content Skip to sidebar Skip to footer

Distinct Values From An Array?

Following tables: CREATE TEMPORARY TABLE guys ( guy_id integer primary key, guy text ); CREATE TEMPORARY TABLE sales ( log_date date, sales_guys integer[], sales smallint ); INSERT

Solution 1:

There is no kind of order you can trust without ORDER BY. Except that elements of arrays, when unnested, come in array order. If your query does more with the result, it may be re-ordered, though.

You an simply add ORDER BY to any aggregate function in Postgres:

SELECT s.sales_guys, ARRAY_AGG(DISTINCT g.guy ORDERBY g.guy) AS names, SUM(s.sales) AS sum_sales
FROM   sales s
JOIN   guys  g ON g.guy_id = ANY(s.sales_guys)
GROUPBY s.sales_guys;

But that's obviously not the original order of array elements. And the query has other issues ... Neither IN nor = ANY() care about order of elements in the set, list or array on the right side:

Proper solution

For this task (attention to the details!):

Get the total sales per array sales_guys, where the order of elements makes a difference (arrays '{1,2}' and '{2,1}' are not the same) and sales_guys has neither duplicate nor NULL elements. Add an array of resolved names in matching order.

Use unnest() with WITH ORDINALITY. and aggregate arrays before you resolve names, that's cheaper and less error prone.

SELECT s.*, g.
FROM  (
   SELECT sales_guys, sum (sales) AS total_sales                -- aggregate first in subqueryFROM   sales
   GROUPBY1
   ) s
, LATERAL (
   SELECTarray_agg(guy ORDERBY ord) AS names                  -- order by original orderFROMunnest(s.sales_guys) WITH ORDINALITY sg(guy_id, ord)  -- with order of elementsLEFTJOIN guys g USING (guy_id)                            -- LEFT JOIN to add NULL for missing guy_id
   ) g;

The LATERAL subquery can be joined with unconditional CROSS JOIN - comma (,) is shorthand notation - because the aggregate in the subquery guarantees a result for every row. Else you'd use LEFT JOIN LATERAL .. ON true.

Detailed explanation:

Solution 2:

You can use DISTINCT inside of the aggregate:

SELECT sales_guys, ARRAY_AGG(DISTINCT guy), SUM(sales) AS sales FROM sales JOIN guys ON guys.guy_id =ANY(sales.sales_guys) GROUPBY sales_guys;

Post a Comment for "Distinct Values From An Array?"