Using Median Along Side Max, Min, And Avg Functions In Mysql
Solution 1:
You can calculate the median with GROUP BY in MySQL even though there is no median function built in.
Consider the table:
Acrington 200.00
Acrington 200.00
Acrington 300.00
Acrington 400.00
Bulingdon 200.00
Bulingdon 300.00
Bulingdon 400.00
Bulingdon 500.00
Cardington 100.00
Cardington 149.00
Cardington 151.00
Cardington 300.00
Cardington 300.00
For each row you can count the number of similar items that are less. You can also count how many values are less than or equal:
name v < <=
Acrington 200.00 0 2
Acrington 200.00 0 2
Acrington 300.00 2 3
Acrington 400.00 3 4
Bulingdon 200.00 0 1
Bulingdon 300.00 1 2
Bulingdon 400.00 2 3
Bulingdon 500.00 3 4
Cardington 100.00 0 1
Cardington 149.00 1 2
Cardington 151.00 2 3
Cardington 300.00 3 5
Cardington 300.00 3 5
With query
SELECT name,v, (SELECTCOUNT(1) FROM sale WHERE v<o.v AND name=o.name) as ls
, (SELECTCOUNT(1) FROM sale WHERE v<=o.v AND name=o.name) as lse
FROM sale o
The median value will occur when the less-than-or-equal count is half the number of items
Acrington has 4 items. Half of this is 2 which is in the range 0..2 (corresponding to 200.00) and also in the range 2..3 (corresponding to 300.00)
Bullingdon also has 4 items. 2 is in the range 1..2 (value 300.00) and 2..3 (value 400.00)
Cardington has 5 items. The value 2.5 is between 2 and 3 which corresponds to Cardington 151.
The median value is the mean of the min and max values returned by:
SELECT cs.name,v
FROM
(SELECT name,v, (SELECT COUNT(1) FROM sale WHERE v<o.v AND name=o.name) as ls
, (SELECT COUNT(1) FROM sale WHERE v<=o.v AND name=o.name) as lse
FROM sale o) cs JOIN
(SELECT name,COUNT(1)*.5as cn
FROM sale
GROUPBY name) cc ON cs.name=cc.name
WHERE cn between ls and lse
Which gives:
Acrington 200.00
Acrington 200.00
Acrington 300.00
Bulingdon 300.00
Bulingdon 400.00
Cardington 151.00
Finally we can get the median:
SELECT name,(MAX(v)+MIN(v))/2FROM
(SELECT cs.name,v
FROM
(SELECT name,v, (SELECT COUNT(1) FROM sale WHERE v<o.v AND name=o.name) as ls
, (SELECT COUNT(1) FROM sale WHERE v<=o.v AND name=o.name) as lse
FROM sale o) cs JOIN
(SELECT name,COUNT(1)*.5as cn
FROM sale
GROUPBY name) cc ON cs.name=cc.name
WHERE cn between ls and lse
) AS medians
GROUPBY name
Giving
Acrington 250.000000
Bulingdon 350.000000
Cardington 151.000000
Solution 2:
The only way I found to do this is through string manipulation:
with GROUP_CONCAT a list of all value is created then with indented SUBSTRING_INDEX the median value is taken
SELECTcount(*) AS `# of Data points`,
name,
max((QNTY_Sell/QNTYDelivered)*1000) AS `MAX Thousand Price`,
min((QNTY_Sell/QNTYDelivered)*1000) AS `MIN Thousand Price`,
avg((QNTY_Sell/QNTYDelivered)*1000) AS `MEAN Thousand Price`
, CASE (count(*) %2)
WHEN1THEN SUBSTRING_INDEX(
SUBSTRING_INDEX(
group_concat((QNTY_Sell/QNTYDelivered)*1000ORDERBY (QNTY_Sell/QNTYDelivered)*1000 SEPARATOR ',')
, ',', (count(*) +1) /2)
, ',', -1)
ELSE (SUBSTRING_INDEX(
SUBSTRING_INDEX(
group_concat((QNTY_Sell/QNTYDelivered)*1000ORDERBY (QNTY_Sell/QNTYDelivered)*1000 SEPARATOR ',')
, ',', count(*) /2)
, ',', -1)
+ SUBSTRING_INDEX(
SUBSTRING_INDEX(
group_concat((QNTY_Sell/QNTYDelivered)*1000ORDERBY (QNTY_Sell/QNTYDelivered)*1000 SEPARATOR ',')
, ',', (count(*) +1) /2)
, ',', -1)) /2END median
FROM
sales
WHEREyear(date) >=2012AND
name LIKE "%art.%" AND
QNTYDelivered >0AND
QNTY_Sell >0GROUPBY name
ORDERBY name;
The CASE is needed to check if we have a single median value, with an odd number of values, or two median values, with an even number of values, in the second case the median is the mean of the two values founded.
Post a Comment for "Using Median Along Side Max, Min, And Avg Functions In Mysql"