Skip to content Skip to sidebar Skip to footer

Using Over(Partition By) When Calculating Median Moving Average Unit Cost

Good morning, I am trying to calculate a 12 month moving average cost (MAUC) for each item in a particular warehouse. I am using the 2012_B – paging trick to calculate the median

Solution 1:

This one caught my attention, so I'm posting two options:

The first is a straight cte approach, and the second uses temp tables. The cte approach is fine for smaller data sets, but performance suffers as the series expands.

Both options will calculate the RUNNING Min, Max, Mean, Median, and Mode for a data series

Just a couple of items before we get into it. The normalized structure is ID and Measure.
- The ID could be a date or identity.
- The Measure is any numeric value - Median is the mid-value of the sorted series. If an even number of observations we return the average of the two middle records - Mode is represented as ModeR1 and ModeR2. If no repeated values, we show the min/max range

OK, let's take a look at the cte Approach

Declare @Table table (ID Int,Measure decimal(9,2))
Insert into @Table (ID,Measure) values
(1,25),
(2,75),
(3,50),
(4,25),
(5,12),
(6,66),
(7,45)

;with cteBase as (Select *,RowNr = Row_Number() over (Order By ID) From  @Table),
      cteExpd as (Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) From cteBase A Join cteBase B on (B.RowNr<=A.RowNr)),
      cteMean as (Select ID,Mean=Avg(Measure2),Rows=Count(*) From cteExpd Group By ID),
      cteMedn as (Select ID,MedRow1=ceiling(Rows/2.0),MedRow2=ceiling((Rows+1)/2.0) From cteMean),
      cteMode as (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc) From cteExpd Group By ID,Measure2)
 Select A.ID
       ,A.Measure
       ,MinVal  = min(Measure2)
       ,MaxVal  = max(Measure2)
       ,Mean    = max(B.Mean)
       ,Median  = isnull(Avg(IIF(ExtRowNr between MedRow1 and MedRow2,Measure2,null)),A.Measure)
       ,ModeR1  = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
       ,ModeR2  = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
  From  cteExpd A
  Join  cteMean B on (A.ID=B.ID)
  Join  cteMedn C on (A.ID=C.ID)
  Join  cteMode D on (A.ID=D.ID and ModeRowNr=1)
  Group By A.ID
          ,A.Measure
  Order By A.ID

Returns

ID  Measure MinVal  MaxVal  Mean        Median      ModeR1  ModeR2
1   25.00   25.00   25.00   25.000000   25.000000   25.00   25.00
2   75.00   25.00   75.00   50.000000   50.000000   25.00   75.00
3   50.00   25.00   75.00   50.000000   50.000000   25.00   75.00
4   25.00   25.00   75.00   43.750000   37.500000   25.00   25.00
5   12.00   12.00   75.00   37.400000   25.000000   25.00   25.00
6   66.00   12.00   75.00   42.166666   37.500000   25.00   25.00
7   45.00   12.00   75.00   42.571428   45.000000   25.00   25.00

This cte approach is very light and fast for smaller data series

Now the Temp Table Approach

-- Generate Base Data -- Key ID and Key Measure
Select ID     =TR_Date
      ,Measure=TR_Y10,RowNr = Row_Number() over (Order By TR_Date)
 Into  #Base
 From [Chinrus-Series].[dbo].[DS_Treasury_Rates] 
 Where Year(TR_Date)>=2013

-- Extend Base Data one-to-many
Select A.*,Measure2 = B.Measure,ExtRowNr = Row_Number() over (Partition By A.ID Order By B.Measure) into #Expd From #Base A Join #Base B on (B.RowNr<=A.RowNr) 
Create Index idx on #Expd (ID)

-- Generate Mean for Series
Select ID,Mean=Avg(Measure2),Rows=Count(*) into #Mean From #Expd Group By ID
Create Index idx on #Mean (ID)

-- Calculate Median Row Number(s)  -- If even(avg of middle two rows)
Select ID,MednRow1=ceiling(Rows/2.0),MednRow2=ceiling((Rows+1)/2.0) into #Medn From #Mean
Create Index idx on #Medn (ID)

-- Calculate Mode
Select * into #Mode from (Select ID,Mode=Measure2,ModeHits=count(*),ModeRowNr=Row_Number() over (Partition By ID Order By Count(*) Desc,Measure2 Desc) From #Expd Group By ID,Measure2) A where ModeRowNr=1
Create Index idx on #Mode (ID)

-- Generate Final Results
 Select A.ID
       ,A.Measure
       ,MinVal  = min(Measure2)
       ,MaxVal  = max(Measure2)
       ,Mean    = max(B.Mean)
       ,Median  = isnull(Avg(IIF(ExtRowNr between MednRow1 and MednRow2,Measure2,null)),A.Measure)
       ,ModeR1  = isnull(max(IIf(ModeHits>1,D.Mode,null)),min(Measure2))
       ,ModeR2  = isnull(max(IIf(ModeHits>1,D.Mode,null)),max(Measure2))
  From  #Expd A
  Join  #Mean B on (A.ID=B.ID)
  Join  #Medn C on (A.ID=C.ID)
  Join  #Mode D on (A.ID=D.ID and ModeRowNr=1)
  Group By A.ID
          ,A.Measure
  Order By A.ID

Returns

ID          Measure MinVal  MaxVal  Mean    Median  ModeR1  ModeR2
2013-01-02  1.86    1.86    1.86    1.86    1.86    1.86    1.86
2013-01-03  1.92    1.86    1.92    1.89    1.89    1.86    1.92
2013-01-04  1.93    1.86    1.93    1.9033  1.92    1.86    1.93
2013-01-07  1.92    1.86    1.93    1.9075  1.92    1.92    1.92
2013-01-08  1.89    1.86    1.93    1.904   1.92    1.92    1.92
...
2016-07-20  1.59    1.37    3.04    2.2578  2.24    2.20    2.20
2016-07-21  1.57    1.37    3.04    2.257   2.235   2.61    2.61
2016-07-22  1.57    1.37    3.04    2.2562  2.23    2.20    2.20

Both approaches where validated in Excel

enter image description here

I should add that in the final query, you could certainly add/remove items like STD, Total


Post a Comment for "Using Over(Partition By) When Calculating Median Moving Average Unit Cost"