add dynamic dof selection for ljung_box feature for both single and multiple models #143

@ghost

Description

Unless I am mistaken, the ljung_box feature requires the dof and lag arguments to be specified manually whenever you need something other than the defaults, which are 0 and 1 respectively. This is a problem when a mable contains models with different numbers of estimated parameters, because dof should then differ between models. In that case, you'd want the ljung_box feature to calculate the statistic and p-value using the dof appropriate to each model.
Example:

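# packages used by the example
library(dplyr)
library(tsibble)
library(tsibbledata)
library(fable)
library(feasts)

# subset data for training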
train <- aus_production %>%
  filter_index("1992 Q1" ~ "2006 Q4")

# Create models 
beer_fit <- train %>%
  model(
    Mean = MEAN(Beer),
    `Seasonal naïve` = SNAIVE(Beer)
  )

# check how many estimated parameters each model has, if any. Only `Mean` will show
# as having at least 1 parameter
beer_fit %>%
  tidy() %>%
  group_by(.model) %>%
  count()

# get Ljung-Box information (lag and dof are left at their defaults)
beer_fit %>%
  augment() %>%
  features(.innov, ljung_box)

Note that the last command produces ljung_box output for both models with a dof value of 0, when Mean should use a dof of 1 and Seasonal naïve should use a dof of 0.
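To illustrate why dof matters, here is a minimal sketch using base R's `Box.test()`, whose `fitdf` argument plays the role of dof. It uses simulated residuals rather than the Beer data, so the numbers themselves are not meaningful. The test statistic is the same either way; only the reference chi-squared distribution (lag - dof degrees of freedom) and therefore the p-value change.

```r
# Simulated stand-in residuals (an assumption, not the Beer residuals)
set.seed(1)
x <- rnorm(60)

# Same series and lag, different dof (called fitdf in Box.test)
lb0 <- Box.test(x, lag = 10, type = "Ljung-Box", fitdf = 0)
lb1 <- Box.test(x, lag = 10, type = "Ljung-Box", fitdf = 1)

# The statistic does not depend on dof
same_stat <- isTRUE(all.equal(unname(lb0$statistic), unname(lb1$statistic)))

# The p-value is the upper tail of a chi-squared with lag - dof degrees of freedom
p_manual <- pchisq(unname(lb1$statistic), df = 10 - 1, lower.tail = FALSE)

print(c(p_dof0 = lb0$p.value, p_dof1 = lb1$p.value, p_manual = p_manual))
```

With dof set too low, the p-value is overstated, which is why using 0 for every model can be misleading.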

I believe this can be fixed with a relatively simple mapply() call. The following is only a rough draft and could obviously be improved on:

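ljung_box_mult <- function(dat, lag = 10) {

  # one row per model, with n = number of estimated parameters
  # (models that tidy() returns no rows for get n = 0)
  input <- dat %>%
    augment() %>%
    as_tibble() %>%
    distinct(.model) %>%
    left_join(
      dat %>%
        tidy() %>%
        count(.model),
      by = ".model"
    ) %>%
    mutate(n = if_else(is.na(n), 0L, n))

  # run ljung_box on each model's innovation residuals with its own dof
  output <- mapply(function(x, y) {
    dat %>%
      select(all_of(x)) %>%
      augment() %>%
      features(.innov, ljung_box, lag = lag, dof = y)
  }, input$.model, input$n, SIMPLIFY = FALSE)

  # stack the per-model results into a single table
  do.call(rbind, output)
}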

beer_fit %>%
  ljung_box_mult()
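The per-model pairing of residuals and dof can also be seen in isolation, without fable. This is only a sketch under stated assumptions: the residuals are simulated, the dof values are hard-coded instead of being counted from `tidy()`, and `lb_one()` is a hypothetical helper that uses base R's `Box.test()` in place of `ljung_box`.

```r
# Simulated residuals per model (an assumption, not real model output)
set.seed(42)
resids <- list(Mean = rnorm(60), `Seasonal naive` = rnorm(56))

# Number of estimated parameters per model, hard-coded for illustration
dof <- c(Mean = 1L, `Seasonal naive` = 0L)

# Hypothetical helper: Ljung-Box test for one model using that model's dof
lb_one <- function(name, lag = 10) {
  bt <- Box.test(resids[[name]], lag = lag, type = "Ljung-Box",
                 fitdf = dof[[name]])
  data.frame(
    .model    = name,
    lb_stat   = unname(bt$statistic),
    lb_pvalue = bt$p.value,
    dof       = dof[[name]]
  )
}

# One row per model, each tested against its own dof
result <- do.call(rbind, lapply(names(resids), lb_one))
print(result)
```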

If I am a bonehead and there is a way to do this already please let me know. If not, then I am open to suggestions on how this can be implemented/improved upon.
