gamlss vs. mgcv conflicting results of KS test #454

Open

florianhartig opened this issue Nov 28, 2024 · 1 comment

@florianhartig (Owner)
Here is an example from a question I received via email, showing that the KS test p-value for a gamlss model differs from the KS test on an equivalent mgcv model. The question is why, and whether this indicates a problem.

library(DHARMa)
library(gamlss)
library(mgcv)

# Simulation function: draw new responses from the fitted gamlss model
simulate_mod <- function(model, n = 1000) { # n = default number of simulations in DHARMa
  fam <- model$family[1]                    # family name, e.g. "NBI"
  random_generator <- get(paste0("r", fam)) # matching r<family>() generator from gamlss.dist
  pred <- predict(model, type = "response") # fitted mu for each observation
  nObs <- length(pred)
  sim <- matrix(nrow = nObs, ncol = n)
  for(i in 1:n) sim[,i] <- random_generator(nObs, pred) # note: sigma is left at its default
  return(sim)
}

# Sample model - Negative Binomial (gamlss family NBI)
set.seed(2024)
y <- rNBI(1000)
mod <- gamlss(y ~ 1, family = "NBI", data = as.data.frame(y))

sim <- simulate_mod(mod)

DHARMa_res <- createDHARMa(simulatedResponse = sim,
                           observedResponse = eval(mod$call$data) |> dplyr::pull(toString(mod$call$formula[[2]])),
                           fittedPredictedResponse = predict(mod),
                           integerResponse = T)

testUniformity(DHARMa_res)

## mgcv equivalent
mod2 <- gam(y ~ 1, family = nb)
simulationOutput <- simulateResiduals(fittedModel = mod2, n = 1000)
testUniformity(simulationOutput)
@florianhartig (Owner, Author) commented Nov 28, 2024

OK, this is not a problem or an error. You have to consider that randomized quantile residuals are, as the name says, randomized. The randomization smooths out the integer-valued observations and is done internally in DHARMa.
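To illustrate the principle (this is only a sketch of the idea, not DHARMa's internal code): for an integer observation y, the quantile residual is drawn uniformly between the predicted CDF just below y and the CDF at y. Using stats::pnbinom with size = 1/sigma as the equivalent of the NBI distribution with mu = sigma = 1:

y_obs <- rNBI(10)                           # ten NBI counts, mu = sigma = 1
lo <- pnbinom(y_obs - 1, size = 1, mu = 1)  # CDF just below each observation
hi <- pnbinom(y_obs, size = 1, mu = 1)      # CDF at each observation
runif(10, min = lo, max = hi)               # randomized quantile residuals - different on every call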

In principle, each time you create a residual plot, you would get slightly different values. This is also not solved by setting a larger n, as the randomization is over the observations, not the simulations (i.e. it would be reduced by increasing the number of observations, but not the number of simulations). To avoid confusing users, DHARMa fixes the random seed while calculating the residuals, but you can override this and play around with the randomisation by trying out different values of the seed in

simulationOutput <- simulateResiduals(fittedModel = mod2, seed = 2)
testUniformity(simulationOutput)

Running this will allow you to get an appreciation of the "natural variation" of the randomized quantile residuals.
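For example, something along these lines (just a sketch; testUniformity() returns the ks.test result, so the p-value can be extracted directly):

pvals <- sapply(1:20, function(s) {
  res <- simulateResiduals(fittedModel = mod2, n = 1000, seed = s)
  testUniformity(res, plot = FALSE)$p.value
})
summary(pvals) # spread of the KS p-values that is due to the randomization alone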

This "fixing" of the seed will, however, not work across different model packages, as they don't all do the simulations in exactly the same way. Therefore, you may get slightly different residuals when you use different regression packages on the same data. In your case, you wrote the simulation function by hand, so you can just also just rerun the block

sim <- simulate_mod(mod)
DHARMa_res <- createDHARMa(simulatedResponse = sim,
                           observedResponse = eval(mod$call$data) |> dplyr::pull(toString(mod$call$formula[[2]])),
                           fittedPredictedResponse = predict(mod),
                           integerResponse = T)

and you will get slightly different residuals.

Note that this is not an issue - the variation in the residuals is accounted for in the DHARMa tests, and as you see, although you get slightly different values, the overall result is correct: in most cases (I would expect around 95%) the KS test is n.s.

I would only see a reason for concern if you note that type I error rates are different between packages or wrong for a particular package.
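If you wanted to check this, a rough sketch would be to repeatedly simulate data from the true model, refit, and record how often the KS test rejects (the number of replicates is kept small here for speed; in practice you would use more):

reject <- replicate(100, {
  ysim <- rNBI(1000)                                   # new data from the true model
  fit <- gam(ysim ~ 1, family = nb)
  res <- simulateResiduals(fittedModel = fit, n = 250)
  testUniformity(res, plot = FALSE)$p.value < 0.05
})
mean(reject) # should be close to the nominal 0.05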
