Choosing between ASH, LBFP and GLBFP

This vignette compares the three estimator families exposed by the package. It is a practical guide rather than a universal ranking of methods.

  • ASH() and ASH_estimate() implement averaged shifted histogram estimates.
  • LBFP() and LBFP_estimate() implement linear blend frequency polygon estimates.
  • GLBFP() and GLBFP_estimate() implement general linear blend frequency polygon estimates.
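
For intuition about what a bin width b and a number of shifts m control, here is a minimal one-dimensional sketch of an averaged shifted histogram in base R. It illustrates the general idea only and is not the package's implementation; the function name ash_1d and its origin argument are hypothetical.

```r
# Naive 1-D averaged shifted histogram (illustrative, not from GLBFP):
# average m ordinary histograms of bin width b whose origins are
# shifted by b / m relative to one another.
ash_1d <- function(point, x, b, m, origin = 0) {
  n <- length(x)
  shifts <- origin + (seq_len(m) - 1) * b / m
  per_shift <- vapply(shifts, function(t0) {
    k <- floor((point - t0) / b)      # index of the bin containing `point`
    lower <- t0 + k * b               # left edge of that bin
    sum(x >= lower & x < lower + b) / (n * b)
  }, numeric(1))
  mean(per_shift)
}

x1 <- c(0.1, 0.2, 0.6, 1.4)
ash_1d(0.5, x1, b = 1, m = 1)  # m = 1 is an ordinary histogram
#> [1] 0.75
ash_1d(0.5, x1, b = 1, m = 2)  # average of two shifted histograms
#> [1] 0.625
```

With m = 1 the sketch reduces to a single histogram; larger m averages over more bin origins, which reduces the dependence of the estimate on where the bins happen to start.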

All estimators share the same basic inputs: the data, the bandwidth vector b, optional grid bounds, and, for ASH and GLBFP only, the shift vector m.

library(GLBFP)

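# Simulated bivariate sample: 200 draws with independent normal margins.
# No seed is set here, so rerunning gives slightly different numbers.
x <- cbind(rnorm(200), rnorm(200, sd = 1.25))
b <- c(0.75, 0.9)   # bandwidth, one entry per dimension
m <- c(2, 2)        # shift vector (ASH and GLBFP only)
point <- c(0, 0)    # location at which to evaluate the density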

fits <- list(
  ASH = ash(point, x, b = b, m = m),
  LBFP = lbfp(point, x, b = b),
  GLBFP = glbfp(point, x, b = b, m = m)
)

vapply(fits, function(z) z$estimation, numeric(1))
#>       ASH      LBFP     GLBFP 
#> 0.1240741 0.1395065 0.1349684

Grid estimates can be compared through the common *_estimate() interface.

grid_ash <- ash_estimate(x, b = b, m = m, grid_size = 15)
grid_lbfp <- lbfp_estimate(x, b = b, grid_size = 15)
grid_glbfp <- glbfp_estimate(x, b = b, m = m, grid_size = 15)

comparison <- data.frame(
  method = c("ASH", "LBFP", "GLBFP"),
  mean_density = c(
    mean(grid_ash$densities),
    mean(grid_lbfp$densities),
    mean(grid_glbfp$densities)
  ),
  max_density = c(
    max(grid_ash$densities),
    max(grid_lbfp$densities),
    max(grid_glbfp$densities)
  )
)

comparison
#>   method mean_density max_density
#> 1    ASH   0.02629630   0.1574074
#> 2   LBFP   0.02434245   0.1441655
#> 3  GLBFP   0.02439168   0.1476182

Practical starting rules

As a first pass:

  • use LBFP when a simple linear blend frequency polygon is sufficient; it needs only b, not m;
  • use GLBFP when a tunable shifted linear blend estimator is desired;
  • use ASH when an averaged shifted histogram representation is desired.

The bandwidth vector b usually matters more than small changes in m. Use compute_bi_optim() as a reproducible starting point, then check how the estimate changes for bandwidths somewhat smaller and larger than that value. This helper implements a plug-in bandwidth choice motivated by the optimal cell-width calculation for multivariate frequency polygons in Carbon and Duchesne (2024).
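
One way to inspect sensitivity is to re-evaluate an estimator at a fixed point while scaling b up and down. The sketch below is self-contained base R: sweep_bandwidth() is a hypothetical helper, and hist_at() is a plain histogram stand-in so that the example runs without the package. Neither is part of GLBFP. The helper only assumes that the estimator accepts (point, x, b = ...) and returns a single number.

```r
# Stand-in estimator (NOT part of GLBFP): multivariate histogram density
# at `point`, with bins of width b anchored at the origin.
hist_at <- function(point, x, b) {
  x <- as.matrix(x)
  lower <- floor(point / b) * b            # lower corner of the bin holding `point`
  above <- sweep(x, 2, lower, ">=")
  below <- sweep(x, 2, lower + b, "<")
  inside <- rowSums(above & below) == ncol(x)
  sum(inside) / (nrow(x) * prod(b))
}

# Hypothetical helper: re-evaluate `estimator` with b multiplied by each scale.
sweep_bandwidth <- function(estimator, point, x, b, scales = c(0.5, 1, 2), ...) {
  estimate <- vapply(
    scales,
    function(s) estimator(point, x, b = s * b, ...),
    numeric(1)
  )
  data.frame(scale = scales, estimate = estimate)
}

x2 <- cbind(c(0.1, 0.2, 0.6, 1.4), c(0.1, 0.3, 0.7, 0.2))
sens <- sweep_bandwidth(hist_at, point = c(0.5, 0.5), x = x2, b = c(1, 1))
sens$estimate  # 1, 0.75, 0.25 for scales 0.5, 1, 2
```

To apply the same sweep to a package estimator, one could pass a small wrapper that returns the $estimation element of a point estimate, for example function(point, x, b) lbfp(point, x, b = b)$estimation, following the calls shown earlier in this vignette. Large swings across nearby scales suggest that conclusions depend heavily on the bandwidth choice.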

For manuscript figures or numerical comparisons, report the selected b, the selected m, the grid definition, and the estimator family. This makes the result reproducible and avoids treating output produced with default settings as a statistical conclusion in itself.
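
A lightweight way to keep these settings attached to a result is to store them in one object and build the caption from it. The following is a sketch only: the field names are hypothetical, and the values are the ones used earlier in this vignette. Grid bounds would be added as a further field when they are set explicitly.

```r
# Hypothetical record of the settings behind one figure or table.
settings <- list(
  family = "GLBFP",
  b = c(0.75, 0.9),
  m = c(2, 2),
  grid_size = 15L
)

# Build a caption so the reported numbers always match what was run.
caption <- sprintf(
  "%s estimate, b = (%s), m = (%s), grid_size = %d",
  settings$family,
  paste(settings$b, collapse = ", "),
  paste(settings$m, collapse = ", "),
  settings$grid_size
)
caption
#> [1] "GLBFP estimate, b = (0.75, 0.9), m = (2, 2), grid_size = 15"
```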