To run a block bootstrap with bootstrap_verify, the blocks can be passed as a data frame with a "pool" column that tells bootstrap_verify how to pool the data into blocks. make_bootstrap_pools is a function that makes such a data frame.
Arguments
- .fcst
A harp_fcst object.
- pool_col
The column used to define the pools. Can be the column name, quoted or unquoted. If a variable, it should be embraced, i.e. wrapped in {{}}.
- pool_length
The length of a pool. Numeric, or a character string with a unit qualifier if pool_col is in date-time format. The unit qualifier can be: "s" = seconds, "m" = minutes, "h" = hours, "d" = days.
- overlap
Logical. Whether the pools should overlap.
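The unit qualifiers for pool_length can be illustrated with a small base R sketch. The helper pool_length_to_seconds is hypothetical and is not part of the package; it only shows one way a value such as "12h" could be resolved to a common unit (seconds is an assumption here), treating a bare number as hours in line with the documented default for date-time columns.

```r
# Hypothetical helper, NOT part of the package: resolve a pool_length such as
# "12h" or "2d" to seconds. A bare number is treated as hours, following the
# documented default for date-time columns.
pool_length_to_seconds <- function(pool_length) {
  if (is.numeric(pool_length)) {
    return(pool_length * 3600)
  }
  # Seconds per unit for the four documented qualifiers
  multipliers <- c(s = 1, m = 60, h = 3600, d = 86400)
  # Split into full match, numeric part and qualifier letter
  parts <- regmatches(
    pool_length,
    regexec("^([0-9]+\\.?[0-9]*)([smhd])$", pool_length)
  )[[1]]
  if (length(parts) != 3) {
    stop("pool_length must be numeric or a number followed by s, m, h or d")
  }
  as.numeric(parts[2]) * unname(multipliers[parts[3]])
}

pool_length_to_seconds("12h") # 43200
pool_length_to_seconds("2d")  # 172800
pool_length_to_seconds(6)     # 21600, a bare number is taken as hours
```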
Details
Typically block bootstrapping would be used if there are serial auto-correlations in the data. If, for example, auto-correlations are suspected between forecasts, pools could be defined from the fcdate column to create blocks of data where those auto-correlations are maintained.
Pools may be set to overlap, whereby a new pool is created beginning at each new value in pool_col. The length of a pool should be defined in the units used in pool_col. If pool_col is a date-time column, then pool_length is assumed to be in hours, though the units can be set by adding a qualifier letter: "s" = seconds, "m" = minutes, "h" = hours, "d" = days.
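The difference between overlapping and non-overlapping pools can be sketched in base R. This is an illustration of the idea only, not the package's implementation: sketch_pools is a hypothetical function, it assumes numeric values spaced one unit apart, and it assumes that overlapping pools which would run past the last value are dropped.

```r
# Illustrative sketch in base R; NOT the package's implementation.
# x stands in for the values of pool_col (assumed numeric, unit-spaced).
sketch_pools <- function(x, pool_length, overlap = FALSE) {
  vals <- sort(unique(x))
  if (!overlap) {
    # Consecutive, non-overlapping blocks of width pool_length
    pool <- floor((vals - min(vals)) / pool_length) + 1
    return(data.frame(value = vals, pool = pool))
  }
  # One pool starts at each unique value and spans pool_length units.
  # Assumption: pools that would run past the last value are dropped.
  starts <- vals[vals + pool_length - 1 <= max(vals)]
  pools <- lapply(seq_along(starts), function(i) {
    in_pool <- vals >= starts[i] & vals < starts[i] + pool_length
    data.frame(value = vals[in_pool], pool = i)
  })
  do.call(rbind, pools)
}

lead_time <- 0:23
head(sketch_pools(lead_time, 2), 4)                 # values 0, 1 in pool 1; 2, 3 in pool 2
nrow(sketch_pools(lead_time, 2, overlap = TRUE))    # 46
```

With overlap, every value except the first and last appears in two pools, which is why the overlapping data frame has more rows than there are unique values.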
Examples
make_bootstrap_pools(ens_point_df, lead_time, 2)
#> # A tibble: 24 × 2
#> lead_time pool
#> <dbl> <dbl>
#> 1 0 1
#> 2 1 1
#> 3 2 2
#> 4 3 2
#> 5 4 3
#> 6 5 3
#> 7 6 4
#> 8 7 4
#> 9 8 5
#> 10 9 5
#> # ℹ 14 more rows
make_bootstrap_pools(ens_point_df, lead_time, 2, overlap = TRUE)
#> # A tibble: 46 × 2
#> lead_time pool
#> <dbl> <int>
#> 1 0 1
#> 2 1 1
#> 3 1 2
#> 4 2 2
#> 5 2 3
#> 6 3 3
#> 7 3 4
#> 8 4 4
#> 9 4 5
#> 10 5 5
#> # ℹ 36 more rows
# pool_col as a variable
my_col <- "lead_time"
make_bootstrap_pools(ens_point_df, {{my_col}}, 2)
#> # A tibble: 24 × 2
#> lead_time pool
#> <dbl> <dbl>
#> 1 0 1
#> 2 1 1
#> 3 2 2
#> 4 3 2
#> 5 4 3
#> 6 5 3
#> 7 6 4
#> 8 7 4
#> 9 8 5
#> 10 9 5
#> # ℹ 14 more rows