By default, harp stores ensemble data in wide data frames, with one column for each member of the ensemble. This isn't always ideal and goes against the principles of tidy data, under which each ensemble member would be stored on a separate row, with a single column denoting the ensemble member. pivot_members() pivots between the wide and long formats in either direction.

Usage

pivot_members(.data)

Arguments

.data

A harp data frame or a harp_list of data frames

Value

The same data frame or harp_list, but with the members pivoted.

Details

When pivoting from a wide to a long data frame, the class is updated to indicate that the ensemble members are stored in rows rather than columns (for example, harp_ens_point_df becomes harp_ens_point_df_long). When pivoting back to the wide format, the class is restored to its original name.
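
To make the wide and long layouts concrete, the following is a minimal sketch of the same kind of reshaping using tidyr on a toy tibble. It does not use harp, is not how pivot_members() is implemented, and performs none of the class handling described above; the column names and values are invented for illustration only.

```r
library(tibble)
library(tidyr)

# Toy wide ensemble: one column per member.
# Column names and values are hypothetical, not harp's own naming.
wide <- tibble(
  lead_time = c(0, 1, 2),
  mbr000    = c(1.2, 1.5, 1.9),
  mbr001    = c(0.8, 1.1, 2.3)
)

# Wide to long: one row per lead time and member, with a single
# "member" column identifying the ensemble member.
long <- pivot_longer(
  wide,
  cols      = starts_with("mbr"),
  names_to  = "member",
  values_to = "fcst"
)

# Long back to wide: one column per member again.
wide_again <- pivot_wider(long, names_from = member, values_from = fcst)
```

With harp data, pivot_members() performs the equivalent reshaping in a single call and also updates the class, so the manual tidyr calls above should not be needed.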

Examples

pivot_members(ens_point_df)
#> ::ensemble point forecast [[long]]:: # A tibble: 96 × 7
#>    sub_model fcst_dttm           lead_time valid_dttm            SID member
#>  * <chr>     <dttm>                  <dbl> <dttm>              <dbl> <chr> 
#>  1 point     2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001 mbr000
#>  2 point     2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001 mbr001
#>  3 point     2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001 mbr000
#>  4 point     2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001 mbr001
#>  5 point     2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001 mbr000
#>  6 point     2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001 mbr001
#>  7 point     2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001 mbr000
#>  8 point     2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001 mbr001
#>  9 point     2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001 mbr000
#> 10 point     2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001 mbr001
#> # ℹ 86 more rows
#> # ℹ 1 more variable: fcst <dbl>
pivot_members(ens_grid_df)
#> ::ensemble gridded forecast [[long]]:: # A tibble: 48 × 6
#>    sub_model fcst_dttm           lead_time valid_dttm          member      fcst
#>  * <chr>     <dttm>                  <dbl> <dttm>              <chr>  <geolist>
#>  1 grid      2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr000   [5 × 5]
#>  2 grid      2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr001   [5 × 5]
#>  3 grid      2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr000   [5 × 5]
#>  4 grid      2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr001   [5 × 5]
#>  5 grid      2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr000   [5 × 5]
#>  6 grid      2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr001   [5 × 5]
#>  7 grid      2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr000   [5 × 5]
#>  8 grid      2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr001   [5 × 5]
#>  9 grid      2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr000   [5 × 5]
#> 10 grid      2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr001   [5 × 5]
#> # ℹ 38 more rows
pivot_members(ens_point_list)
#>  a
#> ::ensemble point forecast [[long]]:: # A tibble: 96 × 8
#>    fcst_model sub_model fcst_dttm           lead_time valid_dttm            SID
#>  * <chr>      <chr>     <dttm>                  <dbl> <dttm>              <dbl>
#>  1 a          a         2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001
#>  2 a          a         2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001
#>  3 a          a         2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001
#>  4 a          a         2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001
#>  5 a          a         2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001
#>  6 a          a         2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001
#>  7 a          a         2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001
#>  8 a          a         2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001
#>  9 a          a         2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001
#> 10 a          a         2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001
#> # ℹ 86 more rows
#> # ℹ 2 more variables: member <chr>, fcst <dbl>
#> 
#>  b
#> ::ensemble point forecast [[long]]:: # A tibble: 96 × 8
#>    fcst_model sub_model fcst_dttm           lead_time valid_dttm            SID
#>  * <chr>      <chr>     <dttm>                  <dbl> <dttm>              <dbl>
#>  1 b          b         2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001
#>  2 b          b         2021-01-01 00:00:00         0 2021-01-01 00:00:00  1001
#>  3 b          b         2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001
#>  4 b          b         2021-01-01 00:00:00         1 2021-01-01 01:00:00  1001
#>  5 b          b         2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001
#>  6 b          b         2021-01-01 00:00:00         2 2021-01-01 02:00:00  1001
#>  7 b          b         2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001
#>  8 b          b         2021-01-01 00:00:00         3 2021-01-01 03:00:00  1001
#>  9 b          b         2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001
#> 10 b          b         2021-01-01 00:00:00         4 2021-01-01 04:00:00  1001
#> # ℹ 86 more rows
#> # ℹ 2 more variables: member <chr>, fcst <dbl>
#> 
pivot_members(ens_grid_list)
#>  a
#> ::ensemble gridded forecast [[long]]:: # A tibble: 48 × 7
#>    fcst_model sub_model fcst_dttm           lead_time valid_dttm          member
#>  * <chr>      <chr>     <dttm>                  <dbl> <dttm>              <chr> 
#>  1 a          a         2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr000
#>  2 a          a         2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr001
#>  3 a          a         2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr000
#>  4 a          a         2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr001
#>  5 a          a         2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr000
#>  6 a          a         2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr001
#>  7 a          a         2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr000
#>  8 a          a         2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr001
#>  9 a          a         2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr000
#> 10 a          a         2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr001
#> # ℹ 38 more rows
#> # ℹ 1 more variable: fcst <geolist>
#> 
#>  b
#> ::ensemble gridded forecast [[long]]:: # A tibble: 48 × 7
#>    fcst_model sub_model fcst_dttm           lead_time valid_dttm          member
#>  * <chr>      <chr>     <dttm>                  <dbl> <dttm>              <chr> 
#>  1 b          b         2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr000
#>  2 b          b         2021-01-01 00:00:00         0 2021-01-01 00:00:00 mbr001
#>  3 b          b         2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr000
#>  4 b          b         2021-01-01 00:00:00         1 2021-01-01 01:00:00 mbr001
#>  5 b          b         2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr000
#>  6 b          b         2021-01-01 00:00:00         2 2021-01-01 02:00:00 mbr001
#>  7 b          b         2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr000
#>  8 b          b         2021-01-01 00:00:00         3 2021-01-01 03:00:00 mbr001
#>  9 b          b         2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr000
#> 10 b          b         2021-01-01 00:00:00         4 2021-01-01 04:00:00 mbr001
#> # ℹ 38 more rows
#> # ℹ 1 more variable: fcst <geolist>
#> 

# Note the change in class
class(ens_point_df)
#> [1] "harp_ens_point_df" "harp_point_df"     "harp_df"          
#> [4] "tbl_df"            "tbl"               "data.frame"       
class(pivot_members(ens_point_df))
#> [1] "harp_ens_point_df_long" "harp_point_df"          "harp_df"               
#> [4] "tbl_df"                 "tbl"                    "data.frame"            
class(ens_grid_df)
#> [1] "harp_ens_grid_df" "harp_grid_df"     "harp_df"          "tbl_df"          
#> [5] "tbl"              "data.frame"      
class(pivot_members(ens_grid_df))
#> [1] "harp_ens_grid_df_long" "harp_grid_df"          "harp_df"              
#> [4] "tbl_df"                "tbl"                   "data.frame"