harp wrapper for the tobac feature_detection_multithreshold() function — detect_features

This function detects features in a field as contiguous regions that lie above or below a threshold, depending on the value of the target argument. Note that 3D features are not yet implemented for this wrapper.
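As a rough illustration of the thresholding idea only, the base R sketch below builds the mask of cells that would fall inside a target region for a single threshold. The toy field and the threshold value are made up, and the sketch does not attempt the smoothing, region labelling, or position calculation that tobac performs. The thresholds argument is documented as inclusive, so cells equal to the threshold belong to the target region for both targets.

```r
# Toy 2D field (values are made up for illustration)
field <- matrix(
  c(1, 2, 2, 1,
    2, 5, 6, 2,
    1, 4, 5, 1,
    0, 1, 1, 0),
  nrow = 4, byrow = TRUE
)
threshold <- 5

# target = "max": cells greater than or equal to the threshold
mask_max <- field >= threshold

# target = "min": cells less than or equal to the threshold
mask_min <- field <= threshold

# Cells exactly equal to the threshold are selected in both cases
on_threshold <- mask_max & mask_min
```

Here mask_max marks the candidate region for maxima and mask_min the candidate region for minima; the two overlap only where the field equals the threshold.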

detect_features_multithreshold(
  field_data,
  thresholds,
  data_col = "gridded_data",
  dttm_col = "valid_dttm",
  target = c("max", "min"),
  position_threshold = c("centre", "extreme", "weighted_diff", "weighted_abs"),
  sigma_threshold = 0.5,
  n_erosion_threshold = 0,
  n_min_threshold = 0,
  min_distance = 0,
  feature_number_start = 1,
  pbc_flag = c("none", "h_dim1", "h_dim2", "both"),
  vertical_coord = NULL,
  vertical_axis = NULL,
  detect_subset = NULL,
  wavelength_filtering = NULL,
  dz = NULL,
  strict_thresholding = FALSE
)
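
A minimal sketch of a call, under stated assumptions: fcst is a hypothetical harp_grid_df (for example one read with harpIO::read_forecast()), and the threshold values and the minimum pixel count are illustrative rather than recommended settings. The call is guarded so that the snippet runs harmlessly when the wrapper or the data are not available in the session.

```r
# Illustrative threshold values only; choose values suited to your field.
thresholds <- c(1, 5, 10)

# `fcst` is a hypothetical harp_grid_df; it is not created here.
if (exists("detect_features_multithreshold") && exists("fcst")) {
  features <- detect_features_multithreshold(
    fcst,
    thresholds         = thresholds,
    target             = "max",           # detect maxima
    position_threshold = "weighted_diff", # one of the documented methods
    n_min_threshold    = 4                # assumed minimum feature size
  )
}
```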

Arguments

field_data: A harp_grid_df data frame such as one returned by the harpIO functions harpIO::read_grid() with data_frame = TRUE, harpIO::read_forecast(), or harpIO::read_analysis().
thresholds: Threshold values used to select target regions to track. The feature detection is inclusive of the threshold value(s), i.e. values greater/less than or equal are included in the target region. The target argument controls whether the detection is based on less than or greater than the threshold(s).
data_col: <tidy-select> The column in field_data containing the fields in which to detect features. Should be a <geolist> column. If the named column is not found in field_data, but field_data contains a single <geolist> column, that <geolist> column is used.
dttm_col: <tidy-select> The column in field_data containing the date-times to be used for the time dimension. Can be numeric with units in Unix time (seconds since 1970-01-01 00:00:00), or a <POSIXt> column. If the named column is not found in field_data, but field_data contains a single <POSIXt> column, that <POSIXt> column is used.
target: Flag to determine if tracking is targeting minima or maxima in the data. Should be "max" or "min". Default is "max".
position_threshold: Flag to choose the method used to set the position of the tracked feature. Can be one of "centre", "extreme", "weighted_diff", or "weighted_abs". Default is "centre", though "weighted_diff" is often preferable for atmospheric features.
sigma_threshold: Standard deviation for initial filtering step. Default is 0.5.
n_erosion_threshold: Number of pixels by which to erode the identified features. Default is 0.
n_min_threshold: Minimum number of identified contiguous pixels for a feature to be detected. Default is 0.
min_distance: Minimum distance between detected features (in metres). Default is 0.
feature_number_start: Feature id to start with. Default is 1.
pbc_flag: Sets whether to use periodic boundaries, and if so in which directions. "none" means no periodic boundaries, "h_dim1" means periodic along the first horizontal dimension (hdim_1), "h_dim2" means periodic along the second horizontal dimension (hdim_2), and "both" means periodic along both horizontal dimensions.
vertical_coord: Name of the vertical coordinate. If NULL, tries to auto-detect. It looks for the coordinate or the dimension name corresponding to the string.
vertical_axis: The vertical axis number of the data. If NULL, uses vertical_coord to determine axis. This must be >=0.
detect_subset: Whether to run feature detection on only a subset of the data. If this is not NULL, it should be a named list, and feature detection is restricted to the given range along each axis named in the list. The format of this list is: list(axis-number = c(start, end)), where axis-number is the number of the axis to subset, start is inclusive, and end is exclusive. For example, if your data are oriented as (time, z, y, x) and you want to only detect on values between z levels 10 and 29, you would set: list("1" = c(10, 30)). Note that this is not tested.
wavelength_filtering: Minimum and maximum wavelength for horizontal spectral filtering in metres as a 2 element vector. Default is NULL.
dz: Constant vertical grid spacing (metres). If not specified and the input is 3D, this function requires that altitude is available in the features input. If you specify a value here, this function assumes that it is the constant z spacing between points, even if vertical_coord is specified.
strict_thresholding: If TRUE, a feature can only be detected if all previous thresholds have been met. Default is FALSE.
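
The detect_subset range convention (inclusive start, exclusive end) differs from the inclusive ranges R users are used to, so the following base R sketch spells out which levels the documented example selects. It only manipulates the list and does not call the wrapper; the variable names are illustrative.

```r
# Documented example: data oriented as (time, z, y, x), subset the z axis.
# The z axis is named "1" in that example, which implies that axes are
# counted from 0.
detect_subset <- list("1" = c(10, 30))

z_range <- detect_subset[["1"]]
start   <- z_range[1]  # inclusive
end     <- z_range[2]  # exclusive

# Levels covered by the half-open range [start, end)
selected_levels <- seq(start, end - 1)
range(selected_levels)
length(selected_levels)
```

To cover levels 10 to 29 the end value has to be 30, one past the last level wanted.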

Value

A data frame of detected features.