Plot a time series for point data — plot_station_ts

This function plots a time series of data from a harp_point_df data frame, or a harp_list containing harp_point_df data frames. The plotting is done using ggplot2 and thus uses some of the same terminology in its arguments. Data are plotted using "geoms", and the plots are divided into panels using facets. If the data contain a column of observed values these may also be included in the plot, and for ensemble data a best guess forecast may be derived from the data.

Usage

plot_station_ts(
  .data,
  SID,
  fcst_dttm,
  x_axis = "lead_time",
  fcst_geom = "line",
  fcst_geom_args = list(),
  fcst_colour_by = NULL,
  fcst_colours = NULL,
  obs_col = NULL,
  obs_geom = "point",
  obs_geom_args = list(),
  facet_by = NULL,
  num_facet_cols = 1,
  facet_scales = "free_y",
  smooth = FALSE,
  ...
)

# S3 method for harp_ens_point_df
plot_station_ts(
  .data,
  SID,
  fcst_dttm,
  x_axis = "lead_time",
  fcst_geom = "boxplot",
  fcst_geom_args = list(),
  fcst_colour_by = NULL,
  fcst_colours = NULL,
  obs_col = NULL,
  obs_geom = "point",
  obs_geom_args = list(),
  facet_by = NULL,
  num_facet_cols = 1,
  facet_scales = "free_y",
  smooth = FALSE,
  quantiles = NULL,
  best_guess = NULL,
  best_guess_geom = "line",
  best_guess_geom_args = list(),
  ...
)

Arguments

.data: A harp_point_df data frame, or a harp_list containing harp_point_df data frames.
SID: The ID of the station(s) to plot. If more than one SID is asked for then SID should be included in facet_by.
fcst_dttm: The start time(s) of the forecast to plot. If more than one fcst_dttm is asked for, fcst_dttm should be included in facet_by.
x_axis: The x axis of the plot.
fcst_geom: The geom to use to plot the forecast data (see details).
fcst_geom_args: Arguments to the fcst_geom geom as a named list.
fcst_colour_by: Which column in .data to use to control the colours of the forecast data.
fcst_colours: A vector of colours to use for the forecast data. It should be the same length as the number of colours to appear in the plot. Where the colours are controlled by the data, this can be a named vector, or a data frame with one column named after the column in the data that controls the colour and another column named "colour".
obs_col: If observations are to be included in the plot, the column containing the observations data.
obs_geom: The geom to use to plot the observations data.
obs_geom_args: Arguments to the obs_geom geom as a named list.
facet_by: The column(s) to use to facet the plot into panels.
num_facet_cols: The number of columns in a faceted plot.
facet_scales: Whether the scales of the panels should be fixed ("fixed") or free ("free", "free_x", "free_y"). Defaults to "free_y". See facet_wrap for more details.
smooth: For line and ribbon plots, whether to smooth the lines by drawing an X-spline relative to control points. In the background geom_linespline and geom_ribbonspline are used.
...: Other arguments passed to methods.
quantiles: For fcst_geom = "ribbon" or fcst_geom = "col", the quantiles used to stratify the probabilities of an ensemble forecast.
best_guess: What to plot as a "best guess" forecast. Can be any function as a character string that reduces a vector to a single value. Can also be an ensemble member as a numeric value or a string that is the same as the member in a harp_ens_point_df data frame that has had the members pivoted using pivot_members, e.g. "mbr000".
best_guess_geom: The geom to use to plot the best guess forecast.
best_guess_geom_args: Arguments to best_guess_geom as a named list.
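
As a sketch of the two forms that fcst_colours can take when the colours are controlled by the data, the following base R snippet builds a named vector and an equivalent data frame. It assumes, purely for illustration, that fcst_colour_by = "fcst_model" and that the data contain two hypothetical models called "model_a" and "model_b"; neither name comes from this documentation.

```r
# Assumption: the colouring column is "fcst_model" with the hypothetical
# values "model_a" and "model_b".

# Form 1: a named vector, with names equal to the values in the colouring column
colours_vec <- c(model_a = "steelblue", model_b = "firebrick")

# Form 2: a data frame with one column named after the colouring column
# and one column named "colour"
colours_df <- data.frame(
  fcst_model = c("model_a", "model_b"),
  colour     = c("steelblue", "firebrick"),
  stringsAsFactors = FALSE
)

# The two forms carry the same information: the data frame can be
# collapsed back into the named vector.
df_as_vec <- setNames(colours_df$colour, colours_df$fcst_model)
same_mapping <- identical(df_as_vec, colours_vec)
```

Either object could then be passed as fcst_colours; the data frame form may be more convenient when the mapping is already stored in a table.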

Value

A ggplot object that can be saved with ggsave.

Geoms

The data are plotted using geoms from ggplot2. You can control which geom is used for each aspect of the plot using the respective arguments:

fcst_geom for forecast;
obs_geom for observations;
best_guess_geom for the "best guess" forecast from an ensemble.

The geom should be specified as a character string based on the geom function from ggplot2 with the "geom_" prefix removed. For example, for a line plot for forecast data use fcst_geom = "line". Other arguments to the geom function can be provided as a named list to the appropriate *_geom_args argument to control, for example, the colour or size of the geom.

Note that aesthetic mappings to the geom cannot be controlled, except for x via the x_axis argument and colour and fill via the fcst_colour_by argument.

Ensembles

For ensemble forecasts, some data manipulation is done prior to plotting depending on the geom as listed below. For geoms not included below, no manipulation is done and the plots may be difficult to interpret or not work at all.

boxplot - The data are grouped by the x_axis argument such that each box is representative of the ensemble distribution for each point on the x-axis. See geom_boxplot for how the hinges, whiskers and outliers are defined.
violin - The data are grouped by the x_axis argument such that each "violin" is representative of the ensemble distribution for each point on the x-axis.
line - The data are grouped by each ensemble member and a time series is plotted for each member. This is the common ensemble "spaghetti" plot.
ribbon - The data are divided into bands depending on the quantiles provided in the quantiles argument. The bands are centred such that the outer band is between the lowest and highest quantiles, with inner bands added until the innermost pair of quantiles is reached. Consequently an even number of quantiles must be provided.
col - The data are divided into bands of increasing quantile pairs, starting with the minimum value (quantile = 0) and continuing through the quantiles provided in the quantiles argument. This gives columns of stacked probabilities starting at 0, which is particularly useful for accumulated variables such as precipitation, or variables that truncate at 0 such as wind speed or cloud cover.
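
The outer-to-inner pairing of quantiles described for the ribbon geom can be sketched in base R. This is only an illustration of the pairing as described above, not the code that plot_station_ts runs internally; the member values and the names members, bands, lower_q and upper_q are hypothetical.

```r
# Hypothetical ensemble: 20 member values at a single point on the x axis
set.seed(42)
members <- rnorm(20, mean = 5, sd = 2)

# An even number of quantiles is needed so that they can be paired
quantiles <- c(0.1, 0.25, 0.75, 0.9)
stopifnot(length(quantiles) %% 2 == 0)

q    <- sort(quantiles)
half <- length(q) / 2

# Pair lowest with highest, then second lowest with second highest, and so on:
# row 1 is the outer band, the last row is the innermost band
bands <- data.frame(
  lower_q = q[seq_len(half)],
  upper_q = rev(q)[seq_len(half)]
)

# Limits of each band, computed from the ensemble members
bands$ymin <- unname(quantile(members, bands$lower_q))
bands$ymax <- unname(quantile(members, bands$upper_q))

print(bands)
```

Each row of bands corresponds to one ribbon: the first row spans the lowest and highest quantiles, and each later row nests inside the previous one.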

A "best_guess" forecast can be added to the plot using the best_guess argument. This can either be the name of a function that reduces the ensemble to a single value (e.g. "mean", "median"), or the ensemble member to treat as the best guess (e.g. 0, or "mbr000"). The geom and its options can be provided via the best_guess_geom and best_guess_geom_args arguments, but they must be geoms that understand the x and y aesthetics.
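
The two kinds of best guess described above can be sketched in base R: reducing the members with a function given as a character string, or picking out a single member. This is an assumption-laden illustration rather than the function's own implementation; the data frame ens, its values, and the column names lead_time, member and fcst are hypothetical.

```r
# Hypothetical ensemble in long form: 3 lead times x 4 members
ens <- data.frame(
  lead_time = rep(c(0, 3, 6), each = 4),
  member    = rep(sprintf("mbr%03d", 0:3), times = 3),
  fcst      = c(1, 2, 3, 6,  2, 4, 4, 6,  5, 5, 7, 11),
  stringsAsFactors = FALSE
)

# Case 1: best guess is the name of a function that reduces a vector
# to a single value, applied separately at each lead time
best_guess <- "mean"
reducer    <- match.fun(best_guess)
by_function <- aggregate(fcst ~ lead_time, data = ens, FUN = reducer)

# Case 2: best guess is one ensemble member, identified by its name
best_guess_member <- "mbr000"
by_member <- ens[ens$member == best_guess_member, c("lead_time", "fcst")]

print(by_function)
print(by_member)
```

In both cases the result has one value per lead time, which is why the best guess geom needs only the x and y aesthetics.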

Filtering and faceting

By default, all of the data that the function is given are plotted. In many cases this can result in overplotting. For data at more than one station (SID), or for more than one forecast start time (fcst_dttm), there are arguments to filter the data prior to plotting based on those values. Otherwise filtering should be done (e.g. using filter) prior to passing the data to this function.

Alternatively, multi-panel plots can be made using the facet_by argument. This groups the data based on the values in the columns provided to facet_by and plots each group of data in its own panel. The default behaviour is to plot 1 column of panels so that the x axis lines up for all panels, but this can be changed using the num_facet_cols argument. Additionally, the scale of the y axis for each panel is determined by the data for that panel by default, but it can be made common to all panels by setting facet_scales = "fixed".

Observations

If the data include an observations column (e.g. from running join_to_fcst), these observations can be included in the plot by providing the name of the column that contains the observations via the obs_col argument. The geom and its options can be provided via the obs_geom and obs_geom_args arguments, but they must be geoms that understand the x and y aesthetics.