read_obs
Generates file names based on the arguments given and reads
point observations data from them. The data can optionally be re-written to
files of a different format. Due to the large volumes of data that may be
read, the function will only return data to the calling environment if
return_data = TRUE.
Usage
read_obs(
  dttm,
  parameter,
  param_defs = get("harp_params"),
  stations = NULL,
  file_path = getwd(),
  file_format = NULL,
  file_template = "vobs",
  file_format_opts = vfile_opts("vobs"),
  output_format = "obstable",
  output_format_opts = obstable_opts(),
  return_data = FALSE,
  start_date = NULL,
  end_date = NULL,
  by = "1h",
  reads_per_write = 24,
  ...
)
Arguments
- dttm
A vector of date time strings to read. Can be in YYYYMMDD, YYYYMMDDhh, YYYYMMDDhhmm, or YYYYMMDDhhmmss format. Can be numeric or character. A vector of date-times can be generated using seq_dttm.
- parameter
The names of the parameters to read. By default this is NULL, meaning that all parameters are read from the observations files.
- param_defs
A list of parameter definitions that includes the file format to be read. By default the built-in list harp_params is used. Modifications and additions to this list can be made using modify_param_def and add_param_def respectively.
- stations
The IDs of the stations to read from the files. By default this is NULL, meaning that observations for all stations are read from the observations files.
- file_path
The parent path to all observations data. All file names are generated to be under the file_path directory. The default is the current working directory.
- file_format
The format of the files to read. By default this is "vobs", which is the standard format used by the HIRLAM consortium. If set to something else, read_obs will search the global environment for a function called read_<file_format> that it will use to read from the files.
- file_template
A template for the file names. For available built-in templates see show_file_templates. If anything else is passed, it is returned unmodified, or with substitutions made for dynamic values. Available substitutions are {YYYY} for year, {MM} for 2-digit month with leading zero, {M} for month with no leading zero, and similarly {DD} or {D} for day, {HH} or {H} for hour, and {mm} or {m} for minute. Note that the full path to the file will always be file_path/template. Other substitutions can be passed via the ... argument.
- file_format_opts
Specific options for reading the file format specified in file_format. Should be a named list, with names corresponding to arguments for read_<file_format>.
- output_format
The file format to re-write the data to. By default this is "obstable", which is an SQLite file designed specifically for the harp ecosystem. If set to something else, read_obs will search the global environment for a function called write_<output_format> that it will use to write to the output file(s).
- output_format_opts
Specific options for writing to output_format files. Must be a named list and at least include the names "path" and "template". By setting output_format_opts$path to something other than NULL, read_obs will attempt to write out the data.
- return_data
Logical - whether to return the data read in to the calling environment. Due to the potential for large volumes of data, this is set to FALSE by default.
- start_date, end_date, by
The use of start_date, end_date and by is no longer supported. dttm together with seq_dttm should be used to generate equally spaced date-times.
- reads_per_write
The number of files to read before writing out the data to new files. Set this to a low number to reduce memory usage. The default is 24 based on the assumption that there are observations files every hour and writing should be done once per observation day. For the default setting of writing to "obstable" files, this number has no impact on the output since these files can be appended to. For other formats, this setting might be important to prevent data from being overwritten.
- ...
Other arguments to generate_filenames for getting the names of files to read.
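The substitutions described for file_template can be illustrated with a small base R sketch. The helper fill_template below is hypothetical and not part of harp; it only handles the date-time tokens listed for file_template, applied to a YYYYMMDDhhmm string. The real file names are produced by generate_filenames, whose behaviour may differ. The directory and template are also assumptions made for illustration.

```r
# Hypothetical helper (not part of harp): fill the date-time tokens
# of a file template from a YYYYMMDDhhmm string.
fill_template <- function(template, dttm) {
  dttm   <- as.character(dttm)
  year   <- substr(dttm, 1, 4)
  month  <- as.integer(substr(dttm, 5, 6))
  day    <- as.integer(substr(dttm, 7, 8))
  hour   <- as.integer(substr(dttm, 9, 10))
  minute <- as.integer(substr(dttm, 11, 12))

  # Two-letter tokens keep the leading zero, one-letter tokens drop it.
  subs <- c(
    "{YYYY}" = year,
    "{MM}"   = sprintf("%02d", month),  "{M}" = as.character(month),
    "{DD}"   = sprintf("%02d", day),    "{D}" = as.character(day),
    "{HH}"   = sprintf("%02d", hour),   "{H}" = as.character(hour),
    "{mm}"   = sprintf("%02d", minute), "{m}" = as.character(minute)
  )

  # fixed = TRUE treats each token as a literal string, not a regex.
  for (token in names(subs)) {
    template <- gsub(token, subs[[token]], template, fixed = TRUE)
  }
  template
}

file_path <- "/data/obs"  # assumed parent directory
template  <- "{YYYY}/{MM}/vobs{YYYY}{MM}{DD}{HH}"  # assumed custom template

# The full file name is always file_path/template after substitution.
full_name <- file.path(file_path, fill_template(template, "202203050600"))
print(full_name)
```

Because the braces are part of each token, {M} never matches inside {MM}, so the order of substitution does not matter in this sketch.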
Details
read_obs is not intended to be used for reading gridded observations. For this, use read_analysis instead.
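A minimal sketch of a typical call might look as follows. It assumes that read_obs is exported by the harpIO package, that hourly vobs files exist under a hypothetical directory, and that "T2m" is a valid parameter name in harp_params; none of these are confirmed by this page. The date-times are built with base R here so the sketch does not rely on the exact signature of seq_dttm.

```r
# Hourly date-times for one day in YYYYMMDDhh format, built with base R.
dttm <- format(
  seq(as.POSIXct("2022-01-01 00:00", tz = "UTC"), by = "1 hour", length.out = 24),
  "%Y%m%d%H"
)

obs_dir <- "/path/to/vobs"  # assumed location of the vobs files

# Only attempt the read when harpIO is installed and the directory exists.
if (requireNamespace("harpIO", quietly = TRUE) && dir.exists(obs_dir)) {
  obs <- harpIO::read_obs(
    dttm          = dttm,
    parameter     = "T2m",    # assumed parameter name
    file_path     = obs_dir,
    file_template = "vobs",
    return_data   = TRUE      # without this, no data are returned
  )
}
```

With the default output_format_opts, nothing is written to disk in this sketch; setting output_format_opts$path to a directory would additionally write the data out, as described for that argument.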