The main goal of the imgw package is to give convenient, programmatic access to the Polish database of archival meteorological and hydrological measurements.

It provides tools that:

• Make the public data held by the Institute of Meteorology and Water Management (IMGW) easier to access
• Download selected data from among eleven datasets that differ in time interval and data type
• Describe the parameters in two languages, Polish and English, as well as in abbreviated form

## Functions

The imgw package consists of nine main functions: four for meteorological data, four for hydrological data, and one for meteorological sounding (rawinsonde) data:

1. Meteorological data:
• meteo_hourly() - downloads meteorological data at an hourly interval
• meteo_daily() - downloads meteorological data at a daily interval
• meteo_monthly() - downloads meteorological data at a monthly interval
• meteo() - downloads meteorological data at an interval chosen by the user
2. Hydrological data:
• hydro() - downloads hydrological data at an interval chosen by the user
• hydro_daily() - downloads hydrological data at a daily interval
• hydro_monthly() - downloads hydrological data at a monthly interval
• hydro_annual() - downloads hydrological data at an annual interval
3. Rawinsonde data:
• meteo_sounding() - downloads rawinsonde data (with metadata)

Most of these functions share a similar set of arguments:

• rank - type of station: "synop", "climate", or "precip" (meteo functions only)
• year - vector of selected years (e.g., 1966:2000)
• status - logical (TRUE or FALSE); if TRUE, the measurement statuses are removed (meteo functions only)
• coords - logical (TRUE or FALSE); if TRUE, station coordinates are added to the output
• station - selection of stations, given either as station IDs (numeric) or as station names (character, in CAPITAL LETTERS)
• col_names - format of the column names: "short" (default) for abbreviated names, "full" for full English descriptions, or "polish" for the original names used in the dataset
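
As a rough sketch of how these shared arguments combine, the call below requests two years of monthly data from the synoptic stations, with coordinates attached and full English column names. It assumes that the imgw package is installed and that the IMGW servers are reachable; the chosen years and the object name m are arbitrary.

```r
years = 2000:2001  # arbitrary example period

# The download only runs when imgw is available (assumption: installed locally)
if (requireNamespace("imgw", quietly = TRUE)) {
  library(imgw)
  m = meteo_monthly(rank = "synop",      # synoptic stations
                    year = years,        # vector of years
                    coords = TRUE,       # add station coordinates
                    col_names = "full")  # full English column descriptions
  print(head(m))
}
```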

## Database

imgw also ships with a few additional datasets:

• hydro_abbrev/meteo_abbrev - dictionaries containing the original descriptions of all parameters in both languages, together with their abbreviations
#>                               fullname   abbr_eng
#> 1 Absolutna temperatura maksymalna [C]   tmax_abs
#> 2  Absolutna temperatura minimalna [C]   tmin_abs
#> 4      Charakterystyka tendencji [kod] press_tend
#> 5                      Chmury CH [kod] cl_CH_code
#> 6                    Chmury CH tekstem      cl_CH
#>                           fullname_eng
#> 1 Absolute maximum air temperature [C]
#> 2 Absolute minimum air temperature [C]
#> 4                    Pressure tendency
#> 5              High cloud cover [code]
#> 6              High cloud cover [text]
• hydro_stations/meteo_stations - datasets of almost all hydrological/meteorological stations, containing their ID, longitude (X), and latitude (Y)
#> # A tibble: 6 x 3
#>          id     X     Y
#>       <dbl> <dbl> <dbl>
#> 1 249190020  19.6  50.0
#> 2 249199999  19.3  49.9
#> 3 249190040  19.4  49.9
#> 4 249190050  19.8  49.9
#> 5 249190060  20.0  49.9
#> 6 249190070  19.2  49.9

## Application

We will show how to use the package and prepare the data for spatial analysis with the help of the dplyr and tidyr packages. First, we download ten years (2001-2010) of monthly hydrological observations for all available stations and automatically add their spatial coordinates.
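
The download step could look roughly like the following. It relies on hydro_monthly() and the year and coords arguments listed in the Functions section, and stores the result in h, the object name used by the processing code further down. This is a sketch that assumes imgw is installed and the IMGW servers are reachable.

```r
years = 2001:2010  # the ten years described in the text

# The download only runs when imgw is available (assumption: installed locally)
if (requireNamespace("imgw", quietly = TRUE)) {
  library(imgw)
  # coords = TRUE attaches the X and Y coordinates of each station
  h = hydro_monthly(year = years, coords = TRUE)
  print(head(h))
}
```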

#>              id       X        Y station riv_or_lake  hyy idhyy idex   H
#> 95158 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     1    1 214
#> 95159 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     1    2 228
#> 95160 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     1    3 250
#> 95161 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     2    1 215
#> 95162 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     2    2 225
#> 95163 150210180 21.8335 50.88641 ANNOPOL   Wisła (2) 2001     2    3 258
#>         Q  T mm
#> 95158 172 NA 11
#> 95159 207 NA 11
#> 95160 272 NA 11
#> 95161 174 NA 12
#> 95162 201 NA 12
#> 95163 297 NA 12

The idex variable identifies the type of extremum: 1 is the minimum, 2 the mean, and 3 the maximum (see note 1). Hydrologists often work with the maximum value, so we filter the data to keep only the maxima and select the station id and name, the longitude (X), the latitude (Y), the hydrological year (hyy), and the flow (Q). Next, we calculate the mean of the maximum flow at each station in each year with dplyr's summarise(), and spread the data by year with tidyr's spread() so that the annual means of maximum flow appear in consecutive columns.

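library(dplyr)
library(tidyr)

h2 = h %>%
  filter(idex == 3) %>%                      # keep the monthly maxima only
  select(id, station, X, Y, hyy, Q) %>%
  group_by(hyy, id, station, X, Y) %>%
  # annual mean of the monthly maximum flow (the column name is Polish for "annual mean Q")
  summarise(srednie_roczne_Q = round(mean(Q, na.rm = TRUE), 1)) %>%
  spread(hyy, srednie_roczne_Q)              # one column per hydrological year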

Exemplary data frame after hydrological preprocessing.

| id | station | X | Y | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 149180010 | KRZYŻANOWICE | 18.28780 | 49.99301 | 200.5 | 147.4 | 87.9 | 109.2 | 170.6 | 226.9 | 152.9 | 131.0 | 160.9 | 461.1 |
| 149180020 | CHAŁUPKI | 18.32752 | 49.92127 | 174.7 | 96.7 | 57.6 | 91.8 | 146.9 | 170.6 | 110.2 | 101.6 | 124.7 | 314.6 |
| 149180040 | GOŁKOWICE | 18.49640 | 49.92579 | 4.5 | 2.0 | 1.7 | 1.7 | 2.5 | 3.3 | 2.1 | 1.7 | 2.2 | 8.6 |
| 149180050 | ZEBRZYDOWICE | 18.61326 | 49.88025 | 13.5 | 7.9 | 3.8 | 5.0 | 10.4 | 6.5 | 5.8 | 2.8 | 4.5 | 23.6 |
| 149180060 | CIESZYN | 18.62972 | 49.74616 | 57.2 | 57.7 | 29.8 | 26.8 | 65.4 | 60.7 | 54.7 | 33.0 | 34.7 | 135.0 |
| 149180070 | CIESZYN | 18.63137 | 49.74629 | NaN | NaN | NaN | NaN | NaN | NaN | 0.6 | 0.5 | 0.6 | 0.6 |
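
To see what the filtering and averaging steps compute, the same logic can be tried on a tiny data frame using base R only. The sketch below is not the package's method: the data frame toy and all its values are made up, and tapply() stands in for the group_by(), summarise(), and spread() chain.

```r
# Made-up monthly records: idex 1 = minimum, 3 = maximum
toy = data.frame(
  id   = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  hyy  = c(2001, 2001, 2001, 2001, 2002, 2002, 2002, 2001, 2001, 2002, 2002),
  idex = c(1, 3, 1, 3, 1, 3, 3, 3, 3, 3, 3),
  Q    = c(10, 30, 12, 50, 8, 20, 40, 5, 7, 9, NA)
)

# Keep the maxima only, as filter(idex == 3) does
maxima = toy[toy$idex == 3, ]

# Mean of the monthly maxima per station and year, ignoring missing values;
# tapply() returns a station-by-year matrix, i.e. the "wide" layout
wide = round(tapply(maxima$Q, list(id = maxima$id, hyy = maxima$hyy),
                    mean, na.rm = TRUE), 1)
print(wide)
```

Station 1 ends up with 40 for 2001 and 30 for 2002, and station 2 with 6 and 9. The missing value in 2002 is dropped by na.rm = TRUE, so it does not turn that mean into NA.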

The result, h2, shows how the annual mean of the monthly maximum flow changed over the decade at all available stations in Poland. We can save it to:

• .csv with: write.table(h2, file = "result.csv", sep = ";", dec = ".", col.names = TRUE, row.names = FALSE). This command saves the result to result.csv with ; as the column separator and . as the decimal mark; it keeps the column headers and drops the row names, which are simply observation numbers. write.table() is used because write.csv() ignores the sep, dec, and col.names arguments and only issues a warning.

• .xlsx with: write.xlsx(as.data.frame(h2), file = "result.xlsx", sheetName = "Poland", append = FALSE). This command saves the result to result.xlsx in a sheet named Poland. Setting append = TRUE adds the sheet to an already existing .xlsx file. This signature belongs to write.xlsx() from the xlsx package, so first install it with install.packages("xlsx") and load it with library(xlsx). The writexl package instead provides write_xlsx(), which has no sheetName or append arguments.
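
A quick way to confirm that the chosen separator and decimal mark behave as intended is to write a small table and read it back. The sketch below uses base R only; the data frame res, its station names, and its values are made up for illustration, and a temporary file stands in for result.csv.

```r
# Made-up table standing in for the real result
res = data.frame(id = c(101, 102),
                 station = c("STATION_A", "STATION_B"),
                 Q_2001 = c(200.5, 174.7))

out = tempfile(fileext = ".csv")

# Semicolon-separated file with "." as the decimal mark, headers kept, row names dropped
write.table(res, file = out, sep = ";", dec = ".",
            col.names = TRUE, row.names = FALSE)

# Read the file back with the same settings to check the round trip
back = read.table(out, header = TRUE, sep = ";", dec = ".")
print(back)
```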

library(sf)
library(tmap)
library(rnaturalearth)
library(rnaturalearthdata)
# country borders used as the background layer
world = ne_countries(scale = "medium", returnclass = "sf")

# convert the stations to spatial points; X/Y are assumed to be WGS84 longitude/latitude
h3 = h2 %>%
  filter(!is.na(X)) %>%
  st_as_sf(coords = c("X", "Y"), crs = 4326)

# one map panel per year, with symbol size scaled by the mean maximum flow
tm_shape(h3) +
  tm_symbols(size = as.character(c(2001:2010)),
             title.size = "Mean maximum flow") +
  tm_facets(free.scales = FALSE, ncol = 4) +
  tm_shape(world) +
  tm_borders(col = "black", lwd = 2) +
  tm_layout(legend.position = c(-1.25, 0.05),
            outer.margins = c(0, 0.05, 0, -0.25),
            panel.labels = as.character(c(2001:2010)))

1. You can find more information about this in the hydro_abbrev dataset.