Medicinal footprint of the population of the Rhine basin

The relation between pharmaceutical residues along the river Rhine and the demographic characteristics of the upstream population was studied. A sampling campaign was performed in which water samples from the Rhine were taken at 42 locations. Measurements were compared to a two parameter model with regional demographic data as main input. For 12 out of the 21 studied pharmaceuticals, a significant dominant demographic group could be identified. For 3 out of these 12 pharmaceuticals the male elderly were the most contributing demographic group. A Monte Carlo analysis showed a high level of significance for the results of this study (p < 0.01). By combining environmental water quality data and demographic data, better insight was gained in the interplay between humans and their environment, showing the medicinal footprint of the population of the Rhine basin.


Introduction
Thirty million people rely on the Rhine for their drinking water [1] and 37% of Dutch drinking water is prepared from river water, the Rhine being the main supplier [2]. It is of great importance to understand the processes generating pharmaceutical pollution of the Rhine.
In this letter, a relation between the concentrations of pharmaceuticals in the river Rhine and specific demographic groups living in the Rhine basin was studied. The main innovation concerns the combination of geophysical and demographic data sources with concentrations of pharmaceuticals. A two parameter model was developed using sampling point locations, measured concentrations, demographic groups and river discharge as input. The demographic groups considered were firstly all combinations of gender (male, female and total) and age (younger than 15, Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. 15-65, older than 65 years and total). Secondly, the different nationalities living in the basin (German, Austrian, Belgian, French and Swiss) were considered. This approach allowed us to link different concentrations of pharmaceuticals to different dominant demographic groups in the Rhine catchment.
Pollution of river water with residues of pharmaceuticals has become a source of concern and topic of research [3]. Pascoe et al [4] showed that exposure to pharmaceuticals in the environment has negative effects on fresh water invertebrates. Fick et al [5] showed that therapeutic levels of pharmaceuticals were found in fish when exposed to waste water treatment plant effluent. Research on understanding the source of this contamination was undertaken by, among others, Nakada et al [6], who compared discharge of pharmaceuticals from different rivers to the population living in the catchments of those rivers and found significant correlations. Sim et al [7] showed that levels of pharmaceuticals in waste water treatment plant effluent could be linked to consumption of pharmaceuticals. Ter Laak et al [8] linked the amount of pharmaceuticals sold in the Rhine basin to the residuals measured in the river at the catchment outlet. By using population estimations and pharmaceutical consumption data, van der Aa et al [9] predicted that the amount of pharmaceuticals in the Rhine will rise by 17% in 2020, compared to 2007.
One would assume that pharmaceutical pollution accumulates in the river water with downstream distance. However, this has never been shown in the literature. In this letter, we combine data sources on demographics, elevation and river discharge with measured concentrations of pharmaceuticals in a two parameter model. This simple model accounts for decay of pharmaceuticals and incorporates the downstream logic of the watershed to identify the 'most contributing groups' per polluting pharmaceutical.

Methods used
We used a two parameter model to predict pharmaceutical concentrations in the river Rhine with measured hydrologic and demographic data. A model is needed because direct correlation of concentration, or loads, of pharmaceuticals with demographic groups ignores the known process of degradation of pharmaceuticals in the river. The parameters in this model were estimated using measured concentrations from our sampling campaign. Parameter estimation was performed for different demographic groups. The groups considered were firstly all combinations of gender (male, female and total) and age (younger than 15, 15-65, older than 65 years and total). Secondly, the different nationalities living in the basin (German, Austrian, Belgian, French and Swiss) were considered. The group, either a combination of sex and age, or a nationality, for which the model fitted best to the measurements was considered 'main contributor'. Below the different data sources, the model and the method for significance testing are described.

Model
The model has two parameters to link population to pharmaceutical concentration. First is the emission factor C (ng/day person), the flux of pharmaceutical residual that is discharged into the river per individual. Second parameter is the decay-length L (km), the river length associated with the decay of the pharmaceutical. Using these two parameters, the concentrationĉ i,j,k of a particular medicine i, caused by a demographic group j, at sample location k is predicted by: where • Parameter C i,j is the emission factor of pharmaceutical i for demographic group j.
• Parameter L i,j is the decay-length of pharmaceutical i for demographic group j. Although available from literature (Kunkel and Radke [12]), the decay-length was optimized as a parameter because the circumstances in this specific river are likely to be different from those used for the assessment of literature values. Optimized decay-lengths that are close to reported literature values, provide extra confidence in the assumptions implicitly underlying the model.
• Input N k is the set of sampling points that are upstream of sampling point k plus k itself.
• Input P n,j is the number of people in demographic group j that lives in the upstream area of sampling point n but not in the upstream area of any other upstream sampling point. For sampling points along the main river, not downstream of a tributary, this means that P n,j is the number of people living in the part of the catchment that is upstream of n, but not in the catchment of the next upstream sample point.
• Input n,k is the length, measured along the river, from sampling point n to sampling point k.
• Input Q k is the river discharge at sampling point k.
The parameters C i,j and L i,j are estimated by minimizing the differences between predicted concentrationsĉ i,j,k and the measured concentrations c i,k . For each combination of pharmaceutical and demographic group, the R-square statistic is computed. Finally, for each pharmaceutical, the demographic group with the highest R 2 i,j is marked as 'most contributing'.

Assumptions
By constructing the model as explained above, the following assumptions were made.
• It is assumed that steady state conditions hold. People discharge a continuous flux of pharmaceuticals per person into the river that does not change with time. Gerrity et al [13] showed that during special events, in their case the NFL Superbowl, this assumption may not hold. The sampling campaign of this research, did not take place during any known event or holiday that would violate this assumption.
• Perfect mixing is assumed. The sample is representative for the water composition of the entire river cross section. To minimize the impact of this assumption on results, samples were always taken from the centre of the river.
• A single demographic group is held responsible for the total pharmaceutical pollution. In reality, different groups contribute differently. The test performed in this research is a 1-step ANOVA using a non-linear model. Ideally, one would like to perform a multivariate (multi-step) analysis with all the groups as combined inputs. That would, however, lead to overparameterization, considering the number of samples. This assumption limits the interpretability of the emission factor.

Sampling and chemical analyses
Water samples were taken at 42 points along the river, each approximately 20-30 km from the previous point. The sampling points were chosen between Liechtenstein, near the source of the river, and Emmerich, on the Dutch-German border where the Rhine delta starts. No specific permits were required for the described sampling campaign. None of the samples were taken at locations that are privately owned or protected in any way. The sampling campaign did not involve endangered or protected species. The campaign was conducted from 10 April until 13 April 2011. The sampling team moved downstream with a velocity about equal to the stream-velocity. During the sampling campaign, and in the weeks before, hardly any rain fell in the Rhine basin that could have led to rapid discharge fluctuations that might influence the analysis. Sampling points were downstream of bridges or ferries. The major tributaries of the Rhine (Aare, Neckar, Main, Lahn, Mosel, Sieg, Ruhr and Lippe) were also sampled just before they entered the main river. Samples were taken using stainless steel buckets, transferred to pre-rinsed bottles of green glass and cooled directly after sampling. Samples were extracted within one day upon arrival at the laboratory. Locations 18 and 24 were sampled and analysed in triplicate to determine the experimental error in this study. MilliQ-water (sampled and analysed in triplicate) served as blanc control for contamination during sampling and laboratory procedures. Analysis of pharmaceuticals was performed as described in Houtman et al [24]. In short, 100 ml volumes were extracted with solid phase extraction using Oasis HLB and eluted with methanol. Extracts were evaporated to 100 µl to which 1 ml MilliQ-water was added. Pharmaceuticals were analysed using an Ultra Performance Liquid Chromatograph (UPLC, Waters Acquity), equipped with a quaternary pump, combined with a Quattro Xevo triple quadrupole Mass Selective Detector (Waters Micromass) with electro spray ionization. Quantification was performed using an external calibration series of seven concentrations.
The analysis method contained forty one pharmaceuticals that were selected according to their consumption volume, earlier detection, ecotoxicity and that represented most relevant therapeutic classes (analgesics, antibiotics, antidiabetic, antidepressant/psycholeptics, anti-hypertension drugs, antilipaemic, beta blockers, cytostatics, diuretics, x-ray contrast agent, anti-epileptic, respiratory drug). Most (32) compounds had a method limit of detection (LOD) of 5 ng l −1 or lower, of which 18 compounds had an LOD between 0.1 ng l −1 to 1 ng l −1 . Highest LOD was obtained for clofibrate (85 ng l −1 ). The method was validated by calculating the recovery and standard deviation of surface water samples from eight different locations and sampled on different days spiked with pharmaceuticals at a low level (resp. 0.5, 2.5 or 15 µg l −1 ) to determine limits of detection and at a higher level (resp. 0.5, 2.5 or 15 µg l −1 ) extracted and analysed at different days to determine reproducibility. The average recovery found was 91 ± 14% (error includes whole process from sampling to analysis). An average recovery of 84 ± 19% was found for the matrix Rhine surface water. Only pharmaceuticals that were detected in concentrations higher than their LOD in the Rhine water samples at 10 or more sampling points were included in the dataset used to estimate the parameters for the prediction model. This resulted in a dataset with concentrations for 21 different pharmaceuticals at 42 sample points.

Data sources
The Hydrosheds digital elevation model [14] and the GPS locations of the sampling points were used to determine the upstream area (catchment) of each sampling point. The boundaries of these catchments were overlaid over the boundaries of the NUTS3 areas 3 from Eurostat. The population statistics per NUTS3 area in the Eurostat online database [15] provided the population statistics per age group and gender per upstream area of each sampling point as counted on the first of January 2010 4 .
The different demographic groups are highly correlated with each other (general correlation coefficient greater than 0.99 i.e. people tend to live together). Because of this high correlation, large differences in emission of pharmaceuticals will show up as small differences in the R 2 and small differences should thus be interpreted as more significant compared to research where different inputs are not correlated with each other.
To derive the discharge at the sampling locations from the discharge at measurement stations along the river [16][17][18][19] the schematization of the Rhine Alarm Model [20] was used. For some sampling locations, most notably in the city of Koblenz and in the tributaries Sieg, Ruhr and Lippe, only water levels and no discharge information could be found, even after contacting the respective authorities. These sampling locations were excluded from the analysis, although the effect of the populations of the tributaries is taken into account in subsequent downstream sampling points.

Significance testing
We tested the significance of the results by running a Monte Carlo simulation in which we repeated the complete analysis presented above, with for each Monte Carlo run a randomly shuffled order of demographic input per sampling point. This method ensured that all the distributions were the same, and only the order was changed. If we had also bootstrapped the values of the concentrations and/or discharge data, any covariance among these inputs might have been destroyed, resulting in overconfidence in our results. By only shuffling the order of the demographic input, we test whether the order of sampling points causes the observed results. For each Monte Carlo run, for each pharmaceutical, the most contributing group and the corresponding R 2 were saved. It should be noted that we are not using a linear model and Figure 1. Measured concentrations of pharmaceuticals, aggregated to therapeutic classes, in the Rhine (left) and its major tributaries (right). The downstream increasing trend indicates that the relative increase of pharmaceutical loads is higher than the relative increase in water discharge. Since more people live in the lower part of the Rhine catchment, this supports the hypothesis that the concentration of pharmaceuticals at a sampling point is linked to the number of people living upstream of that sampling point. thus R 2 cannot be interpreted as the percentage of variance explained, but higher R 2 does relate to better fitted models. The location of the R 2 of the actual (non-shuffled) results in the distribution of R 2 s from the Monte Carlo runs, determines the significance level (p-values), per pharmaceutical.
Since a large group of pharmaceuticals was tested, statistically some of those would have to end up with a low p-value [21]. To test whether the overall results were significant we compared the average pseudo-R 2 for all pharmaceuticals in the actual results to the distribution of average pseudo-R 2 s from the Monte Carlo runs to find the p-value for the overall results. Since the individual Monte Carlo runs use a randomized input, and comparison with individual end-members of distribution introduces uncertainties, a safety factor needs to be applied when the results indicate very high overall significance.
Finally, the catchment area per sample point was also used as an input to the model. Catchment area is strongly correlated with the amount of people living in the catchment. It would be an indication that the observed results are actually due to catchment area as explaining input and not demographics when the results using catchment area as input have a higher R 2 than the results using the correct demographics as input.

Results
Measured concentrations, aggregated in therapeutical classes, are shown in figure 1. A complete table of measured concentrations per pharmaceutical is provided in the supporting material (available at stacks.iop.org/ERL/8/044057/mmedia). Figure 1 shows a significantly increasing trend in measured total concentrations of pharmaceuticals with downstream distance, supporting the hypothesis that the concentration of pharmaceuticals is linked to the number of people in the upstream catchment.
The demographic groups that contribute most per pharmaceutical are shown in table 1. For 12 of the 21 studied pharmaceuticals, a significant (p < 0.05) dominant demographic group could be identified. The R 2 values for all combinations of pharmaceuticals and demographic groups are presented in the supporting material (available at stacks.iop. org/ERL/8/044057/mmedia). Careful analysis of the results per pharmaceutical produces interesting outcomes, both for the pharmaceuticals with a significant dominant demographic group, as for the pharmaceuticals without a significant dominant group.
All three pharmaceuticals that have elderly people as most contributing group show high correlations between concentrations and contributing group (p < 0.01). Apparently, elderly in Germany, France and Switzerland use similar above average quantities of these pharmaceuticals and thus correlate highly with the measured concentrations. This agrees with data for the Netherlands from the Dutch Foundation for Pharmaceutical Statistics that show that elderly people use exponentially more pharmaceuticals with age [10]. The results for the anti-epileptic carbamazepine are shown in figure 2 as an example of a pharmaceutical with a significant correlation (low p-value) with a demographic group. Carbamazepine concentrations correlate best with the demographic group 'male elderly'. This finding corroborates the work of van der Aa et al (figure 1 of [9]) that, based on Dutch prescription data, showed that elderly males are the main consumers of carbamazepine.  Humans are probably not directly correlated with the main source of pollution for pharmaceuticals that have high p-values for their strongest contributing demographic group. Other sources, such as industrial discharge, might, for some pharmaceuticals, cause the measured concentration pattern. For example; the psycholeptic diazepam has 'the French' as most contributing group, but a high p-value of 0.361. Measured and modelled results for diazepam are shown in figure 3. There is a sharp increase in diazepam at 300 km from Konstanz, near the city of Strasbourg. Benzodiazepines, such as diazepam, temazepam and oxazepam, are after administration subject to several metabolisation reactions in the liver. For diazepam, oxazepam is the major metabolite.
Also temazepam and nordazepam are formed. They are excreted as conjugates via urine ( [22,23]). Only a small portion of the administered diazepam is excreted as diazepam. In the environment or wastewater treatment deconjugation of pharmaceuticals takes place [25]. Concentrations of specific benzodiazepines measured in the environment can therefore result from the consumption of the compound itself as well as of others and in most cases will result from a combination of different benzodiazepines prescribed in the tributary. The ratio of concentrations reflects the net result of this mutual relation between benzodiazepines. In the routine monitoring of Rhine river water in the Dutch delta, the 'The Water Laboratory' laboratory observed the ratios of diazepam:oxazepam and temazepam:oxazepam to be normally <0.5 and <1 respectively. In this study, the ratios are 1 and 26 respectively directly downstream from Strasbourg (concentration values are provided in the supporting materials available at stacks. iop.org/ERL/8/044057/mmedia). Therefore, we hypothesize that the diazepam measured here did not result from human consumption but came from another emission source.
Like diazepam, hydrochlorothiazide has a national group as most contributing group, namely Germans. However, for hydrochlorothiazide a much stronger correlation with its strongest contributing group was found (p < 0.04). Either Germans are consuming more hydrochlorothiazide than other nationalities in the Rhine catchment, maybe due to different prescription policies, or other activities that happen all over Germany cause more prominent emission.
Care must be taken when interpreting the results in a causal manner. For example, female children are the main contributing group for the beta blocker sotalol. It is unlikely that these children are actually excreting sotalol, since sotalol is predominantly prescribed to patients suffering from atrial fibrillation, a hearth condition of which the prevalence is strongly positively correlated with age. The actual group responsible for emission of sotalol in the Rhine must, however, be highly correlated with the presence of female children.
Because of the large number of correlation operations performed for this research, it was important to test whether the total results (table 1) could have been due to pure chance. The main hypothesis of this work is that there is a relation between concentrations of pharmaceuticals in the river and the people living in the upstream catchment. This was tested by a Monte Carlo analysis in which the measured concentrations were shuffled over the sampling points for each Monte Carlo run. As the cumulative distribution of R 2 s of the Monte Carlo runs in figure 4 shows, not a single one out of the 1000 Monte Carlo runs has a higher R 2 than the actual non-shuffled results. Using a safety factor, we conclude that the results of table 1 are highly significant (p < 0.01).

Discussion
The logic of watersheds and pollution patterns helps us in understanding a broad range of phenomena. The methodology of this research could be used more widely as it offers a way of monitoring an indicator of population health, without having . Distribution of mean pseudo-R 2 from a thousand Monte Carlo runs in which the order of the samples was randomized. The black lines indicate the results using catchment area as input. The red lines indicates the real, non-randomized demographic data is input, showing that there is a higher than 99% chance that the results are not due to chance.
to use surveys. This holds especially true for illicit drugs and over the counter (non-prescribed) pharmaceuticals, for which data on sales or usage are usually not available, as Thomas et al [11] suggested.
The results show that there is a relationship between concentrations of pharmaceuticals in the river Rhine and the demographic characteristics of the people living upstream in the Rhine basin. For 12 out of 21 pharmaceuticals, the relationships between demographic group and concentrations of pharmaceuticals are significant (p < 0.05). However, care must be taken with the interpretation of these results. These correlations do not show causal relations per se. Other processes that are highly correlated with population density of certain groups might be the cause of pollution with pharmaceuticals. For the nine pharmaceuticals with p-values higher than 0.05, it is concluded that either the dataset is not rich enough to pinpoint the dominant demographic group, or another process that does not correlate highly with demographics causes the pollution. The pollution of diazepam that seems to originate upstream of the city of Strassbourg is an example of this.
A strong correlation between a specific pharmaceutical and a specific demographic group should be seen as a starting point for new research questions. For example: 'do female children really take large quantities of the beta blocker drug sotalol, or do these children cause irregular heart beat symptoms in their (grand)parents?', or 'do differences in national health policies related to medicine explain why Germans are the most contributing group for temazepam, bezafibrate and hydrochlotorhiazide?'.
This research showed that the combination of demographic data with environmental data allows for insight in the interplay between humans and their environment.