Determining tropical cyclone inland flooding loss on a large scale through a new flood peak ratio-based methodology

In recent years, the United States has been severely affected by numerous tropical cyclones (TCs) which have caused massive damage. While media attention mainly focuses on coastal losses from storm surge, these TCs have inflicted significant devastation inland as well. Yet, little is known about the relationship between TC-related inland flooding and economic losses. Here we introduce a novel methodology that first successfully characterizes the spatial extent of inland flooding, and then quantifies its relationship with flood insurance claims. Hurricane Ivan in 2004 is used as an illustration. We empirically demonstrate in a number of ways that our quantified inland flood magnitude produces a very good representation of the number of inland flood insurance claims experienced. These results highlight the new technological capabilities that can lead to a better risk assessment of inland TC flooding. This new capacity will be of tremendous value to a number of public and private sector stakeholders dealing with disaster preparedness.


Introduction
Inland riverine flooding from tropical cyclones (TCs) is responsible for significant economic losses in the United States (e.g., Pielke et al 2008, Mendelsohn et al). Yet, little is known about the relationship between TC-related inland flooding and economic losses as most hurricane loss assessment efforts are focused on coastal areas (United States Department of Commerce 2011, Elsberry 2002, Zandbergen 2009). Hurricane Irene, which struck the US East Coast in 2011, provides a recent and poignant example: intense media coverage and preparation and evacuation activities focused on the projected coastal landfall locations in North Carolina and New York, but ultimately most of the losses were due to heavy rainfall and associated inland riverine flooding, not storm surge (United States Department of Commerce 2011, Avila and Cangialosi 2011). Furthermore, even if the loss assessment were focused inland, broadly accepted procedures for the regional characterization and quantification of the spatial structure of TC flooding, which are essential for a proper assessment, are not available.
Here we address this knowledge gap through a novel approach that combines the two critical hazard and loss data elements. First, we apply new quantification methods of the spatial structure of TC-related flood magnitudes at the regional scale; and second, we benefit from unique access to the entire portfolio of the federally run National Flood Insurance Program (NFIP) that underwrites the vast majority of residential flood insurance policies throughout the United States. This combination of methods and data allows for a detailed characterization of homeowners' flood claims at a given inland-focused location, which we do for Hurricane Ivan (2004). Not only was Hurricane Ivan one of the most devastating and costly tropical cyclones to ever hit the US, and the third largest flood event covered by the NFIP (King 2011), it also affected a geographically expansive area of 23 US states. Thus, it is an ideal application to validate our proposed inland flood loss assessment methodology across a large region.

Quantifying the spatial extent and magnitude of TC inland flooding
Flood hazard characterization has long focused on development of methods to estimate the flood discharge at a particular location along a river with a specified return period, or alternatively assessing the return period of a flood peak with specified discharge. Assessing damages associated with a typically geographically expansive individual TC event (i.e. across an entire state or even multiple states), however, requires characterization of the spatial extent of flooding. Characterization and quantification of the spatial structure of flooding over a region has received significantly less attention than characterization of 'at-site' hazards and broadly accepted procedures are not available. Statistical methods that generalize univariate extreme value theory (the foundation of single site flood hazard characterization) to spatial extremes are an active area of research (Davison et al 2012) and provide an important long-term path for characterizing spatial extremes of flood peaks. Mature parametric statistical methods based on extreme value theory are not, however, available. A particularly challenging problem for usefully generalizing extreme value theory to spatial extremes is addressing the role of spatial heterogeneities in flood generation. Another approach to assessing spatial properties of flooding combines observed high-resolution rainfall fields with hydrologic and hydraulic models of runoff production and transport through the drainage network. Although this line of research has substantial potential, the obstacles to implementing hydrologic models for flood hazard characterization over large regions (see, for example, Beven 2001) make other methods necessary.
Here we propose a data-driven approach to flood hazard characterization based on discharge observations from a network of stream gaging stations. Our approach for characterizing spatial extremes of flood hazards and linking hazards to damages avoids these spatial heterogeneity issues through the utilization of empirical flood frequency methods. We describe the 'samples' that are the basis for empirical probability estimates. We leverage the wealth of discharge data collected and disseminated by the US Geological Survey (USGS), using the flood ratio approach recently introduced by Villarini et al (2011). We also use USGS Instantaneous Data Archive (IDA) data from 1873 stations over the study region (see figure S1 (available at stacks.iop.org/ERL/8/044056/mmedia) for their location). For each station, we extract the largest instantaneous flood peak during the passage of Hurricane Ivan (15-24 September 2004). At each site, we also compute the 10-year flood peak value (90th percentile) from annual maximum instantaneous peak discharge data over the period 1989-2009. We focus only on the most recent period to limit the potential effects of human modifications of these catchments (e.g., Villarini and Smith 2010). We then take the ratio between the flood peak associated with Hurricane Ivan and the corresponding at-site 10-year flood peak. In this way, when this ratio has a value of '1', for instance, it indicates that Ivan caused a flood peak that was equal to the 10-year flood peak. Values larger (smaller) than '1' indicate flood peaks caused by Hurricane Ivan that are larger (smaller) than the 10-year flood peak. Recently, this approach was successfully used to examine flooding associated with predecessor rain events over the central United States (Rowe and Villarini 2013).
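The flood ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the synthetic discharge values are hypothetical, and the 10-year flood peak is taken here as NumPy's default (linearly interpolated) 90th percentile of the annual maxima, which may differ from the empirical estimator used in the study.

```python
import numpy as np

def ten_year_flood_peak(annual_max_peaks):
    """Empirical 10-year flood peak: 90th percentile of the annual maximum
    instantaneous peak discharges (assumed estimator: NumPy default)."""
    return float(np.percentile(np.asarray(annual_max_peaks, dtype=float), 90))

def flood_ratio(event_peak, annual_max_peaks):
    """Ratio of the event flood peak to the at-site 10-year flood peak."""
    return event_peak / ten_year_flood_peak(annual_max_peaks)

# Hypothetical station: 21 annual maxima (one per year, 1989-2009), arbitrary units.
annual_max = [100.0 + 10.0 * i for i in range(21)]   # 100, 110, ..., 300
q10 = ten_year_flood_peak(annual_max)                # 90th percentile of the sample
ratio = flood_ratio(2.0 * q10, annual_max)           # event peak twice the 10-year peak
print(q10, ratio)
```

A ratio above 1 marks a station where the event peak exceeded the at-site 10-year flood peak, matching the interpretation given in the text.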
The flood magnitude quantification results can then be mapped, as we do here for Hurricane Ivan in figure 1. Hurricane Ivan flooding was most extreme in western North Carolina, associated with orographic enhancement of precipitation along the eastern margin of the Appalachian Mountains. A second area with large flood peak values was in Pennsylvania, related to the interaction of Hurricane Ivan with an extratropical system and the associated extratropical transition of the storm, resulting in large rainfall accumulations several hundred km away from the center of circulation (see Hart and Evans (2001) for discussion of the importance of extratropical transition for heavy rainfall and flooding in the eastern United States). In both regions, peak discharge values were more than four times the corresponding 10-year flood peak value. As illustrated in figure 1, a single landfalling TC can produce extreme flooding over extensive areas of the eastern United States; the geographic distribution of flooding exhibits pronounced spatial coherence in flood magnitudes (which is linked to tracks of the storm and the organization of rainfall into bands distributed around the center of circulation; Villarini et al 2011). Accordingly, we have directly integrated the key hydrologic processes (e.g., rainfall, soil properties, antecedent soil moisture conditions) associated with flood damage parsimoniously into the flood peak ratio map. We next combine this newly calculated spatial structure of flood magnitude with the spatial structure of flood insurance losses based upon NFIP flood insurance claim observations across the 23 affected states.
Figure 1. Flood peaks for Hurricane Ivan (2004) normalized with respect to the at-site 10-year flood peak value. For instance, a value of 3 indicates that the flood peak for this event is three times as large as the corresponding 10-year flood peak. See supplemental figure 1 (available at stacks.iop.org/ERL/8/044056/mmedia) for the location of the available stations. The black line represents Hurricane Ivan's track. Spatial interpolation is performed by means of the inverse distance weighted method.
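As a rough illustration of the inverse distance weighted interpolation named in the figure 1 caption, the sketch below interpolates station flood ratios to arbitrary target points. The distance exponent (power = 2), the use of planar coordinates, and the function name are assumptions made for the example; the source does not state the settings actually used.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, target_xy, power=2.0):
    """Inverse distance weighted estimate at each target point.

    Weights are 1 / distance**power (power=2 is an assumed default).
    A target that coincides with a station takes that station's value.
    """
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Pairwise distances, shape (n_targets, n_stations).
    dist = np.linalg.norm(target_xy[:, None, :] - station_xy[None, :, :], axis=2)

    estimates = np.empty(len(target_xy))
    for i, row in enumerate(dist):
        exact = row == 0.0
        if exact.any():
            estimates[i] = station_values[exact].mean()
        else:
            weights = 1.0 / row ** power
            estimates[i] = np.sum(weights * station_values) / np.sum(weights)
    return estimates

# Two hypothetical stations with flood ratios 1.0 and 3.0.
stations = [(0.0, 0.0), (2.0, 0.0)]
ratios = [1.0, 3.0]
print(idw_interpolate(stations, ratios, [(1.0, 0.0), (0.0, 0.0)]))
```

The midpoint between the two stations receives equal weights and therefore the average of the two ratios, while a point on top of a station reproduces that station's ratio.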

Detailing Hurricane Ivan flood insurance losses
The federal National Flood Insurance Program is the primary source of residential flood insurance in the United States (Michel-Kerjan 2010, Michel-Kerjan and Kunreuther 2011) and we have access to its entire portfolio from 2000 to 2010 as well as individual policy claim data from 1978. Importantly, access to this dataset allows us to measure the relationship between the quantified inland flooding magnitudes from Hurricane Ivan that we have just described and the associated residential insured economic losses. Our analysis of the NFIP database reveals that Hurricane Ivan produced a total of 28 670 residential (single family, two to four family, and other residential) claims with $1.487 billion in total (building and content) damages. To provide some relative context to these values, this represented half of all flood insurance claims received by the federal government for the entire country for the full year of 2004. Over the period 1978-2003, the average annual number of paid claims for the entire country was 34 800. Financially speaking, the $1.487 billion in insured losses represented two-thirds of the total flood insurance payment made by the federal government in 2004. Flood losses related to Hurricane Ivan were higher than what the NFIP had ever paid before for an entire year and the entire country. Clearly, Hurricane Ivan provides a significant NFIP loss sample to use for this assessment.
As we are focused on analyzing inland riverine flood losses, we exclude all losses explicitly due to 'tidal water overflow' as classified by the NFIP (i.e., storm surge losses), resulting in a reduced set of 19 273 claims with $800.9 million in flood damages from this hurricane. Thus, 67% of the total residential NFIP flood insurance claims and 54% of the total residential NFIP flood damage from Hurricane Ivan were related to non-storm-surge flooding, or what we designate as inland riverine flooding losses. The overall and relative magnitude of the NFIP insured riverine flood losses from Hurricane Ivan further emphasizes the significance of a better understanding of TC-related inland flooding and associated economic losses.

Translation of Hurricane Ivan's quantified flood magnitudes into inland flood insurance losses
For our analysis we partition these 19 273 inland flood claims to the lowest geographic level identifiable in the NFIP dataset, the census tract, which also represents the level of integration for the flood ratio. Given Hurricane Ivan's occurrence in 2004, we utilize year 2000 census tract information. Corresponding to figure 1, across the 23 impacted states there are a total of 27 790 unique census tracts with a quantified flood ratio. Table S1 (available at stacks.iop.org/ERL/8/044056/mmedia) summarizes the flood ratio values by state, ranked by each state's mean quantified flood value across their associated impacted census tracts. The top five states by mean quantified flood values (Pennsylvania, New York, Delaware, Georgia, and West Virginia) are all rather distant from the storm's coastal landfall location in Alabama and are not in the proximity of the center of circulation. Further, while 93% of the total 27 790 unique census tracts impacted by Hurricane Ivan had a quantified flood ratio equal to or less than 1.0 (10-year flood peak value), there were nearly 2000 census tracts (7%) having a flood ratio value greater than 1.0. The top four states in terms of maximum quantified flood ratio values across their impacted census tracts (North Carolina, Pennsylvania, Tennessee, and Georgia) have the largest percentages of their total impacted census tracts with flood ratio values greater than 1.0, ranging from 8% to 41% of their impacted tracts.
We are most interested in understanding the relationship between these quantified flood intensities and inland flood losses. From the NFIP database we have complete census tract identification for 16 584 of these total 19 273 inland flood claims (86%) and $736.1 million of the $800.9 million total inland flood damages (92%). From these identifiable census tracts, a total of 1241 unique census tracts incurred at least one inland flood insurance claim. This represents 4.5% of the total 27 790 unique census tracts with a quantified flood magnitude. Figure 2(a) illustrates these 1241 census tracts with at least one flood insurance claim overlaid upon flood magnitudes (figure 1), while table S2 (available at stacks.iop.org/ERL/8/044056/mmedia) details the total claims by state ranked by each state's maximum quantified flood value across their associated impacted census tracts.
From figure 2(a), we see that the location of inland riverine flood claims from Hurricane Ivan is primarily concentrated in three main geographic areas: in Pennsylvania and southeastern Ohio; along the Appalachian Mountains in western North Carolina and northern Georgia; and along the coast near the landfall location in Alabama and Florida. As TCs typically bring large amounts of rainfall to coastal landfall locations in addition to strong winds and storm surge, the highlighted claim occurrences in Florida and Alabama are not unexpected. However, the other primary geographic areas incurring flood losses (Pennsylvania, Ohio, North Carolina, and Georgia) are inland locations that match well to the top states ranked by the mean and maximum quantified flood ratios (table S1 (available at stacks.iop.org/ERL/8/044056/mmedia) and figure 1). There is a clear relationship between the occurrence of large flood ratios and claims, as 98.5% of total claims are associated with states that have a maximum flood ratio value occurrence of 1.4 or greater in at least one particular census tract.
While the high-level geographic comparisons between figures 1 and 2(a) highlight the relatively close agreement between the flood peak ratio magnitude and insured inland flood losses, there are also geographic areas with flood peaks associated with Hurricane Ivan that are larger than the corresponding 10-year flood, but have no NFIP claims identified. For example, the states of Tennessee, New York, and South Carolina all have at least one census tract with a maximum quantified flood ratio value greater than 2.0, but no flood insurance claims from Hurricane Ivan occurring anywhere in these states (table S2 (available at stacks.iop.org/ERL/8/044056/mmedia)). As it is well documented that low market penetration rates are a chronic issue for the NFIP, especially in inland areas (Dixon et al 2006), we next account for the number of NFIP policies-in-force in those impacted areas. We examine the NFIP database to determine a market penetration rate per census tract, defined as the active number of NFIP flood insurance policies-in-force as of 31 December 2004 divided by the number of housing units from the 2000 census data. We find that 6940 census tracts of the total 27 790 unique census tracts with a quantified flood magnitude in the analysis (25%) do not have any active NFIP flood insurance policies and thus a 0% NFIP market penetration. Figure 2(b) highlights these tracts, with 80% of the 6940 zero market penetration tracts concentrated in the states of Georgia, Kentucky, New Jersey, New York, North Carolina, Ohio, Pennsylvania, and Tennessee. These particular results point to the very low market penetration in areas susceptible to inland flooding from landfalling TCs. For example, 498 of these 6940 tracts have a flood ratio greater than 1.0 and are nearly all located in the states of Pennsylvania, New York, and Tennessee. Moreover, figure 2(b) verifies that the very low (or nonexistent) NFIP market penetration is almost entirely a non-coastal problem.
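The market penetration rate defined above is a simple per-tract quotient. The following sketch shows the calculation and the share of zero-penetration tracts reported in the text; the function names and the handling of tracts with no housing units are assumptions made for illustration only.

```python
def market_penetration(policies_in_force, housing_units):
    """NFIP market penetration for one census tract: policies-in-force
    divided by housing units. Returns None when the tract has no housing
    units (an assumed convention; the source does not discuss this case)."""
    if housing_units <= 0:
        return None
    return policies_in_force / housing_units

def has_zero_penetration(policies_in_force):
    """True when a tract has no active NFIP policies-in-force."""
    return policies_in_force == 0

# Hypothetical tract: 5 policies-in-force across 1000 housing units.
print(market_penetration(5, 1000))

# Share of zero-penetration tracts, using the counts reported in the text.
zero_tracts, all_tracts = 6940, 27790
print(round(100.0 * zero_tracts / all_tracts))   # percentage of tracts
print(all_tracts - zero_tracts)                  # tracts with at least one policy
```

The last line recovers the number of tracts with at least one policy-in-force, which is the sample retained for the regression analysis.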

Empirical estimation of inland flood claims
Finally, to explicitly determine the statistical relationship between our NFIP inland flood insurance losses and inland flood intensities, we conduct an empirical analysis at the census tract level on the number of claims as a function of the quantified flood magnitude ratio. Here we also control for other relevant exposure factors, including the 2004 population per square mile, the number of flood insurance policies-in-force as of 31 December 2004, and the NFIP market penetration rate in each census tract. All else being equal, as these exposure factors increase one would expect a larger count of flood insurance claims. We create flood ratio dummy variables following from the overall distribution of our census tract flood ratio values that equate to bins of (0; [...] NFIP market penetration less than or equal to 1.0%, we create a low market penetration dummy variable (=1 if ≤1.0%) to include in the empirical model as opposed to the pure market penetration rate. Table 1 presents the results where we model the count of claims for all 20 850 census tracts with at least one NFIP policy-in-force, and further utilize a zero-inflated negative binomial (ZINB) regression model with robust standard errors to account for the large number of census tracts with zero claims incurred (94% of our total 20 850 census tracts), even with at least one NFIP policy-in-force. A ZINB specification allows for over-dispersion resulting from an excessive number of zeros by splitting the estimation process in two: (1) estimating a logit model to predict the probability that zero claims take place in a given tract; and (2) estimating a negative binomial (NB) model to predict the count of claims in a given tract (Kahn 2005, Long and Freese 2006). The Vuong test results comparing the ZINB to the non-zero-inflated NB specification indicate strong support for the ZINB over the NB.
Additional tests strongly support the choice of the ZINB model over zero-inflated Poisson, NB, and Poisson models.
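To make the two-part structure of the ZINB specification concrete, the sketch below evaluates a zero-inflated negative binomial probability mass function using only the Python standard library. It assumes the common NB2 parameterization (mean mu, dispersion alpha) and treats the logit part as supplying a probability pi_zero of an excess zero; the parameter values are arbitrary, and this is an illustration of the distribution rather than the estimation procedure used for table 1.

```python
import math

def nb_pmf(y, mu, alpha):
    """Negative binomial (NB2) probability of count y, with mean mu > 0
    and dispersion alpha > 0 (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, alpha, pi_zero):
    """Zero-inflated NB: with probability pi_zero the count is an excess
    zero; otherwise it is drawn from the NB count process."""
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, alpha)
    return pi_zero + count_part if y == 0 else count_part

def zinb_mean(mu, pi_zero):
    """Expected count under the zero-inflated model."""
    return (1.0 - pi_zero) * mu

# Arbitrary illustrative parameters: most tracts are excess zeros.
mu, alpha, pi_zero = 2.0, 1.5, 0.9
print(zinb_pmf(0, mu, alpha, pi_zero), zinb_mean(mu, pi_zero))
```

Under this formulation a zero count can arise either from the excess-zero process or from the NB count process itself, which is why the probability of zero under the ZINB exceeds that of the plain NB with the same mean and dispersion.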
The Wald test rejects the joint insignificance of our explanatory variables at the 1% level, with all individual variables in the model significant at the 1% level or less, including notably the flood ratio variables. This supports the validity of our results. The coefficient values on the flood ratio range dummies in comparison to the omitted (0; 0.1] range do in fact indicate an increasing count of claims as flood ratio values increase. For example, coefficient estimates for the (0.5; 0.75] and (1.5; 2.0] ranges indicate that the expected count of claims per tract increases by a factor of 3.62 and 8.93 respectively compared to the omitted (0; 0.1] range while holding all other variables constant. All of the zero-claim probability explanatory variables in the logit estimation (inflate portion of the model) have the correct expected sign and are statistically significant at less than the 1% level. Holding all other variables constant, the logit coefficient estimates indicate that having a higher flood ratio value along its continuum decreases the odds of a census tract experiencing a zero-claim observation by 99%, while a low NFIP market penetration increases the odds of experiencing a zero-claim observation by 77%.
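The factors and percentage changes quoted above follow from exponentiating model coefficients: in the count part, exp(coefficient) is the multiplicative change in the expected count, and in the logit part, exp(coefficient) - 1 is the proportional change in the odds. The short sketch below shows the conversion. The coefficients in it are back-calculated from the factors reported in the text purely for illustration; they are not values taken from table 1.

```python
import math

def rate_ratio(coef):
    """Multiplicative change in the expected count for a count-model coefficient."""
    return math.exp(coef)

def pct_change_in_odds(coef):
    """Percentage change in the odds implied by a logit coefficient."""
    return (math.exp(coef) - 1.0) * 100.0

# Coefficients back-calculated from the reported factors (illustrative only).
coef_bin = math.log(3.62)           # count part, (0.5; 0.75] bin versus (0; 0.1]
coef_flood_logit = math.log(0.01)   # logit part, a 99% decrease in the odds
coef_lowpen_logit = math.log(1.77)  # logit part, a 77% increase in the odds

print(round(rate_ratio(coef_bin), 2))
print(round(pct_change_in_odds(coef_flood_logit)))
print(round(pct_change_in_odds(coef_lowpen_logit)))
```

A negative logit coefficient therefore lowers the odds of a zero-claim tract, while a positive one raises them.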
As a straightforward illustration of how these empirical results can be utilized, we take the coefficient estimates from table 1 to predict point estimates of the expected count of claims by our binned flood ratio independent variables while holding all other explanatory variables at their mean values. Figure 3 presents these expected counts of claims per average census tract at the various levels of our binned flood ratio values. While moving from the (0.1; 0.25] flood ratio bin to the (0.25; 0.5] flood ratio bin increases the expected count of claims by 21%, moving from the (0.1; 0.25] flood ratio bin to the (0.75; 1.0] bin doubles the expected count of claims up to 0.28 per average impacted census tract. In fact, anywhere beyond the 10-year flood peak, expected counts of claims are 100% to 283% higher than the lowest illustrated (0.1; 0.25] flood ratio bin. Clearly then, we see a large increase in the predicted count of claims once the 10-year flood peak is achieved. This empirical analysis, in combination with the translation of the quantified flood ratios with inland flood losses as highlighted in figures 1 and 2, demonstrates and quantifies a clear connection between the number of claims and increasing flood ratios.

Implications
The presentation and analysis of the combination of spatial information on flood magnitudes and flood insurance claims is novel. Overall, the geographic and descriptive analyses presented here explicitly highlight that the damage associated with TCs is not limited to the coastal areas close to landfall, but affects substantial regions inland. These inland regions do not necessarily have to be along the TC path, but can be areas several hundred kilometers away from the center of circulation of the storm, and still be severely affected by its passage. This is clearly an important result that is not well known: as we show, many residents in these zones had not purchased flood insurance.
Most significantly, the new methodology proposed here allows us to quantify on a very large scale (23 states were studied here) the relationship between inland flooding peak magnitudes and incurred flood insurance claims. We show that 19 273 (67%) of the total residential flood insurance claims due to Hurricane Ivan were related to inland riverine flooding. We empirically demonstrate in a number of ways that our data-driven approach to quantify inland flood magnitude produces a very good representation of the number of non-storm-surge flood insurance claims experienced for each impacted geographic area. Specifically, for an impacted census tract and conditional on any flood insurance policies being in-force in the tract, the quantified census tract flood peak ratio we have introduced is found to be a key driver of the probability of any one claim occurring there, as well as a key driver of the total number of claims resulting, with the number of claims increasing as the flood ratio values increase. As such, our results provide the foundation for TC flood risk assessment across all impacted areas, not just coastal landfall locations. Notably, it is this type of inland risk assessment that is a highlighted priority for the National Weather Service (NWS), as evidenced by their September service assessment of Hurricane Irene (United States Department of Commerce 2011) where improvement on how the NWS 'communicates the risk of inland flooding and educate(s) the public, media, and emergency managers on that risk' was the number one overarching recommendation.
These results highlight the new technological capabilities that can lead to a better characterization and quantification of TC flood extent, magnitude, and related losses. Modeling studies indicate a projected increase in rainfall associated with TCs up to 20% over the 21st century (Knutson et al 2010, 2013), potentially exacerbating the risk of flooding from TCs in the future over large areas of the United States. These new capabilities require bridging the gap across disciplines, including hydrology, meteorology, economics and risk management, as we have done here. Moreover, in addition to the number of claims, future studies should examine economic damage associated with landfalling TCs. Applications of the present work are numerous. For example, federal, state and local authorities could better sensitize the people living inland who think that TCs affect only the residents of coastal areas. In addition, the flood peak ratio proxy could be calculated pre-landfall or relatively quickly thereafter and be used by emergency services, local, state and federal government agencies, and/or by insurers to forecast economic losses. If better flood loss assessment is realized, and results communicated to those living in these inland areas, then it is likely that more of them will be better protected financially because they will more fully understand that hurricanes are likely to impose major flood losses inland as well. As a result they will be able to get back on their feet more quickly after the next catastrophe, an important move toward greater natural disaster resilience (Michel-Kerjan 2012).