Reducing uncertainty of high-latitude ecosystem models through identification of key parameters

Climate change is having significant impacts on Earth’s ecosystems and carbon budgets, and in the Arctic may drive a shift from an historic carbon sink to a source. Large uncertainties in terrestrial biosphere models (TBMs) used to forecast Arctic changes demonstrate the challenges of determining the timing and extent of this possible switch. This spread in model predictions can limit the ability of TBMs to guide management and policy decisions. One of the most influential sources of model uncertainty is model parameterization. Parameter uncertainty results in part from a mismatch between available data in databases and model needs. We identify that mismatch


Introduction
As climate change causes temperatures in the Arctic to increase at between 1.7 and 2.8 times the rate of Northern Hemisphere warming [1], Arctic terrestrial ecosystems are undergoing rapid change.
This includes shifts in plant community structure and function [2], coupled with changes in soil and permafrost dynamics [3,4], which impact carbon (C) cycling in these ecosystems [5]. Depending on the scenario and timescale, vegetation changes may counterbalance, dampen, or enhance the microbial decomposition of soil carbon [6], likely causing a strong permafrost-carbon feedback to global climate [7][8][9] (figure 1). However, there is much uncertainty surrounding the possible transition of the Arctic from a carbon sink to a source [10,11]. Many processes contribute to the Arctic and boreal carbon budget, including photosynthesis, respiration, decomposition, wildfires, permafrost thaw, and hydrological processes [7,12-17], making it challenging to represent the full complexity of these ecosystems in a modeling context. As a result, model intercomparisons of high-latitude carbon cycling in the future indicate significant differences across models [6,18]. These uncertainties present challenges in determining which measures are needed to meet climate goals [19] and reinforce the importance of quantifying uncertainty in terrestrial biosphere models (TBMs).
Figure 1. Example of a negative feedback loop in a causal loop diagram, including local temperature, greenhouse gas concentration and tundra GPP (gross primary productivity). The annual mean atmospheric CO2 concentration measured at Mauna Loa Observatory, the annual mean temperature in the Arctic region (defined as north of the Arctic circle) from the ERA5 dataset and the yearly maximum of the tundra GPP based on the Global Monthly GPP from an Improved Light Use Efficiency Model have shown a positive trend over the last decades [20][21][22].
Model uncertainty can be categorized as uncertainty in (1) the system structure and function [19,23], (2) model parameters [23,24], (3) model initial conditions [25,26], (4) external forcing data (i.e. climate drivers and atmospheric data) [27,28], and (5) lack of accurate validation/verification data [29]. Because the configuration of model parameters is one of the most influential sources of model uncertainty [23,30], reducing parameter uncertainty is an impactful step towards a better representation of high-latitude ecosystems. However, parameter uncertainty is not independent of structural uncertainty, as adding processes (and thus parameters) to reduce structural uncertainty introduces a higher cumulative parameter uncertainty [23]. Focusing purely on the reduction of parameter uncertainty can thus introduce a bias towards simpler models and ignore the high sensitivity that non-linear models have to their formulation [26,31]. Therefore, the goals of this study are to work towards a better understanding of the full range of tundra and boreal ecosystem model uncertainty in terms of both (1) model structure and function and (2) model parameter uncertainty, and how they relate to one another.
High-latitude ecosystems have several feedback mechanisms that affect their dynamics (see example in figure 1), many of which lack a quantitative description and remain excluded from state-of-the-art Earth system models [32], and only a few models specifically include Arctic and boreal processes [33]. Some of these mechanisms are capable of triggering regime shifts [34] that are re-shaping the high-latitude landscape and can have large impacts on global climate [35]. These shifts, like bogs turning into peatlands, changing fire regimes, and shrubs gaining dominance in the tundra through shrubification, have been studied individually, but regimes in the high latitudes are closely linked [36,37] and their interactions are important to consider as they might have the potential to cause domino effects [38]. The uncertainty introduced by this incomplete description of the ecosystem dynamics makes it difficult to predict whether and when tipping points are reached [19]. It is important to determine which additional processes should be included in TBMs to further reduce model uncertainty without introducing an excessively computationally complex model that results in expensive simulations and greater parameter uncertainty [23]. Therefore, both sources of uncertainty, parameter-based and structural, should be considered when evaluating ecosystem models.
Model parameters are configuration settings internal to the model. Their values can be estimated from data collected in the field, through empirical experimental approaches in laboratory settings, or through model-data assimilation approaches. TBM parameters describe structural, biochemical, and functional traits of vegetation and the ecosystem, such as plant structure and dimensions and soil microbial activity, and can thus be highly site-specific. Additionally, some model parameters represent properties that are not easily measured and are thus difficult to inform with measurements. Observational data needed to parameterize TBMs at high latitudes can be sparse and are often clustered in a few accessible areas [19]. Additionally, there is often a disconnect between parameters used by models and available measurements. Field campaigns are typically designed to answer a specific set of questions regarding the behavior of the environment, and the data collected do not necessarily conform to the data needs of models, which are not typically considered at the start of field campaigns. Thus, targeted measurements designed to resolve model uncertainties may be an efficient way to refine TBMs.
To help identify the interactions of unmodeled feedback mechanisms and how they tie in with model parameters, we construct a causal loop diagram (CLD), a technique commonly used in dynamic systems modeling [39][40][41]. The CLD is a qualitative description of the ecosystem as a complex network and as such can include processes that cannot be represented in quantitative models, allowing it to be less biased. By bringing several feedback loops into one description, we can identify which processes become reinforced through the interactions.
To identify gaps between data needs for model parameterization and data availability, we query several recently developed databases. We then match available data to model parameter needs for three TBMs of differing complexities in their representation of high-latitude ecosystems. Finally, to integrate the information obtained from the CLD with that from our database query, we take the processes that are identified in the CLD as being important to represent and match them to database parameters as another indicator for which parameters need additional data. This measure takes structural uncertainty into account and, combined with the data availability analysis, provides a less biased estimate of which parameters should be measured.

Building a CLD
To assess structural uncertainty for the three models, we developed a description of high-latitude terrestrial ecosystem dynamics to evaluate which processes might become reinforced through feedbacks and are thus important to represent in models. We created a CLD that builds on a body of research and includes the most relevant feedback processes for Arctic and boreal ecosystems.
A CLD is a qualitative description of a system and its interactions in the form of a network [39], and thus lends itself to an unbiased evaluation of whether a process is well quantified. As a first step, the network is constrained through boundaries and interfaces with surroundings, and key actors of the system are identified. We chose to describe the terrestrial Arctic and boreal region without direct human influence or global drivers other than increases in greenhouse gas concentrations and local temperature. Variables of the system are represented as nodes. These include easily observed variables that are good indicators of change in the system, such as tundra and boreal vegetation productivity in terms of gross primary productivity (GPP), permafrost volume in mineral lowlands, organic lowlands and upland soils, and less visible but important variables of the system, such as summer and winter soil insulation. The impact of one of these nodes on another is represented by a directional arrow between them, also called an edge. These edges, describing the process through which the first node affects the other, are denoted with a positive or negative sign, depending on whether a change in the first node will induce a change in the second node in either the same or the opposite direction, respectively [39].
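The node, edge, and sign conventions described above can be sketched as a small signed directed graph. This is a minimal illustration, not the authors' implementation: the class name, method names, and the edge signs chosen for the three figure 1 nodes are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CausalLoopDiagram:
    """Signed directed graph: edges maps (source, target) to +1 or -1."""
    edges: dict = field(default_factory=dict)

    def add_edge(self, source, target, sign):
        # +1: target changes in the same direction as source; -1: opposite.
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        self.edges[(source, target)] = sign

    def nodes(self):
        # Every node that appears as a source or target of some edge.
        return {node for edge in self.edges for node in edge}

    def loop_polarity(self, cycle):
        """Multiply edge signs around a closed path; -1 marks a negative loop."""
        polarity = 1
        for source, target in zip(cycle, cycle[1:] + cycle[:1]):
            polarity *= self.edges[(source, target)]
        return polarity


# Illustrative signs only (assumed, not read off the paper's diagram).
cld = CausalLoopDiagram()
cld.add_edge("greenhouse gas concentration", "local temperature", +1)
cld.add_edge("local temperature", "tundra GPP", +1)
cld.add_edge("tundra GPP", "greenhouse gas concentration", -1)

loop = ["greenhouse gas concentration", "local temperature", "tundra GPP"]
polarity = cld.loop_polarity(loop)
print(polarity)
```

A loop whose edge signs multiply to -1 dampens an initial perturbation, which is what makes it a negative feedback loop in the sense of figure 1.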
Once a CLD of a system has been defined, various methods can be applied to analyze the system, one of which is static network analysis. Static network analysis aims to gain insight into the internal feedback mechanisms of real-life systems by analyzing structures and motifs within the causality network of variables. Motifs are small-scale substructures within interaction networks, which can be important for the overall functioning of a large-scale system and have as such been called the 'building blocks of complex networks' [42,43]. Some examples where motifs are decisive microstructures in larger networks are social networks, gene transcription networks, and networks in ecology [44,45]. One of the most influential microstructures for overall system behavior is the feed forward loop (FFL) (figure 2(a)), a structure in which two nodes are linked through both a direct and an indirect process via one intermediary node [41,46,47]. The number of FFLs targeting a node can serve as an appropriate measure of its vulnerability to change. Another relevant motif is the secondary FFL (figure 2(a)), where the indirect process passes through two, instead of one, intermediary nodes. While the impact of single FFLs is generally more significant, the remaining ambiguity in the definition of the CLD makes the distinctions between motifs somewhat arbitrary. Therefore, we treat both motifs as one topological category (referred to as FFLs in the following) instead of unique features. This accounts for some of the ambiguous choices made in the construction of the CLD, such as whether a process is represented by one or two nodes. Through motifs, we can identify processes that become reinforced or act reinforcing and are thus more likely to cause substantial changes in model outputs. That way, we identify the processes that should be represented in high-latitude TBMs to lower the possibility of high structural model uncertainty.
Table 1. Overview of the elements modeled in the three terrestrial biosphere models [56,57]. Elements listed: number of parameters investigated, frozen soil, wildfire disturbance, and number of carbon pools. While TEM has two main carbon pools (soil and vegetation), they are further subdivided into leaf, wood, and root vegetation carbon, and chemically resistant soil carbon, physically resistant soil carbon, active soil carbon and coarse plant material for each soil layer, and a coarse woody debris carbon pool [58]. We investigated parameters with ecological significance beyond functional requirements for the model run, while excluding initial parameters.
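The FFL motifs described in this section (a direct edge from A to B accompanied by an indirect path through one or two intermediary nodes) can be counted with a short graph traversal. The sketch below is one possible reading under stated assumptions: edge signs are ignored, primary and secondary FFLs are pooled into one count as the text suggests, and the node names and edges are a toy example rather than the study's CLD.

```python
def count_ffls(edges):
    """Count pooled primary and secondary FFLs for each direct edge (A, B)."""
    edges = list(edges)
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    def out(node):
        return successors.get(node, set())

    counts = {}
    for a, b in edges:
        total = 0
        for c in out(a):
            if c in (a, b):
                continue
            if b in out(c):
                total += 1  # primary FFL: A -> C -> B alongside A -> B
            for d in out(c):
                if d not in (a, b, c) and b in out(d):
                    total += 1  # secondary FFL: A -> C -> D -> B
        counts[(a, b)] = total
    return counts


def ffls_targeting(counts):
    """Sum FFL counts over all direct edges that end in the same node."""
    per_node = {}
    for (_, target), n in counts.items():
        per_node[target] = per_node.get(target, 0) + n
    return per_node


# Toy edges (hypothetical); only the first three mirror an example in the text.
toy_edges = [
    ("local temperature", "microbial activity"),
    ("local temperature", "soil moisture"),
    ("soil moisture", "microbial activity"),
    ("local temperature", "GPP"),
    ("GPP", "nitrogen availability"),
    ("nitrogen availability", "microbial activity"),
]
ffl_counts = count_ffls(toy_edges)
targeted = ffls_targeting(ffl_counts)
print(ffl_counts[("local temperature", "microbial activity")])
```

In this toy network the edge from local temperature to microbial activity carries one primary FFL (via soil moisture) and one secondary FFL (via GPP and nitrogen availability).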

Models and databases
We identify the disconnect between available measurements and data needed for model parameterization for three TBMs of differing complexity that have been applied in high latitudes, including (1) TEM, (2) SIPNET, and (3) ED2. The models were chosen to investigate the trade-off between simplicity and complexity (see table 1). The three models are integrated into the Predictive Ecosystem Analyzer (PEcAn) framework, facilitating a future model intercomparison based on ensemble modeling. PEcAn standardizes the workflow for analyzing model parameter sensitivity and uncertainty, which allows for comparability across models. It further treats parameter values as probability distributions based on available measurements that are included in its associated database (Biofuel Ecophysiological Traits and Yield database, BETY), highlighting the need to examine model parameter data availability [48][49][50]. TEM was developed for high-latitude ecosystems, and the version used in this study, DVM-DOS-TEM, includes both permafrost soils and disturbance through wildfires [51]. Additionally, it represents tundra and boreal communities in detail by subdividing each community (e.g. tussock tundra, heath tundra, wet sedge tundra, white spruce forest, black spruce forest, deciduous forest, bog, fen) into up to ten plant functional types (PFTs). These PFTs, such as sedges, grasses, evergreen shrubs, and feathermoss, have different model parameterizations depending on the community in which they grow and compete for light, water, and nutrients, providing a nuanced picture of high-latitude vegetation dynamics. As a result, TEM has been widely applied in modeling studies of Arctic and boreal ecosystems [24,52]. SIPNET (version r136, https://github.com/PecanProject/sipnet) is a box model and represents a lower complexity model in the comparison. However, it does include frozen soil characteristics, can be applied to communities underlain by permafrost, and has been applied to subalpine ecosystems [53].
In SIPNET, three carbon pools are used to model carbon dynamics: a soil carbon pool and two vegetation pools (wood and leaf carbon pools) [53]. The model ED2 (v.2.2.0) has been applied to Arctic ecosystems as well [54,55]. It is more complex than SIPNET and has more parameters than TEM; however, it is less high-latitude specific than TEM and has a less detailed description of tundra and boreal communities (see https://github.com/EDmodel/ED2).
Although all three models can be applied to describe the same ecosystems and plant communities, they each depend on different sets of parameters. Therefore, data availability in ecological databases differs between models. We evaluated each model to identify the included parameters and looked for their equivalents in four different databases. The four investigated databases for assessing data availability are (1) TRY (TRY Plant Trait Database) [59], www.try-db.org/, (2) BETY (Biofuel Ecophysiological Traits and Yield database) [60], www.betydb.org/, (3) NGEE (Next-Generation Ecosystem Experiments Arctic data catalogue), https://ngee-Arctic.ornl.gov/data/, and (4) FRED (Fine Root Ecology Database) [61], https://roots.ornl.gov/. The model parameters are matched to their respective equivalents in each database, and the number of measurements north of 60 degrees is recorded. While measurements north of 66.3 degrees are more relevant to high-latitude ecosystem models, the scarcity of field data overall makes measurements between 60 and 66.3 degrees still useful for model parameterization when the plant communities correspond.
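The matching and counting step can be pictured as a filter over database records. The sketch below assumes a simplified record layout (a trait name and a latitude per record) and a hand-built trait-to-parameter mapping; neither reflects the actual schema of TRY, BETY, NGEE, or FRED, and all names are hypothetical.

```python
from collections import Counter


def count_high_latitude(records, trait_to_parameter, min_latitude=60.0):
    """Count database records per model parameter at or north of min_latitude."""
    counts = Counter()
    for record in records:
        parameter = trait_to_parameter.get(record["trait"])
        if parameter is None:
            continue  # trait has no equivalent model parameter
        if record["latitude"] >= min_latitude:
            counts[parameter] += 1
    return counts


# Hypothetical records and names, for illustration only.
records = [
    {"trait": "SLA", "latitude": 68.6},
    {"trait": "SLA", "latitude": 61.2},
    {"trait": "SLA", "latitude": 45.0},         # too far south, ignored
    {"trait": "root_depth", "latitude": 70.1},  # no matching parameter
]
trait_to_parameter = {"SLA": "specific_leaf_area"}
high_latitude_counts = count_high_latitude(records, trait_to_parameter)
print(high_latitude_counts)
```

Raising `min_latitude` to 66.3 would restrict the count to the stricter band mentioned in the text.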

CLD evaluation of model parameters
The mismatch between data needs and availability provides a first approximation of which parameters should be targeted for additional measurements, although this approximation may be biased towards simple models with fewer parameters. To refine these results and represent the influence of structural uncertainty on parameter uncertainty, we include the CLD as an analysis tool. This inclusion permits us to identify parameters that should be targeted for additional measurements to reduce model uncertainty induced by parameter and structural uncertainty.
Parameters in the models usually describe the form and strength of a process, and thus can be mapped to an edge between two nodes (A and B) in the CLD. This creates logical groups of parameters that are mapped to the same edge (parameter groups in the following), which together describe the interaction between A and B. In some cases, we need to generalize edges to fit the model parameters. Because parameters themselves do not distinguish between the ecotypes, and the modeled plant community is instead chosen when selecting a site, we merge edges that include boreal and tundra GPP into edges for GPP. Similarly, we introduce the collective terms soil moisture, microbial activity, and permafrost volume, as only location determines which of the more specific terms is applicable. The number of motifs is then averaged over the number of edges in the group. With functional parameter groups across models, we can now apply the causal loop analysis to the models.
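The generalization of ecotype-specific edges and the averaging of motif counts can be sketched as follows. The mapping to collective terms and the FFL counts are invented for the example; only the idea of averaging over the merged edges comes from the text.

```python
from collections import defaultdict

# Assumed mapping from ecotype-specific CLD nodes to collective terms.
COLLECTIVE = {"tundra GPP": "GPP", "boreal GPP": "GPP"}


def generalize(edge):
    """Replace ecotype-specific node names by their collective term."""
    source, target = edge
    return (COLLECTIVE.get(source, source), COLLECTIVE.get(target, target))


def average_ffls(ffls_per_edge):
    """Average FFL counts over all specific edges merged into one general edge."""
    grouped = defaultdict(list)
    for edge, n_ffl in ffls_per_edge.items():
        grouped[generalize(edge)].append(n_ffl)
    return {edge: sum(values) / len(values) for edge, values in grouped.items()}


# Hypothetical FFL counts per specific edge.
ffls_per_edge = {
    ("tundra GPP", "leaf area index"): 2,
    ("boreal GPP", "leaf area index"): 4,
    ("local temperature", "microbial activity"): 5,
}
averaged = average_ffls(ffls_per_edge)
print(averaged)
```

Edges that are not generalized pass through unchanged, so the same function can be applied to every parameter group.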
For the causal loop analysis, we identify substructures surrounding the edges between A and B in the CLD. We can then apply network theory to the parameters in the model that have been mapped to that edge. We specifically look for FFLs (figure 2(a)), where node A impacts node B not only through the direct connection that the parameter describes, but also through additional processes. Some of these indirect connections from A to B over C, and from A to B over D and E, might not be explicitly modeled but might be implicitly included if the direct connection is well parameterized by field data. For example, local temperature (FFL node A) impacts microbial activity directly (FFL node B), and additionally indirectly through its effects on soil moisture (FFL node C). Under realistic conditions, the impact of the direct connection can be difficult to isolate from that of indirect connections, creating an effective parameter (figure 2(b)). Models with such data-informed parameters might be able to emulate more complex ecosystem models (emulator models) by implicitly including the indirect processes described by the FFLs [62]. However, without sufficient field measurements, model parameter groups with a high number of FFLs could induce more uncertainty than parameters without additional, unmodeled feedbacks, since they may carry more structural uncertainty. Therefore, we aim to identify the edges from A to B with a high number of FFLs over C, D, and E. We suggest that the parameters in the models that describe those edges be well parameterized, as this would result in the highest potential to decrease model uncertainty in terms of both parameter and structural-based uncertainty.
To assess the level of parameterization for functional processes and quantify their need for additional field data based on availability and vulnerability through reinforcing microstructures, we introduce the Parameterization-factor (Pa-factor). For each parameter group (parameters mapped to the same edge), we calculate the average number of measurements per parameter in the group, $N_M$, where parameters shared across models are counted once. To compare it with the number of FFLs for that edge, $N_{\mathrm{FFL}}$, we normalize $N_M$ and $N_{\mathrm{FFL}}$ to a scale of [0, 1] and create a scatter plot. For generalized edges, we average the number of FFLs over the number of edges. The Pa-factor is defined as $\mathrm{Pa} = (N_{\mathrm{FFL}} - N_M)/\sqrt{2}$ and describes the signed distance to the x = y diagonal. It is therefore a measure of the relationship between parameter vulnerability and the level of data availability. A high positive Pa-factor indicates a high number of FFLs in relation to a low number of measurements, and thus a high need for additional measurements. Through both methods combined, that is, the identification and use of parameters from the databases and analysis of the CLD, we identify which model parameters are likely to play a key role in model structure and dynamics and should be prioritized for further measurements to aid in better constraining model uncertainty.
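The Pa-factor calculation can be sketched numerically. The text describes it as the signed distance to the x = y diagonal, positive when FFLs are many and measurements few; the sketch reads this as (N_FFL - N_M)/sqrt(2) on the normalized values. The min-max normalization and the toy inputs are assumptions, since the text only states that both quantities are scaled to [0, 1].

```python
import math


def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros (assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pa_factors(groups):
    """groups maps edge -> (mean measurements per parameter, number of FFLs)."""
    edges = list(groups)
    n_m = min_max([groups[e][0] for e in edges])
    n_ffl = min_max([groups[e][1] for e in edges])
    # Signed distance of the point (N_M, N_FFL) from the diagonal N_M = N_FFL.
    return {e: (f - m) / math.sqrt(2) for e, m, f in zip(edges, n_m, n_ffl)}


# Hypothetical parameter groups, not values from the study.
toy_groups = {
    "many FFLs, no data": (0.0, 10.0),
    "no FFLs, much data": (100.0, 0.0),
    "balanced": (50.0, 5.0),
}
pa = pa_factors(toy_groups)
ranked = sorted(pa, key=pa.get, reverse=True)
print(ranked)
```

Sorting by descending Pa-factor gives a priority list in which groups with many FFLs and few measurements come first.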

CLD analysis
We created a CLD as described in section 2.1 based on a body of research and discussion among the coauthors. After a review by several experts (see supplementary material for methodology), we applied it as an analysis tool to further inform which parameters should be targeted. The final diagram consists of 30 nodes, which are connected via 127 edges, 71 of which represent positive and 56 of which represent negative interactions (figure 3, supporting file 2). We search for nodes that are targeted by a high number of FFLs, as they may be particularly vulnerable and/or influential in the system [41,45]. We define 'vulnerable' as a parameter that is likely to experience disproportionately large impacts due to changes in the system, and 'influential' as a parameter that is likely to cause large changes in the system. We identify connections linking the node microbial activity to various other nodes, including CO2 production, tundra GPP, boreal GPP, local temperature, bog and pond area, phosphorus availability, and fires (annually burned area). Processes related to these nodes may be important but may not be included in high-latitude TBMs. Therefore, parameters that include any of these processes in their effective form (figure 2(b)) should be well informed by data to represent the ecosystem as well as possible with the existing model structure.

Database analysis
While the majority of parameters differs across the three models, there is some overlap (figure 4, supporting file 1). We included 90 parameters from TEM, the majority of which describe the impact of microbes and the respective biomes on carbon and nitrogen cycling and storage. The 80 parameters in ED2 include a large focus on albedo processes, whereas many of the 71 parameters in SIPNET describe the impacts of temperature on different parts of the system. TEM shares a lower percentage of its parameters than ED2 and SIPNET, which is a result of the more high-latitude specific set of parameters and the fact that the model specifically parameterizes each PFT in each community type (as described in the Methods).
Our queries to the TRY, NGEE, BETY and FRED databases for measurements that can be used to parameterize the three models revealed that the TRY, NGEE and BETY databases include numerous observations for each model (figure 5). The observations in the root-specific FRED database are more specialized than the parameters representing soil processes in the three models, and are therefore less applicable. While there were many observations in the TRY, NGEE, and BETY databases that matched the model parameters, these are typically related to only a few of the model parameters. Considering the field data available from the four databases, only 25.4% of considered parameters in SIPNET include at least one observation from field data, 17.5% of parameters in ED2, and 16.7% in TEM (supporting file 1).
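The coverage percentages above are the share of parameters with at least one matching observation. A minimal sketch of that calculation follows; the parameter names are hypothetical, and the covered-parameter counts in the consistency check (18, 14, 15) are inferred here from the reported percentages and parameter totals rather than stated in the text.

```python
def coverage_percent(measurement_counts):
    """Percent of parameters that have at least one observation."""
    if not measurement_counts:
        return 0.0
    covered = sum(1 for n in measurement_counts.values() if n > 0)
    return 100.0 * covered / len(measurement_counts)


# Hypothetical parameters: one of four has field observations.
toy_counts = {"specific_leaf_area": 120, "soil_q10": 0, "base_resp": 0, "albedo": 0}
toy_coverage = coverage_percent(toy_counts)

# Consistency check with inferred counts (an assumption, see lead-in):
# (covered parameters, total parameters investigated) per model.
inferred = {"SIPNET": (18, 71), "ED2": (14, 80), "TEM": (15, 90)}
reported = {model: round(100.0 * k / n, 1) for model, (k, n) in inferred.items()}
print(toy_coverage, reported)
```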

Combined analysis
We map the model parameters to edges in the CLD as described in section 2.3. This creates 25 categories based on functional traits, each of which contains between one and 24 parameters. We can then apply the results from the static network analysis to the parameter groups, by assessing the number of FFLs connected to the individual parameters. When an edge has been generalized to account for ambiguities in the biome, the number of FFLs is averaged. This provides an estimate of how vulnerable or influential the parameters are in terms of the CLD. TEM parameters have lower numbers of FFLs per parameter than ED2 and SIPNET (figure 6).
Both methods, the data availability and FFL analyses, have a bias towards simple or complex models, respectively. Bringing them together into one assessment of model parameterization can balance the biases and account for both structural and parameter uncertainty. We therefore compare the number of FFLs for a parameter group with the number of measurements, which is averaged over the number of parameters in each group.
When plotting the resulting normalized number of measurements ($N_M$) against the number of FFLs ($N_{\mathrm{FFL}}$), we observe a distinct L-shape (compare figure 7(a)). This again shows not only the low number of parameters with significant numbers of measurements, but also how these parameters tend to describe the same process, indicating poorly constrained models. Only four edges have a high $N_M$, whereas more than four parameters have a high number of measurements.
Based on $N_M$ and $N_{\mathrm{FFL}}$, we calculate the Pa-factor for each edge. The values and corresponding edges are shown in figure 7(b). The edge with the highest Pa-factor, and thus also the highest need for additional field measurements, points from 'Local temperature' to 'Microbial activity'. This edge contains eight parameters: the base microbial activity, the microbe Q10, soil Q10, soil Q10 when soil temperature is below a certain temperature, and E0 and T0 in the Lloyd-Taylor soil respiration function, which are all SIPNET parameters; the rate of increase of heterotrophic respiration with increasing temperature, which is an ED2 parameter; and the TEM parameter heterotrophic respiration Q10. The impact of local temperatures on the rate of microbial activity is not commonly measured, but is highly influential in high-latitude ecosystems and needs to be modeled accurately.
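To make the role of these temperature parameters concrete, the sketch below shows the generic textbook forms of a Q10 response and the Lloyd-Taylor respiration function. It is not taken from SIPNET, ED2, or TEM: the function names, the reference temperatures, and the default E0 and T0 values (the commonly quoted constants for the Lloyd-Taylor function) are assumptions, and each model's actual formulation and defaults may differ.

```python
import math


def q10_rate(base_rate, q10, temp_c, ref_temp_c=0.0):
    """Rate that multiplies by q10 for every 10 degree C rise above ref_temp_c."""
    return base_rate * q10 ** ((temp_c - ref_temp_c) / 10.0)


def lloyd_taylor(rate_at_ref, temp_k, e0=308.56, t0=227.13, ref_temp_k=283.15):
    """Lloyd-Taylor form: temperature sensitivity itself declines with warming.

    e0 (K) and t0 (K) are the fitted parameters named in the text; the default
    values here are assumed, not read from any of the three models.
    """
    return rate_at_ref * math.exp(e0 * (1.0 / (ref_temp_k - t0) - 1.0 / (temp_k - t0)))


# Toy comparison: a rate of 1.0 at the reference temperature, then 10 K warmer.
q10_warm = q10_rate(1.0, 2.0, temp_c=10.0, ref_temp_c=0.0)
lt_ref = lloyd_taylor(1.0, temp_k=283.15)
lt_warm = lloyd_taylor(1.0, temp_k=293.15)
print(q10_warm, lt_ref, lt_warm)
```

Both forms collapse a range of soil processes into one or two numbers, which is why field measurements of the temperature response of microbial activity constrain them so directly.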
The other outstanding edges go from 'GPP' to 'Nitrogen availability', and from 'GPP' to 'Leaf area index'. The first edge contains 19 parameters, 12 of which are TEM parameters, four ED2 parameters, and three SIPNET parameters. The second edge contains six TEM parameters, two SIPNET parameters, and two ED2 parameters. These parameters are well defined, but do not have any FFLs that make them particularly vulnerable or influential. For example, specific leaf area, a parameter that is shared across all three models, and the percentage of carbon and carbon nitrogen ratios in leaves and leaf-litter, are exceptionally well-sampled.

Overview
This study presents an analysis of the model parameters from three models, TEM, SIPNET, and ED2, and how they differ from each other. This can provide a foundation to help understand differences in model outputs when performing ensemble modeling and model intercomparisons [18]. The CLD analysis helped us gain insight into model process and structural uncertainty, highlighting the advantage of more complex models due to their lower number of FFLs per parameter. The database analysis uncovered a mismatch between data needs and availability and highlighted the advantage of simpler models, as a higher percentage of their parameters have measurements. The Pa-factor helps bring these two perspectives together and provides a measure to prioritize model parameters for data acquisition.

Model process and structural uncertainty: implications
Through the CLD analysis we have identified several nodes that should be candidates for inclusion in TBMs. Some of these, such as bog and pond area, phosphorus availability, and fire regime (annual burned area), are currently not represented in all TBMs applied in high latitudes. The impact of different types of waterbodies on soil insulation, erosion and methane production has been identified [33,63-69], as well as the impact of fire frequency and intensity [12,70,71] on high-latitude carbon storage. None of the three models includes aquatic dynamics, and SIPNET does not include fire dynamics [72]. Phosphorus in high-latitude soils, which has been shown to co-limit vegetation growth with nitrogen [73], is not included in any of the models selected for this analysis. Additionally, boreal and tundra GPP are among the nodes with the highest number of FFLs, which suggests that a more complex model description may be beneficial. This is supported by the finding that the least complex model SIPNET has the highest number of FFLs per parameter, and the less high-latitude specific model ED2 ranks in the middle. Other nodes, such as beaver and herbivore populations, might be important based on the literature [74,75], but have not become reinforced through FFLs or loops in the CLD (figure S1).
Figure 4. Schematic representation of the distribution of parameters across models. The parameters are separated into above- and belowground temperature, and nutrient and carbon related groups, and parameters that link above- and belowground processes. The miscellaneous category includes parameters such as the maximum rate of dew formation, parameters for soil decay, and parameters for aerodynamic resistance.
Overall, the results of the CLD analysis highlight the benefits of more complex models, as more feed forward processes are explicitly modeled. Additionally, they suggest that parameters related to relevant processes be well informed by data to represent the ecosystem as well as possible within the existing model structure.
However, there are some limitations that need to be considered. The list of parameters with a high number of FFLs, which are identified as potentially vulnerable, might not be exhaustive, as nodes with few direct connections (such as permafrost) have fewer motifs and may not be captured by the network analysis. We therefore do not identify processes that are not important in the system, but ones that are becoming reinforced. Additionally, the definition of the CLD is ambiguous, and with the boundaries set to the outlines of the terrestrial Arctic and boreal region, not all interactions could be captured in the diagram. Proximity to sea ice [76], resulting changes in precipitation patterns and moisture stress [77], and incoming deciduous vegetation from the south [78] are therefore not represented in the CLD. Lastly, many lateral hydrological processes, such as loss of dissolved organic carbon through stream and river transport, are not represented in either the terrestrial ecosystem models in this study or the CLD. While they can strongly influence the overall carbon budget [14,16,17], their representation in terrestrial ecosystem models cannot be improved through better constraining model parameters. Therefore, a separate study is needed to evaluate how the omission of these processes affects the carbon budget in terrestrial Arctic and boreal ecosystem models.

Parameter uncertainty and data informed parameters: implications
Identifying data availability and needs for each model is an important starting point for assessing model parameterization. We found that in the queried databases, observations are limited and focused on a few parameters, like specific leaf area and leaf carbon. Additionally, these data are often documented without a common standard [79], making it challenging to fully incorporate existing data. As a result, many parameter groups have no available measurements, and further distinction between desirable and crucial measurements is necessary. As expected, the simplest model SIPNET has the highest percentage (25.4%) of field-constrained parameters, since the less specialized parameters are most easily collected in the field. TEM, which has been most specifically developed for high-latitude ecosystems, only has data available for 16.7% of its parameters. ED2 ranks similarly (17.5%). This highlights the benefits of model simplicity but does not consider whether the parameters with missing data are possible or feasible to measure. Similarly, other parameters might be observed in laboratory or remote sensing approaches that have not been included in the queried databases. Existing remote sensing datasets [80,81] and planned upcoming satellite missions [82] may provide spatially and temporally rich information on key plant traits needed for ecosystem models through novel retrieval approaches [83].
Although this analysis identifies parameters without observational data that should be targeted in field efforts to improve model precision, the resulting selection is not narrow enough for such targeting to be feasible. Additionally, the data availability analysis suggests that parameters of more complex models are less well constrained and could carry more uncertainty. This can introduce a bias towards less complex models, as this analysis does not consider structural uncertainty, which plays a large role in whether observational data will be available. Therefore, an analysis is needed that helps refine which parameters should be targeted for additional observations while accounting for structural uncertainty.
A common method to assess parameter uncertainty is ensemble modeling, which can be facilitated through platforms such as PEcAn [23,50]. Previous studies have found that parameters describing short-term ecosystem processes induce large model uncertainties in ED2 [84], and that TEM is especially sensitive to parameters describing the temperature dependence of photosynthesis [24]. However, model uncertainty analyses through ensemble modeling are rarely combined with a conceptual approach to address structural uncertainty.

Pa-factor and parameter prioritization
The results above illustrate the trade-off between complexity and simplicity and show that decisions on future data acquisition should not be based solely on a data availability analysis. Taking the structural uncertainty assessed in the CLD analysis into account helps balance the bias from the database analysis, because the CLD analysis is instead biased towards more complex and specific models. Combining both into the Pa-factor supports decisions on future data acquisition and provides a ranking of how necessary additional measurements are for a given parameter group. We find that for modeling purposes, field efforts should focus less on specific leaf area and C:N ratio measurements because these are already well represented in the database and do not impact vulnerability. Instead, parameters describing the impact of local temperature on microbial activity should be prioritized.
However, while most parameters could be matched to the CLD, some were too specific and did not fit any process. These 61 parameters, most of which have no measurements (supporting file 1), are thus captured by the database analysis but not by the FFL analysis. Additionally, even though emulator models with data-informed parameters can produce results very similar to those of more complex models, they cannot always compensate for a lack of structural accuracy. If a process is modeled fundamentally differently in the simpler model, field data might not be able to make up for this discrepancy [62].

Conclusion
The combination of the data availability and needs analysis with the causal network approach provides better insight into how ecosystem models can be refined than either the database analysis or the CLD analysis alone. Model parameterization with field measurements is a crucial step towards better constraining and ultimately reducing model uncertainty [85], which becomes especially relevant when modeling ecosystems under a changing climate. For this purpose, we propose the Pa-factor as a tool for guiding decisions about which parameters to prioritize and target in field efforts, and we demonstrate its usefulness for different Arctic and boreal ecosystem models. Based on the presented analysis, we suggest that future field campaigns and remote sensing efforts prioritize parameters that describe the relationship between local temperature and microbial activity in particular, and beyond that all parameter groups with a significantly positive Pa-factor. These are the impact of local temperature on vegetation growth, the effect of drained soils on fire frequency, the impact of vegetation growth on microbial activity, and the impact of soil moisture on nitrogen availability.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).