Machine learning and artificial intelligence to aid climate change research and preparedness

Climate change challenges societal functioning, likely requiring considerable adaptation to cope with future altered weather patterns. Machine learning (ML) algorithms have advanced dramatically, triggering breakthroughs in other research sectors, and have recently been suggested as aiding climate analysis (Reichstein et al 2019 Nature 566 195–204, Schneider et al 2017 Geophys. Res. Lett. 44 12396–417). Although a considerable number of isolated Earth System features have been analysed with ML techniques, more generic application to better understand the full climate system has not occurred. For instance, ML may aid teleconnection identification, where complex feedbacks make characterisation difficult from direct equation analysis or visualisation of measurements and Earth System model (ESM) diagnostics. Artificial intelligence (AI) can then build on discovered climate connections to provide enhanced warnings of approaching weather features, including extreme events. While ESM development is of paramount importance, we suggest a parallel emphasis on utilising ML and AI to understand and capitalise far more on existing data and simulations.


Introduction
Machine learning (ML) and artificial intelligence (AI) increasingly influence lives, enabled by significant rises in processor availability, speed, connectivity, and cheap data storage. AI is advancing medical and health provision, transport delivery, interaction with the internet, food supply systems and supporting security in changing geopolitical structures. Society is approaching the era of self-driving cars, support for medical practitioners to avoid misdiagnoses, accurate speech recognition, and tailored purchase suggestions. Most applications are beneficial, although ethical issues exist, e.g. Bostrom (2014), New Scientist (2017). Simultaneously, evolving lifestyles must interact safely with climate change. There is a growing realisation that climate change impacts are not an isolated threat, instead needing more holistic responses alongside addressing other societal issues. Climate change is a complex scientific and multifaceted issue, amenable to ML and AI analysis, but in general, this has not yet occurred. Many ML algorithms, possibly most notably neural networks, have been available for decades. However, until recently, constraints of computational architecture and power have restricted their application, especially for issues as data-intensive as climate change.
Various names describe new computational methods, including big data, ML and AI. Big data is concerned with using complex datasets, so large that traditional analysis techniques are unsuitable. AI is a form of computer science, where the goal is often to teach a computer to complete tasks a human cannot do, and generally involves decision making in different contexts. ML is a sub-area of AI where computers learn relationships from large training datasets. For climate and weather applications, a simplistic characterisation can be: (i) big data as the collection for analysis of meteorological- or Earth System-related measurements, and high spatial and temporal resolution Earth System model (ESM) outputs, (ii) ML as refining or discovering new linkages between locations, times and quantities in the datasets (e.g. where sea surface temperature features aid weather prediction months later over land regions) and (iii) AI as building on connections that ML discovers, to provide automated warnings and advice to society of approaching weather extremes. The recent ease of application of ML methods through better computational capability is partly supported by novel use of computer graphics processing units (GPUs), noting that GPU speed is increasing faster than that of standard central processing units (Baji 2018). Others (e.g. Burr 2019) suggest more inventive use of computer memory, to make calculations both more efficient and much nearer to where the data is stored.
Since the 1950s, numerical weather forecasting has advanced remarkably. Limited computational resources meant that, until recently, equations had to be solved on a coarse spatial grid. Representation of unresolved small-scale processes is through simplified approaches called 'parameterisation schemes', which can limit forecast predictive skill. Recent increases in computer power allow ultra-fine-resolution weather forecasting models, with grid resolution at almost kilometre scales. While many processes are still parameterised, such finer grids enable explicit calculation of storm tracks, mesoscale cloud systems, and deep convective events. ESMs have much in common with weather forecast models, including a dynamical core. ESMs are forced by prescribed evolving concentrations of atmospheric greenhouse gases (GHGs), and model their interaction with radiative fluxes through the atmosphere, thus predicting climate change. As ESMs operate for modelled century timescales, they also include detailed descriptions of ocean circulations and polar ice extents. Many ESMs describe the global carbon cycle, linking known emissions and future projections to atmospheric GHG levels, which then become a model diagnostic. Unfortunately, computers are still not fast enough to allow ESM operation at the kilometre-order resolution of weather forecasts; to do so would prevent modelled century timescales from completing in a reasonable timeframe. Hence ESMs still retain parameterisation of critical sub-grid processes such as convection.
Approximately 20 research centres have developed ESMs. An achievement of climate modelling is the placement of model outputs in a shared database: the Coupled Model Intercomparison Project Phase 5 (CMIP5) (Taylor et al 2012). These models capture two decades of ESM development through inclusion of interactions between physical climate and geochemical cycles. However, climate research has partially failed as, despite ESM improvements, the differences between them remain large. Significant discrepancies exist even for fundamental summary statistics such as equilibrium climate sensitivity, i.e. global warming for climate stabilised at atmospheric carbon dioxide double pre-industrial levels (Flato et al 2013). Model discrepancies cause difficulties for climate adaptation planning, and for determining gas concentrations to keep warming below target thresholds (e.g. 2 °C). Many ESM differences may be due to the necessary parameterisation of sub-grid processes. We discuss how ML and AI methods may reduce inter-ESM uncertainty.
In the context of measurements, planet Earth is currently monitored at unprecedented levels, and especially by satellites collecting climate-related data, all of which requires advanced algorithms to characterise any overall trends and behaviours.At the same time, large uncertainty bounds revealed by the careful application of ML to data may demonstrate situations where the capture of more data would be helpful and required.

Dimension reduction
Mathematical modelling, e.g. Fowler (1997), Ockendon et al (2003), strives to explain observations by governing equations. These are often partial differential equations, continuous in space and time, coupled through any source and sink terms, as for the climate system (Vallis 2006). After confirming equations reproduce measurements, predicted changes are assumed robust for at least modest perturbations to forcings. Alongside the overarching requirement of applied mathematics to predict change is the quest for dimension reduction. Such reductions are powerful, illustrating dominant processes 'moving together' and defining robust interconnections within a complex system. Knowledge of controlling processes points to model parameters that most strongly influence projections, guiding measurement campaigns to aid uncertainty reduction.
Historically, dimension reduction utilises three approaches. Firstly, nondimensionalisation determines the magnitudes of equation terms, yielding a reduced set of linked equation parts that dominate. The balance of equation terms for the climate system is, however, complex, varying by location and season. Furthermore, climate research has at least nine fundamental base dimensions (table 1). Yet, with climate equations known and coded in ESMs, more progress should be possible to determine dominant terms (Huntingford 2017). We suggest the relentless pressure on climate research to make projections with ever-newer ESMs, unfortunately, restricts available time for the detailed examination of the internal calculations implicit within existing simulations. Secondly, dimensional analysis is a technique to both collapse the complexity within, and relate different strands of data, even without an initial underlying model (e.g. Barenblatt 2003, Lemons 2017). Confirmed linkages can aid the construction, parameterisation or testing of related climate model components to ensure they reproduce discovered data-based relationships. These two approaches have notable similarities, despite one being model-led and the other data-led. The third approach is through statistical techniques (Storch and Zwiers 2010), e.g. spatial reduction by empirical orthogonal functions, known to characterise many climate modes of variability, such as the tropical El Niño-Southern Oscillation.
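The statistical route to dimension reduction can be illustrated with a minimal sketch of an empirical orthogonal function (EOF) calculation. Everything specific here is an assumption made for illustration: the synthetic five-point anomaly field, the function name `leading_eof`, and the use of power iteration on the spatial covariance matrix (practical analyses would more likely use a singular value decomposition on gridded data).

```python
import math
import random

def leading_eof(anomalies, n_iter=200):
    """Return (eof, variance_fraction) for the leading EOF.

    anomalies: list of time samples, each a list of grid-point values
    with the time mean already removed at every grid point.
    """
    n_time = len(anomalies)
    n_space = len(anomalies[0])
    # Spatial covariance matrix: cov[i][j] = time mean of a_i * a_j.
    cov = [[sum(row[i] * row[j] for row in anomalies) / n_time
            for j in range(n_space)] for i in range(n_space)]
    # Power iteration converges to the eigenvector of the largest eigenvalue.
    vec = [float(i + 1) for i in range(n_space)]
    for _ in range(n_iter):
        new = [sum(cov[i][j] * vec[j] for j in range(n_space))
               for i in range(n_space)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(n_space))
                     for i in range(n_space))
    total_variance = sum(cov[i][i] for i in range(n_space))
    return vec, eigenvalue / total_variance

# Hypothetical data: one spatial pattern with a random amplitude, plus noise.
rng = random.Random(0)
pattern = [1.0, 0.8, 0.3, -0.2, -0.6]
field = []
for _ in range(300):
    amplitude = rng.gauss(0.0, 1.0)
    field.append([amplitude * p + rng.gauss(0.0, 0.1) for p in pattern])

# Remove the time mean at each grid point to form anomalies.
means = [sum(row[i] for row in field) / len(field) for i in range(len(pattern))]
anomalies = [[row[i] - means[i] for i in range(len(pattern))] for row in field]

eof, variance_fraction = leading_eof(anomalies)
```

In this toy case a single mode carries almost all of the variance, which is the sense in which a many-dimensional field collapses onto a few dominant patterns. The sign of the returned EOF is arbitrary.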
A fourth new technique is emergent constraints (ECs) (Hall et al 2019), which targets a reduction in the dimension of inter-ESM spread to refine projections. This approach searches across ESM ensembles (e.g. [...]). All four approaches to dimension reduction rely to some extent on foreknowledge to find related quantities. We suggest ML and AI techniques, integrated into these dimension-reduction frameworks, will aid further discoveries.

Knowledge gaps overview
We concur with Schneider et al (2017) and Reichstein et al (2019) that as ML and AI algorithms develop, these will open vast opportunities to aid climate research, facilitating and going beyond dimension reduction. We expand on this, presenting an overarching view of climate analysis. We review existing applications, summarise ML algorithms, describe three potential applications (UK summer 2018 drought, the 'warming hiatus', and equation building where their form is currently unknown), and address how AI can help society to adapt to climate changes with a focus on drought.
Borrowing terminology from a military context, issues of climate are amenable to three classifications. 'Known knowns' are aspects coded accurately as equations into ESMs, where ML could reduce dimensions to elucidate dominant interactions. 'Known unknowns' are where an effect influences climate changes, but uncertain equation parameterisation causes inter-ESM spread. Alternatively, data shows an effect to be important, but equations are not yet available to represent its inclusion in ESMs. A particular concern for climate change is the risk of 'unknown unknowns', generating unwelcome surprises. The most commonly suggested are tipping points (Lenton et al 2008), where the Earth System changes disproportionately for small atmospheric GHG increases, [...] Raissi et al (2019) raise the appealing prospect of building the underlying differential equations from application of neural networks to data. Should these equations be substantially different to those expected, the new equation terms may also point to the existence of 'unknown unknowns'.

Table 2 (partial). Applications of ML and AI to understand Earth System features.
Weather forecasting: use AI for post-processing of weather forecasts to aid human forecasters.
Future climate scenarios: merge multiple seasonal climate predictions by weighting models by skill.
Future climate scenarios: weighting climate models by their skill produces better performance than ensemble averages.
Climate impacts: assess the impact of climate change on above-ground biomass.
References appearing in the recovered rows: Liess et al (2017), Monteleoni et al (2011).
Identifying teleconnections (i.e. intricate connections between Earth System components) offers especially powerful applications of ML, discovering connections hidden within the many dimensions (table 1) of climate that human-based visual inspection is unlikely to recognise. Some teleconnections have time offsets, potentially generating societal warnings of approaching extreme weather events. The extent to which teleconnections exist in extreme rainfall patterns has only recently been recognised, noted as worthy of significant additional investigation (Boers et al 2019). One example is the winter 2013/14 severe flooding across the southern United Kingdom (Huntingford et al 2014), potentially initiated by earlier anomalous tropical precipitation.

Existing AI applications to climate
Many climate researchers have adopted ML methods to advance understanding of specific Earth System components (table 2). We now argue that there is enormous potential for using ML approaches also to find the more connected behaviours between multiple Earth System components, and how they aggregate to overall climate responses.

ML algorithms
We first give an eclectic overview of ML methods and suggest further climate applications. ML methods are (semi-)automated approaches to data inference that make few or no prior assumptions. Generally, ML approaches are of two types: supervised and unsupervised (Murphy 2012). Supervised methods rely on a priori specification of a response variable and map inputs to system outputs. Inputs are explanatory variables, e.g. large-scale forcings such as GHG levels, teleconnection drivers, or observations of a particular part of the Earth System. Outputs are response variables of interest, e.g. local climatic impacts. Supervised approaches use a training data set, where both measured inputs and outputs are available. Unsupervised approaches only take outputs of collected data, and the aim is to discover interesting patterns in the data and links to inputs, but where these are not listed beforehand. Unsupervised learning may aid the discovery of novel relationships, or teleconnections, across the different dimensions of climate modelling (table 1). A subsequent challenge for the Earth System community would be where an unsupervised approach reveals new system connections, requiring mechanistic understanding. The large flexibility of ML approaches allows strong nonlinearities to be encapsulated, affecting many features of climate change (Dijkstra 2013), which cannot be described with standard regression approaches.
The probability density p (given x is now a vector of inputs) depends only on data, D. This differs from supervised learning where the output(s) are conditional on inputs and data (equation 1). Unsupervised ML approaches might identify the number of clusters or groups in a dataset. If V represents the number of clusters and D a given dataset then, probabilistically, the aim is to estimate the distribution, p(V|D). An application in climate sciences of this unsupervised approach would be estimating the number of distinct North Atlantic weather regimes (Dawson et al 2012). Four common ML methods are now described, and illustrated in schematic form (figure 1).
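As a hedged restatement in standard notation (following the general convention of Murphy 2012, and not necessarily the exact typeset form of the original equations), the two learning settings and the clustering example can be contrasted as follows. The symbol y for the response variable is an assumption; x, D and V are as defined in the text.

```latex
% Supervised learning: density of the response y,
% conditional on the inputs x and the training data D
p(y \mid \mathbf{x}, D)

% Unsupervised learning: density of the inputs alone,
% conditional only on the data D
p(\mathbf{x} \mid D)

% Clustering example: distribution over the number of clusters V
p(V \mid D)
```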

Gradient descent method
Underpinning many ML approaches, gradient descent relates to standard linear regression and aims to minimize the error ('cost function', F), formulated as: U(n+1) = U(n) - γ∇F(U(n)), where U(n) is a point (or vector) at which F is differentiable, γ is a scaling that ensures the cost function reduces with each iteration, F(U(n)) > F(U(n+1)), and n is the iteration index. Parameter γ can be thought of as a learning parameter (or a step size) which determines the speed of the gradient descent approach. Poor optimisation of γ causes inefficient algorithms that either do not converge to the optimal solution, or over-correct the cost function.
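The role of the step size γ can be seen in a minimal sketch, assuming a mean-squared-error cost for a straight-line fit. The data, the function name `gradient_descent` and the two γ values are illustrative choices only.

```python
def gradient_descent(xs, ys, gamma=0.05, n_iter=1000):
    """Fit y = a + b*x by minimising the mean squared error F(a, b).

    Returns the fitted (a, b) and the cost recorded at every iteration.
    """
    a, b = 0.0, 0.0
    n = len(xs)
    costs = []
    for _ in range(n_iter):
        residuals = [a + b * x - y for x, y in zip(xs, ys)]
        costs.append(sum(r * r for r in residuals) / n)
        # Gradient of F with respect to a and b.
        grad_a = 2.0 * sum(residuals) / n
        grad_b = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        # Update step: U(n+1) = U(n) - gamma * grad F(U(n)).
        a -= gamma * grad_a
        b -= gamma * grad_b
    return a, b, costs

# Hypothetical noise-free data lying on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

a, b, costs = gradient_descent(xs, ys, gamma=0.05)
# A step size that is too large over-corrects and the cost grows instead.
_, _, costs_too_large = gradient_descent(xs, ys, gamma=0.2, n_iter=50)
```

With the smaller step the cost falls at every iteration and the parameters approach the generating values. With the larger step each update overshoots, which is the over-correction described above.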
An example of gradient descent is supervised (probabilistic) classifier applications. Logistic regression is a popular approach for classification problems and while maximum likelihood methods are frequently used, gradient descent methods generally solve supervised ML problems more efficiently.
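A minimal sketch of a probabilistic classifier trained by gradient descent is given below, assuming a single predictor and binary labels. The data are invented for illustration (for instance, a standardised temperature anomaly and whether an event was flagged), and `fit_logistic` is a hypothetical helper name.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, labels, gamma=0.1, n_iter=2000):
    """Fit p(label = 1 | x) = sigmoid(w*x + c) by gradient descent
    on the mean negative log-likelihood."""
    w, c = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # For this cost the gradient reduces to (predicted - observed).
        errors = [sigmoid(w * x + c) - t for x, t in zip(xs, labels)]
        w -= gamma * sum(e * x for e, x in zip(errors, xs)) / n
        c -= gamma * sum(errors) / n
    return w, c

# Hypothetical, deliberately non-separable training data.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]

w, c = fit_logistic(xs, labels)
prob_high = sigmoid(w * 2.0 + c)   # probability of an event at x = 2
prob_low = sigmoid(w * -2.0 + c)   # probability of an event at x = -2
```

The fitted model returns a probability rather than a hard label, so a decision threshold can be chosen separately from the training step.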

Gaussian processes
In nonlinear regression, a first attempt might involve fitting increasingly complex polynomials, Y = f(x, a_i), where Y is an observation, x is a potential predictor of Y, and a_i are parameters. However, in a nonlinear system such as the climate, we might not understand the precise parametric process, as this would require consideration of all possible nonlinear functions. As a supervised ML method, Gaussian processes (Rasmussen and Williams 2006) are an alternative to such (linear) regression approaches. A Gaussian process is a collection of random variables, Y, (e.g. data observations) such that any subset of these variables has a multivariate normal distribution. Notably, the Gaussian process is defined over the observation functions, Y, rather than the input state, x. The process is specified by a mean function and a covariance matrix.
Combining the Gaussian process prior with a (Gaussian) likelihood based on the data, where some data is observed and some not, produces a Gaussian posterior distribution. This method enables out-of-sample predictions, y*: p(y*|x*, D) = ∫ p(y*|x*, f, D) p(f|D) df, where x* is the new input, f now represents a Gaussian process and D is again an observed data set.

[...] for a stationary process this enables the convergence to be tracked. This approach also has the potential for developing non-stationary models for changing processes such as the climate system under anthropogenic forcing.
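The Gaussian process prediction step described above can be sketched in a few lines, under stated assumptions: a zero mean function, a squared-exponential covariance with unit length scale and amplitude, a fixed Gaussian noise variance, and invented training data. The helper names `rbf`, `solve` and `gp_predict` are illustrative, and the small linear solver stands in for the Cholesky factorisation normally used.

```python
import math

def rbf(x1, x2, length=1.0, amp=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return amp * math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def gp_predict(x_train, y_train, x_star, noise=1e-2):
    """Posterior mean and variance of the latent function at x_star."""
    n = len(x_train)
    # Covariance of the training outputs, including observation noise.
    k_mat = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    k_star = [rbf(x_star, xi) for xi in x_train]
    alpha = solve(k_mat, y_train)
    beta = solve(k_mat, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_star, x_star) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var

# Hypothetical observations of a smooth signal.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [math.sin(x) for x in x_train]

mean_near, var_near = gp_predict(x_train, y_train, 2.5)   # inside the data
mean_far, var_far = gp_predict(x_train, y_train, 10.0)    # far outside it
```

Inside the range of the data the predictive variance is small. Far from any observation the prediction reverts to the prior mean with close to the prior variance, which gives an explicit uncertainty estimate for out-of-sample points.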
Adapted sequential Monte Carlo methods, through the Metropolis-Hastings algorithm, allow evolving model parameterisation. Currently, ESMs combine numerical code describing climate system components. These are operated from modelled preindustrial times to contemporary, and onwards corresponding to future GHG scenarios. Yet most ESM modelling centres do not revise projections when compared to historical measurement records, i.e. employ 'adaptive learning'. This is a computational challenge, needing the embedding of ESMs in an iterative framework, and so far only achieved for decadal forecasting (Dunstone and Smith 2010).
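As a minimal sketch of the Metropolis-Hastings step on its own (not the full adapted sequential Monte Carlo scheme, and with no ESM in the loop), the example below samples the posterior of a single model parameter given observations. The data, the flat prior, the known error standard deviation and the random-walk proposal are all assumptions made for illustration.

```python
import math
import random

def log_posterior(mu, data, sigma=1.0):
    """Log posterior of mu up to a constant: flat prior on mu and a
    Gaussian likelihood with known standard deviation sigma."""
    return -0.5 * sum(((d - mu) / sigma) ** 2 for d in data)

def metropolis_hastings(data, n_samples=5000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings sampler for the parameter mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples

# Hypothetical observations used to constrain the parameter.
data = [1.8, 2.1, 2.4, 1.9, 2.3, 2.0, 2.2, 1.7]
samples = metropolis_hastings(data)

burn_in = 1000  # discard early samples taken before the chain settles
kept = samples[burn_in:]
posterior_mean = sum(kept) / len(kept)
```

As new observations arrive, the same sampler can be rerun on the extended data set, which is the simplest form of revising a parameter estimate against the measurement record.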

Deep learning approaches
Recent excitement around ML approaches often centres on the use of deep neural networks and graphical structures to uncover relationships in nonlinear data. Deep learning methods utilise a directed graph. Data are input at the base, transformed by hidden layers, and output at the top of the graph. Graphs have weights associated with the edges, and so-called 'biases' associated with the neurons (nodes), where the weights determine the strength of connectedness between neurons from different layers, and the bias is an offset that regulates the sensitivity of the neuron. An activation function scales the signal at each neuron, given the weighted input from the previous layer. Training data sets update these weights and biases to given error tolerances. Once trained, these directed graphs can give out-of-sample predictions on test data. Similar to Gaussian processes, for climate research this ability of a model to capture features of recent extremes (i.e. 'out of sample' events) raises confidence in projection of future anomalies, which may become commonplace.
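The directed-graph structure described above can be made concrete with a minimal forward pass. This is a sketch under assumptions: the weights and biases are set by hand rather than learned, the activation is a rectified linear unit, and the names `dense`, `forward` and `xor_net` are illustrative. Training, which would adjust the weights and biases by gradient descent, is omitted.

```python
def relu(z):
    """Rectified linear activation."""
    return max(0.0, z)

def identity(z):
    """Linear activation, used here for the output neuron."""
    return z

def dense(inputs, weights, biases, activation):
    """One layer: each neuron applies the activation to its weighted
    inputs plus its bias."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass the signal from the input through every layer in turn."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal

# Hand-set network with one hidden layer of two neurons. It reproduces
# the exclusive-or response, which no purely linear model can represent.
xor_net = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),  # hidden layer
    ([[1.0, -2.0]], [0.0], identity),               # output layer
]

outputs = {pair: forward(list(pair), xor_net)[0]
           for pair in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]}
```

Even this two-neuron hidden layer captures a nonlinear interaction between the inputs, which is the property that makes deeper networks attractive for strongly nonlinear systems.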
Deep learning and neural net approaches avoid specifying a process-based model (e.g. as needed for the Sequential Monte Carlo and Bayesian methods). This more data-led approach can improve our understanding of multivariate relationships in nonlinear systems. Recent applications of neural nets to climate sciences include dryland disturbance (Buckland et al 2019), inverse problems for remote sensing (Krasnopolsky and Schiller 2003) and replacing costly components of climate models (Gentine et al 2018). This learning subsequently aids mechanistic model construction. A compelling and emerging concept in hydrological modelling is a hybrid approach, where neural networks utilise data to characterise short-term rainfall-runoff relationships, yet additionally are constrained by prescribed prior knowledge of physical catchment attributes (Kratzert et al 2019). Static but location-specific modelled processes can include, for instance, parameterisation of topography, soil properties and land cover.

Potential applications
Table 2 presents applications of ML and AI to understand Earth System features. We suggest three further illustrative potential examples. First is a recent extreme event, likely caused by multiple forcings to be determined. Second is a major uncertainty in overall climate response, again likely due to many Earth System interactions. The final example is building unknown ecological-climate interaction equations.

UK summer 2018 drought
The UK experienced hot, dry conditions through the summer of 2018. European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim re-analyses (Dee et al 2011) for mean July and August temperature, rainfall and soil moisture show strong anomalies for the southern UK, compared to mean conditions of these two months (figure 2). ECMWF uses a state-of-the-art forecasting model, initialised from earlier forecasts, to provide prior estimates of meteorological conditions, which are then modulated by data assimilation from satellite observations, weather stations and radiosondes. Such assimilation ('4D-Var' optimisation) encapsulates measurement error, avoids overly distorting modelled atmospheric physics (Dee et al 2011), and arguably is a form of ML. Emerging studies place the drought in context, and emphasize that amplified global Rossby waves likely have an important role in these climatic processes (Kornhuber et al 2019). These wave attributes may cause simultaneous extremes elsewhere, and there is evidence that such wave patterns are becoming more frequent (Kornhuber et al 2019). However, mechanisms remain unknown, requiring isolating and understanding possible parallel forcings such as antecedent soil drying, general background warming rates, and warming-circulation change interconnections, all of which ML may elucidate. Any discovered interseasonal connections linking drought risk to springtime soil moisture or oceanic temperature features would aid drought adaptation planning, e.g. timing of crop planting. Optimum crop timing itself may gain from ML applications through combined analysis of simultaneous crop yield and climate datasets.
Explaining changes to different extreme event frequencies is valuable knowledge. Most extreme events cannot be wholly attributed to anthropogenic activity, or verified as being unaffected by human behaviour, leading to characterising anthropogenic influence as the fraction of attributable risk (FAR) (Allen 2003). As extreme events are by definition rare, FAR calculation requires supplementing data by many simulated years with ESMs. The FAR statistic uses simulations for preindustrial and contemporary GHG levels, capturing thermodynamic (i.e. global warming) and circulation changes (e.g. Otto et al 2016). However, only one research group has performed massive ensembles (Massey et al 2015), and while highly informative, this makes the FAR statistic strongly ESM-dependent. To complement this approach, we argue for greater understanding of the physical drivers and interactions leading to extremes, as a 'storyline' (Shepherd 2016), and by utilising ML to perform targeted searches in ECMWF re-analysis data. Deeper post-extreme understanding enables rigorous assessment of the CMIP5 ensemble (Taylor et al 2012), building an inter-ESM consensus on the extent to which anthropogenic forcings alter particular extreme likelihoods.

[...] evaluation is challenging due to few global-scale observations. Furthermore, high variability in nutrient limitation and plant-soil feedbacks by environmental context (Thomas et al 2015) restricts knowledge of which models apply in particular conditions. Given this complexity, a bottom-up approach of learning the mechanisms of nutrient limitation from available data may provide a better strategy than building nitrogen cycle models before comparing to data. While there is a general dearth of observational datasets relative to the wide variability in space and time in nitrogen cycle processes (Thomas et al 2015), long-term records of plant growth and nitrogen availability can be reconstructed from analysis of natural archives such as tree rings (McLauchlan et al 2017) and sedimentary deposits (McLauchlan et al 2013). These time-series data offer a mostly untapped yet powerful resource for deriving long-term plant-N interactions and how these vary over space and time. For example, Jeffers et al (2011, 2015) used a maximum likelihood-based model-fitting and model-selection method to infer the most likely mechanisms by which plants interacted with available nitrogen over a 6000 year period; the results provided evidence for plant-soil feedbacks operating over millennia. However, while the maximum likelihood method searches for the best set of parameters, this can only be for a given model. When, as here, we additionally need to determine which model(s) provide the best fit to the data, the process becomes time- and computationally intensive. Such a high need for numerical calculation is not feasible for fine resolution calculations across large spatial scales with strong heterogeneity. A combination of new computing structures and novel algorithms (e.g. symbolic regression, Martin et al 2018) may instead offer increased efficiency for finding the best model(s) for describing available data. Notwithstanding the possible high computational cost, we also suggest the proposed use of neural networks to return governing equations (Raissi et al 2019) will further aid improved numerical characterisation of terrestrial ecosystems in ESMs. AI may then be able to harness new insights from the model output to suggest the most suitable locations for tree planting for carbon sequestration.
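The distinction drawn above between fitting parameters for a given model and choosing between models can be sketched as follows. This is not the procedure of Jeffers et al; it is a toy illustration that assumes independent Gaussian errors, two hypothetical candidate models (constant and linear), invented data, and the Akaike information criterion (AIC) as the selection rule.

```python
import math

def gaussian_aic(residuals, n_params):
    """AIC for a model with independent Gaussian errors, evaluated at the
    maximum likelihood estimate of the error variance."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    # The error variance counts as one additional fitted parameter.
    return 2.0 * (n_params + 1) - 2.0 * log_lik

def fit_constant(xs, ys):
    """Maximum likelihood fit of y = a; returns residuals and parameter count."""
    mean = sum(ys) / len(ys)
    return [y - mean for y in ys], 1

def fit_linear(xs, ys):
    """Maximum likelihood fit of y = a + b*x; returns residuals and count."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)], 2

# Hypothetical proxy record: a response that rises with a driver, plus noise.
xs = [float(i) for i in range(10)]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, -0.05, 0.1]
ys = [2.0 + 0.5 * x + e for x, e in zip(xs, noise)]

candidates = {"constant": fit_constant, "linear": fit_linear}
scores = {name: gaussian_aic(*fit(xs, ys)) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
```

Each additional candidate model requires its own parameter search, so the cost grows with the number of model structures considered. That is the computational burden which motivates more efficient search strategies.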

Climate
How to represent plant feedbacks to soils and thus biogeochemical cycling and global climate in ESMs is even less well known (e.g. Arneth et al 2010). A potentially important process missing from ESMs is the release of plant secondary metabolites into soils and subsequent changes in litter decomposition and nutrient recycling (Chomel et al 2016). Similarly, the release of volatile secondary metabolites induces changes in atmospheric chemistry with important feedbacks on plant health (Rap et al 2018). For example, plant production of oxalates is a key driver of nutrient and biogeochemical cycles (Graustein et al 1977), and can directly affect local and regional climate via formation of cloud condensation nuclei (Zhou et al 2015) and indirectly via feedbacks to carbon sequestration (Tooulakou et al 2016). Plant production of oxalates is highly sensitive to atmospheric chemistry and climate, with increased accumulation under rising ozone concentrations (Fink 1991) and drought (Brown et al 2013). However, like other secondary metabolites, both the type and amount of oxalates produced vary significantly by plant species and ecological conditions (Holopainen et al 2018); therefore future work is needed to measure these ecologically relevant compounds according to their expected range of variation in natural ecosystems. Where data exist, we suggest that ML can help identify links between plant chemical traits, environmental conditions and feedbacks on biogeochemical cycling, thus enabling the development of process-based equations for use in ESMs. We note a direct impact on human well-being too, as changing protein and defence compounds in plants influence food quality and thus human health (Zhu et al 2018). Advances, therefore, in data collection and modelling of plant nutrient and secondary compound production dynamics, combined with AI, have the potential to help identify risks to food production associated with changes in climate and atmospheric chemistry.

AI to support climate adaptation, with an emphasis on drought warning
The IPCC has been collating climate change knowledge into reports on an approximately seven year cycle, most recently the 5th assessment (IPCC 2013). These underpin the annual Conference of Parties (COP) meetings, which focus on measures to avoid dangerous climate change. Recent COP meetings called for constraining global warming to 2 °C above preindustrial levels, or even 1.5 °C. This challenge is tremendous. Despite evidence of the climate changing, emissions have grown, showing little evidence of reducing (Huntingford and Friedlingstein 2015). After a short plateau, CO2 emissions have resumed growth (Jackson et al 2018). Furthermore, equilibrium global warming even for current greenhouse gas concentrations might already be at or very near 1.5 °C (Huntingford and Mercado 2016), and over land, warming will be even higher (Huntingford and Mercado 2016). Yet, wealth per capita is tightly linked to energy use (Brown et al 2011), so meeting societal goals of more people leading a wealthier lifestyle will increase energy demand. Conversion from fossil fuels to different energy sources remains challenging (York 2012). AI may aid developing non-fossil fuel energy supplies, but it is also prudent to prepare adequately for climate change.
Improvement of forecasts is essential to aid preparation for extreme events. McGovern et al (2017) assessed AI methods in predicting high impact weather events, including the duration of storms, using a historical database. They operationalised the Gradient Boosted Regression Trees (GBRT) algorithm in the National Oceanic and Atmospheric Administration Hazardous Weather Testbed (Karstens et al 2015). Studying how often professional forecasters used the output of the GBRT, it was found that in 75% of cases, the AI-based forecast was selected over human intuition, providing evidence that AI-based forecasts in 'human-in-the-loop systems' can aid decision making (McGovern et al 2017). Directly enhancing forecasts is difficult due to their high resolution and lead times. However, ML and AI methods can post-process forecast model output by accounting for missing model resolution and correcting the resulting biases (Novak et al 2014). Similar ML-based disaggregation, but of ESM projections, may provide bespoke climate services at a very fine spatial scale (Knusel et al 2019). Such disaggregation could link climate outputs to agronomy models to then aid decisions that ensure high crop efficiency in a changing climate.
Droughts are high impact weather events, estimated to have cost $1.5 billion globally between 1998 and 2017, and representing 33% of the costs of weather hazards over that period. AI offers the potential to leverage recent advances in drought forecasting accuracy (Belayneh et al 2016) to improve decision-making. The Horn of Africa drought in 2011, for instance, impacted over 9 million people and the resulting food insecurity likely caused between 143 000 and 273 000 deaths in Somalia alone (Checchi and Robinson 2013). Improving early warning of these events allows for targeted drought response and better emergency preparation. Drought forecasting systems exist that utilise atmospheric forecasts to force hydrological models (Shukla et al 2014). Noting the success of methods such as GBRT for improving real-time decision making for other high-impact weather events (McGovern et al 2017), it makes sense to utilise drought predictive skill (Belayneh et al 2016) combined with ML and AI methods to ensure optimal water resource allocation and to disseminate information ahead of drought events.

Discussion and conclusions
Evidence is accumulating that fossil fuel burning is adjusting climate (IPCC 2013), as projected by many, e.g. Broecker (1975), requiring accurate projections to aid adaptation. ESMs estimate climate variation, with discretised equations describing the Earth System. A climate research success is the pooling of ESM simulations (Taylor et al 2012), but unfortunately substantial differences exist between them, even for identical GHG concentration scenarios. This lack of agreement complicates adaptation planning. The perception that ESMs are 'black boxes', alongside pressures to use researcher time to create, continuously, new model versions, discourages efforts to understand their internal calculations, feedbacks, teleconnections and, critically, model differences. Such an approach circumvents standard scientific procedures, where building numerical models should occur with parallel analytical understanding. Yet the Earth System is hugely complicated, and it is challenging to achieve the dimension reduction necessary to identify dominant processes. We believe this makes climate research a perfect application for utilising ML methods. This suggestion is, though, conditional on the algorithms being applied thoughtfully, selecting the most appropriate one for each research question, and with a full appreciation of any underlying assumptions implicit within them. We have summarised many existing applications, noting these relate predominantly to specific climate system parts. Our call is to go much further and apply ML methods to the entirety of the Earth System, analysing gridded datasets (e.g. ECMWF) and the CMIP5 ensemble.
We present an overview of ML methods and suggest three potential applications where system interconnectivity is likely complex: a UK extreme event, the 'warming hiatus', and terrestrial ecosystem equation building. Where the governing mechanisms are uncertain, a recent advance of interest is a formulation of how neural networks enable data-based evaluation of features of hidden ordinary differential equations (Chen et al 2018) and partial differential equations (Raissi et al 2019). We suggest these methods have enormous potential to advance process description of components that are currently less understood, and for inclusion in Earth System models. Additionally, AI can utilise model- and data-based ML to provide warnings and aid decision support, for instance during approaching extremes such as droughts. The routine availability of climate data implies that ML-based research papers should have sufficient clarity that others can be encouraged to reproduce, or test, the findings using their own ML algorithms. In other instances, the actual ML code can additionally be made available to enable replication of findings. The first approach provides a more robust and independent check. The issue of reproducibility versus replicability, in an AI context, is discussed in Drummond (2009).
In summary, many scientific disciplines advocate routine adoption of ML methods, which will have differing levels of success. We believe ML application to the Earth System will fall in the successful category, delivering new insights into the incredibly rich diversity of interconnected Earth System behaviours and their multiple interactions with biochemical cycles. AI is, presently, a commonly used expression in society, but for climate analysis we consider it to be well defined. While ML will reveal climate system attributes and enhance forecasting across time scales, it is AI that can then adopt this information to support decisions. It is in the instructing of actions required to ensure safety through environmental extremes that ML becomes AI. For general climate policy, embracing ML will likely aid the step-change needed to generate refined advice about the climatic states expected for raised GHG concentrations.

Figure 1. Schematic of different ML methods, with potential applications. Each box in the first column shows simulations for the same model framework and forcings, evolving in time, and for different ESMs (ESM1, ESM2, ...), and so representing the first five dimensions of table 1. The two different boxes (background yellow or green) represent two sets of simulations selected to span one of the four extra dimensions of table 1, as also indicated in the grey column: ensemble members capturing internal chaotic features, different large-scale estimates of initial state, perturbed-physics experiments and different socio-economic estimates of GHG emissions. The third column shows, as illustrations, the four main identified ML/AI methods. The fourth column gives schematics of potential applications. The gradient descent method can determine functional responses, e.g. ecosystem attribute responses to temperature. Gaussian processes can allow extrapolation of sparse weather station data (black dots) to generate gridded datasets of historical weather features. Nonlinear non-Gaussian inferences can refine key ESM parameters, updated incrementally as more data become available. Deep learning approaches (e.g. NNs) can emulate computationally expensive components of ESMs, which impact, e.g. vertical profiles of the predicted variables including temperature. Such emulation enables longer simulations, larger ensembles, or added functionality. Refined ML-based understanding of ESM diagnostics and assessment of performance against data will then support better simulations by the next generation of climate models, as indicated by the bottom right-to-left arrow.

Figure 2. UK July and August 2018 temperature, rainfall and soil moisture anomalies. Mean of two months of ECMWF re-analysis data for July and August 2018, minus the mean of the same two months averaged over the period 1979-2017. Shown are (a) temperature at 2 m above ground, (b) rainfall and (c) soil moisture in the top soil level, which has a depth of 0.07 m.
CMIP5, Taylor et al 2012) for regressions between modelled climate system quantities that can also be measured now, and other climate system features relevant to projecting future change. Such regressions utilise the contemporary measurement to constrain estimates of the future variable. Examples (Hall et al 2019) include physical parts of the climate system, e.g. reducing uncertainty on equilibrium climate sensitivity (Cox et al 2018), and geochemical or ecological features, e.g. refining Amazon 'die-back' risk (Cox et al 2013). Although the method has received some scepticism, such as by Caldwell et al (2014), a review by Hall et al (2019) notes that many such regressions have physical justification, whilst other robust regressions are worthy of investigation to search for mechanistic understanding. Some ECs may be difficult to discover by physical intuition, e.g. teleconnections where the measurable quantity is for a different location, season and climate attribute to the component of future interest.
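The regression step described above can be sketched in a few lines. The following is an illustration only, not the procedure of any cited study: the ensemble values are synthetic, the names `fit_line`, `emergent_constraint` and `observed_x` are hypothetical, and observational uncertainty is ignored. It fits a straight line across a model ensemble between a present-day measurable quantity and a future quantity, then reads off the future value implied by a single observation.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def emergent_constraint(model_x, model_y, observed_x):
    """Return the constrained future estimate and the residual spread.

    model_x: present-day quantity simulated by each model (also measurable).
    model_y: future quantity projected by the same models.
    observed_x: the contemporary measurement of the present-day quantity.
    """
    slope, intercept = fit_line(model_x, model_y)
    residuals = [y - (slope * x + intercept) for x, y in zip(model_x, model_y)]
    # Scatter of the models about the regression line (n - 2 degrees of freedom).
    resid_sd = (sum(r * r for r in residuals) / (len(model_x) - 2)) ** 0.5
    return slope * observed_x + intercept, resid_sd


# Synthetic 20-member "ensemble" (assumed values, for illustration only).
rng = random.Random(0)
model_x = [rng.uniform(0.5, 2.0) for _ in range(20)]
model_y = [1.0 + 1.5 * x + rng.gauss(0.0, 0.2) for x in model_x]

constrained, resid_sd = emergent_constraint(model_x, model_y, observed_x=1.2)
unconstrained_sd = statistics.stdev(model_y)
print(f"constrained estimate: {constrained:.2f} +/- {resid_sd:.2f}")
print(f"raw ensemble spread:  {unconstrained_sd:.2f}")
```

In this toy setting the scatter about the regression line is narrower than the raw ensemble spread, which is the sense in which the measurement constrains the projection. A fuller treatment would also propagate the measurement error and the uncertainty in the fitted line.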

Table 1. Nine basic dimensions of climate modelling. These include geographical location, the many configurations of ESMs, and alternative future trajectories in emissions policy.

Table 2. Existing applications of machine learning algorithms to climate science. Listed are the part of the Earth System studied, the discovery from the application of ML/AI, the methods used and the reference.

Supervised learning methods focus on classification problems; the simplest case is binary, distinguishing between two types (say '0' and '1'). Examples are the existence of ice cover and its changes (Boe et al 2009) or land use classification (Helber et al 2018) to facilitate monitoring of changes. Another example involves the identification of ship tracks in satellite images (Segrin et al 2007). These polluted regions provide a 'natural laboratory' for the interaction between aerosol and cloud processes (Chen et al 2015), to understand aerosol impact on cloud radiative forcing and climate sensitivity (Stevens and Feingold 2009).
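As a minimal sketch of binary supervised classification, the following trains a logistic-regression classifier by gradient descent on labelled examples and then classifies new inputs. It is an illustration under stated assumptions rather than the method of the cited studies: the two numerical features per example are hypothetical stand-ins for summary statistics of an image, the data are synthetic, and `train_logistic` and `predict` are illustrative names.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this example (probability minus label).
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted class, '1' if the probability is at least 0.5."""
    prob = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return 1 if prob >= 0.5 else 0


# Synthetic training set: class 0 centred on (-1, -1), class 1 on (1, 1).
rng = random.Random(42)
features, labels = [], []
for _ in range(100):
    label = rng.randint(0, 1)
    centre = 1.0 if label else -1.0
    features.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
    labels.append(label)

weights, bias = train_logistic(features, labels)
accuracy = sum(
    predict(weights, bias, x) == y for x, y in zip(features, labels)
) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

Real image classification would typically use many more features, or a convolutional network operating on pixels directly, and would report accuracy on data held out from training.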
For ship tracks, the training dataset D is multiple sub-images classified as c = 'ship track' or c = 'not ship track'; x is a sub-image from outside the training dataset which we wish to classify; and ŷ is the predicted classification of the new image, x. Method extension allows multinomial classification (e.g. '0', '1', '2', ...) or pre-specified meteorological classes, e.g. the Beaufort scale. If the response variable is real-valued and continuous, then the problem is a regression. Unsupervised learning is a discovery or data mining approach. It constructs empirical models of the data, and is probabilistically written:

'Hiatus'
Knowledge of global-scale variability remains incomplete. Between 1998 and very recently, there has been little additional global warming (the 'hiatus'), noted by those sceptical of global warming. Ascertaining the statistical likelihood of such an occurrence from the broad features of decadal variations has generated multiple studies, reviewed by Risbey et al (2018). Hypotheses for the general deviation of ESMs from measurements during this period include incorrect prescription of radiative forcing for aerosols, wrong climate sensitivities of ESMs, features of decadal variability and how temperature measurements are aggregated globally; Medhaug et al (2017) argue these are not necessarily contradictory to each other. For a climate feature so prominent, all strands of evidence should be combined to generate a more definitive answer. As the 'hiatus' is likely a function of simultaneous interactions in the climate system, ML can aid in the characterisation of any modelled deficiencies in parallel drivers.

unknown
Atmospheric equations are arguably well-known (Vallis 2006), describing fluid dynamics, thermodynamics and water phase change. Challenges remain in the parameterisation of very high resolution sub-grid processes for ESMs, e.g. storm events (Kendon et al 2014) and localised convection, with ML suggested as aiding the latter (Gentine et al 2018). Less known
are the deterministic equations of biological responses, despite the need for mathematical representation in ESMs. Various models are available, but their applicability to a range of biomes is uncertain. Key ecological-climate feedbacks are known, e.g. terrestrial ecosystem photosynthesis, respiration and decomposition. These are described with physiological models representing phenomena operating at individual tree level (Fischer et al 2016) and roughly valid when aggregated to the Earth System scale (Fisher et al 2018). Less known are equations that fully capture complex canopy structures and temperature-dependent variation in leaf properties and processes, causing uncertainty in predicted global carbon fluxes (Rogers et al 2017). Data are typically used to calibrate and validate existing models, but not to inform the underlying structure, which is an opportunity for novel AI application. For example, Shiklomanov et al (2016) use a Bayesian inversion framework to infer radiative transfer model parameters from remotely-sensed reflectance data, resulting in better model structures. A 'known-unknown' at the Earth System scale is faithful nutrient limitation representation in terrestrial ecosystems. Eventual strong nitrogen limitation of the global carbon cycle (Thomas et al 2015) could weaken terrestrial systems' ability to partially offset CO2 emissions. ESMs have different approaches for incorporating nitrogen limitation, but model

Hall A, Cox P, Huntingford C and Klein S 2019 Progressing emergent constraints on future climate change Nat. Clim. Change 9 269-78
Helber P, Bischke B, Dengel A and Borth D 2018 Introducing EuroSAT: a novel dataset and deep learning benchmark for land use and land cover classification IGARSS 2018: 2018 IEEE Int. Geoscience and Remote Sensing Symp. (New York: IEEE)
et al 2015 Evaluation of a probabilistic forecasting methodology for severe convective weather in the 2014 hazardous weather testbed Weather Forecast. 30 1551-70
Kendon E J et al 2014 Heavier summer downpours with climate change revealed by weather forecast resolution model Nat. Clim. Change 4 570-6
Knusel B et al 2019 Applying big data beyond small problems in climate research Nat. Clim. Change 9 196-202
Kornhuber K et al 2019 Extreme weather events in early summer 2018 connected by a recurrent hemispheric wave-7 pattern Environ. Res. Lett. 14 054002
Krasnopolsky V M, Fox-Rabinovitz M S and Chalikov D V 2005 New approach to calculation of atmospheric model physics: accurate and fast neural network emulation of longwave radiation in a climate model Mon. Weather Rev. 133 1370-83
Krasnopolsky V M and Schiller H 2003 Some neural network applications in environmental sciences: I. Forward and inverse problems in geophysical remote measurements Neural Netw. 16 321-34
Kratzert F et al 2019 Benchmarking a catchment-aware long short-term memory network (LSTM) for large-scale hydrological modelling Hydrol. Earth Syst. Sci. Discuss. (https://doi.org/10.5194/hess-2019-368)
Lee L A et al 2013 The magnitude and causes of uncertainty in global model simulations of cloud condensation nuclei Atmos. Chem. Phys. 13 8879-914
Lemons D S 2017 A Student's Guide to Dimensional Analysis (Student's Guides) (Cambridge: Cambridge University Press) p 102
Lenton T M et al 2008 Tipping elements in the Earth's climate system Proc. Natl Acad. Sci. USA 105 1786-93
Liess S, Agrawal S, Chatterjee S and Kumar V 2017 A teleconnection between the West Siberian Plain and the ENSO region J. Clim. 30 301-15
Liu Y et al 2016 Application of deep convolutional neural networks for detecting extreme weather in climate datasets Int. Conf. on Advances in Big Data Analytics (ABDA'16) (Las Vegas, USA)
Luo L F, Wood E F and Pan M 2007 Bayesian merging of multiple climate model forecasts for seasonal hydrological predictions J. Geophys. Res.-Atmos. 112 13
Martin B T, Munch S B and Hein A M 2018 Reverse-engineering ecological theory from data Proc. R. Soc. B 285 9
Massey N et al 2015 weather@home: development and validation of a very large ensemble modelling system for probabilistic event attribution Q. J. R. Meteorol. Soc. 141 1528-45
McGovern A et al 2017 Using artificial intelligence to improve real-time decision-making for high-impact weather Bull. Am. Meteorol. Soc. 98 2073-90
McLauchlan K K et al 2017 Centennial-scale reductions in nitrogen availability in temperate forests of the United States Sci. Rep. 7 7856
McLauchlan K K, Williams J J, Craine J M and Jeffers E S 2013 Changes in global nitrogen cycling during the Holocene epoch Nature 495 352-5
Medhaug I, Stolpe M B, Fischer E M and Knutti R 2017 Reconciling controversies about the 'global warming hiatus' Nature 545 41-7
Mishra A K and Desai V R 2006 Drought forecasting using feedforward recursive neural network Ecol. Modell. 198 127-38
Monteleoni C, Schmidt G A, Saroha S and Asplund E 2011 Tracking climate models Stat. Anal. Data Min.: ASA Data Sci. J. 4 372-92
Murphy K P 2012 Machine Learning: A Probabilistic Perspective (Cambridge, MA: MIT Press) p 1096
New Scientist 2017 Machines that Think: Everything You Need to Know About the Coming Age of Artificial Intelligence (London: Hodder & Stoughton) p 274
Nguyen P et al 2018 The PERSIANN family of global satellite precipitation data: a review and evaluation of products Hydrol. Earth Syst. Sci. 22 5801-16
Novak D R et al 2014 Precipitation and temperature forecast performance at the weather prediction center Weather Forecast. 29 489-504
Ockendon J, Howison S, Lacey A and Movchan A 2003 Applied Partial Differential Equations (Oxford: Oxford University Press) p 462
Otto F E L et al 2016 The attribution question Nat. Clim. Change 6 813-6
Park S, Im J, Jang E and Rhee J 2016 Drought assessment and monitoring through blending of multi-sensor indices using machine learning approaches for different climate regions Agric. Forest Meteorol. 216 157-69
Raissi M, Perdikaris P and Karniadakis G E 2019 Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations J. Comput. Phys. 378 686-707
Rap A et al 2018 Enhanced global primary production by biogenic aerosol via diffuse radiation fertilization Nat. Geosci. 11 640-4
Rasmussen C E and Williams C K I 2006 Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) (Cambridge, MA: MIT Press) p 272
Reichstein M et al 2019 Deep learning and process understanding for data-driven Earth system science Nature 566 195-204
Risbey J S et al 2018 A fluctuation in surface temperature in historical context: reassessment and retrospective on the evidence Environ. Res. Lett. 13 23
Rischard M, McKinnon K A and Pillai N 2018 Bias correction in daily maximum and minimum temperature measurements