Space Weather with Quantified Uncertainties: Improving Space Weather Predictions with Data-Driven Models of the Solar Atmosphere and Inner Heliosphere

To address Objective II of the National Space Weather Strategy and Action Plan, “Develop and Disseminate Accurate and Timely Space Weather Characterization and Forecasts,” and the US Congress PROSWIFT Act 116–181, our team is developing a new set of open-source software intended to ensure substantial improvements in Space Weather (SWx) predictions. On the one hand, our focus is on the development of data-driven solar wind models. On the other hand, each individual component of our software is designed to be more accurate than existing SWx prediction tools while delivering dramatically improved performance. This is achieved through the application of new computational technologies and enhanced data sources. The development of such software paves the way for improved SWx predictions accompanied by appropriate uncertainty quantification, making it possible to forecast hazardous SWx effects on space-borne and ground-based technological systems and on human health. Our models include (1) a new, open-source solar magnetic flux model (OFT), which evolves information to the back side of the Sun and its poles, and updates the model flux with new observations using data assimilation methods; (2) a new potential field solver (POT3D) associated with the Wang–Sheeley–Arge coronal model; and (3) a new adaptive, fourth-order accurate solver (HelioCubed) for the Reynolds-averaged MHD equations implemented on mapped multiblock grids (cubed spheres). We describe the software and results obtained with it, including the application of machine learning to modeling coronal mass ejections, which makes it possible to improve SWx predictions by decreasing the time-of-arrival mismatch. Tests show that our software is formally more accurate and performs much faster than the predecessors currently used for SWx predictions.


Introduction
The solar wind (SW) emerging from the Sun is the main driving mechanism of solar events which may lead to geomagnetic storms, the primary causes of space weather (SWx) disturbances that affect the magnetic environment of Earth and may have hazardous effects on space-borne and ground-based technological systems, as well as human health. For this reason, accurate modeling of the background SW is a necessary part of SWx forecasting. Geomagnetic storms are caused both by SW stream interactions and by the largest solar coronal disturbances, called coronal mass ejections (CMEs), and their interplanetary counterparts, interplanetary coronal mass ejections (ICMEs). The connection of the ambient interplanetary magnetic field to CME-related shocks and impulsive solar flares determines where solar energetic particles propagate. Therefore, modeling the stream-interaction background and ICMEs propagating through it on the basis of observational data is a key area of research in solar and heliospheric physics. Such modeling should necessarily be built on mathematically and physically consistent connections between eruptive events, magnetic phenomena on the Sun, and SW structures in the solar atmosphere and inner heliosphere (IHS). Substantial success has been achieved in global numerical modeling of the IHS. However, the prediction capability remains far from satisfactory [1,2]. There are several reasons for that. Firstly, since global numerical modeling of the IHS is a time-dependent problem, developing time-dependent boundary conditions (b.c.'s) that would incorporate remote and in situ observations of the Sun in a self-consistent way remains an important research challenge. Secondly, numerical approaches used to attack SWx problems are rather outdated and may be missing important features of discontinuity formation, interaction, and propagation from the Sun towards Earth.
We are developing a new data-driven, time-dependent model of the solar corona and IHS to predict the solar wind properties at the Earth's orbit. This model is being implemented in publicly available software. In addition to simulation of the background SW, we are particularly investigating the physics of the evolution, eruption, and propagation of CMEs between the Sun and Earth in the IHS. For this purpose, we are currently applying our data-constrained, flux-rope-based, constant-turn CME model [3]. The latter is based not only on the traditionally used STEREO Sun-Earth Connection Coronal and Heliospheric Investigation (SECCHI) and Solar and Heliospheric Observatory's Large Angle and Spectrometric Coronagraph (SOHO/LASCO) coronagraphs, but also on photospheric observations providing us with estimates of the toroidal and poloidal magnetic fluxes and the sign of helicity. The enhancement of the observational database that constrains simulation results is one of the strategic approaches to improving SWx forecasting. It also makes it possible to improve our understanding of the basic physical processes occurring in the IHS.

The Open Flux Transport Model (OFT)
The Open-source Flux Transport (OFT) software suite is a complete system for generating full-Sun magnetograms through acquiring and processing observational data, generating realistic convective flows, and running a flux transport model. It is summarized in Figure 1.

Data Acquisition and Processing: OFTpy
Assimilation of magnetograms from solar observatories is the key input to flux transport models.
We have developed OFTpy, a Python codebase that facilitates acquisition and processing of data products into a computation-ready format for our flux transport model HipFT, described in Sec. 2.3.
The current code supports full processing of Joint Science Operations Center (JSOC) HMI M720s line-of-sight (LOS) magnetograms to a radial-flux Carrington Rotation (CR) map. This extensible software also serves as a prototype/example for future users to incorporate magnetograms from alternative observatories. In future iterations, we plan to expand capabilities to include NSO GONG LOS magnetograms and HMI vector magnetograms.
For data acquisition of HMI LOS magnetograms, OFTpy acts as a Python wrapper for product query and download from Stanford University's JSOC. Figure 2 shows an example. OFTpy allows the user to find and download the required data product over a specified time range and cadence. Processing steps include: high-resolution interpolation to a CR map, downsampling by integration, and conversion from line-of-sight to radial. Additional pre-processing calculates the assimilation weights for use in data assimilation. The default assimilation functionality sets the weights in the pre-processing steps; however, the Python map-object class has been designed with a flexible set of layers that will allow more complex user-defined assimilation routines.
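OFTpy's actual processing pipeline is not reproduced here, but two of the pre-processing steps named above can be sketched in a few lines of NumPy: the radial-flux conversion divides each LOS pixel by μ, the cosine of its heliocentric angle, and the assimilation weight tapers from disk center toward the limb. The cutoff value, weight exponent, and function names below are illustrative assumptions, not OFTpy's actual choices.

```python
import numpy as np

def los_to_radial(b_los, lat, lon, cm_lon=0.0):
    """Convert LOS magnetogram values to radial flux under the
    radial-field assumption: B_r = B_los / mu, where mu is the cosine
    of the heliocentric angle of each pixel (lat/lon in radians,
    cm_lon = Carrington longitude of disk center)."""
    mu = np.cos(lat) * np.cos(lon - cm_lon)
    b_r = np.where(mu > 0.1, b_los / np.clip(mu, 0.1, None), 0.0)
    return b_r, mu

def assimilation_weight(mu, mu_cut=0.1, power=4.0):
    """Illustrative center-to-limb weight: highest at disk center
    (mu = 1), tapering toward the limb; pixels beyond the cutoff get
    zero weight (hypothetical functional form)."""
    return np.where(mu > mu_cut, mu ** power, 0.0)

# Toy 3x3 patch of pixels near disk center, all measuring 100 G LOS.
lat = np.deg2rad(np.array([[-10.0, 0.0, 10.0]] * 3).T)
lon = np.deg2rad(np.array([[-10.0, 0.0, 10.0]] * 3))
b_los = np.full((3, 3), 100.0)
b_r, mu = los_to_radial(b_los, lat, lon)
w = assimilation_weight(mu)
```

Off-center pixels receive a radial value larger than the LOS value (division by μ < 1) and a weight smaller than the disk-center pixel, which is the qualitative behavior a center-weighted assimilation scheme needs.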

Convective Flow Generator: ConFlow
Supergranules are convective structures in the Sun associated with the magnetic network and most readily observed in Doppler velocity images. These structures have flow velocities of ∼500 m/s, diameters of about 30,000 km, and lifetimes of about a day. Typically, surface flux transport (SFT) models use a diffusion coefficient to account for the mixing that occurs from the turbulent convective motions. This diffusion causes the magnetic flux in active regions to weaken and spread out uniformly. An alternative approach is to explicitly model the turbulent surface flows produced by convection, as is done in the AFT SFT model [4]. The convective flows are created by using vector spherical harmonics to produce an evolving convective velocity field that captures details of the turbulent motion, including the power spectra (up to l of ∼350) and the finite lifetimes of the convective cells. On their own, simulations of these velocities are visually indistinguishable from Doppler observations of the Sun. When incorporated into an SFT model, the use of realistic photospheric supergranular velocity fields drives the evolution of the flux. This produces an evolving magnetic network pattern similar to the quiet-Sun network observed on the Sun, far surpassing the realism of SFT models that use a diffusion coefficient alone.
Courtesy of Dr. David Hathaway, we have implemented his convective flow model in a stand-alone code called ConFlow that will be made available as open source (the ConFlow module in Figure 1). ConFlow provides the poloidal and toroidal components of the convective flows using an improved specification of the input spectral coefficients of the vector spherical harmonics. In the version of the flow model used in [4], these spectral coefficients were defined as discrete values determined through a trial-and-error iterative process to minimize the differences between the output Doppler velocity spectrum and the one measured by SDO/HMI. For ConFlow, this has been updated with an improved specification for the poloidal and toroidal input spectral coefficients, which are now defined analytically for each spherical harmonic of degree l. In addition, to avoid aliasing artifacts due to the sampling limits of the default resolution of our flux transport model (1024x512), the input spectra are cut off at l = 340, which acts as a low-pass filter that keeps most of the supergranular convection cells but rejects higher-frequency convective cells, such as granulation, that cannot be resolved.
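The effect of the l = 340 cutoff can be illustrated with a toy spectrum: degrees above the cutoff are zeroed before a flow realization is synthesized, so unresolvable small-scale cells never enter the velocity field. The spectrum shape below is hypothetical (peaked near a supergranular-like degree); ConFlow's analytic coefficients are not reproduced here.

```python
import numpy as np

def spectral_cutoff(l_max=512, l_cut=340, l_peak=120.0):
    """Build an illustrative convective power spectrum over spherical
    harmonic degree l and apply a hard low-pass cutoff at l_cut, as
    ConFlow does to avoid aliasing on a 1024x512 transport grid.
    The x*exp(-x) spectrum shape is a toy stand-in, peaked at l_peak."""
    l = np.arange(1, l_max + 1)
    spectrum = (l / l_peak) * np.exp(-l / l_peak)  # toy supergranule peak
    filtered = np.where(l <= l_cut, spectrum, 0.0)
    return l, filtered

l, s = spectral_cutoff()
```

Everything up to and including l = 340 survives the filter; l = 341 and above, which would alias on the default grid, carry zero power.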

High-performance Flux Transport (HipFT)
HipFT is the computational core of OFT. It implements advection, diffusion, and data assimilation for the solar surface on a logically rectangular non-uniform spherical grid. It is parallelized for use with multi-core CPUs and GPUs (both NVIDIA and Intel) using a combination of Fortran's standard parallel 'do concurrent', OpenMP Target data directives, and MPI. It is designed to be modular, incorporating various differential rotation, meridional flow, supergranular convective flow, and data assimilation models. It can also compute multiple realizations in a single run, spanning multiple choices of parameters, resulting in an ensemble of maps.
HipFT uses a combination of numerical methods focused on accuracy and performance. To avoid the excessive diffusivity of upwind schemes for advection, we have implemented a flux-preserving WENO3 [5] scheme that is up to fourth-order accurate in space in regions of smooth solutions and first-order accurate near high gradients. It is combined with the strong-stability-preserving SSPRK(4,3) [6] time stepping scheme, which doubles the stable time step limit. For diffusion, we use the second-order explicit Gegenbauer extended stability Runge-Kutta scheme [7] (RKG2). We also iterate the diffusion using a novel practical time step limit (PTL) [8] that ensures the diffusion is accurate for any size of the overall time step. The advection and diffusion operators are combined with second-order Strang splitting.
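The SSPRK(4,3) scheme mentioned above has a simple four-stage form. The sketch below is the standard scheme from the literature, shown as a self-contained illustration rather than HipFT's Fortran implementation; its SSP coefficient of 2 is what doubles the stable step relative to forward Euler.

```python
import math

def ssprk43_step(u, dt, f):
    """One step of the 4-stage, 3rd-order strong-stability-preserving
    Runge-Kutta scheme SSPRK(4,3): three forward-Euler-like half
    steps plus one convex combination with the old state."""
    u1 = u + 0.5 * dt * f(u)
    u2 = u1 + 0.5 * dt * f(u1)
    u3 = (2.0 / 3.0) * u + (1.0 / 3.0) * u2 + (dt / 6.0) * f(u2)
    return u3 + 0.5 * dt * f(u3)

# Sanity check on the linear decay problem u' = -u, u(0) = 1.
u, dt = 1.0, 0.01
for _ in range(100):
    u = ssprk43_step(u, dt, lambda x: -x)
```

After 100 steps of size 0.01 the numerical solution should match exp(-1) to third-order accuracy, i.e. far better than 1e-5.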
In Fig. 3, we show an example production run over 11 years, from 2011 to 2022, using convective flows from ConFlow and data assimilation with OFTpy HMI data, along with a diffusion coefficient of 175 km²/s. The 11-year run took ∼13 hours to compute on a single NVIDIA RTX 3090 Ti GPU.
The HipFT code is publicly available on GitHub. Full details of the HipFT code, including its capabilities, algorithms, performance, and validation tests, are described in an upcoming publication.

WSA solar wind model using a high-performance PFSS+CS model
The simplest description of the coronal magnetic field, based on a photospheric boundary map of Br, is a potential (current-free) field model, i.e., a solution of Laplace's equation, ∇²Φ = 0. Potential field source-surface (PFSS) and potential field current-sheet (PFCS) models have been computed for many years, typically using algorithms based on a spherical harmonic expansion. In the PFSS model, the field at the top boundary (the source surface, typically at 2-2.5 R⊙) is assumed to be radial. The PFCS model uses a Laplace solve to extend the field from the source surface to 20-30 R⊙ and reproduce the latitudinally independent behavior of |Br| observed by Ulysses. We have developed a publicly available code, POT3D [9], that solves Laplace's equation using finite differences on a non-uniform logically-rectangular spherical grid. POT3D uses MPI to exploit massively parallel computations with 3D data decomposition. It allows user-specified meshes that can be made highly nonuniform. POT3D computes solutions on CPUs and GPUs far more rapidly than traditional spherical harmonic solvers, allowing for efficient high-resolution calculations. The GPU acceleration is accomplished using Fortran's standard parallelism ('do concurrent') along with OpenACC directives for data management. In [9], we analyzed the various qualitative and quantitative differences that naturally arise from using different maps as input, and illustrated how coronal morphology and open flux depend most strongly on the outer boundary condition. We have also shown that large-scale morphologies and the open magnetic flux are remarkably insensitive to model resolution, while the surface mapping and embedded magnetic complexity vary considerably. This establishes important context for past, current, and future applications of the potential field for coronal and solar wind modeling.
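POT3D's actual solver operates on large 3D spherical grids; as a minimal illustration of the finite-difference approach to Laplace's equation, the toy below applies Jacobi iteration to a 2D Dirichlet problem. The grid size, tolerance, and the choice of Jacobi rather than a faster Krylov-type method are illustrative simplifications.

```python
import numpy as np

def solve_laplace(phi, tol=1e-8, max_iter=20000):
    """Jacobi iteration for the interior of a 2D Laplace problem on a
    uniform grid, holding the boundary values of `phi` fixed
    (Dirichlet data): each interior point is repeatedly replaced by
    the average of its four neighbors until the update stalls."""
    phi = phi.copy()
    for _ in range(max_iter):
        new = phi.copy()
        new[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1]
                                  + phi[1:-1, 2:] + phi[1:-1, :-2])
        if np.max(np.abs(new - phi)) < tol:
            return new
        phi = new
    return phi

# Boundary data taken from the harmonic function x*y; interior zeroed.
n = 33
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
phi0 = X * Y
phi0[1:-1, 1:-1] = 0.0
phi = solve_laplace(phi0)
```

Since x·y is harmonic (and exactly discretely harmonic for the 5-point stencil), the converged interior should reproduce it to roughly the solver tolerance.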
A key requirement for having POT3D serve as the coronal model in our SWQU effort is for it to provide coronal magnetic field solutions that agree well with those from the spherical harmonic potential field solver previously used in the WSA model [10,11] (we denote this overall model as SH-WSA, and that using POT3D as POT3D-WSA). In Fig. 4 we show side-by-side comparisons between the SH-WSA model results and those using POT3D-WSA. We compared the asymptotic predicted solar wind speed and interplanetary magnetic field (IMF) polarity at the outer boundary of the models, positioned at 21.5 R⊙, and analyzed sources of discrepancies between the two. While we found that slight differences between the field line tracing methods resulted in a handful of field lines near the magnetic neutral line connecting to dramatically different locations on opposite sides of the magnetic neutral line, they were not the dominant source of discrepancy. The primary sources were either low-lying closed flux arcades in polar coronal holes that showed up in one model and not the other, or slight differences in magnetic connectivity in regions where the solar wind velocity equation is particularly sensitive. Inter-model discrepancies were compared to intra-model discrepancies in the presence of small perturbations of the models' computational settings, and the two were found to be comparable. We also compared inter-model discrepancies to the variation within model ensembles created using ADAPT photospheric magnetic flux maps, and found that the ensemble variation of the two models was significantly higher than their computational uncertainties. A paper reporting the results of this study has been submitted for publication.
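For context, the WSA solar wind velocity equation referred to above relates the terminal wind speed to the flux-tube expansion factor f_s and the angular distance d of the field-line footpoint from the nearest coronal-hole boundary. The sketch below uses one commonly quoted functional form with illustrative parameter values; operational WSA implementations tune these coefficients, and the numbers here should not be taken as the values used in [10,11].

```python
import numpy as np

def wsa_speed(fs, d, v0=285.0, v1=625.0, alpha=2.0 / 9.0, beta=0.8, w=2.0):
    """Illustrative WSA-style empirical speed relation (km/s):
    speed decreases with the expansion factor fs and with proximity
    d (degrees) of the footpoint to a coronal-hole boundary.
    All parameter values are placeholders, not tuned constants."""
    return v0 + v1 / (1.0 + fs) ** alpha * (1.0 - beta * np.exp(-(d / w) ** 2)) ** 3

# Fast wind: low expansion factor, footpoint deep inside a coronal hole.
v_fast = wsa_speed(fs=1.0, d=15.0)
# Slow wind: high expansion factor, footpoint near the hole boundary.
v_slow = wsa_speed(fs=50.0, d=1.0)
```

The sensitivity noted in the text is visible here: near the coronal-hole boundary the Gaussian factor varies steeply with d, so small connectivity differences between SH-WSA and POT3D-WSA translate into large speed differences.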

Modeling the inner heliosphere

Development of new capabilities in Chombo
One of the goals of this research is to develop a new code for SWx MHD applications, called HelioCubed, with the following major features.
• The underlying discretization of space is a locally-structured mapped multiblock grid using equiangular cubed-sphere mappings. Such a spatial discretization has no pole singularities and is well-suited for the remaining choices we have made about the discretization method.
• The discretization of the MHD equations uses a fourth-order-accurate finite-volume discretization in space, combined with a method-of-lines discretization in time based on a fourth-order-accurate Runge-Kutta method. Discrete spherical symmetry is preserved under time evolution. HelioCubed also supports block-structured adaptive mesh refinement (AMR).
• HelioCubed runs efficiently on high-performance computing platforms using MPI + domain decomposition for distributed-memory parallelism, as well as supporting the use of GPUs for single-MPI-rank calculations. It is implemented using Proto, a high-performance, high-productivity framework for implementing locally-structured grid methods for partial differential equations, being developed at LBNL. Our main reason for using Proto is that it provides a mechanism for supporting the use of GPU-based systems that is not available in older software frameworks such as Chombo [12], which are based on call-backs to Fortran, while providing support for obtaining high performance on such systems for high-order methods. Both CPU and GPU execution of Proto have been deployed using CUDA on NVIDIA GPU architectures and using HIP on AMD GPU architectures. The CUDA version has been optimized for use with NVIDIA data center GPUs such as the V100, A100, and H100. The initial implementation of the new testing infrastructure, including continuous integration testing, has been completed.
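The equiangular cubed-sphere mapping underlying the first bullet can be written compactly: each of the six blocks covers angular coordinates (ξ, η) ∈ (-π/4, π/4), and a gnomonic projection places each point on the unit sphere with no pole singularity. The face ordering and orientation conventions below are illustrative assumptions, not necessarily those of HelioCubed.

```python
import numpy as np

def cubed_sphere_point(face, xi, eta):
    """Map equiangular coordinates (xi, eta) in (-pi/4, pi/4) on one
    of six cube faces to a point on the unit sphere via gnomonic
    projection. Face ordering (+x, -x, +y, -y, +z, -z) is an
    illustrative convention; each block center is normal to one
    Cartesian axis, as in the cubed-sphere geometry of Figure 6."""
    X, Y = np.tan(xi), np.tan(eta)
    r = np.sqrt(1.0 + X * X + Y * Y)
    frames = {
        0: (1.0, X, Y),     # +x face
        1: (-1.0, -X, Y),   # -x face
        2: (-X, 1.0, Y),    # +y face
        3: (X, -1.0, Y),    # -y face
        4: (-Y, X, 1.0),    # +z face
        5: (Y, X, -1.0),    # -z face
    }
    x, y, z = frames[face]
    return np.array([x, y, z]) / r

p = cubed_sphere_point(0, 0.0, 0.0)   # center of the +x face
q = cubed_sphere_point(4, 0.3, -0.2)  # generic point on the +z face
```

A 3D spherical shell then follows by taking the tensor product of this surface mapping with a radial coordinate, as the text describes.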
The development of HelioCubed is a complex undertaking, with multiple major components.For this reason, we have factored the development process into three components that can be developed concurrently, with a relatively small effort needed to integrate them together at the end.

HelioCubed: A code for solving MHD equations
We have extended the algorithm for systems of hyperbolic conservation laws described in [13,14] to the case of MHD. This method uses a method-of-lines approach, with a 4th-order accurate Runge-Kutta method (RK4) for time discretization. The right-hand sides of the conservation laws are computed using fluxes evaluated at each stage with a 4th-order finite-volume method. This is combined with dissipation methods for steep gradients and discontinuities, such as a nonlinear limiter and high-order linear and nonlinear artificial viscosities, that preserve 4th-order accuracy in smooth parts of the solution. Divergence cleaning is performed using the Powell method [15]. Finally, we use a specialized version of the finite-volume discretization for abstract spherical coordinate systems (for which equiangular cubed-sphere and classical spherical coordinates are special cases), which, when applied to discrete spherical solutions, preserves spherical symmetry to machine precision.
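The method-of-lines structure described above (a spatial flux-divergence operator advanced with RK4) can be sketched in one dimension. In the toy below, a 4th-order central stencil stands in for the finite-volume flux divergence, and no limiter or artificial viscosity is applied since the test solution stays smooth; it is an illustration of the time-integration structure, not HelioCubed's MHD operator.

```python
import numpy as np

def rhs_advection(u, dx, c=1.0):
    """4th-order central-difference RHS for u_t = -c u_x on a
    periodic grid (a smooth-solution stand-in for a 4th-order
    finite-volume flux divergence)."""
    up1, up2 = np.roll(u, -1), np.roll(u, -2)
    um1, um2 = np.roll(u, 1), np.roll(u, 2)
    ux = (-up2 + 8.0 * up1 - 8.0 * um1 + um2) / (12.0 * dx)
    return -c * ux

def rk4_step(u, dt, rhs):
    """Classical 4th-order Runge-Kutta step (method of lines)."""
    k1 = rhs(u)
    k2 = rhs(u + 0.5 * dt * k1)
    k3 = rhs(u + 0.5 * dt * k2)
    k4 = rhs(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

n = 128
x = np.linspace(0.0, 1.0, n, endpoint=False)
dx = 1.0 / n
u = np.sin(2.0 * np.pi * x)
dt = 0.5 * dx
for _ in range(int(round(1.0 / dt))):  # advect once around the domain
    u = rk4_step(u, dt, lambda v: rhs_advection(v, dx))
```

After one full period the profile should return to its initial state to high-order accuracy, and the central stencil conserves the discrete mean to roundoff, mimicking the conservation property of the finite-volume form.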
To carry out the development of the MHD method, we implemented a stand-alone test bed for the case of classical spherical coordinates. We translate boundary conditions generated by the POT3D coronal model [9] into a format that is compatible with both MS-FLUKSS [16] and HelioCubed, using the Proto interface to the HDF5 library to read boundary conditions, output data, and save checkpoint files. The software is capable of executing solar wind simulations using both time-independent and time-dependent boundary conditions. The functionality to extract data along spacecraft trajectories has been incorporated into HelioCubed. Furthermore, the constant-turn flux rope CME model [3,17] has been incorporated into HelioCubed. The left panel of Figure 5 shows our preliminary results for data-driven solar wind simulations performed using HelioCubed. We have also validated our results by direct comparison with the solution produced by MS-FLUKSS using the same boundary conditions, as shown in the right panel of Figure 5.

Block-structured AMR in Cubed-Sphere Coordinates
The spatial discretization for HelioCubed is based on a mapped-multiblock discretization [18] of a spherically-symmetric domain. In this approach, the unit sphere is discretized as the union of six smooth logically-rectangular blocks that align with each other at patch boundaries. The coordinatization of a 3D spherical domain is conceptually identical to that of a mapped-multiblock domain, since it is in effect a tensor product of a spherical surface with a radial coordinate. Thus, we use the methods developed in [13,14] for a single rectangular domain to represent the coupling across the blocks.

Figure 6: Sample output from applying a 4th-order Laplace operator to a mapped multiblock dataset with "cubed sphere shell" geometry. Each of the 6 blocks is oriented with its center normal to the positive and negative Cartesian coordinate axes and is composed of four 32×32×1 patches. When applying the Laplace operator, the 24 patches are distributed across 4 MPI processes and load-balanced to place contiguous patches in the same block on the same CPU. The results vary smoothly at block boundaries, so that no numerical artifacts due to interpolated block boundary conditions are introduced.
The Mapped-Multiblock Framework. We have implemented the algorithms of [18] specialized to the case of abstract spherical coordinates. This includes the methods for representing multiblock coordinate systems and specifying data on the unions of rectangles defined for such coordinate systems. The 4th-order accurate methods are used for computing ghost cells at block boundaries.
AMR on Mapped Grids. We have developed the AMR capabilities required by HelioCubed for MHD. These include basic representations of data on unions of rectangles, including data objects for data on AMR hierarchies, a version of the refinement-in-time Runge-Kutta time discretization for AMR grids described in [14], and 4th-order accurate interpolation in space and time at refinement boundaries.
Single-Patch Tools for Mapped Grids. We have implemented a number of tools that are needed for mapped grids, primarily for transforming between the logical and physical coordinate spaces. These include the calculation of metric terms for the mappings, product and quotient operators, and stencils for transforming between various centerings. Implementing these with 4th-order accuracy is a complicated task, especially for vector- and tensor-valued patch data. We have also developed specialized versions of these and other operators that are required to implement the versions of our MHD algorithm that preserve spherically symmetric solutions.
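As a concrete example of a centering transformation, the standard 4th-order conversion from cell-centered point values to cell averages uses the one-dimensional stencil ⟨u⟩_i = u_i + (u_{i+1} − 2u_i + u_{i−1})/24. The sketch below is the textbook formula, not code extracted from HelioCubed, and verifies it against exact cell averages of a sine wave on a periodic grid.

```python
import numpy as np

def point_to_average(u):
    """4th-order transform from point values at cell centers to cell
    averages on a periodic grid:
    <u>_i = u_i + (u_{i+1} - 2 u_i + u_{i-1}) / 24 + O(h^4)."""
    return u + (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / 24.0

n = 64
h = 1.0 / n
x = (np.arange(n) + 0.5) * h
u = np.sin(2.0 * np.pi * x)
avg4 = point_to_average(u)
# Exact cell average of sin(2*pi*x) over a cell of width h.
exact = np.sin(2.0 * np.pi * x) * np.sin(np.pi * h) / (np.pi * h)
err = np.max(np.abs(avg4 - exact))
```

At this resolution the 4th-order transform is several orders of magnitude more accurate than simply using the point value as the cell average, which is the 2nd-order error a high-order finite-volume method cannot afford.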
These tools are factored into mapped-multiblock, AMR, and single-patch tools, but in fact many of them are used in two or three contexts. For example, the single-patch tools can be called for a single patch or across multiple patches in an AMR or mapped-multiblock union of rectangles. A second example is the Proto interface for HDF5 files, which can be written for all three contexts and displayed by the VisIt visualization tool [19] in both the abstract-coordinate and physical spaces. In Figure 6, we show an example of such a calculation on mapped-multiblock coordinates obtained using our HDF5 capability and VisIt.

Ensemble modeling of CMEs.
Flux-rope-based magnetohydrodynamic modeling of coronal mass ejections (CMEs) is a promising tool for prediction of the CME arrival time and magnetic field at Earth. In [3], we introduced a constant-turn flux rope model and used it to simulate the 12-July-2012 16:48 CME in the IHS. We constrained the initial parameters of this CME using the graduated cylindrical shell (GCS) model and the reconnected flux in post-eruption arcades. We correctly reproduced all the magnetic field components of the CME at Earth, with an arrival time error of approximately 1 hour. Such a small error is not solely an advantage of the proposed method; in reality, the chosen CME had properties easily identifiable from the observational data. We further estimated the average subjective uncertainties in the GCS fittings by comparing the GCS parameters of 56 CMEs reported in multiple studies and catalogs. We determined that the GCS estimates of the CME latitude, longitude, tilt, and speed have average uncertainties of 5.74°, 11.23°, 24.71°, and 11.4%, respectively. Using these average uncertainties, we created 77 ensemble members for the 12-July-2012 CME. We found that 55% of our ensemble members correctly reproduce the sign of the magnetic field components at Earth. We also determined that the uncertainties in GCS fitting can widen the CME arrival time prediction window to about 12 hours for the 12-July-2012 CME. On investigating the forecast accuracy affected by the uncertainties in individual GCS parameters, we conclude that the half-angle and aspect ratio have little impact on the predicted magnetic field of the 12-July-2012 CME, whereas the uncertainties in longitude and tilt can introduce a relatively large spread in the magnetic field predicted at Earth.
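The ensemble construction described above can be sketched as follows. Gaussian perturbations with the quoted average uncertainties are an illustrative choice, and the base-fit values and dictionary keys below are hypothetical placeholders, not the actual 12-July-2012 GCS fit.

```python
import numpy as np

def make_gcs_ensemble(base, n_members=77, seed=42):
    """Draw ensemble members by perturbing GCS fit parameters with
    the average subjective uncertainties estimated in the text:
    5.74 deg (latitude), 11.23 deg (longitude), 24.71 deg (tilt),
    and 11.4% (speed). The Gaussian sampling is illustrative."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_members):
        members.append({
            "lat": base["lat"] + rng.normal(0.0, 5.74),
            "lon": base["lon"] + rng.normal(0.0, 11.23),
            "tilt": base["tilt"] + rng.normal(0.0, 24.71),
            "speed": base["speed"] * (1.0 + rng.normal(0.0, 0.114)),
        })
    return members

base = {"lat": -8.0, "lon": 2.0, "tilt": 55.0, "speed": 900.0}  # placeholder fit
ens = make_gcs_ensemble(base)
```

Each member is then run through the MHD pipeline; the spread of arrival times across the 77 members is what widens the prediction window quoted in the text.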
In [17], we developed an innovative technique for predicting CME arrival times, which utilizes magnetohydrodynamic simulations with data-informed, flux-rope-based CMEs introduced into a data-driven solar wind background. For the six CMEs we analyzed, the mean absolute error (MAE) in arrival time was roughly 8 hours. We further refined our arrival time predictions by implementing ensemble modeling and comparing the collective solutions with data from the STEREO-A and STEREO-B heliospheric imagers. This involved using our simulations to generate synthetic J-maps. The machine-learning (ML) technique known as lasso regression was employed for this comparison. By adopting this strategy, we were able to cut the MAE down to approximately 4 hours. Additionally, another ML method built on neural networks (NNs) allowed us to lower the MAE to about 5 hours when HI data from both STEREO-A and STEREO-B were accessible. NNs proved capable of delivering a comparable MAE even when only the STEREO-A data were utilized. Moreover, our approaches resulted in quite promising standard deviation values for the arrival time.
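Lasso regression itself is compact enough to sketch directly. The coordinate-descent implementation below is a generic textbook version, not the code from [17], and omits the J-map feature construction; on synthetic data it shows the property that makes lasso useful for weighting ensemble members: it drives uninformative coefficients to zero.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=500):
    """Coordinate-descent lasso: minimize
    (1/2n)||y - Xw||^2 + alpha*||w||_1 by cycling through features
    and applying the soft-thresholding update to each coefficient."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ w + X[:, j] * w[j]      # residual excluding feature j
            rho = X[:, j] @ r / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
    return w

# Synthetic sparse-recovery check: only features 0 and 3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
w_true = np.zeros(10)
w_true[[0, 3]] = [2.0, -1.5]
y = X @ w_true + 0.01 * rng.normal(size=200)
w = lasso_cd(X, y, alpha=0.05)
```

The recovered vector keeps the two informative coefficients (slightly shrunk by the L1 penalty) and suppresses the rest, which is the behavior exploited when selecting the ensemble members whose synthetic J-map tracks best match the heliospheric imager data.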

Conclusions: SWx community engagement
We are developing publicly available software for SWx forecasts with the aim of making it formally more accurate and an order of magnitude faster than any existing codes. We work with a clear understanding of the particular requirements an open-source software package should satisfy as a potential prototype of a new operational tool. This includes active engagement with the operational end users at the NASA/JSC Space Radiation Analysis Group (SRAG) and the NOAA/NWS Space Weather Prediction Center (SWPC) to better understand their operational needs and constraints. In consultation with end users, we are adding new features to our software to help simplify the SWx community's efforts to create new operational tools and products toward improving SWx analyses and predictions.

Figure 1 :
Figure 1: Summary chart of the Open Flux Transport software suite.

Figure 2 :
Figure 2: An HMI 720s magnetogram, observed on 2021/01/01 23:59:52, converted to radial flux, projected into Carrington coordinates, and re-binned to 1024x512 resolution with OFTpy.

Figure 3 :
Figure 3: Example of output Br maps from an 11-year OFT/HipFT run. The run used ConFlow supergranular flows with a spectral cutoff of l = 340 and a diffusion coefficient of 175 km²/s. Left to right and top to bottom: maps on January 1st from 2011 to 2022.

Figure 4 :
Figure 4: Comparison between PFSS+PFCS+WSA model outer boundaries located at 21.5 R⊙ for 2020 Feb 02 20:00. The left column shows the magnetic field and the right shows the model-derived terminal solar wind speed. The top and middle panels are calculated using the SH-WSA and POT3D-WSA models, respectively. The bottom panel shows the absolute difference between the results of the two codes.

Figure 5 :
Figure 5: Left panel: solar wind solution calculated with HelioCubed using time-independent boundary conditions at UT 2020-02-02T19:59:53, created using the OFT-POT3D-WSA-HelioCubed chain of models. Right panel: comparison of the SW solution at Earth using the same boundary conditions with MS-FLUKSS (orange) and HelioCubed (blue).
These efforts are accompanied by close coordination with transition centers, such as the Community Coordinated Modeling Center (CCMC), the Short-term Prediction Research and Transition Center (SPoRT), SWPC operations, and SWPC's Space Weather Prediction Testbed (SWPT).

15th International Conference on Numerical Modeling of Space Plasma Flows, Journal of Physics: Conference Series 2742 (2024) 012013