Brought to you by:
Topical Review

Advanced multiconfiguration methods for complex atoms: I. Energies and wave functions

, , , and

Published 7 September 2016 © 2016 IOP Publishing Ltd
, , Citation Charlotte Froese Fischer et al 2016 J. Phys. B: At. Mol. Opt. Phys. 49 182004 DOI 10.1088/0953-4075/49/18/182004

0953-4075/49/18/182004

Abstract

Multiconfiguration wave function expansions combined with configuration interaction methods are a method of choice for complex atoms where atomic state functions are expanded in a basis of configuration state functions. Combined with a variational method such as the multiconfiguration Hartree–Fock (MCHF) or multiconfiguration Dirac–Hartree–Fock (MCDHF), the associated set of radial functions can be optimized for the levels of interest. The present review updates the variational MCHF theory to include MCDHF, describes the multireference single and double process for generating expansions and the systematic procedure of a computational scheme for monitoring convergence. It focuses on the calculations of energies and wave functions from which other atomic properties can be predicted such as transition rates, hyperfine structures and isotope shifts, for example.

Export citation and abstract BibTeX RIS

1. Introduction

Atomic physics was the original testing ground for the new-born quantum theory close to a century ago, both regarding the nonrelativistic theory by Schrödinger [1] and the relativistic theory by Dirac [2, 3]. Just a few years after this change of paradigm in physics, computational methods were introduced to deal with models for systems other than the simplest hydrogen-like one [46]. Since then, the development of computational methods has been closely linked to dealing with new challenges in atomic physics—from atomic spectroscopy that was introduced and flourishing in the 1900s [7, 8], to the high-order, harmonic generation in ultra-high intense laser fields in recent days [9].

Today atomic physics is an important and very active branch of physics, both for its own sake while constantly finding new and exciting fields and applications, but also in support of other disciplines through data and fundamental insights. Atomic physicists are competing in high-profile journals and are active in many of the most prestigious laboratories in the world, including, e.g., CERN in the search for anti-hydrogen [10].

To classify some of the most interesting fields of research and 'Raison d'être' for atomic physics in general, and computational methods in particular, we can look at

  • (i)  
    Many-body theory, since atomic systems are governed by the well-understood electromagnetic interactions.
  • (ii)  
    Fundamental physics, since atomic experiments give unprecedented accuracy at a relatively low cost, opening up the possibility of performing extremely accurate measurements and finding small disturbances in exotic processes.
  • (iii)  
    Plasma diagnostics in astrophysics, since most of the information about the Universe, consisting of ions and electrons, reaches us through electromagnetic radiation and most of the 'visible' Universe is in the plasma-state.
  • (iv)  
    Complementing experiments, in a symbiotic relationship, in which the expensive and time consuming experiments are used to benchmark critically evaluated computational data, for diagnostics, and other purposes.

In this introduction we will start to describe this in more details, and then give an outline of this review.

1.1. Many-body theory

The electromagnetic interactions in atoms are well-known—in nonrelativistic theory arguably exactly, and within the relativistic framework to very high precision. It is fair to say that the theory of quantum electrodynamics (QED) is the best tested theory of interaction in physics where, e.g., its 'coupling constant'—the fine-structure constant—is known to 0.3 parts in a billion ($\alpha =0.0072973525664(17)$) an unprecedented accuracy [11]. The fact that the interactions are well understood (we know how to describe the nucleus–electron and electron–electron interactions) opens up a unique possibility to study many-body effects. After starting with the independent particle model, we can therefore focus on what we will refer to as correlation—the complex dynamic behavior of electrons.

1.2. Fundamental properties

Experimental atomic physics has reached accuracies that are unprecedented for determining, for example, relative energies of stationary states. If, at the same time, it is possible to develop accurate computational methods for the 'known' atomic structure, it opens up the possibility of measuring small and minute effects, as a difference between the measured and computed results. If everything else is handled in a systematic fashion, these deviations could be interpreted as due to fundamental processes left out in the computational methods. This methodology has been applied to a wide range of fields, for example, the properties of exotic nuclei [12], violation of different fundamental symmetries [13], or the variation of the fine-structure constant with space and time [14, 15]. It is clear that atomic physics offers a unique and, relatively speaking, inexpensive way to investigate these topics. For relatively simple systems, calculations are now accurate enough to lead to the development of the next generation of atomic clocks [16].

1.3. Plasma diagnostics

The most important argument for investigation of atoms and ions is probably the fact that over 99% of all visible matter in the Universe is in the plasma state [17] and many interesting features in the laboratory consist of plasmas. Since the constituents of a plasma are charged ions, together with electrons and photons, virtually all information we get on their properties is from the light they emit. This is the realm of theoretical atomic physics, where one predicts the light-emission of ions and how it is affected by the property of the plasma. If data for atomic transitions are known, the spectra from the plasma can give information about its fundamental properties, e.g., temperature and density (if they are well-defined), as well as the abundance of different elements and the balance between different ionization stages [18]. In some cases we are also able to determine magnetic fields—their strengths and polarization [19, 20].

In cases where the plasma is not in what is referred to as local thermal equilibrium, even stronger demands are put on the atomic data, to be used in modeling [21, 22]. In addition, other atomic parameters such as line shapes, might be useful for different properties of the plasma. If we know the influence of the nucleus on the atomic structure, manifested by the so called hyperfine structure and isotope shifts, we can determine the isotopic composition of, e.g., astrophysical plasma—important to test different models of nucleosynthesis in the stars and in the interstellar medium [2328].

1.4. Complementing experiments

It is clear that experimental determination of the wealth of data needed is both extremely time consuming and expensive. Unfortunately, this has lead to a situation where very few experimental groups today are involved with atomic spectroscopy. At the same time, the need for data is increasing [29]. We mention three examples, where recent developments put great strain on atomic physics:

  • (i)  
    Fusion power might be one of the energy sources for the future. To confine the fusion plasma, which has a temperature of millions of degrees, it is necessary to select the wall and divertor material with great care. The magnetic confinement is just not enough—there will always be some 'stray' particles that will hit the wall or divertor. For the latter, it turns out that tungsten could be the best choice [18]. It has excellent chemical properties, e.g., high heat conductivity and high melting point, but it also has a very complex atomic structure [30, 31]. When tungsten atoms are sputtered and contaminate the plasma, the complex structure of the atom leads to the risk of heavy loss of energy due to many possible ways of radiating, and thereby also causing instabilities within the plasma. The complexity is also a hindrance to the necessary diagnostics of this contamination—very little is known about the structure of the different ions of tungsten. A major effort in later years on the spectroscopy of tungsten, has been to understand these complex systems and be able to help in developing a new energy source [3234].
  • (ii)  
    Recently some astrophysical missions and spectrographs (GIANO [35], CRIRES [36], SOFIA [37]) have moved into the infra-red wavelength region, making new objects visible and open for analysis. This is due to the fact that the infra-red light has a higher transmission through dust clouds, which are common in our Galaxy and beyond. But, the long wavelength infra-red spectra are produced by transitions between levels lying close together in energy. Examples of this in atomic ions are transitions between highly excited states in, e.g., Rydberg series, and unexpected or forbidden transitions within ground configurations. Both of these are a challenge for experiment and need a strong support by accurate and systematic computations.
  • (iii)  
    Also recently, a new form of experiments has been developed that has reached a realm of properties of observed matter and time scales considered impossible just a decade or two ago. It involves ion traps [38, 39] and storage rings [4042] where very low densities, sometimes single ions, of highly charged plasma can be studied. This opens up the possibility to probe exotic processes in ions, such as forbidden and unexpected transitions from states with lifetimes of up to seconds or even hours [19]. Considering that 'normal' lifetimes in ions are in the nanosecond range or less, the lifetimes of these long-lived states are to these 'normal' lifetimes in ions as the age of the Universe is to one single day. Modelling of the processes behind these transitions, whether it is nuclear spin-flips or high-order multipole interactions, is a true challenge to theory and probes our deep understanding of quantum mechanics.

The most efficient approach is therefore to use computations to model ions and benchmark these with selected and targeted experiments. With advances in technology, more and more properties are being predicted through computation, where comparison with experiment provides a validation and a mechanism for assessing the accuracy of a computational result.

1.5. Determination of accuracy

Atomic data are needed for different reasons. This requires a thorough understanding of the atomic system consisting of a nucleus and a number of electrons, possibly in strong magnetic fields. In this review we will discuss one family of methods to deal with these systems. But it is not only important to find theoretical values of different properties, we also need to find a method to critically evaluate the data. There are basically two ways to approach this, either one derives theoretical expressions that give upper limits of the error in a computed result [43, 44], or one designs an approach that, in a systematic fashion, extends the complexity or simply the size of the calculations [4547]. If the complexity could be described quantitatively, it is then possible to estimate the convergence of the calculations, or even extrapolate to give the deviation from the 'exact' value [4851]. The methods we describe here offer a clear way to define a systematic approach since, as we will see in later sections, an atomic state is represented by an atomic state function (ASF), ${\rm{\Psi }}({\rm{\Gamma }}J)$, which is expanded in a set of configuration state functions (CSFs), ${\rm{\Phi }}({\gamma }_{\alpha }J)$,

Equation (1)

The CSFs are created as linear combinations of products of members of an active set (AS) of orbitals, according to suitable angular momentum coupling rules for the case at hand. By extending the AS systematically, we increase the space spanned by the CSFs and thereby approach the complete space and the exact representation of the atomic state. As we will discuss later, this opens up a method for investigating the convergence of our method, which in turn will give an estimation of its accuracy.

1.6. A computational approach

There are many computational methods in atomic physics. They may be classified generally as being based on perturbation theory or variational theory. Each may be further characterized as nonrelativistic or relativistic.

In the present review, we focus on general multiconfiguration variational methods that determine a wave function for an atomic state in terms of a basis of CSFs as shown in equation (1). The basis states are constructed from one-electron orbitals (i.e. wave functions) that depend on the Hamiltonian under consideration. For light atoms, where the size of the nucleus is not a significant factor and relativistic effects can be adequately represented by first-order theory, the nonrelativistic Hamiltonian with a point charge may be used for determining orbitals. For heavier elements, where the effect of the nuclear size needs to be considered and a fully relativistic treatment is needed, the Dirac–Coulomb Hamiltonian is the basic Hamiltonian for one-electron orbitals. Various corrections may then be added including QED effects.

The multiconfiguration Hartree–Fock (MCHF) method with Breit–Pauli corrections (MCHF+BP) and the multiconfiguration Dirac–Hartree–Fock (MCDHF) method with Breit and QED corrections (MCDHF+Breit+QED) represent these two approaches. Both are variational methods where the radial factors of orbitals are functions that optimize an energy expression. As a consequence, orbitals with a low generalized occupation are no longer 'spectroscopic' and represent corrections to the wave function for the electron–electron cusp condition arising from the singularities in the Hamiltonian away from the nucleus [52]. The underlying variational formulation of these two theories differ in detail, but is very similar in concept. The same computational procedures can be used. In fact, many concepts are easier to explain in nonrelativistic theory and we will do so here. Some of the variational theory for the MCHF method was presented in 1977 [53], when MCHF expansions were small and before the systematic computational schemes developed in the early 1990s were introduced.

A significant advance in the last few decades has been the introduction of systematic methods that rely on single and double (SD) excitations from a multireference (MR) set for generating wave function expansions. In a recent study, near spectroscopic accuracy was obtained for Si-like spectra consisting of nearly 100 levels with an expansion size of about a million [54]. The present review is restricted to the prediction of energies and wave functions for all elements of the periodic table. It updates the variational MCHF theory to include MCDHF, describes the MR single- and double- substitution (MRSD) process for generating expansions and the systematic procedure of a computational scheme for monitoring convergence. We refer to these as the MCHF-MRSD or MCDHF-MRSD computational procedures. The codes used for illustrative purposes are ATSP2K [55] and GRASP2K [56], respectively. Descriptions of these codes have been published and codes are freely available. They also are general purpose and have been tested extensively. For atoms with only a few electrons, Hylleraas type methods [57] are recommended (although code is not available, to our knowledge). At the same time, GRASP2K is not at the leading edge with respect to QED corrections, but given its open source availabilty, can be modified by the user.

A more recent advance has been the introduction of B-spline methods in which integro-differential equations are replaced by generalized eigenvalue problems [58]. This facilitates the calculation of high-lying Rydberg states, but also provides a complete basis set of orbitals in a fixed potential, where the range is restricted to the range of an occupied orbital. As a consequence, the orbitals would have a somewhat 'local' character. It is also possible to satisfy orthogonality conditions through the use of projection operators applied to the matrix eigenvalue problem. These methods will not be part of the focus of this review.

2. The many-electron Hamiltonians

2.1. The nonrelativistic Hamiltonian

In quantum mechanics, a stationary state of an N-electron atom is described by a wave function ${\rm{\Psi }}({{\bf{q}}}_{1},..,{{\bf{q}}}_{N})$, where ${{\bf{q}}}_{i}=({{\bf{r}}}_{i},{\sigma }_{i})$ represents the space and spin coordinates, respectively, of the electron labeled i. The wave function is assumed to be continuous with respect to the space variables and is a solution to the wave equation

Equation (2)

where ${ \mathcal H }$ is the Hamiltonian operator for the atomic system. For bound state solutions, the wave function must be square integrable and as a result of this, solutions exist only for discrete values of E that represent the total energy of the system.

The operator ${ \mathcal H }$ depends on the quantum mechanical formalism and on the atomic system, including the model for the nucleus. For nonrelativistic calculations, the starting point is the time-independent Schrödinger's equation (2) using the Hamiltonian for a nuclear point charge of infinite mass located at the origin of the coordinate system. In atomic units [11], this Hamiltonian is

Equation (3)

where h(i) is the one-electron nonrelativistic Schrödinger Hamiltonian of electron i moving in the Coulomb field of the nuclear charge Z, ri is the electron–nucleus distance and rij is the distance between electron i and electron j. The one-electron terms on the right-hand side describe the kinetic and potential energy of the electrons with respect to the nucleus, and the two-electron terms the Coulomb potential energy between the electrons. The latter terms introduce singularities into the wave equation, away from the origin, and are problematic in that they destroy the 'separability' of the Hamiltonian ${{ \mathcal H }}_{{\rm{NR}}}$. The Hamiltonian (3) can be approximated as

Equation (4)

where ui(r) depends only on r and not on the angular coordinates. It is useful to discuss the consequences of such a first approach. As we will see below, it will give us the form of the wave functions, which will be discussed in more detail in the next section. The approximation (4) implies that each electron moves in a central field ${V}_{i}^{C}(r)=-\tfrac{Z}{r}+{u}_{i}(r)$ of spherical symmetry

Equation (5)

For bound state $({\epsilon }_{i}\lt 0)$, the solutions ${\psi }_{i}$ can be written in spherical coordinates as [59]:

Equation (6)

where l and $s=1/2$ denote the orbital and spin quantum numbers, respectively, ml and ms specify the projections of ${\bf{l}}$ and ${\bf{s}}$ along the z-axis, and σ represents the spin variable. The radial functions Pnl(r) are solutions of the radial equation

Equation (7)

Equations (5) and (7) reduce to the hydrogenic Schrödinger equation if N = 1. In that particular case the potential simply reduces to ${V}^{C}(r)=-Z/r$ when neglecting the finite size of the nucleus. The radial equation only has solutions for eigenvalues $\epsilon \equiv {\epsilon }_{n}=-{Z}^{2}/(2{n}^{2})$ ${E}_{{\rm{h}}}$ (the Hartree unit for energy). With the eigenvalues being l-independent, the bound spectrum is highly degenerate $({g}_{n}=2{n}^{2})$. The radial functions Pnl(r) are the well-known hydrogen-like functions [59] with $n-l-1$ nodes. When $\epsilon ={k}^{2}/2\gt 0$, the spectrum is continuous and the different solutions to the radial equation are labeled by k instead of n. In this case the radial function Pkl(r) represents a free electron of momentum k [60, 61]. Though continuum processes are of great importance, this review will focus only on bound state solutions for atoms and ions. The principal quantum number n distinguishes solutions with the same set of other quantum numbers, but different energies. The condition $n\gt l$ assures that the power series expansion of the hydrogenic radial function terminates and the radial function is square integrable.

For a many-electron system with a point charge nucleus of charge $+Z$, a realistic central field potential for each electron has the asymptotic form ${V}^{C}(r)=-(Z-N\,+\,1)/r$ for large distances and behaves as $-Z/r$ as $r\to 0$. The requirement of connecting these two limits forces us to accept that the central field is no longer Coulomb-like. As a result, the one-electron eigenvalues ${\epsilon }_{{nl}}$ of (7) become l-dependent, in contrast with the hydrogenic case.

It is worthwhile to note that the functions (6) are also eigenfunctions of the parity operator Π where

Equation (8)

2.2. The Dirac–Coulomb–Breit Hamiltonian

In the nonrelativistic treatment it is formally straightforward to include the interaction between the electrons but, in the relativistic case, additional terms are needed since the instantaneous Coulomb interaction—electron–nucleus and electron–electron—is not Lorentz invariant and neglects the magnetic properties of the electron motion. Also, the speed of light (c) is finite in a relativistic model, and retardation effects need to be considered [62, 63]. A common approach is to combine the one-electron operators of the Dirac theory, with a nuclear potential, Vnuc(r), corrected for an extended nuclear charge distribution function, instead of the one for a point charge, a correction important for heavy elements. This yields the Dirac–Coulomb Hamiltonian, which in atomic units is [64]

Equation (9)

where hD is the one-electron Dirac operator (shifted for the energy to coincide with nonrelativistic conventions), α and β are usual 4 × 4 Dirac matrices, c is the speed of light ($=1/\alpha =137.035999\ldots \,{\rm{a.u.}}$), and ${\bf{p}}\equiv -{\rm{i}}{\boldsymbol{\nabla }}$ the electron momentum operator. For the finite nucleus approximation, either a uniform nuclear charge distribution, or a more realistic nuclear charge density given by a Fermi distribution function is used. In both cases the root mean square of the nuclear radius that enters in the definition of the nuclear potential changes from one isotope to another [65, 66].

Similarly to (4), the Dirac–Coulomb Hamiltonian can be approximated as

Equation (10)

each electron is then moving in a spherically symmetric central field potential ${V}_{i}^{C}(r)={V}_{{\rm{nuc}}}(r)+{u}_{i}(r)$. The many-electron problem becomes separable just as in the nonrelativistic case, and the many-electron wave function can be expressed as a simple product of one-electron solutions of the Dirac equation with the central field, usually written as

Equation (11)

where Pnlj(r) and Qnlj(r) are the radial functions and ${{\rm{\Omega }}}_{{lsjm}}(\theta ,\varphi )$ are two-component spherical spinors built from the coupling of the spherical harmonics ${Y}_{{{lm}}_{l}}(\theta ,\varphi )$ and the spin functions ${\chi }_{{m}_{s}}^{(1/2)}$

Equation (12)

with

Equation (13)

Note that the spherical spinor for the large component (Pnlj(r)) depends on l whereas for the small component (Qnlj(r)) it depends on a quantity denoted as $\tilde{l}$. The coupling in (12) requires $| l-1/2| \leqslant j\leqslant l+1/2$ and $m=-j,-j+1,..,\,j-1,j$. One can show that in the case of a field of spherical symmetry, the wave function is an eigenfunction of the parity operator introduced in (8). This in turn leads to the pair of two-component spinors in (11) having opposite parity, which implies that $\tilde{l}$ and l are related to each other [67]

Equation (14)

Introducing the quantum number κ as the eigenvalue of the operator $K=-1-{\boldsymbol{\sigma }}\cdot {\boldsymbol{l}}$ through

Equation (15)

allows us to rewrite the eigenfunctions (11) simply as

Equation (16)

where the spin-dependence is represented by the κ quantum number. The relationship between the spectroscopic notation and the angular momentum quantum numbers l, $\tilde{l}$, j and κ is shown in table 1. It should be noted that each state is uniquely specified by κ and that the wave function is a vector of length four. For this reason, Dirac theory is said to be a 4-component theory. The ${nljm}$ quantum numbers often are used instead of the equivalent $n\kappa m$.

Table 1.  Spectroscopic notation of relativistic shells.

  ${s}_{1/2}$ ${p}_{1/2}$ ${p}_{3/2}$ ${d}_{3/2}$ ${d}_{5/2}$ ${f}_{5/2}$ ${f}_{7/2}$ ${g}_{7/2}$ ${g}_{9/2}$
  s ${p}_{-}$ ${p}_{+}$ ${d}_{-}$ ${d}_{+}$ ${f}_{-}$ ${f}_{+}$ ${g}_{-}$ ${g}_{+}$
l 0 1 1 2 2 3 3 4 4
$\tilde{l}$ 1 0 2 1 3 2 4 3 5
j 1/2 1/2 3/2 3/2 5/2 5/2 7/2 7/2 9/2
κ −1 +1 −2 +2 −3 +3 −4 +4 −5
 

Since the spin-angular functions are linearly independent, we can separate out the radial parts of the one-electron functions to get

Equation (17)

where the zero of the energy scale has been shifted to correspond to the electron detachment limit, as in nonrelativistic theory. For the special case of N = 1 and ${V}^{C}(r)\,=-Z/r$, the bound state solutions where $-2(m){c}^{2}\leqslant E\leqslant 0$ are well known. The functions $({P}_{n\kappa }(r),{Q}_{n\kappa }(r))$ and the corresponding eigenenergies $E\equiv {E}_{n\kappa }$ depend on the n and κ quantum numbers (see for instance [64, 68]). The number of nodes in the large component ${P}_{n\kappa }(r)$ is $n-l-1$, as in the nonrelativistic case. The number of nodes in ${Q}_{n\kappa }(r)$ is $n-l-1$ for $\kappa \lt 0$ but nl for $\kappa \gt 0$ [66]. It is worthwhile to emphasize the fact that for a given n, the solutions for $\pm \kappa $ are degenerate (e.g., $2{s}_{1/2}$ and $2{p}_{1/2}$), which preserves the degeneracy in l as we observed for the nonrelativistic case. This, of course, is not true for a general central potential, but a special property of the Coulomb potential.

Solutions with energies $E\gt 0$ are the positive energy continuum states, but (17) also has solutions with $E\lt -2{{mc}}^{2}$ known as negative energy states that constitute what is called the 'negative energy sea'.

For the relativistic description of the many-electron system, the Dirac–Coulomb Hamiltonian (9) is only the first approximation and is not complete. To account for the corrections from the so called transverse photon interaction, we can use an approximation of order ${\alpha }^{2}$:

Equation (18)

to represent the magnetic interactions and the retardation effects [6971] where ${\boldsymbol{\nabla }}$ is the gradient operator involving differentiation with respect to ${{\bf{r}}}_{{ij}}={{\bf{r}}}_{i}-{{\bf{r}}}_{j}$ and ${r}_{{ij}}=| {{\bf{r}}}_{{ij}}| $. In this expression, given in the Coulomb gauge, ${\omega }_{{ij}}$ represents the energy of the virtual exchanged photon between two electrons introduced in QED, even in the absence of the emission or absorption of 'real' radiation. The value of ${\omega }_{{ij}}$ can be interpreted in terms of differences in orbital one-electron energies.

In the low photon energy limit (or the long wavelength approximation), when ${\omega }_{{ij}}\to 0$, the expression (18) reduces to the Breit interaction [64, 66]

Equation (19)

Adding (18) to the Dirac–Coulomb Hamiltonian (9) gives the Dirac–Coulomb–Breit Hamiltonian (in the effective Coulomb gauge)

Equation (20)

Further QED corrections to the Dirac–Coulomb–Breit Hamiltonian are not expressed as operators. The most important correction is the self-energy correction, which arises from the interaction of the electron with its own radiation field. For hydrogenic systems the electron self-energy can be expressed as

Equation (21)

where $F({nlj},Z\alpha )$ is a slowly varying function of $Z\alpha $. The latter function has been derived by Mohr and co-workers [7274]. There have been no generalizations of the self-energy calculations to arbitrary N-electron atomic systems. Instead the total self-energy correction is given as a sum of one-electron corrections weighted by the fractional occupation number of the one-electron orbital in the wave function. Each one-electron contribution is expressed in terms of the tabulated hydrogenic values either by relying on a screened nuclear charge or by a scaling factor obtained from the Welton picture [75]. The most recent developments include also a non-local QED operator, which can be incorporated in the Dirac–Coulomb–Breit eigenvalue problem [76, 77] but has not been implemented in GRASP2K to date.

The other important QED correction is the vacuum polarization correction, which is related to the creation and annihilation of virtual electron–positron pairs in the field of the nucleus. The vacuum polarization can be described by a correction to the Coulomb potential. For a nuclear charge distribution $\rho (r)$ the correction to the nuclear potential, referred to as the Uehling potential [78], is given by

Equation (22)

where

Equation (23)

Terms of higher order can also be evaluated as the expectation value of potentials. Numerical evaluation of the expectation values relies on analytical approximations of the K0 function by Fullerton and Rinker [79]. The above QED terms are included in the GRASP2K package [56], as originally implemented in [80], to yield the final Hamiltonian

Equation (24)

The GRASP2K code also includes terms arising from the lowest-order nuclear motional corrections [81], namely the normal mass shift (NMS) term based on the Dirac kinetic energy operator

Equation (25)

where M is the nuclear mass in atomic units (me), and the specific mass shift (SMS) term

Equation (26)

Higher-order corrections have been derived by Shabaev [82, 83] and independently by Palmer [84], giving rise to the following total nuclear recoil operator

Equation (27)

In this formulation, the NMS and SMS Hamiltonians are defined as the one- and two-body parts of (27). Treating the one- and two-body parts together, the operator (27) now includes a factor $(1/2)$ and the summation is over all pairs of indices i and j.

2.3. The BP Hamiltonian

Dirac theory requires both large, P(r), and small, Q(r), radial components for the one-electron wave function. In the nonrelativistic limit ($c\to \infty $) known as the Pauli approximation [66], the small component can be estimated from the large one [64], as

Equation (28)

and the wave function again depends on only one function. The traditional method, however, is to modify the Hamiltonian. In the BP approximation, relativistic effects are accounted for by modifying the nonrelativistic Hamiltonian (3) to include additional terms of order ${\alpha }^{2}$ as an approximation of the Dirac–Coulomb–Breit operator and using nonrelativistic radial functions [85, 86]. This BP Hamiltonian is often expressed as a sum over operators $\{{{ \mathcal H }}_{i}$, $i=0,\ldots ,5\}$ introduced by Bethe and Salpeter [69], but it is also informative to separate the components according to their effect on the atomic energy spectrum as suggested by Glass and Hibbert [87], namely

Equation (29)

where ${{ \mathcal H }}_{{\rm{NR}}}$ is the nonrelativistic many-electron Hamiltonian. The relativistic shift operator, ${{ \mathcal H }}_{{\rm{RS}}}$, commutes with ${\bf{L}}$ and ${\bf{S}}$ and can be written as

Equation (30)

The mass correction term, ${{ \mathcal H }}_{{\rm{MC}}}$, represents a correction to the kinetic energy:

Equation (31)

The next two interactions describe the one- and two-body Darwin terms ${{ \mathcal H }}_{{\rm{D}}1}$ and ${{ \mathcal H }}_{{\rm{D}}2}$, which are relativistic corrections to the nucleus–electron and electron–electron interactions, respectively. They are given by:

Equation (32)

The term ${{ \mathcal H }}_{{\rm{SSC}}}$ represents the Fermi-contact-type electron interaction contributing to the spin–spin interaction [88] and is therefore called the spin–spin-contact contribution [86]. It has the form

Equation (33)

Finally ${{ \mathcal H }}_{{\rm{OO}}}$ is the orbit–orbit term, which represents the magnetic interaction between the magnetic moments of electron orbits

Equation (34)

The fine-structure Hamiltonian ${{ \mathcal H }}_{{\rm{FS}}}$ describes magnetic interactions between the spin and orbital angular momenta of the electrons, and does not commute with ${\bf{L}}$ and ${\bf{S}}$, but only with the total angular momentum ${\bf{J}}={\bf{L}}+{\bf{S}}$. It consists of three terms

Equation (35)

that induce the term splitting (fine structure). The largest contribution is, in most cases, the spin–orbit interaction ${{ \mathcal H }}_{{\rm{SO}}}$ representing the interaction of the spin and angular magnetic momenta of an electron in the field of the nucleus

Equation (36)

The spin–other-orbit ${{ \mathcal H }}_{{\rm{SOO}}}$ and spin–spin ${{ \mathcal H }}_{{\rm{SS}}}$ contributions are interactions between magnetic moments related to the spin and orbital motion of different electrons

Equation (37)

Equation (38)

These spin-dependent operators usually produce a good representation of the fine structure splitting of terms in light to medium-sized atoms and ions. The spin–orbit interaction is the dominant term and behaves like Z4. The spin–other orbit reduces the size of the calculated fine structure splitting and scales as Z3. The spin–spin interaction obeys the same Z3 scaling law but, is two or three orders of magnitude smaller for most atomic systems [89].

3. The CSF

When the electron–electron interactions can be approximated by a central field, as in equations (4) or (10), the equations for the many-electron system in free space become separable and the solutions are simply products of the one-electron orbitals (6) or (11). Each one-electron orbital, or spin-orbital, is defined by a set of four quantum numbers, ${{nlm}}_{l}{m}_{s}$ in the nonrelativistic case, and nljm or $n\kappa m$ in the relativistic case. Since the orbital energies do not depend on the magnetic quantum numbers ${m}_{l},{m}_{s}$ or m, there are many degeneracies and any linear combination of products with the same total energy is also a solution. Not all solutions are physical since electrons are indistinguishable fermions (the absolute square of the wave function will be independent of a coordinate exchange of two particles) and the wave function should be antisymmetric under coordinate exchange of two particles. This forces us to represent the wave function of a many-electron atom in terms of Slater determinants that identically vanish if two spin–orbitals have the same values of the four quantum numbers. Thus for allowed atomic states no two spin–orbitals can have the same values of the four quantum numbers. This is the exclusion principle originally discovered by Pauli in 1925 [90] and leads to the shell structure of an atom.

For a many-electron system in free space with no preferred direction, the nonrelativistic Hamiltonian commutes with both the total orbital and spin angular momentum operators ${\bf{L}}$ and ${\bf{S}}$, and therefore also ${{\bf{L}}}^{2}$, ${{\bf{S}}}^{2}$, ${L}_{z}$ and ${S}_{z}$, so that the exact solution to the wave equation Ψ can be chosen as an eigensolution of these operators with quantum numbers ${{LSM}}_{L}{M}_{S}$. This approximation often yields results for low ionization stages and light ions that are in good agreement with observation. This has led to the so called LS-approximation, very important in atomic physics, not the least for historical reasons. However, for getting 'spectroscopic accuracy', the L- and S-symmetry ultimately needs to be broken in order to take relativity into account, making the corresponding quantum numbers LS 'good' but not 'perfect' anymore. On the other hand, the Dirac–Coulomb–Breit Hamiltonian and the BP Hamiltonian commute with ${\bf{J}}={\bf{L}}+{\bf{S}}$, and therefore with the ${{\bf{J}}}^{2}$ and Jz operators. The corresponding quantum numbers ${JM}$ are perfect quantum numbers, useful for representing the eigensolutions in relativistic cases for which symmetry-breaking due to the hyperfine interaction or external fields can be neglected.

Because the quantum numbers are different for the nonrelativistic and relativistic cases, it will be clearer at this point to distinguish between the two cases.

3.1. Nonrelativistic CSFs and their construction

In the nonrelativistic framework the Hamiltonian commutes with total angular and total spin operators. As a result, any physical solution corresponds to a symmetry-adapted linear combination of Slater determinants that is also an eigenfunction of these operators. This requirement splits the solutions into a number of LS terms of given parity and each such solution defines a CSF with total quantum numbers ${{LSM}}_{L}{M}_{S}$. The construction of these eigenstates using the relevant angular and spin operators is described in the next section. The associated nl quantum numbers define a subshell, its occupation number w representing the number of electrons with the given nl quantum numbers.

For an N-electron atom or ion, a general configuration consists of m groups of equivalent electrons, namely

Equation (39)

where wi is the occupation number of subshell i.

3.1.1. A single subshell

In the case of a configuration with only a single subshell, ${({nl})}^{w}$, we introduce the antisymmetric CSF, ${\rm{\Phi }}({({nl})}^{w}\alpha \nu {{LSM}}_{L}{M}_{S})$, where the additional numbers α and ν uniquely specify the considered state when there is more than one term with the same LS value. The seniority number ν, which will be discussed below, is needed for $l\geqslant 2$ subshells while an additional number α is introduced for shells with orbital angular momenta $l\geqslant 3,$ i.e. for electrons from the ${f}_{}\,\nobreakhyphen $, ${g}_{}\nobreakhyphen $, ... shells to get the one-to-one classification of the energy levels [92].

Such a CSF can be built by using a recursive method in which the CSF for a state with w electrons is defined as a sum of products of antisymmetric CSFs for states with $w-1$ electrons (the parent states) coupled to a single electron nl state [93]. This process can be expressed in terms of coefficients of fractional parentage (CFPs) $({l}^{w-1}\overline{\alpha \nu }\overline{\,LS}| \}{l}^{w}\alpha \nu {LS})$, the parent states ${\rm{\Phi }}({({nl})}^{w-1}\overline{\alpha \nu }\,{\overline{LSM}}_{L}\,{\overline{M}}_{S})$, and an ${{nlm}}_{l}{m}_{s}$ state for a single electron, namely

Equation (40)

The recurrence continues until w = 1, which is a trivial case since a single electron has no antisymmetry requirement.

The orbital and spin couplings, $\overline{{\bf{L}}}+{\boldsymbol{\ell }}={\bf{L}}$ and $\overline{{\bf{S}}}+{\bf{1}}/{\bf{2}}={\bf{S}}$, involved in (40) require the use of the vector-coupling expansion for ${{\bf{j}}}_{1}+{{\bf{j}}}_{2}={\bf{J}}$, namely

Equation (41)

where the coefficients $\langle {j}_{1}{j}_{2}{m}_{1}{m}_{2}| {j}_{1}{j}_{2}{JM}\rangle $ are the well-known Clebsch–Gordan coefficients [94, 95]. The CFPs are defined to ensure that the wave function is antisymmetric and thereby satisfies the Pauli exclusion principle. It is clear that they play a fundamental role in the theory of many-electron atoms. The final CSF for a single subshell separates into a radial and a spin-angular factor, according to

Equation (42)

where we have used the notation $| {l}^{w}\,\alpha \nu {{LSM}}_{L}{M}_{S}\rangle $ to denote a spin-angular function involving only angular coordinates.

3.1.2. Seniority and quasispin

Racah introduced the seniority quantum number in a formal way [93, 96, 97]. It can be thought of as a classification of terms, with ν equal to the number of electrons in the subshell when it first occurs, say w0. The next time this LS-term can occur is for $w={w}_{0}+2$. If in this case there are two terms with the same LS-symmetry, we choose the seniority $\nu ={w}_{0}$ for the one formed by the coupling of ${\rm{\Phi }}({({nl})}^{2}{}^{1}S({M}_{L}=0)({M}_{S}=0))$ to the previous occurrence, as in

Equation (43)

The second term with the same LS-symmetry, will be defined by seniority $\nu ={w}_{0}+2$ and coupled to be orthogonal to the first.

However, this has a group-theoretical interpretation based on the theory of quasispin [98101]. Briefly, if we define the quasispin quantum number Q as:

Equation (44)

then it is possible to show that the corresponding operator will have the transformation and commutation properties of the spin momentum. The quantum number representing its projection will be

Equation (45)

and it shows the range of the number of electrons in the shell for a given l, in which the term LS, characterized by the quantum number ν, exists.

In the single subshell CSF notation, accounting for the quasispin Q and its projection MQ, the CSF can then be written as [102]:

Equation (46)

3.1.3. Several subshells

To construct a specific CSF associated with the configuration introduced in equation (39) for multiple subshells, one starts with the products of the antisymmetric eigenfunctions for the different groups of equivalent electrons, namely

Equation (47)

where ${{ \mathcal Q }}_{1}$ represents the w1 coordinates $\{{{\bf{q}}}_{1},\ldots ,{{\bf{q}}}_{{w}_{1}}\}$, ${{ \mathcal Q }}_{2}$ the w2 coordinates $\{{{\bf{q}}}_{{w}_{1}+1},\ldots ,{{\bf{q}}}_{{w}_{1}+{w}_{2}}\}$, etc, up to the final set ${{ \mathcal Q }}_{m}$ $\{{{\bf{q}}}_{N-{w}_{m}+1},\ldots ,{{\bf{q}}}_{N}\}$ of the last $m{\rm{t}}{\rm{h}}$ shell. With the repeated use of the vector-coupling expansion (41), we can couple the product functions to the final total angular momenta ${{LSM}}_{L}{M}_{S}$ according to some specified coupling scheme. In this review as well as in ATSP2K and GRASP2K, the coupling applies from left-to-right and downwards, as shown graphically in figure 1 for LS-coupling. The orbital and spin angular momenta of the first two subshells are coupled to yield a resultant state ${L}_{12}{S}_{12}$. Then successively, until all subshells have been coupled, the next subshell is coupled to a resultant to form a new state. Each subshell-coupling uses the angular momenta coupling expansion (41) twice, first in the orbital space (${{\bf{L}}}_{12\ldots (k-1)}+{{\bf{L}}}_{k}={{\bf{L}}}_{12\ldots k}$), and then in the spin space (${{\bf{S}}}_{12\ldots (k-1)}+{{\bf{S}}}_{k}={{\bf{S}}}_{12\ldots k}$).

Figure 1.

Figure 1. Coupling of subshells for a CSF in LS-coupling.

Standard image High-resolution image

This procedure leads to a function, denoted by ${{\rm{\Phi }}}_{\wp }{(\gamma {{LSM}}_{L}{M}_{S})}^{u}$, which is antisymmetric with respect to co-ordinate permutations within each subshell, but not antisymmetric with respect to permutations between different subshells [103]. The additional antisymmetrization can, however, be accomplished through the restricted permutations

Equation (48)

where the sum is over all permutations involving coordinate exchange only between two different subshells such that the coordinate number within each subshell remains in an increasing order. The antisymmetrizing permutations of electron coordinates between different subshells appreciably complicates the appearance of basis functions (48); however, these complications largely disappear in the evaluation of matrix elements of symmetric operators [94, 104]. So, all this leads to the most general form of a CSF

Equation (49)

that in the quasispin representation (46), becomes

Equation (50)

In this notation γ represents the configuration and all the intermediate quantum numbers that define the CSF. The configuration determines the parity $P={(-1)}^{{\sum }_{i}^{N}{l}_{i}}$ of the CSF.

3.2. Relativistic CSFs

3.2.1. Single subshell CSF case

In relativistic atomic theory, there are two possible ${nlj}=n\kappa $, orbitals for each nl nonrelativistic one when $l\ne 0$. Instead of the CSF $| {({nl})}^{w}\,\alpha \nu {LSJ}\rangle $, we then have to deal with a number of CSFs $| {({{nlj}}_{1})}^{{w}_{1}}{\alpha }_{1}{\nu }_{1}{J}_{1}\,{({{nlj}}_{2})}^{{w}_{2}}{\alpha }_{2}{\nu }_{2}{J}_{2}\,J\rangle ,$ with the restrictions that

Equation (51)

For subshells with angular momenta $j=\tfrac{1}{2},\tfrac{3}{2},\tfrac{5}{2},$ and $\tfrac{7}{2}$ corresponding to ${s}^{w},$ ${p}^{w},$ ${d}^{w},$ and fw shells as well as $j=l-\tfrac{1}{2}$ of gw shell, the seniority ν and J are sufficient to classify the relevant states. α becomes relevant when $j\geqslant 9/2$ to avoid any ambiguity.

It is interesting to compare the different couplings for 3d4 (J = 2). For one nonrelativistic configuration 3d4 spanning the eight (J = 2, even parity) CSFs, there are four relativistic configurations $3{d}_{-}^{3}3{d}_{+}$, $3{d}_{-}^{2}3{d}_{+}^{2}$, $3{d}_{-}3{d}_{+}^{3}$ and $3{d}_{+}^{4}$.

A closed subshell is now defined by the quantum numbers nlj and, when $l\ne 0$, there will be two such subshells for each nonrelativistic one. A closed subshell contains $2j+1$ electrons. A separation of an electron configuration ${({nl})}^{w}$ into (jj-coupled) subshells is unique only for closed shells and for shells with a single vacancy. In general, several jj-coupled configurations with different distributions of the electrons can be found for each single nonrelativistic configuration.

Relativistic CSFs for subshells of equivalent electrons are formed as a vector-coupled product of one-electron states, as in the nonrelativistic case, except that the fractional parentage coefficients guaranteeing the exchange antisymmetry involve the JM quantum numbers. They do not factor simply into a radial and spin-angular part, as in the nonrelativistic case (42).

Similar to the nonrelativistic case, the quasispin quantum number Q of a relativistic subshell $| {({nlj})}^{w}\,\alpha \nu {{JM}}_{J}\rangle $ is simply related to the seniority quantum number ν by

Equation (52)

while MQ, the eigenvalue of Qz, depends on the occupation number w, namely

Equation (53)

The wave function of a subshell of w equivalent electrons and total angular momentum J can then be written in both the seniority and quasispin representations:

Equation (54)

3.2.2. Multiple subshells

The CSF for the vector-coupled shells are derived in a similar manner as in the nonrelativistic case (see equations (49) and (50)), except that the subshell Ji angular momenta are the only ones that need to be coupled. For instance, in the seniority representation, a general CSF takes the following form

Equation (55)

where γ represents the electron configuration in jj-coupling and all additional quantum numbers needed to completely specify the state.

3.3. Variational methods for wave functions as a single CSF

Given a set or orthonormal radial functions, the set of CSFs with the same parity and LS or J quantum numbers defined by these radial functions form a basis for a function space of approximate wave functions, or ASFs, denoted as Ψ. A very special case is the one where the wave function is expressed as a single CSF.

In our discussion so far, we have shown how the spin-angular factor of a CSF can be constructed assuming the radial functions were from a general central-field approximation. The question then arises as to which radial functions yield the 'best' approximate wave functions. Variational methods [105, 106] that optimize the total energy, result in equations for the radial functions known as Hartree–Fock (HF) equations in nonrelativistic theory and Dirac–Hartree–Fock (DHF) in relativistic theory.

For a normalized wave function Ψ the total energy is the expectation value of the Hamiltonian (3), namely

Equation (56)

When the definition of Ψ includes functions or constants that can be varied, the 'best' wave function ${{\rm{\Psi }}}^{{\rm{best}}}$ is the function for which $\delta E=0$ for all allowed perturbations $\delta {\rm{\Psi }}$, orthogonal to Ψ and the boundary conditions, namely

Equation (57)

When Ψ is assumed to be a single CSF, only the radial functions can be varied. Thus an expression for the energy in terms of radial functions is useful and is referred to as an energy functional. The orthogonality constraints must also be included in the variational process. As a result, the allowed perturbations, viewed as perturbations of the radial functions, are of two types—those that involve only a single radial function, and those that require that two radial functions be perturbed simultaneously in order to maintain orthonormality. Perturbations of more than two orbitals of the same symmetry can be expressed as a sequence of perturbations two at a time.

3.3.1. The Hartree–Fock equations

Let $a,b,c,\ldots $ represent one-electron radial functions for an orthonormal set of spin–orbitals (6) with associated quantum numbers ${n}_{a}{l}_{a},{n}_{b}{l}_{b},{n}_{c}{l}_{c},\ldots $. and $| \gamma {LS}\rangle $ the CSF for a configuration γ with LS quantum numbers. Since energies are independent of the ML and MS quantum numbers, these quantum numbers will be suppressed in the notation for the CSFs.

The HF equations are derived by applying the variational principle [53] to an expression for the total energy of the CSF based on the nonrelativistic Hamiltonian (3). It can be shown that

Equation (58)

where, in general

Equation (59)

By using the expansion in terms of Legendre polynomials

Equation (60)

where ${r}_{\lt }=\min ({r}_{1},{r}_{2})$ and ${r}_{\gt }=\max ({r}_{1},{r}_{2})$, the contribution from the two-electron operator becomes

Equation (61)

where the sum is over pairs of orbitals, possibly from the same subshell. Here ${F}^{k}({ab})={R}^{k}({ab},{ab})$ and ${G}^{k}({ab})\,={R}^{k}({ab},{ba})$ are special cases of the more general Slater integral

Equation (62)

This integral is symmetric with regard to coordinate exchange as well as left/right exchange. The ${F}^{k}({ab})$ integrals are referred to as 'direct' integrals in that the same orbitals are selected for the left/right pair whereas ${G}^{k}({ab})$ integrals are exchange integrals because they arise from the anti-symmetrizing exchange operator. Though defined as a double integral, Hartree [104] showed they could be evaluated efficiently through a pair of one-dimensional integrals:

Equation (63)

where ${r}_{\lt }$ (${r}_{\gt }$) denotes the smaller (larger) of r and s so that

Equation (64)

The spin-angular coefficients $\{{w}_{a}\}$, $\{{f}_{{abk}}\},\{{g}_{{abk}}\}$ can be determined using the Slater–Condon rules for the Slater determinant algebra [107], or the Fano approach in the Racah–Wigner algebra [103]. In the last decade, a more efficient and general approach has been developed by Gaigalas et al [108], combining second quantization in the coupled tensorial form, angular momentum theory in the orbital, spin and quasispin spaces, and graphical techniques. The relative simplicity of the energy expression (59) and (61) results from the orthonormality assumption for spin–orbitals (6)

Equation (65)

Due to the orthonormality property of the spherical harmonics and spin functions, this reduces to the radial orthonormality condition within each l-subspace

Equation (66)

The energy expression, along with Lagrange multipliers λ for orthonormality constraints (66) define the HF energy functional,

Equation (67)

The first type of perturbation for which the functional must be stationary is ${P}_{a}\to {P}_{a}+\delta {P}_{a}$, where $\delta {P}_{a}$ satisfies all boundary conditions and is orthogonal to all the occupied orbitals with the same symmetry. The perturbation for each term in the energy expression, when summed (see [58, 109]), is a function of the form $2\delta {P}_{a}(r)K(a;r)$ so that the stationary condition becomes

Equation (68)

This condition can only be satisfied if

Equation (69)

Applying the stationary condition for the variation of each orbital a, results in a system of m coupled equations where m is the number of subshells. For a CSF (like 1s22s) with only two orbitals $a,b$ with ${nl},n^{\prime} l$ quantum numbers, subject to orthogonality the two equations have the form [58]

Equation (70)

where Ha, for example, is the integro-differential operator

Equation (71)

Contributions to the direct potential $Y(a;r)$ arise from the ${F}^{k}({ab})$ integrals in the energy functional whereas contributions to the exchange potential $\bar{X}(a;r)$ arise from the ${G}^{k}({ab})$ terms. For the radial function Pa(r), the latter integrals have the form $({Y}^{k}({ab};r)/r){P}_{b}(r)$. In other words, the function Pa(r) is part of an integrand, making the equation an integro-differential equation of eigenvalue type when ${\varepsilon }_{{ab}}=0$, in which case

Equation (72)

In these equations, the matrix (εab) is called the energy matrix [53] which in our definition is the same as the matrix of Lagrange multipliers. It has been customary to write differential equations so that the coefficient of the highest derivative is unity, which requires dividing each equation by $-{w}_{a}/2$. The latter has the consequence that the (εab) matrix is no longer symmetric when the occupation numbers differ, even though ${\lambda }_{{ab}}={\lambda }_{{ba}}$. When this convention is not followed and the epsilon matrix $({\varepsilon }_{{ab}})$ is symmetric, it follows that

Equation (73)

The second type of perturbation relates to the 'rotation' of orbitals in orbital space that in two-dimensional space can be defined in terms of a single parameter $\epsilon \in [-1,1]$ as in

Equation (74)

where $1/\sqrt{1+{\epsilon }^{2}}=\cos (\theta )$ and θ represents the angle of rotation. The radial transformation

Equation (75)

allows the effect of a rotation on the energy to be expanded in powers of epsilon, namely

where g represents the gradient of the energy with respect to rotation and E(0) is the energy before orbitals are rotated. Then the stationary condition

Equation (76)

leads to $\epsilon =-g/(2g^{\prime} )$. When this condition is satisfied, ${\varepsilon }_{{ab}}={\varepsilon }_{{ba}}$. Rules for determining g and $g^{\prime} $ from the energy expression are given in [109].

In the simple case of the HF equation for $1s2s{}^{1}S$ where

the condition for a stationary solution is

Equation (77)

Equation (76) not only determines g (the amount by which the stationary condition is not satisfied) but also how much the radial functions used for evaluating the expression, need to be rotated for a stationary solution. When more than two radial functions are connected through orthogonality, the energy should be stationary for all rotations, a condition that will be satisfied to first-order if it is stationary for the rotation of all pairs of radial functions.

It should be pointed out that the off-diagonal energy parameters prevent the HF-equations from being integro-differential equations of eigenvalue type. In contrast, when B-spline methods are used, expressions can be derived for the off-diagonal parameters which, when substituted into the equations, result in a generalized eigenvalue problem for each radial function [110].

Several properties of the HF solutions follow from these considerations.

3.3.2. Koopmans' theorem

The diagonal energy parameter ${\varepsilon }_{{nl},{nl}}\equiv {\varepsilon }_{{aa}}$ (see (73)) for a singly occupied shell can easily be shown to be directly related to the binding energy of the nl electron, namely the difference in energy on the N-electron system and the energy of the $N-1$ electron system in which the nl electron has been removed, using the same set of radial functions for both the N and $N-1$ electron systems [53, 91]. In general

Equation (78)

where $\bar{E}({({nl})}^{w})$ is the energy of the atomic system when the ${({nl})}^{w}$ subshell has been removed and the remaining term is a correction relating to the self-interaction within the subshell when $w\gt 1$. This is the usual Koopmans' theorem [111, 112] that has been used successfully for estimating many ionization energies.

The HF equations may not always have unique solutions. Consider the case of $1{s}^{2}2{s}^{2}$ where the CSF can be expressed as a single Slater determinant. A unitary transformation (or rotation of the orbitals) changes the radial functions, but leaves the wave function and the total energy invariant. Thus there are an infinite number of solutions to the HF equations. Koopmans also defined a unique solution for this case as the extreme values of the symmetric energy matrix $({\varepsilon }_{{nl},n^{\prime} l})$. For these extreme values, the $1s$ orbital is the most bound orbital in the set of possible solutions, the $2s$ the orbital least bound, and the off-diagonal Lagrange multiplier is zero [53]. Thus, in HF calculations, it is customary to omit the rotations of orbitals of two filled subshells and their Lagrange multipliers, thereby implicitly setting the Lagrange multipliers to zero. But filled subshells are not the only case where the wave function remains invariant under rotation. Another well-known example is $1s2s{}^{3}S$ [53]. Non-unique cases can be detected through rotation analysis in that, for such cases, $g=g^{\prime} =0$ and the Lagrange multiplier can be set to zero [113].

3.3.3. Brillouin's theorem

The requirement that the 'best' solution satisfy the stationary condition of equation (57) can be generalized to matrix elements of the Hamiltonian between CSFs or linear combinations of CSF. At this point it is important to keep in mind the nature of the perturbations of radial functions. In the HF approximation, the 'best' wave function is the HF solution

When a single orbital is perturbed, the perturbations can be expressed in terms of a complete basis ${P}_{n^{\prime} l}(r)$, orthogonal to the occupied orbitals. Then the stationary condition for all $\delta {P}_{{nl}}$ will be satisfied if it is satisfied for every $\delta {P}_{{nl}}=\epsilon {P}_{n^{\prime} l}$. This perturbation of the radial function, denoted by ${nl}\to n^{\prime} l$ results in a perturbation of the CSF, ${\rm{\Phi }}{(\gamma {LS})}_{{nl}\to n^{\prime} l}$. If we recall the construction of the CSF, and equation (42), it is clear that none of the spin-angular factors are affected. Thus, in 1s22s, for example only the CSF for the subshell containing orbital a (or nl) will be affected. Thus the perturbation of the HF $2s$-radial function, ${P}_{2s}\to {P}_{2s}+\epsilon {P}_{{ns}}$, leads to

Equation (79)

so that, to first order in epsilon,

Equation (80)

Since the HF solution is stationary for this perturbation, it follows that

Equation (81)

In this case, adding the ${\rm{\Phi }}(1{s}^{2}{ns}{}^{2}S)$ to the HF wave function as a correction, would not further lower the energy and it is convenient to think of the HF wave function as already having included these CSFs.

The situation changes when orbitals are multiply occupied and the structure of ${\tilde{{\rm{\Phi }}}}_{{nl}\to n^{\prime} l}$ satisfying

Equation (82)

in the general case, is more complex [114]. Consider 2p3 ${}^{2}{P}^{o}$. We must first uncouple an orbital using equation (40) in order to have a single $2p$ coupled to an expansion over the parent 2p2 LS terms where this expansion is determined by the CFPs. Expressing the perturbed wave function in terms of CSFs, the stationary condition requires that the Brillouin matrix element be zero, or

Equation (83)

In the present case, this is a matrix element between 2p3 ${}^{2}{P}^{o}$ and a particular linear combination of CSFs ${\rm{\Phi }}(2{p}^{2}(L^{\prime} S^{\prime} ){np}{}^{2}{P}^{o})$, namely

Equation (84)

where the weights are the associated CFP (40). Thus the HF solution has included a particular combination of 2p2np CSFs but not each CSF exactly: adding the three CSFs separately, each with their own expansion coefficient, would lower the energy of the HF wave function.

When two orbitals $a,b$ are subject to an orthogonality condition, the perturbation from a rotation must also have a zero interaction with the HF wave function [115, 116]. This perturbation comes from a pair of substitutions, namely $a\to b,b\to -a$. An excellent example is the excited state $1s2s{}^{1}S$. A rotational perturbation produces a state proportional to $\{2{s}^{2}-1{s}^{2}\}/\sqrt{2}{}^{1}S$. The stationary condition requires that the HF solution be such that

This condition on the solution is difficult to satisfy without the use of rotational transformations. In general, when two open shells of the same symmetry are present, Brillouins theorem states that HF solutions have the property that the interaction between the HF solution and a specific linear combination of CSFs will be zero [117], implying that some average interaction between CSFs has been included in the approximation. In fact, the hydrogenic $1s2s{}^{1}S$ state and the perturbed linear combination of CSFs are degenerate in Z-dependent perturbation theory so that the mixing of these two CSFs would be included already in the zero-order wave function (see section 5.2).

Brillouin's theorem states, in effect, that $\langle {{\rm{\Phi }}}^{{\rm{HF}}}| { \mathcal H }| \tilde{{\rm{\Phi }}}\rangle =0$ for a class of functions that can be related to the allowed perturbations for which the energy is stationary. The 'annihilation' of Brillouin's matrix elements for fully variational solutions of the HF problem constitutes a useful property. It has been intensively used for testing the extension of the HF code to the fN shell for general occupation numbers [92]. It is worthwhile to note that in the checking process, 'accidental' zeros characterizing the HF solution of Lanthanides in their ground state and appearing in $4f\to {nf}$ Brillouin's matrix elements were discovered and remain unexplained, even after exploring the use of an isospin basis [118, 119].

3.3.4. Solution of the HF equations

With these theorems in mind, given an initial estimate for all the occupied radial functions, solutions to the HF equations of equation (71) can be obtained by an iterative process referred to as the self-consistent field (SCF) method, namely:

Very efficient methods solve the differential equations by using finite differences based on a discrete representation of the radial functions on a logarithmic mesh. Instead of treating the equation as an integro-differential equation, the exchange contribution of equation (72), along with any off-diagonal energy parameters, are treated as a non-homogeneous term and the differential equation solved as a boundary-value problem. Details can be found in [53, 113]. Essentially, in every iteration, the method improves the radial function. This is done by matching the solutions from outward integration and inward integration. Since the differential equations for excited states may be the same as for a lower state, the adjustment process needs to take into account the desired eigenstate. Node counting is used in the numerical HF program, taking into account the possibility that the rotation of orbitals may have introduced additional nodes that need to be ignored, thereby making node counting somewhat of an art. But the SCF process does not guarantee convergence. A well-known example which starts with large oscillations is F 2p5 ${}^{2}{P}^{o}$: if the $2p$ estimated orbital is too contracted, the screening of the nucleus will be too large, and the next estimate will be too extended. 'Accelerating' parameters may be introduced that actually dampen the rate of change thereby damping the oscillations in the change of the orbitals and speeding convergence [58].

The accuracy of the solution of the HF equation can be assessed through the virial theorem [59] which states that the ratio of the potential energy relative the kinetic energy is exactly −2.0.

3.3.5. DHF equations

The DHF equations are similar to the nonrelativistic equations for a single CSF except for some differences in the details. By definition, HF and DHF are methods applied to a single CSF either in LS or jj-coupling. In many cases, the two are equivalent but in others there is a difference. For example, the 2p4 1D case in nonrelativistic theory becomes $0.8258\,(2{p}_{-}2{p}_{+}^{3})+0.5648\,(2{p}_{-}^{2}2{p}_{+}^{2})$ in jj-coupling and Dirac theory. Therefore the equivalent of the HF wave function is no longer a single CSF and needs to be treated as part of a multiconfiguration approximation discussed in the next section.

The relativistic extension of the HF approach to the DHF approach is to apply the variational principle to the energy functional

Equation (85)

where $| \gamma J\rangle $ is a single CSF (54), and ${{ \mathcal H }}_{{\rm{DC}}}$ is the Dirac–Coulomb Hamiltonian (9). Lagrange multipliers ${\lambda }_{{ab}}$ for orbitals a and b belonging to the same κ-space $({\kappa }_{a}={\kappa }_{b})$, are introduced in (85) for each radial orthonormality constraint, namely

Equation (86)

The matrix element for the total energy for the Dirac–Coulomb Hamiltonian (9) can be expressed in terms of spin-angular coefficients and radial integrals

Equation (87)

The one-body interaction gives rise to the spin-angular coefficients that reduce to occupation numbers wa and to the $I(a,a)$ integrals where (in the general case)

Equation (88)

The two-body interaction gives rise to the spin-angular coefficients ${f}_{{abk}},{g}_{{abk}}$ and to the ${F}^{k}({ab})={R}^{k}({ab},{ab})$ and ${G}^{k}({ab})={R}^{k}({ab})$ integrals. The latter are special cases of the relativistic Slater integrals

Equation (89)

The relativistic DHF Yk-functions are defined by

Equation (90)

The spin-angular coefficients appearing in (87) can be evaluated using algebraic expressions for matrix elements adapted for spin-angular integrations in jj coupling, involving the calculation of reduced CFP and completely reduced matrix elements of double tensors [120, 121].

From this expression it is possible to derive the DHF equations from the usual variational argument [64] as an integro-differential problem

Equation (91)

where $V(a;r)=({V}_{{\rm{nuc}}}(r)+Y(a;r)+\bar{X}(a;r))$. In this expression, Vnuc(r) is the effective electron–nucleus potential at radius r taking into account the finite size of the nuclear charge distribution through a uniform or a Fermi distribution of the charge, $Y(a;r)$ is the direct potential, and $\bar{X}(a;r)$ contains the exchange contributions in integro-differential form as described in the HF method.

Koopmans' and Brillouin's theorems apply to the DHF solution as well. Though the relativistic CSF for a shell of equivalent electrons does not factor simply into a radial and spin-angular factor, (82) for multiply occupied subshell still holds. What differs is the perturbation, namely

Equation (92)

A difference, at least conceptually, is that the perturbation may now be either a positive energy function (bound or continuum, not necessarily a state) or a negative energy function that satisfies boundary conditions and orthogonality as indicated by the n* notation. Orbital rotations for stationary conditions may also occur. When compared with perturbation theory methods in the 'no-pair' approximation that exclude contributions from a negative energy sea, this may account for differences in results. GRASP2K calculations, to date, have not found it necessary to constrain the calculation in any way. Also different are the CFP. The equivalent expression for (84) is

Equation (93)

4. The multiconfiguration wave functions

The single CSF approach described in the previous section is based on an independent particle model, where the electrons are assumed to move in an average, central field of the other electrons and the nucleus. In this approach we do not start by defining a detailed form for these potentials, just the fact that they define the form of our wave functions, as linear combination of products of spin–orbitals (6) or (11). We then develop the HF and DHF method by assuming this form of the orbitals. To take into account corrections to the independent particle model is, by definition, to include electron correlation which we will discuss in a later section. Here we just observe that a 'straight-forward' approach would be to represent the ASF, not any longer as a single CSF, but as a multiconfiguration (MC) function expanded in terms of a basis of, say M, CSFs;

Equation (94)

In our definition of an MC approach, there are two phases:

  • (i)  
    the determination of the cα coefficients, or weights, for a given set of CSFs. We will refer to this as the configuration interaction (CI) phase, and
  • (ii)  
    the determination of the orbitals, as an extension to the HF or the DHF method for a given set of expansion coefficients.

Let us start with the CI phase.

4.1. Configuration interaction

In a 'pure' CI approach, only the expansion coefficients in (94) are variational parameters and can be determined by the Rayleigh–Ritz method. The stationary condition then leads to the eigenvalue problem

Equation (95)

where we assume an orthonormal CSF basis. In fact there are M eigenvalues and eigenvectors, often referred to as eigenpairs. If the mth eigenvalue, EmM is the total energy of the desired state, then the associate normalized eigenvector ${{\bf{c}}}_{m}^{M}$ defines the expansion coefficients for the state. The M × M matrix ${{\bf{H}}}^{M}=({H}_{\alpha \beta })$ is called the 'interaction matrix' and has elements

Equation (96)

As stated in the introduction, we are aiming for a systematic approach, where we include a set of CSFs of increasing size to improve our approximate ASF. An essential foundation for this is the Hylleraas–Undheim–MacDonald (HUM) theorem [122, 123], which states the following relationship for the eigenvalues when the size of the matrix increases from M to $M+1$, namely

Equation (97)

In other words, the eigenvalues of the matrix of size M interlace those of size $M+1$. The implication of this is that when the basis set is increased by including an additional CSF of the same symmetry, we are approaching the exact solution for the energy from above. It then follows that the mth eigenvalue is an upper bound for the mth exact solution of the wave equation for the Hamiltonian operator ${ \mathcal H }$, provided the matrix size is at least $M\geqslant m$. To be even more explicit, if the energies are bounded from below as in nonrelativistic theory, the HUM theorem shows that the variational method is a minimization method not only for ASFs lowest in their symmetry, but also for excited states as long as the basis includes the CSFs needed for the lower states.

As an example, if we apply the HUM theorem to ${E}^{{\rm{HF}}}(1s2s{}^{1}S)$ states in He-like systems, then the energy calculation is for a matrix of size M = 1, and hence the result is an upper bound to the energy of the ground state 1S. In order to obtain a wave function whose energy is an upper bound to the exact $1s2s$ 1S energy, the second energy level for this symmetry, it is necessary to have an expansion over a basis that includes the 1s2 1S CSF as well as $1s2s$ 1S so that the desired solution is the second eigenvalue with $M\geqslant 2$. The calculation of the wave function of the $1s2s$ 1S state has a long history [124, 125].

4.1.1. The two-by-two CSF example

The CI method is frequently used in atomic physics, and has become a metaphor for 'interacting configurations' that represent correlation. To understand some of its implications it is valuable to investigate the simplest case of M = 2, to see what differs from the single-configuration approach. In this case the matrix eigenvalue problem is

Equation (98)

with ${H}_{21}={H}_{12}$, since we are dealing with Hermitian operators. The eigenvalues for this problem are roots of the quadratic polynomial obtained from the secular equation

Equation (99)

The two real roots ${E}_{+}$ and ${E}_{-}$, of this equation are [94]

Equation (100)

and the corresponding eigenvectors

Equation (101)

with

Equation (102)

Note that one eigenvalue is above $\max ({H}_{11},{H}_{22})$ while the other is below $\min ({H}_{11},{H}_{22})$. Due to the fact that the trace is conserved (${H}_{11}+{H}_{22}={E}_{-}+{E}_{+}$) it is clear that the interaction term (${H}_{12}={H}_{21}$) produces an apparent mutual repulsion of the two energy levels.

Two interesting cases may be be considered

  • the off-diagonal interaction H12 can be considered as a perturbation of the diagonal energies when $| 2{H}_{12}/({H}_{22}-{H}_{11})| \ll 1$,
  • the diagonal energies are 'nearly degenerate' ($| ({H}_{22}-{H}_{11})/2{H}_{12}| \ll 1$).

In the former, assuming without loss of generality that ${H}_{11}\lt {H}_{22}$ and expanding the square roots of (100) and (102) in binomial series, one finds

Equation (103)

with the corresponding eigenvectors

Equation (104)

where $\delta \equiv {H}_{12}/({H}_{22}-{H}_{11})$. The fact that $| \delta | \,\ll $ guarantees the high purity of the eigenvectors.

For the second case, the most spectacular scenario occurs in the degenerate case (${H}_{11}={H}_{22}$) for which the two eigenvectors are mixed with $| {c}_{1}^{\pm }{| }^{2}=| {c}_{2}^{\pm }{| }^{2}=1/2$ (or $50 \% $) for any non-zero H12 matrix element.

There are many near degeneracies in atomic spectra. An example is the high-lying perturber $3s3{p}^{5}{}^{3}{P}^{o}$ CSF in the sulfur iso-electronic sequence which interacts with the $3{s}^{2}3{p}^{3}{(}^{2}D){nd}{}^{3}{P}^{o}$ Rydberg series CSFs. As the nuclear charge of the atomic system increases [126] the perturber descends into the lower region of the spectrum and the energies of the two components of the wave function change order. As a result there may be 'short-range' interactions in the presence of level crossings at selected values of Z and the order of the dominant component changes and hence, also the label [127]. However, the energy of solutions to the wave equation are continuous functions of Z and plots of the lowest energy of a given symmetry, the second lowest, etc, are continuous functions with an anti-crossing at the point of degeneracy. A unique identification of an ASF is a position number (POS) and symmetry. More will be said about the labelling problem in section 4.7.

4.1.2. Large CI expansions

Another interesting case is the one that occurs when the CSF expansion can be partitioned into two subsets, namely those CSFs whose coefficients may be large and those that are small. Let us assume the two sets of expansion coefficients are represented by the column vectors ${c}^{(0)}$ and ${c}^{(1)}$, respectively. This also partitions the interaction matrix H into blocks so that (95) becomes

Equation (105)

where ${H}^{(00)}$ is the interaction matrix between large components, ${H}^{(11)}$ for interactions between small components of the wave function, and ${H}^{(01)}={H}^{(10)}$ represents the interactions between CSFs of the large and those of the small block. This equation can be rewritten as a pair of equations, namely

Equation (106)

Solving for ${c}^{(1)}$ in the second equation and substituting into the first, we get an eigenvalue problem for ${c}^{(0)}$

Equation (107)

This is known as a method of deflation in numerical analysis since it reduces an eigenvalue for a matrix of size N × N to an eigenvalue problem of size m × m, where m is the expansion size of ${c}^{(0)}$. Of course, once E and ${c}^{(0)}$ have been determined, the small components can be generated from the expression

Equation (108)

and a full wave function is defined. Note that the eigenvalue problem is now nonlinear.

In perturbation theory, where only one eigensolution is computed at a time, the matrix $({H}^{(11)}-{EI})$ is replaced by the difference between the diagonal elements ${H}_{{ii}}^{(11)}$ of ${H}^{(11)}$ and the zero-order energy, ${E}^{(0)}$, which is an eigenvalue of ${H}^{(00)}$. Then $({H}_{{ii}}^{(11)}-{E}^{(0)}I)$ is a diagonal matrix and its inverse is also diagonal. In CI the computation simplifies tremendously if only diagonals are needed since many interactions can then be omitted. But computationally, such an assumption is not necessary since it is also possible to replace ${({H}^{(11)}-{EI})}^{-1}$ by an approximate inverse, a strategy that is used in the GRASP2K Davidson method for solving the eigenvalue problem iteratively. This method has not yet been implemented in GRASP2K and its effectiveness needs to be evaluated when many eigenvalues are required as in a study of states in a Rydberg series. Certainly it could be used to obtain excellent initial estimates for the Davidson algorithm [130, 131].

Note that expression (108) is the linear algebra equivalent of the effective Hamiltonian derived in the CI-MBPT program [128]. In CI-MBPT however, the CSFs that define the small components are used only to correct the energy, unless they are included through perturbation theory applied to the property of interest [129].

4.2. The MCHF method

Multiconfiguration methods, MCHF or MCDHF, differ from CI methods in that both the expansion coefficients and the radial functions are varied for a stationary energy. The procedures are the same as for the single CSF wave function and many of the properties are similar except for some differences. Though the single configuration case is a subset of the multiconfiguration case, here we will focus on the differences.

In the MCHF method the normalized atomic state wave function (ASF) is expanded in a basis set of M CSFs

Equation (109)

and the associated energy becomes

Equation (110)

The diagonal matrix elements for the energy can be expressed as linear combinations of one-electron integrals $I(a,a)$ and two-electron Slater integrals ${F}^{k}({ab})$ and ${G}^{k}({ab})$, as in the HF case (see (59) and (61)), but off-diagonal matrix elements introduce one-electron integrals $I(a,b)$ and Slater integrals ${R}^{k}({ab},{cd})$ with symmetries different from the ${F}^{k}({ab})={R}^{k}({ab},{ab})$ (direct) and the ${G}^{k}({ab})={R}^{k}({ab},{ba})$ (exchange) symmetry. For example, $\langle 3{p}^{2}{}^{1}D| {{ \mathcal H }}_{{\rm{NR}}}| 3s3d{}^{1}D\rangle =2\sqrt{5/3}\,{R}^{1}(3p3p,3s3d)$. Then the energy functional has the form

Equation (111)

where

Equation (112)

are contributions from all the interactions between CSFs. The coefficient ${t}_{{aa}}^{\alpha \alpha }$ is the occupation of the orbital a in CSF α and taa = wa is the generalized occupation number of an orbital a in analogy with the HF notation. Similarly, ${v}_{{abcd};k}^{\alpha \beta }$ is the contribution to the energy of a given Slater integral.

As in the derivation of the HF equations from the variational principle [53], Lagrange multipliers are introduced for each constraint ${{ \mathcal C }}_{{ab}}$ defining the energy functional

Equation (113)

where ${{ \mathcal C }}_{{ab}}$ is the orthonormality constraint (66). Both the expansion coefficients c and the radial functions P are varied.

For a given set of radial functions $\{{P}_{{nl}}(r)\}$, the total energy is optimized through the variation of the expansion coefficients as in the CI method, leading to the matrix eigenvalue problem

Equation (114)

with many solutions. Only one eigenvector is the desired eigenvector, not necessarily the lowest and this vector defines the expansion coefficients.

For a given set of mixing coefficients $\{{c}_{\alpha }\}$, the stationary condition with respect to a variation in the radial functions, $\delta {P}_{a}(r)$, leads to a system of coupled differential equations

Equation (115)

similar in form to the HF equation (71). What differs are the types on integrals that may occur in the energy expression. Slater integrals of the symmetry ${R}^{k}({ab},{ab})$ again contribute to the direct potential $Y(a;r)$ through ${Y}^{k}({bb};r)/r$ functions. All other ${R}^{k}({ab},{cd})$ integrals contribute to $\bar{X}(a;r)$ through ${Y}^{k}({bd};r){P}_{c}(r)/r$ functions. Also included in $\bar{X}(a;r)$ are contributions from $I(a,b)$ integrals where $b\ne a$. Again, for each orbital angular momentum, there is a matrix $({\varepsilon }_{{ab}})$ arising from the orthogonality constraints [53]. In the ATSP2K code, all contributions to $\bar{X}(a;r){P}_{a}(r)$, together with off-diagonal energy parameters, are treated as a non-homogeneous term in the differential equation. But with B-spline matrix methods, the radial functions are again solutions of a generalized eigenvalue problem [132].

4.2.1. Brillouin's theorem for multiconfiguration solutions

Properties of the HF equations can be extended to the MCHF equations, with some qualifications.

The generalized Brillouin's theorem is not nearly as important as in HF when a given orbital occurs in many CSFs. If ${{\rm{\Psi }}}_{{nl}}^{{\rm{MCHF}}}$ is the portion of the ASF (complete with expansion coefficients) that contains the orbital nl and ${{\rm{\Psi }}}_{{nl}\to n^{\prime} l}^{{\rm{MCHF}}}$ represents the function obtained through the ${nl}\to n^{\prime} l$ substitutions that themselves may require expansions in terms of CFPs when an orbital is multiply occupied, then the following holds:

Equation (116)

Thus the included interactions may only apply in a broad average sense.

Table 2 shows the role of Brillouin's theorem in HF and MCHF calculations. For the HF calculation, (82) states that the interaction between 2p4 and a specific linear combination of the 2p33p CSFs has a zero matrix element but by adding the CSFs explicitly into a MCHF calculation in which only $3p$ is varied, the energy is reduced significantly. Varying both $2p$ and $3p$ reduces the total energy more but, at the same time, the rotations that enter into such a calculation have a noticeable effect on the expansion coefficients. In calculation (3a) and (3b), CSFs such as $2{p}^{2}3{p}^{2}$ are also present. These CSFs are part of Brillouin's theorem for calculation (2b). Varying both $2p$ and $3p$ again allows for orbital rotations but the energy reduction is now considerably less and the expansion coefficients for 2p33p CSFs are not greatly changed. Not given in this table are the expansion coefficients of $2{p}^{2}3{p}^{2}$. Thus, in general, Brillouin's theorem is not significant for larger multiconfiguration expansions.

Table 2.  Total energies in Eh for $2{p}^{4}{}^{3}P$ ASF in oxygen illustrating the role of Brillouin's theorem as a function of the method on the total energy and the expansion coefficients: (1) HF, (2) MCHF for $\{2{p}^{4},2{p}^{3}3p\}$, and (3)$\{2{p}^{4},2{p}^{3}3p,2{p}^{2}3{p}^{2}\}$. In (2a) and (3a) the $1s,2s,2p$ are fixed and only $3p$ varied, whereas in (2b) and (3b) both $2p$ and $3p$ are varied allowing orbital rotations.

  Varied Total Expansion coefficients
    energy 2p4 $2{p}^{3}{(}^{2}P)3p$ $2{p}^{3}{(}^{2}D)3p$ $2{p}^{3}{(}^{4}S)3p$
(1) All −74.809398 1.0000
(2a) $3p$ −74.812490 0.9977 0.0379 −0.0154 −0.0532
(2b) $2p,3p$ −74.841396 0.9179 −0.1669 0.2426 −0.2659
(3a) $3p$ −74.844914 0.9942 0.0209 −0.0072 −0.0301
(3b) $2p,3p$ −74.845367 0.9936 0.0110 0.0065 −0.0443

But it may still have a significant effect in some small cases. An example is the interaction of $| 3s3{p}^{6}{}^{2}S\rangle $ with the $| 3{s}^{2}3{p}^{4}{nd}{}^{2}S\rangle $ continuum in Cl. In the HF approximation, the former is located in the continuum but with the introduction of the single $| 3{s}^{2}3{p}^{4}3d{}^{2}S\rangle $ CSF into the wave function expansion, the MCHF $3d$ orbital has included the effect of both the interaction with the continuum states and the bound states. This $3d$ orbital has a mean radius similar to that of the $3p$ orbital [133]. It was confirmed that the MCHF results agreed with perturbation theory only when the latter included both continuum and bound states. In this case, the interaction with the continuum lowered the energy of the state into the bound spectrum and the $3d$ orbital became a bound orbital approaching zero at large r. This is an example where the summation over continuum states may cancel at large r so that the state does not 'decay' into the continuum.

4.2.2. Uniqueness of the multiconfiguration solutions

For multiconfiguration expansions, rotational analysis for detecting a non-unique solution [113] is usually not of sufficient benefit, to justify the needed computational effort. The probability of a non-unique solution often decreases when many different CSFs of different symmetries are included except for a certain class of expansions and modifications can be made to the expansion so that equations have a unique solution.

Well-known cases for which the radial functions are not unique are complete active space (CAS) expansions [134]. Consider an ASF for $1{s}^{2}{}^{1}S$ and the orbital set $\{1s,2s\}$ of the same symmetry. Let the CSF basis be the set of all two-electron 1S CSFs that can be constructed from these orbitals, namely $\{1{s}^{2},1s2s,2{s}^{2}\}{}^{1}S$. Any rotational transformation of the orbital set changes the expansion coefficients of the ASF, but leaves the wave function and its energy invariant. Without a unique solution, a computational process for a solution may still converge (if nodal properties are relaxed) but will depend on the initial estimates. Computationally, it is desirable to have a well-defined solution. Koopmans' theorem for similar situations in the HF case sets the off-diagonal Lagrange multiplier to zero, but this does not always work well for a CAS solution. Essentially, with this CAS expansion, there is a degree of freedom in the expansion. Much more can be gained by using the degree of freedom to set one of the expansion coefficients to zero or, equivalently, eliminating a CSF from the expansion. If the desired solution had been for $1s2s{}^{1}S$, then clearly 2s2 should be eliminated so our computed solution would be an upper bound to the second exact solution of the Hamiltonian. For such a solution, the generalized Brillouin's theorem states that the interaction between ${{\rm{\Psi }}}^{{\rm{MCHF}}}$ and perturbation obtained by rotating the $1s,2s$ orbitals is zero, which in this case is easier to confirm through computation than direct analysis, and the energy is the same as the CAS energy.

But what should be eliminated if the desired solution is the ground state? In this case the HUM theorem is not helpful.

Suppose the orbital set is the $\{1s,2s,\ldots {ms}\}$ set so that the CSF basis consists of all $\{{nsn}^{\prime} s{}^{1}S\}$ CSFs, with $n,n^{\prime} \leqslant m$. The expansion over this basis is a two-electron partial wave since every CSF has the same $| {ss}^{\prime} {}^{1}S\rangle $ spin-angular factor and could be written as

Equation (117)

where ${R}_{{nl}}(r)={P}_{{nl}}(r)/r$ and ${c}_{{nn}^{\prime} }={c}_{n^{\prime} n}$. The above radial factor for the partial wave can be expressed in matrix vector form. Let ${\bf{C}}$ be the symmetric matrix ${c}_{{nn}^{\prime} }$, and ${\bf{R}}({\bf{r}})$ the row vector $\{{R}_{1}(r),{R}_{2}(r),\ldots ,{R}_{m}(r)\}$. Then the radial factor becomes

Equation (118)

Since ${\bf{C}}$ is symmetric, there exists a unitary transformation that will diagonalize the matrix of expansion coefficients so that the partial wave has the form

Equation (119)

The process can be extended to other symmetries so that

Equation (120)

The orbitals of this 'reduced form' of the wave function are also called the 'natural' orbital expansion [53, 135] and are the ones that diagonalize the density matrix [136, 137]. It is the form obtained by using each of the $(m\times (m-1))/2$ degrees of freedom toward eliminating the 'off-diagonal' CSFs.

The forms of these orthogonal transformations depend on the spin-angular symmetry of the different partial waves [135]. For a ${}^{1}\,{P}^{o}$ two-electron system, the set of partial waves have the symmetry $\{{sp},{pd},{df},\ldots \}$. In this case the radial transformations for reducing the ${nsn}^{\prime} p$ expansion to ${ns}(n+1)p$ expansions differ from the radial transformations reducing the ${npn}^{\prime} d$ expansion to ${np}(n+1)d$. For this reason, the reduced forms cannot be used simultaneously in both sp and pd subspaces, unless the sets of p-orbitals involved in the two couplings are allowed to differ. This was one of the original motivations for implementing non-orthogonal orbitals that preserve the othonormality of CSFs within a partial wave [138].

For a given active orbital set, the size of a CAS expansion grows dramatically with the number of electrons and 'restricted active space' wave functions [139] should be built. For nominal two-electron atoms such as alkaline-earth atoms and atoms of the IIB group of the periodic table, multiconfiguration expansions can be generated by restricting the excitations to the outer valence shells (i.e. no hole in the core), and using the reduced forms with non-orthogonal orbitals to include valence correlation [140, 141].

When the generalized occupation is small, the associated radial function is quite different from the normal 'spectroscopic' (hydrogenic) orbital. Koopmans' theorem also has a different interpretation. Consider the case

Equation (121)

where no orthogonality constraints are present. By substituting the expressions for the matrix elements into (110), the expression for the total energy becomes the sum of integrals with coefficients weighted by expansion coefficients. For our example, it is easy to show that

Equation (122)

where $\bar{E}({nl})$ is the energy when the nl orbital has been removed (or set to zero). In the present example for He I, ${c}_{2}=0.005766$, and 4f2 clearly is a correction to the $1{s}^{2}{}^{1}S$ ASF, lowering its total energy by 0.00066 Eh. But this $4{f}^{2}{}^{1}S$ CSF is very different from a HF CSF. In fact, ${H}_{22}=17.0166\ {E}_{{\rm{h}}}$, which is well into the positive energy continuum although the $4f$ orbital is bound.

4.2.3. Solution of the MCHF equations

Because the equations for the expansion coefficients and the radial functions are coupled, the MCHF equations are solved by the MC-SCF process, similar to the SCF except that, after orthogonalization of the orbital set, the interaction matrix needs to be computed and the desired eigenvectors determined. For solving large CI problems, the ATSP2K code uses an approach based on the Davidson method [131] with robust preconditioning [142]. This is an iterative method based entirely on matrix-vector multiplication, requiring an initial estimate of the desired solution. Initially, when no estimates are available, approximate values can be determined by diagonalizing a small matrix. After that, when the current estimate is used as a starting value for the Davidson algorithm as the MC-SCF iteration converges, improved eigenvectors can be obtained with only a few (2–3) matrix-vector multiplies. Sparse matrix methods are used for representing the interaction matrix since possibly only 10% of the matrix or less may be non-zero.

The differential equations are solved using finite difference methods based on a discrete representation of the radial functions on a logarithmic mesh. Details can be found in [53, 113]. For selecting the solution of a given differential equation, node counting of the radial function is applied to the spectroscopic orbitals that are defined as those occupied in the single configuration HF approximation, or more generally, orbitals that have a generalized occupation number of 0.5 or greater. In technical terms node counting amounts to guiding the solution of the differential equation in such a way that radial functions have the same node structure as the corresponding hydrogen-like function provided small amplitudes in the tail are ignored that result from the rotation of orbitals. No node constraint needs to be imposed on the orbitals that are unoccupied in the HF approximation.

4.2.4. Extended MCHF methods

The MCHF method can also be used to simultaneously obtain wave functions for many states. If all are of the same symmetry, the states may require many eigenvalues and eigenvectors of the same interaction matrix, or they may have different symmetries or parities, in which case different interaction matrices are needed. The variational principle is then applied to a weighted linear combination of functionals of the individual states and the energies and expansions coefficients are obtained as the corresponding eigenvalues and eigenvectors of the Hamiltonian matrix for the given symmetry [55]. This method is extremely useful for BP calculations, discussed next.

4.3. BP wave functions

For light atoms, where relativistic effects are expected to be small, an orthonormal orbital basis from an ordinary or extended MCHF calculation may be used in combination with the CI method and the BP Hamiltonian, ${{ \mathcal H }}_{{\rm{BP}}}$ (29). This method has produced many J-dependent energy levels [143] in good agreement with observation. An example of the use of the CI method is given by the BP calculations [143] in which the nonrelativistic Hamiltonian used for optimizing the orbitals in the MCHF approach is corrected by the inclusion of the BP relativistic operators (29). The CSF basis is extended to allow LS-term mixings for a given J and parity. The ASF is then an expansion over a set of CSFs (49),

Equation (123)

in which the ${\bf{L}}+{\bf{S}}={\bf{J}}$ angular momentum coupling (41) is realized for each term symmetry. Here MLS is the length of the expansion for a given LS term.

The evaluation of the BP operators appearing in (29) involves a large variety of radial integrals, as illustrated in [144147]. The two-body terms ${{ \mathcal H }}_{{\rm{SS}}}$ and ${{ \mathcal H }}_{{\rm{SOO}}}$ are not straight forward leading to many radial integrals. The complexity of the two-body ${{ \mathcal H }}_{{\rm{OO}}}$ operator however, exceeds those of ${{ \mathcal H }}_{{\rm{FS}}}$, increasing the computer time required to evaluate an interaction matrix. Thus, it has been customary to omit the orbit–orbit effect from energy spectrum calculations. The theory used to compute the interaction matrix assumes that all the CSFs are defined in terms of a single orthonormal set of orbitals. The extended MCHF method assures that this condition is met and has been used successfully to compute many levels of the Na-like to Ar-like sequences for nuclear charges up to Z = 30 [148].

4.3.1. Complete degeneracies

When relativity is treated in the BP approximation, the relativistic corrections are included in the CI step with orbitals obtained from non relativistic HF/MCHF calculations. The LS-mixing can be dramatic when terms lie close to each other, or are accidentally degenerate. Complete degeneracies may occur when the two different term energy expressions are identical, i.e., they will be the same for all radial functions. For example, a strong relativistic mixing occurs between $2{p}^{5}3d{}^{3}{D}_{2}^{o}$ and $2{p}^{5}3d{}^{1}{D}_{2}^{o}$ CSFs as observed in the study of the Ne-like spectra [148] because of this near degeneracy. A systematic analysis of the energy expressions shows that singlet-triplet term-degeneracies occur not only for ${p}^{5}l\,{}^{1}(L=l),{}^{3}(L=l))$ but also for some singlet and triplet terms arising from the ${l}^{4l+1}l^{\prime} $ configurations, as reported in table 3.

Table 3.  Cases of complete degeneracies of singlet and triplet terms of ${l}^{4l+1}l^{\prime} $ configurations.

${p}^{5}p^{\prime} $ ${(}^{1}P,{}^{3}P)$ ${d}^{9}p$ ${(}^{1}D,{}^{3}D)$ ${f}^{13}p$ ${(}^{1}F,{}^{3}F)$
${p}^{5}d$ ${(}^{1}D,{}^{3}D)$ ${d}^{9}d^{\prime} $ ${(}^{1}P,{}^{3}P)$ ${f}^{13}d$ ${(}^{1}D,{}^{3}D)$
    ${d}^{9}d^{\prime} $ ${(}^{1}F,{}^{3}F)$ ${f}^{13}d$ ${(}^{1}G,{}^{3}G)$
${p}^{5}f$ ${(}^{1}F,{}^{3}F)$ ${d}^{9}f$ ${(}^{1}D,{}^{3}D)$ ${f}^{13}f^{\prime} $ ${(}^{1}P,{}^{3}P)$
    ${d}^{9}f$ ${(}^{1}G,{}^{3}G)$ ${f}^{13}f^{\prime} $ ${(}^{1}F,{}^{3}F)$
        ${f}^{13}f^{\prime} $ ${(}^{1}H,{}^{3}H)$

In all these cases, strong relativistic mixing is expected for J = L. Note that if a degeneracy is found for some terms of ${l}^{4l+1}l^{\prime} $, the complete degeneracy also holds for the same terms arising from $l{^{\prime} }^{4l^{\prime} +1}l$. This can be explained through the spin-quasispin exchange.

4.4. The MCDHF method

The relativistic extension of the MCHF approach is to define the ASF as an expansion over a set of jj-coupled relativistic CSFs (55),

Equation (124)

and the energy functional

Equation (125)

as the expression for relativistic energy using the Dirac–Coulomb Hamiltonian (9). Lagrange multipliers are introduced for constraining the variations in the one-electron functions $(\delta {P}_{n\kappa },\delta {Q}_{n\kappa })$ to satisfy the constraint (86), that guarantees the orthonormality of the one-electron functions and of the CSFs.

The energy functional of (124) with the Dirac–Coulomb Hamiltonian (9) can be expressed in terms of spin-angular coefficients and radial integrals

Equation (126)

where

Equation (127)

are contributions from all the interactions between CSFs. The one-body interactions give rise to the spin-angular coefficients tab and the $I(a,b)$ integrals defined by (88) and the two-body interactions to the spin-angular coefficients vabcdk and to the relativistic Slater integrals ${R}^{k}({ab},{cd})$ defined by (89) and (90).

The coefficient of $I(a,a)$, namely ${w}_{a}={t}_{{aa}}$ is the generalized occupation number for orbital a. The spin-angular coefficients tab and ${v}_{{abcd}}^{k}$ appearing in (126) are evaluated using the same methods as for DHF [120, 121].

As in the nonrelativistic MCHF approach, it is possible to derive the MCDHF equations from the usual variational argument by varying both the large and small component:

Equation (128)

$V(a;r)={V}_{{\rm{nuc}}}(r)+Y(a;r)+\bar{X}(a;r)$ is built, similarly to the nonrelativistic case, from the nuclear, direct and exchange contributions arising from both diagonal and off-diagonal $\langle {{\rm{\Phi }}}_{\alpha }| {{ \mathcal H }}_{{\rm{DC}}}| {{\rm{\Phi }}}_{\beta }\rangle $ matrix elements. In each κ-space, Lagrange related energy parameters ${\epsilon }_{{ab}}={\epsilon }_{{n}_{a}{n}_{b}}$ are introduced to impose the orthonormality constraints (86) in the variational process. The method of solution for both the expansion coefficients and the radial functions are similar to those for the MCHF equations. The main difference is in the solution of the differential equations since the Dirac equations are a pair of first-order differential equations.

An interesting case for relativistic theory is that of the helium-like ground state for high-Z ions discussed in the MCHF section 4.2. A concern is the presence of negative energy states in an approximate wave function. Indelicato and Desclaux [149] claimed their convergence problems for natural orbital expansions, when orbitals with $n\gt 4$ were present, were due to the absence of projection operators. Among the n = 4 orbitals, only $4s$ converged. In a subsequent paper, Indelicato [150] introduced projection operators into an MCDHF calculation and claimed these were essential for a solution. But the difficulty could also have been due to numerical problems. With the GRASP92 [80] code, MCDHF results [151] were obtained for both ${{\rm{U}}}^{90+}$ and Ho65+ for He-like expansions up to n = 6 in good agreement with Indelicato's results including projection operators. The numerical problems can be understood already from the simple expansion over the $\{1{s}^{2},4{f}^{2}\}$ 1S basis, which, in jj-coupling becomes $\{1{s}^{2},4{f}_{+}^{2},4{f}_{-}^{2}\}$ J = 0. As Z increases, ultimately the contribution to the energy from 4f2 will be below the numerical accuracy of the solution of the equation for the $1s$ spinor. SCF iterations then are no longer meaningful unless the $1s$ is fixed.

In this review we have expressed the MCDHF equations in a manner that includes the generalized occupation number so that the matrix of Lagrange multipliers is symmetric. Correlation orbitals may have extremely low occupation number such as ${t}_{{aa}}={10}^{-6}$. In the present form division by small numbers is avoided and as ${t}_{{aa}}\to 0$, ${\varepsilon }_{{aa}}\to 0$. In the previous definition, the diagonal energy was proportional to ${\varepsilon }_{{aa}}/{t}_{{aa}}$, a ratio that approaches $\infty $ as ${t}_{{aa}}\to 0$ and would be of concern if the parameter were related to a binding energy. A derivation of the diagonal energy parameter for ${\varepsilon }_{4{f}_{+}4{f}_{+}}$ (or ${\varepsilon }_{4{f}_{-}4{f}_{-}}$) without the introduction of diagonal Lagrange multipliers has been published [152, 153] where it is shown that large Lagrange multipliers in the earlier definition implied that the $4{f}_{+}^{2}$ or $4{f}_{-}^{2}$ CSF was high in the positive energy continuum, as found in the earlier MCHF study. The one-electron energies of the 4f and 4f+ are shown as a function of Z in [153].

4.4.1. Breit and QED corrections

The variational method can be applied to the ${{ \mathcal H }}_{{\rm{DCB}}}$ (20) with the consequence that Brillouin's theorem would be satisfied for selected excitations but at the cost of considerable computational effort. As in the case of the MCHF method (see table 2) Brillouin's theorem alone is not sufficient for accuracy. Thus, this option has not been implemented in GRASP2K. Instead, larger expansions are used that allow for a systematic calculation.

In the GRASP2K code, Breit and QED corrections are computed using the MCDHF orbitals from a calculation using the HDC Hamiltonian and then applying the CI method with a Hamitonian that includes the desired corrections. Generally, the most important correction is the Breit correction with the Dirac–Coulomb–Breit Hamiltonian (20). In the GRASP2K code, by default, all corrections are included in the matrix elements for the Hamiltonian, such as ${{ \mathcal H }}_{{\rm{DCB}}+{\rm{QED}}}$ (24). Thus they are not perturbative corrections and affect the wave function. When correlation orbitals with small generalized occupation numbers are present, the correction to some individual matrix elements may become large. Then they could also be computed perturbatively, thereby not affecting the wave function.

The first QED correction included in GRASP2K is the vacuum polarization correction, applied to all matrix elements

Equation (129)

where ${V}_{{\rm{Uehl}}}({r}_{i})$ includes vacuum polarization potential terms of both second- and fourth-order in QED perturbation theory.

The second QED correction, the self-energy contribution ${{ \mathcal H }}_{{\rm{SE}}}$, is applied to the diagonal energies as

Equation (130)

where nw is the number of subshells in the CSF, wa is the number of electrons in subshell a in CSF, ${E}_{{\rm{SE}}}(a)$ is the one-electron self-energy of an electron in subshell a. The way of estimating ${E}_{{\rm{SE}}}(a)$ differs from one approach to another [154, 155].

Other corrections that can be added in GRASP2K are the normal mass and SMSs as defined in equations (25) and (26). A more practical way to evaluate isotope shifts for different pairs is to calculate the expectation values of the relevant operators for whatever isotope combination, using the appropriate computational tool SMS92 [156]. The higher-order corrections can be estimated from the expectation values of the nuclear recoil operator (27) using RIS3 [157], a recent module of the GRASP2K package.

4.4.2. Consequence of changing the Hamiltonian

As mentioned earlier, in GRASP2K the Breit and QED corrections are included in a CI calculation after the variational method has been used to determine radial functions. When expansions are compared with orbitals optimized for the ${{ \mathcal H }}_{{\rm{DCB}}}$, the CI mixing may be larger because of Brillouin's theorem not being satisfied although energy may be comparable. It should be remembered that the natural orbitals of a reduced form are optimized for a specific Hamiltonian. When the Hamiltonian changes the wave functions no longer satisfy Brillouin's theorem for the new Hamiltonian. In such cases, a full expansion may be needed for an accurate wave function. This can be seen from table 4 where the difference in energy of the natural and CAS expansions for an n = 3 orbital set is shown for ${{\rm{U}}}^{90+}$. For this highly charged ion, the difference is significant, although for the neutral He atom, the differences were negligible to the digits displayed.

Table 4.  Comparison of total energies (in Eh) of the U90 ground state for reduced and CAS expansions from CI calculations for different Hamiltonians when radial functions are computed for the reduced expansion.

Hamiltonian Reduced CAS Diff.
DC −9637.3780508 −9637.3780509 0.0000000
DCB −9625.5384609 −9625.5959163 −0.0674554
DCB+VP −9632.4011947 −9632.4491848 −0.0479901
DCB+VP+SE −9606.0571813 −9606.1052038 −0.0480225
DCB+VP+SEa −9606.0571795 −9606.1052020 −0.0480225

aComputed perturbatively.

4.5. Nonrelativistic MCHF orbitals with a relativistic Hamiltonian

A complementary low-order relativistic approach, also based on CI, consists of diagonalizing the Dirac–Coulomb-Breit Hamiltonian (20) interaction matrix to get a relativistic ASF representation (124) in a jj-CSF basis (55) built on Dirac spinors (11) whose large and small radial components are calculated from nonrelativistic MCHF radial functions, using the Pauli approximation (28) [64, 66, 158]

Equation (131)

These orbitals are then orthonormalized. This method, based on the use of a relativistic CI approach in the Pauli approximation, labelled RCI-P, provides an interesting way of checking the reliability of independent MCHF-BP calculations [153, 159].

4.6. Extended MCDHF methods

Just as for the MCHF method, the MCDHF method can be extended to simultaneously determine wave functions for many states. Again, the variational principle is applied to a weighted linear combination of functionals of the individual states and the energies and expansions coefficients are obtained as the corresponding eigenvalues and eigenvectors of the Hamiltonian matrix for the given symmetry [160]. Normally, wave functions for fine structure states of a given term are determined together. When determining wave functions for many states (up to a few hundred), calculations are often done by parity, meaning wave functions for all even states are determined in one calculation and wave functions for all odd states are determined in another [161163].

In GRASP2K, an extended MCDHF method is referred to as an extended optimal level calculation. In an extended average level calculation only the diagonal elements of the interaction matrix are included in the variational process.

4.7. Eigenvector representation and jj to LSJ coupling transformations

The BP and MCDHF methods are both relativistic methods that clearly differ in a number of significant ways. One of these is the order of the coupling of the orbital quantum numbers. A BP calculation uses LSJ-coupling and radial functions of the orbitals that depend only on nl-quantum numbers. As in all expansions where the basis CSFs form an orthonormal set, the square of the expansion coefficient represents the fraction of the composition of wave function accounted for by the given CSF. This information is used to determine the classification of the state. When relativistic effects are small, a specific LS value will account for most of the wave function composition and ideally is a single CSF. In the atomic spectra database (ASD) [164] at the National Institute of Standards and Technology the designation of a level is usually associated with the CSF with the largest composition. But such a scheme does not guarantee unique labels for all ASFs [143, 165]. The simplest, but unique labeling scheme, provided all the levels up to a given level were known, would be use a POS index designating the position of the ASD energy for a given symmetry, much in the way orbitals are designating by a principal quantum number n and orbital quantum numbers.

For the above reasons, it is often convenient to express results from an MCDHF calculation performed in jj-coupling in terms of LSJ notation. The JJ2LSJ code in GRASP2K does this by applying a unitary transformation to the MCDHF CSF basis set which preserves orthonormality. The unitary transformation selected is the coupling transformation that changes the order of coupling from jj to LSJ, a transformation that does not involve the radial factor, only the spin-angular factor.

As illustrated in section 3.2.1 for the 3d4 configuration, each nonrelativistic nl-orbital (except for ns) has associated with it two relativistic orbitals ${l}_{\pm }\equiv j=l\pm 1/2$. In the transformation of the spin-angular factor $| {l}^{w}\alpha {LS}\rangle $ into a jj-coupled angular basis, two subshell states, one with ${l}_{-}\equiv j=l-1/2$ and another one with ${l}_{+}\equiv j=l+1/2$ may both occur in the expansion. This shell-splitting

Equation (132)

obviously conserves the number of electrons, provided ($w\,={w}_{1}+{w}_{2}$), with ${w}_{1}{\rm{(max)}}=2l$ and ${w}_{2}{\rm{(max)}}=2(l+1)$.

Making use of this notation, the transformation between the subshell states in LSJ- and jj-coupling can be written as

Equation (133)

Equation (134)

which, in both cases, includes a summation over all the quantum numbers (except of n, ${l}_{-}$, and ${l}_{+}$). Here, $| ({l}_{-}^{{w}_{1}}{\nu }_{1}{J}_{1},{l}_{+}^{{w}_{2}}{\nu }_{2}{J}_{2})\ J\rangle $ is a coupled angular state with well-defined total angular momentum J which is built from the corresponding jj-coupled subshell states with ${j}_{1}={l}_{-}\,=l-\tfrac{1}{2}$, ${j}_{2}={l}_{+}=l+\tfrac{1}{2}$ and the total subshell angular momenta J1 and J2 , respectively.

An explicit expression for the coupling transformation coefficients

Equation (135)

in (133) and (134) can be obtained only if we take the construction of the subshell states of w equivalent electrons from their corresponding parent states with $w-1$ electrons into account. In general, however, the recursive definition of the subshell states, out of their parent states, also leads to a recursive generation of the transformation matrices (135). These transformation coefficients can be chosen real: they occur very frequently as the building blocks in the transformation of all symmetry functions. The expressions and values of these configurations are published in [166].

5. Correlation models

5.1. Electron correlation

HF is an approximation to the exact solution of Schrödinger's equation. Neglected is the notion of 'correlation in the motion of the electrons'; each electron is assumed to move independently in a field determined by the other electrons. For this reason, the error in the energy was defined by Löwdin in 1955 [167], to be the correlation energy, i.e.

Equation (136)

In this definition, ${E}^{{\rm{exact}}}$ is the exact energy eigenvalue of Schrödinger's equation. In line with the definition we will refer to electron correlation as effects beyond the HF approximation. Electron correlation can be thought of as consisting of two parts; static correlation and dynamic correlation [168, 169].

5.1.1. Static electron correlation

Static correlation is the long-range re-arrangement of the electron charge distribution that arises from near degeneracies of the HF energies. Static correlation can be accounted for by including in the wave function a set of important CSFs that define the so called multireference (MR) set. Static correlation can also be interpreted in terms of Z-dependent perturbation theory where the CSFs of the MR set are built from orbitals with the same principal quantum numbers as the ones that occupy the reference state and where we may think of orbitals with the same principal quantum numbers as being degenerate. In a more general setting we can say that the static correlation is described by a set of CSFs that have large expansion coefficients and account for the major correlation effects.

5.1.2. Dynamic electron correlation

Dynamic correlation is a short-range effect that arises from the singularity of the $1/{r}_{{ij}}$ electron–electron interaction near points of coalescence where ${r}_{{ij}}=0$ and has a cusp condition associate with it [52]. These are not isolated points, but include the entire region of space. The more likely regions are those where the probability of finding a pair of electrons is the highest.

It has been shown that by extending expansions to include CSFs with higher l-quantum numbers, the accuracy of the wave function improves [170]. For the helium ground state, a total energy accurate to seven (7) decimal places is estimated to require expansions up to l = 100 [91]. Wave function expansions in terms of CSFs built from central-field orbitals form a non-local basis that is non-zero over the region of space. If, instead, the CSFs are built from a B-spline basis, which is non-zero over only a 'local' subregion, the contributions to expansions with higher l, have been shown to cluster around the ${r}_{1}={r}_{2}$ region [58]. This becomes evident when noting that

Equation (137)

This factor, appearing in all Slater integrals, clearly shows the rapid decrease away from the diagonal (${r}_{1},{r}_{2}$) region with increasing k.

In some instances, symmetry plays an important role. A total wave function of a many-fermion system must be antisymmetric with respect to the interchange of any two of the fermions in order to satisfy the Pauli exclusion principle. For the $1{sns}$ 3S systems the spin function is symmetric so that, by the Pauli exclusion principle, the radial factor for the HF wave function must be antisymmetric, namely

This factor is zero for all ${{\bf{r}}}_{1}$ and ${{\bf{r}}}_{2}$ whenever ${r}_{1}={r}_{2}$ and includes the points where ${r}_{12}=0$. Thus even at the HF level the two electrons are kept away from each other by the symmetry requirements and the effects of dynamic electron correlation are fairly minor.

For many-electron systems the largest contributions to electron correlation come from pairs of electrons which occupy the same region in space. Thus there are large contributions from each doubly occupied orbital with smaller additions from orbital pairs that occupy different shells. Just as for the static correlation the dynamic correlation can be accounted for by expansions over CSFs and the effect should be to mimic the cusp behavior of the exact wave function at points of electron coalescence. Perturbative arguments are used to define classes of CSFs that are important in this regard and this is the topic of the next section.

5.2. Z-dependent perturbation theory

Let us introduce a new variable $\rho ={Zr}$, which in effect changes the unit of length. Then the nonrelativistic Hamiltonian becomes

Equation (138)

where

Equation (139)

Schrödinger's equation now reads

Equation (140)

In this form, the $1/Z$ appears as the natural perturbation parameter. If we assume

Equation (141)

in the ρ unit of length, and

Equation (142)

we may insert these expansions in (140) to obtain equations for ${{\rm{\Psi }}}^{(k)}$ and ${E}^{(k)};$

Equation (143)

The solutions of the first equation are products of hydrogenic orbitals.

Let $| \{{nl}\}\gamma {LS}\rangle $ be a CSF constructed from products of hydrogenic orbitals. Here $\{{nl}\}=\{{n}_{1}{l}_{1},{n}_{2}{l}_{2},\ldots ,{n}_{N}{l}_{N}\}$ is the set of N principal and orbital quantum numbers that define the configuration (39) and γ denotes the complete set of the coupling tree quantum numbers specifying unambiguously the considered configuration state (see (50)). Then

Equation (144)

with ${E}^{(0)}$ being the sum of the hydrogenic energies

Equation (145)

Since ${E}^{(0)}$ is independent of the orbital quantum numbers, it is now clear that different configurations may lead to the same ${E}^{(0)};$ that is, ${E}^{(0)}$ is degenerate. According to first-order perturbation theory for degenerate states [105, 171], ${{\rm{\Psi }}}^{(0)}$ is a linear combination of the degenerate CSFs $| \{{{nl}}^{\prime }\}{\gamma }^{\prime }{LS}\rangle ;$ the coefficients are components of an eigenvector of the interaction matrix, $\langle \{{{nl}}^{\prime }\}{\gamma }^{\prime }{LS}| V| \{{nl}\}\gamma {LS}\rangle $ and ${E}^{(1)}$ is the corresponding eigenvalue. Then

Equation (146)

But only configurations with the same parity interact and so the linear combination is over all CSFs with the same set of principal quantum numbers and the same parity. This set of CSFs is referred to as the complex by Layzer [172]. The relativistic versions of Layzer's complex can be found in [173, 174].

The first-order correction ${{\rm{\Psi }}}^{(1)}$ is a solution of (143) orthogonal to ${{\rm{\Psi }}}^{(0)}$. It can be expanded as a linear combination of normalized intermediate CSFs $| {\gamma }_{v}{LS}\rangle $ belonging to ${{ \mathcal H }}^{(0)}$, but outside the complex. Then

Equation (147)

where ${E}_{{\gamma }_{v}{LS}}=\langle {\gamma }_{v}{LS}| {{ \mathcal H }}^{(0)}| {\gamma }_{v}{LS}\rangle $. Substituting equation (146) into (147) and interchanging the orders of summation, we find

Equation (148)

In other words, the mixing coefficient, ${c}_{{l}^{\prime }{\gamma }^{\prime }}$, is a weight factor in the sum over intermediate CSFs $| {\gamma }_{v}{LS}\rangle $ interacting (having non-zero matrix elements) with CSFs in the complex.

5.2.1. Classification of correlation effects

The zero-order wave function ${{\rm{\Psi }}}^{(0)}$ is obtained as a linear combination of CSFs in the complex. It describes the many-electron system in a general way and accounts for the major part of the long-range static electron correlation. The first-order correction ${{\rm{\Psi }}}^{(1)}$ is a linear combination of CSFs that interact with the CSFs in the complex and it accounts for additional long-range electron correlation and the major part of the short range dynamic correlation. Assume for simplicity that there is only one CSF $| \{{nl}\}\gamma {LS}\rangle $ in the complex. The CSFs interacting with $| \{{nl}\}\gamma {LS}\rangle $ are of two types: those that differ by a single electron (single substitution S) and those that differ by two electrons (double substitution D). The former can be further subdivided into

  • (i)  
    Those that differ from $| \{{nl}\}\gamma {LS}\rangle $ by one principal quantum number but retain the same spin and orbital angular coupling. These configuration states are part of radial correlation.
  • (ii)  
    Those that differ by one principal quantum number and also differ in their coupling. Often the only change is the coupling of the spins, in which case the configuration states are part of spin-polarization.
  • (iii)  
    Those that differ in the angular momentum of exactly one electron and are accompanied by a change in orbital angular coupling of the configuration state and possibly also the spin coupling. The latter represent orbital polarization.

The sums over CSFs that differ in two electrons can also be classified. Let $\{a,b,c,..\}$ be occupied orbitals in $| \{{nl}\}\gamma {LS}\rangle $ and $\{v,{v}^{\prime },..\}$ be orbitals in a so called active set (AS). Then the double replacement ${ab}\to {{vv}}^{\prime }$ generates CSFs in the expansion for ${{\rm{\Psi }}}^{(1)}$. The function defined by CSFs from all double replacements from ab is called a pair-correlation function (PCF) and it corrects for the cusp in the wave function associated with this electron pair. The PCFs from all electron pairs correct for the main part of the dynamic correlation. There is another and more general classification that takes into account if the orbital replacements are from valence or core orbitals:

  • (i)  
    If ab are orbitals for outer electrons the replacement represents outer or valence correlation.
  • (ii)  
    If a is a core orbital but b is an outer orbital, the effect represents the polarization of the core and is referred to as core–valence (CV) correlation.
  • (iii)  
    If both orbitals are from the core, the replacement represents core–core (CC) correlation.

5.3. CSF expansions for energy

Z-dependent perturbation theory is not appropriate for practical calculations, but it is a very useful guide for how the initial HF approximation can be improved in MCHF or CI calculations in order to capture most of the correlation energy. The zero-order wave function ${{\rm{\Psi }}}^{(0)}$ accounting for the major part of the static correlation is an expansion over CSFs with large interactions with the CSF of interest, either those that are nearly degenerate or those with a large interaction matrix element (see section 4.1.1). These CSFs define the MR set and an associated MR function space. In addition, in order to account for dynamic correlation, the wave function Ψ should include CSFs generated by SD replacements of orbitals from each CSF of the MR set, with orbitals in an AS. For a first-order correction, the included CSFs should interact with at least one CSF of the MR set.

As an example we look at $1{s}^{2}2{s}^{2}{}^{1}S$. For infinite Z, $1{s}^{2}2{s}^{2}{}^{1}S$ is degenerate with $1{s}^{2}2{p}^{2}{}^{1}S$ and degenerate perturbation theory needs to be applied. Here, for finite Z, the zero-order wave function is an expansion over the two CSFs $\{| 1{s}^{2}2{s}^{2}{}^{1}S\rangle ,\,| 1{s}^{2}2{p}^{2}{}^{1}S\rangle \}$ that define the MR set. For Be I we have

Equation (149)

and we see that both the CSFs of the MR wave function have large expansion coefficients (generally with a weight $| {c}_{i}{| }^{2}$ greater than a few per cent). Valence correlation is accounted for by considering CSFs obtained from $2s2s\to {nln}^{\prime} l^{\prime} $ replacements from the first CSF and $2p2p\to {nln}^{\prime} l^{\prime} $ replacements from the second. CV correlation is accounted for by considering CSFs obtained from $1s2s\to {nln}^{\prime} l^{\prime} $ and $1s2p\to {nln}^{\prime} l^{\prime} $ replacements from, respectively, the first and second CSF. Core correlation in the n = 1 shell accounted for by considering CSFs obtained from $1s1s\to {nln}^{\prime} l^{\prime} $ replacements from both CSFs. In a first-order calculation, all the generated CSFs should interact with at least one CSF. Included in the general expansions above are also the CSFs obtained by single replacements, although there is no clear classification in valence, CV or core correlation effects.

The generation of the CSF expansions is a very important step in atomic structure calculations. In the ATSP2K and GRASP2K program packages there are flexible program modules for generating CSF expansions based on rules for orbital replacements from an MR set to an AS of orbitals [175, 176].

5.3.1. Correlation and spatial location of orbitals

In the MCHF and MCDHF methods the location and shape of the correlation orbitals depend on the energy functional or CSFs expansion used to derive the MCHF or MCDHF equations [49, 177]. To illustrate this we again consider the ground state of Be I. Figure 2 displays orbitals from MCHF calculations based on CSFs expansions describing valence-, CV and core correlation, respectively. One clearly sees the contraction of the correlation orbitals when going from a valence to a core correlation calculation.

Figure 2.

Figure 2. Contraction of the correlation orbitals from valence, core–valence and core–core correlation MCHF calculations of Be $1{s}^{2}2{s}^{2}{}^{1}S$. The two thick red lines correspond to the spectroscopic $1s$ (no node) and $2s$ (one node) orbitals. Other lines represents the radial distributions of the correlation orbitals of the n = 4 active set. Note the location of the maxima of the different types of orbitals.

Standard image High-resolution image

Orbitals in the valence region are ill suited to describe correlation in the core region and vice versa. Since calculations due to orthogonality constraints are based on one orbital set, this must often be large to saturate all correlation effects. This is especially true for large systems with many subshells. To overcome these problems the partitioned correlation function interaction (PCFI) method has been developed [177, 178]. The PCFI method uses a biorthonormal transformation method [179] to relax the orthogonality constraint of the orbitals and correlation effects can be described by several non-orthogonal sets of correlation orbitals, each set being optimally localized for the considered correlation effect. The PCFI method captures correlation effects more efficiently than do the ordinary MCHF and MCDHF methods [51].

5.4. CSF expansions for energy differences

Often we are interested in determining energy separations between different levels. In these cases we may, in the first approximation, define closed inner subshells as inactive and consider correlation only between the outer valence electrons. The rationale for this is that the correlation energy in the core, although large in an absolute sense, to a great extent cancels when computing energy level differences or the energy relative to the ground state. However, the presence of outer valence electrons polarizes the core. The effect of this polarization is represented by CV correlation where CSFs are obtained by orbital replacements ${ab}\to {{vv}}^{\prime }$ from the CSFs of the MR set, with a and b, respectively, being core and valence orbitals, as shown in studies for Ca I and Ca II [180]. The CV correlation reduces the energy and increases the binding of the valence electrons to the core. In case of a single electron the increase in the binding is reflected in a contraction of the orbital which has a large effect on other computed properties.

Generally, energy separations are much improved if CV correlation is included. For larger atomic systems it is not always clear which subshells should be inactive and which should be part of the active core for which CV effects are to be considered. For each new system this needs to be systematically investigated. Somewhat counter intuitive there are several examples where CV correlation effects are larger for more inner subshells than for more outer subshells [19]. A good starting point for analyzing the situation is to plot the radial part of the core and valence orbital and look at the overlap between the different orbitals. If the overlap is large then CV effects are likely to be important.

To illustrate the discussion above we look at the separation between $1{s}^{2}2{s}^{2}2{p}^{6}3s{}^{2}S$ and $1{s}^{2}2{s}^{2}2{p}^{6}3p{}^{2}{P}^{o}$ in Na I. We systematically include CSFs obtained from

Equation (150)

replacements from the $| 1{s}^{2}2{s}^{2}2{p}^{6}3s{}^{2}S\rangle $ and $| 1{s}^{2}2{s}^{2}2{p}^{6}3p{}^{2}{P}^{o}\rangle $ reference CSFs to an AS that was extended to principal quantum numbers n = 9 and orbital quantum numbers l = 5 leading to energy contributions that are reasonably well converged with respect to the orbital set. Here v denote the $3s$ or $3p$ valence orbital. The accumulated contributions to the total energy of the two states as well to the energy difference is from the CSFs obtained from the replacements are displayed in table 5.

Table 5.  Energies (in Eh) for $3s$ 2S and $3p$ ${}^{2}{P}^{o}$ of Na I as more correlation types of CSFs are added to the wave function expansion.

Corr. $E{(}^{2}S)$ $E{(}^{2}{P}^{o})$ ${\rm{\Delta }}E$
HF −161.858580 −161.786286 0.072293
$2{pv}$ −161.866176 −161.788935 0.077241
$2p2p$ −162.076793 −161.999811 0.076981
$2{sv}$ −162.077481 −162.000103 0.077378
$2s2p$ −162.158012 −162.081373 0.076639
$2s2s$ −162.169542 −162.093022 0.076520
$1{sv}$ −162.169625 −162.093066 0.076558
$1s2p$ −162.193467 −162.117022 0.076445
$1s2s$ −162.199039 −162.122580 0.076460
$1s1s$ −162.238215 −162.161813 0.076402

From this table we see that CSFs obtained by orbital replacements $2{pv}\to {nln}^{\prime} l^{\prime} $, accounting for the CV correlation with $2p$, have a relatively small influence on the total energies. The CSFs are however very important for the energy differences. CSFs obtained by orbital replacements $2p2p\to {nln}^{\prime} l^{\prime} $ account, by far, for most of the correlation energies in the two states. These contributions largely cancel and the change in the energy difference is rather small, of the same order as the effect of the $2{sv}\to {nln}^{\prime} l^{\prime} $ replacements the describe the CV electron correlation with $2s$. Also the correlation between $2s$ and $2p$ described by the $2s2p\to {nln}^{\prime} l^{\prime} $ replacements are important for both the total energies and the energy differences.

5.5. Capturing higher-order correlation effects

Z-dependent perturbation theory defines the structure of the zero-order wave function and the first-order correction. The structure of higher-order corrections for energies, as well as other properties, can be derived in a similar way. For energies, higher-order corrections are captured by including CSFs that interact with the CSFs in the zero- and first-order wave function. In practice this is the same as including some of the CSFs that can be generated by single, double, triple, and quadruple (SDTQ) orbital replacements ${abcd}\to {{vv}}^{\prime }{v}^{\prime ^{\prime} }{v}^{\prime ^{\prime \prime} }$ from the CSFs of the MR set. The number of CSFs increases very rapidly with the increasing number of orbitals in the AS and thus general SDTQ or SDT orbital replacements are feasible only for few-electron systems [181, 182]. A way to include the most important higher-order correlation effects is to increase the MR set by adding CSFs for a certain portion $p={\sum }_{\alpha \subset {\rm{MR}}}{c}_{\alpha }^{2}$ of the wave function composition [183, 184]. The overall accuracy of the wave function increases as the MR set accounts for a larger portion of the wave function. In fact, if the MR space represents the portion p of the total wave function and $E-{E}^{{\rm{MR}}}$ the amount by which the SD excitations have lowered the energy, then an estimate of the error in the energy is $(E-{E}^{{\rm{MR}}})(1-p)/p$ [185, 186].

5.6. Valence and CV correlation in lanthanides

In a relatively simple system, it may be sufficient to have a balanced SD process applied similarly to the odd and even parity states and a common fixed core. But complex, heavy atoms require more care. Consider a ground state calculation for the odd 4f5d6s2 configuration of Ce i, a lanthanide element, with a nearby interacting 4f5d26s configuration. The lowest even parity configuration is the $4{f}^{2}6{s}^{2}$ configuration fairly high in the spectrum. Variational calculations for this spectrum have not been investigated, to our knowledge, but, would raise a number of issues.

It is not obvious what the core should be. Configurations are specified by listing open subshells in order of energy of the electrons or possibly a closed, outer s2 subshell. In the case of Cei, inner closed subshells are $1{s}^{2}2{s}^{2}2{p}^{6}3{s}^{2}3{p}^{6}3{d}^{10}4{s}^{2}4{p}^{6}4{d}^{10}5{s}^{2}5{p}^{6}$. Notice that both the n = 4 shell and the n = 5 shells are unfilled as well as the n = 6 shell. The notation implies the occupied subshells, $4f,5d,6s$ are outside the core but the figure 3 shows that the mean radius of the $4f$ orbital is close to that of $4d$ and that, from the point of view of 'correlation in the motion of the electrons' there may well be more $4d-4f$ interaction than, say $5p-4f$. It is possible that $4f$ should be considered part of the core, so that its main role is to define the screening of the outer orbitals. It would imply that 4f26s6p, for example, should be considered as a system with a larger core, with its own set of outer correlation orbitals, non-orthogonal to those of 4f5d6s2.

Figure 3.

Figure 3. HF and DHF radii of orbitals of some configurations of Ce i.

Standard image High-resolution image

For the lanthanides, it is not clear how the concept of core and valence electrons can be applied. For the ground configuration of Ce i, the complex theory would require $5s,5p,5d,6s$ to be occupied valence orbitals and possibly also other orbitals depending on the strength of interactions between CSFs in the {1}2{2}8{3}18{4}19{5}9{6}2 complex. We could make the assumption that correlation within the n = 4 shell is the same for all levels and cancels in an energy difference but could be included as part of CC correlation. This leads so to the concept of three types of orbitals:

  • (i)  
    An inactive core—$1s,2s,2p,3\,s,3p,3d$
  • (ii)  
    An active core—$4s,4p,4d,4f$
  • (iii)  
    Valence orbitals—$5s,5p,5d,6s$.

In addition to difficulties in defining the core, calculations for the lanthanides suffer from the fact that the number of CSFs generated by SD replacements from an MR set rapidly grow unmanageably large in GRASP2K. Configurations, however, can be ranked according to their interaction strength [187]. Retaining only the most important configurations reduces the number of CSFs, but still calculations strain computational resources and only a few correlation studies have been reported. Examples include the 4f2 configuration of Pr iv [188] and the ground state of Lawrencium [189].

6. Estimating uncertainties

The accuracy of a calculation depends on the MR set, the orbital set, the rules for substitutions which define the correlation model, and finally, the inactive core orbitals. Accuracy should improve when the first two are increased in such a fashion that the smaller set is included in the larger. For example, if {MR} denotes the initial MR set and {MR*} the expanded set, then {MR} ⊂ {MR*}. At the same time, with less constraints on the substitutions and with a smaller inactive core, all other things being equal, the results should be more accurate.

To estimate the uncertainties in computed atomic parameters these parameters should be varied in a systematic way. This can be done in a number of ways and in practice there is always a balance between the available computational resources and desired accuracy. The following steps can be followed:

  • (i)  
    Increase the orbital set systematically and monitor the convergence of the desired property. Expansions by 'layers' (adding one new orbital of a given symmetry) are a commonly used process for which results can be extrapolated.
  • (ii)  
    Expand the MR set so that SD substitutions are applied to a larger portion of the wave function. Since the wave function expansion is normalized, $p={\sum }_{\alpha \subset {\rm{MR}}}{c}_{\alpha }^{2}$ represents a larger fraction of the wave function used in the SD process. Clearly, the most efficient way of increasing the MR set is to promote the CSFs with the largest expansion coeffients in the wave function to the MR set.
  • (iii)  
    relax constraints on the substitutions and make the inactive core smaller.

Systematic MCHF and MCDHF calculations have been performed for many systems. Depending on the calculated properties and the complexity of the atomic structure the steps of the calculations differ, but the general idea of a systematic enlargement of the CSF space remains. Examples of these types of calculations are given in [190, 191], where the convergence of allowed and intercombination transitions in Cii, Ciii and Mg-like ions, respectively, are studied as the orbital sets are enlarged within a correlation model and where the initial valence correlation models are extended to successively include also CV and core correlation. Systematic calculations allowing for uncertainties to be estimated are also discussed and exemplified in [4547, 49, 192].

6.1. A systematic approach

Lithium has been a much studied system due to the possibility of systematically exploring and understanding different correlation effects and convergence properties of the variational solution. For lithium there also are a number of highly accurate Hylleraas variational calculations for comparison. The nonrelativistic total energies for the latter calculations are essentially exact. Tong et al [46] studied the convergence of the total energy for $1{s}^{2}2{s}^{2}S$ in Li I for nl-expansions, similar to partial wave expansions. The generation of the CSFs can be described by

where L and N are upper limits on ${l}_{1},{l}_{2},{l}_{3}$ and ${n}_{1},{n}_{2},{n}_{3}$ respectively. The result is displayed in table 6. The column headed n denotes the largest n of the current l column and for $l\gt 0$, the calculation includes all the orbitals of the previous columns. Thus the calculations for $n=2,\,l=1$ includes all CSFs that can be generated by SDT replacements of orbitals in the 1s22s reference with orbitals in the AS $\{1s,2s,\ldots ,13s,2p\}$. The row denoted $\infty $ contains extrapolated values for each l. The extrapolated values can be obtained using the fact that ratio of the energy differences ${r}_{n}={\rm{\Delta }}{E}_{n}/{\rm{\Delta }}{E}_{n-1}$ where ${\rm{\Delta }}{E}_{n}={E}_{n}-{E}_{n-1}$ is almost constant and in the range 0.5–0.6. This leads to a geometric series for the correction which sums up to

Equation (151)

With rn in the above range the correction is similar to the last corrections ${\rm{\Delta }}{E}_{n}$ to the energy. The row denoted ${\delta }_{l}$ shows the correction between the extrapolated values and the last energy computed for the partial wave. When starting the calculations for a new partial wave l the correction ${\delta }_{l-1}$ from the previous wave needs to be added (for details see the original article). The table shows that the energy converges well within a given partial wave. However, to obtain the total energy the contributions from the high-l partial waves must be estimated. Assuming a similar asymptotic behavior with respect to l as was for two-electron systems ${\rm{\Delta }}{E}^{l}={E}^{l}-{E}^{l-1}$, where El is the limit for the l partial wave, is fitted to an expression of the form

Equation (152)

The remainder is obtained by summing the above expression over l. This gives the final nonrelativistic total energy for $1{s}^{2}2s{}^{2}S$ in Li I

Equation (153)

Similar detailed and systematic convergence studies of energies have been done for other Li-like systems [50] as well as for Be- and B-like systems [48, 49, 51, 177], adding to our understanding of correlation effects and how they can be accounted for in multiconfiguration methods.

Table 6.  Total energies (in Eh) from nl-expansions for $1{s}^{2}2s{}^{2}S$ in Li I as a function of the maximum values of n and l (from [46]).

n l = 0 l = 1 l = 2 l = 3 l = 4
2 −7.432 726 93 −7.469 941 45      
3 −7.447 567 56 −7.471 977 24 −7.476 040 54    
4 −7.448 476 36 −7.473 217 44 −7.476 483 19 −7.477 263 40  
5 −7.448 610 63 −7.473 628 47 −7.476 610 99 −7.477 406 14 −7.477 667 73
6 −7.432 644 19 −7.473 765 36 −7.476 695 98 −7.477 455 71 −7.477 725 20
7 −7.447 656 86 −7.473 809 59 −7.476 734 50 −7.477 477 01 −7.477 748 59
8 −7.448 662 54 −7.473 824 87 −7.476 751 83 −7.477 491 03 −7.477 759 33
9 −7.448 664 96 −7.473 830 96 −7.476 760 26 −7.477 498 88 −7.477 764 96
10 −7.432 666 06 −7.473 833 71 −7.476 764 37 −7.477 503 03 −7.477 768 82
11 −7.447 666 61 −7.473 835 09 −7.476 766 47 −7.477 505 39 −7.477 771 22
12 −7.448 666 90 −7.473 835 86 −7.476 767 61 −7.477 506 78 −7.477 772 64
13 −7.448 667 06 −7.473 836 26 −7.476 768 26 −7.477 507 61 −7.477 773 50
14   −7.473 836 52 −7.476 768 66 −7.477 508 12 −7.477 774 08
           
$\infty $ −7.448 667 26 −7.473 836 90 −7.476 769 24 −7.477 508 95 −7.477 775 17
${\delta }_{l}$ −0.000 000 20 −0.000 000 38 −0.000 000 58 −0.000 000 83 −0.000 001 09
$+\delta $ −7.448 667 26 −7.473 837 10 −7.476 769 82 −7.477 510 11 −7.477 777 16

6.2. Relativistic calculations for light atoms

For light elements with 12–18 electrons, many BP calculations have been performed for all the lower levels of a spectrum up to certain level (see http://nlte.nist.gov/MCHF). These calculations, though relativistic, did not include CV or CC correlation.

With the computer resources now available GRASP2K calculations have been performed for Mg-like elements for a range of Z values [193]. Odd and even orbital sets were optimized separately. For each parity, the MR set was from the set of CSFs that included all $3l3l^{\prime} $ CSFs, all $3{snl},{nl}=4s\ldots 5g$, all $3{pnl},{nl}=4s\ldots 4f$, and all $3{dnl},{nl}=4s\ldots 4f$. SD expansions up to n = 8 for VV correlation and limited CV were used in that at most one $2s$ or $2p$ orbital was allowed in the substitution. The range of J values was 0–5 for both parities and the number of eigenvalues for each parity was 79. For each parity the expansions consist of 6 blocks (one for each J). From 3 to 23 eigenvalues per blocks were required. The results of this calculation were treated as the zero-order approximation. Added to these calculations were the CSFs obtained from SD substitutions from all the core subshells of the CSFs in the MR. The CSFs obtained in this way account for CC correlation and constitute the first-order correction as presented in section 4.1.2. As in perturbation theory, the matrix ${H}^{(11)}$ was replaced by the diagonal matrix denoted by ${H}_{{ii}}^{(11)}$ and the Davidson algorithm as implemented in GRASP2K [131] was used for computing all the needed eigenvalues and eigenvectors for a given block. It is interesting to see the size of the expansions:

Parity VV+CV VV+CV+CC
Even 644 342 5 624 158
Odd 630 502 6 214 393
Thus including substitutions from all core subshells greatly increases the size of the expansion and, on average, the VV+CV+CC expansion is 10 times that of VV+CV. The assumption that ${H}^{(11)}$ is a diagonal matrix, greatly reduces the angular data needed. The calculation including CC correlation required about 20 h on a cluster with 10 nodes.

Table 7 shows the energies relative to the ground state (in cm −1) for the two calculations and compares the results with values, derived from observation, reported in the ASD [164] for Fe XV, but limited to only $3{snl}$ energy levels. Note that some levels are not present in ASD making the spectrum information incomplete. Some are misclassified as, for example, $3s5s$ 3S.

Table 7.  Comparison of calculated and observed excitation energies (in cm−1) in Fe XV. (i) EASD: observed energies from the ASD database [164], ii) ${E}_{{\rm{VV}}+{\rm{CV}}}$: energies from MCDHF calculations that account for valence and core–valence correlation, (iii) ${E}_{{\rm{VV}}+{\rm{CV}}+{\rm{CC}}}$: energies that account for valence and core–valence correlation and where core–core electron correlation effects have been included perturbatively, (iv) EBP(VV): Breit–Pauli energies from valence correlation calculations [148], (v) ${\rm{\Delta }}E$: difference between computed and EASD value.

State EASD ${E}_{{\rm{VV}}+{\rm{CV}}}$ ${\rm{\Delta }}E$ ${E}_{{\rm{VV}}+{\rm{CV}}+{\rm{CC}}}$ ${\rm{\Delta }}E$ EBP ${\rm{\Delta }}E$
$3{s}^{2}{}^{1}{S}_{0}$ 0 0   0 0 0 0
$3s3p{}^{3}{P}_{0}^{o}$ 233842 233828 −14 233928 86 232595 −1248
$3s3p{}^{3}{P}_{1}^{o}$ 239660 239668 8 239741 81 238542 −1118
$3s3p{}^{3}{P}_{2}^{o}$ 253820 253829 9 253773 −47 252751 −1069
$3s3p{}^{1}{P}_{1}^{o}$ 351911 352169 258 352091 180 349866 −2045
$3s3d{}^{3}{D}_{1}$ 678772 678954 182 678329 −443 680377 1655
$3s3d{}^{3}{D}_{2}$ 679785 679986 201 679381 −404 681111 1326
$3s3d{}^{3}{D}_{3}$ 681416 681603 187 680952 −464 683029 2613
$3s3d{}^{1}{D}_{2}$ 762093 762729 636 762176 83 762218 125
$3s4s{}^{3}{S}_{1}$ 1763700 1764876 1176 1763699 −1    
$3s4s{}^{1}{S}_{0}$ 1787000 1788455 1455 1787322 322    
$3s4p{}^{3}{P}_{0}^{o}$   1883187   1882236      
$3s4p{}^{3}{P}_{1}^{o}$   1883595   1882588      
$3s4p{}^{3}{P}_{2}^{o}$   1890703   1889632      
$3s4p{}^{1}{P}_{1}^{o}$ 1889970 1891051 1081 1890042 72    
$3s4d{}^{3}{D}_{1}$ 2031310 2032907 1597 2031683 373    
$3s4d{}^{3}{D}_{2}$ 2032020 2033653 1633 2032413 393    
$3s4d{}^{3}{D}_{3}$ 2033180 2034880 1700 2033623 443    
$3s4d{}^{1}{D}_{2}$ 2035280 2036318 1038 2035053 −227    
$3s4f{}^{3}{F}_{2}^{o}$ 2108520 2109821 1301 2108281 −239    
$3s4f{}^{3}{F}_{3}^{o}$ 2108620 2110029 1409 2108503 −117    
$3s4f{}^{3}{F}_{4}^{o}$ 2108880 2110327 1447 2108798 −82    
$3s4f{}^{1}{F}_{3}^{o}$ 2123150 2124654 1504 2123180 30    
$3s5s{}^{3}{S}_{1}$ 2544800 2512036 −32764 2510852 −33948    
$3s5s{}^{1}{S}_{0}$   2520681   2519752      
$3s5p{}^{3}{P}_{0}^{o}$   2568582   2567624      
$3s5p{}^{3}{P}_{1}^{o}$   2568791   2567639      
$3s5p{}^{1}{P}_{1}^{o}$ 2567000 2571834 4834 2570733 3733    
$3s5p{}^{3}{P}_{2}^{o}$   2572157   2570743      
$3s5d{}^{3}{D}_{1}$ 2640100 2641400 1300 2640247 147    
$3s5d{}^{3}{D}_{2}$ 2639900 2641630 1730 2640442 542    
$3s5d{}^{3}{D}_{3}$ 2640300 2642072 1772 2640870 570    
$3s5d{}^{1}{D}_{2}$   2643981   2642888      
$3s5f{}^{3}{F}_{2}^{o}$ 2676400 2677360 960 2675889 −511    
$3s5f{}^{3}{F}_{3}^{o}$ 2676400 2677455 1055 2675988 −412    
$3s5f{}^{3}{F}_{4}^{o}$ 2676600 2677594 994 2676123 −477    
$3s5f{}^{1}{F}_{3}^{o}$ 2782700 2682597 −100103 2681155 −101545    
$3s5g{}^{3}{G}_{3}$   2687368   2685680      
$3s5g{}^{3}{G}_{4}$   2687556   2685877      
$3s5g{}^{3}{G}_{5}$   2687777   2686099      
$3s5g{}^{1}{G}_{4}$   2690506   2688841      

For comparison, BP results [148] that were state-of-the-art a decade ago are included for a few values. Considerable improvement has been achieved for lower levels. For the more highly excited states, ${\rm{\Delta }}E$ for ${\rm{VV}}+{\rm{CV}}$ is typical of the earlier BP calculations and others when CC correlation is omitted. Generally, lower levels are in better agreement with observation than excited levels. Note that including CC correlation has reduced the discrepancy for higher levels by a factor of 2–3. The mean error in ${E}_{{\rm{VV}}+{\rm{CV}}+{\rm{CC}}}$ is 0.023%.

The computer resources needed by such a calculation can be greatly reduced by a restructuring the computational procedure. GRASP2K was designed for expansions of a few hundred CSFs in mind. When the SD expansions are generated from orbital sets with multiple layers of orbitals, CSFs should be replaced by partial waves where each partial wave has only one spin-angular symmetry and can be structured. When expansions are in the millions, it actually is faster to generate the expansion than to read it from a file. Angular data can then be expressed in terms interactions with partial waves and advantage taken of the fact that angular expressions are independent of the principal quantum number. Advantage can also be taken of the fact that excitations differ by at most two orbitals. In a study of Helium where the wave function was expanded in a tensor product of B-spline functions, the partial wave of a pair correlation function of a given summary was treated as an array and very little angular data was needed, In fact, the energy matrix was not stored in memory. An iterative method for solving the wave equation was devised for parallel computation that only requires a sequence of values of ${\boldsymbol{Hc}}$, where H is the interaction matrix and c is an approximate eigenvector.

6.3. Relativistic calculations for highly charged ions

Though the effect of correlation on atomic properties of light elements has been analyzed extensively using nonrelativistic theory, the same is not true for heavy elements. In these elements the many-body effects are relativistic (as distinct from nonrelativistic with a correction as in BP) and the magnetic Breit, QED, and finite nuclear corrections need to be considered when comparing results with observation. Total energies are a good benchmark although they cannot be compared directly with experiment.

In table 8 the trends of the different contributions to the Be-like ground state energy are shown. The DHF value is the total energy for the $2{s}^{2}{}^{1}{S}_{0}$ CSF. DC is the reduction in the total energy due to correlation when the orbital set is expanded systematically by n. As expected, the correlation energy contribution (DC) converges slowly, becoming more negative with n. It should be noted that, as n increases, an orbital with a new angular momentum is introduced into the orbital set, namely $l=n-1$. For large n, this orbital has the largest generalized occupation number and this contributes to the slow convergence. The TP (Breit) correction raises the total energy and converges more slowly than correlation. At some point, the change in the TP correction is larger than the change in DC which seems counter intuitive. Orbitals with a high angular momentum seem to play a role in this observation. It is important to remember that the Breit operator is a first-order operator, that Breit evaluated the operator as a perturbative correction [195]. Possibly higher order corrections are needed.

Table 8.  Trends in different contributions to the total energy (in eV) of Be-like ground state as a function of the orbital set compared with values from the diagrammatic methods of Malyshev et al [194]. For Ca, n = 4, 5 values are omitted.

Ca (Z = 20, A = 40)      
  n = 2 n = 3 $\ldots n=6$ n = 7 n = 8 [194]
DHF: −12842.650 −12842.650 −12842.650 −12842.650 −12842.650 −11782.120a
DC: −6.194 −7.374 −7.780 −7.802 −7.814 −1056.673
TP: 3.268 3.152 3.095 3.090 3.080 −8.574
Sum: −12945.577 −12846.871 −12847.335 −12847.362 −12847.384 −12847.367
VP: −0.278 −0.278 −0.278 −0.278 −0.278  
SE: 3.745 3.746 3.744 3.744 3.744  
QED: 3.467 3.467 3.466 3.466 3.466 3.455
ISO: 0.177 0.179 0.179 0.179 0.179 0.177
Etotal: −12841.932 −12843.225 −12843.691 −12843.718 −12843.740 −12843.735
Xe (Z = 54, A = 131)
  n = 2 n = 3 n = 4 n = 5 n = 6 [194]
DHF: −101129.81 −101129.81 −101129.81 −101129.81 −101129.81 −98000.517a
DC: −9.68 −10.99 −11.26 −11.37 −11.42 −3058.046
TP: 71.15 69.95 69.66 69.37 69.18 −13.521
Sum: −101068.34 −101070.84 −101071.41 −101071.81 −101072.04 −101071.804
VP: −16.43 −16.39 −16.39 −16.39 −16.39  
SE: 115.28 115.27 115.27 115.27 115.27  
QED: 98.85 98.88 98.88 98.88 98.88 98.434
ISO: 0.44 0.45 0.45 0.45 0.45 0.499
Etotal: −100969.04 −100972.48 −100972.08 −100972.48 −100972.70 −100972.922
U (Z = 92, A = 238)
  n = 2 n = 3 n = 4 n = 5   [194]
DHF: −327628.00 −327628.00 −327628.00 −327628.00   −321276.02a
DC: −10.82 −12.38 −12.72 −12.86   −5933.84
TP: 406.57 400.22 399.29 398.15   −17.92
Sum: −327232.25 −327240.16 −327241.43 −327242.70   −327227.78
VP: −218.07 −217.36 −217.38 −217.42    
SE: 842.17 842.20 842.40 842.02    
QED: 624.10 624.84 625.02 624.60   616.97
Recoil:           2.18
ISO: 1.02 1.02 1.02 1.00    
Etotal: −326607.13 −326614.30 −326615.58 −326617.07   −326608.63

aThe value reported is the zero-order energy from the potential referred to as PZ in [194].

The systematic GRASP2K calculations reported in table 8 were carried out until the generalized occupation numbers of orbitals in the last layer were less than about $5\times {10}^{-8}$ which happens to be close to the accuracy of the $1s$ orbital energy. On that basis, it is clear that correlation converges more rapidly when Z increases.

When Etotal energies are compared with the perturbation theory results reported by Malyshev et al [194], the n = 8 energies are in remarkable agreement with GRASP2K for Ca (Z = 20) and n = 6 for Xe (Z = 45). On the other hand, U (Z = 92) is problematic and more information is needed to explain the difference. For this case, the various nuclear corrections are not equated with ISO since, unlike previous cases where there was good agreement, there is now a factor of two difference implying some difference in the treatment of the nucleus. Whereas in GRASP2K, the DHF total energy is a fairly good approximation to the energy (accurate to 1 in 326) the perturbation theory calculations start with a zero-order energy that is far less accurate and, as a consequence, the equivalent correlation energy is much larger. Since there is an interaction between Breit and correlation, the equivalent value for TP does not even have the same sign, even for the lightest element.

Perturbation theory calculations rely on orbitals from a given potential and use the 'no-pair Hamiltonian' in which states from the negative energy sea are omitted. Malyshev et al report results from three different potentials and it is impressive to see how the sum of contributions for the many-body effects that include the Breit correction (the first 'Sum' in table 8) agree to essentially two decimal places. This is not exactly a proof of accuracy since all calculations may have omitted the same contribution (i.e., contributions from orbitals with higher angular momenta, a range that is not specified in their paper) but it is reassuring. The difference in sign of the TP (Breit) correction in all cases is unexpected.

From the GRASP2K perspective, it would be interesting to see similar results starting from the same variational DHF potential as given in equation (128). The recent DBSR-HF code [196] not only performs variational calculations in a B-spline basis, but can also provide the complete set of orbitals for positive and negative states of DHF potential for an orbital of given symmetry which can be used for RMBPT calculations [16, 129]. Such a comparison might illuminate many questions.

7. Concluding remarks

In this review we have presented the basic theory for generating multiconfiguration wave functions using variational methods that optimize an energy expression in both nonrelativistic and relativistic frameworks. Conceptually, such wave functions could be sufficiently accurate for predicting any atomic property. But the many-body aspects often limit the accuracy of a calculation. Predictions can be improved with expansions that take into account the atomic property under investigation.

Applications that have been extensively investigated are spectrum calculations for all levels up to a designated excited level, allowed and forbidden transitions [54, 143, 148, 197, 198], isotope shifts [184, 199203], hyperfine structures [159, 204206], nuclear effects on transition rates and spectra [26, 207], magnetic field-induced transitions [20, 208, 209], to name a few. A comprehensive list of relativistic calculations and theoretical studies can be found in the RTAM bibliography database [210]. The methods described in this review are important for the calculation of target states for R-matrix calculations, BSR [211, 212] for nonrelativistic and DBSR [196, 213] for relativistic versions, that describe collision, excitation, and scattering processes such as those needed for plasma diagnostics. Another review paper focusing on the applications is being considered.

Acknowledgments

MG acknowledges support from the Belgian National Fund for Scientific Research (FNRS) under Grant CDR J.0047.16 and from the IUAP—Belgian State Science Policy (BriX network P7/12). TB and PJ acknowledge support from the Swedish Research Council under Grant 2015-04842

Please wait… references are loading.
10.1088/0953-4075/49/18/182004