An effective introduction to structural crystallography using 1D Gaussian atoms

Abstract The most important quantitative aspects of computational structural crystallography can be introduced in a satisfactory way using 1D truncated and periodic Gaussian functions to represent the atoms in a crystal lattice. This paper describes in detail and demonstrates 1D structural crystallography starting with the definition of such truncated Gaussians. The availability of the computer programme CRONE makes possible the repetition of the examples provided in the paper as well as the creation of new ones.


Introduction
X-ray crystallography is the most widely used technique to determine the shape of molecules at atomic resolution. The complexity of structures found has increased considerably since William Lawrence Bragg used diffraction images from various crystals to calculate the position of individual atoms in a crystal cell for sodium chloride, zinc sulphide and diamond [1]. Accordingly, the amount and difficulty of algorithms and calculations involved in data collection and structure solution have increased notably. This means that only researchers with strong quantitative and computational backgrounds are capable of designing and concepts in real space can be translated algorithmically and visually into reciprocal space. This fact is well known to methods developers in x-ray structural crystallography, when testing the initial correctness and effectiveness of their novel algorithms on 1D models (e.g. the infinite chain of carbon atoms).
In what follows, details and calculations are sketched in the main text and expanded in the appendices. A computer package, CRONE [4], developed for the R statistical platform [5] and freely distributed in the CRAN repository [6], can be downloaded for free and employed to closely follow the details of the exposure.

The crystal structure of 1D Gaussians
The most important mathematical feature of a crystal structure is its translational periodicity, for which the repeating portion of space is called a unit cell. Gaussians are desirable functions to represent the core electron density of atoms in the unit cell because they peak around the centre of their distribution, thus representing well the cloud of core electrons around the nucleus, and because they can be handled reasonably well from an analytical point of view. Unfortunately they are not periodic functions. One way to regain the lost periodicity is through the convolution of the Gaussian with the lattice function, Indeed, a 1D crystal structure can be modelled as a function y ( ) x equal to the convolution of the Gaussian centred at 0 and the lattice function: It is easy to prove that the obtained result, the infinite summation of regularly shifted Gaussians, is a periodic function. Other properties can be worked out starting from its definition. But every calculation requires the summation of converging series, most of them involving the use of special functions (see appendix A). It is more convenient, pedagogically, to leave unaltered the analytic form of the Gaussian within one unit cell and to impose some kind of 'forced' periodicity whereby Gaussian tails are truncated so as to have their domain exactly equal to one period, i.e. equal to the unit cell length, a. In this section it will be shown how atoms and groups of atoms suitable for crystallographic calculations can be derived starting from the full non-periodic Gaussian function and ending with a truncated but periodic one. The resulting analytic formula is formally equivalent to a standard Gaussian function with a normalisation constant depending on cellʼs size and atomʼs width.

Gaussian atoms
A single 1D atom will be investigated first. As stated earlier, the following Gaussian function extends from -¥ to +¥, is not periodical, but it is normalised, the area under its curve being 1. One way to transform it into a periodical function is to cut symmetrically its two tails and stick each end to the endpoints of two other identical Gaussians, truncated in the same way. In one dimension this is how to proceed (see figure 1). Both the left and right tails are truncated at a distance a 2 from the central peak, i.e. the two cuts occur at x a 2 0 and + x a 2 0 . Infinite copies of the truncated curve are, next, shifted to the left and to the right of x 0 in discrete amounts equal to multiples of the unit cell length, a. The curve thus obtained is continuous but its first derivative is not, its discontinuities falling at  + = ( ) x n a n 2 1 2, 0, 1, 2, ... 0 . The correct analytic form for this curve, limited to interval [ ] a 0, , depends on whether x 0 is in the first half or the second half of the unit cell: and + x a 2 0 (the two points marked with the two short blue segments). Copies of the truncated curve are created and shifted to the left and to the right by na for any integer n (black curve). In this specific example a = 20. exp  2  for 0  2  exp  2  for  2  .  5   0  2  2  0   0  2  2  0 K is the normalization constant calculated in appendix B and different from the usual ps 1 2 2 because the Gaussian is truncated and the integration area of interest goes from 0 to a. It is, in fact, useful in 1D structural crystallography to require that the area under this curve be equal to Z, the atomic number. An auxiliary normalization function, G, depending on a and σ, can be conveniently introduced at this point. It is defined as where erf is the error function. The required normalization constant K is given by the following relation: The 1D Gaussian atom so defined depends on the two parameters x 0 and σ, on the atomic number Z and on the unit cell length a. x 0 is the atomʼs position, provided by the specific structure under investigation. σ is a measure of the size of the atomic number Z and the thermal vibration of the atom nucleus, more specifically the vibration of its centre. We will address both aspects shortly, but first it is important to understand that each atom needs to sit comfortably in the unit cell. This means that all atoms' density should be contained inside the cell. Such condition is certainly met if the tailʼs truncation is carried at a certain integer quantity of standard deviations from the peak position. This integer m a is given a default value of 5, but can, obviously, be increased or shortened. The following formula can be used to make sure that all the atoms are included: where D M is the maximum distance between atoms in one unit cell; if only one atom is contained in the cell, The default choice is, as said before, s = + a D 10 M M .

Atoms' width and thermal vibration
It is reasonable for the atomic width to depend on the number of electrons Z. The width is also augmented if the atomic nucleus vibrates due to thermal motion because this corresponds to the stationary electron cloud being smeared over a larger region. From a quantitative point of view the 'cold' atom (no thermal vibrations) can be associated with a positive parameter s 0 . In the present article (and in the accompanying R package, CRONE) s 0 is proportional to the square root of Z: This ensures that all Gaussians in the unit cell have a shape that preserves individuality (each atom is separate from the others) because each atomʼs width scales up as the square root of the atomic number, rather than linearly with it. Atomic vibrations can be expressed as a probability of the nucleus being found around its position of equilibrium. An appropriate choice of probability is the normal distribution with variance traditionally indicated by the parameter U. The resulting electron density is well modelled as convolution between the 'cold' atom density and the probability distribution of the nucleus' centre. In appendix C it is shown that the resulting width σ satisfies the following relation: 1 0 2 0 2 and the normalized thermal Gaussian is given by the following expression: , as shown in equations (4) and (5), and G is the function defined by formula (6). The net effect of thermal vibration is clearly to broaden the Gaussian shapes, but the specific choice of normalization constant fixes the area under the curve equal to Z, as before.

Gaussian molecule and structure
A 1D crystal structure is obtained when one or more Gaussian atoms populate the unit cell. In the present paper we adopt a point of view similar to the one adopted in current structural crystallography, which approximates the electron density of the crystal structure with the sum of the electron density of all atoms included. If ρ is the crystal structure density and r r r ¼ , , , n 1 2 are the densities of the atoms composing the structure, then we will always assume that where x j is the central peak position of atom j. This is not strictly true as interactions among valence electrons modify the region between nuclei. But relation (12) is approximately valid in the context of kinematic diffraction, where x-ray scattering is mostly affected by core electrons and where the time scales involved are much longer than the nuclei and electrons' motion.
In general the unit cell can include one or more atoms. In addition, copies of the same atom can be the result of crystallographic symmetry operations. In three dimensions there exist 230 different types of symmetries. In 1D the scenario is greatly simplified, but the main features of symmetry are still retained as there are just two types called P1 andP 1 3 . The way symmetry is used in crystallography starts with the partition of unit cells in equal regions called asymmetric units. The atomic content in one asymmetric unit is equivalent to the that of any other asymmetric unit. Symmetry operations consider the coordinates of all atoms populating one asymmetric unit and produce equivalent atomic coordinates in all other asymmetric units. In 1D crystallography, with only two possible symmetries, the initial asymmetric unit can be chosen as either the whole unit cell, or as the segment between 0 and a 2. The two symmetries are quantitatively summarised in table 1. For example, suppose a P1̅ crystal structure with unit cell of length a=10 has two atoms in an asymmetric unit at = x 2.5 1 and = x 4.3 2 . Due to symmetry there will be two more atoms in the unit cell, one at = x 7.5 3 and the other at = x 5.7 4 . x 3 is the symmetry equivalent of x 1 because −2.5 is outside the [ ] a 0, of the reference unit cell, but its translationally equivalent atom resides at -= -= a 2.5 10 2.5 7.5. Similarly, x 4 is the symmetry equivalent of x 2 . 3 There exist certain patterns in 2D that exhibit a kind of linear symmetry described by the so-called Frieze groups.
As the functions treated in our models are strictly 1D and have radial symmetry around atoms' peaks, such symmetry patterns are not applicable in the present context. The description of symmetry in 1D can also be enriched when quasi-crystals are taken into account, but this topic is not essential for the introductory level of this paper.

Atomʼs occupancy
In ideal crystal structures all atoms have translational equivalents in all unit cells. In general, though, crystalʼs imperfections mean that this is not always the case, and that some unit cells have missing atoms. This is taken care of theoretically with the idea of occupancy, a number between 0 and 1. An atom with occupancy equal to 1 will be found in the translationally equivalent positions across all unit cells. A smaller value of the occupancy means that the specific atom cannot be found in all unit cells. For instance, an atom with occupancy equal to 0.85 is found in roughly 85% of the crystalʼs unit cells.
The concept of occupancy is also useful when listing atoms located in the so-called special position of the cell, which is a 2 (in three dimensions there are more than one special position). An atom located at = x a 2 has a symmetry equivalent at exactly the same position; this is, obviously, the same atom. Thus, both atoms can be listed as a structureʼs content in the unit cell, as long as their occupancy is fixed at 0.5.

Linear molecules as models for 1D crystallography
1D crystallography can be demonstrated effectively using the structure of some existing linear molecules, as all their atoms lie approximately on a segment. Six molecules have been chosen to be used as examples in this paper. For each of them the interatomic distances have been searched on the Internet as published average bond lengths, but the coordinates have been placed arbitrarily in unit cells calculated using relation (8). Those cases in which one of the atoms acted as the inversion centre have been assigned space group P1̅ . To characterise atomic thermal vibrations in these molecules we have used the so-called B factors, B, rather than the variances U, as customary in structural crystallography. They are defined as . 13 2 The choice of B factors is for us arbitrary. We have, thus, decided to fix the variance for hydrogen at 2 2 , which corresponds to an approximate vibration of 1 Å around the equilibrium centre. For all other atoms it is reasonable to assume that such oscillations will have smaller amplitudes. This is heuristically justified by the energy equipartition theorem that distributes thermal energies in inverse proportion to atomic masses (atomic number, Z, in our case). All B factors assigned to the linear molecules can, thus, be calculated using the following equation: The six linear molecules are presented in table 2, with details of atomic positions and B factors. Plots of the density corresponding to carbon dioxide and thiocyanate in the table are shown in figure 2, superposed for comparison to the same structures with 'cold' atoms (all B factors equal to 0).

Fourier series and structure factors
The electron density for the full crystal, y ( ) x , is a periodical function. It can, therefore, be expressed as an infinite Fourier series: where the complex expansion coefficient F h in crystallography is called the structure factor. As explained by the theory of the Fourier series, this coefficient is calculated as an integral over one periodic unit or, in this case, ò ò where the last passage is justified by the fact that in the interval [ ] a 0, function y ( ) x coincides with r ( ) x , as defined by equations (4) and (5). The expression of the Fourier coefficient when there is only one atom in the unit cell is important and interesting because all other expressions can be built starting from it. We are, thus, looking to calculate integral (16) in which r ( ) x is the piecewise function (4) or (5), with K given by expression (7). The full and detailed calculation is carried out in appendices D and E. It yields, for one atom: The Fourier coefficient, or equivalent structure factor, is a complex quantity, as expected. Its amplitude, in this specific case of just one atom, is known as the scattering factor and indicated with the symbol f: Its phase (or argument), contains x 0 and is therefore connected to the atomic position. In the expression for the scattering factor σ is a generic quantity. For cold atoms s s = ; 0 for thermal atoms, as seen in formula (10)   This last expression can be re-written using f 0 , equation (19), and B, definition (13): The same result could have been obtained starting from the expression for the density of a thermal atom as a convolution because the Fourier integral of a convolution is the product of the individual Fourier integrals. The scattering factors of thermal atoms decay more rapidly than those of cold atoms because the exponential factor in equation (21) is a number smaller than 1 and decreases with increasing resolution (increasing values of the Miller index h).
Examples of scattering curves plotted against increasing resolution, h/a, are shown in figure 3.

Structure factors of multiple atoms in the cell
When molecules, rather than one atom, are contained in the unit cell, the electron density is represented by equation (12). Integrals defining the structure factors are, in this case, equal to the sum of integrals for individual atoms. Therefore, the structure factor when more than one atom populates the unit cell is is also the sum of individual contributions:

Anomalous scattering
The reason why most quantitative structural crystallography can be introduced using only one dimension is related to the fact that structure factors are defined in the complex plane irrespective of whether the underlying molecules are built in one, two or three dimensions. Once a method to calculate structure factors is available, a whole array of topics mimicking those in 3D crystallography can be explained and easily illustrated. In [7], for instance, a simple 1D structure is used to explain fundamental crystallographic concepts like unit-cell refinement, the Patterson function, E-values and direct methods, and structure refinement. These same concepts can, of course, be explained also with the general framework adopted to build 1D crystallographic structures using truncated Gaussian functions. In this paper these arguments are skipped and the remainder will instead be devoted to anomalous scattering, as this is of relevance to several techniques used in modern structural crystallography. What follows is a very cursory presentation of anomalous scattering; a very accessible but more in-depth introduction to the same subject can be found on the website maintained by Ethan Merritt at the Biomolecular Structure Centre, University of Washington [10].
Atoms scatter anomalously when the energy, and thus the wavelength, used is close to their resonant values. This means that in general crystals do not scatter anomalously or, better, that the magnitude of anomalous diffraction is negligible when compared to ordinary scattering. At specific wavelengths, though, certain atoms in the crystal resonate and the resulting diffracted beam includes significant anomalous contributions. Their particular nature is exploited for phasing, i.e. to find phases of the scattering factors. Ordinary diffraction is mathematically reflected in the usual factor appearing in equations (22) and (23). The expression for anomalous diffraction makes use of imaginary scattering factors in the following way: The new scattering factor is equal to the ordinary, real, component plus a complex one, Both the real and imaginary parts of this additional anomalous factor are virtually independent of resolution and only depend on wavelength. Values of ¢ f j and  f j for all atomic species and for several wavelengths have been calculated theoretically and can be easily tabulated. The values used in the CRONE package have been extracted from Ethan Merrittʼs site [10]. An example of ¢ f j and  f j as functions of wavelength for iron atoms is depicted at figure 5. The top curve describes  f j , while the bottom curve describes ¢ f j . In order to phase structures using anomalous scattering it is necessary to choose wavelengths close to the resonant energies (close to the dip of ¢ f j ) because in its neighbourhood the ¢¢ f anomalous contribution will be the largest, causing structure factors to change appreciably from their ordinary values.
The most important consequence that anomalous scattering has on structure factors is the breaking of the so-called Friedel's law, according to which amplitudes of structure factors with opposite Miller indices are identical, and phases are opposite of each other. For structures not scattering anomalously in any appreciable way, such amplitudes will not be identical, due to experimental errors, but very similar. When some atoms in the structure scatter anomalously, the same related amplitudes are expected to be systematically different. It is on such difference that methods for structure factors' phase estimate are based, as illustrated in the rest of this section.
Given a Miller index, h, and its opposite, -h, Friedelʼs law states that the structure factor associated with h is the complex conjugate of the structure factor associated with -h: When contributions from the anomalous scattering become important, though, Friedelʼs Law is not valid any longer and in general the following will be true: In this case the pair corresponding to opposite Miller indices, called a Bijvoet pair, is separated into a plus and a minus part, indicated as + F and -F . One of the methods used to phase crystal structures with anomalous phasing employs Bijvoet pairs to create a special density map called an anomalous Fourier difference map (or anomalous difference in short). This density map has amplitudes equal to the absolute square difference of each Bijvoet pair, and all phases equal 0. If the anomalous difference map is indicated as r ( ) x ano , then It can be shown that the anomalous difference map is equivalent to a corrupted Patterson map in which some of the prominent peaks correspond to distances between the anomalous scatterers, e.g. peaks between heavy atoms [11]. Once these distances have been measured, an initial, approximate model of the structure, made of anomalous scatterers, can be suggested. An updated Fourier transform calculated with the experimental structure factors' amplitudes and phases from the initial model, should show improved positions for the anomalous scatterers, but also peaks corresponding to atoms not yet determined. The updated density map will be closer to the true density map if the selected peaks from the anomalous Fourier difference are the correct ones. The updated density can, in turn, be used to confirm or reject the positions of the anomalous scatterers and to select new peaks for the atoms missing from the initial approximate model of the structure. Old and new peaks will form a new structural model used to carry out another Fourier cycle (observed amplitudes and calculated phases). This sort of Fourier recycling provides atomic positions converging to the correct structure, if the initial anomalous scatterers have been properly selected. A specific example will make the procedure clearer. in a unit cell of side a=30 Å. All atoms are kept at a very low temperature, corresponding to a B factor equal to 0.5 Å 2 . All atoms in this structure are represented in figure 6. Structure factors have been computed for positive and negative Miller indices up to h=80. Their amplitudes for the first five Bijvoet pairs and corresponding averages, taken to be equal to the observed amplitudes, are tabulated in table 3. The anomalous difference map, calculated with the anomalous differences just obtained, is displayed as thick curve in figure 7, jointly with the Patterson map (thin curve). Some of the highest peaks in the anomalous difference map (with the exclusion of the origin peak, which is always the highest peak) correspond to the interatomic distance between anomalous scatterers. Of the three highest peaks only the one with a height of 10.23 and located at = x 3.00 coincides with a prominent Patterson peak: it is, therefore, the most likely candidate to indicate the distance between the two iron atoms. We can place two iron atoms anywhere in the unit cell (because of the arbitrariness of the cellʼs origin), as long as their distance is 3.00 Å, i.e. the position of the chosen peak in the anomalous difference map. The two iron atoms arranged in this way form an initial model for the complete structure. To make the comparison with the known structure easier, the origin of the unit cell has been located so to have the two iron atoms placed at 10 Å and 13 Å. Phases calculated from this model are used jointly with the observed amplitudes to generate the first approximate electron density map. This is shown in figure 8(a), overlapped to the final correct structure. The approximate density has, obviously, high peaks at the iron position because the chosen interatomic distance from the anomalous Fourier was the correct one. More peaks are present in the density, though. Among the four highest ones there should be peaks corresponding to the carbon and oxygen atoms missing from the initial model. If we select the two peaks at position = x 2.01 (carbon atom) and = x 3.48 (oxygen atom), the updated electron density appears as in figure 8(b), basically matching the density for the correct structure. If the other two peaks (at positions = x 19.53 and = x 21.00) are used, instead, the updated density does not resemble the correct one ( figure 8(c)). If the peak selected in the anomalous difference map had been the highest one, at position = x 1.50, the initial approximate electron density would have been the same as the one depicted at figure 9. This electron density does not resemble the one for the correct structure, with the exception of the peak in correspondence with the only correctly placed atom at x=10.

The CRONE package
A package for the R statistical platform [5], named CRONE (CRystallography in ONE dimension) [4], has been developed to carry out all calculations explained in this paper, and other operations useful for structural crystallography in 1D. CRONE is freely downloadable directly from any CRAN repository [6] or from the GitHub site [4]. This package is com-

Conclusions
Amongst the most interesting and attractive aspects of structural crystallography, the mathematics of Fourier series and Fourier transforms play a dominant role in the community of physicists. Many of the most important features in this area can be explained and illustrated analytically making use of Gaussian functions. These concepts are very familiar to physics undergraduates and graduates, as they are taught in several modules. The 1D introduction to structural crystallography described in the present paper provides, therefore, an excellent starting point to captivate students' attention and to steer their interest in this important topic across both the physical and biological sciences. Such an introduction makes use of truncated and periodical 1D Gaussians to represent atoms in a crystal lattice, both visually and quantitatively. Furthermore, the close parallel between the complex Fourier coefficients for both 1D and 3D structures facilitates the transition to modern crystallographic jargon. Students that learn 1D crystallography as explained in this paper will have an immediate grasp of both the  qualitative and quantitative aspects of the discipline. The paper also introduces and demonstrates a new package in the R environment for statistics, CRONE, that makes any calculation within 1D crystallography possible and easy. Its use is a valid aid in classroom teaching and enables the generation of innumerable examples and exercises at various levels.

Acknowledgements and dedication
We would like to thank Professor Ethan Merritt for reading the manuscript and for providing thorough and extensive comments. We would also like to thank the referees whose observations, comments and critiques have helped to improve this article in its final version. One of the authors, JF, is sponsored by grant number 653706 iNEXT-WP24 of the European Union. We would also like to thank the Summer Placement Students programme at Diamond for providing both money and resources for one of the authors (ES).
This paper is specially dedicated to the memory of the late Alessandro Nero, passionate about rowing and enthusiastic about physics and engineering, and whose premature and tragic death has left family and friends with heavy hearts.

Appendix A. Infinite crystal structures with full Gaussians: some results
Let us consider the simple case of a 1D structure made out of the regular repetition of one atom with atomic number represented by the integer Z, and where the spacing between atoms is a, equal to the cell length. Without loosing generality we can place atomic centres at the lattice points, --+ + a a a a ... 2 , , 0, , 2 , .... The structure is represented by the expression The following points help define the character of this function. (i) y ( ) x is periodical, with period a. We can see this by computing y + ( ) x ka and proving that this is equal to y ( ) x , where k is any integer: å å y ps s ps s The quantity u−k is an integer running from -¥ to +¥; therefore, calling ¢ ºu u k we have: The value of y ( ) ka is analytically given by an auxiliary Jacobi special function. The value of y ( ) x at any lattice point ka, where k is an integer, can be approximated using the Jacobi auxiliary theta function q 3 , which is a complex function defined in the context of elliptic integrals. In detail: because the function is periodic. Therefore, The following property applies to infinite sums of Gaussian-like terms: where q ( ) z q , 3 is the Jacobi third auxiliary function [9] with nome equal to q. It might be possible to compute y ( ) x for values of x different from ka, but such calculation necessarily involves a deeper and thorough use of Jacobi theta functions, a topic beyond the scope and aim of this paper. (iii) y( ) x is normalised in the interval corresponding to the unit cell. The area under the curve represented by y ( ) x is, obviously, infinite because such an area is the infinite summation of the finite and positive area in any interval corresponding to the unit cell. It is, though, important to verify that the area corresponding to one unit cell is finite and to calculate its value. The area to be computed is:  where erf is the error function defined as: The infinite summation can be re-written as the limit of a finite summation: Thus, the area of the infinite sum of Gaussians within a unit cell is finite and equal to 1. From a qualitative point of view this is sensible because, although there are infinite contributions that raise the curve upwards in a unit cell, such contributions are infinitesimally smaller and smaller, so that the final infinite sum converges.

Appendix B. Normalization constant for the truncated Gaussian
Given the following function, we want to compute that value of the constant K that makes the area under r ( ) x , between 0 and a, equal to Z, the atomic number. Calculations are equivalent if the integral is computed between -a 2 and +a 2 for the Gaussian centred at the origin. This yields the following expression: or, equivalently, If x is replaced with the new integration variable s º t x 2 2 , the expression becomes Now, the error function, ( ) x erf is defined as Thus, equation (B.4) can be re-written as Considering definition (6) for function G, K is therefore given by equation (7).

Appendix C. Truncated Gaussian for a thermal atom
The general form of a Gaussian for the 'cold' atom in equations (4) and (5) is When the atomic nucleus vibrates due to an increase of energy (thermal vibration) the whole electron cloud is dragged along in this motion and the net macroscopic effect is a smearing of the cloud over a larger region. Mathematically the effect can be quantified through a convolution between density (C.1) and a normalized Gaussian probability with mean 0 and variance U: The Gaussian density of the thermal atom is, therefore, defined as The calculation is easier to illustrate if only one period of the atom, in the interval [ ] a 0, , is used for r 0 , while the probability is defined in the whole -¥ +¥ ( ) , interval. The full set of periodical atoms can be used to derive the final result, albeit with some pedagogical difficulty. If expressions (C.1) and (C.2) are inserted into expression (C.3) for the convolution, the result is: Most of the algebraic manipulation needed now concerns the argument in square brackets. This should be transformed into the sum of two new quantities, one of which, including m -( ) x 2 , can be taken out of the integral. In fact, after a long, tedious but not difficult, series of algebraic passages, the argument inside the square brackets becomes s m s s s m s ¢ - The first term contains ¢ x , the integration variable. After integration, this term gives rise to a number that multiplies all factors outside the integral in (C.4). The second term is untouched so that expression (C.4) becomes C . 6 2 0 2 ¢ K will not, in general, yield a properly normalized function. We have the freedom, though, to adopt a different constant in the definition of ρ. We can, in fact, use for ρ the same normalization used in equation (7) because this guarantees the area under the curve to be Z. The functional part of ρ is identical to that of r 0 if s 0 is replaced by s + U 0 2 . Thus, the final, properly normalized, Gaussian density for thermal atoms is For U=0 this function correctly returns r 0 .

Appendix D. Fourier coefficient of a truncated Gaussian
We would like to calculate where r ( ) x is given by the piecewise functions (4) or (5), and where K in such expressions is defined in equation (7). Even though there are four different analytic expressions in the definition of ρ, there is no need to carry out four separate integrations. Each branch of the piecewise density include terms of the following form: Furthermore, the important feature of the integration interval used to calculate the Fourier coefficients is its width, a. This means that such interval does not have to be centred necessarily on = x a 2. It is, in fact, convenient to centre it on m = x . The integral needed to compute the Fourier coefficients, thus, has the form  To avoid using integration techniques in the complex plane and evaluating the error function for complex numbers it is, at this point, convenient to separate (D.4) into real and imaginary parts. The integral of the imaginary part involves integration within a symmetric interval of an odd function (product of a sine and a Gaussian); this integral is, therefore, zero. We are, thus, left with the following:

Appendix E. Integrals of trigonometric and Gaussian functions
The purpose of the present appendix is to calculate an integral of the form After an integration by parts, the expression becomes The integral in this expression is ( ) I h times a constant. The calculation leads to the following first-order differential equation: For values of h in an infinitesimal neighbourhood of any integer, the right-hand side of the equation is very close to zero and assumes the following form: The integration constant C can be found when h=0 is used in (E.8), using also definition (E.1): Therefore, the integral (E.1) is This 1D structure has three atoms in a unit cell of length = Å a 4.969 . They are sulphur (S, Z=16) at position = x 1.000, carbon (C, Z=6) at position = x 2.813 and nitrogen (N, Z=7) at position = x 3.969. Their B factors are, respectively, 5.000, 13.333 and 11.429. All have occupancies 1. The symmetry for this structure is P1, which means there is no symmetry.
F.1.2. Calculation and display in direct space. The CRONE function to calculate density corresponding to 1D structures is structure_gauss. The calculation is carried out at every point of a regular grid. The result is a list, rho, with two vectors, the regular grid x and the density values rr. Both vectors are jointly used to display the structure (see figure F1(a)): > rho <-structure_gauss(sdata, N = 1000) > plot (rho$x, rho$rr, type='l', xlab = 'x', ylab = expression(rho)) Function structure_gauss can also be used to display the density in multiple unit cells, simply modifying the starting grid x (see figure F1(b)): > x <seq(−5 * sdata$a,5 * sdata$a, length=1000) > rho <-structure_gauss(sdata, x=x) > plot (rho$x,rho$rr, type='l', xlab='x', ylab=expression(rho)) F.1.3. Calculation and display in reciprocal space. The density displayed at figure F1(a) can also be calculated as Fourier series if the structure factors (i.e. the Fourier coefficients) are computed first. The function in charge of structure factors calculation in CRONE is called strufac. We will need to decide in advance how many Fourier components are required. The remaining part of the input is equivalent to the one used in the calculation for direct space, that is the information on the structure, sdata. The structure factors can be used to calculate a Fourier summation for all points of a grid x representing the unit cell. The function of CRONE available to carry out this so-called Fourier synthesis is fousynth: > N <-1000 # Number of grid points in the unit cell > rtmp <fousynth (sdata$a, ftmp$Fmod, ftmp$Fpha, hidx, N) The vectors x and rho obtained are the same as those giving the density in figure F1(a).

F.2. Approximate phases and peak search
The main goal of structural crystallography is to find the positional coordinates of all atoms in a given structure. The quantity that can be calculated using experimental data is the electron density. Atomic coordinates correspond to the peaks in the electron density. Positional precision and accuracy depend, of course, on the calculated electron density quality, this, in turn, being dependent on the precision and accuracy of all estimated structure factors phases. Amplitudes and phases from the structure factors of thiocyanate (see section F. The Fourier synthesis obtained using experimental amplitudes Fm and approximated phases Fapp will be in general slightly different from the correct density (see figure F2). Next, we need to find peak coordinates in the density obtained. In CRONE this is achieved using the function local_maxima. The peaks found in the correct density are shown in figure F3 and have values 0.999, 2.847 and 3.945, very close to the values in table 2. The second peak is not found in the approximate density. Maxima are displayed only for the first and third atom. They occur at 0.889 and 3.955. Starting with these values and cycling between structure factors and density, a convergence to correct values is eventually reached.

F.3. Errors in structure factors
Amplitudes from calculated structure factors are never exactly met in experiments. Experimental quantities are different from their theoretical values for many reasons. In CRONE function sfobs simulates errors of two types for the structure factor amplitudes, Figure F2. Correct (full line) and approximate (dash line) density for thiocyanate. The approximate density is the result of a Fourier synthesis having correct structure factor amplitudes, but only an approximate estimate of the correct structure factors phases. • Poissonian counting errors • Positional atomic coordinate errors Poissonian counting errors are due to the fluctuations in the number of photons scattered along a given direction (thus corresponding to certain Miller indices). The net effect of these fluctuations is to alter the structure factors theoretical amplitude, producing slightly smaller or larger values, depending on the amount of photons. Fluctuations in the value of sfobs are simulated as random generations of poissonian variates with mean equal to the theoretical structure factorʼs amplitude. The final value is calculated as average of all simulated amplitudes and the error as the corresponding standard deviation. The number of simulated deviates is controlled by the parameter ntrialP (default 100). As an example let us compare theoretical and simulated amplitudes for Miller indices hidx=1,K,5, for thiocyanate: the positional atomic coordinate errors simply simulates errors of the atomic centres around their correct values as Gaussian errors with mean zero and standard deviation controlled by the parameter vx0err. The underlying structure is only approximately equal to the correct structure and, accordingly, the corresponding structure factors' amplitudes are slightly different from their calculated values. The example shown in the following simulate errors in the atomic coordinates for thiocyanate; the standard deviation used for the simulation is vx0err=0.4 Å. Poisson counting errors have been cancelled by using ntrialP=0: the structure corresponding to structure factors calculated with errors just in the atomic coordinates is shown at figure F4. Coordinate errors are visible as peak shifts.    In section 5 it was mentioned that when a structure contains atoms which scatter anomalously in a significant way, then structure factors do not obey Friedelʼs law anymore. This can be nicely demonstrated using the strufac function. In the set of 1D molecules available within CRONE there are no structures including atoms scattering anomalously in any appreciable way. We can artificially create such a structure by replacing the carbon of thiocyanate with an iron atom: Next, we can find out at which wavelength the anomalous scattering is expected to produce the largest differences in structure factors' amplitudes, using function fluorescent_scan: For wavelength close to this value the effect of anomalous scattering will be relatively large. We will, now, calculate structure factors where the anomalous scattering is switched off, so that the corresponding Friedelʼs pairs are exactly equal: > hidx <--10:10 > ># Exact structure factors: no anomalous effect > ftmp1 <strufac (hidx, sdata, lbda=1.0, anoflag=FALSE) > modsF1 <-ftmp1$Fmod > ># Friedelʼs pair h =−1, h=1 > print (paste(modsF1 [10], modsF1 [12] When the anomalous scattering is not switched off, but structure factors are calculated at a wavelength far from 1.74 Å, Friedelʼs pairs will not be exactly equal, but will be approximately close. In fact, atoms always scatter anomalously, but the magnitude of such scattering is very small except at specific resonant wavelengths: ># Exact structure factors: no anomalous effect > ftmp2 <strufac (hidx, sdata, lbda=1.0, anoflag=TRUE) > modsF2 <-ftmp2$Fmod > ># Friedelʼs pair h= −1, h=1 > print (paste (modsF2 [10], modsF2 [12] When structure factors' calculation is carried out at 1.74 Å, the iron atom will resonate anomalously and increase the difference between Friedelʼs pairs: ># Exact structure factors: no anomalous effect > ftmp3 <strufac(hidx, sdata, lbda=1.74, anoflag=TRUE) > modsF3 <-ftmp3$Fmod > ># Friedelʼs pair h= −1, h=1 > print(paste(modsF3 [10], modsF3 [12] For real structure factors derived from collected data, the differences in amplitude for Friedelʼs pairs can be concealed by experimental errors. In CRONE this effect can be explored with the help of function sfobs.