The IVTANTHERMO-Online database for thermodynamic properties of individual substances with web interface

The database structure, main features and user interface of the IVTANTHERMO-Online system are reviewed. This system continues the series of IVTANTHERMO packages developed at JIHT RAS. It includes a database of thermodynamic properties of individual substances and related software for the analysis of experimental results, data fitting, and the calculation and estimation of thermodynamic functions and thermochemical quantities. In contrast to the previous IVTANTHERMO versions, it has a new extensible database design, a client-server architecture, and a user-friendly web interface with a number of new features for online and offline data processing.


Introduction
Thermodynamic databases play an essential role in a wide range of applications such as rocket engine engineering, nuclear power, chemical technology, metallurgy, resource usage and waste recycling. The creation of such databases requires systematic experimental and theoretical studies of new classes of substances and continuous accumulation of data with critical analysis of their accuracy and reliability, followed by systematization that makes the data useful for scientists and engineers. Most often these databases are used for thermodynamic modeling [1].
There are only a few research centers for thermodynamic properties and thermodynamic modeling where the corresponding databases and software are being developed [2][3][4][5][6][7][8]. Some of them are united in international organizations such as SGTE [9]. In general, the information from these databases is not open and is provided for a fee. Probably the best-known open web resource is the NIST WebBook [2,10], which provides access to part of the available information on thermodynamic properties, thermochemical data, energy spectra of ions, vibrational and rotational spectra of molecules, etc.
Most of the above-mentioned databases are rather universal. In addition, there are databases devoted to a particular area of science or technology such as nuclear power engineering. For example, the ThermoChimie database [11,12] contains thermodynamic properties of about 2300 substances related to nuclear waste and storage issues; the database developed in the Nuclear Safety Institute of the Russian Academy of Sciences (IBRAE RAN) [13] contains information for about 2500 substances and 180 materials divided into 9 classes (fuels, heat-transfer agents, moderators, shield materials, noncondensable gases, etc.); the thermodynamic databases for binary and ternary systems [14] used in nuclear power engineering allow one to predict material properties under various conditions and to estimate their regions of existence and phase states. Another example, relevant to the physics of extreme states of matter, is the shock-wave database (SWDB) [15], which contains thermodynamic properties for more than 450 substances at high pressures and temperatures; this database is being developed in the Joint Institute for High Temperatures RAS. Besides experimental points, the SWDB also includes special modules for equation-of-state calculations, analysis of shock-wave experiments, and graphical representation of computed and experimental information via the Internet [16].
A significant contribution to the accumulation of thermodynamic data has been made by the publication of the series of reference books [17]. Based on these books, the IVTANTHERMO database and related software [18] were created in the Institute for High Temperatures of the Academy of Sciences of the USSR. Nowadays the system is being developed in the Department for Thermophysical Data, JIHT RAS. The IVTANTHERMO database contains more than 3400 substances formed of 96 chemical elements. Because of its long history, the technologies originally used for the database and software development have become outdated and have to be upgraded.
In this paper we present a new version, IVTANTHERMO-Online, which extends the capabilities of the IVTANTHERMO system by using modern web technologies.
2. Database structure and features
2.1. Thermodynamic data representation
Thermodynamic modeling and many other applications of the thermodynamical data require representation of the main quantities such as the isobaric heat capacity C_p, enthalpy H, entropy S and the Gibbs energy G as smooth functions of temperature T. Typically, this is done by fitting the initial data with polynomials or other functions that reproduce the data with acceptable accuracy.
It is worth noting the differences in obtaining and fitting the data for the condensed and gaseous phases. While thermodynamic properties of gases are typically calculated from molecular constants, the enthalpy increments or heat capacity of condensed matter are measured directly or provided by computer simulation. Because of phase transitions in solids, the corresponding temperature range is naturally subdivided into relatively short intervals where the thermodynamic functions are continuous and can be fitted with a single polynomial. In contrast, for the gaseous phase the temperature range may be as long as tens of thousands of kelvins and the thermodynamic functions have a rather complicated form. Therefore, a special algorithm is needed to subdivide the whole range into successive intervals and fit the data with a set of polynomials, ensuring continuity at the junction points and a given overall accuracy.
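The subdivision idea can be illustrated with a minimal sketch. It is not the algorithm used in IVTANTHERMO: it assumes a simple greedy strategy in which each interval is extended point by point while a least-squares polynomial, anchored to the value of the previous polynomial at the junction, stays within a tolerance `tol`. The names `solve_linear`, `fit_anchored` and `piecewise_fit` are hypothetical.

```python
def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_anchored(ts, ys, t0, y0, degree):
    """Least-squares fit of y0 + sum_k c_k u**k (k = 1..degree), u = (t - t0) / span.

    The polynomial passes through (t0, y0) exactly, which gives continuity
    with the previous interval when y0 is that interval's end value.
    """
    span = ts[-1] - t0
    us = [(t - t0) / span for t in ts]
    ks = range(1, degree + 1)
    a = [[sum(u ** (i + j) for u in us) for j in ks] for i in ks]
    b = [sum((y - y0) * u ** i for u, y in zip(us, ys)) for i in ks]
    coeffs = solve_linear(a, b)

    def poly(t):
        u = (t - t0) / span
        return y0 + sum(c * u ** k for k, c in enumerate(coeffs, start=1))

    return poly


def piecewise_fit(ts, ys, degree=3, tol=1e-3):
    """Greedy subdivision: returns a list of (t_start, t_end, polynomial)."""
    segments, start, y_anchor = [], 0, ys[0]
    while start < len(ts) - 1:
        best = None
        for end in range(start + 1, len(ts)):
            pts_t, pts_y = ts[start + 1:end + 1], ys[start + 1:end + 1]
            # Never use more coefficients than there are points to fit.
            poly = fit_anchored(pts_t, pts_y, ts[start], y_anchor,
                                min(degree, len(pts_t)))
            if max(abs(poly(t) - y) for t, y in zip(pts_t, pts_y)) > tol:
                break  # assumption: stop extending at the first failure
            best = (end, poly)
        end, poly = best  # a one-point interval is fitted exactly, so best is set
        segments.append((ts[start], ts[end], poly))
        y_anchor = poly(ts[end])  # junction value reused by the next interval
        start = end
    return segments
```

Anchoring each polynomial at the junction value is only one way to obtain continuity; a production algorithm would typically also constrain derivatives and choose the junction points more carefully.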
In the reference book [17] and the IVTANTHERMO software the same fit functions are used for both the gaseous and condensed states of matter. The reduced Gibbs energy is given by equation (1), where the dimensionless temperature is defined as X = T/(10000 K) and the symbol "°" denotes the standard pressure P = 101 325 Pa.
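A single fit of the reduced Gibbs energy is sufficient because the other functions follow from it by differentiation. The relations below are a sketch under an assumption not stated explicitly in this text: the reduced Gibbs energy is taken as $\Phi^\circ(T) = -[G^\circ(T) - H^\circ(0)]/T$, with the enthalpy at 0 K as the reference; the symbol $\Phi^\circ$ is likewise our notation here.

```latex
\begin{align*}
\Phi^\circ(T) &= -\frac{G^\circ(T) - H^\circ(0)}{T}
               = S^\circ(T) - \frac{H^\circ(T) - H^\circ(0)}{T}, \\
\frac{d\Phi^\circ}{dT} &= \frac{H^\circ(T) - H^\circ(0)}{T^{2}}, \\
S^\circ(T) &= \Phi^\circ(T) + T\,\frac{d\Phi^\circ}{dT}, \\
H^\circ(T) - H^\circ(0) &= T^{2}\,\frac{d\Phi^\circ}{dT}, \\
C_p^\circ(T) &= 2T\,\frac{d\Phi^\circ}{dT} + T^{2}\,\frac{d^{2}\Phi^\circ}{dT^{2}}.
\end{align*}
```

The second line follows from $G = H - TS$ and $dS/dT = C_p/T$ at constant pressure; the last line is the temperature derivative of the enthalpy increment.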
Using the standard thermodynamic relations [19], the fit functions for the other thermodynamic functions can be obtained, equations (2)-(4). The form (1)-(4) of the fit functions has proved quite convenient and is still used for calculations of thermodynamic properties of pure substances (see, e.g., [20][21][22]). In the NASA database [23] a similar polynomial is used.
Recently, new approximation approaches have been proposed. In [24] a new method for fitting the heat capacity, based on orthogonal polynomials, is discussed. It does not require recalculation of the existing coefficients when an extra one is added to improve the accuracy. This approach enables one to compare the heat capacity fits for different substances and to reveal common properties. An alternative method is proposed in [25], where the Karpov equation is used; it has some advantages when fitting the heat capacity of gaseous substances.
Unlike the shock-wave database [15], IVTANTHERMO-Online contains only data for the standard pressure. Following [17], we take the standard state for gases to be the state of a hypothetical ideal gas at the pressure of 101 325 Pa. For liquids and solids, the standard state is the state of a pure liquid or a pure crystalline substance at the same pressure.
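As an illustration of how a reduced Gibbs energy fit yields the other functions, the sketch below evaluates the entropy, enthalpy increment and heat capacity analytically. It rests on two assumptions: the reduced Gibbs energy Φ is defined so that H(T) − H(0) = T² dΦ/dT, and the fit has the generic form Φ(X) = a_ln ln X + Σ a_k X^k with X = T/10000 K. The function name, the argument names and the coefficient values in the usage example are hypothetical and are not taken from the database.

```python
import math

X_SCALE = 10000.0  # K, so that X = T / 10000 K as in the text


def thermo_from_phi_fit(T, a_ln, powers):
    """Return (Phi, S, H - H(0), Cp) at temperature T in kelvin.

    Assumed fit form: Phi(X) = a_ln * ln(X) + sum_k powers[k] * X**k,
    where `powers` maps an integer exponent k to its coefficient.
    Phi, S and Cp share the unit of the coefficients; H - H(0) is that unit times K.
    """
    x = T / X_SCALE
    phi = a_ln * math.log(x) + sum(a * x ** k for k, a in powers.items())
    # X * dPhi/dX, which equals T * dPhi/dT
    x_dphi = a_ln + sum(k * a * x ** k for k, a in powers.items())
    # X**2 * d2Phi/dX2, which equals T**2 * d2Phi/dT2
    x2_d2phi = -a_ln + sum(k * (k - 1) * a * x ** k for k, a in powers.items())
    entropy = phi + x_dphi            # S = Phi + T dPhi/dT
    enthalpy = T * x_dphi             # H - H(0) = T**2 dPhi/dT
    cp = 2.0 * x_dphi + x2_d2phi      # Cp = 2 T dPhi/dT + T**2 d2Phi/dT2
    return phi, entropy, enthalpy, cp


# Usage with made-up coefficients: a pure logarithmic term plus a constant
# corresponds to a temperature-independent heat capacity equal to a_ln.
phi, s, h, cp = thermo_from_phi_fit(1000.0, 20.0, {0: 150.0})
```

Working in the dimensionless variable X keeps the coefficients of comparable magnitude, which is presumably one reason for the scaling by 10000 K.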

Requirements to thermodynamic databases
In the course of the IVTANTHERMO database development a set of requirements was formulated that any fundamental thermodynamic database or a reference book should fulfill [26]. These requirements include critical analysis of all primary information published in the literature; internal consistency of thermochemical values, thermal functions, fundamental constants, atomic weights and other quantities; estimation of the data reliability and uncertainties; availability of brief texts with the details of recommended quantity calculations; a reasonable choice of a set of substances. The data sources that fit these requirements can be regarded as "critical reference books" or "expert level databases" in contrast to the ones compiled from different sources and providing information "as is" without references, clear identification of the calculation procedures and evaluation of its reliability and accuracy. In this work we follow the recommendations [26].
The examples discussed in section 2.1 show the diversity of data representation in thermodynamic databases. While engineers typically need only the values of thermodynamic functions in the form of fit functions or tables of data, researchers require all the information about the data origin and calculation methods, which is sometimes called "data provenance" [27].
Moreover, to keep the information in the database up to date, new versions of the data should be published regularly. Reasons for correcting the data include the appearance of new experimental or simulation results, the detection of errors, the use of another fitting algorithm, and a change of related parameters (fundamental constants). To keep track of these changes, the database should contain all versions of the data with the corresponding comments from the editors.
A special procedure of data editing is proposed in IVTANTHERMO-Online. As data reliability is of high priority, any new information should be checked by an expert committee. Only an authorized researcher (editor) should be able to change information in the database. When such a researcher is going to add a new block of data or correct existing values, a request should be submitted. The data from this request are stored in the database and marked as an unverified version that is still inaccessible to ordinary users. This version can be edited, removed or commented on by other authorized researchers and finally submitted to the expert committee. When the committee approves the new version, it is released on the web site and marked as recommended for users. After this point editing is disabled, but authorized users are able to add comments. All previous releases are kept for the history.
To provide researchers with all the above-mentioned information, an expert level database should include the following:
• original tables of data (experimental or simulation results) with dependences of thermodynamic functions on temperature;
• a brief overview of experimental conditions or simulation parameters for each data source;
• additional parameters used for the ideal gas calculations such as parameters of diatomic or polyatomic molecules, energy spectra, etc.;
• definitions of the fit functions and details of the fitting procedure;
• fit function coefficients for a number of successive temperature intervals;
• estimates of the initial and final data accuracy;
• annotations where the details of the data selection and processing are given;
• full bibliographic references;
• the history of all changes made to a particular block of data with the corresponding comments from the editors;
• special fields for user comments and feedback.
In combination with the recommendations [26], this constitutes the full set of requirements for the IVTANTHERMO-Online system.
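The editing procedure described above is essentially a small state machine, which the following sketch makes explicit. It is an illustration only: the class and role names are hypothetical, and the rule that editing is allowed only while a version is unverified is our reading of the text rather than a documented behaviour of the system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    UNVERIFIED = "unverified"  # stored, hidden from ordinary users
    SUBMITTED = "submitted"    # waiting for the expert committee
    RELEASED = "released"      # published and marked as recommended


RESEARCHERS = {"editor", "expert", "administrator"}   # hypothetical role names
COMMENTERS = RESEARCHERS | {"authorized_user"}


@dataclass
class DataVersion:
    values: dict
    status: Status = Status.UNVERIFIED
    comments: list = field(default_factory=list)

    def edit(self, role, new_values):
        if role not in RESEARCHERS:
            raise PermissionError("only authorized researchers may change data")
        if self.status is not Status.UNVERIFIED:
            # Assumption: editing is possible only before submission.
            raise ValueError("this version can no longer be edited")
        self.values = dict(new_values)

    def submit(self, role):
        if role not in RESEARCHERS or self.status is not Status.UNVERIFIED:
            raise PermissionError("cannot submit this version")
        self.status = Status.SUBMITTED

    def approve(self, role):
        if role != "expert" or self.status is not Status.SUBMITTED:
            raise PermissionError("only the expert committee approves submissions")
        self.status = Status.RELEASED

    def comment(self, role, text):
        if role not in COMMENTERS:
            raise PermissionError("comments require an authorized user")
        self.comments.append((role, text))

    def visible_to(self, role):
        # Ordinary users see released versions only.
        return self.status is Status.RELEASED or role in RESEARCHERS
```

Keeping every released `DataVersion` object, rather than overwriting it, corresponds to the requirement that all previous releases remain available for the history.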

Structure of the IVTANTHERMO-Online database
IVTANTHERMO-Online is based on the open-source relational database PostgreSQL. The database schema is shown in figures 1 and 2. The details of the database structure considered below illustrate all the supported features.
The "molecule" table contains information about all the molecules that thermodynamic data refer to. Properties of the molecules such as the molar mass and the nuclear-spin entropy are stored in the table "molecule prop" linked with "molecule" using the foreign key "molecule id".
The atomic composition for each molecule is presented in the "molecule atom ref" table which refers to the "atom" table where properties of all the elements and their main isotopes are defined.
Each record in the "substance" table contains information about the substance in a particular phase state. Different isomers correspond to different records in the "substance" table and a single record in the "molecule" table. Possible phase states are given in the "phase" table. As a substance can have alternative names such as "water", "dihydrogen oxide", etc., all the names in Russian and English are stored in the separate "substance names" table. Substances can be grouped by various criteria via the tables "substance group" and "substance group ref".
The "substance" table keeps only basic substance information such as the CAS and InChI identifiers, molecular structure, and reaction of dissociation. Particular substance properties should be given in other tables linked via the substance id. At present, only the thermodynamic properties are implemented and stored in the "thermo" table. In the future, it is planned to add other substance properties by adding the corresponding tables using the "thermo" table as a template. Different records in "thermo" linked to the same substance specify different versions of the thermodynamic data submitted by editors in the course of data addition and correction (see section 2.2). To specify the release number or unverified version id, the date of change, the editor name and comments, the special table "datainfo" is defined. Moreover, to speed up the search process, the recommended data record (the latest release) is marked by the "thermo.recommended" flag.
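The versioning scheme with a "recommended" flag can be sketched as follows. This is a simplified illustration, not the actual schema: it uses SQLite from the Python standard library instead of PostgreSQL so that it is self-contained, keeps the release number directly in "thermo" rather than in "datainfo", and all column names (`substance_id`, `release_no`, etc.) as well as the helper functions are assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a reduced, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE substance (
            id INTEGER PRIMARY KEY,
            formula TEXT NOT NULL,
            phase TEXT NOT NULL
        );
        CREATE TABLE thermo (
            id INTEGER PRIMARY KEY,
            substance_id INTEGER NOT NULL REFERENCES substance(id),
            release_no INTEGER,               -- NULL for an unverified version
            recommended INTEGER NOT NULL DEFAULT 0
        );
    """)
    return conn


def add_release(conn, substance_id, release_no):
    """Store a new release and move the 'recommended' flag to it.

    Older releases stay in the table, so the history is preserved.
    """
    with conn:  # commit on success, roll back on error
        conn.execute(
            "UPDATE thermo SET recommended = 0 WHERE substance_id = ?",
            (substance_id,),
        )
        cur = conn.execute(
            "INSERT INTO thermo (substance_id, release_no, recommended) "
            "VALUES (?, ?, 1)",
            (substance_id, release_no),
        )
    return cur.lastrowid


def recommended_thermo(conn, formula, phase):
    """Return (thermo id, release number) of the recommended record, or None."""
    return conn.execute(
        "SELECT t.id, t.release_no FROM thermo t "
        "JOIN substance s ON s.id = t.substance_id "
        "WHERE s.formula = ? AND s.phase = ? AND t.recommended = 1",
        (formula, phase),
    ).fetchone()
```

With the flag in place, the common query for the current data does not need to sort or compare release numbers, which is the speed-up mentioned in the text.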
A record in the "thermo" table contains the original data and thermochemical quantities, while the fit parameters for a thermal function (typically the reduced Gibbs energy) are stored in the "gibbs coef" table. The latter contains a single record per temperature interval. The form of the fit function is selected from a predefined set in the "approx" table.
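Because the coefficients are stored one record per temperature interval, evaluating a thermal function at a given temperature starts with locating the right record. A minimal sketch of that lookup is given below; the tuple layout `(t_min, t_max, coeffs)` and the function name are hypothetical, and the intervals are assumed to be sorted and contiguous.

```python
from bisect import bisect_right


def find_interval(intervals, T):
    """Return the (t_min, t_max, coeffs) record whose range contains T.

    `intervals` must be sorted by t_min and contiguous. At a junction
    point the upper interval is returned; for a continuous fit either
    neighbour gives the same function value.
    """
    if not intervals or T < intervals[0][0] or T > intervals[-1][1]:
        raise ValueError("temperature is outside the fitted range")
    starts = [record[0] for record in intervals]
    return intervals[bisect_right(starts, T) - 1]


# Usage with placeholder coefficient labels instead of real data.
records = [(298.15, 1500.0, "set A"), (1500.0, 6000.0, "set B")]
selected = find_interval(records, 2000.0)
```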
The "datainfo" table keeps a universal description (metadata) for a particular block of data of any kind. This provides a flexible mechanism for linking supplementary information or specifying the access rights for each block of data. Using this mechanism, the bibliographic references are linked to the data descriptions via the "data bib ref" table. The references themselves are stored in the "bibliography" table in a form that enables their output in various formats, including the BibTeX style. The information about the authors is stored in the "author" table and linked via "bib author ref". The records in the "bibtype" table indicate the reference types such as "paper", "book", etc. The "substprop" and "bib prop ref" tables provide optional information about the substance properties presented in the publication.
User access control is implemented using the tables "user" and "priority", where all the users and their roles (non-authorized user, authorized user, editor, expert, administrator) are specified. These tables are synchronized with or controlled by the CMS used in the web interface (see below). All blocks of data can be divided into groups, possibly overlapping, using the tables "data group" and "data group ref". The user rights to access a particular group of data are specified in the "user group ref" table. User comments can be stored in the "comment" table.

3. User interface
3.1. General description of the interface
IVTANTHERMO-Online has a web interface based upon the well-known model-view-controller (MVC) principle. The model is a relational database together with a set of software modules for accessing it. The open-source SQL server PostgreSQL is used, which supports modern SQL standards, is able to work with large amounts of data and is quite efficient. The view is a modern user-friendly web interface similar to typical graphical applications. The web interface can be displayed on all modern platforms (including mobile ones) and browsers. The controller consists of the Apache web server, which processes the data obtained from the model and presents them in a suitable form; the controller also contains a number of software modules for online calculations.
The interaction between a client (web browser) and the web server is accomplished via the HTTP and HTTPS protocols. The first one uses a non-encrypted channel between a client and a server; the second one uses public-key encryption. Neither protocol maintains a persistent connection between a client and a server: the client sends a request to the server, and the server sends the required information to the client, whereupon the connection is closed. An advantage of this way of communication is the relatively low load on the transmission channels; it is also possible to use different routes between the client and the server. However, maintaining communication sessions becomes more difficult, as the user has to be identified at every connection. Authentication of users is organized in a standard way using a login form. All users should be registered by a database administrator. A password recovery procedure is provided. The web interface is currently available in two languages: Russian and English.

Presentation of data and search
The graphical user interface allows one to search for substances in the database from any web page using different criteria, including a substance name, chemical formula or CAS number. The main web page is shown in figure 3. It represents the periodic table of elements, in which the white color of an element symbol indicates the availability of information on properties of compounds containing this element. If the user chooses such an element, a list of the corresponding compounds appears. The full list of compounds may be found using the link on the main page (figure 4). There is also a possibility to search by a chemical formula together with a phase state (for example, "H2O(g)"), by a substance name or by a CAS number. The type of search can be chosen by clicking the menu to the right of the input field. As shown in figure 5, by entering at least two symbols one obtains a dropdown menu with the list of substances that satisfy the search criterion. If a search string supplied by the user corresponds to several substances, then the list of those substances is displayed on a separate page after pressing "Enter", as shown in figure 6.
Figure 6. Search results for the search string "uranium".
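The two search behaviours described above, a formula with an optional phase state and suggestions after at least two symbols, can be sketched as follows. The function names, the regular expression and the use of prefix matching are assumptions made for illustration; the actual matching rules of the web interface are not specified in the text.

```python
import re

# A formula made of letters and digits, optionally followed by a phase in brackets.
QUERY_RE = re.compile(r"^\s*([A-Za-z0-9]+)\s*(?:\(\s*([a-z]+)\s*\))?\s*$")


def parse_formula_query(text):
    """Split a query such as 'H2O(g)' into (formula, phase); phase may be None."""
    match = QUERY_RE.match(text)
    if match is None:
        raise ValueError(f"not a formula query: {text!r}")
    return match.group(1), match.group(2)


def suggest(prefix, names, min_chars=2, limit=10):
    """Return names for the dropdown once at least `min_chars` symbols are typed.

    Case-insensitive prefix matching is an assumption of this sketch.
    """
    prefix = prefix.strip().lower()
    if len(prefix) < min_chars:
        return []
    return [name for name in names if name.lower().startswith(prefix)][:limit]


# Usage with a made-up list of substance names.
names = ["Uranium", "Uranium dioxide", "Water"]
formula, phase = parse_formula_query("H2O(g)")
dropdown = suggest("ur", names)
```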
If a substance is chosen, the user is redirected to the page with the data available for the substance (figure 7). The thermochemical data are presented at the top of the page. By clicking the links (see figure 7) the user can switch to the charts showing the temperature dependences of the thermodynamic properties, to the tables with thermodynamic data, and to the calculation of thermodynamic functions for a custom temperature range.
An example of charts for thermodynamic functions is shown in figure 8. By default, the charts display the whole temperature range of the data in the database. However, the user may change the range of temperatures in the charts as well as select the required region with the mouse. The original view of a chart can be restored by pressing the button "Reset zoom". If the mouse cursor is placed above the curve in a chart, the user will see the value of the plotted parameter at the given point (figure 9). The user may display the initial data for the chart by clicking the link "Plain text data" or "Tabulated data" and get the fit function parameters with the link "Approximation". The button at the top right corner of a chart allows one to save the chart in one of the following formats: PNG, JPEG, PDF, and SVG.
At the bottom of the page with substance properties there is a list of the original publications (experimental and theoretical works) used by the experts to obtain the displayed dependences (figure 10). To see a full reference in different formats, one should click on the sign to the right of the publication in the list.
[Figure: Table of thermodynamic functions for water with custom values of initial (final) temperature and step.]
A table with numerical data for thermodynamic functions, available via the link at the top of the page, is shown in figure 11. By default, the data are output for the whole range of temperatures with a temperature step of 100 K. Phase transitions are marked by a grey background of the table cells. To calculate thermodynamic properties for a custom set of temperatures, a special form is used (figure 12).
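The custom table is defined by an initial temperature, a final temperature and a step. A minimal sketch of how such a grid might be generated is shown below; the function name is hypothetical, and the choice to always include the final temperature, even when it does not fall on the grid, is an assumption rather than documented behaviour.

```python
def temperature_grid(t_min, t_max, step=100.0):
    """Return temperatures from t_min to t_max in kelvin with the given step.

    The default step of 100 K follows the text. The final temperature is
    appended if the regular grid stops short of it (an assumption).
    """
    if step <= 0 or t_max < t_min:
        raise ValueError("need step > 0 and t_max >= t_min")
    # Small tolerance so that rounding does not drop the last regular point.
    n_steps = int((t_max - t_min) / step + 1e-9)
    grid = [t_min + i * step for i in range(n_steps + 1)]
    if t_max - grid[-1] > 1e-9 * step:
        grid.append(t_max)
    return grid


# Usage: default 100 K step, and a custom range starting at 298.15 K.
default_rows = temperature_grid(300.0, 1000.0)
custom_rows = temperature_grid(298.15, 500.0, 50.0)
```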

Calculation of chemical equilibrium for a mixture of substances
The IVTANTHERMO-Online web interface allows one to calculate the equilibrium chemical composition for a mixture of substances. The corresponding link can be found on the main page of the database (figure 3). Following this link, the user is redirected to the main page of the module, shown in figure 13.
The following information is required for a calculation:
• two independent thermodynamic parameters which determine the thermodynamic state of the mixture of substances under consideration;
• a list of chemical substances and their quantities;
• a flag indicating whether to include ions in the calculation of the chemical composition.
The thermodynamic parameters which identify a thermodynamic state are chosen from the dropdown list (figure 14).
The types of the first and second parameters should be different; this condition is enforced by the interface. The values of the parameters are also verified. At the next step the user should specify the list of substances the mixture consists of and their amounts (in moles). When the user types a chemical formula, a dropdown list appears that simplifies the search. To enter a new substance the "Insert substance" button should be pressed; after that, additional fields for the chemical formula and amount appear. These possibilities are illustrated in figure 15.
Up to 10 substances can be used in a calculation. To remove a substance from the list, one should click the delete sign to the left of the formula field. Finally, one can turn on the switch "Include ions into calculation" to take ions into account in the calculation of the equilibrium chemical composition. Immediately below the table with the list of substances the "Brutto-formula" field is displayed, which reflects the composition of the mixture. After providing all the necessary parameters the user should press the button "Submit". The results of the calculation appear at the bottom of the page. The user gets information about the thermodynamic state of the mixture and the concentrations of all components of the mixture. An example is displayed in figure 16. The form filled in by the user is shown at the top of the page, which allows one to make changes to the form and recalculate the equilibrium composition.
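The input checks and the "Brutto-formula" field can be illustrated with a short sketch. It is not the code of the web interface: the function names are hypothetical, the parser handles only simple formulas without brackets or charges, and the output format of the overall formula is an assumption.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Parse a simple formula, e.g. 'H2O' -> {'H': 2, 'O': 1}."""
    counts = defaultdict(int)
    pos = 0
    for match in TOKEN_RE.finditer(formula):
        if match.start() != pos:
            raise ValueError(f"cannot parse formula {formula!r}")
        counts[match.group(1)] += int(match.group(2) or 1)
        pos = match.end()
    if pos != len(formula) or not counts:
        raise ValueError(f"cannot parse formula {formula!r}")
    return dict(counts)


def check_state_parameters(first_type, second_type):
    """The two thermodynamic parameters must be of different types."""
    if first_type == second_type:
        raise ValueError("the two state parameters must differ")


def brutto_formula(mixture, max_substances=10):
    """Overall elemental composition of a list of (formula, moles) pairs."""
    if not 1 <= len(mixture) <= max_substances:
        raise ValueError(f"between 1 and {max_substances} substances are allowed")
    totals = defaultdict(float)
    for formula, moles in mixture:
        if moles <= 0:
            raise ValueError("amounts must be positive")
        for element, count in parse_formula(formula).items():
            totals[element] += count * moles
    # Elements in alphabetical order; the real display order is not specified.
    return " ".join(f"{el}{totals[el]:g}" for el in sorted(totals))


# Usage: 1 mol of water and 0.5 mol of carbon dioxide.
check_state_parameters("T", "P")
overall = brutto_formula([("H2O", 1.0), ("CO2", 0.5)])
```

The overall elemental composition is what an equilibrium solver conserves, so displaying it before submission gives the user a quick check of the entered mixture.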

Conclusions and outlook
The IVTANTHERMO-Online thermodynamic database with a web interface is proposed; it continues the series of IVTANTHERMO information systems. It meets the requirements for critical reference books as well as the requirements for modern web systems. In contrast to other thermodynamic databases, it keeps track of old versions of the data and contains full information on data sources and methods of data processing.
A flexible database design allows future integration with other JIHT RAS data sources such as the database for shock-wave experiments and equations of state [15], constants of diatomic molecules, interatomic potentials, the "Thermal" database, etc. Possible extensions include support for the ThermoML standard [28], output in the form of Mathcad functions [29] and additional online calculation services.