Further development of the database of the Mössbauer Effect Data Center

The Mössbauer spectroscopy database compiled and maintained by the Mössbauer Effect Data Center (MEDC) is a unique, wide-scope information resource related to Mössbauer spectroscopy, and it forms the basis of the information services that MEDC provides to the worldwide scientific community. The Mössbauer Effect Reference and Data Journal (MERDJ) and the Mössbauer Web Access Database (MWAD), both published by MEDC, are widely known examples of services that rely on the MEDC database. In recent years a further improvement of these services, especially of MWAD, has been envisaged, and as a first step the further development of the MEDC database was started. In the present work we introduce the main features of the MEDC database and the steps that have already been taken in the course of its further development. Implications of the work for the associated services are also presented.


Introduction
The Mössbauer spectroscopy database compiled and maintained by the Mössbauer Effect Data Center (MEDC) dates back to the 1960s, when the structure and content of the database were shaped through the work and ideas of Muir et al. [1,2] (1958-1966, at the North American Aviation Science Centre, California) and Stevens et al. [3][4][5] (1969-2009, at the University of North Carolina at Asheville). Following 40 years of operation in Asheville, USA, MEDC moved to Dalian, China, in 2010 [5]. Details of the history of the data center and the database have been treated in several publications [4][5][6].
Data collection over more than four decades has resulted in a unique database that currently includes bibliographical details of more than 60,000 Mössbauer spectroscopy related publications, along with more than 115,000 associated data records. Among others, these incorporate ca. 20,000 explicitly given 57Fe Mössbauer isomer shift and quadrupole splitting data pairs abstracted from publications in conjunction with ca. 13,700 different absorber materials. The scope of the database is being continuously extended by the Mössbauer Effect Data Center [7] with newly added bibliographic and Mössbauer data. The database also forms the basis of the reference and data listing (and further associated) sections of the Mössbauer Effect Reference and Data Journal (MERDJ), which has been published since 1978 [8] and currently reports the latest compiled data of the database in ten issues each year [9]. It likewise underlies the online version of the database, the Mössbauer Web Access Database (MWAD), which provides query access to database data on a subscription basis via the website of MEDC [7].
Originally stored on mainframe computers [4], the database was transferred to a Macintosh desktop computer in the 1990s, where it was subsequently managed by the "4D" ("4th Dimension") relational database management system [6, https://en.wikipedia.org/wiki/4th_Dimension_(software)]. This "4D" database system served exclusively the compilation of the MEDC database right up until 2018.
To further enhance the value that the database adds to scientific research worldwide, a further development of the database has recently been envisaged [6]. Abandoning the original database software and the underlying aged iMac desktop computer appeared to be a prerequisite for efficient progress. At the same time, given that the online accessible MWAD database is now managed on an MS Windows operating system (OS), MS Windows also seemed to be a reasonable choice for the base OS of the newly developed database management system.
In the present work we introduce the newly developed "MEDC DBM" database management software that was prepared as the first step of the envisaged further development process of the MEDC database, as well as the improvements already applied to the database during this step.

The database compilation process
The significance of the database management software as the tool of compilation and maintenance of the MEDC database is emphasized by the schematic flow chart of the database compilation process shown in Fig. 1. The database management software plays a central role in the information services provided by MEDC: besides serving as a tool for database compilation, it must also provide suitable data output for the online Mössbauer Web Access Database and for the issues of the Mössbauer Effect Reference and Data Journal. Fig. 1 also makes it clear that the online accessible MWAD is not identical with the MEDC database, but is rather derived from the latter, and subsequently functions as a standalone database service until its next update. On the one hand this means that any feature update or improvement of MWAD must be based on the further development of the database management software; on the other hand it opens up a wide range of possibilities to improve upon MWAD in comparison with its present state. The latter is important as MWAD is the only database service through which researchers all over the world can query the contents of the MEDC database.
In accordance with Fig. 1, an essential starting step of the database compilation process is (1) the survey of newly published literature with the aim to select publications that deal with subjects related to Mössbauer spectroscopy, and are thereby suitable to become listed in the database. This is followed by (2) a manual expert review of the selected works with the aim to determine suitable keywords, comments, data and data attributes to be associated with the corresponding record(s) in the MEDC database. The database management software plays a major role in the subsequent steps, which include (3) the input of these digested publication data into the MEDC database and the preparation of suitable outputs for (4-5) MWAD and (7) MERDJ.
Areas where the new database management software was expected to bring advantages over the previous one included

Fig. 1 Schematic flow chart of the MEDC database compilation process by using the "4D" (left side of the framed region) and the newly developed "MEDC DBM" (right side of the framed region) database management software. The "MEDC DBM" software was made to be able to output the SQL file (used for the update of the online MWAD) directly, which led to a simplification of the compilation process. Both applications are suitable to output listings of data selected for the upcoming issue of the Mössbauer Effect Reference and Data Journal

Development of the "MEDC DBM" database management system
To achieve the above-mentioned aims, the development process of the new database management software started with the examination of the MEDC database structure as stored on an iMac desktop computer by the "4D" system. The latter was used to output all the tables of the database in the form of separate text files (as in step (4) in Fig. 1). The contents of the text files were compared and correlated with the data of MWAD and with the contents of MERDJ issues prepared on the basis of previous outputs of the "4D" system, and thereby the way in which the text files represent the data of the MEDC database was elucidated.
The "4D" system output included 28 separate text files corresponding to 28 tables of the MEDC database. On the basis of their content 10 of these tables were found to be not needed for the future management of the database anymore, and were therefore subsequently ignored. (For example, 4 of the abandoned tables were related to "Micros" that were different subdatabases of the MEDC database, each dealing with a particular topic, distributed earlier on floppy disks together with a dedicated database program [10], which service was later on surpassed and replaced by that of the MWAD.) The remaining tables are listed in Table 1 together with their main role in the MEDC database.
In order to maintain compatibility with the original database structure as well as to ensure a smooth transition from the usage of the "4D" system to that of the "MEDC DBM" system, the tables in Table 1 and the logic through which they are related (see, e.g., Fig. 2) have been adopted as the starting point of the envisaged new database management system. As a minimum requirement, the latter was expected to ensure the trouble-free continuation of the MEDC database compilation work and that of the associated MEDC information services (MWAD, MERDJ) even after the full abandonment of the formerly used database management software. In accordance with this requirement, the new "MEDC DBM" database management software had to be made able to import the latest state of the MEDC database on the basis of the database status output

Table 1.
The development of "MEDC DBM" was carried out in the Delphi XE3 development environment on a 64-bit MS Windows 10 operating system platform. Both 32-bit and 64-bit executables were compiled from the source code, of which the 64-bit version has been adopted on account of its higher performance, especially in the manipulation of the largest database tables.
During the early stages of the development process a character encoding inconsistency was discovered in the output files of the "4D" system: the contents of the text files representing the database tables (Table 1) included a mixture of two different encoding schemes (Mac OS Roman and Unicode) for special characters beyond the default ASCII character set. The characters encoded in Mac OS Roman turned out to be incompatible with the auxiliary program used previously to generate the SQL file (step (5) in Fig. 1) for the update of MWAD, which was found to lead to textual errors in the latter. Errors of this kind were eliminated by converting the Mac OS Roman encoded characters to Unicode when importing the database output files of the "4D" software into the "MEDC DBM" software. The native use of Unicode encoding of text contents in "MEDC DBM" now ensures proper representation of special characters that may occur in the MEDC database as part of author names and article titles alike. At the same time, the update process of MWAD has also been simplified by making "MEDC DBM" able to output the corresponding SQL files directly (Fig. 1), thereby removing the need for auxiliary software for the same purpose.
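The conversion of mixed-encoding text can be illustrated with a short sketch. It is not the Delphi implementation used in "MEDC DBM"; it assumes that the Unicode parts of the exported files are UTF-8 encoded (the encoding form is not stated above), and the function name `decode_mixed` is hypothetical. Bytes that form a valid UTF-8 sequence are taken as Unicode, and every other non-ASCII byte is interpreted as Mac OS Roman.

```python
def decode_mixed(raw: bytes) -> str:
    """Decode bytes containing a mixture of UTF-8 and Mac OS Roman text.

    Assumption (not from the source): the Unicode parts are UTF-8.
    """
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:                      # plain ASCII byte
            out.append(chr(b))
            i += 1
            continue
        # Expected length of a UTF-8 sequence starting with this lead byte
        if 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0                    # cannot start a UTF-8 sequence
        if length:
            try:
                out.append(raw[i:i + length].decode("utf-8"))
                i += length
                continue
            except UnicodeDecodeError:
                pass                      # not valid UTF-8, fall through
        # Fallback: treat the single byte as a Mac OS Roman character
        out.append(bytes([b]).decode("mac_roman"))
        i += 1
    return "".join(out)


# The same word, once in Mac OS Roman and once in UTF-8
sample = "Mössbauer".encode("mac_roman") + b" / " + "Mössbauer".encode("utf-8")
print(decode_mixed(sample))
```

Such a heuristic cannot be fully reliable: a pair of Mac OS Roman characters whose bytes happen to form a valid UTF-8 sequence would be misread, so the converted text would still need to be checked.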
The SQL file in question determines the structure and data content of MWAD. In contrast with the relatively large number of tables maintained as part of the MEDC database (Table 1), MWAD currently offers 5 tables for query access: "References", "Data", "Keywords", "Isotopes" and "Abbreviations". The latter four correspond, respectively, to the MEDC tables (Table 1) of "Data" complemented with the associated keywords, "Keyword_code" and "Isotope_code" with the table fields related to the production of MERDJ issues being omitted, and finally "Abbreviations". The "References" table of MWAD, however, does not have a direct analog in the MEDC database. The latter is a relational database, and consequently it stores information spread over several interconnected tables: in the case of the "References" table of MWAD, at least four MEDC tables are involved in the composition of a single record of MWAD "References" as illustrated in Fig. 2.
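How a single MWAD "References" record could be composed from several relational tables may be sketched as follows. The sketch uses an in-memory SQLite database purely for illustration; apart from the table names and the "RefKey" field, all column names (`Title`, `Year`, `JournalCode`, `JournalTitle`, `NameKey`, `Name`, `Position`) and the function `compose_reference` are hypothetical, and the inserted rows are dummy placeholders rather than database content.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE MEDC_References (RefKey TEXT PRIMARY KEY, Title TEXT,
                              Year INTEGER, JournalCode TEXT);
CREATE TABLE Journals (JournalCode TEXT PRIMARY KEY, JournalTitle TEXT);
CREATE TABLE NameAddr (NameKey INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Authors (RefKey TEXT, NameKey INTEGER, Position INTEGER);
""")
# Dummy placeholder rows (not real database content)
con.execute("INSERT INTO Journals VALUES ('JEX', 'Journal of Examples')")
con.execute("INSERT INTO MEDC_References VALUES "
            "('REF0001', 'An example title', 2000, 'JEX')")
con.executemany("INSERT INTO NameAddr VALUES (?, ?)",
                [(1, "A. Example"), (2, "B. Sample")])
con.executemany("INSERT INTO Authors VALUES (?, ?, ?)",
                [("REF0001", 2, 2), ("REF0001", 1, 1)])


def compose_reference(con, ref_key):
    """Build one flat 'References' record from the four related tables."""
    row = con.execute(
        "SELECT r.RefKey, r.Title, r.Year, j.JournalTitle "
        "FROM MEDC_References r "
        "JOIN Journals j ON j.JournalCode = r.JournalCode "
        "WHERE r.RefKey = ?", (ref_key,)).fetchone()
    if row is None:
        return None
    # Author names are looked up via RefKey and kept in author order
    names = [name for (name,) in con.execute(
        "SELECT n.Name FROM Authors a "
        "JOIN NameAddr n ON n.NameKey = a.NameKey "
        "WHERE a.RefKey = ? ORDER BY a.Position", (ref_key,))]
    return {"RefKey": row[0], "Authors": "; ".join(names),
            "Title": row[1], "Journal": row[3], "Year": row[2]}


print(compose_reference(con, "REF0001"))
```

The flat record returned here corresponds to what a standalone, derived table would store, whereas the relational source keeps each piece of information only once.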
To ensure continuity of operation, the "MEDC DBM" software outputs an MWAD SQL file that is compatible with the present realization of the MWAD access system, thereby enabling the latter to execute the same database query operations as before. At the same time, the correctness and completeness of the database data accessible via MWAD have been considerably improved with respect to the previous state, among others by the consistent application of Unicode character encoding and by the inclusion of formerly skipped data such as reference information concerning book resources. Subsequent improvement of the MWAD service requires the further development of both the database structure and the access system of MWAD. As MWAD is a derivative of the MEDC database, the accuracy of MWAD data records naturally depends on that of the corresponding MEDC database records. The compilation and maintenance of the latter requires working with hundreds of thousands of database field values, which may involve the occasional introduction of database errors. Such errors can be either simple textual/typing errors or more complex structural errors that violate basic design principles of the database. The database management software can alleviate the associated problems in two ways: by providing a clear visual overview of the database data, which helps database managers recognize errors, and by automating the data input process wherever this is possible and justified.
In the "MEDC DBM" software a clear view of the tables listed in Table 1 is available along with various options for filtering and sorting the associated records as shown in Fig. 3. The database manager can adjust the fields of interest that are to be displayed and the order in which those fields appear in the table. However, due to the relational nature of the database, a single table may not explicitly include all the fields according to which one may wish to filter or sort the corresponding records. For example, the "MEDC_References" table does not include any explicit information concerning the author names (Fig. 2) associated with references: the latter need to be composed from information looked up in the "Authors" and "NameAddr" tables on the basis of the unique reference key ("RefKey" field). Similarly, the "MEDC_References" table includes only journal codes, and the corresponding journal titles must be looked up in the "Journals" table. Still, it would certainly be desirable to be able to filter the records of the "MEDC_References" table on the basis of author names and journal titles. This has been solved in the "MEDC DBM" software by augmenting the "MEDC_References" table with additional fields whose values are calculated or looked up on the basis of the content of other tables whenever the "MEDC_References" table is loaded or changed. These additional fields are differentiated from the regular fields of the table by having their name written between "<" and ">" in the header of the corresponding column (Fig. 3). With the inclusion of these additional fields the "MEDC_References" table becomes considerably more informative and readable than with the original fields alone, and provides additional possibilities for filtering and sorting the records in meaningful ways.

Fig. 2 … (Table 1) in the MEDC relational database. Only the relevant fields of the latter tables are listed
Another example of the calculated fields is the number of data records in the "Data" table (see Table 1) associated with the individual references, whose column is denoted with "<DataNum>" in Fig. 3. A similar technique is also used in conjunction with other tables of the database in order to provide database managers with more insight into and control over the database data, as well as to increase the probability of detection and correction of textual and structural database errors.
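The idea of calculated fields can be sketched in a few lines. The sketch is illustrative only and does not reproduce the "MEDC DBM" implementation: the function name `augment_references`, the `JournalCode` field and the calculated field name "<Journal>" are assumptions, while "RefKey" and "<DataNum>" follow the text above. Records are represented as plain dictionaries.

```python
from collections import Counter


def augment_references(references, data_records, journals):
    """Return copies of the reference rows extended with calculated fields.

    references   : list of dicts with 'RefKey' and 'JournalCode'
    data_records : list of dicts with 'RefKey' (rows of the "Data" table)
    journals     : dict mapping journal code to journal title
    """
    data_count = Counter(rec["RefKey"] for rec in data_records)
    augmented = []
    for ref in references:
        row = dict(ref)                   # leave the stored row untouched
        row["<Journal>"] = journals.get(ref["JournalCode"], "")
        row["<DataNum>"] = data_count.get(ref["RefKey"], 0)
        augmented.append(row)
    return augmented


# Dummy placeholder rows (not real database content)
refs = [{"RefKey": "R1", "JournalCode": "JEX"},
        {"RefKey": "R2", "JournalCode": "JEX"}]
data = [{"RefKey": "R1"}, {"RefKey": "R1"}]
rows = augment_references(refs, data, {"JEX": "Journal of Examples"})

# Filtering on a calculated field: references that have associated data
with_data = [r["RefKey"] for r in rows if r["<DataNum>"] > 0]
print(with_data)
```

Because the calculated values are derived on loading rather than stored, they have to be recomputed whenever one of the underlying tables changes.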
Structural database errors may occur when the database management software enables and/or carries out a data input or data removal operation that results either in a disruption of connections among database tables or in data records that defy the intrinsic logic according to which the data are organized. For example, removing a journal title from the "Journals" table that is still referenced by a record in the "MEDC_References" table would create a broken link in the latter. Similarly, allowing a direct, unchecked input of field values that should be unique or should satisfy certain criteria may lead to ambiguities and data inconsistencies if the data in question are entered wrongly. By using the "MEDC DBM" software we have detected a few examples of such errors in the MEDC database, suggesting that the previously used database management software did not fully exclude the possibility of their occurrence. As the detection of these errors via manual inspection of the tables is rather difficult, an option was added to the "MEDC DBM" software that scans the database for the typical structural error types identified, and outputs the list of discovered errors separately for each error type. This option enables database managers to monitor the structural integrity of the database and to identify errors that need to be corrected.
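A scan of this kind can be sketched as follows. The two error types checked here (a non-unique reference key and a journal code without a matching "Journals" record) are chosen to match the examples above; the actual list of error types handled by "MEDC DBM" is not given in the text, and the function names and the `JournalCode` field are hypothetical.

```python
from collections import Counter


def scan_structural_errors(references, journals):
    """Report structural errors, grouped by error type.

    references : list of dicts with 'RefKey' and 'JournalCode'
    journals   : list of dicts with 'JournalCode'
    """
    key_counts = Counter(ref["RefKey"] for ref in references)
    known_codes = {j["JournalCode"] for j in journals}
    return {
        # Reference keys that should be unique but occur more than once
        "duplicate_ref_key": sorted(
            key for key, n in key_counts.items() if n > 1),
        # References pointing to a journal code that does not exist
        "missing_journal": sorted(
            ref["RefKey"] for ref in references
            if ref["JournalCode"] not in known_codes),
    }


def can_delete_journal(code, references):
    """A journal may be removed only if no reference still points to it."""
    return all(ref["JournalCode"] != code for ref in references)


# Dummy placeholder rows (not real database content)
refs = [{"RefKey": "R1", "JournalCode": "JEX"},
        {"RefKey": "R2", "JournalCode": "OLD"},
        {"RefKey": "R3", "JournalCode": "JEX"},
        {"RefKey": "R3", "JournalCode": "JEX"}]
journals = [{"JournalCode": "JEX"}]
print(scan_structural_errors(refs, journals))
```

Reporting each error type as a separate list mirrors the per-type output described above and keeps the correction work manageable.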
The occurrence of some error types, such as references to keyword codes that had no associated meaning defined in the "Keyword_code" table or to language codes that were not attributed to any language in the "Lang_code" table (Table 1), may also be a consequence of an unconstrained manual input of these data. To prevent such errors, in the "MEDC DBM" software fields of these types are either set via the manual selection of their values from a list of valid options (e.g., the existing language codes), or their input is checked in order to exclude invalid values (e.g., in the case of keyword codes). Techniques of this kind help to maintain the consistency of database data.
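Checked input of keyword codes can be illustrated with a minimal sketch. The comma-separated input format and the function name `validate_keyword_codes` are assumptions made for the example; the codes CYA, GLS and MAA are keyword codes of the database.

```python
def validate_keyword_codes(entered, known_codes):
    """Parse a comma-separated list of keyword codes and reject unknown ones.

    entered     : text typed by the database manager, e.g. "CYA, MAA"
    known_codes : set of codes defined in the "Keyword_code" table
    """
    codes = [part.strip().upper() for part in entered.split(",")
             if part.strip()]
    unknown = [code for code in codes if code not in known_codes]
    if unknown:
        # Refuse the whole input so that no undefined code is stored
        raise ValueError("undefined keyword code(s): " + ", ".join(unknown))
    return codes


known = {"CYA", "GLS", "MAA"}
print(validate_keyword_codes("cya, MAA", known))
```

Rejecting the input at entry time prevents undefined codes from reaching the database, so they do not have to be found later by a structural scan.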
Certain desirable and envisaged feature enhancements of MWAD require the extension of the current MEDC database with additional fields. Perhaps the most prominent example is the DOI (digital object identifier [11]) link associated with the references in the "MEDC_References" table. It is now common that internet-based scientific bibliographic databases provide an internet link leading to the website that hosts the electronic form of the referenced publication. From the point of view of databases, the DOI link of publications is an optimal candidate for such a purpose, given that it realizes a persistent identification of corresponding scientific contents available on the internet. In order to be able to return such internet links together with bibliographic data as query results in future versions of MWAD, the "MEDC_References" table has been complemented with the "DOI" field, and compilation of corresponding data has recently been started especially in relation with the latest publications included in the MEDC database.
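Turning a stored DOI into a link can be sketched as below. The resolver address https://doi.org/ is the standard DOI proxy; the function name `doi_to_url` and the simple syntax check (prefix "10." followed by a registrant code and a suffix) are assumptions of this example, and the check may reject some unusual but valid DOIs. The DOI used is a made-up placeholder.

```python
import re
from urllib.parse import quote

# Simplified syntax check: "10.", 4-9 digits, "/", non-empty suffix
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return a resolver link for a DOI, or None if it looks malformed."""
    cleaned = doi.strip()
    # Accept values that were stored with a common prefix
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    if not DOI_PATTERN.match(cleaned):
        return None
    # Percent-encode characters that are not safe in a URL path
    return "https://doi.org/" + quote(cleaned, safe="/")


print(doi_to_url("doi:10.1234/example.5678"))
```

Storing only the bare DOI and generating the link on output keeps the field independent of any particular resolver address.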
Another example is the hyperfine/apparent magnetic field reflected by Mössbauer spectral patterns displaying magnetic splitting. Historically, the MEDC database was designed for storing only the isomer shift and quadrupole splitting Mössbauer parameters in their separate fields. Still, the Mössbauer parameters of magnetic spectra were also included in the "Data" table of the database by noting the associated numerical hyperfine magnetic field data in the "Comments" field. Currently the database includes ca. 2400 data records referring to magnetically split spectral patterns in this way. It would be desirable for MWAD to allow these records to be queried on the basis of the given hyperfine/apparent magnetic field values. To achieve this, a separate field has been set up in the "Data" table of the MEDC database for the magnetic field values in question. On the one hand, the magnetic field values associated with magnetic spectrum patterns can now be entered in their separate numerical field for newly added records; on the other hand, the field may be filled for existing records on the basis of the contents of their "Comments" field. This opens up the possibility for future versions of MWAD to handle queries on the basis of the numerical value of the hyperfine/apparent magnetic field Mössbauer parameter.
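Filling the new numerical field from existing comments could be approached as in the following sketch. The way field values are written in the "Comments" field is not described in the text, so the notation assumed here ("H = 330 kOe", "Bhf = 48.5 T") is purely hypothetical, as is the function name `extract_field_tesla`. Values given in kOe or kG are converted to tesla using the correspondence of 10 kOe (10 kG) to 1 T.

```python
import re

# Hypothetical notation: "H = 330 kOe", "B = 33.0 T", "Bhf = 48.5 T"
FIELD_PATTERN = re.compile(
    r"\b[HB](?:hf)?\s*=\s*(-?\d+(?:\.\d+)?)\s*(kOe|kG|T)\b",
    re.IGNORECASE)


def extract_field_tesla(comment):
    """Return the magnetic field in tesla found in a comment, else None.

    A value is returned only if exactly one field value is present, so
    that ambiguous comments are left for manual review.
    """
    matches = list(FIELD_PATTERN.finditer(comment))
    if len(matches) != 1:
        return None
    value = float(matches[0].group(1))
    unit = matches[0].group(2).lower()
    if unit in ("koe", "kg"):
        return value / 10.0               # 10 kOe (10 kG) corresponds to 1 T
    return value


print(extract_field_tesla("magnetic sextet, H = 330 kOe at 77 K"))
```

Comments that do not match the assumed notation, or that contain several values, return `None` and would have to be processed by a database manager.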
As an example of the possible use of such magnetic field data, we have collected 57Fe hyperfine magnetic field (B_hf) values from data records associated with the Sr2FeReO6 double perovskite measured at different temperatures. Seven data records were found that give an explicit numerical value for the 57Fe hyperfine magnetic field in Sr2FeReO6. The associated B_hf values are given as a function of sample temperature in Fig. 4, together with a corresponding theoretical curve fitted to the data in order to estimate the (ferri)magnetic ordering temperature of Sr2FeReO6. The fit resulted in a critical temperature of T_c = 417(3) K. This value is consistent with literature data [18] revealing that the Curie temperature of Sr2FeReO6 may range from ~400 K up to at least ~445 K depending on the degree of anti-site disorder of Fe/Re atoms in the double perovskite structure. The seven data records on which the above result (Fig. 4) is based originate from six works [12][13][14][15][16][17] published by four separate research groups in four different years. The accuracy and reliability of the obtained result were thereby further enhanced through the combination of the data in question, which provides an illustrative example of the value a database can add to the scientific research process.
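A fit of this type can be sketched under explicit assumptions. The exact form of the fitted curve is not given in the text; the sketch below uses the standard Weiss mean-field equation for the reduced magnetization, sigma = B_S(3S/(S+1) * sigma/t) with t = T/T_c, taking S = 5/2 for Fe3+ as an assumption, and sets B_hf(T) = B_0 * sigma(T/T_c). No database values are used: the input is synthetic, generated from the model itself with arbitrary parameters, and all function names are hypothetical.

```python
import math


def brillouin(spin, x):
    """Brillouin function B_S(x)."""
    if abs(x) < 1e-4:                     # series expansion near zero
        return (spin + 1.0) / (3.0 * spin) * x
    a = (2.0 * spin + 1.0) / (2.0 * spin)
    b = 1.0 / (2.0 * spin)
    return a / math.tanh(a * x) - b / math.tanh(b * x)


def reduced_magnetization(t, spin=2.5):
    """Solve sigma = B_S(3S/(S+1) * sigma / t) by bisection (t = T/Tc)."""
    if t >= 1.0:
        return 0.0
    if t <= 0.0:
        return 1.0
    k = 3.0 * spin / (spin + 1.0)
    lo, hi = 1e-12, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if brillouin(spin, k * mid / t) > mid:
            lo = mid                      # root lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_tc(temps, fields, tc_min, tc_max, spin=2.5, steps=100, rounds=4):
    """Least-squares estimate of (Tc, B0) by a nested grid scan over Tc.

    For each trial Tc the optimal B0 follows from linear least squares.
    """
    def evaluate(tc):
        s = [reduced_magnetization(T / tc, spin) for T in temps]
        denom = sum(v * v for v in s)
        if denom == 0.0:
            return float("inf"), 0.0
        b0 = sum(b * v for b, v in zip(fields, s)) / denom
        sse = sum((b - b0 * v) ** 2 for b, v in zip(fields, s))
        return sse, b0

    lo, hi = tc_min, tc_max
    best = lo
    for _ in range(rounds):
        width = (hi - lo) / steps
        grid = [lo + i * width for i in range(steps + 1)]
        best = min(grid, key=lambda tc: evaluate(tc)[0])
        lo, hi = max(tc_min, best - width), min(tc_max, best + width)
    return best, evaluate(best)[1]


# Synthetic demonstration data (arbitrary Tc = 420 K, B0 = 50 T)
temps = [4.0, 77.0, 150.0, 200.0, 300.0, 350.0, 400.0]
fields = [50.0 * reduced_magnetization(T / 420.0) for T in temps]
tc_fit, b0_fit = fit_tc(temps, fields, 350.0, 550.0)
print(round(tc_fit, 2), round(b0_fit, 2))
```

With real data, an uncertainty estimate for T_c would additionally be needed, which this sketch does not attempt.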
One of the important results of the expert review of the works included in the MEDC database (Fig. 1) is the attribution of relevant keywords to the considered publications and to the associated Mössbauer data. One type of keyword (with associated keyword codes, e.g., CYA for inorganic cyanides, GLS for glasses and amorphous substances, MAA for metals and alloys, etc.) designates Mössbauer data according to the class of compounds to which the measured sample belongs. The attribution of these keywords to individual Mössbauer data makes it possible to query the database for Mössbauer data collected for a particular class of compounds, which in turn may reveal approximate limits of the Mössbauer parameters associated with the compound class in question.
Fig. 4 57Fe hyperfine magnetic field in Sr2FeReO6 as a function of sample temperature (filled circles) based on 7 data records of the MEDC database, and a corresponding theoretical curve (following the temperature dependence obtained from a mean-field approximation of the Fe3+ sublattice magnetization) fitted to the data via the least squares method. The data records are associated with 6 different publications [12][13][14][15][16][17] spanning the time period of 2000-2005. Numbers beside the points indicate the publication year(s) of the corresponding work(s). The fitted value of the critical temperature is T_c = 417(3) K

One of the unique features of the current MWAD is that it allows one to query data records according to the value of the isomer shift and quadrupole splitting Mössbauer parameters. At the same time, the "Data" table of the MEDC database contains isomer shift values given with respect to different standard materials, such as α-Fe, SS (stainless steel) and SNP (sodium nitroprusside) in the case of 57Fe isomer shift data. Consequently, to make a query on the isomer shift parameter really useful, the isomer shift reference material needs to be defined as well. In a query of this type, one could consider all suitable data records irrespective of the isomer shift reference material if the database included a field containing the isomer shift given with respect to the same reference material for all the data records. Such a field is therefore a further candidate for the extension of the "Data" table (see Table 1) of the MEDC database.
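The content of such a common-reference field could be derived as in the following sketch, which converts 57Fe isomer shifts to the α-Fe scale by adding the isomer shift of the original reference material relative to α-Fe. The offsets in the table are approximate, commonly quoted room-temperature values included only for illustration (the value for stainless steel in particular depends on the steel type); they are assumptions of this example and would have to be verified before use. The function name `to_alpha_fe` is hypothetical.

```python
# Isomer shift of each reference material relative to alpha-Fe, in mm/s.
# ASSUMPTION: approximate, illustrative room-temperature values only.
REFERENCE_OFFSETS_MM_S = {
    "α-Fe": 0.0,
    "SNP": -0.257,   # sodium nitroprusside
    "SS": -0.09,     # stainless steel (depends on steel type)
}


def to_alpha_fe(delta, reference, offsets=REFERENCE_OFFSETS_MM_S):
    """Convert an isomer shift (mm/s) given relative to `reference`
    into an isomer shift relative to alpha-Fe."""
    if reference not in offsets:
        raise ValueError("unknown isomer shift reference: " + reference)
    # delta(sample vs alpha-Fe) = delta(sample vs ref) + delta(ref vs alpha-Fe)
    return delta + offsets[reference]


# Dummy records: the same shift value quoted against different references
records = [(0.45, "α-Fe"), (0.45, "SNP"), (0.45, "SS")]
for delta, ref in records:
    print(ref, round(to_alpha_fe(delta, ref), 3))
```

Keeping the original value and reference material alongside the converted value would allow the conversion to be repeated if better offsets become available.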

Conclusions
As the first stage of the further development of the MEDC database, a new custom-made database management software named "MEDC DBM" has been developed. The new software enables the continuation of database compilation work in the MS Windows OS environment by offering all the necessary features that were previously utilized for database management in the iMac based "4D" database management system. At the same time, "MEDC DBM" presents greatly enhanced features with respect to the visualization, input and output of database data, among others due to the consistent use of Unicode character encoding. Compared with the previously used system, the prevention of errors has been enhanced by automating various stages of the data input process, while the probability of detecting errors has been increased by a clear and versatile visualization of the various database tables and, in the case of structural database errors, by an automatic error detection and reporting system. In addition, the database structure has been complemented with additional fields storing information such as the DOI link of publications and the hyperfine/apparent magnetic field value of spectral patterns displaying magnetic splitting, which can contribute to the implementation of new features in the envisaged new online version of the database (MWAD).