Tracking and viewing modifications in digital calibration certificates
ACTA IMEKO
ISSN: 2221-870X
March 2023, Volume 12, Number 1, 1 - 7
ACTA IMEKO | www.imeko.org March 2023 | Volume 12 | Number 1 | 1
Tracking and viewing modifications in digital calibration
certificates
Vashti Galpin1, Ian Smith2, Jean-Laurent Hippolyte2
1 School of Informatics, University of Edinburgh, United Kingdom
2 National Physical Laboratory, United Kingdom
Section: RESEARCH PAPER
Keywords: digital calibration certificates, provenance, temporal databases, XML, prototype
Citation: Vashti Galpin, Ian Smith, Jean-Laurent Hippolyte, Tracking and viewing modifications in digital calibration certificates, Acta IMEKO, vol. 12, no. 1,
article 11, March 2023, identifier: IMEKO-ACTA-12 (2023)-01-11
Section Editor: Daniel Hutzschenreuter, PTB, Germany
Received November 20, 2022; In final form March 20, 2023; Published March 2023
Copyright: This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use,
distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This work was supported by ERC Consolidator Grant Skye [grant number 682315] and by an ISCF Metrology Fellowship grant provided by the UK
government’s Department for Business, Energy and Industrial Strategy (BEIS).
Corresponding author: Vashti Galpin, e-mail: Vashti.Galpin@ed.ac.uk
1. INTRODUCTION
Information relating to the calibration of an instrument or
artefact is captured in documents referred to as calibration
certificates. These documents are typically provided either as
physical paper-based documents or in electronic form, for
example, in archivable Portable Document Format (PDF-A). A
downside of both approaches is that the information they
contain is not machine-readable, that is it is not available in a
form that can be read and processed by computer. Using the
information therefore requires, to some degree, human
involvement and as such is prone to error, for example, arising
from the transcription of numerical information from paper to
computer.
Recent initiatives have considered the replacement of paper-
based and electronic calibration certificates by fully machine-
readable calibration certificates, referred to as “digital calibration
certificates”, often abbreviated to “DCCs”.
The European Metrology Programme for Innovation and
Research (EMPIR) [1] funded the “SmartCom” Joint Research
Project (2018-2021) [2] which developed a framework for DCCs.
The framework allows the key calibration certificate components
of administrative data and measurement results to be stored. The
recipient of a DCC will have, or be able to develop, software to
ingest and use the information stored therein.
Ensuring the integrity and longevity of current and historical
calibration data is essential for calibration laboratories to
maintain long-term trusted relationships with their customers. It
is also a requirement of ISO/IEC 17025 accreditation for these
laboratories [3], along with the preservation of information about
the methods, processes, software, equipment, and staff involved
in the calibration task. This accompanying information, which is
essential to trace and potentially repeat the conditions under
which the calibration is performed, is more generally known as
“provenance information” [4]. Laboratories must be able to track
and compare the differences between successive calibrations of
the same artefact in data (for example, newly observed
measurement results) or in provenance (for example, a different
employee now signs off the calibration results). Since DCCs are
containers for the trusted exchange of calibration data, their
potential to increase the productivity of laboratories relies on
ABSTRACT
Trust in current and historical calibration data is crucial. The recently proposed XML schema for digital calibration certificates (DCCs)
provides machine-readability and a common exchange format to enhance this trust. We present a prototype web application developed
in the programming language Links for storing and displaying a DCC using a relational database. In particular, we leverage the temporal
database features that Links provides to capture different versions of a certificate and inspect differences between versions. The
prototype is the starting point for developing software to support DCCs and the data with which they are populated and has underlined
that DCCs are the tip of the iceberg in automating the management of digital calibration data, activity that includes data provenance
and tracking of modifications.
mailto:Vashti.Galpin@ed.ac.uk
ACTA IMEKO | www.imeko.org March 2023 | Volume 12 | Number 1 | 2
automated, integrated, and time-based curation of this data and
associated provenance.
The structure of this paper is as follows: we provide
background on DCCs before we go on to describe our prototype.
We discuss the choices made in the development of the
prototype and illustrate its usage with a screenshot. We conclude
with ideas for further work. This paper is an extended version of
an earlier conference paper [5].
2. DIGITAL CALIBRATION CERTIFICATES
The structure of a DCC [6] reflects the information that is
required by ISO/IEC 17025 [3] for reporting calibration results.
A DCC comprises two compulsory sections. The first
compulsory section contains the administrative information
including that which is generally presented on the first page of a
paper-based certificate. The second compulsory section contains
measurement results. The measurement results section itself
relies upon a framework, referred to as the “D-SI”, developed
specifically for the storage of measurement data. The framework
ensures that quantity values, units of measurement and
uncertainty information can all be represented. A DCC may also
include two optional sections. The first optional section contains
information presented purely for humans, for example,
calibration-specific data sheets, and other auxiliary, machine-
interpretable data, for example, relational tables. The second
optional section contains an encoded version of a human-
readable version of the certificate.
The D-SI and DCC frameworks specify, for example, how
numerical values and date and time information should be
presented and uses the BIPM SI brochure [7] and the siunitx
package for LaTeX [8] as the basis for the provision of units of
measurement.
The D-SI and the DCC may be implemented in a language
chosen by the user, for example, Extensible Markup Language
(XML) or JavaScript Object Notation (JSON). The SmartCom
project has developed and made available an XML schema for
the D-SI framework [9] while the Physikalisch-Technische
Bundesanstalt (PTB), the National Metrology Institute (NMI) of
Germany, has made available an XML schema for the DCC [10].
Demonstrators have been developed by NMIs using Excel
and Python [11], [12] and as a web application (GEMIMEG) [13]
to illustrate the use of these schema. A different approach
considers embedding calibration data in PDF calibration
certificates to support machine-readability [14]. NMIs have also
been investigating good practice for DCCs [15] and usage in
specific domains [16], [17]. Developers of commercial software
for calibration are also starting to support the use of DCCs [18],
[19]. The first IMEKO TC6 M4DConf, held in Berlin in 2022,
focussed on the digital transformation of metrology and many
submissions considered DCCs [20].
We are aware that the frameworks (and related schemas) may
be subject to future updates, for example, if the BIPM SI
brochure is updated, or if the requirements for the contents of a
calibration certificate change. In the approach we have taken,
dealing with XML schema changes is straightforward, and in fact
occurred during our development when a newer version of the
DCC schema became available, and we discovered that the
dummy certificate with which we were working was missing a
compulsory XML element.
3. THE DCC PROTOYTPE
Our prototype application brings together concepts from
metrology (specifically, the XML schema for DCCs) and
academic research in databases and programming languages and
enables the transfer of research ideas from academia. On the
computer science research side, there are two distinct concepts:
temporal databases and a technique for storing XML documents
in a relational database, that are combined using the Links
programming language, a tierless language that supports
language-integrated query and provides a single language for the
development of web applications with database back-ends [21],
[22]. We now consider these three components in detail: Links,
temporal databases, and storage of XML documents; before
considering some details of the implementation of the
application.
3.1. The Links Programming Language
Links is a strongly-typed and statically-typed functional
programming language, that is cross-tier (or tierless): it removes the
need for the developer to write JavaScript or a particular database
query language (in this case, PostgreSQL [23]) for the different
tiers of a web application providing data from a database. In
particular, Links assists in the writing of correct software by
providing language-integrated query which allows the programmer to
write high-level queries that are transformed to correct SQL
queries with known performance for execution on the database.
Furthermore, the type-checking of variables and functions
before program execution reduces the likelihood of run-time
errors.
Figure 1 illustrates the Links architecture. The Links
interpreter takes the Links code and generates correct HTML
and JavaScript to run in the user’s browser, and to support
exchange of data between the browser and Links program, thus
allowing an interactive webpage. The interpreter also generates
correct SQL queries which are passed to a database server for
execution which responds with the results. Using this approach,
data intensive operations can take place on the database server,
using the fact that database systems such as PostgreSQL employ
efficient data structures and algorithms for queries. In particular,
indexes on key values can make search operations very efficient.
Links provides syntax to specify database tables as well as
comprehension syntax to iterate over each row of a database
table. We use the MVU (Model-View-Update) library [24] for the
development of the web interface as it provides functions to use
the element of the Bootstrap toolkit for website development
[25], thereby affording a higher level of abstraction by which to
achieve an appealing web frontend.
Figure 1. The Links architecture.
ACTA IMEKO | www.imeko.org March 2023 | Volume 12 | Number 1 | 3
3.2. Temporal databases
Temporal databases capture when a piece of data is valid. In the
case of a Relational DataBase Management System (RDBMS),
we are interested in the time period for which a row in a database
table is valid, and the inclusion of additional fields (columns)
allows this time period to be recorded [26], [27]. We are
interested in the provenance of DCCs and hence we want to
capture when data changes in the database, namely transaction-time
information. Temporal databases can also record valid-time
information which is information about when something is true
in the real world.
Figure 2 demonstrates the difference between an update in a
standard relational database versus one in a temporal database
supporting transaction time. In the standard case, the new data
overwrites the old data and there is no record of the fact that
there was a previous value for the field. In a temporal database,
two additional fields capture information about when the data
was changed, permitting queries such as “What were the values
on this day last year?”, “Has this field ever had a different value?”,
and “Which fields have the most frequent changes?”. In the
example, the date fields are given informally; however, in the
database, the fields are of SQL data type TIMESTAMP WITH TIME
ZONE and cannot be NULL. In terms of performance in the DCC
scenario, since updates are not frequent occurrences, the
additional database operations required in the temporal case will
not impose a significant overhead.
Links has recently been extended with temporal features that
have been demonstrated in the context of curated scientific
databases [28] and we use these features in the development of
our prototype.
3.3. XML documents
For this paper, we have used a synthetic DCC in XML,
containing dummy administrative data and measurement results,
formatted according to the XML schema [10] made available by
PTB.
XML is a tree-structured language, and an XML document is
an (inverted) tree of nodes starting from a root node. Each node
can have any number of child nodes. Nodes without children are
called leaves.
Each node in an XML document is either an element or a text
node (a string of characters). An element consists of an opening
Figure 2. Non-temporal update of a field versus a temporal update.
...
GB
en
...
laboratory
Digital thermometer
...
...
...
...
...
...
Figure 3. An example DCC in XML: text nodes are shown by white text on a
dark blue background, attributes by black text on a grey background and the
other components are XML elements defined by opening and closing tags.
The ellipses indicate where elements have been omitted.
ACTA IMEKO | www.imeko.org March 2023 | Volume 12 | Number 1 | 4
tag and closing tag (or a combined opening and closing tag). The
child nodes of the element (if any) appear in between a pair of
opening and closing tags of that element. These child nodes can
be elements themselves or text (strings of characters). A text
node has no children (and hence is a leaf).
An XML schema describes the tags that can be used in a
document and how they can be nested [29]. Additionally,
opening tags may contain attributes. Example nodes and
attributes are given in Table 1 and Table 2. Figure 3 provides an
example DCC using the XML schema. In the example,
is an opening tag and
is a closing tag, together providing an
XML element, which contains the text node laboratory. In the
figure, the opening tag for the dcc:digitalCalibrationCertificate
element contains three attributes which are shown and these
attributes provide sources for the XML schema and give the
schema version used in the document. The children of this initial
element are the elements dcc:administrativeData and
dcc:measurementResults.
The temporal features of Links are currently supported by an
RDBMS, hence it is necessary to map from the XML schema to
this database. There were two choices here: either to use the
DCC XML schema to create the appropriate tables in the
RDBMS or to treat the data in a schema-agnostic manner and
capture it as an XML document. We choose the latter approach
because changes to the XML schema will not require changes to
the RDBMS schema.
The Dynamic Dewey (DDE) labelling scheme supports the
labelling of XML documents. It uniquely labels each node in an
XML document and thereby provides a key for storing the node
in a relational table. It has good query and insertion performance
[30]. It is an extension of the Dewey labelling scheme which has
good query performance but differs from it by it removing the
need for relabelling (which reduces update performance) when
the XML document is updated because Dynamic Dewey can
provide unique labels for new nodes.
To store an XML document using this approach, only two
tables are required [30] (although a third table can be added to
store path information separately – we omit these details here).
This approach requires the generation of a unique value to serve
as a key for each node (an element or a text node) in an XML
document. The DDE algorithm determines this unique label
(referred to as dde) which consists of a sequence of integers (the
nodes can be labelled with negative numbers in certain cases)
separated by dots – so-called dot decimal notation. A total order
is defined for the labels (which differs from the standard ordering
over dot-decimal notation), and this ordering can be used to
reconstruct the nesting of the XML document as well as to
determine relationships between nodes such as parent, child, and
sibling.
The main table stores nodes and is similar to the one
illustrated in the right-hand side of Figure 2. Its schema is (dde,
tag, text, path, tt-from, tt-to). Each row has the dde value for the node
as the key attribute. If the node is an element, the element tag
will appear in the tag column, otherwise if it is a text node, its text
string will appear in the text column (one of these two columns
is always empty). The path column stores the list of the element
tags followed from the root element to the parent of the node.
The last two columns tt-from and tt-to store the timestamps that
describe when the node is valid. Because the dde value captures
the relationships between nodes, there is no need to explicitly
refer to other nodes in a row of this table.
The second table contains attribute information and has the
schema (dde, attr, value, tt-from, tt-to). In an element, each attribute
can only appear at most once, hence the attribute name and the
dde value are sufficient to uniquely identify the attribute’s value.
Consider the following XML fragment:
Text<\el2>
<\el3>
<\el1>.
The DDE algorithm will generate the content shown in
Tables 1 and 2. For this example, the length of each segment of
the key is 3 characters. In the implementation, this length is a
parameter, and it can be set appropriately for document size.
As can be seen from this explanation, the actual details of the
XML schema are not used for the database schema – there is no
table for customer details, for example.
By using the DDE ordering with the appropriate Links code
to specify joins over the relations, documents can be stored,
retrieved and displayed, as shown by our implementation.
3.4. Implementation
The prototype provides an interface for viewing a single DCC.
Straightforward extensions include the ability to edit the DCC
and to validate that it adheres to the DCC and D-SI XML
schemas. Further obvious extensions are the ability to work with
many DCCs, to compare different DCCs, and to support the
signing of DCCs. We are interested in provenance beyond
tracking changes in the data, such as recording which person and
what software made the changes.
In the longer term, we will require systems that support the
whole calibration workflow including automated collection and
processing of machine-readable calibration data.
Using the interface, it is possible to view the current state of
the document, the state at a previously specified date and time,
and to compare two versions at two specified dates and times. It
is possible to hide or show specific subtrees of the document,
and when doing the comparison, to see the subtrees where
changes have been made.
The Change Analysis option allows the user to see a history
of all changes over the life of a DCC. Figure 4 shows how
changes are highlighted as well as cumulative totals of insertions
and changes for each tag and text node in the document. There
are various options to view subtrees for a specific tab or to
control the display of the whole document. The figure also shows
how details of changes are displayed. (Note that this screenshot
is of a different DCC to the one presented in Figure 3.) The
Links code for the prototype is available at
https://github.com/vcgalpin/dcc-xml-temporal [31].
Table 1. Example node table.
dde tag text path tt-from tt-to
001 el1 \ … …
001.001 el2 \el1\ … …
001.001.001 Text \el1\el2\ … …
001.002 el3 \el1\ … …
Table 2. Example attribute table.
dde attr value tt-from tt-to
001 attr1 A value … …
001.002 attr2 Another value … …
https://github.com/vcgalpin/dcc-xml-temporal
ACTA IMEKO | www.imeko.org March 2023 | Volume 12 | Number 1 | 5
As mentioned above, a more complete application should be
able to validate against the DCC and D-SI XML schemas. The
level of validation that is provided by these schemas is very high
but not complete. In some cases, patterns are used to ensure that
values are within bounds. For example, the si:probabilityValueType
[9] has the following associated pattern