An Experiment with Denotational Semantics
 575 Downloads
Abstract
The paper is devoted to showing how to systematically design a programming language in “reverse order”, i.e., from denotations to syntax. This construction is developed in an algebraic framework consisting of three manysorted algebras: of denotations, of an abstract syntax and of a concrete syntax. These algebras are constructed in such a way that there is a unique homomorphism from concrete syntax to denotations, which constitutes the denotational semantics of the language. Besides its algebraic framework, the model is settheoretic, i.e., the denotational domains are just sets, rather than Scott’s reflexive domains. The method is illustrated by a layerbylayer development of a virtual language Lingua: an applicative layer, an imperative layer (with recursive procedures) and an SQL layer where Lingua is regarded as an API (Application Programming Interface) for an SQL engine. The latter is given a denotational semantics as well. Mathematically, the model is based on socalled naive denotational semantics (Blikle and Tarlecki in Information processing 83. Elsevier Science Publishers B.V., NorthHolland, 1983), Manysorted algebras (Goguen et al. in J ACM 24:68–95, 1977), equational grammars (Blikle in Inform Control 21:134–147, 1972), and a threevalued predicate calculus based on a threevalued proposition calculus of McCarthy (A basis for a mathematical theory of computation, North Holland, 1967). Threevalued predicates provide an adequate framework for errorhandling mechanisms and also for the development of a Hoarelike logic with clean termination (Blikle in Acta Inform 16:199–217, 1981) for Lingua. That logic is used in Blikle and ChrząstowskiWachtel (Complete Unambiguous, https://doi.org/10.13140/rg.2.2.27499.39201/3, 2019) for the development of correctnesspreserving programs’ constructors. This issue is, however, not covered by the paper. The langue is equipped with a strong typing mechanism which covers basic types (numbers, Booleans, etc.), lists, arrays, record and their arbitrary combinations plus SQLlike types: rows, tables, and databases. The model of types includes SQLintegrity constraints.
Keywords
Settheoretic denotational semantics Manysorted algebras Threevalued predicate calculus A denotational model of types Abstract syntax Concrete syntaxIntroduction
Reversing the Traditional Order of Things
For a long time, the term formal verification was almost synonymous with functional verification. In the last years, it became more and more clear that full functional verification is an elusive goal for almost all application scenarios. Ironically, this happened because of advances in verification technology: with the advent of verifiers, such as KeY, that mostly cover and precisely model industrial languages and that can handle realistic systems, it finally became obvious just how difficult and timeconsuming the specification of the functionality of real systems is. Not verification but specification is the real bottleneck in functional verification.
In my opinion, the failure in constructing a practical system for program validation has had two sources. The first lies in the fact that in building a programming language, we start from syntax and only later—if at all—define its semantics. The second source is somehow similar, but concerns programs: we first write a program and only then try to prove it correct.
To build a logic of programs for a programming language, one must first define its semantics on a mathematical ground. Since the 1970s it was rather clear for mathematicians that such semantics to be “practical” must be compositional, i.e., the meaning of a whole must be a composition of the meanings of its parts. Later, such semantics were called denotational—the meaning of a program is its denotation—and for about two decades researchers investigated the possibilities of defining denotational semantics for existing programming languages. Two most complete such semantics were written in 1980 for Ada [4] and for CHILL [18] in using a metalanguage VDM [3]. A little later, but in the same decade, a minor exercise in this field was a semantics of a subset of Pascal written in MetaSoft [11], the latter based on VDM.
Unfortunately, none of these attempts resulted in the creation of softwareengineering tools that would be widely accepted by the IT industry. In my opinion that was unavoidable, since for the existing programming languages a full denotational semantics simply cannot be defined (see “General remarks about denotational models”). That was, in turn, the consequence of the fact that historically syntaxes were coming first and only later researchers were trying to give them a mathematical meaning. In other words—the decision of how to describe things preceded the reflection of what to describe.
In addition to that, two more issues were complicating denotational models of programming languages. They were related to two mechanisms considered important in the 1960s, but later abandoned and forgotten. One was a common jump instruction goto, the other—specific procedures that may take themselves as parameters (Algol 60, see [26]). The former has led to continuations (see [22]), and the latter to reflexive domains (see [27]). Both contributed to a technical complexity of denotational models which was discouraging not only for practitioners, but even for mathematicians.
The second group of problems followed from a tacit assumption that in the development of a mathematically correct program, the development of a program should precede the proof of its correctness. Although this order is quite obvious in mathematics—first theorem, then its proof—it is rather awkward for an engineer who first performs all necessary calculations (the proof) and only then builds his bridge or aeroplane.
The idea “first a program and correctnessproof later” seems not only irrational, but also practically rather unfeasible for two reasons.
The first reason follows from the fact that a proof of a theorem is usually longer than the theorem itself. Consequently, proofs of program correctness should contain thousands if not millions of lines. It makes “handmade proofs” rather unrealistic. On the other hand, automated proofs were not available by the lack of formal semantics for existing programming languages.
Even more important seems, however, the fact that programs that are supposed to be proved correct are usually incorrect! Consequently, correctness proofs are regarded as a method of detecting errors in programs. In other words, we are first doing things in the wrong way to correct them later. Such an approach does not seem very rational either.
As an attempt to cope with all the mentioned problems, I propose some mathematical tools and methods that allow for the development of programming languages with denotational semantics. Their detailed description may be found in [16]. To illustrate these methods, an exemplary programming language, Lingua, has been developed from denotations to syntax (first publication of that method in [12]). In this way, the decision of what to do (denotations) precedes the decision of how to express that (syntax).
Mathematically, both the denotations and the syntaxes constitute manysorted algebras (“Manysorted algebras”), and the associated semantics is the homomorphism from syntax to denotations. As it turns out, there is a simple method—to a large extent algorithmizable—of deriving syntax from (the description of) denotations and the semantics from both—the syntax and the denotations.
At the level of data structures (i.e., sets of data), Lingua contains Booleans, numbers, texts, records, arrays and their arbitrary combinations plus SQL databases. It is also equipped with a relatively rich mechanism of types (which are not sets; see “Composites, transfers, yokes, types and values”), e.g., covering SQLlike integrity constraints,^{1} and with tools allowing the user to define his/her own types structurally. At the imperative level, Lingua contains structured instructions, type definitions (“The imperative layer of the language”), procedures with recursion and multirecursion, and some preliminaries of objectoriented programming.
The issue of concurrency is not tackled in [16], since the development of a “fully” denotational semantics for concurrent programs (if at all possible) would require separate research.^{2}
Once we have a language with denotational semantics, we can define programconstruction rules that guarantee the correctness of programs developed in using these rules. This method was for the first time sketched in my paper [8] and in [16] is described in Sect. 8. It consists in developing socalled metaprograms which syntactically include their specifications. The method guarantees that if we compose two or more correct programs into a new program, we get a correct program again. The correctness proof of a program is hence implicit in the way the program has been developed. The aspect of the development of correct programs is not covered by the present paper.
 1.
fixedpoint theory in partially ordered sets,
 2.
the calculus of binary relations,
 3.
formallanguage theory and equational grammars,
 4.
fixedpoint domain equations based on socalled naive denotational semantics (cf. [17]),
 5.
manysorted algebras,
 6.
abstract errors as a tool for the description of errorhandling mechanisms,
 7.
threevalued predicate calculi of McCarthy and Kleene,
 8.
the theory of total correctness of programs with clean termination (cf. [10]).
All these tools are described in Sects. 2 and 8 of [16], and some of them are sketched in “What is new in my approach” of the present paper.
 1.
the priority of the simplicity of the model, i.e., the simplicity of denotations, syntax, and semantics; this has laid to the resignation from, e.g., goto instruction and selfapplicative procedures,
 2.
the priority of the simplicity of programconstruction rules; e.g., the assumption that the declarations of variables and procedures, as well as the definitions of types, should be located at the beginning of a program,
 3.
the priority of protection against “oversight errors” of a programmer; e.g., the resignation of global variables in procedures and of side effects in functional procedures.
All these commitments forced me to give up some programming constructions, which—although denotationally definable—would lead to complicated descriptions and even more complicated programconstruction rules. It is worth mentioning in this place that the priority of simplicity is not new in the history of programming languages. For that very reason, programming language designers abandoned gotos as well as selfapplicative procedures.
The name Lingua has been chosen to commemorate the circumstances under which from October to December 1969, I wrote my first denotational semantics of a very simple programming language (this work was later published in Dissertationes Mathematicae [5] as my habilitation (postdoctoral) thesis). During 3 months as a scholar of the Italian Government, I was working in the Istituto di Elaborazione dell’Informazione in Pisa. I did not yet know the works of Dana Scott or the concept of denotational semantics, and I constructed my language and its semantics on a model theory known in mathematical logic. Only 18 years later, in the year 1987, I described (in [12]) the idea of how to develop syntax from detonations.
What is in the Paper
I am deeply convinced that one can talk about programming in a precise and clear way. I also believe that taking responsibility by software engineers should be possible in the same way as it is in the case of the engineers of cars, bridges, or aeroplanes. However, I am aware of the fact that the existing tools for software engineers do not allow for the realization of any of these goals.
The paper contains many thoughts developed in the years 1960–1990 that later have been abandoned. One of the teams developing these ideas was working in the Institute of Computer Science of the Polish Academy of Sciences, and I had the pleasure to chair it. At that time we had developed a semiformal metalanguage MetaSoft dedicated to formal definitions of programming languages (cf. [11]). This metalanguage is used in [16] and in the present paper as a definitional vehicle for denotational models.
I am aware of the fact that the content of [16] represents a very restricted part of the world of today’s programming languages. Something had to be chosen, however, to begin with. Lingua contains, therefore, a selection of programming tools that have been known for many years and that are still in use. In the future, I shall try to complete my models with those vehicles that my readers will consider important. I also hope that maybe some of my readers will undertake this challenge. Feel invited to cooperate.
What this Paper is Not Offering
 1.
the compatibility of the program’s specification with the expectations of its user,
 2.
the compatibility of the program itself with its specification.
In this paper, and in [16], I tackle only the second aspect. My choice is not caused by the fact that the first problem is less important, or that it has been already solved, but only because the second problem was the main subject of my research for two decades and therefore I dare to talk about it now.^{3}
I also have to emphasise very strongly that my virtual language Lingua is not regarded either as a practical programming language or even as a standard of such a language, although maybe a real language will grow from Lingua in the future. At present, it only offers a platform where to explain the constructions and the models discussed in [16]. I have tried to cover in it the selected basic tools that are present in languages which are known to me today. I resigned from concurrency, and objectoriented programming is only sketched.
I believe, however, that there are enough applications today that can be developed in using the tools described in [16].
What is New in My Approach
 1.Programming language design and development:
 1.1.
Denotational model based on set theory rather than on D. Scott’s reflexive domains which makes the model much simpler and easy to be formalized.
 1.2.
A model of data types that covers not only structured and userdefined types, but also SQLintegrity constraints.
 1.3.
A formal, and, to a large extent, an algorithmic method of a systematic development of syntax from denotations and of a denotational semantics from both of them.
 1.4.
The idea of a colloquial syntax which allows making syntax user friendly without damaging a denotational model.
 1.5.
Systematic use of error elaboration in programs supported by a threevalued predicate calculus.
 1.1.
 2.The development of correct programs:
 2.1.
A method of systematic development of correct programs with their specifications, rather than an independent development of programs and specifications followed by programcorrectness proof.
 2.2.
The use of threevalued predicates to extend Hoare’s logic by a clean termination property.
 2.1.
 3.General mathematical tools:
 3.1.
Equational grammars applied in defining the syntax of programming languages.
 3.2.
A threevalued calculus of predicates applied in designing programming languages and in defining sound program constructors for such languages.
 3.1.
Mathematical Preliminaries
For a full description of mathematical tools used in the development of denotational models, see Sect. 2 of [16]. Below, there is a selection of concepts and notations that are used in the present paper. They all come from MetaSoft [11]—a metalanguage for the description of programming languages.^{4}
Notational Conventions

a : A means that a is an element of the set A; according to the denotational dialect, sets are most frequently called domains,

f.a denotes f(a), and f.a.b.c denotes ((f(a))(b))(c); intuitively f takes a as an argument and returns the value f(a) which is a function which takes b as an argument and returns the value (f(a))(b), which is again a function…

f · g denotes the sequential composition of functions, i.e., (f·g).a = g.(f.a),

A → B denotes the set of all partial functions from A to B, i.e., functions which are (possibly) undefined for some elements of A,

A \(\mapsto\) B denotes the set of all total functions from A to B, i.e., functions undefined for all elements of A; of course, A \(\mapsto\) B is a subset of A → B,

A \(\Rightarrow\) B denotes the set of all finite function from A to B, i.e., functions defined for only finite subsets of A; such functions are called mappings, and of course, each mapping is a particular case of a partial function,

[a_{1}/b_{1}, …, a_{n},b_{n}] denotes a mapping that assigns b_{i} to a_{i} and is undefined otherwise,

A  B denotes the settheoretic union of A and B,

A × B denotes the Cartesian product of A and B,

A^{c*} denotes the set of all finite (possibly empty) tuples of the elements of A,

A^{c+} denotes the set of all finite nonempty tuples of the elements of A,

c*ctt and ff denote logical values “true” and “falsity”, respectively,

manycharacter symbols like dom, bod, com denote metavariables running over domains and if they are written with quotation marks as ‘abdsr’ denote themselves, i.e., metaconstants,^{5}

in the definitional clauses of Lingua instead of indexed variables like sta_{1,} we write sta1 or sta1 which is closer to a notation used in programs.
 1.
the level of the basic text of the paper written in Times New Roman,
 2.
the level of a formal, but not formalized, metalanguage MetaSoft written in Arial,
 3.
the level of formalized programming language Lingua whose syntax, i.e., programs, are written in Courier New .
The difference between “formal” and “formalized” is such that the former is introduced intuitively as a mathematical notation, whereas the latter requires an explicit definition of syntax (usually by a grammar) and a formal definition of semantics.
Intuitively speaking, the evaluation of such a function goes line by line and stops at the first line where p_{i}.x is satisfied.
In the scheme above, I also allow the situation where, in the place of a gi.x we have the undefinedness sign “?”, which means that for x that satisfies pi.x the function f is undefined. This convention is used in conditional definitions of partial functions.
All these explanations are certainly not very formal, but the notation should be clear when it comes to concrete examples in the sequel of the paper.
By f[a_{1}/v_{n}, …, a_{n}/v_{n}], I denote an overwriting of f by [a_{1}/v_{n}, …, a_{n}/v_{n}], i.e., a function which differs from f only on the domain {a_{1}, …, a_{n}}.
ManySorted Algebras
In our algebra, we have four zeroargument constructors 1, 0, tt, ff, one oneargument constructor not, and four twoargument constructors +, =, <, or. The zeroargument constructors create elements of carriers “from nothing”, whereas all other constructors create elements of carriers from other elements of carriers.
An element of an algebra is called reachable if it can be constructed (reached) using the constructors of the algebra. In NumBool, where Num denotes the set of all real numbers, the reachable subset of Num contains only nonnegative integers.
By a reachable subalgebra of an algebra, we mean its subalgebra with carriers restricted to their reachable parts. In our case, this is an algebra of nonnegative integers and Booleans.
An algebra is said to be reachable if all its carriers contain only reachable elements. Notice that if we remove the zeroargument constructor 1 from NumBool, then the reachable subset of Num becomes empty.
In the algebraic approach to denotational models, the algebra of program denotations (meanings) is usually unreachable, whereas the algebras of syntax are reachable by definition (see “Equational grammars”).
We shall assume that NumBoolExp contains only reachable expressions. Such algebra is implicit in the signature of NumBool and, due to its reachability, is unique. Traditionally, it is called the abstract syntax of the algebra NumBool.

SemN.[ + (1, + (1,0)) ] = 2,

SemB.[ < ( + (1, + (1,0)),0) ] = ff

SemN.[ + (1, + (1,0)) ] = SemN.[ + (1,1) ] = 2,

SemB.[ < ( + (1, + (1,0)),0) ] = SemB.[ < (0,0) ] = ff
The notation of an abstract syntax is rather awkward and therefore abstract syntax is usually transformed into a concrete syntax, which is more “user friendly”. In our case, it would correspond to an infix notation where the concrete + given two numeric expressions, nexp1 and nexp2, creates the expression:

(nexp1 + nexp2)

As = Co · Cs.
There is more about a denotational model of programming languages in “Five steps to a denotational model”. Readers interested in the mathematical justifications of the model are referred to Sects. 2.10 to 2.13 of [16] and to the references given there.
Equational Grammars
Let A be an arbitrary finite set of symbols called an alphabet. By a word over A, we mean every finite sequence of the elements of A including the empty sequence ε. If p and q are words, then by their concatenation—in symbols pq—we mean a sequential combination of these words.

PQ = {pq  p : P and q : Q}
According to a usual style for writing grammars, the symbols 0 , 1 , tt , ff, + , = , < , not, or, (,) and the coma denote oneelement languages: { 0 }, { 1 }, …
Equational grammars correspond closely to contextfree grammars introduced by Noam Chomsky (e.g., in [19]) in the sense that for each contextfree grammar there exists an equational grammar that defines the same manysorted language, and for a certain class of equational grammars there exists an equivalent contextfree grammar. They have been introduced in [6] and are also described in Sects. 2.5 and 2.14 of [16].
Abstract Errors

the value of x/y cannot be computed if y = 0,

the value of the expression x + 1 cannot be computed if x has not been declared in the program,

the value of x + y cannot be computed if the sum exceeds the maximal number allowed in the language,

the value of the array expression a[k] cannot be computed if k is out of the domain of array a, or if a is not an array,

the query “Has John Smith retired?” cannot be answered if John Smith is not listed in a database.
In all these cases, a welldesigned implementation should stop the execution of a program and generate an error message or perform a recovery procedure.
To describe that mechanism formally, we introduce the concept of an abstract error. In a general case abstract errors may be anything, but in our models, they are texts such as, e.g., ‘divisionbyzero’. They are enclosed in apostrophes to distinguish them from metavariables.

x/0 = ‘divisionbyzero’.

DataE = Data  Error,

op : Data _{ 1} x … x Data _{ n} → Data
Of course, ope should coincide with op wherever op is defined.
The operation ope is said to be transparent for errors or simply transparent if the following condition is satisfied:
if d_{k} is the first error in the sequence d_{1},…,d_{n}, then ope.(d_{1},…,d_{n}) = d_{k}
Intuitively, this condition means that arguments of ope are evaluated one by one from left to right, and the first error (if it appears) becomes the final value of the computation.
The majority of operations on data that will appear in our models are transparent. Exceptions are Boolean operations discussed in “Manysorted algebras”
Errorhandling mechanisms may be implemented in such a way that errors serve only to inform the user that (and why) program execution has been aborted. Such a mechanism is called reactive. Another option is that the generation of an error results in an action, e.g., of recovering the last state of a database. Such mechanisms are called proactive.
A reactive mechanism may be quite easily enriched to a proactive one (see Sects. 6.1.8 and 12.7.6.4 of [16]). However, since the latter is technically more complicated, in this paper only reactive model will be discussed.
A welldefined errorhandling mechanism allows avoiding situations where programs are aborted without any explanation, or—even worse—when they generate an incorrect result without warning the user.
ThreeValued Propositional Calculus
Tertium non datur—ancient masters used to say. Computers have denied this principle.
In the Aristotelean classical logic, every sentence is either true or false. The third possibility does not exist. However, in the world of computers, the third possibility is not only possible but also inevitable. E.g., in evaluating a Boolean expression x/y > 2 an error will appear if the value of y equals zero.

BooleanE = {tt, ff, ee}

ff and ee = ee
which would mean that an interpreter that evaluates p and q first evaluates both p and q—as in “usual mathematics”—and only later applies and to them. Such a mode is called an eager evaluation.

ff and ee = ff

tt or ee = tt
Propositional operators of John McCarthy

If p = tt, then we give up the evaluation of q (lazy evaluation) and assume that the value of the expression is tt. Notice that in this case we may avoid an error message or an infinite computation that could be generated by q.

If p = ff, then we evaluate q, and its value—possibly ee—becomes the value of the expression.

If p = ee, then this means that the evaluation of our expression aborts or loops at the evaluation of its first argument, and hence the second argument is not evaluated. Consequently, the final value of the expression must be ee.

p implies m q = (notm p) orm q

associativity of and and or,

De Morgan’s laws

orm and andm are not commutative, e.g., ff andm ee = ff but ee andm ff = ee,
 andm is distributive over orm only on the righthand side, i.e.,

p andm (q orm s)=(p andm q) orm (p andm s); however,

(q orm s) andm p ≠ (q andm p) orm (s andm p) since

(tt orm ee) andm ff = ff and (tt andm ff) orm (ee andm ff) = ee,


analogously orm is distributive over andm only on the righthand side,

p orm (not p) does not need to be true, but is never false,

p andm (not p)not p) does not need to be false, but is never true.
General Remarks About Denotational Models
Why Do We Need Denotational Models?
 1.
building the implementation of the language, i.e., its parser and interpreter or compiler,
 2.
creating rules of building correct specified programs,
 3.
writing a user manual.
A programming language should be as simple and easy to use as possible, although without damaging its functionality, mathematical clarity and the completeness of its description. The same applies to the manual of languages and to the rules of building correct programs.
 1.
the syntax of the language as close as possible to the language of intuitive mathematics; for example, whenever this is common, we use infix notation and allow the omission of “unnecessary” parentheses,
 2.
the structure of the language (i.e., program constructors) leading to possibly simple rules of constructing correct programs (Sect. 8 of [16]),
 3.
the semantics of the language easy to understand by the user rather than convenient for the builder of implementation; for the latter an implementationoriented equivalent model may be written.
Special attention should be given to point 2, because the simplicity of the rules of building correct programs leads to a better understanding of programs by programmers. This fact was realized already in the year 1970 and has led to the elimination of goto instructions. This decision resulted in a major simplification of programs’ structures, which increased their reliability.
Following point 3, I will sometimes—as common in mathematics—“forget” about the difference between syntax and denotations. E.g., I will talk about the value of an expression x + y, rather than about the value of its detonation. I would say that the instruction x: = y + 1 modifies variable x , instead of saying that the denotation of this instruction modifies the memory state at variable x , etc. Of course, on a formal level syntax will be precisely distinguished from denotations.
Five Steps to a Denotational Model

As = Co · Cs
The construction of a denotational model begins with an algebra of detonation Den. Its constructors unambiguously determine the reachable subalgebra ReDen. From the signature of Den, we unambiguously derive the abstract syntax algebra AbsSy. The first of these steps is creative since it comprises all the major decisions about the future language. Contrary to it, the derivation of AbsSy can be performed algorithmically. The corresponding algorithm takes the description—e.g., in MetaSoft—of the signature of Den. This technique will be explained in more detail in the subsequent sections.

;(ins  1,;(ins  2, ins  3))

ins  1; ins  2; ins  3

;(ins1,;(ins2, ins3)) and

;(;(ins  1, ins  2), ins  3)
In other words, computer addition is not associative.
A user manual of a programming language with colloquialisms describes concrete syntax by a grammar, and the colloquialisms as additional grammatical clauses. This means that the programmer is free to use either a concrete syntax or a colloquial one.
 1.
The construction of Den where we decide about the meaning of future programs and their constructors. This is the most creative step where we decide about all the programming mechanisms of our language.
 2.
The derivation of abstract syntax, i.e., its grammar, from the signature of Den. This step is fully programmable.
 3.
The definition of concrete syntax, i.e., its grammar. To a certain degree, this is a creative step again, although in this case it may be supported by a software tool which assists the designer in transforming the grammar of abstract syntax into its concrete counterpart.
 4.
The description of the semantics Cs of concrete syntax. The definition of this semantics, i.e., the semantic clauses as (1), may be derived algorithmically from the definitions of Den, AbsSy and ConSy.
 5.
The enrichment of the concrete syntax by colloquialisms and the definition of the corresponding restoring transformation. This is again a creative step.
Two Layers of a Programming Language
 1.
applicative layer covering data expressions and type expressions whose denotations are functions from states to data and from states to types, respectively,
 2.
imperative layer, covering instructions and declarations whose denotations are functions from states to states.
The Applicative Layer of Lingua
The Data
 1.
simple data including Booleans, numbers, and words (finite strings of characters),
 2.
structural data including lists, multidimensional arrays, records, and their arbitrary combinations.
The symbols boo, num, ide, etc., which precede our equations are metavariables that will run over the corresponding domains in further definitions. This is just another notational convention.
The domain Boolean consists of only two elements that represent “truth” and “falsity”. The domains Alphabet, Number and Identifier are the parameters of our model, which means that they may differ from one implementation to another.
The Alphabet is a finite set of characters (except quotation marks), while Identifier is a finite fixed set of nonempty strings over Alphabet.
A word is a finite string (possibly empty) of the elements of Alphabet closed between apostrophes.
A list is a finite sequence (possibly empty) of arbitrary data.
An array is a mapping from numbers to data, and a record is a mapping from identifiers to data.
A data is a Boolean, a number, a word, a list, an array, or a record. Notice that identifiers are not included in data. They have been introduced only to define the domain of records. Identifiers that appear in records are called record attributes.
As we see, the four last equations have a recursive character, and therefore the existence of a solution of our set of equations is not evident. However, such a solution exists and is (in a sense) unique^{12} which may be proved on the ground of the theory of chaincomplete partially ordered sets (Sect. 2.7 of [16]).
 1.
all “executable” data are restricted in their size—this is formalized be introducing a universal predicate oversized defined for all data,
 2.
for any given list or array all its elements are of the same type (see “Composites, transfers, yokes, types and values”),
 3.
the domain of each array must be of the form {1, …, n}, i.e., must be a set of consecutive positive integers starting from 1.
The constructors of data are defined in such a way that all reachable data satisfy the above restrictions. This technique allows keeping our domain equations relatively simple.
Composites, Transfers, Yokes, Types, and Values
Every data in Lingua has a type. Types describe properties of data, but represent entities which can be constructed and modified independently of data. Our mechanism of types allows programmers to define their own types for future use either in defining new types or in declaring variables.^{13}
Types are pairs consisting of a body and a yoke. Every type is associated with a set of data of that type called the clan of the type.

[Ch  name/(‘word’),

fa  name/(‘word’),

award  years/(‘A’, (‘number’)),

salary/(‘number’),

bonus/(‘number’)].
The words on the lefthand side of semicolons are attributes. The first two attributes and the last two have simple bodies, whereas the third one—an array body. For the sake of further discussion, the body defined above will be referred to as employee.
With every body bod, we associate a set of data with that body called the clan of that body and denoted by CLANBo.bod. The function CLANBo is defined inductively relative to the structure of bodies. E.g., the set CLANBo.employee contains records with numbers, words, and onedimensional number arrays assigned to the respective attributes.

dat : CLAN  Bo.bod
Composites are the results of dataexpression evaluations (“Data expressions”). The use of composites permits to describe the mechanism of checking if the arguments “delivered” to an operation are of appropriate types. E.g., if we try to put a word on a list of numbers, the corresponding operation will generate an error message.
Having defined composites, we can define transfers and yokes. Transfers are oneargument functions that transform composites or errors into composites or errors, and yokes are transfers with Boolean composites as values. By a Boolean composite, we mean (tt, (‘Boolean’)) or (ff, (‘Boolean’)). Yokes may also assume abstract errors as values.

record. salary + record. bonus < 10000.

record .salary + record .bonus
Yokes have been introduced into Lingua to describe SQLintegrity constraints (for details see Sect. 12 of [16]).

CLAN  Tr.tra = (com  tra.com = (tt, (‘Boolean’))}.
Of course, the clans of transfers which are not yokes are empty. By TT we denote the transfer that yields (tt, (‘Boolean’)) for any composite.

CLAN  Ty.(bod, tra) = {(dat, bod)  dat : CLAN  Bo.bod and (dat, bod) : CLAN  Tr.tra}
The last concept associated with data and types is value. A value is a pair (dat, typ), i.e., (dat, (bod, tra)), which we sometimes write as ((dat, bod), tra). As we see, a value may be regarded either as a pair data type or as a pair compositetransfer.
For technical reasons, we also allow pseudovalues of the form (Ω, typ), where Ω is an abstract object called a pseudodata.
Values are assigned in memory states to the identifiers of variables. Variable declarations assign pseudovalues to variables, and initializing assignments replace Ω by a data.
As we are going to see, an assignment instruction—i.e., an instruction that assigns values to identifiers (see “Instructions”)—may only change the data assigned to a variable, and in some special cases its body, but never its yoke. To change a yoke, we use special yokeoriented instruction.
 1.
to define a type of a variable when it is declared, and to assure that this type remains unchanged (with some exceptions)^{16} during program executions,
 2.
to ensure that a data which is assigned to a variable by an assignment is of the type consistent with the declared type of that variable,
 3.
to ensure that a similar consistency takes place when sending actual parameters to a procedure or when returning reference parameters by a procedure,
 4.
to ensure that in evaluating an expression, an error message is generated whenever data “delivered” to that expression are of an inappropriate type, e.g., when we try to add a word to a number or to put a record to a list of arrays.
Expressions in General

(env, (vat, err)) where err : Error  {‘OK’}.
The denotations of data expressions are partial functions, due to the fact that data expressions may include functionalprocedure calls.^{19}
The fact that denotations of transfer expressions are just transfers, rather than functions from states to transfers, is a consequence of the fact that in our model transfers cannot be “stored” in states, as it is in the case for data and types. This is, of course, an engineering decision rather than a mathematical must. It has been assumed only for the sake of simplicity.
Data Expressions
Data expressions are evaluated to composites or errors. With every operation on data, we associate two constructors: of dataexpression denotations and of data expressions. In this way, we define two mutually similar algebras and a homomorphism between them. This homomorphism is unique, is implicit in the definitions of both algebras, and constitutes the semantics of data expressions. This section contains just one example of a syntactic constructor and of the corresponding semantic clause.

( DatExp / DatExp )

(dae  1/dae  2)

If the input state carries an error, then this error becomes the final result of the computation.

Otherwise, we evaluate both component expressions, and if one of these evaluations does not terminate, then (of course) the whole computation does not terminate.

Otherwise, we check the bodies of both resulting composites and if one of them is not (‘number’), then an appropriate error is generated.

Otherwise, we check if the second argument of the division is zero, in which case an error is generated.

Otherwise, we check if the result of the division is not oversized in which case an error is generated.^{20}

Otherwise, the result of division becomes part of the resulting composite.
Transfer expressions
273  ―  the resulting composite is (273, (‘number)) independently of the current composite 

record. price  ―  if the current composite carries a record with an attribute price, its body (‘number’) and its data dat, then the resulting composite is (dat, (‘number’)), and otherwise is an error 
all  list number ee  ―  this is a yoke; if the current composite does not carry a list, then an error is generated, otherwise, if it is a list of numbers then the resulting composite is (tt, (‘Boolean’)), and otherwise, it is (ff, (‘Boolean’)) 
record. price + record. vat < 1000  ―  this is a yoke; if the current composite does not carry an appropriate record, then error; otherwise, if the sum of data assigned to price and vat is less than 1000, then (tt, (‘Boolean’)), and otherwise (ff, (‘Boolean)) 
Now, let us consider a transfer expression with the asyntactic scheme
all  list tre ee.
 1.
If the current composite is an error, then the result is this error.
 2.
Otherwise, if the current composite does not carry a list, then an error is signalized.
 3.
Otherwise, the transfer Stre.[tre] is applied to composites created from the data dati of the list and the “internal body” bod of the list. Notice that lists carry data, rather than composites.
 4.
If one of these composites is an error, then the first such an error is the result of the computation.
 5.
If one of these composites is not a Boolean composite, then an error is generated.
 6.
If all resulting composites are (tt, (‘Boolean’)), then the resulting composite is (tt, (‘Boolean’)); otherwise, it is (ff, (‘Boolean’)).
Type expressions
Type expressions evaluate to types or errors. E.g., the denotation of the type expression:
ee
is a function on states that creates a record type or generates an error. This expression refers to two builtin types word and number and one userdefined type number  array (arrays of numbers).
Now, consider an example of a syntactic scheme of an expression that creates a oneattribute record type:
record  type ide as tex ee
 1.
If the input state carries an error, then this error becomes the result of the computation.
 2.
Otherwise, we compute the type defined by tex, and if it is an error, then this error becomes the result of the computation.
 3.
Otherwise, the resulting type is the record type ((‘R’, [ide/typ]), TT).

expandrecordtype tex1 at ide by tex2 ee,

replace  transfer in tex by tre ee
The Concrete Syntax of Expressions
In the syntax of type expressions number and word denote themselves, i.e., the names of simple types.
The Colloquial Syntax of Expressions
The imperative Layer of the Language
Expressions of all types belong to an applicative layer of Lingua. Their denotations use states as arguments, but neither create them nor change. The latter tasks are performed by instructions, variable declaration, procedure and function declarations and by type definitions. All of them belong to an imperative layer of the language.
Some Auxiliary Concepts
Two new metapredicates are necessary to define the semantics of the imperative layer of our language.

is  error : State \(\mapsto\) {tt, ff}
returns tt whenever a state carries an error.

bod  1 coherent bod  2
 1.
bod1 = bod2 or
 2.
they are record bodies, and one of them results from the other by adding or by removing an attribute.

◄ : State \(\mapsto\) State

(env, (vat, err)) ◄ error = (env, (vat, error)).
Instructions

ind : InsDen = State → State

Sin : Instruction \(\mapsto\) InsDen
Contrary to expression denotations which may generate an error, instruction denotations write an error into the error register of a state. The denotations of the majority of instructions are transparent relative to errorcarrying states, i.e., they do not change such a state but only pass it to the subsequent parts of the program. However, an error may also cause an errorhandling action (see Sect. 6.1.8 of [16]).
The basic instruction is, of course, an assignment of a value to a variable identifier. The syntactic scheme of an assignment is:
ide : = dae
 1.
If an input state carries an error, then this state becomes the output state.
 2.
Otherwise, if the identifier ide has not been declared, i.e., if no value or a pseudovalue has been assigned to it in the valuation val, then an error message is loaded to the error register.
 3.
Otherwise, if an attempt to evaluate the data expression leads to an infinite execution, then (of course) the executions of the instruction is infinite as well.
 4.
Otherwise, if the expression evaluates to an error, then this error is loaded to the error register of the state.
 5.
Otherwise, if the transit applied to the new composite returns an error, then this error is loaded to the error register.
 6.
Otherwise, if the composite computed from the expression has a body noncoherence with the body of the identifier’s type, then an error is loaded to the error register.
 7.
Otherwise, if the composite computed by the transit is not Boolean, i.e., if the transit was not a yoke, then an error is loaded to the error register.
 8.
Otherwise, if the yoke is not satisfied, then an error message is loaded to the error register.
 9.
Otherwise, the new value is the new composite and the current (i.e., not changed) yoke, and this new value is assigned to the identifier ide.
Notice that as a consequence of claim 6, together with the definition of the coherence of bodies (“Some auxiliary concepts”, an assignment may change the body of a value assigned to a variable only if this body is a record, and only by adding or by removing an attribute to/from that record.
 1.the replacement of a yoke assigned to a variable by another one

yoke ide : = tre,

 2.the empty instruction

skip,

 3.the call of an imperative procedure

call ide ( ref aparr val aparv)
where aparr and aparv are lists (maybe empty) of identifiers called, respectively, actual reference parameters and actual value parameters,

 4.the activation of an error handling

if dae do ins fi ,

 5.the conditional composition of instructions

if dae then ins1 else ins2 fi ,

 6.the loop

while dae do ins od ,

 7.the sequence of instructions

ins1; ins2.

In the yokereplacement instruction, the new value of the identifier ide gets the old composite but a new transfer. This transfer must be satisfied with the current composite.^{21}
The empty instruction skip is needed to make functionalprocedure declarations sufficiently universal; this will be seen in “Procedures”.
The discussion of procedures is postponed to “Procedures”.
The error handling is activated if the current state carries an error, i.e., a word that is equal to the word that the dataexpression dae evaluates to. If this happens, the “internal” instruction ins is executed for a state that results from the initial state where the current error has been replaced by ‘OK’.^{22}
The semantics of the three remaining categories of instruction is as usual, except that in 5 and 6 an expression may generate an error message. In such a case that error is stored in the error register of the state.
Variable Declarations and Type Definitions

vdd : VarDecDen = State \(\mapsto\) State

let ide be tex tel

the composite of the value by an assignment instruction,

the yoke of the value by a yoke replacement.

set ide as tex tes

tdd : TypDefDen = State \(\mapsto\) State
with the difference that instead of assigning a pseudovalue to a variable identifier in a valuation, they assign a type to a typeconstant identifier in a type environment.
An identifier that is bound to a type in a state is called a type constant. Notice that we call it “a constant” rather than “a variable”, since a type once assigned to an identifier cannot be changed in the future (an engineering decision).
Similarly to the case of assignments, also type definitions and variable declarations may be combined sequentially using a semicolon constructor.
Procedures

apa : ActPar = ()  Identifier  ActPar × ActPar.

fpa : ForPar = ()  Identifier × TypExpDen  ForPar × ForPar.
Returning to procedures, notice that we do not talk here about “procedure denotations” but about “procedures” as such, since they are purely denotational concepts. In other words, they do not have syntactic counterparts. At the level of syntax, we have only procedure declarations and procedure calls which, of course, have their denotations.

( val age, weight as number, name as word, ref patient as patient  record)
Expressions different from singleidentifier expressions are not allowed as value parameters since such a solution would complicate the model as well as programconstruction rules (an engineering decision).
A call of a functional procedure declared in this way first executes the program pro and then evaluates the data expression dae in the output state of the program. If the composite generated by that expression is of the type defined by the type expression tex , then this composite becomes the result of the call of the function. Otherwise, an error is signalized.
In particular, the program in a functionalprocedure declaration may be the trivial instruction skip —which “does nothing”—and the exporting expression may be a single identifier.
Notice that the second call has no reference parameters since functional procedures do not have any side effects—they do not modify a state (an engineering decision).
All types and procedures defined in the hosting program before (see “Procedures”) the declaration of a procedure are visible in the body of this procedure and therefore do not need to be passed as parameters (an engineering decision).
In the version of Lingua described in the present paper, procedures cannot take other procedures as parameters. However, it is shown in Sect. 7.6 of [16] how to construct a hierarchy of procedures that can take procedures of lower rank as parameters. This construction protects procedures from taking themselves as parameters which would lead to denotational models that cannot be defined within naïve set theory (a mathematical decision).
The Execution of a Procedure Call
In the description of procedure mechanisms, we use some concepts having to do with the fact that procedures are created when they are declared and are executed when they are called. In respect to that, we shall talk about states (and their components) of a declarationtime and of a calltime respectively.^{24} Traditionally by a procedure body, we mean the program that is executed when a procedure is called.
As has been already announced in the Introduction, there are no global variables in procedures (an engineering decision).^{25} The intention is that the head of a procedure call describes explicitly and completely the communication mechanisms between a procedure and the hosting program. That solution may seem restrictive but—in my opinion—guarantees a better understanding of program functionality by programmers and definitely simplifies programconstruction rulers.
 1.The inspection of an initial global state—that state consists of:
 (a)
an initial global environment envig,
 (b)
an initial global store stoig = (vatig, err).
If err ≠ ‘OK’, then the initial global state is returned by procedure call and therefore becomes the terminal global state. In the opposite case, an initial local state is created.
 (a)
 2.The creation of an initial local state—that state consists of:
 (a)
initial local environment envil created from the declarationtime environment by nesting in it the called procedure; this nesting is necessary to enable recursive calls,
 (b)
initial local valuation vatil covering only formal parameters with assigned values of corresponding actual parameters; to get the latter values, we refer to initial global valuation valig.
 (a)
 3.The transformation of the local initial state by executing the procedure body. If this execution terminates, then the local terminal state consists of:
 (a)
terminal local environment envtl,
 (b)
terminal local store stotl = (valtl, errtl).
If errtl ≠ ‘OK’, then a global terminal state is created from the initial global state by loading to it errtl. Notice that in this case, the terminal local environment and the terminal local store are “abandoned”. Otherwise, the terminal global state is created.
 (a)
 4.
The creation of the terminal global state—that state consists of:
initial global environment envig; notice that terminal local environment envtl is “abandoned”,
terminal global store stotg created from initial global store stoig by “returning” to it the values of formal referential parameters (stored in stotl) and assigning them to the corresponding actual referential parameters.
Notice that the initial local environment “inherits” all types and procedures from the declarationtime environment. The procedure body may use its own local environment types and procedures, but after the completion of the call they cease to exist, since the hosting program returns to the initial global environment.
It is to be underlined that the procedure body may access only that part of the environment which was created before the procedure declaration.
Of a similar character is the local valuation that is created only in procedure execution time, although in this case the values or reference parameters stored in it are eventually returned to the terminal global valuation.
 1.
the only variables visible in the procedure body are formal parameters plus variables local to the body (declared in it),
 2.
the only types and procedures visible in the procedure body are declarationtime types and procedures plus locally declared ones,
 3.
variables, types, and procedures declared in the procedure body are not visible outside of the procedure call.
All these choices are not mathematical necessities, but pragmatic engineering decisions dictated by the intention of making our model relatively simple which should contribute to the simplicity of program construction rules and to a better understanding of programs by language users.
Procedures in Lingua may call themselves recursively either directly or indirectly. At the level of semantic clauses, this leads to recursive definitions of the denotations of procedure declarations. For formal definitions, see Sect. 7.3.2 in [16].
Preambles and Programs

begin  program pam; ins end  program
where pam is a preamble.

Spr.[pam; ins] = Spre.[pam] · Sin.[ins]

Spre.[ipd] = Sipd.[ipd]

Spre.[mpd] = Smpd.[mpd]

…

Spre.[pam  1; pam  2] = Spre.[pam  1] · Spre.[pam  2]

the semantics of preambles applied to imperativeprocedure declarations coincide with the semantics of such declarations,

the semantics of preambles applied to multiprocedure declarations coincide with the semantics of such declarations,

…

the denotation of a sequential composition of preambles is a sequential composition of their denotations.
Programs with the trivial preamble skip —if executed “without a context”—will always generate an error, unless they (the programs) are the skip themselves. Such programs are allowed because they may appear in procedure declarations as the bodies of procedures without locally declared objects. In turn, programs with trivial preambles and instructions are allowed in the declarations of functional procedures.^{26}
The Carriers of Our Algebra of Denotations
LinguaSQL
General Assumptions About the Model
 1.
new data domains corresponding to databases, tables, rows, and specific SQL data,
 2.
new constructors defined on these domains.
Data, Bodies, and Composites
So far, values in Lingua consisted of a composite and a transfer. This principle is kept in LinguaSQL for values carrying simple data, rows and tables, but in the case of databases, values are records of tables supplemented by graphs of subordination relations (“Database values”).
In LinguaSQL lists, records and arrays do not carry rows, tables, and databases, and table fields do not contain lists, records, and arrays. On the other hand, the extended repertoire of simple SQL values is available for the constructors of lists, records and arrays.

sda : SimData = Boolean  Number  Word  Date  Time  DateTime  {⊖}
All former constructors with simple data as arguments—e.g., that add a new attribute to a record—are extended in an obvious way to the new domain.
To include rows and tables with empty fields in our model, we introduce an empty data ϴ.^{27} This data will never appear as a value of an expression and will never be assigned to a variable.
With the extended set of simple data, we can extend the set of corresponding operations, e.g., by allowing to add a number to a date. I do not define such operations explicitly assuming that their class is a parameter of our model.
The subcategories of numbers such as INTEGER, SMALLINT, BIGINT, DECIMAL(p, s), or of words CHARACTER(n), CHARACTER VARYING(n), BLOB, will correspond to yokes rather than to types.
At the level of domain equations, tables may contain rows of different length and different attributes. However, such tables will not be reachable in the algebra of composites.
Data bases do not appear at the level of data. They are defined only at the level of values (“Database values”).

sbo : SimBody = {(‘Boolean’), (‘number’), (‘word’), (‘date’), (‘time’), (‘date  time’)}
As one can guess from these definitions, the composites of rows in a table will have a common body. The row contained in a table body carries the information about default data for columns. Its list of attributes must coincide with the list of the attributes of the corresponding row body. This property will be insured by tablebody constructors.
The domain BodyE is extended by new simple bodies and the bodies of rows and tables.
The function CLANBo from Lingua is extended in an obvious way on row bodies. In the case of table bodies, we assume that each row of a table must have an appropriate record structure and that in each field with a nonempty default value there is a nonempty value. Of course, it does not need to be a default value. The latter are used when adding to a table a new row or a new column.
We assume that the empty table—a table with an empty tuple of rows—belongs to the clan of every table body.

com : SimCom = {(dat, bod)  (dat, bod) : CompositeE and bod : SimBody}
The Subordination of Tables
Subordination relations describe the binary relationships that can hold between tables. Let then A and B be tables and let ide be an attribute that appears in both of them. Let A.ide and B.ide be the corresponding columns in these tables.
We say that A is subordinate to B at ide or that A is a child and B is a parent that we write as
A sub[ide] B
 1.
an idecolumn appears in both tables; the identifier ide is called the subordination indicator,
 2.
the column B.ide is repetitionfree,
 3.
the column A.ide contains only the data that appear in B.ide.

sgr: SubGra = Sub.(Identifier × Identifier × Identifier)
About the subordination graphs, we assume only that idec ≠ idep, although such graphs may contain cycles. Notice also that there may be many edges starting in one node (one child may have many parents), and many edges may end in one node (many children may have a common parent).
Transfers
Types—as we understand them in this paper—are mentioned in many SQL manuals only in the context of simple data and even in that case in a very unclear and incomplete way. The types of tables are implicit in table declarations, and the types of rows, columns, and databases are totally absent. In table declarations, the descriptions of bodies are mixed with the description of yokes, and with database instructions, and are called integrity constraints.^{29}
Unfortunately, in none of the SQL manuals known to me (their list is given in the preamble to Sect. 11 of [16]), I have found a complete description of integrity constraints. Although all of them have a certain common part, besides that part, each manual offers different ideas. In this situation, I decided to construct such a model of SQL types that would cover a “sufficiently large” spectrum of types that appear in SQL applications.

the yokes referring to their tables,

subordination graphs which are only seen at the level of values.
We assume that in LinguaSQL we have all sofardefined transfer constructors, and in particular—Boolean constructors. New constructors will generate transfers on new simple composites—these are regarded as the parameters of our model—plus row and table transfers.
The row transfers are analogous to record transfers of Lingua. Table transfers split into two classes.
The first contains quantified table yokes which describe table properties by row yokes that should be satisfied for all rows of a table.
We create a tuple of composites col using a constructor Cc[getcofromtb] which selects a column from a table (for a formal definitions see Sect. 12.2.7 of [16]). This is a column assigned to the attribute ide. Then we check if this tuple satisfies a universal predicate norepetitions. The created column does not contain the element that corresponds to the row of default values.
Since we have Boolean constructors among the constructors of yokes, we can use them to construct yokes that express properties of several columns of a table and all of its rows. Notice that contrary to the SQL standard the properties of columns and rows may be combined by arbitrary Boolean constructor rather than by conjunction only.^{30}
Types

Identifier

Transfer

CompositeE

TypeE,
 1.
new transfer constructors (“Transfers”),
 2.
selected constructors of row composites needed to construct the rows of default values,
 3.
three type constructors: of creating a oneattribute row, of adding an attribute to a row, and of creating table type.
Row types are created similarly to record types with the difference that now the added type must be simple.
Database Values

to make a database accessible in a program, its tables must be assigned to variable identifiers in the current valuation,

in every state its valuation carries tables of only one database; this database is called the active database.

RowVal = {(com, tra)  sort.com = ‘Rq’ and tra.com = (tt, (‘Boolean’))}

TabVal = {(com, tra)  sort.com = ‘Tq’ and tra.com = (tt, (‘Boolean’))}.

dbr : DatBasRec = Identifier \(\Rightarrow\) TabVal
Of course, database records are not records in the sense of “The data”, but only in a settheoretic sense.
We say that a database record dbr satisfies the subordination relation identified by a subordination graph sgr, in symbols
dbr satisfies sgr,

(com  c, tra  c) = dbr.ide  c

(com  p, tra  p) = dbr.ide  p

comc sub[ide] comp.

dbv : DbaVal = {(dbr, sgr)  dbr satisfies sgr}.
We may say that for database values, the role of a yoke is played by the predicate satisfies. Notice, however, that since a database record caries table values, the tables of the database satisfy their own yokes.
States
 1.
simple SQL data,
 2.
rows,
 3.
tables,
 4.
databases.
Of course, database values are not values, as defined formerly, since they are not composed of a data and a type. The type of a database is implicit in the types of its tables and in the subordination graph.
In every state several databases may be stored, i.e., assigned to identifiers, but only one base may be active at a time, i.e., the tables of only one base may be assigned to identifiers in valuations.
 sb  graph

that binds the subordination graph of the active base in the environment,
 copies

that binds a finite sets of table names (identifiers) in the valuation,
 monitor

that binds one table in the valuations (the table displayed on a monitor),
 check

that binds words ‘yes’ and ‘no’ in valuations.
Their role will be explained later. So far, we assume only that they cannot be used as identifiers of variables, of type constants, and of procedures. The identifier check is called the security flag.
The signature of the algebra of denotations of LinguaSQL is an extension of the signature of Lingua (“The carriers of our algebra of denotations”) by new constructors. The carriers change due to new SQL values and SQL types.
Denotations and Their Constructors
The subalgebra of expression denotations of all types is analogous to that in Lingua.

trd : TrnDen = State \(\mapsto\) State.
Transactions are regarded as a separate carrier of the algebra of denotations to avoid the use of arbitrary table instructions in the contexts of transactions.

ide : = table  expression(ide)
 1.
row assignments,
 2.
table assignments,
 3.
database instructions.
 1.
tablemodification instruction where on both sides of the assignment we have the name of the same table; this group of instructions comprise the mechanisms known as CASCADE and RESTRICT,
 2.
tablecreation instruction where on the lefthand side of the instruction we may have a different table name (of the table that is being created) than on the righthand side.
From a mathematical perspective, the first category may be regarded as a particular case of the second, but denotationally they correspond to two different constructors of the algebra of denotations and hence also to different constructors of the algebra of syntax.
 1.
conformist instructions where an execution terminates with an error message whenever it would lead to a violation of subordination constraints; this category corresponds to the option RESTRICT,
 2.
correcting instructions which in the described situation introduce such changes into a database that guarantee the protection of subordination constraints; this category corresponds to the option CASCADE.
Queries are similar to simple instructions with the difference that they always create a new table assigned to the systemidentifier monitor. Consequently, we apply simplified assignments assignmo that never violates any constraints since the transfer of the new value is TT. Of course, I skip here the whole mechanism of displaying and manipulating monitors.
Cursors are mechanisms used to retrieve one row after another from tables. In our model that can be easily defined, e.g., by adding a column to a table that enumerates its rows.
Views are essentially procedures that call table instructions. They may be introduced to our model either as predefined instructions or by providing programming mechanisms of procedures that operate on tables.
With these assumptions, each database program in LinguaSQL that operates on tables either has to create its own tables—and a database thereof—or to activate an already existing database. In LinguaSQL we have therefore only two database instructions that operate on tables—activate and archive—and two that operate on subordination graphs, which add or remove an edge of a graph.
An Example of a Colloquial Syntax
The colloquial syntax of LinguaSQL should be as close as possible to SQL standard. Below just one example of restoring a standard tablevariable declaration—which in LinguaSQL belongs to colloquial syntax—into its corresponding concretesyntax form.

create table Employees as typ_exp ed;

set reference of Employees et Department_Id to Departments ei

tabletype dat_exp with tra_exp ee
Of course, varchar(20), varchar(9) ,… are the names of appropriate predicates. Notice that in this example one “syntax unite” from the colloquial level is transformed into a sequential composition of a declaration with an instruction.
Remarks About a Possible Implementation of LinguaSQL
Typical Application Programming Interfaces (API) for SQL have been created for programming languages such as, e.g., C, PHP, Perl, and Phyton. Each of these programming environments constitutes a programming language equipped with the mechanisms that allow to run procedures of a certain existing database engine. In the case of LinguaSQL, such a situation would not be acceptable. Our language must be based on a dedicated SQL engine with a denotational model, and in the future, maybe, with a dedicated implementation. Such an approach is necessary, if we want to provide sound programconstruction rules for LinguaSQL. But, of course, its functionality and syntax should conform—as much as possible—to some concrete standard of SQL. It is also worth mentioning that LinguaSQL is a strongtype language, whereas SQL is not very much so.
What Remains to be Done
Even though [16] is already of a considerable size, the majority of subjects has been only sketched. Below is a preliminary list of subjects which could be developed further. This list is certainly not complete.
The Development of Lingua
 1.An extension of Lingua to some “practical” language, say Linguaα, where preliminary programming experiments could be performed. Such a language should cover in particular:
 2.The development of tools for correct programs’ development in Linguaα:
 2.1.
The extension of the languages of conditions and thesis sketched in Sect. 8 of [16].
 2.2.
Sound programconstruction rules for the extended language.
 2.1.
 3.
A user manual for Linguaα. This task could also contribute to a methodology of writing programmer’s manuals for languages with denotational semantics.^{31}
 4.A programmer’s environment for Linguaα:
 4.1.
An interpreter or a compiler. To make this interpreter/compiler maximally independent of possible errors in the language used to build it, some basic core could be coded in such a language (e.g., in Python), and the remaining part may be written using this basic core. This could also be the first experiment in using our language.
 4.2.
An editor of programs supporting the construction of correct programs with the use of earlier developed program construction rules (see “Manysorted algebras”)
 4.3.
An adaptation of an existing theorem prover for proving metaconditions (the properties of conditions) described in Sect. 8.4.2 of [16] which is necessary for the use of programconstruction rules.
 4.1.
 5.Preliminary experiments with programming in Linguaα:
 5.1.
Microprograms due to their relatively small volume and a very critical correctness issue.
 5.2.
Simple SQL applications due to a restricted availability of SQL tools in Linguaα at least at the beginning of the project.
 5.1.
This is, of course, only a preliminary sketch of a project which—in the case of realizations—would probably be modified and further developed.
The last problem which should be mentioned here is the dependency of the correctness of Lingua compiler of the correctness of the compiler of the language, say Python, in which Lingua compiler would be written. To solve that problem completely one should validate all the software layers between Lingua code and a machine code. At the moment this would be an obviously unrealistic task. So far, we cannot free ourselves from errors lying “below” the Lingua code. On the other hand, it seems true that very many errors in programs are made at the level of the source code. Such error may be—at least to some extent—eliminated in Lingua.
The Development of a Software Environment for Language Designers
 1.
An editor of the definitions of denotations’ constructors.
 2.
A generator of the grammar of abstract syntax from such definitions.
 3.
An editor supporting language designers in developing concrete syntax grammar from abstract syntax grammar.
 4.
An editor/generator of a transformation restoring colloquial syntax to abstract syntax.
 5.
A generator of a parser from colloquial syntax to abstract syntax.
 6.
A generator of an interpreter of the language.
If such an environment is created before Linguaα, it could be used in the creation of that language.
Two Basic Research Problems
Independently of the tasks mentioned above, two important research problems are worthy of consideration.
The first concerns the extension of our model by the mechanisms of concurrency. Fully denotational models of concurrence are not known today, although there are some attempts to form “semidenotational” models of these mechanisms, as, e.g., in [2].
The second problem has not been probably tackled at all and concerns the construction of semiformal languages for the description of useroriented specifications of programs. So far, all approaches to program correctness—including mine—concentrate on the compatibility of program code with its formal specification. It does not exhaust the reliability problem in the IT industry, because many problems are due to poor communication between a designer of a system and its user. Most probably, many areaoriented languages of specifications would be needed.
Footnotes
 1.
Except subordination relations which are described by a different mechanism.
 2.
There exist mathematical semantics of concurrency which can be said to be only “partially denotational”. An example of such a solution is a “componentbased semantics” (cf. [2]), where the denotations of programs’ components are assigned to programs in a compositional way (i.e., the denotation of a whole is a composition of the denotations of its parts), but the denotations themselves are socalled fucons whose semantics is defined operationally.
 3.
I am convinced that the first problem is equally fascinating as the second. I would very much welcome any initiative of a cooperation in this field.
 4.
Developed in the decade 1980–1990 in the Institute of Computer Science of the Polish Academy of Sciences by a team which I had a honor to chair.
 5.
Metavariables and metaconstants are objects of the metalanguage MetaSoft, whereas variables and constants are objects of the programming language Lingua.
 6.
For simplicity, I use here the same symbol “+” to denote a constructor of expressions and a syntactic symbol of addition.
 7.
Here, nexp1, etc. are written in Arial since they are metavariables that run over syntactic elements, i.e., expressions.
 8.
In the sense of a componentwise inclusion.
 9.
Informally speaking, a partiality of a function F is computable if we can write a procedure which given an arbitrary tuple d_{1}, …, d_{n} of arguments of F will check if F.(d_{1}, …, d_{n}) is or is not defined. E.g., for an array expression arr[k], we can check if the index k belongs to the index range of the array arr. From the general theory of computability we know, however, that there exist functions with noncomputable partialities.
 10.
The suffix “m” stands for “McCarthy” and is used to distinguish McCarthy’s operators not only from classical ones, but also from the operators of Kleene, which are used in SQL.
 11.
Formally, this means that the algebra of concrete syntax is not more ambiguous than the algebra of denotation which guarantees the existence of a unique homomorphism between them (see Sect. 2.13 of [16]).
 12.
It is unique in the sense that by the solution of such an equation, we mean its least solution where the ordering is the componentwise settheoretic inclusion.
 13.
Technical details in Sect. 5.2 of [16].
 14.
This is again a recursive equation (as it was the case of datadomain equations) and again its unique solution exists.
 15.
They “are closed to predicates” rather than simply “are predicates” since they assume as values composites and abstract errors rather than just Boolean values tt and ff. Their logical constructors and, or and not are the threevalued constructors of John McCarthy’s calculus defined by “Threevalued propositional calculus”.
 16.
These exceptions take place, e.g., when we add a new attribute to a record or to a database table or if we remove such attribute.
 17.
The metavariable running over valuations is “vat”, since “val” has been reserved for values.
 18.
The domains Procedure and Function are defined in “Procedures”.
 19.
Functional procedures may loop indefinitely and since this is not a computable property, we cannot expect to have an error message in that case.
 20.
In our definitions, this part of the procedure is described in an abstract way, but the implementation does not need to preform it literarly, i.e., by first dividing the given numbers and only then checkig, if that was possible. In an implementation a programmable solution should be chosen.
 21.
This instruction has been introduced mainly for the sake of SQL tables discussed in [16].
 22.
For details see Sect. 6.1.8 of [16].
 23.
The fact that procedures transform stores rather than states is a technique (introduced in [17]) that allows to define recursion in avoiding the selfapplication of procedures, i.e., a situation where a procedure takes itself as an argument. Of course, procedure calls are instructions and therefore they transform states into states.
 24.
These ideas, similarly to a few others, have been borrowed from Gordon [22].
 25.
If we would like to introduce global variables, we should define the local store of a procedure call as a modification of its global store.
 26.
Both these solutions, although in a slightly different form, have been suggested to me by Andrzej Tarlecki.
 27.
Notice that ϴ, which is assignable to fields of rows and tables, is different from Ω which is assigned to a variable at the declaration time.
 28.
Notice that since the set Identifier is finite, each subordination graph is finite as well.
 29.
For a justification of this criticism see Sect. 11 of [16].
 30.
To say the truth, I am not sure if such a generalization has a practical value.
 31.
Denotational models should provide an opportunity for the revision of current practices seen in the manuals of programming languages. New practices should on one hand base on denotational models, but on the other not assume that today’s readers are acquainted with it. A manual should provide some basic knowledge and notation needed to understand the definition of a programming language written in a new style. At the same time—I strongly believe on that—it should be written for professional programmers rather than for amateurs. The role of a manual is not to teach the skills of programming. Such textbooks are, of course, necessary, but they should tell the readers what the programming is about, rather than the technicalities of a concrete language. An experiment in writing a user manual of Lingua is described in [15].
Notes
Compliance with ethical standards
Conflict of interest
The author declares that he has no conflict of interest.
References
 1.Ahrent W, Beckert B, Bubel R, Hähnle R, Schmitt PH, Ulbrich M, editors. Deductive software verification—the key book; from theory to practice. Lecture notes in computer science, 10001. New York: Springer; 2016.Google Scholar
 2.Binsbergena LT, Mosses PD, Sculthorped CN. Executable componentbased semantics. J Logical Algebraic Methods Program. 2019;103:184–212.MathSciNetCrossRefGoogle Scholar
 3.Bjørner D, Jones BC. The Vienna development method: the metalanguage. Upper Saddle River: Prentice Hall International; 1982.Google Scholar
 4.Bjørner D, Oest ON, editors. Towards a formal description of Ada. Lecture notes of computer science, 98. New York: Springer Verlag; 1980.Google Scholar
 5.Blikle A. Algorithmically definable functions. A contribution towards the semantics of programming languages, Dissertationes Mathematicae, LXXXV, PWN, Warszawa. 1971.Google Scholar
 6.Blikle A. Equational languages. Inf Control. 1972;21(2):134–47.MathSciNetCrossRefGoogle Scholar
 7.Blikle A. Toward mathematical structured programming, formal description of programming concepts. In: Neuhold EJ, editors. Proc. IFIP Working Conf. St. Andrews, N.B., Canada 1977. North Holland, Amsterdam; 1978, pp. 183–2012.Google Scholar
 8.Blikle A. On correct program development. In: Proceedings of the 4th international conference on software engineering, München, September 17–19, 1979. IEEE Computer Society, pp. 164–73.Google Scholar
 9.Blikle A. On the development of correct specified programs. IEEE Trans Soft Eng. 1981;SE7(5):519–27.CrossRefGoogle Scholar
 10.Andrzej Blikle. The clean termination of iterative programs. Acta Inform. 1981;16:199–217.MathSciNetzbMATHGoogle Scholar
 11.Blikle A. MetaSoft primer—towards a metalanguage for applied denotational semantics, Lecture notes in computer science. New York: Springer Verlag; 1987.zbMATHGoogle Scholar
 12.Blikle Andrzej, Denotational Engineering or from Denotations to Syntax, red. D. Bjørner, C.B. Jones, M. Mac an Airchinnigh, E.J. Neuhold, VDM: A Formal Method at Work, Lecture notes in Computer Science 252, Springer, Berlin 1987.Google Scholar
 13.Blikle A. Threevalued predicates for software specification and validation. In: VDM’88, VDM: the way ahead. Proceedings of 2nd, VDMEurope symposium, Dublin 1988, Lecture notes of computer science, New York: Springer; 1988, pp 243–66.CrossRefGoogle Scholar
 14.Blikle A. Denotational engineering. Sci Comput Program. 1989;12:207–53.MathSciNetCrossRefGoogle Scholar
 15.Blikle A. An experiment with a user manual based on a denotational semantics. https://doi.org/10.13140/rg.2.2.23355.67366.
 16.Blikle A, ChrząstowskiWachtel P. A denotational engineering of programming languages—to make software systems reliable and user manuals clear, complete and unambiguous. 2019. https://doi.org/10.13140/rg.2.2.27499.39201/3.
 17.Blikle A, Tarlecki A. Naive denotational semantics. In: Mason REA, editor. Information processing 83. NorthHolland: Elsevier Science Publishers; 1983.Google Scholar
 18.Branquart P, Luis G, Wodon P. An analytical description of CHILL, the CCITT highlevel language. Lecture notes in computer science, 128. New York: SpringerVerlag; 1982.CrossRefGoogle Scholar
 19.Chomsky N. Contextfree grammar and pushdown storage. Quarterly Progress Report 65, MIT Research Laboratories of Electronics, Quarterly Report. 1962. pp. 187–94.Google Scholar
 20.Ginsburg S. The mathematical theory of contextfree languages. New York: McGrawHill; 1966.zbMATHGoogle Scholar
 21.Goguen JA, Thatcher JW, Wagner EG, Wright JB. Initial algebra semantics and continuous algebras. J Assoc Comput Mach. 1977;24:68–95.MathSciNetCrossRefGoogle Scholar
 22.Gordon MJC. The denotational description of programming languages. Berlin: Springer Verlag; 1979.CrossRefGoogle Scholar
 23.Hoare CAR. An axiomatic basis for computer programming. Commun ACM. 1969;12:576–83.CrossRefGoogle Scholar
 24.Kleene SC. Introduction to metamathematics. North Holland 1952 (later republished in the years 1957, 59, 62, 64, 67, 71).Google Scholar
 25.McCarthy J. A basis for a mathematical theory of computation. In: Brawffort P, Hirschberg D, editors. Western joint computer conference, May 1961 later republished in Computer programming and formal systems. North Holland; 1967.Google Scholar
 26.Naur P (editors). Report on the algorithmic language ALGOL60. Communications of the association for computing machinery, Homepage archive Volume 3 Issue 5, 1960. pp 299–314.Google Scholar
 27.Stoy JE. Denotational semantics: the scottstrachey approach to programming language theory. Cambridge: MIT Press; 1977.zbMATHGoogle Scholar
 28.Scott D. Strachey C. Towards a mathematical semantics of computer languages. Technical Monograph PRG6, Oxford University, Oxford, 1971.Google Scholar
 29.Turing A. On checking a large routine. Report of a conference on highspeed calculating machines, University Mathematical Laboratory, Cambridge. 1949, pp. 67–69.Google Scholar
Copyright information
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.