1 Introduction: Why Cellular Automata?

DNA computing is a research field that focuses on extracting computational power from DNA molecules and their reactions. Implementing a Turing machine using DNA is a typical example of research in this field. Research to realize cellular automata using molecules such as DNA has also been actively conducted. Why are cellular automata a target of research in DNA computing?

1.1 Computation by Molecules

The first attempt to perform computations using molecules was a thought experiment in which molecules were used to make Turing machines. Research on molecular Turing machines began with Bennett’s concept of a biological computer [1]. Since then, molecular realizations of Turing machines and various other kinds of automata have been proposed repeatedly, including the recent work on Turing machines by King et al. [2]. The work by Benenson et al. is a typical example of implementing finite automata [3].

However, the computational power of a single molecule is limited because it is not easy for different parts of a molecule to interact and exchange information. Therefore, the possibility of computing with a chemical solution containing a huge number of molecules has been explored. Adleman’s research on data parallel computation by molecules sits in this context [4].

Since then, in DNA computing, research on computation by a chemical solution has progressed from logic circuits to neural networks. For example, some studies implemented logic circuits and neural networks using seesaw gates [5, 6].

In parallel with such efforts, the concept of a chemical reaction network (CRN) has been widely adopted as a unified standard model for computation using a chemical solution, and theoretical research on computation by CRN is being conducted [7].

However, the computational power of a chemical solution that is uniform in space is also limited, because information can be encoded only as concentrations of molecular species. Therefore, many researchers in the field became interested in cellular automata, a computational model that divides space into cells, and began implementing them using molecules. Historically, the self-assembly of DNA tiles, pioneered by Seeman and Winfree, was modeled using cellular automata [8]. In their model, the position where each tile is placed is considered a cell. In the process of self-assembly, each cell transitions from an empty state to a state in which it is filled with a tile.

1.2 Smart Materials

Researchers in DNA computing have continually asked this question: What are the possible applications of the computational power extracted from molecules? Among the various proposals, smart materials, i.e., those that exhibit intelligent behaviors, are considered promising [9].

If cellular automata are implemented as materials, it will become possible to realize materials with abilities such as self-organization, pattern formation, and self-repair. For example, suppose such a smart material is used to implement an artificial blood vessel. The vessel would form autonomously at an appropriate place in the living body, and it would recognize any deformation or occlusion of itself and self-repair. Note that cellular automata capable of simple self-assembly, such as those consisting of DNA tiles, are not sufficiently complex to make such smart materials.

With more sophisticated cellular automata, it is possible to envision smart materials that learn appropriate functions from external stimuli. For example, as we report later in this article, we have designed cellular automata that form a logic circuit from input–output sets in a three-dimensional (3D) cellular space.

1.3 Why Discrete?

When envisioning smart materials, such as the ones above, one may wonder why these should be discrete. It might be possible to realize the same functions in a continuous medium, for example, by taking advantage of Turing patterns and developing design principles for continuous media [10,11,12]. However, using current methods, the rational design of continuous media is challenging. Tuning a huge number of continuous parameters to produce target shapes or behaviors is a nontrivial task in general because these parameters are intricately intertwined.

By contrast, we can take advantage of a sizeable accumulation of research on cellular automata. Therefore, we consider it an appropriate approach to design a discrete model while implementing the model in a continuous medium, such as a reaction–diffusion system. In addition, as we shall see in the next section, implementation at the molecular level is naturally discrete, so this has a high affinity with cellular automata.

2 Implementation of Cellular Automata

Methods to implement cellular automata have been actively investigated in DNA computing, as we shall see below, but no optimal method has been established. Proposed methods can be classified into two groups: those at the molecular level and those using reaction–diffusion systems.

2.1 Molecular Level

Yin, Sahu, Turberfield, and Reif proposed a molecular-level implementation [13]. Each cell is represented by a strand of DNA, doubled in part, which is attached to a 1D track. Each partially double strand has a single-stranded portion that represents its state and is changed by a restriction enzyme. The entire reaction system is complex, and therefore, its implementation is very challenging.

Qian and Winfree proposed CRNs on a surface [14]. Again, each cell is represented by a partially double strand of DNA with a single-stranded portion. It makes a state transition by replacing its single strand with a new one. Two neighboring cells make state transitions at the same time. That is, by the reaction A + B → C + D, a cell in state A transitions to state C, and a neighboring cell in state B transitions to state D. The reaction A + B → C + D is a basic form of CRN called a population protocol. In other words, Qian and Winfree proposed realizing population protocols in a 2D cellular space, i.e., on a surface. Note that population protocols have been investigated as a standard model of distributed computing, and various distributed algorithms can be represented in this form.
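To make the reaction form A + B → C + D concrete, the following is a minimal sketch of a population protocol on a 2D grid. It is not the strand displacement system of [14]; the function name `surface_crn_step`, the random scheduler, the ordered reading of each pair (first cell, then neighbor), and the example rule are all assumptions made for illustration.

```python
import random

def surface_crn_step(grid, rules, rng):
    """Try one pairwise reaction at a random cell and a random neighbor.

    grid  : list of lists of state labels
    rules : dict mapping (A, B) -> (C, D), read as A + B -> C + D
    Returns True if a reaction fired, False otherwise.
    """
    rows, cols = len(grid), len(grid[0])
    r, c = rng.randrange(rows), rng.randrange(cols)
    dr, dc = rng.choice([(0, 1), (1, 0), (0, -1), (-1, 0)])
    r2, c2 = r + dr, c + dc
    if not (0 <= r2 < rows and 0 <= c2 < cols):
        return False  # the chosen neighbor lies outside the surface
    pair = (grid[r][c], grid[r2][c2])
    if pair in rules:
        # Both cells change state at the same time.
        grid[r][c], grid[r2][c2] = rules[pair]
        return True
    return False

# Hypothetical rule: state "I" spreads to a neighboring cell in state "S".
rules = {("I", "S"): ("I", "I")}
rng = random.Random(0)
grid = [["S"] * 4 for _ in range(4)]
grid[0][0] = "I"
for _ in range(20000):
    surface_crn_step(grid, rules, rng)
```

In this sketch a rule is keyed by an ordered pair, so a symmetric reaction would need both (A, B) and (B, A) entries.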

For implementing population protocols on a surface, Qian and Winfree proposed a very sophisticated system consisting of many reactions, called a strand displacement system, but it is not known whether its implementation by DNA is feasible. Despite the apparent difficulty of its implementation, theoretical research on CRNs on a surface is progressing [15].

A new design for molecular-level implementation by data parallel computation based on strand displacement was recently proposed, together with a simulation of the Rule 110 elementary cellular automaton [16].

The self-assembly of DNA tiles, mentioned above, realizes a restricted class of cellular automata in which each cell can make a transition only once. This restriction is not appropriate for smart materials that are expected to repair themselves after external disturbances, although 2D self-assembly can simulate the spatiotemporal evolution of 1D cellular automata [17,18,19].

Compared to reaction–diffusion systems, molecular-level proposals of cellular automata have advantages including their size and the speed of state transitions. However, all proposals other than those for self-assembly are based on complex reactions of DNA and are inherently difficult to implement. In addition, the size of molecular-level implementations can be a disadvantage for expressing the macroscopic behaviors needed in smart materials.

2.2 Reaction–Diffusion Systems

Proposals have also been made to implement cellular automata using reaction–diffusion systems, as discussed above. For example, in the proposal by Scalise and Schulman, a cell in cellular automata is implemented by a region with a high concentration of a molecule that is attached to the region and cannot diffuse [20]. Each cell produces a molecule corresponding to its current state, and this molecule diffuses to the neighboring cells; in this paper, we call this a signal molecule.

In the molecular robotics project led by Hagiya, Murata et al. actively attempted to implement cellular automata by a reaction–diffusion system in a gel [21, 22]. Cells were implemented with solutions separated by gel walls. Signal molecules can diffuse through gel walls while other molecules stay within each cell’s solution [23].

In this project, Hagiya started to use the phrase “gellular automata” for cellular automata that are suitable for implementation with gels. Based on the implementation mentioned above, the second model of gellular automata was proposed, as explained in the next section.

However, it is very challenging to implement reaction–diffusion systems in gel materials so that cellular automata work as expected. One serious problem is that signal molecules are not confined to the immediate neighbors of the cell that generated them but may diffuse to a neighbor of a neighbor. To overcome this, signals must be decomposed at an appropriate rate so that they do not diffuse beyond the immediate neighbors. Consequently, energy and ingredients must be constantly supplied from an external environment.

Further, if we allow an external environment to actively interact with a reaction–diffusion system, we can control diffusion via a global effect such as an electric or magnetic field or chemical gradient. Then, we can enhance models of cellular automata using such effects, as in our third model of gellular automata, proposed below, where neighboring cells can be distinguished by their directions.

3 Gellular Automata

In this article, the phrase “gellular automata” describes several models of cellular automata that we have proposed and investigated and which we intend to implement with gel materials. In this section, we briefly introduce three such models. The first is based on gel walls with holes that can open and exchange neighboring solutions. It is somewhat rudimentary in that precise control of the interactions between cells is difficult. The second model is also based on gel walls but differs in that the walls allow small molecules to diffuse. In describing the second model, we focus on the self-stability of the model. Note that this is an important property of distributed systems that is crucial for realizing smart materials. In the subsequent section, we report our ongoing work on the third model, which learns Boolean circuits from input–output sets, i.e., examples of input signals and their expected output signals.

3.1 Gellular Automata with Holes

The first model we investigated is based on gel walls separating cells of solutions. The walls are assumed to have holes that are opened and closed by the solutions that surround them [24,25,26,27].

This model is both continuous and discrete in the sense that, while concentrations of molecules are continuous with respect to time, a hole is either open or closed depending on a continuous parameter of the hole. The parameter is controlled by molecules called composers and decomposers (Fig. 1).

Fig. 1
4 illustrations of solutions separated by gel walls. The first illustration has a decomposer, which opens a hole in a wall. The third illustration has a composer which closes the hole in the wall.

Gel walls separate solutions. One wall has a hole. A molecule called a decomposer changes a parameter of the hole, and the hole eventually opens (top left to top right). When a hole is open, adjacent cells are connected, and their solutions mix. A molecule called a composer then changes the parameter in the opposite direction and the hole eventually closes (bottom right to bottom left). It is assumed that the composer is generated in the mixed solution (top right)

We demonstrated the Turing computability of the model by encoding rotary elements in it [24]. Note that it is possible to implement any reversible Turing machine using only rotary elements [28].

We conducted some preliminary experiments using polyacrylamide gels with DNA bridges. We implemented holes that could open once or close once, but we could not implement a hole that could open and close repeatedly. Since then, various kinds of DNA gels have been developed. It will be a challenge to develop DNA gels that can shrink and swell repeatedly and thus be used as valves for opening and closing holes.

3.2 Boolean Total and Non-Camouflage Gellular Automata

The second model is much simpler than the first and more suited to implementation with gels. In this model, gel walls also separate cells of solutions, but communication between them is realized by the diffusion of signal molecules, as implemented by Abe et al. [23].

With such an implementation in mind, we have defined this model as a limited class of asynchronous cellular automata [29, 30]. Each cell undergoes a state transition only if there are neighboring cells in specified states (Boolean totality). However, each cell cannot recognize a cell in the same state as itself (non-camouflage). A transition rule can be represented in the following form:

S → T (P and Q and not R and …),

which means that a cell in state S will transition to state T if there is a neighboring cell in state P and a neighbor in state Q but no neighbor in state R; due to being non-camouflaged, S must be different from P, Q, and R.

In addition, state transitions are asynchronous in the sense that each cell may or may not make a transition at each instance of discrete time.

Note that the above restrictions occur naturally in implementations of cells by solutions separated by gel walls. Each cell is assumed to transmit a signal corresponding to its current state, and therefore, a cell cannot see a neighbor in the same state. Due to Boolean totality, it is not necessary to control the direction of diffusion or consider the number of neighboring cells transmitting the same signal.
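As an illustration of the rule form S → T (P and Q and not R and …), the sketch below checks a rule against the set of neighboring states and applies rules asynchronously. The names `Rule`, `applies`, `async_step`, and the parameter `fire_prob` are hypothetical; the source defines the model only abstractly.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    source: str           # S
    target: str           # T
    required: frozenset   # states that must appear among the neighbors (P, Q, ...)
    forbidden: frozenset  # states that must not appear (R, ...)

    def __post_init__(self):
        # Non-camouflage: a cell cannot sense a neighbor in its own state.
        if self.source in self.required | self.forbidden:
            raise ValueError("non-camouflage violated: S appears in the condition")

    def applies(self, state, neighbor_states):
        # Boolean totality: only the set of neighboring states matters,
        # not how many neighbors carry a state or where they are.
        present = set(neighbor_states)
        return (state == self.source
                and self.required <= present
                and not (self.forbidden & present))

def async_step(states, neighbors, rules, rng, fire_prob=0.5):
    """One time step; each enabled cell may or may not make its transition.

    states    : dict cell -> state
    neighbors : dict cell -> list of neighboring cells
    """
    new_states = dict(states)
    for cell, s in states.items():
        seen = [states[n] for n in neighbors[cell]]
        for rule in rules:
            if rule.applies(s, seen) and rng.random() < fire_prob:
                new_states[cell] = rule.target
                break
    return new_states

rule = Rule("S", "T", frozenset({"P", "Q"}), frozenset({"R"}))
```

Here `fire_prob` stands in for asynchrony; any scheduler that eventually lets every enabled cell fire would serve the same purpose.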

This second model of gellular automata also has sufficient computational power. Turing computability was demonstrated [29, 30], as was the fact that the model can simulate population protocols [31]. This means that the distributed algorithms that population protocols can express can also be realized in the model.

Self-stability is an important property of distributed algorithms; it is essential for smart materials. If a material is implemented by gellular automata that realize a self-stable distributed algorithm, it will eventually reach a desired configuration from any initial configuration. Here, a configuration means a global state of the gellular automata, i.e., a mapping of cells to states. Even if an external disturbance damages the material, i.e., changes the states of some of its cells, it can self-repair from the damaged configuration. Therefore, we tried to show the self-stability of various algorithms realized in the model.

Self-stability consists of safety and reachability. Reachability means that it is possible to reach a desired configuration from any initial configuration; this is usually demonstrated under the assumption of fair scheduling. Safety means staying in a desired configuration forever once it has been reached.

We took the following approach to demonstrating self-stability. First, the conditions for desired configurations are expressed by the local conditions of each cell. Next, we show that state transitions of cells in a desired configuration preserve the local conditions; this establishes safety. Furthermore, we show that, with fairness, a desired configuration that satisfies the local conditions is always reached; this establishes reachability. Taking the above approach, we have proved the self-stability of the gellular automata for solving mazes (Fig. 2), two-distance coloring, and the formation of spanning trees [32,33,34].

Fig. 2
An illustration of a maze starts from the bottom right and ends toward the top left. Different shades of cells have different purposes and levels.

Self-stable gellular automata solve a maze problem. The orange cell (bottom right) is the start, and the blue cell is the goal. Black cells are walls. First, green cells spread from the goal. When they reach the start, red cells with different levels spread from it. Eventually, a single path consisting of red cells forms from the start to the goal. External disturbances can change the states of some cells. For example, some non-black cells may be changed to black ones. Provided that there is one start and one goal, a single path from the start to the goal forms again

We are currently working on 3D gellular automata. In our first attempt, we formulated an algorithm for solving maze problems in 3D. Then, we extended it to solve matching problems in which paths are formed between pairs of cells in 3D (Fig. 3). It is hoped that such paths will be able to transport substances or transmit information in smart materials. Our next target is to realize more intelligent behaviors by extending paths to circuits. This target led us to introduce our third model of gellular automata.

Fig. 3
An illustration of a 3D maze in which paths are formed between cells labeled with different values of S and T. The text, 651 steps, is given at the top.

Gellular automata for solving maze problems have been extended to solve matching problems in three dimensions. Paths are formed between Si and Ti

3.3 Three-Dimensional Gellular Automata That Learn Boolean Circuits

Recently, we formulated a 3D model of cellular automata that forms Boolean circuits from input–output sets. In this problem, Boolean input signals (0 or 1) are given to some cells (input nodes) and their states are changed. Expected Boolean output signals (0 or 1) are also given to some cells (output nodes). This forms one input–output set. More input–output sets are given to the model successively and repeatedly. For each set, a circuit gradually forms that produces the expected output signals from the input signals. In this way, supervised learning is realized as a kind of pattern formation. If smart materials gain such an ability, they will be able to learn from signals in their external environment.

A cell is placed at each lattice point in a 3D space. Each cell therefore has six neighbors in its von Neumann neighborhood. Some cells are specified as input or output nodes. Other cells are either active or inactive; if a cell is active, it works as an OR gate. Signals are fed to input nodes and transmitted from these to output nodes along the OR gates in the direction of increasing coordinates. This means that from a cell at (x, y, z), a signal can transmit to the cells at (x + 1, y, z), (x, y + 1, z), and (x, y, z + 1). For example, if OR gates are placed at (x, y, z) and (x + 1, y, z), then a signal transmits from the former to the latter. Note that this directionality prohibits loops of signals and makes the construction of circuits tractable.

However, in this way, we have abandoned Boolean totality in this model: The signals are transmitted from input nodes only in certain directions. Thus, each cell is assumed to distinguish its neighbors by direction. To implement this model with gel materials, we must provide a global effect, such as that from an electric field, to provide directionality, as mentioned above.

4 Supervised Learning of Boolean Circuits

Under our third model of gellular automata, we are currently developing an algorithm for constructing Boolean circuits consisting of OR gates.

4.1 Assumption

With given Boolean signals at the input nodes, the expected Boolean signals at the output nodes are specified as teacher signals. In other words, an example of supervised learning consists of given signals at the input nodes and expected signals at the output nodes. With each set of these, cells make state transitions and form a Boolean circuit consisting of OR gates between the input and output nodes.

For a cell at (x, y, z), the cells at (x−1, y, z), (x, y−1, z), and (x, y, z−1) are called its “fan-in” cells, and the cells at (x + 1, y, z), (x, y + 1, z), and (x, y, z + 1) are called its “fan-out” cells.
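The fan-in and fan-out relations can be written down directly. The sketch below assumes a cubic lattice with coordinates 0 to `size` − 1 in each dimension, so that cells on the boundary have fewer than three fan-in or fan-out cells; the lattice bound is an assumption, not something the definitions above specify.

```python
def fan_in(cell, size):
    """Cells from which `cell` can receive a signal."""
    x, y, z = cell
    candidates = [(x - 1, y, z), (x, y - 1, z), (x, y, z - 1)]
    return [c for c in candidates if all(0 <= v < size for v in c)]

def fan_out(cell, size):
    """Cells to which `cell` can transmit a signal."""
    x, y, z = cell
    candidates = [(x + 1, y, z), (x, y + 1, z), (x, y, z + 1)]
    return [c for c in candidates if all(0 <= v < size for v in c)]
```

By construction, a cell d is a fan-out cell of c exactly when c is a fan-in cell of d.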

4.2 States

The state of each cell consists of the following components. The type can be input, output, OR, or none. Input and output nodes are cells of those respective types. Cells have type OR if they are active, working as OR gates in a circuit. They have type none if they are inactive and do not belong to a circuit.

Each cell has a Boolean value (0 or 1). This component is called the value component or value and is determined by signals transmitted from input nodes via the OR nodes. The value component of an input node is set by the Boolean signal given to it. If the type of a cell is OR or output and it has a fan-in cell whose value is 1, then its value component is set to 1. If there is no such fan-in cell, it is set to 0.
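Because signals travel only in the direction of increasing coordinates, the value components can be computed in a single sweep over the lattice. The following sketch does this under two assumptions that the text does not state: inactive cells (type none) carry the value 0, and the lattice is a cube of side `size`. The function name `evaluate` is hypothetical.

```python
from itertools import product

def evaluate(types, inputs, size):
    """Compute the value component of every cell.

    types  : dict cell -> "input" | "output" | "OR" | "none" (missing = "none")
    inputs : dict input cell -> 0 or 1
    """
    values = {}
    # Lexicographic order visits all fan-in cells of a cell before the cell itself.
    for cell in product(range(size), repeat=3):
        x, y, z = cell
        kind = types.get(cell, "none")
        if kind == "input":
            values[cell] = inputs[cell]
        elif kind in ("OR", "output"):
            fan_in = [(x - 1, y, z), (x, y - 1, z), (x, y, z - 1)]
            # Value is 1 if some fan-in cell has value 1, otherwise 0.
            values[cell] = int(any(values.get(c, 0) == 1 for c in fan_in))
        else:
            values[cell] = 0  # assumption: inactive cells carry 0
    return values

types = {(0, 0, 0): "input", (1, 0, 0): "OR", (2, 0, 0): "output"}
values = evaluate(types, {(0, 0, 0): 1}, size=3)
```

In this small example the signal 1 passes from the input node through the OR gate at (1, 0, 0) to the output node at (2, 0, 0).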

In addition, the state of each cell has a component containing information for fixing the type of the cell. This component is called the fix component and can be one of FIX0, FIX1, and none. FIX0 means that the cell has the value component 1, but it should be 0. FIX1 means the opposite: the cell has the value component 0, but it should be 1. In the case of FIX1, the component further specifies one of the fan-in cells of the cell as described in the algorithm below.

In summary, a state of a cell is a tuple in one of the forms (input, v), (t, v, f), and (t, v, FIX1, n), where v is 0 or 1; t is none, OR, or output; f is none or FIX0; and n is a number that specifies a fan-in cell.
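One possible encoding of these three tuple forms is sketched below. The class name `CellState`, the field names, and the convention that n is an index 0, 1, or 2 into the three fan-in cells are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CellState:
    type: str                     # "input", "output", "OR", or "none"
    value: int                    # 0 or 1
    fix: str = "none"             # "none", "FIX0", or "FIX1"
    chosen: Optional[int] = None  # n: index of the specified fan-in cell

    def __post_init__(self):
        if self.type not in ("input", "output", "OR", "none"):
            raise ValueError("unknown type")
        if self.value not in (0, 1):
            raise ValueError("value must be 0 or 1")
        if self.fix not in ("none", "FIX0", "FIX1"):
            raise ValueError("unknown fix component")
        # Form (input, v): input nodes carry no fix component.
        if self.type == "input" and (self.fix != "none" or self.chosen is not None):
            raise ValueError("input nodes have the form (input, v)")
        # Form (t, v, FIX1, n): a fan-in cell is specified only with FIX1.
        if self.fix == "FIX1" and self.chosen not in (0, 1, 2):
            raise ValueError("FIX1 must specify a fan-in cell")
        if self.fix != "FIX1" and self.chosen is not None:
            raise ValueError("only FIX1 specifies a fan-in cell")
```

For example, `CellState("OR", 0, "FIX1", 2)` corresponds to the tuple (OR, 0, FIX1, 2), and `CellState("input", 1)` to (input, 1).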

4.3 Algorithm

The algorithm for constructing Boolean circuits is formulated as a set of state transition rules for changing the fix component and type of each cell. These rules are executed for one set of input signals and expected output signals. Before fix components and types are changed by these rules, the value components of the input nodes are set according to the input signals in the example set. The values of other cells are set according to their current types and the values of their fan-in cells. The fix components of the output nodes are set according to their values and expected output signals.

The transition rules are described in pseudo-code as follows.

    if the type is output then
        if the value component is 1 and the expected signal is 0 then
            the fix component is set to FIX0
        if the value component is 0 and the expected signal is 1 then
            the fix component is set to FIX1
            a fan-in cell is randomly chosen and specified (*)

    if the type is OR and the fix component is none and
       the value component is 1 and one of its fan-out cells has FIX0 then
        if there is a fan-out cell whose type is OR or output with FIX0 or
           there is no fan-out cell whose type is OR or output then
            the fix component is changed to FIX0
            the type is changed to none

    if the type is none or OR and the fix component is none and
       the value component is 0 and
       specified by a fan-out cell with FIX1 then
        the fix component is changed to FIX1
        the type is changed to OR (it may already be OR)
        a fan-in cell is randomly chosen and specified (*)

At (*), when the fix component of a cell is changed to FIX1, one of its fan-in cells is randomly chosen and specified. This choice greatly affects the number of steps taken for the algorithm to construct the expected circuit. We currently impose the following constraints on the chosen cell as heuristics:

  • It is possible to reach the chosen cell from an input node with the value 1 in the direction with increasing coordinates.

  • All the fan-in cells of the chosen cell that are OR gates or input nodes have the value 1.

  • All the fan-out cells of the chosen cell that are OR gates or output nodes have the value 1.

Note that checking these constraints requires more components in the state of each cell. We omit the description of those components and the rules for changing them in this article.

The state transitions defined above are repeated until no cell can make a transition. Then, we eliminate dangling branches of the formed circuit, i.e., those branches that do not lie between an input node and an output node. This step can also be realized by state transitions, but its description is omitted here.
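The transition rules and their repetition until no cell can make a transition can be sketched in Python as follows. This is a simplified reading, not our full implementation: the fan-in cell at (*) is chosen uniformly at random without the heuristic constraints, the elimination of dangling branches is left out, value components are assumed to have been set beforehand, and the first rule is guarded by "the fix component is none" so that each cell changes at most once. The dictionary layout and all function names are assumptions.

```python
import random

OFFSETS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def fan_in(cell):
    return [tuple(a - b for a, b in zip(cell, o)) for o in OFFSETS]

def fan_out(cell):
    return [tuple(a + b for a, b in zip(cell, o)) for o in OFFSETS]

def transition(cell, cells, expected, rng):
    """Apply the first applicable rule to one cell; return True if it changed."""
    s = cells[cell]
    outs = [cells[c] for c in fan_out(cell) if c in cells]
    ins = [c for c in fan_in(cell) if c in cells]

    # Rule 1: an output node compares its value with the teacher signal.
    if s["type"] == "output" and s["fix"] == "none":
        if s["value"] == 1 and expected[cell] == 0:
            s["fix"] = "FIX0"
            return True
        if s["value"] == 0 and expected[cell] == 1 and ins:
            s["fix"] = "FIX1"
            s["chosen"] = rng.choice(ins)  # (*) no heuristics in this sketch
            return True
        return False

    # Rule 2: an OR gate that feeds a cell marked FIX0 is deactivated.
    if (s["type"] == "OR" and s["fix"] == "none" and s["value"] == 1
            and any(o["fix"] == "FIX0" for o in outs)):
        active = [o for o in outs if o["type"] in ("OR", "output")]
        if any(o["fix"] == "FIX0" for o in active) or not active:
            s["fix"] = "FIX0"
            s["type"] = "none"
            return True

    # Rule 3: a cell specified by a FIX1 fan-out cell becomes an OR gate.
    if (s["type"] in ("none", "OR") and s["fix"] == "none" and s["value"] == 0
            and any(o["fix"] == "FIX1" and o["chosen"] == cell for o in outs)):
        s["fix"] = "FIX1"
        s["type"] = "OR"
        s["chosen"] = rng.choice(ins) if ins else None  # (*)
        return True
    return False

def run_rules(cells, expected, rng):
    """Repeat the transitions until no cell can make one."""
    changed = True
    while changed:
        changed = False
        for cell in list(cells):
            changed |= transition(cell, cells, expected, rng)

def make(kind, value):
    return {"type": kind, "value": value, "fix": "none", "chosen": None}

# A three-cell line: input (value 1), inactive cell, output expected to be 1.
cells = {(0, 0, 0): make("input", 1),
         (1, 0, 0): make("none", 0),
         (2, 0, 0): make("output", 0)}
run_rules(cells, {(2, 0, 0): 1}, random.Random(0))
```

In the example, the output node is marked FIX1 and specifies the cell at (1, 0, 0), which in turn becomes an OR gate; re-evaluating the values for the next example would then carry the input signal to the output.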

The above algorithm is executed for each set of input and expected output signals, and a Boolean circuit of OR gates is formed or modified. After execution for each set, the value and fix components of all cells are reset, and the next set is prepared. After all of the sets have been processed, the first set is processed again. This process is repeated until no change is made to the Boolean circuit for any of the sets.
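The outer loop over the input–output sets can be sketched as follows. The function `process_example` is a hypothetical stand-in for one execution of the transition rules (including resetting the value and fix components), and the bound `max_epochs` is an assumption added because termination has not been proved.

```python
def train(circuit, examples, process_example, max_epochs=1000):
    """Cycle through the examples until a full pass leaves the circuit unchanged.

    circuit  : dict cell -> type, modified in place
    examples : list of (inputs, expected) pairs
    Returns the number of passes made, including the final unchanged one.
    """
    for epoch in range(1, max_epochs + 1):
        changed = False
        for inputs, expected in examples:
            before = dict(circuit)
            process_example(circuit, inputs, expected)
            changed |= circuit != before
        if not changed:
            return epoch
    raise RuntimeError("circuit did not settle within max_epochs")

def toy_process(circuit, inputs, expected):
    # Stand-in for the rule set: activate each cell whose expected signal is 1.
    for cell, want in expected.items():
        if want == 1:
            circuit[cell] = "OR"

circuit = {"a": "none", "b": "none"}
examples = [({}, {"a": 1}), ({}, {"b": 0})]
epochs = train(circuit, examples, toy_process)
```

With the toy stand-in, the first pass activates cell "a" and the second pass confirms that no further change occurs.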

We had previously tried several algorithms, from which we formulated the one above. The proposed algorithm finds a Boolean circuit that satisfies the input–output sets, as in Fig. 4.

Fig. 4
An illustration of a 3 D maze on a 10 by 10 by 10 graph. It has different nodes for inputs and outputs, and a Boolean circuit comprising OR gates. A shaded input dot is present outside the lines.

The algorithm constructs a Boolean circuit composed of OR gates, four input nodes, and two output nodes. The input nodes are shown in blue (value 1) and cyan (value 0), and the output nodes are shown in magenta (value 1) and red (value 0). The OR gates are shown in green (value 1) and yellow (value 0). This circuit was constructed from four input–output sets, of which one is shown

It is possible to define self-stability over multiple sets; however, we have yet to prove the self-stability of the algorithm.

5 Concluding Remark

In this article, we first reviewed the research efforts in DNA computing that have led to the implementation of cellular automata. Then, we summarized our work on gellular automata, focusing on self-stability. Finally, we proposed our third model of gellular automata that can learn Boolean circuits in 3D from sets of input–output signals. More work remains, including proving the self-stability of the algorithm.

Although gellular automata are based on implementing cellular automata by reaction–diffusion systems, the approach at the molecular level is also attractive and has various applications. It may be possible to combine the two approaches.

Regardless of which approach is taken, when a new concrete implementation method is developed, it may be necessary to update the existing models. It is also interesting to extend existing models assuming possible future implementation methods. One possibility is the introduction of directionality into a cellular space, as we assumed in our third model. Directionality can be introduced by electric or magnetic fields or chemical gradients. With such novel factors, communication between cells will be enhanced, with corresponding enrichment of cellular automata models.