Natural Computing, Volume 7, Issue 2, pp 255–275

Abstraction layers for scalable microfluidic biocomputing

Authors

  • William Thies, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
  • John Paul Urbanski, Hatsopoulos Microfluids Laboratory, Massachusetts Institute of Technology
  • Todd Thorsen, Hatsopoulos Microfluids Laboratory, Massachusetts Institute of Technology
  • Saman Amarasinghe, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology
Article

DOI: 10.1007/s11047-006-9032-6

Cite this article as:
Thies, W., Urbanski, J.P., Thorsen, T. et al. Nat Comput (2008) 7: 255. doi:10.1007/s11047-006-9032-6

Abstract

Microfluidic devices are emerging as an attractive technology for automatically orchestrating the reactions needed in a biological computer. Thousands of microfluidic primitives have already been integrated on a single chip, and recent trends indicate that the hardware complexity is increasing at rates comparable to Moore’s Law. As in the case of silicon, it will be critical to develop abstraction layers—such as programming languages and Instruction Set Architectures (ISAs)—that decouple software development from changes in the underlying device technology. Towards this end, this paper presents BioStream, a portable language for describing biology protocols, and the Fluidic ISA, a stable interface for microfluidic chip designers. A novel algorithm translates microfluidic mixing operations from the BioStream layer to the Fluidic ISA. To demonstrate the benefits of these abstraction layers, we build two microfluidic chips that can both execute BioStream code despite significant differences at the device level. We consider this to be an important step towards building scalable biological computers.

Keywords

Microfluidics · Laboratory automation · DNA computing · Biological computation · Self-assembly · Programming languages

1 Introduction

Biological computing offers the possibility of a machine that can assemble itself, adapt to its environment, and sustain itself naturally. Numerous mechanisms have been devised for computing with biological primitives, among them DNA computing (Adleman 1994; Pisanti 1998; Ezziane 2006), DNA self-assembly (Winfree et al. 1998; Winfree 2003), DNA cellular automata (Benenson et al. 2001; Benenson et al. 2004; Adar et al. 2004), and cellular signaling (Knight and Sussman 1998; Elowitz and Leibler 2000; Kitano 2002; Batten et al. 2004). While none of these technologies immediately threatens to displace silicon as a general-purpose computing medium, each offers unique advantages and could have far-reaching applications in areas such as programmable nanofabrication, biochemical sensing, embedded therapeutics, and smart agriculture.

One of the challenges in biological computing is that the laboratory protocols needed to carry out a computation can be very time consuming. For example, a 20-variable 3-SAT problem required 96 h to complete (Braich et al. 2002), not counting the considerable time needed for setup and evaluation. To automate and optimize this process, researchers have turned to microfluidic devices (Farfel and Stefanovic 2005; Gehani and Reif 1999; Grover and Mathies 2005; Livstone et al. 2006; McCaskill 2001; Somei et al. 2005; van Noort 2005; van Noort et al. 2002; van Noort and Zhang 2004). Microfluidics offers the promise of a “lab on a chip” system that can individually control picoliter-scale quantities of fluids, with integrated support for operations such as mixing, storage, PCR, heating/cooling, cell lysis, electrophoresis, and others (Breslauer et al. 2006, Erickson and Li 2004; Sia and Whitesides 2003). Apart from being amenable to computer control, microfluidics drastically reduces the volumes of samples, thereby reducing costs and improving capture kinetics. Using microfluidics, DNA hybridization times can be reduced from 24 h to 4 min (van Noort and Zhang 2004) and the number of bases needed to encode information can be decreased from 15 bases per bit to 1 base per bit (Braich et al. 2002; van Noort 2005).

Thus has emerged a vision for creating a hybrid DNA computer: one that uses microfluidics for the plumbing (the control paths) and biological primitives for the computations (the ALUs). On the hardware side, this vision is becoming scalable: microfluidic chips have integrated up to 3,574 valves with 1,000 individually addressable storage chambers (Thorsen et al. 2002). Moreover, recent trends indicate that microfluidics is following a path similar to Moore’s law, with the number of soft-lithography valves per unit area doubling every 4.5 months (Hong and Quake 2003; Fluidigm Corporation 2006).

On the software side, however, the microfluidic realm is lagging far behind its silicon counterpart. For silicon computers, the complexity and scale of the underlying hardware is masked by a set of well-defined abstraction layers. For example, transistors are organized into gates, which combine to form functional units, which together can implement an Instruction Set Architecture (ISA). The user operates at an even higher level of abstraction (e.g., C++), which is automatically translated into the ISA. These abstraction layers have proven critical for managing complexity. Without them, the computing field would have stagnated as every researcher tried to gain a transistor-level understanding of his machine.

Unfortunately, the current practice in experimental microfluidics is to expose all of the hardware resources directly to the experimentalist. Using a graphical system such as LabVIEW, the user orchestrates the individual behavior of each valve in the microfluidic device. While this practice is merely tedious for today’s devices, it will soon become completely intractable—akin to programming a modern microprocessor by directly toggling each of a million gates.

In this paper, we present a system and methodology that uses new abstraction layers for scalable biological computing. As illustrated in Fig. 1, our system consists of three layers. At the highest level, the programmer indicates the abstract computation to be performed—for example, in the form of a SAT formula. With some expertise in DNA computing and experimental biology, the computation can be transformed to the next layer: a portable biological protocol for performing the computation. The protocol is portable in that it does not depend on the physical implementation of the protocol; for example, it specifies fluid concentrations but not fluid volumes. Finally, the bottom layer specifies the operations needed to execute the protocol on a specific microfluidic chip. Each microfluidic chip designer provides a library that translates an abstract protocol into the specific sequence of valve actuations needed to execute that protocol on a specific chip.
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig1_HTML.gif
Fig. 1

Abstraction layers for DNA computing

These abstraction layers provide many benefits. Primarily, by using an architecture-independent description of the biological protocol (the middle layer), application development can be decoupled from advances in the underlying device technology. Thus, as microfluidic devices come to support additional inputs, mixers, storage cells, etc., the existing suite of protocols can run without modification (much as C programs run without modification on successive generations of microprocessors). In addition, the protocol layer enables a division of labor. Rather than requiring a heroic and brittle translation from a SAT formula directly to a microfluidic chip, a biologist provides a mapping to the abstract protocol while a microfluidics expert maps the protocol to the underlying device. The abstract protocol is also perfectly suited to simulation, thereby allowing the logical operations to be verified without relying on any physical implementation. Further, a portable protocol description could serve the role of pseudocode in technical publications, providing a precise account of the experimental methods used. Third-party protocols could be downloaded and executed (or called as sub-routines) on one’s own microfluidic device.

In the long term, the protocol description language will support all of the operations needed for biological computing. However, as there does not yet exist a single microfluidic device that can encompass all the functionality (preparation of DNA libraries, selection, readout, etc.), this paper focuses on three fundamental primitives: fluid mixing, fluid transport, and fluid storage. We describe a programming system called BioStream that provides an architecture-independent interface for these operations. To show that BioStream is portable, we execute BioStream code on two fundamentally different microfluidic architectures. We also present a novel algorithm for mixing fluids to a given concentration using the minimal number of simple on-chip mixing steps. Our system represents a fully-functional, end-to-end demonstration of portable software on microfluidic hardware.

2 BioStream protocol language

We have developed a software system called BioStream for portable microfluidics protocols. BioStream is a Java library that virtualizes many aspects of the underlying hardware resources. While BioStream can be targeted by a compiler (for example, a DNA computing compiler that converts a mathematical problem into a biological protocol), it is also suitable for direct programming and experimentation by biologists. As such, the language provides several high-level abstractions to improve readability and programmer productivity.

2.1 Providing portability

As shown in Fig. 2, BioStream offers two levels of abstraction underneath the protocol developer. The first abstraction layer is the BioStream library, which provides first-class Fluid objects to represent the physical fluids on the chip. The programmer deals only with Fluid variables, while the runtime system automatically assigns and tracks the location of the corresponding fluids on the device. The library also supports a general mix operation for combining fluids in arbitrary proportions and with adjustable precision.
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig2_HTML.gif
Fig. 2

Abstraction layers in the BioStream system

The second abstraction layer, the Fluidic ISA, interfaces with the underlying hardware. The fundamental operation is mixAndStore, which mixes two fluids in equal proportions and stores the result in a destination cell. (We describe later how to translate the flexible mix operations in BioStream to a series of equal-proportion mixes.) As all storage cells on the chip have unit volume, only one unit of mixture is stored in the destination; any leftover mixture may be discarded. As detailed in a later section, this allows for a flexible implementation of mixAndStore on diverse architectures.
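The semantics of mixAndStore can be made concrete with a minimal in-memory sketch. The class and method names below are illustrative (the actual Fluidic ISA is defined by each chip designer's library); the sketch models each unit-volume storage cell as a map from substance to concentration, so that a 1-to-1 mix is simply the average of two cells:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the Fluidic ISA's mixAndStore primitive. Storage
// cells hold unit-volume mixtures, modeled as substance -> concentration maps.
public class FluidicIsaSketch {
    private final Map<String, Double>[] storage;

    @SuppressWarnings("unchecked")
    public FluidicIsaSketch(int cells) {
        storage = new HashMap[cells];
        for (int i = 0; i < cells; i++) storage[i] = new HashMap<>();
    }

    // Fill a cell with a pure sample of the given substance.
    public void load(int cell, String substance) {
        storage[cell] = new HashMap<>();
        storage[cell].put(substance, 1.0);
    }

    // Mix two cells in equal proportions and store one unit of the result in
    // dst; any leftover mixture is discarded, as in the 1-to-1 model.
    public void mixAndStore(int src1, int src2, int dst) {
        Map<String, Double> result = new HashMap<>();
        addHalf(result, storage[src1]);
        addHalf(result, storage[src2]);
        storage[dst] = result;
    }

    private static void addHalf(Map<String, Double> out, Map<String, Double> src) {
        for (Map.Entry<String, Double> e : src.entrySet())
            out.merge(e.getKey(), e.getValue() / 2.0, Double::sum);
    }

    public double concentration(int cell, String substance) {
        return storage[cell].getOrDefault(substance, 0.0);
    }
}
```

For example, mixing pure buffer with pure reagent yields 1/2 each; mixing that result with buffer again yields 3/4 buffer and 1/4 reagent.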

In addition to the abstractions for mixing, there are some architecture-specific features that need to be made available to the programmer. These “native functions” include I/O devices, sensors, and agitators that might not be supported by every chip, but are needed to execute the program; for example, special input lines, cameras, or heaters. As shown in Fig. 2, BioStream supports this functionality by having the programmer declare a set of architecture requirements. BioStream uses the requirements to generate a library which contains the same functionality; it also checks that the architecture target supports all of the required functions. Finally, BioStream includes a generic simulator that inputs a set of architecture requirements and outputs a virtual machine that emulates the architecture. This allows full protocol development and validation even without hardware resources.

The BioStream system is fully implemented. The reflection capabilities of Java are utilized to automatically generate the library and the simulator from the architecture requirements. As described later, we also execute the Fluidic ISA on two real microfluidic chips.

2.2 Example protocol

An example of a BioStream protocol appears in Fig. 3. This is a general program that seeks to find the ratio of two reagents that leads to the highest activity in the presence of a given indicator. Experiments of this sort are common in biology. For example, the program could be applied to investigate the roles of cytochrome-c and caspase 8 in activating apoptosis (cell death); cell lysate would serve as the indicator in this experiment (Ellerby et al. 1997; Allan et al. 2003). The protocol uses feedback from a luminescence detector to guide the search for the highest activity. After sampling some concentrations in the given range, it descends recursively and narrows the range for the next round of sampling. Using self-directed mixing, a high precision can be obtained after only a few rounds.
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig3_HTML.gif
Fig. 3

Recursive descent search in BioStream

The recursive descent program declares a SimpleLibrary interface (see bottom of Fig. 3) describing the functionality required on the target architecture. In this case, a camera is needed to detect luminescence. While we have not mounted a camera on our current device, it would be straightforward to do so.

2.3 Improving programmer productivity

A distinguishing feature of BioStream code is the use of Fluid variables to represent samples on the device. The challenge in implementing this functionality is that physical fluids can be used only once, as they are consumed in mixtures and reactions. However, the programmer might reference a Fluid variable multiple times (e.g., variables A and B in the recursive descent example). BioStream supports this behavior by keeping track of how each Fluid was generated and automatically regenerating fluids that are reused. This process assumes that the original steps employed to generate a Fluid (input, mixing, agitation, etc.) will produce an equivalent Fluid if repeated. While this assumption is a natural fit for protocols depending only on the concentrations of reagents, there are also non-deterministic systems (such as directed evolution of cells) to which it does not apply. We leave full consideration of such systems for future work.

The regeneration mechanism works by associating each Fluid object with the name and arguments of the function that created it. The creating function must be a mix operation or a native function, both of which are visible to BioStream (the Fluid constructor is not exposed). BioStream maintains a valid bit for each Fluid, which indicates whether or not the Fluid is stored in a storage chamber on the chip. By default, the bit is true when the Fluid is first created, and it is invalidated when the Fluid is used as an argument to a BioStream function. If a BioStream function is called with an invalid Fluid, that Fluid is regenerated using its history. Note that this regeneration mechanism is fully dynamic (no analysis of the source code is needed) and is accurate even in the presence of pointers and aliasing.
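The mechanism can be sketched as follows. This is not BioStream's actual implementation (its internals are not shown in the paper); it is a minimal model in which each Fluid carries a closure recording its creating function, a valid bit, and a regeneration step that replays the history when a consumed Fluid is reused:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Illustrative sketch of Fluid regeneration: each Fluid records the call that
// created it, and consuming an already-consumed Fluid replays that history.
public class RegenSketch {
    static final AtomicInteger regenerations = new AtomicInteger();

    static class Fluid {
        final Supplier<Fluid> history;  // how this Fluid can be remade
        boolean valid = true;           // still present in a storage cell?
        Fluid(Supplier<Fluid> history) { this.history = history; }
    }

    // Draw a fresh sample from an input port (assumed deterministic).
    static Fluid input(int port) {
        return new Fluid(() -> input(port));
    }

    // BioStream-style mix: consumes both arguments, regenerating any that
    // were already used, and records how to recreate the result.
    static Fluid mix(Fluid a, Fluid b) {
        final Fluid fa = ensureValid(a);
        final Fluid fb = ensureValid(b);
        fa.valid = false;
        fb.valid = false;
        return new Fluid(() -> mix(fa, fb));
    }

    static Fluid ensureValid(Fluid f) {
        if (f.valid) return f;
        regenerations.incrementAndGet();
        return f.history.get();  // replay the creating function
    }
}
```

Because regeneration replays history recursively, a Fluid whose ancestors were also consumed is rebuilt all the way from the inputs, mirroring the dynamic, alias-safe behavior described above.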

The computation history created for fluids can be viewed as a dependence tree with several interesting applications. For example, the library can execute a program in a demand-driven fashion by initializing each Fluid to an invalid state and only generating it when it is used by a native function. This lazy evaluation affords the library more flexibility in scheduling the mixing operations when the fluids are needed. For example, operations could be reordered to minimize storage requirements or to issue parallel operations with vector control. Dynamic optimizations such as these are especially promising for microfluidics, as the silicon-based control processors operate much faster than their microfluidic counterparts.

3 Microfluidic implementation

To demonstrate an end-to-end system, we have designed and fabricated two microfluidic chips using a standard multi-layer soft-lithography process (Sia and Whitesides 2003). While there are fundamental differences between the chips (see Table 1), both provide support for programmable mixing, storage, and transport of fluid samples. More specifically, both chips implement the mixAndStore operation in the Fluidic ISA: they can load two samples from storage, mix them together, and store the result. Thus, despite their differences, code written in BioStream will be portable between the chips.
Table 1

Key properties of the microfluidic chips developed

           | Driving fluid | Wash fluid | Mixing           | Sample size   | Inputs | Storage cells | Valves | Control lines
    Chip 1 | Oil           | N/A        | Rotary mixer     | Half of mixer | 2      | 8             | 46     | 26
    Chip 2 | Air           | Water      | During transport | Full mixer    | 4      | 32            | 140    | 21

Chip 1 provides better isolation and retention of samples, while Chip 2 offers faster and simpler operation

The first chip (see Fig. 4) isolates fluid samples by suspending them in oil (Urbanski et al. 2006). To implement mixAndStore, each input sample is transported from a storage bin to one side of the mixer. The mixer uses rotary flow, driven by peristaltic pumps, to mix the samples to uniformity (Chou et al. 2001). Following mixing, one half of the mixer is drained and stored in the target location. While the second half could also be stored, it is currently discarded, as the basic mixAndStore abstraction produces only one unit of output.
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig4_HTML.gif
Fig. 4

Layout and photo of Chip 1 (driven by oil)

The second chip (see Fig. 5) isolates fluid samples using air instead of oil. Because fluid transport is very rapid in the absence of oil, a dedicated mixing element is not needed. Instead, the input samples are loaded from storage and aligned in a metering element; when the element is drained, the samples are mixed during transport to storage. Because the samples are in direct contact with the walls of the flow channels, a small fraction of the sample is lost during transport. This introduces the need for a wash phase, to clean the channel walls between operations. Also, to maintain sample volumes, the entire result of mixing is stored. Any excess volume is discarded in future mixing operations, as the metering element has fixed capacity.

To demonstrate BioStream’s portability between these two chips, consider the following code, which generates a gradient of concentrations:
    Fluid blue = input(1);
    Fluid yellow = input(2);
    Fluid[] gradient = new Fluid[5];
    for (int i = 0; i <= 4; i++) {
      gradient[i] = mix(blue, yellow, i/4.0, 1 - i/4.0);
    }

This code was used to generate the gradient pictured in Fig. 4 and produces an identical result on both microfluidic devices. (The gradient shown in Fig. 5 is different and was generated by a different program.)
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig5_HTML.gif
Fig. 5

Layout and photo of Chip 2 (driven by air)

4 Mixing algorithms

The mixing and dilution of fluids plays a fundamental role in almost all bioanalytical procedures. Mixing is used to prepare input samples for analysis, to dilute concentrated substances, and to control reagent volumes. In DNA computing, mixing is needed for reagent preparation (e.g., DNA libraries, PCR buffers, detection assays) and, in some techniques, for restriction digests (Faulhammer et al. 2000; Ouyang et al. 1997) or fine-grained concentration control (Yamamoto et al. 2002). It is critical to provide integrated support for mixing on microfluidic devices, as otherwise the samples would have to leave the system every time a mixture is needed.

As described in the previous sections, our microfluidic chips support the mixAndStore instruction from the Fluidic ISA. This operation simply mixes two fluids in equal proportions. However, the mix command in BioStream allows the programmer to specify complex mixtures involving multiple fluids in various concentrations. To bridge the gap between these abstractions, this section describes how to obtain a complex mixture using a series of simple steps. We describe an abstract model for mixing, an algorithm for minimizing the number of steps required, how to deal with error tolerances, and directions for future work.

4.1 A model of mixing

The following definition gives our notation for mixtures.

Definition 1

A mixture \({{\cal M}}\) is a set of substances Si at given concentrations ci:
$${\cal M} = \{\langle{S_1}, {c_1}\rangle, \ldots, \langle{S_k}, {c_k}\rangle\}, \qquad \sum_{i=1}^k c_i = 1$$

For example, a mixture of 3/4 buffer and 1/4 reagent is denoted as \({\{\langle\hbox{buffer},{3/4}\rangle,\langle\hbox{reagent},{1/4}\rangle\}}\). We further define a sample to be a mixture with only one substance \({(|{\cal M}|=1)}\). For example, a sample of buffer is denoted \({\{\langle\hbox{buffer},1\rangle\}}\), or just \({\langle\hbox{buffer}\rangle}\).

To obtain a given mixture on a microfluidic chip, one performs a series of mixes using an on-chip mixing primitive. While the capabilities of this mixer might vary from one chip to another, a simple 1-to-1 mixing model can be implemented on both continuous flow and droplet-based architectures (Chou et al. 2001; Paik et al. 2003). In this model, all fluids are stored in uniform chambers of unit volume. The mix operation combines two fluids in equal proportions, producing two units of the mixture. However, since there may be some amount of fluid loss with every operation, the result of the mixture might not be able to completely fill the contents of two storage cells. Thus, the result is stored in only one storage cell, and the extra mixture is discarded.

The 1-to-1 mixing process can be visualized using a “mixing tree”. As depicted in Fig. 6, each leaf node of a mixing tree represents a sample, while each internal node represents the mixture resulting from the combination of its children. Figure 7 illustrates that the mixture at an internal node can be calculated as the arithmetic mean of the components in child mixtures. In the 1-to-1 model, mixing trees are binary trees because each mix operation has two inputs. Evaluation of the tree proceeds from the leaf nodes upwards; the mixture for a given node can be produced once the child mixtures are available. The overall result of the operation is the mixture specified at the root node.
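The evaluation rule in Fig. 7 can be sketched directly. In the following illustrative Java model (names are ours, not BioStream's), a leaf holds a pure sample and an internal node's mixture is the arithmetic mean of its two children's mixtures:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of 1-to-1 mixing-tree evaluation: each internal node's mixture is
// the average of its two children's mixtures (Fig. 7).
public class MixingTree {
    final String sample;          // non-null for leaf nodes
    final MixingTree left, right; // non-null for internal nodes

    MixingTree(String sample) {
        this.sample = sample; this.left = null; this.right = null;
    }

    MixingTree(MixingTree left, MixingTree right) {
        this.sample = null; this.left = left; this.right = right;
    }

    // Concentrations of each substance in this node's mixture.
    Map<String, Double> mixture() {
        Map<String, Double> m = new HashMap<>();
        if (sample != null) { m.put(sample, 1.0); return m; }
        for (MixingTree child : new MixingTree[] {left, right}) {
            for (Map.Entry<String, Double> e : child.mixture().entrySet()) {
                m.merge(e.getKey(), e.getValue() / 2.0, Double::sum);
            }
        }
        return m;
    }
}
```

Evaluating a tree that mixes buffer with the result of mixing buffer and reagent reproduces the 3/4-buffer, 1/4-reagent mixture of Fig. 6.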
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig6_HTML.gif
Fig. 6

Mixing tree yielding 3/4 buffer and 1/4 reagent

https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig7_HTML.gif
Fig. 7

Calculation of a parent mixture from child mixtures using a 1-to-1 mixer. For each substance, the resulting concentration is the average of the concentrations in the children

The following theorem is useful for reasoning about mixing trees. It describes the concentration of a substance in the overall mixture based on the depths of leaf nodes containing samples of the substance. The depth of a node n in a binary tree is the length of the path from the root node to n.

Theorem 1

Consider a mixing tree and a substance S. Let \(m_d\) denote the number of leaf nodes with sample \({\langle S\rangle}\) appearing at depth d of the tree. Then the concentration of S contained in the root mixture is given by \({\sum_d m_d {\ast} 2^{-d}}\).

Proof

A sample at depth d is diluted d times in the mixing process, each time by a factor of two. Thus it contributes \(2^{-d}\) to the root mixture. Since each mix operation sums the concentrations from child nodes, the overall contribution is the sum across the leaf nodes at all depths: \({\sum_d m_d {\ast} 2^{-d}}\).□
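As a quick check of Theorem 1, consider a tree for the 3/4-buffer mixture with a buffer leaf at depth 1 and buffer and reagent leaves at depth 2 (one shape of the tree in Fig. 6):

```latex
c_{\text{buffer}} = \sum_d m_d \, 2^{-d} = 1 \cdot 2^{-1} + 1 \cdot 2^{-2} = \tfrac{3}{4},
\qquad
c_{\text{reagent}} = 1 \cdot 2^{-2} = \tfrac{1}{4}.
```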

The following theorem describes the set of mixtures that can be obtained using a 1-to-1 mixer. Informally, it states that a mixture is reachable if and only if the concentration of each substance can be written as an integral fraction k/2d.

Theorem 2

(1-to-1 Mixing Reachability) Consider a finite set of substances\({\{S_1\cdots S_k\}}\)with an unlimited supply of samples\({\langle S_i\rangle}\). Let\({{\cal R}}\) denote the set of mixtures that can be obtained via any sequence of 1-to-1 mixes. Then:
$$\mathcal{R}=\left\{\{\langle S_{1},c_{1}\rangle \cdots \langle S_{k},c_{k}\rangle\} \;\hbox{s.t.}\; \exists\, p_{i},q_{i},d \in {\mathcal Z}:\ \hbox{LCM}(q_1 \cdots q_k)=2^d \,\wedge\, \forall i \in [1,k]: c_i = \frac{p_i}{q_i}\right\}$$

Proof

The equality in the theorem can be shown via bi-directional inclusion of \({{\cal R}}\) and the right hand side (RHS).

\({{\cal R} \subseteq \hbox{RHS}}\): Given a mixing tree for the mixture, construct \(p_i\), \(q_i\), and d as follows to satisfy the RHS. Select d as the maximum depth of the tree (i.e., the maximum path length from the root node to a leaf node) and set all \(q_i = 2^d\), thereby satisfying the LCM condition. Then, for leaf nodes at a depth less than d, replace the node with an internal node whose children are leaves with the same sample as the original. This preserves the identity of the mixture but increases the depth of some nodes. Iterate until all leaf nodes are at depth d. By Theorem 1, if a substance has concentration ci in the mixture then it must have \({c_i \,{\ast} \,2^{d}}\) leaf nodes in this tree. Thus, setting \(p_i\) to the number of leaf nodes with sample \({\langle S_i\rangle}\), we have that \({p_i/q_i=c_i\, {\ast} \,2^d/2^d=c_i}\) as required.

\({{\cal R} \supseteq \hbox{RHS}}\): Given a mixture satisfying the RHS and values of \(p_i\), \(q_i\), and d satisfying the conjuncts, construct a mixing tree that can be used to obtain the given mixture. The tree has d levels and \(2^d\) leaves. Assign sample \({\langle S_i\rangle}\) to any \({p_i {\ast} 2^d/q_i}\) of the leaves (this is an integral quantity because \(2^d\) is a common multiple of the \(q_i\)). By the definition of mixture, \({\sum_i(p_i/q_i)=\sum_i c_i =1}\) and there is a one-to-one mapping between leaves and samples. By Theorem 1, the resulting mixture has a concentration of k/2d for a substance with k samples at the leaves. Thus the concentration for Si in the assignment is \({(p_i {\ast} 2^d/q_i)/2^d = p_i/q_i = c_i}\) as desired.□

It is natural to suggest a number of optimization problems for mixing. Of particular interest are the number of mixes and the number of samples consumed, as these directly impact the running time and resource requirements of a laboratory experiment. The following theorem shows that (under the 1-to-1 model) these two optimization problems are equivalent.

Theorem 3

In any 1-to-1 mixing sequence, the number of samples consumed is exactly one greater than the number of mixes.

Proof

By induction on the number of nodes, there is always exactly one more leaf node than internal node in a binary tree. The mixing tree is a binary tree in which each internal node represents a mix and each leaf node represents a sample. Thus there is always exactly one more sample consumed than there are mixes.□

Note that this theorem only holds under the 1-to-1 mixing model, in which two units of volume are mixed but only one unit of the mixture is retained. For microfluidic chips that attempt to retain both units of mixture (such as droplet-based architectures or our oil-driven chip), it might be possible to decrease the number of samples consumed by increasing the number of mix operations.

4.2 Algorithm for optimal mixing

In this section, we give an efficient algorithm for finding a mixing tree that requires the minimal number of mixes to obtain a given concentration. For clarity, we frame the problem as follows:

Problem 1

(Minimal Mixing) Consider a finite set of substances\({\{S_1\cdots S_k\}}\)with an unlimited supply of samples\({\langle S_i\rangle}\). Given a reachable mixture\({\{\langle{S_1}, {p_1/n}\rangle \cdots \langle{S_k}, {p_k/n}\rangle\}}\), what is the mixing tree with the minimal number of leaves?

Our algorithm runs in \({O(k\, \hbox{lg}\, n)}\) time and produces an optimal mixing tree (with respect to this metric). The tree produced has no more than \({k\, \hbox{lg}\, n}\) internal nodes.

The idea behind the algorithm, which we refer to as Min-Mix, is to place a leaf node with sample \({\langle S\rangle}\) at depth d in the mixing tree if and only if the target concentration for S has a 1 in bit \({\hbox{lg}\, n-d}\) of its binary representation. Theorem 1 then ensures that all substances have the desired concentrations, while fewer than \({\hbox{lg}\, n}\) samples are used for each one.

Pseudocode for Min-Mix appears in Fig. 8. We illustrate its operation for the example mixture of \({\{\langle A,{5/16}\rangle, \langle B,4/16\rangle,\langle{C},7/16\rangle\}}\). As shown in Fig. 9, the algorithm begins with a pre-processing stage that allocates substances to bins according to the binary representation of the target concentrations. It then builds the mixing tree via calls to Min-Mix-Helper, which descends through the bins. When a bin is empty, an internal node is created in the tree and the procedure recurses into the next bin. When a bin has a substance identifier in it, the substance is removed from the bin and a corresponding sample is added as a leaf node to the tree. Figure 9 labels the order in which the nodes in the final mixing tree are created by the algorithm.
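Since Fig. 8 is an image, the algorithm can also be sketched in code. The following Java version follows the textual description above (names and helper methods are ours); targets are given as integer numerators \(p_i\) over a power-of-two denominator n:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Sketch of Min-Mix: substances are binned by the set bits of their target
// numerators, and the mixing tree is built by descending through the bins.
public class MinMix {
    static class Node {
        final String sample;      // non-null for leaf nodes
        final Node left, right;   // non-null for internal nodes
        Node(String s) { sample = s; left = null; right = null; }
        Node(Node l, Node r) { sample = null; left = l; right = r; }
    }

    @SuppressWarnings("unchecked")
    static Node minMix(Map<String, Integer> p, int n) {
        int lgN = Integer.numberOfTrailingZeros(n);
        // bins[j] holds one entry per substance whose numerator has bit j set
        Deque<String>[] bins = new Deque[lgN + 1];
        for (int j = 0; j <= lgN; j++) bins[j] = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : p.entrySet())
            for (int j = 0; j <= lgN; j++)
                if (((e.getValue() >> j) & 1) == 1) bins[j].push(e.getKey());
        return helper(bins, lgN, 0);
    }

    // A leaf at depth d contributes 2^-d, i.e. bit (lg n - d) of p_i/n, so
    // leaves at depth d are drawn from bins[lg n - d]; empty bins recurse.
    static Node helper(Deque<String>[] bins, int lgN, int depth) {
        int j = lgN - depth;
        if (!bins[j].isEmpty()) return new Node(bins[j].pop());
        return new Node(helper(bins, lgN, depth + 1),
                        helper(bins, lgN, depth + 1));
    }

    // Helpers for checking the result against Theorems 1 and 5.
    static int leaves(Node t) {
        return t.sample != null ? 1 : leaves(t.left) + leaves(t.right);
    }

    static void concentrations(Node t, int depth, Map<String, Double> out) {
        if (t.sample != null) {
            out.merge(t.sample, Math.pow(2, -depth), Double::sum);
            return;
        }
        concentrations(t.left, depth + 1, out);
        concentrations(t.right, depth + 1, out);
    }
}
```

On the running example \({\{\langle A,{5/16}\rangle, \langle B,{4/16}\rangle, \langle C,{7/16}\rangle\}}\), the tree has 2 + 1 + 3 = 6 leaves (the total number of set bits in 5, 4, and 7), matching the bound of Theorem 5.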
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig8_HTML.gif
Fig. 8

Min-Mix algorithm

https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig9_HTML.gif
Fig. 9

Example operation of Min-Mix for the mixture \({\{\langle A,{5/16}\rangle,\langle B,{4/16}\rangle,\langle C,{7/16}\rangle\}}\). Part (a) illustrates the algorithm’s allocation of substances to bins. The bin layout directly translates to a valid mixing tree, which appears in (b) with numbers indicating the order in which nodes are added to the tree. The mixing tree is redrawn in (c) for clarity

The following lemma is key to proving the correctness of Min-Mix. We denote the nth least significant bit of x by \({\hbox{LSB}(x, n)}\). That is, \({\hbox{LSB}(x, n)\equiv (x \gg n)\,\&\,1}\).

Lemma 1

Consider the mixing tree t produced by Min-Mix\({(\{\langle S_1,{p_1/n}\rangle \cdots\langle S_k,{p_k/n}\rangle\})}\). A substance \(S_i\) appears at depth d in t if and only if \({{LSB}(p_i,\hbox{lg}\, n-d)=1}\).

Proof

If: It suffices to show that there is a substance added to the mixing tree for each LSB of 1 drawn from the \(p_i\) (that the substance appears at depth d is given by the only-if direction). Further, since bins[j] is constructed to contain all substances i for which \({\hbox{LSB}(p_i,j)=1}\), it suffices to show that (a) all bins are empty at the end of the procedure, and (b) the procedure does not try to pop from an empty bin. To show (a), use the invariant that each call to Min-Mix-Helper adds a total of \(2^{-d}\) units of mixture to the mixing tree, where d is the current depth: either a leaf node is added (which contributes \(2^{-d}\) by Theorem 1) or two child nodes are added, contributing \({2 \,{\ast} \,2^{-(d+1)} = 2^{-d}}\). Since the initial depth is 0, the external call results in \(2^0 = 1\) unit of mixture being generated. Since the bins represent exactly one unit of mixture (i.e., \({\sum_j |\hbox{bins}[j]|\, {\ast} \,2^{j-{\rm lg}\,n} = 1}\)), all bins will be used. To show (b), observe that Min-Mix references the bins in order, testing whether each is empty before proceeding. Thus no empty bin will ever be dereferenced.

Only if: When a substance is added to the tree from bins[j], it appears at depth \({\hbox{lg}\, n-j}\) in the tree. This is evident from the recursive call in Min-Mix-Helper: it initially draws from bins[\({\hbox{lg}\,n}\)] and then works down when the upper bins are empty. By construction, bins[j] contains only substances Si with \({\hbox{LSB}(p_i,j)=1}\). Thus, if Si appears at depth d in the mixing tree, it was added from \({\hbox{bins}[\hbox{lg}\,n - d]}\) which has \({\hbox{LSB}(p_i,\hbox{lg} \,n-d) = 1}\).□

The following theorem asserts the correctness of Min-Mix.

Theorem 4

The mixing tree given by Min-Mix gives the correct concentration for each substance in the target mixture.

Proof

Consider a component \({\langle S,{p/n}\rangle}\) of the mixture passed to Min-Mix. Let md denote the number of leaf nodes with sample S at depth d of the resulting mixing tree. By Lemma 1, \({m_d = \hbox{LSB}(p,\hbox{lg}\, n-d)}\). Using Theorem 1, this implies that the concentration for S in the root mixture is given by
$$\begin{aligned}c&=\sum_d \hbox{LSB}(p, \hbox{lg}\, n-d) \ast 2^{-d}\\ &=\sum_x \hbox{LSB}(p, x) \ast 2^{-({\rm lg}\, n-x)}\\ &=\sum_x \hbox{LSB}(p, x) \ast 2^x/n\\ &=p/n\end{aligned}$$
Thus the concentration in the root node of the mixing tree is the same as that passed to Min-Mix.□

The following theorem asserts the optimality of the mixing trees produced by Min-Mix.

Theorem 5

Consider the mixing tree t produced by Min-Mix\({(\{\langle{S_1},{p_1/n}\rangle \cdots \langle{S_k},{p_k/n}\rangle\})}\). The number of leaf nodes \({{\cal L}(t)}\) is given by
$${\cal L}(t)=\sum_{i=1}^k \sum_{j=0}^{{\rm lg}\, n} \hbox{LSB}(p_i, j)$$
There does not exist a mixing tree that yields the given mixture with fewer leaf nodes than \({{\cal L}(t)}\).

Proof

That Min-Mix produces a tree t with \({{\cal L}(t)}\) leaf nodes follows directly from Lemma 1, as there is a one-to-one correspondence between leaf nodes and input samples. To prove optimality, Theorem 1 gives that \({p_i/n = \sum_d m_d {\ast} 2^{-d}}\). Thus \({p_i = \sum_d m_d {\ast} 2^{{\rm lg}\, n-d}=\sum_d\sum_{l=1}^{m_d} 2^{{\rm lg}\,n-d}}\). That is, pi is a sum of powers of two, and the number of leaf nodes determines the number of summands. The minimal number of summands is the number of non-zero bits in the binary representation of pi; this quantity is \({\sum_{j=0}^{{\rm lg}\, n} \hbox{LSB}(p_i,j)}\). Thus it is impossible to obtain a concentration of pi for all k substances in the tree with fewer than \({\sum_{i=1}^k \sum_{j=0}^{{\rm lg}\,n} \hbox{LSB}(p_i,j)}\) leaf nodes.□
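
The inner sum over j is simply the population count of pi, so the bound of Theorem 5 can be evaluated in one line. A minimal sketch (`min_leaves` is our own helper, not from the paper):

```python
def min_leaves(ps, n):
    """Theorem 5: the minimal number of input samples (leaf nodes) equals
    the total number of set bits across the numerators p_i, for targets
    p_i / n with n a power of two."""
    assert sum(ps) == n and n & (n - 1) == 0
    return sum(bin(p).count("1") for p in ps)
```

For example, for the targets {2/4, 1/4, 1/4}, `min_leaves([2, 1, 1], 4)` gives 3: one leaf per substance suffices.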

The following theorem describes the running time of Min-Mix.

Theorem 6

Min-Mix\({(\{\langle S_1,{p_1/n}\rangle \cdots \langle{S_k},{p_k/n}\rangle\})}\) runs in \({O(k\, \hbox{lg}\, n)}\) time.

Proof

The pre-processing stage in Min-Mix executes \({k\,\hbox{lg}\, n}\) iterations with constant cost per iteration. By Theorem 5, the recursive procedure returns a tree with \({\sum_{i=1}^k \sum_{j=0}^{{\rm lg}(n)}\hbox{LSB}(p_i,j)= O(k\, \hbox{lg}\, n)}\) leaf nodes, and by Theorem 3 this implies that there are \({O(k\,\hbox{lg}\, n)}\) total nodes in the tree. Since there is constant cost at each node, the overall complexity is \({O(k\, \hbox{lg}\, n)}\).□
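
Pulling the pieces of the proofs together, the bin-based procedure can be sketched as follows. This is our reconstruction from Lemma 1 and Theorems 4–6 (the paper presents the actual algorithm in a figure not reproduced here), so the names and control structure are assumptions:

```python
from fractions import Fraction

def min_mix(components):
    """Reconstruction of Min-Mix. `components` maps a substance name to its
    numerator p_i, where sum(p_i) == n is a power of two. Returns a mixing
    tree: a leaf is a substance name; an internal node is a pair
    (left, right) whose fluids are mixed 1-to-1."""
    n = sum(components.values())
    assert n & (n - 1) == 0, "targets must be reachable with a 1-to-1 mixer"
    lgn = n.bit_length() - 1
    # Pre-processing: bins[j] holds each substance whose p_i has bit j set.
    bins = [[s for s, p in components.items() if (p >> j) & 1]
            for j in range(lgn + 1)]

    def helper(d):
        # A substance from bins[lg n - d] becomes a leaf at depth d;
        # otherwise split into two children at depth d + 1 (Lemma 1).
        assert d <= lgn, "bins exhausted: mixture was not reachable"
        if bins[lgn - d]:
            return bins[lgn - d].pop()
        return (helper(d + 1), helper(d + 1))

    return helper(0)

def root_concentrations(tree):
    """Evaluate a mixing tree exactly: each 1-to-1 mix halves both inputs."""
    if isinstance(tree, str):
        return {tree: Fraction(1)}
    out = {}
    for child in tree:
        for s, c in root_concentrations(child).items():
            out[s] = out.get(s, Fraction(0)) + c / 2
    return out
```

For instance, `min_mix({"A": 2, "B": 1, "C": 1})` places A at depth 1 and B, C at depth 2, and `root_concentrations` confirms the target fractions 1/2, 1/4, 1/4.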

4.3 Special case: mixing two substances

The minimal mixing tree admits a particularly compact representation when only two substances \({\langle{S_1},p_1/n\rangle}\) and \({\langle S_2,p_2/n \rangle}\) are being mixed. Because the two target concentrations must sum to a power of two (in order to be reachable with a 1-to-1 mixer), the bitwise representations of p1 and p2 follow a special pattern (see Fig. 10). The least significant bits may be zero in both concentrations, but then some bit must be one in each of them. Each higher-order bit must be one in exactly one of the concentrations (to carry a value upwards), and the most significant bit is zero in both (as we assume p1, p2 < n).
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig10_HTML.gif
Fig. 10

Arrangement of bits for any \({p_1 + p_2 = 2^d}\)

Algorithm twoWayMix, shown in Fig. 12, exploits this pattern to directly execute the mix sequence without building a mixing tree. The sequence of mixes is completely encoded in the binary representation of either concentration. As illustrated by the example in Fig. 11, the algorithm starts with a unit of S2 and then skips over all the low-order zero bits (these result from a fraction p1/n that is not in lowest terms). When it reaches a high bit, it maintains a running mixture—requiring no temporary storage—in which either S1 or S2 is added to the mix depending on the next most significant bit of p1. It can be shown that this procedure is equivalent to building a mixing tree. However, it is attractive from a hardware design standpoint due to its simplicity and the fact that it directly performs a mixture based on the binary representation of the desired concentration.
https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig11_HTML.gif
Fig. 11

Example of mixing 14/32 and 18/32 using twoWayMix

https://static-content.springer.com/image/art%3A10.1007%2Fs11047-006-9032-6/MediaObjects/11047_2006_9032_Fig12_HTML.gif
Fig. 12

Algorithm for mixing two substances
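
The description above can be sketched directly. This is our reading of twoWayMix (Fig. 12 is not reproduced here), so the exact control structure is an assumption:

```python
from fractions import Fraction

def two_way_mix(p1, n):
    """Emit the load sequence for mixing S1 and S2 at concentrations p1/n
    and (n - p1)/n. The first entry is the starting unit; each later entry
    means "mix the running mixture 1-to-1 with a fresh unit of this
    substance". The schedule is read directly off the bits of p1."""
    assert n & (n - 1) == 0 and 0 < p1 < n
    lgn = n.bit_length() - 1
    seq = ["S2"]                    # start with a unit of S2
    j = 0
    while not (p1 >> j) & 1:        # skip low-order zero bits
        j += 1                      # (fraction p1/n not in lowest terms)
    seq.append("S1")                # at the lowest set bit, p1 and p2 are both 1
    for b in range(j + 1, lgn):     # remaining bits, below the (zero) MSB
        seq.append("S1" if (p1 >> b) & 1 else "S2")
    return seq

def concentration(seq):
    """Simulate the sequence exactly: track the S1 fraction, halving the
    running mixture on every 1-to-1 mix."""
    c = Fraction(1 if seq[0] == "S1" else 0)
    for s in seq[1:]:
        c = (c + (1 if s == "S1" else 0)) / 2
    return c
```

For the Fig. 11 targets (14/32 and 18/32), `two_way_mix(14, 32)` yields the five-sample sequence [S2, S1, S1, S1, S2], and `concentration` confirms an S1 fraction of 14/32 after four mixes.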

4.4 Supporting error tolerances

Thus far the presentation has been in terms of mixtures that can be obtained exactly with a 1-to-1 mixer, i.e., those with target concentrations of the form \({k/2^d}\). However, the programmer should not be concerned with the reachability of a given mixture. In the BioStream system, the programmer specifies a concentration range \({[c_{\rm min}, c_{\rm max}]}\) and the system ensures that the mixture produced will fall within the given range.2 Such error tolerances are already a natural aspect of scientific experiments, as all measuring equipment has a finite precision that is carefully noted as part of the procedure.

Given a concentration range, the system increases the internal precision d until some concentration \({k/2^d}\) (which can be obtained exactly) falls within the range. When performing a mixture with concentration ranges \({\{\langle{S_1},{[c_{1,{\rm min}},c_{1,{\rm max}}]}\rangle\cdots\langle{S_k},{[c_{k,{\rm min}},c_{k,{\rm max}}]}\rangle\}}\), the system needs to choose concrete concentrations ci and a precision d satisfying the following conditions:
  1. \({\forall i: \exists k_i \hbox{ s.t. } c_i = k_i / 2^d}\)

  2. \({\forall i: c_{i,{\rm min}}\leq c_i \leq c_{i,{\rm max}}}\)

  3. \({\sum_i c_i = 1}\)
The first condition guarantees that the mixture can be obtained using a 1-to-1 mixer. The second condition states that the concrete concentrations ci are within the range specified by the programmer. The third condition ensures that the ci form a valid mixture, i.e., that they sum to one.

The BioStream system uses a simple greedy algorithm to choose ci and d satisfying these conditions. It increases d until there exists a ci satisfying (1) and (2) for all i. If multiple candidates for a given ci exist, it selects the smallest possible. Then it checks condition (3). If the sum exceeds one, it increases d and starts over. If the sum is less than one, it increases some ci by \({1/2^d}\), choosing one for which \({c_i\leq c_{i,{\rm max}}-1/2^d}\). If no such ci exists, it increases d and starts over. Otherwise the algorithm continues until the conditions are satisfied.
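
The greedy selection can be sketched as follows. The function name and return convention are our own (the paper does not give code for this step):

```python
import math
from fractions import Fraction

def choose_concentrations(ranges, max_d=20):
    """Greedy choice of concrete concentrations. `ranges` is a list of
    (c_min, c_max) Fraction pairs. Returns (d, [c_i]) with each c_i a
    multiple of 1/2^d inside its range and sum(c_i) == 1, or None if no
    solution is found up to precision max_d."""
    for d in range(max_d + 1):
        step = Fraction(1, 2 ** d)
        # Conditions (1) and (2): smallest multiple of 1/2^d in each range.
        cs = []
        for lo, hi in ranges:
            c = Fraction(math.ceil(lo * 2 ** d), 2 ** d)
            if c > hi:
                break               # no candidate at this precision
            cs.append(c)
        if len(cs) < len(ranges):
            continue                # increase d and start over
        # Condition (3): while the sum falls short of one, bump some c_i
        # that still has room (c_i <= c_max - 1/2^d).
        while sum(cs) < 1:
            for i, (lo, hi) in enumerate(ranges):
                if cs[i] + step <= hi:
                    cs[i] += step
                    break
            else:
                break               # nothing can be raised; retry with larger d
        if sum(cs) == 1:            # also rejects the case sum(cs) > 1
            return d, cs
    return None
```

For ranges [0.3, 0.4] and [0.6, 0.7], the first workable precision is d = 3, yielding the reachable concentrations 3/8 and 5/8.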

One can imagine other selection schemes that select ci and d to optimize some criterion, such as the number of mixes required by the resulting mixture. This would be straightforward to implement via an exhaustive search at a given precision level, but it could be costly depending on the size of the error margins. It will be a fruitful area of future research to optimize the selection of target concentrations while respecting the error bounds.

4.5 Open problems

We suggest three avenues for future research in mixing algorithms.

4.5.1 N-to-M mixing

It is simple to build a rotary mixer that combines fluids in a ratio other than 1-to-1; for example, 1-to-2, 1-to-3, or even a ternary mixer such as 1-to-2-to-3. Judging by exhaustive experiments, it appears that a 1-to-2 mixer can obtain any concentration \({k/3^n}\). However, we are unaware of a closed form for the mixtures that can be obtained with a general N-to-M mixer. Likewise, we consider it to be an open problem to formulate an efficient algorithm for determining the minimal mix sequence using an N-to-M mixer (i.e., one that does not resort to an exhaustive lookup table). A solution to this problem could reduce mixing time and reagent consumption while increasing precision.
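
The kind of exhaustive experiment mentioned above is easy to reproduce. A sketch that enumerates the marker concentrations reachable with a 1-to-2 mixer, under the assumption that any previously obtained concentration can be regenerated at will:

```python
from fractions import Fraction
from itertools import product

def reachable_1_to_2(depth):
    """Marker concentrations reachable in at most `depth` rounds of 1-to-2
    mixing, starting from the pure fluids (concentrations 0 and 1). A mix
    combines one part of fluid a with two parts of fluid b."""
    cur = {Fraction(0), Fraction(1)}
    for _ in range(depth):
        cur |= {(a + 2 * b) / 3 for a, b in product(cur, repeat=2)}
    return cur
```

Two rounds already cover every fraction k/9, consistent with the conjecture that all \({k/3^n}\) are reachable:

```python
assert {Fraction(k, 9) for k in range(10)} <= reachable_1_to_2(2)
```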

4.5.2 Minimizing storage requirements

Given a mixing tree, it is straightforward to find an evaluation order that minimizes the number of temporaries; one can apply the classical node labeling algorithm that minimizes register usage for trees (Aho et al. 1988, p. 561). However, we are unaware of an efficient algorithm for finding the mixing tree that minimizes the number of temporaries needed to obtain a given mixture. This could be an important optimization, as experiments often demand as many parallel samples as can be supported by the architecture. Also, storage chambers on microfluidic chips are relatively limited and expensive compared to storage on today’s computers.
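
The classical labeling takes only a few lines on a binary mixing tree. A sketch under the assumption that a leaf is a stored input sample (one chamber) and every internal node is a 1-to-1 mix:

```python
def min_temporaries(tree):
    """Sethi-Ullman-style labeling: the label of a node is the number of
    storage chambers needed to evaluate its subtree without spilling.
    A leaf needs one chamber; an internal mix needs the larger child label
    if the two differ, and one more chamber if they are equal."""
    if isinstance(tree, str):       # leaf: a stored input sample
        return 1
    left, right = (min_temporaries(c) for c in tree)
    return max(left, right) if left != right else left + 1
```

Evaluating the heavier child first achieves this bound; e.g. the tree ("A", ("B", "C")) needs only 2 chambers, while a balanced four-leaf tree needs 3.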

4.5.3 Heterogeneous inputs

Our presentation treats each input sample as a black box. However, in practice, the user is able to prepare large quantities of reagents as inputs to the chip. For an application that produces an array of concentrations, what inputs should the user prepare to minimize the number of mixes required? And if some inputs are related (e.g., a sample of 10% acid and 20% acid) how can that be incorporated into the mixing algorithm? Like the previous items, these are interesting algorithmic questions that can have a practical impact.

5 Related work

Several researchers have pursued the goal of automating the control systems for microfluidic chips. Gascoyne et al. describe a graphical user interface for controlling chips that manipulate droplets over a two-dimensional grid (Gascoyne et al. 2004). By varying parameters in the interface, the software can target grids with varying dimensions, speeds, etc. However, portability is limited to grid-based droplet processors. While the BioStream protocol language could target their chips, their software is not suitable for targeting ours.

Su et al. represent protocols as acyclic sequence graphs and map them to droplet-based processors using automatic scheduling (Su and Chakrabarty 2004) and module placement (Su and Chakrabarty 2005). While the sequence graph is portable, it lacks the expressiveness of a programming language and cannot represent feedback loops (as in our recursive descent example). King et al. demonstrate a “robot scientist” that directs laboratory experiments using a high-level programming language (King et al. 2004), but lacks the abstraction layers needed to target other devices. Gu et al. have controlled microfluidic chips using programmable Braille displays (Gu et al. 2004), but protocols are mapped to the chip by hand.

Johnson demonstrates a special-purpose robotic system (controlled by Labview) that automatically solves 3-SAT problems using DNA computing (Johnson 2006). Miniaturizing his benchtop devices could result in a fully-automatic microfluidic biocomputer. Livstone et al. compile an abstract SAT problem into a sequence of DNA-computing steps (Livstone et al. 2006). The output of their system would be a good match for BioStream and the abstraction layers proposed in this paper.

There are other microfluidic chips that support flexible generation of gradients (Dertinger et al. 2001; Neils et al. 2004; Lin et al. 2004) and programmable mixing on a droplet array (Pollack et al. 2000). To the best of our knowledge, our chips are the only ones that provide arbitrary mixing of discrete samples in a soft-lithography medium. A more detailed comparison of the devices is published elsewhere (Urbanski et al. 2006).

Ren et al. also suggest a mixing algorithm for diluting a single reagent by a given factor (Ren et al. 2003). It seems that their algorithm performs a binary search for the target concentration, progressively approximating the target by a factor of two. However, since intermediate reagents must be regenerated in the search, this algorithm requires O(n) mixes to obtain a concentration k/n. In contrast, our algorithm needs only O(lg n) mixes to combine two fluids.

6 Conclusions

Microfluidic devices are an exciting substrate for biological computing because they allow precise and automatic control of the underlying biological protocols. However, as the complexity of microfluidic hardware comes to rival that of silicon-based computers, it will be critical to develop effective abstraction layers that decouple application development from low-level hardware details.

This paper presents two new abstraction layers for microfluidic biocomputers: the BioStream protocol language and the Fluidic ISA. Protocols expressed in BioStream are portable across all devices implementing a given Fluidic ISA. We demonstrate this portability by building two fundamentally different microfluidic devices that support execution of the same BioStream code. We also present a new and optimal algorithm for obtaining a given concentration of fluids using a simple on-chip mixing device. This algorithm is essential for efficiently supporting the mix abstraction in the BioStream language.

It remains an interesting area of future work to leverage DNA computing technology to target the BioStream language from a high-level description of the computation. This will create an end-to-end platform for biological computing that is seamlessly portable across future generations of microfluidic chips.

Footnotes
1

lg n denotes \({\log_2 n}\).

 
2

Alternately, BioStream supports a global error tolerance ε that applies to all concentrations.

 

Acknowledgements

We are grateful to David Wentzlaff and Mats Cooper for early contributions to this research. We also thank John Albeck for helpful discussions about experimental protocols. This work was supported by National Science Foundation grant #CCF-0541319. J.P.U. was funded in part by the National Science and Engineering Research Council of Canada (PGSM Scholarship).

Copyright information

© Springer Science+Business Media, Inc. 2007