The search for existing cells with a minimal genome has led to the identification of the smallest known cellular genomes, such as that of the aphid endosymbiont Buchnera aphidicola, with an estimated size of 450 kb and 400 genes (Gil et al. 2004). This is even smaller than the complete genome of the human pathogen Mycoplasma genitalium, an obligate parasite with 540 kb and 480 genes (Fraser et al. 1995).

These genome sizes correspond to 400–500 genes, excluding redundancy. The size and content of such minimal gene sets are strongly determined, during evolution, by the environment in which the organisms live, including the highly permissive intracellular environment provided by the hosts of these two organisms. In this particular case it is believed that the endosymbionts have undergone massive gene loss, whereas microbes representing the next step of complexity can have genomes of around a thousand genes.
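As a quick check on the figures quoted above, genome size and gene count can be combined into an average amount of DNA per gene. The sketch below uses only the numbers cited in the text; the function name `kb_per_gene` is ours, and the ratio is a crude average that ignores non-coding regions and variation in gene length.

```python
# Genome size (kb) and approximate gene count, as quoted in the text.
GENOMES = {
    "Buchnera aphidicola": (450, 400),
    "Mycoplasma genitalium": (540, 480),
}


def kb_per_gene(size_kb, n_genes):
    """Average genome length per gene in kb (a crude density measure)."""
    return size_kb / n_genes


if __name__ == "__main__":
    for name, (size_kb, n_genes) in GENOMES.items():
        print(f"{name}: {kb_per_gene(size_kb, n_genes):.3f} kb per gene")
```

On these figures both genomes come out at roughly 1.1 kb per gene, so the difference in size tracks the difference in gene number.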

We have to bear in mind that evolution has followed pathways that often do not favor the construction of low-complexity mechanisms. Complex genomes are likely to be the result of billions of years of evolution, with the development of defense, repair and survival mechanisms required for environments that are sometimes not permissive.

Modern cells can therefore, in theory, be simplified or reconstructed in the laboratory by reducing the number of coding DNA regions and enzymes to the genes essential for a cell to be alive in a permissive environment; this represents an exercise in building a model of early cells in early evolution (Gil et al. 2002; Luisi 2002).

The search for simpler genomes, in terms of size and number of coding regions, continues: a recent study has shed light on an even smaller bacterial genome, that of the endosymbiont Carsonella ruddii (Nakabachi et al. 2006). This suggests that the lower limit of genome size for cellular organisms has probably not yet been reached, although the gene inventory of Carsonella seems insufficient for all the biological processes essential for bacterial life. The host bacteriocyte possibly compensates for this gene loss, but it remains to be seen whether the Carsonella minimal gene set of around 180 genes encodes precursor structures or mechanisms that provide evidence of cells earlier in evolution than those we know today.

Different studies have highlighted that the number of genes required to define a minimal living cell, where living means capable of reproduction, maintenance and evolution in a permissive environment, can be drastically reduced, and several contributions have arrived at a theoretical number of 200–250 genes (Hutchison et al. 1999; Mushegian 1999; Gil et al. 2002). However, these numbers are based on the selection of essential genes from existing organisms, and such cells are still the result of a complex and often chaotic evolution.

We can now ask: can we construct a minimal living cell using existing molecules and reduce the number of genes required to around 100–150?

Using a synthetic biology approach, in which we combine extant biological molecules such as DNA, RNA, enzymes and other proteins, and low-molecular-weight components including ATP, glucose, amino acids, etc., with lipid vesicles known as liposomes, we plan to build a semi-synthetic minimal cell.

We have recently taken advantage of the availability of a new cell-free protein synthesis system named “Puresystem” (PS), a minimal set of 36 enzymes plus purified ribosomes and low-molecular-weight molecules. The PS was introduced into liposome compartments (the liposome mimics the cell membrane) together with a gene encoding the Enhanced Green Fluorescent Protein (EGFP). By entrapping these extant molecules (the non-synthetic part of the cell) in liposomes (the assembly process and the cell compartment being the synthetic part of the semi-synthetic cell), we succeeded in expressing this protein with this minimal system. EGFP synthesis with a minimal set of enzymes was recently demonstrated in our laboratory using fluorometric analysis and confocal microscopy (Murtas et al. 2007). In these experiments EGFP fluorescence can only arise from PS entrapped inside liposomes, because RNase is added soon after the reaction mixture is enclosed within the vesicles, degrading RNA and thereby preventing any expression outside the liposomes.

Similar experiments monitored PS-driven EGFP synthesis within liposomes by confocal microscopy. EGFP is detected only inside liposomes, which appear as green fluorescent vesicles 600–900 nm in size.

In future, the reconstruction of a minimal ribosome could bring the number of genes required for ribosomal proteins from the 54 of an existing minimal genome (Luisi 2002) down to 20–30. In fact, comparative sequence analysis indicates that only 33 ribosomal proteins correspond to evolutionarily conserved functional domains (Mears et al. 2002). Considering that rRNA genes and genes for ribosomal proteins have been cloned from E. coli, we could first attempt to reduce the number of ribosomal proteins to 33 and later try to build ribosomes with fewer components.

Several experiments have shown that rRNA is the catalytic molecule within ribosomes, while ribosomal proteins seem to stabilize and orient the otherwise floppy RNA into an active structure; indeed, no one has yet succeeded in synthesizing proteins with rRNAs alone (Noller 2006).

To establish reproduction of the shell compartment with a minimal set of genes, we have cloned the gene for the enzyme complex fatty acid synthase (human FAS) (Jayakumar et al. 1996). On its own, this enzyme should be able to produce and release fatty acid molecules within a liposome and so promote vesicle growth and reproduction (Berclaz et al. 2001). This would be consistent with the view that a fatty acid vesicle is one of the earliest primitive shells.

The core reproduction of a minimal cell, corresponding to the replication of the minimal genome (including the PS), would require 7–8 genes for DNA replication and a further minimum set of 16 genes for the synthesis of tRNAs.

Altogether, this suggests a plan to build a minimal cell with fewer than 100 genes.

For simplicity, this estimate does not consider that several other genes could be required for post-transcriptional and post-translational modifications.
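The estimate can be illustrated with a simple tally of the gene counts mentioned in the text. This is only a sketch under our own assumptions: each of the 36 PS enzymes and the FAS complex is counted as one gene, the upper figure of 8 is taken for DNA replication, rRNA genes are left out, and the names `BUDGET` and `total_genes` are hypothetical.

```python
# Gene counts taken from the text. Grouping them into one budget, and the
# one-gene-per-component assumption, are ours and not a published inventory.
BUDGET = {
    "PS enzymes": 36,           # "minimal set of 36 enzymes"
    "ribosomal proteins": 33,   # conserved set, reduced from 54
    "DNA replication": 8,       # upper end of "7-8 genes"
    "tRNA synthesis": 16,       # "minimum set of 16 genes"
    "fatty acid synthase": 1,   # FAS counted as a single gene
}


def total_genes(budget, overrides=None):
    """Sum a gene budget, optionally replacing some entries."""
    merged = dict(budget)
    merged.update(overrides or {})
    return sum(merged.values())


if __name__ == "__main__":
    print("reduced ribosome:", total_genes(BUDGET))
    print("existing ribosome:", total_genes(BUDGET, {"ribosomal proteins": 54}))
```

Under these assumptions the tally is 94 genes with a 33-protein ribosome and 115 with the 54 ribosomal proteins of an existing minimal genome, so a total below 100 appears to depend on the ribosome reduction.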

There is no limit to the minimal gene sets we can imagine for a minimal living cell in early evolution, because we lack information on the evolution of biological molecules and early minimal cells. Even within the new synthetic biology approach information is lacking: we first need to prove that we can build a minimal living cell from a minimal set of extant molecules. If successful, we would then arrive at a first defined number of currently available molecules required for a first minimal cell which, on theoretical calculation, would in any case have a genome of close to 100 genes.

This minimal cell would then represent a work in progress, in which modifying cellular functions with different or newly synthesized molecules and enzymes would further reduce the complexity of our early minimal cell model.

Based on the extant molecules required to provide the minimal activity of the essential functions of a living cell, it is very hard to believe that an early cell ever existed with 30–40 genes, unless we think of low-specificity enzymes each assisting more than one reaction, in which case we are talking about ancient molecules. Otherwise we have to envisage independent molecular mechanisms as minimal steps during early evolution, controlling independent functions within compartments; although these cells would not be alive, they could later merge by cell fusion, building more complex structures and functions such as, for example, ribosomes and protein synthesis, respectively.