Introduction to Genome Biology: Features, Processes, and Structures
Genomic analyses increasingly make use of sophisticated statistical and computational approaches in investigations of genomic function and evolution. Scientists implementing and developing these approaches are often computational scientists, physicists, or mathematicians. This article aims to provide a compact overview of genome biology for these scientists. Thus, the article focuses on providing biological context to the genomic features, processes, and structures analysed by these approaches. Topics covered include (1) differences between eukaryotic and prokaryotic cells; (2) the physical structure of genomes and chromatin; (3) different categories of genomic regions, including those serving as templates for RNA and protein synthesis, regulatory regions, repetitive regions, and “architectural” or “organisational” regions, such as centromeres and telomeres; (4) the cell cycle; (5) an overview of transcription, translation, and protein structure; and (6) a glossary of relevant terms.
Key words: Organelles, DNA, RNA, Protein, Regulatory DNA, Plasmids, Sequence repeats, Cell cycle, Transcription, Translation, DNA replication, Chromatin, Gene structure, Glossary
Many thanks to Maria Anisimova, Sonia Furtado, Halldór Stefánsson, Nita Budd, and Damien Devos for many valuable comments and suggestions during the writing of this article.
One end of a polynucleotide molecule has a free (i.e. available to form additional chemical bonds with other atoms) hydroxyl (OH) functional group that is attached to the 3′ carbon atom of the sugar moiety of the terminal nucleotide. This is known as the 3′ “end” or “terminus” of the polynucleotide. Within the cell, the polynucleotides RNA and DNA can only be synthesised by attaching a nucleotide to a 3′ terminal hydroxyl functional group. Thus, the last nucleotide added to an RNA or DNA molecule is the 3′ terminal nucleotide. For this reason, RNA and DNA synthesis is described as taking place in a 5′- to 3′ direction. Therefore, the 3′ terminal nucleotide is sometimes referred to as the “last” nucleotide. Convention is to write RNA or DNA sequences ending with the 3′ terminal residue.
A post-transcriptional modification of eukaryotic messenger RNAs (mRNAs), in which a methylated guanine nucleotide is added to the 5′ end of the mRNA.
One end of a polynucleotide molecule has a free (i.e. available to form additional chemical bonds with other atoms) phosphate (PO4) moiety attached to the 5′ carbon atom of the sugar moiety of the terminal nucleotide. This is known as the 5′ “end” or “terminus” of the polynucleotide. Within the cell, the polynucleotides RNA and DNA can only be synthesised by attaching a nucleotide to a 3′ terminal hydroxyl group. Thus, the first nucleotide to be incorporated into an RNA or DNA molecule is the 5′ terminal nucleotide. For this reason, RNA and DNA synthesis is described as taking place in a 5′- to 3′ direction. Therefore, the 5′ terminal nucleotide is sometimes referred to as the “first” nucleotide. Convention is to write RNA or DNA sequences starting with the 5′ terminal residue.
The incorporation of different sets of exons in different transcripts derived from the same gene. Genes that produce several different transcripts of this kind are described as being alternatively spliced. Alternative splicing is a feature of many human genes; it has been estimated that 95% of human genes that have more than one exon are alternatively spliced.
A molecule consisting of a carboxylic acid (COOH) functional group, an amine (NH2) functional group, and a “side chain” moiety. Different amino acids have different side chains. Polypeptide chains are linear polymers of amino acids.
The end of a polypeptide chain with a free amine (NH2) functional group. Ribosomes synthesise polypeptides by attaching the amine group of an amino acid to the carboxyl group of a polypeptide. Thus, the first amino acid to be incorporated into a polypeptide chain is the amino-terminal residue. For this reason, polypeptides are described as being synthesised in an N-to-C direction and the amino-terminal residue is sometimes referred to as the “first” amino acid. Convention is to write polypeptide sequences beginning with the amino-terminal residue.
The region of a transfer RNA (tRNA) that specifically interacts with the corresponding codon in a messenger RNA during translation. Anticodons consist of three adjacent nucleotide residues within the tRNA sequence.
One of the two taxonomic groups into which all prokaryotes are divided, the other group being the bacteria.
A nucleotide consisting of a ribose sugar moiety, an adenine nitrogenous base, and three phosphate groups. The conversion of ATP to adenosine diphosphate (ADP) releases a phosphate group and a large amount of energy. This energy is used by the cell in a huge range of different processes. In this context, ATP acts as a “battery” of chemical energy for the cell.
A eukaryotic chromosome present in two nearly identical copies (homologous pairs) in diploid cells. This contrasts with sex chromosomes, where homologous pairs are typically rather different from each other in both sequence and structure.
The portion of a polymer that includes the moieties that lie along the line of direct links between individual monomers. The backbone is distinct from side group regions of the monomers. For DNA and RNA, monomers (nucleotides) are linked via bonds between phosphate and sugar groups; thus, DNA and RNA are described as having a sugar–phosphate backbone. In polypeptides, monomers (amino acids) are linked via peptide bonds; thus, proteins are described as having a peptide backbone.
One of the two taxonomic groups into which all prokaryotes are divided, the other group being the archaea.
Often used as shorthand for “nucleotide residue” in the context of DNA and RNA molecules. For example, the human genome is sometimes described as containing around 3 gigabases. This usage is based on the fact that nucleotide residues contain nitrogenous bases. More generally, in chemistry there are several different definitions of a base (the opposite of an acid).
A membrane consisting of a lipid bilayer, within which proteins are embedded, that partitions cellular contents into separate compartments. The cell membrane separates the entire cell from the environment; other biological membranes (those of membrane-bound organelles) separate regions of the cell into subcellular compartments. Different membranes contain different lipids (or different proportions of different lipids) and different proteins.
See “carboxyl terminus”.
See “complementary base pair”.
The structure that packages the genome of a virus when it is outside its host cell. Capsids vary greatly in terms of the structure of the proteins and other components they consist of, and their overall shape.
One end of a polypeptide molecule has a free (i.e. available to form additional bonds with other atoms) carboxyl (COOH) functional group. The amino acid moiety containing this free carboxyl group is known as the carboxyl terminus (or carboxyl-terminal residue). Within the cell, polypeptides are synthesised by attaching an amino acid to a carboxyl terminus. Thus, the last amino acid to be incorporated into a polypeptide chain is the carboxyl-terminal residue. For this reason, polypeptides are described as being synthesised in an N-to-C direction and the carboxyl-terminal residue is sometimes referred to as the “last” amino acid. Convention is to write polypeptide sequences ending with the carboxyl-terminal residue.
A set of processes that interact to enable a cell to successfully divide into two daughter cells. The cycle involves replication of the cellular genome, cell division, and appropriate duplication and segregation of other cellular components prior to cell division. In eukaryotes, the cycle is divided into interphase (during which individual chromosomes cannot be distinguished by light microscopy) and either mitosis (for the majority of different cell types) or meiosis (for the production of gametes), where individual chromosomes are visible by light microscopy (such chromosomes are described as condensed). DNA replication occurs during S phase, a division of interphase. The region of interphase between cell division and S-phase is known as G1 (gap-1) phase, and that between S-phase and mitosis (or meiosis) as G2 (gap-2) phase. Mitosis and meiosis are also divided up into several different phases.
The region of a eukaryotic chromosome that is attached to the spindle apparatus during mitosis or meiosis. The spindle apparatus pulls chromosomes into a position where they can be successfully incorporated into a daughter cell resulting from cell division.
One polypeptide molecule.
One of the two copies of a eukaryotic chromosome produced following replication of the cellular genome during S phase of the cell cycle. “Sister” chromatids are copies of the same pre-S phase chromosome. During cell division, sister chromatids are separated so that one copy segregates into each of the daughter cells resulting from the cell division.
Typically used to refer to the combination of protein, nucleic acids, and other cellular components that make up eukaryotic chromosomes. However, it is sometimes (rarely) used to refer to the compositionally rather different structures that make up prokaryotic chromosomes and plasmids.
A structure consisting of a double-stranded molecule of DNA packaged and organised together with proteins and other cellular components (the combination of DNA, protein, and other components is referred to as chromatin). Sometimes, the word refers only to structures of this kind found in eukaryotic cells, the word “genophore” being used for such structures in prokaryotic cells. In other contexts, “chromosome” is used for such structures in both eukaryotic and prokaryotic cells.
During interphase, individual chromosomes cannot be distinguished by light microscopy, unlike in M-phase. The process by which diffuse interphase chromosomes change to the more defined condensed structures seen during M-phase is known as chromosome condensation.
The region of the nucleus that a chromosome tends to occupy during interphase.
A gene regulatory region that (1) is on the same chromosome as the gene and (2) provides binding sites for other components, such as transcription factors, whose presence and activity directly mediate the expression level of the gene.
A gene whose primary product is a protein, i.e. a gene that is transcribed to yield messenger RNAs that can be translated by a ribosome to yield a protein.
A set of three adjacent nucleotides in a messenger RNA (mRNA) molecule that specify an amino acid for incorporation into the polypeptide encoded by the mRNA. Not all sets of three nucleotides in an mRNA are a codon; nucleotides in the 5′ and 3′ UTRs are not part of any codon, nor are triplets of nucleotides in the coding region of the mRNA that overlap with codon boundaries.
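The point about codon boundaries can be illustrated with a minimal Python sketch. The function name `split_codons` and the example sequence are hypothetical, chosen only for illustration; the sketch assumes the input contains the coding region alone, starting at the first base of the first codon.

```python
def split_codons(coding_region):
    """Split a coding region into consecutive, non-overlapping triplets.

    Assumes the sequence starts at the first base of the first codon
    and contains no UTR sequence.
    """
    if len(coding_region) % 3 != 0:
        raise ValueError("coding region length must be a multiple of three")
    return [coding_region[i:i + 3] for i in range(0, len(coding_region), 3)]


# Hypothetical coding region: start codon, two sense codons, stop codon.
mrna_coding_region = "AUGGCUCAUUAA"
codons = split_codons(mrna_coding_region)
print(codons)  # ['AUG', 'GCU', 'CAU', 'UAA']

# The triplet at positions 1-3 straddles a codon boundary, so it is not a codon.
overlapping_triplet = mrna_coding_region[1:4]
print(overlapping_triplet, overlapping_triplet in codons)  # UGG False
```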
Pairs of nucleotide bases that form strong interactions via hydrogen bonds. In DNA, the two complementary base pairings are cytosine:guanine and adenine:thymine. The complementary pairs in RNA are the same except that uracil substitutes for thymine.
Pairs of DNA (or RNA) molecules, with base sequences that allow formation of antiparallel dimers where all bases participate in complementary base pairing with the other molecule, are described as having complementary sequences.
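As a rough sketch of this definition for DNA, two sequences written in the conventional 5′-to-3′ direction are complementary when one is the reverse complement of the other. The names `DNA_COMPLEMENT`, `reverse_complement`, and `are_complementary` below are illustrative assumptions, not functions from any particular library.

```python
# Complementary base pairings in DNA: C pairs with G, A pairs with T.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'.

    The strands of a duplex are antiparallel, so the partner of a
    sequence is its complement read in reverse.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq))


def are_complementary(seq_a, seq_b):
    """True if every base of seq_a pairs with a base of seq_b in an antiparallel duplex."""
    return seq_b == reverse_complement(seq_a)


print(reverse_complement("ATGCCG"))           # CGGCAT
print(are_complementary("ATGCCG", "CGGCAT"))  # True
print(are_complementary("ATGCCG", "TACGGC"))  # False: complement, but not reversed
```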
A structure consisting of two or more polypeptide chains, where the chains are linked by non-covalent bonds. Non-covalent interactions between proteins and nucleic acids are also referred to as complexes (for example RNA–protein complexes).
See “ultraconserved element”.
Attachment of a methyl (CH3) group to the base moiety of a cytosine nucleotide. In mammals, cytosine methylation can regulate levels of gene expression, typically resulting in a reduction or switching off of transcription from the regulated gene.
During cell division, a single parental (or mother) cell divides to yield two daughter cells.
A cell in which each chromosome found in a haploid cell of the same species is represented by a homologous pair of chromosomes.
The term is used differently in different contexts. Structural biologists and bioinformaticians use “domain” to refer to a protein module that forms a stable, globular three-dimensional structure (or “fold”) in the cell, and that does not require interaction with other polypeptide chains to maintain this stable structure. However, cell and other biologists sometimes use the term more loosely, to refer to any subsequence of a polypeptide chain, typically (but not always) associated with a particular function.
Towards the 3′ terminus of a nucleotide strand. For example, the 3′ UTR of a messenger RNA (mRNA) lies towards the 3′ end of the mRNA molecule compared to the coding region of the mRNA.
The molecule that encodes the genomes of all cellular life on Earth. DNA is a polymer of nucleotides in which the sugar moiety of the nucleotide is 2-deoxyribose. The “backbone” of the molecule alternates between sugar and phosphate moieties. Pairs of DNA molecules with complementary sequences can form a right-handed antiparallel double-stranded helical structure, i.e. the famous “double helix”.
A process in which the sequence of bases in an RNA molecule is altered after the base has been incorporated into a transcript during the process of transcription. This can involve the insertion or deletion of nucleotides, or a change in the nitrogenous base attached to the sugar moiety of a nucleotide within the molecule.
A change in gene function inherited by cellular offspring/daughter cells from their parental cell.
Weakly stained regions of chromatin as observed through a microscope under a range of different staining methods. Intensely stained regions are known as heterochromatin. Euchromatin tends to contain more genes and to have higher transcriptional activity than heterochromatin.
Cells in which the majority of the genome is packaged within a nucleus. Typically, eukaryotic cells are larger and have a more complex internal organisation than prokaryotic cells.
A region of a gene whose transcribed sequence is retained in an RNA molecule after splicing of the transcript has occurred.
The process of producing a product (RNA or protein) from a gene, using the DNA sequence of the gene as a template for the sequence of the product molecule. For example, expression of a protein gene in a eukaryote involves, among others, the processes of transcription and translation (it may also involve splicing, polyadenylation, etc.).
A group of atoms within an organic molecule that is responsible for characteristic chemical reactions of the molecule. See the entry for “group” in this glossary for a description of what is meant by that term in this context. “Functional group” is often used synonymously with the word “moiety”; however, these two words have distinct meanings, with moiety being used to generally describe groups within a molecule. Thus, a moiety can include several functional groups.
Eukaryotic cells that can fuse with other cells during fertilisation to produce a zygote; this is part of the process of sexual reproduction. Gametes are haploid, and combine to produce a diploid zygote.
In most cases, a gene refers to a region of a genome that provides the information required to produce, and regulate the timing and level of production, of an RNA (in the case of a coding gene, also a polypeptide) molecule. Using this definition, a gene includes not just the region of the genome that serves as a template for the sequence of the RNA transcript produced from the gene, but also the regulatory regions that control the timing and level of production of the transcripts.
A description of which amino acid (or the signal to stop translation) is encoded by each of the 64 different codons. Many amino acids are encoded by more than one codon; hence, the genetic code is described as redundant. Codons encoding the same amino acid are described as synonymous codons. Not all organisms use the same genetic code; indeed, the genetic codes used by the genomes of endosymbiotic organelles (mitochondria and plastids) can differ from those used in the nuclei of the cells containing the organelle.
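The redundancy described here can be made concrete with a short Python sketch. It assumes the standard nuclear genetic code, written with one-letter amino acid symbols and `*` for a stop signal. The compact 64-character string follows the common U, C, A, G ordering of codon tables. All variable names are illustrative.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one symbol per codon in UCAG order (UUU, UUC, UUA, ...).
# '*' marks a stop codon. Other codes (e.g. some mitochondrial codes) differ.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons: codons that encode the same amino acid (or stop).
synonymous = defaultdict(list)
for codon, amino_acid in GENETIC_CODE.items():
    synonymous[amino_acid].append(codon)

print(len(GENETIC_CODE))     # 64 codons in total
print(len(synonymous) - 1)   # 20 amino acids (excluding the stop signal)
print(synonymous["*"])       # ['UAA', 'UAG', 'UGA']
print(len(synonymous["L"]))  # 6 synonymous codons for leucine
print(synonymous["W"])       # ['UGG'], tryptophan has a single codon
```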
The complete set of heritable genetic information of a cell or organism.
Refers to the use of the word in compound nouns, such as “functional group” or “chemical group” (not in the context of the periodic table, however). In this context, the word “group” typically is used to refer to a portion of a molecule within which all atoms form chemical bonds with at least one other atom within the group. One or more atoms within the group form chemical bonds with atoms that are not part of the group, because the group is only a portion of a larger molecule and must therefore be connected to other atoms within the molecule by chemical bonds. If all bonds linking the group to the rest of the molecule were broken, then all atoms within the group would remain linked together (at least before the occurrence of any reactions that would act to change the structure of the group). Thus, a group is a distinct sub-structure of a complete molecule, within which atoms are linked by chemical bonds.
Describes a cell where each of the chromosomes in the genome is present in only a single copy. For example, nuclei in human haploid cells (for example, gametes, i.e. sperm or egg cells) contain 23 chromosomal DNA molecules (22 autosomes and one sex chromosome); in contrast, most human cells are diploid, and contain 46 chromosomal DNA molecules in their nucleus, with each chromosome present in the haploid cells represented by a homologous pair of chromosomes in the diploid cell.
Intensely stained regions of chromatin as observed through a microscope under a range of different staining methods. Weakly stained regions are known as euchromatin. Heterochromatin tends to contain fewer genes and to have lower transcriptional activity than euchromatin.
A pair of similar chromosomes contained within the same diploid cell, where one member of the pair was inherited from each parent. The sequences and structures of homologous chromosomes are very similar; however, due to the occurrence of different mutations in the evolutionary history of the two members of the pair, they almost certainly have (often only slightly) different sequences/structures.
The phase of the eukaryotic cell cycle in which individual chromosomes cannot be distinguished by light microscopy.
A region of a gene that is transcribed, but where the region of the RNA transcript that was encoded by this region of the gene is removed via the process of splicing.
A region of a mammalian chromosome within which the proportions of the two different kinds of complementary base pairs are similar. The proportion of the different base pairs is often described in terms of the “CG content”, referring to the proportion of cytosine:guanine base pairs in the region.
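As a minimal sketch, what the text calls “CG content” (more commonly written GC content) is the fraction of bases in a sequence that are G or C, which can be computed over fixed-size windows along the sequence. The function names, window size, and example sequence below are illustrative assumptions; real analyses of this kind use far larger windows.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (between 0.0 and 1.0)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window, step):
    """GC content of each full window of length `window`, moved `step` bases at a time."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


print(gc_content("ATGC"))               # 0.5
print(windowed_gc("GGCCATAT", 4, 4))    # [1.0, 0.0]
```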
A thin membrane that consists of a pair of lipid monolayers interacting via the hydrophobic tail regions of the lipids within them. Thus, a bilayer is approximately as thick as two lipid molecules.
A molecule that consists of a polar (and hence hydrophilic) head group and a non-polar (hydrophobic) hydrocarbon tail region.
Similar to mitosis, meiosis is a phase of the cell cycle in which eukaryotic chromosomes are segregated into sets to be inherited by daughter cells. The difference from mitosis is that, after two meiotic divisions (meiosis I and meiosis II), four haploid cells are produced from the initial diploid cell; mitosis instead produces two diploid daughter cells from one diploid parental cell.
A compartment of a cell that is surrounded by (at least one) lipid bilayer. Examples of organelles include nuclei and mitochondria.
RNA molecules produced via transcription of a coding gene that can be translated by a ribosome to yield a polypeptide chain. In eukaryotes, mRNAs may be modified by a range of different processes, including splicing, 5′ capping, and polyadenylation.
A group of atoms within a molecule. See the entry for “group” in this glossary for a description of what is meant by the term in this context. Often used synonymously with the term “functional group”—however, the term “moiety”, as defined by the International Union of Pure and Applied Chemistry (IUPAC), the principal international society of chemists, indicates that “moiety” can be used more generally than “functional groups”, to refer to any part of a molecule. Thus, a given moiety might contain several functional groups.
Membrane-bound organelles found in almost all eukaryotic cells. In eukaryotes, they are the site of the oxidative phosphorylation metabolic pathway, which in many eukaryotes is an important source of ATP. Mitochondria also are the only site of synthesis for iron–sulphur clusters in eukaryotes. These clusters are necessary components of several essential eukaryotic proteins. Like plastids, mitochondria are derived from bacterial endosymbionts. The ancestral bacterium from which mitochondria are derived was present in the ancestor of all living eukaryotes. Mitochondria retain small portions of the genome of these bacterial ancestors.
A phase of the cell cycle in which individual chromosomes/chromatids can be distinguished using light microscopy. During mitosis, sister chromatids are segregated so that each daughter cell resulting from cell division contains one copy of each pair of sister chromatids.
A change in the sequence of a genome.
A gene whose primary product is an RNA molecule, rather than a polypeptide, i.e. that produces functional transcripts that are not translated by a ribosome to yield a polypeptide, for example transfer RNA genes.
Codons are non-synonymous if they code for different amino acids.
The irregularly shaped, non-membrane-bound structure that contains the genome of a prokaryotic cell.
A complex of approximately 150 base pairs of genomic DNA wrapped around a core of histone proteins. DNA within a nucleosome is compacted into a smaller volume than it would occupy if it were not bound within a nucleosome.
Molecules consisting of several moieties: (1) a nitrogenous base (2) a five-carbon sugar (3) and between one and three phosphate groups. In DNA molecules, the nitrogenous bases are usually adenine (A), cytosine (C), guanine (G), or thymine (T). In RNA molecules, they are usually adenine (A), cytosine (C), guanine (G), or uracil (U). However, cellular DNA and RNA can also contain modified versions of these bases, such as 5-methylcytosine in DNA.
A double-membrane bound organelle of eukaryotic cells that contains the majority of the cellular genome (the “nuclear genome”). However, in most cases, a eukaryotic cell also contains several organelles (i.e. mitochondria, and in some cases plastids) that also contain portions of the cellular genome (“organellar genomes”).
A molecule consisting of several (typically up to 100) smaller units (monomers). Individual monomers within the oligomer have similar structures; for example, a nucleotide oligomer is made up of several nucleotide monomers covalently bound to each other. Molecules containing much larger numbers of monomeric units are referred to as polymers.
A set of genes that are transcribed together as a single RNA transcript.
A region of genomic DNA at which DNA replication is initiated.
Two or more amino acids covalently bound via peptide bonds.
A covalent bond formed between a carboxyl (COOH) and an amino (NH2) functional group. During synthesis of the bond, a molecule of water is released.
A membrane-bound organelle found in many eukaryotic cells. Peroxisomes are involved in many different processes, and separate the toxic products of some of these processes from the rest of the cell.
Structures consisting of DNA molecules, proteins, and other cellular components. DNA components of plasmids can be either circular or linear although most plasmids encountered in molecular biology are circular. Many prokaryotes possess plasmids, as do some eukaryotes. The genes within plasmids tend to be associated with functions that promote or enable survival and growth under specific “niche” conditions. They can be horizontally transferred between cells (i.e. not inherited from a parental cell as a result of cell division), and are typically replicated independently of the cell cycle, unlike chromosomes.
Membrane-bound organelles found in plants and some other eukaryotic organisms, participating in a range of different processes within the cell. Like mitochondria, plastids are derived from an endosymbiosis with a bacterium. Plastids retain remnants of their ancestral bacterial genome.
A description of how many different homologous copies there are of each chromosome within a genome.
The process by which several additional adenine-containing nucleotides are attached to the 3′ end of a transcript. Not all transcripts are polyadenylated.
A molecule consisting of many smaller units (monomers). Monomers are typically connected to each other via covalent chemical bonds. In this context, “many” is not strictly defined. However, molecules containing between 2 and 100 monomeric units are often referred to as oligomers.
An enzyme that catalyses the synthesis of polynucleotide chains from nucleotide molecules. DNA replication is mediated by a DNA polymerase, transcription by an RNA polymerase.
A polymer of nucleotides. Both DNA and RNA are polynucleotides.
A linear polymer of amino acids bound together by covalent peptide bonds. Polypeptides are synthesised by ribosomes via the process of translation, using the nucleotide base sequence of a messenger RNA molecule as a template for the amino acid sequence of the polypeptide.
Changes made to the structure of an RNA molecule following transcription. Many different post-transcriptional modifications have been identified, some of which are essential for the cell. For example, the addition of an activated amino acid to the 3′ end of transfer RNAs (tRNAs) is essential for the process of translation.
Cells within which the genome is not separated from the rest of a cell by a nucleus. Generally, prokaryotic cells are considerably smaller, and have a less complex internal organisation, than eukaryotic cells. The two most general taxonomic groupings within the prokaryotes are the archaea and the bacteria.
A regulatory region of a gene located close to its target gene, i.e. the gene (or genes) whose transcriptional activity it regulates. The protein complexes responsible for initiating transcription bind to regions of the promoter.
Molecules (molecular complexes, if the protein contains more than one polypeptide) that consist of one or more polypeptide chains, and sometimes also additional non-polypeptide components. For example, a functional haemoglobin protein consists of four polypeptide chains, each of which is also bound to a non-polypeptide haeme molecule.
Regions of protein sequence (i.e. sub-sequences of polypeptide sequences) that mediate important aspects of their function independently of other regions of the full polypeptide chain. Protein domains and linear motifs are examples of protein modules.
A protein complex found in all eukaryotes and archaea, and some bacteria, that is responsible for breaking down proteins into small peptides of approximately eight amino acids. Damaged proteins, and proteins that need to be degraded as part of cellular processes, are targeted to the proteasome.
A process in which genetic material is exchanged between different chromosomes, or between regions of the same chromosome.
An enzyme that changes chromatin structure by repositioning, removing or assembling nucleosomes.
Regions of DNA sequence within the same genome that are very similar/identical to each other.
The process of duplicating a DNA molecule to yield two copies with the same sequence as the original molecule. In practice, due to errors introduced during the process of replication, the two copies of the initial DNA molecule may have slightly different sequences.
Typically used to refer to amino acid moieties within a peptide. Within a peptide, amino acids are linked by peptide bonds. Formation of these bonds is accompanied by the loss of a water molecule (a hydrogen atom from the amino group of one amino acid combining with a hydroxyl (OH) group from the carboxyl group of the other amino acid). Thus, the amino acid monomers incorporated in the peptide are the remnants (or the “residue”) left behind after the loss of this water molecule. The linking of nucleotides via phosphodiester bonds, as occurs in the backbone of RNA and DNA molecules, also releases water; thus, individual nucleotides within these molecules are also sometimes referred to as “residues”.
A polymer of nucleotides in which the sugar moiety of the nucleotide is ribose. Plays an essential role in many cellular processes, including transcription, translation, and replication.
RNA molecules that are essential components of all ribosomes, both structurally and via direct involvement in catalysing the synthesis of polypeptide chains during the process of translation.
The complex of proteins and RNA molecules that uses messenger RNA as a template for the synthesis of polypeptide chains via the process of translation.
The phase of the eukaryotic cell cycle in which the genome is replicated.
Some eukaryotes, such as mammals, use differences between particular pairs of chromosomes (the sex chromosomes) to determine the sex of an organism. In humans, there are two sex chromosomes, X and Y; they have very different lengths, and only a small region of the considerably larger X chromosome shares extensive sequence similarity with the Y chromosome. However, despite these differences, they are still sometimes considered a homologous pair of chromosomes, as they pair together at the metaphase plate during meiosis I.
The process by which regions (introns) of the initial transcript of a gene are removed, retaining only the exons.
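Computationally, splicing can be sketched as keeping the exon intervals of the initial transcript and joining them in order. The function `splice` and the sequences below are hypothetical; exon coordinates are assumed to be 0-based, half-open intervals listed in transcript order.

```python
def splice(pre_mrna, exons):
    """Join the exon regions of an initial transcript, discarding the introns.

    `exons` is a list of (start, end) pairs: 0-based, end-exclusive,
    non-overlapping, and sorted by position along the transcript.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical initial transcript: exon 1, one intron, exon 2.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

exon_coordinates = [
    (0, len(exon_1)),
    (len(exon_1) + len(intron), len(pre_mrna)),
]

print(exon_coordinates)                    # [(0, 6), (18, 24)]
print(splice(pre_mrna, exon_coordinates))  # AUGGCCUUUUAA
```

Supplying a different list of exon intervals for the same initial transcript gives a simple model of alternative splicing.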
One of the two DNA molecules within a double-stranded DNA molecule is often referred to as a strand.
A term used by biochemists and pharmacologists to refer to non-polymer organic molecules. For example, individual amino acids, nucleotides, simple sugars, and many drugs are described as small molecules.
Wrapping of the DNA double helix around itself. Supercoiling yields a more compact structure compared to DNA in a relaxed state. Supercoils can be induced by changing the number of times the two strands of a DNA double helix wrap around each other compared to in their relaxed state.
Codons are synonymous if they code for the same amino acid.
Previously, this term was used to describe groups of organisms that share similar characteristics; current usage is to apply it, if possible, only to such groups where the organisms in the group are believed to represent all the descendants of a single common ancestor. Taxonomic groups are organised hierarchically; within more general groups, organisms are further classified into more specialised sub-groups. For example, the more general group Eukarya (eukaryotes) includes, among others, the groups Animalia (animals) and Plantae (plants); humans are members of both Eukarya and Animalia, but not Plantae.
Structures located at the ends of linear chromosomes or plasmids. Often “telomere” refers only to such structures in eukaryotic cells; however, it is also sometimes used to refer to such structures in both eukaryotes and prokaryotes.
An enzyme that changes the number of times the two DNA strands within a DNA double helix twist around each other. This can act to introduce or relax supercoils in DNA molecules.
An RNA molecule synthesised via the process of transcription.
The process of synthesising an RNA molecule (a transcript) using the sequence of nucleotide bases in the DNA sequence as a template for the sequence of nucleotide bases in the transcript.
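A minimal sketch of this templating relationship follows, assuming the template DNA strand is supplied in the conventional 5′-to-3′ orientation. Because the transcript is antiparallel to its template, the transcript written 5′ to 3′ is the reverse complement of the template, with uracil in place of thymine. The names `TEMPLATE_TO_RNA` and `transcribe` are illustrative only.

```python
# Base in the DNA template -> complementary base incorporated into the RNA.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand):
    """Return the RNA transcript (5' to 3') templated by a DNA strand (given 5' to 3').

    The template is read from its 3' end while the transcript grows in the
    5'-to-3' direction, hence the reversal.
    """
    return "".join(TEMPLATE_TO_RNA[base] for base in reversed(template_strand))


# Hypothetical template strand; the opposite DNA strand reads ATGTAA.
print(transcribe("TTACAT"))  # AUGUAA
```

The transcript has the same sequence as the non-template DNA strand, except that U replaces T.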
A protein that binds to a region of a chromosome or plasmid and thereby activates or represses the expression of a gene.
The nucleotide position in a DNA sequence corresponding to the first base of an RNA transcript.
Essential components of the translation apparatus that provide a physical link between the codons in a messenger RNA (mRNA) sequence and the amino acids coded for by the mRNA. Interaction with a specific codon in the mRNA is mediated via an anticodon within the tRNA molecule. Prior to participating in translation, the amino acid corresponding to the codon recognised by the tRNA is covalently attached to the 3′ terminus of the RNA molecule. During translation, this amino acid is incorporated in the polypeptide chain encoded by the mRNA.
The process of synthesising a polypeptide molecule using the coding region of a messenger RNA (mRNA) molecule as a template. Translation is mediated by the ribosome.
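At the sequence level, the mapping from coding region to polypeptide can be sketched as reading successive codons and looking each one up in the genetic code. The names `CODON_TABLE` and `translate` are illustrative, the standard genetic code is assumed, and the input is assumed to be a coding region that begins at its first codon; initiation, the ribosome, and tRNA charging are not modelled.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order at each
# position; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_region: str) -> str:
    """Translate an mRNA coding region into one-letter amino acid codes."""
    rna = coding_region.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("coding region length must be a multiple of three")
    residues = []
    for i in range(0, len(rna), 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # a stop codon ends the polypeptide
            break
        residues.append(amino_acid)
    return "".join(residues)


polypeptide = translate("AUGGCCUUUGAAUAA")
print(polypeptide)
```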
A region of a nucleotide polymer that contains two or more adjacent copies of a given sequence of three nucleotide bases.
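One way to locate such regions, sketched here under simple assumptions, is a regular expression with a backreference: capture any three bases, then require one or more immediate copies of the same three bases. The names `TRINUCLEOTIDE_REPEAT` and `find_trinucleotide_repeats` are illustrative. The sketch only finds exact, uninterrupted copies, and it reports the leftmost reading frame of each repeat.

```python
import re

# Three bases followed by at least one more adjacent copy of the same three.
TRINUCLEOTIDE_REPEAT = re.compile(r"([ACGTU]{3})\1+")


def find_trinucleotide_repeats(sequence: str) -> list[tuple[int, str, int]]:
    """Return (start offset, repeated unit, copy number) for each repeat."""
    return [
        (match.start(), match.group(1), len(match.group(0)) // 3)
        for match in TRINUCLEOTIDE_REPEAT.finditer(sequence.upper())
    ]


repeats = find_trinucleotide_repeats("TTCAGCAGCAGCAGGA")
print(repeats)
```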
Regions of non-coding sequence that are very strongly conserved between different organisms. Evidence suggests that these regions are functional, in some cases as regulatory sequences. However, in most cases the function of such elements is unknown.
See “ultraconserved element”.
Regions of a messenger RNA (mRNA) molecule that do not encode the amino acid sequence of a polypeptide, i.e. that do not overlap with any codons. Each mRNA has two UTRs, one on either side of the coding regions: the 5′ and 3′ UTRs.
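The layout described above can be sketched by splitting an mRNA string into a 5′ UTR, a coding region, and a 3′ UTR. The function name `split_mrna` and the example sequence are illustrative. The sketch makes two strong simplifying assumptions: that the coding region starts at the first AUG in the sequence, and that it ends with the first in-frame stop codon, which is counted as part of the coding region. Real start-site selection is more involved, so this is not a gene-finding method.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna: str) -> tuple[str, str, str]:
    """Split an mRNA into (5' UTR, coding region, 3' UTR).

    Assumes the coding region runs from the first AUG to the first
    in-frame stop codon, inclusive.
    """
    rna = mrna.upper().replace("T", "U")
    start = rna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Step through the sequence codon by codon in the frame set by the AUG.
    for i in range(start, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            end = i + 3
            return rna[:start], rna[start:end], rna[end:]
    raise ValueError("no in-frame stop codon found")


five_prime_utr, coding_region, three_prime_utr = split_mrna(
    "GGAGCCAUGGCCUUUGAAUAAGCUUAAA"
)
print(five_prime_utr, coding_region, three_prime_utr)
```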
Towards the 5′ terminus of a nucleotide strand. For example, the 5′ UTR of a messenger RNA (mRNA) lies towards the 5′ end of the mRNA molecule compared to the coding region of the mRNA.
See “untranslated region”.
See “complementary base pair”.
A cell formed through the fusion of two gametes during sexual reproduction. The zygote combines the haploid genomic material of the two gametes within one diploid cell.