The evolutionary history of meiotic genes

University of Iowa
Iowa Research Online
Theses and Dissertations
2011
The evolutionary history of meiotic genes: early
origins by duplication and subsequent losses
Arthur William Pightling
University of Iowa
Copyright 2011 Arthur Pightling
This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/2960
Recommended Citation
Pightling, Arthur William. "The evolutionary history of meiotic genes: early origins by duplication and subsequent losses." PhD
(Doctor of Philosophy) thesis, University of Iowa, 2011.
http://ir.uiowa.edu/etd/2960.
THE EVOLUTIONARY HISTORY OF MEIOTIC GENES: EARLY ORIGINS BY
DUPLICATION AND SUBSEQUENT LOSSES
by
Arthur William Pightling
An Abstract
Of a thesis submitted in partial fulfillment
of the requirements for the Doctor of
Philosophy degree in Biology
in the Graduate College of
The University of Iowa
May 2011
Thesis Supervisor: Associate Professor John M. Logsdon, Jr.
Meiosis is necessary for sexual reproduction in eukaryotes. Genetic recombination
between non-sister homologous chromosomes is needed in most organisms for successful
completion of the first meiotic division. Proteins that function during meiotic recombination
have been studied extensively in model organisms. However, less is known about the evolution
of these proteins, especially among protists. We searched the genomes of diverse eukaryotes,
representing all currently recognized supergroups, for 26 genes encoding proteins important for
different stages of interhomolog recombination. We also performed phylogenetic analyses to
determine the evolutionary relationships of gene homologs. At least 23 of the genes tested (nine
that are known to function only during meiosis in model organisms) are likely to have been
present in the Last Eukaryotic Common Ancestor (LECA). These genes encode products that
function during: i) synaptonemal complex formation; ii) interhomolog DNA strand exchange; iii)
Holliday junction resolution; and iv) sister-chromatid cohesion. These data strongly suggest that
the LECA was capable of these distinct and important functions during meiosis. We also
determined that several genes whose products function during both mitosis and meiosis are
paralogs of genes whose products are known to function only during meiosis. Therefore, these
meiotic genes likely arose by duplication events that occurred prior to the LECA.
The Rad51 protein catalyzes DNA strand exchange during both mitosis and meiosis,
while Dmc1 catalyzes interhomolog DNA strand exchange only during meiosis. To study the
evolution of these important proteins, we performed degenerate PCR and extensive nucleotide
and protein sequence database searches to obtain data from representatives of all available
eukaryotic supergroups. We also performed phylogenetic analyses on the Rad51 and Dmc1
protein sequence data obtained to evaluate their utility as phylogenetic markers. We determined
that evolutionary relationships of five of the six currently recognized eukaryotic supergroups are
supported with Bayesian phylogenetic analyses. Using this dataset, we also identified ten amino
acid residues that are highly conserved among Rad51 and Dmc1 protein sequences and,
therefore, are likely to confer protein-specific functions. Due to the distributions of these
residues, they are likely to have been present in the Rad51 and Dmc1 proteins of the LECA.
To address an important issue with the gene inventory method of scientific inquiry, we
developed a heuristic metric for determining whether apparent gene absences are due to
limitations of the sequence search regimen or represent true losses of genes from genomes. We
collected RNA polymerase I (Pol I), Replication Protein A (RPA), and DNA strand exchange
(SE) sequence data from 47 diverse eukaryotes. We then compared the numbers of apparent
absences to a single measure of protein sequence length and sequence conservation (Smith-Waterman pairwise alignment (S-W) scores) obtained by comparing yeast and human protein
sequence data. Using Poisson correlation regression to analyze the Pol I and RPA subunit
datasets, we confirmed that S-W scores and apparent gene absences are correlated. We also
determined that genes encoding products that are critical for interhomolog SE in model
organisms (Rad52, Rad51, Dmc1, Rad54, and Rdh54) have been lost frequently during
eukaryotic evolution. Saccharomyces cerevisiae null rad52, dmc1, rad54, and rdh54 mutant
phenotypes are suppressed by rad51 overexpression or mutation. If rad51 overexpression or
mutation affects other eukaryotes in a similar fashion, this phenomenon may account for frequent
losses of genes whose products are critical for the completion of meiosis in model organisms.
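The two ingredients of the heuristic described above, a Smith-Waterman local alignment score and a Poisson regression of absence counts on that score, can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure used in this thesis: the match, mismatch, and gap values are placeholders (protein comparisons normally use a substitution matrix and affine gap penalties), the function names are hypothetical, and the numbers in the example are synthetic values generated from known coefficients rather than data from this study.

```python
import math


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty).

    The scoring values are illustrative assumptions, not those of the thesis.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def fit_poisson(xs, ys, iterations=50, tol=1e-10):
    """Fit log(mean) = b0 + b1 * x by Newton-Raphson and return (b0, b1).

    Assumes sum(ys) > 0 and that xs are not all identical. Large raw scores
    should be rescaled (for example, divided by 1000) before fitting.
    """
    b0 = math.log(sum(ys) / len(ys))  # intercept-only starting point
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Entries of the symmetric 2 x 2 information matrix.
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: information matrix inverse times gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    return b0, b1


def predicted_absences(b0, b1, score):
    """Expected number of apparent absences for a given (scaled) score."""
    return math.exp(b0 + b1 * score)


# Synthetic example: counts generated from known coefficients (2.0, -0.5).
scaled_scores = [0.0, 1.0, 2.0, 3.0]
counts = [math.exp(2.0 - 0.5 * s) for s in scaled_scores]
print(smith_waterman_score("XXABCDXX", "YYABCDYY"))
print(fit_poisson(scaled_scores, counts))
```

Under a fit of this kind, a gene with many more observed absences than its score predicts is a candidate for true loss, whereas absences close to the prediction are more plausibly detection failures.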
Finally, we place this work into greater context with a review of hypotheses for the
selective forces and mechanisms that resulted in the origin of meiosis. The review and the data
presented in this thesis provide the basis for a model of the origin of meiotic genes in which
meiosis arose from mitosis by large-scale gene duplication, following a preadaptation that served
to reduce increased numbers of chromosomes (from diploid to haploid) caused by erroneous
eukaryotic cell-cell fusions.
Copyright by
ARTHUR WILLIAM PIGHTLING
2011
All Rights Reserved
Graduate College
The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL
PH.D. THESIS
This is to certify that the Ph.D. thesis of Arthur William Pightling has been approved
by the Examining Committee for the thesis requirement for the Doctor of Philosophy
degree in Biology at the May 2011 graduation.
Thesis Committee:
John M. Logsdon, Jr., Thesis Supervisor
Stephen D. Hendrix
Robert E. Malone
Bryant F. McAllister
Hallie J. Sims
For my family
ACKNOWLEDGMENTS
I would like to express my gratitude to the many people at the University of
Iowa who have contributed to my growth as a scientist. I would like to thank my thesis
advisor Dr. John Logsdon for his mentorship, time, unfettered access to his laboratory,
and insightful feedback. I would especially like to thank him for allowing me to work on
projects he conceived, while also extending me the freedom to develop concepts of my
own. I am fortunate that he funded my participation in two workshops at which I learned
new molecular phylogenetic and evolutionary analyses, in addition to techniques for
collecting, identifying, and culturing eukaryotic microorganisms (protists). I am very
grateful to the other members of my thesis advisory committee, Dr. Stephen Hendrix, Dr.
Bryant McAllister, Dr. Robert Malone, and Dr. Hallie Sims for their invaluable time,
patience, constructive criticism, and helpful advice. I would also like to thank Dr. John
Logsdon, Dr. Bryant McAllister, Dr. Josep Comeron, Dr. Ana Llopart, Dr. Jeff Klahn,
and Dr. Maurine Neiman for the opportunity to teach the Evolution course with them
nearly every semester in the past 6 years. My teaching experience has clearly contributed
to my growth as an evolutionary biologist.
I am thankful for the contributions of my collaborators to my projects. I would
like to thank Matthew Brockman for his enthusiasm and unfailing extensive computer
support, without which most of this thesis would not have been possible. Cindy Brochu,
Abram Doval, Nicole Adams, Lauren Stefaniak, and Nevin Sebastian are thanked for
their technical assistance and sequencing. I would also like to acknowledge former and
current members of the Logsdon Lab for illustrative discussions. I would especially like
to thank Dr. Banoo Malik for training me initially in the lab and with phylogenetic
analyses, for helpful discussions, for helpful comments on chapters 2 and 3 of this thesis,
and for encouraging me to collaborate with her.
The research in this thesis would not have been possible without financial support
from various sources. Funds awarded with an honorable mention for the Sally Casanova
Predoctoral Fellowship at the California State University enabled me to apply to the
University of Iowa’s Ph.D. program and to purchase the textbooks and phylogenetic software
that I used to launch my research in the initial years of my doctoral thesis projects. During the summers of
2004 and 2005, the University of Iowa’s Avis Cone Graduate Summer Fellowships
supported me while performing laboratory research, and in 2010 a University of Iowa
Graduate College Graduate Summer Fellowship supported my bioinformatic research. In
2008 I received a travel award for participation in the Bodega Bay Applied Phylogenetics
Workshop, and student travel awards from the International Society of Protistologists and
International Society of Evolutionary Protistology for presenting my work as a talk at the
Protist 2008 conference. I was also supported as a teaching assistant in fall 2004, fall and
spring of 2006, 2007, 2008, 2009 and 2010 by the Biology Department at the University
of Iowa. Otherwise my research has been supported by funding to my thesis advisor from
the National Science Foundation, grants # MCB-0216702 and EF-0431117.
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER
1. GENERAL INTRODUCTION
    The origin of eukaryotes
    A comparison of mitotic and meiotic divisions
    The origin and evolution of meiotic genes
    Components of meiotic interhomolog DNA strand exchange
    Current state of the eukaryotic phylogeny
    Summary
2. A PAN-EUKARYOTIC INVENTORY OF DNA STRAND EXCHANGE COMPONENTS REVEALS PATTERNS OF CONSERVATION AND LOSS
    Abstract
    Introduction
    Methods
        Data acquisition
        Phylogenetic analyses
        Inventory assembly
    Results and discussion
        Limits of sequence detection and distribution of strand exchange genes among eukaryotes
        Suppressors of strand exchange component mutant phenotypes in Saccharomyces cerevisiae
        Conclusions
3. PHYLOGENOMIC ANALYSIS OF RECA HOMOLOGS RAD51 AND DMC1 FROM ALL SUPERGROUPS PROVIDES EVIDENCE FOR MEIOSIS IN THE LAST COMMON ANCESTOR OF EUKARYOTES
    Background
    Results and discussion
        Phylogenetic analysis of Dmc1
        Phylogenetic analysis of Rad51
        Phylogenetic analyses of Rad51 and Dmc1
        Characteristics of Rad51 and Dmc1 protein sequences
        Conclusions
    Methods
        Database searches
        Degenerate PCR
        Phylogenetic analyses
4. MEIOSIS-SPECIFIC GENES AROSE BY DUPLICATION PRIOR TO THE LAST COMMON ANCESTOR OF EUKARYOTES
    Abstract
    Introduction
    Results and discussion
        Distributions of meiotic genes
        Assessment of distributions
        Case study: the Spo11 genes
        Conclusions
    Methods
        Database searches
        Phylogenetic analyses
        Inventory assembly
5. CONCLUDING REMARKS
    Why meiosis?
    Meiosis arose from mitosis
    A model for the evolution of meiotic DNA strand exchange genes
    A model for the origin of meiosis
    Future directions
REFERENCES
LIST OF TABLES
Table
2.1 DNA strand exchange component absences from eukaryotic groups
2.2 Protein sequence comparisons between Saccharomyces cerevisiae and Homo sapiens
2.3 Protein sequence comparisons between Homo sapiens and Oryza sativa
2.4 Protein sequence comparisons between Oryza sativa and Saccharomyces cerevisiae
2.5 Functions of strand exchange protein with Saccharomyces cerevisiae null mutant phenotypes and suppressors
2.6 The most complete genomes of the genera searched during this study with web addresses
3.1 Support for eukaryotic supergroups and first order groups from phylogenetic analyses of Rad51, Dmc1, and concatenated protein sequence data
3.2 Degenerate primers and their positions
3.3 Proposed functions of residues identified during this study
4.1 Proteins involved in four general categories of meiosis and their functions
4.2 Observed numbers of sequence absences from 46 genomes, Smith-Waterman pairwise alignment scores, predicted numbers of absences, and the proportion of observed absences likely due to detection failures for 20 proteins that function during meiosis
4.3 Genome sequence databases searched with web address and references
LIST OF FIGURES
Figure
1.1 Evolutionary relationships among prokaryotes, members of six currently recognized eukaryotic supergroups and Apusozoa according to multigene phylogenetic analyses
1.2 The three-kingdom tree of life with relative order of major events during eukaryotic evolution
1.3 General schematic of mitosis and meiosis
1.4 General model of interhomolog DNA strand exchange during meiosis
1.5 A model for the origin of meiotic function by gene duplication
2.1 Phylogenetic distribution among eukaryotes of DNA strand exchange genes
2.2 Unrooted phylogenetic tree of 47 Replication Protein A – 1 (RPA1) homologs
2.3 Unrooted phylogenetic tree of 42 Replication Protein A – 2 (RPA2) homologs
2.4 Unrooted phylogenetic tree of 36 Replication Protein A – 3 (RPA3) homologs
2.5 Unrooted phylogenetic tree of 44 Replication Protein A – 1 (RPA1) homologs
2.6 Unrooted phylogenetic tree of 29 Rad52 homologs
2.7 Unrooted phylogenetic tree of 46 Rad51 homologs
2.8 Unrooted phylogenetic tree of 41 Rad51 homologs
2.9 Unrooted phylogenetic tree of 34 Rad55 homologs
2.10 Unrooted phylogenetic tree of 42 Rad57 homologs
2.11 Unrooted phylogenetic tree of 38 Rad57 homologs
2.12 Unrooted phylogenetic tree of 34 Dmc1 homologs
2.13 Unrooted phylogenetic tree of 38 Hop2 homologs
2.14 Unrooted phylogenetic tree of 41 Mnd1 homologs
2.15 Unrooted phylogenetic tree of 34 Mnd1 homologs
2.16 Unrooted phylogenetic tree of 29 Rdh54 homologs
2.17 Unrooted phylogenetic tree of 34 Rad54 homologs
2.18 Unrooted phylogenetic tree of 13 Rad59 homologs
2.19 Unrooted phylogenetic tree of 46 sets of 13 concatenated strand exchange homologs
2.20 Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-A) from 54 diverse eukaryotes
2.21 Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-B) from 54 diverse eukaryotes
2.22 Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-C) from 54 diverse eukaryotes
2.23 Multiple sequence alignment of RPA2 ssDNA binding domain (DBD-D) from 45 diverse eukaryotes
2.24 Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-F) from 54 diverse eukaryotes
2.25 Multiple sequence alignment of RPA3 ssDNA binding domain (DBD-E) from 36 diverse eukaryotes
2.26 Phylogenetic distribution among eukaryotes of RNA Polymerase I core complex subunit genes
2.27 Number of detection failures for RNA polymerase I, RPA and SE proteins as predicted by Poisson regression analysis compared with observed numbers of detection failures
3.1 Graphic representation of Rad51 or Dmc1 gene sequence fragments amplified with degenerate PCR from representatives of four eukaryotic supergroups and Apusozoa relative to Saccharomyces cerevisiae Rad51 protein sequence
3.2 Unrooted phylogenetic tree of 47 Dmc1 homologs
3.3 Unrooted phylogenetic tree of 47 Dmc1 homologs with accession numbers
3.4 Unrooted phylogenetic tree of 54 Dmc1 and RadA homologs with accession numbers
3.5 Unrooted phylogenetic tree of 105 Rad51 and Dmc1 homologs
3.6 Unrooted phylogenetic tree of 112 Rad51, Dmc1, and RadA homologs
3.7 Unrooted phylogenetic tree of 157 Rad51, Dmc1 and RadA homologs
3.8 Unrooted phylogenetic tree of 52 Rad51 homologs
3.9 Unrooted phylogenetic tree of 58 Rad51 homologs with accession numbers
3.10 Unrooted phylogenetic tree of 65 Rad51 and RadA homologs with accession numbers
3.11 Unrooted phylogenetic tree of 40 Concatenated Rad51 and Dmc1 homologs
3.12 Unrooted phylogenetic tree of 40 Concatenated Rad51 and Dmc1 homologs with accession numbers (Dmc1/Rad51)
3.13 Protein sequence alignment of prokaryotic and eukaryotic recA orthologs with amino acids conserved among 158 protein sequences indicated
3.14 p-distance matrix of prokaryotic and eukaryotic recA orthologs
4.1 Distribution of 20 homologs that function during meiosis among 46 eukaryotes representing all eukaryotic supergroups
4.2 Presence of 20 homologs that function during meiosis in the last eukaryotic common ancestor (LECA) inferred by their distribution among eukaryotic supergroups
4.3 Unrooted phylogenetic tree of 50 eukaryotic Hop1 and Rev7 homologs
4.4 Unrooted phylogenetic tree of 49 eukaryotic Rad21 and Rec8 homologs
4.5 Unrooted phylogenetic tree of 69 eukaryotic Spo11-1, Spo11-2, and Spo11-3 homologs with 6 archaebacterial Top6A homologs
4.6 Unrooted phylogenetic tree of 69 eukaryotic Spo11-1, Spo11-2, and Spo11-3 homologs
4.7 Unrooted phylogenetic tree of 81 eukaryotic Rad51 and Dmc1 homologs with 6 archaebacterial RadA homologs
4.8 Unrooted phylogenetic tree of 81 eukaryotic Rad51 and Dmc1 homologs with
4.9 Unrooted phylogenetic tree of 82 eukaryotic Hop2 and Mnd1 homologs
4.10 Unrooted phylogenetic tree of 131 eukaryotic Mlh1, Mlh2, Mlh3, and Pms1 homologs with 4 archaebacterial MutL homologs
4.11 Unrooted phylogenetic tree of 131 eukaryotic Mlh1, Mlh2, Mlh3, and Pms1 homologs
4.12 Unrooted phylogenetic tree of 113 eukaryotic Mer3, Brr2, and Slh1 homologs with 6 archaebacterial Ski2 homologs
4.13 Unrooted phylogenetic tree of 113 eukaryotic Mer3, Brr2, and Slh1 homologs
4.14 Unrooted phylogenetic tree of 183 eukaryotic Msh2, Msh3, Msh4, Msh5, and Msh6 homologs with 5 archaebacterial MutS homologs
4.15 Unrooted phylogenetic tree of 183 eukaryotic Msh2, Msh3, Msh4, Msh5, and Msh6 homologs
4.16 Radial tree topologies of archaebacterial and eukaryotic homologs
4.17 Number of detection failures as predicted by Poisson regression analysis or RNA polymerase I and Replication Protein A subunits with observed numbers of detection failures for 18 meiotic genes
5.1 General model for the evolution of DNA strand exchange genes
5.2 Alignment of conserved Rad51 and Dmc1 residues
5.3 Model for mitotic ploidy reduction in ancestral eukaryotes
PREFACE
This thesis describes research on the origins and evolution of eukaryotic gene
homologs involved in different stages of meiosis. Chapters 2, 3 and 4 of this thesis are
written in the form of manuscripts intended for submission to peer-reviewed journals for
publication in the very near future. The chapters are formatted according to the
requirements of each journal. The exception is the references, which are formatted
consistently throughout in the style of Molecular Biology and Evolution. Gene names are
italicized, with the first letter in uppercase for archaebacterial and eukaryotic genes
(e.g. RadA and Rad51); genetic mutants of genes are presented in lower case
(e.g. radA and rad51); and proteins are presented in normal font with the first letter
in uppercase (e.g. RadA and Rad51).
The research presented in this thesis builds upon a now published phylogenomic
study of 29 genes involved in meiosis in 5 of 6 currently recognized supergroups of
eukaryotes (Malik et al. 2008), which itself is not included in this thesis. Dr. Banoo
Malik was the primary author; I was the second author, followed by Lauren Stefaniak,
Dr. Andrew Schurko, and Dr. John Logsdon. I cloned and sequenced Trichomonas
vaginalis mutL homologs during my laboratory rotation project, and later analyzed
homologs of the Rad52, recA and mutL gene families, contributed 40% of Figure S1 and
Tables S1.1, S1.2 and S1.3, and helped revise the manuscript. Banoo advised me on the
laboratory and phylogenetic techniques used in my contribution to this publication. Other
specific contributions and acknowledgements for this project are detailed in the
publication. The entire content of this thesis either followed from this project or was
motivated by discussions with my advisor, Dr. John Logsdon, who provided advice and
technical expertise throughout, with additional feedback over the years from former and
current members of the Logsdon lab, as well as members of my supervisory committee.
Chapter 2 describes a bioinformatic study and is organized as a manuscript for
submission to the journal Molecular Biology and Evolution. This chapter details the
phylogenomic distributions of 13 genes encoding proteins that catalyze DNA strand
exchange during interhomolog recombination among 47 diverse genera. This project
builds upon the previously published study (Malik et al. 2008) by further investigating
the distributions of Rad51, Dmc1, Rad52, Hop2, Mnd1 and other “strand exchange”
genes among 34 diverse eukaryotes. These data were also used to explore a heuristic
metric for determining the limits of sequence detection versus bona fide gene loss. I
began developing this project in 2006 during the “Writing in the Natural Sciences”
graduate course offered by Dr. Stephen Hendrix, with feedback on the initial draft
manuscript offered by Dr. Hendrix and my classmates Rebecca Hart-Schmidt, Mike
Peglar, Banoo Malik, and Min Wu. The analyses in the current version of the manuscript
really took shape immediately before, during and after my December 2009 two-week
visit to New York to accompany my wife during her surgery, when I was a guest in Dr.
Jane Carlton’s laboratory at New York University’s Department of Medical Parasitology.
Helpful discussions with Dr. Malik, Dr. Carlton and Dr. Steven Sullivan led me to devise
my criteria for selecting the genomes scrutinized in this chapter, and led me to further
utilize NCBI’s BLAST tools to search local databases (that I built myself on my own
computer) by PSI-tBLASTn and HMMer. I conceived the project, performed all the
analyses, and am the primary author of the manuscript. My advisor, Dr. John Logsdon, is
the senior author, provided advice on the research design and implementation, and
revised the manuscript. My thesis committee members provided helpful comments
throughout the project, including some time-consuming detailed technical suggestions
and advice on statistical analyses of regression provided by Dr. Bryant McAllister and
Dr. Stephen Hendrix.
Chapter 3 is organized as a manuscript for submission to the journal BioMed
Central – Evolutionary Biology. I am the primary author, followed by Rebecca Hernan,
Dr. Nidhi Sahni and Dr. John Logsdon. The project builds further upon my advisor’s
evolutionary analyses of Rad51 and Dmc1 protein sequences from animals, plants, and
fungi initially published by Stassen, et al. (1997), and Dr. Logsdon’s unpublished work
on protist Rad51 and Dmc1 genes conceived and begun during his own postdoctoral
research. This chapter reports bioinformatic analyses of Rad51 and Dmc1 sequence data
obtained from searches of public gene and genome sequence databases and, with the help of
my co-authors, by degenerate PCR experiments in the laboratory. I amplified and cloned
69% of the reported degenerate PCR products, oversaw the laboratory research of my
undergraduate assistant, Rebecca Hernan, and Dr. Nidhi Sahni during her laboratory
rotation, and I searched public databases, performed phylogenetic analyses and wrote the
manuscript. Nidhi amplified and cloned 14% of the reported degenerate PCR products,
and Rebecca amplified and cloned 14% of the degenerate PCR products. Nevin Sebastian
amplified and cloned 3% of the degenerate PCR products, overseen by Dr. Andrew
Schurko. Degenerate PCR products that I isolated for several organisms were superseded
by the public release of genome sequence data, and so these sequenced PCR products are
excluded from the chapter. DNA samples were obtained by collaboration with Jeff Cole
and Dr. Robert Molestina at the American Type Culture Collection (ATCC, Manassas,
VA) and Dr. Laura Katz and her assistant Jessica Grant (Smith College, Northampton
MA). Research assistants Cindy Brochu, Abram Doval, Nicole Adams, Lauren Stefaniak
and Nevin Sebastian sequenced the clones. Dr. John Logsdon conceived and initiated the
project, developed the initial set of degenerate PCR primers, advised on degenerate PCR
strategies and phylogenetic analysis, provided helpful discussion of research design and
implementation and revised the manuscript.
In Chapter 4, a phylogenomic analysis of the distribution among 46 diverse
eukaryotes of 20 genes whose products function during meiosis in model organisms is
presented. The chapter is organized as a manuscript for the journal Molecular Biology
and Evolution. It represents the culmination of the studies presented here and follows
from Dr. Banoo Malik’s doctoral research while she was in the Logsdon lab (Malik et al.
2008). I am the primary author of this chapter; Dr. Malik (now at Dalhousie University)
is co-primary author, followed by Dr. John Archibald (Dalhousie University) and Dr.
John Logsdon. Banoo’s thesis indicated that several genes encoding proteins that are
known to function only during meiosis in model animals, fungi and plants actually arose
early during eukaryotic evolution by gene duplication. I have expanded the taxonomic
sampling to include more putatively basal lineages in the diverse eukaryotic groups, I
learned and made use of several new applications for phylogenetic analysis and gene
sequence search methods, and I wrote the manuscript. Banoo identified meiotic gene
models in Bigelowiella natans, provided her initial multiple sequence alignments of
meiotic proteins from 2008, helped with taxon selection and in considering key
discussion points. B. natans is the first sequenced representative of the Rhizaria, the only
eukaryotic supergroup for which we lacked genetic information in our previous
phylogenomic analyses. Dr. John Archibald, his co-investigators (Dr. M.W. Gray, Dr.
G.I. McFadden, Dr. P.J. Keeling, and Dr. C. Lane), and the Joint Genome Institute
provided access to their data for the first Rhizarian genome sequence (of Bigelowiella
natans) prior to its public release. Dr. John Logsdon advised Banoo and me on the research
design and implementation, provided helpful discussion, and revised the manuscript.
My thesis committee members, Dr. Logsdon and Dr. Malik all provided helpful
comments or discussion for Chapters 1 and 5.
CHAPTER 1
GENERAL INTRODUCTION
All known extant eukaryotes descended from a common ancestor (Darwin 1859) that
lived approximately 2.1 - 2.7 billion years ago, according to geochemical and fossil
evidence (Han and Runnegar 1992; Brocks et al. 1999). Based upon the distributions of
traits among eukaryotes, the last common ancestor was most likely a free-living,
unicellular eukaryote that occupied moderate (mesophilic), aerobic environments and
obtained nutrients by engulfing other organisms (phagocytosis) (Cavalier-Smith 2002a).
Today, a wide variety of unicellular and multicellular eukaryotes are observed that live
in diverse habitats (e.g. aerobic, anaerobic, extremophilic, and mesophilic) and fulfill
many different lifestyles (e.g. symbiotic, free-living, sexual, and asexual) (Knoll 2003;
Adl et al. 2005).
Remarkably, all extant eukaryotic lineages began their evolutionary journeys
with the same genetic material (i.e. a common ancestral genome) that was subsequently
shaped by random genetic mutations (Watson and Crick 1953) and natural selection
(Darwin 1859). However, which genes were present within that ancestral genome and
how those genes subsequently evolved are open questions. Elucidating the origins of
genes that encode products responsible for important biological processes provides a
means of comparing extant eukaryotes to their ancestors (Villeneuve and Hillers 2001).
Although direct observation of the ancestor of extant eukaryotes is obviously
impossible, inferring which genes were likely to have been present within its genome is
possible (Dacks and Doolittle 2001). By comparing inferred suites of genes in the last
common ancestor of eukaryotes to the suites of genes present in extant eukaryotes we
can study the evolutionary histories of the genes themselves. In this way, we can gain
insight into the origins and evolution of important biological reactions and we can begin
to establish the order of events that occurred during the early evolution of eukaryotes
(Roger 1999).
An approach to studying the origin and evolution of genes is to search for them
within the genomes of diverse organisms (Dacks and Doolittle 2001). Among
eukaryotes, animals, fungi, and plants are estimated to represent the global majority of
named species (Fenchel and Finlay 2004). Currently six eukaryotic groups
(supergroups) have been proposed on the basis of ultrastructural, genetic, and
phylogenetic analyses (Figure 1.1) (Cavalier-Smith 2004; Baldauf 2008). Animals,
fungi, and plants occupy only two eukaryotic supergroups (Opisthokonta and
Archaeplastida), while protists (eukaryotic organisms with unicellular, colonial,
filamentous, or parenchymatous organization that lack vegetative tissue differentiation,
except for reproduction (Adl et al. 2005)) are present in all six eukaryotic supergroups
and are the predominant or sole occupants of four of them (Amoebozoa,
Chromalveolata, Excavata, and Rhizaria; see Current state of the eukaryotic phylogeny
below) (Adl et al. 2005; Adl et al. 2007). Therefore, including diverse protists in
evolutionary studies is important in order to sample the full breadth of eukaryotes
(Ramesh, Malik, and Logsdon 2005).
The presence of orthologs (genes inherited from common ancestors) (Ridley
2004) among groups of eukaryotes implies that those genes were present in their last
common ancestor (Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005). If
genes are detected in the genomes of representatives of all known eukaryotic groups
then they are inferred to have been present in the last common ancestor of all known
eukaryotes (Koonin 2010). Apparent absences of particular genes from the genomes of
eukaryotes may be observed if either the gene arose later during eukaryotic evolution
(after the evolutionary divergence of lineages from other eukaryotes) or it was
subsequently lost (Dacks and Doolittle 2001). The interpretation of apparently missing
genes depends upon our current understanding of the evolutionary relationships among
eukaryotes (i.e. the eukaryotic phylogeny).
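The inference described above can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used in this thesis: the function name `interpret_distribution` and its return labels are hypothetical, and the sketch assumes that a single detected ortholog is enough to score a supergroup as having the gene. It ignores horizontal gene transfer and failures of sequence detection.

```python
# The six supergroups named in the text.
SUPERGROUPS = (
    "Opisthokonta", "Amoebozoa", "Archaeplastida",
    "Chromalveolata", "Excavata", "Rhizaria",
)


def interpret_distribution(detected_in):
    """Classify a gene from the supergroups in which an ortholog was detected.

    Hypothetical helper for illustration only; it ignores horizontal
    transfer and failures of sequence detection.
    """
    detected = set(detected_in)
    unknown = detected - set(SUPERGROUPS)
    if unknown:
        raise ValueError(f"unrecognized supergroup(s): {sorted(unknown)}")
    missing = [group for group in SUPERGROUPS if group not in detected]
    if not missing:
        # Detected in every supergroup: inferred present in the ancestor.
        return "present in last common ancestor", missing
    # An apparent absence alone cannot distinguish later origin from loss.
    return "later origin or loss", missing


if __name__ == "__main__":
    status, missing = interpret_distribution(
        ["Opisthokonta", "Archaeplastida", "Excavata"]
    )
    print(status, missing)
```

The second outcome is deliberately left ambiguous: choosing between later origin and loss requires a rooted phylogeny, which is why the interpretation depends on the eukaryotic tree.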
The origin of eukaryotes
All living organisms share characteristics which indicate they arose from a
common cellular ancestor (Darwin 1859) or a population of ancestral cells promiscuously
exchanging molecules (Doolittle et al. 2008). To name a few, all living things are
cellular and use ATP for energy, DNA as the hereditary genetic material, a common
genetic code (for the most part), and similar transcription and translation machinery
including RNA (Griffiths et al. 2000; Knoll 2003). The phylogenetic tree of life, which
depicts the genealogical relationships of all living organisms, is composed of three major
branches (domains) occupied by eubacteria (Bacteria), archaebacteria (Archaea), and
eukaryotes (Eucarya) (Figure 1.2) (Woese and Fox 1977; Woese, Kandler, and Wheelis
1990; Brown and Doolittle 1995). The position of the root of the tree of life, with
Bacteria on one side and Archaea and Eucarya on the other, was determined by
phylogenetic analyses (Gogarten et al. 1989; Iwabe et al. 1989). This tree topology,
which proposes that eubacteria are the earliest-diverging forms of life (Pool 1990), is
supported by fossil and biogeochemical data (Brocks et al. 1999; Knoll 2003). However,
the relationship between Archaea and Eucarya is currently in dispute. The three domain
hypothesis proposes that archaebacteria and eukaryotes are sisters (monophyletic groups
that share a common ancestor) (Cavalier-Smith 1987a), while the eocyte hypothesis
proposes that eukaryotes are not sisters of but arose from within the archaebacterial
lineage (Crenarchaeota) (Lake et al. 1984). Compelling arguments put forward by
Cavalier-Smith (1987 and 2002) point out that the most parsimonious interpretation of
the distributions of homologous features among the three domains is that Archaea and
Eucarya are sisters (Cavalier-Smith 1987a; Cavalier-Smith 2002c). However, a relatively
recent set of robust phylogenetic analyses appears to support the eocyte hypothesis
(Archibald 2008; Cox et al. 2008). As this issue is currently unresolved and the effect of
this distinction on the following discussion is subtle, I will, by convention, continue with
the three domain model.
All extant eukaryotes share features that distinguish them from other forms of life
and support their common ancestry (Maynard Smith and Szathmary 1995). The most
relevant of these features to the subject of meiosis are linear chromosomes, contained
within a nucleus that is part of an endomembrane system (which includes the nuclear
envelope, endoplasmic reticulum, Golgi apparatus, and lysosomes) and an endoskeleton
(Cavalier-Smith 2002a; Cavalier-Smith 2010). Eukaryotes also possess mitochondria
(Roger 1999) (although some have highly derived forms of mitochondria called
hydrogenosomes (Muller 1993) and mitosomes (Tovar, Fischer, and Clark 1999) instead)
and many eukaryotic cells also contain photosynthetic plastids (Adl et al. 2005).
Mereshkowsky (1910) and Koso-Polyanski (1924) first proposed that
mitochondria and chloroplasts are symbionts that arose from the engulfment of bacteria
by an ancestral eukaryotic cell; this forgotten concept was later independently revived by
Lynn Margulis (Sagan 1967; Maynard Smith and Szathmary 1995; Knoll 2003). It is
now widely accepted (Embley and Martin 2006; Poole and Penny 2007) that
mitochondria and chloroplasts are the endosymbiotic descendants of bacteria (α-proteobacteria and cyanobacteria, respectively) that were engulfed by eukaryotes (Margulis 1970; Gray
and Doolittle 1982; Gray 1989). The most convincing support for the origins of
organelles by endosymbioses comes from phylogenetic analyses that indicate genes from
mitochondria or chloroplasts are more closely related to bacteria than to eukaryotes
(Poole and Penny 2007). While endosymbioses of cyanobacteria have likely occurred
multiple times after the divergence of eukaryotes from their last common ancestor (Yoon
et al. 2004), the engulfment of α-proteobacteria probably occurred one time prior to the
divergence of all extant eukaryotes (Roger 1999). These observations have led to
hypotheses in which the nucleus is also proposed to be an endosymbiont, usually an
archaebacterium (Lake and Rivera 1994; Horiike et al. 2001; Shinozawa, Horiike, and
Hamada 2001).
If we compare the cytological and phylogenetic features of mitochondria and
chloroplasts to the nucleus, several important differences are apparent (Poole and Penny
2007). Unlike mitochondria and chloroplasts, which have at least two different
membranes (i.e. the original bacterial membrane surrounded by the eukaryotic
endomembrane), the nuclear envelope is in dynamic continuity with the rest of the
endomembrane apparatus (Margulis 1970). In addition, although endosymbioses of
eubacterial and eukaryotic (secondary plastids) cells within eukaryotic host cells are well
known, there are no known cases of archaebacterial intracellular endosymbionts (Poole
and Penny 2007). Finally, although many genes within eukaryotic genomes have been
identified as eubacterial or archaebacterial homologs (Koonin 2010), phylogenetic
analyses consistently retrieve topologies in which eukaryotic genes form distinct
monophyletic groups and do not arise from within eubacterial or archaebacterial groups
(Poole and Penny 2007). These topologies differ from evolutionary relationships inferred
from phylogenetic analyses of mitochondrial and plastid genes. The cytological and
phylogenetic differences between these organelles are best explained by autogenous
models of nuclear formation (Martin 1999).
According to the neomuran (“new walls”) hypothesis the most important event
during eukaryotic evolution was the replacement of the peptidoglycan murein in
eubacterial cell walls with N-linked glycoproteins in the common ancestor of
archaebacteria and eukaryotes, resulting in a “…more flexible surface coat…” (Cavalier-Smith 2002a). Initially, this change may have provided resistance to antibiotics similar to
penicillin that disrupted peptidoglycan synthesis (Maynard Smith and Szathmary 1995).
Archaebacteria may have substituted eubacterial acyl ester lipids with prenyl ether lipids,
resulting in a new exoskeleton, while proto-eukaryotes retained the flexible surface
(Cavalier-Smith 1987a; Cavalier-Smith 2002a). The eukaryotic membrane, along with a
complex cytoskeleton, allowed for the evolution of a phagocytic lifestyle, in which
particles (including other cells) are engulfed within a vacuole (Cavalier-Smith 1987a;
Cavalier-Smith 2002a). In short, subsequent invaginations resulted in the formation of
the endomembrane system, including the nuclear envelope (Cavalier-Smith 1987a;
Cavalier-Smith 1988; Cavalier-Smith 2002d; Cavalier-Smith 2010). Internal
compartmentalization may have provided spatial and temporal control of transcription
and translation and protected the genomic integrity of the proto-eukaryote (Cavalier-Smith 1987a; Cavalier-Smith 1988; Cavalier-Smith 2002d; Cavalier-Smith 2010). The
origin of nucleated cells, which are dramatically different from eubacterial or
archaebacterial cells that have no endomembrane system or organelles, necessitated the
evolution of distinctly eukaryotic modes of nuclear division (Maynard Smith and
Szathmary 1995).
A comparison of mitotic and meiotic divisions
Among eukaryotes two types of nuclear division are possible, mitosis and
meiosis (Figure 1.3) (Griffiths et al. 2000). The mitotic nuclear division, in which two
genetically identical cells arise from one, is the only mode of replication for somatic or
vegetative cells in multicellular organisms and serves as a form of asexual reproduction
in unicellular organisms (Flemming 1878; Weismann, Parker, and Ronnfeldt 1893;
Huxley 1942). The generalized meiotic nuclear division, during which the genetic
content of the cell products is halved, is the sole source of the cells necessary for sexual
reproduction (e.g. spores or gametes) in eukaryotes (Weismann, Parker, and Ronnfeldt
1893; Churchill 1970). This halving of the genetic material during meiosis serves to
maintain appropriate numbers of chromosomes when cells are subsequently fused,
restoring the parental state (Weismann, Parker, and Ronnfeldt 1893).
During both mitotic and pre-meiotic interphases, the genomes of cells are
replicated once (the synthetic or S-phase) (John 1990). Although mitotic and pre-meiotic S-phases are similar in that the duplication of chromosomes results in paired,
nearly identical, chromosome copies (sister chromatids), there are differences that
distinguish them (John 1990; DePamphilis 1996). For example, in yeast, pre-meiotic
S-phase is 2-3 times longer than the mitotic S-phase in diploids (approximately 65 and 30
minutes, respectively) (Williamson et al. 1983). This phenomenon has also been
witnessed in different animals, such as the newt Triturus (Callan 1972) and the fruit fly
Drosophila (Chandley 1966). These differences may be attributed to variation in
numbers and activation of replicon origins and the rate of replication fork migration
(John 1990; DePamphilis 1996). It is interesting, however, that when cells undergoing
pre-meiotic S-phase were removed from the anthers of Lilium and Trillium and placed
in a culture medium, the cells successfully completed mitosis (Lima-de-Faria 1969; Ito
and Takegami 1982), indicating that although pre-meiotic S-phase is different from
mitotic S-phase, they are similar enough in these organisms that mitosis still proceeds
(John 1990).
Following interphase, cells undergoing either mitotic or meiotic divisions enter a
stage called prophase (John 1990; DePamphilis 1996; Griffiths et al. 2000). During
mitotic prophase, the paired sister chromatids contract into a series of coils, packaging
them for alignment along the metaphase plate and subsequent segregation to opposite
sides of the nucleus later during the mitotic cell cycle (Figure 1.3 A. - ii.) (John 1990;
Griffiths et al. 2000). Thus mitosis is distinguished by a single round of DNA
replication followed by a single nuclear division (Flemming 1878). Because the
number of homologous chromosome sets (ploidy – e.g. diploid or 2n) is maintained,
the mitotic division is called an equational division (John 1990). In meiosis, two rounds
of division (Meiosis I and Meiosis II) follow a single round of DNA replication (Figure
1.3 B.) (Churchill 1970). The first meiotic division results in the reduction of ploidy
(diploid (2n) to haploid (n)) (John 1990), and, so, is called the reductional division,
while the second meiotic division is equational because the haploid state is maintained
(Weismann, Parker, and Ronnfeldt 1893; Churchill 1970). Thus, in a single diploid
(2n) cell, meiosis yields four haploid (n) products (although in female meiosis not all
products survive as gametes) (John 1990).
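The ploidy bookkeeping in this comparison can be written out as a short worked example. The Python sketch below is purely illustrative: the function names `mitosis` and `meiosis` are hypothetical, ploidy is represented as an integer count of chromosome sets (2 for diploid, 1 for haploid), and the sketch models only the number and ploidy of the products, not chromosome behavior or the survival of products as gametes.

```python
def mitosis(ploidy):
    """One round of replication, then one equational division.

    Returns the ploidy of each of the two products; ploidy is maintained.
    """
    return [ploidy, ploidy]


def meiosis(ploidy):
    """One round of replication, then a reductional and an equational division.

    Returns the ploidy of each of the four products.
    """
    if ploidy % 2 != 0:
        raise ValueError("this sketch needs an even number of chromosome sets")
    # Meiosis I is reductional: each product has half the chromosome sets.
    after_meiosis_one = [ploidy // 2, ploidy // 2]
    # Meiosis II is equational: each cell yields two cells of the same ploidy.
    return [p for cell in after_meiosis_one for p in (cell, cell)]


if __name__ == "__main__":
    print("mitosis of a diploid cell:", mitosis(2))
    print("meiosis of a diploid cell:", meiosis(2))
```

Under these assumptions a diploid cell gives two diploid products by mitosis and four haploid products by meiosis, matching the counts stated above.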
Meiotic prophase I is much longer and more complex than mitotic prophase, due
primarily to the formation of bivalents (paired homologous chromosomes) during
meiosis that do not form during mitosis (Figure 1.3 B. - iii). In addition, the formation
of synaptonemal complexes (Carpenter 1987) and/or crossing over (chiasma) (Ruckert
1892; Janssens 1909) between homologous chromosomes are required in most (but not
all) organisms for appropriate pairing and segregation (John 1990). However, it is
possible that the binding of microtubules to chromosomes may be even more important
than the formation of bivalents in determining the segregation patterns of the
chromosomes (Simchen and Hugerat 1993). Sets of microtubular structures (spindles),
arising from barrel-shaped organelles (centrioles) located at opposite sides of the
nucleus, attach to protein structures (kinetochores) that are associated with sister
chromatid contact points (centromeres) (Figure 1.3 A. – iii and B. – iii and iv) (John
1990; Griffiths et al. 2000). During mitosis, each sister chromatid is attached to an
opposing set of microtubules, resulting in alignment of the chromosomes between two
poles (Simchen and Hugerat 1993). These microtubules exert opposing forces toward
the poles (John 1990; Simchen and Hugerat 1993). When sister chromatid cohesion
lapses, sisters segregate to opposite poles (John 1990; Simchen and Hugerat 1993). In
meiosis I, microtubules connect to only one sister chromatid per chromosome, on
opposite sides of bivalents; when attachments between homologous chromosomes
lapse, the sisters of each homolog co-segregate and the homologs move to opposite poles (John 1990; Simchen and Hugerat 1993).
Thus the interactions of opposing microtubules with one or both sister chromatids
(unipolar or bipolar attachment) determine whether reductional or equational divisions
occur (Simchen and Hugerat 1993). The same process that is used for equational
divisions during mitosis is used during meiosis I for reductional divisions, in the
presence of bivalents, modified kinetochores, and persistent sister chromatid cohesion
(Nicklas 1977). Indeed, yeast (Saccharomyces cerevisiae) cells that are in the process
of meiosis can be transferred to a vegetative medium, resulting in diploid colonies with
recombined genetic markers (Sherman and Roman 1963). These cells likely formed
bivalents and crossovers (resulting in genetic recombination at high, meiotic levels)
followed by mitotic (equational) divisions (Simchen and Hugerat 1993).
Given they are both equational divisions, it is tempting to conclude that the
second meiotic divisions are the same as mitotic divisions (John 1990). It is true that
the second meiotic division does not require many proteins necessary for completion of
the first meiotic division, during which several specialized proteins known to function
only during pairing of homologous chromosomes and formation of synaptonemal
complexes and crossovers are necessary (Paques and Haber 1999). However, there are
at least three important differences: 1) during mitosis, sister chromatids are associated
along their entire lengths but, during meiosis II, sister chromatid cohesion is maintained
only around centromeres, resulting in splayed chromosome arms; 2) paired sister
chromatids in mitosis are genetically identical, while, in meiosis II, sister chromatids
are not identical, due to genetic recombination; and 3) nuclei are diploid (2n) during
mitosis but haploid (n) during meiosis II (John 1990; DePamphilis 1996). Therefore,
although mitotic and meiotic equational divisions are, in principle, the same, they are
not identical (John 1990).
In total, all of these observations indicate that, although mitosis and meiosis are
very similar, hinting at a close evolutionary relationship, they are also distinguished by
important functional differences. The apparent similarities presented here are
confirmed by studies which indicate that many proteins necessary for the completion of
mitosis are also important for completion of meiosis (Marcon and Moens 2005).
Likewise, the presence of proteins known to function only during meiosis sheds light on
a distinct evolutionary history. The functions of these proteins have been studied
most often in animals, fungi, and plants (representing the eukaryotic supergroups
Opisthokonta and Archaeplastida; Figure 1.1 and discussed further below), although
other organisms, such as the ciliate Tetrahymena thermophila (Cole et al. 1997)
(Chromalveolata) and the amoeba Entamoeba histolytica (Lopez-Casamichana et al.
2008) (Amoebozoa) have also been studied. Though there are differences in meiosis
among different eukaryotes, the proteins studied here are highly conserved in both
sequence and function. Although not all eukaryotes have been studied, the similar
functions of these proteins among four of the six currently recognized eukaryotic
supergroups and the high degree of amino acid sequence conservation strongly imply
that the proteins fulfill the same functions in unstudied extant organisms. Furthermore,
the inferred presence of genes encoding these proteins in the common ancestors of
eukaryotes strongly implies that mitotic and meiotic functions were also occurring
(Ramesh, Malik, and Logsdon 2005).
The origin and evolution of meiotic genes
Due to the presence of mitosis in all extant eukaryotes, it is widely accepted that
this nuclear division was likely to have been present in their last common ancestor
(Cavalier-Smith 1981b). Furthermore, it is widely accepted that genes encoding proteins
that function during mitosis were present in the common ancestor of all extant eukaryotes
(Eme et al. 2009; Wickstead, Gull, and Richards 2010). More contentious is the notion
that meiosis, and genes encoding proteins that function during meiosis, were present in
the ancestor of all extant eukaryotes (Malik et al. 2008). This is due primarily to the fact
that, although mitosis has been observed in all extant eukaryotes, meiosis is not observed
in some putatively asexual eukaryotes (Schurko and Logsdon 2008; Schurko, Neiman,
and Logsdon 2009). Specifically, the apparent absences of meiosis and sexual
reproduction during the lifecycles of Giardia intestinalis, Trichomonas vaginalis, and
Vairimorpha necatrix led to the speculation that these organisms diverged prior to the
origin of meiosis (Cavalier-Smith 1989). Molecular phylogenetic analyses of small
ribosomal subunit and translation elongation factor EF-1 alpha nucleotide sequences
yielded tree topologies in which these supposed “primitive” organisms were depicted as
the earliest diverging eukaryotes (Leipe et al. 1993; Kamaishi et al. 1996; Hashimoto et
al. 1997). These eukaryotic phylogenies appeared to support the Archezoa hypothesis
(Cavalier-Smith 1989), in which organisms with presumed ancestral features and no
observed meiosis emerged early during eukaryotic evolution. Such features include a
prokaryote-like transcriptional apparatus (van Keulen et al. 1991a; van Keulen et al.
1991b) and the apparent absence of mitochondria (Tovar, Fischer, and Clark 1999).
Complex organisms (i.e. animals, fungi, and plants) would form a “crown” at the top of
the eukaryotic tree of life (Sogin, Elwood, and Gunderson 1986; Woese, Kandler, and
Wheelis 1990; Brinkmann and Philippe 2007).
Subsequent phylogenetic studies with more sophisticated methods and data, using
more realistic models of protein substitution, have revealed that the placement of
organisms at the base of the original phylogenetic trees was the result of a statistical
anomaly called long-branch attraction (Edlind et al. 1996; Keeling and Doolittle 1996;
Hirt et al. 1997; Hirt et al. 1999; Felsenstein 2004). Although the possibility remains that
T. vaginalis and G. intestinalis may be among the earliest-diverging eukaryotes, V.
necatrix (a Microsporidian) is now known to be a fungus (Hirt et al. 1999). The
presumption that the Archezoa lack mitochondria has also been proven erroneous (Roger
1999). Instead, derived and highly reduced forms of mitochondria (mitosomes and
hydrogenosomes) have been discovered in representatives of each putatively early-diverging eukaryotic lineage (Germot, Philippe, and LeGuyader 1997; Tovar, Fischer,
and Clark 1999; Tovar et al. 2003; van der Giezen 2009). However, sexual reproduction
has yet to be observed among any of these lineages.
Direct observation of meiosis is often difficult or impossible with many diverse
eukaryotes (Schurko and Logsdon 2008). However, we may determine if organisms
have the potential to undergo meiosis by the presence of meiotic genes in their genomes (Schurko and Logsdon
2008). We can also use the distribution of meiotic genes to infer their presence in the
common ancestors of different eukaryotic groups (including the ancestor to all
eukaryotes) (Dacks and Doolittle 2001; Villeneuve and Hillers 2001; Ramesh, Malik,
and Logsdon 2005; Malik et al. 2008). By using phylogenetic analysis it is possible to
determine when meiosis-specific genes, and meiosis, arose and to determine if any
eukaryotes diverged evolutionarily prior to the origins of these meiotic genes (Ramesh,
Malik, and Logsdon 2005; Malik et al. 2008). That is, it is possible to determine if
apparent gene absences are primitive or derived states (Dacks and Doolittle 2001).
Some eukaryotes may have diverged prior to the origin of sexual reproduction in
eukaryotes while others may utilize a primitive type of meiosis (Cleveland 1947;
Cleveland 1956; Cavalier-Smith 1981b; Archetti 2004).
Previously, a set of “core meiotic recombination machinery” (Spo11, Rad50,
Mre11, Dmc1, Rad51, Msh4, Msh5, and Mlh1) was defined by Villeneuve and Hillers
(2001) as a collection of highly conserved orthologs present in animals, fungi, and
plants (Villeneuve and Hillers 2001). This list includes some components known to
function only during meiosis in model organisms (Spo11, Dmc1, Msh4, and Msh5)
(Bishop et al. 1992; Lichten 2001; Snowden et al. 2004). Thus the authors rightly
pointed out that at least three events were important for the evolution of meiosis:
endogenous double-strand DNA breaks (Spo11 (Keeney, Giroux, and Kleckner 1997)),
interhomolog DNA strand exchange (Dmc1 (Bishop et al. 1992)), and resolution of
Holliday junctions as crossovers (Villeneuve and Hillers 2001). Furthermore, the
distribution of these genes among animals, fungi, and plants implies that they arose
prior to the divergence of the eukaryotes considered. Villeneuve and Hillers suggest,
therefore, that the genes arose in the common ancestor of all eukaryotes (Villeneuve and
Hillers 2001). However, given current hypotheses for rooting the eukaryotic phylogeny
(Stechmann and Cavalier-Smith 2002; Stechmann and Cavalier-Smith 2003a; Cavalier-Smith 2010), only the placement of the root between the Bikonta and the Unikonta
(Figure 1.1) would support the conclusion that the ancestor of animals, fungi, and plants
is also the last common ancestor of eukaryotes. That is, if either members of the
Metamonada or Discoba are the earliest-diverging eukaryotes (discussed further in
Current state of the eukaryotic phylogeny below) then the meiotic genes could have
arisen after their divergence. Then, the meiotic genes would be present in the ancestor
of animals, fungi, and plants but not the ancestor of all eukaryotes. More complete
testing of the eukaryotes would need to be performed to determine if any organisms
diverged prior to the origin of meiosis or if the last common ancestor of eukaryotes was
capable of meiosis.
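The dependence of this inference on the placement of the root can be illustrated with a minimal sketch. It assumes a deliberately simplified rule that is not taken from the studies cited: a gene is inferred to have been present in the last eukaryotic common ancestor only if it is detected on both sides of the hypothesized root, and horizontal gene transfer is ignored. The group names follow Figure 1.1; the function names and the three candidate roots encoded here are illustrative assumptions.

```python
# Hypothetical sketch: the simple "both sides of the root" rule is an
# assumption made for illustration, not a method from the cited studies.

ALL_GROUPS = frozenset({
    "Opisthokonta", "Amoebozoa", "Archaeplastida", "Chromalveolata",
    "Rhizaria", "Discoba", "Metamonada",
})

# Each candidate root is described by the groups on one side of it;
# every other group is assumed to lie on the opposite side.
CANDIDATE_ROOTS = {
    "Unikonta/Bikonta": frozenset({"Opisthokonta", "Amoebozoa"}),
    "Metamonada first": frozenset({"Metamonada"}),
    "Discoba first": frozenset({"Discoba"}),
}


def inferred_in_leca(groups_with_gene, one_side_of_root):
    """Return True if the gene is detected on both sides of the root."""
    other_side = ALL_GROUPS - one_side_of_root
    found_on_one_side = bool(groups_with_gene & one_side_of_root)
    found_on_other_side = bool(groups_with_gene & other_side)
    return found_on_one_side and found_on_other_side


def support_by_root(groups_with_gene):
    """Evaluate the inference under every candidate root placement."""
    return {
        name: inferred_in_leca(groups_with_gene, side)
        for name, side in CANDIDATE_ROOTS.items()
    }


# A gene sampled only in animals and fungi (Opisthokonta) and plants
# (Archaeplastida), as in the original core machinery survey.
animals_fungi_plants = frozenset({"Opisthokonta", "Archaeplastida"})
print(support_by_root(animals_fungi_plants))
```

Under this rule, sampling limited to animals, fungi, and plants supports presence in the last common ancestor only for the Unikonta/Bikonta root, which is why broader sampling that includes Metamonada and Discoba is needed.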
Additional studies tested the specific hypotheses that G. intestinalis and T.
vaginalis diverged prior to the origin of meiosis by determining which genes that
encode products necessary for completion of meiosis are present in their genomes (Ramesh, Malik, and Logsdon
2005; Malik et al. 2008). In total, 29 genes (9 meiosis-specific) were studied among
eukaryotes representing five of the six currently recognized eukaryotic supergroups.
Most of the genes tested are present in G. intestinalis and T. vaginalis (21 and 27,
respectively), including most of the meiosis-specific genes (6 and 8, respectively). These
studies indicate that G. intestinalis and T. vaginalis are not candidates for ancient
asexuality.
Although previous studies have failed to produce a candidate lineage for ancient
asexuality or primitive meiosis, they provided the basis for addressing open questions
about when and how meiosis arose and how it subsequently evolved (Ramesh, Malik, and
Logsdon 2005; Malik et al. 2008). That is, rather than using a candidate organism
approach, in which different species that may represent early-diverging eukaryotic
lineages are studied, here we have employed a gene-centric approach in which the
distribution and phylogenetic analyses of genes are the focus. Thus the goals of this
thesis were not to detect meiosis or to identify putative ancient asexual organisms per se
but, instead, to study the evolution of meiotic genes in their own right. Specifically, I
address the following questions: 1) Were meiotic genes present in the genome of the last
eukaryotic common ancestor?; 2) By what genetic mechanisms did meiotic genes arise?;
3) Could the products that meiotic genes encoded in the last eukaryotic ancestor have
functioned during meiosis?; and 4) How have the suites of meiotic genes observed in
various extant eukaryotic genomes evolved? The answers to these questions will
contribute to our understanding of the origin and evolution of meiosis and, more
generally, the evolution of eukaryotes.
Components of meiotic interhomolog DNA strand
exchange
To add resolution to our view of the evolution of complex processes that occur
during meiosis, the genes encoding proteins central to interhomolog DNA strand
exchange are the focus of this thesis. Interhomolog recombination (X in Figure 1.3 – B
– iii) can result in gene conversion and/or crossing-over that greatly increases the
efficacy of natural selection (Rice and Chippindale 2001; Agrawal 2006; Otto and
Gerstein 2006). That is, it produces novel combinations of genes (Figure 1.3 – B – v,
black and grey regions of chromosomes) that, when combined with other products of
meiosis (e.g. fertilization), enable eukaryotes to respond evolutionarily to changing
environments more rapidly than asexual organisms (Fisher 1930; Muller 1932; Van
Valen 1973). Although many benefits of recombination are observed at the population
level, its origin and persistence during the evolution of eukaryotes is due, more likely,
to the selective benefits of the appropriate pairing and segregation of homologous
chromosomes in maintaining genomic integrity (Kleckner 1996; Villeneuve and Hillers
2001; Cavalier-Smith 2002d). In most cases, interhomolog DNA strand exchange
(Figure 1.3 – B – iii) is necessary for the formation of bivalents and correct segregation
of homologous chromosomes to opposite poles during meiosis I (Moore and Orr-Weaver 1998). Following these observations, it has been declared that “…the very
essence of sex is meiotic recombination.” (Villeneuve and Hillers 2001). Different
models of meiotic recombination have been proposed, such as the synthesis-dependent
strand annealing and double-strand break repair models (Paques and Haber 1999). In
each of these models, the interhomolog DNA strand exchange reaction that occurs
between homologous chromosomes during meiosis I (Figures 1.3 and 1.4) is central to
meiotic recombination (Paques and Haber 1999). Thus, the very
essence of meiotic recombination is the interhomolog DNA strand exchange reaction.
Therefore, the best way to gain a more complete understanding of the evolution of
genes involved in meiosis and to detect any sort of “primitive” meiosis is to study the
interhomolog DNA strand exchange reaction; the components involved in interhomolog
DNA strand exchange during meiosis are the main focus of this thesis.
Genetic recombination between sister chromatids is important for repair of DNA
double-strand (dsDNA) breaks that may be caused by mutagens or by collapsed or
damaged replication forks (John 1990). During entry into meiosis, dsDNA cuts
introduced by meiosis-specific proteins (Spo11-1 or Spo11-2) are repaired with
recombination between either sister chromatids or homologous chromosomes (Keeney,
Giroux, and Kleckner 1997; Hartung et al. 2002). In animals, fungi, and plants, several
genes whose products are important for both sister chromatid and interhomolog DNA
strand exchange (Rad52, Rad59, Rad51, Rad55, Rad57, Rad54, and Rdh54) and some
that are known to function only during meiosis in model organisms (Dmc1, Hop2, and
Mnd1) have been studied extensively (see Table 2.5 and the description of Figure 1.4
for references). A general model of the interactions of thirteen proteins that function
during interhomolog DNA strand exchange is presented (Figure 1.4). This model
illustrates four important steps: i) formation of a Rad51/Dmc1-ssDNA pre-synaptic
filament on a 3’ ending DNA strand (A-D); ii) capture of a DNA duplex by the pre-synaptic filament (E and F); iii) search by the pre-synaptic filament for regions of DNA
duplex homology (F); and iv) invasion of DNA duplex by the pre-synaptic filament and
D-loop formation (G) (Filippo, Sung, and Klein 2008). Searches for
genes that encode Rad51, Rad52, Dmc1, Hop2, and Mnd1 proteins among diverse
eukaryotes (i.e. all eukaryotic supergroups except Rhizaria; Figure 1.1) indicate that,
due to their distributions, these strand exchange components must have arisen very
early during eukaryotic evolution (Villeneuve and Hillers 2001; Ramesh, Malik, and
Logsdon 2005; Malik et al. 2008).
Interestingly, some of these genes have not been found within the genomes of
many diverse, less well-studied, eukaryotes (Ramesh, Malik, and Logsdon 2005; Malik
et al. 2008). These apparent absences may be due either to true losses of genes from
genomes or they may represent instances of non-detection (i.e. type II error). Previous
investigation of the distributions of genes cannot easily distinguish between these
possibilities (Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005; Malik et
al. 2008). This limitation inhibits studies of the distribution of genes among eukaryotes
since different suites among diverse organisms cannot be verified bioinformatically and
functional studies may be difficult or impossible to perform, especially with large
numbers of eukaryotes. To address this issue, the distributions of 13 genes that encode
meiotic interhomolog DNA strand exchange components (Figure 1.4) among 47 diverse
eukaryotes (representing all supergroups except Rhizaria, Figure 1.1) were determined.
In addition, these data were useful for the development of a heuristic metric for
determining the likelihood that observed absences represent true losses of genes from
genomes. For the first time, we were able to assess our confidence in the suites of
genes observed in diverse, relatively unstudied, eukaryotic genomes. This new insight
allowed us to study patterns in the distributions of strand exchange genes across
eukaryotes and to formulate an evolutionary hypothesis explaining these patterns. This
project is presented in Chapter 2 of this thesis.
Of particular interest are the eukaryotic Rad51 (Shinohara, Ogawa, and Ogawa
1992) and Dmc1 (Bishop et al. 1992) genes whose products catalyze homologous DNA
strand exchange during genetic recombination (Paques and Haber 1999). Both Rad51
and Dmc1 are related (orthologous (Ridley 2004)) to the eubacterial recA and the
archaebacterial RadA genes, whose products function during homologous
recombination and DNA repair (Cox 1993; Clark and Sandler 1994; Camerini-Otero
and Hsieh 1995; Sandler et al. 1996). The master recombinase (Rad51) forms right-handed helical filaments on single-stranded and double-stranded DNA (Conway et al.
2004) during repair of all double-strand breaks (Shinohara, Ogawa, and Ogawa 1992).
The Dmc1 proteins function similarly, promoting interhomolog strand exchange only
during meiosis in model organisms (Figure 1.1) (Bishop et al. 1992). Among animals,
fungi, and plants, rad51 mutants experience reduced recombination, resulting in
decreased resistance to mutagens, and diminished sporulation or fertility, while
mutations in vertebrates cause embryonic lethality (Bishop 1994; Bleuyard, Gallego,
and White 2006). In animals, fungi, and plants, dmc1 mutants reduce or eliminate
homologous recombination during meiosis (Bishop et al. 1999; Tsubouchi and Roeder
2003). Given the evolutionary relationships of eukaryotic Rad51 and Dmc1 genes to
eubacterial recA and archaebacterial RadA genes and the central role of DNA strand
exchange catalysis to DNA damage repair in all organisms and to meiosis in
eukaryotes, elucidating the evolutionary histories of Rad51 and Dmc1 is important to
understanding the origin and evolution of meiosis.
Previous studies, including the one presented in Chapter 2, in which the
distributions of Rad51 and Dmc1 genes were studied, were limited by the availability of
genome sequence data from diverse eukaryotic lineages. We have concluded that these
genes arose very early during eukaryotic evolution but not whether they were present in
the last common ancestor of all eukaryotes. Determining more accurately when Rad51
and Dmc1 arose during eukaryotic evolution and whether there are any other organisms
or groups of organisms that lack these genes may provide valuable insight into the
evolution of interhomolog DNA strand exchange during meiosis. The presence of the
Dmc1 gene may also serve as a proxy for the presence of meiosis itself (Ramesh, Malik,
and Logsdon 2005; Malik et al. 2008). That is, although the absence of Dmc1 does not
indicate that meiosis is absent, the presence of a functional copy indicates that meiosis
is likely to be present (Schurko and Logsdon 2008). If the Dmc1 gene is found in
representatives of all eukaryotic groups we can infer that their ancestor also possessed a
Dmc1 gene (Dacks and Doolittle 2001; Koonin 2010) and was likely to have been
capable of meiosis (Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005;
Malik et al. 2008; Schurko and Logsdon 2008).
In Chapter 3, using a combination of extensive searches of gene and protein
sequence repositories and degenerate PCR, we demonstrate that both Rad51 and Dmc1
genes are present in genomes of organisms representing all known eukaryotic
supergroups (Figure 1.1) and were, therefore, likely to have been present in the genome
of the last eukaryotic common ancestor. To understand the importance of specific
amino acids to the functions of Rad51 and Dmc1, we aligned protein sequence data
from all known eukaryotic supergroups and identified amino acids that are highly
conserved among them. We also identified several amino acid residues that may confer
Rad51- or Dmc1-specific functions, due to their conservation in one set of proteins but
not the other, and that were likely to have been present in the last common ancestor of
all extant eukaryotes. Collectively, these data imply that Rad51 and Dmc1 genes
present within the genome of the last common ancestor of eukaryotes encoded proteins
that functioned during both mitosis and meiosis.
Finally, although the distributions of genes among diverse eukaryotes allow us
to infer when meiotic genes may have arisen, more analyses are necessary to determine
how they arose. What known genetic mechanisms yielded genes that encode products
that function during meiosis? Phylogenetic studies of meiotic genes can provide insight
into their evolutionary histories and may inform their origins (Ramesh, Malik, and
Logsdon 2005; Malik et al. 2008). For example, studies indicate that Rad51 and Dmc1
genes are paralogs (genes resulting from gene duplication events (Ridley 2004))
(Ramesh, Malik, and Logsdon 2005; Lin et al. 2006; Malik et al. 2008). Since Rad51
and Dmc1 genes are orthologous to both the eubacterial recA and the archaebacterial
RadA genes that are known to be important for DNA damage repair in prokaryotes
(Marcon and Moens 2005), two genes, one encoding proteins that are known to catalyze
DNA strand exchange during both mitosis and meiosis (Rad51) and the other encoding
proteins that are known to catalyze interhomolog DNA strand exchange only during
meiosis in model organisms (Dmc1), arose from a single gene that most likely encoded
products involved in DNA damage repair early during eukaryotic evolution (Figure 1.5)
(Ramesh, Malik, and Logsdon 2005; Lin et al. 2006; Malik et al. 2008). Thus we can
study the evolutionary histories of these genes to determine when meiosis may have
arisen during eukaryotic evolution and whether any organisms diverged prior to the
duplication event yielding the meiosis-specific gene (Dmc1) (Figure 1.5).
Similarly, this pattern has been observed in studies of the Spo11 paralogs (Malik
et al. 2007). The Spo11-1 and Spo11-2 genes encode meiosis-specific products
(Atcheson et al. 1987; Keeney, Giroux, and Kleckner 1997; Hartung et al. 2002) that
are paralogs of the Spo11-3 gene (Malik et al. 2007) whose products function only
during vegetative growth in Arabidopsis thaliana (Hartung and Puchta 2001;
Sugimoto-Shirasu et al. 2002; Yin et al. 2002). The Spo11 homologs are orthologous to
the archaebacterial Top6A gene (Atcheson et al. 1987), a type II topoisomerase that
functions to separate replicated chromosomes (Bergerat et al. 1997; Nichols et al. 1999;
Corbett and Berger 2003). Like Rad51 and Dmc1, the evolutionary history of Spo11
genes is similar to the model presented in Figure 1.5 (Malik et al. 2007). Indeed, Malik
(2007) demonstrated that many meiosis-specific genes fit this pattern (Malik 2007). We
hypothesized, therefore, that meiosis may have arisen in toto by large-scale gene
duplications, early during eukaryotic evolution. Consistent with this hypothesis, it has
been shown that large-scale gene duplication events may have occurred early during
eukaryotic evolution (Zhou, Lin, and Ma 2010).
Due primarily to the availability of more sensitive gene sequence search
methods (see Chapter 4 Methods), more realistic models of protein substitution (e.g.
LG) for phylogenetic analyses, and unprecedented access to genome sequence data for
diverse eukaryotic groups, we were able to extend the previous study on the origins of
meiosis-specific genes by duplication to include representatives of all known eukaryotic
supergroups (Figure 1.1). We determined the eukaryote-wide distributions of twenty
genes that encode products that perform five important functions during meiosis: 1)
pairing of homologous chromosomes; 2) sister chromatid cohesion; 3) dsDNA cuts; 4)
interhomolog DNA strand exchange; and 5) Holliday junction resolution (Table 4.1).
Eighteen out of 20 genes tested fit the pattern presented in Figure 1.5. Furthermore,
given their phylogenetic distributions among eukaryotic groups, these paralogs are
inferred to have been present in the ancestor of all extant eukaryotes.
Current state of the eukaryotic phylogeny
An important motivation for the studies presented in Chapters 3 and 4 was to
determine which components were likely to have been present in the common ancestor
to eukaryotes. In order to correctly interpret the distributions of genes among
eukaryotes and the phylogenetic analyses of their products, an accurate understanding
of the evolutionary relationship of eukaryotes is required. The following discussion
provides the appropriate framework for such studies.
Eukaryotes can be divided into at least six “supergroups” (Opisthokonta,
Amoebozoa, Archaeplastida, Chromalveolata, Rhizaria, and Excavata) and at least one
group of unclassified organisms (Apusozoa) on the basis of phenotypic, ultrastructural,
and phylogenetic studies (Figure 1.1) (Baldauf 2003; Simpson and Roger 2004; Roger
and Hug 2006; Baldauf 2008; Cavalier-Smith 2010; Roger and Simpson 2009). Two
major divisions of eukaryotes (Unikonta and Bikonta) are recognized (Cavalier-Smith
2002a). The Unikonta (Opisthokonta + Amoebozoa) are named for the ancestral
possession of a single flagellum and are distinguished by the fusion of three genes that
encode enzymes that synthesize pyrimidine nucleotides (Cavalier-Smith 2002a;
Stechmann and Cavalier-Smith 2002). The Bikonta (named for the presence of two
flagella in their last common ancestor) share a similar two gene fusion (dihydrofolate
reductase and thymidylate synthase) (Stechmann and Cavalier-Smith 2002; Stechmann
and Cavalier-Smith 2003b; Stechmann and Cavalier-Smith 2003a). Recently,
phylogenetic analyses including sequence data from representatives of an unclassified
group of eukaryotes (Apusozoa) have challenged the monophyly of Unikonta by
retrieving topologies in which Apusozoa and Opisthokonta are closely related
(Cavalier-Smith and Chao 2010; Parfrey et al. 2010). The presence of the previously
mentioned two gene fusion and two flagella would appear to support the placement of
Apusozoa within Bikonta (Stechmann and Cavalier-Smith 2002). These conflicting
data make the inclusion of representative Apusozoa important for determining which
genes may have been present in the last common ancestor of eukaryotes, for they may
very well be the earliest-diverged eukaryotes.
While the monophyly of some proposed eukaryotic supergroups is widely
accepted, other groups remain controversial. The supergroup Chromalveolata is
composed of two smaller groups, Chromista (Cryptomonads, Haptophytes, and
Stramenopila) and Alveolata whose ancestor engulfed and enslaved red algae
(secondary endosymbiosis) (Figure 1.1) (Cavalier-Smith 1981a; Fast et al. 2001; Yoon
et al. 2002). Chromalveolates may be sister to Archaeplastida (Cavalier-Smith 2003a).
However, whether Chromalveolata are truly monophyletic has been disputed (Parfrey et
al. 2006; Parfrey et al. 2010). Although evolutionary relationships inferred from
phylogenetic analyses of plastid genes support monophyly, phylogenetic support for
monophyly from topologies determined from analyses of nuclear genes is tenuous
(Parfrey et al. 2006). In addition, recent phylogenetic analyses indicate that Rhizaria
share a more recent common ancestor with the Stramenopiles and Alveolates,
prompting calls for the placement of Rhizaria within the Chromista (Burki et al. 2010;
Burki et al. 2007; Hackett et al. 2007; Cavalier-Smith 2010). The relationships among
Cryptomonads and Haptophytes to other chromalveolates are also unknown as
phylogenetic analyses are somewhat conflicting and ambiguous (Patron, Inagaki, and
Keeling 2007; Burki, Shalchian-Tabrizi, and Pawlowski 2008; Reeb et al. 2009). What
we do know is that the evolutionary relationships among chromalveolates and rhizarians
are far more complicated than previously supposed. Therefore, including exemplar
chromalveolates and rhizarians is important to these studies.
The supergroup Excavata (Discoba and Metamonada) was proposed on the basis
of a ventral feeding groove (Figure 1.2) (Simpson and Patterson 1999). The Excavate
hypothesis remains controversial due to conflicting evolutionary relationships implied
by phylogenetic analyses. While some analyses support the monophyly of excavates
(Hampl et al. 2009; Parfrey et al. 2010), others refute it (Parfrey et al. 2006), retrieving
polyphyletic groups (groups whose members descend from multiple ancestors) instead.
Recall that apparent absences of mitochondria from excavate taxa (Trichomonas
vaginalis and Giardia intestinalis) and a primitive looking small subunit rRNA
sequence in G. intestinalis initially supported the notion that the excavates are the
earliest-diverging eukaryotes (Figure 1.2) (Cavalier-Smith 1987b). However, more
recent studies show that relics of mitochondria (mitosomes and hydrogenosomes) are
found in T. vaginalis and G. intestinalis (Muller 1993; Tovar, Fischer, and Clark 1999).
These observations make the status of these excavates as the earliest-diverged
eukaryotes questionable. It has also been proposed recently that the Euglenozoa
(Discoba) are the extant representatives of the earliest-diverging eukaryotes as they lack
an origin recognition complex and four genes (Tom40, CenpA, Smc5 and Smc6) that are
thought to be present in all other eukaryotic groups (Cavalier-Smith 2010). The point is
that the question of whether Excavata are monophyletic is tied to the determination of
the earliest-diverging eukaryotes. If any excavates are the earliest-diverging
eukaryotes, the root of the eukaryotic phylogeny lies between Discoba and/or
Metamonada and all other eukaryotes (Figure 1.1). This placement of the root requires
that Excavata are paraphyletic (a group including an ancestor and some, but not all, of
its descendants (Ridley 2004)). Excavates are only monophyletic if none of them are
the earliest-diverging eukaryotes, and that is currently unknown. Therefore, including
representative Metamonada and Discoba is important for inferring which genes were
present in the last common ancestor of all eukaryotes.
The root of eukaryotes has also been proposed to lie between Unikonta and
Bikonta on the basis of the ultrastructural data discussed previously (Figure 1.1)
(Stechmann and Cavalier-Smith 2002). However, the reliability of these features for
placement of the root of eukaryotic life has recently been called into question by the
possible evolutionary relationship of the Apusozoa to Opisthokonta discussed
previously (Cavalier-Smith 2010). Placement of the root on the eukaryotic phylogeny
remains one of the most vexing questions in molecular phylogenetics. It seems most
likely that, despite the presence of highly derived mitochondria (mitosomes), G.
intestinalis is still the best candidate for the earliest-diverged eukaryote. Whatever the
answer, the ability to polarize the eukaryotic phylogeny will have significant impacts
upon studies of the ancestor of eukaryotes.
Since the eukaryotic phylogeny remains incompletely resolved, a gene must be
detected in all eukaryotic groups to infer its presence in the ancestor to all eukaryotes.
However, this approach is certain to underestimate the numbers of components present
in the last common ancestor of eukaryotes, due to subsequent gene losses. That is, if
genes were lost in some lineages then we cannot, without knowing how those lineages are related,
determine whether the genes were present in the ancestor of eukaryotes. Therefore,
searches for components among diverse eukaryotes combined with phylogenetic
analyses that add resolution to the eukaryotic phylogeny are necessary (Cavalier-Smith
2010). In addition, this method is always vulnerable to the discovery of putative early-diverging eukaryotes. That is, the last eukaryotic common ancestor is defined by our
knowledge of extant eukaryotes and if new lineages are discovered then inferences
regarding the common ancestor of extant eukaryotes will obviously need to be
reassessed. David Patterson (1999) has estimated that there are approximately 220
known genera whose evolutionary relationships have yet to be completely resolved
(Patterson 1999). Although most of those unclassified genera have ultrastructural
identities similar to eukaryotes with well resolved evolutionary relationships (Patterson
1999), indicating that the full breadth of eukaryotes has probably been discovered, the
possibility that one (or more) may represent previously unknown lineages always exists.
We performed phylogenetic analyses upon Rad51 and Dmc1 protein sequence
data to determine if they are effective markers for resolving the eukaryotic phylogeny.
These data are presented in Chapter 3. Products of Rad51 and Dmc1 gene sequences
are well conserved among animals, fungi, and plants, with a great degree of similarity
and retention of functional motifs (Stassen et al. 1997). The Rad51 gene is present in
the genomes of all but one eukaryote studied (G. intestinalis) and both Rad51 and Dmc1
genes are present in single-copy in most organisms (Ramesh, Malik, and Logsdon 2005;
Malik et al. 2008). These qualities make Rad51 and Dmc1 protein sequences attractive
markers for phylogenetic reconstruction. Since Rad51 and Dmc1 genes are paralogs, we
also attempted to determine which eukaryotes are the earliest-diverging by reciprocally
rooting them (paralogous rooting) (Gogarten et al. 1989; Iwabe et al. 1989; Schlegel
1994; Brown and Doolittle 1995; Baldauf, Palmer, and Doolittle 1996). Although we
failed to positively place the root of eukaryotes using these methods, Rad51 and Dmc1
protein sequence data were useful for resolving five of six eukaryotic supergroups and
several first order groups (Table 3.1).
Summary
The origin of meiosis is likely to have been one of the most important events in
eukaryotic evolution. The effects of this event can be observed at the genetical,
cytological, organismal, and population levels of eukaryotic biology. Indeed, meiosis
and sexual reproduction may have provided the genetic grist which, when subsequently
acted upon by natural selection, resulted in the rapid evolution of the diverse eukaryotic
lineages observed today. This thesis presents a body of work in which the distributions
of meiotic genes and phylogenetic analyses of the proteins they encode were used to
study the origin and evolution of meiotic genes.
The study presented in Chapter 2 shows that at least eight genes whose products
are known to be involved in interhomolog DNA strand exchange during meiosis arose
very early during eukaryotic evolution. In addition, we applied a heuristic metric to
determine if apparent gene absences are due to limitations of the gene sequence search
regimen or to bona fide absences of genes from genomes. These analyses indicate that
some genes are detected far less frequently than predicted and are likely to indicate true
gene losses (Figure 2.1). Interestingly, some organisms have retained all of the genes
tested (e.g. Saccharomyces cerevisiae and Homo sapiens) and others have retained
relatively few (e.g. Caenorhabditis elegans) (Figure 2.1). Based upon these
observations we propose a general hypothesis in which overexpression or mutation of
the Rad51 gene may allow other components to be lost due to relaxed selection (Chapter 5).
In addition, genes that encode components known to function only in complexes may be
vulnerable to loss when another component of the complex is lost.
The study presented in Chapter 3 focuses on the eukaryotic RecA homologs
Rad51 and Dmc1. Rad51 protein functions during both mitotic DNA repair and during
meiosis, while Dmc1 catalyzes interhomolog DNA strand exchange only during meiosis
in model organisms. We collected nucleotide and protein sequence data from databases
and by using degenerate PCR. The dataset contains Rad51 and Dmc1 protein sequences
from all available eukaryotic supergroups. Therefore, Rad51 and Dmc1 genes were
likely present in the ancestor to all extant eukaryotes. We also analyzed an alignment of
98 Rad51 and 51 Dmc1 protein sequences to determine which amino acid residues are
conserved and, therefore, might have conferred Rad51- or Dmc1-specific activities in
the common ancestor of eukaryotes (Figure 3.13). We found 18 sites among Rad51
protein sequences and 15 sites among Dmc1 that are completely conserved and likely to
have been present in the ancestor of eukaryotes. In addition, we detected 10 sites that
are highly conserved in one protein and conserved but different in the other. These
residues are likely to facilitate Rad51- or Dmc1-specific activities; their distributions
indicate that these functions were present in the common ancestor of eukaryotes.
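The site classification described above can be sketched as follows. This is a minimal illustration that assumes strict (100%) conservation within each protein family and uses short invented toy sequences; it is not the procedure or the data used in Chapter 3, and the function and variable names are hypothetical.

```python
# Hypothetical sketch: toy sequences and a strict conservation threshold
# are assumptions made for illustration only.

def column_residues(sequences, index):
    """Return the set of residues found at one alignment column."""
    return {seq[index] for seq in sequences}


def classify_sites(rad51_seqs, dmc1_seqs):
    """Classify alignment columns by conservation within and between families.

    Both families must come from the same alignment (equal lengths).
    Columns containing gaps ('-') are never counted as conserved.
    """
    length = len(rad51_seqs[0])
    if any(len(s) != length for s in rad51_seqs + dmc1_seqs):
        raise ValueError("all aligned sequences must have the same length")

    result = {"rad51_conserved": [], "dmc1_conserved": [],
              "shared": [], "family_specific": []}
    for i in range(length):
        rad51_col = column_residues(rad51_seqs, i)
        dmc1_col = column_residues(dmc1_seqs, i)
        rad51_ok = len(rad51_col) == 1 and "-" not in rad51_col
        dmc1_ok = len(dmc1_col) == 1 and "-" not in dmc1_col
        if rad51_ok:
            result["rad51_conserved"].append(i)
        if dmc1_ok:
            result["dmc1_conserved"].append(i)
        if rad51_ok and dmc1_ok:
            # Same residue in both families, or conserved but different.
            key = "shared" if rad51_col == dmc1_col else "family_specific"
            result[key].append(i)
    return result


# Invented toy alignment (not real Rad51 or Dmc1 sequence data).
toy_rad51 = ["GKTQL", "GKTQM", "GKTQL"]
toy_dmc1 = ["GRTEL", "GRTEL", "GRTDL"]
sites = classify_sites(toy_rad51, toy_dmc1)
print(sites)
```

In the toy alignment, the second column (index 1) is invariant within each family but differs between them, which is the pattern interpreted in the text as a candidate for Rad51- or Dmc1-specific activity. A real analysis would likely relax the threshold to allow "highly conserved" rather than invariant columns.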
The study presented in Chapter 4 was designed to determine the distribution of
genes that encode proteins that function during different stages of interhomolog DNA
strand exchange in model organisms: 1) synaptonemal complex formation; 2)
interhomolog strand exchange; 3) sister chromatid cohesion; and 4) resolution of
Holliday junctions as crossovers. We studied the distributions of 20 genes, 10 of which
encode proteins that are known to function only during meiosis in model organisms.
We determined that 19 of the genes tested are likely to have been present in the
common ancestor of eukaryotes. Furthermore, phylogenetic analyses of the protein
sequences indicate that all of the putative meiosis-specific genes arose by gene
duplication and that they are often paralogs of genes that encode products which
function during general DNA damage repair in mitotic cells.
Together, the results of these studies have culminated in general models for the
origin and subsequent evolution of meiotic genes that are presented in Chapter 5.
Figure 1.1: Evolutionary relationships among prokaryotes, members of six
currently recognized eukaryotic supergroups and Apusozoa according to
multigene phylogenetic analyses. Relationships that are well supported in
the literature have solid branches while unsupported or conflicting
relationships are represented by dotted lines. Although the monophyly of
Rhizaria, Stramenopila, and Alveolata is well supported, the relationships
among Archaeplastida, Cryptomonads, and Haptophytes within the
photosynthetic ‘megagroup’ have not been established. Current hypotheses
for placement of the root of eukaryotes are shown. (Baldauf and Palmer
1993; Baldauf et al. 2000; Stechmann and Cavalier-Smith 2002; Cavalier-Smith and Chao 2003a; Cavalier-Smith and Chao 2003b; Cavalier-Smith and
Chao 2003c; Stechmann and Cavalier-Smith 2003b; Stechmann and
Cavalier-Smith 2003a; Simpson and Roger 2004; Cavalier-Smith and Chao
2006; Kim, Simpson, and Graham 2006; Burki et al. 2007; Moreira et al.
2007; Burki, Shalchian-Tabrizi, and Pawlowski 2008; Yoon et al. 2008;
Reeb et al. 2009; Roger and Simpson 2009; Cavalier-Smith and Chao 2010;
Parfrey et al. 2010)
[Figure 1.1 image not reproduced; recoverable labels: Apusozoa, Excavata.]
[Figure 1.2 image not reproduced; recoverable labels: Bacteria; Archaea; Eucarya; Archezoa (e.g. Giardia, Trichomonas); arrows 1 and 2 (meiosis); actin, tubulin cytoskeleton, phagocytosis, nucleus, and mitosis; surface N-linked glycoproteins replace murein peptidoglycans; ATP, DNA, RNA, genetic code, transcription and translation machinery.]
Figure 1.2: The three-kingdom tree of life with relative order of major events during eukaryotic evolution. A simplified
universal tree of life as determined by phylogenetic analyses of ribosomal RNA nucleotide sequence data is presented
(Woese and Fox 1977; Gogarten et al. 1989; Iwabe et al. 1989; Woese, Kandler, and Wheelis 1990; Brown and Doolittle
1995). Neither branch lengths nor distances between arrows depict specific amounts of time but indicate only the
proposed relative order of events. The dashed arrows are intended to illustrate two competing hypotheses; 1) the
Archezoa hypothesis, in which some organisms diverged prior to the origin of meiosis (or some meiotic functions) and
2) that meiosis (including all currently known steps) was present in the last common ancestor of eukaryotes. The events
are taken primarily from Cavalier-Smith’s “neomuran” hypothesis (Cavalier-Smith 1987a; Cavalier-Smith 1988;
Cavalier-Smith 2002d; Cavalier-Smith 2002c; Cavalier-Smith 2002a; Cavalier-Smith 2010). However, Cavalier-Smith
seems to imply that meiosis arose at the same time as mitosis, whereas meiosis is shown here to have arisen after
mitosis. In Eucarya, one dashed branch is intended to indicate putative primitive eukaryotes (Archezoa (Cavalier-Smith
1989)) and the other branch represents all other eukaryotes.
Figure 1.3: General schematic of mitosis and meiosis. A. Mitosis – Chromosomes in
a diploid (2n = 2) cell (i) (chromosomes are shown here condensed for
convenience but are, in reality, unwound, appearing as threads) replicate
(yielding 2 x 2n products) and condense, sister chromatids are tightly
associated (ii). Pairs of sister chromatids (chromosomes) line up on the
metaphase plate and microtubules bind kinetochores of both sister
chromatids (iii) in preparation for the mitotic (equational) division (yielding
2n products) (iv). B. Meiosis - Chromosomes in a diploid (2n = 2) cell (i)
(chromosomes are shown here condensed for convenience but are, in
reality, unwound, appearing as threads) replicate (yielding 2 x 2n products)
and condense, sister chromatids are tightly associated (ii). Homologous
chromosomes pair, creating bivalents, synaptonemal complexes form (grey
bars), interhomolog DNA recombination (crossing-over) occurs (chiasmata
are indicated with an X; only one crossover event is shown but at least one
event per chromosome arm occurs in most organisms studied), and
microtubules bind only one sister chromatid per pair (iii). This is followed
by the first meiotic (reductional) division (2 x n). Pairs of non-identical
sister chromatids (chromosomes), with chromosome arms splayed and only
the centromeres tightly associated, align (iv) for the second meiotic
(equational) division, yielding four non-identical haploid products (n) (v).
This image is adapted from (Schurko, Neiman, and Logsdon 2009) with
permission. Additional details were provided by (John 1990; Simchen and
Hugerat 1993; Kleckner 1996).
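[Figure 1.3 image not reproduced; recoverable labels: panel A (Interphase, Mitosis; stages i-iv; ploidy labels 2n and 2x2n) and panel B (Interphase, Meiosis I, Meiosis II; stages i-v; ploidy labels 2n, 2x2n, 2xn, and n; X marks a chiasma).]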
Figure 1.4: General model of interhomolog DNA strand exchange during meiosis.
This model (based upon details from studies of animals, fungi, and plants)
presents interactions of 13 proteins and illustrates four steps of
interhomolog DNA strand exchange during meiosis: formation of a pre-synaptic filament on a 3’ ending DNA strand (A-D), capture of a DNA
duplex by the pre-synaptic filament (E and F), search by the pre-synaptic
filament for regions of DNA duplex homology (F), and invasion of the
DNA duplex by the pre-synaptic filament and D-loop formation (G).
Components with blue labels are known to function only during meiosis in
model organisms. Exact stoichiometry is not implied. The interactions
between Rad51 proteins, Rad52 and 59 proteins, and single-stranded DNA
(A and B), the formation of Rad52/Rad59 heteroheptamers (A-C), and
extension of a Rad51-ssDNA nucleoprotein filament by the Dmc1 protein
(C and D) are speculative. (Brill and Stillman 1991; Bishop et al. 1992;
Kadyk and Hartwell 1992; Milne and Weaver 1993; Bishop 1994; Bai and
Symington 1996; Noble and Guthrie 1996; Klein 1997; Nishinaka et al.
1998; Petukhova, Stratton, and Sung 1998; Shinohara et al. 1998; Arbel,
Zenvirth, and Simchen 1999; Bai, Davis, and Symington 1999; Bishop et
al. 1999; Chen et al. 1999; Paques and Haber 1999; Petukhova et al. 1999;
Borts, Chambers, and Abdullah 2000; Muniyappa, Anuradha, and Byers
2000; Shinohara et al. 2000; Davis and Symington 2001; Gasior et al. 2001;
Masson and West 2001; Bochkareva et al. 2002; Brush 2002; Fortin and
Symington 2002; Kiianitsa, Solinger, and Heyer 2002; Krejci et al. 2002;
Miyagawa et al. 2002; Pellegrini et al. 2002; Solinger, Kiianitsa, and Heyer
2002; Symington 2002; Tsubouchi and Roeder 2002; Cox 2003; Davis and
Symington 2003; Sugawara, Wang, and Haber 2003; Anuradha and
Muniyappa 2004a; Anuradha and Muniyappa 2004b; Bishop and Zickler
2004; Chen et al. 2004; Dudas and Chovanec 2004; Grishchuk et al. 2004;
Krogh and Symington 2004; Sehorn et al. 2004; Ishibashi et al. 2005;
Sauvageau et al. 2005; Bleuyard, Gallego, and White 2006; Chi et al. 2006;
Enomoto et al. 2006; Flaus et al. 2006; Fung et al. 2006; Henry et al. 2006;
Holzen et al. 2006; Ishibashi, Kimura, and Sakaguchi 2006; Cox 2007;
Feng et al. 2007; Nimonkar et al. 2007; Chen, Yang, and Pavletich 2008;
Filippo, Sung, and Klein 2008; Lopez-Casamichana et al. 2008; Mozlin,
Fung, and Symington 2008; Octobre et al. 2008; Pannunzio, Manthey, and
Bailis 2008; Sarai et al. 2008; Chang et al. 2009; Fung, Mozlin, and
Symington 2009; Kudoh et al. 2009; Sakaguchi et al. 2009; Seong et al.
2009; Latypov et al. 2010; Okorokov et al. 2010; Szekvolgyi and Nicolas
2010)
[Figure 1.4 image not reproduced; panel annotations follow.]
A.
•RPA binds ssDNA, preventing secondary structure formation
•Rad51 and Rad52/59 present as heptamers
B.
•Rad52/59 recruits Rad51
•Rad52/59-Rad51 complex binds RPA-ssDNA complex
•RPA’s displaced by Rad52/59
C.
•Rad51 binds ssDNA, forming presynaptic filament
•Dmc1 binds, extending presynaptic filament
•Rad55/57 heterodimer mediates filament assembly
D.
•presynaptic filament extension displaces remaining RPA
•Hop2/Mnd1 heterodimer stabilizes presynaptic filament
E.
•Hop2/Mnd1 heterodimer captures dsDNA
F.
•Hop2/Mnd1 heterodimer stabilizes interactions between homologous DNA sequences
G.
•Rad54 and Rdh54 stimulate D-loop formation and may remove recombinational intermediates
Legend: RPA 1-3, Rad51, Rad52/59, Rad55/57, Dmc1, Hop2/Mnd1, Rad54, Rdh54
[Figure 1.5 image not reproduced; recoverable labels: MEIOSIS and MITOSIS/GENERAL clades, each with Animals, Fungi, Protist X, Protist Y, Archaeplastida, and Protist Z; Protist A, Archaea, and Bacteria; a node labeled "Origin of meiotic function".]
Figure 1.5: A model for the origin of meiotic function by gene duplication. This
model hypothesizes that gene duplication events yielding meiosis-specific
components mark the origins of their respective meiotic functions. In
addition, it suggests that some organisms (most likely protists) diverged prior
to the gene duplication events and, therefore, prior to the origins of some
meiotic functions. Some organisms may have primitive meiosis or none at
all as the ancestral state.
CHAPTER 2
A PAN-EUKARYOTIC INVENTORY OF DNA
STRAND EXCHANGE COMPONENTS REVEALS
PATTERNS OF CONSERVATION AND LOSS
Abstract:
Recombination is critical for repair of DNA double-strand breaks, and the DNA
strand exchange (SE) reaction is central to recombination. We present a phylogenetic
inventory of ten SE component proteins (Rad52, Rad59, Rad51, Rad55, Rad57, Dmc1,
Hop2, Mnd1, Rad54, and Rdh54) among 47 genera representing five eukaryotic
supergroups. We aligned SE protein sequences, verified their homology by phylogenetic
analyses, and used these alignments to create hidden Markov model (HMM) profiles and
position-specific scoring matrices (PSSM), which we used to further scrutinize public
nucleotide sequence databases. Phylogenetic analyses of all the resulting sequences
confirmed orthology of the evolutionarily diverse SE component proteins. Eight of ten
SE proteins (Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, and Rdh54) are present
in five of six eukaryotic supergroups and were likely present in the common ancestor of
extant eukaryotes. An evolutionary analysis of the heterotrimeric Replication Protein A
complex (RPA1, RPA2, and RPA3) is also presented. Since RPA subunit protein
sequences and their single-stranded DNA binding domains are well conserved, apparent
absences of RPA-coding genes from genomes most likely result from detection failures
due to limitations of the search regimen. To validate the approach, we fitted a Poisson
regression model to the numbers of observed RNA Polymerase I (Pol I) subunit detection
failures. We then compared the numbers of RPA subunit detection failures observed to
the numbers predicted by the Pol I regression analysis. The results demonstrate that the
frequencies of RPA subunit detection failures and their Smith-Waterman alignment
scores are strongly correlated. We then applied this approach to the SE proteins by
comparing the numbers of detection failures for SE components, given their Smith-Waterman
scores. Detection failures of six proteins (Rad52, Rad59, Rad51, Dmc1,
Rad54, and Rdh54) occurred more frequently than predicted, indicating the likely loss of
these genes from some completely sequenced genomes. The inferred losses of these
genes can be explained if compensatory changes (e.g. overexpression or functional
mutations) of Rad51 suppress SE component mutant phenotypes.
Introduction:
In eukaryotes, meiosis is necessary for sexual reproduction (Weismann, Parker,
and Ronnfeldt 1893; Churchill 1970). During meiosis, a single round of genome-wide
DNA replication is followed by two nuclear divisions (reductional and equational)
(Churchill 1970). A diploid organism typically produces four haploid cells that combine
with other haploid products of meiosis (e.g. spores and gametes) (Weismann, Parker, and
Ronnfeldt 1893). In this manner, the chromosomes of organisms are recombined while
maintaining the appropriate numbers of chromosomes (Weismann, Parker, and Ronnfeldt
1893; Cavalier-Smith 2002d). Although there are important differences between meiosis
and mitosis, during which one nuclear division follows one round of DNA replication
(Flemming 1878), many proteins that function during meiosis also function during
mitosis (Marcon and Moens 2005). The pairing of non-sister homologous chromosomes
during the first (reductional) division, followed by their segregation to opposite spindle
poles, is unique to meiosis (Simchen and Hugerat 1993; Paques and Haber 1999; Dudas
and Chovanec 2004; Krogh and Symington 2004; Filippo, Sung, and Klein 2008).
However, the second (equational) division that occurs during meiosis is similar (though
not identical) to the single equational division of mitosis, during which sister chromatids
segregate to opposite spindle poles (Nicklas 1977).
Genetic recombination between homologous chromosomes is essential in most
organisms for appropriate pairing and segregation during the reductional division of
meiosis (Moore and Orr-Weaver 1998; Paques and Haber 1999; Dudas and Chovanec
2004; Krogh and Symington 2004; Filippo, Sung, and Klein 2008). The importance of
homologous recombination may also be observed at the population level as gene
conversions and/or cross-over events may occur that increase the efficacy of natural
selection (Fisher 1930; Muller 1932; Hill and Robertson 1966), allowing eukaryotes to
respond evolutionarily to changing environments (Van Valen 1973; Rice and Chippindale
2001; Agrawal 2006; Otto and Gerstein 2006).
Several models of recombination have been proposed, including the double-strand
break repair, synthesis-dependent strand annealing, and break-induced replication models
(Paques and Haber 1999; Dudas and Chovanec 2004; Krogh and Symington 2004;
Filippo, Sung, and Klein 2008). Central to all of these models is the DNA strand
exchange (SE) reaction, in which 3’ ends of single stranded DNA (ssDNA) invade intact
DNA duplexes (Paques and Haber 1999; Dudas and Chovanec 2004; Krogh and
Symington 2004; Filippo, Sung, and Klein 2008). Double-strand DNA breaks (DSB) in
mitotic cells are generally caused by mutagens and collapsed or damaged replication
forks (Paques and Haber 1999; Dudas and Chovanec 2004; Krogh and Symington 2004;
Filippo, Sung, and Klein 2008). During meiosis, DSBs are introduced by the Spo11
transesterase, followed by resection of the 5’ strand by nuclease activity (Lichten 2001;
Krogh and Symington 2004). Several proteins important for SE activity have been
studied in animals, fungi, and plants with genetics, molecular biology, and biochemistry
(Brush 2002; Krogh and Symington 2004; Sakaguchi et al. 2009); however, less is
known about the origins or evolution of SE components.
An approach to studying the evolution of genes is to search for and compare them
among diverse eukaryotes (Dacks and Doolittle 2001). The presence of orthologs among
groups of eukaryotes indicates that the genes must have been present in their last
common ancestor, while absences might represent either ancestral or derived states
(Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005). It is important to
include diverse protists in order to estimate when genes most likely arose during
eukaryotic evolution (Ramesh, Malik, and Logsdon 2005). Previous analyses indicate
that some SE proteins (Rad52, Rad51, Dmc1, Hop2, and Mnd1) likely arose very early
during eukaryotic evolution (Ramesh, Malik, and Logsdon 2005; Malik et al. 2008).
These studies provided a much needed “snapshot” of the distribution of these components
in representative animals, fungi, and plants and some (mainly parasitic) protists, yet more
specific conclusions regarding the evolutionary histories of SE components could not be
made due to limited availability of genome sequence data from diverse lineages within
these groups (Adl et al. 2005). In addition, several important mediator proteins that
interact with Rad51 or Dmc1 recombinase proteins were excluded from the prior
analyses. Here, we present an expanded inventory of the SE machinery (Rad52, Rad51,
Dmc1, Hop2, and Mnd1 studied previously, and Rad59, Rad55, Rad57, Rad54, Rdh54,
RPA1, RPA2 and RPA3 (Krogh and Symington 2004; Sakaguchi et al. 2009)) with broad
taxonomic sampling (47 eukaryotes representing five eukaryotic supergroups (Adl et al.
2005)).
This study also addresses the ambiguous interpretation of apparent gene absences,
which has been an important limitation in prior phylogenetic analyses (Villeneuve and
Hillers 2001; Ramesh, Malik, and Logsdon 2005; Malik et al. 2008). Current approaches
do not distinguish between instances of non-detection that result from (i) failures of the
search methods employed, or (ii) true absences (e.g. losses) of genes from completely
sequenced genomes (Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005;
Malik et al. 2008). We describe a heuristic metric for detecting potential gene absences
that can be applied to a broad range of diverse eukaryotes (Adl et al. 2005). The
distribution of Replication Protein A complex subunits (RPA1, RPA2, RPA3) provides
an empirical basis for determining the limits of sequence detection. RPA subunits are
known to function only as heterotrimeric complexes that bind ssDNA and interact with
Rad52 proteins during recruitment of Rad51 proteins to pre-synaptic filaments in
animals, fungi, and plants (Brill and Stillman 1991; Sakaguchi et al. 2009). In addition,
we searched for ten RNA Polymerase I subunits (Kuhn et al. 2007) among the 47 taxa
studied (Adl et al. 2005) and compared their distributions to the RPA and SE protein
datasets. We determined that absences of at least four SE proteins in our inventory
(Rad51, Dmc1, Rad54, and Rdh54) most likely represent true gene losses.
Methods:
Data acquisition
Keyword searches (e.g. Saccharomyces cerevisiae Rad51) of the National Center
for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov/) non-redundant protein
sequence database retrieved SE protein sequences of RPA1, RPA2, RPA3, Rad52,
Rad59, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, Rad54, and Rdh54 (Krogh and
Symington 2004; Sakaguchi et al. 2009) from representatives of animals, fungi, and
plants. We also searched the clusters of euKaryotic Orthologous Groups of proteins
(KOGs) database for each protein (Tatusov et al. 2003). The identities of retrieved
protein sequences were initially verified by evaluating the results of bi-directional
searches with the tBLASTn (Altschul et al. 1997) option of the Basic Local Alignment
Search Tool (BLAST), in which the translated non-redundant nucleotide database is
searched using a protein query.
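The logic of a bi-directional search can be illustrated with a minimal sketch. It assumes that identity is accepted only when the best hit of a query, searched back against the source database, returns the original query as its own best hit (a reciprocal best hit criterion, which is one common reading of "bi-directional" and is an assumption here). The function names, sequence labels, and bit scores are hypothetical and serve only as illustration; they are not output from BLAST.

```python
def best_hits(hit_table):
    """Return the top-scoring subject for each query.

    hit_table maps a query name to a list of (subject, bit_score) pairs,
    e.g. as parsed from tabular search output (hypothetical structure).
    """
    return {
        query: max(hits, key=lambda hit: hit[1])[0]
        for query, hits in hit_table.items()
        if hits
    }


def reciprocal_best_hits(forward, reverse):
    """Keep query/subject pairs that are each other's best hit."""
    forward_best = best_hits(forward)
    reverse_best = best_hits(reverse)
    return {
        query: subject
        for query, subject in forward_best.items()
        if reverse_best.get(subject) == query
    }


# Hypothetical example: labels and scores are invented for illustration.
forward = {
    "query_Rad51": [("hit_1", 310.0), ("hit_2", 150.0)],
    "query_Dmc1": [("hit_2", 280.0), ("hit_1", 200.0)],
    "query_Rad59": [("hit_3", 45.0)],
}
reverse = {
    "hit_1": [("query_Rad51", 305.0), ("query_Dmc1", 190.0)],
    "hit_2": [("query_Dmc1", 275.0)],
    "hit_3": [("query_Rad52", 60.0)],
}
confirmed = reciprocal_best_hits(forward, reverse)
```

In this toy case the third query is not confirmed, because its best hit points back to a different protein; such a sequence would require phylogenetic analysis before any identity is assigned.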
The set of protein (and protein-coding) sequences collected in this manner was
subsequently used as queries to search additional protein, nucleotide, and expressed
sequence tag (EST) databases at NCBI and other public genome sequence databases by
BLASTp, tBLASTn, or BLASTn (Table 2.6). Searches were performed for all
homologous protein-coding sequences available between December 2009 and June 2010.
In an effort to identify apparently missing homologous sequences from distantly-related
organisms, additional searches were performed using protein sequence queries from
organisms likely to share more recent common ancestors. For example, Trypanosoma
brucei protein sequences were used as additional queries for searches of sequences for a
closely related kinetoplastid protist, Leishmania major (Adl et al. 2005). Identities of
sequences were again confirmed with bi-directional BLASTp, BLASTx and tBLASTn
searches.
When multiple sequences were found for a species, only the most complete open
reading frame or protein prediction was retained for our analyses. If no previously
annotated protein sequence was available in a database, we annotated the nucleotide
sequences manually, using Sequencher v4.5 (Genecodes, Ann Arbor, MI). Exons were
identified with reference to multiple protein sequence alignments, inferred translations
from BLASTx pairwise comparisons to the NCBI protein sequence database, and the
locations of putative intron splice donor and acceptor sites (Griffiths et al. 2000).
Multiple amino acid sequence alignments were calculated using MUSCLE v3.7 (Edgar
2004) and observed with BioEdit v7.0.5.3 (Hall 1999).
To further scrutinize publicly available genome sequence data for the presence of
SE protein-coding genes, we created local databases of nucleotide and predicted protein
sequences for completely sequenced (Sanger sequence coverage of 8x and greater or
sequenced from end-to-end) genomes and searched them using HMMER v2.3.2
(Sonnhammer et al. 1998) and tBLASTn (Altschul et al. 1997). Multiple sequence
alignments of homologous amino acid sequences positively identified by reciprocal
BLAST searches and phylogenetic analysis were used to calculate hidden Markov models
(HMM, global and local) with HMMER v2.3.2 and position specific scoring matrices
(PSSM) using a local installation of the suite of NCBI BLAST programs. These HMM
files were then used to search protein sequence data with HMMER, and the PSSM files
used to search protein and nucleotide sequence data for homologs using PSI-BLAST and
tBLASTn.
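To make the idea of a position-specific scoring matrix concrete, the following minimal sketch builds log-odds column scores from a small ungapped alignment and slides the matrix along a target sequence. It is a simplification under stated assumptions: a uniform amino acid background, a fixed pseudocount, and no sequence weighting or gap handling. HMMER and PSI-BLAST construct their profiles with more sophisticated procedures, so this should not be read as their implementation. The alignment and target sequence are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict of log2-odds scores per alignment column.

    Assumes equal-length, ungapped sequences and a uniform background.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in range(len(alignment[0])):
        counts = Counter(seq[column] for seq in alignment if seq[column] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_window(pssm, sequence):
    """Return (score, start) of the highest-scoring ungapped window.

    The sequence must be at least as long as the matrix.
    """
    width = len(pssm)
    scored = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(pssm, window))
        scored.append((score, start))
    return max(scored)


# Invented motif alignment and target sequence, for illustration only.
motif_alignment = ["GKST", "GKSS", "GKTT"]
pssm = build_pssm(motif_alignment)
score, start = best_window(pssm, "MAAGKSTLL")
```

The position-specific scores reward residues that are frequent in a given column, which is why a profile can recover divergent homologs that a single-sequence query misses.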
Phylogenetic analyses
We aligned all protein sequences of potential eukaryotic orthologs using
MUSCLE v3.7 (Edgar 2004), manually edited them by removing ambiguously aligned
columns and gaps in BioEdit v7.0.5.3 (Hall 1999), and performed phylogenetic analyses
on the multiple protein sequence alignment. Optimal protein substitution models and
parameters were determined for each alignment independently with Modelgenerator v0.8
(Keane et al. 2006). Constant sites were excluded from analyses. Phylogenetic trees
were calculated using PhyML v3.0 (Guindon et al. 2009) for 1000 replicates, and
PhyloBayes v3.1 (Lartillot, Lepage, and Blanquart 2009), which used at least two
independent chains in which maximum differences observed across all bipartitions were
less than 0.10, an indicator that the chains have good convergence (Lartillot, Lepage, and
Blanquart 2009). Every other tree after burnins (selected to minimize the differences
across all bipartitions) was used to calculate consensus tree topologies in Phylobayes
(Lartillot, Lepage, and Blanquart 2009). Analyses were also performed by reciprocally
rooting all paralogs (e.g. Rad54 and Rdh54) for positive identification (Ridley 2004). All
strand exchange protein sequence alignments were concatenated end-to-end using
BioEdit v7.0.5.3 (Hall 1999). Both unpartitioned analyses and analyses partitioned to
each protein in the concatenated dataset were performed with RAxML v7.2.7
(Stamatakis, Ludwig, and Meier 2005) for 1000 replicates at the CIPRES Science
Gateway v3.0 (Miller et al. 2009).
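End-to-end concatenation with per-protein partitions can be sketched as follows. This is an illustration only, not the BioEdit procedure: it assumes each alignment is supplied as a mapping from taxon to aligned sequence, and it pads taxa that lack a given protein with gap characters (an assumption, since the handling of missing sequences is not specified above). The taxon labels and sequences are invented.

```python
def concatenate_alignments(alignments):
    """Join alignments end-to-end and record 1-based partition boundaries.

    alignments is a list of (protein_name, {taxon: aligned_sequence}) pairs.
    Taxa missing from an alignment are filled with gaps (an assumption).
    """
    taxa = sorted({taxon for _, aln in alignments for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for name, aln in alignments:
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError(f"sequences in {name} differ in aligned length")
        width = widths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * width))
        partitions.append((name, start, start + width - 1))
        start += width
    return {taxon: "".join(parts) for taxon, parts in pieces.items()}, partitions


# Invented toy alignments for illustration.
toy = [
    ("Rad51", {"taxon_A": "MA-K", "taxon_B": "MAQK"}),
    ("Dmc1", {"taxon_A": "GS", "taxon_C": "GT"}),
]
supermatrix, partitions = concatenate_alignments(toy)
```

The recorded boundaries correspond to the information a partitioned analysis needs in order to estimate parameters separately for each protein.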
Inventory assembly
Genes were determined to be present in an organism when putative orthologs
were discovered and identified with bi-directional BLAST and phylogenetic analyses
(Figures 2.2-2.18) (Ramesh, Malik, and Logsdon 2005; Malik et al. 2008). Protein
sequence data for genes that encode all SE proteins (including RPA subunits) in Homo
sapiens, Saccharomyces cerevisiae, and Oryza sativa (or its relative) were aligned and
their Smith-Waterman pairwise alignment scores (Smith and Waterman 1981) were
calculated with the PRSS/PRFX tool
(http://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=shuffle) (Tables 2.3 and
2.4). In some cases protein sequences were either not available for one representative or
we were unable to align them properly (Tables 2.3 and 2.4). We also determined Smith-Waterman
pairwise alignment scores (Smith and Waterman 1981) for protein sequences
from genes that encode RNA Polymerase I proteins (A190, A135, AC40, AC19, AC12.2,
Rpb5, Rpb6, Rpb8, Rpb10, and Rpb12 (Kuhn et al. 2007)) from H. sapiens and S.
cerevisiae (Table 2.2). Poisson regression analyses (Allison 1999) were calculated on
counts of detection failures among 34 genomes and their respective Smith-Waterman
pairwise alignment scores (Smith and Waterman 1981) for the RNA Polymerase I dataset
and a combined RNA Polymerase I/RPA1-3 dataset using the genmod procedure in SAS
v. 9.2 (SAS Institute Inc., Cary, NC). Graphs were created from the resulting parameter
estimates and Wald 90% confidence limits using Microsoft Excel 2010, with the
observed numbers of detection failures superimposed for comparison. In addition,
parameter estimates and Wald 90% confidence limits from regression analyses were used
to calculate the predicted numbers of detection failures given protein Smith-Waterman
pairwise alignment scores listed in Tables 2.2-2.4 (Allison 1999).
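The Smith-Waterman score used here as a combined measure of length and conservation can be illustrated with a minimal dynamic-programming sketch. The scoring scheme below (a fixed match reward, a fixed mismatch penalty, and a linear gap penalty) is an assumption chosen for brevity; the PRSS/PRFX tool uses an amino acid substitution matrix and its own gap penalties, so the values produced by this sketch are not comparable to those in Tables 2.2-2.4.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the score matrix in memory.
    """
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current[j] = max(
                0,                                   # start a new local alignment
                previous[j - 1] + substitution,      # align the two residues
                previous[j] + gap,                   # gap in seq_b
                current[j - 1] + gap,                # gap in seq_a
            )
            best = max(best, current[j])
        previous = current
    return best
```

Because the score accumulates over the aligned region, long and well-conserved pairs receive high scores, while short or divergent pairs receive low scores; this is the property that makes the score a plausible predictor of detection failure.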
Results and discussion:
An inventory of the presence of 13 component proteins predicted to catalyze the
DNA strand exchange (SE) reaction (RPA1, RPA2, RPA3, Rad52, Rad59, Rad51,
Rad55, Rad57, Dmc1, Hop2, Mnd1, Rad54, and Rdh54 (Krogh and Symington 2004;
Sakaguchi et al. 2009)) among 47 diverse eukaryotes is presented here (Figure 2.1) (Adl
et al. 2005). This inventory includes representatives of five of the six currently
recognized eukaryotic supergroups (Adl et al. 2005) (Excavata, Chromalveolata,
Archaeplastida, Opisthokonta, and Amoebozoa; but not Rhizaria). Completed genome
sequence and other nucleotide databases (including ESTs) were rigorously searched using
HMM profiles and PSSMs created from phylogenetically verified amino acid sequences
with HMMER v2.3.2 (Sonnhammer et al. 1998), PSI-BLAST, and tBLASTn (Altschul et
al. 1997) (Figure 2.1). Identities of putative orthologs were confirmed with phylogenetic
analyses (Ramesh, Malik, and Logsdon 2005; Malik et al. 2008) performed with PhyML
v3.0 (Guindon et al. 2009) and Phylobayes v3.1 (Lartillot, Lepage, and Blanquart 2009)
(Figures 2.2-2.18). Although phylogenetic analysis of several SE proteins yielded poorly
resolved phylogenies, the resolution is sufficient to establish orthology (Ramesh, Malik,
and Logsdon 2005; Malik et al. 2008).
To determine if either short amino acid sequences or substitution rate
heterogeneity are causing phylogenetic artifacts (Felsenstein 2004), we analyzed the
concatenated SE protein alignments (Figure 2.19) (Rokas et al. 2003). The concatenated
protein sequence analyses strongly support the monophyly of several known groups:
Opisthokonta, Chloroplastida, Stramenopila, Apicomplexa, Metamonada, Discoba and
Amoebozoa (Adl et al. 2005). However, the unsupported topology, which places some
Excavata with ciliates and Amoebozoa, is most likely due to methodological issues,
such as long-branch attraction (Felsenstein 2004), and is unlikely to indicate cases of
lateral gene transfer (Syvanen 1985).
Ten SE components (RPA1, RPA2, Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2,
Mnd1 and Rdh54 (Krogh and Symington 2004; Sakaguchi et al. 2009)) are present in
every supergroup tested (Figure 2.1) (Adl et al. 2005). Therefore, they most likely arose
very early during eukaryotic evolution, prior to the divergence of nearly all known
eukaryotes (Dacks and Doolittle 2001; Villeneuve and Hillers 2001; Ramesh, Malik, and
Logsdon 2005; Malik et al. 2008). The absence of the Rad54 gene from the eukaryotic
supergroups Excavata and Amoebozoa indicates that it may have arisen later, after the
divergence of Excavata or Amoebozoa from other eukaryotes (Figure 2.1 and Table 2.1)
(Dacks and Doolittle 2001; Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon
2005; Malik et al. 2008). We detected the Rad59 gene in Opisthokonta and Amoebozoa,
which together form the metagroup Unikonta (Cavalier-Smith 2002a); the most parsimonious
explanation is that Rad59 arose more recently during eukaryotic evolution than the last
eukaryotic common ancestor, possibly in the last common ancestor of Unikonta (Dacks
and Doolittle 2001; Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005;
Malik et al. 2008).
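The reasoning applied in the preceding paragraphs can be summarized as a simple rule, sketched below. The function name and the wording of the returned labels are hypothetical; the supergroup names and the three example distributions are taken from the text above. The sketch only flags the pattern; deciding between a later origin and a secondary loss still depends on the rooting of the eukaryotic tree and on the reliability of the non-detections.

```python
SAMPLED_SUPERGROUPS = (
    "Excavata",
    "Chromalveolata",
    "Archaeplastida",
    "Opisthokonta",
    "Amoebozoa",
)


def summarize_distribution(detected_in):
    """Return (interpretation, missing supergroups) for one gene."""
    missing = [group for group in SAMPLED_SUPERGROUPS if group not in detected_in]
    if not missing:
        return "detected in every sampled supergroup; likely ancestral", missing
    return "not detected in some supergroups; later origin or loss", missing


# Distributions as described in the text above.
distributions = {
    "Rad51": set(SAMPLED_SUPERGROUPS),
    "Rad54": {"Chromalveolata", "Archaeplastida", "Opisthokonta"},
    "Rad59": {"Opisthokonta", "Amoebozoa"},
}
summary = {gene: summarize_distribution(groups) for gene, groups in distributions.items()}
```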
Limits of sequence detection and distribution of strand
exchange genes among eukaryotes
We selected the three Replication Protein A (RPA) subunits (Sakaguchi et al.
2009) for phylogenetic comparison with the other SE proteins and with the subunits of
RNA Polymerase I (Kuhn et al. 2007), with the goal of establishing a threshold for
detection of their component proteins. Replication Protein A, a complex composed of
RPA1 (70kDa), RPA2 (32kDa), and RPA3 (14kDa) subunits, binds ssDNA (Brill and
Stillman 1991; Bochkareva et al. 2002; Brush 2002; Ishibashi et al. 2005; Ishibashi,
Kimura, and Sakaguchi 2006; Chang et al. 2009; Sakaguchi et al. 2009). In humans, the
RPA heterotrimer is critical to DNA metabolic pathways, such as DNA replication, DNA
repair, recombination, cell cycle, and DNA damage checkpoints (Zou et al. 2006). In
yeast, RPA has been implicated in DNA replication, repair, and recombination (Wold
1997). RPA is also necessary for DNA damage repair in plants but may not be critical
for DNA replication and homologous recombination in Oryza sativa (Ishibashi et al.
2005; Ishibashi, Kimura, and Sakaguchi 2006; Kimura and Sakaguchi 2006; Sakaguchi et
al. 2009). During meiosis the RPA complex recruits Rad52 proteins during pre-synaptic
filament formation in animals, fungi, and plants (Davis and Symington 2003; Krogh and
Symington 2004; Sakaguchi et al. 2009). Two conserved domains (DBD-A and DBD-B)
allow RPA1 proteins to bind ssDNA, preventing formation of secondary structures of
DNA that inhibit SE (Wold 1997; Brush 2002). RPA1 monomers may bind ssDNA
weakly (approximately 8 nucleotides) but binding of the RPA1 subunit interaction motif
(DBD-C) with RPA2/RPA3 heterodimers (RPA2 DBD-D, RPA3 DBD-E) causes
conformational changes that result in stable interactions (approximately 30 nucleotides)
(Bochkareva et al. 2002). The role of RPA3, which has a single binding motif (DBD-E),
is currently unclear. In Saccharomyces cerevisiae RPA1 binds only to RPA2/RPA3
heterodimers (Sakaguchi et al. 2009). Inspection of amino acid sequences among
animals, fungi, and plants indicates that RPA1 protein sequences are longer and more
conserved than RPA2 (Tables 2.2 and 2.3). In addition, the binding domains of RPA1
and RPA2 protein sequences (DBD - A-D and F) appear to be well conserved among all
of the eukaryotes studied here (Figures 2.20-2.24). RPA3 has the least conserved protein
sequences of the three subunits (Figure 2.25 and Tables 2.2-2.4), possibly attesting to
differences in ssDNA binding among eukaryotes (Sakaguchi et al. 2009). Various RPA
complexes have been observed but there are no known functions of any component
outside of the trimerization core (Bochkareva et al. 2002). Also, there are no known
mutant phenotype suppressors for any of the three components (Table 2.5). Together,
these data indicate that RPA trimerization is likely to be required for successful SE in all
eukaryotes. Therefore, apparently missing RPA subunit genes are best explained by
limitations of the search methods used (i.e. type II error). We propose that a correlation
exists between the numbers of RPA subunit gene sequence detection failures and their
respective protein sequence amino acid lengths and degrees of conservation (Table 2.2);
this possibility is explored further below.
RPA1 protein sequences were obtained for all 47 organisms studied here, while
genes that encode RPA2 and RPA3 homologs were not always identified in each study
organism (Figure 2.1). Among the 34 organisms included in this study with at least one
genome sequence of 8.0x whole-genome shotgun sequencing coverage, there are five
RPA2 and ten RPA3 gene sequence detection failures. RPA1 and RPA2 genes were likely
present in the last eukaryotic common ancestor due to their presence in every supergroup
tested (Villeneuve and Hillers 2001; Ramesh, Malik, and Logsdon 2005; Malik et al.
2008). The apparent absence of RPA3 genes from the Amoebozoa indicates that this
subunit may have arisen later during eukaryotic evolution (Dacks and Doolittle 2001),
after the divergence of Amoebozoa from other eukaryotes. However, the Amoebozoa are
not candidates as the earliest-diverging eukaryotes by current hypotheses for rooting the
evolutionary tree of eukaryotes (Cavalier-Smith 2002a; Stechmann and Cavalier-Smith
2002; Roger and Simpson 2009; Cavalier-Smith 2010), making it more likely that RPA3
was lost in the common ancestor of D. discoideum and Entamoeba (Dacks and Doolittle
2001; Adl et al. 2005).
To test the hypothesis that the numbers of RPA subunit sequence detection
failures correlate with their protein sequence lengths and degrees of conservation, we
compared RPA subunit protein sequence data to homologs of the ten RNA Polymerase I
(Pol I) subunit protein sequences (A190, A135, AC40, AC19, AC12.2, Rpb5, Rpb6,
Rpb8, Rpb10, and Rpb12 (Kuhn et al. 2007)) (Figures 2.3 and 2.26) by determining
Smith-Waterman pairwise alignment (S-W) scores (Smith and Waterman 1981). We
selected the S-W algorithm to score protein sequence length and conservation due to its
ability to apply a similarity measure to protein sequences of variable lengths (Smith and
Waterman 1981). S-W scores were calculated from pairwise alignment of protein
sequences encoded by genes from Homo sapiens and S. cerevisiae genomes. Yeast and
human gene products were selected for S-W score assessment on the basis that each
genome contains all of the genes tested, providing consistency between comparisons.
Poisson regression analysis (Cameron and Trivedi 1998) was performed on the numbers
of sequence absences observed among 34 organisms (Figure 2.1), with at least one
organism per supergroup. Most were selected because they have a genome sequence with
8.0x whole-genome shotgun sequencing coverage. Cyanidioschyzon merolae and
Encephalitozoon cuniculi were included because their genomes have been sequenced
from end to end and are therefore complete (Katinka et al. 2001; Matsuzaki et al. 2004).
Homo sapiens was included because all SE, RPA, and RNA Pol I component proteins
were detected within its genome, although the reference human genome sequence has
less than 8.0x sequence coverage (Venter et al. 2001). Regression analysis indicates that
RNA Pol I S-W scores are good
predictors of the numbers of observed RNA Pol I subunit detection failures (p = 0.0017)
(Figure 2.27-a.) (Allison 1999). The numbers of observed RPA subunit detection failures
relative to their S-W scores are similar to the expected numbers of detection failures, as
predicted by the 90% confidence interval for the regression of RNA Pol I data (Figure
2.27-b.). To increase the numbers of proteins included in the regression analyses, we
then performed Poisson regression analyses on a combined RPA and RNA Pol I dataset
(Figure 2.27-c.). Regression analysis of the combined dataset indicates that S-W scores
are good predictors of RPA/RNA Pol I subunit detection failures (p < 0.0001).
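The regression step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the analysis reported here: the S-W scores and failure counts are invented, the scores are divided by 1000 purely for numerical convenience, and the model log(expected failures) = b0 + b1 * score is fitted with a plain Newton iteration rather than a statistics package. The function and variable names are hypothetical.

```python
import math

def fit_poisson(xs, ys, max_iter=100, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by maximum likelihood using Newton's method."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix
        h00 = sum(mus)
        h01 = sum(m * x for x, m in zip(xs, mus))
        h11 = sum(m * x * x for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by Cramer's rule
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

def expected_failures(b0, b1, score):
    """Expected number of detection failures for a given (scaled) S-W score."""
    return math.exp(b0 + b1 * score)

# Hypothetical data: one entry per protein (S-W score / 1000, detection failures)
sw_scores = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
failures = [12, 10, 6, 3, 1, 0]

b0, b1 = fit_poisson(sw_scores, failures)
for score, observed in zip(sw_scores, failures):
    print(f"score={score:.1f} observed={observed} expected={expected_failures(b0, b1, score):.2f}")
```

A negative fitted slope corresponds to the pattern described in the text: proteins with higher S-W scores (longer, better conserved) are expected to have fewer detection failures.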
We detected eight SE components (Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1,
and Rdh54) in all eukaryotic supergroups tested (Figure 2.1, black rows). Therefore,
these components are all likely to have been present in the last eukaryotic common
ancestor (Dacks and Doolittle 2001). We failed to detect the Rad54 gene in the
Amoebozoa tested (Figure 2.1 and Table 2.1); our search for Rad54 genes was conducted
in all available Amoebozoa sequence data (individual EST, nucleotide, and protein
sequence submissions and incomplete genomes) and none was discovered. In addition,
the Rad54 gene may be absent from Excavata; however, the genome coverages of
Trichomonas vaginalis and several Leishmania species are below 8.0x whole-genome
shotgun sequencing (Carlton et al. 2007; Aslett et al. 2010), reducing our confidence in
this conclusion. The Rad59 gene appears to have arisen later during eukaryotic
evolution, as it is present in only two supergroups (Opisthokonta and Amoebozoa) that
form a monophyletic “metagroup” (Unikonta) (Figure 2.1) (Malik et al. 2008). The last
common ancestor to extant eukaryotes may have had all of the components necessary for
homologous recombination, despite the possible absence of Rad54 and Rad59 (Bai and
Symington 1996; Klein 1997; Arbel, Zenvirth, and Simchen 1999).
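The inference used in this paragraph (a component detected in every supergroup is likely to have been present in the last eukaryotic common ancestor) can be sketched as a simple set comparison. The records below are a simplified illustration based on the statements in the text, not a transcription of Figure 2.1; in particular, the status of Rad54 in Excavata is uncertain and is treated here as undetected. All names are hypothetical.

```python
SUPERGROUPS = frozenset(
    {"Opisthokonta", "Amoebozoa", "Archaeplastida", "Chromalveolata", "Excavata"}
)

# Simplified, illustrative detection records (gene -> supergroups with a detected homolog)
detected_in = {
    "Rad51": set(SUPERGROUPS),
    "Rad59": {"Opisthokonta", "Amoebozoa"},
    "Rad54": {"Opisthokonta", "Archaeplastida", "Chromalveolata"},  # Excavata uncertain
}

def inferred_in_leca(records, supergroups=SUPERGROUPS):
    """Genes detected in every supergroup: candidates for presence in the LECA."""
    return sorted(gene for gene, groups in records.items() if supergroups <= groups)

def restricted_distribution(records, supergroups=SUPERGROUPS):
    """Genes missing from at least one supergroup: later origin or loss must be weighed."""
    return sorted(gene for gene, groups in records.items() if not supergroups <= groups)

print(inferred_in_leca(detected_in))
print(restricted_distribution(detected_in))
```

A restricted distribution alone does not distinguish a later origin from an early loss; as discussed in the text, that choice depends on the rooting of the eukaryotic tree.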
We then compared the numbers of observed SE component detection failures to
those predicted by our analysis of the RNA Pol I/RPA dataset (Figure 2.27-c. and Table
2.2). Sequence detection failures for four SE components (Rad55, Rad57, Hop2, and
Mnd1) are within the 90% confidence interval of the predicted failure range.
Interestingly, Rad55 and Rad57 proteins form heterodimers that stabilize Rad51-DNA
filaments, and they are not known to function as either monomers or homodimers
(Bleuyard, Gallego, and White 2006; Filippo, Sung, and Klein 2008). The same is true of
Hop2 and Mnd1 proteins that stabilize Dmc1-DNA filaments (Chen et al. 2004; Henry et
al. 2006). Therefore, the apparent absence of one of these proteins in the presence of its
partner, as in Theileria annulata, Thalassiosira pseudonana, and Phaeodactylum
tricornutum, may be due to type II errors (false negatives). Conversely, the joint absence
of the genes that encode Hop2, Mnd1, and Dmc1 proteins may indicate that joint losses
have occurred (e.g. Drosophila sp., Caenorhabditis sp., Neurospora crassa, Gibberella
zeae, and Ustilago maydis) (Figure 2.1 and Table 2.1). Six SE components (Rad52, Rad59,
Rad51, Dmc1, Rad54, and Rdh54) were detected less frequently than predicted among
the taxa tested (Figure 2.27-c. and Table 2.2). Observed numbers of detection failures are
2-4 times higher than predicted for Rad52 and Rad59 genes, and no sequence detection
failures of Rad51, Dmc1, Rad54, and Rdh54 genes are predicted (Table 2.2).
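The reasoning about obligate partners can be expressed as a small consistency check: when only one member of a heterodimer is detected, the missing partner is flagged as a possible false negative, whereas a joint absence is left unflagged as a possible joint loss. This is a sketch under assumptions; the organism names and detection sets are placeholders, not data from Figure 2.1, and the function name is hypothetical.

```python
# Obligate heterodimer partners named in the text
OBLIGATE_PAIRS = [("Rad55", "Rad57"), ("Hop2", "Mnd1")]

def lone_partner_absences(detected, pairs=OBLIGATE_PAIRS):
    """Return (organism, missing gene, detected partner) for each half-detected pair."""
    flagged = []
    for organism, genes in sorted(detected.items()):
        for a, b in pairs:
            if (a in genes) != (b in genes):  # exactly one partner was detected
                missing, present = (a, b) if b in genes else (b, a)
                flagged.append((organism, missing, present))
    return flagged

# Placeholder detection results (organism -> set of genes found)
detected = {
    "organism_A": {"Rad55", "Hop2", "Mnd1"},           # Rad57 missing, Rad55 present
    "organism_B": {"Rad55", "Rad57"},                  # Hop2 and Mnd1 both missing
    "organism_C": {"Rad55", "Rad57", "Hop2", "Mnd1"},  # complete
}

flagged = lone_partner_absences(detected)
print(flagged)
```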
S-W scores may not adequately predict the numbers of detection failures because of
either variation in genome coverage or the true absence of genes from a genome. To
minimize the effects of variation in genome quality, only organisms with
completed genomes were used for this analysis (Figure 2.1) (Malik et al. 2008). Poisson
regression analysis of the RPA/RNA Pol I dataset (Figure 2.27) indicates a
strong correlation between S-W scores and the numbers of absences observed (p <
0.0001) (Allison 1999). Thus, the effect of variation in genome quality is likely to be
negligible among organisms with completed genomes. The quality of gene sequence
annotations and true absences of genes from completed genomes are therefore the most
likely causes of any reduction in the correlation between S-W scores and the numbers of
detection failures. As mentioned previously, the Rad59 gene likely arose later during eukaryotic
evolution (Malik et al. 2008), explaining some of the detection failures observed (Dacks
and Doolittle 2001). Similarly, some of the sequence detection failures of the Rad54
gene may be due to its emergence later during eukaryotic evolution (Dacks and Doolittle
2001) (after the divergence of Excavata and/or Amoebozoa (Adl et al. 2005)), although
failures to detect the Rad54 gene among Chromalveolata, Archaeplastida, and
Opisthokonta are most likely due to subsequent losses. All apparent absences of Rad51,
Dmc1, and Rdh54 genes from the 34 genomes tested could indicate true gene losses.
Despite the presence and inferred importance of SE component proteins in the earliest
common ancestor of eukaryotes, lineage-specific losses seem pervasive; only some
animals and fungi have retained all SE component proteins studied here.
Suppressors of strand exchange component mutant
phenotypes in Saccharomyces cerevisiae
Comparing our observed distributions of SE component proteins with functional
studies of SE components in S. cerevisiae suggests that overexpression or mutation of
the Rad51 gene could compensate for the mutation or loss of other SE protein-coding
genes (Table 2.5). Extragenic suppressors of mutant phenotypes in S. cerevisiae are
known for most of the SE components studied here; the exceptions are Rad51 (absent
only in Giardia intestinalis) and RPA1-3 (Table 2.5). Overexpression of the Rad51 gene in
S. cerevisiae suppresses rad52, dmc1, rad55, rad57, hop2, and mnd1 mutant phenotypes
(Milne and Weaver 1993; Klein 1997; Krejci et al. 2002; Tsubouchi and Roeder 2003;
Henry et al. 2006; Schild and Wiese 2009). In addition, S. cerevisiae rad51 mutants
demonstrate decreased recombination and reduced viability (Bishop 1994; Tsuzuki et al.
1996; Bleuyard, Gallego, and White 2006). These characteristics may account for the
infrequent loss of the Rad51 gene from eukaryotes.
In fungi, rad52 mutations are perhaps the most deleterious of those affecting the SE
machinery, but Rad52 genes may be absent from as many as 18 of 47 genomes,
representing four of the five eukaryotic supergroups in our study (Feng et al. 2007). This
apparent contradiction may be explained if compensatory changes in the Rad51 gene
overcome the inhibitory
effects of single stranded DNA-RPA complexes in nature (Milne and Weaver 1993;
Krejci et al. 2002).
Although Dmc1 is critical for meiosis in many organisms studied, several organisms
appear to lack the gene that encodes it (Bishop 1994; Bishop et al. 1999;
Tsubouchi and Roeder 2003). However, in S. cerevisiae, Rad51 protein is capable of
completing homologous strand exchange during meiosis when the Rad51 gene is
overexpressed, and it does so without the assistance of Hop2 or Mnd1 proteins, which
work in concert with Dmc1 (Table 2.2) (Tsubouchi and Roeder 2003). In addition, dmc1
mutant phenotypes may be suppressed by high copy numbers of the Rad54 gene,
which reduce the increased numbers of Rad51 foci that form, as in S. cerevisiae (Bishop
1994; Bishop et al. 1999).
We cannot determine whether the absences of Rad55 and Rad57 genes in our
analyses are real or are artifacts of the search methods used. However, if Rad51 gene
expression is increased, it is possible that enough protein is available for successful
pre-synaptic filament formation despite the destabilizing effects that rad55 and rad57
mutations may have on recombination, as in S. cerevisiae (Fung, Mozlin, and Symington
2009). This observation is consistent with the suppression of rad55 and rad57 mutant
phenotypes by compensatory changes in the Rad52 gene, which encodes products that
recruit Rad51 to form pre-synaptic filaments in S. cerevisiae (Milne and Weaver 1993).
The numbers of Hop2 and Mnd1 gene absences may also be due to failures of the search
methods used; however, it is interesting that compensatory changes in the Rad51 gene
also suppress hop2 or mnd1 mutant phenotypes in S. cerevisiae, possibly by creating
more Dmc1 foci (Bishop 1994).
Alternatively, the elimination of some Rad51 protein functions suppresses the
phenotypes of S. cerevisiae rad54 and rdh54 mutants (Klein 1997). The detrimental
effects of rad54 and rdh54 mutations are almost certainly the result of accumulation of
Rad51 proteins on DNA, since mutations in rad52, rad51, rad55, and rad57 suppress these
phenotypes, eliminating or reducing the number of Rad51 proteins bound to ssDNA in S.
cerevisiae (Klein 1997).
Finally, the Rad59 gene, a paralog of Rad52, encodes a protein that appears to
overlap functionally with Rad52 but cannot suppress a rad52 mutant phenotype in S.
cerevisiae (Bai and Symington 1996). The rad59 mutation confers the most benign
phenotype (mildly defective mitotic recombination and decreased resistance to ionizing
radiation), which is suppressed by Rad52 overexpression (Davis and Symington 2001;
Davis and Symington 2003; Pannunzio, Manthey, and Bailis 2008). This mild phenotype
may have permitted frequent losses of the Rad59 gene.
Conclusions
We found that 10 of 13 strand exchange reaction components (RPA1, RPA2,
Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, and Rdh54 (Krogh and Symington
2004; Sakaguchi et al. 2009)) are present in all of the eukaryotic supergroups (Adl et al.
2005) scrutinized and thus are likely to have been present in the last common ancestor to
extant eukaryotes (Figure 2.1) (Dacks and Doolittle 2001). One component (Rad54)
may have arisen later during eukaryotic evolution (Dacks and Doolittle 2001), after the
divergence of either Amoebozoa or Excavata (if either is the earliest-diverging
eukaryotic lineage) (Adl et al. 2005). It is likely that Rad59 arose later during
eukaryotic evolution (Dacks and Doolittle 2001), after the divergence of the Unikonta
(composed of Opisthokonta and Amoebozoa) (Cavalier-Smith 2002a) from other
eukaryotes. Rad54 (Petukhova, Stratton, and Sung 1998; Petukhova et al. 1999;
Kiianitsa, Solinger, and Heyer 2002) and Rad59 (Bai and Symington 1996; Davis and
Symington 2003; Pannunzio, Manthey, and Bailis 2008) proteins are both thought to
function primarily during sister chromatid or intrachromosomal recombination.
The requirement for trimerization of RPA protein subunits appears to be
conserved among eukaryotes, as heterotrimers are observed among animals, fungi, and
plants (Sakaguchi et al. 2009). The presence of RPA1 subunits in every organism studied
here strongly implies that RPA2 and RPA3 must also be present in these organisms; thus
any apparent absences are inferred to be the result of search method detection limits
(Figure 2.1). Detection of nucleotide and protein sequences is influenced by the length or
degree of sequence conservation in different organisms (Pevsner 2009). The
Smith-Waterman pairwise alignment algorithm (S-W) scores protein sequence data using a
similarity measure that incorporates protein sequence length (Smith and Waterman
1981). Thus, we hypothesized that the number of detection failures is correlated with
protein S-W scores. We tested this hypothesis with searches of RNA Pol I core complex
subunits (Figure 2.26) (Kuhn et al. 2007). In addition, we determined RNA Pol I subunit
S-W scores with pairwise alignments of human and S. cerevisiae gene products (Figures 2.1 and
2.26). Poisson regression analyses (Allison 1999) indicate that there is a strong
correlation between S-W scores and the number of undetected sequences among RNA
Pol I proteins (Figure 2.27-a. and Table 2.2). Furthermore, the number of detection
failures predicted by the RNA Pol I regression analysis for the RPA subunits is similar to
the observed numbers of RPA subunit detection failures (Table 2.2) (Allison 1999).
These analyses indicate that absences among RPA components are likely due to failures
of detection and may not represent true losses. We then combined the RNA Pol I and
RPA datasets, performed additional regression analyses (Figure 2.27-b.), and compared
the numbers of predicted detection failures relative to S-W scores for the remaining SE
components (Figure 2.27-c and Table 2.2). More detection failures were observed than
predicted by the Pol I/RPA data for six SE components (Rad52, Rad59, Rad51, Dmc1,
Rad54, and Rdh54); these absences may represent true losses of genes from genomes.
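To illustrate why an S-W score reflects both sequence length and conservation, the following sketch computes a Smith-Waterman local alignment score. It is a simplified illustration, not the scoring used in this study: it assumes a flat match/mismatch score and a linear gap penalty, whereas protein comparisons normally use a substitution matrix, and the parameter values and the function name are placeholders.

```python
def smith_waterman_score(a, b, match=3, mismatch=-3, gap=2):
    """Best local alignment score between sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] - gap,       # gap in b
                curr[j - 1] - gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best

# Longer identical sequences score higher than shorter ones,
# and unrelated sequences score zero.
print(smith_waterman_score("MKTAYIAKQR", "MKTAYIAKQR"))
print(smith_waterman_score("MKTA", "MKTA"))
print(smith_waterman_score("AAAA", "TTTT"))
```

Because the score accumulates over the aligned region, a short or poorly conserved protein yields a low score even when it is a true homolog, which is the property the detection-failure analysis relies on.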
Complicating the inference of the early origins of SE component proteins are the
frequent absences of SE protein-coding genes observed among diverse eukaryotes
(Figure 2.1 and Table 2.1). Only eight organisms (all Opisthokonts (Adl et al. 2005))
encode all of the SE proteins studied: Homo sapiens, Mus musculus, Gallus gallus, Xenopus
laevis, Danio rerio, Nematostella vectensis, Saccharomyces cerevisiae, and
Kluyveromyces lactis. The organism with the fewest SE protein-coding genes
(Caenorhabditis elegans with only four) is also an Opisthokont (Figure 2.1) (Adl et al.
2005). Therefore, animals represent the greatest range in the number of SE proteins
encoded. The common ancestor to the putatively early diverging eukaryotes
(Trichomonas vaginalis and Giardia intestinalis) (Woese, Kandler, and Wheelis
1990) most likely had at least nine SE genes present in its genome (RPA1, RPA2, RPA3,
Rad52, Rad51, Dmc1, Rad57, Hop2, and Mnd1).
In nature, frequent losses of SE protein-coding genes may be facilitated by
mutations in the Rad51 gene (Schild and Wiese 2009). Overexpression or mutation of Rad51
genes suppresses the mutant phenotypes of several SE protein coding genes (rad52,
rad59, rad55, rad57, dmc1, hop2, mnd1, rad54, and rdh54) in S. cerevisiae (Table 2.5).
Rad51 mutations could result in the relaxation of selection on SE proteins, leaving the
genes that encode them vulnerable to loss (Nei and Kumar 2000; Ridley 2004).
Furthermore, when component proteins function in complexes, such as Hop2 and Mnd1
(Henry et al. 2006), loss of the gene encoding one component may expedite the loss of
the gene encoding its partner protein; genes that encode the components of obligate
complexes likely coevolved and rely on one another for functional value (Goh et
al. 2000). As the Hop2-Mnd1 heterodimer is known to function only during meiosis,
interacting with Dmc1-DNA filaments (Chen et al. 2004), the loss of the Dmc1 gene may
hasten the loss of both Hop2 and Mnd1 genes. Alternatively, it is conceivable that Hop2
and Mnd1 proteins could interact with Rad51 proteins in organisms missing Dmc1 (e.g.
Encephalitozoon cuniculi and Paramecium tetraurelia) (Figure 2.1).
Although the protein components of the strand exchange reaction appear to be
ubiquitous across eukaryotes (Adl et al. 2005) and were likely present in their last
common ancestor (Dacks and Doolittle 2001), the manner in which SE proceeds may
vary greatly due to subsequent loss of a few component proteins. All extant eukaryotes inherited a
complex of SE machinery that has been differentially retained over evolutionary time
since the last eukaryotic common ancestor.
Figure 2.1: Phylogenetic distribution among eukaryotes of DNA strand exchange
genes. The names of genera studied are listed. Asterisks indicate organisms
with completed genomes (on the basis that they have at least one isolate
genome-sequence with 8.0x whole-genome shotgun coverage or a genome
that was sequenced from end-to-end). Supergroups are presented in black
rows with a summary of the genes deduced to be present in their common
ancestor and grey rows provide summaries of major Opisthokont lineages.
Light grey columns designate RPA1-3, which were used to determine the
thresholds of detection. Meiosis-specific proteins are presented in dark grey
columns. Symbols: ‘+’ indicates sequence was found and phylogenetically
verified, ‘(-)’ indicates that sequence was not found and may be outside the
threshold of detection, blank spaces indicate sequences were not found and
the genome project has less than the equivalent of 8.0x Sanger whole genome
shotgun coverage, ‘-’ indicates sequence was not found, is within the
calculated threshold of detection and the genome project has ≥ 8.0x
coverage. The tree is a cartoon that summarizes current literature (Simpson,
Inagaki, and Roger 2006; Baldauf 2008; Burki, Shalchian-Tabrizi, and
Pawlowski 2008; Kolisko et al. 2008; Timmermans et al. 2008; Minge et al.
2009; Reeb et al. 2009; Shadwick et al. 2009).
[Figure 2.1 presence/absence matrix not reproduced.]
Rows (genera; asterisks mark completed genomes):
EXCAVATA: Giardia*, Trichomonas, Trypanosoma*, Leishmania, Naegleria*
CHROMALVEOLATA: Plasmodium*, Theileria*, Cryptosporidium*, Tetrahymena*, Paramecium*, Thalassiosira*, Phaeodactylum*, Phytophthora*
ARCHAEPLASTIDA: Arabidopsis*, Oryza, Physcomitrella*, Chlamydomonas*, Ostreococcus*, Cyanidioschyzon*
OPISTHOKONTA, HOLOZOA: Homo*, Mus, Monodelphis, Gallus, Xenopus, Danio*, Strongylocentrotus*, Aedes, Drosophila*, Caenorhabditis*, Apis, Tribolium, Nematostella, Trichoplax*, Monosiga*
OPISTHOKONTA, FUNGI: Saccharomyces*, Kluyveromyces, Candida albicans*, Neurospora*, Gibberella*, Magnaporthe, Aspergillus*, Schizosacch.*, Coprinus*, Ustilago*, Encephalitozoon*
AMOEBOZOA: Dictyostelium*, Entamoeba*
Columns: Rpa1, Rpa2, Rpa3, Rad52, Rad59, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, Rad54, Rdh54
[Tree image not reproduced. Panel label: Rpa1. Scale bar: 0.5 substitutions/site.]
Tip labels and sequence identifiers: Mus 74143871; Homo 17390283; Monodelphis 126314233; Gallus 57525314; Xenopus 147901418; Danio 49619041; Strongylocentrotus 115958731; Apis 110756775; Tribolium 91094635; Aedes 157133641; Drosophila 195390608; Trichoplax 196010750; Nematostella 156369841; Monosiga 167519220; Saccharomyces 6319321; Kluyveromyces 50302901; Candida albicans 68472948; Ustilago 71022145; Coprinus 169852738; Schizosaccharomyces 213408425; Aspergillus 145234512; Gibberella 46137213; Neurospora 85102469; Magnaporthe 39974337; Oryza 115449015; Arabidopsis 15225129; Physcomitrella 168050100; Ostreococcus 145353884; Chlamydomonas 159491651; Cyanidioschyzon 151559128; Trichomonas 154413577; Caenorhabditis 17533299; Encephalitozoon 19074669; Giardia 253747183; Entamoeba 67468384; Dictyostelium 66804925; Tetrahymena 146163802; Paramecium 145514039; Cryptosporidium 209879421; Theileria 84995782; Plasmodium 68077039; Thalassiosira 224006215; Phaeodactylum 219123923; Phytophthora capsici jgi-116852; Naegleria jgi-45814; Trypanosoma 71756127; Leishmania 40317150.
Figure 2.2: Unrooted phylogenetic tree of 47 Replication Protein A – 1 (RPA1)
homologs. Trees were estimated with maximum likelihood and Bayesian
inference (LG+I+G+F) from 506 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, and Excavata in brown. GenBank Geninfo
Identifiers are given for all sequences unless otherwise noted (e.g. “jgi”
refers to the Joint Genome Institute’s public sequence databases). The
consensus topology of 2 Phylobayes chains is shown. Numbers at the nodes
indicate support from 1000 PhyML bootstrap replicates followed by the
posterior probability estimated using Phylobayes.
[Tree image not reproduced. Panel label: Rpa2. Scale bar: 0.5 substitutions/site.]
Tip labels and sequence identifiers: Homo 4506585; Mus 13435424; Monodelphis 126328771; Gallus 71894737; Xenopus 55742354; Danio 63102323; Strongylocentrotus 115929083; Nematostella 162101610; Aedes 157104136; Drosophila 194766307; Tribolium 91088823; Apis 110748861; Caenorhabditis 157746001; Trichoplax 196015539; Gibberella 46136417; Neurospora 85105463; Magnaporthe 145612441; Aspergillus 145240461; Coprinus 169853110; Schizosaccharomyces 63054444; Ustilago 71014541; Saccharomyces 6324017; Kluyveromyces 50308557; Candida albicans 68482450; Entamoeba 167380559; Naegleria jgi-scaffold 32 183881-184777; Trichomonas 123482304; Leishmania 146081832; Trypanosoma 71410456; Ostreococcus 145353240; Chlamydomonas 159483541; Arabidopsis 82621223; Oryza 9801268; Physcomitrella 168062552; Encephalitozoon 19074286; Monosiga jgi-36751; Phaeodactylum 219112023; Thalassiosira 224001160; Cyanidioschyzon 151559134; Theileria 71029898; Cryptosporidium 209882741; Phytophthora capsici jgi-116202.
Figure 2.3: Unrooted phylogenetic tree of 42 Replication Protein A – 2 (RPA2)
homologs. Trees were estimated with maximum likelihood and Bayesian
inference (LG+G) from 158 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, and Excavata in brown. GenBank Geninfo
Identifiers are given for all sequences unless otherwise noted (e.g.
“jgi” refers to the Joint Genome Institute’s public sequence databases). The
consensus topology of 2 Phylobayes chains is shown. Numbers at the nodes
indicate support from 1000 PhyML bootstrap replicates followed by the
posterior probability estimated using Phylobayes.
[Tree image not reproduced. Panel label: Rpa3. Scale bar: 0.5 substitutions/site.]
Tip labels and sequence identifiers: Aspergillus 259480423; Gibberella 46130890; Magnaporthe 145607133; Neurospora 85075943; Schizosaccharomyces 19075628; Ustilago 71020163; Cryptosporidium 62638182; Thalassiosira 223992931; Coprinus 169850994; Candida albicans 238878621; Kluyveromyces 50302705; Saccharomyces 6322288; Plasmodium – ApiDB 130920; Danio 47940029; Xenopus 166796524; Homo 4506587; Mus 13386122; Monodelphis 126343264; Gallus 50732531; Nematostella 156406582; Tribolium 91090226; Aedes 157115696; Drosophila 21064755; Strongylocentrotus 115930113; Apis 110761258; Chlamydomonas 159475703; Ostreococcus 144578945; Physcomitrella 168007302; Oryza 113532144; Arabidopsis 145332815; Tetrahymena 146171005; Paramecium 145508173; Phytophthora 262112589; Encephalitozoon 19074517; Naegleria jgi – scaffold 8; Trichomonas 123475178.
Figure 2.4: Unrooted phylogenetic tree of 36 Replication Protein A – 3 (RPA3)
homologs. Trees were estimated with maximum likelihood and Bayesian
inference (LG+G) from 79 aligned amino acids. Opisthokonta are highlighted
in purple, Amoebozoa in blue, Archaeplastida in green, Chromalveolata in
orange, and Excavata in brown. GenBank Geninfo Identifiers are given for all
sequences unless otherwise noted (e.g. “jgi” refers to the Joint Genome
Institute’s public sequence databases). The consensus topology of 2 Phylobayes
chains is shown. Numbers at the nodes indicate support from 1000 PhyML
bootstrap replicates followed by the posterior probability estimated using
Phylobayes.
[Tree image not reproduced. Panel label: Rpa1. Scale bar: 0.5 substitutions/site.]
Tip labels and sequence identifiers: Magnaporthe 39974337; Neurospora 85102469; Gibberella 46137213; Aspergillus 45234512; Schizosaccharomyces 213408425; Coprinus 169852738; Ustilago 71022145; Candida albicans 68472948; Kluyveromyces 50302901; Saccharomyces 6319321; Gallus 57525314; Monodelphis 126314233; Homo 17390283; Mus 74143871; Xenopus 147901418; Danio 49619041; Strongylocentrotus 115958731; Apis 110756775; Tribolium 91094635; Drosophila 195390608; Aedes 157133641; Nematostella 156369841; Trichoplax 196010750; Monosiga 167519220; Arabidopsis 15225129; Oryza 115449015; Physcomitrella 168050100; Chlamydomonas 159491651; Ostreococcus 145353884; Cyanidioschyzon 151559128; Trichomonas 154413577; Entamoeba 67468384; Dictyostelium 66804925; Paramecium 145514039; Tetrahymena 146163802; Cryptosporidium 209879421; Theileria 84995782; Plasmodium 68077039; Phytophthora capsici jgi-116852; Phaeodactylum 219123923; Thalassiosira 224006215; Leishmania 40317150; Trypanosoma 71756127; Naegleria jgi-4814.
Figure 2.5: Unrooted phylogenetic tree of 44 Replication Protein A – 1 (RPA1)
homologs. Trees were estimated with maximum likelihood and Bayesian
inference (LG+I+G+F) from 506 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, and Excavata in brown. GenBank Geninfo
Identifiers are given for all sequences unless otherwise noted (e.g. “jgi”
refers to the Joint Genome Institute’s public sequence databases). The
consensus topology of 2 Phylobayes chains is shown. Numbers at the nodes
indicate support from 1000 PhyML bootstrap replicates followed by the
posterior probability estimated using Phylobayes.
[Tree image not reproduced. Panel label: Rad52. Scale bar: 0.2 substitutions/site.]
Tip labels and sequence identifiers: Homo 75516404; Mus 148667208; Monodelphis 126340233; Gallus 730466; Xenopus 14822632; Danio 66773042; Trichoplax 196010005; Strongylocentrotus 115916111; Nematostella jgi - 38126; Coprinus 169851317; Ustilago 71021811; Magnaporthe 52783233; Neurospora 85100991; Gibberella 46137285; Aspergillus 70994704; Schizosaccharomyces 19114119; Candida albicans 68489792; Kluyveromyces 403011; Saccharomyces 27808713; Encephalitozoon 85014303; Monosiga 167525549; Phaeodactylum 219126773; Thalassiosira 224014646; Giardia 159112704; Naegleria jgi - 59017; Phytophthora ramorum jgi - 96312; Entamoeba 67476176; Dictyostelium 66825177; Cyanidioschyzon 151559144.
Figure 2.6: Unrooted phylogenetic tree of 29 Rad52 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 127 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank Geninfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute’s public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Tree image not reproduced. Panel label: Rad51. Scale bar: 0.1 substitutions/site.]
Tip labels and sequence identifiers: Mus 6755276; Monodelphis 126277684; Homo 19924133; Gallus 299819; Xenopus 62858453; Danio 47086005; Stpu115610; Nematostella 156342885; Monosiga – jgi 6000172; Trichoplax – scaffold 6; Tribolium 91080301; Aedes 157112162; Apis 110756953; Drosophila 17864108; Dictyostelium 66822135; Ustilago 71018413; Coprinus 3237296; Schizosaccharomyces 397843; Saccharomyces 4275; Kluyveromyces 50309711; Candida albicans 68485285; Aspergillus 83774056; Gibberella 46108550; Neurospora 28926929; Magnaporthe 145614388; Cael378640; Entamoeba 67477127; Arabidopsis 18420327; Oryza 18874071; Physcomitrella 16605579; Chlamydomonas 45685351; Ostreococcus 145349400; Phytophthora sojae – jgi 1108595; Phaeodactylum 219119366; Thalassiosira – jgi 2|665690|666833; Paramecium 145492218; Tetrahymena 118355624; Encephalitozoon 19069607; Naegleria – jgi 63|193771|194715; Leishmania 146091679; Trypanosoma 37778910; Trichomonas 123408472; Theileria 84996361; Plasmodium 124803581; Cryptosporidium 66357650; Cyanidioschyzon 151559143.
Figure 2.7: Unrooted phylogenetic tree of 46 Rad51 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 312 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank Geninfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute’s public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Tree image not reproduced. Panel label: Rad51. Scale bar: 0.1 substitutions/site.]
Tip labels and sequence identifiers: Mus 6755276; Monodelphis 126277684; Gallus 299819; Homo 19924133; Xenopus 62858453; Danio 47086005; Strongylocentrotus 115610811; Neve162070; Monosiga – jgi 6000172; Trichoplax – jgi 6|2098752|2100304; Drosophila 17864108; Tribolium 91080301; Aedes 157112162; Apis 110756953; Ustilago 71018413; Coprinus 3237296; Schizosaccharomyces 397843; Saccharomyces 4275; Kluyveromyces 50309711; Candida albicans 68485285; Aspergillus 83774056; Magnaporthe 145614388; Neurospora 28926929; Gibberella 46108550; Dictyostelium 66822135; Entamoeba 67477127; Trypanosoma 37778910; Leishmania 146091679; Naegleria – jgi 63|193771|194715; Theileria 84996361; Cryptosporidium 66357650; Plasmodium 124803581; Tetrahymena 118355624; Paramecium 145492218; Thalassiosira – jgi 2|665690|666833; Phaeodactylum 219119366; Phytophthora sojae – jgi 1108595; Chlamydomonas 45685351; Physcomitrella 16605579; Oryza 18874071; Arabidopsis 18420327.
Figure 2.8: Unrooted phylogenetic tree of 41 Rad51 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 312 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.9 tree. Panel label: Rad55. Scale bar: 0.5 substitutions/site.
Tip labels: Mus 31543969; Homo 4885657; Monodelphis 126341244;
Gallus 50732239; Danio 62202848; Xenopus 49118098; Trichoplax 196003852;
Nematostella 156376533; Strongylocentrotus 72152235; Aedes 157108777;
Drosophila g. – FlyBase 155112; Tribolium 158703267; Apis 95104231;
Plasmodium 68075425; Cryptosporidium 66358788; Dictyostelium 66803939;
Saccharomyces 1321666; Kluyveromyces 50307995; Tetrahymena 146183407;
Schizosaccharomyces 19114516; Neurospora 164423281; Gibberella 46136401;
Magnaporthe 39951655; Aspergillus 83765373; Leishmania 146079674;
Coprinus 169859075; Trypanosoma 71409752; Theileria 71028890;
Naegleria – jgi scaffold 54000039; Chlamydomonas – jgi 413487;
Ostreococcus 145356051; Physcomitrella 168016885; Oryza 125528524;
Arabidopsis 30698040]
Figure 2.9: Unrooted phylogenetic tree of 34 Rad55 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 125 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.10 tree. Panel label: Rad57. Scale bar: 0.5 substitutions/site.
Tip labels: Mus 148686665; Homo 20140428; Xenopus 62859281;
Monodelphis 126290417; Gallus 50748752; Danio 55251032;
Strongylocentrotus 115677903; Trichoplax jgi - 61282;
Nematostella 162106246; Apis 110760303; Drosophila 125775489;
Aedes 157109848; Tribolium 91082871; Magnaporthe 39978201;
Neurospora 16416086; Aspergillus 41581328; Gibberella 46107922;
Schizosaccharomyces 19114539; Candida albicans 68479930;
Paramecium 145526625; Kluyveromyces 50309463; Saccharomyces 6320207;
Monosiga jgi - 22811; Coprinus 169853855; Ustilago 71018023;
Leishmania 154340868; Trypanosoma 71407982; Chlamydomonas jgi - 514873;
Thalassiosira jgi - 2935; Phaeodactylum jgi - 53756;
Ostreococcus 116056847; Oryza 50909545; Arabidopsis 15242137;
Trichomonas 123402061; Plasmodium 68068013; Cryptosporidium 126644246;
Phytophthora ramorum jgi - 84214; Dictyostelium 66810419;
Cyanidioschyzon 151559139; Naegleria jgi – scaffold 18000071;
Giardia 71079596; Entamoeba 167379316]
Figure 2.10: Unrooted phylogenetic tree of 42 Rad57 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 119 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.11 tree. Panel label: Rad57. Scale bar: 0.5 substitutions/site.
Tip labels: Monodelphis 126290417; Gallus 50748752; Xenopus 62859281;
Mus 148686665; Homo 20140428; Danio 55251032;
Strongylocentrotus 115677903; Trichoplax jgi 61282;
Nematostella 162106246; Apis 110760303; Tribolium 91082871;
Aedes 157109848; Drosophila 125775489; Magnaporthe 39978201;
Neurospora 16416086; Aspergillus 41581328; Gibberella 46107922;
Schizosaccharomyces 19114539; Candida albicans 68479930;
Kluyveromyces 50309463; Saccharomyces 6320207; Monosiga jgi - 22811;
Coprinus 169853855; Dictyostelium 66810419; Cryptosporidium 126644246;
Ustilago 71018023; Phytophthora ramorum jgi - 84214;
Naegleria jgi – scaffold 18000071;
Cyanidioschyzon 151559139 128960|127722|; Trichomonas 123402061;
Trypanosoma 71407982; Leishmania 154340868; Chlamydomonas jgi - 514873;
Ostreococcus 116056847; Arabidopsis 15242137; Oryza 50909545;
Phaeodactylum jgi - 53756; Thalassiosira jgi - 2935]
Figure 2.11: Unrooted phylogenetic tree of 38 Rad57 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 119 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.12 tree. Panel label: Dmc1. Scale bar: 0.2 substitutions/site.
Tip labels: Homo 13878923; Mus 6753650; Monodelphis 126339552;
Gallus 118082782; Xenopus – jgi 69|1109612|1121870; Danio 63852092;
Strongylocentrotus 115660762; Tribolium 91078458;
Nematostella 156342885; Monosiga – jgi 11|659650|660866;
Trichoplax – jgi 52181; Coprinus 6714639; Schizosaccharomyces 3176384;
Aspergillus 121709155; Candida albicans 1706446; Kluyveromyces 50311197;
Saccharomyces 118683; Entamoeba 67482427; Arabidopsis 21903409;
Physcomitrella – jgi 9|1771650|1771838; Oryza 18700485;
Ostreococcus 145352283; Chlamydomonas 158272235;
Phytophthora sojae – jgi 108|233543|235442; Tetrahymena 118382143;
Theileria 71028324; Cryptosporidium 209879790; Plasmodium 156097941;
Trypanosoma 71659624; Leishmania 72549845;
Naegleria – jgi 1|500453|501457; Trichomonas 123408121;
Giardia 30578211; Cyanidioschyzon 151559145]
Figure 2.12: Unrooted phylogenetic tree of 34 Dmc1 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 312 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.13 tree. Panel label: Hop2. Scale bar: 0.5 substitutions/site.
Tip labels: Homo 7706577; Gallus 118103014; Mus 74225665;
Monodelphis 126307900; Xenopus 148235831; Danio 50344904;
Monosiga 167533307; Nematostella 156406634;
Strongylocentrotus 115774724; Apis 110748910; Trichoplax 196002715;
Tribolium 270009699; Aedes 157114328; Entamoeba 67474883;
Saccharomyces 9755333; Candida albicans 68481069;
Kluyveromyces 50308877; Coprinus 116508641;
Schizosaccharomyces 67989864; Aspergillus 67525977;
Encephalitozoon 85691065; Dictyostelium 66813152;
Cryptosporidium 209878810; Trichomonas 123468375;
Phytophthora sojae – jgi 130587; Plasmodium 124805558;
Paramecium 145484139; Tetrahymena 146164587; Giardia 71069891;
Leishmania 73544615; Trypanosoma 71655487; Naegleria – jgi 3930;
Cyanidioschyzon 151559141; Arabidopsis 15222250; Oryza 108710703;
Physcomitrella 168061248; Ostreococcus 116057249;
Chlamydomonas 159474466]
Figure 2.13: Unrooted phylogenetic tree of 38 Hop2 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 105 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.14 tree. Panel label: Mnd1. Scale bar: 0.2 substitutions/site.
Tip labels: Homo 14149769; Mus 20380711; Danio 68354610;
Monodelphis 126331473; Xenopus 147905528; Gallus 118089748;
Nematostella 156402283; Strongylocentrotus 7211568;
Trichoplax 196010958; Monosiga 167526144;
Schizosaccharomyces 213408226; Cryptosporidium 209880586;
Plasmodium 258549228; Encephalitozoon 19074771; Apis 66516862;
Kluyveromyces 50306007; Saccharomyces 27808704; Theileria 71027623;
Coprinus 116508756; Candida albicans 68487528; Aspergillus 6752273;
Trypanosoma 71410776; Leishmania 68126682; Cyanidioschyzon 151559132;
Aedes 157167978; Dictyostelium 66802388; Tetrahymena 118364583;
Entamoeba 67470301; Tribolium 91088571; Oryza 115478342;
Arabidopsis 30688234; Physcomitrella 168015455;
Chlamydomonas 159474610; Ostreococcus 145350550;
Phytophthora ramorum jgi - 97076; Phaeodactylum 219115844;
Thalassiosira 224014939; Paramecium 145503596; Giardia 71072040;
Naegleria jgi - 31971; Trichomonas 154413267]
Figure 2.14: Unrooted phylogenetic tree of 41 Mnd1 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 137 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.15 tree. Panel label: Mnd1. Scale bar: 0.2 substitutions/site.
Tip labels: Mus 20380711; Homo 14149769; Monodelphis 126331473;
Xenopus 147905528; Gallus 118089748; Danio 68354610;
Ostreococcus 145350550; Monosiga 167526144; Trichoplax 196010958;
Strongylocentrotus 72111568; Nematostella 156402283;
Entamoeba 67470301; Aspergillus 67522773; Aedes 157167978;
Tribolium 91088571; Apis 66516862; Cyanidioschyzon 151559132;
Encephalitozoon 19074771; Physcomitrella 168015455; Oryza 115478342;
Arabidopsis 30688234; Kluyveromyces 503060; Saccharomyces 27808704;
Candida albicans 68487528; Dictyostelium 66802388;
Naegleria jgi - 31971; Leishmania 68126682; Trypanosoma 71410776;
Cryptosporidium 209880586; Paramecium 145503596;
Tetrahymena 118364583; Phytophthora ramorum jgi - 97076;
Phaeodactylum 219115844; Thalassiosira 224014939]
Figure 2.15: Unrooted phylogenetic tree of 34 Mnd1 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 134 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.16 tree. Panel label: Rdh54. Scale bar: 0.5 substitutions/site.
Tip labels: Mus 34785459; Homo 6912622; Monodelphis 126322473;
Gallus 45382655; Xenopus 148230804; Danio 125839739;
Strongylocentrotus 115657922; Trichoplax – jgi 27976;
Nematostella 156379220; Apis 110760280; Aedes 157128256;
Monosiga – jgi 170; Ustilago 71019185; Coprinus 116508450;
Schizosaccharomyces 63054489; Aspergillus 66850516;
Candida albicans 68477713; Saccharomyces 151946464;
Kluyveromyces 49644752; Dictyostelium 66811190; Entamoeba 67475316;
Leishmania 146087788; Trypanosoma 71651467; Ostreococcus 145350886;
Physcomitrella 168048890; Paramecium 145482121;
Tetrahymena 118383249; Chlamydomonas 159467693;
Naegleria – jgi 43000074]
Figure 2.16: Unrooted phylogenetic tree of 29 Rdh54 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 495 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.17 tree. Panel label: Rad54. Scale bar: 0.2 substitutions/site.
Tip labels: Mus 1495708; Homo 1495483; Gallus 118094595;
Xenopus 47575794; Danio 41055574; Strongylocentrotus 72012428;
Nematostella 156369786; Caenorhabditis 17508659; Apis 110771180;
Tribolium 189238349; Aedes 157130680; Drosophila 27819922;
Kluyveromyces 49640265; Saccharomyces 151943650;
Candida albicans 46444289; Coprinus 116505577; Ustilago 71008587;
Schizosaccharomyces 19115202; Aspergillus 40743497;
Neurospora 7384851; Gibberella 46127169; Magnaporthe 22775414;
Monosiga 167527295; Theileria 84998504; Plasmodium 124512694;
Cryptosporidium 66361996; Thalassiosira – jgi 259430;
Phaeodactylum – jgi 37863; Phytophthora sojae – jgi 112767;
Ostreococcus 116059418; Chlamydomonas 159489044;
Physcomitrella – jgi 172305; Oryza 50913053; Arabidopsis 9294624]
Figure 2.17: Unrooted phylogenetic tree of 34 Rad54 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 495 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.18 tree. Panel label: Rad59. Scale bar: 0.5 substitutions/site.
Tip labels: Mus 13385116; Homo 21717826; Monodelphis 126308554;
Gallus 45383087; Xenopus 451583; Danio 55925241;
Nematostella 156377005; Trichoplax – jgi 158188; Entamoeba 56465048;
Dictyostelium 66810566; Candida albicans 68492202;
Saccharomyces 151941940; Kluyveromyces 50308361]
Figure 2.18: Unrooted phylogenetic tree of 13 Rad59 homologs. Trees were estimated
with maximum likelihood and Bayesian inference (LG+G) from 102 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, and Excavata in brown.
GenBank GenInfo Identifiers are given for all sequences unless otherwise
noted (e.g. “jgi” refers to the Joint Genome Institute's public sequence
databases). The consensus topology of 2 Phylobayes chains is shown.
Numbers at the nodes indicate support from 1000 PhyML bootstrap
replicates followed by the posterior probability estimated using Phylobayes.
[Figure 2.19 tree. Scale bar: 0.2 substitutions/site.
Tip labels: Mus; Homo; Monodelphis; Gallus; Xenopus; Danio;
Strongylocentrotus; Nematostella; Trichoplax; Apis; Tribolium; Aedes;
Drosophila; Monosiga; Caenorhabditis; Neurospora; Magnaporthe;
Gibberella; Aspergillus; Schizosaccharomyces; Coprinus; Ustilago;
Candida; Saccharomyces; Kluyveromyces; Arabidopsis; Oryza;
Physcomitrella; Ostreococcus; Chlamydomonas; Phytophthora;
Phaeodactylum; Thalassiosira; Cyanidioschyzon; Cryptosporidium;
Plasmodium; Theileria; Paramecium; Tetrahymena; Giardia; Trichomonas;
Trypanosoma; Leishmania; Naegleria; Dictyostelium; Entamoeba]
Figure 2.19: Unrooted phylogenetic tree of 46 sets of 13 concatenated strand
exchange homologs. Trees were estimated with partitioned maximum
likelihood inference from 3084 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, and Excavata in brown. The best tree from 1000
replicates is shown.
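To illustrate what concatenating per-gene alignments for a partitioned analysis involves, the sketch below joins alignments taxon by taxon and records where each gene's columns fall. It is a hypothetical illustration, not the procedure used for this figure: the function name `concatenate` is invented, and padding a taxon that lacks a gene with gap characters is an assumption.

```python
from typing import Dict, List, Tuple


def concatenate(
    genes: Dict[str, Dict[str, str]]
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments into one supermatrix.

    `genes` maps gene name -> {taxon: aligned sequence}.
    Returns the concatenated sequences and a list of
    (gene, first column, last column) partitions, 1-based inclusive.
    """
    taxa = sorted({taxon for aln in genes.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in genes.items():
        widths = {len(seq) for seq in aln.values()}
        if len(widths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one width")
        width = widths.pop()
        for taxon in taxa:
            # Assumption: a taxon without this gene is padded with gaps.
            pieces[taxon].append(aln.get(taxon, "-" * width))
        partitions.append((gene, start, start + width - 1))
        start += width
    return {taxon: "".join(parts) for taxon, parts in pieces.items()}, partitions


if __name__ == "__main__":
    toy = {
        "Rad51": {"Homo": "MAM", "Mus": "MAL"},
        "Dmc1": {"Homo": "KV"},
    }
    matrix, parts = concatenate(toy)
    print(matrix)
    print(parts)
```

The partition list is what allows a separate substitution model to be fitted to each gene's block of columns.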
Table 2.1: DNA strand exchange component absences from eukaryotic groups.
Gene
Rad52
Rad59
Rad55
Dmc1
Hop2
Mnd1
Rad54
Rdh54
Eukaryotic group
Alveolata (Plasmodium, Theileria, Cryptosporidium, Tetrahymena, Paramecium)
Viridiplantae (Arabidopsis, Oryza, Physcomitrella, Chlamydomonas, Ostreococcus)
Endopterygota (Aedes, Drosophila, C. elegans, Apis, Tribolium)
Endopterygota (Aedes, Drosophila, C. elegans, Apis, Tribolium)
Most Fungi (except Saccharomycetales – S. cerevisiae, Kluyveromyces, Candida albicans)
Excavates (T. vaginalis, G. intestinalis)
Chromista (Thalassiosira, Phaeodactylum, Phytophthora)
Bacillariophyta (Thalassiosira, Phaeodactylum)
Diptera (Aedes, Drosophila, C. elegans, Apis)
Sordariomycetes (Neurospora, Gibberella, Magnaporthe)
Bacillariophyta (Thalassiosira, Phaeodactylum)
Sordariomycetes (Neurospora, Gibberella, Magnaporthe)
Sordariomycetes (Neurospora, Gibberella, Magnaporthe)
Ciliophora (Tetrahymena, Paramecium)
Excavates (T. vaginalis, G. intestinalis)
Most Chromalveolata (except Ciliophora – Tetrahymena, Paramecium)
Embryophyta (Arabidopsis, Oryza)
EUKARYOTES
Giardia
Trichomonas
Trypanosoma
Leishmania
Naegleria
Plasmodium
Theileria
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-A
Arabidopsis-B
Arabidopsis-C
Oryza-A
Oryza-B
Oryza-C
Physcomitrella-A
Physcomitrella-B
Chlamydomonas-B
Chlamydomonas-C
Ostreococcus-B
Ostreococcus-C
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Dictyostelium
Entamoeba
        10        20        30        40        50        60        70        80        90       100
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....
IADLvDFG--QHTIRGRVLSRKPLATtragkp----wFSF-MLDDDSLDVKVDCFdD-AEKFSAQIsAGDIIQLEHIKISSKTPAdRRFDVSRS-DYKVSIKsGTVVTL
IASLnQYLT-SWSLIVRIISKSQMRQfqgsrp--gklFSIIMRDKNNDEIKGTFFnQEAEKFENLVeQDKVYKVS--C-GRVKKAnERYNSTKS-EFEITFDsTSSIIE
IDSLsPFLGGKWWIRARVTDKSEIRTwnkpts-qgklFSFTLIDESA-SIRATVFnEAVDMFNPLIvNGQVYYFS--G-GQVKNAnRKFSNVNN-DYELSFDnTCQISA
IDSLtPFLGGKWWIRARVTDKTDVRTwnkpts-qgklFSFTLIDESA-AIRATVFnDAVDTFEPLIvNGQVYYFS--G-GQVKNAnRRFSNVNN-DYELTFDrSSEIML
ISNLnPYDK-VWVIKARVTQKSDMKHwdkgts-kgslFSIELLDEYGGQIRATFFnDVAKKYYDAIkERSVYFFS--G-GKLKDAnRKFTTIPH-PYEITFDrDTVIQN
INKLsQYST-KWIIKARVQSKDNIRKfytgnk-egkvFNIELCDESG-EIKVNVFgKAVDKWYDYLeVGKIYKIS--K-GNIKSAnKKFNTLKH-DCEITLDeNSILEL
ISDLtLYTP-KWQIRARVVFKSEIRKfnnqrg-esqlFSVDLCDSNG-EIRAVFFgESVNKWYSFLeEGQVYSIS--G-GQLKPAnKRYNNLKH-SCELILDeSSYIQL
IKSItSYLH-RWRIIGRVISKSDVRTfssskskegkvFSFEICDAEGSQIRATCFtKAVDKFYEFLkEGEIYSFS--K-GDVKEAnAKFNKTGH-GFEIIFNeDADIQS
IRNLqPNGQ-PQTIKVRITKKGDLKSfkek---qgklFSIDVIDKFGDECSISFFnEIAEQYDGLFkVGQVIVLK--Q-FSVKV-nNNHQYNKG-DHTVTVNkESKILI
ISELyPGMR-GFKIKGRITSKTDITQfkngkg---ylFTIEIIDSDKQTIQGVFFnKLCDKFYDFIdIGKVYYFE--N-ASVKTNrYSSKNQNQSDYQIHFEdFSKISI
ISGLnMYSN-RWVIRAKVTNKSDVKTwsnakg-egslFSVTLLDSSGYDVKCTFFkEAVDKFYNMLeEGRVYTFS--G-GRLKVAnMAYNNCKS-QFEITFDqNSEIHL
ISNLnMYAN-KWTIQARVTSKSDIRTwsnakg-egslFSVELLDQTQ-DVRATFFkEAVDKFYSFLqIGSVYTFS--G-GRLKVAnAQYNTCQS-NFEITFDqNSEIHL
IQSLnPYAGGRWTIKARVTTRSPIKNwtnarg-sgklFSVDLLDAKGGEIRATFFkEGVDKFYDTLrEGGVYYFS--G-GKIKMAnRRFSSVDN-DYEITFDtHSDISP
IAALnPYQG-RWAIKARVTAKGDIRRynnakg-dgkvFSFDLLDYDGGEIRVTCFnALVDRFYDVTeVGKVYLIS--K-GSLKPAqKNFNHLKN-EWEIFLEsTSTVEL
LVSLnPYQG-SWTIKVRVTNKGVMRTyknarg-egcvFNVELTDEEGTQIQATMFnAAARKFYDRFeMGKVYYIS--R-GSLKLAnKQFKTVQN-DYEMTLNeNSEVEE
IAALnPYQG-RWTIKVRVTSKADLRRfnnprg-egklFSFDLLDADGGEIRVTCFnDAVDQFFDKIvVGNVYLIS--R-GNLKPAqKNFNHLPN-DYEIHLDsASTIQP
ISALnPYQG-RWAIKARVTAKGDIRRyhnakg-dgkvFSFDLLDSDGGEIRVTCFnALLDRFYEVVeVGKVYVVS--R-GNLRPAqKNYNHLNN-EWEILLEnGSTVDL
LISLnPYQG-NWIIKVRVTSKGNLRTyknarg-egcvFNVELTDVDGTQIQATMFnEAAKKFYPMFeLGKVYYIS--K-GSLRVAnKQFKTVHN-DYEMTLNeNAVVEE
ITALnPYQP-KWTIKARVTAKSDIRHwsnars-sgtvFSFDLLDAQGGEIRAQCWkESADKFFGQIeVGRVYLIS--R-GSLKPAqKKYNTLNH-DYEITLDiLSTVEV
IAALnPYQG-RWTIKARVTSKGEIRRfhnakg-egkvFSFDMLDADGGEIRATCFnNVVDQFHDRIeVGKVYLIS--K-GSLKAAqKNFNHLKN-DWEIFLEsQSTIEP
ILSLnPYQG-NWTIKVRVTSKSPLRTfknarg-dgnvFNVELTDEDGTQIQATMFkEAADKFYDVLqLDKVYFIS--K-GSLRMAnKQYATVKN-DYEMTLNsNSEIVE
IAQLhPYET-NWCIRAKVDRKAPLRAlpskp--dvkvMTVDLVDETGTAIQGTFWrGPAERMSEQLvEGKVYVFH--K-FKVKPAdKKYVTVKN-EYQIDFTdTTDVSE
ISALnPYTA-RWAIRARVTSKGELRRwtnvrg-egkvFSFDLLDKDGGEIRATAFgAEADKFFEVVeAGAIYQIS--K-ASLTNKrPQFNHTNH-QYEIKLDrNSMVER
LAALnPYRT-PWTVKVKLTNKGNVREyksarg-pgkvCSVDFVDEEGTAIGATLWrEAIEKYDSVLeVGKVYYVS--K-GSLKPAdKRYSTSGN-DYEMNLDgKCEIDV
IHALnPYQN-RWTIRARITTPLELRSysnakg-egkvLGFQVLDADGTEIKCVCFnDTAVRLAGELrQGLVYEIS--KGAIVTPRdPRYAIY---QYEIKLDnHATFVP
ISAAnPYQN-NVIIRGRVVQKGELRTysnakg-egklFSFEIADETG-NMRVTAFrEKALEAHQRIeLNGIYSIA--G-ASLKPAnAQFNHTGH-SFEMILDqNSVITQ
IASLtPYQS-KWTICARVTNKSQIRTwsnsrg-egklFSLELVDESG-EIRATAFnEQVDKFFPLIeVNKVYYFS--K-GTLKIAnKQFTAVKN-DYEMTFNnETSVMP
IASLtPYQS-KWTICARVTNKSQIRTwsnsrg-egklFSLELVDESG-EIRATAFnEQVDKFFPLIeVNKVYYFS--K-GALKIAnKQFSAVKN-DYEMTFNnETSVLP
IASLnPYQS-KWTICARVTNKSQIRTwsnsrg-egklFSIEMVDESG-EIRATAFnDQVDKFFPLIdVNKVYYFS--K-GTLKIAnKQFTAVKN-DYEMTFNnETSVVL
IASLnPYQS-KWTICARVTQKGQIRTwsnsrg-egklFSIELVDESG-EIRATAFnDQADKFFPLIeLNKVYYFT--K-GNLKTAnKQYTAVKN-DYEITFNnETSVVP
IASLnPYQS-KWTVRARVTNKGQIRTwsnsrg-egklFSIEMVDESG-EIRATAFnEQADKFFSIIeVNKVYYFS--K-GTLKIAnKQYTSVKN-DYEMTFNsETSVIP
IASLnPYQS-KWTIRARVTNKSAIRTwsnsrg-dgklFSMELVDESG-EIRATGFnNEVDKFFSLIeQGKVFYIS--K-GTLKIAnKQFSSLKN-DYEMTLNgETSIIP
-------------------------------------------------------------------------------------------------------------
INSLsPYQN-KWVIRARVMSKSGIRTwsnakg-egklFSMDVMDESG-EIRVTAFkDQCDKYYDMIeVDKVYYIT--K-CQLKPAnKQYSTLKN-DYEMTMTnDTIVQE
ISSLsPYQN-KWVIKARVTSKTAIRTwsnarg-egklFSMDLMDESG-EIRATAFkEQCDKFYDLIeVDNVYFFS--K-CQLKPAnKQYSQLKN-DYEMTFTnETMVQP
IAMVtPYVS-NFKIHGMVSRKEEIRTfpaknt---kvFNFEITDSNGDTIRCTAFnEVAESLYTTItENLSYYLS--G-GSVKQAnKKFNNTGH-DYEITLRsDSIIEA
IVALsPYQN-RWVIKARVVSKSNIRTwsnsrg-egklFSMDLIDESG-EIRCTAFrNECDKFYDMLeIGKVYYIS--R-ATLKPAnKQFNNLKN-DYEMTLIgDSEIIP
INALtPYHN-KWVIKARVTNKSDMRTwsnsrg-egklFSFDLMDDSG-EIRCTAFrDMADKYFNYLqVDKVYYIS--K-CQLKAAnKQFNTLKN-EYEMTIGnETIIEE
ISGLtPYQN-RWTIRARITSKSNIRTwnnsrg-egrlFNVEMVDESG-EIRATGFnEAVDKFYQMLeVDKVFYIT--K-GSLRTAnKQYSSIRN-DYEMYLNnDTIIEP
ISSLtPYQN-RWTIRTRVTSKSEIRKwsnsrg-egklFSVDLIDESG-EIRATAFrDQVEKFYDVLeVNKVYYIS--R-CSIKTAnKNFTSIKN-DYEMTFTnETAVEP
LTSLnPYDR-RWAIRVRVVAKPPIRTynsdrg-egkiFSVDLVDASG-EIRATGFnADCDRLYPLFeKNKVYMIQ--G-GRIKPKnRRFNQLSH-EYEITFDsTTTVTE
IEQLsPYQN-VWTIKARVSYKGEIKTwhnqrg-dgklFNVNFLDTSG-EIRATAFnDFATKFNEILqEGKVYYVS--K-AKLQPAkPQFTNLTH-PYELNLDrDTVIEE
IEQIsPYQN-NWTIKARVSFKGDLKKwqnnrg-eghiLNVNLLDSSG-EIRATAFnDNAIKFNEILqEGKAYFVS--K-ARVQPAkPQFSNLKH-PYELSLErDCVVEE
IETIsPYQN-NWTIKARVSYKGDLRTwsnskg-egkvFGFNLLDESD-EIKASAFnETAERAHKLLeEGKVYYIS--K-ARVAAArKKFNTLSH-PYELTFDkDTEITE
IEGLsPFSH-KWTIKARVTSKSDIKTwhkasg-egklFSVNFLDESG-EIRATGFnDQVDQFYDLLqEGQVYYIS--TPCRVQLAkKQWSNLPN-DYELTFErDTVIEK
IEGLsPFAH-KWTIKARVTAKSDIKTwhkatg-egklFSVNLLDESG-EIKATGFnDQCDALYDQLqEGSVYYIS--TPCRVQLAkKQFSNLPN-DYELTFErDTVVEK
IESIsPYQH-KWTIKARVSQKSDIRTwhkasg-egklFSVNLLDETG-EIKATGFnDQCDKFYDILqEGQVYYIS--TPCRVQMAkKQFTNLPN-DYELTFEdGTQIEK
IEAIsPYSH-KWTIKARCTSKTNIRTwhnrnt-egrlFSVNLLDDSG-EIRATGFnDQCDMLYDVFqEGGVYYIS--N-CRVQIAkKQFTNLNN-DYELTFErDTVVEK
IEGLsPYQN-KWTIRARVTNKSEIKHwhnqrg-egklFSVNLLDESG-EIRATGFnEQVDAFYDILqEGQVYFIS--K-CRVNIAkKQFSNVQN-EYELMFErDTEIKK
IEGLsPYQN-NWTIKARVTQKSEMKQwsnaqg-egklFNVTFMDDSG-EIRATAFnLVADDLYPKLeEGKVYYVS--K-ARVGLAkKKFSNIPN-DYELSLErNTEIEE
IEGLsPYQN-RWTIKARVTSKSDIRHwsnqrg-egklFSVNLLDDSG-EIKATGFnDAVDRFYPLLqENHVYLIS--K-ARVNIAkKQFSNLQN-EYEITFEnSTEIEE
INMLnPFHN-KWAIKGRVVMKSDIRRftnqkg-egkvFNFEVSDGTA-QVKIICFsDCVDIFFPIVeVGKVYTIA--K-GTVKMAnKQYSTNPF-DYEIILDkSSEVHR
IESIaPGMNVQFTIRAMVRNKQPLKSwnkgangegklFSMELVDSTG-EIKCACFsDSINALYDCFeNGKVYFIQ--R-FFVKSAnKLYNTLSH-QSELSINsESRVMI
FSLLtTFGS-KLIIKGRVVSKNDKFKya-----kgnlFSFVLQDKDGAEIKATCFnDVCDEKFDQIkVGETYYIT--KADYKQSNgKGYRSAKMIDLDMIIGkYTIIQK
Figure 2.20: Multiple sequence alignment of the RPA1 ssDNA binding domain (DBD-A) from 54 diverse eukaryotes. Genus names
for Excavata are highlighted in brown, Chromalveolata in orange, Archaeplastida in green, Opisthokonta in
purple, and Amoebozoa in blue. Shaded columns indicate positions at which 75% of the amino acids are identical.
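The column shading described in the caption can be approximated from the alignment rows alone. The sketch below is a hypothetical illustration rather than the tool used to draw the figure: the function name `conserved_columns` is invented, and the choices to compare residues case-insensitively, to count gapped rows in the denominator, and to read the threshold as "at least 75%" are assumptions.

```python
from collections import Counter
from typing import List


def conserved_columns(rows: List[str], threshold: float = 0.75, gap: str = "-") -> List[bool]:
    """Flag alignment columns whose most common residue reaches `threshold`.

    The fraction is taken over all rows, so gapped rows count against
    a column (an assumption; other conventions ignore gaps).
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")
    flags: List[bool] = []
    for col in range(width):
        # Upper-case so that 'a' and 'A' count as the same residue.
        residues = [row[col].upper() for row in rows if row[col] != gap]
        if not residues:
            flags.append(False)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        flags.append(top_count / len(rows) >= threshold)
    return flags


if __name__ == "__main__":
    toy = ["ACD-", "ACE-", "AcF-", "GCDK"]
    print(conserved_columns(toy))
```

In the toy alignment the first two columns reach the threshold (three of four and four of four rows agree), while the last two do not.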
EUKARYOTES
Giardia
Trichomonas
Trypanosoma
Leishmania
Naegleria
Plasmodium
Theileria
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-A
Arabidopsis-B
Arabidopsis-C
Oryza-A
Oryza-B
Oryza-C
Physcomitrella-A
Physcomitrella-B
Chlamydomonas-B
Chlamydomonas-C
Ostreococcus-B
Ostreococcus-C
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Dictyostelium
Entamoeba
        10        20        30        40        50        60        70        80        90       100       110       120       130
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|.
HTHICTILAGISDATLVYSspesarQWFRLDLQVVDKTg----SIKVSLWTE----QLEPFLSAYNMTKDdAPARLTGKIVVLSRVqFRCSER-ygIQCSCVgISKVFLVDSTpskqddpSNLKEFRLWWA
TVDIISIITFIGDCQTI--ktksgsSIEKRNITVSDETg----TIEVTLWGS----SATEF----D--QKeSE--------IFCVRnVTVSDF-rgVSLNVGqSATIVV---Np-p---dNDVKNIRNWYE
LVDVLAVVLNVEELGTI-VqrstgrELVRRTVKVADSTa----GIDVTLWNE----NAKEWP------HQpGT--------VLAMRqLKVGSF-dgVTLSTTmQSSFDV---Np-n---iPDVKKLREWFE
LVDVLGVVLKVDEVSSI-TqkstgrELVKRNVKMGDMTa----AVEVTFWND----EAKAWC------YPvGT--------VVALRqLKVGSF-hgVPFSSTyQTKIDI---Nptd---lPDVKKLATWYV
ILDVAGVVQNIGETKEF--ttknnrKTKRCNISLIDDSsspfcTVDLTIWGD----MCDTH----D--MQqGD--------VVILKsVRKSNY-ggVSLNTInSTRIFK---Dp-g---iPIYQQLSEWYQ
LVDVIGVVLSFQELNQI-LikktgqYKEKKDLMLIDETn---eTINVTLWGE----NAVKMEEMN---ITeNC--------IICFKcLKVGEW-qgKKLESHpKTKVEI---Np-e---lDKAYTLKNWWI
TIDVIAIVVTARDLQKI-NnkatgnNVEKRDFLLCDSTn---tTVWVTSWGQ----KTQLFNYEGD---NsHP--------LVCLKgVKVGEW-qgKKLDVQiSTQVIC---Ep-v---iPEALKLRKWWN
SIDILGILWKASPIMTI-TikstgaDTQKRELTILDRSg---ySIDLTLWSERTNLDEGML--------AqNP--------MIAVKnAIIEEF-ngFKLKFGpNTSIEW---Npin---iEQADELRQWFQ
LIDLIVVVKADTEVKTM-IlkkdnqQQSKRDIISFDESl---iETEITLWGE----TAKDY----D--AKqGD--------IIVFKdAKIGEFkdkKQINIGyGTQIFM---Np-deqlfPQIHDVKKWYL
KCDVLGVIIDIKPTTQI-Mtk-sneNRSKKNITLYDQTq---rGIDIVLWGQ----QAEKWQ------FQkDE--------IVAFRgLKISDYqmvRNLTVTnSTIYEK---Nlsn---lKKINGFQEFYE
YVDILAVVKHVGDVSTI-VskksgkEMTKVDLVVEDDSg---aDVKLTLWGNSAQNAENQF--------AnCP--------VVAFKkSRLGDY-ggRSLS---GGSPTV---Np-q---iPQTNQLMQWWG
MVDLLAVVQAVGEVATI-VskksgqELTKCDLTLIDTSa---tQITLTLWGDKAASALTDY--------NqQP--------VVAIRrARVSDY-ggRSLSL--SGSIET---Np-d---iPQTAPLQTWWR
NVDVIGVVRDVGQVNEI--mskagkQLFKRDILLVDDSn---aEIKCTMWNERAQEDCSGW---------qNQ--------VLAIKgCRVSEY-ngRSISTVsSSNFTV---Np-a---iPEAGHLVTWFS
ILDVIGVVTSVNPSVPI--lrkngmETHRRILNLKDESg---kAVEVTLWGEFCNRDGRQLEEMVD--SAfHP--------VLAIKaGKVSDF-sgKSVGTIsSTQLFI---Np-d---fPEAHKLRTWFD
LIDVIGVVQSVSPTMSI-RrkndneMIPKRDITLADETk---kTVVVSLWNDLATGIGQELLDM----ADnHP--------VIAIKsLKVGAF-qgVSLSTIsRSNVVI---Np-n---sPEATKLKSWYD
TTDVIGIVSSISPTVAI--mrknltEVQKRSLQLKDMSg---rSVEVTMWGNFCNAEGQKLQNLCD--SGvFP--------VLALKaGRIGEF-ngKQVSTIgASQFFI---Ep-d---fPEARELRQWYE
ILDIIGVVTSVNPCTTI--qrkngmETQKRTMNLKDMSg---rSVEVTMWGDFCNREGSQLQGMVE--RGiFP--------VLAVKaGKVSDF-sgKSVGTIsSTQFFI---Np-d---sAEAHSLRQWFD
LVDVIGVVQSVSPTLSV-RrkidneTIPKRDIVVADDSs---kTVTISLWNDLATTTGQELLDM----VDsAP--------IIAIKsLKVSDF-qgLSLSTVgRSTIVV---Np-d---lPEAEQLRAWYD
IVDLLGVVTSVSPSATI--mrkigtETRKRSIQLKDLSg---rSIEVTLWGNFCDAEGQQLQLQCD--SGsNP--------IIAFKgARVGDF-ngKSVSTIgSTQLII---Np-d---fPEVERLRQWYM
MIDIIGVVMSITPTVTI--trknglETQKRSLQLKDMSn---rSVELTMWGNFCNKEGQELQDLCD--SGaNP--------VLAVKaGRVSDF-sgKSVGTIsSTQLVI---Np-d---hPEARKVRDWFN
VADVLGVVQSVGPLTTV-NrksnndEIPKRDIVLLDQSr---qTVVLTLWNNMAVKEGASLADL----IAeSP--------ILMAKgLRLSDF-qgVSLSSTmNTMVLI---Np-v---iPDANELRTWYE
PVDVMGVVLALGSYGTV-KrkadnsELPRREVTIGDQSg---kSVAITLWGDMSSTTAQQLEG-----MEgRA--------VLQVTgCRVTDY-ngCSLSTLsKSVASI---Np-e---tPAAQQMMLWYK
VVDIIGVVETCEPWQTI--trrtgeETQKRSMVVRDDSg---rSIEVTLWGALVNNPGDQIEQMVR--GGgRP--------VLAAKaLRVGDY-ngKTLSTVgASALRL---Dpmd---lPAAQRVRGWYN
NVDVVAVVKEVSELSSI-RrksdntELNKREVVLVDDSa---kTVRLTLWNA----LAVEVGEQLA--SMtNP--------VVAIRsVRVGDY-egVSIGTVsRSDIVI---Dped---vPRAVEIKKWWS
MVDVIGIAYSVGDLTTI--mkrdgsETSKRSVMIRDDSd---tSIEFTLWDPHSVEIGGQIESLIA--SGeKP--------VIAVKsSRLGEF-qgKNMGTVsSTMVEI---Np-d---sSEATRMRVWFD
VVDVIGIALDIGEVGEI-SskttglPVAKREVKLIDDTg---cSVALTIWGE----RARSLFSN----EDdRP--------VLLVKsAKRGDF-ngVSLSTTpSSHVEV---Np-n---iREAFELRGWFD
LVDIIGICKSYEDATKI-TvrsnnrEVAKRNIYLMDTSg---kVVTATLWGE----DADKF----D--GSrQP--------VLAIKgARVSDF-ggRSLSVLsSSTIIA---Np-d---iPEAYKLRGWFD
LVDIIGICKSYEDSIKI-TvksnnrEVAKRNIYLMDMSg---kVVTTTLWGE----DADKF----D--GSrQP--------VMAIKgARVSDF-ggRSLSVLsSSTVIV---Np-d---iPEAYKLRGWFD
LVDIIGVCKSYEDASKV-VvkssnrEVSKRNVHLMDTSg---kVVTTTLWGE----DADRF----D--GSrQP--------VLAIKgARVSDF-ggRSLSVLsSSTILV---Np-d---iPEAFKLRGWFD
IVDVIGICKSYEDVTKI-VvkasnrEVSKRNVHLMDTSg---kLVTATLWGN----EAEKF----D--GSrQP--------VIAIKgARVSDF-ggRSLSVLsSSTVVV---Np-d---sPEAFKLRGWFD
VLDIIGVCKNVEEVTKV-TiksnnrEVSKRSIHLMDSSg---kVVSTTLWGE----DADKF----D--GSrQP--------VVAIKgARLSDF-ggRSLSVLsSSTVMI---Np-d---iPEAFKLRAWFD
ILDVIGVCKNAEDVARI--mtknsrEVSKRNIQLIDMSg---rVIQLTMWGS----DAETF----D--GSgQP--------ILAIKgARLSDF-ggRSLSTLySSTVMI---Np-d---iPEAYKLRGWYD
MPNVIGVCKSTSDLTAV-TikssnrEVNKRSLQLVDDSq---kEVSLTLWGK----EAEDF----D--GSgNP--------VIAVKgARLSGF-ggRSLSVLqNSIFQV---Np-d---iPKAHHLKGWFD
MIDVIGVCKEAGEVMQF-TarssgrELKKREVTLVDSSn---aAVSLTLWGD----DAQNF----N--ATnNP--------VLVIKgARVTEFgggKSLGLVaSSVLKT---Np-d---nEEAHKIRGWYL
AVDTIGICKEVGELQAF-TsrttnkEFKKRELTLVDMSn---aAVTLTLWGD----EAVNF----D--GHvQP--------VILVKgSRINEFnggKSLSMGgGSILKI---Np-d---iPEAHKLRGWFD
LIDVLVVVEKMDPEATE-FtskagkSLIKREMELIDESg---aLVRLTLWGD----EATKA--LVD--DYvQK--------VIAFKgVIPREFnggFSLGTGsATRIIS---Vp-e---iAGVSELYDWYA
IMNILGIVKYSGDLQIL-TsrnsgrELRKRDVSLVDESn---tTVTLTLWGS----QAEEF----D--GSsNP--------VLAVKgARITEFnggKNLSTLsSTVLQI---Dp-d---lPAAHRLRGWFN
LVDVIGICKEASEVQTF-TskstnrELRKREITLVDQSk---tSIALTLWGS----QADSF----D--ATnNP--------VVVIKgAKVGEFgggKNLSTLmSSQIKL---Np-d---iPECHRIKGWYD
IVDILGVVTNVGDLAQI-TtkttnkQVSKRDITLLDRSe---kSVTATLWGD----EAEKF----EEHAGkNP--------VLAIRgAKVSDF-ggRSLSVLnSSNMRV---Npvd---mKEAQVLRGWYD
MIDVVGVVKSADDVVTI--ntksnrQVNKRDIELVDDSg---kVVRLTLWGT----NAEEF----D--GSqFP--------VVAVRgARVTEF-ggRSLSVVgSSQLMT---Np-d---iPEAHILRGWFD
FADILAVIKEVADVTTI-VtraaqkELSKREVTLVDKDn---vSLSCTLWGK----EAEGF---VDAGGHpGV--------VMAIKaARISDF-ngRSLSVAsNSNYSI---Np-d---lKEAHELKGWCV
NVDVLGIIQTINPHFEL--tsragkKFDRRDITIVDDSg---fSISVGLWNQ----QALDF----N--LPeGS--------VAAIKgVRVTDF-ggKSLSMGfSSTLIP---Np-e---iPEAYALKGWYD
AIDVVGILKSVGPHFEL--aaksgkKFDRRDVEIVDDSg---aCISLGLWGE----QAIKF----N--LPeGS--------VVALKgVRVTDF-ngKSLSMGnTSSLFA---Np-d---iQEAYTLKGWYD
IIDVLGALKTVFPPFQI-TakstgkVFDRRNILVVDETg---fGIELGLWNN----TATDF----N--IEeGT--------VVAVKgCKVSDY-dgRTLSLTqAGSIIP---Np-g---tPESFKLKGWYD
TVDIIGVLKEVQEVTQI-VskttqkPYDKRELTLVDNTg---ySVRCTIWGK----TATNF----D--AQpES--------IVAFKgTKVSDF-ggRSLSLLsSGTMAI---Dp-d---iPEAHHLKGWYD
TVDVIGVLKEVGEIGDI-TskkdgrPFQKRELTLVDDTg---fSVRVTIWGK----NANSF----D--AApES--------VVAFKgTKVSDF-ggKSLSLLsSGTMTV---Dp-d---iPDAHRLKGWYD
TVDVIGVLKDVADVTQI-TskasgkFFDKRELTLVDDSg---ySVRMTIWGK----TAQNF----D--AKsES--------VVAFKgAKVGDF-ggRSLSLLsSGTMTV---Dp-d---iPEAHRLKGWFD
TIDVIGVLKEAMDVTQI-TskttnkPYDKRELIMVDNTg---fSVRLTIWGS----TAQKF----N--ASpES--------VIAFKgVKVSDF-ggRSLSLLsSGSMAV---Dp-d---iEEAHKLKGWYD
IIDVIGVLQNIGPVQQI-TsratsrGFDKRDITIVDQSg---fEMRMTVWGK----QAIDF----S--VPeES--------IIAFKgVKVNDF-qgRSLSMLnSSTMTT---Dp-d---iPEAHTLKGWYD
ICDVIGVVKDVGEVGTI-TsrsnnrQISKRDLTLVDKSa---ySVRMTLWGK----QAEQF----K--VEpES--------IIAFKgVRVGDF-ngRNLSMTsASTMQV---Np-d---iEECFTLRGWYD
TCDVIGILDSYGELSEI-VskasqrPVQKRELTLVDQGn---rSVKLTLWGK----TAETFPTNAG--VDeKP--------VLAFKgVKVGDF-ggRSLSMFsSSTMLI---Np-d---iTESHVLRGWYD
YCDTIGVVKEVYAPSTV-MvrstqsELLKRDAVLVDDGg----SVRLTLWGP----KAELE-------IEsGM--------VLALKsIKVSEF-ngISISTTgGSQVVT---Np-d---iAEAHELEGWYQ
TVDVIGAITNIDPIANL-Tsk-qgkEFTKFGITIADDTn---aSINVVFWNE----KATEV----APQVKvGD--------IIAMKgVKVSDF-sgRTLSYSfGSSFGL---Ndeq---lQETSNLRAHLQ
TYDICAFLVDKGPEQTY------knEKAKVTLTFMDQSs---yAVEVDFWNE----DIDKTKD-----MEnGV--------VYVLTsLKLKEF-kyKTLTVTkATKILS---Nt-dieqyDEASLVNKFIQ
Figure 2.21: Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-B) from 54 diverse eukaryotes. Genus names
for Excavata are highlighted in brown, Chromalveolata in orange, Archaeplastida in green, Opisthokonta in
purple, and Amoebozoa in blue. Shaded columns indicate positions where amino acids are 75% identical.
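The shading criterion in the caption can be illustrated with a short Python sketch. It is not the procedure used to produce the figures. It assumes that "75% identical" means the most common residue in a column is shared by at least 75% of all aligned sequences, that gap characters count toward the denominator, and that lowercase residues are compared case-insensitively. The function name `conserved_columns` and the toy rows are hypothetical.

```python
from collections import Counter


def conserved_columns(rows, threshold=0.75, gap_chars="-."):
    """Return 0-based indices of columns whose most common residue is
    shared by at least `threshold` of all aligned sequences.

    Assumptions (not taken from the thesis): gaps count toward the
    denominator, and letter case is ignored.
    """
    if not rows:
        return []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned rows must have the same length")

    shaded = []
    for col in range(width):
        # Collect the non-gap residues in this column, ignoring case.
        residues = [row[col].upper() for row in rows if row[col] not in gap_chars]
        if not residues:
            continue  # a column made only of gaps is never shaded
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(rows) >= threshold:
            shaded.append(col)
    return shaded


# Toy alignment (hypothetical data, four sequences of five columns).
toy_rows = ["MKVLA", "MKILA", "MRVLA", "M-VLG"]
print(conserved_columns(toy_rows))  # [0, 2, 3, 4]
```

Dividing by the number of non-gap residues instead of the number of rows would shade more columns in gap-rich regions; which convention the original figures follow is not stated in the captions.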
Figure 2.22: Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-C) from 54 diverse eukaryotes. Genus names
for Excavata are highlighted in brown, Chromalveolata in orange, Archaeplastida in green, Opisthokonta in
purple, and Amoebozoa in blue. Shaded columns indicate positions where amino acids are 75% identical.
EUKARYOTES
Giardia
Trichomonas
Trypanosoma
Leishmania
Naegleria
Plasmodium
Theileria
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-A
Arabidopsis-B
Arabidopsis-C
Oryza-A
Oryza-B
Oryza-C
Physcomitrella-A
Physcomitrella-B
Chlamydomonas-B
Chlamydomonas-C
Ostreococcus-B
Ostreococcus-C
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Dictyostelium
Entamoeba
        10        20        30        40        50        60        70        80        90       100       110       120       130
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|
NTLg---el--agsdpLRVLASLILDR--Nle-----TATYRGCa-----s-------------CKS--ALRD--------------------s------PICPK-CqetsagekYYWRIGGHISDAlAH
KQMc---rt--engevFCVNVMIQDMP--Ssr----KPV-YQACp-----n-----------eaCRGSGLIID--------------------q-etgk-MICKK-CnkevtnpkLRYSLSLNVGDYsGS
EGLg---kg--pkpdyIDLRCVPVYLK--Qdt-----QW-YDACp-----q-------------CNK--KVML--------------------egamgdrFRCEK-Cdqsi-vptQRYLVSIQVTDNvSQ
DGIg---rg--lkpeyVDVRCVPIYFK--Qda-----QW-YDACp-----t-------------CNK--KVTE--------------------egaqgdrFRCEK-Cdktv-tptQRYLVSIQVTDNvSQ
MSVd---tv--tapdyLTIRAYVSYIK--He------LW-YDACt-----n-----------keCNK--KVQQ----------------------negi-YHCSS-CnhssdtctRKFLANLGITDWtGK
VNLa-neevlsgkgiiFTTFGFIDHIY--Nai-----PV-YSACp-----n-------------CNK--KMVAtviedg----------eedmdqnvsesMYCAK-Cnkn-nipvYNYSINLKITDNtDS
TNQglqfksidsngmvFTTRGLIEVLK--Dtn-----FC-FPSCt-----g-------------CRK--KMSN----------------------dqgc-WYCSK-Cnsst-npiHLYILNIKIVDEsSH
ATSgvnssdildggiwVFTNATIRTIR--Dnk-----YF-WSSCr-----q-------------CKR--KVTEiedpnsvsalilpfssengnkvntgpnYHCPN-CqqtiedplKKYILSCELIDStGT
KNL----qtdpemkiwKEIRGQIMYIK--Dtp-----LY-YNACf-----s-------------CKK--KIAR----------------------nnev-WTCIN-CnkdfnepdSRYILSLNISDStDT
DFEg---irnikfvkfYEIKAYITNIF--Tkl-----LY-YEGCe-----n-------------CKR--KVVY--------------------iqqtkl-YHCQS-CnqnfdqpsYKYMFNAKIADTtGN
EHLg---ns--dkpdwLSFKATITFLK--RekQGDDGAW-YTACa-----n---------sgepCRNMFKATQ--------------------t-sdgn-YHCDK-CqqthpncvRRFIFSGTVADDtST
NNLg--yag--dkpdwLTFKATVSFLKKDKeg----GSW-YPACa-----n---------agepCKNRYKVTQ--------------------t-tdgn-WYCDK-CqgsfptcvRRWIFSGVVEDDtSS
KQLg---fg--qkpdyITVKGTVNFIK--Hds----GVF-YQACp-----k-------------CQK--KVVA--------------------d-vaqn-FTCEK-CqtsypnceNRYILSVVLLDHtGS
EGLg---rs--dkpdwITVKATISFIK--Tds-----FC-YTACplmig-d-----------kqCNK--KVTR--------------------s-gtnr-WLCDR-CnqesdecdYRYLLQVQIQDHtGL
PSLg---e---ekpvfFSTRAYISFIK--Pdq----TMW-YRACk-----t-------------CNK--KVTE--------------------a-mdsg-YWCES-CqkkdqecsLRYIMAVKVSDStGE
EKLg---ts--ekpdwITVCATISFMK--Ven-----FC-YTACpimng-d-----------rpCSK--KVTN--------------------n-gdgt-WRCEK-CdkcvdecdYRYILQIQLQDHtDL
EGLg---mg--dkpdwITVKATVIFFK--Nes-----FF-YTACpnmig-d-----------rqCNK--KVTK--------------------s-tngn-WTCDK-CdrefeecdYRYLLQFQIQDHsGT
PNLg---q---dkpvfFSLNAYISLIK--Pdq----TMW-YRACk-----t-------------CNK--KVTE----------------------amgsgYWCEG-CqkndaecsLRYIMVIKVSDPtGE
ENLg---rl--ekpdwITVKAAISHVT--Tes-----FC-YPACpkllpvg-----------rqCNK--KAIN--------------------n-gdgm-WHCDR-CdesfqnpeYRYMLRFQIQDHtGS
EGLg---rg--dkpdwITIRATVFYIK--Pen-----FC-YSACplevn-g-----------kqCMK--KVTN--------------------n-gdgt-WRCDR-CdrsvpecdYRYLLSIQVQDHtGP
PNVg---eg---kpmyFNVRAYISFIK--Pdq----AMW-YLACq-----t-------------CNR--KVVE--------------------q-ssss-YWCEG-CqnhydkcsRRYIMQAKLSDSsGE
ETEa---lan-dkaifQNVTACVAMINNDDkn-----IF-YLANp-----e-------------NGR--KVVD--------------------q-gggr-FWSEA-DskvvekpeHRYLLSVRLADHtGE
ENLg---rs--gkadwVNVSAVLDMIKGGAsa-----VV-YPSCphdfn-g-----------rpCQK--KMMD--------------------v-gggn-WNCDR-CqfstenpaWRYLVSLSACDHtAK
EIAp---vt--dkptfAWVCAHTVMCK--Pdq-----TMYYTATp-----e-----------egNNK--KVIE----------------------sdgk-WYCEA-NgqtydtceRRYIMRFKAQDSsEG
ELVa---kn--egvayLSCCGIIKHIKLGAeg-----NF-YPACpllng-e-----------rtCQK--KLRK--------------------ddstge-WKCERHAgekieaadWRYMFSMVCMDHsDE
EHIgedphsa-pgasyYTVRATISHIK--Qde--ERPPW-YLSCp-----d-------------CKK--KVIE--------------------e-spdm-YRCER-Cdklv-kptPRYIFSIQAMDAtGS
ENLg---qg--dkpdyFSSVATVVYLR--Ken-----CM-YQACp-----t-----------qdCNK--KVID--------------------q-qngl-YRCEK-CdtefpnfkYRMILSVNIADFqEN
ENLg---qg--dkadyFSTVAAVVFLR--Ken-----CM-YQACp-----t-----------qdCNK--KVID--------------------q-qngl-YRCEK-CdrefpnfkYRMILSANIADFqEN
ENLg---qg--dkadyFSCVGTVVYLR--Ken-----CM-YQACp-----s-----------qdCNK--KVID--------------------q-qngl-YRCEK-CdrefpsfkYRMILSVNIADFqEN
ERLg---qg--dkadyFSCVGTIVHLR--Ken-----CM-YQACp-----s-----------qdCNK--KVID--------------------q-qngl-YRCEK-CdrefpnfkYRMMLLVTIADSlDY
ENLg---hg--ekadyFTSVATIVYLR--Ken-----CL-YQACp-----s-----------qdCNK--KVID--------------------q-qngl-FRCEK-CnkefpnfkYRLILSANIADFgEN
EHLg---hg--dkadyFSCIATIVYIR--Ken-----CL-YQACp-----s-----------kdCNK--KVVD--------------------q-qngm-FRCEK-CdkefpdfkYRLMLSANIADFgDN
QNLg---qg--ekpdyFTVKGTILFVR--Ken-----CM-YMACp-----s-----------aeCNK--KVSE--------------------n-gdgs-YRCEK-CskdyenfkYRLLLSANVADStDN
KNLg---ag--dkpdyFQVKALIHNIK--San-----AV-YKACp-----q-----------aeCNK--KVID--------------------q-dngq-YRCEK-CnadfpnfkYRLLVNMLVGDWtSN
RNLg---sg--dkpdyFQCKAVVHIVK--Qen-----AF-YKACp-----q-----------adCNK--KVVD--------------------e-gngq-YRCER-CnaafpnfkYRLLINMSIGDWtSN
MQFg---kds-dkgdyATVKAMITRVN--Ptn-----AL-YRGCa-----s-----------egCQK--KLVG----------------------engd-YRCEK-CnknmnkfkWLYMMQFELSDEtGQ
MELg---y---knsdiYTVKATLNMIR--Men-----AI-YKACp-----s-----------enCKK--KLVD--------------------q-andm-YRCEK-CdkeypnyrYRLLANISLADWtDN
KGLg---hs--ekgdyFQVKATILLVR--Sen-----AL-YKACp-----t-----------ddCNK--KVVD--------------------l-engm-YRCEK-CcrefpnfkYRLLVSMNIGDFsGN
EQLg---mg--ekadyISVKGVCVYFR--Ren-----CM-YKACp-----s-----------eeCNK--KVIE--------------------e-dsgf-Y-CEK-CgrkypnykYRLILSAHLADFtGS
ENLg---qq--ekadyFNLKATIIYIR--Ken-----LM-YKACp-----k-----------edCNK--KVID----------------------qggs-YRCEK-CnqtfpdfkYRLMISASIVDStGS
ATVg---lpd-dksvaFQVTGTILYVK--Sdn-----IY-YQACp-----t-------------CNK--KVVE--------------------e-sdgs-YECQK-CaksykefkYRLLTSFSIGDFsGS
ENLg---rs--ekgdfFSVKAAISFLK--Vdn-----FA-YPACs-----n-----------enCNK--KVLE--------------------q-pdgt-WRCEK-CdtnnarpnWRYILTISIIDEtNQ
SNLg---rs--ekgdyFSVKAAVSFLK--Vdn-----FA-YPACl-----n-----------egCQK--KVIM--------------------q-sdgt-WRCEK-CdmnhphpkYRYMLTISIMDQtGQ
EHSg---st--ekpdyFSIKASVTFCK--Pen-----FA-YPACp-----nlvqnadatrpaqvCNK--KLVF--------------------qdndgt-WRCER-CaktyeeptWRYVLSCSVTDStGH
ENLg---tn--eapdyFALKATVVFIK--Qdn-----FA-YPGCr-----s-----------egCNR--KVTD--------------------m-gdgt-WRCEK-CqinhdrpqYRYIMSVNVNDHtGQ
ENLg---md--dqa-yYTIKATIVFVK--Qen-----FC-YAACl-----s-----------qgCNK--KVTQ--------------------m-pdgt-WQCEK-CnlshekpeYRYVLSLNVADHtSH
DNLg---vd--dvv-yFALKATVVYIR--Qen-----FA-YPSCl-----n-----------egCSK--KVTD--------------------l-gdgs-WRCEK-CdvnhprpeYRYIMSVNVNDHtGQ
EQLg---ms--eeavyFSLKATVIYIK--Qdn---MSFA-YPACl-----s-----------egCNK--KVTE--------------------l-dpgq-WRCER-CdkthpqpdYRYIMHVNVSDHtGQ
QHLg---ms--etpdyFSLKATVVYIR--Kkn-----IS-YPACp-----t-----------pdCNK--KVFD----------------------qggs-WHCEK-CnkdyeaphYRYIMTIAAGDHtGQ
AGFg---qs--dkpdyFSTRATIIHIK--Ddn-----IA-YPACp-----t-----------qgCNK--KVIE--------------------e-adg--WRCEK-CekvfeapeYRYIMSMMVADHtGK
ENLg---ms--ekpdyFNVRATVVYIK--Qen-----LY-YTACa-----s-----------egCNK--KVNL--------------------d-henn-WRCEK-CdrsyatpeYRYILSTNVADAtGQ
SDL-----------tySTVQGTVMFLK--Edg-----LW-YTSCk-----g-----------egCNK--KVVM--------------------e-dggc-YRCER-CnmtyedcdYRYMVTMHLGDFsGQ
KKLy---rtigqfqrvVPLSQAGEMDK--GdeISSKMEWKYKACk-----k-------------CKK--------------------------scpegs---CPQ-Cgsd--dweYAYRMSLKLSDGdDA
TSEt-------sedvkANVYGYFTMFK--Vdn----GFC-YLSCp-----d-------------CKK--KIVE------------------------gs-TFCEK-Cqkdi-qpmRRFIVRASIADStSS
Giardia
Trichomonas
Trypanosoma
Leishmania
Naegleria
Plasmodium
Theileria
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-A
Arabidopsis-B
Arabidopsis-C
Oryza-A
Oryza-B
Oryza-C
Physcomitrella-A
Physcomitrella-B
Chlamydomonas-B
Chlamydomonas-C
Ostreococcus-B
Ostreococcus-C
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Dictyostelium
Entamoeba
       140       150       160       170       180       190       200       210       220       230       240
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|.
TRITIFdPHASTL-M-GmSADEFVEKGDeek----------LRLLSKVLEVDVAVKVQQGKYDg------rdRTNLIGIELAPS------------------yvtifDNLVGQLQA
AYINVIgDENSFMPLINvKPEEFEIDDTtkl-------RTMLLKKSFFRALRVKVRGKNSEYGv----------KLTAISGGEV----------------n-faeeaLRIANNITA
VWLTLFnESGAEF-F-GmTAPELKRRQEedp--mF---VTKVAQMRMNRPVLMRLRVKEEGLGg---nedseRVRLNVVRITEFmpldtvtedkrqamaaq-lrqecDEMIKCINA
AWLTLFnEAGIEF-F-GmEAAELKRRAQedp--lY---IAKLAQGRMNRPVVMRLRVKEETSSnamtgeesdRLRMSVVRISEFmpiagtseetrrrlaqn-lrtecDEILRLIEA
QYCNAFnQAVEKL-FSDmTADDMCA-RAaep--eY---MPYLLGEKTFTRYVFTVRVTTETTK-------epKLKFTIIRVTPI----------------d-yareaKSILSLIRD
LRVSAFaNSAKTI-MNGlSAEEFMKLRQeyisqeNIENFD-LIEKAKLNEFFFRIKAYMTSHMd------eiKKNYTILETIPL--------------skl-lvdscRYLIKEIKL
IWASAMaDVGESI-M-GiKAYNLINLMErgpsneNEKSFINYFEDARLTEYIFKIKATVENFMd------epRIKYRVLKATPL--------------dre-ldlaiKDRIENIKK
LRAVAFaEHGESI-MDGlNVDQLESMRNnpe--kS---TEDIFADKNFSEWVFKLNGRKEVYQd------stILKYRIFGVEDM--------tspdvlnre-akkklEYVYSKLNG
IWVSAFdEVGQKI-L-GvKGDVFRYADEdte--hGTETKKKLLMAAQNKEYRFLLLTKQERDQn-----gnaRDKTVIHAIKDF----------------q-payeaKKIINSLEK
LSVSVAnDQGQQI-L-QlSCDEFQKKSQvdk--------DNYVKRANFQQFRFLIIGKVETYNd------eiRPRYYISTFIQD----------------d-ivsdnEELYNQIKQ
SWISMFnEQAETL-FNGmTADNLYQQSIeqgdkdF---YDSTFLKATYTEWVFKCKVKQEMVGd------etRIKTSVASLVPV----------------d-yakesRALLSSL--
TWVSFFnEQAETL-LAGaTADQVYAETYqdq--qDQDAYDSYFAKANHTEWIFKCKVKNEMVNe------esRVKTSVVAMQPV----------------d-fakesRDLLSALAK
TWTTCFnDQGKVV-MGGrTADEIGELRDtnp--aL---FESIFKDALFKQYVCRLRVKAENVQe------elRVKASVMNLEPV----------------n-fvqesKDLLQAIAQ
TWITAFqETGEEI-M-GcPAKKLYAMKYelekeeE---FAEIVRDRLFHQYMLKLKIKEESYGd------eqRVKMTVVKVDKV----------------n-ytsesKYMLDLLVR
TWLSAFnDEAEKI-I-GcTADDLNDLKSeegevnE---FQTKLKEATWSSHLFRISVSQQEYNs------ekRQRITVRGVSPI----------------d-faaetRLLLQDISK
TWATAFqEAGEEI-M-GmSAKDLYYVKYenqdeeK---FEDIIRSVAFTKYIFKLKIKEETYSd------eqRVKATVVKAEKL----------------n-yssntRFMLEAIDK
AWVTAFqEAGQEL-L-GcSATELNALKEred--pR---FADTMLNCLFQEYLLRLKVKEESYGd------erKVKNTAVKVEKV----------------d-psgesKFLLDLISK
AWLSLFnDQAERI-V-GcSADELDRIRKeeg--dDS--YLLKLKEATWVPHLFRVSVTQNEYMn------ekRQRITVRSEAPV----------------d-haaeaKYMLEEIAK
TYASAFdEAGEQI-F-GrKAGELFSIRNvdqddaQ---FAEIIEGVRWHLYLFKLKVKEETYNd------eqSLKCTAVKVEKL----------------d-pskesNVLLGAIDN
TWITVFqETGEEL-M-HhTAKELFLWSQdep--qR---FSEAIQKLTFMKHIFKLKVKEETYNd------eqRTKSTLVKVDPM----------------d-wisesKLML-----
AWVSAFnEQAESL-L-GvSADNLSEMRNqagddnQ---YQNAVRKAMWQPCVYRISAAQTEYMs------ekRQRLTVRTVVPV----------------d-wvaesKHLLAKITK
TNVQLFgKEAEAV-M-GmRADELAALKEagg--eG---FAGALKAAQWKPWQVVVMSKAREYNg------nrSVRHSAYKVENI----------------d-wvsesSRLVTLIAK
QSLTAFgDAGDAI-F-GrSATEVRNMEVdrp--qE---FDRLAESIRFTPFFFRLKVAEDNYNd------eqRIKVSIYKMER---------------------------------
AWLNAFnEEATKM-F-GmTANEMHELKEndf--aA---YERAVKKMTCQHWSFLVKVVTEEYQg------esKRRMTAVKCNPV----------------n-yaaesKKLLSKMGV
YWVSVFgDKGDKI-F-GiSAAEMKEIYDrep--eR---YENMISDALFNDYSLRVKVAVDNYTd------vpRAKGSLVEIERV----------------n-yvdmsKKLIGKIAK
HWLNCYdEVGPII-FGGySAEELKRIKEtds--eE---YQRILEQAHFGEFLFRVRVRSDTYQd------emTFRHMVVGAEKI----------------n-yesemKMLESEIHN
QWVTCFqESAEAI-L-GqNAAYLGELKDkne--qA---FEEVFQNANFRSFIFRVRVKVETYNd------esRIKATVMDVKPV----------------d-yreygRRLVMSIRR
QWVTCFqESAEAI-L-GqNTMYLGELKEkne--qA---FEEVFQNANFRSFTFRIRVKLETYNd------esRIKATVMDVKPV----------------d-frdygRRLIANIRK
QWVTCFqESAEAI-L-GqNTAYLGELKDkne--qA---FEEVFQNANFRSYTFKIRVKLETYNd------esRIKASVLDVKPV----------------d-yreygKRLIMNIRK
QWVTCFqESAEFI-L-GqSATFLGELKDkne--qA---FEEVFQNANFNTYEFKIRVKLETYNd------esRIKATALDVKPV----------------n-yreysKRLIASIRR
QWITCFqESAESI-L-GqNATYLGELKEkne--qA---YDEVFQNANFRSYTFRARVKLETYNd------esRIKATAVDVKPV----------------d-hkeysRRLIMNIRK
QWVTCFqDTAETL-L-GqNSSYLGQLKDtne--aA---FDEVFQHANFNTFVFRNRVKLETYNd------esRIKVTVVDAKPV----------------d-hreysKRLIINIRK
QWATCFqETAEQL-L-LkSAQELGSLKDqge--aTEKEFNQVFQDACFIDYMFRMRIKMETYNe------eaRLKCTCVSAQPI----------------n-vrdytNKLIKDIRL
RWVTVFtDLAEQM-L-GkSSQDIGDALEfnk--dE---AEQIFSAINFKSYVFKLRTKVEFYGd------ssRNKTTAVAANPV----------------n-hkeynAYLIKNIQE
RWVTCFsETGEQL-L-KhNAQEVGEALEndp--aA---AEKMFADINFSSYIFKLRCKNEMYGd------mtRNKLTVQSMTPI----------------n-ykeynKHLIKELKE
VYVTAFgDSAAKI-V-GkSAAELGELHDesp--dE---YNAIFERLQFVPKMWRLRCKMDSYNe------evRQKMTVYGVDDV-------------nqdk-yienlKQMIEQMQQ
QWVTAFnDEAEKI-L-SsTAQELGELKEndi--dA---YSEKFSEATFKSFIFKIRVKVEVFGd------enRLRATCLGVSPM----------------d-yklynNHLITQIKE
QWVSVFsSEAEKI-L-GkTAQEIGLTMRdds--eA---GTAIFQAANFKQFIFKCRAKMENYNd------eqRLKIVVVKVDPV----------------n-yeeynGYLCEQIEA
QWVTCFqESAEAL-L-GrSASDLGQMKEnqd--eAQ--FDQVFASSEFKLHTFKIRAKMETYNe------etRLKCSVVNVVPV----------------n-ykqesKRLIDEVKK
NWLTFFqETGEAM-L-KcTAQQLGAWKEnde--sK---YEHTINEALFQSYILKVRAKMESFNd------enRLKCSCVNLTPM----------------d-yvqqsRRLLEGIRR
QWLQSFsEVAESV-L-GhSADEIGSWSAnsd--pR---FTTALADATFKTWTFRCRARTDTYNd------qsRLRVSVASAVPI----------------d-yvqdsKRMV-----
LWLTLFdDQAKQL-L-GvDANTLMSLKEedp--nE---FTKITQSIQMNEYDFRIRAREDTYNd------qsRIRYTVANLHSL----------------n-yraeaDYLADELSK
IWLTLFnDQAEQL-V-GvSANELTELKEnnn--qA---FVALTQKVQMNEYDFRIRAREDNYNn------etRIRYTVANLHDL----------------r-wkaeaDFLAAELLK
MWVTLFnDQAEKL-L-GiDATELVKKKEqks--eV---ANQIMNNTLFKEFSLRVKAKQETYNd------elKTRYSAAGINEL----------------d-yasesQFLIKKLDQ
LWLSCFdDTARVI-M-GkSADELMEIREtde--tR---LPAEFEQANCRKLNFRCRAKMDTFGe------qqRIRYQVMSVAPL----------------d-ykmegNKLNELINS
QWLTGFdDFGRQV-M-GrTADEMMELKEndd--tK---LTAAFEEANCKKFTFRCRAKMDNFGe------aqRIRYQVMSVTPL----------------d-fksegTKLAELIKQ
LWLSCFdDVGRII-M-GkSADELMALKDenf--eA---FTREFENANCRKLSFRCRAKMDTFGd------nqRVRYQVMGATKM----------------d-wkseaARLADLIKQ
LWLSCFdDVGRSM-M-DiSANQLMELFQtde--kA---AGDVFQDANCRTWNFRCRAKIDHFGe------qqRIRYQVSSAKPI----------------n-ysheaGRLADLIGS
LWLNVFdDVGRIL-M-GkTADELNAMQEnde--nE---FTSVMSDASYVPYVFECRAKQDNFKg------evRVRYTAMSVRNI----------------d-wkqesKRLVDLIKS
AWFQGFnEVGVTV-Y-GmSANDLVQIKNndh--aQ---YKAIQYHAACNTYNFSCRAKEDEFNg------vrRVRFGISRLAKV----------------d-ykeeaGYLRDLLYS
MWLSGFnEDATQL-I-GmSAGELHKLREese--sE---FSAALHRAANRMYMFNCRAKMDTFNd------taRVRYTISRAAPV----------------d-fakagMELVDAIRA
MWVSLFdEVATSF-F-GiSAREMKVMSEeap--gE---LQALIRRMYFRECLFRIKSKQDSYNd------eiRMRYSGLSVENL----------------d-ilkesKRLLGVIEK
ISVEVMgKTGDRL-F-GkSAAELYQMNQeq--------INEIFNTVLSNNYVVSLAPSSYMGSn-----gqtYNRFNVYDFANLydtpnklvsgpnpnapnqfqnyiESSFNDCRL
IWVTIFdEEMKKI-I-GkSADEMYEINEqds--eL---FENLFKQLTFIECRFHLICKKDEYNg------etRTRFTVNFIQVL----------------d-niasgTEEMKSLER
EUKARYOTES
Trichomonas
Trypanosoma
Leishmania
Naegleria
Theileria
Cryptosporidium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-a
Arabidopsis-b
Oryza-1
Oryza-2
Oryza-3
Physcomitrella
Chlamydomonas
Ostreococcus
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Entamoeba
        10        20        30        40        50        60        70        80        90       100       110       120       130       140       150
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|...
VIRIIGKIL--SHEEQDATTDK---------YVLNDCS-GTIEAFESVDPSNVRE---------------------------------------------PFEDNRYVAITGMLK--FENDSKSLSIES---IEYADDYNRITYHALDTIHAH
QATVVGRVI--GYEDDTTNRVTGALTAKHYGYRITDGT-GLVVVRQWMDADHQE---E------------------------------------------PLPVQCYVRASGTVKV-WQ--NAPIVTGT---VRLVSDCNELNYHYLDVILTH
QATVVGRVL--GYENANMASGGGAITAKHFGYRITDNT-GMLVVRQWIDADRMQ---E------------------------------------------PLPLNTHVRASGTVNV-WQ--QNPIVTGT---VVSMADSNEMNYHMLDAILTH
IVEVLGMVT--SINS-RNGFTT---------IQIDDCT-GKLDVKVFDESVNNNPFLKSEVEQIQYVLLYIHKCLFIIIINCSCHDLKIIINCVYYFNNQQCRVNKFVKAFGLVSE-YK-DRLSIKAQM---VRRVEDANEIYFHMLECIQVH
IIKLVGYVK--DAKE-TEQDTS---------FVIDDGT-GTIECIHLSPGDISD---WKRSYISE-----------------------------------LTRTKSPVKIYGGFNPLYSSSSPTIIIYS---IKEVTSPEEIKLHNLDVIYSV
LFKLVGFVRCAEHEE-YPQRVR---------FYLDDGS-GLI-LIDWLIDNTGTN--YKQELIN------------------------------------SITEGCFVKVYGELTL-MV-SEPSVRAFV---VRPLVCTDEISAHDIDVAVFI
MVKVVVAVR--SHEE-RSTNLF---------LDIEDGT-GFTQAKVWVN-EGDE---CSGVVQLRQN---------------------------------ACKDHQYVRIIGQVRE-FD-GTRQIVAND---VRPVSSGDEITYHFLEVAHSY
HVRFVAAVR--SFED-FSTNVV---------YTLEDGT-GLMEVKQWLDDNHCT---AIAEMRQH-----------------------------------TLKENIYLKVVGQIKE-YD-GKKMVVAES---IRVLSTGNELAHHMLEVVYAG
--------------------VS-----------TDDGS-GAFDCQYFISADDDN---ASEGEMN------------------------------------RLREGSYVRVVGKLRT-FQ-GKASLSCFS---VNPVEDMNELTHHLLEVIYTH
NVSLVGLVC--DKDESKVTEVR---------FTLDDGT-GRIDCKRWVS---ET---FDAREME------------------------------------SVRDGTYVRLSGHLKT-FQ-GKTQLLVFS---VRPIMDFNEVTFHYIECIHFY
TVVIVGRIS--RMEN-RITQVD---------FVVDDGT-GWVDCVRWCH---AR---QETEEME------------------------------------AVKLGMYVRLHGHLKI-FQ-GKRSVNVFS---VRPVTDFNEIVHHFTECMYVH
NVRLVGLVS--GKTE-RNTDVS---------FTIDDGT-GRLDFIRWVN---DG---ADSAETA------------------------------------AVQNGMYVSVIGSLKG-LQ-ERKRATAFA---IRPVTDYNEVTLHFIQCVRMH
TVRLVGRML--NKLD-RVTDVS---------FTLDDGT-GRVPVNRWEN---DS---TDTKEMA------------------------------------DIQNGDYVIVNGGLKG-FQ-GKRQVVAYS---VRRITNFNDVTHHFLHCVHVH
NVRVLGRVV--SVVS-RDTDVC---------FTLDDST-GKIPLVRWIT---DQ---SDTRDTS------------------------------------YIQEGVYVKVQVNLMG-FQ-AKKQGLARS---IRPINNFNEVVLHFIECMHVH
NVTLVGMVH--DKDE-RNIDTS---------FMLDDTT-GRIEVKRWIDGQ-DS---YEYFEMQ------------------------------------SVQNFMYVRVHGHLRT-FQ-NKLNVVAFS---VRPITDFNEVTFHFLEVIHVH
TVTILGKVT--SYRE-LSTRVQ---------LQLHDGT-ASMEVCSWVD---DA---DMQAQKPV-----------------------------------EWQVGKYVRVYGNLKT-FE-GKRSLTAFA---VKPVTDHNEVTYHFLQCVMQH
NLTVVGKIV--GVES-KSSYVL---------YKVDDST-GVCDVKVWSDQDGDQ---TAE----------------------------------------PIEVGAYVRVYGSVKT-LA-NEHMIAAHTQQAVRKITDHNEVTFHMLEVVYAS
-----DELR----EQ--PLDLL---------WLLDDRS-GEM---IWARMASTS---SSSLAA-------------------------------------LEQSGILVRVFGQLLE-VD-GRRVLNVRA---IRKADGEVELRYHENLCQLSK
QVTIVGIIR--HAEK-APTNIV---------YKIDDMTAAPMDVRQWVDTDDTS---SENT---------------------------------------VVPPETYVKVAGHLRS-FQ-NKKSLVAFK---IMPLEDMNEFTTHILEVINAH
QVTIVGIIR--HAEK-APTNIV---------YKIDDMTAPPMDVRQWVDTDDAS---GENA---------------------------------------VVPPETYVKVAGHLRS-FQ-NKKSLVAFK---IIPLEDMNEFTAHILEVVNSH
QVTIVGIIR--QAEK-APTNIV---------YKIDDMTAAPMDVRQWVDTDDTS---SENT---------------------------------------VVPPETYVKVAGHLRS-FQ-NKKSLVAFK---ILPLEDMNEFTIHILETVNAH
QVTVVGIVR--HAEK-APTNIL---------YKVDDMTAAPMDVRQWVDTDEAG---SENI---------------------------------------VVPPGTYVKVAGHLRS-FQ-NKKSLVAFK---IMPLENMNEFTTHILETVNAH
QVTIVGIVR--HAEK-APTNIL---------YKVDDMTAAPMDVRQWVDTDEAS---CENM---------------------------------------VVPPGSYVKVAGHLRS-FQ-NKKSVVAFK---IAPVDDMNEFVSHMLEVVHAH
QVTIVGVIR--STDK-STINIQ---------YKVDDMTAAPMDVKQWIDTEDMG---VDNS---------------------------------------VIPPGSYVKVSGNLRS-FQ-NNRSLVAFS---VRVLEDMNEVTSHMLEVVNAH
QVG----------------------------FTMKDSS---------MDQQPT-----------------------------------------------VYEENTYIKVSGNVRA-FG-GKRSIGPFR---IAPIKDLNEISMHMAEVVQSH
MVTFVAIVR--SVDH-SSTKIT---------YGLEDHT-GQVDAHLWLE-EGDT---NSVP---------------------------------------GMMTHSYARVFGSVRH-QG-GSKAVMIYK---IEQVSSPNDVTTHLLEVLNAR
MACVVGIVR--NIET-SSTKIT---------YTLEDHS-GRIDAHYWLE-EGDT---LKAP---------------------------------------EVMVNNYVKVYGTTRS-QA-GQKTLMVFK---LLPILDPNEVCTHLLEVLNAR
TVQTVGIVK--EINQ-EGTTWS---------YDLCDPNNEAMEYRALKYENEGS---NSDQS--------------------------------------SIVEGTRVRAIGKLKS-FD-GSNSIMLFN---ITPVTDDKDFTIFELEAEAAR
MFTFVGLIR--NVEE-TATKIS---------YDIEDDT-GTITALKWLEANKQE---TDR----------------------------------------VAEVNTYVRIVGMLRE----QNDKLIYAS------LKAEAKLNK---------
FADVVGVLK--DFEV-QTTKAT---------CTIEDHS-ASIKAIMWLETDNDT---VTALP--------------------------------------PVKENCYVRVFGSVRT-QD-GEKMIMILK---ILPVDDLNIVTNHLLEIIQAK
QVSFIGVIR--SAEE-ASTNVV---------YHVNDMTGEDIVVKKWANDNEET---EQERERRA-----------------------------------ACRENTYVHVVGNLKW-FK-ESKSLIAFS---LMPLEDFNQLTCHILEVMQAH
QITFVGVIR--SVTE-SAAYTQ---------YAVDDMTKSPISVRRWVDSEVSC---NMYS---------------------------------------TLADDTYVRVVGHLRA-LQ-GVRYVMAIN---IQPIEDCNEITYHILEVIHSH
KVMIVGVIR--SVDA-RATRVT---------YTVEDHT-GAISATRWSSNAGDE---EESSAAPD-----------------------------------LYRENDYVQVVGQLRSDNE-NNLQLTAYN---ISKLTNGNQLTHHLISIVHAH
HVCFVGVVR--NITD-HTANIF---------LTIEDGT-GQIEVRKWSEDANDL---AAGNDDSSGKGYGSQVAQ-------------------------QFEIGGYVKVFGALKE-FG-GKKNIQYAV---IKPIDSFNEVLTHHLEVIKCH
HVSFVGVIR--NVAD-NTSNVT---------LTVEDGT-GQIEFRKWTNDSNDM---SHASQEDQNGDYNSQVAQ-------------------------DYSVGKYIKVYASLRE-FS-GKMNVQYAV---VKHIDSFNEILAHHLEVIKAF
MISFVGVVR--NVEN-TNASIA---------VTIEDGT-GSIDVRKWVDET-------ISSAEEDFEKY-------------------------------NEMKGKYVYVGGSLKQ-FN-NRKTVQNAS---ISLITDSNQIVYHHLSAIEHH
QVTIVGQVR--SVKP-QPTNIT---------YRIDDGT-GAIDVKKWVDSEAQG---GEDGGSGAG----------------------------------TIAPDAFVRVWGRLKS-LG-GKKHVSANF---IRQIEDFNEVNYHLLEATYVH
QITFIGQVR--SVQP-QPTNIT---------LKIDDGT-GQIEVKKWIDVDK-----ADDSEA-------------------------------------GFELDSHIRIWGRLKS-FN-NKRHVGAHV---IRPVSDFNEVNYHMLEATYVH
QITLVGQVR--SINP-QPTNIT---------YRIDDGT-GTIDVKRWIDPEK-----AEDADAAS-----------------------------------QHQPDSYVRVWGKLKA-FN-NRRHVGALF---VRPVEDFNEVNYHMLEVAYVH
SICFIGQVR--NISS-QSTNVT---------YKIDDGT-GEIEAKQWIDSMTAD---SMDTDDINNTKAATGRRDG------------------------KVELNGYAKVFGKLKS-FG-NKRFVGAHC---VRPVKSLDEVHCHLLEASAVH
QVTFVGVLR--NIHA-QTTNTT---------YQIEDGT-GMIEVRHWEHIDALS----------------------------------------------ELATDTYVRVYGNIKI-FS-GKIYIASQY---IRTIKDHNEVHFHFLEAIAVH
QVTIVGQIL--SIQQ-QATNSV---------YAIVDGT-GTIEARQWLNTDTDG---SIQQ---------------------------------------GLKENIYVRVAGNLKA-FN-SRRYINTTH---IRPITDPHELYFHILESMTVT
QLTFVAVVR--NISR-NATNVA---------YSVEDGT-GQIEVRQWLDSSSDD---SSKAS--------------------------------------EIRNNVYVRVLGTLKS-FQ-NRRSISSGH---MRPVIDYNEVMFHRLEAVHAH
NVQIIGWVV--SSKT-SATGSM---------FVLEDGT-GSVDCTFWPG---NS---YEEEQCK------------------------------------VLEEQNLLKVNGSLRT-FN-GKRSVSASH---LSAVEDSNFVTYHFLSCIYQH
TVVVCGRIT--SIDI-QNDVKR---------YTINDST-GSVVVGVYQTDSTEE----------------------------------------------NIEVGQYIKCVGKIKK-FS-QETYILASR---LPLVVDVNHMMTHLIECAYAL
Figure 2.23: Multiple sequence alignment of RPA2 ssDNA binding domain (DBD-D) from 45 diverse eukaryotes. Genus names
for Excavata are highlighted in brown, Chromalveolata in orange, Archaeplastida in green, Opisthokonta in
purple, and Amoebozoa in blue. Shaded columns indicate positions where amino acids are 75% identical.
EUKARYOTES
Giardia
Trichomonas
Trypanosoma
Leishmania
Naegleria
Plasmodium
Theileria
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phaeodactylum
Phytophthora
Arabidopsis-A
Arabidopsis-B
Arabidopsis-C
Oryza-A
Oryza-B
Oryza-C
Physcomitrella-A
Physcomitrella-B
Chlamydomonas-B
Chlamydomonas-C
Ostreococcus-B
Ostreococcus-C
Cyanidioschyzon
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Caenorhabditis
Apis
Tribolium
Nematostella
Trichoplax
Monosiga
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
Dictyostelium
Entamoeba
        10        20        30        40        50        60        70        80        90       100       110       120       130
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|...
-------------------------------------------------------MDNHH----------LlYLILSrr---LQKM-----------------------------------------------
NVLEDILVKkvnvMEP--------IVQVVQIKECnqh-----------dLYRAALSDGTH----------FiPAMLGsk---LKDLIenk--viqRNSLIKLLK--YTVSNNs-------KQPLIVLNAELHK
-------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------------
------MQGktmdK----------ITPCVQVTEArss----------gqKLIVTISDGIN----------ScPAVIIkp---SQ--------pvdQYAVIKIYE------SLkht-----KEIVIIVKYTVEE
NFIYKFFTEpnstEALKWLNSEVNLICFSQMNAGgnqvflkvidgsippQYYAIVHLGVEdngnmdppiSYvKKIIKiqkfsITNYYgklfilakKVTYLNIES--FDIEDLfkkyhlqsISYLLLNAQNDYN
DFFTSLADSinnaNEK--------YFQEVHECQTpvkllclqqsciadnKYLIETLDGSV-------pvEYkHTCLA-----LVPLTpegdrql-IGKVISFTQ--YKVAPTks------RYLVVLTRISVVP
GVCDQLLSGipssNGS--------IVVILAINLIrsg----------prALVHVADAGSN---------IHeSGVPLsi---RLVMSdat--qfqPGDLIRIIR--FSLNEI-hs-----TKLVTVIQFEKVG
NAIDSLINStnpdEQY--------VIQVLKAPQSvae-----------nLFKICISDGFC---------KFkKGYFVsd---AATKCq----dlkDLCIIKCKK--YIDDSN-hd-----KERIIISNYELIY
EALKSMVEYqdntTQY--------CVQILNIVF------qddmkskngiLYQCICSDGFA-----------kMKMHIln---LSPSIlan--iskLNPIIRILE--LKLS----------QPFFIVIKHEVLY
-----------------P------VVQVIHLKKIdks--------ggdeRWKVILSDGTL----------HvSGMLAtq---LNPLVass--qitTNSILTVKD--FIINTMgsg-----QKVCILLNVEVNG
---MTGNGGgpsfSP---------ILQVLDTKQVpgp--------qgsvRYRIVLSDGKH----------YiQGMLAtq---LNHMIatn--ligANTIVQVEQ--FMSNRVkd------RTVIILLNVHALR
NAVSMLYNKqapdGFEP-------WLQVIDTKKIkpa------sgtggdRYRIVLSDGSS----------YiSGMLAtq---LAPMMese--slkTNFVLQLKD--FLVNEVqg------RRILIVLSIGDIV
NAITAIHDGdvnlKP---------LLQVLEIKMIgrs------qersqeRYRFLISDGVS----------AqHAMVAvq---LNDRVksg--qfeKGSIVQLID--YICSDV-kg-----RKLIVVLNMETIV
DGIATVLANqsldSSSVRPEI---VVQVVDLKPA-------------gnRYTFSANDGKM----------KiKAMLPat---LTSDIisg--kiqNLGLIRLLE--YTVNDIpgkse---EKYMLITKCEAVA
GVVMKMLNGevtsETDMMP-----VLQVTELKLIqsk---lhqnqessnRYKFLLSDGTD----------LaAGMLNts---LNSLVnqg--tiqLGSVIRLTH--YICNLIqt------RRIVVIMQLEVIV
NGVAAALAGdtnlKP---------VLQIVELRGVqvn----gagvtrgeRFRAVVSDGTA----------AsSALFAaq---LSDHArsg--alrRGSIVQLSE--YVINEVgp------RRIIVILNLEVLV
GAVAFVLENaspdAATGVPVPEI-VLQVVDLKPIgt-------------RFTFLASDGKD----------KiKTMLLtq---LAPEVrsg--niqNLGVIRVLD--YTCNTIgekq----EKVLIITKLEVVF
GAVQAIAEHpdgtGTIQP------VLQVVDVRPVttk--napptpkpaeRFRMMLSDGVN----------TqQSMLAta---LNPLVkda--tlrPGTVVQLTD--FMCNTIqg------KRIIIVVKLDVLQ
NAIVALNNGdvelRP---------VLQIVDVRQIgns-------qttteRFRLVLSDGVH----------LqQAMLAtq---LNEKVknn--lavKGSIVQLLE--YICNTVqn------RKIIIVLNMEIVE
-------------------------------------------------------------------------------------------------------------------------------------
GDVARIKSKedfaNGV--------VLRVSELQEVggk------------KHKCMLSDGNN----------SiRGVLAsq---FADLVasg--elsNGCLIKITA--FVTNTIgs------DDVVLATDLSVVS
-------------------------------------------------------------------------------------------------------------------------------------
GAVNKIRESagatDV---------CVQVLDFKSAdea-------------YSATLNDGEN----------TiAAKFAat---CGEKLssg--avkENAVLKLTDVAFETDGVer------KPFAVINGFEVVD
NAISNILEQthgsQDFKP------IVQVFDLKELktk----pdaddaakRFRVLASDGGF----------AaQGLFGae---LNAMCerg--eitKFTVLRLRE--YIVNDLng------RRILIVMDAEVMD
GAVKSIYGMntvsRP---------VLQVQEVRKLqpsvaqqaqattsgdRYRVVLSDGEH----------LlHCVLMaq---LNSFVlsg--dldKGSIVRLVD--YQPNKVqd------RVVAIIINLEILE
GAIAAIMQKgdtnIKP--------ILQVINIRPIttg--------nsppRYRLLMSDGLN---------TLsSFMLAtq---LNPLVeee--qlsSNCVCQIHR--FIVNTLkdg-----RRVVILMELEVLK
GAIEVMIQQentsIKP--------ILQVINIRPIstg--------nrspRYRLLMSDGLN---------TLsSFMLAtq---LNTLVegg--qlaSNCVCQVHK--FIVNTLkdg-----RKVVVLMDLEVMK
GAIGLIMQQgdttIKP--------ILQVINIRPIatg--------nsppRYRMLMSDGLN---------TVsSFMLAtq---LNVLVeee--rlsSNCICQVNR--FIVNTLkfg-----RKVVILMDLEVLQ
GAIAAIMQGenvyKP---------VLQVINTRAIatg--------ngppRYRVLMSDGVN---------TLsSFMLAtq---LNPLVeee--rlsAHCICQVNR--FIVNSLkdg-----RRVVILMDLDVLK
GAISAMLGGdsscKP---------TLQVINIRPIntg--------ngppRYRLLMSDGLN---------TLsSFMLAtq---LNSLVdnn--llaTNCICQVSR--FIVNNLkdg-----RRVIIVMELDVLK
GAIESLSKGtevnNP---------ILQCVNIRKIdgg--------ngvsRFRVMMSDGLH---------TMsSFMLStq---LNPMAeqn--qlaTNCVCVLKR--SVTNVLkdg-----RRVVVILDIEVLK
-------------------------------------------------------------------------------------------------------------------------------------
GCIADIMRGteleKP---------VVQILGSKRIagg-------geqseRYRLLISDGQN---------LYsFAMLAtq---LNELHhng--qlaEFTVIRIDR--YITSVVnrnekge-KRVLIILDLHVVK
GVIARIMNGedvsQP---------VLQILGIKRIntn--------sdqeRYRLLMSDGKY---------YNsYAMLAsq---LNEMQnrg--llnENTIVRLDK--YMTSMVgkegsg--KRVLIVTELTVLN
GYVQEAIENngypGHDG-------IVQVLKGKVEqge------qlghafTFRIRISDGVF----------QyNALMSad---IDDQIkrevehlvEGTIIALTK--FEIYDQgega----KNCFLIKGYKILS
GALDKIMNGidvdKP---------VLQILGHKKLsss--------ssgeRYRLLVSDGKR---------VNsFTMLAtq---LNSMIten--iltEFSICQINR--YAISMVnnagkq--KRVMVILNIDLKV
GALLRIMKGqeveEP---------LVQVLVSKKIssr-------saeteRYRIWASDGDY---------SItYGILTlp---PGK-------pveDFSIIKLKK--FVKSEIsnakgp--QKILLIIDSEIVT
GAIQEILNTpsdqPDRLPEQP---VFQILGLKKIqpk------qgdasdRYRLVLSDGVL---------IHtSAMLAtq---LNDKVtdg--eieVKAVVRLDK--YICNIIqet-----RKVLILLELTTVK
------------------------MMIFLQIRLFatv----kthqhrffKWLIVLSDGIH---------AYsSVMLAtq---LNQRVtsg--eldAKAIIKLNN--YTCNIVqet-----RKVLVILDLTVLT
----------------------------------------------------LSISDGKY---------KHnSAMLAtq---LNNLIqnd--yirVNSIVRVKQ--GVCNLVsn------RRILILLDVEVVA
GDFHSIFTNkqryDNPTGG-----VYQVYNTRKSdga--------nsnrKNLIMISDGIY----------HmKALLRnq---AASKFqsm--elqRGDIIRVII--AEPAIVrerk----KYVLLVDDFELVQ
GDLLDIFRIperyNNPTGG-----IYQVVQTKKTetn----------akKNLILINDGKY----------HvKALLRnk---AAEAAqqa--eleRGDVFKVLN--AECAVIkekk----KFVLLVDEIEIVS
GALKQVFSKeghdSVQIPM-----ILQITNIKAFdvs-------psdskKFRILVNDGVY----------StHGLIDes---CSEYIknn--ncqRYAIVQVNA--FSIFATs-------KHFFVIKNFEVLA
GALDAMFNDpdraQQQFPVP----ILQCLQIKTLdsk----nggagateRFRIVLSDLKN----------YvQCMMAtq---TNHLVhdg--llqRGCIVRLKQ--YQAQCLkg------KNILIVLDLEVIQ
GALDVIFNDpdkaTKLFPVP----VLQCLQVKQMaps-------aqggdRFRLVMSDGQH----------YvQTMLAtq---ANHVVhdn--klvRGCFARIKQ--YTPNNLkg------KNILVILDIEVIE
GAIAAIFNDpegvKTRFPVP----VLQCLQVKLLgqq-----pnagaaeRYRVVLSDVDN----------YiQCMLAtq---ANHVIhdd--qlqRGCIVRVKS--YQANTVkg------KSVLVLLDLEVIQ
GALSAIFDDtkpqTREP-------VVQCVQIKPLpaq-------qshpeRYRAVFSDISN----------YvQTMLAtq---LNPMVssk--llrKGCFVRLKS--FQANSV-kg-----KKILIILDLEVLE
GSLNKINTTsdpsEFPANP-----VLQVLTVKELnsn-----ptsgapkRYRVVFSDSQN----------YaQSMLStq---LNHLVmen--klvKGAFVQLTQ--FTVNVMke------RKILIVLGLNVLP
GSCERLQFAnpqdASVFESPH---TIQFLSIKKVnta---npnsnapvdRYRIIISDGVH----------FiQAMLAtq---LNELVqnn--sigKHTVAVVER--ATCNYVqe------KRLIVVLELRVVA
GAIAQMIQTsdpaSSSVQNP----VCQILSIKKIqas---atsaanvgdRYRIILSDGIN----------YaQAMLAsq---KRSMVesg--eleKNCLVRVTQ--FASNSVqn------RRILILLDLDVVH
GTVEALYNSqannPLYKNP-----VLQITSLGKLsva-------igdkqRYRVNLSDGVN----------YmKGIFSse---LTPHFekg--lvsRYSLIRPGR--FSVRSKdg------SVYIYIQEIQAYE
GCLTNIINVdqtkRFQNIDC----RVQVIKTKQSssn------------NLEIHLSDQDH----------IfVGISKtd---PS--------nipINSIVNLSD--FSINFP--------KRFLNIVQFNVIS
GKIADVVRArkiiEPI--------IVQVSNVQSVekt----------qdVVKAKIHDNKY----------QiTTIFKlk---DTALLk----rlkDFMLIKVIQGSVHVPQNisk-----TLVVAISNFEIVD
Figure 2.24: Multiple sequence alignment of RPA1 ssDNA binding domain (DBD-F) from 54 diverse eukaryotes. Genus names
for Excavata are highlighted with brown, Chromalveolata with orange, Archaeplastida with green, Opisthokonta with
purple, and Amoebozoa with blue. Shaded columns indicate amino acids are 75% identical.
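The shading rule in the caption (columns in which amino acids are 75% identical) can be sketched in a few lines. This is only an illustration, not the procedure used to draw the figure: the function name, the toy alignment, and the choice to count gapped sequences in the denominator are all assumptions made here.

```python
from collections import Counter

def conserved_columns(rows, threshold=0.75):
    """Return 0-based indices of alignment columns whose most common
    residue is shared by at least `threshold` of all sequences.

    Assumptions (not taken from the thesis): case is ignored, gap
    characters ('-' and '.') never count as the shared residue, and the
    denominator is the total number of sequences, gapped or not.
    """
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all aligned rows must have the same length")
    flagged = []
    for i in range(length):
        column = [r[i].upper() for r in rows]
        residues = [c for c in column if c not in "-."]
        if not residues:
            continue  # all-gap column
        top_count = Counter(residues).most_common(1)[0][1]
        if top_count / len(column) >= threshold:
            flagged.append(i)
    return flagged

# Hypothetical four-sequence toy alignment, for illustration only.
toy_alignment = ["GALDKI", "GAIDAM", "GALSAI", "GSL-KI"]
shaded = conserved_columns(toy_alignment)
print(shaded)
```

A different convention (for example, dividing by the number of non-gap residues in the column) would shade more columns in gappy regions, so the threshold should always be reported together with the denominator used.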
EUKARYOTES
Trichomonas
Naegleria
Plasmodium
Cryptosporidium
Tetrahymena
Paramecium
Thalassiosira
Phytophthora
Arabidopsis
Oryza
Physcomitrella
Chlamydomonas
Ostreococcus
Homo
Mus
Monodelphis
Gallus
Xenopus
Danio
Strongylocentrotus
Aedes
Drosophila
Apis
Tribolium
Nematostella
Saccharomyces
Kluyveromyces
Candida albicans
Neurospora
Gibberella
Magnaporthe
Aspergillus
Schizosaccharomyces
Coprinus
Ustilago
Encephalitozoon
        10        20        30        40        50        60        70        80        90       100       110       120       130
....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|....|..
--------METDHSFRVNSEYLG--QHKSKMVTIVGRIV--DKSSDPYIISTTDSKSVRVHKN--PSLNDKRFSAEWIEVTGRV---QENLD-IEESSAIPITS-----KIDPEAWNQMVKLSH------KFK-EIF
MSNNDIQPYQPVSYPYVNAQTLR--NYIGKEVTLIGRVVSLVSDSDVFDVLTHEG-TVTIYHN--SPVSFD--ENAFVMIRGQGEDLNGSPS-LRSTFAQQLHT---TTDLDMEVFNNFILLAE-----GKFR-DLF
--------MEQFIAPRVNKKHLS--KFYSKNVRIIGKVLKK--DGNELTLLACDNEEIKCILTD-NQVEEP--LDQYVEVLGKV---NEDDT-ISDIVYVQNGG----SSINLNEINNLVNLTF----LEELE-GVF
------MQSSIENARRVNKEELQ--NFVNKQVRFVGKVVSV--EGEIVILEAPDGGTVRCRTI--SPP-----PSTYVEVIAQV---MPDLSLTQTDFMFDLGD-----SLNMDLVNESIKVSF----HPKLR-QHW
----MDAEQEQVMYPRILFEQMA--QFRGKKVTVVGNVCNEDQNDSLVIEFGPTGLNQHVVIDNYRRVDLNN-TTKFVEIRGVV---LNQNI-VSCEELTEFEQ---KDPFDFDTYSKLIHLSQ----SDKLS-SLF
------MTENPTSFQRINADMIS--KFKGQYVTLVGKLIQ---SKGDYVEFSVDGTIVKVTEI--EEVPEST-EDILLEIRGKL---NEDGY-LEAKEFTELDQ-----TFDFELYKKVINMVQ-----GQFR-ELF
-----MSSQPDGAFPRVNHALIKQGQYIGLIVSVVGRTVNFD-GQSNLEIECSDGGRVTITVD--PEYNYV--PGQVLEIMGHL---MDENT-IQ----VGWGG-----GVGVALLCNDTDVS--------------
MDFGGAPTGNATSPRVNKKTMG--AYVGHTVALVGAVESH--SPTAVVLRTSDGEIVNVKTQ--PGTDYG---SKVVEVIGRV---EDSET-IREFKTTLFGD-----NFDLDVYDQFVQLSQ-----TKYK-HLF
-------MDTSSPSAFVNGALLR--RFIGQKVRTVIQVT--GSEIGSVVGKSTDDLQIVVRGS--SPP-SP--LTTYLEVIGIA---ESDNA-IRAETWTNFGN-----TFDTQNYNELCKLAN-----GEFK-HLF
-------MDTSGPAAFVNGEILK--MFVGRRVRTVVQAQRE--EGGLLIGQSTDGHQLTIKGA--SGAP----MSHYVEIIGIA---EPNQA-IRAEVCTDFGE-----NFDPAPFNGLCKLAN-----GQMK-DLF
------MADISNPRPMVNSKLLK--NYMGRRVTTVVKVA--RTEGGNVVGELPDGAPITVKQA--PQHVAA--QSQFMEVIGVV---EGDRS-LRAETCTSFGD-----NFDMSTYNDLCQLAN-----IENR-ECF
SNISDKMAGVDEAIPRVNFETMQ--RYHGRKVILCCQISQID-NGTVRVTTSDKGEVTVVGGS--SPYE-----GRFAEVVGTV---VGPTN-IQEVEHTNLSD-----NFSLDMYNELVKLAHKDAYIGMFS-TIR
-------MDDSAPRPRVNGEALV--NFIGKTVLVVGEVT--PRDANSATVKTADDKMITVNLA--GAG-AF--GSKYVEFEATV---DGADC-VTECSRVEFGD-----DFDQYSYGELCKLIN-----GKSK-ELF
-----MVDMMDLPRSRINAGMLA--QFIDKPVCFVGRLEKIHPTGKMFILSDGEGKNGTIELM--EPLDEE--ISGIVEVVGRV---TAKAT-ILCTSYVQFKE--DSHPFDLGLYNEAVKIIH------DFP-QFY
-----MEDIMQLPKARVNASMLP--QYIDRPVCFVGKLEKIHPTGKMFILSDGEGKNGTIELM--EPLDEE--ISGIVEVVGKV---TAKAT-VLCASYTLFKE--DTNRFDLELYNEAVKIIN------ELP-QFF
-----MAEVLELPRTRIGAAHVA--SFIDRPVCFVGRLEKIHPSGRSFTLTDGEGAQVTVELA--QPLEEE--ISGVLEVVGRV---TAKAT-ILCSSYVLFRD--HNHSFDLRLYNDALKIIH------EFP-QFY
-----MGDVHEAPRPRIAAAQLV--QHIGRPVCFVGRVEKIHPTGKLIVLSDGEGCNATVELS--EPLDEE--ISGILEVVGRV---TNQAT-IMCTSYVQFRE--DKSPFDLELYNEALKIIH------EFP-EYF
-----MADLFDVPKVRINTSMLA--QNVGRPVCFVGKVEKVHPTGTSIVVSDGAGKNATVELN--EPLEEE--ISGIIEVIGKV---TPKAT-IMGVSYVPFRE--DVSTFDLALYDEALKIIH------EFP-QYY
-----MTGVYESPKTRINTSMLS--QYISRPVCFVGRLEKVHPSGKVLTLVDGEGKSASVELN--EPLDEE--LSGIVEIIGMV---SNKGA-IMATSYTQYRE--DKVPFDLELYNEGLKVLH------DFP-QHY
------MDAFKQPKPRVNGSMLP--KHQGSIVCLLGLLKNVDPNGTSLTLTLSDGVDAQVNLQ--TPLDRP--IEGLVEVVGQV---GANPRQIKGLNLISHGQ---------KDFGEICSIKQ------EFTLHTK
--------MEFEPRSIVNGSLLK--RHSGQSVSIHLFVEKGDKDGRSFVGKSTDGMPIQVMLS--APLSQI--LHGWVEVIGMA---GSNDS-VRCKEIITYTGSEDGEEFDTDGHNMLCNFLA------NCR-DMY
-------MDAFDPRSIINGGMLK--QFSGQTVSIMVRVESV--AGSTLLASSTDNHKLKINLP--GELGAA--EGAWVEVIGVP---HGADT-LRAKEVIEFGG--ENIDFDKDGYNGLSHLIN------NVK-AFY
------------MKKRIDGRRLA--QNIGEQVILLGTIGKKSSNGRNLELRTTDGVQVNITLP--EPIDGN--AEGYIEVHGTL---QSKST-MNCSNYIVFPL-SLTEEFDADQYNELMIILN-IVGVEKLT-ECE
------MSTRNPLYNIVSGAQIA--GFVGKNVAVCGLVNGAHVGDKTFTLRSSDGVLVPVELN--KPLTED--IEGYVEVKGVC---QQSKT-IRADEFCTFNN----EKFDSSNHTKLCKILN------SLP-NVY
-----------MDAPRVNASMMK--QYSGRLVCFVGSVSEINSTGTELKMLSSDDKMIHVVLP--EPLDEA--LQGVVEVVGRV---ERDLT-ISAQRIISYAG---REEFDLSLYNEAITLAA------GFP-EMF
---------MASETPRVDPTEIS--NVNAPVFRIIAQIKSQPTESQLILQSPTISSLNNIRVS--MNKTFE--IDSWYEFVCRNNDDGELGFLILDAVLCKFKE---NEDLSLNGVVALQRLCK------KYP-EIY
ATSNQQPLVMSNQTPRIDPSQIS--NTQHSVFRIIAKVLDQPQPKELILQSPTTNGLSQVKLS..SNIE--VGSWYEFVCRNVDTGDIGLMVLDSVKCELKE---GEEISVSGIVALQQLSG------KFP-DLY
---------MEASNIRIDGTLLQ--ANKNKLVRVMGKCESFDHASNQAIIVCNGTIKLDLSQV--TDSPLE--IHKNYEIIGKV---SGDELKIFVYSVIELSD-----NLDINAASKLAQYAQ------KVS-ELY
-------MDNKSSTPRITCAYLS--QYVGKLVTVVGKVVQLRGEEATI---DADG-TIHAFLN--REAHLS--ANNGVQLIGKV---NPDLS-IKVLSSVDLGQ-----GVDYNLANAVVEVTH------RYK-PLF
-------MSEQLSTPRITAAYLD--NFVGRVVMLVGEVTQLRGDQATV---ESDG-TVTVLLN--RDAHLS--NGNYVQVIGKV---NPDLS-IKVLTSRDLGNSVDHGPFSQQTYDEDSQLSHIPSAQPYTP-PGW
--------MEQTSTPRVNCGLLD--SYVGRNVMVVGRVQQL---RGDVALIDADG-NVTANLN--RDSHLL--VGNAAQIIGKV---NPDLT-IKVLSSHDLGP-----NVDMNVSRAVVETSQ------KLK-ALF
---------MSLQTPRVLPSHLH--AFSAPPVRLLGTVTAL--HGDTATITCGTHGDVTLILK--PDSHLQ--MGKLVEVVGKVAEIDGGLG-IRVLATTDWGN---PADCDYKIYEKVVDVTH------RLK-PIF
---------MERPTPRVTKDMLP--ECSGKTVRIVGKANQV--EGETAKVDSNGSFDMHLTVD--NTLE----PNHFYEFVVSV---KPDSS-VQLLTCVDFGT-----DIDMEVYQKLVLFSH------KYN-SLF
----MSNGIEQEVTPRVNSALLS--NFQGRTIRLACKLVKFN-DNGSLTVSAADGGQVVVQLV--GEHEPI--SDTYLEIVGKV---MDPTT-IQMRGCIGLGA-----DLDMKLVNDTINLIH----DERFYGRMF
---------MEKPTPLINSSMLG--QYVGQTVRIVGKVHKV--TGNTLLMQTSDLGNVEIAMT--PDSDVS--SSTFVEVTGKV---SDAGSSFQANQIREFTTVDCGHDVDLTLVENVVQISA------AFP-NLF
---MLFLVSYLVPMYSVDVE-----NCEGQDVVVIGRLERV---EDGVVVLKCMGREVQVRH---QGVELY--RPGLVRVRGTV----ENGV-LVESSVRPVGG-----EFDMEVYGRFVAIAA------KYP-DLF
Figure 2.25: Multiple sequence alignment of RPA3 ssDNA binding domain (DBD-E) from 36 diverse eukaryotes. Excavata are
highlighted with brown, Chromalveolata with orange, Archaeplastida with green, and Opisthokonta with purple. Shaded
regions indicate amino acids are 75% identical.
Table 2.2: Protein sequence comparisons between Saccharomyces cerevisiae and
Homo sapiens.

Protein   Length (aa)   Identity   S-W score   Number undetected   Number undetected
                                               observed            expected (90%)
A190      1664          0.38       3611         0                   0.01
A135      1203          0.44       3185         0                   0.02
AC40       335          0.46       1038         0                   1.38
AC19       215          0.44        612         0                   3.24
AC12.2     131          0.65        382         1                   5.13
Rpb5       142          0.45        364         3                   5.32
Rpb6        70          0.74        354         1                   5.43
Rpb8       146          0.36        289         0                   6.18
Rpb10      125          0.36        257         4                   6.59
Rpb12       70          0.41         13         5                  10.73
RPA1       621          0.32       1161         0                   1.08
RPA2       273          0.26        255         5                   6.62
RPA3       110          0.14         73        10                   9.52
Rad52      471          0.24        550        12                   2.96
Rad59      238          0.10         93        27                  11.12
Rad51      400          0.66       1538         1                   0.17
Rad55      406          0.17         98        12                  10.96
Rad57      460          0.26        295         5                   6.19
Dmc1       334          0.54       1178        10                   0.48
Hop2       218          0.21        161         8                   9.13
Mnd1       219          0.28        328         5                   5.63
Rad54      898          0.45       1984        10                   0.05
Rdh54      958          0.36       1524        15                   0.18

Note: Yeast RNA Polymerase I, Replication Protein A, and strand exchange component
amino acid lengths, their identities to human, Smith-Waterman scores, and the observed
numbers of absences among 34 taxa with at least 8.0x whole-genome shotgun sequencing
coverage (except for RPA3 and Rad59 in which H. sapiens was compared to Candida
albicans) are shown. Proteins in bold function only during meiosis in model organisms.
Table 2.3: Protein sequence comparisons between Homo sapiens and Oryza sativa.

Protein   Length (aa)   Identity   S-W score   Number undetected
RPA1       616          0.34       1292         0
RPA2       270          0.30        340         5
RPA3       121          0.21         81        10
Rad52      399          0.37        343        12
Rad51      339          0.69       1585         1
Rad55      280          0.26        199        12
Rad57      346          0.36        464         5
Dmc1       340          0.63       1321        10
Hop2       217          0.38        461         8
Mnd1       205          0.42        539         5
Rad54      747          0.47       1849        10
Rdh54      910          0.40       1638        15

Note: The lengths of Homo sapiens protein sequences, identities to Oryza sativa protein
sequences, Smith-Waterman scores (except for RPA3 and Rdh54 which were compared
to Physcomitrella patens, and Rad52 which was compared to Cyanidioschyzon merolae),
observed numbers of absences among 34 taxa with at least 8.0x whole-genome shotgun
sequencing coverage are shown. Proteins in bold function only during meiosis in model
organisms.
Table 2.4: Protein sequence comparisons between Oryza sativa and Saccharomyces
cerevisiae.

Protein   Length (aa)   Identity   S-W score     Number undetected
RPA1       656          0.31       1037           0
RPA2       279          0.24        275           5
RPA3       106          0.13       Unalignable   10
Rad52      318          0.42        394          12
Rad51      339          0.65       1455           1
Rad55      280          0.33         80          12
Rad57      290          0.31        229           5
Dmc1       344          0.54       1125          10
Hop2       227          0.23        160           8
Mnd1       207          0.28        255           5
Rad54      980          0.43       1562          10
Rdh54     1122          0.35       1277          15

Note: The lengths of Oryza sativa protein sequences, identities to Saccharomyces
cerevisiae protein sequences, Smith-Waterman scores (except Rad52, Rad55, and Rdh54
which were compared to Cyanidioschyzon merolae, Chlamydomonas reinhardtii, and
Physcomitrella, respectively), observed numbers of absences among 34 taxa with at least
8.0x whole-genome shotgun sequencing coverage are shown. Proteins in bold function
only during meiosis in model organisms.
Figure 2.26: Phylogenetic distribution among eukaryotes of RNA Polymerase I core
complex subunit genes. The names of genera, the numbers of completed or
nearly completed genome projects available for those genera, and the whole
genome shotgun equivalent coverage of the most complete genome project
are listed, except for Oryza, Mus, and Kluyveromyces, for which coverage was
unavailable, and Cyanidioschyzon and E. cuniculi, which were sequenced
from end to end with BAC and PCR. Grey regions indicate subunits shared
by RNA Polymerase II or III. Supergroups are presented with white text on
black background with a summary of the genes present. Symbols: ‘+’
indicates that the sequence was found and phylogenetically verified, ‘(-)’ indicates
that the sequence was not found and may be outside the calculated threshold of
detection, blank spaces indicate that sequences were not found and the genome
project has less than the equivalent of 8.0X whole genome shotgun
coverage. The tree is a cartoon that summarizes current literature (Simpson,
Inagaki, and Roger 2006; Baldauf 2008; Burki, Shalchian-Tabrizi, and
Pawlowski 2008; Kolisko et al. 2008; Timmermans et al. 2008; Minge et al.
2009; Reeb et al. 2009; Shadwick et al. 2009).
EXCAVATA
Giardia (3)(11.3X)
Trichomonas (1)(7.2X)
Trypanosoma (4)(8.0-10.0X)
Leishmania (6)(5.0X)
Naegleria (1)(8.6X)
CHROMALVEOLATA
Plasmodium (8)(8.0X)
Theileria (2)(8.0X)
Cryptosporidium (4)(13X)
Tetrahymena (1)(9.1X)
Paramecium (1)(8.0X)
Thalassiosira (1)(12.8X)
Phaeodactylum (1)(10.4X)
Phytophthora (3)(9.0X)
ARCHAEPLASTIDA
Arabidopsis (2)(8.0X)
Oryza (1)
Physcomitrella (1)(8.1X)
Chlamydomonas (1)(12.8X)
Ostreococcus (4)(8.8X)
Cyanidioschyzon (1)
OPISTHOKONTA
HOLOZOA
Homo (3)(5.1X)
Mus (2)
Monodelphis (1)(6.8X)
Gallus (1)(6.6X)
Xenopus (2)(7.7X)
Danio (1)(10X)
Strongylocentrotus (1)(8.0X)
Aedes (1)(7.6X)
Drosophila (12)(8.9X)
Caenorhabditis (2)(10.0X)
Apis (1)(7.5X)
Tribolium (1)(7.0X)
Nematostella (1)(7.8X)
Trichoplax (1)(8.1X)
Monosiga (1)(8.4X)
FUNGI
Saccharomyces (9)(10.2X)
Kluyveromyces (3)
Candida albicans (2)(10.0X)
Neurospora (3)(8.6X)
Gibberella (2)(10.0X)
Magnaporthe (1)(7.0X)
Aspergillus (7)(8.9X)
Schizosacc. (4)(11.8X)
Coprinus (1)(10.0X)
Ustilago (1)(10.0X)
Encephalitozoon (1)
AMOEBOZOA
Dictyostelium (2)(8.3X)
Entamoeba (5)(8.0X)
Subunit columns: A190, A135, AC40, AC19, A12.2, Rpb5, Rpb6, Rpb8, Rpb10, Rpb12.
[Presence/absence matrix: the per-genus ‘+’ and ‘(-)’ entries are not recoverable from the extracted text.]
Figure 2.27: Number of detection failures for RNA Polymerase I, RPA and SE
proteins as predicted by Poisson regression analysis compared with
observed numbers of detection failures. (a.) Poisson regression analyses
were performed using the numbers of failures to detect RNA Polymerase I
subunits (A190, A135, AC40, AC19, AC12.2, Rpb5, Rpb6, Rpb8, Rpb10,
and Rpb12) among 34 genera with at least one genome of 8.0X whole-genome
shotgun sequencing coverage (or sequenced from end-to-end)
relative to their Smith-Waterman scores. The predicted numbers of failures
relative to Smith-Waterman scores (black dots) are plotted with Wald 90%
confidence limits (green dots). The observed numbers of RNA Polymerase
I subunit detection failures are indicated with open circles. (b.) The numbers
of Replication Protein A (RPA1-3) subunit detection failures observed (open
circles) compared with the Poisson regression predictions obtained from
analyses of the RNA Polymerase I dataset. (c.) The observed numbers of
detection failures among strand exchange components (Rad59, Rad52,
Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, Rad54, and Rdh54) (open
circles) compared with Poisson regression predictions calculated from a
combined RNA Polymerase I and Replication Protein A dataset.
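The regression in panel (a.) can be illustrated with a minimal sketch. It assumes a log-link Poisson model with the Smith-Waterman score as the only predictor and fits it by damped Newton iterations using the ten RNA Polymerase I rows of Table 2.2. The statistical software actually used, the model details, and the calculation of the Wald confidence limits are not reproduced here; the function names and the rescaling of scores to thousands are choices made for this example only.

```python
import math

# Smith-Waterman scores and observed detection failures for the ten RNA
# Polymerase I subunits, transcribed from Table 2.2 (A190 ... Rpb12).
sw_scores = [3611, 3185, 1038, 612, 382, 364, 354, 289, 257, 13]
failures = [0, 0, 0, 0, 1, 3, 1, 0, 4, 5]

def log_likelihood(b0, b1, xs, ys):
    """Poisson log-likelihood, omitting the constant log(y!) term."""
    return sum(y * (b0 + b1 * x) - math.exp(b0 + b1 * x) for x, y in zip(xs, ys))

def fit_poisson(xs, ys, tol=1e-10, max_iter=200):
    """Fit log(mu) = b0 + b1 * x by Newton's method with step halving."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood.
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mu))
        # Information matrix (negative Hessian), 2 x 2.
        h00 = sum(mu)
        h01 = sum(x * m for x, m in zip(xs, mu))
        h11 = sum(x * x * m for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        # Halve the step until the log-likelihood does not decrease.
        step, base = 1.0, log_likelihood(b0, b1, xs, ys)
        while (log_likelihood(b0 + step * d0, b1 + step * d1, xs, ys) < base - 1e-12
               and step > 1e-12):
            step /= 2.0
        b0, b1 = b0 + step * d0, b1 + step * d1
        if abs(step * d0) + abs(step * d1) < tol:
            break
    return b0, b1

def predict(b0, b1, score):
    """Predicted number of detection failures for a raw S-W score."""
    return math.exp(b0 + b1 * score / 1000.0)

# Rescale scores to thousands (an illustrative choice for numerical stability).
scaled = [s / 1000.0 for s in sw_scores]
b0, b1 = fit_poisson(scaled, failures)
fitted = [predict(b0, b1, s) for s in sw_scores]
for score, obs, fit in zip(sw_scores, failures, fitted):
    print(f"S-W {score:5d}  observed {obs}  predicted {fit:.2f}")
```

Under this model a negative slope means that proteins with lower Smith-Waterman scores are expected to go undetected in more genomes, which is the relationship the figure uses to judge whether an observed absence exceeds what detection failure alone would predict.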
Table 2.5: Saccharomyces cerevisiae strand exchange gene mutant phenotypes, suppressors, and meiotic functions of their
products.

Gene: Rpa1, Rpa2, Rpa3
Mutant phenotype (mitosis): Abnormal growth, and arrest during S- or M- phases (Brill and Stillman 1991), UV and MMS
sensitive, and deficient in homologous recombination (Umezu et al. 1998)
Mutant phenotype (meiosis): rpa1 mutations result in reduced sporulation efficiency, severely reduced spore viability,
and defective recombination (Soustelle et al. 2002)
Meiotic function: Form heterotrimeric complexes that bind ssDNA and recruit Rad52 to the Rpa-ssDNA complex
(Firmenich, Elias-Arnanz, and Berg 1995; Gasior et al. 1998; Hays et al. 1998)
Suppressor: None found

Gene: Rad52
Mutant phenotype (mitosis): Increased sensitivity to ionizing radiation (Game and Mortimer 1974; Saeki, Machida, and
Nakai 1980) and reduced spontaneous recombination (Petes, Malone, and Symington 1991)
Mutant phenotype (meiosis): Reduced ability to sporulate, greatly reduced spore viability (Game and Mortimer 1974),
and reduced meiotic recombination (Petes, Malone, and Symington 1991)
Meiotic function: Forms heptamers that mediate displacement of RPA from ssDNA and recruits Rad51 (Shinohara,
Ogawa, and Ogawa 1992; Milne and Weaver 1993; Hays, Firmenich, and Berg 1995; Sung 1997; Octobre et al. 2008)
Suppressor: Rad51 overexpression (Milne and Weaver 1993; Schild 1995; Krejci et al. 2002)

Gene: Rad59
Mutant phenotype (mitosis): Increased sensitivity to IR and mildly defective recombination (Bai and Symington 1996;
Davis and Symington 2001; Davis and Symington 2003)
Mutant phenotype (meiosis): Slightly reduced sporulation efficiency and spore viability (Bai and Symington 1996)
Meiotic function: Forms homomeric rings or heteromeric rings with Rad52, functions partially overlap with Rad52, may
stimulate or augment Rad52 functions (Bai and Symington 1996; Davis and Symington 2001; Pannunzio, Manthey, and
Bailis 2008)
Suppressor: Rad52 overexpression (Bai and Symington 1996)

Gene: Rad51
Mutant phenotype (mitosis): Increased sensitivity to ionizing radiation (Saeki, Machida, and Nakai 1980) and reduced
spontaneous recombination (Petes, Malone, and Symington 1991)
Mutant phenotype (meiosis): Decreased recombination, reduced spore viability (Petes, Malone, and Symington 1991),
and failure to form Dmc1 foci (Bishop 1994)
Meiotic function: Forms helical filaments on ss- and dsDNA, catalyzes strand exchange, causes ssDNA extension and
dsDNA rotational transition, may recruit Dmc1 to the pre-synaptic filament during meiosis (Nishinaka et al. 1998;
Krogh and Symington 2004; Lopez-Casamichana et al. 2008)
Suppressor: None found

Gene: Dmc1
Mutant phenotype (mitosis): None (Bishop et al. 1992)
Mutant phenotype (meiosis): Defective recombination and accumulation of double-strand break recombination
intermediates, failure to form normal synaptonemal complexes, and arrest late in prophase (Bishop 1994)
Meiotic function: Meiosis-specific protein with function similar to Rad51 (Bishop et al. 1992; Bishop 1994; Bishop et
al. 1999; Sehorn et al. 2004; Sauvageau et al. 2005)
Suppressor: Rad54 or Rad51 overexpression (Bishop et al. 1999; Tsubouchi and Roeder 2003)

Gene: Rad55, Rad57
Mutant phenotype (mitosis): Increased sensitivity to ionizing radiation (Game and Mortimer 1974) and reduced
spontaneous recombination (Petes, Malone, and Symington 1991)
Mutant phenotype (meiosis): Reduced spore viability (rad55 <25% and rad57 <3%) (Game and Mortimer 1974),
decreased recombination, and failure to form Rad51 foci (Petes, Malone, and Symington 1991; Krogh and Symington
2004)
Meiotic function: Form heterodimers, stabilize Rad51-ssDNA pre-synaptic filaments (Hays, Firmenich, and Berg 1995;
Bai, Davis, and Symington 1999; Bleuyard, Gallego, and White 2006; Filippo, Sung, and Klein 2008)
Suppressor: Rad51 or Rad52 overexpression (Hays, Firmenich, and Berg 1995; Johnson and Symington 1995; Schild
and Wiese 2009)

Gene: Hop2, Mnd1
Mutant phenotype (mitosis): None (Leu, Chua, and Roeder 1998; Tsubouchi and Roeder 2002)
Mutant phenotype (meiosis): Defective recombination and inappropriate pairing of homologs (Leu, Chua, and Roeder
1998; Chen et al. 2004)
Meiotic function: Form heterodimers, stabilize presynaptic filaments, capture dsDNA only during meiosis in model
organisms (Tsubouchi and Roeder 2002; Chen et al. 2004; Henry et al. 2006)
Suppressor: Rad51 overexpression (Henry et al. 2006)

Gene: Rad54
Mutant phenotype (mitosis): Increased sensitivity to ionizing radiation (Game and Mortimer 1974; Saeki, Machida, and
Nakai 1980) and MMS (Klein 1997), reduced sister chromatid recombination (Petes, Malone, and Symington 1991), and
accumulation of Rad51 foci (Arbel, Zenvirth, and Simchen 1999; Shinohara et al. 2000)
Mutant phenotype (meiosis): 30-100% reduced spore viability (Game and Mortimer 1974)
Meiotic function: Forms homodimer/oligo, stimulates D-loop formation, dissociates Rad51-dsDNA complex
(Petukhova, Stratton, and Sung 1998; Petukhova et al. 1999; Kiianitsa, Solinger, and Heyer 2002)
Suppressor: rad52, rad51, rad55, rad57 functional mutations (Klein 1997)

Gene: Rdh54
Mutant phenotype (mitosis): Diploid-specific MMS sensitivity and reduced growth (Klein 1997)
Mutant phenotype (meiosis): Reduced sporulation and spore viability (Klein 1997)
Meiotic function: Translocation activity stimulates D-loop formation and displacement of recombinational
intermediates (Chi et al. 2006). Functions in diploid-specific mitotic recombination and is required for complete
meiotic viability (Klein 1997)
Suppressor: rad52, rad51, rad55, rad57 functional mutations (Klein 1997)
Table 2.6: The most complete genomes of the genera searched during this study with
web addresses.

Taxon                Web address
Trichomonas          http://trichdb.org/trichdb/
Giardia              http://giardiadb.org/giardiadb/
Naegleria            http://genome.jgi-psf.org/Naegr1/Naegr1.home.html
Trypanosoma          http://tritrypdb.org/tritrypdb/
Leishmania           http://tritrypdb.org/tritrypdb/
Plasmodium           http://plasmodb.org/plasmo/
Theileria            http://www.sanger.ac.uk/Projects/T_annulata/
Cryptosporidium      http://cryptodb.org/cryptodb/
Tetrahymena          http://www.ciliate.org/
Paramecium           http://paramecium.cgm.cnrs-gif.fr/
Thalassiosira        http://genome.jgi-psf.org/Thaps3/Thaps3.home.html
Phaeodactylum        http://genome.jgi-psf.org/Phatr2/Phatr2.home.html
Phytophthora         http://genome.jgi-psf.org/Physo1_1/Physo1_1.home.html
Arabidopsis          http://www.arabidopsis.org/
Oryza                http://www.plantgdb.org/OsGDB/
Physcomitrella       http://genome.jgi-psf.org/Phypa1_1/Phypa1_1.home.html
Chlamydomonas        http://genome.jgi-psf.org/Chlre4/Chlre4.home.html
Ostreococcus         http://genome.jgi-psf.org/Ost9901_3/Ost9901_3.home.html
Cyanidioschyzon      http://merolae.biol.s.u-tokyo.ac.jp/
Homo                 http://genome.ucsc.edu/cgi-bin/hgGateway
Mus                  http://uswest.ensembl.org/Mus_musculus/Info/Index
Monodelphis          http://www.broadinstitute.org/mammals/opossum
Gallus               http://genome.ucsc.edu/cgi-bin/hgGateway?org=chicken
Xenopus              http://genome.jgi-psf.org/Xentr4/Xentr4.home.html
Danio                http://www.sanger.ac.uk/Projects/D_rerio/
Strongylocentrotus   http://www.hgsc.bcm.tmc.edu/project-species-oStrongylocentrotus%20purpuratus.hgsc?pageLocation=Strongylocentrotus%20purpuratus
Aedes                http://www.nd.edu/~dseverso/genome.html
Drosophila           http://flybase.org/blast/
Caenorhabditis       http://www.wormbase.org/
Apis                 http://www.hgsc.bcm.tmc.edu/project-species-iApis%20mellifera.hgsc?pageLocation=Apis%20mellifera
Tribolium            http://www.hgsc.bcm.tmc.edu/project-species-iTribolium%20castaneum.hgsc?pageLocation=Tribolium%20castaneum
Nematostella         http://genome.jgi-psf.org/Nemve1/Nemve1.home.html
Monosiga             http://genome.jgi-psf.org/Monbr1/Monbr1.home.html
Trichoplax           http://genome.jgi-psf.org/Triad1/Triad1.home.html
Saccharomyces        http://www.yeastgenome.org/
Kluyveromyces        http://www.genome.jp/kegg-bin/show_organism?org=kla
Candida albicans     http://www.candidagenome.org/
Magnaporthe          http://www.broadinstitute.org/annotation/fungi/magnaporthe/
Neurospora           http://www.broadinstitute.org/annotation/genome/neurospora/MultiHome.html
Gibberella           http://www.broadinstitute.org/annotation/genome/fusarium_verticillioides/Info.html
Aspergillus          http://genome.jgi-psf.org/Aspni5/Aspni5.home.html
Schizosacc.          http://www.sanger.ac.uk/Projects/S_pombe/
Coprinus             http://www.broadinstitute.org/annotation/genome/coprinus_cinereus/MultiHome.html
Ustilago             http://www.broadinstitute.org/annotation/genome/ustilago_maydis/Home.html
Encephalitozoon      http://www.genome.jp/kegg-bin/show_organism?org=ecu
Dictyostelium        http://genome.jgi-psf.org/Dicpu1/Dicpu1.home.html
Entamoeba            http://www.sanger.ac.uk/Projects/Comp_Entamoeba/
CHAPTER 3
PHYLOGENETIC ANALYSIS OF RECA HOMOLOGS
RAD51 AND DMC1 FROM ALL SUPERGROUPS
PROVIDES EVIDENCE FOR MEIOSIS IN THE LAST
COMMON ANCESTOR OF EUKARYOTES
Background:
Genetic recombination is necessary for repair of DNA double-strand breaks,
introduced during replication or exposure to mutagens, in prokaryotes and eukaryotes
(West 1992; Bishop 1994; Sandler et al. 1996). Among eukaryotes, recombination is
also necessary for repair of DSBs introduced during meiosis to ensure accurate pairing
and segregation of chromosomes to opposite spindle poles during the first meiotic
division (Bishop et al. 1992; Grishchuk et al. 2004). Eubacterial recA, archaebacterial
RadA, and eukaryotic Rad51 and Dmc1 genes are homologs whose products are important
because they catalyze homologous DNA strand exchange during recombination (Stassen
et al. 1997; Lin et al. 2006). Rad51-ssDNA nucleoproteins seek out homologous
Rad51-dsDNA complexes, promoting DNA strand exchange (Krogh and Symington 2004).
Dmc1 functions similarly, promoting interhomolog DNA strand exchange but only
during meiosis in model organisms (Bishop et al. 1992; Paques and Haber 1999;
Symington 2002; Krogh and Symington 2004). Saccharomyces cerevisiae and
Arabidopsis thaliana rad51 mutants display increased sensitivity to DNA damaging
agents and diminished sporulation or fertility, as a result of reduced mitotic
recombination (Bishop 1994; Bleuyard, Gallego, and White 2006). Among vertebrates,
rad51 mutants have a lethal phenotype, indicating a possible dependence upon
recombination during growth and development (Tsuzuki et al. 1996). Homologous
recombination during meiosis is reduced or eliminated in animal, fungal, and
plant dmc1 mutants (Bishop et al. 1999; Tsubouchi and Roeder 2003). Available animal,
fungal, and plant Rad51 and Dmc1 protein sequences are highly conserved, with a great
degree of similarity and retention of motifs (Stassen et al. 1997). However, less is known
about Rad51 and Dmc1 among diverse protist lineages. It is necessary to include protists
in studies of eukaryotic evolution as they embody the greatest breadth of eukaryotes and
their genes may encode products with deviant functions (Sogin 1991; Dacks and Doolittle
2001). We present analyses of the distribution, molecular phylogenetic relationships, and
characteristics of Rad51 and Dmc1 protein sequences from organisms representing all
currently recognized eukaryotic supergroups (Opisthokonta, Amoebozoa, Excavata,
Chromalveolata, Rhizaria, and Archaeplastida) and a currently unclassified group, the
Apusozoa (Cavalier-Smith 2004; Adl et al. 2005; Baldauf 2008).
Previous studies confirmed the presence of Rad51 and Dmc1 in all but one
eukaryotic supergroup, Rhizaria, indicating that they likely arose early during eukaryotic
evolution (Ramesh, Malik, and Logsdon 2005; Lin et al. 2006; Malik et al. 2008). The
monophyly of Rad51 and Dmc1 has been demonstrated previously with phylogenetic
analyses (Komori et al. 2000; Ramesh, Malik, and Logsdon 2005; Lin et al. 2006; Malik
et al. 2008). The observations that homologous recombination is central to meiosis
(Paques and Haber 1999; Krogh and Symington 2004) and that Dmc1 catalyzes
interhomolog DNA strand exchange only during the first meiotic prophase (Bishop et al.
1992) have led to the inference that the presence of a Dmc1 gene in an organism indicates
that meiosis may occur (Ramesh, Malik, and Logsdon 2005). The existence of Dmc1 in
the putative early diverging eukaryotes Giardia intestinalis and Trichomonas vaginalis
has been cited as evidence of meiosis in the last common ancestor of eukaryotes
(Ramesh, Malik, and Logsdon 2005; Lin et al. 2006; Malik et al. 2008). This view is
supported by the presence of several other meiotic genes in G. intestinalis and T.
vaginalis (Ramesh, Malik, and Logsdon 2005; Malik et al. 2008). However, the status of
G. intestinalis and T. vaginalis as “primitive” eukaryotes is now dubious as different
hypotheses for rooting the evolutionary tree of eukaryotes have been proposed
(Cavalier-Smith 2002a; Stechmann and Cavalier-Smith 2002; Roger and Simpson 2009;
Cavalier-Smith 2010). The relatively recent morphological and molecular phylogenetic analyses
of unclassified eukaryotes, such as the Apusozoa, further revive the prospect that some
organisms may be primitively asexual, having diverged prior to the origin of Dmc1 genes
and, perhaps, meiosis. In the absence of a clearly established earliest-diverging branch
on the eukaryotic tree, it is necessary to include representatives of all known eukaryotic
supergroups to address the question of whether Dmc1 genes and meiosis were present in
their last common ancestor.
Rad51 and Dmc1 protein sequences are well conserved, approximately 350 amino
acids long, and may be distinguished by inspection of multiple sequence alignments. In
addition, duplications of Rad51 and Dmc1 genes appear rare and, where present, seem to
have occurred recently during eukaryotic evolution (Maeshima et al. 1995; Kathiresan,
Khush, and Bennett 2002; Ramesh, Malik, and Logsdon 2005; Malik et al. 2008). Only
one absence of Rad51 genes, in G. intestinalis, has been confirmed (Ramesh, Malik, and
Logsdon 2005). Rad51 and Dmc1 genes are themselves paralogs, which means that it
might be possible to determine which eukaryotes represent the earliest-diverging lineages
with reciprocal rooting (Gogarten et al. 1989; Iwabe et al. 1989; Iwabe et al. 1991).
These characteristics make Rad51 and Dmc1 good candidates for phylogenetic analyses
(Baldauf and Palmer 1993). Several studies have determined that Rad51 and Dmc1
nucleotide and amino acid sequences are useful phylogenetic markers, resolving
relationships among animals, fungi, and plants (Stassen et al. 1997; Petersen and Seberg
2002; Petersen, Seberg, and Baden 2004). However, it is unknown whether Rad51
and/or Dmc1 protein sequence data will be useful for elucidating the relationships among
eukaryotic supergroups, or for the placement of unclassified organisms within the
eukaryotic tree of life.
We collected 99 Rad51 and 51 Dmc1 protein sequences (representing 97 and 50
genera, respectively) from six eukaryotic supergroups and Apusozoa. Among these
sequences, degenerate PCR was used to isolate 21 new Rad51 sequences and 8 new
Dmc1 sequences from evolutionarily diverse representatives of the eukaryotic
supergroups Rhizaria, Excavata, Chromalveolata, Amoebozoa, and also unclassified
Apusozoa (Ancyromonas sp. and T. trahens sp.) for which genome sequence data were
unavailable. All publicly available nucleotide and protein sequence repositories were
also searched for homologs in diverse eukaryotes. To ensure that the breadth of sampling
was sufficient for a eukaryote-wide study of Rad51 and Dmc1, and given the abundance
of sequences from some eukaryotic groups (Fungi, Metazoa, Chloroplastida,
Kinetoplastida, and Apicomplexa), discrete datasets composed of exemplars were
collected for some over-represented groups, while exhaustive sequence data searches
were performed for all other groups (see Methods). Phylogenetic analyses revealed no
clear cases of lateral gene transfer of Rad51 or Dmc1 genes, indicating that vertical
transmission is the predominant (if not exclusive) mode of inheritance. In addition,
phylogenetic analyses of Rad51 and Dmc1 amino acid sequences indicated support for
five of the six currently proposed eukaryotic supergroups (Table 3.1).
We also scrutinized our alignments of all Rad51 and Dmc1 protein sequences
obtained and compared them to archaebacterial RadA and eubacterial RecA sequences.
Rad51 and Dmc1 protein sequences are highly conserved across all eukaryotic groups,
including functional motifs previously identified in archaebacterial RadA protein
sequences (Story, Weber, and Steitz 1992). In addition, we identify ten amino acid
residues conserved across all eukaryotic supergroups, but not among prokaryotes, which
may confer Rad51- and Dmc1-specific functions. Taken together, these data indicate that
the functions of Rad51 and Dmc1 are likely to be conserved across all eukaryotes. Thus
meiosis and mitosis most likely occurred in the last common ancestor of eukaryotes.
Results and discussion:
Phylogenetic analysis of Dmc1:
We analyzed the distribution of 51 Dmc1 genes from representatives of 50 genera,
42 of which were obtained from databases and 8 by degenerate PCR (Figures 3.1-3.6).
Dmc1 is present in representatives of all six currently recognized supergroups and the
unclassified Apusozoa. However, the distribution of the Dmc1 gene is uneven since it is
not detected in the genomes of entire groups of organisms, such as Diptera,
Sordariomycota, or Stramenopila (except for oomycetes). Failure to detect Dmc1 among
more stramenopiles is most parsimoniously interpreted as a loss following the divergence
of oomycetes (Brown and Sorhannus 2010). Dmc1 gene losses have been confirmed in a
few organisms that are known to undergo meiosis (e.g. Caenorhabditis elegans and
Drosophila melanogaster) (Orr-Weaver 1995; Zalevsky et al. 1999). Therefore, meiosis
can be accomplished without Dmc1 in some organisms, and the absence of Dmc1 does not
necessarily indicate the absence of meiosis, since these sexual organisms have adapted to
its loss. However, since Dmc1 is known to function only during meiosis, it is likely
that the presence of the Dmc1 gene indicates that meiosis occurs (Bishop et al. 1992;
Proudfoot and McCulloch 2006).
Phylogenetic analyses of Dmc1 protein sequences consistently yield a single,
distinct monophyletic group (Figures 3.5-3.7), indicating that the Dmc1 gene arose once
during the evolutionary history of extant eukaryotes. Most organisms have a single copy
of the Dmc1 gene within their genomes. Subsequent duplications of the Dmc1 gene
appear to be rare, with recent duplications detected only in the genomes of G. intestinalis
(Excavata) and Oryza sativa (Archaeplastida) (Kathiresan, Khush, and Bennett 2002;
Ramesh, Malik, and Logsdon 2005; Malik et al. 2008). Interestingly, G. intestinalis is
also the only organism with a confirmed absence of the Rad51 gene from its genome
(Ramesh, Malik, and Logsdon 2005), but whether these observations are related is
currently unknown.
Phylogenetic analysis of Rad51:
We analyzed the phylogenetic relationships among 99 Rad51 protein sequences
representing 97 genera (78 from databases and 21 inferred from degenerate PCR (data
not shown), Figures 3.6-3.10). Rad51 genes were retrieved from the genomes of
organisms representing every currently recognized eukaryotic supergroup and two
Apusozoa (T. trahens and Ancyromonas). Unlike the Dmc1 gene, the Rad51 gene appears to
be present in most organisms, and so far is known to be absent only from the genome of G.
intestinalis. We also performed an extensive search for Rad51 in a related
diplomonad, Spironucleus vortens (Jorgensen and Sterud 2007): we
explored all nucleotide, protein, and EST sequence databases and attempted to
amplify Rad51 with degenerate PCR, but no Rad51 gene sequences were recovered.
The Rad51 gene may therefore have been lost prior to the divergence of G. intestinalis and S. vortens.
Like the Dmc1 gene, duplications of the Rad51 gene appear to be rare and relatively recent,
with paralogs present only in Archaeplastida (Physcomitrella patens, Oryza sativa, and
Zea mays), Xenopus laevis (Opisthokonta), and T. vaginalis (Excavata) (Maeshima et al.
1995; Stassen et al. 1997; Malik et al. 2008). One of the T. vaginalis Rad51 gene copies
is a pseudogene, but both of the Xenopus Rad51 genes seem to encode functional
products and are expressed (Maeshima et al. 1995; Malik et al. 2008). Our analyses also
indicate no clear cases of lateral transfer of Rad51, although Rad51 was found in the
nucleomorph genomes of Bigelowiella (Rhizaria) and the cryptophytes Hemiselmis
andersenii and Guillardia theta (both Chromalveolata) (Figure 3.6). Overall, our results
suggest that the Rad51 gene has been transmitted vertically and arose once, prior to the
divergence of extant eukaryotes.
Phylogenetic analyses of Rad51 and Dmc1:
Recently, many relationships among diverse eukaryotes have been determined by
phylogenetic analyses performed on multiple concatenated protein sequences (Figure 3.8)
(Burki and Pawlowski 2006; Kim, Simpson, and Graham 2006; Burki et al. 2007;
Moreira et al. 2007; Burki, Shalchian-Tabrizi, and Pawlowski 2008; Yoon et al. 2008;
Reeb et al. 2009; Parfrey et al.). These eukaryotic phylogenies provide references for
assessing the utility of individual nucleotide or protein sequence datasets as phylogenetic
markers. We performed extensive phylogenetic analyses on Rad51 and Dmc1 individual
and concatenated protein sequence datasets to test their phylogenetic utility (Figures 3.1-3.12 and Table 3.1).
The eukaryotic supergroup Opisthokonta (comprising Animalia, Fungi and
several protist groups) is unified by flat mitochondrial cristae and a 12-amino acid
insertion in the translation elongation factor 1α (Baldauf and Palmer 1993; Adl et al.
2005; Steenkamp, Wright, and Baldauf 2006). Phylogenetic analyses typically provide
strong support for topologies unifying animals and fungi, confirming these observations
(Cavalier-Smith 1987c; Baldauf and Palmer 1993; Steenkamp, Wright, and Baldauf
2006). We obtained strong support with both maximum likelihood and Bayesian
phylogenetic approaches for the monophyly of Metazoa and Fungi with Dmc1 and
concatenated protein sequence alignments (Figures 1.2 and 3.5, Table 3.1). Although
opisthokont unity was not formally observed with the Rad51 protein sequence dataset, which
includes the choanoflagellate Monosiga brevicollis, this resulted from the likely
erroneous placement of the apusomonad T. trahens within this group (but see below)
(Figure 3.8) (Adl et al. 2005).
The Unikont hypothesis proposes that the eukaryotic supergroups Opisthokonta
and Amoebozoa are monophyletic, on the basis that they ancestrally possessed a single
flagellum (unlike the “bikont” Excavata, Archaeplastida, Chromalveolata, Rhizaria, and
Apusozoa that mostly have two flagella) and three fused genes (carbamoyl-phosphate
synthase, dihydroorotase, and aspartate carbamoyltransferase), the likely result of two
rare gene fusion events (Cavalier-Smith 2002a; Stechmann and Cavalier-Smith 2002;
Cavalier-Smith 2003a; Stechmann and Cavalier-Smith 2003b). Phylogenetic analyses
have supported the “Unikont hypothesis” (Stechmann and Cavalier-Smith 2003b; Burki,
Shalchian-Tabrizi, and Pawlowski 2008); however, recent phylogenetic analyses retrieve
topologies in which unclassified Apusozoa (Ancyromonas and T. trahens) are
monophyletic and closely related to Opisthokonts (Kim, Simpson, and Graham 2006;
Cavalier-Smith 2010). While none of our analyses retrieved topologies consistent with a
common origin of the Apusozoa, Ancyromonas and T. trahens, our Bayesian analysis of
the individual Rad51 protein sequences strongly supports their inclusion in the Unikont
clade (Figure 3.8 and Table 3.1), and analysis of concatenated Rad51 and Dmc1 proteins
moderately supports the inclusion of Ancyromonas in the Unikonts (Figure 3.11 and
Table 3.1). In addition to having two emergent flagella (instead of one flagellum like
other Unikonts), Apusozoa also lack the three-gene fusion. Instead, they share a fusion of
two genes (dihydrofolate reductase and thymidylate synthase) that distinguishes Bikonts.
Unikonts may, therefore, represent a polyphyletic group if Apusozoa are sisters to
Opisthokonta (Figure 1.2) (Stechmann and Cavalier-Smith 2002; Stechmann and
Cavalier-Smith 2003a).
On the basis of strongly supported topologies obtained with molecular
phylogenetic analyses of many concatenated protein sequence alignments, a
“megagroup” of predominantly photosynthetic eukaryotes has been proposed (including
supergroups Archaeplastida, Chromalveolata, and Rhizaria) (Burki, Shalchian-Tabrizi,
and Pawlowski 2008). The supergroup Chromalveolata was proposed to include
secondarily photosynthetic eukaryotes (alveolates, stramenopiles, cryptomonads, and
haptophytes) that obtained plastids by endosymbiosis with red algae (Cavalier-Smith
2002b; Cavalier-Smith 2003b; Janouskovec et al. 2010) (Figure 1.2). However,
molecular phylogenetic analyses rarely support the monophyly of this group (Parfrey et
al. 2006). Despite the complexities of developing the protein targeting system observed
in nascent plastids, recent phylogenetic analyses suggest secondary photosynthesis
evolved at least twice during eukaryotic evolution (Keeling 2010).
We included a subset of chromalveolates (stramenopiles and alveolates) in our
analyses. Phylogenetic analysis of the concatenated Rad51 and Dmc1 protein sequence
dataset retrieved topologies consistent with the Chromalveolate hypothesis (Figure 3.11).
The phylogenies of the individual Dmc1 and Rad51 protein sequences both retrieve
discrepant topologies that support the grouping of stramenopiles with Chloroplastida,
while red algae are most closely related to stramenopiles in the Dmc1 phylogeny (Figure
3.1) and to alveolates in the Rad51 phylogeny (Figure 3.8). These topologies could be the
results of phylogenetic artifacts such as long-branch attraction (Felsenstein 2004).
However, it is noteworthy that Bayesian analyses of our concatenated Rad51 and Dmc1
dataset strongly support the monophyly of alveolates, stramenopiles, and Rhodophyceae,
and that Chloroplastida are grouped with Cercozoa (Rhizaria) (Figure 3.11). It has been
hypothesized that the difficulties of resolving relationships among secondarily
photosynthetic eukaryotes with multigene analyses may be due to the “mosaic” nature of
their nuclear genomes as a result of endosymbiotic gene transfer, resulting in conflicting
phylogenetic signals (Parfrey et al.). Our analyses of Rad51 and Dmc1 failed to support
subgroups within the photosynthetic megagroup such as SAR, in which Stramenopila,
Alveolata, and Rhizaria share a common ancestor, or the Archaeplastida, which all have
plastids obtained by primary endosymbiosis of a cyanobacterium (Adl et al. 2005;
Rodriguez-Ezpeleta et al. 2005; Burki, Shalchian-Tabrizi, and Pawlowski 2008; Parfrey
et al.). However, we did observe support for the monophyly of Cercozoa (Rhizaria),
stramenopiles and alveolates (Chromalveolata).
The eukaryotic supergroup Excavata (represented in our dataset by members of its
subgroups Discoba and Metamonada) was proposed to describe organisms with
suspension-feeding grooves (cytostomes) used to capture particles in a current produced
by anterior flagella (Figure 1.2) (Simpson 2003; Adl et al. 2005). Excavates include
organisms once considered to be among the earliest-diverging eukaryotes (e.g.
Euglenozoa, T. vaginalis, and G. intestinalis), based upon so-called “primitive” features
(like the apparent absence of organelles such as mitochondria) and early phylogenetic
analyses of small ribosomal subunit sequence data which retrieved topologies placing T.
vaginalis and G. intestinalis at the base of eukaryotic trees (Woese, Kandler, and Wheelis
1990; Tovar et al. 2003; Adl et al. 2005; Cavalier-Smith 2010). However, more recent
discoveries have cast doubt on the view that they represent “primitive” eukaryotes. G. intestinalis and
T. vaginalis do, indeed, possess highly derived mitochondria (mitosomes and
hydrogenosomes, respectively), and their placement at the base of rooted eukaryotic
phylogenetic trees was most likely caused by artifacts of the phylogenetic analysis
(Tovar et al. 2003; Felsenstein 2004; van der Giezen, Tovar, and Clark 2005). Similarly,
Microsporidia were later determined to be fungi with mitosomes (Cavalier-Smith 1989;
Hirt et al. 1999). If T. vaginalis and G. intestinalis (or any of Excavata) are the earliest-diverging
eukaryotes, then Excavata would represent a paraphyletic group (a common
ancestor plus some but not all of its descendants) whose members diverged separately at
the base of the eukaryotic phylogenetic tree, i.e., very early during the evolution of
eukaryotes. However, recent phylogenetic analyses retrieve topologies that are consistent
with the monophyly of Excavata (Burki et al. 2007; Burki, Shalchian-Tabrizi, and
Pawlowski 2008; Hampl et al. 2009; Parfrey et al.). Our phylogenetic analysis of the
Dmc1 protein sequence dataset also supports the monophyly of Excavata, although it is
not resolved by Rad51 protein sequences (Figures 3.1 and 3.8). In an attempt to
determine the earliest-diverging eukaryotic lineages we performed analyses in which one
paralog was used to root the other, rather than assigning a root (Gogarten et al. 1989;
Iwabe et al. 1989; Iwabe et al. 1991). However, the topologies retrieved with reciprocal
rooting of Rad51 and Dmc1 protein sequences are poorly supported and discordant
(Figures 3.5-3.7).
Characteristics of Rad51 and Dmc1 protein sequences:
We aligned Rad51 and Dmc1 protein sequences from representatives of all known
eukaryotic supergroups and Apusozoa with representative archaebacteria (Nitrosopumilus
maritimus, Cenarchaeum symbiosum, Pyrobaculum islandicum, Candidatus
Korarchaeum cryptofilum, Aeropyrum pernix, Nanoarchaeum equitans, and
Methanocaldococcus fervens) and eubacteria (Bacillus amyloliquefaciens and Thermus
thermophilus) (Figure 3.13). Visual inspection of the central domains responsible for
recombinase activity of RecA, RadA, Rad51 and Dmc1 proteins indicates that the amino
acid sequences are well conserved in all domains of life (Story, Weber, and Steitz 1992).
Several motifs important for RecA function are highly conserved among eukaryotes. In
addition, archaebacterial RadA sequences contain all of the described functional motifs
(Chen et al. 2007); it is likely that these motifs were present in the common ancestor of
archaebacteria and eukaryotes, and thus were present in the last eukaryotic common
ancestor.
Although Rad51 and Dmc1 perform very similar functions, Rad51 catalyzes DNA
strand exchange during both mitosis and meiosis, while Dmc1 functions in interhomolog
DNA strand exchange exclusively during meiosis. Specific interactions of Rad51
and Dmc1 with each other, with other proteins, and with DNA are required for successful
completion of meiotic recombination (Krejci et al. 2001; Shin et al. 2003; Sugawara,
Wang, and Haber 2003). However, the basis of these interactions remains largely
unknown, especially for those interactions that distinguish Rad51 from Dmc1 function.
We examined our multiple sequence alignments for conserved amino acid residues
specific to Rad51 or Dmc1, which might confer Rad51- or Dmc1-specific activity.
Comparison of the central domains of Rad51 and Dmc1 protein sequences from all of our
representatives of six eukaryotic supergroups and Apusozoa indicates that they are conserved,
likely due to common ancestry and functional constraints (summarized in Figure 3.14).
By identifying residues conserved in one protein but variable or different in the other, we
can generate hypotheses for future functional studies. Comparing protein sequences from
representatives of the entire breadth of eukaryotic diversity enables us to pinpoint
residues fundamental to Rad51 or Dmc1 function.
To examine amino acid conservation, we analyzed an alignment of 98 Rad51 and
51 Dmc1 protein sequences from all eukaryotic supergroups and Apusozoa (Figure 3.13).
The central domain (S. cerevisiae Rad51 amino acid positions 90-397) was examined
because it is conserved in all RecA homologs. All groups except Apusozoa and Rhizaria
were represented at each amino acid position studied. Apusozoa were represented from
S. cerevisiae Rad51 amino acid positions 126-356 for the aligned Rad51 proteins and
positions 188-324 for the aligned Dmc1 proteins; while Rhizaria were represented from
positions 188-397 in the Dmc1 alignment. We identified 18 amino acids that are
completely conserved among Rad51 sequences, and 15 that are completely conserved among
Dmc1 sequences. Seven residues are present in at least 95% of Rad51 protein sequences but
are either different or variable among Dmc1 sequences; conversely, only three residues are
present in at least 95% of Dmc1 sequences but different or variable among Rad51 sequences.
We found no cases in which a residue is ≥ 95% conserved in
one protein dataset and a different residue is ≥ 95% conserved in the other dataset.
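The 95% conservation criterion described above can be expressed as a short computation. The following Python sketch is illustrative only and is not the procedure used in this study, in which alignments were examined by inspection. The function names, the toy alignments, and the decision to ignore gap characters are all assumptions made for the example.

```python
from collections import Counter


def column_conservation(alignment):
    """Return (most common residue, frequency) for every alignment column.

    Gap characters ('-') are ignored when counting (an assumption).
    """
    result = []
    for column in zip(*alignment):
        residues = [aa for aa in column if aa != "-"]
        if not residues:
            result.append((None, 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        result.append((residue, count / len(residues)))
    return result


def specific_sites(target, other, threshold=0.95):
    """Columns conserved at >= threshold in `target` where `other` does not
    have the same residue conserved at >= threshold (different or variable)."""
    sites = []
    pairs = zip(column_conservation(target), column_conservation(other))
    for index, ((res_t, freq_t), (res_o, freq_o)) in enumerate(pairs):
        same_and_conserved = res_o == res_t and freq_o >= threshold
        if freq_t >= threshold and not same_and_conserved:
            sites.append(index)
    return sites


# Hypothetical five-column alignments (both taken from one joint alignment).
rad51_toy = ["GKTQL", "GKTQL", "GKTQM"]
dmc1_toy = ["GRSQL", "GKAQL", "GHTQL"]

print(specific_sites(rad51_toy, dmc1_toy))  # [1, 2]
print(specific_sites(dmc1_toy, rad51_toy))  # [4]
```

Column indices are zero-based positions in the joint alignment; mapping them to S. cerevisiae Rad51 coordinates would require an additional lookup that is not shown here.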
Studies in which the structures of RecA, RadA, Rad51, and Dmc1 have been
analyzed have resulted in the identification of several important functional motifs and
amino acid residues (Table 3.3) (Story, Weber, and Steitz 1992; Aihara et al. 1999;
Pellegrini et al. 2002; Conway et al. 2004; Chen et al. 2007; Chen, Yang, and Pavletich
2008; Okorokov et al. 2010). Residues identified with these methods are also highly
conserved (often 100%) in our sequence alignments. Five sites involved in ATP binding
(G185, D219, E221, D280, and S281) and three sites involved in DNA binding (N325,
G346, and G347) are present in all RecA, RadA, Rad51, and Dmc1 protein sequences
studied here. However, specific interactions have not been proposed for several sites that
we have determined are likely to be involved in Rad51- or Dmc1-specific activities
(Table 3.3).
Conclusions:
We isolated 8 Dmc1 and 21 Rad51 genes with degenerate PCR from eukaryotes
representing four of the six currently recognized supergroups (Amoebozoa, Excavata,
Chromalveolata, and Rhizaria) and the unplaced Apusozoa. In addition, we performed
extensive searches of all publicly available nucleotide and amino acid sequence
repositories and collected a total of 51 Dmc1 and 99 Rad51 sequences
(representing 50 and 97 genera, respectively). Our phylogenetic analyses indicate support
for all eukaryotic supergroups (Opisthokonta, Amoebozoa, Excavata, Chromalveolata,
and Rhizaria) except Archaeplastida (Table 3.1).
However, support was strongest for the supergroup Opisthokonta, which was retrieved
with phylogenetic analysis of Dmc1, Rad51, and concatenated protein sequences. These
results are consistent with previous studies in which the support for supergroups was
assessed (Parfrey et al. 2006). Dmc1 appears to retrieve known relationships well when
several protein sequences representing the greatest breadth of eukaryotes are available.
Consistent with the predictions of Stassen et al. (1997), our analyses of Rad51 proteins
retrieve “somewhat anomalous” phylogenies, most likely due to substitution rate
heterogeneity among taxa resulting in long-branch artifacts (Stassen et al. 1997;
Felsenstein 2004). Analysis of Rad51 and Dmc1 concatenated protein sequence data
provides better resolution of the evolutionary relationships of eukaryotes (Figure 3.11).
We aligned Rad51 and Dmc1 protein sequences from every eukaryotic
supergroup and members of the currently unclassified Apusozoa with bacterial RecA and
archaebacterial RadA protein sequences (Figure 3.13). Previously identified (Sandler et
al. 1996; Chen et al. 2007; Okorokov et al. 2010) functional motifs are present in all
Rad51, Dmc1 and RadA proteins sampled; thus, these motifs must have been present in
Rad51 and Dmc1 sequences of the last eukaryotic common ancestor. Furthermore, we
identified seven sites where the amino acids are conserved among Rad51 but not in
Dmc1, and three sites where the amino acids are conserved among Dmc1 but not in
Rad51. These amino acids are likely to be involved in functions that are specific to
Rad51 or Dmc1 but not both. Given the conservation of these amino acids in protein
sequences of diverse eukaryotes, they must have been present in the last eukaryotic
common ancestor as well. Thus, since both Rad51- and Dmc1-specific functions are
likely to have been present in the last eukaryotic common ancestor, the hypothesis that
Dmc1 was both present and functioning in a meiosis-specific role is supported by these
results.
Methods:
Database searches:
Keyword searches (e.g. S. cerevisiae Rad51) of the National Center for
Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov/) protein sequence database
retrieved Rad51 and Dmc1 protein sequences for representatives of animals, fungi, and
plants (Homo sapiens [Rad51 - accession number NP_002866 and Dmc1 - Q14565],
Saccharomyces cerevisiae [Rad51 - CAA45563 and Dmc1 - AAA34571], and Oryza
sativa [Rad51 - BAB85491 and Dmc1 - BAB85214]) (Aboussekhra et al. 1992; Bishop et
al. 1992; Collins et al. 2004; Sakane et al. 2008; Kudoh et al. 2009). In addition, the
clusters of euKaryotic Orthologous Groups of proteins (KOGs) database was searched
for each protein (Tatusov et al. 2003). Sequence identities were initially verified by
evaluating the results of bi-directional searches with the tBLASTn (Altschul et al. 1997)
option of the Basic Local Alignment Search Tool (BLAST), in which the translated
nucleotide database is searched using a protein query. Rad51 and Dmc1 protein
sequences collected in this manner were subsequently used as queries to search protein,
nucleotide, and expressed sequence tag (EST) databases at NCBI, the Institute for
Genomic Research (TIGR, www.tigr.org/tdb/euk, since moved to
compbio.dfci.harvard.edu/tgi/protist.html), the Joint Genome Institute (JGI, genome.jgi-psf.org),
the Canadian Protist EST Project (Taxonomically Broad Database,
tbestdb.bcm.umontreal.ca), Michigan State University Galdieria sulphuraria Database
(genomics.msu.edu/galdieria; Weber et al. 2004; Barbier et al. 2005), and the
Cyanidioschyzon merolae Genome Project (merolae.biol.s.u-tokyo.ac.jp/blast/blast.html;
Matsuzaki et al. 2004) with BLASTp, tBLASTn, and BLASTn, as necessary, for all
available Rad51 and Dmc1 sequences from January 2004 through April 2010. Due to the
abundance of sequences from a few eukaryotic groups (Fungi, Metazoa, Chloroplastida,
Kinetoplastida, and Apicomplexa), discrete datasets composed of exemplars were
collected for these groups, while exhaustive sequence data searches were performed for
all other groups, to ensure the breadth of sampling was sufficient for a eukaryote-wide
study of Rad51 and Dmc1. In case sequences from distantly-related organisms were
missed, additional searches were performed using protein sequence queries from
organisms likely to share more recent common ancestors: e.g. Trypanosoma brucei
(Rad51 CAA73605, Dmc1 XP_827266 (Berriman et al. 2005)) protein sequences were
used as additional queries for searches of sequences for a closely related kinetoplastid
protist, Leishmania major. Identities of sequences were again confirmed with bi-directional BLASTx and tBLASTn searches. When multiple sequences were found for a
species, only the most complete open reading frame or protein prediction was retained. If
no previously annotated protein sequence was available in a database (or, it was
apparently incorrectly annotated on the basis of protein sequence alignments with other
orthologs) then nucleotide sequences were annotated manually, using Sequencher v4.5
(Genecodes, Ann Arbor, MI). Exons were identified with the aid of inferred translations
from BLASTx pairwise comparisons to the NCBI protein sequence database and the
locations of putative intron splice donor and acceptor site sequences (e.g. G/GT to AG/G,
although others may be observed among diverse eukaryotes). Additional comparisons of
the inferred Rad51 and Dmc1 homologous amino acid sequences were performed with
alignments created using MUSCLE v3.7 (Edgar 2004) and observed with BioEdit
v7.0.5.3 (Hall 1999).
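The logic of a bi-directional search can be sketched as a reciprocal best-hit check: a candidate is retained only if, when it is searched back against the reference proteins, its best hit is the original query. The Python sketch below is an illustration under stated assumptions and not the exact acceptance criterion applied in this study. It does not run BLAST; the hit tables, sequence identifiers, and bit scores are hypothetical, as are the function names.

```python
def best_hit(hits):
    """Return the subject ID with the highest bit score, or None if empty."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_best_hits(forward, reverse):
    """Keep query -> subject pairs that are each other's best hit.

    forward: query ID -> list of (subject ID, bit score) in the target database
    reverse: subject ID -> list of (query ID, bit score) from the search back
    """
    confirmed = {}
    for query, hits in forward.items():
        subject = best_hit(hits)
        if subject is None:
            continue
        if best_hit(reverse.get(subject, [])) == query:
            confirmed[query] = subject
    return confirmed


# Hypothetical hit tables (identifiers and scores invented for illustration).
forward_hits = {
    "Sc_Rad51": [("cand_A", 310.0), ("cand_B", 180.0)],
    "Sc_Dmc1": [("cand_B", 290.0), ("cand_A", 200.0)],
    "Sc_Rad57": [("cand_A", 90.0)],
}
reverse_hits = {
    "cand_A": [("Sc_Rad51", 305.0), ("Sc_Dmc1", 210.0)],
    "cand_B": [("Sc_Dmc1", 280.0), ("Sc_Rad51", 175.0)],
}

print(reciprocal_best_hits(forward_hits, reverse_hits))
# {'Sc_Rad51': 'cand_A', 'Sc_Dmc1': 'cand_B'}
```

In this toy case the Rad57 query also hits cand_A, but the search back identifies Rad51 as the best match for cand_A, so the Rad57 assignment is rejected.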
Degenerate PCR:
DNA samples were obtained by collaboration with Jeff Cole and Robert
Molestina at the American Type Culture Collection (ATCC, Manassas, VA), mainly from
xenic monoprotistan cultures. PCR amplifications were performed using degenerate
oligonucleotide primers (i.e. primers designed corresponding to highly conserved regions
of protein sequence alignments which reflect the degeneracy of the genetic code, see
Table 3.2 and Figure 3.13 arrows) synthesized by Integrated DNA Technologies (IDT,
Coralville, IA). Degenerate PCR primers Forward 6 and 7 and Reverse 1 were designed
by JML and the remaining degenerate primers were designed by AWP (Table 3.2). Gene
fragments of Rad51 and Dmc1 homologs were amplified from total DNA by PCR from
representatives of four eukaryotic supergroups and Apusozoa (Figure 3.2).
Amplifications utilized 0.03 U/µl MasterTaq polymerase (5 Prime, Gaithersburg, MD)
according to the manufacturer’s instructions, 0.002 U/µl Stratagene Cloned Pfu (La Jolla,
CA) (to increase yields), 0.5 - 1 ng total DNA, 0.25 mM each dNTP (Stratagene), 1.5
mM MgCl2, and 10 µM each primer. Reaction conditions were 95 °C for 2 minutes
followed by 40 cycles of denaturation at 94 °C for 40 seconds,
annealing at 55 °C, 60 °C, or 65 °C (in replicate reactions) for 1 minute, and extension at 72 °C
starting at 1.5 minutes and adding 6 seconds per cycle, ending with 10 minutes at 72 °C,
in Eppendorf gradient Mastercyclers (Hamburg, Germany). Resulting PCR products
were analyzed for size on 2% agarose gels by electrophoresis. Initially, eight degenerate
primer combinations were tested for each sample. When necessary, additional primer
combinations were applied or nested amplifications were performed using diluted
(1:1000) PCR products. Subsequent amplifications extended coverage of target genes by
primer walking, using exact-match primers vs. degenerate primers in all possible
combinations. Amplicons for Perkinsus marinus Rad51 genes were obtained with exact-match
primers designed from non-overlapping partial sequences (NCBI GenInfo numbers
126277177 and 126301963, Table 3.2). Selected amplicons were fractionated and excised
from 0.5% NuSieve GTG: 0.5% low-melt agarose gels (BioWhittaker [Walkersville,
MD], Fisher [Pittsburgh, PA]) at 4 °C and 100 V for 40 minutes in 1x TAE buffer and
cloned directly into the pSC-A™ vector (StrataClone™ kit, Stratagene, La Jolla CA,
USA). Positive clones were identified by PCR with T3 and T7 primers to verify the
presence of appropriately sized inserts (cycling conditions: 94 °C for 2 minutes followed by
30 cycles at 94 °C for 1 minute, 57 °C for 1 minute, and 72 °C for 1.5 minutes, ending
with 72 °C for 5 minutes [Stratagene and Promega]). At least two clones per PCR product
were isolated with FastPlasmid Mini kits (5 Prime, Gaithersburg, MD) and sequenced in
each direction with ABI BigDye 3.1 reagents and T3 and T7 primers, on an ABI 3730
sequencer (Applied Biosystems [Foster City, CA]).
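Because degenerate primers reflect the degeneracy of the genetic code, each primer is in practice a pool of distinct oligonucleotides. The Python sketch below shows how the size of that pool follows from the standard IUPAC nucleotide ambiguity codes. It is offered as an illustration only: the example primer is hypothetical and is not one of the primers listed in Table 3.2, and the function names are assumptions.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every exact sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer: GGN (Gly) AAR (Lys) ACN (Thr) CA (first two bases of Gln).
example = "GGNAARACNCA"
print(degeneracy(example))  # 32
print(expand("AAR"))        # ['AAA', 'AAG']
```

Lower degeneracy generally means a larger fraction of the primer pool matches any given template, which is one reason conserved regions with few synonymous codon choices are attractive primer sites.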
Nucleotide sequence data were assembled with Sequencher v4.5 (Genecodes, Ann
Arbor, MI) and the identities were initially verified with BLASTx searches in NCBI. If
either Rad51 or Dmc1 gene sequence fragments were isolated, but not both genes, then
single sequences from four or five additional clones were obtained to detect the other
paralog. In total, sequences generated from both strands for at least three clones per gene
were obtained. Nucleotide sequences were annotated and inferred exons were translated
to proteins as described above (Database Searches).
Phylogenetic analyses:
We aligned all potential eukaryotic Rad51 and Dmc1 protein sequences with
archaebacterial RadA protein sequences using MUSCLE v3.7, manually edited them by
removing ambiguously aligned columns and gaps in BioEdit v7.0.5.3 (Hall 1999; Edgar
2004), and performed phylogenetic analyses on the multiple sequence alignment.
Optimal protein substitution models and parameters were determined for each alignment
independently with Modelgenerator v0.85 (Keane et al. 2006). Analyses were performed
with PhyML v3.0 (Guindon et al. 2009) for 1000 replicates, and PhyloBayes v3.1
(Lartillot, Lepage, and Blanquart 2009), which used at least two independent converged
chains in which maximum differences observed across all bipartitions were less than
0.10. Every other tree after burnins (selected to minimize the differences across all
bipartitions) was used to calculate consensus tree topologies. Only sequences that
unambiguously grouped as either Rad51 or Dmc1 were retained, while those that did not
most likely represented other Rad51 paralogs such as Rad55 or Rad57 (Lin et al. 2006)
and were removed prior to subsequent analysis (Figure 3.2). Uncorrected pairwise
protein sequence distances were calculated with ClustalX v2.0.12 (Thompson et al.
1997). Pairs of sequences with less than 0.10 protein sequence distance were identified.
One member of each such pair was removed on the basis of observed protein sequence lengths
or branch lengths determined with phylogenetic analyses, usually reducing representation
to one species per genus (Stiller and Harrell 2005). We removed the most divergent
sequences during subsequent analyses, as necessary, to minimize the effects of long-branch
attraction (Felsenstein 2004; Hampl et al. 2009).
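The redundancy filter described above relies on uncorrected pairwise distances, i.e. the proportion of aligned positions at which two sequences differ. A minimal Python sketch of that calculation follows. It is not the ClustalX implementation; the treatment of gaps (positions with a gap in either sequence are skipped), the toy sequences, and the function names are assumptions made for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected distance: fraction of compared positions that differ.

    Positions with a gap ('-') in either sequence are skipped (an assumption).
    """
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("sequences share no ungapped positions")
    return differing / compared


def close_pairs(aligned, cutoff=0.10):
    """Return name pairs whose uncorrected distance is below the cutoff."""
    return [
        (name_1, name_2)
        for (name_1, seq_1), (name_2, seq_2) in combinations(aligned.items(), 2)
        if p_distance(seq_1, seq_2) < cutoff
    ]


# Hypothetical aligned fragments, 20 residues each.
toy_alignment = {
    "sp_A": "MKTAYIAKQRQISFVKSHFS",
    "sp_B": "MKTAYIAKQRQISFVKSHFT",
    "sp_C": "MRSAYLAKQKQLSFIKSNFT",
}

print(p_distance(toy_alignment["sp_A"], toy_alignment["sp_B"]))  # 0.05
print(close_pairs(toy_alignment))  # [('sp_A', 'sp_B')]
```

Which member of a flagged pair to discard is a separate decision; in this study it was based on sequence length or branch length.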
Table 3.1: Support for eukaryotic supergroups and first order groups from phylogenetic analyses of Rad51, Dmc1, and
concatenated protein sequence data.
Supergroups: Amoebozoa, Excavata, Chromalveolata, Archaeplastida, Rhizaria, Opisthokonta
Rad51: +++, +++
Dmc1: +++, +++, ++
Concat.: +++, +++, ++
First order groups: Fungi, Metazoa, Centra., Mycet., Arch., Discoba, Meta., Stramen., Alveolata, Chloro., Rhodo., Cercozoa, Apusozoa
Rad51, Dmc1, Concat.: ++, +++, +++, +++, +++, +++, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, ++, -, +, -, +, +++, +++, +++, +++, ++, -, +++, +++, +++, ++, -, +++, +
Note: Support for eukaryotic groups was assessed with PhyloBayes posterior probabilities from phylogenetic analyses performed on
Rad51, Dmc1, and concatenated protein sequences (Figure 1). Pluses indicate that monophyletic groups were retrieved (ignoring the
placement of Apusozoa) (+++ = > 0.90, ++ = 0.70-0.90, + = < 0.70) and minuses indicate the relationship was not retrieved. N/A
indicates only one representative of the group was in the alignment.
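The symbol scheme in the note to Table 3.1 amounts to a simple mapping from posterior probability to support category. The short Python sketch below restates that mapping; the function name is hypothetical, and the use of None to represent a relationship that was not retrieved is an assumption of the example.

```python
def support_symbol(posterior):
    """Map a PhyloBayes posterior probability to the Table 3.1 symbols.

    None stands for a relationship that was not retrieved (an assumption).
    """
    if posterior is None:
        return "-"
    if posterior > 0.90:
        return "+++"
    if posterior >= 0.70:
        return "++"
    return "+"


for value in (0.99, 0.85, 0.55, None):
    print(value, support_symbol(value))
```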
Figure 3.1 graphic. Horizontal axis: S.c. Rad51 amino acid position (85-370). Degenerate primer labels shown: F1-F8, R1-R5, Perk F, Perk R.
Rad51 fragments: Ancyromonas, Thecamonas, Mastigamoeba, Sexangularia, Arachnula, Bodomorpha, Cercomonas, Proleptomonas, Spongomonas, Thaumatomonas, Cafeteria, Pendulomonas, Perkinsus, Pylaiella, Bodo, Diplonema, Jakoba libera, Monotrichomonas, Percolomonas, Scytomonas, Seculamonas.
Dmc1 fragments: Ancyromonas, Thecamonas, Mastigamoeba, Spongomonas, Adenoides, Diplonema, Percolomonas, Scytomonas.
Figure 3.1: Graphic representation of Rad51 or Dmc1 gene sequence fragments amplified with degenerate PCR from
representatives of four eukaryotic supergroups and Apusozoa relative to Saccharomyces cerevisiae Rad51 protein
sequence. Amoebozoa are labeled with blue, Rhizaria with eggplant, Chromalveolata with orange, Excavata with brown,
and Apusozoa with black. Amino acid positions are Saccharomyces cerevisiae (S.c.) Rad51 protein sequence positions.
Grey bars indicate regions encoded by fragments amplified with degenerate PCR. Letters and numbers on each side of
grey bars indicate degenerate primers used (Table 3.2 and Figure 3.1).
Figure 3.2 graphic (Dmc1 tree). Tip labels in figure order, with the classification printed beside a tip given in parentheses: Candida glabrata, Saccharomyces, Kluyveromyces, Candida albicans, Aspergillus, Schizosaccharomyces, Coprinopsis, Cryptococcus, Phycomyces, Batrachochytrium (Chytridiomycota), Homo, Strongylocentrotus, Physarum, Mastigamoeba* (Mastigamoebida), Entamoeba (Entamoebida), Leishmania, Scytomonas* (Euglenozoa), Trypanosoma, Diplonema*, Percolomonas* (Heterolobosea), Naegleria, Trichomonas (Parabasalia), Giardia A, Giardia B, Spironucleus, Thecamonas* (Apusomonadidae), Pythium, Phytophthora, Hyalo., Chlamydomonas, Chlorella, Micromonas, Ostreococcus, Gracilaria (Florideophyceae), Galdieria (Bangiophyceae), Arabidopsis, Oryza, Spongomonas* (Cercomonadida), Gymnophrys (unclassified), Adenoides* (Dinozoa), Plasmodium, Toxoplasma, Cryptosporidium, Karlodinium, Perkinsus (Perkinsea), Sterkiella (Ciliophora), Ancyromonas* (Ancyromonadidae).
Other group labels: Ascomycota, Basidiomycota, Mucoromycotina, Fungi, Eumetazoa, Metazoa, Myxogastria, Mycetozoa, Archamoebae, Discoba, Fornicata, Metamonada, Apusozoa, Oomycetes, Stramenopiles, Chlorophyta, Streptophyta, Chloroplastida, Rhodophyceae, Cercozoa, Apicomplexa, Dinozoa, Alveolata.
Supergroup labels: Opis., Amoeb., Excav., Chrom., Arch., Rhiz. Scale bar: 0.1 substitutions/site.
Figure 3.2: Unrooted phylogenetic tree of 47 Dmc1 homologs. Trees were estimated
with PhyML (LG+G) and PhyloBayes (LG+G) from 312 aligned amino
acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, Rhizaria in violet, and
Excavata in brown. Asterisks indicate data was obtained with degenerate
PCR. The consensus topology of 2 PhyloBayes chains is shown.
[Figure 3.3 image not reproduced: node support values omitted. Clade label: Dmc1. Scale bar: 0.1 substitutions/site.]
Figure 3.3 tip labels and accession numbers:
Candida glabrata 50293765
Saccharomyces 118683
Kluyveromyces 50311197
Candida albicans 1706446
Aspergillus 121709155
Schizosaccharomyces 3176384
Coprinopsis 6714639
Cryptococcus 134118469
Phycomyces jgiScaffold_3|1364891|1940257 and 1122|189|447
Batrachochytrium jgiScaffold_2|2017505|202793
Homo 13878923
Strongylocentrotus 115660762
Physarum 90192353
Mastigamoeba*
Entamoeba 67482427
Leishmania 72549845
Scytomonas*
Trypanosoma 71659624
Diplonema*
Percolomonas*
Naegleria jgiScaffold_1|500453|501457
Trichomonas 123408121
Giardia A 30578211
Giardia B 71080540
Spironucleus jgiScaffold_430|11672|12631
Thecamonas*
Pythium 166325657
Phytophthora r. jgi76896
Hyaloperonospora 199610544/64
Chlamydomonas 158272235
Chlorella jgi52039
Micromonas 226524329
Ostreococcus 145352283
Gracilaria 120463106
Galdieria Galdieria genomeScaffold 896 Oct13 2005:g78.t1
Arabidopsis 21903409
Oryza 18700485
Spongomonas*
Gymnophrys 158071814
Adenoides*
Plasmodium 68076139
Toxoplasma 237843305
Cryptosporidium 209879790
Karlodinium TBestDBKML00009877
Perkinsus TIGR1637
Sterkiella 209371672
Ancyromonas*
Figure 3.3: Unrooted phylogenetic tree of 47 Dmc1 homologs with accession
numbers. Trees were estimated with PhyML (LG+G) and PhyloBayes
(LG+G) from 312 aligned amino acids. Opisthokonta are highlighted in
purple, Amoebozoa in blue, Archaeplastida in green, Chromalveolata in
orange, Rhizaria in violet, and Excavata in brown. Asterisks indicate data
was obtained with degenerate PCR. All references are GenBank unless
otherwise noted. The consensus topology of 2 PhyloBayes chains is shown.
[Figure 3.4 image not reproduced: node support values omitted. Clade labels: Dmc1, RadA. Scale bar: 0.1 substitutions/site.]
Figure 3.4 tip labels and accession numbers:
Dmc1
Candida glabrata 50293765
Saccharomyces 118683
Kluyveromyces 50311197
Candida albicans 1706446
Aspergillus 121709155
Schizosaccharomyces 3176384
Coprinopsis 6714639
Cryptococcus 134118469
Phycomyces jgiScaffold_3|1364891|1940257 and 1122|189|447
Batrachochytrium jgiScaffold_2|2017505|202793
Homo 13878923
Strongylocentrotus 115660762
Plasmodium 68076139
Toxoplasma 237843305
Cryptosporidium 209879790
Karlodinium TBestDBKML00009877
Perkinsus TIGR1637
Sterkiella 209371672
Ancyromonas*
Physarum 90192353
Mastigamoeba*
Entamoeba 67482427
Leishmania 72549845
Scytomonas*
Trypanosoma 71659624
Diplonema*
Percolomonas*
Naegleria jgiScaffold_1|500453|501457
Trichomonas 123408121
Giardia A 30578211
Giardia B 71080540
Spironucleus jgiScaffold_430|11672|12631
Adenoides*
Gymnophrys 158071814
Spongomonas*
Oryza 18700485
Arabidopsis 21903409
Gracilaria 120463106
Galdieria Galdieria genomeScaffold_896 Oct13 2005:g78.t1
Ostreococcus 145352283
Micromonas 226524329
Chlorella jgi52039
Chlamydomonas 158272235
Pythium 166325657
Phytophthora jgi76896
Hyaloperonospora 199610544/64
Thecamonas*
RadA
Nanoarchaeum 41615212
Methanocaldococcus 256811072
Aeropyrum 109689248
Pyrobaculum 119872227
Candidatus 170290825
Nitrosopumilus 161528894
Cenarchaeum 118575453
Figure 3.4: Unrooted phylogenetic tree of 54 Dmc1 and RadA homologs with
accession numbers. Trees were estimated with PhyML (LG+G) and
PhyloBayes (LG+G) from 312 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, Rhizaria in violet, and Excavata in brown.
Asterisks indicate data was obtained with degenerate PCR. All references
are GenBank unless otherwise noted. The consensus topology of 2
PhyloBayes chains is shown.
[Figure 3.5 image not reproduced: tree tip labels (taxon names with accession numbers) and node support values omitted. Clade labels: Rad51, Dmc1. Scale bar: 0.1 substitutions/site.]
Figure 3.5: Unrooted phylogenetic tree of 105 Rad51 and Dmc1 homologs. Trees
were estimated with PhyML (LG+G) and PhyloBayes (LG+G) from 315
aligned amino acids. Opisthokonta are highlighted in purple, Amoebozoa
in blue, Archaeplastida in green, Chromalveolata in orange, Rhizaria in
violet, and Excavata in brown. Asterisks indicate data was obtained with
degenerate PCR. The consensus topology of 2 PhyloBayes chains is
shown.
Figure 3.6: Unrooted phylogenetic tree of 112 Rad51, Dmc1, and RadA homologs.
Trees were estimated with PhyML (LG+G) and PhyloBayes (LG+G) from
315 aligned amino acids. Opisthokonta are highlighted in purple,
Amoebozoa in blue, Archaeplastida in green, Chromalveolata in orange,
Rhizaria in violet, and Excavata in brown. Asterisks indicate data was
obtained with degenerate PCR. The consensus topology of 2 PhyloBayes
chains is shown.
[Figure 3.6 image not reproduced: tree tip labels (taxon names with accession numbers) and node support values omitted. Clade labels: Rad51, Dmc1, RadA. Scale bar: 0.1 substitutions/site.]
[Figure 3.7 image not reproduced: tree tip labels (taxon names with accession numbers) and node support values omitted. Clade labels: Rad51, Dmc1, RadA. Scale bar: 0.1 substitutions/site.]
Figure 3.7: Unrooted phylogenetic tree of 157 Rad51, Dmc1 and RadA homologs.
Trees were estimated with PhyloBayes (LG+G) from 314 aligned amino
acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, Rhizaria in violet, and
Excavata in brown. Asterisks indicate data was obtained with degenerate
PCR. The consensus topology of 2 PhyloBayes chains is shown.
[Figure 3.8 image not reproduced: tree tip labels, taxonomic group labels, and node support values omitted. Clade label: Rad51. Supergroup labels: Opis., Amoeb., Chrom., Arch., Excav., Rhiz. Scale bar: 0.1 substitutions/site.]
Figure 3.8: Unrooted phylogenetic tree of 52 Rad51 homologs. Trees were
estimated with PhyML (LG+G) and PhyloBayes (LG+G) from 307 aligned
amino acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, Rhizaria in violet, and
Excavata in brown. Asterisks indicate data was obtained with degenerate
PCR. The consensus topology of 2 PhyloBayes chains is shown.
[Figure 3.9 image not reproduced: node support values omitted. Clade label: Rad51. Scale bar: 0.1 substitutions/site.]
Figure 3.9 tip labels and accession numbers:
Cryptococcus 58259207
Coprinopsis 3237296
Ustilago 71018413
Schizosaccharomyces 397843
Saccharomyces 4275
Candida albicans 68485285
Phycomyces jgiScaffold_40|201443|202854
Batrachochytrium jgiScaffold_2|1601520|1603332
Aedes 157112162
Apis 110756953
Ixodes 215491711
Homo 19924133
Ciona 198420224
Strongylocentrotus 115610811
Trichoplax jgiScaffold_6|2098752|2100304
Thecamonas*
Monosiga jgiScaffold_6|780044|781460
Ancyromonas*
Dictyostelium 66822135
Acanthamoeba*/106789002
Pendulomonas*
Plasmodium 124803581
Toxoplasma 211963576
Cryptosporidium 209875975
Cyanidioschyzon 151559143
Guillardia 162605684
Galdieria Galdieria genome contig 785
Trichomonas 123408472
Jakoba bahamiensis 109794508
Seculamonas*
Bodo*
Leishmania 157871568
Trypanosoma 2108337
Euglena 109787391
Scytomonas*
Diplonema TBestDB Scaffold 118
Chaetoceros 164412700
Phaeodactylum 219119366
Thalassiosira jgiScaffold_2|665690|666833
Aureococcus jgiScaffold_14|44397|45350
Pylaiella*
Ectocarpus 241962436
Hyaloperonospora 199611623
Phytophthora r. jgi74160
Pythium 207397927
Oryza 18874071
Vitis 225444585
Physcomitrella 16605579
Ostreococcus 145349400
Micromonas sp. jgi226516672
Chlorella sp. jgi20220
Volvox jgiScaffold_15|851250|854214
Chlamydomonas 45685351
Bodomorpha*
Proleptomonas*
Thaumatomonas*
Cercomonas*
Jakoba libera*
Figure 3.9: Unrooted phylogenetic tree of 58 Rad51 homologs with accession
numbers. Trees were estimated with PhyML (LG+G) and PhyloBayes
(LG+G) from 307 aligned amino acids. Opisthokonta are highlighted in
purple, Amoebozoa in blue, Archaeplastida in green, Chromalveolata in
orange, Rhizaria in violet, and Excavata in brown. Asterisks indicate data
was obtained with degenerate PCR. The consensus topology of 2
PhyloBayes chains is shown.
[Figure 3.10 image not reproduced: node support values omitted. Clade labels: Rad51, RadA. Scale bar: 0.1 substitutions/site.]
Figure 3.10 tip labels and accession numbers:
Rad51
Cryptococcus 58259270
Coprinopsis 3237296
Ustilago 71018413
Schizosaccharomyces 397843
Saccharomyces 4275
Candida albicans 68485285
Phycomyces jgiScaffold_40|201443|202854
Batrachochytrium jgiScaffold_2|1601520|1603332
Aedes 157112162
Apis 110756953
Ixodes 215491711
Homo 19924133
Ciona 198420224
Strongylocentrotus 115610811
Trichoplax jgiScaffold_6|2098752|2100304
Thecamonas*
Monosiga jgiScaffold_6|780044|781460
Ancyromonas*
Acanthamoeba*/106789002
Dictyostelium 66822135
Pendulomonas*
Seculamonas*
Jakoba bahamiensis 109794508
Plasmodium 124803581
Toxoplasma 211963576
Cryptosporidium 209875975
Cyanidioschyzon 151559143
Guillardia 162605684
Galdieria Galdieria sulphuraria genome contig 785
Trichomonas 123408472
Euglena 109787391
Trypanosoma 2108337
Leishmania 157871568
Bodo*
Jakoba libera*
Oryza 18874071
Vitis 225444585
Physcomitrella 16605579
Cercomonas*
Thaumatomonas*
Proleptomonas*
Bodomorpha*
Volvox jgiScaffold_15|851250|854214
Chlamydomonas 45685351
Chlorella sp. jgi20220
Ostreococcus 145349400
Micromonas sp. jgi226516672
Scytomonas*
Diplonema TBestDB Scaffold 118
Ectocarpus 241962436
Pylaiella*
Aureococcus jgiScaffold_14|44397|45350
Thalassiosira jgiScaffold_2|665690|666833
Phaeodactylum 219119366
Chaetoceros 164412700
Pythium 207397927
Phytophthora r. jgi74160
Hyaloperonospora 199611623
RadA
Nitrosopumilus 161528894
Cenarchaeum 118575453
Pyrobaculum 119872227
Candidatus 170290825
Aeropyrum 109689248
Nanoarchaeum 41615212
Methanocaldococcus 256811072
Figure 3.10: Unrooted phylogenetic tree of 65 Rad51 and RadA homologs with
accession numbers. Trees were estimated with PhyML (LG+G) and
PhyloBayes (LG+G) from 314 aligned amino acids. Opisthokonta are
highlighted in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, Rhizaria in violet, and Excavata in brown.
Asterisks indicate data was obtained with degenerate PCR. All references
are GenBank unless otherwise noted. The consensus topology of 2
PhyloBayes chains is shown.
[Figure 3.11 image not reproduced: tree tip labels, taxonomic group labels, and node support values omitted. Clade label: Rad51/Dmc1. Supergroup labels: Opis., Amoeb., Excav., Rhiz., Arch., Chrom. Scale bar: 0.1 substitutions/site.]
Figure 3.11: Unrooted phylogenetic tree of 40 Concatenated Rad51 and Dmc1
homologs. Trees were estimated with PhyML (LG+G) and PhyloBayes
(LG+G) from 603 aligned amino acids. Opisthokonta are highlighted in
purple, Amoebozoa in blue, Archaeplastida in green, Chromalveolata in
orange, Rhizaria in violet, and Excavata in brown. Asterisks indicate data
was obtained with degenerate PCR. The consensus topology of 2
PhyloBayes chains is shown.
[Figure 3.12 image not reproduced: node support values omitted. Clade label: Rad51/Dmc1. Scale bar: 0.1 substitutions/site.]
Figure 3.12 tip labels and accession numbers (Dmc1/Rad51):
Aspergillus 121709155/169781702
Schizosaccharomyces 3176384/397843
Coprinopsis 6714639/3237296
Cryptococcus 134118469/58259207
Phycomyces jgiScaffold_3|1364891|1940257 and 1122|189|447/jgiScaffold_40|201443|202854
Batrachochytrium jgiScaffold_2|2017505|202793/jgiScaffold_2|1601520|1603332
Homo 13878923/19924133
Strongylocentrotus 115660762/115610811
Ancyromonas*/*
Acanthamoeba Baylorcontig_2440/106789002
Entamoeba 67482427/167387582
Diplonema*/TBestDB Scaffold 118
Scytomonas*/*
Leishmania 72549845/157871568
Trypanosoma 71659624/2108337
Percolomonas*/*
Naegleria jgiScaffold_1|500453|501457/Scaffold_63|72794|73744
Thecamonas*/*
Trichomonas 123408121/123408472
Giardia A 30578211/
Giardia B 71080540/
Spironucleus jgiScaffold_430|11672|12631/
Thaumatomonas/*
Bodomorpha/*
Cercomonas/*
Mastigamoeba */*
Oryza 18700485/18874071
Arabidopsis 21903409/18420327
Physcomitrella jgiScaffold_42|1203633|1204535/16605579
Ostreococcus 145352283/145349400
Micromonas sp. 226524329/jgi226516672
Chlorella sp. jgi52039/jgi20220
Chlamydomonas 158272235/45685351
Gracilaria 120463106/
Galdieria Galdieria genomeScaffold 896/contig 785
Pythium 166325657/207397927
Phytophthora r. jgi76896/jgi74160
Perkinsus TIGR1637/*
Cryptosporidium 209879790/209875975
Toxoplasma 237843305/211963576
Figure 3.12: Unrooted phylogenetic tree of 40 Concatenated Rad51 and Dmc1
homologs with accession numbers (Dmc1/Rad51). Trees were estimated
with PhyML (LG+G) and PhyloBayes (LG+G) from 603 aligned amino
acids. Opisthokonta are highlighted in purple, Amoebozoa in blue,
Archaeplastida in green, Chromalveolata in orange, Rhizaria in violet, and
Excavata in brown. Asterisks indicate data was obtained with degenerate
PCR. All references are GenBank unless otherwise noted. The consensus
topology of 2 PhyloBayes chains is shown.
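The concatenated data set behind Figures 3.11 and 3.12 pairs each taxon's Dmc1 and Rad51 sequences, and several tips in Figure 3.12 list only a Dmc1 accession. As a minimal sketch of how two alignments might be joined by taxon, the Python below concatenates per-taxon rows in Dmc1/Rad51 order and pads a missing gene with gap characters. The function name, the dictionary input format, the toy sequences, and the gap-padding rule are all assumptions made for illustration; the thesis does not state how taxa lacking one gene were handled.

```python
def concatenate_alignments(dmc1, rad51, gap="-"):
    """Join two alignments (dicts of taxon -> aligned sequence) taxon by taxon.

    Assumption: a taxon missing from one alignment is padded with gap
    characters for that gene so that every concatenated row has equal length.
    """
    dmc1_len = len(next(iter(dmc1.values())))
    rad51_len = len(next(iter(rad51.values())))
    taxa = sorted(set(dmc1) | set(rad51))
    return {
        taxon: dmc1.get(taxon, gap * dmc1_len) + rad51.get(taxon, gap * rad51_len)
        for taxon in taxa
    }


# Toy input with hypothetical taxon names and made-up sequences.
toy_dmc1 = {"TaxonA": "MK-", "TaxonB": "MR-"}
toy_rad51 = {"TaxonA": "GG", "TaxonC": "GA"}
combined = concatenate_alignments(toy_dmc1, toy_rad51)
for taxon, row in combined.items():
    print(taxon, row)
```

Padding with gaps keeps every row the same length, which tree-estimation programs generally require of an input alignment; dropping taxa that lack one gene would be the alternative design.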
Figure 3.13: Protein sequence alignment of prokaryotic and eukaryotic RecA
orthologs with amino acids conserved among 158 protein sequences
indicated. Two eubacterial RecA, seven archaebacterial RadA, 98
eukaryotic Rad51, and 51 eukaryotic Dmc1 protein sequences were
aligned and analyzed for conserved amino acids. Seven exemplar Rad51
and Dmc1 protein sequences and two RadA protein sequences are
presented. Amino acids that were present 100% among all domains,
archaebacteria and eukaryotes only, eukaryotes only, and Rad51 or Dmc1
only are highlighted with black, blue, green and yellow respectively. In
addition, sites present ≥ 95% in one eukaryotic paralog but different or
variable in the other paralog are highlighted in red. Dots mark residues
identified during this study for which no function has been determined.
Opisthokonta are labeled in purple, Amoebozoa in blue, Chromalveolata in
orange, Excavata in brown, Rhizaria in eggplant, and Apusozoa in black.
Arrows indicate positions of degenerate PCR primers. Numbers indicate
amino acid positions of Saccharomyces cerevisiae Rad51. Supergroups
were represented at each amino acid position except Apusozoa (126-356)
and Rhizaria (188-397).
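The red-highlighting rule in this caption (a residue present in at least 95% of one paralog's sequences but different or variable in the other paralog) amounts to a per-column frequency test. A minimal sketch in Python follows. It assumes gaps are ignored when residue frequencies are computed, which the caption does not specify, and the function names and toy sequences are hypothetical rather than taken from the thesis.

```python
from collections import Counter


def column_consensus(seqs, col):
    """Return (most common residue, its frequency) for one column, ignoring gaps."""
    residues = [s[col] for s in seqs if s[col] != "-"]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(residues)


def _specific(a, b, threshold):
    """True if paralog a is conserved at this column and paralog b is not conserved for the same residue."""
    res_a, frac_a = a
    res_b, frac_b = b
    conserved_a = res_a is not None and frac_a >= threshold
    same_in_b = res_b == res_a and frac_b >= threshold
    return conserved_a and not same_in_b


def paralog_specific_sites(rad51, dmc1, threshold=0.95):
    """List (column, paralog, residue) for columns conserved in one paralog only."""
    sites = []
    for col in range(len(rad51[0])):
        r = column_consensus(rad51, col)
        d = column_consensus(dmc1, col)
        if _specific(r, d, threshold):
            sites.append((col, "Rad51", r[0]))
        if _specific(d, r, threshold):
            sites.append((col, "Dmc1", d[0]))
    return sites


# Toy alignments (made-up sequences, not data from the thesis).
toy_rad51 = ["GEFRTGKS", "GEFRTGKT", "GEFRSGKT"]
toy_dmc1 = ["GEFRCGKT", "GEFRTGKT", "GENRCGKT"]
toy_sites = paralog_specific_sites(toy_rad51, toy_dmc1)
print(toy_sites)
```

Columns conserved for the same residue in both paralogs are not reported; in the caption's scheme those fall under the black, blue, or green categories instead.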
Position (S.c. Rad51): 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240
                |....|....|....|....|....|....|....|....|....|....|~.|....|....|....|....|....|..~..|....|....|....|....|....|....|~....|....|....|....|....|....|....|....|....
Rad51
Saccharomyces   GITMADVKKLRESGLHTAEAVAYAPRKDLLEIKGISEAKADKLLNEAARLV~FVTAADFHMRRSELICLTTGSKNLDTLLG~GGVETGSITELFGEFRTGKSQLCHTLAVTCQIP~LDIGGGEGKCLYIDTEGTFRPVRLVSIAQRFGLDPDDALNNVAY
Entamoeba       GITEGDCKKLEEAGFFTVQSIAFTPKKQLITIKGISDAKADKLLAESSKIV~FTNAAELNNLRKETIRITTGSRELDKLLC~GGFETGSITELFGEFRTGKTQLCHQLCVTCQLG~IENGGTEGRAIYIDTEGTFRPERLTQIAEKYGLNSEEALNNVAV
Oryza           GIAALDVKKLKDSGLYTVESVAYTPRKDLLQIKGISEAKVDKIVEAASKLV~FTSASQLHAQRLEIIQVTTGSRELDKILD~GGIETGSITEIYGEFRSGKTQLCHTLCVTCQLP~LDQGGGEGKALYIDAEGTFRPQRLLQIADRFGLNGADVLENVAY
Plasmodium      GFVKRDLELLKEGGLQTVECVAYAPMRTLCSIKGISEQKAEKLKKACKELC~FCNAIDYHDARQNLIKFTTGSKQLDALLK~GGIETGGITELFGEFRTGKSQLCHTLAITCQLP~IEQSGGEGKCLWIDTEGTFRPERIVAIAKRYGLHPTDCLNNIAY
Trypanosoma     GIASADIKKLMESGFYTVESVAYAPKKNILAVKGISETKADKIMAECAKLV~FTSAVVYHEARKEIIMVTTGSREVDKLLG~GGIETGGIRELFGEFRTGKTQLCHTLCVTCQLP~ISQGGAEGMPLYIDTEGTFRPERLVAVAERYKLDPQDVLSNVAC
Cercomonas*     -------------GLTPSSPLRYATTKRMTAMKGISDAKALKLVAEAAKYV~FTTATEYHQQREEIIQPHHRRRRTGACVG~GGVETGCITEMFGEFRTGKTQLCHTLCVTCQLK~VEQGGGEGKALYIDTEGTFRPKRLIAIAERFGLNPMDVLDNVAY
Ancyromonas*    -------------------------------------TKAEKLRIEAANQI~FTTASAFNMQRENVIHLTSGSKAVDDLLG~GGFETGSITEICGEFRTGKTQLCHTLCVTCQLP~LESGGGVGKALYIDTEGTFRPERLLAIAERYGLSGQDVLDNVCY
Dmc1
Saccharomyces   GINASDLQKLKSGGIYTVNTVLSTTRRHLCKIKGLSEVKVEKIKEAAGKII~FIPATVQLDIRQRVYSLSTGSKQLDSILG~GGIMTMSITEVFGEFRCGKTQMSHTLCVTTQLP~REMGGGEGKVAYIDTEGTFRPERIKQIAEGYELDPESCLANVSY
Entamoeba       GINVGDINKLKSAGCNTIESVVMHTHKELCAIRGFSDSKVDKIMEAVSKIF~FISATTSLERRANVIKITTGSSQFDQLLG~GGIETMSVTEMFGEFRTGKTQLCHTLAVTTQLP~SHLKGGNGKVAYIDTEGTFRPERIAQIAERFGVDQTAVLDNILI
Oryza           GINSGDVKKLQDAGIYTCNGLMMHTKKSLTGIKGLSEAKVDKICEAAEKLL~FMTGSDLLIKRKSVVRITTGSQALDELLG~GGIETLCITEAFGEFRSGKTQLAHTLCVSTQLP~IHMHGGNGKVAYIDTEGTFRPERIVPIAERFGMDANAVLDNIIY
Plasmodium      GINAADINKLK-GGYCTILSLIQATKKELCNVKGISEVKVDKILEVASKIE~FITGNQLVQKRSKVLKITTGSSVLDKTLG~GGFESMSITELFGENRCGKTQVCHTLAVTAQLP~KNMQGGNGKVCYIDTEGTFRPEKICKIAQRFGLNSEDVLDNILY
Trypanosoma     GVATADIAKLRQAGIFTVAGIHMQCRKDLALIKGLSDAKVEKIIEAARKLF~FTNGVTYLQQRGKVTRMTTGSTALDQLLG~GGIESMSITEAFGEFRTGKTQIAHTLCVTCQLP~TSMGGGNGKVIYVDTESTFRPERIKPIAARFGLDADAVLNNILV
Gymnophrys      ---------------------------------------------------~-----------------------------~----------------------------TAQMP~TEMGGGNGKVVYIDTEGTFRPQRIQAISERFGVDATAVLDNITY
Amastigomonas*  ---------------------------------------------------~-----------------------------~----------------TGKTQIAHTLCVTSQLP~LEAGGGGGKVLYIDTEGTFRPGRIVQIAERYGLDSNDVLENILT
RadA
Cenarchaeum     GVGPVTKKKLEDSGVHSMMDLVVRGPVELGEISSMSSEICEKIVTIARKRL~FASGSEIYKRRQSIGMITTGTDALDALLG~GGIETQAITEVFGEFGSGKTQFCHTMCVTTQKP~KEEGGLGGGVMYIDTEGTFRPERVVTIAKANNMDPAKLLDGIIV
Nanoarchaeum    GVGPKTAEKLISAGYDSLIKIASASVEELMEAADIGEATARKIIEAAMERL~FKTAEEVLEERQKTARITTMSKNLDSLLG~GGIETAALTEFYGEYGSGKTQVGHQLAVDVQLP~PEQGGLEGKAVYIDTEGTFRPERIKQMAEALDLDPKKALKNVYH
RecA
Bacillus        ------------MSDRQAALDMALKQIEKQFGKGSIMKLGEKTDTRISTVP~------------------SGSLALDTALG~GGYPRGRIIEVYGPESSGKTTVALHAIAEVQEK~------GGQAAFIDAEHALDPVYAQKLGVNIEELLLSQPDT--
Thermus         -----------MDESKRKALENALKAIEKEFGKGAVMRLGEMPKQQVDVIP~------------------TGSLALDLALG~GGIPRGRIVEIYGPESGGKTTLGLTIIAQAQRR~------GGVAAFVDAEHALDPLYAQRLGVQVEDLLVSQPDT---
HhH (dsDNA binding)
Rad51
Saccharomyces
Entamoeba
Oryza
Plasmodium
Trypanosoma
Cercomonas*
Ancyromonas*
Subunit Subunit
Polym‐
Rotation
erization
Walker A
(Triphosphate binding)
250
260
270
280
290
300
310
320
330
340
350
360
370
380
390
|....|....|....|....|...~.|....|....|.~...|....|~....|....|....|....|....|....|....|..~..|....|....|....|....|..~..|....|....~|....|....|....|....|..
ARAYNADHQLRLLDAAAQMMSESR~FSLIVVDSVMALY~RTDFSGRGE~LSARQMHLAKFMRALQRLADQFGVAVVVTNQVVAQVD~NPDPKKPIGGNIMAHSSTTRLGFKK~GKGCQRLCKVVD~SPCLPEAECVFAIYEDGVGDPRE
ARAHNTEHQMQLLQMASGLMAKER~YGLLIIDSATALY~RTDYSGRGE~LASRQMHLAKFLRALQRIADEFSVAVVLTNQVVAQVD~GGDTKKPVGGNIIAHASTTRLYLRK~GKGEARICKVYD~SPCLPESEASFAITTNGIEDVKD
ARAYNTDHQSRLLLEAASMMIETR~FALMIVDSATALY~RTDFSGRGE~LSARQMHMAKFLRSLQKLADEFGVAVVITNQVVAQVD~AGPQIKPIGGNIMAHASTTRLALRK~GRGEERICKVIS~SPCLAEAEARFQIASEGVADVKD
AKAYNCDHQTELLIDASAMMADTR~FALLIVDSATALY~RSEYTGRGE~LANRQSHLCRFLRGLQRIADIYGVAVIITNQVVAKVD~GGHEKIPIGGNIIAHASQTRLYLRK~GRGESRICKIYD~SPVLPEGEAVFAITEGGIADYEE
ARAFNTDHQQQLLLQASAMMAENR~FAIIIVDSATALY~RTDYSGRNE~LAARQMHLGKFLRSLHNLAEEYGVAVVVTNQVVANVD~QADAKKPIGGHIMAHASTTRLSLRK~GRGEQRIMKVYD~SPCLAEAEAIFGIYEDGVGDARD
ARAYNSDHQSKLLMQAAGMLTEAR~YALVVVDSATALY~RTDYSGRGE~LSARQMHLARFLRQLQRLADEFGVAVVITNQVVASVD~FGDPLKPIGGNIMAHSSTTRLSLRK~GRGET-------~----------------------ARAYNSDHQNQLLQQAAGIMAESR~YVLMIVDSATALY~RTDYSGRGE~LSARQMHLAKFLRQLMRLADEYGIAVVITNQVVAQVD~ASDPKK-------------------~------------~-----------------------
Dmc1
Saccharomyces
Entamoeba
Oryza
Plasmodium
Trypanosoma
Gymnophrys
Amastigomonas*
ARALNSEHQMELVEQLGEELSSGD~YRLIVVDSIMANF~RVDYCGRGE~LSERQQKLNQHLFKLNRLAEEFNVAVFLTNQVQSDPG~SADGRKPIGGHVLAHASATRILLRK~GRGDERVAKLQD~SPDMPEKECVYVIGEKGITDSSD
ARAYTHEQQFDLLIEVAARMAEDH~FRMLIIDSVTSLF~RVDFSGRGE~LSERQQKLGKMMNKLIKISEEFNVAVVITNQVMSDPG~VVDPKKPIGGHVIAHASTTRLYLRK~GKGEQRIVKIYD~SPNLPEAEATFAIDTGGIIDAKD
ARAYTYEHQYNLLLGLAAKMAEEP~FRLLIVDSVIALF~RVDFSGRGE~LAERQQKLAQMLSRLTKIAEEFNVAVYITNQVIADPG~ITDPKKPAGGHVLAHAATIRLMLRK~GKGEQRVCKIFD~APNLPEGEAVFQVTSGGIMDAKD
ARAFTHEHLYQLLATSAAKMCEEP~FALLVVDSIISLF~RVDFSGRGN~LSERQQKLNKIMSVLSKLGEQFNIAIVITNQVMSDPG~IANPMKPVGGHVIGHASTTRLSLRK~GKGDQRVCKVYD~APNLPEIECIFQLSDGGVIDALD
ARAYTHEHQMHLLSMVAAKMAEDQ~FGLLVVDSITALF~RVDFSGRGE~LAERQQKLAKMMSHLIKLAEEFNVAVYITNQVVADPG~FVDPKKPVGGHILAHASTTRLSLRK~GRGDQRVCKIYD~SPSLPEVECVFSISEQGIVDARE
ARAYTHEHQYELLTAVAAKMTEER~YALLIVDSVTALF~RVDFSGRGE~LAERQQKLAQFLSKLIKIAEEFNIAVFITNQVVADPG~VADTKKPIGGHILAHASTTRLFLRK~GRAEQRICKIYD~SPCLPESEAVYQLTNGGVADATD
VRVYTHEQQYNMLVRAAALMADDG~IRMLIVDSITALF~RVDYTGRGQ~LAERQQKLNQMLARLTKLADEFNIAVFI---------~-------------------------~------------~-----------------------
RadA
Cenarchaeum
Nanoarchaeum
ARAYNSSHQVLILEEAGKTIQEEN~IKLIISDSTTGLF~RSEYLGRGT~LASRQQKLGRYIRLLARIAETYNCAVLATNQVSSSPD~FGDPTRPVGGNVVGHASTYRIYFRK~GGKNKRVAKIID~SPHHPASEAVFELGERGVQDTEE
MKVFNTDHQMLAARKAEELIRKGE~IKLIVVDSLTALF~RAEYTGRGQ~LAERQHKLGRHVHDLLRIAELYNVAIYVTNQVMAKPD~GLDSVQAVGGHVLAHASTYRVFLRK~GKKGIRIARLVD~SPHLPERETTFVITEEGIRDPE-
RecA
Bacillus
Thermus
--------GEQALEIAEALVRSGA~VDIVVVDSVAALV~KAEIEGDMG~VGLQARLMSQALRKLSGAINKSKTIAIFINQIREKVG~FGNPETTPGGRALKFYSSVRLEVRR~GEGISKEGEIID~LDIVQKSGSWYSYEEERLGQGRE
--------GEQALEIVELLARSGA~VDVIVVDSVAALV~RAEIEGEMG~VGLQARLMSQALRKLTAVLAKSNTAAIFINQVREKVG~YGNPETTPGGRALKFYASVRLDVRK~GRGLDPVADLVN~AGVIEKAGSWFSYGELRLGQGKE
Loop L1
(ssDNA binding)
Loop L2
(ssDNA binding)
128
Walker B
(Mg++ binding)
129
                      Rad51                              Dmc1                               RadA      RecA
                      Sac. Ent. Ory. Pla. Try. Cer. Anc. Sac. Ent. Ory. Pla. Try. Gym. Ama. Met. Sul. Esc. Bac.
Rad51 Saccharomyces
Rad51 Entamoeba       0.33
Rad51 Oryza           0.22 0.30
Rad51 Plasmodium      0.37 0.35 0.34
Rad51 Trypanosoma     0.33 0.32 0.28 0.35
Rad51 Cercomonas*     0.23 0.26 0.18 0.28 0.28
Rad51 Ancyromonas*    0.28 0.27 0.26 0.34 0.29 0.16
Dmc1 Saccharomyces    0.44 0.43 0.48 0.51 0.47 0.40 0.43
Dmc1 Entamoeba        0.46 0.48 0.44 0.52 0.51 0.47 0.48 0.46
Dmc1 Oryza            0.44 0.46 0.43 0.47 0.49 0.43 0.45 0.41 0.28
Dmc1 Plasmodium       0.45 0.52 0.48 0.53 0.50 0.47 0.46 0.44 0.32 0.31
Dmc1 Trypanosoma      0.40 0.43 0.42 0.48 0.47 0.43 0.45 0.43 0.29 0.26 0.30
Dmc1 Gymnophrys       0.43 0.42 0.41 0.43 0.46 0.37 0.38 0.39 0.28 0.21 0.30 0.26
Dmc1 Amastigomonas*   0.52 0.46 0.47 0.49 0.51 0.46 0.44 0.42 0.36 0.34 0.40 0.36 0.33
RadA Methanococcus    0.59 0.56 0.61 0.60 0.60 0.54 0.58 0.51 0.54 0.57 0.57 0.55 0.56 0.51
RadA Sulfolobus       0.55 0.57 0.59 0.53 0.56 0.55 0.53 0.49 0.61 0.56 0.57 0.56 0.55 0.53 0.53
RecA Escherichia      0.75 0.82 0.81 0.82 0.82 0.81 0.82 0.77 0.76 0.77 0.76 0.78 0.74 0.77 0.82 0.77
RecA Bacillus         0.79 0.80 0.79 0.80 0.83 0.79 0.82 0.77 0.81 0.80 0.82 0.82 0.76 0.77 0.84 0.76 0.30
RecA Thermus          0.76 0.81 0.77 0.79 0.81 0.79 0.80 0.75 0.77 0.76 0.78 0.76 0.74 0.74 0.85 0.76 0.29 0.26

Average p-distances given in the figure: Dmc1 versus RecA 0.77, RadA 0.55, Rad51 0.46, Dmc1 0.34; Rad51 versus RecA 0.80, RadA 0.57, Rad51 0.29.
Shading bins in the figure: 0.00-0.19; 0.20-0.29; 0.30-0.39; 0.40-0.49; 0.50-0.59; 0.60-0.69; 0.70-0.79; 0.80-0.89.
Figure 3.14: p-distance matrix of prokaryotic and eukaryotic RecA orthologs. Uncorrected distances between eukaryotic Rad51
and Dmc1, archaebacterial RadA, and eubacterial RecA protein sequences. All currently recognized eukaryotic
supergroups are represented (purple=Opisthokonta, light blue=Amoebozoa, green=Archaeplastida,
orange=Chromalveolata, brown=Excavata, blue=Rhizaria), and the currently unclassified Apusozoa (Ancyromonas and
T. trahens). Calculations were performed with MEGA. Asterisks designate sequences obtained with degenerate PCR.
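The uncorrected distances reported in Figure 3.14 can be illustrated with a minimal Python sketch. It assumes that sites with a gap in either sequence are skipped (pairwise deletion); the MEGA settings actually used are not stated here, so that choice, the function name p_distance, and the gap characters are assumptions for illustration only.

```python
def p_distance(seq_a: str, seq_b: str, gap_chars: str = "-~") -> float:
    """Uncorrected p-distance: fraction of compared aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: skip any site with a gap in either sequence.
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (hypothetical, not taken from the alignment).
example = p_distance("ACDE-G", "ACEEFG")  # 1 difference over 5 compared sites
```

Because no substitution model is applied, the value saturates for distant pairs, which is consistent with the largest distances in the matrix being those between the eubacterial and eukaryotic sequences.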
Table 3.2: Degenerate primers and their positions.
Name               Nucleotide sequence           Amino acid sequence  Position
forward 1          ATC AAG GGC TTR AGY GA        I K G L S D/E        122
forward 2          ATC AAG GGA CTN TCN GA        I K G L S D/E        122
forward 3          ATC AAG GGA CTN AGY GA        I K G L S D/E        122
forward 4          ATC AAG GGC ATH TCN GA        I K G I S D/E        122
forward 5          ATC AAG GGC ATH AGY GA        I K G I S D/E        122
forward 6          GAG ATG TTC GGC GAR TTY MG    E M F G E F R/S      184
forward 7          GAC AGG GAA GGC ACN TTY MG    D R E G T F R/S      221
forward 8          ACT GAA GGC ACN TTY MGN CC    T E G T F R/S P      222
reverse 1          AGC GAC GAC YTG RTT NGT       TNQVVA               331
reverse 2          ATG CGC GAC NAC YTG RTT       TNQVVAH              332
reverse 3          TGC GAA DAT RTK NCC NCC       G G H/N I F A        355
reverse 4          TTC ACC YTT NCC YTT           KGKGE                370
reverse 5          AGT CTC ACC ACK NCC YTT       KGRGET               371
Perkinsus forward  ATT GAC CAG GGC ATA GGT       IDQGIGT              98
Perkinsus reverse  AAT TCT GAG CGC AAC AGG TT    N L L R S E F/L      290
Note: positions are relative to the Saccharomyces cerevisiae Rad51 protein
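The degenerate positions in the primers of Table 3.2 use standard IUPAC nucleotide codes (for example R = A or G, Y = C or T, N = any base). As an illustrative sketch only, the following Python computes how many distinct oligonucleotides a degenerate primer represents and lists them; the function names degeneracy and expand are hypothetical helpers and not part of the primer design procedure used in this study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """All concrete sequences encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


forward_1 = "ATC AAG GGC TTR AGY GA"   # from Table 3.2
forward_1_degeneracy = degeneracy(forward_1)  # R (2 bases) x Y (2 bases)
```

Listing the expansions with expand is only practical for primers with few ambiguous positions, since each N multiplies the count by four.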
Table 3.3: Proposed functions of residues identified during this study.
Protein
RecA
RadA
Rad51
Dmc1
RadA
Rad51
Dmc1
Rad51
Dmc1
Rad51
≠
Dmc1
Residue
G185
D219
E221
D280
S281
N325
G346
G347
Function
ATP binding
DNA binding
R287
DNA binding
F144
E176
F224
R225
L285
L296
H356
E182
E295
R299
A351
H352
A320
K371
E382
E186
R293
G294
K343
K191
Q193
N254
A265
R308
D332
I349
Subunit binding
DNA binding
References
(Pellegrini et al. 2002)
(Story, Weber, and Steitz 1992)
(Okorokov et al. 2010)
(Chen et al. 2007)
(Conway et al. 2004)
(Grigorescu et al. 2009)
(Okorokov et al. 2010)
(Pellegrini et al. 2002)
(Seong et al. 2009)
(Shin et al. 2003)
(Story, Weber, and Steitz 1992)
Rad52 binding
Subunit binding
Rad54 binding
DNA binding
(Okorokov et al. 2010)
(Pellegrini et al. 2002)
(Shin et al. 2003)
(Story, Weber, and Steitz 1992)
ATP binding
Subunit binding
Subunit/BRC4
DNA binding
(Chen et al. 2007)
(Shin et al. 2003)
(Story, Weber, and Steitz 1992)
Dmc1
≠
Q301
Subunit binding
(Shin et al. 2003)
Rad51
Note: Functions were determined by analysis of RecA, RadA, Rad51, and Dmc1
protein structures. Amino acids were either identified in 100% of sequences at that
position or 95% among one eukaryotic paralog but different or variable in the other.
Amino acid positions are relative to the Saccharomyces cerevisiae Rad51 protein.
CHAPTER 4
MEIOSIS-SPECIFIC GENES AROSE BY
DUPLICATION PRIOR TO THE LAST COMMON
ANCESTOR OF EUKARYOTES
Abstract
Meiosis is a distinct and nearly universal feature of eukaryotes. However, the
origins and evolutionary histories of genes that encode proteins that function during
meiosis remain largely unknown. Whether the last eukaryotic common ancestor (LECA)
was capable of meiosis is unknown, as is whether meiosis in the LECA used the same
machinery that extant eukaryotes use to complete important meiotic functions, such as:
1) synaptonemal complex formation; 2) interhomolog DNA strand exchange; 3) Holliday
junction resolution; and 4) sister chromatid cohesion.
We present our inventory of 20 genes whose products catalyze these important functions
(Hop1, Rad21, Rec8, Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2, Mnd1, Pms1,
Mlh1, Mlh2, Mlh3, Msh2, Msh3, Msh4, Msh5, Msh6, and Mer3) among 46 diverse
eukaryotes. For the first time, genomes of representatives from all eukaryotic
supergroups and the Apusozoa (Thecamonas trahens) were tested for the presence of
these meiotic components. We used alignments of phylogenetically verified protein
sequence data to search nucleotide, EST, and protein sequence repositories. We
determined that 10 of 20 genes are present in all eukaryotic supergroups and the
unclassified Apusozoa, and 19 were likely present in the LECA. We also performed
phylogenetic analyses on the protein sequence data obtained for all of the eukaryotes
tested, revealing a pattern of gene duplications, most prior to the LECA. Many genes that
encode proteins known to function only during meiosis in model organisms are paralogs
of genes whose products also function during mitotic DNA damage repair or
maintenance. In addition, these genes most likely arose by duplication of genes involved
in DNA damage repair. These data indicate that meiosis itself likely arose by gene
duplication.
Introduction
The evolutionary forces that gave rise to meiosis are unknown (Cavalier-Smith
2002d; d'Erfurth et al. 2009; Wilkins and Holliday 2009; Bernstein and Bernstein
2010). Efforts to collect data on the origins of meiotic genes are ongoing (Villeneuve
and Hillers 2001; Ramesh, Malik, and Logsdon 2005; Malik et al. 2008; Cavalier-Smith
2010; Wickstead, Gull, and Richards 2010). Meiosis is distinguished from mitosis by
two nuclear divisions (reductional and equational) following a single genome-wide
replication event (generally resulting in four genetically distinct haploid products),
rather than one nuclear division (generally resulting in two genetically identical diploid
products). Despite this dramatic difference, many events that occur during meiosis are
analogous to mitosis, which itself depends upon functional components of somatic
DNA mismatch and damage repair (Borts, Chambers, and Abdullah 2000; Marcon and
Moens 2005; d'Erfurth et al. 2009; Wilkins and Holliday 2009; Bernstein and Bernstein
2010). Furthermore, some genes that encode products that process DNA damage in all
domains of life are homologous to genes that encode proteins that function during
meiosis in eukaryotes (Ramesh, Malik, and Logsdon 2005; Malik et al. 2007; Malik et
al. 2008). These observations underlie the generally held notion that meiosis arose
from mitosis early during eukaryotic evolution (Wilkins and
Holliday 2009). Determining when meiotic genes appeared during the history of
eukaryotes (especially those that function only during meiosis in model organisms) and
what genetic mechanisms were responsible will help to clarify how meiosis arose
(Ramesh, Malik, and Logsdon 2005).
Differences between meiotic and mitotic forms of nuclear division are apparent
during meiotic prophase I, during which interactions between homologous
chromosomes ensure their appropriate alignment and subsequent segregation (Dudas
and Chovanec 2004; Krogh and Symington 2004; Filippo, Sung, and Klein 2008;
d'Erfurth et al. 2009; Wilkins and Holliday 2009). Events necessary for completion of
meiotic prophase I in many eukaryotes include: 1) synaptonemal complex formation; 2)
interhomolog DNA strand exchange; 3) sister chromatid cohesion; and 4) Holliday
junction resolution (d'Erfurth et al. 2009; Wilkins and Holliday 2009). Several studies
established that genes that encode products known to function during these events in
model organisms arose very early during eukaryotic evolution (Paques and Haber 1999;
Villeneuve and Hillers 2001; Dudas and Chovanec 2004; Krogh and Symington 2004;
Ramesh, Malik, and Logsdon 2005; Filippo, Sung, and Klein 2008; Malik et al. 2008;
Wickstead, Gull, and Richards 2010). Definitive evidence that genes involved in
different stages of meiosis were present in the last common ancestor to all eukaryotes
has not been produced because of several limiting conditions, including: a) failure to
place a root on the eukaryotic phylogeny (Baldauf 2003; Simpson and Roger 2004;
Roger and Simpson 2009); b) lack of genome-sequence data for all eukaryotic
supergroups; and c) the existence of currently unclassified eukaryotes. Because the
evolutionary relationships among eukaryotes are not completely resolved, any
unsampled supergroup or unclassified eukaryote could be the earliest-diverging
lineage, and its exclusion from analyses could lead to misestimation of the presence
of genes in the common ancestor of extant eukaryotes. Furthermore, biased taxonomic
sampling of eukaryotes has been problematic for phylogenetic analyses (Dunn et al.
2008), clouding any conclusions regarding the evolution of meiotic genes.
We performed extensive searches of sequence databases for 20 genes that
encode products involved in sister chromatid cohesion, pairing of homologous
chromosomes, synaptonemal complex formation, and interhomolog DNA strand
exchange (Hop1, Rad21, Rec8, Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2,
Mnd1, Pms1, Mlh1, Mlh2, Mlh3, Msh2, Msh3, Msh4, Msh5, Msh6, and Mer3) in the
genomes of 46 diverse eukaryotes. Ten of these genes (Hop1, Rec8, Spo11-1, Spo11-2,
Dmc1, Hop2, Mnd1, Msh4, Msh5, and Mer3) are known to function only during
meiosis in model organisms (Malik et al. 2007). Access to newly
sequenced genomes of eukaryotes from previously neglected groups greatly increased
the breadth of sampling and provided our first glimpses of the suites of meiotic genes
present in the supergroup Rhizaria (Bigelowiella natans), the first order group
Haptophyta (Emiliania huxleyi), and currently unclassified Apusozoa (Thecamonas
trahens). Distributions of 10 genes that encode proteins that function during four
distinct stages of meiosis (eight that are meiosis-specific) indicate these genes were
present in the last common ancestor to all eukaryotes. These analyses also provide data
supporting the additional presence of Spo11-3 and Msh3 in the last common ancestor of
eukaryotes; only Rec8 and Spo11-2 may have arisen later during eukaryotic evolution.
Eukaryotic homologs that encode products that function during mitosis and DNA repair
are frequently paralogs of meiosis-specific gene products. Several genes most likely
arose from other genes that encode products that function during DNA repair,
replication, or transformation (i.e. orthologs of archaebacterial Ski2, Top6A, RadA,
MutL, and MutS genes). The bulk of meiotic genes tested arose once by duplication
prior to the last common ancestor of eukaryotes (Malik et al. 2007; Bernstein and
Bernstein 2010). The presence of so many genes in the last common ancestor that
encode products necessary for a range of important steps during meiosis in extant
eukaryotes strongly implies that meiosis in the last common ancestor was similar to
meiosis observed in most eukaryotes today.
Results and discussion
Distributions of meiotic genes
We present the distribution of 20 genes that encode proteins that function during
meiosis (Hop1, Rad21, Rec8, Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2, Mnd1,
Mlh1-3, Pms1, Msh2-6, and Mer3) among 46 diverse eukaryotes (Figure 4.1). These
genes were selected on the basis that their products are important for four different stages
of meiosis: 1) synaptonemal complex formation; 2) interhomolog DNA strand exchange;
3) sister chromatid cohesion; and 4) Holliday junction resolution (Table 4.1) (Kleckner
1996; d'Erfurth et al. 2009). The taxonomic sampling includes representatives of every
currently recognized eukaryotic supergroup and the Apusozoa (Thecamonas trahens).
Ten genes (Rad51, Dmc1, Hop2, Mnd1, Mlh1, Mlh3, Msh2, Msh4, Msh5, and Msh6) are
present in representatives of every supergroup and T. trahens, implying that they were
present in the last eukaryotic common ancestor (LECA) (Figure 4.2). An additional five
genes (Hop1, Rad21, Spo11-1, Pms1, and Mer3) are missing from representatives of at
least one eukaryotic supergroup and/or T. trahens (Figures 4.1 and 4.2) but are likely to
have been present in the LECA, given their distributions and our current understanding of
the evolutionary relationships among eukaryotes (Figure 1.2).
In addition, we interpreted the distribution of genes in the context of phylogenetic
analyses performed on translated amino acid sequences of putative paralogs, with and
without products of archaebacterial orthologs (Figures 4.3 - 4.15). Tree topologies
retrieved with phylogenetic analyses of protein sequences translated from the 15 genes
inferred to have been present in the LECA feature strongly supported monophyletic
clades for many paralogs. Similarly, strongly supported topologies from analyses
including Spo11-3, Msh3, Mlh2, and Rec8 protein sequences support the monophyly of
their genes (Figures 4.5; 4.6; 4.14; and 4.15). Because paralogs produced by a single
duplication arose simultaneously, the presence of one paralog in the LECA implies the
presence of the other.
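The two lines of inference described above can be sketched in Python under simplifying assumptions: a gene found in every sampled group is treated as present in the LECA, and a gene is also treated as present if its paralog from the same duplication is. The group names follow Figure 4.2; the helper names and the gene "geneX" are hypothetical, and the sketch ignores the tree-based reasoning used for genes missing from some groups.

```python
GROUPS = frozenset({
    "Metamonada", "Discoba", "Archaeplastida", "Chromalveolata",
    "Rhizaria", "Amoebozoa", "Opisthokonta", "Apusozoa",
})


def present_in_all_groups(presence: dict[str, set[str]]) -> set[str]:
    """Genes detected in every sampled group (inference by distribution)."""
    return {gene for gene, groups in presence.items() if GROUPS <= groups}


def extend_by_paralogy(leca_genes: set[str],
                       paralog_pairs: list[tuple[str, str]]) -> set[str]:
    """Add paralogs that arose in the same duplication as a LECA gene."""
    inferred = set(leca_genes)
    changed = True
    while changed:
        changed = False
        for a, b in paralog_pairs:
            if (a in inferred) != (b in inferred):
                inferred.update((a, b))
                changed = True
    return inferred


# Rad51 is reported in every group; "geneX" is a hypothetical gene
# lacking a detected homolog in Metamonada.
presence = {
    "Rad51": set(GROUPS),
    "geneX": set(GROUPS) - {"Metamonada"},
}
by_distribution = present_in_all_groups(presence)

# Spo11-1 is inferred in the LECA in the text; Spo11-3 is its paralog.
by_paralogy = extend_by_paralogy({"Spo11-1"}, [("Spo11-1", "Spo11-3")])
```

The second rule is only as strong as the support for the duplication itself, which is why it is applied here to paralogs whose clades are strongly supported.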
Since the Spo11-1 gene is inferred to have been present in the LECA and the
Spo11-3 gene arose at the same time, Spo11-3 is also likely to have been present in the
LECA. This inference is especially important if genes are apparently absent from
particular groups (e.g. Discoba and/or Metamonada) that have been hypothesized as the
earliest-diverging eukaryotes. Absences from such groups could indicate that a
gene was not present in the LECA but arose early during eukaryotic evolution, after the
divergence of some eukaryotes. Thus, the Spo11-3, Msh3, Mlh2, and Rec8 genes were
likely present in the LECA (Figure 4.2). Only one gene (Spo11-2) may have arisen later
during eukaryotic evolution given its distribution (Figures 4.1; 4.5; and 4.6). The Spo11-2 gene is apparently absent from genomes of the Metamonada tested (Trichomonas
vaginalis, Giardia intestinalis, and Spironucleus vortens). In addition, the phylogenetic
analyses of Spo11-2 protein sequence data (Figures 4.5 and 4.6) retrieve topologies in
which the Spo11-2 clade is nested within the Spo11-1 clade. This indicates that the
Spo11-2 gene may have arisen later during eukaryotic evolution, if the Metamonada are
the earliest-diverging eukaryotes (Figure 1.2).
The phylogenetic analyses performed here also indicate that all of the meiotic
genes tested arose by gene duplication (Figure 4.16) and many are orthologous to
archaebacterial genes that encode proteins that function during DNA damage repair.
Interestingly, several genes that encode proteins that function only during meiosis in
model organisms (Hop1, Rec8, Spo11-1, Spo11-2, Dmc1, Msh4, Msh5, and Mer3) are
paralogs of genes whose products also function during meiosis, mitosis, DNA damage
repair, or maintenance (Rev7, Rad21, Spo11-3, Rad51, Msh2, Msh3, Msh6, Brr2, and
Slh1) (Table 4.2). Further, some genes are orthologs of archaebacterial genes whose
products function during DNA damage repair or maintenance (Top6A, Ski2, RadA, MutS,
and MutL). Whether the duplications of meiosis-specific genes occurred simultaneously,
due to large-scale genome duplication events, is unknown. However, prior studies
indicate that great numbers of gene duplications are likely to have occurred in the LECA
(Zhou, Lin, and Ma 2010). Although we cannot be certain that the gene duplication
events yielding meiosis-specific genes mark the origin of meiosis itself, these gene
duplications most certainly resulted in the meiotic functions observed today.
Assessment of distributions
To determine the likelihood that observed gene absences indicate true losses of
genes from genomes (Figures 4.1 and 4.17 and Table 4.2) the heuristic metric developed
in Chapter 2 was applied. Here, the proportions of observed absences explained by
sequence detection failures (type II error) were estimated. Among observed absences
from genomes of any of the Dmc1, Pms1, Msh3, Msh4, Msh6, and Mer3 genes there is a
~ 1-10% chance that the gene is present in the genome sequence but was not detected. If
a given organism’s genome is well covered, then the gene has most likely been lost by
the organism (e.g. Ustilago maydis Rad51). However, if there is a possibility that the
genome sequencing is incomplete, the gene may be present in the genome but not in the
genome assembly (e.g. Bigelowiella natans Pms1). The data analyzed here indicate that
sporadic secondary gene losses occur frequently among diverse eukaryotes (Figures 4.1
and 4.17), a pattern first demonstrated among genes that encode DNA strand exchange
proteins (Chapter 2).
Case study: the Spo11 genes
Some apparent absences of meiotic genes are more ambiguous. For example, the
absence of the Spo11-1 and Spo11-2 genes from the genome sequences of D. purpureum
and Polysphondylium pallidum may be due to either true losses of genes from the
genomes or false negatives caused by sequence detection failures. Assessment of Spo11-1 and Spo11-2 protein sequences (meiosis-specific transesterases that introduce dsDNA
breaks necessary for homologous recombination (Keeney, Giroux, and Kleckner 1997;
Baudat and Keeney 2001; Lichten 2001; Szekvolgyi and Nicolas 2010)) indicates that a
high proportion of observed absences are likely due to false negatives caused by the
inability to detect gene sequences (0.67 and 0.45, respectively) (Table 4.2). In addition,
only the genome of D. purpureum has been completely sequenced. The authors of a
previous study of the distribution of the Spo11-1, Spo11-2, and Spo11-3 genes
hypothesized that the observed absences of Spo11-1 and Spo11-2 genes from the genome
of D. discoideum may be due to incomplete genome sequence coverage or a result of
mutagenesis during axenic cultivation (Malik et al. 2007).
However, the additional absence of these genes from D. purpureum implies that the
Spo11-1 and Spo11-2 genes were absent in the common ancestor of the two
Dictyostelium species. Recent population genetic data indicate that D. discoideum
populations display a rapid decay of linkage disequilibrium and recombinant genotypes,
consistent with meiotic recombination (Flowers et al. 2010). In addition, formation of
macrocysts (resulting from the fusion of two haploid cells) has been observed with D.
purpureum (Mehdiabadi et al. 2009) and D. giganteum (Mehdiabadi et al. 2010) and
synaptonemal complexes have been observed in D. discoideum (Okada et al. 1986).
Therefore, it is likely that meiosis and homologous recombination occur in D.
purpureum. There are three possibilities that explain the apparent absences of the Spo11-1 and Spo11-2 genes in D. purpureum: i) the rate of dsDNA breaks is sufficiently high to
stimulate interhomolog DNA strand exchange without Spo11-1 or Spo11-2; ii) another
nuclease is introducing dsDNA breaks; or iii) the sequences have diverged, making them
difficult to detect. This study cannot distinguish among these possibilities.
Conclusions
We performed an extensive search for homologs of 20 genes that encode products
that are known to catalyze at least four important tasks during meiosis: 1) synaptonemal
complex formation; 2) homologous recombination; 3) Holliday junction resolution; and
4) sister chromatid cohesion (Table 4.2). The distributions of ten genes (Rad51, Dmc1,
Hop2, Mnd1, Mlh1, Mlh3, Msh2, Msh4, Msh5, and Msh6) indicate their
presence in the genomes of representatives from every currently recognized eukaryotic
supergroup and the Apusozoa (Thecamonas trahens) (Figures 4.1 and 4.2). Some genes
are absent from the genomes of one or more eukaryotic supergroups or T. trahens (Hop1,
Rad21, Spo11-1, Pms1, and Mer3). However, based upon our current understanding of
the evolutionary relationships of eukaryotes (Figure 1.2), we determined that these genes
are likely to have been present in the last eukaryotic common ancestor (LECA). We also
performed phylogenetic analyses on the proteins translated from all of the genes collected
(Figures 4.3 – 4.15). We used protein sequences of paralogs and archaebacterial
orthologs to root the phylogenies. On the basis of these analyses we determined that an
additional four genes (Rec8, Spo11-3, Mlh2, and Msh3) are likely to have been present in
the LECA, despite their apparent absences from representatives of multiple eukaryotic
supergroups. Only one gene (Spo11-2) may have arisen later during eukaryotic
evolution, based upon its distribution and phylogenetic analyses that retrieve topologies
in which the Spo11-2 clade is nested within the Spo11-1 clade.
Frequently, we observed that genes arose by duplication, often in the LECA, of
genes that are likely to have encoded proteins that functioned during DNA damage repair
(Figure 4.16). In addition, we noticed that many homologs are paralogs in which at least
one gene encodes products that function only during meiosis in model organisms and at
least one other gene encodes products that function during both meiosis and mitosis. Nearly all of the
genes here (except Hop2 and Mnd1 for which no other eukaryotic or archaebacterial
orthologs have been identified) likely arose by duplications of genes that encode DNA
repair proteins, yielding multiple genes whose products are both meiosis-specific and
generalist in nature, within the LECA. These data are most consistent with the possibility
that meiosis arose from mitosis (Marcon and Moens 2005; d'Erfurth et al. 2009; Wilkins
and Holliday 2009).
Methods
Database Searches
Keyword searches (e.g. Saccharomyces cerevisiae Rad51) of the National
Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov/) protein sequence
database retrieved protein sequences for representatives of animals, fungi, and plants.
In addition, the Clusters of euKaryotic Orthologous Groups of proteins (KOGs)
database for each protein was accessed (Tatusov et al. 2000). Sequence identities were
initially verified by using the tBLASTn (Altschul et al. 1997) option of the Basic Local
Alignment Search Tool (BLAST), in which a translated nucleotide database is
searched using a protein query, and evaluating the results (bi-directional BLAST).
These protein sequences were subsequently used as queries to search genome sequence
databases at NCBI and other publicly available sites (Table 4.3) with BLASTp,
tBLASTn, and BLASTn, as necessary, for all Hop1, Rev7, Rad21, Rec8,
Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2, Mnd1, Mlh1-3, Pms1, Msh2-6, Mer3,
Slh1, and Brr2 sequences available for a set of 46 taxa from June through August 2010.
Once additional protein sequence data were obtained, searches were also performed
using protein sequence data from closely related organisms likely to share more recent
common ancestors as queries. Identities of sequences were again confirmed with bi-directional BLAST (BLASTx and tBLASTn, as necessary). When necessary,
phylogenetically verified (see below) protein sequences were aligned with MUSCLE
v3.7 (Edgar 2004) and used to create position specific scoring matrices (PSSMs) with
the tBLASTn module (available at
http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download). Matrices were then used as queries with the PSI-BLAST module to
search nucleotide genome sequence databases. When multiple sequences were found
for a species, only the most complete was retained. If no previously annotated protein
sequence was available in a database, then nucleotide sequences were annotated by
hand, using Sequencher v4.5 (Genecodes, Ann Arbor, MI). Exons were identified on
the basis of inferred translations using BLASTx pairwise comparisons to the NCBI
protein sequence database and locations of putative intron splice donor and acceptor site
sequences (e.g. G/GT to AG/G, although others may be observed among diverse
eukaryotes). Additional comparisons of resulting amino acid sequences to other
homologs were performed with alignments created using MUSCLE v3.7 (Edgar 2004)
and observed with BioEdit v7.0.5.3 (Hall 1999).
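The bi-directional BLAST check described above can be illustrated with a short Python sketch. It assumes the best hit of each search has already been extracted into a dictionary; the sequence identifiers are hypothetical, and the function reciprocal_best_hits is an illustration rather than the procedure used here, which also relied on phylogenetic verification.

```python
def reciprocal_best_hits(forward: dict[str, str],
                         reverse: dict[str, str]) -> dict[str, str]:
    """Keep query -> hit pairs whose hit, searched back, returns the query.

    forward maps each query to its best hit in the target genome;
    reverse maps each target sequence to its best hit among the queries.
    """
    return {
        query: hit
        for query, hit in forward.items()
        if reverse.get(hit) == query
    }


# Hypothetical best hits: yeast queries against an unnamed target genome.
forward_hits = {
    "Sc_Rad51": "target_seq1",
    "Sc_Dmc1": "target_seq2",
    "Sc_Hop1": "target_seq3",
}
reverse_hits = {
    "target_seq1": "Sc_Rad51",
    "target_seq2": "Sc_Rad51",  # best match is the paralog, so not reciprocal
    "target_seq3": "Sc_Hop1",
}
candidates = reciprocal_best_hits(forward_hits, reverse_hits)
```

A failed reciprocal test for a paralog pair such as Rad51 and Dmc1 does not by itself show that a gene is absent, which is one reason candidate sequences were also placed on phylogenies.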
Phylogenetic analyses
We aligned all potential eukaryotic protein sequences with archaebacterial
protein sequences using MUSCLE v3.7 (Edgar 2004), manually edited them (removing
ambiguously aligned columns and gaps) with BioEdit v7.0.5.3 (Hall 1999), and
performed phylogenetic analyses on the set. Optimal protein substitution models and
parameters were determined for each alignment independently with Modelgenerator
v0.85 (Keane et al. 2006). Analyses were performed with RAxML v7.2.7 (Stamatakis,
Hoover, and Rougemont 2008), for 1000 replicates at the CIPRES Science Gateway
v3.0 (Miller et al. 2009).
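Part of the alignment editing described above, removal of gapped columns, is mechanical and can be sketched in Python. The editing in this study was done by hand in BioEdit, and the identification of ambiguously aligned columns requires judgment that the sketch does not attempt; the function name drop_gap_columns and the toy sequences are hypothetical.

```python
def drop_gap_columns(alignment: dict[str, str],
                     gap_chars: str = "-") -> dict[str, str]:
    """Remove every alignment column in which any sequence has a gap."""
    if not alignment:
        return {}
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all aligned sequences must have equal length")
    keep = [
        i for i in range(len(seqs[0]))
        if all(seq[i] not in gap_chars for seq in seqs)
    ]
    return {
        name: "".join(seq[i] for i in keep)
        for name, seq in zip(names, seqs)
    }


# Toy alignment (hypothetical sequences).
trimmed = drop_gap_columns({"taxon_a": "AC-DE", "taxon_b": "ACGD-"})
```

Removing every gapped column is the strictest possible rule and can discard many sites when partial sequences are included, so it is shown only to make the operation concrete.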
Inventory Assembly
Genes were determined to be present in an organism when putative orthologs
were discovered and identified with phylogenetic analyses. To determine the numbers
of observed sequence absences attributable to failures of the sequence detection
regimen, Smith-Waterman pairwise alignment scores (Homo sapiens versus
Saccharomyces cerevisiae) were calculated with the PRSS/PRFX tool
(http://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=shuffle) (Table 2.2).
The numbers of sequence detection failures expected for each protein, given its Smith-Waterman score, were determined with a Poisson regression analysis of protein
sequence data previously collected (Chapter 2) for ten RNA Polymerase I and three
Replication Protein A subunits among diverse eukaryotes with completed genome
sequences.
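The form of this regression can be illustrated with a minimal Python sketch that fits log(expected failures) = b0 + b1 * score by Newton-Raphson. The data below are synthetic, the score is assumed to be rescaled to a small range, and the function names are hypothetical; the sketch does not reproduce the analysis of Chapter 2.

```python
import math


def fit_poisson(xs, ys, iterations=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Information matrix entries (negative Hessian).
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def expected_failures(b0, b1, score):
    """Expected number of detection failures at a given (rescaled) score."""
    return math.exp(b0 + b1 * score)


# Synthetic data: failures decline as the rescaled alignment score rises.
scores = [0, 1, 2, 3, 4]
failures = [math.exp(1.0 - 0.5 * s) for s in scores]
b0_hat, b1_hat = fit_poisson(scores, failures)
```

Dividing the expected number of failures for a protein by the number of absences actually observed gives a ratio of the kind shaded in Figure 4.1.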
Figure 4.1: Distribution of 20 homologs that function during meiosis among 46
eukaryotes representing all eukaryotic supergroups. Cells filled in with
color (Opisthokonta in purple, Amoebozoa in blue, Archaeplastida in green,
Chromalveolata in orange, Rhizaria in eggplant, Excavata in brown, and
Apusozoa in black) indicate the homolog was found and phylogenetically
verified. Labels of proteins known to function only during meiosis in
model organisms are blue. Shades of grey indicate the proportion of
observed absences attributed to sequence detection failures, estimated from
Smith-Waterman pairwise alignment scores (Saccharomyces cerevisiae
versus Homo sapiens) (see Methods). Darker greys indicate the gene is not
present in the genome sequence sampled while lighter greys indicate the
gene may be present but was not detected. Black protein labels identify
sequences discovered in all eukaryotes sampled. Asterisks identify
completed genome sequences (>8.0x WGS coverage or sequenced end-toend).
[Figure 4.1 graphic: presence/absence matrix of proteins (columns) by taxa (rows).]
Legend (ratio of number of undetected sequences expected to observed): no failures observed; 0.01-0.10; 0.11-0.20; 0.21-0.50; 0.51-1.00; > 1.00.
Proteins: Hop1, Rad21, Rec8, Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2, Mnd1, Pms1, Mlh1, Mlh2, Mlh3, Msh2, Msh3, Msh4, Msh5, Msh6, Mer3.
Taxa:
METAMONADA: Trichomonas vaginalis, Giardia intestinalis*, Spironucleus vortens
DISCOBA: Naegleria gruberi*, Leishmania major/donovani, Trypanosoma cruzi*
ARCHAEPLASTIDA: Arabidopsis thaliana*, Oryza sativa, Physcomitrella patens*, Chlamydomonas reinhardtii*, Chlorella sp.*, Ostreococcus tauri*, Galdieria sulphuraria
HAPTOPHYTA: Emiliania huxleyi*
STRAMENOPILA: Thalassiosira pseudonana*, Phaeodactylum tricornutum*, Fragilariopsis cylindrus, Phytophthora ramorum/sojae*, Aureococcus anophagefferens, Blastocystis hominis
ALVEOLATA: Plasmodium vivax*, Toxoplasma gondii, Theileria parva/annulata*, Cryptosporidium muris*, Perkinsus marinus, Paramecium tetraurelia*
RHIZARIA: Bigelowiella natans
AMOEBOZOA: Entamoeba dispar, Dictyostelium purpureum*, Polysphondylium pallidum
HOLOZOA: Homo sapiens*, Ciona*, Nematostella vectensis, Trichoplax adhaerens*, Monosiga brevicollis*, Salpingoeca rosetta*, Capsaspora owczarzaki*
FUNGI: Saccharomyces cerevisiae*, Aspergillus fumigatus*, Ustilago maydis*, Cryptococcus neoformans, Laccaria bicolor*, Coprinopsis cinerea*, Mucor circinelloides*, Batrachochytrium dendrobatidis*
APUSOZOA: Thecamonas trahens*
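The grey-shade legend of Figure 4.1 can be read as a simple binning rule. The following is a minimal Python sketch of that rule, not the procedure from Methods. It assumes the ratio is the expected number of detection failures divided by the observed number of absences, that ratios are rounded to two decimals, and that each bin includes its upper bound. The name `shade_bin` and these boundary choices are hypothetical.

```python
# Bin upper bounds and labels taken from the Figure 4.1 legend.
# Treating each upper bound as inclusive is an assumption.
BINS = [
    (0.10, "0.01-0.10"),
    (0.20, "0.11-0.20"),
    (0.50, "0.21-0.50"),
    (1.00, "0.51-1.00"),
]


def shade_bin(expected_failures: float, observed_absences: int) -> str:
    """Return the legend bin for one gene (hypothetical helper)."""
    if observed_absences == 0:
        return "no failures observed"
    # Assumed definition: expected detection failures / observed absences.
    ratio = round(expected_failures / observed_absences, 2)
    for upper, label in BINS:
        if ratio <= upper:
            # Note: a ratio that rounds to 0.00 also lands in the first bin.
            return label
    return "> 1.00"


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not values from the thesis.
    for expected, observed in [(0.5, 0), (1, 10), (3, 10), (12, 10)]:
        print(expected, observed, shade_bin(expected, observed))
```

Under this reading, a low ratio means few of the observed absences are explained by detection failure, which corresponds to the darker greys in the caption.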
[Figure 4.2 graphic: presence matrix with rows LECA, METAMONADA, DISCOBA, ARCHAEPLASTIDA, CHROMALVEOLATA, RHIZARIA, AMOEBOZOA, OPISTHOKONTA, and APUSOZOA. Columns: Hop1, Rad21, Rec8, Spo11-1, Spo11-2, Spo11-3, Rad51, Dmc1, Hop2, Mnd1, Pms1, Mlh1, Mlh2, Mlh3, Msh2, Msh3, Msh4, Msh5, Msh6, Mer3. Column groups: Synaptonemal complex formation; Sister-chromatid cohesion; dsDNA break formation; DNA strand exchange; Holliday-junction resolution.]
Figure 4.2: Presence of twenty homologs that function during meiosis in the last
eukaryotic common ancestor (LECA) inferred by their distribution
among eukaryotic supergroups. Cells filled in with color (Opisthokonta
in purple, Amoebozoa in blue, Archaeplastida in green, Chromalveolata in
orange, Rhizaria in eggplant, Excavata in brown, and Apusozoa in black)
indicate the homolog was found and phylogenetically verified within that
group. Labels of proteins known to function only during meiosis in model
organisms are blue. Black plusses indicate the gene was most likely present
in the LECA based upon its distribution among all eukaryotic supergroups;
red plusses indicate the presence of the gene in the LECA on the basis of
phylogenetic analyses.
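The distribution-based inference behind the black plusses can be sketched in Python. This is a deliberately simplified illustration under one stated assumption: a gene is scored as inferred in the LECA only if a verified homolog was found in every sampled supergroup. The function name `inferred_in_leca`, the rule itself, and the toy gene names are hypothetical; the thesis may apply a different criterion.

```python
from typing import Dict, Set


def inferred_in_leca(found_in: Dict[str, Set[str]], protein: str) -> bool:
    """Assumed rule: present in the LECA if found in every sampled supergroup."""
    # An empty sample supports no inference at all.
    if not found_in:
        return False
    return all(protein in proteins for proteins in found_in.values())


if __name__ == "__main__":
    # Toy data with placeholder gene names; not results from the thesis.
    toy = {
        "Opisthokonta": {"GeneA", "GeneB"},
        "Amoebozoa": {"GeneA"},
        "Excavata": {"GeneA", "GeneB"},
    }
    print(inferred_in_leca(toy, "GeneA"))
    print(inferred_in_leca(toy, "GeneB"))
```

A rule this strict treats any supergroup-wide absence as evidence against ancestral presence. The red plusses in the figure cover the cases the caption says were decided by phylogenetic analyses instead.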
[Figure 4.3 graphic: unrooted tree with Hop1 and Rev7 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.5 substitutions/site.]
Figure 4.3: Unrooted phylogenetic tree of 50 eukaryotic Hop1 and Rev7 homologs.
Trees were estimated with maximum likelihood inference (LG+G; 1000
replicates) from 129 aligned amino acids. Opisthokonta are labeled with
purple, Amoebozoa with blue, Archaeplastida with green, Chromalveolata
with orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa
with black. Labels of proteins known to function only during meiosis in
model organisms are blue. The best RAxML v7.2.7 tree is shown.
Figure 4.4: Unrooted phylogenetic tree of 49 eukaryotic Rad21 and Rec8 homologs.
Trees were estimated with maximum likelihood inference (LG+G; 1000
replicates) from 171 aligned amino acids. Opisthokonta are labeled with
purple, Amoebozoa with blue, Archaeplastida with green, Chromalveolata
with orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa
with black. Labels of proteins known to function only during meiosis in
model organisms are blue. The best RAxML v7.2.7 tree is shown.
[Figure 4.4 graphic: unrooted tree with Rad21 and Rec8 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
Figure 4.5: Unrooted phylogenetic tree of 69 eukaryotic Spo11-1, Spo11-2, and
Spo11-3 homologs with 6 archaebacterial Top6A homologs. Trees were
estimated with maximum likelihood inference (LG+G+I; 1000 replicates)
from 170 aligned amino acids. Opisthokonta are labeled with purple,
Amoebozoa with blue, Archaeplastida with green, Chromalveolata with
orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa with
black. Labels of proteins known to function only during meiosis in model
organisms are blue. The best RAxML v7.2.7 tree is shown.
[Figure 4.5 graphic: unrooted tree with Spo11-1, Spo11-2, Spo11-3, and Top6A clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
[Figure 4.6 graphic: unrooted tree with Spo11-1, Spo11-2, and Spo11-3 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
Figure 4.6: Unrooted phylogenetic tree of 69 eukaryotic Spo11-1, Spo11-2, and
Spo11-3 homologs. Trees were estimated with maximum likelihood
inference (LG+G+I; 1000 replicates) from 170 aligned amino acids.
Opisthokonta are labeled with purple, Amoebozoa with blue, Archaeplastida
with green, Chromalveolata with orange, Rhizaria with eggplant, Excavata
with brown, and Apusozoa with black. Labels of proteins known to function
only during meiosis in model organisms are blue. The best RAxML v7.2.7
tree is shown.
Figure 4.7: Unrooted phylogenetic tree of 81 eukaryotic Rad51 and Dmc1 homologs
with 6 archaebacterial RadA homologs. Trees were estimated with
maximum likelihood inference (LG+G; 1000 replicates) from 305 aligned
amino acids. Opisthokonta are labeled with purple, Amoebozoa with blue,
Archaeplastida with green, Chromalveolata with orange, Rhizaria with
eggplant, Excavata with brown, and Apusozoa with black. Labels of proteins
known to function only during meiosis in model organisms are blue. The best
RAxML v7.2.7 tree is shown.
[Figure 4.7 graphic: unrooted tree with Rad51, Dmc1, and RadA clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.1 substitutions/site.]
[Figure 4.8 graphic: unrooted tree with Rad51 and Dmc1 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.1 substitutions/site.]
Figure 4.8: Unrooted phylogenetic tree of 81 eukaryotic Rad51 and Dmc1 homologs.
Trees were estimated with maximum likelihood inference (LG+G;
1000 replicates) from 305 aligned amino acids. Opisthokonta are labeled
with purple, Amoebozoa with blue, Archaeplastida with green,
Chromalveolata with orange, Rhizaria with eggplant, Excavata with brown,
and Apusozoa with black. Labels of proteins known to function only during
meiosis in model organisms are blue. The best RAxML v7.2.7 tree is shown.
[Figure 4.9 graphic: unrooted tree with Hop2 and Mnd1 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
Figure 4.9: Unrooted phylogenetic tree of 82 eukaryotic Hop2 and Mnd1 homologs.
Trees were estimated with maximum likelihood inference (LG+G; 1000
replicates) from 98 aligned amino acids. Opisthokonta are labeled with
purple, Amoebozoa with blue, Archaeplastida with green, Chromalveolata
with orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa
with black. Labels of proteins known to function only during meiosis in
model organisms are blue. The best RAxML v7.2.7 tree is shown.
Figure 4.10: Unrooted phylogenetic tree of 131 eukaryotic Mlh1, Mlh2, Mlh3, and
Pms1 homologs with 4 archaebacterial MutL homologs. Trees were
estimated with maximum likelihood inference (LG+G+F; 1000 replicates)
from 185 aligned amino acids. Opisthokonta are labeled with purple,
Amoebozoa with blue, Archaeplastida with green, Chromalveolata with
orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa with
black. The best RAxML v7.2.7 tree is shown.
[Figure 4.10 graphic: unrooted tree with Mlh1, Mlh2, Mlh3, Pms1, and MutL clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.5 substitutions/site.]
[Figure 4.11 graphic: unrooted tree with Mlh1, Mlh2, Mlh3, and Pms1 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
Figure 4.11: Unrooted phylogenetic tree of 131 eukaryotic Mlh1, Mlh2, Mlh3, and
Pms1 homologs. Trees were estimated with maximum likelihood inference
(LG+G+F; 1000 replicates) from 185 aligned amino acids. Opisthokonta
are labeled with purple, Amoebozoa with blue, Archaeplastida with green,
Chromalveolata with orange, Rhizaria with eggplant, Excavata with brown,
and Apusozoa with black. The best RAxML v7.2.7 tree is shown.
Figure 4.12: Unrooted phylogenetic tree of 113 eukaryotic Mer3, Brr2, and Slh1
homologs with 6 archaebacterial Ski2 homologs. Trees were estimated
with maximum likelihood inference (LG+G+I; 1000 replicates) from 338
aligned amino acids. Opisthokonta are labeled with purple, Amoebozoa
with blue, Archaeplastida with green, Chromalveolata with orange, Rhizaria
with eggplant, Excavata with brown, and Apusozoa with black. Labels of
proteins known to function only during meiosis in model organisms are
blue. The best RAxML v7.2.7 tree is shown.
[Figure 4.12 graphic: unrooted tree with Mer3, Brr2, Slh1, and Ski2 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.1 substitutions/site.]
Figure 4.13: Unrooted phylogenetic tree of 113 eukaryotic Mer3, Brr2, and Slh1
homologs. Trees were estimated with maximum likelihood inference
(LG+G+I; 1000 replicates) from 338 aligned amino acids. Opisthokonta are
labeled with purple, Amoebozoa with blue, Archaeplastida with green,
Chromalveolata with orange, Rhizaria with eggplant, Excavata with brown,
and Apusozoa with black. Labels of proteins known to function only during
meiosis in model organisms are blue. The best RAxML v7.2.7 tree is
shown.
[Figure 4.13 graphic: unrooted tree with Mer3, Brr2, and Slh1 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.1 substitutions/site.]
Figure 4.14: Unrooted phylogenetic tree of 183 eukaryotic Msh2, Msh3, Msh4,
Msh5, and Msh6 homologs with 5 archaebacterial MutS homologs.
Trees were estimated with maximum likelihood inference (LG+G+F; 1000
replicates) from 259 aligned amino acids. Opisthokonta are labeled with
purple, Amoebozoa with blue, Archaeplastida with green, Chromalveolata
with orange, Rhizaria with eggplant, Excavata with brown, and Apusozoa
with black. Labels of proteins known to function only during meiosis in
model organisms are blue. The best RAxML v7.2.7 tree is shown.
[Figure 4.14 graphic: unrooted tree with Msh2, Msh3, Msh4, Msh5, Msh6, and MutS clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.2 substitutions/site.]
[Figure 4.15 graphic: unrooted tree with Msh2, Msh3, Msh4, Msh5, and Msh6 clades labeled; taxon labels and branch support values not reproduced. Scale bar: 0.5 substitutions/site.]
Figure 4.15: Unrooted phylogenetic tree of 183 eukaryotic Msh2, Msh3, Msh4,
Msh5, and Msh6 homologs. Trees were estimated with maximum
likelihood inference (LG+G+F; 1000 replicates) from 259 aligned amino
acids. Opisthokonta are labeled with purple, Amoebozoa with blue,
Archaeplastida with green, Chromalveolata with orange, Rhizaria with
eggplant, Excavata with brown, and Apusozoa with black. Labels of
proteins known to function only during meiosis in model organisms are
blue. The best RAxML v7.2.7 tree is shown.
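[Figure 4.16 graphic: radial tree topologies, each annotated with one or more of the functional-category letters A-E defined in the caption.]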
Figure 4.16: Radial tree topologies of archaebacterial and eukaryotic homologs.
Phylogenetic analyses were performed using a maximum likelihood
approach on protein sequences of eukaryotic homologs encoding products
that function during meiosis with paralogs and archaebacterial homologs.
Blue bubbles indicate proteins known to function only during meiosis in
model organisms and pink bubbles indicate proteins that function during
meiosis, mitosis, and/or DNA mismatch repair. Branch colors of
eukaryotes correspond to supergroups (Opisthokonta in purple,
Amoebozoa in blue, Archaeplastida in green, Chromalveolata in orange,
Rhizaria in eggplant, Excavata in brown, and Apusozoa in black). Letters
indicate that at least one of the proteins in the tree is important for
synaptonemal complex formation (A), sister chromatid cohesion (B),
double-strand breaks (C), DNA strand exchange (D), or Holliday junction
resolution (E).
Table 4.1: Proteins involved in four general categories of meiosis and their functions.

| Category | Protein | Function |
|---|---|---|
| Synaptonemal complex formation and pairing of homologous chromosomes | Hop1 | Binds discrete sites on axial elements and promotes synapsis between homologous DNA duplexes during meiotic prophase I (Anuradha and Muniyappa 2004a; Anuradha and Muniyappa 2004b; Latypov et al.). Hop1 is a paralog of Rev7, a gene encoding an accessory subunit of DNA polymerase zeta that is involved in translesion synthesis during post-replication repair and dsDNA break repair (Acharya et al. 2005; Kolas and Durocher 2006; Lee and Myung 2008). |
| Sister-chromatid cohesion | Rad21, Rec8 | Members of cohesin complexes that bind Smc1/Smc3 heterodimers, forming large rings around chromosomes during S-phase. Proteolytic cleavage by the separase triggers sister-chromatid disjunction during mitotic anaphase (Rad21) or meiotic anaphase II (Rec8) (Gruber, Haering, and Nasmyth 2003). |
| Double-strand DNA breaks | Spo11-1, Spo11-2, Spo11-3 | Form dimers that cut dsDNA, generating 5'-nucleoprotein linkages on either side of the break that may become sites of recombination. Monomers are removed from the ends as oligonucleotide-bound complexes, leaving ssDNA tails. Spo11-1 and -2 function only during meiosis. In plants, Spo11-3 functions during vegetative growth (Lin and Smith 1994; Keeney, Giroux, and Kleckner 1997; Dernburg et al. 1998; Baudat and Keeney 2001; Hartung et al. 2002; Sugimoto-Shirasu et al. 2002; Szekvolgyi and Nicolas). |
| DNA strand exchange | Rad51, Dmc1 | Both form helical filaments on ss- and ds-DNA, catalyze strand exchange, and cause ssDNA extension and dsDNA rotational transition during mitotic prophase (Rad51) and meiotic prophase I (Rad51/Dmc1). Rad51 may recruit Dmc1 to the pre-synaptic filament (Nishinaka et al. 1998; Krogh and Symington 2004; Lopez-Casamichana et al. 2008). |
| DNA strand exchange | Hop2, Mnd1 | Together, they form heterodimers that stabilize Dmc1-ssDNA pre-synaptic filaments and stabilize dsDNA during DNA strand exchange (Chen et al. 2004; Henry et al. 2006). |

Note: names of proteins known to function only during meiosis are bolded
Table 4.1: Proteins involved in four general categories of meiosis and their
functions. Names of proteins known to function only during meiosis are
bolded. – continued

Category: Holliday junction resolution
Proteins: Mlh1-3, Pms1
Function: Mlh1 forms heterodimers with Pms1, Mlh2, and Mlh3. Mlh1-Pms1 functions during
mismatch repair, interacting with Msh2/3 and Msh2/6 heterodimers. During meiosis,
Mlh1/2 and Mlh1/3 function to resolve heteroduplexes (Hunter and Borts 1997; Borts,
Chambers, and Abdullah 2000; Hoffmann et al. 2003).

Category: Holliday junction resolution
Proteins: Msh2, Msh3, Msh6, Msh4, Msh5
Function: Form heterodimer sliding clamps that may diffuse along duplex DNA adjacent to
mismatches (Msh2/3 or Msh2/6), marking the location of the lesion and signaling
downstream machinery. During meiosis, Msh4/5 form clamps that bind Holliday junctions
and Msh2/6 (Snowden et al. 2004).

Category: Holliday junction resolution
Protein: Mer3
Function: A helicase with roles in synaptonemal complex formation, crossover
interference, and unwinding of Holliday junctions during meiosis (Bishop and Zickler
2004; Borner, Kleckner, and Hunter 2004; Sugawara et al. 2009; Wang et al. 2009). Mer3
is paralogous to Slh1, which encodes a putative RNA helicase that is involved in
translation inhibition of non-poly(A) mRNAs and is required for suppressing dsRNA
viruses, and to Brr2, which encodes an RNA helicase required for activation of
spliceosomal catalysis (Noble and Guthrie 1996; de la Cruz, Kressler, and Linder 1999;
Searfoss, Dever, and Wickner 2001).
Figure 4.17: Number of detection failures as predicted by Poisson regression
analysis of RNA Polymerase I and Replication Protein A subunits with
observed numbers of detection failures for 18 meiotic genes. The
numbers of observed meiotic gene detection failures (indicated with open
circles) are plotted against the natural logarithm of Smith-Waterman
pairwise alignment scores of Homo sapiens and Saccharomyces cerevisiae.
Poisson regression analyses were performed on the observed numbers of
failures to detect RNA Polymerase I subunits (A190, A135, AC40, AC19,
AC12.2, Rpb5, Rpb6, Rpb8, Rpb10, and Rpb12) and Replication Protein
A subunits (RPA1-3) among 34 taxa with at least 8.0X whole-genome
shotgun sequencing coverage relative to Smith-Waterman scores
(Methods). The predicted numbers of failures relative to Smith-Waterman
scores (black dots) were plotted with Wald 90% confidence limits (green
dots). Shades of grey indicate the proportion of observed absences
attributed to sequence detection failures, estimated from Smith-Waterman
pairwise alignment scores (S. cerevisiae versus H. sapiens) (see Methods).
Darker greys indicate the gene is not present in the genome sequence
sampled while lighter greys indicate the gene may be present but was not
detected. Black labels identify sequences discovered in all eukaryotes
sampled.
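
The regression described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the figure: it assumes a log-linear Poisson model, ln(expected failures) = a + b × (Smith-Waterman score), and fits it by Newton-Raphson. The function names and the control-gene counts in the demonstration are hypothetical and are not the RNA Polymerase I or Replication Protein A data.

```python
import math


def fit_poisson_regression(scores, failures, iterations=50, tol=1e-12):
    """Fit ln(expected failures) = a + b * score by Newton-Raphson.

    Assumed model for illustration only. Returns the pair (a, b).
    """
    scale = max(scores)                 # rescale scores to keep the arithmetic stable
    x = [s / scale for s in scores]
    a = math.log(sum(failures) / len(failures))  # intercept-only starting point
    b = 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Gradient of the Poisson log-likelihood with respect to (a, b).
        g0 = sum(y - m for y, m in zip(failures, mu))
        g1 = sum((y - m) * xi for y, m, xi in zip(failures, mu, x))
        # Entries of the negated Hessian (Fisher information).
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b / scale                 # undo the rescaling of the slope


def predicted_failures(a, b, score):
    """Expected number of detection failures for a given alignment score."""
    return math.exp(a + b * score)


if __name__ == "__main__":
    # Hypothetical control genes: alignment score and number of taxa with no hit.
    control_scores = [150, 400, 800, 1200, 1600, 2200]
    control_failures = [12, 7, 3, 1, 0, 0]
    a, b = fit_poisson_regression(control_scores, control_failures)
    print("predicted failures at score 500:", predicted_failures(a, b, 500))
```

Under this sketch, a meiotic gene whose observed number of absences lies well above the fitted curve at its alignment score would be a candidate for true loss rather than detection failure.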
Table 4.2: Observed numbers of sequence absences from 46 genomes, Smith-Waterman
pairwise alignment scores, predicted numbers of absences, and the proportion of
observed absences likely due to detection failures for 20 proteins that function
during meiosis.

Protein    # Observed   Smith-Waterman    # Predicted   Ratio
           absences     alignment score   absences      (Exp./Obs.)
Hop1       14           261               9.24          0.66
Rad21      8            225               10.26         1.28
Rec8       34           102               14.66         0.43
Spo11-1    15           233               10.03         0.67
Spo11-2    23           220               10.41         0.45
Spo11-3    31           480               4.90          0.16
Rad51      2            1538              0.23          0.11
Dmc1       8            1178              0.65          0.08
Hop2       7            161               12.35         1.76
Mnd1       3            328               7.61          2.54
Pms1       3            1544              0.22          0.07
Mlh1       0            1664              0.16          0.00
Mlh2       35           405               6.09          0.17
Mlh3       15           427               5.71          0.38
Msh2       0            2285              0.03          0.00
Msh3       22           1534              0.23          0.01
Msh4       12           1175              0.65          0.05
Msh5       11           868               0.59          0.14
Msh6       3            1780              0.11          0.04
Mer3       16           1572              0.21          0.01

Note: Smith-Waterman alignment scores were calculated from pairwise alignments of
Saccharomyces cerevisiae and Homo sapiens protein sequences and used to determine
the numbers of absences predicted due to detection failures. The ratio of expected and
observed values indicates the proportion of observed absences that are likely due to
sequence detection failures. Protein names in bold indicate meiosis-specific function.
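
The ratio column of Table 4.2 follows directly from the observed and predicted columns. The short sketch below recomputes it for four rows copied from the table. The function name is hypothetical, and returning 0.0 when no absences were observed is an assumption chosen to match the 0.00 reported for Mlh1 and Msh2.

```python
# (protein, observed absences, predicted absences) copied from Table 4.2
ROWS = [
    ("Hop1", 14, 9.24),
    ("Rad21", 8, 10.26),
    ("Rec8", 34, 14.66),
    ("Mlh1", 0, 0.16),
]


def detection_failure_ratio(observed, predicted):
    """Ratio of expected to observed absences.

    Assumption: when no absences were observed the ratio is reported as 0.0.
    """
    if observed == 0:
        return 0.0
    return predicted / observed


if __name__ == "__main__":
    for protein, observed, predicted in ROWS:
        ratio = detection_failure_ratio(observed, predicted)
        print(f"{protein}: {ratio:.2f}")
```

A ratio at or above 1 (as for Rad21) means that the predicted detection failures are at least as numerous as the observed absences, whereas a small ratio (as for Rec8) means that detection failure accounts for only a fraction of them.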
Table 4.3: Genome sequence databases searched with web address and references

Organism | Database | Web address | Reference
Trichomonas vaginalis | TrichDB | http://trichdb.org/trichdb/ | (Aurrecoechea et al. 2009a)
Giardia intestinalis | GiardiaDB | http://giardiadb.org/giardiadb/ | (Aurrecoechea et al. 2009a)
Spironucleus vortens | JGI | http://genome.jgi-psf.org/Spivo0/Spivo0.home.html |
Naegleria gruberi | JGI | http://genome.jgi-psf.org/Naegr1/Naegr1.home.html |
Leishmania and Trypanosoma | TriTrypDB | http://tritrypdb.org/tritrypdb/ | (Aslett et al. 2010)
Physcomitrella patens | JGI | http://genome.jgi-psf.org/physcomitrella/physcomitrella.home.html | (Rensing et al. 2008)
Chlamydomonas reinhardtii | JGI | http://genome.jgi-psf.org/chlamy/chlamy.home.html | (Merchant et al. 2007)
Chlorella | JGI | http://genome.jgi-psf.org/ChlNC64A_1/ChlNC64A_1.home.html |
Ostreococcus tauri | JGI | http://genome.jgi-psf.org/Ostta4/Ostta4.home.html | (Palenik et al. 2007)
Galdieria sulphuraria | The Galdieria sulphuraria Genome Project | http://genomics.msu.edu/cgi-bin/galdieria/blast.cgi | (Barbier et al. 2005)
Emiliania huxleyi | JGI | http://genome.jgi-psf.org/Emihu1/Emihu1.home.html |
Thalassiosira pseudonana | JGI | http://genome.jgi-psf.org/Thaps3/Thaps3.home.html | (Armbrust et al. 2004)
Phaeodactylum tricornutum | JGI | http://genome.jgi-psf.org/Phatr2/Phatr2.home.html | (Bowler et al. 2008)
Fragilariopsis cylindrus | JGI | http://genome.jgi-psf.org/Fracy1/Fracy1.home.html |
Phytophthora ramorum | JGI | http://genome.jgi-psf.org/Phyra1_1/Phyra1_1.home.html | (Tyler et al. 2006)
P. sojae | JGI | http://genome.jgi-psf.org/Physo1_1/Physo1_1.home.html | (Tyler et al. 2006)
Aureococcus anophagefferens | JGI | http://genome.jgi-psf.org/Auran1/Auran1.home.html |
Table 4.3: Genome sequence databases searched with web address and references. Continued

Organism | Database | Web address | Reference
Plasmodium vivax | PlasmoDB | http://plasmodb.org/plasmo/ | (Aurrecoechea et al. 2009b)
Toxoplasma gondii | ToxoDB | http://toxodb.org/toxo/ | (Aurrecoechea et al. 2007)
Cryptosporidium muris | CryptoDB | http://cryptodb.org/cryptodb/ | (Heiges et al. 2006)
Paramecium tetraurelia | Paramecium DB | http://paramecium.cgm.cnrs-gif.fr/ | (Arnaiz et al. 2007)
Dictyostelium purpureum | JGI | http://genome.jgi-psf.org/Dicpu1/Dicpu1.home.html |
Ciona intestinalis | JGI | http://genome.jgi-psf.org/Cioin2/Cioin2.home.html |
Nematostella vectensis | JGI | http://genome.jgi-psf.org/Nemve1/Nemve1.home.html | (Putnam et al. 2007)
Trichoplax adhaerens | JGI | http://genome.jgi-psf.org/Triad1/Triad1.home.html | (Srivastava et al. 2008)
Monosiga brevicollis | JGI | http://genome.jgi-psf.org/Monbr1/Monbr1.home.html | (King et al. 2008)
Capsaspora owczarzaki, Thecamonas trahens, Salpingoeca rosetta | BROAD Origins of Multicellularity Database | http://www.broadinstitute.org/annotation/genome/multicellularity_project/GenomeDescriptions.html#%3Ci%3ESalpingoeca_rosetta%3C/i%3Eformerly_known_as_%3Ci%3EProterospongia_sp.%3C/i%3E_ATCC_50818] |
Laccaria bicolor | JGI | http://genome.jgi-psf.org/Lacbi1/Lacbi1.home.html | (Martin et al. 2008)
Coprinus cinereus | JGI | http://genome.jgi-psf.org/Copci1/Copci1.home.html |
Mucor circinelloides | JGI | http://genome.jgi-psf.org/Mucci2/Mucci2.home.html |
Batrachochytrium dendrobatidis | JGI | http://genome.jgi-psf.org/Batde5/Batde5.home.html |
CHAPTER 5
CONCLUDING REMARKS
The studies presented in this thesis were designed to provide insight into the
evolutionary history of meiosis. Although different hypotheses for the prevalence and
maintenance of meiosis at the population level are well developed, scant data elucidating
the origin and subsequent evolution of meiotic genes are available. In the following text,
I will bring the research presented in this thesis into context by outlining our current
understanding of meiosis at the levels of populations, individuals, and genes. I will also
present a unifying hypothesis for the origin of meiosis. Finally, I will suggest
experiments that will further elucidate the origin and evolution of meiosis.
Why meiosis?
Meiosis is necessary for sexual reproduction in eukaryotes (Weismann, Parker,
and Ronnfeldt 1893). Two (usually haploid) products of meiosis (e.g. spores and
gametes) are combined (cell fusion), yielding offspring with the parental numbers of
homologous chromosomes (usually diploid) (Figure 1.3 – B) (Weismann, Parker, and
Ronnfeldt 1893). Thus, the halving of organisms’ genomes during meiosis ensures the
maintenance of ploidy in offspring; sexual reproduction results in the alternation of
haploid and diploid phases in eukaryotic life cycles (Maynard Smith and Szathmary
1995). There are several costs associated with sexual reproduction: 1) the time and
energy to switch from mitotic to meiotic cell divisions; 2) the search for appropriate
mating partners; 3) the risks of failing to find appropriate mates; 4) the risk of contracting
sexually transmitted diseases; 5) the disruption of genomes that are well adapted to their
environments; and 6) the transmission of only half of the genetic material to offspring (the
twofold cost of sex) (Nei 1967; Lewontin 1971; Feldman 1972; Maynard Smith 1978;
Michod and Levin 1988; Kondrashov 1993; Barton and Charlesworth 1998; West,
Lively, and Read 1999; Otto and Lenormand 2002). These costs of sexual reproduction
and meiosis would seem to be prohibitively expensive, giving a fitness advantage to
asexually reproducing populations.
Despite the costs associated with sexual reproduction, it is pervasive among
eukaryotes (Bell 1982). Obligate asexual lineages are uncommon among eukaryotes and
persist for relatively short periods of evolutionary time (White 1978; Bell 1982; Richards
1986). These observations raise the following question: Why should so many eukaryotes
take these risks? In essence, this is the “paradox of sex” (Michod and Levin 1988;
Kondrashov 1993; Barton and Charlesworth 1998; West, Lively, and Read 1999; Otto
and Lenormand 2002). Some organisms reduce the costs of sex by alternating between
sexual and asexual modes of reproduction (facultative sex) (Dacks and Roger 1999).
However, the questions of why facultatively sexual organisms should bother with meiosis
at all and why a great number of organisms rely exclusively upon sexual reproduction
remain.
The question of why eukaryotes undergo the costly process of sexual reproduction
is often answered with the benefits of genetic recombination to produce variable
offspring, upon which natural selection can act (Fisher 1930; Muller 1932; Hill and
Robertson 1966), especially in response to changing environments (Van Valen 1973). In
fact, that recombination increases the efficacy of natural selection has been demonstrated
convincingly in the laboratory with the fruit fly Drosophila melanogaster (Rice and
Chippindale 2001) and the green alga Chlamydomonas reinhardtii (Kaltz and Bell 2002).
In these organisms, populations that underwent multiple generations of sexual
reproduction were more fit than asexually reproducing populations. Populations with
genetic recombination may also be able to purge deleterious mutations more rapidly than
exclusively asexual populations (Muller 1964; Kondrashov 1988). Although various
hypotheses have been offered, revealing differences of opinion regarding the importance
of selection for positive mutations and the elimination of deleterious mutations (i.e. the
roles of natural selection and random genetic drift, respectively), there is little doubt that
the population level effects of genetic recombination provide sufficient selection for the
long-term maintenance of meiosis (Otto and Gerstein 2006). More contentious is the
notion that there are short-term benefits of genetic recombination at the level of the
individual.
Prior to the origin of meiosis, eukaryotes must have relied only upon asexual
modes of reproduction (probably mitosis) (Szathmary and Smith 1995). However, once
meiosis arose, these organisms (like many extant eukaryotes) were probably facultative
sexual reproducers; sometimes they reproduced sexually and sometimes they reproduced
asexually. Therefore, to study the selective advantages of meiosis that led to its origin,
we can observe the conditions in which extant facultatively sexual organisms undergo
meiosis (Michod, Bernstein, and Nedelcu 2008). It is well known that several unicellular
eukaryotes that normally divide by mitosis will occasionally, when exposed to
environmental stressors, divide by meiosis (Michod, Bernstein, and Nedelcu 2008). For
example, both Saccharomyces cerevisiae (Herskowitz 1988) and C. reinhardtii (Sager
and Granick 1954) will switch from mitotic to meiotic divisions in nutrient-poor media
and Volvox carteri will undergo meiosis during heat shock (Kirk and Kirk 1986). Thus it
is tempting to conclude that these organisms undergo meiosis to introduce variability to
their offspring that may deal better with these stressful environments than their parents
(Otto and Lenormand 2002; Otto and Gerstein 2006; Otto 2008). However, in the green
alga C. reinhardtii, researchers found that calculated fitnesses of sexually reproducing
populations were lower than those of asexually reproducing populations during the first
generation (Colegrave, Kaltz, and Bell 2002). Only after subsequent episodes of sexual
reproduction did fitnesses of the sexually reproducing populations exceed those of
asexually reproducing populations (Colegrave, Kaltz, and Bell 2002). This negative,
early effect of genetic recombination was also shown separately in D. melanogaster
(Charlesworth and Barton 1996).
The observation that sexually reproducing populations are initially less fit than
asexually reproducing populations may be explained by a concept called recombination
load (Charlesworth and Barton 1996; Colegrave, Kaltz, and Bell 2002). Simply put, the
variation in the fitness of a population increases during the first sexually reproducing
generation due to genetic recombination (Otto and Lenormand 2002). Previously linked
genes become shuffled by genetic recombination, producing novel combinations of genes
(Agrawal 2006). The first, most obvious, problem is that combinations of genes that
have been selected for in a given environment are broken apart (Charlesworth and Barton
1996). However, many organisms benefit from inheriting combinations of genes that
increase their fitnesses, while other organisms inherit deleterious combinations (Kouyos,
Otto, and Bonhoeffer 2006). This is why genetic recombination increases the efficacy of
natural selection; beneficial combinations of genes are selected for, increasing in
populations, and deleterious combinations are purged from populations (Feldman,
Christiansen, and Brooks 1980; Kondrashov 1984; Kondrashov 1988). The problem is
that, initially, the increased fitness provided by the beneficial combinations of genes is
outweighed by the decrease in fitness caused by the deleterious combinations of genes
and the breaking apart of previously fit genomes (Otto and Lenormand 2002).
Theoretically, there are conditions (weak and negative epistasis) in which short-term
advantages of genetic recombination could be realized and positively selected (Otto and
Lenormand 2002). However, evidence that these conditions exist in nature is weak
(Elena and Lenski 1997; Rice 2002; Bonhoeffer et al. 2004). For these reasons, it is
unlikely that the production of variable offspring upon which natural selection acts could
have provided the immediate selective benefits required for the origin of meiosis.
Some environmental conditions, such as exposure to metabolically or
environmentally produced oxygen-containing compounds, can result in double-strand
DNA breaks (i.e. oxidative stress) (Nedelcu and Michod 2003; Nedelcu, Marcu, and
Michod 2004). While single-strand DNA damage can be repaired using the
complementary strand of DNA in a helix, double-strand DNA damage requires
recombination with homologous chromosomes (Michod, Bernstein, and Nedelcu 2008).
Thus meiosis may have arisen as an adaptation for DNA damage repair (Bernstein et al.
1984). This possibility is supported by Schizosaccharomyces pombe and V. carteri, both of
which undergo meiosis in response to oxidative stress (Bernstein and Johns 1989; Nedelcu
and Michod 2003). Furthermore, the connection between DNA damage and recombination is
supported by the observations that mutations in recombination genes make cells
sensitive to UV damage and exposure of cells to mutagens increases recombination rates
(Bernstein and Bernstein 1991).
Of course, the hypothesis in which double-strand DNA damage repair supplies the
primary benefit of meiosis relies upon the presence of at least a diploid number of
homologous chromosomes. Indeed, diploid yeast cells are more resistant to DNA
damage than haploid cells (Herskowitz 1988). However, unlike many diploid eukaryotes
(e.g. metazoans (Schrader and Hughes-Schrader 1931)) whose haploid states are transient
and associated exclusively with sexual reproduction (e.g. gametes), many eukaryotes,
especially unicellular, experience longer haploid stages (Lewis 1985). The question,
then, is why would organisms risk the integrity of their genomes by having extended
haploid lifecycle stages? If DNA damage occurs during the haploid state, no other
appropriate haploid cells are available for cell fusion, and replication of chromosomes is
not possible because of the damage, the cells would seem to be in danger. The benefits of
having haploid stages during the lifecycles of unicellular eukaryotes must outweigh the
risks of their diminished capacities to repair double-strand DNA damage.
Another hypothesis posits that environmental stressors (e.g. oxidative stress and
starvation) may provide an advantage to organisms with haploid-diploid ploidy cycles
(Maynard Smith and Szathmary 1995). That is, during times of oxidative stress, diploidy
may be beneficial for repair of double-strand DNA damage, while, during starvation,
haploids may benefit from faster growth relative to diploids (Cleveland 1947; Szathmary
et al. 1990; Hurst and Nurse 1991). Haploid populations of S. cerevisiae do, indeed, have
higher fitnesses than diploid populations in nutrient-limiting environments (Adams and
Hansche 1974). It has been argued, however, that, since the ancestral eukaryote in which
meiosis arose was probably phagotrophic, diploidy should have been favored during
periods of starvation as diploids are larger and should, therefore, be able to engulf larger
prey (Lewis 1985; Maynard Smith and Szathmary 1995). This logic leads us back to the
conclusion that diploidy should be beneficial and haploidy should be rare, raising, once
again, the question of why eukaryotes should exist as haploids at all. Furthermore, these
hypotheses suffer from a lack of supporting data (Kondrashov 1994). There is no evidence
that ancestral eukaryotes would have been subjected to such alternating environments.
The fact that the majority of genetic mutations are deleterious (Lewontin 1974)
would seem to support the notion that diploidy should be selectively advantageous, due to
the presence of two copies of every gene (Otto and Goldstein 1992). If an allele carrying
a deleterious mutation can be masked by the presence of a wildtype allele in a
homologous chromosome (i.e. low dominance) then fitnesses of organisms may not be
affected (Crow and Kimura 1965; Maynard Smith 1978; Charlesworth 1991; Kondrashov
and Crow 1991; Perrot, Richerd, and Valero 1991). Mathematical models confirm this
prediction, but only in cases of large genomes with relatively high rates of genetic
recombination (Perrot, Richerd, and Valero 1991; Otto and Goldstein 1992). This is
because increasing numbers of heterozygous loci result in greater ability to mask
deleterious alleles (Otto and Goldstein 1992). However, the increased fitness of
heterozygotes (heterosis) has a price: the maintenance of deleterious alleles in the
population (mutation load) (Kondrashov 1994). In haploid cells with small genomes and
low levels of genetic recombination, deleterious mutations are unlikely to persist as
selection should efficiently purge them from the population (Scudo 1967). The ancestral
eukaryotes in which meiosis arose most likely had few chromosomes (maybe only one)
and relatively low rates of genetic recombination (Michod and Levin 1988). Therefore,
they would have benefited from maintaining a haploid number of chromosomes (Otto and
Goldstein 1992).
In ameiotic haploid organisms, diploidization might occur either endogenously,
due to errors that occur during mitosis (Cleveland 1947; Hurst and Nurse 1991), or
exogenously, due to the fusion of two haploid cells (Cavalier-Smith 1975). This
diploidization is likely to have reduced the fitnesses of the ameiotic ancestral eukaryotes
in which meiosis arose (Otto and Goldstein 1992). Therefore, I argue that meiosis arose
due to the selective benefits of retrieving haploid numbers of chromosomes after
spontaneous diploidization. Furthermore, I also propose that meiosis could only have
arisen in the presence of a strong constant selective pressure. Therefore, the cause of
these diploidization events is important to consider. If mutations occurred that resulted in
endogenous diploidization then they would simply have been selected against and purged
from the population. Thus, cytological sources of diploidization would not have provided
the constant selective pressures necessary for the origin of meiosis. However,
diploidization that occurred because of repeated fusions of two haploid eukaryotes could
have provided a constant selective force. Such fusions could be explained as an artifact
of the ancestral eukaryotes’ phagocytic lifestyle. That is, when one cell attempted to
engulf another eukaryotic cell (either by accident or cannibalism), their membranes could
have occasionally fused. Unlike endogenous sources of diploidization, such an
exogenous source could not easily be purged from populations. A method of
identification could have evolved in order to avoid fusions but, if cannibalization was
common, then chemical signaling may not have been selectively advantageous. Of
course, the problem could have been remedied by abandoning their phagotrophic
lifestyles but this solution would have required a switch to another food source, an
endeavor that would most certainly have required many changes to the cells. Simply put,
meiosis was a less costly way, evolutionarily speaking, to cope with constant and
spontaneous diploidization than eliminating the cause (phagocytosis) altogether. This
scenario provides the constant, immediate selective benefits to individuals that would
have been necessary for meiosis to arise.
Meiosis arose from mitosis
There are two main theories for the origin of meiotic genes: 1) meiotic genes
arose directly from prokaryotic genes encoding products that were involved primarily in
transformation (reviewed in (Bernstein and Bernstein 2010)); and 2) meiotic genes arose
from genes encoding products that were involved primarily in mitosis (reviewed in
(Wilkins and Holliday 2009)). Distinguishing between these possibilities is important to
our understanding of the origin and evolution of meiosis.
Prokaryotic organisms are able to exchange genetic material via parasexual
processes (i.e. conjugation (Lederberg and Tatum 1946), transformation (Griffith 1928;
Avery, Macleod, and McCarty 1944), and transduction (Lederberg et al. 1951)), utilizing
recombination enzymes that are also important for DNA damage repair (Maynard Smith
and Szathmary 1995). In this regard, prokaryotic parasexual processes are analogous to
sexual reproduction in eukaryotes. More specifically, recombination of prokaryotic
genomes during transformation appears similar to meiotic recombination in eukaryotes
(Bernstein and Bernstein 2010). Indeed, many genes necessary for bacterial
transformation are orthologs of genes necessary for recombination of nonsister
homologous chromosomes during meiosis (Marcon and Moens 2005). In addition,
bacterial and eukaryotic orthologs may have similar functions during transformation and
meiosis, respectively. For example, bacterial RecA, which stimulates DNA strand
exchange during transformation, is orthologous to the eukaryotic gene encoding Dmc1,
which stimulates interhomolog DNA strand exchange during meiosis in most eukaryotes.
In addition, both transformation and meiosis can be induced by similar types of stress.
Following these observations, it has been proposed that meiosis in eukaryotes arose
immediately from eubacterial transformation (Bernstein and Bernstein 2010). This
hypothesis explains the evolution of sex as a continuous evolutionary process from
bacteria to eukaryotes (Bernstein and Bernstein 2010).
Central to the argument that meiotic recombination in eukaryotes arose directly
from eubacterial transformation is the observation that many genes were horizontally
transferred from mitochondria (likely the result of the engulfment of eubacteria by early
eukaryotes (Margulis 1970)) to the nuclear genome of eukaryotes (Gabaldon and Huynen
2003). Eubacterial recA homologs and eukaryotic recA homologs (Rad51 and Dmc1)
share a high level of sequence similarity (0.20 and 0.23, respectively; Figure 3.14).
Therefore, eukaryotic Rad51 and Dmc1 homologs may have arisen from recA orthologs
that were transferred from eubacteria after their engulfment by eukaryotes (Lin et al.
2006). However, this model also predicts that Rad51 and Dmc1 should be more closely
related to eubacterial recA genes than to archaebacterial RadA genes and distance
analyses indicate that Rad51 and Dmc1 are most similar to archaebacterial RadA genes
(0.43 and 0.45, respectively; Figure 3.14). Also, phylogenetic analyses indicate that
Rad51 and Dmc1 share a more recent ancestor with archaebacterial RadA genes than with
eubacterial recA genes (Stassen et al. 1997; Lin et al. 2006). In sum, these data indicate
that eukaryotes inherited a recA homolog (RadA) vertically from archaebacteria and not
horizontally from eubacteria.
Since the first eukaryotes were certain to have been capable of nuclear divisions
(i.e. mitosis), it is most likely that mitosis arose very early during eukaryotic evolution.
The protoeukaryotes could have been mitotically dividing organisms that were also
capable of bacteria-like transformation. Then meiosis could have arisen from
transformation in the presence of mitosis. The crux of this argument is that meiotic
recombination originating from bacterial transformation would have been a continuous
evolutionary process (Bernstein and Bernstein 2010). That is, if mitosis arose first and
there was neither bacteria-like transformation nor meiosis then a gap exists, during which
eukaryotes did not undergo genetic recombination or sex (Bernstein and Bernstein 2010).
This argument assumes that nonsister homologous recombination did not occur during
mitosis. However, crossing over has been shown in animal and fungal vegetative cells
(mitotic crossing-over), albeit at much lower frequencies than meiotic crossing over
(Cardoso et al.; Xu and Rubin 1993). Genetic recombination could have occurred if
protoeukaryotes were capable of mitosis but neither transformation nor meiosis; there
need not have been a “sex gap” during eukaryotic evolution if mitosis arose first.
Phylogenetic and distance analyses of the translated protein sequences of
eukaryotic and prokaryotic recA homologs indicate that the eukaryotic Rad51 and Dmc1
genes are paralogs (Figure 3.14). That is, the genes encoding Rad51, which functions
during both mitotic and meiotic DNA strand exchange reactions in model organisms, and
Dmc1, which functions only during meiotic DNA strand exchange reactions in model
organisms, arose by a single gene duplication event that occurred during eukaryotic
evolution. There are three possible outcomes of gene duplication events: 1) the “extra”
gene copy quickly degrades and its products (if any) do not function
(nonfunctionalization) (Ohno 1970); 2) a division of labor occurs such that the two gene
copies encode products that perform distinct complementary functions previously
accomplished by the products of a single gene (subfunctionalization) (Force et al. 1999); or,
3) one gene copy, released from the constraints of purifying selection, is free to mutate and its
products then perform novel functions (neofunctionalization) (Ohno 1970). Since both
the eukaryotic Rad51 and Dmc1 genes have been retained, either subfunctionalization or
neofunctionalization of the genes occurred after they arose. Either the ancestral gene
encoded products that functioned during both mitotic and meiotic DNA strand exchange
and the duplication event yielded genes whose products divided these functions or the
ancestral gene encoded products that functioned during only one reaction (mitotic or
meiotic DNA strand exchange) and the gene duplication event resulted in the origin of a
novel function. Put another way, either both mitotic and meiotic DNA strand exchange
reactions were present at the time of the gene duplication or one arose from the other.
In addition to Dmc1 and Rad51, there are several genes whose products are
known to function only during meiosis in model organisms that are paralogs of genes
encoding products that function during both mitosis and meiosis (Chapter 4). These
genes encode products that are involved in several important (if not critical) events that
occur during meiosis, including sister chromatid cohesion, dsDNA cutting, DNA strand
exchange, and Holliday junction resolution. Thus the phenomenon is not restricted to
genes encoding products involved in DNA strand exchange reactions but includes many
other genes necessary for successful completion of meiosis. Additionally, the
distributions of these genes among diverse eukaryotes and phylogenetic analyses indicate
that many of these duplications occurred prior to the common ancestor of all extant
eukaryotes, making it possible that they all occurred at the same time or during a very
small window during eukaryotic evolution. It is likely that either both mitosis and
meiosis were present at the time of the gene duplication event(s) or one arose from the
other.
Since the earliest eukaryotes were also most likely haploid, it seems unlikely that
meiosis could have been the primary means of reproduction as it would have required
two rounds of DNA synthesis or some combination of cell fusions and DNA synthesis to
obtain the appropriate numbers of chromosomes. Although some single-celled organisms
(e.g. Saccharomyces cerevisiae) have haploid stages of their lifecycles, during which they
fuse with other cells to form diploid cells that may ultimately undergo meiosis, most of
their nuclear divisions are mitotic (Herskowitz 1988). These observations and the greater
cytological and genetical complexity of meiosis (Chapter 1) indicate that mitosis most
likely arose first and meiosis is a derived process that arose later during eukaryotic
evolution (Cavalier-Smith 1981b; Simchen and Hugerat 1993; Wilkins and Holliday
2009). In total, these data indicate that meiosis may have arisen from mitosis de novo as
a result of one or more large-scale gene duplication events. Below, I propose an
evolutionary model that includes a preadaptation that could have provided the selective
benefits necessary for such a profound event to occur.
A model for the evolution of meiotic DNA strand exchange genes
The results obtained and the observations made during the studies performed in
Chapters 2 through 4 revealed three major points regarding the evolution of meiotic DNA
strand exchange genes: 1) While meiotic DNA strand exchange genes are often lost,
Rad51 appears to be present in all but one eukaryotic genome studied; 2) In
Saccharomyces cerevisiae, rad51 functional mutations or Rad51 overexpression
rescue(s) the null mutant phenotypes of other DNA strand exchange genes studied in
Chapter 2 (Table 2.5); and, 3) Rad51 and Dmc1 may have overlapping functions in some
organisms, such that one paralog may perform the activities of the other. These points
have culminated in a model of meiotic DNA strand exchange gene evolution that
explains the various complements of genes observed in different eukaryotes (Figure 5.1).
The presence of ten DNA strand exchange genes (Rad52, Rad59, Rad51, Rad55,
Rad57, Dmc1, Hop2, Mnd1, Rad54, and Rdh54) in representative genomes of all the
eukaryotic supergroups studied in Chapter 2 (Opisthokonta, Amoebozoa, Archaeplastida,
Chromalveolata, and Excavata) indicates that they were likely to have been present in the
last eukaryotic common ancestor. It is, therefore, feasible that the ancestor of eukaryotes
was capable of meiotic DNA strand exchange and meiosis (Figures 2.1 and 5.1 – A).
Also, given their distributions, two genes (Rad59 and Rad54) may have arisen later
during eukaryotic evolution (Figures 2.1 and 5.1 – B). So, although eukaryotes began
with a core set of meiotic DNA strand exchange machinery, additional genes have since
been added during the evolution of different eukaryotic lineages. However, the
distributions of meiotic DNA strand exchange genes also indicate that frequent
independent losses of important genes have occurred. These apparently contradictory
observations raise the question: How can eukaryotes lose genes so important for meiotic
DNA strand exchange in model organisms and, by inference, in the last common ancestor
of all extant eukaryotes?
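The inference used above, that a gene found in representatives of every sampled supergroup was likely present in the last eukaryotic common ancestor, can be written as a simple rule. The following is only a minimal sketch: the function name likely_in_leca and the two placeholder genes are hypothetical, the presence/absence values are invented for illustration and are not results from this thesis, and the rule ignores complications such as lateral transfer or the position of the eukaryotic root.

```python
# Supergroups sampled in Chapter 2, as listed in the text.
SUPERGROUPS = (
    "Opisthokonta",
    "Amoebozoa",
    "Archaeplastida",
    "Chromalveolata",
    "Excavata",
)


def likely_in_leca(found_in):
    """Return True when a gene was found in every sampled supergroup.

    `found_in` is the set of supergroups in which at least one
    representative genome contains a homolog of the gene.
    """
    return all(group in found_in for group in SUPERGROUPS)


# Hypothetical observations with placeholder gene names (NOT thesis data).
observations = {
    "GeneA": set(SUPERGROUPS),              # found in every supergroup
    "GeneB": {"Opisthokonta", "Amoebozoa"},  # restricted distribution
}

leca_calls = {gene: likely_in_leca(groups) for gene, groups in observations.items()}
```

Under this rule, a gene with a restricted distribution (GeneB here) is equally compatible with a later origin or with early presence followed by losses, which is why the distributions alone cannot distinguish the two.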
Only one organism (Giardia intestinalis) is confirmed to be without a Rad51
gene, while other genes have often been lost (Figures 2.1, 3.5, and 4.1). Hence, I
hypothesized that a connection exists between the nearly ubiquitous presence of Rad51
among eukaryotes and the frequent loss of other meiotic DNA strand exchange genes in
independent eukaryotic lineages. As it happens, these observations can be explained by
the following: 1) There are no known suppressors of rad51 animal or fungal null mutant
phenotypes; and 2) Overexpression or functional mutations of the Rad51 gene suppress
rad52, dmc1, rad55, rad57, hop2, mnd1, rad54, and rdh54 Saccharomyces cerevisiae
null mutants (Milne and Weaver 1993; Klein 1997; Bishop et al. 1999; Krejci et al. 2002;
Tsubouchi and Roeder 2003; Henry et al. 2006; Schild and Wiese 2009). I hypothesized
that changes in Rad51 expression or changes in its coding sequence may result in the
relaxation of purifying selection on meiotic DNA strand exchange genes (such as the
Dmc1 gene) (Figure 5.1 – C). That is, when overexpressed or mutated, Rad51 products
may ‘fill in’ for missing components, performing their functions or rendering the
functions of other gene products altogether unnecessary. Such a dynamic would
theoretically result in relaxation of the normally purifying selection that serves to
preserve genes in populations of organisms. This relaxation of selection may then result
in the loss of DNA strand exchange genes (Figure 5.1 – D). In addition, some meiotic
DNA strand exchange components are known to interact only with a limited set of
proteins (e.g. Hop2 and Mnd1 proteins only interact with Dmc1) (Chen et al. 2004;
Henry et al. 2006). Therefore, the loss of the Dmc1 gene may leave the Hop2 and Mnd1
genes vulnerable to loss (Figure 5.1 – E).
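The logic of steps C through E of the model can be sketched in a few lines of code. This is an illustrative sketch only: the function vulnerable_genes and its parameter rad51_compensates are hypothetical names, the suppression list is taken from the S. cerevisiae null mutants named above, and the partner table encodes only the statement that Hop2 and Mnd1 proteins interact with Dmc1 alone. Treating suppression in yeast as a proxy for relaxed selection in any lineage is an assumption of the model, not an established result.

```python
# Null mutants reported in the text as suppressed by Rad51 overexpression
# or functional rad51 mutations in S. cerevisiae.
SUPPRESSED_BY_RAD51 = {
    "Rad52", "Dmc1", "Rad55", "Rad57", "Hop2", "Mnd1", "Rad54", "Rdh54",
}

# Partner dependencies stated in the text: Hop2 and Mnd1 act only with Dmc1.
REQUIRES = {"Hop2": "Dmc1", "Mnd1": "Dmc1"}


def vulnerable_genes(complement, rad51_compensates):
    """Return genes in `complement` that the model predicts may be lost.

    complement        -- set of strand exchange genes present in a genome
    rad51_compensates -- True if Rad51 expression or sequence has changed
                         so that it can substitute for other components
    """
    vulnerable = set()
    if rad51_compensates and "Rad51" in complement:
        # Step C: compensation relaxes selection on suppressible genes.
        vulnerable |= complement & SUPPRESSED_BY_RAD51
    for gene, partner in REQUIRES.items():
        # Step E: a gene whose only known partner is absent has no role left.
        if gene in complement and partner not in complement:
            vulnerable.add(gene)
    return vulnerable


ALL_GENES = SUPPRESSED_BY_RAD51 | {"Rad51", "Rad59"}
after_dmc1_loss = vulnerable_genes(ALL_GENES - {"Dmc1"}, rad51_compensates=False)
```

With a full complement and no change in Rad51, the sketch predicts no vulnerable genes; once Dmc1 is absent, Hop2 and Mnd1 are flagged even without further Rad51 changes, which mirrors step E.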
Finally, the complements of meiotic DNA strand exchange genes and the
interactions of their products may provide a feedback loop in which subsequent mutations
changing the expression of Rad51 genes or creating beneficial functional mutants further
alter gene combinations (Figure 5.1 – F). Although lineage-specific genes and protein
interactions are almost certainly affecting the complements of DNA strand exchange
genes observed in different eukaryotes, this general model provides a eukaryote-wide
hypothesis for understanding their evolution.
Some preliminary evidence suggests that one meiotic DNA strand
exchange protein may perform the functions of another. As stated previously, G.
intestinalis is the only organism known to lack a Rad51 gene. However, during the study
presented in Chapter 3, a search was conducted in the genome of a closely related
diplomonad (Spironucleus vortens) with database mining and degenerate PCR, and no
Rad51 gene was found (data not shown). Interestingly, G. intestinalis does contain two
copies of the Dmc1 gene, both appearing to encode proteins that function during nuclear
divisions in cysts (Poxleitner et al. 2008). It is possible that one copy of Dmc1 may
encode products that perform the functions normally completed by Rad51.
The Dmc1 proteins of G. intestinalis and S. vortens appear to have residues that
are highly conserved among Rad51 protein sequences, with G. intestinalis Dmc1-A
appearing slightly more ‘Rad51-like’ than Dmc1-B, especially at amino acid positions
331 and 332 (Figure 5.2). Residue D332 has been determined in eubacterial RecA and
archaebacterial RadA proteins to bind DNA (Story, Weber, and Steitz 1992; Shin et al.
2003; Chen et al. 2007). The functions of these amino acids in Rad51 and Dmc1 proteins
are unknown and it is possible that residues 331 and 332 are responsible for Rad51- or
Dmc1-specific functions. Further studies will be needed to determine if these residues
confer Rad51- or Dmc1-specific functions, but the possibility that one paralog may
perform the functions of another in G. intestinalis is intriguing. Whether these sites are
useful as diagnostic characters for Rad51- or Dmc1-function (regardless of the paralog
being observed) and if there is any functional significance of variations at these sites are
questions worthy of scientific investigation. The point here is that the functions of
meiotic DNA strand exchange genes and the interactions between them may be more
dynamic than previously supposed.
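The comparison of diagnostic residues described above can be made explicit by counting, for each sequence, the alignment positions at which it matches the Rad51 reference. The sketch below is only illustrative: the residues are transcribed from Figure 5.2, the assignment of rows to sequences is my reading of that figure and should be treated as an assumption, and the helper name rad51_like_positions is hypothetical. A simple match count is not a test of function.

```python
# Alignment columns of Figure 5.2 (S. cerevisiae Rad51 numbering).
POSITIONS = (248, 265, 286, 288, 302, 318, 331, 332)

# Residues as read from Figure 5.2; row assignments are an assumption.
RAD51_REFERENCE = "AAYTHGVD"  # S. cerevisiae Rad51
SEQUENCES = {
    "S. cerevisiae Dmc1": "SLFVKNPG",
    "G. intestinalis Dmc1-A": "AAFVKNVD",
    "G. intestinalis Dmc1-B": "MLFVKNVD",
    "S. vortens Dmc1": "MIFVKNVD",
}


def rad51_like_positions(residues, reference=RAD51_REFERENCE):
    """Return the positions at which `residues` matches the Rad51 reference."""
    if len(residues) != len(reference):
        raise ValueError("residue string must cover every alignment column")
    return [
        position
        for position, residue, expected in zip(POSITIONS, residues, reference)
        if residue == expected
    ]


# Number of Rad51-like positions per sequence.
scores = {name: len(rad51_like_positions(seq)) for name, seq in SEQUENCES.items()}
```

On these eight columns, Dmc1-A matches the Rad51 reference at positions 248, 265, 331, and 332, whereas Dmc1-B matches only at 331 and 332, which is the sense in which Dmc1-A appears slightly more ‘Rad51-like’.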
A model for the origin of meiosis
The results of the scientific studies presented in this thesis have culminated in a
cohesive model for the origin of meiosis that I will now present. There are four main
events that distinguish meiosis from mitosis: 1) the pairing of homologous chromosomes
during meiosis I; 2) DNA strand exchange (recombination) between non-sister
homologous chromosomes; 3) sister-chromatid cohesion that persists through the first
meiotic division; and 4) the absence of DNA replication (S-phase) upon entering the
second meiotic division (Wilkins and Holliday 2009). Although there are other
differences between meiosis and mitosis (described in Chapter 1 and summarized in
Figure 1.3), understanding the possible origins of these four novel steps is considered
by many to be necessary for surmising the origin of meiosis itself (Kleckner 1996;
Villeneuve and Hillers 2001; Wilkins and Holliday 2009). That all (or almost all) of
these steps, each requiring its own set of specialized machinery, are necessary in
eukaryotes for successful completion of meiosis would seem to exclude any gradualist
explanations for the origin of meiosis. However, to suggest that these complex processes
could have arisen simultaneously seems to defy logic. For these reasons, the origin of
meiosis is considered one of the most formidable problems in evolutionary studies
(Maynard Smith 1978; Hamilton 1999; Wilkins and Holliday 2009). The following
evolutionary model, including mechanisms for the origins of the novel steps described
above, explains the origin of meiosis in a manner that is both feasible and testable.
Although I find much agreement with models presented by other researchers (especially
Wilkins and Holliday 2009 and Cavalier-Smith 2002d), the timing of important events
and the mechanisms proposed here, responsible for the origins of the pairing of
homologous chromosomes, prolonged sister-chromatid cohesion, and meiotic
recombination, are, I think, unique.
I have shown that several genes whose products are known to function only
during meiosis in model organisms must have been present in the common ancestor to all
known extant eukaryotes (Chapters 3 and 4). Therefore, meiosis must have arisen in
eukaryotes that existed prior to the last eukaryotic common ancestor. Such eukaryotes
were probably phagotrophic, single-celled organisms with haploid numbers of
chromosomes (possibly one chromosome) contained within nuclei (Figure 5.3 – A)
(Cavalier-Smith 1975; Hurst and Nurse 1991; Cavalier-Smith 2002a; Wilkins and
Holliday 2009). Like many extant eukaryotes, ancestral eukaryotes may have frequently
engulfed prokaryotic organisms and, occasionally, other eukaryotes (Figure 5.3 – B) (Adl
et al. 2005). It is possible that during phagocytosis, rather than one eukaryotic cell
engulfing and digesting another eukaryotic cell, the cell membranes became fused,
especially if the cells were genetically identical (Figure 5.3 – C). That the fusion of
haploid eukaryotic cells may have been the precursor to meiosis is not a new concept,
having been proposed on numerous occasions, probably due to its similarity to syngamy
during the haploid-diploid cycles of many extant eukaryotes (Maynard Smith and
Szathmary 1995). Eukaryotic cell fusions could have been followed quickly by
fusions of nuclear envelopes (Figure 5.3 – D). Again, such fusions may have
been reminiscent of nuclear fusions observed during the sexual haploid-diploid lifecycles
of extant eukaryotes (Wilkins and Holliday 2009).
The fusion of two haploid eukaryotic cells would, of course, have yielded a single
diploid eukaryotic cell (Figure 5.3 – E). I believe that life would have proceeded
somewhat normally for such newly formed diploid eukaryotes, until, that is, they
attempted to undergo mitosis. The cells may have entered pre-mitotic S phase (DNA
synthesis), copying each of the chromosomes present in the newly diploid nuclei.
However, due to changes in gene expression levels and/or stoichiometry of protein and
DNA molecules caused by the presence of diploid numbers of chromosomes, mitosis
may not have proceeded normally. Recall that, during mitosis, Rad51 proteins function
during DNA strand exchange between sister chromatids (and, rarely, between non-sister
homologous chromosomes) (Nishinaka et al. 1998; Krogh and Symington 2004;
Lopez-Casamichana et al. 2008), while, during meiosis, Dmc1 proteins are necessary for DNA
strand exchange between non-sister homologous chromosomes (Bishop et al. 1992;
Bishop 1994; Bishop et al. 1999; Sehorn et al. 2004; Sauvageau et al. 2005) (Figure 1.4
and Table 2.5). Interhomolog DNA strand exchange during meiosis in Saccharomyces
cerevisiae dmc1 null mutants is greatly reduced (Table 2.5) (Bishop 1994). However,
overexpression of Rad51 significantly diminishes this phenotype, stimulating
interhomolog DNA strand exchange (Bishop et al. 1999; Tsubouchi and Roeder 2003).
In Chapters 3 and 4 I demonstrated that both Rad51 and Dmc1 genes are present in
representatives of all known eukaryotic supergroups and, so, are likely to have been
present in the last common ancestor of all extant eukaryotes (Figures 3.2 – 3.12 and 4.1).
However, mitosis most likely arose prior to the origin of meiosis (Cavalier-Smith 1981b;
Simchen and Hugerat 1993). It is well known that Rad51 and Dmc1 are paralogs (genes
arising from a common ancestral gene by duplication) (Stassen et al. 1997; Ramesh,
Malik, and Logsdon 2005; Lin et al. 2006; Malik et al. 2008). Therefore, it is also likely
that the ancestor of Rad51 and Dmc1 was most similar to Rad51 (I will call this ancestral
gene Rad51’), encoding products that functioned during mitotic DNA strand exchange.
Thus I propose that the change in the numbers of chromosomes from a haploid to a
diploid number resulted in ‘overexpression’ of Rad51’ genes, increasing the numbers of
Rad51’ proteins relative to the numbers of DNA molecules. In addition to DNA strand
exchange between sister chromatids, this overexpression could have stimulated DNA
strand exchange between non-sister homologs (Figure 5.3 – F).
Because pairing of non-sister homologous chromosomes is important for
successful completion of the reductional division of meiosis, its origin is considered key
to the origin of meiosis itself (Wilkins and Holliday 2009). However, without sustained
pairing of sister-chromatids through the first division and monopolar attachment of
spindles to each chromosome, equational divisions are equally likely to occur (Watanabe
and Nurse 1999; Toth et al. 2000; Yokobayashi, Yamamoto, and Watanabe 2003; Hauf
and Watanabe 2004). Although meiosis could have evolved with the equational divisions
occurring first and the reductional division occurring second, rather than the other way
around, we can imagine a mechanism by which sister-chromatids could have stayed
bound until the second division. The Rad21 gene encodes products that bind
sister-chromatids during mitosis (Table 4.1) (Gruber, Haering, and Nasmyth 2003). During
meiosis, Rec8, a paralog of Rad21 (Parisi et al. 1999), performs a similar function (Parisi
et al. 1999; Watanabe and Nurse 1999; Toth et al. 2000; Gruber, Haering, and Nasmyth
2003; Yokobayashi, Yamamoto, and Watanabe 2003). Like Dmc1, Rec8 proteins are
known to function only during meiosis in model organisms (Parisi et al. 1999; Watanabe
and Nurse 1999; Toth et al. 2000; Yokobayashi, Yamamoto, and Watanabe 2003).
Again, because we expect that eukaryotes were capable of mitotic nuclear divisions prior
to the origin of meiotic divisions, we also expect that the ancestor of Rad21 and Rec8 was
most similar to Rad21, encoding products that functioned in a manner similar to Rad21
proteins in extant organisms (I will call this ancestral gene Rad21’). In S. cerevisiae,
expression of Rad21 by a Rec8 promoter in null rec8 mutants results in meiosis-like
monopolar attachment of microtubules to chromosomes, rather than the mitosis-like
bipolar attachment normally seen in null rec8 mutants (Figure 1.3) (Toth et al. 2000). In
Schizosaccharomyces pombe null rec8 mutants, Rad21 will relocate to centromeres
(Yokobayashi, Yamamoto, and Watanabe 2003). Both experiments resulted in
equational, rather than reductional, divisions during meiosis I (Figure 1.3) (Toth et al.
2000; Yokobayashi, Yamamoto, and Watanabe 2003). That is, although Rad21 attaches
to centromeres and monopolar attachment of microtubules is rescued, the reductional
divisions are not. However, I suggest that Rad21 overexpression in addition to Rad51
overexpression may result in retrieving the reductional division during meiosis I in yeast
rec8/dmc1 double null mutants. Similarly, changes in the numbers of Rad21’ proteins
relative to the numbers of DNA molecules in primitive eukaryotes (in the presence of
increased numbers of Rad51’ proteins) could have resulted in the monopolar attachment
of mitotic spindles to chromosomes and in extended sister-chromatid cohesion.
Essentially, a meiosis I-like reductional division may have resulted.
Although I cannot find other examples in the current data, it is possible that
additional genes acted similarly when overexpressed to achieve pairing of homologous
chromosomes or suppression of DNA synthesis upon entering the second round of
meiosis in ancestral eukaryotes. However, these steps may be otherwise explained
(Wilkins and Holliday 2009). In S. cerevisiae, pairing of homologous chromosomes
occurs during the G1 lifecycle stage (prior to pre-meiotic DNA synthesis) (Weiner and
Kleckner 1994). Pairing is interrupted during the pre-meiotic S-phase and restored
during meiotic prophase I. Like the pre-meiotic pairing that occurs during G1, meiotic
pairing initially occurs in the absence of meiotic recombination and synaptonemal
complex formation (Burgess, Kleckner, and Weiner 1999). Similarly, pairing of
homologous chromosomes occurs in mitotically dividing S. cerevisiae cells during G1,
pairing is interrupted during pre-mitotic S-phase, and pairing is restored during G2
(Burgess, Kleckner, and Weiner 1999). Pairing of non-sister homologous chromosomes
in somatic cells has also been observed in Diptera and a variety of plants (Stack and
Brown 1969). Therefore, a mechanism for homologous pairing may have existed in
ancestral eukaryotes, prior to the origin of meiosis. In addition, changes in the
interactions of mitotic spindles with homolog kinetochores could have contributed to
prolonged sister-chromatid cohesion, through the first division (Wilkins and Holliday
2009).
The suppression of DNA synthesis after one (reductional) division, as cells enter
into a second (equational) division, distinguishes meiosis from mitosis. In Xenopus
laevis, S. cerevisiae, and S. pombe, pre-mitotic DNA synthesis is stimulated by a
licensing reaction, in which a complex (composed of Mcm2-7) is loaded onto chromatin
by Origin of Replication Complexes (ORCs), Cdc6, and Cdt1 (Blow and Dutta 2005). In
budding and fission yeasts, the activities of these Mcm complex ‘loaders’ are
down-regulated by Cyclin-Dependent Kinases (CDKs) during S-phase and early mitosis,
preventing DNA synthesis (Broek et al. 1991; Hayles et al. 1994; Dahmann, Diffley, and
Nasmyth 1995; Diffley 1996; Piatti et al. 1996). In animals, upregulation of CDKs and
inhibition of Cdt1 by geminin act together to suppress DNA synthesis during mitosis
(Wohlschlegel et al. 2000; Tada et al. 2001; Lee et al. 2004). In animals and fission
yeast, overexpression of Cdt1 and Cdc6 results in extensive re-replication during mitosis
(Nishitani et al. 2000; Vaziri et al. 2003; Thomer et al. 2004; Arias and Walter 2005).
However, in S. cerevisiae and X. laevis, significant re-replication of DNA occurs only
when CDKs or geminin are inactivated (Nguyen, Co, and Li 2001; Li and Blow 2005).
Overexpression of CDKs and/or geminin should, then, suppress DNA synthesis. It is
possible that, in ancestral eukaryotes, changes in the numbers and/or stoichiometry of
CDKs caused by the presence of diploid numbers of chromosomes in otherwise haploid
cells resulted in suppression of DNA synthesis after the first division. At that point, the
cells could have simply entered into a normal mitotic division, yielding haploid cells
(Figure 5.3 – G).
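The chromosome bookkeeping implied by steps A through G of Figure 5.3 can be followed with a small state-tracking sketch. It is offered only as an illustration under simplifying assumptions: the class Nucleus and the four functions are hypothetical names, each function follows a single daughter nucleus, and the sketch says nothing about the molecular mechanisms that would make each step occur.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleus:
    chromosome_sets: int            # 1 = haploid, 2 = diploid
    chromatids_per_chromosome: int  # 1 before DNA synthesis, 2 after

    @property
    def dna_content(self):
        """DNA content in units of one unreplicated haploid genome."""
        return self.chromosome_sets * self.chromatids_per_chromosome


def fuse(a, b):
    """Cell and nuclear fusion (steps C to E) of two unreplicated nuclei."""
    if a.chromatids_per_chromosome != 1 or b.chromatids_per_chromosome != 1:
        raise ValueError("this sketch only fuses unreplicated nuclei")
    return Nucleus(a.chromosome_sets + b.chromosome_sets, 1)


def synthesize_dna(nucleus):
    """S phase: every chromosome becomes two sister chromatids."""
    if nucleus.chromatids_per_chromosome != 1:
        raise ValueError("DNA has already been replicated")
    return Nucleus(nucleus.chromosome_sets, 2)


def reductional_division(nucleus):
    """Homologs separate and sisters stay together (one daughter shown)."""
    if nucleus.chromosome_sets % 2 or nucleus.chromatids_per_chromosome != 2:
        raise ValueError("needs paired homologs with cohesive sisters")
    return Nucleus(nucleus.chromosome_sets // 2, 2)


def equational_division(nucleus):
    """Sister chromatids separate (one daughter shown)."""
    if nucleus.chromatids_per_chromosome != 2:
        raise ValueError("needs replicated chromosomes")
    return Nucleus(nucleus.chromosome_sets, 1)


haploid = Nucleus(1, 1)
diploid = fuse(haploid, haploid)                  # step E
replicated = synthesize_dna(diploid)              # step F
after_first = reductional_division(replicated)    # step F, reductional
after_second = equational_division(after_first)   # step G, no S phase between
```

Because no DNA synthesis separates the two divisions, the sketch returns each daughter to the starting haploid, unreplicated state, which is the outcome the model requires.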
Assuming the presence of small genomes, selection likely favored haploid cells
over diploid cells early during eukaryotic evolution. Therefore, diploid eukaryotes
arising from the fusion of two haploid eukaryotes should have been at a selective
disadvantage. This selective force may have resulted in further refinement of the process
described here. Eventually large-scale gene duplication events, possibly due to frequent
unequal pairing of non-sister homologous chromosomes, yielded the many paralogous
gene groups seen today (Figure 4.16). The presence of gene paralogs allowed for
divisions of labor to occur (Ohno 1970; Ridley 2004) such that some genes would encode
products that functioned predominantly during mitosis, but, on the occasions in which
cell fusions occurred, the other genes could have functioned to reduce the numbers of
chromosomes. As genomes became recombined, the longer-term benefits of genetic
recombination may have been realized and meiosis would have become even more
refined, including enhanced mechanisms for cell fusions, dsDNA cuts, crossing-over,
cross-over interference, and synaptonemal complex formation.
In summary, the data presented in this thesis support the idea that meiosis arose
from mitosis by large-scale gene duplication following a preadaptation that served to
reduce increased numbers of chromosomes (from diploid to haploid) caused by erroneous
eukaryotic cell-cell fusions.
Future directions
The model for the origin of meiosis presented in this thesis makes two major
predictions: 1) During mitosis, overexpression of Rad51 and Rad21 genes should
promote reductional divisions; and 2) Meiosis should be possible in the absence of
meiosis-specific machinery. Both of these hypotheses can be tested using modern
genetic techniques. However, it should be noted that only positive results would be
informative, while negative results could arise from behaviors that evolved since meiosis
arose (e.g. cell cycle checkpoints). Below, I suggest experiments designed to explore
the models presented in this thesis and provide further insight into the origin and
evolution of meiosis.
Whether reductional divisions can be produced during mitosis, and whether the
reductional division in dmc1/rec8 null mutants can be rescued, by overexpression of
Rad51 and Rad21 could be tested with Saccharomyces cerevisiae. As described
previously, increasing Rad51 copy number in S. cerevisiae null dmc1 mutants rescues the
null mutant phenotype (Bishop et al. 1999), characterized by defective recombination,
accumulation of double-strand break recombination intermediates, failure to form normal
synaptonemal complexes, and arrest late in meiotic prophase I (Bishop 1994) (Table 2.5).
Recall that there is no dmc1 null mutant phenotype during mitosis (Bishop et al. 1992).
Interestingly, overexpression of another gene involved in meiosis (Rad54) also rescues
null dmc1 phenotypes (Tsubouchi and Roeder 2003), but it probably does so by removing
double-strand break recombination intermediates (Petukhova, Stratton, and Sung 1998;
Petukhova et al. 1999; Kiianitsa, Solinger, and Heyer 2002), rather than by performing
the job of Dmc1, as the increased numbers of Rad51 are likely to do.
Although overexpressing Rad51 rescues the null dmc1 mutant phenotype in S.
cerevisiae, it is unlikely to rescue a null dmc1/rec8 mutant alone. The S. cerevisiae null
rec8 mutant phenotype includes the loss of monopolar attachment of microtubules and
equational divisions during meiosis I (Parisi et al. 1999; Watanabe and Nurse 1999). The
increased expression of Rad21 in rec8 mutants rescues monopolar attachment of
microtubules but does not retrieve reductional divisions (Toth et al. 2000). I propose that
Rad51 and Rad21 overexpression in UV-irradiated S. cerevisiae cells may rescue the
reductional divisions in null rec8 mutants by increasing the numbers of DNA strand
exchange events and extending sister-chromatid cohesion through meiosis I. In addition,
I propose that Rad51 and Rad21 overexpression should rescue reductional divisions in
dmc1/rec8 double mutants. These experiments would test the idea that overexpression of
paralogs whose products function during both mitosis and meiosis (e.g. Rad51 and
Rad21) results in the completion of functions normally fulfilled by products that function
only during meiosis (e.g. Dmc1 and Rec8). To test whether overexpression of Rad51 and
Rad21 is sufficient to explain the origin of meiosis, the same experiments could be
performed during mitosis. If reductional divisions are observed, then the overexpression
of genes is sufficient to explain the origin of meiosis.
[Figure 5.1 panels]
A: Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, Rdh54
B: Rad52, Rad51, Rad55, Rad57, Dmc1, Hop2, Mnd1, Rdh54, Rad54, Rad59
C: Rad51, rad51
D: Rad52, Rad51, Rad55, Rad57, Dmc1 X, Hop2, Mnd1, Rdh54, Rad54, Rad59
E: Rad52, Rad51, Rad55, Rad57, Dmc1 X, Hop2 X, Mnd1 X, Rdh54, Rad54, Rad59
F: (no gene labels)
Figure 5.1: General model for the evolution of DNA strand exchange genes. A.
many DNA strand exchange genes arose very early during eukaryotic
evolution, B. additional components may have arisen later by gene
duplication, C. Rad51 gene overexpression or mutation results in relaxed
selection for retention of other components, D. some components may be
lost, E. other components known to function only in complexes may be lost
(i.e. Hop2/Mnd1 heterodimers are known only to function with Dmc1
proteins), and F. suites of genes result in further selection for rad51
mutations. Components in bold indicate they are known to function only
during meiosis in model organisms.
Position                248  265  286  288  302  318  331/2
Rad51
  Saccharomyces          A    A    Y    T    H    G    VD
  Homo                   A    A    Y    T    H    G    VD
  Entamoeba              A    A    Y    T    H    S    VD
  Oryza                  A    A    Y    T    H    G    VD
  Plasmodium             A    A    Y    S    H    G    VD
  Trichomonas            A    A    Y    T    H    G    VD
  Cercomonas             A    A    Y    T    H    G    VD
Dmc1 (Giardia and Spironucleus)
  Giardia A              A    A    F    V    K    N    VD
  Giardia B              M    L    F    V    K    N    VD
  Spironucleus           M    I    F    V    K    N    VD
Dmc1
  Saccharomyces          S    L    F    V    K    N    PG
  Homo                   L    V    F    V    K    N    PG
  Entamoeba              L    V    F    V    K    N    PG
  Oryza                  I    L    F    V    K    N    PG
  Plasmodium             L    S    F    V    K    N    PG
  Trichomonas            L    A    F    V    T    N    PD
  Gymnophrys             T    V    F    V    K    N    PG
Percent identity
  Rad51                  85   97   84   89   78   92   75/97
  Dmc1                   47   40   95   82   68   100  97/64
Figure 5.2: Alignment of conserved Rad51 and Dmc1 residues. Percent identities
determined from the alignments of 98 Rad51 and 51 Dmc1 protein sequences
(Chapter 3) are indicated. Amino acid residues that are at least 75%
conserved in Rad51 are highlighted in yellow, Dmc1 in green. The actual
percent identities are provided for each paralog below. The Saccharomyces
cerevisiae Rad51 amino acid residue is indicated above for reference.
Representatives are provided here for each eukaryotic supergroup
(Opisthokonta are labeled purple, Amoebozoa blue, Archaeplastida green,
Chromalveolata orange, Excavata brown, and Rhizaria eggplant). Giardia
intestinalis and Spironucleus vortens Dmc1 protein sequence data are also
provided for comparison.
A) Two genetically similar/identical nucleated cells (possibly sisters) with haploid numbers of linear chromosomes
B) One cell attempts to engulf the other
C) Fusion of cell membranes
D) Fusion of nuclear envelopes
E) Single nucleus with diploid number of chromosomes
F) Entry into “mitosis”, DNA synthesis, Rad51 and Rad21 overexpression, DNA strand exchange, and pairing of homologous chromosomes
G) CDK overexpression, suppression of DNA synthesis, and entry into normal haploid mitosis
Figure 5.3: Model for mitotic ploidy reduction in ancestral eukaryotes.
REFERENCES
Aboussekhra, A., R. Chanet, A. Adjiri, and F. Fabre. 1992. Semidominant suppressors of
srs2 helicase mutation of Saccharomyces cerevisiae map in the RAD51 gene,
whose sequence predicts a protein with similarities to prokaryotic recA proteins.
Molecular and Cellular Biology 12:3224-3234.
Acharya, N., L. Haracska, R. E. Johnson, I. Unk, S. Prakash, and L. Prakash. 2005.
Complex formation of yeast Rev1 and Rev7 proteins: a novel role for the
polymerase-associated domain. Molecular and Cellular Biology 25:9734-9740.
Adams, J., and P. E. Hansche. 1974. Population Studies in Microorganisms .1. Evolution
of Diploidy in Saccharomyces cerevisiae. Genetics 76:327-338.
Adl, S. M., B. S. Leander, A. G. B. Simpson, J. M. Archibald, O. R. Anderson, D. Bass,
S. S. Bowser, G. Brugerolle, M. A. Farmer, S. Karpov, M. Kolisko, C. E. Lane,
D. J. Lodge, D. G. Mann, R. Meisterfeld, L. Mendoza, O. Moestrup, S. E.
Mozley-Standridge, A. V. Smirnov, and F. Spiegel. 2007. Diversity,
nomenclature, and taxonomy of protists. Systematic Biology 56:684-689.
Adl, S. M., A. G. Simpson, M. A. Farmer, R. A. Andersen, O. R. Anderson, J. R. Barta,
S. S. Bowser, G. Brugerolle, R. A. Fensome, S. Fredericq, T. Y. James, S.
Karpov, P. Kugrens, J. Krug, C. E. Lane, L. A. Lewis, J. Lodge, D. H. Lynn, D.
G. Mann, R. M. McCourt, L. Mendoza, O. Moestrup, S. E. Mozley-Standridge, T.
A. Nerad, C. A. Shearer, A. V. Smirnov, F. W. Spiegel, and M. F. Taylor. 2005.
The new higher level classification of eukaryotes with emphasis on the taxonomy
of protists. J Eukaryot Microbiol 52:399-451.
Agrawal, A. F. 2006. Evolution of sex: Why do organisms shuffle their genotypes?
Current Biology 16:R696-R704.
Aihara, H., Y. Ito, H. Kurumizaka, S. Yokoyama, and T. Shibata. 1999. The N-terminal
domain of the human Rad51 protein binds DNA: Structure and a DNA binding
surface as revealed by NMR. Journal of Molecular Biology 290:495-504.
Allison, P. D. 1999. Logistic Regression Using SAS Theory and Application. SAS
Institute, Inc., Cary, NC.
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. H. Zhang, Z. Zhang, W. Miller, and D. J.
Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein
database search programs. Nucleic Acids Research 25:3389-3402.
Anuradha, S., and K. Muniyappa. 2004a. Saccharomyces cerevisiae Hop1 zinc finger
motif is the minimal region required for its function in Vitro. Journal of Biological
Chemistry 279:28961-28969.
Anuradha, S., and K. Muniyappa. 2004b. Meiosis-specific yeast Hop1 protein promotes
synapsis of double-stranded DNA helices via the formation of guanine quartets.
Nucleic Acids Research 32:2378-2385.
Arbel, A., D. Zenvirth, and G. Simchen. 1999. Sister chromatid-based DNA repair is
mediated by RAD54, not by DMC1 or TID1. EMBO J 18:2648-2658.
Archetti, M. 2004. Loss of complementation and the logic of two-step meiosis. Journal of
Evolutionary Biology 17:1098-1105.
Archibald, J. M. 2008. The eocyte hypothesis and the origin of eukaryotic cells. Proc Natl
Acad Sci U S A 105:20049-20050.
Arias, E. E., and J. C. Walter. 2005. Replication-dependent destruction of Cdt1 limits
DNA replication to a single round per cell cycle in Xenopus egg extracts. Genes
Dev 19:114-126.
Armbrust, E. V., J. A. Berges, C. Bowler, B. R. Green, D. Martinez, N. H. Putnam, S.
Zhou, A. E. Allen, K. E. Apt, M. Bechner, M. A. Brzezinski, B. K. Chaal, A.
Chiovitti, A. K. Davis, M. S. Demarest, J. C. Detter, T. Glavina, D. Goodstein, M.
Z. Hadi, U. Hellsten, M. Hildebrand, B. D. Jenkins, J. Jurka, V. V. Kapitonov, N.
Kroger, W. W. Lau, T. W. Lane, F. W. Larimer, J. C. Lippmeier, S. Lucas, M.
Medina, A. Montsant, M. Obornik, M. S. Parker, B. Palenik, G. J. Pazour, P. M.
Richardson, T. A. Rynearson, M. A. Saito, D. C. Schwartz, K. Thamatrakoln, K.
Valentin, A. Vardi, F. P. Wilkerson, and D. S. Rokhsar. 2004. The genome of the
diatom Thalassiosira pseudonana: Ecology, evolution, and metabolism. Science
306:79-86.
Arnaiz, O., S. Cain, J. Cohen, and L. Sperling. 2007. ParameciumDB: a community
resource that integrates the Paramecium tetraurelia genome sequence with
genetic data. Nucleic Acids Res 35:D439-444.
Aslett, M., C. Aurrecoechea, M. Berriman, J. Brestelli, B. P. Brunk, M. Carrington, D. P.
Depledge, S. Fischer, B. Gajria, X. Gao, M. J. Gardner, A. Gingle, G. Grant, O. S.
Harb, M. Heiges, C. Hertz-Fowler, R. Houston, F. Innamorato, J. Iodice, J. C.
Kissinger, E. Kraemer, W. Li, F. J. Logan, J. A. Miller, S. Mitra, P. J. Myler, V.
Nayak, C. Pennington, I. Phan, D. F. Pinney, G. Ramasamy, M. B. Rogers, D. S.
Roos, C. Ross, D. Sivam, D. F. Smith, G. Srinivasamoorthy, C. J. Stoeckert, Jr.,
S. Subramanian, R. Thibodeau, A. Tivey, C. Treatman, G. Velarde, and H. Wang.
2010. TriTrypDB: a functional genomic resource for the Trypanosomatidae.
Nucleic Acids Res 38:D457-462.
Atcheson, C. L., B. DiDomenico, S. Frackman, R. E. Esposito, and R. T. Elder. 1987.
Isolation, DNA sequence, and regulation of a meiosis-specific eukaryotic
recombination gene. Proc Natl Acad Sci U S A 84:8035-8039.
Aurrecoechea, C., J. Brestelli, B. P. Brunk, J. M. Carlton, J. Dommer, S. Fischer, B.
Gajria, X. Gao, A. Gingle, G. Grant, O. S. Harb, M. Heiges, F. Innamorato, J.
Iodice, J. C. Kissinger, E. Kraemer, W. Li, J. A. Miller, H. G. Morrison, V.
Nayak, C. Pennington, D. F. Pinney, D. S. Roos, C. Ross, C. J. Stoeckert, Jr., S.
Sullivan, C. Treatman, and H. Wang. 2009a. GiardiaDB and TrichDB: integrated
genomic resources for the eukaryotic protist pathogens Giardia lamblia and
Trichomonas vaginalis. Nucleic Acids Res 37:D526-530.
Aurrecoechea, C., J. Brestelli, B. P. Brunk, J. Dommer, S. Fischer, B. Gajria, X. Gao, A.
Gingle, G. Grant, O. S. Harb, M. Heiges, F. Innamorato, J. Iodice, J. C. Kissinger,
E. Kraemer, W. Li, J. A. Miller, V. Nayak, C. Pennington, D. F. Pinney, D. S.
Roos, C. Ross, C. J. Stoeckert, Jr., C. Treatman, and H. Wang. 2009b. PlasmoDB:
a functional genomic database for malaria parasites. Nucleic Acids Res 37:D539-543.
Aurrecoechea, C., M. Heiges, H. Wang, Z. Wang, S. Fischer, P. Rhodes, J. Miller, E.
Kraemer, C. J. Stoeckert, Jr., D. S. Roos, and J. C. Kissinger. 2007. ApiDB:
integrated resources for the apicomplexan bioinformatics resource center. Nucleic
Acids Res 35:D427-430.
Avery, O. T., C. M. Macleod, and M. McCarty. 1944. Studies on the Chemical Nature of
the Substance Inducing Transformation of Pneumococcal Types : Induction of
Transformation by a Desoxyribonucleic Acid Fraction Isolated from
Pneumococcus Type III. J Exp Med 79:137-158.
Bai, Y., A. P. Davis, and L. S. Symington. 1999. A novel allele of RAD52 that causes
severe DNA repair and recombination deficiencies only in the absence of RAD51
or RAD59. Genetics 153:1117-1130.
Bai, Y., and L. S. Symington. 1996. A Rad52 homolog is required for RAD51-independent
mitotic recombination in Saccharomyces cerevisiae. Genes Dev
10:2025-2037.
Baldauf, S. L. 2008. An overview of the phylogeny and diversity of eukaryotes. Journal
of Systematics and Evolution 46:263-273.
Baldauf, S. L. 2003. The deep roots of eukaryotes. Science 300:1703-1706.
Baldauf, S. L., and J. D. Palmer. 1993. Animals and fungi are each other's closest
relatives - congruent evidence from multiple proteins. Proceedings of the National
Academy of Sciences of the United States of America 90:11558-11562.
Baldauf, S. L., J. D. Palmer, and W. F. Doolittle. 1996. The root of the universal tree and
the origin of eukaryotes based on elongation factor phylogeny. Proceedings of the
National Academy of Sciences of the United States of America 93:7749-7754.
Baldauf, S. L., A. J. Roger, I. Wenk-Siefert, and W. F. Doolittle. 2000. A kingdom-level
phylogeny of eukaryotes based on combined protein data. Science 290:972-977.
Barbier, G., C. Oesterhelt, M. D. Larson, R. G. Halgren, C. Wilkerson, R. M. Garavito,
C. Benning, and A. P. M. Weber. 2005. Comparative genomics of two closely
related unicellular thermo-acidophilic red algae, Galdieria sulphuraria and
Cyanidioschyzon merolae, reveals the molecular basis of the metabolic flexibility
of Galdieria sulphuraria and significant differences in carbohydrate metabolism
of both algae. Plant Physiology 137:460-474.
Barton, N. H., and B. Charlesworth. 1998. Why sex and recombination? Science
281:1986-1990.
Baudat, F., and S. Keeney. 2001. Meiotic recombination: Making and breaking go hand
in hand. Current Biology 11:R45-R48.
Bell, G. 1982. The Masterpiece of Nature: The Evolution and Genetics of Sexuality.
University of California Press, Berkeley.
Bergerat, A., B. de Massy, D. Gadelle, P. C. Varoutas, A. Nicolas, and P. Forterre. 1997.
An atypical topoisomerase II from Archaea with implications for meiotic
recombination. Nature 386:414-417.
Bernstein, C., and H. Bernstein. 1991. Aging, Sex and DNA Repair. Academic Press, San
Diego.
Bernstein, C., and V. Johns. 1989. Sexual Reproduction as a Response to H2O2 Damage
in Schizosaccharomyces pombe. Journal of Bacteriology 171:1893-1897.
Bernstein, H., and C. Bernstein. 2010. Evolutionary Origin of Recombination during
Meiosis. Bioscience 60:498-505.
Bernstein, H., H. C. Byerly, F. A. Hopf, and R. E. Michod. 1984. Origin of Sex. Journal
of Theoretical Biology 110:323-351.
Berriman, M., E. Ghedin, C. Hertz-Fowler, G. Blandin, H. Renauld, D. C. Bartholomeu,
N. J. Lennard, E. Caler, N. E. Hamlin, B. Haas, W. Bohme, L. Hannick, M. A.
Aslett, J. Shallom, L. Marcello, L. H. Hou, B. Wickstead, U. C. M. Alsmark, C.
Arrowsmith, R. J. Atkin, A. J. Barron, F. Bringaud, K. Brooks, M. Carrington, I.
Cherevach, T. J. Chillingworth, C. Churcher, L. N. Clark, C. H. Corton, A.
Cronin, R. M. Davies, J. Doggett, A. Djikeng, T. Feldblyum, M. C. Field, A.
Fraser, I. Goodhead, Z. Hance, D. Harper, B. R. Harris, H. Hauser, J. Hostetter,
A. Ivens, K. Jagels, D. Johnson, J. Johnson, K. Jones, A. X. Kerhornou, H. Koo,
N. Larke, S. Landfear, C. Larkin, V. Leech, A. Line, A. Lord, A. MacLeod, P. J.
Mooney, S. Moule, D. M. A. Martin, G. W. Morgan, K. Mungall, H. Norbertczak,
D. Ormond, G. Pai, C. S. Peacock, J. Peterson, M. A. Quail, E. Rabbinowitsch,
M. A. Rajandream, C. Reitter, S. L. Salzberg, M. Sanders, S. Schobel, S. Sharp,
M. Simmonds, A. J. Simpson, L. Talton, C. M. R. Turner, A. Tait, A. R. Tivey, S.
Van Aken, D. Walker, D. Wanless, S. L. Wang, B. White, O. White, S. Whitehead,
J. Woodward, J. Wortman, M. D. Adams, T. M. Embley, K. Gull, E. Ullu, J. D.
Barry, A. H. Fairlamb, F. Opperdoes, B. G. Barret, J. E. Donelson, N. Hall, C. M.
Fraser, S. E.
Melville, and N. M. El-Sayed. 2005. The genome of the African trypanosome
Trypanosoma brucei. Science 309:416-422.
Bishop, D. K. 1994. recA homologs Dmc1 and Rad51 interact to form multiple nuclear
complexes prior to meiotic chromosome synapsis. Cell 79:1081-1092.
Bishop, D. K., Y. Nikolski, J. Oshiro, J. Chon, M. Shinohara, and X. Chen. 1999. High
copy number suppression of the meiotic arrest caused by a dmc1 mutation:
REC114 imposes an early recombination block and RAD54 promotes a
DMC1-independent DSB repair pathway. Genes Cells 4:425-444.
Bishop, D. K., D. Park, L. Xu, and N. Kleckner. 1992. DMC1: a meiosis-specific yeast
homolog of E. coli recA required for recombination, synaptonemal complex
formation, and cell cycle progression. Cell 69:439-456.
Bishop, D. K., and D. Zickler. 2004. Early decision; meiotic crossover interference prior
to stable strand exchange and synapsis. Cell 117:9-15.
Bleuyard, J. Y., M. E. Gallego, and C. I. White. 2006. Recent advances in understanding
of the DNA double-strand break repair machinery of plants. DNA Repair (Amst)
5:1-12.
Blow, J. J., and A. Dutta. 2005. Preventing re-replication of chromosomal DNA. Nat Rev
Mol Cell Biol 6:476-486.
Bochkareva, E., S. Korolev, S. P. Lees-Miller, and A. Bochkarev. 2002. Structure of the
RPA trimerization core and its role in the multistep DNA-binding mechanism of
RPA. EMBO J 21:1855-1863.
Bonhoeffer, S., C. Chappey, N. T. Parkin, J. M. Whitcomb, and C. J. Petropoulos. 2004.
Evidence for positive epistasis in HIV-1. Science 306:1547-1550.
Borner, G. V., N. Kleckner, and N. Hunter. 2004. Crossover/noncrossover differentiation,
synaptonemal complex formation, and regulatory surveillance at the
leptotene/zygotene transition of meiosis. Cell 117:29-45.
Borts, R. H., S. R. Chambers, and M. F. Abdullah. 2000. The many faces of mismatch
repair in meiosis. Mutat Res 451:129-150.
Bowler, C., A. E. Allen, J. H. Badger, J. Grimwood, K. Jabbari, A. Kuo, U. Maheswari,
C. Martens, F. Maumus, R. P. Otillar, E. Rayko, A. Salamov, K. Vandepoele, B.
Beszteri, A. Gruber, M. Heijde, M. Katinka, T. Mock, K. Valentin, F. Verret, J.
A. Berges, C. Brownlee, J. P. Cadoret, A. Chiovitti, C. J. Choi, S. Coesel, A. De
Martino, J. C. Detter, C. Durkin, A. Falciatore, J. Fournet, M. Haruta, M. J.
Huysman, B. D. Jenkins, K. Jiroutova, R. E. Jorgensen, Y. Joubert, A. Kaplan, N.
Kroger, P. G. Kroth, J. La Roche, E. Lindquist, M. Lommer, V. Martin-Jezequel,
P. J. Lopez, S. Lucas, M. Mangogna, K. McGinnis, L. K. Medlin, A. Montsant,
M. P. Oudot-Le Secq, C. Napoli, M. Obornik, M. S. Parker, J. L. Petit, B. M.
Porcel, N. Poulsen, M. Robison, L. Rychlewski, T. A. Rynearson, J. Schmutz, H.
Shapiro, M. Siaut, M. Stanley, M. R. Sussman, A. R. Taylor, A. Vardi, P. von
Dassow, W. Vyverman, A. Willis, L. S. Wyrwicz, D. S. Rokhsar, J. Weissenbach,
E. V. Armbrust, B. R. Green, Y. Van de Peer, and I. V. Grigoriev. 2008. The
Phaeodactylum genome reveals the evolutionary history of diatom genomes.
Nature 456:239-244.
Brill, S. J., and B. Stillman. 1991. Replication factor-A from Saccharomyces cerevisiae is
encoded by three essential genes coordinately expressed at S phase. Genes Dev
5:1589-1600.
Brinkmann, H., and H. Philippe. 2007. The diversity of eukaryotes and the root of the
eukaryotic tree. Pp. 20-37. Eukaryotic Membranes and Cytoskeleton: Origins and
Evolution. Springer-Verlag Berlin, Berlin.
Brocks, J. J., G. A. Logan, R. Buick, and R. E. Summons. 1999. Archean molecular
fossils and the early rise of eukaryotes. Science 285:1033-1036.
Broek, D., R. Bartlett, K. Crawford, and P. Nurse. 1991. Involvement of p34cdc2 in
establishing the dependency of S phase on mitosis. Nature 349:388-393.
Brown, J. R., and W. F. Doolittle. 1995. Root of the Universal Tree of Life Based on
Ancient Aminoacyl-Transfer-RNA Synthetase Gene Duplications. Proceedings of
the National Academy of Sciences of the United States of America 92:2441-2445.
Brown, J. W., and U. Sorhannus. 2010. A molecular genetic timescale for the
diversification of autotrophic stramenopiles (Ochrophyta): substantive
underestimation of putative fossil ages. PLoS One 5.
Brush, G. S. 2002. Recombination functions of Replication Protein A. Current Organic
Chemistry 6:795-813.
Burgess, S. M., N. Kleckner, and B. M. Weiner. 1999. Somatic pairing of homologs in
budding yeast: existence and modulation. Genes Dev 13:1627-1641.
Burki, F., A. Kudryavtsev, M. V. Matz, G. V. Aglyamova, S. Bulman, M. Fiers, P. J.
Keeling, and J. Pawlowski. 2010. Evolution of Rhizaria: new insights from
phylogenomic analysis of uncultivated protists. BMC Evol Biol 10:377.
Burki, F., and J. Pawlowski. 2006. Monophyly of Rhizaria and multigene phylogeny of
unicellular bikonts. Mol Biol Evol 23:1922-1930.
Burki, F., K. Shalchian-Tabrizi, M. Minge, A. Skjaeveland, S. I. Nikolaev, K. S.
Jakobsen, and J. Pawlowski. 2007. Phylogenomics reshuffles the eukaryotic
supergroups. PLoS One 2:e790.
Burki, F., K. Shalchian-Tabrizi, and J. Pawlowski. 2008. Phylogenomics reveals a new
'megagroup' including most photosynthetic eukaryotes. Biol Lett 4:366-369.
Callan, H. G. 1972. Replication of DNA in Chromosomes of Eukaryotes. Proceedings of
the Royal Society of London Series B-Biological Sciences 181:19-&.
Camerini-Otero, R. D., and P. Hsieh. 1995. Homologous recombination proteins in
prokaryotes and eukaryotes. Annu Rev Genet 29:509-552.
Cameron, A. C., and P. K. Trivedi. 1998. Regression analysis of count data. Cambridge
University Press, Cambridge, UK ; New York, NY, USA.
Cardoso, R. A., L. T. Pires, T. D. Zucchi, F. D. Zucchi, and T. M. Zucchi. 2010. Mitotic
crossing-over induced by two commercial herbicides in diploid strains of the
fungus Aspergillus nidulans. Genet Mol Res 9:231-238.
Carlton, J. M., R. P. Hirt, J. C. Silva, A. L. Delcher, M. Schatz, Q. Zhao, J. R. Wortman,
S. L. Bidwell, U. C. Alsmark, S. Besteiro, T. Sicheritz-Ponten, C. J. Noel, J. B.
Dacks, P. G. Foster, C. Simillion, Y. Van de Peer, D. Miranda-Saavedra, G. J.
Barton, G. D. Westrop, S. Muller, D. Dessi, P. L. Fiori, Q. Ren, I. Paulsen, H.
Zhang, F. D. Bastida-Corcuera, A. Simoes-Barbosa, M. T. Brown, R. D. Hayes,
M. Mukherjee, C. Y. Okumura, R. Schneider, A. J. Smith, S. Vanacova, M.
Villalvazo, B. J. Haas, M. Pertea, T. V. Feldblyum, T. R. Utterback, C. L. Shu, K.
Osoegawa, P. J. de Jong, I. Hrdy, L. Horvathova, Z. Zubacova, P. Dolezal, S. B.
Malik, J. M. Logsdon, Jr., K. Henze, A. Gupta, C. C. Wang, R. L. Dunne, J. A.
Upcroft, P. Upcroft, O. White, S. L. Salzberg, P. Tang, C. H. Chiu, Y. S. Lee, T.
M. Embley, G. H. Coombs, J. C. Mottram, J. Tachezy, C. M. Fraser-Liggett, and
P. J. Johnson. 2007. Draft genome sequence of the sexually transmitted pathogen
Trichomonas vaginalis. Science 315:207-212.
Carpenter, A. T. C. 1987. Gene Conversion, Recombination Nodules, and the Initiation
of Meiotic Synapsis. Bioessays 6:232-236.
Cavalier-Smith, T. 1987a. The origin of eukaryotic and archaebacterial cells. Ann N Y
Acad Sci 503:17-54.
Cavalier-Smith, T. 2002a. The phagotrophic origin of eukaryotes and phylogenetic
classification of protozoa. International Journal of Systematic and Evolutionary
Microbiology 52:297-354.
Cavalier-Smith, T. 2003a. Protist phylogeny and the high-level classification of Protozoa.
European Journal of Protistology 39:338-348.
Cavalier-Smith, T. 1987b. The origin of cells: A symbiosis between genes, catalysts, and
membranes. Cold Spring Harb Symp Quant Biol 52:805-824.
Cavalier-Smith, T. 2004. Only six kingdoms of life. Proc Biol Sci 271:1251-1262.
Cavalier-Smith, T. 2009. Kingdoms Protozoa and Chromista and the eozoan root of the
eukaryotic tree. Biol Lett 6:342-345.
Cavalier-Smith, T. 1981a. Eukaryotic kingdoms: Seven or nine? Biosystems 14:461-481.
Cavalier-Smith, T. 1975. Origin of Nuclei and of Eukaryotic Cells. Nature 256:463-468.
Cavalier-Smith, T. 2002b. Chloroplast evolution: secondary symbiogenesis and multiple
losses. Curr Biol 12:R62-64.
Cavalier-Smith, T. 1988. Origin of the cell nucleus. Bioessays 9:72-78.
Cavalier-Smith, T. 1989. Molecular phylogeny. Archaebacteria and Archezoa. Nature
339:100-101.
Cavalier-Smith, T. 2002c. The neomuran origin of archaebacteria, the negibacterial root
of the universal tree and bacterial megaclassification. Int J Syst Evol Microbiol
52:7-76.
Cavalier-Smith, T. 2003b. Genomic reduction and evolution of novel genetic membranes
and protein-targeting machinery in eukaryote-eukaryote chimaeras (meta-algae).
Philos Trans R Soc Lond B Biol Sci 358:109-133; discussion 133-134.
Cavalier-Smith, T. 2010. Origin of the cell nucleus, mitosis and sex: Roles of
intracellular coevolution. Biol Direct 5:7.
Cavalier-Smith, T. 2002d. Origins of the machinery of recombination and sex. Heredity
88:125-141.
Cavalier-Smith, T. 1981b. The origin and early evolution of the eukaryotic cell in M. J.
Carlile, J. F. Collins, and B. E. B. Moseley, eds. Molecular and Cellular Aspects
of Microbial Evolution. Cambridge University Press, Cambridge, UK.
Cavalier-Smith, T. 1987c. The origin of fungi and pseudofungi. Pp. 339-353 in A. D. M.
Rayner, ed. Evolutionary biology of fungi. Cambridge Univ. Press, Cambridge.
Cavalier-Smith, T., and E. E. Chao. 2010. Phylogeny and evolution of Apusomonadida
(Protozoa: Apusozoa): New genera and species. Protist.
Cavalier-Smith, T., and E. E. Chao. 2003a. Phylogeny and classification of phylum
Cercozoa (Protozoa). Protist 154:341-358.
Cavalier-Smith, T., and E. E. Chao. 2003b. Phylogeny of choanozoa, apusozoa, and other
protozoa and early eukaryote megaevolution. J Mol Evol 56:540-563.
Cavalier-Smith, T., and E. E. Chao. 2003c. Molecular phylogeny of centrohelid heliozoa,
a novel lineage of bikont eukaryotes that arose by ciliary loss. J Mol Evol
56:387-396.
Cavalier-Smith, T., and E. E. Chao. 2006. Phylogeny and megasystematics of
phagotrophic heterokonts (kingdom Chromista). J Mol Evol 62:388-420.
Chandley, A. C. 1966. Studies on Oogenesis in Drosophila melanogaster with
3H-Thymidine Label. Experimental Cell Research 44:201-&.
Chang, Y. X., L. Gong, W. Y. Yuan, X. W. Li, G. X. Chen, X. H. Li, Q. F. Zhang, and C.
Y. Wu. 2009. Replication Protein A (RPA1a) is required for meiotic and somatic
DNA repair but is dispensable for DNA replication and homologous
recombination in rice. Plant Physiology 151:2162-2173.
Charlesworth, B. 1991. Evolution - When to Be Diploid. Nature 351:273-274.
Charlesworth, B., and N. H. Barton. 1996. Recombination load associated with selection
for increased recombination. Genetical Research 67:27-41.
Chen, G., S. S. F. Yuan, W. Liu, Y. Xu, K. Trujillo, B. W. Song, F. Cong, S. P. Goff, Y.
Wu, R. Arlinghaus, D. Baltimore, P. J. Gasser, M. S. Park, P. Sung, and E. Lee.
1999. Radiation-induced assembly of Rad51 and Rad52 recombination complex
requires ATM and c-Abl. Journal of Biological Chemistry 274:12748-12752.
Chen, L. T., T. P. Ko, Y. C. Chang, K. A. Lin, C. S. Chang, A. H. J. Wang, and T. F.
Wang. 2007. Crystal structure of the left-handed archaeal RadA helical filament:
identification of a functional motif for controlling quaternary structures and
enzymatic functions of RecA family proteins. Nucleic Acids Research
35:1787-1801.
Chen, Y. K., C. H. Leng, H. Olivares, M. H. Lee, Y. C. Chang, W. M. Kung, S. C. Ti, Y.
H. Lo, A. H. Wang, C. S. Chang, D. K. Bishop, Y. P. Hsueh, and T. F. Wang.
2004. Heterodimeric complexes of Hop2 and Mnd1 function with Dmc1 to
promote meiotic homolog juxtaposition and strand assimilation. Proc Natl Acad
Sci U S A 101:10572-10577.
Chen, Z. C., H. J. Yang, and N. P. Pavletich. 2008. Mechanism of homologous
recombination from the RecA-ssDNA/dsDNA structures. Nature 453:489-U483.
Chi, P., Y. Kwon, C. Seong, A. Epshtein, I. Lam, P. Sung, and H. L. Klein. 2006. Yeast
recombination factor Rdh54 functionally interacts with the Rad51 recombinase
and catalyzes Rad51 removal from DNA. J Biol Chem 281:26268-26279.
Churchill, F. B. 1970. Hertwig, Weismann, and the meaning of the reduction division,
circa 1890. Isis 61:429-457.
Clark, A. J., and S. J. Sandler. 1994. Homologous genetic recombination: the pieces
begin to fall into place. Crit Rev Microbiol 20:125-142.
Cleveland, L. R. 1956. Brief Accounts of the Sexual Cycles of the Flagellates of
Cryptocercus. Journal of Protozoology 3:161-180.
Cleveland, L. R. 1947. The Origin and Evolution of Meiosis. Science 105:287-289.
Cole, E. S., D. Cassidy-Hanley, J. Hemish, J. Tuan, and P. J. Bruns. 1997. A mutational
analysis of conjugation in Tetrahymena thermophila. 1. Phenotypes affecting
early development: meiosis to nuclear selection. Dev Biol 189:215-232.
Colegrave, N., O. Kaltz, and G. Bell. 2002. The ecology and genetics of fitness in
Chlamydomonas. VIII. The dynamics of adaptation to novel environments after a
single episode of sex. Evolution 56:14-21.
Collins, J. E., C. Wright, C. A. Edwards, M. P. Davis, J. A. Grinham, C. G. Cole, M. E.
Goward, B. Aguado, M. Mallya, Y. Mokrab, E. J. Huckle, D. M. Beare, and I.
Dunham. 2004. A genome annotation-driven approach to cloning the human
ORFeome. Genome Biology 5:11.
Conway, A. B., T. W. Lynch, Y. Zhang, G. S. Fortin, C. W. Fung, L. S. Symington, and
P. A. Rice. 2004. Crystal structure of a Rad51 filament. Nature Structural &
Molecular Biology 11:791-796.
Corbett, K. D., and J. M. Berger. 2003. Structure of the topoisomerase VI-B subunit:
implications for type II topoisomerase mechanism and evolution. EMBO J
22:151-163.
Cox, C. J., P. G. Foster, R. P. Hirt, S. R. Harris, and T. M. Embley. 2008. The
archaebacterial origin of eukaryotes. Proc Natl Acad Sci U S A 105:20356-20361.
Cox, M. M. 2007. Motoring along with the bacterial RecA protein. Nature Reviews
Molecular Cell Biology 8:127-138.
Cox, M. M. 2003. The bacterial RecA protein as a motor protein. Annual Review of
Microbiology 57:551-577.
Cox, M. M. 1993. Relating biochemistry to biology: how the recombinational repair
function of RecA protein is manifested in its molecular properties. Bioessays
15:617-623.
Crow, J. F., and M. Kimura. 1965. Evolution in sexual and asexual populations.
American Naturalist 99:439-450.
d'Erfurth, I., S. Jolivet, N. Froger, O. Catrice, M. Novatchkova, and R. Mercier. 2009.
Turning meiosis into mitosis. PLoS Biol 7:e1000124.
Dacks, J., and A. J. Roger. 1999. The first sexual lineage and the relevance of facultative
sex. J Mol Evol 48:779-783.
Dacks, J. B., and W. F. Doolittle. 2001. Reconstructing/deconstructing the earliest
eukaryotes: How comparative genomics can help. Cell 107:419-425.
Dahmann, C., J. F. Diffley, and K. A. Nasmyth. 1995. S-phase-promoting
cyclin-dependent kinases prevent re-replication by inhibiting the transition of replication
origins to a pre-replicative state. Curr Biol 5:1257-1269.
Darwin, C. 1859. On the origin of species by means of natural selection. J. Murray,
London.
Davis, A. P., and L. S. Symington. 2001. The yeast recombinational repair protein Rad59
interacts with Rad52 and stimulates single-strand annealing. Genetics 159:515-525.
Davis, A. P., and L. S. Symington. 2003. The Rad52-Rad59 complex interacts with
Rad51 and Replication Protein A. DNA Repair 2:1127-1134.
de la Cruz, J., D. Kressler, and P. Linder. 1999. Unwinding RNA in Saccharomyces
cerevisiae: DEAD-box proteins and related families. Trends Biochem Sci
24:192-198.
DePamphilis, M. L. 1996. DNA replication in eukaryotic cells. Cold Spring Harbor
Laboratory Press, [Plainview, New York].
Dernburg, A. F., K. McDonald, G. Moulder, R. Barstead, M. Dresser, and A. M.
Villeneuve. 1998. Meiotic recombination in C. elegans initiates by a conserved
mechanism and is dispensable for homologous chromosome synapsis. Cell
94:387-398.
Diffley, J. F. 1996. Once and only once upon a time: specifying and regulating origins of
DNA replication in eukaryotic cells. Genes Dev 10:2819-2830.
Doolittle, W. F., C. L. Nesbo, E. Bapteste, and O. Zhaxybayeva. 2008. Lateral gene
transfer. Pp. 45-79 in M. Pagel, and A. Pomiankowski, eds. In Evolutionary
Genomics and Proteomics. Sinauer.
Dudas, A., and M. Chovanec. 2004. DNA double-strand break repair by homologous
recombination. Mutation Research-Reviews in Mutation Research 566:131-167.
Dunn, C. W., A. Hejnol, D. Q. Matus, K. Pang, W. E. Browne, S. A. Smith, E. Seaver, G.
W. Rouse, M. Obst, G. D. Edgecombe, M. V. Sorensen, S. H. D. Haddock, A.
Schmidt-Rhaesa, A. Okusu, R. M. Kristensen, W. C. Wheeler, M. Q. Martindale,
and G. Giribet. 2008. Broad phylogenomic sampling improves resolution of the
animal tree of life. Nature 452:745-U745.
Edgar, R. C. 2004. MUSCLE: a multiple sequence alignment method with reduced time
and space complexity. BMC Bioinformatics 5:1-19.
Edlind, T. D., J. Li, G. S. Visvesvara, M. H. Vodkin, G. L. McLaughlin, and S. K.
Katiyar. 1996. Phylogenetic analysis of beta-tubulin sequences from
amitochondrial protozoa. Molecular Phylogenetics and Evolution 5:359-367.
Elena, S. F., and R. E. Lenski. 1997. Test of synergistic interactions among deleterious
mutations in bacteria. Nature 390:395-398.
Embley, T. M., and W. Martin. 2006. Eukaryotic evolution, changes and challenges.
Nature 440:623-630.
Eme, L., D. Moreira, E. Talla, and C. Brochier-Armanet. 2009. A complex cell division
machinery was present in the last common ancestor of eukaryotes. PLoS One
4:e5021.
Enomoto, R., T. Kinebuchi, M. Sato, H. Yagi, H. Kurumizaka, and S. Yokoyama. 2006.
Stimulation of DNA strand exchange by the human TBPIP/Hop2-Mnd1 complex.
Journal of Biological Chemistry 281:5575-5581.
Fast, N. M., J. C. Kissinger, D. S. Roos, and P. J. Keeling. 2001. Nuclear-encoded,
plastid-targeted genes suggest a single common origin for apicomplexan and
dinoflagellate plastids. Mol Biol Evol 18:418-426.
Feldman, M. W. 1972. Selection for Linkage Modification .1. Random Mating
Populations. Theoretical Population Biology 3:324-&.
Feldman, M. W., F. B. Christiansen, and L. D. Brooks. 1980. Evolution of
Recombination in a Constant Environment. Proceedings of the National Academy
of Sciences of the United States of America-Biological Sciences 77:4838-4841.
Felsenstein, J. 2004. Inferring Phylogenies. Pp. 664. Sinauer Associates, Inc., Sunderland,
MA.
Fenchel, T., and B. J. Finlay. 2004. The ubiquity of small species: Patterns of local and
global diversity. Bioscience 54:777-784.
Feng, Q., L. During, A. A. de Mayolo, G. Lettier, M. Lisby, N. Erdeniz, U. H.
Mortensen, and R. Rothstein. 2007. Rad52 and Rad59 exhibit both overlapping
and distinct functions. DNA Repair 6:27-37.
Filippo, J. S., P. Sung, and H. Klein. 2008. Mechanism of eukaryotic homologous
recombination. Annual Review of Biochemistry 77:229-257.
Firmenich, A. A., M. Elias-Arnanz, and P. Berg. 1995. A novel allele of Saccharomyces
cerevisiae RFA1 that is deficient in recombination and repair and suppressible by
RAD52. Mol Cell Biol 15:1620-1631.
Fisher, R. A. 1930. The genetical theory of natural selection. The Clarendon press,
Oxford.
Flaus, A., D. M. A. Martin, G. J. Barton, and T. Owen-Hughes. 2006. Identification of
multiple distinct Snf2 subfamilies with conserved structural motifs. Nucleic Acids
Research 34:2887-2905.
Flemming, W. 1878. Zur Kenntniss der Zelle und ihrer Theilungs-Erscheinungen.
Schriften des Naturwissenschaftlichen Vereins für Schleswig-Holstein 3:23-27.
Flowers, J. M., S. I. Li, A. Stathos, G. Saxer, E. A. Ostrowski, D. C. Queller, J. E.
Strassmann, and M. D. Purugganan. 2010. Variation, Sex, and Social
Cooperation: Molecular Population Genetics of the Social Amoeba Dictyostelium
discoideum. PLoS Genetics 6:-.
Force, A., M. Lynch, F. B. Pickett, A. Amores, Y. L. Yan, and J. Postlethwait. 1999.
Preservation of duplicate genes by complementary, degenerative mutations.
Genetics 151:1531-1545.
Fortin, G. S., and L. S. Symington. 2002. Mutations in yeast Rad51 that partially bypass
the requirement for Rad55 and Rad57 in DNA repair by increasing the stability of
Rad51-DNA complexes. EMBO Journal 21:3160-3170.
Fung, C. W., G. S. Fortin, S. E. Peterson, and L. S. Symington. 2006. The rad51-K191R
ATPase-defective mutant is impaired for presynaptic filament formation. Mol
Cell Biol 26:9544-9554.
Fung, C. W., A. M. Mozlin, and L. S. Symington. 2009. Suppression of the
double-strand-break-repair defect of the Saccharomyces cerevisiae rad57 mutant.
Genetics 181:1195-1206.
Gabaldon, T., and M. A. Huynen. 2003. Reconstruction of the proto-mitochondrial
metabolism. Science 301:609.
Game, J. C., and R. K. Mortimer. 1974. A genetic study of x-ray sensitive mutants in
yeast. Mutat Res 24:281-292.
Gasior, S. L., H. Olivares, U. Ear, D. M. Hari, R. Weichselbaum, and D. K. Bishop.
2001. Assembly of RecA-like recombinases: Distinct roles for mediator proteins
in mitosis and meiosis. Proceedings of the National Academy of Sciences of the
United States of America 98:8411-8418.
Gasior, S. L., A. K. Wong, Y. Kora, A. Shinohara, and D. K. Bishop. 1998. Rad52
associates with RPA and functions with rad55 and rad57 to assemble meiotic
recombination complexes. Genes Dev 12:2208-2221.
Germot, A., H. Philippe, and H. LeGuyader. 1997. Evidence for loss of mitochondria in
Microsporidia from a mitochondrial-type HSP70 in Nosema locustae. Molecular
and Biochemical Parasitology 87:159-168.
Gogarten, J. P., H. Kibak, P. Dittrich, L. Taiz, E. J. Bowman, B. J. Bowman, M. F.
Manolson, R. J. Poole, T. Date, T. Oshima, J. Konishi, K. Denda, and M.
Yoshida. 1989. Evolution of the Vacuolar H+-ATPase - Implications for the Origin
of Eukaryotes. Proceedings of the National Academy of Sciences of the United
States of America 86:6661-6665.
Goh, C. S., A. A. Bogan, M. Joachimiak, D. Walther, and F. E. Cohen. 2000. Coevolution of proteins with their interaction partners. Journal of Molecular Biology
299:283-293.
Gray, M. W. 1989. The evolutionary origins of organelles. Trends Genet 5:294-299.
Gray, M. W., and W. F. Doolittle. 1982. Has the endosymbiont hypothesis been proven?
Microbiol Rev 46:1-42.
Griffith, F. 1928. The Significance of Pneumococcal Types. J Hyg (Lond) 27:113-159.
Griffiths, A. J. F., J. H. Miller, D. T. Suzuki, R. C. Lewontin, and W. M. Gelbart. 2000.
Introduction to genetic analysis. W.H. Freeman and Co., New York.
Grigorescu, A. A., J. H. A. Vissers, D. Ristic, Y. Z. Pigli, T. W. Lynch, C. Wyman, and
P. A. Rice. 2009. Inter-subunit interactions that coordinate Rad51's activities.
Nucleic Acids Research 37:557-567.
Grishchuk, A. L., R. Kraehenbuehl, M. Molnar, O. Fleck, and J. Kohli. 2004. Genetic and
cytological characterization of the RecA-homologous proteins Rad51 and Dmc1
of Schizosaccharomyces pombe. Curr Genet 44:317-328.
Gruber, S., C. H. Haering, and K. Nasmyth. 2003. Chromosomal cohesin forms a ring.
Cell 112:765-777.
Guindon, S., F. Delsuc, J. F. Dufayard, and O. Gascuel. 2009. Estimating maximum
likelihood phylogenies with PhyML. Methods Mol Biol 537:113-137.
Hackett, J. D., H. S. Yoon, S. Li, A. Reyes-Prieto, S. E. Rummele, and D. Bhattacharya.
2007. Phylogenomic analysis supports the monophyly of cryptophytes and
haptophytes and the association of rhizaria with chromalveolates. Mol Biol Evol
24:1702-1713.
Hall, T. Z. 1999. BioEdit: a user-friendly biological sequence alignment editor and
analysis program for Windows 95/98/NT. Nucl. Acids Symp. Ser. 41:95-98.
Hamilton, W. J. 1999. Evolution of Sex. Oxford University Press, Oxford.
Hampl, V., L. Hug, J. W. Leigh, J. B. Dacks, B. F. Lang, A. G. Simpson, and A. J. Roger.
2009. Phylogenomic analyses support the monophyly of Excavata and resolve
relationships among eukaryotic "supergroups". Proc Natl Acad Sci U S A
106:3859-3864.
Han, T. M., and B. Runnegar. 1992. Megascopic eukaryotic algae from the
2.1-billion-year-old Negaunee iron-formation, Michigan. Science 257:232-235.
Hartung, F., K. J. Angelis, A. Meister, I. Schubert, M. Melzer, and H. Puchta. 2002. An
archaebacterial topoisomerase homolog not present in other eukaryotes is
indispensable for cell proliferation of plants. Curr Biol 12:1787-1791.
Hartung, F., and H. Puchta. 2001. Molecular characterization of homologues of both
subunits A (SPO11) and B of the archaebacterial topoisomerase 6 in plants. Gene
271:81-86.
Hashimoto, T., Y. Nakamura, T. Kamaishi, and M. Hasegawa. 1997. Early evolution of
eukaryotes inferred from protein phylogenies of translation elongation factors 1
alpha and 2. Archiv Fur Protistenkunde 148:287-295.
Hauf, S., and Y. Watanabe. 2004. Kinetochore orientation in mitosis and meiosis. Cell
119:317-327.
Hayles, J., D. Fisher, A. Woollard, and P. Nurse. 1994. Temporal order of S phase and
mitosis in fission yeast is determined by the state of the p34cdc2-mitotic B cyclin
complex. Cell 78:813-822.
Hays, S. L., A. A. Firmenich, and P. Berg. 1995. Complex formation in yeast
double-strand break repair: participation of Rad51, Rad52, Rad55, and Rad57 proteins.
Proc Natl Acad Sci U S A 92:6925-6929.
Hays, S. L., A. A. Firmenich, P. Massey, R. Banerjee, and P. Berg. 1998. Studies of the
interaction between Rad52 protein and the yeast single-stranded DNA binding
protein RPA. Mol Cell Biol 18:4400-4406.
Heiges, M., H. Wang, E. Robinson, C. Aurrecoechea, X. Gao, N. Kaluskar, P. Rhodes, S.
Wang, C. Z. He, Y. Su, J. Miller, E. Kraemer, and J. C. Kissinger. 2006.
CryptoDB: a Cryptosporidium bioinformatics resource update. Nucleic Acids Res
34:D419-422.
Henry, J. M., R. Camahort, D. A. Rice, L. Florens, S. K. Swanson, M. P. Washburn, and
J. L. Gerton. 2006. Mnd1/Hop2 facilitates Dmc1-dependent interhomolog
crossover formation in meiosis of budding yeast. Mol Cell Biol 26:2913-2923.
Herskowitz, I. 1988. Life-Cycle of the Budding Yeast Saccharomyces cerevisiae.
Microbiological Reviews 52:536-553.
Hill, W. G., and A. Robertson. 1966. Effect of Linkage on Limits to Artificial Selection.
Genetical Research 8:269-&.
Hirt, R. P., B. Healy, C. R. Vossbrinck, E. U. Canning, and T. M. Embley. 1997. A
mitochondrial Hsp70 orthologue in Vairimorpha necatrix: Molecular evidence
that microsporidia once contained mitochondria. Current Biology 7:995-998.
Hirt, R. P., J. M. Logsdon, B. Healy, M. W. Dorey, W. F. Doolittle, and T. M. Embley.
1999. Microsporidia are related to Fungi: Evidence from the largest subunit of
RNA Polymerase II and other proteins. Proceedings of the National Academy of
Sciences of the United States of America 96:580-585.
Hoffmann, E. R., P. V. Shcherbakova, T. A. Kunkel, and R. H. Borts. 2003. MLH1
mutations differentially affect meiotic functions in Saccharomyces cerevisiae.
Genetics 163:515-526.
Holzen, T. M., P. P. Shah, H. A. Olivares, and D. K. Bishop. 2006. Tid1/Rdh54 promotes
dissociation of Dmc1 from nonrecombinogenic sites in meiotic chromatin. Genes
& Development 20:2593-2604.
Horiike, T., K. Hamada, S. Kanaya, and T. Shinozawa. 2001. Origin of eukaryotic cell
nuclei by symbiosis of Archaea in Bacteria is revealed by homology-hit analysis.
Nature Cell Biology 3:210-214.
Hunter, N., and R. H. Borts. 1997. Mlh1 is unique among mismatch repair proteins in its
ability to promote crossing-over during meiosis. Genes Dev 11:1573-1582.
Hurst, L. D., and P. Nurse. 1991. A Note on the Evolution of Meiosis. Journal of
Theoretical Biology 150:561-563.
Huxley, J. 1942. Evolution, the modern synthesis. G. Allen & Unwin ltd, London.
Ishibashi, T., S. Kimura, and K. Sakaguchi. 2006. A higher plant has three different types
of RPA heterotrimeric complex. J Biochem 139:99-104.
Ishibashi, T., A. Koga, T. Yamamoto, Y. Uchiyama, Y. Mori, J. Hashimoto, S. Kimura,
and K. Sakaguchi. 2005. Two types of Replication Protein A in seed plants. FEBS
J 272:3270-3281.
Ito, M., and M. H. Takegami. 1982. Commitment of Mitotic Cells to Meiosis during the
G2 Phase of Pre-Meiosis. Plant and Cell Physiology 23:943-952.
Iwabe, N., K. Kuma, M. Hasegawa, S. Osawa, and T. Miyata. 1989. Evolutionary
Relationship of Archaebacteria, Eubacteria, and Eukaryotes Inferred from
Phylogenetic Trees of Duplicated Genes. Proceedings of the National Academy of
Sciences of the United States of America 86:9355-9359.
Iwabe, N., K. Kuma, H. Kishino, M. Hasegawa, and T. Miyata. 1991. Evolution of
RNA-Polymerases and Branching Patterns of the 3 Major Groups of Archaebacteria.
Journal of Molecular Evolution 32:70-78.
Janouskovec, J., A. Horak, M. Obornik, J. Lukes, and P. J. Keeling. 2010. A common red
algal origin of the apicomplexan, dinoflagellate, and heterokont plastids.
Proceedings of the National Academy of Sciences of the United States of America
107:10949-10954.
Janssens, F. A. 1909. La Theorie de la chiasmatypie. Nouvelle interpretation des
cineses de maturation. La Cellule 25:387-411.
John, B. 1990. Meiosis. Cambridge University Press, Cambridge [England] ; New York.
Johnson, R. D., and L. S. Symington. 1995. Functional differences and interactions
among the putative RecA homologs Rad51, Rad55, and Rad57. Mol Cell Biol
15:4843-4850.
Jorgensen, A., and E. Sterud. 2007. Phylogeny of Spironucleus (Eopharyngia :
Diplomonadida : Hexamitinae). Protist 158:247-254.
Kadyk, L. C., and L. H. Hartwell. 1992. Sister chromatids are preferred over homologs as
substrates for recombination repair in Saccharomyces cerevisiae. Genetics
132:387-402.
Kaltz, O., and G. Bell. 2002. The ecology and genetics of fitness in Chlamydomonas. XII.
Repeated sexual episodes increase rates of adaptation to novel environments.
Evolution 56:1743-1753.
Kamaishi, T., T. Hashimoto, Y. Nakamura, F. Nakamura, S. Murata, N. Okada, K.
Okamoto, M. Shimizu, and M. Hasegawa. 1996. Protein phylogeny of translation
elongation factor EF-1 alpha suggests microsporidians are extremely ancient
eukaryotes. Journal of Molecular Evolution 42:257-263.
Kathiresan, A., G. S. Khush, and J. Bennett. 2002. Two rice DMC1 genes are
differentially expressed during meiosis and during haploid and diploid mitosis.
Sexual Plant Reproduction 14:257-267.
Katinka, M. D., S. Duprat, E. Cornillot, G. Metenier, F. Thomarat, G. Prensier, V. Barbe,
E. Peyretaillade, P. Brottier, P. Wincker, F. Delbac, H. El Alaoui, P. Peyret, W.
Saurin, M. Gouy, J. Weissenbach, and C. P. Vivares. 2001. Genome sequence and
gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature
414:450-453.
Keane, T. M., C. J. Creevey, M. M. Pentony, T. J. Naughton, and J. O. McInerney. 2006.
Assessment of methods for amino acid matrix selection and their use on empirical
data shows that ad hoc assumptions for choice of matrix are not justified. BMC
Evolutionary Biology 6:17.
Keeling, P. J. 2010. The endosymbiotic origin, diversification and fate of plastids.
Philosophical Transactions of the Royal Society B-Biological Sciences 365:729-748.
Keeling, P. J., and W. F. Doolittle. 1996. Alpha-tubulin from early-diverging eukaryotic
lineages and the evolution of the tubulin family. Molecular Biology and Evolution
13:1297-1305.
Keeney, S., C. N. Giroux, and N. Kleckner. 1997. Meiosis-specific DNA double-strand
breaks are catalyzed by Spo11, a member of a widely conserved protein family.
Cell 88:375-384.
Kiianitsa, K., J. A. Solinger, and W. D. Heyer. 2002. Rad54 protein exerts diverse modes
of ATPase activity on duplex DNA partially and fully covered with Rad51
protein. J Biol Chem 277:46205-46215.
Kim, E., A. G. B. Simpson, and L. E. Graham. 2006. Evolutionary relationships of
apusomonads inferred from taxon-rich analyses of 6 nuclear encoded genes.
Molecular Biology and Evolution 23:2455-2466.
Kimura, S., and K. Sakaguchi. 2006. DNA repair in plants. Chem Rev 106:753-766.
King, N., M. J. Westbrook, S. L. Young, A. Kuo, M. Abedin, J. Chapman, S. Fairclough,
U. Hellsten, Y. Isogai, I. Letunic, M. Marr, D. Pincus, N. Putnam, A. Rokas, K. J.
Wright, R. Zuzow, W. Dirks, M. Good, D. Goodstein, D. Lemons, W. Li, J. B.
Lyons, A. Morris, S. Nichols, D. J. Richter, A. Salamov, J. G. Sequencing, P.
Bork, W. A. Lim, G. Manning, W. T. Miller, W. McGinnis, H. Shapiro, R. Tjian,
I. V. Grigoriev, and D. Rokhsar. 2008. The genome of the choanoflagellate
Monosiga brevicollis and the origin of metazoans. Nature 451:783-788.
Kirk, D. L., and M. M. Kirk. 1986. Heat-Shock Elicits Production of Sexual Inducer in
Volvox. Science 231:51-54.
Kleckner, N. 1996. Meiosis: how could it work? Proc Natl Acad Sci U S A 93:8167-8174.
Klein, H. L. 1997. RDH54, a RAD54 homologue in Saccharomyces cerevisiae, is
required for mitotic diploid-specific recombination and repair and for meiosis.
Genetics 147:1533-1543.
Knoll, A. H. 2003. Life on a young planet : the first three billion years of evolution on
Earth. Princeton University Press, Princeton, N.J.
Kolas, N. K., and D. Durocher. 2006. DNA repair: DNA polymerase zeta and Rev1 break
in. Curr Biol 16:R296-299.
Kolisko, M., I. Cepicka, V. Hampl, J. Leigh, A. J. Roger, J. Kulda, A. G. Simpson, and J.
Flegr. 2008. Molecular phylogeny of diplomonads and enteromonads based on
SSU rRNA, alpha-tubulin and HSP90 genes: implications for the evolutionary
history of the double karyomastigont of diplomonads. BMC Evol Biol 8:205.
Komori, K., T. Miyata, H. Daiyasu, H. Toh, H. Shinagawa, and Y. Ishino. 2000. Domain
analysis of an archaeal RadA protein for the strand exchange activity. Journal of
Biological Chemistry 275:33791-33797.
Kondrashov, A. S. 1993. Classification of Hypotheses on the Advantage of Amphimixis.
Journal of Heredity 84:372-387.
Kondrashov, A. S. 1984. Deleterious Mutations as an Evolutionary Factor .1. The
Advantage of Recombination. Genetical Research 44:199-217.
Kondrashov, A. S. 1994. The Asexual Ploidy Cycle and the Origin of Sex. Nature
370:213-216.
Kondrashov, A. S. 1988. Deleterious mutations and the evolution of sexual reproduction.
Nature 336:435-440.
Kondrashov, A. S., and J. F. Crow. 1991. Haploidy or diploidy: which is better? Nature
351:314-315.
Koonin, E. V. 2010. The origin and early evolution of eukaryotes in the light of
phylogenomics. Genome Biology 11:-.
Kouyos, R. D., S. P. Otto, and S. Bonhoeffer. 2006. Effect of varying epistasis on the
evolution of recombination. Genetics 173:589-597.
Krejci, L., J. Damborsky, B. Thomsen, M. Duno, and C. Bendixen. 2001. Molecular
dissection of interactions between Rad51 and members of the recombination-repair
group. Molecular and Cellular Biology 21:966-976.
Krejci, L., B. Song, W. Bussen, R. Rothstein, U. H. Mortensen, and P. Sung. 2002.
Interaction with Rad51 is indispensable for recombination mediator function of
Rad52. J Biol Chem 277:40132-40141.
Krogh, B. O., and L. S. Symington. 2004. Recombination proteins in yeast. Annu. Rev.
Genet. 38:233-271.
Kudoh, A., S. Iwahori, Y. Sato, S. Nakayama, H. Isomura, T. Murata, and T. Tsurumi.
2009. Homologous recombinational repair factors are recruited and loaded onto
the viral DNA genome in Epstein-Barr virus replication compartments. Journal of
Virology 83:6641-6651.
Kuhn, C. D., S. R. Geiger, S. Baumli, M. Gartmann, J. Gerber, S. Jennebach, T. Mielke,
H. Tschochner, R. Beckmann, and P. Cramer. 2007. Functional architecture of
RNA Polymerase I. Cell 131:1260-1272.
Lake, J. A., E. Henderson, M. Oakes, and M. W. Clark. 1984. Eocytes - a New Ribosome
Structure Indicates a Kingdom with a Close Relationship to Eukaryotes.
Proceedings of the National Academy of Sciences of the United States of
America-Biological Sciences 81:3786-3790.
Lake, J. A., and M. C. Rivera. 1994. Was the Nucleus the 1st Endosymbiont?
Proceedings of the National Academy of Sciences of the United States of America
91:2880-2881.
Lartillot, N., T. Lepage, and S. Blanquart. 2009. PhyloBayes 3: a Bayesian software
package for phylogenetic reconstruction and molecular dating. Bioinformatics
25:2286-2288.
Latypov, V., M. Rothenberg, A. Lorenz, G. Octobre, O. Csutak, E. Lehmann, J. Loidl,
and J. Kohli. 2010. Roles of Hop1 and Mek1 in Meiotic Chromosome Pairing and
Recombination Partner Choice in Schizosaccharomyces pombe. Molecular and
Cellular Biology 30:1570-1581.
Lederberg, J., E. M. Lederberg, N. D. Zinder, and E. R. Lively. 1951. Recombination
analysis of bacterial heredity. Cold Spring Harb Symp Quant Biol 16:413-443.
Lederberg, J., and E. L. Tatum. 1946. Gene recombination in Escherichia coli. Nature
158:558.
Lee, C., B. Hong, J. M. Choi, Y. Kim, S. Watanabe, Y. Ishimi, T. Enomoto, S. Tada, and
Y. Cho. 2004. Structural basis for inhibition of the replication licensing factor
Cdt1 by geminin. Nature 430:913-917.
Lee, K. Y., and K. J. Myung. 2008. PCNA modifications for regulation of post-replication
repair pathways. Molecules and Cells 26:5-11.
Leipe, D. D., J. H. Gunderson, T. A. Nerad, and M. L. Sogin. 1993. Small subunit
ribosomal RNA+ of Hexamita inflata and the quest for the first branch in the
eukaryotic tree. Mol Biochem Parasitol 59:41-48.
Leu, J. Y., P. R. Chua, and G. S. Roeder. 1998. The meiosis-specific Hop2 protein of S.
cerevisiae ensures synapsis between homologous chromosomes. Cell 94:375-386.
Lewis, W. M. 1985. Nutrient Scarcity as an Evolutionary Cause of Haploidy. American
Naturalist 125:692-701.
Lewontin, R. C. 1971. Effect of Genetic Linkage on Mean Fitness of Population.
Proceedings of the National Academy of Sciences of the United States of America
68:984-&.
Lewontin, R. C. 1974. The genetic basis of evolutionary change. Columbia University
Press, New York.
Li, A., and J. J. Blow. 2005. Cdt1 downregulation by proteolysis and geminin inhibition
prevents DNA re-replication in Xenopus. EMBO J 24:395-404.
Lichten, M. 2001. Meiotic recombination: Breaking the genome to save it. Current
Biology 11:R253-R256.
Lima-de-Faria, A. 1969. Handbook of molecular cytology. Pp. xv, 1508 p. with illus.
Frontiers of biology (Amsterdam), v. 15. North-Holland Pub. Co., Amsterdam.
Lin, Y., and G. R. Smith. 1994. Transient, meiosis-induced expression of the rec6 and
rec12 genes of Schizosaccharomyces pombe. Genetics 136:769-779.
Lin, Z. G., H. Z. Kong, M. Nei, and H. Ma. 2006. Origins and evolution of the
recA/RAD51 gene family: Evidence for ancient gene duplication and
endosymbiotic gene transfer. Proceedings of the National Academy of Sciences of
the United States of America 103:10328-10333.
Lopez-Casamichana, M., E. Orozco, L. A. Marchat, and C. Lopez-Camarillo. 2008.
Transcriptional profile of the homologous recombination machinery and
characterization of the EhRAD51 recombinase in response to DNA damage in
Entamoeba histolytica. BMC Molecular Biology 9:16.
Maeshima, K., K. Morimatsu, A. Shinohara, and T. Horii. 1995. Rad51 Homologs in
Xenopus laevis - 2 Distinct Genes Are Highly Expressed in Ovary and Testis.
Gene 160:195-200.
Malik, S.-B., A. W. Pightling, L. M. Stefaniak, A. M. Schurko, and J. M. Logsdon. 2008.
An expanded inventory of conserved meiotic genes provides evidence for sex in
Trichomonas vaginalis. PLoS One 3:1-13.
Malik, S. B. 2007. The Early Evolution of Meiotic Genes. Pp. 238. Biology. University
of Iowa, Iowa City.
Malik, S. B., M. A. Ramesh, A. M. Hulstrand, and J. M. Logsdon, Jr. 2007. Protist
homologs of the meiotic Spo11 gene and topoisomerase VI reveal an evolutionary
history of gene duplication and lineage-specific loss. Mol Biol Evol 24:2827-2841.
Marcon, E., and P. B. Moens. 2005. The evolution of meiosis: recruitment and
modification of somatic DNA-repair proteins. Bioessays 27:795-808.
Margulis, L. 1970. Origin of eukaryotic cells; evidence and research implications for a
theory of the origin and evolution of microbial, plant, and animal cells on the
Precambrian earth. Yale University Press, New Haven.
Martin, F., A. Aerts, D. Ahren, A. Brun, E. G. Danchin, F. Duchaussoy, J. Gibon, A.
Kohler, E. Lindquist, V. Pereda, A. Salamov, H. J. Shapiro, J. Wuyts, D. Blaudez,
M. Buee, P. Brokstein, B. Canback, D. Cohen, P. E. Courty, P. M. Coutinho, C.
Delaruelle, J. C. Detter, A. Deveau, S. DiFazio, S. Duplessis, L. Fraissinet-Tachet,
E. Lucic, P. Frey-Klett, C. Fourrey, I. Feussner, G. Gay, J. Grimwood, P.
J. Hoegger, P. Jain, S. Kilaru, J. Labbe, Y. C. Lin, V. Legue, F. Le Tacon, R.
Marmeisse, D. Melayah, B. Montanini, M. Muratet, U. Nehls, H. Niculita-Hirzel,
M. P. Oudot-Le Secq, M. Peter, H. Quesneville, B. Rajashekar, M. Reich, N.
Rouhier, J. Schmutz, T. Yin, M. Chalot, B. Henrissat, U. Kues, S. Lucas, Y. Van
de Peer, G. K. Podila, A. Polle, P. J. Pukkila, P. M. Richardson, P. Rouze, I. R.
Sanders, J. E. Stajich, A. Tunlid, G. Tuskan, and I. V. Grigoriev. 2008. The
genome of Laccaria bicolor provides insights into mycorrhizal symbiosis. Nature
452:88-92.
Martin, W. 1999. A briefly argued case that mitochondria and plastids are descendants of
endosymbionts, but that the nuclear compartment is not. Proceedings of the Royal
Society of London Series B-Biological Sciences 266:1387-1395.
Masson, J. Y., and S. C. West. 2001. The Rad51 and Dmc1 recombinases: a non-identical
twin relationship. Trends in Biochemical Sciences 26:131-136.
Matsuzaki, M., O. Misumi, I. T. Shin, S. Maruyama, M. Takahara, S. Y. Miyagishima, T.
Mori, K. Nishida, F. Yagisawa, Y. Yoshida, Y. Nishimura, S. Nakao, T.
Kobayashi, Y. Momoyama, T. Higashiyama, A. Minoda, M. Sano, H. Nomoto, K.
Oishi, H. Hayashi, F. Ohta, S. Nishizaka, S. Haga, S. Miura, T. Morishita, Y.
Kabeya, K. Terasawa, Y. Suzuki, Y. Ishii, S. Asakawa, H. Takano, N. Ohta, H.
Kuroiwa, K. Tanaka, N. Shimizu, S. Sugano, N. Sato, H. Nozaki, N. Ogasawara,
Y. Kohara, and T. Kuroiwa. 2004. Genome sequence of the ultrasmall unicellular
red alga Cyanidioschyzon merolae 10D. Nature 428:653-657.
Maynard Smith, J. 1978. The evolution of sex. Cambridge University Press, Cambridge
[Eng.] ; New York.
Maynard Smith, J., and E. Szathmary. 1995. The major transitions in evolution. Freeman,
Oxford.
Mehdiabadi, N. J., M. R. Kronforst, D. C. Queller, and J. E. Strassmann. 2009.
Phylogeny, Reproductive Isolation and Kin Recognition in the Social Amoeba
Dictyostelium purpureum. Evolution 63:542-548.
Mehdiabadi, N. J., M. R. Kronforst, D. C. Queller, and J. E. Strassmann. 2010.
Phylogeography and sexual macrocyst formation in the social amoeba
Dictyostelium giganteum. BMC Evolutionary Biology 10:-.
Merchant, S. S., S. E. Prochnik, O. Vallon, E. H. Harris, S. J. Karpowicz, G. B. Witman,
A. Terry, A. Salamov, L. K. Fritz-Laylin, L. Marechal-Drouard, W. F. Marshall,
L. H. Qu, D. R. Nelson, A. A. Sanderfoot, M. H. Spalding, V. V. Kapitonov,
Q. Ren, P. Ferris, E. Lindquist, H. Shapiro, S. M. Lucas, J. Grimwood,
J. Schmutz, P. Cardol, H. Cerutti, G. Chanfreau, C. L. Chen, V. Cognat,
M. T. Croft, R. Dent, S. Dutcher, E. Fernandez, H. Fukuzawa,
D. Gonzalez-Ballester, D. Gonzalez-Halphen, A. Hallmann, M. Hanikenne,
M. Hippler, W. Inwood, K. Jabbari, M. Kalanon, R. Kuras, P. A. Lefebvre,
S. D. Lemaire, A. V. Lobanov, M. Lohr, A. Manuell, I. Meier, L. Mets,
M. Mittag, T. Mittelmeier, J. V. Moroney, J. Moseley, C. Napoli,
A. M. Nedelcu, K. Niyogi, S. V. Novoselov, I. T. Paulsen, G. Pazour,
S. Purton, J. P. Ral, D. M. Riano-Pachon, W. Riekhof, L. Rymarquis,
M. Schroda, D. Stern, J. Umen, R. Willows, N. Wilson, S. L. Zimmer,
J. Allmer, J. Balk, K. Bisova, C. J. Chen, M. Elias, K. Gendler, C. Hauser,
M. R. Lamb, H. Ledford, J. C. Long, J. Minagawa, M. D. Page, J. Pan,
W. Pootakham, S. Roje, A. Rose, E. Stahlberg, A. M. Terauchi, P. Yang,
S. Ball, C. Bowler, C. L. Dieckmann, V. N. Gladyshev, P. Green,
R. Jorgensen, S. Mayfield, B. Mueller-Roeber, S. Rajamani, R. T. Sayre,
P. Brokstein, I. Dubchak, D. Goodstein, L. Hornick, Y. W. Huang, J. Jhaveri,
Y. Luo, D. Martinez, W. C. Ngau, B. Otillar, A. Poliakov, A. Porter,
L. Szajkowski, G. Werner, K. Zhou, I. V. Grigoriev, D. S. Rokhsar, and A. R.
Grossman. 2007. The Chlamydomonas genome reveals the evolution of key
animal and plant functions. Science 318:245-250.
Michod, R. E., H. Bernstein, and A. M. Nedelcu. 2008. Adaptive value of sex in
microbial pathogens. Infect Genet Evol 8:267-285.
Michod, R. E., and B. R. Levin. 1988. The Evolution of Sex: An Examination of Current
Ideas. Sinauer Press, Sunderland, MA.
Miller, M., M. Holder, R. Vos, P. Midford, T. Liebowitz, L. Chan, P. Hoover, and T.
Warnow. 2009. The CIPRES Portals.
Milne, G. T., and D. T. Weaver. 1993. Dominant negative alleles of Rad52 reveal a DNA
repair/recombination complex including Rad51 and Rad52. Genes Dev 7:1755-1765.
Minge, M. A., J. D. Silberman, R. J. S. Orr, T. Cavalier-Smith, K. Shalchian-Tabrizi, F.
Burki, A. Skjaeveland, and K. S. Jakobsen. 2009. Evolutionary position of
breviate amoebae and the primary eukaryote divergence. Proceedings of the
Royal Society B-Biological Sciences 276:597-604.
Miyagawa, K., T. Tsuruga, A. Kinomura, K. Usui, M. Katsura, S. Tashiro, H. Mishima,
and K. Tanaka. 2002. A role for RAD54B in homologous recombination in human
cells. EMBO Journal 21:175-180.
Moore, D. P., and T. L. Orr-Weaver. 1998. Chromosome segregation during meiosis:
building an unambivalent bivalent. Curr Top Dev Biol 37:263-299.
Moreira, D., S. von der Heyden, D. Bass, P. Lopez-Garcia, E. Chao, and T. Cavalier-Smith.
2007. Global eukaryote phylogeny: Combined small- and large-subunit
ribosomal DNA trees support monophyly of Rhizaria, Retaria and Excavata.
Molecular Phylogenetics and Evolution 44:255-266.
Mozlin, A. M., C. W. Fung, and L. S. Symington. 2008. Role of the Saccharomyces
cerevisiae Rad51 paralogs in sister chromatid recombination. Genetics 178:113-126.
Muller, H. J. 1964. The Relation of Recombination to Mutational Advance. Mutat Res
106:2-9.
Muller, H. J. 1932. Some genetic aspects of sex. American Naturalist 66:118-138.
Muller, M. 1993. The hydrogenosome. J Gen Microbiol 139:2879-2889.
Muniyappa, K., S. Anuradha, and B. Byers. 2000. Yeast meiosis-specific protein Hop1
binds to G4 DNA and promotes its formation. Molecular and Cellular Biology
20:1361-1369.
Nedelcu, A. M., O. Marcu, and R. E. Michod. 2004. Sex as a response to oxidative stress:
a twofold increase in cellular reactive oxygen species activates sex genes. Proc
Biol Sci 271:1591-1596.
Nedelcu, A. M., and R. E. Michod. 2003. Sex as a response to oxidative stress: the effect
of antioxidants on sexual induction in a facultatively sexual lineage. Proc Biol Sci
270 Suppl 2:S136-139.
Nei, M. 1967. Modification of Linkage Intensity by Natural Selection. Genetics 57:625.
Nei, M., and S. Kumar. 2000. Molecular Evolution and Phylogenetics. Oxford University
Press, Oxford.
Nguyen, V. Q., C. Co, and J. J. Li. 2001. Cyclin-dependent kinases prevent DNA
re-replication through multiple mechanisms. Nature 411:1068-1073.
Nichols, M. D., K. DeAngelis, J. L. Keck, and J. M. Berger. 1999. Structure and function
of an archaeal topoisomerase VI subunit with homology to the meiotic
recombination factor Spo11. EMBO J 18:6177-6188.
Nicklas, R. B. 1977. Chromosome distribution: experiments on cell hybrids and in vitro.
Philos Trans R Soc Lond B Biol Sci 277:267-276.
Nimonkar, A. V., I. Amitani, R. J. Baskin, and S. C. Kowalczykowski. 2007. Single
molecule Imaging of Tid1/Rdh54, a rad54 homolog that translocates on duplex
DNA and can disrupt joint molecules. Journal of Biological Chemistry
282:30776-30784.
Nishinaka, T., A. Shinohara, Y. Ito, S. Yokoyama, and T. Shibata. 1998. Base pair
switching by interconversion of sugar puckers in DNA extended by proteins of
RecA-family: a model for homology search in homologous genetic
recombination. Proc Natl Acad Sci U S A 95:11071-11076.
Nishitani, H., Z. Lygerou, T. Nishimoto, and P. Nurse. 2000. The Cdt1 protein is required
to license DNA for replication in fission yeast. Nature 404:625-628.
Noble, S. M., and C. Guthrie. 1996. Identification of novel genes required for yeast
pre-mRNA splicing by means of cold-sensitive mutations. Genetics 143:67-80.
Octobre, G., A. Lorenz, J. Loidl, and J. Kohli. 2008. The Rad52 homologs Rad22 and Rti1
of Schizosaccharomyces pombe are not essential for meiotic interhomolog DNA
strand exchange, but are required for meiotic intrachromosomal recombination
and mating type-related DNA repair. Genetics 178:2399-2412.
Ohno, S. 1970. Evolution by gene duplication. Springer-Verlag, Berlin, New York.
Okada, H., Y. Hirota, R. Moriyama, Y. Saga, and K. Yanagisawa. 1986. Nuclear-Fusion
in Multinucleated Giant-Cells during the Sexual Development of Dictyostelium
discoideum. Developmental Biology 118:95-102.
Okorokov, A. L., Y. L. Chaban, D. V. Bugreev, J. Hodgkinson, A. V. Mazin, and E. V.
Orlova. 2010. Structure of the hDmc1-ssDNA Filament Reveals the Principles of
Its Architecture. PLoS One 5:14.
Orr-Weaver, T. L. 1995. Meiosis in Drosophila: seeing is believing. Proc Natl Acad Sci
U S A 92:10443-10449.
Otto, S. 2008. Sexual reproduction and the evolution of sex. Nature Education 1.
Otto, S. P., and A. C. Gerstein. 2006. Why have sex? The population genetics of sex and
recombination. Biochemical Society Transactions 34:519-522.
Otto, S. P., and D. B. Goldstein. 1992. Recombination and the Evolution of Diploidy.
Genetics 131:745-751.
Otto, S. P., and T. Lenormand. 2002. Resolving the paradox of sex and recombination.
Nature Reviews Genetics 3:252-261.
Palenik, B., J. Grimwood, A. Aerts, P. Rouze, A. Salamov, N. Putnam, C. Dupont, R.
Jorgensen, E. Derelle, S. Rombauts, K. Zhou, R. Otillar, S. S. Merchant, S.
Podell, T. Gaasterland, C. Napoli, K. Gendler, A. Manuell, V. Tai, O. Vallon, G.
Piganeau, S. Jancek, M. Heijde, K. Jabbari, C. Bowler, M. Lohr, S. Robbens, G.
Werner, I. Dubchak, G. J. Pazour, Q. Ren, I. Paulsen, C. Delwiche, J. Schmutz, D.
Rokhsar, Y. Van de Peer, H. Moreau, and I. V. Grigoriev. 2007. The tiny
eukaryote Ostreococcus provides genomic insights into the paradox of plankton
speciation. Proc Natl Acad Sci U S A 104:7705-7710.
Pannunzio, N. R., G. M. Manthey, and A. M. Bailis. 2008. RAD59 is required for
efficient repair of simultaneous double-strand breaks resulting in translocations in
Saccharomyces cerevisiae. DNA Repair 7:788-800.
Paques, F., and J. E. Haber. 1999. Multiple pathways of recombination induced by
double-strand breaks in Saccharomyces cerevisiae. Microbiology and Molecular
Biology Reviews 63:349-+.
Parfrey, L. W., E. Barbero, E. Lasser, M. Dunthorn, D. Bhattacharya, D. Patterson, and
L. Katz. 2006. Evaluating Support for the Current Classification of Eukaryotic
Diversity. PLoS Genetics 2:2062-2073.
Parfrey, L. W., J. Grant, Y. I. Tekle, E. Lasek-Nesselquist, H. G. Morrison, M. L. Sogin,
D. J. Patterson, and L. A. Katz. 2010. Broadly sampled multigene analyses yield a
well-resolved eukaryotic tree of life. Syst Biol.
Parisi, S., M. J. McKay, M. Molnar, M. A. Thompson, P. J. van der Spek, E. van
Drunen-Schoenmaker, R. Kanaar, E. Lehmann, J. H. Hoeijmakers, and J. Kohli. 1999.
Rec8p, a meiotic recombination and sister chromatid cohesion phosphoprotein of
the Rad21p family conserved from fission yeast to humans. Mol Cell Biol
19:3515-3528.
Patron, N. J., Y. Inagaki, and P. J. Keeling. 2007. Multiple gene phylogenies support the
monophyly of cryptomonad and haptophyte host lineages. Curr Biol 17:887-891.
Patterson, D. J. 1999. The diversity of eukaryotes. American Naturalist 154:S96-S124.
Pellegrini, L., D. S. Yu, T. Lo, S. Anand, M. Lee, T. L. Blundell, and A. R.
Venkitaraman. 2002. Insights into DNA recombination from the structure of a
RAD51-BRCA2 complex. Nature 420:287-293.
Perrot, V., S. Richerd, and M. Valero. 1991. Transition from haploidy to diploidy. Nature
351:315-317.
Petersen, G., and O. Seberg. 2002. Molecular evolution and phylogenetic application of
DMC1. Molecular Phylogenetics and Evolution 22:43-50.
Petersen, G., O. Seberg, and C. Baden. 2004. A phylogenetic analysis of the genus
Psathyrostachys (Poaceae) based on one nuclear gene, three plastid genes, and
morphology. Plant Systematics and Evolution 249:99-110.
Petes, T. D., R. E. Malone, and L. S. Symington. 1991. Recombination in Yeast. Pp. 407-521
in J. R. Broach, E. W. Jones, and J. R. Pringle, eds. The Molecular and
Cellular Biology of the Yeast Saccharomyces: Genome Dynamics, Protein
Synthesis, and Energetics. Cold Spring Harbor Laboratory, Cold Spring Harbor,
NY.
Petukhova, G., S. Stratton, and P. Sung. 1998. Catalysis of homologous DNA pairing by
yeast Rad51 and Rad54 proteins. Nature 393:91-94.
Petukhova, G., S. Van Komen, S. Vergano, H. Klein, and P. Sung. 1999. Yeast Rad54
promotes Rad51-dependent homologous DNA pairing via ATP hydrolysis-driven
change in DNA double helix conformation. J Biol Chem 274:29453-29462.
Pevsner, J. 2009. Bioinformatics and Functional Genomics. Wiley-Blackwell, Hoboken,
NJ.
Piatti, S., T. Bohm, J. H. Cocker, J. F. Diffley, and K. Nasmyth. 1996. Activation of
S-phase-promoting CDKs in late G1 defines a "point of no return" after which Cdc6
synthesis cannot promote DNA replication in yeast. Genes Dev 10:1516-1531.
Pool, R. 1990. The Third Kingdom of Life. Science 247:159.
Poole, A. M., and D. Penny. 2007. Evaluating hypotheses for the origin of eukaryotes.
Bioessays 29:74-84.
Poxleitner, M. K., M. L. Carpenter, J. J. Mancuso, C. J. R. Wang, S. C. Dawson, and W.
Z. Cande. 2008. Evidence for karyogamy and exchange of genetic material in the
binucleate intestinal parasite Giardia intestinalis. Science 319:1530-1533.
Proudfoot, C., and R. McCulloch. 2006. Trypanosoma brucei DMC1 does not act in
DNA recombination, repair or antigenic variation in bloodstream stage cells. Mol
Biochem Parasitol 145:245-253.
Putnam, N. H., M. Srivastava, U. Hellsten, B. Dirks, J. Chapman, A. Salamov, A. Terry,
H. Shapiro, E. Lindquist, V. V. Kapitonov, J. Jurka, G. Genikhovich, I. V.
Grigoriev, S. M. Lucas, R. E. Steele, J. R. Finnerty, U. Technau, M. Q.
Martindale, and D. S. Rokhsar. 2007. Sea anemone genome reveals ancestral
eumetazoan gene repertoire and genomic organization. Science 317:86-94.
Ramesh, M. A., S. B. Malik, and J. M. Logsdon. 2005. A phylogenomic inventory of
meiotic genes: Evidence for sex in Giardia and an early eukaryotic origin of
meiosis. Current Biology 15:185-191.
Reeb, V. C., M. T. Peglar, H. S. Yoon, J. R. Bai, M. Wu, P. Shiu, J. L. Grafenberg, A.
Reyes-Prieto, S. E. Rummele, J. Gross, and D. Bhattacharya. 2009.
Interrelationships of chromalveolates within a broadly sampled tree of
photosynthetic protists. Molecular Phylogenetics and Evolution 53:202-211.
Rensing, S. A., D. Lang, A. D. Zimmer, A. Terry, A. Salamov, H. Shapiro, T. Nishiyama,
P. F. Perroud, E. A. Lindquist, Y. Kamisugi, T. Tanahashi, K. Sakakibara, T.
Fujita, K. Oishi, I. T. Shin, Y. Kuroki, A. Toyoda, Y. Suzuki, S. Hashimoto, K.
Yamaguchi, S. Sugano, Y. Kohara, A. Fujiyama, A. Anterola, S. Aoki, N. Ashton,
W. B. Barbazuk, E. Barker, J. L. Bennetzen, R. Blankenship, S. H. Cho, S. K.
Dutcher, M. Estelle, J. A. Fawcett, H. Gundlach, K. Hanada, A. Heyl, K. A.
Hicks, J. Hughes, M. Lohr, K. Mayer, A. Melkozernov, T. Murata, D. R. Nelson,
B. Pils, M. Prigge, B. Reiss, T. Renner, S. Rombauts, P. J. Rushton, A.
Sanderfoot, G. Schween, S. H. Shiu, K. Stueber, F. L. Theodoulou, H. Tu, Y. Van
de Peer, P. J. Verrier, E. Waters, A. Wood, L. Yang, D. Cove, A. C. Cuming, M.
Hasebe, S. Lucas, B. D. Mishler, R. Reski, I. V. Grigoriev, R. S. Quatrano, and J.
L. Boore. 2008. The Physcomitrella genome reveals evolutionary insights into the
conquest of land by plants. Science 319:64-69.
Rice, W. R. 2002. Experimental tests of the adaptive significance of sexual
recombination. Nature Reviews Genetics 3:241-251.
Rice, W. R., and A. K. Chippindale. 2001. Sexual recombination and the power of natural
selection. Science 294:555-559.
Richards, A. J. 1986. Plant breeding systems. G. Allen & Unwin, London ; Boston.
Ridley, M. 2004. Evolution. Blackwell Science Ltd., Malden, MA.
Rodriguez-Ezpeleta, N., H. Brinkmann, S. C. Burey, B. Roure, G. Burger, W.
Loffelhardt, H. J. Bohnert, H. Philippe, and B. F. Lang. 2005. Monophyly of
primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes.
Curr Biol 15:1325-1330.
Roger, A. J. 1999. Reconstructing early events in eukaryotic evolution. American
Naturalist 154:S146-S163.
Roger, A. J., and L. A. Hug. 2006. The origin and diversification of eukaryotes: problems
with molecular phylogenetics and molecular clock estimation. Pp. 1039-1054.
Royal Society.
Roger, A. J., and A. G. B. Simpson. 2009. Evolution: Revisiting the Root of the
Eukaryote Tree. Current Biology 19:R165-R167.
Rokas, A., B. L. Williams, N. King, and S. B. Carroll. 2003. Genome-scale approaches to
resolving incongruence in molecular phylogenies. Nature 425:798-804.
Ruckert, J. 1892. Zur Entwicklungsgeschichte des Ovarioleies bei Selachien.
Anatomischer Anzeiger 7:107-158.
Saeki, T., I. Machida, and S. Nakai. 1980. Genetic control of diploid recovery after
gamma-irradiation in the yeast Saccharomyces cerevisiae. Mutat Res 73:251-265.
Sagan, L. 1967. On the origin of mitosing cells. J Theor Biol 14:255-274.
Sager, R., and S. Granick. 1954. Nutritional Control of Sexuality in Chlamydomonas
reinhardtii. Journal of General Physiology 37:729-742.
Sakaguchi, K., T. Ishibashi, Y. Uchiyama, and K. Iwabata. 2009. The multi-Replication
Protein A (RPA) system - a new perspective. FEBS Journal 276:943-963.
Sakane, I., C. Kamataki, Y. Takizawa, M. Nakashima, S. Toki, H. Ichikawa, S. Ikawa, T.
Shibata, and H. Kurumizaka. 2008. Filament formation and robust strand
exchange activities of the rice DMC1A and DMC1B proteins. Nucleic Acids
Research 36:4266-4276.
Sandler, S. J., L. H. Satin, H. S. Samra, and A. J. Clark. 1996. recA-like genes from three
archaean species with putative protein products similar to Rad51 and Dmc1
proteins of the yeast Saccharomyces cerevisiae. Nucleic Acids Res 24:2125-2132.
Sarai, N., W. Kagawa, N. Fujikawa, K. Saito, J. Hikiba, K. Tanaka, K. Miyagawa, H.
Kurumizaka, and S. Yokoyama. 2008. Biochemical analysis of the N-terminal
domain of human RAD54B. Nucleic Acids Research 36:5441-5450.
Sauvageau, S., A. Z. Stasiak, I. Banville, M. Ploquin, A. Stasiak, and J. Y. Masson. 2005.
Fission yeast Rad51 and Dmc1, two efficient DNA recombinases forming helical
nucleoprotein filaments. Mol Cell Biol 25:4377-4387.
Schild, D. 1995. Suppression of a new allele of the yeast RAD52 gene by overexpression
of RAD51, mutations in srs2 and ccr4, or mating-type heterozygosity. Genetics
140:115-127.
Schild, D., and C. Wiese. 2009. Overexpression of RAD51 suppresses recombination
defects: a possible mechanism to reverse genomic instability. Nucleic Acids Res.
Schlegel, M. 1994. Molecular Phylogeny of Eukaryotes. Trends in Ecology & Evolution
9:330-335.
Schrader, F., and S. Hughes-Schrader. 1931. Haploidy in Metazoa. Quarterly Review of
Biology 6:411-438.
Schurko, A. M., and J. M. Logsdon. 2008. Using a meiosis detection toolkit to investigate
ancient asexual "scandals" and the evolution of sex. Bioessays 30:579-589.
Schurko, A. M., M. Neiman, and J. M. Logsdon, Jr. 2009. Signs of sex: what we know
and how we know it. Trends Ecol Evol 24:208-217.
Scudo, F. M. 1967. Selection on Both Haplo and Diplophase. Genetics 56:693-&.
Searfoss, A., T. E. Dever, and R. Wickner. 2001. Linking the 3' poly(A) tail to the
subunit joining step of translation initiation: Relations of Pab1p, eukaryotic
translation initiation factor 5B (Fun12p), and Ski2p-Slh1p. Molecular and
Cellular Biology 21:4900-4908.
Sehorn, M. G., S. Sigurdsson, W. Bussen, V. M. Unger, and P. Sung. 2004. Human
meiotic recombinase Dmc1 promotes ATP-dependent homologous DNA strand
exchange. Nature 429:433-437.
Seong, C., S. Colavito, Y. Kwon, P. Sung, and L. Krejci. 2009. Regulation of Rad51
Recombinase Presynaptic Filament Assembly via Interactions with the Rad52
Mediator and the Srs2 Anti-recombinase. Journal of Biological Chemistry
284:24363-24371.
Shadwick, L. L., F. W. Spiegel, J. D. Shadwick, M. W. Brown, and J. D. Silberman.
2009. Eumycetozoa = Amoebozoa?: SSUrDNA phylogeny of protosteloid slime
molds and its significance for the amoebozoan supergroup. PLoS One 4:e6754.
Sherman, F., and H. Roman. 1963. Evidence for 2 Types of Allelic Recombination in
Yeast. Genetics 48:255-&.
Shin, D. S., L. Pellegrini, D. S. Daniels, B. Yelent, L. Craig, D. Bates, D. S. Yu, M. K.
Shivji, C. Hitomi, A. S. Arvai, N. Volkmann, H. Tsuruta, T. L. Blundell, A. R.
Venkitaraman, and J. A. Tainer. 2003. Full-length archaeal Rad51 structure and
mutants: mechanisms for RAD51 assembly and control by BRCA2. EMBO Journal
22:4566-4576.
Shinohara, A., H. Ogawa, and T. Ogawa. 1992. Rad51 protein involved in repair and
recombination in S. cerevisiae is a RecA-like protein. Cell 69:457-470.
Shinohara, A., M. Shinohara, T. Ohta, S. Matsuda, and T. Ogawa. 1998. Rad52 forms
ring structures and co-operates with RPA in single-strand DNA annealing. Genes
Cells 3:145-156.
Shinohara, M., S. L. Gasior, D. K. Bishop, and A. Shinohara. 2000. Tid1/Rdh54
promotes colocalization of Rad51 and Dmc1 during meiotic recombination. Proc
Natl Acad Sci U S A 97:10814-10819.
Shinozawa, T., T. Horiike, and K. Hamada. 2001. Does endosymbiosis explain the origin
of the nucleus? Reply. Nature Cell Biology 3:E173-E174.
Simchen, G., and Y. Hugerat. 1993. What Determines Whether Chromosomes Segregate
Reductionally or Equationally in Meiosis. Bioessays 15:1-8.
Simpson, A. G. 2003. Cytoskeletal organization, phylogenetic affinities and systematics
in the contentious taxon Excavata (Eukaryota). Int J Syst Evol Microbiol
53:1759-1777.
Simpson, A. G., Y. Inagaki, and A. J. Roger. 2006. Comprehensive multigene
phylogenies of excavate protists reveal the evolutionary positions of "primitive"
eukaryotes. Mol Biol Evol 23:615-625.
Simpson, A. G. B., and D. J. Patterson. 1999. The ultrastructure of Carpediemonas
membranifera (Eukaryota) with reference to the "Excavate hypothesis". European
Journal of Protistology 35:353-370.
Simpson, A. G. B., and A. J. Roger. 2004. The real 'kingdoms' of eukaryotes. Current
Biology 14:R693-R696.
Smith, T. F., and M. S. Waterman. 1981. Identification of common molecular
subsequences. J Mol Biol 147:195-197.
Snowden, T., S. Acharya, C. Butz, M. Berardini, and R. Fishel. 2004. hMSH4-hMSH5
recognizes Holliday Junctions and forms a meiosis-specific sliding clamp that
embraces homologous chromosomes. Mol Cell 15:437-451.
Sogin, M., H. Elwood, and J. Gunderson. 1986. Evolutionary diversity of eukaryotic
small-subunit rRNA genes. Proceedings of the National Academy of Sciences of
the United States of America 83:1383-1387.
Sogin, M. L. 1991. Early evolution and the origin of eukaryotes. Curr Opin Genet Dev
1:457-463.
Solinger, J. A., K. Kiianitsa, and W. D. Heyer. 2002. Rad54, a Swi2/Snf2-like
recombinational repair protein, disassembles Rad51:dsDNA filaments. Mol Cell
10:1175-1188.
Sonnhammer, E. L., S. R. Eddy, E. Birney, A. Bateman, and R. Durbin. 1998. Pfam:
multiple sequence alignments and HMM-profiles of protein domains. Nucleic
Acids Res 26:320-322.
Soustelle, C., M. Vedel, R. Kolodner, and A. Nicolas. 2002. Replication Protein A is
required for meiotic recombination in Saccharomyces cerevisiae. Genetics
161:535-547.
Srivastava, M., E. Begovic, J. Chapman, N. H. Putnam, U. Hellsten, T. Kawashima, A.
Kuo, T. Mitros, A. Salamov, M. L. Carpenter, A. Y. Signorovitch, M. A. Moreno,
K. Kamm, J. Grimwood, J. Schmutz, H. Shapiro, I. V. Grigoriev, L. W. Buss, B.
Schierwater, S. L. Dellaporta, and D. S. Rokhsar. 2008. The Trichoplax genome
and the nature of placozoans. Nature 454:955-960.
Stack, S. M., and W. V. Brown. 1969. Somatic pairing, reduction and recombination: an
evolutionary hypothesis of meiosis. Nature 222:1275-1276.
Stamatakis, A., P. Hoover, and J. Rougemont. 2008. A Rapid Bootstrap Algorithm for the
RAxML Web Servers. Systematic Biology 57:758-771.
Stamatakis, A., T. Ludwig, and H. Meier. 2005. RAxML-III: a fast program for
maximum likelihood-based inference of large phylogenetic trees. Bioinformatics
21:456-463.
Stassen, N. Y., J. M. Logsdon, G. J. Vora, H. H. Offenberg, J. D. Palmer, and M. E.
Zolan. 1997. Isolation and characterization of rad51 orthologs from Coprinus
cinereus and Lycopersicon esculentum, and phylogenetic analysis of eukaryotic
recA homologs. Current Genetics 31:144-157.
Stechmann, A., and T. Cavalier-Smith. 2002. Rooting the eukaryote tree by using a
derived gene fusion. Science 297:89-91.
Stechmann, A., and T. Cavalier-Smith. 2003a. The root of the eukaryote tree pinpointed.
Curr Biol 13:R665-666.
Stechmann, A., and T. Cavalier-Smith. 2003b. Phylogenetic analysis of eukaryotes using
heat-shock protein Hsp90. J Mol Evol 57:408-419.
Steenkamp, E. T., J. Wright, and S. L. Baldauf. 2006. The protistan origins of animals
and fungi. Molecular Biology and Evolution 23:93-106.
Stiller, J. W., and L. Harrell. 2005. The largest subunit of RNA Polymerase II from the
Glaucocystophyta: functional constraint and short-branch exclusion in deep
eukaryotic phylogeny. BMC Evol Biol 5:71.
Story, R. M., I. T. Weber, and T. A. Steitz. 1992. The structure of the E. coli recA protein
monomer and polymer. Nature 355:318-325.
Sugawara, H., K. Iwabata, A. Koshiyama, T. Yanai, Y. Daikuhara, S. H. Namekawa, F.
N. Hamada, and K. Sakaguchi. 2009. Coprinus cinereus Mer3 is required for
synaptonemal complex formation during meiosis. Chromosoma 118:127-139.
Sugawara, N., X. Wang, and J. E. Haber. 2003. In vivo roles of Rad52, Rad54, and
Rad55 proteins in Rad51-mediated recombination. Molecular Cell 12:209-219.
Sugimoto-Shirasu, K., N. J. Stacey, J. Corsar, K. Roberts, and M. C. McCann. 2002.
DNA topoisomerase VI is essential for endoreduplication in Arabidopsis. Curr
Biol 12:1782-1786.
Sung, P. 1997. Function of yeast Rad52 protein as a mediator between Replication
Protein A and the Rad51 recombinase. J Biol Chem 272:28194-28197.
Symington, L. S. 2002. Role of RAD52 epistasis group genes in homologous
recombination and double-strand break repair. Microbiology and Molecular
Biology Reviews 66:630-+.
Syvanen, M. 1985. Cross-Species Gene-Transfer - Implications for a New Theory of
Evolution. Journal of Theoretical Biology 112:333-343.
Szathmary, E., I. Scheuring, M. Kotsis, and I. Gladkih. 1990. Sexuality of eukaryotic
unicells: hyperbolic growth, coexistence of facultative parthenogens, and the
repair hypothesis. Pp. 279-287 in J. Maynard Smith, and G. Vida, eds.
Organizational Constraints on the Dynamics of Evolution. Manchester University
Press, Manchester.
Szathmary, E., and J. M. Smith. 1995. The Major Evolutionary Transitions. Nature
374:227-232.
Szekvolgyi, L., and A. Nicolas. 2010. From meiosis to postmeiotic events: Homologous
recombination is obligatory but flexible. FEBS Journal 277:571-589.
Tada, S., A. Li, D. Maiorano, M. Mechali, and J. J. Blow. 2001. Repression of origin
assembly in metaphase depends on inhibition of RLF-B/Cdt1 by geminin. Nat
Cell Biol 3:107-113.
Tatusov, R. L., N. D. Fedorova, J. D. Jackson, A. R. Jacobs, B. Kiryutin, E. V. Koonin,
D. M. Krylov, R. Mazumder, S. L. Mekhedov, A. N. Nikolskaya, B. S. Rao, S.
Smirnov, A. V. Sverdlov, S. Vasudevan, Y. I. Wolf, J. J. Yin, and D. A. Natale.
2003. The COG database: an updated version includes eukaryotes. BMC
Bioinformatics 4:41.
Tatusov, R. L., M. Y. Galperin, D. A. Natale, and E. V. Koonin. 2000. The COG
database: a tool for genome-scale analysis of protein functions and evolution.
Nucleic Acids Research 28:33-36.
Thomer, M., N. R. May, B. D. Aggarwal, G. Kwok, and B. R. Calvi. 2004. Drosophila
double-parked is sufficient to induce re-replication during development and is
regulated by cyclin E/CDK2. Development 131:4807-4818.
Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997.
The CLUSTAL_X windows interface: flexible strategies for multiple sequence
alignment aided by quality analysis tools. Nucleic Acids Research 25:4876-4882.
Timmermans, M. J., D. Roelofs, J. Marien, and N. M. van Straalen. 2008. Revealing
pancrustacean relationships: phylogenetic analysis of ribosomal protein genes
places Collembola (springtails) in a monophyletic Hexapoda and reinforces the
discrepancy between mitochondrial and nuclear DNA markers. BMC Evol Biol
8:83.
Toth, A., K. P. Rabitsch, M. Galova, A. Schleiffer, S. B. Buonomo, and K. Nasmyth.
2000. Functional genomics identifies monopolin: a kinetochore protein required
for segregation of homologs during meiosis I. Cell 103:1155-1168.
Tovar, J., A. Fischer, and C. G. Clark. 1999. The mitosome, a novel organelle related to
mitochondria in the amitochondrial parasite Entamoeba histolytica. Mol
Microbiol 32:1013-1021.
Tovar, J., G. Leon-Avila, L. B. Sanchez, R. Sutak, J. Tachezy, M. van der Giezen, M.
Hernandez, M. Muller, and J. M. Lucocq. 2003. Mitochondrial remnant
organelles of Giardia function in iron-sulphur protein maturation. Nature
426:172-176.
Tsubouchi, H., and G. S. Roeder. 2003. The importance of genetic recombination for
fidelity of chromosome pairing in meiosis. Dev Cell 5:915-925.
Tsubouchi, H., and G. S. Roeder. 2002. The Mnd1 protein forms a complex with Hop2 to
promote homologous chromosome pairing and meiotic double-strand break repair.
Molecular and Cellular Biology 22:3078-3088.
Tsuzuki, T., Y. Fujii, K. Sakumi, Y. Tominaga, K. Nakao, M. Sekiguchi, A. Matsushiro,
Y. Yoshimura, and T. Morita. 1996. Targeted disruption of the Rad51 gene leads to
lethality in embryonic mice. Proc Natl Acad Sci U S A 93:6236-6240.
Tyler, B. M., S. Tripathy, X. Zhang, P. Dehal, R. H. Jiang, A. Aerts, F. D. Arredondo, L.
Baxter, D. Bensasson, J. L. Beynon, J. Chapman, C. M. Damasceno, A. E.
Dorrance, D. Dou, A. W. Dickerman, I. L. Dubchak, M. Garbelotto, M. Gijzen, S.
G. Gordon, F. Govers, N. J. Grunwald, W. Huang, K. L. Ivors, R. W. Jones, S.
Kamoun, K. Krampis, K. H. Lamour, M. K. Lee, W. H. McDonald, M. Medina,
H. J. Meijer, E. K. Nordberg, D. J. Maclean, M. D. Ospina-Giraldo, P. F. Morris,
V. Phuntumart, N. H. Putnam, S. Rash, J. K. Rose, Y. Sakihama, A. A. Salamov,
A. Savidor, C. F. Scheuring, B. M. Smith, B. W. Sobral, A. Terry, T. A.
Torto-Alalibo, J. Win, Z. Xu, H. Zhang, I. V. Grigoriev, D. S. Rokhsar, and J. L. Boore.
2006. Phytophthora genome sequences uncover evolutionary origins and
mechanisms of pathogenesis. Science 313:1261-1266.
Umezu, K., N. Sugawara, C. Chen, J. E. Haber, and R. D. Kolodner. 1998. Genetic
analysis of yeast RPA1 reveals its multiple functions in DNA metabolism.
Genetics 148:989-1005.
van der Giezen, M. 2009. Hydrogenosomes and mitosomes: Conservation and evolution
of functions. Journal of Eukaryotic Microbiology 56:221-231.
van der Giezen, M., J. Tovar, and C. G. Clark. 2005. Mitochondrion-derived organelles in
protists and fungi. Int Rev Cytol 244:175-225.
van Keulen, H., S. R. Campbell, S. L. Erlandsen, and E. L. Jarroll. 1991a. Cloning and
restriction enzyme mapping of ribosomal DNA of Giardia duodenalis, Giardia
ardeae and Giardia muris. Mol Biochem Parasitol 46:275-284.
van Keulen, H., S. Horvat, S. L. Erlandsen, and E. L. Jarroll. 1991b. Nucleotide sequence
of the 5.8S and large subunit rRNA genes and the internal transcribed spacer and
part of the external spacer from Giardia ardeae. Nucleic Acids Res 19:6050.
Van Valen, L. 1973. A new evolutionary law. Evol. Theory 1:1-30.
Vaziri, C., S. Saxena, Y. Jeon, C. Lee, K. Murata, Y. Machida, N. Wagle, D. S. Hwang,
and A. Dutta. 2003. A p53-dependent checkpoint pathway prevents rereplication.
Mol Cell 11:997-1008.
Venter, J. C., M. D. Adams, E. W. Myers, P. W. Li, R. J. Mural, G. G. Sutton, H. O.
Smith, M. Yandell, C. A. Evans, R. A. Holt, J. D. Gocayne, P. Amanatides, R. M.
Ballew, D. H. Huson, J. R. Wortman, Q. Zhang, C. D. Kodira, X. H. Zheng, L.
Chen, M. Skupski, G. Subramanian, P. D. Thomas, J. Zhang, G. L. Gabor Miklos,
C. Nelson, S. Broder, A. G. Clark, J. Nadeau, V. A. McKusick, N. Zinder, A. J.
Levine, R. J. Roberts, M. Simon, C. Slayman, M. Hunkapiller, R. Bolanos, A.
Delcher, I. Dew, D. Fasulo, M. Flanigan, L. Florea, A. Halpern, S. Hannenhalli,
S. Kravitz, S. Levy, C. Mobarry, K. Reinert, K. Remington, J. Abu-Threideh, E.
Beasley, K. Biddick, V. Bonazzi, R. Brandon, M. Cargill, I. Chandramouliswaran,
R. Charlab, K. Chaturvedi, Z. Deng, V. Di Francesco, P. Dunn, K. Eilbeck, C.
Evangelista, A. E. Gabrielian, W. Gan, W. Ge, F. Gong, Z. Gu, P. Guan, T. J.
Heiman, M. E. Higgins, R. R. Ji, Z. Ke, K. A. Ketchum, Z. Lai, Y. Lei, Z. Li, J.
Li, Y. Liang, X. Lin, F. Lu, G. V. Merkulov, N. Milshina, H. M. Moore, A. K.
Naik, V. A. Narayan, B. Neelam, D. Nusskern, D. B. Rusch, S. Salzberg, W.
Shao, B. Shue, J. Sun, Z. Wang, A. Wang, X. Wang, J. Wang, M. Wei, R. Wides,
C. Xiao, C. Yan, A. Yao, J. Ye, M. Zhan, W. Zhang, H. Zhang, Q. Zhao, L.
Zheng, F. Zhong, W. Zhong, S. Zhu, S. Zhao, D. Gilbert, S. Baumhueter, G.
Spier, C. Carter, A. Cravchik, T. Woodage, F. Ali, H. An, A. Awe, D. Baldwin,
H. Baden, M. Barnstead, I. Barrow, K. Beeson, D. Busam, A. Carver, A. Center,
M. L. Cheng, L. Curry, S. Danaher, L. Davenport, R. Desilets, S. Dietz, K.
Dodson, L. Doup, S. Ferriera, N. Garg, A. Gluecksmann, B. Hart, J. Haynes, C.
Haynes, C. Heiner, S. Hladun, D. Hostin, J. Houck, T. Howland, C. Ibegwam, J.
Johnson, F. Kalush, L. Kline, S. Koduru, A. Love, F. Mann, D. May, S.
McCawley, T. McIntosh, I. McMullen, M. Moy, L. Moy, B. Murphy, K. Nelson,
C. Pfannkoch, E. Pratts, V. Puri, H. Qureshi, M. Reardon, R. Rodriguez, Y. H.
Rogers, D. Romblad, B. Ruhfel, R. Scott, C. Sitter, M. Smallwood, E. Stewart, R.
Strong, E. Suh, R. Thomas, N. N. Tint, S. Tse, C. Vech, G. Wang, J. Wetter, S.
Williams, M. Williams, S. Windsor, E. Winn-Deen, K. Wolfe, J. Zaveri, K.
Zaveri, J. F. Abril, R. Guigo, M. J. Campbell, K. V. Sjolander, B. Karlak, A.
Kejariwal, H. Mi, B. Lazareva, T. Hatton, A. Narechania, K. Diemer, A.
Muruganujan, N. Guo, S. Sato, V. Bafna, S. Istrail, R. Lippert, R. Schwartz, B.
Walenz, S. Yooseph, D. Allen, A. Basu, J. Baxendale, L. Blick, M. Caminha, J.
Carnes-Stine, P. Caulk, Y. H. Chiang, M. Coyne, C. Dahlke, A. Mays, M.
Dombroski, M. Donnelly, D. Ely, S. Esparham, C. Fosler, H. Gire, S. Glanowski,
K. Glasser, A. Glodek, M. Gorokhov, K. Graham, B. Gropman, M. Harris, J. Heil,
S. Henderson, J. Hoover, D. Jennings, C. Jordan, J. Jordan, J. Kasha, L. Kagan, C.
Kraft, A. Levitsky, M. Lewis, X. Liu, J. Lopez, D. Ma, W. Majoros, J. McDaniel,
S. Murphy, M. Newman, T. Nguyen, N. Nguyen, M. Nodell, S. Pan, J. Peck, M.
Peterson, W. Rowe, R. Sanders, J. Scott, M. Simpson, T. Smith, A. Sprague, T.
Stockwell, R. Turner, E. Venter, M. Wang, M. Wen, D. Wu, M. Wu, A. Xia, A.
Zandieh, and X. Zhu. 2001. The sequence of the human genome. Science
291:1304-1351.
Villeneuve, A. M., and K. J. Hillers. 2001. Whence meiosis? Cell 106:647-650.
Wang, H., Z. Xu, L. Gao, and B. Hao. 2009. A fungal phylogeny based on 82 complete
genomes using the composition vector method. BMC Evol Biol 9:195.
Watanabe, Y., and P. Nurse. 1999. Cohesin Rec8 is required for reductional chromosome
segregation at meiosis. Nature 400:461-464.
Watson, J. D., and F. H. C. Crick. 1953. Genetical Implications of the Structure of
Deoxyribonucleic Acid. Nature 171:964-967.
Weber, A. P. M., C. Oesterhelt, W. Gross, A. Brautigam, L. A. Imboden, I.
Krassovskaya, N. Linka, J. Truchina, J. Schneidereit, H. Voll, L. M. Voll, M.
Zimmermann, A. Jamai, W. R. Riekhof, B. Yu, R. M. Garavito, and C. Benning.
2004. EST-analysis of the thermo-acidophilic red microalga Galdieria
sulphuraria reveals potential for lipid A biosynthesis and unveils the pathway of
carbon export from rhodoplasts. Plant Molecular Biology 55:17-32.
Weiner, B. M., and N. Kleckner. 1994. Chromosome pairing via multiple interstitial
interactions before and during meiosis in yeast. Cell 77:977-991.
Weismann, A., W. N. Parker, and H. Ronnfeldt. 1893. The germ-plasm: a theory of
heredity. C. Scribner's Sons, New York.
West, S. A., C. M. Lively, and A. F. Read. 1999. A pluralist approach to sex and
recombination. Journal of Evolutionary Biology 12:1003-1012.
West, S. C. 1992. Enzymes and molecular mechanisms of genetic recombination. Annu
Rev Biochem 61:603-640.
White, M. J. D. 1978. Modes of Speciation. Freeman, San Francisco.
Wickstead, B., K. Gull, and T. A. Richards. 2010. Patterns of kinesin evolution reveal a
complex ancestral eukaryote with a multifunctional cytoskeleton. BMC
Evolutionary Biology 10:-.
Wilkins, A. S., and R. Holliday. 2009. The evolution of meiosis from mitosis. Genetics
181:3-12.
Williamson, D. H., L. H. Johnston, D. J. Fennell, and G. Simchen. 1983. The Timing of
the S-Phase and Other Nuclear Events in Yeast Meiosis. Experimental Cell
Research 145:209-217.
Woese, C. R., and G. E. Fox. 1977. Phylogenetic Structure of Prokaryotic Domain:
Primary Kingdoms. Proceedings of the National Academy of Sciences of the
United States of America 74:5088-5090.
Woese, C. R., O. Kandler, and M. L. Wheelis. 1990. Towards a natural system of
organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proceedings
of the National Academy of Sciences of the United States of America
87:4576-4579.
Wohlschlegel, J. A., B. T. Dwyer, S. K. Dhar, C. Cvetic, J. C. Walter, and A. Dutta.
2000. Inhibition of eukaryotic DNA replication by geminin binding to Cdt1.
Science 290:2309-2312.
Wold, M. S. 1997. Replication Protein A: A heterotrimeric, single-stranded DNA-binding
protein required for eukaryotic DNA metabolism. Annual Review of
Biochemistry 66:61-92.
Xu, T., and G. M. Rubin. 1993. Analysis of genetic mosaics in developing and adult
Drosophila tissues. Development 117:1223-1237.
Yin, Y., H. Cheong, D. Friedrichsen, Y. Zhao, J. Hu, S. Mora-Garcia, and J. Chory. 2002.
A crucial role for the putative Arabidopsis topoisomerase VI in plant growth and
development. Proc Natl Acad Sci U S A 99:10191-10196.
Yokobayashi, S., M. Yamamoto, and Y. Watanabe. 2003. Cohesins determine the
attachment manner of kinetochores to spindle microtubules at meiosis I in fission
yeast. Mol Cell Biol 23:3965-3973.
Yoon, H. S., J. Grant, Y. I. Tekle, M. Wu, B. C. Chaon, J. C. Cole, J. M. Logsdon, D. J.
Patterson, D. Bhattacharya, and L. A. Katz. 2008. Broadly sampled multigene
trees of eukaryotes. BMC Evolutionary Biology 8:-.
Yoon, H. S., J. D. Hackett, C. Ciniglia, G. Pinto, and D. Bhattacharya. 2004. A molecular
timeline for the origin of photosynthetic eukaryotes. Molecular Biology and
Evolution 21:809-818.
Yoon, H. S., J. D. Hackett, G. Pinto, and D. Bhattacharya. 2002. The single, ancient
origin of chromist plastids. Proc Natl Acad Sci U S A 99:15507-15512.
Zalevsky, J., A. J. MacQueen, J. B. Duffy, K. J. Kemphues, and A. M. Villeneuve. 1999.
Crossing over during Caenorhabditis elegans meiosis requires a conserved
MutS-based pathway that is partially dispensable in budding yeast. Genetics
153:1271-1283.
Zhou, X. F., Z. G. Lin, and H. Ma. 2010. Phylogenetic detection of numerous gene
duplications shared by animals, fungi and plants. Genome Biology 11:-.
Zou, Y., Y. Y. Liu, X. M. Wu, and S. M. Shell. 2006. Functions of human Replication
Protein A (RPA): From DNA replication to DNA damage and stress responses.
Journal of Cellular Physiology 208:267-273.