Analysis of various Ascomycota
Recent genome sequencing efforts cover various species of the phylum Ascomycota. Besides the well-known budding yeast Saccharomyces cerevisiae (Hemiascomycetes), the fission yeast Schizosaccharomyces pombe (Archiascomycetes) has been completely sequenced (Wood et al., 2002).
The human pathogen Candida albicans has been sequenced by the Stanford Genome Technology Center. The Candida albicans SC5314 genome sequence is based on the CandidaDB (Pasteur) version after the Assembly 6 release (Stanford).
Neurospora crassa and Fusarium graminearum were sequenced by the Whitehead Institute; manually supervised and evaluated annotation is being performed by MIPS. The Whitehead Institute proteome sets of the filamentous fungi Magnaporthe grisea and Aspergillus nidulans were also analysed. Please note the Whitehead Institute Disclaimer. Partial sequencing of other selected hemiascomycetous species was done in the Génolevures project (see below). The results of these projects were analysed using PEDANT, the Protein Extraction, Description and ANalysis Tool. You may browse the complete analysis for each species or search across all databases (see below).
The "Génolevures" project
The data come from the "Génolevures" project, a large-scale comparative genomics analysis of Saccharomyces cerevisiae and 13 other yeast species representative of the various branches of the hemiascomycetous class:
- Saccharomyces sensu stricto
  - Saccharomyces bayanus var uvarum
- Saccharomyces sensu lato
  - Saccharomyces exiguus
  - Saccharomyces servazzii
  - Zygosaccharomyces rouxii
  - Saccharomyces kluyveri
- Kluyveromyces genus
  - Kluyveromyces thermotolerans
  - Kluyveromyces lactis
  - Kluyveromyces marxianus var marxianus
- distant species
  - Pichia angusta
  - Debaryomyces hansenii var hansenii
  - Pichia sorbitophila
  - Candida tropicalis
  - Yarrowia lipolytica
To integrate the raw data into CYGD, a PEDANT analysis was performed. Potential ORFs in the Random Sequence Tags (RSTs) were located by homology searches using BLASTX ('BLASTX' and 'Scerevisiae' links on the report page). Because the sequences produced in this project are single-read RSTs, they are prone to contain undetermined residues and frameshifts. Consequently, caution should be exercised when interpreting the computed RST translation products.
All the translations given here are hypothetical translations of the segments corresponding to BLASTX alignments or of the longest possible ORF. Only the single most probable ORF per RST was translated into a protein, which was then analysed with the extensive bioinformatic methods of the PEDANT system and stored in species-specific databases.
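The "longest possible ORF" case can be illustrated with a minimal sketch. It is not the PEDANT implementation. It assumes the standard genetic code (several of the species listed, for example Candida tropicalis and Debaryomyces hansenii, decode CTG as serine, which this sketch ignores), it does not cover the BLASTX-guided segments, and all function names are hypothetical. Codons containing undetermined bases are rendered as 'X'.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# translation table 1; CTG-clade yeasts would need a different table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; any codon with an undetermined base gives 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def longest_open_frame(rst):
    """Return (frame, peptide) for the longest stop-free stretch in six frames.

    Frames are labelled +1..+3 (forward) and -1..-3 (reverse strand).
    On ties, the first frame examined is kept.
    """
    rst = rst.upper()
    best = (0, "")
    for strand, seq in ((1, rst), (-1, reverse_complement(rst))):
        for offset in range(3):
            protein = translate(seq[offset:])
            # Stop codons ('*') split the frame into open stretches.
            for segment in protein.split("*"):
                if len(segment) > len(best[1]):
                    best = (strand * (offset + 1), segment)
    return best
```

Calling `longest_open_frame("ATGTCATTAGCTTGA")` scans all six frames and returns the frame label together with the stop-free peptide. A frameshift in a single-read RST would split the true protein across two frames, so a sketch like this would report only part of it, which is one reason for the caution advised above.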
How to browse the data
The species-specific PEDANT databases can be viewed by selecting the species and the 'Complete PEDANT' entry. Further help is available for each PEDANT topic. You may search inside the PEDANT view (Search in the left panel):
- Free search - Text search over all columns.
- By sequence id - Search the primary code (systematic ORF code for S. cerevisiae, C. albicans, N. crassa or S. pombe; RST number for the remaining species).
- By gene id - Search for a gene name or alias (only for S. cerevisiae, C. albicans or S. pombe).
- Sequence pattern searches against the protein or DNA sequences in the given PEDANT data set.
- Search your protein or DNA sequence of interest against the protein sequences of the database (or against the ORF or contig DNA; S. cerevisiae, S. pombe and C. albicans only).
You may also select S. cerevisiae homologues to retrieve a list of the best BLAST/BLASTX hits against S. cerevisiae. Additionally, a list of contig sequences (RSTs/chromosomal DNA) may be viewed. If you start your search by retrieving an S. cerevisiae ORF from the PEDANT analysis, you can compare this protein against the protein sets of selected species using the 'Compare genomes starting from this gene' link on the report page.
Searches across all species
Searches across all species are possible using the BioRS system. First log in to BioRS by clicking on the BioRS logo. Then select a category to specify your search (see below) and type in your query. Avoid non-alphabetic characters and spaces; to search for multiple terms, run sub-queries (AND/OR) on the result list instead.
- Free text over all columns
- Génolevures Random Sequence Tag number
- Protein description - All protein descriptions
- S. cerevisiae homologues - Systematic S. cerevisiae ORF code
- S. c. hom. description - Gene name or any free text in the protein description
- Enzyme nomenclature description
- Cluster of Orthologous Groups of proteins description
- Functional Catalogue text entry
- Protein description of the FUNCAT hits
- Protein family description
- PROSITE domain description
- PIR keyword like
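The advice on multi-term searches amounts to combining result lists with set operations. The sketch below is purely illustrative and does not use any BioRS interface. The entry identifiers, the search terms and the function names are all hypothetical.

```python
def is_simple_term(term):
    """True if the term contains only letters (no spaces, digits or symbols)."""
    return term.isalpha()


def combine(hits_a, hits_b, operator):
    """Combine two result lists the way an AND/OR sub-query would."""
    a, b = set(hits_a), set(hits_b)
    if operator == "AND":
        return sorted(a & b)  # entries found by both queries
    if operator == "OR":
        return sorted(a | b)  # entries found by either query
    raise ValueError("operator must be 'AND' or 'OR'")


# Hypothetical result lists for two single-word queries.
kinase_hits = ["entry_1", "entry_2", "entry_3"]
serine_hits = ["entry_2", "entry_3", "entry_4"]

both = combine(kinase_hits, serine_hits, "AND")
either = combine(kinase_hits, serine_hits, "OR")
```

Here a two-word query is replaced by two single-word queries whose result lists are intersected (AND) or merged (OR).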
Last modified: Mon Jul 12 11:07:32 CEST 2004