| Calling Sequence
| GenomeSummary(DB)
|
| Parameters
| | Name | Type | Description |
|
DB | database | database structure to summarize |
|
| Return Type
| GenomeSummary |
| Selectors
| | Name | Type | Description |
|
FileName | string | name of external file containing the database |
string | string | the entire header of the database as a string |
TotAA | posint | number of amino acids or bases in the database |
TotChars | posint | number of characters in the database |
TotEntries | posint | number of entries in the database |
type | string | dna, rna, mixed or peptide |
EntryLengths | list(posint) | length of each entry |
Id | string | 5-letter code (SwissProt) for species/genome |
Kingdom | string | either Bacteria, Archaea or Eukaryota |
Lineage | list(string) | Lineage as a list (from OS tags) |
Genus | string | First part of the scientific name |
Epithet | string | Second part of the scientific name |
sgml_tag | string | The contents of the tag in the database header |
|
| Methods
| print, Rand, select, string, type |
| Synopsis
| GenomeSummary provides an alternative to loading a database
when the sequences themselves are not needed.
Typically, the database is loaded, GenomeSummary is run, and its
results are stored in a file for later reading.
In this way, all of the data except the sequences themselves is
available, and many genomes can be loaded into a Darwin session. |
| GenomeSummary has all the selectors that are
available for a database (except for Entry and Pat, which can only
be used if the sequences are present).
It also provides a few additional selectors.
The EntryLengths selector contains the length of the sequence of each entry.
The string selector does not select the entire text of the database,
just the text that precedes the first entry.
This is normally called the header of the database.
The header contains several useful tags that describe the entire
database, for example the 5-letter code, kingdom and lineage.
This information is available directly through selectors.
Any other tagged information in the header can be selected by using the
name of the tag as a selector.
|
| Examples
| > ReadDb('/home/darwin/DB/genomes/ECOLI/ECOLI.db'):
> gs := GenomeSummary(DB):
> gs[TotAA];
1358990
> gs[Lineage];
[Bacteria, Proteobacteria, Gammaproteobacteria, Enterobacteriales,
Enterobacteriaceae, Escherichia, Escherichia coli]
> print(gs);
FileName: /home/darwin/DB/genomes/ECOLI/ECOLI.db
string: <DBNAME>Escherichia coli K-12 MG1655 complete genome.</DBNAME><D...
TotAA: 1358990
TotChars: 6806443
TotEntries: 4289
type: Peptide
Id: ECOLI
Kingdom: Bacteria
Lineage: [Bacteria, Proteobacteria, Gammaproteobacteria, Enterobacteriales, E
nterobacteriaceae, Escherichia, Escherichia coli]
|
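The other selectors described in the Synopsis can be queried in the same way as TotAA and Lineage above. The following is a minimal sketch, not verified output: it assumes the same ECOLI database path as the example above, and that DBNAME (the tag visible in the printed header) is not assigned as a variable in the session. Results are omitted because they depend on the database contents.

```
> ReadDb('/home/darwin/DB/genomes/ECOLI/ECOLI.db'):
> gs := GenomeSummary(DB):
> gs[Id];               # 5-letter species/genome code
> gs[Kingdom];          # Bacteria, Archaea or Eukaryota
> gs[EntryLengths][1];  # sequence length of the first entry
> gs[DBNAME];           # assumed: contents of the <DBNAME> header tag
```

Because Entry and Pat need the sequences themselves, they are not used here.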
| See also
| ConsistentGenome, database, DB, Entry, ReadDb, Sequence |