Bio-recipes (Bioinformatics recipes) in Darwin

Bio-recipes are a collection of Darwin example programs.  They show how to solve standard problems in Bioinformatics. Each bio-recipe consists of an introduction, explanations, graphs, figures, and most importantly, Darwin commands (the input commands and the output that they produce) that solve the given problem.

Darwin is an interactive language, of the same lineage as Maple, designed to solve problems in Bioinformatics. It combines a simple language for the interactive user, the infrastructure necessary for writing object-oriented libraries, and very efficient primitive operations. These primitives cover the most common and time-consuming operations in bioinformatics, including linear algebra operations.

The reasons behind this particular format are the following.

  1. It is much easier to understand an algorithm, a procedure, or even a theorem when it is illustrated with a running example.
  2. The procedures, as written, may be run on different data and hence serve a useful purpose.
  3. It is an order of magnitude easier to modify a correct, existing program than to write a new one from scratch. This is particularly true for non-computer scientists.
  4. The full examples show some features of the language and of the system that may not be known to the casual user of Darwin; hence they serve a tutorial purpose.

The problems considered up to now include:

Total Number of Pairs of Amino Acids in the SwissProt Database

Basic introduction to programming in Darwin

Chi-square Test for a Contingency Table of Counts

Linear Regressions: 5 Methods to Compute A^t * A

Unbiased selection of sample alignments

Counters: an example of a simple Class

Linear algebra in Darwin

Finding Orthologous sequences and building a phylogenetic tree

Significance assessment of an alignment

Multiple repetitions of a short motif

Sequence Alignments with Special Characteristics

Search your Name in the SwissProt Database

Recognizing Proteins by Weight of their Digested Parts

String Alignment using Dynamic Programming

Linear Classification or Discrimination

Random Distance Trees, analysis of properties

A Class for Discrete Bayesian Networks in Darwin

Phylogenetic tree of the "Nigerian Prince" email scam

The tRNA Pairing Index

The most significant Codon Bias in Yeast

Virus Classification using k-nucleotide Frequencies

Determination of Haplotypes from Genotype information

How to Compute Mutation and Dayhoff Matrices

Introduction to Codon Substitution Matrices

Significance of Alignment Scores

Idealized Mutational Clocks

Back Translation (protein to DNA) in an optimal way

Greedy algorithms for optimization: an example with Synteny

Computing Confidence Levels for Quartets

How to Solve a Number Puzzle

Each problem is organized as a separate recipe, which corresponds to an HTML file. These files can be accessed by clicking on the links above. Each recipe contains Darwin statements shown in green, Darwin output in red and comments in black. The comments include a short description of the problem solved in the recipe and of the algorithm used to produce the solution.

The Bio-recipes are intended to be used as follows: after reading the explanatory text, copy and paste the Darwin statements into a Darwin prompt, then compare the output you get with the output given in the recipe. If you have to solve a slightly different task from the one shown in the recipe, you can copy the Darwin statements into a text editor and modify them there to fit your specific problem.

Bio-recipes tied to biological questions

Total Number of Pairs of Amino Acids in the SwissProt Database
Unbiased selection of sample alignments
Finding Orthologous sequences and building a phylogenetic tree
Significance assessment of an alignment
Multiple repetitions of a short motif
Sequence Alignments with Special Characteristics
Recognizing Proteins by Weight of their Digested Parts
Random Distance Trees, analysis of properties
Search your Name in the SwissProt Database
The tRNA Pairing Index
The most significant Codon Bias in Yeast
Back Translation (protein to DNA) in an optimal way

Mathematical/algorithmic topics

Chi-square Test for a Contingency Table of Counts
Linear Regressions: 5 Methods to Compute A^t * A
Significance assessment of an alignment
String Alignment using Dynamic Programming
Linear Classification or Discrimination
The tRNA Pairing Index
Back Translation (protein to DNA) in an optimal way
How to Solve a Number Puzzle

Programming in Darwin

Total Number of Pairs of Amino Acids in the SwissProt Database
Basic introduction to programming in Darwin
Linear Regressions: 5 Methods to Compute A^t * A
Counters: an example of a simple Class
Linear algebra in Darwin
Search your Name in the SwissProt Database
A Class for Discrete Bayesian Networks in Darwin

CBRG

Last updated on Tue Jul 8 04:11:35 CEST 2003 by GhG