This is an archived course. A more recent version may be available at ocw.mit.edu.
Gibbs Sampler, Weak Motif Animation (AVI) courtesy of Professor Chris Burge.
The majority of the homework assignments will include problems that involve writing simple programs in the scripting language Python. Python, like Perl, is widely used in bioinformatics and computational biology. Because many students may have little or no programming experience, Dr. Peter Woolf will offer a hands-on Python tutorial across three sessions during the second week of classes.
The aim of this tutorial is to give students a basic working knowledge of the scripting language Python. The tutorial is intended for students with little or no programming experience, and will focus on the tools and utilities needed to do research in bioinformatics and computational biology.
My goal is to make the class informal and hands-on, so please speak up if something does not make sense. Programming cannot easily be learned by watching; it must be learned by doing.
At minimum, by the end of this class, you should be able to read a FASTA sequence from a file, parse it, and write the reverse complement of that sequence to a file.
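As a rough idea of what that end goal might look like, here is a minimal sketch in present-day Python 3. It is not the course's own solution. The function names (`read_fasta`, `reverse_complement`, `write_fasta`, `reverse_complement_file`) and the 60-character line width are assumptions chosen for illustration, and the sketch handles only the plain bases A, C, G, and T.

```python
def read_fasta(path):
    """Return a list of (header, sequence) pairs read from a FASTA file."""
    records = []
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header = line[1:]
                chunks = []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Translation table that maps each base to its complement (assumes A, C, G, T only).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the whole string."""
    return seq.translate(COMPLEMENT)[::-1]


def write_fasta(path, records, width=60):
    """Write (header, sequence) pairs to a file, wrapping sequence lines at `width`."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(">" + header + "\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def reverse_complement_file(in_path, out_path):
    """Read a FASTA file and write the reverse complement of each record."""
    records = [(h, reverse_complement(s)) for h, s in read_fasta(in_path)]
    write_fasta(out_path, records)
```

Calling `reverse_complement_file("input.fasta", "output.fasta")` (both file names are hypothetical) would read every record in the first file and write its reverse complement, under the same header, to the second.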
Session One: Introduction to Unix, Text Editors, Basic Python Commands and Data Structures
Session Two: Flow Control in Python, Input/Output, Files, HTML
Session Three: Modules, Program Organization, and Regular Expressions
Lutz, Mark, and David Ascher. Learning Python. 2nd ed. Beijing; Cambridge, MA: O'Reilly, 2003. ISBN: 9780596002817.
The tutorial will roughly follow the structure of the standard documentation tutorial that can be found at: Online Python Tutorial.
If you are already a proficient programmer, look at: Dive into Python.
A good Unix-command Cheat Sheet can be found at: Unix-command Cheat Sheet.
For an introduction to regular expressions: Regular Expression HOWTO.
To quickly test your regular expressions, try the program: Kodos.
Finally, for lots of examples of good Python code related to Bioinformatics and Computational Biology, see: Biopython Web site.
Review the notes on Unix Commands and Beginner's Python (PDF).
In Python, you can write programs that run as standalone programs or that are imported into other Python code. In fact, you have already been using other Python programs every time you use an import statement.
As an example of the framework of a basic Python program, see SampleProg.py (PY).
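For readers without the file at hand, the following is a minimal sketch of one common layout for a program that can be either run directly or imported. It is not the contents of SampleProg.py. The function `gc_content`, the default sequence, and the file name `gc_tool.py` mentioned below are all hypothetical.

```python
import sys


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def main(argv):
    """Use the first command-line argument as the sequence, or a demo sequence."""
    seq = argv[1] if len(argv) > 1 else "ATGCGC"
    print("GC content of %s: %.2f" % (seq, gc_content(seq)))


# This block runs only when the file is executed directly,
# not when it is imported by another program.
if __name__ == "__main__":
    main(sys.argv)
```

If this were saved as `gc_tool.py`, running `python gc_tool.py GGCCAT` would execute `main`, while `import gc_tool` from another program would make `gc_tool.gc_content` available without printing anything.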
Regular expressions are a powerful text-parsing tool widely used in bioinformatics. See the notes on regular expressions (PDF) for a summary of the commands.
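As a small illustration of the kind of parsing regular expressions make easy, here is a sketch using Python's standard `re` module. The motif pattern, the example sequence, and the function names `find_motif` and `split_header` are made up for this example and do not come from the course notes.

```python
import re

# Hypothetical motif: "TATA", then A or T, then A.
TATA_LIKE = re.compile(r"TATA[AT]A")

# A FASTA header line: ">" followed by an identifier, then an optional description.
HEADER = re.compile(r">(\S+)\s*(.*)")


def find_motif(seq, pattern=TATA_LIKE):
    """Return (start_position, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq)]


def split_header(line):
    """Split a FASTA header line into (identifier, description), or return None."""
    match = HEADER.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(find_motif("GGCTATAAAGGCTATATAGG"))
# [(3, 'TATAAA'), (12, 'TATATA')]
print(split_header(">seq1 Homo sapiens"))
# ('seq1', 'Homo sapiens')
```

Compiling the pattern once with `re.compile` and reusing it is a common choice when the same expression is applied to many sequences.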