Recommended Web Sites



Sites Comments
Bioinformatics and Computational Biology Books This is an Amazon Listmania! list of bioinformatics and computational biology books recommended by a bioinformatician in Singapore.


Sites Comments
BioConductor BioConductor is an open source software project that provides R packages for bioinformatics. BioConductor’s greatest strength is microarray analysis, but other packages are available for more traditional bioinformatics analysis.
BioPerl BioPerl is open source software for DNA sequence analysis, written in Perl.
DAS “The distributed annotation system (DAS) is a client-server system in which a single client integrates information from multiple servers. It allows a single machine to gather up genome annotation information from multiple distant web sites, collate the information, and display it to the user in a single view. Little coordination is needed among the various information providers.”
EMBOSS and EMBOSS Applications EMBOSS (European Molecular Biology Open Source Software Suite) is software for DNA sequence analysis, written in C.
NCBI BLAST BLAST (Basic Local Alignment Search Tool) is NCBI’s tool for identifying regions of similarity among sequences. The source code and precompiled binaries are available.
NCBI ToolBox NCBI has written two versions of its toolbox, one in C and the other in C++. The software was written originally to support NCBI’s internal pipelines, but there is a great deal of useful code here.
Open Bioinformatics Foundation CVS This page provides information for obtaining the source code from the Open Bioinformatics Foundation CVS server. Source code is available for BioPerl, BioJava, BioPython, MOBY, and DAS.
Sanger Institute Software This is the Sanger Institute’s main software page. I made a small contribution to the SSAHA software.
The Scriptome This is a site provided by the Bauer Center for Genomics Research at Harvard University that provides simple one-liners or short scripts designed for use by non-programmers. See Data Munging for Non-Programming Biologists for an overview.
Taverna “The Taverna project aims to provide a language and software tools to facilitate easy use of workflow and distributed compute technology within the eScience community.”


Sites Comments
A. F. Rosenberg Consulting, LLC Alex Rosenberg, Ph.D., is a former colleague, a good friend of mine, and one of the smartest people I know. He has substantial industry experience in data visualization, analysis, and mining, and he is a MATLAB expert.
The BioTeam “BioTeam is a consulting collective dedicated to delivering vendor-agnostic informatics solutions to the life science industry.” I worked closely with the BioTeam at a former job. These guys are extremely smart, competent, capable, hard-working, and efficient.

Database Management Systems

Sites Comments
MySQL MySQL is an open source database management system that is used on millions of systems. It is fairly easy to install MySQL on your computer and then use it to support your database needs. APIs are available for a wide variety of programming languages, including C, C++, PHP, Perl, etc.
MySQL++ (MySQL C++ Wrapper for C API) The MySQL API is written in C. This site provides a C++ template-based wrapper for C++ programmers.


Genome Browsers

Sites Comments
Ensembl Genome Browser Ensembl provides annotated genome sequence data for many organisms.
Integrated Microbial Genomes (JGI) The Joint Genome Institute of the U.S. Department of Energy has sequenced the genomes of many microbes. These and many other microbial genomes are presented at this site.
Saccharomyces Genome Database (SGD) SGD provides annotation and sequence data for Saccharomyces cerevisiae.
UCSC Genome Browser The UCSC Genome Browser provides annotated genome sequence data for many organisms.

GNU/Linux and UNIX

Sites Comments
Advanced Programming in the UNIX Environment, 2nd Edition This is the second edition of Richard Stevens’s classic book. The book is also available through Safari Bookshelf.
Fedora Core This is the home page for the Fedora Core distribution of Linux. The Fedora Core project is sponsored by Red Hat.
Installing Debian This is an article about how to install the Debian distribution.

Mac OS X


Sites Comments
Advanced Mac OS X Programming Advanced Mac OS X Programming is a book published by Big Nerd Ranch, a company that provides training in Mac OS X and Unix. “This book goes down to the real nitty-gritty of multi-threading, interprocess communication, networking, performance tuning, distributed objects, kqueues, Bonjour, authentication, the keychain, and directory services. The tools are also covered: gcc, gdb, subversion, Shark, and Saturn.” ISBN 0-9740785-1-4, $69.99.
Carbon Dev -- Mac OS X Carbon Programming Wiki Carbon Dev provides resources for programming using the Carbon libraries of Mac OS X.
Getting Started with launchd This is an Apple Developer article about the launch daemon in Mac OS X 10.4.
GoingWare’s Programming Tips This site provides a collection of articles and essays that cover programming, object oriented design, web site design, programming for Mac OS X and Darwin, and other topics.
High Performance Computing The definitive collection of the latest compilers and high performance computing tools for Panther (Mac OS X 10.3.x) and Tiger (Mac OS X 10.4.x) on the G4 and G5 processors.
Introduction to Open Source Scripting on Mac OS X Introduces the open source scripting environment on Mac OS X, including shell scripting, sed, awk, Perl, Python, PHP, and Ruby. Also introduces using Xcode for managing scripts.
The Mac Xcode 2 Book The Mac Xcode 2 Book, written by Michael E. Cohen and Dennis R. Cohen, was published in June 2005. Other reviews report that this book explains how to use Xcode 2 but does not explain much about programming.
Working with Xcode This is an Apple Developer article about using Apple’s Xcode IDE. At the time of writing, the article describes Xcode 2.2.


Sites Comments
Bare Bones Software BBEdit and TextWrangler Bare Bones is the creator of BBEdit, the premier text editor for Mac OS X. I use BBEdit for programming and for writing web pages. Bare Bones also gives away TextWrangler, a text editor that is similar to BBEdit but lacks its more sophisticated features.
Circus Ponies NoteBook NoteBook is a wonderful program that helps you store and organize notes, pictures, and files. It is easy to use, and it has become indispensable in my life.
Configuring and Running X11 Applications on Mac OS X This page on Apple Computer’s web site provides information for installing and using X11 under Mac OS X.
DarwinPorts DarwinPorts contains precompiled binaries for quick installation under Mac OS X, Darwin, and OpenDarwin.
Fetch Softworks Fetch is an FTP and SFTP client for Mac OS X. I use it for maintaining this web site.
Little Snitch Little Snitch monitors your Internet connection so you can detect when a program sends information out.

Microarray Analysis

Sites Comments
ArrayExpress at the EBI ArrayExpress is the microarray data repository located at the European Bioinformatics Institute.
BioConductor BioConductor is an open source software project that provides R packages for bioinformatics. BioConductor’s greatest strength is microarray analysis, but other packages are available for more traditional bioinformatics analysis.
caArray “The NCICB’s cancer array informatics project, caArray, consists of a microarray database and microarray data analysis and visualization tools. CaArray is an open source project, and the source code and APIs are available in the caArray Informatics page. The goals of the project are to make microarray data publicly available, and to develop and bring together open source tools to analyze these data.”
DNA-Chip Analyzer (dChip) dChip is software for the Windows platform for model-based analysis of expression microarrays.
Gene Expression Omnibus GEO is the microarray data repository located at the U.S. National Center for Biotechnology Information.
MicroArray Gene Expression Data Open Source Projects This site provides access to the MAGE specification and software.
Significance Analysis of Microarrays (SAM) SAM is software for the analysis of microarray data.

Numerics and Scientific Computing

Sites Comments
GNU Linear Programming Kit (GLPK) “The GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.”
GNU Scientific Library (GSL) “The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total.”
The Object-Oriented Numerics Page A comprehensive set of links to object-oriented numerics software, maintained by Todd Veldhuizen.

Programming Languages


Sites Comments
Boost C++ Libraries The Boost project provides C++ libraries. Many of these libraries may become part of the next C++ Standard Library.
C++ FAQ Lite Answers to frequently asked questions about C++, maintained by Marshall Cline.
C++ Programming This site provides tutorials on C++ programming and the Standard Template Library, a C++ glossary, and much other information useful to a beginner in C++. A drawback is the number of ads, including popup ads.
GNU cgicc This is the site for GNU’s C++ class library for writing CGI applications.


Sites Comments
Basic JavaScript Tutorials There are plenty of JavaScript resources at this site, including sample scripts and tutorials.


Sites Comments
Comprehensive Perl Archive Network (CPAN) This site provides access to the source code for Perl itself and thousands of Perl modules as well as the Perl documentation. This is where to go to read articles and tutorials about Perl. This is a large and comprehensive site. Another Perl site with lots of links.


Sites Comments
PHP Function Index for Mac OS X A PHP function browser for Mac OS X, with searching and an AppleScript interface.
The Practicality of OO PHP This is an O’Reilly article about using object-oriented PHP.
Recommended PHP Reading List This is a list of recommended reading compiled by developers at IBM.


Sites Comments
Python Programming Language—Official Web Site Python is an object-oriented programming language that is supposed to be easy to learn. Python 2.3.5 is installed with Mac OS X 10.4.6.


Sites Comments
Ajax on Rails This is an O’Reilly article about using Ajax with Ruby on Rails.
Ruby on Rails This is the home page for the Ruby on Rails project.
What is Ruby on Rails This is an O’Reilly article about Ruby on Rails, written by Curt Hibbs.


Sites Comments
Cincom Smalltalk This is the home page for Cincom’s Smalltalk, with a short walk-through, online tutorials, and an application developer’s guide, as well as links to the download page.
GNU Smalltalk GNU Smalltalk-80.
Squeak Squeak is an open, portable implementation of Smalltalk.

Protein Structures

Sites Comments
CATH Protein Structure Classification Database The CATH database provides a classification of protein structures based on Class, Architecture, Topology, and Homologous superfamily.
RCSB Protein Data Bank (PDB) PDB is a repository of protein structures.
SCOP: Structural Classification of Proteins SCOP is a manually curated database of protein structures. The goal of the SCOP project is “to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.”


R Project for Statistical Computing

Sites Comments
BioConductor BioConductor is an open source software project that provides R packages for bioinformatics. BioConductor’s greatest strength is microarray analysis, but other packages are available for more traditional bioinformatics analysis.
Building R Step by Step [for Mac OS X] Step by step directions (possibly obsolete) for building R for Mac OS X.
Java GUI for R (JGR) JGR is a universal and unified graphical user interface for R, written in Java.
R for Mac OS X FAQ The FAQ for R under Mac OS X.
R for Mac Wiki This is Simon Urbanek’s Wiki server for the R for Macintosh Project, for which Simon is one of the developers.
R Project for Statistical Computing R is open source software for statistical computing and graphics. I devote an entire section of this web site to R. It is difficult to learn, but very powerful once you get the hang of it.
RKWard RKWard is planned to be a front end for R that makes R easier to use. This is an open source project.
RoSuDa R-related Software The Lehrstuhl für Rechnerorientierte Statistik und Datenanalyse distributes JGR (a Java-based user interface for R), iPlots, and Rserve.


Sites Comments
SAS OnlineDoc, V8 The online documentation for SAS, version 8, provided through a Java applet.
Documentation for SAS 9 The online documentation for SAS, version 9, available as HTML or PDF. The freely downloadable PDF versions replace books costing hundreds of dollars.

Other Software

Sites Comments
Biostatistical Software Routines Written By Paul Johnson A set of software programs, routines, and technical reports on a CD-ROM. For Windows.
Dap Dap is a small statistics package distributed by the GNU project.
libsvm libsvm is a library for support vector machines.
RoSuDa Interactive Statistical Software The Lehrstuhl für Rechnerorientierte Statistik und Datenanalyse distributes interactive software for data visualization and statistical analysis.

Just for Fun

Sites Comments
Gary C. Ramseyer’s First Internet Gallery Of Statistics Jokes Jokes only a statistician can appreciate. Some of these are pretty awful.

Web Standards

Sites Comments
Developer Documentation for the W3C Markup Validation Service The W3C has developed a markup validator and provides a markup validation service. This page provides directions for obtaining the source code and installing the validator on your own machine.
A List Apart A List Apart provides articles about designing functional and attractive web sites using current standards.
css/edge Eric A. Meyer’s web site that demonstrates cutting edge CSS techniques.
UsableType: Web Typography Guide This valuable site provides typography recommendations for web pages.
W3Schools Online Web Tutorials Free tutorials on HTML, XML, browser scripting, server scripting, .NET, multimedia, and web building. The price is lots of clutter from advertising, but I have used several of these tutorials.
Web Standards Project The Web Standards Project “fights for standards that reduce the cost and complexity of development while increasing the accessibility and long-term viability of any site published on the Web.”
World Wide Web Consortium (W3C) The W3C develops specifications, guidelines, and software for web technologies.