SGI: Modified Suzuki and Gojobori's method for detecting positive and negative selection at individual codon sites

(c) Copyright July 2000 by Chen Su and the Pennsylvania State University. Permission is granted to copy this document provided that no fee is charged for it and that this copyright notice is not removed. It is distributed free of charge by

Chen Su
Institute of Molecular Evolutionary Genetics and Department of Biology
322 Mueller Laboratory
The Pennsylvania State University
University Park, PA 16802, USA
Email: cxs513@yahoo.com

Suggested Citation

Suzuki, Y., and T. Gojobori. 1999. A method for detecting positive selection at single amino acid sites. Mol Biol Evol 16: 1315-1328.

Su, C. 2000. SGI: Modified Suzuki and Gojobori's method for detecting positive and negative selection at individual codon sites. The Pennsylvania State University, University Park, PA, USA.

Introduction

Suzuki and Gojobori's (1999) method was designed for detecting positive and negative selection at single codon sites. According to their method, for each codon site, the probabilities of nonsynonymous and synonymous changes are computed, and the total number of changes, the number of synonymous changes, and the number of nonsynonymous changes are counted. Under the assumption of a binomial distribution, if no selection is involved, the number of nonsynonymous changes should equal the expected value. If, for a certain site, the actual number of nonsynonymous changes is significantly higher than the expected value, positive selection is assigned. If the actual number of synonymous changes is significantly higher than expected, purifying selection is assigned.

This program implements the above method with two minor modifications.

First, in the above method, the tree topology estimated from the synonymous substitutions is used for estimating the branch lengths, the ancestral states, and so on. However, I found that the tree topology estimated this way is not reliable. For example, when the number of sequences is large and the number of sites is small, the uncorrected p-distance often gives better results (see Nei and Kumar 2000). In this program, you can simply input a tree topology and get the results. If you are not sure whether the tree topology you have is correct, you can try different ones and compare the results.

Second, the above method uses an (unweighted) parsimony method to estimate the ancestral states. If different pathways are possible, they are treated as having an equal probability of occurrence. An alternative is to estimate the ancestral states by using Zhang and Nei's (1997) distance method, since it is simpler and gives equally good results. This program is written in Perl, and for the ancestor reconstruction part it makes system calls to Zhang's program anc-gene. (For details of the anc-gene program, use the upper link of 'Ancestral Sequences'.)

This program has been tested on IBM PC compatible computers.
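
The per-site test described above can be sketched in Python as follows. This is only an illustration of the binomial reasoning, not the code of SGI.pl: the function names, the `p_nonsyn` argument (the expected probability that a change at the site is nonsynonymous when no selection acts), and the 0.05 significance level are assumptions made for the example.

```python
from math import comb


def upper_tail(n_changes, k_observed, p_expected):
    """Return P(X >= k_observed) for X ~ Binomial(n_changes, p_expected)."""
    return sum(
        comb(n_changes, k) * p_expected**k * (1 - p_expected) ** (n_changes - k)
        for k in range(k_observed, n_changes + 1)
    )


def classify_site(n_nonsyn, n_syn, p_nonsyn, alpha=0.05):
    """Classify one codon site from its counted changes (hypothetical helper).

    n_nonsyn, n_syn: counted nonsynonymous and synonymous changes at the site.
    p_nonsyn: assumed probability that a change is nonsynonymous under neutrality.
    """
    total = n_nonsyn + n_syn
    if total == 0:
        return "no changes"
    # Excess of nonsynonymous changes suggests positive selection.
    if upper_tail(total, n_nonsyn, p_nonsyn) < alpha:
        return "positive"
    # Excess of synonymous changes suggests purifying selection.
    if upper_tail(total, n_syn, 1 - p_nonsyn) < alpha:
        return "purifying"
    return "not significant"


print(classify_site(10, 0, 0.7))
print(classify_site(0, 6, 0.7))
print(classify_site(3, 1, 0.7))
```

The two tail probabilities cannot both fall below the significance level for the same site, so the order of the two checks does not affect the result.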

Installation

Computation

To run the program:
1. Open the file SGI.pl using a text editor such as Notepad, and replace the name of the input file (VH.dat) with the name of your file.
2. Save SGI.pl and close it.
3. Type: c:\SGI\perl SGI.pl
4. Follow the instructions on the screen.

Output file

The output file takes the name of your input file with the extension .out appended. For example, if your input file is named myseq.dat, the output file will be myseq.dat.out. You can then copy the output into an Excel spreadsheet and calculate the probabilities (see VH.xls for an example). If, for a certain site, the value of 1-p (the last column in the example file) is higher than 0.95, then this site is under positive or purifying selection, depending on whether you are computing nonsynonymous changes or synonymous changes.
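
As an alternative to inspecting the spreadsheet by eye, the 0.95 rule can be applied with a few lines of Python. The sketch below assumes a hypothetical whitespace-separated table in which the first column is a site label and the last column is 1-p. This layout is an assumption for illustration and is not the documented format of the .out file or of VH.xls. The helper names `output_name` and `flag_sites` are likewise invented here.

```python
def output_name(input_name):
    """Return the output file name: the full input name with '.out' appended."""
    return input_name + ".out"


def flag_sites(table_text, threshold=0.95):
    """Return labels of sites whose 1-p value is higher than the threshold.

    Assumes (hypothetically) that each row starts with a site label and ends
    with the 1-p value. Rows whose last field is not a number are skipped.
    """
    flagged = []
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or single-field row
        try:
            one_minus_p = float(fields[-1])
        except ValueError:
            continue  # header or other non-numeric row
        if one_minus_p > threshold:
            flagged.append(fields[0])
    return flagged


example = "site 1-p\n1 0.50\n2 0.97\n3 0.95\n"
print(output_name("myseq.dat"))
print(flag_sites(example))
```

Whether a flagged site indicates positive or purifying selection still depends on which type of change (nonsynonymous or synonymous) the probabilities were computed for.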

Department of Biology
Eberly College of Science