PRL Programs and Software

The Polymorphism Research Laboratory


Software that we find helpful in our studies, made available online in executable form.

geSNP

The use of high-density oligonucleotide arrays to measure thousands of mRNA abundance levels in parallel has become commonplace. In order to take further advantage of the growing body of data and to enable others to do so, we have developed a method and computer program to mine the hybridization patterns in oligonucleotide array-based gene expression data to identify genes with sequence differences. The program enables the broad, unbiased and opportunistic extraction of genetic information from new or pre-existing gene expression data obtained with high-density oligonucleotide arrays.

 

Multivariate Distance Matrix Regression

This analysis method relates predictor variables collected on the samples to the variation in pairwise distance values captured in a distance matrix. The proposed multivariate method avoids the need to reduce the dimensions of the matrix, can be used to assess relationships between the data used to construct the matrix and additional information collected on the samples under study, and can be used to analyze individual data points or groups of data points identified in different ways. It provides a formal statistical test, rooted in traditional linear models, of how independent variables are associated with the variation present in a pairwise distance matrix.
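As a rough illustration of the idea, the sketch below computes a pseudo-F statistic for a single numeric predictor against a distance matrix and assesses it by permutation. It is a minimal sketch under stated assumptions, not the laboratory's program: the function names (`gower_center`, `pseudo_f`, `mdmr_permutation_test`) are hypothetical, the model is restricted to an intercept plus one predictor, and significance is assessed by shuffling the predictor.

```python
import random

def gower_center(dist):
    """Gower-center the matrix A = -0.5 * d_ij^2 (double centering)."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]          # A is symmetric: row means = column means
    grand = sum(row) / n
    return [[a[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def pseudo_f(g, x):
    """Pseudo-F for an intercept-plus-one-predictor model (assumed simplification).

    With a centered predictor xc, the explained part of tr(G) is
    xc' G xc / (xc' xc); the remainder of tr(G) is residual.
    """
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    ss = sum(v * v for v in xc)
    explained = sum(xc[i] * g[i][j] * xc[j] for i in range(n) for j in range(n)) / ss
    residual = sum(g[i][i] for i in range(n)) - explained
    if residual <= 0.0:                    # predictor explains all of the variation
        return float("inf")
    return explained / (residual / (n - 2))

def mdmr_permutation_test(dist, x, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    g = gower_center(dist)
    observed = pseudo_f(g, x)
    rng = random.Random(seed)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(g, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Here `dist` is a square, symmetric list of lists of pairwise distances and `x` is one predictor value per sample. Extending the sketch to several predictors would require the full projection ("hat") matrix of the design rather than the one-predictor shortcut used here.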

 

Multivariate Association Mapping Algorithms

A Multivariate Likelihood Ratio Test that Simultaneously Assesses Mean and Covariance Matrix Differences Among a Set of Variables. This methodology accounts for the fact that groups (e.g., subjects broken down into haplotype or genotype categories) can differ with respect to the mean values of a set of traits or with respect to the relationships between those traits. The proposed tests are therefore more comprehensive and useful than many traditional multivariate tests, which do not leverage information about both means and covariance matrices in their construction. The proposed tests are modifications and extensions of multivariate ANOVA and other techniques. A simultaneous test of the equality of means and covariance matrices across the genotypic categories can be constructed simply as the product of the likelihood ratio for the test of means and the likelihood ratio for the test of covariance matrices.
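The product construction can be sketched with the textbook likelihood ratio for multivariate normal data. This is a minimal sketch under stated assumptions, not the laboratory's implementation: it assumes multivariate normality within each group, uses maximum-likelihood (divide-by-n) covariance estimates, and the function name `joint_lrt` and its helpers are hypothetical. Under those assumptions the product of the two likelihood ratios reduces to -2 log Λ = n log|S_total| - Σ n_i log|S_i|, which is compared asymptotically to a chi-square distribution with (k-1)p(p+3)/2 degrees of freedom for k groups and p traits.

```python
import math

def _mean(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def _mle_cov(rows):
    """Maximum-likelihood covariance matrix (divides by n, not n - 1)."""
    n, center = len(rows), _mean(rows)
    p = len(center)
    return [[sum((r[a] - center[a]) * (r[b] - center[b]) for r in rows) / n
             for b in range(p)] for a in range(p)]

def _logdet(matrix):
    """Log-determinant of a positive-definite matrix by Gaussian elimination."""
    a = [row[:] for row in matrix]
    p = len(a)
    total = 0.0
    for i in range(p):
        pivot = a[i][i]
        if pivot <= 0.0:
            raise ValueError("covariance matrix is not positive definite")
        total += math.log(pivot)
        for k in range(i + 1, p):
            factor = a[k][i] / pivot
            for j in range(i, p):
                a[k][j] -= factor * a[i][j]
    return total

def joint_lrt(groups):
    """Return (-2 log Lambda, degrees of freedom) for H0: equal means and covariances.

    groups: list of groups, each a list of equal-length trait vectors.
    """
    all_rows = [r for g in groups for r in g]
    n, p, k = len(all_rows), len(all_rows[0]), len(groups)
    # Total covariance about the grand mean versus per-group covariances.
    stat = n * _logdet(_mle_cov(all_rows))
    for g in groups:
        stat -= len(g) * _logdet(_mle_cov(g))
    df = (k - 1) * p * (p + 3) // 2
    return stat, df
```

Each group needs more observations than traits so that its covariance matrix is positive definite. The chi-square reference distribution is only an asymptotic approximation; with small genotype categories a permutation-based reference distribution may be preferable.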

 

SNP-Expectation Maximization

SNPEM is used to assess the accuracy of haplotype frequency estimation as a function of a number of factors, including sample size, number of loci studied, allele frequencies, and locus-specific allelic departures from Hardy-Weinberg and linkage equilibrium. Many haplotype-analysis methods require phase information that can be difficult to obtain from samples of nonhaploid species. SNPEM employs strategies for estimating haplotype frequencies from unphased diploid genotype data collected on a sample of individuals that make use of the expectation-maximization (EM) algorithm to overcome the missing phase information.
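The EM idea can be illustrated for the simplest case of two biallelic loci, where only double heterozygotes have ambiguous phase. The sketch below is an assumption-laden illustration, not SNPEM itself: the function name `em_haplotype_freqs`, the genotype encoding (count of allele 1 at each locus), and the assumption of Hardy-Weinberg proportions for haplotype pairs are all choices made for this example.

```python
def em_haplotype_freqs(genotypes, max_iter=500, tol=1e-12):
    """Estimate two-locus haplotype frequencies from unphased genotypes by EM.

    genotypes: list of (g1, g2), where g1 and g2 are the number of copies (0, 1, 2)
    of allele 1 at locus 1 and locus 2. Returns a dict keyed by haplotype (a1, a2).
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    freqs = {h: 0.25 for h in haps}            # uniform starting values
    n_chrom = 2 * len(genotypes)
    for _ in range(max_iter):
        counts = {h: 0.0 for h in haps}
        for g in genotypes:
            # E-step: enumerate unordered haplotype pairs compatible with g.
            pairs = [(h1, h2) for h1 in haps for h2 in haps
                     if h1 <= h2 and (h1[0] + h2[0], h1[1] + h2[1]) == tuple(g)]
            if not pairs:
                raise ValueError(f"invalid genotype: {g!r}")
            # Probability of each pair under random mating (assumed).
            weights = [freqs[h1] * freqs[h2] * (1 if h1 == h2 else 2)
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: new frequencies are expected haplotype counts per chromosome.
        new_freqs = {h: c / n_chrom for h, c in counts.items()}
        change = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if change < tol:
            break
    return freqs
```

For example, `em_haplotype_freqs([(0, 0), (2, 2), (1, 1)])` weighs the two possible phasings of the double heterozygote `(1, 1)` by the current frequency estimates. More loci enlarge the set of compatible haplotype pairs per individual, which is one reason estimation accuracy depends on the number of loci studied.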

 

External Programs

CompLearn

From the CompLearn website:

CompLearn is a suite of simple-to-use utilities that you can use to apply compression techniques to the process of discovering and learning patterns.

The compression-based approach used is powerful because it can mine patterns in completely different domains. It can classify musical styles of pieces of music and identify unknown composers. It can identify the language of bodies of text. It can discover the relationships between species of life and even the origin of new unknown viruses such as SARS. Other uncharted areas are up to you to explore.

This method is so general that it requires no background knowledge about any particular classification. There are no domain-specific parameters to set and only a handful of general settings. Just feed and run.
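The compression-based approach rests on the normalized compression distance (NCD), which treats two objects as similar when compressing them together costs little more than compressing one alone. The sketch below is not CompLearn's own code or interface; it assumes Python's standard `zlib` module as the compressor, and the names `compressed_size` and `ncd` are hypothetical.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A matrix of pairwise `ncd` values over a set of files can then be passed to any clustering or tree-building method. Real compressors are imperfect, so values slightly above 1 or slightly above 0 for identical inputs are expected.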

 


contact webmaster: Charles Abney


Last updated: Tue Oct 24 11:41:12 PDT 2006