
RepeatMap

RepeatMap is a set of resources enabling researchers to quickly determine the uniqueness of a sequence. Some uses include:

  • RNA Probe Design. Off-target effects seem to begin when a probe shares a 15-20 base-pair region of perfect homology with an unintended target. RepeatMap reports exact occurrence counts, so probes can be chosen to minimize the probability of off-target effects.
  • UCSC Genome Browser Track. Each kmer of the genome is annotated with the exact number of times it occurs in the genome. This can then be used to determine structures correlated with high/low repeat counts.
  • Compression. A fundamental problem in compression is knowing which strings occur at higher frequency so that smaller symbols can be used for those strings. RepeatMap provides an efficient way to determine a priori the exact number of times a kmer occurs. In essence, RepeatMap is similar in spirit to the Burrows-Wheeler (BW) transform, except that it looks only at kmers, whereas BW is built over all suffixes of the sequence.
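
To make "exact kmer counts" concrete, here is a minimal Python sketch. It is not RepeatMap's implementation: the function names `count_kmers` and `annotate` are hypothetical, everything is held in a single in-memory dictionary (which would not scale to a full genome), and only exact forward-strand matches are counted (no reverse complements).

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping kmer in the sequence (exact matches only)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def annotate(sequence, k):
    """For each kmer start position, return how often that kmer occurs."""
    counts = count_kmers(sequence, k)
    sequence = sequence.upper()
    return [counts[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    toy = "ACGTACGTTT"
    # ACGT occurs twice (positions 0 and 4); every other 4-mer occurs once.
    print(annotate(toy, 4))  # prints [2, 1, 1, 1, 2, 1, 1]
```

A position whose count is 1 starts a kmer that is unique in the sequence; higher counts mark repeated kmers.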

Description

RepeatMap is composed of individual modules, each designed to enable extremely rapid repeat counting. We provide intuitive, easy-to-use interfaces to each of the tools. There are currently three parts of the RepeatMap system:

  1. The RepeatMap Dictionary Server builds the dictionaries of repeat counts and keeps them loaded so they can be queried. This "server" can run on the same computer as the client (see below) as long as the computer has sufficient memory (see documentation).
  2. The RepeatMap Client queries the RepeatMap Dictionary Server to determine the repeat counts of strings.
  3. The Annotator Client queries the RepeatMap Dictionary Server to determine the repeat counts for very long strings (e.g. chromosomes). It then outputs the results in a file that can be viewed in the UCSC Genome Browser.
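
The exact file format written by the Annotator Client is not stated here. As one possibility, per-position counts could be written as a fixedStep wiggle track, a format the UCSC Genome Browser accepts for custom tracks. The sketch below assumes that format and counts kmers locally instead of querying a Dictionary Server; the function names and the track name are hypothetical.

```python
from collections import Counter


def kmer_counts_per_position(sequence, k):
    """Return the occurrence count of the kmer starting at each position."""
    sequence = sequence.upper()
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(kmers)
    return [counts[kmer] for kmer in kmers]


def to_fixed_step_wiggle(chrom, sequence, k, track_name="kmer_repeats"):
    """Format per-position counts as a fixedStep wiggle track.

    Wiggle coordinates are 1-based, so the first kmer starts at position 1.
    """
    lines = [
        f'track type=wiggle_0 name="{track_name}"',
        f"fixedStep chrom={chrom} start=1 step=1",
    ]
    lines.extend(str(c) for c in kmer_counts_per_position(sequence, k))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy sequence standing in for a chromosome.
    print(to_fixed_step_wiggle("chr1", "ACGTACGTTT", 4), end="")
```

For a real chromosome the counts would come from the Dictionary Server rather than a local `Counter`, and the output would be streamed to a file rather than built as one string.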

All components can be used independently and may be modified by the user for any purpose under the terms of the GPL.



This page last modified Sunday, 29-May-2011 13:45:42 EDT