Research Article
Published Online: 23 June 2006

A Fast and Symmetric DUST Implementation to Mask Low-Complexity DNA Sequences

Publication: Journal of Computational Biology
Volume 13, Issue Number 5

Abstract

The DUST module has been used within BLAST for many years to mask low-complexity sequences. In this paper, we present a new implementation of the DUST module that uses the same function to assign a complexity score to a sequence, but uses a different rule by which high-scoring sequences are masked. The new rule masks every nucleotide masked by the old rule and occasionally masks more. The new masking rule corrects two related deficiencies with the old rule. First, the new rule is symmetric with respect to reversing the sequence. Second, the new rule is not context sensitive; the decision to mask a subsequence does not depend on what sequences flank it. The new implementation is at least four times faster than the old on the human genome. We show that both the percentage of additional bases masked and the effect on MegaBLAST outputs are very small.
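The abstract refers to a complexity score without stating it. As a rough illustration only, the sketch below assumes the triplet-count score usually attributed to DUST: count each overlapping 3-mer in a sequence, sum c(c-1)/2 over the counts c, and divide by the number of triplets minus one. The formula, the function name `dust_score`, and the handling of very short inputs are assumptions, not taken from the abstract, and the masking rule itself is not shown. Under this assumption the score is unchanged when the sequence is reversed, so any asymmetry would come from the masking rule rather than from the score.

```python
from collections import Counter


def dust_score(seq: str) -> float:
    """Triplet-based complexity score (assumed form, see lead-in).

    Higher values mean more repeated triplets, i.e. lower complexity.
    """
    seq = seq.upper()
    # All overlapping triplets (3-mers) of the sequence.
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    n_triplets = len(triplets)
    # Assumption: sequences with fewer than two triplets score 0.
    if n_triplets < 2:
        return 0.0
    counts = Counter(triplets)
    # Each triplet occurring c times contributes c*(c-1)/2 repeated pairs.
    repeated_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return repeated_pairs / (n_triplets - 1)


if __name__ == "__main__":
    low_complexity = "AAAAAAAA"
    mixed = "ACGTACGT"
    print(dust_score(low_complexity))   # homopolymer: high score
    print(dust_score(mixed))            # mixed sequence: low score
    # Reversing the sequence leaves the score unchanged.
    print(dust_score(mixed) == dust_score(mixed[::-1]))
```

A real masker would compare such scores against a threshold over windows of the input; those parameters are deliberately left out here because the abstract does not give them.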

Pages: 1028 - 1040
PubMed: 16796549

Published in print: June 2006

Authors

Aleksandr Morgulis, E. Michael Gertz, Alejandro A. Schäffer, Richa Agarwala
National Center for Biotechnology Information, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland.
