Class BlastLikeSAXParser

  • All Implemented Interfaces:
    org.xml.sax.XMLReader

    public class BlastLikeSAXParser
    extends java.lang.Object
    A facade class allowing for direct SAX2-like parsing of the native output from Blast-like bioinformatics software. Because the parser is SAX2 compliant, application writers can simply pass XML ContentHandlers to the parser in order to receive notification of SAX2 events.

    The SAX2 events produced are as if the input to the parser was an XML file validating against the biojava BlastLikeDataSetCollection DTD. There is no requirement for an intermediate conversion of native output to XML format. An application of the parsing framework, however, is to create XML format files from native output files.
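The conversion of SAX2 events into an XML file mentioned above can be sketched with the JDK's identity TransformerHandler, which serialises whatever events it receives. This is only an illustration under stated assumptions: the class name SaxEventsToXml and its methods are hypothetical, and the demo method uses the JDK's own XML parser as a stand-in event source so that the sketch runs without biojava. In real use, a BlastLikeSAXParser would be passed as the XMLReader, since it implements that interface.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of biojava.
class SaxEventsToXml {

    /** Serialises the SAX2 events that reader produces for source as XML text. */
    static String toXml(XMLReader reader, InputSource source) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            // An identity handler: every event it receives is written out unchanged.
            TransformerHandler serializer = factory.newTransformerHandler();
            serializer.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter xml = new StringWriter();
            serializer.setResult(new StreamResult(xml));

            reader.setContentHandler(serializer);
            reader.parse(source);
            return xml.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialise SAX events", e);
        }
    }

    /** Stand-in demo: the JDK XML parser plays the role of the event source. */
    static String demo(String xml) {
        try {
            SAXParserFactory parsers = SAXParserFactory.newInstance();
            parsers.setNamespaceAware(true);
            XMLReader standIn = parsers.newSAXParser().getXMLReader();
            return toXml(standIn, new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not create stand-in reader", e);
        }
    }
}
```

Because toXml depends only on the XMLReader interface, the same method should work with any SAX2 event source.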

    The biojava Blast-like parsing framework is designed to use minimal memory, so that, in principle, extremely large native outputs can be parsed and XML ContentHandlers can listen only for small amounts of information.
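A handler that listens for only a small amount of information might look like the following sketch, which counts start tags with a given name and keeps nothing else in memory. The class name HitCounter and the element name used in the usage note are assumptions for illustration; the real element names are defined by the BlastLikeDataSetCollection DTD. The countIn method drives the handler with the JDK XML parser as a stand-in for BlastLikeSAXParser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not part of biojava.
class HitCounter extends DefaultHandler {

    private final String elementName; // qualified name to count (an assumption)
    private int count;

    HitCounter(String elementName) {
        this.elementName = elementName;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // Only a counter is kept, so memory use does not grow with input size.
        if (elementName.equals(qName)) {
            count++;
        }
    }

    int getCount() {
        return count;
    }

    /** Stand-in driver: counts elementName in an XML string using the JDK parser. */
    static int countIn(String xml, String elementName) {
        try {
            XMLReader standIn = SAXParserFactory.newInstance()
                    .newSAXParser().getXMLReader();
            HitCounter counter = new HitCounter(elementName);
            standIn.setContentHandler(counter);
            standIn.parse(new InputSource(new StringReader(xml)));
            return counter.getCount();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

With a BlastLikeSAXParser, the same handler would be registered through setContentHandler before calling parse.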

    The framework currently supports parsing of native output from the following bioinformatics programs. If you are using versions of NCBI or WU Blast other than those listed below, consider setting the parsing mode to Lazy, which means parsing will be attempted if the program is recognised, regardless of version.

    • NCBI Blast version 2.0.11
    • NCBI Blast version 2.2.2
    • NCBI Blast version 2.2.3
    • WU-Blast version 2.0a19mp-washu
    • HMMER 2.1.1 hmmsearch
    Planned additional support
    • Support for HMMER hmmpfam is almost complete but has not been fully tested

    Notes to SAX driver writers

    The framework that this parser is built on is designed to be extensible with support for both different pieces of software (i.e. not just software that produces Blast-like output), and multiple versions of programs.

    This class inherits from the org.biojava.bio.program.sax.AbstractNativeAppSAXParser abstract base class. The abstract base class is a good place to start looking if you want to write new native application SAX2 parsers. This and related classes have only package-level visibility. Typically, application writers are expected to provide a facade class in this package (similar to the current class) to allow users access to functionality.

    NB Support for InputSource is incomplete: URLs are not resolved and therefore cannot be used as an InputSource. System pathnames, ByteStreams and CharacterStreams, however, are all supported.
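The three supported kinds of InputSource can be built with the standard org.xml.sax.InputSource constructors, as in the sketch below. The class and method names are hypothetical, and any pathname passed to fromPath is expected to be a local system pathname rather than a URL, in line with the note above.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of biojava.
class InputSourceExamples {

    /** Character stream: wraps report text that is already in memory. */
    static InputSource fromCharacters(String report) {
        return new InputSource(new StringReader(report));
    }

    /** Byte stream: wraps raw bytes, e.g. read from a socket or archive. */
    static InputSource fromBytes(byte[] report) {
        return new InputSource(new ByteArrayInputStream(report));
    }

    /** System identifier: a local pathname, not a URL. */
    static InputSource fromPath(String pathname) {
        return new InputSource(pathname);
    }
}
```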

    Copyright © 2000 Cambridge Antibody Technology.

    Primary author -

    • Simon Brocklehurst (CAT)
    Other authors -
    • Tim Dilks (CAT)
    • Colin Hardman (CAT)
    • Stuart Johnston (CAT)
    • Mathieu Wiepert (Mayo Foundation)
    • Travis Banks
    Version:
    1.0
    Author:
    Cambridge Antibody Technology (CAT), Travis Banks
    See Also:
    BlastLikeToXMLConverter
    • Constructor Summary

      Constructors 
      Constructor Description
      BlastLikeSAXParser()
      Initialises SAXParser, and sets default namespace prefix to "biojava".
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void addPrefixMapping​(java.lang.String poPrefix, java.lang.String poURI)
      Adds a namespace prefix to URI mapping as (key,value) pairs.
      protected void changeState​(int piState)
      Centralise changing of iState field to help with debugging.
      protected void characters​(char[] ch, int start, int length)
      Utility method to centralize the sending of a SAX characters message to a document handler.
      protected void endElement​(org.biojava.bio.program.sax.QName poQName)
      Utility method to centralize the sending of a SAX endElement message to a document handler.
      org.xml.sax.ContentHandler getContentHandler()
      Return the content handler.
      protected java.io.BufferedReader getContentStream​(org.xml.sax.InputSource poSource)
      Create a stream from an InputSource, picking the correct stream according to order of precedence.
      org.xml.sax.DTDHandler getDTDHandler()
      Do-nothing implementation of interface method
      org.xml.sax.EntityResolver getEntityResolver()
      Do-nothing implementation of interface method
      org.xml.sax.ErrorHandler getErrorHandler()
      Do-nothing implementation of interface method
      boolean getFeature​(java.lang.String poName)
      Do-nothing implementation of interface method
      java.lang.String getNamespacePrefix()
      Returns the namespace prefix currently in use.
      boolean getNamespacePrefixes()
      Support SAX2 configuration of namespace support of parser.
      boolean getNamespaces()
      Support SAX2 configuration of namespace support of parser.
      java.lang.Object getProperty​(java.lang.String name)
      Do-nothing implementation of interface method
      java.lang.String getURIFromPrefix​(java.lang.String poPrefix)
      Gets the URI for a namespace prefix, given that prefix, or null if the prefix is not recognised.
      void parse​(java.lang.String poSystemId)
      Full implementation of interface method.
      void parse​(org.xml.sax.InputSource poSource)
      parse initiates the parsing operation.
      java.lang.String prefix​(java.lang.String poElementName)
      Given an unprefixed element name, returns a new element name with a namespace prefix
      void setContentHandler​(org.xml.sax.ContentHandler poHandler)
      Allow an application to register a content event handler.
      void setDTDHandler​(org.xml.sax.DTDHandler handler)
      Do-nothing implementation of interface method
      void setEntityResolver​(org.xml.sax.EntityResolver resolver)
      Do-nothing implementation of interface method
      void setErrorHandler​(org.xml.sax.ErrorHandler handler)
      Do-nothing implementation of interface method
      void setFeature​(java.lang.String poName, boolean value)
      Handles support for ReasoningDomain and Namespace-prefixes
      void setModeLazy()
      Setting the mode to lazy means that, if the program is recognised, e.g. WU-TBlastX, then parsing will be attempted even if the particular version is not recognised.
      void setModeStrict()
      This is the default: parsing will be attempted only if both the program, e.g. NCBI BlastP, and a particular version are recognised as being supported.
      void setNamespacePrefix​(java.lang.String poPrefix)  
      void setProperty​(java.lang.String name, java.lang.Object value)
      Do-nothing implementation of interface method
      protected void startElement​(org.biojava.bio.program.sax.QName poQName, org.xml.sax.Attributes atts)
      Utility method to centralize the sending of a SAX startElement message to a document handler.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • oHandler

        protected org.xml.sax.ContentHandler oHandler
      • tNamespaces

        protected boolean tNamespaces
      • tNamespacePrefixes

        protected boolean tNamespacePrefixes
      • oNamespacePrefix

        protected java.lang.String oNamespacePrefix
      • oFullNamespacePrefix

        protected java.lang.String oFullNamespacePrefix
      • iState

        protected int iState
    • Constructor Detail

      • BlastLikeSAXParser

        public BlastLikeSAXParser()
        Initialises SAXParser, and sets default namespace prefix to "biojava".
    • Method Detail

      • parse

        public void parse​(org.xml.sax.InputSource poSource)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        parse initiates the parsing operation.
        Specified by:
        parse in interface org.xml.sax.XMLReader
        Parameters:
        poSource - an InputSource.
        Throws:
        java.io.IOException - if an error occurs.
        org.xml.sax.SAXException - if an error occurs.
      • setModeStrict

        public void setModeStrict()
        This is the default: parsing will be attempted only if both the program, e.g. NCBI BlastP, and a particular version are recognised as being supported.
      • setModeLazy

        public void setModeLazy()
        Setting the mode to lazy means that, if the program is recognised, e.g. WU-TBlastX, then parsing will be attempted even if the particular version is not recognised. Using this option is more likely to result in erroneous parsing than if the strict mode is used.
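The difference between the two modes can be pictured as a simple gate on program and version. The sketch below is not the library's implementation: the class VersionGate, its lookup table and the program keys are assumptions made purely to illustrate the rule described above.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of strict versus lazy mode, not biojava code.
class VersionGate {

    // Illustrative table: program name to versions treated as supported.
    private static final Map<String, Set<String>> SUPPORTED = Map.of(
            "NCBI Blast", Set.of("2.0.11", "2.2.2", "2.2.3"),
            "WU-Blast", Set.of("2.0a19mp-washu"));

    /**
     * Strict mode needs both program and version to be recognised;
     * lazy mode needs only the program to be recognised.
     */
    static boolean shouldParse(String program, String version, boolean lazy) {
        Set<String> versions = SUPPORTED.get(program);
        if (versions == null) {
            return false; // unrecognised program: never parsed
        }
        return lazy || versions.contains(version);
    }
}
```

Under this reading, an unlisted version is rejected in strict mode but attempted in lazy mode, which is why lazy mode carries a higher risk of erroneous parsing.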
      • setContentHandler

        public void setContentHandler​(org.xml.sax.ContentHandler poHandler)
        Allow an application to register a content event handler. If the application does not register a content handler, all content events reported by the SAX parser will be silently ignored.

        Applications may register a new or different handler in the middle of a parse, and the SAX parser must begin using the new handler immediately.

        Specified by:
        setContentHandler in interface org.xml.sax.XMLReader
        Parameters:
        poHandler - the XML content handler
        Throws:
        java.lang.NullPointerException - If the handler argument is null
      • getContentHandler

        public org.xml.sax.ContentHandler getContentHandler()
        Return the content handler.
        Specified by:
        getContentHandler in interface org.xml.sax.XMLReader
        Returns:
        The current content handler, or null if none has been registered.
      • parse

        public void parse​(java.lang.String poSystemId)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Full implementation of interface method.
        Specified by:
        parse in interface org.xml.sax.XMLReader
        Throws:
        java.io.IOException
        org.xml.sax.SAXException
      • getFeature

        public boolean getFeature​(java.lang.String poName)
                           throws org.xml.sax.SAXNotRecognizedException,
                                  org.xml.sax.SAXNotSupportedException
        Do-nothing implementation of interface method
        Specified by:
        getFeature in interface org.xml.sax.XMLReader
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setFeature

        public void setFeature​(java.lang.String poName,
                               boolean value)
                        throws org.xml.sax.SAXNotRecognizedException,
                               org.xml.sax.SAXNotSupportedException
        Handles support for ReasoningDomain and Namespace-prefixes
        Specified by:
        setFeature in interface org.xml.sax.XMLReader
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
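In SAX2, namespace handling is configured through two standard feature names, which are presumably the ones this method responds to. The sketch below sets both on any XMLReader. The class NamespaceFeatures is hypothetical, and the demo method reads the values back from the JDK parser used as a stand-in; with BlastLikeSAXParser, getFeature is documented as a do-nothing method, so getNamespaces() and getNamespacePrefixes() would be the way to read the settings back.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of biojava.
class NamespaceFeatures {

    // Standard SAX2 feature names.
    static final String NAMESPACES =
            "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES =
            "http://xml.org/sax/features/namespace-prefixes";

    /** Applies both namespace-related features to the given reader. */
    static void configure(XMLReader reader, boolean namespaces, boolean prefixes)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        reader.setFeature(NAMESPACES, namespaces);
        reader.setFeature(NAMESPACE_PREFIXES, prefixes);
    }

    /** Stand-in demo: configures the JDK parser and reads the features back. */
    static boolean[] demo(boolean namespaces, boolean prefixes) {
        try {
            XMLReader standIn = SAXParserFactory.newInstance()
                    .newSAXParser().getXMLReader();
            configure(standIn, namespaces, prefixes);
            return new boolean[] {
                standIn.getFeature(NAMESPACES),
                standIn.getFeature(NAMESPACE_PREFIXES)
            };
        } catch (Exception e) {
            throw new IllegalStateException("Feature configuration failed", e);
        }
    }
}
```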
      • getProperty

        public java.lang.Object getProperty​(java.lang.String name)
                                     throws org.xml.sax.SAXNotRecognizedException,
                                            org.xml.sax.SAXNotSupportedException
        Do-nothing implementation of interface method
        Specified by:
        getProperty in interface org.xml.sax.XMLReader
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setProperty

        public void setProperty​(java.lang.String name,
                                java.lang.Object value)
                         throws org.xml.sax.SAXNotRecognizedException,
                                org.xml.sax.SAXNotSupportedException
        Do-nothing implementation of interface method
        Specified by:
        setProperty in interface org.xml.sax.XMLReader
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setEntityResolver

        public void setEntityResolver​(org.xml.sax.EntityResolver resolver)
        Do-nothing implementation of interface method
        Specified by:
        setEntityResolver in interface org.xml.sax.XMLReader
      • getEntityResolver

        public org.xml.sax.EntityResolver getEntityResolver()
        Do-nothing implementation of interface method
        Specified by:
        getEntityResolver in interface org.xml.sax.XMLReader
      • setDTDHandler

        public void setDTDHandler​(org.xml.sax.DTDHandler handler)
        Do-nothing implementation of interface method
        Specified by:
        setDTDHandler in interface org.xml.sax.XMLReader
      • getDTDHandler

        public org.xml.sax.DTDHandler getDTDHandler()
        Do-nothing implementation of interface method
        Specified by:
        getDTDHandler in interface org.xml.sax.XMLReader
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler handler)
        Do-nothing implementation of interface method
        Specified by:
        setErrorHandler in interface org.xml.sax.XMLReader
      • getErrorHandler

        public org.xml.sax.ErrorHandler getErrorHandler()
        Do-nothing implementation of interface method
        Specified by:
        getErrorHandler in interface org.xml.sax.XMLReader
      • startElement

        protected void startElement​(org.biojava.bio.program.sax.QName poQName,
                                    org.xml.sax.Attributes atts)
                             throws org.xml.sax.SAXException
        Utility method to centralize the sending of a SAX startElement message to a document handler.
        Parameters:
        poQName - a QName value
        atts - an Attributes value
        Throws:
        org.xml.sax.SAXException - if an error occurs
      • endElement

        protected void endElement​(org.biojava.bio.program.sax.QName poQName)
                           throws org.xml.sax.SAXException
        Utility method to centralize the sending of a SAX endElement message to a document handler.
        Parameters:
        poQName - a QName value
        Throws:
        org.xml.sax.SAXException - if an error occurs
      • characters

        protected void characters​(char[] ch,
                                  int start,
                                  int length)
                           throws org.xml.sax.SAXException
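        Utility method to centralize the sending of a SAX characters message to a document handler.
        Parameters:
        ch - the characters to send
        start - the start position in the character array
        length - the number of characters to use from the array
        Throws:
        org.xml.sax.SAXException - if an error occurs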
      • getNamespaces

        public boolean getNamespaces()
        Support SAX2 configuration of namespace support of parser.
      • getNamespacePrefixes

        public boolean getNamespacePrefixes()
        Support SAX2 configuration of namespace support of parser.
      • addPrefixMapping

        public void addPrefixMapping​(java.lang.String poPrefix,
                                     java.lang.String poURI)
        Adds a namespace prefix to URI mapping as (key,value) pairs. This mapping can be looked up later to get URIs on request using the getURIFromPrefix method.
        Parameters:
        poPrefix - a String representation of the namespace prefix
        poURI - a String representation of the URI for the namespace prefix.
      • getURIFromPrefix

        public java.lang.String getURIFromPrefix​(java.lang.String poPrefix)
        Gets the URI for a namespace prefix, given that prefix, or null if the prefix is not recognised.
        Parameters:
        poPrefix - the namespace prefix.
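The (key,value) behaviour described for addPrefixMapping and getURIFromPrefix can be sketched with a plain map. This is an assumption about the general shape only, not the library's actual storage; the class name PrefixRegistry and the URI in the usage note are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not biojava code.
class PrefixRegistry {

    // Namespace prefix (key) to namespace URI (value).
    private final Map<String, String> uriByPrefix = new HashMap<>();

    /** Records the URI for a prefix, replacing any earlier mapping. */
    void addPrefixMapping(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
    }

    /** Returns the URI for a prefix, or null if the prefix is not recognised. */
    String getURIFromPrefix(String prefix) {
        return uriByPrefix.get(prefix);
    }
}
```

For example, after addPrefixMapping("biojava", "http://www.biojava.org"), a lookup of "biojava" returns that URI, while a lookup of any unregistered prefix returns null.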
      • setNamespacePrefix

        public void setNamespacePrefix​(java.lang.String poPrefix)
        Parameters:
        poPrefix - a String value
      • getNamespacePrefix

        public java.lang.String getNamespacePrefix()
        Returns the namespace prefix currently in use.
        Returns:
        a String value
      • prefix

        public java.lang.String prefix​(java.lang.String poElementName)
        Given an unprefixed element name, returns a new element name with a namespace prefix
        Returns:
        a String value
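A minimal sketch of how an unprefixed name could be qualified follows. It assumes the usual XML convention of joining prefix and local name with a colon, and it assumes that an empty prefix leaves the name unchanged; neither detail is confirmed by this page, and the real method may also take the namespace feature settings into account. The class ElementNamer is hypothetical.

```java
// Hypothetical illustration, not biojava code.
class ElementNamer {

    // The documented default prefix is "biojava".
    private String namespacePrefix = "biojava";

    void setNamespacePrefix(String prefix) {
        this.namespacePrefix = prefix;
    }

    String getNamespacePrefix() {
        return namespacePrefix;
    }

    /** Joins prefix and element name with a colon (assumed convention). */
    String prefix(String elementName) {
        if (namespacePrefix.isEmpty()) {
            return elementName; // assumption: no prefix means no colon
        }
        return namespacePrefix + ":" + elementName;
    }
}
```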
      • getContentStream

        protected java.io.BufferedReader getContentStream​(org.xml.sax.InputSource poSource)
        Create a stream from an InputSource, picking the correct stream according to order of precedence.
        Parameters:
        poSource - an InputSource value
        Returns:
        a BufferedReader value
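The order of precedence is not spelled out here. The sketch below assumes the order that the SAX InputSource documentation describes for parsers in general: character stream first, then byte stream, then system identifier. The system identifier is opened as a local file, matching the class note that URLs are not resolved. The class ContentStreams is hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;

// Hypothetical illustration, not biojava code.
class ContentStreams {

    /** Picks a stream from the source using the assumed order of precedence. */
    static BufferedReader open(InputSource source) throws IOException {
        Reader chars = source.getCharacterStream();
        if (chars != null) {
            return new BufferedReader(chars); // 1. character stream
        }
        InputStream bytes = source.getByteStream();
        if (bytes != null) {                  // 2. byte stream
            String encoding = source.getEncoding();
            return new BufferedReader(encoding != null
                    ? new InputStreamReader(bytes, encoding)
                    : new InputStreamReader(bytes));
        }
        String systemId = source.getSystemId();
        if (systemId != null) {               // 3. local pathname, not a URL
            return new BufferedReader(new FileReader(systemId));
        }
        throw new IOException("InputSource has no stream or system identifier");
    }

    /** Convenience for demos: returns the first line of the chosen stream. */
    static String firstLine(InputSource source) {
        try (BufferedReader reader = open(source)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```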
      • changeState

        protected void changeState​(int piState)
        Centralise changing of iState field to help with debugging. E.g. printing out value etc. All changes to iState should be made through this method.
        Parameters:
        piState - an int value
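Routing every state change through one method gives a single place to log or set a breakpoint. The sketch below illustrates the idea with a hypothetical StateTracker class that records each transition; the state values and the trace list are assumptions for demonstration, not part of the parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not biojava code.
class StateTracker {

    protected int iState; // current parser state

    // Transition history, kept only to make the debugging benefit visible.
    private final List<String> trace = new ArrayList<>();

    /** The only place where iState is assigned. */
    protected void changeState(int piState) {
        trace.add(iState + " -> " + piState);
        iState = piState;
    }

    List<String> getTrace() {
        return trace;
    }
}
```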