pyHMMER

Cython bindings and Python interface to HMMER3.

Overview

HMMER is a biological sequence analysis tool that uses profile hidden Markov models to search for sequence homologs. HMMER3 is maintained by members of the Eddy/Rivas Laboratory at Harvard University.

pyhmmer is a Python module, implemented using the Cython language, that provides bindings to HMMER3. It directly interacts with the HMMER internals, which has the following advantages over CLI wrappers:

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyhmmer as a dependency to your project, and stop worrying about the HMMER binaries being properly set up on the end user's machine.

  • no intermediate files: Everything happens in memory, in Python objects you control, making it easier to pass your inputs to HMMER without writing them to a temporary file. Output retrieval is also done in memory, via instances of the pyhmmer.plan7.TopHits class.

  • no input formatting: The Easel object model is exposed in the pyhmmer.easel module, and you can build a Sequence object yourself to pass to the HMMER pipeline. This is useful if your sequences are already loaded in memory, for instance because you obtained them from another Python library (such as Pyrodigal or Biopython).

  • no output formatting: HMMER3 is notorious for its numerous output files and its fixed-width tabular output, which is hard to parse (even Bio.SearchIO.HmmerIO struggles with some sequences).

  • efficient: Using pyhmmer to launch hmmsearch on sequences and HMMs stored on disk is typically faster than directly using the hmmsearch binary. pyhmmer.hmmer.hmmsearch uses a different parallelisation strategy from the hmmsearch binary shipped with HMMER, which helps get the most out of multiple CPUs.
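
As a minimal sketch of the "no input formatting" point, the snippet below turns a plain Python string (for instance, residues obtained from another library) into a digital sequence without touching the disk. The helper name to_digital_sequence is hypothetical and not part of pyhmmer, the residues are a made-up example, and the sequence name is left unset for brevity; a real pipeline would normally set it.

```python
def to_digital_sequence(residues: str):
    """Wrap a protein string in an Easel sequence and digitize it.

    Hypothetical helper for illustration; not part of the pyhmmer API.
    """
    # Imported inside the function so this module can still be loaded
    # on a machine where pyhmmer is not installed.
    import pyhmmer

    # Amino-acid alphabet used to encode the residues.
    alphabet = pyhmmer.easel.Alphabet.amino()

    # Build the sequence directly from an in-memory string:
    # no FASTA file or temporary file is involved.
    text_seq = pyhmmer.easel.TextSequence(sequence=residues)

    # The HMMER pipeline works on digital (integer-encoded) sequences.
    return text_seq.digitize(alphabet)


if __name__ == "__main__":
    digital = to_digital_sequence("MKTAYIAKQR")
    print(len(digital), "residues digitized")
```

Digital sequences built this way can be collected and handed to the search functions in pyhmmer.hmmer; the exact container those functions expect may differ between releases, so check the documentation for the installed version.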

Setup

Run pip install pyhmmer in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install pyhmmer.
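
One possible way to confirm that the installation succeeded is to query the package metadata from Python. This is only a sketch using the standard library; installed_version is a hypothetical helper name, not something provided by pyhmmer.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pyhmmer")
    if version is None:
        print("pyhmmer is not installed; try: pip install pyhmmer")
    else:
        print(f"pyhmmer {version} is installed")
```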

License

This library is provided under the MIT License. The HMMER3 and Easel code is available under the BSD 3-clause license, which allows redistribution of their sources in the pyhmmer distribution.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original HMMER authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.