{ "cells": [ { "cell_type": "markdown", "id": "7ef61cee-b754-40bb-9a5d-fc5af9f3afd0", "metadata": {}, "source": [ "# Recipes" ] }, { "cell_type": "markdown", "id": "e4ec17e8-b02c-48c7-aa56-feae2c375b91", "metadata": {}, "source": [ "This page references recipes for common operations for the PyHMMER API, similarly to the `itertools` [recipes](https://docs.python.org/3/library/itertools.html#itertools-recipes) from the Python documentation." ] }, { "cell_type": "markdown", "id": "f895db71-5408-48d7-b5f2-909e2900ea47", "metadata": {}, "source": [ "## Loading multiple HMMs" ] }, { "cell_type": "markdown", "id": "4914a6a6-7d3e-4a6d-ab7b-8cb61b8dc1ac", "metadata": {}, "source": [ "An adapter for loading several `HMMFile` objects into a single object to pass to .\n", "\n", "
\n", " \n", "Credits\n", " \n", "The original implementation proposed by [Zachary Kurtz](https://github.com/zdk123) in [#24](https://github.com/althonos/pyhmmer/issues/24), which was failing when too many file-descriptors where open on some OS ([#48](https://github.com/althonos/pyhmmer/issues/24)), was updated to keep at most a single file open at a time.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": null, "id": "985ae6b7-2a63-4146-9b98-a34801ca162b", "metadata": {}, "outputs": [], "source": [ "import contextlib\n", "import itertools\n", "import os\n", "import typing\n", "\n", "from pyhmmer.plan7 import HMMFile, HMM\n", "\n", "class HMMFiles(typing.Iterable[HMM]):\n", " def __init__(self, *files: typing.Union[str, bytes, os.PathLike]):\n", " self.files = files\n", " \n", " def __iter__(self):\n", " for file in self.files:\n", " with HMMFile(file) as hmm_file:\n", " yield from hmm_file" ] }, { "cell_type": "markdown", "id": "dfcb96c9-85c1-4f8d-94a3-ddf274d959ec", "metadata": {}, "source": [ "To use it with `hmmsearch`, simply create a `HMMFiles` object with the paths for the different HMM files to concatenate. They will be read in the order given as argument." ] }, { "cell_type": "code", "execution_count": null, "id": "0a0a3b40-9556-415d-8a73-ed765df33fa4", "metadata": {}, "outputs": [], "source": [ "import pyhmmer\n", "from pyhmmer.easel import SequenceFile\n", "\n", "with SequenceFile(\"data/seqs/938293.PRJEB85.HG003687.faa\", digital=True) as sequences:\n", " targets = sequences.read_block()\n", "\n", "hmm_files = HMMFiles(\"data/hmms/txt/PKSI-AT.hmm\", \"data/hmms/txt/LuxC.hmm\")\n", "all_hits = list(pyhmmer.hmmsearch(hmm_files, targets))\n", " \n", "print(\"HMMs searched:\", len(all_hits))\n", "print(\"Hits found: \", sum(len(hits) for hits in all_hits))" ] }, { "cell_type": "markdown", "id": "d447f43b-3a46-461d-a750-ac5725705b06", "metadata": {}, "source": [ "If your filenames are stored in a list, just use the [*splat* operator](https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists) to unpack it:" ] }, { "cell_type": "code", "execution_count": null, "id": "16f217e0-1b4c-4a54-9149-27b59c3b46dd", "metadata": {}, "outputs": [], "source": [ "filenames = [\"data/hmms/txt/PKSI-AT.hmm\", \"data/hmms/txt/LuxC.hmm\"]\n", "hmm_files = HMMFiles(*filenames)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 5 }