{ "cells": [ { "cell_type": "markdown", "id": "4bf9e5a3-df0f-414b-b0ea-25f8ae4e88cd", "metadata": {}, "source": [ "# Run an iterative search to build an HMM for rhodopsins" ] }, { "cell_type": "markdown", "id": "c0c7cc2b-06b4-41af-a0ff-6fdee1e77b98", "metadata": {}, "source": [ "This example shows a workflow for building an HMM from a seed sequence through an iterative search in a sequence database, similar to the JackHMMER methodology. The second half also introduces extra capabilities of the Python wrapper that allow the search parameters to be modified dynamically.\n", "\n", "The goal of this example is to build an HMM specific to [halorhodopsins](https://en.wikipedia.org/wiki/Halorhodopsin), a class of light-activated ion pumps found in Archaea. The difficulty of the task comes from their similarity to [bacteriorhodopsins](https://en.wikipedia.org/wiki/Bacteriorhodopsin), which have a comparable conserved structure but a different activity.\n", "\n", "![Classes of rhodopsins](https://els-jbs-prod-cdn.jbs.elsevierhealth.com/cms/attachment/612410/4909711/gr1.jpg)\n", "\n", "*Figure from* [Zhang et al. (2011)](https://www.cell.com/fulltext/S0092-8674(11)01502-9#gr1).\n", "\n", "In a second step, we will build a broader HMM for any kind of rhodopsin, and show how to implement a custom hit selection function. This allows the automatic exclusion of false positives coming from other organisms, using an additional filter based on taxonomy.\n", "\n", "