Plan7¶
High-level interface to the Plan7 data model.
Plan7 is the model architecture used by HMMER since HMMER2.
See also
Details about the Plan 7 architecture in the HMMER documentation.
Background Model¶
Builder¶
-
class pyhmmer.plan7.Builder¶ A factory for constructing new HMMs from raw sequences.
-
build(sequence, background, popen=0.02, pextend=0.4)¶ Build a new HMM from sequence using the builder configuration.
- Parameters
sequence (DigitalSequence) – A single biological sequence in digital mode to build an HMM with.
background (pyhmmer.plan7.Background) – The background model to use to create the HMM.
popen (float) – The gap open probability to use with the score system. Defaults to 0.02.
pextend (float) – The gap extend probability to use with the score system. Defaults to 0.4.
- Raises
ValueError – When either sequence or background has the wrong alphabet for this builder.
-
Hits¶
-
class pyhmmer.plan7.Hit¶ A high-scoring database hit found by the comparison pipeline.
-
class pyhmmer.plan7.TopHits¶ A ranked list of top-scoring hits.
TopHits are thresholded using the parameters from the pipeline, and are sorted by key when you obtain them from a Pipeline instance:
>>> abc = thioesterase.alphabet
>>> hits = Pipeline(abc).search_hmm(thioesterase, proteins)
>>> hits.is_sorted()
True
Use len to query the number of top hits, and the usual indexing notation to extract a particular Hit:
>>> len(hits)
1
>>> hits[0].name
b'938293.PRJEB85.HG003687_113'
-
clear()¶ Free internals to allow reuse in a new pipeline run.
-
HMM¶
HMM File¶
-
class pyhmmer.plan7.HMMFile¶ A wrapper around a file (or database), storing serialized HMMs.
Pipeline¶
-
class pyhmmer.plan7.Pipeline¶ An HMMER3 accelerated sequence/profile comparison pipeline.
-
clear()¶ Reset the pipeline configuration to its default state.
-
search_hmm(query, sequences)¶ Run the pipeline using a query HMM against a sequence database.
- Parameters
query (HMM) – The HMM object to use to query the sequence database.
sequences (iterable of DigitalSequence) – The sequences to query with the HMM. For instance, pass a SequenceFile in digital mode to read from disk iteratively.
- Returns
TopHits – the hits found in the sequence database.
- Raises
ValueError – When the alphabet of the current pipeline does not match the alphabet of the given HMM.
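As a minimal sketch of the call pattern described for search_hmm, the helper below runs the search on an already-constructed pipeline and collects the hit names using only len and indexing, as documented for TopHits. The name top_hit_names is hypothetical and not part of pyhmmer; the pipeline is assumed to have been created with the same alphabet as the query, otherwise the ValueError described above propagates to the caller.

```python
def top_hit_names(pipeline, query_hmm, sequences):
    """Return the names of the hits reported for `query_hmm`, in ranked order.

    `pipeline` is expected to behave like `pyhmmer.plan7.Pipeline`; this
    helper is illustrative only and is not part of the library.
    """
    hits = pipeline.search_hmm(query_hmm, sequences)
    # TopHits supports len() and integer indexing, so nothing else is assumed.
    return [hits[i].name for i in range(len(hits))]
```

With real objects this would be called as `top_hit_names(Pipeline(hmm.alphabet), hmm, sequences)`, mirroring the TopHits example.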
-
search_seq(query, sequences, builder=None)¶ Run the pipeline using a query sequence against a sequence database.
- Parameters
query (DigitalSequence) – The sequence object to use to query the sequence database.
sequences (iterable of DigitalSequence) – The sequences to query. Pass a SequenceFile instance in digital mode to read from disk iteratively.
builder (Builder, optional) – An HMM builder to use to convert the query to an HMM. If None is given, a default one is used.
- Returns
TopHits – the hits found in the sequence database.
- Raises
ValueError – When the alphabet of the current pipeline does not match the alphabet of the given query.
-
Z¶ The number of effective targets searched.
It is used to compute the independent E-value for each domain and for an entire hit. If None, the number is set automatically after all the comparisons have been done. Otherwise, it can be set to an arbitrary number.
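The role of Z can be illustrated with the usual HMMER relation between a P-value and an E-value, E = P × Z. The function below is a plain-Python sketch of that arithmetic and does not call pyhmmer; evalue and its parameter names are hypothetical.

```python
def evalue(pvalue, z):
    """Scale a P-value by the number of effective targets `z`."""
    if not 0.0 <= pvalue <= 1.0:
        raise ValueError("pvalue must be a probability in [0, 1]")
    if z < 0:
        raise ValueError("z must be non-negative")
    return pvalue * z


# The same P-value is less significant when more targets are searched.
small_db = evalue(1e-6, 1_000)
large_db = evalue(1e-6, 1_000_000)
```

Fixing Z to a constant instead of leaving it at None keeps E-values comparable between searches run against databases of different sizes.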
-
Profile¶
-
class pyhmmer.plan7.Profile¶ A Plan7 search profile.
-
clear()¶ Clear internal buffers to reuse the profile without reallocation.
-
configure(hmm, background, L, multihit=True, local=True)¶ Configure a search profile using the given models.
- Parameters
hmm (pyhmmer.plan7.HMM) – The model HMM with core probabilities.
background (pyhmmer.plan7.Background) – The null background model.
L (int) – The expected target sequence length.
multihit (bool) – Whether or not to use multihit modes.
local (bool) – Whether or not to use local alignment modes.
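A short sketch of how configure might be combined with the is_local and is_multihit queries of the same class. The helper name configure_for_length is hypothetical, and the profile argument is only assumed to behave like pyhmmer.plan7.Profile; how a Profile is constructed is not covered here.

```python
def configure_for_length(profile, hmm, background, length,
                         multihit=True, local=True):
    """Configure `profile` and return its (is_local, is_multihit) flags.

    `profile` is expected to behave like `pyhmmer.plan7.Profile`; this
    helper is illustrative only and is not part of the library.
    """
    # The first three arguments are passed positionally, following the
    # documented signature configure(hmm, background, L, ...).
    profile.configure(hmm, background, length, multihit=multihit, local=local)
    return profile.is_local(), profile.is_multihit()
```

Returning the flags right after configuration gives a cheap way to check that the profile ended up in the intended alignment mode before it is converted with optimized().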
-
copy()¶ Return a copy of the profile with the exact same configuration.
-
is_local()¶ Return whether or not the profile is in a local alignment mode.
-
is_multihit()¶ Return whether or not the profile is in a multihit alignment mode.
-
optimized()¶ Convert the profile to a platform-specific optimized profile.
- Returns
OptimizedProfile – The platform-specific optimized profile built using the configuration of this profile.
-
-
class pyhmmer.plan7.OptimizedProfile¶ An optimized profile that uses platform-specific instructions.
-
copy()¶ Create an exact copy of the optimized profile.
-
is_local()¶ Return whether or not the profile is in a local alignment mode.
-
write(fh_filter, fh_profile)¶ Write an optimized profile to two separate files.
HMMER implements an acceleration pipeline using several scoring algorithms. Parameters for MSV (the ungapped Multiple Segment Viterbi filter) are saved independently to the fh_filter handle, while the rest of the profile is saved to fh_profile.
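The two-handle layout can be sketched as follows. The helper name save_optimized and the two path arguments are hypothetical, and opening both handles in binary mode is an assumption, since the description above only speaks of "handles".

```python
def save_optimized(optimized, filter_path, profile_path):
    """Write `optimized` to two files: MSV filter parameters and the rest.

    `optimized` is expected to behave like `pyhmmer.plan7.OptimizedProfile`;
    this helper is illustrative only and is not part of the library.
    """
    # Binary mode is an assumption; both handles are closed on exit,
    # even if write() raises.
    with open(filter_path, "wb") as fh_filter, \
            open(profile_path, "wb") as fh_profile:
        optimized.write(fh_filter, fh_profile)
```

Opening both handles in a single with statement avoids leaving one file open when opening the other one fails.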
-