Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
v0.10.15 - 2024-10-08
Added
query property to TopHits, referencing the original object used to create the TopHits (#76).
Changed
Require the query object to create a TopHits object.
Make TopHits generic over its query property.
Deprecate old query properties of TopHits (query_name, query_length, query_accession).
Removed
Detection of SSE flush from setup.py (#71).
v0.10.14 - 2024-07-16
Added
Detection of SSE flush modes to setup.py for possible performance gains on x86 platforms.
Changed
Migrate documentation to pydata-sphinx-theme.
Fixed
Documentation examples not using permanent resource links.
v0.10.13 - 2024-06-19
Changed
Let the AlphabetMismatch error account for an unknown actual alphabet.
Make HMMFile and HMMPressedFile raise AlphabetMismatch on files with mixed alphabets.
Fixed
Avoid calling fclose with null pointers in Sequence.write and MSA.write.
v0.10.12 - 2024-04-25
Fixed
HMM.__setstate__ not properly extracting the cutoff from pickle state for some HMMs (#67).
Changed
Update and remove some test files to reduce size of distributed package data.
v0.10.11 - 2024-03-27
Fixed
Compilation of Easel and HMMER code not using SSE4.1 extensions.
v0.10.10 - 2024-03-18 - YANKED
Fixed
Implement write function for fopencookie with off_t instead of off64_t for compatibility.
Fix handling of NULL buffers passed to read and write methods of fopencookie.
v0.10.9 - 2024-03-12 - YANKED
Fixed
Reallocation issue causing segmentation faults in nhmmer with more than 64 sequences (#62).
v0.10.8 - 2024-03-06 - YANKED
Added
Getter to access the strand of a Domain produced by a LongTargetsPipeline.
Changed
Display model and cutoff names in MissingCutoffs error message, if any.
Allow LongTargetsPipeline to be configured with window length and beta parameters.
Make nhmmer use the window length and beta from the options when creating a Builder.
Fixed
nhmmer not computing E-values for non-default window lengths (moshi4/pybarrnap#2).
SequenceFile and MSAFile crashing with a segmentation fault when given the path to a folder rather than a file.
v0.10.7 - 2024-03-04 - YANKED
Added
Pre-compiled wheels for PyPy 3.10.
Fixed
Invalid pointer cast in __getbuffer__ method of Matrix and Vector objects.
Remaining tests failing to run on missing importlib-resources.
pyhmmer.hmmer dispatchers possibly dead-locking on background thread errors (#60).
v0.10.6 - 2024-02-20 - YANKED
Added
armv7 and aarch64 to the PKGBUILD architectures.
Changed
SSIReader and SSIWriter constructors now accept path-like objects.
Skip tests depending on importlib.resources.files when it is not available on the host machine.
Fixed
Memory leak caused by alphabet allocation in Pipeline._scan_loop_file.
v0.10.5 - 2024-02-16 - YANKED
Added
Alignment properties to get the original lengths of the sequence and HMM being stored.
Hit.length property storing the length of the hit sequence (or HMM).
TopHits.query_length storing the length of the hit HMM (or query).
Alignment.posterior_probabilities property showing an encoded representation of posteriors (#59, by @arajkovic).
Trace.score method to compute a trace score from a given profile and sequence.
Alignment.__sizeof__ implementation leveraging p7_alidisplay_SizeOf.
Fixed
Cutoffs proxy objects not recording their owner to prevent deallocation.
Avoid GIL re-acquisition in GeneticCode.translate.
Query metadata not being recorded in Hits obtained from daemon.Client.
Empty MatrixU8 creation attempting zero-allocation.
VectorU8.zeros allocating 4x more memory than required.
Memory leak caused by string duplication in __getbuffer__ methods of Matrix and Vector types.
v0.10.4 - 2023-10-29 - YANKED
Added
residue_markups argument to TextSequence and DigitalSequence constructors.
__reduce__ implementation to TextSequence, DigitalSequence, TextSequenceBlock and DigitalSequenceBlock.
Changed
Handling of easel I/O methods to avoid implicit GIL acquisition for error checking.
Fixed
Syntax errors in type annotation files.
v0.10.3 - 2023-10-22 - YANKED
Added
Out-of-band pickle serialization of Bitfield objects.
Getters for float attributes and forward/backward parameters of OptimizedProfile.
InvalidHMM error raised by HMM.validate.
Changed
Mark HMM.zero method as noexcept.
Increase size of buffer for the query queue in the hmmer dispatcher.
Fixed
Unneeded semaphore in pyhmmer.hmmer message passing implementation.
Broken assertion in Bitfield._from_raw_bytes.
Relax tolerance of HMM validation in TraceAligner.align_traces.
v0.10.2 - 2023-08-20 - YANKED
Fixed
Invalid buffer write in DigitalSequenceBlock.translate (#50).
v0.10.1 - 2023-08-17 - YANKED
Added
HMM.set_consensus method to set the consensus for a model or compute it from the emission probabilities.
Fixed
Platform detection for MacOS and Armv7 platforms in setup.py.
pyhmmer.plan7.HMM constructor setting a consensus string forcefully.
v0.10.0 - 2023-08-16 - YANKED
Added
Support for compiling wheels for Aarch64 and NEON-enabled Arm platforms.
Changed
Updated HMMER to v3.4.
Updated Easel to v0.49.
Use cibuildwheel to build wheel distributions.
Fixed
Patch missing PyInterpreterState_GetID preventing the package from working on PyPy 3.9.
v0.9.0 - 2023-08-03
Added
TopHits.mode property showing from which pipeline mode (search or scan) the hits were obtained.
Changed
Updated the code for Cython v3.0.
Fixed
v0.8.2 - 2023-06-07
Added
Bracket-style repr implementation to HMM, Profile and OptimizedProfile showing model alphabet, length and name.
MissingCutoffs and InvalidParameter exceptions inheriting ValueError.
Changed
Replace pthread locks with PyThread API for synchronizing models in OptimizedProfileBlock.
Fixed
Sequence length extraction in LongTargetsPipeline.search_hmm (#42).
LongTargetsPipeline.search_msa not building a HMM with Builder.build_msa.
v0.8.1 - 2023-05-19
Added
HMM.validate method to ensure a HMM holds HMMER structural constraints.
plan7.Transitions enum with transition names for indexing HMM.transition_probabilities.
v0.8.0 - 2023-05-01
PyHMMER has been accepted for publication in Bioinformatics. The paper is available at doi:10.1093/bioinformatics/btad214.
Added
Fixed
Type annotations of Pipeline.iterate_seq and Pipeline.iterate_hmm.
Potential memory leak on exceptions raised by HMMPressedFile.read.
Offsets.profile not recording offsets properly, causing pyhmmer.hmmer.hmmpress to produce invalid pressed files (#37).
Changed
HMM.__init__ and HMM.sample now take the Alphabet as the first argument, for consistency with the rest of the API.
HMM now requires a name argument.
Removed
Deprecated ignore_gaps argument in SequenceFile.__init__.
Deprecated Sequence.taxonomy_id property.
v0.7.4 - 2023-04-14
Added
Fixed
TraceAligner methods causing a segfault when passed an uninitialized HMM (#36).
Changed
HMM default constructor now always creates a valid HMM (with respect to probability arrays).
TraceAligner now validates the input HMM before calling the HMMER code.
Use stack allocation for all error buffers instead of creating empty bytearray objects where applicable.
v0.7.3 - 2023-03-24
Fixed
v0.7.2 - 2023-02-17
Added
easel.GeneticCode class wrapping an ESL_GENCODE struct for configuring translation.
DigitalSequence.translate method to translate a nucleotide sequence to a protein sequence. Metadata is copied from the source sequence to its translation (#31, by @valentynbez).
Deprecated
Sequence.taxonomy_id property, as it is not used by Easel and its implementation is not consistent (see EddyRivasLab/easel#68).
v0.7.1 - 2022-12-15
Added
Missing __reduce__ method to TopHits.
Fixed
Build detection of available platform functions in setup.py.
v0.7.0 - 2022-12-04
Added
Bitfield.zeros and Bitfield.ones classmethods for constructing an empty bitfield of known size.
Bitfield.copy method to copy a bitfield object.
SequenceBlock and OptimizedProfileBlock classes to store Python objects next to a contiguous array of pointers for iterating with the GIL released.
SequenceFile.read_block method to read a whole sequence block from a file.
HMM.sample class method to generate a HMM at random given a Randomness source.
hmmscan function to scan a profile database with sequence queries.
deepcopy implementations to HMM, Profile and OptimizedProfile classes of plan7.
rewind method to HMMFile, HMMPressedFile and SequenceFile to reset a file back to its initial position.
name attribute to HMMFile, HMMPressedFile, MSAFile and SequenceFile to expose the path of a file (when it was created from a path).
local property to Profile and OptimizedProfile, indicating whether a profile is in local or global mode.
multihit property to Profile and OptimizedProfile, indicating whether a profile is in unihit or multihit mode, with a setter taking care of the reconfiguration.
Domain.included and Domain.reported settable properties to report the inclusion and reporting status of a single domain.
TopHits.included and TopHits.reported sized iterators to iterate only on included and reported hits.
Domains.included and Domains.reported sized iterators to iterate only on included and reported domains.
Changed
Bitfield, Vector and Matrix can now be created from an iterable.
Pipeline search methods now expect a DigitalSequenceBlock or a SequenceFile for the target sequence database.
Pipeline scan methods now expect an OptimizedProfileBlock or a HMMPressedFile for the target profile database.
TraceAligner now expects a DigitalSequenceBlock for the sequences to align to the HMM.
Profile.configure now uses a default value of 400 for the L argument.
hmmsearch, nhmmer and phmmer support being given a single query instead of requiring an iterable.
HMMPressedFile can now be created, closed and used as a context manager directly without having to manage the source HMMFile.
Renamed Profile.optimized method to Profile.to_optimized.
Replaced Randomness.is_fast method with the Randomness.fast property.
Rewrite handling of Hit flags using settable properties (Hit.included, Hit.reported, Hit.new, Hit.dropped, Hit.duplicate) instead of methods.
Fixed
Memory leak in the LongTargetsPipeline search loop.
PyPy behaviour change of readinto methods now expecting unsigned char* instead of char* memoryview.
NULL-pointer dereference in Pipeline.search_hmm when given a query without name.
LongTargetsPipeline not recording the query name and accession.
Memory leak caused by using a non-default prior scheme when constructing a Builder.
Removed
PipelineSearchTargets, replaced in functionality with easel.DigitalSequenceBlock.
is_local and is_multihit methods of Profile and OptimizedProfile, replaced with equivalent properties.
Hit.manually_drop and Hit.manually_include methods, replaced with the different Hit properties.
v0.6.3 - 2022-09-09
Fixed
Error not being raised on alphabet detection failure in SequenceFile or MSAFile.
Add check in DigitalSequence constructor to make sure encoded characters are in valid range (#25).
Added
SequenceFile.guess_alphabet and MSAFile.guess_alphabet to guess the alphabet from an open file.
Alphabet.encode and Alphabet.decode to convert raw sequences between digital and text format.
v0.6.2 - 2022-08-12
Changed
hmmsearch, phmmer and nhmmer functions will reduce the requested number of threads to the number of queries, if it can be detected using operator.length_hint.
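The thread-count reduction described in this entry can be sketched with the standard library alone. This is a minimal illustration, not PyHMMER's actual code: the helper name effective_threads and its fallback behaviour for unsized inputs are assumptions made for the example.

```python
import operator


def effective_threads(queries, requested):
    """Cap a requested thread count by the number of queries, when known.

    Hypothetical helper for illustration; not part of the PyHMMER API.
    """
    # length_hint returns len(queries) when available, otherwise the
    # object's __length_hint__, otherwise the default (0 here, which
    # this sketch treats as "unknown").
    hint = operator.length_hint(queries, 0)
    if hint > 0:
        return min(requested, hint)
    # Unknown (or empty) input: keep what the caller asked for.
    return requested


if __name__ == "__main__":
    print(effective_threads(["q1", "q2", "q3"], 8))   # sized list: capped
    print(effective_threads((q for q in ["q1"]), 8))  # generator: no hint
```

A plain generator exposes no length hint, so the requested count is left untouched in that case.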
Added
Documentation for loading all HMMs from an HMMFile object at once (#23).
List of projects depending on PyHMMER to the Examples page of the documentation.
v0.6.1 - 2022-06-28
Added
pickle protocol support for TopHits objects, using the HMMER network serialization.
TopHits.write method to write hits to a file in tabular format.
query_name and query_accession properties to TopHits objects to access the name and accession of the query that produced the hits.
Fixed
Extraction of filename from file-like objects in the HMMFile constructor.
Use os.cpu_count instead of multiprocessing.cpu_count where applicable to preserve OS scheduling.
Wrong return type in docstring of HMM.insert_emissions.
TopHits.searched_nodes returning the searched number of residues instead of the searched number of model nodes.
Unsound decoding of pickled MatrixF or VectorF when data comes from a source of different endianness.
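The endianness issue in the last entry can be illustrated with the standard struct module. This is only a sketch of the general problem, assuming 32-bit floats; the function names are hypothetical and this is not how PyHMMER serializes MatrixF or VectorF.

```python
import struct


def pack_floats(values, byteorder="<"):
    """Serialize 32-bit floats with an explicit byte order ("<" or ">")."""
    return struct.pack(f"{byteorder}{len(values)}f", *values)


def unpack_floats(payload, byteorder="<"):
    """Decode 32-bit floats, trusting the caller to name the byte order."""
    count = len(payload) // struct.calcsize("<f")
    return list(struct.unpack(f"{byteorder}{count}f", payload))


if __name__ == "__main__":
    values = [1.5, -2.0, 0.25]
    payload = pack_floats(values, "<")
    print(unpack_floats(payload, "<"))  # same byte order: values round-trip
    print(unpack_floats(payload, ">"))  # wrong byte order: garbage values
```

Recording the byte order next to the payload, and swapping on load when it differs from the host, is one way to make such decoding sound.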
Changed
Rewrite pyhmmer.hmmer threading code using collections.deque instead of queue.Queue to store the queries and results.
Reduce memory consumption of pyhmmer.hmmer by reducing the number of semaphores and event flags used concurrently.
Make pyhmmer.hmmer main threads block on query insertion rather than result retrieval to make sure worker threads are never idling.
v0.6.0 - 2022-05-01
Added
pyhmmer.daemon module with a client implementation to communicate with a hmmpgmd server.
Pipeline.arguments method to get a list of CLI arguments from the parameters used to initialize the Pipeline.
Setters for name, accession and description properties of plan7.Hit.
Constructor for individual plan7.Trace objects outside a plan7.Traces list.
plan7.Trace.from_sequence constructor to create a faux trace from a single sequence.
manually_include and manually_drop methods to plan7.Hit for manually selecting the inclusion status of a Hit in a TopHits instance.
compare_ranking method to plan7.TopHits for comparing the order of the hits to that of a previous run on the same targets stored in an easel.KeyHash object.
Pipeline.iterate_seq and Pipeline.iterate_hmm to run iterative queries like JackHMMER.
repr implementations for easel.MSAFile, easel.SequenceFile and easel.HMMFile showing the path or file object they were created from.
repr implementation for easel.Randomness showing the seed and the RNG algorithm in use.
str implementation for plan7.Alignment using HMMER original code to display a domain alignment like in search/scan results.
Changed
plan7.Trace.posterior_probabilities property may now be None in case no memory is allocated for the posteriors in the P7_TRACE struct.
TopHits.to_msa can now add additional sequences passed as arguments to the alignment.
plan7.HMMPressedFile now raises an exception on attempts to create a new instance manually.
ignore_gaps argument of easel.SequenceFile is now deprecated.
repr implementations for easel types now use the fully qualified class name.
Fixed
easel.SequenceFile.readinto docstring not rendering properly in documentation.
Type annotations of hits_included and hits_reported of plan7.TopHits marking these properties as bool instead of int.
Setters of name, accession, description and author properties of easel.MSA crashing when given None values.
Exception value raised from Easel code not being properly extracted.
Plain strings being used in example for easel.TextSequence and easel.TextMSA constructors where byte strings are expected (#20).
v0.5.0 - 2022-03-14
Added
plan7.PipelineSearchTargets to reduce the overhead when searching the same sequences several times with different query profiles.
TopHits.copy method to duplicate a TopHits instance.
TopHits.merge method to merge hits obtained with the same query on different targets.
Buffer protocol implementation for pyhmmer.easel.Bitfield.
Changed
Renamed TopHits.included and TopHits.reported properties to TopHits.hits_included and TopHits.hits_reported.
MSAFile and SequenceFile are now directly in digital mode if they are instantiated with digital=True.
SequenceFile.parse can now return a sequence in digital mode.
Reorganized tests to make them runnable from a site install.
Fixed
Usage of memcpy in contexts where it may have had undefined behaviour.
VectorF.__eq__ crashing when comparing two empty objects.
SequenceFile and MSAFile not closing file handles when raising an error in __init__.
v0.4.11 - 2021-12-15
Added
plan7.HMMFile.read method to read a single plan7.HMM from a plan7.HMMFile (instead of using next).
closed property on easel.SequenceFile, easel.MSAFile and plan7.HMMFile to mark whether a file object is closed.
plan7.HMMFile.is_pressed method to check whether a HMM file has associated pressed data.
plan7.HMMFile.optimized_profiles method to read the plan7.OptimizedProfile entries in a plan7.HMMFile if associated pressed data is available.
Getters for the name, accession, description, consensus, consensus_structure, evalue_parameters and cutoffs properties of a plan7.OptimizedProfile.
plan7.OptimizedProfile.__eq__ implementation to compare two optimized profiles.
__sizeof__ implementations for plan7.OptimizedProfile and plan7.Profile to get the allocated size of a profile.
Fixed
Double-free caused by the Cython cycle breaking feature on several view types (easel.Randomness, easel.Vector, easel.Matrix, plan7.Cutoffs, plan7.EvalueParameters, plan7.Offsets, plan7.Trace).
plan7.Hit.description using the pointer to the accession string erroneously, causing occasional NULL dereference.
plan7.OptimizedProfile.copy performing a shallow copy instead of a deep copy as expected.
Changed
pyhmmer.hmmer type annotations now explicitly support plan7.Profile or plan7.OptimizedProfile inputs where applicable.
v0.4.10 - 2021-12-06
Added
entropy and relative_entropy methods to easel.VectorF to compute the Shannon entropy of a vector and the Kullback-Leibler divergence of two vectors.
mean_match_entropy, mean_match_information and mean_match_relative_entropy methods to plan7.HMM to get information statistics of an HMM.
match_occupancy method to plan7.HMM to compute the occupancy for each match state as an easel.VectorF.
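For readers unfamiliar with the two quantities named in the first entry, the sketch below computes them in pure Python: H(p) = -sum_i p_i log2 p_i and D(p||q) = sum_i p_i log2(p_i / q_i). It is not the easel.VectorF implementation; the use of base-2 logarithms (results in bits) and the convention that zero-probability terms contribute nothing are assumptions of this example.

```python
import math


def entropy(p):
    """Shannon entropy of a probability vector, in bits (assumed base 2)."""
    # Terms with p_i == 0 are skipped, following the 0 * log(0) = 0 convention.
    return -sum(x * math.log2(x) for x in p if x > 0.0)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q), in bits.

    Assumes q_i > 0 wherever p_i > 0; otherwise the divergence is infinite
    and this sketch raises ZeroDivisionError.
    """
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0.0)


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    skewed = [0.7, 0.1, 0.1, 0.1]
    print(entropy(uniform))
    print(entropy(skewed))
    print(relative_entropy(skewed, uniform))
```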
Fixed
plan7.Builder.build_msa using the gap-open and gap-extend probabilities instead of the MSA itself to compute the transition probabilities for the new HMM.
Changed
plan7.Builder.build will now only load the score system once and reuse it unless a different score system is requested between calls.
v0.4.9 - 2021-11-11
Added
plan7.ScoreData class to store the substitution scores and maximal extensions for a long target search.
plan7.LongTargetsPipeline to run searches on targets longer than 100,000 residues.
Alphabet methods to check whether an Alphabet object is a DNA, RNA, nucleotide or protein alphabet.
window_length and window_beta arguments to plan7.Builder to set the max length of nucleotide HMMs created by builder objects.
Changed
pyhmmer.hmmer.nhmmer now uses a LongTargetsPipeline instead of a Pipeline to search the target sequences.
pyhmmer.hmmer.nhmmer now supports HMM queries in addition to DigitalSequence and DigitalMSA queries.
pyhmmer.hmmer.phmmer now always assumes protein queries.
Z and domZ attributes of plan7.TopHits objects are now read-only.
Fixed
nhmmer now uses DNA as the default alphabet instead of the amino acid alphabet like it did before (#12).
v0.4.8 - 2021-10-27
Added
Constructor arguments and properties to plan7.Pipeline to support filtering top hits by bit score thresholds instead.
Support for creating a SequenceFile and an MSAFile using a Python file-like object instead of only supporting filenames.
Support for reading individual sequences from an MSA file with SequenceFile.
TextMSA.alignment to access the actual alignment as a tuple of strings.
Subtraction and division support for easel.Vector subclasses.
Changed
plan7.Cutoffs now supports setting the bit score cutoffs, but requires both to be set or cleared at the same time.
easel.Vector will always allocate some memory when created manually to avoid having a special empty case in every vector method.
pyhmmer.easel.AllocationError now stores the size it failed to allocate, and the number of elements when allocating an array.
Fixed
TextSequence.digitize will now raise a ValueError when the sequence contains invalid characters for the alphabet (previously was an UnexpectedError).
v0.4.7 - 2021-09-28
Added
TraceAligner, Trace and Traces classes to pyhmmer.plan7 to get tracebacks after aligning several sequences against an HMM.
pyhmmer.hmmalign function with the same features as the hmmalign binary from HMMER3.
Support for out-of-band pickling in easel.Vector and easel.Matrix.
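Out-of-band pickling, mentioned in the last entry, is a feature of pickle protocol 5 (Python 3.8 and later) that lets large buffers travel separately from the pickle stream. The sketch below shows the mechanism on a plain bytearray; the function names are hypothetical, and it does not reproduce how easel.Vector or easel.Matrix implement it.

```python
import pickle


def dump_out_of_band(data):
    """Pickle a buffer with protocol 5, collecting its memory out-of-band."""
    buffers = []
    # Because list.append returns None, every PickleBuffer handed to the
    # callback is kept out of the pickle stream and stored in `buffers`.
    payload = pickle.dumps(
        pickle.PickleBuffer(data), protocol=5, buffer_callback=buffers.append
    )
    return payload, buffers


def load_out_of_band(payload, buffers):
    """Rebuild the object from the stream plus the out-of-band buffers."""
    return pickle.loads(payload, buffers=buffers)


if __name__ == "__main__":
    data = bytearray(b"\x01\x02" * 512)
    payload, buffers = dump_out_of_band(data)
    restored = load_out_of_band(payload, buffers)
    print(len(payload), len(buffers))
    print(memoryview(restored).tobytes() == bytes(data))
```

The pickle stream itself stays tiny because it only records that a buffer follows; the consumer decides how the buffer memory is transferred.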
Changed
Allow creating an empty Vector or Matrix by calling their constructor without arguments.
Fixed
Potential unreported exceptions in plan7.OptimizedProfile.write and several plan7.SSIWriter methods.
v0.4.6 - 2021-09-10
Added
pickle protocol for easel.Alphabet, easel.Bitfield, easel.KeyHash, easel.Vector, easel.Matrix and plan7.HMM.
taxonomy_id and residue_markups properties to easel.Sequence.
sum_score property to plan7.Hit.
plan7.EvalueParameters class to expose the e-value parameters of a plan7.HMM or a plan7.Profile.
Equality checks and slicing for easel.Matrix and easel.Vector.
Support for creating and manipulating zero-sized easel matrices and vectors.
plan7.Cutoffs class to expose the Pfam score cutoffs of a plan7.HMM or a plan7.Profile.
Keyword arguments to configure E-value thresholds when creating a plan7.Pipeline object.
Support for using model-specific thresholding options in plan7.Pipeline.
Changed
Use the replace error handler when decoding error messages to skip potential decoding issues when already building an exception.
Improve pyhmmer.hmmer to ensure background threads exit on a KeyboardInterrupt.
easel.VectorU8.__eq__ accepts any object implementing the buffer protocol.
plan7.HMM.creation_time now takes and returns a datetime.datetime object, assuming the field is only ever set with asctime.
Refactor easel.Vector and easel.Matrix and mark exposed memory as C-contiguous.
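The creation_time entry relies on the fixed layout produced by C's asctime, for example "Sun Sep 16 01:03:52 1973". A minimal sketch of converting between that layout and datetime.datetime is shown below. The helper names and the format constant are assumptions of this example, not PyHMMER code, and the weekday and month names are parsed according to the current locale (English names are assumed here).

```python
from datetime import datetime

# Layout used by C's asctime(), without the trailing newline.
ASCTIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_creation_time(text):
    """Parse an asctime-style string into a datetime (hypothetical helper)."""
    return datetime.strptime(text.strip(), ASCTIME_FORMAT)


def format_creation_time(moment):
    """Format a datetime in the asctime layout.

    Note: strftime zero-pads the day ("06"), whereas C's asctime pads it
    with a space (" 6"); strptime accepts both forms.
    """
    return moment.strftime(ASCTIME_FORMAT)


if __name__ == "__main__":
    moment = parse_creation_time("Sun Sep 16 01:03:52 1973\n")
    print(moment.isoformat())
    print(format_creation_time(moment))
```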
Fixed
easel.Alphabet not reporting potential allocation errors.
Potential buffer overflow in easel.Matrix and easel.Vector when calling __init__ more than once.
v0.4.5 - 2021-07-19
Added
OptimizedProfile.convert method to configure an optimized profile from a Profile without reallocating a new P7_OPROFILE struct.
Changed
Rewrite the plan7.Pipeline search loop to avoid reacquiring the GIL between reference sequences.
Require the reference sequences to be stored in a collection (instead of an iterable) when passing them to the search_hmm, search_msa and search_seq methods of plan7.Pipeline.
Avoid reallocating a new OptimizedProfile every time a new HMM is passed to Pipeline.search_hmm.
Release the GIL while sorting and thresholding TopHits in Pipeline search methods.
v0.4.4 - 2021-07-07
Added
ignore_gaps parameter to pyhmmer.plan7.SequenceFile, allowing to skip the gap characters when reading a sequence from an ungapped format.
__sizeof__ implementation for some
Dedicated check for sequence length before running the platform-specific code in pyhmmer.plan7.Pipeline.
Fixed
Score system not being set in pyhmmer.plan7.Builder.build_msa.
Alphabet not being checked after the first sequence in Pipeline search and scan methods.
v0.4.3 - 2021-07-03
Fixed
File object wrappers not reporting exceptions raised when seeking on OSX/BSD platforms.
v0.4.2 - 2021-06-20
Added
pyhmmer.easel.Randomness class exposing a deterministic random number generator.
pyhmmer.plan7.Builder.randomness and pyhmmer.plan7.Pipeline.randomness attributes exposing the internal random number generator used by each object.
pyhmmer.plan7.Hit.best_domain property mapping to the highest scoring domain of a hit.
pyhmmer.plan7.OptimizedProfile.rbv property exposing match scores.
pyhmmer.plan7.Domain.pvalue and pyhmmer.plan7.Hit.pvalue reporting the p-value for a domain or hit bitscore.
Fixed
Dimensions of the pyhmmer.plan7.OptimizedProfile.sbv matrix not being properly set.
v0.4.1 - 2021-06-06
Fixed
Main buffer not being freed in MatrixF.__dealloc__ and MatrixU8.__dealloc__ when created without owner.
Added
Additional configuration values for pyhmmer.plan7.Pipeline as both constructor arguments and mutable properties.
consensus, consensus_structure and offsets properties to pyhmmer.plan7.Profile objects.
Changed
Make OptimizedProfile.ssv_filter check the alphabet of the given sequence.
v0.4.0 - 2021-06-05 - YANKED
Added
Linear algebra primitives to expose 1D (Vector) and 2D (Matrix) contiguous buffers containing numerical values to pyhmmer.easel.
Documentation for the Z and domZ parameters of the pyhmmer.plan7.Pipeline constructor.
pyhmmer.errors.AlphabetMismatch exception deriving from ValueError to specifically report mismatching Easel alphabets where applicable.
scale and normalize methods to pyhmmer.plan7.HMM objects.
Property to access pyhmmer.plan7.Background residue frequencies as a VectorF object.
Property to access pyhmmer.plan7.HMM mean residue composition as a VectorF object.
Property to access pyhmmer.plan7.HMM probabilities and emissions as MatrixF objects.
ssv_filter method to pyhmmer.plan7.OptimizedProfile to get the SSV filter score of the profile for a given sequence.
Several additional properties to access the pyhmmer.plan7.OptimizedProfile internals.
Removed
Unused report_e parameter of pyhmmer.plan7.Pipeline constructor.
pyhmmer.plan7.TopHits.clear method, which could lead to a segfault if it was called while a Hit was being held.
Changed
Multithreaded loop in pyhmmer.hmmer to reduce memory consumption while still yielding hits in order.
pyhmmer.easel.DigitalSequence.sequence property is now a VectorU8.
Fixed
Type annotations in pyhmmer.hmmer.
Potential double free in pyhmmer.plan7.HMM.command_line property setter.
Minor floating-point precision issues in pyhmmer.plan7.Builder constructor.
Segfault in TextMSA.digitize caused by esl_msa_Copy not digitizing on-the-fly like esl_sq_Copy.
Exceptions not being raised in some methods of pyhmmer.plan7.Profile and pyhmmer.plan7.TopHits.
v0.3.1 - 2021-05-08
Added
Pipeline.scan_seq method to query a database of profiles with one or more sequences.
transition_probabilities, match_emissions, insert_emissions properties to the HMM class, providing access to the numerical parameters of the HMM.
consensus_structure and consensus_accessibility properties to the HMM class to get consensus lines from the source alignment if the HMM was created from a MSA.
nseq and nseq_effective properties to the HMM class to get the number of training sequences and effective sequences used to build the HMM.
Changed
HMM.checksum is now None if the p7H_CHKSUM flag is not set.
Builder methods will now record sys.argv when creating a HMM.
Fixed
HMM.write(..., binary=False) crashing on HMMs without a consensus line (#5). Fixed upstream in EddyRivasLab/HMMER#236.
Pipeline.reset mishandling the Z and domZ values if those were detected from the number of targets.
pyhmmer.hmmer functions no longer block until all results have been collected when run in multithreaded mode.
v0.3.0 - 2021-03-11
Added
easel.MSAFile to read from a file containing
accession, author, name and description properties to easel.MSA objects.
plan7.Builder.build_msa to build a pHMM from a sequence alignment.
Additional methods to easel.KeyHash, allowing to use it as a dict/set hybrid.
Sequence.write and MSA.write methods to format a sequence or an alignment to a file handle.
plan7.TopHits.to_msa method to convert all the top hits of a query against a database into a multiple sequence alignment.
easel.MSA.sequences attribute to access individual sequences of an alignment using the collections.abc.Sequence interface.
easel.DigitalMSA.textize method to convert a multiple sequence alignment in digital mode to its text-mode counterpart.
Read-only name, accession and description properties to plan7.Profile showing attributes inherited from the HMM it was configured with.
plan7.HMM.consensus property, allowing to access the consensus sequence of a pHMM.
plan7.HMM equality implementation, using zero tolerance.
plan7.Pipeline.search_msa to query a MSA against a sequence database.
easel.Sequence.reverse_complement method allowing to reverse-complement inplace or to build a copy.
errors.AlphabetMismatch exception for use in cases where an alphabet is expected but not matched by the input.
hmmer.nhmmer function with the same behaviour as hmmer.phmmer, except it expects inputs with a DNA alphabet.
Fixed
plan7.Builder.copy not copying some parameters correctly, causing pyhmmer.hmmer.phmmer to give inconsistent results in multithreaded mode.
easel.Bitfield not properly handling index overflows.
Documentation not rendering for the __init__ method of all classes.
Changed
plan7.Builder gap-open and gap-extend probabilities are now set on instantiation and depend on the alphabet type.
Constructors for easel.TextMSA and easel.DigitalMSA, which can now be given an iterable of easel.Sequence objects to store in the alignment.
Removed
Unimplemented easel.SequenceFile.fetch and easel.SequenceFile.fetchinto methods.
v0.2.2 - 2021-03-04
Fixed
Linking issues on OSX caused by aggressive stripping of intermediate libraries.
plan7.Builder RNG not reseeding between different HMMs.
v0.2.1 - 2021-01-29
Added
pyhmmer.plan7.HMM.checksum property to get the 32-bit checksum of an HMM.
v0.2.0 - 2021-01-21
Added
pyhmmer.plan7.Builder class to handle building a HMM from a sequence.
Pipeline.search_seq to query a sequence against a sequence database.
psutil dependency to detect the most efficient thread count for hmmsearch based on the number of physical CPUs.
pyhmmer.hmmer.phmmer function to run a search of query sequences against a sequence database.
Changed
Pipeline.search was renamed to Pipeline.search_hmm for disambiguation.
libeasel.random sequences do not require the GIL anymore.
Public API now has proper signature annotations.
Fixed
Inaccurate exception messages in Pipeline.search_hmm.
Unneeded RNG reallocation, replaced with re-initialisation where possible.
SequenceFile.__next__ not working after being set in digital mode.
sequences argument of hmmsearch now only requires a typing.Collection[DigitalSequence] instead of a typing.Collection[Sequence] (no more __getitem__ needed).
Removed
hits argument to Pipeline.search_hmm to reduce risk of issues with TopHits reuse.
Broken alignment coordinates on Domain classes.
v0.1.4 - 2021-01-15
Added
DigitalSequence.textize to convert a digital sequence to a text sequence.
DigitalSequence.__init__ method allowing to create a digital sequence from any object implementing the buffer protocol.
Alignment.hmm_accession property to retrieve the accession of the HMM in an alignment.
v0.1.3 - 2021-01-08
Fixed
Compilation issues in OSX-specific Cython code.
v0.1.2 - 2021-01-07
Fixed
Required Cython files not being included in source distribution.
v0.1.1 - 2020-12-02
Fixed
HMMFile calling file.peek without arguments, causing it to crash when passed some types, e.g. gzip.GzipFile.
HMMFile failing to work with PyPy file objects because of a bug with their implementation of readinto.
C/Python file object implementation using strcpy instead of memcpy, causing issues when null bytes were read.
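The strcpy versus memcpy problem in the last entry comes from C-string semantics: a string copy stops at the first null byte, while a memory copy honours an explicit length. The sketch below reproduces the effect from Python with ctypes; it only illustrates the general pitfall and is not the PyHMMER file wrapper code, and the function names are hypothetical.

```python
import ctypes


def read_as_c_string(payload):
    """Interpret a byte buffer as a NUL-terminated C string (strcpy-like)."""
    buffer = (ctypes.c_char * len(payload)).from_buffer_copy(payload)
    # .value stops at the first null byte, like C string functions do.
    return buffer.value


def read_as_raw_bytes(payload):
    """Read the whole buffer using its known length (memcpy-like)."""
    buffer = (ctypes.c_char * len(payload)).from_buffer_copy(payload)
    # .raw returns every byte of the array, embedded nulls included.
    return buffer.raw


if __name__ == "__main__":
    payload = b"HMMER\x00binary"
    print(read_as_c_string(payload))
    print(read_as_raw_bytes(payload))
```

Binary formats routinely contain null bytes, which is why length-based copying is the safer choice when shuttling data between Python file objects and C code.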
v0.1.0 - 2020-12-01
Initial beta release.
Fixed
TextSequence uses the sequence argument it’s given on instantiation.
Segmentation fault in Sequence.__eq__ caused by implicit type conversion.
Segmentation fault on SequenceFile.read failure.
Missing type annotations for the pyhmmer.easel module.
v0.1.0-a5 - 2020-11-28
Added
Sequence.__len__ magic method so that len(seq) returns the number of letters in seq.
Python file-handle support when opening a pyhmmer.plan7.HMMFile.
Context manager protocol to pyhmmer.easel.SSIWriter.
Type annotations for pyhmmer.easel.SSIWriter.
add_alias to pyhmmer.easel.SSIWriter.
write method to pyhmmer.plan7.OptimizedProfile to write an optimized profile in binary format.
offsets property to interact with the disk offsets of a pyhmmer.plan7.OptimizedProfile instance.
pyhmmer.hmmer.hmmpress emulating the hmmpress binary from HMMER.
M property to pyhmmer.plan7.HMM exposing the number of nodes in the model.
Changed
Bumped vendored Easel to v0.48.
Bumped vendored HMMER to v3.3.2.
pyhmmer.plan7.HMMFile will raise an EOFError when given an empty file.
Renamed length property to L in pyhmmer.plan7.Background.
Fixed
Segmentation fault when close method of pyhmmer.easel.SSIWriter was called more than once.
close method of pyhmmer.easel.SSIWriter not writing the index contents.
v0.1.0-a4 - 2020-11-24
Added
MSA, TextMSA and DigitalMSA classes representing a multiple sequence alignment to pyhmmer.easel.
Methods and protocol to copy a Sequence and a MSA.
pyhmmer.plan7.OptimizedProfile wrapping a platform-specific optimized profile.
SSIReader and SSIWriter classes interacting with sequence/subsequence indices to pyhmmer.easel.
Exception handler using Python exceptions to report Easel errors.
Changed
pyhmmer.hmmsearch returns an iterator of TopHits, with one instance per HMM in the input.
pyhmmer.hmmsearch properly raises errors happening in the background threads without deadlock.
pyhmmer.plan7.Pipeline recycles memory between Pipeline.search calls.
Fixed
Missing type annotations for the pyhmmer.errors module.
Removed
Unneeded or private methods from pyhmmer.plan7.
v0.1.0-a3 - 2020-11-19
Added
TextSequence and DigitalSequence representing a Sequence in a given mode.
E-value properties to Hit and Domain.
TopHits now stores a reference to the pipeline it was obtained from.
Pipeline.Z and Pipeline.domZ properties.
Experimental pickling support to Alphabet.
Experimental freelist to Sequence class to avoid allocation bottlenecks when iterating on a SequenceFile without recycling sequence buffers.
Changed
Made Sequence an abstract base class.
Additional Pipeline parameters can be passed as keyword arguments to pyhmmer.hmmsearch.
SequenceFile.read can now be configured to skip reading the metadata or the content of a sequence.
Removed
Redundant SequenceFile methods.
Fixed
doctest loader crashing on Python 3.5.
TopHits.threshold segfaulting when being called without a prior TopHits.sort call.
Unknown format argument to SequenceFile constructor not raising the right error.
v0.1.0-a2 - 2020-11-12
Added
Support for compilation on PowerPC big-endian platforms.
Type annotations and stub files for Cython modules.
Changed
distutils is now used to compile the package, instead of calling autotools and letting HMMER configure itself.
Bitfield.count now allows passing an argument (for compatibility with collections.abc.Sequence).
v0.1.0-a1 - 2020-11-10
Initial alpha release (test deployment to PyPI).