Changelog¶
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
v0.4.7 - 2021-09-28¶
Added¶
TraceAligner, Trace and Traces classes to pyhmmer.plan7 to get tracebacks after aligning several sequences against an HMM.
pyhmmer.hmmalign function with the same features as the hmmalign binary from HMMER3.
Support for out-of-band pickling in easel.Vector and easel.Matrix.
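Out-of-band pickling, as mentioned above for easel.Vector and easel.Matrix, builds on pickle protocol 5 (Python 3.8+). The following is a minimal sketch of the general mechanism using only the standard library. The Samples class is a hypothetical stand-in, not a pyhmmer class, and the actual implementation in pyhmmer may differ.

```python
import pickle


class Samples:
    """Hypothetical buffer-backed container (not a pyhmmer class)."""

    def __init__(self, data=b""):
        # Copy any bytes-like object into a mutable buffer owned by this object.
        self.data = bytearray(data)

    def __reduce_ex__(self, protocol):
        if protocol >= 5:
            # Wrap the buffer so pickle can pass it to buffer_callback
            # instead of copying it into the pickle stream.
            return type(self), (pickle.PickleBuffer(self.data),)
        # Older protocols: fall back to an in-band copy of the bytes.
        return type(self), (bytes(self.data),)


original = Samples(b"\x01\x02\x03\x04")

# Collect the out-of-band buffers separately from the pickle stream.
buffers = []
payload = pickle.dumps(original, protocol=5, buffer_callback=buffers.append)

# The same buffers, in the same order, must be supplied when loading.
restored = pickle.loads(payload, buffers=buffers)
```

The benefit is that large numerical buffers can be transferred (for instance between processes) without an extra copy into the pickle stream.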
Changed¶
Allow creating an empty Vector or Matrix by calling their constructor without arguments.
Fixed¶
Potential unreported exceptions in plan7.OptimizedProfile.write and several plan7.SSIWriter methods.
v0.4.6 - 2021-09-10¶
Added¶
pickle protocol for easel.Alphabet, easel.Bitfield, easel.KeyHash, easel.Vector, easel.Matrix and plan7.HMM.
taxonomy_id and residue_markups properties to easel.Sequence.
sum_score property to plan7.Hit.
plan7.EvalueParameters class to expose the e-value parameters of a plan7.HMM or a plan7.Profile.
Equality checks and slicing for easel.Matrix and easel.Vector.
Support for creating and manipulating zero-sized easel matrices and vectors.
plan7.Cutoffs class to expose the Pfam score cutoffs of a plan7.HMM or a plan7.Profile.
Keyword arguments to configure E-value thresholds when creating a plan7.Pipeline object.
Support for using model-specific thresholding options in plan7.Pipeline.
Changed¶
Use the replace error handler when decoding error messages to skip potential decoding issues when already building an exception.
Improve pyhmmer.hmmer to ensure background threads exit on a KeyboardInterrupt.
easel.VectorU8.__eq__ accepts any object implementing the buffer protocol.
plan7.HMM.creation_time now takes and returns a datetime.datetime object, assuming the field is only ever set with asctime.
Refactor easel.Vector and easel.Matrix and mark exposed memory as C-contiguous.
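Two of the changes above rest on standard Python behaviour that is easy to illustrate: the replace error handler, and the asctime text format that a datetime.datetime can be converted to and from. The sketch below uses only the standard library. The helper names decode_message, format_creation_time and parse_creation_time are hypothetical and do not come from pyhmmer, and the default C locale is assumed for day and month names.

```python
import time
from datetime import datetime

# Layout produced by asctime, e.g. weekday, month, day, time, year.
ASCTIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def decode_message(raw: bytes) -> str:
    # Invalid bytes become U+FFFD instead of raising UnicodeDecodeError,
    # so building an error message can never fail on decoding.
    return raw.decode("utf-8", "replace")


def format_creation_time(moment: datetime) -> str:
    # Render a naive datetime in the asctime layout.
    return time.asctime(moment.timetuple())


def parse_creation_time(text: str) -> datetime:
    # Parse an asctime-formatted string back into a datetime.
    return datetime.strptime(text, ASCTIME_FORMAT)
```

Note that the asctime layout has a resolution of one second, so microseconds do not survive a round trip.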
Fixed¶
easel.Alphabet not reporting potential allocation errors.
Potential buffer overflow in easel.Matrix and easel.Vector when calling __init__ more than once.
v0.4.5 - 2021-07-19¶
Added¶
OptimizedProfile.convert method to configure an optimized profile from a Profile without reallocating a new P7_OPROFILE struct.
Changed¶
Rewrite the plan7.Pipeline search loop to avoid reacquiring the GIL between reference sequences.
Require the reference sequences to be stored in a collection (instead of an iterable) when passing them to the search_hmm, search_msa and search_seq methods of plan7.Pipeline.
Avoid reallocating a new OptimizedProfile every time a new HMM is passed to Pipeline.search_hmm.
Release the GIL while sorting and thresholding TopHits in Pipeline search methods.
v0.4.4 - 2021-07-07¶
Added¶
ignore_gaps parameter to pyhmmer.plan7.SequenceFile, allowing gap characters to be skipped when reading a sequence from an ungapped format.
__sizeof__ implementation for some
Dedicated check for sequence length before running the platform-specific code in pyhmmer.plan7.Pipeline.
Fixed¶
Score system not being set in pyhmmer.plan7.Builder.build_msa.
Alphabet not being checked after the first sequence in Pipeline search and scan methods.
v0.4.3 - 2021-07-03¶
Fixed¶
File object wrappers not reporting exceptions raised when seeking on OSX/BSD platforms.
v0.4.2 - 2021-06-20¶
Added¶
pyhmmer.easel.Randomness class exposing a deterministic random number generator.
pyhmmer.plan7.Builder.randomness and pyhmmer.plan7.Pipeline.randomness attributes exposing the internal random number generator used by each object.
pyhmmer.plan7.Hit.best_domain property mapping to the highest scoring domain of a hit.
pyhmmer.plan7.OptimizedProfile.rbv property exposing match scores.
pyhmmer.plan7.Domain.pvalue and pyhmmer.plan7.Hit.pvalue reporting the p-value for a domain or hit bitscore.
Fixed¶
Dimensions of the pyhmmer.plan7.OptimizedProfile.sbv matrix not being properly set.
v0.4.1 - 2021-06-06¶
Fixed¶
Main buffer not being freed in MatrixF.__dealloc__ and MatrixU8.__dealloc__ when created without owner.
Added¶
Additional configuration values for pyhmmer.plan7.Pipeline as both constructor arguments and mutable properties.
consensus, consensus_structure and offsets properties to pyhmmer.plan7.Profile objects.
Changed¶
Make OptimizedProfile.ssv_filter check the alphabet of the given sequence.
v0.4.0 - 2021-06-05 - YANKED¶
Added¶
Linear algebra primitives to expose 1D (Vector) and 2D (Matrix) contiguous buffers containing numerical values to pyhmmer.easel.
Documentation for the Z and domZ parameters of the pyhmmer.plan7.Pipeline constructor.
pyhmmer.errors.AlphabetMismatch exception deriving from ValueError to specifically report mismatching Easel alphabets where applicable.
scale and normalize methods to pyhmmer.plan7.HMM objects.
Property to access pyhmmer.plan7.Background residue frequencies as a VectorF object.
Property to access pyhmmer.plan7.HMM mean residue composition as a VectorF object.
Property to access pyhmmer.plan7.HMM probabilities and emissions as MatrixF objects.
ssv_filter methods to pyhmmer.plan7.OptimizedProfile to get the SSV filter score of the profile for a given sequence.
Several additional properties to access the pyhmmer.plan7.OptimizedProfile internals.
Removed¶
Unused report_e parameter of pyhmmer.plan7.Pipeline constructor.
pyhmmer.plan7.TopHits.clear method, which could lead to a segfault if it was called while a Hit was being held.
Changed¶
Multithreaded loop in pyhmmer.hmmer to reduce memory consumption while still yielding hits in order.
pyhmmer.easel.DigitalSequence.sequence property is now a VectorU8.
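The trade-off described above, bounded memory use while still yielding results in input order, can be sketched with a sliding window of futures. This is only an illustration built on concurrent.futures. The ordered_map function and its window parameter are hypothetical, and the actual loop in pyhmmer.hmmer may be implemented quite differently.

```python
import collections
from concurrent.futures import ThreadPoolExecutor


def ordered_map(func, items, workers=4, window=8):
    """Apply func to items in worker threads, yielding results in input order.

    At most `window` tasks are pending at any time, which bounds how many
    results are held in memory.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = collections.deque()
        for item in items:
            pending.append(pool.submit(func, item))
            if len(pending) >= window:
                # Block on the oldest task so that order is preserved.
                yield pending.popleft().result()
        # Drain the tasks still in flight once the input is exhausted.
        while pending:
            yield pending.popleft().result()
```

Because the generator always waits on the oldest pending task, an exception raised in a worker is re-raised in the consumer at the position of the failing item.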
Fixed¶
Type annotations in pyhmmer.hmmer.
Potential double free in pyhmmer.plan7.HMM.command_line property setter.
Minor floating-point precision issues in pyhmmer.plan7.Builder constructor.
Segfault in TextMSA.digitize caused by esl_msa_Copy not digitizing on-the-fly like esl_sq_Copy.
Exceptions not being raised in some methods of pyhmmer.plan7.Profile and pyhmmer.plan7.TopHits.
v0.3.1 - 2021-05-08¶
Added¶
Pipeline.scan_seq method to query a database of profiles with one or more sequences.
transition_probabilities, match_emissions, insert_emissions properties to the HMM class, providing access to the numerical parameters of the HMM.
consensus_structure and consensus_accessibility properties to the HMM class to get consensus lines from the source alignment if the HMM was created from a MSA.
nseq and nseq_effective properties to the HMM class to get the number of training sequences and effective sequences used to build the HMM.
Changed¶
HMM.checksum is now None if the p7H_CHKSUM flag is not set.
Builder methods will now record sys.argv when creating a HMM.
Fixed¶
HMM.write(..., binary=False) crashing on HMMs without a consensus line (#5). Fixed upstream in EddyRivasLab/HMMER#236.
Pipeline.reset mishandling the Z and domZ values if those were detected from the number of targets.
pyhmmer.hmmer functions no longer block until all results have been collected when run in multithreaded mode.
v0.3.0 - 2021-03-11¶
Added¶
easel.MSAFile to read from a file containing
accession, author, name and description properties to easel.MSA objects.
plan7.Builder.build_msa to build a pHMM from a sequence alignment.
Additional methods to easel.KeyHash, allowing it to be used as a dict/set hybrid.
Sequence.write and MSA.write methods to format a sequence or an alignment to a file handle.
plan7.TopHits.to_msa method to convert all the top hits of a query against a database into a multiple sequence alignment.
easel.MSA.sequences attribute to access individual sequences of an alignment using the collections.abc.Sequence interface.
easel.DigitalMSA.textize method to convert a multiple sequence alignment in digital mode to its text-mode counterpart.
Read-only name, accession and description properties to plan7.Profile showing attributes inherited from the HMM it was configured with.
plan7.HMM.consensus property, giving access to the consensus sequence of a pHMM.
plan7.HMM equality implementation, using zero tolerance.
plan7.Pipeline.search_msa to query a MSA against a sequence database.
easel.Sequence.reverse_complement method to reverse-complement a sequence in place or to build a copy.
errors.AlphabetMismatch exception for use in cases where an alphabet is expected but not matched by the input.
hmmer.nhmmer function with the same behaviour as hmmer.phmmer, except it expects inputs with a DNA alphabet.
Fixed¶
plan7.Builder.copy not copying some parameters correctly, causing pyhmmer.hmmer.phmmer to give inconsistent results in multithreaded mode.
easel.Bitfield not properly handling index overflows.
Documentation not rendering for the __init__ method of all classes.
Changed¶
plan7.Builder gap-open and gap-extend probabilities are now set on instantiation and depend on the alphabet type.
Constructors for easel.TextMSA and easel.DigitalMSA, which can now be given an iterable of easel.Sequence objects to store in the alignment.
Removed¶
Unimplemented easel.SequenceFile.fetch and easel.SequenceFile.fetchinto methods.
v0.2.2 - 2021-03-04¶
Fixed¶
Linking issues on OSX caused by aggressive stripping of intermediate libraries.
plan7.Builder RNG not reseeding between different HMMs.
v0.2.1 - 2021-01-29¶
Added¶
pyhmmer.plan7.HMM.checksum property to get the 32-bit checksum of an HMM.
v0.2.0 - 2021-01-21¶
Added¶
pyhmmer.plan7.Builder class to handle building a HMM from a sequence.
Pipeline.search_seq to query a sequence against a sequence database.
psutil dependency to detect the most efficient thread count for hmmsearch based on the number of physical CPUs.
pyhmmer.hmmer.phmmer function to run a search of query sequences against a sequence database.
Changed¶
Pipeline.search was renamed to Pipeline.search_hmm for disambiguation.
libeasel.random sequences do not require the GIL anymore.
Public API now has proper signature annotations.
Fixed¶
Inaccurate exception messages in Pipeline.search_hmm.
Unneeded RNG reallocation, replaced with re-initialisation where possible.
SequenceFile.__next__ not working after being set in digital mode.
sequences argument of hmmsearch now only requires a typing.Collection[DigitalSequence] instead of a typing.Collection[Sequence] (no more __getitem__ needed).
Removed¶
hits argument to Pipeline.search_hmm to reduce risk of issues with TopHits reuse.
Broken alignment coordinates on Domain classes.
v0.1.4 - 2021-01-15¶
Added¶
DigitalSequence.textize to convert a digital sequence to a text sequence.
DigitalSequence.__init__ method to create a digital sequence from any object implementing the buffer protocol.
Alignment.hmm_accession property to retrieve the accession of the HMM in an alignment.
v0.1.1 - 2020-12-02¶
Fixed¶
HMMFile calling file.peek without arguments, causing it to crash when passed some types, e.g. gzip.GzipFile.
HMMFile failing to work with PyPy file objects because of a bug with their implementation of readinto.
C/Python file object implementation using strcpy instead of memcpy, causing issues when null bytes were read.
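The last fix above comes down to the difference between C-string semantics (stop at the first null byte) and fixed-length copies. The difference can be observed from Python with ctypes, as in this minimal sketch. It only illustrates the general behaviour and is not the code used in pyhmmer; the variable names are illustrative.

```python
import ctypes

data = b"AB\x00CD"  # payload with an embedded null byte

# Fixed-size C buffers: the source holds an exact copy of the payload.
src = (ctypes.c_char * len(data)).from_buffer_copy(data)
dst = (ctypes.c_char * len(data))()

# memmove, like memcpy, copies a fixed number of bytes whatever they contain.
ctypes.memmove(dst, src, len(data))

# .raw returns every byte of the buffer, including the embedded null.
full_copy = dst.raw
# .value uses C-string semantics and stops at the first null byte,
# which is where a strcpy-based copy would have stopped as well.
c_string_view = dst.value
```

Binary HMM data routinely contains null bytes, so any copy that treats the buffer as a C string risks truncating it.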
v0.1.0 - 2020-12-01¶
Initial beta release.
Fixed¶
TextSequence uses the sequence argument it’s given on instantiation.
Segmentation fault in Sequence.__eq__ caused by implicit type conversion.
Segmentation fault on SequenceFile.read failure.
Missing type annotations for the pyhmmer.easel module.
v0.1.0-a5 - 2020-11-28¶
Added¶
Sequence.__len__ magic method so that len(seq) returns the number of letters in seq.
Python file-handle support when opening a pyhmmer.plan7.HMMFile.
Context manager protocol to pyhmmer.easel.SSIWriter.
Type annotations for pyhmmer.easel.SSIWriter.
add_alias to pyhmmer.easel.SSIWriter.
write method to pyhmmer.plan7.OptimizedProfile to write an optimized profile in binary format.
offsets property to interact with the disk offsets of a pyhmmer.plan7.OptimizedProfile instance.
pyhmmer.hmmer.hmmpress emulating the hmmpress binary from HMMER.
M property to pyhmmer.plan7.HMM exposing the number of nodes in the model.
Changed¶
Bumped vendored Easel to v0.48.
Bumped vendored HMMER to v3.3.2.
pyhmmer.plan7.HMMFile will raise an EOFError when given an empty file.
Renamed length property to L in pyhmmer.plan7.Background.
Fixed¶
Segmentation fault when close method of pyhmmer.easel.SSIWriter was called more than once.
close method of pyhmmer.easel.SSIWriter not writing the index contents.
v0.1.0-a4 - 2020-11-24¶
Added¶
MSA, TextMSA and DigitalMSA classes representing a multiple sequence alignment to pyhmmer.easel.
Methods and protocol to copy a Sequence and a MSA.
pyhmmer.plan7.OptimizedProfile wrapping a platform-specific optimized profile.
SSIReader and SSIWriter classes interacting with sequence/subsequence indices to pyhmmer.easel.
Exception handler using Python exceptions to report Easel errors.
Changed¶
pyhmmer.hmmsearch returns an iterator of TopHits, with one instance per HMM in the input.
pyhmmer.hmmsearch properly raises errors happening in the background threads without deadlock.
pyhmmer.plan7.Pipeline recycles memory between Pipeline.search calls.
Fixed¶
Missing type annotations for the pyhmmer.errors module.
Removed¶
Unneeded or private methods from pyhmmer.plan7.
v0.1.0-a3 - 2020-11-19¶
Added¶
TextSequence and DigitalSequence representing a Sequence in a given mode.
E-value properties to Hit and Domain.
TopHits now stores a reference to the pipeline it was obtained from.
Pipeline.Z and Pipeline.domZ properties.
Experimental pickling support to Alphabet.
Experimental freelist to Sequence class to avoid allocation bottlenecks when iterating on a SequenceFile without recycling sequence buffers.
Changed¶
Made Sequence an abstract base class.
Additional Pipeline parameters can be passed as keyword arguments to pyhmmer.hmmsearch.
SequenceFile.read can now be configured to skip reading the metadata or the content of a sequence.
Removed¶
Redundant SequenceFile methods.
Fixed¶
doctest loader crashing on Python 3.5.
TopHits.threshold segfaulting when being called without a prior TopHits.sort call.
Unknown format argument to SequenceFile constructor not raising the right error.
v0.1.0-a2 - 2020-11-12¶
Added¶
Support for compilation on PowerPC big-endian platforms.
Type annotations and stub files for Cython modules.
Changed¶
distutils is now used to compile the package, instead of calling autotools and letting HMMER configure itself.
Bitfield.count now allows passing an argument (for compatibility with collections.abc.Sequence).