Miscellaneous#

class pyhmmer.easel.Alphabet#

A biological alphabet, including additional marker symbols.

This type is used to share an alphabet among several objects in the easel and plan7 modules. Reference counting lets the same instance be shared everywhere, instead of reallocating memory every time an alphabet is needed.

Use the factory class methods to obtain a default Alphabet for one of the three standard biological alphabets:

>>> dna = Alphabet.dna()
>>> rna = Alphabet.rna()
>>> aa  = Alphabet.amino()

classmethod amino()#

Create a default amino-acid alphabet.

decode(sequence)#

Decode a raw digital sequence into its textual representation.

Parameters:

sequence (object, buffer-like) – A raw sequence in digital format. Any object implementing the buffer protocol (like bytearray, VectorU8, etc.) may be given.

Returns:

str – A raw sequence in textual format.

Example

>>> alphabet = easel.Alphabet.amino()
>>> dseq = easel.VectorU8([0, 4, 2, 17, 3, 13, 0, 0, 5])
>>> alphabet.decode(dseq)
'AFDVEQAAG'

Added in version 0.6.3.

classmethod dna()#

Create a default DNA alphabet.

encode(sequence)#

Encode a raw text sequence into its digital representation.

Parameters:

sequence (str) – A raw sequence in text format.

Returns:

VectorU8 – A raw sequence in digital format.

Example

>>> alphabet = easel.Alphabet.dna()
>>> alphabet.encode("ACGT")
VectorU8([0, 1, 2, 3])

Added in version 0.6.3.
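
The mapping behind encode and decode can be illustrated without the library: each residue is replaced by its index in the alphabet's symbols string. The sketch below is plain Python and not pyhmmer's implementation. The helper names toy_encode and toy_decode are hypothetical, and the amino-acid symbol string is an assumption based on Easel's standard amino alphabet (the page only shows the DNA and RNA strings).

```python
# DNA symbols as shown for Alphabet.dna().symbols; the amino string is assumed.
DNA_SYMBOLS = "ACGT-RYMKSWHBVDN*~"
AMINO_SYMBOLS = "ACDEFGHIKLMNPQRSTVWY-BJZOUX*~"


def toy_encode(text, symbols):
    """Map each character to its index in `symbols` (hypothetical helper)."""
    # str.index raises ValueError for a character outside the alphabet.
    return bytearray(symbols.index(char) for char in text)


def toy_decode(digital, symbols):
    """Map each digital code back to its character (hypothetical helper)."""
    return "".join(symbols[code] for code in digital)


dna_codes = toy_encode("ACGT", DNA_SYMBOLS)
protein = toy_decode(bytearray([0, 4, 2, 17, 3, 13, 0, 0, 5]), AMINO_SYMBOLS)
print(list(dna_codes))
print(protein)
```

A bytearray is used here only because it implements the buffer protocol, like the objects accepted by decode.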

is_amino()#

Check whether the Alphabet object is a protein alphabet.

is_dna()#

Check whether the Alphabet object is a DNA alphabet.

is_nucleotide()#

Check whether the Alphabet object is a nucleotide alphabet.

is_rna()#

Check whether the Alphabet object is an RNA alphabet.

classmethod rna()#

Create a default RNA alphabet.

K#

The alphabet size, counting only actual alphabet symbols.

Example

>>> Alphabet.dna().K
4
>>> Alphabet.amino().K
20

Type:

int

Kp#

The complete alphabet size, including marker symbols.

Example

>>> Alphabet.dna().Kp
18
>>> Alphabet.amino().Kp
29

Type:

int

symbols#

The symbols composing the alphabet.

Example

>>> Alphabet.dna().symbols
'ACGT-RYMKSWHBVDN*~'
>>> Alphabet.rna().symbols
'ACGU-RYMKSWHBVDN*~'

Type:

str
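
The relationship between K, Kp and symbols can be checked by hand from the values shown in the examples above. The following sketch uses only plain strings; the observation that the first K characters of symbols are the canonical residues is read off the DNA example and is an assumption for other alphabets.

```python
# Values copied from the DNA examples for `symbols` and `K`.
dna_symbols = "ACGT-RYMKSWHBVDN*~"
K = 4

canonical = dna_symbols[:K]   # actual alphabet symbols
extra = dna_symbols[K:]       # gap, degenerate codes and other markers
Kp = len(dna_symbols)         # complete size, markers included

print(canonical, extra, Kp)
```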

type#

The alphabet type, as a short string.

Example

>>> Alphabet.dna().type
'DNA'
>>> Alphabet.amino().type
'amino'

Added in version 0.8.2.

Type:

str

class pyhmmer.easel.GeneticCode#

A genetic code table for translation.

Added in version 0.7.2.

__init__(translation_table=1, *, nucleotide_alphabet=None, amino_alphabet=None)#

Create a new genetic code for translating nucleotide sequences.

Parameters:
  • translation_table (int) – The translation table to use. Check the Wikipedia page listing all genetic codes for the available values.

  • nucleotide_alphabet (Alphabet) – The nucleotide alphabet from which to translate the sequence.

  • amino_alphabet (Alphabet) – The target alphabet into which to translate the sequence.

translate(sequence)#

Translate a raw nucleotide sequence into a protein.

Parameters:

sequence (object, buffer-like) – A raw sequence in digital format. Any object implementing the buffer protocol (like bytearray, VectorU8, etc.) may be given.

Returns:

VectorU8 – The translation of the input sequence, as a raw digital sequence.

Raises:

ValueError – When sequence could not be translated properly, because a codon could not be recognized, or because the sequence has an invalid length.

Note

The translation of a DNA/RNA codon supports ambiguous codons. If the amino acid is unambiguous, despite codon ambiguity, the correct amino acid is still determined: GGR translates as Gly, UUY as Phe, etc. If there is no single unambiguous amino acid translation, the codon is translated as X. Ambiguous amino acids (such as J or B) are never produced.
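
The rule described in this note can be sketched in plain Python: expand each ambiguous codon into every concrete codon it may stand for, translate them all, and keep the amino acid only if every expansion agrees. This is an illustration of the rule and not the library's code: toy_translate is a hypothetical helper, it works on text rather than digital sequences, and it hard-codes the standard genetic code (translation table 1).

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes; U is treated as T so RNA input also works.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def toy_translate(seq):
    """Translate text codons, resolving ambiguity where possible (hypothetical)."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3].upper()
        try:
            choices = [IUPAC[base] for base in codon]
        except KeyError:
            raise ValueError(f"unrecognized codon: {codon!r}") from None
        # Translate every concrete codon the ambiguous codon may stand for.
        residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
        protein.append(residues.pop() if len(residues) == 1 else "X")
    return "".join(protein)


translated = toy_translate("ATGGGRUUYRAY")
print(translated)
```

In this sketch RAY expands to codons for both Asn and Asp, so it becomes X rather than the ambiguous amino acid B.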

description#

A description of the translation table currently in use.

Type:

str

translation_table#

The translation table in use.

Can be set manually to a different number to change the translation table for the current GeneticCode object.

Type:

int

class pyhmmer.easel.Randomness#

A portable, thread-safe random number generator.

Methods with an implementation in Easel are named after the equivalent methods of random.Random.

Added in version 0.4.2.

__init__(seed=None, fast=False)#

Create a new random number generator with the given seed.

Parameters:
  • seed (int) – The seed to initialize the generator with. If 0 or None is given, an arbitrary seed will be chosen using the system clock.

  • fast (bool) – If True, use a linear congruential generator (LCG), which is low quality and should only be used for integration with legacy code. With False, use the Mersenne Twister MT19937 algorithm instead.

copy()#

Return a copy of the random number generator in the same exact state.

getstate()#

Get a tuple containing the current state.

normalvariate(mu, sigma)#

Generate a Gaussian-distributed sample.

Parameters:
  • mu (float) – The mean of the Gaussian being sampled.

  • sigma (float) – The standard deviation of the Gaussian being sampled.

random()#

Generate a uniform random deviate on \(\left[ 0, 1 \right)\).

seed(n=None)#

Reinitialize the random number generator with the given seed.

Parameters:

n (int, optional) – The seed to use. If 0 or None, an arbitrary seed will be chosen using the current time.

setstate(state)#

Restore the state of the random number generator.
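
Because these methods are named after those of random.Random, the standard library class can serve as a stand-in to illustrate the save and restore pattern. The sketch below uses only Python's random module. That Randomness follows the same pattern for getstate, setstate and seed is an assumption based on the naming, and the contents of its state tuple will differ.

```python
import random

rng = random.Random(42)
rng.random()                      # advance the generator once

state = rng.getstate()            # snapshot the current state
first = [rng.random() for _ in range(3)]

rng.setstate(state)               # rewind to the snapshot
second = [rng.random() for _ in range(3)]

rng.seed(42)                      # reseeding restarts the sequence
reseeded = rng.random()

sample = rng.normalvariate(0.0, 1.0)   # Gaussian sample with mu=0, sigma=1
print(first == second)
```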

fast#

True when the linear congruential generator is in use.

Type:

bool

class pyhmmer.easel.SSIReader#

A read-only handler for a sequence/subsequence index file.

class Entry(fd, record_offset, data_offset, record_length)#

data_offset#

Alias for field number 2

fd#

Alias for field number 0

record_length#

Alias for field number 3

record_offset#

Alias for field number 1

class FileInfo(name, format)#

format#

Alias for field number 1

name#

Alias for field number 0

__init__(file)#

Create a new SSI file reader for the file at the given location.

Parameters:

file (str, bytes or os.PathLike) – The path to a sequence/subsequence index file to read.

close()#

Close the SSI file reader.

file_info(fd)#

Retrieve the FileInfo of the descriptor.

find_name(key)#

Retrieve the Entry for the given name.

class pyhmmer.easel.SSIWriter#

A writer for sequence/subsequence index files.

__init__(file, exclusive=False)#

Create a new SSI file writer for the file at the given location.

Parameters:
  • file (str, bytes or os.PathLike) – The path to a sequence/subsequence index file to write.

  • exclusive (bool) – Whether or not to create a file if one does not exist.

Raises:

add_alias(alias, key)#

Make alias an alias of key in the index.

add_file(filename, format=0)#

Add a new file to the index.

Parameters:
  • filename (str, bytes or os.PathLike) – The name of the file to register.

  • format (int) – A format code to associate with the file, or 0 by default.

Returns:

int – The filehandle associated with the new indexed file.

add_key(key, fd, record_offset, data_offset=0, record_length=0)#

Add a new entry to the index with the given key.

close()#

Close the SSI file writer.
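
The data model shared by SSIWriter and SSIReader can be pictured with a small in-memory stand-in. The sketch below is a toy and does not read or write SSI files: ToyIndex is a hypothetical class that merges the writer and reader methods for brevity, the file name and offsets are made up, and resolving aliases in find_name is an assumption. Only the Entry and FileInfo field names and the method signatures are taken from this page.

```python
from collections import namedtuple

# Same field order as the documented named tuples.
Entry = namedtuple("Entry", ["fd", "record_offset", "data_offset", "record_length"])
FileInfo = namedtuple("FileInfo", ["name", "format"])


class ToyIndex:
    """In-memory stand-in for an SSI index (hypothetical, not pyhmmer's class)."""

    def __init__(self):
        self._files = []     # position in this list acts as the file handle
        self._entries = {}   # primary key -> Entry
        self._aliases = {}   # alias -> primary key

    def add_file(self, filename, format=0):
        self._files.append(FileInfo(filename, format))
        return len(self._files) - 1

    def add_key(self, key, fd, record_offset, data_offset=0, record_length=0):
        self._entries[key] = Entry(fd, record_offset, data_offset, record_length)

    def add_alias(self, alias, key):
        self._aliases[alias] = key

    def file_info(self, fd):
        return self._files[fd]

    def find_name(self, key):
        # Fall back to the key itself when it is not a registered alias.
        return self._entries[self._aliases.get(key, key)]


index = ToyIndex()
fd = index.add_file("proteins.fasta")
index.add_key("seq1", fd, record_offset=0, data_offset=6, record_length=120)
index.add_alias("P12345", "seq1")

entry = index.find_name("P12345")
print(entry, index.file_info(entry.fd).name)
```

Since Entry is a named tuple, entry.data_offset and entry[2] refer to the same value, which is what "Alias for field number 2" means above.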