Linear Algebra#

Vectors#

class pyhmmer.easel.Vector#

An abstract 1D array of fixed size.

Added in version 0.4.0.

argmax()#

Return index of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

argmin()#

Return index of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

copy()#: Create a copy of the vector, allocating a new buffer.

max()#

Return value of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

min()#

Return value of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

reverse()#: Reverse the vector, in place.

sum()#: Returns the scalar sum of all elements in the vector.

classmethod zeros(n)#: Create a vector of size n filled with zeros.

format#

The format of each item in the vector.

See also

The array module of the Python standard library for details about the available type codes.

Added in version 0.4.6.

Type:: str

itemsize#

The size of each item in the vector, in bytes.

Added in version 0.4.6.

Type:: int

shape#

The shape of the vector.

Type:: tuple

strides#

The strides of the vector.

Type:: tuple

class pyhmmer.easel.VectorF(Vector)#

A vector storing single-precision floating point numbers.

Individual elements of a vector can be accessed and modified with the usual indexing notation:

>>> v = VectorF([1.0, 2.0, 3.0])
>>> v[0]
1.0
>>> v[-1]
3.0
>>> v[0] = v[-1] = 4.0
>>> v
VectorF([4.0, 2.0, 4.0])

Slices are also supported, and they do not copy data (use the copy method to allocate a new vector):

>>> v = VectorF(range(6))
>>> v[2:5]
VectorF([2.0, 3.0, 4.0])
>>> v[2:-1] = 10.0
>>> v
VectorF([0.0, 1.0, 10.0, 10.0, 10.0, 5.0])
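
The no-copy behaviour of slices can be pictured with the standard library's memoryview, used here only as an analogy. This sketch does not use pyhmmer, and the names data, view and snapshot are illustrative:

```python
from array import array

# A buffer of six single-precision floats, standing in for a vector.
data = array("f", range(6))

# Slicing a memoryview yields a view over the same memory, not a copy.
view = memoryview(data)[2:5]
view[0] = 10.0  # writes through to the underlying buffer

# An explicit copy owns its own buffer, so later writes do not reach it.
snapshot = array("f", view.tolist())
view[1] = 20.0
```

After these statements, data holds the two written values while snapshot still holds the value from before the second write, which is the difference between slicing and calling copy.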

Addition and multiplication are supported for scalars, in place or not:

>>> v = VectorF([1.0, 2.0, 3.0])
>>> v += 1
>>> v
VectorF([2.0, 3.0, 4.0])
>>> v * 3
VectorF([6.0, 9.0, 12.0])

Pairwise operations can also be performed, but only on vectors of the same dimension and precision:

>>> v = VectorF([1.0, 2.0, 3.0])
>>> v * v
VectorF([1.0, 4.0, 9.0])
>>> v += VectorF([3.0, 4.0, 5.0])
>>> v
VectorF([4.0, 6.0, 8.0])
>>> v *= VectorF([1.0])
Traceback (most recent call last):
  ...
ValueError: cannot pairwise multiply vectors of different sizes

Objects of this type support the buffer protocol: they can be viewed as a one-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> v = VectorF([1.0, 2.0, 3.0])
>>> numpy.asarray(v)
array([1., 2., 3.], dtype=float32)
>>> numpy.log2(v)
array([0.       , 1.       , 1.5849625], dtype=float32)

Added in version 0.4.0.

__init__(iterable=())#: Create a new float vector from the given data.

argmax()#

Return index of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

argmin()#

Return index of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

copy()#: Create a copy of the vector, allocating a new buffer.

entropy()#

Compute the Shannon entropy of the vector.

The Shannon entropy of a probability vector is defined as:

\[H = -\sum_{i=1}^{N}{p_i \log_2 p_i}\]

Example

>>> easel.VectorF([0.1, 0.1, 0.3, 0.5]).entropy()
1.6854...
>>> easel.VectorF([0.25, 0.25, 0.25, 0.25]).entropy()
2.0

References

Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.

Added in version 0.4.10.
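
As a cross-check on the example values, the Shannon entropy in bits (minus the sum of each probability times its base-2 logarithm) can be sketched in plain Python. The function name shannon_entropy is hypothetical, and skipping zero probabilities is an assumption of this sketch rather than documented behaviour of the method:

```python
import math

def shannon_entropy(p):
    """Shannon entropy, in bits, of a probability vector ``p``."""
    # Assumption: zero entries contribute nothing, since p*log2(p) -> 0.
    return -sum(x * math.log2(x) for x in p if x > 0.0)

uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])
skewed = shannon_entropy([0.1, 0.1, 0.3, 0.5])
```

The uniform vector reaches the maximum of 2 bits for four outcomes, and the skewed vector gives a smaller value, in line with the example above.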

max()#

Return value of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

min()#

Return value of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

normalize()#: Normalize a vector so that all elements sum to 1.

Caution

If the sum is zero, all elements are set to \(\frac{1}{n}\), where \(n\) is the size of the vector.
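
The zero-sum fallback can be sketched in plain Python. This is an illustration under stated assumptions, not the library code: the helper name normalized is hypothetical, and it returns a new list whereas the method is described as normalizing the vector itself.

```python
def normalized(values):
    """Return a copy of ``values`` scaled so that its elements sum to 1."""
    n = len(values)
    if n == 0:
        return []  # nothing to scale in an empty sequence
    total = sum(values)
    if total == 0:
        # Fallback from the caution above: a uniform vector of 1/n.
        return [1.0 / n] * n
    return [x / total for x in values]

scaled = normalized([1.0, 3.0])
fallback = normalized([0.0, 0.0, 0.0, 0.0])
```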

relative_entropy(other)#

Compute the relative entropy between two probability vectors.

The Shannon relative entropy of two probability vectors \(p\) and \(q\), also known as the Kullback-Leibler divergence, is defined as:

\[D(p \parallel q) = \sum_i p_i \log_2 \frac{p_i}{q_i}.\]

with \(D(p \parallel q) = \infty\) by definition if \(q_i = 0\) and \(p_i > 0\) for any \(i\).

Example

>>> v1 = easel.VectorF([0.1, 0.1, 0.3, 0.5])
>>> v2 = easel.VectorF([0.25, 0.25, 0.25, 0.25])
>>> v1.relative_entropy(v2)
0.3145...
>>> v2.relative_entropy(v1)   # relative entropy is not symmetric
0.3452...

References

Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.

Added in version 0.4.10.
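
The definition, including the infinite case, can be sketched in plain Python. The function name kl_divergence is hypothetical, and skipping terms with a zero \(p_i\) is an assumption of this sketch:

```python
import math

def kl_divergence(p, q):
    """Relative entropy D(p || q), in bits, of two probability vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same size")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # assumption: 0 * log2(0 / q) is taken as 0
        if qi == 0.0:
            return math.inf  # q_i = 0 with p_i > 0
        total += pi * math.log2(pi / qi)
    return total

p = [0.1, 0.1, 0.3, 0.5]
q = [0.25, 0.25, 0.25, 0.25]
forward = kl_divergence(p, q)
backward = kl_divergence(q, p)
```

Swapping the arguments changes the result, which is why the two calls in the example above print different values.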

reverse()#: Reverse the vector, in place.

sum()#

Returns the scalar sum of all elements in the vector.

Floating-point summation uses Kahan's compensated algorithm to limit the accumulation of roundoff error. The result is most accurate when the elements are sorted in increasing order, from small to large, so consider sorting the vector before summing it.

References

Kahan, W. Pracniques: Further Remarks on Reducing Truncation Errors. Communications of the ACM 8, no. 1 (1 January 1965): 40. doi:10.1145/363707.363723.
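
For readers unfamiliar with compensated summation, here is a minimal plain-Python sketch of Kahan's algorithm. It illustrates the technique only and is not the library code; the names kahan_sum, naive and compensated are illustrative.

```python
def kahan_sum(values):
    """Sum floats while carrying a correction for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for x in values:
        y = x - compensation            # apply the pending correction
        t = total + y                   # low-order bits of y may be lost
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total

# One large value followed by many values too small to register alone.
values = [1.0] + [1e-16] * 10_000

naive = 0.0
for x in values:
    naive += x  # each tiny addend is rounded away

compensated = kahan_sum(values)
```

The plain loop never moves away from 1.0, while the compensated sum recovers the contribution of the small terms.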

class pyhmmer.easel.VectorD(Vector)#

A vector storing double-precision floating point numbers.

Individual elements of a vector can be accessed and modified with the usual indexing notation:

>>> v = VectorD([1.0, 2.0, 3.0])
>>> v[0]
1.0
>>> v[-1]
3.0
>>> v[0] = v[-1] = 4.0
>>> v
VectorD([4.0, 2.0, 4.0])

Slices are also supported, and they do not copy data (use the copy method to allocate a new vector):

>>> v = VectorD(range(6))
>>> v[2:5]
VectorD([2.0, 3.0, 4.0])
>>> v[2:-1] = 10.0
>>> v
VectorD([0.0, 1.0, 10.0, 10.0, 10.0, 5.0])

Addition and multiplication are supported for scalars, in place or not:

>>> v = VectorD([1.0, 2.0, 3.0])
>>> v += 1
>>> v
VectorD([2.0, 3.0, 4.0])
>>> v * 3
VectorD([6.0, 9.0, 12.0])

Pairwise operations can also be performed, but only on vectors of the same dimension and precision:

>>> v = VectorD([1.0, 2.0, 3.0])
>>> v * v
VectorD([1.0, 4.0, 9.0])
>>> v += VectorD([3.0, 4.0, 5.0])
>>> v
VectorD([4.0, 6.0, 8.0])
>>> v *= VectorD([1.0])
Traceback (most recent call last):
  ...
ValueError: cannot pairwise multiply vectors of different sizes

Objects of this type support the buffer protocol: they can be viewed as a one-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> v = VectorD([1.0, 2.0, 3.0])
>>> numpy.asarray(v)
array([1., 2., 3.])
>>> numpy.log2(v)
array([0.       , 1.       , 1.5849625])

Added in version 0.11.3.

__init__(iterable=())#: Create a new double vector from the given data.

argmax()#

Return index of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

argmin()#

Return index of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

copy()#: Create a copy of the vector, allocating a new buffer.

entropy()#

Compute the Shannon entropy of the vector.

The Shannon entropy of a probability vector is defined as:

\[H = -\sum_{i=1}^{N}{p_i \log_2 p_i}\]

Example

>>> easel.VectorD([0.1, 0.1, 0.3, 0.5]).entropy()
1.6854...
>>> easel.VectorD([0.25, 0.25, 0.25, 0.25]).entropy()
2.0

References

Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.

Added in version 0.4.10.

max()#

Return value of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

min()#

Return value of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

normalize()#: Normalize a vector so that all elements sum to 1.

Caution

If the sum is zero, all elements are set to \(\frac{1}{n}\), where \(n\) is the size of the vector.

relative_entropy(other)#

Compute the relative entropy between two probability vectors.

The Shannon relative entropy of two probability vectors \(p\) and \(q\), also known as the Kullback-Leibler divergence, is defined as:

\[D(p \parallel q) = \sum_i p_i \log_2 \frac{p_i}{q_i}.\]

with \(D(p \parallel q) = \infty\) by definition if \(q_i = 0\) and \(p_i > 0\) for any \(i\).

Example

>>> v1 = easel.VectorD([0.1, 0.1, 0.3, 0.5])
>>> v2 = easel.VectorD([0.25, 0.25, 0.25, 0.25])
>>> v1.relative_entropy(v2)
0.3145...
>>> v2.relative_entropy(v1)   # relative entropy is not symmetric
0.3452...

References

Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.

Added in version 0.4.10.

reverse()#: Reverse the vector, in place.

sum()#

Returns the scalar sum of all elements in the vector.

Floating-point summation uses Kahan's compensated algorithm to limit the accumulation of roundoff error. The result is most accurate when the elements are sorted in increasing order, from small to large, so consider sorting the vector before summing it.

References

Kahan, W. Pracniques: Further Remarks on Reducing Truncation Errors. Communications of the ACM 8, no. 1 (1 January 1965): 40. doi:10.1145/363707.363723.

class pyhmmer.easel.VectorI(Vector)#

A vector storing system-sized integers.

Individual elements of a vector can be accessed and modified with the usual indexing notation:

>>> v = VectorI([1, 2, 3])
>>> v[0]
1
>>> v[-1]
3
>>> v[0] = v[-1] = 4
>>> v
VectorI([4, 2, 4])

Slices are also supported, and they do not copy data (use the copy method to allocate a new vector):

>>> v = VectorI(range(6))
>>> v[2:5]
VectorI([2, 3, 4])
>>> v[2:-1] = 10
>>> v
VectorI([0, 1, 10, 10, 10, 5])

Addition and multiplication are supported for scalars, in place or not:

>>> v = VectorI([1, 2, 3])
>>> v += 1
>>> v
VectorI([2, 3, 4])
>>> v * 3
VectorI([6, 9, 12])

Pairwise operations can also be performed, but only on vectors of the same dimension and precision:

>>> v = VectorI([1, 2, 3])
>>> v * v
VectorI([1, 4, 9])
>>> v += VectorI([3, 4, 5])
>>> v
VectorI([4, 6, 8])
>>> v *= VectorI([1])
Traceback (most recent call last):
  ...
ValueError: cannot pairwise multiply vectors of different sizes

Objects of this type support the buffer protocol: they can be viewed as a one-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> v = VectorI([1, 2, 3])
>>> numpy.asarray(v)
array([1, 2, 3], dtype=int32)
>>> numpy.add(v, 2)
array([3, 4, 5], dtype=int32)

Added in version 0.12.0.

__init__(iterable=())#: Create a new integer vector from the given data.

argmax()#

Return index of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

argmin()#

Return index of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

copy()#: Create a copy of the vector, allocating a new buffer.

max()#

Return value of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

min()#

Return value of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

reverse()#: Reverse the vector, in place.

sum()#: Returns the scalar sum of all elements in the vector.

class pyhmmer.easel.VectorU8(Vector)#

A vector storing byte-sized unsigned integers.

Added in version 0.4.0.

__init__(iterable=())#: Create a new byte vector from the given data.

argmax()#

Return index of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

argmin()#

Return index of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

copy()#: Create a copy of the vector, allocating a new buffer.

max()#

Return value of the maximum element in the vector.

Raises:: ValueError – When called on an empty vector.

min()#

Return value of the minimum element in the vector.

Raises:: ValueError – When called on an empty vector.

reverse()#: Reverse the vector, in place.

sum()#

Returns the scalar sum of all elements in the vector.

Caution

The sum is wrapping:

>>> vec = VectorU8([255, 2])
>>> vec.sum()
1
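
The wrap-around corresponds to arithmetic modulo 256, which can be reproduced in plain Python. The helper name wrapping_sum_u8 is hypothetical and the sketch does not use pyhmmer:

```python
def wrapping_sum_u8(values):
    """Sum small non-negative integers, keeping only the low 8 bits."""
    total = 0
    for x in values:
        total = (total + x) & 0xFF  # discard any carry beyond one byte
    return total

wrapped = wrapping_sum_u8([255, 2])  # 257 modulo 256
exact = sum([255, 2])                # Python integers do not wrap
```

If the exact total is needed, one option is to sum the elements as Python integers, for instance by passing the vector to the built-in sum, assuming it can be iterated like other sequences.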

Matrices#

class pyhmmer.easel.Matrix#

An abstract 2D array of fixed size.

Added in version 0.4.0.

classmethod _from_raw_bytes(buffer, m, n, byteorder)#: Create a new matrix using the given bytes to fill its contents.

argmax()#

Return the coordinates of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

argmin()#

Return the coordinates of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

copy()#: Create a copy of the matrix, allocating a new buffer.

flatten()#

Return a flattened view over the matrix data.

Note

This method does not return a copy, but simply exposes the internal buffer used by the matrix, since all matrices are allocated as a C-contiguous array in HMMER.

Returns:: Vector – A vector treating the matrix buffer as a 1-D vector with the same item format.

Added in version 0.12.1.
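
The relationship between a C-contiguous matrix and its flattened view can be pictured with the standard library alone. This is an analogy rather than pyhmmer code, and the names flat and grid are illustrative: element (i, j) of an m × n matrix sits at index i * n + j of the flat buffer.

```python
from array import array

rows, cols = 2, 3

# Flat buffer of doubles, laid out row after row.
flat = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Two-dimensional view over the same memory; no data is copied.
grid = memoryview(flat).cast("B").cast("d", shape=[rows, cols])

before = grid[1, 0]  # row 1, column 0 is flat index 1 * cols + 0
flat[1 * cols + 0] = 40.0
after = grid[1, 0]   # the write is visible through the 2D view
```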

max()#

Return the value of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

min()#

Return the value of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

sum()#: Return the sum of all elements in the matrix.

classmethod zeros(m, n)#: Create a new \(m \times n\) matrix filled with zeros.

format#

The format of each item in the matrix.

See also

The array module of the Python standard library for details about the available type codes.

Added in version 0.4.7.

Type:: str

itemsize#

The size of each item in the matrix, in bytes.

Added in version 0.4.7.

Type:: int

shape#

The shape of the matrix.

Example

>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0], [5.0, 6.0] ])
>>> m.shape
(3, 2)

Type:: tuple

strides#

The strides of the matrix.

Type:: tuple
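
For a C-contiguous array, strides follow directly from the shape and the item size: the last axis advances by one item, and each earlier axis advances by a whole row of the axes after it. The sketch below computes them in plain Python; the helper name c_contiguous_strides is hypothetical, and that the strides attribute reports exactly these values is an assumption based on the C-contiguous layout mentioned for flatten().

```python
def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)  # bytes to skip to move one step on this axis
        step *= dim
    return tuple(reversed(strides))

matrix_strides = c_contiguous_strides((3, 2), 4)  # 3 x 2 matrix of 4-byte items
vector_strides = c_contiguous_strides((6,), 8)    # 6 items of 8 bytes each
```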

class pyhmmer.easel.MatrixD(Matrix)#

A matrix storing double-precision floating point numbers.

Use indexing notation to access and edit individual elements of the matrix:

>>> m = MatrixD.zeros(2, 2)
>>> m[0, 0] = 3.0
>>> m
MatrixD([[3.0, 0.0], [0.0, 0.0]])

Indexing can also be performed at the row-level to get a VectorD without copying the underlying data:

>>> m = MatrixD([ [1.0, 2.0], [3.0, 4.0] ])
>>> m[0]
VectorD([1.0, 2.0])

Objects of this type support the buffer protocol: they can be viewed as a two-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> m = MatrixD([ [1.0, 2.0], [3.0, 4.0] ])
>>> numpy.asarray(m)
array([[1., 2.],
       [3., 4.]])
>>> numpy.log2(m)
array([[0.       , 1.       ],
       [1.5849625, 2.       ]])

Added in version 0.11.2.

__init__(iterable=())#: Create a new matrix from an iterable of rows.

argmax()#

Return the coordinates of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

argmin()#

Return the coordinates of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

copy()#: Create a copy of the matrix, allocating a new buffer.

flatten()#

Return a flattened view over the matrix data.

Note

This method does not return a copy, but simply exposes the internal buffer used by the matrix, since all matrices are allocated as a C-contiguous array in HMMER.

Returns:: Vector – A vector treating the matrix buffer as a 1-D vector with the same item format.

Added in version 0.12.1.

max()#

Return the value of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

min()#

Return the value of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

sum()#: Return the sum of all elements in the matrix.

format#

itemsize#

class pyhmmer.easel.MatrixF(Matrix)#

A matrix storing single-precision floating point numbers.

Use indexing notation to access and edit individual elements of the matrix:

>>> m = MatrixF.zeros(2, 2)
>>> m[0, 0] = 3.0
>>> m
MatrixF([[3.0, 0.0], [0.0, 0.0]])

Indexing can also be performed at the row-level to get a VectorF without copying the underlying data:

>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0] ])
>>> m[0]
VectorF([1.0, 2.0])

Objects of this type support the buffer protocol: they can be viewed as a two-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0] ])
>>> numpy.asarray(m)
array([[1., 2.],
       [3., 4.]], dtype=float32)
>>> numpy.log2(m)
array([[0.       , 1.       ],
       [1.5849625, 2.       ]], dtype=float32)

Added in version 0.4.0.

__init__(iterable=())#: Create a new matrix from an iterable of rows.

argmax()#

Return the coordinates of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

argmin()#

Return the coordinates of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

copy()#: Create a copy of the matrix, allocating a new buffer.

flatten()#

Return a flattened view over the matrix data.

Note

This method does not return a copy, but simply exposes the internal buffer used by the matrix, since all matrices are allocated as a C-contiguous array in HMMER.

Returns:: Vector – A vector treating the matrix buffer as a 1-D vector with the same item format.

Added in version 0.12.1.

max()#

Return the value of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

min()#

Return the value of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

sum()#: Return the sum of all elements in the matrix.

format#

itemsize#

class pyhmmer.easel.MatrixI(Matrix)#

A matrix storing system-sized integers.

Use indexing notation to access and edit individual elements of the matrix:

>>> m = MatrixI.zeros(2, 2)
>>> m[0, 0] = 3
>>> m
MatrixI([[3, 0], [0, 0]])

Indexing can also be performed at the row-level to get a VectorI without copying the underlying data:

>>> m = MatrixI([ [1, 2], [3, 4] ])
>>> m[0]
VectorI([1, 2])

Objects of this type support the buffer protocol: they can be viewed as a two-dimensional numpy.ndarray with numpy.asarray, and passed without copying to most numpy functions:

>>> m = MatrixI([ [1, 2], [3, 4] ])
>>> numpy.asarray(m)
array([[1, 2],
       [3, 4]], dtype=int32)
>>> numpy.add(m, 2)
array([[3, 4],
       [5, 6]], dtype=int32)

Added in version 0.12.0.

__init__(iterable=())#: Create a new matrix from an iterable of rows.

argmax()#

Return the coordinates of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

argmin()#

Return the coordinates of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

copy()#: Create a copy of the matrix, allocating a new buffer.

flatten()#

Return a flattened view over the matrix data.

Note

This method does not return a copy, but simply exposes the internal buffer used by the matrix, since all matrices are allocated as a C-contiguous array in HMMER.

Returns:: Vector – A vector treating the matrix buffer as a 1-D vector with the same item format.

Added in version 0.12.1.

max()#

Return the value of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

min()#

Return the value of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

sum()#: Return the sum of all elements in the matrix.

format#

itemsize#

class pyhmmer.easel.MatrixU8(Matrix)#

A matrix storing byte-sized unsigned integers.

Added in version 0.4.0.

__init__(iterable=())#: Create a new matrix from an iterable of rows.

argmax()#

Return the coordinates of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

argmin()#

Return the coordinates of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

copy()#: Create a copy of the matrix, allocating a new buffer.

flatten()#

Return a flattened view over the matrix data.

Note

This method does not return a copy, but simply exposes the internal buffer used by the matrix, since all matrices are allocated as a C-contiguous array in HMMER.

Returns:: Vector – A vector treating the matrix buffer as a 1-D vector with the same item format.

Added in version 0.12.1.

max()#

Return the value of the maximum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

min()#

Return the value of the minimum element in the matrix.

Raises:: ValueError – When called on an empty matrix.

sum()#: Return the sum of all elements in the matrix.

format#

itemsize#