Linear Algebra#
- class pyhmmer.easel.Vector#
An abstract 1D array of fixed size.
Added in version 0.4.0.
- argmax()#
Return index of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- argmin()#
Return index of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- copy()#
Create a copy of the vector, allocating a new buffer.
- max()#
Return value of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- min()#
Return value of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- reverse()#
Reverse the vector, in place.
- sum()#
Returns the scalar sum of all elements in the vector.
- classmethod zeros(n)#
Create a vector of size n filled with zeros.
- class pyhmmer.easel.VectorF(Vector)#
A vector storing single-precision floating point numbers.
Individual elements of a vector can be accessed and modified with the usual indexing notation:
>>> v = VectorF([1.0, 2.0, 3.0])
>>> v[0]
1.0
>>> v[-1]
3.0
>>> v[0] = v[-1] = 4.0
>>> v
VectorF([4.0, 2.0, 4.0])
Slices are also supported, and they do not copy data (use the copy method to allocate a new vector):
>>> v = VectorF(range(6))
>>> v[2:5]
VectorF([2.0, 3.0, 4.0])
>>> v[2:-1] = 10.0
>>> v
VectorF([0.0, 1.0, 10.0, 10.0, 10.0, 5.0])
Addition and multiplication by a scalar are supported, in place or not:
>>> v = VectorF([1.0, 2.0, 3.0])
>>> v += 1
>>> v
VectorF([2.0, 3.0, 4.0])
>>> v * 3
VectorF([6.0, 9.0, 12.0])
Pairwise operations can also be performed, but only on vectors of the same dimension and precision:
>>> v = VectorF([1.0, 2.0, 3.0])
>>> v * v
VectorF([1.0, 4.0, 9.0])
>>> v += VectorF([3.0, 4.0, 5.0])
>>> v
VectorF([4.0, 6.0, 8.0])
>>> v *= VectorF([1.0])
Traceback (most recent call last):
  ...
ValueError: cannot pairwise multiply vectors of different sizes
Objects of this type support the buffer protocol. They can be viewed as a one-dimensional numpy.ndarray using the numpy.asarray function, and can be passed without copy to most numpy functions:
>>> v = VectorF([1.0, 2.0, 3.0])
>>> numpy.asarray(v)
array([1., 2., 3.], dtype=float32)
>>> numpy.log2(v)
array([0. , 1. , 1.5849625], dtype=float32)
Added in version 0.4.0.
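To illustrate what sharing memory "without copy" means for slices and for the buffer protocol, here is a minimal sketch using only the standard library. It assumes array.array as a stand-in for VectorF (it is not part of pyhmmer), since both expose a typed buffer of single-precision floats.

```python
from array import array

# Stand-in for a VectorF: a typed buffer of single-precision floats.
data = array("f", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# A memoryview exposes the same memory through the buffer protocol.
view = memoryview(data)
window = view[2:5]           # slicing a view does not copy either

# Writing through the slice changes the original buffer.
window[0] = 10.0
shared = data[2]             # the write is visible in the original

# An explicit copy owns its own buffer and ignores later writes.
snapshot = array("f", data)
window[1] = 20.0
```

After this runs, data[3] holds 20.0 while snapshot[3] still holds 3.0, which is the same distinction as between slicing a vector and calling its copy method.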
- __init__(iterable=())#
Create a new float vector from the given data.
- argmax()#
Return index of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- argmin()#
Return index of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- copy()#
Create a copy of the vector, allocating a new buffer.
- entropy()#
Compute the Shannon entropy of the vector.
The Shannon entropy of a probability vector is defined as:
\[H = -\sum_{i=1}^{N}{p_i \log_2 p_i}\]
Example
>>> easel.VectorF([0.1, 0.1, 0.3, 0.5]).entropy()
1.6854...
>>> easel.VectorF([0.25, 0.25, 0.25, 0.25]).entropy()
2.0
References
Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.
Added in version 0.4.10.
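The entropy values in the example can be checked by hand with a short pure-Python sketch. The function name shannon_entropy is hypothetical, and skipping zero-probability terms is an assumed convention; neither is taken from pyhmmer. The sketch also works in double precision rather than the single precision of VectorF.

```python
import math

def shannon_entropy(p):
    """Return the Shannon entropy, in bits, of a probability vector."""
    # Assumed convention: terms with p_i == 0 contribute nothing,
    # since p * log2(p) tends to 0 as p tends to 0.
    return -sum(x * math.log2(x) for x in p if x > 0.0)

uniform = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # 2 bits for 4 equal outcomes
skewed = shannon_entropy([0.1, 0.1, 0.3, 0.5])       # lower than the uniform case
```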
- max()#
Return value of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- min()#
Return value of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- normalize()#
Normalize a vector so that all elements sum to 1.
Caution
If sum is zero, sets all elements to \(\frac{1}{n}\), where \(n\) is the size of the vector.
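The zero-sum fallback can be sketched in pure Python as follows. This is an illustration of the behaviour described above, not the library's implementation: the function name is hypothetical, and unlike the method it returns a new list instead of modifying its argument.

```python
def normalize(values):
    """Return a copy of `values` rescaled so that its elements sum to 1."""
    n = len(values)
    if n == 0:
        return []                      # nothing to normalize
    total = sum(values)
    if total == 0:
        # Fallback described in the caution: every element becomes 1/n.
        return [1.0 / n] * n
    return [x / total for x in values]

scaled = normalize([1.0, 1.0, 2.0])
fallback = normalize([0.0, 0.0, 0.0, 0.0])
```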
- relative_entropy(other)#
Compute the relative entropy between two probability vectors.
The Shannon relative entropy of two probability vectors \(p\) and \(q\), also known as the Kullback-Leibler divergence, is defined as:
\[D(p \parallel q) = \sum_i p_i \log_2 \frac{p_i}{q_i}\]
with \(D(p \parallel q) = \infty\) by definition if \(q_i = 0\) and \(p_i > 0\) for any \(i\).
Example
>>> v1 = easel.VectorF([0.1, 0.1, 0.3, 0.5])
>>> v2 = easel.VectorF([0.25, 0.25, 0.25, 0.25])
>>> v1.relative_entropy(v2)
0.3145...
>>> v2.relative_entropy(v1)  # relative entropy is not symmetric
0.3452...
References
Cover, Thomas M., and Thomas, Joy A. Entropy, Relative Entropy, and Mutual Information. In Elements of Information Theory, 13–55. Wiley (2005): 2. doi:10.1002/047174882X.ch2 ISBN:9780471241959.
Added in version 0.4.10.
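As a cross-check of the definition, the following pure-Python sketch computes the Kullback-Leibler divergence, including the infinite case. The function is a hypothetical stand-in written in double precision; skipping terms where \(p_i = 0\) is an assumed convention, and how the library handles vectors of different sizes is not specified here.

```python
import math

def relative_entropy(p, q):
    """Return D(p || q) in bits for two probability vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same size")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue                   # assumed convention: 0 * log(0/q) = 0
        if qi == 0.0:
            return math.inf            # q_i = 0 with p_i > 0
        total += pi * math.log2(pi / qi)
    return total

p = [0.1, 0.1, 0.3, 0.5]
q = [0.25, 0.25, 0.25, 0.25]
forward = relative_entropy(p, q)
backward = relative_entropy(q, p)      # differs from `forward`
```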
- reverse()#
Reverse the vector, in place.
- sum()#
Returns the scalar sum of all elements in the vector.
Float summations use Kahan’s algorithm, in order to minimize roundoff error accumulation. Additionally, they are most accurate if the vector is sorted in increasing order, from small to large, so you may consider sorting the vector before summing it.
References
Kahan, W. Pracniques: Further Remarks on Reducing Truncation Errors. Communications of the ACM 8, no. 1 (1 January 1965): 40. doi:10.1145/363707.363723.
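For readers unfamiliar with compensated summation, here is a minimal sketch of Kahan's algorithm in pure Python. It uses Python's double-precision floats, whereas the vector stores single-precision values, and it is not the library's code. The naive loop is written out explicitly because the built-in sum applies its own compensation to floats in recent Python versions.

```python
def naive_sum(values):
    """Plain left-to-right accumulation, with no error compensation."""
    total = 0.0
    for x in values:
        total += x
    return total

def kahan_sum(values):
    """Kahan compensated summation."""
    total = 0.0
    compensation = 0.0                    # estimate of the low-order bits lost so far
    for x in values:
        y = x - compensation              # re-inject the previously lost part
        t = total + y                     # low-order bits of y may be lost here
        compensation = (t - total) - y    # recover what was just lost
        total = t
    return total

# One large term followed by many terms too small to register individually.
values = [1.0] + [1e-16] * 100_000
naive = naive_sum(values)
compensated = kahan_sum(values)
```

In this input every small term is rounded away by the naive loop, so naive stays at 1.0, while the compensated total ends up close to 1.0 + 1e-11.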
- format#
- itemsize#
- class pyhmmer.easel.VectorU8(Vector)#
A vector storing byte-sized unsigned integers.
Added in version 0.4.0.
- __init__(iterable=())#
Create a new byte vector from the given data.
- argmax()#
Return index of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- argmin()#
Return index of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- copy()#
Create a copy of the vector, allocating a new buffer.
- max()#
Return value of the maximum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- min()#
Return value of the minimum element in the vector.
- Raises:
ValueError – When called on an empty vector.
- reverse()#
Reverse the vector, in place.
- sum()#
Returns the scalar sum of all elements in the vector.
Caution
The sum is wrapping:
>>> vec = VectorU8([255, 2])
>>> vec.sum()
1
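The wrapping behaviour corresponds to arithmetic modulo 256, which can be sketched in pure Python as follows. The helper name is hypothetical and only mimics an 8-bit unsigned accumulator.

```python
def wrapping_sum_u8(values):
    """Sum bytes the way an 8-bit unsigned accumulator would."""
    total = 0
    for x in values:
        total = (total + x) & 0xFF     # keep only the low 8 bits
    return total

wrapped = wrapping_sum_u8([255, 2])    # 257 modulo 256
exact = sum([255, 2])                  # Python integers do not overflow
```

If the exact total matters, one option is to sum the elements as ordinary Python integers instead, as in the last line.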
- format#
- itemsize#
- class pyhmmer.easel.Matrix#
An abstract 2D array of fixed size.
Added in version 0.4.0.
- classmethod _from_raw_bytes(buffer, m, n, byteorder)#
Create a new matrix using the given bytes to fill its contents.
- argmax()#
Return the coordinates of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- argmin()#
Return the coordinates of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
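The coordinates returned by argmax and argmin can be understood as a flat row-major index converted to a (row, column) pair. The sketch below is a hypothetical pure-Python equivalent for a list of rows; returning the first occurrence on ties is an assumption, as the tie-breaking rule is not documented here.

```python
def matrix_argmax(rows):
    """Return the (row, column) coordinates of the largest element."""
    if not rows or not rows[0]:
        raise ValueError("argmax called on an empty matrix")
    n = len(rows[0])
    flat = [x for row in rows for x in row]            # row-major flattening
    k = max(range(len(flat)), key=flat.__getitem__)    # first maximum wins
    return divmod(k, n)                                # flat index -> (row, column)

coords = matrix_argmax([[1.0, 2.0], [3.0, 9.0], [5.0, 6.0]])
```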
- copy()#
Create a copy of the matrix, allocating a new buffer.
- max()#
Return the value of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- min()#
Return the value of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- sum()#
Return the sum of all elements in the matrix.
- classmethod zeros(m, n)#
Create a new \(m \times n\) matrix filled with zeros.
- format#
The format of each item in the matrix.
See also
The array module of the Python standard library for details about the available type codes.
Added in version 0.4.7.
- Type:
- shape#
The shape of the matrix.
Example
>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0], [5.0, 6.0] ])
>>> m.shape
(3, 2)
- Type:
- class pyhmmer.easel.MatrixF#
A matrix storing single-precision floating point numbers.
Use indexing notation to access and edit individual elements of the matrix:
>>> m = MatrixF.zeros(2, 2)
>>> m[0, 0] = 3.0
>>> m
MatrixF([[3.0, 0.0], [0.0, 0.0]])
Indexing can also be performed at the row level to get a VectorF without copying the underlying data:
>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0] ])
>>> m[0]
VectorF([1.0, 2.0])
Objects of this type support the buffer protocol. They can be viewed as a two-dimensional numpy.ndarray using the numpy.asarray function, and can be passed without copy to most numpy functions:
>>> m = MatrixF([ [1.0, 2.0], [3.0, 4.0] ])
>>> numpy.asarray(m)
array([[1., 2.],
       [3., 4.]], dtype=float32)
>>> numpy.log2(m)
array([[0. , 1. ],
       [1.5849625, 2. ]], dtype=float32)
Added in version 0.4.0.
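Row views that share memory with their matrix can be pictured with a flat row-major buffer. The sketch below uses only the standard library and assumes a row-major layout for illustration; array.array and the row helper are stand-ins, not pyhmmer objects.

```python
from array import array

m, n = 3, 2
flat = array("f", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])    # row-major storage
view = memoryview(flat)

def row(i):
    """Return a zero-copy view of row i."""
    return view[i * n:(i + 1) * n]

second = row(1)
second[0] = 30.0        # writes through to the shared buffer

# The same bytes viewed as a 2D buffer of shape (m, n).
grid = view.cast("B").cast("f", shape=[m, n])
```

Because second and grid both refer to the memory owned by flat, the write to second[0] is visible as flat[2] and as grid[1, 0].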
- __init__(iterable=())#
Create a new matrix from an iterable of rows.
- argmax()#
Return the coordinates of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- argmin()#
Return the coordinates of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- copy()#
Create a copy of the matrix, allocating a new buffer.
- max()#
Return the value of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- min()#
Return the value of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- sum()#
Return the sum of all elements in the matrix.
- format#
- itemsize#
- class pyhmmer.easel.MatrixU8#
A matrix storing byte-sized unsigned integers.
Added in version 0.4.0.
- __init__(iterable=())#
Create a new matrix from an iterable of rows.
- argmax()#
Return the coordinates of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- argmin()#
Return the coordinates of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- copy()#
Create a copy of the matrix, allocating a new buffer.
- max()#
Return the value of the maximum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- min()#
Return the value of the minimum element in the matrix.
- Raises:
ValueError – When called on an empty matrix.
- sum()#
Return the sum of all elements in the matrix.
- format#
- itemsize#