mangoes.utils.arrays module¶
Utility classes and functions to handle matrices
This module provides the Matrix class, which encapsulates all the methods needed for both sparse and dense matrices.
-
mangoes.utils.arrays.
sqrt
(matrix, inplace=False)¶ Perform element-wise sqrt.
- Parameters
- matrix: a matrix-like object
- inplace: boolean, optional
whether or not to perform the operation inplace (default=False)
- Returns
- a matrix-like object
has the same type as the one input
-
mangoes.utils.arrays.
normalize
(matrix, norm='l2', axis=1, inplace=False)¶ Normalize the matrix-like input.
- Parameters
- matrix: a matrix-like object
- norm: {‘l2’ (default), ‘l1’ or ‘max’}, optional
- axis: {1 (default) or 0}
specify along which axis to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.
- inplace: boolean
whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the normalization is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
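As a rough illustration of what the three norms mean, the following NumPy sketch normalizes a dense array along an axis. The helper name `normalize_dense` is hypothetical and is not part of mangoes, and leaving all-zero rows or columns unchanged is an assumption made for the example.

```python
import numpy as np

def normalize_dense(matrix, norm="l2", axis=1):
    """Hypothetical helper: scale rows (axis=1) or columns (axis=0) to unit norm."""
    matrix = np.asarray(matrix, dtype=float)  # integer input is cast so division works
    if norm == "l2":
        norms = np.sqrt((matrix ** 2).sum(axis=axis, keepdims=True))
    elif norm == "l1":
        norms = np.abs(matrix).sum(axis=axis, keepdims=True)
    elif norm == "max":
        norms = np.abs(matrix).max(axis=axis, keepdims=True)
    else:
        raise ValueError("norm must be 'l2', 'l1' or 'max'")
    norms[norms == 0] = 1.0  # assumption: all-zero vectors are left unchanged
    return matrix / norms

rows_l2 = normalize_dense([[3, 4], [0, 2]], norm="l2", axis=1)
cols_l1 = normalize_dense([[1, 1], [3, 1]], norm="l1", axis=0)
```

With axis=1 each row ends up with unit norm; with axis=0 each column does.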
-
mangoes.utils.arrays.
center
(matrix, axis=1, inplace=False)¶ Center the matrix-like input wrt row average, column average, or total average.
- Parameters
- matrix: a matrix-like object
- axis: {1 (default), 0, None}
the axis along which to compute the average vector (e.g. if axis=0, the columns are centered; if axis=1, the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.
- inplace: boolean
whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the centering is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.
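The three values of axis can be illustrated with a small NumPy sketch. The helper `center_dense` is hypothetical (it is not the mangoes implementation) and only handles dense input.

```python
import numpy as np

def center_dense(matrix, axis=1):
    """Hypothetical helper: subtract the row, column, or global mean."""
    matrix = np.asarray(matrix, dtype=float)
    if axis is None:
        return matrix - matrix.mean()
    # keepdims=True keeps the mean broadcastable against the matrix
    return matrix - matrix.mean(axis=axis, keepdims=True)

data = [[1, 2, 3], [4, 6, 8]]
rows_centered = center_dense(data, axis=1)     # each row now has mean 0
cols_centered = center_dense(data, axis=0)     # each column now has mean 0
all_centered = center_dense(data, axis=None)   # the whole matrix now has mean 0
```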
-
class
mangoes.utils.arrays.
Matrix
¶ Bases:
object
Abstract class used to store generated vectors in a matrix
Methods
all_positive
()Check if all values of the matrix are positive or zero.
as_dense
()Return a dense version of the matrix
center
([axis, inplace])Center the matrix to row average, column average, or total average.
combine
(other, new_shape[, row_indices_map, …])Merge another matrix with this one, creating a new matrix
factory
(matrix)Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix
format_vector
(vector, sep)Format a line of a csrSparseMatrix as a string
load
(path, name)Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
multiply_rowwise
(array)Multiply the values of the matrix by the value of the array whose index corresponds to the row index
nb_of_non_zeros_values_by_row
()Return the number of non-zero values in each row of the matrix
normalize
([norm, axis, inplace])Normalize the matrix.
pairwise_distances
([Y, metric])Compute a distance matrix from the rows of the matrix.
replace_negative_or_zeros_by
(value)Replace the negative or zero values of the matrix with a new value
save
(path[, name])Save the matrix in a file.
sqrt
([inplace])Performs element-wise sqrt.
-
abstract
multiply_rowwise
(array)¶ Multiply the values of the matrix by the value of the array whose index corresponds to the row index
- Parameters
- array: array
a 1d array with length = nb of matrix rows
- Returns
- matrix
new matrix with same type as self
Examples
>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3],
             [8, 8, 8]])
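For dense data, the same row-wise scaling can be sketched with plain NumPy broadcasting. This is only an illustration of the operation, not the mangoes implementation; the variable names are chosen for the example.

```python
import numpy as np

M = np.asarray([[1, 1, 1], [2, 2, 2]])
v = np.asarray([3, 4])  # one factor per row of M

# Reshape v into a column so that broadcasting multiplies row i by v[i].
result = M * v.reshape(-1, 1)
```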
-
abstract
normalize
(norm='l2', axis=1, inplace=False)¶ Normalize the matrix.
- Parameters
- norm: {‘l2’ (default), ‘l1’ or ‘max’}
- axis: {1 (default) or 0}
specify along which axis to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.
- inplace: boolean
whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the normalization is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
-
abstract
center
(axis=1, inplace=False)¶ Center the matrix to row average, column average, or total average.
- Parameters
- axis: {1 (default), None, 0}
the axis along which to compute the average vector (e.g. if axis=0, the columns are centered; if axis=1, the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.
- inplace: boolean
whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the centering is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.
-
abstract
sqrt
(inplace=False)¶ Performs element-wise sqrt.
- Parameters
- inplace: boolean
whether or not to perform the operation inplace (default=False)
- Returns
- a matrix-like object
-
abstract
combine
(other, new_shape, row_indices_map=None, col_indices_map=None)¶ Merge another matrix with this one, creating a new matrix
The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.
- Parameters
- other: matrix-like object
another matrix to merge with this one
- new_shape: tuple
the shape of the resulting matrix
- row_indices_map: dict
a mapping between the indices of the rows of ‘other’ and the merged matrix
- col_indices_map: dict
a mapping between the indices of the columns of ‘other’ and the merged matrix
- Returns
- merged: matrix-like object
Examples
>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0., 1., 2., 1.],
             [ 3., 6., 5., 3.]])
- indices_map = {0:1, # column 0 of the second matrix goes to column 1 of the merged matrix
1:3} # column 1 of the second matrix goes to column 3 of the merged matrix
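The merge described above can be sketched for dense arrays as follows. The helper `combine_dense` is hypothetical, and treating a missing map as "indices are kept unchanged" is an assumption; only the behaviour shown in the documented example is reproduced.

```python
import numpy as np

def combine_dense(a, b, new_shape, row_indices_map=None, col_indices_map=None):
    """Hypothetical helper mirroring the documented example for dense arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    merged = np.zeros(new_shape)
    merged[:a.shape[0], :a.shape[1]] = a  # extend `a` to the new shape
    # Assumption: a missing map means the indices of `b` are kept unchanged.
    rows = [row_indices_map[i] if row_indices_map else i for i in range(b.shape[0])]
    cols = [col_indices_map[j] if col_indices_map else j for j in range(b.shape[1])]
    # np.add.at accumulates correctly even if several indices share a target.
    np.add.at(merged, (np.array(rows)[:, None], np.array(cols)[None, :]), b)
    return merged

A = np.arange(6).reshape((2, 3))
B = np.arange(4).reshape((2, 2))
merged = combine_dense(A, B, (2, 4), col_indices_map={0: 1, 1: 3})
```

Here `merged` holds the same values as the documented example: A padded with a zero column, plus B's columns added at positions 1 and 3.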
-
abstract
pairwise_distances
(Y=None, metric='cosine', **kwargs)¶ Compute a distance matrix from the rows of the matrix.
This method returns the matrix of the distances between the rows of the matrix. Or, if Y is given (default is None), then the returned matrix is the pairwise distance between the rows of the matrix and the ones from Y.
This function relies on sklearn.metrics.pairwise_distances, so you can use any distance available in it.
Valid values for metric are:
- From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.
- From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]
See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.
- Parameters
- Y: matrix, optional
An optional second matrix.
- metric: string or callable
The metric to use when calculating distance between vectors.
- **kwargs: optional keyword parameters
Any further parameters are passed directly to the distance function.
- Returns
- array [len(self), len(self)] or [len(self), len(Y)]
A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.
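To make the shape and meaning of the returned matrix concrete, here is a NumPy-only sketch of the default ‘cosine’ metric. The helper `cosine_distances` is hypothetical, assumes dense input with no all-zero rows, and is not how mangoes computes the result (the method delegates to scikit-learn).

```python
import numpy as np

def cosine_distances(X, Y=None):
    """Hypothetical helper: cosine distance between rows of X and rows of Y (or X)."""
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    Y_unit = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    # D[i, j] = 1 - cosine of the angle between X[i] and Y[j]
    return 1.0 - X_unit @ Y_unit.T

X = [[1, 0], [0, 1], [1, 1]]
D = cosine_distances(X)                # shape (len(X), len(X))
D_xy = cosine_distances(X, [[2, 0]])   # shape (len(X), len(Y))
```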
-
abstract
replace_negative_or_zeros_by
(value)¶ Replace the negative or zero values of the matrix with a new value
- Parameters
- value: int or float
- Returns
- a matrix-like object
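On a dense array, the replacement can be sketched with `numpy.where`. This illustrates the operation only; the variable names are chosen for the example and the mangoes implementation may differ.

```python
import numpy as np

matrix = np.asarray([[-1.0, 0.0, 2.0], [3.0, -4.0, 5.0]])
value = 0.5

# Keep strictly positive entries and substitute `value` everywhere else.
replaced = np.where(matrix > 0, matrix, value)
```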
-
abstract
nb_of_non_zeros_values_by_row
()¶ Return the number of non-zero values in each row of the matrix
- Returns
- np.array
np.array with the number of non-zero values for each row
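The per-row count can be sketched for both storage types with standard NumPy and SciPy calls. This is an illustration under the assumption that the CSR matrix stores no explicit zeros; it is not the mangoes implementation.

```python
import numpy as np
from scipy import sparse

dense = np.asarray([[0, 1, 2], [0, 0, 0], [3, 0, 4]])
dense_counts = np.count_nonzero(dense, axis=1)

# For CSR storage, getnnz(axis=1) counts stored entries per row; call
# eliminate_zeros() first if explicit zeros may be stored.
csr = sparse.csr_matrix(dense)
sparse_counts = csr.getnnz(axis=1)
```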
-
abstract
all_positive
()¶ Check if all values of the matrix are positive or zero.
- Returns
- boolean
-
abstract
as_dense
()¶ Return a dense version of the matrix
- Returns
- array
-
abstract
save
(path, name='matrix')¶ Save the matrix in a file.
The format of the file depends on the type of the matrix. See subclasses for more information.
- Parameters
- path: str
path to the folder where the file will be saved
- name: str
name of the file (without extension)
- Returns
- str
Complete path to the created file
-
classmethod
load
(path, name)¶ Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
- Parameters
- path: str
path to the folder where the archive is stored
- name: str
name of the file (without extension)
- Returns
- a matrix-like object
-
abstract static
format_vector
(vector, sep)¶ Format a line of a csrSparseMatrix as a string
- Parameters
- vector:
line of a csrSparseMatrix
- sep: str
separator to use
- Returns
- str
string representation of the vector
-
static
factory
(matrix)¶ Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix
If matrix is already a Matrix, returns it
- Parameters
- matrix:
Matrix, numpy.ndarray or scipy.sparse.csr_matrix
- Returns
- a matrix-like object
-
class
mangoes.utils.arrays.
csrSparseMatrix
(*args, **kwargs)¶ Bases:
scipy.sparse.csr.csr_matrix
,mangoes.utils.arrays.Matrix
Class used to store generated vectors in a matrix with a scipy.sparse.csr_matrix
- Attributes
- dtype
has_canonical_format
Determine whether the matrix has sorted indices and no duplicates
has_sorted_indices
Determine whether the matrix has sorted indices
nnz
Number of stored values, including explicit zeros.
shape
Get shape of a matrix.
Methods
all_positive
()Check if all values of the matrix are positive or zero.
allclose
(first, second)Returns True if two arrays are element-wise equal within a tolerance.
arcsin
()Element-wise arcsin.
arcsinh
()Element-wise arcsinh.
arctan
()Element-wise arctan.
arctanh
()Element-wise arctanh.
argmax
([axis, out])Return indices of maximum elements along an axis.
argmin
([axis, out])Return indices of minimum elements along an axis.
as_dense
()Return a dense version of the matrix
asformat
(format[, copy])Return this matrix in the passed format.
asfptype
()Upcast matrix to a floating point format (if necessary)
astype
(dtype[, casting, copy])Cast the matrix elements to a specified type.
ceil
()Element-wise ceil.
center
([axis, inplace])Center the matrix to row average, column average, or total average.
check_format
([full_check])check whether the matrix format is valid
combine
(other, new_shape[, row_indices_map, …])Merge another matrix with this one, creating a new matrix
conj
([copy])Element-wise complex conjugation.
conjugate
([copy])Element-wise complex conjugation.
copy
()Returns a copy of this matrix.
count_nonzero
()Number of non-zero entries, equivalent to
deg2rad
()Element-wise deg2rad.
diagonal
([k])Returns the kth diagonal of the matrix.
dot
(other)Ordinary dot product
eliminate_zeros
()Remove zero entries from the matrix
expm1
()Element-wise expm1.
factory
(matrix)Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix
floor
()Element-wise floor.
format_vector
(vector, sep)Format a line of a csrSparseMatrix as a string
getH
()Return the Hermitian transpose of this matrix.
get_shape
()Get shape of a matrix.
getcol
(i)Returns a copy of column i of the matrix, as a (m x 1) CSR matrix (column vector).
getformat
()Format of a matrix representation as a string.
getmaxprint
()Maximum number of elements to display when printed.
getnnz
([axis])Number of stored values, including explicit zeros.
getrow
(i)Returns a copy of row i of the matrix, as a (1 x n) CSR matrix (row vector).
load
(path, name)Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
load_from_text_file
(file_object, nb_columns, …)Load a matrix and a list of words from a text file
log
()Natural logarithm, element-wise.
log1p
()Element-wise log1p.
max
([axis, out])Return the maximum of the matrix or maximum along an axis.
maximum
(other)Element-wise maximum between this and another matrix.
mean
([axis, dtype, out])Compute the arithmetic mean along the specified axis.
min
([axis, out])Return the minimum of the matrix or minimum along an axis.
minimum
(other)Element-wise minimum between this and another matrix.
multiply
(other)Point-wise multiplication by another matrix, vector, or scalar.
multiply_rowwise
(array)Multiply the values of the matrix by the value of the array whose index corresponds to the row index
nb_of_non_zeros_values_by_row
()Return the number of non-zero values in each row of the matrix
nonzero
()nonzero indices
normalize
([norm, axis, inplace])Normalize the matrix.
pairwise_distances
([Y, metric])Compute a distance matrix from the rows of the matrix.
power
(n[, dtype])This function performs element-wise power.
prune
()Remove empty space after all non-zero elements.
rad2deg
()Element-wise rad2deg.
replace_negative_or_zeros_by
(value)Replace the negative or zero values of the matrix with a new value
reshape
(self, shape[, order, copy])Gives a new shape to a sparse matrix without changing its data.
resize
(*shape)Resize the matrix in-place to dimensions given by shape
rint
()Element-wise rint.
save
(path[, name])Save the matrix as a ‘.npz’ archive of ‘.npy’ files.
set_shape
(shape)See reshape.
setdiag
(values[, k])Set diagonal or off-diagonal elements of the array.
sign
()Element-wise sign.
sin
()Element-wise sin.
sinh
()Element-wise sinh.
sort_indices
()Sort the indices of this matrix in place
sorted_indices
()Return a copy of this matrix with sorted indices
sqrt
([inplace])Element-wise sqrt.
sum
([axis, dtype, out])Sum the matrix elements over a given axis.
sum_duplicates
()Eliminate duplicate matrix entries by adding them together
tan
()Element-wise tan.
tanh
()Element-wise tanh.
toarray
([order, out])Return a dense ndarray representation of this matrix.
tobsr
([blocksize, copy])Convert this matrix to Block Sparse Row format.
tocoo
([copy])Convert this matrix to COOrdinate format.
tocsc
([copy])Convert this matrix to Compressed Sparse Column format.
tocsr
([copy])Convert this matrix to Compressed Sparse Row format.
todense
([order, out])Return a dense matrix representation of this matrix.
todia
([copy])Convert this matrix to sparse DIAgonal format.
todok
([copy])Convert this matrix to Dictionary Of Keys format.
tolil
([copy])Convert this matrix to List of Lists format.
transpose
([axes, copy])Reverses the dimensions of the sparse matrix.
trunc
()Element-wise trunc.
chebyshev
-
EXT
= '.npz'¶
-
multiply_rowwise
(array)¶ Multiply the values of the matrix by the value of the array whose index corresponds to the row index
- Parameters
- array: array
a 1d array with length = nb of matrix rows
- Returns
- matrix
new matrix with same type as self
Examples
>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3],
             [8, 8, 8]])
-
normalize
(norm='l2', axis=1, inplace=False)¶ Normalize the matrix.
- Parameters
- norm: {‘l2’ (default), ‘l1’ or ‘max’}
- axis: {1 (default) or 0}
specify along which axis to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.
- inplace: boolean
whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the normalization is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
-
center
(axis=1, inplace=False)¶ Center the matrix to row average, column average, or total average.
- Parameters
- axis: {1 (default), None, 0}
the axis along which to compute the average vector (e.g. if axis=0, the columns are centered; if axis=1, the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.
- inplace: boolean
whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the centering is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.
-
sqrt
(inplace=False)¶ Element-wise sqrt.
See numpy.sqrt for more information.
-
log
()¶ Natural logarithm, element-wise.
Adaptation of numpy.log to csr_matrix
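One plausible way to adapt numpy.log to CSR storage is to transform only the stored values. The sketch below assumes exactly that; the source does not confirm it is what mangoes does.

```python
import numpy as np
from scipy import sparse

m = sparse.csr_matrix(np.asarray([[1.0, 0.0], [0.0, np.e]]))

# Assumption: only the explicitly stored values are transformed, so implicit
# zeros stay zero instead of becoming -inf.
logged = m.copy()
logged.data = np.log(logged.data)
```

Working on a copy leaves the original matrix `m` untouched.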
-
replace_negative_or_zeros_by
(value)¶ Replace the negative or zero values of the matrix with a new value
- Parameters
- value: int or float
- Returns
- a matrix-like object
-
nb_of_non_zeros_values_by_row
()¶ Return the number of non-zero values in each row of the matrix
- Returns
- np.array
np.array with the number of non-zero values for each row
-
all_positive
()¶ Check if all values of the matrix are positive or zero.
- Returns
- boolean
-
combine
(other, new_shape, row_indices_map=None, col_indices_map=None)¶ Merge another matrix with this one, creating a new matrix
The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.
- Parameters
- other: matrix-like object
another matrix to merge with this one
- new_shape: tuple
the shape of the resulting matrix
- row_indices_map: dict
a mapping between the indices of the rows of ‘other’ and the merged matrix
- col_indices_map: dict
a mapping between the indices of the columns of ‘other’ and the merged matrix
- Returns
- merged: matrix-like object
Examples
>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0., 1., 2., 1.],
             [ 3., 6., 5., 3.]])
- indices_map = {0:1, # column 0 of the second matrix goes to column 1 of the merged matrix
1:3} # column 1 of the second matrix goes to column 3 of the merged matrix
-
as_dense
()¶ Return a dense version of the matrix
- Returns
- array
-
save
(path, name='matrix')¶ Save the matrix as a ‘.npz’ archive of ‘.npy’ files.
- Parameters
- path: str
path to the folder where the archive will be written
- name: str
name of the file (without extension)
- Returns
- str
Complete path to the created file
-
classmethod
load
(path, name)¶ Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
- Parameters
- path: str
path to the folder where the archive is stored
- name: str
name of the file (without extension)
- Returns
- csrSparseMatrix
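To show what a ‘.npz’ archive of ‘.npy’ files looks like for CSR data, the sketch below uses SciPy's own `save_npz` and `load_npz`. It is an analogy only: the array names mangoes stores in its archive are not documented here and may differ, and the file name `matrix.npz` is chosen for the example.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

original = sparse.csr_matrix(np.asarray([[0.0, 1.5], [2.0, 0.0]]))

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "matrix.npz")  # folder + name + '.npz' extension
    sparse.save_npz(path, original)
    restored = sparse.load_npz(path)
    # The archive is a zip of .npy members, one per stored array.
    with np.load(path) as archive:
        archive_members = sorted(archive.files)
```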
-
static
load_from_text_file
(file_object, nb_columns, data_type, sep)¶ Load a matrix and a list of words from a text file
- Parameters
- file_object: file-like object
- nb_columns: int
number of columns in the matrix
- data_type
- sep: str
token used as column separator in the text file
- Returns
- csrSparseMatrix
-
static
format_vector
(vector, sep)¶ Format a line of a csrSparseMatrix as a string
- Parameters
- vector:
line of a csrSparseMatrix
- sep: str
separator to use
- Returns
- str
string representation of the vector
-
static
allclose
(first, second)¶ Returns True if two arrays are element-wise equal within a tolerance.
- Parameters
- first: matrix
- second: matrix
- Returns
- bool
-
pairwise_distances
(Y=None, metric='cosine', **kwargs)¶ Compute a distance matrix from the rows of the matrix.
This method returns the matrix of the distances between the rows of the matrix. Or, if Y is given (default is None), then the returned matrix is the pairwise distance between the rows of the matrix and the ones from Y.
This function relies on sklearn.metrics.pairwise_distances, so you can use any distance available in it.
Valid values for metric are:
- From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.
- From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]
See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.
- Parameters
- Y: matrix, optional
An optional second matrix.
- metric: string or callable
The metric to use when calculating distance between vectors.
- **kwargs: optional keyword parameters
Any further parameters are passed directly to the distance function.
- Returns
- array [len(self), len(self)] or [len(self), len(Y)]
A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.
-
static
chebyshev
(X, Y)¶
-
class
mangoes.utils.arrays.
NumpyMatrix
(input_array, info=None)¶ Bases:
numpy.ndarray
,mangoes.utils.arrays.Matrix
Class used to store generated vectors in a matrix with a numpy.ndarray
- Attributes
T
The transposed array.
base
Base object if memory is from some other object.
ctypes
An object to simplify the interaction of the array with the ctypes module.
data
Python buffer object pointing to the start of the array’s data.
dtype
Data-type of the array’s elements.
flags
Information about the memory layout of the array.
flat
A 1-D iterator over the array.
imag
The imaginary part of the array.
itemsize
Length of one array element in bytes.
nbytes
Total bytes consumed by the elements of the array.
ndim
Number of array dimensions.
real
The real part of the array.
shape
Tuple of array dimensions.
size
Number of elements in the array.
strides
Tuple of bytes to step in each dimension when traversing an array.
Methods
all
([axis, out, keepdims])Returns True if all elements evaluate to True.
all_positive
()Check if all values of the matrix are positive or zero.
any
([axis, out, keepdims])Returns True if any of the elements of a evaluate to True.
argmax
([axis, out])Return indices of the maximum values along the given axis.
argmin
([axis, out])Return indices of the minimum values along the given axis of a.
argpartition
(kth[, axis, kind, order])Returns the indices that would partition this array.
argsort
([axis, kind, order])Returns the indices that would sort this array.
as_dense
()Return a dense version of the matrix
astype
(dtype[, order, casting, subok, copy])Copy of the array, cast to a specified type.
byteswap
([inplace])Swap the bytes of the array elements
center
([axis, inplace])Center the matrix to row average, column average, or total average.
choose
(choices[, out, mode])Use an index array to construct a new array from a set of choices.
clip
([min, max, out])Return an array whose values are limited to [min, max].
combine
(other, new_shape[, row_indices_map, …])Merge another matrix with this one, creating a new matrix
compress
(condition[, axis, out])Return selected slices of this array along given axis.
conj
()Complex-conjugate all elements.
conjugate
()Return the complex conjugate, element-wise.
copy
([order])Return a copy of the array.
cumprod
([axis, dtype, out])Return the cumulative product of the elements along the given axis.
cumsum
([axis, dtype, out])Return the cumulative sum of the elements along the given axis.
diagonal
([offset, axis1, axis2])Return specified diagonals.
dot
(b[, out])Dot product of two arrays.
dump
(file)Dump a pickle of the array to the specified file.
dumps
()Returns the pickle of the array as a string.
factory
(matrix)Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix
fill
(value)Fill the array with a scalar value.
flatten
([order])Return a copy of the array collapsed into one dimension.
format_vector
(vector, sep)Format a line of a csrSparseMatrix as a string
getfield
(dtype[, offset])Returns a field of the given array as a certain type.
item
(*args)Copy an element of an array to a standard Python scalar and return it.
itemset
(*args)Insert scalar into an array (scalar is cast to array’s dtype, if possible)
load
(path, name)Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
load_from_text_file
(file_object, sep)Load a matrix and a list of words from a text file
max
([axis, out, keepdims, initial, where])Return the maximum along a given axis.
mean
([axis, dtype, out, keepdims])Returns the average of the array elements along given axis.
min
([axis, out, keepdims, initial, where])Return the minimum along a given axis.
multiply_rowwise
(array)Multiply the values of the matrix by the value of the array whose index corresponds to the row index
nb_of_non_zeros_values_by_row
()Return the number of non-zero values in each row of the matrix
newbyteorder
([new_order])Return the array with the same data viewed with a different byte order.
nonzero
()Return the indices of the elements that are non-zero.
normalize
([norm, axis, inplace])Normalize the matrix.
pairwise_distances
([Y, metric])Compute a distance matrix from the rows of the matrix.
partition
(kth[, axis, kind, order])Rearranges the elements in the array in such a way that the value of the element in kth position is in the position it would be in a sorted array.
prod
([axis, dtype, out, keepdims, initial, …])Return the product of the array elements over the given axis
ptp
([axis, out, keepdims])Peak to peak (maximum - minimum) value along a given axis.
put
(indices, values[, mode])Set a.flat[n] = values[n] for all n in indices.
ravel
([order])Return a flattened array.
repeat
(repeats[, axis])Repeat elements of an array.
replace_negative_or_zeros_by
(value)Replace the negative or zero values of the matrix with a new value
reshape
(shape[, order])Returns an array containing the same data with a new shape.
resize
(new_shape[, refcheck])Change shape and size of array in-place.
round
([decimals, out])Return a with each element rounded to the given number of decimals.
save
(path[, name])Save the matrix in a file.
searchsorted
(v[, side, sorter])Find indices where elements of v should be inserted in a to maintain order.
setfield
(val, dtype[, offset])Put a value into a specified place in a field defined by a data-type.
setflags
([write, align, uic])Set array flags WRITEABLE, ALIGNED, (WRITEBACKIFCOPY and UPDATEIFCOPY), respectively.
sort
([axis, kind, order])Sort an array in-place.
sqrt
([inplace])Performs element-wise sqrt.
squeeze
([axis])Remove single-dimensional entries from the shape of a.
std
([axis, dtype, out, ddof, keepdims])Returns the standard deviation of the array elements along given axis.
sum
([axis, dtype, out, keepdims, initial, where])Return the sum of the array elements over the given axis.
swapaxes
(axis1, axis2)Return a view of the array with axis1 and axis2 interchanged.
take
(indices[, axis, out, mode])Return an array formed from the elements of a at the given indices.
tobytes
([order])Construct Python bytes containing the raw data bytes in the array.
tofile
(fid[, sep, format])Write array to a file as text or binary (default).
tolist
()Return the array as an a.ndim-levels deep nested list of Python scalars.
tostring
([order])A compatibility alias for tobytes, with exactly the same behavior.
trace
([offset, axis1, axis2, dtype, out])Return the sum along diagonals of the array.
transpose
(*axes)Returns a view of the array with axes transposed.
var
([axis, dtype, out, ddof, keepdims])Returns the variance of the array elements, along given axis.
view
([dtype][, type])New view of array with the same data.
-
EXT
= '.npy'¶
-
multiply_rowwise
(array)¶ Multiply the values of the matrix by the value of the array whose index corresponds to the row index
- Parameters
- array: array
a 1d array with length = nb of matrix rows
- Returns
- matrix
new matrix with same type as self
Examples
>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3],
             [8, 8, 8]])
-
normalize
(norm='l2', axis=1, inplace=False)¶ Normalize the matrix.
- Parameters
- norm: {‘l2’ (default), ‘l1’ or ‘max’}
- axis: {1 (default) or 0}
specify along which axis to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.
- inplace: boolean
whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the normalization is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
-
center
(axis=1, inplace=False)¶ Center the matrix to row average, column average, or total average.
- Parameters
- axis: {1 (default), None, 0}
the axis along which to compute the average vector (e.g. if axis=0, the columns are centered; if axis=1, the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.
- inplace: boolean
whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the axis along which the centering is requested (csr for axis=1, csc for axis=0), another one will be created, and ‘inplace’ will become moot. Also, it will not apply if the input’s dtype is ‘int’ instead of a type compatible with division, such as ‘float’.
- Returns
- a matrix-like object
Warning
the inplace=True parameter isn’t implemented yet
If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.
-
sqrt
(inplace=False)¶ Performs element-wise sqrt.
- Parameters
- inplace: boolean
whether or not to perform the operation inplace (default=False)
- Returns
- a matrix-like object
-
replace_negative_or_zeros_by
(value)¶ Replace the negative or zero values of the matrix with a new value
- Parameters
- value: int or float
- Returns
- a matrix-like object
-
nb_of_non_zeros_values_by_row
()¶ Return the number of non-zero values in each row of the matrix
- Returns
- np.array
np.array with the number of non-zero values for each row
-
all_positive
()¶ Check if all values of the matrix are positive or zero.
- Returns
- boolean
-
combine
(other, new_shape, row_indices_map=None, col_indices_map=None)¶ Merge another matrix with this one, creating a new matrix
The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.
- Parameters
- other: matrix-like object
another matrix to merge with this one
- new_shape: tuple
the shape of the resulting matrix
- row_indices_map: dict
a mapping between the indices of the rows of ‘other’ and the merged matrix
- col_indices_map: dict
a mapping between the indices of the columns of ‘other’ and the merged matrix
- Returns
- merged: matrix-like object
Examples
>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0., 1., 2., 1.],
             [ 3., 6., 5., 3.]])
- indices_map = {0:1, # column 0 of the second matrix goes to column 1 of the merged matrix
1:3} # column 1 of the second matrix goes to column 3 of the merged matrix
-
as_dense
()¶ Return a dense version of the matrix
- Returns
- array
-
save
(path, name='matrix')¶ Save the matrix in a file.
The format of the file depends on the type of the matrix. See subclasses for more information.
- Parameters
- path: str
path to the folder where the file will be saved
- name: str
name of the file (without extension)
- Returns
- str
Complete path to the created file
-
static
format_vector
(vector, sep)¶ Format a line of a csrSparseMatrix as a string
- Parameters
- vector:
line of a csrSparseMatrix
- sep: str
separator to use
- Returns
- str
string representation of the vector
-
static
load_from_text_file
(file_object, sep)¶ Load a matrix and a list of words from a text file
- Parameters
- file_object: file-like object
- sep: str
token used as a separator in the text file
- Returns
- tuple
(matrix, list of words)
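The text layout expected by this method is not described here. The sketch below assumes one line per word, with the word first and its vector components after it, all separated by sep; both that layout and the helper name `load_words_and_vectors` are hypothetical.

```python
import io

import numpy as np

def load_words_and_vectors(file_object, sep):
    """Hypothetical parser: first field is the word, remaining fields are floats."""
    words, rows = [], []
    for line in file_object:
        fields = line.rstrip("\n").split(sep)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        words.append(fields[0])
        rows.append([float(x) for x in fields[1:]])
    return np.asarray(rows), words

text = io.StringIO("cat\t0.1\t0.2\ndog\t0.3\t0.4\n")
matrix, words = load_words_and_vectors(text, sep="\t")
```

The result is a (matrix, list of words) pair, matching the documented return type.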
-
classmethod
load
(path, name)¶ Load a matrix from a ‘.npz’ archive of ‘.npy’ files.
- Parameters
- path: str
path to the folder where the archive is stored
- name: str
name of the file (without extension)
- Returns
- a matrix-like object
-
pairwise_distances
(Y=None, metric='cosine', **kwargs)¶ Compute a distance matrix from the rows of the matrix.
This method returns the matrix of the distances between the rows of the matrix. Or, if Y is given (default is None), then the returned matrix is the pairwise distance between the rows of the matrix and the ones from Y.
This function relies on sklearn.metrics.pairwise_distances, so you can use any distance available in it.
Valid values for metric are:
- From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.
- From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]
See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.
- Parameters
- Y: matrix, optional
An optional second matrix.
- metric: string or callable
The metric to use when calculating distance between vectors.
- **kwargs: optional keyword parameters
Any further parameters are passed directly to the distance function.
- Returns
- array [len(self), len(self)] or [len(self), len(Y)]
A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.