mangoes.utils.arrays module

Utility classes and functions to handle matrices

This module provides the Matrix class that encapsulates all needed methods for both sparse and dense matrices.

mangoes.utils.arrays.sqrt(matrix, inplace=False)

Perform element-wise sqrt.

Parameters
matrix: a matrix-like object
inplace: boolean, optional

whether or not to perform the operation inplace (default=False)

Returns
a matrix-like object

has the same type as the input
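The inplace contract described above can be illustrated with plain NumPy. This is only a sketch of the documented behaviour, not the mangoes implementation; the helper name elementwise_sqrt is hypothetical.

```python
import numpy as np


def elementwise_sqrt(matrix, inplace=False):
    """Hypothetical helper mirroring the documented sqrt(matrix, inplace) contract."""
    if inplace:
        # Write the result back into the input buffer (needs a float dtype).
        np.sqrt(matrix, out=matrix)
        return matrix
    # Default: leave the input untouched and return a new array.
    return np.sqrt(matrix)


m = np.array([[1.0, 4.0], [9.0, 16.0]])
result = elementwise_sqrt(m)
```

With inplace=False the input m keeps its original values and result holds the square roots.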

mangoes.utils.arrays.normalize(matrix, norm='l2', axis=1, inplace=False)

Normalize the matrix-like input.

Parameters
matrix: a matrix-like object
norm: {‘l2’ (default), ‘l1’ or ‘max’}, optional
axis: {1 (default) or 0}

specify the axis along which to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.

inplace: boolean

whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place normalization also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

the inplace=True parameter isn’t implemented yet
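To make the norm and axis parameters concrete, here is a minimal NumPy sketch of what row-wise or column-wise normalization computes. It is an illustration under assumptions, not the library's code: normalize_sketch is a hypothetical name, the ‘max’ norm is assumed to use absolute values, and all-zero rows or columns are assumed to be left unchanged.

```python
import numpy as np


def normalize_sketch(matrix, norm="l2", axis=1):
    """Hypothetical helper: divide each row (axis=1) or column (axis=0) by its norm."""
    matrix = np.asarray(matrix, dtype=float)  # integer input is cast so division works
    if norm == "l2":
        norms = np.sqrt((matrix ** 2).sum(axis=axis, keepdims=True))
    elif norm == "l1":
        norms = np.abs(matrix).sum(axis=axis, keepdims=True)
    elif norm == "max":
        norms = np.abs(matrix).max(axis=axis, keepdims=True)
    else:
        raise ValueError("norm must be 'l2', 'l1' or 'max'")
    norms[norms == 0] = 1.0  # assumption: all-zero rows/columns stay unchanged
    return matrix / norms


m = np.array([[3, 4], [0, 5]])
rows_l2 = normalize_sketch(m, norm="l2", axis=1)  # each row has unit l2 norm
cols_l1 = normalize_sketch(m, norm="l1", axis=0)  # each column sums to 1 in absolute value
```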

mangoes.utils.arrays.center(matrix, axis=1, inplace=False)

Center the matrix-like input with respect to the row average, column average, or total average.

Parameters
matrix: a matrix-like object
axis: {1 (default), 0, None}

the axis along which to compute the average vector (ex: if axis=0, the columns are centered, if axis=1, then the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.

inplace: boolean

whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place centering also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

  • the inplace=True parameter isn’t implemented yet

  • If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.
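The three values of axis can be compared with a small NumPy sketch. It only illustrates the arithmetic described above on a dense array; center_sketch is a hypothetical name and this is not the mangoes implementation.

```python
import numpy as np


def center_sketch(matrix, axis=1):
    """Hypothetical helper: subtract the row, column, or global mean."""
    matrix = np.asarray(matrix, dtype=float)
    if axis is None:
        # One scalar mean over all values, subtracted everywhere.
        return matrix - matrix.mean()
    # keepdims=True keeps the mean broadcastable against the matrix.
    return matrix - matrix.mean(axis=axis, keepdims=True)


m = np.array([[1, 2, 3], [4, 6, 8]])
rows_centered = center_sketch(m, axis=1)
cols_centered = center_sketch(m, axis=0)
all_centered = center_sketch(m, axis=None)
```

After centering with axis=1 every row averages to zero; with axis=0 every column does; with axis=None only the overall average is zero.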

class mangoes.utils.arrays.Matrix

Bases: object

Abstract class used to store generated vectors in a matrix

Methods

all_positive()

Check if all values of the matrix are positive or zero.

as_dense()

Return a dense version of the matrix

center([axis, inplace])

Center the matrix to row average, column average, or total average.

combine(other, new_shape[, row_indices_map, …])

Merge another matrix with this one, creating a new matrix

factory(matrix)

Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix

format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

nb_of_non_zeros_values_by_row()

Return the number of non-zero values in each row of the matrix

normalize([norm, axis, inplace])

Normalize the matrix.

pairwise_distances([Y, metric])

Compute a distance matrix from the rows of the matrix.

replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value

save(path[, name])

Save the matrix in a file.

sqrt([inplace])

Performs element-wise sqrt.

abstract multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

Parameters
array: array

a 1d array with length = nb of matrix rows

Returns
matrix

new matrix with same type as self

Examples

>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3], [8, 8, 8]])
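
For a dense matrix, the same row-wise scaling can be sketched with NumPy broadcasting. This is an illustration of the arithmetic only, not the library's code.

```python
import numpy as np

m = np.array([[1, 1, 1], [2, 2, 2]])
v = np.array([3, 4])  # one factor per row

# Reshape the 1-d factors into a column so they broadcast across each row.
scaled = m * v.reshape(-1, 1)
```

Row 0 is multiplied by 3 and row 1 by 4, which matches the values in the example above.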
abstract normalize(norm='l2', axis=1, inplace=False)

Normalize the matrix.

Parameters
norm: {‘l2’ (default), ‘l1’ or ‘max’}
axis: {1 (default) or 0}

specify the axis along which to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.

inplace: boolean

whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place normalization also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

the inplace=True parameter isn’t implemented yet

abstract center(axis=1, inplace=False)

Center the matrix to row average, column average, or total average.

Parameters
axis: {1 (default), None, 0}

the axis along which to compute the average vector (ex: if axis=0, the columns are centered, if axis=1, then the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.

inplace: boolean

whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place centering also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

  • the inplace=True parameter isn’t implemented yet

  • If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.

abstract sqrt(inplace=False)

Performs element-wise sqrt.

Parameters
inplace: boolean

whether or not to perform the operation inplace (default=False)

Returns
a matrix-like object
abstract combine(other, new_shape, row_indices_map=None, col_indices_map=None)

Merge another matrix with this one, creating a new matrix

The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.

Parameters
other: matrix-like object

another matrix to merge with this one

new_shape: tuple

the shape of the resulting matrix

row_indices_map: dict

a mapping between the indices of the rows of ‘other’ and the merged matrix

col_indices_map: dict

a mapping between the indices of the columns of ‘other’ and the merged matrix

Returns
merged: matrix-like object

Examples

>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0.,  1.,  2.,  1.],
             [ 3.,  6.,  5.,  3.]])
indices_map = {0: 1,  # column 0 of the second matrix goes to column 1 of the merged matrix
               1: 3}  # column 1 of the second matrix goes to column 3 of the merged matrix
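
The merge can be reproduced step by step with a small NumPy sketch. This is an assumption-laden illustration rather than the actual implementation: combine_sketch is a hypothetical name, and indices missing from a map are assumed to keep their position.

```python
import numpy as np


def combine_sketch(a, b, new_shape, row_indices_map=None, col_indices_map=None):
    """Hypothetical helper: add `b`, remapped, to `a` extended to `new_shape`."""
    merged = np.zeros(new_shape, dtype=float)
    merged[: a.shape[0], : a.shape[1]] = a  # extend `a` with zeros
    for i in range(b.shape[0]):
        # Assumption: an index missing from a map keeps its position.
        new_i = i if row_indices_map is None else row_indices_map.get(i, i)
        for j in range(b.shape[1]):
            new_j = j if col_indices_map is None else col_indices_map.get(j, j)
            merged[new_i, new_j] += b[i, j]
    return merged


a = np.arange(6).reshape((2, 3))
b = np.arange(4).reshape((2, 2))
merged = combine_sketch(a, b, (2, 4), col_indices_map={0: 1, 1: 3})
```

Column 0 of b is added to column 1 of the result and column 1 of b fills the new column 3, giving the same values as the example above.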

abstract pairwise_distances(Y=None, metric='cosine', **kwargs)

Compute a distance matrix from the rows of the matrix.

This method returns the matrix of distances between the rows of the matrix. If Y is given (default is None), the returned matrix instead contains the pairwise distances between the rows of this matrix and the rows of Y.

This method relies on the sklearn.metrics.pairwise_distances function, so any distance it supports can be used.

Valid values for metric are:

  • From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.

  • From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]

See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.

Parameters
Y: matrix, optional

An optional second matrix.

metric: string, or callable

The metric to use when calculating distance between vectors.

**kwargs: optional keyword parameters

Any further parameters are passed directly to the distance function.

Returns
array [len(self), len(self)] or [len(self), len(Y)]

A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.
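
The shape and meaning of the returned matrix can be illustrated for the default ‘cosine’ metric with plain NumPy. This sketch is not how the method is implemented (the method delegates to scikit-learn); cosine_distances_sketch is a hypothetical name and the sketch assumes no row is entirely zero.

```python
import numpy as np


def cosine_distances_sketch(x, y=None):
    """Hypothetical helper: cosine distance between the rows of x and the rows of y."""
    x = np.asarray(x, dtype=float)
    y = x if y is None else np.asarray(y, dtype=float)
    # Assumption: no all-zero rows, so the norms are never zero.
    x_unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    y_unit = y / np.linalg.norm(y, axis=1, keepdims=True)
    # Cosine distance is one minus the cosine similarity.
    return 1.0 - x_unit @ y_unit.T


rows = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
d = cosine_distances_sketch(rows)                    # shape (3, 3)
d_xy = cosine_distances_sketch(rows, rows[:2])       # shape (3, 2)
```

Here d[i, j] is the distance between rows i and j of the same matrix, while d_xy[i, j] compares row i of the first matrix with row j of the second.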

abstract replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value

Parameters
value: int or float
Returns
a matrix-like object
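
On a dense array, the replacement described above can be sketched with numpy.where. This only illustrates the documented effect and is not the library's code.

```python
import numpy as np

m = np.array([[-1.0, 0.0, 2.0], [3.0, -4.0, 0.0]])

# Keep strictly positive entries; replace everything else by the new value.
replaced = np.where(m > 0, m, 1.0)
```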
abstract nb_of_non_zeros_values_by_row()

Return the number of non-zero values in each row of the matrix

Returns
np.array

np.array with the number of non-zero values in each row
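
For a dense array, an equivalent count can be sketched with numpy.count_nonzero. This is an illustration of the returned values, not the library's implementation.

```python
import numpy as np

m = np.array([[0, 1, 2], [0, 0, 0], [5, 0, 0]])

# One count per row: axis=1 counts across the columns of each row.
counts = np.count_nonzero(m, axis=1)
```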

abstract all_positive()

Check if all values of the matrix are positive or zero.

Returns
boolean
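
Despite the name, zero passes the check. A minimal NumPy sketch of the documented condition (not the library's code) could look like this:

```python
import numpy as np

non_negative = np.array([[0.0, 1.5], [2.0, 0.0]])
with_negative = np.array([[0.0, -1.5], [2.0, 0.0]])

# Zero counts as acceptable: the check is "greater than or equal to zero".
ok = bool((non_negative >= 0).all())
not_ok = bool((with_negative >= 0).all())
```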
abstract as_dense()

Return a dense version of the matrix

Returns
array
abstract save(path, name='matrix')

Save the matrix in a file.

The format of the file depends on the type of the matrix. See subclasses for more information.

Parameters
path: str

path to the folder where the file will be saved

name: str

name of the file (without extension)

Returns
str

Complete path to the created file

classmethod load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

Parameters
path: str

path to the folder where the archive is stored

name: str

name of the file (without extension)

Returns
a matrix-like object
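
The path/name convention shared by save and load can be sketched with NumPy's own file functions. Everything here is an assumption for illustration: save_sketch and load_sketch are hypothetical helpers, and a single ‘.npy’ file is used, whereas the actual on-disk layout is defined by the subclasses.

```python
import os
import tempfile

import numpy as np


def save_sketch(matrix, path, name="matrix"):
    """Hypothetical helper: write `matrix` to <path>/<name>.npy and return the full path."""
    file_path = os.path.join(path, name + ".npy")
    np.save(file_path, matrix)
    return file_path


def load_sketch(path, name):
    """Hypothetical helper: read back the array stored at <path>/<name>.npy."""
    return np.load(os.path.join(path, name + ".npy"))


with tempfile.TemporaryDirectory() as folder:
    original = np.array([[1.0, 2.0], [3.0, 4.0]])
    created = save_sketch(original, folder, name="vectors")
    restored = load_sketch(folder, "vectors")
```

As in the documented methods, the folder and the file name (without extension) are passed separately, and saving returns the complete path of the created file.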
abstract static format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

Parameters
vector:

line of a csrSparseMatrix

sep: str

separator to use

Returns
str

string representation of the vector

static factory(matrix)

Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix

If matrix is already a Matrix, returns it

Parameters
matrix:

Matrix, numpy.ndarray or scipy.sparse.csr_matrix

Returns
a matrix-like object
class mangoes.utils.arrays.csrSparseMatrix(*args, **kwargs)

Bases: scipy.sparse.csr.csr_matrix, mangoes.utils.arrays.Matrix

Class used to store generated vectors in a matrix with a scipy.sparse.csr_matrix

Attributes
dtype
has_canonical_format

Determine whether the matrix has sorted indices and no duplicates

has_sorted_indices

Determine whether the matrix has sorted indices

nnz

Number of stored values, including explicit zeros.

shape

Get shape of a matrix.

Methods

all_positive()

Check if all values of the matrix are positive or zero.

allclose(first, second)

Returns True if two arrays are element-wise equal within a tolerance.

arcsin()

Element-wise arcsin.

arcsinh()

Element-wise arcsinh.

arctan()

Element-wise arctan.

arctanh()

Element-wise arctanh.

argmax([axis, out])

Return indices of maximum elements along an axis.

argmin([axis, out])

Return indices of minimum elements along an axis.

as_dense()

Return a dense version of the matrix

asformat(format[, copy])

Return this matrix in the passed format.

asfptype()

Upcast matrix to a floating point format (if necessary)

astype(dtype[, casting, copy])

Cast the matrix elements to a specified type.

ceil()

Element-wise ceil.

center([axis, inplace])

Center the matrix to row average, column average, or total average.

check_format([full_check])

check whether the matrix format is valid

combine(other, new_shape[, row_indices_map, …])

Merge another matrix with this one, creating a new matrix

conj([copy])

Element-wise complex conjugation.

conjugate([copy])

Element-wise complex conjugation.

copy()

Returns a copy of this matrix.

count_nonzero()

Number of non-zero entries, equivalent to

deg2rad()

Element-wise deg2rad.

diagonal([k])

Returns the kth diagonal of the matrix.

dot(other)

Ordinary dot product

eliminate_zeros()

Remove zero entries from the matrix

expm1()

Element-wise expm1.

factory(matrix)

Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix

floor()

Element-wise floor.

format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

getH()

Return the Hermitian transpose of this matrix.

get_shape()

Get shape of a matrix.

getcol(i)

Returns a copy of column i of the matrix, as a (m x 1) CSR matrix (column vector).

getformat()

Format of a matrix representation as a string.

getmaxprint()

Maximum number of elements to display when printed.

getnnz([axis])

Number of stored values, including explicit zeros.

getrow(i)

Returns a copy of row i of the matrix, as a (1 x n) CSR matrix (row vector).

load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

load_from_text_file(file_object, nb_columns, …)

Load a matrix and a list of words from a text file

log()

Natural logarithm, element-wise.

log1p()

Element-wise log1p.

max([axis, out])

Return the maximum of the matrix or maximum along an axis.

maximum(other)

Element-wise maximum between this and another matrix.

mean([axis, dtype, out])

Compute the arithmetic mean along the specified axis.

min([axis, out])

Return the minimum of the matrix or minimum along an axis.

minimum(other)

Element-wise minimum between this and another matrix.

multiply(other)

Point-wise multiplication by another matrix, vector, or scalar.

multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

nb_of_non_zeros_values_by_row()

Return the number of non-zero values in each row of the matrix

nonzero()

nonzero indices

normalize([norm, axis, inplace])

Normalize the matrix.

pairwise_distances([Y, metric])

Compute a distance matrix from the rows of the matrix.

power(n[, dtype])

This function performs element-wise power.

prune()

Remove empty space after all non-zero elements.

rad2deg()

Element-wise rad2deg.

replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value

reshape(self, shape[, order, copy])

Gives a new shape to a sparse matrix without changing its data.

resize(*shape)

Resize the matrix in-place to dimensions given by shape

rint()

Element-wise rint.

save(path[, name])

Save the matrix as a ‘.npz’ archive of ‘.npy’ files.

set_shape(shape)

See reshape.

setdiag(values[, k])

Set diagonal or off-diagonal elements of the array.

sign()

Element-wise sign.

sin()

Element-wise sin.

sinh()

Element-wise sinh.

sort_indices()

Sort the indices of this matrix in place

sorted_indices()

Return a copy of this matrix with sorted indices

sqrt([inplace])

Element-wise sqrt.

sum([axis, dtype, out])

Sum the matrix elements over a given axis.

sum_duplicates()

Eliminate duplicate matrix entries by adding them together

tan()

Element-wise tan.

tanh()

Element-wise tanh.

toarray([order, out])

Return a dense ndarray representation of this matrix.

tobsr([blocksize, copy])

Convert this matrix to Block Sparse Row format.

tocoo([copy])

Convert this matrix to COOrdinate format.

tocsc([copy])

Convert this matrix to Compressed Sparse Column format.

tocsr([copy])

Convert this matrix to Compressed Sparse Row format.

todense([order, out])

Return a dense matrix representation of this matrix.

todia([copy])

Convert this matrix to sparse DIAgonal format.

todok([copy])

Convert this matrix to Dictionary Of Keys format.

tolil([copy])

Convert this matrix to List of Lists format.

transpose([axes, copy])

Reverses the dimensions of the sparse matrix.

trunc()

Element-wise trunc.

chebyshev

EXT = '.npz'
multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

Parameters
array: array

a 1d array with length = nb of matrix rows

Returns
matrix

new matrix with same type as self

Examples

>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3], [8, 8, 8]])
normalize(norm='l2', axis=1, inplace=False)

Normalize the matrix.

Parameters
norm: {‘l2’ (default), ‘l1’ or ‘max’}
axis: {1 (default) or 0}

specify the axis along which to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.

inplace: boolean

whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place normalization also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

the inplace=True parameter isn’t implemented yet

center(axis=1, inplace=False)

Center the matrix to row average, column average, or total average.

Parameters
axis: {1 (default), None, 0}

the axis along which to compute the average vector (ex: if axis=0, the columns are centered, if axis=1, then the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.

inplace: boolean

whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place centering also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

  • the inplace=True parameter isn’t implemented yet

  • If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.

sqrt(inplace=False)

Element-wise sqrt.

See numpy.sqrt for more information.

log()

Natural logarithm, element-wise.

Adaptation of numpy.log to csr_matrix

replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value

Parameters
value: int or float
Returns
a matrix-like object
nb_of_non_zeros_values_by_row()

Return the number of non-zero values in each row of the matrix

Returns
np.array

np.array with the number of non-zero values in each row

all_positive()

Check if all values of the matrix are positive or zero.

Returns
boolean
combine(other, new_shape, row_indices_map=None, col_indices_map=None)

Merge another matrix with this one, creating a new matrix

The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.

Parameters
other: matrix-like object

another matrix to merge with this one

new_shape: tuple

the shape of the resulting matrix

row_indices_map: dict

a mapping between the indices of the rows of ‘other’ and the merged matrix

col_indices_map: dict

a mapping between the indices of the columns of ‘other’ and the merged matrix

Returns
merged: matrix-like object

Examples

>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0.,  1.,  2.,  1.],
             [ 3.,  6.,  5.,  3.]])
indices_map = {0: 1,  # column 0 of the second matrix goes to column 1 of the merged matrix
               1: 3}  # column 1 of the second matrix goes to column 3 of the merged matrix

as_dense()

Return a dense version of the matrix

Returns
array
save(path, name='matrix')

Save the matrix as a ‘.npz’ archive of ‘.npy’ files.

Parameters
path: str

path to the folder where the archive will be written

name: str

name of the file (without extension)

Returns
str

Complete path to the created file

classmethod load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

Parameters
path: str

path to the folder where the archive is stored

name: str

name of the file (without extension)

Returns
csrSparseMatrix
static load_from_text_file(file_object, nb_columns, data_type, sep)

Load a matrix and a list of words from a text file

Parameters
file_object: file-like object
nb_columns: int

number of columns in the matrix

data_type
sep: str

token used as column separator in the text file

Returns
csrSparseMatrix
static format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

Parameters
vector:

line of a csrSparseMatrix

sep: str

separator to use

Returns
str

string representation of the vector

static allclose(first, second)

Returns True if two arrays are element-wise equal within a tolerance.

Parameters
first: matrix
second: matrix
Returns
bool
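
The notion of "equal within a tolerance" can be illustrated with numpy.allclose on dense arrays. Whether this static method uses the same default tolerances is an assumption; the sketch only shows the general idea.

```python
import numpy as np

first = np.array([[1.0, 2.0], [3.0, 4.0]])
second = first + 1e-9   # differs by far less than NumPy's default tolerance
third = first + 1e-2    # differs by more than NumPy's default tolerance

close = bool(np.allclose(first, second))
not_close = bool(np.allclose(first, third))
```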
pairwise_distances(Y=None, metric='cosine', **kwargs)

Compute a distance matrix from the rows of the matrix.

This method returns the matrix of distances between the rows of the matrix. If Y is given (default is None), the returned matrix instead contains the pairwise distances between the rows of this matrix and the rows of Y.

This method relies on the sklearn.metrics.pairwise_distances function, so any distance it supports can be used.

Valid values for metric are:

  • From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.

  • From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]

See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.

Parameters
Y: matrix, optional

An optional second matrix.

metric: string, or callable

The metric to use when calculating distance between vectors.

**kwargs: optional keyword parameters

Any further parameters are passed directly to the distance function.

Returns
array [len(self), len(self)] or [len(self), len(Y)]

A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.

static chebyshev(X, Y)
class mangoes.utils.arrays.NumpyMatrix(input_array, info=None)

Bases: numpy.ndarray, mangoes.utils.arrays.Matrix

Class used to store generated vectors in a matrix with a numpy.ndarray

Attributes
T

The transposed array.

base

Base object if memory is from some other object.

ctypes

An object to simplify the interaction of the array with the ctypes module.

data

Python buffer object pointing to the start of the array’s data.

dtype

Data-type of the array’s elements.

flags

Information about the memory layout of the array.

flat

A 1-D iterator over the array.

imag

The imaginary part of the array.

itemsize

Length of one array element in bytes.

nbytes

Total bytes consumed by the elements of the array.

ndim

Number of array dimensions.

real

The real part of the array.

shape

Tuple of array dimensions.

size

Number of elements in the array.

strides

Tuple of bytes to step in each dimension when traversing an array.

Methods

all([axis, out, keepdims])

Returns True if all elements evaluate to True.

all_positive()

Check if all values of the matrix are positive or zero.

any([axis, out, keepdims])

Returns True if any of the elements of a evaluate to True.

argmax([axis, out])

Return indices of the maximum values along the given axis.

argmin([axis, out])

Return indices of the minimum values along the given axis of a.

argpartition(kth[, axis, kind, order])

Returns the indices that would partition this array.

argsort([axis, kind, order])

Returns the indices that would sort this array.

as_dense()

Return a dense version of the matrix

astype(dtype[, order, casting, subok, copy])

Copy of the array, cast to a specified type.

byteswap([inplace])

Swap the bytes of the array elements

center([axis, inplace])

Center the matrix to row average, column average, or total average.

choose(choices[, out, mode])

Use an index array to construct a new array from a set of choices.

clip([min, max, out])

Return an array whose values are limited to [min, max].

combine(other, new_shape[, row_indices_map, …])

Merge another matrix with this one, creating a new matrix

compress(condition[, axis, out])

Return selected slices of this array along given axis.

conj()

Complex-conjugate all elements.

conjugate()

Return the complex conjugate, element-wise.

copy([order])

Return a copy of the array.

cumprod([axis, dtype, out])

Return the cumulative product of the elements along the given axis.

cumsum([axis, dtype, out])

Return the cumulative sum of the elements along the given axis.

diagonal([offset, axis1, axis2])

Return specified diagonals.

dot(b[, out])

Dot product of two arrays.

dump(file)

Dump a pickle of the array to the specified file.

dumps()

Returns the pickle of the array as a string.

factory(matrix)

Create a Matrix object from a numpy.ndarray or a scipy.sparse.csr_matrix

fill(value)

Fill the array with a scalar value.

flatten([order])

Return a copy of the array collapsed into one dimension.

format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

getfield(dtype[, offset])

Returns a field of the given array as a certain type.

item(*args)

Copy an element of an array to a standard Python scalar and return it.

itemset(*args)

Insert scalar into an array (scalar is cast to array’s dtype, if possible)

load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

load_from_text_file(file_object, sep)

Load a matrix and a list of words from a text file

max([axis, out, keepdims, initial, where])

Return the maximum along a given axis.

mean([axis, dtype, out, keepdims])

Returns the average of the array elements along given axis.

min([axis, out, keepdims, initial, where])

Return the minimum along a given axis.

multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

nb_of_non_zeros_values_by_row()

Return the number of non-zero values in each row of the matrix

newbyteorder([new_order])

Return the array with the same data viewed with a different byte order.

nonzero()

Return the indices of the elements that are non-zero.

normalize([norm, axis, inplace])

Normalize the matrix.

pairwise_distances([Y, metric])

Compute a distance matrix from the rows of the matrix.

partition(kth[, axis, kind, order])

Rearranges the elements in the array in such a way that the value of the element in kth position is in the position it would be in a sorted array.

prod([axis, dtype, out, keepdims, initial, …])

Return the product of the array elements over the given axis

ptp([axis, out, keepdims])

Peak to peak (maximum - minimum) value along a given axis.

put(indices, values[, mode])

Set a.flat[n] = values[n] for all n in indices.

ravel([order])

Return a flattened array.

repeat(repeats[, axis])

Repeat elements of an array.

replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value

reshape(shape[, order])

Returns an array containing the same data with a new shape.

resize(new_shape[, refcheck])

Change shape and size of array in-place.

round([decimals, out])

Return a with each element rounded to the given number of decimals.

save(path[, name])

Save the matrix in a file.

searchsorted(v[, side, sorter])

Find indices where elements of v should be inserted in a to maintain order.

setfield(val, dtype[, offset])

Put a value into a specified place in a field defined by a data-type.

setflags([write, align, uic])

Set array flags WRITEABLE, ALIGNED, (WRITEBACKIFCOPY and UPDATEIFCOPY), respectively.

sort([axis, kind, order])

Sort an array in-place.

sqrt([inplace])

Performs element-wise sqrt.

squeeze([axis])

Remove single-dimensional entries from the shape of a.

std([axis, dtype, out, ddof, keepdims])

Returns the standard deviation of the array elements along given axis.

sum([axis, dtype, out, keepdims, initial, where])

Return the sum of the array elements over the given axis.

swapaxes(axis1, axis2)

Return a view of the array with axis1 and axis2 interchanged.

take(indices[, axis, out, mode])

Return an array formed from the elements of a at the given indices.

tobytes([order])

Construct Python bytes containing the raw data bytes in the array.

tofile(fid[, sep, format])

Write array to a file as text or binary (default).

tolist()

Return the array as an a.ndim-levels deep nested list of Python scalars.

tostring([order])

A compatibility alias for tobytes, with exactly the same behavior.

trace([offset, axis1, axis2, dtype, out])

Return the sum along diagonals of the array.

transpose(*axes)

Returns a view of the array with axes transposed.

var([axis, dtype, out, ddof, keepdims])

Returns the variance of the array elements, along given axis.

view([dtype][, type])

New view of array with the same data.

EXT = '.npy'
multiply_rowwise(array)

Multiply the values of the matrix by the value of the array whose index corresponds to the row index

Parameters
array: array

a 1d array with length = nb of matrix rows

Returns
matrix

new matrix with same type as self

Examples

>>> M = mangoes.utils.arrays.Matrix.factory(np.asarray([[1, 1, 1], [2, 2, 2]]))
>>> v = [[3], [4]]
>>> M.multiply_rowwise(v)
NumpyMatrix([[3, 3, 3], [8, 8, 8]])
normalize(norm='l2', axis=1, inplace=False)

Normalize the matrix.

Parameters
norm: {‘l2’ (default), ‘l1’ or ‘max’}
axis: {1 (default) or 0}

specify the axis along which to compute the norms used to carry out the normalization. Use axis=1 to normalize the rows, and axis=0 to normalize the columns.

inplace: boolean

whether or not to perform the normalization in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place normalization also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

the inplace=True parameter isn’t implemented yet

center(axis=1, inplace=False)

Center the matrix to row average, column average, or total average.

Parameters
axis: {1 (default), None, 0}

the axis along which to compute the average vector (ex: if axis=0, the columns are centered, if axis=1, then the rows are). If None, the average is computed using all the values of the matrix, and is then subtracted from every value.

inplace: boolean

whether or not to perform the centering in place (default=False). Note that if the sparse matrix’s format is not compatible with the requested axis (csr for axis=1, csc for axis=0), a new matrix will be created and ‘inplace’ becomes moot. In-place centering also does not apply if the input’s dtype is ‘int’ rather than a type compatible with division, such as ‘float’.

Returns
a matrix-like object

Warning

  • the inplace=True parameter isn’t implemented yet

  • If the input is a sparse matrix, then a dense numpy array will be created (and the ‘inplace’ option becomes moot), possibly causing a memory exhaustion error.

sqrt(inplace=False)

Performs element-wise sqrt.

Parameters
inplace: boolean

whether or not to perform the operation inplace (default=False)

Returns
a matrix-like object

replace_negative_or_zeros_by(value)

Replace the negative or zero values of the matrix with a new value.

Parameters
value: int or float
Returns
a matrix-like object
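
In NumPy terms, the replacement described above amounts to a masked substitution. The snippet is a sketch on a dense array, not the library's code, and the replacement value 1.0 is an arbitrary choice.

```python
import numpy as np

m = np.array([[-1.0, 0.0, 2.0], [3.0, -4.0, 0.5]])

# Every entry that is negative or zero is replaced; positive entries are kept.
replaced = np.where(m <= 0, 1.0, m)
```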
nb_of_non_zeros_values_by_row()

Return the number of nonzero values in each row of the matrix.

Returns
np.array

np.array with the number of nonzero values in each row
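
For a dense array, the same per-row count can be obtained with numpy.count_nonzero. This is an illustrative equivalent, assuming one count per row in row order.

```python
import numpy as np

m = np.array([[0, 1, 2], [0, 0, 5], [0, 0, 0]])

# axis=1 counts along each row, giving one value per row
nonzeros_by_row = np.count_nonzero(m, axis=1)
```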

all_positive()

Check if all values of the matrix are positive or zero.

Returns
boolean

combine(other, new_shape, row_indices_map=None, col_indices_map=None)

Merge another matrix with this one, creating a new matrix

The shape of the merged matrix is new_shape. The rows and columns of the other matrix are mapped to the rows and columns of the merged matrix using row_indices_map and col_indices_map, and the resulting matrix is added to the current one, extended to fit the new shape.

Parameters
other: matrix-like object

another matrix to merge with this one

new_shape: tuple

the shape of the resulting matrix

row_indices_map: dict

a mapping between the indices of the rows of ‘other’ and the merged matrix

col_indices_map: dict

a mapping between the indices of the columns of ‘other’ and the merged matrix

Returns
merged: matrix-like object

Examples

>>> A = NumpyMatrix(np.array(range(6)).reshape((2,3)))
>>> print(A)
[[0 1 2]
 [3 4 5]]
>>> B = NumpyMatrix(np.array(range(4)).reshape((2,2)))
>>> print(B)
[[0 1]
 [2 3]]
>>> A.combine(B, (2, 4), col_indices_map={0:1, 1:3})
NumpyMatrix([[ 0.,  1.,  2.,  1.],
             [ 3.,  6.,  5.,  3.]])
col_indices_map = {0: 1,  # column 0 of the second matrix goes to column 1 of the merged matrix
                   1: 3}  # column 1 of the second matrix goes to column 3 of the merged matrix
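
The merge in the example above can be reproduced step by step with NumPy. The function combine_dense is a hypothetical stand-in for illustration; it assumes dense inputs, that an omitted map leaves the indices of the second matrix unchanged, and that a map never sends two indices to the same target.

```python
import numpy as np

def combine_dense(a, b, new_shape, row_indices_map=None, col_indices_map=None):
    """Hypothetical helper mirroring the documented behaviour on dense arrays."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    merged = np.zeros(new_shape)
    merged[:a.shape[0], :a.shape[1]] = a  # current matrix, extended with zeros
    # assumption: an omitted map leaves the indices of `b` unchanged
    rows = [(row_indices_map or {}).get(i, i) for i in range(b.shape[0])]
    cols = [(col_indices_map or {}).get(j, j) for j in range(b.shape[1])]
    merged[np.ix_(rows, cols)] += b  # add `b` at its mapped positions
    return merged

a = np.arange(6).reshape((2, 3))
b = np.arange(4).reshape((2, 2))
merged = combine_dense(a, b, (2, 4), col_indices_map={0: 1, 1: 3})
```

Entries where both matrices contribute are summed: position (1, 1) holds 4 + 2 = 6, matching the documented output.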

as_dense()

Return a dense version of the matrix

Returns
array

save(path, name='matrix')

Save the matrix in a file.

The format of the file depends on the type of the matrix. See subclasses for more information.

Parameters
path: str

path to the folder where the file will be saved

name: str

name of the file (without extension)

Returns
str

Complete path to the created file

static format_vector(vector, sep)

Format a line of a csrSparseMatrix as a string

Parameters
vector:

line of a csrSparseMatrix

sep: str

separator to use

Returns
str

string representation of the vector

static load_from_text_file(file_object, sep)

Load a matrix and a list of words from a text file

Parameters
file_object: file-like object
sep: str

token used as a separator in the text file

Returns
tuple

(matrix, list of words)
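
The exact text layout is not specified here. As an illustration only, the sketch below assumes one row per line, with the word first and its values after it, all separated by sep; both the layout and the function name parse_word_vectors are assumptions, not the mangoes format.

```python
import io

import numpy as np

def parse_word_vectors(file_object, sep):
    """Hypothetical parser: first field is the word, the rest are its values."""
    words, rows = [], []
    for line in file_object:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, *values = line.split(sep)
        words.append(word)
        rows.append([float(v) for v in values])
    return np.array(rows), words

text = io.StringIO("cat\t1.0\t0.0\ndog\t0.5\t2.0\n")
matrix, words = parse_word_vectors(text, sep="\t")
```

The returned pair follows the documented order, matrix first and list of words second.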

classmethod load(path, name)

Load a matrix from a ‘.npz’ archive of ‘.npy’ files.

Parameters
path: str

path to the folder where the archive is stored

name: str

name of the file (without extension)

Returns
a matrix-like object
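
An ‘.npz’ archive is NumPy's container for one or more ‘.npy’ arrays. The round trip below uses only numpy.savez and numpy.load to show the mechanism; the file name and the key name are assumptions and do not describe the layout that save() actually writes.

```python
import os
import tempfile

import numpy as np

matrix = np.arange(6, dtype=float).reshape((2, 3))

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "matrix.npz")  # hypothetical file name
    np.savez(path, matrix=matrix)              # hypothetical key name "matrix"
    with np.load(path) as archive:
        restored = archive["matrix"]           # read the array back by key
```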
pairwise_distances(Y=None, metric='cosine', **kwargs)

Compute a distance matrix from the rows of the matrix.

This method returns the matrix of pairwise distances between the rows of the matrix. If Y is given (default is None), the returned matrix instead contains the pairwise distances between the rows of this matrix and the rows of Y.

This method relies on the sklearn.metrics.pairwise_distances function, so you can use any distance available in it.

Valid values for metric are:

  • From scikit-learn: [‘cityblock’, ‘cosine’, ‘euclidean’, ‘l1’, ‘l2’, ‘manhattan’]. These metrics support sparse matrix inputs.

  • From scipy.spatial.distance: [‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘correlation’, ‘dice’, ‘hamming’, ‘jaccard’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘yule’]

See the documentation for sklearn.metrics.pairwise_distances for details on these metrics.

Parameters
Y: matrix, optional

An optional second matrix.

metric: string or callable

The metric to use when calculating distance between vectors.

**kwargs: optional keyword parameters

Any further parameters are passed directly to the distance function.

Returns
array [len(self), len(self)] or [len(self), len(Y)]

A distance matrix D such that D_{i, j} is the distance between the ith and the jth rows of this matrix, if Y is None. If Y is not None, then D_{i, j} is the distance between the ith row of this matrix and the jth row of Y.
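
To make the shape and meaning of D concrete, the default ‘cosine’ metric can be written out for dense arrays with NumPy alone. This is a sketch of the definition (one minus the cosine similarity), not the code path used by the method; cosine_distances_dense is a hypothetical name and the sketch assumes no all-zero row.

```python
import numpy as np

def cosine_distances_dense(x, y=None):
    """Hypothetical helper: 1 - cosine similarity between rows of x and rows of y."""
    x = np.asarray(x, dtype=float)
    y = x if y is None else np.asarray(y, dtype=float)
    x_unit = x / np.linalg.norm(x, axis=1, keepdims=True)  # assumes no all-zero row
    y_unit = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x_unit @ y_unit.T

x = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
d = cosine_distances_dense(x)                              # shape (3, 3)
d_xy = cosine_distances_dense(x, np.array([[2.0, 0.0]]))   # shape (3, 1)
```

Orthogonal rows are at distance 1, and rows pointing in the same direction are at distance 0 whatever their length.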