mangoes.evaluate module

Classes and functions to evaluate embeddings.

class mangoes.evaluate.AnalogyResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of analogies evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary whose keys are the questions of the dataset (without the gold answer) and whose values are tuples with the answers predicted by the embedding using 3COSMUL and 3COSADD, respectively

score: AnalogyResult.Score

Global score. The score is a tuple with the percentage of questions for which the expected answer appears among the predictions, computed with 3COSADD and 3COSMUL respectively

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Score(cosadd, cosmul)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Score(cosadd, cosmul)

Bases: tuple

Attributes
cosadd

Alias for field number 0

cosmul

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property cosadd

Alias for field number 0

property cosmul

Alias for field number 1

mangoes.evaluate.analogy(embedding, dataset='all', allowed_answers=1, epsilon=0.001)

Evaluate an embedding on analogy task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

allowed_answers: int

Number of answers to consider for each question

epsilon: float

Value used to prevent division by zero in 3CosMul

Returns
AnalogyResult
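
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance, and loading or training it is not shown here):

    import mangoes.evaluate

    def report_analogy(embedding):
        """Run the analogy evaluation with the default dataset and print
        the documented result fields."""
        # Defaults: dataset='all', allowed_answers=1, epsilon=0.001
        result = mangoes.evaluate.analogy(embedding)

        # result.score is a Score namedtuple: (cosadd, cosmul) percentages
        print("3COSADD:", result.score.cosadd)
        print("3COSMUL:", result.score.cosmul)

        # Printable reports, from coarse to fine
        print(result.summary)
        print(result.detail)

        # Dataset words missing from the embedding, and skipped questions
        print(len(result.oov), "OOV words;", result.ignored, "questions ignored")
        return result
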
class mangoes.evaluate.SimilarityResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of word similarity evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary where keys are the questions of the dataset (without gold) and the values are the computed similarity

score: SimilarityResult.Score

Global score. The score is a tuple of the two correlation coefficients, Pearson and Spearman, respectively. Each coefficient is a tuple with the coefficient itself and its p-value.

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Coeff(coeff, pvalue)

Attributes

Score(pearson, spearman)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Coeff(coeff, pvalue)

Bases: tuple

Attributes
coeff

Alias for field number 0

pvalue

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property coeff

Alias for field number 0

property pvalue

Alias for field number 1

class Score(pearson, spearman)

Bases: tuple

Attributes
pearson

Alias for field number 0

spearman

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property pearson

Alias for field number 0

property spearman

Alias for field number 1

mangoes.evaluate.similarity(embedding, dataset='all', metric=<function rowwise_cosine_similarity>)

Evaluate an embedding on word similarity task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

metric

the metric to use to compute the similarity (default: cosine)

Returns
SimilarityResult
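
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance):

    import mangoes.evaluate

    def report_similarity(embedding):
        """Run the word-similarity evaluation with the default dataset and
        the default cosine metric, and print the documented scores."""
        result = mangoes.evaluate.similarity(embedding)

        # result.score is a Score namedtuple (pearson, spearman); each field
        # is a Coeff namedtuple (coeff, pvalue)
        pearson, spearman = result.score
        print("Pearson  r   = {:.3f} (p = {:.3g})".format(pearson.coeff, pearson.pvalue))
        print("Spearman rho = {:.3f} (p = {:.3g})".format(spearman.coeff, spearman.pvalue))

        # Printable report with the results per subset of the dataset
        print(result.detail)
        return result
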
class mangoes.evaluate.OutlierDetectionResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of outlier detection evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary where keys are the questions of the dataset and the values are the words of the question, sorted according to their compactness score

score: OutlierDetectionResult.Score

Global score. The score is a tuple with the Outlier Position Percentage and the Accuracy measures.

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Score(opp, accuracy)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Score(opp, accuracy)

Bases: tuple

Attributes
accuracy

Alias for field number 1

opp

Alias for field number 0

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property accuracy

Alias for field number 1

property opp

Alias for field number 0

mangoes.evaluate.outlier_detection(embedding, dataset='all')

Evaluate an embedding on outlier detection task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

Returns
OutlierDetectionResult
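
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance):

    import mangoes.evaluate

    def report_outlier_detection(embedding):
        """Run the outlier-detection evaluation with the default dataset
        and print the documented scores."""
        result = mangoes.evaluate.outlier_detection(embedding)

        # result.score is a Score namedtuple: (opp, accuracy), i.e. the
        # Outlier Position Percentage and the Accuracy measures
        print("OPP:     ", result.score.opp)
        print("Accuracy:", result.score.accuracy)

        # Printable summary of the results
        print(result.summary)
        return result
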
mangoes.evaluate.isotropy_from_partition_function(*args, **kwargs)

mangoes.evaluate.distances_one_word_histogram(*args, **kwargs)

mangoes.evaluate.distances_histogram(*args, **kwargs)

mangoes.evaluate.tsne(embeddings)

Create a 2D projection of the embeddings using t-SNE

Parameters
embeddings: an instance of Embeddings

Instance of mangoes.Embeddings to project
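
Example (a sketch only; embedding is assumed to be an existing mangoes.Embeddings instance, and the return value of tsne is assumed here to be an array-like of 2D coordinates, one row per word, which is not documented above):

    import matplotlib.pyplot as plt

    import mangoes.evaluate

    def plot_tsne(embedding):
        """Project the embeddings to 2D with mangoes.evaluate.tsne and
        scatter-plot the result (assumed to be rows of (x, y) coordinates)."""
        points = mangoes.evaluate.tsne(embedding)

        xs = [p[0] for p in points]  # assumed: first coordinate of each word
        ys = [p[1] for p in points]  # assumed: second coordinate of each word
        plt.scatter(xs, ys, s=5)
        plt.title("t-SNE projection of the embeddings")
        plt.show()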