mangoes.evaluate module

Classes and functions to evaluate embeddings.

class mangoes.evaluate.AnalogyResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of analogies evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary whose keys are the questions of the dataset (without the gold answer) and whose values are tuples with the answers predicted by the embedding using 3COSMUL and 3COSADD, respectively

score: AnalogyResult.Score

Global score. The score is a tuple with the percentage of questions for which the expected answer appears among the predictions, computed with 3COSADD and 3COSMUL respectively

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Score(cosadd, cosmul)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Score(cosadd, cosmul)

Bases: tuple

Attributes
cosadd

Alias for field number 0

cosmul

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property cosadd

Alias for field number 0

property cosmul

Alias for field number 1

mangoes.evaluate.analogy(embedding, dataset='all', allowed_answers=1, epsilon=0.001)

Evaluate an embedding on analogy task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

allowed_answers: int

Number of answers to consider for each question

epsilon: float

Value used to prevent division by zero in 3CosMul

Returns
AnalogyResult
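
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance, and loading or training it is not shown here):

    import mangoes.evaluate

    def report_analogy(embedding):
        """Run the analogy evaluation with the default dataset and print
        the documented result fields."""
        # Defaults: dataset='all', allowed_answers=1, epsilon=0.001
        result = mangoes.evaluate.analogy(embedding)

        # result.score is a Score namedtuple: (cosadd, cosmul) percentages
        print("3COSADD:", result.score.cosadd)
        print("3COSMUL:", result.score.cosmul)

        # Printable reports, from coarse to fine
        print(result.summary)
        print(result.detail)

        # Dataset words missing from the embedding, and skipped questions
        print(len(result.oov), "OOV words;", result.ignored, "questions ignored")
        return result
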
class mangoes.evaluate.SimilarityResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of word similarity evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary where keys are the questions of the dataset (without gold) and the values are the computed similarity

score: SimilarityResult.Score

Global score. The score is a tuple of the two correlation coefficients, Pearson and Spearman, respectively. Each coefficient is a tuple with the coefficient itself and its p-value.

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Coeff(coeff, pvalue)

Attributes

Score(pearson, spearman)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Coeff(coeff, pvalue)

Bases: tuple

Attributes
coeff

Alias for field number 0

pvalue

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property coeff

Alias for field number 0

property pvalue

Alias for field number 1

class Score(pearson, spearman)

Bases: tuple

Attributes
pearson

Alias for field number 0

spearman

Alias for field number 1

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property pearson

Alias for field number 0

property spearman

Alias for field number 1

mangoes.evaluate.similarity(embedding, dataset='all', metric=<function rowwise_cosine_similarity>)

Evaluate an embedding on word similarity task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

metric

the metric to use to compute the similarity (default: cosine)

Returns
SimilarityResult
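
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance):

    import mangoes.evaluate

    def report_similarity(embedding):
        """Run the word-similarity evaluation with the default dataset and
        the default cosine metric, and print the documented scores."""
        result = mangoes.evaluate.similarity(embedding)

        # result.score is a Score namedtuple (pearson, spearman); each field
        # is a Coeff namedtuple (coeff, pvalue)
        pearson, spearman = result.score
        print("Pearson  r   = {:.3f} (p = {:.3g})".format(pearson.coeff, pearson.pvalue))
        print("Spearman rho = {:.3f} (p = {:.3g})".format(spearman.coeff, spearman.pvalue))

        # Printable report with the results per subset of the dataset
        print(result.detail)
        return result
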
class mangoes.evaluate.OutlierDetectionResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')

Bases: mangoes.evaluate._Result

Class to handle the result of outlier detection evaluation

Attributes
dataset: mangoes.dataset.Dataset

the dataset used to evaluate

predictions: dict

A dictionary where keys are the questions of the dataset and the values are the words of the question, sorted according to their compactness score

score: OutlierDetectionResult.Score

Global score. The score is a tuple with the Outlier Position Percentage and the Accuracy measures.

oov: set of strings

words of the dataset that are not represented in the embedding

ignored: int

number of questions of the dataset ignored in the evaluation

summary

Returns a printable string with a summary of the results of the evaluation

detail

Returns a printable string with the results of the evaluation for each subset of the dataset

more_detail

Returns a printable string with the detail of the results of the evaluation for each subset

Methods

Score(opp, accuracy)

Attributes

get_score([subset])

Returns the score of a subset of the dataset

to_string([show_subsets, show_questions, …])

Returns a printable version of the results

class Score(opp, accuracy)

Bases: tuple

Attributes
accuracy

Alias for field number 1

opp

Alias for field number 0

Methods

count(value, /)

Return number of occurrences of value.

index(value[, start, stop])

Return first index of value.

property accuracy

Alias for field number 1

property opp

Alias for field number 0

mangoes.evaluate.outlier_detection(embedding, dataset='all')

Evaluate an embedding on outlier detection task

Parameters
embedding: mangoes.Embeddings

The representation to evaluate

dataset: mangoes.Dataset

The dataset to use

Returns
OutlierDetectionResult
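
Example (a minimal usage sketch; embedding is assumed to be an existing mangoes.Embeddings instance):

    import mangoes.evaluate

    def report_outlier_detection(embedding):
        """Run the outlier-detection evaluation with the default dataset
        and print the documented scores."""
        result = mangoes.evaluate.outlier_detection(embedding)

        # result.score is a Score namedtuple: (opp, accuracy), i.e. the
        # Outlier Position Percentage and the Accuracy measures
        print("OPP:     ", result.score.opp)
        print("Accuracy:", result.score.accuracy)

        # Printable summary of the results
        print(result.summary)
        return result
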
mangoes.evaluate.isotropy_from_partition_function(*args, **kwargs)

mangoes.evaluate.distances_one_word_histogram(*args, **kwargs)

mangoes.evaluate.distances_histogram(*args, **kwargs)

mangoes.evaluate.tsne(embeddings)

Create a 2D projection of the embeddings using t-SNE

Parameters
embeddings: an instance of Embeddings

Instance of mangoes.Embeddings to project
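
Example (a sketch only; embedding is assumed to be an existing mangoes.Embeddings instance, and the return value of tsne is assumed here to be an array-like of 2D coordinates, one row per word, which is not documented above):

    import matplotlib.pyplot as plt

    import mangoes.evaluate

    def plot_tsne(embedding):
        """Project the embeddings to 2D with mangoes.evaluate.tsne and
        scatter-plot the result (assumed to be rows of (x, y) coordinates)."""
        points = mangoes.evaluate.tsne(embedding)

        xs = [p[0] for p in points]  # assumed: first coordinate of each word
        ys = [p[1] for p in points]  # assumed: second coordinate of each word
        plt.scatter(xs, ys, s=5)
        plt.title("t-SNE projection of the embeddings")
        plt.show()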