mangoes.evaluate module
Classes and functions to evaluate embeddings.
class mangoes.evaluate.AnalogyResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')
Bases: mangoes.evaluate._Result
Class to handle the result of an analogy evaluation
- Attributes
- dataset: mangoes.dataset.Dataset
the dataset used to evaluate
- predictions: dict
A dictionary where keys are the questions of the dataset (without gold) and the values are a tuple with the answers predicted with an Embedding using 3COSMUL and 3COSADD respectively
- score: AnalogyResult.Score
Global score. The score is a tuple with the percentage of questions where the expected answer is in the predictions computed using, respectively, 3COSADD and 3COSMUL
- oov: set of strings
words of the dataset that are not represented in the embedding
- ignored: int
number of questions of the dataset ignored in the evaluation
- summary
- detail
Returns a printable string with the results of the evaluation for each subset of the dataset
- more_detail
Returns a printable string with the detail of the results of the evaluation for each subset
Methods
- Score(cosadd, cosmul)
- get_score([subset])
Returns the score of a subset of the dataset
- to_string([show_subsets, show_questions, …])
Returns a printable version of the results
mangoes.evaluate.analogy(embedding, dataset='all', allowed_answers=1, epsilon=0.001)
Evaluate an embedding on the analogy task
- Parameters
- embedding: mangoes.Embeddings
The representation to evaluate
- dataset: mangoes.Dataset
The dataset to use
- allowed_answers: int
Number of predicted answers to consider for each question
- epsilon: float
Value used to prevent division by zero in 3CosMul
- Returns
- AnalogyResult
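The 3COSADD and 3COSMUL objectives used for the predictions can be sketched with plain NumPy. This is a minimal illustration on a toy vocabulary, not the mangoes implementation; the words and vector values are invented for the example:

```python
import numpy as np

# Toy vocabulary: rows of `vectors` are word vectors, unit-normalized below.
words = ["king", "queen", "man", "woman", "apple"]
vectors = np.array([[0.9, 0.1, 0.4],
                    [0.7, 0.6, 0.4],
                    [0.9, 0.0, 0.1],
                    [0.7, 0.5, 0.1],
                    [0.1, 0.1, 0.9]])
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
index = {w: i for i, w in enumerate(words)}

def cos_to(word):
    """Cosine similarity of every vocabulary word with `word` (rows have unit norm)."""
    return vectors @ vectors[index[word]]

def analogy_3cosadd(a, b, c):
    """Answer 'a is to b as c is to ?' with the 3COSADD objective."""
    scores = cos_to(b) - cos_to(a) + cos_to(c)
    for w in (a, b, c):                # the question words cannot be the answer
        scores[index[w]] = -np.inf
    return words[int(np.argmax(scores))]

def analogy_3cosmul(a, b, c, epsilon=0.001):
    """Same question with 3COSMUL; epsilon prevents division by zero."""
    shift = lambda s: (s + 1) / 2      # move cosines from [-1, 1] to [0, 1]
    scores = shift(cos_to(b)) * shift(cos_to(c)) / (shift(cos_to(a)) + epsilon)
    for w in (a, b, c):
        scores[index[w]] = -np.inf
    return words[int(np.argmax(scores))]

print(analogy_3cosadd("man", "king", "woman"))  # -> queen
print(analogy_3cosmul("man", "king", "woman"))  # -> queen
```

Both objectives rank every candidate word and keep the best-scoring one; the score attribute above reports the fraction of questions where the expected word is among the predictions.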
class mangoes.evaluate.SimilarityResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')
Bases: mangoes.evaluate._Result
Class to handle the result of word similarity evaluation
- Attributes
- dataset: mangoes.dataset.Dataset
the dataset used to evaluate
- predictions: dict
A dictionary where keys are the questions of the dataset (without gold) and the values are the computed similarity
- score: SimilarityResult.Score
Global score. The score is a tuple of the two correlation coefficients, Pearson and Spearman, respectively. Each coefficient is itself a tuple with the coefficient and the p-value.
- oov: set of strings
words of the dataset that are not represented in the embedding
- ignored: int
number of questions of the dataset ignored in the evaluation
- summary
- detail
Returns a printable string with the results of the evaluation for each subset of the dataset
- more_detail
Returns a printable string with the detail of the results of the evaluation for each subset
Methods
- Coeff(coeff, pvalue)
- Score(pearson, spearman)
- get_score([subset])
Returns the score of a subset of the dataset
- to_string([show_subsets, show_questions, …])
Returns a printable version of the results
class Coeff(coeff, pvalue)
Bases: tuple
Methods
- count(value, /)
Return number of occurrences of value.
- index(value[, start, stop])
Return first index of value.
- property coeff
Alias for field number 0
- property pvalue
Alias for field number 1
mangoes.evaluate.similarity(embedding, dataset='all', metric=<function rowwise_cosine_similarity>)
Evaluate an embedding on the word similarity task
- Parameters
- embedding: mangoes.Embeddings
The representation to evaluate
- dataset: mangoes.Dataset
The dataset to use
- metric
the metric to use to compute the similarity (default: cosine)
- Returns
- SimilarityResult
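The evaluation pairs a similarity metric with the two correlation coefficients described in SimilarityResult. A minimal sketch with SciPy; the vectors and the gold-standard scores are invented for the example and do not come from any mangoes dataset:

```python
import numpy as np
from scipy import stats

# Invented word vectors and a tiny gold-standard list of
# (word1, word2, human similarity score) triples.
vectors = {"cat": np.array([0.9, 0.3]), "dog": np.array([0.8, 0.4]),
           "car": np.array([0.1, 0.9]), "truck": np.array([0.2, 0.8])}
gold = [("cat", "dog", 9.0), ("car", "truck", 8.5),
        ("cat", "car", 1.5), ("dog", "truck", 2.0)]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

predicted = [cosine(vectors[a], vectors[b]) for a, b, _ in gold]
expected = [score for _, _, score in gold]

# As in SimilarityResult.score, each coefficient comes with its p-value.
pearson_r, pearson_p = stats.pearsonr(predicted, expected)
spearman_rho, spearman_p = stats.spearmanr(predicted, expected)
print((pearson_r, pearson_p), (spearman_rho, spearman_p))
```

Pearson correlates the raw cosine values with the human scores, while Spearman compares only their rankings, which is why the two coefficients can differ.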
class mangoes.evaluate.OutlierDetectionResult(dataset, predictions, oov='Not Applicable', ignored='Not Applicable')
Bases: mangoes.evaluate._Result
Class to handle the result of outlier detection evaluation
- Attributes
- dataset: mangoes.dataset.Dataset
the dataset used to evaluate
- predictions: dict
A dictionary where keys are the questions of the dataset and the values are the words of the question, sorted according to their compactness score
- score: OutlierDetectionResult.Score
Global score. The score is a tuple with the Outlier Position Percentage and the Accuracy measures.
- oov: set of strings
words of the dataset that are not represented in the embedding
- ignored: int
number of questions of the dataset ignored in the evaluation
- summary
- detail
Returns a printable string with the results of the evaluation for each subset of the dataset
- more_detail
Returns a printable string with the detail of the results of the evaluation for each subset
Methods
- Score(opp, accuracy)
- get_score([subset])
Returns the score of a subset of the dataset
- to_string([show_subsets, show_questions, …])
Returns a printable version of the results
mangoes.evaluate.outlier_detection(embedding, dataset='all')
Evaluate an embedding on the outlier detection task
- Parameters
- embedding: mangoes.Embeddings
The representation to evaluate
- dataset: mangoes.Dataset
The dataset to use
- Returns
- OutlierDetectionResult
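A minimal sketch of compactness-based outlier detection and the two measures reported in OutlierDetectionResult.Score (Outlier Position Percentage and accuracy), on a single toy question. The vectors and the mean-pairwise-similarity compactness used here are illustrative assumptions, not the mangoes implementation:

```python
import numpy as np

# Toy question: three related words plus one outlier (illustrative vectors).
vectors = {"cat": np.array([0.9, 0.1]), "dog": np.array([0.85, 0.2]),
           "wolf": np.array([0.8, 0.3]), "piano": np.array([0.1, 0.9])}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compactness(word, group):
    """Mean pairwise similarity of the group once `word` is removed."""
    rest = [w for w in group if w != word]
    sims = [cosine(vectors[a], vectors[b])
            for i, a in enumerate(rest) for b in rest[i + 1:]]
    return sum(sims) / len(sims)

group = ["cat", "dog", "piano", "wolf"]
# Sort so the word whose removal leaves the most compact set comes last:
# that last position is the predicted outlier.
ranked = sorted(group, key=lambda w: compactness(w, group))
outlier_position = ranked.index("piano")          # 0-based position in the ranking
opp = outlier_position / (len(group) - 1) * 100   # Outlier Position Percentage
accuracy = 100.0 if ranked[-1] == "piano" else 0.0
print(ranked, opp, accuracy)                      # piano ranked last
```

OPP credits near-misses (an outlier ranked second-to-last still scores well), while accuracy only counts questions where the outlier is ranked last.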
mangoes.evaluate.isotropy_from_partition_function(*args, **kwargs)

mangoes.evaluate.distances_one_word_histogram(*args, **kwargs)

mangoes.evaluate.distances_histogram(*args, **kwargs)
mangoes.evaluate.tsne(embeddings)
Create a 2D projection of the embeddings using t-SNE
- Parameters
- embeddings: mangoes.Embeddings
Instance of mangoes.Embeddings to project
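Such a projection can be sketched with scikit-learn's TSNE, using random vectors as a stand-in for the embedding matrix; using scikit-learn here is an assumption for the illustration, not necessarily what mangoes does internally:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for an embedding matrix: 50 word vectors of dimension 20
# (mangoes would take these from an Embeddings instance instead).
rng = np.random.default_rng(0)
matrix = rng.normal(size=(50, 20))

# Project to 2 dimensions; perplexity must stay below the number of points.
projection = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(matrix)
print(projection.shape)  # (50, 2): one 2D point per word vector
```

The resulting (n_words, 2) array can then be scattered with word labels to inspect the neighborhood structure of the embedding.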