Hyperparameters

Both corpus preprocessing and the construction of word representations expose parameters that can be tuned with mangoes.

Reference: LEVY, Omer, GOLDBERG, Yoav, and DAGAN, Ido. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 2015, vol. 3, p. 211-225.

CORPUS

| Description | Params | Values | Effect |
|---|---|---|---|
| Text normalisation | lower | boolean (default = False) | Convert input corpus to lower case |
| | digit | boolean (default = False) | Replace all numeric values with 0 |
| | ignore_punctuation | boolean (default = False) | Ignore punctuation |
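To make the three normalisation options concrete, here is a minimal standalone sketch in plain Python. It does not use the mangoes API: the function name `normalise` is hypothetical, and it assumes that a whole number (including decimal separators) is replaced by a single 0 and that only tokens made entirely of punctuation are ignored.

```python
import re
import string


def normalise(tokens, lower=False, digit=False, ignore_punctuation=False):
    """Illustrative re-implementation of the three options (not the mangoes code)."""
    result = []
    for token in tokens:
        # Assumption: a token is "punctuation" when every character is punctuation.
        if ignore_punctuation and all(ch in string.punctuation for ch in token):
            continue
        if lower:
            token = token.lower()
        if digit:
            # Assumption: each numeric value collapses to a single "0".
            token = re.sub(r"\d+(?:[.,]\d+)*", "0", token)
        result.append(token)
    return result


tokens = ["The", "year", "2015", ",", "vol.", "3"]
print(normalise(tokens, lower=True, digit=True, ignore_punctuation=True))
```

Each option shrinks the vocabulary in a different way, so they are usually worth tuning together with the vocabulary filters.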

COUNTING

| Description | Params | Values | Effect |
|---|---|---|---|
| Vocabulary and features selection | words | a Vocabulary | Words to represent |
| | context, or the vocabulary parameter of the context | a Vocabulary | Words to use as features |

If the vocabulary is extracted from the corpus: Corpus.create_vocabulary()

| Description | Params | Values | Effect |
|---|---|---|---|
| Vocabulary filters | filters | function (default = None) | Filter the most or least frequent words, remove punctuation, … |
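The idea of a vocabulary filter can be sketched as a function that takes word counts and returns a reduced set of counts. The example below is a standalone illustration under that assumption; `drop_frequent`, `drop_rare`, `drop_punctuation` and `build_vocabulary` are hypothetical names, not functions of the mangoes library.

```python
import string
from collections import Counter


def drop_frequent(max_count):
    """Hypothetical filter factory: keep words seen at most max_count times."""
    def apply(counts):
        return Counter({w: c for w, c in counts.items() if c <= max_count})
    return apply


def drop_rare(min_count):
    """Hypothetical filter factory: keep words seen at least min_count times."""
    def apply(counts):
        return Counter({w: c for w, c in counts.items() if c >= min_count})
    return apply


def drop_punctuation(counts):
    """Hypothetical filter: remove tokens made only of punctuation characters."""
    return Counter({w: c for w, c in counts.items()
                    if not all(ch in string.punctuation for ch in w)})


def build_vocabulary(tokens, filters=None):
    """Count the tokens, apply each filter in order, return the sorted words."""
    counts = Counter(tokens)
    for apply_filter in filters or []:
        counts = apply_filter(counts)
    return sorted(counts)


tokens = "the cat sat on the mat , the end".split()
print(build_vocabulary(tokens, filters=[drop_frequent(2), drop_punctuation]))
```

Because the filters are applied in sequence, their order matters as soon as one of them depends on counts changed by another.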

| Description | Params | Values | Effect |
|---|---|---|---|
| Context definition | context | callable class (default = Window) | Given a sentence, return for each word the words considered as co-occurring with it |

If using window-like contexts: context.Window

| Description | Params | Values | Effect |
|---|---|---|---|
| Size of the window | window_half_size | int (default = 1) | Half size of the window |
| Fixed size or dynamic | dynamic | boolean (default = False) | If True, draw the window size at random between 1 and window_half_size instead of using a fixed size |
| Symmetric or asymmetric | symmetric | boolean (default = True) | Whether the window is centered on the word or asymmetric |
| Clean or dirty | dirty | boolean (default = False) | If dirty, remove ignored words before creating the window |
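The interaction between these window options is easier to see on a small example. The following standalone sketch is not the mangoes `Window` class: the function `window_contexts` is hypothetical, "ignored words" are assumed to be words outside the vocabulary, and an asymmetric window is assumed to look only at the left context.

```python
import random


def window_contexts(sentence, vocabulary=None, window_half_size=1,
                    dynamic=False, symmetric=True, dirty=False, rng=None):
    """Return (word, context words) pairs for one sentence (illustrative only)."""
    rng = rng or random.Random()

    def known(word):
        return vocabulary is None or word in vocabulary

    if dirty:
        # Dirty: ignored words are removed first, so the window reaches further.
        sentence = [w for w in sentence if known(w)]

    pairs = []
    for i, word in enumerate(sentence):
        # Dynamic: the size is drawn uniformly between 1 and window_half_size.
        size = rng.randint(1, window_half_size) if dynamic else window_half_size
        left = sentence[max(0, i - size):i]
        # Assumption: an asymmetric window only uses the left context.
        right = sentence[i + 1:i + 1 + size] if symmetric else []
        # Clean: ignored words keep their position but are not counted.
        pairs.append((word, [w for w in left + right if known(w)]))
    return pairs


sentence = ["the", "quick", "brown", "fox"]
vocabulary = {"the", "brown", "fox"}
print(dict(window_contexts(sentence, vocabulary))["brown"])
print(dict(window_contexts(sentence, vocabulary, dirty=True))["brown"])
```

With a clean window, "brown" only co-occurs with "fox" because "quick" occupies the left slot; with a dirty window, "quick" is removed first and "the" becomes a neighbour.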

| Description | Params | Values | Effect |
|---|---|---|---|
| Subsampling | subsampling | boolean or float defining the threshold (default = False) | Downsample the words that are more frequent than the threshold |
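As an illustration of subsampling, the sketch below uses the word2vec-style rule discussed by Levy et al.: a word whose relative frequency f exceeds the threshold t is kept with probability sqrt(t / f). Whether mangoes applies exactly this rule is an assumption, and the names `keep_probability` and `subsample` are hypothetical.

```python
import math
import random
from collections import Counter


def keep_probability(frequency, threshold):
    """Probability of keeping one occurrence of a word with this relative frequency."""
    if frequency <= threshold:
        return 1.0  # words at or below the threshold are never dropped
    return math.sqrt(threshold / frequency)


def subsample(tokens, threshold=1e-5, rng=None):
    """Randomly drop occurrences of words more frequent than the threshold."""
    rng = rng or random.Random()
    counts = Counter(tokens)
    total = len(tokens)
    return [t for t in tokens
            if rng.random() < keep_probability(counts[t] / total, threshold)]


tokens = ["the"] * 90 + ["cat"] * 10
sampled = subsample(tokens, threshold=0.1, rng=random.Random(0))
print(Counter(sampled))
```

In this toy corpus every occurrence of "cat" survives, while each occurrence of "the" is kept with probability of about one third.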

EMBEDDING

| Description | Params | Values | Effect |
|---|---|---|---|
| Transformations applied to the co-occurrence matrix | transformations | list of functions (default = None) | Apply weighting and dimensionality reduction to counts |
| Dimension of the vectors | dimensions | int | Size of the vectors |

If using PMI or a variant

| Description | Params | Values | Effect |
|---|---|---|---|
| Context Distribution Smoothing | alpha | float (default = 1, no smoothing) | Raise context counts to the power of alpha to “smooth” the contexts’ distribution |
| Shift | shift | int >= 1 (default = 1, no shift) | Shift the matrix by log(shift) |
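The effect of alpha and shift can be shown with a small pure-Python computation of shifted positive PMI, following the definitions in Levy et al.: PMI(w, c) = log(P(w, c) / (P(w) · P_alpha(c))), where P_alpha(c) is computed from context counts raised to the power alpha, and the shifted positive variant is max(PMI − log(shift), 0). The function name `shifted_ppmi` is hypothetical and this is not the mangoes implementation.

```python
import math


def shifted_ppmi(counts, alpha=1.0, shift=1):
    """Shifted positive PMI of a dense co-occurrence matrix (list of rows)."""
    total = sum(sum(row) for row in counts)
    word_totals = [sum(row) for row in counts]
    context_totals = [sum(column) for column in zip(*counts)]
    # Context distribution smoothing: raise context counts to the power alpha.
    smoothed = [c ** alpha for c in context_totals]
    smoothed_sum = sum(smoothed)

    result = []
    for row, word_total in zip(counts, word_totals):
        new_row = []
        for count, smoothed_count in zip(row, smoothed):
            if count == 0:
                new_row.append(0.0)  # unseen pairs are clipped to 0
                continue
            p_joint = count / total
            p_word = word_total / total
            p_context = smoothed_count / smoothed_sum
            pmi = math.log(p_joint / (p_word * p_context))
            # Shift by log(shift), then keep only positive values.
            new_row.append(max(pmi - math.log(shift), 0.0))
        result.append(new_row)
    return result


counts = [[2, 0], [0, 2]]
print(shifted_ppmi(counts))           # PMI of the diagonal cells is log(2)
print(shifted_ppmi(counts, shift=2))  # shifting by log(2) brings them to 0
```

With alpha below 1, rare contexts receive a larger smoothed probability, which lowers their PMI and reduces the bias of PMI towards rare events.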

If using SVD (svd())

| Description | Params | Values | Effect |
|---|---|---|---|
| Eigenvalue weighting | weight | int (default = 1) | Weighting exponent to apply to the eigenvalues |
| Add context vectors | add_context_vectors | boolean (default = False) | Use the context vectors in addition to the word vectors |
| Symmetric weighting | symmetric | boolean (default = False) | Selects how the context vectors are computed |
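The SVD options can be sketched with NumPy as follows. This is an illustration based on the description in Levy et al., not the mangoes `svd()` code: the function name `svd_embeddings` is hypothetical, and it assumes that `symmetric=True` applies the same eigenvalue weighting to the context vectors while `symmetric=False` leaves them unweighted.

```python
import numpy as np


def svd_embeddings(matrix, dimensions, weight=1.0,
                   add_context_vectors=False, symmetric=False):
    """Truncated SVD embeddings with eigenvalue weighting (illustrative only)."""
    matrix = np.asarray(matrix, dtype=float)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    u, s, vt = u[:, :dimensions], s[:dimensions], vt[:dimensions, :]

    # Word vectors: U scaled by the singular values raised to the power weight.
    words = u * s ** weight
    # Assumption: symmetric weights the context vectors the same way.
    contexts = vt.T * s ** weight if symmetric else vt.T

    if add_context_vectors:
        if matrix.shape[0] != matrix.shape[1]:
            raise ValueError("adding context vectors needs words == contexts")
        words = words + contexts
    return words, contexts


matrix = [[2.0, 0.0], [0.0, 1.0]]
words, contexts = svd_embeddings(matrix, dimensions=2, weight=1)
print(np.allclose(words @ contexts.T, matrix))
```

With weight = 1 and unweighted context vectors, the product of word and context vectors reconstructs the factorised matrix; lower weights such as 0.5 or 0 reduce the influence of the largest singular values.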