mangoes.composition module

This module provides different ways to derive a phrase vector from the vectors of its parts.

Given two vectors u and v representing two words w1 and w2, composition methods can be applied to derive a new vector p representing the phrase ‘w1 w2’ by combining u and v.

This module provides classes to learn parameters for various compositional models and apply them to predict new vectors.

These models, each of which is sketched in numpy after this list, are:

  • additive model [8]: p is obtained as a (weighted) sum of u and v:

    \mathbf{p = \alpha u + \beta v}

    These weights can be learned with AdditiveComposer

  • multiplicative model [8]: p is obtained by component-wise multiplication of u and v:

    \mathbf{p = u \odot v}

    This model has no parameters to learn, but MultiplicativeComposer is provided so it can be compared with the other models

  • dilation model [7][8]: p is obtained by stretching v by a factor \lambda in the direction of u, using the dot products u·u and u·v:

    \mathbf{p = (u \cdot u)v + (\lambda - 1)(u \cdot v)u}

    \lambda can be learned with DilationComposer

  • full additive model [9]: an extension of the additive model where the two n-dimensional input vectors are multiplied by two n x n weight matrices:

    \mathbf{p = Au + Bv}

    A and B can be learned with FullAdditiveComposer

  • lexical function model [10]: w1 is seen as a function, represented as a matrix U, and p is the product of this matrix and v:

    \mathbf{p = Uv}

    U can be learned with LexicalComposer
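
As a quick reference, each of these operations is a one-liner in numpy. The sketch below applies the five formulas to two toy vectors; \alpha, \beta, \lambda, A, B and U are given arbitrary placeholder values here instead of being learned:

>>> import numpy
>>> u, v = numpy.array([1.0, 2.0]), numpy.array([3.0, 4.0])
>>> alpha, beta, lambda_ = 1.0, 1.0, 2.0
>>> p_additive = alpha * u + beta * v                       # additive
>>> p_multiplicative = u * v                                # multiplicative
>>> p_dilation = (u @ u) * v + (lambda_ - 1) * (u @ v) * u  # dilation
>>> A, B, U = numpy.eye(2), numpy.eye(2), numpy.eye(2)      # placeholder matrices
>>> p_full_additive = A @ u + B @ v                         # full additive
>>> p_lexical = U @ v                                       # lexical function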

References

[6] Boleda, G., Baroni, M., & McNally, L. (2013). Intensionality was only alleged: On adjective-noun composition in distributional semantics. In Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers (pp. 35-46).

[7] Clark, S., Coecke, B., & Sadrzadeh, M. (2008). A compositional distributional model of meaning. In Proceedings of the Second Quantum Interaction Symposium (QI-2008) (pp. 133-140).

[8] Mitchell, J., & Lapata, M. (2010). Composition in distributional models of semantics. Cognitive Science, 34(8), 1388-1429.

[9] Guevara, E. (2010). A regression model of adjective-noun compositionality in distributional semantics. In Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics (pp. 33-37). Association for Computational Linguistics.

[10] Baroni, M., & Zamparelli, R. (2010). Nouns are vectors, adjectives are matrices: Representing adjective-noun constructions in semantic space. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (pp. 1183-1193). Association for Computational Linguistics.

class mangoes.composition.AdditiveComposer(representation, init=(1, 1))

Bases: mangoes.composition._ScalarComposer

Compose phrase vectors using the additive model

Given two vectors u and v representing two words w1 and w2, the Additive Model derives a new vector p representing the phrase ‘w1 w2’ as a (weighted) sum of u and v:

\mathbf{p = \alpha u + \beta v}

This class learns these weights from a Representation and then applies them to predict new vectors.

Parameters
representation: mangoes.Representation

Representation from which the weights will be learned. Its vocabulary should therefore contain bigrams whose two parts also belong to the vocabulary.

init: (float, float)

Initial values for alpha and beta. Default = (1, 1)

Examples

>>> import numpy
>>> import mangoes.composition
>>> colors = ['white', 'black', 'green', 'red']
>>> nouns = ['dress', 'rabbit', 'flag']
>>> adj_nouns = ['white dress', 'black dress', 'red flag', 'green flag', 'white rabbit']
>>> vocabulary = mangoes.Vocabulary(colors + nouns + adj_nouns)
>>> matrix = numpy.random.random((12, 5))
>>> embeddings = mangoes.Embeddings(vocabulary, matrix)
>>> additive_composer = mangoes.composition.AdditiveComposer(embeddings)
>>> additive_composer.fit()
>>> green_rabbit = additive_composer.predict('green', 'rabbit')
Attributes
alpha:

Learned weight applied to the first word of each bigram

beta:

Learned weight applied to the second word of each bigram

Methods

fit([bigrams])

Fit model to data

compose

predict

residual

transform

property alpha
property beta
static compose(u, v, a, b)
fit(bigrams=None)

Fit model to data

Parameters
bigrams: list

If bigrams is None (default), the parameters are learned from all the bigrams found in the vocabulary of the Representation. Alternatively, a list of bigrams can be provided; it has to be a subset of that vocabulary.
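
For example, training can be restricted to a subset of the bigrams (continuing the AdditiveComposer example above; the listed bigrams must belong to the vocabulary):

>>> additive_composer.fit(bigrams=['white dress', 'black dress'])
>>> additive_composer.alpha, additive_composer.beta  # learned weights (values depend on the random matrix)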

class mangoes.composition.DilationComposer(representation, init=0.5)

Bases: mangoes.composition._ScalarComposer

Compose phrase vectors using the dilation model

Given two vectors u and v representing two words w1 and w2, the dilation model derives a new vector p representing the phrase ‘w1 w2’ by stretching v by a factor \lambda in the direction of u, using the dot products u·u and u·v:

\mathbf{p = (u \cdot u)v + (\lambda - 1)(u \cdot v)u}

This class learns \lambda from a Representation and then applies the dilation to predict new vectors.

Parameters
representation: mangoes.Representation

Representation from which the dilation factor will be learned. Its vocabulary should therefore contain bigrams whose two parts also belong to the vocabulary.

init: float

Initial value for lambda_. Default = 0.5

Examples

>>> import numpy
>>> import mangoes.composition
>>> colors = ['white', 'black', 'green', 'red']
>>> nouns = ['dress', 'rabbit', 'flag']
>>> adj_nouns = ['white dress', 'black dress', 'red flag', 'green flag', 'white rabbit']
>>> vocabulary = mangoes.Vocabulary(colors + nouns + adj_nouns)
>>> matrix = numpy.random.random((12, 5))
>>> embeddings = mangoes.Embeddings(vocabulary, matrix)
>>> dilation_composer = mangoes.composition.DilationComposer(embeddings)
>>> dilation_composer.fit()
>>> green_rabbit = dilation_composer.predict('green', 'rabbit')
Attributes
lambda_: float

The dilation factor

Methods

fit([bigrams])

Fit model to data

compose

predict

residual

transform

property lambda_
static compose(u, v, lambda_)
fit(bigrams=None)

Fit model to data

Parameters
bigrams: list

If bigrams is None (default), lambda_ is learned from all the bigrams found in the vocabulary of the Representation. Alternatively, a list of bigrams can be provided; it has to be a subset of that vocabulary.
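
Likewise, the dilation factor can be learned from a subset of the bigrams (continuing the DilationComposer example above):

>>> dilation_composer.fit(bigrams=['red flag', 'green flag'])
>>> dilation_composer.lambda_  # learned dilation factor (value depends on the random matrix)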

class mangoes.composition.MultiplicativeComposer(representation)

Bases: mangoes.composition._Composer

Compose phrase vectors using the multiplicative model

Given two vectors u and v representing two words w1 and w2, the multiplicative model derives a new vector p representing the phrase ‘w1 w2’ by component-wise multiplication of u and v:

\mathbf{p = u \odot v}

This model has no parameters to learn, but the class provides a fit() method so it can easily be compared with the other models.

Parameters
representation: mangoes.Representation

Representation used to predict phrase vectors.

Examples

>>> import numpy
>>> import mangoes.composition
>>> colors = ['white', 'black', 'green', 'red']
>>> nouns = ['dress', 'rabbit', 'flag']
>>> adj_nouns = ['white dress', 'black dress', 'red flag', 'green flag', 'white rabbit']
>>> vocabulary = mangoes.Vocabulary(colors + nouns + adj_nouns)
>>> matrix = numpy.random.random((12, 5))
>>> embeddings = mangoes.Embeddings(vocabulary, matrix)
>>> multiplicative_composer = mangoes.composition.MultiplicativeComposer(embeddings)
>>> multiplicative_composer.fit() # does nothing
>>> green_rabbit = multiplicative_composer.predict('green', 'rabbit')
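
Since there are no parameters, the prediction is simply the component-wise product of the two word vectors. With the toy data above, ‘green’ and ‘rabbit’ are rows 2 and 5 of matrix, so the following check should hold, assuming Embeddings preserves the row order of the vocabulary:

>>> numpy.allclose(green_rabbit, matrix[2] * matrix[5])
True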

Methods

compose

fit

predict

static compose(u, v)
fit(bigrams=None)
predict(u, v)
class mangoes.composition.FullAdditiveComposer(representation)

Bases: mangoes.composition._Composer

Compose phrase vectors using the full additive model

Given two vectors u and v representing two words w1 and w2, the Full Additive Model derives a new vector p representing the phrase ‘w1 w2’ by multiplying u and v by two n x n weight matrices A and B:

\mathbf{p = Au + Bv}

This class learns A and B from a Representation, using partial least squares regression (PLSR), and then applies them to predict new vectors.

Parameters
representation: mangoes.Representation

Representation from which the weights will be learned. Its vocabulary should therefore contain bigrams whose two parts also belong to the vocabulary.

Examples

>>> import numpy
>>> import mangoes.composition
>>> colors = ['white', 'black', 'green', 'red']
>>> nouns = ['dress', 'rabbit', 'flag']
>>> adj_nouns = ['white dress', 'black dress', 'red flag', 'green flag', 'white rabbit']
>>> vocabulary = mangoes.Vocabulary(colors + nouns + adj_nouns)
>>> matrix = numpy.random.random((12, 5))
>>> embeddings = mangoes.Embeddings(vocabulary, matrix)
>>> composer = mangoes.composition.FullAdditiveComposer(embeddings)
>>> composer.fit()
>>> green_rabbit = composer.predict('green', 'rabbit')
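
The estimation can be pictured as a single linear regression from the concatenated word vectors to the observed phrase vectors. Below is a minimal standalone sketch of that setup with synthetic data and scikit-learn's PLSRegression (an illustration only, not the module's actual internals):

>>> import numpy
>>> from sklearn.cross_decomposition import PLSRegression
>>> rng = numpy.random.default_rng(0)
>>> n, dim = 50, 5                                         # 50 synthetic training bigrams, 5-dim vectors
>>> U_w, V_w = rng.random((n, dim)), rng.random((n, dim))  # first-word and second-word vectors
>>> P = rng.random((n, dim))                               # observed phrase vectors
>>> pls = PLSRegression(n_components=3).fit(numpy.hstack([U_w, V_w]), P)
>>> p_hat = pls.predict(numpy.hstack([U_w[:1], V_w[:1]]))  # composed vector for one (u, v) pair

The fitted linear map splits into two blocks: the part acting on u corresponds to A and the part acting on v to B.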
Attributes
A: array
B: array

Learned weight matrices applied to the first and second words, respectively. Each matrix has shape n x n, where n is the dimension of the representation.

Methods

fit([bigrams])

Fit model to data

compose

predict

static compose(u, v, A, B)
fit(bigrams=None)

Fit model to data

Parameters
bigrams: list

If bigrams is None (default), the parameters are learned from all the bigrams found in the vocabulary of the Representation. Alternatively, a list of bigrams can be provided; it has to be a subset of that vocabulary.

predict(u, v)
class mangoes.composition.LexicalComposer(representation, word, n_components=None)

Bases: mangoes.composition._Composer

Compose phrase vectors using the lexical model

Given a word w1, the Lexical Function Model sees it as a function, represented as a matrix U, and applies it to another word w2, represented as a vector v, to derive a new vector p representing the phrase ‘w1 w2’:

\mathbf{p = Uv}

This class learns U for the word w1 from a Representation, using PLSR, and then applies it to predict new vectors.

Parameters
representation: mangoes.Representation

Representation from which U will be learned. Its vocabulary should therefore contain bigrams whose two parts also belong to the vocabulary.

word: str

The word to represent as a matrix

n_components: int

Number of components to keep in the PLSR. If None, one quarter of the number of bigrams used in fit is kept.

Examples

>>> import numpy
>>> import mangoes.composition
>>> colors = ['green']
>>> nouns = ['dress', 'hat', 'bean', 'lantern', 'rabbit']
>>> adj_nouns = ['green dress', 'green bean', 'green hat', 'green lantern']
>>> vocabulary = mangoes.Vocabulary(colors + nouns + adj_nouns)
>>> matrix = numpy.random.random((10, 5))
>>> embeddings = mangoes.Embeddings(vocabulary, matrix)
>>> green = mangoes.composition.LexicalComposer(embeddings, 'green')
>>> green.fit()
>>> green_rabbit = green.predict('rabbit')
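
For intuition, U acts as a linear map from noun vectors to the observed ‘green noun’ phrase vectors. Below is a rough sketch with synthetic data, using ordinary least squares as a stand-in for the PLSR actually used (an illustration only):

>>> import numpy
>>> rng = numpy.random.default_rng(1)
>>> V_nouns = rng.random((20, 5))                       # vectors of nouns observed with w1
>>> P = rng.random((20, 5))                             # observed phrase vectors for ‘w1 noun’
>>> X, *_ = numpy.linalg.lstsq(V_nouns, P, rcond=None)  # solves V_nouns @ X ≈ P
>>> U_hat = X.T                                         # so that p ≈ U_hat @ v for a new noun vector v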
Attributes
U: array

Learned matrix representing the word. Its shape is n x n, where n is the dimension of the representation.

Methods

fit([bigrams])

Fit model to data

compose

predict

static compose(U, v)
fit(bigrams=None)

Fit model to data

Parameters
bigrams: list

If bigrams is None (default), the parameters are learned from all the bigrams found in the vocabulary of the Representation. Alternatively, a list of bigrams can be provided; it has to be a subset of that vocabulary.

predict(v)
property U