Pretrained Enhanced Contextual Language Models

In addition to providing access to all the pretrained models on Huggingface’s model hub, mangoes also provides pretrained “enhanced” contextual models. These models are trained with knowledge beyond textual data, such as structured encyclopedic knowledge from Wikidata or lexical data from WordNet. Here is an overview of the models provided by mangoes, with guides below on how to instantiate them. The “Base Model” column lists the non-enhanced model that each enhanced model is based on or fine-tuned from.

Provided Pretrained Enhanced Models

Model                                         Base Model  Non-textual Knowledge  Paper
--------------------------------------------  ----------  ---------------------  -----
Ernie                                         BERT        Wikidata               https://arxiv.org/pdf/1905.07129.pdf
Kepler                                        RoBERTa     Wikidata               https://arxiv.org/pdf/1911.06136.pdf
K-Adapter                                     RoBERTa     Wikidata               https://arxiv.org/pdf/2002.01808.pdf
LiBERT                                        BERT        WordNet                https://aclanthology.org/2020.coling-main.118/
Human Conceptual Knowledge Transformer (HCK)  BERT        Feature norms          https://bpb-us-w2.wpmucdn.com/web.sas.upenn.edu/dist/a/511/files/2021/12/Bhatia-Richie-2021-Psych-Rev.pdf

Using a pretrained enhanced model

The main idea behind using these models is to instantiate the model and tokenizer classes first, then pass these objects to the mangoes class. The model and tokenizer classes are subclasses of Huggingface’s transformer model/tokenizer classes, so they are instantiated with the same API, except that None is passed in place of the pretrained model/tokenizer name. This tells mangoes to download the pretrained model or tokenizer from INRIA’s server and use it when instantiating the object.

The following examples demonstrate how to load pretrained enhanced models. Sequence classification is used as the example task, but the same code works for all supported tasks. You can find this code in mangoes/notebooks/Enhanced Models.ipynb.
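The snippets below assume that the enhanced model and tokenizer classes have already been imported and that a label2id mapping has been defined for the classification task. Here is a minimal setup sketch, assuming the classes are exposed from mangoes.modeling (the same module that provides the tokenizer-name constants mentioned further down) and using a hypothetical two-label mapping:

>>> # Assumed import path: the enhanced model/tokenizer classes and the mangoes
>>> # wrapper are taken here from mangoes.modeling; adjust if your install differs.
>>> from mangoes.modeling import (
...     ErnieForSequenceClassification, ErnieTokenizer,
...     LibertForSequenceClassification, LibertTokenizerFast,
...     HCKForSequenceClassification, HCKTokenizerFast,
...     KeplerForSequenceClassification, KAdapterForSequenceClassification,
...     TransformerForSequenceClassification,
... )
>>> from transformers import AutoTokenizer
>>> # Hypothetical label mapping for a two-class task; replace with your own labels.
>>> label2id = {"negative": 0, "positive": 1}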

Some models have their own tokenizer, like Ernie:

>>> model = ErnieForSequenceClassification.from_pretrained(None, label2id=label2id)
>>> tokenizer = ErnieTokenizer.from_pretrained(None)

LiBERT:

>>> model = LibertForSequenceClassification.from_pretrained(None, label2id=label2id)
>>> tokenizer = LibertTokenizerFast.from_pretrained(None)

or HCK:

>>> model = HCKForSequenceClassification.from_pretrained(None, label2id=label2id)
>>> tokenizer = HCKTokenizerFast.from_pretrained(None)

Other models use non-enhanced tokenizer classes, like Kepler:

>>> model = KeplerForSequenceClassification.from_pretrained(None, label2id=label2id)
>>> tokenizer = AutoTokenizer.from_pretrained(KEPLER_PRETRAINED_TOKENIZER_NAME)

or K-Adapter:

>>> model = KAdapterForSequenceClassification.from_pretrained(None, label2id=label2id)
>>> tokenizer = AutoTokenizer.from_pretrained(KADAPTER_PRETRAINED_TOKENIZER_NAME)

Note that both the KEPLER_PRETRAINED_TOKENIZER_NAME and KADAPTER_PRETRAINED_TOKENIZER_NAME variables can be imported from mangoes.modeling.
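For example:

>>> from mangoes.modeling import KEPLER_PRETRAINED_TOKENIZER_NAME, KADAPTER_PRETRAINED_TOKENIZER_NAME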

Once the model and tokenizer are instantiated, you can create the mangoes class by simply passing these objects to its init function:

>>> mangoes_model = TransformerForSequenceClassification(model, tokenizer, label2id=label2id)

From there, the mangoes object behaves the same as if it were using a non-enhanced model, so you can extract features or predictions, or fine-tune it for any of the supported tasks.