fibber.resources.resource_utils module
-
fibber.resources.resource_utils.get_bert_clf_demo()
Download the pretrained classifier for the demo dataset.
-
fibber.resources.resource_utils.get_bert_lm_demo()
Download the pretrained language model for the demo dataset.
-
fibber.resources.resource_utils.get_counter_fitted_vector(download_only=False)
Download the default pretrained counter-fitted embeddings and return them as a dict.
See https://github.com/nmrksic/counter-fitting
- Parameters
download_only (bool) – set to True to only download the files; the function then returns None.
- Returns
a dict of the counter-fitted word embedding model:
"emb_table": a numpy array of size (N, 300).
"id2tok": a list of strings.
"tok2id": a dict that maps a word (string) to its id.
- Return type
(dict)
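Counter-fitted vectors are typically used to find near-synonyms, so a cosine nearest-neighbor search over the returned dict is a natural use. The following is a minimal sketch under stated assumptions: `nearest_neighbors` and `cosine` are hypothetical helpers, not part of fibber, and the toy dict only mimics the documented keys (the real "emb_table" is a numpy array with 300 columns).

```python
import math

# Real call (downloads data), shown for context only:
#   from fibber.resources.resource_utils import get_counter_fitted_vector
#   emb = get_counter_fitted_vector()

# Toy stand-in with the documented keys; nested lists replace the numpy
# array so the sketch has no third-party dependencies.
emb = {
    "emb_table": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]],
    "id2tok": ["good", "great", "bad"],
}
emb["tok2id"] = {tok: i for i, tok in enumerate(emb["id2tok"])}


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(emb, word, k=1):
    """Return the k tokens most similar to `word`, excluding the word itself."""
    query_id = emb["tok2id"][word]
    query = emb["emb_table"][query_id]
    scored = [
        (cosine(query, vec), emb["id2tok"][i])
        for i, vec in enumerate(emb["emb_table"])
        if i != query_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [tok for _, tok in scored[:k]]


neighbors = nearest_neighbors(emb, "good", k=2)
```

On the full table a vectorized numpy computation would be far faster; the loop here only illustrates how the three keys fit together.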
-
fibber.resources.resource_utils.get_glove_emb(download_only=False)
Download the default pretrained GloVe embeddings and return them as a dict.
We use the 300-dimensional model trained on Wikipedia 2014 + Gigaword 5. See https://nlp.stanford.edu/projects/glove/
- Parameters
download_only (bool) – set to True to only download the files; the function then returns None.
- Returns
a dict of the GloVe word embedding model:
"emb_table": a numpy array of size (N, 300).
"id2tok": a list of strings.
"tok2id": a dict that maps a word (string) to its id.
- Return type
(dict)
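The three keys of the returned dict work together: "tok2id" maps a word to a row index, and that index selects a row of "emb_table". A minimal sketch of a lookup that tolerates out-of-vocabulary words follows; `lookup` is a hypothetical helper, not a fibber function, and the toy dict stands in for the real download.

```python
# Real call (downloads data), shown for context only:
#   from fibber.resources.resource_utils import get_glove_emb
#   glove = get_glove_emb()

# Toy stand-in with the documented keys (the real table has 300 columns).
glove = {
    "emb_table": [[0.1, 0.2], [0.3, 0.4]],
    "id2tok": ["the", "cat"],
    "tok2id": {"the": 0, "cat": 1},
}


def lookup(emb, word, default=None):
    """Return the embedding row for `word`, or `default` if it is not in the vocabulary."""
    idx = emb["tok2id"].get(word)
    if idx is None:
        return default
    return emb["emb_table"][idx]


vec = lookup(glove, "cat")
missing = lookup(glove, "zebra")
```

Returning a default instead of raising `KeyError` is a design choice for this sketch; callers that must detect unknown words can check for `None`.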
-
fibber.resources.resource_utils.get_nltk_data()
Download NLTK data to <fibber_root_dir>/nltk_data.
-
fibber.resources.resource_utils.get_stopwords()
Download the default stopword list.
- Returns
a list of strings.
- Return type
([str])
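A list of strings like this is usually converted to a set before filtering tokens. The sketch below assumes the stopwords are lowercase, which the documentation does not state; `remove_stopwords` is a hypothetical helper and the toy list replaces the downloaded one.

```python
# Real call (downloads data), shown for context only:
#   from fibber.resources.resource_utils import get_stopwords
#   stopwords = get_stopwords()

# Toy stand-in for the returned list of strings.
stopwords = ["the", "a", "is"]


def remove_stopwords(tokens, stopwords):
    """Drop tokens whose lowercase form appears in `stopwords`."""
    stopword_set = set(stopwords)  # set membership is O(1) per token
    return [tok for tok in tokens if tok.lower() not in stopword_set]


content_words = remove_stopwords(["The", "cat", "is", "cute"], stopwords)
```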
-
fibber.resources.resource_utils.get_transformers(name)
Download pretrained transformer models.
- Parameters
name (str) – the name of the pretrained model. Options are ["bert-base-cased", "bert-base-uncased", "gpt2-medium"].
- Returns
directory of the downloaded model.
- Return type
(str)
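Because only three names are documented, a caller may want to validate the name before triggering a download. This is a minimal sketch: `SUPPORTED_MODELS` and `check_model_name` are hypothetical and not part of fibber, and how `get_transformers` itself reacts to an unknown name is not documented here.

```python
# Options copied from the parameter description above.
SUPPORTED_MODELS = ["bert-base-cased", "bert-base-uncased", "gpt2-medium"]


def check_model_name(name):
    """Return `name` unchanged if it is a documented option, else raise ValueError."""
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            "unsupported model %r; expected one of %s" % (name, SUPPORTED_MODELS)
        )
    return name


# Real call (downloads data), shown for context only:
#   from fibber.resources.resource_utils import get_transformers
#   model_dir = get_transformers(check_model_name("bert-base-cased"))
checked = check_model_name("bert-base-cased")
```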
-
fibber.resources.resource_utils.get_universal_sentence_encoder()
Download the pretrained Universal Sentence Encoder.
- Returns
directory of the downloaded model.
- Return type
(str)
-
fibber.resources.resource_utils.get_wordpiece_emb_demo()
Download the wordpiece embeddings for the demo dataset.
-
fibber.resources.resource_utils.load_glove_model(glove_file, dim)
Load GloVe embeddings from a text file.
- Parameters
glove_file – filename.
dim – the dimension of the embedding.
- Returns
"emb_table": a numpy array of size (N, dim).
"id2tok": a list of strings.
"tok2id": a dict that maps a word (string) to its id.
- Return type
(dict)
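GloVe text files store one token per line, followed by its space-separated vector components. The sketch below shows how such a file could be parsed into the three documented keys. It is an illustration under assumptions, not fibber's implementation: `parse_glove_text` is a hypothetical name, it reads from any iterable of lines rather than a filename, and it keeps rows as plain lists where the real function returns a numpy array.

```python
import io


def parse_glove_text(lines, dim):
    """Parse GloVe-format lines into a dict with emb_table, id2tok and tok2id."""
    id2tok, rows = [], []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank lines and rows with the wrong width
        id2tok.append(parts[0])
        rows.append([float(x) for x in parts[1:]])
    tok2id = {tok: i for i, tok in enumerate(id2tok)}
    return {"emb_table": rows, "id2tok": id2tok, "tok2id": tok2id}


# In-memory sample standing in for a GloVe text file with dim=3.
sample = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n\n")
emb = parse_glove_text(sample, dim=3)
```

Skipping rows whose width differs from `dim + 1` is a choice made for this sketch; a stricter loader could raise an error instead.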