fibber.metrics package
Subpackages
Submodules
Module contents
-
class
fibber.metrics.
BertPerplexityMetric
(dataset_name, trainset, bert_ppl_gpu_id=-1, bert_ppl_filter=-1, **kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the perplexity of the paraphrased text divided by the perplexity of the original text. The perplexity is measured using a BERT model.
Initialize the BERT perplexity model.
-
property
lm_model
-
property
tokenizer
-
class
fibber.metrics.
CESimilarityMetric
(ce_pretrained_model='stsb-roberta-large', ce_gpu_id=-1, **kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the semantic similarity of two sentences using a Cross Encoder model.
By default, we use the stsb-roberta-large model.
See https://github.com/UKPLab/sentence-transformers for more information.
Initialize the Cross Encoder model.
-
class
fibber.metrics.
EditDistanceMetric
(editing_distance_ignore_punctuation=True, **kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This class measures the edit distance between two sentences.
Initialize.
- Parameters
editing_distance_ignore_punctuation (bool) – whether to ignore punctuation when computing the edit distance.
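As an illustration of what an edit distance with optional punctuation stripping can look like, here is a minimal standalone sketch. It does not call fibber. The word-level tokenization, the lowercasing, and the helper names `tokenize` and `word_edit_distance` are assumptions made for this example; the library's own tokenization may differ.

```python
import string


def tokenize(sentence, ignore_punctuation=True):
    """Lowercase, optionally strip ASCII punctuation, then split on whitespace."""
    if ignore_punctuation:
        sentence = sentence.translate(str.maketrans("", "", string.punctuation))
    return sentence.lower().split()


def word_edit_distance(origin, paraphrase, ignore_punctuation=True):
    """Levenshtein distance over word tokens (insert, delete, substitute)."""
    a = tokenize(origin, ignore_punctuation)
    b = tokenize(paraphrase, ignore_punctuation)
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,           # delete word_a
                            curr[j - 1] + 1,       # insert word_b
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]
```

With punctuation ignored, "The cat sat." and "the cat sat" are at distance 0 in this sketch; with `ignore_punctuation=False` the trailing period makes the last tokens differ.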
-
class
fibber.metrics.
FasttextClassifier
(dataset_name, trainset, testset, fasttext_lr=1.0, fasttext_epoch=25, fasttext_ngram=5, **kwargs) Bases:
fibber.metrics.classifier.classifier_base.ClassifierBase
fastText classifier prediction on paraphrase_list.
This metric is special: it does not compare the original and paraphrased sentences. Instead, it outputs the classifier's prediction on paraphrase_list, so the mean or standard deviation should not be computed on this metric.
- Parameters
dataset_name (str) – the name of the dataset.
trainset (dict) – a fibber dataset.
testset (dict) – a fibber dataset.
fasttext_lr (float) – learning rate.
fasttext_epoch (int) – epochs to train.
fasttext_ngram (int) – classification feature ngram.
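Because the output of a classifier metric is a predicted label rather than a score, averaging it is not meaningful. The sketch below shows one hypothetical way to summarize such predictions with label counts and a flip rate instead. The function name `summarize_predictions` and the integer labels are assumptions for illustration, not part of fibber.

```python
from collections import Counter


def summarize_predictions(predictions, true_label):
    """Summarize categorical predictions without taking a mean or std."""
    counts = Counter(predictions)
    # A prediction is "flipped" when it differs from the true label.
    flipped = sum(1 for p in predictions if p != true_label)
    flip_rate = flipped / len(predictions) if predictions else 0.0
    return {"counts": dict(counts), "flip_rate": flip_rate}


# Predicted labels for four paraphrases of a sentence whose true label is 2.
summary = summarize_predictions([2, 2, 0, 1], true_label=2)
```

The arithmetic mean of the labels `[2, 2, 0, 1]` would be 1.25, which does not correspond to any class; the counts and flip rate stay interpretable.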
-
class
fibber.metrics.
GPT2PerplexityMetric
(gpt2_pretrained_model='gpt2-medium', gpt2_gpu_id=-1, **kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the perplexity of the paraphrased text divided by the perplexity of the original text. The perplexity is measured using a GPT-2 model.
Initialize the GPT-2 model.
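The perplexity ratio described above can be sketched without loading a language model. In the example below, the per-token natural-log probabilities are supplied by hand; in practice they would come from the language model. The helper names `perplexity` and `perplexity_ratio` are hypothetical, and the sketch uses the standard definition of perplexity as the exponential of the negative mean log-probability.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-mean of per-token natural-log probabilities)."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def perplexity_ratio(origin_log_probs, paraphrase_log_probs):
    """Perplexity of the paraphrase divided by perplexity of the original."""
    return perplexity(paraphrase_log_probs) / perplexity(origin_log_probs)


# Hand-made log-probabilities: the paraphrase tokens are less likely on average.
origin_lp = [math.log(0.5)] * 2
paraphrase_lp = [math.log(0.25)] * 3
ratio = perplexity_ratio(origin_lp, paraphrase_lp)
```

A ratio above 1 means the paraphrase is less fluent than the original according to the model; a ratio near 1 means fluency is roughly preserved.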
-
class
fibber.metrics.
GloVeSimilarityMetric
(**kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the cosine similarity between two sentences using GloVe embeddings.
Initialize and load the GloVe embeddings.
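A common way to get a sentence-level cosine similarity from word embeddings is to average the word vectors of each sentence and compare the averages. The sketch below assumes that approach; the averaging step, the made-up 3-dimensional vectors in `TOY_EMBEDDINGS`, and all function names are illustrative and are not taken from fibber.

```python
import math

# Toy 3-dimensional vectors standing in for real GloVe embeddings (made up).
TOY_EMBEDDINGS = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.0],
    "movie": [0.0, 0.9, 0.3],
    "film": [0.1, 0.8, 0.4],
    "bad": [-0.9, 0.1, 0.0],
}


def sentence_vector(sentence, embeddings):
    """Average the vectors of in-vocabulary words; None if no word is known."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def glove_style_similarity(origin, paraphrase, embeddings=TOY_EMBEDDINGS):
    """Cosine similarity between averaged word vectors of two sentences."""
    u = sentence_vector(origin, embeddings)
    v = sentence_vector(paraphrase, embeddings)
    if u is None or v is None:
        return 0.0
    return cosine_similarity(u, v)
```

With these toy vectors, "good movie" is scored closer to "great film" than to "bad movie".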
-
class
fibber.metrics.
MetricBase
(field, bs=32, **kwargs) Bases:
abc.ABC
Base class for Metrics.
All metrics should be derived from this class.
To implement a new metric, you should at least override the measure_example method.
The simplest metrics can be computed directly from a pair of texts; in this case, the metric can use the origin and paraphrase args directly.
Other metrics need more information from the data record, for example text0, text1, or label. Thus data_record and field are also provided as args.
Some metrics may run more efficiently on a batch of data. In this case, you should override the measure_batch function. If you don't override measure_batch, the metric is computed on paraphrase_list one by one.
-
measure_batch
(origin, paraphrase_list, data_record=None, **kwargs) Measure the metric on the paraphrases in paraphrase_list.
If the batch is larger than self._bs, the data will be split into smaller batches.
- Parameters
origin (str) – the original text.
paraphrase_list (list) – a list of paraphrases.
data_record (dict) – the corresponding data record of original text.
- Returns
a list containing the metric for each paraphrase.
- Return type
(list)
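The subclassing pattern described for MetricBase can be sketched with a small stand-in class. `SketchMetricBase` below is a hypothetical simplification written for this example, not the real fibber.metrics.MetricBase: its measure_example signature is an assumption, and its measure_batch only shows the one-by-one fallback (no splitting by self._bs). `LengthRatioMetric` is a made-up toy metric.

```python
from abc import ABC, abstractmethod


class SketchMetricBase(ABC):
    """Hypothetical stand-in mirroring the interface described above."""

    def __init__(self, field, bs=32, **kwargs):
        self._field = field
        self._bs = bs

    @abstractmethod
    def measure_example(self, origin, paraphrase, data_record=None, **kwargs):
        """Return the metric for one (origin, paraphrase) pair."""

    def measure_batch(self, origin, paraphrase_list, data_record=None, **kwargs):
        # Default behaviour: one measure_example call per paraphrase.
        return [self.measure_example(origin, p, data_record, **kwargs)
                for p in paraphrase_list]


class LengthRatioMetric(SketchMetricBase):
    """Toy metric: paraphrase word count divided by original word count."""

    def measure_example(self, origin, paraphrase, data_record=None, **kwargs):
        origin_len = len(origin.split())
        return len(paraphrase.split()) / origin_len if origin_len else 0.0


metric = LengthRatioMetric(field="text0")
scores = metric.measure_batch("a short sentence",
                              ["a sentence", "a very short sentence"])
```

A metric that benefits from batching (for example, one backed by a neural model) would override measure_batch as well, rather than relying on the per-example fallback.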
-
-
class
fibber.metrics.
MetricBundle
(enable_edit_distance=True, enable_use_similarity=True, enable_glove_similarity=True, enable_gpt2_perplexity=False, enable_transformer_classifier=True, enable_ce_similarity=False, enable_fasttext_classifier=False, enable_bert_perplexity=True, enable_bert_perplexity_per_class=False, enable_self_bleu=False, enable_ref_bleu=False, target_clf='transformer', field='text0', bs=32, **kwargs) Bases:
object
MetricBundle can help easily initialize and compute multiple metrics.
Initialize various metrics.
- Parameters
enable_edit_distance (bool) – whether to use edit distance in the bundle.
enable_use_similarity (bool) – whether to use Universal Sentence Encoder to compute sentence similarity.
enable_glove_similarity (bool) – whether to use GloVe embeddings to compute sentence similarity.
enable_gpt2_perplexity (bool) – whether to use GPT-2 to compute sentence quality.
enable_transformer_classifier (bool) – whether to include BERT classifier prediction in metrics.
enable_ce_similarity (bool) – whether to use Cross Encoder to measure sentence similarity.
enable_fasttext_classifier (bool) – whether to include fastText classifier prediction in metrics.
target_clf (str) – choose from “transformer”, “fasttext”.
field (str) – the field where perturbation can happen.
bs (int) – batch size.
kwargs – arguments for metrics. kwargs will be passed to all metrics.
-
add_advanced_aggregation_fn
(aggregation_name, aggregation_fn, direction) Add an advanced aggregation function.
Some aggregation functions aggregate multiple metrics; such functions can be added here.
- Parameters
aggregation_name (str) – the name of the aggregation.
aggregation_fn (fn) – an aggregation function that takes data_record as arg.
direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.
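An aggregation function that combines several metrics might look like the sketch below. The layout of data_record used here (a "label" key plus a "paraphrase_metrics" list of dicts keyed by metric class name) is an assumption for illustration, as are the function name `any_successful_attack` and the 0.9 threshold; check the actual record structure before relying on it.

```python
def any_successful_attack(data_record, sim_threshold=0.9):
    """Return 1.0 if any paraphrase flips the classifier while staying similar.

    Assumes a hypothetical record layout: data_record["label"] is the true
    label and data_record["paraphrase_metrics"] is a list of dicts keyed by
    metric class name.
    """
    for metrics in data_record.get("paraphrase_metrics", []):
        flipped = metrics["TransformerClassifier"] != data_record["label"]
        similar = metrics["USESimilarityMetric"] >= sim_threshold
        if flipped and similar:
            return 1.0
    return 0.0


# Hand-made record: the second paraphrase flips the label and stays similar.
record = {
    "label": 1,
    "paraphrase_metrics": [
        {"TransformerClassifier": 1, "USESimilarityMetric": 0.97},
        {"TransformerClassifier": 0, "USESimilarityMetric": 0.95},
    ],
}
result = any_successful_attack(record)
```

Following the signature documented above, such a function would be passed as aggregation_fn together with an aggregation_name and a direction such as DIRECTION_HIGHER_BETTER.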
-
add_classifier
(classifier_metric, set_target_clf=False) Set a target classifier to attack.
- Parameters
classifier_metric (ClassifierBase) – A classifier metric to be added.
set_target_clf (bool) – whether to set this classifier metric as target classifier.
-
add_metric
(metric, direction) Add a customized metric to metric bundle.
- Parameters
metric (MetricBase) – the metric object to add.
direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.
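One use of a metric's direction is deciding which paraphrase is "best" under that metric. The sketch below illustrates the idea with placeholder constants; the string values assigned to the three DIRECTION_* names are invented for this example, and `best_index` is a hypothetical helper, not a fibber function.

```python
# Placeholder values: the real constants' values are not shown in this page.
DIRECTION_HIGHER_BETTER = "(higher is better)"
DIRECTION_LOWER_BETTER = "(lower is better)"
DIRECTION_UNKNOWN = "(direction unknown)"


def best_index(scores, direction):
    """Index of the best score under the given direction, or None if undefined."""
    if not scores:
        return None
    indices = range(len(scores))
    if direction == DIRECTION_HIGHER_BETTER:
        return max(indices, key=scores.__getitem__)
    if direction == DIRECTION_LOWER_BETTER:
        return min(indices, key=scores.__getitem__)
    # With an unknown direction there is no meaningful "best".
    return None
```

For example, a similarity metric would typically be registered as higher-is-better and an edit distance as lower-is-better, while a classifier prediction has no ordering and fits the unknown direction.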
-
aggregate_metrics
(dataset_name, paraphrase_strategy_name, experiment_name, results) Aggregate paraphrase metrics on a dataset to a summary and store in a dict.
- Parameters
dataset_name (str) – the name of the dataset.
paraphrase_strategy_name (str) – the name of the paraphrase strategy.
experiment_name (str) – the name of the experiment.
results (dict) – the fibber dataset with paraphrases and metrics; the return value of compute_metrics.
- Returns
the aggregated metrics.
- Return type
(dict)
-
get_classifier
(classifier_name) Returns the classifier in the current metric bundle.
- Parameters
classifier_name (str) – the name of the requested classifier.
-
get_metric
(metric_name) Returns a metric in the bundle using the metric name.
Metric name is the class name of a metric.
Raises assertion error if metric is not found.
- Parameters
metric_name – the name of the metric.
- Returns
a metric object.
- Return type
-
get_metric_direction
(metric_name) Returns the direction of a metric.
Metric name is the class name of a metric.
Raises assertion error if metric is not found.
- Parameters
metric_name – the name of the metric.
- Returns
the direction of the metric.
- Return type
-
measure_batch
(origin, paraphrase_list, data_record=None) Measure the metrics on the paraphrases in paraphrase_list.
- Parameters
origin (str) – the original text.
paraphrase_list (list) – a list of paraphrases.
data_record (dict) – the corresponding data record of original text.
- Returns
a list containing dict of metrics for each paraphrase.
- Return type
(list)
-
measure_dataset
(results, output_filename) Compute all metrics for results on a dataset.
- Parameters
results (dict) – A fibber dataset with paraphrase_list.
output_filename (str) – A json filename to store results and metrics.
- Returns
the results dict with original_text_metrics and paraphrase_metrics added.
- Return type
(dict)
-
measure_example
(origin, paraphrase, data_record=None) Compute the results of all metrics in the bundle for one pair of text.
- Parameters
origin (str) – original text.
paraphrase (str) – paraphrased text.
data_record (dict) – the data record.
- Returns
a dict with metric name as key.
- Return type
(dict)
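The way a bundle keys results by metric class name can be sketched with a small stand-in. `SketchBundle` below is a hypothetical simplification written for this example, not the real MetricBundle: it only mirrors add_metric, get_metric, get_metric_direction, measure_example, and measure_batch as described above, and the two toy metrics and the direction strings are made up.

```python
class SketchBundle:
    """Hypothetical stand-in: stores metrics keyed by their class name."""

    def __init__(self):
        self._metrics = {}
        self._directions = {}

    def add_metric(self, metric, direction):
        name = type(metric).__name__
        self._metrics[name] = metric
        self._directions[name] = direction

    def get_metric(self, metric_name):
        assert metric_name in self._metrics, "metric not found"
        return self._metrics[metric_name]

    def get_metric_direction(self, metric_name):
        assert metric_name in self._directions, "metric not found"
        return self._directions[metric_name]

    def measure_example(self, origin, paraphrase, data_record=None):
        # One dict per text pair, with the metric name as key.
        return {name: m.measure_example(origin, paraphrase, data_record)
                for name, m in self._metrics.items()}

    def measure_batch(self, origin, paraphrase_list, data_record=None):
        return [self.measure_example(origin, p, data_record)
                for p in paraphrase_list]


class WordCountDelta:
    def measure_example(self, origin, paraphrase, data_record=None):
        return len(paraphrase.split()) - len(origin.split())


class ExactMatch:
    def measure_example(self, origin, paraphrase, data_record=None):
        return float(origin == paraphrase)


bundle = SketchBundle()
bundle.add_metric(WordCountDelta(), "(direction unknown)")
bundle.add_metric(ExactMatch(), "(higher is better)")
rows = bundle.measure_batch("a b c", ["a b c", "a b c d"])
```

Each entry of `rows` is a dict such as `{"WordCountDelta": 0, "ExactMatch": 1.0}`, one per paraphrase, which matches the return shapes documented for measure_example and measure_batch.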
-
class
fibber.metrics.
RefBleuMetric
(**kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the BLEU score between input and output.
Initialize the metric.
-
class
fibber.metrics.
SelfBleuMetric
(**kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric computes the BLEU score between input and output.
Initialize the metric.
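For readers unfamiliar with BLEU, the following is a minimal sentence-level sketch: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It uses whitespace tokenization and no smoothing, and the function names are hypothetical. Which BLEU variant, tokenization, or smoothing the two metrics above actually use is not stated in this page, so treat this only as an illustration of the score itself.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp, n)
        ref_counts = ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision_sum / max_n)
```

Identical sentences of at least four tokens score 1.0, and sentences with no shared 4-gram score 0.0 in this unsmoothed form, which is why practical implementations usually add smoothing for short texts.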
-
class
fibber.metrics.
TransformerClassifier
(dataset_name, trainset, testset, transformer_clf_gpu_id=-1, transformer_clf_steps=20000, transformer_clf_bs=32, transformer_clf_lr=2e-05, transformer_clf_optimizer='adamw', transformer_clf_weight_decay=0.001, transformer_clf_period_summary=100, transformer_clf_period_val=500, transformer_clf_period_save=20000, transformer_clf_val_steps=10, transformer_clf_model_init='bert-base-cased', **kwargs) Bases:
fibber.metrics.classifier.classifier_base.ClassifierBase
BERT classifier prediction on paraphrase_list.
This metric is special: it does not compare the original and paraphrased sentences. Instead, it outputs the classifier's prediction on paraphrase_list, so the mean or standard deviation should not be computed on this metric.
- Parameters
dataset_name (str) – the name of the dataset.
trainset (dict) – a fibber dataset.
testset (dict) – a fibber dataset.
transformer_clf_gpu_id (int) – the GPU id for the BERT model. Set to -1 to use the CPU.
transformer_clf_steps (int) – steps to train a classifier.
transformer_clf_bs (int) – the batch size.
transformer_clf_lr (float) – the learning rate.
transformer_clf_optimizer (str) – the optimizer name.
transformer_clf_weight_decay (float) – the weight decay in the optimizer.
transformer_clf_period_summary (int) – the period in steps to write training summary.
transformer_clf_period_val (int) – the period in steps to run validation and write validation summary.
transformer_clf_period_save (int) – the period in steps to save current model.
transformer_clf_val_steps (int) – number of batches in each validation.
-
class
fibber.metrics.
USESimilarityMetric
(use_gpu_id=-1, **kwargs) Bases:
fibber.metrics.metric_base.MetricBase
This metric uses the Universal Sentence Encoder to measure the semantic similarity of two sentences.
Initialize the Universal Sentence Encoder.