fibber.metrics package

Module contents

class fibber.metrics.BertPerplexityMetric(dataset_name, trainset, bert_ppl_gpu_id=-1, bert_ppl_filter=-1, **kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the perplexity of the paraphrased text divided by the perplexity of the original text. Perplexity is measured using a BERT model.

Initialize Bert perplexity model.

property lm_model
property tokenizer
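
A minimal usage sketch (the dataset name "ag" and the fibber.datasets.get_dataset helper are assumptions; any fibber dataset dict works):

   from fibber.datasets import get_dataset  # assumed dataset helper
   from fibber.metrics import BertPerplexityMetric

   trainset, testset = get_dataset("ag")  # "ag" is an illustrative dataset name
   metric = BertPerplexityMetric(dataset_name="ag", trainset=trainset,
                                 bert_ppl_gpu_id=-1)  # -1 runs on CPU

   # The score is paraphrase perplexity divided by original perplexity;
   # values near 1 mean the paraphrase is about as fluent as the original.
   record = testset["data"][0]  # assumes the fibber dataset layout
   score = metric.measure_example(
       origin=record["text0"],
       paraphrase="An illustrative paraphrase of the original text.",
       data_record=record)
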
class fibber.metrics.CESimilarityMetric(ce_pretrained_model='stsb-roberta-large', ce_gpu_id=-1, **kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the semantic similarity of two sentences using a Cross-Encoder model.

By default, we use the stsb-roberta-large model.

See https://github.com/UKPLab/sentence-transformers for more information.

Initialize the Cross-Encoder model.
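
A minimal usage sketch; the sentences are illustrative and the cross encoder weights are downloaded on first use:

   from fibber.metrics import CESimilarityMetric

   metric = CESimilarityMetric(ce_gpu_id=-1)  # -1 runs the cross encoder on CPU
   score = metric.measure_example(
       origin="The movie was fantastic.",
       paraphrase="The film was great.")
   # Higher scores indicate closer semantic similarity.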

class fibber.metrics.EditDistanceMetric(editing_distance_ignore_punctuation=True, **kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This class measures the edit distance between two sentences.

Initialize.

Parameters

editing_distance_ignore_punctuation (bool) – whether to ignore punctuation when computing the edit distance.
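
A minimal usage sketch:

   from fibber.metrics import EditDistanceMetric

   metric = EditDistanceMetric(editing_distance_ignore_punctuation=True)
   dist = metric.measure_example(
       origin="The quick brown fox jumps over the lazy dog.",
       paraphrase="The quick brown fox leaps over the lazy dog.")
   # One word is substituted, so a small distance is expected; exact
   # values depend on the tokenization used by the implementation.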

class fibber.metrics.FasttextClassifier(dataset_name, trainset, testset, fasttext_lr=1.0, fasttext_epoch=25, fasttext_ngram=5, **kwargs)[source]

Bases: fibber.metrics.classifier.classifier_base.ClassifierBase

FastText classifier prediction on paraphrase_list.

This metric is special: it does not compare the original and paraphrased sentences. Instead, it outputs the classifier's predictions on paraphrase_list, so mean or std should not be computed on this metric.

Parameters
  • dataset_name (str) – the name of the dataset.

  • trainset (dict) – a fibber dataset.

  • testset (dict) – a fibber dataset.

  • fasttext_lr (float) – learning rate.

  • fasttext_epoch (int) – epochs to train.

  • fasttext_ngram (int) – classification feature ngram.

load_robust_tuned_model(load_path)[source]
robust_tune_init(optimizer, lr, weight_decay, steps)[source]
robust_tune_step(data_record_list)[source]
save_robust_tuned_model(save_path)[source]
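
A usage sketch (the dataset name and the fibber.datasets.get_dataset helper are assumptions):

   from fibber.datasets import get_dataset  # assumed dataset helper
   from fibber.metrics import FasttextClassifier

   trainset, testset = get_dataset("ag")  # illustrative dataset name
   clf = FasttextClassifier(dataset_name="ag", trainset=trainset,
                            testset=testset, fasttext_epoch=25)

   # One predicted label per paraphrase; as noted above, do not
   # aggregate these values with mean or std.
   record = testset["data"][0]  # assumes the fibber dataset layout
   preds = clf.measure_batch(origin=record["text0"],
                             paraphrase_list=[record["text0"]],
                             data_record=record)
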
class fibber.metrics.GPT2PerplexityMetric(gpt2_pretrained_model='gpt2-medium', gpt2_gpu_id=-1, **kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the perplexity of the paraphrased text divided by the perplexity of the original text. Perplexity is measured using a GPT2 model.

Initialize the GPT2 model.

class fibber.metrics.GloVeSimilarityMetric(**kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the cosine similarity between the GloVe embeddings of two sentences.

Initialize and load GloVe embeddings.

class fibber.metrics.MetricBase(field, bs=32, **kwargs)[source]

Bases: abc.ABC

Base class for Metrics.

All metrics should be derived from this class.

To implement a new metric, you should at least overwrite the measure_example method.

The simplest metrics can be computed directly from a pair of texts; in this case, the metric can use the origin and paraphrase args directly.

Other metrics need more information from the data record. For example, text0, text1, or label. Thus the data_record and field are also provided as args.

Some metrics may run more efficiently on a batch of data. In this case, you should overwrite the measure_batch function. If you don’t overwrite measure_batch, the metric is computed on the items of paraphrase_list one by one.
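
A sketch of a custom metric that overwrites measure_example as described above; the word-overlap score is purely illustrative:

   from fibber.metrics import MetricBase

   class WordOverlapMetric(MetricBase):
       """Illustrative metric: fraction of original words kept in the paraphrase."""

       def measure_example(self, origin, paraphrase, data_record=None, **kwargs):
           origin_words = set(origin.lower().split())
           paraphrase_words = set(paraphrase.lower().split())
           if not origin_words:
               return 0.0
           return len(origin_words & paraphrase_words) / len(origin_words)

   metric = WordOverlapMetric(field="text0")
   metric.measure_example("a small example", "a tiny example")  # ~0.67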

measure_batch(origin, paraphrase_list, data_record=None, **kwargs)[source]

Measure the metric on a batch of paraphrases.

If the batch is larger than self._bs, the data will be split into smaller batches.

Parameters
  • origin (str) – the original text.

  • paraphrase_list (list) – a list of paraphrased texts.

  • data_record (dict) – the data record corresponding to the original text.

Returns

a list containing the metric for each paraphrase.

Return type

(list)

measure_example(origin, paraphrase, data_record=None, **kwargs)[source]
measure_multiple_examples(origin_list, paraphrase_list, data_record_list=None, **kwargs)[source]
class fibber.metrics.MetricBundle(enable_edit_distance=True, enable_use_similarity=True, enable_glove_similarity=True, enable_gpt2_perplexity=False, enable_transformer_classifier=True, enable_ce_similarity=False, enable_fasttext_classifier=False, enable_bert_perplexity=True, enable_bert_perplexity_per_class=False, enable_self_bleu=False, enable_ref_bleu=False, target_clf='transformer', field='text0', bs=32, **kwargs)[source]

Bases: object

MetricBundle can help easily initialize and compute multiple metrics.

Initialize various metrics.

Parameters
  • enable_edit_distance (bool) – whether to use editing distance in the bundle.

  • enable_use_similarity (bool) – whether to use the Universal Sentence Encoder to compute sentence similarity.

  • enable_glove_similarity (bool) – whether to use GloVe embeddings to compute sentence similarity.

  • enable_gpt2_perplexity (bool) – whether to use GPT2 to compute sentence quality.

  • enable_transformer_classifier (bool) – whether to include BERT classifier prediction in metrics.

  • enable_ce_similarity (bool) – whether to use Cross Encoder to measure sentence similarity.

  • enable_fasttext_classifier (bool) – whether to include Fasttext classifier prediction in metrics.

  • target_clf (str) – choose from “transformer”, “fasttext”.

  • field (str) – the field where perturbation can happen.

  • bs (int) – batch size.

  • kwargs – arguments for metrics. kwargs will be passed to all metrics.
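
A construction sketch (the dataset name and the fibber.datasets.get_dataset helper are assumptions; dataset_name, trainset, and testset are passed through kwargs to the metrics that need them):

   from fibber.datasets import get_dataset  # assumed dataset helper
   from fibber.metrics import MetricBundle

   trainset, testset = get_dataset("ag")  # illustrative dataset name
   bundle = MetricBundle(
       enable_edit_distance=True,
       enable_use_similarity=True,
       enable_bert_perplexity=True,
       target_clf="transformer", field="text0",
       dataset_name="ag", trainset=trainset, testset=testset)

   record = testset["data"][0]  # assumes the fibber dataset layout
   metrics = bundle.measure_example(
       origin=record["text0"],
       paraphrase="an illustrative paraphrase",
       data_record=record)  # dict keyed by metric name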

add_advanced_aggregation_fn(aggregation_name, aggregation_fn, direction)[source]

Add an advanced aggregation function.

Some aggregation functions can aggregate multiple metrics; such functions can be added here.

Parameters
  • aggregation_name (str) – the name of the aggregation.

  • aggregation_fn (fn) – an aggregation function that takes data_record as arg.

  • direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.
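
A sketch of an advanced aggregation function; the import path of the direction constants and the "paraphrase_metrics" record layout (produced by measure_dataset) are assumptions:

   # Assumed location of the direction constants.
   from fibber.metrics.metric_utils import DIRECTION_HIGHER_BETTER

   def mean_use_similarity(data_record):
       # Assumes each record carries one metric dict per paraphrase
       # under "paraphrase_metrics".
       scores = [m["USESimilarityMetric"]
                 for m in data_record["paraphrase_metrics"]]
       return sum(scores) / len(scores)

   # `bundle` is a MetricBundle, e.g. the one constructed above.
   bundle.add_advanced_aggregation_fn(
       "mean_use_similarity", mean_use_similarity, DIRECTION_HIGHER_BETTER)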

add_classifier(classifier_metric, set_target_clf=False)[source]

Add a classifier to the metric bundle, optionally setting it as the target classifier to attack.

Parameters
  • classifier_metric (ClassifierBase) – A classifier metric to be added.

  • set_target_clf (bool) – whether to set this classifier metric as target classifier.

add_metric(metric, direction)[source]

Add a customized metric to metric bundle.

Parameters
  • metric (MetricBase) – the metric object to add.

  • direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.

aggregate_metrics(dataset_name, paraphrase_strategy_name, experiment_name, results)[source]

Aggregate paraphrase metrics on a dataset into a summary and store it in a dict.

Parameters
  • dataset_name (str) – the name of the dataset.

  • paraphrase_strategy_name (str) – the name of the paraphrase strategy.

  • experiment_name (str) – the name of the experiment.

  • results (dict) – the fibber dataset with paraphrases and metrics. The return value of compute_metrics.

Returns

the aggregated metrics.

Return type

(dict)

get_advanced_aggregation_fn(aggregation_name)[source]
get_advanced_aggregation_fn_direction(aggregation_name)[source]
get_advanced_aggregation_fn_names()[source]
get_classifier(classifier_name)[source]

Returns the requested classifier from the current metric bundle.

Parameters

classifier_name (str) – the name of the requested classifier.

get_classifier_names()[source]
get_metric(metric_name)[source]

Returns a metric in the bundle using the metric name.

Metric name is the class name of a metric.

Raises assertion error if metric is not found.

Parameters

metric_name – the name of the metric.

Returns

a metric object.

Return type

(MetricBase)

get_metric_direction(metric_name)[source]

Returns the direction of a metric.

Metric name is the class name of a metric.

Raises assertion error if metric is not found.

Parameters

metric_name – the name of the metric.

Returns

the direction of the metric (DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, or DIRECTION_UNKNOWN).

Return type

(str)

get_metric_names()[source]

Returns all metric names in this metric bundle.

Returns

list of str

get_target_classifier()[source]

Returns the classifier for attack.

get_target_classifier_name()[source]

Returns the name of the target classifier.

measure_batch(origin, paraphrase_list, data_record=None)[source]

Measure all metrics in the bundle on a batch of paraphrases.

Parameters
  • origin (str) – the original text.

  • paraphrase_list (list) – a list of paraphrased texts.

  • data_record (dict) – the data record corresponding to the original text.

Returns

a list containing dict of metrics for each paraphrase.

Return type

(list)

measure_dataset(results, output_filename)[source]

Compute all metrics for results on a dataset.

Parameters
  • results (dict) – A fibber dataset with paraphrase_list.

  • output_filename (str) – A json filename to store results and metrics.

Returns

the results dict with original_text_metrics and paraphrase_metrics added.

Return type

(dict)
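
A sketch continuing the MetricBundle example above; `results` is assumed to be a fibber dataset whose records already carry a paraphrase_list:

   # `bundle` is the MetricBundle constructed in the earlier sketch.
   results = bundle.measure_dataset(results, "ag-metrics.json")

   record = results["data"][0]  # assumes the fibber dataset layout
   record["original_text_metrics"]   # dict keyed by metric name
   record["paraphrase_metrics"][0]   # one dict per paraphrase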

measure_example(origin, paraphrase, data_record=None)[source]

Compute the results of all metrics in the bundle for one pair of text.

Parameters
  • origin (str) – original text.

  • paraphrase (str) – paraphrased text.

  • data_record (dict) – the data record.

Returns

a dict with metric name as key.

Return type

(dict)

replace_target_classifier(clf)[source]

Remove the original target classifier and add a new classifier.

set_target_classifier_by_name(classifier_name)[source]

Set a target classifier to attack.

Parameters

classifier_name (str) – the name of the classifier to set as the target classifier.

class fibber.metrics.RefBleuMetric(**kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the BLEU score between input and output.

Initialize the BLEU metric.

class fibber.metrics.SelfBleuMetric(**kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric computes the BLEU score between input and output.

Initialize the BLEU metric.

class fibber.metrics.TransformerClassifier(dataset_name, trainset, testset, transformer_clf_gpu_id=-1, transformer_clf_steps=20000, transformer_clf_bs=32, transformer_clf_lr=2e-05, transformer_clf_optimizer='adamw', transformer_clf_weight_decay=0.001, transformer_clf_period_summary=100, transformer_clf_period_val=500, transformer_clf_period_save=20000, transformer_clf_val_steps=10, transformer_clf_model_init='bert-base-cased', **kwargs)[source]

Bases: fibber.metrics.classifier.classifier_base.ClassifierBase

BERT classifier prediction on paraphrase_list.

This metric is special: it does not compare the original and paraphrased sentences. Instead, it outputs the classifier's predictions on paraphrase_list, so mean or std should not be computed on this metric.

Parameters
  • dataset_name (str) – the name of the dataset.

  • trainset (dict) – a fibber dataset.

  • testset (dict) – a fibber dataset.

  • transformer_clf_gpu_id (int) – the gpu id for the BERT model. Set -1 to use CPU.

  • transformer_clf_steps (int) – steps to train a classifier.

  • transformer_clf_bs (int) – the batch size.

  • transformer_clf_lr (float) – the learning rate.

  • transformer_clf_optimizer (str) – the optimizer name.

  • transformer_clf_weight_decay (float) – the weight decay in the optimizer.

  • transformer_clf_period_summary (int) – the period in steps to write training summary.

  • transformer_clf_period_val (int) – the period in steps to run validation and write validation summary.

  • transformer_clf_period_save (int) – the period in steps to save current model.

  • transformer_clf_val_steps (int) – number of batches in each validation.

enable_ppl_filter(ppl_metric)[source]
get_device()[source]
get_model_and_tokenizer()[source]
get_model_init()[source]
load_robust_tuned_model(load_path)[source]
robust_tune_init(optimizer, lr, weight_decay, steps)[source]
robust_tune_step(data_record_list)[source]
save_robust_tuned_model(save_path)[source]
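
A usage sketch (the dataset name and the fibber.datasets.get_dataset helper are assumptions; constructing the classifier trains it for transformer_clf_steps steps, or may reuse a previously saved model):

   from fibber.datasets import get_dataset  # assumed dataset helper
   from fibber.metrics import TransformerClassifier

   trainset, testset = get_dataset("ag")  # illustrative dataset name
   clf = TransformerClassifier(
       dataset_name="ag", trainset=trainset, testset=testset,
       transformer_clf_gpu_id=-1,  # -1 to use CPU
       transformer_clf_model_init="bert-base-cased")

   record = testset["data"][0]  # assumes the fibber dataset layout
   pred = clf.measure_example(origin=record["text0"],
                              paraphrase=record["text0"],
                              data_record=record)  # predicted label
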
class fibber.metrics.USESimilarityMetric(use_gpu_id=-1, **kwargs)[source]

Bases: fibber.metrics.metric_base.MetricBase

This metric uses the Universal Sentence Encoder to measure the semantic similarity of two sentences.

Initialize the Universal Sentence Encoder.
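
A minimal usage sketch; the encoder weights are downloaded on first use:

   from fibber.metrics import USESimilarityMetric

   metric = USESimilarityMetric(use_gpu_id=-1)  # -1 runs the encoder on CPU
   score = metric.measure_example(
       origin="The company reported strong earnings.",
       paraphrase="The firm announced solid profits.")
   # Higher values indicate closer semantic similarity.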