fibber.metrics.metric_utils module

class fibber.metrics.metric_utils.MetricBundle(enable_edit_distance=True, enable_use_similarity=True, enable_glove_similarity=True, enable_gpt2_perplexity=False, enable_transformer_classifier=True, enable_ce_similarity=False, enable_fasttext_classifier=False, enable_bert_perplexity=True, enable_bert_perplexity_per_class=False, enable_self_bleu=False, enable_ref_bleu=False, target_clf='transformer', field='text0', bs=32, **kwargs)[source]

Bases: object

MetricBundle initializes multiple metrics and computes them together.

Initialize various metrics.

Parameters
  • enable_edit_distance (bool) – whether to use edit distance in the bundle.

  • enable_use_similarity (bool) – whether to use Universal Sentence Encoder to compute sentence similarity.

  • enable_glove_similarity (bool) – whether to use GloVe embeddings to compute sentence similarity.

  • enable_gpt2_perplexity (bool) – whether to use GPT2 to compute sentence quality.

  • enable_transformer_classifier (bool) – whether to include BERT classifier prediction in metrics.

  • enable_ce_similarity (bool) – whether to use Cross Encoder to measure sentence similarity.

  • enable_fasttext_classifier (bool) – whether to include Fasttext classifier prediction in metrics.

  • target_clf (str) – choose from “transformer”, “fasttext”.

  • field (str) – the field where perturbation can happen.

  • bs (int) – batch size.

  • kwargs – arguments for metrics. kwargs will be passed to all metrics.
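As a rough sketch of how the flags above might be combined, the snippet below collects a lightweight configuration in a dict and forwards it to the constructor. LIGHT_CONFIG and build_light_bundle are hypothetical helpers, not part of fibber; only the keyword names come from the signature above. Individual metrics may require further keyword arguments through kwargs that this section does not document.

```python
# Keyword names are taken from the MetricBundle signature above.
# LIGHT_CONFIG and build_light_bundle are hypothetical helpers for illustration.
LIGHT_CONFIG = {
    "enable_edit_distance": True,
    "enable_use_similarity": False,
    "enable_glove_similarity": True,
    "enable_gpt2_perplexity": False,
    "enable_transformer_classifier": True,
    "enable_bert_perplexity": False,
    "target_clf": "transformer",
    "field": "text0",
    "bs": 16,
}


def build_light_bundle(**metric_kwargs):
    """Construct a MetricBundle from LIGHT_CONFIG plus extra metric kwargs."""
    # Imported lazily so the configuration can be inspected without fibber installed.
    from fibber.metrics.metric_utils import MetricBundle

    # Keys in metric_kwargs must not repeat keys in LIGHT_CONFIG,
    # otherwise Python raises a TypeError for the duplicated keyword.
    return MetricBundle(**LIGHT_CONFIG, **metric_kwargs)
```

Keeping the flags in one dict makes it easy to log or compare configurations before any model is loaded.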

add_advanced_aggregation_fn(aggregation_name, aggregation_fn, direction)[source]

Add advanced aggregation function.

Some aggregation functions combine multiple metrics. Such functions can be added here.

Parameters
  • aggregation_name (str) – the name of the aggregation.

  • aggregation_fn (fn) – an aggregation function that takes data_record as arg.

  • direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.
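To illustrate what an aggregation function over several metrics could look like, here is a minimal sketch. It assumes that data_record carries a "paraphrase_metrics" list with one metric dict per paraphrase, and that the keys "EditDistanceMetric" and "USESimilarityMetric" are present; both key names and the toy values are assumptions made for this example.

```python
def best_similarity_within_edit_budget(data_record, max_edits=5):
    """Return the highest similarity among paraphrases within an edit budget.

    Assumes data_record["paraphrase_metrics"] is a list with one metric dict
    per paraphrase, keyed by metric name (both key names below are assumed).
    """
    best = 0.0
    for metrics in data_record["paraphrase_metrics"]:
        if metrics["EditDistanceMetric"] <= max_edits:
            best = max(best, metrics["USESimilarityMetric"])
    return best


# Toy record for illustration only; the values are made up.
toy_record = {
    "paraphrase_metrics": [
        {"EditDistanceMetric": 2, "USESimilarityMetric": 0.91},
        {"EditDistanceMetric": 9, "USESimilarityMetric": 0.97},
        {"EditDistanceMetric": 4, "USESimilarityMetric": 0.88},
    ]
}
score = best_similarity_within_edit_budget(toy_record)
print(score)
```

A function of this shape would presumably be registered with a call along the lines of bundle.add_advanced_aggregation_fn("BestSimWithinBudget", best_similarity_within_edit_budget, DIRECTION_HIGHER_BETTER), since a larger value is better here.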

add_classifier(classifier_metric, set_target_clf=False)[source]

Add a classifier metric to the bundle, optionally setting it as the target classifier to attack.

Parameters
  • classifier_metric (ClassifierBase) – A classifier metric to be added.

  • set_target_clf (bool) – whether to set this classifier metric as target classifier.

add_metric(metric, direction)[source]

Add a customized metric to metric bundle.

Parameters
  • metric (MetricBase) – the metric object to add.

  • direction (str) – choose from DIRECTION_HIGHER_BETTER, DIRECTION_LOWER_BETTER, and DIRECTION_UNKNOWN.
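The sketch below shows the kind of object a customized metric might be. LengthRatioMetric is a hypothetical stand-in that mirrors the measure_example(origin, paraphrase, data_record=None) signature used by the bundle; a real custom metric would derive from MetricBase, whose interface is not shown in this section.

```python
class LengthRatioMetric:
    """Ratio of paraphrase length to original length, in whitespace tokens.

    Stand-in class for illustration: it does not subclass MetricBase, and the
    method name is borrowed from MetricBundle.measure_example as an assumption.
    """

    def measure_example(self, origin, paraphrase, data_record=None):
        origin_len = len(origin.split())
        if origin_len == 0:
            # Avoid division by zero for an empty original text.
            return 0.0
        return len(paraphrase.split()) / origin_len


metric = LengthRatioMetric()
ratio = metric.measure_example("the movie was great", "the film was really great")
print(ratio)
```

For a metric like this one, values near 1.0 are desirable and neither larger nor smaller is uniformly better, so DIRECTION_UNKNOWN would be the natural direction to pass to add_metric.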

aggregate_metrics(dataset_name, paraphrase_strategy_name, experiment_name, results)[source]

Aggregate paraphrase metrics on a dataset to a summary and store in a dict.

Parameters
  • dataset_name (str) – the name of the dataset.

  • paraphrase_strategy_name (str) – the name of the paraphrase strategy.

  • experiment_name (str) – the name of the experiment.

  • results (dict) – the fibber dataset with paraphrases and metrics. The return value of compute_metrics.

Returns

the aggregated metrics.

Return type

(dict)

get_advanced_aggregation_fn(aggregation_name)[source]
get_advanced_aggregation_fn_direction(aggregation_name)[source]
get_advanced_aggregation_fn_names()[source]
get_classifier(classifier_name)[source]

Returns a classifier in the current metric bundle by name.

Parameters

classifier_name (str) – the name of the requested classifier.

get_classifier_names()[source]
get_metric(metric_name)[source]

Returns a metric in the bundle using the metric name.

Metric name is the class name of a metric.

Raises assertion error if metric is not found.

Parameters

metric_name – the name of the metric.

Returns

a metric object.

Return type

(MetricBase)
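The lookup behavior described above (class name as key, assertion error when the name is missing) can be sketched with a small stand-in registry. TinyBundle and WordCountMetric are hypothetical classes written for this example and are not fibber's implementation.

```python
class TinyBundle:
    """Stand-in registry keyed by metric class name (not fibber's implementation)."""

    def __init__(self):
        self._metrics = {}
        self._directions = {}

    def add_metric(self, metric, direction):
        # The metric name is the class name of the metric object.
        name = type(metric).__name__
        self._metrics[name] = metric
        self._directions[name] = direction

    def get_metric(self, metric_name):
        assert metric_name in self._metrics, "unknown metric: " + metric_name
        return self._metrics[metric_name]

    def get_metric_direction(self, metric_name):
        assert metric_name in self._directions, "unknown metric: " + metric_name
        return self._directions[metric_name]


class WordCountMetric:
    """Placeholder metric class used only to demonstrate name-based lookup."""


bundle = TinyBundle()
bundle.add_metric(WordCountMetric(), "unknown")
found = bundle.get_metric("WordCountMetric")

try:
    bundle.get_metric("MissingMetric")
except AssertionError as err:
    missing_message = str(err)
```

Because a failed lookup raises an assertion error, callers that are unsure whether a metric is enabled may prefer to check get_metric_names() first.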

get_metric_direction(metric_name)[source]

Returns the direction of a metric.

Metric name is the class name of a metric.

Raises assertion error if metric is not found.

Parameters

metric_name – the name of the metric.

Returns

the direction of the metric.

Return type

(str)

get_metric_names()[source]

Returns all metric names in this metric bundle.

Returns

list of str

get_target_classifier()[source]

Returns the classifier for attack.

get_target_classifier_name()[source]

Returns the name of the target classifier.

measure_batch(origin, paraphrase_list, data_record=None)[source]

Measure all metrics in the bundle on a batch of paraphrases.

Parameters
  • origin (str) – the original text.

  • paraphrase_list (list) – a list of paraphrases.

  • data_record (dict) – the corresponding data record of original text.

Returns

a list containing dict of metrics for each paraphrase.

Return type

(list)
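Since the return value is a list with one metric dict per paraphrase, it can be used to select a paraphrase directly. The helper below is a sketch under the assumption that the returned list follows the order of paraphrase_list; pick_best, the key "EditDistanceMetric", and the toy values are illustrative and not part of fibber.

```python
def pick_best(paraphrase_list, metrics_list, metric_name, higher_is_better=True):
    """Return the paraphrase whose metric value is best under the given direction."""
    if len(paraphrase_list) != len(metrics_list):
        raise ValueError("expected one metric dict per paraphrase")
    select = max if higher_is_better else min
    best_index = select(
        range(len(metrics_list)),
        key=lambda i: metrics_list[i][metric_name],
    )
    return paraphrase_list[best_index]


# Toy data standing in for the output of measure_batch; the values are made up.
paraphrases = ["the film was great", "great movie", "that film was excellent"]
toy_metrics = [
    {"EditDistanceMetric": 1},
    {"EditDistanceMetric": 3},
    {"EditDistanceMetric": 2},
]
closest = pick_best(paraphrases, toy_metrics, "EditDistanceMetric", higher_is_better=False)
print(closest)
```

The higher_is_better flag plays the same role here as the direction recorded for each metric in the bundle.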

measure_dataset(results, output_filename)[source]

Compute all metrics for results on a dataset.

Parameters
  • results (dict) – A fibber dataset with paraphrase_list.

  • output_filename (str) – A json filename to store results and metrics.

Returns

the results dict with original_text_metrics and paraphrase_metrics added.

Return type

(dict)

measure_example(origin, paraphrase, data_record=None)[source]

Compute the results of all metrics in the bundle for one pair of text.

Parameters
  • origin (str) – original text.

  • paraphrase (str) – paraphrased text.

  • data_record (dict) – the data record.

Returns

a dict with metric name as key.

Return type

(dict)

replace_target_classifier(clf)[source]

Remove the original target classifier and add a new classifier.

set_target_classifier_by_name(classifier_name)[source]

Set a target classifier to attack.

Parameters

classifier_name (str) – the name of the classifier to set as the target classifier.