fibber.benchmark package

Module contents

class fibber.benchmark.Benchmark(output_dir, dataset_name, trainset=None, testset=None, attack_set=None, subsample_attack_set=0, customized_clf=None, enable_transformer_clf=True, enable_fasttext_classifier=False, use_gpu_id=-1, bert_ppl_gpu_id=-1, transformer_clf_gpu_id=-1, transformer_clf_steps=20000, transformer_clf_bs=32, best_adv_metric_name='USESimilarityMetric', best_adv_metric_lower_better=False, target_classifier='transformer', transformer_clf_model_init='bert-base-cased', field='text0')[source]

Bases: object

Benchmark framework for adversarial attack methods on text classification.

Initialize Benchmark framework.

Parameters
  • output_dir (str) – the directory to write outputs, including models, sentences, metrics, and logs.

  • dataset_name (str) – the name of the dataset.

  • trainset (dict) – the training set. If dataset_name matches a built-in dataset, trainset should be None.

  • testset (dict) – the test set. If dataset_name matches a built-in dataset, testset should be None.

  • attack_set (dict or None) – the set on which to run the adversarial attack. Use None to attack the testset.

  • subsample_attack_set (int) – the number of examples to subsample from the attack set. Use 0 to attack the whole set.

  • customized_clf (ClassifierBase) – a customized classifier object.

  • enable_transformer_clf (bool) – whether to enable the transformer classifier in metrics.

  • enable_fasttext_classifier (bool) – whether to enable the fasttext classifier in metrics.

  • use_gpu_id (int) – the gpu on which to run the universal sentence encoder for metric computation. -1 for CPU.

  • bert_ppl_gpu_id (int) – the gpu on which to run the BERT language model for perplexity. -1 for CPU.

  • transformer_clf_gpu_id (int) – the gpu on which to run the BERT text classifier, i.e., the model being attacked. -1 for CPU.

  • transformer_clf_steps (int) – number of steps to train the BERT text classifier.

  • transformer_clf_bs (int) – the batch size to train the BERT classifier.

  • best_adv_metric_name (str) – the name of the metric used to select the best adversarial example when the paraphrase strategy outputs multiple candidates.

  • best_adv_metric_lower_better (bool) – whether a lower value of the metric indicates a better adversarial example.

  • target_classifier (str) – the victim classifier. Choose from ["transformer", "fasttext", "customized"].

  • transformer_clf_model_init (str) – the backbone pretrained language model, e.g., bert-base-cased.

  • field (str) – the name of the text field to attack.
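Putting the parameters above together, a minimal construction might look like the following. This is a sketch only: it assumes the fibber package is installed, the dataset name "ag_news" is an illustrative choice of built-in dataset, and the reduced training-step count is an assumption to keep a trial run cheap. A real run downloads the dataset and trains the victim classifier.

```python
# Sketch (not a definitive recipe): assumes fibber is installed and
# "ag_news" is available as a built-in dataset.
from fibber.benchmark import Benchmark

benchmark = Benchmark(
    output_dir="exp-ag_news",
    dataset_name="ag_news",      # built-in dataset, so trainset/testset stay None
    subsample_attack_set=100,    # attack only 100 sampled examples
    use_gpu_id=-1,               # CPU for the universal sentence encoder
    bert_ppl_gpu_id=-1,          # CPU for the BERT perplexity model
    transformer_clf_gpu_id=-1,   # CPU for the victim classifier
    transformer_clf_steps=2000,  # fewer training steps than the 20000 default
)
```

Because the victim classifier is trained at construction time, the gpu ids and transformer_clf_steps dominate setup cost; on real hardware, pointing the three gpu id parameters at available devices is usually worthwhile.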

fit_defense(paraphrase_strategy, defense_strategy)[source]
get_metric_bundle()[source]
load_defense(defense_strategy)[source]
run_benchmark(paraphrase_strategy='IdentityStrategy', strategy_gpu_id=-1, max_paraphrases=50, exp_name=None, update_global_results=False)[source]

Run the benchmark.

Parameters
  • paraphrase_strategy (str or StrategyBase) – the paraphrase strategy to benchmark. Either the name of a built-in strategy or a customized strategy object derived from StrategyBase.

  • strategy_gpu_id (int) – the gpu id on which to run the strategy. -1 for CPU. Ignored when paraphrase_strategy is an object.

  • max_paraphrases (int) – the maximum number of paraphrases to generate for each sentence.

  • exp_name (str or None) – the name of the current experiment. Use None for the default name, <dataset_name>-<strategy_name>-<date>-<time>.

  • update_global_results (bool) – whether to write results to <fibber_root_dir> (the global results) instead of the benchmark's output directory.

Returns

A dict of evaluation results.
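The default experiment name follows the <dataset_name>-<strategy_name>-<date>-<time> pattern described above. A hypothetical helper sketching that convention (the exact timestamp format fibber uses internally is an assumption here, not taken from the source):

```python
import datetime

def default_exp_name(dataset_name, strategy_name, now=None):
    # Build an experiment name following the
    # <dataset_name>-<strategy_name>-<date>-<time> pattern.
    # The YYYYMMDD / HHMMSS formats are illustrative assumptions.
    now = now or datetime.datetime.now()
    return "%s-%s-%s-%s" % (
        dataset_name, strategy_name,
        now.strftime("%Y%m%d"), now.strftime("%H%M%S"))

print(default_exp_name("ag_news", "IdentityStrategy",
                       datetime.datetime(2024, 1, 2, 3, 4, 5)))
# prints ag_news-IdentityStrategy-20240102-030405
```

Passing an explicit exp_name instead keeps output directories stable across reruns, which is convenient when comparing strategies.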