fibber.paraphrase_strategies.strategy_base module

class fibber.paraphrase_strategies.strategy_base.StrategyBase(arg_dict, dataset_name, strategy_gpu_id, output_dir, metric_bundle, field)

Bases: object

The base class for all paraphrase strategies.

The simplest way to write a strategy is to override the paraphrase_example function. This function takes one data record and returns multiple paraphrases of a given field.

For more advanced use cases, you can override the paraphrase function.

Some strategies have hyperparameters. Declare them in the class attribute __hyperparameters__.

Hyperparameters defined in __hyperparameters__ can be added to the command-line arg parser by add_parser_args(parser). The values of the hyperparameters are stored in self._strategy_config.

__abbr__

A unique string that serves as the abbreviation for the strategy.

Type: str

__hyperparameters__

A list of tuples that defines the hyperparameters for the strategy. Each tuple is (name, type, default, help). For example:

__hyperparameters__ = [("p1", int, -1, "the first hyper parameter"), ...]

Type: list

Initialize the paraphrase strategy.

This function initializes self._strategy_config, self._metric_bundle, self._device, self._output_dir, and self._dataset_name.

You should not override this function.

self._strategy_config (dict): a dictionary that stores the strategy name and all hyperparameter values. The dict is also saved to the results.
self._metric_bundle (MetricBundle): the metrics that will be used to evaluate paraphrases. Strategies can compute metrics during paraphrasing.
self._device (torch.device): any computation that requires a GPU accelerator should use this device.
self._output_dir (str): the dir name where the strategy can save files.
self._dataset_name (str): the dataset name.
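
As a rough illustration of how a dictionary like self._strategy_config might be assembled from arg_dict, the sketch below reads each declared hyperparameter and falls back to its default. The build_strategy_config helper, the "strategy_name" key, and the "<abbr>_<name>" key format are assumptions made for this example and are not confirmed by this page.

```python
def build_strategy_config(strategy_name, abbr, hyperparameters, arg_dict):
    """Collect hyperparameter values into a plain dict (hypothetical helper)."""
    config = {"strategy_name": strategy_name}
    for name, type_, default, _help in hyperparameters:
        # Assumed key format "<abbr>_<name>"; use the declared default if absent.
        value = arg_dict.get(f"{abbr}_{name}", default)
        # Coerce to the declared type so string inputs become typed values.
        config[name] = type_(value)
    return config


hyperparameters = [
    ("p1", int, -1, "the first hyper parameter"),
    ("p2", float, 0.5, "a second, made-up hyper parameter"),
]
arg_dict = {"demo_p1": "3"}  # values as they might arrive from the command line
config = build_strategy_config("DemoStrategy", "demo", hyperparameters, arg_dict)
print(config)
```

Keeping the result a flat dict of plain values makes it easy to save alongside the results, as described for self._strategy_config above.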

Parameters

arg_dict (dict) – all args loaded from the command line.
dataset_name (str) – the name of the dataset.
strategy_gpu_id (int) – the gpu id to run the strategy.
output_dir (str) – a directory to save any models or temporary files.
metric_bundle (MetricBundle) – a MetricBundle object.
field – the name of the data record field to paraphrase.

classmethod add_parser_args(parser)

Create command-line arguments for all hyperparameters in __hyperparameters__.

Parameters: parser – an arg parser.
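
The behavior described for add_parser_args can be sketched with the standard argparse module. This is a minimal stand-in, not fibber's source: DemoStrategy does not inherit from StrategyBase, the "rate" hyperparameter is made up, and the "--<abbr>_<name>" flag format is an assumption of this example.

```python
import argparse


class DemoStrategy:
    """Stand-in class; it does not inherit from fibber's StrategyBase."""

    __abbr__ = "demo"
    __hyperparameters__ = [
        ("p1", int, -1, "the first hyper parameter"),
        ("rate", float, 0.1, "a made-up second hyper parameter"),
    ]

    @classmethod
    def add_parser_args(cls, parser):
        # Register one option per (name, type, default, help) tuple.
        # The "--<abbr>_<name>" flag format is an assumption of this sketch.
        for name, type_, default, help_text in cls.__hyperparameters__:
            parser.add_argument(
                f"--{cls.__abbr__}_{name}",
                type=type_,
                default=default,
                help=help_text,
            )


parser = argparse.ArgumentParser()
DemoStrategy.add_parser_args(parser)
args = vars(parser.parse_args(["--demo_p1", "7"]))
print(args)
```

Prefixing each flag with the strategy abbreviation is one way to keep hyperparameters of different strategies from colliding in a shared parser.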

fit(trainset)

Fit the paraphrase strategy on a training set.

Parameters: trainset (dict) – a fibber dataset.

paraphrase_dataset(paraphrase_set, n, tmp_output_filename)

Paraphrase one dataset.

Parameters

paraphrase_set (dict) – a dict storing the dataset to paraphrase.
n (int) – number of paraphrases.
tmp_output_filename (str) – the JSON file to which results are saved while the strategy is running.

Returns

A dict containing the original text and paraphrased text.

Return type

(dict)
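
One plausible shape for this loop is sketched below: each record is paraphrased in turn and the accumulated results are rewritten to the temporary JSON file. The "data" key, the "text0" field, the "_paraphrases" suffix, and the placeholder paraphraser are all assumptions for illustration; the dataset layout is not specified on this page.

```python
import json
import os
import tempfile


def paraphrase_example(data_record, n, field="text0"):
    # Placeholder paraphraser: one upper-cased copy, truncated to at most n items.
    return [data_record[field].upper()][:n]


def paraphrase_dataset(paraphrase_set, n, tmp_output_filename, field="text0"):
    """Paraphrase every record and checkpoint results to a JSON file."""
    results = {"data": []}
    for record in paraphrase_set["data"]:
        new_record = dict(record)  # copy so the input dataset is not modified
        new_record[field + "_paraphrases"] = paraphrase_example(record, n, field)
        results["data"].append(new_record)
        # Rewrite the temporary file after each record so progress is not lost.
        with open(tmp_output_filename, "w") as f:
            json.dump(results, f, indent=2)
    return results


dataset = {"data": [{"text0": "a good movie"}, {"text0": "a dull plot"}]}
tmp_path = os.path.join(tempfile.mkdtemp(), "tmp_results.json")
output = paraphrase_dataset(dataset, n=2, tmp_output_filename=tmp_path)
print(output["data"][0])
```

Writing after every record trades some I/O for the ability to inspect or recover partial results from a long run.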

paraphrase_example(data_record, n)

Paraphrase one data record.

This function should be overridden by subclasses. When overriding this function, you can use self._strategy_config, self._metric_bundle, self._device, self._output_dir, and self._dataset_name.

Parameters

data_record (dict) – a dict storing one data record of a dataset.
n (int) – number of paraphrases.

Returns

A list containing at most n strings.

Return type

([str,])
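
A toy override might look like the following sketch. StrategyBaseSketch is a minimal stand-in so the example runs without fibber installed; the self._field attribute, the "text0" field name, and the word-swap heuristic are hypothetical and chosen only to show the "at most n strings" contract.

```python
class StrategyBaseSketch:
    """Minimal stand-in for the base class; not fibber's implementation."""

    def __init__(self, field):
        self._field = field  # hypothetical attribute name

    def paraphrase_example(self, data_record, n):
        raise NotImplementedError


class WordSwapStrategy(StrategyBaseSketch):
    """Toy strategy: each candidate swaps one pair of adjacent words."""

    def paraphrase_example(self, data_record, n):
        words = data_record[self._field].split()
        candidates = []
        for i in range(len(words) - 1):
            swapped = words[:i] + [words[i + 1], words[i]] + words[i + 2:]
            candidates.append(" ".join(swapped))
        # Return no more than n strings; fewer if the text is short.
        return candidates[:n]


strategy = WordSwapStrategy(field="text0")
paraphrases = strategy.paraphrase_example({"text0": "the movie was great"}, n=2)
print(paraphrases)
```

A one-word input yields an empty list here, which is still consistent with returning at most n strings.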
