fibber.paraphrase_strategies.fudge_strategy module

class fibber.paraphrase_strategies.fudge_strategy.FudgeStrategy(arg_dict, dataset_name, strategy_gpu_id, output_dir, metric_bundle, field)

Bases: fibber.paraphrase_strategies.strategy_base.StrategyBase

A baseline paraphrase strategy. Just return the reference.

Initialize the paraphrase strategy.

This function initializes self._strategy_config, self._metric_bundle, self._device, self._output_dir, and self._dataset_name.

Subclasses should not override this function.

self._strategy_config (dict): a dictionary that stores the strategy name and all hyperparameter values. The dict is also saved to the results.
self._metric_bundle (MetricBundle): the metrics that will be used to evaluate paraphrases. Strategies can compute metrics during paraphrasing.
self._device (torch.device): any computation that requires a GPU accelerator should use this device.
self._output_dir (str): the directory where the strategy can save files.
self._dataset_name (str): the dataset name.

Parameters

arg_dict (dict) – all args loaded from the command line.
dataset_name (str) – the name of the dataset.
strategy_gpu_id (int) – the id of the GPU on which to run the strategy.
output_dir (str) – a directory to save any models or temporary files.
metric_bundle (MetricBundle) – a MetricBundle object.

fit(trainset)

Fit the paraphrase strategy on a training set.

Parameters: trainset (dict) – a fibber dataset.

paraphrase_example(data_record, n)

Paraphrase one data record.

This function should be overridden by subclasses. When overriding this function, you can use self._strategy_config, self._metric_bundle, self._device, self._output_dir, and self._dataset_name.

Parameters

data_record (dict) – a dict storing one data record of a dataset.
n (int) – number of paraphrases.

Returns

A list containing at most n strings.

Return type

([str,])
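The override contract described above can be illustrated with a minimal sketch. It does not import fibber: `StubStrategyBase` is a hypothetical stand-in for StrategyBase, `CaseVariantStrategy` is a toy strategy invented for illustration, and the record layout (a `"text0"` field) is an assumption. The only property taken from this page is that paraphrase_example returns a list of at most n strings.

```python
class StubStrategyBase:
    """Hypothetical stand-in for StrategyBase; not the real fibber class."""

    def __init__(self, strategy_config, field):
        self._strategy_config = strategy_config
        self._field = field  # assumed: name of the text field to paraphrase

    def paraphrase_example(self, data_record, n):
        raise NotImplementedError


class CaseVariantStrategy(StubStrategyBase):
    """Toy strategy: returns simple case variants of the original text."""

    def paraphrase_example(self, data_record, n):
        text = data_record[self._field]
        candidates = [text.upper(), text.lower(), text.title()]
        # Drop duplicates while keeping order, then truncate to at most n.
        unique = list(dict.fromkeys(candidates))
        return unique[:n]


record = {"text0": "The movie was great.", "label": 1}  # assumed record layout
strategy = CaseVariantStrategy({"strategy_name": "CaseVariantStrategy"}, field="text0")
paraphrases = strategy.paraphrase_example(record, n=2)
```

A real strategy would typically generate more candidates than needed and could use self._metric_bundle to filter them before truncating to n.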

score(data_record, tmp, text, ll)

fibber.paraphrase_strategies.fudge_strategy.make_batch(toks_list)

Convert multiple texts to a batch tensor.
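Batching variable-length token lists usually means padding them to a common length. The sketch below shows that idea under stated assumptions: `make_batch_sketch`, the `pad_id` parameter, and the returned mask and lengths are all hypothetical, and plain nested lists are used instead of a tensor to keep the example dependency-free. The padding value and return type of the actual make_batch are not documented on this page.

```python
def make_batch_sketch(toks_list, pad_id=0):
    """Pad token-id lists to equal length; return (batch, mask, lengths).

    Hypothetical helper for illustration, not the fibber implementation.
    """
    lengths = [len(toks) for toks in toks_list]
    max_len = max(lengths, default=0)
    # Right-pad every sequence with pad_id up to the longest sequence.
    batch = [list(toks) + [pad_id] * (max_len - len(toks)) for toks in toks_list]
    # mask is 1 for real tokens and 0 for padding positions.
    mask = [[1] * len(toks) + [0] * (max_len - len(toks)) for toks in toks_list]
    return batch, mask, lengths


batch, mask, lengths = make_batch_sketch([[5, 6, 7], [8]])
```

With PyTorch available, the rectangular nested list could then be converted with `torch.tensor(batch)`.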

fibber.paraphrase_strategies.fudge_strategy.make_input_output_pair(tokenizer, x)

Tokenize the text, then construct input and output for GPT2.
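For a left-to-right language model such as GPT2, the target sequence is commonly the input sequence shifted by one token. The sketch below illustrates that shift, assuming a begin-of-sequence token is prepended to the input and an end-of-sequence token is appended to the output. `ToyTokenizer` and `make_input_output_pair_sketch` are hypothetical; how the real function handles special tokens is not stated on this page.

```python
class ToyTokenizer:
    """Hypothetical whitespace tokenizer; stands in for a GPT-2 tokenizer."""

    def __init__(self):
        self.bos_token_id = 0
        self.eos_token_id = 1
        self._vocab = {}

    def encode(self, text):
        # Assign ids on first sight, starting after the two special tokens.
        return [self._vocab.setdefault(w, len(self._vocab) + 2) for w in text.split()]


def make_input_output_pair_sketch(tokenizer, x):
    """Return (inputs, outputs) where outputs[i] is the token after inputs[i]."""
    toks = tokenizer.encode(x)
    inputs = [tokenizer.bos_token_id] + toks   # assumed: BOS starts the input
    outputs = toks + [tokenizer.eos_token_id]  # assumed: EOS ends the target
    return inputs, outputs


inputs, outputs = make_input_output_pair_sketch(ToyTokenizer(), "a b a")
```

Both sequences have the same length, so the model's prediction at position i can be compared directly against outputs[i].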

fibber.paraphrase_strategies.cheat_strategy module

fibber.paraphrase_strategies.identity_strategy module