syndi.task.create_tasks

syndi.task.create_tasks(train_dataset='data/train.csv', test_dataset='data/test.csv', target='TARGET', path_to_generators='generators/', pycaret_models=None, task_sampling_method='all', run_num=1, output_dir=None, is_regression=False, regression_bins=5, preprocess_fn=None)

Create a list of benchmark task objects.

Parameters:
  • train_dataset (str) – the path to the training dataset CSV file

  • test_dataset (str) – the path to the test dataset CSV file

  • target (str) – the name of the target column, which must be the same in the training and test datasets

  • path_to_generators (str) – the directory containing the generators

  • pycaret_models (list) – a list of strings naming the PyCaret classification models to use; if None, all models are run.

  • task_sampling_method (str) – “uniform”, “original”, “baseline” (no sampling), or “all” (both uniform and original)

  • run_num (int) – the number of times to generate a sample and test a classifier on it.

  • output_dir (str) – the path to store the task configurations.

Returns:

a list of Task objects that store the benchmarking task configurations.

Return type:

list
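
Example:

The following is a minimal sketch of how a call might be assembled, not a definitive recipe. The helpers build_create_tasks_kwargs and make_tasks are hypothetical names introduced only for illustration; the keyword names and default values are copied from the signature above, and the accepted sampling methods from the parameter description. Passing PyCaret model IDs such as "lr" or "rf" in pycaret_models is an assumption about the expected string format.

```python
# Sampling methods listed in the parameter description above.
VALID_SAMPLING_METHODS = ("uniform", "original", "baseline", "all")


def build_create_tasks_kwargs(sampling_method="all", pycaret_models=None, run_num=1):
    """Hypothetical helper: assemble keyword arguments for create_tasks.

    Paths and the target name are the defaults shown in the signature.
    """
    if sampling_method not in VALID_SAMPLING_METHODS:
        raise ValueError(
            f"sampling_method must be one of {VALID_SAMPLING_METHODS}, "
            f"got {sampling_method!r}"
        )
    if run_num < 1:
        raise ValueError("run_num must be a positive integer")
    return {
        "train_dataset": "data/train.csv",
        "test_dataset": "data/test.csv",
        "target": "TARGET",
        "path_to_generators": "generators/",
        "pycaret_models": pycaret_models,  # None means "run all models"
        "task_sampling_method": sampling_method,
        "run_num": run_num,
        "output_dir": None,
    }


def make_tasks(**kwargs):
    """Hypothetical wrapper: call create_tasks with the given arguments."""
    # Imported lazily so the helper above works without syndi installed.
    from syndi.task import create_tasks

    return create_tasks(**kwargs)
```

With syndi installed and the data files in place, make_tasks(**build_create_tasks_kwargs("uniform", ["lr", "rf"], run_num=3)) would be expected to return the list of Task objects described under Returns. Validating the sampling method before the call makes a misspelled option fail early with a clear message.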