Parallelization#

Parallel Generation - num_workers and pool#

Running a function for many different parameters is, in principle, perfectly parallelizable since each run is independent. xyzpy can handle this automatically in a number of different ways:

  1. Supply parallel=True when calling Runner.run_combos(...) or Harvester.harvest_combos(...) etc. This spawns a concurrent.futures.ProcessPoolExecutor with the same number of workers as logical cores.

  2. Supply num_workers=... instead to control explicitly how many workers are used. Since threading in many numeric codes is controlled by the environment variable $OMP_NUM_THREADS, you generally want the product of this and num_workers to equal the number of cores.

  3. Supply executor=... to use any custom parallel pool-executor-like object (e.g. a dask.distributed client or mpi4py pool) which has a submit/apply_async method and yields futures with a result/get method. More specifically, this covers pools with an API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also explicitly handled.

  4. Use a Crop to write combos to disk, which can then be ‘grown’ persistently by any computers with access to the filesystem, such as the nodes of a distributed cluster - see below.
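The pool-executor interface required by option 3 is minimal. The following sketch (illustrative only, not part of xyzpy) shows the duck-typing involved: a submit method returning future-like objects with a result method, here simply running everything serially:

```python
from concurrent.futures import Future


class SerialExecutor:
    """Illustrative, minimal pool-executor-like object: all that is
    required is a ``submit`` method returning future-like objects
    with a ``result`` method. Here everything just runs serially."""

    def submit(self, fn, *args, **kwargs):
        fut = Future()
        try:
            fut.set_result(fn(*args, **kwargs))
        except Exception as exc:
            fut.set_exception(exc)
        return fut


fut = SerialExecutor().submit(lambda a, b: (a + b, a - b), 3, 4)
print(fut.result())  # -> (7, -1)
```

An instance of such a class could then be supplied as executor=SerialExecutor(); in practice you would use a real pool such as concurrent.futures.ProcessPoolExecutor or a dask.distributed client.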

The first three options can be used on any of the various functions derived from combo_runner():

import xyzpy as xyz

def slow(a, b):
    import time
    time.sleep(1)
    return a + b, a - b

xyz.case_runner_to_df(
    slow,
    fn_args=['a', 'b'],
    cases=[(i, i + 1) for i in range(8)],
    var_names=['sum', 'diff'],
    parallel=True,
)
100%|##########| 8/8 [00:01<00:00,  6.56it/s]
   a  b  sum  diff
0  0  1    1    -1
1  1  2    3    -1
2  2  3    5    -1
3  3  4    7    -1
4  4  5    9    -1
5  5  6   11    -1
6  6  7   13    -1
7  7  8   15    -1
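As a rule of thumb for point 2 above, a suitable num_workers can be derived from the core count and the current thread setting like this (a sketch; the variable names are illustrative):

```python
import os

cores = os.cpu_count()
# threads each worker will use, e.g. for OpenMP / MKL controlled codes
threads_per_worker = int(os.environ.get("OMP_NUM_THREADS", "1"))
# choose workers so that workers * threads-per-worker fills the machine
num_workers = max(1, cores // threads_per_worker)
```

The resulting value could then be supplied as num_workers=num_workers to run_combos etc.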

Batched / Distributed generation - Crop#

Running combos using the disk as a persistence mechanism requires one more object - the Crop. These can be instantiated directly or, more conveniently, from a parent Runner or Harvester: xyzpy.Runner.Crop() and xyzpy.Harvester.Crop(). Using a Crop involves a number of steps:

  1. Creation with:

    • a unique name to call this set of runs, defaulting to the function name

    • a fn if not creating from a Harvester or Runner

    • other optional settings such as batchsize, controlling how many runs to group into each batch.

Hint

You can automatically load all crops in the current directory (or a specific one) to a dictionary by calling the function xyzpy.load_crops().

  2. ‘Sow’. Use xyzpy.Crop.sow_combos() to write combos into batches on disk.

  3. ‘Grow’. Grow each batch. This can be done a number of ways:

  • The script xyzpy-grow can be called from the command line to process all batches. It has signature:

!xyzpy-grow --help
usage: xyzpy-grow [-h] [--parent-dir PARENT_DIR] [--debug]
                  [--num-threads NUM_THREADS] [--num-workers NUM_WORKERS]
                  [--ray] [--gpus-per-task GPUS_PER_TASK]
                  [--verbosity VERBOSITY]
                  crop_name

Grow crops using xyzpy-gen-cropping.

positional arguments:
  crop_name             The name of the crop to grow.

options:
  -h, --help            show this help message and exit
  --parent-dir PARENT_DIR
                        The parent directory of the crop.
  --debug               Enable debugging mode.
  --num-threads NUM_THREADS
                        The number of threads per worker (OMP_NUM_THREADS
                        etc.)
  --num-workers NUM_WORKERS
                        The number of worker processes to use.
  --ray                 Use a ray executor, either connecting to an existing
                        cluster, or starting a new one with num_workers
  --gpus-per-task GPUS_PER_TASK
                        The number of gpus to request per task, if using a ray
                        executor. The overall GPUs available is set by
                        CUDA_VISIBLE_DEVICES, which ray follows.
  --verbosity VERBOSITY
                        The verbosity level.
  • In another python process, navigate to the same directory and run, for example, python -c "import xyzpy; c = xyzpy.Crop(name=...); xyzpy.grow(i, crop=c)" to grow the ith batch of the crop with the specified name. See grow() for other options. This could be put in a script manually, to run on a batch system.

  • Use Crop.grow_cluster - experimental! This automatically generates and submits a script using SGE, PBS or SLURM. See its options and Crop.gen_cluster_script for the template scripts.

  • Use xyzpy.Crop.grow() or xyzpy.Crop.grow_missing() to complete some or all of the batches in the local process. This can be useful to (a) finish up a few missing or errored runs, or (b) run all the combos with persistent progress, so that the runs can be restarted at a completely different time or with updated functions etc.

  4. Watch the progress. Crop.__str__ will show how many batches have been completed of the total sown.

  5. ‘Reap’. Once all the batches have completed, run xyzpy.Crop.reap() to collect the results and remove the batches’ temporary directory. If the crop originated from a Runner or Harvester, the data will be labelled, merged and saved accordingly.
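Put together, the whole cycle might look like the following hedged sketch, where the crop name, batch size and combos are all illustrative and fn stands in for your own function:

```python
def crop_workflow(fn):
    """Illustrative sow/grow/reap cycle for a function ``fn`` with two
    named outputs (the names and values here are just examples)."""
    import xyzpy as xyz

    r = xyz.Runner(fn, var_names=['sum', 'diff'])

    # 1. create the crop (its name defaults to the function's name)
    crop = r.Crop(name='example-crop', batchsize=4)

    # 2. sow: write the combos to disk in batches
    crop.sow_combos({'a': range(8), 'b': range(8)})

    # 3. grow: here simply in the local process ...
    crop.grow_missing()
    # ... though each batch could instead be grown elsewhere, e.g.
    # with the xyzpy-grow script on the nodes of a cluster

    # 4. & 5. reap: collect the results and clean up the batches
    return crop.reap()
```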

Note

You can reap an unfinished Crop, as long as there is at least one result, by passing the allow_incomplete=True option to reap(). Note that missing results will be represented by numpy.nan, which might affect the eventual dtype of the harvested results. To avoid this, consider also setting sync=False to avoid writing anything to disk until the full Crop is finished.

See the full demonstrations in Examples.