Computing Results

Parallel Generation - num_workers and pool

Running a function for many different parameters theoretically allows perfect parallelization since each run is independent. xyzpy can automatically handle this in a number of different ways:

  1. Supply parallel=True when calling Runner.run_combos(...) or Harvester.harvest_combos(...) etc. This spawns a ProcessExecutorPool with the same number of workers as logical cores.

  2. Supply num_workers=... instead to explicitly control how any workers are used. Since for many numeric codes threading is controlled by the environement variable $OMP_NUM_THREADS you generally want the product of this and num_workers to be equal to the number of cores.

  3. Supply executor=... to use any custom parallel pool-executor like object (e.g. a dask.distributed client or mpi4py pool) which has a submit/apply_async method, and yields futures with a result/get method. More specifically, this covers pools with an API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also explicitly handled.

  4. Use a Crop to write combos to disk, which can then be ‘grown’ persistently by any computers with access to the filesystem, such as distributed cluster - see below.

The first three options can be used on any of the various functions derived from combo_runner():

import xyzpy as xyz


def slow(a, b):
    import time

    time.sleep(1)
    return a + b, a - b


xyz.case_runner_to_df(
    slow,
    fn_args=["a", "b"],
    cases=[(i, i + 1) for i in range(8)],
    var_names=["sum", "diff"],
    parallel=True,
)
100%|##########| 8/8 [00:01<00:00,  6.53it/s]
a b sum diff
0 0 1 1 -1
1 1 2 3 -1
2 2 3 5 -1
3 3 4 7 -1
4 4 5 9 -1
5 5 6 11 -1
6 6 7 13 -1
7 7 8 15 -1

Batched / Distributed generation - Crop

Running combos using the disk as a persistence mechanism requires one more object - the Crop. These can be instantiated directly or, generally more convenient, from a parent Runner or Harvester: xyzpy.Runner.Crop() and xyzpy.Harvester.Crop(). Using a Crop requires a number of steps:

  1. Creation with:

    • a unique name to call this set of runs, defaulting the function name

    • a fn if not creating from an Harvester or Runner

    • other optional settings such as batchsize controlling how many runs to group into one.

Hint

You can automatically load all crops in the current directory (or a specific one) to a dictionary by calling the function xyzpy.load_crops().

  1. ‘Sow’. Use xyzpy.Crop.sow_combos() to write combos into batches on disk.

  2. ‘Grow’. Grow each batch. This can be done a number of ways:

  • The script xyzpy-grow can be called from the command line to process all batches. It has signature:

!xyzpy-grow --help
usage: xyzpy-grow [-h] [--parent-dir PARENT_DIR] [--batch-ids BATCH_IDS]
                  [--raise-errors [RAISE_ERRORS]] [--num-threads NUM_THREADS]
                  [--num-workers NUM_WORKERS] [--subprocess [SUBPROCESS]]
                  [--log [LOG]] [--gpus GPUS] [--affinities AFFINITIES]
                  [--ray] [--gpus-per-task GPUS_PER_TASK]
                  [--verbosity VERBOSITY] [--verbosity-grow VERBOSITY_GROW]
                  crop_name

Grow crops using xyzpy-gen-cropping.

positional arguments:
  crop_name             The name of the crop to grow.

options:
  -h, --help            show this help message and exit
  --parent-dir PARENT_DIR
                        The parent directory of the crop.
  --batch-ids BATCH_IDS
                        Comma separated list of which batches to grow, by
                        default all missing results.
  --raise-errors [RAISE_ERRORS]
                        Raise batch errors.
  --num-threads NUM_THREADS
                        The number of threads per worker (OMP_NUM_THREADS
                        etc.)
  --num-workers NUM_WORKERS
                        The number of worker processes to use.
  --subprocess [SUBPROCESS]
                        Run each batch in its own fresh subprocess. This is
                        most robust in terms of memory, at the cost of the
                        process startup overhead. Optional value: true/false.
  --log [LOG]           Save subprocess stdout and stderr to log files in the
                        crop directory under logs/batch-{batch_id}.log. Only
                        used when --subprocess is enabled.
  --gpus GPUS           If subprocess is enabled, this is an optional comma
                        separated list of GPU device IDs to assign to
                        subprocesses via CUDA_VISIBLE_DEVICES. Each subprocess
                        gets a single GPU from this pool; the pool also limits
                        concurrency. You can oversubscribe GPUs by repeating
                        device IDs, e.g. `0,0,1,1` to allow 2 subprocesses to
                        share each GPU.
  --affinities AFFINITIES
                        If subprocess is enabled, this is an optional comma
                        separated list of affinities to use, one for each
                        process. This ensures a single cpu core is used for
                        each batch, regardless of other environment variables.
  --ray                 Use a ray executor, either connecting to an existing
                        cluster, or starting a new one with num_workers
  --gpus-per-task GPUS_PER_TASK
                        The number of gpus to request per task, if using a ray
                        executor. The overall GPUs available is set by
                        CUDA_VISIBLE_DEVICES, which ray follows.
  --verbosity VERBOSITY
                        The verbosity level.
  --verbosity-grow VERBOSITY_GROW
                        The verbosity level.
  • In another python process, navigate to the same directory and run for example python -c "import xyzpy; c = xyzpy.Crop(name=...); xyzpy.grow(i, crop=c)" to grow the ith batch of crop with specified name. See grow() for other options. This could manually be put in a script to run on a batch system.

  • Use Crop.grow_cluster - experimental! This automatically generates and submits a script using SGE, PBS or SLURM. See its options and Crop.gen_cluster_script for the template scripts.

  • Use xyzpy.Crop.grow() or xyzpy.Crop.grow_missing() to complete some or all of the batches in the local process. This can be useful to A. finish up a few missing/errored runs B. run all the combos with persistent progress, so that one can restart the runs at a completely different time/ with updated functions etc.

  1. Watch the progress. Crop.__str__ will show how many batches have been completed of the total sown.

  2. ‘Reap’. Once all the batches have completed, run xyzpy.Crop.reap() to collect the results and remove the batches’ temporary directory. If the crop originated from a Runner or Harvester, the data will be labelled, merged and saved accordingly.

Note

You can reap an unfinished Crop as long as there is at least one result by passing the allow_incomplete=True option to reap(). Note that missing results will be represented by numpy.nan which might effect the eventual dtype of harvested results. To avoid this, consider also setting sync=False to avoid writing anything to disk until the full Crop is finished.

See the full demonstrations in Examples.