xyzpy.gen.cropping
==================

.. py:module:: xyzpy.gen.cropping


Attributes
----------

.. autoapisummary::

   xyzpy.gen.cropping.BTCH_NM
   xyzpy.gen.cropping.RSLT_NM
   xyzpy.gen.cropping.FNCT_NM
   xyzpy.gen.cropping.INFO_NM
   xyzpy.gen.cropping._SGE_HEADER
   xyzpy.gen.cropping._SGE_ARRAY_HEADER
   xyzpy.gen.cropping._PBS_HEADER
   xyzpy.gen.cropping._PBS_ARRAY_HEADER
   xyzpy.gen.cropping._SLURM_HEADER
   xyzpy.gen.cropping._SLURM_ARRAY_HEADER
   xyzpy.gen.cropping._BASE
   xyzpy.gen.cropping._CLUSTER_SGE_GROW_ALL_SCRIPT
   xyzpy.gen.cropping._CLUSTER_PBS_GROW_ALL_SCRIPT
   xyzpy.gen.cropping._CLUSTER_SLURM_GROW_ALL_SCRIPT
   xyzpy.gen.cropping._CLUSTER_SGE_GROW_PARTIAL_SCRIPT
   xyzpy.gen.cropping._CLUSTER_PBS_GROW_PARTIAL_SCRIPT
   xyzpy.gen.cropping._CLUSTER_SLURM_GROW_PARTIAL_SCRIPT
   xyzpy.gen.cropping._BASE_CLUSTER_GROW_SINGLE
   xyzpy.gen.cropping._BASE_CLUSTER_SCRIPT_END


Classes
-------

.. autoapisummary::

   xyzpy.gen.cropping._ResourcePool
   xyzpy.gen.cropping.Crop
   xyzpy.gen.cropping.Sower
   xyzpy.gen.cropping.Reaper


Functions
---------

.. autoapisummary::

   xyzpy.gen.cropping.write_to_disk
   xyzpy.gen.cropping.read_from_disk
   xyzpy.gen.cropping.get_picklelib
   xyzpy.gen.cropping.to_pickle
   xyzpy.gen.cropping.from_pickle
   xyzpy.gen.cropping.parse_crop_details
   xyzpy.gen.cropping.parse_fn_farmer
   xyzpy.gen.cropping.calc_clean_up_default_res
   xyzpy.gen.cropping.check_ready_to_reap
   xyzpy.gen.cropping._parse_resource_ids
   xyzpy.gen.cropping._acquire_affinity
   xyzpy.gen.cropping._acquire_gpu
   xyzpy.gen.cropping.load_crops
   xyzpy.gen.cropping.grow
   xyzpy.gen.cropping.gen_cluster_script
   xyzpy.gen.cropping.grow_cluster
   xyzpy.gen.cropping.gen_qsub_script
   xyzpy.gen.cropping.qsub_grow
   xyzpy.gen.cropping.clean_slurm_outputs
   xyzpy.gen.cropping.manage_slurm_outputs


Module Contents
---------------

.. py:data:: BTCH_NM
   :value: 'xyz-batch-{}.jbdmp'


.. py:data:: RSLT_NM
   :value: 'xyz-result-{}.jbdmp'


.. py:data:: FNCT_NM
   :value: 'xyz-function.clpkl'


.. py:data:: INFO_NM
   :value: 'xyz-settings.jbdmp'
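
The batch and result templates above each contain a single ``{}`` placeholder.
As a minimal sketch, assuming the placeholder is filled with the batch number
(the helper ``batch_and_result_names`` is hypothetical and not part of this
module), the file names for one batch could be built like this:

.. code-block:: python

   # Templates copied from the module attributes documented above.
   BTCH_NM = "xyz-batch-{}.jbdmp"
   RSLT_NM = "xyz-result-{}.jbdmp"


   def batch_and_result_names(batch_number):
       """Return the (batch, result) file names for one batch number.

       Hypothetical helper: assumes the placeholder takes the batch number.
       """
       return BTCH_NM.format(batch_number), RSLT_NM.format(batch_number)


   print(batch_and_result_names(3))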


.. py:function:: write_to_disk(obj, fname)

.. py:function:: read_from_disk(fname)

.. py:function:: get_picklelib(picklelib='joblib.externals.cloudpickle')

.. py:function:: to_pickle(obj, picklelib='joblib.externals.cloudpickle')

.. py:function:: from_pickle(s, picklelib='joblib.externals.cloudpickle')

.. py:function:: parse_crop_details(fn, crop_name, crop_parent)

   Work out how to structure the sowed data.

   :param fn: Function to infer ``crop_name`` from, if it is not given explicitly.
   :type fn: callable, optional
   :param crop_name: Specific name to give this set of runs.
   :type crop_name: str, optional
   :param crop_parent: Specific directory to put the ".xyz-{crop_name}/" folder in
                       with all the cases and results.
   :type crop_parent: str, optional

   :returns: * **crop_location** (*str*) -- Full path to the crop-folder.
             * **crop_name** (*str*) -- Name of the crop.
             * **crop_parent** (*str*) -- Parent folder of the crop.


.. py:function:: parse_fn_farmer(fn, farmer)

.. py:function:: calc_clean_up_default_res(crop, clean_up, allow_incomplete)

   Logic for choosing whether to automatically clean up a crop, and what,
   if any, the default all-nan result should be.


.. py:function:: check_ready_to_reap(crop, allow_incomplete, wait)

.. py:function:: _parse_resource_ids(raw)

   Normalize an int, list, tuple, range, or comma-separated string
   into a list of integer resource IDs.
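
   The normalization described above might look roughly like the following
   sketch. It is an illustration only, not the library's code: the name
   ``parse_resource_ids`` is hypothetical, and treating a bare ``int`` as a
   single ID (rather than a count) is an assumption.

   .. code-block:: python

      def parse_resource_ids(raw):
          """Normalize ``raw`` into a list of integer IDs (illustrative only)."""
          if isinstance(raw, int):
              # Assumption: a bare int names one resource ID.
              return [raw]
          if isinstance(raw, str):
              # Comma-separated string such as "0,0,1,1"; skip empty parts.
              return [int(part) for part in raw.split(",") if part.strip()]
          # Lists, tuples, and ranges are simply iterated.
          return [int(x) for x in raw]


      print(parse_resource_ids("0,0,1,1"))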


.. py:function:: _acquire_affinity(rid, pargs, env)

   Prepend ``taskset -c <cpu>`` to pin to a CPU core.


.. py:function:: _acquire_gpu(rid, pargs, env)

   Set ``CUDA_VISIBLE_DEVICES`` to pin to a GPU.


.. py:class:: _ResourcePool(ids, acquire_fn)

   A pool of reusable resource IDs (CPUs, GPUs, etc.) that can be
   acquired and released once per batch subprocess.

   :param ids: The resource IDs available to hand out.
   :type ids: list of int
   :param acquire_fn: ``fn(rid, pargs, env)`` — mutate *pargs* (the command prefix
                      list) and/or *env* (the environment dict) to apply *rid*.
   :type acquire_fn: callable


   .. py:attribute:: free


   .. py:attribute:: used


   .. py:attribute:: acquire_fn


   .. py:method:: from_raw(raw, acquire_fn)
      :classmethod:


      Create a pool from a raw user value, or return ``None``.


   .. py:method:: available()

      Whether there is at least one free resource.


   .. py:method:: acquire(batch_id, pargs, env)

      Pop a resource, apply it, and track it against *batch_id*.


   .. py:method:: release(batch_id)

      Return the resource used by *batch_id* to the free pool.
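
   The acquire/release cycle can be illustrated with a small stand-alone
   sketch. ``ResourcePoolSketch`` and ``acquire_gpu`` below are hypothetical
   stand-ins written for this example; they mirror the documented attributes
   (``free``, ``used``, ``acquire_fn``) but are not the library's
   implementation.

   .. code-block:: python

      class ResourcePoolSketch:
          """Illustrative pool of reusable resource IDs."""

          def __init__(self, ids, acquire_fn):
              self.free = list(ids)  # IDs not currently handed out
              self.used = {}  # maps batch_id -> resource ID in use
              self.acquire_fn = acquire_fn

          def available(self):
              return bool(self.free)

          def acquire(self, batch_id, pargs, env):
              rid = self.free.pop()
              self.acquire_fn(rid, pargs, env)  # mutate pargs and/or env
              self.used[batch_id] = rid

          def release(self, batch_id):
              self.free.append(self.used.pop(batch_id))


      def acquire_gpu(rid, pargs, env):
          # Pin the subprocess to one GPU through its environment.
          env["CUDA_VISIBLE_DEVICES"] = str(rid)


      pool = ResourcePoolSketch([0, 1], acquire_gpu)
      env = {}
      pool.acquire(batch_id=7, pargs=[], env=env)
      print(env, pool.available())

   Releasing with ``pool.release(7)`` puts the ID back in ``free`` so that a
   later batch can reuse it.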


.. py:class:: Crop(*, fn=None, name=None, parent_dir=None, save_fn=None, batchsize=None, num_batches=None, shuffle=False, farmer=None, autoload=True)

   Bases: :py:obj:`object`


   Encapsulates all the details describing a single 'crop', that is,
   its location, name, and batch size/number. Also allows tracking of
   the crop's progress and, experimentally, automatic submission of
   workers to a grid engine to complete un-grown cases. Can also be instantiated
   directly from a :class:`~xyzpy.Runner`, :class:`~xyzpy.Harvester` or
   :class:`~xyzpy.Sampler` instance.

   :param fn: Target function - Crop `name` will be inferred from this if
              not given explicitly. If given, `Sower` will also default
              to saving a version of `fn` to disk for `cropping.grow` to use.
   :type fn: callable, optional
   :param name: Custom name for this set of runs - must be given if `fn`
                is not.
   :type name: str, optional
   :param parent_dir: If given, alternative directory to put the ".xyz-{name}/"
                      folder in with all the cases and results.
   :type parent_dir: str, optional
   :param save_fn: Whether to save the function to disk for `cropping.grow` to use.
                   Will default to True if `fn` is given.
   :type save_fn: bool, optional
   :param batchsize: How many cases to group into a single batch per worker.
                     By default, batchsize=1. Cannot be specified if `num_batches`
                     is.
   :type batchsize: int, optional
   :param num_batches: How many total batches to aim for, cannot be specified if
                       `batchsize` is.
   :type num_batches: int, optional
   :param farmer: A Runner, Harvester or Sampler instance, from which the `fn` can be
                  inferred and which can also allow the Crop to reap itself straight to a
                  dataset or dataframe.
   :type farmer: {xyzpy.Runner, xyzpy.Harvester, xyzpy.Sampler}, optional
   :param autoload: If True, check for the existence of a Crop written to disk
                    with the same location, and if found, load it.
   :type autoload: bool, optional

   .. seealso:: :py:obj:`Runner.Crop`, :py:obj:`Harvester.Crop`, :py:obj:`Sampler.Crop`


   .. py:attribute:: name
      :value: None


   .. py:attribute:: parent_dir
      :value: None


   .. py:attribute:: save_fn
      :value: None


   .. py:attribute:: batchsize
      :value: None


   .. py:attribute:: num_batches
      :value: None


   .. py:attribute:: shuffle
      :value: False


   .. py:attribute:: _batch_remainder
      :value: None


   .. py:attribute:: _all_nan_result
      :value: None


   .. py:attribute:: _num_sown_batches
      :value: -1


   .. py:attribute:: _num_results
      :value: -1


   .. py:property:: runner


   .. py:method:: choose_batch_settings(*, combos=None, cases=None)

      Work out how to divide all cases into batches, i.e. ensure
      that ``batchsize * num_batches >= num_cases``.
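
      One way to satisfy this inequality is ceiling division. The following is
      a sketch under assumptions, not the method's actual code: the function
      ``batch_settings`` is hypothetical, and it takes the number of cases
      directly instead of ``combos`` and ``cases``.

      .. code-block:: python

         import math


         def batch_settings(num_cases, batchsize=None, num_batches=None):
             """Return (batchsize, num_batches) covering all cases (sketch)."""
             if batchsize is not None and num_batches is not None:
                 raise ValueError("Specify only one of batchsize and num_batches.")
             if num_batches is not None:
                 # Aim for the requested number of batches.
                 batchsize = math.ceil(num_cases / num_batches)
             elif batchsize is None:
                 batchsize = 1  # documented default
             num_batches = math.ceil(num_cases / batchsize)
             return batchsize, num_batches


         print(batch_settings(10, batchsize=3))
         print(batch_settings(10, num_batches=3))

      In this sketch the last batch may simply hold fewer cases than the others.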


   .. py:method:: ensure_dirs_exists()

      Make sure the directory structure for this crop exists.


   .. py:method:: save_info(combos=None, cases=None, fn_args=None)

      Save information about the sowed cases.


   .. py:method:: load_info()

      Load the full settings from disk.


   .. py:method:: load_batch(batch_number)

      Load a specific batch from disk.


   .. py:method:: load_result(batch_number)

      Load a specific result from disk.


   .. py:method:: save_result(batch_number, result)

      Save a specific result to disk.


   .. py:method:: _sync_info_from_disk(only_missing=True)

      Load information about the saved cases.


   .. py:method:: save_function_to_disk()

      Save the base function to disk using cloudpickle.


   .. py:method:: load_function()

      Load the saved function from disk, and try to re-insert it back into
      Harvester or Runner if present.


   .. py:method:: prepare(combos=None, cases=None, fn_args=None)

      Write information about this crop and the supplied combos to disk.
      Typically done at start of sow, not when Crop instantiated.


   .. py:method:: is_prepared()

      Check whether this crop has been written to disk.


   .. py:method:: calc_progress()

      Calculate how much progress has been made in growing the batches.


   .. py:method:: is_ready_to_reap()

      Have all batches been grown?


   .. py:method:: completed_results() -> tuple[int, Ellipsis]

      Return tuple of batches which have been grown already.


   .. py:method:: missing_results() -> tuple[int, Ellipsis]

      Return tuple of batches which haven't been grown yet.


   .. py:method:: delete_all()

      Delete the crop directory and all its contents, and reset
      any loaded information on this Crop object.


   .. py:method:: handle_existing(action='ask', msg=None, e=None, overwrite=False)

      Handle an already prepared crop.

      :param action: What to do with the existing crop. If ``'ask'`` (default),
                     interactively prompt the user. Otherwise, execute the
                     specified action directly.
      :type action: {'ask', 'reap', 'delete', 'skip', 'raise'}
      :param msg: Message to display when prompting.
      :type msg: str, optional
      :param e: Exception to re-raise if action is ``'raise'``.
      :type e: Exception, optional
      :param overwrite: Whether to overwrite existing data when reaping.
      :type overwrite: bool, optional


   .. py:property:: all_nan_result

      Get a stand-in result for cases which are still missing.


   .. py:method:: __str__()


   .. py:method:: __repr__()


   .. py:method:: parse_constants(constants=None)


   .. py:method:: sow_combos(combos, cases=None, constants=None, shuffle=False, verbosity=1, desc='Sow', batchsize=None, num_batches=None)

      Sow combos to disk to be later grown, potentially in batches. Note
      that if you have already sown this `Crop`, then as long as the number of
      batches hasn't changed (e.g. you have just tweaked the function or a
      constant argument), you can safely resow: only the batch files will be
      overwritten and any existing results will remain.

      :param combos: The combinations to sow for all or some function arguments.
      :type combos: dict_like[str, iterable]
      :param cases: Optionally provide a sequence of individual cases to sow for some
                    or all function arguments.
      :type cases: iterable of mappings, optional
      :param constants: Provide additional constant function values to use when sowing.
      :type constants: mapping, optional
      :param shuffle: If given, sow the combos in a random order (using ``random.seed``
                      and ``random.shuffle``), which can be helpful for distributing
                      resources when not all cases are computationally equal.
      :type shuffle: bool or int, optional
      :param verbosity: How much information to show when sowing. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being sown.
      :type verbosity: int, optional
      :param desc: Description to show in the progress bar when sowing.
      :type desc: str, optional
      :param batchsize: If specified, set a new batchsize for the crop.
      :type batchsize: int, optional
      :param num_batches: If specified, set a new num_batches for the crop.
      :type num_batches: int, optional


   .. py:method:: sow_cases(fn_args, cases, combos=None, constants=None, verbosity=1, batchsize=None, num_batches=None)

      Sow cases to disk to be later grown, potentially in batches.

      :param fn_args: The names and order of the function arguments, can be ``None`` if
                      each case is supplied as a ``dict``.
      :type fn_args: iterable[str] or str
      :param cases: Sequence of individual cases to sow for all or some function
                    arguments.
      :type cases: iterable or mappings, optional
      :param combos: Combinations to sow for some or all function arguments.
      :type combos: dict_like[str, iterable]
      :param constants: Provide additional constant function values to use when sowing.
      :type constants: mapping, optional
      :param verbosity: How much information to show when sowing. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being sown.
      :type verbosity: int, optional
      :param batchsize: If specified, set a new batchsize for the crop.
      :type batchsize: int, optional
      :param num_batches: If specified, set a new num_batches for the crop.
      :type num_batches: int, optional


   .. py:method:: sow_samples(n, combos=None, constants=None, verbosity=1)

      Sow ``n`` samples to disk.


   .. py:method:: grow_subprocess(batch_ids=None, num_workers=None, num_threads=None, gpus=None, affinities=None, raise_errors=False, log=False, min_wait=1e-06, max_wait=0.1, verbosity=1, verbosity_grow=0, desc='Grow')

      Grow particular or missing batches using a single fresh subprocess
      per batch. This has a higher overhead for starting each process, but is
      more robust memory wise, and allows controlling the number of threads
      used, CPU affinity and GPU assignment.

      :param batch_ids: Which batch numbers to grow, defaults to all missing.
      :type batch_ids: int or sequence of int, optional
      :param num_workers: The maximum number of concurrent subprocesses (default 1).
      :type num_workers: int, optional
      :param num_threads: The number of threads per subprocess (default 1).
      :type num_threads: int, optional
      :param gpus: GPU device IDs to assign to subprocesses via
                   ``CUDA_VISIBLE_DEVICES``. Each subprocess gets a single GPU from
                   this pool; the pool also limits concurrency to the number of GPUs
                   provided. You can oversubscribe GPUs by repeating device IDs, e.g.
                   ``0,0,1,1`` to allow 2 subprocesses to share each GPU.
      :type gpus: int, str, or sequence of int, optional
      :param affinities: CPU core IDs to pin subprocesses to via ``taskset``.
                         Also limits concurrency to the number of affinities.
      :type affinities: int, str, or sequence of int, optional
      :param raise_errors: Whether to raise errors encountered during growing.
      :type raise_errors: bool, optional
      :param log: Whether to save subprocess stdout and stderr to log files in the
                  crop directory under ``logs/batch-{batch_id}.log``. Default is
                  False, which discards stdout and only prints stderr on error.
      :type log: bool, optional
      :param min_wait: Minimum polling interval in seconds.
      :type min_wait: float, optional
      :param max_wait: Maximum polling interval in seconds.
      :type max_wait: float, optional
      :param verbosity: How much information to show when growing. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being grown.
      :type verbosity: int, optional
      :param verbosity_grow: Verbosity within each batch grow.
      :type verbosity_grow: int, optional
      :param desc: Description to show in the progress bar when growing.
      :type desc: str, optional


   .. py:method:: grow(batch_ids=None, subprocess='auto', num_workers=None, num_threads=None, gpus=None, affinities=None, raise_errors=False, debugging=False, verbosity=1, verbosity_grow=0, log=False, desc='Grow', **combo_runner_opts)

      Grow specific batch numbers using this process.

      :param batch_ids: Which batch numbers to grow, by default all missing results.
      :type batch_ids: int or sequence of ints, optional
      :param subprocess: Whether to grow each batch in a fresh subprocess. This adds about
                         1 second of overhead per batch, but allows the number of threads,
                         cpu affinity and gpu assignment to be controlled. If "auto"
                         (default) then subprocesses will be used if ``num_threads``,
                         ``gpus`` or ``affinities`` are specified.
                         See :meth:`Crop.grow_subprocess` for details.
      :type subprocess: "auto" or bool, optional
      :param num_workers: Maximum number of batches to run concurrently. In subprocess mode
                          this is the cap on simultaneous subprocesses (defaults to 1 if not
                          given). In in-process mode this is the size of the joblib loky
                          process pool used by ``combo_runner_core`` (``None`` = serial).
      :type num_workers: int, optional
      :param num_threads: Number of threads each worker is allowed to use, applied via the
                          standard env vars (``OMP_NUM_THREADS``, ``MKL_NUM_THREADS``,
                          ``OPENBLAS_NUM_THREADS``, ...). Only meaningful in subprocess mode
                          (the env vars must be set before numerical libraries are imported);
                          setting it implies ``subprocess=True`` when ``subprocess="auto"``.
                          Passing this with ``subprocess=False`` raises ``ValueError``.
      :type num_threads: int, optional
      :param gpus: GPU device IDs to assign to subprocesses via
                   ``CUDA_VISIBLE_DEVICES``. Each subprocess gets a single GPU from
                   this pool; the pool also caps concurrency to its size. Repeat IDs
                   to oversubscribe (e.g. ``"0,0,1,1"`` shares each GPU between two
                   workers). Subprocess-mode only — implies ``subprocess=True`` when
                   ``subprocess="auto"``; raises ``ValueError`` with
                   ``subprocess=False``.
      :type gpus: int, str, or sequence of int, optional
      :param affinities: CPU core IDs to pin subprocesses to via ``taskset``. Each
                         subprocess gets one affinity from the pool, which also caps
                         concurrency. Subprocess-mode only — implies ``subprocess=True``
                         when ``subprocess="auto"``; raises ``ValueError`` with
                         ``subprocess=False``.
      :type affinities: int, str, or sequence of int, optional
      :param raise_errors: Whether to raise errors if they occur during growing.
      :type raise_errors: bool, optional
      :param debugging: Whether to set the logging level to debug.
      :type debugging: bool, optional
      :param verbosity: How much information to show when growing. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being grown.
      :type verbosity: int, optional
      :param verbosity_grow: How much information to show when growing each batch.
      :type verbosity_grow: int, optional
      :param log: Whether to save subprocess output to log files. Only used
                  when ``subprocess=True``.
      :type log: bool, optional
      :param desc: Description to show in the progress bar when growing.
      :type desc: str, optional
      :param \*\*combo_runner_opts: Additional options forwarded to either :meth:`Crop.grow_subprocess`
                                    (``min_wait``, ``max_wait``, ...) when ``subprocess`` is True, or
                                    to ``combo_runner_core`` (``executor``, ``parallel``, ...) when
                                    ``subprocess`` is False.
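
      The interaction between ``subprocess="auto"`` and the subprocess-only
      options described above can be summarized by the following sketch. The
      helper ``resolve_subprocess`` is hypothetical and only restates the
      documented rules; it is not the method's actual code.

      .. code-block:: python

         def resolve_subprocess(subprocess="auto", num_threads=None,
                                gpus=None, affinities=None):
             """Decide whether to grow in subprocess mode (illustrative)."""
             # Compare against None so that GPU or core ID 0 still counts.
             needs_subprocess = any(
                 opt is not None for opt in (num_threads, gpus, affinities)
             )
             if subprocess == "auto":
                 return needs_subprocess
             if not subprocess and needs_subprocess:
                 raise ValueError(
                     "num_threads, gpus and affinities require subprocess mode."
                 )
             return bool(subprocess)


         print(resolve_subprocess())
         print(resolve_subprocess(gpus="0,0,1,1"))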


   .. py:method:: grow_missing(**combo_runner_opts)

      Grow any missing results using this process.


   .. py:method:: reap_combos(wait=False, clean_up=None, allow_incomplete=False, verbosity=1, desc='Reap')

      Reap already sown and grown results from this crop.

      :param wait: Whether to wait for results to appear. If false (default) all
                   results need to be in place before the reap.
      :type wait: bool, optional
      :param clean_up: Whether to delete all the batch files once the results have been
                       gathered. If left as ``None`` this will be automatically set to
                       ``not allow_incomplete``.
      :type clean_up: bool, optional
      :param allow_incomplete: Allow only partially completed crop results to be reaped,
                               incomplete results will all be filled-in as nan.
      :type allow_incomplete: bool, optional
      :param verbosity: How much information to show when reaping. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being reaped.
      :type verbosity: int, optional
      :param desc: Description to show in the progress bar when reaping.
      :type desc: str, optional

      :returns: **results** -- 'N-dimensional' tuple containing the results.
      :rtype: nested tuple


   .. py:method:: reap_combos_to_ds(var_names=None, var_dims=None, var_coords=None, constants=None, attrs=None, parse=True, wait=False, clean_up=None, allow_incomplete=False, to_df=False, verbosity=1, desc='Reap')

      Reap a function over sowed combinations and output to a Dataset.

      :param var_names: Variable name(s) of the output(s) of `fn`, set to None if
                        fn outputs data already labeled in a Dataset or DataArray.
      :type var_names: str, sequence of strings, or None
      :param var_dims: 'Internal' names of dimensions for each variable, the values for
                       each dimension should be contained as a mapping in either
                       `var_coords` (not needed by `fn`) or `constants` (needed by `fn`).
      :type var_dims: sequence of either strings or string sequences, optional
      :param var_coords: Mapping of extra coords the output variables may depend on.
      :type var_coords: mapping, optional
      :param constants: Arguments to `fn` which are not iterated over, these will be
                        recorded either as attributes or coordinates if they are named
                        in `var_dims`.
      :type constants: mapping, optional
      :param resources: Like `constants` but they will not be recorded.
      :type resources: mapping, optional
      :param attrs: Any extra attributes to store.
      :type attrs: mapping, optional
      :param wait: Whether to wait for results to appear. If false (default) all
                   results need to be in place before the reap.
      :type wait: bool, optional
      :param clean_up: Whether to delete all the batch files once the results have been
                       gathered. If left as ``None`` this will be automatically set to
                       ``not allow_incomplete``.
      :type clean_up: bool, optional
      :param allow_incomplete: Allow only partially completed crop results to be reaped,
                               incomplete results will all be filled-in as nan.
      :type allow_incomplete: bool, optional
      :param to_df: If True, reap to a ``pandas.DataFrame`` rather than an ``xarray.Dataset``.
      :type to_df: bool, optional
      :param verbosity: How much information to show when reaping. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being reaped.
      :type verbosity: int, optional
      :param desc: Description to show in the progress bar when reaping.
      :type desc: str, optional

      :returns: Multidimensional labeled dataset containing all the results.
      :rtype: xarray.Dataset or pandas.DataFrame


   .. py:method:: reap_runner(runner, wait=False, clean_up=None, allow_incomplete=False, to_df=False, verbosity=1, desc='Reap', **kwargs)

      Reap a Crop over sowed combos and save to a dataset defined by a
      :class:`~xyzpy.Runner`.


   .. py:method:: reap_harvest(harvester, wait=False, sync=True, overwrite=None, clean_up=None, allow_incomplete=False, verbosity=1, desc='Reap', **kwargs)

      Reap a Crop over sowed combos and merge with the dataset defined by
      a :class:`~xyzpy.Harvester`.


   .. py:method:: reap_samples(sampler, wait=False, sync=True, clean_up=None, allow_incomplete=False, verbosity=1, desc='Reap', **kwargs)

      Reap a Crop over sowed combos and merge with the dataframe defined
      by a :class:`~xyzpy.Sampler`.


   .. py:method:: reap(wait=False, sync=True, overwrite=None, clean_up=None, allow_incomplete=False, verbosity=1, desc='Reap')

      Reap sown and grown combos from disk. Return a dataset if a runner
      or harvester is set, otherwise, the raw nested tuple.

      :param wait: Whether to wait for results to appear. If false (default) all
                   results need to be in place before the reap.
      :type wait: bool, optional
      :param sync: Immediately sync the new dataset with the on-disk full dataset or
                   dataframe if a harvester or sampler is used.
      :type sync: bool, optional
      :param overwrite: How to compare data when syncing to the on-disk dataset.
                        If ``None`` (default), merge as long as there are no conflicts.
                        If ``True``, overwrite with the new data. If ``False``, discard
                        any new conflicting data.
      :type overwrite: bool, optional
      :param clean_up: Whether to delete all the batch files once the results have been
                       gathered. If left as ``None`` this will be automatically set to
                       ``not allow_incomplete``.
      :type clean_up: bool, optional
      :param allow_incomplete: Allow only partially completed crop results to be reaped,
                               incomplete results will all be filled-in as nan.
      :type allow_incomplete: bool, optional
      :param verbosity: How much information to show when reaping. 0: no output, 1:
                        progress bar, 2: progress bar with each setting being reaped.
      :type verbosity: int, optional
      :param desc: Description to show in the progress bar when reaping.
      :type desc: str, optional

      :rtype: nested tuple or xarray.Dataset


   .. py:method:: check_bad(delete_bad=True)

      Check that the result dumps are not bad, since sometimes the length of a
      result does not match its batch. Optionally delete these so that they can
      be re-grown.

      :param delete_bad: Delete bad results as they are encountered.
      :type delete_bad: bool

      :returns: **bad_ids** -- The bad batch numbers.
      :rtype: tuple


   .. py:method:: _get_fn()


   .. py:method:: _set_fn(fn)


   .. py:method:: _del_fn()


   .. py:attribute:: fn


   .. py:property:: num_sown_batches

      Total number of batches to be run/grown.


   .. py:property:: num_results


.. py:function:: load_crops(directory='.')

   Automatically load all the crops found in the given directory.

   :param directory: Which directory to load the crops from, defaults to '.', the current directory.
   :type directory: str, optional

   :returns: Mapping of the crop name to the Crop.
   :rtype: dict[str, Crop]


.. py:class:: Sower(crop)

   Bases: :py:obj:`object`


   Class for sowing a 'crop' of batched combos to then 'grow' (on any
   number of workers sharing the filesystem) and then reap.


   .. py:attribute:: crop


   .. py:attribute:: _batch_cases
      :value: []


   .. py:attribute:: _counter
      :value: 0


   .. py:attribute:: _batch_counter
      :value: 0


   .. py:method:: save_batch()

      Save the current batch of cases to disk and start the next batch.


   .. py:method:: __enter__()


   .. py:method:: __call__(**kwargs)


   .. py:method:: __exit__(exception_type, exception_value, traceback)


.. py:function:: grow(batch_number, crop=None, fn=None, num_workers=None, check_mpi=True, verbosity=2, debugging=False, raise_errors=True)

   Automatically process a batch of cases into results. Should be run in an
   ".xyz-{fn_name}" folder, or `crop` should be specified.

   :param batch_number: Which batch to 'grow' into a set of results.
   :type batch_number: int
   :param crop: Description of where and how to store the cases and results.
   :type crop: xyzpy.Crop
   :param fn: If specified, the function used to generate the results, otherwise
              the function will be loaded from disk.
   :type fn: callable, optional
   :param num_workers: If specified, grow using a pool of this many workers. This uses
                       ``joblib.externals.loky`` to spawn processes.
   :type num_workers: int, optional
   :param check_mpi: Whether to check if the process is rank 0 and only save results if
                     so - allows mpi functions to be simply used. Defaults to true,
                     this should only be turned off if e.g. a pool of workers is being
                     used to run different ``grow`` instances.
   :type check_mpi: bool, optional
   :param verbosity: How much information to show.
   :type verbosity: {0, 1, 2}, optional
   :param debugging: Set logging level to DEBUG.
   :type debugging: bool, optional
   :param raise_errors: Whether to raise errors that occur during the computation. If growing
                        many batches in parallel, it can be useful to set this to False so
                        a single error doesn't crash the whole process.
   :type raise_errors: bool, optional


.. py:class:: Reaper(crop, num_batches, wait=False, default_result=None)

   Bases: :py:obj:`object`


   Class that acts as a stateful function to retrieve already sown and
   grown results.


   .. py:attribute:: crop


   .. py:attribute:: results


   .. py:method:: __call__(**kwargs)


   .. py:method:: check_finished()


.. py:data:: _SGE_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#!/bin/bash -l
      #$ -S /bin/bash
      #$ -N {name}
      #$ -l h_rt={hours}:{minutes}:{seconds},mem={gigabytes}G
      #$ -l tmpfs={temp_gigabytes}G
      mkdir -p {output_directory}
      #$ -wd {output_directory}
      #$ -pe {pe} {num_procs}
      {header_options}
      """

   .. raw:: html

      </details>


.. py:data:: _SGE_ARRAY_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#$ -t {run_start}-{run_stop}
      """

   .. raw:: html

      </details>


.. py:data:: _PBS_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#!/bin/bash -l
      #PBS -N {name}
      #PBS -lselect={num_nodes}:ncpus={num_procs}:mem={gigabytes}gb
      #PBS -lwalltime={hours:02}:{minutes:02}:{seconds:02}
      {header_options}
      """

   .. raw:: html

      </details>


.. py:data:: _PBS_ARRAY_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#PBS -J {run_start}-{run_stop}
      """

   .. raw:: html

      </details>


.. py:data:: _SLURM_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#!/bin/bash -l
      #SBATCH --job-name={name}
      #SBATCH --time={hours:02}:{minutes:02}:{seconds:02}
      {header_options}
      """

   .. raw:: html

      </details>
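
   The ``{hours:02}`` style fields use Python's format specification to
   zero-pad integers to two digits. As a minimal sketch, the template shown
   above can be filled with ``str.format``. The values passed here, including
   the ``--mem`` line used for ``header_options``, are arbitrary examples.

   .. code-block:: python

      # Template copied from the value shown above.
      SLURM_HEADER = """#!/bin/bash -l
      #SBATCH --job-name={name}
      #SBATCH --time={hours:02}:{minutes:02}:{seconds:02}
      {header_options}
      """.replace("\n      ", "\n")  # no-op unless this snippet is pasted indented

      header = SLURM_HEADER.format(
          name="demo",
          hours=1,
          minutes=5,
          seconds=0,
          header_options="#SBATCH --mem=4G",  # example extra option
      )
      print(header)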


.. py:data:: _SLURM_ARRAY_HEADER
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """#SBATCH --array={run_start}-{run_stop}
      """

   .. raw:: html

      </details>

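   The header values are ordinary ``str.format`` templates, so fields such as
   ``{hours:02}`` are zero-padded to two digits when filled in. The sketch
   below copies the two slurm templates shown above and fills them with
   made-up values. Joining the job header and the array header in this order
   is an assumption about how the pieces are combined, not something this
   page states.

   .. code-block:: python

      # Template text copied from the documented values of _SLURM_HEADER and
      # _SLURM_ARRAY_HEADER, redefined here so the example is self-contained.
      SLURM_HEADER = (
          "#!/bin/bash -l\n"
          "#SBATCH --job-name={name}\n"
          "#SBATCH --time={hours:02}:{minutes:02}:{seconds:02}\n"
          "{header_options}\n"
      )
      SLURM_ARRAY_HEADER = "#SBATCH --array={run_start}-{run_stop}\n"

      # Illustrative values only (assumptions, not library defaults).
      header = (SLURM_HEADER + SLURM_ARRAY_HEADER).format(
          name="my-crop",
          hours=1,
          minutes=5,
          seconds=0,
          header_options="#SBATCH --mem=4G",
          run_start=1,
          run_stop=10,
      )
      print(header)

   With these values the time line reads ``#SBATCH --time=01:05:00``.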

.. py:data:: _BASE
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """echo 'XYZPY script starting...'
      cd {working_directory}
      export OMP_NUM_THREADS={num_threads}
      export MKL_NUM_THREADS={num_threads}
      export OPENBLAS_NUM_THREADS={num_threads}
      export NUMBA_NUM_THREADS={num_threads}
      {shell_setup}
      read -r -d '' SCRIPT << EOM
      {setup}
      from xyzpy.gen.cropping import grow, Crop
      if __name__ == '__main__':
          crop = Crop(name='{name}', parent_dir='{parent_dir}')
          print('Growing:', repr(crop))
          grow_kwargs = dict(
              num_workers={num_workers},
              subprocess={subprocess},
              debugging={debugging},
              verbosity_grow=2,
          )
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_SGE_GROW_ALL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    crop.grow($SGE_TASK_ID, **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_PBS_GROW_ALL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    crop.grow($PBS_ARRAY_INDEX, **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_SLURM_GROW_ALL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    crop.grow($SLURM_ARRAY_TASK_ID, **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_SGE_GROW_PARTIAL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    batch_ids = {batch_ids}
          crop.grow(batch_ids[$SGE_TASK_ID - 1], **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_PBS_GROW_PARTIAL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    batch_ids = {batch_ids}
          crop.grow(batch_ids[$PBS_ARRAY_INDEX - 1], **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _CLUSTER_SLURM_GROW_PARTIAL_SCRIPT
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    batch_ids = {batch_ids}
          crop.grow(batch_ids[$SLURM_ARRAY_TASK_ID - 1], **grow_kwargs)
      """

   .. raw:: html

      </details>

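   The ``- 1`` in the partial templates suggests that array task indices are
   counted from 1, while Python sequences are indexed from 0. A minimal sketch
   of that lookup follows, with the scheduler's task ID replaced by a plain
   integer. The helper name ``batch_for_task`` and the batch numbers are
   hypothetical and only serve the illustration.

   .. code-block:: python

      # Hypothetical subset of batches to grow (illustrative values).
      batch_ids = (3, 7, 12)


      def batch_for_task(task_id, batch_ids):
          """Map a 1-based array task index to the batch number it should grow."""
          return batch_ids[task_id - 1]


      # Assumes the array job was requested with indices 1..len(batch_ids).
      assignments = {
          task_id: batch_for_task(task_id, batch_ids)
          for task_id in range(1, len(batch_ids) + 1)
      }
      print(assignments)

   Each array task therefore grows exactly one entry of ``batch_ids``.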

.. py:data:: _BASE_CLUSTER_GROW_SINGLE
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """    grow_kwargs['verbosity_grow'] = 0
          batch_ids = {batch_ids}
          crop.grow(batch_ids, **grow_kwargs)
      """

   .. raw:: html

      </details>


.. py:data:: _BASE_CLUSTER_SCRIPT_END
   :value: Multiline-String

   .. raw:: html

      <details><summary>Show Value</summary>

   .. code-block:: python

      """EOM
      {launcher} -c "$SCRIPT"
      echo 'XYZPY script finished'
      """

   .. raw:: html

      </details>


.. py:function:: gen_cluster_script(crop, scheduler, batch_ids=None, *, mode='array', num_procs=None, num_threads=None, num_nodes=None, num_workers=None, subprocess=False, mem=None, mem_per_cpu=None, gigabytes=None, time=None, hours=None, minutes=None, seconds=None, conda_env=True, launcher=None, setup='#', shell_setup='', mpi=False, temp_gigabytes=1, output_directory=None, debugging=False, **kwargs)

   Generate a cluster script to grow a Crop.

   :param crop: The crop to grow.
   :type crop: Crop
   :param scheduler: Whether to use an SGE, PBS or slurm submission script template.
   :type scheduler: {'sge', 'pbs', 'slurm'}
   :param batch_ids: Which batch numbers to grow, defaults to all missing batches.
   :type batch_ids: int or tuple[int]
   :param mode: How to distribute the batches, either as an array job with a single
                batch per job, or as a single job processing batches in parallel.
   :type mode: {'array', 'single'}
   :param hours: How many hours to request, default: 0.
   :type hours: int, optional
   :param minutes: How many minutes to request, default: 20.
   :type minutes: int, optional
   :param seconds: How many seconds to request, default: 0.
   :type seconds: int, optional
   :param gigabytes: How much memory to request, default: 2.
   :type gigabytes: int, optional
   :param num_procs: How many processes to request (threaded cores or MPI), default: 1.
   :type num_procs: int, optional
   :param num_threads: How many threads to use per process. Will be computed automatically
                       based on ``num_procs`` and ``num_workers`` if not specified.
   :type num_threads: int, optional
   :param num_workers: How many workers to use for parallel growing, default is sequential. If
                       specified, then generally ``num_workers * num_threads == num_procs``.
   :type num_workers: int, optional
   :param subprocess: Whether to use a fresh subprocess for each batch, default: False.
   :type subprocess: bool, optional
   :param num_nodes: How many nodes to request, default: 1.
   :type num_nodes: int, optional
   :param conda_env: Whether to activate a conda environment before running the script.
                     If ``True``, the environment will be the same as the one used to
                     launch the script. If a string, the environment will be the one
                     specified by the string.
   :type conda_env: bool or str, optional
   :param launcher: How to launch the script, default: the current Python interpreter.
                    Could, for example, be ``'mpiexec python'`` for an MPI program.
   :type launcher: str, optional
   :param setup: Python script to run before growing, for things that shouldn't be put
                 in the crop function itself, e.g. one-time imports with side-effects
                 like: ``"import tensorflow as tf; tf.enable_eager_execution()"``.
   :type setup: str, optional
   :param shell_setup: Commands to be run by the shell before the python script is executed.
   :type shell_setup: str, optional
   :param mpi: Request MPI processes instead of threaded processes.
   :type mpi: bool, optional
   :param temp_gigabytes: How much temporary on-disk storage to request, in gigabytes.
   :type temp_gigabytes: int, optional
   :param output_directory: What directory to write output to. Defaults to "$HOME/Scratch/output".
   :type output_directory: str, optional
   :param debugging: Set the python log level to debugging.
   :type debugging: bool, optional
   :param kwargs: Extra keyword arguments are taken to be extra resources to request
                  in the header of the submission script, e.g. ``{'gpu': 1}`` will
                  add ``"#SBATCH --gpu=1"`` to the header if using slurm. If you supply
                  literal ``True`` or ``None`` as the value, then the key will be treated
                  as a flag. E.g. ``{'requeue': None}`` will add ``"#SBATCH --requeue"``
                  to the header.
   :type kwargs: dict, optional

   :rtype: str
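
   The mapping from extra keyword arguments to header lines described for
   ``kwargs`` can be sketched as follows. ``render_slurm_options`` is a
   hypothetical helper written for this illustration, not part of the module,
   and it only covers the slurm case mentioned above.

   .. code-block:: python

      def render_slurm_options(**kwargs):
          """Hypothetical helper: turn extra keyword arguments into #SBATCH lines."""
          lines = []
          for key, value in kwargs.items():
              if value is None or value is True:
                  # Bare flag, e.g. requeue=None -> "#SBATCH --requeue"
                  lines.append(f"#SBATCH --{key}")
              else:
                  # Key/value resource, e.g. gpu=1 -> "#SBATCH --gpu=1"
                  lines.append(f"#SBATCH --{key}={value}")
          return "\n".join(lines)


      header_options = render_slurm_options(gpu=1, requeue=None)
      print(header_options)

   The identity checks (``is None``, ``is True``) keep an integer value such
   as ``1`` from being mistaken for a flag.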


.. py:function:: grow_cluster(crop, scheduler, batch_ids=None, *, hours=None, minutes=None, seconds=None, gigabytes=2, num_nodes=1, num_procs=1, num_threads=None, num_workers=None, subprocess=False, conda_env=True, launcher=None, setup='#', shell_setup='', mpi=False, temp_gigabytes=1, output_directory=None, debugging=False, **kwargs)

   Automagically submit SGE, PBS, or slurm jobs to grow all missing
   results.

   :param crop: The crop to grow.
   :type crop: Crop
   :param scheduler: Whether to use an SGE, PBS or slurm submission script template.
   :type scheduler: {'sge', 'pbs', 'slurm'}
   :param batch_ids: Which batch numbers to grow, defaults to all missing batches.
   :type batch_ids: int or tuple[int]
   :param hours: How many hours to request, default: 0.
   :type hours: int, optional
   :param minutes: How many minutes to request, default: 20.
   :type minutes: int, optional
   :param seconds: How many seconds to request, default: 0.
   :type seconds: int, optional
   :param gigabytes: How much memory to request, default: 2.
   :type gigabytes: int, optional
   :param num_nodes: How many nodes to request, default: 1.
   :type num_nodes: int, optional
   :param num_procs: How many processes to request (threaded cores or MPI), default: 1.
   :type num_procs: int, optional
   :param num_threads: How many threads to use per process. Will be computed automatically
                       based on ``num_procs`` and ``num_workers`` if not specified.
   :type num_threads: int, optional
   :param num_workers: How many workers to use for parallel growing, default is sequential. If
                       specified, then generally ``num_workers * num_threads == num_procs``.
   :type num_workers: int, optional
   :param subprocess: Whether to use a fresh subprocess for each batch, default: False.
   :type subprocess: bool, optional
   :param conda_env: Whether to activate a conda environment before running the script.
                     If ``True``, the environment will be the same as the one used to
                     launch the script. If a string, the environment will be the one
                     specified by the string.
   :type conda_env: bool or str, optional
   :param launcher: How to launch the script, default: the current Python interpreter.
                    Could, for example, be ``'mpiexec python'`` for an MPI program.
   :type launcher: str, optional
   :param setup: Python script to run before growing, for things that shouldn't be put
                 in the crop function itself, e.g. one-time imports with side-effects
                 like: ``"import tensorflow as tf; tf.enable_eager_execution()"``.
   :type setup: str, optional
   :param shell_setup: Commands to be run by the shell before the python script is executed.
                       E.g. ``conda activate my_env``.
   :type shell_setup: str, optional
   :param mpi: Request MPI processes not threaded processes.
   :type mpi: bool, optional
   :param temp_gigabytes: How much temporary on-disk storage to request, in gigabytes.
   :type temp_gigabytes: int, optional
   :param output_directory: What directory to write output to. Defaults to "$HOME/Scratch/output".
   :type output_directory: str, optional
   :param debugging: Set the python log level to debugging.
   :type debugging: bool, optional
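
   The relation ``num_workers * num_threads == num_procs`` can be illustrated
   with a small sketch. ``infer_num_threads`` is a hypothetical helper that
   shows one plausible way to derive a thread count when ``num_threads`` is
   not given. The rule actually applied by ``grow_cluster`` is not documented
   here and may differ.

   .. code-block:: python

      def infer_num_threads(num_procs=1, num_workers=None):
          """Hypothetical rule: split the requested processes evenly among workers."""
          if num_workers is None:
              # Sequential growing: the single process may use every requested core.
              return num_procs
          # At least one thread per worker, even if workers outnumber processes.
          return max(1, num_procs // num_workers)


      num_procs, num_workers = 8, 4
      num_threads = infer_num_threads(num_procs, num_workers)
      print(num_threads)

   Under this assumed rule, 8 processes shared by 4 workers give 2 threads per
   worker, which satisfies the relation above.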


.. py:function:: gen_qsub_script(crop, batch_ids=None, *, scheduler='sge', **kwargs)

   Generate a qsub script to grow a Crop. Deprecated in favor of
   `gen_cluster_script` and will be removed in the future.

   :param crop: The crop to grow.
   :type crop: Crop
   :param batch_ids: Which batch numbers to grow, defaults to all missing batches.
   :type batch_ids: int or tuple[int]
   :param scheduler: Whether to use an SGE or PBS submission script template.
   :type scheduler: {'sge', 'pbs'}, optional
   :param kwargs: See `gen_cluster_script` for all other parameters.


.. py:function:: qsub_grow(crop, batch_ids=None, *, scheduler='sge', **kwargs)

   Automagically submit SGE or PBS jobs to grow all missing results.
   Deprecated in favor of `grow_cluster` and will be removed in the future.

   :param crop: The crop to grow.
   :type crop: Crop
   :param batch_ids: Which batch numbers to grow, defaults to all missing batches.
   :type batch_ids: int or tuple[int]
   :param scheduler: Whether to use an SGE or PBS submission script template.
   :type scheduler: {'sge', 'pbs'}, optional
   :param kwargs: See `grow_cluster` for all other parameters.


.. py:function:: clean_slurm_outputs(job, directory='.', cancel_if_finished=True)

.. py:function:: manage_slurm_outputs(crop, job, wait_time=60)