xyzpy.gen.case_runner

Functions for systematically evaluating a function over specific cases.

Module Contents

Functions

case_runner(fn, fn_args, cases[, combos, constants, ...])

Simple case runner that outputs the raw tuple of results.

case_runner_to_ds(fn, fn_args, cases, var_names[, ...])

Takes a list of cases to run fn over, possibly in parallel, and outputs an xarray.Dataset.

is_case_missing(ds, setting[, method])

Does the dataset or dataarray ds not contain any non-null data for location setting?

find_missing_cases(ds[, ignore_dims, method, show_progbar])

Find all cases in a dataset with missing data.

parse_into_cases([combos, cases, ds, method])

Convert optional combos and optional cases to a single list of cases only.

Attributes

xyzpy.gen.case_runner.case_runner(fn, fn_args, cases, combos=None, constants=None, split=False, shuffle=False, parse=True, parallel=False, executor=None, num_workers=None, verbosity=1)

Simple case runner that outputs the raw tuple of results.

Parameters:
  • fn (callable) – Function with which to evaluate the cases.

  • fn_args (tuple) – Names of the case arguments that fn takes. Can be None if each case is a dict.

  • cases (iterable[tuple] or iterable[dict]) – List of specific configurations that fn_args should take. If fn_args is None, each case should be a dict.

  • combos (dict_like[str, iterable], optional) – Optional specification of sub-combinations.

  • constants (dict, optional) – Constant function arguments.

  • split (bool, optional) – See combo_runner().

  • shuffle (bool or int, optional) – If given, compute the results in a random order (using random.seed and random.shuffle), which can be helpful for distributing resources when not all cases are computationally equal.

  • parse (bool, optional) – Whether to perform parsing of the input arguments.

  • parallel (bool, optional) – Process combos in parallel, with a default number of workers.

  • executor (executor-like pool, optional) – Submit all combos to this pool executor. Must have submit or apply_async methods and API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also supported.

  • num_workers (int, optional) – Explicitly choose how many workers to use, None for automatic.

  • verbosity ({0, 1, 2}, optional) –

    How much information to display:

    • 0: nothing,

    • 1: just progress,

    • 2: all information.

Returns:

results – The output of fn for each case.

Return type:

list
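
To make the roles of fn_args, cases, constants and executor concrete, here is a minimal plain-Python sketch of the behaviour described above. It is not xyzpy's implementation: run_cases_sketch and power are hypothetical names used only for illustration, and the sketch assumes that each tuple case is matched positionally against fn_args and that constants are passed to every call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_cases_sketch(fn, fn_args, cases, constants=None, executor=None):
    """Hypothetical stand-in for the documented behaviour, not xyzpy code."""
    constants = dict(constants or {})
    if fn_args is None:
        # Each case is already a dict of keyword arguments.
        kwargs_list = [dict(case) for case in cases]
    else:
        # Each case is a tuple whose entries follow the order of fn_args.
        kwargs_list = [dict(zip(fn_args, case)) for case in cases]
    if executor is None:
        return [fn(**kws, **constants) for kws in kwargs_list]
    # An executor only needs a concurrent.futures-style submit method here.
    futures = [executor.submit(fn, **kws, **constants) for kws in kwargs_list]
    return [future.result() for future in futures]


def power(base, exponent, offset=0):
    return base ** exponent + offset


cases = [(2, 3), (3, 2), (10, 0)]
serial = run_cases_sketch(
    power, ("base", "exponent"), cases, constants={"offset": 1}
)
with ThreadPoolExecutor(max_workers=2) as pool:
    pooled = run_cases_sketch(
        power, ("base", "exponent"), cases, constants={"offset": 1}, executor=pool
    )
```

In this sketch the results come back in the same order as the cases, whether they are computed serially or submitted to the pool.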

xyzpy.gen.case_runner.case_runner_to_ds(fn, fn_args, cases, var_names, var_dims=None, var_coords=None, combos=None, constants=None, resources=None, attrs=None, shuffle=False, to_df=False, parse=True, parallel=False, num_workers=None, executor=None, verbosity=1)

Takes a list of cases to run fn over, possibly in parallel, and outputs an xarray.Dataset.

Parameters:
  • fn (callable) – Function to evaluate.

  • fn_args (str or iterable[str]) – Names and order of arguments to fn. Can be None if cases are supplied as dicts.

  • cases (iterable[tuple] or iterable[dict]) – List of configurations used to generate results.

  • var_names (str or iterable of str) – Variable name(s) of the output(s) of fn.

  • var_dims (sequence of either strings or string sequences, optional) – ‘Internal’ names of dimensions for each variable. The values for each dimension should be contained as a mapping in either var_coords (not needed by fn) or constants (needed by fn).

  • var_coords (mapping, optional) – Mapping of extra coords the output variables may depend on.

  • combos (dict_like[str, iterable], optional) – If specified, run all combinations of some arguments in these mappings.

  • constants (mapping, optional) – Arguments to fn which are not iterated over. These will be recorded as attributes, or as coordinates if they are named in var_dims.

  • resources (mapping, optional) – Like constants but they will not be recorded.

  • attrs (mapping, optional) – Any extra attributes to store.

  • shuffle (bool or int, optional) – If given, compute the results in a random order (using random.seed and random.shuffle), which can be helpful for distributing resources when not all cases are computationally equal.

  • parse (bool, optional) – Whether to perform parsing of the input arguments.

  • parallel (bool, optional) – Process combos in parallel, with a default number of workers.

  • executor (executor-like pool, optional) – Submit all combos to this pool executor. Must have submit or apply_async methods and API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also supported.

  • num_workers (int, optional) – Explicitly choose how many workers to use, None for automatic.

  • verbosity ({0, 1, 2}, optional) –

    How much information to display:

    • 0: nothing,

    • 1: just progress,

    • 2: all information.

Returns:

ds – Dataset with minimal covering coordinates and all cases evaluated.

Return type:

xarray.Dataset
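
The phrase "minimal covering coordinates" can be illustrated without xarray. The sketch below assumes it means that each argument's coordinate holds the sorted unique values that argument takes across all cases; that reading, and the helper name covering_coords, are assumptions rather than anything taken from xyzpy's source.

```python
def covering_coords(fn_args, cases):
    """Collect the sorted unique values each argument takes across all cases."""
    values = {arg: set() for arg in fn_args}
    for case in cases:
        for arg, value in zip(fn_args, case):
            values[arg].add(value)
    return {arg: sorted(vals) for arg, vals in values.items()}


fn_args = ("n", "label")
cases = [(1, "a"), (3, "b"), (1, "b")]
coords = covering_coords(fn_args, cases)

# Compare the grid spanned by the coordinates with the points the cases fill.
grid_size = 1
for vals in coords.values():
    grid_size *= len(vals)
unfilled = grid_size - len(set(cases))
```

Under this assumption, three cases over two arguments span a 2 x 2 grid with one point left unfilled. Unfilled points of this kind are presumably what the missing-data helpers in this module are meant to detect.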

xyzpy.gen.case_runner.case_runner_to_df

xyzpy.gen.case_runner.is_case_missing(ds, setting, method='isnull')

Check whether the Dataset or DataArray ds contains no non-null data at the location setting.

Note that this returns True only if the data is missing at that location in every variable.

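Parameters:
  • ds (xarray.Dataset or xarray.DataArray) – Dataset or DataArray in which to check for data.

  • setting – Location in ds to check.

Returns: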

missing

Return type:

bool
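
The "all variables" rule above is easy to misread, so here is a small plain-Python model of it. It uses nested dicts in place of an xarray object, and the names is_null and is_location_missing are hypothetical; it only illustrates the rule and says nothing about how xyzpy checks for nulls internally.

```python
import math


def is_null(value):
    """Treat None and NaN as missing data."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_location_missing(variables, location):
    """True only if every variable is null (or absent) at `location`."""
    return all(is_null(values.get(location)) for values in variables.values())


variables = {
    "energy": {(1, "a"): 0.5, (3, "a"): float("nan")},
    "runtime": {(1, "a"): None, (3, "a"): None},
}

partly_filled = is_location_missing(variables, (1, "a"))
fully_missing = is_location_missing(variables, (3, "a"))
never_stored = is_location_missing(variables, (3, "b"))
```

Location (1, "a") has a value for energy, so it does not count as missing even though runtime is null there. The other two locations are null or absent in every variable and so count as missing.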

xyzpy.gen.case_runner.find_missing_cases(ds, ignore_dims=None, method='isnull', show_progbar=False)

Find all cases in a dataset with missing data.

Parameters:
  • ds (xarray.Dataset) – Dataset in which to find missing data.

  • ignore_dims (set, optional) – Internal variable dimensions, i.e. dimensions to ignore.

  • show_progbar (bool, optional) – Show the current progress.

Returns:

missing_fn_args, missing_cases – Function arguments and missing cases.
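
As a rough illustration of the returned pair, the sketch below scans a grid of coordinates and collects the locations that have no data. The dict-based storage and the name find_missing_sketch are assumptions made for this example; the real function operates on an xarray.Dataset.

```python
import itertools


def find_missing_sketch(coords, results):
    """Return the argument names and the grid locations that have no data."""
    fn_args = tuple(coords)
    missing = tuple(
        location
        for location in itertools.product(*coords.values())
        if results.get(location) is None
    )
    return fn_args, missing


coords = {"n": [1, 3], "label": ["a", "b"]}
results = {(1, "a"): 0.5, (1, "b"): 0.7, (3, "b"): None}
missing_fn_args, missing_cases = find_missing_sketch(coords, results)
```

The two values have the same shape as the fn_args and cases parameters documented for case_runner, so a pair like this could presumably be passed back in to compute only the missing entries.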

xyzpy.gen.case_runner.parse_into_cases(combos=None, cases=None, ds=None, method='isnull')

Convert optional combos and optional cases into a single list of cases only, optionally also filtering based on whether any data at each location is already present in the Dataset or DataArray ds.

Parameters:
  • combos (dict_like[str, iterable], optional) – Parameter combinations.

  • cases (iterable[dict], optional) – Parameter configurations.

  • ds (xarray.Dataset or xarray.DataArray, optional) – Dataset or DataArray in which to check for existing data.

Returns:

new_cases – The combined and possibly filtered list of cases.

Return type:

iterable[dict]
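
The following plain-Python sketch shows one way the combination and filtering described above could work. It assumes that every case is merged with every combination drawn from combos, which is one reading of "sub-combinations" and not something confirmed here. The name parse_into_cases_sketch and the existing set (standing in for the check against ds) are hypothetical.

```python
import itertools


def parse_into_cases_sketch(combos=None, cases=None, existing=None):
    """Merge each case with every combination from combos, then filter."""
    combo_cases = [{}]
    if combos:
        names = list(combos)
        combo_cases = [
            dict(zip(names, values))
            for values in itertools.product(*combos.values())
        ]
    base_cases = list(cases) if cases else [{}]
    merged = [{**case, **sub} for case in base_cases for sub in combo_cases]
    if existing is not None:
        # Drop any case whose location already holds data.
        merged = [c for c in merged if frozenset(c.items()) not in existing]
    return merged


cases = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]
combos = {"seed": [0, 1]}
all_cases = parse_into_cases_sketch(combos=combos, cases=cases)

already_done = {frozenset({"a": 1, "b": "x", "seed": 0}.items())}
new_cases = parse_into_cases_sketch(
    combos=combos, cases=cases, existing=already_done
)
```

Here two cases and one two-valued combo give four merged cases, and the filter removes the one that is already marked as computed.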