xyzpy.gen.case_runner

Functions for systematically evaluating a function over specific cases.

Attributes

case_runner_to_df

Functions

case_runner(fn, fn_args, cases[, combos, constants, ...])

Simple case runner that outputs the raw tuple of results.

case_runner_to_ds(fn, fn_args, cases, var_names[, ...])

Takes a list of cases to run fn over, possibly in parallel, and outputs an xarray.Dataset.

is_case_missing(ds, setting[, method])

Check whether the Dataset or DataArray ds contains no non-null data at the single location setting.

parse_into_cases([combos, cases, ds, method])

Convert optional combos and optional cases into a single list of cases.

find_missing_cases(ds[, ignore_dims, method])

Find all cases in a dataset or DataArray with missing data.

Module Contents

xyzpy.gen.case_runner.case_runner(fn, fn_args, cases, combos=None, constants=None, split=False, shuffle=False, parse=True, parallel=False, executor=None, num_workers=None, verbosity=1)[source]

Simple case runner that outputs the raw tuple of results.

Parameters:
  • fn (callable) – Function with which to evaluate the cases.

  • fn_args (tuple) – Names of the case arguments that fn takes. Can be None if each case is a dict.

  • cases (iterable[tuple] or iterable[dict]) – List of specific configurations that fn_args should take. If fn_args is None, each case should be a dict.

  • combos (dict_like[str, iterable], optional) – Optional specification of sub-combinations.

  • constants (dict, optional) – Constant function arguments.

  • split (bool, optional) – See combo_runner().

  • shuffle (bool or int, optional) – If given, compute the results in a random order (using random.seed and random.shuffle), which can be helpful for distributing resources when not all cases are computationally equal.

  • parallel (bool, optional) – Process combos in parallel, default number of workers picked.

  • executor (executor-like pool, optional) – Submit all combos to this pool executor. Must have submit or apply_async methods and API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also supported.

  • num_workers (int, optional) – Explicitly choose how many workers to use, None for automatic.

  • verbosity ({0, 1, 2}, optional) –

    How much information to display:

    • 0: nothing,

    • 1: just progress,

    • 2: all information.

Returns:

results

Return type:

list of fn output for each case
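The calling convention described above can be illustrated without the library. The following is a minimal pure-Python sketch, not xyzpy's implementation: run_cases_sketch and power are hypothetical names, and the sketch assumes that results come back in the original case order even when the evaluation order is shuffled.

```python
import random


def run_cases_sketch(fn, fn_args, cases, constants=None, shuffle=False):
    """Hypothetical stand-in for the behaviour described above (not xyzpy code)."""
    constants = dict(constants or {})

    # Normalise every case to a dict of keyword arguments.
    if fn_args is None:
        settings = [dict(case) for case in cases]
    else:
        settings = [dict(zip(fn_args, case)) for case in cases]

    # Optionally evaluate in a random but reproducible order. A private
    # random.Random instance is used here to avoid touching global state.
    order = list(range(len(settings)))
    if shuffle:
        random.Random(int(shuffle)).shuffle(order)

    # Store each result at the index of its case, so the output order
    # matches the input order (an assumption of this sketch).
    results = [None] * len(settings)
    for i in order:
        results[i] = fn(**settings[i], **constants)
    return results


def power(base, exponent, offset=0):
    return base ** exponent + offset


# Cases as tuples, named by fn_args, plus a constant argument.
tuple_results = run_cases_sketch(
    power, ("base", "exponent"), [(2, 3), (3, 2)], constants={"offset": 1}
)

# Cases as dicts, so fn_args can be None.
dict_results = run_cases_sketch(
    power, None, [{"base": 2, "exponent": 3}, {"base": 3, "exponent": 2}], shuffle=42
)
```

Both calling styles evaluate the same two cases; only the way the argument names are supplied differs.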

xyzpy.gen.case_runner.case_runner_to_ds(fn, fn_args, cases, var_names, var_dims=None, var_coords=None, combos=None, constants=None, resources=None, attrs=None, shuffle=False, to_df=False, parse=True, parallel=False, num_workers=None, executor=None, verbosity=1)[source]

Takes a list of cases to run fn over, possibly in parallel, and outputs an xarray.Dataset.

Parameters:
  • fn (callable) – Function to evaluate.

  • fn_args (str or iterable[str]) – Names and order of arguments to fn. Can be None if cases are supplied as dicts.

  • cases (iterable[tuple] or iterable[dict]) – List of configurations used to generate results.

  • var_names (str or iterable of str) – Variable name(s) of the output(s) of fn.

  • var_dims (sequence of either strings or string sequences, optional) – ‘Internal’ names of dimensions for each variable. The values for each dimension should be supplied as a mapping in either var_coords (if not needed by fn) or constants (if needed by fn).

  • var_coords (mapping, optional) – Mapping of extra coords the output variables may depend on.

  • combos (dict_like[str, iterable], optional) – If specified, run all combinations of some arguments in these mappings.

  • constants (mapping, optional) – Arguments to fn which are not iterated over. These are recorded as coordinates if they are named in var_dims, and as attributes otherwise.

  • resources (mapping, optional) – Like constants but they will not be recorded.

  • attrs (mapping, optional) – Any extra attributes to store.

  • shuffle (bool or int, optional) – If given, compute the results in a random order (using random.seed and random.shuffle), which can be helpful for distributing resources when not all cases are computationally equal.

  • parse (bool, optional) – Whether to parse the input arguments.

  • parallel (bool, optional) – Process combos in parallel, default number of workers picked.

  • executor (executor-like pool, optional) – Submit all combos to this pool executor. Must have submit or apply_async methods and API matching either concurrent.futures or an ipyparallel view. Pools from multiprocessing.pool are also supported.

  • num_workers (int, optional) – Explicitly choose how many workers to use, None for automatic.

  • verbosity ({0, 1, 2}, optional) –

    How much information to display:

    • 0: nothing,

    • 1: just progress,

    • 2: all information.

Returns:

ds – Dataset with minimal covering coordinates and all cases evaluated.

Return type:

xarray.Dataset
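One way to picture "minimal covering coordinates" is as the smallest regular grid that contains every evaluated case. The sketch below uses plain dicts rather than xarray and is only an illustration of that reading: covering_grid_sketch is a hypothetical helper, and representing unevaluated grid points as NaN is an assumption, not a statement about the library's internals.

```python
import itertools
import math


def covering_grid_sketch(cases, results):
    """Hypothetical illustration of a covering grid (not xyzpy code)."""
    # Coordinates: sorted unique values seen for each argument name.
    names = list(cases[0])
    coords = {name: sorted({case[name] for case in cases}) for name in names}

    # Start with every grid point marked missing, then fill in the
    # points that correspond to an evaluated case.
    grid = {point: math.nan for point in itertools.product(*coords.values())}
    for case, result in zip(cases, results):
        grid[tuple(case[name] for name in names)] = result
    return coords, grid


cases = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}, {"a": 3, "b": "x"}]
results = [10.0, 20.0, 30.0]
coords, grid = covering_grid_sketch(cases, results)
```

Three cases over two arguments span a 3 x 2 grid here, so half of the grid points hold no data.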

xyzpy.gen.case_runner.case_runner_to_df

xyzpy.gen.case_runner.is_case_missing(ds, setting, method='isnull')[source]

Check whether the Dataset or DataArray ds contains no non-null data at the single location setting.

Note that this returns True only if all data across all variables is completely missing at the location.

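Parameters:
  • ds (xarray.Dataset or xarray.DataArray) – Dataset or DataArray in which to look for missing data.

  • setting (dict) – The single coordinate location to check.

  • method ({"isnull", "isfinite"}, optional) – How to determine whether data is missing. “isnull” checks for null/nan values, while “isfinite” checks for all non-finite values (i.e. inf or nan).

Returns: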

missing

Return type:

bool
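The "all data across all variables" rule and the two method values from the signature can be sketched with plain Python values. This is an illustration only: is_missing_sketch is a hypothetical helper that takes the values already selected at one location, keyed by variable name, whereas the real function works on xarray objects.

```python
import math


def is_missing_sketch(values_by_var, method="isnull"):
    """Hypothetical illustration of the missing-data rule (not xyzpy code)."""
    if method == "isnull":
        # Only None and NaN count as missing.
        def absent(v):
            return v is None or (isinstance(v, float) and math.isnan(v))
    elif method == "isfinite":
        # None, NaN and +/-inf all count as missing.
        def absent(v):
            return v is None or not math.isfinite(v)
    else:
        raise ValueError(f"Unknown method: {method!r}")

    # Missing only if every value of every variable is absent.
    return all(absent(v) for values in values_by_var.values() for v in values)


nan, inf = math.nan, math.inf

all_nan = is_missing_sketch({"energy": [nan, nan], "error": [nan]})
partly_filled = is_missing_sketch({"energy": [nan], "error": [0.1]})
inf_isnull = is_missing_sketch({"energy": [inf], "error": [nan]})
inf_isfinite = is_missing_sketch({"energy": [inf], "error": [nan]}, method="isfinite")
```

A location with an inf value counts as present under "isnull" but as missing under "isfinite", which is the practical difference between the two methods.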

xyzpy.gen.case_runner.parse_into_cases(combos=None, cases=None, ds=None, method='isnull')[source]

Convert optional combos and optional cases into a single list of cases, optionally filtering them according to whether any data is already present at each location in the Dataset or DataArray ds.

Note that this only checks whether all data across all variables is completely missing at the location. To check against a single variable only, supply a DataArray instead of a Dataset, e.g. ds=ds["var_name"].

Parameters:
  • combos (dict_like[str, iterable], optional) – Parameter combinations.

  • cases (iterable[dict], optional) – Parameter configurations.

  • ds (xarray.Dataset or xarray.DataArray, optional) – Dataset or DataArray in which to check for existing data.

  • method ({"isnull", "isfinite"}, optional) – How to determine whether data is missing when ds is supplied. “isnull” checks for null/nan values, while “isfinite” checks for all non-finite values (i.e. inf or nan).

Returns:

new_cases – The combined and possibly filtered list of cases.

Return type:

iterable[dict]
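The combining step can be sketched in a few lines of standard-library Python. This is a hedged illustration rather than the library's code: expand_sketch is a hypothetical helper, and it assumes that each case is paired with every combination of the combos values, which is one reading of "sub-combinations". Filtering against ds is left out.

```python
import itertools


def expand_sketch(combos=None, cases=None):
    """Hypothetical illustration of merging combos and cases (not xyzpy code)."""
    combos = dict(combos or {})
    # With no cases given, start from a single empty case.
    cases = [dict(case) for case in cases] if cases is not None else [{}]

    names = list(combos)
    expanded = []
    for case in cases:
        # Pair this case with every combination of the combo values.
        for values in itertools.product(*(combos[name] for name in names)):
            expanded.append({**case, **dict(zip(names, values))})
    return expanded


new_cases = expand_sketch(
    combos={"seed": [0, 1]},
    cases=[{"n": 10, "method": "fast"}, {"n": 20, "method": "slow"}],
)
```

Under this assumption, two cases combined with two seed values give four dict cases, each carrying all three argument names.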

xyzpy.gen.case_runner.find_missing_cases(ds, ignore_dims=None, method='isnull')[source]

Find all cases in a dataset or DataArray with missing data.

Parameters:
  • ds (xarray.Dataset or xarray.DataArray) – Dataset or DataArray in which to find missing data.

  • ignore_dims (set, optional) – Internal variable dimensions, i.e. those to ignore when enumerating cases. By default (None) this is set to any dimensions that don’t appear on all variables.

Returns:

cases_missing – List of cases with missing data, where each case is a dict mapping from dimension name to coordinate value.

Return type:

iterable[dict]
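The default for ignore_dims described above amounts to a small set computation. The sketch below shows that rule on a plain mapping from variable name to dimension names. Both default_ignore_dims_sketch and the example variable names are hypothetical, and the real function reads the dimensions from the xarray object itself.

```python
def default_ignore_dims_sketch(var_dims):
    """Hypothetical illustration of the default ignore_dims rule (not xyzpy code)."""
    dim_sets = [set(dims) for dims in var_dims.values()]
    all_dims = set().union(*dim_sets)
    shared_dims = set.intersection(*dim_sets)
    # Dimensions that do not appear on every variable are treated as internal.
    return all_dims - shared_dims


ignored = default_ignore_dims_sketch(
    {"energy": ("n", "seed"), "spectrum": ("n", "seed", "freq")}
)
```

Here "freq" appears only on one variable, so it would be ignored, and missing cases would be enumerated over "n" and "seed" only.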