xyzpy.utils#

Utility functions.

Module Contents#

Classes#

Timer

A very simple context manager class for timing blocks.

Benchmarker

Compare the performance of various kernels. Internally this makes use of benchmark(), Harvester() and xyzpy’s plotting functionality.

RunningStatistics

Running mean & standard deviation using Welford’s algorithm.

RunningCovariance

Running covariance class.

RunningCovarianceMatrix

Functions#

isiterable(obj)

prod(it)

Product of an iterable.

unzip(its[, zip_level])

Split a nested iterable at a specified level, i.e. in numpy language transpose the specified ‘axis’ to be the first.

flatten(its, n)

Take the n-dimensional nested iterable its and flatten it.

_get_fn_name(fn)

Try to inspect a function’s name, taking into account several common non-standard types of function: dask, functools.partial …

progbar([it, nb])

Turn any iterable into a progress bar, with notebook option

getsizeof(obj)

Compute the real size of a python object.

_auto_min_time(timer[, min_t, repeats, get])

benchmark(fn[, setup, n, min_t, repeats, get, starmap])

Benchmark the time it takes to run fn.

format_number_with_error(x, err)

Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.

estimate_from_repeats(fn, *fn_args[, rtol, tol_scale, ...])

Estimate a value by repeatedly calling fn until the error on the mean has converged.

report_memory()

report_memory_gpu()

autocorrect_kwargs([func, valid_kwargs])

A decorator that suggests the right keyword arguments if you get them wrong.

exception xyzpy.utils.XYZError[source]#

Bases: Exception

Common base class for all non-exit exceptions.

xyzpy.utils.isiterable(obj)[source]#
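
No docstring is given, but presumably this checks whether obj supports iteration; a minimal sketch (outputs assumed):

>>> from xyzpy.utils import isiterable
>>> isiterable([1, 2, 3])  # lists can be iterated over
True
>>> isiterable(42)         # plain integers cannot
False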
xyzpy.utils.prod(it)[source]#

Product of an iterable.
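
A quick doctest-style sketch of the documented behaviour (output assumed):

>>> from xyzpy.utils import prod
>>> prod([2, 3, 4])  # 2 * 3 * 4
24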

xyzpy.utils.unzip(its, zip_level=1)[source]#

Split a nested iterable at a specified level, i.e. in numpy language transpose the specified ‘axis’ to be the first.

Parameters:
  • its (iterable (of iterables (of iterables ...))) – ‘n-dimensional’ iterable to split

  • zip_level (int) – level at which to split the iterable, default of 1 replicates zip(*its) behaviour.

Example

>>> x = [[(1, True), (2, False), (3, True)],
...      [(7, True), (8, False), (9, True)]]
>>> nums, bools = unzip(x, 2)
>>> nums
((1, 2, 3), (7, 8, 9))
>>> bools
((True, False, True), (True, False, True))
xyzpy.utils.flatten(its, n)[source]#

Take the n-dimensional nested iterable its and flatten it.

Parameters:
  • its (nested iterable) – The nested iterable to flatten.

  • n (int) – The number of dimensions of its.

Return type:

flattened iterable of all items
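
A minimal usage sketch, assuming the flattened result is returned lazily (hence the list call):

>>> from xyzpy.utils import flatten
>>> list(flatten([[1, 2], [3, 4], [5, 6]], 2))  # flatten 2 levels of nesting
[1, 2, 3, 4, 5, 6]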

xyzpy.utils._get_fn_name(fn)[source]#

Try to inspect a function’s name, taking into account several common non-standard types of function: dask, functools.partial …

xyzpy.utils.progbar(it=None, nb=False, **kwargs)[source]#

Turn any iterable into a progress bar, with notebook option

Parameters:
  • it (iterable) – Iterable to wrap with progress bar

  • nb (bool) – Whether to display the notebook progress bar

  • **kwargs (dict-like) – additional options to send to tqdm
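
A minimal sketch; the bar itself is rendered by tqdm (on stderr) and its appearance depends on the environment:

>>> from xyzpy.utils import progbar
>>> total = 0
>>> for i in progbar(range(100)):  # wrap the iterable in a tqdm progress bar
...     total += i
>>> total
4950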

xyzpy.utils.getsizeof(obj)[source]#

Compute the real size of a python object. Taken from https://stackoverflow.com/a/30316760/5640201
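
A hedged sketch; since nested contents are counted, a container’s reported size grows with what it holds (exact numbers vary by platform and Python version):

>>> from xyzpy.utils import getsizeof
>>> getsizeof({'a': [1, 2, 3]}) > getsizeof({'a': []})
True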

class xyzpy.utils.Timer[source]#

A very simple context manager class for timing blocks.

Examples

>>> from xyzpy import Timer
>>> with Timer() as timer:
...     print('Doing some work!')
...
Doing some work!
>>> timer.t
0.00010752677917480469
__enter__()[source]#
__exit__(*args)[source]#
xyzpy.utils._auto_min_time(timer, min_t=0.2, repeats=5, get='min')[source]#
xyzpy.utils.benchmark(fn, setup=None, n=None, min_t=0.1, repeats=3, get='min', starmap=False)[source]#

Benchmark the time it takes to run fn.

Parameters:
  • fn (callable) – The function to time.

  • setup (callable, optional) – If supplied, the function that sets up the argument for fn.

  • n (int, optional) – If supplied, the integer to supply to setup or fn.

  • min_t (float, optional) – Aim to repeat the function enough times to take up this many seconds.

  • repeats (int, optional) – Repeat the whole procedure (with setup) this many times in order to take the minimum run time.

  • get ({'min', 'mean'}, optional) – Return the minimum or mean time for each run.

  • starmap (bool, optional) – Unpack the arguments from setup, if given.

Returns:

t – The time to run fn in seconds, taken as the minimum or mean over the repeats depending on get.

Return type:

float

Examples

Just a parameter-less function:

>>> import xyzpy as xyz
>>> import numpy as np
>>> xyz.benchmark(lambda: np.linalg.eig(np.random.randn(100, 100)))
0.004726233000837965

The same but with a setup and size parameter n specified:

>>> setup = lambda n: np.random.randn(n, n)
>>> fn = lambda X: np.linalg.eig(X)
>>> xyz.benchmark(fn, setup, 100)
0.0042192734545096755
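
If setup returns a tuple of arguments, starmap=True unpacks them into fn; a sketch continuing the session above (the timing varies by machine, so it is just bound to a name here):

>>> setup = lambda n: (np.random.randn(n, n), np.random.randn(n, n))
>>> fn = lambda A, B: A @ B
>>> t = xyz.benchmark(fn, setup, 100, starmap=True)  # t is the minimum time in seconds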
class xyzpy.utils.Benchmarker(kernels, setup=None, names=None, benchmark_opts=None, data_name=None)[source]#

Compare the performance of various kernels. Internally this makes use of benchmark(), Harvester() and xyzpy’s plotting functionality.

Parameters:
  • kernels (sequence of callable) – The functions to compare performance with.

  • setup (callable, optional) – If given, setup each benchmark run by supplying the size argument n to this function first, then feeding its output to each of the functions.

  • names (sequence of str, optional) – Alternate names to give the functions, else they will be inferred.

  • benchmark_opts (dict, optional) – Supplied to benchmark().

  • data_name (str, optional) – If given, the file name the internal harvester will use to store results persistently.

harvester#

The harvester that runs and accumulates all the data.

Type:

xyz.Harvester

ds#

Shortcut to the harvester’s full dataset.

Type:

xarray.Dataset

property ds#
run(ns, kernels=None, **harvest_opts)[source]#

Run the benchmarks. Each run accumulates rather than overwriting the results.

Parameters:
  • ns (sequence of int or int) – The sizes to run the benchmarks with.

  • kernels (sequence of str, optional) – If given, only run the kernels with these names.

  • harvest_opts – Supplied to harvest_combos().

plot(**plot_opts)[source]#

Plot the benchmarking results.

lineplot(**plot_opts)[source]#

Plot the benchmarking results.

ilineplot(**plot_opts)[source]#

Interactively plot the benchmarking results.
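
A usage sketch (kernel choices and sizes are arbitrary; outputs and progress bars are omitted):

>>> import numpy as np
>>> from xyzpy.utils import Benchmarker
>>> b = Benchmarker(
...     kernels=[np.sort, np.argsort],       # callables to compare
...     setup=lambda n: np.random.randn(n),  # each kernel is fed setup(n)
... )
>>> b.run([2**k for k in range(5, 15)])      # accumulate timings for these sizes
>>> b.lineplot()                             # doctest: +SKIP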

xyzpy.utils.format_number_with_error(x, err)[source]#

Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.

Parameters:
  • x (float) – The value to print.

  • err (float) – The error on x.

Return type:

str

Examples

>>> format_number_with_error(0.1542412, 0.0626653)
'0.154(63)'
>>> format_number_with_error(-128124123097, 6424)
'-1.281241231(64)e+11'
class xyzpy.utils.RunningStatistics[source]#

Running mean & standard deviation using Welford’s algorithm. This is a very efficient way of keeping track of the error on the mean for example.

mean#

Current mean.

Type:

float

count#

Current count.

Type:

int

std#

Current standard deviation.

Type:

float

var#

Current variance.

Type:

float

err#

Current error on the mean.

Type:

float

rel_err#

The current relative error.

Type:

float

Examples

>>> rs = RunningStatistics()
>>> rs.update(1.1)
>>> rs.update(1.4)
>>> rs.update(1.2)
>>> rs.update_from_it([1.5, 1.3, 1.6])
>>> rs.mean
1.3499999046325684
>>> rs.std  # standard deviation
0.17078252585383266
>>> rs.err  # error on the mean
0.06972167422092768
property var#
property std#
property err#
property rel_err#
update(x)[source]#

Add a single value x to the statistics.

update_from_it(xs)[source]#

Add all values from iterable xs to the statistics.

converged(rtol, atol)[source]#

Check if the stats have converged with respect to relative and absolute tolerance rtol and atol.
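
Continuing the class example above, a sketch of the convergence check (the precise way rtol and atol are combined is an assumption here):

>>> rs.converged(rtol=0.1, atol=1e-6)  # relative error of ~0.05 is within rtol
True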

__repr__()[source]#

Return repr(self).

class xyzpy.utils.RunningCovariance[source]#

Running covariance class.

property covar#

The covariance.

property sample_covar#

The covariance with “Bessel’s correction”.

update(x, y)[source]#
update_from_it(xs, ys)[source]#
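
A brief sketch, assuming covar divides by the number of samples N and sample_covar by N - 1 as described above:

>>> from xyzpy.utils import RunningCovariance
>>> rc = RunningCovariance()
>>> rc.update_from_it([0.0, 2.0], [0.0, 4.0])  # perfectly correlated pairs
>>> rc.covar
2.0
>>> rc.sample_covar
4.0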
class xyzpy.utils.RunningCovarianceMatrix(n=2)[source]#
property count#
property covar_matrix#
property sample_covar_matrix#
update(*x)[source]#
update_from_it(*xs)[source]#
to_uncertainties(bias=True)[source]#

Convert the accumulated statistics to correlated uncertainties, from which new quantities can be calculated with error automatically propagated.

Parameters:

bias (bool, optional) – If False, use the sample covariance with “Bessel’s correction”.

Returns:

values – The sequence of correlated variables.

Return type:

tuple of uncertainties.ufloat

Examples

Estimate quantities of two perfectly correlated sequences.

>>> rcm = xyz.RunningCovarianceMatrix()
>>> rcm.update_from_it((1, 3, 2), (2, 6, 4))
>>> x, y = rcm.to_uncertainties()

Calculated quantities like sums have the error propagated:

>>> x + y
6.0+/-2.4494897427831783

But the covariance is also taken into account, meaning the ratio here can be estimated with zero error:

>>> x / y
0.5+/-0
xyzpy.utils.estimate_from_repeats(fn, *fn_args, rtol=0.02, tol_scale=1.0, get='stats', verbosity=0, min_samples=5, max_samples=1000000, **fn_kwargs)[source]#

Estimate a value by repeatedly calling fn until the error on the mean has converged (see rtol and tol_scale), or max_samples is reached.

Parameters:
  • fn (callable) – The function that estimates a single value.

  • fn_args (optional) – Supplied to fn.

  • rtol (float, optional) – Relative tolerance for error on mean.

  • tol_scale (float, optional) – The expected ‘scale’ of the estimate; this modifies the absolute tolerance near zero to rtol * tol_scale, default: 1.0.

  • get ({'stats', 'samples', 'mean'}, optional) – Just get the RunningStatistics object, or the actual samples too, or just the actual mean estimate.

  • verbosity ({0, 1, 2}, optional) –

    How much information to show:

    • 0: nothing

    • 1: progress bar just with iteration rate,

    • 2: progress bar with running stats displayed.

  • min_samples (int, optional) – Take at least this many samples before checking for convergence.

  • max_samples (int, optional) – Take at maximum this many samples.

  • fn_kwargs (optional) – Supplied to fn.

Returns:

  • rs (RunningStatistics) – Statistics about the random estimation.

  • samples (list[float]) – If get=='samples', the actual samples.

Examples

Estimate the sum of n random numbers:

>>> import numpy as np
>>> import xyzpy as xyz
>>> def fn(n):
...     return np.random.rand(n).sum()
...
>>> stats = xyz.estimate_from_repeats(fn, n=10, verbosity=2)
59: 5.13(12): : 58it [00:00, 3610.84it/s]
RunningStatistics(mean=5.13(12), count=59)
xyzpy.utils.report_memory()[source]#
xyzpy.utils.report_memory_gpu()[source]#
xyzpy.utils.autocorrect_kwargs(func=None, valid_kwargs=None)[source]#

A decorator that suggests the right keyword arguments if you get them wrong. Useful for functions with many specific options.

Parameters:
  • func (callable, optional) – The function to decorate.

  • valid_kwargs (sequence[str], optional) – The valid keyword arguments for func; if not given, these are inferred from the function signature.
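
A sketch of how the decorator might be applied (the exact exception and message raised for a bad keyword are implementation details, so only a valid call is shown):

>>> from xyzpy.utils import autocorrect_kwargs
>>> @autocorrect_kwargs
... def draw(color='blue', marker='o'):
...     return color, marker
...
>>> draw(color='red')  # correct keywords pass straight through
('red', 'o')

A call such as draw(colour='red') would then be rejected, with the decorator suggesting the valid keyword color.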