xyzpy.utils#

Utility functions.

Module Contents#

Classes#

Timer

A very simple context manager class for timing blocks.

Benchmarker

Compare the performance of various kernels. Internally this makes use of benchmark(), Harvester() and xyzpy’s plotting functionality.

RunningStatistics

Running mean & standard deviation using Welford’s algorithm.

RunningCovariance

Running covariance class.

RunningCovarianceMatrix

Functions#

isiterable(obj)

prod(it)

Product of an iterable.

unzip(its[, zip_level])

Split a nested iterable at a specified level, i.e. in numpy language transpose the specified ‘axis’ to be the first.

flatten(its, n)

Take the n-dimensional nested iterable its and flatten it.

_get_fn_name(fn)

Try to inspect a function’s name, taking into account several common non-standard types of function: dask, functools.partial …

progbar([it, nb])

Turn any iterable into a progress bar, with notebook option

getsizeof(obj)

Compute the real size of a python object.

_auto_min_time(timer[, min_t, repeats, get])

benchmark(fn[, setup, n, min_t, repeats, get, starmap])

Benchmark the time it takes to run fn.

format_number_with_error(x, err)

Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.

estimate_from_repeats(fn, *fn_args[, rtol, tol_scale, ...])

Estimate a value by repeatedly calling fn until the error on the mean has converged.

report_memory()

report_memory_gpu()

autocorrect_kwargs([func, valid_kwargs])

A decorator that suggests the right keyword arguments if you get them wrong.

exception xyzpy.utils.XYZError[source]#

Bases: Exception

Common base class for all non-exit exceptions.

xyzpy.utils.isiterable(obj)[source]#
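
No docstring is given, but presumably this checks whether obj supports iteration; a minimal sketch (outputs assumed):

>>> from xyzpy.utils import isiterable
>>> isiterable([1, 2, 3])  # lists can be iterated over
True
>>> isiterable(42)         # plain integers cannot
False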
xyzpy.utils.prod(it)[source]#

Product of an iterable.
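
A quick doctest-style sketch of the documented behaviour (output assumed):

>>> from xyzpy.utils import prod
>>> prod([2, 3, 4])  # 2 * 3 * 4
24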

xyzpy.utils.unzip(its, zip_level=1)[source]#

Split a nested iterable at a specified level, i.e. in numpy language transpose the specified ‘axis’ to be the first.

Parameters:
  • its (iterable (of iterables (of iterables ...))) – ‘n-dimensional’ iterable to split

  • zip_level (int) – level at which to split the iterable, default of 1 replicates zip(*its) behaviour.

Example

>>> x = [[(1, True), (2, False), (3, True)],
...      [(7, True), (8, False), (9, True)]]
>>> nums, bools = unzip(x, 2)
>>> nums
((1, 2, 3), (7, 8, 9))
>>> bools
((True, False, True), (True, False, True))
xyzpy.utils.flatten(its, n)[source]#

Take the n-dimensional nested iterable its and flatten it.

Parameters:
  • its (nested iterable) – The nested iterable to flatten.

  • n (int) – The number of dimensions of its.

Return type:

flattened iterable of all items
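
A minimal usage sketch, assuming the flattened result is returned lazily (hence the list call):

>>> from xyzpy.utils import flatten
>>> list(flatten([[1, 2], [3, 4], [5, 6]], 2))  # flatten 2 levels of nesting
[1, 2, 3, 4, 5, 6]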

xyzpy.utils._get_fn_name(fn)[source]#

Try to inspect a function’s name, taking into account several common non-standard types of function: dask, functools.partial …

xyzpy.utils.progbar(it=None, nb=False, **kwargs)[source]#

Turn any iterable into a progress bar, with notebook option

Parameters:
  • it (iterable) – Iterable to wrap with progress bar

  • nb (bool) – Whether to display the notebook progress bar

  • **kwargs (dict-like) – additional options to send to tqdm
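
A minimal sketch; the bar itself is rendered by tqdm (on stderr) and its appearance depends on the environment:

>>> from xyzpy.utils import progbar
>>> total = 0
>>> for i in progbar(range(100)):  # wrap the iterable in a tqdm progress bar
...     total += i
>>> total
4950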

xyzpy.utils.getsizeof(obj)[source]#

Compute the real size of a python object. Taken from https://stackoverflow.com/a/30316760/5640201
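
A hedged sketch; since nested contents are counted, a container’s reported size grows with what it holds (exact numbers vary by platform and Python version):

>>> from xyzpy.utils import getsizeof
>>> getsizeof({'a': [1, 2, 3]}) > getsizeof({'a': []})
True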

class xyzpy.utils.Timer[source]#

A very simple context manager class for timing blocks.

Examples

>>> from xyzpy import Timer
>>> with Timer() as timer:
...     print('Doing some work!')
...
Doing some work!
>>> timer.t
0.00010752677917480469
__enter__()[source]#
__exit__(*args)[source]#
xyzpy.utils._auto_min_time(timer, min_t=0.2, repeats=5, get='min')[source]#
xyzpy.utils.benchmark(fn, setup=None, n=None, min_t=0.1, repeats=3, get='min', starmap=False)[source]#

Benchmark the time it takes to run fn.

Parameters:
  • fn (callable) – The function to time.

  • setup (callable, optional) – If supplied, the function that sets up the argument for fn.

  • n (int, optional) – If supplied, the integer to supply to setup or fn.

  • min_t (float, optional) – Aim to repeat the function enough times to take up this many seconds.

  • repeats (int, optional) – Repeat the whole procedure (with setup) this many times in order to take the minimum run time.

  • get ({'min', 'mean'}, optional) – Return the minimum or mean time for each run.

  • starmap (bool, optional) – Unpack the arguments from setup, if given.

Returns:

t – The time to run fn in seconds, taken as the minimum or mean over the repeats depending on get.

Return type:

float

Examples

Just a parameter-less function:

>>> import xyzpy as xyz
>>> import numpy as np
>>> xyz.benchmark(lambda: np.linalg.eig(np.random.randn(100, 100)))
0.004726233000837965

The same but with a setup and size parameter n specified:

>>> setup = lambda n: np.random.randn(n, n)
>>> fn = lambda X: np.linalg.eig(X)
>>> xyz.benchmark(fn, setup, 100)
0.0042192734545096755
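
If setup returns a tuple of arguments, starmap=True unpacks them into fn; a sketch continuing the session above (the timing varies by machine, so it is just bound to a name here):

>>> setup = lambda n: (np.random.randn(n, n), np.random.randn(n, n))
>>> fn = lambda A, B: A @ B
>>> t = xyz.benchmark(fn, setup, 100, starmap=True)  # t is the minimum time in seconds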
class xyzpy.utils.Benchmarker(kernels, setup=None, names=None, benchmark_opts=None, data_name=None)[source]#

Compare the performance of various kernels. Internally this makes use of benchmark(), Harvester() and xyzpy’s plotting functionality.

Parameters:
  • kernels (sequence of callable) – The functions to compare performance with.

  • setup (callable, optional) – If given, setup each benchmark run by supplying the size argument n to this function first, then feeding its output to each of the functions.

  • names (sequence of str, optional) – Alternate names to give the functions, else they will be inferred.

  • benchmark_opts (dict, optional) – Supplied to benchmark().

  • data_name (str, optional) – If given, the file name the internal harvester will use to store results persistently.

harvester#

The harvester that runs and accumulates all the data.

Type:

xyz.Harvester

ds#

Shortcut to the harvester’s full dataset.

Type:

xarray.Dataset

property ds#
run(ns, kernels=None, **harvest_opts)[source]#

Run the benchmarks. Each run accumulates rather than overwriting the results.

Parameters:
  • ns (sequence of int or int) – The sizes to run the benchmarks with.

  • kernels (sequence of str, optional) – If given, only run the kernels with these names.

  • harvest_opts – Supplied to harvest_combos().

plot(**plot_opts)[source]#

Plot the benchmarking results.

lineplot(**plot_opts)[source]#

Plot the benchmarking results.

ilineplot(**plot_opts)[source]#

Interactively plot the benchmarking results.
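
A usage sketch (kernel choices and sizes are arbitrary; outputs and progress bars are omitted):

>>> import numpy as np
>>> from xyzpy.utils import Benchmarker
>>> b = Benchmarker(
...     kernels=[np.sort, np.argsort],       # callables to compare
...     setup=lambda n: np.random.randn(n),  # each kernel is fed setup(n)
... )
>>> b.run([2**k for k in range(5, 15)])      # accumulate timings for these sizes
>>> b.lineplot()                             # doctest: +SKIP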

xyzpy.utils.format_number_with_error(x, err)[source]#

Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.

Parameters:
  • x (float) – The value to print.

  • err (float) – The error on x.

Return type:

str

Examples

>>> format_number_with_error(0.1542412, 0.0626653)
'0.154(63)'
>>> format_number_with_error(-128124123097, 6424)
'-1.281241231(64)e+11'
class xyzpy.utils.RunningStatistics[source]#

Running mean & standard deviation using Welford’s algorithm. This is a very efficient way of keeping track of the error on the mean for example.

mean#

Current mean.

Type:

float

count#

Current count.

Type:

int

std#

Current standard deviation.

Type:

float

var#

Current variance.

Type:

float

err#

Current error on the mean.

Type:

float

rel_err#

The current relative error.

Type:

float

Examples

>>> rs = RunningStatistics()
>>> rs.update(1.1)
>>> rs.update(1.4)
>>> rs.update(1.2)
>>> rs.update_from_it([1.5, 1.3, 1.6])
>>> rs.mean
1.3499999046325684
>>> rs.std  # standard deviation
0.17078252585383266
>>> rs.err  # error on the mean
0.06972167422092768
property var#
property std#
property err#
property rel_err#
update(x)[source]#

Add a single value x to the statistics.

update_from_it(xs)[source]#

Add all values from iterable xs to the statistics.

converged(rtol, atol)[source]#

Check if the stats have converged with respect to relative and absolute tolerance rtol and atol.
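
Continuing the class example above, a sketch of the convergence check (the precise way rtol and atol are combined is an assumption here):

>>> rs.converged(rtol=0.1, atol=1e-6)  # relative error of ~0.05 is within rtol
True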

__repr__()[source]#

Return repr(self).

class xyzpy.utils.RunningCovariance[source]#

Running covariance class.

property covar#

The covariance.

property sample_covar#

The covariance with “Bessel’s correction”.

update(x, y)[source]#
update_from_it(xs, ys)[source]#
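
A brief sketch, assuming covar divides by the number of samples N and sample_covar by N - 1 as described above:

>>> from xyzpy.utils import RunningCovariance
>>> rc = RunningCovariance()
>>> rc.update_from_it([0.0, 2.0], [0.0, 4.0])  # perfectly correlated pairs
>>> rc.covar
2.0
>>> rc.sample_covar
4.0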
class xyzpy.utils.RunningCovarianceMatrix(n=2)[source]#
property count#
property covar_matrix#
property sample_covar_matrix#
update(*x)[source]#
update_from_it(*xs)[source]#
to_uncertainties(bias=True)[source]#

Convert the accumulated statistics to correlated uncertainties, from which new quantities can be calculated with error automatically propagated.

Parameters:

bias (bool, optional) – If False, use the sample covariance with “Bessel’s correction”.

Returns:

values – The sequence of correlated variables.

Return type:

tuple of uncertainties.ufloat

Examples

Estimate quantities of two perfectly correlated sequences.

>>> rcm = xyz.RunningCovarianceMatrix()
>>> rcm.update_from_it((1, 3, 2), (2, 6, 4))
>>> x, y = rcm.to_uncertainties()

Calculated quantities like sums have the error propagated:

>>> x + y
6.0+/-2.4494897427831783

But the covariance is also taken into account, meaning the ratio here can be estimated with zero error:

>>> x / y
0.5+/-0
xyzpy.utils.estimate_from_repeats(fn, *fn_args, rtol=0.02, tol_scale=1.0, get='stats', verbosity=0, min_samples=5, max_samples=1000000, **fn_kwargs)[source]#

Estimate a value by repeatedly calling fn until the error on the mean has converged (see rtol and tol_scale), or max_samples is reached.

Parameters:
  • fn (callable) – The function that estimates a single value.

  • fn_args (optional) – Supplied to fn.

  • rtol (float, optional) – Relative tolerance for error on mean.

  • tol_scale (float, optional) – The expected ‘scale’ of the estimate; this modifies the absolute tolerance near zero to rtol * tol_scale, default: 1.0.

  • get ({'stats', 'samples', 'mean'}, optional) – Just get the RunningStatistics object, or the actual samples too, or just the actual mean estimate.

  • verbosity ({0, 1, 2}, optional) –

    How much information to show:

    • 0: nothing

    • 1: progress bar just with iteration rate,

    • 2: progress bar with running stats displayed.

  • min_samples (int, optional) – Take at least this many samples before checking for convergence.

  • max_samples (int, optional) – Take at maximum this many samples.

  • fn_kwargs (optional) – Supplied to fn.

Returns:

  • rs (RunningStatistics) – Statistics about the random estimation.

  • samples (list[float]) – If get=='samples', the actual samples.

Examples

Estimate the sum of n random numbers:

>>> import numpy as np
>>> import xyzpy as xyz
>>> def fn(n):
...     return np.random.rand(n).sum()
...
>>> stats = xyz.estimate_from_repeats(fn, n=10, verbosity=2)
59: 5.13(12): : 58it [00:00, 3610.84it/s]
RunningStatistics(mean=5.13(12), count=59)
xyzpy.utils.report_memory()[source]#
xyzpy.utils.report_memory_gpu()[source]#
xyzpy.utils.autocorrect_kwargs(func=None, valid_kwargs=None)[source]#

A decorator that suggests the right keyword arguments if you get them wrong. Useful for functions with many specific options.

Parameters:
  • func (callable, optional) – The function to decorate.

  • valid_kwargs (sequence[str], optional) – The valid keyword arguments for func; if not given, these are inferred from the function signature.
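
A sketch of how the decorator might be applied (the exact exception and message raised for a bad keyword are implementation details, so only a valid call is shown):

>>> from xyzpy.utils import autocorrect_kwargs
>>> @autocorrect_kwargs
... def draw(color='blue', marker='o'):
...     return color, marker
...
>>> draw(color='red')  # correct keywords pass straight through
('red', 'o')

A call such as draw(colour='red') would then be rejected, with the decorator suggesting the valid keyword color.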