xyzpy.utils¶
Utility functions.
Exceptions¶
XYZError | Common base class for all non-exit exceptions.
Classes¶
Timer | A very simple context manager class for timing blocks.
Benchmarker | Compare the performance of various kernels.
RunningStatistics | Running mean & standard deviation using Welford's algorithm.
RunningCovariance | Running covariance class.
RunningCovarianceMatrix | Running covariance matrix for n variables.
MemoryMonitor | Monitor this process' peak memory usage with specified sampling interval in a daemon thread.
Functions¶
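Product of an iterable.
unzip | Split a nested iterable at a specified level, i.e. in numpy language transpose the specified 'axis' to be the first.
flatten | Take the n-dimensional nested iterable its and flatten it.
_get_fn_name | Try to inspect a function's name, taking into account several common non-standard types of function: dask, functools.partial …
progbar | Turn any iterable into a progress bar, with notebook option.
getsizeof | Compute the real size of a Python object in bytes, taken from https://stackoverflow.com/a/30316760/5640201.
benchmark | Benchmark the time it takes to run fn.
format_number_with_error | Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.
get_peak_memory_usage | Get the peak memory usage of the current process in gigabytes.
report_memory | Return a formatted memory usage summary for the current process.
report_memory_gpu | Return a formatted GPU memory usage summary for the process.
autocorrect_kwargs | A decorator that suggests the right keyword arguments if you get them wrong.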
Module Contents¶
- exception xyzpy.utils.XYZError[source]¶
Bases: Exception
Common base class for all non-exit exceptions.
- xyzpy.utils.unzip(its, zip_level=1)[source]¶
Split a nested iterable at a specified level, i.e. in numpy language transpose the specified ‘axis’ to be the first.
- Parameters:
its (iterable (of iterables (of iterables ...))) – ‘n-dimensional’ iterable to split
zip_level (int) – level at which to split the iterable; the default of 1 replicates zip(*its) behaviour.
Example
>>> x = [[(1, True), (2, False), (3, True)],
...      [(7, True), (8, False), (9, True)]]
>>> nums, bools = unzip(x, 2)
>>> nums
((1, 2, 3), (7, 8, 9))
>>> bools
((True, False, True), (True, False, True))
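The splitting described above can be sketched in plain Python. This is a minimal illustration, not xyzpy's implementation: the name `unzip_sketch` is hypothetical, and the recursion is simply one way to reproduce the documented example with `zip(*its)`.

```python
def unzip_sketch(its, zip_level=1):
    """Hypothetical sketch: move the axis at depth ``zip_level`` to the front."""
    if zip_level == 1:
        # Base case: an ordinary transpose of the outermost two levels.
        return tuple(zip(*its))
    # Split every sub-iterable one level further down, then transpose
    # the partial results so the split axis ends up first.
    return tuple(zip(*(unzip_sketch(it, zip_level - 1) for it in its)))


x = [[(1, True), (2, False), (3, True)],
     [(7, True), (8, False), (9, True)]]
nums, bools = unzip_sketch(x, 2)
```

With `zip_level=1` the sketch reduces to `tuple(zip(*its))`, matching the default behaviour described in the parameter list.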
- xyzpy.utils.flatten(its, n)[source]¶
Take the n-dimensional nested iterable its and flatten it.
- Parameters:
its (nested iterable)
n (number of dimensions)
- Return type:
flattened iterable of all items
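As a rough illustration of flattening an n-dimensional nested iterable, the sketch below chains sub-iterables recursively. The name `flatten_sketch` is hypothetical and the code is an assumption about one reasonable approach, not the library source.

```python
from itertools import chain


def flatten_sketch(its, n):
    """Hypothetical sketch: lazily flatten the first ``n`` levels of nesting."""
    if n > 1:
        # Flatten each sub-iterable by one fewer level, then chain the results.
        return chain.from_iterable(flatten_sketch(it, n - 1) for it in its)
    # A 1-dimensional iterable is already flat.
    return iter(its)


flat = list(flatten_sketch([[1, 2], [3, 4]], 2))
```

Choosing `n` smaller than the true nesting depth leaves the innermost levels intact.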
- xyzpy.utils._get_fn_name(fn)[source]¶
Try to inspect a function’s name, taking into account several common non-standard types of function: dask, functools.partial …
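A minimal sketch of this kind of name inspection, covering only functools.partial and objects without a `__name__`. The helper `fn_name_sketch` is hypothetical; the handling of dask objects mentioned above is not reproduced here.

```python
import functools


def fn_name_sketch(fn):
    """Hypothetical sketch: best-effort name lookup for a callable."""
    if isinstance(fn, functools.partial):
        # A partial has no __name__ of its own, so look at what it wraps.
        return fn_name_sketch(fn.func)
    # Fall back to the class name for callables without a __name__.
    return getattr(fn, "__name__", type(fn).__name__)


class Callable:
    def __call__(self):
        return None


names = [
    fn_name_sketch(max),
    fn_name_sketch(functools.partial(max, 1)),
    fn_name_sketch(Callable()),
]
```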
- xyzpy.utils.progbar(it=None, nb=False, **kwargs)[source]¶
Turn any iterable into a progress bar, with notebook option
- Parameters:
it (iterable) – Iterable to wrap with progress bar
nb (bool) – Whether to display the notebook progress bar
**kwargs (dict-like) – additional options to send to tqdm
- xyzpy.utils.getsizeof(obj)[source]¶
Compute the real size of a Python object in bytes, taken from https://stackoverflow.com/a/30316760/5640201.
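For context, `sys.getsizeof` only reports the shallow size of a container. A recursive traversal along the following lines is one common way to estimate a "real" size. This is a simplified sketch under stated assumptions (only dicts and the built-in sequence and set types are followed) and `deep_getsizeof` is a hypothetical name, not the function documented here.

```python
import sys


def deep_getsizeof(obj, seen=None):
    """Hypothetical sketch: size of ``obj`` plus the objects it contains."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Count shared or self-referencing objects only once.
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_getsizeof(k, seen) + deep_getsizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size


big = list(range(1000))
nested = [big, big]  # the same list referenced twice
```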
- class xyzpy.utils.Timer[source]¶
A very simple context manager class for timing blocks.
Examples
>>> from xyzpy import Timer
>>> with Timer() as timer:
...     print('Doing some work!')
...
Doing some work!
>>> timer.t
0.00010752677917480469
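A context manager of this kind needs very little code. The sketch below is an assumption about how such a timer could be written, using `time.perf_counter` as the clock; `TimerSketch` is a hypothetical name and only the `t` attribute mirrors the example above.

```python
import time


class TimerSketch:
    """Hypothetical sketch: record the wall time spent inside a with-block."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Store the elapsed time in seconds; never suppress exceptions.
        self.t = time.perf_counter() - self.start
        return False


with TimerSketch() as timer:
    total = sum(range(1000))
```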
- xyzpy.utils.benchmark(fn, setup=None, n=None, min_t=0.2, repeats=2, get='min', starmap=False)[source]¶
Benchmark the time it takes to run fn.
- Parameters:
fn (callable) – The function to time.
setup (callable, optional) – If supplied, the function that sets up the argument for fn.
n (int, optional) – If supplied, the integer to supply to setup or fn.
min_t (float, optional) – Aim to repeat function enough times to take up this many seconds.
repeats (int, optional) – Repeat the whole procedure (with setup) this many times in order to take the minimum run time.
get ({'min', 'mean'}, optional) – Return the minimum or mean time for each run.
starmap (bool, optional) – Unpack the arguments from setup, if given.
- Returns:
t – The minimum, averaged, time to run fn in seconds.
- Return type:
float
Examples
Just a parameter-less function:
>>> import xyzpy as xyz
>>> import numpy as np
>>> xyz.benchmark(lambda: np.linalg.eig(np.random.randn(100, 100)))
0.004726233000837965
The same but with a setup and size parameter n specified:
>>> setup = lambda n: np.random.randn(n, n)
>>> fn = lambda X: np.linalg.eig(X)
>>> xyz.benchmark(fn, setup, 100)
0.0042192734545096755
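The roles of min_t and repeats can be illustrated with a small standalone loop. This is a sketch of the general technique (grow the loop count until a run lasts at least min_t, repeat, keep the best per-call time), not the library's implementation; `benchmark_sketch` is a hypothetical name and the factor-of-ten growth is an arbitrary choice.

```python
import time


def benchmark_sketch(fn, min_t=0.2, repeats=2):
    """Hypothetical sketch: best average time per call of ``fn`` in seconds."""
    best = float("inf")
    for _ in range(repeats):
        number = 1
        while True:
            start = time.perf_counter()
            for _ in range(number):
                fn()
            elapsed = time.perf_counter() - start
            if elapsed >= min_t:
                break
            # Too fast to time reliably: run more calls per measurement.
            number *= 10
        # Average over the calls in this run, keep the minimum over runs.
        best = min(best, elapsed / number)
    return best


t = benchmark_sketch(lambda: sum(range(100)), min_t=0.01)
```

Taking the minimum over repeats tends to reduce the influence of unrelated system activity on the reported time.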
- class xyzpy.utils.Benchmarker(kernels, setup=None, names=None, benchmark_opts=None, data_name=None)[source]¶
Compare the performance of various kernels. Internally this makes use of benchmark(), Harvester() and xyzpy's plotting functionality.
- Parameters:
kernels (sequence of callable) – The functions to compare performance with.
setup (callable, optional) – If given, set up each benchmark run by supplying the size argument n to this function first, then feeding its output to each of the functions.
names (sequence of str, optional) – Alternate names to give the functions, else they will be inferred.
benchmark_opts (dict, optional) – Supplied to benchmark().
data_name (str, optional) – If given, the file name the internal harvester will use to store results persistently.
- harvester¶
The harvester that runs and accumulates all the data.
- Type:
xyz.Harvester
- ds¶
Shortcut to the harvester’s full dataset.
- Type:
- kernels¶
- names¶
- setup = None¶
- benchmark_opts¶
- runner¶
- harvester¶
- run(ns, kernels=None, **harvest_opts)[source]¶
Run the benchmarks. Each run accumulates rather than overwriting the results.
- Parameters:
ns (sequence of int or int) – The sizes to run the benchmarks with.
kernels (sequence of str, optional) – If given, only run the kernels with these names.
harvest_opts – Supplied to harvest_combos().
- property ds¶
- xyzpy.utils.format_number_with_error(x, err)[source]¶
Given x with error err, format a string showing the relevant digits of x with two significant digits of the error bracketed, and overall exponent if necessary.
Examples
>>> format_number_with_error(0.1542412, 0.0626653)
'0.154(63)'
>>> format_number_with_error(-128124123097, 6424)
'-1.281241231(64)e+11'
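The bracket notation can be reproduced for the simple case with a few lines of arithmetic. The sketch below is a deliberately restricted illustration: it assumes 0 < err < 1, ignores the exponent case, and does not handle an error that rounds up to three digits. `format_with_error_sketch` is a hypothetical name.

```python
import math


def format_with_error_sketch(x, err):
    """Hypothetical sketch: 'value(error)' with two significant error digits.

    Assumes 0 < err < 1 and no overall exponent.
    """
    # Position of the error's leading digit, e.g. 0.0626 -> -2.
    leading = math.floor(math.log10(err))
    # Decimal places needed to show two significant digits of the error.
    decimals = 1 - leading
    err_digits = round(err * 10 ** decimals)
    return f"{x:.{decimals}f}({err_digits})"


text = format_with_error_sketch(0.1542412, 0.0626653)
```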
- class xyzpy.utils.RunningStatistics[source]¶
Running mean & standard deviation using Welford’s algorithm. This is a very efficient way of keeping track of the error on the mean for example.
Examples
>>> rs = RunningStatistics()
>>> rs.update(1.1)
>>> rs.update(1.4)
>>> rs.update(1.2)
>>> rs.update_from_it([1.5, 1.3, 1.6])
>>> rs.mean
1.3499999046325684
>>> rs.std  # standard deviation
0.17078252585383266
>>> rs.err  # error on the mean
0.06972167422092768
- count = 0¶
- mean = 0.0¶
- M2 = 0.0¶
- converged(rtol, atol)[source]¶
Check if the stats have converged with respect to relative and absolute tolerance rtol and atol.
- property var¶
- property std¶
- property err¶
- property rel_err¶
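For reference, Welford's algorithm updates the mean and the sum of squared deviations (M2) one sample at a time. The sketch below is a standalone illustration, not the class documented here: `WelfordSketch` is a hypothetical name, and the choice of population variance (M2 / count) and err = std / sqrt(count) is an assumption that happens to agree with the example values above to within rounding.

```python
import math


class WelfordSketch:
    """Hypothetical sketch of a running mean and standard deviation."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.M2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        # Uses the deviation from both the old and the new mean.
        self.M2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self.M2 / self.count)

    @property
    def err(self):
        return self.std / math.sqrt(self.count)


ws = WelfordSketch()
for value in [1.1, 1.4, 1.2, 1.5, 1.3, 1.6]:
    ws.update(value)
```

Only three numbers are stored regardless of how many samples are seen, which is what makes the approach cheap for tracking the error on a mean.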
- class xyzpy.utils.RunningCovariance[source]¶
Running covariance class.
- count = 0¶
- xmean = 0.0¶
- ymean = 0.0¶
- C = 0.0¶
- property covar¶
The covariance.
- property sample_covar¶
The covariance with “Bessel’s correction”.
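A running covariance can be accumulated with the same single-pass idea used for a running variance. The sketch below reuses the attribute names listed above (count, xmean, ymean, C), but the update rule is an assumption about a standard online formula rather than the library source, and `RunningCovarianceSketch` is a hypothetical name.

```python
class RunningCovarianceSketch:
    """Hypothetical sketch of a single-pass covariance accumulator."""

    def __init__(self):
        self.count = 0
        self.xmean = 0.0
        self.ymean = 0.0
        self.C = 0.0  # running sum of co-deviations

    def update(self, x, y):
        self.count += 1
        dx = x - self.xmean  # deviation from the old x mean
        self.xmean += dx / self.count
        self.ymean += (y - self.ymean) / self.count
        self.C += dx * (y - self.ymean)  # deviation from the new y mean

    @property
    def covar(self):
        return self.C / self.count

    @property
    def sample_covar(self):
        # Bessel's correction: divide by count - 1 instead of count.
        return self.C / (self.count - 1)


rc = RunningCovarianceSketch()
for x, y in zip((1, 3, 2), (2, 6, 4)):
    rc.update(x, y)
```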
- class xyzpy.utils.RunningCovarianceMatrix(n=2)[source]¶
Running covariance matrix for n variables.
- Parameters:
n (int, optional) – Number of variables to track.
- n = 2¶
- rcs¶
- property count¶
Return the number of samples accumulated.
- property covar_matrix¶
Return the population covariance matrix.
- property sample_covar_matrix¶
Return the sample covariance matrix.
- to_uncertainties(bias=True)[source]¶
Convert the accumulated statistics to correlated uncertainties, from which new quantities can be calculated with error automatically propagated.
- Parameters:
bias (bool, optional) – If False, use the sample covariance with “Bessel’s correction”.
- Returns:
values – The sequence of correlated variables.
- Return type:
tuple of uncertainties.ufloat
Examples
Estimate quantities of two perfectly correlated sequences.
>>> rcm = xyz.RunningCovarianceMatrix()
>>> rcm.update_from_it((1, 3, 2), (2, 6, 4))
>>> x, y = rcm.to_uncertainties()
Calculated quantities like sums have the error propagated:
>>> x + y
6.0+/-2.4494897427831783
But the covariance is also taken into account, meaning the ratio here can be estimated with zero error:
>>> x / y
0.5+/-0
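The numbers in this example can be checked by hand with first-order error propagation. The sketch below does not use the uncertainties package; it assumes population (biased) variances and the standard linearised formulas for a sum and a ratio of correlated variables.

```python
import math

xs, ys = (1, 3, 2), (2, 6, 4)
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Population variances and covariance of the two sequences.
var_x = sum((x - mean_x) ** 2 for x in xs) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Var(x + y) = Var(x) + Var(y) + 2 Cov(x, y)
std_sum = math.sqrt(var_x + var_y + 2 * cov_xy)

# Var(x / y) ~ (x / y)^2 * [Var(x)/x^2 + Var(y)/y^2 - 2 Cov(x, y)/(x y)]
ratio = mean_x / mean_y
var_ratio = ratio ** 2 * (
    var_x / mean_x ** 2 + var_y / mean_y ** 2 - 2 * cov_xy / (mean_x * mean_y)
)
```

Because y is exactly 2x, the covariance term cancels the two variance terms in the ratio, which is why its propagated error vanishes.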
- xyzpy.utils.estimate_from_repeats(fn, *fn_args, rtol=0.02, tol_scale=1.0, get='stats', verbosity=0, min_samples=5, max_samples=1000000, **fn_kwargs)[source]¶
- Parameters:
fn (callable) – The function that estimates a single value.
fn_args (optional) – Supplied to fn.
rtol (float, optional) – Relative tolerance for error on mean.
tol_scale (float, optional) – The expected 'scale' of the estimate; this modifies the absolute tolerance near zero to rtol * tol_scale, default: 1.0.
get ({'stats', 'samples', 'mean'}, optional) – Just get the RunningStatistics object, or the actual samples too, or just the actual mean estimate.
verbosity ({0, 1, 2}, optional) – How much information to show:
0: nothing,
1: progress bar just with iteration rate,
2: progress bar with running stats displayed.
min_samples (int, optional) – Take at least this many samples before checking for convergence.
max_samples (int, optional) – Take at maximum this many samples.
fn_kwargs (optional) – Supplied to fn.
- Returns:
rs (RunningStatistics) – Statistics about the random estimation.
samples (list[float]) – If get=='samples', the actual samples.
Examples
Estimate the sum of n random numbers:
>>> import numpy as np
>>> import xyzpy as xyz
>>> def fn(n):
...     return np.random.rand(n).sum()
...
>>> stats = xyz.estimate_from_repeats(fn, n=10, verbosity=3)
59: 5.13(12): : 58it [00:00, 3610.84it/s]
RunningStatistics(mean=5.13(12), count=59)
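The sampling loop behind this kind of estimate can be sketched without any dependencies. The version below is an illustration under stated assumptions: the stopping rule `err <= rtol * |mean| + rtol * tol_scale` is a guess at how the relative and absolute tolerances combine, and `estimate_sketch` is a hypothetical name, not the function documented above.

```python
import math
import random


def estimate_sketch(fn, rtol=0.02, tol_scale=1.0,
                    min_samples=5, max_samples=1_000_000):
    """Hypothetical sketch: sample ``fn`` until the error on the mean is small."""
    count, mean, m2, err = 0, 0.0, 0.0, float("inf")
    while count < max_samples:
        x = fn()
        # Welford update of the running mean and squared deviations.
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        # Error on the mean: population std divided by sqrt(count).
        err = math.sqrt(m2 / count) / math.sqrt(count)
        # Assumed stopping rule: relative tolerance plus a floor near zero.
        if count >= min_samples and err <= rtol * abs(mean) + rtol * tol_scale:
            break
    return mean, err, count


random.seed(0)
mean, err, count = estimate_sketch(
    lambda: sum(random.random() for _ in range(10))
)
```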
- class xyzpy.utils.MemoryMonitor(interval: float = 0.1)[source]¶
Monitor this process’ peak memory usage with specified sampling interval in a daemon thread. This is intended to be used as a context manager for long running and memory intensive processes, not fine grained memory tracking.
- Parameters:
interval (float, optional) – Time between memory measurements in seconds. Fluctuations in peak memory between measurements might not be captured.
- interval = 0.1¶
- peak = None¶
- is_running = False¶
- monitor_thread = None¶
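The sampling pattern described here, a daemon thread that polls a memory reading at a fixed interval and keeps the maximum, can be sketched as follows. Everything in this block is an assumption for illustration: `PeakMonitorSketch` is a hypothetical name, and the memory probe is passed in as a plain callable (`sample`) so the sketch stays independent of any particular memory API.

```python
import threading
import time


class PeakMonitorSketch:
    """Hypothetical sketch: track the peak of a sampled reading in a thread."""

    def __init__(self, sample, interval=0.1):
        self.sample = sample      # callable returning the current reading
        self.interval = interval  # seconds between readings
        self.peak = None
        self._stop = threading.Event()
        self._thread = None

    def _record(self):
        value = self.sample()
        if self.peak is None or value > self.peak:
            self.peak = value

    def _run(self):
        while not self._stop.is_set():
            self._record()
            # Sleep for the interval, waking early if asked to stop.
            self._stop.wait(self.interval)

    def __enter__(self):
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._stop.set()
        self._thread.join()
        return False


readings = [3, 9, 4]  # stand-in for real memory measurements
with PeakMonitorSketch(lambda: readings.pop(0) if readings else 0,
                       interval=0.001) as monitor:
    while readings:
        time.sleep(0.001)
```

As noted above, a spike that rises and falls between two samples is missed, so the interval trades overhead against resolution.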
- xyzpy.utils.get_peak_memory_usage()[source]¶
Get the peak memory usage of the current process in gigabytes. This uses the psutil package on Windows, and the resource package on Linux and macOS.
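On Linux and macOS the standard-library resource module exposes the peak resident set size. The sketch below shows that route only and is not the library's code: `peak_memory_gb_sketch` is a hypothetical name, the division by 1024**3 is an assumed definition of a gigabyte, and the module is unavailable on Windows.

```python
import resource
import sys


def peak_memory_gb_sketch():
    """Hypothetical sketch: peak resident memory of this process in GiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return peak * scale / 1024 ** 3


peak_gb = peak_memory_gb_sketch()
```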
- xyzpy.utils.report_memory()[source]¶
Return a formatted memory usage summary for the current process.
- xyzpy.utils.report_memory_gpu()[source]¶
Return a formatted GPU memory usage summary for the process.
- xyzpy.utils.autocorrect_kwargs(func=None, valid_kwargs=None)[source]¶
A decorator that suggests the right keyword arguments if you get them wrong. Useful for functions with many specific options.
- Parameters:
func (callable, optional) – The function to decorate.
valid_kwargs (sequence[str], optional) – The valid keyword arguments for func; if not given, these are inferred from the function signature.
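A decorator with this behaviour can be approximated using inspect.signature and difflib.get_close_matches from the standard library. The sketch below is an assumption about the general approach, not the library's implementation: `suggest_kwargs_sketch` is a hypothetical name, it takes no valid_kwargs option, and it would wrongly reject keywords for functions that accept `**kwargs`.

```python
import difflib
import functools
import inspect


def suggest_kwargs_sketch(func):
    """Hypothetical sketch: suggest close matches for misspelled keywords."""
    params = inspect.signature(func).parameters.values()
    valid = [p.name for p in params
             if p.kind in (p.POSITIONAL_OR_KEYWORD, p.KEYWORD_ONLY)]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for key in kwargs:
            if key not in valid:
                # Rank the valid names by similarity to the unknown one.
                close = difflib.get_close_matches(key, valid, n=3)
                raise TypeError(
                    f"Unknown keyword argument {key!r}. Did you mean: {close}?"
                )
        return func(*args, **kwargs)

    return wrapper


@suggest_kwargs_sketch
def plot(x, colour="k", linewidth=1):
    return (x, colour, linewidth)


try:
    plot(1, color="r")
    message = ""
except TypeError as exc:
    message = str(exc)
```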