Utils

The kszx.utils module contains miscelleanous utilities, that didn’t really fit in elsewhere.

kszx.utils.get_nthreads()

Returns default number of threads, for functions which are multithreaded (e.g. scipy.fft).

Currently equal to os.cpu_count(), except for worker tasks in a multiprocessing pool. In that case, we divide the number of threads by the number of workers in the pool.

Note: for this to work, you should construct the pool with kszx.utils.Pool(), not multiprocessing.Pool()!

kszx.utils.set_nthreads(nthreads, reseed_numpy_rng=False)

Sets the default number of threads. Not intended to be called directly – use with caution!

(The intended use case for this function is in kszx.utils.Pool(), when initializing worker threads. In particular, the reseed_numpy_rng arg isn’t really related to threads, but is included for convenience in kszx.utils.Pool().)

kszx.utils.Pool(processes=None, reseed_numpy_rng=True)

Wrapper around multiprocessing.Pool(), which reseeds the numpy RNG and calls set_nthreads() in each worker.

This function addresses some issues with multiprocessing.Pool:

  • By default, workers inherit a copy of the parent’s global RNG state, so functions like np.random.normal() produce the same random numbers in each worker. This can be a huge problem – e.g. a Pool which makes Monte Carlo sims will generate the same realization multiple times. We address this by re-seeding numpy’s default RNG state in each worker.

  • Functions which use a thread pool (e.g. kszx.core.fft_r2c()) should use fewer threads when called from workers, in order to avoid oversubscription issues.

  • (Minor.) If the number of workers (processes arg) is unspecified, then I’d prefer to use the number of CPU cores, not the value of os.cpu_count() (which is usually twice the number of cores, since hyperthreading is usually enabled).

Usage is the same as multiprocessing.Pool():

def square(i):
    return i*i

# NOTE: I recommend calling apply_async() in a loop, not calling map_async()!
# This is because map_async() "swallows" exceptions until the very end, whereas
# apply_async() aborts computation as soon as an exception is encountered.

with kszx.utils.Pool() as pool:
    f = [ pool.apply_async(square, (i,)) for i in range(100) ]
    f = [ x.get() for x in f ]

print('Here is a list of the first 100 squares:', f)

Function arguments:

  • processes (int): number of worker subprocesses. If None, then we use the number of CPU cores (which is usually half the value of os.cpu_count()).

  • reseed_numpy_rng (boolean): if True (the default), then numpy’s global RNG will be re-seeded in each worker thread. You should always do this, unless you’re confident that the workers won’t use the global RNG.

kszx.utils.ra_dec_to_xyz(ra_deg, dec_deg, r=None)

Converts spherical polar coordinates (ra_deg, dec_deg) to cartesian coords (x,y,z).

Function arguments:

  • ra_deg (array): RA in degrees (not radians!)

  • dec_deg (array): DEC in degrees (not radians!)

  • r (optional array): radial coordinates (units of length)

Return value:

  • xyz (array): new array with a length-3 axis appended. (E.g. if ra_deg and dec_deg have shape (m,n), then xyz has shape (m,n,3)).

The coordinate systems are related by:

x = r * cos(dec) * cos(ra)
y = r * cos(dec) * sin(ra)
z = r * sin(dec)

Note that the radial coordinate r must be specified as a comoving distance. If you have redshifts instead, you should call Cosmology.chi():

cosmo = kszx.Cosmology('planck18+bao')
xyz = kszx.utils.ra_dec_to_xyz(ra_deg, dec_deg, cosmo.chi(z=z))
kszx.utils.xyz_to_ra_dec(xyz, return_r=False)

Converts cartesian coords (x,y,z) to spherical polar coordinates (ra_deg, dec_deg).

Function arguments:

  • xyz (array): array whose last axis has length 3.

  • return_r (boolean): indicates whether radial coords are returned, see below.

Return value:

  • ra_deg (array): RA in degrees (not radians!)

  • dec_deg (array): DEC in degrees (not radians!)

  • r (array): radial coordinates (in same units as xyz)

  • If return_r is True, then the return value is a triple (ra_deg, dec_deg, r). If return_r is False, then the return value is a pair (ra_deg, dec_deg).

The returned arrays have one lower dimension than xyz. (E.g. if xyz has shape (m,n,3) then ra_deg, dec_deg, and r all have shape (m,n).)

The coordinate systems are related by:

x = r * cos(dec) * cos(ra)
y = r * cos(dec) * sin(ra)
z = r * sin(dec)
kszx.utils.W_tophat(x)

Returns Fourier transform of a 3-d tophat W(x), where x=kR. Vectorized.

W(x) is given by any of the equivalent forms:

\[\begin{split}W(x) &= 3 (\sin(x) - x \cos(x)) / x^3 \\ &= 3 j_1(x) / x \\ &= j_0(x) + j_2(x)\end{split}\]
kszx.utils.subtract_binned_means(data, x, nbins)

Averages a 1-d ‘data’ array in x-bins, and returns a copy of ‘data’ with binned means subtracted. If nbins==0, then returns a copy of ‘data’.

kszx.utils.random_rotation_matrix(N)

Returns a random N-by-N rotation matrix.