Welcome to Physt’s documentation!¶
Tutorials¶
Get started with physt¶
This tutorial describes some of the basic features of physt.
[1]:
# Necessary import evil
%matplotlib inline
from physt import histogram, binnings, h1, h2, h3
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1337) # Have always the same data
Getting physt (to run)¶
I believe you can skip this section but anyway, for the sake of completeness, the default way of installing a relatively stable version of physt is via pip:
pip install physt
Alternatively, you can download the source code from GitHub (https://github.com/janpipek/physt).
You will need numpy to use physt (required), but there are other packages (optional) that are very useful if you want to use physt at its best: matplotlib for plotting (or bokeh as a not-so-well supported alternative).
Your first histogram¶
If you need to create a histogram, call the histogram (or h1) function with your data (like heights of people) as the first argument. The default gives a reasonable result…
[2]:
# Basic dataset
heights = [160, 155, 156, 198, 177, 168, 191, 183, 184, 179, 178, 172, 173, 175,
172, 177, 176, 175, 174, 173, 174, 175, 177, 169, 168, 164, 175, 188,
178, 174, 173, 181, 185, 166, 162, 163, 171, 165, 180, 189, 166, 163,
172, 173, 174, 183, 184, 161, 162, 168, 169, 174, 176, 170, 169, 165]
hist = histogram(heights) # Automatically select all settings
hist
[2]:
Histogram1D(bins=(10,), total=56, dtype=int64)
…which is an object of the Histogram1D type that holds all the bin information…
[3]:
hist.bins # All the bins
[3]:
array([[155. , 159.3],
[159.3, 163.6],
[163.6, 167.9],
[167.9, 172.2],
[172.2, 176.5],
[176.5, 180.8],
[180.8, 185.1],
[185.1, 189.4],
[189.4, 193.7],
[193.7, 198. ]])
[4]:
hist.frequencies # All the frequencies
[4]:
array([ 2, 6, 5, 11, 15, 7, 6, 2, 1, 1], dtype=int64)
…and provides further features and methods, like plotting for example…
[5]:
hist.plot(show_values=True);

…or adding new values (note that this is something numpy.histogram won’t do for you)…
[6]:
original = hist.copy() # Store the original data to see changes
# ******* Here comes a lonely giant
hist.fill(197)
step1 = hist.copy() # Store the intermediate value
# ******* And a bunch of relatively short people
hist.fill_n([160, 160, 161, 157, 156, 159, 162])
# See how the plot changes (you can safely ignore the following 4 lines)
ax = hist.plot(label="After fill_n");
step1.plot(color="yellow", ax=ax, label="After fill")
original.plot(color="red", ax=ax, label="Before filling")
ax.legend(loc=1)
# See the number of entries
hist
[6]:
Histogram1D(bins=(10,), total=64, dtype=int64)
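What filling does can be pictured with plain NumPy. This is only a sketch, not physt's implementation: the helper name fill and the edge rule (half-open bins, with the last bin closed on the right, as in numpy.histogram) are assumptions made for illustration. The edges and frequencies are the ones printed for the first histogram above.

```python
import numpy as np

# Edges and frequencies as printed for the first histogram above
edges = np.linspace(155, 198, 11)
counts = np.array([2, 6, 5, 11, 15, 7, 6, 2, 1, 1])


def fill(counts, edges, value):
    """Increment the bin containing `value`; return its index (None if out of range)."""
    if value < edges[0] or value > edges[-1]:
        return None  # A real histogram might record such misses as under/overflow
    # Bins are half-open [left, right); the last bin also includes its right edge
    index = int(np.searchsorted(edges, value, side="right")) - 1
    index = min(index, len(counts) - 1)
    counts[index] += 1
    return index


giant_bin = fill(counts, edges, 197)  # The lonely giant lands in the last bin
for height in [160, 160, 161, 157, 156, 159, 162]:
    fill(counts, edges, height)
total = int(counts.sum())
```

Because the object keeps both the edges and the counts, new values can be added at any time without going back to the raw data.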

Data representation¶
The primary goal of the physt library is to represent histograms as data objects with a set of methods for easy manipulation and analysis (including mathematical operations, adaptivity, summary statistics, …). The histogram classes used in the ROOT framework served as inspiration but not as a model to copy (though relevant methods often have the same names).
Based on its dimensionality, a histogram is an instance of one of the following classes (all inheriting from HistogramBase):
- Histogram1D for univariate data
- Histogram2D for bivariate data
- HistogramND for data with higher dimensionality
- …or some special dedicated class (user-provided). Currently, there is a PolarHistogram as an example (considered to be experimental, not API-stable).
However, these objects are __init__ialized with already calculated data, so you typically don’t construct them yourself but call one of the facade functions:
- histogram or h1
- histogram2d or h2
- histogramdd (or h3 for 3D case)
These functions try to find the best binning schema, calculate bin contents and set other properties for the histograms. In principle (if not, submit a bug report), if you call a function with arguments understood by the eponymous numpy functions (histogram, histogram2d and histogramdd), you should receive a histogram with exactly the same bin edges and bin contents. However, there are many more arguments available!
[7]:
# Back to people's parameters...
heights = np.random.normal(172, 10, 100)
weights = np.random.normal(70, 15, 100)
iqs = np.random.normal(100, 15, 100)
[8]:
# 1D histogram
h1(heights)
[8]:
Histogram1D(bins=(10,), total=100, dtype=int64)
[9]:
# 2D histogram
h2(heights, weights, [5, 7])
[9]:
Histogram2D(bins=(5, 7), total=100, dtype=int64)
[10]:
# 3D histogram
h3([heights, weights, iqs]) # Simplification with respect to numpy.histogramdd
[10]:
HistogramND(bins=(10, 10, 10), total=100, dtype=int64)
So, what do these objects contain? In principle:
- binning schema (_binning or _binnings)
- bin contents (frequencies) together with errors (errors)
- some statistics about the data (mean, variance, std)
- metadata (like name and axis_name or axis_names)
- …
In the following, properties of Histogram1D will be described. Analogous methods and data fields do exist also for Histogram2D and HistogramND, perhaps with the name in plural.
Binning schema¶
The structure of bins is stored in the histogram object as a hidden attribute _binning. This value is an instance of one of the binning classes that are all descendants of physt.binnings.BinningBase. You are not supposed to directly access this value because manipulating it without at the same time updating the bin contents is dangerous.
A dedicated notebook deals with the binning specifics; here we summarize only the most important features.
Histogram1D offers the following attributes to access (read-only or read-only-intended) the binning information (explicitly or implicitly stored in _binning):
[11]:
# Create a histogram with "reasonable" bins
data = np.random.normal(0, 7, 10000)
hist = histogram(data, "human", bin_count=4)
hist
[11]:
Histogram1D(bins=(6,), total=10000, dtype=int64)
[12]:
hist._binning # Just to show, don't use it
[12]:
FixedWidthBinning(bin_width=10.0, bin_count=6, min=-30.0)
[13]:
hist.bin_count # The total number of bins
[13]:
6
[14]:
hist.bins # Bins as array of both edges
[14]:
array([[-30., -20.],
[-20., -10.],
[-10., 0.],
[ 0., 10.],
[ 10., 20.],
[ 20., 30.]])
[15]:
hist.numpy_bins # Bin edges with the same semantics as the numpy.histogram
[15]:
array([-30., -20., -10., 0., 10., 20., 30.])
[16]:
hist.bin_left_edges
[16]:
array([-30., -20., -10., 0., 10., 20.])
[17]:
hist.bin_right_edges
[17]:
array([-20., -10., 0., 10., 20., 30.])
[18]:
hist.bin_centers # Centers of the bins - useful for interpretation of histograms as scatter data
[18]:
array([-25., -15., -5., 5., 15., 25.])
[19]:
hist.bin_widths # Widths of the bins - useful for calculating densities and also for bar plots
[19]:
array([10., 10., 10., 10., 10., 10.])
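All of the attributes above follow from the array of edges alone. A minimal NumPy sketch of the relationships (the variable names are illustrative and this is not how physt stores them internally):

```python
import numpy as np

# Edges with numpy.histogram semantics, as returned by numpy_bins above
edges = np.array([-30.0, -20.0, -10.0, 0.0, 10.0, 20.0, 30.0])

left_edges = edges[:-1]
right_edges = edges[1:]
bins = np.column_stack([left_edges, right_edges])  # One (left, right) row per bin
centers = (left_edges + right_edges) / 2
widths = right_edges - left_edges
```

For N bins there are N + 1 edges, which is why the left and right edge arrays are each one element shorter than the edge array.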
As a simple overview of the binning schemas provided by physt, we show the bins produced by each of them:
[20]:
list(binnings.binning_methods.keys()) # Show them all
[20]:
['numpy',
'exponential',
'quantile',
'fixed_width',
'integer',
'human',
'blocks',
'knuth',
'scott',
'freedman']
These names can be used as the second parameter of the h1 function:
[21]:
# Fixed-width
h1(data, "fixed_width", bin_width=6).numpy_bins
[21]:
array([-30., -24., -18., -12., -6., 0., 6., 12., 18., 24., 30.])
[22]:
# Numpy-like
print("Expected:", np.histogram(data, 5)[1])
print("We got:", h1(data, "numpy", bin_count=5).numpy_bins)
Expected: [-26.89092563 -16.07128189 -5.25163815 5.56800559 16.38764933
27.20729307]
We got: [-26.89092563 -16.07128189 -5.25163815 5.56800559 16.38764933
27.20729307]
[23]:
# Integer - centered around integers; useful for integer data
h1(data, "integer").numpy_bins
[23]:
array([-27.5, -26.5, -25.5, -24.5, -23.5, -22.5, -21.5, -20.5, -19.5,
-18.5, -17.5, -16.5, -15.5, -14.5, -13.5, -12.5, -11.5, -10.5,
-9.5, -8.5, -7.5, -6.5, -5.5, -4.5, -3.5, -2.5, -1.5,
-0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5,
8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5,
17.5, 18.5, 19.5, 20.5, 21.5, 22.5, 23.5, 24.5, 25.5,
26.5, 27.5])
[24]:
# Exponential - positive numbers required
h1(np.abs(data), "exponential").numpy_bins # We 'abs' the values
[24]:
array([1.03182494e-03, 2.03397046e-03, 4.00943579e-03, 7.90354415e-03,
1.55797507e-02, 3.07113654e-02, 6.05393490e-02, 1.19337344e-01,
2.35242068e-01, 4.63717632e-01, 9.14096888e-01, 1.80190069e+00,
3.55197150e+00, 7.00177407e+00, 1.38021491e+01, 2.72072931e+01])
[25]:
# Quantile - each bin should have a similar statistical importance
h1(data, "quantile", bin_count=5).numpy_bins
[25]:
array([-26.89092563, -5.87687499, -1.69550961, 1.81670859,
5.79232538, 27.20729307])
[26]:
# Human - as friendly to your plots as possible, you may set an approximate number of bins
h1(data, "human").numpy_bins
[26]:
array([-30., -25., -20., -15., -10., -5., 0., 5., 10., 15., 20.,
25., 30.])
Bin contents¶
The bin contents (frequencies) and associated errors (errors) are stored as numpy arrays with a shape corresponding to the number of bins (in all dimensions). Again, you cannot manipulate these properties directly (unless you break the don’t-touch-the-underscore convention).
[27]:
hist = h1(data, "human")
hist.frequencies
[27]:
array([ 2, 23, 140, 604, 1580, 2601, 2691, 1640, 573, 124, 17,
5], dtype=int64)
Errors are calculated as \(\sqrt{N}\), which is the simplest expectation for independent values. If this does not suit your data, you can set your own errors through the _errors2 field, which contains squared errors.
Note: Filling with weights, arithmetic operations and scaling preserve correct error values under similar conditions.
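The arithmetic behind these statements can be checked with NumPy alone. The sketch below assumes only what is stated above (squared errors equal to N for unweighted counts); the variable names are illustrative and the code does not touch physt itself.

```python
import numpy as np

frequencies = np.array([2, 23, 140])  # First three bins of the histogram above
errors2 = frequencies.astype(float)   # Squared errors: N for unweighted counts
errors = np.sqrt(errors2)

# Scaling by a constant k multiplies squared errors by k**2,
# so the errors themselves scale by |k|
k = 2
scaled_errors = np.sqrt(errors2 * k**2)
```

Storing squared errors is convenient because they add up linearly when independent histograms are summed, whereas the errors themselves do not.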
[28]:
hist.errors
[28]:
array([ 1.41421356, 4.79583152, 11.83215957, 24.57641145, 39.74921383,
51. , 51.8748494 , 40.49691346, 23.93741841, 11.13552873,
4.12310563, 2.23606798])
[29]:
# Doubling the histogram doubles the error
(hist * 2).errors
[29]:
array([ 2.82842712, 9.59166305, 23.66431913, 49.15282291,
79.49842766, 102. , 103.74969879, 80.99382693,
47.87483681, 22.27105745, 8.24621125, 4.47213595])
Data types¶
Internally, histogram bins can contain values of several types (dtype in numpy terminology). By default, this is either np.int64 (for histograms without weights) or np.float64 (for histograms with weights). Wherever possible, this distinction is preserved. If you fill in with weights, multiply by a float constant, divide, … (basically whenever this is reasonable), an integer histogram is automatically converted to a float one.
[30]:
hist = h1(data)
print("Default type:", hist.dtype)
hist = h1(data, weights=np.abs(data)) # Add some random weights
print("Default type with weights:", hist.dtype)
hist = h1(data)
hist.fill(1.0, weight=.44)
print("Default type after filling with weight:", hist.dtype)
hist = h1(data)
hist *= 2
print("Default type after multiplying by an int:", hist.dtype)
hist *= 5.6
print("Default type after multiplying by a float:", hist.dtype)
Default type: int64
Default type with weights: float64
Default type after filling with weight: float64
Default type after multiplying by an int: int64
Default type after multiplying by a float: float64
[31]:
# You can specify the type in the method call
hist = h1(data, dtype="int32")
hist.dtype
[31]:
dtype('int32')
[32]:
# You can set the type of the histogram using the attribute
hist = h1(data)
hist.dtype = np.int32
hist.dtype
[32]:
dtype('int32')
[33]:
# Adding two histograms uses the broader range
hist1 = h1(data, dtype="int64")
hist2 = h1(data, dtype="float32")
(hist1 + hist2).dtype # See the result!
[33]:
dtype('float64')
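The results above mirror NumPy's own type promotion rules. The following sketch shows those rules on plain arrays; that physt delegates to them is an assumption, not something this document states.

```python
import numpy as np

int_counts = np.array([1, 2, 3], dtype=np.int64)

times_int = int_counts * 2      # Stays an integer array
times_float = int_counts * 5.6  # Promoted to a float array

# Smallest common type able to hold both operand types
mixed = np.result_type(np.int64, np.float32)
```

float32 cannot represent every int64 value, which is why the common type of the two is float64 rather than float32.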
Manually create histogram instances¶
As mentioned, h1 and h2 are just facade functions. You can construct the objects directly using the constructors. The first argument accepts something that can be interpreted as a binning or a list of bins; the second argument is an array of frequencies (a numpy array or something convertible).
[34]:
from physt.histogram1d import Histogram1D
from physt.histogram_nd import Histogram2D
hist1 = Histogram1D([0.0, 0.2, 0.4, 0.6, 0.8, 1.0], [1, 2, 3, 4, 5])
hist2 = Histogram2D([[0, 0.5, 1], [0, 1, 2, 3]], [[0.2, 2.2, 7.3], [6, 5, 3]], axis_names=["x", "y"])
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
hist1.plot(ax = axes[0])
hist2.plot(ax = axes[1])
hist1, hist2
[34]:
(Histogram1D(bins=(5,), total=15, dtype=int64),
Histogram2D(bins=(2, 3), total=23.7, dtype=float64))

[35]:
# Create a physt "logo", also available as physt.examples.fist
_, ax = plt.subplots(figsize=(4, 4))
widths = np.cumsum([0, 1.2, 0.2, 1, 0.1, 1, 0.1, 0.9, 0.1, 0.8])
fingers = np.asarray([4, 1, 7.5, 6, 7.6, 6, 7.5, 6, 7.2]) + 5
hist1 = Histogram1D(widths, fingers)
hist1.plot(lw=0, ax=ax)
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlabel("physt")
hist1
[35]:
Histogram1D(bins=(9,), total=97.8, dtype=float64)

Indexing¶
Supported indexing is more or less compatible with numpy arrays.
[36]:
hist.find_bin(3) # Find a proper bin for some value (0 - based indices)
[36]:
5
[37]:
hist[3] # Return the bin (with frequency)
[37]:
(array([-10.66146002, -5.25163815]), 1600)
[38]:
hist[-3:] # Sub-histogram (as slice)
[38]:
Histogram1D(bins=(3,), total=559, dtype=int32)
[39]:
hist[hist.frequencies > 5] # Masked array (destroys underflow & overflow information)
[39]:
Histogram1D(bins=(10,), total=10000, dtype=int32)
[40]:
hist[[1, 3, 5]] # Select some of the bins
[40]:
Histogram1D(bins=(3,), total=4550, dtype=int32)
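The totals reported above can be reproduced by indexing a plain array of frequencies in the same three ways. The frequencies are those of the 10-bin histogram used in this section (they are listed in the to_dataframe output later in this document). A real histogram also has to select the matching bin edges, which this sketch leaves out.

```python
import numpy as np

frequencies = np.array([10, 100, 534, 1600, 2803, 2850, 1544, 471, 75, 13])

last_three = frequencies[-3:]              # Slice
selected = frequencies[[1, 3, 5]]          # List of indices
above_five = frequencies[frequencies > 5]  # Boolean mask

totals = (int(last_three.sum()), int(selected.sum()), int(above_five.sum()))
```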
Arithmetics¶
With histograms, you can do basic arithmetic operations, preserving bins and usually having intuitive meaning.
[41]:
hist + hist
[41]:
Histogram1D(bins=(10,), total=20000, dtype=int32)
[42]:
hist - hist
c:\users\janpi\documents\code\my\physt\physt\histogram_base.py:852: UserWarning:
Subtracting histograms is considered to be a bad idea.
[42]:
Histogram1D(bins=(10,), total=0, dtype=int32)
[43]:
hist * 0.45
[43]:
Histogram1D(bins=(10,), total=4500.000000000001, dtype=float64)
[44]:
hist / 0.45
[44]:
Histogram1D(bins=(10,), total=22222.222222222226, dtype=float64)
Some of the operations are prohibited:
[45]:
try:
    hist * hist # Does not make sense
except Exception as ex:
    print(repr(ex))
TypeError('Multiplication of two histograms is not supported.')
[46]:
try:
    hist + 4 # Does not make sense
except Exception as ex:
    print(repr(ex))
TypeError("Only histograms can be added together. <class 'int'> found instead.")
[47]:
try:
    (-0.2) * hist
except Exception as ex:
    print(repr(ex))
ValueError('Cannot have negative frequencies.')
Some of the above checks are dropped if you allow “free arithmetics”. You can do this by:
1. Setting the PHYST_FREE_ARITHMETICS environment variable to 1 (note: not any other “truthy” value)
2. Setting config.free_arithmetics to True
3. Using the context manager config.enable_free_arithmetics():
[48]:
from physt.config import config
with config.enable_free_arithmetics():
    neg_hist = (-0.2) * hist
ax = neg_hist.plot()
ax.set_ylim((-800, 0)) # TODO: Rendering bug requires this
neg_hist
c:\users\janpi\documents\code\my\physt\physt\histogram_base.py:338: UserWarning:
Negative frequencies in the histogram.
[48]:
Histogram1D(bins=(10,), total=-1999.9999999999998, dtype=float64)

With this relaxation, you can also use any numpy array as (right) operand for any of the operations:
[49]:
# Add some noise
with config.enable_free_arithmetics():
    hist_plus_array = hist + np.random.normal(800, 200, hist.shape)
hist_plus_array.plot()
hist_plus_array
[49]:
Histogram1D(bins=(10,), total=18628.188812174514, dtype=float64)

If you need to side-step any rules completely, just use the histogram in a numpy array:
[50]:
np.asarray(hist) * np.asarray(hist)
# Exercise: Reconstruct a histogram with original bins
[50]:
array([ 100, 10000, 285156, 2560000, 7856809, 8122500, 2383936,
221841, 5625, 169])
Statistics¶
When creating histograms, it is possible to keep simple statistics about the sampled distribution, like mean() and std(). The behaviour was inspired by similar features in ROOT.
Yet to be refined.
[51]:
hist.mean()
[51]:
-0.0
[52]:
hist.std()
[52]:
0.0
Plotting¶
This is currently based on matplotlib, but other tools might come later (d3.js, bokeh?)
[53]:
hist.plot(); # Basic plot

[54]:
hist.plot(density=True, errors=True, ecolor="red"); # Include errors

[55]:
hist.plot(show_stats=True, errors=True, alpha=0.3); # Show summary statistics (not fully supported yet)

[56]:
hist.plot(cumulative=True, color="yellow", lw=3, edgecolor="red"); # Use matplotlib parameters

[57]:
hist.plot(kind="scatter", s=hist.frequencies, cmap="rainbow", density=True); # Another plot type

[58]:
hist.plot(kind="step", lw=4)
[58]:
<AxesSubplot:xlabel='axis0'>

[59]:
# Plot different bins using different styles
axis = hist[hist.frequencies > 5].plot(label="High", alpha=0.5)
hist[1:-1][hist[1:-1].frequencies <= 5].plot(ax=axis, color="green", label="Low", alpha=0.5)
hist[[0, -1]].plot(ax=axis, color="red", label="Edge cases", alpha=0.5)
hist.plot(kind="scatter", ax=axis, s=hist.frequencies / 10, label="Scatter")
# axis.legend(); # Does not work - why?
[59]:
<AxesSubplot:xlabel='axis0'>

[60]:
# Bar plot with colormap (with logarithmic scale)
ax = hist.plot(cmap="Reds_r", yscale="log", show_values=True);

Irregular binning and densities¶
[61]:
figure, axes = plt.subplots(1, 3, figsize=(11, 3))
hist_irregular = histogram(heights, [160, 162, 166, 167, 175, 188, 191])
hist_irregular.plot(ax=axes[0], errors=True, cmap="rainbow");
hist_irregular.plot(ax=axes[1], density=True, errors=True, cmap="rainbow");
hist_irregular.plot(ax=axes[2], density=True, cumulative=True, cmap="rainbow");
axes[0].set_title("Absolute values")
axes[1].set_title("Densities")
axes[2].set_title("Cumulative");
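With irregular bins, raw frequencies are hard to compare because wide bins collect more entries. A sketch of two common conventions, frequency per unit width and a normalization that integrates to one, follows. The sample values are made up for illustration, and which convention physt's density=True applies is not stated here.

```python
import numpy as np

# Illustrative sample, not the tutorial data
heights_sample = np.array([161, 163, 165, 166, 170, 172, 174, 180, 185, 190])
edges = np.array([160, 162, 166, 167, 175, 188, 191])  # Irregular edges as above

counts, _ = np.histogram(heights_sample, bins=edges)
widths = np.diff(edges)

per_width = counts / widths                    # Frequency per unit of x
normalized = counts / (counts.sum() * widths)  # Additionally integrates to 1
integral = float((normalized * widths).sum())
```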

Adding new values¶
Add (fill) single values¶
[62]:
figure, axes = plt.subplots(1, 4, figsize=(12, 3))
hist3 = histogram([], 20, range=(160, 200))
for i, ax in enumerate(axes):
    for height in np.random.normal(165 + 10 * i, 2.8, 10000):
        hist3.fill(height)
    hist3.plot(ax=ax);
    print("After {0} batches: {1}".format(i, hist3))
figure.tight_layout()
After 0 batches: Histogram1D(bins=(20,), total=9648, dtype=int64)
After 1 batches: Histogram1D(bins=(20,), total=19648, dtype=int64)
After 2 batches: Histogram1D(bins=(20,), total=29648, dtype=int64)
After 3 batches: Histogram1D(bins=(20,), total=39251, dtype=int64)

Add histograms with same binning¶
[63]:
heights1 = histogram(np.random.normal(169, 10, 100000), 50, range=(150, 200))
heights2 = histogram(np.random.normal(180, 11, 100000), 50, range=(150, 200))
total = heights1 + heights2
axis = heights1.plot(label="Women", color="red", alpha=0.5)
heights2.plot(label="Men", color="blue", alpha=0.5, ax=axis)
total.plot(label="All", color="gray", alpha=0.5, ax=axis)
axis.legend();
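Adding histograms that share a binning is equivalent to histogramming the combined data. A NumPy-only sketch of that identity, with smaller illustrative samples and variable names that are not part of physt:

```python
import numpy as np

rng = np.random.default_rng(0)
edges = np.linspace(150, 200, 51)  # 50 shared bins, as above

women = rng.normal(169, 10, 1000)
men = rng.normal(180, 11, 1000)

counts_women, _ = np.histogram(women, bins=edges)
counts_men, _ = np.histogram(men, bins=edges)
counts_all, _ = np.histogram(np.concatenate([women, men]), bins=edges)

summed = counts_women + counts_men  # Bin-by-bin addition
```

The identity only holds when both histograms use exactly the same edges, which is why the range and the bin count are fixed explicitly in the cell above.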

Compatibility¶
Note: Mostly, the compatibility is a trivial consequence of the object being convertible to a numpy array.
[64]:
# Convert to pandas dataframe
hist.to_dataframe()
[64]:
 | left | right | frequency | error |
---|---|---|---|---|
0 | -26.890926 | -21.481104 | 10 | 3.162278 |
1 | -21.481104 | -16.071282 | 100 | 10.000000 |
2 | -16.071282 | -10.661460 | 534 | 23.108440 |
3 | -10.661460 | -5.251638 | 1600 | 40.000000 |
4 | -5.251638 | 0.158184 | 2803 | 52.943366 |
5 | 0.158184 | 5.568006 | 2850 | 53.385391 |
6 | 5.568006 | 10.977827 | 1544 | 39.293765 |
7 | 10.977827 | 16.387649 | 471 | 21.702534 |
8 | 16.387649 | 21.797471 | 75 | 8.660254 |
9 | 21.797471 | 27.207293 | 13 | 3.605551 |
[65]:
# Works on xarray
import xarray as xr
arr = xr.DataArray(np.random.rand(10, 50, 100))
histogram(arr).plot(cmap="Reds_r", cmap_min=4744, cmap_max=5100, lw=1, edgecolor="red", show_values=True);

[66]:
# Works on pandas Series
import pandas as pd
series = pd.Series(heights, name="height [cm]")
hist = histogram(series, title="Height distribution")
hist.plot()
hist
[66]:
Histogram1D(bins=(10,), total=100, dtype=int64)

Export & import¶
[67]:
json = hist.to_json() # add path argument to write it to file
json
[67]:
'{"histogram_type": "Histogram1D", "binnings": [{"adaptive": false, "binning_type": "NumpyBinning", "numpy_bins": [144.46274207992508, 148.91498677707023, 153.36723147421537, 157.81947617136055, 162.2717208685057, 166.72396556565084, 171.176210262796, 175.62845495994114, 180.0806996570863, 184.53294435423146, 188.9851890513766]}], "frequencies": [2, 4, 4, 15, 11, 12, 19, 17, 7, 9], "dtype": "int64", "errors2": [2, 4, 4, 15, 11, 12, 19, 17, 7, 9], "meta_data": {"name": null, "title": "Height distribution", "axis_names": ["height [cm]"]}, "missed": [0, 0, 0], "missed_keep": true, "physt_version": "0.4.12.2", "physt_compatible": "0.3.20"}'
[68]:
from physt.io import parse_json
hist = parse_json(json)
hist.plot()
hist
[68]:
Histogram1D(bins=(10,), total=100, dtype=int64)

Plotting physt histograms¶
Some matplotlib-based plotting examples, with no exhaustive documentation.
[1]:
# Necessary import evil
import physt
from physt import h1, h2, histogramdd
from physt.plotting import matplotlib
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline
np.random.seed(42)
from physt import plotting
[2]:
# Some data
x = np.random.normal(100, 1, 10000)
y = np.random.normal(10, 10, 10000)
[3]:
ax = h2(x, y, 15).plot(figsize=(6, 6), show_zero=False, alpha=0, text_color="black", show_values=True, cmap="BuGn_r", show_colorbar=False, transform=lambda x:1)
h2(x, y, 50).plot.image(cmap="Spectral_r", alpha=0.75, ax=ax)
[3]:
<AxesSubplot:xlabel='axis0', ylabel='axis1'>

[4]:
h2(x, y, 40, name="Gauss").plot("image", cmap="rainbow", figsize=(5, 5))
[4]:
<AxesSubplot:title={'center':'Gauss'}, xlabel='axis0', ylabel='axis1'>

[5]:
plotting.matplotlib.bar3d(h2(x, y, 10, name="Gauss"), figsize=(5, 5), cmap="Accent");

[6]:
h1(x, "human", bin_count=10, name="Gauss").plot(ylim=(100, 1020), cmap="Greys", ticks="edge", errors=True);

[7]:
h1(x, "human", bin_count=200, name="Gauss").plot.line(errors=True, yscale="log")
[7]:
<AxesSubplot:xlabel='axis0'>

[8]:
h1(x, "human", bin_count=200, name="Gauss").plot.fill(lw=1, alpha=.4, figsize=(8, 4))
h1(x, "human", bin_count=200, name="Gauss").plot.fill(lw=1, alpha=.4, yscale="log", figsize=(8, 4), color="red")
[8]:
<AxesSubplot:xlabel='axis0'>


[9]:
h1(x, "human", bin_count=200, name="Gauss").plot.scatter(errors=True, xlim=(90, 100), show_stats="all")
[9]:
<AxesSubplot:xlabel='axis0'>

[10]:
ha = h1(x, "human", bin_count=20, name="Left")
hb = h1(x + 1 * np.sin(x / 12), "human", bin_count=40, name="Right")
from physt.plotting.matplotlib import pair_bars
[11]:
pair_bars(ha, hb, density=True, errors=True, figsize=(5, 5));
c:\users\janpi\documents\code\my\physt\physt\histogram_base.py:355: UserWarning:
Negative frequencies in the histogram.
c:\users\janpi\documents\code\my\physt\physt\plotting\matplotlib.py:789: UserWarning:
FixedFormatter should only be used together with FixedLocator

Example - histogram of time values¶
[12]:
# Get values close to
from physt.plotting.common import TimeTickHandler
data = np.random.normal(3600, 900, 4800)
H = h1(data, "human", axis_name="time")
H.plot(tick_handler=TimeTickHandler())
[12]:
<AxesSubplot:xlabel='time'>

2D Histograms in physt¶
[1]:
# Necessary import evil
import physt
from physt import h1, h2, histogramdd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
np.random.seed(42)
[2]:
# Some data
x = np.random.normal(100, 1, 1000)
y = np.random.normal(10, 10, 1000)
[3]:
# Create a simple histogram
histogram = h2(x, y, [8, 4], name="Some histogram", axis_names=["x", "y"])
histogram
[3]:
Histogram2D('Some histogram', bins=(8, 4), total=1000, dtype=int64)
[4]:
# Frequencies are a 2D-array
histogram.frequencies
[4]:
array([[ 0, 2, 4, 0],
[ 3, 26, 20, 5],
[ 17, 78, 104, 10],
[ 26, 163, 147, 17],
[ 17, 136, 96, 17],
[ 6, 41, 38, 6],
[ 1, 11, 7, 0],
[ 0, 1, 0, 1]], dtype=int64)
Multidimensional binning¶
In most cases, binning methods that apply to 1D histograms can also be used in higher dimensions. In such cases, each parameter can be either a scalar (applied to all dimensions) or a list/tuple with independent values for each dimension. This also applies to range, which has to be a list/tuple of tuples.
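The scalar-or-sequence rule can be sketched as a small helper. The name per_axis is hypothetical and does not exist in physt; it only illustrates how one value per dimension can be derived from either form.

```python
def per_axis(value, ndim):
    """Return a list with one parameter value per dimension.

    A scalar is repeated for every dimension; a list/tuple must already
    contain exactly `ndim` entries.
    """
    if isinstance(value, (list, tuple)):
        if len(value) != ndim:
            raise ValueError(
                "Expected {0} values, got {1}".format(ndim, len(value))
            )
        return list(value)
    return [value] * ndim


same_for_both = per_axis(5, 2)         # Scalar applies to all dimensions
independent = per_axis([2, 10], 2)     # One value per dimension
```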
[6]:
histogram = h2(x, y, "fixed_width", bin_width=[2, 10], name="Fixed-width bins", axis_names=["x", "y"])
histogram.plot();
histogram.numpy_bins
[6]:
[array([ 96., 98., 100., 102., 104.]),
array([-20., -10., 0., 10., 20., 30., 40., 50.])]

[7]:
histogram = h2(x, y, "quantile", bin_count=[3, 4], name="Quantile bins", axis_names=["x", "y"])
histogram.plot(cmap_min=0);
histogram.numpy_bins
[7]:
[array([ 96.75873266, 99.54993453, 100.40825276, 103.85273149]),
array([-19.40388635, 3.93758311, 10.63077132, 17.28882177,
41.93107568])]

[8]:
histogram = h2(x, y, "human", bin_count=5, name="Human-friendly bins", axis_names=["x", "y"])
histogram.plot();
histogram.numpy_bins
[8]:
[array([ 96., 98., 100., 102., 104.]),
array([-20., -10., 0., 10., 20., 30., 40., 50.])]

Plotting¶
2D¶
[ ]:
# Default is workable
ax = histogram.plot()
[9]:
# Custom colormap, no colorbar
import matplotlib.cm as cm
fig, ax = plt.subplots()
ax = histogram.plot(ax=ax, cmap=cm.copper, show_colorbar=False, grid_color=cm.copper(0.5))
ax.set_title("Custom colormap");

[10]:
# Use a named colormap + limit it to a range of values
import matplotlib.cm as cm
fig, ax = plt.subplots()
ax = histogram.plot(ax=ax, cmap="Oranges", show_colorbar=True, cmap_min=20, cmap_max=100, show_values=True)
ax.set_title("Clipped colormap");

[11]:
# Show labels (and hide zero bins), no grid(lw=0)
ax = histogram.plot(show_values=True, show_zero=False, cmap=cm.RdBu, format_value=float, lw=0)

Large histograms as images¶
Plotting histograms in this way gets problematic with more than roughly 50x50 bins. There is an alternative, though, partially inspired by the datashader project: plot the histogram as a bitmap, which works very fast even for very large histograms.
Note: This method does not work for histograms with irregular bins.
[12]:
x = np.random.normal(100, 1, 1000000)
y = np.random.normal(10, 10, 1000000)
[13]:
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
h2(x, y, 20, name="20 bins - map").plot("map", cmap="rainbow", lw=0, alpha=1, ax=axes[0], show_colorbar=False)
h2(x, y, 20, name="20 bins - image").plot("image", cmap="rainbow", alpha=1, ax=axes[1])
h2(x, y, 500, name="500 bins - image").plot("image", cmap="rainbow", alpha=1, ax=axes[2]);

Note that the image output is equivalent to the map without lines.
Transformation¶
Sometimes, the value range is too big to show details. Therefore, it may be of some use to transform the values by a function, e.g. logarithm.
[14]:
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
h2(x, y, 20, name="20 bins - map").plot("map", alpha=1, lw=0, show_zero=False, cmap="rainbow", ax=axes[0], show_colorbar=False, cmap_normalize="log")
h2(x, y, 20, name="20 bins - image").plot("image", alpha=1, ax=axes[1], cmap="rainbow", cmap_normalize="log")
h2(x, y, 500, name="500 bins - image").plot("image", alpha=1, ax=axes[2], cmap="rainbow", cmap_normalize="log");

[15]:
# Composition - show histogram overlaid with "points"
fig, ax = plt.subplots(figsize=(8, 7))
h_2 = h2(x, y, 30)
h_2.plot("map", lw=0, alpha=0.9, cmap="Blues", ax=ax, cmap_normalize="log", show_zero=False)
# h2(x, y, 300).plot("image", alpha=1, cmap="Greys", ax=ax, transform=lambda x: x > 0);
# Not working currently
[15]:
<AxesSubplot:xlabel='axis0', ylabel='axis1'>

3D¶
By this, we mean 3D bar plots of 2D histograms (not a visual representation of 3D histograms).
[16]:
histogram.plot("bar3d", cmap="rainbow");

[17]:
histogram.plot("bar3d", color="red");

Projections¶
[18]:
proj1 = histogram.projection("x", name="Projection to X")
proj1.plot(errors=True)
proj1
[18]:
Histogram1D('Projection to X', bins=(4,), total=1000, dtype=int64)

[19]:
proj2 = histogram.projection("y", name="Projection to Y")
proj2.plot(errors=True)
proj2
[19]:
Histogram1D('Projection to Y', bins=(7,), total=1000, dtype=int64)
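In the usual sense, a projection is a marginal sum: the frequencies are summed over every axis that is not kept. The sketch below applies this to the 8 x 4 frequency array printed near the start of this section (not to the 4 x 7 human-friendly histogram projected above); that physt computes its projections this way is an assumption.

```python
import numpy as np

# Frequencies of the first 2D histogram in this section (8 bins in x, 4 in y)
frequencies = np.array([
    [0, 2, 4, 0],
    [3, 26, 20, 5],
    [17, 78, 104, 10],
    [26, 163, 147, 17],
    [17, 136, 96, 17],
    [6, 41, 38, 6],
    [1, 11, 7, 0],
    [0, 1, 0, 1],
])

projection_x = frequencies.sum(axis=1)  # Sum over y, keep the 8 x bins
projection_y = frequencies.sum(axis=0)  # Sum over x, keep the 4 y bins
```

Both projections keep the total of 1000 entries, just as the projected histograms above do.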

Adaptive 2D histograms¶
[20]:
# Create and add two histograms with adaptive binning
height1 = np.random.normal(180, 5, 1000)
weight1 = np.random.normal(80, 2, 1000)
ad1 = h2(height1, weight1, "fixed_width", bin_width=1, adaptive=True)
ad1.plot(show_zero=False)
height2 = np.random.normal(160, 5, 1000)
weight2 = np.random.normal(70, 2, 1000)
ad2 = h2(height2, weight2, "fixed_width", bin_width=1, adaptive=True)
ad2.plot(show_zero=False)
(ad1 + ad2).plot(show_zero=False);



N-dimensional histograms¶
Although it is not easy to visualize them, it is possible to create histograms of any dimension that behave similarly to 2D ones. Warning: be aware that the memory consumption can be significant.
[21]:
# Create a 4D histogram
data = [np.random.rand(1000)[:, np.newaxis] for i in range(4)]
data = np.concatenate(data, axis=1)
h4 = histogramdd(data, [3, 2, 2, 3], axis_names="abcd")
h4
[21]:
HistogramND(bins=(3, 2, 2, 3), total=1000, dtype=int64)
[22]:
h4.frequencies
[22]:
array([[[[31, 28, 33],
[21, 22, 22]],
[[25, 29, 28],
[29, 35, 28]]],
[[[20, 25, 20],
[28, 32, 31]],
[[30, 28, 24],
[29, 21, 27]]],
[[[27, 26, 33],
[21, 35, 30]],
[[38, 30, 32],
[25, 30, 27]]]], dtype=int64)
[23]:
h4.projection("a", "d", name="4D -> 2D").plot(show_values=True, format_value=int, cmap_min="min");

[24]:
h4.projection("d", name="4D -> 1D").plot("scatter", errors=True);

Support for pandas DataFrames (without pandas dependency ;-))¶
[25]:
# Load notorious example data set
iris = sns.load_dataset('iris')
[28]:
iris = sns.load_dataset('iris')
iris_hist = physt.h2(iris["sepal_length"], iris["sepal_width"], "human", bin_count=[12, 7], name="Iris")
iris_hist.plot(show_zero=False, cmap=cm.gray_r, show_values=True, format_value=int);

[29]:
iris_hist.projection("sepal_length").plot();

Binning in physt¶
[1]:
# Necessary import evil
%matplotlib inline
from physt import histogram, binnings
import numpy as np
import matplotlib.pyplot as plt
[2]:
# Some data
np.random.seed(42)
heights1 = np.random.normal(169, 10, 100000)
heights2 = np.random.normal(180, 6, 100000)
numbers = np.random.rand(100000)
Ideal number of bins¶
[3]:
X = [int(x) for x in np.logspace(0, 4, 50)]
algos = binnings.bincount_methods
Ys = { algo: [] for algo in algos}
for x in X:
    ex_dataset = np.random.exponential(1, x)
    for algo in algos:
        Ys[algo].append(binnings.ideal_bin_count(ex_dataset, algo))
figure, axis = plt.subplots(figsize=(8, 8))
for algo in algos:
    if algo == "default":
        axis.plot(X, Ys[algo], ":.", label=algo, alpha=0.5, lw=2)
    else:
        axis.plot(X, Ys[algo], "-", label=algo, alpha=0.5, lw=2)
axis.set_xscale("log")
axis.set_yscale("log")
axis.set_xlabel("Sample size")
axis.set_ylabel("Bin count")
axis.legend(loc=2);
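For orientation, here are three textbook rules that derive a bin count from the sample size alone. Which rules binnings.bincount_methods actually contains is not stated here, so the function names below are illustrative and not physt's.

```python
import math


def sturges(n):
    """Sturges' rule: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def sqrt_rule(n):
    """Square-root rule: ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))


def rice(n):
    """Rice rule: ceil(2 * n**(1/3))."""
    return math.ceil(2 * n ** (1 / 3))


sample_size = 1000
suggested = {
    "sturges": sturges(sample_size),
    "sqrt": sqrt_rule(sample_size),
    "rice": rice(sample_size),
}
```

The rules grow at very different rates with the sample size, which is the kind of spread a log-log plot like the one above makes visible.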

Binning schemes¶
Exponential binning¶
Uses numpy.logspace to create bins.
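A sketch of logarithmically spaced edges built with NumPy's logspace function, using the same range and bin count as the cell below. The exact arguments physt passes internally are not shown in this document, so treat the call as an assumption.

```python
import numpy as np

low, high, bin_count = 0.0001, 1.0, 10

# logspace takes exponents, so the limits are converted with log10
edges = np.logspace(np.log10(low), np.log10(high), bin_count + 1)

# Neighbouring edges differ by a constant factor rather than a constant step
ratios = edges[1:] / edges[:-1]
```

Because the edges must be positive, this kind of binning only makes sense for positive data.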
[4]:
figure, axis = plt.subplots(1, 2, figsize=(10, 4))
hist1 = histogram(numbers, "exponential", bin_count=10, range=(0.0001, 1))
hist1.plot(color="green", ax=axis[0])
hist1.plot(density=True, errors=True, ax=axis[1])
axis[0].set_title("Absolute scale")
axis[1].set_title("Log scale")
axis[1].set_xscale("log");

Integer binning¶
Useful for integer values (or something you want to round to integers), creates bins of width=1 around integers (i.e. 0.5-1.5, …)
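The idea can be checked exactly by enumerating all 36 outcomes of two dice instead of sampling them. This is a NumPy-only sketch; the way the edges are built here is an illustration of the "width 1 around integers" rule, not physt's code.

```python
import numpy as np

# All 36 equally likely outcomes of two dice, summed
faces = np.arange(1, 7)
sums = (faces[:, None] + faces[None, :]).ravel()

# Width-1 bins centred on the integers: 1.5-2.5, 2.5-3.5, ..., 11.5-12.5
edges = np.arange(sums.min() - 0.5, sums.max() + 1.0, 1.0)
counts, _ = np.histogram(sums, bins=edges)
```

Centring the bins on integers keeps every possible sum away from a bin edge, so no value can fall on a boundary.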
[5]:
# Sum of two dice (should be triangle, right?)
dice = np.floor(np.random.rand(10000) * 6) + np.floor(np.random.rand(10000) * 6) + 2
histogram(dice, "integer").plot(ticks="center", density=True);

Quantile-based binning¶
Based on quantiles, this binning results in all bins containing roughly the same number of observations.
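A minimal sketch of the idea with NumPy: the edges are placed at evenly spaced quantiles of the data. The sample and variable names are illustrative, and physt's own quantile binning may differ in details such as interpolation.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(169, 10, 10000)

bin_count = 5
# Edges at the 0 %, 20 %, ..., 100 % quantiles of the data
edges = np.quantile(data, np.linspace(0, 1, bin_count + 1))
counts, _ = np.histogram(data, bins=edges)

spread = int(counts.max() - counts.min())  # How unequal the bins are
```

The bins end up narrow where the data are dense and wide in the tails, which is why the density plot is usually the more informative view for this binning.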
[6]:
figure, axis = plt.subplots(1, 2, figsize=(10, 4))
# bins2 = binning.quantile_bins(heights1, 40)
hist2 = histogram(heights1, "quantile", bin_count=40)
hist2.plot(ax=axis[0]);
hist2.plot(density=True, ax=axis[1]);
axis[0].set_title("Frequencies")
axis[1].set_title("Density");
hist2
[6]:
Histogram1D(bins=(40,), total=100000, dtype=int64)

[7]:
figure, axis = plt.subplots()
histogram(heights1, "quantile", bin_count=10).plot(alpha=0.3, density=True, ax=axis, label="Quantile based")
histogram(heights1, 10).plot(alpha=0.3, density=True, ax=axis, color="green", label="Equal spaced")
axis.legend(loc=2);

Fixed-width bins¶
This binning is useful if you want “human-friendly” bin intervals.
[8]:
hist_fixed = histogram(heights1, "fixed_width", bin_width=3)
hist_fixed.plot()
hist_fixed
[8]:
Histogram1D(bins=(31,), total=100000, dtype=int64)

“Human” bins¶
The width and alignment of the bins are guessed from the data, with an approximate number of bins as an (optional) parameter.
[9]:
human = histogram(heights1, "human", bin_count=15)
human.plot()
human
[9]:
Histogram1D(bins=(19,), total=100000, dtype=int64)
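A common way to obtain such "human" widths is to round a raw width to a nearby round number. The sketch below assumes the candidates 1, 2, 5 and 10 times a power of ten; the helper name and the candidate set are illustrative assumptions and need not match what physt's find_human_width does.

```python
import math

def human_width(raw_width: float) -> float:
    """Round a raw bin width to the nearest of 1, 2, 5 or 10 times a power of ten."""
    exponent = math.floor(math.log10(raw_width))
    base = 10.0 ** exponent
    candidates = [m * base for m in (1, 2, 5, 10)]
    return min(candidates, key=lambda c: abs(c - raw_width))

for raw in (0.7, 3.4, 80):
    print(raw, "->", human_width(raw))
```

Because the width is rounded, the resulting number of bins is only approximately the requested one, which matches the output above (19 bins for a requested 15).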

Astropy binning¶
Astropy includes its own histogramming tools. If this package is available, we reuse its binning methods. These include:
- Bayesian blocks
- Knuth
- Freedman
- Scott
See http://docs.astropy.org/en/stable/visualization/histogram.html for more details.
[10]:
middle_sized = np.random.normal(180, 6, 5000)
for n in ["blocks", "scott", "knuth", "freedman"]:
    algo = "{0}".format(n)
    hist = histogram(middle_sized, algo, name=algo)
    hist.plot(density=True)
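For orientation, the two simplest of these rules can be written down directly. The sketch below uses the textbook forms of Scott's rule and the Freedman-Diaconis rule with numpy only; it does not call astropy, and the sample is generated here just for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.normal(180, 6, 5000)
n = sample.size

# Scott's rule: width proportional to the standard deviation.
scott_width = 3.5 * sample.std() / n ** (1 / 3)

# Freedman-Diaconis rule: width proportional to the interquartile range,
# which makes it less sensitive to outliers.
q25, q75 = np.percentile(sample, [25, 75])
freedman_width = 2 * (q75 - q25) / n ** (1 / 3)

print(scott_width, freedman_width)
```

Both widths shrink as n^(-1/3), so larger samples get finer bins. Bayesian blocks and Knuth's rule are optimisation-based and are not sketched here.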




Adaptive histogram¶
This type of histogram automatically adapts its bins when new values are added. Note that only the fixed-width continuous binning scheme is currently supported.
[1]:
# Necessary import evil
import physt
from physt import h1, h2, histogramdd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
[2]:
# Create an empty histogram
h = h1(None, "fixed_width", bin_width=10, name="People height", axis_name="cm", adaptive=True)
h
[2]:
Histogram1D('People height', bins=(0,), total=0, dtype=int64)
Adding single values¶
[3]:
# Add a first value
h.fill(157)
h.plot()
h
[3]:
Histogram1D('People height', bins=(1,), total=1, dtype=int64)

[4]:
# Add a second value
h.fill(173)
h.plot();

[5]:
# Add a few more values, including weights
h.fill(173, 2)
h.fill(186, 5)
h.fill(188, 3)
h.fill(193, 1)
h.plot(errors=True, show_stats=True);
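Conceptually, an adaptive fixed-width histogram only needs to know the bin width: the bin for any value follows from it. The toy class below is a sketch of that idea and not physt's implementation; it assumes that bins are aligned to multiples of the bin width, and the class name is made up.

```python
import math
from collections import defaultdict

class AdaptiveFixedWidth:
    """Toy fixed-width histogram whose bins appear on demand."""

    def __init__(self, bin_width: float):
        self.bin_width = bin_width
        self.weights = defaultdict(float)  # bin index -> accumulated weight

    def fill(self, value: float, weight: float = 1.0) -> None:
        # Bins are aligned to multiples of bin_width, so the index of the
        # bin containing a value never depends on earlier values.
        index = math.floor(value / self.bin_width)
        self.weights[index] += weight

    def bins(self):
        """Return (left_edge, right_edge, weight) for the covered range."""
        if not self.weights:
            return []
        lo, hi = min(self.weights), max(self.weights)
        return [
            (i * self.bin_width, (i + 1) * self.bin_width, self.weights.get(i, 0.0))
            for i in range(lo, hi + 1)
        ]

toy = AdaptiveFixedWidth(bin_width=10)
for value, weight in [(157, 1), (173, 1), (173, 2), (186, 5), (188, 3), (193, 1)]:
    toy.fill(value, weight)
print(toy.bins())
```

The values and weights are the ones filled in the cells above; the empty 160-170 bin appears in the result because the covered range is kept contiguous.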

Adding multiple values at once¶
[6]:
ha = h1(None, "fixed_width", bin_width=10, adaptive=True)
ha.plot(show_stats=True);

[7]:
# Beginning
ha.fill_n([10, 11, 34])
ha.plot();

[8]:
# Add a distant value
ha.fill_n([234], weights=[10])
ha.plot(show_stats=True);

[9]:
# Let's create a huge dataset
values = np.random.normal(130, 20, 100000)
[10]:
%%time
# Add lots of values (no loop in Python)
hn = h1(None, "fixed_width", bin_width=10, adaptive=True)
hn.fill_n(values)
# ha.plot()
Wall time: 15.5 ms
[11]:
%%time
# Comparison with Python loop
hp = h1(None, "fixed_width", bin_width=10, adaptive=True)
for value in values:
    hp.fill(value)
Wall time: 5.2 s
[12]:
# Hopefully equal results
print("Equal?", hp == hn)
hp.plot(show_stats=True);
Equal? True

Adding two adaptive histograms together¶
[13]:
ha1 = h1(None, "fixed_width", bin_width=5, adaptive=True)
ha1.fill_n(np.random.normal(100, 10, 1000))
ha2 = h1(None, "fixed_width", bin_width=5, adaptive=True)
ha2.fill_n(np.random.normal(70, 10, 500))
ha = ha1 + ha2
fig, ax= plt.subplots()
ha1.plot(alpha=0.1, ax=ax, label="1", color="red")
ha2.plot(alpha=0.1, ax=ax, label="2")
ha.plot("scatter", label="sum", ax=ax, errors=True)
ax.legend(loc=2); # TODO? Why don't we show the sum???

Interrupted workflow¶
This example shows that using IO, you can easily interrupt your workflow, save it and continue some other time.
[1]:
import numpy as np
import physt
%matplotlib inline
[2]:
histogram = physt.h1(None, "fixed_width", bin_width=0.1, adaptive=True)
histogram
[2]:
Histogram1D(bins=(0,), total=0, dtype=int64)
[3]:
# Big chunk of data
data1 = np.random.normal(0, 1, 10000000)
histogram.fill_n(data1)
histogram
[3]:
Histogram1D(bins=(106,), total=10000000, dtype=int64)
[4]:
histogram.plot()
[4]:
<AxesSubplot:xlabel='axis0'>

Store the histogram (and delete it to pretend we come with a fresh table):
[5]:
histogram.to_json(path="./histogram.json");
del histogram
Turn off the machine, go for lunch, return home later…
Read the histogram:
[6]:
histogram = physt.io.load_json(path="./histogram.json")
histogram
[6]:
Histogram1D(bins=(106,), total=10000000, dtype=int64)
[7]:
histogram.plot()
[7]:
<AxesSubplot:xlabel='axis0'>

The same one ;-)
Continue filling:
[8]:
# Another big chunk of data
data1 = np.random.normal(3, 2, 10000000)
histogram.fill_n(data1)
histogram
[8]:
Histogram1D(bins=(205,), total=20000000, dtype=int64)
[9]:
histogram.plot()
[9]:
<AxesSubplot:xlabel='axis0'>

Merging bins¶
[1]:
from physt.binnings import *
from physt import h1, h2
import numpy as np
np.random.seed(42)
%matplotlib inline
[2]:
data = np.random.rand(100)
[3]:
hh = h1(data, 120)
hh.plot(errors=True);

[4]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

[5]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

[6]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

[7]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

[8]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

[9]:
hh.merge_bins(2, inplace=True)
hh.plot(errors=True);

By min frequency¶
[10]:
data = np.random.normal(0, 1, 5000)
hh = h1(data, 120)
hh.plot();

[11]:
hh.merge_bins(min_frequency=100, inplace=True)
hh.plot(density=True);

[12]:
hh.merge_bins(min_frequency=600, inplace=True)
hh.plot(density=True);
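One possible way to merge by minimum frequency is a greedy left-to-right pass. The function below is a plain-Python sketch of that strategy, written for this tutorial; it is an assumption about how such merging can work, not a copy of what merge_bins does internally.

```python
def merge_by_min_frequency(edges, frequencies, min_frequency):
    """Greedily merge neighbouring bins until each holds at least min_frequency."""
    new_edges = [edges[0]]
    new_freqs = []
    running = 0
    for right_edge, freq in zip(edges[1:], frequencies):
        running += freq
        if running >= min_frequency:
            # Close the current merged bin at this edge.
            new_edges.append(right_edge)
            new_freqs.append(running)
            running = 0
    if new_edges[-1] != edges[-1]:
        if new_freqs:
            # Leftover bins on the right are absorbed by the last merged bin.
            new_freqs[-1] += running
            new_edges[-1] = edges[-1]
        else:
            # Total is below the threshold: everything ends up in one bin.
            new_freqs.append(running)
            new_edges.append(edges[-1])
    return new_edges, new_freqs

edges = [0, 1, 2, 3, 4, 5, 6]
frequencies = [1, 2, 7, 4, 1, 2]
print(merge_by_min_frequency(edges, frequencies, min_frequency=5))
# ([0, 3, 6], [10, 7])
```

The merged bins have unequal widths, which is why the plots above use density=True.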

The same can be done for 2D histograms (i.e. each column, each row should contain more than the minimum). Unfortunately, a general, irregular-shaped binning is not yet supported.
[13]:
# 2D example
data1 = np.random.normal(0, 1, 600)
data2 = np.random.rand(600)
[14]:
hh = h2(data1, data2, 23)
ax = hh.plot(show_zero=0, cmap="rainbow", show_colorbar=False);
ax.set_title("Before merging")
hh.merge_bins(min_frequency=30, inplace=True)
ax = hh.plot(density=True, show_zero=False, cmap="rainbow", show_colorbar=False)
ax.set_title("After merging");


Support for dask arrays¶
It is possible to operate on dask arrays and save memory (or perhaps even time).
[1]:
# Necessary imports
import dask
import dask.multiprocessing
import physt
import numpy as np
import dask.array as da
from physt import h1, h2
%matplotlib inline
[2]:
# Create two arrays
np.random.seed(42)
SIZE = 2 ** 21
CHUNK = int(SIZE / 16)
million = np.random.rand(SIZE)#.astype(int)
million2 = (3 * million + np.random.normal(0., 0.3, SIZE))#.astype(int)
# Chunk them for dask
chunked = da.from_array(million, chunks=(CHUNK))
chunked2 = da.from_array(million2, chunks=(CHUNK))
Create histograms¶
h1, h2, … have their alternatives in physt.compat.dask (imported below). They should work similarly, although they are not complete and unexpected errors may occur.
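The reason chunked computation works at all is that histograms with identical bins can be added. The following numpy-only sketch shows the principle without dask: each chunk is histogrammed on its own and the partial frequencies are summed. The fixed edges are an assumption made here so that all chunks share the same binning.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.random(2 ** 16)

# All chunks must share the same edges, otherwise the counts cannot be added.
edges = np.linspace(0.0, 1.0, 21)

# Histogram each chunk separately (this is the part that can run in parallel)...
chunks = np.array_split(data, 16)
partial = [np.histogram(chunk, bins=edges)[0] for chunk in chunks]

# ...and reduce the partial results by summing the frequencies.
combined = np.sum(partial, axis=0)
full, _ = np.histogram(data, bins=edges)
print(np.array_equal(combined, full))  # True
```

Only one chunk has to be in memory at a time, which is where the memory saving comes from.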
[3]:
from physt.compat.dask import h1 as d1
from physt.compat.dask import h2 as d2
[4]:
# Use chunks to create a 1D histogram
ha = d1(chunked2, "fixed_width", bin_width=0.2)
check_ha = h1(million2, "fixed_width", bin_width=0.2)
ok = (ha == check_ha)
print("Check: ", ok)
ha.plot()
ha
Check: True
[4]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)

[5]:
# Use chunks to create a 2D histogram
hb = d2(chunked, chunked2, "fixed_width", bin_width=.2, axis_names=["x", "y"])
check_hb = h2(million, million2, "fixed_width", bin_width=.2, axis_names=["x", "y"])
hb.plot(show_zero=False, cmap="rainbow")
ok = (hb == check_hb)
print("Check: ", ok)
hb
Check: True
[5]:
Histogram2D(bins=(5, 28), total=2097152, dtype=int64)

[6]:
# And another cross-check
hh = hb.projection("y")
hh.plot()
print("Check: ", np.array_equal(hh.frequencies, ha.frequencies)) # Just frequencies
Check: True

[8]:
# Use dask for normal arrays (will automatically split array to chunks)
d1(million2, "fixed_width", bin_width=0.2) == ha
[8]:
True
Some timings¶
Your results may vary substantially. These numbers are just for illustration, on a 4-core (8-thread) machine. The real gain comes when the data don’t fit into memory.
Efficiency¶
[9]:
# Standard
%time h1(million2, "fixed_width", bin_width=0.2)
Wall time: 361 ms
[9]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
[10]:
# Same array, but using dask
%time d1(million2, "fixed_width", bin_width=0.2)
Wall time: 116 ms
[10]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
[11]:
# Most efficient: dask with already chunked data
%time d1(chunked2, "fixed_width", bin_width=0.2)
Wall time: 91.8 ms
[11]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
Different scheduling¶
[12]:
%time d1(chunked2, "fixed_width", bin_width=0.2)
Wall time: 76 ms
[12]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
[13]:
%%time
# Hyper-threading or not?
graph, name = d1(chunked2, "fixed_width", bin_width=0.2, compute=False)
dask.threaded.get(graph, name, num_workers=4)
Wall time: 114 ms
[13]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
[14]:
# Multiprocessing not so efficient for small arrays?
%time d1(chunked2, "fixed_width", bin_width=0.2, dask_method=dask.multiprocessing.get)
Wall time: 960 ms
[14]:
Histogram1D(bins=(28,), total=2097152, dtype=int64)
Special histograms in physt¶
Sometimes, it is necessary to bin values in transformed coordinates (e.g. polar). In principle, it is possible to create histograms from already transformed values (i.e. r and φ). However, this is not always the best way to go, as each set of coordinates has its own peculiarities (e.g. the typical range of values for the azimuthal angle).
Physt provides a general framework for constructing the transformed histograms (see a dedicated section of this document) and a couple of the most frequently used variants:
- PolarHistogram
- SphericalHistogram
- CylindricalHistogram
[1]:
# Necessary import evil
%matplotlib inline
from physt import cylindrical, polar, spherical
import numpy as np
import matplotlib.pyplot as plt
[2]:
# Generate some points in the Cartesian coordinates
np.random.seed(42)
x = np.random.rand(1000)
y = np.random.rand(1000)
z = np.random.rand(1000)
Polar histogram¶
This histogram maps values to radius (r) and azimuthal angle (φ, ranging from 0 to 2π).
By default (unless you specify the phi_bins parameter), the whole azimuthal range is spanned (even if no values fall in parts of the circle).
[3]:
# Create a polar histogram with default parameters
hist = polar(x, y)
ax = hist.plot.polar_map()
hist
[3]:
PolarHistogram(bins=(10, 16), total=1000, dtype=int64)

[4]:
hist.bins
[4]:
[array([[0.02704268, 0.16306851],
[0.16306851, 0.29909433],
[0.29909433, 0.43512015],
[0.43512015, 0.57114597],
[0.57114597, 0.7071718 ],
[0.7071718 , 0.84319762],
[0.84319762, 0.97922344],
[0.97922344, 1.11524926],
[1.11524926, 1.25127509],
[1.25127509, 1.38730091]]),
array([[0. , 0.39269908],
[0.39269908, 0.78539816],
[0.78539816, 1.17809725],
[1.17809725, 1.57079633],
[1.57079633, 1.96349541],
[1.96349541, 2.35619449],
[2.35619449, 2.74889357],
[2.74889357, 3.14159265],
[3.14159265, 3.53429174],
[3.53429174, 3.92699082],
[3.92699082, 4.3196899 ],
[4.3196899 , 4.71238898],
[4.71238898, 5.10508806],
[5.10508806, 5.49778714],
[5.49778714, 5.89048623],
[5.89048623, 6.28318531]])]
[5]:
# Create a polar histogram with different binning
hist2 = polar(x+.3, y+.3, radial_bins="human", phi_bins="human")
ax = hist2.plot.polar_map(density=True)

[6]:
# Default axes names
hist.axis_names
[6]:
('r', 'phi')
When working with any transformed histogram, you can fill values in either the original or the transformed coordinates. All methods working with coordinates understand the parameter transformed, which (if True) says that the method parameters are already in the transformed coordinates; otherwise, all values are considered to be in the original coordinates and are transformed on inserting (creating, searching).
[7]:
# Using transformed / untransformed values
print("Non-transformed", hist.find_bin((0.1, 1)))
print("Transformed", hist.find_bin((0.1, 1), transformed=True))
print("Non-transformed", hist.find_bin((0.1, 2.7))) # Value
print("Transformed", hist.find_bin((0.1, 2.7), transformed=True))
Non-transformed (7, 3)
Transformed (0, 2)
Non-transformed None
Transformed (0, 6)
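The four results above can be checked by hand. The sketch below is written with the standard library only and hard-codes the bin layout printed by hist.bins earlier; the helper names are made up, and this is an illustration of the coordinate transform rather than physt's own code.

```python
import math

# Bin layout copied from the output of hist.bins above:
# 10 radial bins and 16 azimuthal bins spanning the full circle.
R_MIN, R_MAX, R_BINS = 0.02704268, 1.38730091, 10
PHI_MIN, PHI_MAX, PHI_BINS = 0.0, 2 * math.pi, 16

def to_polar(x, y):
    """Transform Cartesian (x, y) to (r, phi) with phi in [0, 2*pi)."""
    return math.hypot(x, y), math.atan2(y, x) % (2 * math.pi)

def bin_index(value, low, high, count):
    """Index of the equal-width bin containing value, or None if outside."""
    if not low <= value <= high:
        return None
    return min(int((value - low) / (high - low) * count), count - 1)

def find_bin(point, transformed=False):
    r, phi = point if transformed else to_polar(*point)
    ir = bin_index(r, R_MIN, R_MAX, R_BINS)
    iphi = bin_index(phi, PHI_MIN, PHI_MAX, PHI_BINS)
    return None if ir is None or iphi is None else (ir, iphi)

print(find_bin((0.1, 1)))                      # (7, 3)
print(find_bin((0.1, 1), transformed=True))    # (0, 2)
print(find_bin((0.1, 2.7)))                    # None
print(find_bin((0.1, 2.7), transformed=True))  # (0, 6)
```

The point (0.1, 2.7) has no bin in Cartesian coordinates because its radius (about 2.7) lies beyond the last radial bin, whereas read as (r, φ) it falls inside the histogram.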
[8]:
# Simple plotting, similar to Histogram2D
hist.plot.polar_map(density=True, show_zero=False, cmap="Wistia", lw=0.5, figsize=(5, 5));

Adding new values¶
[9]:
# Add a single, untransformed value
hist.fill((-.5, -.5), weight=12)
hist.plot.polar_map(density=True, show_zero=True, cmap="Reds", lw=0.5, figsize=(5, 5));

[10]:
# Add a couple of values, transformed
data = [[.5, 3.05], [.5, 3.2], [.7, 3.3]]
weights = [1, 5, 20]
hist.fill_n(data, weights=weights, transformed=True)
hist.plot.polar_map(density=True, show_zero=True, cmap="Reds", lw=0.5, figsize=(5, 5));

Projections¶
The projections are stored using specialized Histogram1D subclasses that keep (in the case of radial) information about the proper bin sizes.
[11]:
radial = hist.projection("r")
radial.plot(density=True, color="red", alpha=0.5).set_title("Density")
radial.plot(label="absolute", color="blue", alpha=0.5).set_title("Absolute")
radial.plot(label="cumulative", cumulative=True, density=True, color="green", alpha=0.5).set_title("Cumulative")
radial
[11]:
RadialHistogram(bins=(10,), total=1026, dtype=int64)



[12]:
hist.projection("phi").plot(cmap="rainbow")
[12]:
<AxesSubplot:xlabel='phi'>

Cylindrical histogram¶
To be implemented
[13]:
data = np.random.rand(100, 3)
h = cylindrical(data)
h
[13]:
CylindricalHistogram(bins=(10, 16, 10), total=100, dtype=int64)
[14]:
# %matplotlib qt
proj = h.projection("rho", "phi")
proj.plot.polar_map()
proj
[14]:
PolarHistogram(bins=(10, 16), total=100, dtype=int64)

[15]:
proj = h.projection("phi", "z")
ax = proj.plot.cylinder_map(show_zero=False)
ax.view_init(50, 70)
proj
[15]:
CylindricalSurfaceHistogram(bins=(16, 10), total=100, dtype=int64)

Spherical histogram¶
To be implemented
[16]:
n = 1000000
data = np.empty((n, 3))
data[:,0] = np.random.normal(0, 1, n)
data[:,1] = np.random.normal(0, 1.3, n)
data[:,2] = np.random.normal(1, 1.2, n)
h = spherical(data)
h
[16]:
SphericalHistogram(bins=(10, 16, 16), total=1000000, dtype=int64)
[17]:
globe = h.projection("theta", "phi")
# globe.plot()
globe.plot.globe_map(density=True, figsize=(7, 7), cmap="rainbow")
globe.plot.globe_map(density=False, figsize=(7, 7))
globe
[17]:
SphericalSurfaceHistogram(bins=(16, 16), total=1000000, dtype=int64)


Implementing custom transformed histogram¶
TO BE WRITTEN
ASCII plotting¶
Note: For this notebook to work properly, you need to install the ``asciiplotlib`` and ``xtermcolor`` packages.
[1]:
from physt import examples
from physt import plotting
plotting.set_default_backend("ascii")
import numpy as np
np.random.seed(42)
[2]:
examples.normal_h1().plot()
-3.92e+00 - -3.14e+00 [ 10] ▏
-3.14e+00 - -2.35e+00 [ 88] █▎
-2.35e+00 - -1.57e+00 [ 485] ██████▉
-1.57e+00 - -7.83e-01 [1605] ██████████████████████▋
-7.83e-01 - +1.92e-03 [2831] ███████████████████████████████████████▉
+1.92e-03 - +7.87e-01 [2844] ████████████████████████████████████████
+7.87e-01 - +1.57e+00 [1543] █████████████████████▊
+1.57e+00 - +2.36e+00 [ 498] ███████
+2.36e+00 - +3.14e+00 [ 88] █▎
+3.14e+00 - +3.93e+00 [ 8] ▏
[3]:
plotting.ascii.ENABLE_ASCIIPLOTLIB = False
examples.normal_h1().plot(show_values=True)
13
# 143
##### 680
################# 2160
########################## 3223
#################### 2482
######## 1059
## 213
25
2
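Plain-text bars like the ones above are easy to produce without any plotting library. The helper below is a minimal sketch with a made-up name; the scaling rule (longest bar equals the requested width) is an assumption and may differ from what the ascii backend does.

```python
def text_bars(frequencies, width=40, show_values=False):
    """Return one text line per bin, with '#' bars scaled to the largest bin."""
    largest = max(frequencies)
    lines = []
    for freq in frequencies:
        bar = "#" * round(width * freq / largest) if largest else ""
        lines.append(f"{bar} {freq}" if show_values else bar)
    return lines

for line in text_bars([13, 143, 680, 2160, 3223, 2482, 1059, 213, 25, 2], show_values=True):
    print(line)
```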
[4]:
examples.normal_h2().plot(cmap='Greys_r')
3.69 →
+----------+
|██████████|3.73 ↑
|██████████|
|██████████|
|██████████|
|██████████|
|██████████|
|██████████|
|██████████|
|██████████|
|██████████|-4.47 ↓
+----------+
← -3.66
↓ 0
████████████
843 ↑
Geospatial histogram visualization using folium¶
Note: You need to have the ``folium`` package installed to run this notebook.
“Bagging” the munros into rectangular bins¶
A Munro is a mountain in Scotland with a height over 3,000 feet (914 m). Munros are named after Sir Hugh Munro, 4th Baronet (1856–1919), who produced the first list of such hills, known as Munro’s Tables, in 1891… says Wikipedia; see https://en.wikipedia.org/wiki/Munro for more.
Let’s show how to plot histograms on maps with the help of the folium library.
[1]:
# Necessary import evil
import pandas as pd
import numpy as np
import physt
import physt.plotting
physt.plotting.set_default_backend("folium")
[2]:
# Read the data
import pandas as pd
munros = pd.read_csv("../physt/examples/munros.csv")
munros.head()
[2]:
 | name | height | long | lat |
---|---|---|---|---|
0 | Ben Nevis | 1344.0 | -5.003526 | 56.796834 |
1 | Ben Macdui [Beinn Macduibh] | 1309.0 | -3.669100 | 57.070386 |
2 | Braeriach | 1296.0 | -3.728581 | 57.078177 |
3 | Cairn Toul | 1291.0 | -3.710790 | 57.054415 |
4 | Sgor an Lochain Uaine | 1258.0 | -3.725797 | 57.058378 |
[3]:
# How many of them are there? Wikipedia says 282 (as of 2017)
munros.shape
[3]:
(282, 4)
How many munros are in each 10’ rectangle?¶
[4]:
hist = physt.h2(munros["lat"], munros["long"], "fixed_width", bin_width=1 / 6)
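A bin width of 1/6 of a degree corresponds to 10 arc minutes. The sketch below works out which 10’ rectangle a single summit falls into, assuming the fixed-width bins are aligned to multiples of the bin width (an assumption about the binning, stated here for illustration). The helper name is made up.

```python
import math

BIN_WIDTH = 1 / 6  # 10 arc minutes expressed in degrees

def bin_edges(value, bin_width=BIN_WIDTH):
    """Edges of the aligned fixed-width bin that contains value."""
    index = math.floor(value / bin_width)
    return index * bin_width, (index + 1) * bin_width

# Ben Nevis, taken from the table above
lat_low, lat_high = bin_edges(56.796834)
long_low, long_high = bin_edges(-5.003526)
print(lat_low, lat_high)
print(long_low, long_high)
```

Note that math.floor (rather than int) is needed so that negative longitudes are assigned to the bin on their left.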
[5]:
map = hist.plot()
map
[5]:
[6]:
# Now, let's combine this information with positions of the 20 tallest
import folium
map = hist.plot()
for i, row in munros.iloc[:20].iterrows():
    marker = folium.Marker([row["lat"], row["long"]], popup="{0} ({1} m)".format(row["name"], row["height"]))
    marker.add_to(map)
map
[6]:
Vega backend examples¶
Note: for this notebook to work, you need to have the ``vega3`` library installed.
pip install vega3
[1]:
from physt.examples import normal_h1, normal_h2, normal_h3, munros
from physt.plotting import set_default_backend
import numpy as np
np.random.seed(42)
set_default_backend("vega")
[2]:
H = normal_h1()
H.plot.scatter()

[3]:
H.plot.bar(cumulative=True, xlabel="Other label")

[4]:
H = normal_h1()
H.plot.line(cumulative=True)

[5]:
H2 = munros().T
H2.plot(cmap="YellowGreen", show_values=True, height=333, width=333, value_format=".:;oO##############".__getitem__)

Example of an interactive 3D histogram¶
Note: Unfortunately, this example does not render properly in either the GitHub renderer or the notebook viewer. A live notebook must be running.
[6]:
H3 = normal_h3()
H3.axis_names = ("first", "second", "third")
H3.plot(show_values=True, show_zero=False, cmap="Blues", density=True, show_colorbar=False, value_format=".1f")

Plotting with plotly backend¶
[1]:
# Basic imports
from physt.examples import normal_h2, normal_h1
from physt.plotting import plotly
import physt.plotting
import numpy as np
np.random.seed(42)
# Set that we want plotly
physt.plotting.set_default_backend("plotly")
[2]:
# Define the 1-D example
H = normal_h1()
The default plot is bar.
[3]:
H.plot() # Same as H.plot.bar()
[4]:
H.plot.line()
[5]:
H.plot.scatter()
Plotly Figure object¶
If you want to further manipulate the figures, you can return them from the function as-is using the raw keyword.
[6]:
figure = H.plot.scatter(raw=True)
type(figure)
[6]:
plotly.graph_objs._figure.Figure
Collections¶
[9]:
from physt import collection
collection = collection({
"small": np.random.normal(160, 20, 600),
"tall": np.random.normal(180, 20, 1000),
"huge": np.random.normal(200, 20, 400),
"gigantic": np.random.normal(220, 20, 200)
}, "human")
[10]:
collection.plot.line()
[11]:
# Let's see normalized histograms in the collection
collection.normalize_bins().plot(barmode="overlay", alpha=0.3)
[12]:
# ...and what they look like when stacked
collection.normalize_bins().plot(barmode="stack")
API¶
API Reference¶
physt package¶
Subpackages¶
physt.compat package¶
Support for Geant4 histograms saved in CSV format.
See https://geant4.web.cern.ch/ for the project pages.
- physt.compat.geant4.load_csv(path)¶
  Loads a histogram as output from Geant4 analysis tools in CSV format.
  Parameters: path (str) – Path to the CSV file
  Return type: physt.histogram1d.Histogram1D or physt.histogram_nd.Histogram2D
Histogram types and functions for various external libraries.
physt.examples package¶
A set of examples used for demonstrating the physt capabilities / in tests.
- physt.examples.fist() → physt.histogram1d.Histogram1D¶
  A simple histogram in the shape of a fist.
- physt.examples.normal_h1(size: int = 10000, mean: float = 0, sigma: float = 1) → physt.histogram1d.Histogram1D¶
  A simple 1D histogram with normal distribution.
  Parameters:
  - size – Number of points
  - mean – Mean of the distribution
  - sigma – Sigma of the distribution
- physt.examples.normal_h2(size: int = 10000) → physt.histogram_nd.Histogram2D¶
  A simple 2D histogram with normal distribution.
  Parameters: size – Number of points
- physt.examples.normal_h3(size: int = 10000) → physt.histogram_nd.HistogramND¶
  A simple 3D histogram with normal distribution.
  Parameters: size – Number of points
physt.io package¶
JSON I/O
- physt.io.json.load_json(path: str, encoding: str = 'utf-8') → Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection]¶
  Load histogram from a JSON file.
- physt.io.json.parse_json(text: str) → Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection]¶
  Create histogram from a JSON string.
- physt.io.json.save_json(histogram: Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection], path: Optional[str] = None, **kwargs) → str¶
  Save histogram to JSON format.
  Parameters:
  - histogram – Any histogram
  - path – If set, also writes to the path.
  Returns: json – The JSON representation of the histogram
- physt.io.util.create_from_dict(data: dict, format_name: str, check_version: bool = True) → Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection]¶
  Once a dict has been created from the source data, turn it into a histogram.
  Parameters: data – Parsed JSON-like tree.
  Returns: histogram – A histogram (of any dimensionality)
- exception physt.io.version.VersionError¶
  Bases: Exception
- physt.io.version.require_compatible_version(compatible_version, word='File')¶
  Check that the compatible version of the input data is not too new.
Input and output for histograms.
JSON format is included by default. Other formats are/will be available as modules.
Note: When implementing, try to work with a JSON-like tree and reuse create_from_dict and HistogramBase.to_dict.
physt.plotting package¶
ASCII plots (experimental).
The plots are printed directly to standard output.
- physt.plotting.ascii.hbar(h1, width=80, show_values=False)¶
Functions that are shared by several (all) plotting backends.
- class physt.plotting.common.TimeTickHandler(level: str = None)¶
  Bases: object
  Callable that creates ticks and labels corresponding to “sane” time values.
  Note: This class is very experimental and subject to change or disappear.
  - LEVELS = {'day': 86400, 'hour': 3600, 'min': 60, 'sec': 1}¶
  - LevelType = typing.Tuple[str, typing.Union[float, int]]¶
  - classmethod deduce_level(h1: physt.histogram1d.Histogram1D, min_: float, max_: float) → Tuple[str, Union[float, int]]¶
  - format_time_ticks(ticks: List[float], level: Tuple[str, Union[float, int]]) → List[str]¶
  - get_time_ticks(h1: physt.histogram1d.Histogram1D, level: Tuple[str, Union[float, int]], min_: float, max_: float) → List[float]¶
  - classmethod parse_level(value: Union[Tuple[str, Union[float, int]], float, str, datetime.timedelta]) → Tuple[str, Union[float, int]]¶
  - classmethod split_hms(value) → Tuple[bool, int, int, Union[int, float]]¶
- physt.plotting.common.check_ndim(ndim: Union[int, Tuple[int, ...]])¶
  Decorator checking proper histogram dimension.
- physt.plotting.common.get_data(histogram: physt.histogram_base.HistogramBase, density: bool = False, cumulative: bool = False, flatten: bool = False) → numpy.ndarray¶
  Get histogram data based on plotting parameters.
  Parameters:
  - density – Whether to divide bin contents by bin size
  - cumulative – Whether to return cumulative sums instead of individual values
  - flatten – Whether to flatten multidimensional bins
- physt.plotting.common.get_err_data(histogram: physt.histogram_base.HistogramBase, density: bool = False, cumulative: bool = False, flatten: bool = False) → numpy.ndarray¶
  Get histogram error data based on plotting parameters.
  Parameters:
  - density – Whether to divide bin contents by bin size
  - cumulative – Whether to return cumulative sums instead of individual values
  - flatten – Whether to flatten multidimensional bins
- physt.plotting.common.get_value_format(value_format: Union[Callable[[float], str], str, None]) → Callable[[float], str]¶
  Create a formatting function from a generic value_format argument.
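To make the "generic value_format argument" concrete, here is a small stand-alone sketch of the idea. The function name is made up, and the treatment of strings as format specifications (such as ".1f", which is used in the Vega example earlier) is an assumption; the real helper may behave differently in details.

```python
def value_formatter(value_format=None):
    """Turn a format spec, a callable, or None into a float -> str function."""
    if value_format is None:
        return str
    if callable(value_format):
        return value_format
    # Assumed here: a string is a format spec such as ".1f".
    return lambda value: format(value, value_format)

print(value_formatter(".1f")(3.14159))         # 3.1
print(value_formatter()(2))                    # 2
print(value_formatter(lambda v: f"<{v}>")(7))  # <7>
```

Accepting a callable is what makes tricks like passing the __getitem__ of a string possible, as in the Vega example.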
- physt.plotting.common.pop_kwargs_with_prefix(prefix: str, kwargs: dict) → dict¶
  Pop all items from a dictionary that have keys beginning with a prefix.
  Parameters:
  - prefix (str) –
  - kwargs (dict) –
  Returns: kwargs – Items popped from the original dictionary, with prefix removed.
  Return type: dict
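The described behaviour can be illustrated with a few lines of plain Python. This is a re-implementation written from the description above under a different, made-up name; it is not the library's source code.

```python
def pop_with_prefix(prefix: str, kwargs: dict) -> dict:
    """Remove every key starting with prefix and return those items, prefix stripped."""
    matching = [key for key in kwargs if key.startswith(prefix)]
    return {key[len(prefix):]: kwargs.pop(key) for key in matching}

options = {"text_color": "red", "text_alpha": 0.5, "cmap": "Blues"}
text_options = pop_with_prefix("text_", options)
print(text_options)  # {'color': 'red', 'alpha': 0.5}
print(options)       # {'cmap': 'Blues'}
```

This pattern is what allows parameters such as text_color or text_alpha to be forwarded "without the prefix" to the code that formats the values.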
Vega3 backend for plotting in physt.
The JSON can be produced without any external dependency; the ability to show plots in-line in IPython requires the ‘vega3’ library.
Implementation note: Values passed to JSON cannot be of type np.int64 (solution: explicit cast to float)
See the enable_inline_view wrapper.
- physt.plotting.vega.bar(h1: physt.histogram1d.Histogram1D, **kwargs) → dict¶
  Bar plot of 1D histogram.
  Parameters:
  - lw (float) – Width of the line between bars
  - alpha (float) – Opacity of the bars
  - hover_alpha (float) – Opacity of the bars when hovered over
- physt.plotting.vega.display_vega(vega_data: dict, display: bool = True) → Union[Vega, dict]¶
  Optionally display vega dictionary.
  Parameters:
  - vega_data – Valid vega data as dictionary
  - display – Whether to try in-line display in IPython
- physt.plotting.vega.enable_inline_view(f)¶
  Decorator to enable in-line viewing in Python and saving to external file.
  It adds several parameters to each decorated plotting function:
  Parameters:
  - write_to (str (optional)) – Path to write vega JSON/HTML to.
  - write_format ("auto" | "json" | "html") – Whether to create a JSON data file or a full-fledged HTML page.
  - display ("auto" | True | False) – Whether to try in-line display in IPython
  - indent (int) – Indentation of JSON
- physt.plotting.vega.line(h1: physt.histogram1d.Histogram1D, **kwargs) → dict¶
  Line plot of 1D histogram values.
  Points are horizontally placed in bin centers.
  Parameters: h1 (physt.histogram1d.Histogram1D) – Dimensionality of histogram for which it is applicable
- physt.plotting.vega.map(h2: physt.histogram_nd.Histogram2D, *, show_zero: bool = True, show_values: bool = False, **kwargs) → dict¶
  Heat-map of two-dimensional histogram.
- physt.plotting.vega.map_with_slider(h3: physt.histogram_nd.HistogramND, *, show_zero: bool = True, show_values: bool = False, **kwargs) → dict¶
  Heatmap showing slice in first two dimensions, third dimension represented as a slider.
- physt.plotting.vega.scatter(h1: physt.histogram1d.Histogram1D, **kwargs) → dict¶
  Scatter plot of 1D histogram values.
  Points are horizontally placed in bin centers.
  Parameters: shape (str) –
- physt.plotting.vega.write_vega(vega_data, *, title: Optional[str], write_to: str, write_format: str = 'auto', indent: int = 2)¶
  Write vega dictionary to an external file.
  Parameters:
  - vega_data – Valid vega data as dictionary
  - write_to – Path to write vega JSON/HTML to.
  - write_format ("auto" | "json" | "html") – Whether to create a JSON data file or a full-fledged HTML page.
  - indent – Indentation of JSON
Plotting for physt histograms.
- matplotlib
- vega
- plotly (simple wrapper around matplotlib for 1D histograms)
- folium (just for the geographical histograms)
Calling the plotting functions
There are several backends (and user-defined may be added) and several plotting functions for each - we try to keep a consistent set of parameters to which all implementations should try to stick (with exceptions).
- write_to : str (optional) – Path to file where the output will be stored
- title : str (optional) – String to be displayed as plot title (defaults to h.title)
- xlabel : str (optional) – String to be displayed as x-axis label (defaults to the corresponding axis name)
- ylabel : str (optional) – String to be displayed as y-axis label (defaults to the corresponding axis name)
- xscale : str (optional) – If “log”, x axis will be scaled logarithmically
- yscale : str (optional) – If “log”, y axis will be scaled logarithmically
- xlim : tuple | “auto” | “keep”
- ylim : tuple | “auto” | “keep”
- invert_y : bool – If True, the y axis points downwards
- ticks : {“center”, “edge”}, optional – If set, each bin will have a tick (either central or edge)
- alpha : float (optional) – The alpha of the whole plot (default: 1)
- cmap : str or list – Name of the palette or list of colors or something that the respective backend can interpret as colourmap.
- cmap_normalize : {“log”}, optional
- cmap_min
- cmap_max
- show_values : bool – If True, show values next to (or inside) the bins
- value_format : str or Callable – How bin values (if to be displayed) are rendered.
- zorder : int (optional)
- text_color, text_alpha, text_* – Other options that are passed to the formatting of values without the prefix
- cumulative : bool – If True, show CDF instead of bin heights
- density : bool – If True, does not show bin contents but contents divided by width
- errors : bool – Whether to show error bars (if available)
- show_stats : bool – If True, display a small box with statistical info
- show_zero : bool – Whether to show bins that have 0 frequency
- grid_color – Colour of line between bins
- show_colorbar : bool – Whether to display a colorbar next to the plot itself
- lw (or linewidth) : int – Width of the lines
- class physt.plotting.PlottingProxy(h: Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection])¶
  Bases: object
  Proxy that enables calling plotting methods on histogram objects.
  It can be used either as a method or as an object containing methods. In any case, it only forwards the call to the universal plot() function.
  The __dir__ method should offer all plotting methods supported by the currently selected backend.
  Example
  plotter = histogram.plot
  plotter(…)  # Plots using defaults
  plotter.bar(…)  # Plots as a specified plot type (“bar”)
  Note
  Inspiration taken from the way pandas deals with this.
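The "method or object containing methods" pattern can be sketched in a few lines. The toy class below is hypothetical: its name, the list of kinds and the default kind are assumptions, and the call simply returns a tuple instead of drawing anything.

```python
class ToyPlottingProxy:
    """Callable object that also exposes one method per plot kind."""

    KINDS = ("bar", "line", "scatter")  # hypothetical list of supported kinds

    def __init__(self, histogram):
        self.histogram = histogram

    def __call__(self, kind=None, **kwargs):
        # Stand-in for the universal plot() function: just report the call.
        return ("plot", self.histogram, kind or "bar", kwargs)

    def __getattr__(self, name):
        # Only reached for attributes that are not found normally.
        if name in self.KINDS:
            return lambda **kwargs: self(name, **kwargs)
        raise AttributeError(name)

    def __dir__(self):
        return sorted(set(super().__dir__()) | set(self.KINDS))

plotter = ToyPlottingProxy("my histogram")
print(plotter())                # default kind
print(plotter.line(alpha=0.5))  # explicit kind via attribute access
```

Overriding __dir__ is what lets interactive shells offer the plot kinds in tab completion.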
- physt.plotting.get_default_backend() → Optional[str]¶
  The backend that will be used by default with the plot function.
- physt.plotting.plot(histogram: Union[physt.histogram_base.HistogramBase, physt.histogram_collection.HistogramCollection], kind: Optional[str] = None, backend: Optional[str] = None, **kwargs)¶
  Universal plotting function.
  All keyword arguments are passed to the plotting methods.
  Parameters: kind – Type of the plot (like "scatter", "line", ..), similar to pandas
- physt.plotting.set_default_backend(name: str) → None¶
  Choose a default backend.
Submodules¶
physt.bin_utils module¶
Methods for investigation and manipulation of bin arrays.
- physt.bin_utils.find_human_width(raw_width: float, kind: Optional[str] = None) → float¶
- physt.bin_utils.find_human_width_24(raw_width: float) → int¶
- physt.bin_utils.find_human_width_60(raw_width: float) → int¶
- physt.bin_utils.find_human_width_decimal(raw_width: float) → float¶
- physt.bin_utils.is_bin_subset(sub: Union[numpy.ndarray, Iterable[T_co], int, float], sup: Union[numpy.ndarray, Iterable[T_co], int, float]) → bool¶
  Check whether all bins in one binning are also present in another.
  Parameters:
  - sub (array_like) – Candidate for the bin subset
  - sup (array_like) – Candidate for the bin superset
- physt.bin_utils.is_bin_superset(sup: Union[numpy.ndarray, Iterable[T_co], int, float], sub: Union[numpy.ndarray, Iterable[T_co], int, float]) → bool¶
  Inverse of is_bin_subset
- physt.bin_utils.is_consecutive(bins: Union[numpy.ndarray, Iterable[T_co], int, float], rtol: float = 1e-05, atol: float = 1e-08) → bool¶
  Check whether the bins are consecutive (edges match).
  Does not check if the bins are in rising order.
- physt.bin_utils.is_rising(bins: Union[numpy.ndarray, Iterable[T_co], int, float]) → bool¶
  Check whether the bins are in rising order.
  Does not check if the bins are consecutive.
  Parameters: bins (array_like) –
- physt.bin_utils.make_bin_array(bins: Union[numpy.ndarray, Iterable[T_co], int, float]) → numpy.ndarray¶
  Turn bin data into array understood by HistogramXX classes.
  Parameters: bins (array_like) – Array of edges or array of edge tuples
  Examples
  >>> make_bin_array([0, 1, 2])
  array([[0, 1], [1, 2]])
  >>> make_bin_array([[0, 1], [2, 3]])
  array([[0, 1], [2, 3]])
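The two accepted input shapes can be illustrated with a short numpy sketch. The function below is written for illustration from the examples above (its name carries a _sketch suffix to make clear that it is not the library function) and skips any validation the real helper may perform.

```python
import numpy as np

def make_bin_array_sketch(bins):
    """Turn a 1-D array of edges into an (n, 2) array of [left, right] pairs."""
    bins = np.asarray(bins)
    if bins.ndim == 1:
        # Consecutive bins: each edge closes one bin and opens the next.
        return np.column_stack([bins[:-1], bins[1:]])
    if bins.ndim == 2 and bins.shape[1] == 2:
        # Already a list of (left, right) pairs; gaps are allowed.
        return bins
    raise ValueError("Expected 1-D edges or an (n, 2) array of edge pairs")

print(make_bin_array_sketch([0, 1, 2]))
print(make_bin_array_sketch([[0, 1], [2, 3]]))
```

The (n, 2) layout is what makes inconsecutive binnings (bins with gaps) representable, which a plain list of edges cannot express.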
-
physt.bin_utils.
to_numpy_bins
(bins: Union[numpy.ndarray, Iterable[T_co], int, float]) → numpy.ndarray¶ Convert physt bin format to numpy edges.
Parameters: bins (array_like) – 1-D (n) or 2-D (n, 2) array of edges
Returns: edges – all edges
-
physt.bin_utils.
to_numpy_bins_with_mask
(bins: Union[numpy.ndarray, Iterable[T_co], int, float]) → Tuple[numpy.ndarray, numpy.ndarray]¶ Numpy binning edges including gaps.
Parameters: bins (array_like) – 1-D (n) or 2-D (n, 2) array of edges
Returns: - edges (All edges)
- mask (List of indices that correspond to bins that have to be included)
Examples
>>> to_numpy_bins_with_mask([0, 1, 2])
(array([0., 1., 2.]), array([0, 1]))
>>> to_numpy_bins_with_mask([[0, 1], [2, 3]])
(array([0, 1, 2, 3]), array([0, 2]))
physt.binnings module¶
Different binning algorithms/schemas for the histograms.
-
class
physt.binnings.
BinningBase
(bins: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, numpy_bins: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, includes_right_edge: bool = False, adaptive: bool = False)¶ Bases:
object
Abstract base class for binning schemas.
- define at least one of the following properties: bins, numpy_bins (cached conversion exists)
- if you modify bins, put _bins and _numpy_bins into proper state (None may be sufficient)
- checking of proper bins should be done in __init__
- if you want to support adaptive histogram, override _force_bin_existence
- implement _update_dict to contain the binning representation
- the constructor (and facade methods) must accept any kwargs (and ignores those that are not used).
-
adaptive_allowed
¶ Whether it is possible to update the bins dynamically
Type: bool
-
inconsecutive_allowed
¶ Whether it is possible to have bins with gaps
Type: bool
-
TODO: Check the last point (does it make sense?)
-
adapt
(other: physt.binnings.BinningBase)¶ Adapt this binning so that it contains all bins of another binning.
Parameters: other (BinningBase) –
-
adaptive_allowed
= False
-
apply_bin_map
(bin_map) → physt.binnings.BinningBase¶ Parameters: bin_map (Iterator(tuple)) – The bins must be in ascending order
-
as_fixed_width
(copy: bool = True) → physt.binnings.FixedWidthBinning¶ Convert binning to recipe with fixed width (if possible).
Parameters: copy (If True, ensure that we receive another object.) –
-
as_static
(copy: bool = True) → physt.binnings.StaticBinning¶ Convert binning to a static form.
Parameters: copy (bool) – Ensure that we receive another object Returns: A new static binning with a copy of bins. Return type: StaticBinning
-
bin_count
¶ The total number of bins.
-
bins
¶ Bins in the wider format (as edge pairs)
Returns: bins – shape=(bin_count, 2) Return type: np.ndarray
-
copy
() → BinningType¶ An identical, independent copy.
-
first_edge
¶ The left edge of the first bin.
-
force_bin_existence
(values)¶ Change schema so that there is a bin for value.
It is necessary to implement the _force_bin_existence template method.
Parameters: values (np.ndarray) – All values we want bins for.
Returns: bin_map –
- None => There was no change in bins
- int => The bins are only shifted (allows mass assignment)
- Otherwise => the iterable contains tuples (old bin index, new bin index); a new bin index can occur multiple times, which corresponds to bin merging
Return type: Iterable[tuple] or None or int
-
static
from_dict
(a_dict)¶
-
includes_right_edge
¶
-
inconsecutive_allowed
= False
-
is_adaptive
() → bool¶ Whether the binning can be adapted to include values not currently spanned.
-
is_consecutive
(rtol: float = 1e-05, atol: float = 1e-08) → bool¶ Whether all bins are in a growing order.
Parameters: rtol, atol –
-
is_regular
(*, rtol: float = 1e-05, atol: float = 1e-08) → bool¶ Whether all bins have the same width.
Parameters: rtol, atol –
-
last_edge
¶ The right edge of the last bin.
-
numpy_bins
¶ Bins in the numpy format
This might not be available for inconsecutive binnings.
Returns: edges – shape=(bin_count+1,) Return type: np.ndarray
-
numpy_bins_with_mask
¶ Bins in the numpy format, including the gaps in inconsecutive binnings.
Returns: edges, mask Return type: np.ndarray See also
bin_utils.to_numpy_bins_with_mask
-
set_adaptive
(value: bool = True) → None¶ Set/unset the adaptive property of the binning.
This is available only for some of the binning types.
-
to_dict
() → Dict[str, Any]¶ Dictionary representation of the binning schema.
This serves as template method, please implement _update_dict
-
physt.binnings.
BinningLike
= typing.Union[physt.binnings.BinningBase, numpy.ndarray, typing.Iterable, int, float]¶ Anything that can be converted to a binning.
-
class
physt.binnings.
ExponentialBinning
(log_min: float, log_width: float, bin_count: int, includes_right_edge: bool = True, adaptive: bool = False, **kwargs)¶ Bases:
physt.binnings.BinningBase
Binning schema with exponentially distributed bins.
-
adaptive_allowed
= False¶
-
copy
() → physt.binnings.ExponentialBinning¶ An identical, independent copy.
-
is_regular
(**kwargs) → bool¶ Whether all bins have the same width.
Parameters: rtol, atol –
-
numpy_bins
¶ Bins in the numpy format
This might not be available for inconsecutive binnings.
Returns: edges – shape=(bin_count+1,) Return type: np.ndarray
-
-
class
physt.binnings.
FixedWidthBinning
(*, bin_width, bin_count=0, bin_times_min=None, min=None, includes_right_edge=False, adaptive=False, bin_shift=None, align=True, **kwargs)¶ Bases:
physt.binnings.BinningBase
Binning schema with predefined bin width.
-
adaptive_allowed
= True¶
-
as_fixed_width
(copy: bool = True) → physt.binnings.FixedWidthBinning¶ Convert binning to recipe with fixed width (if possible).
Parameters: copy (If True, ensure that we receive another object.) –
-
bin_count
¶ The total number of bins.
-
bin_width
¶
-
copy
()¶ An identical, independent copy.
-
first_edge
¶ The left edge of the first bin.
-
is_regular
(**kwargs) → bool¶ Whether all bins have the same width.
Parameters: rtol, atol –
-
last_edge
¶ The right edge of the last bin.
-
numpy_bins
¶ Bins in the numpy format
This might not be available for inconsecutive binnings.
Returns: edges – shape=(bin_count+1,) Return type: np.ndarray
-
-
class
physt.binnings.
NumpyBinning
(numpy_bins: Union[numpy.ndarray, Iterable[T_co], int, float], includes_right_edge=True, **kwargs)¶ Bases:
physt.binnings.BinningBase
Binning schema working as numpy.histogram.
-
copy
() → physt.binnings.NumpyBinning¶ An identical, independent copy.
-
numpy_bins
¶ Bins in the numpy format
This might not be available for inconsecutive binnings.
Returns: edges – shape=(bin_count+1,) Return type: np.ndarray
-
-
class
physt.binnings.
StaticBinning
(bins, includes_right_edge=True, **kwargs)¶ Bases:
physt.binnings.BinningBase
Binning defined by an array of bin edge pairs.
-
as_static
(copy: bool = True) → physt.binnings.StaticBinning¶ Convert binning to a static form.
Returns: A new static binning with a copy of bins. Return type: StaticBinning Parameters: copy (if True, returns itself (already satisfying conditions)) –
-
copy
()¶ An identical, independent copy.
-
inconsecutive_allowed
= True¶
-
-
physt.binnings.
as_binning
(obj: Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float], copy: bool = False) → physt.binnings.BinningBase¶ Ensure that an object is a binning
Parameters: - obj (BinningBase or array_like) – Can be a binning, numpy-like bins or full physt bins
- copy (If true, ensure that the returned object is independent) –
-
physt.binnings.
binning_methods
= {'exponential': <function exponential_binning>, 'fixed_width': <function fixed_width_binning>, 'human': <function human_binning>, 'integer': <function integer_binning>, 'numpy': <function numpy_binning>, 'quantile': <function quantile_binning>, 'static': <function static_binning>}¶ Dictionary of available binnings.
-
physt.binnings.
calculate_bins
(array, _=None, **kwargs) → physt.binnings.BinningBase¶ Find optimal binning from arguments.
Parameters: - array (arraylike) – Data from which the bins should be decided (sometimes used, sometimes not)
- _ (int or str or Callable or arraylike or Iterable or BinningBase) – To-be-guessed parameter that specifies what kind of binning should be done
- check_nan (bool) – Check for the presence of nan’s in array? Default: True
- range (tuple) – Limit values to a range. Some of the binning methods also (subsequently) use this parameter for the bin shape.
Returns: A two-dimensional array with pairs of bin edges (not necessarily consecutive).
Return type:
-
physt.binnings.
calculate_bins_nd
(array: Optional[numpy.ndarray], bins=None, dim: Optional[int] = None, check_nan=True, **kwargs) → List[physt.binnings.BinningBase]¶ Find optimal binning from arguments (n-dimensional variant)
Usage similar to calculate_bins.
-
physt.binnings.
exponential_binning
(data=None, bin_count: Optional[int] = None, *, range: Optional[Tuple[float, float]] = None, **kwargs) → physt.binnings.ExponentialBinning¶ Construct exponential binning schema.
Parameters: - bin_count (Number of bins) –
- range ((min, max)) –
See also
numpy.logspace()
-
physt.binnings.
fixed_width_binning
(data=None, bin_width: Union[float, int] = 1, *, range: Optional[Tuple[float, float]] = None, includes_right_edge: bool = False, **kwargs) → physt.binnings.FixedWidthBinning¶ Construct fixed-width binning schema.
Parameters: - bin_width (float) –
- range (Optional[tuple]) – (min, max)
- align (Optional[float]) – Must be multiple of bin_width
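To make the idea of an aligned fixed-width schema concrete, the following sketch computes edges that sit on multiples of the bin width and cover all the data. The helper `aligned_fixed_width_edges` is hypothetical, and it assumes that the right edge is excluded (as with the default `includes_right_edge=False`); physt itself may compute the edges differently.

```python
import math
import numpy as np

def aligned_fixed_width_edges(data, bin_width):
    """Edges at multiples of bin_width that cover all of data (illustrative only)."""
    first = math.floor(min(data) / bin_width) * bin_width
    # "+ 1" makes sure a maximum lying exactly on an edge still gets its own bin,
    # because the right edge of each bin is treated as excluded here.
    last = math.floor(max(data) / bin_width + 1) * bin_width
    count = int(round((last - first) / bin_width))
    return first + bin_width * np.arange(count + 1)

edges = aligned_fixed_width_edges([3, 8, 14, 27], bin_width=10)
print(edges)
```

With a width of 10, data between 3 and 27 end up in three bins starting at 0, rather than in bins starting at the data minimum.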
-
physt.binnings.
human_binning
(data: Optional[numpy.ndarray] = None, bin_count: Optional[int] = None, *, kind: Optional[str] = None, range: Optional[Tuple[float, float]] = None, min_bin_width: Optional[float] = None, max_bin_width: Optional[float] = None, **kwargs) → physt.binnings.FixedWidthBinning¶ Construct fixed-width binning schema with bins automatically optimized to human-friendly widths.
Typical widths are: 1.0, 25.0, 0.02, 500, 2.5e-7, …
Parameters: - bin_count (Number of bins) –
- kind (Optional value "time" works in h,m,s scale instead of seconds) –
- range (Tuple of (min, max)) –
- min_bin_width (If present, the bin cannot be narrower than this.) –
- max_bin_width (If present, the bin cannot be wider than this.) –
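As a rough illustration of what "human-friendly" can mean, the sketch below snaps a raw width to the nearest value of the form 1, 2, 2.5 or 5 times a power of ten, measuring distance on a logarithmic scale. The function `human_friendly_width` and the candidate multipliers are assumptions made for this example; the set actually used by physt (see find_human_width_decimal) may differ.

```python
import math

def human_friendly_width(raw_width):
    """Snap a positive width to a nearby 'round' number (illustrative only)."""
    power = math.floor(math.log10(raw_width))
    candidates = [m * 10.0 ** power for m in (1, 2, 2.5, 5, 10)]
    # Compare on a log scale so that "twice too wide" and "twice too narrow" weigh the same.
    return min(candidates, key=lambda w: abs(math.log(w / raw_width)))

for raw in (0.023, 430, 1.0):
    print(raw, "->", human_friendly_width(raw))
```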
-
physt.binnings.
ideal_bin_count
(data: numpy.ndarray, method: str = 'default') → int¶ A theoretically ideal bin count.
Parameters: - data (Data to work on. Most methods don't use this.) –
- method (str) –
- Name of the method to apply, available values:
- default (~sturges)
- sqrt
- sturges
- doane
- rice
See https://en.wikipedia.org/wiki/Histogram for the description
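For orientation, here are three of the listed rules written out as given in the Wikipedia article, which only need the number of samples. The function names are hypothetical and physt may round or special-case differently; Doane's rule is omitted because it also needs the skewness of the data.

```python
import math

def bin_count_sqrt(n):
    """Square-root choice: ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))

def bin_count_sturges(n):
    """Sturges' formula: ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1

def bin_count_rice(n):
    """Rice rule: ceil(2 * n^(1/3))."""
    return math.ceil(2 * n ** (1 / 3))

n = 100
print(bin_count_sqrt(n), bin_count_sturges(n), bin_count_rice(n))
```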
-
physt.binnings.
integer_binning
(data=None, **kwargs) → physt.binnings.FixedWidthBinning¶ Construct fixed-width binning schema with bins centered around integers.
Parameters: - range (Optional[Tuple[int]]) – min (included) and max integer (excluded) bin
- bin_width (Optional[int]) – group “bin_width” integers into one bin (not recommended)
-
physt.binnings.
numpy_binning
(data: Optional[numpy.ndarray], bin_count: int = 10, range: Optional[Tuple[float, float]] = None, **kwargs) → physt.binnings.NumpyBinning¶ Construct binning schema compatible with numpy.histogram together with int argument
Parameters: - data (array_like, optional) – This is optional if both bins and range are set
- bin_count (int) –
- range (Optional[tuple]) – (min, max)
- includes_right_edge (Optional[bool]) – default: True
See also
numpy.histogram(), static_binning()
-
physt.binnings.
quantile_binning
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bin_count: Optional[int] = None, q: Optional[Sequence[int]] = None, qrange: Optional[Tuple[float, float]] = None, **kwargs) → physt.binnings.StaticBinning¶ Binning schema based on quantile ranges.
This binning finds equally spaced quantiles. This should lead to all bins having roughly the same frequencies.
Note: weights are not (yet) taken into account for calculating quantiles.
Parameters: - bin_count (Number of bins) –
- q (Sequence of quantiles to be used as edges (a la numpy)) –
- qrange (Two floats as minimum and maximum quantile (default: 0.0, 1.0)) –
Returns: Return type:
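The claim that equally spaced quantiles give roughly equal frequencies can be checked with plain NumPy. This is a sketch of the idea only (it does not call quantile_binning), and the variable names are chosen for the example.

```python
import numpy as np

data = np.arange(100)
bin_count = 4

# Equally spaced quantiles between 0 and 1 serve as the bin edges.
quantiles = np.linspace(0.0, 1.0, bin_count + 1)
edges = np.quantile(data, quantiles)

counts, _ = np.histogram(data, bins=edges)
print(edges)
print(counts)
```

For skewed data the resulting bins have very different widths, which is the price paid for balanced frequencies.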
-
physt.binnings.
register_binning
(f=None, *, name: Optional[str] = None)¶ Decorator to register among available binning methods.
-
physt.binnings.
static_binning
(data=None, bins=None, **kwargs) → physt.binnings.StaticBinning¶ Construct static binning with whatever bins.
physt.config module¶
physt.facade module¶
-
physt.facade.
h1
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=None, *, adaptive: bool = False, dropna: bool = True, dtype: Union[type, numpy.dtype, str, None] = None, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, keep_missed: bool = True, name: Optional[str] = None, title: Optional[str] = None, axis_name: Optional[str] = None, **kwargs) → physt.histogram1d.Histogram1D¶ Facade function to create 1D histograms.
This proceeds in three steps:
1) Based on magical parameter bins, construct bins for the histogram
2) Calculate frequencies for the bins
3) Construct the histogram object itself
Guiding principle: parameters understood by numpy.histogram should also be understood by physt.histogram and should result in a Histogram1D object with (h.numpy_bins, h.frequencies) same as the numpy.histogram output. Additional functionality is a bonus.
Parameters: - data (array_like, optional) – Container of all the values (tuple, list, np.ndarray, pd.Series)
- bins (int or sequence of scalars or callable or str, optional) – If iterable => the bins themselves; if int => number of bins for default binning; if callable => use binning method (+ args, kwargs); if string => use named binning method (+ args, kwargs)
- weights (array_like, optional) – (as numpy.histogram)
- keep_missed (bool) – Store statistics about how many values were lower than limits and how many higher than limits (default: True)
- dropna (Whether to clear data from nan's before histogramming) –
- name (Name of the histogram) –
- title (What will be displayed in the title of the plot) –
- axis_name (Name of the variable on x axis) –
- adaptive (Whether we want the bins to be modifiable; useful for continuous filling of a priori unknown data) –
- dtype (Customize underlying data type: default int64 (without weight) or float (with weights)) –
Other numpy.histogram parameters are excluded, see the methods of the Histogram1D class itself.
See also
numpy.histogram()
-
physt.facade.
h2
(data1: Union[numpy.ndarray, Iterable[T_co], int, float, None], data2: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=10, **kwargs) → physt.histogram_nd.Histogram2D¶ Facade function to create 2D histograms.
For implementation and parameters, see histogramdd.
See also
numpy.histogram2d(), histogramdd()
-
physt.facade.
h3
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=None, **kwargs) → physt.histogram_nd.HistogramND¶ Facade function to create 3D histograms.
Parameters: data (array_like or list[array_like] or tuple[array_like]) – Can be a single array (with three columns) or three different arrays (for each component)
-
physt.facade.
histogram
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=None, *, adaptive: bool = False, dropna: bool = True, dtype: Union[type, numpy.dtype, str, None] = None, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, keep_missed: bool = True, name: Optional[str] = None, title: Optional[str] = None, axis_name: Optional[str] = None, **kwargs) → physt.histogram1d.Histogram1D¶ Facade function to create 1D histograms.
This proceeds in three steps:
1) Based on magical parameter bins, construct bins for the histogram
2) Calculate frequencies for the bins
3) Construct the histogram object itself
Guiding principle: parameters understood by numpy.histogram should also be understood by physt.histogram and should result in a Histogram1D object with (h.numpy_bins, h.frequencies) same as the numpy.histogram output. Additional functionality is a bonus.
Parameters: - data (array_like, optional) – Container of all the values (tuple, list, np.ndarray, pd.Series)
- bins (int or sequence of scalars or callable or str, optional) – If iterable => the bins themselves; if int => number of bins for default binning; if callable => use binning method (+ args, kwargs); if string => use named binning method (+ args, kwargs)
- weights (array_like, optional) – (as numpy.histogram)
- keep_missed (bool) – Store statistics about how many values were lower than limits and how many higher than limits (default: True)
- dropna (Whether to clear data from nan's before histogramming) –
- name (Name of the histogram) –
- title (What will be displayed in the title of the plot) –
- axis_name (Name of the variable on x axis) –
- adaptive (Whether we want the bins to be modifiable; useful for continuous filling of a priori unknown data) –
- dtype (Customize underlying data type: default int64 (without weight) or float (with weights)) –
Other numpy.histogram parameters are excluded, see the methods of the Histogram1D class itself.
See also
numpy.histogram()
-
physt.facade.
histogram2d
(data1: Union[numpy.ndarray, Iterable[T_co], int, float, None], data2: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=10, **kwargs) → physt.histogram_nd.Histogram2D¶ Facade function to create 2D histograms.
For implementation and parameters, see histogramdd.
See also
numpy.histogram2d(), histogramdd()
-
physt.facade.
histogramdd
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None], bins=10, *, adaptive=False, dropna: bool = True, name: Optional[str] = None, title: Optional[str] = None, axis_names: Optional[Iterable[str]] = None, dim: Optional[int] = None, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, **kwargs) → physt.histogram_nd.HistogramND¶ Facade function to create n-dimensional histograms.
3D variant of this function is also aliased as “h3”.
Parameters: - data (array_like) – Container of all the values
- bins (Any) –
- weights (array_like, optional) – (as numpy.histogram)
- dropna (Whether to clear data from nan's before histogramming) –
- name (Name of the histogram) –
- axis_names (Names of the variable on x axis) –
- adaptive (Whether the bins should be updated when new non-fitting values are filled) –
- dtype (Optional[type]) – Underlying type for the histogram. If weights are specified, default is float. Otherwise int64
- title (What will be displayed in the title of the plot) –
- dim (Dimension - necessary if you are creating an empty adaptive histogram) –
Note: For most arguments, if a list is passed, its values are used as values for individual axes.
See also
numpy.histogramdd()
-
physt.facade.
collection
(data, bins=10, **kwargs) → physt.histogram_collection.HistogramCollection¶ Create histogram collection with shared binning.
-
physt.facade.
polar
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', radial_range: Optional[Tuple[float, float]] = None, phi_bins=16, phi_range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.PolarHistogram¶ Facade construction function for the PolarHistogram.
-
physt.facade.
azimuthal
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins=16, range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights=None, transformed: bool = False, **kwargs) → physt.special_histograms.AzimuthalHistogram¶ Facade function to create an AzimuthalHistogram.
-
physt.facade.
radial
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, zdata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins='numpy', range: Optional[Tuple[float, float]] = None, dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.RadialHistogram¶ Facade function to create a radial histogram.
-
physt.facade.
cylindrical
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, rho_bins='numpy', phi_bins=16, z_bins='numpy', transformed: bool = False, dropna: bool = True, rho_range: Optional[Tuple[float, float]] = None, phi_range: Tuple[float, float] = (0, 6.283185307179586), weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalHistogram¶ Facade function to create a cylindrical histogram.
-
physt.facade.
cylindrical_surface
(data=None, *, phi_bins=16, z_bins='numpy', transformed: bool = False, radius: Optional[float] = None, dropna: bool = False, weights=None, phi_range: Tuple[float, float] = (0, 6.283185307179586), z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalSurfaceHistogram¶ Facade function to create a cylindrical surface histogram.
-
physt.facade.
spherical
(data: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', theta_bins=16, phi_bins=16, dropna: bool = True, transformed: bool = False, theta_range: Tuple[float, float] = (0, 3.141592653589793), phi_range: Tuple[float, float] = (0, 6.283185307179586), radial_range: Optional[Tuple[float, float]] = None, weights=None, **kwargs) → physt.special_histograms.SphericalHistogram¶ Facade function to create a spherical histogram.
-
physt.facade.
spherical_surface
(data: Union[numpy.ndarray, Iterable[T_co], int, float], *, theta_bins=16, phi_bins=16, transformed: bool = False, radius: Optional[float] = None, dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, theta_range: Tuple[float, float] = (0, 3.141592653589793), phi_range: Tuple[float, float] = (0, 6.283185307179586), **kwargs) → physt.special_histograms.SphericalSurfaceHistogram¶ Facade construction function for the SphericalSurfaceHistogram.
physt.histogram1d module¶
One-dimensional histograms.
-
class
physt.histogram1d.
Histogram1D
(binning: Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, errors2: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, keep_missed: bool = True, stats: Optional[Dict[str, float]] = None, overflow: Optional[float] = 0.0, underflow: Optional[float] = 0.0, inner_missed: Optional[float] = 0.0, axis_name: Optional[str] = None, **kwargs)¶ Bases:
physt.histogram1d.ObjectWithBinning, physt.histogram_base.HistogramBase
One-dimensional histogram data.
The bins can be of different widths.
The bins need not be consecutive. However, some functionality may not be available for non-consecutive bins (like keeping information about underflow and overflow).
-
_stats
¶ Type: dict
These are the basic attributes that can be used in the constructor (see there). Other attributes are dynamic.
-
EMPTY_STATS
= {'sum': 0.0, 'sum2': 0.0}¶
-
axis_name
¶
-
binning
¶ The binning.
Note: Please, do not try to update the object itself.
-
cumulative_frequencies
¶ Cumulative frequencies.
Note: underflow values are not considered
-
fill
(value: float, weight: float = 1, **kwargs) → Optional[int]¶ Update histogram with a new value.
Parameters: - value (Value to be added.) –
- weight (Weight assigned to the value.) –
Returns: index of bin which was incremented (-1=underflow, N=overflow, None=not found)
Note: If a gap in inconsecutive bins is matched, underflow & overflow are not valid anymore.
Note: Name was selected because of the eponymous method in ROOT
-
fill_n
(values: Union[numpy.ndarray, Iterable[T_co], int, float], weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dropna: bool = True) → None¶ Update histogram with more values at once.
It is an in-place operation.
Parameters: - values (Values to add) –
- weights (Optional weights to assign to each value) –
- dropna (If true (default), all nan's are skipped.) –
Note
This method should be overloaded with a more efficient one.
May change the dtype if weight is set.
-
find_bin
(value: Union[numpy.ndarray, Iterable[T_co], int, float], axis: Union[int, str, None] = None) → Optional[int]¶ Index of bin corresponding to a value.
Returns: index of bin to which value belongs (-1=underflow, N=overflow, None=not found - inconsecutive)
-
classmethod
from_calculate_frequencies
(data: Union[numpy.ndarray, Iterable[T_co], int, float], binning: physt.binnings.BinningBase, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, validate_bins: bool = True, already_sorted: bool = False, dtype: Union[type, numpy.dtype, str, None] = None, **kwargs) → Histogram1DType¶ Construct the histogram from values and bins.
-
classmethod
from_xarray
(arr: xarray.Dataset) → Histogram1D¶ Convert from xarray.Dataset
Parameters: arr (The data in xarray representation) –
-
inner_missed
¶
-
mean
() → Optional[float]¶ Statistical mean of all values entered into histogram.
This number is precise, because we keep the necessary data separate from bin contents.
-
numpy_like
¶ The same result that the numpy.histogram function would return.
-
overflow
¶
-
select
(axis, index, *, force_copy: bool = False) → Union[physt.histogram1d.Histogram1D, Tuple[numpy.ndarray, float]]¶ Alias for [] to be compatible with HistogramND.
-
std
() → Optional[float]¶ Standard deviation of all values entered into histogram.
This number is precise, because we keep the necessary data separate from bin contents.
Return type: float
-
to_dataframe
() → pandas.DataFrame¶ Convert to pandas DataFrame.
This is not a lossless conversion - (under/over)flow info is lost.
-
to_xarray
() → xarray.Dataset¶ Convert to xarray.Dataset
-
underflow
¶
-
variance
() → Optional[float]¶ Statistical variance of all values entered into histogram.
This number is precise, because we keep the necessary data separate from bin contents.
Return type: float
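Why mean, std and variance can be precise even though the values themselves are binned: it is enough to accumulate the sum and the sum of squares of the filled values, which matches the keys in EMPTY_STATS above. The sketch below shows the arithmetic for the unweighted case in plain Python; it assumes the population (not sample) variance and is not a copy of the physt code.

```python
import math

values = [1.0, 2.0, 3.0, 4.0]

# Running statistics kept next to the bin contents
stats = {"sum": 0.0, "sum2": 0.0}
total = 0.0
for v in values:
    stats["sum"] += v
    stats["sum2"] += v ** 2
    total += 1

mean = stats["sum"] / total
variance = stats["sum2"] / total - mean ** 2   # E[x^2] - (E[x])^2
std = math.sqrt(variance)

print(mean, variance, std)
```

A mean estimated from bin centers would depend on the binning; the running sums do not.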
-
-
class
physt.histogram1d.
ObjectWithBinning
¶ Bases:
abc.ABC
Mixin with shared methods for 1D objects that have a binning.
Note: Used to share behaviour between Histogram1D and HistogramCollection.
-
bin_centers
¶ Centers of all bins.
-
bin_left_edges
¶ Left edges of all bins.
-
bin_right_edges
¶ Right edges of all bins.
-
bin_sizes
¶
-
bin_widths
¶ Widths of all bins.
-
binning
¶ The binning itself.
-
bins
¶ Array of all bin edges.
Returns: Wide-format [[leftedge1, rightedge1], .. [leftedgeN, rightedgeN]]
-
edges
¶
-
get_bin_left_edges
(i)¶
-
get_bin_right_edges
(i)¶
-
max_edge
¶ Right edge of the last bin.
-
min_edge
¶ Left edge of the first bin.
-
ndim
¶
-
numpy_bins
¶ Bins in the format of numpy.
-
total_width
¶ Total width of all bins.
In inconsecutive histograms, the missing intervals are not counted in.
-
-
physt.histogram1d.
calculate_frequencies
(data: Union[numpy.ndarray, Iterable[T_co], int, float], binning: physt.binnings.BinningBase, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, validate_bins: bool = True, already_sorted: bool = False, dtype: Union[type, numpy.dtype, str, None] = None) → Tuple[numpy.ndarray, numpy.ndarray, float, float, dict]¶ Get frequencies and bin errors from the data.
Parameters: - data (Data items to work on.) –
- binning (A set of bins.) –
- weights (Weights of the items.) –
- validate_bins (If True (default), bins are validated to be in ascending order.) –
- already_sorted (If True, the data being entered are already sorted, no need to sort them once more.) –
- dtype (Underlying type for the histogram.) – (If weights are specified, default is float. Otherwise long.)
Returns: - frequencies (Bin contents)
- errors2 (Error squares of the bins)
- underflow (Weight of items smaller than the first bin)
- overflow (Weight of items larger than the last bin)
- stats (dict) – { sum: …, sum2: …}
Note
Checks that the bins are in a correct order (not necessarily consecutive). Does not check for numerical overflows in bins.
physt.histogram_base module¶
HistogramBase - base for all histogram classes.
-
class
physt.histogram_base.
HistogramBase
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, errors2: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, axis_names: Optional[Iterable[str]] = None, dtype: Union[type, numpy.dtype, str, None] = None, keep_missed: bool = True, **kwargs)¶ Bases:
abc.ABC
Histogram base class.
Behaviour shared by all histogram classes.
The most important daughter classes are:
- Histogram1D
- HistogramND
There are also special histogram types that are modifications of these classes.
The methods you should override:
- fill
- fill_n (optional)
- copy
- _update_dict (optional)
Underlying data type is int64 / float or an explicitly specified other type (dtype).
-
_binnings
¶ Type: Schema for binning(s)
-
frequencies
¶ Bin contents
Type: np.ndarray
-
errors2
¶ Square errors associated with the bin contents
Type: np.ndarray
-
_meta_data
¶ All meta-data (names, user-custom values, …). Anything can be put in. When exported, all information is kept.
Type: dict
-
_dtype
¶ Type of the frequencies and also errors (int64, float64 or user-overridden)
Type: np.dtype
-
_missed
¶ Various storage for missed values in different histogram types (1 value for multi-dimensional, 3 values for one-dimensional)
Type: array_like
-
Invariants
- Frequencies in the histogram should always be non-negative. Many operations rely on that, but it is not always enforced. (If you set config.free_arithmetics (see below), negative frequencies are also allowed.)
Arithmetics
Histograms offer standard arithmetic operators that by default allow only meaningful application (i.e. addition / subtraction of two histograms with matching or mutually adaptable bin sets, multiplication and division by a constant).
If you relax the criteria by setting config.free_arithmetics or inside the config.enable_free_arithmetics() context manager, you are in addition allowed to use any array-like with matching shape.
See also
histogram1d, histogram_nd, special
-
adaptive
¶
-
axis_names
¶ Names of axes (stored in meta-data).
-
bin_count
¶ Total number of bins.
-
bin_sizes
¶
-
binnings
¶ The binnings.
Note: Please, do not try to update the objects themselves.
-
bins
¶
-
copy
(*, include_frequencies: bool = True) → HistogramType¶ Copy the histogram.
Parameters: include_frequencies (If false, all frequencies are set to zero.) –
-
default_axis_names
¶ Axis names to be used when an instance does not define them.
-
default_init_values
= {}¶
-
densities
¶ Frequencies normalized by bin sizes.
Useful when bins are not of the same size.
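The normalization by bin size is simple enough to spell out with NumPy. In this sketch (made-up bins and counts, no physt calls) three bins hold the same number of entries but have different widths, so their densities differ.

```python
import numpy as np

bins = np.array([[0.0, 1.0], [1.0, 3.0], [3.0, 7.0]])   # widths 1, 2, 4
frequencies = np.array([4, 4, 4])

widths = bins[:, 1] - bins[:, 0]
densities = frequencies / widths

print(densities)
```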
-
dtype
¶ Data type of the bin contents.
-
errors
¶ Bin errors.
-
errors2
Squares of the bin errors.
-
fill
(value: float, weight: float = 1, **kwargs) → Union[None, int, Tuple[int, ...]]¶ Update histogram with a new value.
It is an in-place operation.
Parameters: - value (Value to be added. Can be scalar or array depending on the histogram type.) –
- weight (Weight of the value) –
Note
May change the dtype if weight is set
-
fill_n
(values: Union[numpy.ndarray, Iterable[T_co], int, float], weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dropna: bool = True)¶ Update histogram with more values at once.
It is an in-place operation.
Parameters: - values (Values to add) –
- weights (Optional weights to assign to each value) –
- dropna (If true (default), all nan's are skipped.) –
Note
This method should be overloaded with a more efficient one.
May change the dtype if weight is set.
-
find_bin
(value: Union[numpy.ndarray, Iterable[T_co], int, float], axis: Union[int, str, None] = None) → Union[None, int, Tuple[int, ...]]¶ Index(-ices) of bin corresponding to a value.
Parameters: - value (Value with dimensionality equal to histogram) –
- axis (If set, find the bin along this axis. Otherwise, find bins along all axes.) – None = outside the bins
Returns: Return type: If axis is specified (or the histogram is 1D), a number. Otherwise, a tuple. If not available, None.
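For the 1D case, the lookup can be sketched with the standard library. This is only an illustration under stated assumptions: the bins are contiguous and sorted, and the last bin is treated as closed on the right (as numpy.histogram does). The name `find_bin_sketch` is hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def find_bin_sketch(edges: Sequence[float], value: float) -> Optional[int]:
    """Index of the bin containing value, or None if it lies outside."""
    if value < edges[0] or value > edges[-1]:
        return None
    if value == edges[-1]:
        # Assumption: the right-most edge belongs to the last bin.
        return len(edges) - 2
    # bisect_right counts the edges that are <= value.
    return bisect_right(edges, value) - 1


edges = [0.0, 1.0, 2.0, 3.0]
inside = find_bin_sketch(edges, 0.5)
on_edge = find_bin_sketch(edges, 1.0)
outside = find_bin_sketch(edges, 3.5)
```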
-
frequencies
Frequencies (values, contents) of the histogram bins.
-
classmethod
from_dict
(a_dict: Mapping[str, Any]) → physt.histogram_base.HistogramBase¶ Create an instance from a dictionary.
If customization is necessary, override the _from_dict_kwargs template method, not this one.
-
has_same_bins
(other: physt.histogram_base.HistogramBase) → bool¶ Whether two histograms share the same binning.
-
is_adaptive
() → bool¶ Whether the binning can be changed with operations.
-
merge_bins
(amount: Optional[int] = None, *, min_frequency: Optional[float] = None, axis: Union[int, str, None] = None, inplace: bool = False) → HistogramType¶ Reduce the number of bins and add their content:
Parameters: - amount (How many adjacent bins to join together.) –
- min_frequency (Try to have at least this value in each bin) – (this is not enforced, e.g. for minima between high bins)
- axis (On which axis to do this (None => all)) –
- inplace (Whether to modify this histogram or return a new one) –
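The effect of the amount parameter can be sketched on plain lists. The helper `merge_adjacent` is hypothetical and makes one assumption that the text above does not state: if the number of bins is not divisible by amount, the last merged bin is simply smaller.

```python
from typing import List, Sequence, Tuple


def merge_adjacent(
    edges: Sequence[float], freqs: Sequence[float], amount: int
) -> Tuple[List[float], List[float]]:
    """Join every `amount` adjacent bins and sum their contents."""
    if amount < 1:
        raise ValueError("amount must be a positive integer")
    new_edges = [edges[0]]
    new_freqs = []
    for start in range(0, len(freqs), amount):
        stop = min(start + amount, len(freqs))
        new_freqs.append(sum(freqs[start:stop]))
        new_edges.append(edges[stop])  # right edge of the merged bin
    return new_edges, new_freqs


merged_edges, merged_freqs = merge_adjacent(
    [0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5], amount=2
)
```

The total content is preserved; only the resolution of the binning decreases.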
-
meta_data
¶ A dictionary of non-numerical information about the histogram.
It contains several pre-defined keys, but you can add any others. These are preserved when saving and also in operations.
-
missed
¶ Total number (weight) of entries that missed the bins.
-
name
¶ Name of the histogram (stored in meta-data).
-
ndim
¶ Dimensionality of histogram’s data.
i.e. the number of axes along which we bin the values.
-
normalize
(inplace: bool = False, percent: bool = False) → physt.histogram_base.HistogramBase¶ Normalize the histogram, so that the total weight is equal to 1.
Parameters: - inplace (If True, updates itself. If False (default), returns copy) –
- percent (If True, normalizes to percent instead of 1. Default: False) –
Returns: Return type: either modified copy or self
See also
densities()
,HistogramND.partial_normalize()
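The arithmetic behind normalization is a single rescaling of the frequencies. The following sketch works on a plain list and uses a hypothetical helper name; how an empty histogram is handled is an assumption of the sketch.

```python
from typing import List, Sequence


def normalize_sketch(freqs: Sequence[float], percent: bool = False) -> List[float]:
    """Rescale frequencies so that they sum to 1 (or to 100 if percent)."""
    total = sum(freqs)
    if total == 0:
        # Assumption: an empty histogram cannot be normalized.
        raise ValueError("Cannot normalize a histogram with zero total.")
    target = 100.0 if percent else 1.0
    return [f * target / total for f in freqs]


as_fraction = normalize_sketch([2, 6, 2])
as_percent = normalize_sketch([2, 6, 2], percent=True)
```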
-
plot
¶ Proxy to plotting.
This attribute is a special proxy to plotting. In the simplest cases, it can be used as a method. For more sophisticated use, see the documentation for the physt.plotting package.
-
select
(axis: Union[int, str], index: Union[int, slice], *, force_copy: bool = False) → Any¶ Select in an axis.
Parameters: - axis (Axis, in which we select.) –
- index (Index of bin (as in numpy)) –
- force_copy (If True, an identity slice forces a copy to be made.) –
-
set_adaptive
(value: bool = True)¶ Change the histogram binning to (non)adaptive.
This requires the binnings in all dimensions to support it.
-
set_dtype
(value: Union[type, numpy.dtype, str], *, check: bool = True) → None¶ Change data type of the bin contents.
Allowed conversions:
- from integral to float types
- between the same category of type (float/integer)
- from float types to integer if weights are trivial
Parameters: - value (np.dtype or something convertible to it.) –
- check (If True (default), all values are checked against the limits) –
-
shape
¶ Shape of histogram’s data.
Returns: Return type: Tuple with the number of bins along each axis.
-
title
¶ Title of the histogram to be displayed when plotted (stored in meta-data).
If not specified, defaults to name.
-
to_dict
() → Dict[str, Any]¶ Dictionary with all data in the histogram.
This is used for export into various formats (e.g. JSON). If a descendant class needs to update the dictionary in some way (e.g. to add more information), override the _update_dict method.
-
to_json
(path: Optional[str] = None, **kwargs) → str¶ Convert to JSON representation.
Parameters: path (Where to write the JSON.) –
Returns: Return type: The JSON representation.
-
total
¶ Total number (sum of weights) of entries excluding underflow and overflow.
-
physt.histogram_collection module¶
-
class
physt.histogram_collection.
HistogramCollection
(*histograms, binning: Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float, None] = None, title: Optional[str] = None, name: Optional[str] = None)¶ Bases:
collections.abc.Container
,typing.Generic
,physt.histogram1d.ObjectWithBinning
Experimental collection of histograms.
It contains (potentially name-addressable) 1-D histograms with a shared binning.
-
add
(histogram: physt.histogram1d.Histogram1D) → None¶ Add a histogram to the collection.
-
axis_name
¶
-
axis_names
¶
-
binning
¶ The binning itself.
-
copy
() → physt.histogram_collection.HistogramCollection¶
-
create
(name: str, values, *, weights=None, dropna: bool = True, **kwargs) → physt.histogram1d.Histogram1D¶
-
classmethod
from_dict
(a_dict: Dict[str, Any]) → physt.histogram_collection.HistogramCollection¶
-
classmethod
multi_h1
(a_dict: Dict[str, Union[numpy.ndarray, Iterable[T_co], int, float]], bins=None, **kwargs) → physt.histogram_collection.HistogramCollection¶ Create a collection from multiple datasets.
-
normalize_all
(inplace: bool = False) → physt.histogram_collection.HistogramCollection¶ Normalize all histograms so that total content of each of them is equal to 1.0.
-
normalize_bins
(inplace: bool = False) → physt.histogram_collection.HistogramCollection¶ Normalize each bin in the collection so that the sum is 1.0 for each bin.
Note: If a bin is zero in all histograms of the collection, the result will be inf.
-
plot
¶ Proxy to plotting.
This attribute is a special proxy to plotting. In the simplest cases, it can be used as a method. For more sophisticated use, see the documentation for the physt.plotting package.
-
sum
() → physt.histogram1d.Histogram1D¶ Return the sum of all contained histograms.
-
to_dict
() → Dict[str, Any]¶
-
to_json
(path: Optional[str] = None, **kwargs) → str¶ Convert to JSON representation.
Parameters: path (Where to write the JSON.) –
Returns: Return type: The JSON representation.
-
physt.histogram_nd module¶
Multi-dimensional histograms.
-
class
physt.histogram_nd.
Histogram2D
(binnings, frequencies=None, **kwargs)¶ Bases:
physt.histogram_nd.HistogramND
Specialized 2D variant of the general HistogramND class.
In contrast to general HistogramND, it is plottable.
-
T
¶ Histogram with swapped axes.
Returns: Return type: Histogram2D - a copy with swapped axes
-
numpy_like
¶ The same result as the numpy.histogram function would return.
-
partial_normalize
(axis: Union[int, str] = 0, inplace: bool = False)¶ Normalize in rows or columns.
Parameters: - axis (int or str) – Along which axis to sum (numpy-sense)
- inplace (bool) – Update the object itself
Returns: hist
Return type:
-
-
class
physt.histogram_nd.
HistogramND
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.histogram_base.HistogramBase
Multi-dimensional histogram data.
-
accumulate
(axis: Union[int, str]) → physt.histogram_base.HistogramBase¶ Calculate cumulative frequencies along a certain axis.
Returns: new_hist Return type: Histogram of the same type & size
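What "cumulative frequencies along a certain axis" means can be shown on a small 2D table of frequencies. The sketch below uses nested lists and a hypothetical helper; axis numbering follows the numpy convention (axis 0 runs down the rows), which is an assumption here.

```python
from itertools import accumulate
from typing import List, Sequence


def accumulate_2d(freqs: Sequence[Sequence[float]], axis: int) -> List[List[float]]:
    """Running sums of a 2D frequency table along the given axis."""
    if axis == 1:
        # Running sum within each row.
        return [list(accumulate(row)) for row in freqs]
    if axis == 0:
        # Running sum within each column, then transpose back to rows.
        columns = [list(accumulate(col)) for col in zip(*freqs)]
        return [list(row) for row in zip(*columns)]
    raise ValueError("axis must be 0 or 1 for a 2D table")


table = [[1, 2], [3, 4]]
along_rows = accumulate_2d(table, axis=1)
along_columns = accumulate_2d(table, axis=0)
```

The result has the same shape as the input, in line with the "same type & size" return description.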
-
bin_sizes
¶
-
bins
¶ List of bin matrices.
-
edges
¶
-
fill
(value, weight=1, **kwargs)¶ Update histogram with a new value.
It is an in-place operation.
Parameters: - value (Value to be added. Can be scalar or array depending on the histogram type.) –
- weight (Weight of the value) –
Note
May change the dtype if weight is set
-
fill_n
(values: Union[numpy.ndarray, Iterable[T_co], int, float], weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dropna: bool = True, columns: bool = False)¶ Add more values at once.
Parameters: - values (array_like) – Values to add. Can be array of shape (count, ndim) or array of shape (ndim, count) [use columns=True] or something convertible to it
- weights (array_like) – Weights for values (optional)
- dropna (bool) – Whether to remove NaN values. If False and such a value is encountered, an exception is raised.
- columns (bool) – Signal that the data are transposed (in columns, instead of rows). This allows passing a list of arrays in values.
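The two layouts mentioned for values differ only by a transposition. A plain-Python sketch with made-up data (no physt calls) shows the relation between the (ndim, count) column layout and the (count, ndim) row layout:

```python
# Column layout: one sequence per axis, shape (ndim, count).
xs = [1.0, 2.0, 3.0]
ys = [10.0, 20.0, 30.0]
as_columns = [xs, ys]

# Row layout: one entry per data point, shape (count, ndim).
as_rows = [list(point) for point in zip(*as_columns)]
```

When the data already exist as separate per-axis arrays, the column layout avoids building the row layout by hand.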
-
find_bin
(value, axis: Union[int, str, None] = None) → Union[None, int, Tuple[int, ...]]¶ Index(-ices) of bin corresponding to a value.
Parameters: - value (Value with dimensionality equal to histogram) –
- axis (If set, find the bin along this axis. Otherwise, find bins along all axes.) – None = outside the bins
Returns: Return type: If axis is specified (or the histogram is 1D), a number. Otherwise, a tuple. If not available, None.
-
classmethod
from_calculate_frequencies
(data, binnings, weights=None, *, dtype=None, **kwargs)¶
-
get_bin_centers
(axis: Union[int, str, None] = None) → numpy.ndarray¶
-
get_bin_edges
(axis: Union[int, str, None] = None) → Tuple[numpy.ndarray, ...]¶
-
get_bin_left_edges
(axis: Union[int, str, None] = None) → numpy.ndarray¶
-
get_bin_right_edges
(axis: Union[int, str, None] = None) → numpy.ndarray¶
-
get_bin_widths
(axis: Union[int, str, None] = None) → numpy.ndarray¶
-
numpy_bins
¶ Numpy-like bins (if available).
-
numpy_like
¶ The same result as the numpy.histogram function would return.
-
projection
(*axes, **kwargs) → physt.histogram_base.HistogramBase¶ Reduce dimensionality by summing along axis/axes.
Parameters: - axes (Iterable[int or str]) – List of axes for the new histogram. Could be either numbers or names. Must contain at least one axis.
- name (Optional[str] # TODO: Check) – Name for the projected histogram (default: same)
- type (Optional[type] # TODO: Check) – If set, predefined class for the projection
Returns: Return type: HistogramND or Histogram2D or Histogram1D (or others in special cases)
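Summing along the axes that are dropped is the whole idea of a projection. A minimal sketch on a 2D table of frequencies follows; the helper `project_2d` is hypothetical and handles only the 2D to 1D case.

```python
from typing import List, Sequence


def project_2d(freqs: Sequence[Sequence[float]], keep_axis: int) -> List[float]:
    """Keep one axis of a 2D frequency table and sum over the other."""
    if keep_axis == 0:
        # One value per row: sum over the second axis.
        return [sum(row) for row in freqs]
    if keep_axis == 1:
        # One value per column: sum over the first axis.
        return [sum(col) for col in zip(*freqs)]
    raise ValueError("keep_axis must be 0 or 1 for a 2D table")


table = [[1, 2, 3], [4, 5, 6]]
onto_first = project_2d(table, keep_axis=0)
onto_second = project_2d(table, keep_axis=1)
```

Both projections keep the total content of the table.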
-
select
(axis: Union[int, str], index: Union[int, slice], *, force_copy: bool = False) → physt.histogram_base.HistogramBase¶ Select in an axis.
Parameters: - axis (Axis, in which we select.) –
- index (Index of bin (as in numpy)) –
- force_copy (If True, an identity slice forces a copy to be made.) –
-
total_size
¶ The total size of the bin space.
Note
Perhaps not optimized, but should work also with transformed axes
-
-
physt.histogram_nd.
calculate_frequencies
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None], binnings: Iterable[physt.binnings.BinningBase], weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dtype: Union[type, numpy.dtype, str, None] = None) → Tuple[Optional[numpy.ndarray], Optional[numpy.ndarray], float]¶ Get frequencies and bin errors from the data (n-dimensional variant).
Parameters: - data (2D array with ndim columns and a row for each entry.) –
- binnings (Binnings to apply in all axes.) –
- weights (1D array of weights to assign to values.) – (If present, must have same length as the number of rows.)
- dtype (Underlying type for the histogram.) – (If weights are specified, default is float. Otherwise int64.)
Returns: - frequencies (Frequencies (if data supplied))
- errors2 (Errors squared if different from frequencies)
- missing (scalar[dtype])
physt.special module¶
physt.special_histograms module¶
Transformed histograms.
These histograms use a transformation from input values to bins in a different coordinate system.
There are three basic classes:
- PolarHistogram
- CylindricalHistogram
- SphericalHistogram
Apart from these, there are their projections into lower dimensions.
And of course, it is possible to re-use the general transforming functionality by adding TransformedHistogramMixin among the custom histogram class superclasses.
-
class
physt.special_histograms.
AzimuthalHistogram
(binning: Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, errors2: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, keep_missed: bool = True, stats: Optional[Dict[str, float]] = None, overflow: Optional[float] = 0.0, underflow: Optional[float] = 0.0, inner_missed: Optional[float] = 0.0, axis_name: Optional[str] = None, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram1d.Histogram1D
Projection of polar histogram to 1D with respect to phi.
This is a special case of a 1D histogram with transformed coordinates.
-
bin_sizes
¶
-
default_axis_names
= ['phi']¶
-
default_init_values
= {'radius': 1}¶
-
radius
¶ Radius of the surface.
Useful for calculating densities.
-
source_ndim
= 2¶
-
-
class
physt.special_histograms.
CylindricalHistogram
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram_nd.HistogramND
3D histogram in cylindrical coordinates.
This is a special case of a 3D histogram with transformed coordinates:
- r as radius projection to xy plane in the (0, +inf) range
- phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
- z as the last direction without modification, in (-inf, +inf) range
-
bin_sizes
¶
-
default_axis_names
= ['rho', 'phi', 'z']¶
-
projection
(*axes, **kwargs)¶ Projection to lower-dimensional histogram.
The inheriting class should implement the _projection_class_map class attribute to suggest class for the projection. If the arguments don’t match any of the map keys, HistogramND is used.
-
source_ndim
= 3¶
-
-
class
physt.special_histograms.
CylindricalSurfaceHistogram
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram_nd.HistogramND
2D histogram in coordinates on cylinder surface.
This is a special case of a 2D histogram with transformed coordinates:
- phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
- z as the last direction without modification, in (-inf, +inf) range
-
radius
¶ The radius of the surface. Useful for plotting.
Type: float
-
bin_sizes
¶
-
default_axis_names
= ['rho', 'phi', 'z']¶
-
default_init_values
= {'radius': 1}¶
-
radius
Radius of the cylindrical surface.
Useful for calculating densities.
-
source_ndim
= 3¶
-
-
class
physt.special_histograms.
PolarHistogram
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram_nd.HistogramND
2D histogram in polar coordinates.
This is a special case of a 2D histogram with transformed coordinates:
- r as radius in the (0, +inf) range
- phi as azimuthal angle in the (0, 2*pi) range
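The change of coordinates behind a polar histogram can be sketched with the math module. This is an illustration, not the library's transform: the helper `to_polar` is hypothetical, and mapping the angle into the 0 to 2*pi range with a modulo is one possible convention.

```python
import math
from typing import Tuple


def to_polar(x: float, y: float) -> Tuple[float, float]:
    """Convert cartesian (x, y) to (r, phi) with phi in [0, 2*pi)."""
    r = math.hypot(x, y)
    # atan2 returns values in (-pi, pi]; shift negatives into [0, 2*pi).
    phi = math.atan2(y, x) % (2 * math.pi)
    return r, phi


r_east, phi_east = to_polar(1.0, 0.0)
r_south, phi_south = to_polar(0.0, -1.0)
```

After the transformation, r and phi are binned like any other pair of variables.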
-
bin_sizes
¶
-
default_axis_names
= ['r', 'phi']¶
-
source_ndim
= 2¶
-
-
class
physt.special_histograms.
RadialHistogram
(binning: Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, errors2: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, keep_missed: bool = True, stats: Optional[Dict[str, float]] = None, overflow: Optional[float] = 0.0, underflow: Optional[float] = 0.0, inner_missed: Optional[float] = 0.0, axis_name: Optional[str] = None, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram1d.Histogram1D
Projection of polar histogram to 1D with respect to radius.
This is a special case of a 1D histogram with transformed coordinates.
-
bin_sizes
¶
-
default_axis_names
= ['r']¶
-
source_ndim
= (2, 3)¶
-
-
class
physt.special_histograms.
SphericalHistogram
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram_nd.HistogramND
3D histogram in spherical coordinates.
This is a special case of a 3D histogram with transformed coordinates:
- r as radius in the (0, +inf) range
- theta as angle between z axis and the vector, in the (0, pi) range
- phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
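A standalone sketch of the cartesian to spherical conversion follows. It is not the library's code: the helper `to_spherical` is hypothetical, the polar angle theta is measured from the z axis and spans 0 to pi (consistent with the default theta_range of the spherical facade function), and returning theta = 0 at the origin is an arbitrary choice of the sketch.

```python
import math
from typing import Tuple


def to_spherical(x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Convert cartesian (x, y, z) to (r, theta, phi)."""
    r = math.sqrt(x * x + y * y + z * z)
    if r > 0:
        # Clamp to guard against rounding slightly outside [-1, 1].
        theta = math.acos(max(-1.0, min(1.0, z / r)))
    else:
        theta = 0.0  # Arbitrary choice: the angle is undefined at the origin.
    phi = math.atan2(y, x) % (2 * math.pi)
    return r, theta, phi


north = to_spherical(0.0, 0.0, 1.0)
equator = to_spherical(1.0, 0.0, 0.0)
south = to_spherical(0.0, 0.0, -2.0)
```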
-
bin_sizes
¶
-
default_axis_names
= ['r', 'theta', 'phi']¶
-
source_ndim
= 3¶
-
-
class
physt.special_histograms.
SphericalSurfaceHistogram
(binnings: Iterable[Union[physt.binnings.BinningBase, numpy.ndarray, Iterable[T_co], int, float]], frequencies: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dimension: Optional[int] = None, missed=0, **kwargs)¶ Bases:
physt.special_histograms.TransformedHistogramMixin
,physt.histogram_nd.HistogramND
2D histogram in spherical coordinates.
This is a special case of a 2D histogram with transformed coordinates:
- theta as angle between z axis and the vector, in the (0, pi) range
- phi as azimuthal angle (in the xy projection) in the (0, 2*pi) range
-
bin_sizes
¶
-
default_axis_names
= ['theta', 'phi']¶
-
default_init_values
= {'radius': 1}¶
-
radius
¶ Radius of the surface.
Useful for calculating densities.
-
source_ndim
= 3¶
-
-
class
physt.special_histograms.
TransformedHistogramMixin
¶ Bases:
abc.ABC
Histogram with non-cartesian (or otherwise transformed) axes.
This is a mixin, providing transform-aware find_bin, fill and fill_n.
When implementing, you are required to provide the following:
- _transform_correct_dimension method to convert rectangular coordinates (it must be a classmethod)
- bin_sizes property
In certain cases, you may want to have default axis names + projections. Look at PolarHistogram / SphericalHistogram / CylindricalHistogram as an example.
-
bin_sizes
¶
-
fill
(value: Union[numpy.ndarray, Iterable[T_co], int, float], weight: Union[numpy.ndarray, Iterable[T_co], int, float, None] = 1, *, transformed: bool = False, **kwargs)¶
-
fill_n
(values: Union[numpy.ndarray, Iterable[T_co], int, float], weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, dropna: bool = True, transformed: bool = False, **kwargs)¶
-
find_bin
(value, axis: Union[int, str, None] = None, transformed: bool = False)¶ Parameters: - value (array_like) – Value with dimensionality equal to histogram.
- transformed (bool) – If true, the value is already transformed and has same axes as the bins.
-
projection
(*axes, **kwargs)¶ Projection to lower-dimensional histogram.
The inheriting class should implement the _projection_class_map class attribute to suggest class for the projection. If the arguments don’t match any of the map keys, HistogramND is used.
-
classmethod
transform
(value) → Union[numpy.ndarray, float]¶ Convert cartesian (general) coordinates into internal ones.
Parameters: - value (array_like) – This method should accept both scalars and numpy arrays. If multiple values are to be transformed, it should be of (nvalues, ndim) shape.
- Note (Implement _) –
-
-
physt.special_histograms.
azimuthal
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins=16, range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights=None, transformed: bool = False, **kwargs) → physt.special_histograms.AzimuthalHistogram¶ Facade function to create an AzimuthalHistogram.
-
physt.special_histograms.
azimuthal_histogram
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins=16, range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights=None, transformed: bool = False, **kwargs) → physt.special_histograms.AzimuthalHistogram¶ Facade function to create an AzimuthalHistogram.
-
physt.special_histograms.
cylindrical
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, rho_bins='numpy', phi_bins=16, z_bins='numpy', transformed: bool = False, dropna: bool = True, rho_range: Optional[Tuple[float, float]] = None, phi_range: Tuple[float, float] = (0, 6.283185307179586), weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalHistogram¶ Facade function to create a cylindrical histogram.
-
physt.special_histograms.
cylindrical_histogram
(data: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, rho_bins='numpy', phi_bins=16, z_bins='numpy', transformed: bool = False, dropna: bool = True, rho_range: Optional[Tuple[float, float]] = None, phi_range: Tuple[float, float] = (0, 6.283185307179586), weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalHistogram¶ Facade function to create a cylindrical histogram.
-
physt.special_histograms.
cylindrical_surface
(data=None, *, phi_bins=16, z_bins='numpy', transformed: bool = False, radius: Optional[float] = None, dropna: bool = False, weights=None, phi_range: Tuple[float, float] = (0, 6.283185307179586), z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalSurfaceHistogram¶ Facade function to create a cylindrical surface histogram.
-
physt.special_histograms.
cylindrical_surface_histogram
(data=None, *, phi_bins=16, z_bins='numpy', transformed: bool = False, radius: Optional[float] = None, dropna: bool = False, weights=None, phi_range: Tuple[float, float] = (0, 6.283185307179586), z_range: Optional[Tuple[float, float]] = None, **kwargs) → physt.special_histograms.CylindricalSurfaceHistogram¶ Facade function to create a cylindrical surface histogram.
-
physt.special_histograms.
polar
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', radial_range: Optional[Tuple[float, float]] = None, phi_bins=16, phi_range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.PolarHistogram¶ Facade construction function for the PolarHistogram.
-
physt.special_histograms.
polar_histogram
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', radial_range: Optional[Tuple[float, float]] = None, phi_bins=16, phi_range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.PolarHistogram¶ Facade construction function for the PolarHistogram.
-
physt.special_histograms.
radial
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, zdata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins='numpy', range: Optional[Tuple[float, float]] = None, dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.RadialHistogram¶ Facade function to create a radial histogram.
-
physt.special_histograms.
radial_histogram
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, zdata: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, *, bins='numpy', range: Optional[Tuple[float, float]] = None, dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.RadialHistogram¶ Facade function to create a radial histogram.
-
physt.special_histograms.
spherical
(data: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', theta_bins=16, phi_bins=16, dropna: bool = True, transformed: bool = False, theta_range: Tuple[float, float] = (0, 3.141592653589793), phi_range: Tuple[float, float] = (0, 6.283185307179586), radial_range: Optional[Tuple[float, float]] = None, weights=None, **kwargs) → physt.special_histograms.SphericalHistogram¶ Facade function to create a spherical histogram.
-
physt.special_histograms.
spherical_histogram
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', radial_range: Optional[Tuple[float, float]] = None, phi_bins=16, phi_range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.PolarHistogram¶ Facade construction function for the PolarHistogram.
-
physt.special_histograms.
spherical_surface
(data: Union[numpy.ndarray, Iterable[T_co], int, float], *, theta_bins=16, phi_bins=16, transformed: bool = False, radius: Optional[float] = None, dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, theta_range: Tuple[float, float] = (0, 3.141592653589793), phi_range: Tuple[float, float] = (0, 6.283185307179586), **kwargs) → physt.special_histograms.SphericalSurfaceHistogram¶ Facade construction function for the SphericalSurfaceHistogram.
-
physt.special_histograms.
spherical_surface_histogram
(xdata: Union[numpy.ndarray, Iterable[T_co], int, float], ydata: Union[numpy.ndarray, Iterable[T_co], int, float], *, radial_bins='numpy', radial_range: Optional[Tuple[float, float]] = None, phi_bins=16, phi_range: Tuple[float, float] = (0, 6.283185307179586), dropna: bool = False, weights: Union[numpy.ndarray, Iterable[T_co], int, float, None] = None, transformed: bool = False, **kwargs) → physt.special_histograms.PolarHistogram¶ Facade construction function for the PolarHistogram.
physt.time module¶
physt.typing_aliases module¶
Definitions for type hints.
physt.util module¶
Various utility functions to support physt implementation.
These functions are mostly general Python functions, not specific for numerical computing, histogramming, etc.
-
physt.util.
all_subclasses
(cls: type) → Tuple[type, ...]¶ All subclasses of a class.
-
physt.util.
deprecation_alias
(f, deprecated_name: str)¶ Provide a deprecated copy of a function.
Parameters: - f (The correct function) –
- deprecated_name (The name the function will be given) –
Examples
>>> def new(x): return 1
>>> old = deprecation_alias(new, "old")
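One way such an alias could be built is sketched below. This is not the library's implementation: the name `deprecation_alias_sketch`, the warning text and the use of DeprecationWarning are all assumptions made for illustration.

```python
import functools
import warnings


def deprecation_alias_sketch(f, deprecated_name: str):
    """Return a copy of f that warns about its deprecated name when called."""

    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{deprecated_name} is deprecated, use {f.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return f(*args, **kwargs)

    # The alias carries the old name; the original function is untouched.
    wrapper.__name__ = deprecated_name
    return wrapper


def new(x):
    return 1


old = deprecation_alias_sketch(new, "old")
```

Calling old(...) returns the same result as new(...) and additionally emits a warning.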
-
physt.util.
find_subclass
(base: type, name: str) → type¶ Find a named subclass of a base class.
Uses only the class name without namespace.
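Both subclass helpers can be approximated with `type.__subclasses__()`. The sketch below uses hypothetical names and toy classes; raising an error when the name is not matched exactly once is an assumption, since the text above does not say what happens in that case.

```python
from typing import Tuple


def all_subclasses_sketch(cls: type) -> Tuple[type, ...]:
    """Direct and indirect subclasses of cls (depth-first order)."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses_sketch(sub))
    return tuple(found)


def find_subclass_sketch(base: type, name: str) -> type:
    """Find a subclass of base by its bare class name (no module prefix)."""
    matches = [c for c in all_subclasses_sketch(base) if c.__name__ == name]
    if len(matches) != 1:
        # Assumption: a missing or ambiguous name is treated as an error.
        raise RuntimeError(f"Expected exactly one subclass named {name!r}")
    return matches[0]


class Base: ...
class Child(Base): ...
class GrandChild(Child): ...


subclasses = all_subclasses_sketch(Base)
located = find_subclass_sketch(Base, "GrandChild")
```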
-
physt.util.
pop_many
(a_dict: Dict[str, Any], *args, **kwargs) → Dict[str, Any]¶ Pop multiple items from a dictionary.
Parameters: - a_dict (Dictionary from which the items will be popped) –
- args (Keys which will be popped (and not included if not present)) –
- kwargs (Keys + default value pairs (if key not found, this default is included)) –
Returns: Return type: A dictionary of collected items.
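The described semantics can be reproduced in a few lines of plain Python. The function below is a sketch written from the parameter descriptions above, under the hypothetical name `pop_many_sketch`; it is not the library's source.

```python
from typing import Any, Dict


def pop_many_sketch(a_dict: Dict[str, Any], *args: str, **kwargs: Any) -> Dict[str, Any]:
    """Pop several keys at once and return them as a new dictionary."""
    result = {}
    # Positional keys: popped if present, silently skipped otherwise.
    for key in args:
        if key in a_dict:
            result[key] = a_dict.pop(key)
    # Keyword keys: popped if present, otherwise the default is included.
    for key, default in kwargs.items():
        result[key] = a_dict.pop(key, default)
    return result


options = {"bins": 10, "color": "red"}
picked = pop_many_sketch(options, "bins", "alpha", range=None)
```

A typical use is splitting a **kwargs dictionary into the options one function consumes and the rest that is passed on.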
physt.version module¶
Package information.
Module contents¶
physt¶
P(i/y)thon h(i/y)stograms. Inspired by (and based on) numpy.histogram, but designed for humans(TM) on steroids(TM).
(C) Jan Pipek, 2016-2021, MIT licence. See https://github.com/janpipek/physt