netket.stats.statistics

netket.stats.statistics(data)

Returns statistics of a given array (or matrix, see below) containing a stream of data. This is particularly useful for analyzing Markov chain data, but it can also be used for other types of time series.

Parameters:

data (vector or matrix) – The input data. It can be real or complex valued.
* If a vector, it is assumed to be a single time series of data (not necessarily independent).
* If a matrix, it is assumed that the rows data[i] contain independent time series.

Returns:

A dictionary-compatible class containing the average (.mean, ["Mean"]), the variance (.variance, ["Variance"]), the Monte Carlo standard error of the mean (.error_of_mean, ["Sigma"]), an estimate of the autocorrelation time (.tau_corr, ["TauCorr"]), and the Gelman-Rubin split-Rhat diagnostic (.R_hat, ["R_hat"]).

If the flag NETKET_EXPERIMENTAL_FFT_AUTOCORRELATION is set, the autocorrelation is computed exactly using an FFT, and an extra field tau_corr_max is added to the statistics object.
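To illustrate what computing the autocorrelation with an FFT can look like, here is a minimal NumPy sketch for a single real-valued time series. It is not NetKet's implementation: the function names, the convention tau = 1/2 + sum of rho(t), and the rule of truncating the sum at the first non-positive value are all assumptions made for this example.

```python
import numpy as np


def autocorrelation_fft(x):
    """Normalized autocorrelation function of a 1D real series, via FFT.

    Assumes the series is not constant (otherwise the variance is zero).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    # Zero-pad to length 2n so the circular correlation does not wrap around.
    spectrum = np.fft.rfft(centered, n=2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum), n=2 * n)[:n]
    return acf / acf[0]


def integrated_autocorrelation_time(x):
    """Integrated autocorrelation time, tau = 1/2 + sum_{t>=1} rho(t).

    The sum is truncated at the first non-positive rho(t); this windowing
    rule is an assumption of the sketch, not a library default.
    """
    rho = autocorrelation_fft(x)
    non_positive = np.nonzero(rho <= 0)[0]
    window = int(non_positive[0]) if non_positive.size else rho.size
    return 0.5 + float(np.sum(rho[1:window]))
```

With this convention, uncorrelated samples give a value close to 0.5, and larger values indicate that consecutive samples carry less independent information.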

These properties can be accessed using either the attribute or the dictionary-style syntax (both indicated above).
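The dual access pattern can be sketched with a small stand-in class. Everything below is hypothetical and only mimics the behaviour described here: SimpleStats and simple_statistics are illustrative names, not part of NetKet, and the error estimate is a naive one that ignores autocorrelation within a chain.

```python
from dataclasses import dataclass

import numpy as np

# Dictionary-style keys mapped onto attribute names (keys taken from the text above).
_KEYS = {"Mean": "mean", "Variance": "variance", "Sigma": "error_of_mean"}


@dataclass(frozen=True)
class SimpleStats:
    """Hypothetical stand-in for a dictionary-compatible statistics object."""

    mean: complex
    variance: float
    error_of_mean: float

    def __getitem__(self, key):
        return getattr(self, _KEYS[key])


def simple_statistics(data):
    """Naive statistics for data of shape (n_samples,) or (n_chains, n_samples)."""
    data = np.atleast_2d(np.asarray(data))
    n_chains, n_samples = data.shape
    mean = data.mean()
    # |x - mean|^2 keeps the variance real for complex-valued data.
    variance = float(np.mean(np.abs(data - mean) ** 2))
    if n_chains > 1:
        # Rows are independent, so the spread of the row means estimates the error.
        error = float(np.sqrt(np.var(data.mean(axis=1), ddof=1) / n_chains))
    else:
        # Only valid if the samples within the single chain are uncorrelated.
        error = float(np.sqrt(variance / n_samples))
    return SimpleStats(mean=mean, variance=variance, error_of_mean=error)


stats = simple_statistics([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
same_value = stats["Sigma"] == stats.error_of_mean  # both syntaxes read one field
```

Both syntaxes read the same underlying value, which is convenient when the result is passed to code that expects either an object or a plain dictionary.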

The split-Rhat diagnostic is based on comparing intra-chain and inter-chain statistics of the sample and is thus only available for 2D-array inputs where the rows are independently sampled MCMC chains. For an ideal MCMC sample, R_hat should be 1.0. A value that deviates too much from 1.0 indicates MCMC convergence issues. Thresholds such as R_hat > 1.1 or even R_hat > 1.01 have been suggested in the literature for when to discard a sample. (See, e.g., Gelman et al., Bayesian Data Analysis, or Vehtari et al., arXiv:1903.08008.)
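The comparison of intra-chain and inter-chain statistics can be sketched as follows. This follows the textbook split-Rhat formula from Gelman et al. for real-valued data and is an illustration only; the function name split_rhat is an assumption, and NetKet's own implementation may differ in detail (for instance in how complex data is handled).

```python
import numpy as np


def split_rhat(chains):
    """Textbook split-Rhat for real data of shape (n_chains, n_samples)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_samples = chains.shape
    half = n_samples // 2
    # Split every chain into a first and a second half (the middle sample
    # is dropped when n_samples is odd), giving 2 * n_chains sequences.
    halves = np.concatenate(
        [chains[:, :half], chains[:, n_samples - half:]], axis=0
    )
    # W: average within-sequence variance.
    within = halves.var(axis=1, ddof=1).mean()
    # B / n: variance of the sequence means.
    between_over_n = halves.mean(axis=1).var(ddof=1)
    # Pooled estimate of the marginal posterior variance.
    var_plus = (half - 1) / half * within + between_over_n
    return float(np.sqrt(var_plus / within))
```

Chains that sample the same distribution give a value close to 1.0, whereas chains stuck around different values inflate the inter-chain term and push the result well above the thresholds quoted above.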

Return type:

Stats
