netket.stats.statistics
- netket.stats.statistics(data)
Returns statistics of a given array (a vector or a matrix, see below) containing a stream of data. This is particularly useful for analyzing Markov chain data, but it can also be used for other types of time series.
- Parameters:
data (vector or matrix) – The input data. It can be real or complex valued.
  * if a vector, it is assumed to be a single time series of data (not necessarily independent);
  * if a matrix, it is assumed that the rows data[i] contain independent time series.
- Returns:
A dictionary-compatible class containing the average (.mean, ["Mean"]), the variance (.variance, ["Variance"]), the Monte Carlo standard error of the mean (.error_of_mean, ["Sigma"]), an estimate of the autocorrelation time (.tau_corr, ["TauCorr"]), and the Gelman-Rubin split-Rhat diagnostic (.R_hat, ["R_hat"]).
If the flag NETKET_EXPERIMENTAL_FFT_AUTOCORRELATION is set, the autocorrelation is computed exactly using an FFT, and an extra field tau_corr_max is added to the statistics object.
These properties can be accessed with either the attribute syntax or the dictionary-style syntax (both indicated above).
The split-Rhat diagnostic compares intra-chain and inter-chain statistics of the sample and is thus only available for 2D array inputs whose rows are independently sampled MCMC chains. For an ideal MCMC sample, R_hat should be 1.0. A value that deviates too much from 1.0 indicates MCMC convergence issues. Thresholds such as R_hat > 1.1 or even R_hat > 1.01 have been suggested in the literature for when to discard a sample (see, e.g., Gelman et al., Bayesian Data Analysis, or Vehtari et al., arXiv:1903.08008).
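To make the intra-chain versus inter-chain comparison concrete, here is a minimal NumPy sketch of a split-Rhat estimate in the textbook form given by Gelman et al. It is an illustration only, not NetKet's implementation: the function name split_r_hat is hypothetical, and the sketch assumes real-valued data with rows as independent chains.

```python
import numpy as np


def split_r_hat(chains):
    """Textbook split-Rhat for a 2D array whose rows are independent chains.

    Hypothetical helper for illustration; assumes real-valued input.
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_samples = chains.shape
    half = n_samples // 2
    # Split every chain into two halves (a trailing sample is dropped if the
    # length is odd), giving 2 * n_chains sequences of length `half`.
    halves = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    # W: average of the within-sequence variances.
    within = halves.var(axis=1, ddof=1).mean()
    # B / n: variance of the sequence means.
    between_over_n = halves.mean(axis=1).var(ddof=1)
    # Pooled variance estimate, then the ratio against the within-sequence variance.
    var_plus = (half - 1) / half * within + between_over_n
    return np.sqrt(var_plus / within)


rng = np.random.default_rng(0)
# Four well-mixed chains: all rows sample the same distribution.
mixed = rng.normal(size=(4, 1000))
# Four "stuck" chains: each row is shifted to a different mean.
stuck = mixed + 2.0 * np.arange(4)[:, None]

r_mixed = split_r_hat(mixed)
r_stuck = split_r_hat(stuck)
print(f"well-mixed chains: R_hat = {r_mixed:.4f}")
print(f"stuck chains:      R_hat = {r_stuck:.4f}")
```

In this sketch the well-mixed chains give a value close to 1.0, while the shifted chains exceed the 1.1 threshold mentioned above because their means disagree by far more than the within-chain spread.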
- Return type:
Stats
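The quantities listed under Returns can be illustrated with a short NumPy sketch that estimates the Monte Carlo standard error and an autocorrelation time from the spread of the per-chain means. This is one common estimator written under stated assumptions (real-valued data, rows are independent chains of equal length); it is not claimed to be what NetKet computes internally. The helper name chain_statistics is hypothetical, and the dictionary keys merely mirror the ones documented above.

```python
import numpy as np


def chain_statistics(chains):
    """Illustrative statistics for a 2D array whose rows are independent chains.

    Hypothetical helper; assumes real-valued input.
    """
    chains = np.asarray(chains, dtype=float)
    n_chains, n_samples = chains.shape
    mean = chains.mean()
    variance = chains.var()
    # The chains are independent, so the spread of their means reflects the
    # correlations inside each chain.
    var_of_chain_means = chains.mean(axis=1).var(ddof=1)
    error_of_mean = np.sqrt(var_of_chain_means / n_chains)
    # Uses Var(chain mean) ~ variance * (1 + 2 * tau) / n_samples, clipped at 0.
    tau_corr = max(0.0, 0.5 * (n_samples * var_of_chain_means / variance - 1.0))
    return {"Mean": mean, "Variance": variance, "Sigma": error_of_mean, "TauCorr": tau_corr}


# Synthetic correlated data: 64 independent AR(1) chains with coefficient 0.5.
rng = np.random.default_rng(1)
n_chains, n_samples, phi = 64, 2000, 0.5
noise = rng.normal(size=(n_chains, n_samples))
chains = np.empty_like(noise)
chains[:, 0] = noise[:, 0]
for t in range(1, n_samples):
    chains[:, t] = phi * chains[:, t - 1] + noise[:, t]

stats = chain_statistics(chains)
# Error one would get by (wrongly) treating all samples as independent.
naive_error = np.sqrt(stats["Variance"] / chains.size)
print(stats)
print("naive error:", naive_error)
```

Because successive samples are correlated, the estimated error of the mean comes out larger than the naive value that treats every sample as independent. With NetKet installed, the documented call netket.stats.statistics(chains) takes the same kind of input, a matrix whose rows are independent time series.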