Non-parametric cluster-level paired t-test.
array, shape (n_observations, p[, q][, r])
The data to be clustered. The first dimension should correspond to the
difference between paired samples (observations) in two conditions.
X[k] can be 1D (e.g., time series), 2D (e.g.,
time series over channels), or 3D (e.g., time-frequencies over
channels) associated with the kth observation. For spatiotemporal data,
see also mne.stats.spatio_temporal_cluster_1samp_test().
The so-called “cluster forming threshold” in the form of a test statistic
(note: this is not an alpha level / “p-value”).
If numeric, vertices with data values more extreme than threshold will
be used to form clusters. If
None, a t-threshold will be chosen
automatically that corresponds to a p-value of 0.05 for the given number of
observations (only valid when using a t-statistic). If
threshold is a
dict (with keys 'start' and
'step'), then threshold-free
cluster enhancement (TFCE) will be used (see the
TFCE example).
See Notes for an example on how to compute a threshold based on
a particular p-value for one-tailed or two-tailed tests.
The number of permutations to compute. Can be ‘all’ to perform an exact test.
If tail is 1, the statistic is thresholded above threshold. If tail is -1, the statistic is thresholded below threshold. If tail is 0, the statistic is thresholded on both sides of the distribution.
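The effect of tail on cluster formation can be sketched in plain Python. This is a simplified illustration for 1D data with lattice adjacency, not the library's implementation; the helper names label_points and contiguous_clusters are hypothetical, and the sketch assumes a positive threshold for tail=0 and tail=1 and a negative threshold for tail=-1.

```python
def label_points(stats, threshold, tail):
    """Label each value +1 (above), -1 (below), or 0 (not supra-threshold)."""
    labels = []
    for s in stats:
        if tail == 1:
            labels.append(1 if s > threshold else 0)
        elif tail == -1:
            # Assumption: threshold is negative when testing the lower tail.
            labels.append(-1 if s < threshold else 0)
        elif tail == 0:
            # Both sides: above +threshold or below -threshold.
            if s > threshold:
                labels.append(1)
            elif s < -threshold:
                labels.append(-1)
            else:
                labels.append(0)
        else:
            raise ValueError("tail must be -1, 0, or 1")
    return labels


def contiguous_clusters(labels):
    """Group neighboring indices that share the same nonzero label."""
    clusters, current = [], []
    for i, lab in enumerate(labels):
        if lab != 0 and current and labels[current[-1]] == lab:
            current.append(i)  # extends the running cluster
        else:
            if current:
                clusters.append(current)
            current = [i] if lab != 0 else []
    if current:
        clusters.append(current)
    return clusters


stats = [0.5, 2.5, 3.0, -2.8, -3.1, 0.1, 2.2]
both = contiguous_clusters(label_points(stats, 2.0, tail=0))
upper = contiguous_clusters(label_points(stats, 2.0, tail=1))
lower = contiguous_clusters(label_points(stats, -2.0, tail=-1))
```

In this sketch, positive and negative runs are kept as separate clusters when tail=0, even if they are adjacent.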
Function called to calculate the test statistic. Must accept 1D-array as
input and return a 1D array. If
None (the default), uses mne.stats.ttest_1samp_no_p.
Defines adjacency between locations in the data, where “locations” can be
spatial vertices, frequency bins, time points, etc. For spatial vertices,
see mne.channels.find_ch_adjacency(). If False, assumes
no adjacency (each location is treated as independent and unconnected). If
None, a regular lattice adjacency is assumed, connecting
each location to its neighbor(s) along the last dimension of
X (or the last two dimensions if
X is 2D). If
adjacency is a matrix, it is assumed to be symmetric (only the
upper triangular half is used) and must be square with dimension equal to
X.shape[-1] (for 2D data) or
X.shape[-1] * X.shape[-2]
(for 3D data) or (optionally)
X.shape[-1] * X.shape[-2] * X.shape[-3]
(for 4D data). The function
mne.stats.combine_adjacency may be useful for 4D data.
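As a quick sanity check on the size rule above, the expected dimension of a full adjacency matrix can be computed from the shape of X. This is a minimal sketch; expected_adjacency_size is a hypothetical helper, not part of the library, and it does not cover the case where adjacency is given for sensors only and extended via max_step.

```python
import math


def expected_adjacency_size(x_shape):
    """Rows (= columns) a full adjacency matrix needs for data of this shape.

    The first axis of X indexes observations, so only the remaining
    axes contribute locations.
    """
    if not 2 <= len(x_shape) <= 4:
        raise ValueError("X must be 2D, 3D, or 4D")
    return math.prod(x_shape[1:])


size_2d = expected_adjacency_size((20, 64))           # X.shape[-1]
size_3d = expected_adjacency_size((20, 100, 64))      # X.shape[-1] * X.shape[-2]
size_4d = expected_adjacency_size((20, 30, 100, 64))  # product of last three axes
```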
The number of jobs to run in parallel. If
-1, it is set
to the number of CPU cores. Requires the joblib package.
None (default) is a marker for ‘unset’ that will be interpreted
as n_jobs=1 (sequential execution) unless the call is performed under
a joblib.parallel_backend() context manager that sets another
value for n_jobs.
None | int | instance of RandomState
A seed for the NumPy random number generator (RNG). If None (default),
the seed will be obtained from the operating system (see
RandomState for details), meaning it will most
likely produce different output every time this function or method is run.
To achieve reproducible results, pass a value here to explicitly initialize
the RNG with a defined state.
Maximum distance between samples along the second axis of
X to be
considered adjacent (typically the second axis is the “time” dimension).
Only used when
adjacency has shape (n_vertices, n_vertices), that is,
when adjacency is only specified for sensors (e.g., via
mne.channels.find_ch_adjacency()), and not via sensors and
further dimensions such as time points (e.g., via an additional call of
mne.stats.combine_adjacency()).
Mask to apply to the data to exclude certain points from clustering
(e.g., medial wall vertices). Should be the same shape as X. If
None, no points are excluded.
To perform a step-down-in-jumps test, pass a p-value for clusters to exclude from each successive iteration. The default, 0, performs no step-down test (since no cluster p-value will be smaller than this value). Setting this to a reasonable value, e.g., 0.05, can increase sensitivity but costs computation time.
Power to raise the statistical values (usually t-values) by before
summing (sign will be retained). Note that
t_power=0 will give a
count of locations in each cluster, whereas
t_power=1 will weight each location
by its statistical score.
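The role of t_power can be illustrated with a small sketch of a cluster-level statistic. The function cluster_stat is a hypothetical helper written for this illustration; it only follows the description above (raise each value to t_power, retain the sign, then sum).

```python
import math


def cluster_stat(values, t_power=1):
    """Sum sign(v) * |v| ** t_power over the values in one cluster."""
    return sum(math.copysign(abs(v) ** t_power, v) for v in values)


positive_cluster = [2.0, 3.0, 2.5]
negative_cluster = [-2.0, -3.0]

mass = cluster_stat(positive_cluster, t_power=1)      # weighted by score
size = cluster_stat(positive_cluster, t_power=0)      # count of locations
neg_size = cluster_stat(negative_cluster, t_power=0)  # sign is retained
```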
Output format of clusters within a list. If
'mask', returns a list of boolean arrays,
each with the same shape as the input data (or slices if the shape is 1D
and adjacency is None), with
True values indicating locations that are
part of a cluster. If
'indices', returns a list of tuple of ndarray,
where each ndarray contains the indices of locations that together form the
given cluster along the given dimension. Note that for large datasets,
'indices' may use far less memory than 'mask'.
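The relationship between the two output formats can be sketched as follows. Nested Python lists stand in for ndarrays here, and indices_to_mask and mask_to_indices are hypothetical helpers for a single 2D cluster, not library functions.

```python
def indices_to_mask(indices, shape):
    """Build a boolean mask of the given 2D shape from (rows, cols) indices."""
    rows, cols = indices
    mask = [[False] * shape[1] for _ in range(shape[0])]
    for r, c in zip(rows, cols):
        mask[r][c] = True
    return mask


def mask_to_indices(mask):
    """Return (rows, cols) listing every True location in a 2D mask."""
    rows, cols = [], []
    for r, row in enumerate(mask):
        for c, flag in enumerate(row):
            if flag:
                rows.append(r)
                cols.append(c)
    return rows, cols


cluster_indices = ([0, 0, 1], [1, 2, 2])  # one index sequence per dimension
cluster_mask = indices_to_mask(cluster_indices, shape=(2, 4))
round_trip = mask_to_indices(cluster_mask)
```

The mask stores one value per location in the data, while the index form stores only the members of the cluster, which is why it tends to be smaller for large datasets.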
Whether to check if the connectivity matrix can be separated into disjoint
sets before clustering. This may lead to faster clustering, especially if
the second dimension of
X (usually the “time” dimension) is large.
Block size to use when computing test statistics. This can significantly
reduce memory usage when
n_jobs > 1 and memory sharing between
processes is enabled (see mne.set_cache_dir()), because
X will be
shared between processes and each process only needs to allocate space for
a small block of locations at a time.
From an array of paired observations, e.g. a difference in signal amplitudes or power spectra in two conditions, calculate if the data distributions in the two conditions are significantly different. The procedure uses a cluster analysis with permutation test for calculating corrected p-values. Randomized data are generated with random sign flips. See  for more information.
Because a 1-sample t-test on the difference in observations is mathematically equivalent to a paired t-test, internally this function computes a 1-sample t-test (by default) and uses sign flipping (always) to perform permutations. This might not be suitable for the case where there is truly a single observation under test; see Statistical inference.
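The sign-flipping idea can be sketched for a single location, without any clustering. This is a simplified, standard-library illustration under stated assumptions, not the library's implementation: t_1samp and sign_flip_p_value are hypothetical helpers, the test is two-tailed, and all 2 ** n sign patterns are enumerated rather than sampled.

```python
import itertools
import math
import statistics


def t_1samp(diffs):
    """One-sample t statistic of the paired differences against zero."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def sign_flip_p_value(diffs):
    """Two-tailed p-value from the full sign-flip null distribution."""
    t_obs = abs(t_1samp(diffs))
    count = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        flipped = [s * d for s, d in zip(signs, diffs)]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(t_1samp(flipped)) >= t_obs - 1e-12:
            count += 1
    return count / total


# Made-up paired differences (condition A minus condition B), one per observation.
diffs = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3]
p_value = sign_flip_p_value(diffs)
```

Because every difference here is positive, only the unflipped pattern and its mirror image reach the observed |t|, so the p-value is 2 / 64.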
For computing a threshold based on a p-value, use the conversion:

    pval = 0.001  # arbitrary
    df = n_observations - 1  # degrees of freedom for the test
    thresh = scipy.stats.t.ppf(1 - pval / 2, df)  # two-tailed, t distribution

For a one-tailed test (tail=1), don’t divide the p-value by 2. For testing the lower tail (tail=-1), don’t subtract pval from 1.
If n_permutations exceeds the maximum number of possible permutations
given the number of observations, then n_permutations and seed
will be ignored since an exact test (full permutation test) will be
performed (this is the case when
n_permutations >= 2 ** (n_observations - (tail == 0))).
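The condition above can be checked directly before running the test. A minimal sketch, where max_sign_flips and uses_exact_test are hypothetical helpers and n_permutations is assumed to be an integer (the 'all' option is not handled):

```python
def max_sign_flips(n_observations, tail):
    """Number of distinct sign-flip permutations for the given tail.

    For a two-tailed test (tail == 0), flipping every sign yields the same
    absolute statistic, so the count is halved.
    """
    return 2 ** (n_observations - (tail == 0))


def uses_exact_test(n_permutations, n_observations, tail):
    """True if the requested permutations meet or exceed the maximum."""
    return n_permutations >= max_sign_flips(n_observations, tail)


small_two_tailed = uses_exact_test(1024, n_observations=10, tail=0)
small_one_tailed = uses_exact_test(1024, n_observations=10, tail=1)
large_two_tailed = uses_exact_test(1024, n_observations=20, tail=0)
```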
If no initial clusters are found because all points in the true
distribution are below the threshold, then clusters, cluster_pv, and
H0 will all be empty arrays.