| Title: | Inference for High-Dimensional Mixture Transition Distribution Models |
|---|---|
| Description: | Estimates parameters in Mixture Transition Distribution (MTD) models, a class of high-order Markov chains. The set of relevant pasts (lags) is selected using either the Bayesian Information Criterion or the Forward Stepwise and Cut algorithms. Other model parameters (e.g. transition probabilities and oscillations) can be estimated via maximum likelihood estimation or the Expectation-Maximization algorithm. Additionally, 'hdMTD' includes a perfect sampling algorithm that generates samples of an MTD model from its invariant distribution. For theory, see Ost & Takahashi (2023) <http://jmlr.org/papers/v24/22-0266.html>. |
| Authors: | Maiara Gripp [aut, cre], Guilherme Ost [ths], Giulio Iacobelli [ths] |
| Maintainer: | Maiara Gripp <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.4 |
| Built: | 2026-05-25 06:51:54 UTC |
| Source: | https://github.com/maiaragripp/hdmtd |
Convenience coercion to rebuild an object of class "MTD" from an EM fit
(i.e., an output from MTDest() with class c("MTDest","MTD")).
This simply feeds the estimated parameters back into MTDmodel.
Note that most methods for "MTD" also work directly on "MTDest"
objects via inheritance and explicit coercion is therefore optional.
as.MTD(x, ...)as.MTD(x, ...)
x |
An object of class |
... |
Further arguments passed to or from other methods (ignored). |
An object of class "MTD" as returned by MTDmodel.
## Not run: set.seed(1) MTD <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) # generates MTD model X <- perfectSample(MTD, N = 400) # generates MTD sample init <- list( p0 = c(0.4, 0.6), lambdas = c(0.05, 0.45, 0.5), pj = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 3), init = init) # estimates parameters from sample m <- as.MTD(fit) # generates an MTD model from estimated parameters str(m, max.level = 1) ## End(Not run)## Not run: set.seed(1) MTD <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) # generates MTD model X <- perfectSample(MTD, N = 400) # generates MTD sample init <- list( p0 = c(0.4, 0.6), lambdas = c(0.05, 0.45, 0.5), pj = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 3), init = init) # estimates parameters from sample m <- as.MTD(fit) # generates an MTD model from estimated parameters str(m, max.level = 1) ## End(Not run)
Creates a tibble containing all unique sequences of length d+1 found in
the sample, along with their absolute frequencies.
countsTab(X, d)countsTab(X, d)
X |
A numeric vector, a single-column data frame, or a list with a sample from a Markov chain. The first element must be the most recent observation. |
d |
A positive integer specifying the number of elements in each sequence,
which will be |
The function generates a tibble with d+2 columns. In the first
d+1 columns, each row displays a unique sequence of size d+1
observed in the sample. The last column, called Nxa, contains the
number of times each of these sequences appeared in the sample.
The number of rows in the output varies between and ,
where is the number of unique states in X, since it depends
on the number of unique sequences that appear in the sample.
A tibble with all observed sequences of length d+1 and their
absolute frequencies.
countsTab(c(1,2,2,1,2,1,1,2,1,2), d = 3) # Using simulated data. set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) countsTab(X, d = 2)countsTab(c(1,2,2,1,2,1,1,2,1,2), d = 3) # Using simulated data. set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) countsTab(X, d = 2)
Calculates the total variation distance between distributions conditioned in a given past sequence.
dTV_sample(S, j, A = NULL, base, lenA = NULL, A_pairs = NULL, x_S)dTV_sample(S, j, A = NULL, base, lenA = NULL, A_pairs = NULL, x_S)
S |
A numeric vector of positive integers (or |
j |
A positive integer representing a lag in the |
A |
A vector of unique nonnegative integers (state space) with at least
two elements. |
base |
A data frame with sequences of elements from |
lenA |
An integer |
A_pairs |
A two-column matrix with all unique pairs of elements from |
x_S |
A vector of length |
This function computes the total variation distance between distributions
found in base, which is expected to be the output of the function freqTab().
Therefore, base must follow a specific structure (e.g., column names must
match, and a column named qax_Sj, containing transition distributions, must be
present). For more details on the output structure of freqTab(), refer to its
documentation.
The total-variation distance is computed as
for each pair of symbols b, c in A using the empirical conditional
probabilities obtained from freqTab for the supplied S
and j.
If you provide the state space A, the function calculates:
lenA <- length(A) and A_pairs <- t(utils::combn(A, 2)).
Alternatively, you can input lenA and A_pairs directly and let
A <- NULL, which is useful in loops to improve efficiency.
A single-row matrix with one column per pair of distinct elements from
the state space A (so choose(length(A), 2) columns). Each entry
corresponds to the total variation distance between a pair of distributions,
conditioned on the same fixed past x_S (when S is not NULL),
differing only in the symbol indexed by j, which varies across all
distinct pairs of elements in A. When S is NULL, the row
name is the empty string "".
set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 2, 3), lam0 = 0.1) X <- perfectSample(M, N = 400) ct <- countsTab(X, d = 5) # --- Case 1: S non-empty pbase <- freqTab(S = c(1, 4), j = 2, A = c(1, 2, 3), countsTab = ct) dTV_sample(S = c(1, 2), j = 4, A = c(1, 2, 3), base = pbase, x_S = c(2, 3)) # --- Case 2: S = NULL pbase2 <- freqTab(S = NULL, j = 1, A = c(1, 2, 3), countsTab = ct) dTV_sample(S = NULL, j = 1, A = c(1, 2, 3), base = pbase2)set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 2, 3), lam0 = 0.1) X <- perfectSample(M, N = 400) ct <- countsTab(X, d = 5) # --- Case 1: S non-empty pbase <- freqTab(S = c(1, 4), j = 2, A = c(1, 2, 3), countsTab = ct) dTV_sample(S = c(1, 2), j = 4, A = c(1, 2, 3), base = pbase, x_S = c(2, 3)) # --- Case 2: S = NULL pbase2 <- freqTab(S = NULL, j = 1, A = c(1, 2, 3), countsTab = ct) dTV_sample(S = NULL, j = 1, A = c(1, 2, 3), base = pbase2)
Computes the Maximum Likelihood estimators (MLE) for an MTD Markov chain with
relevant lag set S.
empirical_probs(X, S, matrixform = FALSE, A = NULL, warn = FALSE)empirical_probs(X, S, matrixform = FALSE, A = NULL, warn = FALSE)
X |
A vector or single-column data frame containing a sample of a Markov chain ( |
S |
A numeric vector of unique positive integers. Typically, |
matrixform |
Logical. If |
A |
A numeric vector of distinct integers representing the state space.
If not provided, this function will set |
warn |
Logical. If |
The probabilities are estimated as:
where is the number of times the sequence appeared in the sample
followed by , and is the number of times appeared
(followed by any state). If , the probability is set to
(assuming a uniform distribution over A).
A data frame or a matrix containing estimated transition probabilities:
If matrixform = FALSE, the function returns a data frame with three columns:
The past sequence (a concatenation of past states).
The current state .
The estimated probability .
If matrixform = TRUE, the function returns a stochastic transition matrix,
where rows correspond to past sequences and columns correspond to states in A.
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 15, 30), A = c(1, 2, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Estimate probabilities for different subsets S empirical_probs(X, S = c(1, 30)) empirical_probs(X, S = c(1, 15, 30), matrixform = TRUE)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 15, 30), A = c(1, 2, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Estimate probabilities for different subsets S empirical_probs(X, S = c(1, 30)) empirical_probs(X, S = c(1, 15, 30), matrixform = TRUE)
This function returns a tibble containing the sample sequences, their frequencies and the estimated transition probabilities.
freqTab(S, j = NULL, A, countsTab, complete = TRUE)freqTab(S, j = NULL, A, countsTab, complete = TRUE)
S |
A numeric vector of positive integers or |
j |
An integer or |
A |
A vector with nonnegative integers. Must have at least two different entries.
|
countsTab |
A tibble or a data frame with all sequences of length d+1 that
appear in the sample, and their absolute frequency. This tibble is typically
generated by the function |
complete |
Logical. If |
The parameters S and j determine which columns of countsTab
are retained in the output. Specifying a lag j is optional. All lags can
be specified via S, while leaving j = NULL (default). The output
remains the same as when specifying S and j separately. The
inclusion of j as a parameter improves clarity within the package's
algorithms. Note that j cannot be an element of S.
A tibble where each row represents a sequence of elements from A.
The initial columns display each sequence symbol separated into columns
corresponding to their time indexes. The remaining columns show the sample
frequencies of the sequences and the MLE (Maximum Likelihood Estimator)
of the transition probabilities.
# Reproducible simulated data set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 2, 3), lam0 = 0.1) X <- perfectSample(M, N = 400) ct <- countsTab(X, d = 5) # Example with S non-empty and j specified freqTab(S = c(1, 4), j = 2, A = c(1, 2, 3), countsTab = ct) # Equivalent to calling with S = c(1,2,4) and j = NULL freqTab(S = c(1, 2, 4), j = NULL, A = c(1, 2, 3), countsTab = ct)# Reproducible simulated data set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 2, 3), lam0 = 0.1) X <- perfectSample(M, N = 400) ct <- countsTab(X, d = 5) # Example with S non-empty and j specified freqTab(S = c(1, 4), j = 2, A = c(1, 2, 3), countsTab = ct) # Equivalent to calling with S = c(1,2,4) and j = NULL freqTab(S = c(1, 2, 4), j = NULL, A = c(1, 2, 3), countsTab = ct)
This function estimates the relevant lag set in a Mixture Transition Distribution (MTD) model using one of the available methods. By default, it applies the Forward Stepwise ("FS") method, which is particularly useful in high-dimensional settings.
hdMTD(X, d, method = "FS", ...)hdMTD(X, d, method = "FS", ...)
X |
A vector or single-column data frame containing a chain sample. |
d |
A positive integer representing an upper bound for the chain order. |
method |
A character string indicating the method for estimating the relevant lag set. The available methods are: "FS" (default), "FSC", "CUT", and "BIC". See the Details section and the documentation of the corresponding method functions for more information. |
... |
Additional arguments for the selected method. If not specified, default values will be used (see Details ). |
The available methods are:
"FS" (Forward Stepwise): selects the lags by a criterion that depends on their oscillations.
"CUT": a method that selects the relevant lag set based on a predefined threshold.
"FSC" (Forward Stepwise and Cut): applies the "FS" method followed by the "CUT" method.
"BIC": selects the lag set using the Bayesian Information Criterion.
The function dynamically calls the corresponding method function (e.g., hdMTD_FSC() for
method = "FSC"). Additional parameters specific to each method can be provided via ...,
and default values are used for unspecified parameters.
This function serves as a wrapper for the method-specific functions:
hdMTD_FS(), for method = "FS"
hdMTD_FSC(), for method = "FSC"
hdMTD_CUT(), for method = "CUT"
hdMTD_BIC(), for method = "BIC"
Any additional parameters (...) must match those accepted by the corresponding method function.
If a parameter value is not explicitly provided, a default value is used.
The main default parameters are:
S = seq_len(d): Used in "BIC" or "CUT" methods.
alpha = 0.05, mu = 1. Used in "CUT" or "FSC" methods.
xi = 0.5. Used in "CUT", "FSC" or "BIC" methods.
minl = 1, maxl = length(S), byl = FALSE. Used in "BIC" method.
All default values are specified in the documentation of the method-specific functions.
BIC-specific notes. When method = "BIC", the object may carry
additional BIC diagnostics:
attr(x, "extras")$BIC_out: exactly the object returned by
hdMTD_BIC() (its type depends on BICvalue and byl:
named numeric vector when BICvalue = TRUE; otherwise a character
vector of labels, possibly including "smallest: <set>").
attr(x, "BIC_selected_value"): when BICvalue = TRUE,
the numeric BIC at the selected lag set (shown by summary()).
An integer vector of class "hdMTD" (the selected lag set ),
with attributes:
method, d, call, settings
A: the state space actually used (either provided A, or sort(unique(X)) if A = NULL)
(for method="BIC") extras: a list with BIC_out (exactly the object returned by hdMTD_BIC)
(for method="BIC" and BICvalue = TRUE) BIC_selected_value: numeric BIC at the selected set
The returned object has class "hdMTD" and supports:
print and summary:
methods for lag-selection outputs (see hdMTD-methods).
The selected lag set can be retrieved via S and in the package
convention as negative lags via lags.
hdMTD_FS, hdMTD_FSC, hdMTD_CUT, hdMTD_BIC
for the method-specific implementations, and MTD-accessors for accessors.
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Fit using Forward Stepwise (FS) S_fs <- hdMTD(X = X, d = 5, method = "FS", l = 2) print(S_fs) summary(S_fs) # shows A used if A = NULL in the call # Fit using Bayesian Information Criterion (BIC) hdMTD(X = X, d = 5, method = "BIC", xi = 0.6, minl = 3, maxl = 3)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Fit using Forward Stepwise (FS) S_fs <- hdMTD(X = X, d = 5, method = "FS", l = 2) print(S_fs) summary(S_fs) # shows A used if A = NULL in the call # Fit using Bayesian Information Criterion (BIC) hdMTD(X = X, d = 5, method = "BIC", xi = 0.6, minl = 3, maxl = 3)
A function for estimating the relevant lag set of a Markov chain using
Bayesian Information Criterion (BIC). This means that this method selects the set of lags
that minimizes a penalized log likelihood for a given sample, see References below for
details on the method.
hdMTD_BIC( X, d, S = seq_len(d), minl = 1, maxl = length(S), xi = 1/2, A = NULL, byl = FALSE, BICvalue = FALSE, single_matrix = FALSE, indep_part = TRUE, zeta = maxl, warn = FALSE, ... )hdMTD_BIC( X, d, S = seq_len(d), minl = 1, maxl = length(S), xi = 1/2, A = NULL, byl = FALSE, BICvalue = FALSE, single_matrix = FALSE, indep_part = TRUE, zeta = maxl, warn = FALSE, ... )
X |
A vector or single-column data frame containing a chain sample ( |
d |
A positive integer representing an upper bound for the chain order. |
S |
A numeric vector of positive integers from which this function will select
a set of relevant lags. Typically, |
minl |
A positive integer. |
maxl |
A positive integer equal to or greater than |
xi |
The BIC penalization term constant. Defaulted to 1/2. A smaller |
A |
A vector with positive integers representing the state space. If not informed,
this function will set |
byl |
Logical. If |
BICvalue |
Logical. If |
single_matrix |
Logical. If |
indep_part |
Logical. If |
zeta |
A positive integer representing the number of distinct matrices |
warn |
Logical. If |
... |
Additional arguments (not used in this function, but maintained for compatibility with |
Criterion. For each candidate lag set contained in with
size where minl <= l <= maxl, hdMTD_BIC() evaluates
where and
The empirical conditionals are
computed from the sample counts (same quantities returned by
freqTab and empirical_probs).
Degrees of freedom. The parameter count is the number of free
parameters of an MTD model with lag set and state space , honoring the
constraints single_matrix, indep_part, and zeta:
Here is the number of distinct matrices allowed across lags
(by default ; setting single_matrix = TRUE forces ).
The weight and independent-part contributions are:
if indep_part is TRUE, otherwise ;
if indep_part is TRUE, otherwise .
Scale. With xi = 1/2 (the default), equals one half of the
classical Schwarz BIC ; minimizing either criterion selects
the same lag set.
Note. The likelihood term sums over the effective
transitions, while the penalty uses (this matches the implementation).
Note that the upper bound for the order of the chain (d) affects the estimation
of the transition probabilities. If we run the function with a certain order parameter d,
only the sequences of length d that appeared in the sample will be counted. Therefore,
all transition probabilities, and hence all BIC values, will be calculated with respect to
that d. If we use another value for d to run the function, even if the output
agrees with that of the previous run, its BIC value might change a little.
The parameter zeta indicates the the number of distinct matrices pj in the MTD.
If zeta = 1, all matrices are identical; if zeta = 2 there exists
two groups of distinct matrices and so on. The largest value for zeta is maxl
since this is the largest number of matrices . When minl<maxl,
for each minl l maxl, zeta = min(zeta,l).
If single_matrix = TRUE then zeta is set to 1.
Returns a vector with the estimated relevant lag set using BIC. It might return more
than one set if minl < maxl and byl = TRUE. Additionally, it can return the value
of the penalized likelihood for the outputted lag sets if BICvalue = TRUE.
Imre Csiszár, Paul C. Shields. The consistency of the BIC Markov order estimator. The Annals of Statistics, 28(6), 1601-1619. doi:10.1214/aos/1015957472
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Fit using BIC with a single lag hdMTD_BIC(X, d = 6, minl = 1, maxl = 1) # Fit using BIC with lag selection and extract BIC value hdMTD_BIC(X, d = 3, minl = 1, maxl = 2, BICvalue = TRUE)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Fit using BIC with a single lag hdMTD_BIC(X, d = 6, minl = 1, maxl = 1) # Fit using BIC with lag selection and extract BIC value hdMTD_BIC(X, d = 3, minl = 1, maxl = 2, BICvalue = TRUE)
A function that estimates the set of relevant lags of an MTD model using the CUT method.
hdMTD_CUT( X, d, S = seq_len(d), alpha = 0.05, mu = 1, xi = 0.5, A = NULL, warn = FALSE, ... )hdMTD_CUT( X, d, S = seq_len(d), alpha = 0.05, mu = 1, xi = 0.5, A = NULL, warn = FALSE, ... )
X |
A vector or single-column data frame containing a chain sample ( |
d |
A positive integer representing an upper bound for the chain order. |
S |
A numeric vector of distinct positive integers from which this function will select
a set of relevant lags. Should be a subset of |
alpha |
A positive real number used in the CUT threshold (which determines if two
distributions can be considered different). The larger the |
mu |
A positive real number such that |
xi |
A positive real number, |
A |
A vector with positive integers representing the state space. If not informed,
this function will set |
warn |
Logical. If |
... |
Additional arguments (not used in this function, but maintained for compatibility with |
The "Forward Stepwise and Cut" (FSC) is an algorithm for inference in Mixture Transition Distribution (MTD) models. It consists in the application of the "Forward Stepwise" (FS) step followed by the CUT algorithm. This method and its steps where developed by Ost and Takahashi and are specially useful for inference in high-order MTD Markov chains. This specific function will only apply the CUT step of the algorithm and return an estimated relevant lag set.
Returns a set of relevant lags estimated using the CUT algorithm.
Ost, G. & Takahashi, D. Y. (2023). Sparse Markov models for high-dimensional inference. Journal of Machine Learning Research, 24(279), 1-54. http://jmlr.org/papers/v24/22-0266.html
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Apply CUT with custom alpha, mu, and xi hdMTD_CUT(X, d = 4, alpha = 0.02, mu = 1, xi = 0.4) # Apply CUT with selected lags and smaller alpha hdMTD_CUT(X, d = 6, S = c(1, 4, 6), alpha = 0.08)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) # Apply CUT with custom alpha, mu, and xi hdMTD_CUT(X, d = 4, alpha = 0.02, mu = 1, xi = 0.4) # Apply CUT with selected lags and smaller alpha hdMTD_CUT(X, d = 6, S = c(1, 4, 6), alpha = 0.08)
A function that estimates the set of relevant lags of an MTD model using the FS method.
hdMTD_FS(X, d, l, A = NULL, elbowTest = FALSE, warn = FALSE, ...)hdMTD_FS(X, d, l, A = NULL, elbowTest = FALSE, warn = FALSE, ...)
X |
A vector or single-column data frame containing a chain sample ( |
d |
A positive integer representing an upper bound for the chain order. |
l |
A positive integer specifying the number of lags to be selected as relevant. |
A |
A vector with positive integers representing the state space. If not informed,
this function will set |
elbowTest |
Logical. If TRUE, the function applies an alternative stopping criterion to determine the length of the set of relevant lags. See Details for more information. |
warn |
Logical. If |
... |
Additional arguments (not used in this function, but maintained for compatibility with |
The "Forward Stepwise" (FS) algorithm is the first step of the "Forward Stepwise and Cut"
(FSC) algorithm for inference in Mixture Transition Distribution (MTD) models.
This method was developed by Ost and Takahashi.
This specific function will only apply the FS step of the algorithm and return an estimated
relevant lag set of length l.
This method iteratively selects the most relevant lags based on a certain quantity .
In the first step, the lag in 1:d with the greatest is deemed important.
This lag is included in the output, and using this knowledge, the function proceeds to seek
the next important lag (the one with the highest among the remaining ones).
The process stops when the output vector reaches length l if elbowTest=FALSE.
If elbowTest = TRUE, the function will store these maximum values
at each iteration, and output only the lags that appear before the one with smallest
among them.
A numeric vector containing the estimated relevant lag set using FS algorithm.
Tie-breaking: if multiple lags share the maximum , FS picks the most recent
lag (smallest j). Up to version 0.1.2 ties were broken at random, which could cause
run-to-run differences.
Ost, G. & Takahashi, D. Y. (2023). Sparse Markov models for high-dimensional inference. Journal of Machine Learning Research, 24(279), 1-54. http://jmlr.org/papers/v24/22-0266.html
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Forward Stepwise with l = 2 hdMTD_FS(X, d = 5, l = 2) # Forward Stepwise with l = 3 hdMTD_FS(X, d = 4, l = 3)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Forward Stepwise with l = 2 hdMTD_FS(X, d = 5, l = 2) # Forward Stepwise with l = 3 hdMTD_FS(X, d = 4, l = 3)
A function for inference in MTD Markov chains with FSC method. This function estimates the relevant
lag set of an MTD model through the FSC algorithm.
hdMTD_FSC(X, d, l, alpha = 0.05, mu = 1, xi = 0.5, A = NULL, ...)hdMTD_FSC(X, d, l, alpha = 0.05, mu = 1, xi = 0.5, A = NULL, ...)
X |
A vector or single-column data frame containing a chain sample ( |
d |
A positive integer representing an upper bound for the chain order. |
l |
A positive integer that sets the number of elements in the output vector. |
alpha |
A positive real number used in the CUT threshold (which determines if two
distributions can be considered different). The larger the |
mu |
A positive real number such that |
xi |
A positive real number, |
A |
A vector with positive integers representing the state space. If not informed,
this function will set |
... |
Additional arguments (not used in this function, but maintained for compatibility with |
The "Forward Stepwise and Cut" (FSC) is an algorithm for inference in Mixture Transition Distribution (MTD) models. It consists in the application of the "Forward Stepwise" (FS) step followed by the CUT algorithm. This method and its steps where developed by Ost and Takahashi and are specially useful for inference in high-order MTD Markov chains.
Returns a vector with the estimated relevant lag set using FSC algorithm.
Ost, G. & Takahashi, D. Y. (2023). Sparse Markov models for high-dimensional inference. Journal of Machine Learning Research, 24(279), 1-54. http://jmlr.org/papers/v24/22-0266.html
# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Forward Stepwise and Cut with different parameters hdMTD_FSC(X, d = 4, l = 2) hdMTD_FSC(X, d = 4, l = 3, alpha = 0.1)# Simulate a chain from an MTD model set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(1, 2), lam0 = 0.05) X <- perfectSample(M, N = 400) # Forward Stepwise and Cut with different parameters hdMTD_FSC(X, d = 4, l = 2) hdMTD_FSC(X, d = 4, l = 3, alpha = 0.1)
"hdMTD"
Printing and summarizing methods for lag-selection results returned by
hdMTD.
x |
An object of class |
object |
An object of class |
settings |
Logical ( |
... |
Further arguments passed to or from other methods (ignored). |
An object of class "hdMTD" is an integer vector (selected lags,
elements of ) with attributes:
method: one of "FS", "FSC", "CUT", "BIC".
d: upper bound for the order used in the call.
call: the matched call that produced the object.
settings: a (method-specific) list of the arguments actually used.
A: the state space actually used (provided or inferred).
(optional) BIC_selected_value for method="BIC" with BICvalue=TRUE.
(optional) extras$BIC_out for method="BIC" (exactly the
output of hdMTD_BIC()).
print() shows the method, d, and the selected set of lags in
. summary() prints the call, the estimated lag
set, optional BIC diagnostics and (optionally) the method-specific settings
when settings = TRUE.
print.hdMTDInvisibly returns the "hdMTD" object.
summary.hdMTDInvisibly returns a named list with fields:
call, S, lags, A,
method, d, settings, BIC_selected and
BIC_out. Relevant information is printed to the console in
a readable format.
hdMTD, hdMTD_FS, hdMTD_FSC,
hdMTD_CUT, hdMTD_BIC, S, lags
## Not run: set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) S_hat <- hdMTD(X, d = 5, method = "FS", l = 2) print(S_hat) summary(S_hat) S(S_hat); lags(S_hat) ## End(Not run)## Not run: set.seed(1) M <- MTDmodel(Lambda = c(1, 4), A = c(1, 3), lam0 = 0.05) X <- perfectSample(M, N = 400) S_hat <- hdMTD(X, d = 5, method = "FS", l = 2) print(S_hat) summary(S_hat) S(S_hat); lags(S_hat) ## End(Not run)
"MTD", "MTDest", and/or "hdMTD"
Public accessors that expose object components without relying on the internal
list structure. These accessors are available for "MTD" (model
objects), "MTDest" (EM fits), and/or "hdMTD" (lag selection).
pj(object) p0(object) lambdas(object) lags(object) Lambda(object) S(object) states(object) transitP(object)pj(object) p0(object) lambdas(object) lags(object) Lambda(object) S(object) states(object) transitP(object)
object |
An object of class |
Returned lag sets follow the package convention and are shown as negative
integers via lags() (elements of ). For convenience,
positive-index accessors are also provided:
Lambda() for "MTD" objects and "MTDest" fits, and
S() for "MTDest" and "hdMTD" objects
(elements of ).
The function transitP() returns the global transition matrix, i.e., a
convex combination of the independent distribution p0 and the
lag-specific transition matrices pj, weighted by lambdas.
pj(object)A list of stochastic matrices (one per lag).
p0(object)A numeric probability vector for the independent component.
lambdas(object)A numeric vector of mixture weights that sums to 1.
lags(object)The lag set (elements of ).
Lambda(object)The set of relevant lags as positive integers
(elements of ).
S(object)For "MTDest" and "hdMTD", the set of
candidate/estimated lags as positive integers (elements of ).
states(object)The state space.
transitP(object)The global transition matrix .
Naming conventions reflect the conceptual distinction:
Lambda: The true (known) relevant lag set in "MTD" models
S: An estimated or candidate lag set in "MTDest" and
"hdMTD" objects
Most accessors for "MTD" also work for "MTDest" via inheritance.
When needed (e.g., transitP()), a specific "MTDest" method
is provided.
## Not run: ## For generating an MTD model set.seed(1) m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1)) pj(m); p0(m); lambdas(m); lags(m); Lambda(m); states(m) transitP(m) ## For an EM fit (using coef(m) as init for simplicity): X <- perfectSample(m, N = 800) fit <- MTDest(X, S = c(1, 3), init = coef(m)) pj(fit); p0(fit); lambdas(fit); lags(fit); S(fit); states(fit) transitP(fit) ## For lag selection: S_hat <- hdMTD(X, d = 5, method = "FS", l = 2) S(S_hat); lags(S_hat) ## End(Not run)## Not run: ## For generating an MTD model set.seed(1) m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1)) pj(m); p0(m); lambdas(m); lags(m); Lambda(m); states(m) transitP(m) ## For an EM fit (using coef(m) as init for simplicity): X <- perfectSample(m, N = 800) fit <- MTDest(X, S = c(1, 3), init = coef(m)) pj(fit); p0(fit); lambdas(fit); lags(fit); S(fit); states(fit) transitP(fit) ## For lag selection: S_hat <- hdMTD(X, d = 5, method = "FS", l = 2) S(S_hat); lags(S_hat) ## End(Not run)
"MTD"
Printing, summarizing, and coefficient-extraction methods for Mixture Transition Distribution (MTD) model objects.
x |
An object of class |
object |
An object of class |
X |
A vector or single-column data frame containing an MTD chain sample
(required for |
... |
Further arguments passed to or from other methods (ignored). |
print.MTD() displays a compact summary of the model:
the relevant lag set (shown as negative integers) and the state space.
summary.MTD() computes and prints a detailed summary of the
model, including order, relevant lags, state space, mixture weights,
the independent distribution (if present), and a compact preview of
the global transition matrix .
coef.MTD() extracts parameters as a list with lambdas,
pj, and p0 (works for c("MTDest","MTD") objects by inheritance).
logLik.MTD() computes the log-likelihood of a sample under the
model. Since an object of class "MTD" carries only the model
parameters, a sample X must be supplied. The method honors
constraints such as single_matrix and an independent component
(indep_part), and returns an object of class "logLik"
with appropriate attributes.
print.MTDInvisibly returns the "MTD" object, after
displaying its relevant lag set and state space.
summary.MTDInvisibly returns a named list with fields:
call, order, Lambda, states,
lags, indep, lambdas, pj,
p0 (or NULL), P_dim, and P. The same
information is printed to the console in a readable format.
coef.MTDA list with parameters:
lambdas, pj, and p0.
logLik.MTDAn object of class "logLik" with attributes
nobs (number of transitions) and df (free parameters),
honoring model constraints such as single_matrix and the independent
component (indep_part).
MTDmodel, MTDest for fitted models (note that "MTDest"
objects inherit from "MTD"),
transitP, lambdas, pj,
p0, lags, Lambda, states,
oscillation, perfectSample, probs,
logLik
## Not run: set.seed(1) m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) print(m) # compact display: lags (Z^-) and state space s <- summary(m) str(s) coef(m) # list(lambdas = ..., pj = ..., p0 = ...) transitP(m) # global transition matrix P pj(m); p0(m); lambdas(m); lags(m); Lambda(m); states(m) X <- perfectSample(m, N = 400) logLik(m, X) ## End(Not run)## Not run: set.seed(1) m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) print(m) # compact display: lags (Z^-) and state space s <- summary(m) str(s) coef(m) # list(lambdas = ..., pj = ..., p0 = ...) transitP(m) # global transition matrix P pj(m); p0(m); lambdas(m); lags(m); Lambda(m); states(m) X <- perfectSample(m, N = 400) logLik(m, X) ## End(Not run)
Estimation of MTD parameters through the Expectation Maximization (EM) algorithm.
MTDest( X, S, M = 0.01, init, iter = FALSE, nIter = 100, A = NULL, oscillations = FALSE )MTDest( X, S, M = 0.01, init, iter = FALSE, nIter = 100, A = NULL, oscillations = FALSE )
X |
A vector or single-column data frame containing an MTD chain sample
( |
S |
A numeric vector of distinct positive integers, sorted increasingly.
Typically, |
M |
A stopping point for the EM algorithm. If |
init |
A list with initial parameters: |
iter |
Logical. If |
nIter |
An integer positive number with the maximum number of EM iterations. |
A |
Optional integer vector giving the state space. If omitted, defaults
to |
oscillations |
Logical. If |
Regarding the M parameter: it functions as a stopping
criterion within the EM algorithm. When the difference between
the log-likelihood computed with the newly estimated parameters
and that computed with the previous parameters falls below M,
the algorithm halts. Nevertheless, if the value of nIter
(which represents the maximum number of iterations) is smaller
than the number of iterations required to meet the M criterion,
the algorithm will conclude its execution when nIter is reached.
To ensure that the M criterion is effectively utilized, we
recommend using a higher value for nIter, which is set to a
default of 100.
Concerning the init parameter, it is expected to be a list with 2 or
3 entries. These entries consist of:
an optional vector named p0, representing an independent
distribution (the probability in the first entry of p0 must be
that of the smallest element in A and so on), a required list
of matrices pj, containing a stochastic matrix for each
element of S (the first matrix corresponds to the smallest lag in
S and so on), and a vector named lambdas representing
the weights, the first entry must be the weight for p0, and then one entry
for each element in pj list. If your MTD model does not have an independent
distribution p0, set init$lambdas[1]=0.
An S3 object of class c("MTDest", "MTD") (a list) with at least the following elements:
lambdas: estimated mixture weights (independent part first, if any).
pj: list of estimated transition matrices .
p0: estimated independent distribution (if applicable).
logLik: log-likelihood of the final fitted model.
iterations, deltaLogLik, lastComputedDelta (optional):
returned if iter = TRUE. Here, iterations is the number of
parameter updates performed; deltaLogLik stores the successive
absolute log-likelihood changes for the accepted updates; and
lastComputedDelta is the last computed change (which may be below
M when the loop stops by convergence).
oscillations (optional): returned if oscillations = TRUE.
call: the matched call.
S: the lag set used for estimation.
A: the state space used for estimation.
n: the sample size.
n_eff: the effective sample size used for estimation.
Objects returned by MTDest() have class c("MTDest","MTD") and support:
print and summary: methods for fitted objects
(see MTDest-methods).
coef: extracts fitted parameters lambdas, pj, and p0
(inherited from "MTD"; see MTD-methods).
logLik: returns the final log-likelihood stored in the fit
(see MTDest-methods).
plot: diagnostic plots for fitted objects (see plot.MTDest).
probs, oscillation, and perfectSample: additional utilities
available by inheritance from "MTD".
as.MTD: coerces an "MTDest" fit to an "MTD" model.
Stable access to fitted components is provided by MTD-accessors, including
S (or Lambda), lags, lambdas,
pj, p0, states, and transitP.
Lebre, Sophie and Bourguignon, Pierre-Yves. (2008). An EM algorithm for estimation in the Mixture Transition Distribution model. Journal of Statistical Computation and Simulation, 78(1), 1-15. doi:10.1080/00949650701266666
MTDmodel for constructing an MTD model,
hdMTD for lag selection procedures,
MTDest-methods for methods specific to fitted objects,
MTD-methods for methods inherited from "MTD",
MTD-accessors for stable access to components,
and as.MTD for coercion of fits to model objects.
# Simulating data. set.seed(1) MTD <- MTDmodel(Lambda = c(1, 10), A = c(0, 1), lam0 = 0.01) X <- perfectSample(MTD, N = 400) # Initial Parameters: init <- list( 'p0' = c(0.4, 0.6), 'lambdas' = c(0.05, 0.45, 0.5), 'pj' = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 10), init = init, iter = TRUE) str(fit, max.level = 1) fit$logLik class(fit) fit2 <- MTDest(X, S = c(1, 10), init = init, oscillations = TRUE) fit2$oscillations# Simulating data. set.seed(1) MTD <- MTDmodel(Lambda = c(1, 10), A = c(0, 1), lam0 = 0.01) X <- perfectSample(MTD, N = 400) # Initial Parameters: init <- list( 'p0' = c(0.4, 0.6), 'lambdas' = c(0.05, 0.45, 0.5), 'pj' = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 10), init = init, iter = TRUE) str(fit, max.level = 1) fit$logLik class(fit) fit2 <- MTDest(X, S = c(1, 10), init = init, oscillations = TRUE) fit2$oscillations
"MTDest"
Methods for objects returned by MTDest() – EM fits of Mixture Transition
Distribution models. Note that "MTDest" objects inherit from class "MTD"
(they have class c("MTDest", "MTD")), and several methods for "MTD"
also work on "MTDest" objects by inheritance. The methods documented
here are specific to EM fits, providing diagnostics and summaries of the estimation.
x |
An object of class |
object |
An object of class |
... |
Further arguments passed to or from other methods (ignored). |
These methods handle objects returned by MTDest (class c("MTDest","MTD")):
print.MTDest() displays a compact summary of the fitted model:
the lag set (S), the state space (A), the final
log-likelihood, and, if available, the number of EM updates performed.
summary.MTDest() computes and prints a detailed summary of the
key components of the object, including lambdas, transition matrices,
independent distribution (if present), log-likelihood, and
(if available) oscillations and iteration diagnostics.
logLik.MTDest() returns the log-likelihood as an object of class
"logLik", with attributes df (number of free parameters
under the multimatrix model) and nobs (effective sample size).
print.MTDestInvisibly returns the "MTDest" object, after
displaying its lag set, state space, final log-likelihood, and iteration
count (if available).
summary.MTDestInvisibly returns a named list with fields:
call, S, A, lambdas, pj,
p0 (or NULL),logLik, oscillations,
iterations, lastComputedDelta and deltaLogLik.
The same information is printed to the console in a readable format.
logLik.MTDest An object of class "logLik" with attributes
df (number of free parameters) and nobs (effective sample size).
MTDest, as.MTD, MTD-methods,
oscillation, perfectSample, probs
## Not run: set.seed(1) MTD <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.01) X <- perfectSample(MTD, N = 200) # small N to keep examples fast init <- list( p0 = c(0.4, 0.6), lambdas = c(0.05, 0.45, 0.5), pj = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 3), init = init, iter = TRUE) print(fit) summary(fit) coef(fit) # Works by inheritance logLik(fit) BIC(fit) ## End(Not run)## Not run: set.seed(1) MTD <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.01) X <- perfectSample(MTD, N = 200) # small N to keep examples fast init <- list( p0 = c(0.4, 0.6), lambdas = c(0.05, 0.45, 0.5), pj = list( matrix(c(0.2, 0.8, 0.45, 0.55), byrow = TRUE, ncol = 2), matrix(c(0.25, 0.75, 0.3, 0.7), byrow = TRUE, ncol = 2) ) ) fit <- MTDest(X, S = c(1, 3), init = init, iter = TRUE) print(fit) summary(fit) coef(fit) # Works by inheritance logLik(fit) BIC(fit) ## End(Not run)
Generates an MTD model as an object of class MTD given a set of parameters.
MTDmodel( Lambda, A, lam0 = NULL, lamj = NULL, pj = NULL, p0 = NULL, single_matrix = FALSE, indep_part = TRUE )MTDmodel( Lambda, A, lam0 = NULL, lamj = NULL, pj = NULL, p0 = NULL, single_matrix = FALSE, indep_part = TRUE )
Lambda |
A numeric vector of positive integers representing the relevant lag set. The elements will be sorted from smallest to greatest. The smallest number represents the latest (most recent) time in the past, and the largest number represents the earliest time in the past. |
A |
A vector with nonnegative integers representing the state space. |
lam0 |
A numeric value in |
lamj |
A numeric vector of weights for the transition probability matrices in |
pj |
A list with |
p0 |
A probability vector for the independent component of the MTD model. If |
single_matrix |
Logical. If |
indep_part |
Logical. If |
A list of class MTD containing:
PThe transition probability matrix of the MTD model.
lambdasA vector with MTD weights (lam0 and lamj).
pjA list of stochastic matrices defining conditional transition probabilities.
p0The independent probability distribution.
LambdaThe vector of relevant lags.
AThe state space.
single_matrixA logical argument, if TRUE indicates that all matrices in pj are identical.
Objects returned by MTDmodel() have class "MTD" and support:
print: compact display of the relevant lag set and
the state space (see MTD-methods).
summary: detailed summary of the model components
(see MTD-methods).
coef: extracts the model parameters lambdas,
pj, and p0 (see MTD-methods).
logLik: logLik(object, X) computes the
log-likelihood, provided that the user supplies a sample X (see MTD-methods).
plot: diagnostic plots, including oscillations by lag, mixture weights,
and transition graphs (see plot.MTD).
oscillation: oscillations by lag computed from the model parameters.
perfectSample: perfect sampling from the stationary distribution, provided that
.
probs: one-step predictive probabilities.
Stable access to model components is provided by MTD-accessors:
lags, Lambda, lambdas, pj,
p0, states, and transitP.
MTDest for EM-based parameter estimation,
hdMTD for lag selection procedures,
MTD-methods for methods applicable to "MTD" objects,
MTD-accessors for stable access to model components,
oscillation, perfectSample, and probs
for additional inference and simulation utilities.
summary(MTDmodel(Lambda = c(1, 3), A = c(4, 8, 12))) MM <- MTDmodel(Lambda = c(2, 4, 9), A = c(0, 1), lam0 = 0.05, lamj = c(0.35, 0.2, 0.4), pj = list(matrix(c(0.5, 0.7, 0.5, 0.3), ncol = 2)), p0 = c(0.2, 0.8), single_matrix = TRUE) transitP(MM); pj(MM); oscillation(MM) MM <- MTDmodel(Lambda = c(2, 4, 9), A = c(0, 1), lam0 = 0.05, pj = list(matrix(c(0.5, 0.7, 0.5, 0.3), ncol = 2)), single_matrix = TRUE, indep_part = FALSE) p0(MM); lambdas(MM)summary(MTDmodel(Lambda = c(1, 3), A = c(4, 8, 12))) MM <- MTDmodel(Lambda = c(2, 4, 9), A = c(0, 1), lam0 = 0.05, lamj = c(0.35, 0.2, 0.4), pj = list(matrix(c(0.5, 0.7, 0.5, 0.3), ncol = 2)), p0 = c(0.2, 0.8), single_matrix = TRUE) transitP(MM); pj(MM); oscillation(MM) MM <- MTDmodel(Lambda = c(2, 4, 9), A = c(0, 1), lam0 = 0.05, pj = list(matrix(c(0.5, 0.7, 0.5, 0.3), ncol = 2)), single_matrix = TRUE, indep_part = FALSE) p0(MM); lambdas(MM)
Calculates the oscillations of an MTD model object, of an MTDest fit, or estimates the oscillations of a chain sample.
oscillation(x, ...) ## S3 method for class 'MTD' oscillation(x, ...) ## Default S3 method: oscillation(x, S, A = NULL, ...)oscillation(x, ...) ## S3 method for class 'MTD' oscillation(x, ...) ## Default S3 method: oscillation(x, S, A = NULL, ...)
x |
Either an MTD model object, an MTDest fit, or a chain sample. |
... |
Ignored. |
S |
A numeric vector of distinct positive integers. Represents a set of lags. |
A |
Optional integer vector giving the state space. If omitted, defaults to |
For an MTD model or an MTDest fit, the oscillation for lag
(i.e., for MTD or
for MTDest) is the product of the mixture weight and the
maximum of the total variation distance between the distributions in a
stochastic matrix . That is,
When x is an object of class MTD (or MTDest), the quantities
(or ), , , and are directly available.
In this case, the oscillations can be computed exactly from the model parameters.
Alternatively, when estimating oscillations from data, x must be a
realization of a chain and S must be provided as a candidate lag set.
The transition probabilities are then estimated from the sample before
computing the corresponding oscillations.
Let symbolize an estimated distribution
in given a certain past ( which is a sequence of
elements of where each element occurred at a lag in S),
and an estimated distribution given past
and that the symbol occurred at lag . If
is the sample size, max(S) and is the number of
times the sequence appeared in the sample, then
is the estimated oscillation for a lag S.
Note that is the space of sequences of
indexed by S.
A named numeric vector of oscillations. If the x parameter is
an MTD or MTDest object, it will provide the oscillations for each element in
Lambda (or S for MTDest). If x is a chain sample, it
estimates the oscillations for a user-inputted set of lags S.
oscillation(MTD): For an MTD (or MTDest) object: computes
for all (or for all ).
oscillation(default): For a chain sample: estimates for each in S.
oscillation(MTDmodel(Lambda = c(1, 4), A = c(2, 3))) oscillation(MTDmodel(Lambda = c(1, 4), A = c(2, 3), lam0 = 0.01, lamj = c(0.49, 0.5), pj = list(matrix(c(0.1, 0.9, 0.9, 0.1), ncol = 2)), single_matrix = TRUE))oscillation(MTDmodel(Lambda = c(1, 4), A = c(2, 3))) oscillation(MTDmodel(Lambda = c(1, 4), A = c(2, 3), lam0 = 0.01, lamj = c(0.49, 0.5), pj = list(matrix(c(0.1, 0.9, 0.9, 0.1), ncol = 2)), single_matrix = TRUE))
Samples an MTD Markov Chain from the stationary distribution.
perfectSample(object, N, ...)perfectSample(object, N, ...)
object |
An object of class "MTD" or "MTDest". |
N |
Positive integer. Sample size to generate. Must be > max(Lambda(object)). |
... |
Additional arguments passed to methods. |
This perfect sample algorithm requires that the MTD model has
an independent distribution (p0) with a positive weight (i.e., lambdas(object)["lam0"]>0
which means ).
Returns a size N sample from an MTD model (the first element is the most recent).
M <- MTDmodel(Lambda = c(1, 3, 4), A = c(0, 2)) perfectSample(M, N = 200) M <- MTDmodel(Lambda = c(2, 5), A = c(1, 2, 3)) perfectSample(M, N = 300)M <- MTDmodel(Lambda = c(1, 3, 4), A = c(0, 2)) perfectSample(M, N = 200) M <- MTDmodel(Lambda = c(2, 5), A = c(1, 2, 3)) perfectSample(M, N = 300)
Produces plots for an MTD object. By default, it shows the following
sequence of plots: (i) barplot of oscillations by relevant lag,
(ii) barplot of mixture weights (including lam0 if > 0)
and (iii) graphs of pj, one graph for each lag in Lambda. When
type is specified, only the requested plot is drawn.
## S3 method for class 'MTD' plot(x, type, main, ylim, col = "gray70", border = NA, pj_index = 1, ...)## S3 method for class 'MTD' plot(x, type, main, ylim, col = "gray70", border = NA, pj_index = 1, ...)
x |
An object of class |
type |
If |
main |
Optional main title. When |
ylim |
Optional y-axis limits. When |
col |
Bar fill color (passed to |
border |
Bar border color (passed to |
pj_index |
Integer index of |
... |
Further graphical parameters: passed to |
For type = "oscillation", the function calls oscillation(x)
to obtain
for each lag in Lambda(x), and draws a bar plot named by the lags.
For type = "lambdas", it plots the mixture weights by lag.
If lam0 > 0, the weight for the independent component is included and
labeled "0".
For type = "pj", the function draws the directed, weighted graph of a
transition matrix pj taken from pj(x). Vertices correspond to the
states states(x). A directed edge a -> b carries weight
p_j(b | a). Edge widths and edge labels are proportional to the transition
probabilities (labels shown in the plot are rounded to two decimals).
By default, self-loops (a -> a) are not drawn. The self-loop probability
at a state a can be inferred as
For type = "pj", a specific matrix can be selected via pj_index
(e.g., pj_index = 2 plots pj(x)[[2]]). In automatic mode (when
type is missing), graphs for all pj matrices are shown
sequentially.
If type is "oscillation" or "lambdas", invisibly returns the numeric
vector that was plotted. If type = "pj", invisibly returns the selected transition
matrix. If type is missing, invisibly returns a list with components
oscillation and lambdas.
## Not run: m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1)) ## Automatic mode (press Enter between plots) plot(m) ## Single plot: plot(m, type = "oscillation") plot(m, type = "lambdas") plot(m, type = "pj", pj_index = 2) ## End(Not run)## Not run: m <- MTDmodel(Lambda = c(1, 3), A = c(0, 1)) ## Automatic mode (press Enter between plots) plot(m) ## Single plot: plot(m, type = "oscillation") plot(m, type = "lambdas") plot(m, type = "pj", pj_index = 2) ## End(Not run)
Produces plots for an MTDest object. By default, it shows in sequence:
(i) barplot of oscillations by relevant lag, (ii) barplot of mixture weights
(including lam0 if > 0), and (iii) graphs of
pj (one graph for each lag in Lambda). If EM diagnostics are
available, a convergence panel (log-likelihood variation per update) is shown
last. When type is specified, only the requested plot is drawn.
## S3 method for class 'MTDest' plot(x, type, main, ylim, col = "gray70", border = NA, pj_index = 1, ...)## S3 method for class 'MTDest' plot(x, type, main, ylim, col = "gray70", border = NA, pj_index = 1, ...)
x |
An object of class |
type |
If |
main |
Optional main title. When |
ylim |
Optional y-axis limits. When |
col |
Bar fill color (passed to |
border |
Bar border color (for bar plots). Defaults to |
pj_index |
Integer index of |
... |
Further graphical parameters: passed to |
For type = "oscillation", the function delegates to plot.MTD(),
which returns
for each lag in lags(x), and draws a bar plot named by the lags.
For type = "lambdas", the function delegates to plot.MTD(),
it plots the mixture weights by lag. If lam0 > 0, the
weight for the independent component is included and labeled "0".
For type = "pj", the function delegates to plot.MTD(),
it draws the directed, weighted graph of the transition matrices pj
taken from pj(x). Vertices correspond to the states states(x).
A directed edge a -> b carries weight p_j(b | a). Edge widths
and edge labels are proportional to the transition probabilities (labels
shown in the plot are rounded to two decimals). By default,
self-loops (a -> a) are not drawn. The self-loop probability
at a state a can be inferred as
For type = "pj", a specific matrix can be selected via pj_index
(e.g., pj_index = 2 plots pj(x)[[2]]). In automatic mode (when
type is missing), graphs for all pj matrices are shown
sequentially.
If EM iteration diagnostics are available (i.e., the object was fitted with
iter = TRUE and length(deltaLogLik) > 0), a convergence panel
showing the log-likelihood variation per update is displayed automatically
when type is missing. You can also request it explicitly with
type = "convergence".
If type is "oscillation" or "lambdas", invisibly returns
the numeric vector that was plotted. If type = "pj", invisibly returns
the selected transition matrix pj(x)[[pj_index]]. If
type = "convergence", invisibly returns the numeric vector
deltaLogLik. If type is missing, invisibly returns the list
produced by plot.MTD(m) (typically with components oscillation
and lambdas); if deltaLogLik is available, the returned list also
includes deltaLogLik.
## Not run: set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) X <- perfectSample(M, N = 300) fit <- MTDest(X, S = c(1, 3), init = coef(M), iter = TRUE) plot(fit) plot(fit, type = "pj", pj_index = 2) plot(fit, type = "convergence") ## End(Not run)## Not run: set.seed(1) M <- MTDmodel(Lambda = c(1, 3), A = c(0, 1), lam0 = 0.05) X <- perfectSample(M, N = 300) fit <- MTDest(X, S = c(1, 3), init = coef(M), iter = TRUE) plot(fit) plot(fit, type = "pj", pj_index = 2) plot(fit, type = "convergence") ## End(Not run)
Compute one-step-ahead predictive probabilities under an MTD model or an MTDest fit. For objects of class "MTDest", the method is inherited from the "MTD" class via S3 inheritance.
Conventions:
Samples are read most recent first: x[1] = X_{t-1}, x[2] = X_{t-2}, etc.
The global transition matrix P (or transitP) is indexed by row labels that
list the past context from oldest to newest. A cell at row "s_k...s_1" and
column "a" is read as p(a | s_1...s_k).
If both newdata and context are missing, probs() returns the full
global transition matrix (transitP(object)).
probs(object, context = NULL, newdata = NULL, oldLeft = FALSE)probs(object, context = NULL, newdata = NULL, oldLeft = FALSE)
object |
An |
context |
Optional vector or matrix/data.frame of contexts (rows). By default,
each row follows the "most recent first" convention; set |
newdata |
Optional vector or matrix/data.frame of samples (rows). Columns follow
the "most recent first" convention. Must have at least |
oldLeft |
Logical. If |
All entries of newdata/context must belong to the model's state space
states(object).
A numeric matrix of predictive probabilities with one row per input
context and columns indexed by states(object). Row names are the context
labels (oldest to newest) formed by concatenating state symbols without a
separator. If both newdata and context are missing, the full global
transition matrix is returned, which can be large for big state spaces or many lags.
transitP, states, Lambda,
empirical_probs, MTDmodel, MTDest
set.seed(1) m <- MTDmodel(Lambda = c(1,3), A = c(0,1), lam0 = 0.1) # Full matrix P <- probs(m) # Using a sample row (most recent first): newdata has >= max(Lambda) columns new_ctx <- c(1, 0, 1, 0) # X_{t-1}=1, X_{t-2}=0, X_{t-3}=1, ... probs(m, newdata = new_ctx) # one row of probabilities # Explicit contexts (exactly |Lambda| symbols per row) probs(m, context = c(0, 1), oldLeft = FALSE) # most recent first probs(m, context = c(0, 1), oldLeft = TRUE) # oldest to newest # Multiple contexts (rows) ctxs <- rbind(c(1,0,1), c(0,1,1), c(1,1,0)) probs(m, newdata = ctxs)set.seed(1) m <- MTDmodel(Lambda = c(1,3), A = c(0,1), lam0 = 0.1) # Full matrix P <- probs(m) # Using a sample row (most recent first): newdata has >= max(Lambda) columns new_ctx <- c(1, 0, 1, 0) # X_{t-1}=1, X_{t-2}=0, X_{t-3}=1, ... probs(m, newdata = new_ctx) # one row of probabilities # Explicit contexts (exactly |Lambda| symbols per row) probs(m, context = c(0, 1), oldLeft = FALSE) # most recent first probs(m, context = c(0, 1), oldLeft = TRUE) # oldest to newest # Multiple contexts (rows) ctxs <- rbind(c(1,0,1), c(0,1,1), c(1,1,0)) probs(m, newdata = ctxs)
A data frame with the maximum temperature of the last hour, by each hour, in the city of Brasília, Brazil. The data spans from 01/01/2003 to 31/08/2024.
tempdatatempdata
A data frame with 189936 rows and 3 columns. Each row corresponds to a time in a day specified in columns 2 ("TIME") and 1 ("DATE") respectively. The value in column 3 ("MAXTEMP") is the maximum temperature measured in the last hour, in Celsius (Cº), in the city of Brasília, the capital of Brazil, located in the central-western part of the country.
The day, from 01/01/2003 to 31/08/2024
The time, form 00:00 to 23:00 each day
The maximum temperature measured in the last hour in Celsius
Meteorological data provided by INMET (National Institute of Meteorology, Brazil). Data collected from automatic weather station in Brasília (latitude: -15.79°, longitude: -47.93°, altitude: 1159.54 m). Available at: https://bdmep.inmet.gov.br/
data(tempdata)data(tempdata)