| Title: | Automated Routing Engine for Longitudinal Missing Data |
| Version: | 0.2.0 |
| Description: | Provides an automated routing engine for longitudinal missing data. It uses a Lagrange-constrained Random Forest, routed on sample size, missingness rate, and skew, to preserve structural variance. |
| License: | MIT + file LICENSE |
| SystemRequirements: | C++17 |
| Encoding: | UTF-8 |
| VignetteBuilder: | knitr |
| Imports: | Rcpp (≥ 1.0.0) |
| Suggests: | missForest, missRanger, ranger, mice, lavaan, ggplot2, tidyr, dplyr, knitr, rmarkdown, MASS |
| LinkingTo: | Rcpp, RcppArmadillo |
| Config/roxygen2/version: | 8.0.0 |
| NeedsCompilation: | yes |
| Packaged: | 2026-06-05 20:38:16 UTC; xguo |
| Author: | Xiyuan Guo [aut, cre] |
| Maintainer: | Xiyuan Guo <tommyguo039@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-06-05 21:00:02 UTC |
FIML-Smriti Refinement Wrapper
Description
This function serves as an agnostic refinement layer, combining
Full Information Maximum Likelihood (FIML) with the
structural covariance preservation of smriti_impute().
Usage
smriti_fiml(data, model, initial_imputation = NULL, lambda = 1)
Arguments
data |
A data frame containing missing values. |
model |
A lavaan model syntax string. |
initial_imputation |
A data frame or matrix of the same dimensions as the
subset of |
lambda |
A numeric value specifying the per-observation penalty weight for covariance matching. Defaults to 1.0. |
Value
A data frame with FIML-consistent, covariance-projected imputed values.
missForest-Smriti Refinement Wrapper
Description
This function serves as an agnostic refinement layer, combining
the predictive power of Random Forests via missForest with the
structural covariance preservation of smriti_impute().
Usage
smriti_forest(data, time_cols, robust = TRUE, ...)
Arguments
data |
A data frame containing missing values. |
time_cols |
A character vector or numeric vector specifying the longitudinal columns. |
robust |
A logical value. If TRUE (default), uses robust covariance estimation. |
... |
Additional arguments passed directly to |
Value
A data frame with covariance-projected imputed values.
Examples
## Not run:
df <- data.frame(T1 = c(1, NA, 3, 4), T2 = c(NA, 2, 3, 4))
smriti_forest(df, time_cols = 1:2)
## End(Not run)
Smriti Automated Longitudinal Imputation
Description
This function performs automated routing and refinement for longitudinal missing data. It establishes a target covariance manifold from the observed data, then projects an initial imputation back toward that manifold using a Lagrangian constraint. Only originally missing values are updated; observed data are held fixed.
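The projection idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name project_to_target(), the toy data, the mean-imputation starting point, and the use of the ordinary (normalised) sample covariance are all assumptions made for illustration. The sketch runs gradient descent on the squared Frobenius distance between the current covariance and a target, and moves only the cells that were originally missing.

```r
# Hypothetical helper, not the package's implementation: move only the
# originally-missing cells so that cov(x) approaches a target matrix.
project_to_target <- function(x_init, miss_mask, target,
                              learning_rate = 0.01, tol = 1e-6,
                              max_iter = 2000) {
  x <- as.matrix(x_init)
  n <- nrow(x)
  for (iter in seq_len(max_iter)) {
    gap <- cov(x) - target                        # covariance mismatch
    if (norm(gap, type = "F") < tol) break        # Frobenius-norm stopping rule
    xc <- scale(x, center = TRUE, scale = FALSE)  # column-centred copy
    grad <- (4 / (n - 1)) * (xc %*% gap)          # gradient of ||cov(x) - target||_F^2
    # Observed cells are never touched; only the masked cells take a step.
    x[miss_mask] <- x[miss_mask] - learning_rate * grad[miss_mask]
  }
  x
}

# Toy data (assumed for illustration): two correlated waves, 10 values missing.
set.seed(1)
n <- 50
t1 <- rnorm(n)
t2 <- 0.8 * t1 + rnorm(n, sd = 0.6)
dat <- cbind(T1 = t1, T2 = t2)
dat[1:10, "T2"] <- NA

miss_mask <- is.na(dat)
target <- cov(dat, use = "pairwise.complete.obs")     # target from observed pairs
x_init <- dat
x_init[miss_mask] <- mean(dat[, "T2"], na.rm = TRUE)  # crude mean imputation

refined <- project_to_target(x_init, miss_mask, target)
gap_before <- norm(cov(x_init) - target, type = "F")
gap_after  <- norm(cov(refined) - target, type = "F")
```

Comparing gap_before with gap_after shows how far the refinement has pulled the completed data toward the target; mean imputation shrinks the covariance, so the starting gap is non-zero by construction.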
Usage
smriti_impute(
data,
time_cols,
initial_imputation = NULL,
lambda = 1,
learning_rate = 0.001,
tol = 1e-06,
max_iter = 2000,
robust = FALSE,
custom_target = NULL
)
Arguments
data |
A data frame containing missing values. |
time_cols |
A character vector or numeric vector specifying the longitudinal columns. |
initial_imputation |
A matrix or data frame of the same dimensions as
|
lambda |
A numeric value or "auto" specifying the per-observation penalty weight on the covariance-constraint term. If "auto", a simple heuristic is used based on the Frobenius norm between initial and target covariance. The covariance gradient is deliberately un-normalised (no division by n-1) so the constraint remains effective at any sample size. Defaults to 1.0 (tuned via simulation for general use). Use lower values (e.g. 0.01) for clean Normal data; increase for stronger constraint enforcement. |
learning_rate |
A numeric value for the gradient descent step size. Defaults to 0.001. |
tol |
A numeric value for the internal convergence tolerance (Frobenius norm). Defaults to 1e-6. |
max_iter |
An integer specifying the maximum number of iterations for the gradient descent projection. Defaults to 2000. |
robust |
A logical value. Setting it to TRUE uses a robust target constructed from pairwise Spearman correlations and column-wise MAD, projected to the nearest positive-semidefinite matrix. This protects against outliers and heavy-tailed noise. Defaults to FALSE (pairwise Pearson covariance). Use TRUE when the data contains explicit outlier contamination (e.g. sensor artifacts). |
custom_target |
An optional p x p covariance matrix (where p is the
number of longitudinal columns) to be used as the structural manifold.
If provided, |
Value
A data frame with imputed and structurally refined values. Only the originally-missing cells are modified; observed values are returned unchanged.
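The robust target described under the robust argument (pairwise Spearman correlations, column-wise MAD, projection to a positive-semidefinite matrix) can be sketched in base R as follows. The helper name robust_target() and the toy data are assumptions, and eigenvalue clipping is used here as one standard way to obtain the nearest positive-semidefinite matrix in the Frobenius norm; the package may build its target differently.

```r
# Hypothetical helper: robust covariance target from pairwise Spearman
# correlations and column-wise MAD, clipped to be positive semidefinite.
robust_target <- function(x) {
  x <- as.matrix(x)
  r <- cor(x, method = "spearman", use = "pairwise.complete.obs")
  s <- apply(x, 2, mad, na.rm = TRUE)        # robust scale per column
  d <- diag(s, nrow = length(s))
  sigma <- d %*% r %*% d                     # rescale correlations to covariances
  eig <- eigen((sigma + t(sigma)) / 2, symmetric = TRUE)
  vals <- pmax(eig$values, 0)                # clip negative eigenvalues to zero
  psd <- eig$vectors %*% diag(vals, nrow = length(vals)) %*% t(eig$vectors)
  dimnames(psd) <- dimnames(r)
  psd
}

# Toy data (assumed): three waves, one gross outlier, some missing values.
set.seed(42)
n <- 100
t1 <- rnorm(n)
t2 <- 0.7 * t1 + rnorm(n, sd = 0.5)
t3 <- 0.7 * t2 + rnorm(n, sd = 0.5)
x <- cbind(T1 = t1, T2 = t2, T3 = t3)
x[1, "T3"] <- 50                             # e.g. a sensor artifact
x[sample(n, 10), "T2"] <- NA

target <- robust_target(x)
classical_var_t3 <- var(x[, "T3"])           # inflated by the outlier
robust_var_t3 <- target["T3", "T3"]          # largely unaffected by it
```

A matrix built this way could in principle be supplied through custom_target, since that argument accepts a p x p covariance matrix.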
Examples
# Simulated longitudinal data with scattered missingness
df <- data.frame(
T1 = c(1.2, NA, 2.8, 3.1),
T2 = c(2.1, 2.5, NA, 4.0),
T3 = c(3.0, 3.3, 4.1, NA)
)
smriti_impute(df, time_cols = 1:3, robust = FALSE)
Smriti Multiple Imputation Wrapper
Description
This function generates multiple imputed datasets to capture the uncertainty inherent in missing data imputation. It uses bootstrapping to create distinct datasets, applies the Smriti Lagrangian refinement to each, and returns a list of completed datasets suitable for pooling.
Usage
smriti_mi(data, time_cols, m = 5, initial_imputation = NULL, ...)
Arguments
data |
A data frame containing missing values. |
time_cols |
A character vector or numeric vector specifying the longitudinal columns. |
m |
An integer specifying the number of imputations. Defaults to 5. |
initial_imputation |
A matrix or data frame of the same dimensions as
|
... |
Additional arguments passed to |
Details
Approximate (bootstrap) multiple imputation. Proper multiple imputation
(Rubin 1987) draws imputations from the Bayesian posterior predictive
distribution of the missing data given the observed data and the imputation
model. This function instead bootstraps the rows and then deterministically
applies smriti_impute() to each bootstrap sample. The between-imputation
variance therefore captures sampling variability but not the full
imputation-model uncertainty. The resulting standard errors may be
moderately anti-conservative. Users needing valid Rubin's-rules pooling
should treat the output as approximate MI and consider adding a stochastic
residual-draw step, or use a fully Bayesian engine such as mice.
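The bootstrap-then-impute scheme described above can be sketched in base R. The helper names bootstrap_mi() and mean_impute() and the toy data are hypothetical, and column-mean imputation is only a deterministic stand-in for smriti_impute(); the point is that all variation between the returned datasets comes from the row resampling.

```r
# Hypothetical sketch of bootstrap multiple imputation with a
# deterministic imputer applied to each resampled dataset.
bootstrap_mi <- function(data, m = 5, impute_fun) {
  lapply(seq_len(m), function(i) {
    idx <- sample(nrow(data), replace = TRUE)          # resample rows
    completed <- impute_fun(data[idx, , drop = FALSE]) # deterministic fill-in
    attr(completed, "imputation") <- i                 # record the index
    completed
  })
}

# Stand-in imputer (assumption): replace each NA with its column mean.
mean_impute <- function(d) {
  d[] <- lapply(d, function(col) {
    col[is.na(col)] <- mean(col, na.rm = TRUE)
    col
  })
  d
}

# Toy data (assumed for illustration).
set.seed(7)
df <- data.frame(T1 = rnorm(30), T2 = rnorm(30), T3 = rnorm(30))
df$T1[1:5] <- NA
df$T2[6:10] <- NA

mi_sketch <- bootstrap_mi(df, m = 5, impute_fun = mean_impute)
```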
Pooling the returned list can be performed manually via Rubin's rules:
mi_list <- smriti_mi(df, time_cols = 2:5, m = 10)
# Within-imputation estimates
estimates <- lapply(mi_list, function(d) coef(lm(y ~ x, data = d)))
# Pool with Rubin's rules (point estimate = mean, SE = sqrt(W + (1+1/m)*B))
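The pooling formula quoted above can be written out for a single scalar parameter as a small base R helper. The name pool_rubin() is hypothetical and the input numbers are made up purely to show the arithmetic; W is the mean of the per-imputation squared standard errors and B is the sample variance of the per-imputation point estimates.

```r
# Hypothetical helper: pool one scalar estimate across m imputations.
pool_rubin <- function(estimates, variances) {
  m <- length(estimates)
  q_bar <- mean(estimates)         # pooled point estimate
  w <- mean(variances)             # within-imputation variance W
  b <- var(estimates)              # between-imputation variance B
  total <- w + (1 + 1 / m) * b     # total variance
  c(estimate = q_bar, se = sqrt(total))
}

# Illustrative inputs only: five point estimates and their squared SEs.
est <- c(1.0, 1.2, 0.8, 1.1, 0.9)
se2 <- rep(0.04, 5)
pooled <- pool_rubin(est, se2)
# Here W = 0.04 and B = 0.025, so the total variance is 0.04 + 1.2 * 0.025 = 0.07.
```

For pooling with degrees-of-freedom corrections, an established implementation such as the pooling routine in mice is likely a better choice than a hand-rolled helper.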
Value
A list of m completed data frames, each with an "imputation"
attribute giving its index (1..m). The list has class "smriti_mi_list".
See Also
smriti_impute() for the single-imputation engine.
Examples
## Not run:
df <- data.frame(
T1 = c(1, 2, NA, 4),
T2 = c(NA, 2, 3, 4),
T3 = c(1, NA, 3, 4)
)
mi_list <- smriti_mi(df, time_cols = c("T1", "T2", "T3"), m = 5)
## End(Not run)
mice-Smriti Refinement Wrapper
Description
This function serves as an agnostic refinement layer, combining
multivariate imputation by chained equations (MICE) with the
structural covariance preservation of smriti_impute().
Usage
smriti_mice(data, time_cols, robust = TRUE, ...)
Arguments
data |
A data frame containing missing values. |
time_cols |
A character vector or numeric vector specifying the longitudinal columns. |
robust |
A logical value. If TRUE (default), uses robust covariance estimation. |
... |
Additional arguments passed directly to |
Value
A data frame with covariance-projected imputed values.
Examples
## Not run:
df <- data.frame(T1 = c(1, NA, 3, 4), T2 = c(NA, 2, 3, 4))
smriti_mice(df, time_cols = 1:2)
## End(Not run)
missRanger-Smriti Refinement Wrapper
Description
This function serves as an agnostic refinement layer, combining
the fast predictive imputation of missRanger with the structural
covariance preservation of smriti_impute().
Usage
smriti_ranger(data, time_cols, robust = TRUE, ...)
Arguments
data |
A data frame containing missing values. |
time_cols |
A character vector or numeric vector specifying the longitudinal columns. |
robust |
A logical value. If TRUE (default), uses robust covariance estimation. |
... |
Additional arguments passed directly to |
Value
A data frame with covariance-projected imputed values.
Examples
## Not run:
df <- data.frame(T1 = c(1, NA, 3, 4), T2 = c(NA, 2, 3, 4))
smriti_ranger(df, time_cols = 1:2)
## End(Not run)