Title: | Diagnostic Tools and Unit Tests for Statistical Estimators |
---|---|
Description: | Extension of the 'testthat' package for unit tests on empirical distributions of estimators, together with functions for diagnosing their finite-sample performance. |
Authors: | Dmitry Otryakhin [aut, cre] |
Maintainer: | Dmitry Otryakhin <[email protected]> |
License: | GPL-3 |
Version: | 0.0.3 |
Built: | 2024-11-18 03:54:03 UTC |
Source: | https://gitlab.com/dmitry_otryakhin/diagnostics-and-tests-for-statistical-estimators |
For every sample size value the function creates a sample and evaluates the estimators Nmc times.
Estim_diagnost(Nmc, s, Inference, packages = NULL)
Nmc | number of repetitions |
s | numeric vector of sample sizes |
Inference | function of s creating a sample and evaluating estimators (see details) |
packages | list of packages to pass to foreach loop |
data frame with estimators' values
Nmc <- 400
s <- c(1e2, 1e3)

Inference <- function(s) {
  rrr <- rnorm(n = s)
  list(Mn = mean(rrr), Sd = sd(rrr))
}

data <- Estim_diagnost(Nmc, s, Inference)
estims_qqplot(data)
estims_boxplot(data)

# Inference working on a non-finite value (2/0 is Inf)
Inference <- function(s) {
  rrr <- 2/0
  list(Mn = mean(rrr), Sd = sd(rrr))
}
head(Estim_diagnost(Nmc, s, Inference))

# Inference whose sample is coerced to character by the assignment
Inference <- function(s) {
  rrr <- rnorm(n = s)
  rrr[2] <- "dwq"
  list(Mn = mean(rrr), Sd = sd(rrr))
}
head(Estim_diagnost(Nmc, s, Inference))
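The repetition scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `run_monte_carlo` is a hypothetical name, it uses a plain `lapply` loop where the package refers to a foreach loop, and the column layout of the result (`s` followed by one column per estimator) is assumed.

```r
# Hypothetical stand-in for the Monte Carlo loop; not the package's code.
run_monte_carlo <- function(Nmc, s, Inference) {
  rows <- lapply(s, function(size) {
    # Evaluate the estimators Nmc times, each time on a fresh sample.
    reps <- lapply(seq_len(Nmc), function(i) as.data.frame(Inference(size)))
    # Tag every repetition with the sample size it was computed for.
    cbind(s = size, do.call(rbind, reps))
  })
  do.call(rbind, rows)
}

Inference <- function(s) {
  rrr <- rnorm(n = s)
  list(Mn = mean(rrr), Sd = sd(rrr))
}

set.seed(1)
res <- run_monte_carlo(Nmc = 50, s = c(1e2, 1e3), Inference = Inference)

# Spread of each estimator for each sample size
spread <- aggregate(cbind(Mn, Sd) ~ s, data = res, FUN = sd)
print(spread)
```

The spread of both estimators should shrink as the sample size grows, which is the kind of finite-sample behaviour the plotting functions are meant to expose.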
Plot boxplots of estimators for different sample sizes.
estims_boxplot(data, sep = FALSE)
data | data frame returned by Estim_diagnost |
sep | logical; if TRUE, the plots are returned as elements of a list, otherwise all plots are stacked together |
ggplot2 object
Nmc <- 400
s <- seq(from = 1, to = 10, by = 2) * 1e3

Inference <- function(s) {
  rrr <- rnorm(n = s)
  list(Mn = mean(rrr), Sd = sd(rrr))
}

data <- Estim_diagnost(Nmc, s, Inference)
estims_boxplot(data)
estims_boxplot(data, sep = TRUE)
Plot QQ-plots of estimators' empirical distributions for different sample sizes.
estims_qqplot(data, sep = FALSE, ...)
data | data frame returned by Estim_diagnost |
sep | logical; if TRUE, the plots are returned as elements of a list, otherwise all plots are stacked together |
... | parameters to pass to the stat_qq function |
ggplot2 object
library(ggplot2)

Nmc <- 500
s <- c(1e3, 4e3)

Inference <- function(s) {
  rrr <- rnorm(n = s)
  list(Mn = mean(rrr), Sd = sd(rrr))
}

data <- Estim_diagnost(Nmc, s, Inference)

lisst <- estims_qqplot(data, sep = TRUE)
lisst[[2]] + geom_abline(intercept = 1)

pl_joint <- estims_qqplot(data)
pl_joint + geom_abline(slope = 1)

pl_joint <- estims_qqplot(data, distribution = stats::qt, dparams = list(df = 3, ncp = 0.1))
pl_joint + geom_abline(slope = 1)
Expectation checking whether a given sample comes from a certain parametric distribution. The underlying procedure is the Anderson-Darling goodness-of-fit test, ad.test.
The expectation throws an error when the test's p-value is smaller than the threshold p-value.
expect_distfit(sample, p_value = 0.001, nulldist, ...)
sample | sample to test |
p_value | threshold p-value of the test |
nulldist | null distribution |
... | parameters to pass to the null distribution |
Invisibly returns a p-value of the test.
# Gaussianity test
## Not run: 
x <- rnorm(n = 1e4, 5, 6)
expect_distfit(sample = x, nulldist = "pnorm", mean = 5, sd = 6.3)
expect_distfit(sample = x, nulldist = "pnorm", mean = 5, sd = 6)
## End(Not run)

# Uniformity test
x <- runif(n = 1e4, -1, 6)
expect_distfit(sample = x, nulldist = "punif", min = -1, max = 6)
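To illustrate how a null distribution given by name and its parameters combine with a threshold p-value, here is a small base-R sketch. It swaps in the Kolmogorov-Smirnov test (`stats::ks.test`) for the Anderson-Darling test, only because `ks.test` ships with base R; it is a different test and is not what `expect_distfit` runs. The sample is a deterministic grid of Gaussian quantiles rather than random draws, and the variable names are illustrative.

```r
# Deterministic "ideal" sample: quantiles of N(5, 6^2) on an even probability grid
x <- qnorm(ppoints(2000), mean = 5, sd = 6)

threshold <- 0.001

# The extra named arguments are forwarded to the null CDF, here pnorm.
p_right <- ks.test(x, "pnorm", mean = 5, sd = 6)$p.value
p_wrong <- ks.test(x, "pnorm", mean = 8, sd = 6)$p.value

# A fit is rejected when the p-value falls below the threshold.
fit_ok_right <- p_right >= threshold
fit_ok_wrong <- p_wrong >= threshold
print(c(right = fit_ok_right, wrong = fit_ok_wrong))
```

Different goodness-of-fit tests have different power against a given misspecification, so p-values from this sketch should not be expected to match those of the Anderson-Darling test.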
Expectation checking whether a given sample comes from a Gaussian distribution with arbitrary parameters. The underlying procedure is the Shapiro-Wilk test of normality, shapiro.test.
The expectation throws an error when the test's p-value is smaller than the threshold p-value.
expect_gaussian(sample, p_value = 0.001)
sample | sample to test |
p_value | threshold p-value of the test |
shapiro.test allows the number of non-missing values to be between 3 and 5000.
Invisibly returns a p-value of the test.
x <- rnorm(n = 1e3, 5, 6)
expect_gaussian(sample = x)

# The following test doesn't pass
## Not run: 
x <- runif(n = 1e2, -1, 6)
expect_gaussian(sample = x)
## End(Not run)
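The threshold rule and the sample-size restriction of shapiro.test can be sketched with base R alone. `check_gaussian` below is a hypothetical helper written for illustration; it signals a plain R error rather than a 'testthat' expectation failure and is not the package's implementation. The two samples are deterministic so that the outcome does not depend on a random seed.

```r
# Hypothetical helper mirroring the described behaviour; not the package's code.
check_gaussian <- function(sample, p_value = 0.001) {
  n <- sum(!is.na(sample))
  # shapiro.test accepts between 3 and 5000 non-missing values.
  if (n < 3 || n > 5000) {
    stop("shapiro.test needs between 3 and 5000 non-missing values, got ", n)
  }
  p <- shapiro.test(sample)$p.value
  if (p < p_value) {
    stop("normality rejected: p-value ", signif(p, 3), " is below ", p_value)
  }
  invisible(p)
}

# Evenly spaced Gaussian quantiles: as close to normal as a sample can look
x_norm <- qnorm(ppoints(500), mean = 5, sd = 6)
p_norm <- check_gaussian(x_norm)

# Strongly right-skewed values: the check should fail
x_skew <- exp(seq(0, 5, length.out = 500))
msg <- tryCatch(check_gaussian(x_skew), error = function(e) conditionMessage(e))
print(msg)
```

For samples larger than 5000, one option is to test a random subsample, at the cost of some power.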
Expectation checking whether values from a given sample have a certain mean or that two samples have the same mean. The underlying procedure is Student's t-test, t.test.
The expectation throws an error when the test's p-value is smaller than the threshold p-value.
expect_mean_equal(p_value = 0.001, ...)
p_value | threshold p-value of the test |
... | parameters to pass to the t.test function, including data sample(s) |
Invisibly returns a p-value of the test.
# This test doesn't pass
## Not run: 
x <- 1:1e3
expect_mean_equal(x = x)
## End(Not run)

# This one passes, but shouldn't
x <- rnorm(1e3) + 0.01
expect_mean_equal(x = x)

x <- rnorm(1e3)
expect_mean_equal(x = x)

# check if 2 samples have the same mean
x <- rnorm(1e3, mean = 10)
y <- rnorm(1e3, mean = 10)
expect_mean_equal(x = x, y = y)
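One way to see why the example with a shift of 0.01 passes although the true mean is not zero is to look at the power of the t-test. The sketch below uses `stats::power.t.test` and assumes a unit standard deviation, a two-sided one-sample test, and that the threshold p-value plays the role of the significance level; the choice of 80% as a target power is arbitrary.

```r
# Power of a one-sample t-test to detect a mean of 0.01 with 1000 observations
pw <- power.t.test(n = 1e3, delta = 0.01, sd = 1, sig.level = 0.001,
                   type = "one.sample", alternative = "two.sided")
print(pw$power)

# Sample size needed to detect the same shift with 80% power
needed <- power.t.test(delta = 0.01, sd = 1, sig.level = 0.001, power = 0.8,
                       type = "one.sample", alternative = "two.sided")
print(ceiling(needed$n))
```

With so little power the test almost never rejects, so a passing expectation says little about shifts this small unless the sample is far larger.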