How to Use NMstats
NMstats is an R package that contains functions aligned to the Navidi/Monk Elementary & Essential Statistics textbooks. This vignette describes the functions available along with special considerations and examples.
Load the NMstats package:
library(NMstats)
combs
The combs function calculates the number of ways to choose r items from a total of n items, without replacement, when the order of selection does not matter.
Usage
The format is combs(n, r) where n is the total number of items and r is the number of items to choose.
Return
The function returns an integer value that represents the number of combinations of r items chosen from n. If invalid input values are given, an error message is returned.
Examples
Count the number of combinations of 4 items chosen from 10
combs(10, 4)
## [1] 210
Computation of binomial probability where n = 15, p = 0.25, and x = 6
n <- 15
p <- 0.25
x <- 6
prob_value <- combs(n, x) * p^x * (1 - p)^(n - x)
print(prob_value)
## [1] 0.09174777
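Both results above can be cross-checked without NMstats. The sketch below uses only base R's choose() and dbinom(); it is an independent check, not a description of how combs is implemented.

```r
# Base R cross-check for the two combs() examples (no NMstats required)
n_ways <- choose(10, 4)   # combinations of 4 items chosen from 10
print(n_ways)

# Binomial probability P(X = 6) with n = 15 and p = 0.25, computed two ways
manual  <- choose(15, 6) * 0.25^6 * (1 - 0.25)^(15 - 6)
builtin <- dbinom(6, size = 15, prob = 0.25)
print(all.equal(manual, builtin))
```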
perms
The perms function calculates the number of ways to choose r items from a total of n items, without replacement, when the order of selection matters.
Usage
The format is perms(n, r) where n is the total number of items and r is the number of items to choose.
Return
The function returns an integer value that represents the number of permutations of r items chosen from n. If invalid input values are given, an error message is returned.
Examples
The number of permutations of 7 items chosen from 12
perms(12, 7)
## [1] 3991680
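The permutation count follows the formula n! / (n - r)!, and equals the combination count multiplied by r!. A minimal base R sketch of both identities, offered as a check rather than as the package's own code:

```r
# Permutations of 7 items chosen from 12, computed two ways in base R
n <- 12
r <- 7
by_factorial <- factorial(n) / factorial(n - r)   # n! / (n - r)!
by_choose    <- choose(n, r) * factorial(r)       # each combination has r! orderings
print(by_factorial)
print(by_factorial == by_choose)
```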
outlier_bounds
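The outlier_bounds function computes the lower and upper outlier boundaries using the IQR method: the lower boundary is Q1 - 1.5 * IQR and the upper boundary is Q3 + 1.5 * IQR, where IQR = Q3 - Q1.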
Usage
The format is outlier_bounds(data) where data is a data set.
Return
The function prints the lower and upper outlier bounds and returns them as Lower.bound and Upper.bound.
Examples
Print the lower and upper outlier bounds of a data set
data <- c(14, 9, 3, 22, 8, 13, 6)
outlier_bounds(data)
##
## Lower Outlier Bound: -2.75
## Upper Outlier Bound: 23.25
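The IQR method can be written out in base R. The sketch below assumes the quartiles come from quantile() with its default settings (type 7), which reproduces the bounds printed above for this data set. Other quartile conventions, including some hand methods, can give different values.

```r
# IQR outlier bounds in base R (assumes quantile()'s default type-7 quartiles)
data <- c(14, 9, 3, 22, 8, 13, 6)
q <- quantile(data, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr   # Q1 - 1.5 * IQR
upper <- q[2] + 1.5 * iqr   # Q3 + 1.5 * IQR
cat("Lower:", lower, " Upper:", upper, "\n")
```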
data_range
The data_range function computes the range of a data set.
Usage
The format is data_range(data) where data is a data set.
Return
The function returns the range of a data set, defined as the minimum subtracted from the maximum.
Examples
Return the range of a data set
data <- c(24, -67, 15, 89, -34, 51, -42, 76)
data_range(data)
## [1] 156
rel_hist
The rel_hist function constructs a relative frequency histogram for a data set.
Usage
The format is rel_hist(data, bins, col, xlab, ylab, main, ybreaks) where data is a data set.
Optional arguments are: bins (number of bins), col (fill color), xlab and ylab (axis labels), main (title), and ybreaks (y-axis tick marks).
Return
The function constructs a relative frequency histogram.
Examples
Generate random data and construct a relative frequency histogram
data <- sample(1:100, 220, replace = TRUE)
rel_hist(data, bins = 20, xlab = "My Data")
var.p
The var.p function calculates the population variance of a data set.
Usage
The format is var.p(data) where data is a data set.
Examples
Calculate the population variance of a data set
data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
var.p(data)
## [1] 12523.21
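Base R's var() computes the sample variance (dividing by n - 1), so it does not give the population variance directly. As an illustration only, the population variance can be obtained either from the definition or by rescaling var():

```r
# Population variance in base R, two equivalent ways
data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
n <- length(data)
pop_var  <- mean((data - mean(data))^2)   # divide the sum of squares by n
rescaled <- var(data) * (n - 1) / n       # convert the n - 1 divisor to n
print(pop_var)
print(all.equal(pop_var, rescaled))
```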
sd.p
The sd.p function calculates the population standard deviation of a data set.
Usage
The format is sd.p(data) where data is a data set.
Examples
Calculate the population standard deviation of a data set
data <- c(-3.2, 7.1, -12.5, 4.8, -0.6, 11.9, -9.4, 2.3, 6.7)
sd.p(data)
## [1] 7.563427
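The population standard deviation is the square root of the population variance. A short base R sketch, shown next to sd() to make the difference in divisors visible:

```r
# Population standard deviation versus base R's sample standard deviation
data <- c(-3.2, 7.1, -12.5, 4.8, -0.6, 11.9, -9.4, 2.3, 6.7)
pop_sd  <- sqrt(mean((data - mean(data))^2))   # divisor n
samp_sd <- sd(data)                            # divisor n - 1, so slightly larger
print(pop_sd)
print(samp_sd > pop_sd)
```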
Z_Interval
The Z_Interval function calculates the confidence interval for a population mean when the population standard deviation is known.
Usage
The format is Z_Interval(xbar, n, sigma, alpha) where xbar is the sample mean, n is the sample size, sigma is the population standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
Return
The function will print the confidence level, margin of error, critical value, and the confidence interval (lower and upper bounds).
Examples
Construct a 90% confidence interval
Z_Interval(xbar = 39.8, n = 36, sigma = 6.4, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 1.75451
## Critical Value: 1.64485
## Lower Bound: 38.04549
## Upper Bound: 41.55451
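The printed quantities follow from the usual z interval formula, xbar ± z * sigma / sqrt(n), where z is the critical value for alpha / 2. The base R sketch below is a hand computation for comparison and is not taken from the package source:

```r
# z confidence interval from summary statistics (base R only)
xbar <- 39.8; n <- 36; sigma <- 6.4; alpha <- 0.1
z_crit <- qnorm(1 - alpha / 2)         # critical value
z_me   <- z_crit * sigma / sqrt(n)     # margin of error
z_ci   <- c(xbar - z_me, xbar + z_me)  # lower and upper bounds
cat("Critical value:", round(z_crit, 5), "\n")
cat("Margin of error:", round(z_me, 5), "\n")
cat("Interval:", round(z_ci, 5), "\n")
```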
T_Interval
The T_Interval function calculates the confidence interval for a population mean when the population standard deviation is not known, and the sample standard deviation is used instead.
Usage
The format is T_Interval(xbar, n, s, alpha) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
Return
The function will print the confidence level, margin of error, degrees of freedom, critical value, and the confidence interval bounds.
Examples
Construct a 95% confidence interval
T_Interval(xbar = 77.3, n = 22, s = 12.4, alpha = 0.05)
## Confidence Level: 95 %
## Margin of Error: 5.49824
## Degrees of Freedom: 21
## Critical Value: 2.07961
## Lower Bound: 71.80176
## Upper Bound: 82.79824
One_Prop_Int
The One_Prop_Int function calculates the confidence interval for a population proportion.
Usage
The format is One_Prop_Int(x, n, alpha) where x is the number of individuals of interest, n is the sample size, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.
Examples
Construct a 90% confidence interval
One_Prop_Int(x = 28, n = 60, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 0.10594
## Critical Value: 1.64485
## Sample Proportion: 0.46667
## Lower Bound: 0.36073
## Upper Bound: 0.57261
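The output above is consistent with the standard large-sample interval phat ± z * sqrt(phat * (1 - phat) / n). The sketch below assumes that formula; whether One_Prop_Int applies any adjustment for small samples is not stated here.

```r
# Large-sample confidence interval for a proportion (base R only)
x <- 28; n <- 60; alpha <- 0.1
phat   <- x / n                                   # sample proportion
p_crit <- qnorm(1 - alpha / 2)                    # critical value
p_me   <- p_crit * sqrt(phat * (1 - phat) / n)    # margin of error
p_ci   <- c(phat - p_me, phat + p_me)
cat("Sample proportion:", round(phat, 5), "\n")
cat("Margin of error:", round(p_me, 5), "\n")
cat("Interval:", round(p_ci, 5), "\n")
```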
Z_Test
The Z_Test function performs a hypothesis test about a population mean when the population standard deviation is known.
Usage
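The format is Z_Test(xbar, n, sigma, mu, alt) where xbar is the sample mean, n is the sample size, sigma is the population standard deviation, mu is the population mean specified by the null hypothesis, and alt is a character string giving the type of test: "left", "right", or "two".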
Examples
Perform a left-tailed hypothesis test
Z_Test(xbar = 47.3, n = 44, sigma = 10.2, mu = 50, alt = "left")
## Test Statistic: z = -1.75586
## P-Value: 0.03956
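The test statistic is z = (xbar - mu) / (sigma / sqrt(n)), and the P-value is the tail area that matches the alternate hypothesis. A base R sketch of the calculation, with all three tail areas shown for reference:

```r
# z test from summary statistics (base R only)
xbar <- 47.3; n <- 44; sigma <- 10.2; mu <- 50
z <- (xbar - mu) / (sigma / sqrt(n))
p_left  <- pnorm(z)                        # left-tailed P-value
p_right <- pnorm(z, lower.tail = FALSE)    # right-tailed P-value
p_two   <- 2 * pnorm(-abs(z))              # two-tailed P-value
cat("z =", round(z, 5), " left-tailed P =", round(p_left, 5), "\n")
```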
T_Test
The T_Test function performs a hypothesis test about a population mean when the population standard deviation is not known.
Usage
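The format is T_Test(xbar, n, s, mu, alt) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, mu is the population mean specified by the null hypothesis, and alt is a character string giving the type of test.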
Examples
Perform a right-tailed hypothesis test
T_Test(xbar = 199.7, n = 60, s = 41.8, mu = 192, alt = "right")
## Number of Degrees of Freedom: 59
## Test Statistic: t = 1.42689
## P-Value: 0.07944
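The t test replaces sigma with s and uses the Student t distribution with n - 1 degrees of freedom. The following base R sketch recomputes the statistic and the right-tailed P-value by hand:

```r
# t test from summary statistics (base R only)
xbar <- 199.7; n <- 60; s <- 41.8; mu <- 192
df <- n - 1                                        # degrees of freedom
t_stat <- (xbar - mu) / (s / sqrt(n))
p_right <- pt(t_stat, df = df, lower.tail = FALSE) # area to the right of t
cat("df =", df, " t =", round(t_stat, 5), " P =", round(p_right, 5), "\n")
```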
One_Prop_Test
The One_Prop_Test function performs a one-sample hypothesis test about a population proportion.
Usage
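The format is One_Prop_Test(x, n, p0, alt) where x is the number of individuals of interest, n is the sample size, p0 is the population proportion specified by the null hypothesis, and alt is a character string giving the type of test.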
Examples
Perform a left-tailed hypothesis test
One_Prop_Test(x = 480, n = 1000, p0 = 0.5, alt = "left")
## Sample Proportion: 0.48
## Test Statistic: z = -1.26491
## P-Value: 0.10295
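For a proportion, the standard error is computed under the null hypothesis, so the statistic is z = (phat - p0) / sqrt(p0 * (1 - p0) / n). A base R sketch of that calculation, given as a check on the output above:

```r
# One-proportion z test (base R only)
x <- 480; n <- 1000; p0 <- 0.5
phat <- x / n
z_prop <- (phat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0, not phat
p_prop <- pnorm(z_prop)                           # left-tailed P-value
cat("phat =", phat, " z =", round(z_prop, 5), " P =", round(p_prop, 5), "\n")
```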
Two_Samp_T_Interval
The Two_Samp_T_Interval function constructs a confidence interval for the difference between two population means (μ₁ − μ₂) given two independent samples.
Usage
The format is Two_Samp_T_Interval(xbar1, s1, n1, xbar2, s2, n2, alpha).
Examples
Two_Samp_T_Interval(xbar1 = 24.5, s1 = 3.2, n1 = 18,
                    xbar2 = 21.3, s2 = 4.1, n2 = 22, alpha = 0.05)
## Confidence Level: 95 %
## Point Estimate: 3.2
## Lower Bound: 0.92316
## Upper Bound: 5.47684
Two_Samp_T_Test
The Two_Samp_T_Test function performs a hypothesis test about the difference between two population means (μ₁ − μ₂) given two independent samples.
Usage
The format is Two_Samp_T_Test(xbar1, s1, n1, xbar2, s2, n2, alt).
Examples
Two_Samp_T_Test(xbar1 = 24.5, s1 = 3.2, n1 = 18,
                xbar2 = 21.3, s2 = 4.1, n2 = 22, alt = "two")
## Test Statistic: t = 2.84498
## P-Value: 0.00718
Two_Prop_Int
The Two_Prop_Int function calculates the confidence interval for the difference between two population proportions (p₁ − p₂).
Usage
The format is Two_Prop_Int(x1, n1, x2, n2, alpha).
Examples
Two_Prop_Int(x1 = 344, n1 = 880, x2 = 304, n2 = 620, alpha = 0.05)
## Confidence Level: 95 %
## Point Estimate: -0.16035
## Lower Bound: -0.23057
## Upper Bound: -0.09013
Sign_Test
The Sign_Test function performs a one-sample Sign Test for a population median when the population distribution is not necessarily normal.
Usage
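The format is Sign_Test(sample, m0, alpha, alt) where sample is the sample data, m0 is the population median specified by the null hypothesis, alpha is the significance level, and alt is a character string giving the alternate hypothesis (the example below uses "less").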
Examples
# Define sample data
data <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96, 2.74)
# Perform the test
Sign_Test(data, m0 = 3, alpha = 0.05, alt = "less")
## Null Hypothesis: H0: m = 3
## Alternate Hypothesis: H1: m < 3
## Test Statistic: 1
## Critical Value: 2
## Result: Reject the Null Hypothesis
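The sign test is based on counting how many observations fall above and below m0. The base R sketch below reproduces the count of plus signs reported as the test statistic. The exact binomial P-value it also computes is an alternative to the critical-value comparison printed by Sign_Test, not part of the output above.

```r
# Sign counts for the sign test, plus an exact binomial P-value (base R only)
data <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96, 2.74)
m0 <- 3
n_plus  <- sum(data > m0)    # observations above the hypothesized median
n_minus <- sum(data < m0)    # observations below it
n_eff   <- n_plus + n_minus  # values equal to m0 are dropped
# Under H0 the number of plus signs is Binomial(n_eff, 0.5); small counts favor H1: m < m0
p_exact <- pbinom(n_plus, size = n_eff, prob = 0.5)
cat("Plus signs:", n_plus, " Minus signs:", n_minus, " Exact P:", p_exact, "\n")
```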
Rank_Sum_Test
The Rank_Sum_Test function performs a nonparametric test for comparing the medians of two populations.
Usage
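The format is Rank_Sum_Test(sample1, sample2, alpha, alt) where sample1 and sample2 are the two samples, alpha is the significance level, and alt is a character string giving the alternate hypothesis (the example below uses "two").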
Examples
# Define sample data
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)
# Perform the test
Rank_Sum_Test(sample_1, sample_2, alpha = 0.05, alt = "two")
## Null Hypothesis: m1 = m2
## Alternate Hypothesis: m1 is not equal to m2
## Test Statistic: -0.38576
## P-Value: 0.69968
## Result: Do Not Reject the Null Hypothesis
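The rank-sum statistic can be reconstructed with the normal approximation. In the sketch below, the rank sum is taken over sample_2 (the smaller sample); this is an assumption chosen because it reproduces the sign of the statistic printed above, and using sample_1 instead gives the same magnitude with the opposite sign. No tie correction is applied to the standard deviation.

```r
# Rank-sum test via the normal approximation (base R only)
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)
n1 <- length(sample_1)
n2 <- length(sample_2)
ranks <- rank(c(sample_1, sample_2))       # tied values receive average ranks
S <- sum(ranks[(n1 + 1):(n1 + n2)])        # rank sum for sample_2 (assumed choice)
mu_S    <- n2 * (n1 + n2 + 1) / 2          # mean of S under H0
sigma_S <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z_rank  <- (S - mu_S) / sigma_S
p_rank  <- 2 * pnorm(-abs(z_rank))         # two-tailed P-value
cat("S =", S, " z =", round(z_rank, 5), " P =", round(p_rank, 5), "\n")
```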
Signed_Rank_Test
The Signed_Rank_Test function performs the nonparametric Signed Rank Test for testing whether there is a difference between the medians of two populations, when the data are in the form of paired samples.
Usage
The format is Signed_Rank_Test(sample1, sample2, alpha) where sample1 and sample2 are the paired samples and alpha is the significance level.
Examples
# Define sample data
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)
# Perform the test
Signed_Rank_Test(sample_1, sample_2, alpha = 0.05)
## Null Hypothesis: md = 0
## Alternate Hypothesis: md is not equal to 0
## Test Statistic: 12.5
## Critical Value: 4
## Result: Do Not Reject the Null Hypothesis
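The signed-rank statistic comes from ranking the absolute paired differences and summing the ranks separately for positive and negative differences. The base R sketch below takes the smaller of the two sums, which matches the test statistic printed above. The critical value comes from a table and is not computed here.

```r
# Signed-rank statistic for paired samples (base R only)
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)
d <- sample_1 - sample_2
d <- d[d != 0]                 # zero differences are dropped
r <- rank(abs(d))              # tied absolute differences receive average ranks
s_plus  <- sum(r[d > 0])       # rank sum of positive differences
s_minus <- sum(r[d < 0])       # rank sum of negative differences
stat <- min(s_plus, s_minus)
cat("S+ =", s_plus, " S- =", s_minus, " statistic =", stat, "\n")
```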