How to Use NMstats

Barry Monk

NMstats is an R package that contains functions aligned to the Navidi/Monk Elementary & Essential Statistics textbooks. This vignette describes the functions available along with special considerations and examples.

Call the NMstats package:

library(NMstats)

combs

The combs function calculates the number of ways to choose r items from a total of n when items are chosen without replacement and order does not matter.

Usage

The format is combs(n, r) where n is the total number of items and r is the number of items to choose.

Return

The function returns an integer value that represents the number of combinations of r items chosen from n. If invalid input values are given, an error message is returned.

Examples

Count the number of combinations of 4 items chosen from 10

combs(10, 4)
## [1] 210

Computation of binomial probability where n = 15, p = 0.25, and x = 6

n <- 15
p <- 0.25
x <- 6
prob_value <- combs(n, x) * p^x * (1 - p)^(n - x)
print(prob_value)
## [1] 0.09174777
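As a cross-check that does not depend on NMstats, the same two quantities can be computed with base R. This is a minimal sketch that assumes combs(n, r) agrees with base R's choose(n, r); that equivalence is an assumption about the package, not something this vignette states.

```r
# Base R cross-check (assumes combs(n, r) matches choose(n, r))
n_comb <- choose(10, 4)
print(n_comb)

# Binomial probability built from the counting formula
n <- 15
p <- 0.25
x <- 6
prob_manual <- choose(n, x) * p^x * (1 - p)^(n - x)

# dbinom() evaluates the same binomial probability directly
prob_builtin <- dbinom(x, size = n, prob = p)
print(all.equal(prob_manual, prob_builtin))
```

If the two probabilities agree, the hand-built formula and the built-in density describe the same calculation.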

perms

The perms function calculates the number of ways to choose r items from a total of n when items are chosen without replacement and order matters.

Usage

The format is perms(n, r) where n is the total number of items and r is the number of items to choose.

Return

The function returns an integer value that represents the number of permutations of r items chosen from n. If invalid input values are given, an error message is returned.

Examples

The number of permutations of 7 items chosen from 12

perms(12, 7)
## [1] 3991680
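The count above follows the formula n! / (n - r)!. The sketch below spells that out in base R; perms_sketch is a hypothetical helper written for illustration, not the package's implementation, and its input checks are an assumption about what counts as invalid input.

```r
# Hypothetical helper: permutations of r items chosen from n, n! / (n - r)!
perms_sketch <- function(n, r) {
  # Reject negative, non-integer, or out-of-range inputs
  stopifnot(n >= 0, r >= 0, r <= n, n == floor(n), r == floor(r))
  factorial(n) / factorial(n - r)
}

result <- perms_sketch(12, 7)
print(result)
```

Because factorial() overflows for large n, this form is only suitable for the small values typical of textbook exercises.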

outlier_bounds

The outlier_bounds function computes the lower and upper outlier boundaries using the IQR method.

Usage

The format is outlier_bounds(data) where data is a data set.

Return

The function will print the lower and upper outlier bounds. Variables Lower.bound and Upper.bound are available as return values after execution of this function.

Examples

Print the lower and upper outlier bounds of a data set

data <- c(14, 9, 3, 22, 8, 13, 6)
outlier_bounds(data)
##
## Lower Outlier Bound: -2.75
## Upper Outlier Bound: 23.25
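The IQR method places the bounds 1.5 IQRs below the first quartile and 1.5 IQRs above the third quartile. The base R sketch below applies that rule to the same data. It assumes quartiles are taken from quantile() with its default method; the vignette does not say how the package computes quartiles, so treat that as an assumption.

```r
data <- c(14, 9, 3, 22, 8, 13, 6)

# Quartiles from quantile()'s default method (type 7); assumed, not confirmed
q <- quantile(data, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# IQR rule: 1.5 IQRs beyond each quartile
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr
print(c(lower = lower, upper = upper))

# Values outside the bounds would be flagged as outliers
print(data[data < lower | data > upper])
```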

data_range

The data_range function computes the range of a data set.

Usage

The format is data_range(data) where data is a data set.

Return

The function returns the range of a data set, defined as the minimum subtracted from the maximum.

Examples

Return the range of a data set

data <- c(24, -67, 15, 89, -34, 51, -42, 76)
data_range(data)
## [1] 156
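Base R's range() returns the minimum and maximum as a pair rather than their difference, which is one reason a helper like data_range is convenient. A minimal base R equivalent, offered as a sketch rather than the package's code:

```r
data <- c(24, -67, 15, 89, -34, 51, -42, 76)

# range() returns c(min, max); diff() subtracts the first from the second
rng <- diff(range(data))
print(rng)
```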

rel_hist

The rel_hist function constructs a relative frequency histogram for a data set.

Usage

The format is rel_hist(data, bins, col, xlab, ylab, main, ybreaks) where data is a data set.

Optional arguments are: bins (number of bins), col (fill color), xlab and ylab (axis labels), main (title), and ybreaks (y-axis tick marks).

Return

The function constructs a relative frequency histogram.

Examples

Generate random data and construct a relative frequency histogram

data <- sample(1:100, 220, replace = TRUE)
rel_hist(data, bins = 20, xlab = "My Data")
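A relative frequency histogram rescales each bin count by the total number of observations, so the bar heights sum to 1. The sketch below shows one way to do this with base R's hist(); it is not the package's implementation, the seed is arbitrary, and hist() treats breaks = 20 as a suggestion, so the bin count may differ from what rel_hist draws.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
data <- sample(1:100, 220, replace = TRUE)

# Bin the data without drawing anything yet
h <- hist(data, breaks = 20, plot = FALSE)

# Convert counts to relative frequencies (proportions summing to 1)
rel_freq <- h$counts / sum(h$counts)
h$counts <- rel_freq

# Plotting the modified object puts relative frequency on the y-axis
plot(h, freq = TRUE, xlab = "My Data", ylab = "Relative Frequency",
     main = "Relative Frequency Histogram")
```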

var.p

The var.p function calculates the population variance of a data set.

Usage

The format is var.p(data) where data is a data set.

Examples

Calculate the population variance of a data set

data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
var.p(data)
## [1] 12523.21
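The population variance divides the sum of squared deviations by n, whereas base R's var() divides by n - 1. The sketch below computes the population variance both directly and by rescaling var(); it is an illustration of the formula, not the package's source.

```r
data <- c(37, 292, 175, 86, 331, 249, 104, 58, 368, 213)
n <- length(data)

# Population variance: mean squared deviation from the mean (divide by n)
var_pop <- sum((data - mean(data))^2) / n
print(var_pop)

# var() divides by n - 1, so rescaling recovers the population variance
var_rescaled <- var(data) * (n - 1) / n
print(var_rescaled)
```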

sd.p

The sd.p function calculates the population standard deviation of a data set.

Usage

The format is sd.p(data) where data is a data set.

Examples

Calculate the population standard deviation of a data set

data <- c(-3.2, 7.1, -12.5, 4.8, -0.6, 11.9, -9.4, 2.3, 6.7)
sd.p(data)
## [1] 7.563427
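The population standard deviation is the square root of the mean squared deviation from the mean. A short base R sketch of that definition, with the sample standard deviation from sd() shown for comparison:

```r
data <- c(-3.2, 7.1, -12.5, 4.8, -0.6, 11.9, -9.4, 2.3, 6.7)

# Population standard deviation: divide by n, then take the square root
sd_pop <- sqrt(mean((data - mean(data))^2))
print(sd_pop)

# sd() divides by n - 1, so it is slightly larger for the same data
sd_sample <- sd(data)
print(sd_sample)
```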

Z_Interval

The Z_Interval function calculates the confidence interval for a population mean when the population standard deviation is known.

Usage

The format is Z_Interval(xbar, n, sigma, alpha) where xbar is the sample mean, n is the sample size, sigma is the population standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.

Return

The function will print the confidence level, margin of error, critical value, and the confidence interval (lower and upper bounds).

Examples

Construct a 90% confidence interval

Z_Interval(xbar = 39.8, n = 36, sigma = 6.4, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 1.75451
## Critical Value: 1.64485
## Lower Bound: 38.04549
## Upper Bound: 41.55451
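The printed values follow from the standard formula xbar ± z · sigma / sqrt(n), where z is the upper alpha/2 critical value of the standard normal distribution. The base R sketch below works through the same example by hand; it is a check on the arithmetic, not the package's code.

```r
xbar <- 39.8
n <- 36
sigma <- 6.4
alpha <- 0.1

# Critical value with alpha/2 in each tail
z_crit <- qnorm(1 - alpha / 2)

# Margin of error and interval bounds
margin <- z_crit * sigma / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
print(round(c(critical = z_crit, margin = margin, ci), 5))
```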

T_Interval

The T_Interval function calculates the confidence interval for a population mean when the population standard deviation is not known, and the sample standard deviation is used instead.

Usage

The format is T_Interval(xbar, n, s, alpha) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.

Return

The function will print the confidence level, margin of error, degrees of freedom, critical value, and the confidence interval bounds.

Examples

Construct a 95% confidence interval

T_Interval(xbar = 77.3, n = 22, s = 12.4, alpha = 0.05)
## Confidence Level: 95 %
## Margin of Error: 5.49824
## Degrees of Freedom: 21
## Critical Value: 2.07961
## Lower Bound: 71.80176
## Upper Bound: 82.79824
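The t interval replaces the normal critical value with one from Student's t distribution with n - 1 degrees of freedom. The sketch below applies the textbook formula xbar ± t · s / sqrt(n) in base R. It is an illustration under that assumption; the package's internals and rounding are not documented here, so its printed digits may differ slightly from what this produces.

```r
xbar <- 77.3
n <- 22
s <- 12.4
alpha <- 0.05

# Degrees of freedom and critical value from the t distribution
df <- n - 1
t_crit <- qt(1 - alpha / 2, df)

# Margin of error and interval bounds
margin <- t_crit * s / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
print(round(c(df = df, critical = t_crit, margin = margin, ci), 5))
```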

One_Prop_Int

The One_Prop_Int function calculates the confidence interval for a population proportion.

Usage

The format is One_Prop_Int(x, n, alpha) where x is the number of individuals of interest, n is the sample size, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.

Examples

Construct a 90% confidence interval

One_Prop_Int(x = 28, n = 60, alpha = 0.1)
## Confidence Level: 90 %
## Margin of Error: 0.10594
## Critical Value: 1.64485
## Sample Proportion: 0.46667
## Lower Bound: 0.36073
## Upper Bound: 0.57261
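The printed values are consistent with the standard large-sample interval p_hat ± z · sqrt(p_hat (1 - p_hat) / n). The base R sketch below reproduces the arithmetic for the same inputs; that the package uses exactly this formula is inferred from the output, not stated in the vignette.

```r
x <- 28
n <- 60
alpha <- 0.1

# Sample proportion and normal critical value
p_hat <- x / n
z_crit <- qnorm(1 - alpha / 2)

# Standard error uses the sample proportion
margin <- z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci <- c(lower = p_hat - margin, upper = p_hat + margin)
print(round(c(p_hat = p_hat, margin = margin, ci), 5))
```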

Z_Test

The Z_Test function performs a hypothesis test about a population mean when the population standard deviation is known.

Usage

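The format is Z_Test(xbar, n, sigma, mu, alt) where xbar is the sample mean, n is the sample size, sigma is the population standard deviation, mu is the population mean specified by the null hypothesis, and alt is a character string with choices "left", "right", or "two" that selects the alternative hypothesis.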

Examples

Perform a left-tailed hypothesis test

Z_Test(xbar = 47.3, n = 44, sigma = 10.2, mu = 50, alt = "left")
## Test Statistic: z = -1.75586
## P-Value: 0.03956
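The test statistic is z = (xbar - mu) / (sigma / sqrt(n)), and the P-value is the normal tail area in the direction given by alt. The sketch below implements this in base R; z_test_sketch is a hypothetical helper for illustration and is not the package's source.

```r
# Hypothetical helper: one-sample z test from summary statistics
z_test_sketch <- function(xbar, n, sigma, mu, alt = c("left", "right", "two")) {
  alt <- match.arg(alt)
  z <- (xbar - mu) / (sigma / sqrt(n))
  # Tail area depends on the direction of the alternative hypothesis
  p <- switch(alt,
    left  = pnorm(z),
    right = pnorm(z, lower.tail = FALSE),
    two   = 2 * pnorm(-abs(z))
  )
  list(z = z, p_value = p)
}

res <- z_test_sketch(xbar = 47.3, n = 44, sigma = 10.2, mu = 50, alt = "left")
print(res)
```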

T_Test

The T_Test function performs a hypothesis test about a population mean when the population standard deviation is not known.

Usage

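The format is T_Test(xbar, n, s, mu, alt) where xbar is the sample mean, n is the sample size, s is the sample standard deviation, mu is the population mean specified by the null hypothesis, and alt selects the alternative hypothesis.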

Examples

Perform a right-tailed hypothesis test

T_Test(xbar = 199.7, n = 60, s = 41.8, mu = 192, alt = "right")
## Number of Degrees of Freedom: 59
## Test Statistic: t = 1.42689
## P-Value: 0.07944
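For the t test, the statistic has the same form as the z statistic with s in place of sigma, and the P-value comes from Student's t distribution with n - 1 degrees of freedom. A base R sketch of the right-tailed example above, written as an illustration rather than the package's code:

```r
xbar <- 199.7
n <- 60
s <- 41.8
mu <- 192

# Degrees of freedom and test statistic
df <- n - 1
t_stat <- (xbar - mu) / (s / sqrt(n))

# Right-tailed P-value: area above the observed statistic
p_value <- pt(t_stat, df, lower.tail = FALSE)
print(round(c(df = df, t = t_stat, p_value = p_value), 5))
```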

One_Prop_Test

The One_Prop_Test function performs a one-sample hypothesis test about a population proportion.

Usage

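The format is One_Prop_Test(x, n, p0, alt) where x is the number of individuals of interest, n is the sample size, p0 is the population proportion specified by the null hypothesis, and alt selects the alternative hypothesis.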

Examples

One_Prop_Test(x = 480, n = 1000, p0 = 0.5, alt = "left")
## Sample Proportion: 0.48
## Test Statistic: z = -1.26491
## P-Value: 0.10295
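In a one-proportion z test, the standard error is computed from the null value p0 rather than from the sample proportion. The base R sketch below reproduces the left-tailed example; it illustrates the standard formula and is not taken from the package.

```r
x <- 480
n <- 1000
p0 <- 0.5

# Sample proportion and test statistic (standard error uses p0)
p_hat <- x / n
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed P-value: area below the observed statistic
p_value <- pnorm(z)
print(round(c(p_hat = p_hat, z = z, p_value = p_value), 5))
```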

Two_Samp_T_Interval

The Two_Samp_T_Interval function constructs a confidence interval for the difference between two population means (μ₁ − μ₂) given two independent samples.

Usage

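The format is Two_Samp_T_Interval(xbar1, s1, n1, xbar2, s2, n2, alpha) where xbar1, s1, and n1 are the mean, standard deviation, and size of the first sample, xbar2, s2, and n2 are the same quantities for the second sample, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.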

Examples

Two_Samp_T_Interval(xbar1 = 24.5, s1 = 3.2, n1 = 18,
                    xbar2 = 21.3, s2 = 4.1, n2 = 22, alpha = 0.05)
## Confidence Level: 95 %
## Point Estimate: 3.2
## Lower Bound: 0.92316
## Upper Bound: 5.47684
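One common way to build this interval is Welch's method, which uses an unpooled standard error and the Welch-Satterthwaite degrees of freedom. The sketch below follows that method in base R. The vignette does not state which standard error or degrees of freedom the package uses, so both are assumptions here and the bounds this produces need not match the package's printed output.

```r
# Summary statistics from the example above
xbar1 <- 24.5
s1 <- 3.2
n1 <- 18
xbar2 <- 21.3
s2 <- 4.1
n2 <- 22
alpha <- 0.05

# Unpooled (Welch) standard error of xbar1 - xbar2
v1 <- s1^2 / n1
v2 <- s2^2 / n2
se <- sqrt(v1 + v2)

# Welch-Satterthwaite degrees of freedom (an assumed choice)
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

point <- xbar1 - xbar2
margin <- qt(1 - alpha / 2, df) * se
print(c(point = point, df = df, lower = point - margin, upper = point + margin))
```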

Two_Samp_T_Test

The Two_Samp_T_Test function performs a hypothesis test about the difference between two population means (μ₁ − μ₂) given two independent samples.

Usage

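The format is Two_Samp_T_Test(xbar1, s1, n1, xbar2, s2, n2, alt) where xbar1, s1, and n1 are the mean, standard deviation, and size of the first sample, xbar2, s2, and n2 are the same quantities for the second sample, and alt selects the alternative hypothesis.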

Examples

Two_Samp_T_Test(xbar1 = 24.5, s1 = 3.2, n1 = 18,
                xbar2 = 21.3, s2 = 4.1, n2 = 22, alt = "two")
## Test Statistic: t = 2.84498
## P-Value: 0.00718
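A two-sample t statistic divides the difference in sample means by a standard error, and the choice of standard error (pooled or unpooled) and degrees of freedom changes the result. The sketch below uses Welch's unpooled version in base R. welch_t_sketch is a hypothetical helper, and because the vignette does not say which variant the package implements, its output need not agree with the printed values above.

```r
# Hypothetical helper: Welch two-sample t test from summary statistics
welch_t_sketch <- function(xbar1, s1, n1, xbar2, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  # Unpooled standard error in the denominator
  t_stat <- (xbar1 - xbar2) / sqrt(v1 + v2)
  # Welch-Satterthwaite degrees of freedom (an assumed choice)
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  # Two-tailed P-value: area in both tails beyond |t|
  list(t = t_stat, df = df, p_two = 2 * pt(-abs(t_stat), df))
}

res <- welch_t_sketch(24.5, 3.2, 18, 21.3, 4.1, 22)
print(res)
```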

Two_Prop_Int

The Two_Prop_Int function calculates the confidence interval for the difference between two population proportions (p₁ − p₂).

Usage

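The format is Two_Prop_Int(x1, n1, x2, n2, alpha) where x1 and x2 are the numbers of individuals of interest in the first and second samples, n1 and n2 are the sample sizes, and alpha is a value between 0 and 1 where 1 - alpha is the confidence level as a decimal.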

Examples

Two_Prop_Int(x1 = 344, n1 = 880, x2 = 304, n2 = 620, alpha = 0.05)
## Confidence Level: 95 %
## Point Estimate: -0.16035
## Lower Bound: -0.23057
## Upper Bound: -0.09013
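The usual large-sample interval for p₁ − p₂ takes the difference in sample proportions plus or minus z times the square root of the sum of the two estimated variances. The base R sketch below applies that formula to the inputs from the example. It assumes the unadjusted large-sample method; the vignette does not say whether the package applies any adjustment, so the numbers produced here need not match the package's printed output.

```r
x1 <- 344
n1 <- 880
x2 <- 304
n2 <- 620
alpha <- 0.05

# Sample proportions and their difference
p1 <- x1 / n1
p2 <- x2 / n2
point <- p1 - p2

# Unadjusted large-sample standard error (an assumed choice)
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
margin <- qnorm(1 - alpha / 2) * se
print(c(point = point, lower = point - margin, upper = point + margin))
```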

Sign_Test

The Sign_Test function performs a one-sample Sign Test for a population median when the population distribution is not necessarily normal.

Usage

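The format is Sign_Test(sample, m0, alpha, alt) where sample is the sample data, m0 is the population median specified by the null hypothesis, alpha is the significance level, and alt selects the alternative hypothesis.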

Examples

# Define sample data
data <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96, 2.74)

# Perform the test
Sign_Test(data, m0 = 3, alpha = 0.05, alt = "less")
## Null Hypothesis: H0: m = 3
## Alternate Hypothesis: H1: m < 3
## Test Statistic: 1
## Critical Value: 2
## Result: Reject the Null Hypothesis
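The sign test counts how many observations fall above and below m0; under the null hypothesis each sign is equally likely, so the count follows a Binomial(n, 0.5) distribution. The base R sketch below works through the left-tailed example by hand. Deriving the critical value from exact binomial probabilities is an assumption made here for illustration; the package may use a lookup table or a normal approximation instead.

```r
data <- c(2.93, 2.95, 2.76, 2.89, 2.57, 3.06, 2.61, 2.66, 2.98, 2.79, 2.96, 2.74)
m0 <- 3
alpha <- 0.05

# Drop values equal to m0, then count the signs of the differences
d <- data[data != m0] - m0
n <- length(d)
n_plus <- sum(d > 0)
n_minus <- sum(d < 0)

# For H1: m < m0, few plus signs support H1, so the statistic is n_plus
stat <- n_plus

# Largest c with P(X <= c) <= alpha for X ~ Binomial(n, 0.5)
k <- 0:n
crit <- max(k[pbinom(k, n, 0.5) <= alpha])

# Reject H0 when the statistic is at or below the critical value
reject <- stat <= crit
print(list(n_plus = n_plus, n_minus = n_minus, critical = crit, reject = reject))
```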

Rank_Sum_Test

The Rank_Sum_Test function performs a nonparametric test for comparing the medians of two populations.

Usage

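The format is Rank_Sum_Test(sample1, sample2, alpha, alt) where sample1 and sample2 are the two independent samples, alpha is the significance level, and alt selects the alternative hypothesis.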

Examples

# Define sample data
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)

# Perform the test
Rank_Sum_Test(sample_1, sample_2, alpha = 0.05, alt = "two")
## Null Hypothesis: m1 = m2
## Alternate Hypothesis: m1 is not equal to m2
## Test Statistic: -0.38576
## P-Value: 0.69968
## Result: Do Not Reject the Null Hypothesis
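The rank-sum test ranks the combined data, sums the ranks belonging to one sample, and standardizes that sum using its mean and standard deviation under the null hypothesis. The base R sketch below does this with the rank sum of the first sample and no tie correction, both of which are assumptions. The sign of z depends on which sample's rank sum is used, so it may be opposite to the printed test statistic; the two-tailed P-value is unaffected.

```r
sample_1 <- c(78, 82, 83, 87, 75, 63, 78, 60, 94, 62, 98, 90, 97, 81)
sample_2 <- c(73, 72, 92, 100, 74, 90, 64, 84, 77, 89, 70, 64)
n1 <- length(sample_1)
n2 <- length(sample_2)

# Rank the pooled data; tied values receive the average of their ranks
r <- rank(c(sample_1, sample_2))

# Sum of the ranks that belong to the first sample
S <- sum(r[seq_len(n1)])

# Mean and standard deviation of S under the null hypothesis
mu_S <- n1 * (n1 + n2 + 1) / 2
sigma_S <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Normal approximation and two-tailed P-value
z <- (S - mu_S) / sigma_S
p_value <- 2 * pnorm(-abs(z))
print(round(c(S = S, z = z, p_value = p_value), 5))
```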

Signed_Rank_Test

The Signed_Rank_Test function performs the nonparametric Signed Rank Test for testing whether there is a difference between the medians of two populations, when the data are in the form of paired samples.

Usage

The format is Signed_Rank_Test(sample1, sample2, alpha) where sample1 and sample2 are paired samples of equal length and alpha is the significance level.

Examples

# Define sample data
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)

# Perform the test
Signed_Rank_Test(sample_1, sample_2, alpha = 0.05)
## Null Hypothesis: md = 0
## Alternate Hypothesis: md is not equal to 0
## Test Statistic: 12.5
## Critical Value: 4
## Result: Do Not Reject the Null Hypothesis
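The signed-rank statistic comes from ranking the absolute paired differences and summing the ranks separately for positive and negative differences. The base R sketch below computes the statistic for the example, taking the smaller of the two sums as the test statistic. That choice is inferred from the printed output rather than stated in the vignette, and the critical value is not derived here.

```r
sample_1 <- c(283, 299, 274, 284, 248, 275, 293, 277)
sample_2 <- c(290, 281, 262, 287, 253, 287, 267, 271)

# Paired differences, dropping any zero differences
d <- sample_1 - sample_2
d <- d[d != 0]

# Rank the absolute differences; ties receive average ranks
r <- rank(abs(d))

# Rank sums for positive and negative differences
s_plus <- sum(r[d > 0])
s_minus <- sum(r[d < 0])

# Test statistic: the smaller of the two rank sums
stat <- min(s_plus, s_minus)
print(c(s_plus = s_plus, s_minus = s_minus, statistic = stat))
```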