User Interaction Sketches

On this page we look at different ways in which a user may interact with and use distr6. The use cases are ordered by increasing complexity, and the final two link to other projects that will involve distr6.

Any ideas or suggestions are welcome.

For each of these we will use R6 notation for OOP method calls and S3 notation for dispatch. For example, to calculate the mean of a Binomial distribution:

B <- Binomial$new(size = 10, prob = 0.5)
B$mean() # R6 method call
mean(B) # S3 dispatch

Note that the variable and class names used in these sketches are for guidance only and are not the final names.

Use Case: Finding distributions by property and/or trait

Brief: A user who is unsure which distribution to work with can list all distributions, optionally filtered by properties or traits.

Basic flow:

Call listDistributions(), optionally specifying properties or traits
The system returns a data frame of all distributions satisfying the requirements (all distributions if none are specified)
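
The flow above can be sketched in base R as a filter over a data frame. This is only an illustration of the idea, not the distr6 implementation: the registry contents, its column names and the helper list_distributions() are all hypothetical.

```r
# Hypothetical registry of distributions; columns and values are illustrative only.
registry <- data.frame(
  name = c("Binomial", "Normal", "Exponential", "Poisson"),
  type = c("discrete", "continuous", "continuous", "discrete"),
  support = c("bounded", "unbounded", "half-bounded", "half-bounded"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows matching every supplied filter.
# With no filters, every row is returned.
list_distributions <- function(registry, ...) {
  filters <- list(...)
  keep <- rep(TRUE, nrow(registry))
  for (col in names(filters)) {
    keep <- keep & registry[[col]] %in% filters[[col]]
  }
  registry[keep, , drop = FALSE]
}

all_dists <- list_distributions(registry)
discrete_dists <- list_distributions(registry, type = "discrete")
```

Returning a plain data frame keeps the result easy to subset or print, which matches the stated goal of helping a user browse before choosing a distribution.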

Use Case: Statistical analysis

Brief: Constructing a distribution in distr6 is equivalent to defining a random variable following that distribution. Then one can evaluate that random variable's density, distribution, quantile or other statistical features.

Basic flow:

Construct a given distribution
Use method calls or dispatch to call the relevant statistical function

Pseudo-code:

B <- Binomial$new(size = 10, prob = 0.2) # Binomial(10, 0.2)
B$mean() # mean(B)
B$var() # var(B)
B$pdf(2) # pdf(B, 2)
B$cdf(2) # cdf(B, 2)
B$quantile(0.4) # quantile(B, 0.4)
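
For comparison, the same quantities for Binomial(10, 0.2) can be computed with the base R stats functions. This sketch does not use distr6; the variable names are illustrative, and it is offered only as a reference for what the method calls above would be expected to return.

```r
size <- 10
prob <- 0.2

binom_mean <- size * prob                      # E[X] = n * p
binom_var <- size * prob * (1 - prob)          # Var[X] = n * p * (1 - p)
binom_pdf_2 <- dbinom(2, size = size, prob = prob)      # P(X = 2)
binom_cdf_2 <- pbinom(2, size = size, prob = prob)      # P(X <= 2)
binom_q_40 <- qbinom(0.4, size = size, prob = prob)     # smallest x with P(X <= x) >= 0.4
```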

Use Case: Simulation

Brief: Simulating numbers from a given distribution.

Basic flow:

Construct a given distribution
Via R6 or S3 simulate x numbers from the distribution

Pseudo-code:

B <- Binomial$new(size = 10, prob = 0.2)
B$rand(100) # rand(B, 100)
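
A base R counterpart of this use case, shown only as a point of reference: rbinom() draws from the same Binomial(10, 0.2) distribution. The seed is an arbitrary choice made here so that the draws are reproducible.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
draws <- rbinom(100, size = 10, prob = 0.2)

# The sample mean should be close to the theoretical mean, 10 * 0.2 = 2.
sample_mean <- mean(draws)
```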

Use Case: Plotting distributions

Brief: Plotting specified functions from a given distribution.

Basic flow:

Construct a given distribution
Call plot() on the distribution, either specifying function or cycling through all, and either specifying range or over (reasonable) support

Pseudo-code:

N <- Normal$new(mean = 0, sd = 1)
plot(N, rep = "hazard", range = c(-1, 1), type = "l") # Plots a line plot of hazard function over (-1,1)
plot(N, plots = 2) # Plots line plots (suggested default for AbsContDist) for the first two possible representations (e.g. pdf and cdf) then 'Press ENTER to continue' to see more plots
qqplot(N)
hist(N)
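
As a reference for the hazard example, the values such a plot would show can be computed in base R. The hazard function is h(x) = pdf(x) / (1 - cdf(x)); the grid size below is an arbitrary choice, and nothing here reflects how distr6 will implement plotting.

```r
# Grid over the requested range (-1, 1); 201 points is an arbitrary resolution.
x <- seq(-1, 1, length.out = 201)

# Hazard of the standard Normal: density divided by the survival function.
hazard <- dnorm(x) / pnorm(x, lower.tail = FALSE)
```

Passing these vectors to plot(x, hazard, type = "l") draws the corresponding line plot.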

Use Case: Estimating model parameters

Brief: Given a constructed distribution and a data-frame type object of empirical data, perform statistical inference for a given method to estimate the specified parameter.

Basic flow:

Construct a distribution
Create or load a dataset of empirical data
Call Estimate() with distribution, data and specified method as arguments

Pseudo-code:

N <- Normal$new(sd = 1)
x <- read.table("data.dat") # Placeholder file name for the empirical data
Estimate(method = "mle", distr = N, param = "mean", data = x)

Note: See design project for open questions regarding how best to implement this.
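
To make the intended result concrete, here is a minimal base R sketch of maximum likelihood estimation of the mean of a Normal distribution with known sd = 1. It is not the proposed Estimate() interface: the data are simulated here as a stand-in for the empirical dataset, and the true mean of 3 and the seed are arbitrary assumptions.

```r
set.seed(1)                          # arbitrary seed
x <- rnorm(200, mean = 3, sd = 1)    # synthetic stand-in for the empirical data

# Negative log-likelihood of the data as a function of the unknown mean.
neg_log_lik <- function(mu) -sum(dnorm(x, mean = mu, sd = 1, log = TRUE))

# One-dimensional minimisation; the MLE must lie within the range of the data.
fit <- optimize(neg_log_lik, interval = range(x))
mle_mean <- fit$minimum
```

For this model the MLE coincides with the sample mean, so mean(x) gives a quick check on the numerical optimiser.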

Use Case: Convolution of Random Variables

Brief: Given two random variables (instances of distributions), we can "add" these together, assuming independence, to construct a new distribution object, their convolution.

Basic flow:

Construct an instance of distribution X
Construct an instance of distribution Y
Calculate the convolution X+Y

Pseudo-code:

N <- Normal$new()
E <- Exponential$new()
convNE <- N + E 
convNE$pdf(1)
convNE$cdf(1)

Note: Similarly for N * E etc. We extend the distr design and use X + X for the convolution of two i.i.d. random variables constructed from the same distribution, and 2 * X for the scalar multiplication of a single random variable.
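
The pdf and cdf of such a convolution can be sketched numerically in base R. The example assumes N is a standard Normal and E is an Exponential with rate 1, which is a guess at the defaults since the sketch above does not state them, and it assumes N and E are independent. The function names are illustrative.

```r
# Density of N + E: integrate the Normal density against the Exponential density.
conv_pdf <- function(z) {
  integrate(function(y) dnorm(z - y) * dexp(y), lower = 0, upper = Inf)$value
}

# Distribution function of N + E: integrate the Normal cdf against the Exponential density.
conv_cdf <- function(z) {
  integrate(function(y) pnorm(z - y) * dexp(y), lower = 0, upper = Inf)$value
}

conv_pdf_1 <- conv_pdf(1)
conv_cdf_1 <- conv_cdf(1)
```

For this particular pair a closed form exists, pdf(z) = exp(1/2 - z) * pnorm(z - 1), which can be used to check the numerical result. In general a convolution class would need a numerical fallback of this kind.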

Use Case: Conditioning

Brief: Given two random variables X and Y, construct the probability distribution of X|Y.

Basic flow:

Construct instance of distribution X
Construct instance of distribution Y
Construct conditional sub-class X|Y

Pseudo-code:

N <- Normal$new()
E <- Exponential$new()
NgE <- Conditional$new(N, E) # Distribution of N given E
NgE$mean()
NgE$sd()

Note: See design project about how conditional classes should be implemented, also for truncated and decomposed distributions.

Use Case: Mixing Distributions

Brief: Given two or more random variables, construct a mixing distribution based on these and either given probabilities or assumed uniform probabilities.

Basic flow:

Construct instance of distribution X_1
Construct instance of distribution X_2
...
Construct instance of distribution X_n
Construct mixing sub-class from instances

Pseudo-code:

N <- Normal$new()
E <- Exponential$new()
B <- Binomial$new()
mixNEB <- Mixing$new(N, E, B, weights = c(0.1, 0.2, 0.7)) # Explicit weights
mixUnif <- Mixing$new(N, E, B) # Weights assumed to be uniform, 1/3 each
mixNEB$mean()
mixNEB$sd()
mixNEB$cdf(2)
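
The weighted case can be worked through in base R to show what the mixture's mean, sd and cdf would be. The component parameters are assumptions made for this sketch, since the pseudo-code leaves them at unspecified defaults: a standard Normal, an Exponential with rate 1 and a Binomial(10, 0.5).

```r
weights <- c(0.1, 0.2, 0.7)

# Means and variances of the assumed components: N(0, 1), Exp(1), Binomial(10, 0.5).
comp_means <- c(0, 1, 10 * 0.5)
comp_vars <- c(1, 1, 10 * 0.5 * 0.5)

# Mixture mean is the weighted mean; mixture variance uses the second moments.
mix_mean <- sum(weights * comp_means)
mix_var <- sum(weights * (comp_vars + comp_means^2)) - mix_mean^2
mix_sd <- sqrt(mix_var)

# Mixture cdf is the weighted sum of the component cdfs.
mix_cdf <- function(q) {
  sum(weights * c(pnorm(q), pexp(q), pbinom(q, size = 10, prob = 0.5)))
}
mix_cdf_2 <- mix_cdf(2)
```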

Use Case: Creating new distributions

Brief: Define a new distribution via a density/mass/distribution function and internally other representations are automatically generated.

Basic flow:

Construct an instance of either ContDist or DiscDist, supplying either a pdf/pmf or a cdf
The system internally verifies that this defines a valid distribution
The system internally derives the other statistical functions

Pseudo-code:

newDist <- DiscreteDist$new(supp = c(1,5,7,21), prob = c(0.1,0.1,0.6,0.2)) # Example taken from distr

Note: This will require careful planning about which classes should be abstract, as original designs had DiscreteDist as an abstract class with all implemented distributions as sub-classes. Clearly, however, this will not work in cases such as this.
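
The verification and derivation steps can be illustrated in base R for the discrete example above. This is a sketch of the idea only, not the planned class design, and the function and variable names are illustrative.

```r
supp <- c(1, 5, 7, 21)
prob <- c(0.1, 0.1, 0.6, 0.2)

# Validity checks: matching lengths, non-negative probabilities that sum to one.
stopifnot(
  length(supp) == length(prob),
  all(prob >= 0),
  isTRUE(all.equal(sum(prob), 1))
)

# Probability mass function: the stated probability on the support, zero elsewhere.
pmf <- function(x) ifelse(x %in% supp, prob[match(x, supp)], 0)

# Distribution function derived from the pmf by summing mass up to each q.
cdf <- function(q) vapply(q, function(v) sum(prob[supp <= v]), numeric(1))

# Moments derived from the pmf.
dist_mean <- sum(supp * prob)
dist_var <- sum((supp - dist_mean)^2 * prob)
```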

Use Case: Describing data

Brief: Given that data follow a particular distribution (known a priori or via inference/simulation), use a generic data container to describe the data. See xdataframe project for more details.

Use Case: Probabilistic supervised learning

Brief: Given a modelling interface that is trained on a data-frame of distributions or that makes use of statistical inference in some other way, return a predicted model as a distribution object. See pslr project for more details.
