Lovegrove Mathematicals

"A calculated distribution is not the underlying set"

Distributions

Background

In probabilistic analysis, the modern technique is to describe distributions by parametrically-defined formulæ. Often there are two or three parameters, conveniently referred to as location, scale and shape, although there may be more or fewer.

The main advantages of this approach are:

Succinctness
To convey a result, it is necessary to give only the name of the type of distribution plus the values of the parameters: for example, "Gaussian distribution, mean = 33.1, sd = 2.65".
Detailed properties
It is usually possible to differentiate and integrate the theoretical expressions to give other insights which may be of use, such as skew.
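
As a minimal sketch of both advantages, the Gaussian quoted above can be rebuilt from its two parameters alone, and a further property (here the skew) obtained by integrating against the density. The helper name `skewness`, the integration range and the step count are illustrative assumptions, not anything defined on this site.

```python
from statistics import NormalDist

# The whole distribution is conveyed by its type plus two parameters.
dist = NormalDist(mu=33.1, sigma=2.65)

def skewness(d, half_width=10.0, steps=20000):
    """Third standardised moment of d, by midpoint-rule integration.

    half_width (in standard deviations) and steps are illustrative
    choices made for this sketch.
    """
    lo = d.mean - half_width * d.stdev
    hi = d.mean + half_width * d.stdev
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx          # midpoint of the i-th strip
        z = (x - d.mean) / d.stdev       # standardised value
        total += z ** 3 * d.pdf(x) * dx
    return total

skew = skewness(dist)            # a Gaussian is symmetric, so close to 0
prob_below_30 = dist.cdf(30.0)   # probability of a value below 30
```

Nothing beyond the name "Gaussian" and the two numbers is needed to evaluate `skew` or `prob_below_30`, which is the succinctness described above.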

There are, though, disadvantages. These include:

Enforced Smoothness
All the parametrically-defined formulæ used in practice give smooth results. (I am here using the term "smooth" in an informal way, not in the technical sense of having a continuous first derivative.) For example, visualise the graph of a unimodal distribution; you will almost certainly be thinking of a bell-shaped distribution, since that is what you have become used to by using parametrically-defined formulæ. Yet, compared to the set of unimodal distributions, the set of bell-shaped distributions has measure zero, although you could scarcely believe it from publications based on parametrically-defined formulæ.
Absence of a formula
Enforced smoothness aside, often there is no suitable formula in any case.
  • Think of ranked distributions. You can undoubtedly think of models for specific situations, such as the Poisson distribution or Zipf's Law, but try giving a formula that gives all ranked distributions (and nothing more); there isn't one.
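
The point about ranked distributions can be illustrated with a short sketch. It assumes that "ranked" means the probabilities are in non-increasing order, which is not defined in this section; the function names `is_distribution`, `is_ranked` and `zipf`, and the example values, are hypothetical.

```python
def is_distribution(p, tol=1e-12):
    """True if p is non-negative and sums to 1 (within tol)."""
    return all(x >= 0 for x in p) and abs(sum(p) - 1.0) <= tol

def is_ranked(p):
    """True if the probabilities are in non-increasing order (assumed meaning)."""
    return all(a >= b for a, b in zip(p, p[1:]))

def zipf(n, s=1.0):
    """Zipf's Law on n ranks with exponent s: p_k proportional to 1 / k**s."""
    weights = [1.0 / k ** s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

z = zipf(5)                                # a model for a specific situation
jagged = [0.40, 0.25, 0.24, 0.10, 0.01]    # ranked, but follows no Zipf curve
```

Both `z` and `jagged` pass `is_ranked`, yet no single exponent `s` reproduces `jagged`: the drop from the first value to the second would need a much larger exponent than the near-tie between the second and third. A parametric model picks out a thin family from the set of ranked distributions; it does not describe the set.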

Because of matters such as these, there are times when no parametrically-defined distribution is applicable. When this happens, another approach is needed. The approach followed here is numerical generation.

This approach is made possible by one crucial result, reported on this site: the discovery of a linear bijection from the set of all distributions (of a given degree) to that of all ranked distributions. All of the other distributions reported here start with that result.
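
The bijection itself is not written out in this section. As a hedged illustration only, the sketch below implements one linear map with the stated property; it is an assumption that this resembles the result referred to above, and it again takes "ranked" to mean non-increasing probabilities and "degree" to mean the number of values. The names `to_ranked` and `from_ranked` are hypothetical. The map sends a distribution q to p, where p_i is the sum of q_k / k over k ≥ i, and its inverse is q_k = k (p_k − p_{k+1}) with p_{n+1} = 0.

```python
from fractions import Fraction

def to_ranked(q):
    """Map a distribution q to a ranked one: p_i = sum over k >= i of q_k / k."""
    n = len(q)
    p = [Fraction(0)] * n
    tail = Fraction(0)
    for k in range(n, 0, -1):            # accumulate from the last term back
        tail += Fraction(q[k - 1]) / k
        p[k - 1] = tail
    return p

def from_ranked(p):
    """Inverse map: q_k = k * (p_k - p_{k+1}), taking p_{n+1} = 0."""
    ext = list(p) + [Fraction(0)]
    return [k * (ext[k - 1] - ext[k]) for k in range(1, len(p) + 1)]

q = [Fraction(1, 10), Fraction(6, 10), Fraction(3, 10)]   # not ranked
p = to_ranked(q)                                          # ranked image of q
q_back = from_ranked(p)                                   # recovers q exactly
```

Exact fractions are used so that the round trip can be checked without rounding error. Under these assumptions, any way of generating distributions numerically gives, through such a map, a way of generating ranked distributions.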

What I am currently working on, and what this part of the site is about, is the development of the algorithms needed by numerical generation, not only as a method of calculation but also as a means of defining the distributions themselves.

Main Sets of Distributions

Operations on sets

Any set of distributions can be modified by the following:

[Figure: flow of logic]