Lovegrove Mathematicals

"Dedicated to making Likelinesses the entity of prime interest"

# Introduction

## Background

The usual concept of probability, the frequentist probability, is in practice never known. More than that, it is unknowable: a frequentist probability is defined as the limit of a sequence of relative frequencies, and it is not possible to deduce the limit of a convergent sequence from any finite number of its terms.

The value of a probability is known only when it is a given (such as being told that a coin is fair) or is calculated from givens. In other contexts, we have to work with estimates. Those estimates are too often still called 'the probability'; this is misleading because it encourages the estimates to be treated as if they were the probabilities.

The difficulty with this confusion is that we might then do things with those estimates which magnify any inherent errors, giving results which are seriously wrong. (This would not matter if they were truly the probabilities, since the errors would be zero, and many times zero is still zero.) For example, we might substitute into a non-linear formula such as the Multinomial Theorem. Substituting even a good estimate into the Multinomial Theorem can give large errors, not just of 10% or 20% but of a factor of 10 or 20 or more: you will see the theory and the consequences on this site.
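The size of this effect can be checked numerically. The sketch below is illustrative only: the three-category distribution, the estimate (each component within 10% of the true value) and the observed counts are made-up numbers, not results from this site, and `multinomial_pmf` is a helper name chosen for the example.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Multinomial probability of observing `counts` under `probs`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer multinomial coefficient
    return coef * prod(p ** c for p, c in zip(probs, counts))

# Assumed values, chosen only for illustration.
true_p = (0.5, 0.3, 0.2)       # the (normally unknowable) probabilities
estimate = (0.55, 0.27, 0.18)  # every component is within 10% of true_p
counts = (70, 20, 10)          # an outcome of 100 trials

exact = multinomial_pmf(counts, true_p)
approx = multinomial_pmf(counts, estimate)
ratio = approx / exact         # how far out the substituted estimate is

print(f"exact  = {exact:.3e}")
print(f"approx = {approx:.3e}")
print(f"ratio  = {ratio:.1f}")
```

With these numbers the substituted estimate is out by a factor of more than 10, even though no component of the estimate is more than 10% from the true value. The counts appear as exponents, so the relative errors in the components are compounded rather than averaged out.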

Problems are also encountered on the theoretical front, where confusion between estimates and the actual values can cause errors in the theory. Examples of this are the problem called "The Perfect Cube Factory" and the difficulties Johnson had with his "Combination Postulate".

Fortunately, it turns out that the entity commonly called 'the best estimate of a probability' can, despite its name, be defined without actually using the concept of probability. Of course, if we do this then we need another name for 'the best estimate of a probability', since there is no probability for it to be the best estimate of: on this site, we shall call it a Likeliness.

Having eliminated probabilities from our definition of likeliness, we are in the position of being able to define probability in terms of likeliness rather than the other way round. There is no need to do this, but it does have an historical and academic interest. This is a simple process: it is based on the idea that if we truly knew the probability then it would be independent of data. For example, if we truly knew that a coin was fair then it would not matter how many times it came down "H" or "T": if it's fair then it's fair, and that's the end of the matter.

What is meant by 'truly knew'? It means there is only one possibility: and that means that the underlying set (the set of distributions meeting the requirements of the problem) must be a singleton.

So we define our concept of probability as being a likeliness with a singleton underlying set.

We then look at the consequences of this definition and find that this concept of probability is indeed independent of data. Further weight is given to this approach by the observation that the Multinomial Theorem picks out singleton underlying sets as something uniquely special so far as its validity is concerned.
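The independence from data can be illustrated with a toy calculation. This page does not give the formal definition of a likeliness, so the sketch below uses a hypothetical stand-in: the function `likeliness` is assumed, purely for illustration, to be the likelihood-weighted average of the distributions in a finite underlying set. The coins and counts are made-up values.

```python
from math import prod

def likeliness(underlying_set, counts):
    """Hypothetical stand-in, NOT the site's definition: the average of the
    distributions in a finite underlying set, each weighted by how well it
    explains the observed counts."""
    weights = [prod(p ** c for p, c in zip(dist, counts))
               for dist in underlying_set]
    total = sum(weights)
    return tuple(
        sum(w * dist[i] for w, dist in zip(weights, underlying_set)) / total
        for i in range(len(counts))
    )

fair = (0.5, 0.5)  # (heads, tails)

# Singleton underlying set: we "truly know" the coin is fair.
singleton = [fair]
no_data = likeliness(singleton, (0, 0))
lopsided = likeliness(singleton, (9, 1))  # 9 heads, 1 tail

# Two-element underlying set: the coin is either fair or biased to heads.
two_coins = [fair, (0.8, 0.2)]
before = likeliness(two_coins, (0, 0))
after = likeliness(two_coins, (9, 1))

print(no_data, lopsided)  # identical: the data cannot move a singleton
print(before, after)      # different: the data shifts weight to the biased coin
```

Under this assumed definition the singleton case returns the fair coin whatever the data, because the single weight cancels out, whereas with two candidate coins the same data moves the result towards the heads-biased one.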

Having defined our basic concepts, we then investigate them further and find that we can make predictions (see the horse-racing results) and carry out analyses (such as of the Distribution of Distributions) which the more traditional approach could not adequately tackle.

There is an additional benefit. In many applications the underlying science does not lead to a closed, parametric formula to represent a generating distribution. Instead, it leads to a geometric shape: 'ranked', 'unimodal', 'U-shaped', etc. These concepts are not easily handled by the usual probabilistic parameter-based, formulaic techniques; in fact, they can rarely be handled at all. Likelinesses are set-oriented rather than formula-oriented; this gives them the flexibility to handle such concepts directly.

## Why the new word "Likeliness"?

• To retain the terminology "Best-estimate of the probability" would imply that probabilities came first, and Likelinesses came second.
• Because best-estimating probabilities is only one of the rôles of likelinesses.
• Because likelinesses exist independently of probabilities. The definition of "likeliness" does not involve the concept of probability.
• Because the term "best-estimate" is unpopular with a significant number of people: what is "best" to one person need not be "best" to someone else.
• Because the expression "Best estimate of the probability" is too much of a mouthful. We need a simple noun which conjures up an appropriate image.
• By removing confusion between probabilities and best-estimates of probabilities, various theoretical results can be developed which were not otherwise possible.

## Pros of likelinesses

• There's no getting away from them. Even if you are explicitly using probabilities, you are still using likelinesses: a probability is a special type of likeliness or, if you prefer, likeliness is a generalisation of probability. The Venn diagram showing the relationship between likeliness and probability is as in A, not B.

This means that you can never go wrong by asking for the likeliness rather than the probability, but you could by asking for the probability rather than the likeliness. So always ask for the likeliness; that is, make Likeliness the entity of prime interest.
• Likelinesses can deal directly with geometric concepts such as "ranked", "unimodal", "bell-shaped" without having to assume any particular parametrically-defined form. This will often allow an analytical approach to be taken which matches the actual problem, rather than having to use a simplified approximation forced by the parametric approach to probabilities. For example, there is no need to assume a parametrically-defined formula which forces a bell-shaped distribution when one wants a unimodal distribution; unimodal distributions per se are easily handled.
• There is no need to worry about whether or not a data-set is "large": there is no such concept for likelinesses, which (unlike frequentist probabilities) are not defined by a limiting process. Make your data-sets as small as you want.
• Likelinesses are not simply the old subject of best-estimation dressed in new clothes. Released from their rôle as estimators, likelinesses are free of questions about tolerances, precision and accuracy. Instead, they offer a fundamental approach which brings solutions to some of the major problems. It is this aspect which this site is all about.

## Cons of likelinesses

• With a few exceptions that are covered by Theorems, finding a likeliness needs to be carried out numerically, and can involve a lot of number-crunching. Desktop computers which can do this in an acceptable time have only recently become readily available so, with no practical way of carrying out the calculations, the whole subject-area has become a backwater. Consequently, there is no proprietary software available to do the necessary calculations: a vicious circle which needs breaking. For this reason, I am making my own program, "Great Likelinesses", freely available.
• Even so, algorithms are available only for a few important cases, although it is anticipated that more will be developed with time. This site is pump-priming rather than offering a fully-fledged package of solutions.
• Likelinesses might not always be of the expected shape. For example, if you are assuming an underlying set of distributions with 6 modes then the result might not have 6 modes. If the underlying set is convex then the result will always be of the expected shape, but if the underlying set is not convex (that is, it has a hole or dent) then the result might fall in the hole/dent: whether it does so or not depends on the data. Do not confuse Best-estimating with Best-fitting.
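The last point, about results falling in a hole or dent, can be illustrated with a toy calculation. The formal definition of a likeliness is not given on this page, so the sketch assumes a hypothetical stand-in: `weighted_average` takes a likeliness to be the likelihood-weighted average of a finite underlying set. The two distributions and the counts are made-up values; each distribution is monotone (steadily falling or steadily rising), and a set of just two points is not convex.

```python
from math import prod

def weighted_average(underlying_set, counts):
    """Hypothetical stand-in, NOT the site's definition: the average of the
    distributions in a finite underlying set, weighted by their likelihoods."""
    weights = [prod(p ** c for p, c in zip(dist, counts))
               for dist in underlying_set]
    total = sum(weights)
    return tuple(
        sum(w * dist[i] for w, dist in zip(weights, underlying_set)) / total
        for i in range(len(counts))
    )

def is_monotone(dist):
    """True if the values never rise, or never fall, from left to right."""
    pairs = list(zip(dist, dist[1:]))
    return all(a >= b for a, b in pairs) or all(a <= b for a, b in pairs)

# A non-convex underlying set: two monotone distributions, nothing in between.
monotone_set = [(0.7, 0.2, 0.1), (0.1, 0.2, 0.7)]

balanced = weighted_average(monotone_set, (1, 0, 1))  # data favours neither
one_sided = weighted_average(monotone_set, (5, 0, 0))  # data favours the first

print(balanced, is_monotone(balanced))
print(one_sided, is_monotone(one_sided))
```

Under this assumption the balanced data gives a U-shaped result that lies between the two members of the set and is not monotone, while the one-sided data gives a result that is monotone. Whether the result has the expected shape therefore depends on the data, as stated above.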