Frequency Distributions

Stephen Mildenhall

September 1999

1. Backgrounders

1.1 Moment Notation per [JKK] and [PW]

This section contains some basic definitions and notation.

The rth uncorrected moment (also called the moment about zero, raw moment, or moment about the origin) is
 $\mu'_r(X) = E(X^r)$.
The rth corrected moment (also called the moment about the mean, or central moment) is
 $\mu_r(X) = E((X-E(X))^r)$.
Define $\mu := E(X)$.
Variance is the second central moment, $\sigma^2 := \mu_2$.
The CV is $\sqrt{\mu_2}/\mu = \sigma_X/\mu$.
The index of skewness is $\alpha_3(X) = \sqrt{\beta_1(X)} = \mu_3/\mu_2^{3/2}$.
The index of kurtosis is $\alpha_4(X) = \beta_2(X) = \mu_4/\mu_2^2$.
Corrected from uncorrected moments:
 $\mu_r = E(X-E(X))^r = \sum_{j=0}^{r} (-1)^j \binom{r}{j} \mu'_{r-j}\mu^j$.
In particular:
 $\mu_2 = \mu'_2-\mu^2$
 $\mu_3 = \mu'_3-3\mu'_2\mu+2\mu^3$
 $\mu_4 = \mu'_4-4\mu'_3\mu+6\mu'_2\mu^2-3\mu^4$.
Note that you can do these in your head from the binomial coefficients; the only tricky term is the last one, where the $j = r-1$ and $j = r$ terms combine because $\mu'_1 = \mu$.
Raw moments in terms of the central moments:
 $\mu'_2 = \mu_2+\mu^2$
 $\mu'_3 = \mu_3+3\mu_2\mu+\mu^3$
 $\mu'_4 = \mu_4+4\mu_3\mu+6\mu_2\mu^2+\mu^4$.
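The conversion from raw to central moments can be sketched in a few lines of Python. This is only an illustration of the binomial formula for central moments; the function name `central_from_raw` and the Poisson test values are assumptions made for the example, not part of the original notes.

```python
from math import comb

def central_from_raw(raw):
    """Convert raw moments to central moments.

    raw[r] is the rth raw moment, with raw[0] = 1 and raw[1] = the mean.
    Entry r of the result is the rth central moment.
    """
    mean = raw[1]
    central = []
    for r in range(len(raw)):
        total = 0.0
        for j in range(r + 1):
            # (-1)^j * C(r, j) * mu'_{r-j} * mu^j
            total += (-1) ** j * comb(r, j) * raw[r - j] * mean ** j
        central.append(total)
    return central

# Raw moments of a Poisson with mean 2 (orders 0 to 4), used as a check.
poisson_raw = [1.0, 2.0, 6.0, 22.0, 94.0]
poisson_central = central_from_raw(poisson_raw)
print(poisson_central)
```

For the Poisson the second and third central moments both equal the mean, and the fourth is $3\mu^2+\mu$, which gives a quick sanity check on the output.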
The rth descending factorial moment is
 $\mu'_{[r]} = E(X!/(X-r)!)$.
[PW] call these simply factorial moments and denote them $\mu_{(r)}$.
Factorial moments in terms of uncorrected or raw moments:
 $\mu'_{[1]} = \mu$
 $\mu'_{[2]} = \mu'_2-\mu$
 $\mu'_{[3]} = \mu'_3-3\mu'_2+2\mu$
 $\mu'_{[4]} = \mu'_4-6\mu'_3+11\mu'_2-6\mu$.
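Raw moments in terms of factorial moments:
 $\mu'_1 = \mu$
 $\mu'_2 = \mu'_{[2]}+\mu$
 $\mu'_3 = \mu'_{[3]}+3\mu'_{[2]}+\mu$
 $\mu'_4 = \mu'_{[4]}+6\mu'_{[3]}+7\mu'_{[2]}+\mu$.
In general we have
 $\mu'_r = \sum_{j=1}^{r} S(r,j)\mu'_{[j]}$
where $S(r,j)$ are the Stirling numbers of the second kind.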
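As a hedged illustration of the general formula, the sketch below computes Stirling numbers of the second kind by their standard recurrence and rebuilds a raw moment from factorial moments. The helper names are made up for the example; the check uses the Poisson, whose jth factorial moment is the jth power of its mean.

```python
def stirling2(r, j):
    """Stirling number of the second kind, S(r, j) = j*S(r-1, j) + S(r-1, j-1)."""
    if r == j:
        return 1
    if j == 0 or j > r:
        return 0
    return j * stirling2(r - 1, j) + stirling2(r - 1, j - 1)

def raw_from_factorial(factorial_moments, r):
    """rth raw moment from factorial moments.

    factorial_moments[j] is the jth factorial moment; index 0 is unused.
    """
    return sum(stirling2(r, j) * factorial_moments[j] for j in range(1, r + 1))

# Poisson with mean theta: jth factorial moment is theta**j.
theta = 2.0
poisson_factorial = [1.0] + [theta ** j for j in range(1, 5)]
fourth_raw = raw_from_factorial(poisson_factorial, 4)
print(fourth_raw)
```

The coefficients 1, 7, 6, 1 picked up for r = 4 are the same ones that appear in the expression for the fourth raw moment above.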
The cumulants, or semi-invariants, $\kappa_r$ are defined as the coefficients of $t^r/r!$ in the Taylor expansion of the logarithm of the MGF (see below):
 $K_X(t) = \log M_X(t) = \sum_r \kappa_r t^r/r!$.
For independent X and Y, $\kappa_r(X+Y) = \kappa_r(X)+\kappa_r(Y)$.
Cumulants in terms of the central moments:
 $\kappa_1 = \mu$
 $\kappa_2 = \mu_2$
 $\kappa_3 = \mu_3$
 $\kappa_4 = \mu_4-3\mu_2^2$
 $\kappa_5 = \mu_5-10\mu_3\mu_2$

1.2 Generating Function Notation per [JKK]

The characteristic function is

 $\phi(t) = E(e^{itX})$.

The probability generating function is

 $G(z) = \sum_j P_j z^j = E(z^X)$,
where $P_j = \Pr(X = j)$. Thus $\phi(t) = G(e^{it})$. The moment generating function is $M(t) = G(e^t)$. The cumulant generating function is $K(t) = \ln G(e^t)$.

We have

 $\mu'_r = \left.\frac{d^r G(e^t)}{dt^r}\right|_{t=0}$.

Also, since the factorial moment generating function is

 $E((1+t)^X) = G(1+t)$
we have
 $\mu'_{[r]} = \left.\frac{d^r G(1+t)}{dt^r}\right|_{t=0}$.
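The factorial moment relation can be checked numerically. For a Poisson with parameter $\theta$, $G(1+t) = \exp(\theta t)$, so the rth derivative at zero is $\theta^r$. The sketch below computes the factorial moment directly from a truncated pmf; the function names and the truncation point are assumptions chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(theta, n_terms):
    """First n_terms probabilities of a Poisson(theta)."""
    return [exp(-theta) * theta ** x / factorial(x) for x in range(n_terms)]

def factorial_moment(pmf, r):
    """E[X(X-1)...(X-r+1)] computed directly from a (truncated) pmf."""
    total = 0.0
    for x, p in enumerate(pmf):
        falling = 1.0
        for i in range(r):
            falling *= (x - i)
        total += falling * p
    return total

theta = 1.5
pmf = poisson_pmf(theta, 60)
third = factorial_moment(pmf, 3)
print(third, theta ** 3)
```

The two printed numbers should agree up to truncation and rounding error.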

1.3 Mixtures and Stopped Sum Distributions per [JKK]

[JKK] write mixtures as

 $NB = \mathrm{Poisson}(\Theta) \wedge_\Theta \mathrm{Gamma}(\alpha,\beta)$.
The PGF of a mixture is the mixture of the PGFs.

Examples

A Gamma mixture of Poissons is a negative binomial.
An inverse Gaussian mixture of Poissons is a PIG. The Generalized IG distribution gives Sichel's distribution.
A Poisson mixture of Poissons is a Neyman Type A distribution. By Gurland it is also a Poisson-stopped sum of Poisson distributions.
A Beta mixture of NBs gives the Beta-Negative Binomial. The mixture is
 $\mathrm{NB}(k,P) \wedge_{p = Q^{-1}} \mathrm{Beta}(\alpha,\beta)$,
where $Q = 1+P$. Here $p := Q^{-1}$ has a beta distribution with pdf
 $\frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}$.
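The first example above (a Gamma mixture of Poissons is a negative binomial) follows directly from the rule that the PGF of a mixture is the mixture of the PGFs. The derivation below is a sketch that assumes $\mathrm{Gamma}(\alpha,\beta)$ has shape $\alpha$ and scale $\beta$; under a rate parameterization, replace $\beta$ by $1/\beta$.

```latex
\begin{align*}
G(z) &= \int_0^\infty e^{\theta(z-1)}\,
        \frac{\theta^{\alpha-1}e^{-\theta/\beta}}{\beta^{\alpha}\Gamma(\alpha)}\,d\theta \\
     &= \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\,
        \Gamma(\alpha)\left(\frac{1}{\beta}+1-z\right)^{-\alpha} \\
     &= (1+\beta-\beta z)^{-\alpha}, \qquad z < 1 + 1/\beta .
\end{align*}
```

This is the negative binomial PGF $(1+P-Pz)^{-k}$ with $k = \alpha$ and $P = \beta$.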
If the PGF can be written as $G_1(G_2(z))$ then Feller calls the result a ``generalized'' distribution. $F_1$ is the generalized distribution and $F_2$ is the generalizing distribution. When $F_1$ is Poisson, these are exactly the infinitely divisible distributions on the non-negative integers, by Levy's theorem. They are also called stopped sum distributions.

Write the distributions with a $\vee$, so $G_1(G_2(z))$ corresponds to $F_1 \vee F_2$. Note that

 $G_1(G_2(z)) \sim F_1 \vee F_2 \sim \mathrm{Count} \vee \mathrm{Severity}$.
Read $F_1 \vee F_2$ as the $F_1$-stopped summed-$F_2$ distribution. For example
 $NB = \mathrm{Poisson} \vee \mathrm{Logarithmic}$.
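The identity NB = Poisson $\vee$ Logarithmic can be checked at the PGF level. The sketch below composes a Poisson PGF with a logarithmic PGF and compares the result with a negative binomial PGF. The name `lam` for the Poisson parameter and the implied NB parameters $k = -\lambda/\ln(1-\theta)$ and $p = 1-\theta$ are choices worked out for this illustration, not notation from the notes.

```python
from math import exp, log

def logarithmic_pgf(z, theta):
    """PGF of the logarithmic distribution: ln(1 - theta*z) / ln(1 - theta)."""
    return log(1.0 - theta * z) / log(1.0 - theta)

def poisson_pgf(z, lam):
    """PGF of a Poisson with mean lam."""
    return exp(lam * (z - 1.0))

def negative_binomial_pgf(z, k, p):
    """PGF of the negative binomial in the (k, p) form, (p / (1 - q z))^k."""
    q = 1.0 - p
    return (p / (1.0 - q * z)) ** k

theta = 0.4
lam = 1.3
k = -lam / log(1.0 - theta)  # implied NB exponent
p = 1.0 - theta              # implied NB p, so q = theta

z_values = [0.0, 0.25, 0.5, 0.9]
stopped = [poisson_pgf(logarithmic_pgf(z, theta), lam) for z in z_values]
direct = [negative_binomial_pgf(z, k, p) for z in z_values]
for z, s, d in zip(z_values, stopped, direct):
    print(z, s, d)
```

Each row should show the same value twice, up to rounding.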

Theorem. Let distributions $F_1$, $F_2$ have pgf's $G_1(z) = \sum_k p_k z^k$ and $G_2(z)$, where $G_2(z)$ depends on a parameter $\phi$ in such a way that

 $G_2(z|k\phi) = (G_2(z|\phi))^k$.
Then the mixed distribution represented by
 $F_2(K\phi) \wedge_K F_1$
has the pgf
 $\sum_k p_k G_2(z|k\phi) = \sum_k p_k (G_2(z|\phi))^k = G_1(G_2(z|\phi))$
so
 $F_2 \wedge F_1 \sim F_1 \vee F_2'$.

For example, the Poisson, binomial and negative binomial distributions all have pgf's of the required form. For the negative binomial:

 $\left(\frac{p}{1-qz}\right)^{k\phi} = \left(\left(\frac{p}{1-qz}\right)^{\phi}\right)^{k}$.

2. Poisson Distribution

See [JKK] Chapter 4, especially section 3.

Parameter: $\theta$
 $\Pr(X = x) = e^{-\theta}\theta^x/x!$.

3. Negative Binomial Distribution

See [JKK] Chapter 5.

Parameters: k = r and p, with q := 1-p.
 $\Pr(X = x) = \binom{k+x-1}{k-1} p^k q^x = \frac{\Gamma(k+x)}{\Gamma(k)\,x!} p^k q^x$

[JKK] prefer a parameterization by P and k. They write Q = 1+P. Then p = 1/(1+P) = 1/Q. [PW] use r = k and $\beta$ = P. These give the following view.

Parameters: k and P, with Q = 1+P
 $\Pr(X = x) = \binom{k+x-1}{k-1} \left(1-\frac{P}{Q}\right)^k \left(\frac{P}{Q}\right)^x$
or
 $\Pr(X = x) = \binom{k+x-1}{k-1} \left(\frac{1}{1+P}\right)^k \left(\frac{P}{1+P}\right)^x$
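The two parameterizations describe the same probabilities, which the following sketch confirms numerically. It uses `math.lgamma` so that non-integer k is allowed; the function names `nb_pmf_kp` and `nb_pmf_kP` and the parameter values are illustrative assumptions.

```python
from math import exp, lgamma, log

def nb_log_coefficient(x, k):
    """log of Gamma(k + x) / (Gamma(k) * x!)."""
    return lgamma(k + x) - lgamma(k) - lgamma(x + 1)

def nb_pmf_kp(x, k, p):
    """Negative binomial probability in the (k, p) form."""
    q = 1.0 - p
    return exp(nb_log_coefficient(x, k) + k * log(p) + x * log(q))

def nb_pmf_kP(x, k, P):
    """Negative binomial probability in the (k, P) form, with Q = 1 + P."""
    Q = 1.0 + P
    return exp(nb_log_coefficient(x, k) + k * log(1.0 - P / Q) + x * log(P / Q))

k, P = 2.5, 0.8
p = 1.0 / (1.0 + P)
probs_p = [nb_pmf_kp(x, k, p) for x in range(200)]
probs_P = [nb_pmf_kP(x, k, P) for x in range(200)]
mean = sum(x * pr for x, pr in enumerate(probs_P))
print(sum(probs_P), mean, k * P)
```

The truncated probabilities should sum to about 1 and the mean should be close to kP.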

4. Logarithmic Distribution

See [JKK] Chapter 7. This is a single parameter family supported on the positive integers. The parameter is $\theta$. Letting $a = -1/\ln(1-\theta)$ we have

 $\mu = \frac{a\theta}{1-\theta}$.
This distribution is not easy to deal with.

Parameter: $0 < \theta < 1$
 $\Pr(X = x) = a\theta^x / x$
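A minimal sketch of the logarithmic probabilities, checking that they sum to one and reproduce the mean formula above. The function name and the truncation at 400 terms are assumptions made for the example.

```python
from math import log

def logarithmic_pmf(x, theta):
    """Pr(X = x) = a * theta**x / x for x = 1, 2, ..., with a = -1/ln(1 - theta)."""
    a = -1.0 / log(1.0 - theta)
    return a * theta ** x / x

theta = 0.6
support = range(1, 400)
probs = [logarithmic_pmf(x, theta) for x in support]
total = sum(probs)
mean = sum(x * pr for x, pr in zip(support, probs))
expected_mean = (-1.0 / log(1.0 - theta)) * theta / (1.0 - theta)
print(total, mean, expected_mean)
```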

5. Stopped Sum Distributions

Neyman Type A: Poisson-stopped sum of Poissons. Limited, since the ratio of skewness to kurtosis falls in a tight range. There is no closed form expression for the density, but it is easy to compute using FFT methods. See also the other Neyman distributions. See [JKK] Chapter 9, Section 6.
Thomas's distribution is a Neyman Type A where the summed distribution is a shifted Poisson, ensuring that each occurrence yields at least one claim. See page 392.
Polya-Aeppli distribution is a Poisson-stopped sum of shifted geometric distributions. The geometric distribution is an NB with k = 1, so its variance multiplier equals $\mu+1$. Could be useful for clash, but the ``number of claims per occurrence'' distribution is very limited. Again, there is no closed form for the probabilities but they are easy to estimate using FFT. See page 378.
Poisson-Pascal distribution, also called the generalized Polya-Aeppli distribution, is a Poisson-stopped sum of negative binomial distributions. It can also be regarded as a mixture of negative binomial (k,P)'s where k has a Poisson distribution. See page 382.
The Generalized Poisson-Pascal distribution ([PW] page 259) is a Poisson-stopped sum of truncated (at zero) negative binomial distributions. The PGF is obvious. Per an interesting table on p 253 of [PW] we have the following formulae for the third moments about the mean.
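 Poisson:
 $\mu_3 = 3\sigma^2-2\mu$
 Poisson Binomial:
 $\mu_3 = 3\sigma^2-2\mu+\frac{m-2}{m-1}\frac{(\sigma^2-\mu)^2}{\mu}$
 Negative Binomial:
 $\mu_3 = 3\sigma^2-2\mu+2\frac{(\sigma^2-\mu)^2}{\mu}$
 Polya-Aeppli:
 $\mu_3 = 3\sigma^2-2\mu+\frac{3}{2}\frac{(\sigma^2-\mu)^2}{\mu}$
 Neyman Type A:
 $\mu_3 = 3\sigma^2-2\mu+\frac{(\sigma^2-\mu)^2}{\mu}$
 Generalized PP:
 $\mu_3 = 3\sigma^2-2\mu+\frac{r+2}{r+1}\frac{(\sigma^2-\mu)^2}{\mu}$
Here m in the Poisson Binomial line is the binomial index. Note that $r > -1$ in the last line gives a great deal of flexibility.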
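The Neyman Type A line can be verified numerically. The notes suggest FFT methods for the probabilities; the sketch below swaps in the compound Poisson (Panjer-type) recursion instead, because it needs only the standard library. The parameter names `lam` and `phi`, the helper names and the truncation points are all assumptions for this illustration.

```python
from math import exp, factorial

def poisson_pmf(mean, n_terms):
    """First n_terms probabilities of a Poisson with the given mean."""
    return [exp(-mean) * mean ** x / factorial(x) for x in range(n_terms)]

def poisson_stopped_sum(lam, summand_pmf, n_terms):
    """pmf of a Poisson(lam)-stopped sum via the compound Poisson recursion:
    g_0 = exp(lam * (f_0 - 1)),  g_n = (lam / n) * sum_{j=1..n} j * f_j * g_{n-j}.
    """
    f = summand_pmf
    g = [exp(lam * (f[0] - 1.0))]
    for n in range(1, n_terms):
        s = 0.0
        for j in range(1, min(n, len(f) - 1) + 1):
            s += j * f[j] * g[n - j]
        g.append(lam * s / n)
    return g

def first_three_central_moments(pmf):
    """Mean, variance and third central moment of a pmf on 0, 1, 2, ..."""
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    third = sum((x - mean) ** 3 * p for x, p in enumerate(pmf))
    return mean, var, third

lam, phi = 2.0, 1.5  # Poisson(lam)-stopped sum of Poisson(phi) variables
neyman_a = poisson_stopped_sum(lam, poisson_pmf(phi, 60), 150)
mean, var, third = first_three_central_moments(neyman_a)
formula_third = 3.0 * var - 2.0 * mean + (var - mean) ** 2 / mean
print(mean, var, third, formula_third)
```

The same `poisson_stopped_sum` helper could be pointed at a shifted geometric or negative binomial summand to explore the other lines of the list.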

Beta-Negative Binomial, an NB mixed over the variance multiplier distributed as a beta, should have a lot of potential as a distribution. However, the PGF involves the hypergeometric function ${}_2F_1$, which makes it very hard to deal with.

7. Generalized Poisson-Pascal Distribution, [PW]

The GPP is a Poisson stopped-sum of extended truncated negative binomial (ETNB) distributions. It is a three parameter distribution. It has PGF

 $G(z) = \exp\left(\theta\left(\frac{(1+P-Pz)^{-r}-(1+P)^{-r}}{1-(1+P)^{-r}}-1\right)\right)$.

Note that provided $1-(1+P)^{-r} = 1-p^r > 0$

 $G(z) = \exp\left(\frac{\theta}{1-(1+P)^{-r}}\left((1+P-Pz)^{-r}-1\right)\right)$
is a valid PGF for a Poisson-Negative Binomial (Poisson-Pascal). The condition is necessary so that the adjusted Poisson frequency $\theta/(1-p^r)$ is non-negative. Thus in the Poisson-Pascal case the distribution can be regarded as a Poisson-NB without zero truncation, or a Poisson-ZTNB, with an adjusted primary Poisson frequency.
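The step between the two forms of the PGF is a one-line simplification of the exponent. Writing $A = (1+P-Pz)^{-r}$ and $B = (1+P)^{-r}$ (shorthand introduced only for this sketch):

```latex
\begin{align*}
\frac{A-B}{1-B}-1 &= \frac{A-B-(1-B)}{1-B} = \frac{A-1}{1-B}, \\
\theta\left(\frac{A-B}{1-B}-1\right) &= \frac{\theta}{1-(1+P)^{-r}}\left((1+P-Pz)^{-r}-1\right).
\end{align*}
```

So the zero-truncated form with Poisson parameter $\theta$ and the untruncated form with Poisson parameter $\theta/(1-(1+P)^{-r})$ have the same PGF.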

Special cases of the GPP include:

r = 1 is a Poisson-Geometric
r > 0 is a Poisson-Pascal, aka Poisson-Negative Binomial
-1 < r < 0 is a Poisson-ETNB, and you need the zero truncation.
r = -1/2 is a Poisson-Inverse Gaussian mixture.

8. PIG and GPIG Distributions

The PIG is a Poisson mixed over an Inverse Gaussian distribution. The PIG is closed under certain convolutions, see [PW]. It has a thicker tail than the Negative Binomial distribution. It is a special case of the generalized Poisson-Pascal distribution with r = -1/2.

References for this section are from [PW], Section 7.8.3.

Per [PW] page 260, the PIG is a Poisson ETNB.

Per [PW] page 261, the Poisson ETNB with $-1 < r < 0$ is a Poisson mixture with a stable distribution (see also Feller p 448, 581).

PIG Parameters: $\mu$ and $\beta$.
See below with $\lambda = -1/2$. The Generalized Poisson inverse Gaussian distribution is also called Sichel's distribution.
Sichel's Distribution Parameters: $\mu$, $\beta$ and $\lambda$.
 $\Pr(X = n) = \frac{\mu^n}{n!}\,\frac{K_{\lambda+n}(\mu\beta^{-1}\sqrt{1+2\beta})}{K_\lambda(\mu\beta^{-1})}\,(1+2\beta)^{-(\lambda+n)/2}$.
The rth factorial moment is
 $\mu_{[r]} = \mu^r\,\frac{K_{\lambda+r}(\mu/\beta)}{K_\lambda(\mu/\beta)}$.

The Bessel function used is the modified Bessel function of the third (second according to some sources!) kind, $K_\lambda(x)$. It is built into Excel for integral $\lambda$, and is available in MathFunctions as nrBesselK(n,x) for any real n and $x \in R$, and also in Matlab as BesselK, again for any n and any $x \in C$.

Matlab mentions that their BesselK uses a MEX interface to a Fortran library by D. E. Amos, which is available on the web under www.netlib.com (search for amos).

9. Recursive Classes of Distributions

The (a,b) recursion is

 $p_n = p_{n-1}(a+b/n)$.
For (a,b,0) the recursion is valid for n = 1,2,3,.... For (a,b,1) the recursion is valid for n = 2,3,4,....

The (a,b) classes fall into two sub-groups.

(a,b,0) distributions are supported on the non-negative integers. They are specified through a, b, and $p_0$.
(a,b,1) distributions are supported on the positive integers. There are two sub-sub-classes. The zero-truncated distributions have zero probability at zero. These include the zero-truncated Poisson, logarithmic and negative binomial distributions. The zero-modified distributions are a weighting of a degenerate distribution with a zero-truncated class. For the negative binomial, there is slightly more flexibility in the choice of parameters for the truncated distribution, so it is sometimes called the ``extended truncated negative binomial distribution''. Normally we have parameters r = k and $\beta$ = P = q/p, with mean rP and variance multiplier 1+P = Q. In the ETNB we must still have $\beta > 0$, so the apparent variance multiplier is greater than 1. However, we can have $-1 < r < 0$, which in the usual (untruncated) case would translate into a negative mean. Also, if $r < 0$ then the probability of a zero loss would be $p^r > 1$, which is also impossible, since $p < 1$ always. (Recall p = 1/(1+P) = 1/vm.)
See the nice table on p 229 of [PW] for a good summary of the options. See also pages 250-251 for a chart showing the relationships between the various distributions.
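The (a,b,0) recursion is simple to code. The sketch below generates probabilities from a, b and $p_0$ and applies it to the Poisson (a = 0, b = $\theta$) and to the negative binomial in the (k,p) form (a = q, b = (k-1)q), which follow from the recursions listed in the data tables. The function name `ab0_pmf` and the parameter values are assumptions for illustration.

```python
from math import exp

def ab0_pmf(a, b, p0, n_terms):
    """Probabilities p_0, ..., p_{n_terms-1} from p_n = p_{n-1} * (a + b / n)."""
    probs = [p0]
    for n in range(1, n_terms):
        probs.append(probs[-1] * (a + b / n))
    return probs

# Poisson: a = 0, b = theta, p_0 = exp(-theta)
theta = 2.0
poisson = ab0_pmf(0.0, theta, exp(-theta), 100)

# Negative binomial (k, p): a = q, b = (k - 1) * q, p_0 = p**k
k, p = 2.5, 0.6
q = 1.0 - p
neg_bin = ab0_pmf(q, (k - 1.0) * q, p ** k, 400)

poisson_mean = sum(n * pr for n, pr in enumerate(poisson))
neg_bin_mean = sum(n * pr for n, pr in enumerate(neg_bin))
print(sum(poisson), poisson_mean)
print(sum(neg_bin), neg_bin_mean, k * q / p)
```

An (a,b,1) version would take $p_1$ as the starting value and begin the loop at n = 2.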

Data Tables

Poisson Distribution Key Facts

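 Item                              Poisson Distribution
 Mean                              $\theta$
 Variance                          $\theta$
 $\theta$                          m                         n/a
 $\mu_3$                           $\theta$
 $\mu_4$                           $3\theta^2+\theta$
 CV                                $1/\sqrt{\theta}$
 Skewness                          $1/\sqrt{\theta}$
 Kurtosis                          $3+1/\theta$
 PGF $G(z)$                        $\exp(\theta(z-1))$
 Characteristic function $\phi(t)$ $\exp(\theta(e^{it}-1))$
 Recursions
   $p_0$                           $\exp(-\theta)$
   $p_n$                           $p_{n-1}\theta/n$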

Negative Binomial (r = k,p) Key Facts

 Item                              NB Distribution
 Mean                              $kq/p$
 Variance                          $kq/p^2$
 VM View, m and v
   p                               $1/v$
   k                               $m/(v-1)$
 Contagion View, m and c
   p                               $1/(1+cm)$
   k                               $1/c$
 $\mu_3$                           $\frac{kq(1+q)}{p^3}$
 $\mu_4$                           $\frac{3k^2q^2}{p^4}+\frac{kq(p^2+6q)}{p^4}$
 CV                                $1/\sqrt{kq}$
 Skewness                          $\frac{1+q}{\sqrt{kq}}$
 Kurtosis                          $3+\frac{p^2+6q}{kq}$
 PGF $G(z)$                        $(p/(1-qz))^r$
 Characteristic function $\phi(t)$ $(p/(1-qe^{it}))^r$
 Recursions
   $p_0$                           $p^r$
   $p_n$                           $p_{n-1}(k+n-1)q/n$
   $p_{n+1}$                       $p_n(k+n)q/(n+1)$
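The VM and contagion views in the table are just two ways of solving for (k, p) from a target mean and a measure of dispersion. The sketch below reads v as the variance multiplier (variance = v times mean) and c as the contagion (variance = m(1 + cm)), which is consistent with the table entries; the function names are assumptions for the example.

```python
def nb_from_mean_and_vm(m, v):
    """(k, p) from mean m and variance multiplier v, per the VM view."""
    return m / (v - 1.0), 1.0 / v

def nb_from_mean_and_contagion(m, c):
    """(k, p) from mean m and contagion c, per the contagion view."""
    return 1.0 / c, 1.0 / (1.0 + c * m)

def nb_mean_variance(k, p):
    """Mean kq/p and variance kq/p^2 of the negative binomial (k, p)."""
    q = 1.0 - p
    return k * q / p, k * q / p ** 2

k_vm, p_vm = nb_from_mean_and_vm(10.0, 3.0)
mean_vm, var_vm = nb_mean_variance(k_vm, p_vm)

k_c, p_c = nb_from_mean_and_contagion(10.0, 0.2)
mean_c, var_c = nb_mean_variance(k_c, p_c)
print(k_vm, p_vm, mean_vm, var_vm)
print(k_c, p_c, mean_c, var_c)
```

With these inputs both views describe the same distribution, since a contagion of 0.2 at a mean of 10 corresponds to a variance multiplier of 3.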

Negative Binomial (k,P) Key Facts

 Item                              NB Distribution
 Mean                              $kP$
 Variance                          $kP(1+P)$
 VM View, m and v
   P                               $v-1$
   k                               $m/(v-1)$
 Contagion View, m and c
   P                               $cm$
   k                               $1/c$
 $\mu_3$                           $kP(1+P)(1+2P)$
 $\mu_4$                           $3k^2P^2(1+P)^2+kP(1+P)(1+6P+6P^2)$
 CV                                $((1+P)/(kP))^{1/2}$
 Skewness                          $\frac{1+2P}{(kP(1+P))^{1/2}}$
 Kurtosis                          $3+\frac{1+6P+6P^2}{kP(1+P)}$
 PGF $G(z)$                        $(1+P-Pz)^{-k}$
 MGF $\phi(t)$
 Recursion
   $p_0$                           $Q^{-k}$
   $p_{n+1}$                       $\frac{k+n}{n+1}\,\frac{P}{1+P}\,p_n$

PIG Distribution Key Facts

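 Item                              PIG Distribution
 Mean                              $\mu$
 Variance                          $\mu(\beta+1)$
 VM View, m and v
   $\mu$                           m
   $\beta$                         $v-1$
 Contagion View, m and c
   $\mu$                           m
   $\beta$                         $cm$
 $\mu_3$
 $\mu_4$
 CV
 Skewness
 Kurtosis
 PGF $G(z)$                        $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta(1-z)}-1\right)\right)$
 MGF $\phi(t)$
 Recursion
   $p_0$                           $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta}-1\right)\right)$
   $p_1$                           $\frac{\mu}{\sqrt{1+2\beta}}\,p_0$
   $p_n$                           $\frac{\beta}{1+2\beta}\left(2-\frac{3}{n}\right)p_{n-1}+\frac{\mu^2}{1+2\beta}\,\frac{1}{n(n-1)}\,p_{n-2}$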
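The PIG recursion in the table avoids Bessel functions altogether, so the probabilities can be generated with elementary arithmetic. The sketch below reads $p_1$ as $\mu p_0/\sqrt{1+2\beta}$, which is what differentiating the PGF at zero gives; the function name, parameter values and truncation point are assumptions for illustration.

```python
from math import exp, sqrt

def pig_pmf(mu, beta, n_terms):
    """First n_terms (at least 2) PIG probabilities via the two-term recursion."""
    root = sqrt(1.0 + 2.0 * beta)
    probs = [exp(-(mu / beta) * (root - 1.0))]  # p_0
    probs.append(mu / root * probs[0])           # p_1
    for n in range(2, n_terms):
        term1 = beta / (1.0 + 2.0 * beta) * (2.0 - 3.0 / n) * probs[n - 1]
        term2 = mu ** 2 / (1.0 + 2.0 * beta) / (n * (n - 1)) * probs[n - 2]
        probs.append(term1 + term2)
    return probs

mu, beta = 2.0, 0.5
probs = pig_pmf(mu, beta, 300)
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
print(total, mean, variance)
```

The printed mean and variance should be close to $\mu$ and $\mu(\beta+1)$, matching the first two rows of the table.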

Generalized PIG Distribution Key Facts

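 Item                              Generalized PIG Distribution
 Mean                              $\mu$
 Variance                          $\mu(\beta+1)$
 VM View, m and v
   $\mu$                           m
   $\beta$                         $v-1$
 Contagion View, m and c
   $\mu$                           m
   $\beta$                         $cm$
 $\mu_3$
 $\mu_4$
 CV
 Skewness
 Kurtosis
 PGF $G(z)$                        $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta(1-z)}-1\right)\right)$
 MGF $\phi(t)$
 Recursion
   $p_0$                           $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta}-1\right)\right)$
   $p_1$                           $\frac{\mu}{\sqrt{1+2\beta}}\,p_0$
   $p_n$                           $\frac{\beta}{1+2\beta}\left(2-\frac{3}{n}\right)p_{n-1}+\frac{\mu^2}{1+2\beta}\,\frac{1}{n(n-1)}\,p_{n-2}$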

[JKK] Johnson, N. L., Kotz, S. and Kemp, A. W., Univariate Discrete Distributions, 2nd ed., John Wiley and Sons, 1992.
