Frequency Distributions


Stephen Mildenhall

September 1999





1. Backgrounders

1.1 Moment Notation per [JKK] and [PW]

This section contains some basic definitions and notation.

The rth uncorrected moment, moment about zero, raw moment or moment about the origin is
$\mu'_r(X) = E(X^r)$.
The rth corrected moment, moment about the mean, or central moment is
$\mu_r(X) = E((X-E(X))^r)$.
Define $\mu := E(X)$.
Variance is the second central moment, $\sigma^2 := \mu_2$.
CV is $\sqrt{\mu_2}/\mu = \sigma_X/\mu$.
Index of skewness is $\alpha_3(X) = \sqrt{\beta_1(X)} = \mu_3/\mu_2^{3/2}$.
Index of kurtosis is $\alpha_4(X) = \beta_2(X) = \mu_4/\mu_2^2$.
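Corrected from uncorrected moments:
$\mu_r = E(X-E(X))^r = \sum_{j=0}^{r} (-1)^j \binom{r}{j}\, \mu'_{r-j}\, \mu^j$.
In particular:
$\mu_2 = \mu'_2 - \mu^2$
$\mu_3 = \mu'_3 - 3\mu'_2\mu + 2\mu^3$
$\mu_4 = \mu'_4 - 4\mu'_3\mu + 6\mu'_2\mu^2 - 3\mu^4$.
Note that you can do these in your head from the binomial coefficients, remembering that the last two terms of the sum combine because $\mu'_1 = \mu$; that is the only tricky part!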
Raw moments in terms of the central moments:
$\mu'_2 = \mu_2 + \mu^2$
$\mu'_3 = \mu_3 + 3\mu_2\mu + \mu^3$
$\mu'_4 = \mu_4 + 4\mu_3\mu + 6\mu_2\mu^2 + \mu^4$.
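
The binomial formula for central moments can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `central_from_raw` is hypothetical, and the test case uses the raw moments of a Poisson with parameter 2.

```python
from math import comb

def central_from_raw(raw):
    """Central moments from raw moments.

    raw[r] = E(X^r) for r = 0..n, with raw[0] = 1.
    Uses mu_r = sum_j (-1)^j C(r,j) raw[r-j] mu^j.
    """
    mu = raw[1]
    return [
        sum((-1) ** j * comb(r, j) * raw[r - j] * mu ** j for j in range(r + 1))
        for r in range(len(raw))
    ]

# Raw moments of a Poisson with parameter 2: E(X), E(X^2), E(X^3), E(X^4).
poisson_raw = [1, 2, 6, 22, 94]
print(central_from_raw(poisson_raw))  # [1, 0, 2, 2, 14]
```

The fourth central moment, 14, agrees with the Poisson value of three times the squared parameter plus the parameter.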
The rth descending factorial moment is
$\mu_{[r]} = E(X!/(X-r)!)$.
[PW] call these simply factorial moments and denote them $\mu_{(r)}$.
Factorial moments in terms of uncorrected or raw moments:
$\mu_{[1]} = \mu$
$\mu_{[2]} = \mu'_2 - \mu$
$\mu_{[3]} = \mu'_3 - 3\mu'_2 + 2\mu$
$\mu_{[4]} = \mu'_4 - 6\mu'_3 + 11\mu'_2 - 6\mu$.
Raw moments in terms of factorial moments:
$\mu'_1 = \mu_{[1]}$
$\mu'_2 = \mu_{[2]} + \mu$
$\mu'_3 = \mu_{[3]} + 3\mu_{[2]} + \mu$
$\mu'_4 = \mu_{[4]} + 6\mu_{[3]} + 7\mu_{[2]} + \mu$.
In general we have
$\mu'_r = \sum_{j=1}^{r} S(r,j)\, \mu_{[j]}$
where $S(r,j)$ are the Stirling numbers of the second kind.
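
The coefficients 1, 7, 6, 1 in the fourth raw moment are the Stirling numbers S(4,j). A small Python sketch, using the standard recurrence S(r,j) = j S(r-1,j) + S(r-1,j-1), reproduces them; the function names are hypothetical and the Poisson example (factorial moments equal to powers of the parameter) is used only as a check.

```python
def stirling2(r, j):
    """Stirling number of the second kind, by the standard recurrence."""
    if r == j:
        return 1
    if j == 0 or j > r:
        return 0
    return j * stirling2(r - 1, j) + stirling2(r - 1, j - 1)

def raw_from_factorial(fact):
    """Raw moments from factorial moments; fact[j] = mu_[j], fact[0] = 1."""
    return [
        sum(stirling2(r, j) * fact[j] for j in range(r + 1))
        for r in range(len(fact))
    ]

print([stirling2(4, j) for j in range(1, 5)])   # [1, 7, 6, 1]
# Poisson with parameter 2 has factorial moments 2^j.
print(raw_from_factorial([1, 2, 4, 8, 16]))     # [1, 2, 6, 22, 94]
```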
The cumulants, or semi-invariants, are defined as the coefficients of $t^r/r!$ in the Taylor expansion of the logarithm of the MGF (see below):
$K_X(t) = \log M_X(t) = \sum_{r \ge 1} \kappa_r t^r/r!$.
For independent X and Y, $\kappa_r(X+Y) = \kappa_r(X)+\kappa_r(Y)$.
Cumulants in terms of the central moments:
$\kappa_1 = \mu$
$\kappa_2 = \mu_2$
$\kappa_3 = \mu_3$
$\kappa_4 = \mu_4 - 3\mu_2^2$
$\kappa_5 = \mu_5 - 10\mu_3\mu_2$
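
As a quick sanity check on these formulas, every cumulant of a Poisson equals its parameter. The Python sketch below is illustrative only; the function name is hypothetical, and the Poisson central moments used for parameter 2 are 2, 2, 14 and 42.

```python
def cumulants_from_central(mean, mu2, mu3, mu4, mu5):
    """First five cumulants from the mean and central moments mu_2..mu_5."""
    return [mean, mu2, mu3, mu4 - 3 * mu2 ** 2, mu5 - 10 * mu3 * mu2]

# Poisson with parameter 2: mu_2 = 2, mu_3 = 2, mu_4 = 3*4 + 2, mu_5 = 10*4 + 2.
print(cumulants_from_central(2, 2, 2, 14, 42))  # [2, 2, 2, 2, 2]
```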
Generating Function Notation per [JKK]

The characteristic function is

$\phi(t) = E(e^{itX})$.

The probability generating function is

$G(z) = \sum_j P_j z^j = E(z^X)$,
where $P_j = \Pr(X = j)$. Thus $\phi(t) = G(e^{it})$. The moment generating function is $M(t) = G(e^t)$. The cumulant generating function is $K(t) = \ln G(e^t)$.

We have

$\mu'_r = \left.\frac{d^r G(e^t)}{dt^r}\right|_{t=0}$.

Also, since the factorial moment generating function is

$E((1+t)^X) = G(1+t)$
we have
$\mu_{[r]} = \left.\frac{d^r G(1+t)}{dt^r}\right|_{t=0}$.
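
For the Poisson, $G(1+t) = \exp(\theta t)$, so the rth factorial moment should be $\theta^r$. The Python sketch below checks this by direct summation over a truncated probability vector; the helper names and the truncation point are assumptions made for the illustration.

```python
from math import exp

def poisson_pmf(theta, n_max):
    """Poisson probabilities p_0..p_n_max, built by p_n = p_{n-1} * theta / n."""
    p = [exp(-theta)]
    for n in range(1, n_max + 1):
        p.append(p[-1] * theta / n)
    return p

def factorial_moment(pmf, r):
    """E[X(X-1)...(X-r+1)] for a pmf given as a list indexed by x."""
    total = 0.0
    for x, px in enumerate(pmf):
        falling = 1
        for i in range(r):
            falling *= x - i
        total += falling * px
    return total

pmf = poisson_pmf(2.0, 100)
print(factorial_moment(pmf, 3))  # close to 2**3 = 8
```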

Mixtures and Stopped Sum distributions per [JKK]

[JKK] write mixtures as

$\mathrm{NB} = \mathrm{Poisson}(\Theta) \bigwedge_{\Theta} \mathrm{Gamma}(\alpha,\beta)$.
The PGF of a mixture is the mixture of the PGFs.

Examples

A Gamma mixture of Poissons is a negative binomial.
An inverse Gaussian mixture of Poissons is a PIG. The Generalized IG distribution gives Sichel's distribution.
A Poisson mixture of Poissons is a Neyman Type A distribution. By Gurland it is also a Poisson-stopped sum of Poisson distributions.
A Beta mixture of NBs gives the Beta-Negative Binomial. The mixture is
$\text{Beta-NB} = \mathrm{NB}(k,P) \bigwedge_{p = Q^{-1}} \mathrm{Beta}(\alpha,\beta)$,
where $Q = 1+P$. Here $p := Q^{-1}$ has a beta distribution with pdf
$\frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}$.
If the PGF can be written as $G_1(G_2(z))$ then Feller calls the result a ``generalized'' distribution. $F_1$ is the generalized distribution and $F_2$ is the generalizing distribution. When $F_1$ is Poisson, these are the infinitely divisible distributions on the non-negative integers, by Levy's theorem. They are also called stopped sum distributions.

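Write the distributions with a $\bigvee$, so: $G_1(G_2(z))$ corresponds to $F_1 \bigvee F_2$. Note that

$G_1(G_2(z)) \sim F_1 \bigvee F_2 \sim \text{Count} \bigvee \text{Severity}$.
Say $F_1 \bigvee F_2$ as the $F_1$-stopped summed-$F_2$ distribution. For example
$\mathrm{NB} = \mathrm{Poisson} \bigvee \mathrm{Logarithmic}$.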
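
The last identity can be checked by composing PGFs. The Python sketch below assumes the standard parameter matching, which is not spelled out in these notes: a logarithmic severity with parameter $\theta = q$ and a Poisson count with mean $-k \ln p$. All function names are hypothetical.

```python
from math import exp, log

def pgf_logarithmic(z, theta):
    """PGF of the logarithmic distribution: ln(1 - theta z) / ln(1 - theta)."""
    return log(1 - theta * z) / log(1 - theta)

def pgf_poisson(z, lam):
    """PGF of the Poisson with mean lam."""
    return exp(lam * (z - 1))

def pgf_negbin(z, k, p):
    """PGF of the negative binomial in the (k, p) form: (p / (1 - q z))^k."""
    q = 1 - p
    return (p / (1 - q * z)) ** k

k, p = 2.5, 0.4
q = 1 - p
lam = -k * log(p)  # assumed matching of the Poisson mean
for z in (0.0, 0.3, 0.9):
    stopped_sum = pgf_poisson(pgf_logarithmic(z, q), lam)
    print(z, stopped_sum, pgf_negbin(z, k, p))  # the two columns agree
```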


Theorem. Let distributions $F_1$, $F_2$ have pgf's $G_1(z) = \sum_k p_k z^k$ and $G_2(z)$, where $G_2(z)$ depends on a parameter $\phi$ in such a way that

$G_2(z|k\phi) = (G_2(z|\phi))^k$.
Then the mixed distribution represented by
$F_2(K\phi) \bigwedge_K F_1$
has the pgf

$\sum_k p_k G_2(z|k\phi) = \sum_k p_k (G_2(z|\phi))^k = G_1(G_2(z|\phi))$
so
$F_2 \bigwedge F_1 \sim F_1 \bigvee F_2$.

For example, the Poisson, binomial and negative binomial distributions all have pgf's of the required form; for the negative binomial,

$\left(\frac{p}{1-qz}\right)^{k\phi} = \left(\left(\frac{p}{1-qz}\right)^{k}\right)^{\phi}$.

2. Poisson Distribution

See [JKK] Chapter 4, especially section 3.

Parameter: $\theta$
$\Pr(X = x) = \exp(-\theta)\,\theta^x/x!$.

3. Negative Binomial Distribution

See [JKK] Chapter 5.

Parameters: $k = r$ and $p$, $q := 1-p$.
$\Pr(X = x) = \binom{k+x-1}{k-1} p^k q^x = \frac{\Gamma(k+x)}{\Gamma(k)\,x!}\, p^k q^x$

[JKK] prefer a parameterization by $P$ and $k$. They write $Q = 1+P$. Then $p = 1/(1+P) = 1/Q$. [PW] use $r = k$ and $\beta = P$. These give the following view.

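Parameters: $k$ and $P$, $Q = 1+P$
$\Pr(X = x) = \binom{k+x-1}{k-1} \left(1-\frac{P}{Q}\right)^{k} \left(\frac{P}{Q}\right)^{x}$
or
$\Pr(X = x) = \binom{k+x-1}{k-1} \left(\frac{1}{1+P}\right)^{k} \left(\frac{P}{1+P}\right)^{x}$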
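
A short Python sketch showing that the $(k,p)$ and $(k,P)$ forms give the same probabilities, using the Gamma-function form so that non-integer $k$ is allowed. The function names are hypothetical; `lgamma` is used only to avoid overflow for large $x$.

```python
from math import exp, lgamma, log

def nb_pmf_kp(x, k, p):
    """Negative binomial Pr(X = x) in the (k, p) form, q = 1 - p."""
    q = 1 - p
    log_coef = lgamma(k + x) - lgamma(k) - lgamma(x + 1)
    return exp(log_coef + k * log(p) + x * log(q))

def nb_pmf_kP(x, k, P):
    """Negative binomial Pr(X = x) in the (k, P) form, Q = 1 + P."""
    Q = 1 + P
    log_coef = lgamma(k + x) - lgamma(k) - lgamma(x + 1)
    return exp(log_coef + k * log(1 - P / Q) + x * log(P / Q))

# P = 2 corresponds to p = 1/(1+P) = 1/3.
for x in range(4):
    print(x, nb_pmf_kp(x, 1.5, 1 / 3), nb_pmf_kP(x, 1.5, 2.0))
```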

 

4. Logarithmic Distribution

See [JKK] Chapter 7. This is a single parameter family supported on the positive integers. The parameter is $\theta$. Letting $\alpha = -[\ln(1-\theta)]^{-1}$ we have

$\mu = \frac{\alpha\theta}{1-\theta}$.
This distribution is not easy to deal with.

Parameter: $0 < \theta < 1$
$\Pr(X = x) = \alpha\theta^x / x$
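
A minimal Python sketch of the logarithmic probabilities and the mean formula, checked by direct summation. The function names and the truncation at 200 terms are assumptions for the illustration.

```python
from math import log

def logarithmic_pmf(x, theta):
    """Pr(X = x) = alpha * theta^x / x for x = 1, 2, 3, ..."""
    alpha = -1.0 / log(1 - theta)
    return alpha * theta ** x / x

def logarithmic_mean(theta):
    """Mean alpha * theta / (1 - theta)."""
    alpha = -1.0 / log(1 - theta)
    return alpha * theta / (1 - theta)

theta = 0.5
total = sum(logarithmic_pmf(x, theta) for x in range(1, 201))
mean = sum(x * logarithmic_pmf(x, theta) for x in range(1, 201))
print(total, mean, logarithmic_mean(theta))
```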

5. Stopped Sum Distributions

Neyman Type A: a Poisson-stopped sum of Poissons. Limited, since the ratio of skewness to kurtosis falls in a tight range. There is no closed form expression for the density, but FFT methods are easy to use. See also the other Neyman distributions. See [JKK] Chapter 9, Section 6.
Thomas's Distribution is a Neyman Type A where the summed distribution is a shifted Poisson, ensuring that each occurrence yields at least one claim. See page 392.
Polya-Aeppli distribution is a Poisson-stopped sum of shifted Geometric distributions. The Geometric distribution is a NB with $k = 1$, so the variance multiplier equals $\mu+1$. Could be useful for clash, but the ``number of claims per occurrence'' distribution is very limited. Again, there is no closed form for the probabilities, but they are easy to estimate using FFT. See page 378.
Poisson-Pascal distribution, also called the generalized Polya-Aeppli distribution, is a Poisson-stopped sum of negative binomial distributions. It can also be regarded as a mixture of negative binomial $(k,P)$'s where $k$ has a Poisson distribution. See page 382.
The Generalized Poisson-Pascal distribution ([PW] page 259) is a Poisson stopped sum of truncated (at zero) negative binomial distributions. The PGF is obvious. Per an interesting table on p 253 of  we have the following formulae for the third moments about the mean.
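Poisson:             $\mu_3 = 3\sigma^2-2\mu$
Poisson Binomial:    $\mu_3 = 3\sigma^2-2\mu+ \frac{m-2}{m-1}\,\frac{(\sigma^2-\mu)^2}{\mu}$  (here $m$ is the binomial index, not the mean)
Negative Binomial:   $\mu_3 = 3\sigma^2-2\mu+ 2\,\frac{(\sigma^2-\mu)^2}{\mu}$
Polya-Aeppli:        $\mu_3 = 3\sigma^2-2\mu+ \frac{3}{2}\,\frac{(\sigma^2-\mu)^2}{\mu}$
Neyman Type A:       $\mu_3 = 3\sigma^2-2\mu+ \frac{(\sigma^2-\mu)^2}{\mu}$
Generalized PP:      $\mu_3 = 3\sigma^2-2\mu+ \frac{r+2}{r+1}\,\frac{(\sigma^2-\mu)^2}{\mu}$
Note that $r > -1$ in the last line gives a great deal of flexibility.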
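
All of these third moments share the form $3\sigma^2 - 2\mu + c\,(\sigma^2-\mu)^2/\mu$ and differ only in the coefficient $c$. The Python sketch below, with hypothetical names, evaluates the common form and checks the negative binomial line ($c = 2$) against the direct expression $kP(1+P)(1+2P)$ for the third central moment.

```python
def third_central_moment(mean, var, c):
    """mu_3 = 3 var - 2 mean + c (var - mean)^2 / mean."""
    return 3 * var - 2 * mean + c * (var - mean) ** 2 / mean

def gpp_coefficient(r):
    """Coefficient (r + 2) / (r + 1) for the Generalized Poisson-Pascal, r > -1."""
    return (r + 2) / (r + 1)

# Negative binomial (k, P): mean kP, variance kP(1+P), mu_3 = kP(1+P)(1+2P).
k, P = 3.0, 0.5
mean, var = k * P, k * P * (1 + P)
print(third_central_moment(mean, var, 2.0), k * P * (1 + P) * (1 + 2 * P))  # 4.5 4.5
print(gpp_coefficient(1.0))  # 1.5, the Polya-Aeppli coefficient
```

Letting $r$ approach $-1$ from above makes the coefficient arbitrarily large, which is the flexibility noted above.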

Beta-Negative Binomial, a NB mixed over the variance multiplier distributed as a beta, should have a lot of potential as a distribution. However, the PGF involves ${}_2F_1$, which makes it very hard to deal with.

7. Generalized Poisson-Pascal Distribution, [PW]

The GPP is a Poisson stopped-sum of extended truncated Negative Binomial distributions. It is a three parameter distribution. It has PGF

$G(z) = \exp\left(\theta\left[\frac{(1+P-Pz)^{-r}-(1+P)^{-r}}{1-(1+P)^{-r}} - 1\right]\right)$.

Note that provided $1-(1+P)^{-r} = 1-p^r > 0$,

$G(z) = \exp\left(\frac{\theta}{1-(1+P)^{-r}}\left((1+P-Pz)^{-r}-1\right)\right)$

is a valid PGF for a Poisson-Negative Binomial (Poisson-Pascal). The condition is necessary so that the frequency is non-negative. Thus in the Poisson-Pascal case the distribution can be regarded as a Poisson-NB without zero truncation, or a Poisson-ZTNB, with an adjusted primary Poisson frequency.

Special cases of the GPP include:

r = 1 is a Poisson-Geometric
r > 0 is a Poisson-Pascal, aka Poisson-Negative Binomial
-1 < r < 0 is a Poisson-ETNB, and you need the zero truncation.
r = -1/2 is a Poisson-Inverse Gaussian mixture.
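
A Python sketch of the GPP PGF in both forms, with hypothetical function names. It checks that $G(1) = 1$, that $G(0) = \exp(-\theta)$ (the secondary distribution is zero-truncated, so the sum is zero only when the Poisson count is zero), and that the two forms agree when $r > 0$.

```python
from math import exp

def gpp_pgf(z, theta, r, P):
    """GPP PGF: Poisson(theta) stopped sum of ETNB(r, P) variables."""
    a = (1 + P - P * z) ** (-r)
    b = (1 + P) ** (-r)
    return exp(theta * ((a - b) / (1 - b) - 1))

def gpp_pgf_untruncated(z, theta, r, P):
    """Same PGF written as a Poisson-NB with adjusted Poisson mean (r > 0)."""
    b = (1 + P) ** (-r)
    return exp(theta / (1 - b) * ((1 + P - P * z) ** (-r) - 1))

print(gpp_pgf(1.0, 3.0, -0.5, 2.0))             # 1.0, also valid for -1 < r < 0
print(gpp_pgf(0.0, 3.0, 2.0, 0.5), exp(-3.0))   # equal
print(gpp_pgf(0.4, 3.0, 2.0, 0.5), gpp_pgf_untruncated(0.4, 3.0, 2.0, 0.5))
```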
8. PIG and GPIG Distributions

The PIG is a Poisson mixed over an Inverse Gaussian distribution. The PIG is closed under certain convolutions, see [PW]. It has a thicker tail than the Negative Binomial distribution. It is a special case of the generalized Poisson-Pascal distribution with r = -1/2.

References for this section are from [PW], Section 7.8.3.

Per [PW] page 260, the PIG is a Poisson ETNB.

Per [PW] page 261, the Poisson ETNB with $-1 < r < 0$ is a Poisson mixture with a stable distribution (see also Feller p 448, 581).

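PIG Parameters: $\mu$ and $\beta$.
See below with $\lambda = -1/2$. The Generalized Poisson inverse Gaussian distribution is also called Sichel's distribution.
Sichel's Distribution Parameters: $\mu$, $\beta$ and $\lambda$.
$\Pr(X = n) = \frac{\mu^n}{n!}\,\frac{K_{\lambda+n}\left(\mu\beta^{-1}\sqrt{1+2\beta}\right)}{K_{\lambda}\left(\mu\beta^{-1}\right)}\,(1+2\beta)^{-(\lambda+n)/2}$.
The rth factorial moment is
$\mu_{[r]} = \mu^r\,\frac{K_{\lambda+r}(\mu/\beta)}{K_{\lambda}(\mu/\beta)}$.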

The Bessel function used is the modified Bessel function of the third (second according to some sources!) kind, $K_\lambda(x)$. It is built into Excel for integral $\lambda$, and is available in MathFunctions as nrBesselK(n,x) for any real $n$ and $x \in \mathbb{R}$, and also in Matlab as nrBesselK, again for any $n$ and any $x \in \mathbb{C}$.

Matlab mentions that their BesselK uses a MEX interface to a Fortran library by D. E. Amos, which is available on the web under www.netlib.com; search for amos.

9. Recursive Classes of Distributions

The (a,b) recursion is

$p_n = p_{n-1}(a+b/n)$.
For (a,b,0) the recursion is valid for n = 1,2,3,.... For (a,b,1) the recursion is valid for n = 2,3,4,....

The (a,b) classes fall into two sub-groups.

(a,b,0) distributions are supported on the non-negative integers. They are specified through $a$, $b$, and $p_0$.
(a,b,1) distributions are supported on the positive integers. There are two sub-sub-classes. The zero-truncated distributions have zero probability at zero. These include the zero-truncated Poisson, logarithmic and negative binomial distributions. The zero-modified distributions are a weighting of a degenerate distribution with a zero-truncated class. For the negative binomial, there is slightly more flexibility in the choice of parameters for the truncated distribution, so it is sometimes called the ``extended truncated negative binomial distribution'' (ETNB). Normally we have parameters $r = k$ and $\beta = P = q/p$, with mean $rP$ and variance multiplier $1+P = Q$. In the ETNB, we must still have $\beta > 0$, so the apparent variance multiplier is greater than 1. However, we can have $-1 < r < 0$, which in the usual case would translate into a negative mean. Also, if $r < 0$ then the probability of a zero loss would be $p^r > 1$, which is also impossible, since $p < 1$ always. (Recall, $p = 1/(1+P)$, the reciprocal of the variance multiplier.)
See the nice table on p 229 of  for a good summary of the options. See also page 250-251 for a chart showing the relationships between the various distributions.
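
The (a,b,0) recursion is easy to run directly. The Python sketch below uses a hypothetical function name and two standard parameter choices: the Poisson with $a = 0$, $b = \theta$, and the negative binomial with $a = q$, $b = (k-1)q$, which is the recursion $p_n = p_{n-1}(k+n-1)q/n$ rewritten in $(a + b/n)$ form.

```python
from math import exp

def ab0_pmf(a, b, p0, n_max):
    """Probabilities p_0..p_n_max from p_n = p_{n-1} (a + b/n), n = 1, 2, ..."""
    p = [p0]
    for n in range(1, n_max + 1):
        p.append(p[-1] * (a + b / n))
    return p

# Poisson: a = 0, b = theta, p_0 = exp(-theta).
theta = 2.0
poisson = ab0_pmf(0.0, theta, exp(-theta), 60)

# Negative binomial (k, p): a = q, b = (k - 1) q, p_0 = p^k.
k, prob = 2.5, 0.4
q = 1 - prob
negbin = ab0_pmf(q, (k - 1) * q, prob ** k, 400)

print(sum(poisson), sum(negbin))                        # both close to 1
print(sum(n * pn for n, pn in enumerate(negbin)), k * q / prob)  # mean kq/p
```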

Data Tables

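Poisson Distribution Key Facts

Item                               Poisson Distribution

Mean                               $\theta$
Variance                           $\theta$

$\theta$                           $m$
n/a
$\mu_3$                            $\theta$
$\mu_4$                            $3\theta^2+\theta$
CV                                 $1/\sqrt{\theta}$
Skewness                           $1/\sqrt{\theta}$
Kurtosis                           $3+1/\theta$

PGF $G(z)$                         $\exp(\theta(z-1))$
Characteristic function $\phi(t)$  $\exp(\theta(e^{it}-1))$

Recursions
$p_0$                              $\exp(-\theta)$
$p_n$                              $p_{n-1}\,\theta/n$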

Negative Binomial (r = k, p) Key Facts

Item                               NB Distribution

Mean                               $kq/p$
Variance                           $kq/p^2$

VM View, m and v
$p$                                $1/v$
$k$                                $m/(v-1)$
Contagion View, m and c
$p$                                $1/(1+cm)$
$k$                                $1/c$

$\mu_3$                            $\frac{kq(1+q)}{p^3}$
$\mu_4$                            $\frac{3k^2q^2}{p^4}+\frac{kq(p^2+6q)}{p^4}$
CV                                 $1/\sqrt{kq}$
Skewness                           $\frac{1+q}{\sqrt{kq}}$
Kurtosis                           $3+\frac{p^2+6q}{kq}$

PGF $G(z)$                         $(p/(1-qz))^r$
Characteristic function $\phi(t)$  $(p/(1-qe^{it}))^r$

Recursions
$p_0$                              $p^r$
$p_n$                              $p_{n-1}\,(k+n-1)q/n$
$p_{n+1}$                          $p_n\,(k+n)q/(n+1)$
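
The two "views" are just changes of parameters. The Python sketch below, with hypothetical function names, takes $v$ to be the variance multiplier (variance divided by mean) and $c$ to be the contagion, so that the variance is $m(1+cm)$; both readings are inferred from the table rows rather than stated explicitly in the notes.

```python
def nb_from_vm(m, v):
    """(k, p) from mean m and variance multiplier v = variance / mean, v > 1."""
    return m / (v - 1), 1 / v

def nb_from_contagion(m, c):
    """(k, p) from mean m and contagion c, where variance = m (1 + c m)."""
    return 1 / c, 1 / (1 + c * m)

def nb_mean_var(k, p):
    """Mean kq/p and variance kq/p^2 of the (k, p) negative binomial."""
    q = 1 - p
    return k * q / p, k * q / p ** 2

print(nb_mean_var(*nb_from_vm(10.0, 3.0)))         # approximately (10, 30)
print(nb_mean_var(*nb_from_contagion(10.0, 0.2)))  # approximately (10, 30)
```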


Negative Binomial (k, P) Key Facts

Item                               NB Distribution

Mean                               $kP$
Variance                           $kP(1+P)$

VM View, m and v
$P$                                $v-1$
$k$                                $m/(v-1)$
Contagion View, m and c
$P$                                $cm$
$k$                                $1/c$

$\mu_3$                            $kP(1+P)(1+2P)$
$\mu_4$                            $3k^2P^2(1+P)^2+kP(1+P)(1+6P+6P^2)$
CV                                 $((1+P)/(kP))^{1/2}$
Skewness                           $\frac{1+2P}{(kP(1+P))^{1/2}}$
Kurtosis                           $3+\frac{1+6P+6P^2}{kP(1+P)}$

PGF $G(z)$                         $(1+P-Pz)^{-k}$
Characteristic function $\phi(t)$

Recursion
$p_0$                              $Q^{-k}$
$p_{n+1}$                          $\frac{k+n}{n+1}\,\frac{P}{1+P}\,p_n$


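PIG Distribution Key Facts

Item                               PIG Distribution

Mean                               $\mu$
Variance                           $\mu(\beta+1)$

VM View, m and v
$\mu$                              $m$
$\beta$                            $v-1$
Contagion View, m and c
$\mu$                              $m$
$\beta$                            $cm$

$\mu_3$
$\mu_4$
CV
Skewness
Kurtosis

PGF $G(z)$                         $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta(1-z)}-1\right)\right)$
Characteristic function $\phi(t)$

Recursion
$p_0$                              $\exp\left(-\frac{\mu}{\beta}\left(\sqrt{1+2\beta}-1\right)\right)$
$p_1$                              $\frac{\mu}{\sqrt{1+2\beta}}\,p_0$
$p_n$                              $\frac{\beta}{1+2\beta}\left(2-\frac{3}{n}\right)p_{n-1}+\frac{\mu^2}{1+2\beta}\,\frac{1}{n(n-1)}\,p_{n-2}$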
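
The PIG recursion can be run directly and checked against the stated mean $\mu$ and variance $\mu(\beta+1)$. The Python sketch below uses a hypothetical function name and takes $p_1 = \mu(1+2\beta)^{-1/2}p_0$, which is what differentiating the PGF once at $z = 0$ gives.

```python
from math import exp, sqrt

def pig_pmf(mu, beta, n_max):
    """PIG probabilities p_0..p_n_max by the two-term recursion (n_max >= 1)."""
    s = 1 + 2 * beta
    p = [exp(-(mu / beta) * (sqrt(s) - 1))]
    p.append(mu / sqrt(s) * p[0])
    for n in range(2, n_max + 1):
        p.append(
            beta / s * (2 - 3 / n) * p[n - 1]
            + mu ** 2 / (s * n * (n - 1)) * p[n - 2]
        )
    return p

p = pig_pmf(2.0, 0.5, 200)
mean = sum(n * pn for n, pn in enumerate(p))
var = sum(n * n * pn for n, pn in enumerate(p)) - mean ** 2
print(sum(p), mean, var)  # close to 1, mu = 2, mu (beta + 1) = 3
```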







[JKK] Johnson, N. L., Kotz, S. and Kemp, A. W., Univariate Discrete Distributions, 2nd edition, John Wiley and Sons, 1992.

