Logo

SQL Server NEGHYPGEOM_DIST Function

Updated 2024-03-13 12:33:12.017000

Description

Use the scalar function NEGHYPGEOM_DIST to calculate the probability mass function or the cumulative distributive function of the negative hypergeometric distribution. The negative hypergeometric distribution models the sample size required to achieve a specified number of failures given the number of successes and the size of the population. The probability mass function can be calculated as:

P(x)=\frac{\binom{x-1}{k-1}\binom{N-x}{M-k}}{\binom{N}{M}};k\leq{x}\leq{k+N-M}

Where:

column 1column 2column 3
x=number of trials
M=number of successes in the population
N=population size
k=number of successes to achieve with the sample

The cumulative distribution function can be calculated as:

cdf=\sum_{x=k}^{num\_trials}\frac{\binom{x-1}{k-1}\binom{N-x}{M-k}}{\binom{N}{M}};k\leq{x}\leq{k+N-M}

Syntax

SELECT [westclintech].[wct].[NEGHYPGEOM_DIST](
  <@num_trials, float,>
 ,<@num_success, float,>
 ,<@pop_success, float,>
 ,<@pop_size, float,>
 ,<@Cumulative, bit,>)

Arguments

@num_trials

the number of trials until @num_success have occurred. @num_trials must be of a type float or of type that intrinsically converts to float.

@num_success

the number of successes to achieve with the sample. @num_success must be of a type float or of type that intrinsically converts to float.

@pop_success

the number of successes in the population. @pop_success must be of a type float or of type that intrinsically converts to float.

@pop_size

the size of the population. @pop_size must be of a type float or of a type that intrinsically converts to float.

@Cumulative

a bit value identifying whether the probability mass function ( 'False') or the cumulative distribution function ( 'True') is to be returned. @Cumulative is of a type bit or a type that implicitly converts to bit.

Return Type

float

Remarks

@num_trials, @num_success, @pop_success, and @pop_size are truncated; only the integer part is used.

If @num_success < 0 then NULL is returned.

If @pop_success < 0 then NULL is returned.

If @num_trials = 0 then NULL is returned.

If @pop_size = 0 then NULL is returned.

If @pop_success > @pop_size then NULL is returned.

If @num_trials > @pop_size - @pop_success + @num_success then NULL is returned.

Examples

In an urn with 1000 balls of which 300 are white we want to calculate the probability of drawing exactly 40 white balls in 150 draws from the urn.

SELECT wct.NEGHYPGEOM_DIST(   150,    --@num_trials
                              40,     --@num_success
                              300,    --@pop_success
                              1000,   --@pop_size
                              'False' --@Cumulative
                          ) as NEGHYPGEOM_DIST;

This produces the following result.

NEGHYPGEOM_DIST
0.0131376591508012

We can contrast this to the value returned using the binomial coefficients.

SELECT wct.NEGHYPGEOM_DIST(   150,    --@num_trials
                              40,     --@num_success
                              300,    --@pop_success
                              1000,   --@pop_size
                              'False' --@Cumulative
                          ) as NEGHYPGEOM_DIST,
       wct.BICO(150 - 1, 40 - 1) * wct.BICO(1000 - 150, 300 - 40) / wct.BICO(1000,
                 300) as pmf;

This produces the following result.

NEGHYPGEOM_DISTpmf
0.01313765915080120.0131376591507985

Using the same data we calculate the cumulative distribution function.

SELECT wct.NEGHYPGEOM_DIST(   150,   --@num_trials
                              40,    --@num_success
                              300,   --@pop_success
                              1000,  --@pop_size
                              'True' --@Cumulative
                          ) as NEGHYPGEOM_DIST;

This produces the following result.

NEGHYPGEOM_DIST
0.856400290288149

In this example, we demonstrated the relationship between the hypergeometric and the negative hypergeometric distribution .

SELECT wct.NEGHYPGEOM_DIST(   150,   --@num_trials
                              40,    --@num_success
                              300,   --@pop_success
                              1000,  --@pop_size
                              'True' --@Cumulative
                          ) as NEGHYPGEOM_DIST,
       wct.HYPGEOM_DIST(150 - 40, 150, 1000 - 300, 1000, 'True') as HYPGEOM_DIST;

This produces the following result.

NEGHYPGEOM_DISTHYPGEOM_DIST
0.8564002902881490.856400290288149

In this example we use the XLeratorDB SeriesInt function to show how the cdf is the sum of the pmf values from @num_success to @num_trial.

DECLARE @num_success as float = 40;
DECLARE @num_trials as float = 150;
DECLARE @pop_success as float = 300;
DECLARE @pop_size as float = 1000;
SELECT num_trials,
       pmf,
       SUM(pmf) OVER (ORDER BY num_trials) as cdf
FROM
(
    SELECT SeriesValue as num_trials,
           wct.NEGHYPGEOM_DIST(seriesValue, @num_success, @pop_success, @pop_size,
                     'False') as pmf
    FROM wct.SeriesInt(@num_success, @num_trials, NULL, NULL, NULL)
) n;

This produces the following result.

num_trialspmfcdf
401.7614981027881E-221.7614981027881E-22
415.13770279980309E-215.3138526100819E-21
427.67682088318186E-208.20820614419005E-20
437.8306778361577E-198.65149845057671E-19
446.13096252423383E-186.9961123692915E-18
453.92792042556879E-174.62753166249794E-17
462.14390421133981E-162.6066573775896E-16
471.02488766161703E-151.28555339937599E-15
484.37849108427305E-155.66404448364904E-15
491.69743183771367E-142.26383628607857E-14
506.04346420537374E-148.30730049145232E-14
511.99521067067724E-132.82594071982247E-13
526.15645484341946E-138.98239556324193E-13
531.78719026677994E-122.68542982310413E-12
544.90823539023829E-127.59366521334243E-12
551.28132961096537E-112.04069613229961E-11
563.19273102467746E-115.23342715697707E-11
577.62053646367773E-111.28539636206548E-10
581.74782081458471E-103.03321717665019E-10
593.86282266966779E-106.89603984631798E-10
608.24677747314151E-101.51428173194595E-09
611.70450111907096E-093.21878285101691E-09
623.41750081570169E-096.6362836667186E-09
636.65885159399702E-091.32951352607156E-08
641.26292546840779E-082.59243899447935E-08
652.33500886603117E-084.92744786051052E-08
664.21425396944296E-089.14170182995348E-08
677.43385175575885E-081.65755535857123E-07
681.28311058612437E-072.94066594469561E-07
692.16934369333757E-075.11000963803318E-07
703.59607746715971E-078.70608710519289E-07
715.85001849496499E-071.45561056001579E-06
729.34708115218811E-072.3903186752346E-06
731.46798986120825E-063.85830853644285E-06
742.26784359198026E-066.12615212842311E-06
753.44857771018759E-069.5747298386107E-06
765.16509949836555E-061.47398293369762E-05
777.62406637081614E-062.23638957077924E-05
781.10970009778345E-053.34608966856269E-05
791.59353896905121E-054.9396286376139E-05
802.25876795398639E-057.1983965916003E-05
813.16179607557323E-050.000103601926671735
824.37259735524565E-050.000147327900224192
835.97679137913006E-050.000207095814015492
848.07773398174178E-050.00028787315383291
850.0001079853113366470.000395858465169557
860.0001428387314960470.000538697196665604
870.0001870157273213250.000725712923986929
880.0002424368036887070.000968149727675636
890.0003112704540014150.00127942018167705
900.0003959319173308230.00167535209900787
910.0004990738453744450.00217442594438232
920.0006235677886520010.00279799373303432
930.0007724756314580750.0035704693644924
940.0009490103969213170.00451947976141371
950.001156486199933720.00567596596134744
960.001398257535533680.00707422349688112
970.001677648538209180.00875187203509029
980.001997873314559930.0107497453496502
990.002361948916886910.0131116942665371
1000.002772602965531730.0158842972320689
1010.003232178320473110.019116475552542
1020.003742537526011970.0228590130785539
1030.004304969986975090.027163983065529
1040.004920104964728040.0320840880302571
1050.005587833495660220.0376719215259173
1060.006307242226564290.0439791637524816
1070.007076561931847920.0510557256843295
1080.007893133130326470.058948858814656
1090.008753390767266930.0677022495819229
1100.009652869385639530.0773551189675624
1110.01058622960000440.0879413485675669
1120.01154730603115520.099488654598722
1130.01252917618487610.112017830783598
1140.01352424909160240.125542079875201
1150.01452437189122930.14006645176643
1160.01552095197268240.155587403739112
1170.01650509178454350.172092495523656
1180.01746773303812540.189560228561781
1190.01839980674053090.207960035302312
1200.0192923853336770.227252420635989
1210.02013683317656850.247389253812558
1220.02092495169248120.268314205505039
1230.02164911570390970.289963321208948
1240.02230239778310760.312265718992056
1250.02287867784401690.335144396836073
1260.02337273567121140.358517132507284
1270.02378032460699540.38229745711428
1280.02409822517673870.406395682291019
1290.02432427800544560.430719960296464
1300.0244573959447710.455177356241235
1310.02449755587244920.479674912113684
1320.02444577112558740.504120683239272
1330.02430404597708550.528424729216357
1340.02407531394359280.55250004315995
1350.02376336202112660.576263405181077
1360.02337274317284880.599636148353925
1370.02290867954360130.622544827897527
1380.02237695894454730.644921786842074
1390.0217838271494380.666705613991512
1400.02113587846964660.687841492461159
1410.02043994694280160.70828143940396
1420.01970300028351940.72798443968748
1430.01893203851897650.746916478206456
1440.01813399897229080.765050477178747
1450.0173156689786190.782366146157366
1460.01648360742723170.798849753584598
1470.01564407593205560.814493829516653
1480.01480298014711660.82929680966377
1490.01396582147316010.84326263113693
1500.01313765915080120.856400290287731

See Also

BICO - Binomial coefficient

BINOMDIST - Binomial distribution

HYPGEOM_DIST - Hypergeometric distribution

LCHOOSE - Natural logarithm of the binomial coefficient

NEGBINOM_DIST - Negative binomial distribution

NEGHYPGEOM_INV - Calculate the quantiles of the negative hypergeometric distribution.