Logo

SQL Server LOGIT Function

Updated 2024-02-13 20:12:01.047000

Description

Use the table-valued function LOGIT to calculate the binary logistic regression coefficients from a table of x-values (the independent variables) and dichotomous y-values (the dependent variable). The function supports the entry of the data in raw form, where the y-values are a column of zeros and ones {0,1}. The y-value can appear in any column and the column number of the y-values is specified at input.

x1x2x3xny
x 1,1x 1,2x 1,3x 1,ny 1
x 2,1x 2,2x 2,3x 2,ny 2
x m,1x m,2x m,3x m,ny m

For summary values, where the y-values are the counts of successes and failures, use the LOGITSUM function.

Logistic regression estimates the probability of an event occurring. Unlike ordinary least squares (see LINEST and LINEST_q) which estimates the value of a dependent variable based on the independent variables, LOGIT measures the probability (p) of an event occurring (1) or not occurring (0). The probability is estimated as:

p=\hat{\pi}=\frac{e^{\beta_0 + \beta_1x_1 +...+\beta_nx_n}}{1 + e^{\beta_0 + \beta_1x_1 +...+\beta_nx_n}}

The LOGIT function works by finding the coefficient (ß) values that maximize the log-likelihood statistic, which is defined as:

LL = \sum_{i=1}^n \{y_i \times \ln \hat{\pi}(x_i) + (1-y_i)\times ( 1 - \hat{\pi}(x_i))\}

using a method of iteratively re-weighted least squares.

Syntax

SELECT * FROM [westclintech].[wct].[LOGIT](
  <@Matrix_RangeQuery, nvarchar(max),>
 ,<@y_ColumnNumber, int,>

Arguments

@Matrix_RangeQuery

the SELECT statement, as a string, which, when executed, creates the resultant table of x- and y-values used in the calculation.

@y_ColumnNumber

the number of the column in the resultant table returned by @Matrix_RangeQuery which contains the dichotomous outcomes. @y_ColumnNumber must be of the type int or of a type that implicitly converts to int.

Return Type

table

colNamecolDatatypecolDesc
stat_namenvarchar(4000)Identifies the statistic being returned: b estimated coefficient for each independent variable plus the intercept se standard error of b z z statistic for b pval p-value (normal distribution) for the z statistic Wald Wald statistic LL0 log-likelihood with just the intercept and no other coefficients LLM model log-likelihood chisq chi squared statistic df degrees of freedom p_chisq p-value of the chi-squared statistic AIC Akaike information criterion BIC Bayesian information criterion Nobs number of observations rsql log-linear ratio R2 rsqcs Cox and Snell's R2 rsqn Nagelkerke's R2 D deviance Iterations number of iteration in iteratively re-weighted least squares Converged a bit value identifying whether a solution was found (1) or not (0) AUROC area under the ROC curve
idxintIdentifies the subscript for the estimated coefficient (b), standard error of the coefficient (se), z statistics (z), p-value of the z statistic (pval), and the Wald statistic. When the idx is 0, it is referring to the intercept, otherwise the idx identifies the column number of independent variable. Descriptive statistics other than the ones mentioned above will have an idx of NULL.
stat_valfloatthe calculated value of the statistic.

Remarks

If @y_ColumnNumber is NULL then the right-most column in the resultant table is assumed to contain the dichotomous results.

@Matrix_RangeQuery must return at least 2 columns or an error will be returned.

If @y_ColumnNumber is not NULL and @y_ColumnNumber < 1 an error will be returned.

If @y_ColumnNumber is not NULL and @y_ColumnNumber greater than the number of columns returned by @Matrix_RangeQuery an error will be returned.

Examples

Example #1

In this example we use the Coronary Heart Disease data from Applied Logistic Regression, Third Edition by David W. Hosmer, Jr., Stanley Lemeshow, and Rodney X. Sturdivant . The data consist of a single independent variable (age) and an outcome (chd) which indicates the absence (0) or presence (1) of coronary heart disease.

The following SQL populates a temporary table, #chd.

SELECT *
INTO   #chd
  FROM (   VALUES (20, 0),
                  (23, 0),
                  (24, 0),
                  (25, 0),
                  (25, 1),
                  (26, 0),
                  (26, 0),
                  (28, 0),
                  (28, 0),
                  (29, 0),
                  (30, 0),
                  (30, 0),
                  (30, 0),
                  (30, 0),
                  (30, 0),
                  (30, 1),
                  (32, 0),
                  (32, 0),
                  (33, 0),
                  (33, 0),
                  (34, 0),
                  (34, 0),
                  (34, 1),
                  (34, 0),
                  (34, 0),
                  (35, 0),
                  (35, 0),
                  (36, 0),
                  (36, 1),
                  (36, 0),
                  (37, 0),
                  (37, 1),
                  (37, 0),
                  (38, 0),
                  (38, 0),
                  (39, 0),
                  (39, 1),
                  (40, 0),
                  (40, 1),
                  (41, 0),
                  (41, 0),
                  (42, 0),
                  (42, 0),
                  (42, 0),
                  (42, 1),
                  (43, 0),
                  (43, 0),
                  (43, 1),
                  (44, 0),
                  (44, 0),
                  (44, 1),
                  (44, 1),
                  (45, 0),
                  (45, 1),
                  (46, 0),
                  (46, 1),
                  (47, 0),
                  (47, 0),
                  (47, 1),
                  (48, 0),
                  (48, 1),
                  (48, 1),
                  (49, 0),
                  (49, 0),
                  (49, 1),
                  (50, 0),
                  (50, 1),
                  (51, 0),
                  (52, 0),
                  (52, 1),
                  (53, 1),
                  (53, 1),
                  (54, 1),
                  (55, 0),
                  (55, 1),
                  (55, 1),
                  (56, 1),
                  (56, 1),
                  (56, 1),
                  (57, 0),
                  (57, 0),
                  (57, 1),
                  (57, 1),
                  (57, 1),
                  (57, 1),
                  (58, 0),
                  (58, 1),
                  (58, 1),
                  (59, 1),
                  (59, 1),
                  (60, 0),
                  (60, 1),
                  (61, 1),
                  (62, 1),
                  (62, 1),
                  (63, 1),
                  (64, 0),
                  (64, 1),
                  (65, 1),
                  (69, 1)) n (age, chd);

We can run the following SQL to reproduce Table 1.2, Frequency Table of Age Group by CHD from the Hosmer book, verifying that we are using the same data.

SELECT a.descr as [Age Group],
       COUNT(*) as n,
       COUNT(*) - SUM(c.chd) as Absent,
       SUM(c.chd) as Present,
       AVG(cast(c.chd as float)) as Mean
  FROM #chd c
 CROSS APPLY (   SELECT TOP 1 grp,
                        descr
                   FROM (   VALUES (20, 1, '20-29'),
                                   (30, 2, '30-34'),
                                   (35, 3, '35-39'),
                                   (40, 4, '40-44'),
                                   (45, 5, '45-49'),
                                   (50, 6, '50-55'),
                                   (55, 7, '55-59'),
                                   (60, 8, '60-69')) n (age, grp, descr)
                  WHERE n.age <= c.age
                  ORDER BY n.age DESC) a
 GROUP BY a.descr,
          a.grp
 ORDER BY 1;

This produces the following result.

Age GroupnAbsentPresentMean
20-2910910.1
30-34151320.133333333333333
35-3912930.25
40-44151050.333333333333333
45-4913760.461538461538462
50-558350.625
55-59174130.764705882352941
60-6910280.8

This SQL calculates the results of the logistic regression.

SELECT *
  FROM wct.LOGIT('SELECT
        age
       ,chd
    FROM
       #chd', 2);

This produces the following result.

stat_namedxtat_val
b0-5.30945337391905
b10.1109211422069
se01.13365463681529
se10.0240598358749988
z0-4.68348401840844
z14.61022023521619
pval02.82039453738119E-06
pval14.02242615828777E-06
Wald021.9350225506872
Wald121.2541306171968
LL0NULL-68.3314913574166
LLMNULL-53.6765463471564
chisqNULL29.3098900205205
dfNULL1
p_chisqNULL6.16800830270808E-08
AICNULL111.353092694313
BICNULL116.563433066289
NobsNULL100
rsqlNULL0.214468391061534
rsqcsNULL0.254051637397186
rsqnNULL0.340992792718107
DNULL-0.482789163126702
AUROCNULL0.789881680946553
IterationsNULL5
ConvergedNULL1

Let's reformat some of these to make them easier to read and easier to compare to Hosmer. This SQL reproduces Table 1.3, Results of Fitting the Logistic Regression to the CHDAGE Data, n = 100.

SELECT n.Variable,
       p.b as Coeff,
       p.se as [Std. Err],
       p.z,
       p.pval as p
  FROM (   SELECT *
             FROM wct.LOGIT('SELECT
           age
          ,chd
        FROM
          #chd', 2) ) d
  PIVOT (   SUM(stat_val)
            FOR stat_name IN (b, se, z, pval)) p
 CROSS APPLY (   VALUES (0, 'Constant'),
                        (1, 'Age')) n (idx, variable)
 WHERE p.idx = n.idx
 ORDER BY p.idx DESC;

This produces the following result.

VariableCoeffStd. Errzp
Age0.11092114220690.02405983587499884.610220235216194.02242615828777E-06
Constant-5.309453373919051.13365463681529-4.683484018408442.82039453738119E-06

As Hosmer points out, the fitted values, are given by the equation

http://westclintech.com/Portals/0/images/doc_stats_LOGIT_img1.jpg

meaning that we can use this equation to predict the probability of the presence of coronary heart disease given a person's age.

If you are not interested in an explanation of how the remaining statistics are calculated, then skip to the next example. In order to explore how these remaining statistics are calculated, let's store the results of the LOGIT function in a table (#mylogit) with the following SQL.

SELECT stat_name,
       idx,
       stat_val
INTO   #mylogit
  FROM wct.LOGIT('SELECT
        age
       ,chd
    FROM
       #chd', 2);

The standard errors of the coefficients (se) can be calculated as the square root of the diagonal of the covariance matrix:

http://westclintech.com/Portals/0/images/doc_stats_LOGIT_img2.jpg

Where X is the design matrix (a column of ones added to the input data) and W is the diagonal matrix of weights calculated by the iteratively re-weighted least squares process.

In this piece of SQL we will create a table which calculates the log-likelihood (LL), the weights (W) and the predicted value, which can then be used to calculate the covariance matrix.

SELECT *,
       (p_obs * LOG(p_pred) + (1 - p_obs) * LOG(1 - p_pred)) as LL,
       p_pred * (1 - p_pred) as W
INTO   #m
  FROM (   SELECT age,
                  chd as p_obs,
                  EXP(b0.stat_val + age * b1.stat_val) / (1 + EXP(b0.stat_val + age * b1.stat_val)) as p_pred
             FROM #mylogit b0
             JOIN #mylogit b1
               ON b0.stat_name = 'b'
              AND b0.idx = 0
              AND b1.stat_name = 'b'
              AND b1.idx = 1
            CROSS JOIN #chd) n;

We can now use the matrix functions from the XLeratorDB math module to verify the results returned by the LOGIT function for the standard errors (se).

SELECT RowNum as idx,
       SQRT(ItemValue) as se
  FROM wct.Matrix(
           wct.MATINVERSE(
               wct.MATMULT(
                   wct.TRANSPOSE(wct.Matrix2String_q('SELECT 1, age FROM #m')),
                   wct.Matrix2String_q('SELECT W, W * age FROM #m'))))
 WHERE RowNum = ColNum

This produces the following result.

idxse
01.1336546368153
10.0240598358749989

From the coefficients (b) and the se values, we can calculate the z, pval, and Wald values. The following SQL demonstrates the calculation, using the b and se values stored in #mylogit.

SELECT n.Variable,
       p.b / p.se as z,
       2 * wct.NORMSDIST(-ABS(p.b / p.se)) as pval,
       POWER(p.b / p.se, 2) as Wald
  FROM (SELECT * FROM #mylogit) d
  PIVOT (   SUM(stat_val)
            FOR stat_name IN (b, se)) p
 CROSS APPLY (   VALUES (0, 'Constant'),
                        (1, 'Age')) n (idx, variable)
 WHERE p.idx = n.idx
 ORDER BY p.idx DESC;

This produces the following result.

VariablezpvalWald
Age4.610220235216194.02242615828777E-0621.2541306171968
Constant-4.683484018408442.82039453738119E-0621.9350225506872

The model log-likelihood (LLM) is simply the sum of the individual log-likelihoods which we have already computed and stored in #m.

SELECT SUM(LL) as LLM
  FROM #m

This produces the following result.

LLM
-53.6765463471564

The calculation of LL0 can be done directly from the input data.

SELECT f * LOG(f) + s * LOG(s) - n * LOG(n) as LL0
  FROM (   SELECT COUNT(*) as n,
                  SUM(chd) as s,
                  COUNT(*) - SUM(chd) as f
             FROM #chd) n;

This produces the following result.

LL0
-68.3314913574166

Given LL0 and LLM it is relatively straightforward to calculate the rest of the statistics.

SELECT
   x.*
FROM (
   SELECT
        2 *(LLM - LL0) as chisq
       ,wct.CHIDIST(2 *(LLM - LL0), df) as pchisq
       ,-2 *(LLM - df - 1) as AIC
       ,-2 * LLM + LOG(Nobs) *(df + 1) as BIC
       ,1 - LLM/LL0 as rsql
       ,1 - EXP((-2/Nobs)*(LLM-LL0)) as rsqcs
       ,(1 - EXP((-2/Nobs)*(LLM-LL0)))/(1-exp(2*LL0/Nobs)) as rsqn
       ,-2*LOG(LL0/LLM) as D
   FROM (
       SELECT
           stat_name
          ,stat_val
       FROM
          #mylogit
       )d
   PIVOT(
       SUM(stat_val)
   FOR
       stat_name
   IN
       (LL0,LLM,df,Nobs)
       )p
   )q
CROSS APPLY(
   VALUES
    ('chisq',chisq),
    ('pchisq',pchisq),
    ('AIC',AIC),
    ('BIC',BIC),
    ('rsql',rsql),
    ('rsqcs',rsqcs),
    ('rsqn',rsqn),
    ('D',D)
   )x(stat_name, stat_val);

This produces the following result.

stat_namestat_val
chisq29.3098900205205
pchisq6.16800830270808E-08
AIC111.353092694313
BIC116.563433066289
rsql0.214468391061534
rsqcs0.254051637397186
rsqn0.340992792718107
D-0.482789163126702

We can use the table-valued function ROCTable for the calculation of the area under the ROC curve (AUROC). The table-valued function requires the predicted probabilities and associated absence (0) or presence (1) of coronary heart disease as inputs. We will use the LOGITPRED function to create these combinations and store them in a temporary table which will then be called from ROCTable.

SELECT wct.LOGITPRED('SELECT stat_val FROM #mylogit where stat_name = ''b'' ORDER BY idx', cast(age as varchar(max))) as [p predicted],
       chd as y
INTO   #t
  FROM #chd
SELECT *
  FROM wct.ROCTABLE('SELECT * FROM #t');

This produces the following result.

idxppredfailuresuccesscumfailurecumsuccessFalsePositiveRateTruePositiveRateAUROCcumAUROC
00.912464554564153010100.023255813953488400
10.869939152344419010200.04651162790697670.0008159934720522230.000815993472052223
20.85686593067653611130.01754385964912280.069767441860465100.000815993472052223
30.84271622060268301140.01754385964912280.093023255813953500.000815993472052223
40.82744940176391402160.01754385964912280.1395348837209300.000815993472052223
50.81103299288096801170.01754385964912280.1627906976744190.002855977152182780.00367197062423501
60.79344461565528711280.03508771929824560.18604651162790700.00367197062423501
70.774673993551717022100.03508771929824560.2325581395348840.004079967360261120.00775193798449612
80.754724899971724123120.05263157894736840.279069767441860.009791921664626680.0175438596491228
90.733616953220639245160.0877192982456140.37209302325581400.0175438596491228
100.711387142595015035190.0877192982456140.4418604651162790.007751937984496120.0252957976336189
110.688090963392313126210.1052631578947370.48837209302325600.0252957976336189
120.663803041111905016220.1052631578947370.51162790697674400.0252957976336189
130.638617138505235026240.1052631578947370.5581395348837210.009791921664626680.0350877192982456
140.61264546440856117250.122807017543860.5813953488372090.01019991840065280.0452876376988984
150.586017240033851108250.1403508771929820.5813953488372090.01019991840065280.0554875560995512
160.558876524531328119260.1578947368421050.6046511627906980.02121583027335780.076703386372909
170.5313793534369512111270.1929824561403510.6279069767441860.0110159118727050.087719298245614
180.5036902959935131212290.2105263157894740.6744186046511630.02366381068951450.111383108935129
190.4759785844732812114300.2456140350877190.6976744186046510.01223990208078340.123623011015912
200.4484140048604641115310.2631578947368420.7209302325581390.01264789881680950.136270909832721
210.4211627589753441116320.2807017543859650.7441860465116280.02611179110567120.162382700938392
220.3943835106261782218340.3157894736842110.7906976744186050.02774377804977560.190126478988168
230.3682238123282762120350.3508771929824560.8139534883720930.04283965728274170.23296613627091
240.3428170766427843123360.4035087719298250.8372093023255810.02937576499388010.26234190126479
250.3182802114257522025360.438596491228070.8372093023255810.014687882496940.27702978376173
260.2947119867178421126370.4561403508771930.860465116279070.01509587923296610.292125662994696
270.2721921485117541127380.4736842105263160.8837209302325580.03100775193798450.323133414932681
280.2507812465609692029380.5087719298245610.8837209302325580.03100775193798450.354141166870665
290.2305211038773862131390.5438596491228070.9069767441860460.03182374541003670.385964912280702
300.2114358271319042133400.5789473684210530.9302325581395350.03263973888208890.418604651162791
310.1935332406631262035400.6140350877192980.9302325581395350.06527947776417790.483884128926969
320.1768066215825864139410.6842105263157890.9534883720930230.03345573235414120.51733986128111
330.1612366178210712041410.7192982456140350.9534883720930230.03345573235414120.550795593635251
340.1467932425433172043410.7543859649122810.9534883720930230.08363933088535290.634434924520604
350.1211250535032685148420.8421052631578950.9767441860465120.01713586291309670.651570787433701
360.109804435463621049420.8596491228070180.9767441860465120.03427172582619340.685842513259894
370.09942217640138632051420.8947368421052630.9767441860465120.03427172582619340.720114239086087
380.08124847365986182053420.9298245614035090.9767441860465120.01713586291309660.737250101999184
390.07334378841000281154430.94736842105263210.01754385964912290.754793961648307
400.06615277830121591055430.96491228070175410.01754385964912280.77233782129743
410.05962144972811551056430.98245614035087710.01754385964912290.789881680946553
420.04347875674882361057431100.789881680946553

You can get more information about the calculation of the area under the ROC curve by going to the ROCTable documentation

Example #2

This example is based on R Data Analysis Examples: Logit Regression which can be found at www.ats.ucla.edu/stat/r/dae/logit.htm . We will put the data into a table called #mydata.

The dataset contains 4 columns of data which are labeled admit, gre, gpa, and rank. The variables gpa and gre will be treated as continuous. The variable rank has the values 1 through 4 where a 1 indicates the highest-ranked institutions and a 4 indicates the lowest-ranked institutions.

SELECT *
INTO   #mydata
  FROM (   VALUES (1, 0, 380, 3.61, 3),
                  (2, 1, 660, 3.67, 3),
                  (3, 1, 800, 4, 1),
                  (4, 1, 640, 3.19, 4),
                  (5, 0, 520, 2.93, 4),
                  (6, 1, 760, 3, 2),
                  (7, 1, 560, 2.98, 1),
                  (8, 0, 400, 3.08, 2),
                  (9, 1, 540, 3.39, 3),
                  (10, 0, 700, 3.92, 2),
                  (11, 0, 800, 4, 4),
                  (12, 0, 440, 3.22, 1),
                  (13, 1, 760, 4, 1),
                  (14, 0, 700, 3.08, 2),
                  (15, 1, 700, 4, 1),
                  (16, 0, 480, 3.44, 3),
                  (17, 0, 780, 3.87, 4),
                  (18, 0, 360, 2.56, 3),
                  (19, 0, 800, 3.75, 2),
                  (20, 1, 540, 3.81, 1),
                  (21, 0, 500, 3.17, 3),
                  (22, 1, 660, 3.63, 2),
                  (23, 0, 600, 2.82, 4),
                  (24, 0, 680, 3.19, 4),
                  (25, 1, 760, 3.35, 2),
                  (26, 1, 800, 3.66, 1),
                  (27, 1, 620, 3.61, 1),
                  (28, 1, 520, 3.74, 4),
                  (29, 1, 780, 3.22, 2),
                  (30, 0, 520, 3.29, 1),
                  (31, 0, 540, 3.78, 4),
                  (32, 0, 760, 3.35, 3),
                  (33, 0, 600, 3.4, 3),
                  (34, 1, 800, 4, 3),
                  (35, 0, 360, 3.14, 1),
                  (36, 0, 400, 3.05, 2),
                  (37, 0, 580, 3.25, 1),
                  (38, 0, 520, 2.9, 3),
                  (39, 1, 500, 3.13, 2),
                  (40, 1, 520, 2.68, 3),
                  (41, 0, 560, 2.42, 2),
                  (42, 1, 580, 3.32, 2),
                  (43, 1, 600, 3.15, 2),
                  (44, 0, 500, 3.31, 3),
                  (45, 0, 700, 2.94, 2),
                  (46, 1, 460, 3.45, 3),
                  (47, 1, 580, 3.46, 2),
                  (48, 0, 500, 2.97, 4),
                  (49, 0, 440, 2.48, 4),
                  (50, 0, 400, 3.35, 3),
                  (51, 0, 640, 3.86, 3),
                  (52, 0, 440, 3.13, 4),
                  (53, 0, 740, 3.37, 4),
                  (54, 1, 680, 3.27, 2),
                  (55, 0, 660, 3.34, 3),
                  (56, 1, 740, 4, 3),
                  (57, 0, 560, 3.19, 3),
                  (58, 0, 380, 2.94, 3),
                  (59, 0, 400, 3.65, 2),
                  (60, 0, 600, 2.82, 4),
                  (61, 1, 620, 3.18, 2),
                  (62, 0, 560, 3.32, 4),
                  (63, 0, 640, 3.67, 3),
                  (64, 1, 680, 3.85, 3),
                  (65, 0, 580, 4, 3),
                  (66, 0, 600, 3.59, 2),
                  (67, 0, 740, 3.62, 4),
                  (68, 0, 620, 3.3, 1),
                  (69, 0, 580, 3.69, 1),
                  (70, 0, 800, 3.73, 1),
                  (71, 0, 640, 4, 3),
                  (72, 0, 300, 2.92, 4),
                  (73, 0, 480, 3.39, 4),
                  (74, 0, 580, 4, 2),
                  (75, 0, 720, 3.45, 4),
                  (76, 0, 720, 4, 3),
                  (77, 0, 560, 3.36, 3),
                  (78, 1, 800, 4, 3),
                  (79, 0, 540, 3.12, 1),
                  (80, 1, 620, 4, 1),
                  (81, 0, 700, 2.9, 4),
                  (82, 0, 620, 3.07, 2),
                  (83, 0, 500, 2.71, 2),
                  (84, 0, 380, 2.91, 4),
                  (85, 1, 500, 3.6, 3),
                  (86, 0, 520, 2.98, 2),
                  (87, 0, 600, 3.32, 2),
                  (88, 0, 600, 3.48, 2),
                  (89, 0, 700, 3.28, 1),
                  (90, 1, 660, 4, 2),
                  (91, 0, 700, 3.83, 2),
                  (92, 1, 720, 3.64, 1),
                  (93, 0, 800, 3.9, 2),
                  (94, 0, 580, 2.93, 2),
                  (95, 1, 660, 3.44, 2),
                  (96, 0, 660, 3.33, 2),
                  (97, 0, 640, 3.52, 4),
                  (98, 0, 480, 3.57, 2),
                  (99, 0, 700, 2.88, 2),
                  (100, 0, 400, 3.31, 3),
                  (101, 0, 340, 3.15, 3),
                  (102, 0, 580, 3.57, 3),
                  (103, 0, 380, 3.33, 4),
                  (104, 0, 540, 3.94, 3),
                  (105, 1, 660, 3.95, 2),
                  (106, 1, 740, 2.97, 2),
                  (107, 1, 700, 3.56, 1),
                  (108, 0, 480, 3.13, 2),
                  (109, 0, 400, 2.93, 3),
                  (110, 0, 480, 3.45, 2),
                  (111, 0, 680, 3.08, 4),
                  (112, 0, 420, 3.41, 4),
                  (113, 0, 360, 3, 3),
                  (114, 0, 600, 3.22, 1),
                  (115, 0, 720, 3.84, 3),
                  (116, 0, 620, 3.99, 3),
                  (117, 1, 440, 3.45, 2),
                  (118, 0, 700, 3.72, 2),
                  (119, 1, 800, 3.7, 1),
                  (120, 0, 340, 2.92, 3),
                  (121, 1, 520, 3.74, 2),
                  (122, 1, 480, 2.67, 2),
                  (123, 0, 520, 2.85, 3),
                  (124, 0, 500, 2.98, 3),
                  (125, 0, 720, 3.88, 3),
                  (126, 0, 540, 3.38, 4),
                  (127, 1, 600, 3.54, 1),
                  (128, 0, 740, 3.74, 4),
                  (129, 0, 540, 3.19, 2),
                  (130, 0, 460, 3.15, 4),
                  (131, 1, 620, 3.17, 2),
                  (132, 0, 640, 2.79, 2),
                  (133, 0, 580, 3.4, 2),
                  (134, 0, 500, 3.08, 3),
                  (135, 0, 560, 2.95, 2),
                  (136, 0, 500, 3.57, 3),
                  (137, 0, 560, 3.33, 4),
                  (138, 0, 700, 4, 3),
                  (139, 0, 620, 3.4, 2),
                  (140, 1, 600, 3.58, 1),
                  (141, 0, 640, 3.93, 2),
                  (142, 1, 700, 3.52, 4),
                  (143, 0, 620, 3.94, 4),
                  (144, 0, 580, 3.4, 3),
                  (145, 0, 580, 3.4, 4),
                  (146, 0, 380, 3.43, 3),
                  (147, 0, 480, 3.4, 2),
                  (148, 0, 560, 2.71, 3),
                  (149, 1, 480, 2.91, 1),
                  (150, 0, 740, 3.31, 1),
                  (151, 1, 800, 3.74, 1),
                  (152, 0, 400, 3.38, 2),
                  (153, 1, 640, 3.94, 2),
                  (154, 0, 580, 3.46, 3),
                  (155, 0, 620, 3.69, 3),
                  (156, 1, 580, 2.86, 4),
                  (157, 0, 560, 2.52, 2),
                  (158, 1, 480, 3.58, 1),
                  (159, 0, 660, 3.49, 2),
                  (160, 0, 700, 3.82, 3),
                  (161, 0, 600, 3.13, 2),
                  (162, 0, 640, 3.5, 2),
                  (163, 1, 700, 3.56, 2),
                  (164, 0, 520, 2.73, 2),
                  (165, 0, 580, 3.3, 2),
                  (166, 0, 700, 4, 1),
                  (167, 0, 440, 3.24, 4),
                  (168, 0, 720, 3.77, 3),
                  (169, 0, 500, 4, 3),
                  (170, 0, 600, 3.62, 3),
                  (171, 0, 400, 3.51, 3),
                  (172, 0, 540, 2.81, 3),
                  (173, 0, 680, 3.48, 3),
                  (174, 1, 800, 3.43, 2),
                  (175, 0, 500, 3.53, 4),
                  (176, 1, 620, 3.37, 2),
                  (177, 0, 520, 2.62, 2),
                  (178, 1, 620, 3.23, 3),
                  (179, 0, 620, 3.33, 3),
                  (180, 0, 300, 3.01, 3),
                  (181, 0, 620, 3.78, 3),
                  (182, 0, 500, 3.88, 4),
                  (183, 0, 700, 4, 2),
                  (184, 1, 540, 3.84, 2),
                  (185, 0, 500, 2.79, 4),
                  (186, 0, 800, 3.6, 2),
                  (187, 0, 560, 3.61, 3),
                  (188, 0, 580, 2.88, 2),
                  (189, 0, 560, 3.07, 2),
                  (190, 0, 500, 3.35, 2),
                  (191, 1, 640, 2.94, 2),
                  (192, 0, 800, 3.54, 3),
                  (193, 0, 640, 3.76, 3),
                  (194, 0, 380, 3.59, 4),
                  (195, 1, 600, 3.47, 2),
                  (196, 0, 560, 3.59, 2),
                  (197, 0, 660, 3.07, 3),
                  (198, 1, 400, 3.23, 4),
                  (199, 0, 600, 3.63, 3),
                  (200, 0, 580, 3.77, 4),
                  (201, 0, 800, 3.31, 3),
                  (202, 1, 580, 3.2, 2),
                  (203, 1, 700, 4, 1),
                  (204, 0, 420, 3.92, 4),
                  (205, 1, 600, 3.89, 1),
                  (206, 1, 780, 3.8, 3),
                  (207, 0, 740, 3.54, 1),
                  (208, 1, 640, 3.63, 1),
                  (209, 0, 540, 3.16, 3),
                  (210, 0, 580, 3.5, 2),
                  (211, 0, 740, 3.34, 4),
                  (212, 0, 580, 3.02, 2),
                  (213, 0, 460, 2.87, 2),
                  (214, 0, 640, 3.38, 3),
                  (215, 1, 600, 3.56, 2),
                  (216, 1, 660, 2.91, 3),
                  (217, 0, 340, 2.9, 1),
                  (218, 1, 460, 3.64, 1),
                  (219, 0, 460, 2.98, 1),
                  (220, 1, 560, 3.59, 2),
                  (221, 0, 540, 3.28, 3),
                  (222, 0, 680, 3.99, 3),
                  (223, 1, 480, 3.02, 1),
                  (224, 0, 800, 3.47, 3),
                  (225, 0, 800, 2.9, 2),
                  (226, 1, 720, 3.5, 3),
                  (227, 0, 620, 3.58, 2),
                  (228, 0, 540, 3.02, 4),
                  (229, 0, 480, 3.43, 2),
                  (230, 1, 720, 3.42, 2),
                  (231, 0, 580, 3.29, 4),
                  (232, 0, 600, 3.28, 3),
                  (233, 0, 380, 3.38, 2),
                  (234, 0, 420, 2.67, 3),
                  (235, 1, 800, 3.53, 1),
                  (236, 0, 620, 3.05, 2),
                  (237, 1, 660, 3.49, 2),
                  (238, 0, 480, 4, 2),
                  (239, 0, 500, 2.86, 4),
                  (240, 0, 700, 3.45, 3),
                  (241, 0, 440, 2.76, 2),
                  (242, 1, 520, 3.81, 1),
                  (243, 1, 680, 2.96, 3),
                  (244, 0, 620, 3.22, 2),
                  (245, 0, 540, 3.04, 1),
                  (246, 0, 800, 3.91, 3),
                  (247, 0, 680, 3.34, 2),
                  (248, 0, 440, 3.17, 2),
                  (249, 0, 680, 3.64, 3),
                  (250, 0, 640, 3.73, 3),
                  (251, 0, 660, 3.31, 4),
                  (252, 0, 620, 3.21, 4),
                  (253, 1, 520, 4, 2),
                  (254, 1, 540, 3.55, 4),
                  (255, 1, 740, 3.52, 4),
                  (256, 0, 640, 3.35, 3),
                  (257, 1, 520, 3.3, 2),
                  (258, 1, 620, 3.95, 3),
                  (259, 0, 520, 3.51, 2),
                  (260, 0, 640, 3.81, 2),
                  (261, 0, 680, 3.11, 2),
                  (262, 0, 440, 3.15, 2),
                  (263, 1, 520, 3.19, 3),
                  (264, 1, 620, 3.95, 3),
                  (265, 1, 520, 3.9, 3),
                  (266, 0, 380, 3.34, 3),
                  (267, 0, 560, 3.24, 4),
                  (268, 1, 600, 3.64, 3),
                  (269, 1, 680, 3.46, 2),
                  (270, 0, 500, 2.81, 3),
                  (271, 1, 640, 3.95, 2),
                  (272, 0, 540, 3.33, 3),
                  (273, 1, 680, 3.67, 2),
                  (274, 0, 660, 3.32, 1),
                  (275, 0, 520, 3.12, 2),
                  (276, 1, 600, 2.98, 2),
                  (277, 0, 460, 3.77, 3),
                  (278, 1, 580, 3.58, 1),
                  (279, 1, 680, 3, 4),
                  (280, 1, 660, 3.14, 2),
                  (281, 0, 660, 3.94, 2),
                  (282, 0, 360, 3.27, 3),
                  (283, 0, 660, 3.45, 4),
                  (284, 0, 520, 3.1, 4),
                  (285, 1, 440, 3.39, 2),
                  (286, 0, 600, 3.31, 4),
                  (287, 1, 800, 3.22, 1),
                  (288, 1, 660, 3.7, 4),
                  (289, 0, 800, 3.15, 4),
                  (290, 0, 420, 2.26, 4),
                  (291, 1, 620, 3.45, 2),
                  (292, 0, 800, 2.78, 2),
                  (293, 0, 680, 3.7, 2),
                  (294, 0, 800, 3.97, 1),
                  (295, 0, 480, 2.55, 1),
                  (296, 0, 520, 3.25, 3),
                  (297, 0, 560, 3.16, 1),
                  (298, 0, 460, 3.07, 2),
                  (299, 0, 540, 3.5, 2),
                  (300, 0, 720, 3.4, 3),
                  (301, 0, 640, 3.3, 2),
                  (302, 1, 660, 3.6, 3),
                  (303, 1, 400, 3.15, 2),
                  (304, 1, 680, 3.98, 2),
                  (305, 0, 220, 2.83, 3),
                  (306, 0, 580, 3.46, 4),
                  (307, 1, 540, 3.17, 1),
                  (308, 0, 580, 3.51, 2),
                  (309, 0, 540, 3.13, 2),
                  (310, 0, 440, 2.98, 3),
                  (311, 0, 560, 4, 3),
                  (312, 0, 660, 3.67, 2),
                  (313, 0, 660, 3.77, 3),
                  (314, 1, 520, 3.65, 4),
                  (315, 0, 540, 3.46, 4),
                  (316, 1, 300, 2.84, 2),
                  (317, 1, 340, 3, 2),
                  (318, 1, 780, 3.63, 4),
                  (319, 1, 480, 3.71, 4),
                  (320, 0, 540, 3.28, 1),
                  (321, 0, 460, 3.14, 3),
                  (322, 0, 460, 3.58, 2),
                  (323, 0, 500, 3.01, 4),
                  (324, 0, 420, 2.69, 2),
                  (325, 0, 520, 2.7, 3),
                  (326, 0, 680, 3.9, 1),
                  (327, 0, 680, 3.31, 2),
                  (328, 1, 560, 3.48, 2),
                  (329, 0, 580, 3.34, 2),
                  (330, 0, 500, 2.93, 4),
                  (331, 0, 740, 4, 3),
                  (332, 0, 660, 3.59, 3),
                  (333, 0, 420, 2.96, 1),
                  (334, 0, 560, 3.43, 3),
                  (335, 1, 460, 3.64, 3),
                  (336, 1, 620, 3.71, 1),
                  (337, 0, 520, 3.15, 3),
                  (338, 0, 620, 3.09, 4),
                  (339, 0, 540, 3.2, 1),
                  (340, 1, 660, 3.47, 3),
                  (341, 0, 500, 3.23, 4),
                  (342, 1, 560, 2.65, 3),
                  (343, 0, 500, 3.95, 4),
                  (344, 0, 580, 3.06, 2),
                  (345, 0, 520, 3.35, 3),
                  (346, 0, 500, 3.03, 3),
                  (347, 0, 600, 3.35, 2),
                  (348, 0, 580, 3.8, 2),
                  (349, 0, 400, 3.36, 2),
                  (350, 0, 620, 2.85, 2),
                  (351, 1, 780, 4, 2),
                  (352, 0, 620, 3.43, 3),
                  (353, 1, 580, 3.12, 3),
                  (354, 0, 700, 3.52, 2),
                  (355, 1, 540, 3.78, 2),
                  (356, 1, 760, 2.81, 1),
                  (357, 0, 700, 3.27, 2),
                  (358, 0, 720, 3.31, 1),
                  (359, 1, 560, 3.69, 3),
                  (360, 0, 720, 3.94, 3),
                  (361, 1, 520, 4, 1),
                  (362, 1, 540, 3.49, 1),
                  (363, 0, 680, 3.14, 2),
                  (364, 0, 460, 3.44, 2),
                  (365, 1, 560, 3.36, 1),
                  (366, 0, 480, 2.78, 3),
                  (367, 0, 460, 2.93, 3),
                  (368, 0, 620, 3.63, 3),
                  (369, 0, 580, 4, 1),
                  (370, 0, 800, 3.89, 2),
                  (371, 1, 540, 3.77, 2),
                  (372, 1, 680, 3.76, 3),
                  (373, 1, 680, 2.42, 1),
                  (374, 1, 620, 3.37, 1),
                  (375, 0, 560, 3.78, 2),
                  (376, 0, 560, 3.49, 4),
                  (377, 0, 620, 3.63, 2),
                  (378, 1, 800, 4, 2),
                  (379, 0, 640, 3.12, 3),
                  (380, 0, 540, 2.7, 2),
                  (381, 0, 700, 3.65, 2),
                  (382, 1, 540, 3.49, 2),
                  (383, 0, 540, 3.51, 2),
                  (384, 0, 660, 4, 1),
                  (385, 1, 480, 2.62, 2),
                  (386, 0, 420, 3.02, 1),
                  (387, 1, 740, 3.86, 2),
                  (388, 0, 580, 3.36, 2),
                  (389, 0, 640, 3.17, 2),
                  (390, 0, 640, 3.51, 2),
                  (391, 1, 800, 3.05, 2),
                  (392, 1, 660, 3.88, 2),
                  (393, 1, 600, 3.38, 3),
                  (394, 1, 620, 3.75, 2),
                  (395, 1, 460, 3.99, 3),
                  (396, 0, 620, 4, 2),
                  (397, 0, 560, 3.04, 3),
                  (398, 0, 460, 2.63, 2),
                  (399, 0, 700, 3.65, 2),
                  (400, 0, 600, 3.89, 3)) n (rn, admit, gre, gpa, [rank]);

We can run the following SQL to get some basic descriptive statistics to make sure that the data are loaded correctly.

SELECT
   n.lbl,
   wct.QUARTILE(admit,n.x) as Admit,
   wct.QUARTILE(gre,n.x) as gre,
   wct.QUARTILE(gpa,n.x) as gpa,
   wct.QUARTILE(rank,n.x) as rank
FROM
   #mydata
CROSS APPLY(VALUES
   ('Min',0),('1st Quartile',1),('Median',2),('3rd Quartile',3),('Max',4))n(lbl,x)
GROUP BY
   n.lbl
UNION
SELECT
   'Mean',
   AVG(cast(admit as float)) as Admit,
   AVG(cast(gre as float)) as gre,
   AVG(cast(gpa as float)) as gpa,
   AVG(cast(rank as float)) as rank
FROM
   #mydata
ORDER BY
   3;

This produces the following result.

lblAdmitgregparank
Min02202.261
1st Quartile05203.132
Median05803.3952
Mean0.3175587.73.38992.485
3rd Quartile16603.673
Max180044

To get the standard deviations, we could run the following SQL.

SELECT wct.STDEV_S(admit) as Admit,
       wct.STDEV_S(gre) as gre,
       wct.STDEV_S(gpa) as gpa,
       wct.STDEV_S(rank) as rank
  FROM #mydata;

This produces the following result.

Admitgregparank
0.466086732384932115.5165363722380.3805667716303840.944460169902007

This SQL will create a two-way contingency table between the admit outcome and the rank predictor. We run this SQL to check to see if there are any zeroes in the contingency table.

SELECT admit,
       [1],
       [2],
       [3],
       [4]
  FROM (SELECT admit, rank FROM #mydata) p
  PIVOT (   COUNT([rank])
              FOR [rank] IN ([1], [2], [3], [4])) as d;

This produces the following result.

admit1234
028979355
133542812

This SQL will run the logistic regression and put the results in a temporary table, #mylogit. The only reason for storing the results in the temp table is to make it easier to follow the rest of the example.

SELECT *
INTO   #mylogit
  FROM wct.LOGIT(
           'SELECT
    admit
   ,gre
   ,gpa
   ,CASE RANK
       WHEN 2 THEN 1
       ELSE 0
    END
   ,CASE RANK
       WHEN 3 THEN 1
       ELSE 0
    END
   ,CASE RANK
       WHEN 4 THEN 1
       ELSE 0
    END
   FROM
       #mydata',
           1);

Note that we have taken the rank column and turned it into 3 columns, representing the ranks of 2, 3, and 4, since we are treating the rank variable as discrete and not continuous.

In this SQL, we are going to select the estimated coefficients, the standard error of the estimates, the z-value, and the probability associated with that z-value, and return the results in spreadsheet format with the appropriate labels for the estimated coefficients.

SELECT CASE idx
            WHEN 0 THEN 'Intercept'
            WHEN 1 THEN 'gre'
            WHEN 2 THEN 'gpa'
            WHEN 3 THEN 'rank2'
            WHEN 4 THEN 'rank3'
            WHEN 5 THEN 'rank4' END as [X],
       ROUND(b, 5) as [Estimated],
       ROUND(se, 5) as [Std. Error],
       ROUND(z, 2) as [z Value],
       ROUND(pval, 5) as [Pr(>|z|)],
       ROUND(Wald, 2) as Wald
  FROM #mylogit
    PIVOT (   SUM(stat_val)
              FOR stat_name IN (b, se, z, pval, Wald)) as pvt
 WHERE idx IS NOT NULL
 ORDER BY idx;

This produces the following result.

XEstimatedStd. Errorz ValuePr(>|z|)Wald
Intercept-3.989981.13995-3.500.0004712.25
gre0.002260.001092.070.038474.28
gpa0.804040.331822.420.015395.87
rank2-0.675440.31649-2.130.032834.55
rank3-1.340200.34531-3.880.0001015.06
rank4-1.551460.41783-3.710.0002013.79

The results of the regression indicate that both the gre and gpa are statistically significant as are the three terms for rank. You could interpret the results in the following way:

    •     for every one unit change in gre, the log odds of success (admission) increase by 0.00226.

    •     for every one unit change in gpa, the log odds of success increase by 0.80404.

    •     since the rank variable is not continuous it is interpreted differently. For example, attending an undergraduate institution with a rank of 2 (as opposed to 1) decreases the log odds of admission by 0.67544.

The following SQL returns the remaining statistics from the regression analysis.

SELECT stat_name,
       stat_val
  FROM #mylogit
 WHERE idx IS NULL;

This produces the following result.

stat_namestat_val
LL0-249.988258777458
LLM-229.25874623795
chisq41.4590250790163
df5
p_chisq7.57819423181544E-08
AIC470.517492475899
BIC494.466279758547
Nobs400
rsql0.0829219445780524
rsqcs0.0984570211879663
rsqn0.137995801309716
D-0.173125379863381
AUROC0.692769173084134
Iterations5
Converged1

See Also

LINEST - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values

LOGEST - Logarithmic regression

LOGITPRED - Calculate predicted values based on a logit regression

LOGITPROB - Calculate the probability that Y = 1 given a set of coefficients from a logistic regression

LOGITSUM - Logit regression using summary data

VIF - Variance inflation factors