Logo

SQL Server LINEST Function

Updated 2023-11-01 13:09:08.680000

Description

Use the table-valued function LINEST to calculate the Ordinary Least Squares (OLS) solution for a series of x- and y-values. The OLS solution calculates a line that best fits the data supplied to the function. The LINEST function returns the statistics that describe the calculated solution, including the coefficients (m), the standard  error of the coefficients (se), the t statistic for each coefficient (tstat) and the associated p-values (pval), the coefficient of determination (rsq), the adjusted r-square value (rsqa) and the modified r-square value (rsqm), the standard error of the y estimate (sey), the F-observed value (F), the residual degrees of freedom (df), the regression sum of squares (ss_reg) and the residual some of squares (ss_resid).

In the case where we have one column of x-values and a column of y-values, the OLS solution is immediately recognizable as the formula for a line:

y = mx + b

For the purpose of this function, we would re-write the equation as

y=m_1x_1 + m_0

The value for slope, then, is stored in m0.

For purposes of multi-linear regression, where there are multiple columns of x-values (and still a single column of y-values), the formula for the solution is described by the following equation:

y = m_nx_n + m_{n-1}x_{n-1} + m_{n-2}x_{n-2} + ... + m_1x_1 + m_0

Where n is the number of x-columns.

The function expects the input to be in row-column, or spreadsheet, format, rather than in third-normal form. Thus, the input into the function requires the specification of the column names for the x- and y-values, the specification of a 1-based index identifying the column number for the y-values, and a bit value which specifies whether or not the solution has a non-zero y-intercept. The column specifications and the table or view that contains the data are passed into the function as strings. The function dynamically creates SQL and the resultant table from the SQL is used as input into the OLS calculations.

LINEST automatically detects collinearity and removes the right-most co-linear column resulting in a regression coefficient of 0 for that column.

Syntax

SELECT * FROM [westclintech].[wct].[LINEST](
  <@TableName, nvarchar(max),>
 ,<@ColumnNames, nvarchar(4000),>
 ,<@GroupedColumnName, nvarchar(4000),>
 ,<@GroupedColumnValue, sql_variant,>
 ,<@Y_ColumnNumber, int,>
 ,<@Lconst, bit,>)

Arguments

@TableName

the name, as text, of the table or view that contains the values used in the LINEST calculation.

@ColumnNames

the name, as text, of the columns in the table or view specified by @TableName that contain the values used in the LINEST calculation. Data returned from the @ColumnNames must be of the type float or of a type that implicitly converts to float.

@GroupedColumnName

the name, as text, of the column in the table or view specified by @TableName which will be used for grouping the results.

@GroupedColumnValue

the column value to do the grouping on.

@Y_ColumnNumber

the index into the array identifying the column containing the y-values. The index value must be between 1 and n, where n is the number of columns specified in @ColumnNames. @y_ColumnNumber must be of the type int or of a type that implicitly converts to int.

@Lconst

A logical value specifying whether to force the y-intercept value ( m 0) equal to zero. @LConst must be of the type bit or of a type that implicitly converts to bit.

Return Type

table

colNamecolDatatypecolDesc
stat_namenvarchar(4000)Identifies the statistic being returned: m the estimated coefficient se the standard error of the estimated coefficient tstat the t statistic pval the p-value (t distribution) for the t statistic rsq the coefficient of determination (r2) rsqa adjusted r square rsqm multiple r square sey the standard error for the y estimate f the f-observed value df the residual degrees of freedom ss_reg the regression sum of squares ss_resid the residual sum of squares
idxintIdentifies the subscript for the estimated coefficient, the standard error of the estimated coefficient, the t statistic, and the p-value. For example, the stat_name m with an idx of 0, specifies that the stat_val is for m0, or the y-intercept (which is b in y = mx + b). An idx of 1 for the same stat_name identifies m1. The stat_name se with an idx of 0 identifies the standard error of the m0 coefficient (which is sometimes referred to as the standard error of b or seb). idx values are only supplied for the m, se, tstat, and pval stat_names. All others will have an idx of NULL.
stat_valfloatthe calculated value of the statistic.
col_namenvarchar(4000)the column name from the resultant table produced by the dynamic SQL. col_name values are produced only for the m, se, tstat, and pval statistics; all other stat_names have NULL for col_name.

Remarks

If @Lconst is NULL then @Lconst is set to 'True'.

If @LConst is true than the number of rows must be greater than the number of columns.

If @Lconst is false than the number of rows must be greater than or equal to the number of columns.

For more complicated queries, you can try the LINEST_q function.

Examples

Example #1

We put the x- and y-data into a temp table, #xy.

SELECT *
INTO #xy
FROM
(
    VALUES
        (6.1, 1.21, 4.35, 5.42, 6.45, 138.08),
        (7.95, 0.97, 2.79, 4.73, 8.14, 80.3),
        (8.53, 9.73, 9.7, 1.16, 9.05, 284.45),
        (7.4, 1.61, 9.9, 8.8, 4.38, 226.66),
        (7.42, 4.58, 0.06, 8.97, 8.75, 112.37),
        (6.19, 3.56, 9.69, 8.7, 5.67, 168.73),
        (8.44, 3.85, 8.23, 1.05, 0.92, 160.38),
        (0.84, 3.86, 7.85, 2.14, 3.03, 129.26),
        (0.37, 7.33, 5.07, 8.06, 3.25, 170.39),
        (7.48, 0.68, 8.34, 2.98, 2.81, 188.53),
        (5.33, 3.51, 7.03, 6.49, 7.54, 131.1)
) n (x1, x2, x3, x4, x5, y);

This is what the data look like.

x1x2x3x4x5y
6.101.214.355.426.45138.08
7.950.972.794.738.1480.30
8.539.739.701.169.05284.45
7.401.619.908.804.38226.66
7.424.580.068.978.75112.37
6.193.569.698.705.67168.73
8.443.858.231.050.92160.38
0.843.867.852.143.03129.26
0.377.335.078.063.25170.39
7.480.688.342.982.81188.53
5.333.517.036.497.54131.10

To invoke the table-valued function LINEST , we enter the following SQL.

SELECT *
FROM wct.LINEST(   '#xy',              --@TableName
                   'x1,x2,x3,x4,x5,y', --@ColumnNames
                   '',                 --@GroupedColumnName
                   NULL,               --@GroupedColumnValue
                   6,                  --@Y_ColumnNumber
                   'True'              --@Lconst
               );

Note that the table name and the column names are both bound by single-quotes, so that they are passed into the function as a string. Essentially the function is dynamically creating a SQL statement to SELECT the column names from the #xy table. This means that we can actually simplify the second parameter, since we are selecting all the columns in this particular table, by entering the following:

SELECT *
FROM wct.LINEST(   '#xy', --@TableName
                   '*',   --@ColumnNames
                   '',    --@GroupedColumnName
                   NULL,  --@GroupedColumnValue
                   6,     --@Y_ColumnNumber
                   'True' --@Lconst
               );

The third parameter is entered as blank (2 single quotes), since there is no column on which we want to group the results for input into the calculation and the fourth parameter is NULL, since there is no grouped column name. The fifth parameter, the y-column value, specifies that the 6th column in the resultant table, which becomes the input into the calculation, contains the y-values. The last parameter specified that we want to calculate the y-intercept.

The following results are produced.

stat_nameidxstat_valcol_name
m0-2.00793721243439Intercept
m16.75602376546862x1
m210.7172546474712x2
m311.482088782086x3
m42.66947583235558x4
m5-1.11015571062543x5
se056.6362962641145Intercept
se15.18738179897952x1
se25.0471020356278x2
se34.44876625513428x3
se44.35402063583205x4
se55.97904930603055x5
tstat0-0.0354531871764828Intercept
tstat11.30239570312671x1
tstat22.12344719243191x2
tstat32.58096023112803x3
tstat40.613105921085155x4
tstat5-0.185674285961434x5
pval00.973090230335637Intercept
pval10.249542367556677x1
pval20.087124225219611x2
pval30.0493746060236316x3
pval40.566619010443336x4
pval50.859997582994016x5
rsqNULL0.777597428923374NULL
seyNULL37.5668536635793NULL
FNULL3.49635089720011NULL
dfNULL5NULL
ss_regNULL24671.4493290961NULL
ss_residNULL7056.3424709039NULL
rsqmNULL0.881814849570687NULL
rsqaNULL0.555194857846747NULL

The results are returned in 3rd normal form. You can use standard SQL commands to re-format the results. For example, if you wanted to produce the the coefficients in a format similar to output of the Excel Data Analysis Regression Tool, you could use the following SQL.

;WITH mycte
as (SELECT *
    FROM wct.LINEST('#xy', '*', '', NULL, 6, 1) p )
SELECT d.col_name,
       d.m,
       d.se,
       d.tstat,
       d.pval,
       m - wct.T_INV_2T(.05, m.stat_val) * se as [Lower Confidence Level],
       m + wct.T_INV_2T(.05, m.stat_val) * se as [Upper Confidence Level]
FROM mycte p
    PIVOT
    (
        MAX(stat_val)
        FOR stat_name in (m, se, tstat, pval)
    ) d
    CROSS JOIN mycte m
WHERE stat_name = 'df'
      AND d.col_name IS NOT NULL;

This produces the following result.

col_namemsetstatpvalLower Confidence LevelUpper Confidence Level
Intercept-2.0079372124343956.6362962641145-0.03545318717648280.973090230335637-147.596171626684143.580297201815
x16.756023765468625.187381798979521.302395703126710.249542367556677-6.5785656614985720.0906131924358
x210.71725464747125.04710203562782.123447192431910.087124225219611-2.2567341679166823.6912434628591
x311.4820887820864.448766255134282.580960231128030.04937460602363160.046171055645968922.9180065085259
x42.669475832355584.354020635832050.6131059210851550.566619010443336-8.5228905260999713.8618421908111
x5-1.110155710625435.97904930603055-0.1856742859614340.859997582994016-16.479791251081514.2594798298306

Similarly, if you wanted to reformat the results to produce the equivalent of the ANOVA table from the Excel Data Analysis Regression tool, you could use the following SQL.

;WITH mycte as (
   SELECT
        df as [Residual df]
       ,Obs - df - 1 as [Regression df]
       ,ss_reg as [Regression SS]
       ,ss_resid as [Residual SS]
       ,F
   FROM (
       SELECT
           stat_name
          ,stat_Val
       FROM
          wct.LINEST('#xy','*','',NULL,6,1)
       WHERE
          stat_name in('ss_reg','ss_resid','F','df')
       UNION
       SELECT
          'Obs'
          ,COUNT(*)
       FROM
          #xy)d
   PIVOT(MAX(stat_val) FOR stat_name in(df,F,ss_reg,ss_resid,Obs))p
   ) 
SELECT
    'Regression'
   ,[Regression df] as DF
   ,[Regression SS] as SS
   ,[Regression SS]/[Regression df] as MS
   ,F
   ,wct.F_DIST_RT(F,[Regression df],[Residual df]) as [Significance F] 
FROM
   mycte
UNION ALL
SELECT
    'Residual'
   ,[Residual df]
   ,[Residual SS]
   ,[Residual SS]/[Residual df] as [Residual MS]
   ,NULL
   ,NULL
FROM
   mycte
UNION ALL
SELECT
    'Total'
   ,[Regression df] + [Residual df]
   ,[Regression SS] + [Residual SS]
   ,NULL
   ,NULL
   ,NULL
FROM
   mycte;

This produces the following result.

(No Column Name)DFSSMSFSignificance F
Regression524671.44932909614934.289865819223.496350897200110.0978733240631539
Residual57056.34247090391411.26849418078NULLNULL
Total1031727.7918NULLNULLNULL

Example #2

Using the same data as Example #1, if we wanted to calculate the coefficients with a y-intercept of 0 we would modify the function call to make the last parameter FALSE. This time, however, we will put the results into a temporary table, #L0, and reformat the results using the #L0 table.

SELECT *
INTO #L0
FROM wct.LINEST(   '#xy',  --@TableName
                   '*',    --@ColumnNames
                   '',     --@GroupedColumnName
                   NULL,   --@GroupedColumnValue
                   6,      --@Y_ColumnNumber
                   'False' --@Lconst
               );

The #L0 table should contain the following data.

stat_nameidxstat_valcol_name
m00Intercept
m16.68653600898501x1
m210.669110550903x2
m311.3894353187733x3
m42.58577750738032x4
m5-1.15990773149072x5
se00Intercept
se14.38493471629212x1
se24.4380097835218x2
se33.28695713905425x3
se43.34008793898134x4
se55.30630339494569x5
tstat0NULLIntercept
tstat11.52488838297668x1
tstat22.40403042609709x2
tstat33.46503919489816x3
tstat40.774164499443965x4
tstat5-0.218590541316492x5
pval0NULLIntercept
pval10.178129170463663x1
pval20.0530030468411133x2
pval30.0133846278949679x3
pval40.468231464499349x4
pval50.834214519463804x5
rsqNULL0.978154399885257NULL
seyNULL34.2979988105928NULL
FNULL53.7309697924091NULL
dfNULL6NULL
ss_regNULL316032.862965531NULL
ss_residNULL7058.11633446853NULL
rsqmNULL0.989016885541019NULL
rsqaNULL0.959949733122971NULL

We can use the same technique as in Example #1 to reformat the coefficient statistics.

SELECT d.col_name,
       d.m,
       d.se,
       d.tstat,
       d.pval,
       m - wct.T_INV_2T(.05, m.stat_val) * se as [Lower Confidence Level],
       m + wct.T_INV_2T(.05, m.stat_val) * se as [Upper Confidence Level]
FROM #L0 p
    PIVOT
    (
        MAX(stat_val)
        FOR stat_name in (m, se, tstat, pval)
    ) d
    CROSS JOIN #L0 m
WHERE m.stat_name = 'df'
      AND d.col_name IS NOT NULL;

This produces the following result.

col_namemsetstatpvalLower Confidence LevelUpper Confidence Level
Intercept00NULLNULL00
x16.686536008985014.384934716292121.524888382976680.178129170463663-4.0430127148071917.4160847327772
x210.6691105509034.43800978352182.404030426097090.0530030468411133-0.19030818389383721.5285292856998
x311.38943531877333.286957139054253.465039194898160.01338462789496793.346540941015919.4323296965307
x42.585777507380323.340087938981340.7741644994439650.468231464499349-5.5871232543795110.7586782691401
x5-1.159907731490725.30630339494569-0.2185905413164920.834214519463804-14.143964394354111.8241489313727

Note that because we specified a 0 intercept an intercept row is still created with a regression coefficient of 0 (keeping the results consistent with the Excel Data Analysis Regression). It is simple enough to exclude this from the output (like R does) simply by adding another condition to the WHERE clause.

SELECT d.col_name,
       d.m,
       d.se,
       d.tstat,
       d.pval,
       m - wct.T_INV_2T(.05, m.stat_val) * se as [Lower Confidence Level],
       m + wct.T_INV_2T(.05, m.stat_val) * se as [Upper Confidence Level]
FROM #L0 p
    PIVOT
    (
        MAX(stat_val)
        FOR stat_name in (m, se, tstat, pval)
    ) d
    CROSS JOIN #L0 m
WHERE m.stat_name = 'df'
      AND d.col_name IS NOT NULL
      AND d.col_name <> 'Intercept';

The following SQL can be used to reproduce the ANOVA table.

;WITH mycte
as (SELECT df as [Residual df],
           --,Obs - df - 1 as [Regression df]
           Obs - df as [Regression df],
           ss_reg as [Regression SS],
           ss_resid as [Residual SS],
           F
    FROM
    (
        SELECT stat_name,
               stat_Val
        FROM #L0
        WHERE stat_name in ( 'ss_reg', 'ss_resid', 'F', 'df' )
        UNION
        SELECT 'Obs',
               COUNT(*)
        FROM #xy
    ) d
    PIVOT
    (
        MAX(stat_val)
        FOR stat_name in (df, F, ss_reg, ss_resid, Obs)
    ) p)
SELECT 'Regression',
       [Regression df] as DF,
       [Regression SS] as SS,
       [Regression SS] / [Regression df] as MS,
       F,
       wct.F_DIST_RT(F, [Regression df], [Residual df]) as [Significance F]
FROM mycte
UNION ALL
SELECT 'Residual',
       [Residual df],
       [Residual SS],
       [Residual SS] / [Residual df] as [Residual MS],
       NULL,
       NULL
FROM mycte
UNION ALL
SELECT 'Total',
       [Regression df] + [Residual df],
       [Regression SS] + [Residual SS],
       NULL,
       NULL,
       NULL
FROM mycte;

This produces the following result.

(No Column Name)DFSSMSFSignificance F
Regression5316032.86296553163206.572593106353.73096979240916.67425061507056E-05
Residual67058.116334468531176.35272241142NULLNULL
Total11323090.9793NULLNULLNULL

Note that the calculation of the regression degrees of freedom needs to be adjusted to reflect the 0 intercept. It also worthwhile noting that the calculation in this SQL produces a Significance F that agrees with R, Excel produces a different value (which seems to be caused by subtracting 1 from the residual degrees of freedom).

Example #3

Here's another example. Instead of just having a matrix of x- and y-values in our table, we are going to add a column, testid, which is a way of grouping x- and y-values together for purposes of doing the LINEST calculation. This allows us to compute the ordinary least square values for multiple sets of data in a single SELECT statement.

Second, we take the x-values and raise them to the powers of 2, 3, 4, 5 thus solving for the equation:

y = B0 + B1x + B2x2 + B3x3 + B4x4 + B5x5

SELECT *
INTO #xy
FROM
(
    VALUES
        ('Wampler5', 7590001, 0),
        ('Wampler5', -20479994, 1),
        ('Wampler5', 20480063, 2),
        ('Wampler5', -20479636, 3),
        ('Wampler5', 25231365, 4),
        ('Wampler5', -20476094, 5),
        ('Wampler5', 20489331, 6),
        ('Wampler5', -20460392, 7),
        ('Wampler5', 18417449, 8),
        ('Wampler5', -20413570, 9),
        ('Wampler5', 20591111, 10),
        ('Wampler5', -20302844, 11),
        ('Wampler5', 18651453, 12),
        ('Wampler5', -20077766, 13),
        ('Wampler5', 21059195, 14),
        ('Wampler5', -19666384, 15),
        ('Wampler5', 26348481, 16),
        ('Wampler5', -18971402, 17),
        ('Wampler5', 22480719, 18),
        ('Wampler5', -17866340, 19),
        ('Wampler5', 10958421, 20),
        ('Wampler4', 75901, 0),
        ('Wampler4', -204794, 1),
        ('Wampler4', 204863, 2),
        ('Wampler4', -204436, 3),
        ('Wampler4', 253665, 4),
        ('Wampler4', -200894, 5),
        ('Wampler4', 214131, 6),
        ('Wampler4', -185192, 7),
        ('Wampler4', 221249, 8),
        ('Wampler4', -138370, 9),
        ('Wampler4', 315911, 10),
        ('Wampler4', -27644, 11),
        ('Wampler4', 455253, 12),
        ('Wampler4', 197434, 13),
        ('Wampler4', 783995, 14),
        ('Wampler4', 608816, 15),
        ('Wampler4', 1370781, 16),
        ('Wampler4', 1303798, 17),
        ('Wampler4', 2205519, 18),
        ('Wampler4', 2408860, 19),
        ('Wampler4', 3444321, 20),
        ('Wampler3', 760, 0),
        ('Wampler3', -2042, 1),
        ('Wampler3', 2111, 2),
        ('Wampler3', -1684, 3),
        ('Wampler3', 3888, 4),
        ('Wampler3', 1858, 5),
        ('Wampler3', 11379, 6),
        ('Wampler3', 17560, 7),
        ('Wampler3', 39287, 8),
        ('Wampler3', 64382, 9),
        ('Wampler3', 113159, 10),
        ('Wampler3', 175108, 11),
        ('Wampler3', 273291, 12),
        ('Wampler3', 400186, 13),
        ('Wampler3', 581243, 14),
        ('Wampler3', 811568, 15),
        ('Wampler3', 1121004, 16),
        ('Wampler3', 1506550, 17),
        ('Wampler3', 2002767, 18),
        ('Wampler3', 2611612, 19),
        ('Wampler3', 3369180, 20),
        ('Wampler2', 1, 0),
        ('Wampler2', 1.11111, 1),
        ('Wampler2', 1.24992, 2),
        ('Wampler2', 1.42753, 3),
        ('Wampler2', 1.65984, 4),
        ('Wampler2', 1.96875, 5),
        ('Wampler2', 2.38336, 6),
        ('Wampler2', 2.94117, 7),
        ('Wampler2', 3.68928, 8),
        ('Wampler2', 4.68559, 9),
        ('Wampler2', 6, 10),
        ('Wampler2', 7.71561, 11),
        ('Wampler2', 9.92992, 12),
        ('Wampler2', 12.75603, 13),
        ('Wampler2', 16.32384, 14),
        ('Wampler2', 20.78125, 15),
        ('Wampler2', 26.29536, 16),
        ('Wampler2', 33.05367, 17),
        ('Wampler2', 41.26528, 18),
        ('Wampler2', 51.16209, 19),
        ('Wampler2', 63, 20),
        ('Wampler1', 1, 0),
        ('Wampler1', 6, 1),
        ('Wampler1', 63, 2),
        ('Wampler1', 364, 3),
        ('Wampler1', 1365, 4),
        ('Wampler1', 3906, 5),
        ('Wampler1', 9331, 6),
        ('Wampler1', 19608, 7),
        ('Wampler1', 37449, 8),
        ('Wampler1', 66430, 9),
        ('Wampler1', 111111, 10),
        ('Wampler1', 177156, 11),
        ('Wampler1', 271453, 12),
        ('Wampler1', 402234, 13),
        ('Wampler1', 579195, 14),
        ('Wampler1', 813616, 15),
        ('Wampler1', 1118481, 16),
        ('Wampler1', 1508598, 17),
        ('Wampler1', 2000719, 18),
        ('Wampler1', 2613660, 19),
        ('Wampler1', 3368421, 20)
) n (testid, y, x);

Let's say wanted to run LINEST for the all the data where the testid is equal to Wampler3. We could simply enter the following statement.

ELECT *
FROM wct.LINEST(   '#xy',
                                                                      --@TableName
                   'y,x,POWER(x,2),POWER(x,3),POWER(x,4),POWER(x,5)', --@ColumnNames
                   'testid',                                          --@GroupedColumnName
                   'Wampler3',                                        --@GroupedColumnValue
                   1,                                                 --@Y_ColumnNumber
                   'True'                                             --@Lconst
               );

This produces the following result.

stat_nameidxstat_valcol_name
m01.00000000017179Intercept
m10.999999999679294x
m21.0000000000945Column1
m30.999999999989123Column2
m41.00000000000055Column3
m50.99999999999999Column4
se02152.32624678169Intercept
se12363.55173469678x
se2779.343524331576Column1
se3101.475507550349Column2
se45.64566512170747Column3
se50.112324854679311Column4
tstat00.000464613578757896Intercept
tstat10.000423092071563048x
tstat20.00128313121091521Column1
tstat30.00985459471087597Column2
tstat40.177127048530663Column3
tstat58.90274910975851Column4
pval00.999635414800478Intercept
pval10.999667996985275x
pval20.998993119116665Column1
pval30.992267170697082Column2
pval40.86177817377114Column3
pval52.25341849383758E-07Column4
rsqNULL0.99999555902582NULL
seyNULL2360.14502379269NULL
FNULL675524.458240117NULL
dfNULL15NULL
ss_regNULL18814317208116.7NULL
ss_residNULL83554268.0000007NULL
rsqmNULL0.999997779510445NULL
rsqaNULL0.999994078701093NULL

You will notice that the column names for the calculated columns are Column1, Column2, Column3, and Column4. This is because we have not assigned them a name and these are the default column names. Even though Column1 does not refer to the first column in the resultant table from our dynamic SQL, it is the first column with no name . To get more descriptive column names, you can simply assign the names in the @ColumnNames variable as in this example.

SELECT *
FROM wct.LINEST(
                   '#xy',                                                                                 --@TableName
                   'y,x,POWER(x,2) as [x^2],POWER(x,3) as [x^3],POWER(x,4) as [x^4],POWER(x,5) as [x^5]', --@ColumnNames
                   'testid',                                                                              --@GroupedColumnName
                   'Wampler3',                                                                            --@GroupedColumnValue
                   1,                                                                                     --@Y_ColumnNumber
                   'True'                                                                                 --@Lconst
               );

This produces the following result.

stat_nameidxstat_valcol_name
m01.00000000017179Intercept
m10.999999999679294x
m21.0000000000945x^2
m30.999999999989123x^3
m41.00000000000055x^4
m50.99999999999999x^5
se02152.32624678169Intercept
se12363.55173469678x
se2779.343524331576x^2
se3101.475507550349x^3
se45.64566512170747x^4
se50.112324854679311x^5
tstat00.000464613578757896Intercept
tstat10.000423092071563048x
tstat20.00128313121091521x^2
tstat30.00985459471087597x^3
tstat40.177127048530663x^4
tstat58.90274910975851x^5
pval00.999635414800478Intercept
pval10.999667996985275x
pval20.998993119116665x^2
pval30.992267170697082x^3
pval40.86177817377114x^4
pval52.25341849383758E-07x^5
rsqNULL0.99999555902582NULL
seyNULL2360.14502379269NULL
FNULL675524.458240117NULL
dfNULL15NULL
ss_regNULL18814317208116.7NULL
ss_residNULL83554268.0000007NULL
rsqmNULL0.999997779510445NULL
rsqaNULL0.999994078701093NULL

In the following SQL we return selected values for each test.

SELECT p.testid,
       p.rsq,
       p.rsqa,
       p.rsqm,
       p.F,
       p.df,
       p.ss_reg,
       p.ss_resid
  FROM (   SELECT n.testid,
                  k.stat_name,
                  k.stat_val
             FROM (SELECT DISTINCT testid FROM #xy) n
            CROSS APPLY wct.LINEST(
                            '#xy',
                            'y,x,POWER(x,2) as [x^2],POWER(x,3) as [x^3],POWER(x,4) as [x^4],POWER(x,5) as [x^5]',
                            'testid',
                            n.testid,
                            1,
                            'True') k ) d
  PIVOT (   SUM(stat_val)
            FOR stat_name IN (F, df, ss_reg, ss_resid, rsq, rsqm, rsqa)) p;

This produces the following result.

testidrsqrsqarsqmFdfss_regss_resid
Wampler11111.38198630394956E+321518814317208116.74.08419037605817E-19
Wampler21116.62975864402057E+31156602.918583651672.98785473417206E-28
Wampler30.999995559025820.9999940787010930.999997779510445675524.4582401171518814317208116.783554268.0000007
Wampler40.9574784408256620.943304587767550.97850827325355967.55244582401231518814317208116.7835542680000
Wampler50.0022466892157494-0.3303377477123340.04739925332480880.006755244582401241518814317208116.78.3554268E+15

In this SQL we select the coefficient statistics for each of the tests.

SELECT p.testid,
       p.col_name,
       p.m,
       p.se,
       p.tstat,
       p.pval
  FROM (   SELECT n.testid,
                  k.stat_name,
                  k.idx,
                  k.col_name,
                  k.stat_val
             FROM (SELECT DISTINCT testid FROM #xy) n
            CROSS APPLY wct.LINEST(
                            '#xy',
                            'y,x,POWER(x,2) as [x^2],POWER(x,3) as [x^3],POWER(x,4) as [x^4],POWER(x,5) as [x^5]',
                            'testid',
                            n.testid,
                            1,
                            'True') k
            WHERE k.idx IS NOT NULL) d
  PIVOT (   SUM(stat_val)
            FOR stat_name IN (m, se, tstat, pval)) p
 ORDER BY testid,
          idx;

This produces the following result.

testidcol_namemsetstatpval
Wampler1Intercept0.9999999999930221.50479303077928E-106645432159.365826.15882032963405E-140
Wampler1x0.9999999998444351.65247075510795E-106051544311.772772.50814535309203E-139
Wampler1x^21.000000000042255.44875901481339E-1118352802855.17451.4852856151972E-146
Wampler1x^30.9999999999956257.09463246033833E-12140951628655.3257.78564309561968E-160
Wampler1x^41.000000000000213.94715138653456E-132533472628923.331.17924139944035E-178
Wampler1x^50.9999999999999967.85316160862982E-151273372496118123.57284346398899E-204
Wampler2Intercept0.9999999999999954.0700853992358E-152456950903751941.86809429176221E-208
Wampler2x0.09999999999999794.46951637564801E-1522373785348420.37.60771017143417E-193
Wampler2x^20.01000000000000111.47375180882262E-156785403037428.734.5051705244268E-185
Wampler2x^30.0009999999999998531.91891904063468E-165211267275085.792.36154241344595E-183
Wampler2x^40.0001000000000000081.06760483988871E-179366761582912.283.57687675436682E-187
Wampler2x^59.99999999999986E-062.12408203303352E-1947079160995106.71.0837154071571E-197
Wampler3Intercept1.000000000171792152.326246781690.0004646135787578960.999635414800478
Wampler3x0.9999999996792942363.551734696780.0004230920715630480.999667996985275
Wampler3x^21.0000000000945779.3435243315760.001283131210915210.998993119116665
Wampler3x^30.999999999989123101.4755075503490.009854594710875970.992267170697082
Wampler3x^41.000000000000555.645665121707470.1771270485306630.86177817377114
Wampler3x^50.999999999999990.1123248546793118.902749109758512.25341849383758E-07
Wampler4Intercept1.00000001085174215232.6246781684.64613583719948E-060.999996354191052
Wampler4x0.999999977160948236355.1734696774.23092062035716E-060.999996679970601
Wampler4x^21.0000000084137377934.35243315731.28313122158987E-050.999989931165944
Wampler4x^30.99999999887446210147.55075503489.85459469989148E-050.999922670373091
Wampler4x^41.00000000006258564.5665121707450.00177127048541650.99861007362054
Wampler4x^50.9999999999987711.23248546793110.08902749109747690.930237861013585
Wampler5Intercept1.0000010995242121523262.46781684.64614089531958E-080.999999967977304
Wampler5x0.99999772851621523635517.34696774.23091110651957E-080.999999967977304
Wampler5x^21.000000834898237793435.243315731.28313228207793E-070.999999898735342
Wampler5x^30.9999998884099671014755.075503489.85459361130866E-070.999999226799412
Wampler5x^41.0000000062007156456.65121707451.7712704962888E-050.999986100711028
Wampler5x^50.9999999998782031123.24854679310.0008902749108674310.999301395744685

See Also

LINEST_q - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values

LOGEST - Logarithmic regression

TREND - for simpler queries

GROWTH - calculate predicted exponential growth using existing values

TRENDMX - Calculate the values along a linear trend for multiple x values

GROWTHMX - Values along an exponential trend

FORECAST - The predicted y-value for a given x-value