Logo

SQL Server LMPRED Function

Updated 2023-11-01 15:02:30.317000

Description

Use the scalar function LMPRED to evaluate a dependent variable from the independent variables using the regression coefficients from the output of the LINEST or LINEST_q function. You can use the following equation to calculate the dependent value, y, from the independent variables, x.

y = m_nx_n + m_{n-1}x_{n-1} + m_{n-2}x_{n-2} + ... + m_1x_1 + m_0

The m values are included in the output from the XLeratorDB linear regression functions LINEST and LINEST_q and have a stat_name of 'm'. The subscripts for m are in the idx column of the regression output.

LMPRED matches the coefficients of regression (m) to the new independent variables in one of two ways. You can specify that they be matched using the names of the coefficients and the column names of the independent variables or by using the idx associated with m and the relative position of the new independent variables

Syntax

SELECT [westclintech].[wct].[LMPRED](
  <@Coef, nvarchar(max),>
 ,<@New_x, nvarchar(max),>
 ,<@UseNames, bit,>)

Arguments

@Coef

A string which contains a SELECT statement which returns a resultant table consisting of the idx, stat_val and col_name from the LINEST or LINEST_q table-valued functions.

@New_x

A string which contains either a SELECT statement which returns a single row consisting of the independent variables to be evaluated or a comma separated string of new independent variables. When @New_x is not a SELECT statement @UseNames is ignored and coefficients are match to independent variables based on their position in the string.

@UseNames

A bit value indicating whether or not use the independent variable names to match up the coefficients with the independent variables. When True the coefficients are match to the independent variables using the names. When False, the coefficients are match to the independent variables using the column position in the resultant table (or string) returned by @New_x.

Return Type

float

Remarks

There will be 1 more coefficient than independent variable.

When @UseNames is 1, LMPRED matches the col_name from the regression to the column names in @New_x.

LMPRED does not calculate the regression coefficients.

Examples

Example #1

--Put data into a table
SELECT IDENTITY(int, 1, 1) as recno,
       *
INTO   #L
  FROM (   VALUES (2310, 2, 2, 20, 142000),
                  (2333, 2, 2, 12, 144000),
                  (2356, 3, 1.5, 33, 151000),
                  (2379, 3, 2, 43, 150000),
                  (2402, 2, 3, 53, 139000),
                  (2425, 4, 2, 23, 169000),
                  (2448, 2, 1.5, 99, 126000),
                  (2471, 2, 2, 34, 142900),
                  (2494, 3, 3, 23, 163000),
                  (2517, 4, 4, 55, 169000),
                  (2540, 2, 3, 22, 149000)) n ([Floor Space], [Offices], [Entrances], [Age], [Assessed Value]);

--Run the linear regression and store the results
SELECT *
INTO   #lm
  FROM wct.LINEST('#L', '[Floor Space],[Offices],[Entrances],[Age],[Assessed Value]', '', NULL, 5, 'True');

The regression coefficients are stored in the temp table #lm and we can select them using the following SQL.

SELECT idx,
       stat_val,
       col_name
  FROM #lm
 WHERE stat_name = 'm';

This produces the following result.

idxstat_valcol_name
052317.8305072913Intercept
127.6413873660203Floor Space
212529.7681670868Offices
32553.21066039153Entrances
4-234.237164471202Age

As you can see from the resultant table, the regression coefficients can be referenced by the idx column or the col_name column. We can use the LMPRED function with the @UseNames = 1 and the input data to calculate a predicted Assessed Value. Note that we have passed all the column names from the input table into the @New_x variable not just the independent variable. By using @UseNames = 1, LMPRED matched the regression coefficient names to the column names returned by the @New_x SQL.

SELECT recno,
       [Floor Space],
       [Offices],
       [Entrances],
       [Age],
       [Assessed Value],
       wct.LMPRED(
           'SELECT
              idx
              ,stat_val
              ,col_name
          FROM
              #lm
          WHERE
              stat_name = ''m'''                                                                                                                                                    ,
           --@Coef
           'SELECT
              *
          FROM
              #l
          WHERE recno = ' + cast(recno as varchar(max)), --@New_x
           1 --@UseNames
       ) as yhat
  FROM #l;

This produces the following result.

recnoFloor SpaceOfficesEntrancesAgeAssessed Valueyhat
1231022.020142000141650.649688331
2233322.012144000144160.298913519
3235631.533151000151130.233205933
4237932.043150000150700.218800835
5240223.053139000139017.041558846
6242542.023169000169186.234076183
7244821.599126000125683.819821421
8247122.034142900142821.592751663
9249433.023163000161116.932297743
10251744.055169000169340.073771561
11254023.022149000150092.905113964

Example #2

Using the same regression coefficients as in the previous example, we set @UseNames = 0. This means that LMPRED will not use the regression coefficient names to match to the independent variables but will use the idx values instead which refer to the relative column position in the resultant table from the @New_x SQL. In other words, idx = 1 means the first column, idx = 2 means the second column, etc.

SELECT recno,
       wct.LMPRED(
           'SELECT idx,stat_val FROM #lm WHERE stat_name = ''m''',
           'SELECT [Floor Space],[Offices],[Entrances],[Age] FROM #l WHERE recno = ' + cast(recno as varchar(max)),
           0) as Yhat
  FROM #L;

This produces the following result.

recnoYhat
1141650.649688331
2144160.298913519
3151130.233205933
4150700.218800835
5139017.041558846
6169186.234076183
7125683.819821421
8142821.592751663
9161116.932297743
10169340.073771561
11150092.905113964

Example #3

Using the same regression coefficients as Example #1 this example shows how to calculate the predicted value when the new x-values are passed in as a comma separated string rather than as a SELECT statement. We use the built-in CONCAT function to build the string a derived table and then pass that value into LMPRED .

SELECT wct.LMPRED('SELECT idx,stat_val FROM #lm WHERE stat_name = ''m''', newx, 0) as Yhat
  FROM (   SELECT CONCAT([Floor Space], ',', [Offices], ',', [Entrances], ',', [Age]) as newx
             FROM #L) n;

This produces the following result.

Yhat
141650.649688331
144160.298913519
151130.233205933
150700.218800835
139017.041558846
169186.234076183
125683.819821421
142821.592751663
161116.932297743
169340.073771561
150092.905113964

See Also

FORECAST - The predicted y-value for a given x-value

LINEST - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values

LINEST_q - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values

TREND - for simpler queries

TRENDMX - Calculate the values along a linear trend for multiple x values