Logo

SQL Server TRENDMX_q Function

Updated 2024-03-14 20:08:28.300000

Description

Use the scalar function TRENDMX_q to calculate values along a linear trend. TRENDMX_q is designed specifically for multiple x-values. If there is only one x-value for each y-value, consider using the aggregate function TREND.

TRENDMX_q calculates the Ordinary Least Squares (OLS) solution for the given x- and y-values. The OLS solution calculates a straight line that fits the data supplied to the function. For more information on the calculation of these values, see LINEST.

TRENDMX_q computes a y-value with new x-values supplied as input to the function, using the coefficients calculated in the OLS solutions.

In the case where we have one column of x-values and a column of y-values, the OLS solution is immediately recognizable as the formula for a line:

y=mx+b

For the purpose of this function, we would re-write the equation as

y=m_1x+m_0

The value for slope, then, is stored in m0.

For purposes of multi-linear regression, where there are multiple columns of x-values (and still a single column of y-values), the formula for the solution is described by the following equation:

y=m_nx_n + m_{n-1}x_{n-1} + m_{n-2}x_{n-2} ... m_1x_1 + m_0

Where n is the number of x-columns.

The function expects the input to be in row-column, or spreadsheet, format, rather than in third-normal form. Thus, the input into the function requires the specification of the column names for the x-values and the y-values, and the specification of a 1-based index identifying the column reference for the y-values, and the specification of new x–values in the same column order as the x-values. The column specifications and the table or view that contains the data are passed into the function as strings.

Syntax

SELECT [westclintech].[wct].[TRENDMX_q] (
  <@Matrix_RangeQuery, nvarchar(max),>
 ,<@Y_ColumnNumber, int,>
 ,<@New_x, nvarchar(max),>
 ,<@Lconst, bit,>)

Arguments

@Matrix_RangeQuery

The SELECT statement, as a string, which, when executed, creates the resultant table of x- and y-values which will be used in the calculation. @MatrixRangeQuery values must evaluate to a type of float or of a type that implicitly converts to float.

@Y_ColumnNumber

the index into the array identifying the column containing the y-values. The index value must be between 1 and n, where n is the number of columns specified in @ColumnNames. @y_ColumnNumber must be of the type int or of a type that implicitly converts to int.

@New_x

the new x-values for which you want TRENDMX to return a corresponding y-value. @new_x values must evaluate to a type of float or of a type that implicitly converts to float.

@Lconst

a logical value specifying whether to force the y-intercept value (m0) equal to zero. @LConst must be of the type bit or of a type that implicitly converts to bit.

Return Type

float

Remarks

If @Lconst is NULL than @Lconst is set to 'True'.

If @Lconst is true than the number of rows must be greater than the number of columns.

If @Lconst is false than the number of rows must be greater than or equal to the number of columns.

For simpler queries, you can try the TRENDMX function.

Examples

We create a table and put some data in it.

CREATE TABLE #xy
(
    y float,
    x0 float,
    x1 float,
    x2 float
);
INSERT INTO #xy
VALUES
(18, 2, -4, 4);
INSERT INTO #xy
VALUES
(24, -9, 2, 9);
INSERT INTO #xy
VALUES
(16, 5, -6, 8);
INSERT INTO #xy
VALUES
(8.5, 5, -1, 4);

This is what the data look like.

yx0x1x2
182-44
24-929
165-68
8.55-14

We have 4 columns of x-data and one column of y-data, which is in the first column of the table #xy. To invoke the scalar function TRENDMX_q, we enter the following SQL.

SELECT wct.TRENDMX_q('SELECT y,x0,x1,x2 from #xy', 1, '-3,10,6', 'True') as TREND;

Note that the table name and the column names are both bound by single-quotes, so that they are passed into the function as a string. Essentially the function is dynamically creating a SQL statement to SELECT the column names from the #xy table. This means that we can actually simplify the second parameter, since we are selecting all the columns in this particular table, by entering the following:

SELECT wct.TRENDMX_q('SELECT * from #xy', 1, '-3,10,6', 'True') as TREND;

The second parameter, the y-column value, specifies that the 1st column in the resultant table, which becomes the input into the calculation, contains the y-values. The third parameter is the new x-values, which is entered as a comma-separated string. The last parameter specified that we want to calculate the y-intercept.

The following result is produced.

TREND
2.25089605734768

By looking at the LINEST function, we can see how TRENDMX came up with this value. If we run the following SQL, we can mimic the TREND calculation:

SELECT *,
       CASE
           WHEN idx = 0 THEN
               1
           WHEN idx = 1 THEN
               -3
           WHEN idx = 2 THEN
               10
           WHEN idx = 3 THEN
               6
       END as new_x
FROM wct.LINEST('#xy', '*', '', NULL, 1, 'True')
WHERE stat_name = 'm';

This returns the following result.

stat_nameidxstat_valnew_x
m015.18996415770611
m1-1.52329749103943-3
m2-1.6433691756272410
m3-0.179211469534056

With a slight modification to the SQL, we can then reproduce the same result as that which was returned by TRENDMX_q.

SELECT SUM(stat_val * new_x) as TREND
FROM
(
    SELECT *,
           CASE
               WHEN idx = 0 THEN
                   1
               WHEN idx = 1 THEN
                   -3
               WHEN idx = 2 THEN
                   10
               WHEN idx = 3 THEN
                   6
           END as new_x
    FROM wct.LINEST('#xy', '*', '', NULL, 1, 'True')
    WHERE stat_name = 'm'
) n;

This produces the following result.

TREND
2.25089605734768

Which is the same result returned by TRENDMX_q.

Since the TRENDMX_q function operates off of the resultant table from the @MatrixRangeQuery, it is not a requirement of the function that values be stored in the database. It is possible to dynamically perform the TRENDMX_q calculation without making reference to a table,a temp table, view, or common table expresion. Here's an example.

SELECT wct.TRENDMX_q(
                        'SELECT 18,2,-4,4 UNION ALL
      SELECT 24,-9,2,9 UNION ALL
      SELECT 16,5,-6,8 UNION ALL
      SELECT 8.5,5,-1,4',
                        1,
                        '-3,10,6',
                        'True'
                    ) as TREND;

This produces the same result as the previous example.

TREND
2.25089605734768

If we wanted to calculate the coefficients without a y-intercept, and supplying some different new x-values, we would modify function call to make the last parameter FALSE, as seen in the following example.

SELECT wct.TRENDMX_q('SELECT * from #xy', 1, '-9,5,-7', 'False') as TREND;

This produces the following result.

TREND
-14.2916920910584

In this example, we will add a new table, containing multiple new x-values, demonstrating how to calculate multiple y-values from multiple new x-values in a single SELECT.

CREATE TABLE #new_x
(
    r float,
    x0 float,
    x1 float,
    x2 float
);
INSERT INTO #new_x
VALUES
(1, 1, 1, 1);
INSERT INTO #new_x
VALUES
(2, 2, 2, -2);
INSERT INTO #new_x
VALUES
(3, 3, -3, -3);
INSERT INTO #new_x
VALUES
(4, -4, -4, -4);
INSERT INTO #new_x
VALUES
(5, -5, 5, -5);

We can then run the following statement and return a new y-value for each set of new x-values.

SELECT #new_x.r,
       wct.TRENDMX_q('SELECT *
      FROM #xy', 1, cast(x0 as varchar) + ',' + cast(x1 as Varchar) + ',' + cast(
                x2 as varchar), 'TRUE') as y
FROM #new_x;

This produces the following result.

ry
111.8440860215054
29.21505376344087
316.0878136200717
428.573476702509
515.4856630824373

We also could have entered the following statement, using a SELECT statement to get the new x-values.

SELECT #new_x.r,
       wct.TRENDMX_q(
                        'SELECT * FROM #xy',
                        1,
                        'SELECT x0, x1,x2 from #new_x where r = ' + cast(#new_x.r 
                                  as varchar),
                        'TRUE'
                    ) as y
FROM #new_x;

Since the TRENDMX_q function provides the ability to select specific columns, there is no requirement that the columns be contiguous, or even be in the same table or view, as long as they are in the same resultant table.

For example, if we wanted to not use the x1 column in the TRENDMX_q calculation, we can simply change the @Matrix_RangeQuery and change to @new_x , as in the following example.

SELECT #new_x.r,
       wct.TRENDMX_q('SELECT y, x0, x2
      FROM #xy', 1, 'SELECT x0,x2 FROM #new_x where r = ' + cast(#new_x.r as varchar)
                , 'TRUE') as y
FROM #new_x;

This produces the following result.

ry
112.9929859719439
210.3597194388778
39.04809619238478
412.9428857715431
512.9328657314629

See Also

TRENDMX - Calculate the values along a linear trend for multiple x values