Logo

SQL Server FORECAST Function

Updated 2023-11-01 10:29:07.527000

Description

Use the aggregate function FORECAST to calculate a future value by using existing values. The predicted value is a y-value for a given x-value. The known values are existing x-values and y-values, and the new value is predicted using linear regression.

\\\text{FORECAST} = a+bx\\\text{where} \\a = \bar{y} - b\bar{x} \\\text{and} \\ b=\frac{\sum(x-\bar{x})(y-\bar{y})}{\sum(x-\bar{x})^2 

Syntax

SELECT [westclintech].[wct].[FORECAST] (
,<@new_x, float,>
,<@Known_y, float,>
,<@Known_x, float,>)

Arguments

@new_x

the new x-value for which you want FORECAST to calculate the y-value. @New_x is an expression of type float or of a type that can be implicitly converted to float.

@Known_y

the y-values to be used in the FORECAST calculation. @Known_y is an expression of type float or of a type that can be implicitly converted to float.

@Known_x

t he x-values to be used in the FORECAST calculation. @Known_x is an expression of type float or of a type that can be implicitly converted to float.

Return Type

float

Remarks

FORECAST is an AGGREGATE function and follows the same conventions as all other AGGREGATE functions in SQL Server.

Examples

In this example, we calculate the trend for a single set of x- and y-values with a single new x value.

SELECT wct.FORECAST(12.5, y, x) as FORECAST
FROM
(
    SELECT 0.75,
           1
    UNION ALL
    SELECT 2.5,
           2
    UNION ALL
    SELECT 6.75,
           3
    UNION ALL
    SELECT 10,
           4
) n(x, y);

This produces the following result

FORECAST
4.79116945107399

In this example, we will populate some temporary table with some historical financial information and then calculate a forecast. First, create the table and put some data in it:

CREATE TABLE #c
(
    SYM NVARCHAR(5),
    YE BIGINT,
    REV FLOAT,
    GPROF FLOAT,
    OPINC FLOAT,
    NETINC FLOAT
);
INSERT INTO #c
VALUES
('YHOO', 2009, 6460.32, 3588.57, 386.69, 597.99);
INSERT INTO #c
VALUES
('YHOO', 2008, 72.5, 4185.14, 12.96, 418.92);
INSERT INTO #c
VALUES
('YHOO', 2007, 6969.27, 4130.52, 695.41, 639.16);
INSERT INTO #c
VALUES
('YHOO', 2006, 6425.68, 3749.96, 940.97, 751.39);
INSERT INTO #c
VALUES
('YHOO', 2005, 5257.67, 3161.47, 1107.73, 1896.23);
INSERT INTO #c
VALUES
('GOOG', 2009, 23650.56, 14806.45, 8312.19, 6520.45);
INSERT INTO #c
VALUES
('GOOG', 2008, 21795.55, 13174.04, 5537.21, 4226.86);
INSERT INTO #c
VALUES
('GOOG', 2007, 16593.99, 9944.9, 54.44, 4203.72);
INSERT INTO #c
VALUES
('GOOG', 2006, 10604.92, 6379.89, 3550, 3077.45);
INSERT INTO #c
VALUES
('GOOG', 2005, 6138.56, 3561.47, 2017.28, 1465.4);
INSERT INTO #c
VALUES
('MSFT', 2009, 62484, 509, 24167, 18760);
INSERT INTO #c
VALUES
('MSFT', 2008, 58437, 46282, 21225, 14569);
INSERT INTO #c
VALUES
('MSFT', 2007, 60420, 48822, 22271, 17681);
INSERT INTO #c
VALUES
('MSFT', 2006, 51122, 40429, 18438, 14065);
INSERT INTO #c
VALUES
('MSFT', 2005, 44282, 36632, 16064, 12599);
INSERT INTO #c
VALUES
('ORCL', 2009, 26820, 21056, 9062, 6135);
INSERT INTO #c
VALUES
('ORCL', 2008, 23252, 18458, 8321, 5593);
INSERT INTO #c
VALUES
('ORCL', 2007, 22430, 17449, 7844, 5521);
INSERT INTO #c
VALUES
('ORCL', 2006, 17996, 13805, 5974, 4274);
INSERT INTO #c
VALUES
('ORCL', 2005, 14380, 11145, 4736, 3381);
INSERT INTO #c
VALUES
('SAP', 2009, 10672, 6980, 2588, 1748);
INSERT INTO #c
VALUES
('SAP', 2008, 11575, 7370, 2701, 1847);
INSERT INTO #c
VALUES
('SAP', 2007, 10256, 6631, 2698, 1906);
INSERT INTO #c
VALUES
('SAP', 2006, 9393, 6064, 2578, 1871);
INSERT INTO #c
VALUES
('SAP', 2005, 8509, 5460, 2337, 1496);

In this example, we will forecast the revenue (REV) against the year (YE)

SELECT SYM,
       ROUND(wct.FORECAST(2010, REV, YE), 0) as [2010 Revenue],
       ROUND(wct.FORECAST(2011, REV, YE), 0) as [2011 Revenue],
       ROUND(wct.FORECAST(2012, REV, YE), 0) as [2012 Revenue]
FROM #c
GROUP BY SYM;

This produces the following result.

SYM2010 Revenue2011 Revenue2012 Revenue
GOOG296213424338864
MSFT6846572837772
ORCL300163303036044
SAP120331268413335
YHOO385634583063

To calculate the net income using the revenue projections from the above query, we could enter the following statement.

SELECT #c.SYM,
       ROUND(wct.FORECAST([2010 Revenue], NETINC, REV), 0) as [2010 Net Income],
       ROUND(wct.FORECAST([2011 Revenue], NETINC, REV), 0) as [2011 Net Income],
       ROUND(wct.FORECAST([2012 Revenue], NETINC, REV), 0) as [2012 Net Income]
FROM
(
    SELECT SYM,
           ROUND(wct.FORECAST(2010, REV, YE), 0) as [2010 Revenue],
           ROUND(wct.FORECAST(2011, REV, YE), 0) as [2011 Revenue],
           ROUND(wct.FORECAST(2012, REV, YE), 0) as [2012 Revenue]
    FROM #c
    GROUP BY SYM
) n ,
#c
WHERE n.sym = #c.sym
GROUP BY #c.SYM;

This returns the following results.

SYM2010 Net Income2011 Net Income2012 Net Income
GOOG710781769246
MSFT195662091022253
ORCL706077538447
SAP194520022059
YHOO807789771

As the following query demonstrates, this returns a different result than if we had just looked at the net income over time.

SELECT SYM,
       ROUND(wct.FORECAST(2010, NETINC, YE), 0) as [2010 Net Income],
       ROUND(wct.FORECAST(2011, NETINC, YE), 0) as [2011 Net Income],
       ROUND(wct.FORECAST(2012, NETINC, YE), 0) as [2012 Net Income]
FROM #c
GROUP BY SYM;

This produces the following result.

SYM2010 Net Income2011 Net Income2012 Net Income
GOOG727784039529
MSFT193832066521948
ORCL702977128394
SAP191819662014
YHOO-18-311-604

See Also

SLOPE - slope of the linear regression through the data points in the known x-values and y-values

INTERCEPT - the point at which a line will intersect the y-axis by using existing x-values and y-values

RSQ - the Pearson product moment correlation coefficient through data points in known-y's and known-x's

STEYX - the standard error of the predicted y-value for each x in the regression

CORREL - Aggregate function to calculate the correlation coefficient

TREND - for simpler queries

TRENDMX - Calculate the values along a linear trend for multiple x values

TRENDMX_q - Calculate the values along a linear trend for multiple x values

LINEST - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values

LINEST_q - the Ordinary Least Squares (OLS) solution for a series of x-values and y-values