For many years, people have been predicting weather, economic and political events, and sports results; more recently, cryptocurrencies have joined this list. There are many ways to produce forecasts for such varied events: intuition, expert opinion, comparing past results with traditional statistics, and so on. Time series forecasting is only one of them, but it is among the most modern, accurate, and widely applicable.
Time series method
A time series (TS) is a data set that collects information over a period of time. Methods for working with this type of data are usually classified as:
- linear and nonlinear;
- parametric and nonparametric;
- one-dimensional and multidimensional.
Time series forecasting brings a unique set of capabilities for solving modern problems. Modeling is based on learning what drives changes in the data. That driving force may come from long-term trends, seasonal effects, or irregular fluctuations that are characteristic of TS and are not observed in other types of analysis.
Machine learning is a branch of computer science in which algorithms learn from data; it includes artificial neural networks, deep learning, association rules, decision trees, reinforcement learning, and Bayesian networks. This variety of algorithms offers solutions to different problems, and each has its own requirements and trade-offs in terms of data input, speed, and accuracy. These trade-offs, along with the accuracy of the final predictions, are weighed when the user decides which algorithm will work best for the situation being studied.
Time series forecasting borrows from the field of statistics but provides new approaches to modeling tasks. The main problem for machine learning and for time series is the same: to predict new results based on previously known data.
The purpose of the predictive model
A TS is a collection of data points collected at regular intervals. They are analyzed to determine a long-term trend, to predict the future, or to perform some other type of analysis. Two things distinguish TS from an ordinary regression problem:
- They are time dependent, so the basic assumption of the linear regression model that the observations are independent does not hold.
- Along with a tendency to increase or decrease, most TS have some form of seasonality, that is, variations specific to a particular period of time.
The goal of a time series forecasting model is to provide an accurate forecast on request. A time series has time (t) as the independent variable and a target dependent variable. In most cases the forecast is a specific value, for example, the sale price of a home, the result of a sporting competition, or the outcome of trading on an exchange. The forecast is usually reported as a median or mean value together with a confidence interval, typically at the 80-95% confidence level. When observations are recorded at regular intervals, the processes are called time series and are expressed in two ways (a brief sketch follows the list):
- one-dimensional, with a time index that creates an implicit order;
- two-dimensional, with time as the independent variable and another, dependent variable.
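A minimal sketch, assuming pandas is available (the values and dates here are purely illustrative), of what the two representations might look like:

```python
import pandas as pd

# One-dimensional: the DatetimeIndex provides the implicit time order.
univariate = pd.Series(
    [112, 118, 132, 129],
    index=pd.date_range("2016-08-01", periods=4, freq="D"),
    name="passengers",
)

# Two-dimensional: an explicit time column plus the dependent variable.
two_column = pd.DataFrame({
    "timestamp": pd.date_range("2016-08-01", periods=4, freq="D"),
    "passengers": [112, 118, 132, 129],
})
```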
Feature creation is one of the most important and time-consuming tasks in applied machine learning. However, when forecasting time series, features are not created, at least not in the traditional sense. This is especially true when you want to predict the result several steps ahead, not just the next value.
This does not mean that features are completely off limits. They should simply be used with caution, for the following reasons:
- It is not clear what the actual future values of these features will be.
- If the features are predictable and show some patterns, you can build a predictive model for each of them.
However, keep in mind that using predicted values as features propagates their error into the target variable and leads to mistakes or biased forecasts.
Time Series Components
A trend exists when a series increases, decreases, or stays at a constant level over time; the trend itself can be used as a feature. Seasonality refers to a time series property that displays periodic patterns repeating at a constant frequency (m); for example, m = 12 means the pattern repeats every twelve months.
Dummy variables, like seasonality, can be added as binary features. You can, for example, account for holidays, special events, and marketing campaigns, whether or not the value is an outlier. Remember, however, that these variables must have definite patterns. Moreover, the number of such days can easily be calculated even for future periods and can influence time series forecasts, especially in the financial field.
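A small sketch of such a binary holiday feature, assuming pandas and an illustrative list of holiday dates:

```python
import pandas as pd

# Hypothetical daily frame indexed by date; 'holidays' is an illustrative list.
dates = pd.date_range("2017-01-01", "2017-12-31", freq="D")
df = pd.DataFrame(index=dates)
holidays = pd.to_datetime(["2017-01-01", "2017-07-04", "2017-12-25"])

# Binary (dummy) feature: 1 on a holiday, 0 otherwise.
df["is_holiday"] = df.index.isin(holidays).astype(int)
```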
Cycles are seasonal-looking patterns that do not occur at a fixed rate. For example, the annual data on Canadian lynx populations show both seasonal and cyclical patterns; cycles do not repeat at regular intervals and can occur even when the frequency is 1 (m = 1).
Lagged values: you can include lagged values of the variable as predictors. Some models, such as ARIMA, vector autoregression (VAR), or autoregressive neural networks (NNAR), work this way.
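A brief sketch of lagged predictors built with pandas; the series and the chosen lags (1 and 7) are purely illustrative:

```python
import pandas as pd

# An assumed univariate daily series (e.g. passenger counts).
series = pd.Series(range(10), index=pd.date_range("2017-01-01", periods=10, freq="D"))

# Lagged copies of the target become predictors for an autoregressive-style model.
features = pd.DataFrame({
    "lag_1": series.shift(1),   # value one day earlier
    "lag_7": series.shift(7),   # value one week earlier
    "target": series,
}).dropna()
```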
The components of the variable of interest are very important for time series analysis and forecasting: they help you understand its behavior and patterns and choose an appropriate model.
Dataset attributes
The programmer may be used to feeding thousands, millions, or billions of data points into a machine learning model, but this is not required for time series. In fact, you can work with small and medium-sized TS, depending on the frequency and type of variable, and this is not a disadvantage of the method. Moreover, this approach actually has a number of advantages:
- Such sets of information will match the capabilities of a home computer.
- In some cases, time series analysis and forecasting is performed using the entire data set, and not just the sample.
- TS length is convenient for creating graphs that can be analyzed. This is a very important point, because programmers rely on graphs at the analysis stage. This does not mean they cannot work with huge time series, but initially they should be able to handle smaller TS.
- Any dataset that contains a time-related field can benefit from time series analysis and forecasting. However, if the programmer has a larger data set, a time series database (TSDB) may be more appropriate.
Some of these sets come from events recorded with timestamps, system logs, and financial data. Since a TSDB works natively with time series, it is a great opportunity to apply this technique to large-scale datasets.
Machine learning
Machine learning (ML) can outperform traditional time series forecasting techniques. There is a whole body of research comparing machine learning methods with more classical statistical methods on TS data. Neural networks are one of the most widely studied technologies for applying TS approaches. Machine learning methods rank highly in time series forecasting challenges, and these approaches have proven superior to pure TS approaches in competitions such as M3 or on Kaggle.
ML has its own specific problems. Developing features, that is, creating new predictors from a dataset, is an important step for ML; it can have a huge impact on performance and may be necessary to address the trend and seasonality of TS data. In addition, some models have problems with how well they fit the data, and if the fit is poor they may miss the main trend.
Time series and machine learning approaches should not exist in isolation from each other. They can be combined to gain the benefits of each. Time series analysis and forecasting methods do a good job of decomposing data into trend and seasonal elements. This analysis can then be used as input to an ML model that carries trend and seasonality information in its algorithm, giving the best of both worlds.
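One possible way to combine the two, sketched with statsmodels' seasonal_decompose on a synthetic daily series (the series and the weekly period are assumptions for illustration, not data from the article):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic daily series with a trend and a weekly pattern, purely for illustration.
idx = pd.date_range("2017-01-01", periods=120, freq="D")
series = pd.Series(np.arange(120) * 0.5 + 10 * np.sin(2 * np.pi * np.arange(120) / 7), index=idx)

# Decompose into trend, seasonal, and residual components.
decomposition = seasonal_decompose(series, model="additive", period=7)

# The extracted components become additional inputs (features) for an ML model.
ml_inputs = pd.DataFrame({
    "trend": decomposition.trend,
    "seasonal": decomposition.seasonal,
    "target": series,
}).dropna()
```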
Understanding the problem statement
As an example, consider a TS problem: predicting the number of passengers for a new high-speed rail service. Suppose there are two years of data (August 2016 - September 2018) at the hourly level with the number of passengers travelling, and you need to estimate the number of passengers for the next 7 months.
To prepare a subset of the dataset for time series forecasting:
- Create train and test sets for modeling.
- The first 14 months (August 2016 - October 2017) are used as training data, and the next 2 months (November 2017 - December 2017) are used as test data.
- Aggregate the dataset to a daily level.
Then visualize the data to see how they change over time (these steps are sketched below).
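A sketch of the preparation steps with pandas and matplotlib; the file name train.csv and the column names Datetime and Count are assumptions, not taken from the original dataset:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hourly records with a timestamp column used as the index.
df = pd.read_csv("train.csv", parse_dates=["Datetime"], index_col="Datetime")

# Aggregate the hourly records to a daily series.
daily = df["Count"].resample("D").sum()

# Split by date: earlier months for training, the last two for testing.
train = daily.loc["2016-08-01":"2017-10-31"]
test = daily.loc["2017-11-01":"2017-12-31"]

# Visualize how the series changes over time.
train.plot(label="train")
test.plot(label="test")
plt.legend()
plt.show()
```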
Naive approach
The library used here to forecast TS is statsmodels. It must be installed before applying any of these approaches. Statsmodels may already be installed in your Python environment, but the installed version may not support all of the forecasting methods, in which case you will need to clone the repository and install it from source.
For this example, it is assumed that coin prices are stable from the very beginning and throughout the entire period. The method assumes that the next expected point is equal to the last observed point and is called the naive approach.
The root mean square error (RMSE) is now calculated to check the accuracy of the model on the test data set. From the RMSE value and the graph above, we can conclude that the naive approach is not suitable for highly volatile series, but it works for stable ones.
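A minimal sketch of the naive approach and its RMSE, reusing the train and test series from the split above:

```python
import numpy as np

# Naive approach: every future value equals the last observed training value.
naive_forecast = np.full(len(test), train.iloc[-1])

# RMSE against the held-out test data.
rmse = np.sqrt(np.mean((test.values - naive_forecast) ** 2))
print(f"Naive RMSE: {rmse:.2f}")
```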
Simple average method
To demonstrate the method, a graph is constructed, assuming that the Y axis represents the price, and the X axis represents time (days).
From it we can conclude that the price rises and falls randomly by a small margin, so the average value remains roughly constant. In this case, you can predict the price for the next period as something close to the average of all past days.
Such a forecasting method, which predicts the expected value as the average of previously observed points, is called the simple average method.
In this case, the previously known values are taken, their average is calculated and used as the next value. Of course, this will not be exact, only reasonably close, and there are situations where this method works best.
Based on the results displayed on the graph, it is clear that this method works best when the average value for each time period remains constant. The naive method happened to beat the simple average here, but that is not true for every data set, so it is recommended to try each model in turn and see whether it improves the result.
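The same check for the simple average method, again reusing the earlier split (a sketch, not the article's exact code):

```python
import numpy as np

# Simple average method: the forecast is the mean of all previously observed values.
avg_forecast = np.full(len(test), train.mean())

rmse = np.sqrt(np.mean((test.values - avg_forecast) ** 2))
print(f"Simple average RMSE: {rmse:.2f}")
```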
Moving Average Model
Based on this graph, we can conclude that prices increased several times in the past by a wide margin but are now stable. To use the previous averaging method we would have to take the average of all previous data, and the prices of the initial period would strongly affect the forecast for the next period. Therefore, as an improvement over the simple average, we take the average price of only the last few periods of time.
Such a forecasting technique is called the moving average technique, sometimes referred to as a sliding window of size "n". Using this simple model, the next value in the TS is predicted and the accuracy of the method is checked. Obviously, the naive approach outperforms both the simple average and the moving average on this dataset.
There is also a forecasting option using simple exponential smoothing. In the moving average method the past "n" observations are weighted equally, but one may encounter situations where each of the past "n" observations should affect the forecast in its own way. This variant, which weights past observations differently, is called the weighted moving average method.
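A sketch of the moving average and a simple exponential smoothing variant with statsmodels; the window size n = 30 and the smoothing level 0.2 are illustrative choices, and the train and test series come from the earlier split:

```python
import numpy as np
from statsmodels.tsa.api import SimpleExpSmoothing

# Moving average: forecast with the mean of only the last n observations.
n = 30
ma_forecast = np.full(len(test), train.iloc[-n:].mean())

# Simple exponential smoothing: a weighted scheme where recent points count more.
ses_model = SimpleExpSmoothing(train).fit(smoothing_level=0.2, optimized=False)
ses_forecast = ses_model.forecast(len(test))
```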
Pattern extrapolation
One of the most important properties a time series forecasting algorithm needs is the ability to extrapolate patterns outside the domain of the training data. Many ML algorithms lack this capability, since they tend to be limited to the region defined by the training data. Therefore, they are not well suited to TS, whose purpose is to project the result into the future.
Another important property of a TS algorithm is the ability to produce confidence intervals. Although this is a default property of TS models, most ML models lack it, since not all of them are based on statistical distributions.
You should not think that only simple statistical methods are used to forecast TS. Not at all: there are many complex approaches that can be very useful in special cases. Generalized autoregressive conditional heteroskedasticity (GARCH), Bayesian models, and VAR are just some of them.
There are also neural network models that can be applied to time series using lagged predictors, such as the neural network autoregression (NNAR) model. There are even time series models borrowed from deep learning, in particular the family of recurrent neural networks such as LSTM and GRU.
Evaluation metrics and residual diagnostics
The most common evaluation metrics for forecasting are the root mean square error (RMSE), which many use for regression problems, along with (small illustrative implementations follow the list):
- MAPE (mean absolute percentage error), which is scale-independent and expresses the error as a percentage of the actual values;
- MASE (mean absolute scaled error), which shows how well the forecast performs compared with a naive forecast.
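Small illustrative implementations of these metrics; the function names and formulas follow the usual textbook definitions rather than code from the article:

```python
import numpy as np

def rmse(actual, forecast):
    # Root mean square error.
    return np.sqrt(np.mean((actual - forecast) ** 2))

def mape(actual, forecast):
    # Scale-independent: mean ratio of the absolute error to the actual value, in percent.
    return np.mean(np.abs((actual - forecast) / actual)) * 100

def mase(actual, forecast, train):
    # Scaled by the in-sample error of a one-step naive forecast on the training data.
    naive_error = np.mean(np.abs(np.diff(train)))
    return np.mean(np.abs(actual - forecast)) / naive_error
```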
After the forecasting method has been fitted, it is important to evaluate how well it captures the patterns. Although evaluation metrics help determine how close the forecast values are to the actual ones, they do not tell you whether the TS model fits well. Residuals are a good way to assess this. Since the programmer is trying to capture the patterns of the TS, the errors should behave like "white noise", because they represent what the model cannot capture.
White noise has the following properties (the corresponding checks are sketched in the code below):
- the residuals are uncorrelated (ACF = 0);
- the residuals follow a normal distribution with zero mean (unbiased) and constant variance.
If either of these two properties is missing, there is room for improvement in the model. The zero-mean property can easily be verified with a t-test, while normality and constant variance can be monitored visually with a histogram of the residuals or with an appropriate univariate normality test.
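A sketch of these residual checks with statsmodels and scipy; the residuals array here is a random placeholder standing in for the fitted model's in-sample errors:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from statsmodels.graphics.tsaplots import plot_acf

# Placeholder residuals; in practice use the errors of the fitted model.
residuals = np.random.normal(0, 1, 200)

# Autocorrelation: for white noise the ACF should be close to zero at all lags.
plot_acf(residuals, lags=30)

# Zero-mean check with a one-sample t-test.
t_stat, p_value = stats.ttest_1samp(residuals, 0.0)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Normality and constant variance: histogram plus a normality test.
plt.figure()
plt.hist(residuals, bins=30)
stat, p_norm = stats.shapiro(residuals)
print(f"Shapiro-Wilk p = {p_norm:.3f}")
plt.show()
```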
ARIMA Model
ARIMA, the AutoRegressive Integrated Moving Average model, is one of the most popular methods for forecasting TS, mainly because it exploits the autocorrelation in the data to build high-quality models.
When estimating ARIMA coefficients, the basic assumption is that the data are stationary, meaning that trend and seasonality do not change the statistical properties of the series over time. The quality of the model can be assessed by comparing the time plot of the actual values with the predicted values: if the two curves are close, the model can be considered suitable for the analyzed case. The model should capture any trend and seasonality, if present.
Two special cases are worth noting: an ARIMA(0,1,1) model is equivalent to simple exponential smoothing, and an ARIMA(0,2,2) model is equivalent to double exponential smoothing.
Building an ARIMA model in Excel:
- open Excel;
- install the XLMiner add-in;
- fit the ARIMA model using XLMiner.
Key points about ARIMA:
- ARIMA stands for AutoRegressive Integrated Moving Average;
- the model combines autoregressive terms, differencing, and moving-average terms;
- the notation is ARIMA(p, d, q), where p is the order of the autoregressive part, d is the degree of differencing, and q is the order of the moving-average part.
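A sketch of fitting and forecasting with a recent statsmodels ARIMA implementation, reusing the earlier train/test split; the (1, 1, 1) order is an illustrative assumption and should be chosen from the data, for example via ACF/PACF plots or AIC:

```python
from statsmodels.tsa.arima.model import ARIMA

# Fit an ARIMA(p, d, q) model to the training series.
model = ARIMA(train, order=(1, 1, 1)).fit()

# Forecast over the test horizon together with a 95% confidence interval.
forecast = model.get_forecast(steps=len(test))
predicted = forecast.predicted_mean
conf_int = forecast.conf_int(alpha=0.05)
print(predicted.head())
print(conf_int.head())
```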
SQL Server
Cross-prediction is one of the important features of time series for forecasting financial problems. If two interrelated series are used, the resulting model can predict the outcome of one series based on the behavior of the other.
SQL Server 2008 has powerful time series features that are worth learning and using. The tool provides easy access to TS data, an easy-to-use interface for building and replaying models, and an explanation window linked to the server-side DMX queries so that you can understand what is happening inside.
Market time series are a wide area to which deep learning models and algorithms can be applied. Banks, brokers, and funds are already experimenting with deploying them for the analysis and forecasting of indices, exchange rates, futures, cryptocurrency prices, government securities, and much more.
When forecasting time series, a neural network finds predictable patterns by studying market structures and trends, and offers advice to traders. Such networks can also help detect anomalies such as unexpected peaks, drops, trend changes, and level shifts. Many artificial intelligence models are used for financial forecasting.