Development Of An Improved Hidden Markov Model Based Fuzzy Time Series Forecasting Model Using Genetic Algorithm
ABSTRACT
This study develops an improved Hidden Markov Model (HMM)-based fuzzy time series (FTS) forecasting model that employs a Genetic Algorithm (GA). To improve forecasting performance, the GA and HMM are combined to optimise and more accurately estimate the membership values in the fuzzy relationship matrix during the fuzzy inference step.
Monte Carlo simulation was used to estimate the stochastic outcome of the data, further improving the model’s representation of real data and its randomness.
The developed model was implemented in MATLAB R2015a and tested on Cheng and Sheng’s data on Taipei’s daily average temperature and cloud density, a benchmark for bivariate FTS.
The performance of the proposed GA-HMM-based FTS model was assessed using the Mean Square Error (MSE) and Average Forecasting Error Percentage (AFEP) metrics. On the bivariate benchmark data of daily average temperature and cloud density for Taipei, Taiwan, the model achieved an MSE of 0.5976 and an AFEP of 1.8673, compared to 0.933 and 2.7464, respectively, reported by Li and Cheng (2012).
This represents improvements of 35% and 32% in the MSE and AFEP, respectively. The model was also used to forecast short-term Internet traffic data from Ahmadu Bello University (ABU), Zaria. The simulation results give MSE and AFEP values of 68.32392 and 0.08904, respectively, indicating good forecasting performance given the large volume and randomness of this traffic.
These results illustrate both the effectiveness of the proposed GA-HMM-based FTS model in forecasting under large traffic volumes and randomness, and its robustness in adapting to time series with diverse structural and statistical properties.
CHAPTER ONE
INTRODUCTION
1.1 Background of the Study
A time series is a collection of quantitative observations measured at regular intervals of time. Time series, whether discrete or continuous, are inherently nonlinear and non-stationary since they are sample functions of stochastic processes (Subanar and Abadi, 2011). Time series forecasting is useful in a wide range of applications, including the prediction of university enrolments, stock prices, rainfall, and blood pressure.
Such forecasting often employs a series of past data points that are measured sequentially to predict future events (Sheng et al., 2009). Various strategies for time series forecasting have emerged in recent decades.
Among the available models, Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA)-based models stand out and are quite effective.
However, they cannot deal with vagueness in time series or with data expressed in linguistic terms (Song & Chissom, 1993). Furthermore, these statistical approaches do not perform well on time series with few observations (Tsaur et al., 2005).
In addition, conventional probabilistic time series models require certain conditions to be met, including assumptions about the number of observations, normality of the distribution, and linearity (Egrioglu, 2014).
As a result, when these assumptions are not met, these methods produce inaccurate forecasting results. Non-probabilistic techniques have been proposed as an alternative to probabilistic time series forecasting models (Egrioglu, 2015).
Fuzzy time series (FTS) has been developed and widely used to address these problems (Radmehr & Gharneh, 2012). Many researchers have been drawn to FTS models in recent years because of the following advantages: improved performance in some real forecasting problems (Song & Chissom, 1993), the ability to handle data in linguistic terms (Song & Chissom, 1993), and ease of integration with heuristic knowledge and models (Huarng, 2001).
One of the most critical challenges in FTS models is determining the fuzzy relations (Egrioglu et al., 2013). There are numerous approaches in the literature for determining fuzzy relations.
These include fuzzy logic relationship groups (FLRG), artificial neural networks, fuzzy relation matrices derived from fuzzy set operations, particle swarm optimisation, and evolutionary algorithms (Egrioglu, 2014).
The fuzzy logic relationship group is the most often used method, since using FLRG tables eliminates the need for complex matrix operations. However, when FLRG tables are used, the membership values of the fuzzy sets are ignored, since only the element with the greatest membership value is considered (Aladag et al., 2012).
This leads to information loss, which may have a detrimental impact on forecasting ability. Fuzzy relations can be nonlinear and complex, so an intelligent approach for determining them is required.
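The information loss described above can be illustrated with a minimal Python sketch. It is not the procedure of any cited study: the triangular membership functions, the set centres in `peaks`, the `width` value, and the five sample readings are all hypothetical choices made for illustration. Each observation is reduced to the index of its highest-membership fuzzy set before the relationship groups are built, so the remaining membership values play no part in the resulting table.

```python
from collections import defaultdict

def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set with the given corners."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def memberships(x, peaks, width):
    """Membership of x in every fuzzy set (one set centred on each peak)."""
    return [triangular(x, p - width, p, p + width) for p in peaks]

def build_flrg(series, peaks, width):
    """Group relations A_i -> A_j using only the highest-membership set."""
    labels = []
    for x in series:
        mu = memberships(x, peaks, width)
        # Only the index of the largest membership is kept;
        # every other membership value is discarded at this point.
        labels.append(mu.index(max(mu)))
    groups = defaultdict(set)
    for current, following in zip(labels, labels[1:]):
        groups[current].add(following)
    return labels, dict(groups)

# Hypothetical data: five temperature-like readings and three fuzzy sets.
series = [21.0, 23.5, 26.5, 24.0, 21.5]
peaks = [20.0, 24.0, 28.0]
labels, flrg = build_flrg(series, peaks, width=4.0)
```

In this sketch the reading 21.0 belongs to the first set with degree 0.75 and to the second with degree 0.25, yet only the first label enters the table.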
To address these shortcomings, the Hidden Markov Model (HMM) has been developed and employed in the formulation of the fuzzy relationship, with model parameters determined using a traditional search technique known as the Baum-Welch algorithm (Li and Cheng, 2010).
Because parameter learning in the Hidden Markov Model using the Baum-Welch algorithm is prone to becoming trapped in local optima, a technique is required that obtains improved estimates of the fuzzy relations while avoiding local optima.
In recent years, artificial intelligence approaches have been used at various phases of fuzzy time series methods (Egrioglu, 2014). To obtain the best estimate of the inner fuzzy relations, this study used a Genetic Algorithm (GA).
GA is a well-known search heuristic that simulates the process of natural evolution. This heuristic is commonly used to produce effective solutions to optimisation and search problems, such as the partition problem in fuzzy time series (Cai et al., 2013).
In general, a GA consists of a population, chromosomes, a fitness function, and genetic operators. The population represents a collection of candidate solutions, and each individual in the population represents a possible solution to the problem at hand. This population representation defines the search space for the problem’s solution.
Each of the factors that make up an individual is known as a chromosome. The chromosomes are often coded into a string to form the individual. A fitness function evaluates each individual in the population to assess how fit the solution is.
The GA maintains a population of n potential solutions, i.e., individuals, with associated fitness values calculated using the fitness function (Koo et al., 1990).
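The components described above can be sketched in a few lines of Python. This is a generic illustration rather than the GA configuration used in this study: the tournament selection, one-point crossover, Gaussian mutation, elitism, the parameter values, and the toy `target` vector are all assumptions chosen to keep the example small.

```python
import random

def fitness(individual, target):
    """Higher is better: negative squared distance to a hypothetical target."""
    return -sum((g - t) ** 2 for g, t in zip(individual, target))

def tournament(population, scores, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]

def crossover(a, b, rng):
    """One-point crossover of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(individual, rng, rate=0.1, scale=0.1):
    """Perturb each value with probability `rate` using Gaussian noise."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in individual]

def run_ga(target, pop_size=30, generations=50, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Initial population of n-valued candidate solutions.
    population = [[rng.uniform(0.0, 1.0) for _ in range(n)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scores = [fitness(ind, target) for ind in population]
        best = population[scores.index(max(scores))]
        history.append(max(scores))
        # Elitism: the best individual is carried over unchanged.
        children = [best]
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = children
    scores = [fitness(ind, target) for ind in population]
    history.append(max(scores))
    return population[scores.index(max(scores))], history

best, history = run_ga(target=[0.2, 0.5, 0.3])
```

Because the best individual is always retained, the best fitness recorded in `history` never decreases from one generation to the next.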
1.2 Motivation
A significant amount of research has been conducted to improve the accuracy of fuzzy time series forecasting models (Uslu et al., 2013; Bas et al., 2014; Zhang et al., 2013), and many of these studies rely on artificial intelligence optimisation algorithms (Yolcu, 2014; Aladag et al., 2013; Haneen et al., 2014). However, the difficulty of adequately capturing the fuzzy relations, and hence of enhancing the model’s forecasting accuracy, persists.
1.3 Statement of Problem
Fuzzy time series approaches are well suited to forecasting time series. Since its inception, fuzzy time series (FTS) research has gained popularity due to its ability to deal with the uncertainty and vagueness that are frequently inherent in real-world data as a result of measurement inaccuracies, incomplete sets of observations, or difficulties in obtaining measurements under uncertain conditions.
Forecasting relies heavily on the modelling of fuzzy relations derived from fuzzy time series. In the analysis of time-invariant fuzzy time series, fuzzy logic relationship group (FLRG) tables have been widely used to determine fuzzy logic relationships.
This is because using these tables eliminates the requirement for complex matrix computations. On the other hand, when FLRG tables are used, fuzzy set membership values are ignored. Thus, contrary to fuzzy set theory, only the elements with the highest membership value are considered.
This causes information loss, which reduces the model’s forecasting accuracy. In addition, the approach is likely to suffer from rule redundancy and computational overhead.
As a result, there is a need for a strategy that can capture relationships more accurately despite the non-linear character of fuzzy time series data. Furthermore, because of the intrinsic uncertainty associated with time evolution, state transitions in a system are typically probabilistic.
To address the limitations inherent in existing FTS models, a forecasting model based on the Hidden Markov Model (HMM) for fuzzy time series was used to implement the probabilistic state transition.
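As a simple illustration of probabilistic state transitions, the Python sketch below estimates a transition matrix by relative frequency from a sequence of fuzzified states. It assumes that the state labels are already known, and the sample sequence and the uniform fallback for unvisited states are hypothetical choices for this example, not details taken from the cited HMM-based model.

```python
def transition_matrix(states, n_states):
    """Estimate P(next = j | current = i) from counts of observed moves."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption of this sketch: a state that is never left in
            # the data is given a uniform row.
            matrix.append([1.0 / n_states] * n_states)
    return matrix

# Hypothetical fuzzified state sequence with three states.
states = [0, 1, 2, 1, 0, 1, 1]
A = transition_matrix(states, n_states=3)
```

Each row of `A` sums to one, so a row can be read as the distribution over the next state given the current one.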
Typically, relationship (parameter) estimation for an HMM is carried out using a well-defined iterative approach that is prone to becoming trapped in a local minimum.
Genetic Algorithms (GA) are popular because of their ability to handle nonlinear interactions. To improve the relationship representation, a GA-HMM-based model was used to effectively capture the relations and hence improve the forecasting accuracy of the model.
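One way a GA could score candidate HMM parameters is sketched below in Python. The choice of the log-likelihood as the fitness, the flat chromosome layout, and the names `decode` and `hmm_fitness` are assumptions made for illustration; the fitness function actually used in this study is not specified in this chapter. The likelihood is computed with the standard scaled forward algorithm.

```python
import math

def normalise(row):
    """Rescale positive numbers so that they sum to one."""
    total = sum(row)
    return [v / total for v in row]

def log_likelihood(pi, A, B, observations):
    """Scaled forward algorithm: log P(observations | pi, A, B)."""
    n = len(pi)
    alpha = [pi[i] * B[i][observations[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(observations)):
        if t > 0:
            # Propagate through the transition matrix, then weight
            # by the emission probability of the current symbol.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n))
                     * B[j][observations[t]] for j in range(n)]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

def decode(chromosome, n_states, n_symbols):
    """Map a flat list of positive numbers to (pi, A, B) with stochastic rows."""
    pi = normalise(chromosome[:n_states])
    offset = n_states
    A = [normalise(chromosome[offset + i * n_states:
                              offset + (i + 1) * n_states])
         for i in range(n_states)]
    offset += n_states * n_states
    B = [normalise(chromosome[offset + i * n_symbols:
                              offset + (i + 1) * n_symbols])
         for i in range(n_states)]
    return pi, A, B

def hmm_fitness(chromosome, observations, n_states, n_symbols):
    """Fitness of a candidate parameter set: higher likelihood is fitter."""
    pi, A, B = decode(chromosome, n_states, n_symbols)
    return log_likelihood(pi, A, B, observations)

# Hypothetical example: two states, two symbols, all parameters uniform.
value = hmm_fitness([1.0] * 10, [0, 1, 1, 0], n_states=2, n_symbols=2)
```

Normalising each row inside `decode` keeps every candidate a valid HMM, so crossover and mutation can operate on unconstrained positive numbers.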
1.4 Significance of the Research
The significance of this research lies in the development of an enhanced Hidden Markov Model-based fuzzy time series model that can increase forecasting accuracy by effectively estimating the fuzzy relations that exist among the states of historical data.
1.5 Aims and Objectives
The aim of this research is to develop an improved HMM-based FTS forecasting model using a Genetic Algorithm.
To achieve this aim, the following objectives were pursued:
a) Develop an HMM-based FTS model using the Baum-Welch estimation approach.
b) Develop an improved HMM-based FTS model by optimising the model parameters with GA.
c) Validate the model using bivariate benchmark FTS data of Taipei’s daily average temperature and cloud density, and compare the findings with those produced by Li and Cheng (2012), using the MSE and AFEP performance measures.
d) Apply the developed model to ABU Zaria’s Internet traffic data.
1.6 Methodology
The methods used in this study to construct an improved Hidden Markov Model-based fuzzy time series forecasting model using a Genetic Algorithm are outlined below.
a) Develop the conventional HMM-based FTS forecasting model using the relative frequency Baum-Welch estimation approach.
b) Improve the established HMM-based FTS model by re-estimating the model parameters with GA.
c) Validate the model using bivariate benchmark FTS data of Taipei’s daily average temperature and cloud density, and compare the findings with those produced by Li and Cheng (2012), using the MSE and AFEP performance measures.
d) Use the developed model to forecast short-term Internet traffic data for ABU Zaria from February 29th to March 31st, 2016, as collected from the ABU, Zaria data centre, and evaluate its performance.
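The two performance measures named in step c) can be written compactly in Python. The sketch below assumes the definitions commonly used in the FTS literature, with MSE as the mean of squared errors and AFEP as the mean absolute error relative to the actual value, expressed as a percentage; the exact formulas used in this study are given in later chapters and may differ. The helper `relative_improvement` is a hypothetical name, applied here to the MSE and AFEP values quoted in the abstract.

```python
def mse(actual, forecast):
    """Mean Square Error between actual and forecast values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def afep(actual, forecast):
    """Average Forecasting Error Percentage (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / a
                       for a, f in zip(actual, forecast)) / len(actual)

def relative_improvement(baseline, proposed):
    """Percentage reduction of an error measure relative to a baseline."""
    return 100.0 * (baseline - proposed) / baseline

# Error values quoted in the abstract: Li and Cheng (2012) against the proposed model.
mse_gain = relative_improvement(0.933, 0.5976)
afep_gain = relative_improvement(2.7464, 1.8673)
```

Both measures are errors, so a lower value indicates a better forecast and a positive relative improvement indicates that the proposed model reduced the error.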
1.7 Dissertation Organisation
Chapter One has presented the general introduction. The remaining chapters are structured as follows. Chapter Two gives a detailed review of related literature and of fundamental concepts such as time series, fuzzy time series forecasting, the Markov Model (MM), Hidden Markov Models (HMM), and Genetic Algorithms (GA).
Chapter Three presents the detailed methodology and the mathematical models used to develop the improved Hidden Markov Model-based fuzzy time series forecasting model using a Genetic Algorithm.
Chapter Four presents the analysis, performance, and discussion of the results. Chapter Five concludes the dissertation with recommendations for future work. The appendices contain the complete MATLAB scripts.