Introduction: In finance, a model of the data that is simple yet realistic is very important. We introduce a model for share price data based on a Markov chain perturbed by Gaussian noise. This paper is related to the renowned (and open) embedding problem for Markov chains: not every discrete time Markov chain has an underlying continuous time Markov chain, and necessary and sufficient conditions for this to be the case are unknown. Let P be an (n×n) transition matrix estimated from data. The question is whether P has a representation P=e^Q, where Q is a transition rate matrix. This is referred to as the embedding problem. If such a Q exists, it is called a generator of P (also a real generator, true generator or intensity matrix); otherwise, P has no true (exact) generator. Elfving first proposed the embedding problem, which is also known as Elfving's problem. He gave certain necessary conditions, in particular observing that the eigenvalues of P must satisfy two conditions: (i) no eigenvalue other than unity can have unit modulus (and so P must be aperiodic); (ii) every negative eigenvalue must have an even (algebraic) multiplicity (Elfving, 1937:1-17). Initially, the embedding problem was treated as a problem of pure mathematics. Then, in the 1990s, the problem received increasing attention in the financial mathematics literature, where it applies to rating transition matrices (Israel, Rosenthal, Wei, 1999:245-65). Notice that many authors in the financial literature assume that the embedding problem has a positive answer, that is, that the data come from a continuous time Markov process, and proceed to estimate its generator. Purpose: The main goal of this research is to analyse whether the discrete time model can be extended to, or embedded in, a continuous time model.
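To make the question P=e^Q concrete, the following is a minimal sketch for the two-state case only, where a complete criterion is known: P = [[1-a, a], [b, 1-b]] is embeddable exactly when det P = 1-a-b > 0. This is consistent with Elfving's condition (ii), since the second eigenvalue of P is 1-a-b and has multiplicity one. The function names and the example values are illustrative assumptions and are not taken from the paper, which treats general n×n matrices estimated from data.

```python
import math


def two_state_generator(a, b):
    """Return Q with exp(Q) = P for P = [[1-a, a], [b, 1-b]], or None.

    Two-state case only (an illustrative assumption): P is embeddable
    if and only if det(P) = 1 - a - b > 0.
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("a and b must be probabilities")
    s = a + b
    if s == 0.0:
        # P is the identity matrix, generated by the zero matrix.
        return [[0.0, 0.0], [0.0, 0.0]]
    if s >= 1.0:
        # The second eigenvalue 1 - s is zero or negative: no true generator.
        return None
    # With M = P - I we have M*M = -s*M, so exp(c*M) = I + M*(1 - exp(-c*s))/s.
    # Choosing c = -log(1 - s)/s makes exp(c*M) equal to P.
    c = -math.log(1.0 - s) / s
    return [[-c * a, c * a], [c * b, -c * b]]


def expm_series(Q, terms=40):
    """Matrix exponential by a truncated Taylor series (fine for small Q)."""
    n = len(Q)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * Q / k
        term = [[sum(term[i][m] * Q[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


if __name__ == "__main__":
    print(two_state_generator(0.1, 0.2))  # a valid rate matrix
    print(two_state_generator(0.6, 0.7))  # None: not embeddable
```

A returned Q has non-negative off-diagonal entries and zero row sums, which is what makes it a transition rate matrix. For larger n no such simple test is available, which is why the problem has to be examined case by case.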
Scope: In this research, we consider BP daily closing share prices for four financial years (April to April), 2009-2010, 2010-2011, 2011-2012 and 2012-2013, and we propose to model them as a random walk on a discrete time Markov chain perturbed by Gaussian noise. The share prices of twenty different companies for four different financial years are also considered in order to check embeddability. This research identifies conditions under which a true generator does or does not exist. To be consistent, we apply the same approach to the volatility process as to the share price process. Limitations: The twenty-one companies whose share prices are used to check embeddability were chosen arbitrarily. Although extending the dataset would strengthen the research, it is costly in terms of time. Method: The data are first modelled by log-linear regression, which is popular in the actuarial literature. Markov chains are then constructed for the residuals: we define the states of the Markov chains and estimate their transition probabilities. Next, the data are split into two parts and their tensor product is considered. Splitting the data and using a tensor product structure significantly reduces the number of parameters. In addition, from the financial perspective, the tensor product structure allows the hedging argument to be applied to each component and accommodates incomplete data. However, a side effect of the tensor product modelling is that the construction leads to missing data, which must then be dealt with. To treat the missing values, we apply the Expectation-Maximization algorithm as the parameter estimation method and a machine learning algorithm (C4.5) as the imputation method. The intention of this paper is to analyse whether the discrete time model permits extension to, or embedding in, a continuous time model.
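The step of defining states for the residuals and estimating transition probabilities can be sketched as follows. This is only an illustration for fully observed data, using the standard count-based (maximum-likelihood) estimate. The cutoff-based binning, the function names and the sample residuals are assumptions made for the example. The paper's own state definitions are not given here, and its treatment of missing data relies on Expectation-Maximization and C4.5 rather than on this simple count.

```python
from bisect import bisect_right


def to_states(residuals, cutoffs):
    """Map each residual to a state index using sorted cutoffs.

    Hypothetical binning: with k cutoffs there are k + 1 states.
    """
    return [bisect_right(cutoffs, r) for r in residuals]


def estimate_transition_matrix(states, n_states):
    """Estimate P[i][j] as (transitions i -> j) / (transitions out of i)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(states, states[1:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left provides no data; mark its row as None
        # instead of guessing a distribution.
        matrix.append([c / total for c in row] if total else None)
    return matrix


if __name__ == "__main__":
    residuals = [-0.5, 0.2, 0.1, -0.3, 0.4]  # made-up values
    states = to_states(residuals, cutoffs=[0.0])
    print(states)
    print(estimate_transition_matrix(states, n_states=2))
```

Each estimated row sums to one, so the result is a candidate transition matrix P to which the embedding question can then be put.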
If the model is embeddable, that is, if it can be converted to a continuous time model, then the data can be regarded as observable at every point in time. This is a plausible and important approach in the financial sector: if the model is a continuous time model, many existing formulae (such as those for option pricing) are applicable to it. We analyse this problem for our data. Overall, this research is an extensive case study of the embedding problem for financial data and its volatility. It gives a real financial application that illustrates the importance of the embedding problem. Findings: This study shows that a continuous time model is more stable for volatility than for the original share prices. In addition, models with a small number of carefully chosen states are more reliable. Conclusion: In general, we could not embed the discrete time Markov chain in a continuous time Markov chain. This means that the model we considered should be treated as a discrete time model.

Keywords: Embedding problem, Tensor product, Missing data, Machine learning, Markov chain, Share price, Volatility