
Declarative vs. Procedural Memory: Roles in Second Language Acquisition  
 

Laleh Fakhraee Faruji 
Department of Humanities, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran 

fakhraeelaleh@yahoo.com 

 
Abstract 

            Memory is not a single faculty but a combination of multiple distinct abilities (Schacter,
1987). The declarative-procedural distinction applies both to knowledge and to the memory
that stores this knowledge. Ellis (2008) used the terms explicit/implicit and declarative/procedural
interchangeably. This article aims to identify the different aspects of
declarative and procedural memory, the interaction between these two types of memory, and the role they
may play in second language acquisition.
 

Keywords: Conscious memory, Declarative memory, Procedural memory 
 
1. Introduction 

        According to Reber & Squire (1998), what distinguishes declarative (explicit) memory from
non-declarative (implicit) memory is that declarative memory supports conscious memory
of facts and events, whereas procedural memory supports a range of phenomena including habit
learning, simple conditioning, and priming. As Squire (2007, as cited in Dornyei, 2009, p. 147)
argued, declarative memory is representational, and what is learnt can be expressed through
conscious recollection. “It involves the storage and retrieval of facts (e.g. semantic knowledge) and
events (e.g. episodic knowledge)”. It is what the word memory denotes in everyday language.

According to Ullman (2004), an important feature of the declarative memory system is that it
enables us to learn very rapidly and flexibly, sometimes even on the basis of a single stimulus
presentation (i.e. a single exposure to the information to be learnt). In contrast, procedural memory
is expressed through performance rather than conscious recollection. “It involves the storage and
retrieval of sensori-motor and cognitive habits, skills, and other types of sequences, which can be as
complex as playing an instrument or game”. This unconscious memory system is related to our
experiences in interacting with the world, and it involves “gradual learning on a continuous basis
during multiple representations of stimuli and responses” (Ullman, 2004).

Schumann (2004, as cited in Dornyei, 2009, p. 148) clarified an important aspect of
procedural memory, stating that it is “relatively inflexible and non-transferrable”; that is,
it is only available in contexts that are either identical or very similar to the
original learning situation (e.g. the ability to play one string instrument cannot be transferred
efficiently to playing another). However, as Dornyei (2009, p. 148) stated, this is compensated for by
the fact that procedural memory is preserved much better than declarative memory in the elderly
and in people with dementia, an illness that affects the brain and memory and gradually erodes
the ability to think and behave normally.

 
2. Interaction between two memory systems in second language acquisition 
Although the two memory systems are largely autonomous, they may interact in a number
of ways (McBride, 2007). For example, recent studies suggest that declarative knowledge may turn
into procedural knowledge (proceduralization of declarative knowledge) and that procedural (implicit)
knowledge may be converted into declarative knowledge as a result of accumulating experience.
Ullman (2001) also believed that the two systems are related to separate aspects of language. He
argued that procedural memory is ‘largely informationally encapsulated’ and develops as a result of
implicit, nonconscious processes of learning. It is specialized for sequences (i.e. linguistic chunks)
and it has a rule system which includes “the representations of linguistic patterns extracted from
input which has repeatedly stimulated the relevant neural circuits”. It is associated with grammatical
processing (both syntax and morphology) as this occurs in real time. He characterized the


BRAIN. Broad Research in Artificial Intelligence and Neuroscience 

Volume 3, Issue 1, February 2012, ISSN 2067-3957 (online), ISSN 2068 - 0473 (print) 

 

declarative memory as ‘an associative memory of distributed representations’ which contains a
mental lexicon, that is, the sounds and meanings of morphologically simple and complex words.

He illustrated this model with an example of the processing of morphological forms such as 
regular and irregular past-tense verb forms. He proposed that procedural memory is responsible for 
the computation of regular morphological features (for example, v-ed) by ordering the phonological 
forms of the base and an affix (for example, walk + ed = walked). In contrast, declarative memory
handles irregular forms.
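
The division of labour described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the toy dictionary `IRREGULAR_PAST` and the function `past_tense` are hypothetical names, the dictionary stands in for stored (declarative) forms, and the rule branch stands in for (procedural) composition of base and affix. English spelling adjustments other than a final "e" are deliberately ignored.

```python
# Hypothetical toy lexicon standing in for declarative memory:
# irregular forms are stored whole and retrieved by lookup.
IRREGULAR_PAST = {"go": "went", "sing": "sang", "bring": "brought"}


def past_tense(verb: str) -> str:
    """Return a past-tense form: stored form first, regular rule otherwise."""
    # Declarative route: retrieve a memorised irregular form if one exists.
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Procedural route: combine the base with the -ed affix (walk + ed).
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


for base in ("walk", "go", "bake"):
    print(base, "->", past_tense(base))
```

The lookup is tried before the rule, so a stored irregular form blocks the regular one. This is one simple way to read the model and is not a claim about how the two systems are implemented neurally.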

According to Ellis (2008, p. 751), evidence for the functional distinction between the two memory systems
comes from studies of neurological injuries and from neuroimaging studies. Dornyei (2009, p. 148) argued
that “declarative memory is located in the medial temporal lobe, including the hippocampus, while
procedural memory is usually associated [with] a network of more diffuse brain structures placed in the
frontal/basal-ganglia circuits”.

As noted earlier, evidence for the differentiation between declarative and procedural
memory systems comes from neurological impairment and neuroimaging studies. Ellis (2008),
referring to this issue, stated that the nature of language loss in people with neurodegenerative and
neurodevelopmental disorders can be explained in terms of two separate systems, each with its
own distinct functions. For example, the implicit memory system is damaged in Parkinson’s disease,
and as a result deficits in grammatical processing occur. The explicit memory system is
damaged in Alzheimer’s disease and Williams’ Syndrome, which leads to difficulty in accessing
lexical items. Particularly strong evidence of the distinctiveness of the two systems is that
damage to one system does not lead to loss of functions associated with the other system.

 
3. Conscious memory 

        The study of consciousness is often discussed in terms of intention and awareness. Memory
researchers have also used these concepts to distinguish conscious and automatic forms of
memory. According to McBride (2007), conscious memory processes involve either intentional
retrieval of a previous episode, awareness of the retrieval of a previous episode, or both. An
important goal of research on conscious and automatic memory in the past few decades has been the 
measurement of these processes in the retrieval of information. The distinction between intention 
and awareness has been important in the development of these measurement techniques. In some 
methods, subjects are asked to complete a task by intentionally retrieving a study episode 
(measuring conscious memory) or are asked to complete a task without intentional retrieval of a 
study episode (measuring automatic memory). In other methods, awareness that a study episode 
was previously experienced distinguishes conscious from automatic memory processes. 

Sun et al. (2008) also confirmed this dichotomy, stating that conscious memory (CM)
involves intentional retrieval and self-awareness of memory, whereas unconscious memory (UM)
refers to the cognitive use of previous experiences without self-awareness of
memory.

According to Sun et al. (2008), a widely used method to illuminate CM and UM has been to compare
differences in performance between explicit and implicit tests (i.e., the task-dissociation method).
Explicit tests such as recall and recognition are assumed to tap CM. Implicit tests such as
word-stem completion and word identification, in which participants are not instructed to make reference
to studied words, are thought to reflect UM. It has been shown that the memories measured by the
two types of test differ from each other in their sensitivity to experimental variables such as
level-of-processing (LoP) and self-generation, which suggests that the two types of test measure different
memory systems.

The human medial temporal lobe (MTL) system mediates memories that can be consciously 
recollected (Grunwald et al., 2003). However, the specific natures of the individual contributions of 
its various subregions to conscious memory processes remain vague. Grunwald et al. (2003) show a 
functional dissociation between the hippocampus proper and the parahippocampal region in 
conscious and unconscious memory as revealed by invasive recordings of limbic event-related brain 



potentials recorded during explicit and implicit word recognition: Only hippocampal and not 
parahippocampal neural activity exhibits sensitivity to the implicit versus explicit nature of the 
recognition memory task. Moreover, only within the hippocampus proper do the neural responses to 
repeated words differ not only from those to new words but also from each other as a function of 
recognition success. By contrast, parahippocampal (rhinal) responses are sensitive to repetition
independent of conscious recognition. These findings thus demonstrate that it is the hippocampus 
proper among the MTL structures that is specifically engaged during conscious memory processes. 
 

4. Degree of availability of procedural memory to L2 learners 
Procedural memory is less available to L2 learners: they have fewer items in their implicit
linguistic competence than native speakers (Paradis, 2009, p. 22). As proposed in Paradis
(2004, as cited in Paradis, 2009, p. 22), to the extent that there is a gap in their L2 implicit linguistic
competence (the “rule” system), adult learners compensate by relying on their metalinguistic
knowledge (concretely, if they cannot process the passive construction procedurally, they will
consciously construct a passive sentence by applying the explicit rule they have learned); they
therefore depend more than native speakers upon declarative memory.

According to Paradis, to the extent that a language has been internalized, its implicit
grammatical competence is processed by procedural memory (and is available); to the extent that 
there are gaps in the implicit competence for one (or more) of the languages, the speaker will 
compensate for them by using explicit knowledge sustained by declarative memory (upon which the 
speaker is thus more dependent). 

Procedural memory is not monolithic (Paradis, 2009, p. 24). There is procedural memory 
that sustains phonology, and procedural memory that sustains syntax, just as there is procedural 
memory dedicated to playing tennis and dedicated to playing the piano. Each procedural language 
module concerns a different set of objects, of a different nature, that engage a different type of 
implicit rule (procedures). 

In addition, as Paradis (2009, p. 24) argued, the availability of
procedural memory for acquiring language as a whole decreases with age (though with different
optimal periods for the development of various components: prosody, phonology, morphology, 
syntax, in that order).  

 
5. There is no continuum from automatic to controlled processing 

        According to Paradis (2009, p. 26) there is no continuum between implicit competence and 
explicit knowledge, declarative memory and procedural memory, incidental acquisition and 
attentional learning, or automatic and controlled processing. Processing is either automatic or 
controlled. Controlled processing may be speeded-up but it remains qualitatively different from 
automatic processing. Conscious control may be involved in the deliberate decision to initiate an 
automatic process, but it is not involved in the processing itself (p. 26). 

Paradis (2009, p. 26) argued that what may be considered as a continuum is the gradual 
replacement of the conscious use of metalinguistic knowledge by the automatic use of implicit 
linguistic competence. He provided the example that if you gradually replace meat by vegetable 
protein in your diet, meat does not become vegetable protein, and there is no continuum between 
meat and vegetables (except, possibly, phylogenetically over millions of years of evolution – but 
not in the context of the period and situation that concern us). 

There is no continuum from automatic to controlled processing (i.e., no degrees of
automaticity; a function is automatic or it is not); there is only a continuum ranging from
predominant reliance on controlled processing to predominant reliance on automatic processing, or
between the amount of controlled and automatic processing exerted on a particular function
(Paradis, 2009, p. 26).
 
 

6. Concluding remarks
        Studies of memory organization in non-human animals and humans have led to a consensus 
that memory is not a monolithic faculty, but rather is supported by multiple brain systems that differ 
in terms of the types of memory they mediate (Poldrack & Packard, 2003). 

According to Cleeremans (2003, as cited in Dornyei, 2009, p. 146), with the rapid development
of cognitive neuroscience, many existing distinctions that were described in binary terms, such as
the explicit-implicit or the declarative-procedural distinction, have been redefined in terms of
graded characterizations. This applies particularly to the declarative-procedural memory
dichotomy. Since current memory frameworks identify more memory types, the term procedural memory
has been replaced by ‘non-declarative memory’, and procedural memory now refers to only one
constituent of the non-declarative memory system. The exact number of different systems in the category of
‘non-declarative memory’ has not yet been identified, but the main ones referred to in the literature,
as listed by Dornyei (2009, p. 148), are: (1) skills and habits (which is
procedural memory in the narrow sense), (2) priming and perceptual learning, (3) classical
conditioning, and (4) non-associative learning (i.e. behavioral change brought about by repeated
presentation of one stimulus).
 
References 
[1] Dörnyei, Z. (2009). The psychology of second language acquisition. Oxford: Oxford University
Press.
[2] Ellis, R. (2008). The study of second language acquisition. Oxford: Oxford University Press.
[3] Grunwald, T., Pezer, N., Münte, T. F., Kurthen, M., Lehnertz, K., Van Roost, D., Fernandez, G.,
Kutas, M., & Elger, C. E. (2003). Dissecting out conscious and unconscious memory
(sub)processes within the human medial temporal lobe. NeuroImage, 20, 139–145.
[4] Gupta, R., Duff, C. M., Denburg, N. L., Cohen, N. J., Bechara, A., & Tranel, D. (2009).
Declarative memory is critical for sustained advantageous complex decision-making.
Neuropsychologia, 47, 1686–1693.
[5] McBride, D. M. (2007). Methods for measuring conscious and automatic memory: A brief
review. Journal of Consciousness Studies, 14(1), 198–215.
[6] Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam:
John Benjamins.
[7] Poldrack, R. A., & Packard, M. G. (2003). Competition among multiple memory systems:
Converging evidence from animal and human brain studies. Neuropsychologia, 1497, 1–7.
[8] Reber, P. J., & Squire, L. R. (1998). Encapsulation of implicit and explicit memory in sequence
learning. Journal of Cognitive Neuroscience, 10, 248–263.
[9] Schacter, D. L. (1987). Implicit memory: History and current status. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 13(3), 501–518.
[10] Sun, R., Cheng, C. M., Lin, W., & Tsai, Ch. (2008). Conscious and unconscious forms of
memory in different implicit tests. Cognitive Systems Research, 9, 312–328.
[11] Ullman, M. T. (2004). Contributions of memory circuits to language: The
declarative/procedural model. Cognition, 92, 231–270.
[12] Ullman, M. T. (2001). The declarative/procedural model of lexicon and grammar. Journal of
Psycholinguistic Research, 30(1), 37–69.
 


Comparative study of Financial Time Series Prediction by Artificial Neural 
Network with Gradient Descent Learning 

 
Arka Ghosh 

Seacom Engineering College under West Bengal University Of Technology, India 
9007900477a@gmail.com 

 
Abstract 
Financial forecasting is an example of a signal processing problem which is challenging due
to small sample sizes, high noise, non-stationarity, and non-linearity, but fast forecasting of stock market
prices is very important for strategic business planning. The present study aims to develop a
comparative predictive model with a Feedforward Multilayer Artificial Neural Network and a Recurrent
Time Delay Neural Network for financial time series prediction. The study uses a
historical stock price dataset made available by Google Finance. To develop this
prediction model, the Backpropagation method with Gradient Descent learning has been implemented.
Finally, the neural network trained with this algorithm is found to be a skillful predictor for
non-stationary, noisy financial time series.

 
Key Words: Financial Forecasting, Financial Timeseries, Feedforward Multilayer Artificial
Neural Network, Recurrent Timedelay Neural Network, Backpropagation, Gradient descent.
 
1. Introduction  
Over the past fifteen years, a view has emerged that computing based on models inspired by our
understanding of the structure and function of biological neural networks may hold the key to
solving intelligent tasks by machine, such as noisy time series prediction [1].
A neural network is a massively parallel distributed processor that has a natural propensity for
storing experiential knowledge and making it available for use. It resembles the brain in two
respects: knowledge is acquired by the network through a learning process, and interneuron
connection strengths known as synaptic weights are used to store the knowledge [2]. Moreover,
stock markets have recently become a more accessible investment tool, not only for strategic
investors but for common people as well. Consequently they are not only related to macroeconomic
parameters, but they influence everyday life in a more direct way. Therefore they constitute a
mechanism which has important and direct social impacts. The characteristic that all stock markets
have in common is uncertainty, which is related to their short- and long-term future state. This
feature is undesirable for the investor but it is also unavoidable whenever the stock market is
selected as the investment tool. The best that one can do is to try to reduce this uncertainty. Stock
market prediction (or forecasting) is one of the instruments in this process. We cannot predict exactly
what will happen tomorrow, but from previous experience we can make a rough prediction.
In this paper this knowledge-based approach is taken.

The accuracy of a predictive system built with an ANN can be tuned with the help of
different network architectures. The network consists of an input layer, a hidden layer and an output layer of
neurons; the number of neurons per layer can be configured according to the required accuracy and
throughput, and there is no hard-and-fast rule for that. The network can be trained using a sample
training data set. This neural network model is very useful for mapping unknown functional
dependencies between input and output tuples. In this paper two types of neural network
architecture, a feedforward multilayer network and a time-delay recurrent network, are used for the
prediction of the NASDAQ stock price. A comparative error study for both network architectures is
presented in this paper.

In this paper the gradient descent backpropagation learning algorithm is used for supervised
training of both network architectures. The backpropagation algorithm was developed by Paul
Werbos in 1974 and was rediscovered independently by Rumelhart and Parker. In backpropagation
learning, the network weights are first set to small random values; then the network output is



calculated and compared with the desired output, and the difference between them is defined as the error.
The goal of efficient network training is to minimize this error by repeatedly tuning the
network weights using the gradient descent method. Computing the gradient of the error surface
requires mathematical tools, and it is an iterative process.

ANN is a powerful tool widely used in soft-computing techniques for forecasting stock
prices. The first stock forecasting approach was taken by White (1988), who used IBM daily stock prices
to predict future stock values [3]. When developing a predictive model for forecasting the Tokyo stock
market, Kimoto, Asakawa, Yoda, and Takeoka (1990) reported on the effectiveness of
alternative learning algorithms and prediction methods using ANN [4]. Chiang, Urban, and
Baldridge (1996) used ANN to forecast the end-of-year net asset value of mutual funds [5].
Trafalis (1999) used a feed-forward ANN to forecast the change in the S&P 500 index. In that
model, the input values were the univariate data consisting of weekly changes in 14
indicators [6]. Forecasting of the daily direction of change in the S&P 500 index was carried out by Choi, Lee,
and Rhee (1995) [7]. Despite the widespread use of ANN in this domain, there are significant
problems to be addressed. ANNs are data-driven models (White, 1989 [8]; Ripley, 1993 [9]; Cheng &
Titterington, 1994 [10]), and consequently the underlying rules in the data are not always apparent
(Zhang, Patuwo, & Hu, 1998 [11]). Also, the buried noise and complex dimensionality of stock
market data make it difficult to learn or re-estimate the ANN parameters (Kim & Han, 2000 [12]).
It is also difficult to come up with an ANN architecture that can be used for all domains. In addition,
ANN occasionally suffers from the overfitting problem (Romahi & Shen, 2000 [13]) [14].
 

2. Data analysis and problem description 
This paper develops two comparative ANN models step by step to predict the stock price
over a financial time series, using data available at the website http://www.google.com/finance. The
problem described in this paper is a predictive problem. Five predictors are used
with one predictand. The five predictors are listed below.

• Stock open price 
• Stock price high 
• Stock price low 
• Stock close price 
• Total trading volume 

 
The predictand is the next stock opening price.
All five predictors of year X are used for prediction of the stock opening price of year
(X+1). The whole dataset comprises 1460 days of NASDAQ stock data. The first subset contains the earlier
730 days of data (open, high, low, close, volume), which is the input series to the neural network
predictor. The second subset holds the later 730 days of data (open only), which is the target series for the neural
network predictor. The network learns the dynamic relationship between the five earlier
parameters (open, high, low, close, volume) and the one final parameter (open), which it will predict in
the future.
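
The division into an input series and a target series could be sketched as follows. This is a minimal illustration, assuming the data sit in a NumPy array with one row per day and columns ordered open, high, low, close, volume; the function name `split_input_target` and the synthetic placeholder values are not from the paper.

```python
import numpy as np


def split_input_target(data: np.ndarray, half: int = 730):
    """Return (input series, target series) from a (days, 5) array.

    Assumed column order: open, high, low, close, volume.
    """
    if data.ndim != 2 or data.shape[1] != 5 or data.shape[0] < 2 * half:
        raise ValueError("expected at least 2 * half rows and 5 columns")
    inputs = data[:half, :]            # earlier days, all five predictors
    targets = data[half:2 * half, 0]   # later days, opening price only
    return inputs, targets


# Synthetic placeholder values, used only to show the resulting shapes.
rng = np.random.default_rng(0)
fake_prices = rng.uniform(2000.0, 2600.0, size=(1460, 5))
input_series, target_series = split_input_target(fake_prices)
print(input_series.shape, target_series.shape)
```

Each input row is paired with the opening price 730 days later, which is how the text describes the relation between the two subsets.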
 
  Data Preprocessing 

Once the historical stock prices are gathered, data are selected for
training, testing and simulating the network. In this project we took 4 years of historical prices of a
stock, a total of 1460 working days of data. We carried out R/S analysis over these data to assess
predictability (Hurst exponent analysis). The Hurst exponent (H) is a statistical measure used to
classify time series: (1) H=0.5 indicates a random series; (2) 0<H<0.5 indicates an anti-persistent
series; (3) 0.5<H<1 indicates a persistent, trend-reinforcing series, and the larger the H value is,
the stronger the trend. An
anti-persistent series has a characteristic of “mean-reverting”, which means an up value is more
likely followed by a down value, and vice versa. The strength of “mean-reverting” increases as H



approaches 0.0. A persistent series is trend reinforcing, which means the direction (up or down
compared to the last value) of the next value is more likely the same as that of the current value. The strength of
the trend increases as H approaches 1.0. Most economic and financial time series are persistent with
H>0.5. We therefore selected a time series whose Hurst exponent is greater than 0.5, since persistence favours good
predictability.
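
A rescaled-range (R/S) estimate of H can be sketched as below. This is an illustrative estimator, not necessarily the one used in the study: the function name `hurst_rs`, the doubling window sizes and the minimum window of 8 are assumptions, and the paper does not say whether the analysis was run on prices or on their changes. The slope of log(R/S) against log(window size) is taken as H.

```python
import numpy as np


def hurst_rs(series, min_window: int = 8) -> float:
    """Estimate the Hurst exponent by classical rescaled-range analysis."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    sizes, rs_means = [], []
    size = min_window
    while size <= n // 2:
        rs_values = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            # Range of the cumulative deviations from the chunk mean.
            deviations = np.cumsum(chunk - chunk.mean())
            value_range = deviations.max() - deviations.min()
            spread = chunk.std()
            if spread > 0:
                rs_values.append(value_range / spread)
        if rs_values:
            sizes.append(size)
            rs_means.append(np.mean(rs_values))
        size *= 2
    # H is the slope of the log-log relation between R/S and window size.
    slope, _intercept = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return float(slope)


# Synthetic check: uncorrelated noise versus its running sum (a trending path).
rng = np.random.default_rng(1)
noise = rng.standard_normal(1460)
h_noise = hurst_rs(noise)
h_walk = hurst_rs(np.cumsum(noise))
print(h_noise, h_walk)
```

On short windows this estimator is known to read somewhat above 0.5 even for uncorrelated noise, so values close to 0.5 should be interpreted with care before a series is labelled persistent.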

 
Figure 1. Data Division for Network Training

 

All five predictors are given to the network together with the corresponding predictand. Using
backpropagation training (gradient descent approach), the network learns the abstract
mapping between input and output and minimizes the prediction error. Once the
mean square error has been reduced satisfactorily over several epochs, training is considered complete and the
prediction system is ready for forecasting.
 
 
[Figure: the input series (older stock open, high, low and close prices and older trade volume of the stock) and the target series (newer open price of the stock) are fed to the neural network predictor for network training.]



Figure 2. Flow Chart for Data Preprocessing & Training

 
After data processing is done, the data are fed to the network for training and
testing; 80% of the total data is used for training and the remaining 20% for testing.

3. Methodology

This paper develops an ANN based comparative predictive model for NASDAQ stock 
prediction. The first ANN model is developed with Multi-Layer Feed forward Network 
Architecture & the second model is developed with Recurrent Neural Network Architecture. In this 
paper gradient descent based back propagation learning algorithm is used for the supervised 
learning of the predictive network. The mathematical model used in this paper is described below, 
 
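Algorithm
Initialize each weight $w_{i,j}$ to some small random value.
 • Until the termination condition is met, do:
   - For each training example, do:
     • Input it and compute the network outputs $o_k$.
     • For each output unit $k$:

            $\delta_k \leftarrow o_k (1 - o_k)(t_k - o_k)$

     • For each hidden unit $h$:

            $\delta_h \leftarrow o_h (1 - o_h) \sum_k w_{h,k}\, \delta_k$

     • For each network weight $w_{i,j}$:

            $w_{i,j} \leftarrow w_{i,j} + \Delta w_{i,j}$, where $\Delta w_{i,j} = \eta\, \delta_j\, x_{i,j}$

Here $t_k$ is the target output of unit $k$ and $x_{i,j}$ is the input from unit $i$ to unit $j$. The transfer function is the sigmoid transfer function, which is used for its continuous nature. $\eta$ is the
learning rate and $\delta$ is the gradient term.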
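
The update rules listed above can be sketched for a network with one hidden layer. This is a minimal illustration under stated assumptions, not the implementation used in the study: the names `train_backprop` and `predict`, the bias inputs appended to each layer, the initial weight range and the toy OR data are all choices made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_backprop(X, T, n_hidden=4, eta=0.5, epochs=2000, seed=0):
    """Per-example gradient descent for a one-hidden-layer sigmoid network."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Small random initial weights; the last row of each matrix is a bias.
    W_h = rng.uniform(-0.05, 0.05, size=(n_in + 1, n_hidden))
    W_o = rng.uniform(-0.05, 0.05, size=(n_hidden + 1, n_out))
    for _ in range(epochs):
        for x, t in zip(X, T):
            x_b = np.append(x, 1.0)                 # input plus bias
            o_h = sigmoid(x_b @ W_h)                # hidden outputs
            h_b = np.append(o_h, 1.0)               # hidden plus bias
            o_k = sigmoid(h_b @ W_o)                # network outputs
            delta_k = o_k * (1 - o_k) * (t - o_k)   # output-unit terms
            delta_h = o_h * (1 - o_h) * (W_o[:-1] @ delta_k)  # hidden terms
            W_o += eta * np.outer(h_b, delta_k)     # w <- w + eta*delta*x
            W_h += eta * np.outer(x_b, delta_h)
    return W_h, W_o


def predict(X, W_h, W_o):
    X_b = np.hstack([X, np.ones((X.shape[0], 1))])
    H_b = np.hstack([sigmoid(X_b @ W_h), np.ones((X.shape[0], 1))])
    return sigmoid(H_b @ W_o)


# Toy data (logical OR), used only to show that the error decreases.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)
W_h, W_o = train_backprop(X, T)
outputs = predict(X, W_h, W_o)
print(np.round(outputs, 3))
```

The hidden-unit terms are computed before the output weights are changed, so both layers are updated from the same forward pass.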
 
At first the network is constructed. In this paper, the sigmoid function is used as the activation
function of the ANN; it is chosen because of its continuous nature. The transfer function is given by eq (1),



 
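$f(x) = \frac{1}{1 + e^{-x}}$     (1)

Here x is the total summed input received at node k. At first all weights are set to
small random values for the ith layer. The successive weight is defined by eq (2),

$w_{i,j}^{\text{new}} = w_{i,j}^{\text{old}} + \Delta w_{i,j}$     (2)

The weight updating rule for gradient descent backpropagation is eq (3),

$\Delta w_{i,j} = \eta\, \delta_j\, x_{i,j}$     (3)

Here we use the mean square error: because the error surface is a multi-variable function, it is
sensible to take the mean of the squared errors. It is defined by eq (4),

$Err = \frac{1}{N} \sum_{k=1}^{N} (t_k - o_k)^2$     (4)

where $t_k$ is the target value, $o_k$ the corresponding network output and $N$ the number of values compared.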
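
The sigmoid transfer function and the mean square error can be checked numerically with a short sketch. It illustrates why the factor o(1 - o) appears in the weight updates: for the sigmoid, the derivative equals f(x)(1 - f(x)). The helper names and the finite-difference step are assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Sigmoid transfer function f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def mean_square_error(targets, outputs) -> float:
    """Mean of the squared differences between targets and outputs."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return float(np.mean((targets - outputs) ** 2))


# Compare a central-difference derivative with the closed form f(x)(1 - f(x)).
x = np.linspace(-4.0, 4.0, 9)
step = 1e-6
numeric = (sigmoid(x + step) - sigmoid(x - step)) / (2 * step)
analytic = sigmoid(x) * (1 - sigmoid(x))
max_gap = float(np.max(np.abs(numeric - analytic)))
print(max_gap)

# Two samples with errors 0 and 2: the squared errors are 0 and 4.
example_error = mean_square_error([1.0, 2.0], [1.0, 4.0])
print(example_error)
```

Because the derivative can be written in terms of the unit's own output, no extra function evaluation is needed during the backward pass.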

 
4. Implementation and results
The whole dataset is divided into training and test datasets: 80% of the total data is used for
training and 20% for testing. Using the gradient descent
backpropagation algorithm, training is carried out twice, for up to 1000 epochs. After training, the ANN
model is tested on the test dataset. Both networks are trained in the same manner; after completion of
training, a comparison of their mean square error is presented in Table 1.
 
Table 1. Comparison of error

Network data          Feedforward NN    Timedelay Recurrent NN
Using training set    4.14%             3.01%
Using testing set     25%               15%
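
The 80/20 division behind the training and testing rows of Table 1 could be sketched as follows. Whether the study split the data chronologically or at random is not stated, so the chronological cut used here is an assumption, as are the function name `chronological_split` and the synthetic placeholder arrays.

```python
import numpy as np


def chronological_split(inputs, targets, train_fraction: float = 0.8):
    """Split paired series into a leading training part and a trailing test part."""
    n = len(inputs)
    cut = int(round(n * train_fraction))
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Synthetic placeholder arrays with the shapes described in the paper.
rng = np.random.default_rng(0)
inputs = rng.uniform(size=(730, 5))
targets = rng.uniform(size=730)
(train_x, train_t), (test_x, test_t) = chronological_split(inputs, targets)
print(train_x.shape, test_x.shape)
```

Keeping the test portion at the end of the series avoids evaluating on days that precede the training data, which matters for time series even though other splitting schemes are possible.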

 
A regression model relates Y to a function of X and β:

$Y \approx f(X, \beta)$     (5)

where β denotes the unknown parameters (a scalar or a vector), X the independent variables, and Y the dependent variable.

A regression model is useful for modelling the relation between a function of the independent
variables and unknown parameters and some dependent variable. This paper also computes and
contrasts the regression plots for both networks on the same NASDAQ forecasting problem.
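
A regression of network outputs (Y) on targets (T) can be summarised by a fitted line and a correlation coefficient, as in the sketch below. It assumes a straight-line model Y = slope * T + intercept; the function name `regression_fit` and the small example arrays are illustrative and are not values from the study. A slope near 1, an intercept near 0 and a correlation near 1 would indicate outputs lying close to the Y = T line.

```python
import numpy as np


def regression_fit(targets, outputs):
    """Fit outputs = slope * targets + intercept and return (slope, intercept, r)."""
    t = np.asarray(targets, dtype=float)
    y = np.asarray(outputs, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)   # least-squares straight line
    r = np.corrcoef(t, y)[0, 1]              # linear correlation coefficient
    return float(slope), float(intercept), float(r)


# Illustrative data lying exactly on the line y = 2 t + 1.
example_targets = [1.0, 2.0, 3.0, 4.0]
example_outputs = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r = regression_fit(example_targets, example_outputs)
print(slope, intercept, r)
```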

 

 
Figure 3. Regression plot for NASDAQ index (MLP) 

 
Figure 3 depicts the regression plot for the feedforward MLP network. Analyzing it, we can
say that the fit to the Y=T line is not very good.
 

Figure 4. Regression plot for NASDAQ index (RNN) 

 
Figure 4 depicts the regression plot for the Timedelay RNN network. Analyzing it, we can
say that the outputs fit the Y=T line closely.
This paper also includes a comparative study of the performance (MSE) plots of both networks.

 

 
Figure 5. Performance plot for NASDAQ index (MLP) 

 
Figure 6. Performance plot for NASDAQ index (RNN) 

 
In Figure 5 we can see that the MSE curve reaches the performance goal but does not
decrease as well, whereas in Figure 6 the MSE is reduced substantially. From all these
results one can say that the RNN is a better choice than the feedforward MLP for prediction.

Table 2 presents the original stock values and the predicted ones.
 

4. Conclusion 
This paper presented a hybrid neural-evolutionary methodology for forecasting time series. The
methodology is hybrid because an evolutionary computation-based optimization process is used to
produce a complete design of a neural network. The resulting neural network is then used as a
model to forecast the time series. One advantage of the proposed scheme is that the design and
training of the ANNs are fully automated, so model identification does not require any human
intervention. Conventionally, the model identification process involves data manipulation and
requires a highly experienced statistician. Automating it therefore advances the state of the art in
producing forecasting models. Compared to previous work, this paper's approach is purely
evolutionary, whereas others use mixed approaches, mainly combined with back-propagation,
which is known to get stuck in local optima. Regarding model production, the evolutionary
process automates the identification of input variables, allowing the user to avoid data
pre-treatment and statistical analysis. The system is fully implemented in MATLAB [15].
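As a rough illustration of what a purely evolutionary search (one that uses no back-propagation) can look like, the sketch below applies a (1 + λ) evolution strategy to the weights of a single linear neuron. This section does not describe the encoding, operators, or fitness function actually used, so the neuron, the Gaussian mutation, the parameter values, and the toy series are all assumptions. The sketch evolves weights only, not a complete network design.

```python
import random

def fitness(weights, samples):
    """Mean squared one-step-ahead error of a linear neuron (lower is better)."""
    w1, w2, bias = weights
    total = 0.0
    for (x1, x2), target in samples:
        prediction = w1 * x1 + w2 * x2 + bias
        total += (prediction - target) ** 2
    return total / len(samples)

def evolve(samples, generations=200, offspring=10, sigma=0.1, seed=0):
    """(1 + lambda) evolution strategy: keep the parent unless a child beats it."""
    rng = random.Random(seed)
    parent = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    parent_fit = fitness(parent, samples)
    history = [parent_fit]
    for _ in range(generations):
        # Mutate the current parent with Gaussian noise to create the offspring.
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for _ in range(offspring)]
        best_child = min(children, key=lambda c: fitness(c, samples))
        best_fit = fitness(best_child, samples)
        if best_fit < parent_fit:
            parent, parent_fit = best_child, best_fit
        history.append(parent_fit)
    return parent, history

# Hypothetical toy series; each sample maps two past values to the next one.
series = [0.1, 0.2, 0.3, 0.5, 0.8, 0.6, 0.4, 0.5, 0.7, 0.9]
samples = [((series[i], series[i + 1]), series[i + 2])
           for i in range(len(series) - 2)]

best, history = evolve(samples)
print(f"initial MSE={history[0]:.4f} final MSE={history[-1]:.4f}")
```

Because the parent is replaced only by a strictly better child, the recorded error never increases from one generation to the next.
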

BRAIN. Broad Research in Artificial Intelligence and Neuroscience
Volume 3, Issue 1, February 2012, ISSN 2067-3957 (online), ISSN 2068-0473 (print)

The study demonstrates the versatility of ANNs as a predictive tool for financial time series
prediction. Furthermore, conjugate gradient descent proves to be an efficient backpropagation
algorithm that can be adopted to predict the average stock price of the NASDAQ. The results also
show that the temporal relationships in the mapping are learnt better by the RNN than by the
feedforward MLP (FFMLP).
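One common way to expose temporal structure to a network is to present each target together with a fixed window of preceding values (a tapped delay line). The sketch below builds such windows. The delay of 2 and the function name are assumptions, since this section does not state the delay that was used. The sample numbers are the first five TARGET values of Table 2, treated here as if they were consecutive in time, which is also an assumption.

```python
def make_delay_windows(series, delay):
    """Pair each value with the `delay` values that precede it."""
    if delay < 1 or delay >= len(series):
        raise ValueError("delay must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for i in range(delay, len(series)):
        inputs.append(series[i - delay:i])  # the past `delay` values
        targets.append(series[i])           # the value to predict
    return inputs, targets

# First five TARGET values from Table 2, used only as sample numbers.
prices = [2519.1, 2514.8, 2500, 2473, 2509.4]
inputs, targets = make_delay_windows(prices, delay=2)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

A feedforward network sees only the values inside each window, whereas a recurrent network can also carry state from earlier steps.
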
 

Table 2. Comparison between the original stock price (TARGET) and the price simulated by the ANN.

SIMRNN: simulated output using the RNN model.
SIMMLP: simulated output using the MLP model.

 
Acknowledgment 
The author gratefully acknowledges Dr. Pabitra Mitra, Associate Professor, Department of
Computer Science & Engineering, Indian Institute of Technology, Kharagpur, India, and
Mr. Mriganka Chakraborty, Assistant Professor, Department of Computer Science & Engineering,
Seacom Engineering College, Howrah, India, for their endless help with both the theoretical and
practical aspects of this research work, and especially thanks Prof. Rajob Bag, Head of the
Department of Computer Science & Engineering, Seacom Engineering College, Howrah, India, for
his moral support. The design and simulation work was carried out at the Computer Science &
Engineering laboratories of Seacom Engineering College, Howrah, India. The author also
acknowledges the support of the Seacom Engineering College authorities in the publication of
this paper.

TARGET   SIMRNN   SIMMLP   TARGET   SIMRNN   SIMMLP
2519.1   2627     2183.8   2183.8   2627     2183.8
2514.8   5209.6   2379.7   2326.8   2352.8   2355.1
2500     3219     1921     2309.7   2355.3   2316.0
2473     3269.9   2215.3   2308.1   2353.4   2251.9
2509.4   2524.1   2083.9   2349.1   2416.2   2466.4
2528.5   2534.3   2534.3   2362.7   2420.3   2406.3
2483.2   2497.4   2124.7   2356.8   2356.8   2356.8
2436.7   2439.9   2141.6   2407.5   2456.4   2344.7
2444.9   2268.6   2309.9   2434.2   2412.8   2390.9
2414.4   2446.2   2086.0   2424.3   2521.9   2414.0
2423.0   2423.0   1961.4   2414.4   2462.8   2369.6
2443.1   2467.0   2018.2   2463.1   2463.1   2463.1
2481.2   2515.9   2239.5   2456.9   2363.3   2382.5
2444.9   2043.0   2223.8   2404.9   2501.4   2501.4
2416.5   2462.1   2107.2   2390.3   2438.9   2394.5
2375.8   2400.8   1992.4   2399.7   2435.9   2388.4
2388.4   2416.4   2227.8   2364.3   2888.2   2319.0
2365.8   2430.4   2119.0   2362.8   2412.8   2372.2
2319.6   2353.4   2173.1   2390.1   2444.5   2324.4
2312.4   2347.1   2120.1   2120.1   2441.5   2323.5
2274.2   2308.9   2102.5   2402.1   2468.7   2334.6
2311.5   2293.3   2242.3   2346.8   2421.8   2301.6
2261.7   2309.6   2251.3   2315.1   2315.1   2315.1
2263.6   2322.7   2089.0   2241.6   2312.9   2240.0
2244.9   2225.6   2337.0   2296.1   2369.7   2161.6
2290.6   2336.7   2204.6   2204.6   2204.6   2204.6
2239.9   2270.0   2102.5   2102.5   2102.5   2102.5
2102.5   2102.5   2463.1   2270.0   2463.1   2102.5
2456.9   2456.9   2270.0   2204.8   2102.5   2204.9
2308.9   2307.6   2308.9   2346.8   2102.5   2204
2362.8   2102.5   2390.1   2390.1   2270.0   2204
2362.8   2362.8   2270.0   2362.8   2390.1   2390.1
2362.8   2362.8   2364.3   2102.5   2270.0   2364.3
2501.4   2362.8   2270.0   2204     2204     2315.1
2362.8   2308.9   2089.0   2364.3   2315.1   2501.4
2501.4   2508.3   2362.8   2204     2102.5   2390.1



 
References 

[1] Yegnanarayana, B. Artificial Neural Networks.
[2] Haykin, S. Neural Networks: A Comprehensive Foundation.
[3] White, H. (1988). Economic prediction using neural networks: the case of IBM daily stock
returns. In Proceedings of the second IEEE annual conference on neural networks, II (pp. 451–458).
[4] Kimoto, T., Asakawa, K., Yoda, M., & Takeoka, M. (1990). Stock market prediction system
with modular neural networks. In Proceedings of the international joint conference on neural
networks (IJCNN) (Vol. 1, pp. 1–6). San Diego.
[5] Chiang, W.-C., Urban, T. L., & Baldridge, G. W. (1996). A neural network approach to mutual
fund net asset value forecasting. Omega International Journal of Management Science, 24(2),
205–215.
[6] Trafalis, T. B. (1999). Artificial neural networks applied to financial forecasting. In C. H.
Dagli, A. L. Buczak, J. Ghosh, M. J. Embrechts, & O. Ersoy (Eds.), Smart engineering systems:
neural networks, fuzzy logic, data mining, and evolutionary programming. Proceedings of the
artificial neural networks in engineering conference (ANNIE’99) (pp. 1049–1054). New York:
ASME Press.
[7] Choi, J. H., Lee, M. K., & Rhee, M. W. (1995). Trading S&P 500 stock index futures using a
neural network. In Proceedings of the 3rd annual international conference on artificial intelligence
applications on Wall Street (pp. 63–72). New York.
[8] White, H. (1989). Learning in artificial neural networks: a statistical perspective. Neural
Computation, 1, 425–464.
[9] Ripley, B. D. (1993). Statistical aspects of neural networks. In O. E. Barndorff-Nielsen, J. L.
Jensen, & W. S. Kendall (Eds.), Networks and chaos: statistical and probabilistic aspects
(pp. 40–123). London: Chapman and Hall.
[10] Cheng, B., & Titterington, D. M. (1994). Neural networks: a review from a statistical
perspective. Statistical Science, 9(1), 2–54.
[11] Zhang, G., Patuwo, B. E., & Hu, M. Y. (1998). Forecasting with artificial neural networks: the
state of the art. International Journal of Forecasting, 14, 35–62.
[12] Kim, K.-J., & Han, I. (2000). Genetic algorithms approach to feature discretization in artificial
neural networks for the prediction of stock price index. Expert Systems with Applications, 19,
125–132.
[13] Romahi, Y., & Shen, Q. (2000). Dynamic financial forecasting with automatically induced
fuzzy associations. In Proceedings of the 9th international conference on fuzzy systems
(pp. 493–498).
[14] Hassan, M. R., Nath, B., & Kirley, M. (2006). A fusion model of HMM, ANN and GA for stock
market forecasting. Computer Science and Software Engineering, The University of Melbourne,
Carlton 3010, Australia.
[15] MathWorks. MATLAB, Version 7.12.0.635 (R2011a).