Iraqi Journal of Chemical and Petroleum Engineering
Vol. 8 No. 4 (December 2007) 31-37
ISSN: 1997-4884

Prediction of Fractional Hold-Up in RDC Column Using Artificial Neural Network

Adel Al-Hemiri and Suhayla Akkar
Chemical Engineering Department - College of Engineering - University of Baghdad - Iraq

Abstract

In the literature, several correlations have been proposed for hold-up prediction in the rotating disc contactor (RDC). However, these correlations fail to predict hold-up over a wide range of conditions. Based on a databank of around 611 measurements collected from the open literature, a correlation for hold-up was derived using Artificial Neural Network (ANN) modeling. The dispersed phase hold-up was found to be a function of six parameters: $N$, $v_c$, $v_d$, $\Delta\rho$, $\mu_d/\mu_c$ and $\sigma$. Statistical analysis showed that the proposed correlation has an Average Absolute Relative Error (AARE) of 6.52% and a Standard Deviation (SD) of 9.21%. A comparison with selected correlations in the literature showed that the developed ANN correlation noticeably improved the prediction of dispersed phase hold-up. The developed correlation also gives better predictions over a wide range of operating parameters in RDC columns.

Keywords: dispersed phase hold-up, RDC, artificial neural networks (ANN).

Introduction

In the design and scale-up of an RDC, it is necessary to explore the hydrodynamic behavior, the mass transfer mechanism, and the effect of hold-up within the equipment under different operating conditions. Dispersed phase hold-up represents the total drop population in the RDC column and is defined as the ratio of the volume of the dispersed phase to the volume of the column. Hold-up is the most important hydrodynamic characteristic affecting the performance of an extraction column, because it is related to the interfacial area between the phases by:

$$a = \frac{6x}{d_{32}} \qquad (1)$$

where $x$ is the dispersed phase hold-up and $d_{32}$ is the Sauter mean diameter.
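As a quick numerical illustration of equation (1), the sketch below evaluates the specific interfacial area for an assumed hold-up and drop size. The function name and the sample values (10% hold-up, 2 mm Sauter mean diameter) are illustrative assumptions and are not taken from the databank used in this work.

```python
def interfacial_area(holdup: float, d32: float) -> float:
    """Specific interfacial area a = 6x/d32 (equation 1), in m2/m3.

    holdup : dispersed phase hold-up x (volume fraction, dimensionless)
    d32    : Sauter mean diameter in metres
    """
    if not 0.0 <= holdup <= 1.0:
        raise ValueError("hold-up must be a volume fraction between 0 and 1")
    if d32 <= 0.0:
        raise ValueError("Sauter mean diameter must be positive")
    return 6.0 * holdup / d32


# Assumed example values: x = 0.10, d32 = 2 mm.
a = interfacial_area(0.10, 2.0e-3)
print(f"a = {a:.1f} m2/m3")  # prints: a = 300.0 m2/m3
```

For a fixed drop size the interfacial area scales linearly with hold-up, which is why an accurate hold-up prediction matters for the mass transfer calculation.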
The hold-up is also related to the rate of mass transfer $W$ through the interfacial area $a$ by:

$$W = K\,a\,V\,\Delta c \qquad (2)$$

where $K$ is the mass transfer coefficient, $V$ is the volume of the column and $\Delta c$ is the concentration driving force.

In solvent extraction the relationship between mass transfer and hydrodynamic performance is complex, and there are many types of contactors, each requiring a special understanding. Numerous experimental studies of dispersed phase hold-up, drop size, mass transfer and mixing behavior within contactors have been reported [1].

In order to determine the interfacial area of the dispersion for the mass transfer calculation using equation (2), either of the following should be known:
1. The drop residence time in the contactor.
2. The fraction of the column occupied by the dispersed phase (the hold-up).

In agitated contactors the residence time distribution is rather complex, and dispersed phase hold-up is therefore usually used for the estimation of interfacial area.

Vermijs and Kramers [2] investigated the performance of an RDC for various values of the rotor speed, total throughput and solvent-to-feed ratio by comparing the separating efficiency with the fractional volume of the dispersed phase under the same circumstances. It was found that under certain conditions the efficiency decreases although the hold-up of the dispersed phase increases. This effect is ascribed to back mixing in the continuous phase due to entrainment by the dispersed phase. The hold-up increased with increasing solvent-to-feed ratio while the total throughput was kept constant, and this back mixing in the continuous phase impairs the efficiency of the extraction operation.
Logsdail et al. [3] were the first to introduce the concept of dispersed phase hold-up for the characterization of column design. These authors modified the concept of relating the slip velocity $v_s$ of the dispersed phase to the hold-up in a two-phase system by:

$$v_s = \frac{v_d}{x} + \frac{v_c}{1-x} \qquad (3)$$

$$v_o\,(1-x) = \frac{v_d}{x} + \frac{v_c}{1-x} \qquad (4)$$

$v_o$ is called the characteristic velocity and is defined as the mean velocity of the droplets extrapolated to essentially zero flow rates at a fixed rotor speed.

Many correlations have been published relating the dispersed phase hold-up to the characteristic velocity in the form of equation (4), with additional factors for column size, constriction, and droplet coalescence and break-up. These could not be easily applied because of the amount of information required, especially for $v_o$. Some selected reliable correlations are given in table (1). However, these correlations fail to predict hold-up over a wide range of conditions. This work was therefore initiated in order to develop a general correlation using an artificial neural network.

Artificial Neural Network (ANN)

From an engineering viewpoint, ANNs can be viewed as non-linear empirical models that are especially useful in representing input-output data, making predictions, classifying data, recognizing patterns, and controlling processes. The basic processing element of an ANN, which will be referred to as a node in this work, is analogous to a single neuron in the human brain. The advantages of using artificial neural networks in contrast with first-principles models or other empirical models are [4-6]:
1. ANNs can be highly non-linear.
2. The structure can be more complex, and hence more representative, than most other empirical models.
3. The structure does not have to be prespecified.
4. The models are quite flexible.
ANNs have been increasingly applied to many problems in transport planning and engineering, and the feed-forward network with the error back-propagation learning rule, usually called simply back-propagation (Bp), has been the most popular neural network [7].

Back-propagation

Back-propagation was one of the first general techniques developed to train multi-layer networks, which do not have many of the inherent limitations of the earlier single-layer neural nets. A back-propagation net is a multilayer, feed-forward network that is trained by back-propagating the errors using the generalized delta rule [8]. The steps for back-propagation training are as follows [9]:
1. Initialize the weights with small, random values.
2. Each input unit broadcasts its value to all of the hidden units.
3. Each hidden unit sums its input signals and applies its activation function to compute its output signal.
4. Each hidden unit sends its signal to the output units.
5. Each output unit sums its input signals and applies its activation function to compute its output signal.
6. Each output unit updates its weights and bias.

The conventional algorithm used for training a multilayer feed-forward (MLFF) network is the Bp algorithm, which is an iterative gradient algorithm designed to minimize the mean squared error between the desired output and the actual output for a particular input to the network [10]. Basically, Bp learning consists of two passes through the different layers of the network: a forward pass and a backward pass. During the forward pass the synaptic weights of the network are all fixed. During the backward pass, on the other hand, the synaptic weights are all adjusted in accordance with an error-correction rule [11]. The algorithm of the error back-propagation training is as given below [10]:

Step 1: Initialize the network weight values.
Step 2: Sum the weighted inputs and apply the activation function to compute the output of the hidden layer.

$$h_j = f\left(\sum_i X_i W_{ij}\right) \qquad (5)$$

where
$h_j$: the actual output of hidden neuron $j$ for input signals $X$.
$X_i$: the input signal of input neuron $i$.
$W_{ij}$: the synaptic weight between input neuron $i$ and hidden neuron $j$.
$f$: the activation function.

Step 3: Sum the weighted outputs of the hidden layer and apply the activation function to compute the output of the output layer.

$$O_k = f\left(\sum_j h_j W_{jk}\right) \qquad (6)$$

where
$O_k$: the actual output of output neuron $k$.
$W_{jk}$: the synaptic weight between hidden neuron $j$ and output neuron $k$.

Step 4: Compute the back-propagation error.

$$\delta_k = (d_k - O_k)\, f'\left(\sum_j h_j W_{jk}\right) \qquad (7)$$

where
$f'$: the derivative of the activation function.
$d_k$: the desired output of output neuron $k$.

Step 5: Calculate the weight correction term.

$$\Delta W_{jk}(n) = \eta\,\delta_k h_j + \alpha\,\Delta W_{jk}(n-1) \qquad (8)$$

Step 6: Sum the delta inputs for each hidden unit and calculate the error term.

$$\delta_j = \sum_k \delta_k W_{jk}\, f'\left(\sum_i X_i W_{ij}\right) \qquad (9)$$

Step 7: Calculate the weight correction term.

$$\Delta W_{ij}(n) = \eta\,\delta_j X_i + \alpha\,\Delta W_{ij}(n-1) \qquad (10)$$

Step 8: Update the weights.

$$W_{jk}(n+1) = W_{jk}(n) + \Delta W_{jk}(n) \qquad (11)$$

The input-to-hidden weights $W_{ij}$ are updated in the same way using $\Delta W_{ij}(n)$.

Step 9: Repeat from step 2 until the error reaches a given value, where the error is the mean squared error:

$$MSE = \frac{1}{2p}\sum_p \sum_k \left(d_k^p - O_k^p\right)^2$$

where $p$ is the number of patterns in the training set.

Step 10: End.

Bp is easy to implement and has been shown to produce relatively good results in many applications. It is capable of approximating arbitrary non-linear mappings. However, two serious disadvantages of the Bp algorithm are its slow rate of convergence, which requires very long training times, and its tendency to get stuck in local minima. The success of Bp methods depends very much on problem-specific parameter settings and on the topology of the network [9].
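Steps 1 to 10 above can be sketched in a few lines of plain Python. This is a minimal illustration only, not the MATLAB implementation used in this work: the tanh activation, the toy data set, the network size, the function names and the values of the learning rate and momentum are all assumptions chosen so the example runs quickly. As in the equations of steps 2 and 3, no bias terms are included.

```python
import math
import random


def f(u):
    """Activation function (assumed here: hyperbolic tangent)."""
    return math.tanh(u)


def f_prime(u):
    """Derivative of the tanh activation."""
    return 1.0 - math.tanh(u) ** 2


def forward(X, W_ih, W_ho):
    """Steps 2 and 3: hidden outputs h_j and network outputs O_k."""
    n_hidden, n_out = len(W_ho), len(W_ho[0])
    net_h = [sum(X[i] * W_ih[i][j] for i in range(len(X))) for j in range(n_hidden)]
    h = [f(u) for u in net_h]
    net_o = [sum(h[j] * W_ho[j][k] for j in range(n_hidden)) for k in range(n_out)]
    O = [f(u) for u in net_o]
    return net_h, h, net_o, O


def train_pattern(X, d, W_ih, W_ho, dW_ih, dW_ho, eta, alpha):
    """Steps 4 to 8 for one training pattern (weights are updated in place)."""
    net_h, h, net_o, O = forward(X, W_ih, W_ho)
    # Step 4: error term of each output neuron.
    delta_k = [(d[k] - O[k]) * f_prime(net_o[k]) for k in range(len(O))]
    # Step 6: error term of each hidden neuron (uses the not-yet-updated W_jk).
    delta_j = [sum(delta_k[k] * W_ho[j][k] for k in range(len(O))) * f_prime(net_h[j])
               for j in range(len(h))]
    # Steps 5 and 8: correction with momentum, then update of W_jk.
    for j in range(len(h)):
        for k in range(len(O)):
            dW_ho[j][k] = eta * delta_k[k] * h[j] + alpha * dW_ho[j][k]
            W_ho[j][k] += dW_ho[j][k]
    # Steps 7 and 8: correction with momentum, then update of W_ij.
    for i in range(len(X)):
        for j in range(len(h)):
            dW_ih[i][j] = eta * delta_j[j] * X[i] + alpha * dW_ih[i][j]
            W_ih[i][j] += dW_ih[i][j]


def mse(patterns, W_ih, W_ho):
    """Step 9: mean squared error over all patterns, with the 1/(2p) factor."""
    total = 0.0
    for X, d in patterns:
        O = forward(X, W_ih, W_ho)[3]
        total += sum((d[k] - O[k]) ** 2 for k in range(len(d)))
    return total / (2 * len(patterns))


def train(patterns, n_hidden=3, eta=0.1, alpha=0.5, epochs=200, seed=0):
    """Step 1 (small random weights) followed by repeated passes over the data."""
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    W_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    W_ho = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
    dW_ih = [[0.0] * n_hidden for _ in range(n_in)]
    dW_ho = [[0.0] * n_out for _ in range(n_hidden)]
    history = [mse(patterns, W_ih, W_ho)]
    for _ in range(epochs):
        for X, d in patterns:
            train_pattern(X, d, W_ih, W_ho, dW_ih, dW_ho, eta, alpha)
        history.append(mse(patterns, W_ih, W_ho))
    return W_ih, W_ho, history


# Toy data (an assumption for illustration): two inputs, one smooth target.
toy = [([x1, x2], [0.5 * x1 - 0.3 * x2])
       for x1 in (-1.0, 0.0, 1.0) for x2 in (-1.0, 0.0, 1.0)]
W_ih, W_ho, history = train(toy)
print(f"MSE before training: {history[0]:.4f}, after: {history[-1]:.4f}")
```

A fixed number of epochs is used here in place of the error threshold of step 9; in practice either stopping rule, or both, can be applied.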
The Activation Function Used with Back-Propagation

Three transfer functions are most commonly used with back-propagation, although other differentiable transfer functions can be created and used if desired. These functions are tansig, logsig, and purelin. The function logsig generates outputs between 0 and 1 as the neuron's net input goes from negative to positive infinity. Alternatively, multilayer networks may use the tan-sigmoid transfer function tansig. Occasionally, the linear transfer function purelin is used in back-propagation networks [8]. If the last layer of a multilayer network has sigmoid neurons, the outputs of the network are limited to a small range. If linear output neurons are used, the network outputs can take any value. In the present simulation tansig is used.

Modeling Correlation of ANN

The modeling of the ANN correlation began with the collection of a large data bank, followed by the preparation of the learning file, which was made by randomly selecting about 70% of the database to train the network. The remaining 30% of the data was then used to check the generalization capability of the model. The last step was to perform a neural correlation and to validate it statistically. The steps of the modeling are therefore as follows.

Collection of Data

The first step is the collection of data. Many investigators have studied the hydrodynamics of the RDC based on the dispersed phase hold-up. In this model about 611 experimental points have been collected for mass transfer from the continuous to the dispersed phase (c → d), for mass transfer from the dispersed to the continuous phase (d → c), and for the case of no mass transfer in the RDC. The data were divided into training and test sets: the neural network was trained on 70% of the data and tested on 30%. The data include nine chemical systems with a wide range of rotor speeds and velocities of both the continuous and the dispersed phase, as well as the physical properties of each chemical system.
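The random 70/30 division of the databank described above can be sketched as follows. The function name, the fixed random seed and the use of integer indices in place of real records are assumptions made for illustration; rounding the training share down is also an assumption, although it happens to reproduce the 427 training points and 184 test points quoted elsewhere in this paper.

```python
import random


def split_databank(records, train_fraction=0.7, seed=0):
    """Randomly split records into a training set and a test set."""
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = int(train_fraction * len(shuffled))  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder databank: indices standing in for the 611 experimental points.
databank = list(range(611))
train_set, test_set = split_databank(databank)
print(len(train_set), len(test_set))  # prints: 427 184
```

Every point ends up in exactly one of the two sets, so the test set contains only data the network has not seen during training.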
All of these parameters (rotor speed, phase velocities and physical properties) are inputs to the neural network, and there is one output: the hold-up of the dispersed phase.

The Structure of Artificial Neural Network

In this work a multilayer neural network has been used, as it is effective in finding complex non-linear relationships. It has been reported that multilayer ANN models with only one hidden layer are universal approximators. Hence, a three-layer feed-forward neural network was chosen as the correlation model. The weighting coefficients of the neural network were calculated using MATLAB programming. The structure of the artificial neural network was built as follows:
1. Input layer: A layer of neurons that receive information from external sources and pass this information to the network for processing. These may be either sensory inputs or signals from other systems outside the one being modeled. In this work there are six input neurons in this layer, and a set of 427 data points is available for training.
2. Hidden layer: A layer of neurons that receives information from the input layer and processes it in a hidden way. It has no direct connections to the outside world (inputs or output). All connections from the hidden layer are to other layers within the system. The hidden layer contains twenty-one neurons. This number gave the best results and was found by trial and error. If the number of neurons in the hidden layer is larger, the network becomes complicated. The results suggest that the present problem is not complex enough to require a more complicated network, so satisfactory results can be achieved by keeping the number of neurons in the hidden layer at twenty-one.
3. Output layer: A layer of one neuron that receives processed information and sends output signals out of the system. Here the output is the hold-up of the dispersed phase in the RDC.
4. Bias: The function of the bias is to provide a threshold for the activation of neurons.
The bias input is connected to each of the hidden neurons in the network. The structure of the multilayer ANN model is illustrated in figure (1).

Training of Artificial Neural Network

The training phase starts with randomly chosen initial weight values. The back-propagation algorithm is then applied; after each iteration, the weights are modified so that the cumulative error decreases. In back-propagation, the weight changes are proportional to the negative gradient of the error. More details about this learning algorithm are shown in figure (1). Back-propagation can perform very well. The algorithm is used to calculate the values of the weights, and the following procedure (called "supervised learning") is used to determine the values of the weights of the network:
1. For a given ANN architecture, the weights in the network are initialized as small random numbers.
2. The inputs of the training set are sent to the network and the resulting outputs are calculated.
3. The measure of the error between the outputs of the network and the known correct (target) values is calculated.
4. The gradients of the objective function with respect to each of the individual weights are calculated.
5. The weights are changed according to the optimization search direction.
6. The procedure returns to step 2.
7. The iteration terminates when the value of the objective function, calculated using the data in the test set, approaches the experimental value.

The trial and error used to find the best ANN correlation model is shown in table (2).

Table (2) Network parameters in ANN model

Structure   MSE      No. of iterations   Learning rate   Momentum coefficient   Transfer function
[6-16-1]    0.1      2590                0.7             0.9                    Tan sigmoid
[6-18-1]    0.01     4321                0.65            0.9                    Tan sigmoid
[6-21-1]    0.0001   9103                0.75            0.9                    Tan sigmoid

The lower the MSE (Mean Square Error), the more accurate the network, because the MSE is defined as:

$$MSE = \frac{1}{2p}\sum_p \sum_k \left(d_k^p - O_k^p\right)^2 \qquad (13)$$

where $p$ is the number of patterns in the training set, $k$ indexes the output neurons, $d_k^p$ is the desired output and $O_k^p$ is the actual output.

In the learning process, the data from the input neurons are propagated through the network via the interconnections. Each neuron in a layer is connected to every neuron in the adjacent layers, and a scalar weight is associated with each interconnection. Neurons in the hidden layer receive weighted inputs from each of the neurons in the previous layer, sum these weighted inputs, and pass the resulting sum through a non-linear activation function (the tan-sigmoid function).

The way artificial neural networks learn patterns can be equated to determining the proper values of the connection strengths (i.e. the weight matrices wh and wo illustrated in figure 1) that allow all the nodes to achieve the correct state of activation for a given pattern of inputs. The matrix, bias and vector given in equations (14), (15) and (16) show the resulting weight coefficients for the ANN correlation, where wl is the matrix containing the weight vectors for the nodes in the hidden layer, Wo is the vector containing the weights for the nodes in the output layer and b is the bias.

[Equation (14): hidden-layer weight matrix]
[Equation (15): bias]
[Equation (16): output-layer weight vector]

Simulation Results

The network architecture used for predicting hold-up, illustrated in figure (1), consists of six input neurons corresponding to the state variables of the system, 21 hidden neurons and one output neuron.
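To make the [6-21-1] structure concrete, the sketch below runs one forward pass through a network of that shape. It is illustrative only: the weights are small random placeholders rather than the trained values of this work, the input vector is an arbitrary normalized example, and the use of a tan-sigmoid (tanh) output neuron without an output bias is an assumption.

```python
import math
import random

N_INPUT, N_HIDDEN = 6, 21  # six inputs, 21 hidden neurons, one output


def init_network(seed=0):
    """Placeholder weights (hypothetical, not the trained coefficients)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)]
                for _ in range(N_HIDDEN)]       # 21 x 6 weight matrix
    b_hidden = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]  # hidden biases
    w_output = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]  # output weights
    return w_hidden, b_hidden, w_output


def predict(inputs, w_hidden, b_hidden, w_output):
    """Forward pass: tanh hidden layer with bias, then a tanh output neuron."""
    if len(inputs) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} inputs, got {len(inputs)}")
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return math.tanh(sum(w * h for w, h in zip(w_output, hidden)))


w_hidden, b_hidden, w_output = init_network()
# Arbitrary normalized inputs standing in for the six state variables.
y = predict([0.5, 0.2, 0.3, 0.1, 0.4, 0.6], w_hidden, b_hidden, w_output)
n_params = N_HIDDEN * N_INPUT + len(b_hidden) + len(w_output)
print(f"output = {y:.4f}, adjustable coefficients = {n_params}")
```

With these assumptions the network has 126 input-to-hidden weights, 21 hidden biases and 21 hidden-to-output weights, that is, 168 adjustable coefficients.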
All neurons in each layer were fully connected to the neurons in the adjacent layer. The prediction of the ANN correlation is plotted in figure (2), which compares the predicted hold-up with the experimental hold-up for the training set.

Figure (2) Comparison between experimental and predicted hold-up in training set

Figure (3) Comparison between experimental and predicted hold-up in testing set

Test of the Proposed ANN

The purely empirical model was tested on data that were not used to train the neural network and yielded very accurate predictions. After the successful completion of training, another data set was employed to test the hold-up prediction of the network. The same model was used to generate 184 new data values. The predictions are plotted against the experimental values in figure (3).

Statistical Analysis

A statistical analysis based on the test data was carried out to validate the accuracy of the output of the previous ANN-based correlation model. The structure of each model should give the best output prediction, which is checked by statistical analysis. The statistical analysis of the prediction is based on the following criteria:

1. The AARE (Average Absolute Relative Error) should be a minimum:

$$AARE = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{x_{prediction} - x_{experimental}}{x_{experimental}}\right| \qquad (17)$$

where $N$ here is the number of data points and $x$ is the hold-up.

2. The standard deviation should be a minimum:

$$SD = \sqrt{\frac{\sum\left[\left|\dfrac{x_{prediction} - x_{experimental}}{x_{experimental}}\right| - AARE\right]^2}{N-1}} \qquad (18)$$

3. The correlation coefficient $R$ between the experimental and predicted values should be close to unity:

$$R = \frac{\sum_{i=1}^{N}\left(x_{experimental(i)} - \bar{x}_{experimental}\right)\left(x_{prediction(i)} - \bar{x}_{prediction}\right)}{\sqrt{\sum_{i=1}^{N}\left(x_{experimental(i)} - \bar{x}_{experimental}\right)^2 \sum_{i=1}^{N}\left(x_{prediction(i)} - \bar{x}_{prediction}\right)^2}} \qquad (19)$$

where $\bar{x}_{experimental}$ is the mean hold-up of the experimental points and $\bar{x}_{prediction}$ is the mean hold-up of the predicted points.

The literature correlations (in table 1) were used to estimate the hold-up.
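The three criteria above can be computed with a few short functions. The sketch assumes the usual forms of the definitions: an absolute relative error in the AARE, a square root over the N - 1 normalized sum in the SD, and the Pearson form of R. The function names and the four sample hold-up values are illustrative assumptions, not data from this work.

```python
import math


def relative_errors(predicted, experimental):
    """Absolute relative error of each prediction."""
    return [abs((p - e) / e) for p, e in zip(predicted, experimental)]


def aare(predicted, experimental):
    """Average Absolute Relative Error (as a fraction, not a percentage)."""
    errors = relative_errors(predicted, experimental)
    return sum(errors) / len(errors)


def standard_deviation(predicted, experimental):
    """Standard deviation of the absolute relative errors about the AARE."""
    errors = relative_errors(predicted, experimental)
    mean_error = sum(errors) / len(errors)
    return math.sqrt(sum((err - mean_error) ** 2 for err in errors)
                     / (len(errors) - 1))


def correlation_coefficient(predicted, experimental):
    """Pearson correlation coefficient between experiment and prediction."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    numerator = sum((e - mean_e) * (p - mean_p)
                    for p, e in zip(predicted, experimental))
    denominator = math.sqrt(sum((e - mean_e) ** 2 for e in experimental)
                            * sum((p - mean_p) ** 2 for p in predicted))
    return numerator / denominator


# Hypothetical hold-up values, for illustration only.
x_exp = [0.05, 0.10, 0.15, 0.20]
x_pred = [0.055, 0.095, 0.15, 0.21]
print(f"AARE = {100 * aare(x_pred, x_exp):.2f}%")
print(f"SD   = {100 * standard_deviation(x_pred, x_exp):.2f}%")
print(f"R    = {correlation_coefficient(x_pred, x_exp):.4f}")
```

A perfect prediction gives AARE = 0, SD = 0 and R = 1, which is the limit the three criteria aim for.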
The literature correlations show poor agreement between the predicted and experimental hold-up values compared with the ANN correlation. Table (3) compares these correlations with the ANN prediction for the testing set.

Table (3) Comparison of ANN and previous literature correlations in testing set

Correlation          AARE%    S.D%     R
Kasatkin (1962)      51.93    32.55    0.695
Murakami (1978)      41.29    23.94    0.7914
Hartland (1987)      32.79    22.59    0.778
Kalaichelvi (1998)   32       27.63    0.726
ANN (this work)      6.52     9.21     0.998

Conclusions

The ANN correlation shows a noticeable improvement in the prediction of dispersed phase hold-up. The neural network correlation yields an AARE of 6.52% and a standard deviation of 9.21%, which are better than the values obtained for the selected literature correlations. The ANN correlation also yielded improved predictions for a variety of liquid systems and a wide range of operating parameters. The numbers of input and output units are fixed by the problem (here, 6 and 1 respectively), but the choice of the number of hidden units is flexible. In this work the best results were obtained with 21 hidden neurons.
Nomenclature

a      Interfacial mass transfer area                        m²/m³
b      Bias
Δc     Concentration driving force                           kg/m³
d32    Sauter mean diameter
Dr     Diameter of rotary disk                               m
Ds     Stator ring opening                                   m
Dt     Diameter of RDC column                                m
f      The activation function
f'     The derivative of the activation function
g      Gravitational acceleration                            m/s²
hj     The actual output of hidden neuron j
K      Mass transfer coefficient                             m/s
n      Number of input neurons
N      Speed of rotor disc                                   rps
Ok     The actual output of neuron k
p      The number of patterns in the training set
R      Correlation coefficient
V      Volume of column                                      m³
vc     Velocity of continuous phase                          m/s
vd     Velocity of dispersed phase                           m/s
vo     Characteristic velocity                               m/s
vs     Slip velocity                                         m/s
W      Rate of mass transfer                                 kg/s
Wij    Synaptic weights between input and hidden neuron
Wjk    Synaptic weights between hidden and output neuron
x      Hold-up
Xi     Input signal of input neuron i
x̄      Mean hold-up
zc     Height of compartment                                 m
zt     Height of RDC column                                  m

Greek symbols

α      Momentum to accelerate the network convergence process
δk     The error term
η      The learning rate
µ      Viscosity                                             kg/m.s
σ      Interfacial tension                                   N/m
ρ      Density                                               kg/m³
Δρ     Density difference                                    kg/m³

Subscripts

c      Continuous phase
d      Dispersed phase

References

1. Bailes, P.J., Gledhill, J., Godfrey, J.C., and Slater, M.J., "Hydrodynamic behavior of packed, rotating disc contactor, and Kuhn", Chem. Eng. Res. Des., 64, 43-55, (1986).
2. Vermijs, H.J., and Kramers, H., "Liquid-liquid extraction in rotating disc contactor", Chem. Eng. Sci., 3, 55-64, (1954).
3. Logsdail, D.H., Thornton, J.D., and Pratt, H.R.C., "Liquid-liquid extraction part XII: flooding rates and performance data for a rotating disc contactor", Trans. Inst. Chem. Eng., 35, 301-315, (1957).
4. David M.H., "Applications of artificial neural networks in chemical engineering", Korean J. Chem. Eng., 17, 373-392, (2002).
5. Patterson, "Artificial neural networks theory and applications", Prentice Hall, (1996).
6. Sivanadam, S.N., "Introduction to artificial neural networks", Vikas Publishing House Pvt. Ltd, (2003).
7. Freeman, J.A., and Skapura, D.M., "Neural networks", Jordan University of Science and Technology, July (1992).
8. MATLAB, Version 7, June 2003, "Neural network toolbox".
9. Leonard, J., and Kramer, M.A., "Improvement of the back-propagation algorithm for training neural networks", Comp. Chem. Eng., 14, 337-341, (1990).
10. Lendaris, G., "Supervised learning in ANN from introduction to artificial intelligence", New York, April 7, (2004).
11. Lippmann, R.P., "An introduction to computing with neural nets", IEEE Magazine, April, pp. 4-22, (1987).
12. Kalaichelvi, P., and Murugesan, T., "Dispersed phase hold up in rotating disc contactor", Bioprocess Engineering, 18, 105-111, (1998).
13. Kasatkin, A.G., and Kagan, S.Z., Appl. Chem. USSR, 35, 1903, (1962) [cited in Murakami, A., Misonou, A., and Inoue, A., 1978].
14. Kumar, A., and Hartland, S., "Independent prediction of slip velocity and hold up in liquid-liquid extraction columns", Can. J. Chem. Eng., 67, 17, (1987).
15. Murakami, A., Misonou, A., and Inoue, K., "Dispersed phase hold up in a rotating disc extraction column", International Chemical Engineering, 18, 1, (1978).