CHEMICAL ENGINEERING TRANSACTIONS VOL. 61, 2017
A publication of The Italian Association of Chemical Engineering
Online at www.aidic.it/cet
Guest Editors: Petar S Varbanov, Rongxin Su, Hon Loong Lam, Xia Liu, Jiří J Klemeš
Copyright © 2017, AIDIC Servizi S.r.l.
ISBN 978-88-95608-51-8; ISSN 2283-9216

Soft Sensor Based on Recursive Kernel Partial Least Squares for 4-carboxybenzaldehyde of an Industrial Terephthalic Acid Hydropurification Process

Zhi Li, Weimin Zhong, Xin Peng, Wenli Du, Feng Qian*
Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education, East China University of Science and Technology, Shanghai, 200237, China
fqian@ecust.edu.cn

Terephthalic acid (TA) is a raw material for the polyester and textile industries. However, the by-product 4-carboxybenzaldehyde (4-CBA) in TA is harmful to the polymerization process because it lowers the polymerization rate and the average molecular weight. A hydropurification process is therefore used to reduce the 4-CBA content. In this process, 4-CBA in TA is hydrogenated in water at 270 °C–290 °C and 7.9 MPa over a 0.5 wt. % carbon-coated palladium catalyst in a fixed-bed reactor. The activity of the catalyst gradually decreases as the process runs. The most important quality index for this process is the 4-CBA content of the product. In the real plant, however, the 4-CBA content is analysed only every two hours in a laboratory. This delay is too long for the control system, so the operating conditions cannot be adjusted in time, which may result in up to two hours of off-specification product. In this paper, a first-principles model of the process is developed in Aspen Plus. The accuracy of the model is verified by comparing the model results with actual plant results. Based on this model, a series of sensitivity analyses is performed.
Six variables, namely the 4-CBA content in TA, reaction temperature, reaction pressure, hydrogen flow rate, catalyst activity and slurry concentration, are the main factors influencing the 4-CBA content in the product. Although the Aspen model is accurate, its parameters are not easy to adjust quickly. To predict the 4-CBA content quickly, a soft sensor for 4-CBA is developed using recursive kernel partial least squares, which accounts for the nonlinearity and slow time variation of the process. Results show that the method predicts accurately and is easy for engineers to use.

1. Introduction
Pure terephthalic acid (PTA) is one of the most important raw materials for the polymer industry (Li et al., 2016). The PTA process has two main sections. The first is the p-xylene (PX) oxidation process, in which PX in acetic acid solvent is oxidized to TA by air or molecular oxygen in a continuous stirred tank reactor (Qian et al., 2012). However, the product TA contains an impurity, 4-carboxybenzaldehyde (4-CBA). TA with a high 4-CBA content is called crude terephthalic acid (CTA). 4-CBA lowers the polymerization rate and the average molecular weight in the polymerization process. It is also difficult to separate by physical methods because its molecular weight is similar to that of TA. The second section, the hydropurification process, reduces the 4-CBA content of CTA. In this section, CTA is first mixed with deionized water in a storage tank, and the slurry is then gradually heated to about 273 °C through five heat exchangers in series. The slurry and hydrogen are then fed to the top of a fixed-bed reactor. 4-CBA reacts with hydrogen and is converted to p-toluic acid in the presence of a 0.5 wt. % carbon-coated palladium (Pd/C) catalyst. p-Toluic acid is easy to separate from TA by crystallization and centrifugation (Azarpour and Zahedi, 2012). In this process, the 4-CBA content is reduced from 3,000 ppm to less than 25 ppm.
This meets the material requirement of the polymer industry (Li et al., 2015). In the real CTA purification process, the 4-CBA content of the final product is the most important quality index. However, it is not measured online but tested every two hours in a laboratory. This is a long delay for the whole process, so the laboratory results cannot reflect the system status in real time, which may result in up to two hours of off-specification product.

Please cite this article as: Li Z., Zhong W., Peng X., Du W., Qian F., 2017, Soft sensor based on recursive kernel partial least squares for 4-carboxybenzaldehyde of an industrial terephthalic acid hydropurification process, Chemical Engineering Transactions, 61, 463-468 DOI:10.3303/CET1761075

Although process simulators such as Aspen and PRO-II are popular in this field, they are still not easy for engineers and operators to use. Moreover, as production proceeds, the operating conditions need to be adjusted in real time, and it is not easy to change all of the conditions in a simulator. For this reason, many soft sensors have been developed. The common approach is data-based regression, such as principal component analysis (PCA) (Farsang et al., 2015), partial least squares (PLS) (Facco et al., 2009) and kernel partial least squares (KPLS) (Godoy et al., 2014). CTA hydropurification is a nonlinear process, and the catalyst activity is a slowly time-varying variable that decreases gradually after the process begins. PCA and PLS can handle only linear problems, and KPLS has difficulty with slow time variation. Thus, a recursive kernel partial least squares (RKPLS) method is developed. The kernel handles the nonlinearity, and the recursive update handles the slow time variation. The rest of this paper is organized as follows. Section 2 describes the process in detail.
Section 3 selects the 6 essential variables that influence the final 4-CBA content through a series of sensitivity analyses. Section 4 describes the proposed recursive kernel partial least squares method, applies it to the real plant data and discusses the results. Finally, conclusions are drawn in Section 5.

2. CTA hydropurification process
The CTA hydropurification process is of great importance in PTA production. CTA containing 3,000 ppm 4-CBA from the PX oxidation process is mixed with deionized water in a storage tank to give a slurry with a CTA mass fraction of about 29 %. The slurry is then pumped through five heat exchangers in series, where its temperature rises from ambient to 273 °C. During this pre-heating, the CTA dissolves in the deionized water. The slurry and hydrogen are then fed to the top of a fixed-bed reactor. The reactor bed is filled with a 0.5 wt. % carbon-coated palladium (Pd/C) catalyst. In the reactor, 4-CBA reacts with hydrogen at a pressure of 7.9 MPa and is converted to p-toluic acid. The residence time is about 20 min. The product then leaves from the bottom of the reactor. Through five crystallizers in series, the product temperature decreases gradually from 273 °C to ambient, and PTA crystallizes as the temperature falls. In the final product, the 4-CBA content is less than 25 ppm. The flowchart is shown in Figure 1.

Figure 1: Process flowchart of pure terephthalic acid hydropurification process

CTA hydropurification is a complicated chemical process. The reactions are shown in Eq(1) and Eq(2). The two consecutive steps in Eq(1) are the main reactions: the intermediate product is 4-hydroxymethylbenzoic acid (4-HMBA), and the final product is p-toluic acid. Eq(2) is a side decarbonylation reaction whose final product is benzoic acid (Zhang et al., 2008). The kinetics of the reactions are listed below.
Eqs(3)-(7) give the rate equations of the main and side reactions (Zhou et al., 2006a). The kinetic data are shown in Table 1 (Zhou et al., 2006b).

$$\text{4-CBA } (\mathrm{C_8H_6O_3}) \xrightarrow[\text{Hydrogen}]{\text{Pd/C}} \text{4-HMBA } (\mathrm{C_8H_8O_3}) \xrightarrow[\text{Hydrogen}]{\text{Pd/C}} \text{p-toluic acid } (\mathrm{C_8H_8O_2}) \tag{1}$$

$$\text{4-CBA } (\mathrm{C_8H_6O_3}) \xrightarrow[\text{Hydrogen}]{\text{Pd/C}} \text{Benzoic acid } (\mathrm{C_7H_6O_2}) \tag{2}$$

$$-\frac{dc_{4\text{-}CBA}}{dt} = k_{01} e^{-E_1/RT} C_{4\text{-}CBA}^{n_1} C_{H_2}^{n_2} \tag{3}$$

$$\frac{dc_{4\text{-}HMBA}}{dt} = k_{01} e^{-E_1/RT} C_{4\text{-}CBA}^{n_1} C_{H_2}^{n_2} - k_{02} e^{-E_2/RT} C_{4\text{-}HMBA}^{n_3} C_{H_2}^{n_4} \tag{4}$$

$$\frac{dc_{p\text{-toluic acid}}}{dt} = k_{02} e^{-E_2/RT} C_{4\text{-}HMBA}^{n_3} C_{H_2}^{n_4} \tag{5}$$

$$-\frac{dc_{4\text{-}CBA}}{dt} = k_{03} e^{-E_3/RT} C_{4\text{-}CBA}^{n_5} \tag{6}$$

$$\frac{dc_{BA}}{dt} = k_{03} e^{-E_3/RT} C_{4\text{-}CBA}^{n_5} \tag{7}$$

Table 1: Kinetic data of the reactions
Item | Data in reference (Zhang et al., 2008) | Data in this study (Li et al., 2015)
Frequency factor | $k_{01} = 0.67$, $k_{02} = 0.2558$, $k_{03} = 69$ | $k_{01} = 0.69$, $k_{02} = 0.5971$, $k_{03} = 68$
Reaction order | $n_1 = 0.98$, $n_2 = 0.26$, $n_3 = 0.70$, $n_4 = 0.60$, $n_5 = 0.30$
Activation energy | $E_1 = 18.66$ kJ/mol, $E_2 = 28.04$ kJ/mol, $E_3 = 724.143$ kJ/mol

3. Sensitivity analysis and variables selection
Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be apportioned to different sources of uncertainty in its inputs (Saltelli, 2002). In the whole process, 7 main variables may influence the final product: the 4-CBA content in CTA, mass flow of CTA, mass flow of deionized water, mass flow of hydrogen, reactor temperature, reactor pressure and catalyst activity. Sensitivity analysis is used here to better understand the relationships between the inputs and the output of the process and to find the input variables most relevant to the output. Sensitivity analyses of these 7 input variables are carried out in Aspen Plus. The variation range of each variable is set to ±10 %, except for the reactor temperature, whose range is set to ±5 % because the catalyst loses activity too quickly at higher temperatures.
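The one-at-a-time perturbation described above can be sketched in Python. This is only an illustration and not the Aspen Plus study: the stand-in model is the Eq(3) rate law with the "this study" constants from Table 1, the unit concentrations are hypothetical, the gas constant is taken as 8.314e-3 kJ/(mol K), the ±5 % temperature span is assumed to apply to the Celsius value, and the names `cba_rate` and `one_at_a_time` are invented for this example.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K), assumed unit basis for E1


def cba_rate(temp_c, c_cba, c_h2, k01=0.69, e1=18.66, n1=0.98, n2=0.26):
    """Eq(3) rate law: k01 * exp(-E1/(R T)) * C_CBA^n1 * C_H2^n2."""
    temp_k = temp_c + 273.15
    return k01 * math.exp(-e1 / (R_KJ * temp_k)) * c_cba ** n1 * c_h2 ** n2


def one_at_a_time(model, base, spans):
    """Perturb each input by -span and +span with the others held at base.

    Returns {name: (percent change at -span, percent change at +span)}.
    """
    y0 = model(**base)
    changes = {}
    for name, span in spans.items():
        pair = []
        for sign in (-1.0, 1.0):
            inputs = dict(base)
            inputs[name] = base[name] * (1.0 + sign * span)
            pair.append(100.0 * (model(**inputs) - y0) / y0)
        changes[name] = tuple(pair)
    return changes


# Hypothetical base point: 273 C as in the text, unit concentrations.
base = {"temp_c": 273.0, "c_cba": 1.0, "c_h2": 1.0}
spans = {"temp_c": 0.05, "c_cba": 0.10, "c_h2": 0.10}

sensitivity = one_at_a_time(cba_rate, base, spans)
for name, (low, high) in sensitivity.items():
    print(f"{name}: {low:+.2f} % / {high:+.2f} %")
```

The percentages here describe the reaction rate of the stand-in model, not the final 4-CBA flow reported by the flowsheet, so they are not comparable with the values in Table 2.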
Figure 2 shows the influence of 6 of these variables (all except reactor pressure) on the final 4-CBA content. The normal conditions of the input and output variables are listed in Table 2, together with the variation ranges and the resulting percentage changes. Table 2 and Figure 2 show clearly that catalyst activity is the most relevant input variable for the final 4-CBA content, while reactor pressure has almost no influence on it. Therefore, the 6 variables other than reactor pressure are selected as the input variables.

Table 2: Data of sensitivity analysis of 7 variables
Variable | Normal condition | Variation | Range | Final 4-CBA (2.059 kg/h in normal condition) | Change in final 4-CBA
4-CBAin | 176 kg/h | ±10 % | 158.4 ~ 193.6 kg/h | 1.346 ~ 2.694 kg/h | -34.6% ~ 30.8%
CTA | 80,000 kg/h | ±10 % | 72,000 ~ 88,000 kg/h | 1.591 ~ 2.461 kg/h | -22.7% ~ 19.5%
H2O | 141,111 kg/h | ±10 % | 127,000 ~ 155,222 kg/h | 0.946 ~ 3.499 kg/h | -54.1% ~ 69.9%
H2 | 30 kg/h | ±10 % | 27 ~ 33 kg/h | 2.514 ~ 1.655 kg/h | 22.1% ~ -19.6%
T | 273 °C | ±5 % | 259.35 ~ 286.65 °C | 5.019 ~ 0 kg/h | 143.8% ~ -100%
P | 7.9 MPa | ±10 % | 7.11 ~ 8.69 MPa | 2.059 ~ 2.058 kg/h | 0% ~ 0%
CA | 0.69 | ±10 % | 0.621 ~ 0.759 | 3.861 ~ 0.765 kg/h | 87.5% ~ -62.9%

Figure 2: Sensitivity analyses of 6 variables (six panels showing outlet 4-CBA mass flow, kg/h, versus inlet 4-CBA mass flow, kg/h; CTA mass flow, kg/h; deionized water mass flow, kg/h; hydrogen mass flow, kg/h; reactor temperature, °C; and catalyst activity, kmol kgc-1 s-1)

4. Methodology and application illustration
CTA hydropurification is a complex nonlinear and time-varying process. The traditional PLS method cannot handle the nonlinearity. Although KPLS can deal with nonlinear problems, the time variation of this process remains unsolved.
Thus, a recursive kernel partial least squares (RKPLS) method is developed to build the soft sensor model for this process. The procedure is shown in Table 3.

Table 3: RKPLS method
Recursive kernel partial least squares
Step 1. Get data from the real process
Step 2. Data standardization to obtain $X_0$
Step 3. Kernelization of $X_0 \to K_0$
Step 4. Update when a new $X_{new}$ comes to get $X_1$
Step 5. Update the new kernel of $X_1 \to K_1$
Step 6. PLS method
Step 7. Return to step 4
Step 8. Calculate $Y_{pred}$

A Gaussian kernel is used for the kernelization in step 3 to calculate $K_0$. X and Y are updated as follows,

$$X_1 = \lambda X_0 + \gamma X_{new}, \quad Y_1 = \lambda Y_0 + \gamma Y_{new} \tag{8}$$

where $\lambda$ is the forgetting factor, $0 \le \lambda \le 1$, and $\gamma = 1 - \lambda$. In this study, $\lambda$ is set to 0.8. It can be seen from Figure 3 that the new $X_1$ contains information from both $X_0$ and $X_{new}$. The new kernel $K_1$ is calculated as follows,

$$K_1 = K_0 - t_0 t_0^T K_0 - K_0 t_0 t_0^T + t_0 t_0^T K_0 t_0 t_0^T \tag{9}$$

where $t_0 = K_0 Y_0$, $t_0 \leftarrow t_0 / \lVert t_0 \rVert$. After step 5, traditional PLS is used to calculate the matrices P and W. The prediction $Y_{pred}$ is then calculated as follows,

$$Y_{pred} = K_t P \left( W^T K_0 P \right)^{-1} W^T Y_t \tag{10}$$

Figure 3: Soft sensor model developments for CTA hydropurification process using RKPLS

The computational cost of this method is higher than that of KPLS and PLS because the kernel is recomputed at every update. However, this method can handle both nonlinearity and slow time variation. As it is not easy to get real plant data directly, 400 sampling points of the 6 input variables and the corresponding output 4-CBA results are collected using Aspen Plus. In the simulation, the secondary variables are given random fluctuations to simulate the disturbances in the real plant. Prediction results using the PLS, KPLS and RKPLS methods are shown in Figure 4. The PLS method is not suitable for this complex process. The KPLS predictions become progressively worse as the number of sampling points increases.
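As an aside on steps 3 to 5, the forgetting-factor update of Eq(8) followed by a Gaussian kernelization can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the kernel is taken in the common form exp(-||x_i - x_j||^2 / (2 sigma^2)) with a hypothetical width `sigma`, the old and new data blocks are assumed to have the same shape so that Eq(8) can be applied element-wise, and the function names and sample values are invented for illustration.

```python
import math


def forgetting_update(old, new, lam=0.8):
    """Eq(8): lam * old + (1 - lam) * new, element-wise on equal-shaped blocks."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    gamma = 1.0 - lam
    return [
        [lam * a + gamma * b for a, b in zip(row_old, row_new)]
        for row_old, row_new in zip(old, new)
    ]


def gaussian_kernel(X, sigma=1.0):
    """Kernel matrix with K[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [[math.exp(-sq_dist(u, v) / (2.0 * sigma ** 2)) for v in X] for u in X]


# Hypothetical standardized data: two samples, two variables.
X0 = [[0.0, 0.0], [1.0, 0.0]]
X_new = [[1.0, 1.0], [1.0, 2.0]]

X1 = forgetting_update(X0, X_new, lam=0.8)  # blends old and new information
K1 = gaussian_kernel(X1, sigma=1.0)         # kernel recomputed on updated data

print(X1)
print(K1)
```

With λ = 0.8, each updated entry keeps 80 % of the old value and takes 20 % from the new one, which is how the model can follow slow drift such as catalyst deactivation. The sketch recomputes the kernel from the updated data and does not include the deflation of Eq(9) or the PLS step.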
KPLS degrades because the catalyst is slowly deactivating, and the KPLS method cannot handle this slow time variation. The RKPLS predictions remain good from beginning to end, which demonstrates that the method is suitable for the nonlinear and slowly time-varying CTA hydropurification process.

Figure 4: Prediction results using PLS, KPLS and RKPLS (panels (a) PLS, (b) KPLS and (c) RKPLS, each showing real data and predicted outlet 4-CBA mass flow, kg/h, versus sampling point)

In addition, the root mean square error of calibration (RMSECV), the root mean square error of prediction (RMSEP), and the maximum, minimum and mean errors are chosen as the performance indexes to compare the results of the PLS, KPLS and RKPLS methods. The lower the values of RMSECV and RMSEP, the better the soft sensor model. RMSECV is calculated by Eq(11),

$$RMSECV = \sqrt{\frac{\sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}{n}} \tag{11}$$

where n is the number of sampling points in the calibration set, $y_i$ is the real value of sample i in the calibration set, and $\hat{y}_i$ is the prediction of the regression model. RMSEP is calculated by Eq(12),

$$RMSEP = \sqrt{\frac{\sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2}{n}} \tag{12}$$

where n is the number of sampling points in the prediction set, $y_i$ is the real value of sample i in the prediction set, and $\hat{y}_i$ is the prediction of the regression model.

Table 4 shows the error comparisons among these methods. RKPLS gives the lowest prediction errors of the three methods, with an RMSEP of 6.288E-03 against 1.129E-02 for KPLS and 1.645E+00 for PLS, although its RMSECV (1.668E-02) is higher than that of KPLS (8.484E-03).
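The error indexes of Eqs(11) and (12) share one formula and differ only in the data set they are evaluated on, which a short Python sketch can make concrete. The sample values below are hypothetical and are not taken from the study, the helper names are invented here, and the minimum, maximum and mean are assumed to refer to absolute errors, which the text does not state explicitly.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error as in Eqs(11) and (12)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def error_summary(y_true, y_pred):
    """Min, max and mean absolute error plus RMSE for one data set."""
    abs_err = [abs(p - t) for t, p in zip(y_true, y_pred)]
    return {
        "min": min(abs_err),
        "max": max(abs_err),
        "mean": sum(abs_err) / len(abs_err),
        "rmse": rmse(y_true, y_pred),
    }


# Hypothetical outlet 4-CBA mass flows in kg/h (not data from the study).
y_real = [2.0, 2.5, 3.0]
y_hat = [2.1, 2.4, 3.2]

summary = error_summary(y_real, y_hat)
print(summary)
```

Calling `error_summary` once on the calibration set and once on the prediction set would give one row of eight indexes per method.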
Table 4: Error comparisons among PLS, KPLS and RKPLS
Method | MinCV | MaxCV | MeanCV | RMSECV | MinPre | MaxPre | MeanPre | RMSEPre
PLS | 8.164E-03 | 2.047E+00 | 8.182E-01 | 1.102E+00 | 6.429E-04 | 2.086E+00 | 1.840E+00 | 1.645E+00
KPLS | 3.635E-05 | 1.883E-02 | 7.422E-03 | 8.484E-03 | 5.040E-05 | 4.096E-02 | 9.386E-03 | 1.129E-02
RKPLS | 6.843E-06 | 5.077E-02 | 1.375E-02 | 1.668E-02 | 4.140E-06 | 2.426E-02 | 4.600E-03 | 6.288E-03

5. Conclusion
In this paper, a new recursive kernel partial least squares (RKPLS) method is developed and applied to build a soft sensor for a complex nonlinear and slowly time-varying process, the terephthalic acid hydropurification process. The results demonstrate that the proposed method can efficiently capture the time-varying and nonlinear relationships among the process variables. The prediction results of RKPLS are better than those of PLS and KPLS.

Acknowledgments
This work was supported by the National Natural Science Foundation of China (61422303, 21376077) and Fundamental Research Funds for Central Universities.

References
Azarpour A., Zahedi G., 2012. Performance analysis of crude terephthalic acid hydropurification in an industrial trickle-bed reactor experiencing catalyst deactivation. Chemical Engineering Journal, 209, 180-193.
Facco P., Doplicher F., Bezzo F., Barolo M., 2009. Moving average PLS soft sensor for online product quality estimation in an industrial batch polymerization process. Journal of Process Control, 19, 520-529.
Farsang B., Balogh I., Németh S., Székvölgyi Z., Abonyi J., 2015. PCA based data reconciliation in soft sensor development - Application for melt flow index estimation. Chemical Engineering Transactions, 43, 1555-1560.
Godoy J. L., Zumoffen D. A., Vega J. R., Marchetti J. L., 2014. New contributions to non-linear process monitoring through kernel partial least squares. Chemometrics & Intelligent Laboratory Systems, 135, 76-89.
Li Z., Zhong W., Liu Y., Na L., Feng Q., 2015. Dynamic Modeling and Control of Industrial Crude Terephthalic Acid Hydropurification Process. Korean Journal of Chemical Engineering, 32, 597-608.
Li Z., Zhong W., Wang X., Luo N., Qian F., 2016. Control structure design of an industrial crude terephthalic acid hydropurification process with catalyst deactivation. Computers & Chemical Engineering, 88, 1-12.
Saltelli A., 2002. Sensitivity Analysis for Importance Assessment. Risk Analysis, 22, 579-590.
Zhang S., Zhou J., Yuan W., 2008. Mathematical simulation of hydrorefining reactor for terephthalic acid. Chemical Reaction Engineering and Technology, 24.
Zhou J., Zhang T., Sui Z., 2006a. Hydropurification of Terephthalic Acid over Pd/C I. Thermodynamics and Feature Analysis. Journal of East China University of Science and Technology (Natural Science Edition), 32, 374-380.
Zhou J., Zhang T., Sui Z., 2006b. Hydropurification of Terephthalic Acid over Pd/C II. Apparent Kinetics of 4-CBA Hydrogenation on Catalysts of Different Sizes. Journal of East China University of Science and Technology (Natural Science Edition), 32, 503-507.