CHEMICAL ENGINEERING TRANSACTIONS VOL. 76, 2019 A publication of The Italian Association of Chemical Engineering Online at www.aidic.it/cet Guest Editors: Petar S. Varbanov, Timothy G. Walmsley, Jiří J. Klemeš, Panos Seferlis Copyright © 2019, AIDIC Servizi S.r.l. ISBN 978-88-95608-73-0; ISSN 2283-9216 Novel Parameter Estimation Method for Molecular Reconstruction of Naphtha by Gamma Distribution Yu Rena, Zuwei Liaob,*, Jingyuan Suna, Binbo Jianga, Jingdai Wangb, Yongrong Yangb, Qing Wuc aZhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310027, P. R. China bState Key Laboratory of Chemical Engineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310027, P. R. China cChina National Offshore Oil Corporation, Beijing 100010, P. R. China liaozw@zju.edu.cn Molecular reconstruction is an effective tool to predict the detailed composition of complex mixtures under obtainable bulk properties and chemical details. With the introduction of statistical distribution, although the number of variables in the model is reduced, the ranges of statistical distribution parameters have a great influence on the prediction accuracy. A novel parameter estimation method using the analysis of the probability density plot of gamma distribution based on the basic available information is proposed to obtain parameter ranges with wide universality and non-specificity. The obtained ranges are used in a molecular reconstruction model to predict the bulk properties and molecular composition of naphtha. The model assumes molecular composition within each homologous series follows a gamma distribution against carbon number. The results of a naphtha sample show that the calculated bulk properties and molecular composition are in good agreement with the experimental values. 1. Introduction Facing the increasingly serious energy crisis and environmental pressure, energy conservation and consumption reduction are important issues for the chemical industry. Other than process intensification tools (Yang et al., 2018), energy integration (Hong et al., 2018) and recovery (Li et al., 2019), system optimization (Hong et al., 2019) and integration (Lou et al., 2019), molecular management is an effective alternative tool to improve refinery efficiency. The basic idea of molecular reconstruction is to determine the detailed molecular composition of petroleum fractions through obtainable bulk properties and chemical details (Ren et al., 2019). Due to the complexity of petroleum composition, there are numerous variables in the model while available constraints are insufficient. To reduce the degree of freedom, the introduction of statistical distribution, which is an important feature of petroleum, has been widely adopted. Based on the Molecular type-homologous series (MTHS) matrix model (Peng, 1999), as shown in Figure 1, Wu and Zhang (2010) assumed that the compositions of the entries within each homologous series follow a gamma distribution with boiling point. The numbers of variables to represent molecular composition are reduced from the unknown fractions of each matrix entry to three parameters for the gamma distribution and one fraction of each homologous series. Therefore, the degrees of freedom are reduced dramatically. Liu (2015) followed the above method of using gamma distribution but improved the matrix form of the MTHS model by combining carbon numbers and pseudo-components in the matrix framework. Wang and Li (2017) used normal distribution instead of the gamma distribution to describe the statistical distribution within each homologous series in the proposed novel MTHS matrix. Recently, Cui et al. (2018) also used gamma probability density function to describe the fractions of species in each chemical family with a library containing 89 hydrocarbons and 81 heteroatom-containing species in their molecular representation of the petroleum gasoline fraction. DOI: 10.3303/CET1976133 Paper Received: 16/03/2019; Revised: 01/07/2019; Accepted: 02/07/2019 Please cite this article as: Ren Y., Liao Z., Sun J., Jiang B., Wang J., Yang Y., Wu Q., 2019, Novel Parameter Estimation Method for Molecular Reconstruction of Naphtha by Gamma Distribution, Chemical Engineering Transactions, 76, 793-798 DOI:10.3303/CET1976133 793 The introduction of statistical distribution results in reducing model variables. However, it brings a problem that the boundary of variables is difficult to determine during the optimization of the model. Noted that the tuning parameters for the optimization engine at this time are no longer the fractions of each molecule whose boundary fall within [0, 1], but the parameters of the statistical distribution whose boundary are wide enough to reach infinity (Cui et al., 2018). Therefore, reasonable ranges of parameters are critical to the accuracy of composition prediction. Wu and Zhang (2010) stated that to avoid the distribution between molecules within each homologous series being too narrow or too wide, the scale parameter α and the shape parameter β of gamma distribution are limited in the range of 1~20 and 19~100, respectively. However, the authors did not indicate how these ranges were determined. Wang and Li (2017) summarized the ranges of normal distribution parameters by analyzing the parameters of 16 naphtha and gasoline samples. These ranges were used to generate the initial value of distribution parameters, and the prediction calculation was optimized by the multi-population genetic optimization algorithm in this range too. By this way, the authors guaranteed the initial value and optimal value of distribution parameters obtained in the correct range, which could decrease the predicting error and calculating time largely. Figure 1: Molecular type-homologous series (MTHS) matrix (adapted from Wu and Zhang (2010)). In summary, the ranges of statistical distribution parameters and the estimation method are important in molecular reconstruction. If the initial ranges are obtained by parameter regression of existing molecular composition data of sample oils, it will result that a set of parameters can only be applied to specific oils. In order to solve the parameter estimation problem and ensure that the parameters ranges obtained have wide universality and are not specific to any oil characteristics, a novel complete parameter estimation procedure is proposed in this article. The pre-set ranges of parameters are determined from the analysis of the probability density plot of gamma distribution based on the basic available information. Using the range of parameters obtained above, the molecular reconstruction of a naphtha sample shows that the accuracy of the calculated bulk properties and molecular composition is within an acceptable range. 2. Parameter estimation The gamma distribution function gampdf(x; α, β)is a two-parameter family of continuous probability distributions. The parameter α is called shape parameter, and the parameter β is called scale parameter: gampdf(x; α, β) = 𝑥𝛼−1𝑒 −𝑥 𝛽⁄ 𝛽𝛼Γ(α) 𝑓𝑜𝑟 𝑥 > 0 𝑎𝑛𝑑 𝛼, 𝛽 > 0 (1) Where 𝑥 is a random variable; Γ(α) is the gamma function. Figure 2: Probability density plots of gamma distributions. C0 C1 C2 . . . . C11 C12 . . . . C23 C24 . . . nP iP nO iO N5 N6 2N 3N+ 1A 1A1N 1A2N 2A 2A1N 2A2N 3A 3A1N 4A 5A+ SI SII SIII SIV SV NI NII Columns represent homologous series Rows represent carbon numbers Mass or molar fraction of molecule Sulphur and nitrogen compounds nP: normal Paraffins iP: iso-Paraffins O: Olefins N: Naphthenes A: Aromatics S: mercaptans SII: thiophenes & benzothiophenes SIII: dibenzothiophenes not substituted at the 4 or 6 position SIV: dibenzothiophenes substituted at the 4 or 6 position SV: dibenzothiophenes substituted at both the 4 or 6 position NI: basic nitrogen compounds NII: non-basic nitrogen compounds 794 Figure 2 shows the probability density plots of gamma distributions. As can be seen from Figure 2, the parameters α and β simultaneously control the shape of the gamma distribution. According to the practical issues of molecular reconstruction, the shape of probability density plots of gamma distributions can be restricted according to the characteristics of the model. In this paper, it is assumed that the molecular composition within each homologous series follows a gamma distribution against carbon number, e.g., the random variable 𝑥 in Eq(1) stands for the carbon number of each matrix entry. Therefore, according to the common knowledge of naphtha composition distribution, the following constraints can be obtained. α𝑗 > 1 (2) β𝑗>0 (3) gampdf(CN𝑗 𝑚𝑖𝑛; α𝑗 , β𝑗 ) > 0 (4) gampdf(CN𝑗 𝑚𝑎𝑥; α𝑗 , β𝑗 ) > 0 (5) CN𝑗 𝑚𝑖𝑛 <(α𝑗 − 1)β𝑗0, the gamma distribution is a unimodal function, which satisfies the requirements of the model. Eq(4) and Eq(5) control that the function values of the starting point (CN𝑗 𝑚𝑖𝑛 ) and ending point (CN𝑗 𝑚𝑎𝑥 ) of each plot must be larger than 0. These constraints are to ensure that gamma distribution is effective in fitting molecular composition without too many zeros between the range of minimum carbon number (CN𝑗 𝑚𝑖𝑛 ) and maximum carbon number (CN𝑗 𝑚𝑎𝑥 ). (α𝑗 − 1)β𝑗 is the extreme point obtained by deriving the gamma distribution function. Eq(6) defines that the peak position of gamma distribution must fall within the range of maximum and minimum carbon number in order to avoid monotonous increase or decrease in the studied range. Eq(7) indicates that the maximum value of gamma distribution does not exceed 1, to match the fact that the molecular fractions in a homologous series are between 0 and 1. The carbon number range for the five homologous series divided in this paper is as follows: C4~C12 for n-Paraffins and iso-Paraffins, C4~C7 for Olefins, C5~C11 for Naphthenes, C6~C11 for Aromatics. Figure 3: Flowsheet of parameter estimation. Initial range of parameters α and β α: [1, 10 5 ], β: [0, 10 5 ] Minimize α by genetic algorithm Constraints Equation(4)~(7) Determine the minimum of α and update the initial range Maximize α by genetic algorithm Determine the maximum of α and update the initial range Minimize β by genetic algorithm Determine the minimum of β and update the initial range Maximize β by genetic algorithm Get the final range of a set of α and β 795 The process of parameter estimation is shown in Figure 3. The genetic algorithm is used to solve this nonlinear programming problem. In the above constraints, there is only a lower limit of the parameters α and β. To reasonably control the solution space of the model, the upper limit is set to 105. Based on such initial range, the objective function is to minimize the parameter α firstly under the above constraints, e.g., Eq(4) ~ (7). The optimization process is repeated 20 times and the minimum value of α is taken as the lower limit of parameter α. Then, the obtained minimum of α is used to replace the lower limit of α in the initial range, thereby realizing the update of the initial range. Subsequently, the optimization processes of the maximum of α, the minimum of β, and the maximum of β are sequentially performed. The operation of each optimization process is the same. Finally, the pre-set ranges of gamma distribution parameters α and β of each homologous series as shown in Table 1 are obtained. Table 1: The pre-set range of parameters α and β of the gamma distribution. Parameters Lower limit Upper limit 𝛼𝑛𝑃 1.04 87.00 𝛽𝑛𝑃 0.05 118.61 𝛼𝑖𝑃 1.04 87.01 𝛽𝑖𝑃 0.05 118.61 𝛼𝑂 1.05 96.02 𝛽𝑂 0.05 99.74 𝛼𝑁 1.05 106.39 𝛽𝑁 0.05 100.79 𝛼𝐴 1.06 115.63 𝛽𝐴 0.05 99.33 3. Mathematical model The formulas to describe the assumption gamma distribution are as follows. 𝑦𝑖,𝑗 = 𝐶𝑁 𝑖,𝑗 𝛼𝑗−1 𝑒 −𝐶𝑁𝑖,𝑗 𝛽𝑗⁄ 𝛽𝑗 𝛼𝑗 Γ(𝛼𝑗) (8) 𝑥𝑖,𝑗 = 𝑋𝑗 𝑦𝑖,𝑗 ∑ 𝑦𝑖𝑖,𝑗𝑖,,𝑖 (9) where 𝑦𝑖,𝑗 in Eq(8) stands for the fraction of the entry i, j within the j homologous series; 𝐶𝑁𝑖,𝑗 is the carbon number of the entry i, j; 𝑥𝑖,𝑗 in Eq(9) stands for the mass fraction of the entry i, j in the whole matrix; 𝑋𝑗 is the total mass fraction of the homologous series j. obj = ∑ (𝑤2,𝑇 𝑉𝑇 𝑒𝑥𝑝 −𝑉𝑇 𝑐𝑎𝑙 𝑉 𝑇 𝑒𝑥𝑝 ) 2 + ∑ (𝑤3,𝑗 𝐹 𝑗 𝑒𝑥𝑝 −𝐹𝑗 𝑐𝑎𝑙 𝐹 𝑗 𝑒𝑥𝑝 ) 2 + ∑ (𝑤1,𝑃 𝑃𝑒𝑥𝑝−𝑃𝑐𝑎𝑙 𝑃𝑒𝑥𝑝 )2𝑃𝑗∈𝑃𝐼𝑂𝑁𝐴𝑇 (10) where obj is the objective function, T is the distillation curve temperature, V is the volume fraction at the corresponding distillation curve temperature T, F stands for n-Paraffins, iso-Paraffins, Olefins, Naphthenes, and Aromatics fractions, P stands for other available bulk properties, and w is the weight factor. 𝐶𝑁𝐶8,𝐴 ≤ (𝛼𝐴 − 1)𝛽𝐴 ≤ 𝐶𝑁𝐶9,𝐴 (11) Eq(11) is a constraint according to refinery reality to constrain the fraction of C8 or C9 for naphtha fraction is the maximum within the aromatic homologous series (Wu and Zhang, 2010). Particle swarm optimization (PSO) is used to solve the above mathematical model. The whole calculation processes including parameter estimation are executed by MATLAB R2016b on a PC constituted by Intel Xeon E31230 processor 3.20 GHz. 4. Result and discussion A naphtha sample is used to verify the accuracy of the above mathematical model with the obtained parameters ranges in Table 1. The available bulk properties including molecular weight, special-density, PIONA weight fractions and ASTM D 86 distillation data are shown in Table 2. Table 3 shows the comparison of experimental and calculated values of molecular weight and special-density. The relative errors of these two properties are both less than 1.2 %, indicating that the calculated values agree well with the experimental values. Figure 4 shows the comparison between experimental and calculated data of 796 simulated distillation curve. It can be seen that after 20 % distillation volume, the simulated distillation curve agrees well with the true boiling point distillation curve with the relative error of less than 5 %. Table 2: Bulk properties data of the naphtha sample. Bulk properties Molecular weight 101.32 Distillation (ASTM D 86)℃ IBP 33.04 Special-density [kg / m3] 715.85 5 % 51.15 10 % 71.28 n-Paraffins (wt %) 28.89 20 % 80.01 30 % 88.75 iso-Paraffins (wt %) 39.59 40 % 94.75 50 % 100.75 Olefins (wt %) 0.26 60 % 110.98 70 % 121.21 Naphthenes (wt %) 23.54 80 % 138.5 90 % 155.78 Aromatics (wt %) 7.72 95 % 174.75 FBP 189.41 Table 3: Comparison of experimental and calculated properties. Properties Experiment Calculation Relative error Molecular weight 101.32 102.53 -1.19 % Special-density kg / m3 715.85 717.02 -0.16 % Figure 4: Comparison between experimental and calculated data of simulated distillation curve. -20 0 20 40 60 80 100 120 140 160 180 200 220 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% Te m p e ra tu re ℃ Cumulative volume fraction experimental calculted n-Paraffins iso-Paraffins Naphthenes Aromatics 0 2 4 6 8 10 12 14 C4 C5 C6 C7 C8 C9 C10 C11 C12 W e ig h t fr a ct io n % Carbon number experimental calculated 0 2 4 6 8 10 12 C4 C5 C6 C7 C8 C9 C10 C11 C12 W e ig h t fr a ct io n % Carbon number experimental calculated 0 1 2 3 4 5 6 7 8 C5 C6 C7 C8 C9 C10 C11 W e ig h t fr a ct io n % Carbon number experimental calculated 0 0.5 1 1.5 2 2.5 3 C6 C7 C8 C9 C10 C11 W e ig h t fr a ct io n % Carbon number experimental calculated 797 Figure 5: Experimental values and calculated values of the molecular composition of the naphtha sample. The predicted results of molecular composition are shown in Figure 5. The figure shows the comparison of the experimental and calculated molecular composition of n-Paraffins, iso-Paraffins, Naphthenes and Aromatics. Olefins are neglected for their low content. The mean absolute differences (abbreviated as MAD, MAD = 1 𝑛 ∑ |𝑥𝑖 𝑐𝑎𝑙 − 𝑥𝑖 𝑒𝑥𝑝 |𝑛𝑖=1 ) of the calculated mass fractions of all components is 1.1, which is within an acceptable range. The prediction results for aromatics are more accurate due to the additional constraint on the peak position. Although there are errors in the specific content, an approximate molecular composition simulation result is obtained. To some extent, the prediction accuracy of sample oils with compositional distributions more in line with the statistical law of the gamma distribution will be higher. 5. Conclusions A novel parameter estimation method is proposed for the molecular reconstruction model with the statistical distribution introduced. The model is based on the Molecular type-homologous series (MTHS) matrix method where the molecular composition within each homologous series is assumed to follow a gamma distribution against carbon number. The pre-set ranges of gamma distribution parameters are calculated from the analysis of the probability density plot of gamma distribution based on the basic available information, so that the obtained parameter ranges have wide universality and are not specific to any oil characteristics. A naphtha sample is used to verify the model with the obtained parameters range calculated by the proposed parameter estimation method. The relative errors between experimental and calculated values of molecular weight and special-density are less than 1.2 %, and that of the simulated distillation curve is less than 5 % after 20 % distillation volume. The accuracy of the calculated mass fractions of molecular composition is also within an acceptable range with the MAD of 1.1. This work proves that the approximate bulk properties and molecular composition prediction could also be obtained without parameter regression but only through analyzing the features of the statistical distribution plot. In order to obtain better prediction accuracy, more implicit graphical features need to be excavated. Acknowledgments The financial support provided by the Project of National Natural Science Foundation of China (21822809 & 61590925), the National Science Fund for Distinguished Young (21525627) are gratefully acknowledged. References Cui, C., Billa, T., Zhang, L., Shi, Q., Zhao, S., Klein, M.T., Xu, C., 2018, Molecular Representation of the Petroleum Gasoline Fraction, Energy and Fuels, 32, 1525-1533. Hong, X., Liao, Z., Lou, Y., Sun, J., Jiang, B., Wang, J., Yang, Y., 2018, Transshipment type model for total site heat integration using non-isothermal utility loop, Chemical Engineering Transactions, 70, 1135-1140. Hong, X., Liao, Z., Sun, J., Jiang, B., Wang, J., Yang, Y., 2019, Transshipment type heat exchanger network model for intra- and inter-plant heat integration using process streams, Energy, 178, 853-866. Li, H., Liao, Z., Sun, J., Jiang, B., Wang, J., Yang, Y., 2019, Modelling and simulation of two-bed PSA process for separating H2 from methane steam reforming, Chinese Journal of Chemical Engineering, Accepted. Liu, L., 2015, Molecular characterisation and modelling for refining processes, PhD Thesis, University of Manchester Institute of Science and Technology, Manchester, UK. Lou, Y., Liao, Z., Sun, J., Jiang, B., Wang, J., Yang, Y., 2019, A novel two-step method to design inter-plant hydrogen network, International Journal of Hydrogen Energy 44, 5686-5695. Ren, Y., Liao, Z., Sun, J., Jiang, B., Wang, J., Yang, Y., Wu, Q., 2019, Molecular reconstruction: Recent progress toward composition modeling of petroleum fractions, Chemical Engineering Journal, 357, 761-775. Van de Vijver, R., Devocht, B.R., Van Geem, K.M., Thybaut, J.W., Marin, G.B., 2016, Challenges and opportunities for molecule-based management of chemical processes, Current Opinion in Chemical Engineering, 13, 142-149. Wang, K., Li, S., 2017, Modified molecular matrix model for predicting molecular composition of naphtha, Chinese Journal of Chemical Engineering, 25, 1856-1862. Wu, Y., Zhang, N., 2010, Molecular Characterization of Gasoline and Diesel Streams, Industrial & Engineering Chemistry Research, 49, 12773-12782. Yang, Y., Sun, J., Huang, Z., Hu, D., Liao, Z., Yang, Y., Wang, J., 2018, Dynamic and Steady-State Characterization of the Liquid Spray Zone in an Externally Heated Gas-Solid Fluidized Bed, Industrial and Engineering Chemistry Research, 57, 2988-3001. 798