CHEMICAL ENGINEERING TRANSACTIONS VOL. 96, 2022
A publication of The Italian Association of Chemical Engineering
Online at www.cetjournal.it
Guest Editors: David Bogle, Flavio Manenti, Piero Salatino
Copyright © 2022, AIDIC Servizi S.r.l.
ISBN 978-88-95608-95-2; ISSN 2283-9216
DOI: 10.3303/CET2296039
Paper Received: 15 January 2022; Revised: 19 July 2022; Accepted: 4 October 2022
Please cite this article as: Sartori F., Zuecco F., Facco P., Bezzo F., Barolo M., 2022, Data Analytics Can Help Reduce Energy Consumption in the Industrial Manufacturing of Specialty Chemicals, Chemical Engineering Transactions, 96, 229-234 DOI:10.3303/CET2296039

Data Analytics Can Help Reduce Energy Consumption in the Industrial Manufacturing of Specialty Chemicals

Francesco Sartori^a, Federico Zuecco^b, Pierantonio Facco^a, Fabrizio Bezzo^a, Massimiliano Barolo^a,*

^a CAPE-Lab ‒ Computer-Aided Process Engineering Laboratory, Department of Industrial Engineering, University of Padova, 35131 Padova PD (Italy)
^b BASF Italia SpA, Via Pila 6/3, 40037 Pontecchio Marconi BO (Italy)
max.barolo@unipd.it

Batch processes are operated following recipes that consist of a sequence of steps of given time lengths carried out in different pieces of equipment. Large variability in the length of a processing step can cause that step to become a bottleneck for the entire process, thus leading to an increase in the energy consumption per unit of product manufactured. Debottlenecking the process can therefore lead to a reduction in the energy requirements. We consider the case of a batch reaction that is a key step in the industrial manufacturing of a polymer additive. The available data historians revealed that, over a period of 12 months of operation, the length of the reaction step ranged between 0.9 and 2.8 h, with an average value of 1.3 h. This acted as a limit to the performance of the overall manufacturing system, but no cause was initially identified to explain this behavior.
Advanced analytics on the process data historians by means of multivariate statistical techniques revealed that over 40% of the batches had been affected by the intervention of a safety interlock in the reactor, whose occurrence strongly correlated with an increase in the batch length. Reconfiguration of the interlock system resulted in a reduction of both average batch length and batch length variability. Namely, over the 6-month assessment that followed this study, a 29% reduction in the average batch length for the reactor under investigation was observed, which resulted in an 8% reduction of the overall process cycle duration, thus entailing significant energy savings. Furthermore, an 11% reduction in nitrogen consumption was achieved.

1. Introduction

Batch processes are widespread in many industries producing low volumes of high added-value goods such as pharmaceuticals, biotechnological products and specialty chemicals. The energy costs for common chemicals produced in batch operations can be as high as 10% of the total production costs (Bieler et al., 2004). Therefore, reducing energy consumption not only reduces the environmental impact of the process, but can also significantly reduce the process operating expenses.

One approach for modeling batch processes is based on first-principles models, which require detailed knowledge about the phenomena occurring in the process. The development of these models is usually expensive and time consuming, hence often prohibitive in an industrial setting. On the other hand, the increased availability of data in the process industry, propelled by the development of sensors and networking technology together with the reduction of the costs of computing equipment, has allowed the development of data-driven models for tasks traditionally carried out through first-principles models, thus appreciably reducing the time and costs for model development.
Batch processing is highly impacted by this approach, especially when the process chemistry is not completely understood, which renders the development of a first-principles model a hard (or even impossible) task.

In order to extract process-relevant information from the massive amount of data generated by a modern chemical process, effective data analytics techniques can be used. Multivariate statistical methods, such as principal component analysis (PCA; Jollife and Cadima, 2016) and its multiway extension (Nomikos and MacGregor, 1995), are extensively used for this purpose. These techniques can reduce the dimensionality of large sets of data, thus increasing their interpretability while minimizing information loss, and revealing the underlying correlation structure between the process variables over their time evolution. They do this by projecting the data onto the space of a reduced set of new, uncorrelated variables (called principal components) that summarize the original data, in such a way that an intuitive visual comparison of the data evolution patterns across different batches can be obtained.

In this study, we exploit PCA to find the root cause of a large variability in the time duration of a key reaction step for an industrial batch process manufacturing a specialty chemical. Large (and unexplained) average batch length and length variability in this reaction step caused significant energy and raw materials consumption per unit of product manufactured, yet with no apparent impact on the quality of the final product.

2. Process description

The process under investigation consists of the synthesis of an intermediate chemical for the manufacturing of a hindered amine light stabilizer (HALS), to be used as a polymer additive. A simplified process flow diagram is shown in Figure 1.
Figure 1: Simplified process flow diagram of the process under investigation

In this semi-batch process, reactants A and B are fed to the jacketed stirred tank reactor R410. Vacuum is then applied to the system, and the main reaction is a liquid-phase, thermally activated, exothermic one, according to:

2A + B → 2C + D    (1)

where D is the desired product, and C is a byproduct. The process is highly automated and is operated through a recipe that consists of the following steps:

1. setup: the system is set up for a new run;
2. reactant loading: liquid reactants A and B are loaded into R410 from their respective storage tanks in stoichiometric amounts. A small amount of C is also loaded into the reactor to speed up the initial reaction phase. The reactor is heated up with low-pressure steam through a jacket, to reach the temperature required to carry out the reaction;
3. reaction: vacuum is applied to the system in two steps: a faster pressure decrease is applied first, down to an assigned pressure; then, pressure is further slowly decreased to the pressure value required by the reaction. Once the reaction conditions are met, byproduct C is released as a vapor, and it is removed from the reactor, condensed in the E410 condenser, and then stored in the T410 buffer tank;
4. nitrogen blanketing: when the required volume in T410 is obtained, vacuum is broken, and the reactor is blanketed with nitrogen;
5. product discharge: liquid byproduct C is discharged from T410 to the waste unit, and reaction product D (liquid) is discharged from R410 and fed to the subsequent processing step.

Step 3 corresponds to the reaction phase (the key one for this process), and we refer to its duration as the “batch length”. Figure 2 shows the distribution of batch lengths recorded over a period of 12 consecutive months before this study was started.
It can be seen that the distribution is bimodal, with one mode with a peak at 56 min and one mode with a peak at 74 min; furthermore, the batch lengths range between 55 and 145 min, and the overall average batch length is 77 min.

Figure 2: Distribution of the time duration of step 3 in reactor R410 across the historical dataset. The vertical red line is the mean of the distribution.

The result of this variability in batch length is a decrease in productivity, as well as an increase in energy consumption per unit of product manufactured. Since engineering understanding was not enough to find the root cause of this variability, analytics on the historical manufacturing data was carried out to mine process-relevant information that could help in the task of troubleshooting the reaction step.

3. Mathematical background

3.1 Principal component analysis

A short introduction to principal component analysis (PCA) is provided in this section. Detailed information can be found elsewhere (Jollife and Cadima, 2016). Let 𝐗[𝐼 × 𝐽] be a historical dataset composed of 𝐼 observations and 𝐽 variables. PCA is a multivariate statistical technique that allows describing 𝐗 through 𝐴 ≤ min(𝐼, 𝐽) principal components. In order to determine the most appropriate value of 𝐴 for a given dataset, the scree test can be used (Brown, 2009). PCA models the historical dataset as:

$$\mathbf{X} = \mathbf{T}\mathbf{P}^{\mathsf{T}} + \mathbf{E} \qquad (2)$$

where 𝐓[𝐼 × 𝐴] is the scores matrix, 𝐏[𝐽 × 𝐴] is the loadings matrix, and 𝐄[𝐼 × 𝐽] is the residuals matrix.

In batch processes, measurements of the time trajectories for several process variables are collected continuously in time during each run. A convenient arrangement for batch process data is therefore a tensor-like one, namely 𝐗[𝐼 × 𝐽 × 𝐾], where 𝐗 is made up of 𝐼 batch runs, 𝐽 measured variables, and 𝐾 time samples for them. In order to analyze a three-way array like 𝐗, multiway PCA (MPCA) can be used.
This technique is applied by unfolding 𝐗 along its second dimension (i.e., batchwise unfolding; Camacho et al., 2008), thus obtaining 𝐗_U[𝐼 × 𝐽𝐾], and then applying standard PCA to this two-way matrix (Nomikos and MacGregor, 1995).

3.2 Batch alignment

In recipe-driven, multiphase batch processes, such as the one considered in this study, the termination of each phase is usually triggered by an event that indicates phase completion (e.g., reaching a desired volume of separated byproduct). Often, these events occur at a different time in each batch, and each batch may therefore come to completion at a different time (Undey and Cinar, 2002). In order for multivariate statistical techniques to be applied to data coming from processes with these uneven-length characteristics, batch alignment is required. In this study, the indicator variable batch alignment technique was applied (Nomikos and MacGregor, 1994). This technique assumes that a variable exists in the dataset that is a reliable indicator of the process evolution. A variable is a good candidate as an indicator variable if (García-Muñoz et al., 2003):

• it is monotonic in time;
• it has a favorable signal-to-noise ratio;
• it has the same initial and final values across all batches.

For a multiphase process, each phase can be aligned using a different indicator variable (García-Muñoz et al., 2003). When such a variable is not present in the dataset, time can be used as the indicator variable, either for the whole process, or for each single phase.

4. Available data

A total of 𝐼 = 468 historical batches were extracted from the plant historian (Figure 2). All batches yielded a product meeting the target quality profile. The available data consist of the time trajectories of 7 operating variables, as listed in Table 1. The measurements were collected every 30 s.
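The indicator-variable alignment of Section 3.2 can be illustrated with a minimal sketch. It assumes a single phase, an indicator that is strictly increasing within each batch, and linear interpolation onto an evenly spaced grid of normalized indicator values; the function name `align_batch`, the grid size, and the synthetic trajectories are hypothetical and are not taken from the study.

```python
import numpy as np

def align_batch(indicator, variables, n_points=100):
    """Resample one batch onto a common grid of a normalized indicator.

    indicator : (K,) array, assumed strictly increasing within the batch
    variables : (K, J) array of measured trajectories
    Returns an (n_points, J) array sampled at evenly spaced indicator values.
    """
    indicator = np.asarray(indicator, dtype=float)
    variables = np.asarray(variables, dtype=float)
    # Normalize progress to [0, 1] so that all batches share start and end values.
    progress = (indicator - indicator[0]) / (indicator[-1] - indicator[0])
    grid = np.linspace(0.0, 1.0, n_points)
    # Interpolate each variable against progress rather than against clock time.
    return np.column_stack(
        [np.interp(grid, progress, variables[:, j]) for j in range(variables.shape[1])]
    )

def make_batch(n_samples):
    """Synthetic batch (illustration only): time and a temperature-like signal."""
    t = np.arange(n_samples) * 30.0                 # 30 s sampling interval
    level = np.linspace(0.0, 1.0, n_samples) ** 2   # hypothetical monotonic indicator
    temperature = 50.0 + 20.0 * level
    return level, np.column_stack([t, temperature])

# Two batches of very different length end up with the same number of samples.
short = align_batch(*make_batch(183), n_points=50)
long = align_batch(*make_batch(1225), n_points=50)
aligned = np.stack([short, long])                   # shape (2, 50, 2)
```

Keeping time as one of the resampled variables, as in the sketch, preserves the information on how long each batch took even though every aligned batch has the same number of samples.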
The available data were organized in a tensor 𝐗[𝐼 × 𝐽 × 𝐾], where 𝐽 = 7 is the number of measurement sensors available for the unit, and 𝐾 is the total number of observations per batch, ranging between 183 and 1225.

Table 1: Available variables in the 𝑿 dataset

Variable no.   Variable description
1              R410 absolute internal pressure
2              R410 internal pressure controller output
3              R410 internal temperature
4              E410 condensed liquid temperature
5              T410 internal level
6              R410 internal level
7              Time

5. Analysis of historical batch data

The dataset was aligned with the indicator variable technique applied to each operating phase, using the internal level as the indicator variable for phases 2 and 5, and time as the indicator variable for phase 3, thus obtaining 𝐗_a[468 × 7 × 296]. Due to their very short time duration, operating stages 1 and 4 were neglected. An MPCA model with 3 PCs was calibrated using the preprocessed dataset 𝐗_a. The model explains 51% of the dataset variability.

Figure 3: Results of the multiway principal component analysis (MPCA) model built on the historical process data: (a) scores plot, (b) loadings plot on PC1 (time evolution for each process variable). The grey areas shown for each variable in (b) refer to step 2 (left) and step 5 (right) of the manufacturing recipe; the white area refers to step 3.

The model scores for the first two PCs are plotted in Figure 3a: no particular pattern across the historical batches is apparent. On the other hand, analysis of the loadings (Figure 3b) reveals interesting information. The loadings plot shows how the loadings along the first PC evolve with the aligned time for each of the variables listed in Table 1. For any given variable, the time evolution is marked using three background colors: i) a grey left area, corresponding to the reactant loading phase (step 2); ii) a white central area, corresponding to the reaction phase (step 3); and iii) a grey right area, corresponding to the product discharge phase (step 5).
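To make the link between the unfolded model and a loadings plot like Figure 3b concrete, the following minimal sketch fits an MPCA model by batchwise unfolding and a singular value decomposition. The autoscaling step, the function name `mpca`, and the random stand-in data are assumptions made for illustration; the preprocessing actually used in the study is not specified in the text.

```python
import numpy as np

def mpca(X, n_components):
    """Batchwise-unfolded PCA of a three-way array X[I, J, K].

    Returns scores T[I, A], loadings P[J*K, A], and the fraction of
    variance explained by each retained component.
    """
    I, J, K = X.shape
    Xu = X.reshape(I, J * K)          # batchwise unfolding: one row per batch
    mean = Xu.mean(axis=0)
    std = Xu.std(axis=0, ddof=1)
    std[std == 0] = 1.0               # guard against constant columns
    Z = (Xu - mean) / std             # autoscaling (assumed preprocessing)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    P = Vt[:n_components].T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return T, P, explained

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3, 15))      # synthetic stand-in, not plant data
T, P, explained = mpca(X, n_components=3)

# Fold the PC1 loadings back into J trajectories of length K, which is the
# arrangement needed to plot a loading profile per variable over aligned time.
pc1_by_variable = P[:, 0].reshape(3, 15)
```

With this unfolding order, each row of `pc1_by_variable` holds the PC1 loadings of one variable over the aligned time axis, and the columns of `T` are the coordinates used in a scores plot.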
Paired analysis of the loadings and scores plots (Bro and Smilde, 2014) helps to understand how each variable contributes to separating the scores along a particular direction in the scores plot. For example, considering the direction along the first PC in the scores plane (left to right in Figure 3a), we can conclude from Figure 3b that the batches located in the right-half plane of Figure 3a are characterized by:

• lower pressure in the first part of the reaction stage, and higher pressure in the second part of the reaction stage (variable no. 1);
• lower temperature in the first part of the reaction stage, and higher temperature in the second part of the reaction stage (variable no. 3);
• lower temperature of the condensed liquid in the condenser in the reaction stage (variable no. 4);
• consistently longer duration of the reaction stage (variable no. 7).

Figure 4a reports the time profiles of the reactor internal pressure for two historical batches, respectively projecting onto the left-half plane (batch no. 18) and the right-half plane (batch no. 337) of the scores plot.

Figure 4: (a) Internal pressure profiles from a batch projecting onto the left-half plane of Fig. 3a (batch no. 18) and a batch projecting onto the right-half plane of Fig. 3a (batch no. 337). (b) MPCA model scores of Figure 3a colored according to whether they are designated as regular or abnormal.

It is observed that the internal pressure for batch no. 18 follows the recipe described in Section 2 correctly, whereas batch no. 337 shows an abnormal pressure profile, with two spikes in the first quarter of the batch duration. This indicates that vacuum was broken (and then reinstated) in that batch before reaching the end of the reaction stage.
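Pressure spikes of this kind lend themselves to automatic flagging. The sketch below is one hypothetical rule, not the procedure used in the study: it counts how many times the pressure rises above a loss threshold and later returns to vacuum within the reaction stage. The function name, both thresholds, and the synthetic profiles (in arbitrary pressure units) are assumptions made for illustration.

```python
import numpy as np

def count_vacuum_losses(pressure, vacuum_level, loss_level):
    """Count excursions above `loss_level` after vacuum was first reached.

    pressure     : (K,) reaction-stage pressure trajectory
    vacuum_level : pressure below which vacuum is considered established
    loss_level   : pressure above which vacuum is considered lost
    Both thresholds are hypothetical tuning parameters. The planned vacuum
    break that ends the stage is not counted.
    """
    p = np.asarray(pressure, dtype=float)
    below = np.flatnonzero(p < vacuum_level)
    if below.size == 0:
        return 0
    # Only look between the first and the last sample spent under vacuum,
    # which excludes the initial ramp and the final, planned vacuum break.
    window = p[below[0]: below[-1] + 1]
    lost = window > loss_level
    # A new event starts wherever `lost` switches from False to True.
    return int(np.count_nonzero(lost[1:] & ~lost[:-1]))

# Synthetic profiles (illustration only).
ramp = np.linspace(1000.0, 50.0, 20)    # initial pressure decrease
final = np.full(5, 1000.0)              # planned vacuum break at the end
regular = np.concatenate([ramp, np.full(60, 50.0), final])
abnormal = np.concatenate([
    ramp, np.full(10, 50.0), np.full(3, 900.0), np.full(10, 50.0),
    np.full(3, 900.0), np.full(40, 50.0), final,
])

n_regular = count_vacuum_losses(regular, vacuum_level=100.0, loss_level=500.0)
n_abnormal = count_vacuum_losses(abnormal, vacuum_level=100.0, loss_level=500.0)
```

Under this rule a batch would be labelled abnormal when the count is at least one; on real data the thresholds would have to be chosen from the recipe set points and the sensor noise level.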
Confirming the conclusions drawn from the loadings analysis, batch no. 337 has a longer duration than batch no. 18 (Figure 4a), and higher internal pressure during the reaction phase (namely, in between the initial decreasing ramp and the final step increase). Since loss of vacuum is an abnormal event (yet not leading to abnormal product quality), an algorithm to automatically identify all historical batches with a reactor pressure profile qualitatively similar to the one of batch no. 337 was developed. The relevant batches were denoted as “abnormal”, to distinguish them from the “regular” ones, where the vacuum breakage event did not occur. It was found that as many as 40% of the historical batches can be designated as abnormal.

The regular and abnormal batches were then identified in the scores plot, obtaining the results of Figure 4b. We notice that most of the abnormal batches are projected onto the right-half plane, whereas most of the regular ones lie in the left-half plane, i.e., the separation between regular and abnormal batches occurs along the first PC. Therefore, we conclude that the difference between regular and abnormal batches acts as the strongest source of variability within the historical dataset.

Discussion with the process experts revealed that the abnormal pressure profiles observed in the historical dataset are related to the intervention of the reactor safety interlock system. In fact, the occurrence of particular combinations of operating conditions in the reactor can trigger the intervention of specific interlocks, each of which acts by breaking the vacuum and blanketing the reactor with nitrogen. The triggering of all potential interlocks was therefore monitored through a dummy variable across a set of new batches, and this enabled the identification of one interlock that did not work properly.
This specific interlock was therefore reconfigured, and a new campaign of batches was initiated to validate the finding and assess its impact on the distribution of the batch length across the campaign.

6. Validation

After reconfiguration of the reactor safety interlock system, 8 months of operating data (corresponding to 635 new batches) were collected from the data historian. Less than 2% of the batches of the validation campaign were affected by a vacuum loss event, thus confirming that the interlock system reconfiguration was effective. In terms of distribution of the batch lengths across the validation campaign, Figure 5 shows that a unimodal distribution was obtained, with a peak value of 58 min. The distribution is also narrower than the one in Figure 2, ranging between 49 and 120 min. Therefore, reconfiguration of the safety interlock system resulted in a 29% reduction in average batch length. Overall, this amounted to an 8% reduction of the overall cycle time, with a related saving in the energy expenses. Furthermore, it was estimated that an abnormal batch requires 25% more nitrogen than a regular batch. Considering that the abnormal batches were 40% of the total number of the historical batches, and assuming that only 2% of the batches of new production campaigns remain abnormal, reconfiguration of the safety interlock system also resulted in an 11% saving on nitrogen expenses.

Figure 5: Distribution of the time duration of step 3 in reactor R410 after reconfiguration of the safety interlock system. The vertical red line is the mean of the distribution.

7. Conclusions

Analytics on historical data for a semi-batch industrial process, coupled with engineering understanding of the process itself, allowed us to uncover the existence of an abnormal behavior in 40% of the batches in historical manufacturing campaigns. The abnormal batches did not terminate unsuccessfully, but simply took longer to complete than the other batches.
Since this event did not have an impact on the final product quality, it went almost unnoticed, and therefore did not trigger any specific action by the plant personnel. However, the longer duration increased the energy expenditure (hence, the operating expenses) per unit of product manufactured. Data analytics was central to finding that the cause of the abnormal behavior was the anomalous intervention of one interlock in the reactor safety system, which (under particular conditions) caused vacuum in the reactor to be broken by nitrogen blanketing; this required subsequent vacuum reinstatement to recover the process conditions and to end the batch successfully. Reconfiguration of the safety interlock system allowed us to shorten the average batch length by 29%, and the overall process cycle time by 8%. Furthermore, an 11% reduction in the nitrogen consumption was obtained.

References

Bieler P. S., Fischer U., Hungerbühler K., 2004, Modeling the Energy Consumption of Chemical Batch Plants: Bottom-up Approach, Industrial and Engineering Chemistry Research, 43, 7785–7795.
Bro R., Smilde A. K., 2014, Principal Component Analysis, Analytical Methods, 6, 2812–2831.
Brown J. D., 2009, Choosing the Right Number of Components of Factors in PCA and EFA, JALT Testing & Evaluation SIG Newsletter, 13, 19–23.
Camacho J., Picó J., Ferrer A., 2008, Bilinear Modelling of Batch Processes. Part I: Theoretical Discussion, Journal of Chemometrics, 22, 299–308.
García-Muñoz S., Kourti T., MacGregor J. F., Mateos A. G., Murphy G., 2003, Troubleshooting of an Industrial Batch Process Using Multivariate Methods, Industrial and Engineering Chemistry Research, 42, 3592–3601.
Jollife I. T., Cadima J., 2016, Principal Component Analysis: A Review and Recent Developments, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374.
Nomikos P., MacGregor J. F., 1994, Monitoring Batch Processes Using Multiway Principal Component Analysis, AIChE Journal, 40, 1361–1375.
Nomikos P., MacGregor J. F., 1995, Multivariate SPC Charts for Monitoring Batch Processes, Technometrics, 37, 41–59.
Undey C., Cinar A., 2002, Statistical Monitoring of Multistage, Multiphase Batch Processes, IEEE Control Systems Magazine, October, 40–52.