CHEMICAL ENGINEERING TRANSACTIONS VOL. 66, 2018
A publication of The Italian Association of Chemical Engineering
Online at www.aidic.it/cet
Guest Editors: Songying Zhao, Yougang Sun, Ye Zhou
Copyright © 2018, AIDIC Servizi S.r.l.
ISBN 978-88-95608-63-1; ISSN 2283-9216

Analytical Chemometrics Research Based on Artificial Intelligence Algorithm

Guohua Zou
School of Software, East China University of Technology, Nanchang 330013, China
ghzou@ecit.cn

This paper presents analytical chemometrics research based on artificial intelligence algorithms. It combines Ant Colony Optimization (ACO) with fuzzy clustering, performing clustering twice to determine the hidden layer nodes of the radial basis function (RBF) network. Concepts from fuzzy mathematics are introduced to make the partitioning of the ant colony clustering algorithm flexible. The resulting ant colony optimization-fuzzy c-means-radial basis function network (ACO-FCM-RBF) gave satisfactory results for the simultaneous determination of bismuth and zirconium. A new color system, DBS-chlorophosphonazo-zirconium, was established, which lays the foundation for the simultaneous determination of bismuth and zirconium. The paper also establishes a new method of analytical chemometrics: compared with the test results of the RBF network method, ACO-FCM-RBF yielded more accurate calculation results.

1. Introduction
The boundaries between disciplines have long been blurred, and the merging of multiple disciplines into interdisciplinary subjects has become a feature of scientific progress today. Chemometrics is a new branch of chemistry that lies at the intersection of chemistry, computer science, mathematics, and statistics (Kowalkowski et al., 2006).
It uses theories and methods from computer science, mathematics, statistics, and other related subjects to optimize the chemical measurement process and to extract as much practical chemical information as possible from chemical measurement data. Artificial intelligence algorithms are iterative search algorithms that simulate or explain the development, evolution, and variation of natural phenomena (Cavus, 2010; Chen et al., 2015). Their advantages include global search capability, fast convergence, high numerical accuracy, and good generality (Hetmani et al., 2015; Kargarian et al., 2014; Kim et al., 2017). Widely used intelligent algorithms currently include Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Ant Colony Optimization (ACO), and neural network algorithms (Zhang et al., 2018). The Radial Basis Function (RBF) neural network is a feedforward network with excellent performance; it has received much attention, has been widely used, and has seen many improvements in recent years. The choice of hidden layer nodes in an RBF network directly affects the accuracy of network prediction, so many scholars have proposed different methods for determining them. This paper combines ACO with fuzzy clustering and uses the resulting ACO-FCM-RBF network for the simultaneous determination of zirconium and bismuth, thereby constructing a new analytical chemometrics method.

2. Basic theory and algorithm
2.1 Basic theory of ACO
ACO is a nature-inspired algorithm modeled on the behavior of creatures in the natural world; it derives from the analysis of the behavior of ant colonies (Ding et al., 2008; Ding et al., 2003; Saidi-Mehrabad et al., 2015). Ant colony optimization (ACO) is a key component of this system.
The principle of the ant colony system is shown in Figure 1.

DOI: 10.3303/CET1866139
Please cite this article as: Zou G., 2018, Analytical chemometrics research based on artificial intelligence algorithm, Chemical Engineering Transactions, 66, 829-834

Figure 1: Principle of ACO

As shown in the figure, A is an ant nest and F is a food source. 16 ants crawl from A to F, and 16 ants return from F to A. At t=1, 16 ants are at B and D; at t=2, 8 ants are at B and D, and 16 ants are at E. At this time, the pheromone concentrations on the routes are τBE=8, τED=8, τAB=16, τCD=16, τBC=16, τFD=16, so the amount of pheromone on route BCD is twice that on route BED. Therefore, as time progresses, most ants choose route BCD, and in the end all ants make the same choice, which achieves the optimization of the process.

2.2 Basic theory of fuzzy clustering
Clustering refers to the automatic classification of data with different degrees of similarity, so as to minimize the similarity between different classes and maximize the similarity within the same class. The most widely used clustering algorithm is the C-means clustering algorithm. The fuzzy C-means (FCM) clustering algorithm is an improvement of the common C-means clustering algorithm, adapted here to fit better with RBF networks. The algorithm is shown in Figure 2.

Figure 2: Improved fuzzy clustering methods

Figure 3: Structure of RBF neural network

2.3 Basic theory of RBF neural network
The Radial Basis Function (RBF) neural network is a feedforward artificial neural network composed of three layers of neurons: an input layer, a hidden layer, and an output layer. Its basic idea is to use radial basis functions as the basis of the hidden units to form the hidden layer space, and to map the input vector directly into the hidden space.
The mapping from the hidden layer space to the output layer space is linear; that is, the network output is a linear weighted sum of the hidden unit outputs. The topological structure of the RBF neural network is shown in Figure 3. The layers of this network are as follows:
(1) The first layer is the input layer. Its neurons serve only as connections and do not transform the signal.
(2) The second layer is the hidden layer. Let the connection weight from the i-th neuron in the input layer to the j-th neuron in the hidden layer be $v_{ij}$ ($1 \le i \le I$, $1 \le j \le H$), so that the weight vector from the input layer neurons to the j-th hidden neuron is $V_j = (v_{1j}, v_{2j}, \ldots, v_{Ij})^T$ ($1 \le j \le H$). The transfer function of the hidden layer neurons is a Gaussian function, and the output of the j-th hidden neuron for an input $x$ is
$$h_j = \varphi(\lVert x - V_j \rVert) = \exp\left(-\sum_{i=1}^{I} (x_i - v_{ij})^2 / (2\sigma_j^2)\right),$$
where $\sigma_j$ ($1 \le j \le H$) is the width of the basis function of the j-th hidden neuron.
(3) The third layer is the output layer. Let the connection weight from the j-th neuron in the hidden layer to the k-th neuron in the output layer be $w_{jk}$ ($1 \le j \le H$, $1 \le k \le O$), so that the weight vector from the hidden layer neurons to the k-th output neuron is $W_k = (w_{1k}, w_{2k}, \ldots, w_{Hk})^T$ ($1 \le k \le O$).

3. Algorithm improvement
Learning algorithms for RBF networks mainly include the nearest neighbor clustering algorithm, the self-starting algorithm, ACO, the randomized algorithm, and the fuzzy clustering algorithm, all of which are used to select the RBF centers. This paper uses ACO and fuzzy clustering to perform clustering twice in order to determine the RBF centers; the result is called the ACO-FCM-RBF network learning algorithm.
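As a point of reference for the three-layer structure of Section 2.3, the forward pass of a Gaussian RBF network can be sketched in Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `rbf_forward`, the use of NumPy, and the array layout (one center per row, weights stored as an H-by-O matrix) are choices made here for illustration, and no bias term is included because the text does not mention one.

```python
import numpy as np


def rbf_forward(x, centers, widths, weights):
    """Forward pass of a Gaussian RBF network (illustrative sketch).

    x       : input vector, shape (I,)
    centers : hidden-layer centers, shape (H, I); row j plays the role of V_j
    widths  : basis-function widths sigma_j, shape (H,)
    weights : hidden-to-output weights, shape (H, O); entry [j, k] is w_jk
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Squared Euclidean distance from x to every center, shape (H,)
    sq_dist = np.sum((centers - x) ** 2, axis=1)

    # Gaussian hidden-layer outputs: exp(-||x - V_j||^2 / (2 * sigma_j^2))
    hidden = np.exp(-sq_dist / (2.0 * widths ** 2))

    # Output layer: linear weighted sum of the hidden outputs, shape (O,)
    return hidden @ weights
```

A call such as `rbf_forward([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]], [1.0, 1.0], [[2.0], [5.0]])` returns a single output dominated by the first hidden unit, because the input coincides with the first center and lies far from the second. In this picture, the clustering procedure of Section 3 is what supplies `centers`.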
The steps of the RBF network learning algorithm are as follows:

3.1 Determine the centers of the basis functions
The algorithm is described below:
1) Select M representative points, either at random or according to experience. This paper selects the initial representative points at random;
2) Initialize the parameters: N, ε0, r, m, τs(0) = 0, β, α, M, P0, η;
3) Calculate the Euclidean distance $d_{ij}$:
$$d_{ij} = \lVert X_i - X_j \rVert_2 = \sqrt{\sum_{k} (x_{ik} - x_{jk})^2};$$
4) Calculate the amount of pheromone on each route:
$$\tau_{ij}(t) = \begin{cases} 1, & d_{ij} \le r \\ 0, & d_{ij} > r \end{cases}$$
5) Calculate the probability of merging $X_i$ into $X_j$:
$$P_{ij}(t) = \frac{\tau_{ij}^{\alpha}(t)\,\eta_{ij}^{\beta}(t)}{\sum_{s \in S} \tau_{sj}^{\alpha}(t)\,\eta_{sj}^{\beta}(t)}$$
6) Judge whether $P_{ij}(t) \ge P_0$ holds; if so, proceed to the next step, otherwise set i=i+1 and go to step 3;
7) Calculate the new cluster center:
$$V_j = \frac{1}{J} \sum_{k=1}^{J} X_k, \quad X_k \in C_j;$$
8) Calculate the deviation errors and the global error. The deviation of the j-th cluster is
$$D_j = \sum_{k=1}^{J} \sqrt{\sum_{i} (x_{ki} - v_{ji})^2},$$
where $v_{ji}$ represents the i-th component of the j-th cluster center. The global error is
$$\varepsilon = \sum_{j} D_j.$$
9) Judge whether ε≤ε0 holds. If so, stop the calculation and output the results; otherwise go to step 3 and continue iterating;
10) Through ant colony clustering, the number of clusters c (2≤c≤n) is obtained, and the result of the ant colony clustering is taken as the initial centers for fuzzy clustering; then further select the weighting exponent m'(1