INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
Online ISSN 1841-9844, ISSN-L 1841-9836, Volume: 15, Issue: 6, Month: December, Year: 2020
Article Number: 4037, https://doi.org/10.15837/ijccc.2020.6.4037
CCC Publications

A Social Network Image Classification Algorithm Based on Multimodal Deep Learning

J. W. Bai, C. Chi

Junwei Bai
School of Art and Media, Wuhan Polytechnic University, Wuhan 430023, China
daqianzi007@126.com

Cheng Chi*
College of Engineering and Technology, Hubei University of Technology, Wuhan 430068, China
*Corresponding author: aling420@126.com

Abstract

The complex data structure and massive image data of social networks pose a huge challenge to the mining of associations between social information. For accurate classification of social network images, this paper proposes a social network image classification algorithm based on multimodal deep learning. Firstly, a social network association clustering model (SNACM) was established and used to calculate trust and similarity, which represent the degree of similarity between users. Based on the artificial ant colony algorithm, the SNACM was subjected to weighted stacking, and the social network image association network was constructed. After that, the social network images of three modes, i.e. RGB (red-green-blue) image, grayscale image, and depth image, were fused. Finally, a three-dimensional neural network (3D NN) was constructed to extract the features of the multimodal social network image. The proposed algorithm was proved valid and accurate through experiments. The research results provide a reference for applying multimodal deep learning to image classification in other fields.

Keywords: multimodal deep learning, social network, image classification, three-dimensional neural network (3D NN).

1 Introduction

The model of our social interactions has been reshaped by social networks, which are booming under the proliferation of information technology and smart mobile terminals [2, 4, 6, 10, 11, 21]. The frequent actions of social network users (e.g. interaction, image upload, and sharing) complicate the data structure and inflate the scale of image data on social networks, posing a severe challenge to data mining.

The social network images are associated with various social information of the users, including their locations, followees, comments, and reposts. The associations contain information that is valuable for user portrayal, precision advertising, and community discovery. As a result, the clustering, classification, and frequent pattern mining of social network images have become research hotspots [12, 15, 18, 22, 24].

Radcliffe-Brown was the first to study the social network and its structure, kicking off social network analysis. Since then, great progress has been made in the theories, methods, and techniques of social structure research [5, 9, 16, 23, 25]. Roy et al. [19] reviewed the development of various social networks, namely, Facebook, Twitter, Qzone, and WeChat Moments, and analyzed their strengths and weaknesses.

The current research on social network images mainly focuses on image indexing, image classification, and image annotation. Li et al. [13] classified multi-class social network images through kernel canonical correlation analysis (KCCA), and achieved a classification accuracy of over 89%. Mallick et al.
[14] screened social network images with a matrix factorization model, and mined the visual features of each image and its associated images, thereby improving the effect of image classification. Based on depth-first search and breadth-first search, Hang et al. [8] extended DeepWalk with the word2vec model, and divided the random paths of the network through greedy binarization.

The existing processing methods for social network images cannot mine the relationships between items of social information. Fortunately, deep learning has been successfully applied to analyze image associations in other fields [1, 3, 7, 20]. Zhu et al. [26] designed an image clustering algorithm for an image-capturing robot, based on the convolutional neural network (CNN) for multi-classifier decision-making. Paoletti et al. [17] combined hierarchical matching pursuit with multithreaded asynchronous reinforcement learning to classify multi-source remote sensing images.

For accurate classification of social network images, this paper proposes a social network image classification algorithm based on multimodal deep learning, which fully mines the associations between social network images and other social information. The remainder of this paper is organized as follows: Section 2 establishes a social network association clustering model (SNACM), and calculates the trust and similarity, which represent the degree of similarity between users; Section 3 adopts the artificial ant colony algorithm to analyze the importance of the users with the highest similarity, and conducts weighted stacking of the SNACM based on the update of pheromone. On this basis, the social network image association network is constructed, and the social network images of three modes, i.e. RGB (red-green-blue) image, grayscale image, and depth image, are fused. At the end of Section 3, the features of the multimodal social network image are extracted with a three-dimensional neural network (3D NN). Finally, the proposed algorithm is proved valid and accurate through experiments.

2 SNACM

Social networks contain various strong and weak associations. The effect and coverage of social network image clustering directly hinge on user activity. To balance the effect and coverage of image clustering and overcome the data sparsity of social networks, this paper created the SNACM based on the artificial ant colony algorithm, which searches for the users highly similar to the core user to improve the clustering accuracy.

2.1 Similarity calculation

Suppose each user association in a social network based on trust propagation has two weights. Let P be the Pearson correlation coefficient representing user similarity, and T be the trust value representing the trust propagation relationship. Then, the dual weight coefficient can be expressed as $(\omega_P, \omega_T) = (P(x_i, x_j), T(x_i, x_j))$.

Figure 1 provides an example of the SNACM for 7 users. As shown in Figure 1, the SNACM combines the two types of associations in Figures 1(a) and (b), which greatly improves the coverage. In addition, every association is directed. For instance, the weight of $x_2 \to x_4$ is $(P(x_2, x_4), T(x_2, x_4))$, for user $x_2$ trusts user $x_4$; the weight of $x_4 \to x_2$ is $(P(x_4, x_2), 0)$, for user $x_4$ does not trust user $x_2$.
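To make the dual-weight representation concrete, the following minimal sketch stores the SNACM as a directed graph whose edges carry the pair $(\omega_P, \omega_T)$. The class and method names (e.g. SNACMGraph, add_association) are illustrative assumptions; the paper does not prescribe a data structure.

```python
# A minimal sketch of the directed, dual-weight SNACM association structure.
# Names such as SNACMGraph and add_association are hypothetical.
class SNACMGraph:
    def __init__(self):
        # edges[(xi, xj)] = (P, T) for the directed association xi -> xj
        self.edges = {}

    def add_association(self, xi, xj, similarity, trust):
        """Store the dual weight (omega_P, omega_T) = (P(xi, xj), T(xi, xj))."""
        self.edges[(xi, xj)] = (similarity, trust)

    def weight(self, xi, xj):
        # An absent association defaults to (0, 0); a one-way trust edge
        # carries T = 0 in the reverse direction, as in the x4 -> x2 example.
        return self.edges.get((xi, xj), (0.0, 0.0))

g = SNACMGraph()
g.add_association("x2", "x4", 0.8, 0.6)  # x2 trusts x4
g.add_association("x4", "x2", 0.8, 0.0)  # x4 does not trust x2
```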
The core user $M$ of the social network can be defined as:

$$M = \min \sum_{x_i, x_j \in C} DIS(x_i, x_j) \qquad (1)$$

Figure 1: An example of the SNACM. (a) The similarity-based associations between social network users; (b) the trust-based associations between social network users; (c) the SNACM for 7 users.

where $C$ is the set of user clusters, and $DIS(x_i, x_j)$ is the distance between users $x_i$ and $x_j$:

$$DIS(x_i, x_j) = \sqrt{D_P^2(x_i, x_j) + D_T^2(x_i, x_j)} \qquad (2)$$

where $D_P$ and $D_T$ are the similarity-based distance and the trust-based distance, respectively, and $\omega_P$ and $\omega_T$ are the measurement functions of similarity and trust, respectively:

$$\begin{cases} D_P(x_i, x_j) = 1 - \omega_P(x_i, x_j) \\ D_T(x_i, x_j) = 1 - \omega_T(x_i, x_j) \end{cases} \qquad (3)$$

The next step is to search for the users highly similar to the core user $c$ through initialization and weighted stacking. Firstly, the similarity between the core user and every other user in the trust-based social network was quantified, and the top-$t$ users were selected. The similarity can be quantified by:

$$SIM(c, x_k) = \begin{cases} \dfrac{2 P(c, x_k) T(c, x_k)}{P(c, x_k) + T(c, x_k)}, & P(c, x_k) + T(c, x_k) \neq 0 \\ T(c, x_k), & P(c, x_k) = 0,\ T(c, x_k) \neq 0 \\ P(c, x_k), & P(c, x_k) \neq 0,\ T(c, x_k) = 0 \end{cases} \qquad (4)$$

Formula (4) shows that the similarity between user $x_k$ and the core user $c$ can be directly calculated if the two users have direct trust-based associations like mutual or unilateral following. Otherwise, the trust can be derived from the propagation distance, and the similarity mined from comments and ratings using the Pearson correlation coefficient:

$$T(c, x_k) = \frac{DIS_{\max} - DIS(c, x_k) + 1}{DIS_{\max}} \qquad (5)$$

The user similarity can be computed by:

$$P(c, x_k) = \frac{\sum_{h=1}^{H} (s_h(c) - \bar{s}(c))(s_h(x_k) - \bar{s}(x_k))}{\sqrt{\sum_{h=1}^{H} (s_h(c) - \bar{s}(c))^2} \sqrt{\sum_{h=1}^{H} (s_h(x_k) - \bar{s}(x_k))^2}} \qquad (6)$$

where $s_h$ is the attention of a user to event $h$, $\bar{s}$ is the mean attention of a user across events, and $H$ is the number of events that attract the attention of the user. The user similarity was compared against a suitable threshold, such that the users in the same range are allocated to the same class.

2.2 Weighted stacking

Based on the SNACM, the artificial ant colony algorithm was introduced to analyze the importance of the $t$ users with the highest similarity. Figure 2 shows the trust-based association model of the core user.

Figure 2: The trust-based association model of the core user

The pheromone released by an ant walking in the model is determined by the similarity between the users along the path it walks. The pheromone is updated iteratively by:

$$\tau'(c, x_k) = (1 - \rho)\tau(c, x_k) + \sum_{g=1}^{G} \Delta\tau_{g\text{-}x_k}(c, x_k) \qquad (7)$$

where $\tau'$ is the updated pheromone of user $x_k$, representing its association with the core user $c$, and $\rho$ is the volatilization coefficient. Let $c_{g\text{-}x_k}$ be the cost of the $g$-th ant to solve the optimal path. Then, the pheromone $\Delta\tau_{g\text{-}x_k}$ released by the $g$-th ant on the path between user $x_k$ and the core user $c$ can be described as:

$$\Delta\tau_{g\text{-}x_k}(c, x_k) = \begin{cases} \dfrac{K}{c_{g\text{-}x_k}(c, x_k)}, & x_k \in X_g \\ 0, & x_k \notin X_g \end{cases} \qquad (8)$$

where $K$ is a constant and $X_g$ is the set of users visited by the $g$-th ant. As shown in formulas (7) and (8), if user $x_k$ does not belong to that set, the pheromone will volatilize iteratively at rate $\rho$; otherwise, the volatilization is offset by the sum of the pheromone released by all ants on the paths between user $x_k$ and other users.
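As a concrete illustration of formulas (4)-(6), the sketch below computes the Pearson similarity from per-event attention scores and fuses it with the trust value. It is a minimal sketch assuming attention scores are stored as equal-length lists; the function names are illustrative, not from the paper.

```python
import math

def pearson_similarity(attention_c, attention_k):
    """Formula (6): Pearson correlation of two users' attention over H events."""
    H = len(attention_c)
    mean_c = sum(attention_c) / H
    mean_k = sum(attention_k) / H
    num = sum((a - mean_c) * (b - mean_k)
              for a, b in zip(attention_c, attention_k))
    den = (math.sqrt(sum((a - mean_c) ** 2 for a in attention_c)) *
           math.sqrt(sum((b - mean_k) ** 2 for b in attention_k)))
    return num / den if den else 0.0

def combined_similarity(P, T):
    """Formula (4): fuse similarity P and trust T into SIM(c, x_k).

    When both weights are nonzero, the harmonic-style fusion applies;
    when one of them is zero, the other is used directly.
    """
    if P != 0 and T != 0:
        return 2 * P * T / (P + T)
    if P == 0 and T != 0:
        return T
    if T == 0 and P != 0:
        return P
    return 0.0

# Toy usage: two users' attention to the same four events.
P = pearson_similarity([3, 1, 0, 2], [2, 1, 1, 2])
print(combined_similarity(P, T=0.6))
```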
To avoid the local optimum trap, and to select the users with high similarity to the core user and low redundancy, the probability for the $g$-th ant to walk from user $x_k$ to the core user $c$ can be defined as:

$$P_g(c, x_k) = \begin{cases} \dfrac{[\tau(c, x_k)]^{\alpha} [1/P(c, x_k)]^{\beta}}{\sum_{l \in Y_g} [\tau(c, x_l)]^{\alpha} [1/P(c, x_l)]^{\beta}}, & c \in N_{x_k} \\ 0, & c \notin N_{x_k} \end{cases} \qquad (9)$$

where $1/P(c, x_k)$ is the heuristic value guiding the walk from user $x_k$ to the core user $c$; $\alpha$ and $\beta$ are parameters controlling the influence of the pheromone and the heuristic value, respectively; $N_{x_k}$ is the set of neighbors of user $x_k$; $Y_g$ is the set of users not yet visited by the $g$-th ant.

3 Social network image classification algorithm

The SNACM can reveal the similarity-based association vectors between users. Here, the CNN is adopted to extract the salient features of the associations between social network images. The obtained salient features, coupled with the similarity-based association vectors, were used to train a neural network model, aiming to accurately classify social network images. The structure of the social network image classification algorithm is presented in Figure 3.

Figure 3: The structure of the social network image classification algorithm

3.1 Building an image association network

The social image association network was constructed through a one-to-one mapping of the $n$ users in the SNACM to the set $M$ of images uploaded by them. Let $F(f, m_k)$ be the association between the image $m_k$ uploaded by user $x_k$ and the image $f$ uploaded by the core user $c$, and let matrix $R$ describe the degree of the association. Then, the weight of the association between $m_k$ and $f$ is characterized by the corresponding element $r(f, m_k)$ in matrix $R$. Hence, the probability for the $g$-th ant to walk from image $m_k$ to image $f$ can be defined as:

$$P_g(f, m_k) = \begin{cases} \dfrac{[\tau(f, m_k)]^{\alpha} [1/F(f, m_k)]^{\beta}}{\sum_{l \in V_g} [\tau(f, m_l)]^{\alpha} [1/F(f, m_l)]^{\beta}}, & f \in N_{m_k} \\ 0, & f \notin N_{m_k} \end{cases} \qquad (10)$$

where $N_{m_k}$ is the set of neighbors of image $m_k$, and $V_g$ is the set of images not yet visited by the $g$-th ant. The LINE algorithm based on the second-order similarity can be used to calculate the conditional probability of every directed association in the social network:

$$\hat{P}_g(f, m_k) = \frac{\exp[\tau(f, m_k)\,\tau'(f, m_k)]}{\sum_{q=1}^{n} \exp[\tau(f, m_q)\,\tau'(f, m_q)]} \qquad (11)$$

The associations between SNACM images were taken as training samples. Then, the image association network can be finalized by learning to minimize the Kullback-Leibler (K-L) divergence between formulas (10) and (11). The objective function of the algorithm can be defined as:

$$\min \sum_{f, m_k \in M} \sum_{g=1}^{G} KL\big(\hat{P}_g(f, m_k), P_g(f, m_k)\big) = \min \sum_{f, m_k \in M} \sum_{g=1}^{G} \big[r(f, m_k) \lg(P_g(f, m_k))\big] \qquad (12)$$

The conditional probabilities cannot be calculated unless the pheromones are iteratively optimized throughout the image association network. However, it is an arduous task to optimize formula (12) directly. To solve the problem, $\lg(P_g(f, m_k))$ can be approximated with the negative sampling algorithm and the sigmoid function:

$$\lg \mathrm{sig}(P_g(f, m_k)) + \sum_{s=1}^{S} P_{noise}\big[\mathrm{sig}(P_{s\text{-}g}(f, m_k))\big] \qquad (13)$$

where the second term is the negative sampling item, $P_{noise}$ is the noise distribution of negative sampling, and $P_{s\text{-}g}$ is the conditional probability obtained in the $s$-th of $S$ negative samplings. Then, the gradient descent algorithm was introduced to optimize the above formula. In this way, the salient features of image associations can be obtained through learning.
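The transition rule of formula (10) can be sketched as follows, assuming the pheromone values and association strengths are kept in dictionaries keyed by image pairs; the parameter defaults are illustrative, as the paper does not report its settings.

```python
def transition_probability(tau, F, neighbors, unvisited, f, m_k,
                           alpha=1.0, beta=2.0):
    """Formula (10): probability that the g-th ant walks from image m_k to f.

    tau:       pheromone values, tau[(f, m)]
    F:         association strengths, F[(f, m)]
    neighbors: set N_{m_k} of images adjacent to m_k
    unvisited: set V_g of images the ant has not yet visited
    """
    if f not in neighbors:
        return 0.0

    def attractiveness(m):
        # Pheromone weighted by the heuristic value 1/F, as in formula (10).
        return (tau[(f, m)] ** alpha) * ((1.0 / F[(f, m)]) ** beta)

    denom = sum(attractiveness(m) for m in unvisited)
    return attractiveness(m_k) / denom if denom > 0 else 0.0

# Toy usage with two candidate images.
tau = {("f", "m1"): 0.5, ("f", "m2"): 0.2}
F = {("f", "m1"): 0.8, ("f", "m2"): 0.4}
print(transition_probability(tau, F, {"f"}, {"m1", "m2"}, "f", "m1"))
```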
3.2 Multimodal fusion

Many social network images come from video clips. For the accuracy of image classification, the images extracted from videos were subjected to modal analysis and fusion, and used to train the CNN. For a social network image, the texture is reflected by the grayscale image of the object, and the 3D surface features are depicted by the RGB and depth images. To fully utilize the feature information of social network images, this paper fuses the RGB image, grayscale image, and depth image, and extracts the salient features from the fused high-dimensional image (Figure 4).

Figure 4: The workflow of the classification of the multimodal social network image

During the fusion, the low-level features were extracted from the RGB image, grayscale image, and depth image by the 3D NN, using convolution and pooling operations, and used as the inputs of the objective functions of multiple path-searching ants, thereby obtaining more abstract and effective high-level features. After the fusion, the image $m_k$ corresponding to user $x_k$ can be expressed as:

$$I_{m_k} = [I_{RGB}(m_k), I_{GRAY}(m_k), I_{DEPTH}(m_k)] \qquad (14)$$

where $I_{RGB}$, $I_{GRAY}$, and $I_{DEPTH}$ are the images uploaded in the RGB, grayscale, and depth modes, respectively. The different features were extracted from the three modes, and used to create a training set $T$ under mode fusion based on the image association network:

$$T = \{t_1(I_f, I_{m_1}), \cdots, t_k(I_f, I_{m_k}), \cdots, t_n(I_f, I_{m_n})\} \qquad (15)$$

where $t_k(I_f, I_{m_k})$ is the $k$-th training sample of the image association network.

Based on the image association network, the training path involves a feature extraction module and a feature amplification module. The former consists of a 3D convolution layer, a regularization layer, and a max pooling layer; the latter consists of 3D sampling and feature cascading. Since a two-dimensional (2D) CNN cannot effectively mine the information from the grayscale and depth images, this paper adopts the 3D NN to extract the features from the multimodal social network image. In the 3D NN, the input of the input layer and the outputs of the kernels and the output layer are all 3D data (Figure 5).

Figure 5: The sketch map of the 3D NN

In the 3D NN for multimodal fusion, the pixel value of point $(a, b, c)$ in the $c$-th image of the $m$-th feature block on the $n$-th layer can be calculated by:

$$I_{nm}^{abc} = f\left(\sum_{q} \sum_{x_n=0}^{X_n-1} \sum_{y_n=0}^{Y_n-1} \sum_{z_n=0}^{Z_n-1} \omega_{nmq} \times u_{(n-1)q}^{(a+x_n)(b+y_n)(c+z_n)} + d_{nm}\right) \qquad (16)$$

where $X_n$, $Y_n$, and $Z_n$ are the width, height, and length of the kernel on the $n$-th layer; $x_n \times y_n \times z_n$ is the step length of the 3D convolution; $\omega_{nmq}$ is the weight between point $(a, b, c)$ in the $c$-th image of the $m$-th feature block on the $n$-th layer and the $q$-th feature map on the $(n-1)$-th layer; $u$ is the input from the $(n-1)$-th layer; $d_{nm}$ is the deviation.

The 3D pooling after convolution follows a similar principle to the convolution. This operation not only reduces the dimensionality of the features of the social network images in the three modes, but also enhances the saliency of these features. If the region of interest (ROI) is of size $W_1 \times W_2 \times W_3$, the 3D pooling can be expressed as:

$$I_{rst} = \max_{0 \le \varepsilon_1 \le W_1;\ 0 \le \varepsilon_2 \le W_2;\ 0 \le \varepsilon_3 \le W_3} \left(u_{r \times \eta + \varepsilon_1,\ s \times \delta + \varepsilon_2,\ t \times \tau + \varepsilon_3}\right) \qquad (17)$$

where $(r, s, t)$ is the pixel to be pooled; $\eta \times \delta \times \tau$ is the step length of the 3D pooling; $u_{r \times \eta + \varepsilon_1, s \times \delta + \varepsilon_2, t \times \tau + \varepsilon_3}$ is the output eigenvalue of 3D pixel $(r \times \eta + \varepsilon_1, s \times \delta + \varepsilon_2, t \times \tau + \varepsilon_3)$.
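The feature extraction path described above can be sketched in PyTorch as a 3D convolution, a per-channel normalization layer, and 3D max pooling. This is a minimal sketch under assumed channel sizes and kernel shapes; the paper does not report its exact architecture, and stacking the three modes along the depth axis is likewise an assumption.

```python
import torch
import torch.nn as nn

class FeatureExtractor3D(nn.Module):
    """Sketch of the feature extraction module: 3D convolution (cf. formula
    (16)), per-channel instance normalization (cf. formulas (18)-(20) in the
    next subsection), and 3D max pooling (cf. formula (17))."""
    def __init__(self, in_channels=1, out_channels=16):
        super().__init__()
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1)
        self.norm = nn.InstanceNorm3d(out_channels)
        self.pool = nn.MaxPool3d(kernel_size=2)
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, depth, height, width)
        return self.pool(self.act(self.norm(self.conv(x))))

# Toy fused input per formula (14): RGB (3 slices), grayscale (1 slice), and
# depth (1 slice) stacked along the depth axis of a single-channel volume.
fused = torch.randn(2, 1, 5, 64, 64)
features = FeatureExtractor3D()(fused)  # -> (2, 16, 2, 32, 32)
```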
To prevent data jitter and speed up the convergence of the multimodal-fusion 3D NN, instance normalization was employed to normalize each channel of every social network image. Let $u_{poo}$ be the output of the pooling layer. Then, the mean $\lambda$ of each social network image can be calculated along the channel by:

$$\lambda = \frac{1}{RS} \sum_{r=1}^{R} \sum_{s=1}^{S} u_{poo} \qquad (18)$$

The standard deviation $\sigma$ can be obtained by:

$$\sigma = \sqrt{\frac{1}{RS} \sum_{r=1}^{R} \sum_{s=1}^{S} (u_{poo} - \lambda)^2} \qquad (19)$$

The normalized result can be obtained by:

$$I_{NOR} = \frac{u_{poo} - \lambda}{\sqrt{\sigma^2 + \varepsilon}} \qquad (20)$$

The layer-by-layer normalization effectively prevents vanishing or exploding gradients, reduces the reliance of the 3D NN on initial parameters, and accelerates network convergence. On this basis, the social network images can be classified accurately through subsequent visualization analysis.

4 Experiments and results analysis

To verify its effectiveness, the proposed social network image classification algorithm, which is grounded on multimodal deep learning, was tested on single modal and multimodal social network images. A total of 360 pre-classified social network images were taken as training samples. Then, 100 samples were selected to serve as test samples. Each sample contains images in three modes, namely, RGB image, grayscale image, and depth image. Figure 6 presents the convergence curves of our algorithm on the training set and test set under different image modes.

Figure 6: The convergence curves of our algorithm on the training set and test set under different image modes. (a) Multimodal images; (b) single modal images.

As shown in Figure 6, the losses of our algorithm were -0.8735 and -0.8232 on the training set and test set of single modal social network images, respectively, and -0.8924 and -0.8045 on the training set and test set of multimodal social network images, respectively. The training loss and test loss on multimodal images were 0.0189 and 0.0187 smaller than those on single modal images, indicating that our algorithm performs better on multimodal images.

Our algorithm also has an obvious advantage over other classification algorithms in classification performance. The classification metrics of the proposed 3D NN are compared with those of several training networks in Table 1.

Table 1: The comparison of classification metrics between different training networks

Type of network | Weighted stacking? | Accuracy | Mean accuracy | Precision | Recall
BP NN   | No  | 90.2% | 83.2% | 82% | 80%
BP NN   | Yes | 92.1% | 86.3% | 86% | 87%
RBF NN  | Yes | 93.7% | 92.4% | 92% | 94%
SOFM NN | Yes | 95.2% | 92.9% | 93% | 94%
3D NN   | Yes | 96.8% | 97.0% | 95% | 95%

Note: BP, RBF, and SOFM are short for backpropagation, radial basis function, and self-organizing feature map, respectively.

As shown in Table 1, the BP NN had a great difference between accuracy (90.2%) and mean accuracy (83.2%) in direct training. The accuracy of the BP NN on social network images improved from 90.2% to 92.1% after the importance of the most similar users was analyzed through the artificial ant colony algorithm and weighted stacking. Compared with the BP NNs, the RBF NN, and the SOFM NN, the 3D NN achieved the highest accuracy (96.8%) and mean accuracy (97.0%). Hence, our model is highly effective in classifying social network images.
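For reference, the metrics in Table 1 could be computed as in the sketch below, assuming scikit-learn is available; the paper does not state its evaluation code, and reading "mean accuracy" as the per-class balanced accuracy is an assumption.

```python
# Hedged sketch of the Table 1 metrics; macro averaging over classes and the
# balanced-accuracy reading of "mean accuracy" are assumptions.
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             precision_score, recall_score)

def classification_metrics(y_true, y_pred):
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "mean_accuracy": balanced_accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro",
                                     zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro",
                               zero_division=0),
    }

# Toy usage with three image classes.
print(classification_metrics([0, 1, 2, 1, 0], [0, 1, 2, 2, 0]))
```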
Figure 7 compares the accuracy and loss curves of the above NNs on the test set. It can be seen that the proposed model produced the highest classification accuracy and tolerable losses.

Figure 7: The comparison of accuracy and loss curves between different training networks. (a) Accuracy; (b) loss curve.

5 Conclusions

Based on multimodal deep learning, this paper designs a classification algorithm for social network images. Firstly, the authors developed the SNACM, and relied on the model to derive the trust and similarity, which represent the degree of similarity between users. Then, the importance of the most similar users was analyzed by the artificial ant colony algorithm, and the SNACM was subjected to weighted stacking. Next, the social network images of the RGB, grayscale, and depth modes were fused. The features of the multimodal social network image were extracted by a 3D NN, realizing the accurate classification of social network images. Experimental results show that our algorithm has a smaller loss on multimodal social network images than on single modal ones, and clearly outperforms other methods in classification performance; the proposed 3D NN boasts the highest accuracy among the tested training networks, and controls the loss within an acceptable range.

Funding

This work is supported by the Philosophy and Social Science Research Project of the Education Department (19G091), the Humanities and Social Sciences Project of the Hubei Provincial Department of Education (17G134), the Ministry of Education Collaborative Education Project (201902256010), the Humanities and Social Sciences Project of the Hubei Provincial Department of Education (18G623), and the Ministry of Education Collaborative Education Project (201901088004).

Author contributions

The authors contributed equally to this work.

Conflict of interest

The authors declare no conflict of interest.

References

[1] Anantha, N.L.; Battula, B.P. (2018). Deep convolutional neural networks for product recommendation, Ingénierie des Systèmes d'Information, 23(6), 161-172, 2018.

[2] Arepalli, P.G.; Narayana, V.L.; Venkatesh, R.; Kumar, N.A. (2019). Certified node frequency in social network using parallel diffusion methods, Ingénierie des Systèmes d'Information, 24(1), 113-117, 2019.

[3] Chirra, V.R.R.; Uyyala, S.R.; Kolli, V.K.K. (2019). Deep CNN: A machine learning approach for driver drowsiness detection based on eye state, Revue d'Intelligence Artificielle, 33(6), 461-466, 2019.

[4] Claude, U. (2020). Predicting tourism demands by Google Trends: A hidden Markov models based study, Journal of System and Management Sciences, 10(1), 106-120, 2020.

[5] De Salve, A.; Di Pietro, R.; Mori, P.; Ricci, L. (2017). A logical key hierarchy based approach to preserve content privacy in decentralized online social networks, IEEE Transactions on Dependable and Secure Computing, 17(1), 2-21, 2017.

[6] Gothania, J.; Rathore, S.K. (2019). Performance metrics for chromatic correlation clustering for social network analysis, Revue d'Intelligence Artificielle, 33(5), 373-378, 2019.

[7] Gu, Y.; Chanussot, J.; Jia, X.; Benediktsson, J.A. (2017). Multiple kernel learning for hyperspectral image classification: A review, IEEE Transactions on Geoscience and Remote Sensing, 55(11), 6547-6565, 2017.

[8] Hang, R.; Liu, Q.; Hong, D.; Ghamisi, P. (2019). Cascaded recurrent neural networks for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing, 57(8), 5384-5394, 2019.

[9] Huang, W.D.; Wang, Q.; Cao, J. (2018).
Tracing public opinion propagation and emotional evolution based on public emergencies in social networks, International Journal of Computers Communications & Control, 13(1), 129-142, 2018.

[10] Kim, J.H.; Kim, M.S.; Hong, R.K.; Ko, J.W. (2019). Continuous use intention of corporate mobile SNS users and its determinants: Application of extended technology acceptance model, Journal of System and Management Sciences, 9(4), 12-28, 2019.

[11] Kirsal, Y.; Paranthaman, V.V.; Mapp, G. (2018). Exploring analytical models for proactive resource management in highly mobile environments, International Journal of Computers Communications & Control, 13(5), 837-852, 2018.

[12] Kuhnle, A.; Pan, T.; Alim, M.A.; Thai, M.T. (2017). Scalable bicriteria algorithms for the threshold activation problem in online social networks, In IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, IEEE, 1-9, 2017.

[13] Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. (2019). Deep learning for hyperspectral image classification: An overview, IEEE Transactions on Geoscience and Remote Sensing, 57(9), 6690-6709, 2019.

[14] Mallick, P.K.; Ryu, S.H.; Satapathy, S.K.; Mishra, S.; Nguyen, G.N.; Tiwari, P. (2019). Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network, IEEE Access, 7, 46278-46287, 2019.

[15] Meng, W.L.; Mao, C.Z.; Zhang, J.; Wen, J.; Wu, D.H. (2019). A fast recognition algorithm of online social network images based on deep learning, Traitement du Signal, 36(6), 575-580, 2019.

[16] Minaev, V.A.; Dvoryankin, S.V. (2016). Foundation and description of informational and psychological destructive nature influences dynamics model in social networks, Bezopasnost Informatsionnykh Tekhnologiy = IT Security, 23(3), 40-52, 2016.

[17] Paoletti, M.E.; Haut, J.M.; Plaza, J.; Plaza, A. (2018). A new deep convolutional neural network for fast hyperspectral image classification, ISPRS Journal of Photogrammetry and Remote Sensing, 145, 120-147, 2018.

[18] Pensa, R.G.; Di Blasi, G.; Bioglio, L. (2019). Network-aware privacy risk estimation in online social networks, Social Network Analysis and Mining, 9(1), 15, 2019.

[19] Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. (2019). HybridSN: Exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification, IEEE Geoscience and Remote Sensing Letters, 17(2), 277-281, 2019.

[20] Sajja, T.K.; Devarapalli, R.M.; Kalluri, H.K. (2019). Lung cancer detection based on CT scan images by using deep transfer learning, Traitement du Signal, 36(4), 339-344, 2019.

[21] Shu, K.; Bernard, H.R.; Liu, H. (2019). Studying fake news via network analysis: Detection and mitigation, In Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining, Springer, Cham, 43-65, 2019.

[22] Van Schaik, P.; Jansen, J.; Onibokun, J.; Camp, J.; Kusev, P. (2018). Security and privacy in online social networking: Risk perceptions and precautionary behaviour, Computers in Human Behavior, 78, 283-297, 2018.

[23] Venkatesan, S.; Oleshchuk, V.A.; Chellappan, C.; Prakash, S. (2016). Analysis of key management protocols for social networks, Social Network Analysis and Mining, 6(1), 3, 2016.

[24] Wajeed, M.A.; Sreenivasulu, V. (2019). Image based tumor cells identification using convolutional neural network and auto encoders, Traitement du Signal, 36(5), 445-453, 2019.

[25] Zhang, X.F.; Chen, X.L.; Seng, D.W.; Fang, X.J. (2019).
A factored similarity model with trust and social influence for top-N recommendation, International Journal of Computers Communications & Control, 14(4), 590-607, 2019.

[26] Zhu, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. (2018). Generative adversarial networks for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing, 56(9), 5046-5063, 2018.

Copyright © 2020 by the authors. Licensee Agora University, Oradea, Romania. This is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License. Journal's webpage: http://univagora.ro/jour/index.php/ijccc/

This journal is a member of, and subscribes to the principles of, the Committee on Publication Ethics (COPE). https://publicationethics.org/members/international-journal-computers-communications-and-control

Cite this paper as:
Bai, J. W.; Chi, C. (2020). A Social Network Image Classification Algorithm Based on Multimodal Deep Learning, International Journal of Computers Communications & Control, 15(6), 4037, 2020. https://doi.org/10.15837/ijccc.2020.6.4037