INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
Online ISSN 1841-9844, ISSN-L 1841-9836, Volume: 15, Issue: 6, Month: December, Year: 2020
Article Number: 4037, https://doi.org/10.15837/ijccc.2020.6.4037
CCC Publications

A Social Network Image Classification Algorithm Based on Multimodal Deep Learning

J. W. Bai, C. Chi

Junwei Bai
School of Art and Media, Wuhan Polytechnic University, Wuhan 430023, China
daqianzi007@126.com

Cheng Chi*
College of Engineering and Technology, Hubei University of Technology, Wuhan 430068, China
*Corresponding author: aling420@126.com

Abstract

The complex data structure and massive image data of social networks pose a huge challenge to the mining of associations between social information. For accurate classification of social network images, this paper proposes a social network image classification algorithm based on multimodal deep learning. Firstly, a social network association clustering model (SNACM) was established and used to calculate trust and similarity, which represent the degree of similarity between users. Based on the artificial ant colony algorithm, the SNACM was subjected to weighted stacking, and the social network image association network was constructed. After that, the social network images of three modes, i.e. RGB (red-green-blue) image, grayscale image, and depth image, were fused. Finally, a three-dimensional neural network (3D NN) was constructed to extract the features of the multimodal social network image. The proposed algorithm was proved valid and accurate through experiments. The research results provide a reference for applying multimodal deep learning to image classification in other fields.

Keywords: multimodal deep learning, social network, image classification, three-dimensional neural network (3D NN).

1 Introduction

The model of our social interactions has been reshaped by social networks, which are booming under the proliferation of information technology and smart mobile terminals [2, 4, 6, 10, 11, 21]. The frequent actions of social network users (e.g. interaction, image upload, and sharing) complicate the data structure and inflate the scale of image data on social networks, posing a severe challenge to data mining.

The social network images are associated with various social information of the users, including their locations, followees, comments, and reposts. The associations contain information that is valuable for user portrayal, precision advertising, and community discovery. As a result, the clustering, classification, and frequent pattern mining of social network images have become research hotspots [12, 15, 18, 22, 24].

Radcliffe-Brown was the first to study the social network and its structure, kicking off social network analysis. Since then, great progress has been made in the theories, methods, and techniques of social structure research [5, 9, 16, 23, 25]. Roy et al. [19] reviewed the development of various social networks, namely, Facebook, Twitter, Qzone, and WeChat Moments, and analyzed their strengths and weaknesses.

The current research on social network images mainly focuses on image indexing, image classification, and image annotation. Li et al. [13] classified multi-class social network images through kernel canonical correlation analysis (KCCA), and achieved a classification accuracy of over 89%. Mallick et al.
[14] screened social network images with a matrix factorization model, and mined the visual features of each image and its associated images, thereby improving the effect of image classification. Based on depth-first search and breadth-first search, Hang et al. [8] extended DeepWalk with the word2vec model, and divided the random paths of the network through greedy binarization.

The existing processing methods for social network images cannot mine the relationships between items of social information. Fortunately, deep learning has been successfully applied to analyze image associations in other fields [1, 3, 7, 20]. Zhu et al. [26] designed an image clustering algorithm for an image-capturing robot, based on the convolutional neural network (CNN) for multi-classifier decision-making. Paoletti et al. [17] combined hierarchical matching pursuit with multithreaded asynchronous reinforcement learning to classify multi-source remote sensing images.

For accurate classification of social network images, this paper proposes a social network image classification algorithm based on multimodal deep learning, which fully mines the associations between social network images and other social information. The remainder of this paper is organized as follows: Section 2 establishes a social network association clustering model (SNACM), and calculates the trust and similarity, which represent the degree of similarity between users; Section 3 adopts the artificial ant colony algorithm to analyze the importance of the users with the highest similarity, and conducts weighted stacking of the SNACM based on the update of pheromone. On this basis, the social network image association network is constructed, and the social network images of three modes, i.e. RGB (red-green-blue) image, grayscale image, and depth image, are fused. At the end of Section 3, the features of the multimodal social network image are extracted with a three-dimensional neural network (3D NN). Finally, the proposed algorithm is proved valid and accurate through experiments.

2 SNACM

Social networks contain various strong and weak associations. The effect and coverage of social network image clustering directly hinge on user activity. To balance the effect and coverage of image clustering and overcome the data sparsity of social networks, this paper created the SNACM based on the artificial ant colony algorithm, which searches for the users highly similar to the core user to improve the clustering accuracy.

2.1 Similarity calculation

Suppose each user association in a social network based on trust propagation has two weights. Let P be the Pearson correlation coefficient representing user similarity, and T be the trust value representing the trust propagation relationship. Then, the dual weight coefficient can be expressed as $(\omega_P, \omega_T) = (P(x_i, x_j), T(x_i, x_j))$.

Figure 1 provides an example of the SNACM for 7 users. As shown in Figure 1, the SNACM combines the two types of associations in Figures 1(a) and (b), which greatly improves the coverage. In addition, every association is directed. For instance, the weight of $x_2 \to x_4$ is $(P(x_2, x_4), T(x_2, x_4))$, for user $x_2$ trusts user $x_4$; the weight of $x_4 \to x_2$ is $(P(x_4, x_2), 0)$, for user $x_4$ does not trust user $x_2$.
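To make the dual-weight representation concrete, the following minimal sketch stores the SNACM as a directed graph whose edges carry the pair $(\omega_P, \omega_T)$. The class and method names (e.g. SNACMGraph, add_association) are illustrative assumptions; the paper does not prescribe a data structure.

```python
# A minimal sketch of the directed, dual-weight SNACM association structure.
# Names such as SNACMGraph and add_association are hypothetical.
class SNACMGraph:
    def __init__(self):
        # edges[(xi, xj)] = (P, T) for the directed association xi -> xj
        self.edges = {}

    def add_association(self, xi, xj, similarity, trust):
        """Store the dual weight (omega_P, omega_T) = (P(xi, xj), T(xi, xj))."""
        self.edges[(xi, xj)] = (similarity, trust)

    def weight(self, xi, xj):
        # An absent association defaults to (0, 0); a one-way trust edge
        # carries T = 0 in the reverse direction, as in the x4 -> x2 example.
        return self.edges.get((xi, xj), (0.0, 0.0))

g = SNACMGraph()
g.add_association("x2", "x4", 0.8, 0.6)  # x2 trusts x4
g.add_association("x4", "x2", 0.8, 0.0)  # x4 does not trust x2
```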
The core user $M$ of the social network can be defined as:

$$M = \min \sum_{x_i, x_j \in C} DIS(x_i, x_j) \qquad (1)$$

Figure 1: An example of the SNACM. (a) The similarity-based associations between social network users; (b) the trust-based associations between social network users; (c) the SNACM for 7 users.

where $C$ is the set of user clusters, and $DIS(x_i, x_j)$ is the distance between users $x_i$ and $x_j$:

$$DIS(x_i, x_j) = \sqrt{D_P^2(x_i, x_j) + D_T^2(x_i, x_j)} \qquad (2)$$

where $D_P$ and $D_T$ are the similarity-based distance and the trust-based distance, respectively, and $\omega_P$ and $\omega_T$ are the measurement functions of similarity and trust, respectively:

$$\begin{cases} D_P(x_i, x_j) = 1 - \omega_P(x_i, x_j) \\ D_T(x_i, x_j) = 1 - \omega_T(x_i, x_j) \end{cases} \qquad (3)$$

The next step is to search for the users highly similar to the core user $c$ through initialization and weighted stacking. Firstly, the similarity between the core user and every other user in the trust-based social network was quantified, and the top-$t$ users were selected. The similarity can be quantified by:

$$SIM(c, x_k) = \begin{cases} \dfrac{2 P(c, x_k) T(c, x_k)}{P(c, x_k) + T(c, x_k)}, & P(c, x_k) + T(c, x_k) \neq 0 \\ T(c, x_k), & P(c, x_k) = 0,\ T(c, x_k) \neq 0 \\ P(c, x_k), & P(c, x_k) \neq 0,\ T(c, x_k) = 0 \end{cases} \qquad (4)$$

Formula (4) shows that the similarity between user $x_k$ and the core user $c$ can be directly calculated if the two users have direct trust-based associations like mutual or unilateral following. Otherwise, the trust can be derived from the propagation distance, and the similarity mined from comments and ratings using the Pearson correlation coefficient:

$$T(c, x_k) = \frac{DIS_{\max} - DIS(c, x_k) + 1}{DIS_{\max}} \qquad (5)$$

The user similarity can be computed by:

$$P(c, x_k) = \frac{\sum_{h=1}^{H} (s_h(c) - \bar{s}(c))(s_h(x_k) - \bar{s}(x_k))}{\sqrt{\sum_{h=1}^{H} (s_h(c) - \bar{s}(c))^2} \sqrt{\sum_{h=1}^{H} (s_h(x_k) - \bar{s}(x_k))^2}} \qquad (6)$$

where $s_h$ is the attention of a user to event $h$, $\bar{s}$ is the mean attention of a user across events, and $H$ is the number of events that attract the attention of the user. The user similarity was compared against a suitable threshold, such that the users in the same range are allocated to the same class.

2.2 Weighted stacking

Based on the SNACM, the artificial ant colony algorithm was introduced to analyze the importance of the $t$ users with the highest similarity. Figure 2 shows the trust-based association model of the core user.

Figure 2: The trust-based association model of the core user

The pheromone released by an ant walking in the model is determined by the similarity between the users along the path it walks. The pheromone is updated iteratively by:

$$\tau'(c, x_k) = (1 - \rho)\tau(c, x_k) + \sum_{g=1}^{G} \Delta\tau_{g\text{-}x_k}(c, x_k) \qquad (7)$$

where $\tau'$ is the updated pheromone of user $x_k$, representing its association with the core user $c$, and $\rho$ is the volatilization coefficient. Let $c_{g\text{-}x_k}$ be the cost of the $g$-th ant to solve the optimal path. Then, the pheromone $\Delta\tau_{g\text{-}x_k}$ released by the $g$-th ant on the path between user $x_k$ and the core user $c$ can be described as:

$$\Delta\tau_{g\text{-}x_k}(c, x_k) = \begin{cases} \dfrac{K}{c_{g\text{-}x_k}(c, x_k)}, & x_k \in X_g \\ 0, & x_k \notin X_g \end{cases} \qquad (8)$$

where $K$ is a constant and $X_g$ is the set of users visited by the $g$-th ant. As shown in formulas (7) and (8), if user $x_k$ does not belong to that set, the pheromone will volatilize iteratively at rate $\rho$; otherwise, the volatilization is offset by the sum of the pheromone released by all ants on the paths between user $x_k$ and other users.
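As a concrete illustration of formulas (4)-(6), the sketch below computes the Pearson similarity from per-event attention scores and fuses it with the trust value. It is a minimal sketch assuming attention scores are stored as equal-length lists; the function names are illustrative, not from the paper.

```python
import math

def pearson_similarity(attention_c, attention_k):
    """Formula (6): Pearson correlation of two users' attention over H events."""
    H = len(attention_c)
    mean_c = sum(attention_c) / H
    mean_k = sum(attention_k) / H
    num = sum((a - mean_c) * (b - mean_k)
              for a, b in zip(attention_c, attention_k))
    den = (math.sqrt(sum((a - mean_c) ** 2 for a in attention_c)) *
           math.sqrt(sum((b - mean_k) ** 2 for b in attention_k)))
    return num / den if den else 0.0

def combined_similarity(P, T):
    """Formula (4): fuse similarity P and trust T into SIM(c, x_k).

    When both weights are nonzero, the harmonic-style fusion applies;
    when one of them is zero, the other is used directly.
    """
    if P != 0 and T != 0:
        return 2 * P * T / (P + T)
    if P == 0 and T != 0:
        return T
    if T == 0 and P != 0:
        return P
    return 0.0

# Toy usage: two users' attention to the same four events.
P = pearson_similarity([3, 1, 0, 2], [2, 1, 1, 2])
print(combined_similarity(P, T=0.6))
```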
To avoid the local optimum trap, and to select the users with high similarity to the core user and low redundancy, the probability for the $g$-th ant to walk from user $x_k$ to the core user $c$ can be defined as:

$$P_g(c, x_k) = \begin{cases} \dfrac{[\tau(c, x_k)]^{\alpha} [1/P(c, x_k)]^{\beta}}{\sum_{l \in Y_g} [\tau(c, x_l)]^{\alpha} [1/P(c, x_l)]^{\beta}}, & c \in N_{x_k} \\ 0, & c \notin N_{x_k} \end{cases} \qquad (9)$$

where $1/P(c, x_k)$ is the heuristic value guiding the walk from user $x_k$ to the core user $c$; $\alpha$ and $\beta$ are parameters controlling the influence of the pheromone and the heuristic value, respectively; $N_{x_k}$ is the set of neighbors of user $x_k$; $Y_g$ is the set of users not yet visited by the $g$-th ant.

3 Social network image classification algorithm

The SNACM can reveal the similarity-based association vectors between users. Here, the CNN is adopted to extract the salient features of the associations between social network images. The obtained salient features, coupled with the similarity-based association vectors, were used to train a neural network model, aiming to accurately classify social network images. The structure of the social network image classification algorithm is presented in Figure 3.

Figure 3: The structure of the social network image classification algorithm

3.1 Building an image association network

The social image association network was constructed through a one-to-one mapping of the $n$ users in the SNACM to the set $M$ of images uploaded by them. Let $F(f, m_k)$ be the association between the image $m_k$ uploaded by user $x_k$ and the image $f$ uploaded by the core user $c$, and let matrix $R$ describe the degree of the association. Then, the weight of the association between $m_k$ and $f$ is characterized by the corresponding element $r(f, m_k)$ in matrix $R$. Hence, the probability for the $g$-th ant to walk from image $m_k$ to image $f$ can be defined as:

$$P_g(f, m_k) = \begin{cases} \dfrac{[\tau(f, m_k)]^{\alpha} [1/F(f, m_k)]^{\beta}}{\sum_{l \in V_g} [\tau(f, m_l)]^{\alpha} [1/F(f, m_l)]^{\beta}}, & f \in N_{m_k} \\ 0, & f \notin N_{m_k} \end{cases} \qquad (10)$$

where $N_{m_k}$ is the set of neighbors of image $m_k$, and $V_g$ is the set of images not yet visited by the $g$-th ant. The LINE algorithm based on the second-order similarity can be used to calculate the conditional probability of every directed association in the social network:

$$\hat{P}_g(f, m_k) = \frac{\exp[\tau(f, m_k)\,\tau'(f, m_k)]}{\sum_{q=1}^{n} \exp[\tau(f, m_q)\,\tau'(f, m_q)]} \qquad (11)$$

The associations between SNACM images were taken as training samples. Then, the image association network can be finalized by learning to minimize the Kullback-Leibler (K-L) divergence between formulas (10) and (11). The objective function of the algorithm can be defined as:

$$\min \sum_{f, m_k \in M} \sum_{g=1}^{G} KL\big(\hat{P}_g(f, m_k), P_g(f, m_k)\big) = \min \sum_{f, m_k \in M} \sum_{g=1}^{G} \big[r(f, m_k) \lg(P_g(f, m_k))\big] \qquad (12)$$

The conditional probabilities cannot be calculated unless the pheromones are iteratively optimized throughout the image association network. However, it is an arduous task to optimize formula (12) directly. To solve the problem, $\lg(P_g(f, m_k))$ can be approximated with the negative sampling algorithm and the sigmoid function:

$$\lg \mathrm{sig}(P_g(f, m_k)) + \sum_{s=1}^{S} P_{noise}\big[\mathrm{sig}(P_{s\text{-}g}(f, m_k))\big] \qquad (13)$$

where the second term is the negative sampling item, $P_{noise}$ is the noise distribution of negative sampling, and $P_{s\text{-}g}$ is the conditional probability obtained in the $s$-th of $S$ negative samplings. Then, the gradient descent algorithm was introduced to optimize the above formula. In this way, the salient features of image associations can be obtained through learning.
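The transition rule of formula (10) can be sketched as follows, assuming the pheromone values and association strengths are kept in dictionaries keyed by image pairs; the parameter defaults are illustrative, as the paper does not report its settings.

```python
def transition_probability(tau, F, neighbors, unvisited, f, m_k,
                           alpha=1.0, beta=2.0):
    """Formula (10): probability that the g-th ant walks from image m_k to f.

    tau:       pheromone values, tau[(f, m)]
    F:         association strengths, F[(f, m)]
    neighbors: set N_{m_k} of images adjacent to m_k
    unvisited: set V_g of images the ant has not yet visited
    """
    if f not in neighbors:
        return 0.0

    def attractiveness(m):
        # Pheromone weighted by the heuristic value 1/F, as in formula (10).
        return (tau[(f, m)] ** alpha) * ((1.0 / F[(f, m)]) ** beta)

    denom = sum(attractiveness(m) for m in unvisited)
    return attractiveness(m_k) / denom if denom > 0 else 0.0

# Toy usage with two candidate images.
tau = {("f", "m1"): 0.5, ("f", "m2"): 0.2}
F = {("f", "m1"): 0.8, ("f", "m2"): 0.4}
print(transition_probability(tau, F, {"f"}, {"m1", "m2"}, "f", "m1"))
```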
3.2 Multimodal fusion

Many social network images come from video clips. For the accuracy of image classification, the images extracted from videos were subjected to modal analysis and fusion, and used to train the CNN. For a social network image, the texture is reflected by the grayscale image of the object, and the 3D surface features are depicted by the RGB and depth images. To fully utilize the feature information of social network images, this paper fuses the RGB image, grayscale image, and depth image, and extracts the salient features from the fused high-dimensional image (Figure 4).

Figure 4: The workflow of the classification of the multimodal social network image

During the fusion, the low-level features were extracted from the RGB image, grayscale image, and depth image by the 3D NN, using convolution and pooling operations, and used as the inputs of the objective functions of multiple path-searching ants, thereby obtaining more abstract and effective high-level features. After the fusion, the image $m_k$ corresponding to user $x_k$ can be expressed as:

$$I_{m_k} = [I_{RGB}(m_k), I_{GRAY}(m_k), I_{DEPTH}(m_k)] \qquad (14)$$

where $I_{RGB}$, $I_{GRAY}$, and $I_{DEPTH}$ are the images uploaded in the RGB, grayscale, and depth modes, respectively. The different features were extracted from the three modes, and used to create a training set $T$ under mode fusion based on the image association network:

$$T = \{t_1(I_f, I_{m_1}), \cdots, t_k(I_f, I_{m_k}), \cdots, t_n(I_f, I_{m_n})\} \qquad (15)$$

where $t_k(I_f, I_{m_k})$ is the $k$-th training sample of the image association network.

Based on the image association network, the training path involves a feature extraction module and a feature amplification module. The former consists of a 3D convolution layer, a regularization layer, and a max pooling layer; the latter consists of 3D sampling and feature cascading. Since a two-dimensional (2D) CNN cannot effectively mine the information from the grayscale and depth images, this paper adopts the 3D NN to extract the features from the multimodal social network image. In the 3D NN, the input of the input layer and the outputs of the kernels and the output layer are all 3D data (Figure 5).

Figure 5: The sketch map of the 3D NN

In the 3D NN for multimodal fusion, the pixel value of point $(a, b, c)$ in the $c$-th image of the $m$-th feature block on the $n$-th layer can be calculated by:

$$I_{nm}^{abc} = f\left(\sum_{q} \sum_{x_n=0}^{X_n-1} \sum_{y_n=0}^{Y_n-1} \sum_{z_n=0}^{Z_n-1} \omega_{nmq} \times u_{(n-1)q}^{(a+x_n)(b+y_n)(c+z_n)} + d_{nm}\right) \qquad (16)$$

where $X_n$, $Y_n$, and $Z_n$ are the width, height, and length of the kernel on the $n$-th layer; $x_n \times y_n \times z_n$ is the step length of the 3D convolution; $\omega_{nmq}$ is the weight between point $(a, b, c)$ in the $c$-th image of the $m$-th feature block on the $n$-th layer and the $q$-th feature map on the $(n-1)$-th layer; $u$ is the input from the $(n-1)$-th layer; $d_{nm}$ is the deviation.

The 3D pooling after convolution follows a similar principle to the convolution. This operation not only reduces the dimensionality of the features of the social network images in the three modes, but also enhances the saliency of these features. If the region of interest (ROI) is of size $W_1 \times W_2 \times W_3$, the 3D pooling can be expressed as:

$$I_{rst} = \max_{0 \le \varepsilon_1 \le W_1;\ 0 \le \varepsilon_2 \le W_2;\ 0 \le \varepsilon_3 \le W_3} \left(u_{r \times \eta + \varepsilon_1,\ s \times \delta + \varepsilon_2,\ t \times \tau + \varepsilon_3}\right) \qquad (17)$$

where $(r, s, t)$ is the pixel to be pooled; $\eta \times \delta \times \tau$ is the step length of the 3D pooling; $u_{r \times \eta + \varepsilon_1, s \times \delta + \varepsilon_2, t \times \tau + \varepsilon_3}$ is the output eigenvalue of 3D pixel $(r \times \eta + \varepsilon_1, s \times \delta + \varepsilon_2, t \times \tau + \varepsilon_3)$.
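The feature extraction path described above can be sketched in PyTorch as a 3D convolution, a per-channel normalization layer, and 3D max pooling. This is a minimal sketch under assumed channel sizes and kernel shapes; the paper does not report its exact architecture, and stacking the three modes along the depth axis is likewise an assumption.

```python
import torch
import torch.nn as nn

class FeatureExtractor3D(nn.Module):
    """Sketch of the feature extraction module: 3D convolution (cf. formula
    (16)), per-channel instance normalization (cf. formulas (18)-(20) in the
    next subsection), and 3D max pooling (cf. formula (17))."""
    def __init__(self, in_channels=1, out_channels=16):
        super().__init__()
        self.conv = nn.Conv3d(in_channels, out_channels, kernel_size=3, padding=1)
        self.norm = nn.InstanceNorm3d(out_channels)
        self.pool = nn.MaxPool3d(kernel_size=2)
        self.act = nn.ReLU()

    def forward(self, x):
        # x: (batch, channels, depth, height, width)
        return self.pool(self.act(self.norm(self.conv(x))))

# Toy fused input per formula (14): RGB (3 slices), grayscale (1 slice), and
# depth (1 slice) stacked along the depth axis of a single-channel volume.
fused = torch.randn(2, 1, 5, 64, 64)
features = FeatureExtractor3D()(fused)  # -> (2, 16, 2, 32, 32)
```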
To prevent data jitter and speed up the convergence of the multimodal-fusion 3D NN, instance normalization was employed to normalize each channel of every social network image. Let $u_{poo}$ be the output of the pooling layer. Then, the mean $\lambda$ of each social network image can be calculated along the channel by:

$$\lambda = \frac{1}{RS} \sum_{r=1}^{R} \sum_{s=1}^{S} u_{poo} \qquad (18)$$

The standard deviation $\sigma$ can be obtained by:

$$\sigma = \sqrt{\frac{1}{RS} \sum_{r=1}^{R} \sum_{s=1}^{S} (u_{poo} - \lambda)^2} \qquad (19)$$

The normalized result can be obtained by:

$$I_{NOR} = \frac{u_{poo} - \lambda}{\sqrt{\sigma^2 + \varepsilon}} \qquad (20)$$

The layer-by-layer normalization effectively prevents vanishing or exploding gradients, reduces the reliance of the 3D NN on initial parameters, and accelerates network convergence. On this basis, the social network images can be classified accurately through subsequent visualization analysis.

4 Experiments and results analysis

To verify its effectiveness, the proposed social network image classification algorithm, which is grounded on multimodal deep learning, was tested on single modal and multimodal social network images. A total of 360 pre-classified social network images were taken as training samples. Then, 100 samples were selected to serve as test samples. Each sample contains images in three modes, namely, RGB image, grayscale image, and depth image. Figure 6 presents the convergence curves of our algorithm on the training set and test set under different image modes.

Figure 6: The convergence curves of our algorithm on the training set and test set under different image modes. (a) Multimodal images; (b) single modal images.

As shown in Figure 6, the losses of our algorithm were -0.8735 and -0.8232 on the training set and test set of single modal social network images, respectively, and -0.8924 and -0.8045 on the training set and test set of multimodal social network images, respectively. The training loss and test loss on multimodal images were 0.0189 and 0.0187 smaller than those on single modal images, indicating that our algorithm performs better on multimodal images.

Our algorithm also has an obvious advantage over other classification algorithms in classification performance. The classification metrics of the proposed 3D NN are compared with those of several training networks in Table 1.

Table 1: The comparison of classification metrics between different training networks

Type of network | Weighted stacking? | Accuracy | Mean accuracy | Precision | Recall
BP NN   | No  | 90.2% | 83.2% | 82% | 80%
BP NN   | Yes | 92.1% | 86.3% | 86% | 87%
RBF NN  | Yes | 93.7% | 92.4% | 92% | 94%
SOFM NN | Yes | 95.2% | 92.9% | 93% | 94%
3D NN   | Yes | 96.8% | 97.0% | 95% | 95%

Note: BP, RBF, and SOFM are short for backpropagation, radial basis function, and self-organizing feature map, respectively.

As shown in Table 1, the BP NN had a great difference between accuracy (90.2%) and mean accuracy (83.2%) in direct training. The accuracy of the BP NN on social network images improved from 90.2% to 92.1% after the importance of the most similar users was analyzed through the artificial ant colony algorithm and weighted stacking. Compared with the BP NNs, the RBF NN, and the SOFM NN, the 3D NN achieved the highest accuracy (96.8%) and mean accuracy (97.0%). Hence, our model is highly effective in classifying social network images.
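For reference, the metrics in Table 1 could be computed as in the sketch below, assuming scikit-learn is available; the paper does not state its evaluation code, and reading "mean accuracy" as the per-class balanced accuracy is an assumption.

```python
# Hedged sketch of the Table 1 metrics; macro averaging over classes and the
# balanced-accuracy reading of "mean accuracy" are assumptions.
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             precision_score, recall_score)

def classification_metrics(y_true, y_pred):
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "mean_accuracy": balanced_accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro",
                                     zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro",
                               zero_division=0),
    }

# Toy usage with three image classes.
print(classification_metrics([0, 1, 2, 1, 0], [0, 1, 2, 2, 0]))
```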
Figure 7 compares the accuracy and loss curves of the above NNs on the test set. It can be seen that the proposed model produced the highest classification accuracy and tolerable losses.

Figure 7: The comparison of accuracy and loss curves between different training networks. (a) Accuracy; (b) loss curve.

5 Conclusions

Based on multimodal deep learning, this paper designs a classification algorithm for social network images. Firstly, the authors developed the SNACM, and relied on the model to derive the trust and similarity, which represent the degree of similarity between users. Then, the importance of the most similar users was analyzed by the artificial ant colony algorithm, and the SNACM was subjected to weighted stacking. Next, the social network images of the RGB, grayscale, and depth modes were fused. The features of the multimodal social network image were extracted by a 3D NN, realizing the accurate classification of social network images. Experimental results show that our algorithm has a smaller loss on multimodal social network images than on single modal ones, and clearly outperforms other methods in classification performance; the proposed 3D NN boasts the highest accuracy among the tested training networks, and controls the loss within an acceptable range.

Funding

This work is supported by the Philosophy and Social Science Research Project of the Education Department (19G091), the Humanities and Social Sciences Project of the Hubei Provincial Department of Education (17G134), the Ministry of Education Collaborative Education Project (201902256010), the Humanities and Social Sciences Project of the Hubei Provincial Department of Education (18G623), and the Ministry of Education Collaborative Education Project (201901088004).

Author contributions

The authors contributed equally to this work.

Conflict of interest

The authors declare no conflict of interest.

References

[1] Anantha, N.L.; Battula, B.P. (2018). Deep convolutional neural networks for product recommendation, Ingénierie des Systèmes d'Information, 23(6), 161-172, 2018.

[2] Arepalli, P.G.; Narayana, V.L.; Venkatesh, R.; Kumar, N.A. (2019). Certified node frequency in social network using parallel diffusion methods, Ingénierie des Systèmes d'Information, 24(1), 113-117, 2019.

[3] Chirra, V.R.R.; Uyyala, S.R.; Kolli, V.K.K. (2019). Deep CNN: A machine learning approach for driver drowsiness detection based on eye state, Revue d'Intelligence Artificielle, 33(6), 461-466, 2019.

[4] Claude, U. (2020). Predicting tourism demands by Google Trends: A hidden Markov models based study, Journal of System and Management Sciences, 10(1), 106-120, 2020.

[5] De Salve, A.; Di Pietro, R.; Mori, P.; Ricci, L. (2017). A logical key hierarchy based approach to preserve content privacy in decentralized online social networks, IEEE Transactions on Dependable and Secure Computing, 17(1), 2-21, 2017.

[6] Gothania, J.; Rathore, S.K. (2019). Performance metrics for chromatic correlation clustering for social network analysis, Revue d'Intelligence Artificielle, 33(5), 373-378, 2019.

[7] Gu, Y.; Chanussot, J.; Jia, X.; Benediktsson, J.A. (2017). Multiple kernel learning for hyperspectral image classification: A review, IEEE Transactions on Geoscience and Remote Sensing, 55(11), 6547-6565, 2017.

[8] Hang, R.; Liu, Q.; Hong, D.; Ghamisi, P. (2019). Cascaded recurrent neural networks for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing, 57(8), 5384-5394, 2019.

[9] Huang, W.D.; Wang, Q.; Cao, J. (2018).
Tracing public opinion propagation and emotional evolution based on public emergencies in social networks, International Journal of Computers Communications & Control, 13(1), 129-142, 2018.

[10] Kim, J.H.; Kim, M.S.; Hong, R.K.; Ko, J.W. (2019). Continuous use intention of corporate mobile SNS users and its determinants: Application of extended technology acceptance model, Journal of System and Management Sciences, 9(4), 12-28, 2019.

[11] Kirsal, Y.; Paranthaman, V.V.; Mapp, G. (2018). Exploring analytical models for proactive resource management in highly mobile environments, International Journal of Computers Communications & Control, 13(5), 837-852, 2018.

[12] Kuhnle, A.; Pan, T.; Alim, M.A.; Thai, M.T. (2017). Scalable bicriteria algorithms for the threshold activation problem in online social networks, In IEEE INFOCOM 2017 - IEEE Conference on Computer Communications, IEEE, 1-9, 2017.

[13] Li, S.; Song, W.; Fang, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. (2019). Deep learning for hyperspectral image classification: An overview, IEEE Transactions on Geoscience and Remote Sensing, 57(9), 6690-6709, 2019.

[14] Mallick, P.K.; Ryu, S.H.; Satapathy, S.K.; Mishra, S.; Nguyen, G.N.; Tiwari, P. (2019). Brain MRI image classification for cancer detection using deep wavelet autoencoder-based deep neural network, IEEE Access, 7, 46278-46287, 2019.

[15] Meng, W.L.; Mao, C.Z.; Zhang, J.; Wen, J.; Wu, D.H. (2019). A fast recognition algorithm of online social network images based on deep learning, Traitement du Signal, 36(6), 575-580, 2019.

[16] Minaev, V.A.; Dvoryankin, S.V. (2016). Foundation and description of informational and psychological destructive nature influences dynamics model in social networks, Bezopasnost Informatsionnykh Tekhnologiy = IT Security, 23(3), 40-52, 2016.

[17] Paoletti, M.E.; Haut, J.M.; Plaza, J.; Plaza, A. (2018). A new deep convolutional neural network for fast hyperspectral image classification, ISPRS Journal of Photogrammetry and Remote Sensing, 145, 120-147, 2018.

[18] Pensa, R.G.; Di Blasi, G.; Bioglio, L. (2019). Network-aware privacy risk estimation in online social networks, Social Network Analysis and Mining, 9(1), 15, 2019.

[19] Roy, S.K.; Krishna, G.; Dubey, S.R.; Chaudhuri, B.B. (2019). HybridSN: Exploring 3-D–2-D CNN feature hierarchy for hyperspectral image classification, IEEE Geoscience and Remote Sensing Letters, 17(2), 277-281, 2019.

[20] Sajja, T.K.; Devarapalli, R.M.; Kalluri, H.K. (2019). Lung cancer detection based on CT scan images by using deep transfer learning, Traitement du Signal, 36(4), 339-344, 2019.

[21] Shu, K.; Bernard, H.R.; Liu, H. (2019). Studying fake news via network analysis: Detection and mitigation, In Emerging Research Challenges and Opportunities in Computational Social Network Analysis and Mining, Springer, Cham, 43-65, 2019.

[22] Van Schaik, P.; Jansen, J.; Onibokun, J.; Camp, J.; Kusev, P. (2018). Security and privacy in online social networking: Risk perceptions and precautionary behaviour, Computers in Human Behavior, 78, 283-297, 2018.

[23] Venkatesan, S.; Oleshchuk, V.A.; Chellappan, C.; Prakash, S. (2016). Analysis of key management protocols for social networks, Social Network Analysis and Mining, 6(1), 3, 2016.

[24] Wajeed, M.A.; Sreenivasulu, V. (2019). Image based tumor cells identification using convolutional neural network and auto encoders, Traitement du Signal, 36(5), 445-453, 2019.

[25] Zhang, X.F.; Chen, X.L.; Seng, D.W.; Fang, X.J. (2019).
A factored similarity model with trust and social influence for top-N recommendation, International Journal of Computers Communications & Control, 14(4), 590-607, 2019.

[26] Zhu, L.; Chen, Y.; Ghamisi, P.; Benediktsson, J.A. (2018). Generative adversarial networks for hyperspectral image classification, IEEE Transactions on Geoscience and Remote Sensing, 56(9), 5046-5063, 2018.

Copyright © 2020 by the authors. Licensee Agora University, Oradea, Romania. This is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License. Journal's webpage: http://univagora.ro/jour/index.php/ijccc/

This journal is a member of, and subscribes to the principles of, the Committee on Publication Ethics (COPE). https://publicationethics.org/members/international-journal-computers-communications-and-control

Cite this paper as:
Bai, J. W.; Chi, C. (2020). A Social Network Image Classification Algorithm Based on Multimodal Deep Learning, International Journal of Computers Communications & Control, 15(6), 4037, 2020. https://doi.org/10.15837/ijccc.2020.6.4037