INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
ISSN 1841-9836, e-ISSN 1841-9844, 14(4), 590-607, August 2019.

A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation

X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

Xuefeng Zhang
School of Computer Science and Technology, Hangzhou Dianzi University
Hangzhou 310018, China

Xiuli Chen
School of Computer Science and Technology, Hangzhou Dianzi University
Hangzhou 310018, China

Dewen Seng*
School of Computer Science and Technology, Hangzhou Dianzi University
Hangzhou 310018, China
*Corresponding author: sengdw@163.com

Xujian Fang
School of Computer Science and Technology, Hangzhou Dianzi University
Hangzhou 310018, China

Abstract: Many trust-aware recommendation systems have emerged to overcome
the problem of data sparsity, which bottlenecks the performance of traditional Col-
laborative Filtering (CF) recommendation algorithms. However, these systems most
rely on the binary social network information, failing to consider the variety of trust
values between users. To make up for the defect, this paper designs a novel Top-N rec-
ommendation model based on trust and social influence, in which the most influential
users are determined by the Improved Structural Holes (ISH) method. Specifically,
the features in Matrix Factorization (MF) were configured by deep learning rather
than random initialization, which has a negative impact on prediction of item rating.
In addition, a trust measurement model was created to quantify the strength of im-
plicit trust. The experimental result shows that our approach can solve the adverse
impacts of data sparsity and enhance the recommendation accuracy.
Keywords: recommendation system, matrix factorization, trust, social influence,
deep learning, top-n recommendation.

1 Introduction

The dawn of the big data era has brought abundant information resources. However, hu-
man beings are sometimes provided with too much information, that is, faced with information
overload. To solve the problem, many portals and e-commerce systems have adopted recommen-
dation systems to push the desired information to users. These systems learn users’ preferences
from their historical behaviors, aiming to recommend the items meeting their interests and needs.

CF is one of the most popular and successful personalized recommendation algorithms. The
algorithm discovers the preferences of users by mining the data on their historical behaviors,
divides the users into different groups based on their preferences, and recommends similar items
to users in each group. The recommendation accuracy of the CF hinges on two issues: inferring
user preferences from historical data, and identifying similar users or items. The traditional CF
algorithm faces two major challenges, namely, data sparsity and cold start. Many methods have
been developed to cope with these challenges. Some scholars have introduced various auxiliary

Copyright ©2019 CC BY-NC


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 591

data to improve the CF, including membership [7,26], trust [2,8,14,23,27,28] and other social
information [5,21,25].

With the development of social networking, it is increasingly convenient to acquire the data
on social relationship. Against this background, a series of trust-based approaches have been pro-
posed and applied in various domains, making up an integral part of recommendation algorithms.
Trust relationship, closely related to social information, is a hotspot in the research of recom-
mendation systems. In social networks, users always prefer the items recommended by those
they trust. Therefore, trust-aware recommendation can greatly improve the recommendation
effect of sparse users (i.e. data sparsity), leading to better user experience and enhanced loyalty.
Based on trust propagation mechanism, Jamali and Ester [10] designed a MF-based model for
recommendation in social rating networks (SocialMF). The potential feature vectors of target
users were described as the weighted average of their trustee’s feature vectors, which significantly
reduced the recommendation error, especially for cold-start users. Guo et al. [8] extended the
SVD++ algorithm [14] into the TrustSVD algorithm, which achieves excellent recommendation
effect through considering the explicit trust relationship of the target user. The above studies
fully demonstrate the promotion effect of the trust relationship on recommendation accuracy.
However, these trust relationships are single and incomplete.

The observations on real-world datasets like FilmTrust, Epinions and Ciao reveal two prob-
lems of trust relationship: (1) the trust relationship may be implicit due to privacy concerns,
resulting in the sparsity of trust information; (2) the trust value is usually binary, i.e. each
trusted friend has the same impact on the target user, which goes against the reality.

Therefore, the existing trust information can not fully mine the implicit user preference
information in social networks. It is necessary to accurately measure the trust relationship, and
examine its role in item recommendation.

The user behavior in the social network is also greatly affected by social influence. The
behavior or opinion of a person is easily affected by influential users, who seem to be authoritative.
To study user behavior, it is both necessary and difficult to analyze the influence of different
users out of the massive information in the social network.

The social influence analysis [18] refers to the evaluation of the impact of each user against
his/her information or social behavior by a certain standard, with the aim to identify those with
significant impact in his/her group or the social network.

In fact, the analysis of social influence has attracted much attention from scholars engaging in
sociology, management, information science, and economics [1,12,13,15,22]. For example, Kitsak
et al. [13] empirically investigated social networks and email networks, and divided network nodes
by k-kernel decomposition to different levels from the edge to the core, revealing that the most
influential nodes are not necessarily the most connected nodes, i.e. those with high betweenness
centrality, but those with high k-shells. Nevertheless, in the field of recommendation system,
trust plays an important role, and social influence is neglected by most researchers, although it
is also a key factor affecting the behavior of social network users.

Our research is mainly motivated by two issues. On the one hand, the existing trust-based
recommendation methods usually use the binary trust relationship of social networks directly to
improve the recommendation quality, seldom considering the difference and potential impact of
trust intensity among users, and also can not fully tap the implicit user preference information
in social networks, which can not achieve good recommendation effect in the case of sparse trust
data.

On the other hand, the existing recommendation systems rarely utilize social influence, de-
spite its significant impact on user behavior in the social network. To overcome these two defects,
this paper fully excavates the implicit social relationships among users in the social network, and
propose a top-N item recommendation algorithm which fusion of trust and social influence, called


592 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

Factored user and item Similarity Model with Trust and social Influence based on Deep learning
(FSTID). The architecture of our recommendation algorithm is shown in Fig. 1. Finally, the
proposed algorithm was proved more accurate and effective than other recommendation methods
on three datasets (Epinions, Ciao and FilmTrust).

Figure 1: The architecture our recommendation algorithm

There are four main contributions of this research:

(1) A novel trust measurement model was conducted to quantify the implicit trust in social
network. The trust value covers three aspects, namely, user interaction, item rating and user
preference. The combination of the two kinds of relationship between trust and similarity are
compact, which are effective to solve the extreme data sparsity of recommendation system. In
addition, our model has played a significant positive role correspond to accurate positioning
neighbor users.

(2) The ISH method was proposed to pinpoint the key nodes, i.e. the most influential users, in
the social network.

(3) Considering the influence of user trust and social influence on recommendation, an innovative
top-N recommendation model with the incorporation of user and item similarity, trust and
social influence is proposed. Besides, Deep Autoencoder (DAE) technology was employed to
optimize the initial values of the latent features for MF.

(4) Many top-N recommendation algorithms were compared through experiments on three real-
world datasets. The comparison shows that our approach can alleviate the sparse problem
of rating data to a certain extent and obtain more reliable recommendations.

The remainder of this paper is organized as follows: Section 2 details the construction of
our algorithm; Section 3 compares our algorithm with other top-N recommendation algorithms
through experiments; Section 4 puts forward the conclusions and foretells the future research.


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 593

2 Model construction

This section introduces the overall framework of the FSTID and details the four key stages
of the method in seven steps. Before presenting the FSTID, a new trust measurement model
was first established, the ISH method was employed to identify the most influential nodes in the
social network, and the DAE was adopted to extract user and item feature vectors.

Let U = {u1,u2, · · · ,um} be the set of m users, I = {i1, i2, · · · , in} be the set of n items,
R = [ru,i]m×n be the binary matrix of the item ratings by the user (ru,i = 1 means user u has
purchased or rated item i; otherwise, user u has not purchased or rated item), G = (V,E) be the
graph of social network (V is the set all nodes (users) and E is the set of all edges), and < i,j >
be the edge from node i to node j, that is, the trust relationship between users i and j.

2.1 Overall framework

Figure 2: The overall framework of the FSTID

As shown in Fig. 2, the procedure of the FSTID algorithm roughly contains four key stages:
the calculation of trust value, the identification of influential users, feature extraction, and top-
N recommendation. Firstly, the trust value is computed based on the initial trust value and
preference degree of each user, forming the trust network. Then, the ISH is adopted to identify
the key nodes in the social network, and allocate them into the set of influential users. After
that, the features of users and items are extracted by the DAE.

Finally, the CF is applied to predict the item ratings, and provide personalized recommen-
dation. The four stages are implemented in seven steps:

(1) Calculation of trust value
Step 1: The trust value of each user is initialized based on his/her interaction information

with others. The term interaction is defined as two users rating on the same item i.
Step 2: Considering the mutual influence between trust value and interactions, and the

users’ tendency to trust those who have interacted successfully on their favorite items, the author
measured the user preference over different items by preference degree.

Step 3: Based on user preference over item i, different weights are assigned to items in each
interaction, then generate the final trust value, and the trust network is formed after filtering


594 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

out the weak trust relationship.
(2) Identification of influential users
Step 4: The constraint value of each node is computed according to the topology of its

neighbors in the directed trust network. Taking the constraint value as the indicator, the impor-
tance of each node is evaluated and recorded as "influence". Finally, the most influential users
are determined and allocated to the same set.

(3) Feature extraction
Steps 5 and 6: By using DAE to learn users’ behaviors without supervision so that the

high-dimensional, sparse users’ behaviors can be compressed into low-dimensional, dense users
and items feature vectors. Compared to initial features, these features are no longer sparse, but
also more representative.

(4) Top-N recommendation
Step 7: Combining the user similarities and item similarities, we introduce trust users and

influential users to predict rating and offer personalized recommendation for users

2.2 Calculation of trust value (Steps 1 3)

The trust value can be expressed as T = [tu,v]m×m, where tu,v is a nonzero element indicating
the existence of social relationship between users u and v. In the real world, the trust relationship
of most online social networks, e.g. Facebook, Epinions and Flixster, is usually represented in
binary form (0 or 1), which is not exactly the same as the trust relationship of users.

This subsection sets up a novel trust measurement model, regardless if the trust value obeys
explicit distribution. First, two assumptions were put forward: (1) an interaction is defined
as two users perform a rating on the same item; (2) the trust value comes from cumulative
experience of subjective individuals. Under these assumptions, the trust value can be initialized
as:

Init(u,v) =
min(|Iu

⋂
Iv|,Du)

Du
(1)

where Iu
⋂
Iv is the number of interactions between users u and v, i.e. the number of items

rated by both users; Du =
√
|Iu| is an adjustable threshold specifying the minimum number of

interactions between the two users that fully trust each other. The trust relationship is not a
stable factor, which along with influence of interaction experience will be changed. A successful
interaction would increase the trust value and an unsuccessful interaction act oppositely. If both
users u and v have rated item i, then the interaction is successful if the two ratings are both
above or below the average rating of user himself:{

successful, (ru,i − ū) ∗ (rv,i − v̄) > 0
unsuccessful, (ru,i − ū) ∗ (rv,i − v̄) < 0

(2)

where ū and v̄ are the average rating of users u and v on all items, respectively.
The trust value differs with the preference degree of each user. Let Pre(u,i) be the preference

degree of user u over item i, Ui be the set of users that have rated item i, and o be a user in
the set Ui. Then, the similarity between users u and o can be calculated using the Pearson
correlation coefficient (PCC):

sim(u,o) =


1

2
+

∑
i∈Iu∩Io(ru,i − ū)(ro,i − ō)

2 ∗
√∑

i∈Iu∩Io(ru,i − ū)
2
√∑

i∈Iu∩Io(ro,i − ō)
2


∗ |Iu ∩ Io|

|Iu|
(3)

Pre(u,i) =

∑
o∈Ui sim(u,o)

|Ui|
(4)


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 595

Based on the user preferences in successful or unsuccessful interactions, different weights
were assigned to different items. On this basis, the final trust value T(u,v) can be obtained as:

T(u,v) =

∑
i∈successful Pre(u,i) −

∑
i∈unsuccessful Pre(u,i)∑

i∈successful Pre(u,i) +
∑

i∈unsuccessful Pre(u,i)
∗ Init(u,v) (5)

2.3 Identification of influential nodes (Step 4)

In the social network, each user collects and disseminates information as a node. Some of
them are more influential than others. These key nodes are similar to Structural Hole (SH)
occupants in the competitive group theory proposed by Burt [3]. The SH occupants play a
key role in information exchange between the local groups and enjoys more opportunities and
options than other group members. Therefore, the author improved the SH method to integrate
the impacts of in- and out- degrees of neighbor nodes in the directed trust network. The ISH is
illustrated with an instance in Fig. 3 and Table 1. Here, the in-degree is defined as the number
of nodes pointing towards a node, and the out-degree has the opposite meaning.

The SH is generally evaluated by two indices: the betweenness centrality and the index
given by Burt himself. The former refers to Freeman’s betweenness centrality [6] for the overall
network and its extended form. If a node is on the shortest path of many pairs of other nodes,
then the node has a high betweenness centrality, and a high probability to be a SH occupant.
The latter involves four aspects: effective size, efficiency, constraint and hierarchy. Constraints
refer to the node’s ability to use SH in its own networks, which is more important than the other
three aspects. The calculation formula of constraint is as follows, describing the degree to which
a node in a network is directly or indirectly connected to other nodes:

C(i) =
∑
j∈Γ(i)


p(j,i) + ∑

q∈Γ(i)

p(j,q) ∗p(q,i)


2 , i 6= q 6= j (6)

where Γ(i) refers to the nodes directly connected to node i in the undirected graph; p(j,i)
is the proportion of the energy that node i devotes to maintain its relationship with node j to
the total energy of node i. Since the trust network is a directed graph, Γ(i) was replaced with
T−i to describe the set of nodes pointing towards node i. The value of p(j,i) can be calculated
by:

p(j,i) =
Zji∑

j∈T−i
Zji

(7)

where Zji = 1 if < i,j >∈ E (otherwise, Zji = 0); p(j,i) +
∑

q∈Γ(i) p(j,q)∗p(q,i) represents
the time and energy that node i invests directly or indirectly to maintain its connection with
node j, when node j is the only node pointing to node i, the item reaches a maximum value of
1; this item reaches a minimum when there is no bridge node between node i and node j.

Based on the C(i) value obtained from Eq. (6), the number and closeness of neighbors
can be evaluated in a comprehensive manner. The C(i) value is negatively correlated with the
number of nodes pointing towards the target node, and positively with the closeness between
these nodes. If closely distributed, the nodes are less likely to acquire new information. On the
contrary, nodes with small constraint coefficients can promote information propagation.

Table 1 lists the constraint values of all nodes in Fig.3 (a), respectively calculated by the SH
and the ISH. During the calculation, any node with no node pointing towards it was neglected.
As shown in the table, nodes u4 and u7 have the same constraint value, i.e. the same influence.
The information flow theory [20] holds that the information on the Internet often propagates
to the opinion leader, before spreading to the others distributed in a wide range. In this sense,


596 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

Figure 3: Two toy examples of trust network topological graph

Table 1: Constraint values of all nodes in Fig. 3(a) obtained by the SH and the ISH

C(i) u1 u4 u7 u8

SH 0.6667 0.3333 0.3333 1

ISH 0.5555 0.3333 0.0625 1

a node connected to multiple opinion leaders is likely to become an SH occupant. Taking the
in-degree of node as the evaluation standard for opinion leader, nodes u1,u4 and u7 with the
highest in-degrees must be opinion leaders, and play a greater role than the other nodes (e.g.
u8 and u9) in information dissemination. Among them, node u7 are connected to two opinion
leaders at the same time, indicating that it is more conducive to information dissemination than
nodes u1 and u4. In other words, node u7 is more likely to occupy an SH than the other two
nodes. The SH results in Table 1 are obviously unreasonable, as many important nodes are not
found. This is because the traditional SH only measures the relationship between a node and its
nearest neighbor, without considering that between the node and its two-hop neighbors.

For convenience, it is assumed that a key node j can boost the influence of the node i it
points towards. The nodes pointing away from node j were also taken into account. As shown in
Fig.3(b), the influence of node u6 is reduced due to the fact that the only node pointing towards
it u1 has a node pointing away from it u5. Therefore, the ISH integrating the impacts of in- and
out-degrees of neighbor nodes can be expressed as:

C(i) =
∑
j∈T−i

|T +j |
|T +j | + |T

−
j |
∗


p(j,i) + ∑

q∈T−i

p(j,q) ∗p(q,i)


2 , i 6= q 6= j (8)

where T +j is the set of nodes pointing away from node j.
The validity of the ISH was tested with the instance in Fig.3(a). The test results are recorded

in Table 1. It can be seen that node u7, with the lowest constraint value, is the most influential
node. Both u1 and u4 had an in-degree of 3, yet u8, a node pointing towards u1, has a connection
with u5, which reduces the influence of u1 on u8. Thus, the influence of u4 is higher than that
of u1.


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 597

After the constraint values of all nodes were obtained by Eq. (8), the top k% of users with
the smallest constraint values were selected as global influential users IU.

2.4 Feature extraction (step 5 and step 6)

Figure 4: The flowchart of feature extraction.

Considering the sheer number of users and items, the user vectors and item vectors were
respectively mapped by the MF to d-dimensional joint potential factor spaces Pd×m and Xd×n.
Explain intuitively that matrix P represents the preferences of m users on d topics, and X
represents the d topics which are concerned by users behind n items, then the rating matrix R
can be approximately expressed as the product of two matrices: R = PTX. Most MF-based
recommendations initialize P and X randomly and iteratively output local optimal solutions,
which depends heavily on the initial values [4]. This paper attempts to make more accurate
recommendations by mining out accurate feature vectors from the implicit features of the MF.
To this end, the DAE, a deep neural network, was adopted to compress high-dimensional, sparse
user behaviors into low-dimensional, dense feature vectors of users and items. This approach
provides good initial values to the network through unsupervised layer-by-layer pre-training,
and adjust the network weights by supervised fine-tuning training, making it possible to extract
key information from the data and form features [24].

The primary goal is to design an item-based (user-based) DAE to map the observed Ui(Iu)
into a low-dimensional potential (implicit) space, and then reconstruct Ui(Iu) in the output
space to learn advanced, abstract features. The entire workflow is illustrated in Fig.4. Let
S = [I1, ...,Ii, ...,Im] be the input. The specific steps of the DAE to extract user feature matrix
P are given below:

Step 1: Given an input S, let Wi ∈ Rd×m(i ∈ [1,L− 1]) be a weight matrix, bi ∈ Rd be the
bias vectors and σ(x) = 1

1+ex
be the sigmoid formula. Then, the d-dimensional implicit feature

hi can be expressed as:

h1 = σ(W1S + b1) (9)
hi = σ(Wihi−1 + bi) (10)

Step 2: Given the weight matrix WL ∈ Rm×d between hidden and output layers, and bias


598 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

vector bL ∈ Rm, the DAE can reconstruct the original data Ŝ from the hidden layer hL−1 by:

Ŝ = σ(WLhL−1 + bL) (11)

Step 3: The pretraining program of the DAE intends to minimize the objective function by
adjusting weight matrices W and bias vectors b:

L =
1

2m

m∑
i=1

∣∣∣Ŝi −Si∣∣∣2 + λ
2
|W1|2 + . . . +

λ

2
|WL|

2 (12)

where Ŝi and Si represent the i-dimensional vectors of Ŝ and S, respectively; |·|2 is the
squared L2 norm of vectors or matrices, i.e. the sum of squared dimensional values; the first factor
in objective function denotes the error which is employed to minimize the error of reconstructed
data Ŝi and original data Si; the other factors are regular terms used to prevent overfitting.

Step 4: Update weight matrix W in each iteration by:

W = W − l×
∂L

∂W
(13)

where l refers to learning rate. According to the above formula, bias vectors b update process
act in the same way.

The d-dimensional feature vector of the users and the items were obtained by repeating the
above steps to continuously optimize the model until the end of training.

2.5 Top-N recommendation agent (step 7)

Our approach FSTID, is based on the model proposed by Guo called Factored user and item
Similarity model with social Trust (FST) [9], which predict user u’s rating on item i through
four parts: (1) item bias bi; (2) the similarity between user u and another user v who has rated
item i: pTv qu, where pv,qu are the implicit feature vectors of users v and u, respectively; (3) the
similarity between item i and another item j which has been rated by user u: xTj yi, where xj
and yi are the implicit feature vectors of items j and i, respectively; (4) influence of any user u’s
trusted user w on targeted item i: pTwyi. The rating prediction formula can be expressed as:

r̂u,i = s|Ui−u|−β
∑

v∈Ui−u

pTv qu + (1 −s)|Iu−i|
−α

∑
j∈Iu−i

xTj yi + |Tu|
−z
∑
w∈Tu

pTwyi + bi (14)

where Ui−u refers to the set of users having rated item i except user u; Iu−i refers to the
set of items having been rated by user u except item i; Tu is the set of users trusted by user
u; s ∈ [0, 1] is the relative importance of user similarity; α,β,z are parameters related to the
number of rated items, similar users and trusted users, respectively. Taking β for instance, β = 1
refers to the mean user similarity, item i will get a high rating only if all users in Ui−u are similar
to user u; when β = 0, this item calculates the sum of the similarities between user u and user
in Ui−u, which means that item i will get a high rating even if only a few users are similar to
user u. In conclusion, the probability of obtaining a high predictive rating decreases with the
increase of parameter β.

The above analysis shows the critical importance of the influential users’ ratings on item i.
In addition, Guo et al. [9] confirmed that the trustee of user u also affects user u’s rating on item
i. Thus, the rating prediction formula of the FST can be modified as:

r̂u,i =s|Ui−u|−β
∑

v∈Ui−u

pTv qu + (1 −s)|Iu−i|
−α

∑
j∈Iu−i

xTj yi + δ|T
+
u |
−z
∑
o∈T+u

pTo yi+

(1 − δ)|T−u |
−z
∑
c∈T−u

pTc yi + |IU|
−µ
∑
f∈IU

pTf yi + bi

(15)


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 599

The trustees in the FST were divided into a set of trustees T +u and a set of trustors T
−
u ,

aiming to disclose their impacts on the rating of item i. The parameter δ ∈ [0, 1] was adopted to
control the weights of the two types of users. When δ = 0, it represents the influence of trustees is
totally ignored. To the contrary, δ = 1 denotes only the influence of trustees is considered. µ ≥ 0
refers to the parameter of the number of influential users. For each influential user f ∈ IU, its
influence over item i is described by the inner product pTf yi. In addition, the model parameters
b,P,Q,X and Y can be trained by minimizing the objective function below:

J =
1

2

∑
u∈U

∑
i∈I+u ,j∈I−u

|(ru,i −ru,j) − (r̂u,i − r̂u,j)|2F +

λ

2
(|P |2F + |Q|

2
F + |X|

2
F + |Y |

2
F + |b|

2
F )

(16)

where U is the set of all users; I+u and I
−
u respectively refers to the rated items and unrated

items of user u. For simplicity, parameter λ was taken as the regularization term in all cases.
The pseudocode for model learning is shown in Algorithm 0, where α, β, z, µ, s and δ are control

Algorithm 1 The learning algorithm of FSTID
Data:
U ← user set; R ← user item matrix;T ← user trust matrix;
IU ← influential user set; P ← user feature matrix; X ← item feature matrix;
Input:
α, β, z, µ, s, δ, ρ, λ, η and iteration limitation L;
Output:
b, P, Q, X, Y

1: Initialize bias vector b with random values in (0, 0.01), and set Q = P,Y = X;
2: while J not converged or the number of iterations < L do
3: for all u ∈ U do
4: for all i ∈ I+u do
5: Z ← sample(ρ,I−u )
6: Compute Loss J by Eq. (16)
7: Update bi,bj,qu,yi,yj,pv,xk,po,pc,pf according SGD:
8: bi ← bi −η × ∂J∂bi , qu ← qu −η ×

∂J
∂qu

, yi ← yi −η × ∂J∂yi
9: for all j ∈ Z do

10: bj ← bj −η × ∂J∂bj , yj ← yj −η ×
∂J
∂yj

11: ∀v ∈ Uj−u,pv ← pv −η × ∂J∂pv
12: end for
13: ∀v ∈ Ui−u,pv ← pv −η × ∂J∂pv , ∀k ∈ Uu−i,xk ← xk −η ×

∂J
∂xk

14: ∀o ∈ T +u ,po ← po −η ×
∂J
∂po

, ∀c ∈ T−u ,pc ← pc −η ×
∂J
∂pc

15: ∀f ∈ IU,pf ← pf −η × ∂J∂pf
16: end for
17: end for
18: iteration number++
19: end while
20: return b, P, Q, X, Y

parameters, λ is the regularization parameter, η is the initial learning rate and ρ is the number of
samples. Firstly, ρ unrated items are selected randomly for each user to train the model (Line 6).


600 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

Then, the parameters are optimized continuously in the training phase by Stochastic Gradient
Descent (SGD) (Lines 7-15) until the loss function converges or the preset maximum number
of iterations is reached (Line 3). Finally, the learning vectors and matrices are outputted (Line
20). As for comparative experiments, we will make detailed comparisons and explanations in
subsequent experiments to express the advantages of our algorithm.

3 Experiments and results analysis

This section carries out a series of experiments on three real-world datasets, trying to answer
the following questions:

(1) Does the ISH enjoy higher accuracy in identifying influential users than other influence
measurement methods?

(2) How do model parameters like α, β, z and µ affect recommendation accuracy?
(3) How does the number of influential users affect recommendation accuracy?
(4) Does the FSTID outperform the other advanced trust-aware recommendation algorithms

on typical sparse datasets?

3.1 Datasets and evaluation metrics

There real-world datasets were selected for our experiments, including FilmTrust, Ciao and
Epinions, which are currently well-known test data sets. Most trust-based recommendation
algorithms are used to test the performance of algorithms because they contain both user ratings
and trust relationships. FilmTrust is a dataset from a movie sharing website. The data entries in
the dataset are movie ratings based on a four-point scale. Ciao and Epinions are datasets from
two famous consumer review websites, on which users rate commodities against a five-point scale
and refer to others’ ratings before deciding on whether to purchase a commodity. As shown in
Table 2, all the three datasets are extremely sparse in nature.

Table 2: Specification of the used datasets

Data Set Users Items Ratings Density

Epinions 40163 139738 664824 0.0118%

Ciao 7375 99746 139738 0.0379%

FilmTrust 1508 2071 40163 1.14%

*Density= #Ratings
#Users×#Items

The 5-fold cross validation was adopted in our experiments. Each dataset was split randomly
into 5 equal parts. In each iteration, 1 part was selected as the test set and the other 4 parts
as the training set. The mean results of the 5 parts were taken as the final results. Different
from the evaluation of rating prediction, the top-N recommendation performance was measured
by two metrics: precision and F1-measure. F1-measure is the weighted average of precision and
recall rate (the proportion of all recommended items in the items rated by the target user).
Let U ′ be the set of users in the test set and RN (u) be the top-N items recommended to user
u. Then, the precision, recall rate and F1-measure under N recommended items, respectively


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 601

denoted as P@N, R@N and F1@N, can be calculated by:

P@N =
1

|U ′|
∑
u∈U′

|RN (u) ∩ Iu|
N

(17)

R@N =
1

|U ′|
∑
u∈U′

|RN (u) ∩ Iu|
Iu

(18)

F1@N =
2 ∗P@N ∗R@N
P@N + R@N

(19)

where N ∈{5, 10} is the number of items recommended to the target user. The greater the
values of P@N and F1@N, the more accurate the top-N recommendation.

3.2 Social influence analysis

The Susceptible Infected (SI) epidemic model [15] was selected to evaluate the effects of
model parameters on recommendation results. Mimicking the virus propagation process, this
classic model is a mature method to simulate the transmission of information. In this model,
each network node either exists in the susceptible S status or in the infected I status. Once
infected, a susceptible node will irreversibly become an infected node, and transmit virus to its
neighbor nodes at the probability of γ ∈ (0, 1). In our experiments, the probability γ was set to
0.001 and the network nodes were initialized randomly. The influence of each network node on
the three datasets were evaluated by the parameter Si at transmission time t = 10. The mean
value of Si can be calculated by:

S̄i =
1

M

M∑
m=1

Si (20)

where M is the repeated times of node i.
Our method ISH was compared with the SH and the Degree Centrality (DC) in terms of the

social influence over the three datasets. The experimental results are presented in Fig.5, where
the x-axis is the calculated influence of each node and the y-axis is the actual influence of each
node.

The purple area in Fig.5 are low-impact nodes. The SH and the DC had almost the same
number of low-impact nodes on all three datasets. Comparatively, the SH nodes ranked lower
than the DC nodes, indicating that the SH enjoyed better effect than the DC. Meanwhile, the
ISH achieved the lowest proportion of low-impact nodes among the three methods. Most ISH
nodes concentrated in the second half of the ranking.

The red area in Fig.5 are high-impact nodes. By analysis of top-25 nodes, the number of
high-impact nodes calculated by DC, SH, and ISH are 18, 15 and 18 in FilmTrust; the number
of high-impact nodes is 8, 9, 11 respectively in Ciao and also is 9, 8, 10 in Epinions. The results
are very close, revealing that the node with higher degree is more important.

As shown in Fig.5, the ISH achieved the highest mean value on all datasets, that is, our
algorithm outperformed the other methods in identification. To sum up, the ISH can identify
the top k% influential users correctly, without mistaking low-impact users as high-impact users.

3.3 Effects of model parameters

Effect of parameters α, β, z, µ

The parameters α, β, z, µ respectively control the influence of item similarity, user simi-
larity, trustees, and influential users on the item ratings. To save time and space, the values of


602 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

Mean:2198.6

500

1000

1500

2000

2500

3000

3500

c  ISH

f  ISH

Figure 5: Correlation analysis of different methods and their actual influence. (a)-(c) Results on
FilmTrust; (d)-(f) Results on Ciao; (g)-(i) Results on Epinions.

these parameters were limited to a small set: 0.5, 1, 2 while the values of s and δ were fixed
at 0.5 in our experiments. For simplicity, only the five best P@10 results on each dataset are
listed in Table 3. Obviously, the different parameter configurations led to different results, and
the optimal parameters changed from dataset to dataset. The optimal values of α, β, z and µ
on Epinions, Ciao and FilmTrust were (0.5, 2, 2, 0.5), (0.5, 2, 0.5, 0.5) and (0.5, 1, 0.5, 0.5),
respectively. It can be concluded that the optimal combination is α = 0.5, β, z > 1 and µ < 1,
that is, the item similarity and influential users are more important than user similarity and
trustees.

Table 3: The five best P@10 results on each dataset.

Epinions Ciao FilmTrust
1 0.010486 (0.5, 2, 2, 0.5) 0.02378 (0.5, 2, 2, 2) 0.352258 (2, 1, 1, 0.5)
2 0.010467 (1, 2, 0.5, 0.5) 0.023609 (0.5, 0.5, 0.5, 0.5) 0.351964 (2, 0.5, 2, 2)
3 0.010379 (2, 1, 2, 0.5) 0.023411 (1, 2, 1, 1) 0.351948 (0.5, 1, 0.5, 1)
4 0.010243 (1, 2, 1, 1) 0.023396 (0.5, 2, 1, 0.5) 0.351819 (1, 1, 0.5, 0.5)
5 0.010231 (0.5, 2, 0.5, 2) 0.02336 (2, 0.5, 0.5, 1) 0.351775 (0.5, 0.5, 0.5, 0.5)

Effect of s and δ

The parameters s and δ from Eq.(15) respectively describe the relative importance of user
similarity and trustees on the prediction of item ratings. The value of s is positively correlated
with the impact of user similarity, while the value of δ is negatively correlated with the impact
of trustees. In our experiments, α, β, z, µ were set to the optimal values, while the values of


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 603

s and δ were increased at a step of 0.1 in the range [0, 1]. Firstly, the value of s was set to
0.5 to obtain a series of results on δ and determine the value of δ. Then, the experiments were
conducted with s. According to the experimental results in Fig. 6, the value of P@10 was worse
than most other values at δ = 0 or δ = 1. This means the recommendation accuracy can be
improved effectively through the proper combination of trustors and trustees. This conclusion
also applies to parameter s. The value of s can be set to 0.3, despite the difference in its optimal
value on different datasets.

0.0 0.2 0.4 0.6 0.8 1.0

0.00910

0.00936

0.00962

0.00988

0.01014

0.0 0.2 0.4 0.6 0.8 1.0

0.0210

0.0216

0.0222

0.0228

0.0234

0.0 0.2 0.4 0.6 0.8 1.0

0.3480

0.3492

0.3504

0.3516

0.3528

0.3540

P
@
1
0

P
@
1
0

 s
 

(a)Epinions (b)Ciao

 s
 

P
@
1
0

(c)FilmTrust

 s
 

Figure 6: The effects of s and δ on FSTID at P@10

0.0 0.2 0.4 0.6 0.8 1.0

0.00910

0.00936

0.00962

0.00988

0.01014

0.0 0.2 0.4 0.6 0.8 1.0

0.0210

0.0216

0.0222

0.0228

0.0234

0.0 0.2 0.4 0.6 0.8 1.0

0.3480

0.3492

0.3504

0.3516

0.3528

0.3540

P
@
1
0

P
@
1
0

 s
 

(a)Epinions (b)Ciao

 s
 

P
@
1
0

(c)FilmTrust

 s
 

Figure 7: The effects of parameters k on FSTID at P@10

Effect of parameter k

The FSTID takes the top k% users with higher influence as influential users, as only a tiny
fraction of users in the social network have a great influence. To verify the impact of parameter
k on the recommendation performance, the value of the parameter was increased at the step of
1 in the interval [0, 10]. The experimental results are displayed in Fig. 7. It can be seen that
the optimal value of k was 2, 4 and 7 in Epinions, Ciao, and FilmTrust, respectively. Moreover,
the poorest result was observed on the three datasets at k = 0, i.e. ignoring influential users.
Hence, the consideration of influential users can indeed enhance the recommendation effect.

3.4 Comparison with other methods

Several top-N recommendation algorithms were selected to evaluate the effect of our method,
such as MostPop, FST, Group Bayesian Personalized Ranking (GBPR), the factored item sim-
ilarity model (FISM), and the FSTID-. The MostPop is the baseline method that ranks the
ratings of an item by popularity, i.e. how frequently the item is rated or consumed by the user.
Proposed by Guo et al., the FST [8] takes account of implicit user feedback, similarity and social
trust. The GBPR was improved from the Bayesian Personalized Ranking (BPR) by Pan and
Chen [16], which introduces the influence of social groups on user preference to enhance the item
recommendation quality. FISM is a top-N recommendation method based on item similarity


604 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

proposed by Kabbur et al. [11]. The FSTID- removes the DAE pretraining from our approach
and randomly sets the values of P,Q,X and Y in the interval of (0, 0.01).

Table 4: Results of different methods at N=5 and 10. The optimal results are in bold characters.

Datasets N Metrics MostPop GBPR FISM FST FSTID- FSTID

Epinions

5 P@N 0.01169 0.009353 0.01147 0.01179 0.011888 0.01231
10 F1@N 0.01298 0.01103 0.01307 0.0133 0.013107 0.01402
5 P@N 0.009171 0.00756 0.00902 0.009187 0.009486 0.01024
10 F1@N 0.01305 0.01111 0.01315 0.01328 0.014431 0.01459

Ciao

5 P@N 0.02677 0.02228 0.02704 0.02741 0.027382 0.0283
10 F1@N 0.02436 0.02063 0.02495 0.02523 0.025442 0.02644
5 P@N 0.02142 0.01827 0.02141 0.02174 0.022376 0.02329
10 F1@N 0.02662 0.02116 0.02687 0.0272 0.028402 0.02914

FilmTrust

5 P@N 0.4170 0.4124 0.4171 0.4191 0.418567 0.4198
10 F1@N 0.4095 0.4051 0.4087 0.4099 0.411364 0.4116
5 P@N 0.3503 0.3470 0.3503 0.3514 0.352159 0.3532
10 F1@N 0.4518 0.4458 0.4516 0.4521 0.452825 0.4541

It can be seen from Table 4 that the FSTID achieved better results than the contrastive
methods on all datasets, proving that our approach can alleviate data sparsity and make re-
liable recommendations. Besides, the FSTID, which relies on the DAE to initialize the MF,
outperformed the FSTID-, revealing that the features extracted by the DAE are more suitable
than randomly generated features. On the other hand, FSTID- because there is no need to
utilize DAE for feature initialization, it has lower computational cost than FSTID, and in most
experiments it has achieved the best results compared with other approach except the FSTID
method, which indicates the effectiveness of the integration of social influences to improve the
recommended performance. Furthermore, the FSTID’s advantage over the other methods were
smaller on FilmTrust than Ciao and Epinions, due to the relatively small scale of FilmTrust.
This means the DAE can capture features more effectively in big data.

4 Conclusion

This paper proposes a novel factored similarity model for top-N recommendation based on
trust and social influence. Firstly, we quantify the trust value between users based on the user’s
interaction information and similarity, and finally build the user’s trust association graph. In
the process of calculating the trust degree, the penalty factor based on the common rating item
is added to reduce the accidental effect caused by the low number of common rating items,
which greatly increases the attack cost of malicious users, makes it difficult to become the
neighbors of target users, and effectively enhances the stability of the algorithm. Then, the
ISH was utilized to identify the influential users in a complex network. Moreover, the DAE
was adopted to compress high-dimensional, sparse user behaviors into low-dimensional, dense
vectors of user and item features. Compared with other top-N recommendation approaches, our
approach achieved excellent recommendation effects on three real-world datasets. The future
research will consider the dynamic changes in user preference and trust with the time of user
rating, and try to recommend items preferred by users in a timely manner.


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 605

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 61473108),
the Fundamental Public Welfare Research Program of Zhejiang Province (No. GF19F020052)
and the National Experimental Teaching Demonstration Center of Computer-Science in China.

Bibliography

[1] Anagnostopoulos, A.; Kumar, R.; Mahdian, M. (2008). Influence and correlation in social
networks, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, 7–15, 2008.

[2] Bao, Y.; Fang, H.; Zhang, J. (2014). Leveraging decomposed trust in probabilistic matrix
factorization for effective recommendation, Proceedings of the 28th AAAI Conference on
Artificial Intelligence (AAAI), 350: 30–36, 2014.

[3] Burt, R.S. (2009); Structural holes: The social structure of competition, Harvard university
press, 2009.

[4] Deng, S.; Huang, L.; Xu, G. et al. (2017). On deep learning for trust-aware recommendations
in social networks, IEEE Transactions on Neural Networks & Learning Systems, 28(5), 1164–
1177, 2017.

[5] Deng, X.Y.; Wang, C. (2018). A hybrid collaborative filtering model with context and
folksonomy for social recommendation, Ingenierie des Systemes d’Information, 23(5), 139–
157, 2018.

[6] Freeman, L.C. (1977); A set of measures of centrality based on betweenness, Sociometry,
40(1), 35–41, 1977.

[7] Guy, I.; Ronen, I.; Wilcox, E. (2009). Do you know?: recommending people to invite into
your social network, Proceedings of the 14th International Conference on Intelligent User
Interfaces, 77–86, 2009.

[8] Guo, G.; Zhang, J.; Yorke-Smith, N. (2016). A novel recommendation model regularized
with user trust and item ratings, IEEE Transactions on Knowledge & Data Engineering,
28(7), 1607–1620, 2016.

[9] Guo, G.; Zhang, J.; Zhu, F. et al. (2017). Factored similarity models with social trust for
top-n item recommendation, Knowledge-Based Systems, 122, 17-25, 2017.

[10] Jamali, M.; Ester; M. (2010). A matrix factorization technique with trust propagation for
recommendation in social networks, Proceedings of the fourth ACM conference on Recom-
mender systems, 135–142, 2010.

[11] Kabbur, S.; Ning, X.; Karypis, G. (2013). Fism: factored item similarity models for top-n
recommender systems, Proceedings of the 19th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, 659-667, 2013.

[12] Kempe, D.; Kleinberg, J.; Tardos, E. (2003). Maximizing the spread of influence through a
social network, Proceedings of the 9th ACM SIGKDD International Conference on Knowl-
edge Discovery and Data Mining, 137–146, 2003.


606 X.F. Zhang, X.L. Chen, D.W. Seng, X.J. Fang

[13] Kitsak, M.; Gallos, L.K.; Havlin, S. et al. (2010). Identification of influential spreaders in
complex networks, Nature Physics, 6(11), 888–893, 2010.

[14] Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collaborative filter-
ing model, Proceedings of the 14th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, 426–434, 2008.

[15] Li, D.; Luo, Z.; Ding, Y. et al. (2017). User-level microblogging recommendation incorpo-
rating social influence, Journal of the Association for Information Science and Technology,
68(3), 553-568, 2017.

[16] Pan, W.; Chen, L. (2013). Gbpr: Group preference based bayesian personalized ranking
for one-class collaborative filtering, Proceedings of The Twenty-Third International Joint
Conference on Artificial Intelligence (IJCAI), 13, 2691-2697, 2013.

[17] Pastorsatorras, R.; Castellano, C.; Mieghem, P.V. et al. (2014). Epidemic processes in
complex networks, Review of Modern Physics, 87(3), 120-131, 2014.

[18] Peng, S.; Wang, G.; Xie, D. (2017). Social influence analysis in social networking big data:
Opportunities and challenges, IEEE Network the Magazine of Global Internetworking, 31(1),
11–17, 2017.

[19] Rendle, S.; Freudenthaler, C.; Gantner, Z. et al.. (2009). Bpr: Bayesian personalized ranking
from implicit feedback, Conference on Uncertainty in Artificial Intelligence, 452-461, 2009.

[20] Rogers, E.M. (1995). The Diffusion of Innovations, Free Press, 1995.

[21] Sedhain, S.; Menon, A.K.; Sanner, S. et al. (2017). Low-rank linear cold-start recommen-
dation from social data, Proceedings of the 31th AAAI Conference on Artificial Intelligence
(AAAI), 1502–1508, 2017.

[22] Xu, W.; Rezvani, M.; Liang, W. et al. (2017). Efficient algorithms for the identification of
top-k structural hole spanners in large social networks, IEEE Transactions on Knowledge
& Data Engineering, 29(5), 1017-1030, 2017.

[23] Xu, K.; Zheng, X.; Cai, Y. et al. (2018). Improving user recommendation by extracting
social topics and interest topics of users in unidirectional social networks, Knowledge Based
Systems, 140, 120–133, 2018.

[24] Yuan, X.; Huang, B.; Wang, Y. et al. (2018). Deep learning based feature representation and
its application for soft sensor modeling with variable-wise weighted SAE, IEEE Transactions
on Industrial Informatics, (99): 1–1, 2018.

[25] Yang, C. ; Sun, M. ; Zhao; W. X. et al. (2016). A neural network approach to joint modeling
social networks and mobile trajectories, Acm Transactions on Information Systems, 35(4),
36, 2016.

[26] Yuan, Q.; Zhao, S.; Chen, L. et al. (2009). Augmenting collaborative recommender by
fusing explicit social relationships, Workshop on Recommender Systems and the Social Web,
Recsys, 2009.

[27] Zhang, Z., Liu, Y., Jin, Z. et al. (2018). A dynamic trust based two-layer neighbor selection
scheme towards online recommender systems, Neurocomputing, 285, 94-103, 2018.

[28] Zhao, H.; Yao, Q.; Kwok, J.T. et al. (2017). Collaborative filtering with social local models,
2017 IEEE International Conference on Data Mining (ICDM), 645–654, 2017.


A Factored Similarity Model with Trust and Social Influence
for Top-N Recommendation 607

Appendix

In order to facilitate the reader to read better, we have listed the abbreviations used in this
article in lexicographic order, as shown in Table 5.

Table 5: Acronyms and Initialisms Dictionary.

Abbreviation Full name
CF Collaborative Filtering
DAE Deep Autoencoder
DC Degree Centrality
FISM Factored Item Similarity Model
FST Factored user and item Similarity model with social Trust

FSTID Factored user and item Similarity Model with Trust and social Influence
based on Deep learning

GBPR Group Bayesian Personalized Ranking
ISH Improved Structural Holes
MF Matrix Factorization
SGD Stochastic Gradient Descent
SH Structural Hole
SI Susceptible Infected