CHEMICAL ENGINEERING TRANSACTIONS  
 

VOL. 51, 2016 

A publication of 

 
The Italian Association 

of Chemical Engineering 
Online at www.aidic.it/cet 

Guest Editors: Tichun Wang, Hongyang Zhang, Lei Tian 
Copyright © 2016, AIDIC Servizi S.r.l., 

ISBN 978-88-95608-43-3; ISSN 2283-9216 

Enterprise Online Product Recommendation Service Model 
based on Big Data Environment 

Weiwei Liu 
Northeast Petroleum University, Qinhuangdao 066004,china 
huntercet@qq.com 

With the development and application of e-commerce, the research on enterprise online product 
recommendation service model under big data background has become a frontier issue. Collaborative filtering 
algorithm is improved based on domain ontology, which calculates semantic similarity of domain ontology from 
two angles of hierarchical similarity and attribute similarity. It combines with the traditional grading similarity to 
dig out semantic relationship between products, and then it draws abstract semantic information. The 
experiment results show that it can significantly improve the recommendation speed. Besides, 
recommendation efficiency is also relatively stable. In dealing with large data, computational efficiency is 
better than the traditional collaborative filtering algorithm, recommendation algorithm based on association 
rules and recommendation algorithm based on content. 

1. Introduction 

With the development and application of e-commerce, the research on personalized information 
recommendation service model under big data background has become a frontier issue (Junyean Moon, et. al,  
2008). E-commerce websites not only provide products and services, but also make users more difficult to find 
the product information, which meet their needs in mass of information quickly and accurately. Personalized 
information recommendation based on big-data could recommend products or services to users according to 
their preference actively in real time. On the one hand, it can better meet the users’ individual needs. On the 
other hand, it could help electric website to establish a stable user groups, improve service quality, and 
thereby enhance market competitiveness (Lee et al., 2012). As the time for big data processing is coming,  
several  problems  that  traditional  recommendation  system  faced  such  as  cold start up, accuracy, and 
scalability are worsen. In the meantime, the real-time problem as a new bottleneck to the recommendation 
systems oriented huge amounts of data under this severe environment arises. How to provide a better user 
experience under the big data environment continues to drive the development of technique. With the 
combination of the distributed system and the grid computing, Cloud computing has developed. The powerful 
capabilities of preserving and processing big data meet the demand of the recommender system in the big 
data era. So, how to optimize and parallelize the mature technique in the traditional recommendation system 
becomes a new area of research (Hu, et al., 2014). 
At present, collaborative filtering technology is widely used in personalized recommendation technology (
Zhu et al., 2014). After comparing the advantages and disadvantages of the existing recommendation 
algorithms (Zhu et al., 2014; Long et al., 2012), according to the characteristics of the product 
recommendation platform and system goal, we choose the collaborative filtering recommendation algorithm. 
Collaborative filtering algorithm is not perfect, however, it still needs further improvement in the late, in order to 
meet the accuracy and speed requirements of the recommendation system (Kuang, 2012; Papadimitriou and 
Disco, 2008). 
The study object of this article is enterprise online product recommendation service model of e-commerce 
based on big-data. It introduces related theoretical basis, including concepts, features and primary technology. 
It constructs a model for e-commerce enterprise online product recommendation influence factors and then 
test and revises it according to the data of the questionnaire survey, which is designed based on factors that 
influence consumers to buy personalized commodity. At the same time, it constructs an enterprise online 

                               
DOI: 10.3303/CET1651128

 
Please cite this article as: Liu W.W., 2016, Enterprise online product recommendation service model based on big data environment, Chemical 
Engineering Transactions, 51, 763-768  DOI:10.3303/CET1651128   

763


product recommendation service model and proposes a model for e-commerce based on achievements of 
scholars both abroad and at home and conclusion by questionnaire survey. Lastly, it carried on case studies 
for enterprise online product recommendation service model, through specific analysis of shopping website, to 
illustrate the feasibility and effectiveness of this model.  
This paper proposes a specific enterprise online product recommendation service model based on big-data, 
accelerating personalized and intelligent development of e-commerce information services. It has brought 
convenience and personalized service to  users  and  huge  economic  benefits  for  e-commerce  enterprise  
and has great significance to personalized information service for e-commerce. In this paper, customer 
evaluation algorithm based on data mining mainly includes the following several aspects. In the next section, 
principle of collaborative filtering recommendation is investigated. In Section 3, improved collaborative filtering 
recommendation based on domain ontology is proposed. In Section 4, in order to test the performance of 
proposed recommendation algorithm, it is used to evaluate customer credit of some rural credit cooperatives 
in China. Finally, some conclusions are given. 

2. Collaborative filtering recommendation 

This algorithm produces recommendation results for the client demanding products. The main idea is to analyze 
the user data to the evaluation of products. Through the similarity, it finds the nearest neighbor client, and it 
recommends product for target users based on the nearest neighbor users. Collaborative filtering 
recommendation algorithm is mainly divided into three steps, the similarity between the user and choosing the 
nearest neighbor users and recommendation based on predicting scores. 
Similarity calculation should calculate personal information of users, evaluation data of products and browsing 
data. The score can use user-product matrix. If evaluation vector of user a and user b are a  and b  
respectively, the similarity between user a and user b is  

, ,

2 2

, ,

( , )

n

a j b jj

n n

a j b jj j

R R
sim a b

R R




 
.  

Ra,j and Rb,j represent the score of user a and user b to product j respectively. After the completion of the 
similarity calculation between the users, the similarity is represented by vector length. The shorter the length of 
the vector, the higher the similarity. In the selection of the nearest neighbor, you have three ways. The similarity 
threshold can be set, and users satisfying the similarity threshold are the nearest neighbors. We can also set 
the number of the nearest neighbors. You can also choose some nearest neighbors meeting similarity threshold. 
At last, the recommendation results are generated. Suppose there are a number of users, K number of nearest 
neighbors meeting the threshold value. pa,t represents score of user a to product t.   

, ,

1

( , )( )
K

a t a u t u

u

P R X sim a u R R


   . 

sim(a, u) represents similarity between user a and its nearest user u.
u

R  represents average score of neighbor 

user u to the product and 
a

R  represents average score of a to the product. Ru,t represents evaluation score of 
user u to product t. 
Big data is a broad term for data sets so large or complex that traditional data processing applications are 
inadequate. Challenges include analysis, capture, creation, search, sharing, storage, transfer, visualization, and 
information privacy. The term often refers simply to the use of predictive analytics or other certain advanced 
methods to extract value from data, and seldom to a particular size of data set. 
Analysis of data sets can find new correlations, to "spot business trends, prevent diseases, and combat crime 
and so on." Scientists, practitioners of media and advertising and governments alike regularly meet difficulties 
with large data sets in areas including Internet search, finance and business informatics. Scientists encounter 
limitations in e-Science work, including meteorology, genomics, complex physics simulations, and biological 
and environmental research. Data sets grow in size in part because they are increasingly being gathered by 
cheap and numerous information-sensing mobile devices, aerial (remote sensing), software logs, cameras, 
microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world's 
technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as 
of 2012, every day 2.5 Exabyte’s (2.5×1018) of data were created; The challenge for large enterprises is 
determining who should own big data initiatives that straddle the entire organization.  
Relational database management systems and desktop statistics and visualization packages often have 
difficulty handling big data. The work instead requires "massively parallel software running on tens, hundreds, 

764


or even thousands of servers". What is considered "big data" varies depending on the capabilities of the users 
and their tools, and expanding capabilities make Big Data a moving target. Thus, what is considered to be "Big" 
in one year will become ordinary in later years. "For some organizations, facing hundreds of gigabytes of data 
for the first time may trigger a need to reconsider data management options. For others, it may take tens or 
hundreds of terabytes before data size becomes a significant consideration." 
This location data retrieved from smart phones can play a vital role in determining the user trend. As mobile 
phone is connected with the base station where each base station represents the cell where user is residing in 
at a particular time it record the spatiotemporal trend of user automatically even without disturbing the user 
routine. So we can apply the data mining techniques on such kind of data to extract meaningful information. As 
this location information can provide the list of significant locations for user during mobility this can be used for 
Location based services (LBS). 
The location information and mobility path can be used for the potential applications which include mobile 
advertisement, city wide route sensing, region pollution, traffic safety management, social networking, potential 
warning system, soured analysis, route tracking expand and communication. In these applications the low level 
mobility data is unity recommendations interoperated into high level information in term of stay locations, 
patterns and finally user profiling. 
In our work we are focused on the precise extraction of mobility profile against user mobility so that it can be 
rendered for any location based service. As described before this mobility profiling is all location based where 
location is logged in by the user using different methods i.e. Indoor and Outdoor. 
There are many ways to record the user mobility which can be Wi-Fi, Bluetooth, Infrared, GPS and GSM 
depending on the situation and type of intended application. Most important work regarding the location 
extraction based on algorithms is done in their work where they formally defined the term mobility mining to 
extract patterns through profiling. While the current mobility trends are studied in detail by their work, the 
mobility is defined as key prediction indicator of human life. 
So the tracking of true location is basis of every mobility based application. As there are two kind of technical 
solutions available for the location recording indoor and outdoor. In case of indoor like Bluetooth, RFID or 
infrared, these short range and cannot compete the outdoor like GSM and GPS which are categorized broadly 
in their work. On the other hand Wifi is another solution to location tracking as well were geo location of the user 
is determined by the terminal it is connected with. As per the feasibility and their wide usage outdoor 
technologies are widely used for location tracking which includes GPS, Assisted faux GPS and GSM. Where 
GPS is coordinated based which provides the exact location of the user in term of longitude and latitude. But 
GPS needs long start-up time on device, high consumption of energy which is discouraging for the user. And 
most importantly there are many applications where exact location of user does not matter and application can 
use the relative position of the user for prediction of trends where GSM can serve the purpose well enough. 

3. Improved collaborative filtering recommendation based on domain ontology 

The constructed domain is product domain ontology. After ontology knowledge base construction has been 
completed, the corresponding product characteristics tree of each product is generated. According to feature 
attributes of the products, we find the location in the knowledge base. After position is determined, according to 
the position information of product in the domain ontology knowledge base, similarity of product is calculated. 
On the one hand, it can solve the drawback of ignoring the semantic relationship between keywords in 
traditional collaborative filtering algorithm to enhance the accuracy of recommendation. On the other hand, the 
establishment of a comprehensive domain knowledge base, positions of each attribute and attribute value are 
fixed. For the products, it only needs to store location information of the corresponding position and does not 
need to inquiry, deal with and storage keywords information in the form of text. It can improve data processing 
speed. Then Hierarchical similarity calculation and attribute similarity calculation is investigated. 
It constructs a model for e-commerce enterprise online product recommendation influence factors and then test 
and revise it according to the data of the questionnaire survey, which is designed based on factors that 
influence consumers to buy personalized commodity. At the same time, it constructs an enterprise online 
product recommendation service model and proposes a model for e-commerce based on achievements of 
scholars both abroad and at home and conclusion by questionnaire survey. Lastly, it carried on case studies for 
enterprise online product recommendation service model, through specific analysis of shopping website, to 
illustrate the feasibility and effectiveness of this model. How to provide a better user experience under the big 
data environment continues to drive the development of technique. With the combination of the distributed 
system and the grid computing, Cloud computing has developed. 
If A and B belongs to the same branch class, semantic distance between A and B is  
 d(A,B)=dep(B)-dep(A). 

765


dep(x) represents the depth of class x in the hierarchical structure. If A and B belongs to the heterogeneous 
class, semantic distance between A and B is 
d(A,B)=dep(A,R)+dep(B,R). 
The semantic similarity between A  and B  is 
Csim(A,B)=1/(d(A,B)2+1). 
Hierarchical similarity between instance I1 and I2 is 

1 2

1 2 1 2

1,  

( , ) ( ( ), ( ))
,

2

if I I

Isim I I Csim C I C I
otherwise




 



. 

C(I1)represents class that instance I1 belongs to. C (I2) represents class that instance I2 belongs to. Suppose 
ontology class C1 has instance I1, value of attribute M1 is m1, and value of attribute M2 is m2, which can be 
represented as I1= C1[M],   M=(m1,m2,L,mn)    
Ontology class C2 has instance I2, value of attribute N1 is n1, and value of attribute N2 is n2, which can be 
represented as I2= C2[N],N=( n1, n2,L, nn). Attribute similarity between I1 and I2 is 

1 2

( , )
( , )

i i ii

i

com p q
Psim I I










. 

1,
( , )

0,

i i

i i

m n
com m n

otherwise


 


, βi represents weight of some attribute in the class. 

( , ) ( , )
( , )

m Isim a b n Psim a b
sim a b

m n

  



 

m and n are constants, Csim(a,b) represents hierarchical similarity between a and b, and Psim(a,b) represents 
attribute similarity between a and b. 

4. Experiment and analysis 

To further enhance performance of recommendation algorithm, relying only on a computer is difficult to 
implement, so parallel processing will effectively improve computing speed of the algorithm. More current 
application to solve big data includes MapReduce programming model, Hadoop distributed framework and 
Hbase distributed database (He et al., 2008; Xia et al., 2011; Shekhar et al., 2012; Yang et al., 2011). Here we 
select MapReduce which is relatively simple and convenient to further improve the algorithm to increase the 
speed of recommendation. It includes three processes of data division, Map stage and Reduce stage. We 
mainly carry out segmentation in accordance with the user. Similarity calculation process of a particular user 
and other users, prediction score and recommendation process are encapsulated in the process of the map.  
One hundred of users evaluate 1700 kind of products, and scoring value is an integer from 1 to 10. Two 
thousand of data are selected. User interest is proportional to scoring value. Accuracy comparison of different 
algorithms is shown in figure 1. From the perspective of recommendation accuracy, mean absolute error MAE 
is used to measure the accuracy of the algorithm. MAE is average of deviation of absolute value between actual 
value and predictive value of all score. The higher the prediction precision, the smaller the MAE value. Product 
recommendation algorithm accuracy is measured based on MAE. Collaborative filtering recommendation based 
on domain ontology, traditional collaborative filtering recommendation, recommendation based on the content, 
and recommendation based on association rules are tested. The experimental results show that the 
collaborative filtering recommendation algorithm based on domain ontology has higher accuracy than the 
traditional collaborative filtering recommendation, recommendation based on association rules, and 
recommendation based on content. Speed comparison of different algorithms is shown in figure 2. It can be 
concluded that with increasing of number of data, processing speed is much faster than other traditional 
algorithms. The proposed algorithm has important practical significance for the implementation of enterprise 
online product recommendation service model. 

766


Figure 1: Accuracy comparison of different algorithms 

 
Figure 2: Speed comparison of different algorithms 

5. Conclusion 

Combined with the characteristics of big data, on the basis of analysis of the current research status of the 
enterprise online product recommendation system, collaborative filtering recommendation algorithm is selected 
for its lower data requirements and more successful application. The collaborative filtering recommendation 
algorithm is difficult in handling the natural language expressive and potential user needs under the 
environment of big data, which would affect the recommendation accuracy. In order to solve this defect, 
semantic similarity is introduced into the traditional collaborative filtering algorithm. Domain ontology is 
introduced to improve performance of enterprise online product recommendation algorithm. It improves the 
accuracy of recommendation, taps the dynamic and potential needs of users. Besides, it transforms the user 
needs expressed by the text into location information of the ontology knowledge library; therefore it improves 
the speed of recommendation. In order to test the effectiveness of the improved algorithm, experiments are 
done. Testing results show that the improved algorithm can improve recommendation efficiency and 
recommendation quality to some extent.  

10 20 30 40 50 60 70 80 90 100
0.65

0.7

0.75

0.8

0.85

0.9

number of data

m
e
a
n

 a
b

s
o

lu
te

 e
r
r
o

r

 
association rule

content

traditional collaborative filtering

improved collaborative filtering

0 100 200 300 400 500 600 700 800 900 1000
0

0.5

1

1.5

2

2.5

number of data

ti
m

e
 (

s
)

 
association rule

content

traditional collaborative filtering

improved collaborative filtering

767


Acknowledgment  

This work is supported by Qinhuangdao Science and Technology Bureau “The development of medium-sized 
and small enterprises in Qinhuangdao in the background of big data”. Project code: 201502A278. 

References 

He B., Fang W., Luo Q., 2008, Mars: a MapReduce framework on graphics processors, Proceedings of the 
17th international conference on Parallel architectures and compilation techniques,ACM, 10, 260-269. 

Hu R., Dou W., Liu J., 2014, ClubCF: A clustering-based collaborative filtering approach for big data 
application, IEEE Transactions on Emerging Topics in Computing, 2(3), 302-313. 

Kuang G.F., 2012, The development of e-commerce recommendation system based on collaborative filtering, 
advanced engineering forum, 6(7),636-640. 

Lee Y.H., Hu P.J.H., Cheng T.H., Hsieh Y.F., 2012, A cost-sensitive technique for positive-example learning 
supporting content-based product recommendations in B-to-C e-commerce, D ecision Support Systems, 
53(1), 245-256. doi:10.1016/j.dss.2012.01.018. 

Long S., Zhu W.H., 2012, Mining evolving association rules  for  e-business recommendation, Journal of 
Shanghai Jiaotong University (Science), 17(2),161-165.  

Moon J., Chadee D., Tikoo S., 2008, Culture, product type, and price influences on consumer purchase 
intention to buy personalized products online, Journal of Business Research, 61(1), 31-39. 
doi:10.1016/j.jbusres.2006.05.012. 

Papadimitriou S., Sun J., Disco, 2008, Distributed co-clustering with map-reduce: A case study towards 
petabyte-scale end-to-end mining, Data Mining, Eighth IEEE International Conference on. IEEE, 11,512-
521. 

Shekhar S., Gunturi V., Michael R.,  Evans, Yang K.S., 2012,  Spatial big-data  challenges  intersecting  
mobility  and  cloud  computing,  MobiDE'12:Proceedings of the Eleventh ACM International Workshop on 
Data  Engineering  for Wireless and Mobile Access, 2, 1-6. 

Xia X.W., Wang W.P., 2011, Collaborative Filtering recommendation algorithm based on trust model, 
Computer Engineering, 37, 26-28. 

Yang S., Xue W., Xie Y.H., Wang X.Y., Zhu X.J., 2011, Collaborative filtering recommendation algorithm 
based on single-class classification, Computer Engineering, 37, 59-61. 

Zhu Y., Su H.Y., Wang C.Q., 2014, Distributed collaborative filtering recommendation model based on 
expand-vector, Advanced Materials Research, 989, 2188-2191. 

Zhu Y., Su H.Y., Wang C.Q, Yan B., Zheng H., 2014, Distributed collaborative filtering recommendation  
model  based  on  expand-vector, Advanced  Materials  Research, 989, 2188-2191. 

 
768