


Proceedings of Engineering and Technology Innovation, vol. 7, 2017, pp. 20 - 24 

 
Preliminary Study on a System for Visualization of Big Data in SMEs 

Yasuo Uchida¹*, Miao Xinyun¹, Seigo Matsuno¹, Yasushi Iha², Makoto Sakamoto³
 

¹ Department of Business Administration, National Institute of Technology, Ube College, Ube, Japan

² Department of Media Information Engineering, National Institute of Technology, Okinawa College, Nago, Japan

³ Department of Computer Science and Systems Engineering, University of Miyazaki, Miyazaki, Japan

Received 19 July 2017; received in revised form 30 July 2017; accepted 03 August 2017

 
Abstract 

The 2012 White Paper on Information and Communications in Japan, issued by the Ministry of Internal Affairs and Communications of Japan, advocates the use of big data under its “Special Theme: ICT-induced and accelerated Disaster Recovery and Japan’s Re-birth.” However, the Japan Users Association of Information Systems’ white paper on its 2014 IT trend survey of companies reports that fewer than 10% of companies utilize big data, and progress in its use appears to be centered on large firms. Under such conditions, the use of big data is becoming a challenge for ensuring the survival and success of SMEs as well. As a result, R&D and technological support for SMEs are becoming pressing issues. At present, however, there has been almost no academic research concerning policies and future directions for the use of big data at SMEs. Accordingly, this study modeled the procedure for visualization of big data for SMEs. Specifically, we organized the procedure as a tutorial, from obtaining data on Japanese hot-spring areas using web scraping to visualizing them using the visualization software Cytoscape.

Keywords: big data, visualization, SMEs, Cytoscape

1. Introduction 

This study is intended to research and develop a system for visualization of big data suited to SMEs, as a tactical information tool to support SMEs’ strategies for success under conditions of increasingly intense global competition. That is, it aims to explore a framework that is easy to adopt and superior in terms of operability for the collection, storage, analysis, and use of big data. At the same time, it also aims to elucidate empirically the ideal form of a strategic information infrastructure for SMEs and the challenges in its operation and administration.

In this study, we carried out a preliminary examination of visualization of big data by SMEs. Specifically, we organized the procedure as a tutorial, from obtaining data on Japanese hot-spring areas using web scraping to visualizing them using the visualization software Cytoscape.

2. Trends in Use of Big Data at SMEs in Japan 

At present, there are very few examples of successful use of big data by Japanese SMEs. In addition, how big data is used at SMEs depends on individual planning by each company. Accordingly, this paper begins by summarizing measures taken and research trends related to the use of big data at Japanese SMEs. It also examines a number of examples of early adopters.

                                                                 
* Corresponding author. E-mail address: uchida@ube-k.ac.jp

Tel.: +81-836-35-7567; Fax: +81-836-35-7567

 

Copyright ©  TAETI 


For example, the report “Enriched Living and Economy from Connected IT: The Value and Reliability of Big Data” [1], from the Research Group on IT Infrastructure for Living and the Economy of the Information-technology Promotion Agency, Japan (IPA), explains in simple terms what big data means for managers of companies aiming to provide new services using big data. It also identifies results such as the expansion of business opportunities by summarizing examples of early adopters of big data, the advantages and issues in realizing services, and efforts to resolve those issues.

In addition, the 2014 White Paper on Small and Medium Enterprises in Japan [2] from the Small and Medium Enterprise Agency mentions the use of data on corporate transactions (big data) as a “key” to revitalization of regional economies.

Looking at the activities of SMEs in the field, in November 2014 the Osaka Chamber of Commerce and Industry published the results of a survey intended to ascertain matters such as expectations, needs, and issues involved in the use of big data by second-tier companies and SMEs [3]. While the results of this survey show that approximately 81% of companies are interested in “information (data)” as “useful for management purposes,” respondents also identified the following as the top three “issues in use” of data:

• “Difficulty of understanding the cost-effectiveness of use of information (data)” (64.9%)

• “Lack of human resources to analyze information (data)” (56.9%)

• “Lack of understanding of methods of using information (data)” (34.0%)

Accordingly, we decided to proceed with research and development focusing on these three points. First of all, we identified as a necessary condition the ability to use personal computers with specifications like those used in ordinary administrative-level operations instead of high-priced computers, to keep costs down as much as possible. We also decided to use, in principle, software that can be used free of charge, such as open-source software, as the tools needed for analysis and visualization. Another prerequisite we identified was that the data analysis must be at a level that can be conducted by employees who have the skills needed to analyze data using spreadsheet software (such as Microsoft Excel), since it is difficult for SMEs to secure staff with specialized data analysis skills. Furthermore, we decided to provide hints on the use of data by describing specific examples of methods of use.

3. Visualization of Big Data 

3.1.   Steps from data collection through visualization  

The data subject to visualization can be broken down into two main categories. The first consists of data in the possession of the company itself. In this case, the company sufficiently understands the content of the data, and it is easy for the company to process the data on its own. The other category consists of data that is present on the Internet. In this case, it is difficult to understand the structure of the data, and the data are not easy to obtain. However, SMEs will sometimes want to obtain and utilize these data. Accordingly, this study considers the steps used when obtaining and processing data present on the Internet. Since the main objective of this study is to illustrate a data processing model, we limited the scope of the visualization itself to the following:

• Subject data to be collected: Data on hot-springs resorts in Japan, published on the Internet

• Purpose of visualization: To visualize the locations and water qualities of hot-springs resorts

• Steps in visualization: Obtaining data through Web scraping [4], conducting a number of preprocessing steps, and then using Cytoscape [5] to import the data as network information and visualize it in the form of graphs.



3.2.   Data acquisition and processing

When obtaining data through Web scraping, the permission of the data provider must be obtained in advance. It is also necessary to consider how to avoid burdening the servers and network when actually obtaining the data. In addition, the end-user license agreement must be complied with for the data obtained. Although we used the Python language [6] as the software environment for obtaining and processing data, we arranged the model as a series of steps that could be used even by non-specialists, with consideration for ease of use.
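The considerations above about server load can be illustrated with a minimal sketch. It is not the program used in this study: the user-agent string, the function names, and the one-second default delay are all assumptions made for illustration. The sketch checks a site's robots.txt rules with Python's standard `urllib.robotparser` module and pauses between requests. A robots.txt check only supplements, and does not replace, the provider's permission and the license terms mentioned above.

```python
import time
import urllib.robotparser

# Hypothetical identifier for this sketch; not taken from the study.
USER_AGENT = "sme-onsen-study-bot"


def build_robot_rules(robots_txt):
    """Parse the text of a robots.txt file into a rule checker."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent=USER_AGENT):
    """Keep only the URLs that the robots.txt rules permit for this agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` is supplied by the caller (for example, a function that
    downloads one page), so this sketch itself performs no network access.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        pages[url] = fetch(url)
    return pages
```

Passing the download function and the sleep function as parameters keeps the sketch easy to try out without contacting any server; in practice the delay would be chosen in consultation with the data provider.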

(1) Analysis of Web pages  

There was a need to analyze the data structure of Web pages and identify the data to be obtained. This can be done using the “View source” feature of a Web browser (Fig. 1).

Fig. 1 Example of displaying a Web page’s source  

(2) We used a Python program to obtain the desired data from within Web pages through Web scraping. In this study, we obtained only data on the names and water quality of hot-springs resorts.

(3) We used the Python program to look up the latitude and longitude of the hot-springs resorts in Google Maps [7].

(4) We processed the above data using Excel and other tools and saved it as network data. An example of the format of the data is provided below. In this case, we used latitude as the Y-axis value on the graph after inverting its sign, since the Y-axis of display coordinates on the monitor increases downward, whereas latitude increases northward.

Sample network data format: 

Prefecture name, hot-springs resort name, water quality, X coordinate (longitude), Y coordinate (latitude) 
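Steps (2) and (4) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the program used in this study: it assumes a hypothetical page layout in which each table row holds a resort name and its water quality, it uses Python's standard `html.parser` module in place of Beautiful Soup [4] so that it runs without extra packages, and it takes the coordinates of step (3) as an already prepared dictionary. All function, class, and column names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class OnsenTableParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows built from <th> leave no cells
                self.rows.append(self._row)
            self._row = None


def scrape_resorts(html):
    """Return (name, water quality) pairs from the assumed table layout."""
    parser = OnsenTableParser()
    parser.feed(html)
    parser.close()
    return [(row[0], row[1]) for row in parser.rows if len(row) >= 2]


def to_network_rows(prefecture, resorts, coordinates):
    """Build rows in the sample format; `coordinates` maps name -> (lat, lon)."""
    rows = []
    for name, quality in resorts:
        latitude, longitude = coordinates[name]
        # X is the longitude; Y is the latitude with its sign inverted.
        rows.append([prefecture, name, quality, longitude, -latitude])
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a hypothetical header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["prefecture", "resort", "water_quality", "x", "y"])
    writer.writerows(rows)
    return buffer.getvalue()
```

The resulting CSV text can be opened in spreadsheet software for further manual processing, which matches the skill level assumed for SME staff.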

3.3.   Visualization using Cytoscape 

Cytoscape is a tool for visualization of networks (through a graph structure). For this reason, the data to be processed need to have a network structure. Accordingly, we decided to analyze the locations of hot-springs resorts and their prefectural capitals, as an example of a network. Fig. 2 shows an example of visualization of information on hot-springs resorts in Yamaguchi Prefecture resulting from loading network data into Cytoscape and color-coding the information by water quality. In the center of the graph is the Yamaguchi prefectural capital. From this graph, the reader can identify the relative positions of the resorts from the latitudes and longitudes on the map and the water quality from the color coding of the hot-springs resort names.

 
Fig. 2 Visualization of hot-springs resorts in Yamaguchi Prefecture 
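One possible way to give the resort data the network structure described above is to treat the prefectural capital as a hub and connect it to each resort. The sketch below is an assumption about how such an edge table could be prepared, not a description of the file actually loaded in this study; the column names `source`, `target`, `water_quality`, `x`, and `y` are hypothetical. It assumes that the table is loaded with Cytoscape's network import from a table file, where the user designates which columns act as source and target nodes.

```python
import csv
import io

# Hypothetical column names for the edge table.
FIELDNAMES = ["source", "target", "water_quality", "x", "y"]


def build_edges(capital, network_rows):
    """Connect the prefectural capital (hub) to every resort.

    Each input row follows the sample network data format:
    prefecture, resort name, water quality, X (longitude), Y (latitude).
    """
    edges = []
    for _prefecture, resort, quality, x, y in network_rows:
        edges.append({
            "source": capital,
            "target": resort,
            "water_quality": quality,
            "x": x,
            "y": y,
        })
    return edges


def edges_to_csv(edges):
    """Serialize the edge table as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(edges)
    return buffer.getvalue()
```

With a table of this shape, the `water_quality` column is available as an attribute on which a color mapping can be defined inside Cytoscape.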

4. Discussion 

In this study, we ultimately visualized structured data. First of all, the original data source, the source text of Web pages (HTML), is semistructured data [8]. We used Web scraping to obtain data from the content of Web pages. Next, when the data obtained, as in this study, are composed of multiple files, steps such as data extraction and combination are needed. This process requires the use of data-processing tools and programming languages. In addition, it is occasionally impossible to apply general-purpose tools to the conversion of semistructured data to structured data, and in such cases one must rely on programming languages. Also, in a locale such as Japan that employs multibyte characters, character encoding conversion is sometimes required [9].
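The encoding conversion and file combination mentioned above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in this study: the candidate encodings, their order, and the function names are assumptions. Trying encodings one after another is only a heuristic, because some byte sequences decode without error under more than one encoding, so the result should be checked by eye.

```python
import csv
import io


def decode_text(raw, encodings=("utf-8", "shift_jis")):
    """Decode bytes by trying each candidate encoding in order."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


def combine_csv_texts(texts):
    """Merge CSV texts that share the same header row into one list of rows."""
    combined = []
    for text in texts:
        combined.extend(csv.DictReader(io.StringIO(text)))
    return combined
```

For example, a file saved as Shift_JIS and another saved as UTF-8 can each be passed through `decode_text` and then merged with `combine_csv_texts`, after which the combined rows can be written out in a single encoding.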

5. Conclusion 

This study carried out a preliminary examination of a system for visualization of big data at SMEs, elucidating a number of requirements. That is, it showed that in processes such as data collection and data processing there are many cases in which it is difficult to process the data using general-purpose tools alone. Topics for future study include development of independent tools to supplement general-purpose tools, development of general-purpose models for the steps involved in visualization, and preparation of tutorials suitable for use by SMEs.

Acknowledgement 

This work was supported by JSPS KAKENHI Grant Number 15K03639.

References 

[1] IT Infrastructure for Living and the Economy of the Information-technology Promotion Agency, Japan, “Enriched living and economy from connected IT: the value and reliability of big data,” http://www.ipa.go.jp/files/000001884.pdf.

[2] Small and Medium Enterprise Agency, “2014 White paper on small and medium enterprises in Japan,” http://www.chusho.meti.go.jp/pamflet/hakusyo/H26/PDF/h26_pdf_mokuji.html.

[3] The Osaka Chamber of Commerce and Industry, “Results of survey on use of big data,” Press Release, 2014.

[4] L. Richardson, “Beautiful Soup,” https://www.crummy.com/software/BeautifulSoup/. Cited 26 January 2016.



[5] Cytoscape Consortium, “Cytoscape,” http://www.cytoscape.org/. 

[6] Python Software Foundation, “Python,” https://www.python.org/. 

[7] Python Software Foundation, “Pygeocoder,” https://pypi.python.org/pypi/pygeocoder. 

[8] D. Quass, A. Rajaraman, Y. Sagiv, J. Ullman, and J. Widom, “Querying semistructured heterogeneous information,” Journal of Systems Integration, vol. 7, no. 3, pp. 381-407, 1997.

[9] “The Unicode Consortium,” http://unicode.org/.