INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL
Online ISSN 1841-9844, ISSN-L 1841-9836, Volume: 17, Issue: 5, Month: October, Year: 2022
Article Number: 4696, https://doi.org/10.15837/ijccc.2022.5.4696

CCC Publications

Online Healthcare Privacy Disclosure User Group Profile Modeling
Based on Multimodal Fusion

Yong Wang

Yong Wang
School of Economics and Management
Beijing Jiaotong University
Beijing, 100044, China
17113137@bjtu.edu.cn

Abstract

With the spread of COVID-19, online healthcare is rapidly evolving to assist the public with
health, reduce exposure and avoid the risk of cross-infection. Online healthcare platforms require
more information from patients than offline care, and insufficient or incorrect information may delay
or even mislead treatment. It is therefore valuable to predict users’ privacy disclosure behaviors
while fully protecting their information, so that platforms can provide accurate healthcare services
and realize a personalized online healthcare environment. In contrast to the traditional static
analysis of the factors influencing user privacy disclosure behavior on online healthcare platforms,
this paper uses multimodal fusion and group profile technology to build a user privacy disclosure
model and lay the foundation for personalized online healthcare services. It proposes a cross-modal
fusion modeling approach to address the problem that the information of each modality cannot be fully
utilized in current online healthcare privacy disclosure modeling. A multimodal user profile
approach is used to construct personal and group profiles, and the privacy disclosure behavioral
characteristics reflected by both are integrated to realize accurate personalized services for online
healthcare. The case study shows that, compared with the static unimodal privacy disclosure model,
the accuracy of our method improves significantly, which is helpful for precision healthcare
services and online healthcare platform development.

Keywords: multimodal fusion technology, group profiling, privacy disclosure, online healthcare
platform, personalized healthcare services.

1 Introduction
With the spread of COVID-19, many patients prefer to stay home rather than go to the doctor.

“Another benefit is the ‘green effect’: telehealth helps prevent patients from traveling as much as
possible for treatment and avoid those expensive helicopters, planes and ambulances, especially for
rural patients, reducing unnecessary carbon dioxide gas, as documented by the University of New
Mexico Telehealth Institute,” said Susy Salvo-Wendt, a telehealth specialist at Summit Healthcare
Regional Center. In the face of the sudden epidemic in 2020, online healthcare platforms became an
important place for people to seek healthcare advice, exchange health information, and follow the
prevention and control of the epidemic, thanks to features such as being geographically independent,
identifying suspected cases, reducing exposure, avoiding cross-infection, and easing public anxiety.
According to relevant statistics, the average number of newly registered users of the Good Doctor
online healthcare platform in February 2020 increased nearly 3.5-fold compared to December 2019,
and the daily demand for online health consultation and healthcare knowledge consultation submitted
by users increased nearly 6.5-fold. There was thus a significant surge in public demand for online
healthcare platforms due to the COVID-19 outbreak.

Advances in technology have facilitated the delivery of medical and health information, and the
widespread use of online technology in the medical field has effectively improved treatment outcomes
and reduced healthcare costs [1]. When patients disclose health information on online healthcare
platforms, communication with physicians runs more smoothly, diagnoses are more likely to be correct,
patients gain scientific, medical and health knowledge, and self-care awareness rises. Personal health
information is vital to users’ lives: correct advice from doctors based on relevant information can
improve users’ health, whereas an incorrect diagnosis can put users at risk. It is therefore important
to establish a user health database, which creates a challenge for the construction and development
of online healthcare platforms. The use of online healthcare services involves the disclosure of a
large amount of personal health information, which is of great research significance as an important
data source for health databases.

However, due to information asymmetry among platform users, users differ in the way they express
and deliver their information. As a result, platforms neglect users’ different needs and varying
degrees of demand and provide services in a “one-size-fits-all” manner, without matching the services
to those needs. The characteristics and needs of users are like the symptoms of patients: only on the
basis of “look”, “smell”, “ask” and “feel the pulse” (the four diagnostic methods of traditional
Chinese medicine) can the doctor “prescribe the right medicine”. If users are not well informed of
specific information, the service will be counterproductive and make it more difficult for them to
access health resources. Therefore, the platform should offer different approaches to different users,
take their information literacy and knowledge into account, help them use online resources and
overcome retrieval difficulties, and avoid service content that is overly specialized or poorly
adapted. In addition, more and more people are seeking medical services from online healthcare
platforms in order to save time and reduce the cost of treatment. However, individuals face many
threats when using online healthcare services, such as the leakage of private medical information [2].
The contradiction between the necessity of information disclosure and the risk of information leakage
has become a key issue affecting users’ use or continued use of online healthcare platforms. Frequent
privacy leaks raise users’ concerns about personal privacy security, harm patients’ interests, and
reduce users’ willingness to disclose information. Compared to offline care, online healthcare
platforms require more supporting information from patients, and insufficient or incorrect information
may delay or even mislead treatment, making it difficult to fully utilize the platform for healthcare
services. Therefore, it is necessary to promote the disclosure of user information while protecting
user privacy.

Privacy disclosure refers to the voluntary and active disclosure of personal information by
individuals to others during social interactions. Research on personal health privacy disclosure can
be broadly divided into four perspectives. The first is the technical perspective, which studies the
development of personal health management information systems [3], the construction and development
of online healthcare platforms [4], and methods of protecting medical and health privacy [5]. The
second is the legal perspective, which studies the laws and regulations on health information and
considers how to reduce users’ privacy concerns [6, 7]. The third is the business analysis
perspective, which analyzes the current state of healthcare services, the current state of
applications and research in the health field, and the associated opportunities and challenges. The
fourth is the perspective of factors influencing healthcare privacy disclosure. Zhang et al. [8]
analyzed the factors influencing users’ privacy concerns in online health communities by combining
the dual calculus model and protection motivation theory, and found that privacy concerns,
information support, and emotional support had significant effects on users’ willingness to disclose
personal health information. Bansal et al. [9] constructed a model based on utility theory to explore
the effects of personality, trust and privacy concerns on users’ willingness to disclose health
information online. Wang et al. [10] studied website services, website reciprocity norms and user
trust factors, and constructed a model of factors influencing users’ willingness to self-disclose on
healthcare websites. The above studies analyzed user disclosure behavior in terms of emotion, benefit
and risk, and pointed out the continuous influence of privacy concern on privacy disclosure
willingness and privacy disclosure behavior.

Overall, few studies have examined medical privacy disclosure at the level of influencing factors.
Most existing studies consider the negative effects of personal medical privacy disclosure from the
perspective of individual perceptions and suggest privacy-protective behaviors to users; they less
often consider factors that encourage users to disclose their own health information. Besides, the
benefits and risks perceived by patients differ between a healthy state and a state of seeking
healthcare consultation, and so does their influence on the willingness to disclose private
information. When users have adequate actual control over their behavior, willingness can directly
determine that behavior: the stronger a user’s willingness to disclose information to an online
healthcare platform, the more positive the privacy disclosure behavior will be. In this paper, users’
online healthcare privacy disclosure behavior is the result of the interaction of short-term and
long-term disclosure willingness. Construal level theory predicts that construal level discounts the
perceived outcome or its value, so that the value of a far-future outcome is lower than that of a
near-future outcome. User privacy willingness is therefore subject to temporal discounting: with the
user’s present moment as the reference point, long-term disclosure intention lies at a far
psychological distance in time, and short-term disclosure intention lies at a near psychological
distance. The value of long-term disclosure intention is lower than that of short-term disclosure
intention, so when online healthcare platform users make privacy disclosure decisions, they are more
likely to be influenced by short-term disclosure intention.

Based on the above analysis, solving the problems currently faced by online healthcare platforms
requires two things: first, making full use of the information disclosed by users and labeling users
to enable precise healthcare services; second, comprehensively analyzing the characteristics of users’
privacy disclosure behaviors and encouraging users to disclose their private information to online
healthcare platforms. Therefore, this paper proposes a multimodal fusion user profile technique to
build user profiles that model users’ online healthcare privacy disclosure behavior. It uses User
Profiling Based on Multimodal Fusion and Stacking, Model Combination, and Fusion (UMF-SCF), which can
deeply fuse multiple data sources of user online healthcare privacy disclosure, such as basic user
information. The paper combines the personal and group profiles of users’ healthcare privacy
disclosure to reflect the characteristics of users’ privacy disclosure behavior and realize
personalized healthcare services.

The purpose of this paper is to explore the factors influencing users’ willingness to disclose their
personal medical privacy and to investigate how these factors affect disclosure willingness and
disclosure behavior, with a view to helping optimize the functions of online healthcare websites and
promoting the development of online healthcare. This study can both promote users’ disclosure of
necessary personal medical information, laying the foundation for accurate personalized medicine, and
assist medical service providers in obtaining and applying a large amount of user health data in a
legitimate way.

Compared to existing work, the contributions of this paper can be summarized as follows.
(1) It proposes, for the first time, a multimodal fusion user profile modeling method for online
healthcare privacy disclosure.

(2) The UMF-SCF multimodal fusion technique is used to construct a characteristic profile of users’
personal privacy disclosure behavior and to enhance the fusion of multiple data sources of users’
healthcare privacy disclosure, including basic user information, historical disclosure behavior,
perceived risk, perceived profit, short-term disclosure intention, and long-term disclosure intention.
This solves the problem that the modalities cannot interact deeply in the user profile modeling
process.

(3) Based on K-means, user group profiles are constructed by clustering the personal profiles of
user privacy disclosure behavior.

(4) A specific case study, including a questionnaire survey of online healthcare platform users and
a comparison with baseline models, verifies that the proposed online healthcare privacy disclosure
modeling method with multimodal user profiles can effectively improve accuracy and contribute to
precision healthcare services.

The subsequent sections of this paper are organized as follows: Section 2 systematically investigates
existing research on online privacy disclosure, user profile construction and multimodal fusion, and
analyzes their advantages and disadvantages; Section 3 describes multiple data sources for online user
healthcare privacy disclosure; Section 4 introduces the multimodal user profiling method to online
healthcare privacy disclosure modeling in detail; Section 6 describes the case study of this paper in
detail and evaluates the accuracy of the results by comparing the baseline models and questionnaires;
Section 7 summarizes the work of this paper.

2 Related Work

2.1 Online Healthcare Privacy Disclosure

Willingness to disclose is an ability to control privacy through the user’s privacy concern and
privacy control [11]. Wang et al. [12] explored the factors influencing users’ willingness to disclose
information in the mobile environment. Sapuppo [13] compared the results of two questionnaires
and found that the main reasons affecting users’ willingness to disclose on the Internet were mobile.
Based on privacy calculus theory, L. Wang et al. [14] studied, from the perspectives of different
individual users and different mobile service providers, users’ willingness to disclose and the
truthfulness of their disclosure. Papadopoulou et al. [15] analyzed the main trust factors that
influence the willingness to disclose personal privacy in both e-commerce and mobile commerce
contexts, and showed that in mobile commerce the main reason for user disclosure is “trust”.
According to Bergström [16], social network users have different levels of “trust” and different
online privacy concerns, which have a direct impact on online privacy settings and disclosure levels.

With the development of the data economy, the vast amount of users’ private information has become
a valuable asset for data users. The change in users’ attitude toward personal privacy has also led
to a change in their willingness to disclose private information. Users no longer blindly protect
their personal private information, and more and more of them disclose it on their own initiative,
which shows that the willingness to disclose is one of the manifestations of users’ ability to control
their personal information. Research on user information disclosure has found that users compare the
expected benefits with the possible risks when deciding whether to disclose private information, and
decide to disclose when benefits and risks reach a certain trade-off point. According to [17], users
first calculate the risks and benefits of privacy disclosure and choose to disclose information when
the risk of disclosure approaches zero or is much smaller than the benefit.

2.2 User Profiles

User profiles describe users better, make customer information more vivid and easier to process by
computer, and underlie the use of big data by some large companies. User profiles are used to
identify users and to determine how to treat them, for example which products they like, what their
educational background is, and what their spending power and social connections are; they also
reflect users’ personality through their behavior, that is, they tag users. User information can be
mined to build user profiles, for example by using clustering algorithms to analyze the age and
occupation distribution of users with high blood pressure. Massive data processing involves frequent
and varied operations, and tagging provides a way to quantify information that is otherwise hard to
handle, so that computers can process tedious information programmatically and understand users
through algorithms and models. Once the computer can understand the user, search engines,
recommendation engines and commercial applications such as advertising and marketing can further
improve the accuracy of their results and the efficiency of information acquisition and
recommendation.

Each company and project has different goals for the user personas it creates and a different final
state it wants to achieve. Taking the purpose of enterprise user profiling as the starting point,
profiles have three applications: the first is to cluster and analyze massive user data to split
users into different user groups; the second is to understand, through core analysis, the users at
whom the enterprise’s products are aimed; the third is product applications based on the user
profile.

To build a reasonable customer profile, an enterprise first needs to collect information about its
customers, usually through website crawling or offline questionnaires; it can then define the basic
outline of the profile according to the target and form a basic understanding of the users’
situation. The basic information of users mainly includes user ID, age, gender, occupation,
geography, education background, family income status, and so on. Information about users’ spending
and clicking behavior can also be obtained from some enterprise platforms. Once a profile exists, its
details need to be defined, which requires analyzing all the teams involved in the profile, or the
different levels of staff and management in a company, such as product sales staff or senior
developers. The extended or dynamic information of the user mainly includes the preferred product
model, the user’s activity, hobbies and preferences, usual browsing behavior and purchasing
tendencies. When a company segments its users, it divides the user metrics into several categories
and indicates which data play a dominant role, which serves as the main basis for user
classification.

2.3 Multimodal Fusion

In general, modality refers to the way things happen or exist, and multimodality refers to a
combination of two or more modalities in various forms. Each source or form of information can be
referred to as a modality. Current research focuses on the processing of three modalities: image,
text, and voice. Modalities are fused because different modalities express things differently and
from different perspectives, so they overlap (redundant information exists) and complement each other
(the combination is better than a single feature), and there may even be a variety of information
interactions between modalities. If the multimodal information is processed reasonably, the feature
information can be enriched. In short, the distinctive features of multimodality are redundancy and
complementarity.

Traditional feature fusion algorithms can be divided into three main categories: methods based on
Bayesian decision theory, on sparse representation theory, and on deep learning theory. Among them,
deep learning methods can be classified by the level at which fusion takes place. Pixel-level fusion
fuses the original data at the smallest granularity. Feature-level fusion fuses abstract features and
includes early and late fusion. Early fusion fuses the features first and then feeds them to the
model; its disadvantage is that it cannot make full use of the complementarity between multiple modal
data, and it suffers from information redundancy. Late fusion takes two forms, fusion and non-fusion.
Non-fusion is similar to ensemble learning: each modality produces its own result, and the results
are then scored jointly, so each modality’s model is independently robust. Fusion means fusing freely
during feature generation, which is more flexible. In addition, decision-level fusion fuses the
decision results, and hybrid-level fusion combines multiple fusion methods.
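
The contrast between early fusion and non-fusion late (decision-level) fusion can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model used in this paper: the feature values, the weights, and the helper names `score`, `early_fusion` and `late_fusion` are all hypothetical, and a simple logistic scorer stands in for whatever model each stage would actually use.

```python
import math

def score(features, weights):
    """Logistic score of one feature vector under a linear model (illustrative only)."""
    z = sum(f * w for f, w in zip(features, weights))
    return 1.0 / (1.0 + math.exp(-z))

def early_fusion(modalities, weights):
    """Concatenate all modality features first, then score them with a single model."""
    fused = [f for feats in modalities for f in feats]
    return score(fused, weights)

def late_fusion(modalities, per_modality_weights):
    """Score each modality with its own model, then average the per-modality scores."""
    scores = [score(f, w) for f, w in zip(modalities, per_modality_weights)]
    return sum(scores) / len(scores)

# Hypothetical toy inputs: two modalities with two features each.
basic_info = [1.0, 0.0]
perceived_risk = [0.5, 0.5]
modalities = [basic_info, perceived_risk]

p_early = early_fusion(modalities, [0.2, -0.1, 0.4, 0.3])
p_late = late_fusion(modalities, [[0.2, -0.1], [0.4, 0.3]])
```

In the early variant a single model sees the concatenated features, whereas in the late variant each modality is scored independently and only the scores are combined, which is why the latter resembles ensemble learning.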

In recent years, social media user profiling has attracted more and more researchers’ attention.
Among existing user profile modeling techniques, research on how to fuse multiple user data sources
or modalities to obtain more accurate user profiles is quite limited and has some shortcomings. On
the one hand, some user profile studies are conducted on a single modality only, which makes it
difficult to characterize users comprehensively; on the other hand, most studies using multiple
modalities integrate data sources only at the feature level or the decision level [18, 19, 20], and
even though some studies perform fusion at both levels [21, 22], they still lack exploration of the
deep fusion of multiple data sources.

Based on the above analysis, this paper builds a UMF-SCF model on the multi-source user privacy
disclosure data collected through the questionnaire and then processed, to construct personal privacy
disclosure behavior profiles of users, and then constructs group profiles by clustering the personal
profiles with K-means. The use of multimodal techniques to construct personal profiles, together with
the combination of personal profiles and group profiles, can achieve a comprehensive characterization
of users’ online healthcare privacy disclosure behavior and support precision services in online
healthcare.

Table 1: User history disclosure behavior factors.
Historical disclosure behavior | Explanation
Information Provided | Previously provided information to online healthcare platforms
Information Recording | Information can be found on online healthcare platforms
Private Information Disclosure | Once mentioned personal things on online healthcare platforms
Information Consistency | The information provided to the online platform is consistent with the actual situation

3 Online Healthcare Privacy Disclosure Data
Various data reflect users’ privacy disclosure behaviors in the online healthcare process,
including users’ basic information, historical disclosure behaviors, perceived risks, perceived
profits, short-term disclosure intentions, and long-term disclosure intentions.

3.1 Basic User Information

Users’ willingness to disclose personal information differs with age, education and gender.
Therefore, this paper uses basic user information as one of the basic attributes to construct
personal and group profiles, including gender, age, education, occupation, length of time using
online healthcare platforms, and experience of privacy leakage.

3.2 Historical Disclosure Behavior

Historical disclosure behavior means that a user has provided information to an online healthcare
platform in the past, or that the user’s historical information can be found on the platform. Since
a user who has provided information to a healthcare platform in the past is more likely to do so in
the future as well, this paper considers historical disclosure as one of the factors influencing user
disclosure behavior. Table 1 presents the explanation of each factor.

3.3 Perceived Risk

Perceived risk refers to the potential risk or loss that users perceive when disclosing privacy to
online healthcare platforms [23]. Users’ perceived risk has a negative impact on the willingness to
disclose. Therefore, in this paper, several perceived risk factors are selected as the basic attributes
of users to construct personal and group profiles of users. Table 2 presents the explanation of each
factor.

3.4 Perceived Profits

Perceived profits refer to the benefits and rewards that users expect to receive when disclosing
information to online healthcare platforms [23]. Wang et al. [12] have shown that perceived profits
have a positive effect on users’ intention to disclose information. Users can get a better service
experience when disclosing information to online healthcare platforms: disclosure enables doctors to
clearly understand their basic physical condition and give treatment advice that is more suitable for
their situation, thus better solving health problems.

Table 2: User perceived risk factors.
Perceived Risk | Explanation
Information provision risk | Providing information to online healthcare platforms is risky.
Information leakage | Information provided to online healthcare platforms may be leaked.
Risk of information use | Information provided to online healthcare platforms may be used inappropriately.

Table 3: User perceived profits factors and their implications.
Perceived Profits | Explanation
Service acquisition | Providing information to online healthcare platforms facilitates access to appropriate services.
Personalized service | Providing information to online healthcare platforms facilitates access to personalized services.
Health benefits | Providing information to online healthcare platforms can be beneficial in helping to solve health problems.
Doctor-patient communication | Providing information to online healthcare platforms facilitates communication with doctors.

3.5 Willingness to Disclose

The user’s willingness to disclose has a decisive influence on disclosure behavior: the stronger
the willingness, the higher the probability of disclosure behavior. Disclosure willingness is divided
into short-term and long-term disclosure intention. Short-term disclosure intention is a user’s
present psychological state regarding providing information to online healthcare platforms.
Generally, users who are willing to disclose information in the present also tend to disclose in the
future. However, users’ willingness to disclose in the distant future may change, thus affecting
their future disclosure behavior. Therefore, both near-future and far-future willingness to disclose
are among the factors influencing users’ disclosure behavior. The value of short-term disclosure
intention is greater than that of long-term disclosure intention [23].

Table 4: Users’ willingness to disclose in short term and long term.
Willingness to disclose in short term and long term | Explanation
Short-term disclosure intention | Whether the user is willing to provide information to online healthcare platforms at the moment, when consulting a doctor.
Long-term disclosure intention | Whether information will be provided to online healthcare platforms in the future upon request or when it would help with health diagnosis.

[Figure 1 placeholder. Stages: Digitization, Labelling, Clustering. Panels: User Personal Profile
Construction (objectives, label system, index construction, Likert-scale questionnaires, label
dataset, UMF-SCF, user personal profiles) and User Group Profile Construction (similarity analysis
and clustering of user profiles into user group profiles 1 to n).]

Figure 1: Online healthcare privacy disclosure modeling process for multimodal user profiling.

4 Multimodal User Profiling Method to Online Healthcare Privacy
Disclosure Modeling

Online healthcare privacy disclosure modeling with multimodal user profiles consists of three
stages: digitization, labeling, and clustering. The specific process is shown in Figure 1. First, the
user privacy disclosure target is clarified and the online healthcare user privacy disclosure
labeling system is established. The questionnaire survey method is then used to obtain data, and the
samples are digitized after pre-processing. Next, based on the sample set and the labeling system,
feature labels are extracted from the multiple data sources of user online healthcare privacy
disclosure to form the labeled dataset, and UMF-SCF is used to make the modalities interact, forming
the personal profiles of user online healthcare privacy disclosure. Finally, based on the labeled
dataset, similarity analysis and K-means clustering are combined to group the users and obtain the
final user group profiles.
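
The digitization and clustering stages can be sketched as follows. This is a simplified illustration under several assumptions rather than the procedure used in the case study: the five-point answer wording in `LIKERT`, the three questionnaire items, and the sample responses are hypothetical; Euclidean distance stands in for the similarity analysis; and the sketch clusters raw label vectors directly, whereas the method above clusters the personal profiles produced by UMF-SCF.

```python
import math

# Assumed five-point Likert wording; the actual questionnaire scale may differ.
LIKERT = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
          "agree": 4, "strongly agree": 5}

def digitize(answers):
    """Map one respondent's Likert answers to a numeric label vector."""
    return [LIKERT[a] for a in answers]

def distance(p, q):
    """Euclidean distance, used here as the (assumed) similarity measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [min(range(k), key=lambda c: distance(p, centroids[c]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return assignment, centroids

# Hypothetical answers to three items (one risk, one profit, one intention item).
responses = [
    ["strongly agree", "disagree", "disagree"],
    ["disagree", "strongly agree", "agree"],
    ["agree", "disagree", "strongly disagree"],
    ["strongly disagree", "agree", "strongly agree"],
]
profiles = [digitize(r) for r in responses]
groups, centers = kmeans(profiles, k=2)
```

Each resulting centroid can be read as a group profile: here one group combines high perceived risk with low disclosure intention, and the other the reverse.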

4.1 Multimodal Fusion of Privacy Disclosure Data

Since users’ healthcare privacy disclosure behaviors are reflected by multiple data in the online
healthcare process, it is crucial to establish deep interactions between multiple modalities, so this
paper uses the multimodal fusion technique UMF-SCF [24] to comprehensively characterize users’
future privacy disclosure behaviors.

4.1.1 UMF-SCF Overall Architecture

UMF-SCF contains a modal fusion layer and a stacking layer, as shown in Figure 2. The modal fusion
layer has 25 cross-modal learning joint representation networks, such as FusionBR and FusionBRPHD,
one for each of the 25 modality combinations. Here B denotes the basic information features of users,
R the perceived risk features of users’ privacy disclosure, P the perceived profit features, H the
historical disclosure behavior features, and D the privacy disclosure intention features (including
short-term and long-term disclosure intention). The role of this layer is to generate joint or shared
feature representations of the modalities and to make the features of different modalities interact
deeply by combining data sources and nonlinear functions. In addition, since different modalities
have different levels of importance for different task goals, this layer also employs an attention
mechanism that applies modality-level weighted scoring to the extracted features. The stacking layer
implements decision-level fusion: it takes the prediction probabilities output by the modal fusion
layer and uses a multilayer perceptron to obtain the final prediction of the privacy disclosure
behavior of online healthcare platform users.
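
The naming of the joint representation networks can be illustrated by enumerating modality combinations. The sketch below is only an illustration: the helper `fusion_network_names` is hypothetical, and it assumes one network per combination of two or more modalities. Under that assumption the enumeration yields 26 names, one more than the 25 networks stated above; the text does not say which combination is left out, so the sketch keeps all of them.

```python
from itertools import combinations

# B: basic information, R: perceived risk, P: perceived profit,
# H: historical disclosure behavior, D: disclosure intention.
MODALITIES = "BRPHD"

def fusion_network_names(modalities=MODALITIES, min_size=2):
    """Name one joint-representation network per modality combination."""
    names = []
    for size in range(min_size, len(modalities) + 1):
        for combo in combinations(modalities, size):
            names.append("Fusion" + "".join(combo))
    return names

all_names = fusion_network_names()
pair_names = [n for n in all_names if len(n) == len("Fusion") + 2]
```

The pairwise networks come out as FusionBR, FusionBP, and so on up to FusionHD, and the last name in the list is FusionBRPHD, which fuses all five modalities.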

[Figure 2 placeholder. Inputs: modalities B, R, P, H, D and their combinations. Model Fusion: one
network per combination, from FusionBR, FusionBP, ..., FusionHD up to FusionBRPHD, each producing a
Prediction Probability. Stacking: MLP.]

Figure 2: UMF-SCF overall architecture.

4.1.2 FusionBRP

The cross-modal learning joint representation network is the core of the UMF-SCF model. This paper takes FusionBRP as an example to introduce this network, which consists of three parts: an embedding layer, an interaction layer, and a decision layer. Its structure is shown in Figure 3.

The embedding layer is pre-trained for each input to obtain the embedding representation of each modality. For FusionBRP, the embedding layer contains the user basic information embedding (B1), the perceived risk embedding (R1), and the perceived profits embedding (P1).

The interaction layer consists of two layers with the same structure. In each layer, every modal representation is first transformed by a hidden layer; then, for each transformed modal representation, the association representations with the other transformed modal embeddings are added. The resulting feature representation of each modality therefore contains the interaction information of the other modalities.

$$M^{1}_{b} = \tanh\left(U_{m_1} M^{1}_{a} + W_{m_1 m_2} U_{m_2} M^{2}_{a} + W_{m_1 m_3} U_{m_3} M^{3}_{a} + W_{m_1 m_4} U_{m_4} M^{4}_{a}\right) \tag{1}$$

where $M^{*}$ denotes the modal embeddings B, R, and P, respectively; $a$ is the index of the interaction layer and $b$ the index of its output (if $a = 1$, then $b = 2$; if $a = 2$, then $b = 3$); tanh is the neural network activation function; $U_{*}$ is the transformation matrix of the corresponding modal embedding in the hidden layer of the $a$-th interaction layer; and $W_{**}$ is the correlation weight matrix of the corresponding two modal embeddings in the $a$-th interaction layer.
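The interaction step of Equation 1 can be sketched in a few lines of NumPy for the three modalities of FusionBRP. This is a minimal illustration, not the implementation of [24]: the dimension `d`, the random parameters, and the helper names `interaction_layer` and `random_params` are assumptions, and ⊕ is assumed to mean vector concatenation.

```python
import numpy as np

def interaction_layer(embeddings, U, W):
    """Apply one interaction layer in the spirit of Equation 1.

    embeddings: dict modality -> vector of size d (layer-a representation)
    U: dict modality -> (d, d) hidden-layer transformation matrix
    W: dict (m1, m2) -> (d, d) correlation weight matrix between two modalities
    Returns a dict modality -> layer-(a+1) representation.
    """
    # Transform every modality with its own hidden-layer matrix.
    transformed = {m: U[m] @ x for m, x in embeddings.items()}
    out = {}
    for m1 in embeddings:
        total = transformed[m1].copy()
        # Add the association terms with every other transformed modality.
        for m2 in embeddings:
            if m2 != m1:
                total += W[(m1, m2)] @ transformed[m2]
        out[m1] = np.tanh(total)
    return out

def random_params(names, d, rng):
    """Hypothetical random parameters standing in for trained weights."""
    U = {m: rng.normal(scale=0.1, size=(d, d)) for m in names}
    W = {(m1, m2): rng.normal(scale=0.1, size=(d, d))
         for m1 in names for m2 in names if m1 != m2}
    return U, W

rng = np.random.default_rng(0)
d = 4
names = ["B", "R", "P"]
layer1 = {m: rng.normal(size=d) for m in names}                   # B1, R1, P1
layer2 = interaction_layer(layer1, *random_params(names, d, rng))  # B2, R2, P2
layer3 = interaction_layer(layer2, *random_params(names, d, rng))  # B3, R3, P3
joint = np.concatenate([layer3[m] for m in names])                 # B3 ⊕ R3 ⊕ P3
```

Each of the two layers has its own parameters, so the sketch draws a fresh set per layer; in a trained model these would be learned jointly with the decision layer.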

After the interaction layer, each user has four high-level representations: B3, R3, P3, and B3 ⊕ R3 ⊕ P3. The decision layer maps the four representations to their label category spaces CB, CR, CP, and CBRP through a linear layer, and a softmax layer is then used to obtain the probability distribution over the categories. The linear mapping layer is defined as follows.

$$C^{*} = W_{*-c} A_{3} + b_{*-c} \tag{2}$$

where $A_3 \in \{B_3, R_3, P_3, B_3 \oplus R_3 \oplus P_3\}$, $C^{*}$ denotes the label category space corresponding to $A_3$, $W_{*-c}$ denotes the linear layer weights corresponding to $A_3$, and $b_{*-c}$ is the corresponding bias value.


Figure 3: Structure of FusionBRP (embedding layer followed by the interaction layers).

The softmax layer is defined as shown below.

$$p^{*}_{c} = \frac{\exp\left(C^{*}_{c}\right)}{\sum_{k=1}^{K} \exp\left(C^{*}_{k}\right)} \tag{3}$$

where $p^{*}_{c}$ denotes the probability that B3, R3, P3, or B3 ⊕ R3 ⊕ P3 predicts category $c$, $K$ is the number of categories, and $C^{*}_{c}$ denotes the $c$-th component of the corresponding label category space.
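The decision layer of Equations 2 and 3 amounts to a linear map followed by a softmax. The sketch below shows this for a single representation; the weight shapes, the number of categories `K = 5`, and the random inputs are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    """Softmax of Equation 3, shifted by the maximum for numerical stability."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def decision_layer(representation, weight, bias):
    """Linear mapping of Equation 2 followed by the softmax of Equation 3."""
    logits = weight @ representation + bias
    return softmax(logits)

rng = np.random.default_rng(1)
K = 5                              # assumed number of label categories
rep = rng.normal(size=12)          # stands in for B3 ⊕ R3 ⊕ P3
W_c = rng.normal(size=(K, 12))     # hypothetical linear-layer weights
b_c = np.zeros(K)                  # hypothetical bias
probs = decision_layer(rep, W_c, b_c)
```

Subtracting the maximum logit does not change the result of Equation 3 but avoids overflow in the exponential.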

4.1.3 Loss Representation and Attention Mechanism for Multimodal Fusion

The loss function of FusionBRP consists of three components: the loss of each modal representation Lm, the loss of the joint representation LBRP, and the consistency loss between the modal representations Ld. These three loss functions are defined as follows.

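$$L_m = -\sum_{p \in P_m} \sum_{x \in X_t} \sum_{k=1}^{K} pr_k(x) \log\left(p(x)\right) \tag{4}$$

$$L_{BRP} = -\sum_{x \in X_t} \sum_{k=1}^{K} pr_k(x) \log\left(p^{BRP}_k\left(B_3(x) \oplus R_3(x) \oplus P_3(x)\right)\right) \tag{5}$$

$$L_d = -\sum_{(p_1, p_2) \in P_d} \sum_{x \in X_t} \sum_{k=1}^{K} p_1(x) \log\left(p_2(x)\right) \tag{6}$$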

where $P_m = \{p^{B}_k, p^{R}_k, p^{P}_k\}$ is the set of prediction probabilities used to compute $L_m$; $p^{*}_k$ is the probability that a modality predicts the $k$-th category; $X_t$ denotes the set of labeled user samples, i.e., the training set; $pr_k$ denotes the true label of the $k$-th category of a given sample; and $P_d = \{(p^{B}_k, p^{R}_k), (p^{B}_k, p^{P}_k), (p^{R}_k, p^{P}_k)\}$ is the set of pairs of prediction probabilities used to compute $L_d$.
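The three loss components are all cross-entropy sums and can be sketched as follows. The toy predictions, the one-hot labels, and the function names are assumptions made for illustration; the per-modality and per-pair terms are returned as lists, each holding the terms that Equations 4 and 6 sum over.

```python
from itertools import combinations

import numpy as np

EPS = 1e-12  # lower clip bound that avoids log(0)

def cross_entropy(target, pred):
    """-sum over samples and categories of target * log(pred)."""
    return -np.sum(target * np.log(np.clip(pred, EPS, 1.0)))

def modal_losses(true_onehot, preds):
    """One loss per modality prediction (the terms summed in Equation 4)."""
    return [cross_entropy(true_onehot, p) for p in preds]

def consistency_losses(preds):
    """One loss per unordered pair of modality predictions (Equation 6)."""
    return [cross_entropy(p1, p2) for p1, p2 in combinations(preds, 2)]

# Two labeled samples, K = 3 categories (toy values).
y = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
p_B = np.array([[0.7, 0.2, 0.1], [0.2, 0.6, 0.2]])
p_R = np.array([[0.6, 0.3, 0.1], [0.3, 0.5, 0.2]])
p_P = np.array([[0.5, 0.3, 0.2], [0.1, 0.8, 0.1]])
p_BRP = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]])

loss_lm = modal_losses(y, [p_B, p_R, p_P])    # terms of L_m
l_brp = cross_entropy(y, p_BRP)               # L_BRP (Equation 5)
loss_ld = consistency_losses([p_B, p_R, p_P])  # terms of L_d
```

`combinations` yields the pairs in the order (B, R), (B, P), (R, P), which matches the order in which the set of pairs is listed for the consistency loss.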

Table 5: User online healthcare privacy disclosure data.

| Item | Value |
| --- | --- |
| ID | 8 |
| Gender | Male |
| Age | 34 |
| Academic qualifications | Master’s Degree |
| Career | Programmer |
| Years of using online healthcare platform | 1 |
| Have experienced privacy leakage | No |
| Perceived risk | 4.33 |
| Perceived profit | 2.25 |
| Historical disclosure behavior | 1.25 |
| Recent disclosure intentions | 2.75 |
| Willingness to disclose in the future | 2.50 |

Since different modalities contribute differently to the classification tasks for different user attributes, this paper introduces an attention mechanism that linearly weights the above three losses to obtain the final loss. The attention in this paper can be interpreted as follows. For different modalities performing the same classification task, the influence of short term disclosure intention on users’ disclosure behavior is greater than that of long term disclosure intention, so short term disclosure intention is more important and should be given more weight. For the same modality performing different classification tasks, the influence of perceived risk on long term disclosure intention is significantly greater than its influence on short term disclosure intention, so perceived risk should be given more weight when classifying long term disclosure intention. The calculation formula is as follows.

$$L_{total} = \sum_{i=0}^{2} w[i] \cdot Loss_{L_m}[i] + w[3] \cdot L_{BRP} + \sum_{j=4}^{6} w[j] \cdot Loss_{L_d}[j-4] + w[6] \cdot L2Loss \tag{7}$$

where $w$ denotes the list of weight coefficients introduced by the attention mechanism, $Loss_{L_m}$ denotes the list of $L_m$ values, and $Loss_{L_d}$ denotes the list of $L_d$ values. The weights $w$ balance the representation of each modality, the joint representation, and the consistency.
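The weighted combination of Equation 7 can be sketched as a plain function. Two points are assumptions rather than statements from the text: L2Loss is taken to be an L2 regularization penalty, and it is given its own weight `w[7]`, because the printed formula uses index 6 both for the last consistency term and for L2Loss. All numeric values are illustrative.

```python
def total_loss(w, loss_lm, l_brp, loss_ld, l2_loss):
    """Linearly weighted total loss in the spirit of Equation 7.

    w: eight weights; w[0:3] for the modal losses, w[3] for the joint loss,
       w[4:7] for the consistency losses, w[7] for the L2 term (assumed index).
    """
    if len(w) != 8 or len(loss_lm) != 3 or len(loss_ld) != 3:
        raise ValueError("expected 8 weights, 3 modal losses, 3 consistency losses")
    modal = sum(w[i] * loss_lm[i] for i in range(3))
    consistency = sum(w[j] * loss_ld[j - 4] for j in range(4, 7))
    return modal + w[3] * l_brp + consistency + w[7] * l2_loss

weights = [1.0, 1.0, 1.0, 2.0, 0.5, 0.5, 0.5, 0.01]
loss = total_loss(weights,
                  loss_lm=[0.4, 0.5, 0.6],
                  l_brp=0.3,
                  loss_ld=[0.2, 0.1, 0.3],
                  l2_loss=10.0)
print(round(loss, 6))  # 1.5 + 0.6 + 0.3 + 0.1 = 2.5
```

In a trained model the weights would be produced or tuned by the attention mechanism; here they are fixed constants so that the arithmetic can be checked by hand.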

4.2 Personal Profiles and Group Profiles of Users’ Online Healthcare Privacy Disclosure

4.2.1 Personal Profile Construction for Online Healthcare Privacy Disclosure of Users

Table 5 shows the data generated by a user who has used an online health care platform. The user
profile built from this data is shown in Figure 4.

4.2.2 Group Profile Construction for Online Healthcare Privacy Disclosure of Users

Because people differ greatly, it is difficult to achieve personalized or accurate recommendation of medical services by constructing only personal profiles of users, so this paper also constructs online healthcare privacy disclosure group profiles based on the personal profiles. A user group profile statistically analyzes the similarity of multiple users, clusters users with similar characteristics into several user groups, and summarizes and refines the common indexes within each group. Its construction therefore starts from the personal profiles and proceeds through similarity analysis and aggregation. In this paper, we use the Vector Space Model (VSM) [25] to calculate the similarity of users and the classical K-means algorithm [26] to classify user groups. VSM first represents user features as vectors and then measures the similarity between two vectors with a cosine-based calculation. The similarity formula is as follows.

$$sim(user_1, user_2) = \cos\left(\frac{\sum_{i=1}^{F} f_i f'_i}{\prod_{j=1}^{F} \sqrt{f_j^{2} + f_j'^{2}}}\right) \tag{8}$$

where $F$ is the number of user features, $f_i$ is the $i$-th component of the feature vector of user 1, and $f'_i$ is the $i$-th component of the feature vector of user 2.
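For comparison, the conventional cosine similarity used with vector space models divides the dot product of two vectors by the product of their norms. The sketch below implements that textbook form; it is an assumption about the intended computation and not a transcription of Equation 8. The function name is illustrative, and the two vectors are the example users given later in Section 5.4.

```python
import math

def cosine_similarity(u, v):
    """Textbook cosine similarity: dot(u, v) / (||u|| * ||v||)."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

user1 = [4.33, 2.25, 1.25, 2.75, 2.50]
user2 = [1.00, 3.50, 4.25, 4.75, 4.75]
similarity = cosine_similarity(user1, user2)
print(round(similarity, 3))  # approximately 0.776
```

With non-negative feature scores the textbook measure always lies between 0 and 1, where 1 means the two users have proportional feature vectors.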

Figure 4: Personal profile of users’ online healthcare privacy disclosure. Labels shown: user id 1; male; 34 years old; master degree; programmer; no privacy leakage experience; use the online healthcare platform for 1 year; strong perceived risk; medium perceived profit; low historical disclosure; medium short term disclosure intention; medium long term disclosure intention.

User profile clustering classifies users according to their features and attributes into several categories, keeping the differences within each category as small as possible and the differences between categories as large as possible. Taking user profile classification as an example, the specific steps of K-means clustering are: 1) randomly select K users as the initial cluster centers; 2) according to the similarity, assign each remaining user to the class of the most similar center; 3) according to the clustering result, recalculate the center of each of the K classes as the arithmetic mean, dimension by dimension, of the feature vectors of all users in the class; 4) repeat step 2; 5) repeat steps 3 and 4 until the dissimilarity between the current clustering result and the previous one is less than the set threshold; 6) obtain the final user clustering result. In addition, this paper uses the simplest way to select the K value, which is calculated as follows.

$$K = \sqrt{N/2} \tag{9}$$

where N denotes the total number of clustering units, i.e., the number of user samples involved in
clustering.
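The clustering steps and the rule of Equation 9 can be sketched with NumPy as follows. Several choices are assumptions made for illustration: the K value is rounded to the nearest integer, similarity is the textbook cosine similarity, convergence is measured by how far the centers move, and the 464 feature vectors are synthetic random scores rather than the questionnaire data.

```python
import math

import numpy as np

def choose_k(n_samples):
    """K = sqrt(N / 2) from Equation 9, rounded to the nearest integer (assumed)."""
    return max(1, round(math.sqrt(n_samples / 2)))

def cosine_matrix(X, C):
    """Pairwise cosine similarity between rows of X and rows of C."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    return Xn @ Cn.T

def kmeans_cosine(X, k, max_iter=100, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: randomly select K users as the initial centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        # Step 2: assign every user to the most similar center.
        labels = np.argmax(cosine_matrix(X, centers), axis=1)
        # Step 3: recompute each center as the mean of its members.
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        # Step 5: stop when the centers barely move (assumed threshold).
        if shift < tol:
            break
    labels = np.argmax(cosine_matrix(X, centers), axis=1)
    return labels, centers

rng = np.random.default_rng(42)
users = rng.uniform(1.0, 5.0, size=(464, 5))  # synthetic label scores
k = choose_k(len(users))
labels, centers = kmeans_cosine(users, k)
```

For N = 464 the rule gives the square root of 232, about 15.2, which rounds to 15 clusters.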

5 Case Studies
In this paper, we explain the above process of modeling online healthcare privacy disclosure with multimodal user profiles through a case study.

5.1 Design of Profile Labeling System

By analyzing the data of users’ online healthcare privacy disclosure, this paper establishes six basic elements as subdivision indicators to design the label system of the user profile: users’ basic information, perceived risk, perceived profits, historical disclosure behavior, short term disclosure intention, and long term disclosure intention. The final label system of the user privacy disclosure profile of the online healthcare platform is shown in Figure 5. The six labels include a total of 25 measurement variables, as shown in Table 6.

The difference in the privacy disclosure profiles of users of online healthcare platforms lies in the label weighting design, i.e., different user group characteristics are reflected in different degrees of importance on a certain label. Therefore, this paper expresses users’ tendency to disclose privacy in a certain aspect with the help of a 5-point Likert scale from low to high. The mean value of each user psychological preference variable in each subgroup is chosen as the user profile label weight, while the basic user information is ranked by frequency using attribute values, thus defining the weight value of the user profile label.

Figure 5: User privacy disclosure profile labeling system. The user profile branches into six labels: user basic information, perceived risk, perceived profits, historical disclosure behavior, short term disclosure intention, and long term disclosure intention.

Table 6: Measurement variables for user profiling labels.

| Label | Measurement variable |
| --- | --- |
| User basic information | Gender |
| | Age |
| | Qualification |
| | Occupation |
| | Time to use the online healthcare platform |
| | Have experienced a privacy leakage |
| Perceived risk | Providing information to online healthcare platforms is risky |
| | The information provided to the online healthcare platform may be leaked |
| | Information provided to online healthcare platforms may be used inappropriately |
| Perceived profits | Providing information to the online healthcare platform is conducive to obtaining corresponding services |
| | Providing information to the online healthcare platform is conducive to enjoying personalized services |
| | Providing information to online healthcare platforms can help solve health problems |
| | Providing information to the online healthcare platform can better communicate with doctors or other users |
| Historical disclosure behavior | Once provided information to the online healthcare platform |
| | Personal information can be found on the online healthcare platform |
| | Once mentioned personal things on the online healthcare platform |
| | The information shared to the online healthcare platform is consistent with the actual situation |
| Short term disclosure intention | If you are ill, you will provide information to online healthcare platform or doctors to obtain services |
| | If you are ill, you will provide information when the doctor or online healthcare platform requests information |
| | If you are ill, you will provide information that will help you stay healthy |
| | If you are ill, you will provide information to other users of the online healthcare platform for help |
| Long term disclosure intention | Intend to provide information to online healthcare platform or doctor to obtain services in the future |
| | Intend to provide personal information when online healthcare platform or doctor ask for information in the future |
| | Intend to provide information to online healthcare platform or doctor when providing information helps to maintain health |
| | Will continue to share information with online healthcare platform or doctor as you do now in the future |

Table 7: Average accuracy of user label prediction for each method.

| Method | Basic information | Perceived risk | Perceived profits | Historical disclosure behavior | Recent disclosure intentions | Long term disclosure intention |
| --- | --- | --- | --- | --- | --- | --- |
| Manual Feature+SVM | 0.720 | 0.653 | 0.632 | 0.678 | 0.614 | 0.683 |
| Concat+SVM | 0.709 | 0.682 | 0.675 | 0.705 | 0.603 | 0.690 |
| UDMF | 0.765 | 0.703 | 0.688 | 0.726 | 0.619 | 0.711 |
| UMF-SCF without attention | 0.789 | 0.725 | 0.723 | 0.759 | 0.705 | 0.799 |
| UMF-SCF | 0.812 | 0.788 | 0.758 | 0.806 | 0.769 | 0.856 |

5.2 Data Acquisition

This paper uses questionnaires to obtain the basic data of users’ online healthcare privacy disclosure, so as to build personal and group profiles. Data obtained through a questionnaire are targeted and specific, and the questions can be clearly and specifically oriented, which facilitates precise analysis in the subsequent study. The questionnaire consists of the 25 questions in Table 6. All questions except those on basic personal information use a 5-point Likert scale from 1 (“strongly disagree”) to 5 (“strongly agree”).

The questionnaires were distributed over 14 days, and a total of 700 questionnaires were distributed. After removing 120 questionnaires from respondents who had not used an online healthcare platform and 10 questionnaires in which the same option was selected consecutively, 580 valid questionnaires were finally obtained. In this paper, the data of 464 of these users were used as the training set and the data of 116 users were used as the test set. In order to construct the user feature vector, the 19 measurement variables of perceived risk, perceived profits, historical disclosure behavior, short term disclosure intention and long term disclosure intention in the questionnaire must be quantified. In this paper, scores of 1, 2, 3, 4, and 5 are assigned to markers 1 to 5 on the 5-point Likert scale, and the scores of the options selected by a user are averaged within each label to obtain scores for the 5 labels, such as perceived risk (e.g., 4.33 for perceived risk in Table 5). A user is considered to have a very low performance level on a label if the score ∈ (0.0, 1.0], low if the score ∈ (1.0, 2.0], medium if the score ∈ (2.0, 3.0], high if the score ∈ (3.0, 4.0], and very high if the score ∈ (4.0, 5.0].
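The quantification rule can be sketched as two small functions: one that turns the Likert answers of a label into a score, and one that maps the score to a performance level. Averaging the item scores is an assumption inferred from the 0 to 5 score ranges above, and the item answers in the example are hypothetical values chosen to reproduce two scores of Table 5.

```python
def label_score(item_scores):
    """Mean of the Likert item scores (1 to 5) belonging to one label."""
    if not item_scores:
        raise ValueError("a label needs at least one item score")
    return sum(item_scores) / len(item_scores)

def performance_level(score):
    """Map a label score in (0, 5] to the five performance levels."""
    if not 0.0 < score <= 5.0:
        raise ValueError("score must lie in (0, 5]")
    if score <= 1.0:
        return "very low"
    if score <= 2.0:
        return "low"
    if score <= 3.0:
        return "medium"
    if score <= 4.0:
        return "high"
    return "very high"

# Hypothetical answers: three perceived-risk items, four perceived-profit items.
risk = label_score([5, 4, 4])         # 13 / 3, about 4.33
profit = label_score([2, 2, 3, 2])    # 9 / 4 = 2.25
print(performance_level(risk), performance_level(profit))
```

Because every interval is closed on the right, a score that falls exactly on a boundary, such as 3.0, is assigned to the lower of the two neighboring levels.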

5.3 User Online Healthcare Privacy Disclosure Personal Profile Construction

Constructing the user’s personal profile means predicting the user’s basic information, perceived risk, perceived profits, historical disclosure behavior, short term disclosure intention, and long term disclosure intention, as shown in Figure 4. We use UMF-SCF to fuse and predict multiple features of users, and set up baseline models, including Manual Feature+SVM, Concat+SVM, UDMF, and UMF-SCF without the attention mechanism, to compare and evaluate the accuracy of the UMF-SCF model for personal profile construction. Accuracy is used as the evaluation metric. The average accuracy of each method on the test set is shown in Table 7.

As Table 7 shows, UMF-SCF achieves the best results in predicting all six features of users. Manual Feature+SVM is a unimodal embedding method; compared with it, the other four multimodal fusion methods generally perform better on the user profiling problem, although Concat+SVM is slightly lower on basic information and recent disclosure intentions. Concat+SVM is a multimodal fusion method but performs worse than UMF-SCF because it only splices the features together without sufficient interaction between the modalities. UDMF uses only a simple neural network to fuse the data sources, without deep interaction between the modalities. UMF-SCF without the attention mechanism treats every modality as equally important for the different classification tasks and performs worse than UMF-SCF, indicating that setting different weights improves the performance of the model.

5.4 User Online Healthcare Privacy Disclosure Group Profile Construction

In order to effectively obtain the differentiated profiles of online healthcare platform user groups, this paper selects perceived risk, perceived profits, historical disclosure behavior, short term disclosure intention and long term disclosure intention as feature factors, constructs a feature vector for each user, and calculates the similarity degree values between each two users by VSM. For example, the feature vector of user 1 is (4.33, 2.25, 1.25, 2.75, 2.50), and the feature vector of user 2 is (1.00, 3.50, 4.25, 4.75, 4.75), then the similarity degree between user 1 and user 2 is

$$\cos\left(\frac{4.33 \times 1.00 + 2.25 \times 3.50 + 1.25 \times 4.25 + 2.75 \times 4.75 + 2.50 \times 4.75}{\sqrt{4.33^2 + 1.00^2} \times \sqrt{2.25^2 + 3.50^2} \times \sqrt{1.25^2 + 4.25^2} \times \sqrt{2.75^2 + 4.75^2} \times \sqrt{2.50^2 + 4.75^2}}\right) = 1.78.$$

Next, the K-means algorithm and the calculated pairwise similarity values are used to classify the user groups. The 464 users in the training set serve as the clustering objects, and the K value is calculated as 15 by Equation 9. The 15 users with the smallest mutual similarity are selected as the initial clustering centers according to the minimum-maximum principle, and 15 clusters are finally obtained following the steps of the K-means clustering method. The center of each cluster is the representative of all objects in that cluster, and its individual parameters reflect the common characteristics of the group. The 15 user groups obtained in this paper are as follows.

User group A = {user 1, user 56, user 156, user 159, ..., user 486}
User group B = {user 2, user 33, user 78, user 255, ..., user 578}
...
User group O = {user 89, user 112, user 139, user 201, ..., user 487}

To evaluate the accuracy of the group profiles constructed in this paper, we used the data of the remaining 116 users for testing. The labels of these users’ personal profiles are first represented as feature vectors, and their similarity to the representative of each of the 15 clustering centers is calculated. Each user is assigned to the user group whose clustering center has the largest similarity value, and the accuracy of the predicted online healthcare privacy disclosure of these 116 users is then investigated individually through questionnaires. The results show that about 89% of the users consider their willingness to be the same as that described by the group profiles. From the perspective of online healthcare services, these users will have access to more accurate diagnoses than the other 11%, and the platform may give the remaining 11% a broader range of treatment options. Users’ disclosure behaviors are thus predicted accurately in most cases, and the user group profile established in this paper can effectively model the characteristics of users’ online healthcare privacy disclosure behavior.
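The assignment of test users to user groups can be sketched as a nearest-center lookup. The sketch assumes the textbook cosine similarity as the similarity measure; the three cluster centers are hypothetical values, and the two test vectors reuse the example users from this section.

```python
import numpy as np

def assign_to_groups(test_vectors, centers):
    """Index of the most similar cluster center for every test user."""
    t = test_vectors / np.linalg.norm(test_vectors, axis=1, keepdims=True)
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    similarity = t @ c.T          # rows: test users, columns: centers
    return np.argmax(similarity, axis=1)

# Hypothetical cluster centers (three instead of the 15 used in the case study).
centers = np.array([
    [4.0, 2.0, 1.0, 3.0, 2.0],
    [1.0, 4.0, 4.0, 5.0, 5.0],
    [3.0, 3.0, 3.0, 3.0, 3.0],
])
test_users = np.array([
    [4.33, 2.25, 1.25, 2.75, 2.50],
    [1.00, 3.50, 4.25, 4.75, 4.75],
])
groups = assign_to_groups(test_users, centers)
print(groups.tolist())  # [0, 1]
```

The first user, with high perceived risk and low disclosure scores, lands in the group of the first center, while the second user, with low perceived risk and high disclosure scores, lands in the group of the second center.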

It is worth noting that the factors influencing users’ medical privacy disclosure behaviors considered in this paper are still limited, which restricts the accuracy of behavioral predictions to a certain extent. In the future, we will further explore the privacy disclosure intention of online healthcare users by combining privacy protection, differences between domestic and international service platforms, and demographic factors, to further assist the development of online healthcare privacy platforms.

6 Conclusion
To address the challenges of user privacy disclosure in current online healthcare platforms, this paper proposes a multimodal fusion user profile modeling method. The method achieves cross-modal interaction of multiple data sources for user privacy disclosure through UMF-SCF, constructs personal profiles of users’ online healthcare privacy disclosure, and comprehensively characterizes users’ privacy disclosure behavior. It then constructs group profiles based on the personal profiles and the K-means clustering algorithm, and integrates the user characteristics presented by the personal and group profiles to capture the needs of online healthcare platform users more accurately. According to the case study, the accuracy of the proposed method is significantly improved compared with unimodal and some advanced multimodal methods, and the questionnaires show that the accuracy of the user profiles constructed by this method reaches 89%. The multimodal fusion user profile modeling method can make full use of users’ online healthcare privacy disclosure behavior while protecting user information from leakage, and can promote personalized online healthcare services.

References
[1] Boonstra A, Broekhuis M. Barriers to the acceptance of electronic medical records by physicians from systematic review to taxonomy and interventions[J]. BMC health services research, 2010, 10(1): 1-17.

[2] Sun S, Zhang J, Zhu Y, et al. Exploring users’ willingness to disclose personal information in
online healthcare communities: The role of satisfaction[J]. Technological Forecasting and Social
Change, 2022, 178: 121596.

[3] Kobrinskii B A, Grigoriev O G, Molodchenkov A I, et al. Artificial intelligence technologies
application for personal health management[J]. IFAC-PapersOnLine, 2019, 52(25): 70-74.

[4] Dai Q Y, Hong X B, Cai J, et al. Deep Learning Based Recommendation Algorithm in Online
Medical Platform[C]//International Conference on Brain Inspired Cognitive Systems. Springer,
Cham, 2018: 34-43.

[5] Dhiman G, Juneja S, Mohafez H, et al. Federated learning approach to protect healthcare data
over big data scenario[J]. Sustainability, 2022, 14(5): 2500.

[6] Edemekong P F, Annamaraju P, Haydel M J. Health insurance portability and accountability
act[J]. 2018.

[7] Lye C T, Forman H P, Gao R, et al. Assessment of US hospital compliance with regulations for
patients’ requests for medical records[J]. JAMA network open, 2018, 1(6): e183014-e183014.

[8] Zhang X, Liu S, Chen X, et al. Health information privacy concerns, antecedents, and information
disclosure intention in online health communities[J]. Information & Management, 2018, 55(4):
482-493.

[9] Bansal G, Gefen D. The impact of personal dispositions on information sensitivity, privacy concern
and trust in disclosing health information online[J]. Decision support systems, 2010, 49(2): 138-
150.

[10] Wang, Yuchao, & Sun, Y. Q. The influence of service and reciprocity norm on self-disclosure
intention in virtual health community[J]. Intelligence Science, 36(5), 149-157.

[11] Zhang, Yue, & Zhu, Qing-Hua. Review on foreign study of information privacy[J]. Library and
Information Service, 58(13), 140-148.

[12] Wang T, Duong T D, Chen C C. Intention to disclose personal information via mobile applications:
A privacy calculus perspective[J]. International journal of information management, 2016, 36(4):
531-542.

[13] Sapuppo A. Privacy analysis in mobile social networks: the influential factors for disclosure of
personal data[J]. International Journal of Wireless and Mobile Computing, 2012, 5(4): 315-326.

[14] Wang L, Yan J, Lin J, et al. Let the users tell the truth: Self-disclosure intention and self-disclosure
honesty in mobile social networking[J]. International Journal of Information Management, 2017,
37(1): 1428-1440.


[15] Papadopoulou P, Pelet J E. Trust and privacy in the shift from e-commerce to m-commerce: A
comparative approach[C]//Conference on e-Business, e-Services and e-Society. Springer, Berlin,
Heidelberg, 2013: 50-60.

[16] Bergström A. Online privacy concerns: A broad approach to understanding the concerns of
different groups for different uses[J]. Computers in Human Behavior, 2015, 53: 419-426.

[17] Keith M, Thompson S, Hale J, et al. Examining the rationality of location data disclosure through
mobile devices[J]. 2012.

[18] Xiao J, Ye H, He X, et al. Attentional factorization machines: Learning the weight of feature
interactions via attention networks[J]. arXiv preprint arXiv:1708.04617, 2017.

[19] Wei H, Zhang F, Yuan N J, et al. Beyond the words: Predicting user personality from heterogeneous information[C]//Proceedings of the tenth ACM international conference on web search and data mining. 2017: 305-314.

[20] Wöllmer M, Weninger F, Knaup T, et al. Youtube movie reviews: Sentiment analysis in an
audio-visual context[J]. IEEE Intelligent Systems, 2013, 28(3): 46-53.

[21] Gu Y, Yang K, Fu S, et al. Hybrid attention based multimodal network for spoken language classification[C]//Proceedings of the Conference. Association for Computational Linguistics. Meeting. NIH Public Access, 2018, 2018: 2379.

[22] Gu Y, Chen S, Marsic I. Deep multimodal learning for emotion recognition in spoken language[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018: 5079-5083.

[23] Liu, W., & Wu, D.J. Research on factors influencing user privacy disclosure behavior of online
healthcare platforms[J]. Journal of Medical Informatics, 42(6), 16-23.

[24] Zhang Z, Feng X, Qian T. User profiling based on multimodal fusion technology[J]. Beijing Da
Xue Xue Bao, 2020, 56(1): 105-111.

[25] Salton G, Wong A, Yang C S. A vector space model for automatic indexing[J]. Communications
of the ACM, 1975, 18(11): 613-620.

[26] Hartigan J A, Wong M A. Algorithm AS 136: A k-means clustering algorithm[J]. Journal of the
royal statistical society. series c (applied statistics), 1979, 28(1): 100-108.

Copyright ©2022 by the authors. Licensee Agora University, Oradea, Romania.
This is an open access article distributed under the terms and conditions of the Creative Commons
Attribution-NonCommercial 4.0 International License.
Journal’s webpage: http://univagora.ro/jour/index.php/ijccc/

This journal is a member of, and subscribes to the principles of,
the Committee on Publication Ethics (COPE).

https://publicationethics.org/members/international-journal-computers-communications-and-control


Cite this paper as:

Yong Wang (2022). Online Healthcare Privacy Disclosure User Group Profile Modeling Based
on Multimodal Fusion, International Journal of Computers Communications & Control, 17(5), 4696,
2022.

https://doi.org/10.15837/ijccc.2022.5.4696
