

INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL

ISSN 1841-9836, 11(5):734-746, October 2016.

A Momentum Theory for Hot Topic Life-cycle:
A Case Study of Hot Hashtag Emerging in Twitter

L. Wang, X. Li, L.J. Liao, L. Liu

Liu Wang, Xin Li, Le-Jian Liao*, Li Liu

School of Computer Science
Beijing Institute of Technology
No. 5 South Zhongguancun Street
Beijing, 100081, China
wangliu2000@163.com, xinli@bit.edu.cn,
liaolj@bit.edu.cn, liuli0407@hotmail.com
*Corresponding author: liaolj@bit.edu.cn

Abstract:

Existing work on mining hot topics is mainly based on topic multiplicity and on user
attention per unit time. With the advent of social networking, emphasis has been placed
on measures that can effectively describe the importance and hotness of a topic.
However, research on the influence that accumulated attention exerts on hot topics,
and on the alternation between hot topics and outdated ones, is still relatively weak.
In this paper, a novel momentum-based algorithm for calculating the hotness of topics
is proposed. It takes into consideration not only the number of participants but also
the long-tail effect of historical accumulation on the topic. With this algorithm, we
can accurately model hot topics during their emerging and growing periods and
effectively describe the whole life cycle of a topic. Additionally, the changeover
between hot topics and old ones can be distinguished efficiently. Our experiments show
that the process of a topic growing into a hot topic can be detected explicitly:
potential hot topics can be discovered and outdated ones rejected.
Keywords: hashtag, hot topic, aging theory.

1 Introduction

Social networking has become an important source of real-time news updates. As
smartphones and other mobile devices spread, people increasingly use social networks such
as Twitter and Weibo to learn about hot issues around the world. Detecting hot topics in
the large volume of information posted online at the same time is of significant interest
for many reasons. For one, it shortens the time users need to find a hot topic, which may
play an important role in decision-making. In addition, the detected hot topics give an
easy understanding of current social dynamics.

Most social network systems provide a ranking list of hot topics based on the search or
post count of keywords. However, this approach fails to take the temporal relations of hot
topics into account. Current hot topic detection solutions are mostly based on topic
multiplicity and user attention per unit time, and they neglect the digestion of hot topics.
They also aim to establish topics by clustering keywords at the content layer, which
overlooks the functions of a social network. From the point of view of research methods,
physics methods have been applied in many fields: Newton's laws were applied in [6,18],
gravity was applied to manufacturing modeling [21], the momentum theorem was used in
[26,34], and [15] combined the momentum method with stock prediction. All of these works
build their models with physics methods. With respect to social networking features, in
addition to research on security and network mining [11,28,31,35], many scholars are still
studying other functions of hashtags [27,40]. Xiao and Aldhelaan [1,37] used social networks
for research on hot topic discovery and recommendation, and Chen [13] combined an evolution
model to achieve topic prediction. The accumulation of topics in social networks and the
integration of topics have strong physical characteristics, so we try to model topics in
social networks with physics methods and combine the momentum theorem with hashtags.

Copyright © 2006-2016 by CCC Publications

In this paper, we aim to detect emerging hot topics according to momentum theory with
the use of hashtags. The momentum of hashtags reveals the mechanism of the dynamics
through temporal and quantity characteristics and uncovers the ideal life cycle of topics.
As a result, momentum theory can be used to dig out hot topics more accurately and
effectively. The proposed algorithm does not need to collect and store a large amount of
historical data, so it is suitable for real-time calculation on data streams. Moreover,
the algorithm has few nested loops and its time complexity is approximately O(n). Owing to
this computational efficiency, the algorithm is suitable for large-scale data analysis as
well.

The rest of the paper is organized as follows: In Section 2, we give a review of related works.
In Section 3, we propose the definition of hot topics. Section 4 describes the theoretical model
of topic hotness. In Section 5, we discuss the results of the experiments run. Finally, in Section
6, we present our conclusions and some future research directions.

2 Related Work

Topic Detection and Tracking (TDT) [2] has long been a foundation of research on hot
topic detection. However, unlike traditional information sources such as Web pages and
texts, information in social networks is often very short and sparse, and it spreads
rapidly. In the new era of social media, a series of works has addressed these
characteristics. You et al. [7] utilized the frequent item set mining algorithm SaM [8] to
find hot topics denoted by combinations of keywords; detecting hot topics thus becomes
comparable to mining frequent patterns from news streams. Similarly, a recent work by
Kim et al. [22] also took geographic elements into consideration, providing a simple but
useful approach to analyzing real-time streaming data and finding geographic communities.
Building on FP-growth [20], Giannella et al. [19] proposed FP-stream, with time windows and
the concept of digestion of the support parameters. This algorithm lets outdated items
expire and accumulates the importance of items along the timeline; however, it is more
complicated. Guo et al. [24] improved the tree structure of the Frequent Pattern stream
mining algorithm (FP-stream) [19] and used the new algorithm to detect hot topics from
Twitter streams, which leads to time-sensitive results. Lee et al. [17] developed an
algorithm for ranking topics that uses topic energy to represent the significance of a
given topic at each time point within a time period. The strategy for determining the topic
energy value considers factors such as popularity, burstiness and informativeness, which
suits the character of information on social networks better. In [29], the significance of
key entities is computed with the traditional "tf-idf" evaluation method [33] from the
information retrieval literature, and the entities are then clustered to generate
significant events. Bun et al. [25] computed the value of a term by using TF*PDF and
clustered terms into a sentence. Yang et al. [39] applied the VSM to news TDT and used a
time window with a decaying function (TW-DF) to model the temporal relations between
documents and events. Unfortunately, the strategies above do not consider temporal
relations thoroughly, so newly generated hot topics cannot be accurately distinguished
from outdated ones.

Chen et al. [12,16] proposed an aging theory to model the life cycles of news events. For
all user messages generated in the topic's time interval, the aging theory calculates their
nutrition, converts the nutrition value to an energy value, and adds it to the cumulative
energy value of the topic. It also applies an energy decay strategy to the topic: as time
goes by, the energy decays gradually. If the increase in energy is less than its reduction,
the topic shows a trend of attenuation. If the energy value falls below a given threshold,
the topic is set to the "death" state and removed from the hot topic list. The aging
algorithm defines α as the nutrition transfer factor and β as the nutrition decay factor,
with 0 < α < 1 and 0 < β < 1; α determines the increase of nutrition from an input news
document, and β determines the nutrition loss in a period. Zheng et al. [41] applied aging
theory to candidate topics discovered by a clustering method in each time slot of a BBS.
With energy values updated at the end of each time slot according to the three functions of
aging theory, the hot topics in each time slot can be found easily. Chen et al. [14]
combined aging theory with a term weighting scheme to extract genuine hot terms. Based on
the extracted hot terms, key sentences are identified and grouped, using multidimensional
sentence vectors, into clusters that represent hot topics. Cataldi et al. [10] improved the
nutrition formula by using "idf" and formalized the keyword life cycle with a novel aging
theory intended to mine terms that occur frequently in the specified time interval but were
relatively rare in the past. Wang et al. [36] introduced media focus and user attention
into the classical aging theory and demonstrated the importance of these two factors. In
comparison with these approaches, our solution extracts hot topics based on hashtags,
relying on the functions of a social network rather than on content. The hot topic emerging
theory based on momentum can explicitly distinguish new hot topics from outdated ones.

The methods mentioned above emphasize topic extraction from clusters of sentences or from
the frequency patterns of topics, but ignore the topic life cycle. The aging theory
[12,16,41] captures the life cycle of a topic but does not handle the elimination of older
topics very well. Chen et al. [23] proposed an idea similar to our model, defining hot
velocity and hot acceleration to recognize hot topics, but did not reveal the dynamic
characteristics. Different from that model, we define a hot topic emerging theory based on
momentum. Momentum theory can describe the whole life cycle of a topic and thus sift newly
emerged hot topics from old ones more effectively.

3 Definition of Hot Topic

Hot topics in real microblogging systems have three characteristics: 1) the number of
posts related to the topic exceeds a threshold; 2) the number of users concerned is large,
especially key persons; 3) a hot topic arises within a short time [23]. In [14], the
characteristics of hot topics are summarized, based on [25], as topics that appear on many
news channels and go through a life cycle of birth, growth, maturity, and death. In this
paper, hot topics are defined as topics that are both influential and recent. Four
characteristics are summarized as follows:

i Timeliness. The hot topic refers to the people and things that have happened recently. Those
topics which have been under discussion for more than seven days or one month cannot be
regarded as hot topics.

ii Development. Due to the dynamics while topics are propagated, the influence of public
opinion will spread as time goes by. The hot topics diffract backwards.

iii Accumulation. The more people are involved in the discussion of a topic, the hotter the
topic is. However, as more and more attention has been drawn to a certain topic, it becomes
less likely to become a hot topic again.




iv Digestion. As time goes on, the influence of a topic is unavoidably diluted. Thus hot
topics tend to be forgotten.

4 Kinetic Model of Topic Hotness

At a given time point, a topic can be regarded as a set containing a large number of
synonymous terms. To describe this concept from the perspective of physics, we treat each
term as a particle that has weight and velocity. The activeness of a particle represents
its energy, and the activeness of a topic reflects the particles inside the set. As time
goes on, the set changes dynamically as terms are added, and the energy of the particles
changes too. At time $t-1$, the topic set is represented as $Topic_{n(t-1)}$. At time $t$,
the additional subset $Topic_{n\Delta t}$ joins $Topic_{n(t-1)}$ to form the new set
$Topic_{nt}$. This process can be regarded as two moving physical objects that collide and
fuse into one moving object with an exchange of energy. In this paper, we adopt the
conservation of momentum to describe this process.

In this paper, we propose to use momentum to represent the thermodynamic and dynamic
characteristics of hot topics. A negative acceleration is introduced to represent the
digestion of hotness, following the attenuation of hot topics as they propagate. We use the
momentum equation to express the active momentum of topics that is aroused by discussion.

A topic consists of many synonymous terms. In social networks, a post can be seen as a
term, and similar terms constitute a topic. The topic set varies all the time, and the
variation forms a time sequence. For a topic n, the topic set can be expressed as:

$$Topic_n := \{Topic_{n1}, Topic_{n2}, \ldots, Topic_{nt}\}$$

All topics together form the whole topic set:

$$Topic := \{Topic_1, Topic_2, \ldots, Topic_n\}$$

The current topic set is the union of the former set and the subset added during the
interval:

$$Topic_{nt} := Topic_{n(t-1)} \cup Topic_{n\Delta t}$$

$$Topic_{n\Delta t} := \{term_{n1}, term_{n2}, \ldots, term_{nm}\}$$

Equivalently, the topic at a specific time spot is the set of all the terms that have
emerged so far:

$$Topic_{nt} := \{term_{n1}, term_{n2}, term_{n3}, \ldots, term_{nt}\}$$
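
The set bookkeeping above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `merge_interval`, `topic_prev` and `topic_delta` are hypothetical, and each post (term) is represented by a plain string identifier.

```python
def merge_interval(topic_prev: set, topic_delta: set) -> set:
    """Return Topic_nt = Topic_n(t-1) U Topic_n(delta t) without mutating the inputs."""
    return topic_prev | topic_delta


# Hypothetical post identifiers for a single topic n.
topic_prev = {"post-1", "post-2"}             # Topic_n(t-1)
topic_delta = {"post-3", "post-4", "post-5"}  # terms added during the interval
topic_now = merge_interval(topic_prev, topic_delta)
```

Only the size of the added subset matters for the momentum model that follows, so a streaming implementation could keep a per-topic counter instead of the full set.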

4.1 Momentum Modelling

In this paper, we use the equations below to represent the variation when a topic becomes
hotter. The topic at time $t-1$ can be seen as a physical entity $m_{t-1}$ with weight and
velocity. The terms added under that topic can be regarded as another entity
$m_{term\Delta t}$. Thus the topic at time $t$, as an entity $m_t$, is the combination of
$m_{term\Delta t}$ and $m_{t-1}$ after a collision in which the added terms have velocity
$\vec{v}_{term\Delta t}$.

The momentum added to the topic by the entity $m_{term\Delta t}$ is the sum of the momentum
of each added term. We can calculate it as follows:

$$\vec{M}_{term\Delta t} = \sum_{\Delta t=1}^{m} \left( m_{term\Delta t} \times \vec{v}_{term\Delta t} \right) \qquad (1)$$




The momentum $\vec{M}_t$ equals the sum of the momentum of $m_{t-1}$ and
$\vec{M}_{term\Delta t}$. The total momentum is the same before and after the collision.
$\vec{v}_{t-1} + \vec{g}\Delta t$ represents the final velocity of the existing topic after
$\Delta t$. The development and digestion of a topic can be expressed clearly by this
equation:

$$\vec{M}_t = m_{t-1} \times \left( \vec{v}_{t-1} + \vec{g}\Delta t \right) + \vec{M}_{term\Delta t} \qquad (2)$$

The weight of the topic in time t is the sum of the weight of itself before collision and
the weight of the topic added, which shows the accumulation of topic. The more the topic is
discussed, the heavier the topic is.

$$m_t = m_{t-1} + \sum_{\Delta t=1}^{m} m_{term\Delta t} \qquad (3)$$

In time t, the velocity of the topic is the quotient of total momentum and total weight.

$$\vec{v}_t = \vec{M}_t / m_t \qquad (4)$$
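
Equations (1) to (4) amount to a single inelastic-collision update, which the following Python sketch makes concrete. It rests on several assumptions that are not stated in the text: velocities are treated as scalars, all terms added in one interval share the same weight and velocity, and the default values are borrowed from the parameter settings reported in Section 5.1. The names `TopicState` and `collide` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TopicState:
    mass: float      # m_t, accumulated weight of the topic
    velocity: float  # v_t, velocity of the topic (scalar simplification)


def collide(prev: TopicState, n_new: int, m_term: float = 1.0,
            v_term: float = 10.0, g: float = -3.0, dt: float = 1.0) -> TopicState:
    """Apply Eqs. (1)-(4) once for n_new identical terms added during dt."""
    added_momentum = n_new * m_term * v_term                          # Eq. (1)
    momentum = prev.mass * (prev.velocity + g * dt) + added_momentum  # Eq. (2)
    mass = prev.mass + n_new * m_term                                 # Eq. (3)
    velocity = momentum / mass if mass > 0 else 0.0                   # Eq. (4)
    return TopicState(mass, velocity)


# A topic of weight 5 moving at velocity 4 receives 3 new terms.
state = collide(TopicState(mass=5.0, velocity=4.0), n_new=3)
# With no new terms, only the deceleration g acts on the topic.
idle = collide(TopicState(mass=5.0, velocity=4.0), n_new=0)
```

In the first call the momentum is 5 × (4 − 3) + 3 × 10 = 35 and the weight is 8, so the new velocity is 4.375. The heavier the existing topic, the less a fixed number of new terms can raise its velocity, which is how accumulation enters the model.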

Viewed as a physical object, a topic moving at a certain velocity reaches a certain
height. In this paper we define this height as the hotness of the topic at that time spot.

$$\vec{H}_t = \vec{v}_{t-1} \times t \qquad (5)$$

The height cannot keep growing, because of the digestion of the topic. We therefore use a
gravitational acceleration g to lower the height (hotness). The equation is modified as
follows:

$$\vec{H}_t = \vec{v}_{t-1} \times t_0 + \frac{1}{2} \times \vec{g}\,\Delta t^2 \qquad (6)$$
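
As a worked illustration of Eq. (6), suppose a topic carries the velocity $\vec{v}_{t-1} = 10$ into the interval, with $\vec{g} = -3$ and $t_0 = \Delta t = 1$ day. These numbers are assumptions chosen for illustration (they coincide with the parameter settings in Table 1) and are not values reported for any particular hashtag; reading $t_0$ and $\Delta t$ as the same one-day interval is also an assumption.

```latex
\vec{H}_t = \vec{v}_{t-1} \times t_0 + \frac{1}{2} \times \vec{g}\,\Delta t^2
          = 10 \times 1 + \frac{1}{2} \times (-3) \times 1^2
          = 8.5
```

Under the same assumptions, Eq. (5) without the gravity term would give 10, so in this setting digestion lowers the hotness by 1.5 per interval.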

4.2 Modelling Solution

In social networks, the hashtag (#) is used extensively to mark the topic, which makes it
convenient to aggregate and classify vast amounts of information and lets people who follow
a certain topic get the relevant information more easily. Because hashtags aggregate posts,
Twitter has shown significant value in information propagation around almost every
emergency and important activity. We conclude that hashtags reflect the true tendency of
topics and have great potential for the reconstruction and compilation of information. In
this paper we use hashtags to identify topics for modeling. A topic at a specific time spot
is defined as a set of synonymous hashtags.

$$Topic_{nt} := \{hashtag_{n1}, hashtag_{n2}, hashtag_{n3}, \ldots, hashtag_{nt}\}$$

At a given time, more similar hashtags mean a more active topic. Over a specific period of
time, a topic is a sequence of sets. In this section we use m for the weight of a hashtag,
v for its velocity, and M for its momentum. The topic hotness modeling algorithm is as
follows:

Algorithm 1 Topic Hotness Modeling

Input: The set of hashtags that emerged during a certain period of time
Output: The topic list ranked by hotness at each time spot t

for t = t0, t0 + 1, ... do
    Process the collected data and aggregate them into the topic set $Topic$.
    for each topic $Topic_n$ in $Topic$ do
        Obtain the set $\{term_{n1}, \ldots, term_{nt}\}$ of $Topic_{nt}$
        Calculate the hotness of $Topic_{nt}$ by
            $\vec{H}_t = \vec{v}_{t-1} \times t_0 + \frac{1}{2} \times \vec{g}\,\Delta t^2$
    end for
    Output the topic list ordered by $\vec{H}_t$
end for

5 Experiment Analysis And Result

5.1 Data Corpus and Parameter Settings

In this paper we adopt historical Twitter data (2009/6/11-2009/12/31) as the corpus. We
extracted 460,496 Twitter terms and 22,063 hashtags in total, neglecting topics that lasted
less than 3 days or contained fewer than 2 posts; this coverage meets the needs of the
simulation experiments. The computer used for the modelling has the following main
parameters: an Intel(R) Core(TM)2 Duo CPU with a main frequency of 2.0 GHz, a memory
frequency of 777 MHz, a memory capacity of 1.96 GB, and Windows XP as the running
environment. The remaining parameters are set as in Table 1.

Table 1: Parameter Setting

Parameter                     Value
Hashtag Initial Velocity v    10
Hashtag Weight m              1
Acceleration g                -3
Interval t (day)              1
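
Algorithm 1 with the settings of Table 1 could be sketched in Python roughly as follows. This is one possible reading, not the authors' implementation: velocities are scalars, `T0` is assumed equal to the one-day interval, a hashtag seen for the first time is assumed to start with zero weight and zero velocity, and Eq. (6) is applied literally with the velocity from before the day's collision, so a day's posts influence the hotness from the following day onward (a variant that uses the updated velocity would be equally defensible). The function name `rank_by_hotness` and the sample hashtags are hypothetical, and the sketch does not attempt to reproduce the hotness values reported in the tables.

```python
V_TERM, M_TERM, G, DT = 10.0, 1.0, -3.0, 1.0  # Table 1 settings
T0 = 1.0                                       # assumption: t0 equals the interval


def rank_by_hotness(daily_counts):
    """Return, for each day, the hashtags ordered by hotness (Eq. 6), hottest first.

    daily_counts: list of dicts mapping hashtag -> number of posts on that day.
    Only (weight, velocity) is kept per hashtag, so no post history is stored.
    """
    state = {}      # hashtag -> (m_{t-1}, v_{t-1})
    rankings = []
    for counts in daily_counts:
        hotness = {}
        for tag in sorted(set(state) | set(counts)):
            mass, v_prev = state.get(tag, (0.0, 0.0))
            n_new = counts.get(tag, 0)
            new_mass = mass + n_new * M_TERM                               # Eq. (3)
            if new_mass == 0:
                continue  # hashtag listed with zero posts and no history
            momentum = mass * (v_prev + G * DT) + n_new * M_TERM * V_TERM  # Eqs. (1)-(2)
            hotness[tag] = v_prev * T0 + 0.5 * G * DT ** 2                 # Eq. (6)
            state[tag] = (new_mass, momentum / new_mass)                   # Eq. (4)
        rankings.append(sorted(hotness, key=hotness.get, reverse=True))
    return rankings


# Hypothetical counts: "alpha" starts strong and fades, "beta" keeps receiving posts.
daily_counts = [{"alpha": 5, "beta": 1}, {"alpha": 1, "beta": 1}, {"beta": 4}]
rankings = rank_by_hotness(daily_counts)
```

Each day costs one pass over the active hashtags, which is in line with the streaming, near-linear behaviour described in the introduction.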

5.2 Comparison of Hot Topic detection

In this experiment, our momentum theory (M) is compared with two previously proposed
methods. The baseline method (A) [12, 16] is the basic aging method. The second method
(A-TF), by Cataldi et al. [10], enhances the aging method by using an augmented normalized
term frequency. The top 3 and top 10 hot topics generated by each method are evaluated
using the official TDT measures: precision (p), recall (r) and F1-measure (F1).
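
For reference, the three measures can be computed as in the sketch below. The helper names and the sample hashtag lists are hypothetical; how the set of truly hot topics was obtained is not described here, so `relevant` is simply an assumed input.

```python
def precision_recall(detected, relevant):
    """Precision and recall of a detected topic list against the relevant topics."""
    detected, relevant = set(detected), set(relevant)
    hits = len(detected & relevant)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical top-3 output compared with three hypothetical truly hot topics.
p, r = precision_recall(["tag1", "tag2", "tag3"], ["tag1", "tag2", "tag4"])
f1 = f1_score(p, r)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a method cannot score well on F1 by maximizing only precision or only recall.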

In Table 2, the best score for each item is marked with an asterisk. In the top 3
comparison, our momentum theory achieves the highest precision and recall, and therefore
the best F1 score, while the aging method achieves reasonable precision and recall. In the
top 10 comparison, the precision of our momentum theory is still the highest among the
first 10 hot hashtags, but it loses some recall. We believe this is because the momentum
mechanism tends to remove old topics. The recall of A-TF is the best, because it improves
the aging algorithm to be more sensitive to hot topics and enlarges its coverage, at the
cost of reduced precision. On F1, the momentum theory again scores highest. In conclusion,
the overall performance of the momentum theory is better than that of the other two
algorithms.

Table 2: Comparison of Three Methods

          Method   p       r       F1
Top 3     A        0.67    0.67*   0.67
          M        1.00*   0.67*   0.80*
          A-TF     0.50    0.50    0.50
Top 10    A        0.63    0.71    0.67
          M        0.88*   0.78    0.82*
          A-TF     0.56    0.83*   0.67

Table 3: Top 3 Hashtags Changing Circumstances in A Week in Three Methods

Method  Rank  2009/7/6         2009/7/7         2009/7/8         2009/7/9         2009/7/10        2009/7/11
M       1     mileycyrus       mileycyrus       mileycyrus       140army          140army          zyngapirates
        2     140army          140army          140army          mileycyrus       zyngapirates     hottest100
        3     140mafia         140mafia         gorillapenis     gorillapenis     urumqi           ashes
A       1     livestrong       moonfruit        turnon           notagoodlook     ff               unacceptable
        2     iranelection     nothingpersonal  turnoff          iranelection     followfriday     iranelection
        3     140mafia         iranelection     iranelection     140mafia         iranelection     ff
A-TF    1     livestrong       moonfruit        turnon           iranelection     ff               unacceptable
        2     iranelection     iranelection     turnoff          140mafia         followfriday     iranelection
        3     musicmonday      livestrong       iranelection     notagoodlook     iranelection     ff

5.3 Comparison of Hot Topic Trending

Due to space limitations, Table 3 shows only the top 3 hashtags from 2009/7/6 to 2009/7/11
for the momentum theory (M), the aging algorithm (A) and the improved aging algorithm
(A-TF). These results show that the topic trend in M is relatively stable: hashtags
#mileycyrus, #140army and #zyngapirates each hold first place in turn. In the other two
algorithms, #iranelection is ranked in the top 3. However, according to the historical
data, this hashtag appeared over 20,000 times a day in June and was already ranked as the
hottest topic at that time by all three algorithms. Clearly, #iranelection should not be a
hot topic again in July. Hashtag #140mafia performs well overall: its occurrence count is
high and it has great potential to become the hottest topic, so it is ranked in the top 3
by all three algorithms. However, while its ranking is stable in M, it is unstable in A and
A-TF.

Table 4 presents the top 10 hashtags on two days (2009/7/6-2009/7/7) for the three
algorithms. Hashtag #mileycyrus holds first place in M on both days, but ranks only 9th in
A on 7/6 and falls out of the top 10 in A-TF, where it ranks 17th with an energy value of
0.635. This is because many old hot topics interfere with the ranking on these two days,
although these old hashtags were already hot topics at an earlier time. #140army is in
second place in M, but ranks 13th and 11th in A and A-TF respectively, falling out of the
top 10. #flip2009 is captured only by M; its occurrence counts peak on 5th and 6th July,
but because the counts are relatively low, it is discarded by the two aging algorithms. As
mentioned before, #140mafia is captured by all three methods. #tcot, #spymaster and
#tweetmyjobs occurred thousands of times per day from 2009/6/12 and were listed as hot
topics by the momentum theory at that time. They are no longer hot news, yet they are still
ranked among the top 10 hashtags in A and A-TF. #mj is a typical hot hashtag that rises
sharply on 7th July with 1,077 occurrences; however, it is displaced by #tcot in A and A-TF
because of the lower discrimination of the aging theory.

Because of the large amount of data and limited space, we take hashtag #mileycyrus as an
example to analyze the trends produced by the different algorithms. Figure 1 shows the
six-month profiles of the occurrence count, hotness and energy value of hashtag
#mileycyrus, which is the hottest topic of 6th July in the momentum theory.

Table 4: Top 10 Hashtags in Two Days in Three Methods

Date  Hashtag        M      Times   Hashtag          A     Times   Hashtag       A-TF  Times
7-6   mileycyrus     33.63  187     livestrong       0.56  4026    livestrong    0.67  4026
      140army        27.93  781     iranelection     0.55  2640    iranelection  0.66  2640
      140mafia       27.68  2316    140mafia         0.53  2316    musicmonday   0.66  2966
      moonfruit      26.23  2124    moonfruit        0.51  2124    140mafia      0.66  2316
      gorillapenis   24.98  984     musicmonday      0.50  2966    moonfruit     0.66  2124
      flip2009       21.19  12      spymaster        0.47  1616    spymaster     0.65  1616
      mj             20.86  288     tcot             0.40  1206    tcot          0.65  1206
      rw09           20.64  16      gorillapenis     0.39  984     gorillapenis  0.64  984
      katemcrae      19.36  10      mileycyrus       0.38  187     militarymon   0.64  933
      cmonbrazil     19.11  140     honduras         0.30  676     jobs          0.64  827
7-7   mileycyrus     35.64  18      moonfruit        0.62  5147    moonfruit     0.64  5147
      140army        30.88  461     nothingpersonal  0.54  3670    iranelection  0.63  1902
      140mafia       29.23  1644    iranelection     0.51  1902    livestrong    0.63  1228
      gorillapenis   29.09  559     140mafia         0.49  1644    140mafia      0.63  1644
      moonfruit      25.65  5147    livestrong       0.47  1228    musicmonday   0.63  1423
      cmonbrazil     20.13  47      musicmonday      0.46  1423    spymaster     0.63  936
      urumqi         19.61  153     spymaster        0.40  936     tcot          0.63  839
      mj             19.27  1075    tcot             0.36  839     tweetmyjobs   0.63  1013
      crocmint       19.27  6       tweetmyjobs      0.34  1013    gorillapenis  0.63  559
      xinjiang       19.05  110     threadless       0.30  1192    jobs          0.63  642

Figure 1: Comparison of #mileycyrus (A: occurrence number of hashtag per day, B: hotness
of hashtag per day in momentum theory, C: energy value of hashtag per day in aging method,
D: energy value of hashtag per day in A-TF method)

In Figure 1 A, this hashtag first appeared on 12th June. After that, it did not reappear
for about a week, and later it appeared occasionally at a low rate. On 5th July, it shot up suddenly
and reached 5,302 times, then declined rapidly. Although it recurred a few dozen or a few
hundred times later, it never rose sharply again. In Figure 1 B, the momentum theory
captures the hotness rising rapidly on 5th, 6th and 7th July and then declining quickly.
This hotness is higher than that of other hashtags, so the hashtag is ranked in first
place. The aging theory also detects this change but does not list the hashtag in a high
position (only 9th place) because its energy value is relatively low compared with other
hashtags. The A-TF method captures almost every moment with a high repeat rate, but it
cannot clearly distinguish the highest point from the other high points, so it is not able
to pick this hashtag out of the data corpus efficiently. As a result, the hashtag falls out
of the top 10. In Figure 1, the momentum theory shows higher sensitivity and a better
ability to discriminate the hottest hashtags from others than the other two algorithms.

Extending the experiment to the whole period we captured, we obtain the hottest topics
found by the proposed algorithm and compare them with the hot hashtags extracted by
occurrence rank.

From the comparison in Table 5, we can clearly observe that the occurrence count of a
hashtag may mislead the detection of hot topics, as with #followfriday, #ff and #1. These
topics carry little valid information, yet are regarded as hot because of their high
frequency of occurrence. With the proposed algorithm, topics that lack realistic meaning
are eliminated by the momentum equation, and the topics that survive are the active ones
with realistic meaning.

Table 5: Hottest Hashtags in Half A Year

      Momentum Theory               Occurrence
Rank  Hashtag             Hotness   Hashtag        Days  Times
1     tehran              44.73     ff             113   417802
2     nomaschavez         44.19     iranelection   116   368519
3     happybirthdaymikey  43.13     tcot           120   288091
4     nem                 43        jobs           115   243586
5     iranelection        41.14     mobsterworld   85    203445
6     zain                38.93     followfriday   113   188664
7     happybirthdaypink   38.2      1              114   166260
8     teenisland          38.15     musicmonday    109   164399
9     blackbery           37.94     140mafia       105   149078

6 Conclusion and Future Work

In this paper we propose a novel algorithm for hot topic detection based on momentum
theory, using hashtags to define a topic. The main contributions of the paper are as
follows. First, we analyze the characteristics of hot topics and summarize them in four
points: 1) Timeliness; 2) Development; 3) Accumulation; 4) Digestion. Then we build a hot
topic detection model with momentum theory. The experiments show that our model can
identify emerging hot topics effectively and accurately.

In the future, we hope to filter the posts under each topic to provide users with a
cleaner source of information, free of distraction from irrelevant posts. Our algorithm
standardizes the topic life cycle into a very stable curve, which makes topic prediction
possible. Some artificial intelligence techniques have been applied in mathematical
modeling [4, 32, 38], and some achievements have been obtained in prediction based on data
modeling [3,5,9,30]. We will try to use these techniques to predict hot topics in the next
step.

Acknowledgment

The authors would like to thank the anonymous reviewers for their insightful comments. The
authors also thank Yanmei Zhai and Xu Han for their help with data preparation and
experiments. This work has been partially supported by National Program on Key Basic
Research Project under Grant No. 2013CB329605, NSFC under Grant No. 61300178 and Natural
Science Foundation of Beijing under Grant No. 4092037.

Bibliography

[1] Aldhelaan, M.; Alhawasi, H. (2015); Graph Summarization for Hashtag Recommendation,
2015 3rd International Conference on Future Internet of Things and Cloud (FiCloud), 698-702.

[2] Allan, J.; Carbonell, J.; Doddington, G., et al. (1998); Topic Detection and Tracking Pilot
Study Final Report, Proceedings of the DARPA Broadcast News Transcription and
Understanding Workshop, 194-218.

[3] Asadi, S.; Hadavandi, E.; Mehmanpazir, F., et al. (2012); Hybridization of evolutionary
Levenberg-Marquardt neural networks and data pre-processing for stock market prediction,
Knowledge-Based Systems, 35(15): 245-258.




[4] Aydemir, E.; Koruca, H.I. (2015); A New Production Scheduling Module Using Priority-Rule
Based Genetic Algorithm, International Journal of Simulation Modelling, ISSN 1726-4529,
14(3): 450-462.

[5] Bas, E.; Egrioglu, E.; Aladag, C.H., et al. (2015); Fuzzy-time-series network used to forecast
linear and nonlinear time series, Applied Intelligence, 43(2): 1-13.

[6] Blekas, K.; Lagaris, I.E. (2013); A Spectral Clustering Approach Based on Newton’s
Equations of Motion, International Journal of Intelligent Systems, 28(4): 394-410.

[7] Bo, Y.; Ming, L.; Bing-Quan, L., et al. (2012); Detecting hot topics in technology news
streams, Machine Learning and Cybernetics (ICMLC), 2012 International Conference on,
ISSN 2160-133X, 5:1968-1974.

[8] Borgelt, C. (2010); Simple Algorithms for Frequent Item Set Mining, Advances in Machine
Learning II, ISBN 978-3-642-05178-4, 263(16):351-369.

[9] Caldeira, J.F.; Moura, G.V.; Santos, A.A.P. (2016); Predicting the yield curve using forecast
combinations, Computational Statistics & Data Analysis, 100: 79-98.

[10] Cataldi, M.; Caro, L.D.; Schifanella, C. (2010); Emerging topic detection on Twitter based
on temporal and social terms evaluation, Proceedings of the Tenth International Workshop on
Multimedia Data Mining, 1-10.

[11] Chang, M.K.; Cheung, W.; Tang, M. (2013); Building trust online: Interactions among
trust building mechanisms, Information & Management, 50(7): 439-445.

[12] Chen, C.; Chen, Y.-T.; Sun, Y., et al. (2003); Life Cycle Modeling of News Events Using
Aging Theory, Machine Learning: ECML 2003, ISBN 978-3-540-20121-2, 2837(7):47-59.

[13] Chen, J.; Yu, J.; Shen, Y. (2012); Towards Topic Trend Prediction on a Topic Evolution
Model with Social Connection, The Ieee/wic/acm International Joint Conferences on Web
Intelligence & Intelligent Agent Technology, 153-157.

[14] Chen, K.Y.; Luesukprasert, L.; Chou, S.c.T. (2007); Hot Topic Extraction Based on
Timeline Analysis and Multidimensional Sentence Modeling, IEEE Transactions on Knowledge
and Data Engineering, ISSN 1041-4347, 19(8): 1016-1025.

[15] Chen, T.L. (2012); Forecasting the Taiwan Stock Market with a Novel Momentum-based
Fuzzy Time-series, Review of Economics & Finance, 2:38-50.

[16] Chien Chin, C.; Yao-Tsung, C.; Meng Chang, C. (2007); An Aging Theory for Event
Life-Cycle Modeling, Systems, Man and Cybernetics, Part A: Systems and Humans, IEEE
Transactions on, ISSN 1083-4427, 37(2): 237-248.

[17] Chung-Hong, L.; Tzan-Feng, C.; Hsin-Chang, Y. (2011); An automatic topic ranking ap-
proach for event detection on microblogging messages, Systems, Man and Cybernetics (SMC),
2011 IEEE International Conference on, ISSN 1062-922X, 1358-1363.

[18] Galperin, E.A. (2011); Information transmittal, Newton’s law of gravitation, and tensor
approach to general relativity, Computers & Mathematics with Applications, 62(2): 709-724.

[19] Giannella, C.; Han, J.; Pei, J., et al. (2003); Mining Frequent Patterns in Data Streams at
Multiple Time Granularities, Data Mining Next Generation Challenges & Future Directions.



[20] Han, J.; Pei, J.; Yin, Y. (2000); Mining frequent patterns without candidate generation,
SIGMOD Rec., ISSN 0163-5808, 29(2): 1-12.

[21] Hrelja, M.; Klancnik, S.; Balic, J., et al. (2014); Modelling of a turning process using the
gravitational search algorithm, International Journal of Simulation Modelling,
ISSN 1726-4529, 13(1): 30-41.

[22] Hwi-Gang, K.; Seongjoo, L.; Sunghyon, K. (2013); Discovering hot topics using Twitter
streaming data social topic detection and geographic clustering, Advances in Social Networks
Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on, 1215-1220.

[23] Jiangfeng, C.; Jianjun, Y.; Yi, S. (2012); Towards Topic Trend Prediction on a Topic
Evolution Model with Social Connection, Web Intelligence and Intelligent Agent Technology
(WI-IAT), 2012 IEEE/WIC/ACM International Conferences on, 1:153-157.

[24] Jing, G.; Peng, Z.; Tan, J., et al. (2012); Mining Hot Topics from Twitter Streams, Procedia
Computer Science, 9(11): 2008-2011.

[25] Khoo Khyou, B.; Ishizuka, M. (2002); Topic extraction from news archive using TF*PDF
algorithm, Web Information Systems Engineering, 2002. WISE 2002. Proceedings of the Third
International Conference on, 73-82.

[26] Khulief, Y.A. (2010); Numerical Modelling of Impulsive Events in Mechanical Systems,
International Journal of Modelling & Simulation, 30: 80-86.

[27] Kotsakos, D.; Sakkos, P.; Katakis, I., et al. (2015); Language agnostic meme-filtering for
hashtag-based social network analysis, Social Network Analysis & Mining, 5(1): 1-14.

[28] Li, M.; Tang, M. (2013); Information Security Engineering: a Framework for Research and
Practices, International Journal of Computers Communications & Control, 8(4): 578-587.

[29] Liu, M.; Liu, Y.; Xiang, L., et al. (2008); Extracting Key Entities and Significant Events
from Online Daily News, Intelligent Data Engineering and Automated Learning-IDEAL 2008,
ISBN 978-3-540-88905-2, 5326(26):201-209.

[30] Liu, C.H.; Xiong, W. (2015); Modelling and Simulation of Quality Risk Forecasting in a
Supply Chain, International Journal of Simulation Modelling, ISSN 1726-4529, 14(2): 359-370.

[31] Ma, H.; Lu, Z.; Li, D., et al. (2014); Mining hidden links in social networks to achieve
equilibrium, Theoretical Computer Science, 556:13-24.

[32] Ramesh Kumar, L.; Padmanaban, K.; Balamurugan, C. (2016); Optimal Tolerance Al-
location in a Complex Assembly Using Evolutionary Algorithms, International Journal of
Simulation Modelling, ISSN 1726-4529, 15(1): 121-132.

[33] Salton, G. (1989); Automatic text processing: the transformation, analysis, and retrieval of
information by computer, ISBN 0-201-12227-8.

[34] Ternik, P.; Rudolf, R. (2013); Laminar Natural Convection of Non-Newtonian Nanofluids in
a Square Enclosure with Differentially Heated Side Walls, International Journal of Simulation
Modelling, 12(1): 5-16.

[35] Tran, D.H.; Nguyen, H.L.; Zhao, W., et al. (2011); Towards security in sharing data on
cloud-based social networks, Information, Communications and Signal Processing (ICICS)
2011 8th International Conference on, 1-5.



[36] Wang, C.; Zhang, M.; Ru, L., et al. (2008); Automatic online news topic ranking using media
focus and user attention based on aging theory, Proceedings of the 17th ACM conference on
Information and knowledge management, 1033-1042.

[37] Xiao, Z. (2014); A Social Network-Oriented Mining Algorithm for Hot
Topic Data, Computer Applications & Software.

[38] Yang, K.W.; Zhang, P.L.; Ge, B.F., et al. (2015); A Variables Clustering Based Differ-
ential Evolution Algorithm to Solve Production Planning Problem, International Journal of
Simulation Modelling, ISSN 1726-4529, 14: 525-538.

[39] Yang, Y.; Pierce, T.; Carbonell, J. (1998); A study of retrospective and on-line event de-
tection, Proceedings of the 21st annual international ACM SIGIR conference on Research and
development in information retrieval, 28-36.

[40] Zangerle, E.; Gassler, W.; Specht, G. (2013); On the impact of text similarity functions on
hashtag recommendations in microblogging environments, Social Network Analysis & Mining,
3(4): 889-898.

[41] Zheng, D.; Li, F. (2009); Hot Topic Detection on BBS Using Aging Theory, Web Information
Systems and Mining, ISBN 978-3-642-05249-1, 5854(14): 129-138.