J. mt. area res., Vol. 2, 2017 

   
         Journal of Mountain Area Research 
            

AN EFFICIENT AND COST-EFFECTIVE MATHEMATICAL MODEL TO ANALYZE BIG DATA
Ubaidullah*, W. Akram, I. A. Memon 

Sukkur Institute of Business Administration, Sindh, Pakistan.

 
ABSTRACT 

An efficient and cost-effective piecewise mathematical model is presented to represent large
descriptive data mathematically. Function lines are used as decision boundaries to express the
big data of an organization in slope-intercept form, which may be very helpful for a better
understanding of discrete data and for obtaining sustainable and accurate results. Based on the
boundary limits of the data collected from the Federal Board of Revenue, income tax as a function
of income is studied. Finally, the reliability of the piecewise function for optimizing the role of
strategic management in any organization is investigated. The results show that the slope rate
measured within the income boundaries, expressed as a percentage, is in good agreement with that
stated by the organization in descriptive form.

 KEYWORDS: Big data; Mathematical Model; Boundaries; Tools; Computational technique 

 
* Corresponding author:  (E-mail: ubaidullah@iba-suk.edu.pk) 

 
1. INTRODUCTION 

Software engineering and computer science characterize big data as data sets so large
that they become hard to work with. Because of the size and complexity of big data, it is
difficult to obtain the required results using on-hand database management tools or
traditional data processing techniques. Scientists, business executives and technocrats
hypothesize that the phenomenon of big data is difficult to explain because of its uneven
growth rate and huge volume. The term "big" is relative: ten years ago a hard drive of
30 MB was considered big, whereas nowadays 2 TB drives are common. One historically
important comparison is that the data collected from the dawn of civilization to 2003
amount to 5 exabytes, whereas we now create 5 exabytes every two days. Today we are
increasing not only the size (volume) of data but also its variety, and at a faster rate;
these are called the three Vs of big data: Volume, Variety and Velocity. Between the
1440s and the 1500s, printed material amounted to 10


http://journal.kiu.edu.pk/index.php/JMAR 

Full length article 


Ubaidullah et al., J. mt. area res. 02 (2017) 23-28 


million texts, including 2 million books, whereas in the 6th and 7th centuries only 120
books were produced annually in Western Europe. The International Data Corporation
defined big data as the economical, high-velocity discovery and analysis of data from
very large volumes, using new technologies and architecture designs.

Viktor Mayer-Schönberger and Kenneth Cukier [3] described big data as a way to extract
new insights or create new forms of value in ways that change markets, organizations,
etc.

To collect, store and analyze data sets drawn from unstructured data, we need advanced
techniques, software and systems. Big data has enabled significant advances in advanced
laboratories and engineering research centers. Whether in technology or in business, big
data is fundamentally different from what came before. Martin Hilbert and Priscilla Lopez
stated that in 1996 only one percent of data was digital, whereas by 2007 almost
ninety-four percent was digital. Around 2000 everything became digital, while in the 20th
century digital data consisted only of text and numbers. Andrew Whit stated that big data
has the potential to open up exciting new opportunities in social research, although it
is difficult to access. Hall [1] highlighted curiosity, flexibility and the motivation to
learn by doing, along with a variety of job experience. Lorenz [2] described data as an
assembly of facts, noting that such facts are not necessarily always true. Weak
interpretation leads to wrong conclusions, so we need to understand big data.

Jeanne Harris described that an understanding of mathematical reasoning and statistical
models is needed not only by technical experts but also by managers in order to meet the
challenges of big data. Harris [4] stated that sixty percent of respondents to a survey
felt the need to develop new skills in their employees to translate big data into insight
and business value. For reducing large raw data sets to small dimensions, topology offers
a technique that is flexible across different systems. Big data has the potential not
only to update research but also to transform education [8]. Hamann [9] described a data
discretization technique for approximating a curve by line segments. Discretization is a
first step in making data suitable for numerical evaluation and execution on digital
computers. McMaster in 1987 provided a scheme for data reduction of piecewise linear
curves.

Gar in 2011 stated that industry analysis companies face challenges not only in volume
but also in velocity and variety. Big data is data recorded from a data-generating
source. One challenge is to collect the required data from these sources without losing
the exact information required. Another challenge is to collect the right data from the
data source automatically. We also need an information extraction method that takes the
relevant data out of the data source and expresses it in an ordered form for analysis.
Data analysis is also a challenge; to overcome it, we need domain analysis scientists to
create an effective database design. The piecewise-defined function is a well-defined
mathematical technique for formulating and interpreting big data. The Federal Board of
Revenue is the supreme federal agency of Pakistan for auditing, enforcing and collecting
revenue for the government of Pakistan. The data on the collection of federal taxes are
big data. In this paper we use a piecewise function to formulate and interpret the
collected data. We formulate the income tax slabs for the salaried class in Pakistan for
Financial Year 2014-15 in piecewise-defined form, that is, in terms of income limits. The
developed mathematical model is cost-effective and time-efficient.

 
2. MATHEMATICAL AND GRAPHICAL REPRESENTATION OF DATA

2.1 Mathematical model 

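The general mathematical form for n-dimensional piecewise continuous and convex
linear functions is $f:\mathbb{R}^{n} \to \mathbb{R}$, with

\[
\Pi \in \mathcal{P}\big(\mathcal{P}(\mathbb{R}^{n+1})\big)
\]

so that:

\[
f(\vec{x}) = \min_{\Sigma \in \Pi} \; \max_{(\vec{a},\, b) \in \Sigma} \; \vec{a} \cdot \vec{x} + b .
\]

If the function is convex and continuous then,

\[
\Sigma \in \mathcal{P}(\mathbb{R}^{n+1})
\]

so that:

\[
f(\vec{x}) = \max_{(\vec{a},\, b) \in \Sigma} \; \vec{a} \cdot \vec{x} + b .
\]

Here $\vec{a} \cdot \vec{x} + b$ is a linear polynomial such that
$a \neq 0$ and $a, b \in \mathbb{R}$.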

The piecewise linear function effectively 

reduced the problem size and enhanced the 

computational efficiency. 
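
As a minimal sketch of the max and min-max forms given in this subsection, the following Python snippet evaluates a piecewise linear function from collections of (a, b) pairs. The function names `convex_pwl` and `general_pwl` and the example pieces are illustrative assumptions and are not taken from the original model.

```python
from typing import Sequence, Tuple

# One affine piece (a, b), representing the map x -> a . x + b
Piece = Tuple[Sequence[float], float]


def affine_value(a: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Evaluate a . x + b for vectors a and x of equal length."""
    return sum(ai * xi for ai, xi in zip(a, x)) + b


def convex_pwl(pieces: Sequence[Piece], x: Sequence[float]) -> float:
    """Convex case: maximum of a . x + b over all pieces."""
    return max(affine_value(a, b, x) for a, b in pieces)


def general_pwl(families: Sequence[Sequence[Piece]], x: Sequence[float]) -> float:
    """General case: minimum over families of the maximum within each family."""
    return min(convex_pwl(pieces, x) for pieces in families)


# Hypothetical one-dimensional example: |x| = max(x, -x)
ABS_PIECES = [((1.0,), 0.0), ((-1.0,), 0.0)]

# Hypothetical non-convex example: min(|x|, 0.5 x + 2)
FAMILIES = [ABS_PIECES, [((0.5,), 2.0)]]

print(convex_pwl(ABS_PIECES, (-3.0,)))   # 3.0
print(general_pwl(FAMILIES, (5.0,)))     # 4.5
```

In the convex case a single family of pieces is enough, which is why only the maximum appears in the second formula.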

 
2.2 Data Collection and Processing 

According to the Finance Act passed by the government of Pakistan, the income tax
rates listed below apply to salaries for the year 2014-2015. Let $x$ denote the
taxable income and $T(x)$ the income tax. The tax slabs are as follows:

 
S#   Taxable income                                                 Rate of tax
1    Does not exceed Rs. 400,000                                    0%
2    Exceeds Rs. 400,000 but does not exceed Rs. 750,000            5% of the amount exceeding Rs. 400,000
3    Exceeds Rs. 750,000 but does not exceed Rs. 1,400,000          Rs. 17,500 + 10% of the amount exceeding Rs. 750,000
4    Exceeds Rs. 1,400,000 but does not exceed Rs. 1,500,000        Rs. 82,500 + 12.5% of the amount exceeding Rs. 1,400,000
5    Exceeds Rs. 1,500,000 but does not exceed Rs. 1,800,000        Rs. 95,000 + 15% of the amount exceeding Rs. 1,500,000
6    Exceeds Rs. 1,800,000 but does not exceed Rs. 2,500,000        Rs. 140,000 + 17.5% of the amount exceeding Rs. 1,800,000
7    Exceeds Rs. 2,500,000 but does not exceed Rs. 3,000,000        Rs. 262,500 + 20% of the amount exceeding Rs. 2,500,000
8    Exceeds Rs. 3,000,000 but does not exceed Rs. 3,500,000        Rs. 362,500 + 22.5% of the amount exceeding Rs. 3,000,000
9    Exceeds Rs. 3,500,000 but does not exceed Rs. 4,000,000        Rs. 475,000 + 25% of the amount exceeding Rs. 3,500,000
10   Exceeds Rs. 4,000,000 but does not exceed Rs. 7,000,000        Rs. 600,000 + 27.5% of the amount exceeding Rs. 4,000,000
11   Exceeds Rs. 7,000,000                                          Rs. 1,425,000 + 30% of the amount exceeding Rs. 7,000,000
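
The fixed amount stated for each slab should equal the tax accumulated at that slab's lower limit; this is what makes the tax continuous at the slab boundaries. The Python sketch below checks this arithmetic. The tuple layout and the names `SLABS` and `check_continuity` are illustrative assumptions; the figures follow the slab equations of this section (for example, a fixed amount of 262,500 for the slab starting at Rs. 2,500,000).

```python
# (lower limit, fixed amount at the lower limit, marginal rate) for each slab
SLABS = [
    (0, 0, 0.0),
    (400_000, 0, 0.05),
    (750_000, 17_500, 0.10),
    (1_400_000, 82_500, 0.125),
    (1_500_000, 95_000, 0.15),
    (1_800_000, 140_000, 0.175),
    (2_500_000, 262_500, 0.20),
    (3_000_000, 362_500, 0.225),
    (3_500_000, 475_000, 0.25),
    (4_000_000, 600_000, 0.275),
    (7_000_000, 1_425_000, 0.30),
]


def check_continuity(slabs, tol=1e-6):
    """Return (boundary, tax reached from the left, stated fixed amount)
    for every boundary where the two values disagree by more than tol."""
    mismatches = []
    for (lo, base, rate), (next_lo, next_base, _) in zip(slabs, slabs[1:]):
        from_left = base + rate * (next_lo - lo)
        if abs(from_left - next_base) > tol:
            mismatches.append((next_lo, from_left, next_base))
    return mismatches


print(check_continuity(SLABS))  # an empty list means every boundary agrees
```

For instance, at Rs. 750,000 the second slab gives 0.05 × 350,000 = 17,500, which is the fixed amount of the third slab.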

 
• The rate of income tax is 0% if the taxable salary income does not exceed
  Rs. 400,000, i.e. $T(x) = 0$.

• The rate of income tax is 5% if the taxable salary income exceeds Rs. 400,000 but
  does not exceed Rs. 750,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 0.05\,(x - 400{,}000) \\
  T(x) &= 0.05x - 20{,}000 .
  \end{aligned}
  \]

• The rate of income tax is 10% if the taxable salary income exceeds Rs. 750,000 but
  does not exceed Rs. 1,400,000, i.e.
  \[
  \begin{aligned}
  T(750{,}000) &= 0.05\,(750{,}000) - 20{,}000 = 17{,}500 \\
  T(x) &= 17{,}500 + 0.10\,(x - 750{,}000) \\
  T(x) &= 0.10x - 57{,}500 .
  \end{aligned}
  \]

 
• The rate of income tax is 12.5% if the taxable salary income exceeds Rs. 1,400,000
  but does not exceed Rs. 1,500,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 82{,}500 + 0.125\,(x - 1{,}400{,}000) \\
  T(x) &= 0.125x - 92{,}500 .
  \end{aligned}
  \]
 

• The rate of income tax is 15% if the taxable salary income exceeds Rs. 1,500,000
  but does not exceed Rs. 1,800,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 95{,}000 + 0.15\,(x - 1{,}500{,}000) \\
  T(x) &= 0.15x - 130{,}000 .
  \end{aligned}
  \]
 

• The rate of income tax is 17.5% if the taxable salary income exceeds Rs. 1,800,000
  but does not exceed Rs. 2,500,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 140{,}000 + 0.175\,(x - 1{,}800{,}000) \\
  T(x) &= 0.175x - 175{,}000 .
  \end{aligned}
  \]
 

• The rate of income tax is 20% if the taxable salary income exceeds Rs. 2,500,000
  but does not exceed Rs. 3,000,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 262{,}500 + 0.2\,(x - 2{,}500{,}000) \\
  T(x) &= 0.2x - 237{,}500 .
  \end{aligned}
  \]
 

• The rate of income tax is 22.5% if the taxable salary income exceeds Rs. 3,000,000
  but does not exceed Rs. 3,500,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 362{,}500 + 0.225\,(x - 3{,}000{,}000) \\
  T(x) &= 0.225x - 312{,}500 .
  \end{aligned}
  \]
 

• The rate of income tax is 25% if the taxable salary income exceeds Rs. 3,500,000
  but does not exceed Rs. 4,000,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 475{,}000 + 0.25\,(x - 3{,}500{,}000) \\
  T(x) &= 0.25x - 400{,}000 .
  \end{aligned}
  \]
 

• The rate of income tax is 27.5% if the taxable salary income exceeds Rs. 4,000,000
  but does not exceed Rs. 7,000,000, i.e.
  \[
  \begin{aligned}
  T(x) &= 600{,}000 + 0.275\,(x - 4{,}000{,}000) \\
  T(x) &= 0.275x - 500{,}000 .
  \end{aligned}
  \]
 

• The rate of income tax is 30% if the taxable salary income exceeds Rs. 7,000,000,
  i.e.
  \[
  \begin{aligned}
  T(x) &= 1{,}425{,}000 + 0.3\,(x - 7{,}000{,}000) \\
  T(x) &= 0.3x - 675{,}000 .
  \end{aligned}
  \]
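
Each of the cases in this list follows the same conversion to slope-intercept form. As a worked sketch, write $L_k$ for the lower limit of slab $k$, $c_k$ for its fixed amount and $r_k$ for its rate; these symbols are introduced here only for illustration and do not appear in the original text.

```latex
\[
\begin{aligned}
T(x) &= c_k + r_k\,(x - L_k) \\
     &= r_k\,x + (c_k - r_k L_k), \qquad L_k < x \le L_{k+1}.
\end{aligned}
\]
% Example: L_k = 1,800,000, c_k = 140,000, r_k = 0.175
\[
c_k - r_k L_k = 140{,}000 - 0.175 \times 1{,}800{,}000 = 140{,}000 - 315{,}000 = -175{,}000 .
\]
```

The slope of each line is therefore the slab rate $r_k$, and the intercept is $c_k - r_k L_k$.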
 

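\[
T(x) =
\begin{cases}
0, & 0 \le x \le 400{,}000 \\
0.05x - 20{,}000, & 400{,}000 < x \le 750{,}000 \\
0.1x - 57{,}500, & 750{,}000 < x \le 1{,}400{,}000 \\
0.125x - 92{,}500, & 1{,}400{,}000 < x \le 1{,}500{,}000 \\
0.15x - 130{,}000, & 1{,}500{,}000 < x \le 1{,}800{,}000 \\
0.175x - 175{,}000, & 1{,}800{,}000 < x \le 2{,}500{,}000 \\
0.2x - 237{,}500, & 2{,}500{,}000 < x \le 3{,}000{,}000 \\
0.225x - 312{,}500, & 3{,}000{,}000 < x \le 3{,}500{,}000 \\
0.25x - 400{,}000, & 3{,}500{,}000 < x \le 4{,}000{,}000 \\
0.275x - 500{,}000, & 4{,}000{,}000 < x \le 7{,}000{,}000 \\
0.3x - 675{,}000, & x > 7{,}000{,}000
\end{cases}
\]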
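
One possible way to evaluate the piecewise model $T(x)$ in code is to locate the slab by binary search over the upper limits and then apply that slab's slope and intercept. The sketch below is written in Python as an illustration; the names `UPPER_LIMITS`, `LINES`, `slab_index` and `income_tax` are assumptions made for this example and are not taken from the authors' implementation.

```python
from bisect import bisect_left

# Upper limits of the first ten slabs; the last slab has no upper limit.
UPPER_LIMITS = [400_000, 750_000, 1_400_000, 1_500_000, 1_800_000,
                2_500_000, 3_000_000, 3_500_000, 4_000_000, 7_000_000]

# (slope, intercept) of each slab, so that T(x) = slope * x + intercept
LINES = [
    (0.0, 0.0),
    (0.05, -20_000.0),
    (0.10, -57_500.0),
    (0.125, -92_500.0),
    (0.15, -130_000.0),
    (0.175, -175_000.0),
    (0.20, -237_500.0),
    (0.225, -312_500.0),
    (0.25, -400_000.0),
    (0.275, -500_000.0),
    (0.30, -675_000.0),
]


def slab_index(x: float) -> int:
    """Index of the slab containing income x (upper limits are inclusive)."""
    if x < 0:
        raise ValueError("income must be non-negative")
    # bisect_left counts the upper limits strictly below x
    return bisect_left(UPPER_LIMITS, x)


def income_tax(x: float) -> float:
    """Income tax T(x) from the slope-intercept form of the matching slab."""
    slope, intercept = LINES[slab_index(x)]
    return slope * x + intercept


print(slab_index(750_000), round(income_tax(750_000), 2))  # 1 17500.0
```

Because the slopes increase from slab to slab and the function is continuous, the same value is obtained for any non-negative income as `max(a * x + b for a, b in LINES)`, which is the convex max form of Section 2.1.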

 
2.3 Graphical Representation

Figure 1. Income (in millions) on the x-axis and income tax (in millions) on the y-axis,
in standard form.

Figure 2. Income (in millions) on the x-axis and income tax (in millions) on the y-axis,
in scientific notation.

[Plot: "Income Tax of Federal Budget 2013-2014"; x-axis: Income (0 to 7); y-axis: Income Tax (0 to 1.4).]

3. RESULTS AND DISCUSSION

From the derived mathematical model and the graphical representation, we conclude that
the data collected for fiscal year 2014-2015 of the federal budget of Pakistan are big in
terms of volume. Managing, sharing, analyzing and visualizing such data within a given
timeframe is difficult without advanced tools, software and systems. The mathematical
model summarizes the big data in a compact form, so that the tax can be calculated
quickly, easily and efficiently. Because income tax depends on income, income is plotted
on the x-axis and income tax on the y-axis. The axes are scaled so that income on the
x-axis is in millions and income tax on the y-axis is in hundred thousands. Figure 1 and
Figure 2 show that income tax increases as income increases. For example, at an income of
750,000 the income tax is 17,500, which illustrates the reliability of the piecewise
linear mathematical model. From the above model of T(x), the slopes of the intervals are
0, 0.05, 0.10, 0.125, 0.15, 0.175, 0.2, 0.225, 0.25, 0.275 and 0.3. These slopes are
mathematical indicators that make it efficient and cost-effective to analyze and
interpret the big data in a compact form. They indicate that income tax increases as
income increases.
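
The interval slopes can also be recovered from the tax values at the slab limits, as the ratio of the change in tax to the change in income. The Python sketch below illustrates this arithmetic. The names `INCOME_POINTS`, `TAX_POINTS` and `interval_slopes` are assumptions for this example, and the final point at 8,000,000 is a hypothetical evaluation point added only so that the open-ended last slab also yields a slope.

```python
# Slab limits, plus one assumed evaluation point inside the last slab
INCOME_POINTS = [0, 400_000, 750_000, 1_400_000, 1_500_000, 1_800_000,
                 2_500_000, 3_000_000, 3_500_000, 4_000_000, 7_000_000,
                 8_000_000]

# Income tax at each of the points above, taken from the slab equations
TAX_POINTS = [0, 0, 17_500, 82_500, 95_000, 140_000,
              262_500, 362_500, 475_000, 600_000, 1_425_000,
              1_725_000]


def interval_slopes(xs, ts):
    """Slope (change in tax / change in income) between consecutive points."""
    return [
        (t1 - t0) / (x1 - x0)
        for x0, x1, t0, t1 in zip(xs, xs[1:], ts, ts[1:])
    ]


for slope in interval_slopes(INCOME_POINTS, TAX_POINTS):
    print(round(slope, 3))
```

Each slope is the marginal tax rate of the corresponding slab, so the list rises from 0 for the first slab to 0.3 for the last.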

 
4. CONCLUSION 

We presented a piecewise mathematical model that converts descriptive data into a
single model based on the linear coefficients, the assigned variables and the tax slab
percentages. We used the high-level language software MATLAB, which can reliably and
sharply identify the tax slab of a taxpayer and accurately calculate the exact tax
amount for an individual taxpayer. The problem addressed here is to find the slab
percentages that appear in the acquired data. Finally, we optimized our collected data.

 
References 

[1] D. Lazer, A. Pentland, L. Adamic, S. Aral, A.-L. Barabási, D. Brewer. "Computational Social Science". Science 323 (2009) 721-723.

[2] S. Shvetank, H. Andrew, C. Jaime. "Good Data Won't Guarantee Good Decisions". Harvard Business Review, HBR.org. Retrieved (2012). http://hbr.org/2012/04/good-data-wont-guarantee-good-decisions/ar/1

[3] V. Mayer-Schönberger, K. Cukier. "Big Data: A Revolution that Will Transform How We Live, Work, and Think". New York, Houghton Mifflin Harcourt Publishing Company (2013).

[4] J. Harris. "Data is useless without the skills to analyze it". HBR Blog Network (2012).

[5] J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh, A. H. Byers (2011).

[6] "Big data: The next frontier for innovation, competition, and productivity". McKinsey Global Institute.

[7] D. Raywood. "Big data analyst shortage is a challenge for the UK". SC Magazine (2012).

[8] CCC. "Advancing Personalized Education". Computing Community Consortium, Spring 2011.

[9] B. Hamann, J. L. Chen. "Data point selection for piecewise linear curve approximation". Computer Aided Geometric Design 11 (1994).

[10] M. H. Lin, J. G. Carlsson, D. Ge, J. Shi, J. F. Tsai. "A Review of Piecewise Linearization Method". Mathematical Problems in Engineering (2013).

[11] K. Holmberg. "Solving the Staircase Cost Facility Location Problem with Decomposition and Piecewise Linearization". European Journal of Operational Research 75 (1994) 41-61.

[12] A. B. Keha, I. R. De Farias, G. L. Nemhauser. "Models for Representing Piecewise Linear Cost Function". Operations Research Letters 32 (2004) 44-48.

[13] V. Ford, A. Siraj. "Clustering of Smart Meter Data for Disaggregation". In Proc. IEEE Global Conference on Signal and Information Processing (Global SIP), Austin, TX (2013).

[14] www.fbr.gov.pk

[15] W. Huang, P. Eades, S. H. Hong, C. C. Lin. "Improving multiple aesthetics produces better graph drawings". J Vis Lang Comput 24 (2013) 262-272.

[16] M. J. Baker, S. G. Eick. "Space-filling Software Visualization". Journal of Visual Languages & Computing 6 (1995) 119-133.

 This work is licensed under a Creative Commons Attribution 4.0 International License. 
