cauchy jurnal matematika murni dan aplikasi
volume 6, issue 3, november 2020, pages 100 – 161
issn 2086-0382, e-issn 2477-3344
malang, november 2020

cauchy is a mathematical journal published twice a year, in may and november, by the mathematics department, faculty of science and technology, universitas islam negeri maulana malik ibrahim malang. this journal includes research papers, literature studies, analysis, and problem solving in mathematics (algebra, analysis, statistics, computing and applied mathematics).

editorial board

editor in chief:
dr. sri harini, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

managing editor:
mohammad jamhuri, m.si, maulana malik ibrahim state islamic university of malang, indonesia.
juhari, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

editorial board:
prof hadi susanto, department of mathematical sciences, university of essex and department of mathematics of khalifa university, united kingdom
mario rosario guarracino, computational and data science laboratory, high performance computing and networking institute, national research council of italy, italy
kartick chandra mondal, jadavpur university, salt lake campus, india
rowena alma l. betty, university of the philippines diliman, philippines
subanar seno, gadjah mada university, indonesia
toto nusantara, state university of malang, indonesia
edy tri baskoro, institut teknologi bandung, indonesia
eridani eridani, airlangga university, indonesia
abdul halim abdullah, university of technology malaysia, malaysia
kusno, university of jember, indonesia
slamin, university of jember, indonesia
riswan efendi, uin sultan syarif kasim riau, indonesia
arief fatchul huda, uin sunan gunung djati bandung, indonesia
usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia
abdussakir, maulana malik ibrahim state islamic university of malang, indonesia
ari kusumastuti, maulana malik ibrahim state islamic university of malang, indonesia
fachrur rozi, maulana malik ibrahim state islamic university of malang, indonesia
elly susanti, universitas islam negeri maulana malik ibrahim malang, indonesia

assistant editor:
mohammad nafie jauhari, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

editorial office
mathematics department, maulana malik ibrahim state islamic university of malang
gajayana st. 50 malang, east java, indonesia 65144
phone (+62) 81336397956, facsimile (+62) 341 558933
e-mail: cauchy@uin-malang.ac.id

focus and scope
cauchy-jurnal matematika murni dan aplikasi is a mathematical journal published twice a year in may and november by the mathematics department, faculty of science and technology, maulana malik ibrahim state islamic university of malang.
we welcome authors for original articles (research), review articles, interesting case reports, and special articles or illustrations that focus on pure and applied mathematics. subjects suitable for publication include, but are not limited to, the following fields:
1. actuarial science
2. algebra
3. analysis
4. applied
5. computing
6. econometrics
7. statistics

indexing and abstracting
cauchy-jurnal matematika murni dan aplikasi has been covered (indexed and abstracted) by the following services:
1. doaj (2016-) (https://doaj.org/toc/2477-3344)
2. dimensions
3. moraref (2015-) (http://moraref.or.id/index.php/browse/index/36)
4. onesearch indonesia (2015-) (http://onesearch.id/search/results?filter[]=repoid:ios2732)
5. mendeley (2013-) (https://www.mendeley.com/groups/5034091/cauchy/papers/)
6. indonesian scientific journal database (isjd) (2013-) (http://isjd.pdii.lipi.go.id/index.php/direktorijurnal.html)
7. google scholar (2009-) (https://scholar.google.co.id/citations?hl=en&view_op=list_works&gmla=ajsnf6omofbk7q0o2q-9xuimca1zi8oz9lp2ehctubhl9dcisxnyh9saieau0g0udt8tym6jk3z666zu46vrsbyz6vjc2a_w&user=drk-5hkaaaaj)
8. ipi (2009-) (http://id.portalgaruda.org/?ref=browse&mod=viewjournal&journal=5272)

table of contents
matrix approach to the direct computation method for the solution of fredholm integro-differential equations of the second kind with degenerate kernels ...
100 – 108
forecasting financial system stability using vector error correction model approach ... 109 – 116
inclusion properties of the homogeneous herz-morrey ... 117 – 121
local dynamics of an svir epidemic model with logistic growth ... 122 – 132
super total labeling (a,d)edge antimagic on the firecracker graph ... 133 – 139
the rule of hessenberg matrix for computing determinant of centrosymmetric matrices ... 140 – 148
the metric dimension and local metric dimension of relative prime graph ... 149 – 161

cauchy jurnal matematika murni dan aplikasi
volume 7, issue 1, november 2021, pages 1 – 151
issn 2086-0382, e-issn 2477-3344
malang, november 2021

table of contents
genetic algorithm for variable selection and parameter optimization in svm and fuzzy svm for colon cancer microarray classification ... 1 – 12
a combination of generalized linear mixed model and lasso methods for estimating number of patients covid 19 in the intensive care units ... 13 – 21
inclusion properties of the homogeneous herz-morrey spaces with variable exponent ... 22 – 27
sentiment analysis on government performance in tourism during the covid19 pandemic period with lexicon based ... 28 – 39
optimal prevention and treatment control on sveir type model spread of covid-19 ... 40 – 48
analysis of landing airplane queue systems at juanda international airport surabaya ... 49 – 63
on rainbow vertex antimagic coloring of graphs: a new notion ...
64 – 72
the confidence interval of the estimator of the periodic intensity function in the presence of power function trend on the nonhomogeneous poisson process ... 73 – 83
on the modification of newton-secant method in solving nonlinear equations for multiple zeros of trigonometric function ... 84 – 96
supplier selection analysis using minmax multi choice goal programming model ... 97 – 104
spline nonparametric regression to analyze factors affecting gender empowerment measure (gem) in east java ... 105 – 117
modeling length of hospital stay for patients with covid-19 in west sumatra using quantile regression approach ... 118 – 128
the ring homomorphisms of matrix rings over skew generalized power series rings ... 129 – 135
local hölder regularity of weak solutions for singular parabolic systems of p-laplacian type ... 136 – 141
a study of count regression models for mortality rate ... 142 – 151

cauchy jurnal matematika murni dan aplikasi
volume 7, issue 2, may 2022, pages 152 – 331
issn 2086-0382, e-issn 2477-3344
malang, may 2022

editorial board

editor in chief:
dr. sri harini, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

managing editor:
mohammad jamhuri, m.si, maulana malik ibrahim state islamic university of malang, indonesia.
juhari, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

editorial board:
prof hadi susanto, department of mathematical sciences, university of essex and department of mathematics of khalifa university, united kingdom
mario rosario guarracino, computational and data science laboratory, high performance computing and networking institute, national research council of italy, italy
kartick chandra mondal, jadavpur university, salt lake campus, india
rowena alma l. betty, university of the philippines diliman, philippines
muhammad fakhruddin, department of mathematics, faculty of military mathematics and natural sciences, the republic of indonesia defense university, bogor, indonesia
alfi yusrotis zakiyyah, universitas bina nusantara, indonesia
bety hayat susanti, politeknik siber dan sandi negara, indonesia
dian savitri, universitas negeri surabaya, indonesia
meta kallista, universitas telkom, indonesia
dani suandi, universitas bina nusantara, bandung, indonesia
anwar fitrianto, department of statistics, ipb university, indonesia
sri harini, universitas islam negeri maulana malik ibrahim malang, indonesia
dr heni widayani, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia
corina karim, brawijaya university
subanar seno, gadjah mada university, indonesia
toto nusantara, state university of malang, indonesia
edy tri baskoro, institut teknologi bandung, indonesia
eridani eridani, airlangga university, indonesia
abdul halim abdullah, university of technology malaysia, malaysia
kusno, university of jember, indonesia
slamin, university of jember, indonesia
riswan efendi, uin sultan syarif kasim riau, indonesia
arief fatchul huda, uin sunan gunung djati bandung, indonesia
usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia
abdussakir, maulana malik ibrahim state islamic university of malang, indonesia
ari kusumastuti, maulana malik ibrahim state islamic university of malang, indonesia
fachrur rozi, maulana malik ibrahim state islamic university of malang, indonesia
elly susanti, universitas islam negeri maulana malik ibrahim malang, indonesia

assistant editor:
mohammad nafie jauhari, m.si, maulana malik ibrahim state islamic university of malang, indonesia.

table of contents
a note on generalized strongly p-convex functions of higher order ... 152 – 157
the generalized star modeling with heteroscedastic effects ... 158 – 172
optimal control and cost-effectiveness analysis in an epidemic model with viral mutation and vaccine intervention ... 173 – 185
an application of geographically weighted regression for assessing water polution in pontianak, indonesia ... 186 – 194
richards curve implementation for prediction of covid-19 spread in maluku province ... 195 – 206
the properties of intuitionistic anti fuzzy module t-norm and t-conorm ... 207 – 219
analysis of insurance customer factors to renewal using hybrid ahp-ftopsis ... 220 – 230
average based-fts markov chain based on a modified frequency density partitioning to predict covid-19 in central java ... 231 – 239
spatial autoregressive model of tuberculosis cases in central java province 2019 ... 240 – 248
goodwin model with clustering workers' skills in indonesian economic cycle ...
249 – 266
a left-symmetric structure on the semi-direct sum real frobenius lie algebra of dimension 8 ... 267 – 280
forecasting rice paddy production in aceh using arima and exponential smoothing models ... 281 – 292
multipolar intuitionistic fuzzy ideal in b-algebras ... 293 – 301
hybrid model of singular spectrum analysis and arima for seasonal time series data ... 302 – 315
elliptical orbits mode application for approximation of fuel volume change ... 316 – 331

cauchy jurnal matematika murni dan aplikasi
volume 6, issue 4, may 2021
issn 2086-0382, e-issn 2477-3344

publication ethics
journal cauchy is a peer-reviewed electronic national journal. this statement clarifies the ethical behaviour of all parties involved in the act of publishing an article in this journal, including the author, the chief editor, the editorial board, the peer reviewer and the publisher (mathematics department of maulana malik ibrahim state islamic university of malang). this statement is based on cope’s best practice guidelines for journal editors.

ethical guideline for journal publication
the publication of an article in the peer-reviewed journal cauchy is an essential building block in the development of a coherent and respected network of knowledge. it is a direct reflection of the quality of the work of the authors and the institutions that support them. peer-reviewed articles support and embody the scientific method.
it is therefore important to agree upon standards of expected ethical behavior for all parties involved in the act of publishing: the author, the journal editor, the peer reviewer, the publisher and the society. as the publisher of a pure and applied mathematics journal, we take seriously our duty to safeguard all stages of publishing, and we recognize our ethical and other responsibilities. we are committed to ensuring that advertising, reprint or other commercial revenue has no impact or influence on editorial decisions.

publication decisions
the editor of cauchy is responsible for deciding which of the articles submitted to the journal should be published. the validation of the work in question and its importance to researchers and readers must always drive such decisions. the editors may be guided by the policies of the journal's editorial board and constrained by such legal requirements as shall then be in force regarding libel, copyright infringement and plagiarism. the editors may confer with other editors or reviewers in making this decision.

fair play
an editor at any time evaluates manuscripts for their intellectual content without regard to race, gender, sexual orientation, religious belief, ethnic origin, citizenship, or political philosophy of the authors.

confidentiality
the editor and any editorial staff must not disclose any information about a submitted manuscript to anyone other than the corresponding author, reviewers, potential reviewers, other editorial advisers, and the publisher, as appropriate. any manuscripts received for review must be treated as confidential documents. they must not be shown to or discussed with others except as authorized by the editor.

disclosure and conflicts of interest
unpublished materials disclosed in a submitted manuscript must not be used in an editor's own research without the express written consent of the author.

contribution to editorial decisions
peer review assists the editor in making editorial decisions and, through the editorial communications with the author, may also assist the author in improving the paper.

promptness
any selected referee who feels unqualified to review the research reported in a manuscript, or knows that its prompt review will be impossible, should notify the editor and withdraw from the review process.

standards of objectivity
reviews should be conducted objectively. personal criticism of the author is inappropriate. referees should express their views clearly with supporting arguments.

acknowledgement of sources
reviewers should identify relevant published work that has not been cited by the authors. any statement that an observation, derivation, or argument had been previously reported should be accompanied by the relevant citation. a reviewer should also call to the editor's attention any substantial similarity or overlap between the manuscript under consideration and any other published paper of which they have personal knowledge.

disclosure and conflict of interest
privileged information or ideas obtained through peer review must be kept confidential and not used for personal advantage. reviewers should not consider manuscripts in which they have conflicts of interest resulting from competitive, collaborative, or other relationships or connections with any of the authors, companies, or institutions connected to the papers.

reporting standards
authors of reports of original research should present an accurate account of the work performed as well as an objective discussion of its significance. underlying data should be represented accurately in the paper. a paper should contain sufficient detail and references to permit others to replicate the work.
fraudulent or knowingly inaccurate statements constitute unethical behavior and are unacceptable.

data access and retention
authors are asked to provide the raw data in connection with a paper for editorial review and should be prepared to provide public access to such data (consistent with the alpsp-stm statement on data and databases), if practicable, and should in any event be prepared to retain such data for a reasonable time after publication.

originality and plagiarism
the authors should ensure that they have written entirely original works, and if the authors have used the work and/or words of others, that this has been appropriately cited or quoted.

multiple, redundant or concurrent publication
an author should not in general publish manuscripts describing essentially the same research in more than one journal or primary publication. submitting the same manuscript to more than one journal concurrently constitutes unethical publishing behavior and is unacceptable.

acknowledgement of sources
proper acknowledgment of the work of others must always be given. authors should cite publications that have been influential in determining the nature of the reported work.

authorship of the paper
authorship should be limited to those who have made a significant contribution to the conception, design, execution, or interpretation of the reported study. all those who have made significant contributions should be listed as co-authors. where there are others who have participated in certain substantive aspects of the research project, they should be acknowledged or listed as contributors. the corresponding author should ensure that all appropriate co-authors and no inappropriate co-authors are included on the paper, and that all co-authors have seen and approved the final version of the paper and have agreed to its submission for publication.

hazards and human or animal subjects
if the work involves chemicals, procedures or equipment that have any unusual hazards inherent in their use, the author must clearly identify these in the manuscript.

disclosure and conflicts of interest
all authors should disclose in their manuscript any financial or other substantive conflict of interest that might be construed to influence the results or interpretation of their manuscript. all sources of financial support for the project should be disclosed.

fundamental errors in published works
when an author discovers a significant error or inaccuracy in his/her own published work, it is the author’s obligation to promptly notify the journal editor or publisher and cooperate with the editor to retract or correct the paper.

acknowledgment to reviewers in this issue
the contributions and valuable comments of the following reviewers in this issue are greatly appreciated:
arief fatchul huda, uin sunan gunung djati bandung, indonesia
usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia
riswan efendi, uin sultan syarif kasim riau, indonesia
sri harini, universitas islam negeri maulana malik ibrahim malang, indonesia
heni widayani, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia
corina karim, brawijaya university
fachrur rozi, universitas islam negeri maulana malik ibrahim malang, indonesia

cauchy – jurnal matematika murni dan aplikasi, volume 6(4) (2021), pages 286-295
p-issn: 2086-0382; e-issn: 2477-3344
submitted: february 03, 2021; reviewed: march 16, 2021; accepted: april 17, 2021
doi: http://dx.doi.org/10.18860/ca.v6i4.11591

modeling plant stems using the deterministic lindenmayer system

juhari1, muhammad zia alghar2
1,2 department of mathematics, faculty of science and technology, universitas islam negeri
maulana malik ibrahim
email: juhari@uin-malang.ac.id, muhammadzia1904@gmail.com

abstract
plant morphology, including roots, stems, leaves, and flowers, can be modeled mathematically. the lindenmayer system (l-system) method models plant stems through rewriting rules that are applied repeatedly to form a visualization of an object. the deterministic l-system method predicts the possible shape of a plant stem by applying its iterative rewriting rules, based on a photo of the original object. the purpose of this study is to find a model of the plant stem using the deterministic lindenmayer system method, which is then visualized in two-dimensional and three-dimensional space. the research was conducted by identifying objects in the form of pine tree trunks and measuring the angle, thickness, and length of the stem. a deterministic and parametric model was then built from the l-system components. next, the model was visualized in two dimensions and three dimensions. the result of this research is a visualization of a plant stem model that is close to the original. color, stem thickness, and parametric rewriting were added so that the result resembles the original. the number of iterations is limited to fewer than 20 so that the simulation runs optimally.

keywords: modeling; deterministic l-system; plant stems; visualization

introduction
growth is the irreversible process of increasing the size, volume, and number of cells (it cannot return to its original state). on the stem of a plant, growth includes the increase in size and volume of the trunk and branches. a young plant grows quickly, and its growth slows as it matures [1]. branching of the stem is a sign of the growth of a plant. almost all plants branch; only monocot plants branch little on the stem. the branching pattern of the stem is generally divided into three types, namely monopodial, sympodial, and dichotomous [2] [3] [4].
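
the repeated rewriting mentioned in the abstract can be illustrated with a minimal python sketch. the axiom "F" and the single production rule below are hypothetical and chosen only for illustration; they are not the rules used in this study. the symbols "[", "]", "+" and "-" follow the usual bracketed l-system convention (open branch, close branch, turn left, turn right) and are likewise an assumption.

```python
def rewrite(axiom, rules, iterations):
    """Apply deterministic L-system productions to every symbol in parallel."""
    word = axiom
    for _ in range(iterations):
        # Symbols without a production rule are copied unchanged.
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word


# Hypothetical production rule (an assumption, not taken from the paper):
# each stem segment F grows a left branch and a right branch.
RULES = {"F": "F[+F]F[-F]"}

if __name__ == "__main__":
    for n in range(4):
        word = rewrite("F", RULES, n)
        print(n, len(word), word.count("F"))
```

with this hypothetical rule the number of stem segments quadruples at every iteration, which suggests why the number of iterations has to be capped for the simulation to remain practical.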
Figure 1. Branching patterns: (a) monopodial, (b) sympodial, and (c) dichotomous

The Lindenmayer system, commonly referred to as the L-system, is a method developed by Aristid Lindenmayer. The L-system takes a geometric approach and relies on computation to produce a particular shape and model [5]. In general, the Lindenmayer system is a rewriting system with certain rules [6]. The L-system belongs to the study of dynamical systems and is applied to plant morphology, architectural design, and augmented reality (AR) components in video games and three-dimensional films. Rewriting with the Lindenmayer system uses several main components, namely axioms, productions, and letters [7].
A deterministic model is a mathematical model whose outcomes can be determined with certainty; the probability of each subsequent event is not taken into account [6]. The deterministic L-system model can be formed in two or three dimensions. The graphical interpretation of an L-system in two dimensions uses a 2 x 2 rotation matrix, whereas in three dimensions it uses 3 x 3 rotation matrices [8].
The L-system is also related to a postulate of Leonardo da Vinci, namely note no. 394, which reads "all tree branches at any height if put together to have the same thickness as the stem below" [9]. Da Vinci's statement describes the diameters before and after branching, symbolized by d, d1, and d2, where d is the diameter of the parent stem and d1 and d2 are the diameters of the child stems [10].
Figure 2.
Measurement of stem thickness ratio

From the da Vinci postulate, the cross-sectional area of the parent stem (blue circle) equals the combined cross-sectional area of the child stems (green circles), that is, $d^2 = d_1^2 + d_2^2$ [10]. If the two child stems have equal diameter, $d_1 = d_2$, this gives $d_1/d = 1/\sqrt{2} \approx 0.707106$ as the ratio of child to parent stem thickness. The da Vinci ratio is used as a parameter for determining the stem thickness in the L-system [6].

Methods
Research data
The research data are several photographs of the evergreen plant together with measurements of the angle, trunk thickness, and trunk length of the pine tree to be modeled. These data are the basis for forming the L-system pattern. The data are shown in the following tables.

Table 1. Results of the angle measurement in the xz plane
Angle measured (α)      Angle size
First branch            0°
Second branch           90°
Third branch            180°
Fourth branch           270°

Table 2. Results of the angle measurement in the yz plane
Angle measured (β)      Left      Right
First section           5°        5.2°
Second section          6°        6.8°
Third section           5.4°      5.2°
Fourth section          5°        5.2°

Table 3. Results of the angle measurement in the xy plane
Angle measured (θ)      Left      Right
First branch            65°       62°
Child of first branch   52°       54°
Second branch           62°       63°
Child of second branch  66°       64°
Third branch            60°       62°
Child of third branch   64°       61°

Table 4. Measurement results of plant stem length
Type of stem                    Stem length
Mother stem (a)                 17.20 cm
Daughter stem (b)               15.50 cm
Branching daughter stem (c)     9.30 cm

Table 5.
Results of measurements of plant stem thickness
Type of stem                    Trunk thickness
Parent stem                     9.20 cm
Right fork                      6.40 cm
Left fork                       6.20 cm
Daughter of the right fork      4.50 cm
Daughter of the right fork      4.60 cm
Daughter of the left fork       4.40 cm
Daughter of the left fork       4.50 cm

Research steps
The steps taken to model plants using the deterministic L-system are:
(1) Photograph the object being modeled from various sides.
(2) Measure the angle, length, and thickness of the stem for each branch of the object.
(3) Compute the average values and ratios of the measurement results.
(4) Identify the components of the L-system that build the model, such as production rules, letters, axioms, and other components.
(5) Run the simulation and evaluate the results.

Result and discussion
Modeling results
Plant modeling with the deterministic L-system method was carried out on three plants in three-dimensional space. The symbols used in this study are defined in the following table.

Table 6.
Definitions of symbols in the L-system
f(l)   : draws forward by l units, for l > 0
+(a)   : rotates counterclockwise by a degrees with rotation matrix R(α)
-(a)   : rotates clockwise by a degrees with rotation matrix R(α)
&(a)   : rotates counterclockwise by a degrees with rotation matrix R(β)
^(a)   : rotates clockwise by a degrees with rotation matrix R(β)
/(a)   : rotates counterclockwise by a degrees with rotation matrix R(δ)
\(a)   : rotates clockwise by a degrees with rotation matrix R(δ)
|(a)   : rotates by 180° with rotation matrix R(δ)
[      : saves the current position, then moves according to the next command
]      : returns to the position stored by the symbol "["
wr     : specifies the stem thickness ratio
!(x)   : sets the line thickness to x

The following are all production rules created for the three plant objects.

(Pine plant)
w = a(1,15)
r1 = 0.9; r2 = 0.6; wr = 0.691
generation : 15
a0 = 61.25; a1 = 5.47; a2 = 90
p1 : a(l,w) --> !(w*wr)f(l)[e(l*r1,w*wr)]
p2 : b(l,w):(l>=0.1) --> !(w*wr)f(l)[+(a0)^(a1)c(l*r2,w*wr)]f(wr)
     [-(a0)^(a1)c(l*r1,w*wr)] [^(a1)b(l*r2,w*wr2)f(l*r2,w*wr)]
p3 : c(l,w):(l>=0.0) --> !(w*wr)f(l)[^(a1)d(l*r2,w*wr)]
p4 : d(l,w):(l>=0.0) --> !(w*wr)f(l)[^(a1)c(l*r2,w*wr)]
p5 : e(l,w) --> !(w*wr)f(l)[[/(a2)&(l*a2)b(l*r1,w*wr)]
     [\(a2)&(l*a2)b(l*r1,w*wr)][/(2*a2)&(l*a2)b(l*r1,w*wr)]
     [\(0*a2)&(l*a2)b(l*r1,w*wr)]][[/(0.5*a2)&(l*a2)b(l*r1,w*wr)]
     [/(0.666*a2)fe(l*r1,w*wr)]
p6 : s(l,w) --> tf
p7 : t(l,w) --> f

(Ketapang kencana plant)
v = {r1, r2, r3, wr, l, w, a, y, z, s, f, !, -, +, &, ^, /, \, (, ), [, ], *}
axiom w = a(12,20) a(8,20) a(4,20) a(1,20)
generations : 5
r1 = 0.8 r2 = 0.5 r3 = 0.3 wr = 0.707
p1 : a(l,w) --> !(w*0.5)sf(l)[-(70)y(l*r1,w*wr)][+(70)z(l*r1,w*wr)]
     [-(60)^(45)/(15)y(l*r1,w*wr)][+(60)&(45)/(15)z(l*r1,w*wr)]
     [+(120)&(135)\(15)y(l*r1,w*wr)][(120)&(225)\(15)z(l*r1,w*wr)]
     [/(90)-(70)y(l*r1,w*wr)][/(90)+(70)z(l*r1,w*wr)]
p2 : y(l,w) --> !(w*0.3)sf(l)[[^(37.5)y(l*r3,w*wr)][&(37.5)y(l*r3,w*wr)]]
     sf(l)[[^(20)y(l*r3,w*wr)][&(20)y(l*r3,w*wr)]]sf(l)
p3 : z(l,w) --> !(w*0.3)sf(l)[[^(37.5)z(l*r3,w*wr)][&(37.5)z(l*r3,w*wr)]]
     sf(l)[[^(20)z(l*r3,w*wr)][&(20)z(l*r3,w*wr)]]sf(l)
p4 : s --> ssf

(Trembesi plant)
v = {a0, a1, r1, r2, wr, l, w, a, b, c, s, f, !, -, +, ^, /, \, (, ), [, ], *}
axiom w = a(1,90)
generation : 10
r1 = 0.9 r2 = 0.6 a0 = 25 a1 = 10 wr = 0.707
p1 : a(l,w) --> !(w*0.4)-(10)sf(l*0.5)[+(a0)/(90)c(l*r2,w*wr)]
     [-(a1)\(90)a(l*r1,w*wr)][^(a0)\(90)c(l*r1,w*wr)]
p2 : b(l,w) --> !(w*0.4)sf(l)[-(a0)c(l*r2,w*wr)][+(a1)c(l*r1,w*wr)]
p3 : c(l,w) --> !(w*0.4)sf(l)[+(a0)a(l*r2,w*wr)][-(a0)a(l*r1,w*wr)]
p4 : s --> sf

Visualization result
The visualization of the plant stem models with the deterministic L-system method is based on the measured stem thickness, angle, and stem length of the object being modeled, which are then written as Lindenmayer system rules. The L-system is then visualized with a computational application, namely L-studio.
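The rewriting and interpretation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program: it uses a non-parametric, two-dimensional bracketed L-system, the single rule in `RULES` is hypothetical, and only the branching angle 61.25 is borrowed from the pine parameters (a0).

```python
import math

def rewrite(axiom, rules, generations):
    """Deterministic rewriting: every symbol is replaced in parallel each generation."""
    word = axiom
    for _ in range(generations):
        word = "".join(rules.get(symbol, symbol) for symbol in word)
    return word

def turtle_segments(word, step=1.0, angle_deg=61.25):
    """Interpret a bracketed word in 2D and return the drawn line segments."""
    x, y, heading = 0.0, 0.0, 90.0          # start at the origin, pointing up
    stack, segments = [], []
    for symbol in word:
        if symbol == "F":                   # draw forward by one step
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif symbol == "+":                 # rotate counterclockwise
            heading += angle_deg
        elif symbol == "-":                 # rotate clockwise
            heading -= angle_deg
        elif symbol == "[":                 # save the current state
            stack.append((x, y, heading))
        elif symbol == "]":                 # return to the saved state
            x, y, heading = stack.pop()
        # any other symbol (such as A) draws nothing
    return segments

# Hypothetical rule for illustration only: an apex A grows a segment and two branches.
RULES = {"A": "F[+A][-A]"}

word = rewrite("A", RULES, 2)
segments = turtle_segments(word)
print(word)
print(len(segments), "segments")
```

Because the rule set is deterministic, the same axiom and generation count always give the same word, which is the property that distinguishes this approach from a stochastic L-system. The study itself delegates this interpretation step, in three dimensions and with parametric modules, to L-studio.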
This application is specially designed for modeling plant growth and was developed at the University of Calgary [11]. The visualization results are three-dimensional, so the output can be viewed from various points of view. L-studio supports magnification, so the output can be examined in detail. The following are the outputs of the Lindenmayer system program for the three plants in three dimensions.

(Pine plant)
Figure 3. Visualization of pine trees from various iterations

(Trembesi plant)
Figure 4. Visualization of trembesi from various perspectives

(Ketapang kencana plant)
Figure 5. Visualization of ketapang kencana plants from various perspectives

After the L-studio program has produced the visualization, the results are evaluated. Every detail of the visualization is enlarged and rotated in all directions to ensure that it has no defects. If there are defects, the components of the production rules are changed. Fewer than 20 iterations are used for each plant to keep the program from becoming unresponsive or failing when run in L-studio. The following compares the visualization results with the original objects.

(Pine plant)
Figure 6. Comparison of several visualization results of the L-system program on the pine plant with the original object

(Ketapang kencana plant)
Figure 7. Comparison of several visualization results of the L-system program on the ketapang kencana plant with its original object

(Trembesi plant)
Figure 8.
Comparison of several visualization results of the L-system program on the trembesi plant with the original object

Conclusion
Modeling plant stems with the deterministic Lindenmayer system method is simpler than modeling with the stochastic Lindenmayer system because it involves no probability factor. The important initial stage in modeling plant stems is to determine the main components of the L-system. In this modeling, the researchers use three-dimensional visualization of the result. The visualization results are therefore displayed from the front, from the side (rotated 90° about the x axis), from the top (rotated 90° about the z axis), and slightly from below (rotated 45° about the z axis). The L-studio application is very helpful in visualizing the plant models: in iterating, in determining the production rules, and in repeating the visualization. The number of iterations needs to be considered so that the program runs smoothly. The researchers use fewer than 20 iterations so that it runs well.

References
[1] A. Shipunov, Introduction to Botany, USA: Minot State University, 2011.
[2] R. McGarry, "Monopodial and sympodial branching architecture in cotton is differentially regulated by the Gossypium hirsutum SINGLE FLOWER TRUSS and SELF-PRUNING orthologs," New Phytologist, pp. 1-2, 26 April 2016.
[3] J. Power, "Interactive arrangement of botanical L-system models," Proceedings of the 1999 Symposium on Interactive 3D Graphics SI3D '99, pp. 175-182, 1999.
[4] C. Jacob, "Genetic L-system programming: breeding and evolving artificial flowers with Mathematica," in Proceedings of the First International Mathematica Symposium, vol. 33976, pp. 215-222, 1995.
[5] P. Prusinkiewicz, "Developmental models of herbaceous plants for computer imagery purposes," Computer Graphics, vol. 22, no. 0097-8930, pp. 141-150, 1988.
[6] A. Lindenmayer, The Algorithmic Beauty of Plants, New York: Springer-Verlag, 1990.
[7] Juhari, "Pemodelan pertumbuhan tanaman Zea mays L. menggunakan stochastic L-system," Jurnal CAUCHY, vol. 3, no. 2477-3344, p. 2, 2013.
[8] C. H. Iswanto, "Penerapan stochastic L-systems pada pemodelan pertumbuhan batang tanaman," pp. 1-18, 1 January 2014.
[9] J. Richter, The Notebooks of Leonardo da Vinci, New York: Dover Publications, 1970.
[10] B. Mandelbrot, The Fractal Geometry of Nature, San Francisco: W.H. Freeman, 1982.
[11] Suhartono, "Pemodelan pertumbuhan tanaman zinnia menggunakan Lindenmayer system dengan Mathematica," Jurnal CAUCHY, vol. 3, no. 2086-0382, pp. 33-37, 2013.
[12] Juhari, "Pemodelan pertumbuhan batang tanaman menggunakan deterministic L-systems," p. 3, 25 November 2013.
[13] M. Kahfi, Geometri Transformasi, Malang: IKIP Malang, 1997.

Inclusion Properties of Herz-Morrey Spaces with Variable Exponent
CAUCHY - Jurnal Matematika Murni dan Aplikasi, Volume 7(1) (2021), pages 22-27. p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: May 02, 2021. Reviewed: August 24, 2021. Accepted: October 06, 2021. DOI: https://doi.org/10.18860/ca.v7i1.12141
Hairur Rahman
Department of Mathematics, Islamic State University of Maulana Malik Ibrahim Malang
Email: hairur@mat.uin-malang.ac.id

Abstract
The inclusion properties of Herz-Morrey spaces were proved by Rahman in 2020. This paper discusses the inclusion properties of the homogeneous Herz-Morrey spaces and the homogeneous weak Herz-Morrey spaces with variable exponent. We also investigate the inclusion between both spaces. This result will be useful for studying the fractional integral on the homogeneous Herz-Morrey spaces with variable exponent.

Keywords: Herz-Morrey spaces; inclusion properties; variable exponent.

Introduction
Inclusion properties, or inclusion relations between spaces, have received a lot of attention from researchers.
Many authors have studied this issue in various spaces (see [1]-[5]). This led the author to discuss the inclusion properties in Herz-Morrey spaces in particular. Herz spaces can be traced back to the work of Beurling. Beurling [6] introduced a space $\mathcal{A}^p$, which is the original version of the non-homogeneous Herz spaces. Lu et al. [7] gave the inclusion properties of homogeneous Herz spaces in the proposition below.

Proposition 1.1. Let $\alpha \in \mathbb{R}$, $p > 0$, and $q \le \infty$. The following inclusions are valid.
a. If $p_1 \le p_2$, then $\dot{K}_{q}^{\alpha,p_1}(\mathbb{R}^n) \subset \dot{K}_{q}^{\alpha,p_2}(\mathbb{R}^n)$.
b. If $q_2 \le q_1$, then $\dot{K}_{q_1}^{\alpha,p}(\mathbb{R}^n) \subset \dot{K}_{q_2}^{\alpha - n\left(\frac{1}{q_1} - \frac{1}{q_2}\right),p}(\mathbb{R}^n)$.

This proposition can be proved by simple computation. In fact, if $0 < r < 1$, (a) is a consequence of the inequality
$$\left(\sum_{k=1}^{\infty} |a_k|\right)^{r} \le \sum_{k=1}^{\infty} |a_k|^{r},$$
while (b) can be deduced directly from the Hölder inequality.
In 2016, Gunawan et al. (see [1] [2]) proved the inclusion properties of Morrey spaces and generalized Morrey spaces. Recently, Rahman [8] also proved the inclusion properties of Herz-Morrey spaces. These results motivated the author to study inclusion in homogeneous Herz-Morrey spaces further, in this case with variable exponent.
Since 1991, the research of Kovacik and Rakosnik [9] has motivated many researchers to study function spaces with variable exponent. Suppose that $\Omega \subset \mathbb{R}^n$ is an open set and $p(\cdot): \Omega \to [1, \infty)$ is a measurable function. Let $L^{p(\cdot)}(\Omega)$ denote the set of measurable functions $f$ on $\Omega$ such that, for some positive $\lambda$,
$$\int_{\Omega} \left(\frac{|f(x)|}{\lambda}\right)^{p(x)} dx < \infty.$$
If $L^{p(\cdot)}(\Omega)$ is equipped with the Luxemburg-Nakano norm
$$\|f\|_{L^{p(\cdot)}(\Omega)} = \inf\left\{\lambda > 0 : \int_{\Omega} \left(\frac{|f(x)|}{\lambda}\right)^{p(x)} dx \le 1\right\},$$
then $L^{p(\cdot)}(\Omega)$ becomes a Banach function space.
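As a quick consistency check on the Luxemburg-Nakano norm (a worked sketch added for illustration, not part of the original argument), assume a constant exponent $p(x) \equiv p$ with $1 \le p < \infty$ and a function $f \ne 0$ with finite $L^p$ norm. The modular condition then reduces to a condition on $\lambda$ alone:

```latex
\begin{align*}
\int_{\Omega} \left(\frac{|f(x)|}{\lambda}\right)^{p} dx \le 1
  &\iff \lambda^{-p}\,\|f\|_{L^{p}(\Omega)}^{p} \le 1
   \iff \lambda \ge \|f\|_{L^{p}(\Omega)}, \\
\|f\|_{L^{p(\cdot)}(\Omega)}
  &= \inf\{\lambda > 0 : \lambda \ge \|f\|_{L^{p}(\Omega)}\}
   = \|f\|_{L^{p}(\Omega)}.
\end{align*}
```

So for a constant exponent the spaces $L^{p(\cdot)}(\Omega)$ carry exactly the usual $L^p$ norm.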
Since these spaces generalize the standard $L^p$ spaces, they are also referred to as variable $L^p$ spaces. $L^{p(\cdot)}(\Omega)$ is isometrically isomorphic to $L^{p}(\Omega)$ when $p(x) = p$ is a constant.
In 2010, the boundedness of sublinear operators on Herz-Morrey spaces with variable exponent was proved by Izuki [10]. Then, Xu and Yang [11] developed the definition of Herz-Morrey spaces with variable exponents. Let $p(\cdot) \in \mathcal{P}(\mathbb{R}^n)$, $0 < q < \infty$, $0 \le \lambda < \infty$, and let $\alpha(\cdot)$ be a bounded real-valued measurable function on $\mathbb{R}^n$. The homogeneous Herz-Morrey space with variable exponent $\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$ consists of all functions $f \in L^{q}_{loc}(\mathbb{R}^n \setminus \{0\})$ such that
$$\|f\|_{\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)} = \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p(\cdot)} \|f\chi_k\|_{L^{q}(\mathbb{R}^n)}^{p(\cdot)}\right)^{\frac{1}{p(\cdot)}} < \infty,$$
where $B_k = \{x \in \mathbb{R}^n : |x| \le 2^k\}$, $A_k = B_k \setminus B_{k-1}$, and $\chi_k = \chi_{A_k}$ is the characteristic function of the set $A_k$ for $k \in \mathbb{Z}$.
Like other spaces that have weak-type counterparts, Herz-Morrey spaces also have their weak-type spaces. For $\alpha(\cdot)$ as above, $p(\cdot) \in \mathcal{P}(\mathbb{R}^n)$, $0 \le \lambda \le \infty$ and $0 < q \le \infty$, the homogeneous weak Herz-Morrey space with variable exponent $W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$ is the set of measurable $f \in L^{q}_{loc}(\mathbb{R}^n \setminus \{0\})$ such that
$$\|f\|_{W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)} = \sup_{\gamma > 0} \gamma \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p(\cdot)} m_k(\gamma, f)^{\frac{p(\cdot)}{q}}\right)^{\frac{1}{p(\cdot)}} < \infty,$$
where $m_k(\gamma, f) = |\{x \in A_k : |f(x)| > \gamma\}|$.
Some authors have investigated those spaces in various contexts (see [12]-[15]). This article discusses the inclusion properties and inclusion relation of the homogeneous Herz-Morrey spaces and the homogeneous weak Herz-Morrey spaces with variable exponent.

Result and discussion
Our main results are the following.

Theorem 2.1. Let $1 \le p_1(\cdot) \le p_2(\cdot) < q < \infty$, and let $\alpha(\cdot)$ be a bounded real-valued measurable function on $\mathbb{R}^n$. Then the inclusion
$$\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$$
is valid.
Proof.
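Take any $f \in \mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. Then, by using the Hölder inequality and $p_1 \le p_2$, we have
$$\begin{aligned}
\|f\|_{\mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}
&= \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_1(\cdot)} \|f\chi_k\|_{L^q(\mathbb{R}^n)}^{p_1(\cdot)}\right)^{\frac{1}{p_1(\cdot)}} \\
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\left(\sum_{k=-\infty}^{L} \left(2^{k\alpha(\cdot)p_1(\cdot)}\right)^{\frac{p_2(\cdot)}{p_1(\cdot)}}\right)^{\frac{p_1(\cdot)}{p_2(\cdot)}} \left(\sum_{k=-\infty}^{L} \left(\|f\chi_k\|_{L^q(\mathbb{R}^n)}^{p_1(\cdot)}\right)^{\frac{p_2(\cdot)}{p_2(\cdot)-p_1(\cdot)}}\right)^{1-\frac{p_1(\cdot)}{p_2(\cdot)}}\right)^{\frac{1}{p_1(\cdot)}} \\
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_2(\cdot)}\right)^{\frac{p_1(\cdot)}{p_2(\cdot)}} \left(\sum_{k=-\infty}^{L} \|f\chi_k\|_{L^q(\mathbb{R}^n)}^{\frac{p_1(\cdot)p_2(\cdot)}{p_2(\cdot)-p_1(\cdot)}}\right)^{1-\frac{p_1(\cdot)}{p_2(\cdot)}}\right)^{\frac{1}{p_1(\cdot)}} \\
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_2(\cdot)} \left(\sum_{k=-\infty}^{L} \|f\chi_k\|_{L^q(\mathbb{R}^n)}^{\frac{p_1(\cdot)p_2(\cdot)}{p_2(\cdot)-p_1(\cdot)}}\right)^{\frac{p_2(\cdot)-p_1(\cdot)}{p_1(\cdot)}}\right)^{\frac{1}{p_2(\cdot)}} \\
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_2(\cdot)} \|f\chi_k\|_{L^q(\mathbb{R}^n)}^{p_2(\cdot)}\right)^{\frac{1}{p_2(\cdot)}} \\
&\le \|f\|_{\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}.
\end{aligned}$$
Hence $f \in \mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$, with $\alpha(\cdot)$ and $p(\cdot) \in \mathcal{P}(\mathbb{R}^n)$ as in the hypotheses. Then we have $\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$.

From the previous theorem, the author established the following inclusions.

Theorem 2.2. Let $1 \le p_1(\cdot) \le p_2(\cdot) < q < \infty$, and let $\alpha(\cdot)$ be a bounded real-valued measurable function on $\mathbb{R}^n$. Then the following inclusion is valid:
$$L^{q}(\mathbb{R}^n) = \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n).$$
Proof. Theorem 2.1 states that $\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. It therefore remains to prove that $L^{q}(\mathbb{R}^n) = \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. Let $f \in L^{q}(\mathbb{R}^n)$. By a similar method as before, we get
$$\begin{aligned}
\|f\|_{\mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)q} \left(\left(\int_{B(0,2^k)} |f(x)|^{q}\,dx\right)^{\frac{1}{q}} \left(\int_{B(0,2^k)} |\chi_k|^{q}\,dx\right)^{\frac{1}{q}}\right)^{q}\right)^{\frac{1}{q}} \\
&\le \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)} \left(\int_{B(0,2^k)} |f(x)|^{q}\,dx\right)^{\frac{1}{q}} \left(2^{kd}\right)^{\frac{1}{q}} \\
&\le C \left(\int_{B(0,2^k)} |f(x)|^{q}\,dx\right)^{\frac{1}{q}} \\
&\le \|f\|_{L^{q}(\mathbb{R}^n)}.
\end{aligned}$$
Hence $f \in \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$ and $L^{q}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. On the other hand, for any $f \in L^{q}(\mathbb{R}^n)$, there exists a constant $C$ such that
$$C = \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \sum_{k=-\infty}^{L} 2^{k\alpha(\cdot) + \frac{kd}{q}}.$$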
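Consequently, we have $f \in \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$ and $\mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq L^{q}(\mathbb{R}^n)$. This gives the conclusion that $L^{q}(\mathbb{R}^n) = \mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. Furthermore, we will prove that $\mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq \mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. By a method similar to the proof of Theorem 2.1, we have
$$\|f\|_{\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)} \le \|f\|_{\mathcal{M}\dot{K}_{q,q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}.$$

The author also adds the inclusion of the homogeneous weak Herz-Morrey spaces with variable exponent in the following theorem.

Theorem 2.3. Let $1 \le p_1(\cdot) \le p_2(\cdot) \le q < \infty$, and let $\alpha(\cdot)$ be a bounded real-valued measurable function on $\mathbb{R}^n$. Then the following inclusion holds:
$$W\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq W\mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n).$$
Proof. Let $f \in W\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$. We have
$$\begin{aligned}
\|f\|_{W\mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}
&= \sup_{\gamma > 0} \gamma \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_1(\cdot)} m_k(\gamma, f)^{\frac{p_1(\cdot)}{q}}\right)^{\frac{1}{p_1(\cdot)}} \\
&\le \sup_{\gamma > 0} \gamma \sup_{L \in \mathbb{Z}} \frac{1}{2^{L\lambda}} \left(\sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p_2(\cdot)} m_k(\gamma, f)^{\frac{p_2(\cdot)}{q}}\right)^{\frac{1}{p_2(\cdot)}} \\
&\le \|f\|_{W\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}.
\end{aligned}$$
The above inequality shows that $W\mathcal{M}\dot{K}_{p_2(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq W\mathcal{M}\dot{K}_{p_1(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$.

Now, we state the inclusion relation between both spaces.

Theorem 2.4. Let $1 \le p(\cdot) \le q$, and let $\alpha(\cdot)$ be a bounded real-valued measurable function on $\mathbb{R}^n$. Then the inclusion $\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$ is proper.
Proof. We use a similar idea as before to prove this theorem. Let $f \in \mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$, $p(\cdot) \in \mathcal{P}(\mathbb{R}^n)$ and $\gamma > 0$. Since $|f(x)| > \gamma$ on the set being measured, Chebyshev's inequality gives
$$\gamma^{p(\cdot)}\, |\{x \in A_k : |f(x)| > \gamma\}|^{\frac{p(\cdot)}{q}} \le \left(\int_{B(0,2^k)} |f(x)\chi_k(x)|^{q}\,dx\right)^{\frac{p(\cdot)}{q}} = \|f\chi_k\|_{L^{q}(\mathbb{R}^n)}^{p(\cdot)}.$$
Multiplying both sides by $2^{k\alpha(\cdot)p(\cdot)}$ and summing over $k \le L$, we get
$$\gamma^{p(\cdot)} \sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p(\cdot)} |\{x \in A_k : |f(x)| > \gamma\}|^{\frac{p(\cdot)}{q}} \le \sum_{k=-\infty}^{L} 2^{k\alpha(\cdot)p(\cdot)} \|f\chi_k\|_{L^{q}(\mathbb{R}^n)}^{p(\cdot)}.$$
Clearly, we see that
$$\|f\|_{W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)} \le \|f\|_{\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)}$$
and $f \in W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$, which implies that $\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n) \subseteq W\mathcal{M}\dot{K}_{p(\cdot),q}^{\alpha(\cdot),\lambda}(\mathbb{R}^n)$.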
Conclusion
From this result, the author can conclude that the homogeneous Herz-Morrey spaces with variable exponent have inclusion properties ... . This result will be useful for studying the fractional integral on the homogeneous Herz-Morrey spaces with variable exponent.

Acknowledgment
This paper is partially supported by the UIN Maulana Malik Ibrahim Malang Research and Innovation Program 2020.

References
[1] H. Gunawan, D. I. Hakim, K. M. Limanta and A. A. Masta, "Inclusion property of generalized Morrey spaces," Math. Nachr., pp. 1-9, 2016.
[2] H. Gunawan, D. I. Hakim and M. Idris, "Proper inclusions of Morrey spaces," Glasnik Matematicki, vol. 53, no. 1, 2017.
[3] H. Gunawan, D. I. Hakim, E. Nakai and Y. Sawano, "On inclusion relation between weak Morrey spaces and Morrey spaces," Nonlinear Analysis, vol. 168, pp. 27-31, 2018.
[4] H. Gunawan, E. Kikianty and C. Schwanke, "Discrete Morrey spaces and their inclusion properties," Math. Nachr., pp. 1-14, 2017.
[5] A. A. Masta, H. Gunawan and W. Setya-Budhi, "An inclusion property of Orlicz-Morrey spaces," J. Phys.: Conf. Ser., vol. 893, pp. 1-7, 2017.
[6] A. Beurling, "Construction and analysis of some convolution algebras," Annales de l'Institut Fourier Grenoble, vol. 14, pp. 1-32, 1964.
[7] S. Lu, D. Yang and G. Hu, Herz Type Spaces and Their Applications, Beijing: Science Press, 2008.
[8] H. Rahman, "Inclusion properties of the homogeneous Herz-Morrey," CAUCHY, vol. 6, no. 3, pp. 117-121, 2020.
[9] O. Kovacik and J. Rakosnik, "On spaces $L^{p(x)}$ and $W^{k,p(x)}$," Czechoslovak Math. J., vol. 41, pp. 592-618, 1991.
[10] M. Izuki, "Boundedness of sublinear operators on Herz spaces with variable exponent and application to wavelet characterization," Analysis Mathematica, vol. 36, no. 1, pp. 33-50, 2010.
[11] J. Yang and J. Xu, "Herz-Morrey-Hardy spaces with variable exponents and their applications," Journal of Function Spaces, pp. 1-19, 2015.
[12] S. Lu and L. Xu, "Boundedness of rough singular integral operators on the homogeneous Morrey-Herz spaces," Hokkaido Math. Journal, vol. 34, pp. 299-314, 2005.
[13] M. Izuki, "Fractional integral on Herz-Morrey spaces with variable exponent," Hiroshima Math. J., vol. 40, pp. 343-355, 2010.
[14] Y. Mizuta and T. Ohno, "Herz-Morrey spaces of variable exponent, Riesz potential operator and duality," Complex Variables and Elliptic Equations, vol. 60, no. 2, pp. 211-240, 2015.
[15] Y. Shi, X. Tao and T. Zheng, "Multilinear Riesz potential on Morrey-Herz spaces with non-doubling measure," Journal of Inequalities and Applications, vol. 10, 2010.

Multivariate Adaptive Regression Splines and Bootstrap Aggregating Multivariate Adaptive Regression Splines of Poverty in Central Java
CAUCHY - Jurnal Matematika Murni dan Aplikasi, Volume 6(4) (2021), pages 238-245. p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: November 25, 2020. Reviewed: February 19, 2021. Accepted: April 11, 2021. DOI: http://dx.doi.org/10.18860/ca.v6i4.10871
Ria Dhea LN Karisma1, Juhari2, Ramadani A. Rosa3
1,2,3 Department of Mathematics, UIN Maulana Malik Ibrahim Malang
Email: riadhea@uin-malang.ac.id, juhari@uin-malang.ac.id, ramadaniauiyanarosa@gmail.com

Abstract
Population poverty is one of the serious problems in Indonesia. The percentage of the population in poverty is used as a statistical instrument to guide standard policies and evaluations for reducing poverty. The aims of this research are to model population poverty using multivariate adaptive regression splines (MARS) and bagging MARS, and to identify the variable that most influences population poverty in Central Java Province in 2018. The result of this research is that the bagging MARS model showed better accuracy than the MARS model.
The GCV of the bagging MARS model is 0.009798721, whereas the GCV of the MARS model is 6.985571. Based on the MARS model, the variable that most influences population poverty in Central Java Province in 2018 is the expected years of schooling. Based on the bagging MARS model, the most influential variable is the number of diarrhea cases.

Keywords: multivariate adaptive regression splines; bootstrap aggregating; generalized cross-validation; poverty

Introduction
Poverty is a problem of concern throughout the world, including in Indonesia. In Indonesia, a developing country, poverty affects the economy and reflects the level of welfare. It has therefore become a serious problem that must be resolved. Economic growth is the fundamental factor in reducing poverty. Based on BPS data, Indonesia has been able to deal with several global economic problems and has succeeded in increasing economic growth. Several programs have been carried out to help improve the community's living standards, such as credit procurement programs, agricultural development, equitable development, infrastructure improvement, and the Inpres Desa Tertinggal (IDT) program for underdeveloped villages. These efforts are considered significant because they reduce the gap between the rich and the underprivileged and help realize the strategy of human quality development [1].
According to BPS (Badan Pusat Statistik), the Indonesian statistics institution, the level of poverty in Indonesia has fallen recently. The percentage of poor people fell by 0.58% (year-on-year), and in 2017 the poverty rate was at its lowest level. The government succeeded in reducing the number of poor people by 1.18 million.
The government implemented a social protection system based on a life-cycle approach in 2018. However, in some areas the reduction of the poverty rate slowed [2].
MARS was introduced to estimate general functions of high-dimensional data without assuming a particular form for the relationship between the dependent and independent variables. Bagging MARS improves the performance of the MARS method through bootstrap replication. Among past studies, Karisma and Sri Harini [3] used MARS to classify the risk factors of ischemic and hemorrhagic patients, and Kilinc et al. [4] used MARS to model metal concentrations in order to determine soil pollution. The MARS model combines the spline method with recursive partitioning. A spline regression model applies a set of basis functions to obtain a q-order spline regression and is estimated by the least squares method. It uses knots to keep the basis functions continuous from one region of the regression line to the next. Bootstrap aggregating (bagging), in turn, is used to minimize the squared error. This research aims to identify the factors that influence poverty using MARS and bagging MARS, so that the results can guide standard policies and evaluations for reducing poverty.

Methods
The poor are residents whose average monthly expenditure per capita is below the poverty line [5]. Poverty is influenced by factors such as human resources, employment, inflation, unemployment, population density, health facilities, income, scarcity, transportation, education, and business capital [6]. MARS is a multivariate nonparametric approach. It is based on recursive partitioning and handles high-dimensional data and data with discontinuities. Bagging MARS is a method that improves the performance of the MARS method through bootstrap replication. MARS was developed from recursive partitioning regression (RPR) to produce a model that is continuous at the knots between subregions [7].
The advantages of MARS are that it does not require standardization, produces accurate results, can be used on large data sets, and can be used for regression analysis and classification simultaneously. Recursive partitioning regression (RPR) cannot handle the discontinuity of the data at the knots; the RPR algorithm is therefore modified to estimate the data in the subregions [8]. The basis functions explain the relationship between the dependent and independent variables [9]. The regression model with basis functions (BF) is as follows:
$$y = \beta_0 + \sum_{m=1}^{M} \beta_m h_m(x) \tag{1}$$
where $h_m$ is a set of basis functions and $\beta_m$ is the coefficient of $h_m$. The spline basis function is defined as:
$$h_m(x) = \prod_{k=1}^{K_m} \left[S_{km}\left(x_{v(k,m)} - t_{km}\right)\right]_{+} \tag{2}$$
After the BF is modified with the RPR model, the MARS model is obtained as follows:
$$f(x) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[S_{km}\left(x_{v(k,m)} - t_{km}\right)\right]_{+} \tag{3}$$
where $a_0$ is the constant coefficient, $a_m$ is the coefficient of the m-th basis function, $M$ is the maximum number of basis functions, $K_m$ is the degree of interaction, $x_{v(k,m)}$ is the predictor variable, $t_{km}$ is the knot of the predictor variable $x_{v(k,m)}$, and $S_{km}$ takes the values $\pm 1$ [7].
In matrix form, the MARS model is defined by
$$Y = Ba + \varepsilon, \quad Y = (Y_1, \ldots, Y_n)^T, \quad a = (a_0, \ldots, a_M)^T, \quad \varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T \tag{4}$$
$$B = \begin{bmatrix}
1 & \prod_{k=1}^{K_1} \left[S_{k1}\left(x_{1,v(k,1)} - t_{k1}\right)\right]_{+} & \cdots & \prod_{k=1}^{K_M} \left[S_{kM}\left(x_{1,v(k,M)} - t_{kM}\right)\right]_{+} \\
1 & \prod_{k=1}^{K_1} \left[S_{k1}\left(x_{2,v(k,1)} - t_{k1}\right)\right]_{+} & \cdots & \prod_{k=1}^{K_M} \left[S_{kM}\left(x_{2,v(k,M)} - t_{kM}\right)\right]_{+} \\
\vdots & \vdots & \ddots & \vdots \\
1 & \prod_{k=1}^{K_1} \left[S_{k1}\left(x_{n,v(k,1)} - t_{k1}\right)\right]_{+} & \cdots & \prod_{k=1}^{K_M} \left[S_{kM}\left(x_{n,v(k,M)} - t_{kM}\right)\right]_{+}
\end{bmatrix} \tag{5}$$
Generalized cross-validation (GCV) is used to find the best model in the MARS method; a smaller GCV is better. The best model is determined by trial and error over combinations of the number of basis functions (BF), the maximum interaction (MI), and the minimum observation (MO) [4].
The GCV is defined as:
$$GCV = \frac{MSE}{\left[1 - \frac{C(\hat{M})}{n}\right]^{2}} \tag{6}$$
where $MSE = \frac{1}{n} \sum_{i=1}^{n} \left[y_i - f_M(x_i)\right]^{2}$ and $C(\hat{M})$ is defined as
$$C(\hat{M}) = C(M) + dM \tag{7}$$
where $C(M) = \operatorname{trace}\left[B(B^{T}B)^{-1}B^{T}\right] + 1$ is the number of parameters being fit and $d$ represents a cost for each basis function optimization [7].
The research used data from Sosial Ekonomi Nasional (Susenas), BPS (Badan Pusat Statistik, the Indonesian statistics institution) for Java Province, and BPS Semarang regional. The total number of data points used in this research was 350. The analysis used MARS and bagging MARS in the following steps. First, the data were divided into training and testing data. The MARS method was then applied with combinations of basis functions (BF), maximum interaction (MI), and minimum observations (MO) [10]. The minimum GCV value was used to determine the best MARS model, which was then interpreted. The bagging MARS model was built with 50 replications, and the best bagging MARS model was selected. Finally, the variable that most influenced poverty in Central Java Province in 2018 was determined.

Results and discussion
Descriptive statistics
Descriptive analysis was used to determine the characteristics of poverty in Central Java in 2018 (Badan Pusat Statistik, 2019).

Figure 1. Descriptive analysis of the poverty population

Figure 1 shows the percentage of population poverty in Central Java in 2018. The histogram shows the regency areas and the percentage of the poverty population in each area. The highest poverty was in Kabupaten Wonosobo, at 17.58%, that is, almost one fifth of its population. The percentage of population poverty is driven by factors such as socioeconomic conditions, technology, health care, and others.
The lowest poverty rate was in Kota Semarang, at under one in twenty of its total population.

Modeling poverty population with MARS and bagging MARS methods

The data for the MARS model are shown as a matrix plot (see Figure 2). The matrix plot displays the relationship between the response variable, the percentage of poor population ($Y$), and the predictor variables: the number of diarrhea cases ($X_1$), life expectancy ($X_2$), the human development index (HDI) ($X_3$), the percentage of per capita expenditure on non-food commodities ($X_4$), the percentage of open unemployment ($X_5$), the number of cases of infant malnutrition ($X_6$), the percentage of family planning and birth control ($X_7$), the labor force participation rate ($X_8$), the expected years of schooling ($X_9$), and the number of BPJS participants ($X_{10}$).

Figure 2. Matrix plot pattern of poverty population

Figure 2 shows no clear pattern in the relationships between the variables. Each variable has different characteristics across the regions, and the form of its relationship with the response cannot be specified in advance. A nonparametric approach, namely MARS and bagging MARS, was therefore used in this research. In both methods the best model is indicated by the GCV. For the MARS model the GCV was 6.985571 and the R-sq value was 75.7%. Five predictor variables were significant and affected the poverty population, namely $X_1$, $X_6$, $X_8$, $X_9$, and $X_{10}$, using 85% of the data for training and 15% for testing.
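The selection criterion of equations (6) and (7), used to rank the candidate models, can be sketched in a few lines. The function name `gcv`, its argument names, and the default penalty `d=3.0` are assumptions for illustration (the penalty actually used in this study is not stated), and the data below are synthetic.

```python
import numpy as np

def gcv(y, y_hat, c_m, n_basis, d=3.0):
    """GCV = MSE / (1 - C_hat / n)^2 with C_hat = C(M) + d * M.

    c_m     : C(M), the number of fitted parameters (trace term + 1)
    n_basis : M, the number of basis functions
    d       : cost per basis function optimization (assumed value)
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    mse = np.mean((y - y_hat) ** 2)
    c_hat = c_m + d * n_basis
    return mse / (1.0 - c_hat / n) ** 2

# Synthetic check: 20 observations, every residual equal to 0.5
y = np.arange(20, dtype=float)
print(gcv(y, y + 0.5, c_m=4, n_basis=2))
```

Because the denominator shrinks as the number of basis functions grows, a larger model must reduce the MSE substantially to achieve a smaller GCV.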
The MARS model obtained is

$$f(x) = 12.8 - 0.000235 \max(0, X_1 - 19574) + 0.0107 \max(0, 249 - X_6) - 0.514 \max(0, X_8 - 67.5) + 7.35 \max(0, 12.4 - X_9) - 1.34 \times 10^{-5} \max(0, 597322 - X_{10})$$

Each term of the model contributes only on the active side of its knot, and the sign of the effect follows from the sign of the coefficient and the orientation of the hinge:

$-0.000235 \max(0, X_1 - 19574)$: when $X_1$ is greater than 19574, each additional case of diarrhea changes the fitted percentage of poor population by $-0.000235$; for regions with at most 19574 cases the term is zero.

$0.0107 \max(0, 249 - X_6)$: when $X_6$ is smaller than 249, each additional case of infant malnutrition lowers the fitted percentage of poor population by 0.0107; for regions with at least 249 cases the term is zero.

$-0.514 \max(0, X_8 - 67.5)$: when $X_8$ is greater than 67.5, each one-point increase in the labor force participation rate lowers the fitted percentage of poor population by 0.514; the term applies to regions whose participation rate is above 67.5%.

$7.35 \max(0, 12.4 - X_9)$: when $X_9$ is smaller than 12.4, each one-unit increase in the expected years of schooling lowers the fitted percentage of poor population by 7.35; the term applies to regions whose expected years of schooling are below 12.4.

$-1.34 \times 10^{-5} \max(0, 597322 - X_{10})$: when $X_{10}$ is smaller than 597322, each additional BPJS participant raises the fitted percentage of poor population by 0.0000134; for regions with at least 597322 participants the term is zero.

In the bagging MARS method with 50 replications, the best model, selected by minimum GCV, was obtained at the 49th replication.
Six predictor variables significantly affected the poverty population, namely $X_1$, $X_4$, $X_6$, $X_7$, $X_8$, and $X_{10}$. The GCV was 0.009431298 and the R-sq value was 0.7955023. The model was

$$\begin{aligned}
\hat{f}(x) = {} & 11.17643 - 0.0001232638 \max(0, 13503 - X_1) + 0.0001346581 \max(0, X_1 - 13503) \\
& + 1.637211 \max(0, 48.96 - X_4) - 0.6424541 \max(0, X_4 - 48.96) - 0.0250127 \max(0, X_6 - 52) \\
& + 8.251765 \times 10^{-5} \max(0, 33664 - X_7) - 0.0001611239 \max(0, X_7 - 33664) \\
& - 0.07994066 \max(0, 67.03 - X_8) - 0.1345248 \max(0, X_8 - 67.03) \\
& + 1.335112 \times 10^{-5} \max(0, X_{10} - 763837)
\end{aligned} \qquad (8)$$

Table 1. Comparison of the MARS and bagging MARS models

Model          Significant variables                        GCV
MARS           $X_1, X_6, X_8, X_9, X_{10}$                 6.985571
Bagging MARS   $X_1, X_4, X_6, X_7, X_8, X_{10}$            0.009798721

Table 1 shows that the GCV of the bagging MARS model was 0.009798721, whereas that of the MARS model was 6.985571. Because the bagging MARS model has the smaller GCV, it is more accurate than the MARS model.

Best variable in MARS and bagging MARS methods

According to the MARS model, poverty in Central Java is affected by the number of diarrhea cases ($X_1$), the number of cases of infant malnutrition ($X_6$), the labor force participation rate ($X_8$), the expected years of schooling ($X_9$), and the number of BPJS participants ($X_{10}$). Table 2 lists the importance of these variables in the MARS model.

Table 2. Variable importance in the MARS model

Variable    Importance (%)
$X_1$       40.9
$X_6$       22.9
$X_8$       31.7
$X_9$       100
$X_{10}$    50.8

For the bagging MARS model, the variable importance is shown in Table 3.
The variables were the number of diarrhea cases ($X_1$), the percentage of per capita expenditure on non-food commodities ($X_4$), the percentage of family planning and birth control ($X_7$), the labor force participation rate ($X_8$), the expected years of schooling ($X_9$), and the number of BPJS participants ($X_{10}$).

Table 3. Variable importance in the bagging MARS model

Variable    Importance
$X_1$       95.32921
$X_4$       0.000000
$X_7$       60.80385
$X_8$       0.000000
$X_{10}$    0.000000

The MARS and bagging MARS methods differ in their variable importance. In the MARS method the most important variable, at 100%, was the expected years of schooling ($X_9$), whereas in the bagging MARS method it was the number of diarrhea cases ($X_1$), at 95.33%.

Conclusions

The bagging MARS method achieved better accuracy than the MARS model. The variable with the greatest influence on poverty in Central Java in 2018 was the expected years of schooling ($X_9$) under the MARS method and the number of diarrhea cases ($X_1$) under the bagging MARS method.

References

[1] Tjiptoherijanto, P. (1997). Prospek perekonomian Indonesia dalam rangka globalisasi. Rineka Cipta.
[2] Badan Pusat Statistik. (2017). Perhitungan dan analisis kemiskinan makro di Indonesia.
[3] Karisma & Sri Harini. (2019). Multivariate adaptive regression spline in ishemic and hemorrhagic. Journal AIP Conference Proceedings of Symposium on Biomathematics, 1–8.
[4] Kilinc, B. K., Malkoc, S., Koparal, A. S., & Yazici, B. (2017). Using multivariate adaptive regression splines to estimate pollution in soil. International Journal of Advanced and Applied Sciences. https://doi.org/10.21833/ijaas.2017.02.002
[5] Badan Pusat Statistik. (2019). Kemiskinan dan ketimpangan. Badan Pusat Statistik Kemiskinan dan Ketimpangan. https://www.bps.go.id/subject/23/kemiskinan-dan-ketimpangan.html
[6] Kurniawan, M. D. (2017). Analisis faktor-faktor penyebab kemiskinan di Kabupaten Musi Banyuasin (studi kasus di Kecamatan Sungai Lilin). Jurnal Ilmiah Ekonomi Global Masa Kini.
[7] Friedman, J. H. (1991). Rejoinder: Multivariate adaptive regression splines. The Annals of Statistics. https://doi.org/10.1214/aos/1176347973
[8] Rahmaniah, M. Nanda dkk. (2016). Bootstrap aggregating multivariate adaptive regression spline. Jurnal Eksponensial, 7(2), 163–170.
[9] Breiman, L. (1996). Bagging predictors. Machine Learning. https://doi.org/10.1007/bf00058655
[10] Shofa, B. & I. N. &. (2012). Analisis survival dengan pendekatan multivariate adaptive regression spline pada kasus demam berdarah dengue (DBD). Jurnal Sains dan Seni ITS, 1(1), 318–323.
[11] Badan Pusat Statistik. (2019). https://semarangkab.bps.go.id. Retrieved from https://semarangkab.bps.go.id/indicator/23/78/1/persentase-penduduk-miskinkabupaten-kota-di-jawa-tengah.html

CAUCHY: Jurnal Matematika Murni dan Aplikasi, Volume 6(4) (2021), Pages 279-285
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: January 23, 2021. Reviewed: March 29, 2021. Accepted: April 07, 2021.
DOI: http://dx.doi.org/10.18860/ca.v6i4.11484

Strongly Summable Vector Valued Sequence Spaces Defined by 2 Modular

B. A. Nurnugroho1, P. W. Prasetyo2
1,2 Department of Mathematical Education, Universitas Ahmad Dahlan, Yogyakarta, Indonesia
Email: burhanudin@pmat.uad.ac.id

Abstract

Summability is an important concept in sequence spaces. One summability concept is strong Cesaro summability. In this paper, we study a subset of the set of all vector-valued sequences in a 2-modular space. The facts investigated in this paper include linearity, the existence of a modular, and completeness with respect to this modular.
Keywords: strongly; summable; sequence spaces; 2-modular

Introduction

Summability is an important concept in sequence spaces. A familiar example of sequence spaces that use the summability concept is the $\ell_p$ spaces. In [1], it is explained that Kutner discussed spaces of strongly Cesaro summable sequences and that Maddox later generalized this concept. If $\omega$ denotes the set of all infinite sequences of real or complex numbers, then the set

$$w = \left\{ (x_k) \in \omega : \exists L,\ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} |x_k - L| = 0 \right\}$$

denotes the space of strongly Cesaro summable sequences [2] [3].

Let $X$ be a real linear space of dimension $d \ge 2$. A 2-norm on $X$ is a function $\|\cdot,\cdot\| : X \times X \to \mathbb{R}$ that, for all $x, y, z \in X$, satisfies
(i) $\|x, y\| = 0$ if and only if $x$ and $y$ are linearly dependent,
(ii) $\|x, y\| = \|y, x\|$,
(iii) $\|\alpha x, y\| = |\alpha| \|x, y\|$, $\alpha \in \mathbb{R}$,
(iv) $\|x + y, z\| \le \|x, z\| + \|y, z\|$.
The pair $(X, \|\cdot,\cdot\|)$ is then called a 2-normed space [4]. The concept was first introduced by Gähler [5] in 1963. In 1989, Misiak generalized the 2-normed concept to the $n$-normed concept [6]. Since then, there has been much research on 2-normed ($n$-normed) spaces, including research on strongly Cesaro summable vector-valued sequences and their generalizations in 2-normed ($n$-normed) spaces [7] [8] [9] [10] [11].

In 1950, Nakano developed the modular function, which was generalized by Musielak and Orlicz [12] [13]. A modular is a generalization of a norm. Let $Y$ be a real linear space. A functional $g : Y \to \mathbb{R}^*$ is said to be a modular if it satisfies the following conditions:
(i) $g(x) = 0$ if and only if $x = 0$,
(ii) $g(-x) = g(x)$,
(iii) $g(\alpha x + \beta y) \le g(x) + g(y)$ for every $x, y \in Y$ and $\alpha, \beta \ge 0$ with $\alpha + \beta = 1$.
The pair $(Y, g)$ is then called a modular space. Following the 2-norm ($n$-norm) concept, K. Nourouzi and S. Shabanian introduced the $n$-modular concept in 2009 [14] [15].
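As a brief illustration of the modular axioms (an example added here for orientation, not taken from the cited works), take $Y = \mathbb{R}$ and $g(x) = \sqrt{|x|}$. Conditions (i) and (ii) are immediate, and condition (iii) follows from the triangle inequality, $\alpha, \beta \le 1$, and the subadditivity of the square root:

```latex
\begin{align*}
g(\alpha x + \beta y)
  &= \sqrt{|\alpha x + \beta y|}
   \le \sqrt{\alpha |x| + \beta |y|} \\
  &\le \sqrt{|x| + |y|}
   \le \sqrt{|x|} + \sqrt{|y|}
   = g(x) + g(y),
  \qquad \alpha, \beta \ge 0,\ \alpha + \beta = 1 .
\end{align*}
```

This $g$ is a modular but not a norm, since $g(2x) = \sqrt{2}\, g(x) \ne 2 g(x)$ for $x \ne 0$; homogeneity is exactly what the modular axioms relax.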
Let $X$ be a real linear space of dimension $d \ge 2$. A 2-modular on $X$ is a function $\rho(\cdot,\cdot) : X \times X \to \mathbb{R}^*$ that, for all $x, y, z \in X$, satisfies
(i) $\rho(x, y) = 0$ if and only if $x$ and $y$ are linearly dependent,
(ii) $\rho(x, y) = \rho(y, x)$,
(iii) $\rho(-x, y) = \rho(x, y)$,
(iv) $\rho(\alpha x + \beta y, z) \le \rho(x, z) + \rho(y, z)$ for every $\alpha, \beta \ge 0$ with $\alpha + \beta = 1$.
The pair $(X, \rho)$ is then called a 2-modular space. The 2-modular $\rho$ satisfies the $\Delta_2$-condition if there exists $L > 0$ such that $\rho(2x, y) \le L \rho(x, y)$ for all $x, y \in X$. A sequence $(x_k)$ in $X$ is said to be 2-modular convergent to $x_0 \in X$ if

$$\lim_{k \to \infty} \rho(x_k - x_0, y) = 0, \quad \forall y \in X.$$

This means that for every $\epsilon > 0$ there exists $k_0 \in \mathbb{N}$ such that for any $k \in \mathbb{N}$ with $k \ge k_0$ we have $\rho(x_k - x_0, y) < \epsilon$ for all $y \in X$. Furthermore, a sequence $(x_k)$ in $X$ is called a 2-modular Cauchy sequence if, for all $y \in X$,

$$\lim_{k, l \to \infty} \rho(x_k - x_l, y) = 0.$$

The standard example of a 2-modular space is $X = \mathbb{R}^2$, with the 2-modular on $\mathbb{R}^2$ defined by

$$\rho(\bar{x}, \bar{y}) = \sqrt{\left| \det \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix} \right|},$$

where $\bar{x} = (x_1, x_2)$, $\bar{y} = (y_1, y_2) \in \mathbb{R}^2$. Clearly $\rho$ satisfies the $\Delta_2$-condition, and the sequence $\left( \left( \frac{1}{n}, 0 \right) \right)$ in $\mathbb{R}^2$ is 2-modular convergent to $(0, 0) \in \mathbb{R}^2$. Based on the facts presented above, this paper constructs spaces of strongly Cesaro summable vector-valued sequences in 2-modular spaces.

Methods

Let $(X, \rho)$ be a 2-modular space, where $\rho$ satisfies the $\Delta_2$-condition and the dimension of $X$ is greater than one. We define

$$X_\rho = \{ x \in X : \rho(x, y) < \infty,\ \forall y \in X \}.$$

Because $\rho$ satisfies the $\Delta_2$-condition, there exists $K > 0$ such that for all $x, y \in X_\rho$, $z \in X$, and $\alpha \in \mathbb{R}$ we have

$$\rho(x + y, z) = \rho\left( \frac{2x + 2y}{2}, z \right) \le \rho(2x, z) + \rho(2y, z) \le K \rho(x, z) + K \rho(y, z) < \infty.$$

By the Archimedean property, there exists $n_0 \in \mathbb{N}$ such that $|\alpha| \le 2^{n_0}$, and hence

$$\rho(\alpha x, z) \le \rho(2^{n_0} x, z) \le K^{n_0} \rho(x, z) < \infty.$$

Hence $X_\rho$ is a linear subspace of $X$. Furthermore, $(X_\rho, \rho)$ is also a 2-modular space. The notation $\omega(X_\rho)$ denotes the set of all sequences in $X_\rho$,

$$\omega(X_\rho) = \{ (x_k) : x_k \in X_\rho,\ k \in \mathbb{N} \} \qquad (1)$$

where the linear space operations are defined coordinatewise, $(x_k) + (y_k) = (x_k + y_k)$ and $\alpha (x_k) = (\alpha x_k)$ for all $(x_k), (y_k) \in \omega(X_\rho)$ and $\alpha \in \mathbb{R}$. The goal of this paper is to extend the concept of strong Cesaro summability to sequences with values in 2-modular spaces, defined as

$$w_0^\rho(X_\rho) = \left\{ (x_k) \in \omega(X_\rho) : \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) = 0,\ \forall y \in X_\rho \right\} \qquad (2)$$

$$w^\rho(X_\rho) = \left\{ (x_k) \in \omega(X_\rho) : \exists x_0 \in X_\rho,\ \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) = 0,\ \forall y \in X_\rho \right\} \qquad (3)$$

Furthermore, we also study the properties of $w_0^\rho(X_\rho)$ and $w^\rho(X_\rho)$.

Results and discussion

Henceforth, unless otherwise specified, $X$ is a 2-modular space with a 2-modular $\rho$ that satisfies the $\Delta_2$-condition. First, we prove that the Cesaro mean theorem applies to 2-modular spaces.

Theorem 1. Let the sequence $(x_k)$ in $X_\rho$ be 2-modular convergent to $x_0 \in X_\rho$. Then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) = 0, \quad \forall y \in X_\rho.$$

Proof. Since the sequence $(x_k)$ in $X_\rho$ is 2-modular convergent to $x_0 \in X_\rho$, for all $\epsilon > 0$ there exists $n_\epsilon \in \mathbb{N}$ such that for all $k \ge n_\epsilon$ we have $\rho(x_k - x_0, y) < \frac{\epsilon}{2}$ for all $y \in X$. Note that, for all $n \ge n_\epsilon$,

$$\begin{aligned}
\frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y)
&= \frac{1}{n} \sum_{k=1}^{n_\epsilon} \rho(x_k - x_0, y) + \frac{1}{n} \sum_{k=n_\epsilon+1}^{n} \rho(x_k - x_0, y) \\
&\le \frac{1}{n} \sum_{k=1}^{n_\epsilon} \max_{1 \le k \le n_\epsilon} \rho(x_k - x_0, y) + \frac{1}{n} \sum_{k=n_\epsilon+1}^{n} \max_{n_\epsilon+1 \le k \le n} \rho(x_k - x_0, y) \\
&= \frac{n_\epsilon}{n} \max_{1 \le k \le n_\epsilon} \rho(x_k - x_0, y) + \frac{n - n_\epsilon}{n} \max_{n_\epsilon+1 \le k \le n} \rho(x_k - x_0, y) \\
&\le \frac{n_\epsilon}{n} \max_{1 \le k \le n_\epsilon} \rho(x_k - x_0, y) + \max_{n_\epsilon+1 \le k \le n} \rho(x_k - x_0, y) \\
&< \frac{n_\epsilon}{n} \max_{1 \le k \le n_\epsilon} \rho(x_k - x_0, y) + \frac{\epsilon}{2}.
\end{aligned}$$

By the Archimedean property, there exists $n' \ge n_\epsilon$ such that for all $n \ge n'$ we have

$$\frac{n_\epsilon}{n} \max_{1 \le k \le n_\epsilon} \rho(x_k - x_0, y) < \frac{\epsilon}{2}.$$

Hence, for all $n \ge n'$,

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) < \epsilon,$$

and the proof is complete. ∎

By Theorem 1, every 2-modular convergent sequence $(x_k)$ in $X_\rho$ is an element of $w^\rho(X_\rho)$.

Theorem 2.
The set $w^\rho(X_\rho)$ is a linear subspace of $\omega(X_\rho)$.

Proof. Note that for all $(x_k), (y_k) \in w^\rho(X_\rho)$ and $\alpha \in \mathbb{R}$, there exist $x_0, y_0 \in X_\rho$ such that for all $y \in X_\rho$ we have

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(y_k - y_0, y) = 0.$$

Since $\rho$ satisfies the $\Delta_2$-condition, there exist $L > 0$ and $n_0 \in \mathbb{N}$ with $|\alpha| \le 2^{n_0}$ such that

$$0 \le \rho((x_k + y_k) - (x_0 + y_0), y) = \rho((x_k - x_0) + (y_k - y_0), y) \le \rho(2(x_k - x_0), y) + \rho(2(y_k - y_0), y) \le L \rho(x_k - x_0, y) + L \rho(y_k - y_0, y)$$

and

$$0 \le \rho(\alpha x_k - \alpha x_0, y) = \rho(\alpha (x_k - x_0), y) \le \rho(2^{n_0} (x_k - x_0), y) \le L^{n_0} \rho(x_k - x_0, y).$$

Hence we have

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho((x_k + y_k) - (x_0 + y_0), y) = 0 \quad \text{and} \quad \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(\alpha x_k - \alpha x_0, y) = 0.$$

In other words, $(x_k) + (y_k)$ and $\alpha (x_k)$ belong to $w^\rho(X_\rho)$, which proves that $w^\rho(X_\rho)$ is a linear subspace of $\omega(X_\rho)$. ∎

Theorem 3. If $(x_k) \in w^\rho(X_\rho)$, then for all $y \in X_\rho$, $\left( \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) \right)$ is a bounded sequence of real numbers.

Proof. If $(x_k) \in w^\rho(X_\rho)$, then there exists $x_0 \in X_\rho$ such that for all $y \in X_\rho$ we have

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) = 0.$$

Hence there exists $n_0 \in \mathbb{N}$ such that for all $n \in \mathbb{N}$ with $n \ge n_0$ we have

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) \le 1.$$

Since $\rho$ satisfies the $\Delta_2$-condition, there exists $L > 0$ such that for all $y \in X_\rho$

$$\rho(x_k, y) = \rho\left( \frac{2(x_k - x_0)}{2} + \frac{2 x_0}{2}, y \right) \le L \rho(x_k - x_0, y) + L \rho(x_0, y).$$

It follows that

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) \le \frac{L}{n} \sum_{k=1}^{n} \rho(x_k - x_0, y) + L \rho(x_0, y).$$

If we set

$$M = \sup \left\{ 1,\ \rho(x_1 - x_0, y),\ \frac{1}{2} \sum_{k=1}^{2} \rho(x_k - x_0, y),\ \ldots,\ \frac{1}{n_0 - 1} \sum_{k=1}^{n_0 - 1} \rho(x_k - x_0, y) \right\},$$

then with $K = L(M + \rho(x_0, y))$ we have

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) \le K$$

for all $n \in \mathbb{N}$. This implies that for all $y \in X_\rho$, $\left( \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) \right)$ is a bounded sequence. ∎

Theorem 4. The function

$$g((x_k)) = \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, z) : z \in X_\rho \right\} \qquad (5)$$

is a modular on $w^\rho(X_\rho)$.

Proof. If $(x_k) = \mathbf{0}$ is the zero sequence, then it is clear that $g((x_k)) = 0$. Conversely, if $g((x_k)) = 0$, then we have

$$\sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, z) : z \in X_\rho \right\} = 0.$$
Hence, for all $n \in \mathbb{N}$ and $z \in X_\rho$, we have

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k, z) = 0 \iff \rho(x_k, z) = 0 \iff x_k = 0, \quad \forall k \in \mathbb{N}.$$

Thus $(x_k) = \mathbf{0}$. Since $\rho(-x, y) = \rho(x, y)$ for all $x, y \in X_\rho$, it is clear that $g(-(x_k)) = g((x_k))$. Finally, for all $\alpha, \beta \ge 0$ with $\alpha + \beta = 1$ and for all $(x_k), (y_k) \in w^\rho(X_\rho)$ we have

$$\begin{aligned}
g(\alpha (x_k) + \beta (y_k))
&= \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(\alpha x_k + \beta y_k, z) : z \in X_\rho \right\} \\
&\le \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \left( \rho(x_k, z) + \rho(y_k, z) \right) : z \in X_\rho \right\} \\
&\le \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, z) : z \in X_\rho \right\} + \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(y_k, z) : z \in X_\rho \right\} \\
&= g((x_k)) + g((y_k)).
\end{aligned}$$

This completes the proof. ∎

Theorem 5. If $X_\rho$ is 2-modular complete, then $(w^\rho(X_\rho), g)$ is modular complete.

Proof. Let $n \in \mathbb{N}$ and let $(x^i)$ be a modular Cauchy sequence in $w^\rho(X_\rho)$, where $x^i = (x_k^i)$ for all $i \in \mathbb{N}$. Hence, for all $\epsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that for all $i, j \in \mathbb{N}$ with $i, j \ge n_0$ we have

$$g(x^i - x^j) = \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(x_k^i - x_k^j, z) : z \in X_\rho \right\} < \epsilon.$$

It follows that, for all $i, j \ge n_0$,

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k^i - x_k^j, z) < \epsilon, \quad \forall z \in X_\rho,$$

or

$$\sum_{k=1}^{n} \rho(x_k^i - x_k^j, z) < n \epsilon, \quad \forall z \in X_\rho,$$

so that

$$\rho(x_k^i - x_k^j, z) < n \epsilon, \quad \forall z \in X_\rho.$$

Hence, for all $k \in \mathbb{N}$, $(x_k^i)$ is a $\rho$-Cauchy sequence in $X_\rho$. Since $X_\rho$ is 2-modular complete, $(x_k^i)$ is 2-modular convergent in $X_\rho$ for all $k \in \mathbb{N}$. Therefore, for each $k \in \mathbb{N}$ there exists $x_k \in X_\rho$ such that for all $z \in X_\rho$ we have

$$\lim_{i \to \infty} \rho(x_k^i - x_k, z) = 0.$$

Since, for all $i \ge n_0$,

$$\frac{1}{n} \sum_{k=1}^{n} \rho(x_k^i - x_k, z) = \lim_{j \to \infty} \frac{1}{n} \sum_{k=1}^{n} \rho(x_k^i - x_k^j, z) \le \epsilon, \quad \forall z \in X_\rho,$$

we obtain

$$g((x_k^i) - (x_k)) = \sup \left\{ \frac{1}{n} \sum_{k=1}^{n} \rho(x_k^i - x_k, z) : z \in X_\rho \right\} \le \epsilon$$

for all $i \ge n_0$, so that $\rho(x_k^i - x_k, z) \le n \epsilon$ for all $i \ge n_0$. Therefore $(x^i)$ is modular convergent to $(x_k)$, and $(x_k^i - x_k) \in w^\rho(X_\rho)$. Since $(x_k^i) \in w^\rho(X_\rho)$ and $w^\rho(X_\rho)$ is a linear space, we have $(x_k) = (x_k^i) - (x_k^i - x_k) \in w^\rho(X_\rho)$. This completes the proof that $(w^\rho(X_\rho), g)$ is modular complete.
∎

Conclusions

If $(X, \rho)$ is a 2-modular space where $\rho$ satisfies the $\Delta_2$-condition, then we can construct $w^\rho(X_\rho) \subset \omega(X_\rho)$, the space of strongly Cesaro summable vector-valued sequences in the 2-modular space $(X_\rho, \rho)$. It can be shown that $w^\rho(X_\rho)$ is a linear space. Furthermore, if $(x_k) \in w^\rho(X_\rho)$, then for all $y \in X_\rho$, $\left( \frac{1}{n} \sum_{k=1}^{n} \rho(x_k, y) \right)$ is a bounded sequence of real numbers. This fact guarantees that a modular $g$ can be built on $w^\rho(X_\rho)$. Finally, we proved that $(w^\rho(X_\rho), g)$ is modular complete if $(X_\rho, \rho)$ is 2-modular complete.

Acknowledgments

The author would like to thank LPPM UAD for funding this research.

References

[1] T. Bilgin, "On strong A-summability defined by a modulus," Chinese Journal of Mathematics, vol. 24, no. 2, pp. 159-166, 1996.
[2] F. Nuray and B. Aydin, "Strongly summable and statistically convergent," Informacinės Technologijos ir Valdymas, vol. 1, no. 30, pp. 74-76, 2004.
[3] J. Connor, "On strong matrix summability with respect to a modulus and statistical convergence," Canad. Math. Bull., vol. 32, no. 2, pp. 194-198, 1989.
[4] H. Gunawan and M. , "On finite dimensional 2-normed spaces," Soochow J. Math, vol. 27, no. 3, pp. 321-329, 2001.
[5] S. Gähler, "2-metrische Räume und ihre topologische Struktur," Math. Nachr., vol. 26, no. 1-4, pp. 115-148, 1963.
[6] A. Misiak, "n-inner product spaces," Math. Nachr., vol. 140, no. 1, pp. 299-319, 1989.
[7] H. Dutta, B. S. Reddy and S. S. Cheng, "Strongly summable sequences defined over real n-normed spaces," Applied Mathematics E-Notes, vol. 10, pp. 199-209, 2010.
[8] K. Raj and S. K. Sharma, "Some sequence spaces in 2-normed spaces defined by Musielak-Orlicz function," Acta Univ. Sapientiae, Mathematica, vol. 3, no. 1, pp. 97-109, 2011.
[9] R. Anand and K. Raj, "Complete paranormed Orlicz Lorentz sequence spaces over n-normed," Journal of the Egyptian Mathematical Society, vol. 25, pp. 151-154, 2017.
[10] D. Hemen, "On n-normed linear space valued strongly (C,1)-summable difference sequences," Asian-European Journal of Mathematics, vol. 3, no. 4, pp. 565-575, 2010.
[11] M. Mursaleen, S. K. Sharma and A. Kilicman, "Sequence spaces defined by Musielak-Orlicz function over n-normed spaces," Abstract and Applied Analysis, vol. 2013, 2013.
[12] L. Maligranda, "Hidegoro Nakano (1909–1974) – on the centenary of his birth," in Proceedings of the International Symposium on Banach and Function Spaces III, Kitakyushu, Japan, 2009.
[13] J. Musielak and W. Orlicz, "On modular spaces," Studia Mathematica, vol. 18, no. 1, pp. 49-65, 1959.
[14] K. Nourouzi and S. Shabanian, "Operators defined on n-modular spaces," Mediterr. J. Math., vol. 6, pp. 431-446, 2009.
[15] B. A. Nurnugroho, S. and A. Zulijanto, "2-linear operator on 2-modular spaces," Far East Journal of Mathematical Sciences, vol. 102, no. 12, pp. 3193-3210, 2017.

CAUCHY: Jurnal Matematika Murni dan Aplikasi, Volume 7(1) (2021), Pages 97-104
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: July 16, 2021. Reviewed: September 02, 2021. Accepted: November 01, 2021.
DOI: https://doi.org/10.18860/ca.v7i1.12944

Supplier Selection Analysis Using Minmax Multi Choice Goal Programming Model

Novi Rustiana Dewi1, Eka Susanti2, Bambang Suprihatin3, Endro Setyo Cahyono4, Anggun Permata5, Nurul Fadhila Yanita6
1,2,3,4,5,6 Department of Mathematics, Universitas Sriwijaya
Email: novirustiana@unsri.ac.id, eka_susanti@mipa.unsri.ac.id, endrosetyo_c@yahoo.co.id, bambangs@unsri.ac.id

Abstract

Production control, inventory, and distribution are important factors in trading activities. These three factors are discussed in a system called supply chain management (SCM).
The procurement of goods by a company or trading business involves suppliers. In some cases, several suppliers are available and can be assessed by considering certain factors. In certain cases, the data for the factors considered are uncertain, so a fuzzy approach can be used. The minmax multi choice goal programming model can be used to solve fuzzy supplier selection problems with a linear membership function. Here it is applied to selecting suppliers of Brastagi oranges. There are four suppliers, namely Jaya, Mako, Baros, and Gina, and three factors to consider: cost, quality, and delivery. The decision maker selects the best suppliers for ordering 17000 kg of Brastagi oranges. The result is that the best suppliers are Gina, with an order quantity of 10000 kg, and Mako, with an order quantity of 7000 kg.

Keywords: fuzzy; minmax multi choice goal programming; supply chain management; supplier selection

Introduction

Supply chain management has three main components: the process of obtaining suppliers of raw materials, the process of turning raw materials into finished products, and the product distribution process. The first stage in the supply chain is supplier selection. Supplier selection aims to obtain products of good quality at competitive prices, and it is related to the process of procuring goods to meet customer demand. Price, quality, and delivery time are considerations in supplier selection, especially for perishable products. Fruit is a type of product that does not last long if it is not stored in a refrigerator. Research on supply chains, with applications in various fields, has been carried out with several approaches. The application of fuzzy TOPSIS in supplier selection was introduced by [1]. The fuzzy approach is also used by [2] in the selection of suppliers in manufacturing companies.
The application of the supply chain concept to inventory control and supplier selection for planning new product production over several planning horizons was carried out by [3]. Supply chain problems that consider price, supply, and demand factors are discussed by [4], where an efficient Lagrangian relaxation algorithm is proposed to solve the model. A robust approach to bioethanol supply chain network problems was introduced by [5]. A deterministic approach to solving the supply chain problem of food product distribution is discussed by [6]. The application of a mixed integer programming model to the distribution and supply chain problems of liquid helium is given by [7]. The research in [8] combines the concepts of siting, inventory, and routing in the supply chain. Two main topics are related to the supplier selection model used here, namely the fuzzy concept and fuzzy goal programming. The goal programming (GP) model is used in problems with several objectives to be achieved simultaneously. The GP model with fuzzy numbers is called the fuzzy goal programming (FGP) model. The concept of FGP with random variables was introduced by [9]. Fuzzy and probabilistic approaches to the FGP model are discussed by [10]. The solution of the FGP model with a genetic algorithm is discussed by [11]. The research in [12] uses a multi-choice goal programming model to determine renewable energy facilities. The FGP model was used in production planning by [13]. The choice of waste transportation mode using the FGP model was introduced by [14]. The application of the weighted goal programming model in the urban planning process is given by [15].
The application of the GP model in capital management is given by [16]. The use of the FGP model in transportation problems with several modes of transportation is given by [17]. The research mentioned above implements the supply chain concept for the supplier, inventory, and distribution components. This research discusses the problem of selecting suppliers of Brastagi oranges using the minmax multi choice goal programming model (minmax MCGP). The research focuses on the supplier component. It is basic research that develops the minmax multi choice goal programming model introduced by [2]. In [2], the fuzzy numbers used are trapezoidal fuzzy numbers, and the factors considered are price, quality, and the technology offered. In this study, price, quality, and delivery time are considered. A linear membership function is used to define these three factors.

Methods

The steps for carrying out supplier selection with the minmax MCGP method are:

1. Data collection and description. The data used in this study are primary data on purchases, with the parameters cost, quality, and delivery. The data collection period is from 18 February to 18 March 2020.

2. Determine the fuzzy triangular membership values for the goals of price, quality, and delivery. The fuzzy membership functions for the three goals, in the order price, quality, and timeliness of delivery, are formulated from the data in step 1. The bounds on the variables $c$, $k$, $d$ are determined from the data in step 1.

$$\mu(c) = \begin{cases} 1, & c \le 7800 \\ 1 - \dfrac{c - SL_1(c)}{SL_2(c) - SL_1(c)}, & 7800 \le c \le 10000 \\ 0, & c \ge 10000 \end{cases} \qquad (1)$$

$$\mu(k) = \begin{cases} 1, & k \ge 100 \\ \dfrac{k}{SL_2(k)}, & 0 < k \le 100 \\ 0, & k \le 0 \end{cases} \qquad (2)$$

$$\mu(d) = \begin{cases} 1, & d \ge 100 \\ \dfrac{d}{SL_2(d)}, & 0 < d \le 100 \\ 0, & d \le 0 \end{cases} \qquad (3)$$

where
$\mu(c)$ is the membership function for the cost,
$\mu(k)$ is the membership function for the quality,
$\mu(d)$ is the membership function for the delivery,
$k$ is the percentage of average supplier quality,
$SL_1(c)$ is the satisfaction level lower bound for the unit cost,
$SL_2(c)$ is the satisfaction level upper bound for the unit cost,
$SL_2(k)$ is the satisfaction level upper bound for the unit quality,
$SL_2(d)$ is the satisfaction level upper bound for the unit delivery.

3. Formulate the minmax MCGP model based on the membership function values defined in step 2. The following is the minmax MCGP model introduced by [2].

$$\begin{aligned}
\min\ & D \\
\text{subject to}\ & D \ge \alpha_i d_i^+ + \beta_i d_i^-, & i = 1, 2, \ldots, m, \\
& D \ge \delta_i (e_i^+ + e_i^-), & i = 1, 2, \ldots, m, \\
& \mu(x_i) - d_i^+ + d_i^- = y_i, & i = 1, 2, \ldots, m, \\
& y_i - e_i^+ + e_i^- = g_{i,\max}, & i = 1, 2, \ldots, m, \\
& g_{i,\min} \le y_i \le g_{i,\max}, & i = 1, 2, \ldots, m, \\
& d_i^+, d_i^-, e_i^+, e_i^- \ge 0, & i = 1, 2, \ldots, m,
\end{aligned} \qquad (4)$$

where
$D$: the deviation variable of the objective function,
$\alpha_i$ and $\beta_i$: weights of the positive and negative deviation penalties in the objective function,
$d_i^+$ and $d_i^-$: positive and negative deviations of the objective function,
$\delta_i$: the weight of the sum of the deviations in the objective function,
$e_i^+$ and $e_i^-$: positive and negative deviations of $|y_i - g_{i,\max}|$,
$y_i$: continuous variable with a range of interval values,
$g_{i,\min}$ and $g_{i,\max}$: minimum and maximum values of $y_i$,
$\mu(x_i)$: membership function for the $i$-th supplier.

4. Solve the model obtained in step 3 using LINGO 13.0 software.

5. Analysis and conclusion.

Results and discussion

This research discusses a supplier selection problem for citrus fruits of the Brastagi orange type. The data used are primary data with a data collection period of 30 ordering periods. The research was conducted at a fruit shop in Palembang. The research data are given in Table 1.
Ordering data for each supplier

No | Supplier | Order date | Delivery date | Delivery on time | Price (Rp/kg) | Total (Rp) | Quality (%)
--- | --- | --- | --- | --- | --- | --- | ---
1 | Jaya | 21 Feb | 21 Feb | √ | 8500 | 45900000 | 80
2 | Mako | 21 Feb | 21 Feb | √ | 8000 | 43200000 | 85
3 | Baros | 22 Feb | 24 Feb | √ | 8500 | 45900000 | 80
4 | Gina | 22 Feb | 22 Feb | √ | 9000 | 48600000 | 95
5 | Jaya | 23 Feb | 23 Feb | √ | 8500 | 45900000 | 85
6 | Mako | 24 Feb | 24 Feb | √ | 8500 | 45900000 | 90
7 | Baros | 25 Feb | 25 Feb | √ | 8000 | 43200000 | 80
8 | Mako | 25 Feb | 25 Feb | √ | 9000 | 48600000 | 90
9 | Gina | 26 Feb | 26 Feb | √ | 9000 | 48600000 | 90
10 | Mako | 27 Feb | 28 Feb | √ | 8000 | 43200000 | 85
11 | Jaya | 27 Feb | 27 Feb | √ | 8500 | 45900000 | 85
12 | Baros | 28 Feb | 28 Feb | √ | 8000 | 43200000 | 85
13 | Mako | 29 Feb | 1 Mar | √ | 9000 | 48600000 | 85
14 | Gina | 29 Feb | 29 Feb | √ | 9500 | 51300000 | 90
15 | Jaya | 1 Mar | 2 Mar | √ | 8500 | 45900000 | 85
16 | Mako | 1 Mar | 1 Mar | √ | 9000 | 48600000 | 95
17 | Baros | 2 Mar | 2 Mar | √ | 9000 | 48600000 | 80
18 | Mako | 3 Mar | 3 Mar | √ | 9000 | 48600000 | 85
19 | Gina | 4 Mar | 4 Mar | √ | 9500 | 51300000 | 85
20 | Jaya | 5 Mar | 6 Mar | √ | 9000 | 48600000 | 85
21 | Baros | 5 Mar | 5 Mar | √ | 9000 | 48600000 | 80
22 | Mako | 6 Mar | 6 Mar | √ | 9000 | 48600000 | 90
23 | Gina | 7 Mar | 7 Mar | √ | 9500 | 51300000 | 95
24 | Jaya | 8 Mar | 10 Mar | √ | 9000 | 48600000 | 80
25 | Baros | 8 Mar | 8 Mar | √ | 9000 | 48600000 | 85
26 | Gina | 9 Mar | 9 Mar | √ | 9500 | 51300000 | 95
27 | Mako | 9 Mar | 9 Mar | √ | 9000 | 48600000 | 90
28 | Jaya | 10 Mar | 12 Mar | √ | 8500 | 45900000 | 85
29 | Baros | 10 Mar | 10 Mar | √ | 9000 | 48600000 | 80
30 | Gina | 11 Mar | 11 Mar | √ | 9500 | 51300000 | 85
31 | Jaya | 12 Mar | 13 Mar | √ | 9000 | 48600000 | 80
32 | Mako | 13 Mar | 13 Mar | √ | 9000 | 48600000 | 85
33 | Jaya | 13 Mar | 14 Mar | √ | 9000 | 48600000 | 90
34 | Gina | 14 Mar | 14 Mar | √ | 9500 | 51300000 | 90
35 | Baros | 15 Mar | 16 Mar | √ | 8500 | 45900000 | 85
36 | Mako | 15 Mar | 15 Mar | √ | 9000 | 48600000 | 90
37 | Gina | 16 Mar | 16 Mar | √ | 9500 | 51300000 | 90
38 | Jaya | 16 Mar | 18 Mar | √ | 8500 | 45900000 | 85
39 | Mako | 17 Mar | 17 Mar | √ | 9000 | 48600000 | 90
40 | Gina | 19 Mar | 18 Mar | √ | 9000 | 48600000 | 95
41 | Baros | 19 Mar | 19 Mar | √ | 8500 | 45900000 | 90
42 | Jaya | 19 Mar | 20 Mar | √ | 8500 | 45900000 | 80
43 | Mako | 19 Mar | 21 Mar | √ | 8500 | 45900000 | 80
44 | Gina | 20 Mar | 20 Mar | √ | 9000 | 48600000 | 90
45 | Jaya | 20 Mar | 22 Mar | √ | 8000 | 43200000 | 80
46 | Baros | 21 Mar | 21 Mar | √ | 8500 | 45900000 | 90
47 | Mako | 21 Mar | 22 Mar | √ | 8500 | 45900000 | 85

(Source: PD Wibowo, 21 February to March 2020)

From Table 1 we can determine, for each supplier, the percentage of on-time deliveries, the prices offered, and the percentage of citrus in good condition relative to all oranges sent by the supplier. The price value of each supplier is obtained by adding up the prices over its purchases and dividing by the number of purchases; the average values for quality and timeliness are determined in the same way. The calculation results are given in Table 2 below.

Table 2. Value percentage criteria from four suppliers

Supplier | $x_i$ | Cost (Rp) | Quality (%) | Delivery (%) | Total order (kg)
--- | --- | --- | --- | --- | ---
Jaya | $x_1$ | 8625 | 83.33 | 25.00 | 64800
Mako | $x_2$ | 8750 | 87.50 | 71.43 | 75600
Baros | $x_3$ | 8600 | 83.50 | 80.00 | 54000
Gina | $x_4$ | 9318 | 90.91 | 90.91 | 59400

The degree of membership for the decision maker's (DM) level of satisfaction with each goal is determined using (1), (2), and (3). The calculation results are given in Table 3 below.

Table 3. Degree of membership for DM satisfaction level of each goal

Decision | Lowest | | | | Highest
--- | --- | --- | --- | --- | ---
$c$: cost | > 10000 | 8465.4 | 8243.6 | 8021.8 | 7800
SL(c), satisfaction level c | 0 | 0.7 | 0.8 | 0.9 | 1
$k$: quality | 0 | 40 | 60 | 80 | 100
SL(k), satisfaction level k | 0 | 0.4 | 0.6 | 0.8 | 1
$d$: timeliness | 0 | 40 | 60 | 80 | 100
SL(d), satisfaction level d | 0 | 0.4 | 0.6 | 0.8 | 1

The level of satisfaction takes values in the interval [0, 1]. Based on Table 3, the DM gives a satisfaction level of 0 to the lowest decision value and a satisfaction level of 1 to the highest decision value. The level of satisfaction for each goal (cost, quality, and on-time delivery) is determined based on equations (1), (2), and (3). The results are given in Table 4 below.
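The step from Table 2 to the satisfaction levels can be sketched in a few lines. Equations (1) to (3) are not reproduced in this section, so the linear form below is an assumption inferred from the break points in Table 3 (cost 10000 maps to 0, cost 7800 maps to 1); the function name `linear_membership` is hypothetical. Under that assumption the sketch reproduces the cost column reported in Table 4.

```python
def linear_membership(value, worst, best):
    """Linear satisfaction level: 0 at `worst`, 1 at `best`, clipped to [0, 1].

    Assumed form; the paper's equations (1)-(3) are not shown in this section.
    """
    level = (value - worst) / (best - worst)
    return max(0.0, min(1.0, level))


# Average cost per supplier, taken from Table 2 (Rp per kg).
average_cost = {"Jaya": 8625, "Mako": 8750, "Baros": 8600, "Gina": 9318}

# For cost, lower is better: 10000 is the worst value and 7800 the best (Table 3).
cost_satisfaction = {
    name: round(linear_membership(cost, worst=10000, best=7800), 3)
    for name, cost in average_cost.items()
}
print(cost_satisfaction)
```

For quality and delivery the same function would be called with `worst=0` and `best=100`, which simply rescales the percentages in Table 2 to the interval [0, 1].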
Table 4. Membership function value for each goal

Supplier | Amount of order | Cost | Quality | Delivery
--- | --- | --- | --- | ---
Jaya | $x_1$ | 0.625 | 0.83 | 0.25
Mako | $x_2$ | 0.568 | 0.88 | 0.71
Baros | $x_3$ | 0.636 | 0.84 | 0.8
Gina | $x_4$ | 0.31 | 0.91 | 0.91
Average | | 0.53 | 0.865 | 0.6675
Maximum value | | 0.636 | 0.91 | 0.91

The lower bound for the price goal is the average price value multiplied by the minimum order. The upper bound is the maximum price value multiplied by the maximum order. The same calculation is done for the quality and on-time delivery goals. The resulting lower and upper bounds are 28876.5 and 48081.6 for price, 49013.2 and 70308 for quality, and 36045 and 68796 for on-time delivery.

Following the MinMax MCGP model (4), the supplier selection problem for Brastagi oranges, with a maximum order quantity of 10000 kg per supplier, a minimum total order of 15000 kg, and a maximum total order of 17000 kg, is formulated as follows.

$$\min D$$

subject to

$$
\begin{aligned}
& D \ge 3d_1^+ + d_1^-; \quad D \ge e_1^+ + e_1^-; \quad D \ge d_2^+ + 5d_2^-; \\
& D \ge e_2^+ + e_2^-; \quad D \ge d_3^+ + 3d_3^-; \quad D \ge e_3^+ + e_3^- \\
& 0.625x_1 + 0.568x_2 + 0.636x_3 + 0.31x_4 - d_1^+ + d_1^- = y_1 \\
& y_1 - e_1^+ + e_1^- = 48081.6 \\
& 28876.5 \le y_1 \le 48081.6 \\
& 0.83x_1 + 0.88x_2 + 0.84x_3 + 0.91x_4 - d_2^+ + d_2^- = y_2 \\
& y_2 - e_2^+ + e_2^- = 70308 \\
& 49013.2 \le y_2 \le 70308 \\
& 0.25x_1 + 0.71x_2 + 0.8x_3 + 0.91x_4 - d_3^+ + d_3^- = y_3 \\
& y_3 - e_3^+ + e_3^- = 68796 \\
& 36045 \le y_3 \le 68796 \\
& x_1 \le 10000; \quad x_2 \le 10000; \quad x_3 \le 10000; \quad x_4 \le 10000 \\
& x_1 + x_2 + x_3 + x_4 \ge 15000; \quad x_1 + x_2 + x_3 + x_4 \le 17000 \\
& d_1^+, d_1^-, e_1^+, e_1^-, y_1, y_2, y_3 \ge 0
\end{aligned}
\tag{5}
$$

The linear model (5) is solved with LINGO 13, and the solution is given in Table 5 below.

Table 5. MinMax MCGP model solution for citrus fruit supplier selection

No | Variable | Value
--- | --- | ---
1 | $x_1$ | 0
2 | $x_2$ | 7000
3 | $x_3$ | 0
4 | $x_4$ | 10000
5 | $y_1$ | 39463.88
6 | $y_2$ | 28876.50
7 | $y_3$ | 36764.17
8 | $d_1^+$ | 0
9 | $d_1^-$ | 32387.88
10 | $e_1^+$ | 0
11 | $e_1^-$ | 8617.725
12 | $d_2^+$ | 0
13 | $d_2^-$ | 13616.5
14 | $e_2^+$ | 0
15 | $e_2^-$ | 39919.50
16 | $d_3^+$ | 0
17 | $d_3^-$ | 22694.17
18 | $e_3^+$ | 0
19 | $e_3^-$ | 32031.83
20 | $D$ | 68082.50

According to Table 5, for a maximum total order of 17000 kg, orders are recommended from $x_2$ (supplier Mako) and $x_4$ (supplier Gina). The aspiration levels are $y_1 = 39463.88$ (goal G1), $y_2 = 28876.50$ (goal G2), and $y_3 = 36764.17$ (goal G3); the remaining deviations are listed in Table 5. The values of $x_1$, $x_2$, $x_3$, and $x_4$ are 0, 7000, 0, and 10000, respectively. It can be concluded that the best suppliers, in order, are supplier Gina with an order quantity of 10000 kg and supplier Mako with an order quantity of 7000 kg.

Conclusions
For a maximum total order of 17000 kg, the best suppliers are Gina, with an order of 10000 kg of Brastagi oranges, and Mako, with an order of 7000 kg. The ranking of suppliers is obtained from the deviations from the target for each of the price, quality, and delivery goals. Different goal values result in a different order of supplier selection.

Acknowledgments
This research is supported by Universitas Sriwijaya through the Sains Teknologi dan Seni (SATEKS) research scheme, research assignment contract number 0163.177/UN9/SB3.LPPM.PT/2020.

References
[1] R. Kiani, M. Goh, and N. Kiani, “Supplier selection with Shannon entropy and fuzzy TOPSIS in the context of supply chain risk management,” Procedia Soc. Behav. Sci., vol. 235, no. October, pp. 216–225, 2016.
[2] H. Ho, “The supplier selection problem of a manufacturing company using the weighted multi-choice goal programming and MinMax multi-choice goal programming,” Appl. Math. Model., vol. 75, pp. 819–836, 2019.
[3] A. Negahban and M.
Dehghanimohammadabadi, “Optimizing the supply chain configuration and production-sales policies for new products over multiple planning horizons,” Int. J. Prod. Econ., vol. 196, pp. 150–162, 2018.
[4] A. Ahmadi-Javid and P. Hoseinpour, “A location-inventory-pricing model in a supply chain distribution network with price-sensitive demands and inventory-capacity constraints,” Transp. Res. Part E, vol. 82, pp. 238–255, 2015.
[5] H. Ghaderi, A. Moini, and M. S. Pishvaee, “A multi-objective robust possibilistic programming approach to sustainable switchgrass-based bioethanol supply chain network design,” J. Clean. Prod., vol. 179, pp. 368–406, 2018.
[6] J. H. M. Manders, M. C. J. Caniëls, and P. W. Th, “Exploring supply chain flexibility in a FMCG food supply chain,” J. Purch. Supply Manag., vol. 22, no. 3, pp. 181–195, 2016.
[7] E. Malinowski, M. H. Karwan, J. M. Pinto, and L. Sun, “A mixed-integer programming strategy for liquid helium global supply chain planning,” Transp. Res. Part E, vol. 110, no. July 2017, pp. 168–188, 2018.
[8] X. Zheng, M. Yin, and Y. Zhang, “Integrated optimization of location, inventory and routing in supply chain network design,” Transp. Res. Part B, vol. 121, pp. 1–20, 2019.
[9] Z. Qin, “Uncertain random goal programming,” Fuzzy Optim. Decis. Mak., vol. 17, no. 4, pp. 375–386, 2018.
[10] S. Barik, “Probabilistic fuzzy goal programming problem involving Pareto distribution: some additive approaches,” Fuzzy Inf. Eng., vol. 7, pp. 227–244, 2015.
[11] P. Biswas, “Fuzzy goal programming approach to solve linear multilevel programming problems using genetic algorithm,” Int. J. Comput. Appl., vol. 115, no. 3, pp. 10–19, 2015.
[12] C. Chang, “Multi-choice goal programming model for the optimal location of renewable energy facilities,” Renew. Sustain. Energy Rev., vol. 41, pp. 379–389, 2015.
[13] L. Chen, W. Ko, and F. Yeh, “Approach based on fuzzy goal programming and quality function deployment for new product planning,” Eur. J. Oper.
Res., vol. 259, no. 2, pp. 654–663, 2016.
[14] E. Susanti, O. Dwipurwani, and E. Yuliza, “Optimasi kendaraan pengangkut sampah menggunakan model fuzzy goal programming,” J. Mat., vol. 7, no. 2, pp. 119–123, 2017.
[15] R. Jayaraman, C. Colapinto, D. La, and T. Malik, “A weighted goal programming model for planning sustainable development applied to Gulf Cooperation Council countries,” Appl. Energy, vol. 185, no. 2, 1 January 2017, pp. 1931–1939, 2016.
[16] M. Dash and R. Hanuman, “A goal programming model for working capital,” J. Manag. Sci., vol. 5, no. 1, pp. 7–16, 2015.
[17] L. Chen, J. Peng, and B. Zhang, “Uncertain goal programming models for bicriteria solid transportation problem,” Appl. Soft Comput. J., 2016.

Super Total Labeling (a,d)-Edge Antimagic on the Firecracker Graph
CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 6(3) (2020), Pages 133-139
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: August 18, 2020 Reviewed: October 01, 2020 Accepted: November 10, 2020
DOI: http://dx.doi.org/10.18860/ca.v6i3.10145

Juhari
Mathematics Department, Universitas Islam Negeri Maulana Malik Ibrahim Malang
Email: juhari@uin-malang.ac.id

Abstract
An (a, d)-edge antimagic total labeling on a (p, q)-graph G is a one-to-one map f from V(G) ∪ E(G) onto the integers 1, 2, ..., p + q with the property that the edge weights w(uv) = f(u) + f(v) + f(uv), where uv ∈ E(G), form an arithmetic progression starting from a and having common difference d. Such a labeling is called super if the smallest possible labels appear on the vertices. In this paper, we investigate the existence of super (a, d)-edge antimagic total labelings of the firecracker graph.

Keywords: super (a, d)-edge-antimagic total labeling; firecracker graph (Fn,k)

Introduction
Graph labeling first appeared in the mid-1960s, starting from a hypothesis by Ringel and Rosa [1].
In 1967 Rosa called such a labeling a β-valuation of a graph G with e edges: a one-to-one function f from the vertex set $V(G)$ to the set of integers $\{0, 1, 2, \ldots, e\}$ such that every edge xy of G receives a distinct label $|f(x) - f(y)|$. One type of graph labeling is the super $(a,d)$-edge antimagic total labeling (SEATL), where a is the smallest edge weight and d is the common difference. This labeling was introduced by Simanjuntak, Bertault, and Miller in 2000 [1], [2], [3]. A super $(a,d)$-edge antimagic total labeling of a graph G first labels all vertices with consecutive natural numbers and then labels all edges so that the edge weights form an arithmetic sequence with first term a and common difference d [4], [5]. Among graphs built from stars, one for which an SEATL has not yet been found is the firecracker graph, which has not been labeled previously. This prompted the writer to examine the super $(a,d)$-edge antimagic total labeling (SEATL) on the firecracker graph $F_{n,k}$.

The problems are formulated as follows: (1) what is the upper bound on d for which the firecracker graph has a super $(a,d)$-edge antimagic total labeling? and (2) what are the bijective functions of the super $(a,d)$-edge antimagic total labeling on the firecracker graph? To keep the scope focused, this research is restricted to the $(a,d)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$ with $n \ge 2$ and $k \ge 3$. Here n and k are the parameters in the definition of the firecracker graph.

Super total labeling (a,d)-edge antimagic
A graph is said to have an $(a,d)$-edge antimagic total labeling if there is a one-to-one mapping f from $V(G) \cup E(G)$ onto the integers $1, 2, 3, \ldots, p+q$ such that the set of edge weights $w(uv) = f(u) + f(v) + f(uv)$ over all edges of G is $\{a, a+d, \ldots, a+(q-1)d\}$ for integers $a > 0$ and $d \ge 0$ [6]. An $(a,d)$-edge antimagic total labeling is called super if $f(V) = \{1, 2, 3, \ldots, p\}$ and $f(E) = \{p+1, p+2, p+3, \ldots, p+q\}$. The upper bound on the common difference d of a super $(a,d)$-edge antimagic total labeling is given by the following lemma [1], [7].

Lemma 1. If a $(p,q)$-graph has a super $(a,d)$-edge antimagic total labeling, then
$$d \le \frac{2p+q-5}{q-1}.$$

Proof. Suppose the $(p,q)$-graph has a super $(a,d)$-edge antimagic total labeling $f : V(G) \cup E(G) \to \{1, 2, 3, \ldots, p+q\}$ with $f(V) = \{1, 2, 3, \ldots, p\}$ and $f(E) = \{p+1, p+2, p+3, \ldots, p+q\}$. The smallest possible edge weight is $f(u) + f(uv) + f(v) = 1 + (p+1) + 2 = p+4$, so $p + 4 \le a$. On the other hand, the largest possible edge weight is the sum of the two largest vertex labels and the largest edge label, $(p-1) + (p+q) + p = 3p+q-1$. As a result:
$$a + (q-1)d \le 3p+q-1$$
$$d \le \frac{3p+q-1-(p+4)}{q-1}$$
$$d \le \frac{2p+q-5}{q-1}$$
This proves the bound $d \le \frac{2p+q-5}{q-1}$, which applies to any family of graphs.

Firecracker graph
A firecracker graph is obtained from a collection of star graphs by connecting exactly one leaf of each star [8], [9], [10]. It is usually denoted $F_{n,k}$, where n is the number of star graphs joined and k is the number of vertices of each star.

Methods
This research uses an axiomatic descriptive method, that is, deriving results from existing axioms or theorems [11], [12], which are then applied to the super $(a,d)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$.
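As a worked check of the bound derived in the proof of Lemma 1, $d \le (2p+q-5)/(q-1)$, the sketch below evaluates it for firecracker graphs. It assumes $p = nk$ vertices and $q = nk - 1$ edges, which matches the label ranges $\{1, \ldots, nk\}$ and $\{nk+1, \ldots, 2nk-1\}$ used in the theorems of this paper; the function names are hypothetical.

```python
from fractions import Fraction


def max_difference(p, q):
    """Upper bound on d from Lemma 1 for a (p, q)-graph: (2p + q - 5) / (q - 1)."""
    return Fraction(2 * p + q - 5, q - 1)


def firecracker_size(n, k):
    """Vertex and edge counts of F_{n,k}: n stars on k vertices, linked by n - 1 edges."""
    p = n * k
    q = n * (k - 1) + (n - 1)  # star edges plus linking edges, equal to nk - 1
    return p, q


for n, k in [(2, 3), (3, 3), (5, 4), (7, 6)]:
    p, q = firecracker_size(n, k)
    print(f"F_{{{n},{k}}}: p={p}, q={q}, d <= {max_difference(p, q)}")
```

Algebraically the bound is $(3nk - 6)/(nk - 2) = 3$ for every admissible n and k, which is why only $d \in \{0, 1, 2, 3\}$ needs to be considered.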
In addition, the research steps are as follows:
(1) count the number of vertices p and edges q of the firecracker graph $F_{n,k}$;
(2) determine the upper bound on the common difference d for the firecracker graph $F_{n,k}$ in accordance with Lemma 1;
(3) find an EAVL (edge-antimagic vertex labeling), that is, an $(a,d)$-edge antimagic vertex labeling, of the firecracker graph $F_{n,k}$;
(4) determine the formula of the EAVL function $f(x_{i,l})$ on the firecracker graph $F_{n,k}$ by looking at the labeling pattern found and grouping the vertex labels that form arithmetic sequences;
(5) determine the formula of the EAVL edge weights $w$ on the firecracker graph $F_{n,k}$ by looking at the labeling pattern;
(6) label the edges of the firecracker graph $F_{n,k}$ with an SEATL (super edge antimagic total labeling) for each corresponding common difference d;
(7) determine the bijective function on the firecracker graph $F_{n,k}$; and
(8) write a conclusion.

Results and Discussion
The first step in determining a super $(a,d)$-edge antimagic total labeling is to determine the number of vertices and the number of edges of the graph under study, in this case the firecracker graph. After that, the values of d to be examined are selected using Lemma 1. The labeling pattern is determined by pattern recognition after labeling specific firecracker graphs. Next, to describe the pattern in general, the bijective function is found using the principle of an arithmetic sequence. The lemma and theorems that have been found are presented below.

Lemma 2. There is an $(a,1)$-edge antimagic vertex labeling on the firecracker graph $F_{n,k}$ if n is odd, $n \ge 2$, and $k \ge 3$.

Proof.
First, let $x_{i,l}$ denote the vertices of the firecracker graph $F_{n,k}$, where $1 \le i \le n$ and $0 \le l \le k-1$. Based on the research results, the vertex labeling $\alpha : V(F_{n,k}) \to \{1, 2, \ldots, nk\}$ can be written as follows:

$$
\alpha(x_{i,l}) =
\begin{cases}
i, & i \text{ odd } (1 \le i \le n),\ l = 0 \\
n + i, & i \text{ even } (2 \le i \le n-1),\ l = 0 \\
i, & i \text{ even } (2 \le i \le n-1),\ l = 1 \\
n + i, & i \text{ odd } (1 \le i \le n),\ l = 1 \\
n(l+1) - \frac{i-1}{2}, & i \text{ odd } (1 \le i \le n),\ 2 \le l \le k-1 \\
\left(ln + \frac{n-i-1}{2}\right) + 1, & i \text{ even } (2 \le i \le n-1),\ 2 \le l \le k-1
\end{cases}
$$

Here $\alpha$ is a bijective function that maps $V(F_{n,k}) = \{v_1, v_2, v_3, \ldots, v_{nk}\}$ to the set of integers $\{1, 2, \ldots, nk\}$. If $w_\alpha$ is defined as the edge weight under the vertex labeling $\alpha$, then $w_\alpha$ is formulated as:

$$
\begin{aligned}
w_\alpha^1(x_{i,l}x_{i,0}) &= n + 2i; && 1 \le i \le n,\ l = 1 \\
w_\alpha^2(x_{i,l}x_{i+1,l}) &= n + 2i + 1; && 1 \le i \le n-1,\ l = 1 \\
w_\alpha^3(x_{i,l}x_{i,0}) &= (l+1)n + \tfrac{i+1}{2}; && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
w_\alpha^4(x_{i,l}x_{i,0}) &= (l+1)n + \tfrac{n+i+1}{2}; && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

These weights are the consecutive integers $n+2, n+3, \ldots, (k+1)n$, so $\alpha$ is an $(n+2, 1)$-edge antimagic vertex labeling.

Theorem 1. There is a super $(2nk+n+1, 0)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$ if n is odd, $n \ge 2$, and $k \ge 3$.

Proof. First define the edge labeling $f_\alpha : E(F_{n,k}) = \{e_1, e_2, \ldots, e_{nk-1}\} \to \{nk+1, nk+2, \ldots, 2nk-1\}$. The edge labels $f_\alpha$ for the super $(a,0)$-edge antimagic total labeling on $F_{n,k}$ can be formulated as follows:

$$
\begin{aligned}
f_\alpha^1(x_{i,l}x_{i,0}) &= 2(nk - i) + 1; && 1 \le i \le n,\ l = 1 \\
f_\alpha^2(x_{i,l}x_{i+1,l}) &= 2(nk - i); && 1 \le i \le n-1,\ l = 1 \\
f_\alpha^3(x_{i,l}x_{i,0}) &= (2k - l)n + \tfrac{1-i}{2}; && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
f_\alpha^4(x_{i,l}x_{i,0}) &= (2k - l)n + \tfrac{1-n-i}{2}; && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

Next, let $W_\alpha$ denote the edge weights of the total labeling, that is, the sum of the labels $\alpha(x_{i,l})$ of the two end vertices and the label of the edge $x_{i,l}x_{i,0}$ or $x_{i,l}x_{i+1,l}$. Then $W_\alpha$ is obtained by adding the EAVL edge weight $w_\alpha$ and the edge label $f_\alpha$ over the corresponding ranges of i and l:

$$
\begin{aligned}
W_\alpha^1 &= w_\alpha^1(x_{i,l}x_{i,0}) + f_\alpha^1(x_{i,l}x_{i,0}); && 1 \le i \le n,\ l = 1 \\
W_\alpha^2 &= w_\alpha^2(x_{i,l}x_{i+1,l}) + f_\alpha^2(x_{i,l}x_{i+1,l}); && 1 \le i \le n-1,\ l = 1 \\
W_\alpha^3 &= w_\alpha^3(x_{i,l}x_{i,0}) + f_\alpha^3(x_{i,l}x_{i,0}); && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
W_\alpha^4 &= w_\alpha^4(x_{i,l}x_{i,0}) + f_\alpha^4(x_{i,l}x_{i,0}); && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

Substituting the formulas above gives:

$$
\begin{aligned}
W_\alpha^1 &= (n + 2i) + 2(nk - i) + 1 = 2nk + n + 1 \\
W_\alpha^2 &= 2(nk - i) + (n + 2i + 1) = 2nk + n + 1 \\
W_\alpha^3 &= (2k - l)n + \tfrac{1-i}{2} + (l+1)n + \tfrac{i+1}{2} = 2nk + n + 1 \\
W_\alpha^4 &= (2k - l)n + \tfrac{1-n-i}{2} + (l+1)n + \tfrac{n+i+1}{2} = 2nk + n + 1
\end{aligned}
$$

Based on the equations above, the set of total labeling edge weights can be written as $W_\alpha = \{W_\alpha^1, W_\alpha^2, W_\alpha^3, W_\alpha^4\}$. It can also be seen that $W_\alpha^1 = W_\alpha^2 = \cdots = W_\alpha^4 = 2nk + n + 1$, or equivalently $\bigcup_{t=1}^{4} W_\alpha^t = \{2nk+n+1, 2nk+n+1, \ldots, 2nk+n+1\}$. From this it can be concluded that the firecracker graph $F_{n,k}$ has a super $(2nk+n+1, 0)$-edge antimagic total labeling if n is odd, $n \ge 2$, and $k \ge 3$.

Theorem 2. There is a super $((k+1)n+3, 2)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$ if n is odd ($n \ge 2$) and $k \ge 3$.

Proof. First define the edge labeling $f_\alpha : E(F_{n,k}) = \{e_1, e_2, \ldots, e_{nk-1}\} \to \{nk+1, nk+2, \ldots, 2nk-1\}$. The edge labels $f_\alpha$ for the super $(a,2)$-edge antimagic total labeling on $F_{n,k}$ can be formulated as follows:

$$
\begin{aligned}
f_\alpha^1(x_{i,l}x_{i,0}) &= nk + 2i - 1; && 1 \le i \le n,\ l = 1 \\
f_\alpha^2(x_{i,l}x_{i+1,l}) &= nk + 2i; && 1 \le i \le n-1,\ l = 1 \\
f_\alpha^3(x_{i,l}x_{i,0}) &= (k + l)n + \tfrac{i-1}{2}; && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
f_\alpha^4(x_{i,l}x_{i,0}) &= (k + l)n + \tfrac{n+i-1}{2}; && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

Next, the total labeling edge weights are $W_\alpha^t = w_\alpha^t + f_\alpha^t$ for $t = 1, 2, 3, 4$, taken on the same edges and index ranges as $f_\alpha^t$. Substituting gives:

$$
\begin{aligned}
W_\alpha^1 &= (n + 2i) + (nk + 2i - 1) = (k+1)n + 4i - 1 \\
W_\alpha^2 &= (n + 2i + 1) + (nk + 2i) = (k+1)n + 4i + 1 \\
W_\alpha^3 &= (l+1)n + \tfrac{i+1}{2} + (k + l)n + \tfrac{i-1}{2} = (k + 2l + 1)n + i \\
W_\alpha^4 &= (l+1)n + \tfrac{n+i+1}{2} + (k + l)n + \tfrac{n+i-1}{2} = (k + 2l + 2)n + i
\end{aligned}
$$

Based on the equations above, the set of total labeling edge weights can be written as $W_\alpha = \{W_\alpha^1, W_\alpha^2, W_\alpha^3, W_\alpha^4\}$. These weights form an arithmetic sequence with first term $(k+1)n + 3$ and common difference 2, that is, $\bigcup_{t=1}^{4} W_\alpha^t = \{(k+1)n+3, (k+1)n+5, \ldots, (3k+1)n-1\}$. From this it can be concluded that the firecracker graph $F_{n,k}$ has a super $((k+1)n+3, 2)$-edge antimagic total labeling if n is odd, $n \ge 2$, and $k \ge 3$.

Theorem 3. There is a super $(2nk+1, 1)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$ if n is even ($n \ge 2$) and $k \ge 3$.

Proof.
To determine the super $(a,1)$-edge antimagic total labeling, first define the edge labeling $f_\alpha : E(F_{n,k}) = \{e_1, e_2, \ldots, e_{nk-1}\} \to \{nk+1, nk+2, \ldots, 2nk-1\}$, which can be formulated as:

$$
\begin{aligned}
f_\alpha^1(x_{i,l}x_{i,0}) &= 2nk - 4i + 2; && 1 \le i \le n,\ l = 1 \\
f_\alpha^2(x_{i,l}x_{i+1,l}) &= 2nk - 4i; && 1 \le i \le n-1,\ l = 1 \\
f_\alpha^3(x_{i,l}x_{i,0}) &= (2k - l)n + 3n - i; && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
f_\alpha^4(x_{i,l}x_{i,0}) &= (2k - l)n + 2n - i; && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

The total labeling edge weights are $W_\alpha^t = w_\alpha^t + f_\alpha^t$ for $t = 1, 2, 3, 4$, taken on the same edges and index ranges as $f_\alpha^t$. Substituting gives:

$$
\begin{aligned}
W_\alpha^1 &= (n + 2i) + (2nk - 4i + 2) = 2nk + n - 2i + 2 \\
W_\alpha^2 &= (2nk - 4i) + (n + 2i + 1) = 2nk + n - 2i + 1 \\
W_\alpha^3 &= (2k - l)n + 3n - i + (l+1)n + \tfrac{i+1}{2} = 2nk + 4n + \tfrac{1-i}{2} \\
W_\alpha^4 &= (2k - l)n + 2n - i + (l+1)n + \tfrac{n+i+1}{2} = 2nk + \tfrac{7n+1-i}{2}
\end{aligned}
$$

Based on the equations above, the set of total labeling edge weights can be written as $W_\alpha = \{W_\alpha^1, W_\alpha^2, W_\alpha^3, W_\alpha^4\}$. The smallest edge weight lies in $W_\alpha^2$ and the largest edge weight lies in $W_\alpha^1$, and $W_\alpha$ forms an arithmetic sequence with first term $2nk + 1$ and common difference 1, that is, $\bigcup_{t=1}^{4} W_\alpha^t = \{2nk+1, 2nk+2, 2nk+3, \ldots, 3nk-1\}$. So it can be concluded that the firecracker graph $F_{n,k}$ has a super $(2nk+1, 1)$-edge antimagic total labeling for n even ($n \ge 2$) and $k \ge 3$.

Theorem 4. There is a super $(nk+n+4, 3)$-edge antimagic total labeling on the combination of firecracker graphs $mF_{n,k}$ if $m \ge 2$, n is even ($n \ge 2$), and $k \ge 3$.

Proof.
To determine the super $(a,3)$-edge antimagic total labeling, first define the edge labeling $f_\alpha : E(F_{n,k}) = \{e_1, e_2, \ldots, e_{nk-1}\} \to \{nk+1, nk+2, \ldots, 2nk-1\}$, which can be formulated as:

$$
\begin{aligned}
f_\alpha^1(x_{i,l}x_{i,0}) &= nk + 4i - 2; && 1 \le i \le n,\ l = 1 \\
f_\alpha^2(x_{i,l}x_{i+1,l}) &= nk + 4i; && 1 \le i \le n-1,\ l = 1 \\
f_\alpha^3(x_{i,l}x_{i,0}) &= (2k - l)n + i - 1; && i \text{ odd},\ 1 \le i \le n,\ 2 \le l \le k-1 \\
f_\alpha^4(x_{i,l}x_{i,0}) &= (2k - l)n + n + i - 1; && i \text{ even},\ 2 \le i \le n-1,\ 2 \le l \le k-1
\end{aligned}
$$

The total labeling edge weights are $W_\alpha^t = w_\alpha^t + f_\alpha^t$ for $t = 1, 2, 3, 4$, taken on the same edges and index ranges as $f_\alpha^t$. Substituting gives:

$$
\begin{aligned}
W_\alpha^1 &= (n + 2i) + (nk + 4i - 2) = nk + n + 6i - 2 \\
W_\alpha^2 &= (n + 2i + 1) + (nk + 4i) = nk + n + 6i + 1 \\
W_\alpha^3 &= (l+1)n + \tfrac{i+1}{2} + (2k - l)n + i - 1 = 2nk + n + \tfrac{3i-1}{2} \\
W_\alpha^4 &= (2k - l)n + n + i - 1 + (l+1)n + \tfrac{n+i+1}{2} = 2nk + \tfrac{5n+3i-1}{2}
\end{aligned}
$$

Based on the equations above, the set of total labeling edge weights can be written as $W_\alpha = \{W_\alpha^1, W_\alpha^2, W_\alpha^3, W_\alpha^4\}$. The smallest edge weight lies in $W_\alpha^1$, and $W_\alpha$ is stated to form an arithmetic sequence with first term $nk + n + 4$ and common difference 3, that is, $\bigcup_{t=1}^{4} W_\alpha^t = \{nk+n+4, nk+n+7, nk+n+10, \ldots, nk+n+4+3(nk-2)\}$. So it can be concluded that the firecracker graph $F_{n,k}$ has a super $(nk+n+4, 3)$-edge antimagic total labeling for n even ($n \ge 2$) and $k \ge 3$.

Conclusions
Based on the results, it can be concluded that the firecracker graph $F_{n,k}$ has super $(a,d)$-edge antimagic total labelings with $d \in \{0, 1, 2, 3\}$; the bijective functions in the lemmas and theorems describe these super $(a,d)$-edge antimagic total labelings on the firecracker graph.
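The odd-n constructions (Lemma 2 with Theorems 1 and 2) can be checked numerically with a short sketch. The vertex labels follow the case formula of Lemma 2. The edge labels are an assumption made for illustration: instead of the closed-form expressions for $f_\alpha$, the sketch ranks the edges by their vertex-sum weight and hands out the labels $nk+1, \ldots, 2nk-1$ in decreasing order (for d = 0) or increasing order (for d = 2). The function names are hypothetical.

```python
def vertex_label(n, i, l):
    """Vertex label alpha(x_{i,l}) from Lemma 2 (n odd, 1 <= i <= n, 0 <= l <= k-1)."""
    odd = i % 2 == 1
    if l == 0:
        return i if odd else n + i
    if l == 1:
        return n + i if odd else i
    if odd:
        return n * (l + 1) - (i - 1) // 2
    return l * n + (n - i - 1) // 2 + 1


def edges(n, k):
    """Edges of F_{n,k}: star edges x_{i,l} x_{i,0} and linking edges x_{i,1} x_{i+1,1}."""
    star = [((i, l), (i, 0)) for i in range(1, n + 1) for l in range(1, k)]
    path = [((i, 1), (i + 1, 1)) for i in range(1, n)]
    return star + path


def first_weight(n, k, d):
    """Check the super (a, d)-EAT property for d in {0, 2} and return a."""
    label = {(i, l): vertex_label(n, i, l) for i in range(1, n + 1) for l in range(k)}
    # The vertex labels must be exactly 1, ..., nk (the "super" condition).
    assert sorted(label.values()) == list(range(1, n * k + 1))
    # Rank the edges by vertex-sum weight; Lemma 2 says these sums are consecutive.
    ranked = sorted(edges(n, k), key=lambda e: label[e[0]] + label[e[1]])
    q = len(ranked)
    totals = []
    for rank, (u, v) in enumerate(ranked):
        # d = 0: largest edge label on the lightest edge; d = 2: smallest label on it.
        edge_label = n * k + (q - rank if d == 0 else rank + 1)
        totals.append(label[u] + label[v] + edge_label)
    assert all(b - a == d for a, b in zip(totals, totals[1:]))
    return totals[0]


print(first_weight(3, 3, 0), first_weight(3, 3, 2))
```

For odd n the returned first terms agree with $2nk + n + 1$ (Theorem 1) and $(k+1)n + 3$ (Theorem 2). The sketch does not cover the even-n cases of Theorems 3 and 4, because Lemma 2 defines the vertex labeling only for odd n.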
The bijective functions for each labeling with common difference d are shown in equations (1), (2), (3) to (10) above. Open problem: find a super $(a,d)$-edge antimagic total labeling on the firecracker graph $F_{n,k}$ for d = 0 and d = 2 where n is even and $k \ge 3$, and for d = 1 and d = 3 where n is odd and $k \ge 3$.

References
[1] Dafik, “Structural properties and labeling of graphs statement of authorship,” no. November, pp. 1–139, 2007.
[2] M. Baca, L. Brankovic, M. Lascsakova, O. Phanalasy, and A. Semanicova-Fenovcikova, “On d-antimagic labelings of plane graphs,” Electron. J. Graph Theory Appl., vol. 1, no. 1, pp. 28–39, 2013.
[3] M. A. Muttaqien and A. Suyitno, “Pelabelan total sisi ajaib pada graf double star dan graf sun,” UNNES J. Math., vol. 2, no. 2, 2013.
[4] R. H. Utomo, H. Tjahjana, B. Irawanto, and L. Ratnasari, “Pelabelan total super trimagic sisi pada beberapa graf,” J. Fundam. Math. Appl., vol. 1, no. 1, p. 52, Jun. 2018.
[5] T. Utomo and N. Riskiana Dewi, “Dimensi metrik graf amal (nKm),” Limits J. Math. Its Appl., vol. 15, no. 1, p. 71, Mar. 2018.
[6] V. Swaminathan and P. Jeyanthi, “Super edge-magic strength of firecrackers, banana trees and unicyclic graphs,” Discrete Math., vol. 306, no. 14, pp. 1624–1636, 2006.
[7] S. Chattopadhyay and P. Panigrahi, “Some structural properties of power graphs and k-power graphs of finite semigroups,” J. Discret. Math. Sci. Cryptogr., vol. 20, no. 5, pp. 1101–1119, Jul. 2017.
[8] W.-C. Chen, H.-I. Lu, and Y.-N. Yeh, “Operations of interlaced trees and graceful trees,” Southeast Asian Bull. Math., vol. 21, pp. 337–348, 1997.
[9] J. A. Gallian, “A dynamic survey of graph labeling,” Electron. J. Comb., vol. 1, no. DynamicSurveys, pp. 1–256, 2018.
[10] K. A. Sugeng and D. R. Silaban, “On b-edge consecutive edge labeling of some regular tree,” Indones. J. Comb., vol. 4, no. 1, p. 76, 2020.
[11] I.
Halikin, “Pelabelan lokal titik graf hasil diagram lattice subgrup Zn,” Alkhwarizmi J. Pendidik. Mat. dan Ilmu Pengetah. Alam, vol. 6, no. 1, pp. 47–56, Mar. 2018.
[12] A. S. Meliana Deta Anggraeni, Mulyono, “Pelabelan L(3,2,1) dan pembentukan graf middle pada beberapa graf khusus,” vol. 2, no. 1, pp. 76–83, 2013.

Comparisons between Resampling Techniques in Linear Regression: A Simulation Study
CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 7(3) (2022), Pages 345-353
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: December 23, 2021 Reviewed: May 25, 2022 Accepted: July 24, 2022
DOI: http://dx.doi.org/10.18860/ca.v7i3.14550

Anwar Fitrianto1,*, Punitha Linganathan2
1,*Department of Statistics, IPB University, Indonesia
2Department of Mathematics, Universiti Putra Malaysia, Malaysia
Email: anwarstat@gmail.com

Abstract
Parameter estimation in linear regression requires certain assumptions to be fulfilled. When the assumptions are not fulfilled, the conclusion is questionable. Bootstrap and jackknife are resampling techniques that do not require assumptions in estimating $\hat{\beta}$. The study aims to compare resampling techniques in linear regression. The data used in the study is clean, without any influential observations, outliers, or leverage points. The ordinary least square method was used as the primary method to estimate the parameters and was then compared with the resampling techniques. The variance, p-value, bias, and standard error are used as criteria to determine the best method among random bootstrap, residual bootstrap, and delete-one jackknife. After all the analysis, it was found that random bootstrap did not perform well, while residual bootstrap and delete-one jackknife work quite well. Random bootstrap, residual bootstrap, and jackknife estimate better than ordinary least square. The study also found that residual bootstrap works well in estimating the parameters in small samples.
At the same time, it is suggested to use the jackknife when the sample size is large, because the jackknife is easier to apply than residual bootstrap and works well for large samples.

Keywords: jackknife; linear; regression; resampling

Introduction
Regression analysis is a statistical analysis that constructs relationships between a dependent or response variable $y$ and independent or regressor variables $(x_1, x_2, \ldots, x_k)$. Ordinary least square (OLS) is the traditional way of finding the parameter estimates $\hat{\beta}$, but it relies strongly on assumptions [1]. The reliability and validity of the conclusions in regression analysis are essential ([2], [3]), and they depend on how closely the data follow the assumptions and on the sample size. It is easier to find the estimated regression coefficient $\hat{\beta}$ without any assumption or distribution. Bootstrap and jackknife are resampling techniques that do not need any assumptions in estimating $\hat{\beta}$ ([4]–[6]).

Sahinler and Topuz [7] compared the bootstrap and jackknife methods. Their research discussed strategies for building a regression model using the jackknife and bootstrap methods. The four methods used in their research are bootstrap based on resampling observations, bootstrap based on resampling errors, delete-one jackknife regression, and delete-d jackknife regression. These methods were used to find the parameter estimates, bias, standard errors, and confidence intervals. Their research concluded that a large number of bootstrap replicates ensures that the estimate is close to the true parameter. They also suggested a number of bootstrap replicates sufficient for estimating the variance, and $B = 1000$ for estimating the standard errors.
Their research tests the accuracy of the bootstrap and jackknife methods in estimating the distribution of regression parameters with various sample sizes and various numbers of bootstrap replicates. Sahinler and Topuz [7] and Li et al. [8] found that the bootstrap method is appropriate for linear regression and is usable even when the error is not normally distributed. Algamal and Rasheed [9] further developed resampling in linear regression. The advantage of bootstrap approximations is that, in general, they need a smaller sample than ordinary least square for estimating the parameters. Meanwhile, the disadvantages of bootstrap methods were discussed in Ma et al. [10], Wan et al. [11], [12], and Phaladiganon et al. [13]. A few of the disadvantages are as follows: a) the bootstrap distribution is not a good approximation of $F$ if the sample size is small and an outlier is present, b) bootstrap is not suggested for data with a dependence structure, such as time series, and c) it is not preferable to use residual bootstrap when the assumptions are violated. Algamal and Rasheed [9] concluded that the jackknife method performs quite well when the sample size is large enough ($n \ge 50$). Meanwhile, recent studies by Shao and Tu [14] and Beyaztas and Alin [15] discussed bootstrap and jackknife in linear regression. Based on that, this study aims to compare parameter estimates of multiple linear regression based on several resampling methods.

There are several methods to estimate $\hat{\beta}$ with bootstrap and jackknife. The scope of this research is to investigate the bootstrap and jackknife methods under different scenarios. This research considers random bootstrap, residual bootstrap, and the delete-one jackknife. The study is limited to the multiple linear regression model. First, samples of different sizes are selected and the parameters are estimated.
The bias and variance will be observed, and then the relationship between the bias and variance will be investigated. The distribution will also be observed as the sample size increases. Bootstrap resampling with different numbers of bootstrap replicates and sample sizes gives less bias than ordinary least square. The jackknife coefficient is calculated using

$$\hat{\beta}_j = \frac{1}{n} \sum_{i=1}^{n} \hat{\beta}_{ji} \tag{1}$$

where $n$ is the sample size and $\hat{\beta}_{ji}$ is the parameter estimate for the sample formed after deleting observation $i$. The bootstrap coefficient is calculated from

$$\hat{\beta}_b = \frac{1}{B} \sum_{r=1}^{B} \hat{\beta}_{br} \tag{2}$$

$$\hat{\beta}_{br} = \hat{\beta}_{ols} + (x'x)^{-1} x' e_{br} \tag{3}$$

where $r = 1, 2, \ldots, B$ indexes the bootstrap replicates, $e_{br}$ is the error of the regression, $x$ is the independent variable, and $\hat{\beta}_{ols}$ is the parameter estimate from the ordinary least square method.

Methods
Data
The data used in this study is the pressure-drop data, which is available in Montgomery et al. [16]. It has one dependent variable $y$ and four independent variables, $x_1$, $x_2$, $x_3$, and $x_4$. There are 62 observations in the data. The data was collected from research in which the pressure drop was measured for two-phase flow through screen-plate bubble columns. The research was conducted to examine the cause of the pressure drop through the bubble cap. A bubble column is used to observe the reaction between the gas and the liquid. The first factor considered in that research is the superficial fluid velocity of the gas. The gas's speed and direction of motion are measured by the flow in the column. The second factor is the kinematic viscosity. The friction caused by the thickness of the gas when it moves through the liquid particles was calculated. Then the distance across the space between two parallel threads was considered. The last factor used in the research is a dimensionless number, which is not associated with any physical dimension.
it is calculated to relate the gas's superficial fluid velocity to the liquid's superficial fluid velocity. for building the model, the dependent variable 𝑦 denotes the dimensionless factor for the pressure drop through a bubble cap. the independent variables are 𝑥1 (superficial fluid velocity of the gas, cm/s), 𝑥2 (kinematic viscosity), 𝑥3 (mesh opening, cm), and 𝑥4 (dimensionless number relating the gas's superficial fluid velocity to the liquid's superficial fluid velocity).

simulation study scenarios

the original data will be analyzed using ordinary least squares regression. then the model assumptions will be checked using the residuals of the model. then, using the same original data, residual bootstrap and random bootstrap resampling will be conducted with four different sample sizes: 20, 40, 50 and 62. each sample will be used with three different numbers of bootstrap replicates, namely 100, 1000 and 10000. for the delete-one jackknife, the resampling will be conducted at the same sample sizes, namely 20, 40, 50 and 62. the bias, variance, standard error and p-value will be calculated for each method. the best of the three methods will be chosen according to the values of bias, variance, standard error and p-value.

results and discussion

in this study, the full model was used as the reference, which means all independent variables were included in the model regardless of their significance. the fitted full regression model, obtained by ordinary least squares using sas software, is:

$$\hat{y} = 5.88839 - 0.48460 x_1 + 0.18263 x_2 + 35.39109 x_3 + 5.92695 x_4$$

random bootstrap approach

the random bootstrap technique was first used to analyze the data. the resampling was conducted at sample sizes 20, 40, 50 and 62. bootstrap replications of 100, 1000 and 10000 were applied at every sample size.
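the random bootstrap described above can be sketched as follows. this is a minimal illustration, not the sas procedure used in the study: the function names `ols` and `random_bootstrap` are hypothetical, the data are synthetic stand-ins for the pressure-drop data, and the bias is assumed to be the bootstrap coefficient of equation (2) minus the ordinary least squares estimate.

```python
import numpy as np

def ols(X, y):
    """least-squares coefficients for y = X @ beta + e (X already holds an intercept column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def random_bootstrap(X, y, n_boot=1000, seed=0):
    """resample (x_i, y_i) rows with replacement and refit ols on every resample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    beta_ols = ols(X, y)
    draws = np.empty((n_boot, X.shape[1]))
    for r in range(n_boot):
        idx = rng.integers(0, n, size=n)      # row indices drawn with replacement
        draws[r] = ols(X[idx], y[idx])
    beta_boot = draws.mean(axis=0)            # average over replicates, as in equation (2)
    bias = beta_boot - beta_ols               # assumed definition of bias
    variance = draws.var(axis=0, ddof=1)      # sample variance across replicates
    return beta_boot, bias, variance

# synthetic stand-in data: 62 observations, intercept plus four regressors
rng = np.random.default_rng(42)
n = 62
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
true_beta = np.array([5.0, -0.5, 0.2, 3.0, 1.0])   # arbitrary illustrative values
y = X @ true_beta + rng.normal(scale=1.0, size=n)

beta_boot, bias, variance = random_bootstrap(X, y, n_boot=1000)
print(np.round(bias, 4), np.round(variance, 4))
```

changing `n_boot` to 100 or 10000, or passing only the first 20, 40 or 50 rows, mirrors the scenario grid of the study on the stand-in data.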
table 1. summary statistics for multiple linear regression using random bootstrap at different bootstrap replicates and sample sizes for 𝛽0 and 𝛽3

parameter estimate   bootstrap replicate   sample size   bias      variance   p-value   standard error
$\hat{\beta}_0$      100                   20            -2.2181   85.6307    0.0001    0.9254
$\hat{\beta}_0$      100                   40             3.3626   22.8120    <.0001    0.4776
$\hat{\beta}_0$      100                   50             1.5437   15.1000    <.0001    0.0469
$\hat{\beta}_0$      100                   62            -0.6707   19.9445    <.0001    0.4466
$\hat{\beta}_0$      1000                  20            -1.1044   83.9495    <.0001    0.2897
$\hat{\beta}_0$      1000                  40             2.9549   34.4348    <.0001    0.0012
$\hat{\beta}_0$      1000                  50             1.4503   18.4197    <.0001    0.1357
$\hat{\beta}_0$      1000                  62            -0.8994   19.1731    <.0001    0.1385
$\hat{\beta}_0$      10000                 20            -1.2754   203.4686   <.0001    0.0108
$\hat{\beta}_0$      10000                 40             2.6252   41.8398    <.0001    0.0647
$\hat{\beta}_0$      10000                 50             1.3527   18.8707    <.0001    0.0434
$\hat{\beta}_0$      10000                 62            -0.9042   4.9842     <.0001    0.0461
$\hat{\beta}_3$      100                   20             2.6410   574.9345   <.0001    2.3978
$\hat{\beta}_3$      100                   40            -5.7656   111.8724   <.0001    1.0577
$\hat{\beta}_3$      100                   50            -5.3883   61.2876    <.0001    0.7829
$\hat{\beta}_3$      100                   62             1.0310   125.2369   <.0001    1.1191
$\hat{\beta}_3$      1000                  20             2.4814   629.8633   <.0001    0.7936
$\hat{\beta}_3$      1000                  40            -5.4017   211.8649   <.0001    0.4603
$\hat{\beta}_3$      1000                  50            -4.5249   73.4070    <.0001    0.2709
$\hat{\beta}_3$      1000                  62             1.6356   116.7295   <.0001    0.3417
$\hat{\beta}_3$      10000                 20             3.1247   634.0890   <.0001    0.2518
$\hat{\beta}_3$      10000                 40            -4.5548   261.6325   <.0001    0.1618
$\hat{\beta}_3$      10000                 50            -4.2045   87.0297    <.0001    0.0933
$\hat{\beta}_3$      10000                 62             1.8947   37.2858    <.0001    0.1146

table 1 shows the changes in $\hat{\beta}_3$ and $\hat{\beta}_0$ at different sample sizes and bootstrap replicates. for each parameter estimate, the bias changes as the sample size changes. more specifically, the bias gets smaller as the sample size increases. with 100 replicates, the variance of $\hat{\beta}_3$ decreases from 574.9345 when the sample size is 20 to 61.2876 when the sample size is 50. but the bias of $\hat{\beta}_3$ increases when the sample size is 62. it can be observed that as the sample size increased from 20 to 62, the variance of the parameter estimates decreased. meanwhile, the bias decreases as the number of bootstrap replicates increases. with b set to 100 and a sample size of 50, the intercept shows a bias of 1.5437. this value decreases to 1.4503 when the number of bootstrap replicates, b, increases to 1000.
when the number of bootstrap replicates was increased to 10000, the bias decreases again to 1.3527. from the results, it can be observed that the bias decreases as the number of replicates increases. for $\hat{\beta}_3$ at sample size 62, when the number of bootstrap replicates, b, increases from 100 to 1000, the variance decreases from 125.2369 to 116.7295. it decreases further to 37.2858 when b is equal to 10000, a 70.23% reduction compared with 125.2369.

residual bootstrap approach

the second resampling technique used to analyze the data was the residual bootstrap. this section displays results such as parameter estimates, bias, and variances of the parameter estimates using the residual bootstrap. the results for $\hat{\beta}_0$ and $\hat{\beta}_1$ are shown in table 2. in the residual bootstrap, the results were more apparent than in the random bootstrap. there is a clear trend in the parameter estimates, bias, and variance at different sample sizes and numbers of bootstrap replicates. the bias decreases as the sample size increases. for $\hat{\beta}_1$ with 100 replicates, the bias is 0.2307 when 𝑛 = 20. when the sample size increased to 40 the bias became 0.2266, it is 0.0684 when the sample size is 50 and, at last, when 𝑛 is 62 the bias became 0.01368. in general, there is a noticeable difference in bias when the sample size increases.

table 2. summary statistics for multiple linear regression using residual bootstrap at different bootstrap replicates and sample sizes for 𝛽0 and 𝛽1

parameter estimate   bootstrap replicate   sample size   bias      variance   p-value   standard error
$\hat{\beta}_0$      100                   20             1.5277   30.9685    <.0001    0.5565
$\hat{\beta}_0$      100                   40             2.6535   27.0046    <.0001    0.5197
$\hat{\beta}_0$      100                   50             1.5324   19.8861    <.0001    0.4459
$\hat{\beta}_0$      100                   62            -0.3345   15.4073    <.0001    0.3925
$\hat{\beta}_0$      1000                  20             0.9635   28.6300    <.0001    0.1692
$\hat{\beta}_0$      1000                  40             2.3838   22.7581    <.0001    0.1509
$\hat{\beta}_0$      1000                  50             2.0622   20.0467    <.0001    0.1416
$\hat{\beta}_0$      1000                  62             0.0035   15.4785    <.0001    0.1244
$\hat{\beta}_0$      10000                 20             0.6704   30.9949    <.0001    0.0557
$\hat{\beta}_0$      10000                 40             2.2883   24.0894    <.0001    0.0491
$\hat{\beta}_0$      10000                 50             2.2491   20.2725    <.0001    0.0450
$\hat{\beta}_0$      10000                 62            -0.0193   17.0400    <.0001    0.0413
$\hat{\beta}_1$      100                   20             0.2307   0.2037     <.0001    0.0451
$\hat{\beta}_1$      100                   40             0.2266   0.1566     <.0001    0.0396
$\hat{\beta}_1$      100                   50             0.0684   0.1098     <.0001    0.0331
$\hat{\beta}_1$      100                   62             0.0137   0.0819     <.0001    0.0286
$\hat{\beta}_1$      1000                  20             0.1630   0.2196     <.0001    0.0148
$\hat{\beta}_1$      1000                  40             0.2061   0.1612     <.0001    0.0127
$\hat{\beta}_1$      1000                  50             0.0732   0.1322     <.0001    0.0115
$\hat{\beta}_1$      1000                  62            -0.0066   0.1025     <.0001    0.0101
$\hat{\beta}_1$      10000                 20             0.1547   0.2103     <.0001    0.0046
$\hat{\beta}_1$      10000                 40             0.2180   0.1579     <.0001    0.0040
$\hat{\beta}_1$      10000                 50             0.0608   0.1338     <.0001    0.0037
$\hat{\beta}_1$      10000                 62            -0.0024   0.1071     <.0001    0.0033

the results in table 2 show a clear decrease of the variances when the sample size increases. consider the changes in the variance of $\hat{\beta}_0$ when the number of bootstrap replicates is 1000. when the sample size is 20 the variance is 28.6300, and the value becomes 22.7581 when the sample size is 40. the variance then decreases as the sample size increases to 50 and 62, where it becomes 20.0467 and 15.4785, respectively. now consider the changes in bias caused by the number of bootstrap replicates, b, when it is increased from one hundred to one thousand and then ten thousand. for the estimated constant, $\hat{\beta}_0$, when the sample size is 40 the bias changes from 2.6535 to 2.3838, then 2.2883, as b increases from 100 to 1000 and then 10000, respectively. the variance also decreases when the number of bootstrap replicates increases.

delete-one jackknife approach

the third technique used in this research is the delete-one jackknife. the method was applied with different sample sizes, which are 20, 40, 50 and 62. table 3 and figure 1 display the changes in bias of all parameters for the delete-one jackknife. the bias decreases as the sample size increases.
but when the sample size equals the population size, the bias increases. using the population as the sample might produce this type of result. plots of variance versus sample size for all parameters are shown in figure 2. from the plot, it can be seen that the variance also decreases from sample size 20 to sample size 62. small variances give a better estimation in linear regression. the bias and variance are also not interrelated in the delete-one jackknife. the p-values also show that all parameter estimates are significant. the standard error also clearly shows that increasing the sample size gives a better estimation.

table 3. summary statistics for multiple linear regression using delete-one jackknife at different sample sizes

parameter estimate   sample size   bias      variance   p-value   standard error
$\hat{\beta}_0$      20             0.5586   2.9683     <.0001    0.3852
$\hat{\beta}_0$      40             2.2937   0.7335     <.0001    0.1354
$\hat{\beta}_0$      50             2.1648   0.3625     <.0001    0.0851
$\hat{\beta}_0$      62            -3.1721   0.2212     <.0001    0.0597
$\hat{\beta}_1$      20             0.1617   0.0161     <.0001    0.0284
$\hat{\beta}_1$      40             0.2182   0.0054     <.0001    0.0117
$\hat{\beta}_1$      50             0.0662   0.0046     <.0001    0.0096
$\hat{\beta}_1$      62             0.6613   0.0029     <.0001    0.0069
$\hat{\beta}_2$      20             0.0249   0.0001     <.0001    0.0017
$\hat{\beta}_2$      40             0.0006   0.0000     <.0001    0.0007
$\hat{\beta}_2$      50            -0.0045   0.0000     <.0001    0.0005
$\hat{\beta}_2$      62             0.0054   0.0000     <.0001    0.0004
$\hat{\beta}_3$      20            -2.0491   18.1473    <.0001    0.9526
$\hat{\beta}_3$      40            -7.3628   3.7708     <.0001    0.3070
$\hat{\beta}_3$      50            -5.6852   1.4059     <.0001    0.1677
$\hat{\beta}_3$      62            -3.7014   0.9218     <.0001    0.1219
$\hat{\beta}_4$      20             0.3589   1.9284     <.0001    0.3105
$\hat{\beta}_4$      40             0.6624   0.4470     <.0001    0.1057
$\hat{\beta}_4$      50            -0.0712   0.3889     <.0001    0.0882
$\hat{\beta}_4$      62             0.7431   0.4339     <.0001    0.0837

figure 1. changes of bias in all parameter estimation when sample size increases in delete-one jackknife.

figure 2. changes of variance in all parameter estimation when sample size increases in delete-one jackknife.

the difference between residual bootstrap estimation and random bootstrap estimation is obvious when the sample size is 20 (small).
the residual bootstrap provided better parameter estimation than the random bootstrap in terms of bias and variance. this shows that the residuals have a big influence in linear regression. however, as the sample size increases, the residual and random bootstrap methods show similar results. increasing the number of bootstrap replicates and the sample size gave better parameter estimation in both methods. the delete-one jackknife gave a small variance, but the bias was large when the sample size was small. the bias and variance decrease as the sample size increases.

conclusions

residual bootstrap, random bootstrap, and delete-one jackknife were compared. the jackknife is not advisable when the sample size is small. however, when the sample size is big enough, that is, near the population size, it gives better parameter estimation than the random bootstrap and the residual bootstrap. in a situation where the sample size is small due to cost considerations, it is better to use the residual bootstrap than the other methods in linear regression. in conclusion, it is advisable to use the residual bootstrap when the sample is small. a larger number of bootstrap replicates gives better parameter estimation. the jackknife can be used when the sample size is big enough. this method is useful when the sample size is so big that the random and residual bootstrap would take a long time to process. in the future, this research can be extended to observe how these methods react when there is an outlier, influential point or leverage point. moreover, the comparisons may involve other resampling techniques to determine which method works well in multiple linear regression.

references

[1] m.
alrasheedi, “parametric and non-parametric bootstrap: a simulation study for a linear regression with residuals from a mixture of laplace distributions,” european scientific journal, vol. 9, no. 12, 2013.
[2] r. f. gunst and r. l. mason, regression analysis and its application: a data-oriented approach. crc press, 2018.
[3] a. althubaiti, “information bias in health research: definition, pitfalls, and adjustment methods,” j multidiscip healthc, vol. 9, p. 211, 2016.
[4] m. r. chernick, “resampling methods,” wiley interdisciplinary reviews: data mining and knowledge discovery, vol. 2, no. 3, pp. 255–262, 2012.
[5] r. e. mcroberts, s. magnussen, e. o. tomppo, and g. chirici, “parametric, bootstrap, and jackknife variance estimators for the k-nearest neighbors technique with illustrations using forest inventory and satellite image data,” remote sensing of environment, vol. 115, no. 12, pp. 3165–3174, 2011.
[6] r. g. clark and s. allingham, “robust resampling confidence intervals for empirical variograms,” mathematical geosciences, vol. 43, no. 2, pp. 243–259, 2011.
[7] s. sahinler and d. topuz, “bootstrap and jackknife resampling algorithms for estimation of regression parameters,” journal of applied quantitative methods, vol. 2, no. 2, pp. 188–199, 2007.
[8] x. li, w. wong, e. l. lamoureux, and t. y. wong, “are linear regression techniques appropriate for analysis when the dependent (outcome) variable is not normally distributed?,” invest ophthalmol vis sci, vol. 53, no. 6, pp. 3082–3083, 2012.
[9] z. y. algamal and k. b. rasheed, “re-sampling in linear regression model using jackknife and bootstrap,” iraqi journal of statistical sciences, vol. 10, no. 18, pp. 59–73, 2010.
[10] j. ma et al., “probabilistic forecasting of landslide displacement accounting for epistemic uncertainty: a case study in the three gorges reservoir area, china,” landslides, vol. 15, no. 6, pp. 1145–1153, 2018.
[11] c. wan, z. xu, y. wang, z. y. dong, and k. p.
wong, “a hybrid approach for probabilistic forecasting of electricity price,” ieee transactions on smart grid, vol. 5, no. 1, pp. 463–470, 2013.
[12] g. a. nelson, “cluster sampling: a pervasive, yet little recognized survey design in fisheries research,” trans am fish soc, vol. 143, no. 4, pp. 926–938, 2014.
[13] p. phaladiganon, s. b. kim, v. c. p. chen, j.-g. baek, and s.-k. park, “bootstrap-based $T^2$ multivariate control charts,” communications in statistics - simulation and computation, vol. 40, no. 5, pp. 645–662, 2011.
[14] j. shao and d. tu, the jackknife and bootstrap. springer science & business media, 2012.
[15] u. beyaztas and a. alin, “sufficient jackknife-after-bootstrap method for detection of influential observations in linear regression models,” statistical papers, vol. 55, no. 4, pp. 1001–1018, 2014.
[16] d. c. montgomery, e. a. peck, and g. g. vining, introduction to linear regression analysis. john wiley & sons, 2021.

cauchy - jurnal matematika murni dan aplikasi
volume 7(2) (2022), pages 220-230
p-issn: 2086-0382; e-issn: 2477-3344
submitted: september 19, 2021 reviewed: december 10, 2021 accepted: january 07, 2022
doi: http://dx.doi.org/10.18860/ca.v7i1.13356

analysis of insurance customer factors to renewal using hybrid ahp-ftopsis

kwardiniya andawaningtyas, evi ardiyani*, corina karim
department of mathematics, faculty of mathematics and natural science, brawijaya university, 65145, malang, indonesia
*corresponding author
email: eviardiyani98@gmail.com*, dina_math@ub.ac.id, corinaub@gmail.com

abstract

human life is full of uncertainties that carry enormous risks. insurance is one way to help people reduce this risk. the human need for insurance makes competition among insurance companies in indonesia very intense.
competition between insurance companies is influenced by several factors, one of which is having customers who renew their insurance. this study aims to determine the factors that influence customers to renew using the analytical hierarchy process (ahp) method and to rank customers' favorite insurance using the fuzzy technique for order preference by similarity to ideal solution (ftopsis) method. the analysis concluded that the main factor influencing customers to renew is the insurance features criterion, with the sub-criterion of health protection needs. meanwhile, the ranking of the customers' favorite insurance for renewal is takafullink salam cendikia with a closeness coefficient of 0.645, takaful al-khairat with a value of 0.563, takaful dana pendidikan with a value of 0.552, and takafullink salam with a value of 0.341.

keywords: insurance; renewal; ahp; ftopsis

introduction

human life is full of elements of uncertainty that carry enormous risks, such as accidents and death. people need a guarantee or a method to reduce this risk, which we usually call insurance. the human need for insurance makes the competition between insurance companies in indonesia very intense. the biggest factor for an insurance company to be competitive is having customers who renew. each customer has their own criteria, which are the determining factors for renewal. a decision support system (dss) is an information system intended to assist management in making decisions related to semi-structured issues. a dss also aims to assist decision makers in making unstructured decisions. unstructured decisions involve vague problems for which it is difficult to find solutions. decision support systems are basically designed to support every stage of decision making, namely identifying problems, selecting relevant data, determining approaches, and evaluating alternative choices.
in 2018, [1] conducted research on how to improve consumer satisfaction with guidance learning (lbb) in malang using the danp-topsis method. [2] identified important human error factors in emergency departments in taiwan using hfacs, ahp, and ftopsis. [3] conducted research on the selection of favorite banks using the ahp and topsis methods. [4] discusses the selection of the best health applications and the features that affect them using the ahp and ftopsis methods. a comparison of the anp and ahp methods was studied by [5]. [6] conducted research comparing topsis and saw. [7] discusses decision making using hybrid ahp-topsis. hybrid fuzzy ahp-topsis was researched by [8]. [9] researched decision making using hybrid ahp-topsis. a comparison between saw, ahp, and topsis was researched by [10]. [11] conducted research on the comparison between the saw method and the ahp method. an integrated anp and topsis method for supplier performance assessment was researched by [12]. [13] researched the evaluation of smart and sustainable cities with the anp and topsis methods. [14] conducted research on a hybrid ahp-topsis method under spherical fuzzy sets for system selection. hybrid ahp-topsis for selecting suppliers in a construction supply chain was researched by [15].

this study aims to determine the factors that most influence insurance customers to renew and to obtain the favorite insurance alternatives by combining the ahp and ftopsis methods. in this combination, the criteria weights are obtained using the ahp method, and the ftopsis method then uses those weights to obtain the best alternative.

methods

analytical hierarchy process (ahp) method.

ahp is a decision-making process that compiles functional hierarchies with human judgment as the main input [16].
ahp requires ideas from individuals and groups, obtaining their respective assumptions and the desired solutions. these ideas are used to determine criteria that can solve a problem. in this research, the ahp method is used to determine the criterion weights to be used in the ftopsis method. according to [16], the general procedure of the ahp method consists of seven steps.

1. define the problem and determine the desired solution, then arrange the hierarchy of the problem by setting the goal, which is the overall system goal, at the top level.
2. determine the priority of the elements.
   a. make pairwise comparisons by comparing elements in pairs according to the given criteria.
   b. fill the pairwise comparison matrix using numbers that represent the relative importance of one element to another. each entry of the pairwise comparison matrix is the result of a questionnaire converted using table 1.
3. synthesis. the pairwise comparison judgments are synthesized to obtain the overall priority.
   a. sum each column of the matrix.
   b. divide each value in a column by the column total to obtain a normalized matrix.
   c. sum each row and divide by the number of elements to get the average value.
4. measure consistency.
   a. multiply each value in the first column by the relative priority of the first element, each value in the second column by the relative priority of the second element, and so on.
   b. sum each row, then divide the result by the corresponding relative priority element.
   c. add the results above over the existing elements; the result is called $\lambda_{max}$.
5. calculate the consistency index (ci)

$$CI = \frac{\lambda_{max} - n}{n - 1} \qquad (1)$$

   where $n$ is the number of elements.
6. calculate the consistency ratio (cr)

$$CR = \frac{CI}{IR} \qquad (2)$$

   where $IR$ is the random consistency index contained in table 2.
7. check the hierarchy consistency. the consistency ratio must be less than or equal to 0.1; if so, the calculation result can be declared correct.

table 1.
ahp rating scale

difference   -8   -7   -6   -5   -4   -3   -2   -1   0
ahp scale     9    8    7    6    5    4    3    2   1

source: [16]

table 2. index random consistency

matrix size   ir value
1, 2          0.00
3             0.58
4             0.90
5             1.12
6             1.24
7             1.32
8             1.41
9             1.45
10            1.49
11            1.51

source: [16]

fuzzy technique for order preference by similarity to ideal solution (ftopsis) method.

the ftopsis method is a development of topsis (technique for order preference by similarity to ideal solution). the topsis method was first introduced by yoon and hwang in 1981. topsis has a weakness when the decision maker has difficulty determining an exact value. therefore, it is necessary to provide the assessment in the form of intervals, for example by applying fuzzy logic. the fuzzy numbers, linguistic values and membership function are shown in figure 1 and table 3. in this research, the ftopsis method is used to rank the alternatives.

figure 1. fuzzy numbers and linguistic values

table 3. the membership function of linguistic value

linguistic value   fuzzy number
very low (vl)      (0, 0, 0.2)
low (l)            (0, 0.2, 0.4)
medium (m)         (0.2, 0.4, 0.6)
high (h)           (0.4, 0.6, 0.8)
very high (vh)     (0.6, 0.8, 1)
excellent (e)      (0.8, 1, 1)

source: [3]

the general procedure of the ftopsis method consists of 9 steps.

1. assess the criteria and alternatives. it is assumed that there is a set of alternatives to be evaluated against a set of criteria. each criterion has a weight, and each decision maker gives every alternative a fuzzy rating, with its membership function, against each criterion.
2. calculate the comparison value of each criterion and alternative. the fuzzy values for each decision maker are presented as triangular fuzzy numbers. the fuzzy rating is given by equation (3) and the fuzzy weight by equation (4).
3. make a decision matrix. create a decision matrix (dk) for the alternatives to be evaluated based on the defined criteria, whose entries state the performance of alternative i against criterion j.
4. normalize the fuzzy decision matrix. normalize the data using a linear scale transformation; the normalized matrix is defined by equation (5), with equation (6) for benefit criteria and equation (7) for cost criteria.
5. calculate the weighted normalized matrix. the weighted normalized matrix is calculated by multiplying the weight of each evaluated criterion by the normalized decision matrix, as in equation (8).
6. calculate the fuzzy positive ideal solution (fpis) and the fuzzy negative ideal solution (fnis) using equations (9) and (10), which are taken over the set of benefit criteria and the set of cost criteria.
7. calculate the distance of each alternative from fpis and fnis. the distance between two triangular fuzzy numbers is calculated by equation (11). the distance of each weighted alternative (i = 1, 2, 3, …, m) from fpis and fnis is calculated by equations (12) and (13).
8. calculate the closeness coefficient. the closeness coefficient $CC_i$ represents the distances to fpis ($A^+$) and fnis ($A^-$) simultaneously. for each alternative, it is calculated by

$$CC_i = \frac{d_i^-}{d_i^+ + d_i^-} \qquad (14)$$

   where $d_i^+$ and $d_i^-$ are the distances of alternative i from fpis and fnis.
9. sort the alternatives. the alternatives are sorted in decreasing order of the closeness coefficient. the best alternative is the one whose closeness coefficient shows it to be close to fpis and far from fnis.

results and discussion

analytical hierarchy process (ahp) method.

the first step in the ahp method is arranging the hierarchical structure. the hierarchical structure in this study consists of 4 levels. the first level is the goal, namely to determine the favorite type of insurance.
the second level is the elaboration of the main aspects that influence the goal, namely the criteria. the third level consists of the aspects that influence the criteria, namely the sub-criteria. the fourth and lowest level consists of the alternatives. the structure of the hierarchical system in this study can be seen in figure 2. then, we determine the priority of the elements by forming the pairwise comparison matrix between sub-criteria and the weight matrix between sub-criteria based on the results of the questionnaire. the next step is to calculate the consistency ratio. the five criteria have a consistency ratio of at most 0.1, so it can be concluded that the pairwise comparison matrices between these sub-criteria are consistent. the most influential criterion in choosing the customer's favorite insurance for renewal in company a is the insurance features criterion, with the sub-criterion of the need for health protection having a weight value of 0.800. table 5 shows the evaluation result and final ranking of the criteria.

to determine the level of data consistency, we calculate the consistency ratio. first, calculate $\lambda_{max}$, then calculate the consistency index using equation (1). $\lambda_{max}$ is obtained by adding the results for the existing elements, and the number of sub-criteria is the value of $n$ used. table 5 shows the results. using the value that has been obtained, calculate the consistency ratio using equation (2). if the consistency ratio is at most 0.1, the research can be continued. table 4 shows the consistency ratio values. the five criteria have a value of less than 0.1, which means that the data for the five criteria are consistent, so the research can be continued.

table 4. consistency ratio value

criteria             value
company image        0.068
agent                0.034
insurance features   0.064
claim                0.020
income               0.055

figure 2. hierarchical system
goal: favorite type of insurance and factors that influence customers to renew
criterion: agent, insurance features, claim, company image, income
sub-criterion: product mastery, communication, ease of contact, honesty, achievement, track record, health, education, investment, ease of taking claims, great claim, time period for claiming, customer income (rp. 2,5 4,9 million/month), customer income (rp. 5,0 7,5 million/month), customer income (>rp. 7,5 million/month)
alternative: tdp, tls, tlsc, tak

table 5. ahp method result

subcriteria                               (priority vector)
honesty                                   0.639
achievement                               0.087
...
customer income (rp. 7,5 million/month)   0.244

afterward, we analyze the best alternative using the ftopsis method. the weights of the criteria used in the evaluation process are calculated using the ahp method and combined with the scores from the expert questionnaire. table 6 shows the data from the expert questionnaire.

table 6. data from the expert questionnaire

                                          tdp   tls   tlsc   tak
honesty                                   h     vh    m      h
achievement                               vh    vh    h      m
track record                              m     h     vh     l
product mastery                           l     vh    m      m
communication                             h     l     vh     h
ease of contact                           m     l     l      m
health                                    h     h     vh     e
education                                 e     vh    h      h
investment                                h     vh    e      h
ease of taking claims                     vh    h     h      vh
great claim                               m     l     vh     h
time period for claiming                  vh    h     vh     h
customer income (rp. 7,5 million/month)   h     h     m      h

the next step is calculating the weighted alternative matrix. table 7 shows the results of multiplying the expert questionnaire values, converted using table 3, by the priority vector values. for example, the honesty sub-criterion for alternative tdp is rated h, which is converted to the fuzzy number (0.4, 0.6, 0.8) based on table 3. fuzzy multiplication is then performed with the priority vector value, which is 0.639. after multiplying the priority vectors by the expert questionnaire values, we calculate the fpis and fnis values, and then use these values to calculate the fpis and fnis distances using equations (12) and (13). table 8 shows the fpis and fnis distances. after calculating the distances from fpis and fnis, we calculate the closeness coefficient using equation (14). table 9 shows the results of calculating the value.
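the weighting example and the closeness coefficient step can be checked with a short sketch. it assumes the usual closeness coefficient form, the distance from fnis divided by the sum of both distances, and it assumes that the first row of distances in table 8 is the distance from fpis and the second row the distance from fnis; both assumptions are consistent with the values reported in table 9. the names `weight_rating` and `closeness` are hypothetical.

```python
# triangular fuzzy numbers for the linguistic values of table 3
FUZZY = {
    "vl": (0.0, 0.0, 0.2),
    "l": (0.0, 0.2, 0.4),
    "m": (0.2, 0.4, 0.6),
    "h": (0.4, 0.6, 0.8),
    "vh": (0.6, 0.8, 1.0),
    "e": (0.8, 1.0, 1.0),
}

def weight_rating(label, weight):
    """scale a triangular fuzzy rating by a crisp ahp priority weight."""
    return tuple(weight * v for v in FUZZY[label])

def closeness(d_pos, d_neg):
    """closeness coefficient: near 1 means close to fpis and far from fnis."""
    return d_neg / (d_pos + d_neg)

# distances as reported in table 8 (row assignment is an assumption, see above)
D_POS = {"tdp": 0.982, "tls": 1.445, "tlsc": 0.787, "tak": 0.958}
D_NEG = {"tdp": 1.207, "tls": 0.748, "tlsc": 1.429, "tak": 1.234}

# worked example from the text: honesty rated h for tdp, priority weight 0.639
honesty_tdp = weight_rating("h", 0.639)

cc = {alt: closeness(D_POS[alt], D_NEG[alt]) for alt in D_POS}
ranking = sorted(cc, key=cc.get, reverse=True)

print([round(v, 3) for v in honesty_tdp])
print({alt: round(v, 3) for alt, v in cc.items()})
print(ranking)
```

because the inputs are the rounded published distances, the computed coefficients can differ from table 9 in the third decimal place, while the ordering of the alternatives is unaffected.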
based on the values in table 9, the ftopsis ranking of the alternatives is as follows: the tlsc alternative is first, the tak alternative is second, the tdp alternative is third, and the tls alternative is last.

table 7. multiplication of the priority vectors by the results of the expert questionnaire

                                          tdp                   tls                   tlsc                  tak
honesty                                   (0.256,0.384,0.511)   (0.384,0.511,0.639)   (0.128,0.256,0.384)   (0.256,0.384,0.511)
achievement                               (0.052,0.070,0.087)   (0.052,0.070,0.087)   (0.035,0.052,0.070)   (0.017,0.035,0.052)
track record                              (0.055,0.109,0.164)   (0.109,0.164,0.219)   (0.164,0.219,0.274)   (0,0.055,0.109)
product mastery                           (0,0.069,0.137)       (0.206,0.274,0.343)   (0.069,0.137,0.206)   (0.069,0.137,0.206)
communication                             (0.230,0.345,0.460)   (0,0.115,0.230)       (0.345,0.460,0.575)   (0.230,0.345,0.460)
ease of contact                           (0.016,0.033,0.049)   (0,0.016,0.033)       (0,0.016,0.033)       (0.016,0.033,0.049)
health                                    (0.320,0.480,0.640)   (0.320,0.480,0.640)   (0.480,0.640,0.800)   (0.640,0.800,0.800)
education                                 (0.099,0.124,0.124)   (0.075,0.099,0.124)   (0.050,0.075,0.099)   (0.050,0.075,0.099)
investment                                (0.030,0.045,0.060)   (0.045,0.060,0.075)   (0.060,0.075,0.075)   (0.030,0.045,0.060)
ease of taking claims                     (0.074,0.098,0.123)   (0.049,0.074,0.098)   (0.049,0.074,0.098)   (0.074,0.098,0.123)
great claim                               (0.446,0.557,0.557)   (0,0.111,0.223)       (0.334,0.446,0.557)   (0.223,0.334,0.446)
time period for claiming                  (0.192,0.256,0.320)   (0.128,0.192,0.256)   (0.192,0.256,0.320)   (0.128,0.192,0.256)
customer income (rp. 7,5 million/month)   (0.098,0.146,0.195)   (0.098,0.146,0.195)   (0.049,0.098,0.146)   (0.098,0.146,0.195)

table 8. fpis and fnis distances

                     tdp     tls     tlsc    tak
distance from fpis   0.982   1.445   0.787   0.958
distance from fnis   1.207   0.748   1.429   1.234

table 9. closeness coefficient value

alternative   value
tdp           0.552
tls           0.341
tlsc          0.645
tak           0.563

conclusion

the results of the data analysis from the combination of the two methods indicate that the most influential factor for insurance customers to renew at pt asuransi takaful keluarga is the insurance features criterion, namely the customer's need for health protection, with a weight value of 0.800. all health insurance companies must have health protection features, so we look at the next sub-criteria in order: honesty with a weight value of 0.639, agent communication with a weight value of 0.575, and great claim with a weight value of 0.557. these highest-weighted sub-criteria belong to the criteria of company image, agent, and claim. the five criteria have priority values that are quite close, indicating that the five criteria are mutually sustaining. the order of alternative choices for the favorite insurance of customers who renew at pt asuransi takaful keluarga is takafullink salam cendikia, takaful al-khairat, takaful dana pendidikan, and takafullink salam.

references

[1] k. andawaningtyas, e. w. handamari, and c. karim, “improving the guidance learning (lbb) consumer satisfaction in malang using danp topsis method,” cauchy, vol. 5, no. 3, p. 117, 2018, doi: 10.18860/ca.v5i3.5541.
[2] m. chih hsieh et al., “application of hfacs, fuzzy topsis, and ahp for identifying important human error factors in emergency departments in taiwan,” int. j. ind. ergon., vol. 67, pp. 171–179, 2018, doi: 10.1016/j.ergon.2018.05.004.
[3] h. i, “analisis keputusan pemilihan bank favorit menggunakan kombinasi metode ahp dan metode topsis,” 2019.
[4] m. rajak and k. shaw, “evaluation and selection of mobile health (mhealth) applications using ahp and fuzzy topsis,” technol. soc., vol. 59, 2019, doi: 10.1016/j.techsoc.2019.101186.
[5] a. j. olanta, m. e. sianto, and i.
gunawan, “perbandingan metode anp dan ahp dalam pemilihan jasa kurir logistik oleh penjual gadget online,” widya tek., vol. 18, no. 2, pp. 96–101, 2019, doi: 10.33508/wt.v18i2.2275. [6] sunarti, “comparison beetwen topsis and saw method in the selection of tourist destinations in west java,” techno com, vol. 18, 2019. [7] r. agusli, m. i. dzulhaq, and f. c. irawan, “sistem pendukung keputusan penerimaan karyawan menggunakan metode ahp-topsis,” acad. j. comput. sci. res., vol. 2, no. 2, 2020, doi: 10.38101/ajcsr.v2i2.286. [8] v. julianto, h. s. utomo, and h. herpendi, “analisis dan penerapan metode fuzzy ahp-topsis dalam penentuan mitra industri sebagai tempat praktek kerja lapangan,” j. ilm. inform., vol. 5, no. 2, pp. 108–121, 2020, doi: 10.35316/jimi.v5i2.942. [9] g. surya mahendra and i. p. y. indrawan, “metode ahp-topsis pada sistem pendukung keputusan penentuan penempatan automated teller machine,” j. sains dan teknol., vol. 9, no. 2, pp. 130–142, 2020. [10] wawan firgiawan, sugiarto cokrowibowo, and nuralamsah zulkarnaim, “komparasi algoritma saw, ahp, dan topsis dalam penentuan uang kuliah tunggal (ukt),” j. comput. inf. syst. ( j-cis ), vol. 1, no. 2, pp. 1–11, 2019, doi: 10.31605/jcis.v1i2.426. [11] qiyamullaily arista, nandasari silvia, and amrozi yusuf, “perbandingan penggunaan metode saw dan ahp untuksistem pendukung keputusan penerimaan karyawan baru,” tek. eng. sains j., vol. 4, 2020. [12] c. natalia, i. p. surbakti, and c. w. oktavia, “integrated anp and topsis method for supplier performance assessment,” j. tek. ind., vol. 21, no. 1, pp. 34–45, 2020, doi: 10.22219/jtiumm.vol21.no1.34-45. [13] g. ozkaya and c. erdin, “evaluation of smart and sustainable cities through a hybrid mcdm approach based on anp and topsis technique,” heliyon, vol. 6, no. 10, 2020, doi: 10.1016/j.heliyon.2020.e05052. [14] m. mathew, r. k. chakrabortty, and m. j. 
ryan, “a novel approach integrating ahp and topsis under spherical fuzzy sets for advanced manufacturing system selection,” eng. appl. artif. intell., vol. 96, 2020, doi: 10.1016/j.engappai.2020.103988. [15] m. marzouk and m. sabbah, “ahp-topsis social sustainability approach for selecting supplier in construction supply chain,” clean. environ. syst., vol. 2, p. 100034, 2021, doi: 10.1016/j.cesys.2021.100034. [16] kusrini, konsep dan aplikasi sistem pendukung keputusan. 2007.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(3) (2022), Pages 401-410
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: April 05, 2022. Reviewed: June 29, 2022. Accepted: July 07, 2022.
DOI: http://dx.doi.org/10.18860/ca.v7i3.15749

C-Type Ops Transformation

Ahmad Lazwardi*, Iin Ariyanti, Soraya Djamilah
Universitas Muhammadiyah Banjarmasin, Indonesia
Email: lazwardiahmad@gmail.com

Abstract

The scope of the Ops transformation is limited to Maclaurin series, while many problems in mathematical modeling are formulated with more general series. The purpose of this study is to generalize the Ops transformation to a form that can be used for any Taylor series. The study uses a literature review method: it reviews the Ops transformation and then identifies the aspects that can be generalized. It then constructs a more general definition, referred to as the c-type Ops transformation. At the end of this research, the c-type Ops transformation is applied briefly to solve an ordinary differential equation with variable coefficients.

Keywords: c-type Ops transformation; Ops transformation; power series

Introduction

Taylor series are accessible to all students and are a useful mathematical tool for nonlinear equations [1].
Power series are an essential method for solving many problems in mathematics, for example in algebra and differential equations [2]. Many problems in algebra also involve power series, such as solving polynomial homotopy equations [3]. In algebraic geometry one studies rings, ring extensions, and ideals, which have recently appeared in series form. An example comes from commutative rings. Let R be a commutative ring with identity, and let R[x] and R[[x]] be the collections of polynomials and of power series, respectively, with coefficients in R. Multiplications on these collections arise from a class of sequences $\lambda = \{\lambda_n\}$ of positive integers, each in one-to-one correspondence with its power series [4]. Another branch of mathematics in which power series have recently played a role is differential equations [5]. The problem of finding formal power series solutions of differential equations has a long history and has been extensively studied in the literature. The power series method is a systematic, standard method for approximating the solutions of such differential equations analytically, so studying it is of great importance [6]. Ordinary power series also appear as solutions of fractional differential equations: as Angstmann and Henry report, generalized series expansions involving integer powers and fractional powers in the independent variable have recently been shown to provide solutions to certain linear fractional order differential equations [7]. This result is related to the work of I. Area and J. Nieto, who found a solution of the fractional logistic equation in power series form [8].
This research develops a new theory of ordinary power series, in particular an alternative method that analyzes an ordinary power series by regarding it as a transformation, called the Ops transformation. This transformation simplifies calculations in equations involving sigma notation: it reduces the manipulation of sums and replaces it with the algebra of linear transformations. The aim of this research is to generalize the concept of the Ops transformation to power series of the form (1). We call the result the c-type Ops transformation; the letter c indicates the center of the power series.

Methods

We proceed as follows. First, we define a more general form of the Ops transformation. Next, we derive some basic results from the new definition and use them to extend the theory of the Ops transformation. We then verify that the previous definition of the Ops transformation is a special case of the new one, and prove several theorems on the properties of the new transformation. Finally, we confirm that the new definition applies more widely than the previous one.

Results and Discussion

First we define the c-type Ops transformation. Recall that a power series centered at c is a real-valued function of the form [9]

\[ \sum_{n=0}^{\infty} a_n (x-c)^n. \tag{1} \]

The value of the series depends on the chosen value of x. For some values of x the series diverges, and for others it converges [10]. The set of x for which (1) converges is called the convergence interval. The term "interval" makes sense because this set always forms an interval. Half the length of this interval is called the convergence radius.
A function f that is smooth at a point c may be represented by a power series centered at c, in the sense that for every $x_0$ in the convergence interval the two agree, i.e. $f(x_0) = \sum_{n=0}^{\infty} a_n (x_0 - c)^n$. Such functions are called "real analytic" functions. The method for producing such a series is due to Taylor, and the result is called the Taylor series [11]

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x-c)^n. \tag{2} \]

The special case c = 0 is called the Maclaurin series. Lazwardi (2021) reformulated such series in a simpler form called the Ops transformation, defined by

\[ \operatorname{Ops}(\{a_n\})(x) = \sum_{n=0}^{\infty} a_n x^n. \tag{3} \]

Therefore, for a Maclaurin series of the form

\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n \tag{4} \]

it is enough to write $\operatorname{Ops}\left\{ \frac{f^{(n)}(0)}{n!} \right\}$. This simplification makes calculations and manipulations easier [12]. The Ops transformation has the following properties.

Theorem 1. (Shifting-entry) For each sequence $\{a_n\}$, we have

\[ \operatorname{Ops}\{0, a_0, a_1, \ldots\} = x \operatorname{Ops}\{a_n\}. \tag{5} \]

Theorem 2. For each sequence $\{a_n\}$, we have

\[ \operatorname{Ops}(\{a_n\}) - a_0 = x \operatorname{Ops}(\{a_{n+1}\}). \tag{6} \]

Besides the two theorems above, the Ops transformation inherits linearity from sigma notation.

Theorem 3. For all sequences $\{a_n\}, \{b_n\}$ and any real numbers $\alpha, \beta$, we have

\[ \operatorname{Ops}(\alpha\{a_n\} + \beta\{b_n\}) = \alpha \operatorname{Ops}\{a_n\} + \beta \operatorname{Ops}\{b_n\}. \tag{7} \]

The last theorem tells us that the Ops transformation can be viewed as a linear transformation that maps real sequences to real-valued functions defined within the radius of convergence. We shall use this fact to simplify several calculations. In addition, Lazwardi proved the following formula for the product of two Ops transformations.

Theorem 4. For all sequences $\{a_n\}, \{b_n\}$, we have

\[ \operatorname{Ops}\{a_n\} \operatorname{Ops}\{b_n\} = \operatorname{Ops}\left\{ \sum_{k=0}^{n} a_k b_{n-k} \right\}. \tag{8} \]

This is the usual product of two power series, written without double sigma notation.
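The product rule in Theorem 4 can be checked numerically on truncated coefficient lists. The following is a minimal sketch, not part of the original paper: the function names `ops_product` and `ops_eval` are hypothetical, sequences are truncated to finitely many terms, and the constant sequence {1} is chosen only as an illustration.

```python
def ops_product(a, b):
    """Coefficients of Ops{a_n} * Ops{b_n} (Theorem 4), truncated.

    Entry n of the result is sum_{k=0}^{n} a_k * b_{n-k}.
    Only the first min(len(a), len(b)) entries are exact.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]


def ops_eval(coeffs, x):
    """Partial sum of Ops{a_n}(x) = sum_n a_n x^n, using Horner's rule."""
    total = 0.0
    for a_n in reversed(coeffs):
        total = total * x + a_n
    return total


ones = [1] * 40                      # truncated sequence {1}
square = ops_product(ones, ones)     # convolution of {1} with itself

x = 0.1                              # inside the convergence interval |x| < 1
lhs = ops_eval(ones, x) ** 2         # (Ops{1})^2 evaluated directly
rhs = ops_eval(square, x)            # Ops of the convolved sequence
```

Here `square` starts with 1, 2, 3, ..., and `lhs` and `rhs` agree up to truncation and rounding error, as the theorem predicts for points inside the convergence interval.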
Another important result of the previous research uses the fact that a power series can be differentiated term by term any number of times within its radius of convergence to construct a differentiation rule for the Ops transformation. To discuss differentiation of the Ops transformation we need a symbol for its derivative. We write $D_x \operatorname{Ops}\{a_n\}$ for the derivative of the Ops transformation within its radius of convergence. We then have the following theorems.

Theorem 5. For each sequence $\{a_n\}$, we have

\[ D_x \operatorname{Ops}(\{a_n\}) = \operatorname{Ops}(\{(n+1) a_{n+1}\}). \tag{9} \]

A useful modification of this formula is the following.

Theorem 6. For each sequence $\{a_n\}$, we have

\[ x D_x \operatorname{Ops}(\{a_n\}) = \operatorname{Ops}(\{n a_n\}). \tag{10} \]

Compare (1) with (3): in (1) the value of c may vary and can be regarded as a variable. Hence at least four variables are involved in calculations with (1), which makes it more complicated than (3), especially for the special type of ordinary power series called the Taylor series. As reported by Salwa et al., many infinite series forms have recently appeared in sequence spaces, especially the $\beta$-dual sequence spaces, which are defined through infinite series [13]. One popular application of Taylor series is the iterative differential transform method, which has been used for some time by "traditional" Taylor series method users, who have developed the method further. Suppose that we have a power series of the form (1) centered at c.

Definition 1. Let c be a real number and suppose the power series $\sum a_n (x-c)^n$ has positive convergence radius around c. Define the c-type Ops transformation by

\[ \operatorname{Ops}^c\{a_n\}(x) = \sum_{n=0}^{\infty} a_n (x-c)^n. \tag{11} \]

It looks similar to the previous form, with an additional superscript c. The superscript c on Ops acts as an index that records the center c on the right-hand side.
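Definition 1 can be illustrated with partial sums. The following is a minimal sketch under stated assumptions: the function name `ops_c`, the truncation length, and the sample points are hypothetical choices, and the closed forms used for comparison (the geometric series and the exponential series in the shifted variable x - c) are standard facts rather than results quoted from this section.

```python
import math


def ops_c(coeffs, c, x):
    """Partial sum of Ops^c{a_n}(x) = sum_n a_n (x - c)^n (Horner form)."""
    y = x - c                       # shift to the center c
    total = 0.0
    for a_n in reversed(coeffs):
        total = total * y + a_n
    return total


N = 60                              # truncation length (assumption)
geometric = [1.0] * N               # the sequence {1}
exponential = [1.0 / math.factorial(n) for n in range(N)]   # {1/n!}

# Center c = 1, evaluated at x = 1.3, so |x - c| = 0.3 < 1.
geo_value = ops_c(geometric, c=1.0, x=1.3)      # close to 1 / (1 - 0.3)
exp_value = ops_c(exponential, c=2.0, x=2.5)    # close to e^{0.5}

# Taking c = 0 recovers the ordinary Ops transformation.
plain_value = ops_c(geometric, c=0.0, x=0.3)    # same number as geo_value
```

Changing only the superscript c shifts the center while the coefficient sequence stays the same, which is why `geo_value` and `plain_value` coincide.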
We keep c as an upper index because the lower index is reserved for another use in future research. The form closest to our definition is the Taylor series of a function that is analytic at c. If $f(x)$ is an analytic function near c, then on some neighborhood of c we can write f as the Taylor series

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x-c)^n. \tag{12} \]

Hence, the Taylor series of f can be written as a c-type Ops transformation,

\[ f(x) = \operatorname{Ops}^c\left\{ \frac{f^{(n)}(c)}{n!} \right\}(x). \tag{13} \]

For brevity, we simply write $f = \operatorname{Ops}^c\left\{ \frac{f^{(n)}(c)}{n!} \right\}$. Note that the upper index c is regarded as a variable (not necessarily fixed). We can write

\[ \operatorname{Ops}^a\left\{ \frac{f^{(n)}(c)}{n!} \right\} = \sum_{n=0}^{\infty} \frac{f^{(n)}(c)}{n!} (x-a)^n, \tag{14} \]

i.e. when the upper index c is replaced by a, the coefficients $\frac{f^{(n)}(c)}{n!}$ do not change, but the center of the power series on the right-hand side changes to a. Clearly, the Ops transformation is the special case c = 0 of the c-type Ops transformation [14], i.e.

\[ \operatorname{Ops}^0\{a_n\} = \operatorname{Ops}\{a_n\}. \tag{15} \]

We now discuss some examples.

Example 1. $\operatorname{Ops}^c\{1\} = \frac{1}{1-(x-c)}$.

Proof: Observe that $\operatorname{Ops}^c\{1\} = \sum (x-c)^n$. Setting $y = x - c$, the right-hand side becomes $\sum y^n = \frac{1}{1-y}$ for $|y| < 1$. Hence for $|x-c| < 1$, i.e. $c-1 < x < c+1$, the series $\sum (x-c)^n$ converges and $\operatorname{Ops}^c\{1\} = \frac{1}{1-(x-c)}$.

Here is another example.

Example 2. $\operatorname{Ops}^c\left\{ \frac{1}{n!} \right\} = e^{x-c}$.

Now we analyze further properties of the c-type Ops transformation. First, the "shifting-entry" property carries over from the previous form [15].

Theorem 7. (Shifting-entry) For each sequence $\{a_n\}$, we have

\[ \operatorname{Ops}^c\{0, a_0, a_1, \ldots\} = (x-c) \operatorname{Ops}^c\{a_n\}. \tag{16} \]

Proof: Observe that

\begin{align*}
\operatorname{Ops}^c\{0, a_0, a_1, \ldots\} &= 0 + \sum_{n=1}^{\infty} a_{n-1} (x-c)^n \\
&= \sum_{n=0}^{\infty} a_n (x-c)^{n+1} \\
&= (x-c) \sum_{n=0}^{\infty} a_n (x-c)^n \\
&= (x-c) \operatorname{Ops}^c\{a_n\}.
\end{align*}

By induction we obtain the following corollary.

Corollary 1.
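For each sequence $\{a_n\}$ and each positive integer k, we have

\[ \operatorname{Ops}^c\{\underbrace{0, 0, \ldots, 0}_{k\ \text{entries}}, a_0, a_1, \ldots\} = (x-c)^k \operatorname{Ops}^c\{a_n\}. \tag{17} \]

Proof: For k = 1 the statement is Theorem 7. Assume the statement is true for k, i.e. (17) holds. For k + 1, we have

\begin{align*}
\operatorname{Ops}^c\{\underbrace{0, 0, \ldots, 0}_{k+1\ \text{entries}}, a_0, a_1, \ldots\} &= (x-c) \operatorname{Ops}^c\{\underbrace{0, 0, \ldots, 0}_{k\ \text{entries}}, a_0, a_1, \ldots\} \\
&= (x-c)\left( (x-c)^k \operatorname{Ops}^c\{a_n\} \right) \\
&= (x-c)^{k+1} \operatorname{Ops}^c\{a_n\}.
\end{align*}

Theorem 8. For each sequence $\{a_n\}$, we have

\[ \operatorname{Ops}^c(\{a_n\}) - a_0 = (x-c) \operatorname{Ops}^c\{a_{n+1}\}. \tag{18} \]

Proof: Observe that

\begin{align*}
\operatorname{Ops}^c(\{a_n\}) - a_0 &= \sum_{n=1}^{\infty} a_n (x-c)^n \\
&= \sum_{n=0}^{\infty} a_{n+1} (x-c)^{n+1} \\
&= (x-c) \sum_{n=0}^{\infty} a_{n+1} (x-c)^n \\
&= (x-c) \operatorname{Ops}^c\{a_{n+1}\}.
\end{align*}

The linearity property of the Ops transformation is also preserved [16].

Theorem 9. For all sequences $\{a_n\}, \{b_n\}$ and any real numbers $\alpha, \beta$, we have

\[ \operatorname{Ops}^c(\alpha\{a_n\} + \beta\{b_n\}) = \alpha \operatorname{Ops}^c\{a_n\} + \beta \operatorname{Ops}^c\{b_n\}. \tag{19} \]

Proof: Observe that

\begin{align*}
\operatorname{Ops}^c(\alpha\{a_n\} + \beta\{b_n\}) &= \sum_{n=0}^{\infty} (\alpha a_n + \beta b_n)(x-c)^n \\
&= \alpha \sum_{n=0}^{\infty} a_n (x-c)^n + \beta \sum_{n=0}^{\infty} b_n (x-c)^n \\
&= \alpha \operatorname{Ops}^c\{a_n\} + \beta \operatorname{Ops}^c\{b_n\}.
\end{align*}

Although the c-type Ops transformation is linear in the sequence, it is unfortunately not linear in the upper index; in general,

\[ \operatorname{Ops}^{\alpha c + \beta d}\{a_n\} \neq \operatorname{Ops}^{\alpha c}\{a_n\} + \operatorname{Ops}^{\beta d}\{a_n\}. \]

Next we consider the c-type Ops transformation of a product of two power series. It works as in the previous result.

Theorem 10. For all sequences $\{a_n\}, \{b_n\}$, we have

\[ \operatorname{Ops}^c\{a_n\} \operatorname{Ops}^c\{b_n\} = \operatorname{Ops}^c\left\{ \sum_{k=0}^{n} a_k b_{n-k} \right\}. \tag{20} \]

Proof: Let $\{a_n\}, \{b_n\}$ be any two sequences. We have

\begin{align*}
\operatorname{Ops}^c\{a_n\} \operatorname{Ops}^c\{b_n\} &= \left( a_0 + a_1 (x-c) + a_2 (x-c)^2 + \cdots \right)\left( b_0 + b_1 (x-c) + b_2 (x-c)^2 + \cdots \right) \\
&= a_0 \left( b_0 + b_1 (x-c) + \cdots \right) + a_1 (x-c) \left( b_0 + b_1 (x-c) + \cdots \right) + \cdots \\
&= a_0 b_0 + a_0 b_1 (x-c) + a_1 b_0 (x-c) + a_0 b_2 (x-c)^2 + a_1 b_1 (x-c)^2 + a_2 b_0 (x-c)^2 + \cdots \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0)(x-c) + (a_0 b_2 + a_1 b_1 + a_2 b_0)(x-c)^2 + \cdots \\
&= \sum_{n=0}^{\infty} \left( \sum_{k=0}^{n} a_k b_{n-k} \right) (x-c)^n \\
&= \operatorname{Ops}^c\left\{ \sum_{k=0}^{n} a_k b_{n-k} \right\}.
\end{align*}

From the theorem above we easily obtain the following fact.

Example 3. $\left( \operatorname{Ops}^c\{1\} \right)^2 = \operatorname{Ops}^c\{n+1\}$.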
Translating into sigma notation, we get

\[ \sum_{n=0}^{\infty} (n+1)(x-c)^n = \left( \frac{1}{1-(x-c)} \right)^2. \]

Together with Example 1, this shows how c-type Ops transformations help us calculate or manipulate power series. Here is another example.

Example 4. $\left( \operatorname{Ops}^c\left\{ \frac{1}{n!} \right\} \right)^2 = \operatorname{Ops}^c\left\{ \sum_{k=0}^{n} \frac{1}{k!(n-k)!} \right\}$.

By Example 2 the left-hand side equals $e^{2(x-c)}$. Therefore we conclude that

\[ \sum_{n=0}^{\infty} \sum_{k=0}^{n} \left( \frac{1}{k!(n-k)!} \right) (x-c)^n = e^{2x-2c}. \]

Multiplying both sides by $e^{2c}$ and taking square roots gives another form of $e^x$:

\[ e^x = \left( \sum_{n=0}^{\infty} \sum_{k=0}^{n} \left( \frac{e^{2c}}{k!(n-k)!} \right) (x-c)^n \right)^{1/2}. \]

Finally, we consider the derivatives of the c-type Ops transformation. We again write $D_x \operatorname{Ops}^c\{a_n\}$ for the derivative of the transformation within its radius of convergence. Since $x - c$ has the same derivative as x itself [17], the derivative formula for the previous Ops transformation carries over as follows.

Theorem 11. For each sequence $\{a_n\}$, we have

\[ D_x \operatorname{Ops}^c(\{a_n\}) = \operatorname{Ops}^c(\{(n+1) a_{n+1}\}). \tag{21} \]

Proof: Observe that

\begin{align*}
D_x \operatorname{Ops}^c(\{a_n\}) &= \frac{d}{dx} \sum_{n=0}^{\infty} a_n (x-c)^n \\
&= \sum_{n=1}^{\infty} n a_n (x-c)^{n-1} \\
&= \sum_{n=0}^{\infty} (n+1) a_{n+1} (x-c)^n \\
&= \operatorname{Ops}^c\{(n+1) a_{n+1}\}.
\end{align*}

The next modification follows immediately.

Theorem 12. For each sequence $\{a_n\}$, we have

\[ (x-c) D_x \operatorname{Ops}^c(\{a_n\}) = \operatorname{Ops}^c\{n a_n\}. \tag{22} \]

Proof: Observe that

\begin{align*}
(x-c) D_x \operatorname{Ops}^c(\{a_n\}) &= (x-c) \operatorname{Ops}^c\{(n+1) a_{n+1}\} \\
&= (x-c) \sum_{n=0}^{\infty} (n+1) a_{n+1} (x-c)^n \\
&= \sum_{n=0}^{\infty} (n+1) a_{n+1} (x-c)^{n+1} \\
&= \sum_{n=0}^{\infty} n a_n (x-c)^n \\
&= \operatorname{Ops}^c\{n a_n\}.
\end{align*}

From the last two theorems, we obtain by induction the following two corollaries.

Corollary 2. For each sequence $\{a_n\}$, we have

\[ D_x^k \operatorname{Ops}^c\{a_n\} = \operatorname{Ops}^c\left\{ \frac{(n+k)!}{n!} a_{n+k} \right\}, \tag{23} \]

where $D_x^k$ is the kth derivative of the Ops transformation [15].

Corollary 3. For each sequence $\{a_n\}$, we have

\[ \operatorname{Ops}^c\{n^k a_n\} = \underbrace{(x-c) D_x \left( (x-c) D_x \left( \cdots (x-c) D_x \operatorname{Ops}^c\{a_n\} \right) \right)}_{k\ \text{times}}. \tag{24} \]

At the end of this research, we apply our transformation to solve an ordinary differential equation with variable coefficients.
A solution of this kind exists by [18]. Consider, for example, the equation

\[ y'' - 2(x-1) y' + 2y = 0. \tag{25} \]

We find the solution of (25) near c = 1 as follows.

Step 1: Assume the solution has the form $y = \sum_{n=0}^{\infty} C_n (x-1)^n$.

Step 2: Transform (25) into an Ops transformation equation:

\[ D_x^2 \operatorname{Ops}^1\{C_n\} - 2(x-1) D_x \operatorname{Ops}^1\{C_n\} + 2 \operatorname{Ops}^1\{C_n\} = \operatorname{Ops}^1\{0\}. \]

Step 3: Solve the equation:

\begin{align*}
D_x^2 \operatorname{Ops}^1\{C_n\} - 2(x-1) D_x \operatorname{Ops}^1\{C_n\} + 2 \operatorname{Ops}^1\{C_n\}
&= \operatorname{Ops}^1\{(n+2)(n+1) C_{n+2}\} - \operatorname{Ops}^1\{2n C_n\} + \operatorname{Ops}^1\{2 C_n\} \\
&= \operatorname{Ops}^1\{(n+2)(n+1) C_{n+2} - 2n C_n + 2 C_n\} \\
&= \operatorname{Ops}^1\{0\}.
\end{align*}

Step 4: Remove the Ops transformation from the equation. We have $(n+2)(n+1) C_{n+2} = (2n-2) C_n$ for n = 0, 1, 2, ...

Step 5: Evaluating n one by one, we obtain the solution

\[ y = C_0 \left( 1 - (x-1)^2 - \frac{1}{6}(x-1)^4 + \cdots \right) + C_1 (x-1). \]

Conclusions

Based on the discussion above, the c-type Ops transformation is a generalization of the ordinary Ops transformation with the additional upper index c. The properties of the ordinary Ops transformation still hold for the c-type Ops transformation. The c-type Ops transformation can also be applied to solve ordinary differential equations with variable coefficients.

References

[1] h. ji he and f. yu ji, “taylor series solution for lane-emden equation,” j. math. chem., vol. 57, no. 1, pp. 1932–1934, 2019.
[2] i. esuabana and e. j. okon, “power series solutions of second order ordinary differential equation using frobenius method,” j. res. appl. math., no. november, pp. 44–50, 2021.
[3] n. bliss and j. verschelde, “the method of gauss–newton to compute power series solutions of polynomial homotopies,” linear algebra appl., vol. 542, no. 1440534, pp. 569–588, 2018, doi: 10.1016/j.laa.2017.10.022.
[4] w. gyu chang and p. tan thoan, “polynomial and power series ring extensions from sequences,” j. algebr. its appl., vol. 21, no. 3, pp. 78–86, 2022.
[5] s. falkensteiner and j. r.
sendra, “formal power series solutions of first order autonomous algebraic ordinary differential equations,” pp. 1–2, 2018.
[6] m. field, t. michael, r. e. coren, a. masce, and m. iaeng, “international journal for innovative research in solving ordinary differential equations,” internatioal j. innov. res. multidiscip. f., vol. 2, no. 7, pp. 10–19, 2016.
[7] c. n. angstmann and b. i. henry, “generalized fractional power series solutions for fractional differential equations,” appl. math. lett., vol. 102, p. 106107, 2020, doi: 10.1016/j.aml.2019.106107.
[8] i. area and j. j. nieto, “power series solution of the fractional logistic equation,” phys. a stat. mech. its appl., vol. 573, p. 125947, 2021, doi: 10.1016/j.physa.2021.125947.
[9] m. zabrocky, an introduction to ordinary generating functions. new york: york university, 2015.
[10] u. al khawaja and q. m. al-mdallal, “convergent power series of sech(x) and solutions to nonlinear differential equations,” int. j. differ. equations, vol. 2018, no. 1, pp. 1–10, 2018, doi: 10.1155/2018/6043936.
[11] a. aslam, e. machisi, a. h. syofra, r. permatasari, and l. a. nazara, “the frobenius method for solving ordinary differential equation with coefficient variable,” int. j. sci. res., vol. 5, no. 7, pp. 2233–2235, 2016, doi: 10.21275/v5i7.art2016719.
[12] a. lazwardi, “ops transformation,” j. math. probl. equations stat., vol. 2, no. 1, pp. 75–81, 2021, doi: 10.22271/math.2021.v2.i1a.37.
[13] s. salwa, q. aini, and n. w. switrayni, “beta-dual dari ruang barisan n-delta lamda infinity, n-delta lamda dan n-delta lamda 0,” j. mat., vol. 11, no. 2, pp. 119–124, 2021, doi: 10.24843/jmat.2021.v11.i02.p141.
[14] h. s. wilf, generating functionology, vol. 31, no. 09. pensylvania: academic press, 1994. doi: 10.5860/choice.31-4969.
[15] s. k. lando, lectures on generating functions, vol. 23. united states of america: american mathematica society, 2002.
[16] y. bo, w. cai, and y.
wang, “a note on the generating function method,” adv. appl. math. mech., vol. 13, no. 4, pp. 982–1004, 2021, doi: 10.4208/aamm.oa-20200286.
[17] n. u. khan, t. usman, and j. choi, “certain generating function of hermitebernoulli-laguerre polynomials,” far east j. math. sci., vol. 101, no. 4, pp. 893–908, 2017, doi: 10.17654/ms101040893.
[18] s. falkensteiner, y. zhang, and n. thieou, “on existence and uniqueness of formal power series solutions of algebraic ordinary differential equations,” mediterranian j. math., vol. 19, no. 2, pp. 95–114, 2022, doi: 10.3389/fphy.2021.795693.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(3) (2022), Pages 362-369
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: January 04, 2022. Reviewed: July 23, 2022. Accepted: August 20, 2022.
DOI: http://dx.doi.org/10.18860/ca.v7i3.14999

Actuarial Modeling of COVID-19 Insurance

Mila Kurniawaty*, Maulana Muhamad Arifin, Bagus Kurniawan, Sadam Laksamana Sukarno, Muhammad Teguh Prayoga
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Universitas Brawijaya, Malang, Indonesia
Email: mila_n12@ub.ac.id

Abstract

The coronavirus disease (COVID-19) has spread to almost all countries in the world, causing economic and financial crises. Many researchers are interested in studying infectious diseases, especially dynamical models of COVID-19. Peng et al. (2020) studied the generalized SEIR (susceptible-exposed-infected-recovered) model of COVID-19. We are interested in developing their results to make financial arrangements. In this article, we provide an actuarial model of COVID-19 insurance based on the generalized SEIR model. We construct dynamical models of premium and benefit payments based on the generalized SEIR model. From this dynamical model, we formulate the premium and the premium reserves for the hospitalization and death benefits of COVID-19 insurance using the equivalence principle.
This actuarial model is expected to help financial arrangements cover losses due to the COVID-19 outbreak.

Keywords: premium; premium reserves; generalized SEIR; hospitalization benefit; death benefit

Introduction

The novel coronavirus pneumonia (COVID-19), caused by the virus previously called 2019-nCoV and now named SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), first appeared in Wuhan in December 2019 and then spread rapidly throughout China [1]. According to Worldometer data (2021), COVID-19 cases have spread to almost all countries in the world [2]. This has had a huge impact on the world economy and has put financial institutions in crisis [3]. Arfah et al. [4] studied a new strategy for solving the problem of the global financial crisis from the sharia perspective. From a financial point of view, a well-designed health care system can reduce the financial impact of a sudden pandemic outbreak, such as soaring costs for medical care, hospital infrastructure, medical equipment, vaccination, and quarantine. An insurance program is then expected to cover financial losses arising from disruptions in the operation of regular businesses. By applying mathematical and actuarial techniques to model and measure financial risk, actuaries are expected to expand their expertise and tackle epidemics in the health care system. Mathematical modeling has been widely developed and analyzed as a basis for determining insurance premiums (see [5], [6], [7]). Feng and Garrido [8] used an epidemic model to make financial arrangements. They used the SIR (susceptible-infected-removed) model to study infectious diseases. Class S denotes the group of individuals susceptible to a certain disease or virus. Class I denotes the group of individuals who are infected and capable of transmitting the disease.
Individuals removed from the epidemic through death or through recovery after medical treatment are classified in class R. The compartments of the dynamic model in [8] are shown in Figure 1.

Figure 1. Dynamical model of the SIR premium and benefit payment [8]

The outbreak of COVID-19 has attracted researchers' interest in studying infectious diseases. Several studies have examined SIR models of the COVID-19 epidemic (see [9] and [10]). However, as the outbreak developed, another factor was found to influence the spread of COVID-19, namely exposed individuals, as in [11] and [12]. Peng et al. [13] and Aldila et al. [14] added further classes to the COVID-19 epidemic model. To characterize the outbreak of COVID-19 in Wuhan, Peng et al. [13] generalized the classical SEIR (susceptible-exposed-infected-removed) model by introducing seven classes, $\{S(t), P(t), E(t), I(t), Q(t), R(t), D(t)\}$, which represent the numbers of susceptible, insusceptible, exposed, infective, quarantined, recovered, and death cases, respectively, at time $t$. The epidemic model for COVID-19 of [13] is shown in Figure 2.

Figure 2. Dynamical model of the generalized SEIR for COVID-19 [13]

The total population, which is the sum of all classes, is assumed constant, and the coefficients $\alpha$, $\beta$, $\gamma^{-1}$, $\theta^{-1}$, $\lambda(t)$, $\kappa(t)$ represent the protection rate, infection rate, average latent time, average quarantine time, cure rate, and mortality rate, respectively. In this paper, the dynamical model of Peng et al. [13] is extended to carry out the actuarial calculations for COVID-19 insurance. In particular, our result improves the previous work of Feng and Garrido [8].
First we construct the dynamical model of premium and benefit payments, and then we use classical actuarial calculations to determine the actuarial present value of the benefit payments and premium payments, as well as the premium reserves (see [15]-[20]).

Methods

This research proceeds in several steps. First, Figure 2 is adapted to the actuarial setting by adding premium payments and benefit payments. Premiums are paid by the population in classes $S$, $E$, $I$, and $P$. The population in class $Q$ receives the hospitalization benefit, whereas the population in class $D$ receives the death benefit. From the new figure we construct the system of ordinary differential equations of the dynamical model. Based on the dynamical model, we construct the premium rate and the premium reserve using the equivalence principle.

Results and Discussion

Dynamical Model of Premium and Benefits

In this section, the compartment model of Peng et al. [13] is extended to a dynamical model of premium payments by COVID-19 policyholders and benefit payments by insurance companies. The compartments of the dynamical model are shown in Figure 3.

Figure 3. The dynamical model of premium and benefit payments on generalized SEIR

In this case the policyholder is assumed to leave the insurance plan after recovery.
Following [13], the compartments of the generalized SEIR model are described by the following system of ordinary differential equations:

\begin{align}
\frac{dS(t)}{dt} &= -\alpha S(t) - \beta \frac{S(t) I(t)}{N} \tag{1} \\
\frac{dE(t)}{dt} &= -\gamma E(t) + \beta \frac{S(t) I(t)}{N} \tag{2} \\
\frac{dI(t)}{dt} &= \gamma E(t) - \theta I(t) \tag{3} \\
\frac{dQ(t)}{dt} &= \theta I(t) - \lambda(t) Q(t) - \kappa(t) Q(t) \tag{4} \\
\frac{dR(t)}{dt} &= \lambda(t) Q(t) \tag{5} \\
\frac{dD(t)}{dt} &= \kappa(t) Q(t) \tag{6} \\
\frac{dP(t)}{dt} &= \alpha S(t) \tag{7}
\end{align}

with given initial values $S(0) = S_0$, $E(0) = E_0$, $I(0) = I_0$, $Q(0) = Q_0$, $R(0) = R_0$, $D(0) = D_0$, $P(0) = P_0$, and $S_0 + E_0 + I_0 + Q_0 + R_0 + D_0 + P_0 = N$. In the actuarial approach, the probability of being in each class is defined as the ratio of the class size to the total population. We therefore introduce the deterministic functions $s(t)$, $e(t)$, $i(t)$, $q(t)$, $r(t)$, $d(t)$, and $p(t)$, which represent the fractions of the population in classes $S$, $E$, $I$, $Q$, $R$, $D$, and $P$, respectively. Dividing equations (1)-(7) by the constant total population size $N$, we have

\begin{align}
s'(t) &= -\alpha s(t) - \beta s(t) i(t), & t \geq 0 \tag{8} \\
e'(t) &= -\gamma e(t) + \beta s(t) i(t), & t \geq 0 \tag{9} \\
i'(t) &= \gamma e(t) - \theta i(t), & t \geq 0 \tag{10} \\
q'(t) &= \theta i(t) - \lambda(t) q(t) - \kappa(t) q(t), & t \geq 0 \tag{11} \\
r'(t) &= \lambda(t) q(t), & t \geq 0 \tag{12} \\
d'(t) &= \kappa(t) q(t), & t \geq 0 \tag{13} \\
p'(t) &= \alpha s(t), & t \geq 0 \tag{14} \\
s(t) + e(t) + i(t) + q(t) + r(t) + d(t) + p(t) &= 1, & t \geq 0 \tag{15}
\end{align}

with given initial values $s(0) = s_0$, $e(0) = e_0$, $i(0) = i_0$, $q(0) = q_0$, $r(0) = r_0$, $d(0) = d_0$, $p(0) = p_0$, and $s_0 + e_0 + i_0 + p_0 = 1$.

Premium and Benefit Payments

We assume that the premiums of an infectious disease insurance plan are paid in the form of continuous annuities by the susceptible, insusceptible, infected, and exposed classes. That is, policyholders are committed to paying premiums continuously as long as they remain in the susceptible, insusceptible, infected, or exposed classes.
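The fractional system (8)-(14) has no closed-form solution in general, so in practice it would be integrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes constant cure and mortality rates (the paper allows time-dependent $\lambda(t)$ and $\kappa(t)$), uses a classical fourth-order Runge-Kutta step, and all parameter values, initial fractions, and function names (`seir_rhs`, `rk4_step`, `simulate`) are hypothetical.

```python
def seir_rhs(state, alpha, beta, gamma, theta, lam, kappa):
    """Right-hand sides of equations (8)-(14) for the class fractions."""
    s, e, i, q, r, d, p = state
    new_exposed = beta * s * i
    return (
        -alpha * s - new_exposed,        # s'(t), equation (8)
        new_exposed - gamma * e,         # e'(t), equation (9)
        gamma * e - theta * i,           # i'(t), equation (10)
        theta * i - (lam + kappa) * q,   # q'(t), equation (11)
        lam * q,                         # r'(t), equation (12)
        kappa * q,                       # d'(t), equation (13)
        alpha * s,                       # p'(t), equation (14)
    )


def rk4_step(state, dt, rates):
    """One classical Runge-Kutta (RK4) step of size dt."""
    def shifted(k, h):
        return tuple(x + h * kx for x, kx in zip(state, k))

    k1 = seir_rhs(state, **rates)
    k2 = seir_rhs(shifted(k1, dt / 2), **rates)
    k3 = seir_rhs(shifted(k2, dt / 2), **rates)
    k4 = seir_rhs(shifted(k3, dt), **rates)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state0, t_end, dt, rates):
    """Return the list of states at times 0, dt, 2*dt, ..., t_end."""
    path = [state0]
    for _ in range(round(t_end / dt)):
        path.append(rk4_step(path[-1], dt, rates))
    return path


# Hypothetical constant rates and initial fractions (s, e, i, q, r, d, p).
rates = dict(alpha=0.05, beta=1.0, gamma=0.2, theta=0.1, lam=0.05, kappa=0.01)
state0 = (0.99, 0.005, 0.005, 0.0, 0.0, 0.0, 0.0)
path = simulate(state0, t_end=100.0, dt=0.1, rates=rates)
final = path[-1]
```

Because the right-hand sides of (8)-(14) sum to zero, the fractions in `final` still add up to one, which is a quick consistency check of equation (15).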
In return, the insurance company pays benefits when the policyholder is quarantined or dies. Once the individual dies, or recovers after being quarantined in hospital, the plan terminates immediately. Using international actuarial notation as in [8], the actuarial present value (APV) of each class for a $t$-year period is denoted by $\bar{a}^{s}_{\overline{t}|}$, $\bar{a}^{e}_{\overline{t}|}$, $\bar{a}^{i}_{\overline{t}|}$, $\bar{a}^{q}_{\overline{t}|}$, $\bar{a}^{r}_{\overline{t}|}$, $\bar{a}^{d}_{\overline{t}|}$, and $\bar{a}^{p}_{\overline{t}|}$. To evaluate the annuity, we take the present value of a payment due at time $t$, which is the discounted value of one monetary unit for a basic annuity, multiply it by the probability of making that payment, and then integrate over all payment times. Detailed evaluations of annuities can be found in [17] and [18].

By Figure 3, there are two benefits: the hospitalization benefit and the death benefit. The hospitalization benefit is paid to quarantined individuals, and the death benefit is paid for individuals who die. Medical and hospitalization expenses are continuously reimbursed for each quarantined policyholder during the whole period of treatment. Hence, the total discounted value of a $t$-year annuity of hospitalization payments is

\[ \bar{a}^{q}_{\overline{t}|} = \int_0^t e^{-\delta x} q(x) \, dx, \quad \delta > 0, \tag{16} \]

where $\delta$ is the force of interest used for discounting. Alternatively, when the policyholder is diagnosed with the infectious disease and hospitalized immediately, the medical expenses are paid immediately in a lump sum. Since the obligation is then fulfilled, the insurance plan terminates. In actuarial mathematics, a lump-sum payment can be treated analogously to whole life insurance. The APV of the hospitalization benefit with lump-sum payment, denoted by $\bar{A}^{q}_{\overline{t}|}$, is defined by

\[ \bar{A}^{q}_{\overline{t}|} = \int_0^t e^{-\delta x} \theta i(x) \, dx = \theta \bar{a}^{i}_{\overline{t}|}, \tag{17} \]

since $\theta i(t)$ is the probability of being newly quarantined at time $t$.
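Equations (16) and (17) are discounted integrals, so they can be approximated from sampled trajectories of $q(x)$ and $i(x)$. The following is a minimal sketch under stated assumptions: it uses the trapezoidal rule, the function name `apv` is hypothetical, and the infected fraction $i(x) = 0.02 e^{-0.3x}$ together with the values of `delta` and `theta` are purely illustrative, not taken from the paper.

```python
import math


def apv(values, dt, delta):
    """Trapezoidal approximation of int_0^t e^{-delta x} g(x) dx.

    `values` holds the samples g(0), g(dt), g(2*dt), ..., g(t).
    """
    total = 0.0
    for k in range(len(values) - 1):
        left = math.exp(-delta * k * dt) * values[k]
        right = math.exp(-delta * (k + 1) * dt) * values[k + 1]
        total += 0.5 * dt * (left + right)
    return total


dt = 0.01          # sampling step (assumption)
delta = 0.05       # force of interest (assumption)
theta = 0.1        # quarantine rate (assumption)
times = [k * dt for k in range(1001)]                    # 0 <= x <= 10

# Hypothetical infected fraction, used only to exercise the formula.
i_path = [0.02 * math.exp(-0.3 * x) for x in times]

annuity_i = apv(i_path, dt, delta)     # integral of e^{-delta x} i(x)
lump_sum_q = theta * annuity_i         # equation (17): theta times that integral

# With g(x) = 1 the same routine returns the annuity-certain (1 - e^{-delta t}) / delta.
annuity_certain = apv([1.0] * len(times), dt, delta)
```

In an actual calculation the sampled `i_path` (and a `q_path` for equation (16)) would come from a numerical solution of the system (8)-(14) rather than from a closed-form curve.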
By the same lump-sum concept used for the hospitalization benefit, the APV of the death benefit is given by

\[ \bar{A}^{d}_{\overline{t}|} = \int_0^t e^{-\delta x} \kappa(x) q(x) \, dx, \tag{18} \]

since the probability of newly dying at time $t$ is $\kappa(t) q(t)$. As in the previous section, four classes pay the premium, so the total discounted value of a $t$-year annuity of premium payments is given by

\[ \bar{a}^{s}_{\overline{t}|} + \bar{a}^{e}_{\overline{t}|} + \bar{a}^{i}_{\overline{t}|} + \bar{a}^{p}_{\overline{t}|} = \int_0^t e^{-\delta x} \left( s(x) + e(x) + i(x) + p(x) \right) dx. \tag{19} \]

This completes the extension of the compartment model of Peng et al. [13] to a dynamical model of premium payments by COVID-19 policyholders and benefit payments by insurance companies.

Premium Rate and Premium Reserves

The policy is analyzed with an infinite term for mathematical convenience. The premium based on an infinite term can be used to estimate the cost of insurance for a relatively long policy. The equations of the previous section are therefore applied with $t$ tending to infinity.

Proposition 1. In the generalized SEIR model (8)-(11) and (14), the following equality holds:

\[ \bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|} + \bar{a}^{q}_{\overline{\infty}|} = \frac{1}{\delta} \left( 1 - \int_0^{\infty} e^{-\delta x} \left( \lambda(x) + \kappa(x) \right) q(x) \, dx \right). \tag{20} \]

Our study is based on one of the three principles in [17], commonly used in [15]-[20], namely the equivalence principle for determining the level premium:

\[ E[\text{present value of benefits}] = E[\text{present value of benefit premiums}]. \tag{21} \]

Therefore, by equations (16) and (19) and the equivalence principle with an infinite term, the level premium for a unit annuity claim payment plan of hospitalization benefit is given by

\[ \bar{P}\left( \bar{a}^{q}_{\overline{\infty}|} \right) = \frac{\bar{a}^{q}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}}. \]

By equation (20) we then have

\[ \bar{P}\left( \bar{a}^{q}_{\overline{\infty}|} \right) = \frac{\bar{a}^{q}_{\overline{\infty}|}}{\frac{1}{\delta} \left( 1 - \int_0^{\infty} e^{-\delta x} \left( \lambda(x) + \kappa(x) \right) q(x) \, dx \right) - \bar{a}^{q}_{\overline{\infty}|}}. \]

The net level premium of the hospitalization benefit with lump-sum payment for the infinite-term insurance plan is denoted by $\bar{P}\left( \bar{A}^{q}_{\overline{\infty}|} \right)$.
by equations (17) and (19)-(21) for an infinite term we then have

$$\bar{P}\big(\bar{A}^{q}_{\overline{\infty}|}\big) = \frac{\bar{A}^{q}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}} = \frac{\theta\, \bar{a}^{i}_{\overline{\infty}|}}{\dfrac{1}{\delta}\left(1 - \displaystyle\int_{0}^{\infty} e^{-\delta x}\, \big(\lambda(x) + \kappa(x)\big)\, q(x)\, dx\right) - \bar{a}^{q}_{\overline{\infty}|}}$$

in fact, the covid-19 insurance is a combination of hospitalization and death insurance due to covid-19. the net level premium of the death benefit and the hospitalization claim for the infinite term insurance plan is then

$$\bar{P}\big(\bar{a}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}\big) = \frac{\bar{a}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}} = \frac{\bar{a}^{q}_{\overline{\infty}|} + \displaystyle\int_{0}^{\infty} e^{-\delta x}\, \kappa(x)\, q(x)\, dx}{\dfrac{1}{\delta}\left(1 - \displaystyle\int_{0}^{\infty} e^{-\delta x}\, \big(\lambda(x) + \kappa(x)\big)\, q(x)\, dx\right) - \bar{a}^{q}_{\overline{\infty}|}}$$

meanwhile, the net level premium of the death benefit and the hospitalization claim with lump-sum payment for the infinite term insurance plan is given by

$$\bar{P}\big(\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}\big) = \frac{\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}} = \frac{\theta\, \bar{a}^{i}_{\overline{\infty}|} + \displaystyle\int_{0}^{\infty} e^{-\delta x}\, \kappa(x)\, q(x)\, dx}{\dfrac{1}{\delta}\left(1 - \displaystyle\int_{0}^{\infty} e^{-\delta x}\, \big(\lambda(x) + \kappa(x)\big)\, q(x)\, dx\right) - \bar{a}^{q}_{\overline{\infty}|}} \tag{22}$$

we consider the net level premium, the total premium, and the total benefit in order to obtain the premium reserves. in actuarial science, the premium reserve is very important for determining the ability of the insurance company to pay the claims of policyholders. there are several methods to determine the premium reserves. one of them is the retrospective method; the details of this method can be found in [19]. by the ordinary differential equations in [4], where $\bar{V}(t)$ denotes the accumulated premium reserve at time $t$ with lump-sum payment, we have

$$\bar{V}'(t) = \bar{P}\big(\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}\big)\big(s(t) + e(t) + p(t) + i(t)\big) - \big(\theta i(t) + \kappa(t) q(t)\big) + \delta \bar{V}(t)$$

$$= \left(\frac{\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}}\right)\big(s(t) + e(t) + p(t) + i(t)\big) - \big(\theta i(t) + \kappa(t) q(t)\big) + \delta \bar{V}(t)$$

let us define

$$f(t) = \left(\frac{\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}}\right)\big(s(t) + e(t) + p(t) + i(t)\big) - \big(\theta i(t) + \kappa(t) q(t)\big)$$

we thus have $\bar{V}'(t) - \delta \bar{V}(t) = f(t)$. multiplying both sides by the integrating factor $e^{-\delta t}$ and integrating from $0$ to $t$ yields

$$\bar{V}(t) = \left(\int_{0}^{t} e^{-\delta x} f(x)\, dx\right) e^{\delta t} + \bar{V}(0)\, e^{\delta t}$$

or equivalently,

$$\bar{V}(t) = \left(\int_{0}^{t} e^{-\delta x}\left[\left(\frac{\bar{A}^{q}_{\overline{\infty}|} + \bar{A}^{d}_{\overline{\infty}|}}{\bar{a}^{s}_{\overline{\infty}|} + \bar{a}^{e}_{\overline{\infty}|} + \bar{a}^{i}_{\overline{\infty}|} + \bar{a}^{p}_{\overline{\infty}|}}\right)\big(s(x) + e(x) + p(x) + i(x)\big) - \big(\theta i(x) + \kappa(x) q(x)\big)\right] dx\right) e^{\delta t} + \bar{V}(0)\, e^{\delta t} \tag{23}$$

from equation (23), the amount of the premium reserve at time $t$ depends on the total premium, the death benefit and the quarantine benefit, and also on the initial value of the premium reserve.

conclusions

the covid-19 insurance model based on the generalized seir model assumes that recovered individuals are not involved in premium payment, since the insurance plan terminates once a policyholder has been quarantined in hospital and has claimed the benefit payment. therefore, the total premium payment depends on classes $S(t)$, $E(t)$, $I(t)$, and $P(t)$. meanwhile, the total benefit payment depends on classes $Q(t)$ and $D(t)$. by the equivalence principle, the net level premium is the ratio of the actuarial present value of benefits to the actuarial present value of premiums. from this we obtain the premium reserve by the retrospective approach.

acknowledgments

this research is supported by the doctoral grant no. 1631/un10.f09/pn/2021 at mathematics and natural sciences faculty, universitas brawijaya.

references

[1] c. huang, y. wang, x. li, l. ren, j. zhao, y. hu, l. zhang, g. fan, j. xu, x. gu, z. cheng, t. yu, j. xia, y. wei, w. wu, x. xie, w. yin, h. li, m. liu, y. xiao, h. gao, l. guo, j. xie, g. wang, r. jiang, z. gao, q. jin, j. wang, and b. cao. clinical features of patients infected with 2019 novel coronavirus in wuhan, china. the lancet, vol. 395, issue 10223, pp. 497-506, 2020.
[2] worldometer. report coronavirus cases. retrieved march 17, 2021, from https://www.worldometers.info/coronavirus/#countries, 2021.
[3] m. f. sattar, s. khanum, a. nawaz, m. m. ashfaq, m. a. khan, m. jawad, and w. ullah. covid-19 global, pandemic impact on world economy. technium social sciences journal, vol. 11, pp. 165-179, 2020.
[4] a. arfah, f. z.
olilingo, f. zain, s. syaifuddin, d. dahliah, n. nurmiati, and a. h. k. p. putra. economics during global recession: sharia-economics as a post covid-19 agenda. journal of asian finance, economics and business, vol. 7, no. 11, pp. 1077-1085, 2020.
[5] b. l. jones. actuarial calculations using a markov model. transactions of the society of actuaries, vol. 46, pp. 227-250, 1994.
[6] n. jia and l. tsui. epidemic modelling using sars as a case study. north american actuarial journal, vol. 9, no. 4, pp. 28-42, 2005.
[7] canadian institute of actuaries. considerations for the development of a pandemic scenario. canadian institute of actuaries, committee on risk management and capital requirements. research paper 209095, 2009.
[8] r. feng and j. garrido. actuarial applications of epidemiological models. north american actuarial journal, vol. 15, no. 1, pp. 112-136, 2012.
[9] m. batista. estimation of the final size of the coronavirus epidemic by the sir model, preprint, 2020.
[10] i. cooper, a. mondal, and c. g. antonopoulos. a sir model assumption for the spread of covid-19 in different communities. chaos, solitons and fractals, vol. 139, 2020.
[11] b. tang, n. l. bragazzi, q. li, s. tang, y. xiao, and j. wu. an updated estimation of the risk of transmission of the novel coronavirus (2019-ncov). infectious disease modelling, vol. 5, pp. 248-255, 2020.
[12] t. zhou, q. liu, z. yang, j. liao, k. yang, w. bai, x. lu, and w. zhang. preliminary prediction of the basic reproduction number of the wuhan novel coronavirus 2019-ncov. journal of evidence-based medicine, vol. 13, no. 1, 2020.
[13] l. peng, w. yang, d. zhang, c. zhuge, and l. hong. epidemic analysis of covid-19 in china by dynamical modeling. arxiv preprint arxiv:2002.06563, 2020.
[14] d. aldila, s. h. a. khoshnaw, e. safitri, y. r. anwar, a. r. q. bakry, b. m. samiadji, d. a. anugerah, m. f. alfarizi, i. d.
ayulani, and s. n. salim. a mathematical study on the spread of covid-19 considering social distancing and rapid assessment: the case of jakarta, indonesia. chaos, solitons and fractals, vol. 139, 2020.
[15] h. u. gerber. life insurance mathematics. springer, berlin, heidelberg, 1997.
[16] n. l. bowers, h. u. gerber, j. c. hickman, d. a. jones, and c. j. nesbitt. actuarial mathematics. the society of actuaries, 1997.
[17] r. j. cunningham, t. n. herzog, and r. l. london. models for quantifying risk, 2nd ed., actex publication inc., winsted, 2006.
[18] d. c. m. dickson, m. r. hardy, and h. r. waters. actuarial mathematics for life contingent risks, 2nd ed., cambridge university press, united kingdom, 2003.
[19] r. k. sembiring. buku materi pokok asuransi i. karunika universitas terbuka, jakarta, 1986.
[20] i. catarya. buku materi pokok asuransi ii. karunika universitas terbuka, jakarta, 1988.

spatial autoregressive to model tuberculosis cases in central java province in 2019
cauchy - jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 240-248
p-issn: 2086-0382; e-issn: 2477-3344
submitted: september 29, 2021; reviewed: december 09, 2021; accepted: december 14, 2021
doi: http://dx.doi.org/10.18860/ca.v7i1.13451

hasrat ifolala zebua^1, i gede nyoman mindra jaya^2*
1 post-graduate program in applied statistics, faculty of mathematics and natural sciences, universitas padjadjaran, indonesia
2 department of statistics, faculty of mathematics and natural sciences, universitas padjadjaran, indonesia
*corresponding author
email: mindra@unpad.ac.id*, hasrat20001@mail.unpad.ac.id

abstract

tuberculosis is an infectious disease caused by the bacterium mycobacterium tuberculosis. central java is one of the three provinces with the highest tuberculosis cases in indonesia.
the risk factors used in this research are the spatial lag of the number of tuberculosis cases, representing the agent component; the morbidity rate, representing the host component; and population density, proper sanitation, and proper drinking water, representing the environmental component. this study aims to model the tuberculosis cases in central java province using the spatial autoregressive (sar) model. the sar model is a regression model in which the response variable is spatially correlated. the estimation method usually used for the sar model is maximum likelihood. moran's i for the number of tuberculosis cases in central java shows a positive spatial autocorrelation. the model was chosen based on the lm test and the aic, and the best model is the sar model. the results show that the number of tuberculosis cases in a district is positively influenced by the number of tuberculosis cases in the neighbouring areas. proper sanitation has a negative effect, whereas population density has a positive effect on the number of tuberculosis cases in the province of central java.

keywords: maximum likelihood; sar; spatial; tuberculosis

introduction

tuberculosis is an infectious disease caused by infection with the bacterium mycobacterium tuberculosis [1]. tuberculosis can be transmitted from human to human through droplets of saliva that tuberculosis sufferers release into the air when coughing or sneezing. a single cough or sneeze can produce up to 3000 droplets of saliva [2]. an infection occurs when other people breathe air containing these droplets. this disease usually affects the lungs, but can also affect other parts of the body. according to who [3], in 2019 around 10 million people fell ill with tuberculosis and 1.4 million died.
in 2016, 45 percent of the estimated tuberculosis cases were in southeast asia, and indonesia is one of the countries in this region. indonesia ranks third in the world for the number of tuberculosis cases after china and india, with an estimated 845 thousand cases. tuberculosis remains endemic in indonesia, and its spread is still very high. central java is one of the three provinces with the highest tuberculosis cases in indonesia, with 73,171 cases in 2019 [4]. tuberculosis prevention and control efforts have been carried out, such as the bacille calmette guerin (bcg) vaccine in infants [5] and increasing case finding and treatment success in healthcare facilities. the biggest challenge in controlling tuberculosis is the large number of missing (unreported) cases, which further increases transmission because patients are unaware of their condition.

from an epidemiological perspective, the incidence of tuberculosis is an interaction between three components, namely the agent, host, and environment [1]. on the agent side, intensive interactions between patients and other people can facilitate transmission. the duration or the intensity of contact with people with tuberculosis can make a person more easily exposed [6]. the host side (i.e., a person's susceptibility to tuberculosis) is strongly influenced by the body's resistance. as stated by pangaribuan et al. [7], the related factors on the host side are age, gender, race, socioeconomic status, living habits, marital status, occupation, heredity, nutrition, and immunity.
in terms of the environment, arinil et al. [8] stated that environmental factors such as the physical environment of the house (occupancy density, ventilation, sanitation) and weather and climate (temperature, humidity) are closely related to tuberculosis. prevention of tuberculosis transmission therefore requires control of these three components.

identifying areas with a high risk of tuberculosis transmission is important for understanding inter-regional linkages. an analytical tool for spatial data is needed to model the number of tuberculosis cases. in spatial epidemiology, maps are needed as a visualization method to see the distribution of disease by geographic area [9]. the risk factors used in this research are the spatial lag of the number of tuberculosis cases, representing the agent component; the morbidity rate, representing the host component; and population density, proper sanitation, and proper drinking water, representing the environmental component.

one of the spatial models that can be used is the spatial autoregressive (sar) model. the sar model is a regression model with a spatial correlation in the response variable. using cross-section data, this model combines a linear regression model with a spatial lag of the response variable [10]. the sar model is used for infectious diseases, especially tuberculosis in central java, because the agent factor is highly mobile from one district to another; in addition, the host and environment factors can be used as covariates. the estimation method usually used for the sar model is maximum likelihood. therefore, the purpose of this study is to fit a sar model to the number of tuberculosis cases in central java province using the maximum likelihood approach. we use a row-standardized spatial weight matrix based on queen contiguity.
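to make the idea of a standardized queen contiguity matrix concrete, the following minimal sketch builds one for a regular grid of cells. this is an assumption made purely for illustration: the study itself uses the 35 districts/cities of central java, whose polygon boundaries are not reproduced here, and the function name `queen_weights` is hypothetical.

```python
def queen_weights(n_rows, n_cols):
    """Row-standardized queen-contiguity weights for an n_rows x n_cols grid of cells.

    Two cells are neighbours when they share an edge or a corner.
    """
    n = n_rows * n_cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if (dr, dc) != (0, 0) and inside:
                        w[i][rr * n_cols + cc] = 1.0
    # Row standardization: divide each row by its number of neighbours.
    for row in w:
        total = sum(row)
        if total > 0:
            for j in range(n):
                row[j] /= total
    return w

W = queen_weights(3, 3)
print(W[4])  # weights of the centre cell of the 3 x 3 grid
```

after row standardization every row sums to one, so the spatial lag of a variable is simply the average of its values in the neighbouring cells.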
methods

data and variables

this study used data from the publication of the health profile of the central java province [4] and the central bureau of statistics of the central java province [11]. the units of analysis are 35 districts/cities in central java. the variables used in this study are shown in table 1.

table 1. variables and data sources
notation | variable | source
$y$ | number of confirmed cases of tuberculosis | health profile of the central java province
$x_1$ | proper sanitation |
$x_2$ | eligible drinking water facilities |
$x_3$ | population density | central bureau of statistics of the central java province
$x_4$ | morbidity rate |

moran's i

moran's i is a test statistic used to determine whether there is spatial autocorrelation or spatial dependence in the data. moran's i has global and local measures. moran's i values range between -1 and 1. the global measure is used to measure the overall autocorrelation and the local measure is used to identify the autocorrelation of each unit. global moran's i is given by the following formula [12]:

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{1}$$

with,
$n$ : number of spatial units
$\bar{y}$ : mean of the $n$ locations
$y_i$ : observation at location $i$
$y_j$ : observation at location $j$
$w_{ij}$ : elements of the spatial weight matrix $W$

and the local moran's i is given by the following formula:

$$I_i = \frac{(y_i - \bar{y})}{\sum_{k=1}^{n} (y_k - \bar{y})^2 / n} \sum_{j=1}^{n} w_{ij}\,(y_j - \bar{y}) \tag{2}$$

the null hypothesis is $I = E(I)$, i.e., there is no spatial dependence. the test statistic can be written as follows:

$$Z(I) = \frac{I - E(I)}{\sqrt{VAR(I)}} \sim N(0,1) \tag{3}$$

with,

$$E(I) = -\frac{1}{n-1}, \qquad VAR(I) = \frac{n^2 S_1 - n S_2 + 3 S_0^2}{(n^2 - 1)\, S_0^2} - [E(I)]^2$$

$$S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}; \qquad S_1 = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} (w_{ij} + w_{ji})^2; \qquad S_2 = \sum_{i=1}^{n}\left(\sum_{j=1}^{n} w_{ij} + \sum_{j=1}^{n} w_{ji}\right)^2.$$
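the global moran's i in equation (1) and the z-statistic in equation (3) can be sketched in plain python as follows. the four observations and the binary weight matrix below (four areas on a line, each adjacent to its immediate neighbours) are hypothetical toy inputs, not the central java data, and the function name `morans_i` is chosen only for this illustration.

```python
import math

def morans_i(y, w):
    """Return (I, z) for observations y and a spatial weight matrix w (list of lists)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]

    # Global Moran's I, equation (1).
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    i_stat = (n / s0) * cross / sum(d * d for d in dev)

    # Expectation and variance used in the z-statistic, equation (3).
    expected = -1.0 / (n - 1)
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    s2 = sum((sum(w[i]) + sum(w[j][i] for j in range(n))) ** 2 for i in range(n))
    variance = (n * n * s1 - n * s2 + 3 * s0 ** 2) / ((n * n - 1) * s0 ** 2) - expected ** 2
    return i_stat, (i_stat - expected) / math.sqrt(variance)

# Toy example: four areas on a line with steadily increasing values.
y_toy = [1.0, 2.0, 3.0, 4.0]
w_toy = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_value, z_value = morans_i(y_toy, w_toy)
print(i_value, z_value)
```

a positive value of $I$ together with a large $Z(I)$ points to clustering of similar values, which is the pattern the study later reports for tuberculosis cases.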
lagrange multiplier (lm) test

the lagrange multiplier (lm) test is used to determine whether or not there is spatial dependence. two types of lm tests have been developed, namely for spatial dependence in the dependent variable and for spatial error dependence. the lm test statistic for spatial dependence in the dependent variable ($LM_{LAG}$) is as follows [13]:

$$LM_{LAG} = \frac{\left[(e^{T} W_A\, y)/(e^{T} e / n)\right]^2}{\left[(W_A X \hat{\beta})^{T} M\, (W_A X \hat{\beta})/(e^{T} e / n)\right] + tr\left(W_A^{T} W_A + W_A^{2}\right)} \sim \chi^2_{(1-\alpha);\, df=1} \tag{4}$$

where $e$ is the vector of ols residuals and $M = I - X(X^{T}X)^{-1}X^{T}$. if the test statistic is greater than the chi-square critical value (reject h0), then the model to be built is the spatial autoregressive (sar) model. the lm test statistic for spatial error dependence ($LM_{ERR}$) is given by the following formula:

$$LM_{ERR} = \frac{\left[(e^{T} W_A\, e)/(e^{T} e / n)\right]^2}{tr\left(W_A^{T} W_A + W_A^{2}\right)} \sim \chi^2_{(1-\alpha);\, df=1} \tag{5}$$

if the test statistic is greater than the chi-square critical value (reject h0), then the model to be built is the spatial error model (sem). meanwhile, if the $LM_{LAG}$ and $LM_{ERR}$ values are both significant, the best model can be chosen by comparing the akaike information criterion (aic). the model with the smaller aic value is the best model.

spatial autoregressive (sar) model

the sar model is a combination of a linear regression model with a spatial lag of the response variable using cross-section data. in general, the sar model can be written as follows [14]:

$$\mathbf{y} = \rho \mathbf{W}\mathbf{y} + \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}; \qquad \boldsymbol{\varepsilon} \sim MVN(\mathbf{0}, \sigma_{\varepsilon}^{2}\mathbf{I}_n) \tag{6}$$

with:
$\mathbf{y}$ : continuous response variable
$\rho$ : autoregressive coefficient
$\mathbf{W}$ : spatial weight matrix
$\boldsymbol{\beta}$ : intercept and regression coefficients
$\mathbf{X}$ : predictor variables
$\boldsymbol{\varepsilon}$ : error

this model assumes that the autoregressive process is only found in the response variable. maximum likelihood is one of the most commonly used estimators because it can provide the best linear unbiased estimation (blue) and overcome endogeneity in the sar model.
the parameter estimates under the maximum likelihood method are as follows:

$$\hat{\boldsymbol{\beta}}_{ML} = \underbrace{(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y}}_{\hat{\boldsymbol{\beta}}_{OLS}} - \rho\, \underbrace{(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{W}_{\rho}\,\mathbf{y}}_{\hat{\boldsymbol{\beta}}_{L}} = \hat{\boldsymbol{\beta}}_{OLS} - \rho\, \hat{\boldsymbol{\beta}}_{L} \tag{7}$$

where $\hat{\boldsymbol{\beta}}_{L}$ is an estimator of regression parameters that depends on the spatial autocorrelation $\rho$ and the weight matrix ($\mathbf{W}$). however, this form cannot be solved directly because the value of $\rho$ is unknown. the regression parameters can be estimated using a concentrated log-likelihood function ($L_c$), which is a function of the mle residuals and is defined as follows:

$$\ln L_c(\rho) = C - \frac{n}{2} \ln\left[\frac{1}{n}(\mathbf{e}_0 - \rho\, \mathbf{e}_L)^{T}(\mathbf{e}_0 - \rho\, \mathbf{e}_L)\right] + \ln\left|\mathbf{I} - \rho \mathbf{W}_{\rho}\right| \tag{8}$$

this equation cannot be solved analytically, so a numerical method is needed to find the estimate of $\rho$.

results and discussion

the case notification rate (cnr) of tuberculosis in central java province is 211, which means that 211 cases of tuberculosis were treated and reported per 100,000 residents in central java. in total, there were 73,171 cases of tuberculosis in central java during 2019. tuberculosis is found in every district/city of central java province. the lowest number of cases was in karanganyar regency with 514 cases, and the highest was in cilacap regency with 4,703 cases. however, in terms of the district/city cnr, the highest cnr is in tegal city at 832.5 per 100,000 population and the lowest is in temanggung regency at 45.72 per 100,000 [4]. the map of the distribution of tuberculosis cases in central java can be seen in figure 1.

figure 1. map of tuberculosis case quantile in central java province in 2019

figure 1 shows that areas with a large number of cases (in solid red) tend to be close to other areas with a large number of cases. areas with a small number of cases (in faded red) also tend to be adjacent to areas with a small number of cases.
this indicates that there is a spatial dependence between regions. before conducting sar modeling, it is necessary to test the classical assumptions of multiple linear regression and to carry out the moran's i test.

table 2. the statistic test result of classic assumptions and moran's i
statistic test | p-value
normality (shapiro-wilk) | 0.9821
non-autocorrelation (durbin-watson) | 0.4893
non-multicollinearity (vif) | 1.138817 ($x_1$); 1.049796 ($x_2$); 1.065710 ($x_3$); 1.074787 ($x_4$)
homoscedasticity (breusch-pagan) | 0.9749
moran's i | 0.000 (i = 0.4991)

table 2 shows that all assumptions of multiple linear regression have been fulfilled. the moran's i value for the number of tuberculosis cases in the province of central java, using a spatial weight matrix based on queen contiguity, is 0.499 with a p-value < 0.05 (reject h0), which means there is a positive spatial autocorrelation. areas with high numbers of tuberculosis cases tend to be surrounded by other high areas, and areas with low numbers of cases tend to be surrounded by other low areas. the local moran's i produces a different value for each location. the areas that are significant in the local moran's i fall into two groups. first, the high-high areas include cilacap, banyumas, brebes, tegal, purbalingga, and tegal city. this indicates that districts/cities in high-high areas have a high number of tuberculosis cases and that the surrounding areas are also high, possibly due to the transmission of tuberculosis to the surrounding area. second, the low-low areas are magelang, boyolali, sragen, and karanganyar. this shows that in the low-low areas the number of tuberculosis cases is low and the surrounding areas are also low. for other regions, the local moran's i is not significant. the local moran's i results can be seen in figure 2:

figure 2.
local moran's i of tuberculosis cases in central java province in 2019

the selection of the spatial model was carried out through the lagrange multiplier (lm) test as an initial identification. the lm test is used to determine the form of spatial dependence more specifically: dependence in the response variable (lag), dependence on other variables that are not studied (error), or both (lag and error). the results of the lm test can be seen in table 3.

table 3. lm test results
model | lm-test | value | p-value | aic
sar | $LM_{LAG}$ | 12.635 | 0.000378 | 52.72579
sem | $LM_{ERR}$ | 8.973 | 0.002739 | 53.10059

from table 3 it can be seen that the spatial dependence in both lag and error is significant because the p-values are smaller than alpha (0.05). the sar model is applied in this study because its aic value is smaller than that of the sem model. the sar model is a spatial regression model that involves a spatial lag of the response variable. the estimation results of the sar model using the maximum likelihood method and a spatial weight matrix based on queen contiguity can be seen in table 4.

table 4. sar parameter estimation results
 | estimate | std. error | z-value | p-value
(intercept) | 3.383100 | 1.346100 | 2.51318 | 0.01196
proper sanitation ($x_1$) | -0.010300 | 0.005600 | -1.85760 | 0.06322
eligible drinking water facilities ($x_2$) | 0.006170 | 0.005800 | 1.06111 | 0.28864
population density ($x_3$) | 0.000094 | 0.000029 | 3.20521 | 0.00134
morbidity rate ($x_4$) | -0.003020 | 0.020255 | -0.14910 | 0.88147
rho: 0.58787, lr test value: 11.396, p-value: 0.000
asymptotic standard error: 0.14246, z-value: 4.1267, p-value: 0.000
wald statistic: 17.03, p-value: 0.000
aic: 52.726

the spatial lag variable (rho) has a positive and significant coefficient in influencing the number of tuberculosis cases in central java province.
this means that a larger number of tuberculosis cases in the surrounding areas is associated with a larger number of tuberculosis cases in a given area. this is in accordance with the research of mindra et al. [15] in the city of bandung. the proper sanitation variable has a negative regression coefficient and influences the number of tuberculosis cases in central java province at the 10% significance level (p = 0.063). this means that the more families have access to proper sanitation (healthy latrines), the lower the number of tuberculosis cases, assuming that the other variables are constant. the population density variable has a positive and significant regression coefficient in influencing the number of tuberculosis cases in central java province. this means that the denser the population of an area, the higher the number of tuberculosis cases, assuming that the other variables are constant. the variables eligible drinking water facilities and morbidity rate have no significant effect on tuberculosis cases in central java province.

in the sar model, the covariate impact can be categorized into three types, namely direct impact, indirect impact, and total impact, which can be seen in table 5.

table 5. direct and indirect impact measure of sar model
variable | direct | indirect | total
proper sanitation ($x_1$) | -0.0116 | -0.0135 | -0.0251
eligible drinking water facilities ($x_2$) | 0.0069 | 0.0080 | 0.0149
population density ($x_3$) | 0.0001 | 0.0001 | 0.0002
morbidity rate ($x_4$) | -0.0034 | -0.0039 | -0.0073

the direct impact is the impact that occurs locally in an area as a result of a change in a predictor variable in that area. the indirect impact is a spillover effect, i.e., the impact on an area when the predictor variable changes in the surrounding areas. the total impact is the change that occurs in an area as a result of changes in the area and its surroundings.
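the three impact measures can be illustrated with the usual summary based on the matrix $(\mathbf{I} - \rho\mathbf{W})^{-1}\beta_k$: the direct impact is the average of its diagonal, the total impact is the average of its row sums, and the indirect impact is the difference. the sketch below is a hedged illustration rather than the computation behind table 5: the weight matrix `W_TOY` is a hypothetical four-area example, not the queen contiguity matrix of the 35 districts/cities, so only the total impact (which equals $\beta_k/(1-\rho)$ for any row-standardized matrix) is comparable with the table. the values of `RHO` and `BETA_SANITATION` are the reported estimates from table 4.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small dense matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sar_impacts(rho, beta, w):
    """Average direct, indirect and total impacts of one covariate in a SAR model."""
    n = len(w)
    a = [[(1.0 if r == c else 0.0) - rho * w[r][c] for c in range(n)] for r in range(n)]
    s = invert(a)  # (I - rho W)^(-1)
    direct = beta * sum(s[k][k] for k in range(n)) / n
    total = beta * sum(sum(row) for row in s) / n
    return direct, total - direct, total

# Hypothetical row-standardized weights for four areas on a line.
W_TOY = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
RHO = 0.58787              # spatial lag coefficient reported in table 4
BETA_SANITATION = -0.0103  # coefficient of proper sanitation reported in table 4

direct, indirect, total = sar_impacts(RHO, BETA_SANITATION, W_TOY)
print(direct, indirect, total)
```

the total impact obtained this way is about -0.025, which agrees with the -0.0251 reported for proper sanitation in table 5 up to the rounding of the published coefficients; the split into direct and indirect parts depends on the actual weight matrix and will differ from the table.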
to find out whether the obtained sar model is adequate, it is necessary to carry out a diagnostic check, including the assumptions of normality, non-autocorrelation, and homogeneity. the results of the diagnostic check in table 6 show that these assumptions have been fulfilled.

table 6. diagnostic check of sar model
statistic test | p-value
normality (shapiro-wilk) | 0.2812
lm test for residual autocorrelation | 0.4840
homoscedasticity (breusch-pagan) | 0.7637

based on table 6, the p-value for the assumption of normality using the shapiro-wilk test is 0.2812, which is greater than alpha (0.05), meaning that the residuals are normally distributed. in the autocorrelation test, the p-value is also greater than alpha (0.05), which means that the residuals meet the non-autocorrelation assumption. likewise, in the homoscedasticity test using the breusch-pagan test, the p-value is also greater than alpha (0.05), so the assumption of homogeneity is also fulfilled.

conclusions

tuberculosis cases in central java province showed a positive spatial autocorrelation. this supports the hypothesis that tuberculosis cases are spatially dependent and that a spatial econometrics model should be considered. the best spatial econometrics model was chosen based on the lm test and the aic. the best model is the spatial autoregressive (sar) model. the estimation results of the sar model show that the number of tuberculosis cases in an area is positively influenced by the number of tuberculosis cases in the neighbouring areas. proper sanitation (ownership of healthy latrines) has a negative effect on the number of tuberculosis cases, whereas population density has a positive effect on the number of tuberculosis cases in the province of central java.

references

[1] k.
ri, "pusat data dan informasi kementrian kesehatan ri tuberkolosis," kementrian kesehatan republik indonesia, jakarta, 2018.
[2] k. ri, "pedoman nasional pengendalian tuberkolosis," jakarta, 2014.
[3] w. h. organization, "global tuberculosis report 2020," geneva, 2020.
[4] d. k. p. j. tengah, "profil kesehatan provinsi jawa tengah tahun 2019," semarang, 2020.
[5] m. dara, c. d. acosta, v. rusovich, j. p. zellweger, r. centis and g. b. migliori, "bacille calmette-guérin vaccination: the current situation in europe," european respiratory journal, vol. 43, no. 1, pp. 24-35, 2014.
[6] t. d. kristini and r. hamidah, "potensi penularan tuberculosis paru pada anggota keluarga penderita," jurnal kesehatan lingkungan indonesia, vol. 15, no. 1, 2020.
[7] l. pangaribuan, kristina, d. pangaribuan, t. tejayanti and d. b. lolong, "faktor-faktor yang mempengaruhi kejadian tuberkulosis pada umur 15 tahun ke atas di indonesia (analisis data survei prevalensi tuberkulosis (sptb) di indonesia 2013-2014)," vol. 23, no. 1, 2020.
[8] a. haq, u. f. achmadi and d. susanna, "analisis spasial (topografi) tuberkulosis paru di kota pariaman, bukittinggi, dan dumai tahun 2010-2016," jurnal ekologi kesehatan, vol. 18, no. 3, pp. 149-158, 2020.
[9] m. souris, epidemiology and geography principles, methods and tools of spatial analysis, great britain and the united states: iste ltd and john wiley & sons, inc, 2019.
[10] j. lesage and r. k. pace, introduction to spatial econometrics, london: chapman and hall/crc, 2009.
[11] b. p. s. p. j. tengah, "provinsi jawa tengah dalam angka," badan pusat statistik, semarang, 2020.
[12] g. grekousis, spatial analysis methods and practice, cambridge: cambridge university press, 2020.
[13] l. anselin, "lagrange multiplier test diagnostics for spatial dependence and spatial heterogeneity," geographical analysis, vol. 20, pp. 1-17, 2010.
[14] i. g. n. m.
jaya and y. andriyana, analisis data spasial perspektif bayesian, sumedang: alqaprint jatinangor, 2020.
[15] i. g. n. jaya et al., "metode bayesian dalam penaksiran model spasial autoregressive (sar) (studi kasus pemodelan penyakit tb paru di kota bandung)," jurnal euclid, vol. 4, no. 2, 2017.

bayesian hurdle poisson regression for assumption violation
cauchy - jurnal matematika murni dan aplikasi, volume 7(3) (2022), pages 384-393
p-issn: 2086-0382; e-issn: 2477-3344
submitted: march 05, 2022; reviewed: april 23, 2022; accepted: april 26, 2022
doi: http://dx.doi.org/10.18860/ca.v7i3.15549

nur kamilah sa'diyah*, ani budi astuti, maria bernadetha t. mitakda
department of statistics, faculty of mathematics and natural science, brawijaya university, indonesia
email: nurkamilahs@student.ub.ac.id

abstract

violation of the poisson regression assumptions can cause the fitted model to produce a biased estimator. the bayesian method is a good method for estimating parameters with small sample sizes and for any distribution. the data on the number of deaths due to chronic filariasis violate the poisson regression assumptions (there is overdispersion and the response variable does not follow a poisson distribution), so they are modeled with bayesian hurdle poisson regression. with the bayesian method, convergence is achieved with 300,000 iterations and a thinning interval of 7. in addition to presenting an alternative method for estimating the hurdle poisson regression parameters, the model obtained can be used by the government in efforts to mitigate disease disasters through the prevention, control, and handling of filariasis cases. the results showed that in the logit model only the percentage of households that have access to proper sanitation in 34 provinces in indonesia had a significant effect on the number of deaths due to chronic filariasis in 34 provinces in indonesia ($Y$).
in the truncated poisson model, all predictor variables had a significant effect on the number of deaths due to chronic filariasis.

keywords: bayesian; filariasis; hurdle; overdispersion; poisson

introduction

the important assumptions in poisson regression analysis are that the count response variable follows a poisson distribution, that there is no multicollinearity among the predictor variables, and that equidispersion holds (the mean of the data is equal to its variance). however, in certain cases, the assumptions of conformity to the poisson distribution and of equidispersion are not fulfilled. this can cause the fitted model to produce a biased estimator [1]. a violation of equidispersion in the form of overdispersion (variance greater than the mean) can be handled with the zero inflated model or the hurdle model. overdispersion is handled in this study with the hurdle poisson model because the hurdle model performs better than the zero inflated model [2].

the parameter estimation method often used for the hurdle poisson model is maximum likelihood estimation (mle). however, mle is unreliable for small sample sizes and for certain distributions. the bayesian method is a good method for estimating parameters with small sample sizes and for any distribution. the advantage of the bayesian method is that it can estimate parameters from extremely few observations and can be used for all distributions [3]. the bayesian method has been applied to overdispersed data to analyze the number of filariasis sufferers in papua province, using the bayesian zero inflated poisson model [4].
this study models data on the number of deaths from chronic filariasis in indonesia, which violate the assumptions of equidispersion and of a poisson-distributed response, with bayesian hurdle poisson regression.
filariasis, also known as elephant foot disease, is believed to have existed since antiquity: an ancient relief dated to 1501-1480 bc in the mortuary temple of queen hatshepsut in thebes, egypt depicts the princess of punt suffering from filariasis in her legs [5]. in indonesia, filariasis is an endemic disease (a disease that persistently infects certain regions). it was first reported by haga and van eecke in jakarta in 1889, caused by brugia malayi [6]. acute clinical symptoms of filariasis include inflammation and swelling of the lymph channels accompanied by fever, headache, weakness, and the onset of abscesses or ulcers, while the chronic clinical symptom is persistent enlargement of the legs, arms, breasts, and genitals in women and men [7]. one effort to inhibit the transmission of filariasis is mass preventive drug delivery (mpdd), implemented by filariasis-endemic districts/cities [5]. the success of the filariasis control program can be judged by the number of districts/cities that have managed to reduce the microfilaria rate to below 1% [8].
this study examines the influence of five predictors on the number of deaths from chronic filariasis in 34 provinces in indonesia ($Y$): the number of chronic filariasis cases ($X_1$), the number of districts/cities that succeeded in reducing the microfilaria rate to below 1% ($X_2$), the number of districts/cities still carrying out mass preventive drug delivery (mpdd) for filariasis ($X_3$), population density ($X_4$), and the percentage of households with access to proper sanitation ($X_5$), each recorded for the 34 provinces.
the results of this study are useful in two ways. (1) the bayesian hurdle poisson regression model identifies factors that affect the number of deaths from chronic filariasis in indonesia; the central and local governments and related agencies can use this information to set appropriate policy for mitigating chronic filariasis through prevention, control, and case handling. (2) the bayesian parameter estimation approach is data driven and is useful across a range of data challenges, namely small or large sample sizes and a variety of distributions.

methods
this study uses secondary data from the 2020 indonesian health profile, namely data on chronic filariasis in 2020, with five predictor variables and one response variable [9]. the first step is to test the poisson regression assumptions (poisson distribution suitability, non-multicollinearity, and equidispersion). the predictor variables are the number of chronic filariasis cases ($X_1$), the number of districts/cities that succeeded in reducing the microfilaria rate to below 1% ($X_2$), the number of districts/cities still carrying out mass preventive drug delivery (mpdd) for filariasis ($X_3$), population density ($X_4$), and the percentage of households with access to proper sanitation ($X_5$). the response variable is the number of deaths from chronic filariasis ($Y$). all variables are recorded for 34 provinces in indonesia.

poisson regression assumption
poisson distribution suitability was tested with the kolmogorov-smirnov test.
the kolmogorov-smirnov test statistic for testing the suitability of the poisson distribution is presented in equation (1) [10].
$$D = \max_i \left| F_N(y_{(i)}) - P(y_{(i)}, \lambda) \right| \qquad (1)$$
if $D > D_{(n,\alpha)}$ or the $p$-value is less than 0.05, we conclude that the response variable does not follow a poisson distribution.
the non-multicollinearity assumption was tested with the $VIF$ criterion: if $VIF_j$ exceeds 10, the non-multicollinearity assumption is not fulfilled [11].
the third assumption test is the overdispersion test. it is carried out by dividing the pearson chi-square statistic in equation (2) by the residual degrees of freedom.
$$\chi^2_{Pearson} = \sum_{i=1}^{n} \frac{(y_i - \hat{\mu}_i)^2}{\hat{\mu}_i} \qquad (2)$$
where:
$\hat{\mu}_i = \hat{\lambda}_i = \exp(\hat{\beta}_0 + \hat{\beta}_1 X_{i1} + \hat{\beta}_2 X_{i2} + \cdots + \hat{\beta}_k X_{ik})$
$df = n - p$
$n$ : number of observations
$p$ : number of parameters ($k + 1$)
if $\chi^2_{Pearson} / df > 1$, the observations are said to contain overdispersion [12].

bayesian method
suppose there is a parameter $\theta$ to be estimated. in the bayesian method, $\theta$ is treated as a random variable whose values follow the distribution $f(\theta)$. the prior distribution is the initial information used to form the posterior; it is combined with the data through the likelihood function. the posterior distribution is proportional to the product of the prior distribution and the likelihood function, as in equation (3) [13].
$$f(\theta \mid y) \propto f(y \mid \theta) f(\theta) \qquad (3)$$
where:
$f(y \mid \theta)$ : likelihood function
$f(\theta)$ : prior distribution function
$f(\theta \mid y)$ : posterior distribution function

bayesian hurdle poisson regression
there are three important components in the bayesian method, namely (1) the likelihood function of the hurdle poisson regression (hpr) model, (2) the prior distribution, and (3) the posterior distribution. the likelihood function of the hpr model is presented in equation (4).
$$f(Y \mid \beta, \delta) = \prod_{\substack{i=1 \\ y_i = 0}}^{n} \frac{1}{1 + \exp(\mathbf{X}^T \boldsymbol{\delta})} \times \prod_{\substack{i=1 \\ y_i > 0}}^{n} \frac{\left[\exp(-\exp(\mathbf{X}^T \boldsymbol{\beta}))\right] \left[\exp(\mathbf{X}^T \boldsymbol{\beta})\right]^{y_i}}{\left(1 - \left[\exp(-\exp(\mathbf{X}^T \boldsymbol{\beta}))\right]\right) y_i!} \qquad (4)$$
the priors for $\beta$ and $\delta$ are assumed to be normal distributions with mean $\mu$ and variance $\sigma^2$, in the form shown in equation (5).
$$f(\beta, \delta) = \prod_{j=0}^{k} \frac{1}{\sigma_\beta \sqrt{2\pi}} \exp\left(-\frac{(\beta - \mu_\beta)^2}{2\sigma_\beta^2}\right) \times \prod_{j=0}^{k} \frac{1}{\sigma_\delta \sqrt{2\pi}} \exp\left(-\frac{(\delta - \mu_\delta)^2}{2\sigma_\delta^2}\right) \qquad (5)$$
the posterior distribution is obtained from the product of the likelihood function and the prior distribution, as presented in equation (6).
$$f(\beta, \delta \mid Y) \propto \prod_{\substack{i=1 \\ y_i = 0}}^{n} \frac{1}{1 + \exp(\mathbf{X}^T \boldsymbol{\delta})} \prod_{\substack{i=1 \\ y_i > 0}}^{n} \frac{\left[\exp(-\exp(\mathbf{X}^T \boldsymbol{\beta}))\right] \left[\exp(\mathbf{X}^T \boldsymbol{\beta})\right]^{y_i}}{\left(1 - \left[\exp(-\exp(\mathbf{X}^T \boldsymbol{\beta}))\right]\right) y_i!} \times \prod_{j=0}^{k} \frac{1}{\sigma_\beta \sqrt{2\pi}} \exp\left(-\frac{(\beta - \mu_\beta)^2}{2\sigma_\beta^2}\right) \prod_{j=0}^{k} \frac{1}{\sigma_\delta \sqrt{2\pi}} \exp\left(-\frac{(\delta - \mu_\delta)^2}{2\sigma_\delta^2}\right) \qquad (6)$$
the posterior distribution of the bayesian hurdle poisson regression parameters is a complex function that requires a difficult integration, so it is not easy to obtain analytically. a numerical approach is therefore needed, using the markov chain monte carlo (mcmc) simulation method [14].

bayesian model convergence test
convergence is checked with the trace plot, the autocorrelation plot, the ergodic mean plot, and the monte carlo error (mc error) [15]. convergence is fulfilled if the trace plot does not form an ascending or descending pattern, the autocorrelation plot is close to one at the first lag and close to zero at subsequent lags, the ergodic mean plot is stable after several iterations, or the mc error is less than 5% of the standard deviation of each parameter.

results and discussion
the analysis begins with testing the poisson regression assumptions, followed by the parameter estimates of the bayesian hurdle poisson regression.

result of poisson regression assumption test
the first assumption in poisson regression is that the count response variable follows a poisson distribution, tested with the following hypotheses.
$H_0$: the number of deaths due to chronic filariasis follows a poisson distribution
versus
$H_1$: the number of deaths due to chronic filariasis does not follow a poisson distribution
the kolmogorov-smirnov test, run in r, gave a $p$-value less than $2.2 \times 10^{-16}$. this indicates that the response variable does not follow a poisson distribution. distribution fitting with easyfit software then ranked the poisson distribution third, after the uniform and geometric distributions. because poisson regression is the most common regression model for count responses, and no studies of uniform regression or geometric regression were found, this study retains the poisson regression model but estimates its parameters with the bayesian method, which has the advantage of being applicable to any distribution.
the next assumption is non-multicollinearity. the $VIF_j$ values are presented in table 1.

table 1. vif for all predictors
variable    $VIF_j$
$X_1$       4.782
$X_2$       1.530
$X_3$       3.872
$X_4$       1.173
$X_5$       3.162

table 1 shows that the $VIF$ of every predictor variable is less than 10, so the non-multicollinearity assumption is fulfilled.
the last assumption in poisson regression is equidispersion. overdispersion was tested with $\chi^2_{Pearson} / df$; the data are said to contain overdispersion if this ratio exceeds 1. here $\chi^2_{Pearson} / df = 212.549$, so the data contain overdispersion. because two of the poisson regression assumptions are not fulfilled, the parameters are estimated with the bayesian hurdle poisson regression model.

result of bayesian model convergence test
in the bayesian method, parameter samples are generated using the gibbs sampling algorithm with 300000 iterations and a thinning interval of 7.
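the thinning step mentioned above, together with two of the diagnostics described in the methods (autocorrelation and ergodic mean), can be sketched in plain python. this is a minimal illustration under stated assumptions, not the authors' sampler: the chain is a hypothetical ar(1) series standing in for the draws of one parameter, it is shorter than the 300000 iterations used in the study, and the helper names `thin`, `autocorr`, and `ergodic_mean` are invented for this sketch.

```python
import random


def thin(chain, k):
    """Keep every k-th draw, starting with the first one."""
    return chain[::k]


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) * (v - mean) for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def ergodic_mean(x):
    """Running mean of the chain after each draw."""
    out, total = [], 0.0
    for i, v in enumerate(x, start=1):
        total += v
        out.append(total / i)
    return out


# Hypothetical stand-in chain (assumption): an AR(1) process with strong
# serial dependence, NOT output from the hurdle Poisson Gibbs sampler.
random.seed(0)
rho = 0.9
chain = [0.0]
for _ in range(69_999):
    chain.append(rho * chain[-1] + random.gauss(0.0, 1.0))

kept = thin(chain, 7)  # thinning interval of 7, as in the study

print("draws kept:", len(kept))
print("lag-1 autocorrelation before thinning:", autocorr(chain, 1))
print("lag-1 autocorrelation after thinning: ", autocorr(kept, 1))
print("final ergodic mean:", ergodic_mean(kept)[-1])
```

in this sketch thinning lowers the lag-1 autocorrelation of the retained draws, which is the reason a thinning interval is used; the ergodic mean list is what an ergodic mean plot would display against the iteration index.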
checking the convergence of the model parameters is important for verifying the accuracy of the bayesian parameter estimates. four methods are used to check convergence, namely (1) trace plot, (2) autocorrelation plot, (3) ergodic mean plot, and (4) mc error. trace plots for each parameter are presented in figure 1.

figure 1. trace plot for bayesian hurdle poisson regression parameters (panels: $\delta_0$ to $\delta_5$ and $\beta_0$ to $\beta_5$)

figure 1 shows that the trace plots are random with 300000 iterations and a thinning interval of 7. it can be concluded that the parameters have converged, so the iteration is stopped.
the second method used to check convergence is the autocorrelation plot. figure 2 shows the autocorrelation plot for each parameter.

figure 2. autocorrelation plot for bayesian hurdle poisson regression parameters (panels: $\delta_0$ to $\delta_5$ and $\beta_0$ to $\beta_5$)

figure 2 shows that the first lag in the autocorrelation plot is close to one and subsequent lags are close to zero, so convergence of the parameters is fulfilled.
the third method used to check convergence is the ergodic mean plot. convergence is fulfilled if the ergodic mean plot is stable after several iterations. figure 3 shows the ergodic mean plot for each parameter.
figure 3. ergodic mean plot for bayesian hurdle poisson regression parameters (panels: $\delta_0$ to $\delta_5$ and $\beta_0$ to $\beta_5$)

figure 3 shows that the ergodic mean plots are stable after 300000 iterations with a thinning interval of 7. it can be concluded that the parameters have converged.
in addition to plots, convergence can be checked by comparing the mc error with 5% of the standard deviation of each parameter. the mc errors for the parameters of the bayesian hurdle poisson regression model are presented in table 2.

table 2. mc error for bayesian hurdle poisson regression parameters
model              parameter          standard deviation   5% standard deviation   mc error          decision
logit              $\hat{\delta}_0$   7.278704             0.363935                0.255295          converged
                   $\hat{\delta}_1$   0.001624             8.12 × 10^-5            2.17 × 10^-5      converged
                   $\hat{\delta}_2$   0.175980             0.008799                0.003009          converged
                   $\hat{\delta}_3$   0.242849             0.012142                0.002269          converged
                   $\hat{\delta}_4$   0.000538             2.69 × 10^-5            3.59 × 10^-6      converged
                   $\hat{\delta}_5$   0.084389             0.004219                0.002958          converged
truncated poisson  $\hat{\beta}_0$    0.918011             0.045901                0.040426          converged
                   $\hat{\beta}_1$    0.000272             1.36 × 10^-5            3.33 × 10^-6      converged
                   $\hat{\beta}_2$    0.025134             0.001257                0.000488          converged
                   $\hat{\beta}_3$    0.026595             0.00133                 0.000515          converged
                   $\hat{\beta}_4$    0.000131             6.54 × 10^-6            9.12 × 10^-7      converged
                   $\hat{\beta}_5$    0.010116             0.000506                0.000445          converged

based on table 2, the mc error of every parameter is less than 5% of its standard deviation, so convergence is met. the four methods of checking convergence agree: convergence is fulfilled with 300000 iterations and a thinning interval of 7.
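the mc error criterion applied in table 2 can be checked mechanically. the sketch below re-enters the standard deviations and mc errors from table 2 and tests whether each mc error is below 5% of the corresponding standard deviation. the names `table2` and `converged` are assumptions made for this illustration; they do not come from the software used in the study.

```python
# (parameter, posterior standard deviation, MC error), transcribed from table 2
table2 = [
    ("delta0", 7.278704, 0.255295),
    ("delta1", 0.001624, 2.17e-5),
    ("delta2", 0.175980, 0.003009),
    ("delta3", 0.242849, 0.002269),
    ("delta4", 0.000538, 3.59e-6),
    ("delta5", 0.084389, 0.002958),
    ("beta0", 0.918011, 0.040426),
    ("beta1", 0.000272, 3.33e-6),
    ("beta2", 0.025134, 0.000488),
    ("beta3", 0.026595, 0.000515),
    ("beta4", 0.000131, 9.12e-7),
    ("beta5", 0.010116, 0.000445),
]


def converged(sd, mc_error, fraction=0.05):
    """Rule of thumb from the text: MC error below 5% of the standard deviation."""
    return mc_error < fraction * sd


for name, sd, mc in table2:
    status = "converged" if converged(sd, mc) else "not converged"
    print(f"{name:7s} 5% sd = {0.05 * sd:.3e}  mc error = {mc:.3e}  {status}")

all_converged = all(converged(sd, mc) for _, sd, mc in table2)
print("all parameters converged:", all_converged)  # True
```

the margin is narrowest for the two intercepts and for the sanitation coefficients ($\hat{\delta}_5$ and $\hat{\beta}_5$), whose mc errors are closest to the 5% threshold.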
parameter estimation results of bayesian hurdle poisson regression model
after convergence is fulfilled, the parameter estimates can be calculated from the samples generated by gibbs sampling. each parameter estimate is the average of the generated samples for that parameter, as shown in table 3. the bayesian model parameters are tested using a credible interval bounded below by the 2.5% percentile and above by the 97.5% percentile. if the interval contains zero, $H_0$ is accepted, meaning that the $j$-th predictor variable has no significant effect on the response variable.

table 3. parameter estimator of bayesian hurdle poisson regression
model              parameter    estimate    2.5% percentile   97.5% percentile   decision
logit              $\delta_0$   16.6551     5.0846            28.4970            reject $H_0$
                   $\delta_1$   0.0004      -0.0031           0.0022             accept $H_0$
                   $\delta_2$   -0.2270     -0.5155           0.0572             accept $H_0$
                   $\delta_3$   -0.2974     -0.6936           0.0984             accept $H_0$
                   $\delta_4$   0.0006      -0.0001           0.0014             accept $H_0$
                   $\delta_5$   -0.1922     -0.3257           -0.0549            reject $H_0$
truncated poisson  $\beta_0$    -4.2404     -5.7179           -2.7044            reject $H_0$
                   $\beta_1$    -0.0027     -0.0032           -0.0023            reject $H_0$
                   $\beta_2$    0.1500      0.1086            0.1911             reject $H_0$
                   $\beta_3$    0.5121      0.4677            0.5551             reject $H_0$
                   $\beta_4$    -0.0004     -0.0006           -0.0002            reject $H_0$
                   $\beta_5$    0.0805      0.0636            0.0968             reject $H_0$

based on table 3, the bayesian hurdle poisson regression model can be presented as follows.
$$\operatorname{logit} \hat{\pi}_i = 16.6551 - 0.1922 X_{5i} \qquad (7)$$
$$\ln \hat{\mu}_i = -4.2404 - 0.0027 X_{1i} + 0.1500 X_{2i} + 0.5121 X_{3i} - 0.0004 X_{4i} + 0.0805 X_{5i} \qquad (8)$$
the logit model in equation (7) is interpreted as follows: every one percentage point increase in the percentage of households with access to proper sanitation multiplies the odds in the logit part of the model for the number of deaths due to chronic filariasis by exp(-0.1922) = 0.825, a decrease of about 17.5%.
the truncated poisson model in equation (8) is interpreted as follows, with the other predictors held constant:
1. every additional chronic filariasis case multiplies the average number of deaths due to chronic filariasis by exp(-0.0027) = 0.997, a decrease of about 0.3%.
2. every additional district/city that succeeds in reducing the microfilaria rate to below 1% multiplies the average number of deaths due to chronic filariasis by exp(0.1500) = 1.16, an increase of about 16%.
3. every additional district/city that is still implementing mass preventive drug delivery (mpdd) multiplies the average number of deaths due to chronic filariasis by exp(0.5121) = 1.669, an increase of about 67%.
4. every increase of 1 person/km2 in population density multiplies the average number of deaths due to chronic filariasis by exp(-0.0004) = 0.9996, a decrease of about 0.04%.
5. every one percentage point increase in the percentage of households with access to proper sanitation multiplies the average number of deaths due to chronic filariasis by exp(0.0805) = 1.08, an increase of about 8%.

conclusions
in the logit model, the percentage of households with access to proper sanitation in 34 provinces in indonesia ($X_5$) has a significant effect on the number of deaths due to chronic filariasis in 34 provinces in indonesia ($Y$).
in the truncated poisson model, all predictor variables have a significant effect on the number of deaths due to chronic filariasis in 34 provinces in indonesia ($Y$): the number of chronic filariasis cases ($X_1$), the number of districts/cities that managed to reduce the microfilaria rate to below 1% ($X_2$), the number of districts/cities still implementing mass preventive drug delivery (mpdd) for filariasis ($X_3$), population density ($X_4$), and the percentage of households with access to proper sanitation ($X_5$).

references
[1] d. w. osgood, “poisson based regression analysis of aggregate crime rates,” quant. methods criminol., vol. 16, no. 1, pp. 577–599, 2017, doi: 10.4324/9781315089256-23.
[2] w. t. tedra, i. m. rizki, and d. prariesa, “konsumsi rokok masyarakat kota bandung tahun 2015 dengan model hurdle negatif binomial (hurdle-nb),” forum statistika dan komputasi, vol. 15, no. 1, pp. 18–27, 2015.
[3] a. taufiq, a. b. astuti, and a. a. rinaldo fernandes, “geographically weighted regression in cox survival analysis for weibull distributed data with bayesian approach,” iop conf. ser. mater. sci. eng., vol. 546, no. 5, 2019, doi: 10.1088/1757-899x/546/5/052078.
[4] a. r. maulana, s. astutik, u. brawijaya, and l. belakang, “penerapan regresi zero inflated poisson dengan metode bayesian,” prosiding seminar nasional pendidikan matematika, vol. 1, pp. 226–233, 2016.
[5] g. meliyanie and d. andiarsa, “program eliminasi lymphatic filariasis di indonesia,” j. heal. epidemiol. commun. dis., vol. 3, no. 2, pp. 63–70, 2019, doi: 10.22435/jhecds.v3i2.1790.
[6] a. a. arsin, epidemiologi filariasis di indonesia, makassar: masagna press, 2016.
[7] a. ernawati, “faktor risiko penyakit filariasis (kaki gajah),” j. litbang media inf. penelitian, pengemb. dan iptek, vol. 13, no. 2, pp. 105–114, 2017, doi: 10.33658/jl.v13i2.98.
[8] kementerian kesehatan ri, “situasi filariasis di indonesia tahun 2018,” infodatin pusat data dan informasi kementerian kesehatan ri, pp. 1&4, 2019, [online]. available: https://pusdatin.kemkes.go.id/download.php?file=download/pusdatin/infodatin/infodatin-filariasis-2019.pdf.
[9] kementerian kesehatan ri, "profil kesehatan indonesia 2020," 2021.
[10] f. antoneli, f. m. passos, l. r. lopes, and m. r. s. briones, “a kolmogorov-smirnov test for the molecular clock based on bayesian ensembles of phylogenies,” plos one, vol. 13, no. 1, 2018, doi: 10.1371/journal.pone.0190826.
[11] d. n. gujarati and d. c. porter, dasar-dasar ekonometrika, edisi 5, jakarta: salemba empat, 2012.
[12] a. agresti, categorical data analysis, second edition, new york: john wiley & sons inc, 2002.
[13] g. e. p. box and g. c. tiao, bayesian inference in statistical analysis, 1992.
[14] a. b. astuti, n. iriawan, irhamah, h. kuswanto, and l. sasiarini, “blood sugar levels of diabetes mellitus patients modeling with bayesian mixture model averaging,” glob. j. pure appl. math., vol. 12, no. 4, pp. 3143–3158, 2016.
[15] i. ntzoufras, bayesian modeling using winbugs, vol. 698, john wiley & sons, 2011.

multipolar intuitionistic fuzzy ideal in b-algebras

cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 293-301
p-issn: 2086-0382; e-issn: 2477-3344
submitted: november 17, 2021; reviewed: december 10, 2021; accepted: january 20, 2022
doi: http://dx.doi.org/10.18860/ca.v7i1.14003

royyan amigo*, noor hidayat, vira hari krisnawati
department of mathematics, university of brawijaya, malang, indonesia
*corresponding author email: amigo.royyan@yahoo.com

abstract
a $B$-algebra is an algebraic structure that combines some properties of $BCK$-algebras and $BCI$-algebras.
some researchers have investigated the concept of multipolar fuzzy ideals in $BCK/BCI$-algebras and multipolar intuitionistic fuzzy sets in $B$-algebras. in this paper, we construct a new structure called a multipolar intuitionistic fuzzy ideal in $B$-algebras. this structure combines three structures: multipolar fuzzy ideals in $BCK/BCI$-algebras, fuzzy $B$-subalgebras in $B$-algebras, and multipolar intuitionistic fuzzy $B$-algebras. we investigate and prove some characterizations of the multipolar intuitionistic fuzzy ideal, such as a necessary condition and a sufficient condition.

keywords: b-algebras; multipolar fuzzy ideal; multipolar intuitionistic fuzzy set; multipolar intuitionistic fuzzy ideal

introduction
in 1965, zadeh [1] introduced the fuzzy set: a non-empty set whose members each carry a degree of membership with a value in the interval $[0,1]$. the degree of membership of each member of the set is determined by the membership function. zadeh's notion became the basis for later researchers to develop fuzzy concepts in various fields such as graph theory, data analysis, and decision making.
a simple example of an algebraic structure is a group. $BCK$-algebras, $BCI$-algebras, and $B$-algebras are further examples of algebraic structures. imai and iseki [2] proposed a new algebraic structure called $BCK$-algebras in 1966. $BCK$-algebras are an important class of algebraic structures constructed from two different sources, set theory and propositional calculus. in the same year, iseki [3] continued this research and proposed the notion of $BCI$-algebras, which generalize $BCK$-algebras. neggers and kim [4] proposed a new algebraic structure called $B$-algebras, which satisfies some properties of $BCK$-algebras and $BCI$-algebras, and investigated its properties.
zhang [5] introduced the concept of bipolar fuzzy sets, which is an extension of fuzzy sets.
meng [6] studied fuzzy implicative ideals in $BCK$-algebras in 1997. moreover, muhiuddin and al-kadi [7] introduced bipolar fuzzy implicative ideals in $BCK$-algebras and discussed the relationship between a bipolar fuzzy ideal and a bipolar fuzzy implicative ideal. furthermore, chen et al. [8] introduced the concept of multipolar fuzzy sets, which is an extension of bipolar fuzzy sets. kang et al. [9] proposed the concept of multipolar intuitionistic fuzzy sets with finite degree and its application in $BCK/BCI$-algebras.
in 1999, atanassov [10] introduced the notion of intuitionistic fuzzy sets. jun et al. [11] defined fuzzy $B$-algebras. then, al-masarwah and ahmad [12] discussed multipolar fuzzy ideals in $BCK/BCI$-algebras. ahn and bang [13] studied fuzzy $B$-subalgebras in $B$-algebras. recently, borzooei et al. [14] proposed the concept of multipolar intuitionistic fuzzy $B$-algebras and some of its properties. they constructed a simple multipolar fuzzy set and also discussed multipolar intuitionistic fuzzy subalgebras of $B$-algebras.
in this paper, we construct a new structure called a multipolar intuitionistic fuzzy ideal in $B$-algebras. this structure combines three structures from the research of al-masarwah and ahmad [12], ahn and bang [13], and borzooei et al. [14]. we then investigate and prove some necessary and sufficient conditions for the multipolar intuitionistic fuzzy ideal.

methods
using a literature study and analogous concepts from [12], [13] and [14], we propose the notion of a multipolar intuitionistic fuzzy ideal in $B$-algebras. we start by describing the structures of $B$-algebras, fuzzy $B$-algebras, and multipolar intuitionistic fuzzy sets. for each structure we give its definition, examples, and some of its properties.
definition 2.1 [15] a $B$-algebra is a nonempty set $X$ with a constant $0$ (a right identity element) and a binary operation $*$ satisfying the following axioms for all $x, y, z \in X$:
i. $x * x = 0$.
ii. $x * 0 = x$.
iii. $(x * y) * z = x * (z * (0 * y))$.
for all $x, y \in X$, we define a partial ordering relation $\le$ on $X$ by $x \le y$ if and only if $x * y = 0$ ([14]).

example 2.2 [15] let $X = \{0, a, b, c\}$ be a set with the following cayley table:

table 1: cayley table for $(X; *, 0)$.
*   0   a   b   c
0   0   0   b   b
a   a   0   c   b
b   b   b   0   0
c   c   b   a   0

then, $(X; *, 0)$ is a $B$-algebra.

example 2.3 [15] let $-$ be the subtraction operation on the integers $\mathbb{Z}$. then, $(\mathbb{Z}; -, 0)$ is a $B$-algebra.

example 2.4 let $*$ be the binary operation on $\mathbb{R}^+ - \{0\}$ defined by $x * y = \frac{x}{y}$. then, $(\mathbb{R}^+ - \{0\}; *, 1)$ is a $B$-algebra.

proposition 2.5 [16] if $(X; *, 0)$ is a $B$-algebra, then the following conditions hold for all $x, y, z \in X$.
i. $(x * y) * (0 * y) = x$.
ii. $x * (y * z) = (x * (0 * z)) * y$.
iii. if $x * y = 0$ then $x = y$.
iv. $0 * (0 * x) = x$.
v. $(x * z) * (y * z) = x * y$.
vi. $0 * (x * y) = y * x$.
vii. $x * y = 0$ if and only if $y * x = 0$.
viii. if $0 * x = 0$ then $X$ contains only $0$.

definition 2.6 [16] a $B$-algebra $(X; *, 0)$ is called a commutative $B$-algebra if $x * (0 * y) = y * (0 * x)$ for all $x, y \in X$.

example 2.7 let $-$ be the subtraction operation on the integers $\mathbb{Z}$. then, $(\mathbb{Z}; -, 0)$ is a commutative $B$-algebra.

proposition 2.8 [16] if $(X; *, 0)$ is a commutative $B$-algebra, then the following rules hold for all $x, y, z, t \in X$.
i. $(0 * x) * (0 * y) = y * x$.
ii. $(z * y) * (z * x) = x * y$.
iii. $(x * y) * z = (x * z) * y$.
iv. $(x * (x * y)) * y = 0$.
v. $(x * z) * (y * t) = (t * z) * (y * x)$.

definition 2.9 [15] let $(X; *, 0)$ be a $B$-algebra. a nonempty subset $I$ of $X$ is called an ideal of $X$ if it satisfies:
i. $0 \in I$,
ii. for all $x, y \in X$, if $y \in I$ and $x * y \in I$ then $x \in I$.
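for a finite set, the axioms of definition 2.1 and the ideal conditions of definition 2.9 can be verified by exhaustive search. the sketch below is an illustration only: the function names `is_b_algebra` and `is_ideal` are invented here, and the test structure is the integers modulo 4 under subtraction, a finite analogue of example 2.3 that is an assumption of this sketch and does not appear in the paper.

```python
from itertools import product


def is_b_algebra(elems, op, zero):
    """Check axioms i-iii of definition 2.1 by brute force.

    Returns (True, None) if all axioms hold, otherwise (False, witness),
    where witness names the failed axiom and the elements involved.
    """
    for x in elems:
        if op(x, x) != zero:
            return False, ("i", x)
        if op(x, zero) != x:
            return False, ("ii", x)
    for x, y, z in product(elems, repeat=3):
        if op(op(x, y), z) != op(x, op(z, op(zero, y))):
            return False, ("iii", x, y, z)
    return True, None


def is_ideal(subset, elems, op, zero):
    """Check definition 2.9: 0 is in I, and y in I with x*y in I forces x in I."""
    if zero not in subset:
        return False
    return all(
        x in subset
        for x, y in product(elems, repeat=2)
        if y in subset and op(x, y) in subset
    )


# Assumed test structure (not from the paper): integers mod 4 with subtraction.
Z4 = range(4)


def sub4(x, y):
    return (x - y) % 4


print(is_b_algebra(Z4, sub4, 0))      # (True, None)
print(is_ideal({0, 2}, Z4, sub4, 0))  # True
print(is_ideal({0, 1}, Z4, sub4, 0))  # False

# The cayley table of example 2.2, entered exactly as printed above.
table = {
    "0": {"0": "0", "a": "0", "b": "b", "c": "b"},
    "a": {"0": "a", "a": "0", "b": "c", "c": "b"},
    "b": {"0": "b", "a": "b", "b": "0", "c": "0"},
    "c": {"0": "c", "a": "b", "b": "a", "c": "0"},
}
result = is_b_algebra(list("0abc"), lambda x, y: table[x][y], "0")
print(result)  # the first component is False for the table as printed
```

applied to the cayley table of example 2.2 exactly as it is printed here, the check reports a violation of axiom iii: for $x = a$, $y = a$, $z = 0$ the left side is $(a * a) * 0 = 0$ while the right side is $a * (0 * (0 * a)) = a$. the printed entry $0 * a = 0$ with $a \ne 0$ also conflicts with proposition 2.5 (iii), so the table is worth verifying against [15].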
example 2.10 [15] let $I = \mathbb{Z}^+ \cup \{0\}$ be a subset of the $B$-algebra $(\mathbb{Z}; -, 0)$. then $I$ is an ideal of $\mathbb{Z}$.

let $(X; *, 0)$ be a $B$-algebra. a nonempty subset $I$ of $X$ is called a subalgebra ($B$-subalgebra) of $X$ if $0 \in I$ and $x * y \in I$ for all $x, y \in I$ ([15]).

definition 2.11 [11] let $(X; *, 0)$ be a $B$-algebra. a fuzzy set $A$ in $X$ is called a fuzzy $B$-algebra if, for all $x, y \in X$, it satisfies the inequality
$$\mu_A(x * y) \ge \min\{\mu_A(x), \mu_A(y)\}.$$

let $(X; *, 0)$ be a $B$-algebra. a fuzzy set $A$ in $X$ is called a fuzzy ideal of the $B$-algebra ([17]) if, for all $x, y \in X$, it satisfies
$$\mu_A(0) \ge \mu_A(x),$$
$$\mu_A(x) \ge \min\{\mu_A(x * y), \mu_A(y)\}.$$
consider the $B$-algebra $(X; *, 0)$ of example 2.2. if we define a fuzzy set $A$ in $X$ by $\mu_A(0) = \mu_A(b) = 1$ and $\mu_A(a) = \mu_A(c) = 0.5$, then $A$ is a fuzzy ideal of $X$. moreover, for the $B$-algebra $(\mathbb{R}^+ - \{0\}; *, 1)$ of example 2.4, if we define a fuzzy set $A$ in $\mathbb{R}^+ - \{0\}$ by
$$\mu_A(x) = \begin{cases} 1 & \text{if } x = 1, \\ 0.5 & \text{if } x \ne 1, \end{cases}$$
then $A$ is a fuzzy ideal of $\mathbb{R}^+ - \{0\}$.

let $(X; *, 0)$ be a $B$-algebra. a multipolar intuitionistic fuzzy set over $X$ is a mapping
$$(\hat{\ell}, \hat{s}) : X \to ([0,1] \times [0,1])^m, \quad x \mapsto (\hat{\ell}(x), \hat{s}(x)),$$
where $\hat{\ell} : X \to [0,1]^m$ and $\hat{s} : X \to [0,1]^m$ are multipolar fuzzy sets over $X$ that satisfy $\hat{\ell}(x) + \hat{s}(x) \le 1$ for all $x \in X$, that is, $(\pi_i \circ \hat{\ell})(x) + (\pi_i \circ \hat{s})(x) \le 1$ for $i = 1, 2, \dots, m$, where $\pi_i : [0,1]^m \to [0,1]$ is the $i$-th projection (see [14]).

results and discussion
in this section, we describe the structure of the multipolar intuitionistic fuzzy ideal in $B$-algebras. the description begins with the definition of the new structure, followed by examples, and then its properties are stated and proven.

definition 3.1 let $(X; *, 0)$ be a $B$-algebra. a multipolar intuitionistic fuzzy set $(\hat{\ell}, \hat{s})$ over $X$ is called a multipolar intuitionistic fuzzy ideal in $X$ if it satisfies:
i. $(\forall x \in X)(\hat{\ell}(0) \ge \hat{\ell}(x) \text{ and } \hat{s}(0) \le \hat{s}(x))$, that is, $(\pi_i \circ \hat{\ell})(0) \ge (\pi_i \circ \hat{\ell})(x)$ and $(\pi_i \circ \hat{s})(0) \le (\pi_i \circ \hat{s})(x)$,
ii. $(\forall x, y \in X)(\hat{\ell}(x) \ge \inf\{\hat{\ell}(x * y), \hat{\ell}(y)\} \text{ and } \hat{s}(x) \le \sup\{\hat{s}(x * y), \hat{s}(y)\})$, that is, $(\pi_i \circ \hat{\ell})(x) \ge \inf\{(\pi_i \circ \hat{\ell})(x * y), (\pi_i \circ \hat{\ell})(y)\}$ and $(\pi_i \circ \hat{s})(x) \le \sup\{(\pi_i \circ \hat{s})(x * y), (\pi_i \circ \hat{s})(y)\}$,
for $i = 1, 2, \dots, m$.

example 3.2 let $(X; *, 0)$ be the $B$-algebra of example 2.2. define a multipolar intuitionistic fuzzy set $(\hat{\ell}, \hat{s})$ over $X$ by
$$(\hat{\ell}, \hat{s}) : X \to ([0,1] \times [0,1])^5, \quad x \mapsto \begin{cases} ((0.7, 0.3), (0.6, 0.25), (0.7, 0.15), (0.63, 0.2), (0.8, 0.18)) & \text{if } x \in \{0, b\}, \\ ((0.3, 0.6), (0.4, 0.5), (0.5, 0.4), (0.2, 0.7), (0.4, 0.5)) & \text{if } x \in \{a, c\}. \end{cases}$$
then, $(\hat{\ell}, \hat{s})$ is a 5-polar intuitionistic fuzzy ideal of $X$.

example 3.3 let $(\mathbb{R}^+ - \{0\}; *, 1)$ be the $B$-algebra of example 2.4. define a multipolar intuitionistic fuzzy set $(\hat{\ell}, \hat{s})$ over $\mathbb{R}^+ - \{0\}$ by
$$(\hat{\ell}, \hat{s}) : \mathbb{R}^+ - \{0\} \to ([0,1] \times [0,1])^5, \quad x \mapsto \begin{cases} ((1, 0), (1, 0), (1, 0), (1, 0), (1, 0)) & \text{if } x = 1, \\ ((0.5, 0.5), (0.4, 0.4), (0.3, 0.3), (0.2, 0.2), (0.1, 0.1)) & \text{if } x \ne 1. \end{cases}$$
then, $(\hat{\ell}, \hat{s})$ is a 5-polar intuitionistic fuzzy ideal of $\mathbb{R}^+ - \{0\}$.

for any $\omega \in X$ and any multipolar intuitionistic fuzzy set $(\hat{\ell}, \hat{s})$ in $X$, we give a condition for the set $I(\omega)$ to be an ideal of $X$, together with an example.

theorem 3.4 let $(X; *, 0)$ be a $B$-algebra and $\omega \in X$. if $(\hat{\ell}, \hat{s})$ is a multipolar intuitionistic fuzzy ideal of $X$, then $I(\omega)$ is an ideal of $X$, where
$$I(\omega) = \{x \in X \mid \hat{\ell}(x) \ge \hat{\ell}(\omega) \text{ and } \hat{s}(x) \le \hat{s}(\omega)\}.$$
proof. let $(\hat{\ell}, \hat{s})$ be a multipolar intuitionistic fuzzy ideal of $X$, where $I(\omega) = \{x \in X \mid \hat{\ell}(x) \ge \hat{\ell}(\omega) \text{ and } \hat{s}(x) \le \hat{s}(\omega)\}$.
i. by definition 3.1 (i), for any $x \in I(\omega)$ (for instance $x = \omega$) we have $\hat{\ell}(0) \ge \hat{\ell}(x) \ge \hat{\ell}(\omega)$ and $\hat{s}(0) \le \hat{s}(x) \le \hat{s}(\omega)$. hence, $0 \in I(\omega)$.
ii. let $x, y \in X$ such that $x * y \in I(\omega)$ and $y \in I(\omega)$. then, $\hat{\ell}(x * y) \ge \hat{\ell}(\omega)$ and $\hat{s}(x * y) \le \hat{s}(\omega)$, and $\hat{\ell}(y) \ge \hat{\ell}(\omega)$ and $\hat{s}(y) \le \hat{s}(\omega)$. by definition 3.1 (ii), we have
$$\hat{\ell}(x) \ge \inf\{\hat{\ell}(x * y), \hat{\ell}(y)\} \ge \hat{\ell}(\omega) \quad \text{and} \quad \hat{s}(x) \le \sup\{\hat{s}(x * y), \hat{s}(y)\} \le \hat{s}(\omega),$$
so that $\hat{\ell}(x) \ge \hat{\ell}(\omega)$ and $\hat{s}(x) \le \hat{s}(\omega)$. hence, $x \in I(\omega)$.
therefore, $I(\omega)$ is an ideal of $X$.
∎

example 3.5 let $(X; *, 0)$ be the $B$-algebra of example 2.2, and let $(\hat{\ell}, \hat{s})$ be the multipolar intuitionistic fuzzy ideal over $X$ of example 3.2. here $I(b) = \{0, b\}$, because $\hat{\ell}(0) \ge \hat{\ell}(b)$ and $\hat{s}(0) \le \hat{s}(b)$, and $\hat{\ell}(b) \ge \hat{\ell}(b)$ and $\hat{s}(b) \le \hat{s}(b)$. then, $I(b)$ is an ideal of $X$.

next, we discuss some properties of the multipolar intuitionistic fuzzy ideal in $B$-algebras.

proposition 3.6 let $(X; *, 0)$ be a $B$-algebra. every multipolar intuitionistic fuzzy ideal $(\hat{\ell}, \hat{s})$ over $X$ satisfies the following implication for all $x, y \in X$:
if $x \le y$ then $\hat{\ell}(x) \ge \hat{\ell}(y)$ and $\hat{s}(x) \le \hat{s}(y)$.
proof. let $x, y \in X$ such that $x \le y$. so, $x * y = 0$. by definition 3.1 (i) and (ii), we have
$$\hat{\ell}(x) \ge \inf\{\hat{\ell}(x * y), \hat{\ell}(y)\} = \inf\{\hat{\ell}(0), \hat{\ell}(y)\} = \hat{\ell}(y)$$
and
$$\hat{s}(x) \le \sup\{\hat{s}(x * y), \hat{s}(y)\} = \sup\{\hat{s}(0), \hat{s}(y)\} = \hat{s}(y).$$
∎

proposition 3.7 let $(X; *, 0)$ be a commutative $B$-algebra. for any multipolar intuitionistic fuzzy ideal $(\hat{\ell}, \hat{s})$ over $X$, if for all $x, y \in X$
$$\hat{\ell}(x * y) \ge \hat{\ell}((x * y) * y) \quad \text{and} \quad \hat{s}(x * y) \le \hat{s}((x * y) * y),$$
then for all $x, y, z \in X$
$$\hat{\ell}((x * z) * (y * z)) \ge \hat{\ell}((x * y) * z) \quad \text{and} \quad \hat{s}((x * z) * (y * z)) \le \hat{s}((x * y) * z).$$
proof. let $x, y, z \in X$ such that $((x * z) * (y * z)) * z \le (x * y) * z$. by proposition 2.5 and proposition 2.8, we have
$$((x * (y * z)) * z) * z = ((x * z) * (y * z)) * z \le (x * y) * z.$$
from proposition 3.6, we have
$$\hat{\ell}(((x * (y * z)) * z) * z) \ge \hat{\ell}((x * y) * z) \quad \text{and} \quad \hat{s}(((x * (y * z)) * z) * z) \le \hat{s}((x * y) * z).$$
so, from proposition 2.8, we get
$$\hat{\ell}((x * z) * (y * z)) = \hat{\ell}((x * (y * z)) * z) \ge \hat{\ell}(((x * (y * z)) * z) * z) \ge \hat{\ell}((x * y) * z)$$
and
$$\hat{s}((x * z) * (y * z)) = \hat{s}((x * (y * z)) * z) \le \hat{s}(((x * (y * z)) * z) * z) \le \hat{s}((x * y) * z).$$
∎

proposition 3.8 let $(X; *, 0)$ be a $B$-algebra.
For any multipolar intuitionistic fuzzy ideal (ℓ̂, �̂�) over 𝑋, if ℓ̂((𝑥 ∗ 𝑧) ∗ (𝑦 ∗ 𝑧)) ≥ ℓ̂((𝑥 ∗ 𝑦) ∗ 𝑧) and �̂�((𝑥 ∗ 𝑧) ∗ (𝑦 ∗ 𝑧)) ≤ �̂�((𝑥 ∗ 𝑦) ∗ 𝑧) for all 𝑥, 𝑦, 𝑧 ∈ 𝑋, then ℓ̂(𝑥 ∗ 𝑦) ≥ ℓ̂((𝑥 ∗ 𝑦) ∗ 𝑦) and �̂�(𝑥 ∗ 𝑦) ≤ �̂�((𝑥 ∗ 𝑦) ∗ 𝑦) for all 𝑥, 𝑦 ∈ 𝑋.

Proof. Let 𝑥, 𝑦, 𝑧 ∈ 𝑋. Replacing 𝑧 by 𝑦 in the assumption and using Definition 2.1 (i) and (ii), we have
ℓ̂(𝑥 ∗ 𝑦) = ℓ̂((𝑥 ∗ 𝑦) ∗ 0) = ℓ̂((𝑥 ∗ 𝑦) ∗ (𝑦 ∗ 𝑦)) ≥ ℓ̂((𝑥 ∗ 𝑦) ∗ 𝑦)
and
�̂�(𝑥 ∗ 𝑦) = �̂�((𝑥 ∗ 𝑦) ∗ 0) = �̂�((𝑥 ∗ 𝑦) ∗ (𝑦 ∗ 𝑦)) ≤ �̂�((𝑥 ∗ 𝑦) ∗ 𝑦). ∎

Based on Proposition 3.7 and Proposition 3.8, we get the following corollary.

Corollary If 𝑋 is a commutative 𝐵-algebra, then the statements in Proposition 3.7 and Proposition 3.8 are equivalent.

Furthermore, we give another characterization of multipolar intuitionistic fuzzy ideals in 𝐵-algebras, stated in the following proposition.

Proposition 3.9 Let (𝑋; ∗, 0) be a 𝐵-algebra. A multipolar intuitionistic fuzzy set (ℓ̂, �̂�) over 𝑋 is a multipolar intuitionistic fuzzy ideal over 𝑋 if and only if for all 𝑥, 𝑦, 𝑧 ∈ 𝑋, (𝑥 ∗ 𝑦) ∗ 𝑧 = 0 implies ℓ̂(𝑥) ≥ inf{ℓ̂(𝑦), ℓ̂(𝑧)} and �̂�(𝑥) ≤ sup{�̂�(𝑦), �̂�(𝑧)}.

Proof. Assume that (ℓ̂, �̂�) is a multipolar intuitionistic fuzzy ideal over 𝑋. Let 𝑥, 𝑦, 𝑧 ∈ 𝑋 such that (𝑥 ∗ 𝑦) ∗ 𝑧 = 0, so that 𝑥 ∗ 𝑦 ≤ 𝑧. By Definition 3.1 (i) and (ii), we have
ℓ̂(𝑥) ≥ inf{ℓ̂(𝑥 ∗ 𝑦), ℓ̂(𝑦)} ≥ inf{inf{ℓ̂((𝑥 ∗ 𝑦) ∗ 𝑧), ℓ̂(𝑧)}, ℓ̂(𝑦)} = inf{inf{ℓ̂(0), ℓ̂(𝑧)}, ℓ̂(𝑦)} = inf{ℓ̂(𝑦), ℓ̂(𝑧)}
and
�̂�(𝑥) ≤ sup{�̂�(𝑥 ∗ 𝑦), �̂�(𝑦)} ≤ sup{sup{�̂�((𝑥 ∗ 𝑦) ∗ 𝑧), �̂�(𝑧)}, �̂�(𝑦)} = sup{sup{�̂�(0), �̂�(𝑧)}, �̂�(𝑦)} = sup{�̂�(𝑦), �̂�(𝑧)}.
Conversely, assume that for all 𝑥, 𝑦, 𝑧 ∈ 𝑋, (𝑥 ∗ 𝑦) ∗ 𝑧 = 0 implies ℓ̂(𝑥) ≥ inf{ℓ̂(𝑦), ℓ̂(𝑧)} and �̂�(𝑥) ≤ sup{�̂�(𝑦), �̂�(𝑧)}. Let 𝑥 ∈ 𝑋.
By Definition 2.1 (ii) and Definition 2.11, we have
ℓ̂(0) = ℓ̂(𝑥 ∗ 𝑥) ≥ inf{ℓ̂(𝑥), ℓ̂(𝑥)} = ℓ̂(𝑥)
and
�̂�(0) = �̂�(𝑥 ∗ 𝑥) ≤ sup{�̂�(𝑥), �̂�(𝑥)} = �̂�(𝑥).
Next, let 𝑥, 𝑦 ∈ 𝑋. By Definition 2.1 (i), we have (𝑥 ∗ 𝑦) ∗ (𝑥 ∗ 𝑦) = 0, so that ℓ̂(𝑥) ≥ inf{ℓ̂(𝑦), ℓ̂(𝑥 ∗ 𝑦)} and �̂�(𝑥) ≤ sup{�̂�(𝑦), �̂�(𝑥 ∗ 𝑦)}. Hence, (ℓ̂, �̂�) is a multipolar intuitionistic fuzzy ideal over 𝑋. ∎

Conclusions
In this paper, we apply the notion of multipolar intuitionistic fuzzy ideal to 𝐵-algebras and investigate some of its properties. We also give conditions for a multipolar intuitionistic fuzzy set to be a multipolar intuitionistic fuzzy ideal, together with some examples. These definitions and main results can be applied similarly to other algebraic structures such as 𝐵𝐺-algebras, 𝐵𝐹-algebras and 𝐵𝐷-algebras.

References
[1] Zadeh, L.A. 1965. Fuzzy sets. Information and Control, vol. 8, no. 3, 338-353.
[2] Imai, Y.; Iseki, K. 1966. On axiom system of propositional calculi, XIV. Japan Acad 42, 19-22.
[3] Iseki, K. 1966. An algebra related with a propositional calculus. Japan Acad 42, 26-29.
[4] Neggers, J.; Kim, H.S. 2002. On B-algebras. Mat. Vesnik, no. 54, 21-29.
[5] Zhang, W.R.; Yang, Y. 1994. Bipolar fuzzy sets. Proceedings of the 1998 IEEE International Conference on Fuzzy Systems, 835–840. Anchorage, AK, USA.
[6] Meng, J.; Jun, Y.B.; Kim, H.S. 1997. Fuzzy implicative ideals of BCK-algebras. Fuzzy Sets and Systems, vol. 89, 243-248.
[7] Muhiuddin, G.; Al-Kadi, D. 2021. Bipolar fuzzy implicative ideals of BCK-algebras. Journal of Mathematics.
[8] Chen, J.; Li, S.; Ma, S.; Wang, X. 2014. m-polar fuzzy sets: an extension of bipolar fuzzy sets. Sci. World J.
[9] Kang, K.T.; Song, Seok-Zun; Jun, Y.B. 2020. Multipolar intuitionistic fuzzy set with finite degree and its application in BCK/BCI-algebras. MDPI, 8, 177.
[10] Atanassov, K.T. 1999. Intuitionistic fuzzy sets theory and applications. Bulgaria.
[11] Jun, Y.B.; Roh, E.H.; Kim, H.S. 2002. On fuzzy B-algebras. Czechoslovak Mathematical Journal, vol. 52, no. 2, 375-384.
[12] Al-Masarwah, A.; Ahmad, A.G. 2019. m-polar fuzzy ideals of BCK/BCI-algebras. Journal of King Saud University-Science, vol. 31, no. 4, 1220–1226.
[13] Ahn, S.S.; Bang, K. 2003. On fuzzy subalgebras in B-algebras. Commun. Korean Math. Soc., vol. 10, no. 3, 429-437.
[14] Borzooei, R.A.; Kim, H.S.; Jun, Y.B.; Ahn, S.S. 2020. On multipolar intuitionistic fuzzy B-algebras. MDPI, 8, 907.
[15] Fitria, E.; Gemawati, S.; Kartini. 2017. Prime ideals in B-algebras. International Journal of Algebra, vol. 11, no. 7, 301-309.
[16] Abdullah, H.K.; Atshan, A.A. 2017. Complete ideal and n-ideal of B-algebras. Applied Mathematical Sciences, vol. 11, no. 35, 1705-1713.
[17] Senapati, T.; Bhowmik, M.; Pal, M. 2011. Fuzzy closed ideals of B-algebras. IJCSET, vol. 1, no. 10, 669-673.

On Rainbow Vertex Antimagic Coloring of Graphs: A New Notion

CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 7(1) (2021), Pages 64-72
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: June 30, 2021; Reviewed: August 12, 2021; Accepted: October 20, 2021
DOI: https://doi.org/10.18860/ca.v7i1.12796

Marsidi1, Ika Hesti Agustin2, Dafik3, Elsa Yuli Kurniawati4
1Department of Mathematics Education, Universitas PGRI Argopuro Jember, Indonesia
2Department of Mathematics, University of Jember, Indonesia
3Department of Mathematics Education, University of Jember, Indonesia
4CGANT, University of Jember, Indonesia
Email: marsidiarin@gmail.com, ikahesti.fmipa@unej.ac.id, d.dafik@unej.ac.id, elsayuli@unej.ac.id

Abstract
For a bijective function $g: E(G) \to \{1, 2, 3, \dots, |E(G)|\}$, the associated weight of a vertex $v \in V(G)$ under $g$ is $w_g(v) = \sum_{e \in E(v)} g(e)$, where $E(v)$ is the set of edges incident to $v$. The function $g$ is called a vertex-antimagic edge labeling if every vertex has distinct weight.
A path $P$ in the edge-labeled graph $G$ is said to be a rainbow path if, for any two vertices $x$ and $x'$, all internal vertices in the path $x - x'$ have different weights. If for every two vertices $x$ and $y$ of $G$ there exists a rainbow $x - y$ path, then $g$ is called a rainbow vertex antimagic labeling of $G$. When we assign each vertex $v$ the color given by its weight $w_g(v)$, we say that the graph $G$ admits a rainbow vertex antimagic coloring. The smallest number of colors taken over all rainbow colorings induced by rainbow vertex antimagic labelings of $G$ is called the rainbow vertex antimagic connection number of $G$, denoted by $rvac(G)$. In this paper, we initiate the determination of the rainbow vertex antimagic connection number of graphs, namely the path ($P_n$), wheel ($W_n$), friendship ($\mathcal{F}_n$), and fan ($F_n$) graphs.

Keywords: antimagic labeling; rainbow vertex coloring; rainbow vertex antimagic coloring; rainbow vertex antimagic connection number.

Introduction
All graphs $G(V, E)$ considered in this paper are simple, connected, and undirected, where $V$ and $E$ are the vertex set and edge set of $G$, respectively [1]. The rainbow coloring problem has been studied by many researchers for many years, and many good results have been published in reputable journals [2]. It has thus contributed substantially to graph theory research. There are many types of rainbow coloring, namely rainbow (edge) coloring, rainbow vertex coloring, and strong rainbow edge/vertex coloring. The minimum number of colors for which an edge (vertex) coloring exists such that the graph $G$ is rainbow connected is called the rainbow connection number, denoted by $rc(G)$, for edge coloring, and the rainbow vertex connection number, denoted by $rvc(G)$, for vertex coloring; see [3]–[10] for details. Krivelevich and Yuster [6] gave the lower bound for $rvc(G)$, namely $rvc(G) \ge diam(G) - 1$, where $diam(G)$ is the diameter of the graph $G$.
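As a rough illustration of the bound $rvc(G) \ge diam(G) - 1$, the following Python sketch computes the diameter of a small graph by breadth-first search and reports the resulting lower bound. The helper names (`path_graph`, `wheel_graph`, `diameter`, `rvc_lower_bound`) are hypothetical and introduced only for this example; they are not part of the paper.

```python
from collections import deque


def path_graph(n):
    """Adjacency lists of the path P_n on vertices 1..n (illustrative helper)."""
    adj = {v: [] for v in range(1, n + 1)}
    for v in range(1, n):
        adj[v].append(v + 1)
        adj[v + 1].append(v)
    return adj


def wheel_graph(n):
    """Adjacency lists of the wheel W_n: hub 'A' joined to the rim cycle 1..n."""
    adj = {v: [] for v in range(1, n + 1)}
    adj["A"] = []
    for v in range(1, n + 1):
        nxt = v % n + 1          # next rim vertex, wrapping n -> 1
        adj[v].append(nxt)
        adj[nxt].append(v)
        adj[v].append("A")       # spoke to the hub
        adj["A"].append(v)
    return adj


def eccentricity(adj, source):
    """Largest BFS distance from `source` in a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def diameter(adj):
    """Diameter as the maximum eccentricity over all vertices."""
    return max(eccentricity(adj, v) for v in adj)


def rvc_lower_bound(adj):
    """Lower bound diam(G) - 1 on the rainbow vertex connection number."""
    return diameter(adj) - 1


if __name__ == "__main__":
    print("P_5:", rvc_lower_bound(path_graph(5)))
    print("W_5:", rvc_lower_bound(wheel_graph(5)))
```

For the path $P_5$ the diameter is 4, so the bound is 3; for the wheel $W_5$ the diameter is 2, so the bound is 1.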
An easy observation is that if $G$ has order $n$, then $rvc(G) \le n - 2$, and $rvc(G) = 0$ if and only if $G$ is a complete graph. Notice that $rvc(G) \ge diam(G) - 1$, with equality if the diameter of $G$ is 1 or 2.

Meanwhile, in 2003, Hartsfield and Ringel [11] defined antimagic graphs. A graph $G$ is called antimagic if there exists a bijection $f: E(G) \to \{1, 2, \dots, q\}$ such that the weights of all vertices are distinct [12]. The vertex weight of a vertex $v$ under $f$, $w_f(v)$, is the sum of the labels of the edges incident with $v$, that is, $w_f(v) = \sum_{uv \in E(G)} f(uv)$. In this case, $f$ is called an antimagic labeling. Many results have been found on the antimagicness of graphs. There are several extensions of vertex antimagic labeling, namely total vertex antimagic labeling, super total vertex antimagic labeling, $(a, d)$-vertex antimagic labeling, and super $(a, d)$-vertex antimagic labeling. For details, see Gallian's dynamic survey of graph labeling [13].

In this study, we initiate the combination of the two notions, namely rainbow coloring and antimagic labeling [14][15]. We call this combination rainbow vertex antimagic coloring. For a bijective function $g: E(G) \to \{1, 2, 3, \dots, |E(G)|\}$, the associated weight of a vertex $v \in V(G)$ under $g$ is $w_g(v) = \sum_{e \in E(v)} g(e)$, where $E(v)$ is the set of edges incident to $v$. The function $g$ is called a vertex-antimagic edge labeling if every vertex has distinct weight. A path $P$ in the edge-labeled graph $G$ is said to be a rainbow path if, for any two vertices $x$ and $x'$, all internal vertices in the path $x - x'$ have different weights. If for every two vertices $x$ and $y$ of $G$ there exists a rainbow $x - y$ path, then $g$ is called a rainbow vertex antimagic labeling of $G$.
When we assign each vertex $v$ the color given by its weight $w_g(v)$, we say that the graph $G$ admits a rainbow vertex antimagic coloring. The rainbow vertex antimagic connection number of $G$, denoted by $rvac(G)$, is the smallest number of colors taken over all rainbow colorings induced by rainbow vertex antimagic labelings of $G$. Determining the rainbow vertex antimagic connection number of an arbitrary graph is considered to be a hard problem; indeed, it is considered to be NP-hard. In this paper, we initiate the determination of the rainbow vertex antimagic connection number of graphs, namely the path ($P_n$), wheel ($W_n$), friendship ($\mathcal{F}_n$), and fan ($F_n$) graphs, and we also establish a lower bound on $rvac(G)$ for any graph.

Methods
This research uses a deductive analytic method. The procedure to obtain the rainbow vertex antimagic connection number of a graph is as follows.
1. Define a graph $G$.
2. Determine the cardinality of $G$ by obtaining its order and size.
3. Determine the lower bound of $rvac(G)$ by using the remark on the sharpest lower bound.
4. Determine the upper bound of $rvac(G)$ by constructing a bijective function, computing the vertex weights using $w_g(v) = \sum_{e \in E(v)} g(e)$, and showing that every two different vertices of $G$ satisfy the rainbow vertex antimagic coloring.
5. If the upper bound attains the lower bound, then we obtain $rvac(G)$. If the upper bound does not attain the lower bound, then we return to determining the upper bound of $rvac(G)$.
6. Finally, after obtaining the rainbow vertex antimagic connection number of $G$, state the result as a new theorem together with its proof.

Results and Discussion
In this section we present several theorems on the rainbow vertex antimagic coloring. We determine the minimum number of colors needed for a graph to have a rainbow vertex antimagic coloring.
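The procedure above can be sketched, for very small graphs only, as an exhaustive search in Python. This is a minimal illustration and not the authors' method: it assumes that the number of colors is counted as the number of distinct vertex weights over all vertices, and the function names (`path_edges`, `vertex_weights`, `has_rainbow_path`, `rvac_bruteforce`) are hypothetical. The running time grows factorially with the number of edges.

```python
from itertools import permutations


def path_edges(n):
    """Edge list of the path P_n on vertices 1..n (illustrative helper)."""
    return [(i, i + 1) for i in range(1, n)]


def vertex_weights(edges, labels):
    """w_g(v): sum of the labels of the edges incident to v."""
    w = {}
    for (u, v), lab in zip(edges, labels):
        w[u] = w.get(u, 0) + lab
        w[v] = w.get(v, 0) + lab
    return w


def has_rainbow_path(adj, w, s, t):
    """True if some simple s-t path has internal vertices of pairwise distinct weight."""
    def dfs(u, visited, used):
        for v in adj[u]:
            if v == t:
                return True
            if v in visited or w[v] in used:
                continue
            if dfs(v, visited | {v}, used | {w[v]}):
                return True
        return False
    return dfs(s, {s}, set())


def rvac_bruteforce(edges):
    """Fewest distinct vertex weights over all rainbow-vertex-connected labelings."""
    vertices = sorted({v for e in edges for v in e})
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    best = None
    # Try every bijection g: E(G) -> {1, ..., |E(G)|}.
    for labels in permutations(range(1, len(edges) + 1)):
        w = vertex_weights(edges, labels)
        connected = all(
            has_rainbow_path(adj, w, s, t)
            for i, s in enumerate(vertices)
            for t in vertices[i + 1:]
        )
        if connected:
            colors = len(set(w.values()))
            best = colors if best is None else min(best, colors)
    return best


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, rvac_bruteforce(path_edges(n)))
```

Under the stated assumption, the search returns 3 for $P_3$, $P_4$, and $P_5$, which agrees with the values in Theorem 1 for these orders.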
Since we determine the minimum number of colors such that $G$ has a rainbow vertex antimagic coloring, the rainbow vertex antimagic connection number of a graph is at least its rainbow vertex connection number. This lower bound for any graph is stated in Remark 1.

Remark 1 Let $G$ be a connected graph. Then $rvac(G) \ge rvc(G)$.

Theorem 1 If $P_n$ is a path graph of order $n$ and $n \ge 3$, then
\[
rvac(P_n) = \begin{cases} 3, & n = 3, 4 \\ n - 2, & n \ge 5 \end{cases}
\]

Proof. Let $P_n$ be a path graph with vertex set $V(P_n) = \{v_1, v_2, v_3, \dots, v_n\}$ and edge set $E(P_n) = \{v_i v_{i+1} : 1 \le i \le n - 1\}$. The diameter of $P_n$ is $n - 1$. We divide the proof into two cases.

Case 1. For $P_n$, $n = 3, 4$.
The path graph $P_3$ has two edges. Any labeling of these edges gives exactly three different weights on its vertices. It follows that the rainbow vertex antimagic connection number of $P_3$ is 3. Furthermore, for $P_4$, we consider all permutations of edge labels on $P_4$. Let $e_1, e_2, e_3$ be the edges of $P_4$; there are six possible edge labelings of $P_4$, as follows.
1) If $e_1 = 1, e_2 = 2, e_3 = 3$, then $wt(v_1) = 1$, $wt(v_2) = 3$, $wt(v_3) = 5$, $wt(v_4) = 3$.
2) If $e_1 = 1, e_2 = 3, e_3 = 2$, then $wt(v_1) = 1$, $wt(v_2) = 4$, $wt(v_3) = 5$, $wt(v_4) = 2$.
3) If $e_1 = 2, e_2 = 1, e_3 = 3$, then $wt(v_1) = 2$, $wt(v_2) = 3$, $wt(v_3) = 4$, $wt(v_4) = 3$.
4) If $e_1 = 2, e_2 = 3, e_3 = 1$, then $wt(v_1) = 2$, $wt(v_2) = 5$, $wt(v_3) = 4$, $wt(v_4) = 1$.
5) If $e_1 = 3, e_2 = 1, e_3 = 2$, then $wt(v_1) = 3$, $wt(v_2) = 4$, $wt(v_3) = 3$, $wt(v_4) = 2$.
6) If $e_1 = 3, e_2 = 2, e_3 = 1$, then $wt(v_1) = 3$, $wt(v_2) = 5$, $wt(v_3) = 3$, $wt(v_4) = 1$.
Based on the edge labelings and vertex weights above, every labeling of $P_4$ produces at least 3 distinct vertex weights, and 3 is attained. Thus $rvac(P_4) = 3$.

Case 2. For $P_n$, $n \ge 5$.
Based on Remark 1, we have $rvac(P_n) \ge rvc(P_n) = diam(P_n) - 1 = n - 1 - 1 = n - 2$. Furthermore, to show the upper bound we construct the bijective function of edge labels.
We have two conditions, namely $n \equiv 1 \pmod 2$ and $n \equiv 0 \pmod 2$. For $n \equiv 1 \pmod 2$, we have
\[
g(v_1 v_2) = 3, \quad g(v_2 v_3) = 1, \quad g(v_3 v_4) = 2, \quad g(v_{n-1} v_n) = 4, \quad g(v_i v_{i+1}) = i + 1 \ \text{for } 4 \le i \le n - 2.
\]
From the edge labels above, we have the vertex weights as follows. For $P_5$, we have $w(v_1, v_2, v_3, v_4, v_5) = (3, 4, 3, 6, 4)$. For $P_n$, $n \ge 6$, we have
\[
w(v_1) = 3, \quad w(v_2) = 4, \quad w(v_3) = 3, \quad w(v_4) = 7, \quad w(v_i) = 2i + 1 \ \text{for } 5 \le i \le n - 2, \quad w(v_{n-1}) = n + 3, \quad w(v_n) = 4.
\]
For $n \equiv 0 \pmod 2$, we have
\[
g(v_1 v_2) = 3, \quad g(v_2 v_3) = 1, \quad g(v_3 v_4) = 2, \quad g(v_i v_{i+1}) = i \ \text{for } 4 \le i \le n - 1.
\]
From the edge labels above, we have the following vertex weights:
\[
w(v_1) = 3, \quad w(v_2) = 4, \quad w(v_3) = 3, \quad w(v_4) = 6, \quad w(v_i) = 2i - 1 \ \text{for } 5 \le i \le n - 1, \quad w(v_n) = n - 1.
\]
From the vertex weights above, it is easy to see that the number of different weights is $n - 2$. It follows that the rainbow vertex antimagic connection number of $P_n$ is 3 for $n \in \{3, 4\}$ and $n - 2$ for $n \ge 5$.

Furthermore, we show that every two different vertices of $P_n$ are joined by a rainbow vertex path. For vertices of $V(P_n)$, and with reference to the vertex weights, the rainbow vertex path is shown in Table 1.

Table 1. The rainbow vertex path of $P_n$
Case   $v$     $v$     Rainbow vertex path
1      $v_1$   $v_n$   $v_1, v_2, v_3, \dots, v_i, \dots, v_{n-1}, v_n$

Hence, the vertex coloring of $P_n$ is a rainbow vertex antimagic coloring. Thus, we obtain that $rvac(P_n)$ is 3 for $n = 3, 4$ and $rvac(P_n)$ is $n - 2$ for $n \ge 5$. ∎

Theorem 2 If $W_n$ is a wheel graph of order $n + 1$ and $n \ge 3$, then $rvac(W_n) = 2$ if $n \equiv 1 \pmod 2$ and $2 \le rvac(W_n) \le 3$ if $n \equiv 0 \pmod 2$.

Proof. Let $W_n$ be a wheel graph with vertex set $V(W_n) = \{A, x_1, x_2, x_3, \dots, x_n\}$ and edge set $E(W_n) = \{A x_i : 1 \le i \le n\} \cup \{x_i x_{i+1} : 1 \le i \le n - 1\} \cup \{x_n x_1\}$. The diameter of $W_n$ is 2. Based on Remark 1, we have $rvac(W_n) \ge rvc(W_n) = diam(W_n) - 1 = 2 - 1 = 1$. Since the vertex $A$ has a much greater degree than the others, it must have a different vertex weight than the others.
The vertex weight of $A$ is the sum of the labels of the edges incident to $A$. From this condition, we have $rvac(W_n) \ge 2$. We divide the proof of the upper bound of the rainbow vertex antimagic connection number of $W_n$ into two cases.

Case 1. For $W_n$, $n \equiv 1 \pmod 2$.
To show the upper bound of $rvac(W_n)$ for $n \equiv 1 \pmod 2$, we construct the bijective function of edge labels:
\[
g(x_i x_{i+1}) = \begin{cases} \dfrac{i + 1}{2}, & \text{if } i \equiv 1 \pmod 2 \\[2mm] \left\lceil \dfrac{n}{2} \right\rceil + \dfrac{i}{2}, & \text{if } i \equiv 0 \pmod 2 \end{cases}
\qquad g(A x_i) = 2n + 1 - i.
\]
From the edge labels above, we have the following vertex weights:
\[
w(x_i) = 2n + 1 + \left\lceil \frac{n}{2} \right\rceil, \qquad w(A) = \frac{n}{2}(3n + 1).
\]
From the vertex weights above, it is easy to see that the number of different weights is 2.

Case 2. For $W_n$, $n \equiv 0 \pmod 2$.
To show the upper bound of $rvac(W_n)$ for $n \equiv 0 \pmod 2$, we construct the bijective function of edge labels:
\[
g(x_i x_{i+1}) = \begin{cases} \dfrac{i + 1}{2}, & \text{if } i \equiv 1 \pmod 2 \\[2mm] \left\lceil \dfrac{n}{2} \right\rceil + \dfrac{i}{2}, & \text{if } i \equiv 0 \pmod 2 \end{cases}
\qquad g(A x_i) = 2n + 1 - i.
\]
From the edge labels above, we have the following vertex weights:
\[
w(x_1) = 3n + 1, \qquad w(x_i) = 2n + 1 + \left\lceil \frac{n}{2} \right\rceil, \qquad w(A) = \frac{n}{2}(3n + 1).
\]
From the vertex weights above, it is easy to see that the number of different weights is 3.

Furthermore, we show that every two different vertices of $W_n$ are joined by a rainbow vertex path. Suppose that $x, y \in V(W_n)$; with reference to the vertex weights, the rainbow vertex $x - y$ path is shown in Table 2.

Table 2. The rainbow vertex $x - y$ path of $W_n$ ($i \ne j$)
Case   $x$     $y$     Rainbow vertex $x - y$ path
1      $x_i$   $A$     $x_i, A$
2      $x_i$   $x_j$   $x_i, A, x_j$

Hence, the vertex coloring of $W_n$ is a rainbow vertex antimagic coloring. Thus, we obtain $rvac(W_n) = 2$ if $n \equiv 1 \pmod 2$ and $2 \le rvac(W_n) \le 3$ if $n \equiv 0 \pmod 2$. ∎

Theorem 3 If $\mathcal{F}_n$ is a friendship graph of order $2n + 1$ and $n \ge 3$, then $rvac(\mathcal{F}_n) = 3$.

Proof. Let $\mathcal{F}_n$ be a friendship graph with vertex set $V(\mathcal{F}_n) = \{A\} \cup \{x_1, x_2, x_3, \dots, x_n\} \cup \{y_1, y_2, y_3, \dots, y_n\}$ and edge set $E(\mathcal{F}_n) = \{A x_i : 1 \le i \le n\} \cup \{A y_i : 1 \le i \le n\} \cup \{x_i y_i : 1 \le i \le n\}$. The diameter of $\mathcal{F}_n$ is 2.
Based on Remark 1, we have $rvac(\mathcal{F}_n) \ge rvc(\mathcal{F}_n) = diam(\mathcal{F}_n) - 1 = 2 - 1 = 1$. Since the vertex $A$ has a much greater degree than the others, it must have a different vertex weight than the others. The vertex weight of $A$ is the sum of the labels of the edges incident to $A$. On the other hand, the vertices $x_i$ and $y_i$ are adjacent, so that under the edge labeling they cannot receive the same weight. From these conditions, we have $rvac(\mathcal{F}_n) \ge 3$. Furthermore, to show the upper bound we construct the bijective function of edge labels:
\[
g(A x_i) = i, \qquad g(x_i y_i) = 2n + 1 - i, \qquad g(A y_i) = 2n + i, \qquad 1 \le i \le n.
\]
From the edge labels above, we have the following vertex weights:
\[
w(x_i) = 2n + 1, \qquad w(y_i) = 4n + 1, \qquad w(A) = 3n^2 + n.
\]
From the vertex weights above, it is easy to see that the number of different weights is 3.

Furthermore, we show that every two different vertices of $\mathcal{F}_n$ are joined by a rainbow vertex path. Suppose that $x, y \in V(\mathcal{F}_n)$; with reference to the vertex weights, the rainbow vertex $x - y$ path is shown in Table 3.

Table 3. The rainbow vertex $x - y$ path of $\mathcal{F}_n$ ($i \ne j$)
Case   $x$     $y$     Rainbow vertex $x - y$ path
1      $x_i$   $x_j$   $x_i, A, x_j$
2      $x_i$   $y_j$   $x_i, A, y_j$
3      $y_i$   $y_j$   $y_i, A, y_j$
4      $y_i$   $x_j$   $y_i, A, x_j$

Hence, the vertex coloring of $\mathcal{F}_n$ is a rainbow vertex antimagic coloring. Thus, we obtain $rvac(\mathcal{F}_n) = 3$. ∎

Theorem 4 If $F_n$ is a fan graph of order $n + 1$ and $n \ge 3$, then $rvac(F_n) = 2$ if $n \equiv 1 \pmod 2$ and $2 \le rvac(F_n) \le 3$ if $n \equiv 0 \pmod 2$.

Proof. Let $F_n$ be a fan graph with vertex set $V(F_n) = \{A, x_1, x_2, x_3, \dots, x_n\}$ and edge set $E(F_n) = \{A x_i : 1 \le i \le n\} \cup \{x_i x_{i+1} : 1 \le i \le n - 1\}$. The diameter of $F_n$ is 2. Based on Remark 1, we have $rvac(F_n) \ge rvc(F_n) = diam(F_n) - 1 = 2 - 1 = 1$. Since the vertex $A$ has a much greater degree than the others, it must have a different vertex weight than the others. The vertex weight of $A$ is the sum of the labels of the edges incident to $A$. From this condition, we have $rvac(F_n) \ge 2$.
We divide the proof of the upper bound of the rainbow vertex antimagic connection number of $F_n$ into two cases.

Case 1. For $F_n$, $n \equiv 1 \pmod 2$.
To show the upper bound of $rvac(F_n)$ for $n \equiv 1 \pmod 2$, we construct the bijective function of edge labels:
\[
g(x_i x_{i+1}) = \begin{cases} \dfrac{i}{2}, & \text{if } i \equiv 0 \pmod 2 \\[2mm] \dfrac{n + i}{2}, & \text{if } i \equiv 1 \pmod 2 \end{cases}
\qquad
g(A x_i) = \begin{cases} 2n - 1, & \text{if } i = n \\ 2n - i - 1, & \text{if } 1 \le i \le n - 1 \end{cases}
\]
From the edge labels above, we have the following vertex weights:
\[
w(x_i) = \frac{5n - 3}{2}, \qquad w(A) = \frac{3n^2 - n}{2}.
\]
From the vertex weights above, it is easy to see that the number of different weights is 2.

Case 2. For $F_n$, $n \equiv 0 \pmod 2$.
To show the upper bound of $rvac(F_n)$ for $n \equiv 0 \pmod 2$, we construct the bijective function of edge labels:
\[
g(x_i x_{i+1}) = \begin{cases} \dfrac{i}{2}, & \text{if } i \equiv 0 \pmod 2 \\[2mm] \dfrac{n + i - 1}{2}, & \text{if } i \equiv 1 \pmod 2 \end{cases}
\qquad
g(A x_i) = \begin{cases} 2n - 1, & \text{if } i = n \\ 2n - i - 1, & \text{if } 1 \le i \le n - 1 \end{cases}
\]
From the edge labels above, we have the following vertex weights:
\[
w(x_i) = \begin{cases} 3n - 2, & \text{if } i = n \\ \dfrac{5n}{2} - 2, & \text{if } 1 \le i \le n - 1 \end{cases}
\qquad w(A) = \frac{3n^2 - n}{2}.
\]
From the vertex weights above, it is easy to see that the number of different weights is 3.

Furthermore, we show that every two different vertices of $F_n$ are joined by a rainbow vertex path. Suppose that $x, y \in V(F_n)$; with reference to the vertex weights, the rainbow vertex $x - y$ path is shown in Table 4.

Table 4. The rainbow vertex $x - y$ path of $F_n$ ($i \ne j$)
Case   $x$     $y$     Rainbow vertex $x - y$ path
1      $x_i$   $A$     $x_i, A$
2      $x_i$   $x_j$   $x_i, A, x_j$

Hence, the vertex coloring of $F_n$ is a rainbow vertex antimagic coloring. Thus, we obtain $rvac(F_n) = 2$ if $n \equiv 1 \pmod 2$ and $2 \le rvac(F_n) \le 3$ if $n \equiv 0 \pmod 2$. ∎

An illustration of the rainbow vertex antimagic labeling can be seen in Figure 1. Based on Figure 1, the wheel graph $W_{17}$ admits a rainbow vertex antimagic coloring and the rainbow vertex antimagic connection number of $W_{17}$ is 2.

Figure 2.
The illustration of the rainbow vertex antimagic coloring of $W_{17}$

Conclusions
We have obtained the exact values of the rainbow vertex antimagic connection number of some connected graphs, namely the path ($P_n$), wheel ($W_n$), friendship ($\mathcal{F}_n$), and fan ($F_n$) graphs. However, since obtaining the rainbow vertex antimagic connection number of a graph is considered to be an NP-complete problem, the characterization of the exact value of $rvac(G)$ for any family of graphs is still widely open. Therefore, we propose the following open problems.
1. Determine the exact value of the rainbow vertex antimagic connection number of graphs apart from those families.
2. Determine the exact value of the rainbow vertex antimagic connection number of any graph operations.

Acknowledgments
We gratefully acknowledge the Department of Mathematics Education, Universitas PGRI Argopuro Jember, CGANT University of Jember in 2021, and the reviewers who have made corrections that helped complete this paper.

References
[1] G. Chartrand, L. Lesniak, and P. Zhang, Graphs & Digraphs, fifth edition. 2010.
[2] G. Chartrand, G. L. Johns, K. A. McKeon, and P. Zhang, “Rainbow connection in graphs,” Math. Bohem., vol. 133, pp. 85–98, 2008.
[3] G. Chartrand, G. L. Johns, K. A. McKeon, and P. Zhang, “The rainbow connectivity of a graph,” Networks, 2009, doi: 10.1002/net.20296.
[4] Dafik, I. H. Agustin, A. Fajariyato, and R. Alfarisi, “On the rainbow coloring for some graph operations,” vol. 020004, 2016, doi: 10.1063/1.4940805.
[5] X. Li and Y. Sun, “An updated survey on rainbow connections of graphs - a dynamic survey,” Theory Appl. Graphs, 2017, doi: 10.20429/tag.2017.000103.
[6] M. Krivelevich and R. Yuster, “The rainbow connection of a graph is (at most) reciprocal to its minimum degree,” J. Graph Theory, vol. 63, pp. 185–191, 2010.
[7] D. N. S. Simamora and A. N. M. Salman, “The rainbow (vertex) connection number of pencil graphs,” Procedia Comput. Sci., vol. 74, pp.
138–142, 2015, doi: 10.1016/j.procs.2015.12.089.
[8] M. S. Hasan, Slamin, Dafik, I. H. Agustin, and R. Alfarisi, “On the total rainbow connection of the wheel related graphs,” 2018.
[9] P. Heggernes, D. Issac, J. Lauri, P. T. Lima, and E. J. van Leeuwen, “Rainbow vertex coloring bipartite graphs and chordal graphs,” Leibniz Int. Proc. Informatics, LIPIcs, vol. 117, no. 83, pp. 1–13, 2018, doi: 10.4230/LIPIcs.MFCS.2018.83.
[10] Dafik, Slamin, and A. Muharromah, “On the (strong) rainbow vertex connection of graphs resulting from edge comb product,” 2018.
[11] N. Hartsfield and G. Ringel, Pearls in Graph Theory. 2003.
[12] R. Simanjuntak, F. Bertault, and M. Miller, “Two new (a, d)-antimagic graph labelings,” Proc. Elev. Australas. Work. Comb. Algorithms 11, pp. 179–189, 2000.
[13] J. A. Gallian, “A dynamic survey of graph labeling,” Electron. J. Comb., vol. 1, no. DynamicSurveys, 2018.
[14] B. J. Septory, M. I. Utoyo, Dafik, B. Sulistiyono, and I. H. Agustin, “On rainbow antimagic coloring of special graphs,” J. Phys. Conf. Ser., vol. 1836, no. 1, 2021, doi: 10.1088/1742-6596/1836/1/012016.
[15] H. S. Budi, Dafik, I. M. Tirta, I. H. Agustin, and A. I. Kristiana, “On rainbow antimagic coloring of graphs,” J. Phys. Conf. Ser., vol. 1832, no. 1, 2021, doi: 10.1088/1742-6596/1832/1/012016.
Analysis of the Rosenzweig-MacArthur Predator-Prey Model with Anti-Predator Behavior

CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 6(4) (2021), Pages 260-269
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: January 22, 2021; Reviewed: March 17, 2021; Accepted: April 14, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.11472

Ismail Djakaria1, Muhammad Bachtiar Gaib2, Resmawan3
1,2,3Department of Mathematics, Universitas Negeri Gorontalo, Indonesia
Email: iskar@ung.ac.id, m.tiargaib@gmail.com, resmawan@ung.ac.id

Abstract
This paper discusses the analysis of the Rosenzweig-MacArthur predator-prey model with anti-predator behavior. The analysis starts by determining the equilibrium points, their existence, and the conditions for their stability. The type of Hopf bifurcation is then identified by using the divergence criterion. It is shown that the model has three equilibrium points, i.e., the population extinction equilibrium point ($E_0$), the non-predatory equilibrium point ($E_1$), and the co-existence equilibrium point ($E_2$). The existence and stability of each equilibrium point hold under several conditions on the parameters. The divergence criterion indicates the existence of a supercritical Hopf bifurcation around the equilibrium point $E_2$. Finally, the population dynamics of the model are confirmed by numerical simulations using the 4th-order Runge-Kutta method.

Keywords: Rosenzweig-MacArthur; predator-prey model; anti-predator behaviour; Hopf bifurcation; divergence criterion; equilibrium point.

Introduction
Population dynamics is one of the most interesting research areas in mathematical biology; it discusses the interactions that occur between prey and predator in a particular ecosystem [1]. This interaction has been formulated as a simple mathematical model known as the Lotka-Volterra predator-prey model [2].
In a mathematical model, the predation process (the interaction between prey and predator) is expressed in a form known as a functional response. Functional responses are classified into three types, i.e., Holling type I, Holling type II, and Holling type III, where each type determines the characteristics of the predator [3]. Subsequently, Rosenzweig and MacArthur modified the Lotka-Volterra predator-prey model under the assumption that the attack rate of the predator increases at a decreasing rate with prey density until it becomes constant due to satiation, which is described by the Holling type II functional response [4]. Further, some authors modified the Lotka-Volterra predator-prey model by considering infectious disease [5]-[7]. Several studies have discussed modifications of the Rosenzweig-MacArthur predator-prey model; in [8][9], predator foraging facilitation is introduced into the Holling type II functional response. Furthermore, the Rosenzweig-MacArthur model has been modified with various factors, e.g., stage structure [10][11], the refuge effect [12][13], and harvesting of one or more populations [14][15]. Of the studies described above, none considers anti-predator behavior. In this article, the Rosenzweig-MacArthur predator-prey model of [6] is modified by considering anti-predator behavior [16]. This factor is worth including because the dynamics of the model become very complex when the prey population defends itself and resists while predation is occurring.

The structure of this paper is as follows. In the next section, the methods used in our work are described. Then, the analysis of the model is discussed. Finally, a brief conclusion of our work is given.
Methods
The dynamics of the model are analyzed by carrying out the following steps:
1. Modifying the Rosenzweig-MacArthur predator-prey model by considering anti-predator behavior.
2. Simplifying the model by nondimensionalization to reduce the number of parameters, and solving for the equilibrium points of the model.
3. Identifying the existence, local stability, and global stability of the equilibrium points.
4. Identifying the Hopf bifurcation type by using the divergence criterion.
5. Carrying out numerical simulations of the model with the 4th-order Runge-Kutta method to illustrate the analysis results.

Results and Discussion

Mathematical Model
In this article, the mathematical model is formulated based on the following assumptions:
1. The prey population is assumed to grow logistically with intrinsic growth rate $r$ and environmental carrying capacity $K$, and to be reduced by the predation process.
2. The predator population is assumed to grow due to the predation process; $c$ is the conversion rate of consumed prey into predator births.
3. The predation process follows a Holling type II functional response which is affected by the encounter rate function where there is foraging facilitation of predator ($w = 0$); $a$ is the saturated rate of the predator, $b$ is the interaction coefficient between both populations, and $h$ is the predator handling time.
4. $m$ is the mortality of predators.
5. $\eta$ is the anti-predator behavior.

From the assumptions above, the dynamics of the model can be represented by the following set of differential equations:
\[
\begin{aligned}
\frac{dx}{dt} &= r x \left(1 - \frac{x}{K}\right) - \frac{(a - b) x y}{y + h (a - b) x}, \\
\frac{dy}{dt} &= \frac{c (a - b) x y}{y + h (a - b) x} - m y - \eta x y,
\end{aligned}
\tag{1}
\]
where $x$ and $y$ are the densities of the prey and predator populations at time $t$, respectively, and $x(0), y(0) > 0$.
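To make the numerical step concrete, the following Python sketch integrates system (1) with a classical 4th-order Runge-Kutta scheme. It is only a minimal illustration: the parameter values, the step size, and the function names (`rhs`, `rk4_step`, `simulate`) are assumptions made for this example and are not taken from the paper.

```python
import math


def rhs(x, y, p):
    """Right-hand side of system (1); p holds r, K, a, b, c, h, m, eta."""
    predation = (p["a"] - p["b"]) * x * y / (y + p["h"] * (p["a"] - p["b"]) * x)
    dx = p["r"] * x * (1.0 - x / p["K"]) - predation
    dy = p["c"] * predation - p["m"] * y - p["eta"] * x * y
    return dx, dy


def rk4_step(x, y, dt, p):
    """One classical 4th-order Runge-Kutta step of size dt."""
    k1x, k1y = rhs(x, y, p)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, p)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, p)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, p)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new


def simulate(x0, y0, t_end, dt, p):
    """Return the trajectory [(t, x, y), ...] from t = 0 to t = t_end."""
    steps = int(round(t_end / dt))
    x, y = x0, y0
    trajectory = [(0.0, x, y)]
    for n in range(1, steps + 1):
        x, y = rk4_step(x, y, dt, p)
        trajectory.append((n * dt, x, y))
    return trajectory


# Hypothetical parameter values, chosen only for illustration.
params = dict(r=1.0, K=10.0, a=1.0, b=0.2, c=0.5, h=1.0, m=0.1, eta=0.01)

if __name__ == "__main__":
    t, x, y = simulate(1.0, 0.5, 10.0, 0.01, params)[-1]
    print(f"t = {t:.2f}, prey = {x:.4f}, predator = {y:.4f}")
```

A simple sanity check of such a sketch is the predator-free case: with the predator density set to zero, system (1) reduces to logistic growth, so the prey density should approach $K$.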
To simplify our analysis, we reduce the number of parameters in system (1) by using the following scaling [17]:
\[
x \to xK, \qquad y \to y (a - b) K h, \qquad t \to \frac{t}{r}.
\]
We obtain the following non-dimensional model:
\[
\begin{aligned}
\frac{dx}{dt} &= x(1 - x) - \frac{\alpha x y}{x + y}, \\
\frac{dy}{dt} &= \frac{\beta x y}{x + y} - \gamma y - \delta x y,
\end{aligned}
\tag{2}
\]
where
\[
\alpha = \frac{a - b}{r}, \qquad \beta = \frac{c}{h r}, \qquad \gamma = \frac{m}{r}, \qquad \delta = \frac{\eta K}{r}.
\]

Existence and Stability Analysis of Equilibrium Points
In this section, the equilibrium points of model (2) are obtained by solving [18]:
\[
\begin{aligned}
x(1 - x) - \frac{\alpha x y}{x + y} &= 0, \\
\frac{\beta x y}{x + y} - \gamma y - \delta x y &= 0.
\end{aligned}
\tag{3}
\]
Thus, from system (3), we obtain the following equilibrium points:
1. A trivial equilibrium point $E_0 = (0, 0)$, which always exists.
2. A non-predator equilibrium point $E_1 = (1, 0)$, which also always exists.
3. A co-existence equilibrium point $E_2 = (x^*, y^*)$, where
\[
x^* = \frac{\beta - \alpha\beta + \alpha\gamma}{\beta - \alpha\delta}, \qquad
y^* = \frac{(\beta - \alpha\beta + \alpha\gamma)(\beta - \gamma - \delta)}{(\beta - \alpha\delta)(\gamma + \delta - \alpha\delta)},
\]
which exists if
\[
\beta > \alpha(\beta - \gamma), \qquad \gamma + \delta < \alpha\delta < \beta.
\]

Next, we study the local stability of the dynamics of system (3) around each equilibrium point. The Jacobian matrix of system (3) is [19]:
\[
J(x, y) = \begin{pmatrix}
1 - 2x - \dfrac{\alpha y}{x + y} + \dfrac{\alpha x y}{(x + y)^2} & -\dfrac{\alpha x}{x + y} + \dfrac{\alpha x y}{(x + y)^2} \\[3mm]
\dfrac{\beta y}{x + y} - \dfrac{\beta x y}{(x + y)^2} - \delta y & \dfrac{\beta x}{x + y} - \dfrac{\beta x y}{(x + y)^2} - \gamma - \delta x
\end{pmatrix}
\tag{4}
\]
By evaluating the Jacobian matrix (4) at each equilibrium point, we obtain the local stability properties of $E_0$, $E_1$, and $E_2$ as follows.

Theorem 1. The trivial equilibrium point $E_0$ is always unstable (saddle).
Proof: The Jacobian matrix (4) evaluated at the equilibrium point $E_0$ is given by
\[
J(E_0) = \begin{pmatrix} 1 & 0 \\ 0 & -\gamma \end{pmatrix}.
\]
By solving the characteristic equation, we obtain the eigenvalues of $J(E_0)$ as $\lambda_1 = 1$ and $\lambda_2 = -\gamma$. This means $\lambda_1 > 0$ and $\lambda_2 < 0$. Therefore, the equilibrium point $E_0$ is unstable (saddle). ∎

Theorem 2. If $\delta > \beta - \gamma$, then the non-predatory equilibrium point $E_1$ of system (2) is locally asymptotically stable.
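proof: the jacobian matrix (4) evaluated at the equilibrium point $E_1$ is

$$J(E_1) = \begin{pmatrix} -1 & -\alpha \\ 0 & \beta - \gamma - \delta \end{pmatrix}$$

solving the characteristic equation gives the eigenvalues $\lambda_1 = -1$ and $\lambda_2 = \beta - \gamma - \delta$. clearly $\lambda_1 < 0$, and if $\delta > \beta - \gamma$ then $\lambda_2 < 0$ as well. both eigenvalues of $J(E_1)$ are then negative, and $E_1$ is locally asymptotically stable.∎

theorem 3. the co-existence equilibrium point $E_2$ is locally asymptotically stable if the following condition is satisfied:

$$\delta^2 < \frac{\theta + \upsilon}{\zeta}$$

proof: the jacobian matrix (4) evaluated at the equilibrium point $E_2$ is

$$J(E_2) = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}$$

where

$$M_{11} = \frac{-\beta^2 + \alpha\beta^2 - \alpha\gamma^2 - 2\alpha\delta(\alpha-1)(\beta-\gamma) - \alpha\delta^2 + \alpha^2\delta^2}{(\beta-\alpha\delta)^2}$$

$$M_{12} = -\frac{\alpha(\gamma+\delta-\alpha\delta)^2}{(\beta-\alpha\delta)^2}$$

$$M_{21} = \frac{(\beta-\gamma-\delta)\left(\beta^2\gamma + \alpha^2\gamma\delta^2 - \beta\left(\gamma^2 + 2\gamma\delta + \delta^2(\alpha-1)^2\right)\right)}{(\beta-\alpha\delta)^2(\gamma+\delta-\alpha\delta)}$$

$$M_{22} = -\frac{\beta(\beta-\gamma-\delta)(\gamma+\delta-\alpha\delta)}{(\beta-\alpha\delta)^2}$$

solving the characteristic equation gives the eigenvalues of $J(E_2)$:

$$\lambda_{1,2} = \frac{1}{2}\cdot\frac{1}{(\beta-\alpha\delta)^2}\,(A \pm B)$$

where $A = \zeta\delta^2 - \theta - \upsilon$ and $B = \sqrt{\psi^2 - \alpha\omega}$, with

$$\zeta = \alpha^2 - \alpha + \beta - \alpha\beta$$

$$\theta = \delta\left(\beta(\beta-2\gamma) + 2\alpha^2(\beta-\gamma) - \alpha\left(\beta^2 + 2(\beta-\gamma) - \beta\gamma\right)\right)$$

$$\upsilon = \beta^2(\gamma-\alpha+1) + \gamma^2(\alpha-\beta)$$

$$\psi = \beta^2 - \alpha\beta^2 + \alpha\gamma^2 + 2\alpha\delta(\alpha-1)(\beta-\gamma) - \alpha\delta^2(\alpha-1) - \beta(\beta-\gamma-\delta)(\gamma+\delta-\alpha\delta)$$

$$\omega = 4(\beta-\gamma-\delta)(\gamma+\delta-\alpha\delta)\left(\beta^2\gamma + \alpha^2\gamma\delta^2 - \beta\left((\alpha-1)^2\delta^2 + \gamma^2 + 2\gamma\delta\right)\right)$$

according to these eigenvalues, the stability of the equilibrium point $E_2$ depends on the sign of $A$. if $A < 0$ then, taking $\zeta > 0$, we obtain:

$$\zeta\delta^2 - \theta - \upsilon < 0 \;\Longrightarrow\; \zeta\delta^2 < \theta + \upsilon \;\Longrightarrow\; \delta^2 < \frac{\theta+\upsilon}{\zeta}$$

under this condition, the equilibrium point $E_2$ is locally asymptotically stable.∎

next, we study the global stability of system (2) around the equilibrium point $E_2$. we obtain the global stability properties of $E_2$ by using a lyapunov function [20] as follows.

theorem 4.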
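the co-existence equilibrium $E_2$ is globally asymptotically stable if the following condition is satisfied:

$$x^* < \frac{(\alpha-\beta+\gamma+\delta)(\gamma+\delta-\alpha\delta)}{\alpha(\gamma+\delta-\alpha\delta) - (\beta-\gamma-\delta)^2}$$

proof: define a lyapunov function as follows

$$V(x,y) = \left[x - x^* - x^*\ln\left(\frac{x}{x^*}\right)\right] + \left[y - y^* - y^*\ln\left(\frac{y}{y^*}\right)\right]$$

requiring $\dot{V} < 0$ for all $(x,y) \in \mathbb{R}^2_+$, we obtain:

$$\frac{\partial V}{\partial x}\frac{dx}{dt} + \frac{\partial V}{\partial y}\frac{dy}{dt} \le 0$$

$$\left(1-\frac{x^*}{x}\right)\left(x(1-x) - \frac{\alpha xy}{x+y}\right) + \left(1-\frac{y^*}{y}\right)\left(\frac{\beta xy}{x+y} - \gamma y - \delta xy\right) \le 0$$

$$\left(\frac{(1-x)(x+y) - \alpha y}{x+y}\right)(x - x^*) + \left(\frac{\beta x - \gamma(x+y) - \delta x(x+y)}{x+y}\right)(y - y^*) \le 0$$

for $(x,y) \in \mathbb{R}^2_+$, we obtain:

$$-\alpha + \alpha x^* + \beta - \gamma - \delta - (\beta-\gamma-\delta)y^* < 0$$

$$-\alpha + \alpha x^* + \beta - \gamma - \delta - x^*\frac{(\beta-\gamma-\delta)^2}{\gamma+\delta-\alpha\delta} < 0$$

$$x^*\left(\frac{\alpha(\gamma+\delta-\alpha\delta) - (\beta-\gamma-\delta)^2}{\gamma+\delta-\alpha\delta}\right) < \alpha - \beta + \gamma + \delta$$

$$x^* < \frac{(\gamma+\delta-\alpha\delta)(\alpha-\beta+\gamma+\delta)}{\alpha(\gamma+\delta-\alpha\delta) - (\beta-\gamma-\delta)^2}$$

under this condition, the equilibrium point $E_2$ is globally asymptotically stable.∎

analysis of hopf bifurcation type

in this section, we determine the hopf-bifurcation type by using the divergence criterion [21]. system (2) undergoes a hopf bifurcation when the following conditions are satisfied:

$$\delta^2 = \frac{\theta+\upsilon}{\zeta} \quad \text{and} \quad \alpha > \frac{\psi^2}{\omega}$$

to determine the hopf-bifurcation type of system (2) at the equilibrium point $E_2$, we construct a new system. let $\phi(x,y)$ be the divergence of $(af, ag)$, where $f$ and $g$ denote the right-hand sides of system (2) and $a(x,y)$ is a multiplier function.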
we obtain the coefficients of $a(x,y)$ for system (2) with the parameter values $\alpha = 2$, $\beta = 0.79$, $\gamma = 0.5$, and $\delta = 0.0186$, and the equilibrium point $E_2^* = (0.279, 0.157)$, as follows:

$$a(x,y) = 1 + 6.956x + 13.386y - 6.77x^2 + 32.968xy + 55.507y^2$$

so that a new system is obtained:

$$z(x,y) = a(x,y)\left(x(1-x) - \frac{\alpha xy}{x+y}\right), \qquad w(x,y) = a(x,y)\left(\frac{\beta xy}{x+y} - \gamma y - \delta xy\right) \tag{5}$$

by linearizing system (5), we obtain:

$$J(E_2^*) = \begin{pmatrix} 1.337 & -6.002 \\ 0.732 & -1.337 \end{pmatrix}$$

solving the characteristic equation gives the eigenvalues of $J(E_2^*)$:

$$\lambda_{1,2} = \pm 1.615i$$

since the eigenvalues of system (5) are purely imaginary complex conjugates, we can determine the hopf-bifurcation type of system (2) from its divergence value. we obtain:

$$\phi_{xx}(E_2^*) = -21.109$$

based on this divergence value, which is negative, a stable limit cycle appears in system (2). therefore, system (2) undergoes a supercritical hopf bifurcation.

numerical simulations

in this section, the model is solved numerically using the 4th-order runge-kutta method [22] with given initial conditions and parameter values. we choose the parameter values $\alpha = 2$, $\beta = 0.79$, $\gamma = 0.5$, with three values of the control parameter $\delta$: $\delta_1 = 0.011$, $\delta_2 = 0.0186$ and $\delta_3 = 0.026$. the initial condition is $x(0) = 0.3$ and $y(0) = 0.3$.

figure 1. (a) phase portrait of case 1 and (b) time-series portrait

figure 2. (a) phase portrait of case 2 and (b) time-series portrait

in case 1, we obtain the dynamics of the solution of system (2) with the control parameter value $\delta_1 = 0.011$. based on figure 1(a), the trivial equilibrium point $E_0 = (0,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = 1$ and $\lambda_2 = -0.5$. this agrees with theorem 1.
the non-predator equilibrium point $E_1 = (1,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = -1$ and $\lambda_2 = 0.279$. this is consistent with theorem 2, since here $\delta < \beta - \gamma$. the co-existence equilibrium point $E_2 = (0.273, 0.156)$ is an unstable spiral with eigenvalues $\lambda_{1,2} = 0.003 \pm 0.220i$. this is consistent with theorem 3, since here $\delta^2 > \frac{\theta+\upsilon}{\zeta}$ and the stability condition is violated. based on figure 1(b), the prey and predator populations rise and fall repeatedly. the oscillations continue with growing amplitude, so neither population settles to a fixed point.

figure 3. (a) phase portrait of case 3 and (b) time-series portrait

in case 2, we obtain the dynamics of the solution of system (2) with the control parameter value $\delta_2 = 0.0186$. based on figure 2(a), the trivial equilibrium point $E_0 = (0,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = 1$ and $\lambda_2 = -0.5$. this agrees with theorem 1. the non-predator equilibrium point $E_1 = (1,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = -1$ and $\lambda_2 = 0.271$. this is consistent with theorem 2, since here $\delta < \beta - \gamma$. the co-existence equilibrium point $E_2 = (0.279, 0.157)$ is a center with eigenvalues $\lambda_{1,2} = \pm 0.220i$. this corresponds to the boundary case of theorem 3, $\delta^2 = \frac{\theta+\upsilon}{\zeta}$. based on figure 2(b), the oscillations have a smaller deviation. this indicates a stability transition from unstable to stable, and this transition marks the appearance of a hopf bifurcation.

in case 3, we obtain the dynamics of the solution of system (2) with the control parameter value $\delta_3 = 0.026$. based on figure 3(a), the trivial equilibrium point $E_0 = (0,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = 1$ and $\lambda_2 = -0.5$. this agrees with theorem 1. the non-predator equilibrium point $E_1 = (1,0)$ is unstable (saddle) with eigenvalues $\lambda_1 = -1$ and $\lambda_2 = 0.264$.
this is consistent with theorem 2, since here $\delta < \beta - \gamma$. the co-existence equilibrium point $E_2 = (0.285, 0.159)$ is a stable spiral with eigenvalues $\lambda_{1,2} = -0.003 \pm 0.220i$. this is consistent with theorem 3, since here $\delta^2 < \frac{\theta+\upsilon}{\zeta}$. based on figure 3(b), the prey and predator dynamics begin to settle to a fixed point at around 1500 days.

conclusions

the rosenzweig-macarthur predator-prey model with anti-predator behavior has been studied. from the analysis of system (2), we obtain three equilibrium points: the trivial equilibrium point ($E_0$), the non-predator equilibrium point ($E_1$), and the co-existence equilibrium point ($E_2$). the local stability conditions of each equilibrium point have been established, and the global stability condition of the co-existence equilibrium point ($E_2$) has been obtained. using the divergence criterion, our analysis also showed that the model undergoes a supercritical hopf bifurcation. numerical simulations were carried out to verify the theoretical results. in the simulated cases, neither population goes extinct.

references
[1] p. b. turchin, complex population dynamics: a theoretical/empirical synthesis, princeton university press, 2003.
[2] j. d. murray, mathematical biology: an introduction, 3rd edition, springer-verlag, 2002.
[3] c. s. holling, "some characteristics of simple types of predation and parasitism", the canadian entomologist, vol. 91, no. 7, pp. 385-398, 1959.
[4] m. l. rosenzweig and r. h. macarthur, "graphical representation and stability conditions of predator-prey interactions", the american naturalist, vol. 97, no. 895, pp. 209-223, 1963.
[5] n. hasan, r. resmawan, and e. rahmi, “analisis kestabilan model eko-epidemiologi dengan pemanenan konstan pada predator,” j. mat. stat. dan komputasi, vol. 16, no. 2, pp. 121–142, dec. 2020.
[6] s. h. arsyad, r. resmawan, and n.
achmad, “analisis model predator-prey leslie-gower dengan pemberian racun pada predator,” j. ris. dan apl. mat., vol. 4, no. 1, pp. 1–16, 2020.
[7] s. maisaroh, r. resmawan, and e. rahmi, “analisis kestabilan model predator-prey dengan infeksi penyakit pada prey dan pemanenan proporsional pada predator,” jambura j. biomath, vol. 1, no. 1, pp. 8–15, 2020.
[8] l. berec, "impacts of foraging facilitation among predators on predator-prey dynamics", bulletin of mathematical biology, vol. 72, pp. 94-121, 2010.
[9] l. pribylova and a. peniaskova, "foraging facilitation among predators and its impact on the stability of predator-prey dynamics", ecological complexity, vol. 29, pp. 30-39, 2017.
[10] m. moustafa, m. h. mohd, a. i. ismail, and f. a. abdullah, "stage structure and refuge effects in the dynamical analysis of a fractional-order rosenzweig-macarthur prey-predator model", progress in fractional differentiation and application, vol. 5, no. 1, pp. 49-64, 2019.
[11] l. k. beay and m. saija, "a stage-structure rosenzweig-macarthur model with effect of prey refuge", jambura journal of biomathematics, vol. 1, no. 1, pp. 1-7, 2020.
[12] e. alamanza-vasquez, r. d. ortiz-ortiz, and a. m. marin-ramirez, "bifurcations in the dynamics of rosenzweig-macarthur predator-prey model considering saturated refuge for the preys", applied mathematical sciences, vol. 9, pp. 7475-7482, 2015.
[13] m. moustafa, m. h. mohd, a. i. ismail, and f. a. abdullah, "dynamical analysis of a fractional-order rosenzweig-macarthur model incorporating a prey refuge", chaos, solitons and fractals, vol. 109, pp. 1-13, 2018.
[14] a. suryanto, i. darti, h. s. panigoro, and a. kilicman, "a fractional-order predator-prey model with ratio-dependent functional response and linear harvesting", mathematics, vol. 7, no. 11, pp. 1-13, 2019.
[15] h. s. panigoro, a. suryanto, w. m. kusumawinahyu, and i.
darti, "a rosenzweig-macarthur model with continuous threshold harvesting in predator involving fractional derivatives with power law and mittag-leffler kernel", axioms, vol. 9, no. 122, pp. 1-23, 2020.
[16] s. g. mortoja, p. panja, and s. k. mondal, "dynamics of a predator-prey model with stage-structure on both species and anti-predator behavior", informatics in medicine unlocked, vol. 10, pp. 50-57, 2018.
[17] s. h. strogatz, nonlinear dynamics and chaos: with applications to physics, biology, chemistry, and engineering, westview press, 2015.
[18] l. perko, differential equations and dynamical systems, 3rd edition, springer-verlag, 2001.
[19] j. k. hale and h. kocak, dynamics and bifurcations, springer-verlag, 1991.
[20] r. sundari and e. apriliani, "konstruksi fungsi lyapunov untuk menentukan kestabilan", jurnal sains dan seni its, vol. 6, no. 1, pp. 28-32, 2017.
[21] s. s. pilyugin and p. waltman, "divergence criterion for generic planar system", siam j. appl. math., vol. 64, pp. 81-93, 2003.
[22] a. suryanto, metode numerik untuk persamaan diferensial biasa dan aplikasinya dengan matlab, universitas negeri malang, 2017.
richards curve implementation for prediction of covid-19 spread in maluku province

cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 195-206
p-issn: 2086-0382; e-issn: 2477-3344
submitted: september 10, 2021; reviewed: december 22, 2021; accepted: january 05, 2022
doi: http://dx.doi.org/10.18860/ca.v7i1.13323

nanang ondi, francis yunito rumlawang, yopi andry lesnussa*
department of mathematics, faculty of mathematics and natural sciences, pattimura university, indonesia
*corresponding author
email: yopi_a_lesnussa@yahoo.com*, rumlawang@yahoo.com, nanangondi21@gmail.com

abstract

the first case of covid-19 in maluku province, indonesia, was reported at the end of march 2020, and by november 4, 2020, the cumulative total had reached 3,884 cases. the purpose of this study is to predict the spread of covid-19 cases in maluku province by estimating the parameters of the richards function with the nonlinear least-squares (nls) method, where $I$ is the population size, $K$ is the carrying capacity, $k$ is the growth rate, $a$ is the scaling parameter and $t_m$ is the turning point. the method used in this research is the richards curve method. with an estimation rmse of 75.1057, the peak of the spread of covid-19 cases in maluku province is predicted to occur on october 22, 2020, with a total of 3,623 cases, and the spread is predicted to end on may 25, 2023, with a total of 9,451 cases. this research gives the government an overview of the predicted development of covid-19, which can support future decision-making.

keywords: carrying capacity; covid-19; prediction; richards curve; turning point

introduction

coronavirus is a group of viruses from the subfamily orthocoronavirinae in the coronaviridae family and the order nidovirales.
this group of viruses can cause disease in birds and mammals, including humans [1]. in 2002, the sars-cov coronavirus (sars coronavirus) caused severe acute respiratory syndrome (sars) in guangdong, china [2]. in 2012, the mers-cov coronavirus (mers coronavirus) caused middle east respiratory syndrome (mers) in saudi arabia and the middle east [3]. in early 2020, who (world health organization) received a report from china that there were 44 patients with severe pneumonia in wuhan city, hubei province, china [4]. subsequent research showed a close relationship with the coronavirus that caused sars in 2002 [5]. on february 11, 2020, who introduced the term covid-19 (coronavirus disease 2019) for this infectious disease, which is similar to influenza and is caused by severe acute respiratory syndrome coronavirus 2 (sars-cov-2) [6], [7]. the first covid-19 cases in indonesia were reported on march 23, 2020, with two cases. data on march 31, 2020, showed 1,528 confirmed cases and 136 deaths.

in 1839 verhulst introduced the logistic equation to model population growth, which became known as the verhulst equation and was rediscovered in 1912 [8], [9]. in 1959, in a paper entitled "a flexible growth function for empirical use", richards modified the verhulst equation; the result became known as the richards curve [10] or the generalized logistic function [11], because it is an extension of the logistic model [12], [13]. in some literature, the richards curve is also called the theta-logistic model [14], [15]. its parameters are $K$ (carrying capacity), $k$ (growth rate), $t_m$ (inflection point) and $a$ (scaling parameter). the shape of the richards curve resembles that of the exponential curve [16].
the richards curve is a population growth curve for conditions in which growth is not symmetrical about the inflection point [17], [18]. in 2004 the richards curve was used to predict the spread of sars in singapore, hong kong and beijing [19]. according to the richards curve estimates, the spread of sars was predicted to end in beijing on june 27, 2003 with a total of 2,595 cases, in hong kong on june 29, 2003 with a total of 1,748 cases, and in singapore on may 28, 2003 with a total of 207 cases. these predictions were considered quite successful because, according to the reported data, singapore last reported sars cases on may 18, 2003 with a total of 206 cases, hong kong on june 11, 2003 with a total of 1,755 cases, and beijing on june 11, 2003 with a total of 2,631 cases. the richards curve has also been widely used in other studies [20]–[22]. in 2020 it was used to predict the spread of covid-19 in the province of south sulawesi, indonesia, where the peak of the spread was predicted to occur from mid-june to july 2020 with a total of 10,000-12,000 cases, and the end of the spread was predicted for the end of november 2020. based on this background, in which the richards curve performed quite well in predicting the spread of sars in singapore, hong kong and beijing, this study uses the richards curve to predict the spread of covid-19 in maluku province.

methods

in general, the differential form of the richards curve is [10], [23]:

$$\frac{dI}{dt} = rI\left[1 - \left(\frac{I}{K}\right)^a\right] \tag{1}$$

where $I$ is the population size, $K$ is the carrying capacity, $r$ is the growth rate and $a$ is the scaling parameter.
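to see how equation (1) behaves before solving it analytically, it can be integrated numerically. the following is a minimal sketch using a hand-written 4th-order runge-kutta step; the function names and the parameter values ($r = 0.2$, $K = 1000$, $a = 0.5$, $I(0) = 1$) are hypothetical and are not estimates from this study.

```python
def richards_rhs(I, r, K, a):
    """Right-hand side of the Richards equation (1)."""
    return r * I * (1.0 - (I / K) ** a)


def rk4_integrate(I0, r, K, a, t_end, dt):
    """Integrate equation (1) from t = 0 to t_end with fixed-step RK4."""
    I = I0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = richards_rhs(I, r, K, a)
        k2 = richards_rhs(I + 0.5 * dt * k1, r, K, a)
        k3 = richards_rhs(I + 0.5 * dt * k2, r, K, a)
        k4 = richards_rhs(I + dt * k3, r, K, a)
        I += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return I


# Hypothetical values, for illustration only.
r, K, a, I0 = 0.2, 1000.0, 0.5, 1.0
for t_end in (25.0, 50.0, 100.0, 400.0):
    print(t_end, rk4_integrate(I0, r, K, a, t_end, dt=0.1))
```

the printed values rise slowly at first, then quickly, and finally level off near $K$, which is the s-shaped behavior that the closed-form solution derived in this section describes.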
to solve equation (1), we separate the variables:

$$\frac{K^a\,dI}{I\,(K^a - I^a)} = r\,dt$$

the left-hand side can be decomposed as

$$\frac{K^a}{I\,(K^a - I^a)} = \frac{A}{I} + \frac{B}{K^a - I^a}$$

matching the two sides gives $A = 1$ and $B = I^{a-1}$, so that

$$\int \frac{K^a}{I\,(K^a - I^a)}\,dI = \int \frac{1}{I}\,dI + \int \frac{I^{a-1}}{K^a - I^a}\,dI = \ln I - \frac{1}{a}\ln\left(K^a - I^a\right)$$

since $\int r\,dt = rt + c$, we can write:

$$\ln I - \frac{1}{a}\ln\left(K^a - I^a\right) = rt + c$$

or, equivalently,

$$-\frac{1}{a}\ln\left(\frac{K^a - I^a}{I^a}\right) = rt + c$$

so we get:

$$\left(\frac{K^a - I^a}{I^a}\right)^{-1/a} = e^{rt + c}$$

to simplify this form, both sides are raised to the power $-a$, which gives:

$$\frac{K^a - I^a}{I^a} = e^{-art}\,e^{-ac} \tag{2}$$

in equation (2), $a$, $r$ and $c$ are constants, so we set $k = ar$ and $Q = e^{-ac}$ and write:

$$\left(\frac{K}{I}\right)^a - 1 = Q\,e^{-kt}$$

so we get:

$$I(t) = \frac{K}{\left[1 + Q\,e^{-kt}\right]^{1/a}} \tag{3}$$

the inflection point of equation (3) occurs at $I = \dfrac{K}{(1+a)^{1/a}}$ [24]. letting $t_m$ denote the time of the inflection point, equation (3) can be written as [25]:

$$I(t) = \frac{K}{\left[1 + a\,e^{-k(t - t_m)}\right]^{1/a}} \tag{4}$$

where $I(t)$ is the population size, i.e. the total number of cases at time $t$, $K$ is the carrying capacity, i.e. the final total number of cases, $k$ is the growth rate of cases, and $t_m$ is the inflection point, i.e. the time of the peak of the spread of covid-19 cases, at which

$$I(t_m) = \frac{K}{\left[1 + a\,e^{-k(t_m - t_m)}\right]^{1/a}} = \frac{K}{(1+a)^{1/a}}$$

results and discussion

covid-19 cases in maluku province have continued to increase since the first case was reported on march 23, 2020. as of november 4, 2020, the cumulative total was 3,884 cases, comprising 551 positive patients (14.18%), 3,286 recovered patients (84.6%) and 47 deaths (1.2%). the cumulative cases and the daily new cases of covid-19 in maluku province from march 23, 2020 to november 4, 2020 are shown below:

table 1. cumulative and daily case data of covid-19 in maluku province

| date | cumulative cases | daily cases |
|---|---|---|
| march 23, 2020 | 1 | 0 |
| march 24, 2020 | 1 | 0 |
| march 25, 2020 | 1 | 0 |
| march 26, 2020 | 1 | 0 |
| march 27, 2020 | 1 | 0 |
| … | … | … |
| october 30, 2020 | 3849 | 59 |
| october 31, 2020 | 3851 | 2 |
| november 1, 2020 | 3863 | 12 |
| november 2, 2020 | 3863 | 0 |
| november 3, 2020 | 3877 | 14 |
| november 4, 2020 | 3884 | 7 |

the development of cumulative covid-19 cases in maluku province from 23 march to 4 november 2020 is shown in figure 1.

figure 1. cumulative case development

the first case of covid-19 in maluku province was reported on march 23, 2020 (1 case), and by july 5, 2020 the cumulative total had reached 794 cases, a growth rate of 755.23%. from july 6 to october 22, 2020, the average number of new cases per day increased to 26, and the average growth rate rose sharply by 1,864.953 percentage points over the previous period, to 2,620.183%. from october 23 to november 4, 2020, the average number of new daily covid-19 cases in maluku province decreased to 18 cases.
the daily increase in cases is shown in figure 2; the highest daily count in maluku province, 117 cases, occurred on october 2, 2020.

figure 2. development of daily cases of covid-19 in maluku province, 23 march – 4 november 2020

parameter estimation results

using the cumulative covid-19 case data for maluku province from march 23 to november 4, 2020, the richards function parameters were estimated with the nonlinear least-squares method in python using the following script:

    import numpy as np
    import pandas as pd
    from scipy.optimize import curve_fit

    def richardsfunction(t, K, a, k, tm):
        # richards curve, equation (4): K is the carrying capacity,
        # a the scaling parameter, k the growth rate, tm the inflection time
        return K / (1 + a * np.exp(-k * (t - tm))) ** (1 / a)

    df = pd.read_excel('coviddate_maluku.xlsx')
    data = df[0:227]                 # 227 daily observations
    y = data['cummulativecases']
    t = np.arange(1, 228, 1)         # day index t = 1, ..., 227
    # every parameter is bounded below by 0.01 and unbounded above
    popt, pcov = curve_fit(richardsfunction, t, y, bounds=(0.01, np.inf))

the results obtained are:

table 2. richards parameter estimation results (rmse: 75.1057)

| $K$ | $a$ | $k$ | $t_m$ |
|---|---|---|---|
| 9451.245 | 0.085 | 0.01 | 213.918 |

substituting the parameter values $K$, $a$, $k$ and $t_m$ into equation (4) gives the richards equation:

$$I(t) = \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(t - 213.918)}\right]^{1/0.085}}$$

the comparison between the cumulative covid-19 cases predicted by the estimated richards function and the actual data for $t \in [1, 227]$ is shown below:

figure 3. comparison of the results of predictions of cumulative cases of covid-19 in maluku province with actual data

in figure 3, comparing the actual cumulative data with the data predicted by the richards function for $t = 1$ to $t = 227$ gives rmse = 75.1057, while the logistic function gives a larger rmse of 85.1813.
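the rmse used for this comparison is the square root of the mean squared difference between predicted and actual values. the sketch below illustrates the computation under two assumptions: it uses the rounded parameter values from table 2, and it uses only the six most recent observations listed in table 1 (days 222 to 227), because the full 227-day series is not reproduced here. the printed rmse is therefore not the 75.1057 reported for the whole series.

```python
import numpy as np


def richards(t, K, a, k, tm):
    """Richards curve, equation (4)."""
    return K / (1.0 + a * np.exp(-k * (t - tm))) ** (1.0 / a)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# Rounded estimates from table 2.
K, a, k, tm = 9451.245, 0.085, 0.01, 213.918

# Days 222..227 correspond to October 30 - November 4, 2020 (table 1).
t = np.arange(222, 228)
actual = np.array([3849, 3851, 3863, 3863, 3877, 3884])

predicted = richards(t, K, a, k, tm)
print(np.round(predicted, 1))
print("rmse over these six days:", rmse(actual, predicted))
```

the same `rmse` function could be applied to a fitted logistic curve to reproduce the kind of comparison described above, provided the full data series is available.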
the errors between the predicted results and the actual data are listed in table 3:

table 3. the error value of the predicted data with the actual data

| t | actual | predict | error |
|---|---|---|---|
| 1 | 1 | 16.65457518205870 | 15.65457518205870 |
| 2 | 1 | 17.48844921400830 | 16.48844921400830 |
| 3 | 1 | 18.35884011574600 | 17.35884011574600 |
| 4 | 1 | 19.26706628665680 | 18.26706628665680 |
| 5 | 1 | 20.21448005719650 | 19.21448005719650 |
| … | … | … | … |
| 222 | 3849 | 3889.25714338151000 | 40.25714338151370 |
| 223 | 3851 | 3922.50525096264000 | 71.50525096263660 |
| 224 | 3863 | 3955.72658861666000 | 92.72658861666100 |
| 225 | 3863 | 3988.91835825547000 | 125.91835825547500 |
| 226 | 3877 | 4022.07779333937000 | 145.07779333936700 |
| 227 | 3884 | 4055.20215931066000 | 171.20215931065600 |

from figure 3, the richards curve obtained from the estimation results can be drawn as follows:

figure 4. richards curve of estimation results

from figure 4, let $I(t_i)$ be the total cumulative cases on day $i$ and $I(t_{i-1})$ the total cumulative cases on day $i-1$; then the number of new daily cases can be formulated as follows [26]:

$$J(t_i) = I(t_i) - I(t_{i-1}); \quad i = 1, 2, 3, \ldots \tag{5}$$

the comparison between the predicted and actual daily covid-19 cases in maluku province is shown below:

figure 5. comparison of the results of daily covid-19 case predictions in maluku province with actual data

from figure 5, the predicted daily covid-19 cases in maluku province are as follows:

figure 6. daily cases of covid-19 in maluku province from estimated result

turning point of the case spread

from the richards parameter estimation with the covid-19 case data for maluku province, the value of the parameter $t_m$ is 213.918, meaning that the turning point of the spread of covid-19 in maluku province is predicted to occur on the 214th day, where the total cases on the 214th day are obtained
from the equation:

$$I(214) = \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(214 - 213.918)}\right]^{1/0.085}}$$

which gives 3,622.654, meaning that the total number of cases at the inflection point is 3,623, as illustrated below:

figure 7. inflection point

the number of new daily cases at the inflection point, i.e. at $t = t_m$, is:

$$J(214) = I(214) - I(213) = \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(214 - 213.918)}\right]^{1/0.085}} - \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(213 - 213.918)}\right]^{1/0.085}} = 33.358161$$

so the number of new daily cases at the inflection point is 33. based on the estimation results, the turning point of covid-19 cases in maluku province is therefore $(t, J(t)) = (214, 33)$, as illustrated below:

figure 8. turning point

in figure 8, the point $(t, J(t)) = (214, 33)$, which is the turning point of the curve, is also the peak of the curve, at $t = 214$.

end of the case spread

from the richards parameter estimation with the covid-19 case data for maluku province, the value of the parameter $K$ is 9451.245, meaning that the final total number of covid-19 cases in maluku province is predicted to be 9,451. let $t_{end}$ be the end time of covid-19 cases in maluku province, taken as the time at which the total reaches 9,450.5 cases, i.e. $I(t_{end}) = 9450.5$. the value of $t_{end}$ is then obtained from the equation:

$$9450.5 = \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(t_{end} - 213.918)}\right]^{1/0.085}}$$

which gives $t_{end} = 1158.681$, meaning that the end of covid-19 cases in maluku province is predicted to occur on the 1,159th day.

figure 9.
total cases when $t \ge 1159$

from figure 9, when $t \ge 1159$ the population size no longer changes appreciably and only moves towards the value of $K$, the carrying capacity.

conclusions

from the estimation of the richards function parameters with the cumulative covid-19 case data for maluku province, the richards equation for predicting the spread of covid-19 in maluku province is:

$$I(t) = \frac{9451.245}{\left[1 + 0.085\,e^{-0.01(t - 213.918)}\right]^{1/0.085}}$$

the turning point, or peak, of the spread of covid-19 in maluku province is predicted to occur on october 22, 2020 with a total of 3,623 cases, while the end of the spread is predicted to occur on may 25, 2023 with 9,451 cases.

references
[1] n. r. yunus and a. rezki, “kebijakan pemberlakuan lock down sebagai antisipasi penyebaran corona virus covid-19,” salam j. sos. dan budaya syar-i, 2020, doi: 10.15408/sjsbs.v7i3.15083.
[2] j. s. m. peiris et al., “coronavirus as a possible cause of severe acute respiratory syndrome,” lancet, 2003, doi: 10.1016/s0140-6736(03)13077-2.
[3] a. zumla, d. s. hui, and s. perlman, “middle east respiratory syndrome,” the lancet. 2015, doi: 10.1016/s0140-6736(15)60454-8.
[4] h. diah, h. d. rendra, i. fathiyah, e. burhan, and a. heidy, “penyakit virus corona 2019,” j. respirologi indones., 2020.
[5] c. ceraolo and f. m. giorgi, “genomic variance of the 2019-ncov coronavirus,” j. med. virol., 2020, doi: 10.1002/jmv.25700.
[6] a. e. gorbalenya et al., “the species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it sars-cov-2,” nature microbiology. 2020, doi: 10.1038/s41564-020-0695-z.
[7] l. lin, l. lu, w. cao, and t.
li, “hypothesis for potential pathogenesis of sars-cov-2 infection–a review of immune changes in patients with viral pneumonia,” emerging microbes and infections. 2020, doi: 10.1080/22221751.2020.1746199.
[8] p. f. verhulst, “notice sur la loi que la population suit dans son accroissement,” corresp. mathématique phys., 1838.
[9] a. g. mckendrick and m. k. pai, “xlv. the rate of multiplication of micro-organisms: a mathematical study,” proc. r. soc. edinburgh, vol. 31, pp. 649–653, 1912.
[10] f. j. richards, “a flexible growth function for empirical use,” j. exp. bot., vol. 10, no. 2, pp. 290–301, 1959.
[11] j. a. nelder, “182. note: an alternative form of a generalized logistic equation,” biometrics, 1962, doi: 10.2307/2527907.
[12] r. pearl and l. j. reed, “the logistic curve and the census count of 1930,” science (80-. )., 1930, doi: 10.1126/science.72.1868.399-a.
[13] s. y. lee, b. lei, and b. mallick, “estimation of covid-19 spread curves integrating global data and borrowing information,” plos one, 2020, doi: 10.1371/journal.pone.0236860.
[14] m. e. gilpin and f. j. ayala, “global models of growth and competition,” proc. natl. acad. sci. u. s. a., 1973, doi: 10.1073/pnas.70.12.3590.
[15] j. v. ross, “a note on density dependence in population models,” ecol. modell., 2009, doi: 10.1016/j.ecolmodel.2009.08.024.
[16] n. r. lambe, e. a. navajas, g. simm, and l. bünger, “a genetic investigation of various growth models to describe growth of lambs of two contrasting breeds,” j. anim. sci., 2006, doi: 10.2527/jas.2006-041.
[17] g. a. f. seber and c. j. wild, nonlinear regression. hoboken, new jersey: john wiley & sons, 2003.
[18] h. anton, calculus: with analytic geometry, 1980.
[19] g. zhou and g. yan, “severe acute respiratory syndrome epidemic in asia.,” emerg. infect. dis., vol. 9, no. 12, pp. 1608–1610, 2003.
[20] h. nishiura, s. tsuzuki, b. yuan, t. yamaguchi, and y.
asai, “transmission dynamics of cholera in yemen, 2017: a real time forecasting,” theor. biol. med. model., 2017, doi: 10.1186/s12976-017-0061-x.
[21] r. zreiq, s. kamel, s. boubaker, a. a. al-shammary, f. d. algahtani, and f. alshammari, “generalized richards model for predicting covid-19 dynamics in saudi arabia based on particle swarm optimization algorithm,” aims public heal., vol. 7, no. 4, p. 828, 2020.
[22] k. roosa et al., “short-term forecasts of the covid-19 epidemic in guangdong and zhejiang, china: february 13–23, 2020,” j. clin. med., 2020, doi: 10.3390/jcm9020596.
[23] r. b. banks, growth and diffusion phenomena: mathematical frameworks and applications, vol. 14. springer science & business media, 1993.
[24] a. t. goshu, “derivation of inflection points of nonlinear regression curves implications to statistics,” am. j. theor. appl. stat., 2013, doi: 10.11648/j.ajtas.20130206.25.
[25] y.-h. hsieh, “richards model: a simple procedure for real-time prediction of outbreak severity,” 2009.
[26] m. höök, j. li, n. oba, and s. snowden, “descriptive and predictive growth curves in energy system analysis,” natural resources research. 2011, doi: 10.1007/s11053-011-9139-z.
forecasting population of madiun regency using arima method

cauchy – jurnal matematika murni dan aplikasi
volume 7(3) (2022), pages 420-431
p-issn: 2086-0382; e-issn: 2477-3344
submitted: may 26, 2022
reviewed: july 19, 2022
accepted: july 25, 2022
doi: http://dx.doi.org/10.18860/ca.v7i3.16156

yuniar farida*, mayandah farmita, nurissaidah ulinnuha, dian yuliati
department of mathematics, faculty of science and technology, sunan ampel state islamic university surabaya, indonesia
email: yuniar_farida@uinsby.ac.id

abstract

the high population growth of madiun regency can lead to a population density with implications for social conditions, the economy, welfare, security, land availability, clean water availability, and food needs. this study aims to predict the population growth of madiun regency using the arima method. the arima method is popular for forecasting time series data and is considered reliable because the calculation proceeds in stages. the arima method covers three models, namely ar (autoregressive), ma (moving average), and arma (autoregressive moving average). this study uses annual population data of madiun regency from 1983 to 2021 and produces an arima (0,2,1) forecasting model with a mape value of 8.42%. the population is predicted to increase by 17947 people, or 2.39%, from 2022 to 2024. the results are expected to inform the madiun regency government in anticipating problems caused by the future population level of madiun regency.

keywords: arima; forecasting; population; time series analysis

introduction

one of the most important problems globally is the high population growth in developing countries [1], [2]. indonesia is listed as one of the five most populous countries in the world.
indonesia ranks fourth after china, india, and the united states and is among the most populous countries on the asian continent [3]. according to the 2020 population census, the population of indonesia reached 270.20 million. indonesia has a land area of 1.9 million km² and a population density of 141 people per km², with an average annual population growth rate of 1.25% between 2010 and 2020 [4]. in indonesia, java is the most populous island, and east java is the second-most populous province after west java, with 40.67 million people [4]. east java province consists of several cities and regencies, one of which is madiun regency, whose population growth rate ranked fifth in 2020 and whose population grows every year. in 2015 the population was 676,087 people, and by 2019 it had reached 749,070 [5]. population growth in madiun regency is driven by the high birth rate. in 2020, the number of births in madiun regency was 3928, with a population growth rate of 0.92% [5].

high population growth can cause various problems in areas such as regional spatial planning, housing, employment, education, the economy, and security. in addition, it can cause problems in social aspects, welfare, the availability of clean water, and food needs, and can cause environmental damage [1], [6], [7]. population density, which can cause such problems, needs to be anticipated through predictions so that handling strategies can be prepared. several statistical and mathematical forecasting models can be used, including exponential smoothing, moving average, and arima (box-jenkins) models. other forecasting models are based on artificial intelligence, such as neural networks, genetic algorithms, simulated annealing, and classification [8].
a previous study by xu et al. predicted the population of beijing's main areas using a long short-term memory (lstm) model with a mape of 4.35% [9]. pakpahan, basani, and hariani predicted the population of residents in east kalimantan using exponential smoothing and obtained a mape of 14.81% [10]. other studies predicted population using the arima method, including mardiyah et al. in pasuruan city [11], nyoni, mutongi, and munyaradzi in the gambia [12], and nyoni in zimbabwe, with mape < 3.94% [13]. based on these studies, the authors are interested in the arima method, which achieved excellent accuracy in several cases of population forecasting. the arima method is flexible, straightforward to apply, and accurate for short-term prediction, but its accuracy for long-term forecasting is poor, and its forecasts usually tend to flatten out over long horizons [14], [15].

beyond population forecasting, arima is widely used in other case studies. alabdulrazzaq et al. predicted the spread of covid-19 with a mape of 4.2% [16]. swaraj et al. predicted covid-19 cases in india with a mape of 4.7% [17]. guha and bandyopadhyay predicted the price of gold with a mape of 3.25% [18]. banerjee forecast the indian stock market with a mape of 3.33% [19]. grigonytė and butkevičiūtė predicted wind speed in latvia with a mape of 1% [20].

based on the explanation above, this study uses the arima method to predict the population of madiun regency. this research is expected to provide information for the madiun regency government in taking policy steps to minimize the risks of the regency's high rate of population growth.
methods

the data

the data used in this study are the population of madiun regency from 1983 to 2021, taken from the central bureau of statistics of madiun regency via the website https://bit.ly/pendudukkabmadiun [21].

table 1. the population of madiun regency

| no. | year | population |
|---|---|---|
| 1 | 1983 | 638586 |
| 2 | 1984 | 627467 |
| ⋮ | ⋮ | ⋮ |
| 38 | 2020 | 744350 |
| 39 | 2021 | 750143 |

arima

box and jenkins first developed the arima model in the 1970s [22]. arima is one of the econometric methods used to predict univariate time-series data. box and jenkins state that this model does not use independent variables but instead uses the information in the series itself to generate predicted values. therefore, the arima model requires autocorrelation in the series. autocorrelation is the correlation between two observations at different points in a time series; in other words, the time series is correlated with itself. time series models in the arima method include autoregressive (ar), moving average (ma), and autoregressive moving average (arma) [23], [24].

the box-jenkins arima analysis begins with a time series plot and an acf plot to determine whether the data are stationary in mean and in variance. if the data are not stationary in mean, differencing must be applied; if the data are not stationary in variance, a box-cox transformation is performed. these steps are repeated until the data are stationary. once the data are stationary, the next step is to identify tentative arima models from the acf and pacf plots. the parameters of each tentative model are then tested for significance, and the ljung-box test is used to check whether the residuals are white noise. in summary, arima forecasting has four stages: model identification, parameter estimation, diagnostic testing, and prediction.
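the differencing, autocorrelation, and ljung-box steps named in this workflow can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy series are hypothetical, the ljung-box statistic is written in its standard form $Q = n(n+2)\sum_{k=1}^{K} \hat{\rho}_k^2/(n-k)$, and no p-value is computed because that would require a chi-square distribution from an external library.

```python
def difference(series, d=1):
    """Apply d rounds of first differencing: z'_t = z_t - z_(t-1)."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a non-constant series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k r_k^2 / (n - k)."""
    n = len(series)
    acf = sample_acf(series, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical toy series with a quadratic trend (NOT the Madiun data).
toy = [100 + 3 * t + 0.5 * t * t for t in range(12)]
once = difference(toy, d=1)    # still trending, so not stationary in mean
twice = difference(toy, d=2)   # constant, so the trend has been removed

lag1 = sample_acf(once, 3)[0]  # large lag-1 autocorrelation signals a trend
q_stat = ljung_box_q(once, 3)
print(twice, round(lag1, 3), round(q_stat, 2))
```

in this toy case two rounds of differencing are needed before the series stops trending, which mirrors the role of d in arima (p, d, q). in practice a statistics package would normally be used for these steps.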
model identification

to identify an arima model, the data must meet the stationarity requirement. if the data are not stationary, they must first be made stationary in both variance and mean [25]. stationarity in variance is obtained with the following transformation [26].

\[ T(Z_t) = \frac{Z_t^{\lambda} - 1}{\lambda} \tag{1} \]

where:
$T(Z_t)$ : transformed data value
$Z_t$ : data value at time $t$
$\lambda$ : estimated value of the transformation parameter

the transformation applied is determined by the value of $\lambda$. table 2 shows some commonly used $\lambda$ values and the associated transformations.

table 2. $\lambda$ values and transformations

| $\lambda$ value | transformation |
|---|---|
| -1.0 | $1/Z_t$ |
| -0.5 | $1/\sqrt{Z_t}$ |
| 0.0 | $\ln Z_t$ |
| 0.5 | $\sqrt{Z_t}$ |
| 1.0 | $Z_t$ |

for time-series data that are not yet stationary in mean, differencing is applied: each observation is replaced by its difference from the preceding observation. the differencing equation is as follows.

\[ Z'_t = Z_t - Z_{t-1} \tag{2} \]

$Z'_t$ is the differenced data value and $Z_t$ is the data value at time $t$. once the data are stationary, a tentative arima (p, d, q) model is obtained, where p is the lag that crosses the significance limit on the partial autocorrelation function (pacf) plot, d is the number of times differencing is performed, and q is the lag that crosses the significance limit on the autocorrelation function (acf) plot.

an autoregressive model is a model in which the variable is explained by its own past values, since only a single series is used. in general, an ar model of order p, $AR(p)$, has the following form [27].

\[ X_t = \phi_0 + \phi_1 X_{t-1} + \phi_2 X_{t-2} + \cdots + \phi_p X_{t-p} + \alpha_t \tag{3} \]

where:
$X_t$ : time series at time $t$
$X_{t-i}$ : time series at time $t-i$
$\alpha_t$ : error value at time $t$
$\phi_0$ : constant
$\phi_i$ : autoregressive coefficients

a moving average model is a model that measures autocorrelation between the error or residual values. in general, an ma model of order q, $MA(q)$, has the following form [28].

\[ X_t = \alpha_t - \theta_1 \alpha_{t-1} - \theta_2 \alpha_{t-2} - \cdots - \theta_q \alpha_{t-q} \tag{4} \]

where:
$\alpha_{t-i}$ : error value at time $t-i$
$\theta_i$ : moving average coefficients

the autoregressive moving average model, arma (p, q), has the following general equation [29].

\[ X_t = \phi_0 + \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \alpha_t - \theta_1 \alpha_{t-1} - \cdots - \theta_q \alpha_{t-q} \tag{5} \]

where:
$X_t$ : stationary time series

for the autoregressive integrated moving average (arima) model, the data used must be stationary. the general form of arima is as follows [30].

\[ \phi_p(B)(1 - B)^d Z_t = \theta_0 + \theta_q(B)\,\alpha_t \tag{6} \]

where:
$\phi_p$ : autoregressive process
$\theta_q$ : moving average process
$(1 - B)^d$ : differencing operator
$d$ : differencing parameter
$B$ : backshift operator
$Z_t$ : deviation of the process from its mean

parameter estimation

after a tentative model is determined, its parameters are estimated and tested for significance to find the best model. the hypotheses of the significance test are as follows [29].

$H_0: \theta = 0$ (the parameter is not significant)
$H_1: \theta \neq 0$ (the parameter is significant)

\[ t_{count} = \frac{\hat{\theta}}{SE(\hat{\theta})} \tag{7} \]

where:
$\hat{\theta}$ : estimate of an autoregressive or moving average parameter
$SE(\hat{\theta})$ : standard error of the estimate

diagnostic test

diagnostic tests determine whether the model is adequate. a good model has residuals that satisfy the white noise assumption, which is tested with the ljung-box statistic as follows [6], [29].

\[ Q = n(n + 2) \sum_{k=1}^{K} \frac{\hat{\rho}_k^2}{n - k} \tag{8} \]

where:
$\hat{\rho}_k$ : residual autocorrelation at lag $k$
$Q$ : ljung-box test statistic
$k$ : lag
$n$ : number of observations
$K$ : maximum lag tested

prediction accuracy value

the results produced by the arima model are assessed in terms of forecast accuracy. accuracy is measured with the mean absolute percentage error (mape), which is calculated with the following formula [16], [31].
\[ MAPE = \frac{\sum_{t=1}^{n} |PE_t|}{n} \tag{9} \]

with

\[ PE_t = \frac{e_t}{Z_t} \times 100 \tag{10} \]

where:
$PE_t$ : percentage error at time $t$
$e_t$ : error value at time $t$
$Z_t$ : actual data at time $t$

the quality of the prediction is indicated by the mape value, which is interpreted in four categories: excellent (mape < 10%), good (mape 11%–20%), good enough (mape 21%–50%), and not good (mape > 50%).

results and discussion

based on table 1, a time series plot is made to determine the arima model and to identify the stationarity of the data.

figure 1. plot of the population data of madiun regency

figure 1 shows that the data follow an upward (positive) trend. the data are not stationary; for example, 2019 shows an increase of 67,676 people over the previous year. data are stationary if there is no increase or decrease in variance and mean.

figure 2. box-cox transformation plot

figure 2 shows that the lambda value is equal to 1, so the data can be considered stationary in variance. stationarity in mean can be assessed from the acf plot and the time series plot. the data do not yet have a fixed pattern.

figure 3. acf plot of madiun regency population

in figure 3, the autocorrelations decay slowly across lags. the time series plot also has no fixed pattern, so the data are not stationary in mean. differencing is therefore needed to make the data stationary.

figure 4. acf plot after differencing

figure 4 shows that the data are stationary in mean. once the data are stationary, the next step is to plot the partial autocorrelation function (pacf).

figure 5. pacf plot after differencing

figures 4 and 5 show no significant autocorrelation, so $MA(q) = 0$ and $AR(p) = 0$, which gives the tentative model arima (0,2,0). this model behaves like a random walk, in which the autocorrelation coefficient is equal to 1, so the tentative arima models considered are arima (1,2,0), (0,2,1), and (1,2,1). a parameter significance test and a residual white noise test are carried out to choose the model used for prediction. parameter significance is judged from the p-value: if the p-value is less than 0.05, the parameter is significant. the significance test results for the tentative models arima (1,2,0), (0,2,1), and (1,2,1) are as follows.

table 3. significance test results

| model | parameter | coef | se coef | t-value | p-value |
|---|---|---|---|---|---|
| arima (1,2,0) | ar (1) | -0.580 | 0.139 | -4.19 | 0.000 |
| arima (0,2,1) | ma (1) | 0.950 | 0.131 | 7.26 | 0.000 |
| arima (1,2,1) | ar (1) | -0.221 | 0.180 | -1.23 | 0.228 |
| | ma (1) | 0.946 | 0.122 | 7.73 | 0.000 |

table 3 shows that the arima (1,2,1) model is not significant because the p-value of its ar (1) parameter is more than 0.05, while the arima (1,2,0) and (0,2,1) models are significant because their p-values are less than 0.05. after the parameter significance test, a residual white noise test is performed with the ljung-box test to determine which model to use for prediction. if the p-value is more than 0.05, the model meets the white noise requirement. the ljung-box test results for the arima (1,2,0), (0,2,1), and (1,2,1) models are as follows.

table 4. ljung-box test results

| model | lag | chi-square | df | p-value |
|---|---|---|---|---|
| arima (1,1,1) | 12 | 5.34 | 10 | 0.867 |
| | 24 | 11.17 | 22 | 0.972 |
| | 36 | 23.15 | 34 | 0.920 |
| arima (2,1,1) | 12 | 1.83 | 10 | 0.998 |
| | 24 | 7.23 | 24 | 0.999 |
| | 36 | 11.94 | 36 | 1.000 |
| arima (0,1,2) | 12 | 0.53 | 9 | 1.000 |
| | 24 | 6.15 | 21 | 0.999 |
| | 36 | 11.41 | 33 | 1.000 |

table 4 shows that the residuals of the arima (1,2,0), (0,2,1), and (1,2,1) models are white noise because the p-values are more than 0.05. after the white noise test, the mape value of each tentative arima model is determined.

table 5. mape value

| model | mape |
|---|---|
| arima (1,2,0) | 12.45% |
| arima (0,2,1) | 8.42% |
| arima (1,2,1) | 8.92% |

based on table 3, table 4, and table 5, among the arima models whose parameters are significant and whose residuals meet the white noise assumption, the arima (0,2,1) model has the smallest mape value. after the best model is chosen, it is used to predict the number of residents of madiun regency.

figure 6. actual data plot and forecasting

figure 6 shows that there is little difference between the actual data and the forecast, and the resulting mape is below 10%, which means that the prediction model is already very good. the prediction of the number of residents of madiun regency for the next three years (2022 to 2024) is presented in table 6 below.

table 6. population prediction

| year | population |
|---|---|
| 2022 | 758561 |
| 2023 | 767346 |
| 2024 | 776508 |

table 6 shows that the population of madiun regency from 2022 to 2024 is predicted to increase by 17947 people or 2.39%. the predictions show an upward trend every year. population growth, if accompanied by an increase in the quality of human resources, becomes a regional potential for development. on the other hand, if the increase in population is not accompanied by good-quality human resources, it becomes a burden for regional development [32].
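the mape definition and the growth figures quoted from table 6 can be checked with a short python sketch. this is an illustration under stated assumptions rather than the authors' computation: the function names are hypothetical, the category bands are closed at their upper ends to bridge the gaps between the printed ranges, and the only data used are the 2021 value from table 1 and the three predictions from table 6. under these inputs, the increase of 17947 people equals the 2024 prediction minus the 2022 prediction; it is about 2.37% of the 2022 prediction and about 2.39% of the 2021 population.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent: mean of |e_t / z_t| * 100."""
    pe = [(a - f) / a * 100 for a, f in zip(actual, forecast)]
    return sum(abs(p) for p in pe) / len(pe)


def mape_category(value):
    """Map a MAPE value (percent) to the four bands used in the text.

    Assumption: each band is closed at its upper end (<= 20, <= 50).
    """
    if value < 10:
        return "excellent"
    if value <= 20:
        return "good"
    if value <= 50:
        return "good enough"
    return "not good"


# Values copied from Table 1 (2021) and Table 6 (2022-2024).
population_2021 = 750143
predicted = {2022: 758561, 2023: 767346, 2024: 776508}

increase = predicted[2024] - predicted[2022]
pct_of_2022 = increase / predicted[2022] * 100
pct_of_2021 = increase / population_2021 * 100

print(increase, round(pct_of_2022, 2), round(pct_of_2021, 2))
print(mape_category(8.42))  # band of the MAPE reported for ARIMA (0,2,1)

# Hypothetical two-point example of the MAPE formula itself.
example = mape([100, 200], [110, 180])
```

the two-point example has percentage errors of -10% and 10%, so its mape is 10%.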
population data prediction is needed in planning and evaluating human-oriented development, because the population is both an object and a subject of development. as an object, the population is the target of development; as a subject, the population is the actor that carries development out. the two functions are expected to go hand in hand as an integrated whole [32].

conclusions

based on the results of predicting the population of madiun regency using the arima method, it can be concluded that the best model for predicting the number of residents of madiun regency is arima (0,2,1), with a mape of 8.42%. the predicted number of residents of madiun regency in 2022 is 758561 people. the arima method is appropriate for forecasting the number of residents of madiun regency in this research because it produces an error value of less than 10%. this method can be applied to similar case studies, especially forecasting the number of residents in other areas.

references

[1] c. christiani, p. tedjo, and b. martono, “analisis dampak kepadatan penduduk terhadap kualitas hidup masyarakat provinsi jawa tengah,” ilmiah, vol. 3, no. 1, pp. 102–114, 2014.
[2] j. dai and s. chen, “the application of arima model in forecasting population data,” j. phys. conf. ser., vol. 1324, no. 1, 2019, doi: 10.1088/1742-6596/1324/1/012100.
[3] f. fejriani, m. hendrawansyah, l. muharni, s. f. handayani, and syaharuddin, “forecasting peningkatan jumlah penduduk berdasarkan jenis kelamin menggunakan metode arima,” j. kajian, penelit. dan pengemb. pendidik., vol. 8, no. 1 april, pp. 27–36, 2020.
[4] direktorat statistik kependudukan dan ketenagakerjaan, potret sensus penduduk 2020 menuju satu data kependudukan indonesia. jakarta: bps ri, 2021.
[5] b. k. madiun, kabupaten madiun dalam angka 2021. madiun: bps kabupaten madiun, 2021.
[6] haslina, hasmah, k. w. fitriani, m.
asbar, and asrirawan, “penerapan metode arima (autoregressive integrated moving average) box jenkins untuk memprediksi pertambahan jumlah penduduk transmigran (jawa dan bali) di kecamatan sukamaju, kabupaten luwu utara propinsi sulawesi selatan,” dinamika, vol. 9, no. 1, pp. 55–67, 2018.
[7] h. yoshikura, “negative impacts of large population size and high population density on the progress of measles elimination,” jpn. j. infect. dis., vol. 65, no. 5, pp. 450–454, 2012, doi: 10.7883/yoken.65.450.
[8] n. l. a. k. yuniastari and i. w. w. wirawan, “peramalan permintaan produk perak menggunakan metode simple moving average dan single exponential smoothing,” sist. dan inform., vol. 9, no. 1, pp. 97–106, 2016.
[9] z. xu, j. li, z. lv, y. wang, l. fu, and x. wang, “a graph spatial-temporal model for predicting population density of key areas,” comput. electr. eng., vol. 93, no. may, p. 107235, 2021, doi: 10.1016/j.compeleceng.2021.107235.
[10] h. s. pakpahan, y. basani, and r. r. hariani, “prediksi jumlah penduduk miskin kalimantan timur menggunakan single dan double exponential smoothing,” inform. mulawarman j. ilm. ilmu komput., vol. 15, no. 1, pp. 47–51, 2020.
[11] i. mardiyah, w. d. utami, d. c. r. novitasari, m. hafiyusholeh, and d. sulistiyawati, “analisis prediksi jumlah penduduk di kota pasuruan menggunakan metode arima,” ilmu mat. dan terap., vol. 15, no. 3, pp. 525–534, 2021.
[12] t. nyoni, c. mutongi, and n. munyaradzi, “population dynamics in gambia: an arima approach,” munich pers. repec arch., 2019.
[13] t. nyoni, “the population question in zimbabwe : reliable projections from the box jenkins arima approach,” munich pers. repec arch., pp. 0–15, 2019.
[14] m. s. k. abhilash, a. thakur, d. gupta, and b. sreevidya, “time series analysis of air pollution in bengaluru using arima model,” adv. intell. syst. comput., vol. 696, pp. 413–426, 2018, doi: 10.1007/978-981-10-7386-1_36.
[15] nurviana, r. p. sari, u. nabilla, and t. talib, “forecasting rice paddy production in aceh using arima and exponential smoothing models,” cauchy, vol. 7, no. 2, pp. 281–292, 2022.
[16] h. alabdulrazzaq, m. n. alenezi, y. rawajfih, b. a. alghannam, a. a. al-hassan, and f. s. al-anzi, “on the accuracy of arima based prediction of covid-19 spread,” results phys., vol. 27, p. 104509, 2021, doi: 10.1016/j.rinp.2021.104509.
[17] a. swaraj, k. verma, a. kaur, g. singh, a. kumar, and l. melo de sales, “implementation of stacking based arima model for prediction of covid-19 cases in india,” j. biomed. inform., vol. 121, no. august 2020, p. 103887, 2021, doi: 10.1016/j.jbi.2021.103887.
[18] b. guha and g. bandyopadhyay, “gold price forecasting using arima model,” j. adv. manag. sci., no. march, pp. 117–121, 2016, doi: 10.12720/joams.4.2.117-121.
[19] d. banerjee, “forecasting of indian stock market using time-series arima model,” int. conf. bus. inf. manag. icbim 2014, pp. 131–135, 2014, doi: 10.1109/icbim.2014.6970973.
[20] e. grigonytė and e. butkevičiūtė, “short-term wind speed forecasting using arima model,” energetika, vol. 62, no. 1–2, pp. 45–55, 2016, doi: 10.6001/energetika.v62i1-2.3313.
[21] bps kabupaten madiun, kabupaten madiun dalam angka madiun regency in figures 2021. 2021.
[22] s. ozturk and f. ozturk, “forecasting energy consumption of turkey by arima model,” j. asian sci. res., vol. 8, no. 2, pp. 52–60, 2018, doi: 10.18488/journal.2.2018.82.52.60.
[23] j. sun, “forecasting covid-19 pandemic in alberta, canada using modified arima models,” comput. methods programs biomed. updat., vol. 1, no. september, p. 100029, 2021, doi: 10.1016/j.cmpbup.2021.100029.
[24] c. b. a. satrio, w. darmawan, b. u. nadia, and n. hanafiah, “time series analysis and forecasting of coronavirus disease in indonesia using arima model and prophet,” procedia comput. sci., vol. 179, no. 2020, pp. 524–532, 2021, doi: 10.1016/j.procs.2021.01.036.
[25] d. didiharyono and m. syukri, “forecasting with arima model in anticipating open unemployment rates in south sulawesi,” int. j. sci. technol. res., vol. 9, no. 3, pp. 3838–3841, 2020.
[26] d. s. domingos, j. f. l. de oliveira, and p. s. g. de mattos neto, “an intelligent hybridization of arima with machine learning models for time series forecasting,” knowledge-based syst., vol. 175, pp. 72–86, 2019, doi: 10.1016/j.knosys.2019.03.011.
[27] f. a. chyon, m. n. h. suman, m. r. i. fahim, and m. s. ahmmed, “time series analysis and predicting covid-19 affected patients by arima model using machine learning,” j. virol. methods, vol. 301, no. december 2021, p. 114433, 2021, doi: 10.1016/j.jviromet.2021.114433.
[28] n. ulinnuha and y. farida, “prediksi cuaca kota surabaya menggunakan autoregressive integrated moving average (arima) box jenkins dan kalman filter,” j. mat. “mantik,” vol. 4, no. 1, pp. 59–67, 2018, doi: 10.15642/mantik.2018.4.1.59-67.
[29] t. yunita, “peramalan jumlah penggunaan kuota internet menggunakan metode autoregressive integrated moving average ( arima ),” j. math. theory appl., vol. 1, no. 2, pp. 16–22, 2019.
[30] l. wulandari, y. farida, a. fanani, and m. syai’in, “optimization of autoregressive integrated moving average (arima) for forecasting indonesia sharia stock of index (issi) using kalman filter,” pp. 295–303, 2020, doi: 10.5220/0008906902950303.
[31] m. b. s. junianto, “fuzzy inference system mamdani dan the mean absolute percentage error (mape) untuk prediksi permintaan dompet pulsa pada xl axiata depok,” j. inform. univ. pamulang, vol. 2, no. 2, p. 97, 2017, doi: 10.32493/informatika.v2i2.1511.
[32] p. k. madiun, data demografi, ekonomi dan sosial budaya kota madiun 2017. madiun: pemerintah kota madiun, 2017.
optimal prevention and treatment control on sveir type model spread of covid-19

cauchy – jurnal matematika murni dan aplikasi
volume 7(1) (2021), pages 40-48
p-issn: 2086-0382; e-issn: 2477-3344
submitted: june 21, 2021
reviewed: september 09, 2021
accepted: november 07, 2021
doi: https://doi.org/10.18860/ca.v7i1.12634

jonner nainggolan
department of mathematics, cenderawasih university jayapura indonesia
email: jonner2766@gmail.com

abstract

the covid-19 pandemic has disrupted the world's health and economy and has caused many deaths since the first case occurred in china at the end of 2019. the disease has since spread throughout the world, reaching indonesia on march 2, 2020. the coronavirus spreads quickly through droplets of phlegm, passing through the throat to the lungs. researchers in medicine and the exact sciences are jointly examining the transmission, prevention, and optimal control of covid-19. a vaccine for the prevention of covid-19 became available in early 2021, and vaccination against covid-19 was then carried out worldwide. this paper examines an sveir-type model of the spread of covid-19 that includes a vaccinated subpopulation. controls for prevention efforts ($u_1^*$ and $u_2^*$) and healing efforts ($u_3^*$) are introduced and analyzed numerically with the fourth-order runge-kutta method. the numerical simulations show that applying the controls $u_1^*$, $u_2^*$, and $u_3^*$ decreases the number of individuals in the infected subpopulation compared with the case without control. the $u_3^*$ control increases the number of individuals in the recovered subpopulation.

keywords: covid-19; sveir model; optimal control; treatment; vaccination.

introduction

coronavirus is a virus that attacks the respiratory system. it causes mild respiratory and lung infections and can result in death [1].
the coronavirus has spread rapidly to almost all countries in the world, reaching indonesia on march 2, 2020. to prevent the spread of covid-19, the government recommends frequent hand washing with soap or hand sanitizer and practicing cough etiquette. the coronavirus spreads through droplets of phlegm from the throat of an infected person, especially in areas with closed air circulation. protection against covid-19 is supported by a healthy lifestyle, which includes balanced nutrition, exercise, adequate rest, and frequent hand washing with soap.

experts in the exact sciences and medicine are working together to prevent the spread of covid-19. mathematical researchers also take part by assessing the transmission of the disease and optimizing its control. an optimal control strategy for pandemic avian influenza with a quarantine subpopulation was studied in [2]. a seir-type model of the spread of the coronavirus in wuhan, which accounts for people's travel to and from the city, was studied in [3] and compared with the pattern of the spread of covid-19 in wuhan, in china, and internationally in january and february 2020. a mathematical model of the covid-19 epidemic in nigeria from 29 march to 12 june 2020 was examined in [4], including the effect of public awareness programs on the implementation of health protocols recommended by the government. a model for the spread of covid-19 with symptomatic and asymptomatic infectious compartments was considered in [5]. the spread of the covid-19 outbreak was studied in [6] by applying mathematical growth functions and analyzing the cases caused by the disease. a sir-type mathematical model of covid-19 was examined in [7] to predict its spread while taking social distancing into account.
a deterministic model of the spread of covid-19 was formulated in [8], with the model parameters estimated from pandemic data from india, followed by a sensitivity analysis to identify the influential parameters. covid-19 was modeled in [9] on the basis of morbidity data from anhui, china, taking into account increased morbidity and mitigation measures. a seir-type system for the spread of the coronavirus disease and sars-cov-2 was studied and analyzed in [10], including the effectiveness of government intervention. the spread of covid-19 was modeled and predicted in [11], which also examined the exposed subpopulation and measures to prevent and control the epidemic. a logistic growth model of the spread of covid-19 in china was examined in [12] and compared with global data, with the parameter values determined by a least-squares approach. a sir-type mathematical model for the covid-19 pandemic was studied in [13], taking into account asymptomatic individuals, finite antibody duration, and health policy. the global dynamics of a seir-type covid-19 model under a convex incidence rate were studied in [14]. other studies determined the exponential growth rate of the epidemic and the basic reproduction number [15]. a nonlinear ordinary differential equation model for the spread of covid-19 was reviewed in [16]; the model predicts the total number of covid-19 cases in austria, france, and poland.

the model studied here is based on the model of fang et al. (2020), extended with a vaccinated subpopulation and the optimal controls $u_1$, $u_2$, and $u_3$. the objectives of this study are: (1) to examine and analyze the sveir-type model of the spread of covid-19; (2) to determine the co-state functions of the sveir-type model with the controls $u_1^*$, $u_2^*$, and $u_3^*$; (3) to determine the numerical solution of the optimal control for prevention and treatment in the sveir-type model of the spread of covid-19.
optimal control of prevention and treatment in the model of the spread of covid19 is important to study, since the covid-19 disease is still spreading and has not been completely the treatment until now. this study will examine the sveir type of covid-19 spread model by considering vaccination subpopulations and provide preventive control measures (𝑢1 ∗ and 𝑢2 ∗ ), healing efforts (𝑢3 ∗ ) and analyzed simulation by numerical approach using runge-kutta fourth order. method the model used in this study comes from the development of the tian (2020) which contemplated the recruitment of suspected subpopulations denoted by s, namely individuals who have a history of travel to infected areas. vaccination subpopulation, namely susceptible individuals who are vaccinated to increase individual immunity to the covid-19 virus so as not to be infected with covid-19 disease. the exposed subpopulation, namely individuals who are positive for covid-19 from the results of the swab, but not severe, denoted e. infected subpopulations denoted by i, i.e. individuals who are positive for covid-19 from the swab results and experience severe illness due to covid-19. the recovered subpopulation is denoted by r, due to treatment in hospitals provided by the government or due to self-isolation at home by eating or taking vitamins to increase immunity so that they can recover from covid-19. the assumptions of the optimal prevention and treatment control on sveir type model spread of covid-19 jonner nainggolan 42 model studied are as follows: (i) individuals who travel to areas infected with covid-19 or have been in contact with individuals infected with covid-19 enter the susceptible subpopulation. (ii) does not consider the natural mortality of each subpopulation. (iii) most of the susceptible individuals who want to be vaccinated yet entering the vaccination subpopulation. 
(iv) Individuals who are successfully vaccinated enter the recovered subpopulation, and those for whom vaccination fails enter the exposed subpopulation.
(v) If the maximum number of individuals in the subpopulation is exposed, or the virus is multiplying in the lungs, the individual enters the infected subpopulation.
(vi) Deaths due to COVID-19 are taken into account.
(vii) Individuals may recover through treatment or through self-isolation.

The model of the spread of COVID-19 studied is as follows:

$$\frac{dS}{dt} = \Lambda - \frac{\beta S I}{N} - \theta S \tag{1}$$
$$\frac{dV}{dt} = \theta S - (\sigma + r)V \tag{2}$$
$$\frac{dE}{dt} = \frac{\beta S I}{N} + \sigma V - \gamma E \tag{3}$$
$$\frac{dI}{dt} = \gamma E - (d + \delta + \tau)I \tag{4}$$
$$\frac{dR}{dt} = rV + (\delta + \tau)I \tag{5}$$

where $N = S + V + E + I + R$. The parameters are described in Table 1.

Table 1. Parameter description and estimated value

| Parameter | Description | Value | Reference |
|---|---|---|---|
| $\Lambda$ | Recruitment rate entering the S subpopulation | 0.047/day | covid19.go.id data |
| $\beta$ | Transmission rate from S to E | 0.154/day | Assumed |
| $\theta$ | Vaccination rate from subpopulation S to V | 0.04/day | Assumed |
| $\sigma$ | Transfer rate from subpopulation V to E, due to vaccine failure | 0.005/day | Assumed |
| $r$ | Vaccination success rate | 0.05/day | Assumed |
| $\gamma$ | Transmission rate from subpopulation E to I | 0.036/day | Assumed |
| $d$ | Death rate due to COVID-19 | 0.002/day | covid19.go.id data |
| $\delta$ | Healing rate | 0.036/day | covid19.go.id data |
| $\tau$ | Speed of healing with COVID-19 self-isolation | 0.04/day | Assumed |

Equilibrium point

To analyze the fixed points of equations (1)-(5) it is enough to use equations (1)-(4), because equation (5) is redundant given equations (1)-(4). From equations (1)-(4), the non-endemic fixed point is

$$E_0 = \left(\frac{\Lambda}{\theta},\ \frac{\Lambda}{r+\sigma},\ 0,\ 0\right),$$

and the endemic point of system (1)-(4) is

$$E_1 = \left(\frac{\Lambda N}{\beta I + \theta N},\ \frac{\theta\Lambda N}{(r+\sigma)(\beta I + \theta N)},\ \frac{\beta\Lambda N}{\gamma(\beta I + \theta N)} + \frac{\sigma\theta\Lambda N}{\gamma(r+\sigma)(\beta I + \theta N)},\ I^{**}\right)$$

with

$$I^{**} = -\frac{1}{2}\,\frac{c\theta N T \pm \sqrt{(c\theta N T)^2 + 4a\beta c^2 T + 4b\beta c T}}{c\beta T},$$

where $a = \beta\Lambda N$, $b = \sigma\theta\Lambda N$, $T = d + \delta + \tau$, and $c = r + \sigma$.
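The uncontrolled system (1)-(5) can be explored numerically before any control is added. The following is a minimal Python sketch, not the authors' implementation: it integrates the five equations with the classical fourth-order Runge-Kutta scheme using the Table 1 values. Treating the recruitment rate as a plain constant, the step size `h`, and the function names `sveir_rhs`, `rk4_step` and `simulate` are assumptions of this sketch; the initial sizes in the usage note are illustrative.

```python
# Sketch: integrate the uncontrolled SVEIR system (1)-(5) with classical RK4.
# Parameter values follow Table 1; using LAMBDA as a plain constant is an assumption.

LAMBDA = 0.047  # recruitment into S
BETA = 0.154    # transmission rate S -> E
THETA = 0.04    # vaccination rate S -> V
SIGMA = 0.005   # vaccine failure, V -> E
R_VAC = 0.05    # vaccination success, V -> R
GAMMA = 0.036   # progression E -> I
D = 0.002       # death rate due to COVID-19
DELTA = 0.036   # healing rate
TAU = 0.04      # healing rate under self-isolation


def sveir_rhs(state):
    """Right-hand side of equations (1)-(5); state = (S, V, E, I, R)."""
    s, v, e, i, r = state
    n = s + v + e + i + r
    infection = BETA * s * i / n
    ds = LAMBDA - infection - THETA * s
    dv = THETA * s - (SIGMA + R_VAC) * v
    de = infection + SIGMA * v - GAMMA * e
    di = GAMMA * e - (D + DELTA + TAU) * i
    dr = R_VAC * v + (DELTA + TAU) * i
    return (ds, dv, de, di, dr)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * h * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + h * k for x, k in zip(state, k3)))
    return tuple(
        x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(initial, days, h=0.1):
    """Return the list of states from t = 0 to t = days in steps of h."""
    steps = int(round(days / h))
    trajectory = [tuple(initial)]
    for _ in range(steps):
        trajectory.append(rk4_step(sveir_rhs, trajectory[-1], h))
    return trajectory
```

For example, `simulate((61478.0, 40000.0, 20000.0, 26940.0, 7637.0), days=10)` returns the state at every tenth of a day over ten days. Because equations (1)-(5) sum to $dN/dt = \Lambda - dI$, the total population in such a run can only shrink through disease deaths, which is a quick sanity check on the integration.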
The reproduction number

The reproduction number expresses the expected number of secondary infective individuals produced by one primary infective individual in a susceptible population. The standard quantity for determining whether a disease spreads is the basic reproduction number ($R_0$): if $R_0 > 1$ the number of infected individuals increases, and if $R_0 < 1$ the number of infected individuals does not increase or it decreases. The basic reproduction number is calculated with the next-generation method. For equations (1)-(5) it is $R_0 = \rho(GU^{-1})$, where $\rho$ is the spectral radius of the matrix $GU^{-1}$, $G$ collects the rates of new infections, and $U$ collects the rates at which individuals leave the compartments. $G$ and $U$ are the Jacobian matrices of $G(x)$ and $U(x)$, that is, $G = [\partial G_i/\partial x_j]$ and $U = [\partial U_i/\partial x_j]$ $(i, j = 1, 2, 3, 4, 5)$. The reproduction number is

$$R_0 = \frac{r\beta\Lambda}{\theta N (d+\delta+\tau)(r+\sigma)}. \tag{6}$$

Result and discussion

The handling of COVID-19 by the stakeholders consists of the following controls. Control $u_1(t)$ is prevention through government appeals in government agencies, schools, and the general public to implement physical distancing, namely keeping a minimum distance of 1 meter from other people; to wear a mask during activities in public places or crowds; and to wash hands regularly with soap and water or with a hand sanitizer that contains at least 60% alcohol, especially after returning from outdoor activities or public places. Control $u_2(t)$ is vaccination counseling, because many individuals are afraid to be vaccinated. This control educates the public on how to prepare for vaccination, what to do after being vaccinated, and what effects to expect afterward, because much hoax information circulates that scares people away from vaccination.
Control $u_3(t)$ is an effort to accelerate recovery from COVID-19 in individuals who are self-isolating in their own homes or in places provided by the central or local governments. The healing efforts consist of taking vitamins C, D, and B, zinc, selenium, curcumin, and echinacea. Applying the controls $u_1(t)$, $u_2(t)$ and $u_3(t)$ to equations (1)-(5) gives the following controlled system of differential equations:

$$\frac{dS}{dt} = \Lambda - \frac{\beta(1-u_1(t))SI}{N} - \theta(1+u_2(t))S \tag{7}$$
$$\frac{dV}{dt} = \theta(1+u_2(t))S - (\sigma + r)V \tag{8}$$
$$\frac{dE}{dt} = \frac{\beta(1-u_1(t))SI}{N} + \sigma V - \gamma E \tag{9}$$
$$\frac{dI}{dt} = \gamma E - \big(d + \delta + \tau(1+u_3(t))\big)I \tag{10}$$
$$\frac{dR}{dt} = rV + \big(\delta + \tau(1+u_3(t))\big)I \tag{11}$$

The objective functional of the optimal control problem studied in this paper is

$$J(u_1, u_2, u_3) = \int_0^{t_f} \big(A E(t) + B I(t) + C_1 u_1^2(t) + C_2 u_2^2(t) + C_3 u_3^2(t)\big)\,dt, \tag{12}$$

where the coefficients $A$ and $B$ are the balancing weights of the exposed and actively infected compartments, respectively; $C_1$, $C_2$ and $C_3$ are the weights corresponding to the controls $u_1(t)$, $u_2(t)$ and $u_3(t)$; and $t_f$ is the end time of the period. Let $u_1^*(t)$, $u_2^*(t)$ and $u_3^*(t)$ be the optimal controls of the system (7)-(11) with objective (12), such that

$$J(u_1^*, u_2^*, u_3^*) = \min J(u_1, u_2, u_3), \tag{13}$$

where the control set is $U = \{(u_1, u_2, u_3) \mid u_i : [0, t_f] \to [0, 1],\ \text{Lebesgue measurable},\ i = 1, 2, 3\}$. The optimal controls $u_1^*$, $u_2^*$, $u_3^*$ that satisfy (13) subject to the system (7)-(11) are obtained numerically using MATLAB [17].
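Once state and control trajectories are available on a time grid, the objective functional (12) can be approximated by numerical quadrature. The sketch below uses Python and the composite trapezoidal rule rather than the MATLAB tools mentioned above; the function name `objective` and the argument layout are hypothetical, and since the paper does not report values for the weights $A$, $B$, $C_1$, $C_2$, $C_3$, any numbers passed in are illustrative only.

```python
# Sketch: trapezoidal approximation of the objective functional J in (12).
# The weights A, B, C1, C2, C3 are not reported in the text; callers supply them.

def objective(t, E, I, u1, u2, u3, A, B, C1, C2, C3):
    """Approximate J = integral of (A*E + B*I + C1*u1^2 + C2*u2^2 + C3*u3^2) dt.

    t, E, I, u1, u2, u3 are equal-length sequences sampled on the same time grid.
    """
    integrand = [
        A * e + B * i + C1 * a * a + C2 * b * b + C3 * c * c
        for e, i, a, b, c in zip(E, I, u1, u2, u3)
    ]
    total = 0.0
    for k in range(len(t) - 1):
        # Area of the trapezoid between grid points k and k + 1.
        total += 0.5 * (t[k + 1] - t[k]) * (integrand[k] + integrand[k + 1])
    return total
```

Comparing this value for a candidate control with the value for the uncontrolled case ($u_1 = u_2 = u_3 = 0$) gives a simple check that a control strategy actually lowers the cost.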
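From the model (7)-(11) and the objective functional (12), the Hamiltonian function $H$ is

$$H = AE + BI + C_1u_1^2 + C_2u_2^2 + C_3u_3^2 + \lambda_1\frac{dS}{dt} + \lambda_2\frac{dV}{dt} + \lambda_3\frac{dE}{dt} + \lambda_4\frac{dI}{dt} + \lambda_5\frac{dR}{dt}. \tag{14}$$

Theorem 1. Suppose there exist optimal controls $u_1^*(t)$, $u_2^*(t)$ and $u_3^*(t)$ and associated solutions $S^*(t)$, $V^*(t)$, $E^*(t)$, $I^*(t)$, $R^*(t)$ of the model (7)-(11) with Hamiltonian (14). Then there exist co-state functions $\lambda_i$, $i = 1, 2, 3, 4, 5$, satisfying

$$\frac{d\lambda_1}{dt} = (\lambda_1 - \lambda_3)\frac{\beta(1-u_1)I}{N} + (\lambda_1 - \lambda_2)(1+u_2)\theta$$
$$\frac{d\lambda_2}{dt} = (\lambda_2 - \lambda_3)\sigma + (\lambda_2 - \lambda_5)r$$
$$\frac{d\lambda_3}{dt} = -A + (\lambda_3 - \lambda_4)\gamma$$
$$\frac{d\lambda_4}{dt} = -B + (\lambda_1 - \lambda_3)\frac{\beta(1-u_1)S}{N} + (\lambda_4 - \lambda_5)(\delta + \tau) + \lambda_4 d$$
$$\frac{d\lambda_5}{dt} = 0,$$

with transversality conditions $\lambda_i(t_f) = 0$, $i = 1, 2, 3, 4, 5$. Moreover, the optimality condition gives the following optimal controls:

$$u_1^* = \min\left\{\max\left\{0, \frac{(\lambda_3 - \lambda_1)\beta S I}{2C_1 N}\right\}, 1\right\},$$
$$u_2^* = \min\left\{\max\left\{0, \frac{(\lambda_1 - \lambda_2)\theta S}{2C_2}\right\}, 1\right\},$$
$$u_3^* = \min\left\{\max\left\{0, \frac{(\lambda_4 - \lambda_5)\tau I}{2C_3}\right\}, 1\right\}.$$

Proof. We apply Pontryagin's maximum principle [17] to the Hamiltonian (14), which written out in full is

$$H = AE + BI + C_1u_1^2 + C_2u_2^2 + C_3u_3^2 + \lambda_1\left(\Lambda - \frac{\beta(1-u_1(t))SI}{N} - \theta(1+u_2(t))S\right) + \lambda_2\big(\theta(1+u_2(t))S - (\sigma + r)V\big) + \lambda_3\left(\frac{\beta(1-u_1(t))SI}{N} + \sigma V - \gamma E\right) + \lambda_4\big(\gamma E - (d + \delta + \tau(1+u_3(t)))I\big) + \lambda_5\big(rV + (\delta + \tau(1+u_3(t)))I\big).$$

The co-state equations follow from

$$\frac{d\lambda_1}{dt} = -\frac{\partial H}{\partial S} = (\lambda_1 - \lambda_3)\frac{\beta(1-u_1)I}{N} + (\lambda_1 - \lambda_2)(1+u_2)\theta$$
$$\frac{d\lambda_2}{dt} = -\frac{\partial H}{\partial V} = (\lambda_2 - \lambda_3)\sigma + (\lambda_2 - \lambda_5)r$$
$$\frac{d\lambda_3}{dt} = -\frac{\partial H}{\partial E} = -A + (\lambda_3 - \lambda_4)\gamma$$
$$\frac{d\lambda_4}{dt} = -\frac{\partial H}{\partial I} = -B + (\lambda_1 - \lambda_3)\frac{\beta(1-u_1)S}{N} + (\lambda_4 - \lambda_5)(\delta + \tau) + \lambda_4 d$$
$$\frac{d\lambda_5}{dt} = -\frac{\partial H}{\partial R} = 0,$$

which must satisfy the transversality conditions $\lambda_i(t_f) = 0$ for $i = 1, 2, 3, 4, 5$. To characterize the optimal controls $u_1^*(t)$, $u_2^*(t)$ and $u_3^*(t)$ that minimize $J$ over $U$, we use the necessary optimality conditions $\partial H/\partial u_1 = 0$, $\partial H/\partial u_2 = 0$ and $\partial H/\partial u_3 = 0$, which give

$$u_1^*(t) = \frac{(\lambda_3 - \lambda_1)\beta S I}{2C_1 N}, \quad u_2^*(t) = \frac{(\lambda_1 - \lambda_2)\theta S}{2C_2}, \quad u_3^*(t) = \frac{(\lambda_4 - \lambda_5)\tau I}{2C_3}.$$

Taking the bounds on the controls into account then yields the form stated in the theorem.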
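The characterization in Theorem 1 amounts to evaluating the unconstrained stationary point of $H$ in each control and projecting it onto $[0, 1]$. A minimal Python sketch of that update at a single time point follows; the function names `clamp` and `optimal_controls` and the argument order are hypothetical, and the weights $C_1$, $C_2$, $C_3$ are inputs because the text does not report their values.

```python
# Sketch: pointwise optimal controls of Theorem 1, projected onto [0, 1].

def clamp(x, lo=0.0, hi=1.0):
    """Project x onto the interval [lo, hi], i.e. min(max(lo, x), hi)."""
    return min(max(lo, x), hi)


def optimal_controls(S, I, N, lam, beta, theta, tau, C1, C2, C3):
    """Return (u1*, u2*, u3*) for given states and co-states.

    lam is the tuple (lambda1, ..., lambda5) of co-state values at this time.
    """
    l1, l2, l3, l4, l5 = lam
    u1 = clamp((l3 - l1) * beta * S * I / (2.0 * C1 * N))
    u2 = clamp((l1 - l2) * theta * S / (2.0 * C2))
    u3 = clamp((l4 - l5) * tau * I / (2.0 * C3))
    return u1, u2, u3
```

Because the transversality conditions set every co-state to zero at $t_f$, this update returns zero for all three controls at the final time, whatever the weights.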
With optimal prevention control, the reproduction number becomes

$$R_{0p}^* = \frac{r\beta(1-u_1)\Lambda}{\theta N(1+u_2)(d+\delta+\tau)(r+\sigma)}.$$

With optimal healing control, the reproduction number becomes

$$R_{0h}^* = \frac{r\beta\Lambda}{\theta N\big(d+\delta+\tau(1+u_3)\big)(r+\sigma)}.$$

Numerical method

The optimal control problem for the system of differential equations is solved numerically. The optimality system obtained from Pontryagin's maximum principle is solved iteratively with the fourth-order Runge-Kutta procedure. Starting from an initial guess for the controls, the state system (7)-(11) is solved forward in time from the initial conditions, and the co-state system is then solved backward in time from the transversality conditions with the same fourth-order Runge-Kutta procedure [1]. The initial subpopulation sizes are $S(0) = 61478$, $V(0) = 40000$ (assumed), $E(0) = 20000$ (assumed), $I(0) = 26940$, and $R(0) = 7637$.

Figure 1. The dynamics of E with $u_1^*$ and $u_2^*$ controls

In Figure 1, the optimal control for COVID-19 prevention ($u_1^*$) and the optimal control for increasing the effectiveness of vaccination ($u_2^*$) are applied with the objective (12). The results show a significant difference between E with the optimal controls $u_1^*$ and $u_2^*$ and the exposed subpopulation without control: effective controls decrease the number of individuals exposed to COVID-19 (E) compared with no control.

Figure 2. The dynamics of I with $u_1^*$ and $u_2^*$ controls

In Figure 2, the optimal controls $u_1^*$ and $u_2^*$ are applied with the objective (12). The results show a difference between I with the controls $u_1^*$ and $u_2^*$ and I without control: effective controls decrease the number of active COVID-19 cases (I) compared with no control.

Figure 3. The dynamics of I with $u_1^*$ and $u_2^*$ controls

Figure 4. The dynamics of R with $u_1^*$, $u_2^*$ and $u_3^*$ controls

In Figure 3, the optimal control for COVID-19 prevention ($u_1^*$), the optimal control $u_2^*$, and the optimal control for COVID-19 treatment ($u_3^*$) are applied with the objective (12). The results show a difference between I with the controls $u_1^*$, $u_2^*$ and $u_3^*$ and I without control: with all three controls, the number of active COVID-19 cases (I) is lower than both under the control strategy $u_1^*$ and $u_2^*$ and without control.

In Figure 4, the results show a significant difference between R with the optimal control strategy $u_1^*$, $u_2^*$ and $u_3^*$ and R without control. The number of individuals recovered from COVID-19 increases rapidly with optimal control, while it increases slowly without control.

Figure 5. The profile of the optimal controls $u_1^*$, $u_2^*$ and $u_3^*$

Figure 5 considers optimal prevention and treatment control of COVID-19 simultaneously. It shows the profiles of the optimal prevention controls $u_1^*$ and $u_2^*$ and of the optimal treatment control $u_3^*$ for this scenario.

Conclusion

Medical personnel in each country and the WHO made various efforts to find a cure and a vaccine for COVID-19, but for a long time none was found, so the spread of COVID-19 in the world was not controlled. In early 2021, however, a vaccine for COVID-19 was found, and vaccinations have been carried out all over the world. Having been vaccinated against COVID-19 does not guarantee that one will not be infected with COVID-19 again. The model studied in this paper describes COVID-19 in Indonesia with vaccination taken into account.
Based on the numerical simulation, the optimal control strategy $u_1^*$ and $u_2^*$ decreases the number of individuals exposed to COVID-19 (E) compared with no control. The optimal control strategy $u_1^*$, $u_2^*$ and $u_3^*$ decreases the number of active COVID-19 cases (I) compared with both the control strategy $u_1^*$ and $u_2^*$ and no control. The controls also result in a further increase in the number of individuals recovered from COVID-19 (R) compared with the control strategy $u_1^*$ and $u_2^*$ and with no control.

Acknowledgments

The authors would like to thank Kemenristek Dikti for providing higher education grants for fiscal year 2020, through LPPM Universitas Cenderawasih, which sponsored the research.

References

[1] Kemenkes RI, “Dokumen resmi,” Pedoman Kesiapan Menghadapi COVID-19, pp. 0–115, 2020.
[2] E. Jung, S. Iwami, Y. Takeuchi, and T. C. Jo, “Optimal control strategy for prevention of avian influenza pandemic,” J. Theor. Biol., vol. 260, no. 2, pp. 220–229, 2009.
[3] M. Veera Krishna, “Mathematical modeling on diffusion and control of COVID–19,” Infect. Dis. Model., vol. 5, pp. 588–597, 2020.
[4] S. S. Moses, S. Qureshi, S. Zhao, A. Yusuf, U. T. Mustapha, and D. He, “Mathematical modeling of COVID-19 epidemic with effect of awareness programs,” Infect. Dis. Model., vol. 6, no. February, pp. 448–460, 2021.
[5] J. Arino and S. Portet, “A simple model for COVID-19,” Infect. Dis. Model., vol. 5, pp. 309–315, 2020.
[6] M. Kamrujjaman, M. S. Mahmud, and M. S. Islam, “Coronavirus outbreak and the mathematical growth map of COVID-19,” Annu. Res. Rev. Biol., no. March, pp. 72–78, 2020.
[7] M. Imran, M. Wu, Y. Zhao, E. Beşe, and M. J. Khan, “Mathematical modelling of SIR for COVID-19 forecasting,” vol. XXX, no. February, pp. 218–226, 2021.
[8] S. K. Biswas, J. K. Ghosh, S. Sarkar, and U. Ghosh, “COVID-19 pandemic in India: a mathematical model study,” Nonlinear Dyn., vol. 102, no. 1, pp. 537–553, 2020.
[9] J. Tian et al., “Modeling analysis of COVID-19 based on morbidity data in Anhui, China,” Math. Biosci. Eng., vol. 17, no. 4, pp. 2842–2852, 2020.
[10] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the COVID-19 outbreak and effectiveness of government interventions: a data-driven analysis,” J. Med. Virol., vol. 92, no. 6, pp. 645–659, 2020.
[11] Y. Li et al., “Mathematical modeling and epidemic prediction of COVID-19 and its significance to epidemic prevention and,” Ann. Infect. Dis. Epidemiol., vol. 5, no. 1, p. 1052, 2020.
[12] C. Y. Shen, “Logistic growth modelling of COVID-19 proliferation in China and its international implications,” Int. J. Infect. Dis., vol. 96, pp. 582–589, 2020.
[13] M. Tomochi and M. Kono, “A mathematical model for COVID-19 pandemic—SIIR model: effects of asymptomatic individuals,” J. Gen. Fam. Med., vol. 22, no. 1, pp. 5–14, 2021.
[14] R. ud Din, A. R. Seadawy, K. Shah, A. Ullah, and D. Baleanu, “Study of global dynamics of COVID-19 via a new mathematical model,” Results Phys., vol. 19, p. 103468, 2020.
[15] J. Ma, “Estimating epidemic exponential growth rate and basic reproduction number,” Infect. Dis. Model., vol. 5, pp. 129–141, 2020.
[16] R. Cherniha and V. Davydovych, “A mathematical model for the COVID-19 outbreak and its applications,” Symmetry (Basel), vol. 12, no. 6, 2020.
[17] S. Lenhart and J. T. Workman, Optimal Control Applied to Biological Models, Chapman and Hall, New York, 2007.
CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 6(4) (2021), Pages 218-226
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: November 07, 2020
Reviewed: December 03, 2020
Accepted: April 12, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.10639

Spatio Temporal Modelling for Government Policy the COVID-19 Pandemic in East Java

Atiek Iriany1, Novi Nur Aini1, Agus Dwi Sulistyono2
1 Department of Statistics, Faculty of Mathematics and Natural Sciences, Brawijaya University, Indonesia
2 Faculty of Fisheries and Marine Science, Brawijaya University, Indonesia
Email: atiekiriany@ub.ac.id

Abstract

COVID-19 has spread rapidly across the globe. Within just four months, its status changed to a pandemic. In Indonesia, the epicenter of the virus is Java. The first positive case was identified in West Java, and the virus later spread throughout Java. The large-scale social restrictions appear ineffective, as SARS-CoV-2 transmission continues. The government is therefore struggling to find the anticipatory policies and steps best suited to mitigating transmission. In this article, we use a spatio-temporal model for the total COVID-19 cases in Java and forecast the total cases for the next 14 days, allowing stakeholders to make more effective policies. The data are the daily cumulative numbers of COVID-19 cases taken from www.covid19.go.id. The data are modeled with a generalized spatio-temporal autoregressive model. The model obtained for the COVID-19 cases in Java is the GSTAR(1)(1,0,0) model.

Keywords: COVID-19; forecasting; pandemic; spatio-temporal

Introduction

As declared by the WHO on 11 March 2020, COVID-19 had become a pandemic [1]. The virus, first identified in Wuhan in December 2019, spread rapidly throughout China and to 190 other countries [2].
No research explains exactly how SARS-CoV-2 was initially transmitted, but it is currently believed that the virus is transmitted from human to human. Later research reveals that symptomatic patients transmit SARS-CoV-2 through droplets or sneezes [3]. Moreover, other research reports that SARS-CoV-2 can survive in aerosols (generated with a nebulizer) for approximately three hours [4]. Because of its relatively rapid transmission, a mortality rate that cannot be overlooked, and the absence of a definitive therapy, COVID-19 is a disease that calls for vigilance [5].

The coronavirus epicenter in Indonesia is Java. The first positive case was identified in West Java, and the virus later spread throughout Java, which indicates that adjacency between locations is closely related to SARS-CoV-2 transmission. China's social distancing regulation has proven effective in stabilizing transmission, so the number of cases declined [6]. Indonesia issued a similar regulation, namely the large-scale social restrictions (PSBB). Nevertheless, the regulation appears ineffective, as SARS-CoV-2 transmission continues. The government is therefore struggling to find the anticipatory policies and steps best suited to mitigating transmission.

Many researchers, e.g., Jia et al. [7], Albana [8], and Fajar [9], have studied COVID-19 transmission with the aim of recommending anticipatory efforts. We model COVID-19 with a spatio-temporal approach because SARS-CoV-2 transmission is mostly influenced by interaction between locations and by the large number of positive cases. Several researchers have used spatio-temporal models [10], [11]. One method for handling data indexed by both time and location is the generalized space-time autoregressive (GSTAR) model.
Some researchers, such as Iriany [12], Ruchjana [13], and Prastyo [14], prefer this method. In this article, we use a spatio-temporal model for the total COVID-19 cases in Java and forecast the total cases for the next 14 days, allowing stakeholders to make more effective policies.

Methods

Data source

The data used in this research are the daily cumulative numbers of COVID-19 cases taken from www.covid19.go.id.

Data stationarity

A time series is stationary when it shows neither a sharp decrease nor a sharp increase and fluctuates around a constant mean [15]. Stationary data have mean $E(Z_t) = \mu$ and variance $Var(Z_t) = \sigma^2$. Stationarity in the mean requires that the data neither decrease nor increase over time [16]. In other words, a stationary time series has a constant mean and a constant variance. There are two types of time series stationarity: stationarity in the variance and stationarity in the mean.

a. Stationarity in the variance
A series is stationary in the variance if $Var(Z_t) = Var(Z_{t-k})$ for all $t$ and $k$, that is, the variance is constant over time [17]. To check whether the data are stationary in the variance, we use a Box-Cox plot. Data that are non-stationary in the variance can be made stationary through transformation.

b. Stationarity in the mean
A series is stationary in the mean if $E(Z_t) = E(Z_{t-k})$ for all $t$ and $k$, that is, the mean function remains constant over time. Stationarity in the mean is checked with the ACF (autocorrelation function) plot or the Dickey-Fuller test. Data that are non-stationary in the mean can be made stationary through differencing.

Generalized space-time autoregressive (GSTAR)

The AR order is determined using the matrix partial autocorrelation function (MPACF) plot. The partial correlation measures the correlation between $Z_t$ and $Z_{t+k}$ after the linear dependence on the intervening variables $Z_{t+1}, Z_{t+2}, \ldots, Z_{t+k-1}$ has been removed.
The partial correlation matrix function is as follows:

$$\phi_{kk} = \frac{cov\big[(Z_t - \hat{Z}_t), (Z_{t+k} - \hat{Z}_{t+k})\big]}{\sqrt{var(Z_t - \hat{Z}_t)}\,\sqrt{var(Z_{t+k} - \hat{Z}_{t+k})}} \tag{1}$$

where
$\phi_{kk}$ = partial correlation matrix coefficient at lag $k$
$Z_t$ = observation at time $t$
$\hat{Z}_t$ = predictor for $Z_t$
$Z_{t+k}$ = observation at time $t + k$
$\hat{Z}_{t+k}$ = predictor for $Z_{t+k}$

The partial autoregression matrix at lag $s$ is the last matrix coefficient when the data are fitted with a vector autoregressive process of order $s$. The best model is selected from among the models considered feasible by the MPACF. Model selection uses the AIC: the smaller the AIC value of a model, the better the model. The AIC value is computed as follows:

$$AIC(p) = \ln|S(p)| + \frac{2pb^2}{T} \tag{2}$$

where:
$b$ = the number of estimated parameters in the model
$T$ = the number of observations
$S(p)$ = residual sum of squares
$p$ = VAR model order

The GSTAR model was introduced by Borovkova, Lopuhaä, and Ruchjana in 2002, as cited in Wutsqa et al. [18]. It generalizes the STAR model and is more flexible, because it does not require the same parameter values at all locations. The GSTAR$(p, \lambda_1, \ldots, \lambda_l)$ model is written as follows [19]:

$$Z(t) = \sum_{k=1}^{p} \big[\Phi_{k0} + \Phi_{k1} W\big] Z(t-k) + e(t) \tag{3}$$

where:
$\Phi_{k0} = diag(\phi_{k0}^{1}, \ldots, \phi_{k0}^{n})$, the diagonal matrix of space-time parameters at spatial lag 0 and autoregressive time lag $k$
$\Phi_{k1} = diag(\phi_{k1}^{1}, \ldots, \phi_{k1}^{n})$, the diagonal matrix of space-time parameters at spatial lag 1 and autoregressive time lag $k$
$W$ = weight matrix $(n \times n)$ chosen such that $w_{ii}^{(k)} = 0$ and $\sum_{j \neq i} w_{ij}^{(k)} = 1$
$e(t)$ = the white-noise vector of size $(n \times 1)$
$Z(t)$ = the random vector of size $(n \times 1)$ at time $t$

Suhartono and Subanar [20] introduced a method for determining the weights from the normalized cross-correlation between locations at the corresponding time lag.
The cross-correlation between locations $i$ and $j$ at time lag $k$ is estimated by the sample cross-correlation

$$r_{ij}(k) = \frac{\sum_{t=k+1}^{n} \big[Z_i(t) - \bar{Z}_i\big]\big[Z_j(t-k) - \bar{Z}_j\big]}{\sqrt{\Big(\sum_{t=1}^{n} \big[Z_i(t) - \bar{Z}_i\big]^2\Big)\Big(\sum_{t=1}^{n} \big[Z_j(t) - \bar{Z}_j\big]^2\Big)}} \tag{4}$$

The location weights for the GSTAR(1;p) model are determined as follows:

$$w_{ij} = \frac{r_{ij}(1)}{\sum_{k \neq i} |r_{ik}(1)|} \tag{5}$$

with $i \neq j$, so that the weights satisfy $\sum_{j \neq i} |w_{ij}| = 1$. Weights based on normalized cross-correlation reflect the variation in correlation between locations that is present in the data.

Results and discussion

COVID-19 cases in Indonesia kept increasing, and Java was regarded as the epicenter of transmission. The increase in COVID-19 cases is depicted in Figure 1.

Figure 1. The plot of the time series of COVID-19 cases in each province

Figure 1 indicates that from 2 March to 18 May 2020 COVID-19 cases increased in all provinces in Java. On 18 May 2020, the highest number of cases, 5,555, was reported in Jakarta, whereas the lowest, 185, was in Yogyakarta. Using the total COVID-19 cases in the six provinces in Java, we examined the correlation between provinces in COVID-19 transmission. The correlation between locations was measured with Pearson's correlation; the results are presented in Table 1.

Table 1. The correlation value of the COVID-19 cases between provinces in Java

| | Banten | Jakarta | West Java | Central Java | Yogyakarta | East Java |
|---|---|---|---|---|---|---|
| Banten | 1 | 0.994 | 0.994 | 0.982 | 0.981 | 0.973 |
| Jakarta | 0.994 | 1 | 0.995 | 0.989 | 0.975 | 0.967 |
| West Java | 0.994 | 0.995 | 1 | 0.992 | 0.986 | 0.977 |
| Central Java | 0.982 | 0.989 | 0.992 | 1 | 0.983 | 0.980 |
| Yogyakarta | 0.981 | 0.975 | 0.986 | 0.983 | 1 | 0.996 |
| East Java | 0.973 | 0.967 | 0.977 | 0.980 | 0.996 | 1 |

Table 1 shows that the numbers of COVID-19 cases in the six provinces in Java have Pearson correlations higher than 0.9. This implies that the correlation of COVID-19 cases between provinces in Java was strong.
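The cross-correlation normalization in equations (4) and (5) is straightforward to compute from the location series. The following Python sketch assumes each location's series is a plain list of equal length; the function names `cross_corr` and `weight_matrix` are hypothetical, and the code normalizes by the sum of absolute cross-correlations in each row, as in equation (5). It does not guard against constant series or rows whose cross-correlations are all zero.

```python
# Sketch: location weights from normalized cross-correlation, equations (4)-(5).
import math


def cross_corr(zi, zj, k):
    """Sample cross-correlation r_ij(k) between series zi and zj at time lag k."""
    n = len(zi)
    mean_i = sum(zi) / n
    mean_j = sum(zj) / n
    # With 0-based indexing, t runs from k to n - 1 (t = k + 1, ..., n in the text).
    num = sum((zi[t] - mean_i) * (zj[t - k] - mean_j) for t in range(k, n))
    den = math.sqrt(
        sum((x - mean_i) ** 2 for x in zi) * sum((x - mean_j) ** 2 for x in zj)
    )
    return num / den


def weight_matrix(series, k=1):
    """Build W with zero diagonal and rows normalized by sum of |r_ij(k)|, j != i."""
    m = len(series)
    weights = [[0.0] * m for _ in range(m)]
    for i in range(m):
        r = [cross_corr(series[i], series[j], k) if j != i else 0.0 for j in range(m)]
        total = sum(abs(x) for x in r)
        for j in range(m):
            if j != i:
                weights[i][j] = r[j] / total
    return weights
```

Calling `weight_matrix` on the six transformed provincial series with `k=1` would produce a $6 \times 6$ matrix of the same form as the weight matrix used in the GSTAR model: zero on the diagonal and absolute row sums equal to one.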
Data stationarity test

Data stationarity was tested in two stages: stationarity in the variance and stationarity in the mean. Stationarity in the variance was tested using the Box-Cox transformation. Data were regarded as stationary in the variance if the lambda value was 1, signifying that $Var(Z_t) = Var(Z_{t-k})$. The result of the variance stationarity test is shown in Table 2.

Table 2. The result of Box-Cox transformation

| Location | Initial λ | First transformation | λ after first | Second transformation | λ after second | Final transformation |
|---|---|---|---|---|---|---|
| Banten | 0.20 | $Z_t^{0.20}$ | 1.00 | | | $Z_t^{0.20}$ |
| Jakarta | 0.20 | $Z_t^{0.20}$ | 1.00 | | | $Z_t^{0.20}$ |
| West Java | 0.19 | $Z_t^{0.19}$ | 1.00 | | | $Z_t^{0.19}$ |
| Central Java | 0.00 | $\ln(Z_t)$ | 1.00 | | | $\ln(Z_t)$ |
| Yogyakarta | 0.00 | $\ln(Z_t)$ | 0.00 | $\ln(Z_t)$ | 1.00 | $\ln(\ln(Z_t))$ |
| East Java | 0.00 | $\ln(Z_t)$ | 0.50 | $Z_t^{0.50}$ | 1.00 | $\ln(Z_t)^{0.50}$ |

As seen in Table 2, the initial data were not stationary in the variance, so several transformations were needed. After the variance stationarity test, we tested stationarity in the mean using the augmented Dickey-Fuller test. The result is given in Table 3.

Table 3. The result of the augmented Dickey-Fuller test

| Location | Statistic | Lag 0 | Lag 1 | Lag 2 |
|---|---|---|---|---|
| Banten | π | 98.73 | 60.83 | 34.96 |
| | p-value | 0.001 | 0.001 | 0.001 |
| Jakarta | π | 107.89 | 32.98 | 21.15 |
| | p-value | 0.001 | 0.001 | 0.001 |
| West Java | π | 100.74 | 40.22 | 28.78 |
| | p-value | 0.001 | 0.001 | 0.001 |
| Central Java | π | 122.94 | 55.84 | 29.66 |
| | p-value | 0.001 | 0.001 | 0.001 |
| Yogyakarta | π | 75.55 | 51.84 | 42.09 |
| | p-value | 0.001 | 0.001 | 0.001 |
| East Java | π | 155.97 | 57.49 | 25.85 |
| | p-value | 0.001 | 0.001 | 0.001 |

The augmented Dickey-Fuller test gave p-values smaller than the significance level (0.05), which indicates that the data were stationary in the mean.

Interpretation of the GSTAR model parameter

Model identification aimed to find the autoregressive order of the GSTAR model. The order was identified using the AIC.
The lag with the smallest AIC value was taken as the autoregressive order of the GSTAR model. Table 4 lists the AIC values.

Table 4. The AIC value in model order selection

| Lag | MA 0 | MA 1 | MA 2 | MA 3 | MA 4 | MA 5 |
|---|---|---|---|---|---|---|
| AR 0 | 34.0876 | 35.2909 | 35.5546 | 36.0924 | 36.8882 | 36.1828 |
| AR 1 | 31.9424 | 32.9755 | 33.3945 | 33.9363 | 33.9725 | 33.0619 |
| AR 2 | 32.0815 | 33.1868 | 33.0648 | 33.8555 | 34.487 | 34.3527 |
| AR 3 | 32.006 | 32.8989 | 33.096 | 35.128 | 35.9621 | 36.0125 |
| AR 4 | 32.9289 | 34.1187 | 32.882 | 35.2756 | 35.429 | 42.0827 |
| AR 5 | 34.6863 | 35.2259 | 35.61 | 35.7834 | 42.3544 | |

Table 4 shows the smallest AIC value at AR(1) and MA(0), hence the GSTAR(1)(1,0,0) model.

Interpretation of the GSTAR model parameter

The GSTAR model is a particular form of the VAR model that incorporates spatial elements. Estimating the parameters of the GSTAR(1)(1,0,0) model by ordinary least squares with normalized cross-correlation weights gave the following parameters.

Table 5. The parameters of the GSTAR(1)(1,0,0) model

| Location | Parameter | Estimate |
|---|---|---|
| Banten | $\phi_{10}^{(1)}$ | 1.015 |
| | $\phi_{11}^{(1)}$ | 0.793 |
| Jakarta | $\phi_{10}^{(2)}$ | 0.915 |
| | $\phi_{11}^{(2)}$ | 0.984 |
| West Java | $\phi_{10}^{(3)}$ | 0.758 |
| | $\phi_{11}^{(3)}$ | 1.031 |
| Central Java | $\phi_{10}^{(4)}$ | -0.003 |
| | $\phi_{11}^{(4)}$ | 0.256 |
| Yogyakarta | $\phi_{10}^{(5)}$ | 0.118 |
| | $\phi_{11}^{(5)}$ | 0.061 |
| East Java | $\phi_{10}^{(6)}$ | 0.088 |
| | $\phi_{11}^{(6)}$ | -0.013 |

Referring to Table 5, the matrix equation of the GSTAR(1)(1,0,0) model is as follows. Writing $Z(t) = (Z_1(t), \ldots, Z_6(t))^{T}$ and $e(t) = (e_1(t), \ldots, e_6(t))^{T}$,

$$Z(t) = \Phi_{10}\,Z(t-1) + \Phi_{11}\,W\,Z(t-1) + e(t),$$

with

$$\Phi_{10} = diag(1.015,\ 0.915,\ 0.758,\ -0.03,\ 0.118,\ 0.088), \qquad \Phi_{11} = diag(0.793,\ 0.984,\ 1.031,\ 0.256,\ 0.061,\ -0.013),$$

$$W = \begin{bmatrix} 0 & 0.256 & 0.116 & 0.209 & 0.183 & 0.236 \\ 0.205 & 0 & 0.187 & 0.217 & 0.160 & 0.231 \\ 0.192 & 0.223 & 0 & 0.233 & 0.150 & 0.201 \\ 0.092 & 0.273 & 0.194 & 0 & 0.231 & 0.210 \\ 0.156 & 0.242 & 0.079 & 0.276 & 0 & 0.247 \\ 0.134 & 0.241 & 0.180 & 0.124 & 0.321 & 0 \end{bmatrix}.$$

The following matrix equation was derived from the above equation:

$$Z(t) = \Phi_{10}\,Z(t-1) + \begin{bmatrix} 0 & 0.259 & 0.118 & 0.212 & 0.186 & 0.239 \\ 0.202 & 0 & 0.184 & 0.213 & 0.157 & 0.227 \\ 0.198 & 0.229 & 0 & 0.240 & 0.155 & 0.207 \\ 0.024 & 0.069 & 0.049 & 0 & 0.059 & 0.054 \\ 0.009 & 0.015 & 0.005 & 0.017 & 0 & 0.015 \\ -0.002 & -0.004 & -0.002 & -0.002 & -0.004 & 0 \end{bmatrix} Z(t-1) + e(t).$$

From the model generated, we compared the actual and predicted data and obtained an RMSE of 0.005 and a MAPE of 1.43. These two values suggest that the model fits well.

Prediction result

Using the equation, we forecast the total cases for the next 14 days, namely 19 May to 1 June 2020. The result is presented in Table 6.

Table 6. The predicted COVID-19 cases on 19 May-1 June 2020

| Day | Banten | Jakarta | West Java | Central Java | Yogyakarta | East Java |
|---|---|---|---|---|---|---|
| 1 | 628 | 5662 | 1689 | 1214 | 195 | 2321 |
| 2 | 651 | 5738 | 1746 | 1249 | 201 | 2438 |
| 3 | 674 | 5802 | 1804 | 1283 | 207 | 2561 |
| 4 | 699 | 5850 | 1862 | 1318 | 214 | 2689 |
| 5 | 725 | 5881 | 1920 | 1352 | 220 | 2822 |
| 6 | 754 | 5892 | 1978 | 1386 | 227 | 2961 |
| 7 | 784 | 5880 | 2035 | 1419 | 233 | 3105 |
| 8 | 817 | 5841 | 2092 | 1450 | 240 | 3255 |
| 9 | 852 | 5773 | 2148 | 1481 | 247 | 3411 |
| 10 | 891 | 5672 | 2202 | 1509 | 254 | 3573 |
| 11 | 933 | 5533 | 2254 | 1535 | 260 | 3741 |
| 12 | 979 | 5351 | 2305 | 1559 | 267 | 3916 |
| 13 | 1030 | 5122 | 2352 | 1579 | 274 | 4097 |
| 14 | 1086 | 4839 | 2396 | 1596 | 280 | 4284 |

The prediction indicated that the total cases would increase in all provinces in Java except Jakarta, where the total number of cases would decline. The prediction was based on the assumption that there was no change in social interaction in the community.
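The multi-day forecast comes from applying the fitted GSTAR(1) recursion repeatedly, with the noise term set to its expected value of zero. The Python sketch below shows that recursion under stated assumptions: `gstar_forecast` and `forecast_horizon` are hypothetical names, the inputs are the diagonal entries of $\Phi_{10}$ and $\Phi_{11}$ as lists together with the weight matrix $W$, and the values are taken to be on the transformed scale used for fitting, so a back-transformation to case counts would still be needed.

```python
# Sketch: iterated one-step forecasts from Z(t) = Phi10 Z(t-1) + Phi11 W Z(t-1).
# Inputs are assumed to be on the transformed scale used when the model was fitted.

def gstar_forecast(z_prev, phi0, phi1, W):
    """One-step-ahead GSTAR(1) forecast for every location."""
    n = len(z_prev)
    forecast = []
    for i in range(n):
        # Spatially weighted average of the other locations at time t - 1.
        spatial = sum(W[i][j] * z_prev[j] for j in range(n))
        forecast.append(phi0[i] * z_prev[i] + phi1[i] * spatial)
    return forecast


def forecast_horizon(z_last, phi0, phi1, W, steps):
    """Feed each forecast back in to obtain forecasts for `steps` future periods."""
    path = []
    z = list(z_last)
    for _ in range(steps):
        z = gstar_forecast(z, phi0, phi1, W)
        path.append(z)
    return path
```

With the six-location parameters and weights reported above and `steps=14`, this recursion gives a 14-row forecast path with one column per province, which has the same layout as Table 6.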
Conclusions

The model obtained for the COVID-19 cases in Java was the GSTAR(1)(1,0,0) model. Our predicted COVID-19 case data were close to the actual number of COVID-19 cases in Java, so a spatio-temporal model can be used to predict the number of COVID-19 cases in Java. Human-to-human transmission likely had a cross-location impact due to interaction between individuals. Our prediction indicates that all provinces in Java except Jakarta would likely see an increase in the total number of COVID-19 cases over the next 14 days.

Acknowledgments

We would like to thank Brawijaya University for funding and supporting this research.

References

[1] World Health Organization, “WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020,” Geneva, Switzerland, 2020.
[2] World Health Organization, “Novel coronavirus (2019-nCoV), situation report–70, as of March 30, 2020,” URL https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200330-sitrep-70-covid-19.pdf?sfvrsn=7e0fe3f8_2.
[3] Y. Han and H. Yang, “The transmission and diagnosis of 2019 novel coronavirus infection disease (COVID‐19): a Chinese perspective,” J. Med. Virol., vol. 92, no. 6, pp. 639–644, 2020.
[4] N. van Doremalen et al., “Aerosol and surface stability of SARS-CoV-2 as compared with SARS-CoV-1,” N. Engl. J. Med., 2020.
[5] A. Susilo et al., “Coronavirus disease 2019: tinjauan literatur terkini,” J. Penyakit Dalam Indones., vol. 7, no. 1, pp. 45–67, 2020.
[6] K. Roosa et al., “Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020,” Infect. Dis. Model., vol. 5, pp. 256–263, 2020.
[7] J. S. Jia, X. Lu, Y. Yuan, G. Xu, J. Jia, and N. A. Christakis, “Population flow drives spatio-temporal distribution of COVID-19 in China,” Nature, pp. 1–11, 2020.
[8] A. S. Albana, “Optimasi alokasi pasien untuk kasus COVID-19 wilayah Surabaya,” J. Tecnoscienza, vol. 4, no. 2, pp. 181–200, 2020.
[9] M. Fajar, “Estimation of COVID-19 reproductive number case of Indonesia.”
[10] D. Giuliani, M. M. Dickson, G. Espa, and F. Santi, “Modelling and predicting the spatio-temporal spread of coronavirus disease 2019 (COVID-19) in Italy,” available at SSRN 3559569, 2020.
[11] D. Giuliani, M. M. Dickson, G. Espa, and F. Santi, “Modelling and predicting the spread of coronavirus (COVID-19) infection in NUTS-3 Italian regions,” arXiv preprint arXiv:2003.06664, 2020.
[12] A. Iriany, Suhariningsih, B. N. Ruchjana, and Setiawan, “Prediction of precipitation data at Batu town using the GSTAR(1,p)-SUR model,” J. Basic Appl. Sci. Res., vol. 3, no. 6, pp. 860–865, 2013.
[13] B. N. Ruchjana, S. A. Borovkova, and H. P. Lopuhaa, “Least-squares estimation of generalized space-time autoregressive (GSTAR) model and its properties,” in AIP Conference Proceedings, 2012, vol. 1450, no. 1, pp. 61–64.
[14] D. D. Prastyo, F. S. Nabila, M. H. Lee, N. Suhermi, and S.-F. Fam, “VAR and GSTAR-based feature selection in support vector regression for multivariate spatio-temporal forecasting,” in International Conference on Soft Computing in Data Science, 2018, pp. 46–57.
[15] W. W. S. Wei, “Time series analysis,” in The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2, 2006.
[16] S. Makridakis, S. C. Wheelwright, and V. E. McGee, “Metode dan aplikasi peramalan,” Jakarta: Erlangga, 1999.
[17] D. C. Jonathan and C. Kung-Sik, “Time series analysis with applications in R,” SpringerLink, Springer eBooks, 2008.
[18] D. U. Wutsqa, S. B. Suhartono, and B. Sutijo, “Generalized space-time autoregressive modeling,” in Proceedings of the 6th IMT-GT Conference on Mathematics, Statistics and its Applications (ICMSA2010), 2010.
[19] S. Borovkova, H. P.
lopuhaä, and b. n. ruchjana, “consistency and asymptotic normality of least squares estimators in generalized star models,” stat. neerl., vol. 62, no. 4, pp. 482–508, 2008. [20] s. suhartono and s. subanar, “the optimal determination of space weight in gstar model by using cross-correlation inference,” quant. methods, vol. 2, no. 2, pp. 45– 53, 2006. bühlmann's credibility model with claims of negative binomial and 2-poisson distribution cauchy –jurnal matematika murni dan aplikasi volume 7(4) (2023), pages 493-502 p-issn: 2086-0382; e-issn: 2477-3344 submitted: june 10, 2022 reviewed: february 22, 2023 accepted: march 09, 2023 doi: http://dx.doi.org/10.18860/ca.v7i4.16400 bühlmann's credibility model with claims of negative binomial and 2-poisson distribution ikhsan maulidi1, taufiq iskandar2, nurbahari3, alim misbullah4, vina apriliani5* 1,2,3department of mathematics, syiah kuala university, banda aceh, indonesia 4department of informatics, syiah kuala university, banda aceh, indonesia 5department of mathematics education, uin ar-raniry, banda aceh, indonesia 1,5graduate school of natural science and technology, kanazawa university, kanazawa, japan email: vina.apriliani@ar-raniry.ac.id abstract one technique for determining the premium is using the credibility theory. in this study, a credibility premium determination model was derived with the best accuracy approach in the form of bühlmann’s credibility premium. the approach used was a parameteric approach where the claim data is assumed to have a negative binomial and 2-poisson distribution. the bühlmann's credibility premium formula is given explicitly for these two data distributions. the obtained model is also applied to the correct data following these distributions. from the simulation results, it is obtained that the premium values are very close in value so that both models can be applied to the data and have a high level of credibility because they have a high credibility factor value. 
the results of this study provide a basic contribution to the development of actuarial science, especially to techniques for determining insurance premiums.

copyright © 2023 by authors, published by cauchy group. this is an open access article under the cc by-sa license (https://creativecommons.org/licenses/by-sa/4.0/)

keywords: 2-poisson distribution; bühlmann's credibility; negative binomial distribution

introduction

in everyday life, humans are exposed to many risks, such as accidents, property loss, illness, and even loss of life. these risks can cause people to lose their assets. insurance therefore aims to minimize the loss if the risk does occur. insurance is an agreement between the insured (policyholder) and the insurer under which the policyholder pays a premium in return for the benefits the insurer provides if the insured risk occurs [1].

one of the problems insurance companies face is how to determine the premium of a product. one technique is credibility theory, which predicts future premium rates from past experience data. two predictive models can be formed: a model for the number of claims and a model for the size of claims (claim severity). the choice is tied to the type of data distribution used. the data can be modeled with either a parametric or a nonparametric statistical approach. in this study, we used a parametric approach in which the claim data are assumed to follow a certain distribution [2].
one widely used type of credibility is the best accuracy credibility, which consists of the bühlmann model and the bühlmann-straub model [3]. in the bühlmann model, the number of policyholders is assumed to be the same in every time period, while the bühlmann-straub model is a general form of the bühlmann model in which the number of policyholders may differ between time periods. several studies on parametric premium determination with bühlmann and bühlmann-straub credibility can be found in [4]–[7].

the distributions commonly used to model the number of claims are the negative binomial distribution and the poisson distribution [8]. according to [9], mixed distributions are worth considering because they can model the data more accurately. one of the mixed distributions used in this study is the 2-poisson distribution [10]. we have not found the 2-poisson distribution used for this purpose in previous studies. under the assumption that the data follow a negative binomial or a 2-poisson distribution, the bühlmann’s credibility model is determined for data that satisfy these distributions. the equations for the bühlmann's credibility parameters are given explicitly. in addition, the prediction results obtained by applying the models to data are compared. applications to nonparametric data using r can be seen in [11].

methods

poisson distribution

the poisson distribution is a distribution with one parameter ($\lambda$). the probability function for the poisson distribution is

$$p_k = \frac{e^{-\lambda}\lambda^{k}}{k!}, \tag{1}$$

with $k = 0, 1, 2, \dots$. the expected value and variance of the poisson distribution are

$$E(X) = \lambda, \tag{2}$$

$$Var(X) = \lambda \tag{3}$$

[12]. other properties and applications of this distribution can be seen in [13].
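equations (1)-(3) can be checked numerically. the sketch below is only an illustration under assumptions: the rate `lam = 2.5` is a hypothetical value, and the infinite sums for the mean and the variance are truncated at 100 terms, which is more than enough for a rate this small.

```python
import math


def poisson_pmf(k, lam):
    """probability function of equation (1)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 2.5            # hypothetical rate, chosen only for illustration
support = range(100)  # truncation of k = 0, 1, 2, ...

total = sum(poisson_pmf(k, lam) for k in support)
mean = sum(k * poisson_pmf(k, lam) for k in support)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in support)

# the probabilities sum to one, and mean and variance both equal lam
print(total, mean, var)
```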
gamma distribution

a random variable is said to have a gamma distribution with parameters $\alpha$ and $\beta$ if it has the probability density function

$$g_X(x; \alpha, \beta) = \frac{x^{\alpha-1} e^{-x/\beta}}{\Gamma(\alpha)\beta^{\alpha}}, \quad x \in \mathbb{R}^{+}, \tag{4}$$

with $\alpha > 0$, $\beta > 0$, and $\Gamma(\alpha) = \int_{0}^{\infty} y^{\alpha-1} e^{-y}\, dy$. the parameter $\alpha$ is called the shape parameter and the parameter $\beta$ is called the scale parameter, because multiplying a gamma-distributed random variable by a positive constant rescales $\beta$ by that constant. written in terms of the rate parameter $\tau = 1/\beta$, the expected value and variance of the gamma distribution are

$$E(X) = \frac{\alpha}{\tau}, \tag{5}$$

$$Var(X) = \frac{\alpha}{\tau^{2}}. \tag{6}$$

the proof can be found in [14].

negative binomial distribution

a negative binomial distribution is formed by an experiment that satisfies the following conditions:
1. the experiment consists of a series of independent trials.
2. each trial can produce only one of two possible outcomes, failure or success.
3. the experiment continues until a specified total number of successes is reached.

the negative binomial distribution can also be formed as a mixture of the poisson distribution and the gamma distribution. by using equations (1) and (4) with $\tau = 1/\beta$, it can be shown that this distribution has two parameters ($\alpha$ and $\tau$). the probability mass function is

$$p_X(x) = \binom{x+\alpha-1}{\alpha-1} \left(\frac{1}{1+\tau}\right)^{x} \left(\frac{\tau}{1+\tau}\right)^{\alpha}, \tag{7}$$

with $x = 0, 1, 2, \dots$ [15].

2-poisson distribution

the 2-poisson distribution is a distribution with three parameters ($\lambda_1$, $\lambda_2$, and $p$). the probability function for the 2-poisson distribution is

$$p_X(x) = p\,\frac{e^{-\lambda_1}\lambda_1^{x}}{x!} + (1-p)\,\frac{e^{-\lambda_2}\lambda_2^{x}}{x!}, \tag{8}$$

with $x = 0, 1, 2, \dots$ [16]. some applications of this distribution can be seen in [17]-[18].

bühlmann’s credibility

bühlmann's credibility is a credibility model with the best accuracy approach. in this model, the number of policyholders observed is assumed to be the same every year.
premiums are determined from a linear model that combines past data with the theoretical premium. the parameters used in the bühlmann's credibility model are as follows:
1. the hypothetical mean (the average value of individual claims) and its expected value: $\mu(\theta) = E(X \mid \Theta = \theta)$ (9), $\mu = E(\mu(\Theta))$ (10)
2. the variance of the hypothetical mean: $a = Var(\mu(\Theta))$ (11), which equals $Var(\Theta)$ whenever $\mu(\theta) = \theta$
3. the process variance and its expected value: $v(\theta) = Var(X \mid \Theta = \theta)$ (12), $v = E(v(\Theta))$ (13)
4. the credibility coefficient: $K = \frac{v}{a}$ (14)
5. the credibility factor: $Z = \frac{N}{N+K}$ (15)
6. the credibility premium: $P_C = Z\bar{x} + (1-Z)\mu$ (16)
[19].

goodness-of-fit test

suppose $m$ is the largest value in the distribution of the observed data and $r$ is the number of parameters to be estimated. then, with the parameters replaced by their estimates, the statistic

$$\chi^{2} = \sum_{x=0}^{m} \frac{\left(n_x - n\,p_X(x)\right)^{2}}{n\,p_X(x)} \tag{17}$$

is asymptotically $\chi^{2}$-distributed with $m - r$ degrees of freedom. the test of the suitability of the parameter $\beta_i$ uses the following hypotheses:
$H_0: \theta = \beta_i$ (the data have a distribution that matches the tested distribution)
$H_1: \theta \neq \beta_i$ (the data do not have a distribution that matches the tested distribution)
for a chosen significance level and the corresponding degrees of freedom, the calculated chi-square is compared with the table chi-square under the following decision rule: if $\chi^{2}_{calculated} \le \chi^{2}_{table}$ the null hypothesis is accepted, and if $\chi^{2}_{calculated} > \chi^{2}_{table}$ the null hypothesis is rejected [20].

results and discussion

estimating bühlmann’s credibility parameters using negative binomial distribution

bühlmann's credibility model for negative binomial claims can be derived by estimating the credibility parameters under the assumption that the claim frequency has a negative binomial distribution.
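before specializing to a particular distribution, the generic quantities of equations (14)-(16) can be sketched in a few lines. this is a minimal illustration rather than the authors' implementation: the function name `buhlmann_premium` and all input values are hypothetical, and the premium is taken to be the usual weighted average of the observed mean and the prior mean $\mu$ from equation (10).

```python
def buhlmann_premium(x_bar, mu, v, a, n):
    """return (K, Z, P_C) following equations (14), (15) and (16).

    x_bar : observed mean claim frequency
    mu    : expected value of the hypothetical mean, equation (10)
    v     : expected value of the process variance, equation (13)
    a     : variance of the hypothetical mean, equation (11)
    n     : number of observation periods N
    """
    if a <= 0:
        raise ValueError("the variance of the hypothetical mean must be positive")
    k = v / a                          # credibility coefficient
    z = n / (n + k)                    # credibility factor, between 0 and 1
    premium = z * x_bar + (1 - z) * mu  # credibility premium
    return k, z, premium


# hypothetical inputs, not taken from the data used later
k, z, premium = buhlmann_premium(x_bar=0.15, mu=0.10, v=0.10, a=0.02, n=5)
print(k, z, premium)
```

with these inputs the coefficient is 5, so five periods of experience give the observed mean and the prior mean equal weight, and the premium lies halfway between 0.15 and 0.10. a larger `n` or a larger `a` moves the premium toward the observed mean.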
we know that the negative binomial distribution is a mixture of the poisson distribution and the gamma distribution. suppose $X \mid \Theta \sim \text{poisson}(\theta)$ and $\Theta \sim \text{gamma}(\alpha, \tau)$; it can be proven that $X$ has a negative binomial distribution with parameters $\alpha$ and $\tau$ [21]. the bühlmann’s credibility parameters under this distribution assumption are derived below.

hypothetical mean and the expected value for negative binomial model

the hypothetical mean for the negative binomial model can be determined using equation (9). since $X \mid \Theta \sim \text{poisson}(\theta)$, we have $\mu(\theta) = E(X \mid \Theta = \theta) = \theta$. the expected value of the hypothetical mean, also known as the individual premium ($\mu$), follows from equation (10):

$$\mu = E(\mu(\Theta)) = E(\Theta).$$

since $\Theta$ has a gamma($\alpha$, $\tau$) distribution, then according to equation (5),

$$\mu = \frac{\alpha}{\tau}. \tag{18}$$

to determine the credibility coefficient, it is necessary to estimate the parameter $a$, the variance of the hypothetical mean. using equation (11),

$$a = Var(\mu(\Theta)) = Var(\Theta) = E(\Theta^{2}) - \left(E(\Theta)\right)^{2}.$$

since $\Theta$ has a gamma distribution, then according to equation (6),

$$a = \frac{\alpha}{\tau^{2}}. \tag{19}$$

variance process and the expected value for negative binomial model

the process variance and its expected value can be determined using equations (12) and (13). since $X \mid \Theta \sim \text{poisson}(\theta)$, we have $v(\theta) = Var(X \mid \Theta = \theta) = \theta$. the expected value of the process variance ($v$) is

$$v = E(v(\Theta)) = E(\Theta) = \frac{\alpha}{\tau}. \tag{20}$$

bühlmann’s credibility coefficient for negative binomial model

the credibility coefficient is the ratio between the expected value of the process variance and the variance of the hypothetical mean. by using equations (14), (19), and (20), we obtain $K = \frac{v}{a} = \frac{\alpha/\tau}{\alpha/\tau^{2}} = \tau.$
(21)

bühlmann's credibility premium for negative binomial model

with the credibility parameters for the negative binomial model in hand, the premium for the negative binomial model follows from equations (15) and (16):

$$P_C = Z\bar{x} + (1-Z)\mu, \tag{22}$$

with

$$Z = \frac{N}{N+K} = \frac{N}{N+\tau}, \qquad \bar{x} = \frac{\sum_{i=1}^{N} x_i\, n_{x_i}}{\sum_{i=1}^{N} n_{x_i}}.$$

here $\bar{x}$ is the average of the observed data and $\mu$ is the individual premium, which can be determined using equation (18). $Z$ is the bühlmann’s credibility factor for claim frequencies with a negative binomial distribution, where $K$ is the credibility coefficient that satisfies equation (21).

estimating bühlmann's credibility parameters using 2-poisson distribution

the bühlmann’s credibility parameters under the 2-poisson distribution assumption are derived below.

hypothetical mean and the expected value for 2-poisson model

as before, the hypothetical mean for the 2-poisson model can be determined using equation (9). the 2-poisson distribution gives $X \mid \Theta \sim \text{poisson}(\theta)$ and $\Theta \sim u(\theta)$ with

$$u(\theta) = \begin{cases} p, & \theta = \lambda_1 \\ 1-p, & \theta = \lambda_2. \end{cases} \tag{23}$$

since $X \mid \Theta \sim \text{poisson}(\theta)$, then according to equation (2),

$$\mu(\theta) = E(X \mid \Theta = \theta) = \theta. \tag{24}$$

the expected value of the hypothetical mean ($\mu$) is

$$\mu = E(\mu(\Theta)) = E(\Theta) = p\lambda_1 + (1-p)\lambda_2 = p(\lambda_1 - \lambda_2) + \lambda_2. \tag{25}$$

furthermore, the variance of the hypothetical mean can be determined using equations (11) and (23) as follows:

$$
\begin{aligned}
a &= Var(\mu(\Theta)) = Var(\Theta) = E(\Theta^{2}) - \left(E(\Theta)\right)^{2} \\
&= \lambda_1^{2}\, p + \lambda_2^{2}(1-p) - \left(p(\lambda_1-\lambda_2) + \lambda_2\right)^{2} \\
&= \lambda_1^{2}\, p + \lambda_2^{2}(1-p) - \left(p^{2}(\lambda_1-\lambda_2)^{2} + 2p(\lambda_1-\lambda_2)\lambda_2 + \lambda_2^{2}\right) \\
&= (\lambda_1^{2}-\lambda_2^{2})\,p - \left[p(\lambda_1-\lambda_2) + 2\lambda_2\right] p(\lambda_1-\lambda_2) \\
&= \left[(\lambda_1+\lambda_2) - p(\lambda_1-\lambda_2) - 2\lambda_2\right] p(\lambda_1-\lambda_2) \\
&= (1-p)(\lambda_1-\lambda_2)\, p(\lambda_1-\lambda_2),
\end{aligned}
$$

so that

$$a = p(1-p)(\lambda_1-\lambda_2)^{2}. \tag{26}$$

variance process and the expected value for 2-poisson model

the process variance and its expected value can be determined using equations (12) and (13). since $X \mid \Theta \sim \text{poisson}(\theta)$, then according to equation (3), we get $v(\theta) = Var(X \mid \Theta = \theta) = \theta$. the expected value of the process variance ($v$) can be obtained using equation (23):

$$v = E(v(\Theta)) = E(\Theta) = p\lambda_1 + (1-p)\lambda_2 = p(\lambda_1-\lambda_2) + \lambda_2. \tag{27}$$

bühlmann’s credibility coefficient for 2-poisson model

the last step in determining bühlmann's credibility premium is to obtain the credibility coefficient. by using equations (14), (26), and (27), we obtain

$$K = \frac{v}{a} = \frac{p\lambda_1 + (1-p)\lambda_2}{p(1-p)(\lambda_1-\lambda_2)^{2}}. \tag{28}$$

bühlmann's credibility premium for 2-poisson model

bühlmann's credibility premium is obtained from equations (15), (16), and (28), so the bühlmann credibility premium for the 2-poisson model is

$$P_C = Z\bar{x} + (1-Z)\mu, \tag{29}$$

with

$$Z = \frac{N}{N+K} = \frac{N}{N + \dfrac{p\lambda_1 + (1-p)\lambda_2}{p(1-p)(\lambda_1-\lambda_2)^{2}}}, \qquad \bar{x} = \frac{\sum_{i=1}^{N} x_i\, n_{x_i}}{\sum_{i=1}^{N} n_{x_i}}.$$

$Z$ is the bühlmann’s credibility factor for claim frequencies with a 2-poisson distribution. the value of $\mu$ can be determined by equation (25), and $\bar{x}$ is the average of the observed data.

application on data

the data used to apply the model are the distribution of claims ($n_x$) in a motor vehicle insurance portfolio in singapore [22]. the claims occurred from 1993 to 2001 and are shown in table 1.

table 1. portfolio of the number of claims from observation results (millions of dollars)

x       n_x
0       178,080
1       19,224
2       1,859
3       177
4       11
5       1
>5      0
total   199,352

before the model is applied, the claim frequency data in table 1 are first tested to see whether they follow a negative binomial and a 2-poisson distribution. the distribution of the data was tested using the chi-square test.
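the sample moments of the table 1 counts, and the negative binomial parameters they imply, can be computed as in the sketch below. it is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the variance is the population variance (divisor $n$), and the estimators follow from matching the mean $\alpha/\tau$ and the variance $\alpha/\tau + \alpha/\tau^{2}$ of a poisson-gamma mixture with shape $\alpha$ and rate $\tau$.

```python
# observed number of policies n_x for each claim count x (table 1)
COUNTS = {0: 178080, 1: 19224, 2: 1859, 3: 177, 4: 11, 5: 1}


def moments(counts):
    """return (n, mean, population variance) of grouped count data."""
    n = sum(counts.values())
    mean = sum(x * c for x, c in counts.items()) / n
    second = sum(x * x * c for x, c in counts.items()) / n
    return n, mean, second - mean ** 2


def nb_moment_estimates(mean, var):
    """method-of-moments estimates (alpha, tau) for a poisson-gamma mixture."""
    excess = var - mean  # equals alpha / tau**2 under the model
    if excess <= 0:
        raise ValueError("no overdispersion: the moment estimates do not exist")
    return mean ** 2 / excess, mean / excess


n, mean, var = moments(COUNTS)
alpha_hat, tau_hat = nb_moment_estimates(mean, var)
print(n, mean, var, alpha_hat, tau_hat)
```

the guard on `excess` reflects that the estimators only exist when the sample variance exceeds the sample mean, which is the case for these data.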
negative binomial distribution test

the negative binomial distribution has two parameters, estimated by $\hat{\alpha}$ and $\hat{\tau}$, based on equation (7). the estimates can be obtained using the method of moments. based on table 1, the mean number of claims is $\bar{x} = 0.1179923$, so that $\bar{x}^{2} = 0.01392218$, and the variance is $S^{2} = 0.12881027$. the values of $\hat{\alpha}$ and $\hat{\tau}$ are

$$\hat{\alpha} = \frac{\bar{x}^{2}}{S^{2} - \bar{x}} = \frac{0.01392218}{0.01081797} = 1.28694966, \qquad \hat{\tau} = \frac{\bar{x}}{S^{2} - \bar{x}} = \frac{0.1179923}{0.01081797} = 10.9070648.$$

the distribution test steps are as follows:

a. formulate the hypotheses.
   $H_0$: the data have a negative binomial distribution.
   $H_1$: the data do not have a negative binomial distribution.

b. calculate the probability and the expected count for each claim frequency. the probability ($p_x$) is calculated for each $x$ in the table from the parameter estimates and the negative binomial distribution formula. for $x = 0$:

$$p_x = \binom{\alpha + x - 1}{x} \left(\frac{\tau}{1+\tau}\right)^{\alpha} \left(\frac{1}{1+\tau}\right)^{x}$$

$$p_0 = \binom{1.28694966 + 0 - 1}{0} \left(\frac{10.9070648}{1 + 10.9070648}\right)^{1.28694966} \left(\frac{1}{1 + 10.9070648}\right)^{0} = 0.89324646.$$

   the expected count ($n p_x$) is $n p_0 = 199,352\,(0.89324646) = 178,070.47$. proceeding in the same way gives the portfolio in table 2.

   table 2. portfolio of the number of claims with negative binomial distribution

   x       n_x        n p_X(x)
   0       178,080    178,070.47
   1       19,224     19,246.38
   2       1,859      1,848.29
   3       177        170.07
   4       11         15.31
   5       1          1.36
   >5      0          0.12
   total   199,352    199,352

c. determine the value of the chi-square test statistic. the chi-square test statistic from equation (17) is $\chi^{2}_{calculated} = 1.6912$. at the 95% confidence level ($\alpha = 0.05$), $\chi^{2}_{table}$ with $m - r = 5 - 2 = 3$ degrees of freedom is 7.8147. since $\chi^{2}_{calculated} < \chi^{2}_{table}$, the null hypothesis is accepted.
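the expected counts and the chi-square statistic of this test can be recomputed with the short sketch below. it is an illustration under assumptions rather than the authors' procedure: it plugs in the estimates reported above, evaluates the probability function in step b through `math.lgamma` (needed because $\hat{\alpha}$ is not an integer), and sums over the cells $x = 0, \dots, 5$ only. the statistic depends on how the sparse tail cells are grouped, so it need not match the reported value to the last digit.

```python
import math

OBSERVED = [178080, 19224, 1859, 177, 11, 1]  # n_x for x = 0, ..., 5
ALPHA_HAT = 1.28694966                         # reported moment estimate
TAU_HAT = 10.9070648                           # reported moment estimate
CHI2_TABLE = 7.8147                            # critical value used in the text (3 d.f.)


def nb_pmf(x, alpha, tau):
    """negative binomial probability with shape alpha and rate tau (step b)."""
    log_coef = math.lgamma(alpha + x) - math.lgamma(alpha) - math.lgamma(x + 1)
    log_prob = alpha * math.log(tau / (1 + tau)) - x * math.log(1 + tau)
    return math.exp(log_coef + log_prob)


n = sum(OBSERVED)
expected = [n * nb_pmf(x, ALPHA_HAT, TAU_HAT) for x in range(len(OBSERVED))]
chi2 = sum((o - e) ** 2 / e for o, e in zip(OBSERVED, expected))

# accept the null hypothesis when the statistic does not exceed the table value
accept_h0 = chi2 <= CHI2_TABLE
print([round(e, 2) for e in expected], chi2, accept_h0)
```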
thus, the data used in this study meet the requirements for a negative binomial distribution.

2-poisson distribution test

the 2-poisson distribution has three parameters, based on equation (8). by using the method of moments, it is obtained that $\hat{p} = 0.77481$, $\hat{\lambda}_1 = 0.06191$, and $\hat{\lambda}_2 = 0.31092$. the distribution test steps are as follows:

a. formulate the hypotheses.
   $H_0$: the data have a 2-poisson distribution.
   $H_1$: the data do not have a 2-poisson distribution.

b. calculate the probability and the expected count for each claim frequency. the probability ($p_x$) is calculated for each $x$ in the table from the parameter estimates and the 2-poisson distribution formula, which gives the portfolio in table 3.

   table 3. portfolio of the number of claims with 2-poisson distribution

   x       n_x        n p_X(x)
   0       178,080    178,081.52
   1       19,224     19,217.82
   2       1,859      1,868.36
   3       177        170.53
   4       11         12.89
   5       1          0.79
   >5      0          0.04
   total   199,352    199,352

c. determine the value of the chi-square test statistic. the chi-square test statistic from equation (17) is $\chi^{2}_{calculated} = 0.6249634$. at the 95% confidence level ($\alpha = 0.05$), $\chi^{2}_{table}$ with $m - r = 4 - 3 = 1$ degree of freedom is 3.8414. since $\chi^{2}_{calculated} < \chi^{2}_{table}$, the null hypothesis is accepted. thus, the data used meet the requirements for a 2-poisson distribution.

parameter estimation of bühlmann's credibility premium

based on the estimated distribution parameters, the parameters of bühlmann's credibility premium can be estimated. the estimates are determined using equations (18)-(29) and are presented in table 4.

table 4.
parameter estimation of bühlmann's credibility premium from the data used

parameter estimation   negative binomial   2-poisson
$\bar{x}$              0.1179              0.1179
$\hat{\mu}$            0.1180              0.1179
$\hat{a}$              0.0108              0.3696
$v$                    0.1180              0.1179
$\hat{K}$              10.9071             0.3191
$\hat{Z}$              0.9999              0.9494
$P_C$                  0.11790             0.11799

based on the results in table 4, the estimated claim frequency in the next period is 0.11790 under the negative binomial assumption and 0.11799 under the 2-poisson assumption. this means that the negative binomial model estimates that 11.79% of policyholders will make insurance claims in the next period, while the 2-poisson model estimates 11.799%. the two models give fairly close results because the bühlmann's credibility factor is quite large in both models, namely 0.9999 and 0.9494, respectively.

conclusions

bühlmann's credibility formula has been given for data with negative binomial and 2-poisson distributions. both distributions are mixed distributions. mixed distributions work well for determining premiums with credibility because, in a mixed distribution, the distribution of claim frequency depends on the distribution of risk. the simulation on the data shows that the premium value obtained is very good, with high credibility for both distributions modeled.

references

[1] t. futami, matematika asuransi jiwa. tokyo: kyoei life insurance, 1993.
[2] a. k. mutaqin, a. kudus, and y. karyana, “metode parametrik untuk menghitung premi program asuransi usaha tani padi di indonesia,” ethos (jurnal penelit. dan pengabdi. masyarakat), vol. 4, no. 2, pp. 318–326, 2016, doi: 10.29313/ethos.v0i0.1656.
[3] h. bühlmann and a. gisler, a course in credibility theory and its applications. berlin: springer, 2005.
[4] i. maulidi and v.
apriliani, “model kredibilitas bühlmann dengan frekuensi klaim berdistribusi binomial negatif-lindley,” limits j. math. its appl., vol. 18, no. 1, pp. 71–78, 2021, doi: 10.12962/limits.v18i1.6690.
[5] t. m. karina, s. nurrohmah, and i. fithriani, “buhlmann credibility model in predicting claim frequency that follows heterogeneous weibull count distribution,” in journal of physics: conference series, 2019, vol. 1218, no. 1, p. 012041. doi: 10.1088/1742-6596/1218/1/012041.
[6] l. m. wen, w. wang, and j. l. wang, “the credibility premiums for exponential principle,” acta math. sin. engl. ser., vol. 27, no. 11, pp. 2217–2228, 2011, doi: 10.1007/s10114-011-9198-4.
[7] a. hassan zadeh and d. a. stanford, “bayesian and bühlmann credibility for phase-type distributions with a univariate risk parameter,” scand. actuar. j., vol. 2016, no. 4, pp. 338–355, 2016, doi: 10.1080/03461238.2014.926977.
[8] a. k. mutaqin and k. komarudin, “perhitungan premi untuk asuransi kendaraan bermotor berdasarkan sejarah frekuensi klaim pemegang polis menggunakan analisis bayes,” pythagoras j. pendidik. mat., vol. 4, no. 1, pp. 47–55, 2008, doi: 10.21831/pg.v4i1.686.
[9] s. a. thamrin, a. lawi, and r. mahmudah, “simulasi penaksiran parameter distribusi weibull campuran untuk data survival heterogen dengan pendekatan bayesian,” indoms j. stat., vol. 2, no. 2, pp. 37–46, 2014.
[10] a. palmisano, “poisson and binomial distribution,” the encyclopedia of archaeological sciences, no. june. pp. 1–4, 2018. doi: 10.1002/9781119188230.saseas0467.
[11] i. maulidi, w. erliana, a. d. garnadi, s. nurdiati, and i. g. p. purnaba, “penghitungan kredibilitas dengan pustaka actuar dalam r,” j. math. its appl., vol. 16, no. 2, pp. 45–52, 2017, doi: 10.29244/jmap.16.2.45-52.
[12] s. ghahramani, fundamentals of probability. new york (us): prentice hall, 2005.
[13] j. zhao, f. zhang, c. zhao, g. wu, h.
wang, and x. cao, “the properties and application of poisson distribution,” in journal of physics: conference series, 2020, vol. 1550, no. 3, p. 032109. doi: 10.1088/1742-6596/1550/3/032109. [14] l. j. bain and m. engelhardt, introduction to probability and mathematical statistics. california: duxbury press, 1992. [15] d. lord and s. r. geedipally, “the negative binomial–lindley distribution as a tool for analyzing crash data characterized by a large amount of zeros,” accid. anal. prev., vol. 43, no. 5, pp. 1738–1742, 2011, doi: 10.1016/j.aap.2011.04.004. [16] p. c. consul and g. c. jain, “a generalization of the poisson distribution,” technometrics, vol. 15, no. 4, pp. 791–799, 1973, doi: 10.1080/00401706.1973.10489112. [17] s. e. robertson and s. walker, some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval, no. april. london, 1994. doi: 10.1007/978-1-4471-2099-5. [18] p. cholayudth, “application of poisson distribution in establishing control limits for discrete quality attributes,” j. valid. technol., vol. 13, no. 3, pp. 196–205, 2007, [online]. available: https://www.ivtnetwork.com/sites/default/files/poisiondistrib_01.pdf [19] t. n. herzog, introduction to credibility theory. winsted (us): actex publications, 1999. [20] r. e. walpole, pengantar statistika. jakarta: gramedia pustaka utama, 1995. [21] s. wu, “poisson-gamma mixture processes and applications to premium calculation,” commun. stat. methods, pp. 1–29, 2020, doi: 10.1080/03610926.2020.1850791. [22] s. m. simanjuntak, “beberapa model banyaknya klaim dalam sistem bonus malus,” ipb university, 2017. 
a study of count regression models for mortality rate

cauchy – jurnal matematika murni dan aplikasi, volume 7(1) (2021), pages 142-151
p-issn: 2086-0382; e-issn: 2477-3344
submitted: october 16, 2021; reviewed: november 04, 2021; accepted: november 05, 2021
doi: https://doi.org/10.18860/ca.v7i1.13642

anwar fitrianto
department of statistics, faculty of mathematics and natural sciences, ipb university
email: anwarstat@gmail.com

abstract

in this study, the poisson regression model, the negative binomial 1 regression model (negbin 1), and the negative binomial 2 regression model (negbin 2) were proposed to fit mortality rate data. the models are compared using the akaike information criterion (aic) and the bayesian information criterion (bic) to find out which one suits the data best. the results show that the data are indeed overdispersed. among the three models, the preferred model is the negbin 1 model.

keywords: mortality; poisson; regression; binomial; overdispersion

introduction

count data contain variables that count how many times something has happened, such as the number of cases of a particular disease in epidemiology [1]. linear regression models have often been applied to this kind of data, but the results are inefficient, inconsistent, and biased. mortality data are count data that include an offset variable.

a study of how ischemic heart disease (ihd) affects mortality in middle-aged men was conducted by [2]. the results showed that 46 of 109 deaths during about 11.4 years of follow-up were due to ihd. among studies on causes of death other than ihd, [3] researched the global impact of hiv/aids. another study on mortality was conducted by [4] on diarrheal disease.
it has been found that diarrhea causes 1 in 9 child deaths worldwide, making it the second leading cause of death among children under 5 years of age. in addition, [5] examined the global causes of death due to disease in children under 5 years. in their study, diarrhea remained the second leading cause of death from infection in children over the last 30 years. malnutrition is also one of the world's most worrying problems; it is involved in about 6 million child deaths every year. [6] found that poor nutrition during fetal development can cause severe physical damage and that malnutrition always increases susceptibility to disease. a study conducted by [7] stated that malnutrition (measured as poor anthropometric status) accounted for nearly 50% of childhood deaths.

beyond mortality due to disease, [8] stated that injuries and deaths from road traffic accidents (rta) are becoming severe in countries such as india. not a day goes by without an rta in india, and many people die or become disabled. in addition, suicide is one of the factors that contribute to the death rate. according to a study by [9], suicidal behavior has always been a major health problem in many countries, both developed and developing.

the poisson regression model is one of the generalized linear models for data with offset variables. it is also the standard model for count data and contingency tables. in this model, the response variable is assumed to have a poisson distribution. negative binomial regression is also a generalized linear model in which the dependent variable is a number of events. the negative binomial distribution is a two-parameter distribution that is generally more flexible than the poisson model [2]. it can also model overdispersed counts, which the poisson model cannot.
the negative binomial model can be derived from the poisson distribution and the generalized poisson distribution. [10] has discussed several other specific mortality measures, such as age-specific crude death rates, cause-specific mortality rates, and infant and maternal mortality rates. the data collection process may produce biased and inaccurate measurements, and this inaccuracy can cause overdispersion. this study aims to identify the most suitable method for mortality data, which are usually overdispersed.

methods

data

the data used in this study are mortality rate data available in [11]. the data consist of 163 observations (countries) with seven independent variables: the number of people dying per 100,000 live births due to ihd ($x_1$), diarrheal disease ($x_2$), hiv/aids ($x_3$), malaria ($x_4$), malnutrition ($x_5$), road accidents ($x_6$), and suicides ($x_7$).

count regression models

according to [12], count regression models have been suggested for modeling over-dispersed and zero-inflated count response variables. poisson regression is the standard model for count data, while the negative binomial regression model is often introduced to handle count data with overdispersion. meanwhile, the zero-inflated poisson model (zip) and the zero-inflated negative binomial model (zinb) are introduced to handle zero-inflated variables, in which the data contain many zeros. moreover, [13] found that zip and zinb can be obtained by mixing a distribution degenerate at zero with a poisson regression and a negative binomial regression, respectively. the probability mass function of the zip is

$$P(Y = y_i) = \begin{cases} \omega_i + (1-\omega_i)\exp(-\lambda_i), & y_i = 0 \\ (1-\omega_i)\dfrac{\lambda_i^{y_i}}{y_i!}\exp(-\lambda_i), & y_i > 0. \end{cases} \tag{1}$$

meanwhile, the zinb's probability mass function can be formulated as

$$P(Y = y_i) = \begin{cases} \omega_i + (1-\omega_i)\left(\dfrac{\theta}{\theta+\lambda_i}\right)^{\theta}, & y_i = 0 \\ (1-\omega_i)\dfrac{\Gamma(y_i+\theta)}{y_i!\,\Gamma(\theta)}\left(\dfrac{\theta}{\theta+\lambda_i}\right)^{\theta}\left(\dfrac{\lambda_i}{\theta+\lambda_i}\right)^{y_i}, & y_i > 0, \end{cases} \tag{2}$$

with $\lambda_i = \exp(x_i\beta)$.
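the two zero-inflated probability functions can be written down directly and checked to sum to one. the sketch below is a minimal illustration under assumptions, not code from the study: the function names and the parameter values `lam = 3.0`, `theta = 2.0` and `omega = 0.3` are hypothetical, the poisson zero probability is taken as $\exp(-\lambda_i)$, and the sums are truncated at 200 terms.

```python
import math


def poisson_pmf(y, lam):
    """poisson probability, computed on the log scale to avoid overflow."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def negbin_pmf(y, lam, theta):
    """negative binomial probability with mean lam and dispersion parameter theta."""
    log_p = (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
             + theta * math.log(theta / (theta + lam))
             + y * math.log(lam / (theta + lam)))
    return math.exp(log_p)


def zip_pmf(y, lam, omega):
    """zero-inflated poisson: extra mass omega at zero, equation (1)."""
    base = (1 - omega) * poisson_pmf(y, lam)
    return omega + base if y == 0 else base


def zinb_pmf(y, lam, theta, omega):
    """zero-inflated negative binomial: extra mass omega at zero, equation (2)."""
    base = (1 - omega) * negbin_pmf(y, lam, theta)
    return omega + base if y == 0 else base


lam, theta, omega = 3.0, 2.0, 0.3  # hypothetical values for illustration
zip_total = sum(zip_pmf(y, lam, omega) for y in range(200))
zinb_total = sum(zinb_pmf(y, lam, theta, omega) for y in range(200))
print(zip_total, zinb_total)
```

setting `omega = 0` recovers the ordinary poisson and negative binomial probabilities, which is a quick way to see that the zero-inflated models nest the standard ones.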
in these zero-inflated models, the extra zeros arise with probability $\omega_i$ from a second process. the function $F$ that relates the product $x_i\gamma$ to the probability $\omega_i$ is called the zero-inflated link function, $\omega_i = F(x_i\gamma)$.

poisson regression model

[14] studied the poisson regression model as the standard model for count data. the response $y_i$ is a count of events, and the marginal probability in poisson regression is written as:

$$P(Y = y_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{\Gamma(1+y_i)}$$ (3)

with $\lambda_i = \exp(\alpha + x_i\beta)$ and $y_i = 0, 1, 2, \dots$. the rate parameter of poisson regression, $\lambda_i$, is also its expected count and is formulated as:

$$\lambda_i = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}$$ (4)

based on equation (4), the log-linear model for the mean rate is written as:

$$\log(\lambda_i) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$ (5)

where $p$ is the number of predictors or covariates in the model, $\beta_0$ is the intercept of the regression, $\beta_1, \dots, \beta_p$ are the regression coefficients, and $x_1, \dots, x_p$ are the independent variables.

[14] formulated maximum likelihood estimation (mle) for poisson regression. let $Y$ be a random variable with a poisson distribution and an unknown parameter $\theta$. its probability mass function is written $P_Y(y; \theta)$ to emphasize the parameter $\theta$, and $n$ independent trials yield the data $y_1, y_2, \dots, y_n$. the joint probability mass function is as follows:

$$P_{Y_1 \dots Y_n}(y_1, \dots, y_n; \theta) = P_Y(y_1; \theta) \cdots P_Y(y_n; \theta)$$ (6)

the likelihood of $\theta$ given the data $y_1, \dots, y_n$ is equation (6) viewed as a function of $\theta$:

$$L(\theta; y_1, \dots, y_n) = P_{Y_1 \dots Y_n}(y_1, \dots, y_n; \theta) = P_Y(y_1; \theta) \cdots P_Y(y_n; \theta)$$ (7)

the maximum likelihood estimate $\hat{\theta}$ is the value of $\theta$ that maximizes this likelihood. here $Y$ follows a poisson distribution with unknown parameters, and the data collected from the independent trials are of the form $Y_1 = y_1, Y_2 = y_2, \dots, Y_n = y_n$. the likelihood function of the poisson regression is therefore written as:

$$L = \prod_{i=1}^{N} \frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}$$
(8)

the log-likelihood function of poisson regression is obtained by taking the logarithm of equation (8),

$$\ell = \sum_{i=1}^{N} \log\left(\frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}\right) = \sum_{i=1}^{N}\left(-\lambda_i + y_i\log\lambda_i - \log y_i!\right).$$

the standard negative binomial regression model

the poisson model assumes that the mean of the data equals the variance. according to [1], in most applications the variance is greater than the mean, which is called overdispersion. based on the study of [15], the poisson regression model is inefficient when dealing with overdispersed data. according to [16], the negative binomial distribution is more flexible than the poisson distribution for modeling overdispersed data because it has two parameters. in particular, negative binomial regression can model overdispersed counts.

the negative binomial model can be derived as a poisson-gamma mixture. starting from the conditional mean of the poisson model,

$$E(y_i \mid x_i, \varepsilon_i) = \exp(\alpha + x_i\beta + \varepsilon_i) = h_i\lambda_i$$ (9)

where $h_i = \exp(\varepsilon_i)$. in the poisson-gamma mixture, $y_i$ given $h_i$ follows the poisson distribution while $h_i = \exp(\varepsilon_i)$ follows a gamma distribution. the $h_i$ is assumed to follow a gamma distribution whose two parameters (shape and rate) are both equal to $\theta$, so that $E(h_i) = 1$:

$$f(h_i) = \frac{\theta^{\theta}}{\Gamma(\theta)}\exp(-\theta h_i)\,h_i^{\theta-1}$$ (10)

once $h_i$ has been integrated out from the joint distribution, the marginal probability of the negative binomial distribution is obtained as follows:

$$P(Y = y_i \mid x_i) = \frac{\Gamma(\theta+y_i)}{y_i!\,\Gamma(\theta)}\left(\frac{\theta}{\theta+\lambda_i}\right)^{\theta}\left(\frac{\lambda_i}{\theta+\lambda_i}\right)^{y_i}$$ (11)

the mean of the negative binomial is the same as in poisson regression, $E(y_i \mid x_i) = \lambda_i = e^{x_i\beta}$, and the variance of the negative binomial is written as:

$$Var(y_i \mid x_i) = \lambda_i\left[1 + \left(\frac{1}{\theta}\right)\lambda_i\right] = \lambda_i(1 + k\lambda_i)$$ (12)

where $k = Var(h_i) = 1/\theta$. moreover, the rate parameter of negative binomial regression, $\lambda_i$, which is also its expected count, is written as:

$$\lambda_i = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}$$ (13)

the log-linear model for the mean rate of negative binomial regression is obtained by taking the logarithm of equation (13): 𝑙𝑜𝑔(𝜆𝑖) = 𝛽0 + 𝛽1𝑥1+. . .
+𝛽𝑝𝑥𝑝, (14)

where $p$ is the number of predictors or covariates in the model, $\beta_0$ is the intercept of the regression, $\beta_1, \dots, \beta_p$ are the regression coefficients, and $x_1, \dots, x_p$ are the independent variables.

[17] has discussed the mle of negative binomial regression for a random sample of $n$ subjects. in a standard negative binomial model, each subject has a dependent variable $y_i$ and predictor variables $x_{1i}, x_{2i}, \dots, x_{pi}$. the predictor variables are combined to form the following matrix,

$$\mathbf{X} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{p1} \\ 1 & x_{12} & \cdots & x_{p2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1n} & \cdots & x_{pn} \end{bmatrix}.$$

the $i$-th row of $\mathbf{X}$ is denoted $x_i$. replacing $\lambda_i$ in equation (11) by $e^{x_i\beta}$, the equation can be rewritten as,

$$P(Y = y_i \mid x_i) = \frac{\Gamma(\theta+y_i)}{y_i!\,\Gamma(\theta)}\left(\frac{\theta}{\theta+e^{x_i\beta}}\right)^{\theta}\left(\frac{e^{x_i\beta}}{\theta+e^{x_i\beta}}\right)^{y_i}$$ (15)

the likelihood function of the negative binomial is stated as below,

$$L = \prod_{i=1}^{n}\frac{\Gamma(\theta+y_i)}{y_i!\,\Gamma(\theta)}\left(\frac{\theta}{\theta+e^{x_i\beta}}\right)^{\theta}\left(\frac{e^{x_i\beta}}{\theta+e^{x_i\beta}}\right)^{y_i}$$ (16)

and the log-likelihood function of negative binomial regression is obtained by taking the logarithm of equation (16):

$$\ell = \sum_{i=1}^{n}\left\{\ln\Gamma(\theta+y_i) - \ln\Gamma(y_i+1) - \ln\Gamma(\theta) + \theta\ln\theta + y_i x_i\beta - (\theta+y_i)\ln\left(\theta+e^{x_i\beta}\right)\right\}$$ (17)

negative binomial 1 model and negative binomial p model

[18] have shown that the model derived from equation (9), whose variance function is equation (12), is the negative binomial 2 (negbin 2) model. they re-parameterized the negbin 2 model to obtain the negative binomial 1 (negbin 1) specification, whose variance is written as,

$$Var(y_i \mid x_i) = \lambda_i + k\lambda_i = \lambda_i(1 + k)$$ (18)

the marginal probability of negbin 1 is obtained by replacing $\theta$ with $\theta\lambda_i$ in equation (11),

$$P(Y = y_i \mid x_i) = \frac{\Gamma(\theta\lambda_i+y_i)}{y_i!\,\Gamma(\theta\lambda_i)}\left(\frac{\theta\lambda_i}{\theta\lambda_i+\lambda_i}\right)^{\theta\lambda_i}\left(\frac{\lambda_i}{\theta\lambda_i+\lambda_i}\right)^{y_i}$$ (19)

by replacing $\theta$ with $\theta\lambda_i^{2-P}$ in equation (11), the negative binomial p (negbin p) model is written as:

$$P(Y = y_i \mid x_i) = \frac{\Gamma(\theta\lambda_i^{2-P}+y_i)}{y_i!\,\Gamma(\theta\lambda_i^{2-P})}\left(\frac{\theta\lambda_i^{2-P}}{\theta\lambda_i^{2-P}+\lambda_i}\right)^{\theta\lambda_i^{2-P}}\left(\frac{\lambda_i}{\theta\lambda_i^{2-P}+\lambda_i}\right)^{y_i}$$
(20)

overdispersion

[19] noted that in almost all statistical studies of count data, the dependent variable is assumed to follow the poisson distribution, so the mean is assumed to be equal to the variance. however, in real data the variance is usually larger than the mean. [19] also stated that overdispersion indicates high variability around a model's fitted values in the poisson formulation, which leads to the negative binomial model as a proposal to correct this problem. when the data are over-dispersed, the variance is not the same as the mean; that is, $Var(y_i) = \varphi\lambda_i$, where $\lambda_i$ is the mean. if $\varphi = 1$, the model is the ordinary poisson model; if $\varphi > 1$, the model is overdispersed. [20] stated that, for the poisson member of the exponential family, the conditional variance equals the conditional mean, so in the poisson model the dispersion parameter is fixed at the constant value $\varphi = 1$.

count data

according to [16], count data indicate how many times or how frequently something happens. furthermore, [18] stated that an event count is the number of times an event occurs and is a nonnegative random variable. examples of count data include the number of patients hospitalized, the number of thieves arrested, and the number of natural disasters. in some cases, count data have offset variables. [21] said that offset variables are commonly used in generalized linear models (glm) and count regression models. an offset is usually used whenever the data are recorded over an observed period, and it denotes that period in the glm. more generally, an offset is defined as a measure of exposure. for example, the exposure can be the number of house years, with the response being the number of claims incurred. the log-linear mean rate for the poisson regression and negative binomial models is, 𝑙𝑜𝑔(𝜆𝑖) = 𝛽0 + 𝛽1𝑥1+. . .
+𝛽𝑝𝑥𝑝, (21)

when applying poisson regression or negative binomial regression with an offset, the offset variable $\log(t)$ is added:

$$\log(\lambda_i) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \log(t)$$ (22)

$$\log(\lambda_i) - \log(t) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

$$\log\left(\frac{\lambda_i}{t}\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p,$$

where $p$ is the number of predictors or covariates in the model, $\beta_0$ is the intercept of the regression, $\beta_1, \dots, \beta_p$ are the coefficients of the regression, $x_1, \dots, x_p$ are the independent variables, $t$ is the period observed (exposure), $\log(t)$ is the offset variable, and $\lambda_i / t$ is the rate. in this study, our interest is in modeling mortality data, which are count data. poisson regression and negative binomial regression are generally appropriate for count data, and our interest is to find out which regression best fits the mortality data.

modelling the mortality rate data

poisson regression and negative binomial regression are the main models studied in this research. both the poisson model and the negative binomial model are written as equation (22), where $p$ is the number of predictors or covariates in the model, $\beta_0$ is the intercept of the regression, $\beta_1, \dots, \beta_p$ are the covariate coefficients, and $x_1, \dots, x_p$ are the independent variables. the rate $\lambda_i / t$ represents the number of people dying per time unit, and the linear function $x\beta$ describes how the log death rate changes as a function of the covariates. for each coefficient, the null hypothesis states that the slope is equal to zero, whereas the alternative hypothesis states that the slope is not equal to zero.

goodness-of-fit test

deviance and pearson's chi-square are used to check whether the data have overdispersion or underdispersion. the deviance and pearson's chi-square, each divided by the degrees of freedom (df), should be approximately equal to one.
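the dispersion check described above can be sketched in python as follows. this is a minimal illustration, not the sas proc genmod computation used in the study: the function name `poisson_dispersion` and the toy inputs are hypothetical, the fitted means are assumed to be already available, and the standard poisson forms of the deviance and pearson statistic are assumed.

```python
import math


def poisson_dispersion(y, fitted, n_params):
    """return (deviance / df, pearson chi-square / df) for a poisson fit.

    y        : observed counts
    fitted   : fitted means (lambda_i) from some poisson regression
    n_params : number of estimated coefficients, including the intercept
    """
    df = len(y) - n_params
    deviance = 0.0
    pearson = 0.0
    for yi, mu in zip(y, fitted):
        # y * log(y / mu) is taken as 0 when y = 0 (its limiting value)
        log_term = yi * math.log(yi / mu) if yi > 0 else 0.0
        deviance += 2.0 * (log_term - (yi - mu))
        # for the poisson model the variance function is the mean itself
        pearson += (yi - mu) ** 2 / mu
    return deviance / df, pearson / df


if __name__ == "__main__":
    # toy data: two observations, one estimated parameter
    print(poisson_dispersion([0, 10], [5.0, 5.0], n_params=1))
```

with the 163 observations and 8 estimated coefficients (intercept plus seven predictors) of this study, the degrees of freedom would be 163 - 8 = 155.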
if the values are greater than one, this indicates that the data are overdispersed. the goodness-of-fit test is performed using proc genmod in sas. the deviance, written here in its poisson form, is:

$$D = 2\sum_{i=1}^{n}\left\{y_i\log\left(\frac{y_i}{\lambda_i}\right) - (y_i - \lambda_i)\right\}$$ (23)

and the pearson's chi-square is defined as,

$$\chi^2 = \sum_{i=1}^{n}\frac{(y_i - \lambda_i)^2}{Var(y_i)}$$ (24)

where $y_i$ is the observed count and $\lambda_i = e^{\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p}$ is the fitted mean.

results and discussion

mortality rate data models

proc genmod in sas version 9.4 was used to run the poisson regression analysis. at the 5% level of significance, all independent variables contributed significantly to the mortality rate, with the following estimated poisson regression model (table 1):

$$\widehat{\log\left(\frac{\lambda_i}{t}\right)} = 6.5834 + 0.0008x_1 + 0.0039x_2 + 0.0010x_3 + 0.0046x_4 - 0.0032x_5 - 0.0123x_6 + 0.0081x_7$$

table 1. analysis of maximum likelihood parameter estimates for poisson regression

| parameter | degrees of freedom | estimate | standard error | chi-square | pr > chi-square |
|---|---|---|---|---|---|
| intercept | 1 | 6.5834 | 0.0083 | 6255069.00 | <.0001 |
| $x_1$ | 1 | 0.0008 | 0.0000 | 507.71 | <.0001 |
| $x_2$ | 1 | 0.0039 | 0.0002 | 312.25 | <.0001 |
| $x_3$ | 1 | 0.0010 | 0.0000 | 1787.60 | <.0001 |
| $x_4$ | 1 | 0.0046 | 0.0002 | 529.74 | <.0001 |
| $x_5$ | 1 | -0.0032 | 0.0003 | 95.57 | <.0001 |
| $x_6$ | 1 | -0.0123 | 0.0004 | 1092.88 | <.0001 |
| $x_7$ | 1 | 0.0081 | 0.0004 | 463.89 | <.0001 |

the estimated poisson model, along with the standard error of each estimated coefficient and the p-values, indicated that ihd, diarrheal disease, aids/hiv, malaria, malnutrition, road accidents, and suicides were all significant predictors of the mortality rate. as an alternative to the poisson regression model, the data were also analyzed using the negative binomial model. table 2 displays the result of the analysis based on maximum likelihood estimation for the negative binomial regression.

table 2.
analysis of maximum likelihood parameter estimates for negative binomial regression

| parameter | degrees of freedom | estimate | standard error | chi-square | pr > chi-square |
|---|---|---|---|---|---|
| intercept | 1 | 6.5602 | 0.00875 | 5616.57 | <.0001 |
| $x_1$ | 1 | 0.0008 | 0.0004 | 4.02 | 0.0451 |
| $x_2$ | 1 | 0.0046 | 0.0026 | 3.16 | 0.0755 |
| $x_3$ | 1 | 0.0011 | 0.0003 | 13.13 | 0.0003 |
| $x_4$ | 1 | 0.0054 | 0.0022 | 5.95 | 0.0147 |
| $x_5$ | 1 | -0.0041 | 0.0038 | 1.19 | 0.2750 |
| $x_6$ | 1 | -0.0138 | 0.0037 | 13.88 | 0.0002 |
| $x_7$ | 1 | 0.0112 | 0.0045 | 6.13 | 0.0133 |

fitting the data with the negative binomial regression model showed that all independent variables except $x_2$ (diarrheal disease) and $x_5$ (malnutrition) contribute significantly to the mortality rate. both of these variables have p-values above the 5% level (0.0755 for diarrheal disease and 0.2750 for malnutrition). hence, diarrheal disease and malnutrition were not significant predictors, while the other variables, ihd, aids/hiv, malaria, road accidents, and suicides, were significant predictors. the predicted model using the negative binomial regression model for the mortality rate data is written as,

$$\widehat{\log\left(\frac{\lambda_i}{t}\right)} = 6.5602 + 0.0008x_1 + 0.0046x_2 + 0.0011x_3 + 0.0054x_4 - 0.0041x_5 - 0.0138x_6 + 0.0112x_7$$

descriptive statistics of the variables for checking overdispersion

when the variance of a variable is higher than its mean, this indicates that the data are overdispersed. in this study, the dependent variable's mean and variance were 824.0061 and 105125.22, respectively, indicating overdispersion. table 3 displays the means and variances of all the variables in the study. for every variable the variance exceeded the mean, which suggests greater variability around the fitted values than poisson regression allows, that is, $Var(y_i) = \varphi\lambda_i$ with $\varphi > 1$. as a consequence, negative binomial regression was the better approach for modeling these over-dispersed count data.

table 3.
the mean and the variance for each variable

| variable | mean | variance |
|---|---|---|
| mortality | 824.0061 | 105125.22 |
| ihd | 114.5747 | 6031.84 |
| diarrheal disease | 22.0791 | 1118.02 |
| aids/hiv | 46.6652 | 11754.18 |
| malaria | 11.5688 | 481.2379 |
| malnutrition | 12.7489 | 477.6099 |
| road accidents | 17.2767 | 91.6093 |
| suicides | 10.0120 | 52.6231 |

goodness-of-fit test for poisson regression and negative binomial regression

the main purpose of the goodness-of-fit test is to determine the more appropriate model. table 4 presents the deviance and pearson's chi-square, to observe whether their values divided by the degrees of freedom are close to one.

table 4. goodness-of-fit test for poisson regression and negative binomial regression

| model | criterion | df | value | value/df |
|---|---|---|---|---|
| poisson regression | deviance | 155 | 15081.6228 | 97.3008 |
| poisson regression | pearson's chi-square | 155 | 14196.5148 | 91.5904 |
| negative binomial regression | deviance | 155 | 167.1663 | 1.0785 |
| negative binomial regression | pearson's chi-square | 155 | 131.1002 | 0.8458 |

the value/df of the deviance and pearson's chi-square for the poisson model were 97.3008 and 91.5904, respectively, which are far higher than one. the poisson model therefore did not describe the data correctly: there was much greater variability among the counts than would be expected for a poisson distribution. this situation can arise because repeated subjects may not be independent. another possible reason for the overdispersion is that experimental conditions are not under control, so that $\lambda_i$ varies with uncontrolled factors. the table shows that negative binomial regression was the better alternative for modeling the mortality rate. its value/df for the deviance and pearson's chi-square were 1.0785 and 0.8458, respectively. both values were much closer to one than the corresponding values for the poisson regression model.

comparison between poisson regression, negative binomial 1 and negative binomial 2

comparisons between the three proposed models for the mortality data are given in table 5. the aic for poisson regression was larger compared to the other two.
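the aic and bic values in this comparison can be cross-checked against each other, because both criteria share the same log-likelihood term. the python sketch below is a hedged illustration rather than part of the original analysis: it assumes the usual definitions aic = -2 log L + 2k and bic = -2 log L + k ln(n), takes n = 163 observations, and assumes k = 8 parameters for the poisson model (intercept plus seven coefficients) and k = 9 for the negative binomial models (one extra dispersion parameter). the function name `bic_from_aic` is hypothetical.

```python
import math


def bic_from_aic(aic, n_params, n_obs):
    """bic implied by a given aic: bic = aic + k * (ln(n) - 2)."""
    return aic + n_params * (math.log(n_obs) - 2.0)


N_OBS = 163  # number of countries in the data

# (reported aic, assumed number of estimated parameters)
MODELS = {
    "poisson": (16476, 8),
    "negbin 1": (2317, 9),
    "negbin 2": (2319, 9),
}

if __name__ == "__main__":
    for name, (aic, k) in MODELS.items():
        print(name, round(bic_from_aic(aic, k, N_OBS)))
    # prints 16501, 2345 and 2347 for poisson, negbin 1 and negbin 2
```

under these assumptions the implied bic values agree with the values 16501, 2345, and 2347 quoted in the text for poisson, negbin 1, and negbin 2.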
the aic value for negbin 1 was slightly smaller than the one for negbin 2. this indicated that negbin 1 was a better fit than poisson regression and negbin 2. on the other hand, the bic values for the three regressions were 16501, 2345, and 2347, respectively, for poisson, negbin 1, and negbin 2. the bic value for poisson regression was much higher than those of the negative binomial regressions. thus, with the lowest aic and bic values, negbin 1 was the better approach for the mortality rate data, since it can explain more variation with the same number of independent variables.

table 5. aic and bic values for the different regression models

| regression | aic | bic |
|---|---|---|
| poisson | 16476 | 16501 |
| negbin 1 | 2317 | 2345 |
| negbin 2 | 2319 | 2347 |

conclusions

the analysis was conducted to compare the performance of three models: poisson regression, negbin 1, and negbin 2. negbin 1 was found to be the most appropriate model for these overdispersed data. the mean and the variance were calculated to confirm that the data are overdispersed. since the data were overdispersed, the deviance and pearson's chi-square showed that the negative binomial was a better model for the data. the aic and bic then showed that negbin 1 is the best model, followed by negbin 2 and poisson regression.

references

[1] tutz, g., regression for categorical data, cambridge university press, new york, 2012.
[2] kumpusalo, e., lakka, h.n., laaksomen, d.e., lakka, t.a., niskanen, l.k., salonen, j.t., and tuomilehto, j., "the metabolic syndrome and total and cardiovascular disease mortality in middle-age men", journal of american medical association, vol. 288, no. 3, pp. 2708-2716, 2002.
[3] gayle, h.d. and hill, g.l., "global impact of human immunodeficiency virus and aids", clinical microbiology reviews, vol. 14, no. 2, pp. 327-335, 2001.
[4] breman, j.g., jamison, d.t., and measam, a.r., disease control priorities in developing countries, 2nd edition, world bank, washington (dc), 2006.
[5] lanata, c.f., fischer-walker, c.l., olascoaga, a.c., torres, c.x., aryee, m.j., and black, r.e., "global causes of diarrheal disease mortality in children <5 years of age: a systematic review", plos one, vol. 8, no. 9, pp. 1-11, 2013.
[6] bassett, l. and levinson, f.j., malnutrition is still a major contributor to child death, population reference bureau, boston, 2007.
[7] black, r.e., hyder, a., sacco, l., and rice, a.l., malnutrition as an underlying cause of childhood deaths associated with infectious diseases in developing countries, bulletin of the world health organization, world health organization, 2000.
[8] gopalakrishnan, s., "a public health perspective of road traffic accidents", journal of family medicine and primary care, vol. 1, no. 2, pp. 144-150, 2012.
[9] wasserman, d., cheng, q., and jiang, g.x., "global suicide rates among young people age 15-19", world psychiatry, vol. 4, no. 2, pp. 114-120, 2005.
[10] sheil, d., alder, d., and burshem, d., "the interpretation and misinterpretation of mortality rate measures", journal of ecology, vol. 83, no. 2, pp. 331-333, 1995.
[11] koontz, d., life expectancy, http://www.worldlifeexpectancy.com/life-expectancyresearch (accessed 12 august 2021), 2021.
[12] özmen, i. and fayome, f., "count regressions model with an application to zoological data containing structural zero", journal of data science, vol. 5, no. 4, pp. 491-502, 2007.
[13] ismail, n. and zamani, h., "estimation of claim count data using negative binomial, generalized poisson, zero-inflated negative binomial and zero-inflated generalized poisson model", casualty actuarial society e-forum, vol. 41, no. 20, pp. 1-28, 2013.
[14] greene, w.
functional form and heterogeneity in models for count data, now publisher inc, 2008.
[15] gourieroux, c., monfort, a., and trognon, a., "pseudo maximum likelihood methods", econometrica, vol. 52, no. 3, pp. 681-700, 1984.
[16] long, j.s. and freese, j., regression models for categorical dependent variables using stata, second edition, stata press, college station, tx, 2006.
[17] zwilling, m.l., "negative binomial regression", the mathematical journal, vol. 15, no. 1, pp. 1-18, 2013.
[18] cameron, c.a. and trivedi, p.k., regression analysis of count data, cambridge university press, cambridge, 2013.
[19] berk, r. and macdonald, j., "overdispersion and poisson regression", journal of quantitative criminology, vol. 24, no. 3, pp. 269-284.
[20] turner, h., introduction to generalized linear model, esrc national centre for research method, 2008.
[21] yan, j., guszcza, j., flynn, m., and wu, c.s., "applications of the offset in property-casualty predictive modeling", casualty actuarial society e-forum, vol. 1, no. 1, pp. 366-385.

the generalized star modeling with heteroscedastic effects
cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 158-172
p-issn: 2086-0382; e-issn: 2477-3344
submitted: august 05, 2021; reviewed: august 20, 2021; accepted: december 20, 2021
doi: http://dx.doi.org/10.18860/ca.v7i1.13097

utriweni mukhaiyar1,*, syahri ramadhani2
1statistics research division, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia
2undergraduate programme in mathematics, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia
*corresponding author email: utriweni@math.itb.ac.id

abstract

most generalized space time autoregressive (gstar) models assume a constant error variance. in fact, there are many space-time observations whose variability changes over time.
in this study, a gstar model is built with an error variance that is not constant, that is, with a heteroscedasticity effect, by combining gstar with autoregressive conditional heteroscedasticity (arch). the parameters of the gstar–arch model are estimated using the generalized least squares (gls) method to obtain efficient parameter estimates. as a case study, the gstar–arch model is applied to the daily mean wind speed data of new orleans, florida, and mississippi in order to predict the occurrence of hurricane katrina in 2005. the results show that including heteroscedasticity in gstar modeling gives better predictions than the homoscedastic approach. furthermore, the higher the model order, the better the gstar model performs, as shown by the smallest mean squared error (mse) and mean absolute percentage error (mape). the gstar(3;0,0,1)–arch(1) model predicts hurricane katrina better than the gstar(3;0,0,1) and gstar(1;1)–arch(1) models.

keywords: gstar; arch; conditional variance; generalized least squares; heteroskedasticity

introduction

a hurricane is a natural phenomenon in the form of wind gusts with a speed exceeding 119 km/hour. hurricanes are a type of tropical cyclone that usually forms over warm sea surfaces around the equator. the wind speed at one location is influenced by the wind speed at that location at previous times and also by the average wind speed at other locations. this means that wind speed can be modeled with space-time models such as starma(p,q), star(p), stma(q), gstar(p,q), and starmag(p,q). the model used in this study is the gstar model, which uses a weight matrix chosen according to the location conditions.
the arrival of a storm, which is usually predicted by weather satellites, is here predicted using the gstar model. by using the daily average wind speed data from the year before the storm, it is hoped that the arrival of the storm can be predicted earlier, reducing the loss of both property and life. however, the large increase in wind speed when a storm occurs gives the data a heteroscedastic effect, so that the errors generated by the gstar model have a non-constant variance. this makes the estimation of the initial model parameters no longer efficient, so a model that explains the non-constant error variance, namely the arch model, should be developed. the arch model is used to model the error variance, so that the heteroscedastic elements can be eliminated and more efficient parameter estimates can be produced.

the development of the gstar model in indonesia has been very fast, both in theory and in application. theoretical work includes the stationarity properties of the process using the inverse autocovariance matrix [1] as well as the kernel approach [2, 3] and the minimum spanning tree approach [4], gstar with correlated errors [5, 6], gstar with outliers [7], gstar for discrete data [8], and the invertibility of the kernel gstar model [9]. the gstar model has been applied to economic data [10], tea plantations [11], palm oil production [12], red chili commodity prices [13], the number of dengue fever cases [14], predictions of robbery cases in medan, north sumatra [15], the spread of covid-19 cases in java [16], and the vertical distribution of copper and gold grades [17]. on the other hand, the arch model, which accommodates heteroscedasticity and exogenous variables that make a process highly volatile, has been widely developed for economic problems.
the impact of covid-19 as an exogenous factor on the economic sector was explored by [18]. sometimes an exogenous factor causes a change point, which should be detected [19]. however, the arch model has also been applied to predict electric current [20], caterpillar pests in oil palm plantations [21], and rice prices [22]. the development of the gstar model with an error variance that accounts for the heteroscedasticity effect has been investigated by [23] for the gstar(1,1) model, with application to stock prices. the construction of a gstar model with an arch effect, with parameters estimated by the maximum likelihood approach, was explored by [24]. in this study, the gstar–arch model is developed by estimating the parameters using the generalized least squares (gls) approach.

methods

generalized star model

the gstar model is a generalization of the star model in which the model parameters, initially considered homogeneous across locations, can differ for each location. an observation at location $i$ at time $t$ is expressed as $Z_{i,t}$. if the observations at different locations are related, they can be modeled using the gstar model. the general form of gstar is,

$$\mathbf{Z}_t = \sum_{j=1}^{k}\sum_{\ell=0}^{\lambda_j}\boldsymbol{\Phi}_{j\ell}\mathbf{W}^{(\ell)}\mathbf{Z}_{t-j} + \mathbf{e}_t$$

where $\mathbf{Z}_t = (Z_{1,t}, Z_{2,t}, \dots, Z_{N,t})'$ is an $N$-dimensional column vector, $\mathbf{W}^{(\ell)}$ is the $N \times N$ weight matrix for spatial lag $\ell$, $\boldsymbol{\Phi}_{j\ell}$ is the $N \times N$ matrix of autoregressive parameters for spatial lag $\ell$ and time lag $j$, $\lambda_j$ is the spatial order at time lag $j$, and $\mathbf{e}_t$ is the $N$-dimensional vector of errors corresponding to the observations in $\mathbf{Z}_t$. for the homoscedastic gstar, $\mathbf{e}_t$ is a white noise vector whose mean and variance are constant and which follows a multivariate normal distribution; in the heteroscedastic case, the variance is not constant. the gstar modeling stage follows the box-jenkins iteration [1], consisting of model identification, parameter estimation, and diagnostic checking.
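to make the general form concrete, the sketch below evaluates the mean part of its simplest special case, one time lag and one spatial lag, for three locations. it is a minimal illustration under stated assumptions and not the authors' r code: the function name `gstar11_step`, the parameter values, and the observations are hypothetical, and the parameter matrices are assumed to be diagonal, so that each location has its own pair of coefficients.

```python
def gstar11_step(z_prev, phi0, phi1, weights):
    """one-step mean of a gstar model with time order 1 and spatial order 1.

    z_prev  : observations at time t-1, one value per location
    phi0    : per-location coefficients on the location's own past value
    phi1    : per-location coefficients on the weighted neighbour average
    weights : spatial weight matrix (zero diagonal, rows summing to one)
    """
    n = len(z_prev)
    z_next = []
    for i in range(n):
        # spatially lagged value: weighted average of the other locations
        neighbours = sum(weights[i][j] * z_prev[j] for j in range(n))
        z_next.append(phi0[i] * z_prev[i] + phi1[i] * neighbours)
    return z_next


if __name__ == "__main__":
    # uniform weights for three locations (hypothetical example values)
    w_uniform = [[0.0, 0.5, 0.5],
                 [0.5, 0.0, 0.5],
                 [0.5, 0.5, 0.0]]
    print(gstar11_step([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [0.2, 0.2, 0.2], w_uniform))
```

adding an error vector to the returned values would give a simulated observation; higher time orders repeat the same computation for each additional lag.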
before gstar modeling, the data must be stationary. if they are not stationary, differencing should be applied until the data are stationary. the parameters of the gstar model can be estimated using the ordinary least squares (ols) method by writing the gstar model in the linear form $\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}$, so that the ols estimator is $\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y}$ [1].

arch(1) model

consider a process $\{Y_t\}$ that follows an ar(p) model, written as:

$$Y_t = \phi_0 + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + e_t$$

where $e_t$ are uncorrelated errors whose variance is not constant but depends on time. based on engle (1982), the error $e_t$ can be expressed as,

$$e_t = a_t\sigma_t$$ (1)

where $a_t$ is an independent and identically distributed standard normal random sample, and

$$\sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2 + \dots + \alpha_p e_{t-p}^2$$ (2)

if the errors are known up to time $(t-1)$, the conditional variance of $e_t$ is:

$$Var_{t-1}(e_t) = E_{t-1}[e_t^2] = E[e_t^2 \mid e_{t-1}^2, e_{t-2}^2, \dots] = \sigma_t^2$$

from eq. (2), the conditional variance of $e_t$ depends on the squares of the past errors and is not constant. this is called the arch(p) model [25]. the simplest form of the arch(p) model, and the one used in this study, is the arch(1) model. in this model, the error variance at time $t$ is affected by the square of the error at the previous time lag. the arch(1) model is formulated as:

$$e_t = a_t\sigma_t \quad \text{and} \quad \sigma_t^2 = \alpha_0 + \alpha_1 e_{t-1}^2$$

where $\alpha_0$ and $\alpha_1$ are the non-negative parameters of the arch(1) model. the variance of the arch(1) errors is,

$$Var(e_t) = E[e_t^2] - (E[e_t])^2 = E[e_t^2] = E[a_t^2]\,E[\sigma_t^2] = E[\alpha_0 + \alpha_1 e_{t-1}^2] = \alpha_0 + \alpha_1 E[e_{t-1}^2] = \alpha_0 + \alpha_1 Var(e_t)$$

where the last step uses stationarity, $Var(e_{t-1}) = Var(e_t)$. it is then obtained that,

$$Var(e_t) = \frac{\alpha_0}{1-\alpha_1}$$ (3)

since the variance is positive, eq. (3) gives $\alpha_0 > 0$ and $0 \le \alpha_1 < 1$ as the stationarity condition of the arch(1) model.

gstar(1;1)–arch(1) model

consider 𝒁𝑡 = (𝑍1,𝑡 , 𝑍2,𝑡 , . . .
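, 𝑍𝑁,𝑡 )′ as a vector of observations at $N$ locations at time $t$. it can be modeled as gstar(1;1)–arch(1) if it can be expressed as:

$$\mathbf{Z}_t = (\boldsymbol{\Phi}_0 + \boldsymbol{\Phi}_1\mathbf{W})\mathbf{Z}_{t-1} + \mathbf{e}_t$$ (4)

where $\mathbf{e}_t \sim N(\mathbf{0}, \boldsymbol{\Omega}_t)$ is the vector of errors, which follows a normal distribution with zero mean and a variance that is not constant over time. the covariance matrix $\boldsymbol{\Omega}_t$ is defined as $\boldsymbol{\Omega}_t = \mathrm{diag}(h_{1,t}, h_{2,t}, \dots, h_{N,t})$, where $h_{i,t}$ is the error variance at location $i$ at time $t$, which can be modeled as arch(1), that is

$$h_{i,t} = \alpha_{0i} + \alpha_{1i}e_{i,t-1}^2$$

where $\alpha_{ki}$ is the model parameter for location $i$ and $k = 0, 1$. the assumption used in this model is that the errors at different locations are uncorrelated with each other, so that the error variance at location $i$ at time $t$ is affected by the square of the error at that location at time $(t-1)$ but not by the errors at other locations. meanwhile, the observation at location $i$ is influenced by the observations at that location and at the neighboring locations.

the method used to estimate this model is the generalized least squares (gls) method. suppose that the error covariance matrix $\boldsymbol{\Omega}$ has eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_{NT}$. by the spectral (eigenvalue) decomposition, it can be written as $\boldsymbol{\Omega} = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}'$, where $\boldsymbol{\Lambda} = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_{NT})$ is a diagonal matrix and $\mathbf{P}$ is an orthogonal matrix. consider the matrix $\mathbf{C} = \boldsymbol{\Lambda}^{-1/2}\mathbf{P}'$, with $\boldsymbol{\Lambda}^{1/2} = \mathrm{diag}(\sqrt{\lambda_1}, \sqrt{\lambda_2}, \dots, \sqrt{\lambda_{NT}})$. thus,

$$\boldsymbol{\Omega}^{-1} = \mathbf{P}\boldsymbol{\Lambda}^{-1}\mathbf{P}' = \mathbf{P}\boldsymbol{\Lambda}^{-1/2}\boldsymbol{\Lambda}^{-1/2}\mathbf{P}' = \mathbf{C}'\mathbf{C}.$$

let the transformed linear model $\mathbf{Y}^* = \mathbf{X}^*\boldsymbol{\beta} + \mathbf{e}^*$ be defined with $\mathbf{Y}^* = \mathbf{C}\mathbf{Y}$, $\mathbf{X}^* = \mathbf{C}\mathbf{X}$, and $\mathbf{e}^* = \mathbf{C}\mathbf{e}$, so that $Var(\mathbf{e}^*) = \mathbf{C}\boldsymbol{\Omega}\mathbf{C}' = \mathbf{I}$. applying ols to the transformed model, the unbiased estimator of the gstar–arch model parameters is:

$$\hat{\boldsymbol{\beta}}_{\mathrm{GLS}} = (\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{X})^{-1}\mathbf{X}'\boldsymbol{\Omega}^{-1}\mathbf{Y}$$ (5)

where $\boldsymbol{\Omega} = \mathrm{diag}(h_1(1), \dots, h_1(T), \dots, h_N(1), \dots, h_N(T))$ is an $NT$-dimensional diagonal matrix [26]. the stages of gstar–arch modeling are illustrated in the flowchart in fig. 1.

figure 1. flowchart of the gstar-arch modeling stage. modeling is carried out to obtain a homoscedastic error.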
the equation for the mean is modeled by the gstar model while the variance is modeled by the arch model.

(flowchart steps: space-time data; data plot; stationarity check, with differencing if the data are not stationary; gstar model identification; parameter estimation; arch effect test on the errors; if there is no arch effect, the gstar model is used; if there is an arch effect, arch model identification, arch parameter estimation, gstar–arch parameter estimation, and a diagnostic test for homoscedastic errors; short-time forecasting; finish.)

results and discussion

as a case study, the data used are the average daily wind speeds in three states of the united states ($N = 3$) from september 1, 2004 to august 24, 2005 ($T = 358$), obtained from the national oceanic and atmospheric administration (noaa) of the united states. geographically, the southern united states borders warm tropical waters, so some areas of the country are often hit by storms. hurricane katrina, which occurred in 2005, was one of the deadliest hurricanes. according to the united states department of oceanic and atmospheric research, the total loss caused by hurricane katrina was over a thousand million dollars and more than 1,800 people died. the locations of interest are new orleans (louisiana), florida, and mississippi, each of which can be seen in fig. 2. the modeling is carried out with the help of the r application.

the data are modeled with a space-time model and must first be stationary. stationarity can be assessed from the time series plot of the observations at each location, as in fig. 3(a). from the figure, it can be seen that several wind speed values are higher than the other observations. in addition, there is a slight falling and rising pattern, which indicates that the data are not stationary in the mean. therefore, the data are differenced first. the series plot after differencing once can be seen in fig. 3(b).
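the preprocessing described here, first differencing followed by centering to a zero mean, can be sketched for a single location as follows. this is a minimal python illustration with a hypothetical function name and toy values; the study itself used r, and the same operation would be applied to each of the three location series.

```python
def difference_and_center(series):
    """first-difference a series, then subtract the mean of the differences."""
    diffs = [later - earlier for earlier, later in zip(series, series[1:])]
    mean = sum(diffs) / len(diffs)
    return [d - mean for d in diffs]


if __name__ == "__main__":
    # toy wind speed values (hypothetical), one location
    print(difference_and_center([1.0, 3.0, 6.0, 10.0]))
```

differencing shortens the series by one observation, so a series of length t yields t - 1 centered differences.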
The stationary data are then centered so that they have zero mean (centralized process). The process variability, which occasionally increases, indicates that the variance is not constant. This will be accommodated in the modeling through the heteroscedastic effect.

In GSTAR modeling, one of the important elements that characterizes the relationship between locations is the weight matrix. The weight matrix has entries $w_{ij}$, which represent the weight of location $j$ with respect to location $i$. This matrix has zero entries on the main diagonal, and the weights in each row sum to one. In this study, the weight matrices used are the uniform and binary weights, and only one spatial lag is used. The uniform and binary weight matrices used are, respectively,

$$\mathbf{W}^{(1)} = \begin{pmatrix} 0 & 0.5 & 0.5 \\ 0.5 & 0 & 0.5 \\ 0.5 & 0.5 & 0 \end{pmatrix} \quad \text{and} \quad \mathbf{W}^{(1)} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}.$$

Figure 2. Map of the United States and the states where the observations are located (source: https://greatpersie.wordpress.com). New Orleans (Louisiana), Florida, and Mississippi are referred to as location 1, location 2, and location 3, respectively.

Figure 3. Series plots for each location, (a) before differencing and (b) after first-order differencing. After differencing, the data are centered and show a more stationary pattern in the mean.

The first stage in the modeling is model identification with the help of the space-time ACF and PACF plots, called STACF and STPACF. However, because the model has been determined at the outset, namely the GSTAR(1;1) model, the model identification stage can be skipped. The STACF and STPACF plots are used to see whether there is a relationship across time and location in the daily average wind speed data. The plots of STACF and STPACF can be seen in Fig. 4. From Fig.
4 it can be seen that the data have temporal and spatial dependence, although the GSTAR(1;1) model is not very appropriate for these data. From the STPACF plot, the best candidate model for the data is GSTAR(3;0,0,1). Thus, the models considered are GSTAR(1;1) and GSTAR(3;0,0,1). First, the parameter estimates of the GSTAR(1;1) model obtained using the ordinary least squares (OLS) method are given in Table 1.

Table 1. OLS parameter estimates of the GSTAR(1;1) model

| Parameter | $\phi_{01}$ | $\phi_{11}$ | $\phi_{02}$ | $\phi_{12}$ | $\phi_{03}$ | $\phi_{13}$ |
|---|---|---|---|---|---|---|
| Estimate | -0.15 | 0.08 | -0.08 | -0.04 | -0.28 | 0.20 |

The next step is to test for the presence of an ARCH effect in the errors at each location. The existence of an ARCH effect can be detected from the plot of the squared errors at each location, shown in Fig. 5.

Figure 4. STACF (left) and STPACF (right) plots based on the uniform weight matrix. The STACF has more patterned values than the STPACF; thus, an autoregressive model is more appropriate.

Figure 5. Squared errors of the GSTAR(1;1) model. The error variance is not constant, indicating the presence of an ARCH effect.

To confirm the existence of the ARCH effect, the ARCH-LM test is used (Engle, 1982). The presence of an ARCH effect in the errors is indicated by a p-value smaller than the significance level $\alpha$, with $1\% \le \alpha \le 10\%$. The ARCH-LM test results for the first six time lags in Table 2 show that the p-values are small for almost all locations and time lags. It can therefore be concluded that there is an ARCH effect in the GSTAR(1;1) model, which means that the error variance is not constant over time. A slight difference is found at location 2, Florida.
The p-values obtained there are less than 9.2% up to the third time lag, which indicates that the wind speed in this area tends to be more constant in mean and variance than at the other two observation locations. At time lags greater than three, the wind speed in Florida does not show any ARCH effect. However, the presence of heteroscedasticity at the two other locations, and in Florida for the first three time lags, is the reason to account for the non-constant variance in this case.

Table 2. P-values of the ARCH-LM test for the existence of a heteroscedastic effect

| Time lag | New Orleans ($\times 10^{-4}$) | Florida | Mississippi ($\times 10^{-2}$) |
|---|---|---|---|
| 1 | 0.0019 | 0.0304 | 0.0300 |
| 2 | 0.0085 | 0.0731 | 0.0980 |
| 3 | 0.0067 | 0.0917 | 0.2000 |
| 4 | 0.0224 | 0.1550 | 0.4300 |
| 5 | 0.0666 | 0.2170 | 0.8000 |
| 6 | 0.1490 | 0.2950 | 0.9300 |

Next, the non-constant error variance of the GSTAR(1;1) model is modeled by ARCH(1). The parameter estimates of the ARCH(1) model obtained using the maximum likelihood (ML) method are given in Table 3.

Table 3. ML parameter estimates of the ARCH(1) model. The $i$-th parameter of location $j$ is denoted by $\alpha_{ij}$ for $i = 0, 1$ and $j = 1, 2, 3$.

| Parameter | $\alpha_{01}$ | $\alpha_{11}$ | $\alpha_{02}$ | $\alpha_{12}$ | $\alpha_{03}$ | $\alpha_{13}$ |
|---|---|---|---|---|---|---|
| Estimate | 1.01 | 0.25 | 0.66 | 0.18 | 0.60 | 0.30 |

Thus, the error variances for each location are obtained as

$$\sigma_{1,t}^{2} = 1.01 + 0.25\, e_{1,t-1}^{2}, \qquad \sigma_{2,t}^{2} = 0.66 + 0.18\, e_{2,t-1}^{2}, \qquad \sigma_{3,t}^{2} = 0.60 + 0.30\, e_{3,t-1}^{2}.$$

Furthermore, the error variances of each location for $t = 1, 2, \ldots, T$ form the entries of the $NT$-dimensional diagonal matrix $\boldsymbol{\Omega}$, with $\boldsymbol{\Omega}(t) = \mathrm{diag}(\sigma_{1,t}^{2}, \sigma_{2,t}^{2}, \sigma_{3,t}^{2})$ at time $t$. This matrix is used for parameter estimation of the GSTAR(1;1)–ARCH(1) model using the GLS method. The estimated parameters are presented in Table 4.

Table 4.
GLS parameter estimates of the GSTAR(1;1) model with the error variance following the ARCH(1) model

| Parameter | $\phi_{01}$ | $\phi_{11}$ | $\phi_{02}$ | $\phi_{12}$ | $\phi_{03}$ | $\phi_{13}$ |
|---|---|---|---|---|---|---|
| Estimate | -0.11 | 0.08 | -0.05 | -0.05 | -0.27 | 0.22 |

The parameter estimates obtained using the GLS method in Table 4 are not much different from those obtained using the OLS method in Table 1. This is probably because the ARCH(1) model is not the right model for the error variance of the GSTAR(1;1) model. For comparison, the modeling was carried out again with the same steps using a binary weight matrix and a weight matrix based on wind direction. The best model is determined by comparing the mean squared error (MSE) of each model. From Table 5 it can be concluded that the uniform weight matrix in the GSTAR(1;1)–ARCH(1) modeling is slightly better than the binary weight matrix. Because only three locations are involved, the uniform and binary weight matrices have quite similar compositions.

Table 5. MSE values of the GSTAR(1;1)–ARCH(1) model with two types of weight matrix. The result using the uniform weight matrix is slightly better than the binary one, although the difference is not significant.

| Weight matrix | MSE |
|---|---|
| Uniform | 0.975 |
| Binary | 0.978 |

The comparison between the original data and the data estimated using the GSTAR(1;1)–ARCH(1) model can be seen in Fig. 6. The figure shows a large difference between the original data and the estimates. This is because, as the STPACF plot in Fig. 4 indicates, the GSTAR(1;1) model is not appropriate for the data. From the STPACF plot, the next candidate space-time model is GSTAR(3;0,0,1), with the error variance again modeled by the ARCH(1) model. This model states that the condition of location $i$ at time $t$ is influenced by its own condition at times $(t-1)$, $(t-2)$, and $(t-3)$, and by the conditions of its closest neighboring locations at time $(t-3)$.
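To make the estimation steps concrete, the following is a minimal Python sketch (the paper itself used R) of OLS estimation, an ARCH(1) variance fit, and GLS re-estimation for a GSTAR(1;1) model. It is an illustration under stated assumptions, not the author's implementation: all function names are hypothetical, the ARCH(1) parameters come from a least-squares regression of squared residuals on their first lag as a simple stand-in for the maximum likelihood fit used in the paper, and the variance of the first residual is set to the mean squared residual. Because $\boldsymbol{\Omega}$ is diagonal and each location has its own parameters, the GLS problem separates into one weighted least-squares fit per location.

```python
def gstar11_regressors(Z, W, i):
    """Targets and regressors for location i of a GSTAR(1;1) model.

    Z is a list of T observation vectors (stationary and centered);
    W is an N x N weight matrix with zero diagonal and rows summing to one.
    """
    y, X = [], []
    for t in range(1, len(Z)):
        own = Z[t - 1][i]                                           # Z_i(t-1)
        neigh = sum(W[i][j] * Z[t - 1][j] for j in range(len(W)))   # sum_j w_ij Z_j(t-1)
        y.append(Z[t][i])
        X.append((own, neigh))
    return y, X


def weighted_ls(y, X, h=None):
    """Solve the 2x2 normal equations: OLS if h is None, else GLS with Omega = diag(h)."""
    w = [1.0] * len(y) if h is None else [1.0 / v for v in h]
    s11 = sum(wi * a * a for wi, (a, b) in zip(w, X))
    s12 = sum(wi * a * b for wi, (a, b) in zip(w, X))
    s22 = sum(wi * b * b for wi, (a, b) in zip(w, X))
    r1 = sum(wi * a * yi for wi, (a, b), yi in zip(w, X, y))
    r2 = sum(wi * b * yi for wi, (a, b), yi in zip(w, X, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)


def arch1_variances(e):
    """Fitted h_t = a0 + a1 * e_{t-1}^2 (least-squares stand-in for ML)."""
    e2 = [v * v for v in e]
    x, z = e2[:-1], e2[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx
    a1 = min(max(slope, 0.0), 0.99)      # keep 0 <= a1 < 1 so the variances stay positive
    a0 = mz - a1 * mx
    mean_e2 = sum(e2) / len(e2)
    h = [mean_e2] + [a0 + a1 * v for v in x]   # first residual has no lagged error
    return h, (a0, a1)


def fit_gstar11_arch1(Z, W):
    """Return, per location, the (OLS, GLS) estimates of (phi_0i, phi_1i)."""
    results = []
    for i in range(len(W)):
        y, X = gstar11_regressors(Z, W, i)
        b_ols = weighted_ls(y, X)
        resid = [yi - b_ols[0] * a - b_ols[1] * b for yi, (a, b) in zip(y, X)]
        h, _ = arch1_variances(resid)
        results.append((b_ols, weighted_ls(y, X, h)))
    return results
```

When the fitted ARCH coefficient is close to zero, the weights are nearly constant and the GLS estimates coincide with the OLS estimates, which is one way to read the small differences between Table 1 and Table 4.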
The modeling follows the same steps as for GSTAR(1;1)–ARCH(1), until the parameters are obtained both before and after the ARCH component is included. The parameter estimates can be seen in Table 6. The estimates obtained from the GSTAR(3;0,0,1) model are not much different from those of the GSTAR(3;0,0,1)–ARCH(1) model, so the fitted values will also not differ much.

Figure 6. Plot of the original series (red) and the estimates (blue, green) using the GSTAR(1;1)–ARCH(1) model. The large errors indicate that the GSTAR(1;1) model is not suitable.

Table 6. Comparison of GSTAR(3;0,0,1) parameter estimates before and after the error variance is modeled by ARCH(1). In general, there is little difference between the two models.

| Parameter | Before | After |
|---|---|---|
| $\phi_{10}^{(1)}$ | -0.24 | -0.21 |
| $\phi_{20}^{(1)}$ | -0.40 | -0.36 |
| $\phi_{30}^{(1)}$ | -0.15 | -0.13 |
| $\phi_{31}^{(1)}$ | -0.07 | -0.09 |
| $\phi_{10}^{(2)}$ | -0.17 | -0.13 |
| $\phi_{20}^{(2)}$ | -0.32 | -0.33 |
| $\phi_{30}^{(2)}$ | -0.27 | -0.24 |
| $\phi_{31}^{(2)}$ | -0.02 | -0.03 |
| $\phi_{10}^{(3)}$ | -0.31 | -0.30 |
| $\phi_{20}^{(3)}$ | -0.37 | -0.34 |
| $\phi_{30}^{(3)}$ | -0.13 | -0.10 |
| $\phi_{31}^{(3)}$ | -0.07 | -0.03 |

The comparison of the original data and the data estimated using the GSTAR(3;0,0,1)–ARCH(1) model can be seen in Fig. 7. Comparing the plots in Fig. 6 and Fig. 7 shows that the estimates generated by this model are better than those of the GSTAR(1;1)–ARCH(1) model. The GSTAR(3;0,0,1)–ARCH(1) model can capture the pattern of process variability. However, this model is still not the best model for the data. Fig. 7 shows that the estimates cannot reach the very high or very low values of the original data. This may be because ARCH(1) is an inaccurate choice of model for the error variance at each location.

Figure 7.
Plot of the original series (red) and the estimates (green) using the GSTAR(3;0,0,1)–ARCH(1) model

The MSE value for the GSTAR(3;0,0,1)–ARCH(1) model is 0.875. This value is smaller than the MSE of the GSTAR(1;1)–ARCH(1) model, so the model used for short-term prediction is the GSTAR(3;0,0,1)–ARCH(1) model. To confirm the selection of the GSTAR(3;0,0,1)–ARCH(1) model, a comparison was made with the GSTAR(3;0,0,1) model without the heteroscedasticity effect. Table 7 shows the comparison of the MSE, MAD, and MAPE values for the two models. Although those values are not significantly different, the model used for short-term prediction is the GSTAR(3;0,0,1)–ARCH(1) model, since it is slightly better.

Table 7. Comparison of the GSTAR(3;0,0,1) and GSTAR(3;0,0,1)–ARCH(1) models

| | GSTAR(3;0,0,1) | GSTAR(3;0,0,1)–ARCH(1) |
|---|---|---|
| MSE | 0.88 | 0.86 |
| MAD | 0.70 | 0.70 |
| MAPE | 7.48 | 6.86 |

From Table 8, it can be concluded that the GSTAR(3;0,0,1)–ARCH(1) model is not very good at estimating changes in the daily average wind speed that are too large. In a relatively short period, August 25–29, 2005, there was an increase in the daily average wind speed. If this increase is assumed to be the beginning of Hurricane Katrina, then Hurricane Katrina is expected to hit New Orleans and Mississippi on September 1, 2005, which is three days later than the actual time of landfall in New Orleans and Mississippi.

Table 8. Comparison of the real data and their estimates using the GSTAR(3;0,0,1)–ARCH(1) model

| Year 2005 | New Orleans (est.) | New Orleans (real) | Florida (est.) | Florida (real) | Mississippi (est.) | Mississippi (real) |
|---|---|---|---|---|---|---|
| 25 Aug | 6.55 | 6.26 | 15.09 | 15.43 | 2.67 | 4.25 |
| 26 Aug | 7.05 | 7.61 | 15.73 | 12.08 | 3.15 | 5.59 |
| 27 Aug | 7.06 | 10.96 | 17.04 | 9.17 | 3.54 | 6.49 |
| 28 Aug | 7.04 | 25.05 | 17.53 | 5.59 | 3.82 | 13.87 |
| 29 Aug | 7.024 | 47.65 | 16.90 | 2.68 | 3.43 | 28.19 |

Conclusions

The daily average wind speed data in New Orleans, Florida, and Mississippi in the United States are influenced not only by the wind speed on previous days at the same location; the influence of the wind speed in neighboring states cannot be ignored either. Through the modeling above, it was found that the GSTAR(3;0,0,1)–ARCH(1) model is a better model than the GSTAR(3;0,0,1) model. This means that the wind speed of the closest neighbors three days earlier influences today's wind speed at the reference location. This model captures the pattern of wind speed volatility better than the other models considered. However, the GSTAR(3;0,0,1)–ARCH(1) model is still not good enough to predict wind speeds that are extremely high or low. This may be because ARCH(1) is not the right model for the error variance. Further analysis of error variance models such as GARCH(p,q) is needed so that the error variance can be modeled better. The GSTAR model with ARCH effect can also be applied to weather cases in Indonesia, as well as in other fields of science.

Acknowledgments

This research is supported by the P3MI 2020 and P2MI 2021 research grants of ITB.

References

[1] U. Mukhaiyar and U.S. Pasaribu, A new procedure for generalized STAR modeling using IACM approach, ITB J. Sci. 44(2): 179-192, 2012. https://journals.itb.ac.id/index.php/jmfs/article/view/106
[2] Yundari, U.S. Pasaribu, U. Mukhaiyar, and M.N. Heriawan, Spatial weight determination of GSTAR(1;1) model by using kernel function, Journal of Physics: Conference Series 1028 (012223): 1-8, 2018. https://iopscience.iop.org/article/10.1088/1742-6596/1028/1/012223
[3] Yundari, N.M. Huda, U. Mukhaiyar, U.S. Pasaribu, and K.N. Sari, Stationary process in GSTAR(1;1) through kernel function approach, AIP Conference Proceedings 2268, 020010, 2020. https://aip.scitation.org/doi/10.1063/5.0016808
[4] U. Mukhaiyar, B. I. Bilad, and U. S.
Pasaribu, The generalized STAR modelling with minimum spanning tree approach of weight matrix for COVID-19 case in Java Island, Journal of Physics: Conference Series 2084, 012003, 2021.
[5] D. Masteriana and U. Mukhaiyar, Monte Carlo simulation of error assumptions in generalized STAR(1;1) model, Proceedings of the Jangjeon Mathematical Society 22(1): 43-50, 2019. doi: 10.17777/pjms2019.22.1.43
[6] Yundari, U.S. Pasaribu, and U. Mukhaiyar, Error assumptions on generalized STAR model, Journal of Mathematical and Fundamental Sciences 49(2): 136-155, 2017. https://journals.itb.ac.id/index.php/jmfs/article/view/3285
[7] U. Mukhaiyar, N.M. Huda, K.N. Sari, and U.S. Pasaribu, Analysis of generalized space time autoregressive with exogenous variable (GSTARX) model with outlier factor, Journal of Physics: Conference Series 1496(1), 012004, 2020. doi: 10.1088/1742-6596/1496/1/012004
[8] N.M. Huda, U. Mukhaiyar, and U.S. Pasaribu, The approximation of GSTAR model for discrete cases through INAR model, Journal of Physics: Conference Series 1722(1), 012100, 2021. https://iopscience.iop.org/article/10.1088/1742-6596/1722/1/012100/pdf
[9] Yundari and S.W. Rizki, Invertibility of generalized space-time autoregressive model with random weight, CAUCHY 6(4): 246-259, 2021. doi: 10.18860/ca.v6i4.11254
[10] N. Nurhayati, U.S. Pasaribu, and O. Neswan, Application of generalized STAR model on GDP data in West European countries, J. of Probability and Statistics 2012: 1-16, 2012. https://www.hindawi.com/journals/jps/2012/867056/
[11] U. Mukhaiyar, The goodness of generalized STAR in spatial dependency observations modeling, AIP Conf. Proc. 1692 (020012), 2015. https://aip.scitation.org/doi/abs/10.1063/1.4936436
[12] R.F. Nugraha, S. Setyowati, and U. Mukhaiyar, Prediction of oil palm production using the weighted average of fuzzy sets concept approach, AIP Conf. Proc. 1692 (020008), 2015. https://aip.scitation.org/doi/abs/10.1063/1.4936440
[13] N.F.I. Fadlilah, U. Mukhaiyar, and F. Fahmi, The generalized STAR(1;1) modeling with time correlated errors to red-chili weekly prices of some traditional markets in Bandung, West Java, AIP Conf. Proc. 1692 (020014), 2015. https://iopscience.iop.org/article/10.1088/1742-6596/1722/1/012100
[14] U. Mukhaiyar, N.M. Huda, K.N. Sari, and U.S. Pasaribu, Modeling dengue fever cases by using GSTAR(1;1) model with outlier factor, Journal of Physics: Conference Series 1366(1), 012122, 2019. https://iopscience.iop.org/article/10.1088/1742-6596/1366/1/012122
[15] D. Masteriana, M.I. Riani, and U. Mukhaiyar, Generalized STAR(1;1) model with outlier: case study of begal in Medan, North Sumatera, Journal of Physics: Conference Series 1245(1), 012046, 2019. https://iopscience.iop.org/article/10.1088/1742-6596/1245/1/012046/pdf
[16] U.S. Pasaribu, U. Mukhaiyar, N.M. Huda, K.N. Sari, and S.W. Indratno, Modelling COVID-19 growth cases of provinces in Java Island by modified spatial weight matrix GSTAR through railroad passenger's mobility, Heliyon 7(2), e06025, 2021. https://www.sciencedirect.com/science/article/pii/s2405844021001304
[17] U.S. Pasaribu, U. Mukhaiyar, M.N. Heriawan, and Yundari, Generalized space-time autoregressive modeling of the vertical distribution of copper and gold grades with a porphyry-deposit case study, IJASEIT 11(6), 2021 (to be published).
[18] U. Mukhaiyar, D. Widyanti, and S. Vantika, The time series regression analysis in evaluating the economic impact of COVID-19 cases in Indonesia, Journal of Model Assisted Statistics and Applications 16(3), 2021. https://ieeexplore.ieee.org/document/7005725
[19] S.S. Sholihat, S.W. Indratno, and U. Mukhaiyar, The role of parameters in Bayesian online change point detection: detecting early warning of Mount Merapi eruptions, Heliyon, e07482, 2021. https://doi.org/10.1016/j.heliyon.2021.e07482
[20] U.S. Pasaribu, U. Mukhaiyar, and S. Setiyowati, An ARCH model the electric power of extra high voltage (EHV) transmission substation forecasting in Cawang, Jakarta, Indonesia, The Proceedings of IEEE INAGENTSYS, 1589, 484, 2014.
[21] S. Setiyowati, R.F. Nugraha, and U. Mukhaiyar, Non-stationary time series modeling on caterpillars pest of palm oil for early warning system, AIP Conference Proceedings 1692, 020011, 2015. doi: 10.1063/1.4936439
[22] S. Setiyowati, U.S. Pasaribu, and U. Mukhaiyar, Non-stationary model for rice prices in Bandung, Indonesia, The Proceedings of 2012 IEEE Conference on Control, System & Industrial Informatics (ICA2013), 2013.
[23] S. Borovkova and R. Lopuhaa, Spatial GARCH: a spatial approach to multivariate volatility modeling (November 9, 2012). Available at SSRN: https://ssrn.com/abstract=2176781 or http://dx.doi.org/10.2139/ssrn.2176781
[24] N. Nainggolan and J. Titaley, Development of generalized space time autoregressive (GSTAR) model, AIP Conference Proceedings 1827, 020034, 2017. https://doi.org/10.1063/1.4979450
[25] R.F. Engle, Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation, Econometrica 50(4): 987-1007, 1982. https://www.jstor.org/stable/1912773
[26] S. Ramadhani, Prakiraan kedatangan badai Katrina dengan model ruang waktu GSTAR(p; λ1, λ2, …, λp)-ARCH(1) [Forecasting the arrival of Hurricane Katrina with the GSTAR(p; λ1, λ2, …, λp)-ARCH(1) space-time model], Laporan Tugas Akhir [final project report], Institut Teknologi Bandung, 2016.
Modeling Length of Hospital Stay for Patients with COVID-19 in West Sumatra Using Quantile Regression

CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 7(1) (2021), pages 118-128. p-ISSN: 2086-0382; e-ISSN: 2477-3344. Submitted: July 25, 2021. Reviewed: November 05, 2021. Accepted: November 08, 2021. DOI: https://doi.org/10.18860/ca.v7i1.12995

Ferra Yanuar1, Athifa Salsabila Deva2, Maiyastri3, Hazmira Yozza4, Aidinil Zetra5
1,2,3,4 Mathematics Department, Faculty of Mathematics and Natural Sciences, Universitas Andalas, Padang
5 Political Sciences Department, Faculty of Social and Political Sciences, Universitas Andalas, Padang
Email: ferrayanuar@sci.unand.ac.id, athifasalsabila4300@gmail.com, maiyastri@sci.unand.ac.id, hazmirayozza@sci.unand.ac.id, aidinil@soc.unand.ac.id

Abstract

This study aims to construct a model for the length of hospital stay of patients with COVID-19 using the quantile regression and Bayesian quantile regression approaches. Quantile regression models the relationship at any point of the conditional distribution of the dependent variable given several independent variables. Bayesian quantile regression brings the concept of quantile analysis into the Bayesian approach. In the Bayesian approach, the asymmetric Laplace distribution (ALD) is used to form the likelihood function as the basis for formulating the posterior distribution. All 688 patients with COVID-19 treated at M. Djamil Hospital and Universitas Andalas Hospital in Padang City between March and July 2020 were used in this study.
This study found that the Bayesian quantile regression method results in a narrower 95% confidence interval and a higher pseudo-$R^2$ value than the quantile regression method. It is concluded that the Bayesian quantile regression method tends to yield a better model than the quantile method. Based on the Bayesian quantile regression method, it was found that the length of hospital stay for patients with COVID-19 in West Sumatra was significantly influenced by age, diagnosis, and discharge status.

Keywords: length of hospital stay; Bayesian quantile regression; asymmetric Laplace distribution (ALD)

Introduction

COVID-19 has become a concern of communities worldwide. Among the people infected with COVID-19 in West Sumatra Province, many have recovered, have died, or are undergoing treatment in hospital. People with severe symptoms of COVID-19 must undergo treatment in a hospital [1]. Certain factors influence the length of stay of COVID-19 patients. To identify these factors, the parameters of a regression model are estimated using the quantile regression and Bayesian quantile regression methods. The estimated length of stay of hospitalized COVID-19 patients can be used for specific purposes, such as planning health service activities, assessing the need for health facilities at each level of health care, and preparing decisions related to mitigation scenarios and preparedness for COVID-19 [2]–[4].
If the linear model assumptions are fulfilled, such as no multicollinearity, homoscedasticity, and no autocorrelation, the ordinary least squares (OLS) method is used to estimate the model parameters [5]. In the preliminary analysis, the data on the length of stay of COVID-19 patients in West Sumatra Province were not normally distributed. Therefore, OLS was not efficient for estimating the model parameters. For this reason, the parameters were estimated using quantile regression and Bayesian quantile regression. Quantile regression was chosen because it does not require distributional assumptions such as normality to estimate the parameters; it only requires a large data set. Quantile analysis is merged into the Bayesian framework so that the resulting estimator becomes more effective and natural and can produce a better predictive model that is closer to the actual values [6], [7].

Research on Bayesian quantile regression was initiated by Yu and Moyeed [8]. Research on this topic then developed rapidly, including numerical simulations for estimating the parameters of Bayesian quantile regression using the Gibbs sampling algorithm [9]. The Bayesian quantile regression method has also been applied to binary response data based on the asymmetric Laplace distribution (ALD) [10]. Subsequent research discussed variable selection in quantile regression using Gibbs sampling [7].
Bayesian quantile regression has also been used to estimate the model by approximating the likelihood function [11] and to study posterior inference with the ALD likelihood [12]. Bayesian quantile regression was also applied to model the jeonse deposit in Korea [13]. Oh et al. selected variables in Bayesian quantile regression using the Savage–Dickey density ratio [14]. Furthermore, Bayesian quantile regression was applied to construct a low birth weight model using the Gibbs sampling algorithm [15].

This study aims to construct a model of the length of stay of COVID-19 patients using the quantile and Bayesian quantile regression methods and then to compare the results of the two methods. This case is important to investigate because the number of COVID-19 cases is increasing and, as a result, hospital rooms are becoming full. For this reason, this research is carried out to find out which factors affect the length of stay of COVID-19 patients. This research will provide information on how to shorten the length of stay of COVID-19 patients.

Methods

Material

Huskamp et al. and Kaufman et al. found that mortality is higher in the older population than in the younger population [16], [17]. Yuki et al. recognized that older patients were more susceptible to a longer hospital stay than younger patients [18]. This information implies that age could influence the length of a patient's hospital stay. Many studies have also found that hypertension, diabetes, and coronary artery disease are risk factors for COVID-19 [19]. Gebhard et al. demonstrated that COVID-19 is deadlier for infected men than for infected women [20]. The hypothesized model is constructed based on this literature and then fitted to the data. The data used were 688 COVID-19 patients treated at M.
Djamil Hospital, Padang City, and Andalas University Hospital in March–July 2020. In this study, the variables used are factors that are assumed to affect the length of stay of COVID-19 patients in West Sumatra Province. They are: age ($X_1$); gender ($X_2$), with the categories male and female; diagnosis of COVID-19 ($X_3$), with the categories asymptomatic person (asymp), person under supervision (perus), patient under supervision (paus), and positive; discharge status ($X_4$), with the categories recovered, died, forced discharge, outpatient, and referred to another hospital; and the number of comorbidities ($X_5$). Table 1 presents the frequency distribution of the COVID-19 patients for the categorical independent variables, i.e., gender, diagnosis, and discharge status. Table 1 shows that the most frequent diagnosis is paus (patient under supervision), with 87.8% of all respondents, and that 73.3% of respondents recovered.

Table 1. Frequency distribution of COVID-19 patients for categorical independent variables

| Variable | Category | Frequency | Percentage |
|---|---|---|---|
| Gender ($X_2$) | Male | 347 | 50.4 |
| | Female | 341 | 49.6 |
| Diagnosis ($X_3$) | Asymp | 1 | 0.1 |
| | Perus | 6 | 0.9 |
| | Paus | 604 | 87.8 |
| | Positive | 77 | 11.2 |
| Discharge status ($X_4$) | Recovered | 504 | 73.3 |
| | Died | 141 | 20.5 |
| | Forced discharge | 30 | 4.4 |
| | Outpatient | 4 | 0.6 |
| | Referred to another hospital | 9 | 1.3 |

In Figure 1 below, part (a) shows that the length of stay of COVID-19 patients has a histogram that is skewed to the left, while part (b) shows that some data points do not lie along the straight line. Both plots indicate that the data on the length of stay of COVID-19 patients are not normally distributed.

Figure 1. Data on length of hospital stay: (a) histogram and (b) QQ-plot

Quantile Regression Method

Assume that $\mathbf{y} = (y_1, y_2, \cdots, y_n)'$ is the response variable vector and $\mathbf{x} = (x_1, x_2, \cdots, x_k)'$ is a covariate vector.
In general, the linear regression model for the $\tau$-th quantile, where $0 < \tau < 1$, with $n$ samples and $k$ predictors, is written for $i = 1, 2, \ldots, n$ in the form

$$y_i = \beta_{0\tau} + \beta_{1\tau} x_{i1} + \beta_{2\tau} x_{i2} + \cdots + \beta_{k\tau} x_{ik} + \varepsilon_i, \tag{1}$$

where $\boldsymbol{\beta}(\tau)$ is the parameter vector and $\boldsymbol{\varepsilon}$ is the residual vector. The $\tau$-th conditional quantile function in the quantile regression method is defined as $Q_{y_i}(\tau \mid x_i) = x_i'\boldsymbol{\beta}(\tau)$. The estimate $\hat{\boldsymbol{\beta}}(\tau)$ is then obtained by minimizing [21]

$$\sum_{i=1}^{n} \rho_\tau (y_i - x_i'\boldsymbol{\beta}), \tag{2}$$

where $\rho_\tau(u) = u(\tau - I(u < 0))$ is a loss function, which is equivalent to

$$\rho_\tau(\varepsilon) = \varepsilon\,(\tau I(\varepsilon > 0) - (1 - \tau) I(\varepsilon < 0)), \tag{3}$$

and $I(\cdot)$ is an indicator function that takes the value 1 if its argument is true and 0 otherwise. Equation (2) is minimized using the simplex method of linear programming. However, estimating the parameters with the simplex method is complicated. Therefore, a Bayesian approach is used so that the parameter estimation process becomes easier.

Bayesian Quantile Regression Method

Yu and Moyeed [8] found that minimizing the loss function of quantile regression is equivalent to maximizing the likelihood function formed from data assumed to follow the asymmetric Laplace distribution (ALD). The ALD is used in the likelihood to make the Bayesian estimators more effective and natural. The ALD provides a possible parametric link between the minimization problem in equation (2) and maximum likelihood theory [7]. In addition, the quantile regression loss function is identical to the likelihood function of the ALD [22].

The ALD is a continuous probability distribution. A random variable $\varepsilon$ has an ALD with probability density function [7], [8]

$$f_\tau(\varepsilon) = \tau(1 - \tau)\exp(-\rho_\tau(\varepsilon)), \tag{4}$$

where $0 < \tau < 1$ and $\rho_\tau(\varepsilon)$ is defined in equation (3). The model parameters can be estimated using the Bayesian quantile regression method for any data distribution by assuming the following [8]:

1. $f(y; \mu_i)$ has an ALD.
2. $g(\mu_i) = x_i'\boldsymbol{\beta}(\tau)$.

The observations are given by $\mathbf{y} = (y_1, y_2, \cdots, y_n)$. Based on equation (4), the ALD is used to form the likelihood function, which brings quantile regression into the Bayesian framework for estimating the parameter $\boldsymbol{\beta}$. The ALD has a mixture representation based on the exponential and normal distributions [9]. A random variable $\varepsilon$ can be expressed as

$$\varepsilon = \theta z + p\,u\sqrt{z}, \tag{5}$$

where $\theta = \dfrac{1 - 2\tau}{(1 - \tau)\tau}$ and $p^2 = \dfrac{2}{(1 - \tau)\tau}$. The $\tau$-th quantile regression model can be written as

$$y_i = x_i'\boldsymbol{\beta}_\tau + \sigma\theta z_i + \sigma p\,u_i\sqrt{z_i}, \tag{6}$$

where $z_i \sim \exp(1)$, $u_i \sim N(0, 1)$, $v_i = \sigma z_i$, and $\mathbf{v} = (v_1, v_2, \cdots, v_n)'$. Because $z_i \sim \exp(1)$, it follows that $v_i \sim \exp(\sigma)$ for $i = 1, 2, \cdots, n$. The probability density function of $y_i$ is therefore

$$f(y_i; \boldsymbol{\beta}_\tau, v_i, \sigma) = \frac{1}{p\sqrt{\sigma v_i}\sqrt{2\pi}} \exp\left(-\frac{(y_i - (x_i'\boldsymbol{\beta}_\tau + \theta v_i))^2}{2p^2\sigma v_i}\right), \tag{7}$$

and the likelihood function is obtained as follows:

$$L(\boldsymbol{\beta}_\tau, \mathbf{v}, \sigma) \propto \left(\prod_{i=1}^{n} (\sigma v_i)^{-1/2}\right) \exp\left(-\sum_{i=1}^{n} \frac{(y_i - (x_i'\boldsymbol{\beta}_\tau + \theta v_i))^2}{2p^2\sigma v_i}\right). \tag{8}$$

Then, the prior distributions are chosen as $\boldsymbol{\beta}_\tau \sim N(b_0, \mathbf{B}_0)$, $v_i \sim \exp(\sigma)$, and $\sigma \sim IG(a, b)$. The posterior distributions are obtained as

$$(\boldsymbol{\beta}_\tau \mid \mathbf{v}, \sigma, \mathbf{y}) \sim N\left[\tilde{\mathbf{B}}\left(\mathbf{B}_0^{-1} b_0 + \sum_{i=1}^{n} x_i (p^2\sigma v_i)^{-1} y_i - \sum_{i=1}^{n} x_i (p^2\sigma v_i)^{-1} \theta v_i\right),\ \tilde{\mathbf{B}}\right], \qquad \tilde{\mathbf{B}} = \left(\mathbf{B}_0^{-1} + \sum_{i=1}^{n} x_i (p^2\sigma v_i)^{-1} x_i'\right)^{-1};$$

$$(v_i \mid \boldsymbol{\beta}_\tau, \sigma, \mathbf{y}) \sim GIG\left(\frac{1}{2},\ \frac{(y_i - x_i'\boldsymbol{\beta}_\tau)^2}{p^2\sigma},\ \frac{2}{\sigma} + \frac{\theta^2}{p^2\sigma}\right);$$

$$(\sigma \mid \boldsymbol{\beta}_\tau, \mathbf{v}, \mathbf{y}) \sim IG\left(a + \frac{3n}{2},\ b + \sum_{i=1}^{n} v_i + \sum_{i=1}^{n} \frac{(y_i - (x_i'\boldsymbol{\beta}_\tau + \theta v_i))^2}{2p^2 v_i}\right).$$

These posterior distributions are then used to estimate the posterior mean and posterior variance as point estimates of the unknown parameters, using the Gibbs sampling iteration method [23], [24].
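The check loss in equation (3), the ALD density in equation (4), and the mixture representation in equation (5) can be sketched in a few lines of Python (the paper's analysis was done in R). This is an illustrative sketch only: the function names are hypothetical, `sample_quantile` is a brute-force search over the sample values rather than the simplex method mentioned above, and no Gibbs sampler is implemented here.

```python
import math
import random


def check_loss(u, tau):
    """rho_tau(u) = u * (tau - I(u < 0)), the quantile regression loss."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_pdf(eps, tau):
    """Standard ALD density: tau * (1 - tau) * exp(-rho_tau(eps))."""
    return tau * (1.0 - tau) * math.exp(-check_loss(eps, tau))


def sample_ald(tau, rng):
    """Draw eps = theta * z + p * u * sqrt(z) with z ~ Exp(1) and u ~ N(0, 1)."""
    theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    p = math.sqrt(2.0 / (tau * (1.0 - tau)))
    z = rng.expovariate(1.0)
    u = rng.gauss(0.0, 1.0)
    return theta * z + p * u * math.sqrt(z)


def sample_quantile(values, tau):
    """Return the sample value that minimizes the summed check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))


# Under the standard ALD, the tau-th quantile of eps is zero, so the share of
# non-positive draws from the mixture should be close to tau.
rng = random.Random(1)
draws = [sample_ald(0.25, rng) for _ in range(20000)]
share_non_positive = sum(d <= 0 for d in draws) / len(draws)
```

The same property is what ties the ALD likelihood to quantile regression: maximizing the ALD likelihood in the location parameter minimizes the summed check loss, so the fitted location is the $\tau$-th quantile.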
The goodness of fit of both methods is measured using the pseudo $R^2$ [25], defined as

$$\text{Pseudo } R^2 = 1 - \frac{RASW_\tau}{TASW_\tau}, \tag{9}$$

where $RASW_\tau$ is the residual absolute sum of weighted differences between the observed dependent variable and the estimated quantile of the conditional distribution in the more complex model, while $TASW_\tau$ is the total absolute sum of weighted differences between the observed dependent variable and the estimated quantile of the conditional distribution in the simplest model. The pseudo $R^2$ takes values between zero and one. It indicates the goodness of fit of the proposed model in explaining the variance of the response variable: the higher the pseudo $R^2$, the better the proposed model.

Results and Discussion

The data analysis begins by fitting the data to the hypothesized model using the OLS method, in order to select the significant variables for the quantile and Bayesian analyses. Based on the OLS analysis, the variables age, diagnosis, and discharge status contributed significantly. Next, a model of the length of stay of COVID-19 patients is constructed using the quantile regression method and the Bayesian quantile regression method. The results of the two methods are then compared by looking at the width of the 95% confidence interval and the pseudo $R^2$ at the selected quantiles. The quantiles used are 0.10, 0.25, 0.50, 0.75, and 0.90. R software was used to analyze the data. The results of the analysis from both methods are provided in Table 3.

Table 3.
comparison between quantile and bayesian quantile method indicator variables quantile method bayesian quantile method estimate 95% ci estimate 95% ci 𝜏 = 0.10 intersep 2.0000 na -1.5006 14.6799 age (𝑋1) 0.0000 0.0000 0.0002 0.0041 diagnose (𝑋3) perus (𝑋3𝐷1) -2.0000 na 1.3864 15.3326 paus (𝑋3𝐷2) -1.0000 na 2.4139 14.6589 positive (𝑋3𝐷3) 0.0000 na 3.1337 14.8359 discharge status (𝑋4) recovered (𝑋4𝐷1) 2.0000* 0.0000 2.0446* 0.6021 died (𝑋4𝐷2) 0.0000 0.0000 0.0275 0.6264 outpatient (𝑋4𝐷3) 0.0000 na -0.2693 5.0697 referred to another hospital (𝑋4𝐷4) 0.0000 na -0.1649 2.0374 𝜏 = 0.25 intersep 1.0000 na -1.0118 14.2889 age (𝑋1) 0.0000 0.0000 0.0014 0.0094 diagnose (𝑋3) perus (𝑋3𝐷1) -1.0000 na 1.2671 14.8492 paus (𝑋3𝐷2) 0.0000 na 2.1446 14.2644 positive (𝑋3𝐷3) 3.0000 na 5.1174* 14.4329 discharge status (𝑋4) recovered (𝑋4𝐷1) 3.0000* 1.2544 2.6699* 1.2514 died (𝑋4𝐷2) 0.0000 0.9709 -0.2344 1.1873 outpatient (𝑋4𝐷3) 0.0000 na 1.5336 6.3011 referred to another hospital (𝑋4𝐷4) 0.0000 1.1155 0.1684 2.9599 𝜏 = 0.50 intersep 1.0000 na 1.4877 15.5613 age (𝑋1) 0.0000 0.0000 0.0002 0.0010 diagnose (𝑋3) perus (𝑋3𝐷1) 1.0000 na 0.9143 16.1529 paus (𝑋3𝐷2) 1.0000 na 0.9518 15.4168 positive (𝑋3𝐷3) 7.0000 na 6.8087 15.6258 discharge status (𝑋4) recovered (𝑋4𝐷1) 3.0000* 1.3265 2.4889* 2.1480 died (𝑋4𝐷2) -1.0000* 2.1006 -1.3962* 2.1555 outpatient (𝑋4𝐷3) 3.0000 6.1705 2.4900 7.1299 referred to another hospital (𝑋4𝐷4) 0.0000 2.9157 0.0397 4.2175 𝜏 = 0.75 intersep 2.0000 na 5.7668 22.7172 age (𝑋1) -1.05𝑥10 −7 0.0210 -0.0071* 0.0229 diagnose (𝑋3) perus (𝑋3𝐷1) 5.0000 na 0.1273 24.1500 paus (𝑋3𝐷2) 2.0000 na -1.4964 22.7243 positive (𝑋3𝐷3) 12.0000 na 9.2450 23.1280 modeling length of hospital stay for patients with covid-19 in west sumatra using quantile regression ferra yanuar 124 indicator variables quantile method bayesian quantile method estimate 95% ci estimate 95% ci discharge status (𝑋4) recovered (𝑋4𝐷1) 2.0000* 2.0233 2.2849* 2.2233 died (𝑋4𝐷2) -2.0000* 1.5733 -1.8134* 2.3198 outpatient (𝑋4𝐷3) 
2.0000 na 2.8668 8.5570 referred to another hospital (𝑋4𝐷4) 0.0000 9.5010 0.5922 5.9888 𝜏 = 0.90 intersep -0.6596 na 8.3752 36.4268 age (𝑋1) -0.0213 0.0539 -0.0181 * 0.0339 diagnose (𝑋3) perus (𝑋3𝐷1) 8.7021 na 0.2688 38.0224 paus (𝑋3𝐷2) 5.6596 na -2.9586 36.2285 positive (𝑋3𝐷3) 27.5957 na 18.1682 37.1268 discharge status (𝑋4) recovered (𝑋4𝐷1) 5.9362* 2.7045 5.2751 * 2.7938 died (𝑋4𝐷2) -0.5957* 2.7368 -1.1434 * 2.8253 outpatient (𝑋4𝐷3) 2.1702 na 3.5015 10.9954 referred to another hospital (𝑋4𝐷4) 0.7660 na 1.2313 6.3550 * significant at 𝛼 = 0.05, na = not available. in table 3, it can be seen that for the quantile regression method, the 𝑋4𝐷1 variable (recovered) contributed significantly in each quantile, and the category died is significant in the quantile 0.50; 0.75; and 0.90. meanwhile, none were statistically significant for other categories in other quantiles in influencing the length of stay of covid-19 patients. meanwhile, by using the bayesian quantile regression method, the age contributed significantly at the quantile 0.75, and 0.90 in giving affects to the length of stay of covid19 patients. while, diagnose variable (only positive category) contributed significantly to the length of stay of covid-19 patients in quantile 0.25, discharge status (only recovered category) contributed significantly in all quantiles, discharge status (only died category) is significant in quantile 0.50; 0.75; and 0.90 to affect the length of stay of covid-19 patients. from the results of this estimation analysis, it is found that the bayesian quantile regression method as a whole has more significant parameter and smaller 95% confidence interval than the quantile regression method. in order to determine the best method including the best model, it could be based on the higher value of 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2. the 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2 values for both methods for all selected quantiles are provided in table 4. table 4. the pseudo r2 values at all selected quantiles. 
quantile 𝝉𝒕𝒉 𝑷𝒔𝒆𝒖𝒅𝒐 𝑹𝟐 quantile bayesian quantile 0.10 0.27030 0.27235 0.25 0.57550 0.57843 0.50 0.87950 0.88262 0.75 0.93925 0.94244 0.90 0.67508 0.67787 modeling length of hospital stay for patients with covid-19 in west sumatra using quantile regression ferra yanuar 125 in table 4 above, it can be seen that for the quantile regression method, the model at quantile 0.75 is the best model because it has the highest value of 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2, that is 0.93925. this value informs that the proposed model can explain the variance of length of hospital stay for patients with covid-19 is 93.925%. this means that the proposed model at quantile 0.75 is acceptable and could be accepted. meanwhile for the bayesian quantile regression method, the quantile 0.75 is also as the best model because it has the highest value of 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2, that is 0.94244. this informs us that the model can explain the variance of the length of stay for covid-19 patients by 94.244%. since the 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2 value obtained from bayesian quantile regression model is higher than quantile method at corresponding quantiles, we could conclude here that bayesian quantile method tends to result better model than quantile method. therefore, the best model for the length of stay of covid-19 patients in west sumatra is model at quantile 0.75 based on bayesian quantile regression method. this proposed model is formulated as follows: �̂� = 5.7668 − 0.0071𝑋1 + 0.1273𝑋3𝐷1 − 1.4964𝑋3𝐷2 + 9.2450𝑋3𝐷3 + 2.2849𝑋4𝐷1 − 1.8134𝑋4𝐷2 + 2.8668𝑋4𝐷3 + 0.5922𝑋4𝐷4. there were 75% of the length of stay for covid-19 patients diagnosed with perus (person under supervision) is 0.1273 days longer than patients diagnosed with asymp (asymptotic persons) assuming others constants. around 75% of the length of stay for covid-19 patients diagnosed with paus (patients under supervision) is 1.4964 days longer than patients diagnosed with asymp (asymptotic persons) assuming others constants. 
approximately, 75% of the length of stay of covid-19 patients diagnosed with positive was 9.2450 days longer than patients diagnosed with asymp (asymptotic persons) assuming other variables constant. the similar interpretation could be stated for other variables. furthermore, the convergence test of the proposed parameter model obtained was carried out. because of limited space, the selected results of these test are provided in figure 2 below. (a) (b) (c) figure 2. convergency test for category recovered at quantile 0.75 (a) trace-plot, (b) densityplot, dan (c) acf plot in figure 2 (a), it can be seen that the resulting trace-plot forms a pattern that converges to a value so that it can be stated that the model parameters have converged. modeling length of hospital stay for patients with covid-19 in west sumatra using quantile regression ferra yanuar 126 while in part (b), it can be seen that the resulting density plot resembles a normal distribution curve. it can be stated that the model parameters are normally distributed. then in part (c), the resulting acf plot shows a smaller autocorrelation value so that it can be stated that there is no autocorrelation between samples. based on these convergency test, it can be concluded that the model parameters have converged and proposed model could be accepted. conclusions this study found that the length of stay of covid-19 patients in west sumatra was influenced by age, diagnoses of covid-19 patients, and discharge status. from the analysis carried out, the bayesian quantile regression method is better in modeling the length of stay of covid-19 patients than quantile method. the 95% confidence interval based on bayesian quantile regression is smaller, and the 𝑃𝑠𝑒𝑢𝑑𝑜 𝑅2 value is greater than the quantile regression method. 
Acknowledgments

This research was funded by the Directorate of Resources, Directorate General of Higher Education, Ministry of Education, Culture, Research, and Technology of Indonesia, in accordance with contract number 104/E4.1/AK.04.PT/2021.

References

[1] Kemenkes RI, “KMK No. HK.01.07-MENKES-413-2020 ttg Pedoman Pencegahan dan Pengendalian COVID-19.pdf.” 2020. [Online]. Available: https://covid19.go.id/p/regulasi/keputusan-menteri-kesehatan-republikindonesia-nomor-hk0107menkes4132020
[2] N. Lapidus, X. Zhou, F. Carrat, B. Riou, Y. Zhao, and G. Hejblum, “Biased and unbiased estimation of the average length of stay in intensive care units in the COVID-19 pandemic,” Ann. Intensive Care, vol. 10, no. 135, pp. 1–9, Dec. 2020, doi: 10.1186/s13613-020-00749-6.
[3] E. M. Rees et al., “COVID-19 length of hospital stay: a systematic review and data synthesis,” BMC Med, vol. 18, no. 270, pp. 1–22, Dec. 2020, doi: 10.1186/s12916-020-01726-3.
[4] S. Wu et al., “Understanding factors influencing the length of hospital stay among non-severe COVID-19 patients: a retrospective cohort study in a Fangcang shelter hospital,” PLOS ONE, vol. 15, no. 10, p. e0240959, Oct. 2020, doi: 10.1371/journal.pone.0240959.
[5] N. Desviona and F. Yanuar, “Simulation study of autocorrelated error using Bayesian quantile regression,” Sci. Technol. Indones., vol. 5, no. 3, pp. 70–74, Jul. 2020, doi: 10.26554/sti.2020.5.3.70-74.
[6] R. Alhamzawi, K. Yu, and D. F. Benoit, “Bayesian adaptive lasso quantile regression,” Statistical Modelling, vol. 12, no. 3, pp. 279–297, Jun. 2012, doi: 10.1177/1471082x1101200304.
[7] R. Alhamzawi and K. Yu, “Variable selection in quantile regression via Gibbs sampling,” Journal of Applied Statistics, vol. 39, no. 4, pp. 799–813, Apr. 2012, doi: 10.1080/02664763.2011.620082.
[8] K. Yu and R. A. Moyeed, “Bayesian quantile regression,” Statistics & Probability Letters, vol. 54, pp. 437–447, 2001.
[9] H. Kozumi and G. Kobayashi, “Gibbs sampling methods for Bayesian quantile regression,” Journal of Statistical Computation and Simulation, vol. 81, no. 11, pp. 1565–1578, Nov. 2011, doi: 10.1080/00949655.2010.496117.
[10] D. F. Benoit and D. Van den Poel, “Binary quantile regression: a Bayesian approach based on the asymmetric Laplace distribution,” J. Appl. Econ., vol. 27, no. 7, pp. 1174–1188, Nov. 2012, doi: 10.1002/jae.1216.
[11] Y. Feng, Y. Chen, and X. He, “Bayesian quantile regression with approximate likelihood,” Bernoulli, vol. 21, no. 2, pp. 832–850, May 2015, doi: 10.3150/13-BEJ589.
[12] Y. Yang, H. J. Wang, and X. He, “Posterior inference in Bayesian quantile regression with asymmetric Laplace likelihood: Bayesian quantile regression,” International Statistical Review, vol. 84, no. 3, pp. 327–344, Dec. 2016, doi: 10.1111/insr.12114.
[13] E. J. Nam, E. K. Lee, and M.-S. Oh, “Bayesian quantile regression analysis of Korean Jeonse deposit,” CSAM, vol. 25, no. 5, pp. 489–499, 2018, doi: 10.29220/csam.2018.25.5.489.
[14] M.-S. Oh, J. Choi, and E. S. Park, “Bayesian variable selection in quantile regression using the Savage–Dickey density ratio,” Journal of the Korean Statistical Society, vol. 45, no. 3, pp. 466–476, 2016, doi: 10.1016/j.jkss.2016.01.006.
[15] F. Yanuar, A. Zetra, C. Muharisa, D. Devianto, A. R. Putri, and Y. Asdi, “Bayesian quantile regression method to construct the low birth weight model,” J. Phys.: Conf. Ser., vol. 1245, p. 012044, Aug. 2019, doi: 10.1088/1742-6596/1245/1/012044.
[16] H. A. Huskamp, D. G. Stevenson, D. C. Grabowski, E. Brennan, and N. L. Keating, “Long and short hospice stays among nursing home residents at the end of life,” Journal of Palliative Medicine, vol. 13, no. 8, pp. 957–964, Aug. 2010, doi: 10.1089/jpm.2009.0387.
[17] B. G. Kaufman, C. A. Sueta, C. Chen, B. G. Windham, and S. C. Stearns, “Are trends in hospitalization prior to hospice use associated with hospice episode characteristics?,” Am J Hosp Palliat Care, vol. 34, no. 9, pp. 860–868, Nov. 2017, doi: 10.1177/1049909116659049.
[18] K. Yuki, M. Fujiogi, and S. Koutsogiannaki, “COVID-19 pathophysiology: a review,” Clinical Immunology, vol. 215, no. 108427, pp. 1–7, Jun. 2020, doi: 10.1016/j.clim.2020.108427.
[19] Y. Du et al., “Clinical features of 85 fatal cases of COVID-19 from Wuhan. A retrospective observational study,” Am J Respir Crit Care Med, vol. 201, no. 11, pp. 1372–1379, Jun. 2020, doi: 10.1164/rccm.202003-0543oc.
[20] C. Gebhard, V. Regitz-Zagrosek, H. K. Neuhauser, R. Morgan, and S. L. Klein, “Impact of sex and gender on COVID-19 outcomes in Europe,” Biol Sex Differ, vol. 11, no. 1, pp. 1–13, Dec. 2020, doi: 10.1186/s13293-020-00304-9.
[21] R. Alhamzawi and K. Yu, “Conjugate priors and variable selection for Bayesian quantile regression,” Computational Statistics & Data Analysis, vol. 64, pp. 209–219, Aug. 2013, doi: 10.1016/j.csda.2012.01.014.
[22] Y. Yang, H. J. Wang, and X. He, “Posterior inference in Bayesian quantile regression with asymmetric Laplace likelihood: Bayesian quantile regression,” International Statistical Review, vol. 84, no. 3, pp. 327–344, 2015, doi: 10.1111/insr.12114.
[23] C. Muharisa, F. Yanuar, and D. Devianto, “Simulation study the using of Bayesian quantile regression in nonnormal error,” CAUCHY, vol. 5, no. 3, pp. 121–126, Dec. 2018, doi: 10.18860/ca.v5i3.5633.
[24] I. Ntzoufras, Bayesian Modeling Using WinBUGS. Hoboken, N.J: Wiley, 2009.
[25] F. Yanuar, H. Yozza, F. Firdawati, I. Rahmi, and A. Zetra, “Applying bootstrap quantile regression for the construction of a low birth weight model,” Makara Journal of Health Research, vol. 23, no. 2, pp. 90–95, Aug. 2019, doi: 10.7454/msk.v23i2.9886.
CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(1) (2021), Pages 84-96
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: July 15, 2021. Reviewed: August 19, 2021. Accepted: October 06, 2021.
DOI: https://doi.org/10.18860/ca.v7i1.12934

On the Modification of Newton-Secant Method in Solving Nonlinear Equations for Multiple Zeros of Trigonometric Function

Juhari
Department of Mathematics, Faculty of Science and Technology, Maulana Malik Ibrahim Islamic State University of Malang
Email: juhari@uin-malang.ac.id

Abstract

This study discusses the construction of a modified Newton-secant method and its use in solving nonlinear equations with multiple zeros. A nonlinear equation with multiple zeros, that is, with multiplicity $m > 1$, is an equation that has a repeated root. The first step is to construct the Newton-secant method from the concepts of the Newton method and the secant method. The second step is to construct the modification of the Newton-secant method by adding the parameter $\theta$. Once the formula of the modified Newton-secant method is obtained, the method is applied to a nonlinear equation with multiple zeros, in this case the trigonometric function $f(x) = (\cos^2 x + x)^5$, which has multiplicity $m = 5$. The equation is solved from four different initial guesses, namely $-2$, $-0.8$, $-0.2$, and $2$. To assess the effectiveness of the method, the results are compared with those of the Newton-Raphson method, the secant method, and the unmodified Newton-secant method. The construction yields an iteration formula for the modified Newton-secant method. For $f(x)$ with the four initial guesses, the modified Newton-secant method gives the approximate root $-0.641714371$ in fewer iterations than the Newton method, the secant method, and the Newton-secant method. For the problem of finding the root of the nonlinear equation $f(x)$, it can therefore be concluded that the modified Newton-secant method is more effective than the Newton method, the secant method, and the Newton-secant method.

Keywords: modification; Newton-secant method; nonlinear equation; multiple zeros; trigonometric function

Introduction

Problems in science, engineering, and economics often involve mathematics, and these mathematical problems frequently take the form of nonlinear equations [1]. A nonlinear equation $f(x) = 0$ can be an algebraic equation or a transcendental equation. Transcendental (non-algebraic) equations cannot be expressed through algebraic operations alone; they involve logarithmic, exponential, hyperbolic, and trigonometric functions [2]. Finding a solution of a nonlinear equation means finding a value that makes the function zero, i.e. $f(x) = 0$ [3]. The solution of a complicated nonlinear equation is difficult to determine analytically, so numerical methods are used instead [4]. The solution obtained from a numerical method is an approximate solution. It differs from the exact solution, so there is a difference between the exact solution and the approximate solution.
This difference is often referred to as the error [5]. The roots of a nonlinear equation are not always simple; sometimes the equation has multiple roots, meaning a root of multiplicity $m > 1$. Multiplicity is defined as follows.

Definition 1 (Multiplicity). The root $\alpha$ of $f(x)$ is said to have multiplicity $m$ if $f(x) = (x - \alpha)^m h(x)$ for a continuous function $h(x)$ with $h(\alpha) \neq 0$, where $m$ is a positive integer. If $m = 1$ then $\alpha$ is called a simple root. If $m \geq 2$ then $\alpha$ is called a multiple root [6].

Thus a function is said to have a multiple root if the multiplicity of the root is greater than one [5].

The best-known numerical method for solving nonlinear equations is the Newton-Raphson method. It requires one initial guess and the value of the derivative of the function. The method fails if the chosen initial guess gives a zero derivative [7]. The Newton-Raphson formula is

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \tag{1}$$

with $n = 0, 1, 2, \ldots$ and $f'(x_n) \neq 0$.

An equally well-known numerical method is the secant method, which overcomes a weakness of the Newton-Raphson method. The Newton-Raphson method requires the first derivative of $f(x)$, and finding this derivative is not always easy. In the secant method the derivative is replaced by an equivalent difference quotient, so no derivative is needed, but two initial guesses are required [8]. The secant formula is

$$x_{n+1} = x_n - \frac{f(x_n)\,(x_{n-1} - x_n)}{f(x_{n-1}) - f(x_n)}, \tag{2}$$

with $n = 1, 2, 3, \ldots$. Numerical methods such as Newton's method and the secant method therefore need initial guesses.
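Formulas (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration only: the test function $f(x) = x^2 - 2$, the starting values, the tolerance, the stopping rule $|x_{n+1} - x_n| < \text{tol}$, and the function names are assumptions made for the example and are not taken from the paper.

```python
def newton(f, df, x0, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration (1); returns (approximate root, iterations)."""
    x = x0
    for n in range(1, max_iter + 1):
        dfx = df(x)
        if dfx == 0.0:
            # The method fails when the derivative vanishes at the iterate.
            raise ZeroDivisionError("f'(x_n) = 0; choose another initial guess")
        x_new = x - f(x) / dfx
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


def secant(f, x0, x1, tol=1e-10, max_iter=100):
    """Secant iteration (2); needs two initial guesses and no derivative."""
    x_prev, x = x0, x1
    for n in range(1, max_iter + 1):
        denom = f(x_prev) - f(x)
        if denom == 0.0:
            raise ZeroDivisionError("f(x_{n-1}) = f(x_n); secant line is horizontal")
        x_new = x - f(x) * (x_prev - x) / denom
        if abs(x_new - x) < tol:
            return x_new, n
        x_prev, x = x, x_new
    return x, max_iter


# Illustrative simple-root example (an assumption, not from the paper).
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
root_newton, n_newton = newton(f, df, 1.0)
root_secant, n_secant = secant(f, 1.0, 2.0)
print(root_newton, n_newton, root_secant, n_secant)
```

Both calls should approach $\sqrt{2}$; the secant version trades the derivative evaluation for a second starting value, as described above.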
Choosing the initial value is easier with the help of the intermediate value theorem, which is used to determine whether a solution exists within a given interval.

Theorem 1 (Intermediate Value Theorem). If $f \in C[a, b]$ and $K$ is any number between $f(a)$ and $f(b)$, then there exists a number $c$ in $(a, b)$ for which $f(c) = K$ [9].

A closely related result is Bolzano's theorem.

Theorem 2 (Bolzano's Theorem). If $f: [a, b] \subset \mathbb{R} \to \mathbb{R}$ is a continuous function and $f(a) \cdot f(b) < 0$, then there is at least one root $x \in (a, b)$ such that $f(x) = 0$ [10].

Each numerical method has its own order of convergence. The order of convergence measures how fast the iterates of a method approach the root of the function $f$. It is defined as follows.

Definition 2 (Convergence). Let the sequence $x_1, x_2, \ldots, x_n$ converge to $\alpha$ and let $e_n = x_n - \alpha$ for $n \geq 0$. If there exist an order $p > 0$ and an error constant $C \neq 0$ with

$$\lim_{n \to \infty} \frac{|x_{n+1} - \alpha|}{|x_n - \alpha|^p} = \lim_{n \to \infty} \frac{|e_{n+1}|}{|e_n|^p} = C,$$

then the sequence $\{x_n\}$ converges to $\alpha$ with order of convergence $p$ [11].

If $p = 1$ the iteration method converges linearly. If $p = 2$ it converges quadratically. If $1 < p < 2$ it converges superlinearly [12]. If $p = 3$ it converges cubically [13]. The Newton-Raphson method has quadratic order of convergence, while the secant method has superlinear order of convergence [12].

Numerical methods continue to be developed. These developments aim to find methods that are more effective in solving existing problems [14].
On this basis, in 2002 Kasturiarachi combined Newton's method and the secant method into a new method, namely the leap-frogging Newton's method or Newton-secant method [13]. This method converges cubically when used to solve nonlinear equations with simple roots, but its convergence becomes linear when it is used for multiple zeros. Therefore, the Newton-secant method was modified in [15] by adding a parameter $\theta$. The purpose of this modification is to keep the order of convergence of the Newton-secant method cubic when it is used to find multiple roots of nonlinear equations [15]. The following theorem concerns the parameter $\theta$ used in the modification.

Theorem 3. Let $\alpha \in D$ be a multiple root of a sufficiently differentiable function $f : D \subset \mathbb{R} \to \mathbb{R}$ on an open interval $D$ with multiplicity $m > 1$, where $D$ includes $x_0$ as an initial approximation of $\alpha$. Then the modification of the Newton-secant method has order three and

$$\theta = \left(\frac{-1+m}{m}\right)^{-1+m}, \quad m \in \mathbb{Z}^+.$$

Proof: Let $\alpha$ be a multiple zero of the equation $f(x) = 0$ with multiplicity $m$, so that $f(\alpha) = 0$ and $f^{(m)}(\alpha) \neq 0$. Define

$$e_n := x_n - \alpha, \qquad e_{n,\bar{x}} := \bar{x}_n - \alpha, \qquad c_i := \frac{m!}{(m+i)!}\,\frac{f^{(m+i)}(\alpha)}{f^{(m)}(\alpha)}.$$

Using the Taylor expansion of $f(x_n)$ around $x_n = \alpha$ we get

$$f(x_n) = c_0 e_n^m + c_1 e_n e_n^m + c_2 e_n^2 e_n^m + c_3 e_n^3 e_n^m + O(e_n^4),$$

which simplifies to

$$f(x_n) = e_n^m\left(c_0 + c_1 e_n + c_2 e_n^2 + c_3 e_n^3\right) + O(e_n^4), \tag{3}$$

and

$$f'(x_n) = e_n^{m-1}\left(m + (m+1)c_1 e_n + (m+2)c_2 e_n^2 + (m+3)c_3 e_n^3 + O(e_n^4)\right). \tag{4}$$

Dividing equation (3) by equation (4) gives

$$\frac{f(x_n)}{f'(x_n)} = e_n\left(\frac{\left(c_0 + c_1 e_n + c_2 e_n^2 + c_3 e_n^3\right) + O(e_n^4)}{m + (m+1)c_1 e_n + (m+2)c_2 e_n^2 + (m+3)c_3 e_n^3 + O(e_n^4)}\right). \tag{5}$$

Equation (5) can be written as

$$\frac{f(x_n)}{f'(x_n)} = \frac{1}{m} e_n - \frac{c_1}{m^2 c_0} e_n^2 + \frac{(1+m)c_1^2 - 2m c_0 c_2}{m^3 c_0^2} e_n^3 + O(e_n^4), \tag{6}$$

and, since $\bar{x}_n = x_n - f(x_n)/f'(x_n)$,

$$e_{n,\bar{x}} = \bar{x}_n - \alpha = \frac{-1+m}{m} e_n + \frac{c_1}{m^2 c_0} e_n^2 + \frac{-(1+m)c_1^2 + 2m c_0 c_2}{m^3 c_0^2} e_n^3 + O(e_n^4). \tag{7}$$

For $f(\bar{x}_n)$ we have

$$f(\bar{x}_n) = e_{n,\bar{x}}^m\left(c_0 + c_1 e_{n,\bar{x}} + c_2 e_{n,\bar{x}}^2 + c_3 e_{n,\bar{x}}^3\right) + O(e_{n,\bar{x}}^4). \tag{8}$$

Substituting (3)-(8) into the formula of the modified Newton-secant method, which is constructed in the Results and Discussion section, gives

$$e_{n+1} = D_1 e_n + D_2 e_n^2 + D_3 e_n^3 + O(e_n^4),$$

where

$$D_1 = 1 + \frac{\theta}{m\left(-\theta + \left(\frac{-1+m}{m}\right)^m\right)},$$

$$D_2 = \frac{\theta m^{-2+m}\left(-m(-1+m)^m + \theta m^m(-1+m)\right)c_1}{(-1+m)\left((-1+m)^m - \theta m^m\right)^2 c_0},$$

$$D_3 = \frac{\theta m^{-3+m} A}{2(-1+m)^2\left((-1+m)^m - \theta m^m\right)^3 c_0^2},$$

with

$$A = (-1+m)^{2m}(-1+m+2m^2)\left(m c_1^2 - 2(-1+m)c_0 c_2\right) + 2\theta^2(-1+m)^2 m^{2m}\left((1+m)c_1^2 - 2m c_0 c_2\right) - \theta(-1+m)^{1+m}\left(m(3+4m)c_1^2 + 2(1+m-4m^2)c_0 c_2\right).$$

Therefore, to obtain third-order convergence, it is necessary to choose $D_i = 0$ ($i = 1, 2$), which gives

$$\theta = \left(\frac{-1+m}{m}\right)^{-1+m},$$

and the error equation becomes

$$e_{n+1} = \left(\frac{m c_1^2 - 2(-1+m)c_0 c_2}{2m^2 c_0^2}\right) e_n^3 + O(e_n^4),$$

so the modification of the Newton-secant method has convergence order three [15].

Based on the description above, this study constructs the modification of the Newton-secant method for solving nonlinear equations with multiple zeros and then applies the method to such an equation, in this case a nonlinear trigonometric function. To assess the effectiveness of the modified Newton-secant method, its solution is compared with those of the Newton method, the secant method, and the Newton-secant method.

Methods

Research steps

1. Construction of the Newton-secant method and its modification, with the following steps:
a) Construction of the Newton-secant method.
• Analyzing the theory behind the Newton-secant method, based on Kasturiarachi's 2002 article entitled Leap-frogging Newton's Method.
• Obtaining Newton's approximation from the equation of the tangent line at $(x_0, f(x_0))$ that intersects the $x$-axis at $(\bar{x}_0, 0)$.
• Forming the equation of the secant line connecting the points $(x_0, f(x_0))$ and $(\bar{x}_0, f(\bar{x}_0))$ using the equation of a line through two points, and assuming that the secant line meets the $x$-axis at the point $(x_1, 0)$.
• Substituting Newton's approximation into the equation of the secant line connecting the points $(x_0, f(x_0))$ and $(\bar{x}_0, f(\bar{x}_0))$.
• Writing the iteration formula based on the preceding steps.
b) Construction of the modification of the Newton-secant method by adding the parameter $\theta$ to the second term of the Newton-secant method. Theorem 3 in the Introduction is used here.
2. Solving a nonlinear equation with multiplicity $m > 1$ using the modification of the Newton-secant method. The equation is solved from four different initial values. The results are then compared with those of the Newton-Raphson method, the secant method, and the unmodified Newton-secant method. This comparison aims to assess the effectiveness of the modified Newton-secant method in terms of the number of iterations, the convergence, and the time needed to solve a nonlinear equation with multiplicity $m > 1$.

Results and Discussion

1. Construction of the Newton-secant method and its modification

a. Construction of the Newton-secant method

The Newton-secant method, also known as the leap-frogging Newton's method, is a combination of the Newton method and the secant method. Its construction therefore uses the concepts of both methods. Suppose that the function $f(x)$ has the zero $\alpha$ in the interval $[a, b]$ and $f \in C^2[a, b]$. Let $x_0$ be the initial guess.
If the tangent line at $(x_0, f(x_0))$ intersects the $x$-axis at $(\bar{x}_0, 0)$, the geometric idea of Newton's method is as shown in Figure 1.

Figure 1. Newton's approximation curve

Newton's approximation then follows from the slope of the tangent line:

$$f'(x_0) = \frac{\Delta y}{\Delta x} = \frac{f(x_0) - 0}{x_0 - \bar{x}_0}$$

$$f'(x_0)(x_0 - \bar{x}_0) = f(x_0)$$

$$x_0 - \bar{x}_0 = \frac{f(x_0)}{f'(x_0)}$$

$$\bar{x}_0 = x_0 - \frac{f(x_0)}{f'(x_0)} \tag{9}$$

Here $\bar{x}_0$ is used instead of $x_1$ because it serves only as an intermediate approximation. Next, we find the equation of the secant line connecting the points $(x_0, f(x_0))$ and $(\bar{x}_0, f(\bar{x}_0))$, using the geometric idea of the secant method shown in Figure 2.

Figure 2. The concept of the secant method

Based on Figure 2, the gradient of the secant line is

$$\frac{\Delta y}{\Delta x} = \frac{f(x_0) - f(\bar{x}_0)}{x_0 - \bar{x}_0}.$$

This gradient is used to obtain the equation of the line connecting the points $(x_0, f(x_0))$ and $(\bar{x}_0, f(\bar{x}_0))$. From the equation of a line through two points we obtain

$$y - f(x_0) = \frac{f(x_0) - f(\bar{x}_0)}{x_0 - \bar{x}_0}(x - x_0). \tag{10}$$

Assume the secant line meets the $x$-axis at the point $(x_1, 0)$. Substituting $x = x_1$ and $y = 0$ into equation (10) gives

$$0 - f(x_0) = \frac{f(x_0) - f(\bar{x}_0)}{x_0 - \bar{x}_0}(x_1 - x_0)$$

$$\left[f(x_0) - f(\bar{x}_0)\right](x_1 - x_0) = -f(x_0)(x_0 - \bar{x}_0)$$

$$x_1 - x_0 = \frac{-f(x_0)(x_0 - \bar{x}_0)}{f(x_0) - f(\bar{x}_0)}$$

$$x_1 = x_0 - \frac{f(x_0)(x_0 - \bar{x}_0)}{f(x_0) - f(\bar{x}_0)}. \tag{11}$$

Substituting Newton's approximation (9) into equation (11) gives

$$x_1 = x_0 - \frac{f(x_0)\left(x_0 - \left(x_0 - \frac{f(x_0)}{f'(x_0)}\right)\right)}{f(x_0) - f(\bar{x}_0)} = x_0 - \frac{f(x_0)\,\frac{f(x_0)}{f'(x_0)}}{f(x_0) - f(\bar{x}_0)} = x_0 - \frac{[f(x_0)]^2}{f'(x_0)\left[f(x_0) - f(\bar{x}_0)\right]}. \tag{12}$$

Repeating this process, the iteration formula can be written as

$$x_{n+1} = x_n - \frac{[f(x_n)]^2}{f'(x_n)\left[f(x_n) - f(\bar{x}_n)\right]}, \quad n = 0, 1, 2, \ldots, \qquad \text{where } \bar{x}_n = x_n - \frac{f(x_n)}{f'(x_n)}. \tag{13}$$

Equation (13) is the iteration formula of the Newton-secant method.

b. Construction of the modification of the Newton-secant method

For nonlinear equations with simple roots, the Newton-secant method converges cubically. For nonlinear equations with multiple zeros, that is, multiplicity $m > 1$, the convergence is no longer cubic but linear. Therefore, to keep the convergence of the Newton-secant method cubic, the method needs to be modified. The modification is made by adding a parameter $\theta$ to the second term of the Newton-secant formula. With this parameter, the modified Newton-secant method converges cubically.

The Newton-secant formula (13) can be written in the form

$$x_{n+1} = x_n - \frac{f(x_n)}{f(x_n) - f(\bar{x}_n)} \cdot \frac{f(x_n)}{f'(x_n)}, \quad n = 0, 1, 2, \ldots$$
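The behaviour described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the function name `newton_secant`, the stopping rule $|x_{n+1} - x_n| < 10^{-10}$, the iteration cap, and the zero checks are choices made for the example. The parameter `theta` weights $f(x_n)$ in the first factor; `theta = 1` reproduces formula (13), and $\theta = ((m-1)/m)^{m-1}$ is the value given by Theorem 3 for a root of multiplicity $m$. The test function is the paper's $f(x) = (\cos^2 x + x)^5$ with $m = 5$.

```python
import math


def newton_secant(f, df, x0, theta=1.0, tol=1e-10, max_iter=500):
    """Newton-secant iteration with an optional weight theta on f(x_n).

    theta = 1 gives formula (13). Returns (approximate root, iterations).
    """
    x = x0
    for n in range(1, max_iter + 1):
        fx, dfx = f(x), df(x)
        if fx == 0.0:
            return x, n - 1           # x is already an exact zero
        if dfx == 0.0:
            raise ZeroDivisionError("f'(x_n) = 0")
        newton_step = fx / dfx
        x_bar = x - newton_step       # intermediate Newton point
        denom = theta * fx - f(x_bar)
        if denom == 0.0:
            raise ZeroDivisionError("theta*f(x_n) = f(x_bar_n)")
        x_new = x - theta * fx / denom * newton_step
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


# The paper's example: a zero of multiplicity m = 5.
f = lambda x: (math.cos(x) ** 2 + x) ** 5
df = lambda x: 5 * (math.cos(x) ** 2 + x) ** 4 * (1 - 2 * math.cos(x) * math.sin(x))

m = 5
theta = ((m - 1) / m) ** (m - 1)      # (4/5)^4 = 0.4096

root_plain, it_plain = newton_secant(f, df, -0.8)             # theta = 1
root_mod, it_mod = newton_secant(f, df, -0.8, theta=theta)    # weighted
print(root_plain, it_plain)
print(root_mod, it_mod)
```

In this sketch the unweighted iteration should need many more steps than the weighted one to reach the same tolerance, which is the linear-versus-cubic contrast discussed above. Exact iteration counts depend on the stopping rule and need not match those reported in the paper.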
refer to theorem 3 in the introduction section to the construction of mathematical model modification of newton-secant method is done by adding parameter 𝜃 to the second term newton-secant method. so that the iteration formula is obtained as following : where 𝑥𝑛+1 = 𝑥𝑛 − 𝜃𝑓(𝑥𝑛) 𝜃𝑓(𝑥𝑛 ) − 𝑓(𝑥𝑛̅̅ ̅) ∙ 𝑓(𝑥𝑛) 𝑓 ′(𝑥𝑛) , 𝑛 = 0, 1, 2, . . . . 𝑥𝑛̅̅ ̅ = 𝑥𝑛 − 𝑓(𝑥𝑛) 𝑓 ′(𝑥𝑛) and 𝜃 = ( −1 + 𝑚 𝑚 ) −1+𝑚 (14) thus, we get equation (14) is modification of newton-secant method formula which can be used to find solution to nonlinear equations for multiple zeros. 2. solving nonlinear equation for multiple zeros (trigonometric function) use modification of newton-secant method in the previous explanation, the construction of mathematical model newtonsecant method and its modification has been explained. to know effectivity of modification of newton-secant method then it is necessary to apply a modified newtonsecant method to nonlinear equations for multiple zeros. in this case, it is taken an example of nonlinear equation of trigonometric function, namely 𝑓(𝑥) = (cos2 𝑥 + 𝑥)5 solving equation 𝑓(𝑥) use modification of newton-secant method will be applied with four different initial guess, namely 𝑥 = −2; 𝑥 = −0,8; 𝑥 = −0,2 and 𝑥 = 2. the following are the steps to find a solution to the equation 𝑓(𝑥) = (𝑐𝑜𝑠2 𝑥 + 𝑥)5 using modification of newton-secant method: 1) determine the initial guess to be used the initial guess selection is based on theorem 1 (intermediate value theorem) in the introduction section. so that the initial guess are chosen, namely 𝑥 = −2; 𝑥 = −0,8; 𝑥 = −0,2 and 𝑥 = 2. 2) finding the derivative of 𝑓(𝑥) 𝑓(𝑥) = (cos2 𝑥 + 𝑥)5 𝑓 ′(𝑥) = 5(𝑐𝑜𝑠2𝑥 + 𝑥)4(−2 cos 𝑥 sin 𝑥 + 1). 3) determine the multiplicity of 𝑓(𝑥) based on the definition of multiplicity in the introduction section, it can be said that the multiplicity of 𝑓(𝑥) is 𝑚 = 5. 4) set the error to be used in this case the author sets the error used, namely 10−10. 
5) Calculate the value of $\theta$:
\[
\theta = \left(\frac{-1 + m}{m}\right)^{-1+m} = \left(\frac{-1 + 5}{5}\right)^{-1+5} = \left(\frac{4}{5}\right)^{4} = 0.4096 .
\]

6) Perform the iteration using the modified Newton-secant formula.

7) Compare the modified Newton-secant method with Newton's method, the secant method, and the Newton-secant method, in order to assess the effectiveness of the modified Newton-secant method in solving nonlinear equations with multiple zeros of a trigonometric function.

The results of solving $f(x) = (\cos^2 x + x)^5$ with the modified Newton-secant method, the Newton method, the secant method, and the unmodified Newton-secant method are given below.

Table 1. Solution of $f(x) = (\cos^2 x + x)^5$ using the modified Newton-secant method, the Newton method, the secant method, and the Newton-secant method.

| $f_i$ | $x_0$ | Method | $n$ | $x_n$ | $f(x_n)$ | $\varepsilon$ |
|---|---|---|---|---|---|---|
| $f(x) = (\cos^2 x + x)^5$ | -2 | MMNS | 5 | -0.641714371 | 1.76869e-74 | 0.0000000000 |
| | | MNS | 61 | -0.641714371 | 8.67022e-49 | 0.0000000000 |
| | | MN | 93 | -0.641714371 | 1.20449e-47 | 0.0000000000 |
| | | MS | 133 | -0.641714371 | -8.69088e-47 | 0.0000000000 |
| | -0.8 | MMNS | 4 | -0.641714371 | -4.17656e-74 | 0.0000000000 |
| | | MNS | 60 | -0.641714371 | -1.81522e-48 | 0.0000000000 |
| | | MN | 92 | -0.641714371 | -2.51216e-47 | 0.0000000000 |
| | | MS | 131 | -0.641714371 | -9.67797e-47 | 0.0000000000 |
| | -0.2 | MMNS | 4 | -0.641714371 | -9.9601e-76 | 0.0000000000 |
| | | MNS | 62 | -0.641714371 | 3.45672e-48 | 0.0000000000 |
| | | MN | 96 | -0.641714371 | 1.88037e-47 | 0.0000000000 |
| | | MS | 132 | -0.641714371 | -7.87677e-47 | 0.0000000000 |
| | 2 | MMNS | 5 | -0.641714371 | 9.07176e-75 | 0.0000000000 |
| | | MNS | 64 | -0.641714371 | -9.82514e-49 | 0.0000000000 |
| | | MN | 101 | -0.641714371 | 1.84781e-47 | 0.0000000000 |
| | | MS | 133 | -0.641714371 | -8.69088e-47 | 0.0000000000 |

Information:
$f_i$: nonlinear function, with $i = 1, 2, 3, \ldots$
$x_0$: initial guess (initial value)
$n$: number of iterations
$x_n$: approximate root
$f(x_n)$: function value at the root $x_n$
$\varepsilon$: error $(x_{n+1} - x_n)$
MMNS: modified Newton-secant method
MNS: Newton-secant method
MN: Newton method
MS: secant method

In Table 1 the error is shown as 0.0000000000, meaning that the error of the iterate is below the tolerance used, namely $\varepsilon < 10^{-10}$. The iteration therefore stops, and the approximate root $-0.641714371$ is the solution of $f(x)$. The table also shows that, for all four initial values $x = -2$, $x = -0.8$, $x = -0.2$ and $x = 2$, the modified Newton-secant method needs fewer iterations (4 to 5) than the Newton-secant method (60 to 64), Newton's method (92 to 101), and the secant method (131 to 133). Judged by the number of iterations, the modified Newton-secant method is more effective in solving the nonlinear equation $f(x)$ than the Newton method, the secant method, and the unmodified Newton-secant method. The convergence of the error values for these computations is shown in Table 2 and Table 3.

Table 2. Convergence graphs of the error values in the solution of $f(x)$ using the modified Newton-secant method and the Newton-secant method.
[Graphs not reproduced. Rows: $x_0 = -2$, $-0.8$, $-0.2$, $2$; columns: modified Newton-secant method, Newton-secant method.]

Table 3. Convergence graphs of the error values in the solution of $f(x)$ using the Newton method and the secant method.
[Graphs not reproduced. Rows: $x_0 = -2$, $-0.8$, $-0.2$, $2$; columns: Newton method, secant method.]

Conclusion

Combining two numerical methods, Newton's method and the secant method, yields the Newton-secant method. The Newton-secant method is then modified by adding the parameter $\theta$, which gives a new method, the modified Newton-secant method, with the iteration formula

\[
x_{n+1} = x_n - \frac{\theta f(x_n)}{\theta f(x_n) - f(\bar{x}_n)} \cdot \frac{f(x_n)}{f'(x_n)}, \qquad n = 0, 1, 2, \ldots,
\]

where

\[
\bar{x}_n = x_n - \frac{f(x_n)}{f'(x_n)} \quad \text{and} \quad \theta = \left(\frac{-1 + m}{m}\right)^{-1+m}.
\]

Solving $f(x) = (\cos^2 x + x)^5$ with the modified Newton-secant method from four different initial guesses, namely $x = -2$, $x = -0.8$, $x = -0.2$ and $x = 2$, gives the approximate root $-0.641714371$ in fewer iterations than the Newton method, the secant method, and the Newton-secant method. For the problem of finding the root of the nonlinear trigonometric equation $f(x)$, it can therefore be concluded that the modified Newton-secant method is more effective than the Newton method, the secant method, and the unmodified Newton-secant method.
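The modified Newton-secant iteration (14), applied to the paper's example $f(x) = (\cos^2 x + x)^5$ with $m = 5$, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the author's program: the names `theta` and `modified_newton_secant`, the stopping rule $|x_{n+1} - x_n| < 10^{-10}$, and the way iterations are counted are assumptions, so the iteration counts it prints need not match Table 1 exactly.

```python
import math


def theta(m):
    """Parameter of formula (14): ((m - 1) / m) ** (m - 1)."""
    return ((m - 1) / m) ** (m - 1)


def modified_newton_secant(f, df, x0, m, tol=1e-10, max_iter=200):
    """Iterate formula (14); return (approximate root, iterations used)."""
    th = theta(m)
    x = x0
    for n in range(1, max_iter + 1):
        fx = f(x)
        dfx = df(x)
        if fx == 0.0 or dfx == 0.0:        # at a root, or no Newton predictor
            return x, n - 1
        x_bar = x - fx / dfx               # Newton predictor, x-bar_n
        denom = th * fx - f(x_bar)         # theta f(x_n) - f(x-bar_n)
        if denom == 0.0:
            return x, n - 1
        x_new = x - (th * fx / denom) * (fx / dfx)   # formula (14)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    return x, max_iter


def f(x):
    # Example function from the text: zero of multiplicity m = 5.
    return (math.cos(x) ** 2 + x) ** 5


def df(x):
    # f'(x) = 5 (cos^2 x + x)^4 (1 - 2 cos x sin x)
    g = math.cos(x) ** 2 + x
    return 5.0 * g ** 4 * (1.0 - 2.0 * math.cos(x) * math.sin(x))


# The four initial guesses used in the text.
for x0 in (-2.0, -0.8, -0.2, 2.0):
    print(x0, modified_newton_secant(f, df, x0, m=5))
```

For $m = 1$ the function `theta` returns 1 (Python evaluates `0.0 ** 0` as 1.0), so the sketch reduces to the unmodified formula (13) in that case.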
Trace of Positive Integer Power of Squared Special Matrix

CAUCHY - Jurnal Matematika Murni dan Aplikasi, Volume 6(4) (2021), Pages 200-211
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: September 15, 2020; Reviewed: January 04, 2021; Accepted: January 27, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.10312

Rahmawati¹, Aryati Citra², Fitry Aryani³, Corry Corazon Marzuki⁴, Yuslenita Muda⁵
¹˒²˒³˒⁴˒⁵ Department of Mathematics, Faculty of Science and Technology, State Islamic University of Sultan Syarif Kasim Riau, St. HR. Soebrantas No. 155 Simpang Baru, Panam, Pekanbaru, 28293
Email: ¹rahmawati@uin-suska.ac.id, ²aryaticitra1@gmail.com, ³khodijah_fitri@uin-suska.ac.id, ⁴corry@uin-suska.ac.id, ⁵yuslenita.muda@uin-suska.ac.id

Abstract

The square matrix discussed in this research is a special matrix in which all entries in each row have the same value; it is denoted by $A_n$. The main aim of this paper is to find the general form of the trace of $A_n$ raised to a positive integer power $m$, denoted by $\operatorname{Tr}(A_n)^m$. The general forms are proved by mathematical induction and direct proof. The main results are the general formulas of $(A_n)^m$ and $\operatorname{Tr}(A_n)^m$, obtained by observing the pattern of the matrix powers for $2 \le m \le 11$, $n \ge 2$, and $m \in \mathbb{Z}^+$.

Keywords: direct proof; mathematical induction; matrix trace; square matrix

Introduction

The calculation of the trace of powers of a square matrix has attracted attention. According to Brezinski [1], the trace of a matrix power is often used in several fields of mathematics, especially network analysis, number theory, dynamical systems, matrix theory, and differential equations. The trace of a matrix has been widely studied by several researchers. Datta et al. [2] obtained an algorithm for the trace of a matrix power $\operatorname{Tr}(A^k)$, where $k$ is an integer and $A$ is a Hessenberg matrix with unit codiagonal.
The trace also appears in several applications in matrix theory and numerical linear algebra. For example, in determining the eigenvalues of a symmetric matrix, a basic procedure is estimating $\operatorname{Tr}(A^n)$ and $\operatorname{Tr}(A^{-n})$ for integer $n$, as explained by Pan [3]. Chu [4] discussed the symbolic calculation of the trace of powers of a tridiagonal matrix. Let $A$ be a symmetric positive definite matrix and let $\{\lambda_k\}$ denote its eigenvalues. For $q \in \mathbb{R}$, $A^q$ is also symmetric positive definite, and Higham [5] gives the formula

\[
\operatorname{Tr}(A^q) = \sum_k \lambda_k^q .
\]

According to Zarelua [6], in quantum and combinatorial theory the traces of powers of integer matrices are related to the Euler congruence

\[
\operatorname{Tr}\!\left(A^{p^r}\right) \equiv \operatorname{Tr}\!\left(A^{p^{r-1}}\right) \pmod{p^r}
\]

for every integer matrix $A$, where $p$ is a prime number and $r$ is a natural number. That article also discusses invariants of dynamical systems that are expressed as traces of powers of integer matrices, for example the Lefschetz numbers.

Next, Pahade and Jha [7] discuss the general form of the trace of positive integer powers of a $2 \times 2$ matrix. That article gives two general forms of the trace of the $n$-th power of a $2 \times 2$ matrix. First, for even $n$,

\[
\operatorname{Tr}(A^n) = \sum_{r=0}^{n/2} \frac{(-1)^r}{r!}\, n\,[n - (r+1)]\,[n - (r+2)] \cdots [n - (r + (r-1))]\,(\det(A))^r\,(\operatorname{Tr}(A))^{n-2r} .
\]

Second, for odd $n$,

\[
\operatorname{Tr}(A^n) = \sum_{r=0}^{(n-1)/2} \frac{(-1)^r}{r!}\, n\,[n - (r+1)]\,[n - (r+2)] \cdots [n - (r + (r-1))]\,(\det(A))^r\,(\operatorname{Tr}(A))^{n-2r} .
\]
In network analysis, in particular triangle counting in a graph, Avron [8] notes that an important problem in analyzing a complex network is computing the total number of triangles in a simple connected graph. This number equals $\operatorname{Tr}(A^3)/6$, where $A$ is the adjacency matrix of the graph. In 2017, Pahade and Jha [9] discussed the trace of positive integer powers of an adjacency matrix. For the symmetric adjacency matrix of a complete simple graph with $n$ vertices, the trace for even $k$ is

\[
\operatorname{Tr}(A^k) = \sum_{r=1}^{k/2} s(k,r)\, n\,(n-1)^r\,(n-2)^{k-2r}
\]

and for odd $k$ it is

\[
\operatorname{Tr}(A^k) = \sum_{r=1}^{(k-1)/2} s(k,r)\, n\,(n-1)^r\,(n-2)^{k-2r},
\]

where $s(k,r)$ is a number that depends on $k$ and $r$, defined by $s(k,1) = 1$, $s\!\left(k, \tfrac{k}{2}\right) = 1$, $s\!\left(k, \tfrac{k-1}{2}\right) = \tfrac{k-1}{2}$, and $s(k,r) = s(k-1,r) + s(k-2,r-1)$.

This research determines the trace of powers of a square matrix with real entries in which all entries in each row are equal. Some related definitions and theorems are given first.

Definition 1.1 (Anton [10]) If $A$ is a square matrix, then the nonnegative integer powers of $A$ are defined by

\[
A^0 = I, \qquad A^n = \underbrace{AA \cdots A}_{n \text{ factors}} \quad (n > 0).
\]

If $A$ is invertible, then the negative integer powers of $A$ are defined by

\[
A^{-n} = (A^{-1})^n = \underbrace{A^{-1}A^{-1} \cdots A^{-1}}_{n \text{ factors}} .
\]

Theorem 1.1 (Andrilli [11]) If $A$ is a square matrix and $r$ and $s$ are nonnegative integers, then
1. $A^r A^s = A^{r+s}$
2. $(A^r)^s = A^{rs} = (A^s)^r$

Definition 1.2 [10] If $A$ is a square matrix, then the trace of $A$, denoted $\operatorname{Tr}(A)$, is defined as the sum of the entries on the main diagonal of $A$. The trace of $A$ is not defined when $A$ is not a square matrix.

\[
\operatorname{Tr}(A) = a_{11} + a_{22} + \cdots + a_{nn} = \sum_{i=1}^{n} a_{ii} .
\tag{1.1}
\]

Theorem 1.2 [12] If $A$ and $B$ are square matrices of the same order and $c$ is a scalar, then:
a. $\operatorname{Tr}(A) = \operatorname{Tr}(A^T)$,
b. $\operatorname{Tr}(cA) = c\,\operatorname{Tr}(A)$, (1.2)
c. $\operatorname{Tr}(A + B) = \operatorname{Tr}(A) + \operatorname{Tr}(B)$,
d. $\operatorname{Tr}(AB) = \operatorname{Tr}(BA)$.

Methods

The method used in this paper is a literature study, carried out in the following steps.
• Finding the general formula of the matrix power $(A_n)^m$ with $m \in \mathbb{Z}^+$ and proving it by mathematical induction.
• Determining the trace of $(A_n)^m$, denoted $\operatorname{Tr}(A_n)^m$: finding its general formula and proving the formula obtained by direct proof, using the result established by mathematical induction.

Results and Discussion

This research discusses the trace of the positive integer power $m$ of a special matrix of order $n \times n$ with real entries in which all entries in each row have the same value; the power is denoted $(A_n)^m$. The research starts by determining the general form of $(A_n)^m$ from the powers of the matrices of order $2 \times 2$ to $5 \times 5$. Once the general form of $(A_n)^m$ is obtained, the research continues with $\operatorname{Tr}(A_n)^m$.

Special matrix of order $n \times n$ ($n \ge 2$) raised to the positive integer power $m$

This part discusses powers of the special matrix of order $n \times n$, $n \ge 2$, with real entries in which all entries in each row have the same value. The matrix has the form

\[
A_n = \begin{bmatrix}
a_1 & a_1 & \cdots & a_1 \\
a_2 & a_2 & \cdots & a_2 \\
\vdots & \vdots & \vdots & \vdots \\
a_i & a_i & \cdots & a_i \\
\vdots & \vdots & \vdots & \vdots \\
a_n & a_n & \cdots & a_n
\end{bmatrix}, \qquad a_i \in \mathbb{R};\ i = 1, 2, \ldots, n.
\tag{2.1}
\]

The special matrices of order $2 \times 2$ to $5 \times 5$ are

\[
A_2 = \begin{bmatrix} a_1 & a_1 \\ a_2 & a_2 \end{bmatrix}, \quad
A_3 = \begin{bmatrix} a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 \end{bmatrix}, \quad
A_4 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 \end{bmatrix}, \quad
A_5 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 & a_4 \\ a_5 & a_5 & a_5 & a_5 & a_5 \end{bmatrix}.
\]

For $n = 2$, the powers $(A_2)^2$ to $(A_2)^{11}$ are presented in Table 1.

Table 1. Powers $(A_2)^2$ to $(A_2)^{11}$ of the special matrix $A_2$

| No. | Power of $A_2$ | Value |
|---|---|---|
| 1. | $(A_2)^2$ | $(a_1 + a_2)\,A_2$ |
| 2. | $(A_2)^3$ | $(a_1 + a_2)^2 A_2$ |
| 3. | $(A_2)^4$ | $(a_1 + a_2)^3 A_2$ |
| 4. | $(A_2)^5$ | $(a_1 + a_2)^4 A_2$ |
| 5. | $(A_2)^6$ | $(a_1 + a_2)^5 A_2$ |
| 6. | $(A_2)^7$ | $(a_1 + a_2)^6 A_2$ |
| 7. | $(A_2)^8$ | $(a_1 + a_2)^7 A_2$ |
| 8. | $(A_2)^9$ | $(a_1 + a_2)^8 A_2$ |
| 9. | $(A_2)^{10}$ | $(a_1 + a_2)^9 A_2$ |
| 10. | $(A_2)^{11}$ | $(a_1 + a_2)^{10} A_2$ |

From the values in Table 1, the recursive pattern suggests that the general form of the power of the special matrix is $(A_2)^m = (a_1 + a_2)^{m-1} A_2$. This conjecture is stated as Theorem 2.1 below.

Theorem 2.1 Given the special matrix $A_2 = \begin{bmatrix} a_1 & a_1 \\ a_2 & a_2 \end{bmatrix}$; $a_1, a_2 \in \mathbb{R}$, then

\[
(A_2)^m = (a_1 + a_2)^{m-1} A_2 \quad \text{with } m \in \mathbb{Z}^+ .
\tag{2.2}
\]

Proof: By mathematical induction. Let $p(m): (A_2)^m = (a_1 + a_2)^{m-1} A_2$.
1. For $m = 1$, $p(1)$ holds:
\[
(A_2)^1 = (a_1 + a_2)^{1-1} A_2 = (a_1 + a_2)^0 A_2 = A_2 .
\]
2. For $m = k$, assume that $p(k)$ is true, that is, $(A_2)^k = (a_1 + a_2)^{k-1} A_2$. It will be shown that $p(k+1)$ is then also true, that is,
\[
(A_2)^{k+1} = (a_1 + a_2)^{k} A_2 .
\tag{2.3}
\]
Indeed,
\[
\begin{aligned}
(A_2)^{k+1} &= (A_2)^k (A_2) \\
&= (a_1 + a_2)^{k-1} A_2 A_2 \\
&= (a_1 + a_2)^{k-1} (A_2)^2 \\
&= (a_1 + a_2)^{k-1} (a_1 + a_2) A_2 \\
&= (a_1 + a_2)^{(k-1)+1} A_2 \\
&= (a_1 + a_2)^{k} A_2 .
\end{aligned}
\]
Comparing with equation (2.3), $p(k+1)$ is true. Since steps (1) and (2) both hold, Theorem 2.1 is proved. ∎

For $n = 3$, the powers $(A_3)^2$ to $(A_3)^{11}$ are presented in Table 2.

Table 2. Powers $(A_3)^2$ to $(A_3)^{11}$ of the special matrix $A_3$

| No. | Power of $A_3$ | Value |
|---|---|---|
| 1. | $(A_3)^2$ | $(a_1 + a_2 + a_3)\,A_3$ |
| 2. | $(A_3)^3$ | $(a_1 + a_2 + a_3)^2 A_3$ |
| 3. | $(A_3)^4$ | $(a_1 + a_2 + a_3)^3 A_3$ |
| 4. | $(A_3)^5$ | $(a_1 + a_2 + a_3)^4 A_3$ |
| 5. | $(A_3)^6$ | $(a_1 + a_2 + a_3)^5 A_3$ |
| 6. | $(A_3)^7$ | $(a_1 + a_2 + a_3)^6 A_3$ |
| 7. | $(A_3)^8$ | $(a_1 + a_2 + a_3)^7 A_3$ |
| 8. | $(A_3)^9$ | $(a_1 + a_2 + a_3)^8 A_3$ |
| 9. | $(A_3)^{10}$ | $(a_1 + a_2 + a_3)^9 A_3$ |
| 10. | $(A_3)^{11}$ | $(a_1 + a_2 + a_3)^{10} A_3$ |

From the values in Table 2, the recursive pattern suggests that the general form of the power of the special matrix is $(A_3)^m = (a_1 + a_2 + a_3)^{m-1} A_3$. This conjecture is stated as Theorem 2.2 below.

Theorem 2.2 Given the special matrix $A_3 = \begin{bmatrix} a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 \end{bmatrix}$; $a_1, a_2, a_3 \in \mathbb{R}$, then

\[
(A_3)^m = (a_1 + a_2 + a_3)^{m-1} A_3 \quad \text{with } m \in \mathbb{Z}^+ .
\tag{2.4}
\]

Proof: Applying the same steps as in the proof of Theorem 2.1 proves this theorem. ∎

For $n = 4$, the powers $(A_4)^2$ to $(A_4)^{11}$ are presented in Table 3.

Table 3. Powers $(A_4)^2$ to $(A_4)^{11}$ of the special matrix $A_4$

| No. | Power of $A_4$ | Value |
|---|---|---|
| 1. | $(A_4)^2$ | $(a_1 + a_2 + a_3 + a_4)\,A_4$ |
| 2. | $(A_4)^3$ | $(a_1 + a_2 + a_3 + a_4)^2 A_4$ |
| 3. | $(A_4)^4$ | $(a_1 + a_2 + a_3 + a_4)^3 A_4$ |
| 4. | $(A_4)^5$ | $(a_1 + a_2 + a_3 + a_4)^4 A_4$ |
| 5. | $(A_4)^6$ | $(a_1 + a_2 + a_3 + a_4)^5 A_4$ |
| 6. | $(A_4)^7$ | $(a_1 + a_2 + a_3 + a_4)^6 A_4$ |
| 7. | $(A_4)^8$ | $(a_1 + a_2 + a_3 + a_4)^7 A_4$ |
| 8. | $(A_4)^9$ | $(a_1 + a_2 + a_3 + a_4)^8 A_4$ |
| 9. | $(A_4)^{10}$ | $(a_1 + a_2 + a_3 + a_4)^9 A_4$ |
| 10. | $(A_4)^{11}$ | $(a_1 + a_2 + a_3 + a_4)^{10} A_4$ |

From the values in Table 3, the recursive pattern suggests that the general form of the power of the special matrix is $(A_4)^m = (a_1 + a_2 + a_3 + a_4)^{m-1} A_4$. This conjecture is stated as Theorem 2.3 below.

Theorem 2.3 Given the special matrix $A_4 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 \end{bmatrix}$; $a_1, a_2, a_3, a_4 \in \mathbb{R}$, then

\[
(A_4)^m = (a_1 + a_2 + a_3 + a_4)^{m-1} A_4 \quad \text{with } m \in \mathbb{Z}^+ .
\tag{2.5}
\]

Proof: Following the proof of Theorem 2.1, this theorem is proved as well.
∎

For $n = 5$, the powers $(A_5)^2$ to $(A_5)^{11}$ are presented in Table 4.

Table 4. Powers $(A_5)^2$ to $(A_5)^{11}$ of the special matrix $A_5$

| No. | Power of $A_5$ | Value |
|---|---|---|
| 1. | $(A_5)^2$ | $(a_1 + a_2 + a_3 + a_4 + a_5)\,A_5$ |
| 2. | $(A_5)^3$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^2 A_5$ |
| 3. | $(A_5)^4$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^3 A_5$ |
| 4. | $(A_5)^5$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^4 A_5$ |
| 5. | $(A_5)^6$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^5 A_5$ |
| 6. | $(A_5)^7$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^6 A_5$ |
| 7. | $(A_5)^8$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^7 A_5$ |
| 8. | $(A_5)^9$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^8 A_5$ |
| 9. | $(A_5)^{10}$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^9 A_5$ |
| 10. | $(A_5)^{11}$ | $(a_1 + a_2 + a_3 + a_4 + a_5)^{10} A_5$ |

From the values in Table 4, the recursive pattern suggests that the general form of the power of the special matrix is $(A_5)^m = (a_1 + a_2 + a_3 + a_4 + a_5)^{m-1} A_5$. This conjecture is stated as Theorem 2.4 below.

Theorem 2.4 Given the special matrix $A_5 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 & a_4 \\ a_5 & a_5 & a_5 & a_5 & a_5 \end{bmatrix}$; $a_1, a_2, a_3, a_4, a_5 \in \mathbb{R}$, then

\[
(A_5)^m = (a_1 + a_2 + a_3 + a_4 + a_5)^{m-1} A_5 \quad \text{with } m \in \mathbb{Z}^+ .
\tag{2.6}
\]

Proof: This follows in the same way as the theorems above. ∎

Considering the recursive pattern of equations (2.2), (2.4), (2.5) and (2.6), namely

\[
\begin{aligned}
(A_2)^m &= (a_1 + a_2)^{m-1} A_2 \\
(A_3)^m &= (a_1 + a_2 + a_3)^{m-1} A_3 \\
(A_4)^m &= (a_1 + a_2 + a_3 + a_4)^{m-1} A_4 \\
(A_5)^m &= (a_1 + a_2 + a_3 + a_4 + a_5)^{m-1} A_5 ,
\end{aligned}
\]

it can be conjectured that the general form of the power of the special matrix of order $n \times n$, $n \ge 2$, in equation (2.1) is

\[
(A_n)^m = (a_1 + a_2 + \cdots + a_n)^{m-1} A_n = \left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n .
\]

This conjecture for the special matrix of order $n \times n$, $n \ge 2$, in equation (2.1) is stated as Theorem 2.5.
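The conjectured pattern can be checked by direct computation before it is proved. The following Python sketch compares repeated matrix multiplication with the closed form $(\sum a_i)^{m-1} A_n$ and also compares the trace with $(\sum a_i)^m$, using exact rational arithmetic. The helper names (`special_matrix`, `mat_pow`, `closed_form`, `trace`) and the sample row values are hypothetical and chosen only for illustration; a numerical check of this kind supports the conjecture but does not replace the proof.

```python
from fractions import Fraction


def special_matrix(a):
    """n x n matrix whose i-th row is constant and equal to a[i]."""
    n = len(a)
    return [[Fraction(ai)] * n for ai in a]


def mat_mul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(X, m):
    """X^m for an integer m >= 1, by repeated multiplication."""
    result = X
    for _ in range(m - 1):
        result = mat_mul(result, X)
    return result


def closed_form(a, m):
    """Conjectured value of (A_n)^m: (sum of a_i)^(m-1) times A_n."""
    s = sum(Fraction(ai) for ai in a)
    return [[s ** (m - 1) * entry for entry in row] for row in special_matrix(a)]


def trace(X):
    """Sum of the main-diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))


# Hypothetical sample rows; the first n values define A_n for n = 2, ..., 5.
sample = [2, -1, Fraction(1, 2), 4, 3]
for n in range(2, 6):
    a = sample[:n]
    s = sum(Fraction(ai) for ai in a)
    for m in range(1, 12):
        power = mat_pow(special_matrix(a), m)
        assert power == closed_form(a, m)    # pattern of (2.2)-(2.6)
        assert trace(power) == s ** m        # trace pattern
print("pattern holds for n = 2..5, m = 1..11 on the sample rows")
```

Exact fractions are used instead of floating point so that the equality checks are not affected by rounding.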
Theorem 2.5 Given the special matrix of order $n \times n$, $n \ge 2$,

\[
A_n = \begin{bmatrix}
a_1 & a_1 & \cdots & a_1 \\
a_2 & a_2 & \cdots & a_2 \\
\vdots & \vdots & \vdots & \vdots \\
a_i & a_i & \cdots & a_i \\
\vdots & \vdots & \vdots & \vdots \\
a_n & a_n & \cdots & a_n
\end{bmatrix}; \qquad a_i \in \mathbb{R},\ i = 1, 2, \ldots, n,
\]

then

\[
(A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n , \quad \text{with } m \in \mathbb{Z}^+ .
\]

Proof: Again by mathematical induction. Let $p(m): (A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n$, with $m \in \mathbb{Z}^+$.
1. For $m = 1$, $p(1)$ holds:
\[
(A_n)^1 = \left(\sum_{i=1}^{n} a_i\right)^{1-1} A_n = \left(\sum_{i=1}^{n} a_i\right)^{0} A_n = A_n .
\]
2. For $m = k$, assume that $p(k)$ is true, that is, $(A_n)^k = \left(\sum_{i=1}^{n} a_i\right)^{k-1} A_n$. It will be shown that $p(k+1)$ is then also true, that is,
\[
(A_n)^{k+1} = \left(\sum_{i=1}^{n} a_i\right)^{(k+1)-1} A_n = \left(\sum_{i=1}^{n} a_i\right)^{k} A_n .
\tag{2.7}
\]
The proof is as follows. Each entry in row $i$ of the product $A_n A_n$ is $a_i a_1 + a_i a_2 + \cdots + a_i a_n$, so
\[
\begin{aligned}
(A_n)^{k+1} &= (A_n)^k (A_n) \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k-1} A_n A_n \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k-1}
\begin{bmatrix}
a_1^2 + a_1a_2 + \cdots + a_1a_i + \cdots + a_1a_n & \cdots & a_1^2 + a_1a_2 + \cdots + a_1a_i + \cdots + a_1a_n \\
a_1a_2 + a_2^2 + \cdots + a_2a_i + \cdots + a_2a_n & \cdots & a_1a_2 + a_2^2 + \cdots + a_2a_i + \cdots + a_2a_n \\
\vdots & \vdots & \vdots \\
a_1a_i + a_2a_i + \cdots + a_i^2 + \cdots + a_ia_n & \cdots & a_1a_i + a_2a_i + \cdots + a_i^2 + \cdots + a_ia_n \\
\vdots & \vdots & \vdots \\
a_1a_n + a_2a_n + \cdots + a_ia_n + \cdots + a_n^2 & \cdots & a_1a_n + a_2a_n + \cdots + a_ia_n + \cdots + a_n^2
\end{bmatrix} \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k-1}
\begin{bmatrix}
(a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_1 & \cdots & (a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_1 \\
(a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_2 & \cdots & (a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_2 \\
\vdots & \vdots & \vdots \\
(a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_i & \cdots & (a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_i \\
\vdots & \vdots & \vdots \\
(a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_n & \cdots & (a_1 + a_2 + \cdots + a_i + \cdots + a_n)\,a_n
\end{bmatrix} \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k-1} (a_1 + a_2 + \cdots + a_i + \cdots + a_n)
\begin{bmatrix}
a_1 & a_1 & \cdots & a_1 \\
a_2 & a_2 & \cdots & a_2 \\
\vdots & \vdots & \vdots & \vdots \\
a_i & a_i & \cdots & a_i \\
\vdots & \vdots & \vdots & \vdots \\
a_n & a_n & \cdots & a_n
\end{bmatrix} \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k-1} \left(\sum_{i=1}^{n} a_i\right) A_n \\
&= \left(\sum_{i=1}^{n} a_i\right)^{(k-1)+1} A_n \\
&= \left(\sum_{i=1}^{n} a_i\right)^{k} A_n .
\end{aligned}
\]
Comparing with equation (2.7), $p(k+1)$ is true. Since steps (1) and (2) both hold, Theorem 2.5 is proved. ∎

Trace of the special matrix of order $n \times n$ ($n \ge 2$) raised to positive integer powers

This part gives the traces of the powers $(A_2)^m$, $(A_3)^m$, $(A_4)^m$, and $(A_5)^m$ of the special matrices, stated in Theorem 3.1 to Theorem 3.4 below.

Theorem 3.1 Given the special matrix $A_2 = \begin{bmatrix} a_1 & a_1 \\ a_2 & a_2 \end{bmatrix}$; $a_1, a_2 \in \mathbb{R}$, then

\[
\operatorname{Tr}(A_2)^m = (a_1 + a_2)^m , \quad \text{with } m \in \mathbb{Z}^+ .
\tag{3.1}
\]

Proof. The theorem is proved directly. For the given matrix $A_2$, $\operatorname{Tr}(A_2) = a_1 + a_2$. Theorem 2.1 gives equation (2.2), $(A_2)^m = (a_1 + a_2)^{m-1} A_2$. By Definition 1.2 and Theorem 1.2 (b),

\[
\begin{aligned}
\operatorname{Tr}(A_2)^m &= \operatorname{Tr}\!\left((a_1 + a_2)^{m-1} A_2\right) \\
&= (a_1 + a_2)^{m-1} \operatorname{Tr}(A_2) \\
&= (a_1 + a_2)^{m-1} (a_1 + a_2) \\
&= (a_1 + a_2)^{(m-1)+1} \\
&= (a_1 + a_2)^{m} .
\end{aligned}
\]

Hence Theorem 3.1 is proved. ∎

Theorem 3.2 Given the special matrix $A_3 = \begin{bmatrix} a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 \end{bmatrix}$; $a_1, a_2, a_3 \in \mathbb{R}$, then

\[
\operatorname{Tr}(A_3)^m = (a_1 + a_2 + a_3)^m , \quad \text{with } m \in \mathbb{Z}^+ .
\tag{3.2}
\]

Proof. This follows in the same way as the theorem above. ∎

Theorem 3.3 Given the special matrix $A_4 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 \end{bmatrix}$; $a_1, a_2, a_3, a_4 \in \mathbb{R}$, then

\[
\operatorname{Tr}(A_4)^m = (a_1 + a_2 + a_3 + a_4)^m , \quad \text{with } m \in \mathbb{Z}^+ .
\tag{3.3}
\]

Proof. The proof is analogous. ∎

Theorem 3.4 Given the special matrix $A_5 = \begin{bmatrix} a_1 & a_1 & a_1 & a_1 & a_1 \\ a_2 & a_2 & a_2 & a_2 & a_2 \\ a_3 & a_3 & a_3 & a_3 & a_3 \\ a_4 & a_4 & a_4 & a_4 & a_4 \\ a_5 & a_5 & a_5 & a_5 & a_5 \end{bmatrix}$; $a_1, a_2, a_3, a_4, a_5 \in \mathbb{R}$, then

\[
\operatorname{Tr}(A_5)^m = (a_1 + a_2 + a_3 + a_4 + a_5)^m , \quad \text{with } m \in \mathbb{Z}^+ .
\tag{3.4}
\]

Proof. This follows by the same argument as Theorem 3.1.
∎

Considering the recursive pattern in equations (3.1), (3.2), (3.3) and (3.4), namely

\[
\begin{aligned}
\operatorname{Tr}(A_2)^m &= (a_1 + a_2)^m \\
\operatorname{Tr}(A_3)^m &= (a_1 + a_2 + a_3)^m \\
\operatorname{Tr}(A_4)^m &= (a_1 + a_2 + a_3 + a_4)^m \\
\operatorname{Tr}(A_5)^m &= (a_1 + a_2 + a_3 + a_4 + a_5)^m ,
\end{aligned}
\]

it can be conjectured that the general form of the trace of the special matrix of order $n \times n$, $n \ge 2$, in equation (2.1) raised to a positive integer power is

\[
\operatorname{Tr}(A_n)^m = (a_1 + a_2 + \cdots + a_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m} .
\]

This conjecture for the special matrix of order $n \times n$, $n \ge 2$, is stated as Theorem 3.5 below.

Theorem 3.5 Given the special matrix of order $n \times n$, $n \ge 2$,

\[
A_n = \begin{bmatrix}
a_1 & a_1 & \cdots & a_1 \\
a_2 & a_2 & \cdots & a_2 \\
\vdots & \vdots & \vdots & \vdots \\
a_i & a_i & \cdots & a_i \\
\vdots & \vdots & \vdots & \vdots \\
a_n & a_n & \cdots & a_n
\end{bmatrix}, \qquad a_i \in \mathbb{R};\ i = 1, 2, \ldots, n,
\]

then

\[
\operatorname{Tr}(A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m} , \quad \text{with } m \in \mathbb{Z}^+ .
\]

Proof: The theorem is proved directly. For the given matrix $A_n$, $\operatorname{Tr}(A_n) = \sum_{i=1}^{n} a_i$. Theorem 2.5 gives $(A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n$. By Definition 1.2 and Theorem 1.2 (b),

\[
\begin{aligned}
\operatorname{Tr}(A_n)^m &= \operatorname{Tr}\!\left(\left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n\right) \\
&= \left(\sum_{i=1}^{n} a_i\right)^{m-1} \operatorname{Tr}(A_n) \\
&= \left(\sum_{i=1}^{n} a_i\right)^{m-1} \left(\sum_{i=1}^{n} a_i\right) \\
&= \left(\sum_{i=1}^{n} a_i\right)^{m} .
\end{aligned}
\]

Hence Theorem 3.5 is proved. ∎

The application of $(A_n)^m$ and $\operatorname{Tr}(A_n)^m$ in examples

The following examples illustrate Theorem 2.5 and Theorem 3.5.

Example 1. Consider the matrix

\[
A_4 = \begin{bmatrix} 3 & 3 & 3 & 3 \\ 12 & 12 & 12 & 12 \\ 25 & 25 & 25 & 25 \\ 10 & 10 & 10 & 10 \end{bmatrix} .
\]

Determine $(A_4)^{80}$ and $\operatorname{Tr}(A_4)^{80}$.

Solution: From the matrix $A_4$, $a_1 = 3$, $a_2 = 12$, $a_3 = 25$, and $a_4 = 10$.
By Theorem 2.5,

\[
\begin{aligned}
(A_4)^{80} &= \left(\sum_{i=1}^{4} a_i\right)^{80-1} A_4 = (a_1 + a_2 + a_3 + a_4)^{79} A_4 \\
&= (3 + 12 + 25 + 10)^{79} \begin{bmatrix} 3 & 3 & 3 & 3 \\ 12 & 12 & 12 & 12 \\ 25 & 25 & 25 & 25 \\ 10 & 10 & 10 & 10 \end{bmatrix}
= (50)^{79} \begin{bmatrix} 3 & 3 & 3 & 3 \\ 12 & 12 & 12 & 12 \\ 25 & 25 & 25 & 25 \\ 10 & 10 & 10 & 10 \end{bmatrix} .
\end{aligned}
\]

By Theorem 3.5,

\[
\operatorname{Tr}(A_4)^{80} = \left(\sum_{i=1}^{4} a_i\right)^{80} = (a_1 + a_2 + a_3 + a_4)^{80} = (3 + 12 + 25 + 10)^{80} = (50)^{80} .
\]

Example 2. Given the matrix

\[
A_5 = \begin{bmatrix}
8 & 8 & 8 & 8 & 8 \\
5/16 & 5/16 & 5/16 & 5/16 & 5/16 \\
-12 & -12 & -12 & -12 & -12 \\
2/3 & 2/3 & 2/3 & 2/3 & 2/3 \\
-5/12 & -5/12 & -5/12 & -5/12 & -5/12
\end{bmatrix} ,
\]

determine $(A_5)^{27}$ and $\operatorname{Tr}(A_5)^{27}$.

Solution: From the matrix $A_5$, $a_1 = 8$, $a_2 = 5/16$, $a_3 = -12$, $a_4 = 2/3$ and $a_5 = -5/12$. By Theorem 2.5,

\[
\begin{aligned}
(A_5)^{27} &= \left(\sum_{i=1}^{5} a_i\right)^{27-1} A_5 = (a_1 + a_2 + a_3 + a_4 + a_5)^{26} A_5 \\
&= \left(8 + (5/16) + (-12) + (2/3) + (-5/12)\right)^{26} A_5 \\
&= \left(-3\tfrac{7}{16}\right)^{26}
\begin{bmatrix}
8 & 8 & 8 & 8 & 8 \\
5/16 & 5/16 & 5/16 & 5/16 & 5/16 \\
-12 & -12 & -12 & -12 & -12 \\
2/3 & 2/3 & 2/3 & 2/3 & 2/3 \\
-5/12 & -5/12 & -5/12 & -5/12 & -5/12
\end{bmatrix} .
\end{aligned}
\]

By Theorem 3.5,

\[
\operatorname{Tr}(A_5)^{27} = \left(\sum_{i=1}^{5} a_i\right)^{27} = (a_1 + a_2 + a_3 + a_4 + a_5)^{27} = \left(8 + (5/16) + (-12) + (2/3) + (-5/12)\right)^{27} = \left(-3\tfrac{7}{16}\right)^{27} .
\]

Conclusions

Based on the discussion in the previous parts, the following conclusions can be drawn.
1. The general form of the positive integer power of the special matrix of order $n \times n$, $n \ge 2$, in equation (2.1) is
\[
(A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m-1} A_n , \quad \text{with } m \in \mathbb{Z}^+ .
\]
2. The general form of the trace of the positive integer power of the special matrix of order $n \times n$, $n \ge 2$, in equation (2.1) is
\[
\operatorname{Tr}(A_n)^m = \left(\sum_{i=1}^{n} a_i\right)^{m} , \quad \text{with } m \in \mathbb{Z}^+ .
\]

Acknowledgments

We thank all authors (RR, AC, FA, CCM, and YM), who designed the research and approved the final manuscript. RR and AC wrote the manuscript; FA and CCM gave suggestions and edited the manuscript; YM read and edited the final content of the manuscript.
None of the authors had a conflict of interest.

References

[1] C. Brezinski, P. Fika, and M. Mitrouli, "Estimations of the trace of powers of positive self-adjoint operators by extrapolation of the moments," Electron. Trans. Numer. Anal., vol. 39, pp. 144-155, 2012.
[2] B. N. Datta and K. Datta, "An algorithm for computing powers of a Hessenberg matrix and its applications," Linear Algebra Appl., vol. 14, no. 3, pp. 273-284, 1976, doi: 10.1016/0024-3795(76)90072-0.
[3] V. Pan, "Estimating the extremal eigenvalues of a symmetric matrix," Comput. Math. with Appl., vol. 20, no. 2, pp. 17-22, 1990, doi: 10.1016/0898-1221(90)90236-D.
[4] M. T. Chu, "Symbolic calculation of the trace of the power of a tridiagonal matrix," vol. 268, pp. 257-268, 1985.
[5] N. Higham, "Functions of Matrices: Theory and Computation, Chapter 5 Matrix Sign Function," Soc. Ind. Appl. Math., no. March, p. 445, 2008.
[6] A. V. Zarelua, "On congruences for the traces of powers of some matrices," Proc. Steklov Inst. Math., vol. 263, no. 1, pp. 78-98, 2008, doi: 10.1134/S008154380804007X.
[7] J. Pahade and M. Jha, "Trace of positive integer power of real 2 x 2 matrices," Adv. Linear Algebr. & Matrix Theory, vol. 05, no. 04, pp. 150-155, 2015, doi: 10.4236/alamt.2015.54015.
[8] H. Avron, "Counting triangles in large graphs using randomized matrix trace estimation," KDD-LDMTA, 2010.
[9] J. K. Pahade and M. Jha, "Trace of positive integer power of adjacency matrix," Glob. J. Pure Appl. Math., vol. 13, no. 6, pp. 2079-2087, 2017.
[10] H. Anton and C. Rorres, Elementary Linear Algebra: Applications Version, 11th ed. 2013.
[11] J. Asquith and B. Kolman, Elementary Linear Algebra, vol. 71, no. 457. 1987.
[12] J. E. Gentle, Matrix Algebra, Theory, Computations, and Applications in Statistics, vol. 102. 2009.
CAUCHY Jurnal Matematika Murni dan Aplikasi, Volume 7, Issue 1, November 2021, ISSN: 2086-0382, E-ISSN: 2477-3344

Publication Ethics

CAUCHY: Jurnal Matematika Murni dan Aplikasi is a peer-reviewed electronic national journal. This statement clarifies the ethical behaviour of all parties involved in the act of publishing an article in this journal, including the author, the chief editor, the editorial board, the peer reviewer and the publisher (Mathematics Department of Maulana Malik Ibrahim State Islamic University of Malang). This statement is based on COPE's Best Practice Guidelines for Journal Editors.

Ethical Guideline for Journal Publication

The publication of an article in a peer-reviewed CAUCHY is an essential building block in the development of a coherent and respected network of knowledge. It is a direct reflection of the quality of the work of the authors and the institutions that support them. Peer-reviewed articles support and embody the scientific method. It is therefore important to agree upon standards of expected ethical behavior for all parties involved in the act of publishing: the author, the journal editor, the peer reviewer, the publisher and the society.

As publisher of a pure and applied mathematics journal, we take our duties of guardianship over all stages of publishing seriously and we recognize our ethical and other responsibilities. We are committed to ensuring that advertising, reprint or other commercial revenue has no impact or influence on editorial decisions.

Publication Decisions

The editor of CAUCHY is responsible for deciding which of the articles submitted to the journal should be published. The validation of the work in question and its importance to researchers and readers must always drive such decisions. The editors may be guided by the policies of the journal's editorial board and constrained by such legal requirements as shall then be in force regarding libel, copyright infringement and plagiarism.
The editors may confer with other editors or reviewers in making this decision.

Fair Play

An editor at any time evaluates manuscripts for their intellectual content without regard to race, gender, sexual orientation, religious belief, ethnic origin, citizenship, or political philosophy of the authors.

Confidentiality

The editor and any editorial staff must not disclose any information about a submitted manuscript to anyone other than the corresponding author, reviewers, potential reviewers, other editorial advisers, and the publisher, as appropriate. Any manuscripts received for review must be treated as confidential documents. They must not be shown to or discussed with others except as authorized by the editor.

Disclosure and Conflicts of Interest

Unpublished materials disclosed in a submitted manuscript must not be used in an editor's own research without the express written consent of the author.

Contribution to Editorial Decisions

Peer review assists the editor in making editorial decisions and, through the editorial communications with the author, may also assist the author in improving the paper.

Promptness

Any selected referee who feels unqualified to review the research reported in a manuscript or knows that its prompt review will be impossible should notify the editor and excuse himself from the review process.

Standards of Objectivity

Reviews should be conducted objectively. Personal criticism of the author is inappropriate. Referees should express their views clearly with supporting arguments.

Acknowledgement of Sources

Reviewers should identify relevant published work that has not been cited by the authors. Any statement that an observation, derivation, or argument had been previously reported should be accompanied by the relevant citation.
a reviewer should also call to the editor's attention any substantial similarity or overlap between the manuscript under consideration and any other published paper of which they have personal knowledge.

disclosure and conflict of interest

privileged information or ideas obtained through peer review must be kept confidential and not used for personal advantage. reviewers should not consider manuscripts in which they have conflicts of interest resulting from competitive, collaborative, or other relationships or connections with any of the authors, companies, or institutions connected to the papers.

reporting standards

authors of reports of original research should present an accurate account of the work performed as well as an objective discussion of its significance. underlying data should be represented accurately in the paper. a paper should contain sufficient detail and references to permit others to replicate the work. fraudulent or knowingly inaccurate statements constitute unethical behavior and are unacceptable.

data access and retention

authors are asked to provide the raw data in connection with a paper for editorial review and should be prepared to provide public access to such data (consistent with the alpsp-stm statement on data and databases), if practicable, and should in any event be prepared to retain such data for a reasonable time after publication.

originality and plagiarism

the authors should ensure that they have written entirely original works, and if the authors have used the work and/or words of others, that this has been appropriately cited or quoted.

multiple, redundant or concurrent publication

an author should not in general publish manuscripts describing essentially the same research in more than one journal or primary publication.
submitting the same manuscript to more than one journal concurrently constitutes unethical publishing behavior and is unacceptable.

acknowledgement of sources

proper acknowledgment of the work of others must always be given. authors should cite publications that have been influential in determining the nature of the reported work.

authorship of the paper

authorship should be limited to those who have made a significant contribution to the conception, design, execution, or interpretation of the reported study. all those who have made significant contributions should be listed as co-authors. where there are others who have participated in certain substantive aspects of the research project, they should be acknowledged or listed as contributors. the corresponding author should ensure that all appropriate co-authors and no inappropriate co-authors are included on the paper, and that all co-authors have seen and approved the final version of the paper and have agreed to its submission for publication.

hazards and human or animal subjects

if the work involves chemicals, procedures or equipment that have any unusual hazards inherent in their use, the author must clearly identify these in the manuscript.

disclosure and conflicts of interest

all authors should disclose in their manuscript any financial or other substantive conflict of interest that might be construed to influence the results or interpretation of their manuscript. all sources of financial support for the project should be disclosed.

fundamental errors in published works

when an author discovers a significant error or inaccuracy in his/her own published work, it is the author’s obligation to promptly notify the journal editor or publisher and cooperate with the editor to retract or correct the paper.
acknowledgment to reviewers in this issue

the contributions and valuable comments of the following reviewers in this issue are greatly appreciated:

bety hayat susanti, politeknik siber dan sandi negara, indonesia
dian savitri, universitas negeri surabaya, indonesia
meta kallista, universitas telkom, indonesia
dani suandi, universitas bina nusantara, bandung, indonesia
anwar fitrianto, department of statistics, ipb university, indonesia
subanar seno, gadjah mada university, indonesia
arief fatchul huda, uin sunan gunung djati bandung, indonesia
usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia
riswan efendi, uin sultan syarif kasim riau, indonesia
sri harini, universitas islam negeri maulana malik ibrahim malang, indonesia
heni widayani, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia
corina karim, brawijaya university
fachrur rozi, universitas islam negeri maulana malik ibrahim malang, indonesia

optimal control and cost-effectiveness analysis in an epidemic model with viral mutation and vaccine intervention

cauchy – jurnal matematika murni dan aplikasi volume 7(2) (2022), pages 173-185 p-issn: 2086-0382; e-issn: 2477-3344

submitted: august 23, 2021 reviewed: november 17, 2021
accepted: december 22, 2021 doi: http://dx.doi.org/10.18860/ca.v7i1.13184

yudi ari adi 1,*, nursyiva irsalinda 1, meksianis z. ndii 2

1 department of mathematics, faculty of applied science and technology, ahmad dahlan university, yogyakarta, indonesia
2 department of mathematics, faculty of sciences and engineering, university of nusa cendana, indonesia

* corresponding author
email: yudi.adi@math.uad.ac.id*, nursyiva.irsalinda@math.uad.ac.id, meksianis.ndii@staf.undana.ac.id

abstract

the existence of viral mutations in various infectious diseases can make it difficult to overcome outbreaks caused by these viruses. in this paper, we introduce an optimal control problem in a two-strain sir epidemic model with viral mutation and vaccine administration. the purpose of this study was to investigate the efficacy and cost-effectiveness of two disease prevention strategies, namely restriction of community mobility to prevent disease transmission and vaccine intervention. we consider the time-dependent control case, and we use pontryagin’s maximum principle to derive necessary conditions for the optimal control of the disease. we also calculate the average cost-effectiveness ratio (acer) and the incremental cost-effectiveness ratio (icer) to investigate the cost-effectiveness of all possible strategies of the control measures. the results of this study indicate that the most cost-effective disease control strategy is a combination of mobility restriction and vaccination.

keywords: epidemic model; cost-effectiveness analysis; numerical simulation; optimal control; viral mutation

introduction

epidemiological modeling is a field of mathematical modeling that studies the causes, patterns, and effects of disease on health in a population.
the sir (susceptible, infected, recovered) compartment model, first introduced by kermack and mckendrick in 1927, became the basis for developing models of the spread of infectious diseases. depending on the characteristics of the disease, different epidemic models have been developed and studied by adding or modifying compartments, for example compartments for vaccination [1]–[3], treatment [4], quarantine [5], the viruses or bacteria that cause the disease [6], disease-carrying vectors [7], and others.

in various infectious diseases caused by viruses, viral mutations make the epidemic difficult to overcome quickly. the emergence of new variants of a virus lengthens the epidemic period. such conditions are currently occurring in various parts of the world in the covid-19 pandemic. in indonesia in particular, after a decline in cases lasting about nine months since the beginning of the pandemic in march 2020, the number of positive covid-19 cases increased again in mid-june 2021. the government has adopted various policies to end the spread of covid-19 as soon as possible. besides targeting vaccinations, the government is currently implementing community activity restrictions (ppkm) to control the spread of the covid-19 outbreak. many mathematical models of covid-19 have also been developed, as in [5], [8]–[12].

in the last few decades, optimal control theory has developed rapidly, and its diverse applications are widely used in various scientific and engineering fields.
this theory has proven to be effective in mathematical epidemiology when it comes to determining how to eliminate or reduce the number of cases at the lowest possible cost. optimal control theory has been used to capture intervention strategies in many studies; see, for example, [5], [7], [10], [13]–[16]. optimal control models involving vaccination strategies have also been developed, as in [3], [16], [17]. however, these models did not consider the presence of viral mutations, with the mutated virus presumed to be more virulent than the pre-mutation virus. for example, the more easily transmissible strain of sars-cov-2, b.1.1.7, has been found in 12 states across the united states [18].

in this article, we discuss the sir epidemic model by considering the presence of viral mutations. we also consider vaccine intervention as one means of disease prevention. motivated by this, we modify the epidemic model with virus mutation and vaccine interventions studied in adi et al. [19]. instead of constant parameters for the intervention strategy, we use control functions to express the intervention strategy in this model. the goal is to find the best function for a given control measure by applying pontryagin’s maximum principle [20]. this study also examines which control strategy is the most cost-effective, as determined by the average cost-effectiveness ratio (acer) and the incremental cost-effectiveness ratio (icer), as defined in [21]–[24]. besides being applied to the spread of covid-19, the model can also be used for other diseases involving viral mutations.

the structure of this paper is as follows. the methods used in our research are described in the following section. after that, the analysis of the model is discussed. finally, we provide a brief summary of our work.

methods

the optimal control problem is analyzed by performing the following steps:
1. we consider a modified sir epidemic model that takes into account the presence of viral mutations and vaccine intervention.
2. we consider the time-dependent control case and use pontryagin's maximum principle to obtain the necessary conditions for optimal disease control.
3. we demonstrate numerically the existence of the optimal control by implementing the forward-backward fourth-order runge-kutta method.
4. we compute the average cost-effectiveness ratio (acer) and the incremental cost-effectiveness ratio (icer) to investigate the cost-effectiveness of all possible control strategies.

results and discussion

formulation of the optimal control problem

modifying the standard sir model, adi et al. [19] developed an epidemic model that takes into account the presence of viral mutations and vaccine interventions. mutations are captured by terms that transfer an individual infected with one strain to the class of individuals infected with the other strain. the population is subdivided into five classes: susceptible ($S$), infected by strain one ($I_1$), infected by strain two ($I_2$), vaccinated ($V$), and recovered ($R$). the model is given in (1) below.

\[
\begin{aligned}
\frac{dS}{dt} &= \Lambda - \beta_1 S I_1 - \beta_2 S I_2 - \gamma S - \mu S,\\
\frac{dI_1}{dt} &= \beta_1 S I_1 - (\omega + \alpha_1 + c + \mu) I_1,\\
\frac{dI_2}{dt} &= \beta_2 S I_2 + \omega I_1 + (1 - \varepsilon) V I_2 - (\alpha_2 + d + \mu) I_2,\\
\frac{dV}{dt} &= \gamma S - (1 - \varepsilon) V I_2 - \mu V,\\
\frac{dR}{dt} &= \alpha_1 I_1 + \alpha_2 I_2 - \mu R.
\end{aligned}
\tag{1}
\]

the first four equations in system (1) do not depend on $R$, so the fifth equation is neglected when analyzing the dynamics of the model. please refer to [19] for details. next, paying attention only to the first four equations, we introduce time-dependent controls in system (1). the purpose is to control the spread of the disease and to study strategies for eradicating epidemics in a community.
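as a quick consistency check on system (1), the sketch below codes its right-hand side in python and verifies that the five equations sum to $\Lambda - \mu N - c I_1 - d I_2$ with $N = S + I_1 + I_2 + V + R$, so that the transmission, mutation and vaccination terms only move individuals between classes. the function name `model_rhs`, the parameter values and the sample state are hypothetical and chosen for illustration only; they are not taken from [19].

```python
def model_rhs(state, p):
    """Right-hand side of system (1) for state = (S, I1, I2, V, R)."""
    S, I1, I2, V, R = state
    dS = p["Lam"] - p["b1"] * S * I1 - p["b2"] * S * I2 - p["gam"] * S - p["mu"] * S
    dI1 = p["b1"] * S * I1 - (p["om"] + p["a1"] + p["c"] + p["mu"]) * I1
    dI2 = (p["b2"] * S * I2 + p["om"] * I1 + (1 - p["eps"]) * V * I2
           - (p["a2"] + p["d"] + p["mu"]) * I2)
    dV = p["gam"] * S - (1 - p["eps"]) * V * I2 - p["mu"] * V
    dR = p["a1"] * I1 + p["a2"] * I2 - p["mu"] * R
    return (dS, dI1, dI2, dV, dR)


# Hypothetical parameter values and state (illustration only, not those of [19]).
params = dict(Lam=0.02, b1=0.3, b2=0.4, gam=0.1, mu=0.02, om=0.05,
              a1=0.1, a2=0.1, c=0.01, d=0.02, eps=0.8)
state = (0.7, 0.1, 0.05, 0.1, 0.05)

# Summing the five equations cancels every transfer term between classes.
total_rate = sum(model_rhs(state, params))
N = sum(state)
expected = (params["Lam"] - params["mu"] * N
            - params["c"] * state[1] - params["d"] * state[2])
```

the two quantities `total_rate` and `expected` agree up to rounding, which is a cheap way to catch sign errors when the model is transcribed into code.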
we introduce two control functions, $u_1(t)$ and $u_2(t)$, which represent, respectively, efforts to prevent disease transmission from both viral strains and vaccination. the corresponding state system is given by:

\[
\begin{aligned}
\frac{dS}{dt} &= \Lambda - (1 - u_1(t))(\beta_1 I_1 + \beta_2 I_2) S - u_2(t) S - \mu S,\\
\frac{dI_1}{dt} &= (1 - u_1(t)) \beta_1 S I_1 - (\omega + \alpha_1 + c + \mu) I_1,\\
\frac{dI_2}{dt} &= (1 - u_1(t)) \beta_2 S I_2 + \omega I_1 + (1 - \varepsilon) V I_2 - (\alpha_2 + d + \mu) I_2,\\
\frac{dV}{dt} &= u_2(t) S - (1 - \varepsilon) V I_2 - \mu V,
\end{aligned}
\tag{2}
\]

where $u_1(t)$ is a control strategy that keeps uninfected individuals in the susceptible class by reducing the rate at which individuals leave the susceptible class for the infected classes, either by strain one or by strain two, and $u_2(t)$ is a control strategy to increase the number of individuals vaccinated. medically, since both strategies have many limitations and are therefore not fully effective, it is realistic to assume that $0 \le u_i^{\max} < 1$, $i = 1, 2$. hence, the set of admissible controls, consisting of bounded lebesgue measurable functions, is

\[
\Omega = \{ (u_1(t), u_2(t)) \mid 0 \le u_i(t) \le u_i^{\max},\ i = 1, 2,\ t \in [0, T] \}. \tag{3}
\]

the aim is to find the optimal value $u_i^*$ of the control $u_i(t)$ on the time interval $[0, T]$ such that the associated state trajectories $X^* = (S^*, I_1^*, I_2^*, V^*)$ are solutions of system (2) on $[0, T]$ with the initial conditions

\[
S(0) \ge 0,\quad I_1(0) \ge 0,\quad I_2(0) \ge 0,\quad V(0) \ge 0, \tag{4}
\]

and $u_i^*$ maximizes the objective functional given by

\[
J(u_1, u_2) = \int_0^T \left[ w_1 S(t) + w_2 V(t) - w_3 I_1(t) - w_4 I_2(t) - \frac{C_1 u_1^2(t)}{2} - \frac{C_2 u_2^2(t)}{2} \right] dt, \tag{5}
\]

where $w_1, w_2, w_3, w_4, C_1, C_2$ are positive weight constants. we want to maximize the susceptible individuals $S(t)$ and the vaccinated individuals $V(t)$ and to minimize the individuals infected by strain one, $I_1(t)$, and by strain two, $I_2(t)$ (their negative signs in (5) mean that maximizing $J$ minimizes them), while keeping the cost of prevention $u_1(t)$ and the cost of vaccination $u_2(t)$ low.
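to make the objective functional (5) concrete, the following python sketch approximates $J(u_1, u_2)$ by the trapezoidal rule from trajectories sampled on a time grid. the function name `objective` and the constant sample trajectories are hypothetical and serve only to exercise the formula; the default weights (all $w_i = 1$, $C_1 = C_2 = 2$) are the ones the numerical section of the paper uses.

```python
def objective(t, S, V, I1, I2, u1, u2, w=(1.0, 1.0, 1.0, 1.0), C=(2.0, 2.0)):
    """Trapezoidal approximation of J(u1, u2) in (5) on the time grid t."""
    # Integrand of (5) evaluated at every grid point.
    f = [w[0] * S[k] + w[1] * V[k] - w[2] * I1[k] - w[3] * I2[k]
         - 0.5 * C[0] * u1[k] ** 2 - 0.5 * C[1] * u2[k] ** 2
         for k in range(len(t))]
    # Composite trapezoidal rule over consecutive grid points.
    return sum(0.5 * (f[k] + f[k + 1]) * (t[k + 1] - t[k])
               for k in range(len(t) - 1))


# Constant, made-up trajectories on [0, 10], used only to exercise the function.
n = 100
t = [10.0 * k / n for k in range(n + 1)]


def const(v):
    return [v] * (n + 1)


# Integrand: 0.8 + 0.1 - 0.05 - 0.05 - 0.25 - 0.25 = 0.3, so J is about 3.0.
J = objective(t, const(0.8), const(0.1), const(0.05), const(0.05),
              const(0.5), const(0.5))
```

with real simulation output, the same function can be used to compare the value of $J$ achieved by different control profiles.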
the cost of the prevention program could come from implementing restrictions on citizen mobility, quarantine, or local lockdowns. similarly, the cost of vaccination comes from everything needed to implement the vaccination program. our optimal control problem is to determine $(S^*, I_1^*, I_2^*, V^*)$ associated with an admissible control $u_i^*$ on the time interval $[0, T]$, satisfying system (2) and the initial conditions (4) and maximizing the objective functional (5), such that

\[
J(u_1^*, u_2^*) = \max_{\Omega} J(u_1, u_2). \tag{6}
\]

here, the integrand of the objective functional, regarded as a function of $u_1$ and $u_2$, is concave with respect to the controls $u_i$ because of the quadratic terms $-C_i u_i^2/2$ with $C_i > 0$. from this property, and noting that the control system also satisfies the lipschitz property with respect to the state variables $(S, I_1, I_2, V)$, it is ensured that an optimal control of the optimal control problem (6) exists. hence, the maximum value can be attained [25]–[27].

characterization of the optimal controls

in order to apply pontryagin's maximum principle, system (2) and the objective functional (5) need to be converted into a pointwise hamiltonian $\mathcal{H}$ with respect to $(u_1, u_2)$, and we get

\[
\begin{aligned}
\mathcal{H} ={}& w_1 S(t) + w_2 V(t) - w_3 I_1(t) - w_4 I_2(t) - \frac{C_1 u_1^2}{2} - \frac{C_2 u_2^2}{2}\\
&+ \lambda_1 \left[ \Lambda - (1 - u_1)(\beta_1 I_1 + \beta_2 I_2) S - u_2 S - \mu S \right]\\
&+ \lambda_2 \left[ (1 - u_1) \beta_1 S I_1 - (\omega + \alpha_1 + c + \mu) I_1 \right]\\
&+ \lambda_3 \left[ (1 - u_1) \beta_2 S I_2 + \omega I_1 + (1 - \varepsilon) V I_2 - (\alpha_2 + d + \mu) I_2 \right]\\
&+ \lambda_4 \left[ u_2 S - (1 - \varepsilon) V I_2 - \mu V \right],
\end{aligned}
\tag{7}
\]

where $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ are the costate (adjoint) variables associated with the state variables $S, I_1, I_2, V$. we summarize the necessary conditions for the optimal controls $u_i^*$, $i = 1, 2$, in theorem 1 below.

theorem 1. there exist optimal controls $u_i^*$, $i = 1, 2$, and a corresponding optimal solution $(S^*, I_1^*, I_2^*, V^*)$ that maximize the objective functional $J(u_1, u_2)$ over $\Omega$. moreover, there exist adjoint variables $\lambda_j$, $j = 1, 2, 3, 4$, that satisfy $\frac{d\lambda_j}{dt} = -\frac{\partial \mathcal{H}}{\partial X_j}$, where $X = (S, I_1, I_2, V)$, with transversality conditions $\lambda_j(T) = 0$, $j = 1, 2, 3, 4$.
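furthermore, the associated optimal controls $u_i^*$, $i = 1, 2$, are given by

\[
\begin{aligned}
u_1^* &= \min\left\{ \max\left\{ 0, \frac{(\lambda_1 - \lambda_2) \beta_1 I_1^* S^* + (\lambda_1 - \lambda_3) \beta_2 I_2^* S^*}{C_1} \right\}, u_1^{\max} \right\},\\
u_2^* &= \min\left\{ \max\left\{ 0, \frac{(\lambda_4 - \lambda_1) S^*}{C_2} \right\}, u_2^{\max} \right\}.
\end{aligned}
\tag{8}
\]

proof. the adjoint system is derived by taking the partial derivatives of the hamiltonian $\mathcal{H}$ with respect to the associated state variables, so that

\[
\begin{aligned}
\frac{d\lambda_1}{dt} &= -\frac{\partial \mathcal{H}}{\partial S} = -w_1 + (\lambda_1 - \lambda_2)(1 - u_1) \beta_1 I_1 + (\lambda_1 - \lambda_3)(1 - u_1) \beta_2 I_2 + (u_2 + \mu) \lambda_1 - u_2 \lambda_4,\\
\frac{d\lambda_2}{dt} &= -\frac{\partial \mathcal{H}}{\partial I_1} = w_3 + (\lambda_1 - \lambda_2)(1 - u_1) \beta_1 S + (\omega + \alpha_1 + c + \mu) \lambda_2 - \omega \lambda_3,\\
\frac{d\lambda_3}{dt} &= -\frac{\partial \mathcal{H}}{\partial I_2} = w_4 + (\lambda_1 - \lambda_3)(1 - u_1) \beta_2 S + (\alpha_2 + d + \mu) \lambda_3 + (\lambda_4 - \lambda_3)(1 - \varepsilon) V,\\
\frac{d\lambda_4}{dt} &= -\frac{\partial \mathcal{H}}{\partial V} = -w_2 + (\lambda_4 - \lambda_3)(1 - \varepsilon) I_2 + \mu \lambda_4,
\end{aligned}
\tag{9}
\]

along with the transversality conditions $\lambda_j(T) = 0$, $j = 1, 2, 3, 4$. then, the optimal controls $u_i^*$ are obtained by solving $\frac{\partial \mathcal{H}}{\partial u_i} = 0$. this leads to the optimality conditions

\[
\begin{aligned}
\frac{\partial \mathcal{H}}{\partial u_1} &= -C_1 u_1 + (\lambda_1 - \lambda_2) \beta_1 I_1^* S^* + (\lambda_1 - \lambda_3) \beta_2 I_2^* S^* = 0,\\
\frac{\partial \mathcal{H}}{\partial u_2} &= -C_2 u_2 + (\lambda_4 - \lambda_1) S^* = 0.
\end{aligned}
\]

hence, we have

\[
u_1 = \frac{(\lambda_1 - \lambda_2) \beta_1 I_1^* S^* + (\lambda_1 - \lambda_3) \beta_2 I_2^* S^*}{C_1}, \qquad
u_2 = \frac{(\lambda_4 - \lambda_1) S^*}{C_2}. \tag{10}
\]

since $u_i^*$, $i = 1, 2$, must belong to $\Omega$, we get

\[
u_1^* =
\begin{cases}
0, & \text{if } u_1 \le 0,\\
\dfrac{(\lambda_1 - \lambda_2) \beta_1 I_1^* S^* + (\lambda_1 - \lambda_3) \beta_2 I_2^* S^*}{C_1}, & \text{if } 0 < u_1 < u_1^{\max},\\
u_1^{\max}, & \text{if } u_1 \ge u_1^{\max},
\end{cases}
\qquad
u_2^* =
\begin{cases}
0, & \text{if } u_2 \le 0,\\
\dfrac{(\lambda_4 - \lambda_1) S^*}{C_2}, & \text{if } 0 < u_2 < u_2^{\max},\\
u_2^{\max}, & \text{if } u_2 \ge u_2^{\max},
\end{cases}
\]

which can also be characterized by

\[
\begin{aligned}
u_1^* &= \min\left\{ \max\left\{ 0, \frac{(\lambda_1 - \lambda_2) \beta_1 I_1^* S^* + (\lambda_1 - \lambda_3) \beta_2 I_2^* S^*}{C_1} \right\}, u_1^{\max} \right\},\\
u_2^* &= \min\left\{ \max\left\{ 0, \frac{(\lambda_4 - \lambda_1) S^*}{C_2} \right\}, u_2^{\max} \right\}.
\end{aligned}
\tag{11}
\]

this completes the proof.

the following section provides numerical simulations of the optimality system, the control profiles, and discussions.

numerical results and discussion

we observe the optimal trajectories of the optimality system through some numerical simulations.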
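the optimality system, i.e. the state system (2), the adjoint equations and the control characterization (8), is usually solved iteratively. the python sketch below illustrates one common variant of the forward-backward sweep with fourth-order runge-kutta steps, in the spirit of [20]; it is not the authors' implementation. all parameter values in `P`, the initial state, the horizon, the grid size, the fixed iteration count and the relaxation factor 0.5 are hypothetical choices made so that the sketch runs, whereas the weights and control bounds follow the values quoted in the text. the adjoint right-hand side is obtained by differentiating the hamiltonian (7) with respect to each state variable.

```python
# Hypothetical parameter values, chosen only so that the sketch runs;
# they are NOT the values taken from [19].
P = dict(Lam=0.02, b1=0.3, b2=0.4, mu=0.02, om=0.05,
         a1=0.1, a2=0.1, c=0.01, d=0.02, eps=0.8)
W = (1.0, 1.0, 1.0, 1.0)   # weights w1..w4
C = (2.0, 2.0)             # cost weights C1, C2
UMAX = (0.5, 0.7)          # upper bounds for prevention (u1) and vaccination (u2)


def state_rhs(x, u, p=P):
    """Right-hand side of the controlled state system (2)."""
    S, I1, I2, V = x
    u1, u2 = u
    force = (1 - u1) * (p["b1"] * I1 + p["b2"] * I2) * S
    dS = p["Lam"] - force - u2 * S - p["mu"] * S
    dI1 = (1 - u1) * p["b1"] * S * I1 - (p["om"] + p["a1"] + p["c"] + p["mu"]) * I1
    dI2 = ((1 - u1) * p["b2"] * S * I2 + p["om"] * I1 + (1 - p["eps"]) * V * I2
           - (p["a2"] + p["d"] + p["mu"]) * I2)
    dV = u2 * S - (1 - p["eps"]) * V * I2 - p["mu"] * V
    return [dS, dI1, dI2, dV]


def adjoint_rhs(lam, x, u, p=P, w=W):
    """Adjoint right-hand side, d(lam_j)/dt = -dH/dX_j, derived from (7)."""
    S, I1, I2, V = x
    l1, l2, l3, l4 = lam
    u1, u2 = u
    dl1 = (-w[0] + (l1 - l2) * (1 - u1) * p["b1"] * I1
           + (l1 - l3) * (1 - u1) * p["b2"] * I2 + (l1 - l4) * u2 + p["mu"] * l1)
    dl2 = (w[2] + (l1 - l2) * (1 - u1) * p["b1"] * S
           + (p["om"] + p["a1"] + p["c"] + p["mu"]) * l2 - p["om"] * l3)
    dl3 = (w[3] + (l1 - l3) * (1 - u1) * p["b2"] * S
           + (p["a2"] + p["d"] + p["mu"]) * l3 + (l4 - l3) * (1 - p["eps"]) * V)
    dl4 = -w[1] + (l4 - l3) * (1 - p["eps"]) * I2 + p["mu"] * l4
    return [dl1, dl2, dl3, dl4]


def optimal_controls(x, lam, p=P):
    """Stationary point of H in (u1, u2), clamped to [0, u_max] as in (8)."""
    S, I1, I2, V = x
    l1, l2, l3, l4 = lam
    u1 = ((l1 - l2) * p["b1"] * I1 * S + (l1 - l3) * p["b2"] * I2 * S) / C[0]
    u2 = (l4 - l1) * S / C[1]
    return [min(max(0.0, u1), UMAX[0]), min(max(0.0, u2), UMAX[1])]


def mid(a, b):
    """Componentwise midpoint, used for half-step values of states and controls."""
    return [0.5 * (ai + bi) for ai, bi in zip(a, b)]


def rk4(f, y, h, a0, am, a1):
    """One RK4 step for y' = f(y, *a); a0, am, a1 are the extra arguments at the
    start, middle and end of the step. A negative h integrates backward."""
    k1 = f(y, *a0)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)], *am)
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)], *am)
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)], *a1)
    return [yi + h / 6.0 * (p + 2 * q + 2 * r + s)
            for yi, p, q, r, s in zip(y, k1, k2, k3, k4)]


def sweep(x0, T=10.0, n=200, iters=30):
    """Forward-backward sweep with a fixed number of relaxed iterations."""
    h = T / n
    x = [list(x0) for _ in range(n + 1)]
    lam = [[0.0] * 4 for _ in range(n + 1)]
    u = [[0.0, 0.0] for _ in range(n + 1)]
    for _ in range(iters):
        for k in range(n):                       # states: forward in time
            x[k + 1] = rk4(state_rhs, x[k], h,
                           (u[k],), (mid(u[k], u[k + 1]),), (u[k + 1],))
        lam[n] = [0.0] * 4                       # transversality condition
        for k in range(n, 0, -1):                # adjoints: backward in time
            lam[k - 1] = rk4(adjoint_rhs, lam[k], -h, (x[k], u[k]),
                             (mid(x[k], x[k - 1]), mid(u[k], u[k - 1])),
                             (x[k - 1], u[k - 1]))
        for k in range(n + 1):                   # relaxed control update
            u[k] = mid(u[k], optimal_controls(x[k], lam[k]))
    return x, lam, u


x, lam, u = sweep([0.8, 0.1, 0.05, 0.05])
```

a production solver would normally replace the fixed iteration count by a convergence test on successive iterates of the states, adjoints and controls; the fixed count is used here only to keep the sketch short and guaranteed to terminate.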
we applied the forward-backward sweep method described in [20], which is very commonly used in the optimal control literature, as in [9], [14], [23]. for the numerical simulations, we use the set of parameter values in [19] and, owing to the lack of available literature and data, take the weight factors $w_1, w_2, w_3, w_4$ equal to one and $C_1 = 2$, $C_2 = 2$. it should be noted that the weight values selected for the simulation are purely theoretical and serve only to illustrate the control strategies proposed in this model. for the upper bound of the control $u_1$, we set $u_1^{\max} = 0.5$ under the assumption that it is difficult to maintain community discipline in implementing measures to prevent disease transmission, such as restrictions on community interaction/mobility, local lockdown, and quarantine. for the vaccination control, $u_2^{\max} = 0.7$ was taken based on the assumption that the vaccine is not yet fully effective and that individuals lack awareness of the need to be vaccinated. we focus on comparing three control strategies:
• strategy i: combination of prevention of disease transmission and vaccination. in this case, $u_1$ and $u_2$ are both taken as control variables.
• strategy ii: restrictions on community interaction/mobility as the only control. in this case, only $u_1$ is taken as a control variable.
• strategy iii: vaccine intervention as the only control, so only $u_2$ is taken as a control variable.

figure 1 shows the impact of implementing the various strategies on the population sizes of $S(t)$ (fig. 1a) and $V(t)$ (fig. 1b) over 50 days. it can be seen that, without a control strategy, the numbers of susceptible and vaccinated individuals are lower than when a control strategy is applied. with optimal control strategies, most susceptible individuals are protected or vaccinated against the virus, leading to a larger number of individuals in the vaccinated class (fig. 1b) and ultimately to fewer individuals being infected by either strain one or strain two; see figure 2.

figure 1. simulation results without and with the implementation of various control strategies. (a) susceptible individuals, (b) vaccinated individuals.

in figures 2(a)–2(d), we show the impact of using optimal control strategies on the number of individuals infected by strain one and strain two. the results suggest that the number of infectious individuals can be reduced more rapidly when both controls are applied (strategy i) than without control or with a single control, i.e., prevention of transmission only (strategy ii) or vaccination only (strategy iii). from the simulation results, the optimal trajectories show that the combination of the two controls can lead to the desired disease control. figures 2(a)–2(b) compare the number of individuals infected by strain one and by strain two under strategy i and strategy iii. figures 2(c)–2(d) show the individuals infected by strain one and strain two under strategy ii and without a control strategy. based on the number of infected individuals, strategy i appears to be the best strategy for ending the spread of the disease quickly. the corresponding time-dependent controls $u_1(t)$ and $u_2(t)$ are depicted in figure 3.

figure 2. simulation results for individuals infected by strain one (a), (c) and individuals infected by strain two (b), (d), without and with the implementation of various control strategies.
figure 3(a) tells us that strategy i can be implemented by maintaining the transmission prevention control $u_1(t)$ and the vaccination control $u_2(t)$ at their upper bounds for about 30 days and 35 days, respectively, and then gradually decreasing them to their lower bounds. figure 3(b) illustrates the implementation of strategy ii and shows that the control $u_1(t)$ is kept at its upper bound over time. figure 3(c) shows that, if strategy iii is implemented, the control $u_2(t)$ should be maintained at its upper bound most of the time. when these controls are implemented on a broad scale, it is also critical to adopt an approach that keeps the cost as low as possible. we therefore look at the cost-effectiveness of these controls in the next section.

figure 3. control profile for each strategy. (a) strategy i, (b) strategy ii, (c) strategy iii.

cost-effectiveness analysis

in this section, we use the average cost-effectiveness ratio (acer) and the incremental cost-effectiveness ratio (icer) to carry out the cost-effectiveness analysis. the acer is calculated as follows [21]:

\[
\mathrm{acer} = \frac{\text{total cost (tc)}}{\text{total number of infections averted (ta)}}. \tag{12}
\]

the total number of infections averted during the intervention period $T$ is obtained from

\[
\mathrm{ta} = \int_0^T (I_1^* + I_2^*)\, dt - \int_0^T (I_1 + I_2)\, dt, \tag{13}
\]

where, in this formula, $I_1^*, I_2^*$ denote the solutions for the classes infected by strain one and by strain two without controls, and $I_1, I_2$ denote the optimal solutions with controls. the total cost incurred during the period $T$ is calculated as follows:

\[
\mathrm{tc} = \int_0^T \frac{1}{2} \left( C_1 u_1^2 + C_2 u_2^2 \right) dt.
\]

based on this cost analysis, the most cost-effective strategy is the one with the smallest acer value [23].
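the quantities in (12) and (13) can be evaluated from simulated trajectories by numerical quadrature. the python sketch below uses the trapezoidal rule; the function names `trapz` and `acer` and the constant sample trajectories are hypothetical and only exercise the formulas, and $C_1 = C_2 = 2$ as in the simulations.

```python
def trapz(t, y):
    """Composite trapezoidal rule for samples y on the grid t."""
    return sum(0.5 * (y[k] + y[k + 1]) * (t[k + 1] - t[k])
               for k in range(len(t) - 1))


def acer(t, infected_no_control, infected_control, u1, u2, C=(2.0, 2.0)):
    """Return (TA, TC, ACER) following (12), (13) and the total-cost integral.

    infected_no_control and infected_control hold I1 + I2 on the grid t,
    without and with controls respectively.
    """
    ta = trapz(t, infected_no_control) - trapz(t, infected_control)
    tc = trapz(t, [0.5 * (C[0] * a * a + C[1] * b * b) for a, b in zip(u1, u2)])
    return ta, tc, tc / ta


# Made-up constant trajectories on [0, 10], for illustration only.
n = 50
t = [10.0 * k / n for k in range(n + 1)]
ta, tc, ratio = acer(t, [0.3] * (n + 1), [0.1] * (n + 1),
                     [0.5] * (n + 1), [0.5] * (n + 1))
# Here TA = (0.3 - 0.1) * 10 = 2, TC = 0.5 * (0.5 + 0.5) * 10 = 5, ACER = 2.5.
```

the ratio is only meaningful when the strategy actually averts infections, i.e. when `ta` is positive; a guard for that case is omitted to keep the sketch short.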
now, we calculate the total cost invested and the total number of infections averted for each strategy to analyze the cost-effectiveness. using formula (12), we find that strategy i has the smallest acer value and strategy ii has the largest acer value, as seen in figure 4. the results are also given in table 1. thus, according to the acer value, the most cost-effective intervention strategy is strategy i.

figure 4. average cost-effectiveness ratio (acer) results for strategies i–iii.

the icer, on the other hand, is calculated by dividing the cost difference between two feasible interventions by the difference in their effects. mathematically, it is expressed as [22], [24]:

\[
\mathrm{icer} = \frac{\text{difference in costs produced by strategies } i \text{ and } j}{\text{difference in the total number of infections averted in strategies } i \text{ and } j}. \tag{14}
\]

the difference between the total number of infected individuals without controls and the total number of infected individuals with controls is used to compute the total number of averted infections. furthermore, we integrated the cost functions $\frac{C_1}{2} u_1^2$ and $\frac{C_2}{2} u_2^2$ over time to calculate the total cost of the implemented strategies. we also used the parameter values from the preceding section to calculate the total cost and total infections averted, as shown in table 1, with the strategies ranked in increasing order of total infections averted. then, the icer is calculated using the formula in (14). first, we computed the icer for the competing strategies ii and iii as follows:

\[
\mathrm{icer(ii)} = \frac{989{,}582.93 - 0}{102{,}599.77 - 0} = 9.6451, \qquad
\mathrm{icer(iii)} = \frac{1{,}026{,}524.16 - 989{,}582.93}{112{,}334.16 - 102{,}599.77} = 3.7949.
\]

the results of the icer computation (as shown in table 1) show that strategy ii has a higher icer value than strategy iii.
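the pairwise comparison in (14) can be sketched in python as follows. the totals are those listed in table 1; the function name `icer` and the convention that the comparator defaults to "no intervention" (zero cost, zero infections averted) are assumptions made for this illustration.

```python
def icer(cost_a, averted_a, cost_b=0.0, averted_b=0.0):
    """ICER of strategy a relative to strategy b, as in (14).

    With the defaults, strategy a is compared against no intervention.
    """
    return (cost_a - cost_b) / (averted_a - averted_b)


# (total infections averted, total cost) as reported in table 1.
totals = {
    "ii": (102_599.77, 989_582.93),
    "iii": (112_334.16, 1_026_524.16),
    "i": (112_886.09, 1_029_506.04),
}

# Step 1: strategies ranked by infections averted, ii first, then iii.
icer_ii = icer(totals["ii"][1], totals["ii"][0])                   # about 9.6451
icer_iii_vs_ii = icer(totals["iii"][1], totals["iii"][0],
                      totals["ii"][1], totals["ii"][0])           # about 3.7949

# Step 2: once the strategy with the larger ICER is dropped, compare iii and i.
icer_iii = icer(totals["iii"][1], totals["iii"][0])                # about 9.1381
icer_i_vs_iii = icer(totals["i"][1], totals["i"][0],
                     totals["iii"][1], totals["iii"][0])          # about 5.4026
```

at each step the strategy with the larger icer is the one excluded, so the same function can be reused for any longer ranked list of strategies.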
as a result, implementing the transmission prevention control $u_1$ alone is more costly and less effective than using the vaccine intervention control $u_2$. strategy ii is therefore removed from the list of possible control strategies. the icer for strategies iii and i now needs to be recalculated. the calculation is as follows:

\[
\mathrm{icer(iii)} = \frac{1{,}026{,}524.16}{112{,}334.16} = 9.1381, \qquad
\mathrm{icer(i)} = \frac{1{,}029{,}506.04 - 1{,}026{,}524.16}{112{,}886.09 - 112{,}334.16} = 5.4026.
\]

table 2 summarizes the results of the calculations.

table 1. strategies i–iii in order of increasing number of infections averted

strategy        total infections averted    total cost        acer      icer
strategy ii     102,599.77                  989,582.93        9.6451    9.6451
strategy iii    112,334.16                  1,026,524.16      9.1381    3.7949
strategy i      112,886.09                  1,029,506.04      9.1199

table 2. comparison between strategies iii and i

strategy        total infections averted    total cost        icer
strategy iii    112,334.16                  1,026,524.16      9.1381
strategy i      112,886.09                  1,029,506.04      5.4026

table 2 clearly shows that strategy iii has a greater icer value than strategy i. therefore, owing to its cost-effectiveness and health benefits, strategy i, the combination of prevention of disease transmission and vaccination, is the best of all possible options.

conclusions

this paper has presented and analyzed a modified sir epidemic model with time-dependent control that includes two control variables. the two control variables considered in this model are prevention of disease transmission, for example by restricting community interactions, and vaccine administration. numerical simulation of the optimal control problem was carried out for three strategies: strategy i, a combination of prevention of disease transmission and vaccination; strategy ii, in which only prevention of disease transmission by restricting community interaction is taken as a control variable; and strategy iii, in which vaccine intervention is the only intervention carried out.
all strategies show control profiles adjusted for the number of infected individuals in the community. stronger interventions are needed to substantially reduce the number of infected individuals and the cost of implementing the strategy. furthermore, an analysis to determine the most cost-effective strategy was carried out using the acer and icer. based on the acer and icer, we found that using both controls simultaneously is the most cost-effective strategy and that vaccination is the most cost-effective single intervention. our simulations likewise indicate that, when only one intervention is applied, vaccination is the best choice. however, the combination of vaccination and the restriction of community interactions, i.e., strategy i, gave the best results in reducing the number of infected individuals at the lowest cost compared to a single intervention strategy. we think that our work will serve as a foundation for mathematical models that examine cost-effectiveness analyses using real-world data, especially for epidemic models that consider viral mutation and vaccination.

acknowledgments

we thank ahmad dahlan university for supporting this work through a fundamental research grant (no. pd-273/sp3/lppm-uad/vi/2021).

references

[1] s. ullah and m. a. khan, “modeling the impact of non-pharmaceutical interventions on the dynamics of novel coronavirus with optimal control analysis with a case study,” chaos, solitons and fractals, vol. 139, 2020, doi: 10.1016/j.chaos.2020.110075.
[2] r. shi, y. li, and c. wang, “stability analysis and optimal control of a fractional-order model for african swine fever,” virus res., vol. 288, no. july, p. 198111, 2020, doi: 10.1016/j.virusres.2020.198111.
[3] z. zhang, s. kundu, j. p. tripathi, and s.
bugalia, “stability and hopf bifurcation analysis of an sveir epidemic model with vaccination and multiple time delays,” chaos, solitons and fractals, vol. 131, 2020, doi: 10.1016/j.chaos.2019.109483.
[4] m. a. kuddus, m. t. meehan, l. j. white, e. s. mcbryde, and a. i. adekunle, “modeling drug-resistant tuberculosis amplification rates and intervention strategies in bangladesh,” plos one, 2020, doi: 10.1371/journal.pone.0236112.
[5] z. abbasi, i. zamani, a. hossein, a. mehra, and m. shafieirad, “optimal control design of impulsive sqeiar epidemic models with application to covid-19,” interdiscip. j. nonlinear sci. nonequilibrium complex phenom., vol. 139, p. 110054, 2020, doi: 10.1016/j.chaos.2020.110054.
[6] h. w. berhe, o. d. makinde, and d. m. theuri, “optimal control and cost-effectiveness analysis for dysentery epidemic model,” appl. math. inf. sci., vol. 1195, no. 6, pp. 1183–1195, 2018.
[7] s. k. biswas, u. ghosh, and s. sarkar, “mathematical model of zika virus dynamics with vector control and sensitivity analysis,” infect. dis. model., vol. 5, pp. 23–41, 2020, doi: 10.1016/j.idm.2019.12.001.
[8] m. zamir, z. shah, f. nadeem, a. memood, h. alrabaiah, and p. kumam, “non pharmaceutical interventions for optimal control of covid-19,” comput. methods programs biomed., vol. 196, p. 105642, 2020, doi: 10.1016/j.cmpb.2020.105642.
[9] r. resmawan and l. yahya, “sensitifity analysis of mathematical model of coronavirus disease (covid-19) transmission,” cauchy, vol. 6, no. 2, p. 91, 2020, doi: 10.18860/ca.v6i2.9165.
[10] m. z. ndii and y. a. adi, “modelling the transmission dynamics of covid-19 under limited resources,” commun. math. biol. neurosci., vol. 2020, pp. 1–24, 2020, doi: 10.28919/cmbn/4912.
[11] a. yousefpour, h. jahanshahi, and s. bekiros, “optimal policies for control of the novel coronavirus disease (covid-19) outbreak,” chaos, solitons and fractals, vol. 136, p. 109883, 2020, doi: 10.1016/j.chaos.2020.109883.
[12] y. a. adi and m. z.
ndii, “modeling and prediction of covid-19 with a large scale social distancing,” j. fourier, vol. 9, no. 1, pp. 1–9, 2020, doi: 10.14421/fourier.2020.91.1-9.
[13] h. zhang, z. yang, k. a. pawelek, and s. liu, “optimal control strategies for a two-group epidemic model with vaccination-resource constraints,” appl. math. comput., vol. 371, p. 124956, 2020, doi: 10.1016/j.amc.2019.124956.
[14] y. a. adi, “a within-host tuberculosis model using optimal control,” jtam (jurnal teor. dan apl. mat.), vol. 5, no. 1, p. 162, 2021, doi: 10.31764/jtam.v5i1.3813.
[15] j. jang, h. kwon, and j. lee, “optimal control problem of an sir reaction-diffusion model with inequality constraints,” math. comput. simul., vol. 171, pp. 136–151, 2020, doi: 10.1016/j.matcom.2019.08.002.
[16] g. b. libotte, f. s. lobato, g. m. platt, and a. j. silva neto, “determination of an optimal control strategy for vaccine administration in covid-19 pandemic treatment,” comput. methods programs biomed., vol. 196, p. 105664, 2020, doi: 10.1016/j.cmpb.2020.105664.
[17] m. z. ndii, a. r. mage, j. j. messakh, and b. s. djahi, “optimal vaccination strategy for dengue transmission in kupang city, indonesia,” heliyon, vol. 6, no. october, p. e05345, 2020, doi: 10.1016/j.heliyon.2020.e05345.
[18] p. radvak et al., “sars-cov-2 b.1.1.7 (alpha) and b.1.351 (beta) variants induce pathogenic patterns in k18-hace2 transgenic mice distinct from early strains,” nat. commun., vol. 12, no. 1, pp. 1–15, 2021, doi: 10.1038/s41467-021-26803-w.
[19] y. a. adi, n. irsalinda, a. wiraya, sugiyarto, and z. a. rafsanjani, “an epidemic model with viral mutations and vaccine interventions,” submitt. publ., 2021.
[20] s. lenhart and j. t. workman, optimal control applied to biological models. chapman & hall/crc, 2007.
[21] h. w.
berhe, “optimal control strategies and cost-effectiveness analysis applied to real data of cholera outbreak in ethiopia’s oromia region,” chaos, solitons and fractals, vol. 138, pp. 1–14, 2020, doi: 10.1016/j.chaos.2020.109933. [22] j. k. asamoah et al., “sensitivity assessment and optimal economic evaluation of a new covid-19 compartmental epidemic model with control interventions,” chaos, solitons and fractals, vol. 146, 2021, doi: 10.1016/j.chaos.2021.110885. [23] d. aldila, “cost-effectiveness and backward bifurcation analysis on covid-19 transmission model considering direct and indirect transmission,” commun. math. biol. neurosci., pp. 1–28, 2020. [24] fatmawati, u. dyah purwati, f. riyudha, and h. tasman, “optimal control of a discrete age-structured model for tuberculosis transmission,” heliyon, 2020, doi: 10.1016/j.heliyon.2019.e03030. [25] l. cesari, optimization-theory and applications. new york, ny, usa: springer, 1983. [26] t. burden, j. ernstberger, and k. r. fister, “optimal control applied to immunotherapy,” discret. contin. dyn. syst. ser. b, vol. 4, no. 1, pp. 135–146, 2004, doi: 10.3934/dcdsb.2004.4.135. [27] c. campos, c. j. silva, and d. f. m. torres, “numerical optimal control of hiv transmission in octave/matlab,” math. comput. appl., 2020, doi: 10.3390/mca25010001. cauchy jurnal matematika murni dan aplikasi volume 6, issue 3, november 2020 issn : 2086-0382 e-issn : 2477-3344 publication etics journal cauchy is a peer-reviewed electronic national journal. this statement clarifies ethical behaviour of all parties involved in the act of publishing an article in this journal, including the author, the chief editor, the editorial board, the peer-reviewer and the publisher (mathematics department of maulana malik ibrahim state islamic university of malang). this statement is based on cope’s best practice guidelines for journal editors. 
ethical guideline for journal publication the publication of an article in a peer-reviewed cauchy is an essential building block in the development of a coherent and respected network of knowledge. it is a direct reflection of the quality of the work of the authors and the institutions that support them. peer-reviewed articles support and embody the scientific method. it is therefore important to agree upon standards of expected ethical behavior for all parties involved in the act of publishing: the author, the journal editor, the peer reviewer, the publisher and the society. as publisher of pure and applied mathematics journal, we take our duties to back up over all stages of publishing seriously and we recognize our ethical and other responsibilities. we are committed to ensuring that advertising, reprint or other commercial revenue has no impact or influence on editorial decisions. publication decisions the editor of cauchy is responsible for deciding which of the articles submitted to the journal should be published. the validation of the work in question and its importance to researchers and readers must always drive such decisions. the editors may be guided by the policies of the journal's editorial board and constrained by such legal requirements as shall then be in force regarding libel, copyright infringement and plagiarism. the editors may confer with other editors or reviewers in making this decision. fair play an editor at any time evaluates manuscripts for their intellectual content without regard to race, gender, sexual orientation, religious belief, ethnic origin, citizenship, or political philosophy of the authors. confidentiality the editor and any editorial staff must not disclose any information about a submitted manuscript to anyone other than the corresponding author, reviewers, potential reviewers, other editorial advisers, and the publisher, as appropriate. any manuscripts received for review must be treated as confidential documents. 
they must not be shown to or discussed with others except as authorized by the editor. disclosure and conflicts of interest unpublished materials disclosed in a submitted manuscript must not be used in an editor's own research without the express written consent of the author. contribution to editorial decisions peer review assists the editor in making editorial decisions and through the editorial communications with the author may also assist the author in improving the paper. promptness any selected referee who feels unqualified to review the research reported in a manuscript or knows that its prompt review will be impossible should notify the editor and excuse himself from the review process. standards of objectivity reviews should be conducted objectively. personal criticism of the author is inappropriate. referees should express their views clearly with supporting arguments. acknowledgement of sources reviewers should identify relevant published work that has not been cited by the authors. any statement that an observation, derivation, or argument had been previously reported should be accompanied by the relevant citation. a reviewer should also call to the editor's attention any substantial similarity or overlap between the manuscript under consideration and any other published paper of which they have personal knowledge. disclosure and conflict of interest privileged information or ideas obtained through peer review must be kept confidential and not used for personal advantage. reviewers should not consider manuscripts in which they have conflicts of interest resulting from competitive, collaborative, or other relationships or connections with any of the authors, companies, or institutions connected to the papers.
reporting standards authors of reports of original research should present an accurate account of the work performed as well as an objective discussion of its significance. underlying data should be represented accurately in the paper. a paper should contain sufficient detail and references to permit others to replicate the work. fraudulent or knowingly inaccurate statements constitute unethical behavior and are unacceptable. data access and retention authors are asked to provide the raw data in connection with a paper for editorial review and should be prepared to provide public access to such data (consistent with the alpsp-stm statement on data and databases), if practicable, and should in any event be prepared to retain such data for a reasonable time after publication. originality and plagiarism the authors should ensure that they have written entirely original works, and if the authors have used the work and/or words of others that this has been appropriately cited or quoted. multiple, redundant or concurrent publication an author should not in general publish manuscripts describing essentially the same research in more than one journal or primary publication. submitting the same manuscript to more than one journal concurrently constitutes unethical publishing behavior and is unacceptable. acknowledgement of sources proper acknowledgment of the work of others must always be given. authors should cite publications that have been influential in determining the nature of the reported work. authorship of the paper authorship should be limited to those who have made a significant contribution to the conception, design, execution, or interpretation of the reported study. all those who have made significant contributions should be listed as co-authors.
where there are others who have participated in certain substantive aspects of the research project, they should be acknowledged or listed as contributors. the corresponding author should ensure that all appropriate co-authors and no inappropriate co-authors are included on the paper, and that all co-authors have seen and approved the final version of the paper and have agreed to its submission for publication. hazards and human or animal subjects if the work involves chemicals, procedures or equipment that have any unusual hazards inherent in their use, the author must clearly identify these in the manuscript. disclosure and conflicts of interest all authors should disclose in their manuscript any financial or other substantive conflict of interest that might be construed to influence the results or interpretation of their manuscript. all sources of financial support for the project should be disclosed. fundamental errors in published works when an author discovers a significant error or inaccuracy in his/her own published work, it is the author’s obligation to promptly notify the journal editor or publisher and cooperate with the editor to retract or correct the paper. 
cauchy jurnal matematika murni dan aplikasi volume 6, issue 3, november 2020 issn : 2086-0382 e-issn : 2477-3344 acknowledgment to reviewers in this issue contributions and valuable comments of the following reviewers in this issue were very appreciated: kusno kusno, university of jember, indonesia; usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia; dr riswan efendi, uin sultan syarif kasim riau, indonesia; heni widayani, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia; corina karim, brawijaya university; abdussakir abdussakir (scopus id: 57202352728), universitas islam negeri maulana malik ibrahim malang, indonesia.
cauchy jurnal matematika murni dan aplikasi volume 7, issue 2, may 2022 issn : 2086-0382 e-issn : 2477-3344 acknowledgment to reviewers in this issue contributions and valuable comments of the following reviewers in this issue were very appreciated: bety hayat susanti, politeknik siber dan sandi negara, indonesia; dian savitri, universitas negeri surabaya, indonesia; meta kallista, universitas telkom, indonesia; dani suandi, universitas bina nusantara, bandung, indonesia; anwar fitrianto, department of statistics, ipb university, indonesia; subanar seno, gadjah mada university, indonesia; arief fatchul huda, uin sunan gunung djati bandung, indonesia; usman pagalay, maulana malik ibrahim state islamic university of malang, indonesia; riswan efendi, uin sultan syarif kasim riau, indonesia; sri harini, universitas islam negeri maulana malik ibrahim malang, indonesia; heni widayani, faculty of mathematics and natural sciences, institut teknologi bandung, indonesia; corina karim, brawijaya university; fachrur rozi, universitas islam negeri maulana malik ibrahim malang, indonesia.
Elliptical Orbits Mode Application for Approximation of Fuel Volume Change

CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 7(2) (2022), Pages 316-331
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: December 19, 2021. Reviewed: January 09, 2022. Accepted: January 10, 2022
DOI: http://dx.doi.org/10.18860/ca.v7i1.14407

Jovian Dian Pratama*, Ratna Herdiana*, Susilo Hariyanto
Department of Mathematics, Diponegoro University
*Corresponding author
Email: joviandianpratama@yahoo.com*, herdiana.math@gmail.com*, sus2_hariyanto@yahoo.com

Abstract
At gas station 45.507.21 Candirejo Tuntang, it is difficult to verify the stock of fuel because the volume calculated from dipstick readings always differs from the volume recorded by the fuel dispensers. Gas stations throughout Indonesia calculate the volume by linear interpolation, which is not smooth, so this article uses the measuring book data for Pertalite (a Pertamina fuel product) to form a smooth approximation function for the volume change. The article presents the Elliptical Orbits Mode (EOM) as a proposed method for approximating the function that describes the change in fuel volume with respect to the fuel height in the underground tank (UT). Because the calculation used by the gas station is not smooth, a smoother data fit is needed, and it is assessed with the residual sum of squares (RSS) and the mean square error (MSE).
The results of the elliptical orbits mode approximation are compared with the circle orbits mode (COM) and least squares data fitting. The results show that the EOM(θ) method with elliptical height control produces smaller RSS and MSE than COM, EOM, and least squares fits of degree two and three. In future research, the approximation results will be applied to the fuel dispenser data.

Keywords: orbits mode; data fitting; ellipse; fuel; approximation

Introduction
Previous research on orbits mode data fitting [1], [2] used the method to calibrate the dipstick, the measuring instrument that converts the fuel height into the fuel volume in the buried tank. Under the assumptions given there, the approximation function for the change in fuel volume in the tank depends only on height. As explained in [1] and [2], calibration based on orbits mode data fitting is limited by several assumptions and field conditions, including the following:
1. The resulting approximation function is the change in the volume of fuel in the UT, which depends only on the height of the fuel in the UT.
2. Orbits mode data fitting as proposed by the authors is used only for data whose distribution forms a semicircle or an ellipse in the first quadrant.
3. The data to be approximated are taken from the fuel measurement manual for the UT issued by the Semarang Regency Metrology Agency.
4. It is assumed that the UT is level (not tilted) during measurement; the same is assumed for tank trucks that deliver fuel to gas stations or are filling supplies at gas stations.
5. The right and left halves of the UT have the same shape, that is, the UT is symmetrical; this follows from the second assumption.
In [3], data fitting is applied to approximate the shape of an island on a map, and in [4] curve fitting is used to detect enlargement and shrinkage of the retina. The studies in [3] and [4] each add their own algorithm to approach the desired result, and the same is done here for orbits mode data fitting; both the proposed method and the object it is applied to for calibration are new. Hyper least squares (HyperLS) was introduced in [5], and [6] also calibrated curve-shaped data, but with an orthogonal matrix that becomes more complicated as the amount of data grows, so that method is difficult to use for large data sets.

The design drawing in [7] shows that the buried tank is a capsule-shaped tube whose cross section is not flat but convex, so that, also according to [8], the change in volume in the tank tends to form a semicircle, a half ellipse, or a parabola. The approximation function used by gas stations throughout Indonesia is linear interpolation, which is not smooth. Using the measuring book data for Pertalite (a Pertamina fuel product), a smooth approximation function for the volume change is formed with the elliptical orbits mode and is then improved by controlling the ellipse height to minimize the residual sum of squares (RSS) and the mean square error (MSE). The data used are the changes in the volume of fuel in the tank as the fuel height in the UT changes; heights are given in centimeters (cm) and are converted to fuel volume in liters.

The authors therefore propose this method to approximate the fuel volume with minimized errors: its calculation is simple for both small and large data sets and its result is smooth, although it applies only to data that tend to be semicircular or elliptical. Orbits mode data fitting is a method proposed by the authors for approximating the function behind data that tend to form a semicircle or half an ellipse.
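The two error measures named above can be written down in a few lines. The sketch below uses the standard definitions (RSS as the sum of squared residuals, MSE as RSS divided by the number of observations); the function names `rss` and `mse` and the toy numbers are illustrative assumptions and are not taken from the measuring book.

```python
def rss(observed, predicted):
    """Residual sum of squares: sum of squared differences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def mse(observed, predicted):
    """Mean square error: RSS divided by the number of observations."""
    return rss(observed, predicted) / len(observed)


# Hypothetical toy values, used only to show the calls.
observed = [60.0, 80.0, 95.0]
predicted = [62.0, 79.0, 95.0]
print(rss(observed, predicted))  # 5.0
print(mse(observed, predicted))
```

A candidate approximation function can be scored by evaluating it at every height in the data and passing the predicted and recorded volume changes to these helpers.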
In [1] and [2] the authors introduced the orbits mode data fitting method only in the shape of a circle and compared it with cubic spline interpolation and least squares data fitting. This time the authors also construct the orbits mode data fitting method in the shape of an ellipse, because in the value approximation there is fuel volume that the function has not yet captured.

Definition 1 (Ellipse equation)
In [9] the ellipse equation is presented as equation (1), where $(p,q)$ is the center point of the ellipse and the major and minor axes are set by $a$ and $b$:
$$\frac{(x-p)^2}{a^2}+\frac{(y-q)^2}{b^2}=1 \tag{1}$$

Definition 2 ($A_i$ set for ellipse mode)
A set $A_i$ of points formed from two ellipse equations is defined as follows:
$$A_i=\left\{(x,y)\;\middle|\; d_{i1}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{i2}\right\}, \tag{2}$$
with $i=1,2,\dots,k$. The set (2) can be visualized as follows.

Figure 1. Visualization of the set $A_i$ in (2)

Here $x_{\frac{w}{2}}=\left(x-\frac{w}{2}\right)$, $w$ = height of the UT, and $l$ = a half of the maximum fuel change in the UT. The thickness $t_i$ of the ellipse is defined by
$$\frac{w}{2}\left(\sqrt{d_{i2}}-\sqrt{d_{i1}}\right)=\frac{l}{2}\left(\sqrt{d_{i2}}-\sqrt{d_{i1}}\right)=t_i .$$
The interval $[\Delta V(h)_{min},\Delta V(h)_{max}]$ between the minimum and maximum values of the volume change is divided into several partitions, and the thickness that contains the most points is sought. Then
$$[\Delta V(h)_{min},\Delta V(h)_{max}]=[d_{11},d_{12}]\cup[d_{21},d_{22}]\cup[d_{31},d_{32}]\cup\dots\cup[d_{k1},d_{k2}] \tag{3}$$
with $d_{i2}=d_{(i+1)1}$. The intersection of consecutive subintervals of the minimum and maximum volume interval in equation (3), $[d_{i1},d_{i2}]\cap[d_{(i+1)1},d_{(i+1)2}]$, is equal to $d_{i2}$ or $d_{(i+1)1}$, where $i=1,2,\dots,k$ and $k$ is the number of subintervals dividing the interval between the maximum and minimum volume change.
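As a rough illustration of how the inequality in (2) and the boundaries in (3) could be used on data, the following sketch evaluates the left-hand expression of (2) for each point and counts how many points fall strictly between consecutive boundaries. The names `ellipse_score` and `count_per_band`, the argument `edges` (the list $d_{11}, d_{12}=d_{21}, \dots, d_{k2}$), and the sample values are assumptions made for this example; the source does not say how the boundaries are spaced.

```python
def ellipse_score(x, y, w, l):
    """Middle expression of the inequality in (2) for one point (x, y)."""
    return ((x - w / 2) / (w / 2)) ** 2 + (y / (l / 2)) ** 2


def count_per_band(points, w, l, edges):
    """Count the points lying strictly inside each band A_1, ..., A_k.

    edges is the increasing list [d_11, d_12 (= d_21), ..., d_k2].
    """
    counts = [0] * (len(edges) - 1)
    for x, y in points:
        score = ellipse_score(x, y, w, l)
        for i in range(len(edges) - 1):
            if edges[i] < score < edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Hypothetical sample: w = 4, l = 2 and four made-up points.
points = [(2.0, 1.0), (2.0, 0.5), (1.0, 0.5), (3.0, 0.5)]
print(count_per_band(points, 4.0, 2.0, [0.0, 0.4, 0.8, 1.2]))  # [1, 2, 1]
```

Points whose score equals a boundary exactly are not counted, which mirrors the strict inequalities in (2).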
Definition 3 (Partition of the $A_i$ set)
The partition of the $A_i$ set divides it into sets of points lying between two ellipse equations, defined as follows:
$$\begin{aligned}
A_1&=\left\{(x,y)\;\middle|\; d_{11}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{12}\right\}\\
A_2&=\left\{(x,y)\;\middle|\; d_{21}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{22}\right\}\\
A_3&=\left\{(x,y)\;\middle|\; d_{31}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{32}\right\}\\
&\;\;\vdots\\
A_k&=\left\{(x,y)\;\middle|\; d_{k1}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{k2}\right\}
\end{aligned} \tag{4}$$
The set of partitions (4) can be described as follows.

Figure 2. Visualization of the partition sets $A_1, A_2, A_3$ up to $A_k$ in (4)

Definition 4 (Elliptical orbits mode)
The elliptical orbits mode chooses the partition that has the most points, defined as follows:
$$Max(n(A_i))=Max(n(A_1),n(A_2),n(A_3),\dots,n(A_k)) \tag{5}$$
with $i=1,2,\dots,k$. If $Max(n(A_i))=n(A_{k_1})=\dots=n(A_{k_m})$, then the average of the ellipse scales of $A_{k_1},A_{k_2},\dots,A_{k_m}$ is taken, so that $Max(n(A_i))=n(\overline{A_m})$. The set $\overline{A_m}$, whose bounding ellipses change accordingly, is defined as follows:
$$\overline{A_m}=\left\{(x,y)\;\middle|\;\left(\frac{d_{k_1 1}+\dots+d_{k_m 1}}{m}\right)<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<\left(\frac{d_{k_1 2}+\dots+d_{k_m 2}}{m}\right)\right\}$$
In the next step, having obtained $A_i$ or $\overline{A_m}$, we approximate the ellipse equation, which is divided into two cases.

Case 1, $[Max(n(A_i))=n(A_m)]$
Based on the set with the maximum number of points between two ellipses, choose
$$A_m=\left\{(x,y)\;\middle|\; d_{m1}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{m2}\right\},$$
so that we get:
$$\begin{aligned}
&d_{m1}<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<d_{m2}\\
\Longrightarrow\;&\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2=\left(\frac{d_{m1}+d_{m2}}{2}\right)\\
\Longrightarrow\;&y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2}
\end{aligned} \tag{6}$$
The steps in (6) can be visualized as follows.

Figure 3. Visualization of the steps in (6)

Therefore, from (6), the result of the elliptical orbits mode is a half-ellipse function obtained by substituting $x_{\frac{w}{2}}=x_{d_m}$, as follows:
$$y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{d_m}}{\frac{w}{2}}\right)^2} \tag{7}$$
with $x$ = the fuel level in the UT and $x_{d_m}=x-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$.

Theorem 1 (Defined intervals for the elliptical orbits mode approximation function)
If $x_{\frac{w}{2}}=x_{d_m}$ is substituted into $y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2}$, then $y$ is defined in $\mathbb{R}$ on the interval $0\le x\le\frac{w}{2}-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$.

Proof: Substitute $x_{\frac{w}{2}}=x_{d_m}$ into $y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2}$, so that we obtain:
$$\begin{aligned}
&y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{d_m}}{\frac{w}{2}}\right)^2}\\
\Longleftrightarrow\;&y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}}{\frac{w}{2}}\right)^2}\\
\Longleftrightarrow\;&y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x^2-wx\sqrt{\frac{d_{m1}+d_{m2}}{2}}+\left(\frac{w}{2}\right)^2\left(\frac{d_{m1}+d_{m2}}{2}\right)}{\left(\frac{w}{2}\right)^2}\right)}\\
\Longleftrightarrow\;&y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x^2-wx\sqrt{\frac{d_{m1}+d_{m2}}{2}}}{\left(\frac{w}{2}\right)^2}\right)}\\
\Longleftrightarrow\;&y=\left(\frac{l}{2}\right)\sqrt{\frac{x\left(w\sqrt{\frac{d_{m1}+d_{m2}}{2}}-x\right)}{\left(\frac{w}{2}\right)^2}}\\
\Longleftrightarrow\;&y=\left(\frac{l}{w}\right)\sqrt{x\left(w\sqrt{\frac{d_{m1}+d_{m2}}{2}}-x\right)}
\end{aligned}$$
with $w,d_{m1},d_{m2}\in\mathbb{R}^{+}$. Obviously $y$ is defined in $\mathbb{R}$ when $x\left(w\sqrt{\frac{d_{m1}+d_{m2}}{2}}-x\right)\ge 0$, so $x$ must lie in both intervals $0\le x\le w\sqrt{\frac{d_{m1}+d_{m2}}{2}}$ and $0\le x\le\frac{w}{2}-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$, because $w\sqrt{\frac{d_{m1}+d_{m2}}{2}}>\frac{w}{2}-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$. ∎

Substituting $x_{\frac{w}{2}}=x_{d_m}$ makes the value of $y$ defined on $0\le x\le\frac{w}{2}-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$, so that (7), the function of volume change with respect to the fuel level in the UT, can be visualized as follows.

Figure 4. Visualization of the steps from (6) to (7)

The function is formed from the half ellipse selected for the calibration of the buried tank, whose cross-sectional area is circular but convex. The volume of the UT is therefore calculated as a function of the change in volume with respect to height, where the coordinates are taken from the change in height in centimeters (cm) and converted to volume per centimeter (liter/cm).
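The selection rule of Definition 4 and the closed form reached at the end of the proof of Theorem 1 can be sketched as follows. This is only an illustration under stated assumptions: `select_mode_band` and `eom_volume_change` are hypothetical names, `counts` and `edges` are assumed to hold the point counts $n(A_i)$ and the boundaries $d_{11}, d_{12}=d_{21}, \dots, d_{k2}$, and the numeric values are made up.

```python
import math


def select_mode_band(counts, edges):
    """Return (d_m1, d_m2) for the band with the most points.

    When several bands share the maximum count, their lower and
    upper bounds are averaged, as described in Definition 4.
    """
    best = max(counts)
    tied = [i for i, c in enumerate(counts) if c == best]
    d_m1 = sum(edges[i] for i in tied) / len(tied)
    d_m2 = sum(edges[i + 1] for i in tied) / len(tied)
    return d_m1, d_m2


def eom_volume_change(x, w, l, d_m1, d_m2):
    """Last line of the proof: y = (l/w) * sqrt(x * (w * sqrt(D) - x)),
    where D = (d_m1 + d_m2) / 2."""
    mean_d = (d_m1 + d_m2) / 2
    radicand = x * (w * math.sqrt(mean_d) - x)
    if radicand < 0:
        raise ValueError("x lies outside the range where y is real")
    return (l / w) * math.sqrt(radicand)


# Hypothetical values: w = 4, l = 2, selected band (0.9, 1.1).
print(select_mode_band([1, 2, 1], [0.0, 0.4, 0.8, 1.2]))  # (0.4, 0.8)
print(eom_volume_change(2.0, 4.0, 2.0, 0.9, 1.1))
```

Working from the simplified last line of the proof avoids carrying the shifted variable $x_{d_m}$ through the code; the function refuses inputs for which the expression under the square root is negative.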
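Case 2, $[Max(n(A_i))=n(A_{k_1})=\dots=n(A_{k_m})]$
Based on the set with the maximum number of points between two ellipse equations,
$$\overline{A_m}=\left\{(x,y)\;\middle|\;\left(\frac{d_{k_1 1}+\dots+d_{k_m 1}}{m}\right)<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<\left(\frac{d_{k_1 2}+\dots+d_{k_m 2}}{m}\right)\right\},$$
then:
$$\begin{aligned}
&\left(\frac{d_{k_1 1}+\dots+d_{k_m 1}}{m}\right)<\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2<\left(\frac{d_{k_1 2}+\dots+d_{k_m 2}}{m}\right)\\
\Longrightarrow\;&\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2=\left(\frac{\left(\frac{d_{k_1 1}+\dots+d_{k_m 1}}{m}\right)+\left(\frac{d_{k_1 2}+\dots+d_{k_m 2}}{m}\right)}{2}\right)\\
\Longrightarrow\;&\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2+\left(\frac{y}{\frac{l}{2}}\right)^2=\left(\frac{d_{m1}+d_{m2}}{2}\right).
\end{aligned} \tag{8}$$
The steps in (8) are visualized similarly in Figure 11, with $d_{m1}=\left(\frac{d_{k_1 1}+\dots+d_{k_m 1}}{m}\right)$ and $d_{m2}=\left(\frac{d_{k_1 2}+\dots+d_{k_m 2}}{m}\right)$. Then
$$\Longrightarrow\; y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{\frac{w}{2}}}{\frac{w}{2}}\right)^2}. \tag{9}$$
Therefore, the result of the elliptical orbits mode is a half-ellipse function, as follows:
$$y=\left(\frac{l}{2}\right)\sqrt{\left(\frac{d_{m1}+d_{m2}}{2}\right)-\left(\frac{x_{d_m}}{\frac{w}{2}}\right)^2}, \tag{10}$$
with $x$ = the fuel level in the UT and $x_{d_m}=x-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$. As stated in Theorem 1 on the defined interval, substituting $x_{\frac{w}{2}}=x_{d_m}$ in (10) makes $y$ defined on $0\le x\le\frac{w}{2}-\left(\frac{w}{2}\right)\sqrt{\frac{d_{m1}+d_{m2}}{2}}$, so (10) is the function of the change in volume with respect to the fuel level in the UT, which is visualized similarly in Figure 4.

Methods
Research steps
1. Construction of the mathematical model for the elliptical orbits mode method, with the following steps:
a) Construction of the mathematical model of the elliptical orbits mode: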
• Review the ellipse equation
• Define the set $A_i$ for the ellipse mode
• Partition the set $A_i$ (definition)
• Choose the partitions of $A_i$ for which $n(A_i)$ is maximal
• Split into two cases: a single maximum value and several equal maximum values
• Create the function $y = f(x)$ from the chosen $A_i$ sets
• Translate $y = f(x)$ so that $y$ is defined from $x = 0$ onward, and
• Control the height to find the ellipse height (vertical axis) that minimizes the residual sum of squares (RSS) and the mean square error (MSE).

2. Apply the elliptical orbits mode model to data from the Candirejo gas station, taken from the government metrology agency's measuring book, for Pertalite (a Pertamina fuel product) only, and visualize the result.

3. Measure performance using RSS [10] and MSE [11,12] and compare with circle orbits mode from [1], least squares with n = 2 and n = 3 from [13], and elliptical orbits mode with height control.

Results and Discussion

Pertalite measuring book data

Gas station 45.507.21 Candirejo uses a measuring book from the government metrology agency to determine the volume of fuel in the buried tank. The authors obtained the following data from the metrology agency and did not process it further; it is used to construct the approximation function.

Table 1.
Fuel volume measuring book data for Pertalite tanks from the metrology agency

Height x (cm)  Volume (liter)  Diff y (liter/cm)    Height x (cm)  Volume (liter)  Diff y (liter/cm)    Height x (cm)  Volume (liter)  Diff y (liter/cm)
0      0.0      0.0      75     7085.9    117.7    150    16124.4   111.1
1      237.1    237.1    76     7203.5    117.6    151    16235.6   111.2
2      294.3    57.2     77     7321.2    117.7    152    16346.7   111.1
3      351.4    57.1     78     7438.8    117.6    153    16457.8   111.1
4      409.4    58.0     79     7556.5    117.7    154    16568.9   111.1
5      468.2    58.8     80     7674.1    117.6    155    16680.0   111.1
6      527.1    58.9     81     7791.8    117.7    156    16791.1   111.1
7      586.1    59.0     82     7909.4    117.6    157    16902.2   111.1
8      646.7    60.6     83     8027.1    117.7    158    17013.3   111.1
9      707.3    60.6     84     8144.7    117.6    159    17124.4   111.1
10     767.9    60.6     85     8262.4    117.7    160    17232.6   108.2
11     831.6    63.7     86     8380.0    117.6    161    17337.9   105.3
12     896.1    64.5     87     8497.6    117.6    162    17443.2   105.3
13     960.6    64.5     88     8615.3    117.7    163    17548.4   105.2
14     1026.7   66.1     89     8732.9    117.6    164    17653.7   105.3
15     1093.3   66.6     90     8855.0    122.1    165    17758.9   105.2
16     1160.0   66.7     91     8980.0    125.0    166    17864.2   105.3
17     1228.3   68.3     92     9105.0    125.0    167    17969.5   105.3
18     1297.2   68.9     93     9230.0    125.0    168    18065.7   96.2
19     1366.2   69.0     94     9355.0    125.0    169    18161.0   95.3
20     1439.3   73.1     95     9480.0    125.0    170    18256.2   95.2
21     1513.3   74.0     96     9605.0    125.0    171    18351.4   95.2
22     1588.0   74.7     97     9730.0    125.0    172    18446.7   95.3
23     1668.0   80.0     98     9855.0    125.0    173    18541.9   95.2
24     1748.0   80.0     99     9980.0    125.0    174    18637.1   95.2
25     1830.0   82.0     100    10105.0   125.0    175    18732.4   95.3
26     1913.3   83.3     101    10230.0   125.0    176    18825.5   93.1
27     1997.4   84.1     102    10355.0   125.0    177    18916.4   90.9
28     2084.3   86.9     103    10480.0   125.0    178    19007.3   90.9
29     2171.3   87.0     104    10605.0   125.0    179    19098.2   90.9
30     2261.8   90.5     105    10730.0   125.0    180    19189.1   90.9
31     2352.7   90.9     106    10855.0   125.0    181    19280.0   90.9
32     2446.7   94.0     107    10980.0   125.0    182    19370.9   90.9
33     2541.9   95.2     108    11105.0   125.0    183    19458.3   87.4
34     2637.1   95.2     109    11230.0   125.0    184    19545.2   86.9
35     2732.4   95.3     110    11355.0   125.0    185    19632.2   87.0
36     2830.0   97.6     111    11480.0   125.0    186    19719.1   86.9
37     2930.0   100.0    112    11605.0   125.0    187    19804.0   84.9
38     3030.0   100.0    113    11730.0   125.0    188    19884.0   80.0
39     3130.0   100.0    114    11855.0   125.0    189    19964.0   80.0
40     3230.0   100.0    115    11980.0   125.0    190    20044.0   80.0
41     3330.0   100.0    116    12105.0   125.0    191    20124.0   80.0
42     3430.0   100.0    117    12230.0   125.0    192    20202.2   78.2
43     3530.0   100.0    118    12355.0   125.0    193    20276.3   74.1
44     3630.0   100.0    119    12480.0   125.0    194    20350.4   74.1
45     3730.0   100.0    120    12605.0   125.0    195    20420.0   69.6
46     3832.6   102.6    121    12730.0   125.0    196    20486.7   66.7
47     3937.9   105.3    122    12855.0   125.0    197    20553.3   66.6
48     4043.2   105.3    123    12980.0   125.0    198    20617.5   64.2
49     4148.4   105.2    124    13105.0   125.0    199    20680.0   62.5
50     4257.8   109.4    125    13227.1   122.1    200    20742.5   62.5
51     4368.9   111.1    126    13344.7   117.6    201    20802.9   60.4
52     4480.0   111.1    127    13462.4   117.7    202    20860.0   57.1
53     4591.1   111.1    128    13580.0   117.6    203    20917.1   57.1
54     4702.2   111.1    129    13697.6   117.6    204    20974.3   57.2
55     4813.3   111.1    130    13815.3   117.7    205    21028.6   54.3
56     4924.4   111.1    131    13932.9   117.6    206    21082.7   54.1
57     5035.6   111.2    132    14050.6   117.7    207    21136.8   54.1
58     5146.7   111.1    133    14168.2   117.6    208    21187.1   50.3
59     5257.8   111.1    134    14285.9   117.7    209    21222.9   35.8
60     5368.9   111.1    135    14403.5   117.6    210    21258.6   35.7
61     5480.0   111.1    136    14521.2   117.7    211    21294.3   35.7
62     5591.1   111.1    137    14638.8   117.6    212    21330.0   35.7
63     5702.2   111.1    138    14756.5   117.7    213    21365.7   35.7
64     5813.3   111.1    139    14874.1   117.6    214    21387.1   21.4
65     5924.4   111.1    140    14991.8   117.7    215    21399.1   12.0
66     6035.6   111.2    141    15109.4   117.6    216    21411.0   11.9
67     6146.7   111.1    142    15227.1   117.7    217    21422.9   11.9
68     6262.4   115.7    143    15344.7   117.6    218    21434.8   11.9
69     6380.0   117.6    144    15457.8   113.1    219    21446.7   11.9
70     6497.6   117.6    145    15568.9   111.1    220    21458.6   11.9
71     6615.3   117.7    146    15680.0   111.1    221    21470.5   11.9
72     6732.9   117.6    147    15791.1   111.1    222    21482.4   11.9
73     6850.6   117.7    148    15902.2   111.1    222.3  21486.0   3.6
74     6968.2   117.6    149    16013.3   111.1

Approximation using elliptical orbits mode (EOM)

Table 1 is used as sample data; we derive the elliptical orbits mode approach as an approximation to the change of fuel volume. To obtain a half-ellipse function from the smallest to the largest abscissa, choose

\[ \frac{w}{2} = \frac{\text{maximum height of the underground tank}}{2} = \frac{222.3}{2} = 111.15, \qquad \frac{l}{2} = \text{maximum volume change in the underground tank} = 125. \]

Next, choose the bounds $d_{i1}$ and $d_{i2}$ of the initial set $A_i$ so that their difference is $t_i = 0.2$, defined as follows:

\[ A_i = \left\{(x,y) \,\middle|\, 0.8 < \frac{(x-111.15)^{2}}{(111.15)^{2}} + \frac{y^{2}}{(125)^{2}} < 1\right\} \]

With step-size = 0.02, the number of partitions is $\frac{t_i}{\text{step-size}} = 10$, so the partitions of the $A_i$ set for $i = 1, 2, \ldots, 10$ are:

\[
\begin{aligned}
A_1 &= \left\{(x,y) \,\middle|\, 0.98 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 1\right\} \\
A_2 &= \left\{(x,y) \,\middle|\, 0.96 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.98\right\} \\
A_3 &= \left\{(x,y) \,\middle|\, 0.94 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.96\right\} \\
A_4 &= \left\{(x,y) \,\middle|\, 0.92 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.94\right\} \\
A_5 &= \left\{(x,y) \,\middle|\, 0.90 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.92\right\} \\
A_6 &= \left\{(x,y) \,\middle|\, 0.88 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.90\right\} \\
A_7 &= \left\{(x,y) \,\middle|\, 0.86 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.88\right\} \\
A_8 &= \left\{(x,y) \,\middle|\, 0.84 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.86\right\} \\
A_9 &= \left\{(x,y) \,\middle|\, 0.82 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.84\right\} \\
A_{10} &= \left\{(x,y) \,\middle|\, 0.80 < \tfrac{(x-111.15)^{2}}{(111.15)^{2}} + \tfrac{y^{2}}{(125)^{2}} < 0.82\right\}
\end{aligned}
\]

Accordingly, the value of $n(A_i)$ for $i = 1, 2, \ldots, 10$ is provided in Table 2.

Table 2.
Calculation of elliptical orbits mode $n(A_i)$

$A_i$     $n(A_i)$
$A_1$     11
$A_2$     15
$A_3$     16
$A_4$     26
$A_5$     25
$A_6$     19
$A_7$     9
$A_8$     3
$A_9$     0
$A_{10}$  0

According to Table 2, $\mathrm{Max}(n(A_i)) = n(A_4)$ for $i = 1, 2, \ldots, 10$, so the set $A_4$ is selected. From $A_4$ the function $y = f(x)$ is formed as follows:

\[
\begin{aligned}
0.92 < \frac{(x-111.15)^{2}}{(111.15)^{2}} + \frac{y^{2}}{(125)^{2}} < 0.94
&\Rightarrow \frac{(x-111.15)^{2}}{(111.15)^{2}} + \frac{y^{2}}{(125)^{2}} = \left(\frac{0.92+0.94}{2}\right) \\
&\Rightarrow \frac{(x-111.15)^{2}}{(111.15)^{2}} + \frac{y^{2}}{(125)^{2}} = 0.93 \\
&\Rightarrow y = 125\sqrt{0.93 - \frac{(x-111.15)^{2}}{(111.15)^{2}}}.
\end{aligned}
\]

Then $y = f(x)$ is translated so that it is defined at $0 \le x \le 111.15 - 111.15\sqrt{0.93}$, or about $0 \le x \le 3.96$, by replacing $x_{w/2} = (x - 111.15)$ with $x_{dm} = (x - 111.15\sqrt{0.93})$, so that we get:

\[ y = 125\sqrt{0.93 - \frac{\left(x - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} \tag{11} \]

with $y$ = the change in the volume of fuel with respect to the fuel height $x$. The graph of changes in the volume of fuel is visualized as follows:

Figure 5. Graph of change in fuel volume by EOM

Based on Figure 5, the EOM produces a function that fits the Pertalite volume change data. The volume as a function of height $h$ is then obtained by the following integration:

\[ V(h) = \int_{0}^{h} 125\sqrt{0.93 - \frac{\left(x - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}}\; dx \]

Using Maple 2015, the integral evaluates to:

\[
\begin{aligned}
V(h) ={}& \frac{535945891}{800000000000000000}\sqrt{18792746045928961}
+ \frac{116250000}{17040701}\sqrt{896879}\,\arcsin\!\left(\frac{10182971929}{93000000000000}\sqrt{896879}\sqrt{93}\right) \\
&+ \frac{1}{160000000000}\left(-8094332975000000000000\,h^{2} + 1735249799334822290000000\,h + 18792746045928961\right)^{\frac{1}{2}} h \\
&- \frac{535945891}{800000000000000000}\left(-8094332975000000000000\,h^{2} + 1735249799334822290000000\,h + 18792746045928961\right)^{\frac{1}{2}} \\
&+ \frac{116250000}{17040701}\sqrt{896879}\,\arcsin\!\left(\frac{19}{93000000000000}\sqrt{896879}\sqrt{93}\,(5000000\,h - 535945891)\right);
\end{aligned}
\tag{12}
\]

where $V(h)$ is defined on the interval $0 \le h \le 222.3\sqrt{0.93}$.
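The two computational steps above, counting the data points in each elliptical band and integrating (11), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Maple workflow: the function names are hypothetical, and the volume uses the standard antiderivative of $\sqrt{c - u^{2}}$ instead of transcribing the Maple expression (12).

```python
import math

A = 111.15  # semi-axis w/2 in cm, from the text
B = 125.0   # semi-axis l/2 in liter/cm, from the text

def ellipse_level(x, y, a=A, b=B):
    """Left-hand side of the ellipse inequality: ((x - a)/a)^2 + (y/b)^2."""
    return ((x - a) / a) ** 2 + (y / b) ** 2

def band_counts(points, upper=1.0, step=0.02, n_bands=10):
    """Count points strictly inside each band A_1..A_n (index 0 is A_1)."""
    counts = [0] * n_bands
    for x, y in points:
        level = ellipse_level(x, y)
        for i in range(n_bands):
            hi = upper - i * step
            lo = hi - step
            if lo < level < hi:
                counts[i] += 1
                break
    return counts

def eom_volume(h, theta=B, a=A, d=0.93):
    """Closed form of V(h) = integral from 0 to h of theta*sqrt(d - ((x - a*sqrt(d))/a)^2) dx."""
    r = math.sqrt(d)
    h = min(max(h, 0.0), 2.0 * a * r)  # clamp h to the domain of (11)

    def antiderivative(u):
        # F(u) = (u*sqrt(d - u^2) + d*asin(u/sqrt(d))) / 2
        ratio = max(-1.0, min(1.0, u / r))
        return 0.5 * (u * math.sqrt(max(d - u * u, 0.0)) + d * math.asin(ratio))

    u = (h - a * r) / a
    return theta * a * (antiderivative(u) - antiderivative(-r))

if __name__ == "__main__":
    print(eom_volume(222.3 * math.sqrt(0.93)))  # maximum volume of the EOM curve
```

Passing the `(height, diff)` pairs of Table 1 to `band_counts` would give the counts per band; the band with the largest count plays the role of the selected set.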
The EOM version of the fuel volume calculation uses (12), with $V_{max} = 20296.55$ liters.

Elliptical orbits mode with elliptical height control on Pertalite data

Based on Figure 5, the volume change function from EOM would fit more of the Pertalite data if the ellipse were taller, so the ellipse height needs to be adjusted. The EOM result in (11) has an ellipse height of 125. The volume change function with respect to the fuel level in the UT therefore has the following general form of (11):

\[ EOM(\theta, x) = \theta \cdot \sqrt{0.93 - \frac{\left(x - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} \tag{13} \]

where $\theta$ is the height of the ellipse, defined on the interval $0 \le x \le 222.3\sqrt{0.93}$. Choose the $\theta$ that minimizes the residual sum of squares

\[
\begin{aligned}
E &= \sum_{i=1}^{\lfloor 222.3\sqrt{0.93} \rfloor} \left(y_i - EOM(\theta, x_i)\right)^{2} \\
E &= \sum_{i=1}^{214} \left(y_i^{2} - 2\,EOM(\theta, x_i)\,y_i + EOM^{2}(\theta, x_i)\right) \\
E &= \sum_{i=1}^{214} y_i^{2} - 2\sum_{i=1}^{214} EOM(\theta, x_i)\,y_i + \sum_{i=1}^{214} \left(EOM(\theta, x_i)\right)^{2} \\
E &= \sum_{i=1}^{214} y_i^{2} - 2\sum_{i=1}^{214} y_i \cdot \theta\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} + \sum_{i=1}^{214} \left(\theta\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}}\right)^{2} \\
E &= \sum_{i=1}^{214} y_i^{2} - 2\sum_{i=1}^{214} y_i \cdot \theta\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} + \sum_{i=1}^{214} \theta^{2}\left(0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}\right)
\end{aligned}
\]

Then, to find the $\theta$ that minimizes $E$, solve the equation $\frac{\partial E}{\partial \theta} = 0$. We get:

\[
\begin{aligned}
&\iff -2\sum_{i=1}^{214} y_i\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} + 2\theta\sum_{i=1}^{214}\left(0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}\right) = 0 \\
&\iff \frac{-2\sum_{i=1}^{214} y_i\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} + 2\theta\sum_{i=1}^{214}\left(0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}\right)}{2\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}}} = 0 \\
&\iff -\sum_{i=1}^{214} y_i + \theta\sum_{i=1}^{214}\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} = 0 \\
&\iff \theta = \frac{\sum_{i=1}^{214} y_i}{\sum_{i=1}^{214}\sqrt{0.93 - \frac{\left(x_i - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}}}
\end{aligned}
\]

Using the data in Table 1, we obtain $\theta \approx 130.37$, so that from (13):

\[ EOM(130.37, x) = 130.37 \cdot \sqrt{0.93 - \frac{\left(x - 111.15\sqrt{0.93}\right)^{2}}{(111.15)^{2}}} \]

and the graph of the function of the change in fuel volume is visualized together with the actual fuel volume change in
the reservoir as follows:

Figure 6. Graph of changes in fuel volume by $EOM(\theta)$

Based on Figure 6, $EOM(\theta)$ produces a function $y = f(x)$ that fits the Pertalite data better than the EOM result in Figure 5. We then calculate the volume function by integration, and with (12) we obtain:

\[ V_{\theta}(h) = \frac{130.37}{125} \cdot V(h) \]

where $V_{\theta}(h)$ is defined on the interval $0 \le h \le 222.3\sqrt{0.93}$. The $EOM(\theta)$ version of the fuel volume calculation has $V_{max} = 21166.71$ liters.

Comparison of approximate visualization results

The visualization of the results from circle orbits mode, elliptical orbits mode, $EOM(\theta)$, least squares data fitting with $n = 2$, and least squares data fitting with $n = 3$ is as follows:

Figure 7. Comparison graph of approximation method results

The calculated changes in fuel volume as a function of fuel height in the UT are applied to the Pertalite data to obtain the RSS and MSE of each result.

Pertalite data approximation RSS and MSE calculation

The proposed method is compared with the other methods through their RSS and MSE, in liters, in Table 3 below:

Table 3. Pertalite data approximation RSS and MSE calculation

Method          RSS        MSE
$COM$           40390.49   185.28
$EOM$           8529.37    39.67
$LS(n = 2)$     8980.63    40.09
$LS(n = 3)$     7574.51    33.81
$EOM(\theta)$   6415.32    29.84

Based on Table 3, COM has the largest RSS and MSE. EOM has an RSS and MSE slightly below $LS(n = 2)$ but still above $LS(n = 3)$. Controlling the height of the ellipse to find the $\theta$ that minimizes the RSS gives the minimum, an RSS of 6415.32 with an MSE of 29.84. The smallest RSS and MSE values are obtained when using $EOM(\theta)$.
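The height-control step and the two scores in Table 3 can be sketched in a few lines. The sketch rests on two assumptions that are not taken from the source: the helper names are hypothetical, and `fit_theta` uses the textbook one-parameter least-squares solution $\theta = \sum y_i g_i / \sum g_i^{2}$, where $g_i$ is the unit-height half-ellipse of (13). That expression is the stationary point of $E$ for a model that is linear in $\theta$; it is not in general equal to a ratio of plain sums, so its value on Table 1 may differ from the $\theta$ reported in the text.

```python
import math

def eom_shape(x, half_width=111.15, d=0.93):
    """Unit-height half-ellipse g(x) from (13); returns 0 outside its domain."""
    radicand = d - ((x - half_width * math.sqrt(d)) / half_width) ** 2
    return math.sqrt(radicand) if radicand > 0 else 0.0

def fit_theta(xs, ys):
    """Least-squares height for y ~ theta * g(x): sum(y*g) / sum(g*g)."""
    g = [eom_shape(x) for x in xs]
    return sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)

def rss(ys, preds):
    """Residual sum of squares."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds))

def mse(ys, preds):
    """Mean square error: RSS divided by the number of points."""
    return rss(ys, preds) / len(ys)

if __name__ == "__main__":
    # Synthetic check: data generated with height 130 should be recovered.
    xs = list(range(1, 215))
    ys = [130.0 * eom_shape(x) for x in xs]
    theta = fit_theta(xs, ys)
    preds = [theta * eom_shape(x) for x in xs]
    print(theta, rss(ys, preds), mse(ys, preds))
```

With the real data, `xs` and `ys` would be the height and diff columns of Table 1 restricted to the domain of (13).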
The Pertalite data approximation is not compared with the gas station calculation or with cubic spline interpolation, because their RSS and MSE are exactly 0 and they give an unsmooth approximation.

Defined domain interval and maximum volume

In the orbits mode data fitting construction, the domain of fuel height in the UT is reduced, so the function is only defined up to a certain height. The defined height domain and the maximum volume of each approximation method compare as follows:

Table 4. Domain height intervals and maximum volume of each approximation method

Approximation method      Domain height (cm)                 Maximum volume $V_{max}$ (liter)
Gas station calculation   $0 \le h \le 222.3$                21486.00
Circle orbits mode        $0 \le h \le 217.1$                18508.85
Elliptical orbits mode    $0 \le h \le 222.3\sqrt{0.93}$     20296.55
$EOM(\theta)$             $0 \le h \le 222.3\sqrt{0.93}$     21166.04
Least square $n = 2$      $0 \le h \le 222.3$                21248.90
Least square $n = 3$      $0 \le h \le 222.3$                21256.66

Circle orbits mode is only defined up to a height of 217.1 cm, a reduction of 5.2 cm. Elliptical orbits mode is only defined up to a height of $222.3\sqrt{0.93}$ cm, or about 214.38 cm, a reduction of $(222.3 - 222.3\sqrt{0.93})$ cm, or about 7.92 cm. For further research, this has no effect on applications to daily sales data (according to the dispenser) as long as the maximum fuel height in the data is below 214.38 cm, since every approximation method is then defined for all data. The approximation results will be validated by measuring the mean absolute deviation (MAD) based on [14] and the mean absolute percentage error (MAPE) based on [15]. If an approximation has a MAPE below 10%, the approximation method is considered very feasible.
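The two validation measures named above can be sketched as follows. The definitions are assumptions made for illustration: MAD is taken here as the mean of the absolute errors, and MAPE as the mean absolute relative error in percent, skipping observations equal to zero; the exact formulas in [14] and [15] may differ in detail.

```python
def mad(actual, predicted):
    """Mean absolute deviation of the errors (assumed definition)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error in percent; zero observations are skipped."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(terms) / len(terms)

if __name__ == "__main__":
    actual = [100.0, 200.0]
    predicted = [110.0, 180.0]
    print(mad(actual, predicted))   # mean of 10 and 20
    print(mape(actual, predicted))  # mean of 10% and 10%
```

Under the feasibility rule stated in the text, a method would pass when `mape(...)` returns a value below 10.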
Conclusions

Based on the results and discussion, the method that approximates the Pertalite data with the smallest RSS and MSE is $EOM(\theta)$ with $\theta \approx 130.37$, giving an RSS of 6415.32 and an MSE of 29.84. $EOM(\theta)$ also produces a better-fitting half-ellipse function than the other approximation methods compared, namely $COM$, $EOM$, $LS(n = 2)$, and $LS(n = 3)$. Although $EOM(\theta)$ produces a smaller RSS and MSE than the other methods, its height domain is reduced and its maximum volume differs from the gas station calculation. According to the gas station metrology measuring book, the height of the UT is 222.3 cm and its maximum volume is 21486 liters, but $EOM(\theta)$ only covers fuel heights up to about 214.38 cm, and its maximum volume is below the gas station calculation.

The authors hope that this research will be developed further by applying it to other fuel types such as Pertamax and Dexlite. For a more realistic problem, future work can use data on changes in fuel height and volume based on daily sales recorded by the fuel dispenser, whose accuracy must be tested first, and can evaluate errors using MAPE, MAD, and other error measures.

References

[1] Pratama, J.D., Herdiana, R., Hariyanto, S., Application of orbits mode data fitting for dipstick calibration of altitude measuring instruments into volume of fuel oil in the tank compared with cubic spline interpolation, SNAST – National Seminar on Application of Science and Technology, AKPRIND – Yogyakarta, 138–146, 2021.
[2] Pratama, J.D., Herdiana, R., Hariyanto, S., Application of orbits mode data fitting for dipstick calibration.
Journal of Mathematics Education: Judika Education, 4(2), 107–118, 2021.
[3] Urang, Job G., Ebong, Ebong D., et al., A new approach for porosity and permeability prediction from well logs using artificial neural network and curve fitting techniques: a case study of Niger Delta, Nigeria. Journal of Applied Geophysics 183 (2020) 104207, 2020.
[4] Ma, Z. and Ho, K.C., Asymptotically efficient estimators for the fittings of coupled circles and ellipses. Digital Signal Processing 25 (2014) 28–40, 2014.
[5] Kanatani, Kenichi and Rangarajan, Prasanna. Hyper least squares fitting of circles and ellipses. Computational Statistics and Data Analysis 55 (2011) 2197–2208, 2011.
[6] Wang, Ningming, and Wang, Shuhua. A calibration data curve fitting method based on matrix orthogonal triangulation. Procedia Computer Science 174 (2020) 89–94, 2020.
[7] Tambun, M. Saputra, Soedjarwanto, Noer, & Trisanto, Agus. Tank monitoring model design of gas stations using microcontroller-based ultrasonic waves, Electrician – Journal of Electrical Engineering and Technology, 9, no. 2, 2015.
[8] Martin, Julia, de Adana, David Daffos Ruiz, and Asuero, Agustin G. Fitting the model to the data: residue analysis, primer, IntechOpen, doi: 10.5772/68049, 2017.
[9] Sibarani, Maslen. Linear Algebra. Second edition. Jakarta: PT. RajaGrafindo Persada, 2014.
[10] Manuel Melgosa, Rafael Huertas, and Roy S. Berns, "Performance of recent advanced color-difference formulas using the standardized residual sum of squares index," J. Opt. Soc. Am. A 25, 1828–1834, 2008.
[11] Heaton, J. Introduction to Neural Networks for C#. 2nd ed. Heaton Research Inc., 2008.
[12] James, Gareth; Witten, Daniela; Hastie, Trevor; Tibshirani, Rob. An Introduction to Statistical Learning: with Applications in R. Springer. ISBN 978-1071614174. 2021.
[13] Burden, Richard L. & Faires, J. Douglas. Numerical Analysis. 9th edition. Boston: Brooks/Cole Cengage Learning, 2010.
[14] Zhang, P.
An interval mean–average absolute deviation model for multiperiod portfolio selection with risk control and cardinality constraints. Soft Comput 20, 1203–1212. https://doi.org/10.1007/s00500-014-1583-3. 2016.
[15] De Myttenaere, A., Golden, B., Le Grand, B., Rossi, F. "Mean absolute percentage error for regression models", Neurocomputing 2016, arXiv:1605.02541. 2015.
CAUCHY – Jurnal Matematika Murni dan Aplikasi, Volume 6(3) (2020), Pages 149-161
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: August 06, 2020. Reviewed: October 07, 2020. Accepted: November 10, 2020.
DOI: http://dx.doi.org/10.18860/ca.v6i3.10103

The Metric Dimension and Local Metric Dimension of Relative Prime Graph

Inna Kuswandari1, Fatmawati2, Mohammad Imam Utoyo3
Mathematics Department, Faculty of Science and Technology, Universitas Airlangga, Mulyorejo, Surabaya 60115, Indonesia.
Email: ikuswandari94@gmail.com, fatma47unair@gmail.com, m.i.utoyo@fst.unair.ac.id

Abstract

This study aims to determine the metric dimension and local metric dimension of relative prime graphs formed from rings of integers modulo $n$, namely $G_{\mathbb{Z}_n}$ with $n \ge 2$. The vertex set is $\mathbb{Z}_n \setminus \{0\}$, and $uv$ is an edge of $G_{\mathbb{Z}_n}$ if $u$ and $v$ are relatively prime. By finding the pattern of the elements of the resolving set and the local resolving set, it can be shown that the metric dimension and the local metric dimension of $G_{\mathbb{Z}_n}$ are $n - 2 + k$ and $|P_0| - 1 + k$ respectively, where $k$ is the number of vertex groups formed by multiples of the primes $2, 3, \ldots, p_k$ and $|P_0|$ is the cardinality of the set $P_0$. This research can be developed by determining the fractional metric dimension and the local fractional metric dimension, and by studying further properties of graphs related to the rings that form them.

Key words: metric dimension; modulo $n$; relative prime graph; resolving set; rings.
Introduction

A graph $G$ is defined as a non-empty finite set $V(G)$, whose elements are called vertices, together with a (possibly empty) set $E(G)$, whose elements are called edges and are unordered pairs of distinct vertices in $V(G)$ [1]. Any problem whose objects can be described as vertices and edges can be approached with the concepts of graph theory, which is one of the reasons research in graph theory is developing so rapidly today.

The notion of metric dimension was first introduced by [2] and independently by [3] (in [4]). In [3], the concepts of basis and dimension were built on graphs. A basis of a graph is a set of vertices of minimum cardinality such that every vertex of the graph has a distinct representation with respect to it (a resolving set), while the dimension is the number of elements of the basis. Meanwhile, the local metric dimension was introduced by [5] by requiring distinct representations only for pairs of adjacent vertices. Many researchers have studied the metric dimension and local metric dimension of graphs, including graphs resulting from operations (comb, corona, join, etc.).

Research in graph theory is also supported by the expansion of research objects to algebraic systems, namely groups and rings. In [6], the zero divisor graph of a commutative ring is introduced by taking the elements of the ring as vertices, with two vertices adjacent if their product is zero.
By using the definition in [6], further research was carried out in [7,8,9], which found several properties related to the diameter, girth, graph isomorphism, radius, and dominating sets of zero divisor graphs constructed from commutative and non-commutative rings. In addition to the zero divisor graph, research has also been developed on the Jacobson graph formed from commutative rings that have nonzero unit elements [10,11]. Its vertices and edges are defined from the Jacobson radical and the units of the ring. Specifically, in [11], the Jacobson graph was formed from the commutative ring 𝑍3𝑛, and several properties relating the graph to its forming ring were obtained.

The research above inspired the authors to examine the metric dimension and local metric dimension of graphs constructed from commutative rings. As the object of research, we choose the ring of integers modulo $n$, with adjacency between two vertices defined by the two vertices being relatively prime. The definitions of the greatest common divisor and of relatively prime positive integers are given below.

Definition 1 [12]. Let $a$ and $b$ be two positive integers. The positive integer $d$ that satisfies $d = pa + qb$ for some integers $p$ and $q$ is called the greatest common divisor (abbreviated gcd) of $a$ and $b$, denoted by $d = \gcd(a, b)$.

Definition 2 [12]. Two positive integers $a$ and $b$ are relatively prime if $\gcd(a, b) = 1$.

The purpose of this study is to determine the metric dimension and local metric dimension of relative prime graphs. The definitions of metric dimension and local metric dimension, which refer to [13] and [5], are given as follows.

Definition 3 [13]. Suppose that $G$ is a connected graph of order $n$ and $W = \{w_1, w_2, \ldots, w_j\} \subseteq V(G)$, $1 \le j \le n$, is an ordered set of $j$ vertices in $G$. The representation of a vertex $v \in V(G)$ with respect to $W$ is the ordered $j$-tuple $r(v|W) = (d(v, w_1), d(v, w_2), \ldots, d(v, w_j))$, where $d(v, w_i)$ is the distance between vertex $v$ and vertex $w_i$, $1 \le i \le j$. The set $W$ is called a resolving set of $G$ if every vertex in $G$ has a distinct representation with respect to $W$. A resolving set with minimum cardinality is called a basis, and the number of elements of a basis is called the dimension of the graph $G$, denoted by $dim(G)$. Since the dimension of a graph is built on the concept of distance (metric), it is called the metric dimension.

Definition 4 [5]. Let $G$ be a connected graph and $W \subseteq V(G)$. The set $W$ is called a local resolving set of $G$ if every two adjacent vertices in $G$ have different representations with respect to $W$, i.e. $uv \in E(G)$ implies $r(u|W) \ne r(v|W)$. A local resolving set with minimum cardinality is called a local basis, and the number of elements of a local basis is called the local metric dimension of $G$, denoted by $dim_l(G)$.

Methods

This study aims to determine the metric dimension and local metric dimension of the relative prime graph $G_{\mathbb{Z}_n}$. The most important step is to determine the pattern of resolving sets and local resolving sets that have minimum cardinality and therefore form a basis. The main difference between them is that for the local metric dimension, different representations are only required for adjacent vertices.

Results and Discussion

In this section, we discuss the definition of a relative prime graph built from the ring of integers modulo $n$ and explore the basic properties of a relative prime graph related to the characteristics of a graph.
the discussion begins with the definition of a relative prime graph.

definition 5. let ℤ𝑛 be the ring of integers modulo 𝑛, where 𝑛 is a positive integer and 𝑛 ≠ 1. we define a graph 𝐺 with 𝑉(𝐺) = ℤ𝑛 ∖ {0} and 𝐸(𝐺) = {𝑢𝑣 : 𝑢 ≠ 𝑣 and 𝑢 is relatively prime to 𝑣}. the graph 𝐺, formed from the ring of integers modulo 𝑛 with relative primality as the adjacency condition between two vertices, is called the relative prime graph and is denoted by 𝐺ℤ𝑛. the number of vertices of 𝐺ℤ𝑛 is denoted by |𝑉(𝐺ℤ𝑛)| and the number of edges by |𝐸(𝐺ℤ𝑛)|.

example:

figure 1. some relative prime graphs. a: 𝐺ℤ4; b: 𝐺ℤ5; c: 𝐺ℤ7

the graphs in figure 1:
a: 𝐺ℤ4 consists of three vertices, namely 1,2,3, which are pairwise relatively prime. hence vertex 1 is adjacent to 2, vertex 2 is adjacent to 3, and vertex 3 is adjacent to 1.
b: 𝐺ℤ5 consists of four vertices, namely 1,2,3,4. the adjacency among vertices 1,2,3 is the same as in 𝐺ℤ4. furthermore, vertex 4 is relatively prime to 1 and 3 but not to 2. hence vertex 2 is not adjacent to 4.
c: 𝐺ℤ7 consists of six vertices, namely 1,2,3,4,5,6. vertex 1 is relatively prime to vertices 2,3,4,5,6; vertex 2 is relatively prime to vertices 1,3,5; vertex 3 is relatively prime to vertices 1,2,4,5; vertex 4 is relatively prime to vertices 1,3,5; vertex 5 is relatively prime to vertices 1,2,3,4,6; and vertex 6 is relatively prime to vertices 1,5. since the three vertices 2,4,6 are pairwise not relatively prime, no two of them are adjacent. likewise, vertex 3 is not adjacent to 6 because 3 is not relatively prime to 6.

the results of this study begin with an exploration of the basic properties of 𝐺ℤ𝑛 and then determine the metric dimension and local metric dimension of 𝐺ℤ𝑛. based on definition 5, 𝐺ℤ𝑛 has the following basic properties for 𝑛 ≥ 2:
a. the set of vertices of 𝐺ℤ𝑛 is 𝑉(𝐺ℤ𝑛) = {1,2,3,…,𝑛 − 1}, hence |𝑉(𝐺ℤ𝑛)| = 𝑛 − 1.
b. 𝐺ℤ𝑛 is not an empty graph.
c.
𝐺ℤ𝑛 is a trivial graph for 𝑛 = 2 because it consists of only one vertex.
d. 𝐺ℤ𝑛 is a connected graph.
e. 𝐺ℤ𝑛 is a complete graph for 𝑛 = 2,3,4. specifically, for 𝑛 = 3 it is a path graph and for 𝑛 = 4 it is a cycle graph.
f. 𝐺ℤ𝑛 is a regular graph for 𝑛 = 3,4 because every vertex has the same degree.
g. vertex 1 is adjacent to every other vertex in 𝐺ℤ𝑛 for 𝑛 ≥ 3.

based on definition 5, a vertex 𝑢 is adjacent to a vertex 𝑣 if 𝑢 is relatively prime to 𝑣. in 𝐺ℤ𝑛 there are vertices that are adjacent to every other vertex; this category contains vertex 1 and some vertices that are prime numbers. based on this adjacency condition, the vertices of 𝐺ℤ𝑛 are divided into two groups, namely (1) vertices that are adjacent to every other vertex, and (2) vertices that are not adjacent to each other. vertices are non-adjacent when they are not relatively prime, i.e. when they have a common divisor other than 1. the vertices that have a common divisor other than 1 can in turn be divided into multiples of 2, multiples of 3, multiples of 5, multiples of 7, and so on. let 𝐴 = {2,3,5,7,…} = {𝑝1,𝑝2,…,𝑝𝑘} be the ordered set of prime numbers; then the non-adjacent vertices are grouped into multiples of 2, multiples of 3, and so on up to multiples of the prime 𝑝𝑘, where 𝑝𝑘 is the largest prime number satisfying 2𝑝𝑘 ≤ 𝑛 − 1 and 𝑘 is the number of groups (multiples of 2, multiples of 3, etc.). the vertices that are adjacent to every other vertex in 𝐺ℤ𝑛 are placed in the group 𝑝0.

example: in 𝐺ℤ15, 𝑉(𝐺ℤ15) = {1,2,3,…,14}. the vertices of 𝐺ℤ15 are grouped as follows:
the vertices in group 𝑝0 are 1,11,13.
the vertices in the group of multiples of 𝑝1 = 2 are 2,4,6,8,10,12,14.
the vertices in the group of multiples of 𝑝2 = 3 are 3,6,9,12.
the vertices in the group of multiples of 𝑝3 = 5 are 5,10.
the vertices in the group of multiples of 𝑝4 = 7 are 7,14.
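the grouping listed above can be reproduced with a short script. the following is only a minimal sketch in python, not the authors' procedure: the helper names `primes_up_to` and `vertex_groups` are hypothetical, and the group 𝑝0 is computed directly as the set of vertices that are relatively prime to every other vertex.

```python
from math import gcd


def primes_up_to(m):
    """Return all primes p <= m by trial division (adequate for small m)."""
    return [p for p in range(2, m + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def vertex_groups(n):
    """Group the vertices 1..n-1 of the relative prime graph on Z_n.

    Returns (p0, multiples): p0 lists the vertices that are relatively
    prime to every other vertex, and multiples maps each prime p with
    2p <= n - 1 to the list of its multiples among the vertices.
    """
    vertices = range(1, n)
    p0 = [u for u in vertices
          if all(gcd(u, v) == 1 for v in vertices if v != u)]
    multiples = {p: [v for v in vertices if v % p == 0]
                 for p in primes_up_to((n - 1) // 2)}
    return p0, multiples


p0, multiples = vertex_groups(15)
print("p0:", p0)  # p0: [1, 11, 13]
for p, group in multiples.items():
    print("multiples of", p, ":", group)
```

a vertex may appear in several groups of multiples (6 appears under both 2 and 3), so the groups overlap rather than partition the vertex set.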
in this case, there are four groups of multiples, where 7 is the largest prime number that satisfies 2 × 7 ≤ 14. the vertices 1,11,13 are adjacent to every other vertex in 𝐺ℤ15, so they are placed in group 𝑝0. some vertices belong to more than one group of multiples; for example, vertices 6 and 12 are in the groups of multiples of 2 and of multiples of 3. likewise, vertex 10 is in the groups of multiples of 2 and of multiples of 5. furthermore, if 𝑃𝑖 is the set of vertices in the group of multiples of 𝑝𝑖, where 𝑖 = 1,2,…,𝑘, and ⌊𝑥⌋ denotes the largest integer less than or equal to 𝑥, then $|P_i| = \lfloor (n-1)/p_i \rfloor$. in 𝐺ℤ15, $|P_1| = \lfloor 14/2 \rfloor = 7$; $|P_2| = \lfloor 14/3 \rfloor = \lfloor 4.67 \rfloor = 4$; $|P_3| = \lfloor 14/5 \rfloor = \lfloor 2.8 \rfloor = 2$; $|P_4| = \lfloor 14/7 \rfloor = 2$. so, in general, in 𝐺ℤ𝑛 the number of vertices in the group of multiples of 𝑝𝑖 is $\lfloor (n-1)/p_i \rfloor$, while the number of vertices that lie in both the group of multiples of 𝑝𝑖 and the group of multiples of 𝑝𝑗 is $\lfloor (n-1)/(p_i p_j) \rfloor$, where 1 ≤ 𝑖,𝑗 ≤ 𝑘 and 𝑖 ≠ 𝑗. the lemma that expresses the number of edges of 𝐺ℤ𝑛 is given as follows.

lemma 1. $|E(G_{\mathbb{Z}_n})| = \frac{(n-1)(n-2)}{2} - \sum_{i=1}^{k} \binom{\lfloor (n-1)/p_i \rfloor}{2}$, where $\lfloor (n-1)/p_i \rfloor$ is the number of vertices in the group of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, and 2𝑝𝑘 ≤ 𝑛 − 1.

proof. as explained above, the vertices of 𝐺ℤ𝑛 are divided into two major groups, namely the group 𝑝0 and the groups of multiples of 𝑝𝑖. because of the specific properties of each group, the number of edges of 𝐺ℤ𝑛 can be calculated by starting from the number of edges of the complete graph with 𝑛 − 1 vertices and then removing the edges between pairs of vertices that are not relatively prime. the number of edges of the complete graph with 𝑛 − 1 vertices is $\frac{(n-1)(n-2)}{2}$. since an edge is formed by two different vertices, the reduction in edges for each group of multiples of 𝑝𝑖 is $\binom{\lfloor (n-1)/p_i \rfloor}{2}$. thus, $|E(G_{\mathbb{Z}_n})| = \frac{(n-1)(n-2)}{2} - \sum_{i=1}^{k} \binom{\lfloor (n-1)/p_i \rfloor}{2}$. ∎

table 1.
number of edges on graph 𝐺ℤ𝑛

| 𝑛 | number of vertices | groups of vertices | reduction of edges | number of edges |
|---|---|---|---|---|
| 2 | 1 | | | 0 |
| 3 | 2 | 𝑝0 : 1,2 | | $\frac{2(2-1)}{2} = 1$ |
| 4 | 3 | 𝑝0 : 1,2,3 | | $\frac{3(3-1)}{2} = 3$ |
| 5 | 4 | 𝑝0 : 1,3 | | $\frac{4(4-1)}{2} - 1 = 5$ |
| | | 𝑝1 : 2,4 | $\binom{2}{2} = 1$ | |
| 6 | 5 | 𝑝0 : 1,3,5 | | $\frac{5(5-1)}{2} - 1 = 9$ |
| | | 𝑝1 : 2,4 | $\binom{2}{2} = 1$ | |
| 7 | 6 | 𝑝0 : 1,5 | | $\frac{6(6-1)}{2} - 3 - 1 = 11$ |
| | | 𝑝1 : 2,4,6 | $\binom{3}{2} = 3$ | |
| | | 𝑝2 : 3,6 | $\binom{2}{2} = 1$ | |
| 8 | 7 | 𝑝0 : 1,5,7 | | $\frac{7(7-1)}{2} - 3 - 1 = 17$ |
| | | 𝑝1 : 2,4,6 | $\binom{3}{2} = 3$ | |
| | | 𝑝2 : 3,6 | $\binom{2}{2} = 1$ | |
| 9 | 8 | 𝑝0 : 1,5,7 | | $\frac{8(8-1)}{2} - 6 - 1 = 21$ |
| | | 𝑝1 : 2,4,6,8 | $\binom{4}{2} = 6$ | |
| | | 𝑝2 : 3,6 | $\binom{2}{2} = 1$ | |

the degree of a vertex in a graph is the number of vertices adjacent to that vertex. thus, the degree of a vertex 𝑢 ∈ 𝑉(𝐺ℤ𝑛) is determined by its adjacency to the (𝑛 − 2) other vertices, as in observation 2. furthermore, lemma 3 states the minimum and maximum degree of a vertex of 𝐺ℤ𝑛.

observation 2. if deg𝐺ℤ𝑛 (𝑢) denotes the degree of a vertex 𝑢 ∈ 𝑉(𝐺ℤ𝑛), then deg𝐺ℤ𝑛 (𝑢) = |{𝑣 ∈ 𝑉(𝐺ℤ𝑛) : 𝑣 ≠ 𝑢 and 𝑣 is relatively prime to 𝑢}|.

lemma 3. if 𝑢 ∈ 𝑉(𝐺ℤ𝑛), then

$$\begin{cases} \deg_{G_{\mathbb{Z}_n}}(u) = 0, & n = 2 \\ \deg_{G_{\mathbb{Z}_n}}(u) = 1, & n = 3 \\ 2 \le \deg_{G_{\mathbb{Z}_n}}(u) \le n-2, & n \ge 4 \end{cases}$$

proof. let 𝑢 ∈ 𝑉(𝐺ℤ𝑛). for 𝑛 = 2, 𝑢 = 1 and 𝐺ℤ𝑛 is the trivial graph, hence deg𝐺ℤ𝑛 (𝑢) = 0. for 𝑛 = 3, 𝑢 ∈ {1,2}. the number 1 is relatively prime to 2, hence vertex 1 is adjacent to vertex 2. thus, deg𝐺ℤ𝑛 (𝑢) = 1. for 𝑛 ≥ 4, we have |𝑉(𝐺ℤ𝑛)| ≥ 3. to show that the minimum degree of any vertex is 2, we use the following claim: every vertex in 𝐺ℤ𝑛 is adjacent to at least two other vertices. proof of claim: since the adjacency criterion is relative primality and the vertices of 𝐺ℤ𝑛 are natural numbers, it is sufficient to show that any natural number other than 1 is relatively prime to the natural numbers immediately before and after it. let 𝑎 be any natural number with 𝑎 ≠ 1; it will be shown that 𝑎 is relatively prime to 𝑎 − 1 and also to 𝑎 + 1.
based on definitions 1 and 2, the natural number 𝑎 is relatively prime to 𝑎 − 1 if there are integers 𝑝 and 𝑞 such that 1 = 𝑝𝑎 + 𝑞(𝑎 − 1). by choosing 𝑝 = 1 and 𝑞 = −1, the equality holds. hence 𝑎 is relatively prime to 𝑎 − 1. in a similar way, it can be shown that 𝑎 is relatively prime to 𝑎 + 1. thus, 𝑎 is adjacent to 𝑎 − 1 and also to 𝑎 + 1. for the two boundary vertices the same bound holds: vertex 1 is adjacent to all 𝑛 − 2 ≥ 2 other vertices, and vertex 𝑛 − 1 is adjacent to 𝑛 − 2 and to 1. therefore, for any 𝑢 ∈ 𝑉(𝐺ℤ𝑛), there are at least two vertices adjacent to 𝑢, so deg𝐺ℤ𝑛 (𝑢) ≥ 2. on the other hand, by the property of the group 𝑝0, each of whose vertices is adjacent to every other vertex in 𝐺ℤ𝑛, for any 𝑢 ∈ 𝑃0 we have deg𝐺ℤ𝑛 (𝑢) = |𝑉(𝐺ℤ𝑛)| − 1 = 𝑛 − 1 − 1 = 𝑛 − 2, and this is the maximum degree of any vertex in 𝐺ℤ𝑛. thus, it is proven that 2 ≤ deg𝐺ℤ𝑛 (𝑢) ≤ 𝑛 − 2 for 𝑛 ≥ 4, and the whole lemma is proven. ∎

for example, in the graph 𝐺ℤ7 we have 𝑉(𝐺ℤ7) = {1,2,3,4,5,6}. the degree of each vertex is: deg𝐺ℤ7 (1) = |{2,3,4,5,6}| = 5; deg𝐺ℤ7 (2) = |{1,3,5}| = 3; deg𝐺ℤ7 (3) = |{1,2,4,5}| = 4; deg𝐺ℤ7 (4) = |{1,3,5}| = 3; deg𝐺ℤ7 (5) = |{1,2,3,4,6}| = 5; deg𝐺ℤ7 (6) = |{1,5}| = 2. it can be seen that 2 ≤ deg𝐺ℤ7 (𝑢) ≤ 5 for every 𝑢 ∈ 𝑉(𝐺ℤ7).

in 𝐺ℤ𝑛, vertex 1 is adjacent to every other vertex, and several other vertices share this property. the vertices with this property form a complete subgraph of 𝐺ℤ𝑛, as shown in theorem 4.

theorem 4. there is a complete subgraph formed by vertices of 𝐺ℤ𝑛.

proof. a complete graph is a graph in which every two vertices are adjacent. by the properties of the vertices of 𝐺ℤ𝑛, there are vertices that are adjacent to every other vertex, namely the vertices in the group 𝑝0. these vertices form the complete subgraph 𝐾𝑚, where 𝑚 = |{𝑢 ∈ 𝑉(𝐺ℤ𝑛) : gcd(𝑢,𝑣) = 1, ∀𝑣 ∈ 𝑉(𝐺ℤ𝑛), 𝑣 ≠ 𝑢}|. ∎

from theorem 4, the vertices of 𝐺ℤ𝑛 that form the complete subgraph are the vertices in group 𝑝0, so 𝑚 = |𝑃0|. on the other hand, the vertices in the group of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, share the property that they are not adjacent to each other.
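the degree example for 𝐺ℤ7 and the clique formed by the group 𝑝0 can be checked numerically. the following is a small illustrative sketch in python; the function names `neighbours` and `group_p0` are assumptions introduced here, and the group 𝑝0 is identified as the set of vertices of degree 𝑛 − 2.

```python
from itertools import combinations
from math import gcd


def neighbours(n, u):
    """Vertices of the relative prime graph on Z_n that are adjacent to u."""
    return [v for v in range(1, n) if v != u and gcd(u, v) == 1]


def group_p0(n):
    """Vertices adjacent to every other vertex, i.e. of degree n - 2."""
    return [u for u in range(1, n) if len(neighbours(n, u)) == n - 2]


# Degrees in the relative prime graph on Z_7.
degrees = {u: len(neighbours(7, u)) for u in range(1, 7)}
print(degrees)  # {1: 5, 2: 3, 3: 4, 4: 3, 5: 5, 6: 2}

# Group p0 of the relative prime graph on Z_8 and a check that it is a clique.
p0 = group_p0(8)
is_clique = all(gcd(u, v) == 1 for u, v in combinations(p0, 2))
print(p0, is_clique)  # [1, 5, 7] True
```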
this suggests that the vertices in the groups of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, form a multipartite subgraph of 𝐺ℤ𝑛, as stated in theorem 5.

theorem 5. for 𝑛 ≥ 5, 𝐺ℤ𝑛 is isomorphic to a graph 𝐾𝑚 + 𝐻, where 𝐻 is a 𝑘-partite graph, 𝐾𝑚 is a complete graph with 𝑚 vertices, and 𝑚 = |𝑃0|.

proof. based on theorem 4, the vertices in the group 𝑝0 form the complete subgraph 𝐾𝑚 of 𝐺ℤ𝑛. meanwhile, the vertices in the group of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, are not adjacent to each other for each 𝑖, so these vertices can be divided into 𝑘 partitions, each consisting of vertices from the same group of multiples. within one partition there are no adjacent vertices, by the property of the vertices in the group of multiples of 𝑝𝑖. adjacency between vertices of different partitions is determined by whether the vertices are relatively prime. suppose the 𝑘 partitions, together with the adjacencies between their vertices, are called graph 𝐻; then 𝐻 is a 𝑘-partite graph formed from the vertices in the groups of multiples of 𝑝𝑖. every vertex in the group 𝑝0 is adjacent to every other vertex in 𝐺ℤ𝑛, in particular to those in the groups of multiples of 𝑝𝑖. this means that every vertex in 𝐾𝑚 is adjacent to every vertex in 𝐻. therefore, there is a bijective function 𝑓 from 𝑉(𝐺ℤ𝑛) to 𝑉(𝐾𝑚 + 𝐻) which preserves the adjacency between vertices of 𝐺ℤ𝑛. thus, 𝐺ℤ𝑛 ≅ 𝐾𝑚 + 𝐻. ∎

theorem 5 applies for 𝑛 ≥ 5; specifically, for 𝑛 = 2,3,4, 𝐺ℤ𝑛 ≅ 𝐾𝑛−1 (the complete graph with 𝑛 − 1 vertices). for example, consider 𝐺ℤ8, where 𝑉(𝐺ℤ8) = {1,2,3,4,5,6,7}.

figure 2. isomorphism of 𝐺ℤ8 with 𝐾3 + 𝐻2,2

figure 2 illustrates the isomorphism of 𝐺ℤ8 with 𝐾3 + 𝐻2,2, where 𝐻2,2 is a 2-partite graph.
in 𝐺ℤ8, vertices 1,5,7 are adjacent to every other vertex, so 𝑚 = 3. meanwhile, the vertices in the group of multiples of 2 are 2,4,6; the vertices in the group of multiples of 3 are 3,6; and there is no group of multiples of 5, so there are 2 partitions. in the example above, the vertices in the first partition are 2 and 4, while the vertices in the second partition are 3 and 6. since the first partition has 2 vertices and the second partition has 2 vertices, the graph is written 𝐻2,2. the choice of 𝐻2,2 as the 2-partite graph is not unique. alternatively, if three vertices are placed in the first partition (i.e. vertices 2,4,6) and one vertex in the second partition (i.e. vertex 3), then the 2-partite graph is written 𝐻3,1. thus, 𝐺ℤ8 ≅ 𝐾3 + 𝐻2,2 ≅ 𝐾3 + 𝐻3,1.

next, we determine the metric dimension and the local metric dimension of 𝐺ℤ𝑛 for 𝑛 ≥ 2. the metric dimension is calculated using the concept of the distance between two vertices, namely the length of a shortest path connecting the two vertices.

theorem 6. if 𝑢,𝑣 ∈ 𝑉(𝐺ℤ𝑛), then 𝑑(𝑢,𝑣) ≤ 2.

proof. let 𝑢,𝑣 ∈ 𝑉(𝐺ℤ𝑛). based on theorem 5, there are three possibilities for 𝑢 and 𝑣, namely (i) 𝑢,𝑣 ∈ 𝑉(𝐾𝑚), (ii) 𝑢,𝑣 ∈ 𝑉(𝐻), and (iii) 𝑢 ∈ 𝑉(𝐾𝑚), 𝑣 ∈ 𝑉(𝐻).
(i) suppose 𝑢,𝑣 ∈ 𝑉(𝐾𝑚). since 𝐾𝑚 is a complete graph, the distance between two different vertices is 1, so that 𝑑(𝑢,𝑣) = 0 if 𝑢 = 𝑣 and 𝑑(𝑢,𝑣) = 1 if 𝑢 ≠ 𝑣.
(ii) suppose 𝑢,𝑣 ∈ 𝑉(𝐻). there are two cases: 𝑢 and 𝑣 are in the same partition, or 𝑢 and 𝑣 are in different partitions.
a. if 𝑢 and 𝑣 are distinct vertices in the same partition, then there is 𝑧 ∈ 𝑉(𝐾𝑚) such that 𝑑(𝑢,𝑧) = 1 and 𝑑(𝑧,𝑣) = 1. therefore, 𝑑(𝑢,𝑣) = 2. so 𝑑(𝑢,𝑣) = 0 if 𝑢 = 𝑣 and 𝑑(𝑢,𝑣) = 2 if 𝑢 ≠ 𝑣.
b. if 𝑢 and 𝑣 are vertices in different partitions, then the distance is 1 if 𝑢 is adjacent to 𝑣.
for a vertex 𝑢 which is not adjacent to 𝑣, there is 𝑥 ∈ 𝑉(𝐾𝑚) such that 𝑑(𝑢,𝑥) = 1 and 𝑑(𝑥,𝑣) = 1. therefore 𝑑(𝑢,𝑣) = 2. thus 𝑑(𝑢,𝑣) = 1 if gcd(𝑢,𝑣) = 1 and 𝑑(𝑢,𝑣) = 2 if gcd(𝑢,𝑣) ≠ 1.
(iii) suppose 𝑢 ∈ 𝑉(𝐾𝑚), 𝑣 ∈ 𝑉(𝐻); then 𝑑(𝑢,𝑣) = 1 because every vertex in 𝐾𝑚 is adjacent to all vertices in 𝐻.
from possibilities (i), (ii), and (iii) it is proven that 𝑑(𝑢,𝑣) ≤ 2. ∎

corollary 7. let 𝑢,𝑣 ∈ 𝑉(𝐺ℤ𝑛) with 𝑢 ≠ 𝑣. then 𝑑(𝑢,𝑣) = 2 if and only if 𝑢 is not relatively prime to 𝑣.

proof. this is a direct result of theorem 6.

the vertices in the group 𝑝0, apart from forming a complete subgraph of 𝐺ℤ𝑛, are also adjacent to all other vertices in 𝐺ℤ𝑛. because of this specific property, only one vertex of this group may be left out of a resolving set. on the other hand, the vertices in the groups of multiples of 𝑝𝑖 also have a specific property. since each of the two kinds of groups in 𝐺ℤ𝑛 has its own specific property, it is impossible for a metric basis of 𝐺ℤ𝑛 to consist only of vertices in the group 𝑝0 or only of vertices in the groups of multiples of 𝑝𝑖. this condition is illustrated in theorem 8 below by observing the vertices in the group 𝑝0.

theorem 8. no subset of 𝑉(𝐾𝑚) is a resolving set.

proof. referring to theorem 4, the vertices in group 𝑝0 form a complete subgraph with 𝑚 vertices, whose metric dimension is 𝑚 − 1. however, the representation of the remaining vertex of the group 𝑝0 with respect to these 𝑚 − 1 vertices is the same as the representation of any vertex outside the group 𝑝0 with respect to the same 𝑚 − 1 vertices. as a result, the set consisting of 𝑚 − 1 vertices is not a resolving set. the same holds for sets with fewer than 𝑚 − 1 vertices. based on theorem 5, which states 𝐺ℤ𝑛 ≅ 𝐾𝑚 + 𝐻, each vertex in 𝐾𝑚 is adjacent to every vertex in 𝐻, which means that the distance between them is 1. for the same reason, no set of vertices taken from 𝐾𝑚 is a resolving set. hence no subset of 𝑉(𝐾𝑚) is a resolving set. ∎

based on the proof of theorem 8, a resolving set cannot contain only vertices in 𝐾𝑚.
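the distance statements of theorem 6 and corollary 7 can be checked by computing shortest paths directly. the sketch below uses a plain breadth-first search in python; the function names are illustrative assumptions, and the check is run only for small values of 𝑛, so it is a sanity check rather than a proof.

```python
from collections import deque
from math import gcd


def distances_from(n, source):
    """Breadth-first search distances from source in the relative prime graph on Z_n."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(1, n):
            # v is a new neighbour of u when it is unvisited and coprime to u
            if v not in dist and gcd(u, v) == 1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def check_corollary_7(n):
    """True if d(u, v) is 1 for coprime pairs and 2 otherwise, for all u != v."""
    for u in range(1, n):
        dist = distances_from(n, u)
        for v in range(1, n):
            if u == v:
                continue
            expected = 1 if gcd(u, v) == 1 else 2
            if dist[v] != expected:
                return False
    return True


print(all(check_corollary_7(n) for n in range(3, 40)))  # True
```

the search never needs more than two levels, because vertex 1 is adjacent to every other vertex and therefore gives a path of length 2 between any two vertices that are not relatively prime.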
on the other hand, the vertices in each of the groups of multiples of 𝑝1,𝑝2,…,𝑝𝑘 share the property that they are not adjacent to each other. the representations of two different vertices of a given group of multiples with respect to the other vertices in the same group and with respect to a different group of multiples are presented in lemma 9 and lemma 10.

lemma 9. if 𝑢,𝑣 ∈ 𝑃𝑖, where 𝑢 ≠ 𝑣, then 𝑟(𝑢|𝑃𝑖 ∖ {𝑢,𝑣}) = 𝑟(𝑣|𝑃𝑖 ∖ {𝑢,𝑣}).

proof. the vertices in the group of multiples of 𝑝𝑖 share the characteristic that they are not adjacent. based on the proof of theorem 6 (ii) a, the distance between different vertices of the group is 2. thus, 𝑟(𝑢|𝑃𝑖 ∖ {𝑢,𝑣}) = (2,2,…,2) and 𝑟(𝑣|𝑃𝑖 ∖ {𝑢,𝑣}) = (2,2,…,2). therefore, 𝑟(𝑢|𝑃𝑖 ∖ {𝑢,𝑣}) = 𝑟(𝑣|𝑃𝑖 ∖ {𝑢,𝑣}). ∎

lemma 10. if 𝑢,𝑣 ∈ 𝑃𝑗1, then 𝑟(𝑢|𝑃𝑗2) = 𝑟(𝑣|𝑃𝑗2), where 1 ≤ 𝑗1, 𝑗2 ≤ 𝑘 and 𝑢,𝑣 ∉ 𝑃𝑗1 ∩ 𝑃𝑗2.

proof. based on the proof of theorem 6 (ii) b, the distance between two vertices in different partitions is 1 or 2. furthermore, 𝑟(𝑢|𝑃𝑗2) = (𝑑(𝑢,𝑝𝑗2),𝑑(𝑢,2𝑝𝑗2),𝑑(𝑢,3𝑝𝑗2),…,𝑑(𝑢,𝑡𝑝𝑗2)) and 𝑟(𝑣|𝑃𝑗2) = (𝑑(𝑣,𝑝𝑗2),𝑑(𝑣,2𝑝𝑗2),𝑑(𝑣,3𝑝𝑗2),…,𝑑(𝑣,𝑡𝑝𝑗2)), where 𝑡𝑝𝑗2 ≤ 𝑛 − 1. the distance between two vertices in different groups of multiples is 1 if the two vertices are relatively prime and 2 if they are not. since the groups of multiples of 𝑝𝑗1 and of 𝑝𝑗2 have similar properties, namely that the vertices within each group are not adjacent, and 𝑢 and 𝑣 are vertices in the same group of multiples, it follows that 𝑟(𝑢|𝑃𝑗2) = 𝑟(𝑣|𝑃𝑗2). ∎

based on lemma 9 and lemma 10, the vertices in each group of multiples of 𝑝𝑖 become elements of the resolving set, leaving out only one vertex per group. this also applies to the vertices in the group 𝑝0, which leaves out only one vertex. let 𝑎𝑖, 1 ≤ 𝑖 ≤ 𝑘, be the vertex left out of the group of multiples of 𝑝𝑖, and let 𝑎0 be the vertex left out of the group 𝑝0. the metric dimension of 𝐺ℤ𝑛 is then given as follows.
theorem 11. 𝑑𝑖𝑚(𝐺ℤ𝑛) = 𝑛 − 𝑘 − 2, where 𝑘 is the number of groups of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘.

proof. let 𝑊 = 𝑄0 ∪ 𝑄1 ∪ 𝑄2 ∪ … ∪ 𝑄𝑘, where
𝑄0 = 𝑃0 ∖ {𝑎0}, with 𝑎0 an arbitrary vertex left out of the group 𝑝0,
𝑄1 = 𝑃1 ∖ {𝑎1}, with 𝑎1 an arbitrary vertex left out of the group of multiples of 𝑝1,
𝑄2 = 𝑃2 ∖ {𝑎2}, with 𝑎2 an arbitrary vertex left out of the group of multiples of 𝑝2,
⋮
𝑄𝑘 = 𝑃𝑘 ∖ {𝑎𝑘}, with 𝑎𝑘 an arbitrary vertex left out of the group of multiples of 𝑝𝑘.
the representation of every vertex in 𝑉(𝐺ℤ𝑛) ∖ 𝑊 with respect to 𝑊 is:

$$r(a_0|W) = (\underbrace{1,1,\dots,1}_{|P_0|-1}, \underbrace{1,1,\dots,1}_{|P_1|-1}, \underbrace{1,1,\dots,1}_{|P_2|-1}, \dots, \underbrace{1,1,\dots,1}_{|P_k|-1})$$
$$r(a_1|W) = (\underbrace{1,1,\dots,1}_{|P_0|-1}, \underbrace{2,2,\dots,2}_{|P_1|-1}, d(a_1,Q_2), \dots, d(a_1,Q_k))$$
$$r(a_2|W) = (\underbrace{1,1,\dots,1}_{|P_0|-1}, d(a_2,Q_1), \underbrace{2,2,\dots,2}_{|P_2|-1}, \dots, d(a_2,Q_k))$$
$$\vdots$$
$$r(a_k|W) = (\underbrace{1,1,\dots,1}_{|P_0|-1}, d(a_k,Q_1), d(a_k,Q_2), \dots, \underbrace{2,2,\dots,2}_{|P_k|-1})$$

where 𝑑(𝑎𝑖,𝑄𝑗) = 1 if 𝑎𝑖 ∉ 𝑃𝑗 and 𝑑(𝑎𝑖,𝑄𝑗) = 2 if 𝑎𝑖 ∈ 𝑃𝑗, for 1 ≤ 𝑖,𝑗 ≤ 𝑘. it appears that the representation of every vertex with respect to 𝑊 is different, so 𝑊 is a resolving set.

next, it will be shown that the cardinality of 𝑊 is minimal. suppose that a set 𝑋 is taken whose cardinality is one less than that of 𝑊, that is, |𝑋| = |𝑊| − 1. there are three possibilities for the elements of 𝑋, namely (i) all vertices in group 𝑝0 are elements of 𝑋; (ii) all vertices in the groups of multiples of 𝑝𝑖 are elements of 𝑋; and (iii) all vertices of the group of multiples of 𝑝𝑗 (for a certain 𝑗), 1 ≤ 𝑗 ≤ 𝑘, are elements of 𝑋.
(i) if all vertices in the group 𝑝0 are elements of 𝑋, then 𝑘 + 1 vertices of 𝐺ℤ𝑛 must be left out from the 𝑘 groups of multiples of 𝑝𝑖.
a. suppose that 𝑘 − 1 groups each leave out one vertex; then two vertices must be left out by the group of multiples of 𝑝𝑗 (for a certain 𝑗), 1 ≤ 𝑗 ≤ 𝑘.
due to the similar properties of the vertices in the group of multiples of 𝑝𝑗, the representations of the two vertices with respect to 𝑋 will be the same. thus 𝑋 is not a resolving set.
b. suppose that all vertices in 𝑘 − 1 groups are elements of 𝑋; then there is one particular group, namely the group of multiples of 𝑝𝑗 (for a certain 𝑗), 1 ≤ 𝑗 ≤ 𝑘, that leaves out 𝑘 + 1 vertices. for the same reason as in (i)a, these 𝑘 + 1 vertices will have the same representation with respect to 𝑋, because the vertices in the group of multiples of 𝑝𝑗 are mutually non-adjacent. so 𝑋 is not a resolving set.
(ii) if all the vertices in the groups of multiples of 𝑝𝑖 are elements of 𝑋, then |𝑋| = |𝑊| − 1 + 𝑘 = (𝑛 − 2 − 𝑘) − 1 + 𝑘 = 𝑛 − 3 ≥ 𝑛 − 2 − 𝑘 for 𝑘 ≥ 1. equality holds only for 𝑘 = 1. so, for 𝑘 ≥ 2, |𝑋| > |𝑊|. this contradicts the cardinality of the set 𝑋.
(iii) if all vertices in one group of multiples, namely that of 𝑝𝑗 with 1 ≤ 𝑗 ≤ 𝑘, are elements of 𝑋, then 𝑘 + 1 vertices of 𝐺ℤ𝑛 must be left out from the remaining 𝑘 − 1 groups of multiples of 𝑝𝑖 (other than 𝑝𝑗) and from the group 𝑝0.
a. suppose that all vertices in the group 𝑝0 are elements of 𝑋; then 𝑘 vertices must be left out from the 𝑘 − 1 groups of multiples of 𝑝𝑖. this is similar to case (i).
b. suppose all vertices in the groups of multiples of 𝑝𝑖 (other than 𝑝𝑗) are elements of 𝑋; this means that all vertices in the groups of multiples of 𝑝𝑖 are elements of 𝑋. this is similar to case (ii).
from all of the above possibilities it can be concluded that 𝑋 is not a resolving set. if all vertices in one group are selected as elements of 𝑋, then 𝑋 is not a resolving set. likewise, if the group 𝑝0 or a group of multiples of 𝑝𝑖 leaves out more than one vertex, then 𝑋 is not a resolving set. since |𝑋| = |𝑊| − 1 and 𝑊 is a resolving set, it follows that 𝑊 is a resolving set with minimal cardinality, that is, a metric basis of 𝐺ℤ𝑛.
since each group leaves out one vertex and the number of groups is 𝑘 + 1, the metric dimension of 𝐺ℤ𝑛 is (𝑛 − 1) − (𝑘 + 1) = 𝑛 − 𝑘 − 2. it is proven that 𝑑𝑖𝑚(𝐺ℤ𝑛) = 𝑛 − 𝑘 − 2. ∎

based on theorem 11, a metric basis of 𝐺ℤ𝑛 consists of a combination of vertices in the group 𝑝0 and in the groups of multiples of 𝑝𝑖, with each group leaving out only one vertex. the final part of this research is to determine the local metric dimension of 𝐺ℤ𝑛 by first determining a local resolving set. in this case, vertex representations may coincide as long as the vertices are not adjacent.

theorem 12. 𝑑𝑖𝑚𝑙(𝐺ℤ𝑛) = |𝑃0| + 𝑘 − 1, where 𝑘 is the number of groups of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, and |𝑃0| is the cardinality of the set 𝑃0.

proof. based on the property of each group of multiples of 𝑝𝑖, 𝑖 = 1,2,…,𝑘, whose vertices are not mutually adjacent, the property of the group 𝑝0 that all of its elements are adjacent to every other vertex, and the definition of the local metric dimension, we choose the set 𝑊𝑙 = {1,𝑝01,…,𝑝0(𝑚−2),𝑝11,𝑝21,…,𝑝𝑘1}, where 1,𝑝01,…,𝑝0(𝑚−2) are 𝑚 − 1 vertices in the group 𝑝0, 𝑝11 is one vertex in the group of multiples of 𝑝1, 𝑝21 is one vertex in the group of multiples of 𝑝2, and 𝑝𝑘1 is one vertex in the group of multiples of 𝑝𝑘. the representations of the vertices of 𝐺ℤ𝑛 with respect to 𝑊𝑙 are:
𝑟(1|𝑊𝑙) = (0,1,…,1,1,1,…,1);
𝑟(𝑝01|𝑊𝑙) = (1,0,…,1,1,1,…,1);
𝑟(𝑝0(𝑚−2)|𝑊𝑙) = (1,1,…,0,1,1,…,1);
𝑟(𝑝11|𝑊𝑙) = (1,1,…,1,0,1,…,1);
𝑟(𝑝21|𝑊𝑙) = (1,1,…,1,1,0,…,1);
𝑟(𝑝𝑘1|𝑊𝑙) = (1,1,…,1,1,1,…,0);
𝑟(𝑝02|𝑊𝑙) = (1,1,…,1,1,1,…,1);
𝑟(𝑝03|𝑊𝑙) = (1,1,…,1,1,1,…,1);
𝑟(𝑝12|𝑊𝑙) = (1,1,…,1,2,2,…,1);
𝑟(𝑝13|𝑊𝑙) = (1,1,…,1,2,2,…,1);
𝑟(𝑝14|𝑊𝑙) = (1,1,…,1,2,2,…,1);
𝑟(𝑝1𝑠|𝑊𝑙) = (1,1,…,1,2,2,…,1).
it appears that 𝑟(𝑝12|𝑊𝑙) = 𝑟(𝑝13|𝑊𝑙) = 𝑟(𝑝14|𝑊𝑙) = 𝑟(𝑝1𝑠|𝑊𝑙), but 𝑝12,𝑝13,𝑝14,𝑝1𝑠 are vertices in the group of multiples of 𝑝1, which are not mutually adjacent.
in the concept of local metric dimensions, the above still meets the criteria, meaning that 𝑊𝑙 is a local resolving set. next, it will be shown that the cardinality of 𝑊𝑙 is minimal. suppose any set 𝑋𝑙 whose cardinality is reduced to one from the set 𝑊𝑙, i.e. |𝑋𝑙| = |𝑊𝑙| − 1. there are three possibilities for element of 𝑋𝑙, namely (i) all vertices in group 𝑝0 become element of 𝑋𝑙; (ii) at least one vertex from each group of multiples 𝑝𝑖 become element of 𝑋𝑙; and (iii) the group 𝑝0 leaves more than one vertex. (i) suppose that all vertices in the group 𝑝0 are members of 𝑋𝑙, then there are 𝑘 − 2 vertices from the group of multiple 𝑝𝑖 as many as 𝑘 that can be element of 𝑋𝑙. assuming at least one vertex in each group of multiple 𝑝𝑖 becomes a element of 𝑋𝑙, it means that there are at least two groups whose vertices are not represented as element of 𝑋𝑙. it is sufficient to show that there are two vertices from two different groups, these two vertices are adjacent and have the same representation respect to 𝑋𝑙. suppose that the two groups not represented in the element of the set 𝑋𝑙 are the group of multiples 𝑝𝑗1 and 𝑝𝑗2 where 1 ≤ 𝑗1, 𝑗2 ≤ 𝑘. vertex 𝑝𝑗1 is adjacent to 𝑝𝑗2 because both are prime numbers and are the first element in their respective group of multiples. in addition, vertices 𝑝𝑗1 and 𝑝𝑗2 are adjacent to all vertices in the group 𝑝0, so that 𝑟(𝑝𝑗1|𝑋𝑙) = 𝑟(𝑝𝑗2|𝑋𝑙). consequently, 𝑋𝑙 is not a resolving set. (ii) suppose that at least one vertex from each group of multiples 𝑝𝑖 is a element of 𝑋𝑙, then there are at least two vertices in the group 𝑝0 that are not element of 𝑋𝑙. the two vertices in the group 𝑝0 will have the same representation respect to 𝑋𝑙, because the two vertices in the group 𝑝0 are adjacent to every vertex in 𝐺ℤ𝑛. so 𝑋𝑙 is not a resolving set. (iii) suppose that the group 𝑝0 leaves more than one vertex, then a. 
if the group 𝑝0 leaves two vertices, then whatever the condition for selecting vertices in the group of multiples 𝑝𝑖, the remaining two vertices in the group 𝑝0 will have the same representation respect to 𝑋𝑙. consequently, 𝑋𝑙 is not a resolving set. b. if the group 𝑝0 leaves more than two vertices, then there is a group of multiples 𝑝𝑖 where all the vertices are not 𝑋𝑙 element. in this case, the remaining vertices in the group 𝑝0 will have the same representation of 𝑋𝑙 regardless of the vertex conditions in the group of multiples 𝑝𝑖. as a result, 𝑋𝑙 is not a resolving set. from all of the above possibilities, it can be concluded that 𝑋𝑙 is not a local resolving set. if there are more than one vertex left in group 𝑝0, then the representation of the remaining vertices respect to 𝑋𝑙 will be the same. likewise, if there is one or more groups of multiples 𝑝𝑖 that are not represented in the element of the set 𝑋𝑙, it can always be found the vertices of the group of multiples 𝑝𝑖 which have the same representation respect to set 𝑋𝑙 eventhough they are adjacent. since |𝑋𝑙| = |𝑊𝑙| − 1 and 𝑊𝑙 are local resolving set, it can be concluded that 𝑊𝑙 is a local resolving set with minimal cardinality or local metric bases of 𝐺ℤ𝑛. the cardinality of the set 𝑊𝑙 is the local metric dimension of 𝐺ℤ𝑛. it is proven that 𝑑𝑖𝑚𝑙(𝐺𝑍𝑛) = |𝑃0|− 1 + 𝑘.  based on theorem 12, the local metric bases consists of vertices in the group 𝑝0 and the group of multiples 𝑝𝑖, provided that the group 𝑝0 leaves only one vertex while in each group of multiples 𝑝𝑖 are represented by only one vertex. the metric dimension and local metric dimension of relative prime graph fatmawati 160 example: suppose given 𝐺ℤ17, where 𝑉(𝐺ℤ17) = {1,2,3,…, 16}. it will determine the metric dimension and the local metric dimension of 𝐺ℤ17 and some metric bases and local metric bases. firstly, the vertices of 𝐺ℤ17 were grouped as follows: group 𝑝0:1,11,13 then |𝑃0| = 3. group of multiple 2 are 2,4,6,8,10,12,14,16. 
group of multiple 3 are 3,6,9,12,15. group of multiple 5 are 5,10,15. group of multiple 7 are 7,14. there are four groups of multiples, namely multiples of 2, multiples of 3, multiples of 5, and multiples of 7, so 𝑘 = 4.

(1) 𝑑𝑖𝑚(𝐺ℤ17) = 17 − 4 − 2 = 11. some of the metric bases of 𝐺ℤ17 are:
𝑊1 = {1,11,2,3,4,6,8,10,12,14,15}, then 𝑟(5|𝑊1) = (1,1,1,1,1,1,1,2,1,1,2); 𝑟(7|𝑊1) = (1,1,1,1,1,1,1,1,1,2,1); 𝑟(9|𝑊1) = (1,1,1,2,1,2,1,1,2,1,2); 𝑟(13|𝑊1) = (1,1,1,1,1,1,1,1,1,1,1); 𝑟(16|𝑊1) = (1,1,2,1,2,2,2,2,2,2,1).
𝑊2 = {1,13,2,3,4,6,10,12,14,15,16}, then 𝑟(5|𝑊2) = (1,1,1,1,1,1,2,1,1,2,1); 𝑟(7|𝑊2) = (1,1,1,1,1,1,1,1,2,1,1); 𝑟(8|𝑊2) = (1,1,2,1,2,2,2,2,2,1,2); 𝑟(9|𝑊2) = (1,1,1,2,1,2,1,2,1,2,1); 𝑟(11|𝑊2) = (1,1,1,1,1,1,1,1,1,1,1).
𝑊3 = {11,13,2,4,6,9,10,12,14,15,16}, then 𝑟(1|𝑊3) = (1,1,1,1,1,1,1,1,1,1,1); 𝑟(3|𝑊3) = (1,1,1,1,2,2,1,2,1,2,1); 𝑟(5|𝑊3) = (1,1,1,1,1,1,2,1,1,2,1); 𝑟(7|𝑊3) = (1,1,1,1,1,1,1,1,2,1,1); 𝑟(8|𝑊3) = (1,1,2,2,2,1,2,2,2,1,2).

(2) 𝑑𝑖𝑚𝑙(𝐺ℤ17) = 3 − 1 + 4 = 6. some of the local metric bases of 𝐺ℤ17 are:
𝑊𝑙1 = {1,11,2,3,5,7}, then 𝑟(4|𝑊𝑙1) = (1,1,2,1,1,1); 𝑟(6|𝑊𝑙1) = (1,1,2,2,1,1); 𝑟(8|𝑊𝑙1) = (1,1,2,1,1,1); 𝑟(9|𝑊𝑙1) = (1,1,1,2,1,1); 𝑟(10|𝑊𝑙1) = (1,1,2,1,2,1); 𝑟(12|𝑊𝑙1) = (1,1,2,2,1,1); 𝑟(13|𝑊𝑙1) = (1,1,1,1,1,1); 𝑟(14|𝑊𝑙1) = (1,1,2,1,1,2); 𝑟(15|𝑊𝑙1) = (1,1,1,2,2,1); 𝑟(16|𝑊𝑙1) = (1,1,2,1,1,1). it appears that 𝑟(4|𝑊𝑙1) = 𝑟(8|𝑊𝑙1) and 𝑟(6|𝑊𝑙1) = 𝑟(12|𝑊𝑙1), but vertex 4 is not adjacent to 8 and vertex 6 is not adjacent to 12.
𝑊𝑙2 = {1,13,4,9,5,7}, then 𝑟(2|𝑊𝑙2) = (1,1,2,1,1,1); 𝑟(3|𝑊𝑙2) = (1,1,1,2,1,1); 𝑟(6|𝑊𝑙2) = (1,1,2,2,1,1); 𝑟(8|𝑊𝑙2) = (1,1,2,1,1,1); 𝑟(10|𝑊𝑙2) = (1,1,2,1,2,1); 𝑟(11|𝑊𝑙2) = (1,1,1,1,1,1); 𝑟(12|𝑊𝑙2) = (1,1,2,2,1,1); 𝑟(14|𝑊𝑙2) = (1,1,2,1,1,2); 𝑟(15|𝑊𝑙2) = (1,1,1,2,2,1); 𝑟(16|𝑊𝑙2) = (1,1,2,1,1,1). it appears that 𝑟(2|𝑊𝑙2) = 𝑟(8|𝑊𝑙2) and 𝑟(6|𝑊𝑙2) = 𝑟(12|𝑊𝑙2), but vertex 2 is not adjacent to 8 and vertex 6 is not adjacent to 12.
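the representations listed so far for 𝐺ℤ17 can be recomputed mechanically. the sketch below, in python, assumes the distance rule of theorem 6 and corollary 7 (distance 1 for relatively prime vertices, 2 otherwise) instead of running a shortest-path search; the function names are illustrative assumptions. it checks only that the given sets are resolving or local resolving, and does not attempt to verify that they have minimum cardinality.

```python
from itertools import combinations
from math import gcd


def dist(u, v):
    """Distance in the relative prime graph, using the rule of Corollary 7."""
    if u == v:
        return 0
    return 1 if gcd(u, v) == 1 else 2


def representation(v, W):
    """Tuple of distances from v to the ordered vertices of W."""
    return tuple(dist(v, w) for w in W)


def is_resolving(n, W):
    """True if all vertices 1..n-1 have pairwise distinct representations."""
    reps = [representation(v, W) for v in range(1, n)]
    return len(set(reps)) == len(reps)


def is_local_resolving(n, W):
    """True if every pair of adjacent vertices has distinct representations."""
    return all(representation(u, W) != representation(v, W)
               for u, v in combinations(range(1, n), 2)
               if gcd(u, v) == 1)


W1 = [1, 11, 2, 3, 4, 6, 8, 10, 12, 14, 15]
Wl1 = [1, 11, 2, 3, 5, 7]

print(representation(16, W1))       # (1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1)
print(is_resolving(17, W1))         # True
print(is_local_resolving(17, Wl1))  # True
print(is_resolving(17, Wl1))        # False: vertices 4 and 8 coincide
```

the last line illustrates the difference between the two notions: 𝑊𝑙1 separates every pair of adjacent vertices, but it is not a resolving set because the non-adjacent vertices 4 and 8 receive the same representation.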
𝑊𝑙3 = {1,11,8,12,5,14}, then 𝑟(2|𝑊𝑙3) = (1,1,2,2,1,2); 𝑟(3|𝑊𝑙3) = (1,1,1,2,1,1); 𝑟(4|𝑊𝑙3) = (1,1,2,2,1,2); 𝑟(6|𝑊𝑙3) = (1,1,2,2,1,2); 𝑟(7|𝑊𝑙3) = (1,1,1,1,1,2); 𝑟(9|𝑊𝑙3) = (1,1,1,2,1,1); 𝑟(10|𝑊𝑙3) = (1,1,2,2,2,2); 𝑟(13|𝑊𝑙3) = (1,1,1,1,1,1); 𝑟(15|𝑊𝑙3) = (1,1,1,2,2,1); 𝑟(16|𝑊𝑙3) = (1,1,2,2,1,2). it appears that 𝑟(2|𝑊𝑙3) = 𝑟(4|𝑊𝑙3) = 𝑟(6|𝑊𝑙3) = 𝑟(16|𝑊𝑙3) and 𝑟(3|𝑊𝑙3) = 𝑟(9|𝑊𝑙3), but the vertices 2,4,6,16 are not adjacent to each other and vertex 3 is not adjacent to 9.

conclusions

this research focuses on determining the metric dimension and local metric dimension of relative prime graphs 𝐺ℤ𝑛. based on the previous discussion, it can be concluded that: (1) 𝑑𝑖𝑚(𝐺ℤ𝑛) = 𝑛 − 𝑘 − 2; (2) 𝑑𝑖𝑚𝑙(𝐺ℤ𝑛) = |𝑃0| − 1 + 𝑘, where 𝑘 is the number of groups of multiples of 𝑝1,𝑝2,…,𝑝𝑘, and |𝑃0| is the cardinality of the set 𝑃0. in the future, the research can be extended to other topics such as fractional metric dimension, local fractional metric dimension, domination numbers and dominating sets, graph coloring, and graph labeling, as well as to other research objects in special rings.

references

[1] g. chartrand and l. lesniak, graphs and digraphs, third edition, chapman & hall/crc, florida, 2000.
[2] p. j. slater, "leaves and trees", congr. numer., 14, pp. 549-559, 1975.
[3] f. harary and r. a. melter, "on the metric dimension of a graph", ars combinatoria, vol. 2, pp. 191-195, 1976.
[4] j. a. rodriguez-velazquez, i. g. yero, d. kuziak and o. r. oellermann, "on the strong metric dimension of cartesian and direct product of graphs", discrete mathematics 335, pp. 8-19, 2014.
[5] f. okamoto, b. phinezy and p. zhang, "the local metric dimension of a graph", math. bohem., vol. 135, pp. 239-255, 2010.
[6] i. beck, "coloring of commutative rings", journal of algebra, vol. 116, no. 1, pp. 208-226, 1988.
[7] d. f. anderson and p. s. livingston, "the zero-divisor graph of a commutative ring", journal of algebra 217, pp. 434-447, 1999.
[8] s. p.
redmond, "the zero-divisor graph of a non-commutative ring", international j. commutative rings 1(4), pp. 203-211, 2002.
[9] s. p. redmond, "central sets and radii of the zero-divisor graphs of commutative rings", communications in algebra 34, pp. 2389-2401, 2006.
[10] a. azimi, a. erfanian and d. g. farrokhi, "the jacobson graph of commutative rings", journal of algebra and its applications, doi: 10.1142/s0219498812501794, 2012.
[11] a. novictor, l. susilowati and fatmawati, "jacobson graph construction of ring z3n, for n > 1", journal of physics: conference series 1494 (2020) 012016, doi: 10.1088/1742-6596/1494/1/012016, 2020.
[12] j. b. fraleigh, a first course in abstract algebra, addison-wesley publishing company, massachusetts, 2003.
[13] g. chartrand, l. eroh, m. a. johnson and o. r. oellermann, "resolvability in graphs and the metric dimension of a graph", discrete applied mathematics 105, pp. 99-113, 2000.

regularized ordinal regression with elastic net approach (case study: poverty modeling in yogyakarta province 2018)

cauchy – jurnal matematika murni dan aplikasi volume 6 (4) (2021), pages 296-304 p-issn: 2086-0382; e-issn: 2477-3344
submitted: february 27, 2021 reviewed: april 29, 2021 accepted: may 04, 2021
doi: http://dx.doi.org/10.18860/ca.v6i4.11758

pardomuan robinson sihombing1, yudhie andriyana2, bertho tantular3
1statistics indonesia, jakarta, indonesia
1,2,3department of statistics, padjadjaran university, bandung, indonesia
email: robinson@bps.go.id

abstract

generally, modeling poverty aims to obtain the best criteria for assessing poverty status. there are two approaches to model the factors that affect poverty, namely consumption approach and discrete choice model.
The advantage of the discrete choice model over the consumption approach is that it provides a probabilistic estimate for classifying samples into different poverty categories. The aim of this study is to determine the factors that affect poverty in Yogyakarta through regularized ordinal regression with the elastic net approach for parallel, non-parallel, and semi-parallel models. The data used in this study are from Susenas March 2018 for Yogyakarta Province. The results show that the best discrete choice model for the Yogyakarta data is the parallel model. Households that live in villages, have a large number of household members, are headed by women, have elderly household heads, have low education, and work in the primary sector tend to be more vulnerable to poverty. Therefore, a simultaneous policy with inclusive economic development is needed to reduce cross-border, cross-gender, and cross-sector inequality.

Keywords: elastic net; ordinal regression; parallel; poverty

Introduction

Poverty is one of the problems in economic development. Every country tries to alleviate poverty with various programs. As the institution that releases the official poverty rate in Indonesia, BPS [1] defines poverty as the inability to meet basic needs, both food and non-food, from an economic perspective, measured in terms of expenditure. Generally, modeling poverty aims to obtain the best criteria for assessing poverty status. Razafindrakoto & Roubaud [2] assert that there is a correlation between objective and subjective poverty measures and further argue that the various forms of poverty cannot be reduced to one another. Poverty is generally approached in monetary terms, but a growing literature tries to construct an index of the multidimensional aspects of poverty. The factors that affect poverty can be modeled with two approaches.
The first, the consumption approach, regresses consumption expenditure per adult equivalent on several potential explanatory variables. The second is the discrete choice model. The discrete approach groups poverty into three categories based on household consumption expenditure compared with a region's poverty line. The advantage of the discrete model is that it allows the influence of the independent variables to vary across poverty categories.

One of the most common regression models for ordinal data is the cumulative logit model [3], also known as the proportional odds model or the ordinal logistic regression model. To improve prediction accuracy in ordinal regression with different regression coefficients for each response category, the DOGEV model was introduced [4]. The DOGEV model requires data that contain extreme values, and it does not yet come in parallel and non-parallel forms. Wurm, Rathouz, & Hanlon [5] introduced a regression model with different regression coefficients for each response category, known as regularized ordinal regression. Data with an ordinal response can be modeled with either the parallel or the non-parallel form. When the number of household observations is large relative to the number of explanatory variables, maximum likelihood is an appropriate estimation method [5]. However, both the parallel model and the non-parallel model, which includes the parallel model as a special case, give inconsistent coefficient estimates if the model is misspecified. As the number of explanatory variables increases, a variable selection technique is needed to remove some variables. This step is needed because it becomes impossible to estimate every coefficient with a high degree of accuracy.
A more realistic modeling goal is then to build a model for out-of-sample prediction and to determine the most important explanatory variables. Two methods that are often used are lasso and ridge. Lasso and ridge regression are techniques that minimize a penalized likelihood objective function. Lasso regression uses the L1 penalty, while ridge regression uses the L2 penalty. Both penalties produce coefficient estimates that are closer to zero than the maximum likelihood estimates; that is, the estimates are shrunk toward zero. This shrinkage biases the estimates toward zero, but the trade-off is a reduction in variance, which often reduces the overall mean squared error. Lasso also has the property that some coefficient estimates are shrunk exactly to zero. This provides a natural way to select variables, because only the predictors most relevant to the response variable keep a non-zero coefficient. However, when a group of variables is highly correlated, the lasso tends to choose one variable from the group and ignore the others. The elastic net penalty was introduced to overcome those limitations [5]. The elastic net penalty is a weighted average of the lasso and ridge penalties; it shares the lasso property of shrinking some coefficients to zero, and it has a unique solution in most cases.

Based on the description above, the main problem examined is which factors affect poverty in Yogyakarta, using regularized ordinal regression with the elastic net approach for parallel, non-parallel, and semi-parallel models.

Methods

Data

The data used in this study are from the Susenas consumption module of March 2018 by BPS. These data supply both the response variable and the explanatory variables. Based on theoretical studies, residential, community, household, and individual characteristics influence differences in household expenditure.
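This study uses household data and only the variables related to household characteristics. Variation in household characteristics affects household expenditure.

Methodology

The aim of poverty modelling is to identify the factors that influence poverty. The focus of the study is households with the characteristics of a specific poverty status. The framework assumes that the true poverty status of a household cannot be observed directly or is not captured by the well-being ratio, with the general model:

\[ \operatorname{logit}(\boldsymbol{p}_i) = \mathbf{X}_i^{T}\boldsymbol{\beta}, \qquad i = 1, 2, \ldots, n \tag{1} \]

where:
$\mathbf{X}_i$ = covariate matrix
$\boldsymbol{\beta}$ = vector of regression coefficients
$\boldsymbol{p}_i$ = category probability

The logit model used is a generalized linear model (GLM). A GLM has three components: the random component, the systematic component, and the link function. The form of the GLM depends on these three related components.

a. Random component
$Y$ is an ordinal response random variable with three categories: poor, almost poor, and not poor.

\[ Y \sim \text{multinomial}(n; p_1, p_2, p_3) \]
\[ f(y; p_1, p_2, p_3, n) = \frac{n!}{y_1!\, y_2!\, y_3!}\; p_1^{y_1} p_2^{y_2} p_3^{y_3} \]

b. Systematic component
The systematic component of the model is a set of parameters $\boldsymbol{\beta}$ and a covariate $\mathbf{X}$ that form the linear combination $\mathbf{X}_i^{T}\boldsymbol{\beta}$. The general form of the linear predictor is:

\[ \boldsymbol{\eta}_i = \mathbf{X}_i^{T}\boldsymbol{\beta} \tag{2} \]

According to Wurm, Rathouz, & Hanlon [5], the linear predictor takes one of the following forms:

i. Parallel model
\[ \mathbf{X}_i = \left( \mathbf{I}_{K\times K} \;\middle|\; \begin{matrix} \boldsymbol{x}_i^{T} \\ \vdots \\ \boldsymbol{x}_i^{T} \end{matrix} \right)_{K\times(P+K)}, \qquad \boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{b}_0 \\ \boldsymbol{b} \end{pmatrix}_{(P+K)\times 1} \]

ii. Nonparallel model
\[ \mathbf{X}_i = \left( \mathbf{I}_{K\times K} \;\middle|\; \begin{matrix} \boldsymbol{x}_i^{T} & 0 & \cdots & 0 \\ 0 & \boldsymbol{x}_i^{T} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \boldsymbol{x}_i^{T} \end{matrix} \right)_{K\times(PK+K)}, \qquad \boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{b}_0 \\ \mathbf{B}_1 \\ \mathbf{B}_2 \\ \vdots \\ \mathbf{B}_K \end{pmatrix}_{(PK+K)\times 1} \]

iii. Semi-parallel model
\[ \mathbf{X}_i = \left( \mathbf{I}_{K\times K} \;\middle|\; \begin{matrix} \boldsymbol{x}_i^{T} \\ \boldsymbol{x}_i^{T} \\ \vdots \\ \boldsymbol{x}_i^{T} \end{matrix} \;\middle|\; \begin{matrix} \boldsymbol{x}_i^{T} & 0 & \cdots & 0 \\ 0 & \boldsymbol{x}_i^{T} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \boldsymbol{x}_i^{T} \end{matrix} \right)_{K\times(P(K+1)+K)}, \qquad \boldsymbol{\beta} = \begin{pmatrix} \boldsymbol{b}_0 \\ \boldsymbol{b} \\ \mathbf{B}_1 \\ \mathbf{B}_2 \\ \vdots \\ \mathbf{B}_K \end{pmatrix}_{(P(K+1)+K)\times 1} \]

where:
$\boldsymbol{b}_0$ = intercept vector
$\boldsymbol{b}$ = slope vector for the parallel model
$\mathbf{B}_k$ = slope matrix for the nonparallel model
$\mathbf{I}_{K\times K}$ = identity matrix
$\boldsymbol{x}_i$ = covariate vector without intercept

c. Link function
The link function connects the systematic component to the expected value of the random component. It relates $E(\boldsymbol{y}) = n\boldsymbol{p}$ to the explanatory variables through the linear predictor. We can model $\boldsymbol{p}$ directly or model a monotone function of it:

\[ E(\boldsymbol{y}_i \mid \boldsymbol{x}_i) = g(\boldsymbol{p}_i) = \boldsymbol{\eta}_i = \mathbf{X}_i^{T}\boldsymbol{\beta} \tag{3} \]

Elastic net penalty

Suppose $\boldsymbol{\beta}$ has length $Q$ and $\beta_j$ denotes its $j$-th element. Wurm, Rathouz, & Hanlon [5] wrote the elastic net objective function as:

\[ M(\boldsymbol{\beta}; \alpha, \lambda, c_1, \ldots, c_Q) = -\frac{1}{N^{*}}\,\ell(\boldsymbol{\beta}) + \lambda \sum_{j=1}^{Q} c_j \left( \alpha|\beta_j| + \tfrac{1}{2}(1-\alpha)\beta_j^{2} \right) \tag{4} \]

where $\ell(\boldsymbol{\beta}) = \sum_{i=1}^{N} \ell_i(\boldsymbol{\beta})$ and $\ell_i(\boldsymbol{\beta}) = L_i(h(\mathbf{X}_i^{T}\boldsymbol{\beta}))$. Here $\ell(\boldsymbol{\beta})$ is the log-likelihood function, $\lambda > 0$ and $0 \le \alpha \le 1$. Wurm, Rathouz, & Hanlon [5] derived the elastic net objective function for each model form from equation (4) as follows.

Objective function for the parallel model:
\[ M(\boldsymbol{b}_0, \boldsymbol{b}; \alpha, \lambda) = -\frac{1}{N^{*}}\,\ell(\boldsymbol{b}_0, \boldsymbol{b}) + \lambda \sum_{j=1}^{P} \left( \alpha|b_j| + \tfrac{1}{2}(1-\alpha)b_j^{2} \right) \]

Objective function for the nonparallel model:
\[ M(\boldsymbol{b}_0, \mathbf{B}; \alpha, \lambda) = -\frac{1}{N^{*}}\,\ell(\boldsymbol{b}_0, \mathbf{B}) + \lambda \sum_{j=1}^{P}\sum_{k=1}^{K} \left( \alpha|B_{jk}| + \tfrac{1}{2}(1-\alpha)B_{jk}^{2} \right) \]

Objective function for the semi-parallel model:
\[ M(\boldsymbol{b}_0, \boldsymbol{b}, \mathbf{B}; \alpha, \lambda, \rho) = -\frac{1}{N^{*}}\,\ell(\boldsymbol{b}_0, \boldsymbol{b}, \mathbf{B}) + \lambda \left( \rho \sum_{j=1}^{P} \left( \alpha|b_j| + \tfrac{1}{2}(1-\alpha)b_j^{2} \right) + \sum_{j=1}^{P}\sum_{k=1}^{K} \left( \alpha|B_{jk}| + \tfrac{1}{2}(1-\alpha)B_{jk}^{2} \right) \right) \]

where $\lambda \ge 0$ and $\alpha \in [0,1]$ are tuning parameters, and $\rho \ge 0$ is a tuning parameter that determines the extent to which the parallel term is penalized.

Results and discussion

First, we discuss the socioeconomic characteristics of the households as a general overview of the respondents in the study. We use pie charts for the descriptive characteristics of the respondents.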
They illustrate the frequency of each category in the research variables.

Table 1. Characteristics of respondents by poverty status (%)

Variable | Category | Poor | Almost poor | Not poor | Total
Region type | Rural | 5.99 | 4.49 | 54.53 | 65.01
Region type | Urban | 5.56 | 3.57 | 25.86 | 34.99
Total number of household members | Single | 0.39 | 0.25 | 7.95 | 8.59
Total number of household members | Living with a couple | 1.93 | 1.03 | 14.05 | 17.01
Total number of household members | Living with a couple with other family members | 9.24 | 6.78 | 58.38 | 74.39
Marital status | Never married | 0.25 | 0.11 | 4.17 | 4.53
Marital status | Married | 10.16 | 7.35 | 65.58 | 83.10
Marital status | Divorced | 0.07 | 0.11 | 2.85 | 3.03
Marital status | Divorced by death | 1.07 | 0.50 | 7.77 | 9.34
Gender | Male | 10.34 | 7.49 | 70.01 | 87.84
Gender | Female | 1.21 | 0.57 | 10.38 | 12.16
Age | 15-64 years old | 8.92 | 6.63 | 70.86 | 86.41
Age | 65+ years old | 2.64 | 1.43 | 9.52 | 13.59
Education | Primary and junior | 8.84 | 5.67 | 39.16 | 53.67
Education | Senior high school | 2.64 | 2.32 | 26.64 | 31.60
Education | College | 0.07 | 0.07 | 14.59 | 14.73
Economic sector | Primary | 6.13 | 3.53 | 19.54 | 29.21
Economic sector | Secondary | 2.92 | 2.14 | 17.40 | 22.47
Economic sector | Tertiary | 2.50 | 2.39 | 43.44 | 48.32
Total | | 11.55 | 8.06 | 80.39 |

The first step is a chi-square test of independence, which is used to check for a relationship between categorical variables. Here it checks whether each independent (predictor) variable is related to the dependent (response) variable. The null hypothesis is that there is no dependency between poverty status and the explanatory variable, while the alternative hypothesis is that there is a dependency between poverty status and the explanatory variable. Table 2 reports the independence test of the categorical variables on poverty status: all p-values are below 0.05, which means that every independent variable is related to the response variable.

This study uses the ordinalNetCV function from the ordinalNet package in R version 3.6.1.
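As a minimal sketch of the chi-square independence test described above, the snippet below computes the Pearson statistic for a table of counts. The function names and the toy counts are illustrative assumptions, not the Susenas data or the authors' code; the closed-form p-value applies only when the test has 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical counts (NOT the survey data): rows = region type, columns = poverty status.
toy_counts = [
    [30, 20, 50],  # rural: poor, almost poor, not poor
    [10, 20, 70],  # urban: poor, almost poor, not poor
]
stat, df = chi_square_independence(toy_counts)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value_df2(stat):.4f}")

# Statistic reported for gender in Table 2 (7.491 with df = 2).
print(f"p-value for 7.491 with df = 2: {p_value_df2(7.491):.3f}")
```

Applied to the statistic reported for gender (7.491, df = 2), the helper gives a p-value that rounds to the 0.024 shown in Table 2. For other degrees of freedom a statistics library would be needed for the p-value.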
Table 2. Independence test of the categorical variables on poverty status

Categorical variable | Chi-square value | df | p-value
Region type | 41.086 | 2 | 0.000
Total number of household members | 30.327 | 4 | 0.000
Marital status | 27.660 | 6 | 0.000
Gender | 7.491 | 2 | 0.024
Age | 181.148 | 4 | 0.000
Education | 32.699 | 2 | 0.000
Economic sector | 154.836 | 4 | 0.000

In this study, we compare the parallel, non-parallel, and semi-parallel models using AIC, BIC, and the log-likelihood. In general, the parallel and semi-parallel models perform similarly, but the non-parallel model is much worse. This might be because the out-of-sample log-likelihood of the non-parallel model is undefined (non-monotone cumulative probabilities) for the first few values of λ.

Table 3. Comparison of the AIC, BIC, and log-likelihood values of the three ordinal regression models

Model | AIC | BIC | Loglik
Parallel | 3192.01 | 3269.21 | -319.1258
Non-parallel | 3403.23 | 3468.56 | -339.732
Semi-parallel | 3209.88 | 3358.35 | -321.276

Table 3 shows that the parallel model has the smallest AIC and BIC and the largest log-likelihood. The lambda parameters obtained for each fold in all three models are given in Table 4. The variability of the lambda values is lowest in the parallel model.

Table 4. Comparison of lambda values for the three regression models

Fold | Parallel model | Nonparallel model | Semiparallel model
Fold 1 | 0.0015 | 0.0010 | 0.0019
Fold 2 | 0.0012 | 0.0229 | 0.0031
Fold 3 | 0.0019 | 0.0292 | 0.0009
Fold 4 | 0.0025 | 0.0180 | 0.0040
Fold 5 | 0.0012 | 0.0372 | 0.0051

Table 5 shows that five dummy variables have a positive coefficient, six have a negative coefficient, and one has a zero coefficient. A positive coefficient means that the category under study has a higher chance of being poor than the reference category, while a negative coefficient means that the category under study has a lower chance of being poor than the reference category.
A zero coefficient means that the chance of being poor for the category under study is not significantly different from that of the reference category.

Table 5. Ordinal regression

Variable | Category | logit(P[Y ≤ 1]) | logit(P[Y ≤ 2])
Intercept | | -2.389 | -1.706
Region type (*rural) | Urban | 0.042 | 0.042
Total number of household members (*single) | Living with a couple | 1.249 | 1.249
Total number of household members (*single) | Living with a couple with other family members | 0.539 | 0.539
Marital status (*never married) | Married | 0.000 | 0.000
Marital status (*never married) | Divorced | -1.157 | -1.157
Marital status (*never married) | Divorced by death | -0.367 | -0.367
Gender (*male) | Female | 0.319 | 0.319
Age (*15-64 years old) | Non-productive (65+) | 0.390 | 0.390
Education (*primary and junior) | Senior high school | -0.455 | -0.455
Education (*primary and junior) | College | -2.912 | -2.912
Economic sector (*primary) | Secondary sector | -0.408 | -0.408
Economic sector (*primary) | Tertiary sector | -1.057 | -1.057
*baseline category

Discussion of the results

a. Residential type
Regional type is the category of the respondent's residential area; there are two categories, urban and rural. The location of the household is one of the factors often associated with poverty status. This association is due to differences in access to primary facilities such as education and health. The results of this study show that the area of residence significantly affects the poverty status of a household. Rural households have a higher tendency to be poor than urban households. This result is in line with previous studies such as [6] and [7], which suggest that rural households are more vulnerable to poverty due to limited access.

b. Household size
The size of a household is the number of people who live in it. The more people live in a household, the more resources are needed to keep its members prosperous. The results of this study show that, compared with households consisting of only one person, households of two or more people have a higher tendency to live in poverty.
These results agree with studies conducted in developing countries, such as [6] and [8], which show that the larger the household, the lower the average consumption per capita, indicating that larger households are closer to poor status. The problem is that, even though the members live in one household, only about 20 percent of items are used jointly [8]. Therefore, they must spread a limited income over more needs.

c. Marital status
Marital status is related to responsibility for household expenses. Someone who has never married tends to have their own income and to use it for personal needs, whereas in a married household the income is pooled across household members. In the results, those who have never married and those who are married have an equal chance of being poor. Compared with someone who has never married, a household headed by a divorcee is less likely to be poor. According to [9], divorcees usually have economic planning and adaptation strategies that match spending to the income the family needs each day. This is evident in the way divorcees save, setting aside part of their income bit by bit to meet their children's education costs and urgent needs.

d. Household gender
Households headed by men and households headed by women differ in their characteristics. In general, households headed by women are often identified as households with a higher chance of poverty.
Research in several regions, in both developed and developing countries, shows that households headed by women are more prone to poverty, because female heads of households generally earn lower incomes and have more dependents ([7], [10], [11]). The Yogyakarta data show the same pattern. Based on the data collected in this study, female heads of households tend to support a large number of household members.

e. Head of household age
One of the factors that influences a person's productivity is age. A person of productive age is likely to have a higher income than a person of unproductive age. Therefore, it is commonly assumed that households headed by someone past productive age are more likely to be poor. The results of this study support this general assumption. This result is in line with the research in [7], which showed that once productive age has passed, income tends to decline and the risk of becoming poor rises.

f. Head of household education
Education is one of the crucial factors that determine well-being. Educational attainment increases the potential income of individuals, and the higher income in turn helps them to escape poverty [12]. This study shows results consistent with previous research: in line with the negative coefficients in Table 5, households headed by a person with a senior high school education have a lower tendency to be poor than those headed by a person with primary or junior high school education.

g. Economic sector of head of household
The field in which household heads work has an impact on household poverty status, because income levels differ across sectors. The primary sector, comprising agriculture and mining, generally has lower income levels than other sectors.
The results of this study show that households whose heads work in the secondary and tertiary sectors have a lower tendency to be poor than those whose main income comes from the primary sector. In addition, consistent with the coefficients in Table 5, households working in the tertiary sector have a lower chance of being poor than those working in the other sectors. This result is in line with [13] and [14], which state that the shift away from the agricultural sector is effective in alleviating poverty.

Conclusions

The factors that determine poverty include household size, marital status, gender of the household head, age of the household head, education level of the household head, and occupation of the household head. Based on the AIC and BIC criteria, the best model for the Yogyakarta poverty data is the parallel model. Households that live in villages, have a large number of household members, are headed by women, have elderly household heads, have low education, and work in the primary sector tend to be more vulnerable to poverty. Therefore, a simultaneous policy with inclusive economic development is needed to reduce cross-border, cross-gender, and cross-sector inequality.

References

[1] Badan Pusat Statistik, “Data dan Informasi Kemiskinan Kabupaten/Kota 2018,” Jakarta, 2018.
[2] M. Razafindrakoto and F. Roubaud, “The multiple facets of poverty: the case of urban Africa,” in WIDER Conference on Inequality, 2003.
[3] P. McCullagh, “Regression models for ordinal data,” J. R. Stat. Soc. Ser. B, vol. 42, no. 2, pp. 109–142, 1980.
[4] E. Fissuh and M. Harris, “Modeling determinants of poverty in Eritrea: a new approach,” pp. 1–35, 2005.
[5] M. J. Wurm, P. J. Rathouz, and B. M. Hanlon, “Regularized ordinal regression and the ordinalNet R package,” 2017.
[6] J. C. Anyanwu, “Marital status, household size and poverty in Nigeria: evidence from the 2009/2010 survey data,” African Dev. Rev., vol. 26, no. 1, pp. 118–137, 2014.
[7] R. Gounder and Z. Xing, “Impact of education and health on poverty reduction: monetary and non-monetary evidence from Fiji,” Econ. Model., vol. 29, no. 3, pp. 787–794, 2012.
[8] P. Lanjouw and M. Ravallion, “Poverty and household size,” Econ. J., vol. 105, no. 433, pp. 1415–1434, 1995.
[9] A. S. Rahayu, “Kehidupan sosial ekonomi single mother dalam ranah domestik dan publik,” J. Anal. Sosiol., vol. 6, no. 1, 2017.
[10] M. Buvinić and G. Rao Gupta, “Female-headed households and female-maintained families: are they worth targeting to reduce poverty in developing countries?,” Econ. Dev. Cult. Change, vol. 45, no. 2, pp. 258–280, 1997.
[11] D. F. Meyer, “Predictors of poverty: a comparative analysis of low income communities in the Northern Free State region, South Africa,” Int. J. Soc. Sci. Humanit. Stud., vol. 8, no. 2, ISSN 1309-8063 (online), 2016.
[12] M. Awan et al., “Impact of education on poverty reduction,” Int. J. Acad. Res., vol. 3, 2011.
[13] I. D. A. Bagus, E. K. A. Artika, A. A. S. Kencana, I. D. A. Ayu, and K. Marini, “Pergeseran lapangan usaha sektor pertanian, pertumbuhan ekonomi,” J. Unmas Mataram, pp. 111–117, 2018.
[14] F. Fahar, “Kemiskinan dan ketenagakerjaan di Kepulauan Riau 2014: permasalahan dan implikasi kebijakan,” no. February, 2015.
A Combination of Generalized Linear Mixed Model and LASSO Methods for Estimating Number of Patients COVID 19 in the Intensive Care Units

CAUCHY - Jurnal Matematika Murni dan Aplikasi, Volume 7(1) (2021), Pages 13-21
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: February 14, 2021; Reviewed: April 20, 2021; Accepted: October 12, 2021
DOI: http://dx.doi.org/10.18860/ca.v7i1.11575

Alona Dwinata1,2, Khairil Anwar Notodiputro2*, Bagus Sartono2
1Mathematics Education Study Program, Raja Ali Haji Maritime University, Tanjungpinang
2Department of Statistics, IPB University, Bogor
Email: alonadwinata@umrah.ac.id, khairil@apps.ipb.ac.id*, bagusco@apps.ipb.ac.id
*Corresponding author

Abstract

A generalized linear mixed model (GLMM) combined with the L1 penalty (least absolute shrinkage and selection operator, LASSO) is called LASSO GLMM. LASSO GLMM reduces overfitting and selects predictor variables in modeling. The aim of this study is to evaluate the performance of models for predicting which COVID-19 patients with certain congenital diseases require the ICU, based on laboratory blood test results and the patient's vital signs. This study used a binary response variable: 1 if the patient was admitted to the ICU and 0 if the patient was not. The fixed effect predictor variables are the laboratory blood test results and the patient's vital signs. The random effect predictor variable is the patient's congenital disease. The results showed that the average accuracy and AUC of LASSO GLMM are higher than those of LASSO GLM at the 5% significance level. Respiratory rate and lactate have a significant effect in predicting the ICU needs of COVID-19 patients. The random effect of the patient's congenital disease is significant at the 5% significance level.
This means that the ICU needs of COVID-19 patients vary across congenital diseases. We conclude that LASSO GLMM with the patient's congenital disease as a random effect performs better in predicting the ICU needs of COVID-19 patients based on laboratory blood test results and the patient's vital signs. The results of this modeling can quickly detect COVID-19 patients who need the ICU and can help medical staff use ICU resources optimally.

Keywords: COVID 19; GLMM; glmmLasso; LASSO

Introduction

The generalized linear model (GLM) is an approach for modeling the effect of predictor variables on a response variable whose distribution belongs to the exponential family. Observations within the same group are usually correlated, so the GLM is extended to include random effects in the linear predictor. When a random effect is added to a GLM, the model is called a generalized linear mixed model (GLMM) [1]. GLMM modeling faces a problem of complexity as the number of predictor variables grows: the more predictor variables used, the more unstable the estimates [2]. Predictor variables that are unrelated to the response variable cause overfitting. To improve the prediction accuracy of the model, a penalty is added [3]. Tibshirani (1996) introduced a penalty function using the L1 penalty, $\lambda \sum_{j=1}^{p} |\beta_j|$, which is called the least absolute shrinkage and selection operator (LASSO). Lambda (λ) in the L1 penalty function is a shrinkage parameter that determines how strongly the regression coefficients are shrunk.
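To make the role of λ in the L1 penalty concrete, the following minimal sketch evaluates $\lambda \sum_j |\beta_j|$ and applies soft-thresholding, the closed-form minimizer of $\tfrac{1}{2}(b - z)^2 + \lambda|b|$ for a single coefficient. The function names and the coefficient values are illustrative assumptions; this is a textbook one-coefficient case, not the estimator fitted in this study.

```python
def l1_penalty(beta, lam):
    """L1 (LASSO) penalty: lam times the sum of absolute coefficient values."""
    return lam * sum(abs(b) for b in beta)


def soft_threshold(z, lam):
    """Minimizer of 0.5 * (b - z)**2 + lam * abs(b) over b, for lam >= 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients no larger than lam in magnitude are set exactly to zero


# Hypothetical unpenalized coefficients, chosen only for illustration.
unpenalized = [2.0, -0.75, 0.25]
lam = 0.5

shrunk = [soft_threshold(z, lam) for z in unpenalized]
print("shrunk coefficients:", shrunk)
print("penalty before:", l1_penalty(unpenalized, lam))
print("penalty after: ", l1_penalty(shrunk, lam))
```

With these values the two larger coefficients move toward zero by λ and the smallest becomes exactly zero, which is the variable selection behavior attributed to LASSO; a larger λ would zero out more coefficients.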
LASSO reduces overfitting and selects predictor variables in modeling [4]. In this study, the combinations of GLM and GLMM with the LASSO technique are called LASSO GLM and LASSO GLMM. Researchers have discussed various problems in LASSO GLM, such as Arnold and Tibshirani (2016) [5], Hossain et al. (2015) [6], Zhang and Zou (2014) [7], Simon et al. (2013) [8], and Friedman et al. (2010) [9]. LASSO GLM optimizes the objective function by coordinate descent. This algorithm is available in the R programming language in the glmnet package [9].

Several researchers have discussed variable selection procedures in GLMM using the L1 penalty, including Thomson and Hossain (2018) [10], Groll and Tutz (2014) [2], Schelldorfer et al. (2011) [11], and Ibrahim et al. (2010) [12]. LASSO GLMM produces stable estimates because the L1 penalty selects the important predictor variables used in the GLMM [2]. GLMMs with the L1 penalty are useful whenever there is a grouping structure among high-dimensional observations [11]. Previous studies have also developed an algorithm for maximum likelihood estimation in the GLMM with the L1 penalty function added. The penalized log-likelihood function is maximized with a gradient ascent algorithm called glmmLasso [13]. In the R programming language, this algorithm is provided by the glmmLasso package [14].

In this study, we apply LASSO GLM and LASSO GLMM to predict the ICU needs of COVID-19 patients. The surge in COVID-19 cases is putting enormous pressure on the health care system. The intensive care unit (ICU) is one of the health facilities needed by patients with confirmed COVID-19. The study examines the prediction of ICU needs for COVID-19 patients. The ICU needs of COVID-19 patients were analyzed using laboratory blood test results, vital signs, and the patient's congenital disease.
the predictor variables for blood test laboratory results and patient's vital signs were fixed effect, whereas predictor variables for patient's congenital disease were assumed to be fixed effect for lasso glm and random effect for lasso glmm. previous researchers have discussed the performance of lasso glm and lasso glmm modeling on rainfall data, the results showed that modeling with lasso glmm has better performance than lasso glm [15]. to predict the icu needs for covid-19 patients based on laboratory results of blood tests, patient’s vital signs and congenital disease, researchers conducted modeling with lasso glm and lasso glmm. the aim of this study is to evaluate the model's performance in predicting covid-19 patients with certain congenital disease groups that require icu based on the results of blood tests laboratory and patient’s vital signs. methods data the study used data from patients confirmed by covid-19 at the sírio-libanês hospital, são paulo, brasilia. data were collected after 12 hours of confirmed covid-19 patients undergoing treatment in the hospital. total data were 98 patients, with 52 icu patients and 46 non-icu patients. the study used binary response variables, 1 if the patient was admitted to the icu and 0 if the patient was not admitted to the icu. the fixed effect predictor variables for a combination of generalized linear mixed model and lasso methods for estimating number of patients covid 19 in the intensive care units alona dwinata 15 modeling totally used 32 variables, 26 variables from the results of blood tests laboratory and 6 variables patient’s vital signs. the fixed effect predictor variables used in modeling can be seen in table 1. researchers assumed patient’s congenital disease as fixed effect predictor variables in modeling using lasso glm and a random effect predictor variable in modeling using lasso glmm. table 1. 
Research variables

Variable  Variable name             Type     Information
Y         COVID-19 patient status   Binary   1 = ICU patient, 0 = non-ICU patient
X1        albumin                   Numeric  Fixed effect
X2        be_venous                 Numeric  Fixed effect
X3        bic_venous                Numeric  Fixed effect
X4        billirubin                Numeric  Fixed effect
X5        calcium                   Numeric  Fixed effect
X6        creatinin                 Numeric  Fixed effect
X7        ffa                       Numeric  Fixed effect
X8        ggt                       Numeric  Fixed effect
X9        glucose                   Numeric  Fixed effect
X10       hematocrite               Numeric  Fixed effect
X11       hemoglobin                Numeric  Fixed effect
X12       lactate                   Numeric  Fixed effect
X13       leukocytes                Numeric  Fixed effect
X14       linfocitos                Numeric  Fixed effect
X15       neutrophiles              Numeric  Fixed effect
X16       p02_venous                Numeric  Fixed effect
X17       pc02_venous               Numeric  Fixed effect
X18       pcr                       Numeric  Fixed effect
X19       ph_venous                 Numeric  Fixed effect
X20       platelets                 Numeric  Fixed effect
X21       potassium                 Numeric  Fixed effect
X22       sat02_venous              Numeric  Fixed effect
X23       sodium                    Numeric  Fixed effect
X24       ttpa                      Numeric  Fixed effect
X25       urea                      Numeric  Fixed effect
X26       dimer                     Numeric  Fixed effect
X27       bloodpressure_diastolic   Numeric  Fixed effect
X28       bloodpressure_sistolic    Numeric  Fixed effect
X29       heart_rate                Numeric  Fixed effect
X30       respiratory_rate          Numeric  Fixed effect
X31       temperature               Numeric  Fixed effect
X32       oxygen_saturation         Numeric  Fixed effect

Research methods

Modeling was carried out to predict the ICU needs of COVID-19 patients based on laboratory blood test results, vital signs, and congenital diseases. Because many predictor variables were used, variable selection was applied to identify the important predictors; a simpler model is obtained by adding the L1 penalty function to the model. The research algorithm was as follows:

1. LASSO GLM modeling to predict the ICU needs of COVID-19 patients
   a. Determine the optimum lambda value.
   b. Fit the LASSO GLM using the R package glmnet.
   c. Analyze the parameters from the modeling results.
   d. Determine the model accuracy.
2. LASSO GLMM modeling to predict the ICU needs of COVID-19 patients
   a. Determine the optimum lambda value.
   b. Fit the LASSO GLMM using the R package glmmLasso.
   c. Analyze the parameters from the modeling results. The random effect was tested under the hypothesis $H_0: \sigma^2 = 0$ using the likelihood ratio statistic $G^2 = 2(\mathrm{loglik}_{\text{LASSO GLMM}} - \mathrm{loglik}_{\text{LASSO GLM}})$. If $G^2 > \chi^2_{(df=1,\ \alpha=0.05)}$, then $H_0$ is rejected.
   d. Determine the model accuracy.

To evaluate the performance of LASSO GLM and LASSO GLMM, the best model for predicting the hospitalization needs of a patient with COVID-19 was selected based on accuracy and AUC. The steps for selecting the best model were as follows:
   a. Partition the data into 80% modeling data and 20% validation data. The partitioning was performed 30 times.
   b. Fit the LASSO GLM and LASSO GLMM on the modeling data for each replication.
   c. Assess model performance based on the AUC and accuracy values on the validation data for each replication.
   d. Test the difference in performance between LASSO GLM and LASSO GLMM statistically using a paired-sample t-test.

Results and discussion

1. LASSO GLM modeling

LASSO GLM selects variables based on λ. The optimum λ is the value at which the binomial deviance is minimal. The cross-validation plot used to optimize the LASSO GLM shrinkage parameter is shown in Figure 1.

Figure 1. Cross-validation plot to optimize the LASSO GLM shrinkage parameter

Based on Figure 1, the optimum λ was 0.024. All predictor variables included in the model are fixed-effect predictors: 26 laboratory blood test features, 6 patient vital signs, and the patient's congenital disease as a dummy variable.
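The role of λ in this selection step can be illustrated with a small sketch. The block below is not the glmnet implementation (glmnet uses coordinate descent); it is a minimal proximal-gradient version of L1-penalized logistic regression on hypothetical toy data. The names `fit_lasso_logistic` and `soft_threshold`, the step size, and the three toy predictors are assumptions made for illustration only. It shows how a larger λ drives coefficients exactly to zero.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.5, n_iter=500):
    """Minimise mean binomial negative log-likelihood + lam * sum(|beta_j|)
    by proximal gradient descent. The intercept is not penalised."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: fitted probability minus observed response.
        resid = [
            sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_intercept = sum(resid) / n
        grad_beta = [
            sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)
        ]
        intercept -= step * grad_intercept
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        beta = [
            soft_threshold(b - step * g, step * lam)
            for b, g in zip(beta, grad_beta)
        ]
    return intercept, beta


# Hypothetical toy data: three predictors in [-1, 1]; only the first one
# drives the binary response (1 = ICU, 0 = non-ICU in the paper's coding).
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
y = [1 if row[0] + rng.uniform(-0.3, 0.3) > 0 else 0 for row in X]

_, beta_small = fit_lasso_logistic(X, y, lam=0.01)
_, beta_large = fit_lasso_logistic(X, y, lam=1.0)

print("lambda = 0.01:", [round(b, 3) for b in beta_small])
print("lambda = 1.00:", [round(b, 3) for b in beta_large])
```

With the large penalty every coefficient stays at zero, while the small penalty keeps the informative predictor in the model. In practice λ would be chosen by cross-validated deviance, as in Figure 1, rather than fixed by hand.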
The LASSO GLM was fitted using the R package glmnet. The plot of the LASSO GLM coefficients for each value of log λ is shown in Figure 2. The non-zero regression coefficients from the LASSO GLM fit are shown in Table 2.

Figure 2. Plot of LASSO GLM coefficients for each shrinkage parameter value

Table 2. LASSO GLM coefficients

Variable                 Coefficient
lactate                  -0.45
p02_venous               -1.82
sodium                   -0.03
bloodpressure_sistolic    0.83
respiratory_rate          3.78
oxygen_saturation         4.26
htn                       0.23
disease group 1          -0.52

2. LASSO GLMM modeling

Like LASSO GLM, LASSO GLMM requires an optimum λ. Figure 3 shows the binomial deviance for each value of λ. The optimum λ is 19.6, the value at which the deviance is smallest.

Figure 3. Cross-validation plot for optimizing the LASSO GLMM shrinkage parameter

The LASSO GLMM includes 26 laboratory blood test features and 6 patient vital signs as fixed effects, and the patient's congenital disease as a random effect. The model was fitted using the R package glmmLasso. The plot of the LASSO GLMM coefficients for each λ is shown in Figure 4. The regression coefficients shrink to zero as λ increases. The LASSO GLMM fit with λ = 19.6 yields 4 non-zero predictor variables, shown in Table 3.

Figure 4. Plot of LASSO GLMM coefficients for each shrinkage parameter

Table 3.
LASSO-penalized logistic mixed effects regression model (GLMM-LASSO)

Fixed effects            Coefficient  Standard error  z        P(>|z|)
(Intercept)              -6.88        0.54            -12.636  0.000
lactate                  -0.65        0.38            -1.70    0.08
bloodpressure_systolic    1.55        1.88             0.82    0.41
respiratory_rate          5.11        1.25             4.09    0.04
oxygen_saturation         8.64        5.36             1.61    0.11

The random effect for the patient's congenital disease had a standard deviation of 0.8262, with $G^2 = 4.12$ and $\chi^2_{(df=1,\ \alpha=0.05)} = 3.84$. Since $G^2 > 3.84$, $H_0$ is rejected. This means that the random effect for the patient's congenital disease was significant at the 5% level of significance.

3. Selection of the best model

Data were divided randomly into 80% modeling data and 20% validation data, giving 79 patients as modeling data and 19 patients as validation data. Data partitioning was carried out in 30 replications. The optimum λ was obtained from the modeling data drawn in each replication.

Figure 5. AUC of LASSO GLM and LASSO GLMM for 30 replications

The LASSO GLM and LASSO GLMM were then fitted for each replication. Model performance was assessed using the accuracy and AUC on the validation data. A comparison of the accuracy and AUC over the 30 replications for each model is shown in Figure 5 and Figure 6.

Figure 6. Accuracy of LASSO GLM and LASSO GLMM for 30 replications

The difference in performance between LASSO GLM and LASSO GLMM can be assessed statistically with a paired-sample t-test of the AUC and accuracy. The results of the paired-sample t-test for the two models are shown in Table 4. The hypotheses about the accuracy and AUC of the two models are as follows:
- H0: the average accuracy of LASSO GLMM is less than or equal to the average accuracy of LASSO GLM
- H0: the average AUC of LASSO GLMM is less than or equal to the average AUC of LASSO GLM

Table 4.
The paired sample t-test of accuracy and AUC

Criteria   t-stat   p-value
Accuracy   5.5746   0.0000
AUC        2.2058   0.0178

The t-test results in Table 4 show p-values below 0.05 for both accuracy and AUC. This means that, at the 5% level of significance, the average accuracy and AUC of LASSO GLMM are higher than the average accuracy and AUC of LASSO GLM.

Discussion

The ability to identify patients who need the ICU is essential. One way to address this problem is to identify the most important variables that affect the ICU needs of COVID-19 patients. The paired-sample t-test of accuracy and AUC in Table 4 showed that LASSO GLMM performs better than LASSO GLM. Figure 4 shows the effect of the predictor variables for each lambda value. With lambda 19.6, this model produced four non-zero fixed-effect predictor variables, which are the focus of attention for predicting the ICU needs of COVID-19 patients: lactate, systolic blood pressure, respiratory rate, and oxygen saturation. Among these four predictors, only respiratory rate had a significant effect at the 5% level of significance, and lactate had a significant effect at the 10% level of significance. Systolic blood pressure and oxygen saturation had no significant effect. The odds ratio of respiratory rate was 165.67. This means that the odds of a COVID-19 patient requiring the ICU were 165.67 times as high after a one-unit increase in respiratory rate (respirations per minute, rpm) as before the increase. COVID-19 damages the respiratory system. Respiratory rate is one measure used to identify respiratory tract infections immediately before and during the first days of symptoms. The normal respiratory rate for adults at rest is 12 to 20 rpm [16].
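The odds ratio quoted above follows from exponentiating the fitted coefficient, since the model uses a logit link. The sketch below recomputes the odds ratios from the fixed-effect coefficients reported in Table 3; the dictionary and function names are illustrative assumptions, and "one unit" refers to whatever scale the predictors had when the model was fitted.

```python
import math

# Fixed-effect coefficients as reported in Table 3 (LASSO GLMM, lambda = 19.6).
table3_coefficients = {
    "lactate": -0.65,
    "bloodpressure_systolic": 1.55,
    "respiratory_rate": 5.11,
    "oxygen_saturation": 8.64,
}


def odds_ratio(coefficient):
    # For a logit link, exp(beta) is the multiplicative change in the odds
    # of ICU admission per one-unit increase in the predictor.
    return math.exp(coefficient)


odds_ratios = {name: odds_ratio(b) for name, b in table3_coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: odds ratio = {value:.2f}")
```

An odds ratio above 1 multiplies the odds of ICU admission upward per unit increase, and a value below 1 multiplies them downward; for lactate, exp(-0.65) is about 0.52, that is, odds roughly 48% lower per unit.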
The findings of one study suggest that the stability of respiratory rate measurements taken during night rest in healthy individuals is a useful metric for tracking changes in health [16]. The odds ratio of lactate was 0.52. This means that the odds of a COVID-19 patient requiring the ICU were 0.52 times as high (about 48% lower) after a one-unit increase in lactate (mmol/l) as before the increase. Arterial lactatemia higher than central venous lactatemia (a reversed delta a-cv lactate) indicates a disturbance in the mitochondrial metabolism of lung cells caused by severe inflammation [17]. An increase of one unit in venous blood lactate reduces the reversed delta a-cv lactate. LASSO GLMM produced an AUC of 0.96. This means that LASSO GLMM has good predictive performance in predicting the ICU needs of COVID-19 patients. The random effect for the patient's congenital disease was significant at the 5% level of significance. This means that the ICU needs of COVID-19 patients vary among congenital disease groups. We can conclude that LASSO GLMM with the patient's congenital disease as a random effect has better modeling performance for predicting the ICU needs of COVID-19 patients.

Conclusions

In this study, LASSO GLMM performed better than LASSO GLM in predicting the ICU needs of COVID-19 patients. LASSO GLMM has good predictive performance, with an AUC of 0.96. In LASSO GLMM, respiratory rate has a significant effect at the 5% level of significance and lactate has a significant effect at the 10% level of significance. Respiratory rate shows the most significant effect in predicting the ICU needs of COVID-19 patients. The random effect for the patient's congenital disease had a significant effect on COVID-19 patients requiring the ICU at the 5% level of significance. This means that the ICU needs of COVID-19 patients vary among congenital disease groups.
We can conclude that LASSO GLMM with the patient's congenital disease as a random effect has better modeling performance for predicting the ICU needs of COVID-19 patients based on laboratory blood test results and patient vital signs.

Acknowledgments

The authors would like to thank everyone who contributed to the improvement of this paper.

References

[1] J. Jiang, Linear and Generalized Linear Mixed Models and Their Applications. 2007.
[2] A. Groll and G. Tutz, “Variable selection for generalized linear mixed models by L1-penalized estimation,” Stat. Comput., 2014, doi: 10.1007/s11222-012-9359-z.
[3] T. Hastie, R. Tibshirani, and J. Friedman, Springer Series in Statistics, vol. 27, no. 2. New York, NY: Springer New York, 2008.
[4] R. Tibshirani, “Regression shrinkage and selection via the lasso: a retrospective,” J. R. Stat. Soc. Ser. B Stat. Methodol., vol. 73, no. 3, pp. 273–282, 2011, doi: 10.1111/j.1467-9868.2011.00771.x.
[5] T. B. Arnold and R. J. Tibshirani, “Efficient implementations of the generalized lasso dual path algorithm,” J. Comput. Graph. Stat., vol. 25, no. 1, pp. 1–27, 2016, doi: 10.1080/10618600.2015.1008638.
[6] S. Hossain, S. E. Ahmed, and K. A. Doksum, “Shrinkage, pretest, and penalty estimators in generalized linear models,” Stat. Methodol., vol. 24, pp. 52–68, 2015, doi: 10.1016/j.stamet.2014.11.003.
[7] T. Zhang and H. Zou, “Sparse precision matrix estimation via lasso penalized D-trace loss,” Biometrika, 2014, doi: 10.1093/biomet/ast059.
[8] N. Simon, J. Friedman, T. Hastie, and R. Tibshirani, “A sparse-group lasso,” J. Comput. Graph. Stat., 2013, doi: 10.1080/10618600.2012.681250.
[9] J. Friedman, T. Hastie, and R. Tibshirani, “Regularization paths for generalized linear models via coordinate descent,” J. Stat. Softw., vol. 33, no. 1, pp. 1–22, 2010, doi: 10.18637/jss.v033.i01.
[10] T. Thomson and S. Hossain, “Efficient shrinkage for generalized linear mixed models under linear restrictions,” Sankhya Indian J. Stat., 2018.
[11] J. Schelldorfer, P. Bühlmann, and S. van de Geer, “Estimation for high-dimensional linear mixed-effects models using ℓ1-penalization,” Scand. J. Stat., vol. 38, no. 2, pp. 197–214, 2011, doi: 10.1111/j.1467-9469.2011.00740.x.
[12] J. G. Ibrahim, H. Zhu, R. I. Garcia, and R. Guo, “Fixed and random effects selection in mixed effects models,” Biometrics, 2011, doi: 10.1111/j.1541-0420.2010.01463.x.
[13] J. Schelldorfer, L. Meier, and P. Bühlmann, “GLMMLasso: an algorithm for high dimensional generalized linear mixed models using ℓ1-penalization,” J. Comput. Graph. Stat., vol. 23, no. 2, pp. 460–477, 2014, doi: 10.1080/10618600.2013.773239.
[14] A. Groll, “glmmLasso: variable selection for generalized linear mixed models by L1-penalized estimation,” 2017.
[15] A. Muslim, M. Hayati, B. Sartono, and K. A. Notodiputro, “A combined modeling of generalized linear mixed model and lasso techniques for analizing monthly rainfall data,” 2018, doi: 10.1088/1755-1315/187/1/012044.
[16] D. Miller et al., “Analyzing changes in respiratory rate to predict the risk of COVID-19 infection,” vol. 2, pp. 1–10, 2020, doi: 10.1101/2020.06.18.20131417.
[17] G. Nardi et al., “Lactate arterial-central venous gradient among COVID-19 patients in ICU: a potential tool in the clinical practice,” Crit. Care Res. Pract., 2020, doi: 10.1155/2020/4743904.
Learning Interest Modelling of Poliwangi Students to Learn Mathematics Engineering through MOOCs Using Dummy Regression

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 6(4) (2021), Pages 181-187
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: August 29, 2020 Reviewed: March 17, 2021 Accepted: April 11, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.10212

Ika Yuniwati1, Aprilia Divi Yustita2, Siska Aprilia Hardiyanti3, I Wayan Suardinata4
1,2,3,4Politeknik Negeri Banyuwangi
Email: ika@poliwangi.ac.id

Abstract

MOOCs are a learning system in the form of massive, open online courses that give participants unlimited content accessible via the web. Engineering mathematics taught through MOOCs, which will be developed in the following year, is expected to be well received by students. The purpose of this study was to determine student interest in studying through MOOCs. This study uses a dummy regression model of learning hours in each interest category. Dummy regression is considered a suitable model because it can quantify qualitative data. Qualitative data were obtained from a questionnaire distributed to 240 students. The questionnaire contains indicators of student interest in MOOCs, covering cognitive, affective, and psychomotor interests. The result of this study is that 60.7% of the variation in the amount of time spent studying mathematics is explained by students' interest in learning mathematics through MOOCs, and the remaining 39.3% is influenced by other factors. The model is $Y_i = 1.562 + 4.729 D_1 + 1.461 D_2 + \varepsilon_i$. It can be concluded that students who are interested in studying mathematics through MOOCs study the most: their average learning hours exceed those of the reference group by 4.729 hours (1.562 + 4.729 = 6.291 hours).
Keywords: dummy regression model; learning interest; MOOCs

Introduction

Massive open online courses (MOOCs) can be regarded as a revolution in education that has begun to grow and become popular. Individuals can get training in the areas they need through educational programs that are open to all students throughout the world [1]. MOOCs cater to a large number of students and provide a combination of open online courses, short video lectures, automated conversations, quizzes, peer and self-talk, and student collaboration through discussion forums. Since 2016, various kinds of MOOCs have been designed according to the level of thinking [2]. MOOCs contribute significantly to individual empowerment because they can help people learn about various topics [3]. The goal of MOOCs is to provide the best learning resources and new ways of learning in the classroom. Such learning can help students learn fully and work together, and it is also supported by expert guidance [4]. MOOCs are already used for learning in various countries. In Singapore, MOOCs can reduce university-level tuition fees and improve community access to such courses. They also provide skills and job training for community members [5]. This is consistent with research stating that MOOCs can provide new forms of learning through technology and save significant costs for education [6]. In Portugal, a study found a relationship between interest in school and educational success [7]. Other studies add that learning competencies can influence participation, perseverance, and sustainability [8]. A good formal education system supports the students involved.
According to a survey conducted at a Nigerian private university (Redeemer University), one factor that affects this performance is hours of study [9]. According to Bloom, the learning domains can be categorized as the cognitive domain (knowledge), the psychomotor domain (skills), and the affective domain (attitude) [10]. This research analyzed interest expressed in these domains, which serve as indicators of interest in MOOCs. The analysis used dummy regression, which can be used to predict interest. Previous research has applied it to students' interest in bath soap products [11]. The dummy regression model has also been used to model the performance of students majoring in mathematics at FMIPA [12]. Dummy variables have often been used in strategy research to study the effects of categorical variables [13,14]. The advantage of dummy variables coded 1 and 0 is that the resulting regression estimates can be interpreted directly [15].

Methods

Multiple linear regression analysis

Dummy regression analysis is a multiple linear regression analysis in which the independent variables are qualitative.
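Therefore, before using dummy regression analysis, multiple linear regression analysis must first be understood [12]. For each observation $i$, the following equation applies:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i \qquad (1)$$

The system of equations (1) can be written in matrix form by defining the following matrices:

$$\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix};\quad \mathbf{X} = \begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1k} \\ 1 & X_{21} & X_{22} & \cdots & X_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{nk} \end{bmatrix};\quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix};\quad \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix} \qquad (2)$$

so that the system can be written compactly as:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \qquad (3)$$

Under the assumption $\varepsilon_i \sim N(0, \sigma^2)$, equation (1) can be written in terms of the expected value:

$$E(Y_i) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} \qquad (4)$$

Parameter estimation

The parameters can be estimated using the least squares method, which gives, in matrix form:

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Y} \qquad (5)$$

Hypothesis testing was conducted to test the regression parameters as a whole.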
The overall test of the regression parameters is as follows:

$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_1$: at least one $\beta_j \neq 0$, $j = 1, \ldots, k$

The total sum of squares (SST) is the sum of the regression sum of squares (SSR) and the error sum of squares (SSE):

$$SST = SSR + SSE \qquad (6)$$

The test statistic is the F statistic:

$$F = \frac{SSR/df_R}{SSE/df_E} = \frac{MSR}{MSE}$$

where
$df_R$ = degrees of freedom for regression
$df_E$ = degrees of freedom for error
$MSR$ = regression mean square
$MSE$ = error mean square

$H_0$ is rejected if the computed $F$ exceeds $F_{(\alpha;\,k,\,n-k-1)}$. Minimizing the sum of squared errors gives:

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (7)$$

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \qquad (8)$$

From equations (6), (7), and (8) we obtain:

$$SSR = \hat{\boldsymbol{\beta}}'\mathbf{X}'\mathbf{Y} - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n} \qquad (9)$$

When independent variables are entered into the regression equation one at a time, a sequential F test is performed [16].

Hypothesis testing for partial regression coefficient parameters

The F test is used to determine the simultaneous effect of the independent variables on the dependent variable. After the F test is carried out, the t-test, or partial regression test, is carried out. This test is used to determine the effect of each independent variable on the dependent variable. The partial regression hypotheses are:

$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$

Test statistic:

$$t_{\text{calc}} = \frac{\hat{\beta}_j}{se(\hat{\beta}_j)} \qquad (10)$$

$H_0$ is rejected if $|t_{\text{calc}}| > t_{(\alpha/2;\,n-k-1)}$.

Coefficient of determination

After the simultaneous effect and the effect of each independent variable are known, the next step of the analysis is to find the percentage of the variation in the dependent variable that is explained by the independent variables as a whole. The multiple coefficient of determination $R^2$ measures the proportion of the total variation in the dependent variable Y that is explained jointly by the regression model.
The coefficient of determination is given by the formula:

$$R^2 = \frac{SSR}{SST} \qquad (11)$$

The model is then determined from the independent and dependent variables. There are several ways to build a regression model whose independent variables include qualitative variables; one of them is to use dummy variables. Dummy variables are used to see how the classifications in the sample affect the parameter estimates. Dummy variables also quantify qualitative variables. For example, suppose the value of a variable Y is influenced by one quantitative independent variable (X) and one qualitative independent variable with two categories, category 1 and category 2. The dummy models for this example are:
a. $Y = a + bX + cD_1$ (dummy intercept model)
b. $Y = a + bX + c(D_1 X)$ (dummy slope model)
c. $Y = a + bX + c(D_1 X) + dD_1$ (dummy intercept and slope model)

This research used the dummy intercept model, or simple dummy regression model, given by Modupe [9]:

$$Y_i = b_0 + b_1 Z_i + e_i$$

where
$b_0$ = intercept
$b_1$ = regression coefficient
$Z_i$ = 1 if unit $i$ belongs to the group of interest
$Z_i$ = 0 if unit $i$ belongs to the reference group

Results and discussion

Data presentation

The respondents were 240 students. These students had been introduced to the MOOCs that were developed. The students were given a questionnaire of 20 questions, designed to contain indicators of interest in the cognitive, affective, and psychomotor domains. After the questionnaires were completed, the students were grouped into students who are interested in learning mathematics through MOOCs (A), students who do not like to learn mathematics through MOOCs (B), and students who do not want to learn mathematics through MOOCs (C).
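The grouping and dummy coding described above can be sketched in a few lines. The score intervals are those given in Table 1, with category C as the reference group; the function names, the assumption of integer scores between 20 and 80, and the sample scores are hypothetical and serve only as illustration.

```python
def categorize(score):
    """Map an integer questionnaire score (20-80) to category A, B, or C."""
    if 60 <= score <= 80:
        return "A"  # interested in learning mathematics through MOOCs
    if 40 <= score <= 59:
        return "B"  # does not like learning mathematics through MOOCs
    if 20 <= score <= 39:
        return "C"  # does not want to learn mathematics through MOOCs
    raise ValueError(f"score {score} is outside the 20-80 range")


def dummy_code(category):
    # Category C is the reference group, so it is coded (D1, D2) = (0, 0).
    d1 = 1 if category == "A" else 0
    d2 = 1 if category == "B" else 0
    return d1, d2


scores = [72, 45, 31]  # hypothetical respondents
coded = [dummy_code(categorize(s)) for s in scores]
print(coded)
```

Three categories need only two dummy columns: adding a third column for C would make the columns sum to the intercept column and leave the least squares estimate in equation (5) undefined.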
The grouping scores can be seen in Table 1.

Table 1. Student interest grouping score

Category   Score interval
A          60-80
B          40-59
C          20-39

The data, categorized as in Table 1, were then converted to dummy variables. Category C was chosen as the reference category, so category C becomes D0, category A becomes D1, and category B becomes D2.

Test of significance and coefficient of determination

The significance test is divided into two tests: the F-test for simultaneous significance and the t-test for partial significance. The F-test tests a regression through hypotheses that involve more than one coefficient. It can also be used to test the linearity of a regression equation and to examine the effect of the independent variables on the dependent variable. The F-test results of this study are shown in Table 2.

Table 2. Result of F test

F-statistic   df   p-value
183.313       2    0.000

Table 2 indicates that the value of the F-test is 183.313 with a p-value of 0.000. It can therefore be stated that the independent variable, in this case interest in studying engineering mathematics through MOOCs, affects the dependent variable, the amount of time studied. The t-test can be used to test hypotheses about individual coefficients; it is also often called a partial test. The results of the t-test are shown in Table 3.

Table 3. Result of t test

Coefficient   t-value   Sig.
D0            6.282     0.000
D1            16.040    0.000
D2            5.257     0.000

Based on Table 3, all predictor variables significantly influence the model, since every Sig. value is less than 0.05. So, the regression coefficient of each category significantly influences the number of learning hours.
The coefficient of determination is used to find out how much of the variation in the dependent variable is explained by the independent variable. It can be seen in Table 4.

Table 4. Effect of students' interest in MOOCs on the amount of time studying mathematics

Model   R        R²
1       0.779a   0.607

a. Predictors: (Constant), D2, D1
b. Dependent variable: the amount of time studying mathematics

Table 4 reflects the correlation between the amount of time studying mathematics and students' interest in learning mathematics through MOOCs. The R value is 0.779, which shows that the correlation is strong. In addition, R² is 60.7%, which means that 60.7% of the variation in the amount of time studying mathematics is explained by students' interest in learning mathematics through MOOCs, and the remaining 39.3% is influenced by other factors.

Interpretation of dummy regression model

The model uses the data categorized into the three interest categories. The estimates can be seen in Table 5.

Table 5. The estimated learning interest through the amount of time studying mathematics in 3 categories

Model        Unstandardized coefficient B   Std. error   Sig.
(Constant)   1.562                          0.249        0.000
D1           4.729                          0.295        0.000
D2           1.461                          0.278        0.000

Table 5 covers students who do not want to learn mathematics through MOOCs (D0, the reference category), students who are interested in learning through MOOCs (D1), and students who do not like to learn through MOOCs (D2), with the average hours of study (Y) as the dependent variable. The resulting linear regression model is

$$Y_i = 1.562 + 4.729 D_1 + 1.461 D_2 + \varepsilon_i$$

The interpretation of the regression model above is that the average study time for students who do not want to learn engineering mathematics is 1.562 hours.
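Under the usual reading of a dummy intercept model, the constant is the mean of the reference category and each dummy coefficient is the difference between that category's mean and the reference mean. A minimal sketch, assuming this parameterisation and using the coefficients reported in Table 5 (the function and variable names are illustrative only):

```python
# Coefficients as reported in Table 5.
b0, b1, b2 = 1.562, 4.729, 1.461


def predicted_hours(d1, d2):
    # Fitted mean of Y for a given dummy pattern (error term omitted).
    return b0 + b1 * d1 + b2 * d2


mean_c = predicted_hours(0, 0)  # reference: does not want to learn via MOOCs
mean_a = predicted_hours(1, 0)  # interested in learning via MOOCs
mean_b = predicted_hours(0, 1)  # does not like learning via MOOCs

for label, value in [("C", mean_c), ("A", mean_a), ("B", mean_b)]:
    print(f"category {label}: {value:.3f} hours")
```

The fitted category means are therefore obtained by adding each coefficient to the constant, not by subtracting it.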
The average difference in hours of study between students interested in learning engineering mathematics through MOOCs and students who do not want to learn engineering mathematics through MOOCs is 4.729 hours. In other words, the interest of students who do not want to study engineering mathematics through MOOCs is lower than the interest of students interested in learning engineering mathematics through MOOCs. The average study time for students who are interested in learning engineering mathematics is therefore 1.562 + 4.729 = 6.291 hours. The average difference in hours of study between students who do not like to study engineering mathematics through MOOCs and students who do not want to study engineering mathematics through MOOCs is 1.461 hours. In other words, the interest of students who do not want to study engineering mathematics through MOOCs is lower than that of students who do not like to study engineering mathematics through MOOCs. The average study time for students who do not like to study engineering mathematics is therefore 1.562 + 1.461 = 3.023 hours.

Conclusions

The results of this study show that the interest of students who do not want to study engineering mathematics through MOOCs is lower than the interest of students who are interested in learning engineering mathematics through MOOCs. Moreover, the interest of students who do not want to study engineering mathematics through MOOCs is lower than the interest of students who do not like to study engineering mathematics through MOOCs.

References

[1] H. Bicen, “Determining the effect of using social media as a MOOCs tool,” Procedia Comput. Sci., vol. 120, pp. 172–176, 2017, doi: 10.1016/j.procs.2017.11.225.
[2] C. Wrigley, G. Mosely, and M. Tomitsch, “Design thinking education: a comparison of massive open online courses,” She Ji, vol. 4, no. 3, pp. 275–292, 2018, doi: 10.1016/j.sheji.2018.06.002.
[3] M. Aparicio, T. Oliveira, F. Bacao, and M. Painho, “Gamification: a key determinant of massive open online course (MOOCs) success,” Inf.
Manag., vol. 56, no. 1, pp. 39–54, 2019, doi: 10.1016/j.im.2018.06.003.
[4] J. Ma, J. Zheng, and G. Zhao, “The applicable strategy for the courses alliance in regional universities based on MOOCs platform,” Procedia Soc. Behav. Sci., vol. 176, pp. 162–166, 2015, doi: 10.1016/j.sbspro.2015.01.457.
[5] V. Lim, L. Wee, S. Ng, and J. Teo, “Massive open and online courses (MOOCs) and open education resources (OER) in Singapore,” J. Southeast Asian Educ., vol. 1, pp. 1–13, 2017.
[6] R. Pollack Ichou, “Can MOOCs reduce global inequality in education?,” Australas. Mark. J., vol. 26, no. 2, pp. 116–120, 2018, doi: 10.1016/j.ausmj.2018.05.007.
[7] P. Goulart and A. S. Bedi, “The impact of interest in school on educational success in Portugal,” Bonn, Germany, 2011.
[8] W. Abeer and B. Miri, “Students’ preferences and views about learning in a MOOCs,” Procedia Soc. Behav. Sci., vol. 152, pp. 318–323, 2014, doi: 10.1016/j.sbspro.2014.09.203.
[9] O. D. Modupe, “A dummy variable regression on students’ academic performance,” Transnatl. J. Sci. Technol., vol. 2, no. 6, pp. 47–54, 2012.
[10] M. E. Hoque, “Three domains of learning cognitive, affective, and psychomotor,” J. EFL Educ. Res., vol. 2, no. 2, pp. 45–52, 2016, [Online]. Available: https://www.mendeley.com/catalogue/three-domains-learning-cognitiveaffective-psychomotor-second-principle/.
[11] D. Sipahutar, P. Bangun, and U. Sinulingga, “Analisa faktor ketertarikan mahasiswa terhadap produk sabun mandi,” Saintia Mat., vol. 1, no. 2, pp. 175–185, 2013.
[12] N. Amalita and Y. Kurniawati, “Model regresi dummy dalam memprediksi performansi akademik mahasiswa jurusan matematika FMIPA UNP,” in Prosiding Semirata FMIPA Universitas Lampung, 2013, pp. 387–391.
[13] P. S. L. Yip and E. W. K.
Tsang, “Interpreting dummy variables and their interaction effects in strategy research,” Strateg. Organ. J., vol. 5, no. 1, pp. 13–30, 2007, doi: 10.1177/1476127006073512.
[14] M. Venkataramana, M. Subbarayudu, M. Rajanis, and K. N. Sreenivasulu, “Regression analysis with categorical variables,” International Journal of Statistics and Systems, vol. 11, no. 2, pp. 135–143, 2016.
[15] I. C. A. Oyeka and C. H. Nwankwo, “Use of ordinal dummy variables in regression models,” IOSJRM, vol. 2, no. 5, pp. 1–7, 2012.
[16] N. Draper and H. Smith, Applied Regression Analysis, no. 48. 1981.

Sentiment Analysis on Government Performance in Tourism During the COVID-19 Pandemic Period With Lexicon Based

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(1) (2021), Pages 28-39
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: June 09, 2021 Reviewed: September 03, 2021 Accepted: October 25, 2021
DOI: https://doi.org/10.18860/ca.v7i1.12488

Adri Priadana1, Ahmad Ashril Rizal2
1Center of Data Analytic Research and Services, Universitas Jenderal Achmad Yani Yogyakarta
2Information Technology, Universitas Islam Negeri Mataram
Email: adripriadana3202@gmail.com, ashril.rizal@gmail.com

Abstract

The impact of the COVID-19 pandemic has affected all industries in Indonesia and around the world, including the tourism industry. The government has conducted many programs to answer the needs of the tourism industry, especially in making tourism and business destination management programs and carrying out oriented activities during the COVID-19 pandemic. The government also has a role in making policies, especially in the roadmap for developing the tourism industry. However, the government needs a way to gauge public sentiment towards the policies that have been implemented.
this study aimed to track trending topics and analyze the sentiment of public opinion on instagram to assess government performance in tourism during the covid-19 pandemic period. the trending topics were classified by sentiment using a lexicon-based method and a naive bayes classifier. instagram data collected since january 2020 showed that the five most frequent topics in the tourism sector were health protocols, hotels, homes, streets, and beaches. sentiment analysis of these five topics showed that beaches received the most positive sentiment, 80.87%, and hotels the most negative sentiment, 57.89%. evaluation with a confusion matrix gave an accuracy, precision, and recall of 82.53%, 86.99%, and 83.43%, respectively.
keywords: sentiment analysis; government performance in tourism; covid-19 pandemic period; lexicon based
introduction
the covid-19 pandemic has affected all industries in indonesia and worldwide, including the tourism industry, and its effects spread to various other sectors. the tourism industry in indonesia is linked to other sectors such as hotels, restaurants, transportation, and micro, small, and medium enterprises. it also affects souvenir and culinary entrepreneurs, travel agents, and tour guides. the decline in state revenue from the tourism sector due to covid-19 is, on a national scale, very large. the government should not only count and study the impact but also take concrete steps to save the tourism industry in indonesia. careful strategy and planning are needed to save the tourism industry in indonesia after covid-19, and the information for this can be obtained from social media.
one of the social media platforms most widely used by indonesians is instagram. a report published on andi link (footnote 1) shows that instagram was one of the most-used social media platforms in 2020, with 63 million users in indonesia. data from social media are indeed valuable for research. one earlier study based on instagram data analyzed human selfies collected through a set of hashtags. it showed how images are detected as human faces with the haar cascade method, which reached an accuracy of 71.48% [1]. data can be collected from instagram by web scraping. in another study, simple additive weighting was applied in a decision support system for selecting endorsement accounts on instagram. the account parameters in that study were the number of followers, the number of likes, the number of comments, and how regularly posts are updated. the best parameters for selecting an endorsement account were determined successfully, with a system accuracy of 75% [2]. instagram plays an increasingly central role in social media and is essential for users to interact and communicate. it also contributes to developing tourism destinations, and it is clear that instagram and its users are transforming into a new form [3]. yadav and roychoudhury studied the effect of trip mode on opinions about hotel aspects using a social media analysis approach. knowing exactly what customers expect allows service providers to focus on the aspects that matter most.
it helps hoteliers prioritize their efforts, allocate resources according to customer needs, and provide tailor-made offers that increase customer satisfaction and optimize resource utilization. because social media is an open source of information, positive or negative sentiment about a hotel or its aspects can affect the hotel's business. it also provides an opportunity to identify the features that customers perceive as essential and to find the main reasons for customer dissatisfaction. the data were taken from hotel visitor reviews on tripadvisor, grouped by travel mode. the reviews fall into five travel modes: family travelers form the majority (44%), followed by couples (22%), business travelers (19%), friends (11%), and single travelers (4%) [4]. another opinion mining platform extracts and classifies hotel reviews posted by users on tourism websites. the system visits web pages starting from a given url, extracts reviews from the page content, processes them with opinion mining, and classifies each review as positive, negative, or neutral. the proposed process has acceptable accuracy, does not depend on the domain, and does not require expensive resources to operate. the review concludes that in the tourism domain the analysis is made aspect oriented, because users express opinions on many aspects and a single review can contain mixed sentiments [5]. sentiment analysis has also been used to study halal tourism in europe. halal tourism has recently received significant attention from academics and practitioners. that study analyzed tweets related to halal tourism. its findings can help various stakeholders, such as marketers who want to target the halal tourism market.
the study also contributes to existing knowledge about tourism in general and halal tourism in particular [6].
the government has run many programs to meet the needs of the tourism industry, notably tourism and business destination management programs and related activities during the covid-19 pandemic. the government also has a role in setting policy, especially the roadmap for developing the tourism industry. however, it also needs a way to gauge public sentiment towards the policies that have been implemented. this study aimed to track trending topics and analyze the sentiment of public opinion on instagram to assess government performance in tourism during the covid-19 pandemic period.
footnote 1: https://andi.link/hootsuite-we-are-social-indonesian-digital-report-2020/
some studies have analyzed the sentiment of public opinion on social media in ways that could help the government or the relevant authorities develop responses or programs to address problems in the community, especially during the covid-19 pandemic. shofiya and abidi in 2021 [7] analyzed the sentiment of public opinion about social distancing programs in canada using twitter data from the covid-19 pandemic period. they used the support vector machine (svm) technique to classify sentiment. obiedat et al. in 2021 [8] analyzed the sentiment of public opinion to support government decisions in jordan during the covid-19 pandemic, based on facebook data. they classified sentiment with a whale optimization algorithm combined with support vector machines (woa-svm) and compared it with several other methods. habibi et al. in 2021 [9] analyzed the sentiment and modeled the topics of public opinion about the covid-19 epidemic in indonesia based on twitter data.
they used latent dirichlet allocation (lda) to model topics and compared naive bayes with other methods for sentiment classification. prastyo et al. in 2020 [10] analyzed the sentiment of public opinion on twitter about the indonesian government's handling of covid-19, using svm with a normalized poly kernel to classify sentiment. in this study, the trending topics are classified by sentiment using a lexicon-based method and a naive bayes classifier.
several studies have used lexicon-based methods to analyze sentiment. moarlex, an arabic sentiment lexicon built through automatic lexicon expansion, is a large-scale sentiment lexicon. it is intended to provide accessible arabic resources for sentiment analysis tasks. one advantage of the lexicon and of the techniques used to construct it is that it can include terms commonly used in social media [11]. for indonesian text, the openly accessible sastrawi library can be used. a literature review of sentiment analysis studies on social media found that researchers have introduced various methods; the most common in lexicon-based approaches are sentiwordnet and tf-idf, while the most common in machine learning are naïve bayes and svm. choosing the right sentiment analysis method depends on the data itself [12]. different preprocessing methods affect the polarity classification of sentiment on twitter. removing urls, stop words, and numbers is appropriate for reducing noise but affects classifier performance only minimally [13].
methods
data collection
first, we collected instagram data using a web data extraction method, which is used to routinely extract data from a web data source [14].
we obtained instagram posts through hashtags closely related to tourism in indonesia. the dataset consists of the captions, in various languages, of posts carrying these predetermined hashtags.
data cleaning process
the second step in this research is data cleaning. figure 1 shows the data cleaning process and the processing stages inside it. the steps are lowercasing, symbol removal, stemming, tokenizing, and bag of words. lowercasing converts all letters in the instagram caption to lowercase, which simplifies the later steps in determining sentiment. table 1 shows examples of the lowercase process.
table 1. lowercase process
no. | real caption | after lowercase process
1 | trust and maximize your vacation with @ giliketapang001 guaranteed satisfying fun | trust and maximize your vacation with @ giliketapang001 guaranteed satisfying fun
2 | are you paid for being sent home? what if we open a small business? | are you paid for being sent home? what if we open a small business?
3 | just bored at home | just bored at home
4 | the holiday has ended. hopefully, this feeling of happiness can last even though the vacation time is over? the feeling that i want a year off | the holiday has ended. hopefully, this feeling of happiness can last even though the vacation time is over? the feeling that i want a year off
figure 1. data cleaning process
the symbol removal process deletes symbols from the caption. the deleted symbols are the period (.), comma (,), exclamation mark (!), question mark (?), at sign (@), hash (#), url prefixes (https://), and several other special symbols and emoticons.
symbols are removed on the assumption that they do not determine the sentiment of the scraped captions. table 2 shows examples of captions after removing symbols. the stemming process reduces affixed words to their root words by removing all affixes, whether at the beginning, in the middle, or at the end of the word. at this stage, we used the sastrawi library to replace each affixed word with its root. the tokenizing process splits each caption in the dataset into individual words, so a caption of two or more words is handled one word at a time. bag of words is a concept from text analysis that represents a document as an unordered collection of its words. the method counts how often each word appears in the document dataset, so the output of the bag of words model is a frequency vector.
table 2. removing symbol process
no. | caption | after removing symbol process
1 | trust and maximize your vacation with @ giliketapang001 guaranteed satisfying fun. | trust and maximize your vacation with giliketapang001 guaranteed satisfying fun.
2 | are you paid for being sent home? what if we open a small business? | are you paid for being sent home what if we open a small business
3 | just bored at home. | just bored at home
4 | the holiday has ended. hopefully, this feeling of happiness can last even though the vacation time is over? the feeling that i want a year off. | the holiday has ended. hopefully, this feeling of happiness can last even though the vacation time is over the feeling that i want a year off
determination of trending topics
trending topics are determined from the ten highest frequencies found in the captions. the frequencies are obtained by counting n-grams over the tokenized words.
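the cleaning and counting steps above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' code: it uses the standard library instead of nltk, it skips the sastrawi stemming step, and the function names (`clean_caption`, `ngrams`, `count_ngrams`) and the sample captions are made up for the example.

```python
import re
from collections import Counter


def clean_caption(caption):
    """Lowercase a caption and strip links and symbols (stemming is omitted here)."""
    text = caption.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links before stripping symbols
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop . , ! ? @ # and emoticons
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(captions, n):
    """Count n-grams over all captions after cleaning and whitespace tokenizing."""
    counts = Counter()
    for caption in captions:
        tokens = clean_caption(caption).split()
        counts.update(ngrams(tokens, n))
    return counts


# Made-up captions, loosely modelled on the examples in table 1.
captions = [
    "Just bored at home.",
    "Still bored at home, what if we open a small business?",
    "The holiday has ended!",
]

bigram_counts = count_ngrams(captions, 2)
for gram, freq in bigram_counts.most_common(3):
    print(" ".join(gram), freq)
```

with n set to 1 the same counter is the bag-of-words frequency vector; with a larger n it gives the n-gram counts from which the most frequent entries can be read off.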
sentiment analysis using lexicon base and naive bayes classifier
in general, the sentiment analysis process follows the steps shown in figure 2. the lexicon-based process is carried out after the collected dataset has been cleaned. the words in the data are counted according to the rules in the data dictionary. the data dictionary is a collection of words from the great dictionary of the indonesian language (kbbi) that have been classified into negative and positive classes. the naive bayes classifier is a probabilistic classifier based on bayes' theorem. bayes' theorem relates the probabilities of two events $A$ and $B$, $P(A)$ and $P(B)$, to the conditional probabilities of $A$ given $B$ and of $B$ given $A$, $P(A \mid B)$ and $P(B \mid A)$. the bayes equation is shown in equation 1.
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \qquad (1)$$
in this case, the quantity computed for classification is $P(A \mid B)$, the probability that the hypothesis is correct (valid) for the observed sample data $B$. $B$ is sample data with an unknown class (label), and $A$ is the hypothesis that $B$ belongs to a particular known class (label). $P(A)$ is the probability of hypothesis $A$, $P(B)$ is the probability of the observed sample data, and $P(B \mid A)$ is the probability of sample data $B$ if the hypothesis is assumed to be correct (valid) [15].
figure 2. sentiment analysis process
performance evaluation
in this study, performance was evaluated using the confusion matrix together with precision, recall, and accuracy. precision and recall values are calculated using equation 2 and equation 3.
predictive accuracy is calculated using equation 4 [16].
$$\text{precision} = \frac{TP}{TP + FP} \times 100\% \qquad (2)$$
$$\text{recall} = \frac{TP}{TP + FN} \times 100\% \qquad (3)$$
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \qquad (4)$$
results and discussion
the results of this research take the form of trending topics in the tourism sector since the covid-19 pandemic hit. the evaluation is based on data obtained from instagram, whose distribution is shown in table 3. data were collected by scraping with python. the natural language processing libraries used are the natural language toolkit (nltk) and the sastrawi library for cleaning indonesian text.
trending topics in the tourism sector
the dataset used in this study contains 195136 records in total. first, a similarity check is run over the captions. this is needed because a single caption can carry several of the hashtags used as search parameters in this paper, so when identical captions are found only one is kept in the dataset. the trending topics on instagram are the accumulated result of bigrams and trigrams computed with nltk. captions contain word sequences that form part of sentences with the same pattern. take, in english translation, caption (1) “follow health protocols for holidays, it seems complicated” and caption (2) “to the beach you have to obey health protocols, prefer #dirumahaja”. after the cleaning stage, caption (1) becomes “follow health protocols for very complicated holidays” and caption (2) becomes “beaches must comply with health protocols, at home”. after tokenizing into bigrams, the intersection caption (1) ∩ caption (2) is “health protocol”.
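the intersection in this example can be reproduced with python sets. the sketch below assumes plain whitespace tokenizing rather than nltk, and the helper name `bigram_set` is hypothetical; the two strings are the cleaned captions quoted above, with the plural “protocols” as written there.

```python
def bigram_set(text):
    """Return the set of adjacent word pairs in a cleaned caption."""
    tokens = text.replace(",", " ").split()
    return set(zip(tokens, tokens[1:]))


caption_1 = "follow health protocols for very complicated holidays"
caption_2 = "beaches must comply with health protocols, at home"

# The only word pair the two captions have in common.
shared = bigram_set(caption_1) & bigram_set(caption_2)
print(shared)  # {('health', 'protocols')}
```

using sets means a bigram repeated inside one caption is counted once for that caption, which is a design choice of this sketch rather than something the paper specifies.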
the number of such intersections is the parameter for determining a trending topic, as it reflects the cumulative frequency of the words that appear.
table 3. frequency hashtag
no. | hashtag | frequency
1 | #pariwisata | 12397
2 | #indonesia | 13396
3 | #liburan | 13997
4 | #pesonaindonesia | 14175
5 | #jalanjalan | 14196
6 | #wonderfulindonesia | 14114
7 | #wisataindonesia | 14248
8 | #kuliner | 13873
9 | #visitindonesia | 14113
10 | #wisata | 14115
11 | #exploreindonesia | 14141
12 | #wisataalam | 14269
13 | #pantai | 14157
14 | #dirumahaja | 13945
figure 3. wordcloud instagram caption
figure 4. top 5 instagram trending topics
figure 3 and figure 4 show the trending topics found in the dataset. figure 3 is a wordcloud that describes the distribution of text in the dataset; the size of each word is proportional to its frequency in the dataset. figure 4 shows graphically the top five trending topics in the dataset. the trending topics that emerged were health protocols (1301), hotels (1227), homes (739), roads (772), and beaches (544).
sentiment analysis
at this stage, the sentiment results are based on a predefined data dictionary, to which the lexicon-based method refers. in addition to the bayes probabilities, the number of words in the caption that fall into the negative and positive classes serves as a classification reference. for example, in english translation, caption (1), “follow the health protocol for the holidays, it seems like it's very complicated”, becomes “follow the health protocol for a very complicated holiday” after cleaning. in this caption, the words follow, protocol, health, for, holiday, and very are neutral, and the word complicated is negative. figure 5 shows the sentiment results for the trending tourism topics on instagram.
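a minimal sketch of this word-counting rule, applied to the cleaned caption above, might look as follows. the two word sets are tiny hypothetical stand-ins for the data dictionary, the function name `lexicon_label` is made up, and the tie rule (equal counts give neutral) is an assumption, since the paper does not state how ties are handled.

```python
# Hypothetical miniature lexicon; the real data dictionary is far larger.
POSITIVE_WORDS = {"fun", "happiness", "satisfying"}
NEGATIVE_WORDS = {"complicated", "bored"}


def lexicon_label(caption):
    """Label a cleaned caption by comparing positive and negative word counts."""
    tokens = caption.lower().split()
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    if positive > negative:
        label = "positive"
    elif negative > positive:
        label = "negative"
    else:
        label = "neutral"  # assumed tie rule, not taken from the paper
    return label, positive, negative


print(lexicon_label("follow the health protocol for a very complicated holiday"))
# ('negative', 0, 1)
```

words that appear in neither set contribute nothing to either count, which matches the description of neutral words in the example.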
health protocols are the most discussed topic among instagram users, and most users wrote positive captions about them: the sentiment calculation shows that 73.12% of captions say positive things about health protocols. quite a few instagram users complained about health protocols while on vacation or traveling during the pandemic. still, many users invite others to comply with health protocols and stress their importance for reducing the number of corona cases in indonesia. there are also many complaints about hotels, not only from tourists but also from hotels and their employees. with hotel access closed and hotels slow to reopen, many instagram users complained about being laid off. on the topic of beaches, many positive captions appeared. for example, access to the beach is still open even though health protocols are applied, and some captions note that beaches in indonesia have become prettier and cleaner since the covid-19 pandemic hit. beach sentiment is 80.87% positive and only 19.13% negative.
figure 5. sentiment analysis of trending topics
algorithm performance
the sentiment results in this paper were evaluated using a confusion matrix. the numbers of true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn) are shown in table 4.
table 4. confusion matrix of sentiment classification
class | classified as positive | classified as negative
positive | 2911 | 435
negative | 578 | 1874
$$\text{precision} = \frac{TP}{TP + FP} \times 100\% = \frac{2911}{2911 + 435} \times 100\% = 86.99\%$$
$$\text{recall} = \frac{TP}{TP + FN} \times 100\% = \frac{2911}{2911 + 578} \times 100\% = 83.43\%$$
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% = \frac{2911 + 1874}{2911 + 1874 + 435 + 578} \times 100\% = 82.53\%$$
in this study, the lexicon-based method worked from a dictionary of opinion words (lexicon) compiled beforehand.
the words in the data dictionary were divided into two classes, one containing positive words and one containing negative words. the dictionary was used to identify whether a sentence contains an opinion. for text segmented by word order, the lexicon method searched the data dictionary to find which words in the text were positive or negative. in the naive bayes classifier process, the word classes found by the lexicon search were used to count the words in the dataset that fall into the positive class and those that fall into the negative class. the naive bayes classifier used a prior probability for each label, that is, the probability held before any experiment is carried out, given by the frequency of the label in the training set. the purpose of this process was to classify the captions in the dataset. the naive bayes classifier performs text classification based on previously stored training data. the implementation has three stages: making a list of n-grams, making a list of word classes from the lexicon process, and building a classifier. the classifier needs to be trained, which requires a list of captions manually classified into positive and negative classes. around 250 positive captions and 250 negative captions were used to train the classifier. based on the confusion matrix, the accuracy of the sentiment results is 82.53%. precision describes how many of the captions the model predicts as positive are actually positive. the precision of the sentiment results is 86.99%.
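the evaluation figures can be re-derived from the counts in table 4. the sketch below follows the worked formulas in taking tp = 2911, fp = 435, fn = 578, and tn = 1874; the function name `evaluate` is hypothetical.

```python
def evaluate(tp, fp, fn, tn):
    """Return precision, recall and accuracy as percentages."""
    precision = 100 * tp / (tp + fp)
    recall = 100 * tp / (tp + fn)
    accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, accuracy


# Counts as assigned in the worked formulas for table 4.
precision, recall, accuracy = evaluate(tp=2911, fp=435, fn=578, tn=1874)
print(f"precision={precision:.4f} recall={recall:.4f} accuracy={accuracy:.4f}")
```

the computed values agree with the reported 86.99%, 83.43%, and 82.53% to within 0.01 percentage points.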
recall, which describes how many of the actually positive captions the model recovers, is 83.43%. some positive captions were classified as negative, and some negative captions as positive, because the lexicon-based method does not use a learning method. indonesian captions sometimes contain ambiguous sentences that are positive in meaning but use words that make the caption be classified as negative. an example is caption (3) of the dataset, in english translation: “for those of you who are nagging about wanting to go on a walk, there are more victims, it's better to stay safe with #dirumahaja”. this caption should fall into the positive class. however, words like nagging and victims are in the negative data dictionary, while words like dirumahaja are classified as positive. the caption therefore contains more negative words and is assigned to the negative class.
conclusions
this study aimed to track trending topics on instagram since covid-19 hit. the trending topics were classified by sentiment using a lexicon-based method and a naive bayes classifier. instagram data collected since january 2020 show that the five most frequent topics in the tourism sector are health protocols, hotels, homes, streets, and beaches. sentiment analysis of these five topics showed that beaches received the most positive sentiment, 80.87%, and hotels the most negative sentiment, 57.89%. evaluation with a confusion matrix gave an accuracy, precision, and recall of 82.53%, 86.99%, and 83.43%, respectively. in the future, further investigations will be carried out on how the public perceives the government's performance in the tourism sector during the new normal. moreover, the lexicon-based method is indeed very fast at classifying text data.
however, the sentiment results depend heavily on the data dictionary used and on the caption context. in indonesia, netizens often write captions using figures of speech for a specific purpose. a caption written with a figure of speech can fall into the wrong class. a learning method is needed so that captions with specific figures of speech fall into the correct class.
acknowledgments
the authors would like to thank the center of data analytic research and services, universitas jenderal achmad yani yogyakarta, for supporting the data of this research.
references
[1] a. priadana and m. habibi, “face detection using haar cascades to filter selfie face image on instagram,” in 2019 international conference of artificial intelligence and information technology (icaiit), 2019, pp. 6–9, doi: 10.1109/icaiit.2019.8834526.
[2] m. i. akrianto, a. d. hartanto, and a. priadana, “the best parameters to select instagram account for endorsement using web scraping,” in 2019 4th international conference on information technology, information systems and electrical engineering (icitisee), 2019, pp. 40–45, doi: 10.1109/icitisee48480.2019.9004038.
[3] m. n. fatanti and i. w. suyadnya, “beyond user gaze: how instagram creates tourism destination brand?,” procedia soc. behav. sci., vol. 211, pp. 1089–1095, nov. 2015, doi: 10.1016/j.sbspro.2015.11.145.
[4] m. l. yadav and b. roychoudhury, “effect of trip mode on opinion about hotel aspects: a social media analysis approach,” int. j. hosp. manag., vol. 80, no. september 2018, pp. 155–165, 2019, doi: 10.1016/j.ijhm.2019.02.002.
[5] c. bucur, “using opinion mining techniques in tourism,” procedia econ. financ., vol. 23, no. october 2014, pp. 1666–1673, 2015, doi: 10.1016/s2212-5671(15)00471-2.
[6] s. ainin, a. feizollah, n. b. anuar, and n. a.
abdullah, “sentiment analyses of multilingual tweets on halal tourism,” tour. manag. perspect., vol. 34, no. january 2019, p. 100658, 2020, doi: 10.1016/j.tmp.2020.100658.
[7] c. shofiya and s. abidi, “sentiment analysis on covid-19-related social distancing in canada using twitter data,” int. j. environ. res. public health, vol. 18, no. 11, p. 5993, jun. 2021, doi: 10.3390/ijerph18115993.
[8] r. obiedat, o. harfoushi, r. qaddoura, l. al-qaisi, and a. m. al-zoubi, “an evolutionary-based sentiment analysis approach for enhancing government decisions during covid-19 pandemic: the case of jordan,” appl. sci., vol. 11, no. 19, p. 9080, sep. 2021, doi: 10.3390/app11199080.
[9] m. habibi, a. priadana, and m. r. ma’arif, “sentiment analysis and topic modeling of indonesian public conversation about covid-19 epidemics on twitter,” ijid (international j. informatics dev.), vol. 10, no. 1, pp. 23–30, jun. 2021, doi: 10.14421/ijid.2021.2400.
[10] p. h. prastyo, a. s. sumi, a. w. dian, and a. e. permanasari, “tweets responding to the indonesian government’s handling of covid-19: sentiment analysis using svm with normalized poly kernel,” j. inf. syst. eng. bus. intell., vol. 6, no. 2, pp. 112–122, oct. 2020, doi: 10.20473/jisebi.6.2.112-122.
[11] m. youssef and s. r. el-beltagy, “moarlex: an arabic sentiment lexicon built through automatic lexicon expansion,” procedia comput. sci., vol. 142, pp. 94–103, 2018, doi: 10.1016/j.procs.2018.10.464.
[12] z. drus and h. khalid, “sentiment analysis in social media and its application: systematic literature review,” procedia comput. sci., vol. 161, pp. 707–714, 2019, doi: 10.1016/j.procs.2019.11.174.
[13] z. jianqiang and g. xiaolin, “comparison research on text pre-processing methods on twitter sentiment analysis,” ieee access, vol. 5, no. c, pp.
2870–2879, 2017, doi: 10.1109/access.2017.2672677.
[14] e. ferrara, p. de meo, g. fiumara, and r. baumgartner, “web data extraction, applications and techniques: a survey,” knowledge-based syst., vol. 70, pp. 301–323, nov. 2014, doi: 10.1016/j.knosys.2014.07.007.
[15] s. rana and a. singh, “comparative analysis of sentiment orientation using svm and naive bayes techniques,” proc. 2016 2nd int. conf. next gener. comput. technol. ngct 2016, no. october, pp. 106–111, 2017, doi: 10.1109/ngct.2016.7877399.
[16] e. m. martín and á. p. del pobil, robust motion detection in real-life scenarios, 1st ed. springer-verlag london, 2012.
a mixed integer linear programming model of order allocation involving mass customization logistic service (mcls)
cauchy – jurnal matematika murni dan aplikasi, volume 7(3) (2022), pages 474-482
p-issn: 2086-0382; e-issn: 2477-3344
submitted: february 10, 2022; reviewed: august 27, 2022; accepted: september 13, 2022
doi: http://dx.doi.org/10.18860/ca.v7i3.15398
cucuk nur rosyidi*, nina salsabila sulistiani, pringgo widyo laksono
industrial engineering department, universitas sebelas maret, jalan ir. sutami 36 a surakarta, indonesia 57126
email: cucuknur@staff.uns.ac.id*
abstract
in intense business competition, a company has to improve its competitiveness by focusing more on supply chain management. one of the crucial problems in the supply chain is the allocation of orders. more and more companies are adopting the mass customization logistics service (mcls) mode to determine the optimal allocation of orders, both to suppliers and to customized logistics services, at the lowest possible cost. for this purpose, a logistics service integrator (lsi) is needed to integrate the logistics tasks that are carried out operationally by functional logistics service providers (flsp).
this research aims to develop an optimization model for deciding the order allocation of the needed items from the manufacturer to the respective suppliers and of logistics tasks from the lsi to flsps. the problem was formulated using mixed integer linear programming (milp). the analysis shows that demand is the only parameter to which both the decision variables and the objective function are sensitive, while purchasing cost significantly affects only the objective function.
keywords: mixed integer linear programming; order allocation; mass customization logistics services.
introduction
in intense business competition, many companies make their best effort to improve their competitiveness through extensive product adjustments, higher product quality, and lower product costs with timely distribution. hence, supply chain management has become more important in increasing company competitiveness [1]. a company has to manage its supply chain efficiently to cope with increasing customer variety and demand, advances in communication technology and information systems, high competition in the era of globalization, and environmental awareness [2]. the short-term goal of supply chain management is to increase productivity while at the same time reducing total inventory costs and total cycle time. in the long term, the goal is to increase customer satisfaction, market share, and profits for all parties involved in the supply chain, namely suppliers, manufacturers, distribution centers, and customers [3]. achieving these goals requires good coordination of each element in the supply chain.
several important decisions should be made by the decision makers in the supply chain, such as supplier selection, order allocation, and third-party logistics selection. supplier selection is important for the company and, owing to the nature of the problem, involves multi-criteria decision making techniques [4-6]. the order allocation problem is usually solved using a constrained mathematical programming approach with a single-objective or multi-objective formulation. the common objectives of such models are to minimize cost, maximize profit, or maximize the total value of purchasing [7]. with optimal order allocation, the company can run the entire supply chain at its best performance [8].

supplier selection and order allocation have attracted many researchers. for example, the model in [9] determines the optimal order allocation in two stages. in the first stage, supplier selection is performed using multi-attribute utility theory, and in the second stage the optimal order allocation is found using multi-objective integer linear programming involving social and environmental objectives. another recent model for supplier selection and order allocation was developed in [10]. in that research, the weights of the sustainability criteria were determined using the best-worst method (bwm). the weights were then used to rate the suppliers with the measurement of alternatives and ranking according to compromise solution (marcos) method. research [11] developed an optimization model only for determining the optimal order allocation. research [12] considered transportation alternatives and lateral transhipment in the order allocation problem.
the model was used to determine the optimal order allocation and transportation alternative for a three-echelon supply chain consisting of supplier, manufacturer, and retailer.

supply chain transportation has to be managed efficiently. hence, according to [13-15], more and more companies have adopted the mass customization logistic service (mcls) mode to make their operations more efficient. in mcls, customized logistics services are provided, and the allocation of logistics tasks is conducted by a logistics service integrator (lsi). the lsi allocates the logistics tasks to functional logistics service providers (flsps). many researchers have studied mcls problems. for example, the scheduling problem of mcls was solved in [13] for deterministic flsp operation times and in [16] for uncertain ones. optimization models have been developed to solve the order allocation problems of mcls, such as in [17, 18]. both studies considered only the order allocation of logistics tasks from the lsi to the flsps. in fact, a manufacturer that uses the lsi services also needs to determine the optimal allocation of the needed items among the suppliers. hence, an optimization model needs to be developed that integrates the order allocation decisions for the needed items and for the logistics tasks. the problem is formulated as a mixed integer linear programming (milp) model that determines the allocation of orders to suppliers and the allocation of logistics tasks to flsps so as to minimize the total supply chain costs.

methods

the model is formulated using the milp method. the objective function of the model is to minimize the manufacturer's costs, which comprise the supplier cost, the outsourcing service cost, and the transportation cost. there are two decision variables in the model, namely the allocation of orders to each supplier and the assignment of logistics tasks to the respective flsps.
several assumptions are made in the modeling process: (1) each supplier can supply more than one product, (2) the quantity of orders to each supplier is assumed to be constant for each period, (3) the budget for purchasing is assumed to be constant for each period, and (4) each flsp has a different outsourcing price and service capacity.

model notations

index
i : supplier index (i = 1, …, I)
f : flsp index (f = 1, …, F)
j : procedure index (j = 1, …, J)
k : product index (k = 1, …, K)

decision variables
\(XC_{ik}\) : the order quantity of product k from supplier i
\(Q_{fjk}\) : the number of logistics tasks assigned by the lsi to flsp f for procedure j for product k
\(X_{fjk}\) : 1 if flsp f of procedure j is selected for product k, 0 otherwise

parameters
\(C_{ik}\) : unit cost of product k from supplier i ($)
\(TC\) : unit transportation cost per kg ($)
\(WC_{fjk}\) : the mass of product k in procedure j processed by flsp f (kg)
\(OC_{ik}\) : unit order cost of product k from supplier i ($)
\(B\) : total budget for procurement ($)
\(DC_k\) : demand for product k (unit)
\(CAPC_{ik}\) : capacity of supplier i for product k (unit)
\(P_{fjk}\) : unit service price of product k processed by flsp f for procedure j ($)
\(A_{fjk}\) : maximum service capacity of flsp f for procedure j and product k
\(Var_{fjk}\) : variable for linearization
\(M\) : big positive number (assumed value M = 1000000)

model formulation

the cost components are formulated in equations (1)-(3). equation (1) expresses the supplier cost, which is determined by multiplying the order allocation by the sum of the unit product cost and the unit order cost. equation (2) calculates the total cost of outsourcing services incurred by the company for flsp and lsi services. this cost is calculated by multiplying the selection variable, the service price, and the number of logistics tasks performed by the flsp.
equation (3) calculates the total transportation cost, which is expressed as a function of the mass of the product.

\[ TBP = \sum_{i=1}^{I} \sum_{k=1}^{K} \left( C_{ik} + OC_{ik} \right) XC_{ik} \tag{1} \]
\[ TBLO = \sum_{f=1}^{F} \sum_{j=1}^{J} \sum_{k=1}^{K} X_{fjk} \, P_{fjk} \, Q_{fjk} \tag{2} \]
\[ TBT = \sum_{f=1}^{F} \sum_{j=1}^{J} \sum_{k=1}^{K} TC \cdot WC_{fjk} \tag{3} \]

the constraints of the model are expressed in equations (4)-(12). equation (4) ensures that the expenditure covering all the costs does not exceed the budget. equations (5) and (6) ensure that the order quantity covers all the demand and does not exceed the supplier capacity. equation (7) ensures that at least one flsp is selected, to prevent service delays. equation (8) ensures that all the demand is processed by the flsps. equation (9) ensures that the number of logistics tasks assigned to an flsp for each procedure does not exceed the capacity of that flsp for that procedure. equations (10) and (11) express the non-negative and integer values of the decision variables. equation (12) defines the binary decision variable.

\[ \sum_{i=1}^{I} \sum_{k=1}^{K} \left( C_{ik} + OC_{ik} \right) XC_{ik} + \sum_{f=1}^{F} \sum_{j=1}^{J} \sum_{k=1}^{K} X_{fjk} \cdot C1_{fjk} \cdot Q_{fjk} \cdot TC \cdot WC_{fjk} \le B \tag{4} \]
\[ \sum_{i=1}^{I} XC_{ik} \ge DC_k \tag{5} \]
\[ XC_{ik} \le CAPC_{ik} \tag{6} \]
\[ \sum_{f=1}^{F} X_{fjk} \ge 1 \tag{7} \]
\[ \sum_{f=1}^{F} Q_{fjk} = DC_k \tag{8} \]
\[ Q_{fjk} \le A_{fjk} \tag{9} \]
\[ XC_{ik} \ge 0 \text{ and integer} \tag{10} \]
\[ Q_{fjk} \ge 0 \tag{11} \]
\[ X_{fjk} \in \{0, 1\} \tag{12} \]

equation (4) contains a non-linear term, the product of two decision variables. hence, the model has to be linearized by adding a surrogate variable. equations (13) and (14) give the lower and upper bounds of the surrogate variable. in addition, the surrogate variable should not be greater than the integer decision variable \(Q_{fjk}\), as stated in equation (15), to ensure the consistency of the model.
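the role of the surrogate variable can be checked numerically. the sketch below is only an illustration and not the authors' lingo model: it enumerates small integer values and confirms that the three bounds on \(Var_{fjk}\) (the lower bound \(Q - (1 - X)M\) and the upper bounds \(MX\) and \(Q\)) leave exactly one feasible value, \(X \cdot Q\), provided that \(Var \ge 0\) and \(Q \le M\). the non-negativity of \(Var\) is an assumption added here, and the function names and the small value M = 50 are hypothetical choices that keep the enumeration short.

```python
# Illustrative check of the big-M linearization of Var = X * Q.
# Assumptions (not from the paper): Var >= 0, integer Var, and a small M
# so that the enumeration stays cheap.

def feasible_var_values(x: int, q: int, big_m: int) -> list[int]:
    """Return every integer Var in [0, big_m] that satisfies the three bounds."""
    return [
        var
        for var in range(0, big_m + 1)
        if var >= q - (1 - x) * big_m   # lower bound, binding only when x = 1
        and var <= big_m * x            # forces Var = 0 when x = 0
        and var <= q                    # Var never exceeds Q
    ]


def linearization_is_exact(big_m: int) -> bool:
    """True if the bounds pin Var to x * q for all x in {0, 1}, 0 <= q <= big_m."""
    return all(
        feasible_var_values(x, q, big_m) == [x * q]
        for x in (0, 1)
        for q in range(0, big_m + 1)
    )


if __name__ == "__main__":
    print(linearization_is_exact(big_m=50))
```

without the assumption \(Var \ge 0\), the case \(X = 0\) would admit every value between \(Q - M\) and 0, so a cost-minimizing solver could drive the outsourcing term below zero.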
\[ Var_{fjk} \ge Q_{fjk} - (1 - X_{fjk}) \, M \tag{13} \]
\[ Var_{fjk} \le M \, X_{fjk} \tag{14} \]
\[ Var_{fjk} \le Q_{fjk} \tag{15} \]

results and discussion

optimization results

in this section, we give a numerical example and a sensitivity analysis to show the implementation of the model and how sensitive the model is to changes in some input parameters. in the numerical example, a single manufacturer has to order three kinds of raw materials from three suppliers. all raw materials can be supplied by all suppliers, except raw material a, which is supplied only by suppliers 1 and 2. after the order allocation for each supplier has been determined, the orders are delivered by a single lsi, which then orders the logistics services from three flsps with eight procedures. the unit product cost and unit order cost are shown in table 1. the demand for each product is set at 9,500 units, the maximum budget for procurement expenditure is $5,000, and the transportation cost per kg is $0.00023. the other parameters, which concern the flsp activities, are shown in table 2.

table 1. unit product and order cost

| raw material | supplier | unit product cost ($) | unit order cost ($) | supplier capacity (unit) |
|---|---|---|---|---|
| a | 1 | 7.75 | 0.210 | 10,000 |
| a | 2 | 6.15 | 0.210 | 5,000 |
| a | 3 | - | - | - |
| b | 1 | 320 | 0.013 | 15,000 |
| b | 2 | 290 | 0.014 | 1,000 |
| b | 3 | 278.6 | 0.007 | 8,000 |
| c | 1 | 151.33 | 0.021 | 8,000 |
| c | 2 | 34 | 0.004 | 10,000 |
| c | 3 | 39.65 | 0.005 | 1,000 |

table 2. service cost and flsp capacity.
| flsp | procedure | service cost a ($) | service cost b ($) | service cost c ($) | capacity a (unit) | capacity b (unit) | capacity c (unit) |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 2.2 | 2.2 | 2.2 | 3350 | 3369 | 2576 |
| 1 | 2 | 2.8 | 2.8 | 2.8 | 4557 | 3148 | 5560 |
| 1 | 3 | 4.3 | 4.3 | 4.3 | 5847 | 5288 | 1170 |
| 1 | 4 | 4.3 | 4.3 | 4.3 | 3573 | 3150 | 4734 |
| 1 | 5 | 4.4 | 4.4 | 4.4 | 2311 | 3395 | 2048 |
| 1 | 6 | 5.5 | 5.5 | 5.5 | 4500 | 4561 | 6751 |
| 1 | 7 | 5.1 | 5.1 | 5.1 | 3457 | 2390 | 4286 |
| 1 | 8 | 6.5 | 6.5 | 6.5 | 7890 | 6732 | 7865 |
| 2 | 1 | 2.3 | 2.3 | 2.3 | 2576 | 1096 | 3711 |
| 2 | 2 | 3.2 | 3.2 | 3.2 | 2278 | 5589 | 3402 |
| 2 | 3 | 4.5 | 4.5 | 4.5 | 3553 | 2911 | 2999 |
| 2 | 4 | 5.2 | 5.2 | 5.2 | 4428 | 2152 | 3350 |
| 2 | 5 | 4.8 | 4.8 | 4.8 | 4113 | 4866 | 5821 |
| 2 | 6 | 4.6 | 4.6 | 4.6 | 3235 | 2387 | 4598 |
| 2 | 7 | 4.8 | 4.8 | 4.8 | 2389 | 8798 | 6851 |
| 2 | 8 | 5.6 | 5.6 | 5.6 | 6541 | 4531 | 3452 |
| 3 | 1 | 3.4 | 3.4 | 3.4 | 3711 | 5241 | 3350 |
| 3 | 2 | 3.5 | 3.5 | 3.5 | 3947 | 4335 | 4756 |
| 3 | 3 | 3.5 | 3.5 | 3.5 | 1320 | 3350 | 5899 |
| 3 | 4 | 4.8 | 4.8 | 4.8 | 2987 | 4850 | 1887 |
| 3 | 5 | 5.1 | 5.1 | 5.1 | 3541 | 1741 | 2435 |
| 3 | 6 | 5.0 | 5.0 | 5.0 | 3576 | 5768 | 1578 |
| 3 | 7 | 5.2 | 5.2 | 5.2 | 8976 | 2566 | 1897 |
| 3 | 8 | 5.8 | 5.8 | 5.8 | 5456 | 3459 | 1765 |

lingo 18.0 was used to solve the model with its embedded branch and bound method. the global optimum was found in the sixth iteration, with a minimum cost of $1676.58. the optimal order for each supplier and the order for each flsp are shown in table 3 and table 4, respectively. from table 3, the manufacturer should order raw material a from suppliers 1 and 2, raw material b from suppliers 1 and 3, and raw material c only from supplier 2. as shown in table 4, all flsps are assigned to process the procedures for all materials.

table 3. optimal raw material order allocation

| raw material | supplier | order allocation (unit) |
|---|---|---|
| a | 1 | 4500 |
| a | 2 | 5000 |
| a | 3 | 0 |
| b | 1 | 1500 |
| b | 2 | 0 |
| b | 3 | 8000 |
| c | 1 | 0 |
| c | 2 | 9500 |
| c | 3 | 0 |

table 4.
order allocation for each flsp

| flsp | procedure | order allocation a (unit) | order allocation b (unit) | order allocation c (unit) |
|---|---|---|---|---|
| 1 | 1 | 3350 | 3369 | 2576 |
| 1 | 2 | 4557 | 3148 | 5560 |
| 1 | 3 | 5847 | 5288 | 1170 |
| 1 | 4 | 3573 | 3150 | 4734 |
| 1 | 5 | 2311 | 3395 | 2048 |
| 1 | 6 | 4500 | 4561 | 6751 |
| 1 | 7 | 3457 | 2390 | 4286 |
| 1 | 8 | 7890 | 6732 | 7865 |
| 2 | 1 | 2439 | 890 | 3574 |
| 2 | 2 | 996 | 2017 | 1 |
| 2 | 3 | 2333 | 862 | 2431 |
| 2 | 4 | 2940 | 1500 | 2879 |
| 2 | 5 | 3648 | 4364 | 5017 |
| 2 | 6 | 1424 | 1 | 1171 |
| 2 | 7 | 1 | 4544 | 3317 |
| 2 | 8 | 1 | 1 | 1 |
| 3 | 1 | 3711 | 5241 | 3350 |
| 3 | 2 | 3947 | 4335 | 3939 |
| 3 | 3 | 1320 | 3350 | 5899 |
| 3 | 4 | 2987 | 4850 | 1887 |
| 3 | 5 | 3541 | 1741 | 2435 |
| 3 | 6 | 3576 | 4938 | 1578 |
| 3 | 7 | 6042 | 2566 | 1897 |
| 3 | 8 | 1609 | 2767 | 1634 |

sensitivity analysis

the scenarios for the sensitivity analysis are shown in table 5. six parameters are studied to determine how sensitive the model is to changes in those parameters. for each parameter, we set four values, with decreases and increases of 15% and 30% from the baseline. a summary of the results of the sensitivity analysis is shown in table 6. the table shows that a change in any one parameter affects both groups of decision variables, the order allocation and the outsourcing decisions, in the same way. both decision variables are insensitive to changes in all parameters except the demand. an increase of demand by 15% made the model infeasible. this result indicates that when the demand increases by 15%, the manufacturer should find other suppliers to fulfill the demand or otherwise require some suppliers to increase their capacities. note that, by table 2, the combined flsp capacity for procedure 1 of raw material a is 3,350 + 2,576 + 3,711 = 9,637 units, which is below the increased demand of 10,925 units, so constraints (8) and (9) cannot both be satisfied at that demand level. two parameters affect the objective function, namely the purchasing cost and the demand. this indicates that the manufacturer should pay close attention to those parameters, especially when their values increase.

table 5. sensitivity analysis scenarios

| parameter | value changes (%) | | | | |
|---|---|---|---|---|---|
| c | -30 | -15 | 0 | 15 | 30 |
| tc | -30 | -15 | 0 | 15 | 30 |
| oc | -30 | -15 | 0 | 15 | 30 |
| b | -30 | -15 | 0 | 15 | 30 |
| dc | -30 | -15 | 0 | 15 | 30 |
| p | -30 | -15 | 0 | 15 | 30 |

table 6.
summary of the results of sensitivity analysis

| parameter | order allocation: suppliers | order allocation: flsp | objective function |
|---|---|---|---|
| purchasing cost | insensitive | insensitive | sensitive |
| unit transportation cost | insensitive | insensitive | insensitive |
| order cost | insensitive | insensitive | insensitive |
| maximum expenditure cost | insensitive | insensitive | insensitive |
| demand | sensitive | sensitive | sensitive |
| unit outsourcing cost | insensitive | insensitive | insensitive |

conclusions

in this research, we developed a milp model to solve the order allocation problem in a supply chain consisting of multiple suppliers and a single manufacturer, considering mcls, to minimize total supply chain costs. mcls was represented by a single lsi, which is responsible for the delivery of the raw materials through a series of procedures carried out by several flsps. the costs of the supply chain comprise the purchasing cost, transportation cost, order cost, and outsourcing service cost. based on the results of the sensitivity analysis, only one of the six parameters, namely the demand, has a significant effect on the decision variables. on the other hand, two parameters have a significant effect on the objective function, namely the unit purchasing cost and the demand. the model can be further developed by incorporating other decisions such as carrier selection, inventory, and lateral transhipment.

references

[1] s. anwar, “manajemen rantai pasokan (supply chain management): konsep dan hikayat”, jurnal dinamika informatika, vol. 3, no. 2, 2011.
[2] m. tracey and c. l. tan, “empirical analysis of supplier selection and involvement, customer satisfaction, and firm performance”, supply chain management: an international journal, vol. 6, no. 4, pp. 174-188, 2001.
[3] k. c. tan, v. r. kannan, and r. b. handfield, “supply chain management: supplier performance and firm performance”, international journal of purchasing & material management, vol. 34, no. 3, pp. 2-9, 1998.
[4] r. govindaraju and j. p. sinulingga, “pengambilan keputusan pemilihan pemasok di perusahaan manufaktur dengan metode fuzzy anp”, jurnal manajemen teknologi, vol. 16, no. 1, pp. 1-16, 2017.
[5] r. sulistyoningarum, c. n. rosyidi, and t. rochman, “supplier selection of recycled plastic materials using best worst and topsis method”, journal of physics: conference series, vol. 1367, no. 1, 2019.
[6] r. sulistyoningarum, c. n. rosyidi, and t. rochman, “supplier selection and order allocation of recycled plastic materials: a case study in a plastic manufacturing company”, international journal of information and management sciences, vol. 31, no. 4, pp. 315-330, 2020.
[7] s. p. venkatesan and m. goh, “multi-objective supplier selection and order allocation under disruption risk”, transportation research part e: logistics and transportation review, vol. 95, pp. 124-142, 2016.
[8] k. s. moghaddam, “fuzzy multi-objective model for supplier selection and order allocation in reverse logistics systems under supply and demand uncertainty”, expert systems with applications, vol. 42, no. 15-16, pp. 6237-6254, 2015.
[9] k. park, g. e. o. kremer, and j. ma, “a regional information-based multi-attribute and multi-objective decision-making approach for sustainable supplier selection and order allocation”, journal of cleaner production, vol. 187, pp. 590-604, 2018.
[10] c. n. rosyidi, r. a. yudhatama, and p. w. laksono, “multi-objective optimization model of supplier selection and order allocation problem in a hospital: a case study”, international journal of procurement management, in press, 2022.
[11] b. a. k. dewi, c. n. rosyidi, and a. aisyati, “an optimization model of drug order quantity and distribution using continuous review approach by considering secondary suppliers”, international journal of mathematics in operational research, in press, 2022.
[12] a. maharani, c. n. rosyidi, and p. w. laksono, “order allocation model considering transportation alternatives and lateral transhipment”, jurnal optimasi sistem industri, vol. 21, no. 1, pp. 38-44, 2022.
[13] w. liu, y. yang, x. li, h. xu, and d. xie, “a time scheduling model of logistics service supply chain with mass customized logistics service”, discrete dynamics in nature and society, pp. 1-18, 2012.
[14] w. liu, m. ge, w. xie, y. yang, and h. xu, “an order allocation model in logistics service supply chain based on the pre-estimate behavior and competitive-bidding strategy”, international journal of production research, vol. 52, no. 8, pp. 2327-2344, 2014.
[15] x. liu, k. zhang, b. chen, j. zhou, and l. miao, “analysis of logistics service supply chain for the one belt and one road initiative of china”, transportation research part e: logistics and transportation review, vol. 117, pp. 23-39, 2018.
[16] w. liu, q. wang, q. mao, s. wang, and d. zhu, “a scheduling model of logistics service supply chain based on the mass customization service and uncertainty of flsp’s operation time”, transportation research part e, vol. 83, pp. 189-215, 2015.
[17] x. hu, g. wang, x. li, y. zhang, s. feng, and a. yang, “joint decision model of supplier selection and order allocation for the mass customization of logistics services”, transportation research part e: logistics and transportation review, vol. 120, pp. 76-95, 2018.
[18] g. wang, x. hu, x. li, y. zhang, s. feng, and a. yang, “multi-objective decisions for provider selection and order allocation considering the position of the codp in a logistics service supply chain”, computers & industrial engineering, vol. 140, 2020.
cauchy – jurnal matematika murni dan aplikasi volume 7(3) (2022), pages 411-419 p-issn: 2086-0382; e-issn: 2477-3344 submitted: may 07, 2022 reviewed: may 15, 2022 accepted: june 23, 2022 doi: http://dx.doi.org/10.18860/ca.v7i3.15989

confidence intervals for the mean function of a compound cyclic poisson process in the presence of power function trend

faisal muhamad*, i wayan mangku, bib paruhum silalahi
department of mathematics, ipb university, bogor, indonesia
email: idipbfaisalmuhammad@apps.ipb.ac.id

abstract

the asymptotic normality of an estimator for the mean function of a compound cyclic poisson process in the presence of power function trend was established by safitri in 2022. to give assurance that the parameter (the mean function) is covered by an interval, it is necessary to construct a confidence interval for the mean function of a compound cyclic poisson process in the presence of power function trend. the objectives of this paper are: (i) to construct a confidence interval for the mean function of a compound cyclic poisson process with significance level 0 < 𝛼 < 1, (ii) to prove that the probability that the mean function is contained in the confidence interval converges to 1 − 𝛼, and (iii) to observe, using a simulation study, the probability that the mean function is contained in the confidence interval for a bounded length of the observation interval. this paper presents a confidence interval for the mean function and a theorem on the convergence of the probability that the mean function is contained in the confidence interval. the simulation study shows that the probability that the mean function is contained in the confidence interval is in accordance with the theorem.
the contribution of this study is to provide information for users regarding the confidence interval for the mean function of a compound cyclic poisson process in the presence of power function trend.

keywords: compound cyclic poisson process; power function trend; mean function; confidence interval; poisson process.

introduction

many events in everyday life are uncertain, such as the birth and death process [1], the queueing process [2], and the estimation of total insurance claims [3], and can be modeled using a stochastic process. a stochastic process is a process that describes a series of random events over certain time intervals [4]. a special form of stochastic process is the compound poisson process. a compound poisson process is the sum of a sequence of independent and identically distributed (i.i.d.) random variables with a certain distribution, where the number of terms is given by a poisson process and the sequence is independent of that poisson process. based on the time aspect, stochastic processes can be classified into two categories, namely discrete-time and continuous-time stochastic processes. a special form of continuous-time stochastic process is the poisson process. the poisson process is a counting process in which the number of events in a time interval has a poisson distribution.

based on the intensity aspect, the poisson process can be classified into two categories: the homogeneous poisson process, whose intensity function is constant (it does not depend on time), and the non-homogeneous poisson process, whose intensity function depends on time. one type of non-homogeneous poisson process is the cyclic or periodic poisson process [5]. the period can be daily, weekly, yearly, or of other forms [6].
the non-homogeneous poisson process is widely applied to real phenomena, such as earthquakes [7], healthcare [8], radio burst rates [9], and traffic accidents [10]. the compound periodic poisson process has been studied widely. this research began with the estimation of the expected value function of the compound periodic poisson process [11] [12], and was then extended to include a power function trend [13] [14]. the compound cyclic poisson process does not follow the usual distribution pattern. one aspect that can be estimated is the mean value. since this value depends on the time of observation, it is called the mean function. in [15], an estimator for the mean function of a compound cyclic poisson process was constructed and studied, and its asymptotic normality was proven. furthermore, to give assurance that the mean function is included in an interval, it is necessary to construct a confidence interval for the mean function of the compound cyclic poisson process in the presence of power function trend. as an application of the asymptotic normality, the confidence interval of the estimator for the periodic component can be determined. confidence intervals for the mean and variance functions of a compound poisson process with power function intensity were studied in [16], while confidence intervals for the mean and variance functions of a compound poisson process with exponential of linear function intensity were studied in [17]. specifically, this research was conducted (i) to construct a confidence interval for the mean function of a compound cyclic poisson process in the presence of power function trend, (ii) to prove that the probability that the mean function is included in the confidence interval converges to 1 − 𝛼, and (iii) to check, using a simulation study, the probability that the mean function is contained in the confidence interval for a bounded length of the observation interval.
the contribution of this study is to provide information for users regarding the confidence interval for the mean function of a compound cyclic poisson process in the presence of power function trend.

methods

the estimator for the mean function

suppose that \(\{N(t), t \ge 0\}\) is a nonhomogeneous poisson process with (unknown) intensity function \(\lambda\), which is assumed to be locally integrable. suppose that \(\lambda\) has two components, namely a cyclic component \(\lambda_c\) with (known) period \(\tau > 0\) and a power function trend component \(a s^b\). in other words, for \(a \ge 0\) and all \(s \ge 0\), the intensity function \(\lambda(s)\) can be expressed as

\[ \lambda(s) = \lambda_c(s) + a s^{b}. \tag{1} \]

the value of b is assumed to be a known real number with \(0 < b < \tfrac{1}{2}\). we do not assume any parametric form for the cyclic component \(\lambda_c\), except that it is cyclic or periodic, that is, it satisfies

\[ \lambda_c(s) = \lambda_c(s + k\tau) \tag{2} \]

for all \(s \ge 0\) and all \(k \in \mathbb{N}\). suppose that \(\{Y(t), t \ge 0\}\) is a process where

\[ Y(t) = \sum_{i=1}^{N(t)} X_i \tag{3} \]

with \(\{X_i, i \ge 1\}\) a sequence of independent and identically distributed random variables having mean \(\mu < \infty\) and variance \(\sigma^2 < \infty\), which is also independent of \(\{N(t), t \ge 0\}\). the process \(\{Y(t), t \ge 0\}\) is called a compound cyclic poisson process with power function trend [7]. let \(\psi(t)\) denote the mean function of \(Y(t)\), that is

\[ \psi(t) = E(Y(t)) = E[N(t)] \, E[X_1] = \Lambda(t) \mu \tag{4} \]

with \(\mu = E(X_i)\) and

\[ \Lambda(t) = \int_0^t \lambda(s) \, ds. \tag{5} \]

let \(t_r = t - \lfloor t/\tau \rfloor \tau\), where \(\lfloor t/\tau \rfloor\) represents the largest integer less than or equal to \(t/\tau\), and let \(k_{t,\tau} = \lfloor t/\tau \rfloor\). then any real number \(t \ge 0\) can be expressed as \(t = k_{t,\tau} \tau + t_r\) with \(0 \le t_r < \tau\). let \(\theta = \frac{1}{\tau} \int_0^{\tau} \lambda_c(s) \, ds\) denote the global intensity of the periodic component of the process \(\{N(t), t \ge 0\}\), and assume that \(\theta > 0\). then \(\tau\theta\) can be written as \(\Lambda_c(t_r) + \Lambda_c^c(t_r)\) with

\[ \Lambda_c(t_r) = \int_0^{t_r} \lambda_c(s) \, ds \tag{6} \]

and

\[ \Lambda_c^c(t_r) = \int_{t_r}^{\tau} \lambda_c(s) \, ds. \tag{7} \]

by the periodicity (2), the integral of \(\lambda_c\) over each of the \(k_{t,\tau}\) complete periods in \([0, t]\) equals \(\tau\theta\), and the remaining part contributes \(\Lambda_c(t_r)\). hence, by using (6) and (7) and substituting (1) into (5), for any \(t \ge 0\), \(\Lambda(t)\) can be written as

\[ \Lambda(t) = (k_{t,\tau} + 1) \Lambda_c(t_r) + k_{t,\tau} \Lambda_c^c(t_r) + \frac{a}{b+1} t^{b+1}. \tag{8} \]

by substituting (8) into (4), we have

\[ \psi(t) = \left( (k_{t,\tau} + 1) \Lambda_c(t_r) + k_{t,\tau} \Lambda_c^c(t_r) + \frac{a}{b+1} t^{b+1} \right) \mu. \tag{9} \]

estimation of mean function

in [11] an estimator of the mean function \(\psi(t)\) has been formulated as follows

\[ \hat{\psi}_{n,b}(t) = \left( (k_{t,\tau} + 1) \hat{\Lambda}_{c,n,b}(t_r) + k_{t,\tau} \hat{\Lambda}^{c}_{c,n,b}(t_r) + \frac{\hat{a}_{m,b}}{b+1} t^{b+1} \right) \hat{\mu}_n \tag{10} \]

where

\[ \hat{\Lambda}_{c,n,b}(t_r) = \frac{(1-b)\tau^{1-b}}{n^{1-b}} \sum_{k=1}^{k_{n,\tau}} \frac{1}{k^{b}} N([k\tau, k\tau + t_r]) - \hat{a}_{m,b} (1-b) n^{b} t_r, \tag{11} \]

\[ \hat{\Lambda}^{c}_{c,n,b}(t_r) = \frac{(1-b)\tau^{1-b}}{n^{1-b}} \sum_{k=1}^{k_{n,\tau}} \frac{1}{k^{b}} N([k\tau + t_r, k\tau + \tau]) + \hat{a}_{m,b} (1-b) n^{b} (t_r - \tau), \tag{12} \]

\[ \hat{a}_{m,b} = \frac{(1+b) N([0, m])}{m^{1+b}} - \frac{(1+b)}{m^{b}} \tilde{\theta}_n, \tag{13} \]

\[ \tilde{\theta}_n = \frac{(1-b)}{n^{1-b} \tau^{b} b^{2}} \sum_{k=1}^{k_{n,\tau}} \frac{1}{k^{b}} N([k\tau, k\tau + \tau]) - \frac{(1+b)(1-b) n^{b} N([0, n])}{n^{1+b} b^{2}}, \tag{14} \]

\[ \hat{\mu}_n = \frac{1}{N([0, n])} \sum_{i=1}^{N([0,n])} X_i, \tag{15} \]

with \(\hat{\mu}_n = 0\) when \(N([0, n]) = 0\).

asymptotic normality of the estimator for the mean function

theorem 1 (the asymptotic normality of the estimator for the mean function)
suppose that the intensity \(\lambda\) satisfies (1) and is locally integrable. if \(Y(t)\) satisfies (3), then

\[ \sqrt{n^{1-b}} \left( \hat{\psi}_{n,b}(t) - \psi(t) \right) \xrightarrow{d} \mathrm{Normal}\left( 0, \; (k_{t,\tau} + 1)^{2} a (1-b) \tau t_r \mu^{2} + k_{t,\tau}^{2} a (1-b) \tau (\tau - t_r) \mu^{2} \right) \tag{16} \]

as \(n \to \infty\). theorem 1 can be proved through a rough analysis [11].

results and discussion

our main results are a confidence interval for the mean function \(\psi(t)\) and a theorem about the convergence of the probability that \(\psi(t)\) is contained in the confidence interval.
corollary 1 (the confidence interval for \(\psi(t)\))
for a given significance level \(\alpha\), where \(0 < \alpha < 1\), the confidence interval for \(\psi(t)\) in the case \(0 < b < \tfrac{1}{2}\) is given by

\[ I_{\psi,n} = \left[ \hat{\psi}_{n,b}(t) - \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \sqrt{V_{n,b}}, \; \hat{\psi}_{n,b}(t) + \Phi^{-1}\left(1 - \frac{\alpha}{2}\right) \sqrt{V_{n,b}} \right] \]

with

\[ V_{n,b} = \frac{(k_{t,\tau} + 1)^{2} \hat{a}_{m,b} (1-b) \tau t_r \hat{\mu}_n^{2} + k_{t,\tau}^{2} \hat{a}_{m,b} (1-b) \tau (\tau - t_r) \hat{\mu}_n^{2}}{n^{1-b}}, \]

where \(\Phi\) denotes the standard normal distribution function and \(V_{n,b}\) is the estimated variance used in the studentized version of (16).

theorem 2 (convergence of the probability that \(\psi(t) \in I_{\psi,n}\))
for the confidence interval \(I_{\psi,n}\) of \(\psi(t)\) given in corollary 1, we have that \(P(\psi(t) \in I_{\psi,n}) \to 1 - \alpha\) as \(n \to \infty\).

proof of theorem 2: for brevity, write \(z = \Phi^{-1}\left(1 - \frac{\alpha}{2}\right)\). the probability that \(\psi(t)\) is contained in the confidence interval \(I_{\psi,n}\) can be computed as follows.

\[ \begin{aligned}
P(\psi(t) \in I_{\psi,n}) &= P\left( \hat{\psi}_{n,b}(t) - z \sqrt{V_{n,b}} \le \psi(t) \le \hat{\psi}_{n,b}(t) + z \sqrt{V_{n,b}} \right) \\
&= P\left( -z \sqrt{V_{n,b}} \le -\hat{\psi}_{n,b}(t) + \psi(t) \le z \sqrt{V_{n,b}} \right) \\
&= P\left( z \sqrt{V_{n,b}} \ge \hat{\psi}_{n,b}(t) - \psi(t) \ge -z \sqrt{V_{n,b}} \right) \\
&= P\left( -z \sqrt{V_{n,b}} \le \hat{\psi}_{n,b}(t) - \psi(t) \le z \sqrt{V_{n,b}} \right) \\
&= P\left( -z \le \frac{\hat{\psi}_{n,b}(t) - \psi(t)}{\sqrt{V_{n,b}}} \le z \right).
\end{aligned} \]

by the studentized version of (16), we have that

\[ \frac{\hat{\psi}_{n,b}(t) - \psi(t)}{\sqrt{V_{n,b}}} \xrightarrow{d} \mathrm{Normal}(0, 1) \]

as \(n \to \infty\). therefore \(P(\psi(t) \in I_{\psi,n})\) converges to \(P(-z \le Z \le z)\) as \(n \to \infty\), where \(Z\) is a standard normal random variable. this probability can be simplified as follows.

\[ \begin{aligned}
P(-z \le Z \le z) &= P(Z \le z) - P(Z \le -z) \\
&= P(Z \le z) - P(Z \ge z) \\
&= P(Z \le z) - \left( 1 - P(Z \le z) \right) \\
&= \Phi(z) - \left( 1 - \Phi(z) \right) \\
&= \left( 1 - \frac{\alpha}{2} \right) - \left( 1 - \left( 1 - \frac{\alpha}{2} \right) \right) \\
&= 1 - \alpha.
\end{aligned} \]

this completes the proof of theorem 2.
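the interval in corollary 1 can be turned into a short computation. the sketch below assumes that the estimates of the mean function, of the trend coefficient, and of the mean of the \(X_i\) have already been obtained from (10), (13) and (15); it does not implement those estimators. the function names and the numbers in the usage lines are hypothetical and are not taken from the paper. the quantile \(\Phi^{-1}(1 - \alpha/2)\) is taken from python's statistics.NormalDist.

```python
import math
from statistics import NormalDist


def confidence_interval(psi_hat, a_hat, mu_hat, t, tau, b, n, alpha):
    """Interval psi_hat -/+ z * sqrt(V), with V built as in corollary 1.

    psi_hat, a_hat and mu_hat are estimates supplied by the caller.
    """
    k = math.floor(t / tau)          # number of complete periods in [0, t]
    t_r = t - k * tau                # remainder, 0 <= t_r < tau
    variance = (
        (k + 1) ** 2 * a_hat * (1 - b) * tau * t_r * mu_hat ** 2
        + k ** 2 * a_hat * (1 - b) * tau * (tau - t_r) * mu_hat ** 2
    ) / n ** (1 - b)
    if variance < 0:
        # Can only happen if the supplied a_hat is negative.
        raise ValueError("estimated variance is negative")
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Phi^{-1}(1 - alpha/2)
    half_width = z * math.sqrt(variance)
    return psi_hat - half_width, psi_hat + half_width


def nominal_coverage(alpha):
    """P(-z <= Z <= z) for z = Phi^{-1}(1 - alpha/2), the limit in theorem 2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z) - NormalDist().cdf(-z)


if __name__ == "__main__":
    # Hypothetical estimates, chosen only to exercise the function.
    low, high = confidence_interval(
        psi_hat=3.0, a_hat=0.1, mu_hat=1.0, t=2.5, tau=1.0, b=0.4, n=100, alpha=0.05
    )
    print(low, high)
    print(nominal_coverage(0.05))
```

the second function mirrors the last chain of equalities in the proof: for any \(\alpha\) it should return \(1 - \alpha\) up to floating-point error.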
simulation of the confidence interval for the mean function

the purpose of this simulation is to check the probability that the mean function \(\psi(t)\) is contained in the confidence intervals for several significance levels, periods, and lengths of the observation interval, using generated data. the simulation was carried out with the help of r software, and scilab software was used to illustrate the results. the programming stage generates realizations of the compound periodic poisson process with a power function trend whose intensity function is

\[ \lambda(s) = \sin\left( \frac{2\pi s}{\tau} \right) + 1 + a s^{b}. \]

in this simulation, we choose significance levels 𝛼 = 1%, 5% and 10%, 𝜏 = 1, 𝑠 = 2.5, 𝑎 = 0.1, 𝑏 = 0.4, and 𝑛 = 20, 50 and 100, with 1000 repetitions.

table 1. simulation results of confidence interval for the mean function 𝜓(𝑡)

| 𝛼 | 𝑛 | A | B | C | D | E |
|---|---|---|---|---|---|---|
| 1% | 20 | 985 | 15 | 98.5% | 1.5% | 0.5% |
| 1% | 50 | 988 | 12 | 98.8% | 1.2% | 0.2% |
| 1% | 100 | 990 | 10 | 99.0% | 1.0% | 0.0% |
| 5% | 20 | 941 | 59 | 94.1% | 5.9% | 0.9% |
| 5% | 50 | 950 | 50 | 95.0% | 5.0% | 0.0% |
| 5% | 100 | 952 | 48 | 95.2% | 4.8% | 0.2% |
| 10% | 20 | 891 | 109 | 89.1% | 10.9% | 0.9% |
| 10% | 50 | 907 | 93 | 90.7% | 9.3% | 0.7% |
| 10% | 100 | 917 | 83 | 91.7% | 8.3% | 1.7% |

(A = the number of confidence intervals containing the parameter, B = the number of confidence intervals that do not contain the parameter, C = percentage of confidence intervals containing the parameter, D = percentage of confidence intervals that do not contain the parameter, E = absolute error between 𝛼 and the percentage of confidence intervals that do not contain the parameter)

based on the simulation results at 𝑠 = 2.5 and 𝜏 = 1 for the observation interval [0, 𝑛] with 𝑛 = 20, 50 and 100, the absolute error between 𝛼 and the percentage of confidence intervals that do not contain the parameter ranges from 0.0% to 0.5% for 𝛼 = 1%, from 0.0% to 0.9% for 𝛼 = 5%, and from 0.7% to 1.7% for 𝛼 = 10%.
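realizations of the counting process with the intensity above can be generated, for instance, by the thinning (acceptance-rejection) method. the paper does not state which generation method was used in r, so this choice is an assumption. the sketch below uses the intensity and the parameter values of the simulation, interprets the value 2.5 as the time point at which the mean function is evaluated (also an assumption), and compares the average number of events on [0, 2.5] with the closed form \(\Lambda(t) = \frac{\tau}{2\pi}\left(1 - \cos\frac{2\pi t}{\tau}\right) + t + \frac{a}{b+1} t^{b+1}\) obtained by integrating the intensity. the function names, the seed, and the number of paths are hypothetical. the distribution of the \(X_i\) is not specified in the text, so the sketch stops at \(N(t)\) and does not build \(Y(t)\).

```python
import math
import random


def intensity(s, tau=1.0, a=0.1, b=0.4):
    """lambda(s) = sin(2*pi*s/tau) + 1 + a * s**b, as in the simulation setup."""
    return math.sin(2 * math.pi * s / tau) + 1 + a * s ** b


def cumulative_intensity(t, tau=1.0, a=0.1, b=0.4):
    """Closed form of the integral of intensity() over [0, t]."""
    periodic = tau / (2 * math.pi) * (1 - math.cos(2 * math.pi * t / tau))
    return periodic + t + a * t ** (b + 1) / (b + 1)


def simulate_event_times(t_end, rng, tau=1.0, a=0.1, b=0.4):
    """Event times on [0, t_end], obtained by thinning a homogeneous process."""
    # Upper bound of the intensity on [0, t_end]: sin(.) + 1 <= 2, s**b increasing.
    lam_max = 2 + a * t_end ** b
    times, s = [], 0.0
    while True:
        s += rng.expovariate(lam_max)          # next candidate event
        if s > t_end:
            return times
        if rng.random() * lam_max <= intensity(s, tau, a, b):
            times.append(s)                    # accepted with prob. lambda(s)/lam_max


if __name__ == "__main__":
    rng = random.Random(2022)                  # arbitrary seed for reproducibility
    t_end, paths = 2.5, 20000
    mean_count = sum(len(simulate_event_times(t_end, rng)) for _ in range(paths)) / paths
    print(mean_count, cumulative_intensity(t_end))
```

the two printed numbers should be close to each other, since \(E[N(t)] = \Lambda(t)\) for a poisson process; multiplying \(\Lambda(t)\) by the mean of the \(X_i\) would give the true value of \(\psi(t)\) against which the confidence intervals are checked.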
the error between 𝛼 and the percentage of confidence intervals that do not contain the parameter is also small, between 0% and 1.7%. this shows that the simulation results for the confidence interval of the mean function 𝜓(𝑡) of the compound poisson process at different significance levels are in accordance with the theory obtained. the simulation results based on the first 200 estimators can be seen in figure 1.

figure 1. confidence interval for the mean function 𝜓(𝑡) based on the first 200 estimators with 𝑠 = 2.5, 𝜏 = 1, 𝛼 = 1% and 𝑛 = 100

figure 1 illustrates some of the results of the confidence interval simulation for the mean function 𝜓(𝑡) at 𝑠 = 2.5 and 𝜏 = 1, based on the first 200 estimators with significance level 𝛼 = 1% and 𝑛 = 100. in the figure, the horizontal line is the true value of the mean function 𝜓(𝑡) and the vertical lines are the confidence intervals of the mean function 𝜓(𝑡). if a vertical line does not intersect the horizontal line, the value of the mean function 𝜓(𝑡) is not in that interval. in figure 1 there are three non-intersecting lines, which indicates that three confidence intervals based on the first 200 estimators do not contain the value of the mean function 𝜓(𝑡). since in table 1 there are 10 confidence intervals that do not contain the mean function 𝜓(𝑡), seven confidence intervals based on the 201st to 1000th estimators do not contain the value of the mean function 𝜓(𝑡). the illustration in figure 1 shows that the probability that the mean function 𝜓(𝑡) is contained in the confidence interval is already close to 1 − 𝛼 for 𝜏 = 1 and 5, 𝛼 = 1%, 5% and 10%, for a bounded observation time interval.
Conclusions

According to the main results, the confidence interval for the mean function 𝜓(𝑡) of a compound cyclic Poisson process in the presence of a power function trend is 𝐼𝜓,𝑛 = [𝜓̂𝑛,𝑏(𝑡) − Φ⁻¹(1 − 𝛼/2)√𝑉𝑛,𝑏 , 𝜓̂𝑛,𝑏(𝑡) + Φ⁻¹(1 − 𝛼/2)√𝑉𝑛,𝑏], where Φ denotes the standard normal distribution function and 𝑉𝑛,𝑏 = (𝑘𝑡,𝜏 + 1) 2 �̂�𝑚,𝑏 (1 − 𝑏)𝜏𝑡𝑟 �̂�𝑛 2 + 𝑘𝑡,𝜏 2 �̂�𝑚,𝑏 (1 − 𝑏)𝜏(𝜏 − 𝑡𝑟 )�̂�𝑛 2 𝑛1−𝑏 . The probability that the mean function 𝜓(𝑡) is contained in the confidence interval converges as 𝑃(𝜓(𝑡) ∈ 𝐼𝜓,𝑛) → 1 − 𝛼, as 𝑛 → ∞. The simulation results show that the probability that the mean function 𝜓(𝑡) is included in the confidence interval is already close to 1 − 𝛼 for an observation interval of finite length. For further research, we recommend using a different intensity function and a different observation function at the simulation stage, so that more diverse simulation results can be shown.

References
[1] E. Roflin, "Analysis of time series with calendar effects," Management Science, vol. 26, pp. 106-112, 2000.
[2] S. Udayabaskaran and V. T. Dora Pravina, "Transient analysis of an M/M/1 queueing system with server operating in three models," Far East Journal of Mathematical Sciences, vol. 101, pp. 1395-1418, 2017.
[3] A. Chadidjah, "Proses Poisson dalam estimasi total klaim," in Prosiding Seminar Nasional Matematika dan Pendidikan Matematika dengan tema Peran Matematika dan Pendidikan Matematika dalam Menghadapi Isu-Isu Global, 2015, pp. 325-336.
[4] S. M. Ross, Introduction to Probability Models, ninth ed. Florida: Academic Press Inc, 2010.
[5] I. W. Mangku, "Estimating the intensity function of a cyclic Poisson process," Univ of Amsterdam, 2001.
[6] T. A. Walls and J. L. Schafer, "Models for intensive longitudinal data," Oxford Univ Pr, 2006.
[7] J. Geng, W. Shi, and G.
Hu, "Bayesian nonparametric nonhomogeneous Poisson process with applications to USGS earthquake data," Elsevier, vol. 41, p. 100495, March 2021.
[8] D. Munandar, S. Supian, and Subiyanto, "Probability distributions of COVID-19 tweet posted trends use a nonhomogeneous Poisson process," International Journal of Quantitative Research and Modeling, vol. 1, no. 4, pp. 229-238, 2020.
[9] E. Lawrence, S. Vander Wiel, C. Law, S. B. Spolar, and G. C. Bower, "The nonhomogeneous Poisson process for fast radio burst rates," Astron. J., vol. 154, no. 3, p. 117, 2017.
[10] F. Grabski, "Nonhomogeneous Poisson process and compound Poisson process in the modelling of random process related to road accidents," J. KONES, vol. 26, no. 1, pp. 39-46, 2019.
[11] R. Ruhiyat, I. W. Mangku, and I. G. P. Purnaba, "Consistent estimation of the mean function of compound cyclic Poisson process," Far East J. Math. Sci., vol. 77, no. 2, pp. 183-194, 2013.
[12] F. I. Makhmudah, I. W. Mangku, and H. Sumarno, "Estimating the variance function of a compound cyclic Poisson process," Far East Journal of Mathematical Sciences (FJMS), vol. 100, no. 6, pp. 911-922, Sep 2016.
[13] I. F. Sari, I. W. Mangku, and H. Sumarno, "Estimating the mean function of a compound cyclic Poisson process in the presence of power function trend," Far East J. Math. Sci., vol. 100, no. 11, pp. 1825-1840, 2016.
[14] A. Fajri, "Pendugaan ragam pada proses Poisson periodik majemuk dengan tren fungsi pangkat," IPB University, 2018.
[15] N. I. Safitri, "Sebaran asimtotik penduga fungsi nilai harapan proses Poisson periodik majemuk dengan tren fungsi pangkat," IPB University, 2022.
[16] A. Fajri, "Selang kepercayaan fungsi nilai harapan dan fungsi ragam proses Poisson majemuk dengan intensitas fungsi pangkat," IPB University, 2017.
[17] S.
Utami, "Interval kepercayaan fungsi nilai harapan dan fungsi ragam proses Poisson majemuk dengan intensitas eksponensial fungsi linear," IPB University, 2018.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 6(4) (2021), Pages 246-259
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: January 05, 2021 Reviewed: March 16, 2021 Accepted: March 30, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.11254

Invertibility of Generalized Space-Time Autoregressive Model with Random Weight

Yundari1, Setyo Wira Rizki2
1Mathematics Department, Faculty of Mathematics and Natural Science, Universitas Tanjungpura Pontianak, Indonesia
2Statistics Department, Faculty of Mathematics and Natural Science, Universitas Tanjungpura Pontianak, Indonesia
Email: yundari@math.untan.ac.id

Abstract

A generalized linear process can possess the stationarity and invertibility properties. Invertibility requires that the series of process parameters satisfy a convergence condition. The generalized space-time autoregressive (GSTAR) model is a stationary linear model, so its invertibility needs to be established through the convergence of the parameter series. This article studies the invertibility of the GSTAR(1;1) model with kernel random weights. The results show that the GSTAR(1;1) model under kernel random weights fulfills the invertibility property and is equivalent to a generalized space-time moving average (GSTMA) process of finite order. The finite time order obtained satisfies 7 < n < 30. The triangular kernel yields a relatively large n, so it is not counted among the kernels with a finite order n. The GSTAR(1;1) model with random kernel weights is applied to tea production data from six plantation areas in West Java. The RMSE of the estimates is quite small, and the estimates follow the pattern of the original data at each research location.
Keywords: autoregressive process; generalized linear process; invertibility; stationarity

Introduction

Theoretically, the first-order autoregressive model AR(1) of a univariate time series is equivalent to the moving average model of infinite order, MA(∞) [1]. The same holds in the multivariate case: the vector autoregressive model VAR(1) is equivalent to the VMA(∞) model [2]. These properties are known as the invertibility property of the autoregressive model of order 1. The GSTAR(1;1) model is a member of the autoregressive model family [3]. The research question is therefore: is the GSTAR(1;1) model also equivalent to the GSTMA(∞;1) model? The theoretical study of the GSTAR model in [4] addresses the stationarity of the model using the inverse of the autocorrelation matrix. Furthermore, [5] discusses the properties of the least squares estimator of the GSTAR parameters, [6] examines the error assumptions of the GSTAR model, and [7] treats the GSTAR model with outliers. The GSTAR model has also been extended by several researchers, for example GSTAR-GARCH [8], GSTAR-SUR [9], GSTAR-Kriging [10] and others.

Several researchers have developed ways to determine the spatial weight matrix. For example, [11] uses a uniform spatial weight matrix, in which the closest neighbors are given the same weight, and [5] uses a binary weight matrix with the uniform weight as a comparison. The weight matrix has also been determined using cross-correlation in [12]. All of these researchers use distance as the basis for determining the weight matrix. [13] proposes a fuzzy set approach based on observational data, but the weights it produces are still not random. A random weight matrix has been determined by the author using several kernel functions [3].
Furthermore, the effect of the random kernel spatial weights on stationarity is examined in [14]. Space-time data to which the GSTAR model has been applied include the number of tourists at several tourist attractions [15], tea production data [5], GDP data for countries in Europe [11], chili price prediction [16], log gamma ray data [3], rainfall data [10] and so on.

This paper discusses the GSTAR model with random weights obtained from a kernel function. The kernel functions used are the uniform, triangular, Epanechnikov, cosine and Gaussian kernels, which represent constant, linear, quadratic, cosine, and exponential functions, respectively. Moreover, the paper studies the effect of the kernel spatial weight matrix on invertibility. In notation, the weight matrix obtained from the kernel function is denoted by $\tilde{\mathbf{W}} = (\tilde{w}_{ij})$ and the parameter matrix of GSTAR(1;1) is denoted by $\tilde{\Phi}$. In addition, the study examines the convergence of each kernel function in relation to invertibility. The article begins with the invertibility theory of the AR(1) and VAR(1) models. The next section explains the kernel function and continues with the GSTAR(1;1) model under kernel weights. The results and discussion section then covers the invertibility of the GSTAR model under kernel weights and the convergence used to determine the order of the GSTMA model. In the last section, the GSTAR(1;1) model with Gaussian kernel weights is applied to tea production data from six plantation areas in West Java.

Methods

This section discusses the theory underlying the research, namely the AR(1) and VAR(1) models, which are the basis of the GSTAR(1;1) model. It then studies the invertibility of each. Finally, it discusses the kernel functions used to form the spatial weight matrix of the GSTAR(1;1) model.
Invertibility of AR(1) and VAR(1) processes

The autoregressive process AR(p) is defined as follows [1]:
$$y_t = \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + a_t,$$
where $\phi_1, \phi_2, \dots, \phi_p$ are the autoregressive parameters and $a_t$ is a white noise process with mean zero and variance $\sigma_a^2$. If $p = 1$, the process is known as AR(1) and is formulated as
$$y_t = \phi y_{t-1} + a_t.$$
The AR(1) process is a stationary linear model under the stationarity condition $|\phi| < 1$, and it has the invertibility property
$$y_t = \phi y_{t-1} + a_t = \cdots = \phi^k y_{t-k} + \phi^{k-1} a_{t-(k-1)} + \cdots + \phi^2 a_{t-2} + \phi a_{t-1} + a_t.$$
The last expression is the MA(∞) model, which is stationary and convergent when $|\phi| < 1$. Hence AR(1) is invertible, AR(1) ≈ MA(∞).

The AR(1) process considers one random variable observed over time. If several random variables are observed and each follows an AR(1) process, the model is known as the first-order vector autoregressive model, VAR(1). The VAR(1) model can be written as
$$\mathbf{y}_t = \Phi \mathbf{y}_{t-1} + \mathbf{a}_t.$$
The necessary and sufficient condition for stationarity of VAR(1) is that all eigenvalues of $\Phi$ are less than one in modulus, or equivalently that the solutions of $|\mathbf{I}_k - \Phi B| = 0$ lie outside the unit circle. The VAR(1) process can also be represented as a vector moving average (VMA) process; in other words, it satisfies the invertibility property [2].

The kernel function

A continuous real function $K$ is called a kernel function if it integrates to one, is symmetric for each $x$, has mean zero, and has finite variance. Examples of kernel functions, together with their efficiency properties, are given in Table 1. The notation $R(K) = \int K(x)^2\,dx$ denotes the "roughness" of the kernel function $K$.
The notation $\sigma_K^2 = \int x^2 K(x)\,dx$ is the variance of the kernel function, while the efficiency of the kernel function is obtained from $\{C(K^*)/C(K)\}^{5/4} \le 1$, where $C(K) = \{R(K)^4 (\sigma_K^2)^2\}^{1/5}$.

Table 1. The shape of each kernel function and its properties. The domain is bounded between -1 and 1 (the kernel is 0 outside the domain), except for the Gaussian kernel, which is defined for all real numbers [17].

Kernel | Function | $R(K)$ | $\sigma_K^2$ | Efficiency | Domain
Uniform | $K(x) = 1/2$ | $1/2$ | $1/3$ | 0.9295 | (-1,1)
Triangular | $K(x) = 1 - |x|$ | $2/3$ | $1/6$ | 0.9859 | (-1,1)
Epanechnikov | $K(x) = \frac{3}{4}(1 - x^2)$ | $3/5$ | $1/5$ | 1 | (-1,1)
Cosine | $K(x) = \frac{\pi}{4}\cos\left(\frac{\pi}{2}x\right)$ | $\pi^2/16$ | $1 - 8/\pi^2$ | 0.9897 | (-1,1)
Gaussian | $K(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)$ | $\frac{1}{2\sqrt{\pi}}$ | 1 | 0.9512 | ℝ

The kernel function is generally used as a weight function to estimate density and regression functions. The kernel procedure sums kernel functions placed at each point over all of its surrounding points (see Figure 1). In general, the kernel function linking a point to a nearby point, for example $x$ and $y$, is $K\left(\frac{x - y}{h}\right)$. The notation $h$ is a bandwidth that controls the smoothness of the kernel function.

Figure 1. Plot of a kernel density estimate. If many observations are close to the point x, then f(x) is large. On the other hand, if few observations $x_i$ are close to the point x, then f(x) is small.

The GSTAR(1;1) model with kernel weights

The new method recommended for determining the spatial weight matrix of the GSTAR model uses the kernel function.
The kernel location weight is obtained by adopting the Nadaraya-Watson kernel estimator [18] and using the average value $\bar{y}_i$ of the observations at each location. The average of the observations at each location is chosen to capture the overall property of the data (data centering) while ignoring outliers in the observations. The centered process $\{y_i(t)\}$ following a GSTAR(1;1) model with kernel weights is written as
$$y_i(t) = \phi_{0i}\, y_i(t-1) + \phi_{1i} \sum_{j=1}^{n} \tilde{w}_{ij}\, y_j(t-1) + \varepsilon_i(t), \quad t = 1, \dots, T, \; i = 1, \dots, n. \qquad (1)$$
This model has the spatial weights
$$\tilde{w}_{ij} = \frac{K\left(\frac{\bar{y}_i - \bar{y}_j}{h}\right)}{\sum_{\ell = 1,\, \ell \ne i}^{n} K\left(\frac{\bar{y}_i - \bar{y}_\ell}{h}\right)},$$
where $K(\cdot)$ is the kernel function, $h$ is the smoothing parameter of the kernel function $K$, and $y_i(t)$ is the observation at time $t$ and location $i$. The weight matrix can be written as
$$\tilde{\mathbf{W}} = \begin{pmatrix} 0 & \tilde{w}_{12} & \cdots & \tilde{w}_{1n} \\ \tilde{w}_{21} & 0 & \cdots & \tilde{w}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{w}_{n1} & \tilde{w}_{n2} & \cdots & 0 \end{pmatrix}.$$
The weight matrix $\tilde{\mathbf{W}}$ obtained by the kernel function approach satisfies the properties of a random weight matrix, because the weights originate from random variable data, namely the observations, and they fulfill the property $\sum_{j=1}^{n} \tilde{w}_{ij} = 1$, $n > 1$.

After the kernel spatial weight matrix is obtained, the parameters of the GSTAR(1;1) model are estimated using the least squares method, followed by model validation. The validation consists of two steps, namely the parameter significance test and the residual test. The parameter significance test uses the eigenvalues of the parameter matrix of the GSTAR(1;1) model, and the residual test uses the plot of the errors (error randomness) and the QQ plot of the errors (normality).
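The construction of the kernel spatial weights can be sketched as follows. This is a minimal illustration under stated assumptions rather than the code used in the study: the function names, the location means and the bandwidth are all hypothetical, and the Gaussian kernel of Table 1 is used by default.

```python
import math


def gaussian_kernel(x):
    """Gaussian kernel from Table 1."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def kernel_weight_matrix(location_means, h, kernel=gaussian_kernel):
    """Row-normalised kernel weights with a zero diagonal.

    location_means: average observation at each location (hypothetical input).
    h: bandwidth (smoothing parameter), assumed positive.
    """
    n = len(location_means)
    weights = []
    for i in range(n):
        # Unnormalised kernel values for every other location j != i.
        raw = [
            0.0 if j == i else kernel((location_means[i] - location_means[j]) / h)
            for j in range(n)
        ]
        total = sum(raw)
        if total == 0.0:
            raise ValueError("all kernel values are zero; try a larger bandwidth")
        weights.append([value / total for value in raw])
    return weights


# Made-up location means, used only to show the shape of the result.
example_weights = kernel_weight_matrix([10.0, 12.0, 15.0, 11.0], h=2.0)
```

Each row of `example_weights` sums to one and has a zero on the diagonal, and a location whose mean is closer to that of location i receives a larger weight in row i. A kernel with bounded support, such as the uniform or triangular kernel, can give a row of zeros when the bandwidth is too small, which is why the sketch raises an error in that case.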
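Results and discussion

Based on equation (1), the parameter matrices of the GSTAR(1;1) model are written as $\Phi_0 = \mathrm{diag}(\phi_{01}, \dots, \phi_{0n})$, $\Phi_1 = \mathrm{diag}(\phi_{11}, \dots, \phi_{1n})$ and $\tilde{\mathbf{W}} = (\tilde{w}_{ij})$, so that the GSTAR(1;1) model can be expressed in matrix form as
$$\mathbf{y}(t) = \Phi_0 \mathbf{y}(t-1) + \Phi_1 \tilde{\mathbf{W}} \mathbf{y}(t-1) + \boldsymbol{\varepsilon}(t) = (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \mathbf{y}(t-1) + \boldsymbol{\varepsilon}(t).$$
The GSTMA representation of the GSTAR(1;1) model is obtained as follows:
$$\begin{aligned}
\mathbf{y}(t) &= (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \mathbf{y}(t-1) + \boldsymbol{\varepsilon}(t) \\
&= (\Phi_0 + \Phi_1 \tilde{\mathbf{W}}) \left\{ (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \mathbf{y}(t-2) + \boldsymbol{\varepsilon}(t-1) \right\} + \boldsymbol{\varepsilon}(t) \\
&= (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})^2\, \mathbf{y}(t-2) + (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \boldsymbol{\varepsilon}(t-1) + \boldsymbol{\varepsilon}(t) \\
&= \boldsymbol{\varepsilon}(t) + (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \boldsymbol{\varepsilon}(t-1) + (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})^2\, \mathbf{y}(t-2) \\
&\;\;\vdots \\
&= \boldsymbol{\varepsilon}(t) + (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})\, \boldsymbol{\varepsilon}(t-1) + (\Phi_0 + \Phi_1 \tilde{\mathbf{W}})^2\, \boldsymbol{\varepsilon}(t-2) + \cdots \\
&= \sum_{i=0}^{\infty} \tilde{\Phi}^i\, \boldsymbol{\varepsilon}(t-i),
\end{aligned}$$
with $\tilde{\Phi} = \Phi_0 + \Phi_1 \tilde{\mathbf{W}}$. For a stationary GSTAR(1;1) model, all eigenvalues of $\tilde{\Phi}$ are between -1 and 1, so that $\tilde{\Phi}^i \to \mathbf{0}$ for $i \to \infty$. This is stated in Theorem 1.

Theorem 1. If $\tilde{\Phi} = \Phi_0 + \Phi_1 \tilde{\mathbf{W}}$ and the eigenvalues of $\tilde{\Phi}$ are between -1 and 1, then $\lim_{n \to \infty} \tilde{\Phi}^n = \mathbf{0}$, for n = 0, 1, 2, ….

Proof: The matrix $\tilde{\Phi}'\tilde{\Phi}$ is positive definite, so the matrix $\tilde{\Phi} \in \mathbb{C}^{k \times \ell}$ can be expressed by the singular value decomposition (SVD), i.e. with a diagonal matrix $\mathbf{D} \in \mathbb{R}^{r \times r}$, $r \le \min\{k, \ell\}$, and matrices $\mathbf{U} \in \mathbb{C}^{k \times k}$, $\mathbf{V} \in \mathbb{C}^{\ell \times \ell}$, so that
$$\tilde{\Phi} = \mathbf{U}\mathbf{D}\mathbf{V} \iff \tilde{\Phi}^n = \mathbf{U}\mathbf{D}^n\mathbf{V}.$$
Because the elements of $\mathbf{D}$ are roots of the eigenvalues of $\tilde{\Phi}$ and the eigenvalues of $\tilde{\Phi}$ are between -1 and 1, $\lim_{n \to \infty} \mathbf{D}^n = \mathbf{0}$. As a result, $\lim_{n \to \infty} \tilde{\Phi}^n = \mathbf{0}$.

It follows that the invertibility property of the GSTAR(1;1) model is satisfied, because the coefficient of the process $\{\mathbf{y}(t-i)\}$ tends to zero. This confirms that GSTAR(1;1) ≃ GSTMA(∞;1). The order of the GSTMA model is determined theoretically by considering the rate of convergence of each kernel function used. Viewed for each kernel function separately, the limit approaches zero (Table 2).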
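The decay of the powers of the parameter matrix can also be checked numerically. The sketch below is an illustration only: the three-location parameter values and weights are made up, the helper names are hypothetical, and the size of a matrix is measured by its largest absolute entry. It builds the matrix Φ0 + Φ1W and searches for the first power whose entries all fall below a chosen tolerance.

```python
def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gstar_matrix(phi0, phi1, w):
    """Return diag(phi0) + diag(phi1) * W for lists phi0, phi1 and matrix w."""
    n = len(w)
    return [
        [(phi0[i] if i == j else 0.0) + phi1[i] * w[i][j] for j in range(n)]
        for i in range(n)
    ]


def max_abs_entry(m):
    return max(abs(x) for row in m for x in row)


def truncation_order(phi, tol, max_order=10000):
    """Smallest n >= 1 such that every entry of phi**n is below tol in absolute value."""
    power = phi
    for n in range(1, max_order + 1):
        if max_abs_entry(power) < tol:
            return n
        power = mat_mul(power, phi)
    raise ValueError("powers did not fall below tol; the matrix may not be stable")


# Hypothetical parameters for three locations (not estimates from any data).
PHI0 = [-0.3, -0.5, -0.4]
PHI1 = [0.2, 0.3, 0.4]
W = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
PHI = gstar_matrix(PHI0, PHI1, W)
```

For these made-up values every row of `PHI` has absolute row sum at most 0.8, so the entries of its n-th power are bounded by 0.8 to the power n and `truncation_order(PHI, 1e-3)` cannot exceed 31. With parameters that violate the stationarity condition, the search would instead end in the error branch.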
The invertibility property states that GSTAR(1;1) ≃ GSTMA(∞;1). In statistical practice, the time order of the GSTMA(∞;1) model does not have to be infinite; it can be replaced by a finite order n. This is stated in Theorem 2.

Table 2. The limit of each kernel function. Every kernel function has a limit value of zero.

Kernel | Limit
Uniform | $\lim_{n \to \infty} \left(\frac{1}{2}\right)^n = 0$
Triangular | $\lim_{n \to \infty} (1 - |x|)^n = 0$
Epanechnikov | $\lim_{n \to \infty} \left(\frac{3}{4}(1 - x^2)\right)^n = 0$
Cosine | $\lim_{n \to \infty} \left(\frac{\pi}{4}\cos\left(\frac{\pi}{2}x\right)\right)^n = 0$
Gaussian | $\lim_{n \to \infty} \left(\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)\right)^n = 0$

Theorem 2. If a process $\{y_i(t)\}$ follows the GSTAR(1;1) model with a kernel spatial weight matrix and satisfies the invertibility property, so that GSTAR(1;1) ≃ GSTMA(∞;1), then
$$\left\| \tilde{\Phi}\, \mathbf{y}(t-1) + \boldsymbol{\varepsilon}(t) - \sum_{i=0}^{n} \tilde{\Phi}^i\, \boldsymbol{\varepsilon}(t-i) \right\| \to 0, \quad \text{for } n \to \infty.$$

Proof: Given the GSTAR(1;1) model and the GSTMA(∞;1) model, the difference between the two equivalent models is
$$\begin{aligned}
\tilde{\Phi}\mathbf{y}(t-1) + \boldsymbol{\varepsilon}(t) - \sum_{i=0}^{n} \tilde{\Phi}^i \boldsymbol{\varepsilon}(t-i)
&= \tilde{\Phi}\mathbf{y}(t-1) - \sum_{i=1}^{n} \tilde{\Phi}^i \boldsymbol{\varepsilon}(t-i) \\
&= \tilde{\Phi}\mathbf{y}(t-1) - \sum_{i=1}^{n} \tilde{\Phi}^i \left( \mathbf{y}(t-i) - \tilde{\Phi}\mathbf{y}(t-i-1) \right) \\
&= \tilde{\Phi}\mathbf{y}(t-1) - \left( \tilde{\Phi}\mathbf{y}(t-1) - \tilde{\Phi}^2\mathbf{y}(t-2) \right) - \sum_{i=2}^{n} \tilde{\Phi}^i \left( \mathbf{y}(t-i) - \tilde{\Phi}\mathbf{y}(t-i-1) \right) \\
&= \tilde{\Phi}^2\mathbf{y}(t-2) - \sum_{i=2}^{n} \tilde{\Phi}^i \left( \mathbf{y}(t-i) - \tilde{\Phi}\mathbf{y}(t-i-1) \right) \\
&\;\;\vdots \\
&= \tilde{\Phi}^n \mathbf{y}(t-n).
\end{aligned}$$
Furthermore, by the properties of the matrix norm,
$$\left\| \tilde{\Phi}^n \mathbf{y}(t-n) \right\| \le \left\| \tilde{\Phi}^n \right\| \left\| \mathbf{y}(t-n) \right\| \le c \left\| \tilde{\Phi}^n \right\| \le c \left\| \tilde{\Phi} \right\|^n,$$
for a constant $c \in \mathbb{R}$. Since $\tilde{\Phi} = \Phi_0 + \Phi_1 \tilde{\mathbf{W}}$ by definition,
$$\left\| \tilde{\Phi} \right\|^n = \left\| \Phi_0 + \Phi_1 \tilde{\mathbf{W}} \right\|^n \le \left\| \Phi_0 \right\|^n + \left\| \Phi_1 \right\|^n \left\| \tilde{\mathbf{W}} \right\|^n. \qquad (2)$$
The diagonal elements $\phi_{0i}$ are between -1 and 1, so that $\left\| \Phi_0 \right\|^n \le \left( \max_i |\phi_{0i}| \right)^n = a_0^n$ for $0 < a_0 < 1$.
Similarly, for the autoregressive parameter associated with the neighboring locations, $\left\| \Phi_1 \right\|^n \le \left( \max_i |\phi_{1i}| \right)^n = b_0^n$ for $0 < b_0 < 1$. Equation (2) then becomes
$$\left\| \tilde{\Phi} \right\|^n \le a_0^n + b_0^n \left\| \tilde{\mathbf{W}} \right\|^n. \qquad (3)$$
The matrix $\tilde{\mathbf{W}}$ is a spatial weight matrix obtained through the kernel function. Each kernel function converges to zero (Table 2), and $\lim_{n \to \infty} a_0^n = 0$, $\lim_{n \to \infty} b_0^n = 0$. Using the $\ell_\infty$ norm of a matrix [19], $\left\| \tilde{\mathbf{W}} \right\|_\infty = \max_{1 \le i, j \le n} |\tilde{w}_{ij}|$, the n-th power of the norm of the spatial weight matrix tends to zero. In other words, the GSTAR(1;1) model is equivalent to the GSTMA(n;1) model. □

By Theorem 2, each kernel function used to form the weight matrix produces a different convergence rate. The convergence rate determines a finite value n, which is the order of the GSTMA model. The value of n for each kernel function with an error of 0.001 can be seen in Figure 2. From Figure 2, the kernels can be classified into three groups based on the size of n, namely $1 \le n \le 15$, $16 \le n \le 30$, and $n > 30$. The group $1 \le n \le 15$ contains the uniform kernel function, n = 11, and the Gaussian, n = 9. For these functions, convergence to zero is relatively fast. The next group, $16 \le n \le 30$, contains the Epanechnikov kernel function, n = 25, and the cosine, n = 30. Both groups can be categorized as having finite n, but for the triangular kernel function, n = 688, one can say $n \to \infty$. This is because the triangular function contains an absolute value function, which is not differentiable at zero, so convergence takes longer. Each kernel function raised to the power n forms a geometric sequence. The ratio of the Gaussian function is the smallest, so its convergence is also the fastest. The next smallest ratios, in order, belong to the uniform, Epanechnikov, cosine, and triangular kernel functions.
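The geometric-sequence argument can be made concrete with a short calculation. The sketch below assumes that the order is the first n at which a ratio raised to the power n drops to the error level or below; the function name is hypothetical and the text does not state which ratios were actually used for each kernel, so this is an illustration rather than a reconstruction of the reported orders.

```python
def smallest_order(ratio, tol):
    """Smallest n >= 0 with ratio**n <= tol, for 0 <= ratio < 1 and 0 < tol."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must lie in [0, 1) for the sequence to converge")
    if tol <= 0.0:
        raise ValueError("tol must be positive")
    n, value = 0, 1.0
    while value > tol:
        value *= ratio  # next term of the geometric sequence
        n += 1
    return n
```

For example, `smallest_order(0.5, 1e-3)` returns 10, and a ratio of 0.75, which is the peak value of the Epanechnikov kernel in Table 1, gives 25 at the same error level. A ratio close to one needs several hundred terms, which is the behaviour described above for the triangular kernel.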
This shows that the invertibility property of the GSTAR(1;1) model can be approximated by the GSTMA(n;1) model with $n < \infty$. The orders for several error values can be seen in Table 3.

Figure 2. The order of the GSTMA model equivalent to the GSTAR(1;1) model for each kernel function at an error of 0.001. The figure shows, for each kernel, the convergence plot of the parameter matrix against n, with uniform n = 11, triangular n = 688, Epanechnikov n = 25, cosine n = 30 and Gaussian n = 9. Note: $x = \frac{\bar{y}_i - \bar{y}_j}{h}$.

Table 3. The finite order of the GSTMA model that satisfies the invertibility property of the GSTAR(1;1) model

Kernel | Order at error 0.01 | Order at error 0.001 | Order at error 0.0001
Uniform | 8 | 11 | 14
Triangular | 459 | 688 | 917
Epanechnikov | 17 | 25 | 33
Cosine | 20 | 29 | 39
Gaussian | 6 | 8 | 11

Case study

The GSTAR(1;1) model with kernel weights is applied to tea production data from six plantation areas in West Java. The data plot can be seen in Figure 3, which presents the modelling data with time T = 200 at six observation locations. The plot for each location shows that stationarity is not yet fully satisfied (the data are only weakly stationary), so the data need to be differenced.

Figure 3. The data plot of each location.
The plot illustrates that the data are only weakly stationary and need differencing to become stationary.

The data are stationary in mean and variance after one differencing step. The model used in this paper is the GSTAR(1;1) model, so no model identification step is needed. The next step is to determine the spatial weight matrix using the Gaussian kernel (Table 1) with the optimum bandwidth. This spatial weight matrix is random because it is a function of the random tea production variable in each plantation area. The spatial weight matrix with the Gaussian kernel is
$$\tilde{\mathbf{W}} = \begin{pmatrix}
0 & 0.25 & 0.27 & 0.12 & 0.22 & 0.18 \\
0.21 & 0 & 0.24 & 0.24 & 0.24 & 0.25 \\
0.23 & 0.24 & 0 & 0.12 & 0.22 & 0.17 \\
0.10 & 0.27 & 0.11 & 0 & 0.17 & 0.32 \\
0.16 & 0.23 & 0.16 & 0.13 & 0 & 0.13 \\
0.14 & 0.25 & 0.14 & 0.32 & 0.17 & 0
\end{pmatrix}.$$
The next step is to estimate the parameters by the least squares method and to assess their significance by considering the eigenvalues of the parameter matrix of the GSTAR(1;1) model. The parameter estimates and their validation can be seen in Table 4.

Table 4.
The result of parameter estimation using the least squares method and its validation

Parameter of each location | Parameter estimate | 95% confidence interval | Validation (eigenvalue of parameter matrix < 1)
phi01; phi11 | -0.32; 0.01 | (-0.37; -0.27); (-0.03; 0.06) | valid
phi02; phi12 | -0.48; 0.34 | (-0.52; -0.44); (0.27; 0.40) | valid
phi03; phi13 | -0.56; 0.59 | (-0.60; -0.51); (0.53; 0.66) | valid
phi04; phi14 | -0.30; 0.37 | (-0.34; -0.27); (0.33; 0.42) | valid
phi05; phi15 | -0.49; 0.25 | (-0.54; -0.45); (0.19; 0.30) | valid
phi06; phi16 | -0.37; 0.18 | (-0.40; -0.34); (0.13; 0.22) | valid

The estimated GSTAR(1;1) model with Gaussian kernel weights is
$$\hat{\mathbf{y}}(t) = \left( \hat{\Phi}_0 + \hat{\Phi}_1 \tilde{\mathbf{W}} \right) \mathbf{y}(t-1),$$
with
$$\hat{\Phi}_0 = \mathrm{diag}(-0.32, -0.48, -0.56, -0.30, -0.49, -0.37) \quad \text{and} \quad \hat{\Phi}_1 = \mathrm{diag}(0.01, 0.34, 0.59, 0.37, 0.25, 0.18).$$
The estimated data for each location under the GSTAR(1;1) model are plotted in Figure 4, together with the RMSE value for each location. The estimates follow the pattern of the original values, and the RMSE values are quite small.

Figure 4. Plot of the estimated and original values for the GSTAR(1;1) model with Gaussian kernel weights. The black line is the original value and the red line is the estimated value.

The residual test of this model can be viewed in Figure 5. The residual scatter plot and the QQ plot show that the assumptions of randomness and normality are met.

Figure 5.
The results of the residual test plots, which meet the assumptions of randomness (a) and normality (b)

Conclusion

The kernel weight matrix affects the invertibility property of the GSTAR(1;1) model through the order of the equivalent GSTMA(∞;1) model. The finite time order obtained satisfies 7 < n < 30. The triangular kernel yields a relatively large n, so it is not counted among the kernels with a finite order n. The model was implemented on tea production data from six plantation areas in West Java. This is possible because the spatial weights are built from the production data through the Gaussian kernel function, in accordance with the data description at each research location.

References
[1] G. Box, G. Jenkins and G. Reinsel, Time Series Analysis, Forecasting and Control, 3rd edition, New Jersey: Prentice Hall, 1994.
[2] R. Tsay, Multivariate Time Series Analysis with R and Financial Applications, New Jersey: John Wiley & Sons, 2014.
[3] Y. Yundari, U. S. Pasaribu, U. Mukhaiyar and M. N. Heriawan, "Spatial weight determination of GSTAR(1;1) model by using kernel function," in IOP Conf. Series: Journal of Physics, 2018.
[4] Y. Yundari, U. Pasaribu and U. Mukhaiyar, "Assumption errors of generalized STAR model," J. Math. Fund. Sci., vol. 49, no. 2, pp. 136-155, 2017.
[5] U. Mukhaiyar and U. S. Pasaribu, "A new procedure of generalized STAR modelling using IAcM approach," ITB J. Sci., vol. 44A, no. 2, pp. 179-192, 2012.
[6] S. Borovkova, B. Ruchjana and H. Lopuhaa, "Least squares estimation of generalized space time autoregressive (GSTAR) model and its properties," in AIP Conf. Proc., 2012.
[7] D. Masteriana, M. Riani and U. Mukhaiyar, "Generalized STAR(1;1) with outlier-case study of begal in Medan, North Sumatera," in J. Phys.: Conf. Ser., 2019.
[8] H. Bonar, B. Ruchjana and G.
Darmawan, "Development of generalized space time autoregressive integrated with ARCH galat (GSTARI-ARCH) model based on consumer price index phenomenon at several cities in North Sumatera province," in AIP Conference Proceedings, 2013.
[9] A. Iriany, Suhariningsih, B. Ruchjana and Setiawan, "Prediction of precipitation data at Batu town using the GSTAR(1,p)-SUR model," Journal of Basic and Applied Scientific Research, vol. 3, no. 1, pp. 860-865, 2013.
[10] A. S. Abdullah, S. Matoha, D. A. Lubis, A. N. Falah, M. Jaya, E. Hermawan and B. N. R. B, "Implementation of generalized space time autoregressive (GSTAR)-kriging model for predicting rainfall data at unobserved locations in West Java," Applied Mathematics & Information Sciences, vol. 12, no. 3, pp. 607-615, 2018.
[11] N. Nurhayati, U. S. Pasaribu and O. Neswan, "Application of generalized space-time autoregressive model on GDP data in West European countries," Journal of Probability and Statistics, vol. 2012, 2012.
[12] Suhartono and Subanar, "The optimal determination of space weight in GSTAR model by using cross-correlation inference," Quantitative Methods, vol. 2, no. 2, pp. 45-53, 2006.
[13] R. Nugraha, S. Setyowati, U. Mukhaiyar and A. Yuliawati, "Prediction of oil palm production using the weight average of fuzzy sets concept approach," in AIP Conference Proceedings, Bandung, 2015.
[14] Yundari, N. Huda, U. Pasaribu, U. Mukhaiyar and K. Sari, "Stationary process in GSTAR(1;1) through kernel function approach," in AIP Conference Proceedings, Pontianak, 2020.
[15] D. U. Wutsqa and Suhartono, "Seasonal multivariate time series forecasting on tourism data by using VAR-GSTAR model," Jurnal Ilmu Dasar, vol. 11, no. 1, pp. 101-109, 2010.
[16] N. Fadlilah, U. Mukhaiyar and F.
Fahmi, "The generalized STAR(1;1) modeling with time correlated galats to red chili weekly prices of some traditional markets in Bandung," in AIP Conference Proceedings, Bandung, 2015.
[17] M. Wand and M. Jones, Kernel Smoothing, New York: Springer Science+Business Media B.V., 1995.
[18] M. Michalack, "Time series prediction with periodic kernels," Pattern Analytic Application, vol. 14, no. 0, pp. 283-293, 2011.
[19] R. Horn and C. Johnson, Matrix Analysis, 2nd ed., New York: Cambridge University Press, 2013.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(3) (2022), Pages 354-361
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: December 23, 2021 Reviewed: May 25, 2022 Accepted: July 17, 2022
DOI: http://dx.doi.org/10.18860/ca.v7i3.14555

𝜂-Intuitionistic Fuzzy Soft Groups

Mustika Ana Kurfia*, Noor Hidayat, Corina Karim
Mathematics Department, Universitas Brawijaya, Malang, Indonesia
Email: muzematika@gmail.com

Abstract

Fuzzy set theory was introduced by Zadeh in 1965 and soft set theory was introduced by Molodtsov in 1999. Since then, many researchers have developed these two theories and combined them into the theory of fuzzy soft sets. In this research, we present the idea of the 𝜂-intuitionistic fuzzy soft group defined on the 𝜂-intuitionistic fuzzy soft set. The main purpose of this research is to create a new concept, the 𝜂-intuitionistic fuzzy soft group. To achieve this, we combine the concepts of the 𝜂-intuitionistic fuzzy group and the intuitionistic fuzzy soft group. As the main result, we prove the relationship between the intuitionistic fuzzy soft group and the 𝜂-intuitionistic fuzzy soft group, along with some properties of the 𝜂-intuitionistic fuzzy soft group. We also prove some properties of subgroups of an 𝜂-intuitionistic fuzzy soft group and establish an 𝜂-intuitionistic fuzzy soft homomorphism.
Keywords: intuitionistic fuzzy group; intuitionistic fuzzy soft group; 𝜂-intuitionistic fuzzy group; 𝜂-intuitionistic fuzzy soft group

Introduction

Fuzzy theory has been studied by many researchers in various fields. Zadeh introduced fuzzy set theory in [1] by defining a membership function that maps each member of a set to the closed interval [0, 1]. Atanassov then formed the intuitionistic fuzzy set, which consists of a membership function and a nonmembership function, in [2]. The theories of fuzzy sets and intuitionistic fuzzy sets were extended to group theory, giving the fuzzy subgroup in [3] and the intuitionistic fuzzy subgroup in [4]. The intuitionistic fuzzy subgroup has been studied in various forms, for example the intuitionistic L-fuzzy subgroups formed in [5], the (𝑠, 𝑡]-intuitionistic fuzzy subgroups defined in [6], the (𝛼, 𝛽)-cut of intuitionistic fuzzy subgroups in [7], and the t-intuitionistic fuzzy subgroups in [8]. Doda and Sharma studied finite groups of different orders and gave the idea of recording the count of intuitionistic fuzzy subgroups in [9]. Zhou and Xu extended intuitionistic fuzzy sets based on hesitant fuzzy membership in [10]. The concepts of (𝜆, 𝜇)-intuitionistic fuzzy subgroups and normal subgroups were defined in [11]. The fundamental properties of the t-intuitionistic fuzzy abelian subgroup, along with its homomorphism, were studied in [12]. Latif, et al. studied the fundamental theorems of t-intuitionistic fuzzy isomorphism of t-intuitionistic fuzzy subgroups in [13]. Moreover, the concepts of the 𝜉-intuitionistic fuzzy subgroup, 𝜉-intuitionistic fuzzy cosets, and the 𝜉-intuitionistic fuzzy normal subgroup were characterized in [14]. Based on this research, Shuaib, et al.
in [15] formed a concept http://dx.doi.org/10.18860/ca.v7i3.14555 mailto:muzematika@gmail.com 𝜂 −intuitionistic fuzzy soft groups mustika ana kurfia 355 called an 𝜂 −intuitionistic fuzzy subgroup on an 𝜂 −intuitionistic fuzzy set along with 𝜂 −intuitionistic fuzzy homomorphism. while the theory of soft set is introduced in [16] which is an ordered pair of parameter and function that maps each member of parameter to the power set of an empty set. the soft sets constructed in the form of membership function became fuzzy soft sets defined in [17]. maji, et al. introduced the concept of intuitionistic fuzzy soft set which is a generalization of intuitionistic fuzzy set and soft set in [18]. furthermore, the operation properties and algebraic structure of intuitionistic fuzzy soft set were discussed in [19]. soft group on the soft set is defined in [16]. then aygünoglu and aygun in [20] developed the concept of soft group in the form of membership function became fuzzy soft group. the concept of intuitionistic fuzzy soft set to semigroup was applied in [21] and intuitionistic fuzzy soft ideals over ordered ternary semigroup was defined in [22]. the concept of soft group is developed into an intersection called soft int-group in [23]. moreover, karaaslan, et al. in [24] applied the concept of soft int-group into intuitionistic fuzzy soft set by forming the intuitionistic fuzzy soft group and investigate some properties of intuitionistic fuzzy soft group. based on [15] and [24], we introduce the notion of the 𝜂 −intuitionistic fuzzy soft group on the 𝜂 −intuitionistic fuzzy soft set and give some basic properties. moreover, we define the notion of the 𝜂 −intuitionistic fuzzy subgroup and investigate the properties of homomorphism of an 𝜂 −intuitionistic fuzzy soft group. 
methods

the method of this research is a literature review: data are collected by reviewing books, notes, and other scientific research results related to the problem. in this paper, we begin by forming the definition of the $\eta$-intuitionistic fuzzy soft set based on the definitions of the intuitionistic fuzzy soft set and the $\eta$-intuitionistic fuzzy set. then we form the definition of the $\eta$-intuitionistic fuzzy soft group based on the definition of the intuitionistic fuzzy soft group. we go on to prove some properties of the $\eta$-intuitionistic fuzzy soft group and of subgroups of an $\eta$-intuitionistic fuzzy soft group. finally, we define the image and pre-image of an $\eta$-intuitionistic fuzzy soft group and prove the homomorphism results. the definitions of the intuitionistic fuzzy soft set, the $\eta$-intuitionistic fuzzy set, and the intuitionistic fuzzy soft group are given below.

definition 1 [18]. let $X$ be a non-empty set and $E$ be a set of parameters with $A \subseteq E$. let $\mathcal{IF}(X)$ be the set of all intuitionistic fuzzy sets of $X$. an intuitionistic fuzzy soft set of $A$ over $X$ is defined by $\Gamma_A = \{(a, \gamma_A(a)) : a \in A\}$, where $\gamma_A : E \to \mathcal{IF}(X)$ is such that $\gamma_A(a) = \emptyset$ if $a \notin A$. for all $a \in E$, $\gamma_A(a)$ is called the intuitionistic fuzzy value set of $a$. it can be written as $\gamma_A(a) = \{(x, \mu_{\gamma_A(a)}(x), \nu_{\gamma_A(a)}(x)) : x \in X\}$ for all $a \in E$.

definition 2 [15]. suppose $A$ is an intuitionistic fuzzy set of a non-empty set $X$, where $\mu_A$ is the membership function and $\nu_A$ is the nonmembership function of $A$. let $\eta \in [0, 1]$. an $\eta$-intuitionistic fuzzy set is defined by $A^\eta = \{(x, \mu_{A^\eta}(x), \nu_{A^\eta}(x)) : x \in X\}$, where $\mu_{A^\eta}(x) = \psi[\mu_A(x), \eta] = \sqrt{\mu_A(x) \cdot \eta}$ and $\nu_{A^\eta}(x) = \psi'[\nu_A(x), 1 - \eta] = \sqrt{\nu_A(x) \cdot (1 - \eta)}$.

definition 3 [24]. let $G$ be an arbitrary group and $\Gamma_G$ be an intuitionistic fuzzy soft set over the universe $X$. then $\Gamma_G$ is called an intuitionistic fuzzy soft group on $G$ over $X$ if, for all $x, y \in G$,
1. $\gamma_G(xy) \supseteq \gamma_G(x) \cap \gamma_G(y)$,
2. $\gamma_G(x^{-1}) = \gamma_G(x)$.

results and discussion

in this section, we introduce the notion of the $\eta$-intuitionistic fuzzy soft group on the $\eta$-intuitionistic fuzzy soft set. inspired by definition 1 and definition 2, we first define the $\eta$-intuitionistic fuzzy soft set.

definition 4. suppose $\Gamma_A$ is an intuitionistic fuzzy soft set of the parameter set $A$ over the universe $X$, and let $\eta \in [0, 1]$. an $\eta$-intuitionistic fuzzy soft set of $A$ over $X$ is defined by $\Gamma_A^\eta = \{(a, \gamma_A^\eta(a)) : a \in A\}$, where $\gamma_A^\eta(a)$ is an $\eta$-intuitionistic fuzzy set of $X$ defined as $\gamma_A^\eta(a) = \Psi[\gamma_A(a), \eta]$ for all $a \in A$, with
$$\Psi[\gamma_A(a), \eta] = \{(x, \psi[\mu_{\gamma_A(a)}(x), \eta], \psi[\nu_{\gamma_A(a)}(x), 1 - \eta]) : x \in X\}.$$
here $\psi[\mu_{\gamma_A(a)}(x), \eta] = \sqrt{\mu_{\gamma_A(a)}(x) \cdot \eta}$ and $\psi[\nu_{\gamma_A(a)}(x), 1 - \eta] = \sqrt{\nu_{\gamma_A(a)}(x) \cdot (1 - \eta)}$.

basic properties of $\eta$-intuitionistic fuzzy soft sets, such as closure under intersection and union, are given in the following proposition and remark.

proposition 1. the intersection of any two $\eta$-intuitionistic fuzzy soft sets is an $\eta$-intuitionistic fuzzy soft set.

proof. let $\Gamma_A^\eta = \{(a, \gamma_A^\eta(a)) : a \in A\}$ and $\Gamma_B^\eta = \{(b, \gamma_B^\eta(b)) : b \in B\}$ be two $\eta$-intuitionistic fuzzy soft sets of $A$ and $B$, respectively, over $X$. for any $c \in A \cap B$,
$$\begin{aligned}
\gamma_{(A \tilde{\cap} B)^\eta}(c) &= \Psi[\gamma_{A \tilde{\cap} B}(c), \eta] \\
&= \{(x, \psi[\min[\mu_{\gamma_A(c)}(x), \mu_{\gamma_B(c)}(x)], \eta], \psi[\max[\nu_{\gamma_A(c)}(x), \nu_{\gamma_B(c)}(x)], 1 - \eta]) : x \in X\} \\
&= \{(x, \psi[\min[\mu_{\gamma_A(c)}(x), \eta], \min[\mu_{\gamma_B(c)}(x), \eta]], \psi[\max[\nu_{\gamma_A(c)}(x), 1 - \eta], \max[\nu_{\gamma_B(c)}(x), 1 - \eta]]) : x \in X\} \\
&= \{(x, \min[\psi[\mu_{\gamma_A(c)}(x), \eta], \psi[\mu_{\gamma_B(c)}(x), \eta]], \max[\psi[\nu_{\gamma_A(c)}(x), 1 - \eta], \psi[\nu_{\gamma_B(c)}(x), 1 - \eta]]) : x \in X\} \\
&= \Psi[\gamma_A(c), \eta] \cap \Psi[\gamma_B(c), \eta] \\
&= \gamma_{A^\eta \tilde{\cap} B^\eta}(c).
\end{aligned}$$
hence $\Gamma_A^\eta \tilde{\cap} \Gamma_B^\eta$ is an $\eta$-intuitionistic fuzzy soft set. ∎

remark 1. the union of any two $\eta$-intuitionistic fuzzy soft sets is an $\eta$-intuitionistic fuzzy soft set.

the notion of the $\eta$-intuitionistic fuzzy soft group is defined based on definition 3. an $\eta$-intuitionistic fuzzy soft set must satisfy two axioms to be an $\eta$-intuitionistic fuzzy soft group.

definition 5. let $G$ be a group and $\Gamma_G^\eta$ be an $\eta$-intuitionistic fuzzy soft set over the universe $U$. then $\Gamma_G^\eta$ is called an $\eta$-intuitionistic fuzzy soft group over $U$ if, for all $x, y \in G$,
1. $\gamma_G^\eta(xy) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y)$,
2. $\gamma_G^\eta(x^{-1}) = \gamma_G^\eta(x)$.

the behaviour of the identity element of the group in an $\eta$-intuitionistic fuzzy soft group is given by the following proposition.

proposition 2. let $G$ be a group and $\Gamma_G^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$. then $\gamma_G^\eta(e) \supseteq \gamma_G^\eta(x)$ for all $x \in G$.

proof. let $\Gamma_G^\eta = \{(x, \gamma_G^\eta(x)) : x \in G\}$ be an $\eta$-intuitionistic fuzzy soft group and $e$ be the identity element of $G$. for any $x \in G$ we have $\gamma_G^\eta(e) = \gamma_G^\eta(xx^{-1})$. since $\Gamma_G^\eta$ is an $\eta$-intuitionistic fuzzy soft group,
$$\gamma_G^\eta(e) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(x^{-1}) = \gamma_G^\eta(x) \cap \gamma_G^\eta(x) = \gamma_G^\eta(x).$$
so $\gamma_G^\eta(e) \supseteq \gamma_G^\eta(x)$ for all $x \in G$. ∎

an $\eta$-intuitionistic fuzzy soft set is an $\eta$-intuitionistic fuzzy soft group if it satisfies the two axioms in definition 5. here we prove an equivalent characterization that uses a single condition.

theorem 1. an $\eta$-intuitionistic fuzzy soft set $\Gamma_G^\eta$ is an $\eta$-intuitionistic fuzzy soft group if and only if $\gamma_G^\eta(xy^{-1}) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y)$ for all $x, y \in G$.

proof. ($\Rightarrow$) let $\Gamma_G^\eta = \{(x, \gamma_G^\eta(x)) : x \in G\}$ be an $\eta$-intuitionistic fuzzy soft group. from definition 5, for all $x, y \in G$ we have
$$\gamma_G^\eta(xy^{-1}) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y^{-1}) = \gamma_G^\eta(x) \cap \gamma_G^\eta(y).$$
($\Leftarrow$) suppose $\gamma_G^\eta(xy^{-1}) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y)$ for all $x, y \in G$. taking $y = x$ gives $\gamma_G^\eta(e) = \gamma_G^\eta(xx^{-1}) \supseteq \gamma_G^\eta(x)$ for all $x \in G$. then
$$\gamma_G^\eta(x^{-1}) = \gamma_G^\eta(ex^{-1}) \supseteq \gamma_G^\eta(e) \cap \gamma_G^\eta(x) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(x) = \gamma_G^\eta(x),$$
and
$$\gamma_G^\eta(x) = \gamma_G^\eta(e(x^{-1})^{-1}) \supseteq \gamma_G^\eta(e) \cap \gamma_G^\eta(x^{-1}) \supseteq \gamma_G^\eta(x^{-1}) \cap \gamma_G^\eta(x^{-1}) = \gamma_G^\eta(x^{-1}).$$
hence $\gamma_G^\eta(x^{-1}) = \gamma_G^\eta(x)$ for all $x \in G$. then for all $x, y \in G$ we have
$$\gamma_G^\eta(xy) = \gamma_G^\eta(x(y^{-1})^{-1}) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y^{-1}) = \gamma_G^\eta(x) \cap \gamma_G^\eta(y).$$
hence $\gamma_G^\eta(xy) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y)$ for all $x, y \in G$. therefore, $\Gamma_G^\eta$ is an $\eta$-intuitionistic fuzzy soft group.
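∎

since the $\eta$-intuitionistic fuzzy soft group is defined in terms of the intuitionistic fuzzy soft group, the two concepts are related. in the following theorem, we prove that every intuitionistic fuzzy soft group gives rise to an $\eta$-intuitionistic fuzzy soft group.

theorem 2. if $\Gamma_G$ is an intuitionistic fuzzy soft group, then $\Gamma_G^\eta$ is an $\eta$-intuitionistic fuzzy soft group for all $\eta \in [0, 1]$.

proof. let $\Gamma_G = \{(x, \gamma_G(x)) : x \in G\}$ be an intuitionistic fuzzy soft group over the universe $U$, where $\gamma_G(x) = \{(u, \mu_{\gamma_G(x)}(u), \nu_{\gamma_G(x)}(u)) : u \in U\}$. for any $x, y \in G$ and $\eta \in [0, 1]$ we have $\gamma_G^\eta(xy^{-1}) = \Psi[\gamma_G(xy^{-1}), \eta]$. since $\Gamma_G$ is an intuitionistic fuzzy soft group,
$$\begin{aligned}
\gamma_G^\eta(xy^{-1}) &\supseteq \Psi[(\gamma_G(x) \cap \gamma_G(y)), \eta] \\
&= \{(u, \psi[\min[\mu_{\gamma_G(x)}(u), \mu_{\gamma_G(y)}(u)], \eta], \psi[\max[\nu_{\gamma_G(x)}(u), \nu_{\gamma_G(y)}(u)], 1 - \eta]) : u \in U\} \\
&= \{(u, \psi[\min[\mu_{\gamma_G(x)}(u), \eta], \min[\mu_{\gamma_G(y)}(u), \eta]], \psi[\max[\nu_{\gamma_G(x)}(u), 1 - \eta], \max[\nu_{\gamma_G(y)}(u), 1 - \eta]]) : u \in U\} \\
&= \{(u, \min[\psi[\mu_{\gamma_G(x)}(u), \eta], \psi[\mu_{\gamma_G(y)}(u), \eta]], \max[\psi[\nu_{\gamma_G(x)}(u), 1 - \eta], \psi[\nu_{\gamma_G(y)}(u), 1 - \eta]]) : u \in U\} \\
&= \Psi[\gamma_G(x), \eta] \cap \Psi[\gamma_G(y), \eta] \\
&= \gamma_G^\eta(x) \cap \gamma_G^\eta(y).
\end{aligned}$$
therefore, by theorem 1, $\Gamma_G^\eta$ is an $\eta$-intuitionistic fuzzy soft group. ∎

here we define the $\eta$-intuitionistic fuzzy soft subgroup of an $\eta$-intuitionistic fuzzy soft group. its properties are proved in the theorems that follow.

definition 6. suppose $G$ is a group and $H$ is a subgroup of $G$. let $\Gamma_G^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$. then $\Gamma_H^\eta$ is called an $\eta$-intuitionistic fuzzy soft subgroup of $\Gamma_G^\eta$ if $\Gamma_H^\eta$ is an $\eta$-intuitionistic fuzzy soft group over $U$.

theorem 3. let $\Gamma_G^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$. suppose $\Gamma_H^\eta$ and $\Gamma_N^\eta$ are two $\eta$-intuitionistic fuzzy soft subgroups of $\Gamma_G^\eta$. then $\Gamma_H^\eta \tilde{\cap} \Gamma_N^\eta$ is an $\eta$-intuitionistic fuzzy soft subgroup of $\Gamma_G^\eta$.

proof. define $\Gamma_H^\eta \tilde{\cap} \Gamma_N^\eta = \{(x, \gamma_{H \tilde{\cap} N}^\eta(x)) : x \in H \cap N\}$. for any $x, y \in H \cap N$,
$$\begin{aligned}
\gamma_{H \tilde{\cap} N}^\eta(xy^{-1}) &= \gamma_H^\eta(xy^{-1}) \cap \gamma_N^\eta(xy^{-1}) \\
&\supseteq (\gamma_H^\eta(x) \cap \gamma_H^\eta(y)) \cap (\gamma_N^\eta(x) \cap \gamma_N^\eta(y)) \\
&= (\gamma_H^\eta(x) \cap \gamma_N^\eta(x)) \cap (\gamma_H^\eta(y) \cap \gamma_N^\eta(y)) \\
&= \gamma_{H \tilde{\cap} N}^\eta(x) \cap \gamma_{H \tilde{\cap} N}^\eta(y).
\end{aligned}$$
hence, by theorem 1, $\Gamma_H^\eta \tilde{\cap} \Gamma_N^\eta$ is an $\eta$-intuitionistic fuzzy soft subgroup of $\Gamma_G^\eta$. ∎

theorem 4. let $\Gamma_G^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$ and $e$ be the identity element of $G$. then $\Gamma_G^\eta|_e = \{(x, \gamma_G^\eta(x)) : \gamma_G^\eta(x) = \gamma_G^\eta(e), x \in G\}$ is an $\eta$-intuitionistic fuzzy subgroup of $G$.

proof. since $(e, \gamma_G^\eta(e)) \in \Gamma_G^\eta|_e$, we have $\Gamma_G^\eta|_e \neq \emptyset$. let $(x, \gamma_G^\eta(x)), (y, \gamma_G^\eta(y)) \in \Gamma_G^\eta|_e$, so that $\gamma_G^\eta(x) = \gamma_G^\eta(y) = \gamma_G^\eta(e)$ for $x, y \in G$. then
$$\gamma_G^\eta(xy^{-1}) \supseteq \gamma_G^\eta(x) \cap \gamma_G^\eta(y) = \gamma_G^\eta(e) \cap \gamma_G^\eta(e) = \gamma_G^\eta(e).$$
since $\gamma_G^\eta(e) \supseteq \gamma_G^\eta(xy^{-1})$ by proposition 2, it follows that $\gamma_G^\eta(xy^{-1}) = \gamma_G^\eta(e)$. thus $(xy^{-1}, \gamma_G^\eta(xy^{-1})) \in \Gamma_G^\eta|_e$. therefore $\Gamma_G^\eta|_e$ is an $\eta$-intuitionistic fuzzy subgroup of $G$. ∎

we now define the image and pre-image of an $\eta$-intuitionistic fuzzy soft set under a map.

definition 7. let $\Gamma_A^\eta$ and $\Gamma_B^\eta$ be two $\eta$-intuitionistic fuzzy soft sets over the universe $U$, and let $\phi: A \to B$. then
1. the image of $\Gamma_A^\eta$ under the map $\phi$, denoted by $\phi(\Gamma_A^\eta)$, is defined by $\phi(\Gamma_A^\eta) = \{(b, \phi(\gamma_A^\eta)(b)) : b \in B\}$, where for all $b \in B$,
$$\phi(\gamma_A^\eta)(b) = \begin{cases} \bigcap \{\gamma_A^\eta(a) : a \in A, \phi(a) = b\}, & \text{if } b \in \phi(A), \\ \gamma_\emptyset^\eta, & \text{otherwise.} \end{cases}$$
2. the pre-image of $\Gamma_B^\eta$ under $\phi$, denoted by $\phi^{-1}(\Gamma_B^\eta)$, is defined by $\phi^{-1}(\Gamma_B^\eta) = \{(a, \phi^{-1}(\gamma_B^\eta)(a)) : a \in A\}$, where for all $a \in A$, $\phi^{-1}(\gamma_B^\eta)(a) = \gamma_B^\eta(\phi(a))$.

lemma 1. let $\Gamma_A^\eta$ and $\Gamma_B^\eta$ be two $\eta$-intuitionistic fuzzy soft sets over $U$. then $\phi(\Gamma_A^\eta)$ and $\phi^{-1}(\Gamma_B^\eta)$ are $\eta$-intuitionistic fuzzy soft sets over the universe $U$.

proof. let $\Gamma_A^\eta = \{(a, \gamma_A^\eta(a)) : a \in A\}$ and $\Gamma_B^\eta = \{(b, \gamma_B^\eta(b)) : b \in B\}$ be two $\eta$-intuitionistic fuzzy soft sets of $A$ and $B$, respectively, over $U$. let $\phi: A \to B$. from definition 7, if $b \in \phi(A)$, so that $\phi(a) = b$ for some $a \in A$, then $\phi(\gamma_A^\eta)(b) = \bigcap \{\gamma_A^\eta(a) : a \in A, \phi(a) = b\}$.
from proposition 1, $\phi(\gamma_A^\eta)(b)$ is an $\eta$-intuitionistic fuzzy set. if $b \notin \phi(A)$, then $\phi(\gamma_A^\eta)(b) = \gamma_\emptyset^\eta$ is an $\eta$-intuitionistic fuzzy set. moreover, for all $a \in A$, $\phi^{-1}(\gamma_B^\eta)(a) = \gamma_B^\eta(\phi(a))$ is an $\eta$-intuitionistic fuzzy set. therefore, $\phi(\Gamma_A^\eta)$ and $\phi^{-1}(\Gamma_B^\eta)$ are $\eta$-intuitionistic fuzzy soft sets over $U$. ∎

a map $f: G_1 \to G_2$ from a group $G_1$ to a group $G_2$ is called a homomorphism if $f(xy) = f(x)f(y)$ for all $x, y \in G_1$. in the following theorems, we prove that the image and pre-image of an $\eta$-intuitionistic fuzzy soft group under a homomorphism are again $\eta$-intuitionistic fuzzy soft groups.

theorem 5. let $\Gamma_{G_1}^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$ and let $\phi: G_1 \to G_2$ be a group homomorphism. then $\phi(\Gamma_{G_1}^\eta)$ is an $\eta$-intuitionistic fuzzy soft group.

proof. let $G_1$ and $G_2$ be two groups, $\phi: G_1 \to G_2$ a group homomorphism, and $\Gamma_{G_1}^\eta$ an $\eta$-intuitionistic fuzzy soft group over $U$. let $u, v \in G_2$. if $u \notin \phi(G_1)$ or $v \notin \phi(G_1)$, then $\phi(\gamma_{G_1}^\eta)(u) \cap \phi(\gamma_{G_1}^\eta)(v) = \gamma_\emptyset^\eta$, which means $\phi(\gamma_{G_1}^\eta)(uv) \supseteq \phi(\gamma_{G_1}^\eta)(u) \cap \phi(\gamma_{G_1}^\eta)(v)$. moreover, if $u \notin \phi(G_1)$, then $u^{-1} \notin \phi(G_1)$, so $\phi(\gamma_{G_1}^\eta)(u^{-1}) = \phi(\gamma_{G_1}^\eta)(u) = \gamma_\emptyset^\eta$. now suppose $\phi(x) = u$ and $\phi(y) = v$ for $x, y \in G_1$, and let $z = xy$. then
1.
$$\begin{aligned}
\phi(\gamma_{G_1}^\eta)(uv) &= \bigcap \{\gamma_{G_1}^\eta(z) : z \in G_1, \phi(z) = uv\} \\
&= \bigcap \{\gamma_{G_1}^\eta(xy) : x, y \in G_1, \phi(xy) = uv\} \\
&\supseteq \bigcap \{\gamma_{G_1}^\eta(x) \cap \gamma_{G_1}^\eta(y) : x, y \in G_1, \phi(x) = u, \phi(y) = v\} \\
&= \left(\bigcap \{\gamma_{G_1}^\eta(x) : x \in G_1, \phi(x) = u\}\right) \cap \left(\bigcap \{\gamma_{G_1}^\eta(y) : y \in G_1, \phi(y) = v\}\right) \\
&= \phi(\gamma_{G_1}^\eta)(u) \cap \phi(\gamma_{G_1}^\eta)(v).
\end{aligned}$$
2.
$$\begin{aligned}
\phi(\gamma_{G_1}^\eta)(u) &= \bigcap \{\gamma_{G_1}^\eta(x) : x \in G_1, \phi(x) = u\} \\
&= \bigcap \{\gamma_{G_1}^\eta(x^{-1}) : x \in G_1, \phi(x^{-1}) = u^{-1}\} \\
&= \phi(\gamma_{G_1}^\eta)(u^{-1}).
\end{aligned}$$
therefore, $\phi(\Gamma_{G_1}^\eta)$ is an $\eta$-intuitionistic fuzzy soft group over $U$. ∎

theorem 6. let $\Gamma_{G_2}^\eta$ be an $\eta$-intuitionistic fuzzy soft group over the universe $U$ and let $\phi: G_1 \to G_2$ be a group homomorphism. then $\phi^{-1}(\Gamma_{G_2}^\eta)$ is an $\eta$-intuitionistic fuzzy soft group.

proof. let $G_1$ and $G_2$ be two groups, $\phi: G_1 \to G_2$ a group homomorphism, and $\Gamma_{G_2}^\eta$ an $\eta$-intuitionistic fuzzy soft group over $U$. for all $x, y \in G_1$ we have
1. $\phi^{-1}(\gamma_{G_2}^\eta)(xy) = \gamma_{G_2}^\eta(\phi(xy)) = \gamma_{G_2}^\eta(\phi(x)\phi(y)) \supseteq \gamma_{G_2}^\eta(\phi(x)) \cap \gamma_{G_2}^\eta(\phi(y)) = \phi^{-1}(\gamma_{G_2}^\eta)(x) \cap \phi^{-1}(\gamma_{G_2}^\eta)(y)$.
2.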
$\phi^{-1}(\gamma_{G_2}^\eta)(x^{-1}) = \gamma_{G_2}^\eta(\phi(x^{-1})) = \gamma_{G_2}^\eta((\phi(x))^{-1}) = \gamma_{G_2}^\eta(\phi(x)) = \phi^{-1}(\gamma_{G_2}^\eta)(x)$.
therefore, $\phi^{-1}(\Gamma_{G_2}^\eta)$ is an $\eta$-intuitionistic fuzzy soft group over $U$. ∎

based on theorem 5 and theorem 6, the image and pre-image of an $\eta$-intuitionistic fuzzy soft group under a group homomorphism are also $\eta$-intuitionistic fuzzy soft groups.

conclusions

based on the results, we conclude that the $\eta$-intuitionistic fuzzy soft group depends on the intuitionistic fuzzy soft group. it is proved that every intuitionistic fuzzy soft group yields an $\eta$-intuitionistic fuzzy soft group. the definition of the $\eta$-intuitionistic fuzzy soft subgroup and its properties are presented. the homomorphism results show that the image and pre-image of an $\eta$-intuitionistic fuzzy soft group are also $\eta$-intuitionistic fuzzy soft groups. for future research, we suggest developing the notion of the $\eta$-intuitionistic fuzzy soft ring along with its properties.

references

[1] l. a. zadeh, “fuzzy sets,” inf. control, vol. 8, no. 3, pp. 338–353, 1965, doi: 10.1016/s0019-9958(65)90241-x.
[2] k. t. atanassov, “intuitionistic fuzzy sets,” fuzzy sets syst., vol. 20, no. 1, pp. 87–96, 1986, doi: 10.1016/s0165-0114(86)80034-3.
[3] a. rosenfeld, “fuzzy groups,” j. math. anal. appl., vol. 35, no. 3, pp. 512–517, 1971, doi: 10.1016/0022-247x(71)90199-5.
[4] r. biswas, “intuitionistic fuzzy subgroups,” math. forum, vol. 10, pp. 37–46, 1989.
[5] n. palaniappan, s. naganathan, and k. arjunan, “a study on intuitionistic l-fuzzy subgroups,” appl. math. sci., vol. 3, no. 53, pp. 2619–2624, 2009.
[6] x. yuan, h. li, and e. lee, “on the definition of the intuitionistic fuzzy subgroups,” comput. math. with appl., vol. 59, pp. 3117–3129, 2010.
[7] p. k. sharma, “(α,β)-cut of intuitionistic fuzzy subgroups,” int. math. forum, vol. 6, no. 53, pp. 2605–2614, 2011.
[8] p. k. sharma, “t-intuitionistic fuzzy subgroups,” int. j. fuzzy math. syst., vol. 2, no. 3, pp. 233–243, 2012.
[9] n. doda and p. k. sharma, “counting the number of intuitionistic fuzzy subgroups of finite abelian groups of different order,” notes intuitionistic fuzzy sets, vol. 19, no. 4, pp. 42–47, 2013.
[10] w. zhou and z. xu, “extended intuitionistic fuzzy sets based on the hesitant fuzzy membership and their application in decision making with risk preference,” int. j. intell. syst., vol. 00, pp. 1–27, 2017, doi: 10.1002/int.
[11] s. sun and c. liu, “(λ,μ)-intuitionistic fuzzy subgroups of groups with operators,” int. j. math. comput. sci., vol. 10, no. 9, pp. 456–462, 2016.
[12] m. gulzar, d. alghazzawi, m. h. mateen, and n. kausar, “a certain class of t-intuitionistic fuzzy subgroups,” ieee access, vol. 8, pp. 163260–163268, 2020, doi: 10.1109/access.2020.3020366.
[13] l. latif, u. shuaib, h. alolaiyan, and a. razaq, “on fundamental theorems of t-intuitionistic fuzzy isomorphism of t-intuitionistic fuzzy subgroups,” ieee access, vol. 6, pp. 2169–3536, 2018.
[14] u. shuaib, m. amin, s. dilbar, and f. tahir, “on algebraic attributes of ξ-intuitionistic fuzzy subgroups,” int. j. math. comput. sci., vol. 15, no. 1, pp. 395–411, 2020.
[15] u. shuaib, h. alolaiyan, a. razaq, d. saba, and f. tahir, “on some algebraic aspects of η-intuitionistic fuzzy subgroups,” j. taibah univ. sci., vol. 14, no. 1, pp. 463–469, 2020, doi: 10.1080/16583655.2020.1745491.
[16] h. aktaş and n. çaǧman, “soft sets and soft groups,” inf. sci. (ny)., vol. 177, no. 13, pp. 2726–2735, 2007, doi: 10.1016/j.ins.2006.12.008.
[17] p. maji, a. roy, and r. biswas, “fuzzy soft sets,” j. fuzzy math., vol. 9, no. 3, pp. 589–602, 2001.
[18] p. maji, r. biswas, and a. roy, “intuitionistic fuzzy soft set,” j. fuzzy math., vol. 9, no. 3, pp. 677–692, 2001.
[19] y. yin, h. li, and y. bae, “on algebraic structure of intuitionistic fuzzy soft sets,” comput. math. with appl., vol. 64, no. 9, pp. 2896–2911, 2012, doi: 10.1016/j.camwa.2012.05.004.
[20] a. aygünoglu and h. aygun, “introduction to fuzzy soft groups,” comput. math. with appl., vol. 58, pp. 1279–1286, 2009, doi: 10.1016/j.camwa.2009.07.047.
[21] j. zhou, “intuitionistic fuzzy soft semigroups,” math. aeterna, vol. 1, no. 03, pp. 173–183, 2011.
[22] m. akram and n. yaqoob, “intuitionistic fuzzy soft ordered ternary semigroups,” int. j. pure appl. math., vol. 84, no. 2, pp. 93–107, 2013, doi: 10.12732/ijpam.v84i2.8.
[23] n. cagman, f. citak, and h. aktas, “soft int-group and its applications to group theory,” neural comput applic, vol. 21, no. 1, pp. s151–s158, 2012, doi: 10.1007/s00521-011-0752-x.
[24] f. karaaslan, k. kaygisiz, and n. cagman, “on intuitionistic fuzzy soft groups,” j. new results sci., vol. 2, no. 3, pp. 72–86, 2013.

average based-fts markov chain with modifications to the frequency density partition to predict covid-19 in central java

cauchy-jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 231-239
p-issn: 2086-0382; e-issn: 2477-3344
submitted: september 18, 2021; reviewed: december 08, 2021; accepted: december 21, 2021
doi: http://dx.doi.org/10.18860/ca.v7i1.13371

susilo hariyanto*, zaenurrohman, titi udjiani srrm
department of mathematics, faculty science and mathematics, diponegoro university, indonesia
*corresponding author
email: susilohariyanto@lecturer.undip.ac.id*, zaenurrohman8.zr@gmail.com, udjianititi@yahoo.com

abstract

new positive covid-19 cases are discovered in central java every day. many studies have used various methodologies to forecast new positive cases, and the fuzzy time series (fts) approach is one of them. many fts variants have been developed, including the fts markov chain. the interval length in fts must be determined carefully because it affects the fuzzy logical relationships (flr) that are used to compute the forecast value.
the average-based method can be used to determine the optimum interval length; however, other studies use frequency density partitioning to determine the optimal interval length in order to produce better forecast values. the goal of this research is to improve forecasting accuracy by modifying the frequency density partition in the average based-fts markov chain. the approach is average-based: the interval length is determined by the average-based method, the forecast value is determined by the fts markov chain, and the frequency density partition is modified to provide the ideal intervals. according to the findings of this study, the average-based fts markov chain approach with a modified frequency density partition achieves an accuracy rate of 89.3 percent. because the modified frequency density partition produces a good level of accuracy in forecasting new positive cases of covid-19 in central java, it is hoped that this modification of the average-based fts markov chain can be used as a forecasting model in fields other than new positive covid-19 cases.

keywords: average based; fts markov chain; modified frequency density partitioning; covid-19; mape

introduction

covid-19 first appeared in the city of wuhan, hubei province, china, and then spread almost all over the world, including indonesia. at the beginning of 2020, indonesia experienced a covid-19 pandemic, and new positive cases are still being found to this day [1]. the government is still thinking about how to make indonesia free from covid-19. many experts have estimated the number of new covid-19 positive cases in indonesia. forecasting is the process of predicting what will happen in the future over a lengthy period of time [2].
however, no approach for accurately forecasting anything, including new covid-19 positive cases, has been developed yet. the fuzzy time series (fts) is one of the forecasting approaches for determining the number of new positive cases in indonesia. fts forecasting uses the concept of fuzzy logic. song and chissom introduced the fts in 1993 [3], and it has since been widely developed, including the markov method [4], chen's method [5], chen and hsu's method [6], the weighted method [7], the multiple-attribute fuzzy time series method [8], the percentage change method [9], and the markov chain method [10]. ruey chyn tsaur developed a fuzzy time series by merging the fuzzy time series method with the markov chain concept [10]. a markov chain is a stochastic process in which future events depend only on today's events. the markov chain is used in the defuzzification process [11].

there is no definite formula for the interval length in a fuzzy time series; it is chosen by the researcher. as a result, even if researchers use the same data, the interval length will vary [3], even though the interval length strongly influences the formation of the fuzzy logical relationships (flr) [12]. one method that can be used to determine the interval length is the average-based method, which derives the interval length from the average of the differences in the data [12].

chen and hsu in 2004 also developed a fuzzy time series, by repartitioning the intervals based on frequency density.
chen and hsu's frequency density repartitioning algorithm divides the interval with the highest frequency density into four sub-intervals, the interval with the second highest density into three sub-intervals, the interval with the third highest density into two sub-intervals, and the interval with the lowest density into one sub-interval [6][13].

based on the description above, the researchers are interested in modifying the frequency density partitioning algorithm used by chen and hsu by exchanging the partitions of the densest and the third densest intervals. the densest interval, originally partitioned into 4 sub-intervals, is instead partitioned into 2 sub-intervals, and the third densest interval, originally partitioned into 2 sub-intervals, is instead partitioned into 4 sub-intervals. furthermore, the researchers use the average-based method to determine the interval length in the markov chain fuzzy time series and apply it to forecasting new positive cases of covid-19 in central java.

methods

fuzzy time series

the fuzzy time series was first defined by song and chissom (1993). let $U$ be the universe of discourse, with $U = \{u_1, u_2, \ldots, u_n\}$. a fuzzy set $A_i$ on $U$ is defined as [3]:
$$A_i = \frac{f_A(u_1)}{u_1} + \frac{f_A(u_2)}{u_2} + \cdots + \frac{f_A(u_n)}{u_n} \tag{1}$$
where $f_A$ is the membership function of the fuzzy set $A_i$, $u_k$ is an element of the fuzzy set $A_i$, and $f_A(u_k)$ is the degree of membership of $u_k$ in $A_i$, for $k = 1, 2, 3, \ldots, n$.

definition. if $F(t)$ is caused by $F(t - 1)$, then the relation in the first-order model of $F(t)$ can be stated as follows [5]:
$$F(t) = F(t - 1) \circ R(t, t - 1) \tag{2}$$
where "$\circ$" is the max-min composition operator, and $R(t, t - 1)$ is a relation matrix that describes the fuzzy relationship between $F(t - 1)$ and $F(t)$.
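the max-min composition in equation (2) can be illustrated with a short python sketch. it is a minimal example under stated assumptions, not the authors' implementation: the function name `max_min_composition` and all numeric membership values are hypothetical, chosen only to show that entry $j$ of $F(t)$ is the maximum over $i$ of $\min(F(t-1)_i, R_{ij})$.

```python
def max_min_composition(f_prev, relation):
    """Return F(t) = F(t-1) o R, where 'o' is the max-min composition.

    f_prev   : membership degrees of F(t-1), one per fuzzy set (hypothetical)
    relation : relation matrix R(t, t-1), given as a list of rows
    """
    n_cols = len(relation[0])
    return [
        # For column j: take min(f_prev[i], R[i][j]) over all i, then the max.
        max(min(f_prev[i], relation[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]


# Hypothetical example with three fuzzy sets.
f_prev = [0.5, 1.0, 0.0]
relation = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
f_next = max_min_composition(f_prev, relation)
print(f_next)
```

in this toy case the second fuzzy set has full membership at time $t - 1$, so the second row of the relation matrix dominates the result.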
average-based

the average-based method is an algorithm for setting the interval length, which is determined at the initial stage of fuzzy time series forecasting. the steps of the average-based algorithm are as follows [12], [14]:
a. determine the absolute difference (lag) between data $n + 1$ and data $n$ with the formula:
$$lag_{D_n} = |D_{n+1} - D_n| \tag{3}$$
b. determine the length of the interval:
$$\text{length of interval} = \left(\frac{\text{total lag}}{\text{number of data}}\right) : 2 \tag{4}$$
c. determine the basis value of the interval length according to table 1.

table 1. basis mapping table
range          basis
0.1 - 1.0      0.1
1.1 - 10       1
11 - 100       10
101 - 1000     100

d. round the length of the interval according to the basis in table 1.

modified frequency density partition

in this study, a modified frequency density partition is used. the algorithm is as follows:
a. the interval with the densest frequency is divided into 2 sub-intervals.
b. the interval with the second densest frequency is divided into 3 sub-intervals.
c. the interval with the third densest frequency is divided into 4 sub-intervals.
d. intervals that have no frequency are eliminated.

fuzzy time series markov chain

the markov chain fuzzy time series forecasting procedure is as follows [10]:

step 1. collect the historical data ($Y_t$).

step 2. define the universe of discourse $U$ of the data, with $D_1$ and $D_2$ being suitable positive numbers:
$$U = [D_{min} - D_1, D_{max} + D_2] \tag{5}$$

step 3. specify the number of fuzzy intervals.

step 4. define the fuzzy sets on the universe of discourse $U$. each fuzzy set $A_i$, $1 \le i \le n$, represents a linguistic value of the observed data.

step 5. fuzzify the historical data. if a time series value falls in the interval $u_i$, then it is fuzzified into $A_i$.

step 6. determine the fuzzy logical relationships (flr) and the fuzzy logical relationship groups (flrg).

step 7. calculate the forecasting results. for time series data, the flrg gives the probability of moving from one state to the next. to calculate the forecast value, a markov transition probability matrix of dimension $n \times n$ is used. if state $A_i$ makes a transition to state $A_j$ and passes through state $A_k$, $i, j = 1, 2, \ldots, n$, then we can obtain the flrg. the transition probability formula is as follows:
$$P_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, n \tag{6}$$
with:
$P_{ij}$ = probability of a one-step transition from state $A_i$ to state $A_j$
$M_{ij}$ = number of one-step transitions from state $A_i$ to state $A_j$
$M_i$ = the amount of data included in state $A_i$

the probability matrix $R$ of all states can be written as follows:
$$R = \begin{bmatrix} P_{11} & \cdots & P_{1n} \\ \vdots & \ddots & \vdots \\ P_{n1} & \cdots & P_{nn} \end{bmatrix} \tag{7}$$
the matrix $R$ reflects the transitions of the entire system. if $F(t - 1) = A_i$, then the process is in state $A_i$ at time $t - 1$, and the forecasting result $F(t)$ is calculated using the row $[P_{i1}, P_{i2}, \ldots, P_{in}]$ of the matrix $R$. the forecasting result $F(t)$ is the weighted average of $m_1, m_2, \ldots, m_n$ (the midpoints of $u_1, u_2, \ldots, u_n$). the forecast output value $F(t)$ is determined by the following rules:

rule 1: if the fuzzy logical relationship group of $A_i$ is one-to-one (suppose $A_i \to A_k$, where $P_{ik} = 1$ and $P_{ij} = 0$ for $j \neq k$), then the forecast value $F(t)$ is $m_k$, the midpoint of $u_k$:
$$F(t) = m_k P_{ik} = m_k \tag{8}$$

rule 2: if the flrg of $A_j$ is one-to-many (e.g. $A_j \to A_1, A_2, \ldots, A_n$, $j = 1, 2, \ldots, n$), and $Y(t - 1)$ at time $t - 1$ is in state $A_j$, then the forecast $F(t)$ is:
$$F(t) = m_1 P_{j1} + m_2 P_{j2} + \cdots + m_{j-1} P_{j(j-1)} + Y(t - 1) P_{jj} + m_{j+1} P_{j(j+1)} + \cdots + m_n P_{jn} \tag{9}$$
where $m_1, m_2, \ldots, m_{j-1}, m_{j+1}, \ldots, m_n$ are the midpoints of $u_1, u_2, \ldots, u_{j-1}, u_{j+1}, \ldots, u_n$, and $Y(t - 1)$ is the value of state $A_j$ at time $t - 1$.

rule 3: if the flrg of $A_i$ is empty ($A_i \to \emptyset$), the forecast value $F(t)$ is $m_i$, the midpoint of $u_i$:
$$F(t) = m_i \tag{10}$$

step 8. adjust the trend of the forecast values with the following rules:
- if state $A_i$ communicates with $A_i$, the process starts from state $A_i$ at time $t - 1$, expressed as $F(t - 1) = A_i$, and makes an increasing transition to state $A_j$ at time $t$, where $i < j$, then the adjustment value is:
$$D_{t1} = \left(\frac{l}{2}\right) \tag{11}$$
where $l$ is the basis interval.
- if state $A_i$ communicates with $A_i$, the process starts from state $A_i$ at time $t - 1$, expressed as $F(t - 1) = A_i$, and makes a decreasing transition to state $A_j$ at time $t$, where $i > j$, then the adjustment value is:
$$D_{t1} = -\left(\frac{l}{2}\right) \tag{12}$$
- if the process is in state $A_i$ at time $t - 1$, expressed as $F(t - 1) = A_i$, and makes a forward jump transition to state $A_{i+s}$ at time $t$, where $1 \le s \le n - i$, then the adjustment value is:
$$D_{t2} = \left(\frac{l}{2}\right) s \tag{13}$$
where $s$ is the number of forward jumps.
- if the process is in state $A_i$ at time $t - 1$, expressed as $F(t - 1) = A_i$, and makes a backward jump transition to state $A_{i-v}$ at time $t$, where $1 \le v \le i$, then the adjustment value is:
$$D_{t2} = -\left(\frac{l}{2}\right) v \tag{14}$$
where $v$ is the number of backward jumps.

step 9. determine the final forecast value based on the trend adjustment. if the flrg of $A_i$ is one-to-many and state $A_{i+1}$ can be accessed from state $A_i$, where $A_i$ communicates with $A_i$, then the forecasting result becomes $F'(t) = F(t) + D_{t1} + D_{t2} = F(t) + \left(\frac{l}{2}\right) + \left(\frac{l}{2}\right) = F(t) + l$. if the flrg of $A_i$ is one-to-many and state $A_{i+1}$ can be accessed from $A_i$, where $A_i$ does not communicate with $A_i$, then the forecasting result becomes $F'(t) = F(t) + D_{t2} = F(t) + \left(\frac{l}{2}\right)$. if the flrg of $A_i$ is one-to-many and state $A_{i-2}$ can be accessed from state $A_i$, where $A_i$ does not communicate with $A_i$, then the forecasting result is $F'(t) = F(t) + D_{t2} = F(t) - \left(\frac{l}{2}\right) \times 2 = F(t) - l$.
if 𝑣 is jumpxstep, theggeneralmform ofpthe forecast is: 𝐹’(𝑡) = 𝐹(𝑡) ± 𝐷𝑡1 ± 𝐷𝑡2 = 𝐹(𝑡) ± ( 𝑙 2 ) ± ( 𝑙 2 ) 𝑣. (15) forecasting error measurement the reliability of a forecast can be determined by looking at mean average percentage error (mape), this mape formulas[15]: 𝑀𝐴𝑃𝐸 = 1 𝑛 ∑ |𝑌(𝑡) − 𝐹′(𝑡)| 𝑌(𝑡) 𝑥 100% 𝑛 𝑡=1 (16) with 𝑌𝑡 : actual data period 𝑡, 𝐹′𝑡 : 𝑡 period forecasting value, and 𝑛 : the predictable amount of data. results and discussion forecasting with an average based-fts markov chain with modified frequency density partitioning, the first step is to collect covid-19 in the central java period june 25, 2021 until august 20, 2021 as a universe discourse (u). next, determine the greatest value (𝐷𝑚𝑎𝑥 = 5655) and smallest values (𝐷𝑚𝑖𝑛 = 1428), and the value of d1 = 8 and d2 = 5, so it can be defined 𝑈 = [1428 − 8, 5655 + 5] = [1420, 5660]. then, calculate the absolute difference from historical data, the average absolute difference from 57 data points is 658,554, which is then divided by 2 to yield 329,277. the value of 329,277 is then determined using table 1. the basis of the length of the interval is 100, so that u can be partitioned into the same interval length, namely 𝑢1, 𝑢2, 𝑢3, 𝑢4, 𝑢5, … , 𝑢39, 𝑢40, 𝑢41, 𝑢42, 𝑢43 successively the value for each interval is average based-fts markov chain with modifications to the frequency density partition to predict covid-19 in central java susilo hariyanto 236 table 2. universe discourse of new positive cases of covid-19 𝑢1 = [1420, 1520] ⋮ 𝑢2 = [1520, 1620] 𝑢41 = [5420, 5520] 𝑢3 = [1620, 1720] 𝑢42 = [5520,5620] 𝑢4 = [1720, 1820] 𝑢43 = [5620, 5720] the next step is to distribute the data to each interval and determine the frequency density, resulting in the densest interval, which is then repartitioned using a modified method. the following outcomes were achieved: table 3. 
frequency density and repartition interval

| interval | frequency | redivided interval |
|---|---|---|
| $u_{29} = [4220, 4320]$ | 5 | $u_{29,1} = [4220, 4270]$, $u_{29,2} = [4270, 4320]$ |
| $u_{28} = [4120, 4220]$ | 4 | $u_{28,1} = [4120, 4153.33]$, $u_{28,2} = [4153.33, 4186.67]$, $u_{28,3} = [4186.67, 4220]$ |
| $u_{16} = [2920, 3020]$ | 3 | $u_{16,1} = [2920, 2945]$, $u_{16,2} = [2945, 2970]$, $u_{16,3} = [2970, 2995]$, $u_{16,4} = [2995, 3020]$ |
| $u_{17} = [3020, 3120]$ | 3 | $u_{17,1} = [3020, 3045]$, $u_{17,2} = [3045, 3070]$, $u_{17,3} = [3070, 3095]$, $u_{17,4} = [3095, 3120]$ |
| $u_{19} = [3220, 3320]$ | 3 | $u_{19,1} = [3220, 3245]$, $u_{19,2} = [3245, 3270]$, $u_{19,3} = [3270, 3295]$, $u_{19,4} = [3295, 3320]$ |
| $u_{27} = [4020, 4120]$ | 3 | $u_{27,1} = [4020, 4045]$, $u_{27,2} = [4045, 4070]$, $u_{27,3} = [4070, 4095]$, $u_{27,4} = [4095, 4120]$ |
| $u_{32} = [4520, 4620]$ | 3 | $u_{32,1} = [4520, 4545]$, $u_{32,2} = [4545, 4570]$, $u_{32,3} = [4570, 4590]$, $u_{32,4} = [4590, 4620]$ |
| $u_{33} = [4620, 4720]$ | 3 | $u_{33,1} = [4620, 4645]$, $u_{33,2} = [4645, 4670]$, $u_{33,3} = [4670, 4695]$, $u_{33,4} = [4695, 4720]$ |
| $u_2, u_3, u_4, u_5, u_6, u_8, u_{11}, u_{14}, u_{20}, u_{23}, u_{26}, u_{34}, u_{42}$ | 0 | removed |

next, the middle value ($m_i$) of each interval is determined:

table 4. middle value

| $u_i$ | $m_i$ |
|---|---|
| $u_1$ | 1470 |
| $u_7$ | 2070 |
| $u_9$ | 2270 |
| $u_{10}$ | 2370 |
| $u_{12}$ | 2570 |
| ⋮ | ⋮ |
| $u_{39}$ | 5270 |
| $u_{40}$ | 5370 |
| $u_{41}$ | 5470 |
| $u_{43}$ | 5670 |

furthermore, the fuzzy sets are defined. removing the 13 empty intervals from the 43 original intervals leaves 30, and the 8 redivided intervals are replaced by 29 subintervals, so 51 fuzzy sets can be formed from the universe of discourse.
the fuzzy sets formed are as follows:

$$A_1 = \{1/u_1 + 0.5/u_7 + 0/u_9 + 0/u_{10} + \cdots + 0/u_{40} + 0/u_{41} + 0/u_{43}\}$$
$$A_2 = \{0.5/u_1 + 1/u_7 + 0.5/u_9 + 0/u_{10} + \cdots + 0/u_{40} + 0/u_{41} + 0/u_{43}\}$$
$$A_3 = \{0/u_1 + 0.5/u_7 + 1/u_9 + 0.5/u_{10} + \cdots + 0/u_{40} + 0/u_{41} + 0/u_{43}\}$$
$$\vdots$$
$$A_{49} = \{0/u_1 + 0/u_7 + 0/u_9 + 0/u_{10} + \cdots + 1/u_{40} + 0.5/u_{41} + 0/u_{43}\}$$
$$A_{50} = \{0/u_1 + 0/u_7 + 0/u_9 + 0/u_{10} + \cdots + 0.5/u_{40} + 1/u_{41} + 0.5/u_{43}\}$$
$$A_{51} = \{0/u_1 + 0/u_7 + 0/u_9 + 0/u_{10} + \cdots + 0/u_{40} + 0.5/u_{41} + 1/u_{43}\}$$

the next step is fuzzification. the fuzzification results are presented in the following table:

table 5. fuzzification results

| t | actual data | fuzzy data |
|---|---|---|
| 1 | 2311 | $A_3$ |
| 2 | 2064 | $A_2$ |
| 3 | 3079 | $A_{14}$ |
| 4 | 2702 | $A_6$ |
| 5 | 2932 | $A_8$ |
| ⋮ | ⋮ | ⋮ |
| 54 | 3263 | $A_{18}$ |
| 55 | 3078 | $A_{14}$ |
| 56 | 1428 | $A_1$ |
| 57 | 1432 | $A_1$ |

next, the flr and flrg are determined, as shown in table 6 and table 7:

table 6. flr

| data order | flr |
|---|---|
| 1-2 | $A_3 \to A_2$ |
| 2-3 | $A_2 \to A_{14}$ |
| 3-4 | $A_{14} \to A_6$ |
| 4-5 | $A_6 \to A_8$ |
| ⋮ | ⋮ |
| 54-55 | $A_{18} \to A_{14}$ |
| 55-56 | $A_{14} \to A_1$ |
| 56-57 | $A_1 \to A_1$ |

table 7. flrg

| current state | next state |
|---|---|
| $A_1$ | $(1)A_1$ |
| $A_2$ | $(1)A_{14}$ |
| $A_3$ | $(1)A_2$ |
| $A_4$ | $(1)A_6$ |
| ⋮ | ⋮ |
| $A_{49}$ | $(1)A_{35}$, $(1)A_{42}$ |
| $A_{50}$ | $(1)A_{48}$ |
| $A_{51}$ | $(1)A_{43}$ |

the initial forecast is calculated next. for example, for $t = 2$ (june 26, 2021), the forecast computation based on formulas (8), (9), and (10) is $F(2) = m_k P_{ik} = m_k = 2070$. the summary of the initial forecasting results is as follows:

table 8. initial forecasting results ($F(t)$)

| period | actual data | $F(t)$ |
|---|---|---|
| 6/25/21 | 2311 | na |
| 6/26/21 | 2064 | 2070 |
| 6/27/21 | 3078 | 3082.5 |
| ⋮ | ⋮ | ⋮ |
| 8/19/21 | 1428 | 2070 |
| 8/20/21 | 1432 | 1428 |

after the initial forecast is obtained, the next step is to adjust the forecasting trend.
for example, for the adjustment value for june 26, 2021, the next state is $A_2$ and the current state is $A_3$, so the adjustment calculation uses the forecast adjustment rule (14) and we get $D_{t2} = -\frac{l}{2}\, v = -\frac{100}{2} \times 1 = -50$. the other forecast adjustment values are calculated using equations (11), (12), (13), and (14). the following is the forecast adjustment table.

table 8. forecasting trend adjustment value

| period | flr | $D_{t2}$ |
|---|---|---|
| 6/25/21 | $A_3 \to A_2$ | na |
| 6/26/21 | $A_2 \to A_{14}$ | -50 |
| 6/27/21 | $A_{14} \to A_9$ | 600 |
| ⋮ | ⋮ | ⋮ |
| 8/19/21 | $A_{14} \to A_1$ | -650 |
| 8/20/21 | $A_1 \to A_1$ | 0 |

the final forecast value is then calculated. the final forecast is the sum of the initial forecast value and the forecast adjustment value, following equation (15). for example, the final forecast value for the june 26, 2021 data is $F'(2) = F(2) + D_{t2} = 2070 + (-50) = 2020$. proceeding in the same way, the summary of the final forecasting results is as follows:

table 9. final forecast value

| period | $Y(t)$ | $F'(t)$ |
|---|---|---|
| 6/25/21 | 2311 | na |
| 6/26/21 | 2064 | 2020 |
| 6/27/21 | 3078 | 3682.5 |
| ⋮ | ⋮ | ⋮ |
| 8/19/21 | 1428 | 1907.5 |
| 8/20/21 | 1432 | 1428 |

furthermore, the average based-fts markov chain with the modified frequency density partition is used to forecast the new positive cases on august 21, 2021. the current state is $A_1$, and from the flrg the next state of $A_1$ is $A_1$. based on equations (8), (9), and (10), the forecast is therefore 1 times the previous new positive case count $Y(t-1)$, which is 1432. the last step is to calculate the forecast accuracy using mape. the mape value of the average based-fts markov chain with modified frequency density partitioning is 10.7%. the forecasting results are presented in the following figure:

figure 1.
graph of forecasting results using average based-fts markov chain based on modified frequency density partitioning (line chart of actual data and forecasting value; vertical axis from 0 to 7000)

conclusions

forecasting new positive cases of covid-19 in central java using the average based-fts markov chain based on modified frequency density partitioning has a good level of accuracy, as shown by the mape value of 10.7%. the predicted value for august 21, 2021 is 1432 new positive cases of covid-19 in central java.

references

[1] d. handayani, “penyakit virus corona 2019,” j. respirologi indones., vol. 40, no. 2, 2020.
[2] d. c. montgomery, c. l. jennings, and m. kulahci, introduction to time series analysis and forecasting. wiley, 2015.
[3] q. song and b. s. chissom, “forecasting enrollments with fuzzy time series part i,” fuzzy sets syst., vol. 54, no. 1, pp. 1–9, 1993.
[4] j. sullivan and w. h. woodall, “a comparison of fuzzy forecasting and markov modeling,” fuzzy sets syst., vol. 64, no. 3, pp. 279–293, 1994.
[5] s. m. chen, “forecasting enrollments based on fuzzy time series,” fuzzy sets syst., vol. 81, no. 3, 1996.
[6] s. m. chen and c.-c. hsu, “a new method to forecast enrollments using fuzzy time series,” int. j. appl. sci. eng., vol. 2, no. 3, pp. 234–244, 2004.
[7] h. k. yu, “weighted fuzzy time series models for taiex forecasting,” phys. a stat. mech. its appl., vol. 349, no. 3–4, 2005.
[8] c. h. cheng, g. w. cheng, and j. w. wang, “multi-attribute fuzzy time series method based on fuzzy clustering,” expert syst. appl., vol. 34, no. 2, 2008.
[9] m. stevenson and j. porter, “fuzzy time series forecasting using percentage change as the universe of discourse,” change, vol. 1971, no. 3.89, 1972.
[10] r. c. tsaur, “a fuzzy time series-markov chain model with an application to forecast the exchange rate between the taiwan and us dollar,” int. j. innov. comput. inf. control, vol. 8, no. 7 b, 2012.
[11] y. a. r. langi, “penentuan klasifikasi state pada rantai markov dengan menggunakan nilai eigen dari matriks peluang transisi,” j. ilm. sains, vol. 11, no. 1, 2011.
[12] s. xihao and l. yimin, “average-based fuzzy time series models for forecasting shanghai compound index,” vol. 4, no. 2, pp. 104–111, 2008.
[13] t. a. jilani and s. m. a. burney, “a refined fuzzy time series model for stock market forecasting,” phys. a stat. mech. its appl., vol. 387, no. 12, 2008.
[14] j. noh, w. wijono, and e. yudaningtiyas, “model average based fts markov chain untuk peramalan penggunaan bandwidth jaringan komputer,” j. eeccis, vol. 9, no. 1, 2015.
[15] s. makridakis, s. c. wheelwright, and v. e. mcgee, metode dan aplikasi peramalan. binarupa aksara, 1999.

levi decomposition of frobenius lie algebra of dimension 6

cauchy – jurnal matematika murni dan aplikasi, volume 7(3) (2022), pages 394-400
p-issn: 2086-0382; e-issn: 2477-3344
submitted: april 03, 2022; reviewed: april 11, 2022; accepted: april 14, 2022
doi: http://dx.doi.org/10.18860/ca.v7i3.15656

henti*, edi kurniadi, ema carnia
department of mathematics of fmipa, universitas padjadjaran, indonesia
email: henti17001@mail.unpad.ac.id

abstract

in this paper, we study the frobenius lie algebra $M_{2,1}(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ of dimension 6. a finite dimensional lie algebra can be expressed as a decomposition into a levi subalgebra and the radical (maximal solvable ideal). this form of decomposition is called the levi decomposition. our main object is further denoted by $\mathfrak{aff}(2) := M_{2,1}(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$. the work aims to obtain the levi decomposition of the frobenius lie algebra $\mathfrak{aff}(2)$ of dimension 6.
to obtain the levi subalgebra and the radical, we apply results from the literature on lie algebras and the levi decomposition in dagli's work. the main result of this paper is that the frobenius lie algebra $\mathfrak{aff}(2)$ can be decomposed into a semisimple levi subalgebra $\mathfrak{h}$ of dimension 4 and a solvable radical $Rad(\mathfrak{g})$ of dimension 2. thus, the levi decomposition form of the frobenius lie algebra is given.

keywords: frobenius lie algebra; levi decomposition; lie algebra; radical

introduction

a vector space over a field equipped with a lie bracket, which is neither commutative nor associative, is called a lie algebra [1]. any finite dimensional lie algebra can be expressed as a semidirect sum of a levi subalgebra (a lie subalgebra) and its radical (maximal solvable ideal), and this form is called a levi decomposition [1]. we denote a finite dimensional lie algebra by $\mathfrak{g}$. in the finite dimensional case, the lie algebra $\mathfrak{g}$ can thus be written in the levi decomposition form

$$\mathfrak{g} = \mathfrak{h} \ltimes Rad(\mathfrak{g}) \qquad (1)$$

where $\mathfrak{h}$ is a levi subalgebra of $\mathfrak{g}$ and $Rad(\mathfrak{g})$ is the radical, or maximal solvable ideal, of $\mathfrak{g}$. let $S = \{e_1, e_2, \ldots, e_n\}$ be a basis of $\mathfrak{g}$ and define $C(\mathfrak{g}) = (C(\mathfrak{g})_{i,j})$ to be the matrix whose entries are the lie brackets of $\mathfrak{g}$ given by

$$C(\mathfrak{g})_{i,j} := [e_i, e_j]_{\mathfrak{g}}, \quad 1 \le i, j \le n. \qquad (2)$$

this matrix $C(\mathfrak{g}) \in Mat(n \times n, S(\mathfrak{g}))$ is called a structure matrix of $\mathfrak{g}$, where $S(\mathfrak{g})$ denotes the symmetric algebra of $\mathfrak{g}$ [2].

the notion of lie algebras has been widely studied. one example is the investigation of lie algebras of dimension 8, which can be carried out using the levi decomposition [3]. rais introduced the lie algebra $M_{n,p}(\mathbb{R}) \rtimes \mathfrak{gl}_n(\mathbb{R})$, where $M_{n,p}(\mathbb{R})$ is the vector space of matrices of size $n \times p$ with real entries and $\mathfrak{gl}_n(\mathbb{R})$ is the lie algebra of matrices of size $n \times n$ equipped with the lie bracket [4]. further notions of lie algebras can be found in [5] and [6].
let $\mathfrak{g}$ be a lie algebra and let $\mathfrak{g}^*$ be its dual vector space, consisting of all real-valued linear functionals on $\mathfrak{g}$. the lie algebra $\mathfrak{g}$ is said to be a frobenius lie algebra if there exists a linear functional $\varphi \in \mathfrak{g}^*$ such that the skew-symmetric bilinear form $B_\varphi(x, y) := \varphi([x, y])$ is non-degenerate. many studies of frobenius lie algebras have been carried out over the years. for instance, properties of principal elements of frobenius lie algebras have been studied, one of which is that a frobenius lie algebra cannot be unimodular [7]. an example of a frobenius lie algebra is the affine lie algebra $\mathfrak{aff}(2)$, which appears in the classification of frobenius lie algebras of dimension less than or equal to 6 [8]. kurniadi constructed frobenius lie algebras of dimension less than or equal to 6 from non-commutative nilpotent lie algebras of dimension less than or equal to 4 [9]. as another example, the lie algebra $M_{n,p}(\mathbb{R}) \rtimes \mathfrak{gl}_n(\mathbb{R})$ with $n = p = 3$ is a frobenius lie algebra of dimension 18 [10]. moreover, the lie algebra $M_3(\mathbb{R}) \rtimes \mathfrak{gl}_3(\mathbb{R})$ has a quasi-associative algebra structure [11]. the frobenius lie algebra $M_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ admits a left-symmetric algebra structure [12]. it has been proven that the affine lie algebra denoted by $\mathfrak{aff}(n) := \mathbb{R}^n \rtimes \mathfrak{gl}_n(\mathbb{R})$ is a frobenius lie algebra, where $\mathbb{R}^n$ is another form of $M_{n,1}(\mathbb{R})$ [13]. readers can study more about frobenius lie algebras in [14], [15], [16], and [17].

in this paper, we study the decomposition of the frobenius lie algebra for the special case $n = 2$ of the affine lie algebra $\mathfrak{aff}(n)$. the lie algebra $M_{2,1}(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ can be written more simply as $\mathbb{R}^2 \rtimes \mathfrak{gl}_2(\mathbb{R})$, and we denote it by $\mathfrak{aff}(2)$, which is known as the affine lie algebra.
more explicitly, the affine lie algebra $\mathfrak{aff}(2) := \mathbb{R}^2 \rtimes \mathfrak{gl}_2(\mathbb{R})$ can be expressed in the matrix form

$$\mathfrak{aff}(2) := \left\{ \begin{pmatrix} X & Y \\ 0 & 0 \end{pmatrix} ; X \in \mathfrak{gl}_2(\mathbb{R}), Y \in \mathbb{R}^2 \right\} \subseteq \mathfrak{gl}_3(\mathbb{R}) \qquad (3)$$

where $\mathfrak{gl}_3(\mathbb{R})$ is the lie algebra of $3 \times 3$ real matrices. the purpose of this research is to decompose this lie algebra into its levi subalgebra and radical.

methods

we used a literature study as the research method, especially the study of the frobenius lie algebra $\mathfrak{aff}(2)$ and of the levi decomposition of lie algebras in [18]. first, we are given the affine lie algebra $\mathfrak{aff}(2)$. we prove that the affine lie algebra $\mathfrak{aff}(2)$ is not solvable. then, we prove that the lie algebra $\mathfrak{aff}(2)$ can be decomposed into its levi subalgebra and radical. before going into the discussion, we introduce the theoretical foundations used in this study.

definition 1 [19] let $\mathfrak{g}$ be a vector space with a bilinear map $[\cdot, \cdot]: \mathfrak{g} \times \mathfrak{g} \ni (x, y) \mapsto [x, y] \in \mathfrak{g}$. the bilinear map $[\cdot, \cdot]$ is called a lie bracket for $\mathfrak{g}$ if the following conditions are satisfied:
1. $[x, y] = -[y, x]$ for all $x, y \in \mathfrak{g}$;
2. $[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0$ for all $x, y, z \in \mathfrak{g}$.
the vector space $\mathfrak{g}$ equipped with a lie bracket is called a lie algebra.

definition 2 [19] a linear subspace $\mathfrak{h}$ of $\mathfrak{g}$ is called a lie subalgebra if $[\mathfrak{h}, \mathfrak{h}] \subseteq \mathfrak{h}$, and we write $\mathfrak{h} < \mathfrak{g}$. if $[\mathfrak{g}, \mathfrak{h}] \subseteq \mathfrak{h}$, we call $\mathfrak{h}$ an ideal of $\mathfrak{g}$ and write $\mathfrak{h} \trianglelefteq \mathfrak{g}$.

definition 3 [19] let $\mathfrak{g}$ be a lie algebra. the derived series of $\mathfrak{g}$ is defined by

$$D^0(\mathfrak{g}) = \mathfrak{g} \quad \text{and} \quad D^n(\mathfrak{g}) = [D^{n-1}(\mathfrak{g}), D^{n-1}(\mathfrak{g})] \quad \forall n \in \mathbb{N}. \qquad (4)$$

the lie algebra $\mathfrak{g}$ is said to be solvable if there exists an $n \in \mathbb{N}$ with $D^n(\mathfrak{g}) = \{0\}$.

theorem 1 let $\mathfrak{g}$ be a lie algebra. then:
i. if $\mathfrak{g}$ is solvable, then the subalgebras and homomorphic images of $\mathfrak{g}$ are solvable.
ii. if $\mathfrak{h}$ is a solvable ideal of $\mathfrak{g}$ and $\mathfrak{g}/\mathfrak{h}$ is solvable, then $\mathfrak{g}$ is solvable.
iii. if $\mathfrak{h}$ and $\mathfrak{i}$ are solvable ideals of $\mathfrak{g}$, then $\mathfrak{h} + \mathfrak{i}$ is also a solvable ideal of $\mathfrak{g}$.

this theorem shows that the sum of all solvable ideals of a lie algebra is a solvable ideal.
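so, in every finite-dimensional lie algebra $\mathfrak{g}$, there exists a maximal solvable ideal. this ideal is called the radical of $\mathfrak{g}$ and is denoted by $Rad(\mathfrak{g})$.

theorem 2 [18] let $V$ be a vector space over a field and let $\mathfrak{g}$ be a subalgebra of $\mathfrak{gl}(V)$. then $\mathfrak{g}$ is solvable if $\operatorname{tr}(xy) = 0$ for all $x \in \mathfrak{g}$ and $y \in [\mathfrak{g}, \mathfrak{g}]$.

theorem 3 [18] let $\mathfrak{g}$ be a lie algebra over a field $\mathbb{F}$. then

$$Rad(\mathfrak{g}) = \{x \in \mathfrak{g} \mid \operatorname{Tr}(\operatorname{ad} x \cdot \operatorname{ad} y) = 0 \text{ for all } y \in [\mathfrak{g}, \mathfrak{g}]\}. \qquad (5)$$

definition 4 [19] let $\mathfrak{g}$ be a lie algebra. if its radical is trivial, i.e. $Rad(\mathfrak{g}) = \{0\}$, then $\mathfrak{g}$ is called semisimple. the lie algebra $\mathfrak{g}$ is said to be simple if it is not abelian and contains no ideal other than $\mathfrak{g}$ and $\{0\}$.

definition 5 [1] let $V$ be a vector space. a linear map $\rho: V \to V$ is said to be an endomorphism on $V$ if the following conditions are satisfied:
1. $\rho(x + y) = \rho(x) + \rho(y)$
2. $\rho(xy) = (\rho(x))y = x(\rho(y))$
for all $x, y \in V$. the set of all endomorphisms on $V$ is denoted by $End(V)$. furthermore, $End(V)$ equipped with the lie bracket $[x, y] = xy - yx$ for all $x, y \in End(V)$ is a lie algebra, called the general linear algebra and denoted by $\mathfrak{gl}(V)$.

definition 6 [19] let $\mathfrak{g}$ be a lie algebra and $x \in \mathfrak{g}$. the map $\operatorname{ad} x: \mathfrak{g} \to \mathfrak{g}$ defined by

$$\operatorname{ad} x : \mathfrak{g} \ni y \mapsto \operatorname{ad} x(y) = [x, y] \in \mathfrak{g} \qquad (6)$$

is a derivation. the map $\operatorname{ad}: \mathfrak{g} \to \mathfrak{gl}(\mathfrak{g})$ is called the adjoint representation. the representation of the lie algebra $\mathfrak{g}$ on the dual vector space $\mathfrak{g}^*$ is denoted by $\operatorname{ad}^*$ and is defined by

$$\langle \operatorname{ad}^*(x)\varphi, y \rangle = \langle \varphi, \operatorname{ad}(-x)y \rangle = \langle \varphi, [y, x] \rangle \qquad (7)$$

for $\varphi \in \mathfrak{g}^*$ and all $x, y \in \mathfrak{g}$. the stabilizer of the lie algebra $\mathfrak{g}$ at the point $\varphi \in \mathfrak{g}^*$ is given by

$$\mathfrak{g}^\varphi = \{x \in \mathfrak{g} \mid \operatorname{ad}^*(x)\varphi = 0\}. \qquad (8)$$

definition 7 [20] let $\mathfrak{g}$ be a lie algebra and $\mathfrak{g}^*$ its dual vector space. the lie algebra $\mathfrak{g}$ is said to be a frobenius lie algebra if there exists a linear functional $\varphi \in \mathfrak{g}^*$ such that the stabilizer of $\mathfrak{g}$ at $\varphi$ is equal to $\{0\}$.

furthermore, we briefly review some basic notions needed for the levi decomposition.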
we explain levi's theorem, which states that a finite dimensional lie algebra can be expressed as the semidirect sum of the levi subalgebra and the radical.

theorem 4 [18] let $\mathfrak{g}$ be a lie algebra that is not solvable. then $\mathfrak{g}/Rad(\mathfrak{g})$ is a semisimple lie algebra.

theorem 5 [18] let $\mathfrak{g}$ be a finite dimensional lie algebra. if $\mathfrak{g}$ is not solvable, then there is a semisimple subalgebra $\mathfrak{s}$ of $\mathfrak{g}$ such that

$$\mathfrak{g} = \mathfrak{s} \oplus Rad(\mathfrak{g}). \qquad (9)$$

in this decomposition, $\mathfrak{s} \cong \mathfrak{g}/Rad(\mathfrak{g})$ and we have the commutation relations

$$[\mathfrak{s}, \mathfrak{s}] = \mathfrak{s}, \quad [\mathfrak{s}, Rad(\mathfrak{g})] \subseteq Rad(\mathfrak{g}), \quad [Rad(\mathfrak{g}), Rad(\mathfrak{g})] \subseteq Rad(\mathfrak{g}). \qquad (10)$$

examples of levi decompositions can be found in [18]; one of them is as follows.

example 1 [18] let $\mathfrak{g}$ be the lie algebra spanned by

$$\left\{ x_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, x_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, x_3 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, x_4 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\} \qquad (11)$$

where the non-zero lie brackets are $[x_1, x_2] = 4x_1 - 4x_3 - 2x_4$, $[x_1, x_3] = x_1 - x_2 - x_4$, $[x_2, x_3] = -2x_1 + 2x_3 + 2x_4$, $[x_2, x_4] = -4x_1 + 4x_3 + 2x_4$, $[x_3, x_4] = -x_1 + x_2 + x_4$. the lie algebra $\mathfrak{g}$ can be expressed as $\mathfrak{g} = Rad(\mathfrak{g}) \rtimes \mathfrak{h}$, where the radical is

$$Rad(\mathfrak{g}) = span\left\{ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right\}$$

and the levi subalgebra $\mathfrak{h}$ is spanned by

$$\left\{ z_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, z_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, z_3 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\}.$$

results and discussion

in this section, let $\mathfrak{aff}(2)$ be the affine lie algebra, realized in the matrix form

$$\mathfrak{aff}(2) = \left\{ \begin{pmatrix} a & b & x \\ c & d & y \\ 0 & 0 & 0 \end{pmatrix} \,\middle|\, a, b, c, d, x, y \in \mathbb{R} \right\} \subseteq \mathfrak{gl}_3(\mathbb{R}). \qquad (12)$$

the standard basis for $\mathfrak{aff}(2)$ is

$$S = \left\{ x_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, x_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, x_3 = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, x_4 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, x_5 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, x_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \right\}. \qquad (13)$$

the lie bracket of the affine lie algebra $\mathfrak{aff}(2)$ is defined by $[a, b] = ab - ba$ for all $a, b \in \mathfrak{aff}(2)$, so that the non-zero lie brackets of $\mathfrak{aff}(2)$ are

$$[x_1, x_2] = x_2, \quad [x_1, x_3] = -x_3, \quad [x_1, x_5] = x_5, \quad [x_2, x_3] = x_1 - x_4, \quad [x_2, x_4] = x_2, \quad [x_2, x_6] = x_5, \quad [x_3, x_4] = -x_3, \quad [x_3, x_5] = x_6, \quad [x_4, x_6] = x_6. \qquad (14)$$

theorem 6 [8] let $\mathfrak{aff}(2)$ be the lie algebra of dimension 6 with the basis in equation (13), and let $\mathfrak{aff}(2)^*$ be its dual vector space. then there exists a linear functional $\varphi = x_2^* + x_6^* \in \mathfrak{aff}(2)^*$ such that $\mathfrak{g}^\varphi = \{0\}$. therefore, the affine lie algebra $\mathfrak{aff}(2)$ is frobenius.

the main results of this section are proposition 1 and proposition 2, which we prove below.

proposition 1. the affine lie algebra $\mathfrak{aff}(2)$ is not solvable.

proof. we have $D^1(\mathfrak{aff}(2)) = [D^0(\mathfrak{aff}(2)), D^0(\mathfrak{aff}(2))] = span\{x_1 - x_4, x_2, x_3, x_5, x_6\}$. next, we compute $D^2(\mathfrak{aff}(2))$ and obtain $D^2(\mathfrak{aff}(2)) = [D^1(\mathfrak{aff}(2)), D^1(\mathfrak{aff}(2))] = span\{x_1 - x_4, x_2, x_3, x_5, x_6\}$. since $D^2(\mathfrak{aff}(2)) = D^1(\mathfrak{aff}(2))$, the derived series is constant from here on, so there is no $n > 0$ with $D^n(\mathfrak{aff}(2)) = \{0\}$. thus, the affine lie algebra $\mathfrak{aff}(2)$ is not solvable. ∎

proposition 2. let $\mathfrak{aff}(2)$ be the frobenius affine lie algebra with basis $S = \{x_i\}_{i=1}^{6}$ whose non-zero brackets are given in equation (14). since the affine lie algebra $\mathfrak{aff}(2)$ is not solvable, there exist a semisimple lie subalgebra $\mathfrak{h} = span\{x_1, x_2 + x_5 + x_6, x_3 + x_5 + x_6, x_4\}$ of $\mathfrak{aff}(2)$ and the radical $Rad(\mathfrak{aff}(2)) = span\{x_5, x_6\}$ of $\mathfrak{aff}(2)$ such that

$$\mathfrak{aff}(2) = span\{x_5, x_6\} \rtimes span\{x_1, x_2 + x_5 + x_6, x_3 + x_5 + x_6, x_4\}. \qquad (15)$$

proof. firstly, the structure matrix of $\mathfrak{aff}(2)$ is

$$C(\mathfrak{aff}(2)) = \begin{pmatrix} 0 & x_2 & -x_3 & 0 & x_5 & 0 \\ -x_2 & 0 & x_1 - x_4 & x_2 & 0 & x_5 \\ x_3 & x_4 - x_1 & 0 & -x_3 & x_6 & 0 \\ 0 & -x_2 & x_3 & 0 & 0 & x_6 \\ -x_5 & 0 & -x_6 & 0 & 0 & 0 \\ 0 & -x_5 & 0 & -x_6 & 0 & 0 \end{pmatrix}. \qquad (16)$$

then, we find a maximal linearly independent set in the structure matrix, which gives the basis $B = \{y_1 = x_1 - x_4, y_2 = x_2, y_3 = x_3, y_4 = x_5, y_5 = x_6\}$ of the product space $[\mathfrak{aff}(2), \mathfrak{aff}(2)]$.
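the bracket relations (14), the frobenius functional $\varphi = x_2^* + x_6^*$ and the derived series used in proposition 1 can be checked numerically. the following numpy sketch is only an illustration under stated assumptions: the helper names (`unit_matrix`, `coords`, `span_basis`, `derived_algebra`) are hypothetical, and the rank tolerance is an arbitrary choice.

```python
import numpy as np
from itertools import combinations


def unit_matrix(i, j):
    """3x3 matrix with a single 1 in row i, column j (0-based)."""
    m = np.zeros((3, 3))
    m[i, j] = 1.0
    return m


# standard basis x1..x6 of aff(2) as in equation (13)
basis = [unit_matrix(0, 0), unit_matrix(0, 1), unit_matrix(1, 0),
         unit_matrix(1, 1), unit_matrix(0, 2), unit_matrix(1, 2)]


def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a


def coords(m):
    """Coordinates of an element of aff(2) with respect to x1..x6."""
    return np.array([m[0, 0], m[0, 1], m[1, 0], m[1, 1], m[0, 2], m[1, 2]])


def span_basis(vectors, tol=1e-10):
    """Orthonormal basis (as rows) of the span of the given coordinate vectors."""
    if len(vectors) == 0:
        return np.zeros((0, 6))
    _, s, vt = np.linalg.svd(np.array(vectors, dtype=float))
    return vt[: int((s > tol).sum())]


def derived_algebra(rows):
    """Basis of [V, V], where V is spanned by the coordinate rows."""
    mats = [sum(c * b for c, b in zip(row, basis)) for row in rows]
    return span_basis([coords(bracket(a, b)) for a, b in combinations(mats, 2)])


# count the non-zero brackets [x_i, x_j] with i < j; equation (14) lists nine
nonzero_brackets = sum(
    1 for a, b in combinations(basis, 2) if np.any(bracket(a, b) != 0)
)

# dimensions of D^0, D^1, D^2 of the derived series
d0 = np.eye(6)
d1 = derived_algebra(d0)
d2 = derived_algebra(d1)
derived_dims = [len(d0), len(d1), len(d2)]

# matrix of the form B_phi(x_i, x_j) = phi([x_i, x_j]) for phi = x2* + x6*
phi = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 1.0])
gram = np.array([[phi @ coords(bracket(a, b)) for b in basis] for a in basis])
gram_determinant = np.linalg.det(gram)
```

a non-zero `gram_determinant` means that $B_\varphi$ is non-degenerate, which is the frobenius condition, and equal dimensions for $D^1$ and $D^2$ show that the derived series never reaches $\{0\}$.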
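next, we calculate $\operatorname{ad} x_i$ and $\operatorname{ad} y_j$ for $1 \le i \le 6$, $1 \le j \le 5$, and get

$$\operatorname{ad} x_1 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad \operatorname{ad} x_2 = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$

$$\operatorname{ad} x_3 = \begin{bmatrix} 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \quad \operatorname{ad} x_4 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix},$$

$$\operatorname{ad} x_5 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \end{bmatrix}, \quad \operatorname{ad} x_6 = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 \end{bmatrix},$$

$$\operatorname{ad} y_1 = \operatorname{ad}(x_1 - x_4) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & -2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 \end{bmatrix}. \qquad (17)$$

furthermore, we compute the radical of $\mathfrak{aff}(2)$. let $x = \sum_{i=1}^{6} \alpha_i x_i \in Rad(\mathfrak{aff}(2))$ and find the values $\alpha_i$ using equations (17). we have

$$\sum_{i=1}^{6} \alpha_i \operatorname{Tr}(\operatorname{ad} x_i \cdot \operatorname{ad} y_j) = 0; \quad 1 \le j \le 5,$$

$$\alpha_1 \begin{bmatrix} 5 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \alpha_2 \begin{bmatrix} 0 \\ 0 \\ 5 \\ 0 \\ 0 \end{bmatrix} + \alpha_3 \begin{bmatrix} 0 \\ 5 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \alpha_4 \begin{bmatrix} -5 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \alpha_5 \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} + \alpha_6 \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \qquad (18)$$

next, we solve the linear equations (18) and find that $\alpha_1 - \alpha_4 = 0$, $\alpha_2 = 0$, $\alpha_3 = 0$, $\alpha_5 = s$, $\alpha_6 = t$. with $\alpha_1 = 0$, we get $x = \sum_{i=1}^{6} \alpha_i x_i = 0x_1 + 0x_2 + 0x_3 + 0x_4 + sx_5 + tx_6 = sx_5 + tx_6$. therefore, we obtain the radical of $\mathfrak{aff}(2)$ as

$$Rad(\mathfrak{aff}(2)) = span\{x_5, x_6\} = span\{r_1, r_2\}. \qquad (19)$$

next, we find a basis of the levi subalgebra of $\mathfrak{aff}(2)$. in this case, $Rad(\mathfrak{aff}(2))$ is abelian because $[r_i, r_j] = 0$ for all $1 \le i, j \le 2$. the complement of $Rad(\mathfrak{aff}(2))$ in $\mathfrak{aff}(2)$ is spanned by $\{x_1, x_2, x_3, x_4\}$. the quotient algebra $\mathfrak{aff}(2)/Rad(\mathfrak{aff}(2))$ is spanned by $\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4$ and its brackets are

$$[\bar{x}_1, \bar{x}_2] = \bar{x}_2, \quad [\bar{x}_1, \bar{x}_3] = -\bar{x}_3, \quad [\bar{x}_2, \bar{x}_3] = \bar{x}_1 - \bar{x}_4, \quad [\bar{x}_2, \bar{x}_4] = \bar{x}_2, \quad [\bar{x}_3, \bar{x}_4] = -\bar{x}_3. \qquad (20)$$

we set the levi subalgebra to be spanned by

$$z_1 = x_1 + \alpha_1 r_1 + \alpha_2 r_2, \quad z_2 = x_2 + \beta_1 r_1 + \beta_2 r_2, \quad z_3 = x_3 + \gamma_1 r_1 + \gamma_2 r_2, \quad z_4 = x_4 + \delta_1 r_1 + \delta_2 r_2.$$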
(21)

next, we determine the unknowns $\alpha, \beta, \gamma, \delta$ such that $z_1, z_2, z_3, z_4$ span a semisimple lie algebra that is isomorphic to $\mathfrak{aff}(2)/Rad(\mathfrak{aff}(2))$. since $z_1, z_2, z_3, z_4$ have the same commutation relations as $\bar{x}_i$, $1 \le i \le 4$, written in equations (20), we get

$$[z_1, z_2] = z_2, \quad [z_1, z_3] = -z_3, \quad [z_2, z_3] = z_1 - z_4, \quad [z_2, z_4] = z_2, \quad [z_3, z_4] = -z_3. \qquad (22)$$

we substitute equation (21) into (22), so that the equations can be written as

$$[x_1 + \alpha_1 r_1 + \alpha_2 r_2, x_2 + \beta_1 r_1 + \beta_2 r_2] = x_2 + \beta_1 r_1 + \beta_2 r_2, \qquad (23)$$
$$[x_1 + \alpha_1 r_1 + \alpha_2 r_2, x_3 + \gamma_1 r_1 + \gamma_2 r_2] = -(x_3 + \gamma_1 r_1 + \gamma_2 r_2), \qquad (24)$$
$$\left[x_2 + \sum_{j=1}^{2} \beta_j r_j, x_3 + \sum_{j=1}^{2} \gamma_j r_j\right] = \left(x_1 + \sum_{j=1}^{2} \alpha_j r_j\right) - \left(x_4 + \sum_{j=1}^{2} \delta_j r_j\right), \qquad (25)$$
$$[x_2 + \beta_1 r_1 + \beta_2 r_2, x_4 + \delta_1 r_1 + \delta_2 r_2] = x_2 + \beta_1 r_1 + \beta_2 r_2, \qquad (26)$$
$$[x_3 + \gamma_1 r_1 + \gamma_2 r_2, x_4 + \delta_1 r_1 + \delta_2 r_2] = -(x_3 + \gamma_1 r_1 + \gamma_2 r_2). \qquad (27)$$

then, we apply equations (23), (24), (25), (26), and (27) to compute $\alpha_i, \beta_i, \gamma_i, \delta_i$, $1 \le i \le 2$. from equations (23) and (26) we obtain $\beta_1 = \beta_2 = 1$. from equations (24) and (27) we obtain $\gamma_1 = \gamma_2 = 1$. from equation (25) we obtain $\alpha_1 - \delta_1 = \alpha_2 - \delta_2 = 0$, and we let $\alpha_i = 0$, so that $z_1 = x_1$, $z_2 = x_2 + r_1 + r_2$, $z_3 = x_3 + r_1 + r_2$, $z_4 = x_4$. thus, the levi subalgebra is spanned by

$$\left\{ z_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, z_2 = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, z_3 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, z_4 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \right\}. \qquad (28)$$

∎

conclusions

it has been proven in proposition 2 that the affine lie algebra $\mathfrak{aff}(2)$ can be decomposed into its levi subalgebra and radical, as written in equation (15). building on this result, further research can study the decomposition of the general affine lie algebra $\mathfrak{aff}(n)$ of dimension $n(n + 1)$. in future research, the decomposition of $\mathfrak{aff}(n)$ into its radical and levi subalgebra can be used to study the structure of the frobenius lie algebra $\mathfrak{aff}(n)$.

references

[1] j. e. humphreys, introduction to lie algebras and representation theory. springer, 1972.
[2] m. a. alvarez, m. c. rodríguez-vallarte, and g. salgado, “contact and frobenius solvable lie algebras with abelian nilradical,” commun. algebr., vol. 46, no. 10, pp. 4344–4354, 2018, doi: 10.1080/00927872.2018.1439048.
[3] p. turkowski, “low-dimensional real lie algebras,” j. math. phys., vol. 29, no. 10, pp. 2139–2144, 1988, doi: 10.1063/1.528140.
[4] m. rais, “la représentation coadjointe du groupe affine,” vol. 4, pp. 913–937, 1978.
[5] m. a. alvarez, m. c. rodríguez-vallarte, and g. salgado, “contact nilpotent lie algebras,” proc. am. math. soc., vol. 145, no. 4, pp. 1467–1474, 2017, doi: 10.1090/proc/13341.
[6] i. v. mykytyuk, “structure of the coadjoint orbits of lie algebras,” j. lie theory, vol. 22, no. 1, pp. 251–268, 2012.
[7] a. diatta and b. manga, “on properties of principal elements of frobenius lie algebras,” j. lie theory, vol. 24, no. 3, pp. 849–864, 2018.
[8] b. csikós and l. verhóczki, “classification of frobenius lie algebras of dimension ≤ 6,” publ. math., vol. 70, no. 3–4, pp. 427–451, 2007.
[9] e. kurniadi, “dekomposisi dan sifat matriks struktur pada aljabar lie frobenius berdimensi 4,” no. 1, pp. 392–399, 2021.
[10] henti, e. kurniadi, and e. carnia, “on frobenius functionals of the lie algebra m3(ℝ) ⊕ gl3(ℝ),” j. phys. conf. ser., vol. 1872, no. 1, 2021, doi: 10.1088/1742-6596/1872/1/012015.
[11] henti, e. kurniadi, and e. carnia, “quasi associative algebras on the frobenius lie algebra m3(ℝ) ⊕ gl3(ℝ),” al-jabar j. pendidik. mat., vol. 12, no. 1, pp. 1–16, 2021, [online]. available: http://ejournal.radenintan.ac.id/index.php/aljabar/article/view/2014/1564.
[12] e. kurniadi, n. gusriani, and b. subartini, “a left-symmetric structure on the semi-direct sum real frobenius lie algebra of dimension 8,” cauchy, vol. 7, no. 2, pp. 267–280, 2022, doi: 10.18860/ca.v7i2.13462.
[13] e. kurniadi and h. ishi, “harmonic analysis for 4-dimensional real frobenius lie algebras,” springer proc. math. stat., vol. 290, no. march, pp. 95–109, 2019, doi: 10.1007/978-3-030-26562-5_4.
[14] a. diatta, b. manga, and a. mbaye, “on systems of commuting matrices, frobenius lie algebras and gerstenhaber’s theorem,” arxiv:2002.08737, 2020.
[15] d. n. pham, “$\frak{g}$-quasi-frobenius lie algebras,” arch. math., vol. 52, pp. 233–262, 2016, [online]. available: http://arxiv.org/abs/1701.01680.
[16] m. goze and e. remm, “contact and frobenius forms on lie groups,” differ. geom. its appl., vol. 35, pp. 74–94, 2014, doi: 10.1016/j.difgeo.2014.05.008.
[17] t. barajas, e. roque, and g. salgado, “principal derivations and codimension one ideals in contact and frobenius lie algebras,” commun. algebr., vol. 47, no. 12, pp. 5380–5391, 2019, doi: 10.1080/00927872.2019.1623238.
[18] m. dagli, “levi decomposition of lie algebras; algorithms for its computation,” iowa state university, 2004.
[19] j. hilgert and k.-h. neeb, structure and geometry of lie groups. new york: springer monographs in mathematics, 2012.
[20] a. i. ooms, on frobenius lie algebras, vol. 8, no. 1. 1980.
spline nonparametric regression to identify factors affecting gender empowerment measure (gem) in east java

cauchy – jurnal matematika murni dan aplikasi, volume 7(1) (2021), pages 105-117
p-issn: 2086-0382; e-issn: 2477-3344
submitted: july 25, 2021; reviewed: october 10, 2021; accepted: november 05, 2021
doi: https://doi.org/10.18860/ca.v7i1.12993

luluk mahfiroh1, yuniar farida2*
1,2 department of mathematics, faculty of science and technology, sunan ampel state islamic university surabaya, indonesia
email: mahfirohluluk@gmail.com, yuniar_farida@uinsby.ac.id*
*corresponding author

abstract

gender is a multidimensional issue that is not limited to gender discrimination but also includes economic, educational, and health aspects, which have become the focus of almost all the sustainable development goals (sdgs). development is evaluated from a gender perspective using several indicators, including the gender development index (gdi) and the gender empowerment measure (gem). gem describes the role of women in the economic sphere and is measured by equality in political participation. the gem of east java for 5 consecutive years (2014 – 2018) was lower than the national average gem. this study aims to identify factors affecting gem in east java using quadratic spline nonparametric regression. the resulting regression model shows that all selected variables affect gem in east java: the labor force participation rate (lfpr) of the female population ($x_1$), the school participation rate (spr) of the female population at the high school level ($x_2$), the percentage of the female population working in the formal sector ($x_3$), the sex ratio ($x_4$), the percentage of women working as members of the people’s representative council ($x_5$), the percentage of women working as civil servants ($x_6$), and the rate of women's income contribution ($x_7$).
the model yields an $R^2$ value of 93.74% and a mape of 3.22%. this research contributes to the implementation of nonparametric spline regression in identifying various factors that influence social phenomena.

keywords: gender empowerment measure (gem); gender development index (gdi); nonparametric regression; spline; generalized cross-validation (gcv)

introduction

discussion about gender is inseparable from the concept of gender equality and justice. the goal is to eliminate the patriarchal culture that overrides the role of women. history records that, in the early stages of the world's development, very few female figures played a part in important events. information about women's roles in social, economic, and political movements is also minimal [1]. this is influenced by male domination of women. such dominance is inseparable from the patriarchal ideology that most of the world's people have long adopted [2].

gender is a multidimensional issue that is not limited to gender discrimination but also includes economic, educational, and health aspects, which have become the focus of almost all the sustainable development goals (sdgs) [3]. in indonesia, the evaluation of development outcomes from a gender perspective uses several indicators, namely the gender development index (gdi) and the gender empowerment measure (gem). gdi explains the gap in human development between men and women. meanwhile, gem describes women's role in the economic sphere and is measured by equality in political participation [4]. gem is the percentage of women who work as professionals, managers, technicians, and in all forms of leadership [5]. east java's gem in 2017 and 2018 was 69.37 and 69.71, respectively.
this is lower than the national gem average of 71.74 in 2017 and 72.1 in 2018 [6]. the gem calculation uses the proportion of managers, administrative staff, professional workers, and technicians; the balance of representation in parliament; and the wages of non-farm workers. in addition to these indicators, other factors may influence it, including the labor force participation rate (lfpr), the school participation rate (spr), the percentage of the population working in the formal sector, the sex ratio, the percentage of income contributions, and the percentage of women working as members of the people’s representative council and as civil servants. the relationship of these factors to gem values can be examined by mathematical analysis.

one of the analytical methods used to solve such a problem is mathematical modeling [7]. mathematical modeling can be done using regression analysis to determine the causal relationship with other variables [8]. regression analysis is grouped into three types, namely parametric, nonparametric, and semiparametric [9]. adawiyah [10] found that the spline estimator is strongly influenced by the number of knots, the order, and the location of the knot points, and that the best spline regression model used two knot points. according to fadhilah [11], the best spline model relies heavily on determining the optimal knot point with the minimum generalized cross-validation (gcv) value. the best truncated spline regression model is of order 2 [12]. estimating the regression function with a nonparametric approach using the spline technique allows the model to adjust effectively to the data [13].

this research uses spline nonparametric regression modeling to determine the factors that affect the gender empowerment measure in east java. nonparametric regression is used because the data do not follow a particular pattern, or information about the shape of the regression curve is limited.
methods

data and data processing

the data used in this study are secondary data for 2013-2018 sourced from bps (badan pusat statistik) of east java province, both through the official website www.bps.go.id and through publications available at the service center. the study uses eight variables: one dependent variable ($y$) and seven independent variables ($x$). they are the gender empowerment measure (gem) ($y$), the labor force participation rate (lfpr) of the female population ($x_1$), the school participation rate (spr) of the female population at the high school level ($x_2$), the percentage of the female population working in the formal sector ($x_3$), the sex ratio ($x_4$), the percentage of women working as members of the people's representative council ($x_5$), the percentage of the female population working as civil servants ($x_6$), and the percentage of women's income contribution ($x_7$).

the data were analyzed using the nonparametric spline regression method for longitudinal data with the rstudio program. the steps to achieve the objectives of this research are as follows:

a) create a descriptive statistical analysis and a scatter plot for each variable. a scatter plot is used to detect the relationship pattern between the $y$ variable and each $x$ variable that is presumed to influence it. scatter plots provide information on the shape of the regression curve to be used in modeling [14].

b) create a model of gem with nonparametric spline regression. in general, a nonparametric regression model can be written as:

$$y_i = f(x_i) + \varepsilon_i, \quad i = 1, 2, 3, \ldots, n \tag{1}$$

where $y_i$ is the response variable; $x_i$ is the predictor variable; $f(x_i)$ is a regression function that does not follow a particular pattern; and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$ is a vector of mutually independent errors with zero mean and variance $\sigma^2$ [15].
the function $f(x_i)$ in nonparametric regression is estimated with the spline estimator [16]. if the regression curve $f$ is an additive model and is approximated with the spline function, the regression model is:

$$y_{ij} = \sum_{h=0}^{q} \beta_{hi}\, x_{ijp}^{h} + \sum_{l=1}^{m} \alpha_{li}\,(x_{ijp} - K_{li})_+^{q} + \varepsilon_{ij} \tag{2}$$

where $y_{ij}$ is the response variable for the $i$-th subject at the $j$-th observation time; $x_{ijp}$ is the $p$-th predictor variable for the $i$-th subject at the $j$-th observation time; $\varepsilon_{ij}$ is the random error for the $i$-th subject at the $j$-th observation time; $q$ is the polynomial degree; and $K_{li}$ is the $l$-th knot point for the $i$-th subject, with $m$ the number of knots. in this study, $i = 1, 2, \ldots, 38$ represents the subject/region in east java (38 cities/regencies), and $j = 1, 2, \ldots, 6$ indexes the observation years (2013-2018). the truncated function $(x_{ijp} - K_{li})_+^{q}$ is given by [17]:

$$(x_{ijp} - K_{li})_+^{q} = \begin{cases} (x_{ijp} - K_{li})^{q}, & x_{ijp} \ge K_{li} \\ 0, & x_{ijp} < K_{li} \end{cases}$$

this study used a nonparametric spline regression with a second-degree (quadratic) polynomial curve for longitudinal data. the quadratic curve was selected because it can give a smaller error than a linear curve and is suitable for data with more complex patterns. the ordinary least squares (ols) method is used to estimate the parameters $\beta$ in equation (2). using the spline regression model as a smooth estimate of the curve $f(x)$, the estimator is [18]:

$$\hat{\beta} = (X^T X)^{-1} X^T y \tag{3}$$

with the matrix $X$ as follows:

$$X = \begin{bmatrix}
1 & x_{111} & \cdots & x_{111}^{q} & (x_{111}-K_{11})_+^{q} & \cdots & (x_{111}-K_{m1})_+^{q} & \cdots & x_{11p} & \cdots & x_{11p}^{q} & (x_{11p}-K_{11})_+^{q} & \cdots & (x_{11p}-K_{m1})_+^{q} \\
1 & x_{211} & \cdots & x_{211}^{q} & (x_{211}-K_{12})_+^{q} & \cdots & (x_{211}-K_{m2})_+^{q} & \cdots & x_{21p} & \cdots & x_{21p}^{q} & (x_{21p}-K_{12})_+^{q} & \cdots & (x_{21p}-K_{m2})_+^{q} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots & \ddots & \vdots & & \vdots & \vdots & & \vdots \\
1 & x_{nt1} & \cdots & x_{nt1}^{q} & (x_{nt1}-K_{1n})_+^{q} & \cdots & (x_{nt1}-K_{mn})_+^{q} & \cdots & x_{ntp} & \cdots & x_{ntp}^{q} & (x_{ntp}-K_{1n})_+^{q} & \cdots & (x_{ntp}-K_{mn})_+^{q}
\end{bmatrix}$$

c) select the optimal knot point using the gcv method. the selection of optimal knot points is critical in nonparametric regression models.
knot points are the joint points at which the behavior of the data changes. one appropriate method for selecting the optimal knot points is generalized cross-validation (gcv) [19]. a knot point is optimal when it gives the minimum gcv value. the gcv function, according to eubank [20], is:

$$GCV(K_{li}) = \frac{MSE(K_{li})}{\left(n^{-1}\,\mathrm{trace}\left[I - H_{K_{li}}\right]\right)^{2}} \tag{4}$$

where $n$ is the number of observations, $I$ is the identity matrix, and mse is the mean squared error. here $H_{K_{li}} = X(X^TX)^{-1}X^T$ and $MSE(K_{li})$ is defined as:

$$MSE(K_{li}) = n^{-1}\sum_{i=1}^{n}\sum_{j=1}^{t}\left(y_{ij} - \hat{f}(x_{ijp})\right)^{2}$$

d) model gem using the spline with the optimal knot points.

e) interpret the model and draw conclusions.

results and discussion

characteristics of gem in east java, 2013-2018

an overview of the data is given in table 1.

table 1. descriptive statistics of the gem variable ($y$) and the variables affecting gem ($x_1, \ldots, x_7$)

| variable | average | variance | maximum value | minimum value |
|---|---|---|---|---|
| $y$ | 65.82 | 75.82 | 83.29 | 42.09 |
| $x_1$ | 55.19 | 33.55 | 72.80 | 43.56 |
| $x_2$ | 71.55 | 190.07 | 100 | 25.30 |
| $x_3$ | 31.87 | 269.49 | 67.61 | 5.98 |
| $x_4$ | 97.13 | 6.40 | 101.64 | 90.64 |
| $x_5$ | 17.13 | 87.80 | 51.52 | 0 |
| $x_6$ | 48.11 | 33.96 | 83.88 | 29.93 |
| $x_7$ | 32.73 | 18.88 | 40.34 | 22.81 |

table 1 shows that the average gem in east java in 2013-2018 was 65.82 with a variance of 75.82. the high variance shows that the data are widely spread around the average, which is caused by outlying values. this happens because of differences in women's participation between areas with city status and areas with regency (district) status. in regencies, most women are housewives, and women's participation in the public sector is minimal. the maximum value is 83.29, the gem of surabaya city in 2018, while the minimum value is 42.09, the gem of sampang regency in 2013. variables with an average below 50% indicate a gender gap in which women's involvement is lower than men's.
the minimum value of $x_5$, the percentage of women as members of parliament in east java in 2013-2018, is 0 because bangkalan regency had no female members of the regional people's representative council (dprd) in 2014-2018.

relationship patterns between the $y$ (gem) variable and the $x$ variables

the relationship between the $y$ variable and the $x$ variables does not form a specific pattern in any year, as can be seen from the random spread of the plots. therefore, all variables can be used as nonparametric components. the scatter plots are shown in figures 1 to 7.

figure 1. scatter plots between the $y$ variable and $x_1$ (labor force participation rate (lfpr) of the female population), with panels for 2016, 2017, and 2018

figure 2. scatter plots between the $y$ variable and $x_2$ (school participation rate (spr) of the female population at the high school level), with panels for 2016, 2017, and 2018

figure 3. scatter plots between the $y$ variable and $x_3$ (percentage of the female population in the formal sector), with panels for 2016, 2017, and 2018

figure 4. scatter plots between the $y$ variable and $x_4$ (sex ratio), with panels for 2016, 2017, and 2018

figure 5. scatter plots between the $y$ variable and $x_5$ (percentage of women working as members of the people's representative council), with panels for 2016, 2017, and 2018

figure 6. scatter plots between the $y$ variable and $x_6$ (percentage of the female population working as civil servants), with panels for 2016, 2017, and 2018

figure 7. scatter plots between the $y$ variable and $x_7$ (percentage of women's income contribution), with panels for 2016, 2017, and 2018

the scatter plots between $y$ (gem) and all $x$ variables ($x_1, x_2, x_3, x_4, x_5, x_6, x_7$) are randomly spread and do not follow a specific pattern, so these variables are treated as nonparametric components.

spline quadratic nonparametric regression modeling with one knot point

the quadratic spline nonparametric regression model with one knot point for the gem data of east java province is:

$$\begin{aligned}
y_{ij} = {} & \beta_{0i} + \beta_{11i}x_{1ij} + \beta_{12i}x_{1ij}^{2} + \alpha_{11i}(x_{1ij} - K_{11i})_+^{2} \\
& + \beta_{21i}x_{2ij} + \beta_{22i}x_{2ij}^{2} + \alpha_{12i}(x_{2ij} - K_{12i})_+^{2} + \cdots \\
& + \beta_{71i}x_{7ij} + \beta_{72i}x_{7ij}^{2} + \alpha_{17i}(x_{7ij} - K_{17i})_+^{2} + \varepsilon_{ij}
\end{aligned} \tag{5}$$

where $x_{pij}$ is the $p$-th predictor for subject $i$ at time $j$, $\beta_{p1i}$ and $\beta_{p2i}$ are its linear and quadratic coefficients, and $K_{lpi}$ is the $l$-th knot point of predictor $p$ with coefficient $\alpha_{lpi}$. the smallest gcv value for one knot point is given in table 2.

table 2. smallest gcv value with one knot point

| order | regency/city | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ | $x_6$ | $x_7$ | gcv |
|---|---|---|---|---|---|---|---|---|---|
| 23 | pacitan | 70.89 | 65.98 | 13.3 | 95.37 | 16.31 | 45.61 | 38.8 | 1.07e-26 |
| | ponorogo | 59.57 | 74.24 | 18.08 | 99.84 | 12.47 | 45.67 | 34.65 | |
| | trenggalek | 62.56 | 64.25 | 14.78 | 98.71 | 15.28 | 47.22 | 36.75 | |
| | tulungagung | 57.63 | 74.18 | 28.49 | 95.12 | 6.69 | 50.7 | 37.7 | |
| | blitar | 53.75 | 64.25 | 27.71 | 100.24 | 15.98 | 53.95 | 40.09 | |
| | kediri | 52.1 | 78.59 | 34.61 | 100.56 | 29.23 | 52.29 | 30.65 | |
| | malang | 51.15 | 58.56 | 31.93 | 101.01 | 17.04 | 50.46 | 36.5 | |
| | lumajang | 46.99 | 55.7 | 22.74 | 95.22 | 14.55 | 45.91 | 23.16 | |
| | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | |
| | malang city | 51.57 | 80.62 | 60.92 | 97.35 | 23.44 | 50.3 | 34.22 | |
| | probolinggo city | 50.17 | 75.5 | 54.51 | 97.15 | 24.47 | 49.55 | 30.83 | |
| | pasuruan city | 53.74 | 72.88 | 50.9 | 97.98 | 6.98 | 50.45 | 31.02 | |
| | mojokerto city | 56.47 | 81.99 | 50.69 | 96.52 | 28.46 | 50.7 | 36.47 | |
| | madiun city | 54.1 | 81.2 | 48.4 | 93.71 | 34.27 | 54.13 | 38.42 | |
| | surabaya city | 52.06 | 71.02 | 64.72 | 97.59 | 39.04 | 68.19 | 35.1 | |
| | batu city | 55.93 | 83.4 | 41.47 | 101.35 | 25.63 | 53.31 | 29.97 | |

table 2 shows that the minimum gcv is 1.07e-26, obtained at order 23, with one optimal knot point for each $x$ variable.
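as a hedged illustration of how equations (2), (3), and (4) fit together, the following python sketch builds the quadratic truncated-power basis for a single predictor, estimates the coefficients by ols, and scores candidate knots by gcv. it is not the authors' rstudio code: the function names, the toy data, and the candidate knots are assumptions made for illustration, and only one predictor and one subject are used to keep the example short.

```python
def truncated_power(x, knot, degree=2):
    """Return (x - knot)_+^degree: zero below the knot, a power at or above it."""
    return (x - knot) ** degree if x >= knot else 0.0


def design_row(x, knots, degree=2):
    """One row of X: [1, x, ..., x^q, (x-K_1)_+^q, ..., (x-K_m)_+^q]."""
    polynomial = [x ** h for h in range(degree + 1)]
    truncated = [truncated_power(x, k, degree) for k in knots]
    return polynomial + truncated


def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_and_gcv(xs, ys, knots, degree=2):
    """OLS fit of the truncated spline and its GCV score (equations (3) and (4))."""
    rows = [design_row(x, knots, degree) for x in xs]
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * y for r, y in zip(rows, ys)) for a in range(k)]
    beta = solve(xtx, xty)  # solves (X^T X) beta = X^T y
    fitted = [sum(c * v for c, v in zip(beta, r)) for r in rows]
    mse = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / n
    # trace(H) is the sum of the leverages x_i^T (X^T X)^{-1} x_i.
    trace_h = sum(sum(v * z for v, z in zip(r, solve(xtx, r))) for r in rows)
    gcv = mse / ((n - trace_h) / n) ** 2
    return beta, gcv


# Toy data (assumed): an exact quadratic spline with a single knot at 5.
xs = [float(v) for v in range(11)]
ys = [1 + 0.5 * x - 0.2 * x ** 2 + 0.3 * truncated_power(x, 5.0) for x in xs]

# Hypothetical candidate knots; the knot with the smallest GCV is selected.
candidates = [3.0, 5.0, 7.0]
best_knot = min(candidates, key=lambda k: fit_and_gcv(xs, ys, [k])[1])
beta, gcv = fit_and_gcv(xs, ys, [best_knot])
print(best_knot, [round(b, 4) for b in beta])
```

because the toy data are noise-free, the candidate that matches the true knot reproduces the data almost exactly and its gcv is close to zero. with several predictors, `design_row` would be applied to each predictor and the resulting columns concatenated, as in the matrix $X$ of the methods section.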
spline quadratic nonparametric regression modeling with two knot points

the next step is to build the quadratic spline regression with two knot points. the quadratic spline nonparametric regression model with two knot points for the gem data of east java province is:

$$\begin{aligned}
y_{ij} = {} & \beta_{0i} + \beta_{11i}x_{1ij} + \beta_{12i}x_{1ij}^{2} + \alpha_{11i}(x_{1ij} - K_{11i})_+^{2} + \alpha_{21i}(x_{1ij} - K_{21i})_+^{2} \\
& + \beta_{21i}x_{2ij} + \beta_{22i}x_{2ij}^{2} + \alpha_{12i}(x_{2ij} - K_{12i})_+^{2} + \alpha_{22i}(x_{2ij} - K_{22i})_+^{2} + \cdots \\
& + \beta_{71i}x_{7ij} + \beta_{72i}x_{7ij}^{2} + \alpha_{17i}(x_{7ij} - K_{17i})_+^{2} + \alpha_{27i}(x_{7ij} - K_{27i})_+^{2} + \varepsilon_{ij}
\end{aligned} \tag{6}$$

the resulting gcv values are given in table 3, where the two rows for each regency/city hold the first and second knot points.

table 3. smallest gcv value with two knot points

| order | regency/city | knot | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ | $x_6$ | $x_7$ | gcv |
|---|---|---|---|---|---|---|---|---|---|---|
| 32 | pacitan | 1 | 43.79 | 53.46 | 10.44 | 95.32 | 15.06 | 69.87 | 38.36 | 2.83e-28 |
| | | 2 | 51.91 | 82.78 | 19.65 | 94.84 | 6.8 | 57.55 | 36.63 | |
| | ponorogo | 1 | 46.44 | 71.67 | 14.59 | 95.39 | 16.89 | 71.36 | 38.99 | |
| | | 2 | 54.06 | 94.26 | 23.4 | 94.91 | 11.13 | 66.47 | 38.05 | |
| | trenggalek | 1 | 44.47 | 64.44 | 14.36 | 99.75 | 11.17 | 57.8 | 34.05 | |
| | | 2 | 47.04 | 71.52 | 14.08 | 95.04 | 20.17 | 48.66 | 30.65 | |
| | tulungagung | 1 | 46.22 | 78.7 | 19.77 | 99.88 | 13.06 | 60.38 | 34.92 | |
| | | 2 | 50.69 | 80.54 | 18.43 | 95.47 | 25.77 | 59.11 | 31.54 | |
| | blitar | 1 | 45.52 | 50.21 | 10.05 | 98.67 | 11.29 | 58.95 | 35.92 | |
| | | 2 | 44.61 | 56.7 | 14.03 | 97.08 | 12.12 | 47.62 | 24.97 | |
| | kediri | 1 | 47.99 | 70.63 | 16.92 | 98.73 | 17.1 | 64.2 | 37.13 | |
| | | 2 | 47.28 | 74.39 | 24.3 | 97.57 | 16.04 | 53.8 | 25.84 | |
| | ⋮ | | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | |
| | malang city | 1 | 52.8 | 74.94 | 39.9 | 99.98 | 29.54 | 56.49 | 35.2 | |
| | | 2 | 51.94 | 85.71 | 53.21 | 96.66 | 35.61 | 58 | 36.69 | |
| | probolinggo city | 1 | 48.65 | 66.19 | 26.26 | 98.9 | 8.41 | 46.53 | 26.58 | |
| | | 2 | 52.56 | 67.42 | 45.15 | 93.57 | 27 | 52.35 | 37.84 | |
| | pasuruan city | 1 | 51.72 | 80.73 | 32.06 | 99.17 | 21.61 | 51.94 | 27.22 | |
| | | 2 | 54.85 | 87.47 | 49.88 | 93.78 | 37.57 | 54.89 | 38.68 | |
| | mojokerto city | 1 | 49.55 | 61.96 | 19.55 | 98.69 | 16.33 | 44.11 | 24 | |
| | | 2 | 54.91 | 60.59 | 62.27 | 97.54 | 28.48 | 50.53 | 34.7 | |
| | madiun city | 1 | 51.86 | 79.87 | 24.99 | 98.82 | 27.01 | 51.1 | 25.1 | |
| | | 2 | 74.22 | 75.76 | 65.83 | 97.61 | 43.84 | 52.76 | 35.29 | |
| | surabaya city | 1 | 50.23 | 78.87 | 20.1 | 97.33 | 11.2 | 48.99 | 28.95 | |
| | | 2 | 49.82 | 76.73 | 35.53 | 101.1 | 20.24 | 53.59 | 29.45 | |
| | batu city | 1 | 53.02 | 86.7 | 24.82 | 97.42 | 14.11 | 54.47 | 30.02 | |
| | | 2 | 54.9 | 86.42 | 44.16 | 101.46 | 28.08 | 57 | 30.21 | |

optimal knot point selection

the best model is selected based on the optimal knot points with the minimum gcv value. the one-knot search produced 48 candidate knot points, each with its own gcv value, with a minimum of 1.07e-26, while the two-knot search also produced 48 candidates, with a minimum gcv of 2.83e-28. because the two-knot model has the smaller gcv, the quadratic spline nonparametric regression model with two knot points is selected, and the gem model for each city or regency in east java is:

$$y_{ij} = \beta_{0i} + \sum_{p=1}^{7}\left[\beta_{p1i}x_{pij} + \beta_{p2i}x_{pij}^{2} + \alpha_{1pi}(x_{pij} - K_{1pi})_+^{2} + \alpha_{2pi}(x_{pij} - K_{2pi})_+^{2}\right] + \varepsilon_{ij} \tag{7}$$

overall, the gem model for the cities and regencies in east java has an $R^2$ of 93.74%, which means that the model explains 93.74% of the variation in the gem values of east java province, while the remaining 6.26% is explained by other variables that are not in the regression model. the model has a mean absolute percentage error (mape) of 3.22%.

parameter testing of the quadratic spline nonparametric regression model

the test results are displayed in the anova table (table 4).

table 4. anova results for the model

| source | df | ss | ms | f count | p-value | decision |
|---|---|---|---|---|---|---|
| regression | 6 | 16134.355 | 2689.089 | 512.053 | 1.96e-120 | reject $H_0$ |
| error | 205 | 1076.755 | 5.2516 | | | |
| total | 227 | 17211.111 | | | | |

the decision is to reject $H_0$, which means that at least one parameter in the quadratic spline nonparametric regression model is significant.
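the goodness-of-fit figures quoted above can be cross-checked with a few lines of arithmetic. the sketch below recomputes the f statistic and $R^2$ from the sums of squares and mean squares printed in table 4, and shows how a mape is computed. the `mape` helper and its two-point input are assumptions made for illustration, since the fitted values themselves are not listed in the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Values as printed in table 4.
ss_regression, ss_error = 16134.355, 1076.755
ms_regression, ms_error = 2689.089, 5.2516

# F statistic: ratio of the regression and error mean squares.
f_count = ms_regression / ms_error
# R^2: share of the total sum of squares explained by the regression.
r_squared = ss_regression / (ss_regression + ss_error)
print(round(f_count, 2), round(100 * r_squared, 2))

# Made-up observations and fitted values, only to show how mape is computed.
print(mape([100.0, 200.0], [110.0, 190.0]))
```

the recomputed f statistic agrees with the tabulated 512.053 to within rounding, and the ratio of sums of squares reproduces the reported $R^2$ of 93.74%.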
the significant parameters are all the estimators of $x_1, x_2, x_3, x_4, x_5, x_6$ and $x_7$ in each year from 2013 to 2018.

interpretation of the results of the quadratic spline nonparametric regression model

as an example, the quadratic spline nonparametric regression model for longitudinal data is applied to surabaya. the estimated coefficients and knot points are substituted into equation (7), giving the following model for the city of surabaya. in the gem data, surabaya city is the 37th of the 38 cities/regencies, hence $i = 37$:

$$\begin{aligned}
\hat{y}_{37,j} = {} & 0.0008 - 0.0134x_{1,37,j} - 0.0363x_{1,37,j}^{2} + 0.0049(x_{1,37,j} - 50.32)_+^{2} + 0.3059(x_{1,37,j} - 49.82)_+^{2} \\
& - 0.0191x_{2,37,j} - 0.0391x_{2,37,j}^{2} + 0.0007(x_{2,37,j} - 78.87)_+^{2} - 0.0132(x_{2,37,j} - 76.73)_+^{2} \\
& - 0.0302x_{3,37,j} + 0.0048x_{3,37,j}^{2} + 0.2975(x_{3,37,j} - 20.10)_+^{2} + 0.002(x_{3,37,j} - 35.53)_+^{2} \\
& + 0.0457x_{4,37,j} + 0.0458x_{4,37,j}^{2} + 0.1919(x_{4,37,j} - 97.33)_+^{2} - 0.0056(x_{4,37,j} - 101.1)_+^{2} \\
& + 0.111x_{5,37,j} + 0.0843x_{5,37,j}^{2} + 0.4336(x_{5,37,j} - 11.2)_+^{2} - 0.0274(x_{5,37,j} - 20.24)_+^{2} \\
& + 0.0028x_{6,37,j} + 0.0011x_{6,37,j}^{2} - 0.0042(x_{6,37,j} - 48.99)_+^{2} + 0.0255(x_{6,37,j} - 53.59)_+^{2} \\
& + 0.009x_{7,37,j} + 0.3078x_{7,37,j}^{2} - 0.0122(x_{7,37,j} - 28.95)_+^{2} - 0.0009(x_{7,37,j} - 29.45)_+^{2}
\end{aligned}$$

from the model above, if $x_2, x_3, x_4, x_5, x_6$ and $x_7$ are assumed to be constant, the influence of the lfpr of the female population ($x_1$) on gem is:

$$\hat{y} = -0.0134x_1 - 0.0363x_1^{2} + 0.0049(x_1 - 50.32)_+^{2} + 0.3059(x_1 - 49.82)_+^{2}$$

if the lfpr of the female population ($x_1$) is below 50.32 and increases by 1 unit, the gem value tends to decrease by 0.0363. when the lfpr of the female population ($x_1$) lies between 49.82 and 50.32 and increases by 1 unit, the gem value decreases by 0.0314.
lastly, if the lfpr of the female population ($x_1$) is greater than or equal to 49.82 and increases by 1 unit, gem increases by 0.2745. overall, the higher the lfpr of the female population ($x_1$), the higher the gem value, and vice versa. this is because the lfpr represents the share of people participating in the economy, in this case measured by the participation of working women.

for the spr of the female population at the high school level ($x_2$), if $x_1, x_3, x_4, x_5, x_6$ and $x_7$ are assumed to be constant, the influence of spr on gem is:

$$\hat{y} = -0.0191x_2 - 0.0391x_2^{2} + 0.0007(x_2 - 78.87)_+^{2} - 0.0132(x_2 - 76.73)_+^{2}$$

if the spr of the female population at the high school level ($x_2$) is below 78.87 and increases by 1 unit, the gem value decreases by 0.0391 units. when the spr lies between 76.73 and 78.87 and increases by 1 unit, the gem value tends to decrease by 0.0384. lastly, if the spr ($x_2$) is greater than or equal to 76.73 and increases by 1 unit, gem also decreases by about 0.0516 units. overall, the higher the spr of the female population at the high school level ($x_2$), the higher the gem score, and vice versa. this is because the higher the level of education a person completes, the more it encourages participation in the job market. a person's chances of getting a job also tend to be in line with their level of education, especially as the current job market usually requires higher education.
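the per-segment figures quoted for $x_1$ and $x_2$ follow from adding the coefficient of each truncated term to the quadratic coefficient once its knot is passed. the short sketch below reproduces that arithmetic; the helper name `segment_effects` is an assumption, and the code only mirrors the way the interpretation above reads the quadratic coefficients as segment effects.

```python
from itertools import accumulate


def segment_effects(quadratic_coefficients):
    """Running sums of the quadratic coefficients, one value per segment."""
    return [round(total, 4) for total in accumulate(quadratic_coefficients)]


# Coefficients of x^2, (x - K_1)_+^2 and (x - K_2)_+^2 in the Surabaya model.
lfpr = segment_effects([-0.0363, 0.0049, 0.3059])
spr = segment_effects([-0.0391, 0.0007, -0.0132])
print(lfpr)
print(spr)
```

the three values in each list correspond to the first, second, and third segments described in the text for $x_1$ and $x_2$.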
for the percentage of the working-age female population working in the formal sector ($x_3$), if $x_1, x_2, x_4, x_5, x_6$ and $x_7$ are assumed to be constant, the influence of the formal sector on gem is:

$$\hat{y} = -0.0302x_3 + 0.0048x_3^{2} + 0.2975(x_3 - 20.10)_+^{2} + 0.0020(x_3 - 35.53)_+^{2}$$

if the percentage of the working-age female population working in the formal sector ($x_3$) is below 20.10 and increases by 1 unit, the gem value increases by 0.0048 units. when the percentage ($x_3$) lies between 20.10 and 35.53 and increases by 1 unit, the gem value increases by 0.3023. lastly, if the percentage ($x_3$) is greater than or equal to 35.53 and increases by 1 unit, gem increases by about 0.3043 units. overall, if the percentage of the working-age female population working in the formal sector ($x_3$) increases, the gem value also increases, and vice versa. this is because the role of women in the economic field as measured in gem covers women who work as professionals, leaders, technicians, and technical or skilled workers.

for the sex ratio ($x_4$), if $x_1, x_2, x_3, x_5, x_6$ and $x_7$ are assumed to be constant, the influence of the sex ratio on gem is:

$$\hat{y} = 0.0457x_4 + 0.0458x_4^{2} + 0.1919(x_4 - 97.33)_+^{2} - 0.0056(x_4 - 101.1)_+^{2}$$

if the sex ratio ($x_4$) is below 97.33 and increases by one unit, gem increases by 0.0458. if the sex ratio ($x_4$) lies between 97.33 and 101.1 and increases by one unit, gem tends to increase by 0.2377.
if the sex ratio ($x_4$) is greater than or equal to 101.1 and increases by one unit, gem increases by 0.2321. overall, if the sex ratio ($x_4$) increases, the gem value also increases, and vice versa. this is because the sex ratio compares the number of males per 100 females in a given region and time.

for the percentage of women working as members of the people's representative council ($x_5$), if $x_1, x_2, x_3, x_4, x_6$ and $x_7$ are assumed to be constant, the influence of the people's representative council on gem is:

$$\hat{y} = 0.111x_5 + 0.0843x_5^{2} + 0.4336(x_5 - 11.20)_+^{2} - 0.0274(x_5 - 20.24)_+^{2}$$

if the percentage of women working as members of the people's representative council ($x_5$) is below 11.20 and increases by one unit, gem increases by 0.0843. if the percentage lies between 11.20 and 20.24 and increases by one unit, gem tends to increase by 0.5179. if the percentage is greater than or equal to 20.24 and increases by one unit, gem increases by 0.49. overall, if the percentage of women working as members of the people's representative council ($x_5$) increases, the gem value also increases, and vice versa. this is because gem also reflects women's participation in politics, so the role of women in political decision-making is measured by membership of the people's representative council.
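to make the role of the knots concrete, the sketch below evaluates the $x_5$ term of the surabaya model at one value in each segment, so that the truncated terms can be seen switching on as $x_5$ passes 11.20 and 20.24. the function names and the three evaluation points are assumptions chosen for illustration; the coefficients and knots are those quoted above.

```python
def truncated_square(x, knot):
    """Return (x - knot)_+^2: zero below the knot, a square at or above it."""
    return (x - knot) ** 2 if x >= knot else 0.0


def council_term(x5):
    """Contribution of x5 to the fitted gem in the Surabaya model quoted above."""
    return (0.111 * x5
            + 0.0843 * x5 ** 2
            + 0.4336 * truncated_square(x5, 11.20)
            - 0.0274 * truncated_square(x5, 20.24))


# One hypothetical value below the first knot, one between the knots,
# and one above the second knot.
for value in (10.0, 15.0, 25.0):
    print(value, round(council_term(value), 4))
```

below 11.20 only the linear and quadratic terms contribute; each truncated term adds its own quadratic piece once its knot is reached.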
for the percentage of the female population working as civil servants ($x_6$), if $x_1, x_2, x_3, x_4, x_5$ and $x_7$ are assumed to be constant, the influence of civil servants on gem is:

$$\hat{y} = 0.0028x_6 + 0.0011x_6^{2} - 0.0042(x_6 - 48.99)_+^{2} + 0.0255(x_6 - 53.59)_+^{2}$$

if the percentage of women working as civil servants ($x_6$) is below 48.99 and increases by one unit, gem increases by 0.0011. if the percentage lies between 48.99 and 53.59 and increases by one unit, gem tends to decrease by 0.0031. if the percentage of women working as civil servants ($x_6$) is greater than or equal to 53.59 and increases by one unit, gem increases by 0.0224. overall, if the percentage of the female population working as civil servants ($x_6$) increases, the gem value also increases, and vice versa. this is because civil service employment is part of formal sector employment, but within the scope of the state, so the percentage of the female population working as civil servants is considered to affect gem.

for the percentage of women's income contribution ($x_7$), if $x_1, x_2, x_3, x_4, x_5$ and $x_6$ are assumed to be constant, the influence of income on gem is:

$$\hat{y} = 0.0090x_7 + 0.3078x_7^{2} - 0.0122(x_7 - 28.95)_+^{2} - 0.0009(x_7 - 29.45)_+^{2}$$

if the percentage of women's income contribution ($x_7$) is below 28.95 and increases by 1 unit, the gem value increases by 0.3078 units. when the percentage lies between 28.95 and 29.45 and increases by 1 unit, the gem value increases by 0.296.
lastly, if the percentage of women's income contribution ($x_7$) is greater than or equal to 29.45 and increases by 1 unit, gem continues to increase by about 0.295 units. overall, if the percentage of women's income contribution ($x_7$) increases, the value of gem also increases, and vice versa. this is because the income contribution is the share of earnings received by the working members of a household, in this case women.

conclusions

identifying the factors that affect gem using spline nonparametric regression yields a model with a mape of 3.22%. all factors studied ($x$ variables) have a significant influence on the gem value ($y$), and the model has an $R^2$ of 93.74%, which indicates that the resulting model fits the data very well. because all research variables are significant in the model, the factors that influence the gender empowerment measure (gem) of east java province are the labor force participation rate (lfpr) of the female population ($x_1$), the school participation rate (spr) of the female population at the high school level ($x_2$), the percentage of the female population working in the formal sector ($x_3$), the sex ratio ($x_4$), the percentage of women working as members of the people's representative council ($x_5$), the percentage of the female population working as civil servants ($x_6$), and the percentage of women's income contribution ($x_7$). areas below the interval are dominated by regions with regency (district) status, and madura island has the most areas below the interval.

references

[1] gadis arifia and nur iman subono, “a hundred years of feminism in indonesia: an analysis of actors, debates, and strategies,” country study, pp. 1–28, 2017, [online]. available: www.fes-asia.org.
[2] kppa and bps, pembangunan manusia berbasis gender 2016. jakarta: cv. lintas khatulistiwa, 2016.
[3] s. akter et al., “women’s empowerment and gender equity in agriculture: a different perspective from southeast asia,” food policy, vol.
69, pp. 270–279, 2017, doi: 10.1016/j.foodpol.2017.05.003.
[4] kppa and bps, pembangunan manusia berbasis gender 2018. jakarta: kppa, 2018.
[5] kppa and bps, pembangunan manusia berbasis gender 2013. jakarta: cv. lintas khatulistiwa, 2013.
[6] badan pusat statistik, “indeks pemberdayaan gender menurut provinsi, 2010-2018.”
[7] r. p. rangel, m. de l. g. magaña, r. u. azpeitia, and e. nesterova, “mathematical modeling in problem situations of daily life,” j. educ. hum. dev., vol. 5, no. 1, pp. 62–76, 2016, doi: 10.15640/jehd.v5n1a7.
[8] r. kurniawan and b. yuniarto, analisis regresi dasar dan penerapannya dengan r. jakarta: kencana, 2016.
[9] n. a. erilli, “non-parametric regression estimation for data with equal values,” eur. sci. j., vol. 10, no. 4, pp. 70–82, 2014.
[10] r. syam, w. sanusi, and r. adawiyah, “model regresi nonparametrik dengan pendekatan spline (studi kasus: berat badan lahir rendah di rumah sakit ibu dan anak siti fatimah makassar),” 2017.
[11] k. n. fadhilah, “pemodelan regresi spline truncated untuk data longitudinal (studi kasus: harga saham bulanan pada kelompok saham perbankan periode januari 2009 – desember 2015),” vol. 5, pp. 447–454, 2016.
[12] m. f. f. mardianto, e. tjahjono, and m. rifada, “statistical modeling for prediction of rice production in indonesia using semiparametric regression based on three forms of fourier series estimator,” arpn j. eng. appl. sci., vol. 14, no. 15, pp. 2763–2770, 2019.
[13] f. n. hidayah, analisis regresi nonparametrik spline linear. yogyakarta: fakultas saintek uin sunan kalijaga, 2019.
[14] y. h. chan, c. d. correa, and k. l. ma, “the generalized sensitivity scatterplot,” ieee trans. vis. comput. graph., vol. 19, no. 10, pp. 1768–1781, 2013, doi: 10.1109/tvcg.2013.20.
[15] p. taylan, g. w. weber, l. liu, and f.
yerlikaya-özkurt, “on the foundations of parameter estimation for generalized partial linear models with b-splines and continuous optimization,” comput. math. with appl., vol. 60, no. 1, pp. 134–143, 2010, doi: 10.1016/j.camwa.2010.04.040.
[16] x. ni, h. h. zhang, and d. zhang, “automatic model selection for partially linear models,” j. multivar. anal., vol. 100, no. 9, pp. 2100–2111, 2009, doi: 10.1016/j.jmva.2009.06.009.
[17] a. pawar et al., “adaptive fem-based nonrigid image registration using truncated hierarchical b-splines,” comput. math. with appl., vol. 72, no. 8, pp. 2028–2040, 2016, doi: 10.1016/j.camwa.2016.05.020.
[18] h. a. farahani, a. rahiminezhad, l. same, and k. immannezhad, “a comparison of partial least squares (pls) and ordinary least squares (ols) regressions in predicting of couples mental health based on their communicational patterns,” procedia soc. behav. sci., vol. 5, pp. 1459–1463, 2010, doi: 10.1016/j.sbspro.2010.07.308.
[19] a. prahutama, suparti, and t. w. utami, “modelling fourier regression for time series data, a case study: modelling inflation in foods sector in indonesia,” j. phys. conf. ser., vol. 974, no. 1, 2018, doi: 10.1088/1742-6596/974/1/012067.
[20] r. l. eubank, nonparametric regression and spline smoothing, 2nd edition. new york: marcel dekker, 1988.
cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 281-292
p-issn: 2086-0382; e-issn: 2477-3344
submitted: october 19, 2021; reviewed: december 09, 2021; accepted: january 05, 2022
doi: http://dx.doi.org/10.18860/ca.v7i1.13701

forecasting rice paddy production in aceh using arima and exponential smoothing models

nurviana1,*, amelia1, riezky purnama sari1, ulya nabilla1, taufan talib2
1mathematics department, universitas samudra, indonesia
2mathematics department, universitas pattimura, indonesia
*corresponding author
email: nurviana@unsam.ac.id*, ameliamat@unsam.ac.id, riezkypurnamasari@unsam.ac.id, ulya.nabilla@unsam.ac.id, taufan.talib@fkip.unpatti.ac.id

abstract

indonesia targets aceh to become one of its rice paddy production centers, able to achieve self-sufficiency in rice paddy and to serve as a national granary. in reality, however, aceh's rice paddy production is not consistent from year to year. the province has not been able to meet its rice paddy needs independently, so it is supplied from other regions, because it is difficult to detect whether a rice paddy surplus exists. the purpose of this research is to forecast future rice paddy production in aceh. the mathematical models used are time series models, namely autoregressive integrated moving average (arima) and exponential smoothing. the forecasts of rice paddy production for the next 5 years using the arima(3,1,1) model are 2453401; 2154784; 2111594; 1615171; and 2062436, while the forecasts using the winters' exponential smoothing model are 1625925; 1645196; 1687667; 1605530; and 1555213. the arima(3,1,1) model produces an mse/mad value of $3.34041 \times 10^{10}$, while the winters' exponential smoothing model produces an mse/mad value of $3.08616 \times 10^{10}$.
therefore, it can be concluded that the winters' exponential smoothing model is the better of the two. with these results, the aceh government can make appropriate policies in planning the future provision of rice paddy.

keywords: arima; forecasting; exponential smoothing; rice paddy

introduction

aceh is one of the provinces in indonesia with large areas of agricultural land. the majority of the population in aceh relies on the agricultural sector for its livelihood. according to the central bureau of statistics, agriculture is the largest source of aceh's economy: it contributes substantially to the economy and fulfills the basic needs of society. the agriculture of aceh excels in various commodities such as rice paddy, corn, soybeans, and chilies. rice paddy is a staple food crop whose demand continues to increase from year to year following population growth. rice paddy cultivation is the main activity and main source of income for more than 100 million households in developing countries in asia, africa, and latin america. more than 90 percent of the world's rice paddy is produced and consumed in asia-pacific [1]. rice paddy is an ancient agricultural crop that is still considered a staple crop in most tropical countries, especially in asia and africa. rice paddy is the most important food crop in aceh because almost all people use it as a staple food, and it is also a strategic food commodity that has a considerable influence on economic stability, especially inflation, and on social and political stability.
indonesia targets aceh to be one of the centers of rice paddy production, able to achieve self-sufficiency in rice paddy and to become a national granary [2]. aceh has many genotypes or local rice paddy varieties. according to [3], 50 aceh rice paddy genotypes have been collected through exploration activities, but only a few local varieties are often planted, namely ramos, dewi, sigupai, tinggong, and siputeh. based on bps [4], local rice paddy production in aceh is not consistent from year to year; in other words, it rises and falls every year. in 2020, rice paddy production reached 1.75 million tons, an increase from 1.71 million tons in the previous year. if converted to rice, production in aceh in 2020 reached 1 million tons. this increase was due to an increase in the harvested area of rice paddy from 310.01 thousand hectares to 320.75 thousand hectares [4]. therefore, the province of aceh should have been able to meet its rice paddy needs independently. however, aceh is still supplied with rice paddy from other regions because it is difficult to detect whether a surplus exists.

a mathematical model is needed to make plans related to food commodities, especially rice paddy. one of the mathematical models that can be used is the time series model, which estimates production in future periods based on previous data. the time series models used in this research are autoregressive integrated moving average (arima) and exponential smoothing.

previous research on rice paddy production was carried out by [5], which forecast rice paddy production in gorontalo province using the double moving average method. the forecasts for the next 5 years were 326318.5 tons in 2019, 32094.5 tons in 2020, and so on until 304826.5 tons in 2023.
other research [6] used the double exponential smoothing model to estimate production for the following year. the application of this model gave a predicted rice paddy harvest in the kudus regency in 2019 of 163,435.90 tons. another study [7] used the fuzzy time series model to forecast rice paddy production in southeast sulawesi; the forecast for 2015 was 657768.25191 tons. based on the explanation above, the researchers are interested in analyzing rice paddy production using the arima and exponential smoothing models. the purpose of this study is to predict aceh's local rice paddy production in future periods and to determine which of the two models is better at estimating it. it is hoped that the government can then plan the provision of rice paddy food more precisely and make aceh a national rice paddy production center.

methods

arima process

arima is a time series model that can predict data for a certain period of time based on past data [8]. this model has very good accuracy when used for short-term forecasting. for long-term forecasting, its accuracy is poorer and the forecasts usually tend to be flat (level or constant) over a fairly long period. the arima process, developed by box and jenkins in 1976, does not assume any particular pattern in the historical data being forecast and completely ignores independent variables in model formation [9]. the arima process combines the autoregressive (ar) and moving average (ma) models. it can represent stationary and non-stationary time series [10] [11].
the general form of the arima (p,d,q) model is defined as [12]:

$$Y_t - Y_{t-d} = \gamma_0 + \sum_{i=1}^{p} \alpha_i \left(Y_{t-i} - Y_{t-i-d}\right) + \sum_{i=1}^{q} \beta_i \varepsilon_{t-i} + e_t \tag{1}$$

in practice, the data are commonly non-stationary, so they need to be modified by differencing to produce stationary data.

exponential smoothing model

exponential smoothing is a time series model that performs quite well and is easy to operate. the exponential smoothing model continuously improves the forecast by taking a smoothed average of the past values of a time series, with weights that decrease exponentially [13]. in general, the exponential smoothing model is divided into 3 models, namely single, double, and triple exponential smoothing (the holt-winters model). this research concentrates on triple exponential smoothing. this model is used when the data pattern shows very large differences, trends, and seasonal behavior. to deal with seasonality, a third equation and parameter were developed, giving the "holt-winters" model, named after its inventors. the holt-winters method is based on three equations, one each for the stationary, trend, and seasonal elements [14]. the basic equations for the holt-winters method are as follows [15]:

overall smoothing: $S_t = \alpha \dfrac{x_t}{I_{t-L}} + (1-\alpha)(S_{t-1} + b_{t-1})$
trend smoothing: $b_t = \gamma (S_t - S_{t-1}) + (1-\gamma)\, b_{t-1}$
seasonal smoothing: $I_t = \beta \dfrac{x_t}{S_t} + (1-\beta)\, I_{t-L}$
forecast: $F_{t+m} = (S_t + b_t m)\, I_{t-L+m}$

results and discussion

the data on the amount of rice paddy production were analyzed using the arima box-jenkins method and the exponential smoothing method. the data are rice paddy production in aceh from 1993 to 2020, analyzed using minitab 19.

data characteristics

the plot of total rice paddy production from 1993 to 2020 is shown in figure 1.

figure 1.
time series plot of total rice paddy production in aceh from 1993 to 2020

descriptive statistics show that the lowest rice paddy production, in 2001, was 1,246,614 tons. the highest production occurred in 2017, at 2.49613 million tons, and the average total rice paddy production from 1993 to 2020 is around 1.61315 million tons. based on figure 1, the amount of rice paddy production tends to go up and down. the data do not fluctuate around a constant mean, which indicates that the data are not stationary.

forecasting rice paddy production using the arima model

a. model identification

the initial step in identification is to determine whether the data are stationary in the mean and variance. identification is done using the time series plot, acf plot, pacf plot, and box-cox transformation. the process starts by determining whether the rice paddy production data are stationary in the variance.

figure 2. box-cox plot of total rice paddy production

based on figure 2, the rice paddy production data are not stationary in the variance because the rounded value is less than 1; the data are considered stationary in the variance when the rounded value is 1.

figure 3. box-cox plot of total rice paddy production after transformation

based on figure 3, a rounded value of 1.00 is obtained, so the data can be said to be stationary in the variance. next, differencing is carried out so that the data are stationary in the mean. stationarity in the mean can be checked visually through the acf plot. the following is an acf plot of total rice paddy production.

figure 4.
acf plot of total rice paddy production before differencing

figure 4 shows that rice paddy production is not stationary in the mean, because the autocorrelations in the acf plot decrease slowly across the lags. therefore, differencing is necessary. the time series plot after differencing is shown below.

figure 5. time series plot of rice paddy production after differencing

figure 5 shows that rice paddy production is stationary in the mean after differencing once. the next step is to identify candidate arima models based on the acf and pacf plots. the following are the acf and pacf plots of rice paddy production after differencing at lag 1.

figure 6. acf plot of total rice paddy production data after differencing

figure 7. pacf plot of total rice paddy production data after differencing

based on figures 6 and 7, no lag in the acf plot falls outside the significance limits, while the pacf plot cuts off at the 3rd lag. it can therefore be concluded that the suggested arima (p,d,q) models for rice paddy production are arima (1,1,0), arima (1,1,1), arima (3,1,0), and arima (3,1,1).

b. estimation of parameters and diagnostic tests of residuals

the next step is to estimate the parameters of the tentative models that have been selected. the best model is selected based on the minimum mean square error (mse).

table 2.
suggested arima (p,d,q) models for rice paddy production

| suspected model | parameter | coef | se coef | mean square error |
|---|---|---|---|---|
| (1,1,0) | $\phi_1$ | -0.354 | 0.183 | 4.32646e+10 |
| (1,1,1) | $\phi_1$ | -1.028 | 0.276 | 3.95604e+10 |
| | $\theta_1$ | -0.853 | 0.465 | |
| (3,1,0) | $\phi_1$ | -0.171 | 0.179 | 3.36785e+10 |
| | $\phi_2$ | -0.091 | 0.189 | |
| | $\phi_3$ | -0.800 | 0.268 | |
| (3,1,1) | $\phi_1$ | -0.396 | 0.339 | 3.34041e+10 |
| | $\phi_2$ | -0.187 | 0.231 | |
| | $\phi_3$ | -0.812 | 0.290 | |
| | $\theta_1$ | -0.351 | 0.411 | |

based on table 2, the best arima model for rice paddy production is arima (3,1,1), with mse = 3.34041e+10. it is therefore carried forward to the next step, namely testing the assumptions that the residuals are white noise and normally distributed. the white noise assumption is tested using the box-ljung test statistic with the following hypotheses:

$H_0$: the residuals are white noise
$H_1$: the residuals are not white noise

if the significance level is set at 5%, $H_0$ is rejected if $Q > \chi^2_{\alpha,\,df}$ with $df = k - p - q$, where $k$ is the lag, or equivalently if the p-value is less than $\alpha$.

table 3. results of the box-ljung statistic for residuals of the arima (3,1,1) model

| model | lag | q | df | p-value |
|---|---|---|---|---|
| (3,1,1) | 12 | 3.51 | 8 | 0.898 |
| | 24 | 9.41 | 20 | 0.978 |
| | 36 | * | * | * |
| | 48 | * | * | * |

based on table 3, the p-values for the arima (3,1,1) model are greater than the level of significance, so we do not reject the null hypothesis. this indicates that the residuals of the model are white noise at the 5% level of significance. the assumption that the residuals are normally distributed is tested using the kolmogorov-smirnov test, with the following hypotheses:

$H_0$: the residuals are normally distributed
$H_1$: the residuals are not normally distributed

figure 8. probability plot of residuals of arima (3,1,1)

based on figure 8, the kolmogorov-smirnov result for the residuals of arima (3,1,1) was 0.141; at the 0.05 level of significance, this indicates that the residuals of the model are normally distributed.
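as a rough check of table 3, the box-ljung statistic and its p-value can be sketched in plain python. the function names below are illustrative assumptions and not minitab routines, and the closed-form chi-square tail used here is valid only for an even number of degrees of freedom, which covers df = 8 and df = 20 in table 3.

```python
import math


def ljung_box_q(residuals, max_lag):
    """Box-Ljung Q statistic: n(n+2) * sum_{k=1..h} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        # lag-k sample autocorrelation of the residuals
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def residuals_look_like_white_noise(q_stat, lag, ar_order, ma_order, alpha=0.05):
    """Do not reject H0 (white noise) when the p-value exceeds alpha."""
    df = lag - ar_order - ma_order
    return chi2_upper_tail_even_df(q_stat, df) > alpha


# Q values reported in table 3 for ARIMA(3,1,1): p = 3, q = 1
p_lag12 = chi2_upper_tail_even_df(3.51, 12 - 3 - 1)  # df = 8
p_lag24 = chi2_upper_tail_even_df(9.41, 24 - 3 - 1)  # df = 20
print(round(p_lag12, 3), round(p_lag24, 3))
```

under these assumptions, the two p-values agree with the 0.898 and 0.978 reported in table 3 to three decimals, and both exceed 0.05, so the white noise hypothesis is not rejected.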
the diagnostic tests therefore indicate that the arima (3,1,1) model is appropriate for rice paddy production.

table 6. rice paddy production forecasting results using the arima (3,1,1) model

| year | rice paddy production forecasting results |
|---|---|
| 2021 | 2453401 |
| 2022 | 2154784 |
| 2023 | 2111594 |
| 2024 | 1615171 |
| 2025 | 2062436 |

estimation using exponential smoothing model

the plot of the rice paddy data shows that rice paddy production fluctuates every year. the graph also shows that, starting from 1993, rice paddy production continued to increase substantially until 2013. however, in 2018 rice paddy production fell sharply, to 1,697,756 tons. the trend element in rice paddy production (tons) can be seen from the acf (autocorrelation function) plot shown below.

figure 9. acf plot of rice paddy production

based on figure 9, it can be concluded that the data have a trend element because the autocorrelations decrease slowly towards 0. the trend element can also be seen from the trend analysis plot below.

figure 10. trend analysis plot of rice paddy production

the trend analysis plot in figure 10 shows that the data have a trend element: the fitted line increases linearly. when the fitted line increases or decreases linearly, the data have a trend element. the presence of seasonal elements in rice paddy production (tons) can be seen from the pacf (partial autocorrelation function) plot shown in figure 11.

figure 11. pacf plot of rice paddy production

in figure 11, the data show a seasonal element because the lag movement repeats.
the partial autocorrelation increases at lag 2 and lag 4, and decreases relative to the previous lag at lag 3, lag 5, and lag 6. based on the results of examining the trend and seasonal elements in the rice paddy production data, the appropriate exponential smoothing method is winter exponential smoothing.

a. determining the smoothing constant values $\alpha$, $\gamma$, and $\beta$

the smoothing constants used in the winter exponential smoothing model are $\alpha$, $\gamma$, and $\beta$. the optimal constants are selected based on the smallest mape value. the following are the mape values for some of the best smoothing constant values.

table 7. values of constants and mape

| no | $\alpha$ | $\gamma$ | $\beta$ | mape (%) |
|---|---|---|---|---|
| 1 | 0.8 | 0.9 | 0.1 | 8.25 |
| 2 | 0.8 | 0.8 | 0.1 | 8.26 |
| 3 | 0.7 | 0.7 | 0.1 | 8.31 |
| 4 | 0.7 | 0.8 | 0.1 | 8.23 |
| 5 | 0.7 | 0.7 | 0.1 | 8.27 |
| 6 | 0.7 | 0.9 | 0.1 | 8.20 |
| 7 | 0.7 | 0.6 | 0.1 | 8.31 |
| 8 | 0.8 | 0.6 | 0.1 | 8.36 |
| 9 | 0.7 | 0.5 | 0.1 | 8.34 |
| 10 | 0.9 | 0.1 | 0.1 | 8.27 |

table 7 shows that the smoothing constants with the smallest mape value are $\alpha = 0.7$, $\gamma = 0.9$, $\beta = 0.1$, with a mape of 8.20%. these constants are used in the winter exponential smoothing model to obtain forecasts for future periods. the forecasting results for rice paddy production can be seen in table 8.

table 8. estimation of rice paddy production in the next 5 years using the winter exponential smoothing method

| year | rice paddy production forecasting results |
|---|---|
| 2021 | 1625925 |
| 2022 | 1645196 |
| 2023 | 1687667 |
| 2024 | 1605530 |
| 2025 | 1555213 |

the estimation results show that rice paddy production increases by an average of 1.88% per year from 2021 to 2023. however, it decreases by an average of 4% per year from 2023 to 2025. figure 12 shows the plot of the actual data and the forecasts using the winter exponential smoothing method.
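the four holt-winters equations given in the methods section, together with the mape criterion used to choose the constants, can be sketched as follows. this is a minimal illustration and not the minitab procedure used in the study: the initial level, trend, and seasonal indices are simple assumptions, and `season_length` is a hypothetical parameter because the season length is not stated in the text.

```python
def holt_winters_multiplicative(x, season_length, alpha, gamma, beta, horizon):
    """Multiplicative Holt-Winters forecasts.

    alpha smooths the level S_t, gamma the trend b_t, beta the seasonal index I_t.
    Assumed initial values: the first season sets the level and the seasonal
    indices; the first two seasons set the trend.
    """
    L = season_length
    if len(x) < 2 * L:
        raise ValueError("need at least two full seasons of data")

    first_mean = sum(x[:L]) / L
    second_mean = sum(x[L:2 * L]) / L
    level = first_mean
    trend = (second_mean - first_mean) / L
    seasonal = [x[i] / first_mean for i in range(L)]  # I_0 .. I_{L-1}

    for t in range(L, len(x)):
        prev_level = level
        level = alpha * x[t] / seasonal[t - L] + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        seasonal.append(beta * x[t] / level + (1 - beta) * seasonal[t - L])

    n = len(x)
    forecasts = []
    for m in range(1, horizon + 1):
        # reuse the most recent full season of indices when m exceeds L
        idx = n - L + (m - 1) % L
        forecasts.append((level + trend * m) * seasonal[idx])
    return forecasts


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# toy series with a pure two-period seasonal pattern and no trend
toy = [10, 20, 10, 20, 10, 20, 10, 20]
toy_forecast = holt_winters_multiplicative(toy, 2, alpha=0.7, gamma=0.9, beta=0.1, horizon=4)
print([round(f, 6) for f in toy_forecast])
```

in practice, a grid of candidate constants such as those in table 7 would be passed through a routine of this kind and the combination with the smallest in-sample mape retained.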
based on figure 12, the mape value for forecasting rice paddy production is 7.74%, which is less than 10%. this means that the average percentage error between the actual values and the forecasts of the winter exponential smoothing method is small. therefore, it is fair to say that the winter exponential smoothing method provides good forecasts of rice paddy production in aceh.

figure 12. plot between actual data and estimated data using winter method

comparison of estimated results of rice paddy production using the arima and winter exponential smoothing models

the estimation results for rice paddy production in the next 5 years using the arima and winter exponential smoothing models are compared in the following table:

table 9. comparison of estimated results of rice paddy production using the arima (3,1,1) and winter exponential smoothing models

| year | arima (3,1,1) | winter exponential smoothing |
|---|---|---|
| 2021 | 2453401 | 1625925 |
| 2022 | 2154784 | 1645196 |
| 2023 | 2111594 | 1687667 |
| 2024 | 1615171 | 1605530 |
| 2025 | 2062436 | 1555213 |
| mse/mad | 3.34041e+10 | 3.08616e+10 |

figure 13. comparison graph of rice paddy production estimation results using the arima (3,1,1) and winter exponential smoothing models

based on table 9 and figure 13, the estimation results of rice paddy production using the arima model have increased from 2021 to 2022 with an increased rate of 0.84%. however, the estimated production results from 2022 to 2023 have decreased with a decline rate of 5.64%. furthermore, the results of the estimated rice paddy production have increased again by 2.14% from 2023 to 2024. in 2025, the estimated results of rice paddy have decreased by 0.75%.
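year-over-year rates of change can be recomputed directly from the forecasts in table 9. the sketch below uses only those tabulated values; the helper name `percent_changes` is an illustrative assumption. for the winter exponential smoothing column it gives average changes of about +1.88% (2021 to 2023) and about -4.00% (2023 to 2025), and the arima column can be checked in the same way.

```python
def percent_changes(values):
    """Year-over-year percentage change between consecutive values."""
    return [100.0 * (b - a) / a for a, b in zip(values, values[1:])]


years = [2021, 2022, 2023, 2024, 2025]
arima = [2453401, 2154784, 2111594, 1615171, 2062436]    # table 9, arima (3,1,1)
winter = [1625925, 1645196, 1687667, 1605530, 1555213]   # table 9, winter method

arima_changes = percent_changes(arima)
winter_changes = percent_changes(winter)

# averages over the two sub-periods discussed in the text
avg_rise_2021_2023 = sum(winter_changes[:2]) / 2
avg_fall_2023_2025 = sum(winter_changes[2:]) / 2

for label, series in (("arima (3,1,1)", arima_changes), ("winter", winter_changes)):
    for (start, end), change in zip(zip(years, years[1:]), series):
        print(f"{label}: {start}->{end}: {change:+.2f}%")
```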
the estimation results using the winter exponential smoothing model show that rice paddy production increases at an average rate of 1.88% from 2021 to 2023 and decreases at an average rate of 4% from 2023 to 2025. for testing the estimation results, the arima (3,1,1) model produces an mse/mad value of $3.34041 \times 10^{10}$, while the winter exponential smoothing model produces an mse/mad value of $3.08616 \times 10^{10}$. the best model is the one with the smallest mse/mad value. therefore, it can be concluded that the winter exponential smoothing model is the best model for explaining the actual data pattern and is used to estimate rice paddy production in aceh.

conclusions

the estimates of rice paddy production in aceh for the next five years using the arima (3,1,1) model are, respectively, 2453401; 2154784; 2111594; 1615171; and 2062436, with an mse/mad of $3.34041 \times 10^{10}$. the estimates using the winter exponential smoothing model are, respectively, 1625925; 1645196; 1687667; 1605530; and 1555213, with an mse/mad of $3.08616 \times 10^{10}$. therefore, it can be concluded that the winter exponential smoothing model is the best model for explaining the actual data pattern and is used to estimate rice paddy production in aceh.

references

[1] r. biswas, “study on arima modelling to forecast area and production of kharif rice in west bengal,” in cutting-edge research in agricultural sciences vol. 12, 2021.
[2] n. fitri, “analisis faktor-faktor yang mempengarui produksi padi di provinsi aceh,” j. ilmu ekon., vol. 3, no. 1, pp. 81–95, 2015.
[3] m. setyowati, j. irawan, and l. marlina, “karakter agronomi beberapa padi lokal aceh,” j. agrotek lestari, vol. 5, no. 1, pp. 36–50, 2018.
[4] bps, provinsi aceh dalam angka (aceh province in figures) 2021. aceh: badan pusat statistik aceh, 2021.
[5] h. a. yusuf, i. djakaria, and resmawan, “penerapan metode double moving average untuk meramalkan hasil,” j. mat. dan apl., vol. 9, no. 2, pp. 92–96, 2020.
[6] m. n. fawaiq et al., “prediksi hasil pertanian padi di kabupaten kudus dengan metode brown’s double exponential smoothing,” jipi (jurnal penelit. dan pembelajaran inform.), vol. 4, no. 2, pp. 78–87, 2019.
[7] djafar, m. s. ihsan, and y. purnamasari, “peramalan jumlah produksi padi di sulawesi tenggara menggunakan metode fuzzy time series,” semantik, vol. 3, no. 2, pp. 113–120, 2017.
[8] c. madhavi latha, k. siva nageswararao, d. venkataramanaiah, r. scholar, and a. professor, “forecasting time series stock returns using arima: evidence from s&p bse sensex,” int. j. pure appl. math., 2018.
[9] y. wigati et al., “pemodelan times series dengan proses arima untuk prediksi indeks harga konsumen (ihk) di palu-sulawesi tengah,” j. ilm. mat. dan terap., vol. 12, no. 2, 2016.
[10] n. a. bakar and s. rosbi, “autoregressive integrated moving average (arima) model for forecasting cryptocurrency exchange rate in high volatility environment: a new insight of bitcoin transaction,” int. j. adv. eng. res. sci., 2017.
[11] d. p. singh, p. kumar, and k. prabakaran, “application of arima model for forecasting paddy production in bastar division of chhattisgarh,” am. int. j. res. sci. technol. eng. math., vol. 14, no. 43, pp. 82–87, 2013.
[12] r. h. shumway and d. s. stoffer, time series analysis and its applications. usa: springer, 2017.
[13] amelia et al., “forecasting annual coffee and rubber production in aceh using exponential smoothing,” in regular proceeding 3rd isimmed, 2019, pp. 3–10.
[14] e. prasetyowati, n. r. imron rosyadi, and matsaini, “estimated profits of rengginang lorjuk madura by used comparison of holt-winter and moving average,” 2020, doi: 10.11591/eecsi.v7.2031.
[15] r. j. hyndman, forecasting: principles & practice. australia: university of western australia, 2014.

bayesian generalized self method to estimate scale parameter of inverse rayleigh distribution

cauchy - jurnal matematika murni dan aplikasi, volume 6(4) (2021), pages 270-278
p-issn: 2086-0382; e-issn: 2477-3344
submitted: january 22, 2021; reviewed: april 24, 2021; accepted: april 30, 2021
doi: http://dx.doi.org/10.18860/ca.v6i4.11482

ferra yanuar1, rahmi febriyuni2, izzati rahmi hg3
1,2,3 mathematics department, faculty of mathematics and natural sciences, universitas andalas, padang
email: ferrayanuar@sci.unand.ac.id, raahmifebriyuni@gmail.com, izzatirahmihg@sci.unand.ac.id

abstract

the purposes of this study are to estimate the scale parameter of the inverse rayleigh distribution using mle and the bayesian generalized squared error loss function (self). the posterior distribution is derived under two types of prior, namely jeffrey’s prior and the exponential distribution. the proposed methods are then applied to real data. several model selection criteria are considered in order to identify the method that yields the most suitable parameter estimate. this study found that bayesian generalized self under jeffrey’s prior yielded better estimates than mle and bayesian generalized self under the exponential distribution.

keywords: bayesian generalized self; exponential distribution; inverse rayleigh; jeffrey’s prior; mle.
introduction

the rayleigh distribution is a special form of the weibull distribution, while the inverse rayleigh distribution is a special form of the inverse weibull distribution. the inverse rayleigh distribution is a very useful lifetime model that can be used for analyzing infant mortality, survival analysis, reliability and quality control. the probability density function (pdf) of the inverse rayleigh distribution with scale parameter $\theta$ is defined as follows [1]:

$$f(x;\theta) = \frac{2\theta}{x^{3}} \exp\left(-\frac{\theta}{x^{2}}\right), \quad \theta > 0,\ x > 0. \tag{1}$$

the cumulative distribution function (cdf) of the inverse rayleigh distribution is given by

$$F(x;\theta) = \exp\left(-\frac{\theta}{x^{2}}\right), \quad \theta > 0,\ x > 0. \tag{2}$$

here $\theta$ is the scale parameter. the instantaneous failure rate of the inverse rayleigh distribution shows both increasing and decreasing patterns for lifetime data. a significant amount of work has been done on the inverse rayleigh distribution model in the classical framework but not much in a bayesian setup, especially with the bayesian generalized self (squared error loss function). several studies have used the inverse rayleigh distribution in various settings. soliman and al-aboud [2] used classical and bayesian methods for parameter estimation based on a set of upper record values from the rayleigh distribution. aslam and jun [3] derived an acceptance sampling plan from a truncated life test in which multiple items in a group could be tested simultaneously by a tester when the lifetime of an item followed either an inverse rayleigh or a log-logistic distribution. soliman et al. [4] discussed parameter estimation for an inverse rayleigh distribution based on lower record values.
they implemented a maximum likelihood (ml) estimator of the unknown parameter and a bayesian analysis with an informative prior to derive the estimators and the predictive intervals. ali [5] explored the modeling of the heterogeneity existing in lifetime processes using a mixture of inverse rayleigh distributions, focusing on bayesian inference for the mixture model using non-informative (the jeffreys and the uniform) and informative (gamma) priors. dey & dey [6] derived bayesian estimators of the scale parameter and reliability function of an inverse rayleigh distribution. yousef & lafta [7] explored how to estimate the scale parameter of the inverse rayleigh distribution using different methods, such as the maximum likelihood estimator and the method of moments. dey [8] obtained bayesian estimates for an inverse rayleigh distribution using squared error and linex loss functions. meanwhile, rasheed [9] designed some bayesian estimators for the scale parameter and reliability function of the inverse rayleigh distribution under the generalized squared error loss function (self). in the present study, we consider the estimation of the unknown parameter of an inverse rayleigh distribution. the aim of this study is to estimate the scale parameter of the inverse rayleigh distribution using the frequentist method (mle) and the bayesian approach, both of which are applied to empirical data. the bayesian approach here is bayesian generalized self under two types of priors, namely jeffrey’s prior and the exponential prior. the better-performing estimation method is determined by the smallest values of the akaike information criterion (aic), the corrected akaike information criterion (aicc) and the bayesian information criterion (bic).
the remainder of this study is organized as follows: the maximum likelihood estimation (mle), bayesian generalized self, jeffrey’s prior, the exponential prior, and the criteria for the goodness of fit of the parameter estimation methods are presented in section 2. estimation using bayesian generalized self under the two types of priors and the application of the proposed methods to real data are discussed in section 3. finally, section 4 provides some concluding remarks.

methods

in this section, we describe all the methods implemented in this study: first the maximum likelihood estimation method, followed by the bayesian approach and the criteria for model selection.

maximum likelihood estimation

in this section, we derive the classical estimator of the scale parameter of the inverse rayleigh distribution, namely the maximum likelihood estimator. let $X_1, X_2, \ldots, X_n$ be a sequence of i.i.d. random variables from the inverse rayleigh distribution with scale parameter $\theta$, written as $X \sim IRD(\theta)$, with probability density function $f(x;\theta)$ as given in eq. (1). the likelihood function is then formulated as follows [10]:

$$L(\theta) = \prod_{i=1}^{n} f(x_i, \theta) = 2^{n}\, \theta^{n} \left(\prod_{i=1}^{n} \frac{1}{x_i^{3}}\right) \exp\left(-\theta \sum_{i=1}^{n} \frac{1}{x_i^{2}}\right) \tag{3}$$

the estimate of $\theta$ is obtained by maximizing eq. (3). taking logarithms gives $\ln L(\theta) = n \ln 2 + n \ln \theta - 3\sum_{i=1}^{n} \ln x_i - \theta \sum_{i=1}^{n} 1/x_i^{2}$, and setting the derivative $n/\theta - \sum_{i=1}^{n} 1/x_i^{2}$ equal to zero yields:

$$\hat{\theta}_{MLE} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i^{2}}} = \frac{n}{T}, \quad \text{where } T = \sum_{i=1}^{n} \frac{1}{x_i^{2}} \tag{4}$$

bayesian generalized self method

this section deals with obtaining bayesian estimators of the scale parameter $\theta$ of the inverse rayleigh distribution. the bayes method is a parameter estimation method based on bayes' theorem. its basic concepts are as follows. suppose that $X_1, X_2, \ldots, X_n$ is a random sample from the distribution $f(\mathbf{x};\theta)$, where $\theta$ is the parameter of the distribution.
estimation of the parameter $\theta$ is based on the random sample $X_1, X_2, \ldots, X_n$. the bayes method combines the information obtained from the sample (objective knowledge), expressed through the likelihood function, with prior information about the distribution of the parameter being estimated [11], [12]. multiplying the likelihood function by the prior distribution gives the posterior distribution. in other words, the posterior distribution is the conditional probability density function of the parameter $\theta$ given the observations $\mathbf{x} = (X_1, X_2, \ldots, X_n)$. the posterior distribution is defined by the following formula [13]:

$$f(\theta \mid \mathbf{x}) = \frac{f(\mathbf{x} \mid \theta)\, f(\theta)}{\int f(\mathbf{x} \mid \theta)\, f(\theta)\, d\theta} = \frac{f(\mathbf{x}, \theta)}{f(\mathbf{x})} \tag{5}$$

the estimator of the scale parameter $\theta$ under the bayes generalized self method is based on the following loss function [9]:

$$L(\hat{\theta}; \theta) = \sum_{j=0}^{k} \alpha_j \theta^{j} (\hat{\theta} - \theta)^{2} = \left(\alpha_0 + \alpha_1 \theta + \cdots + \alpha_k \theta^{k}\right) (\hat{\theta} - \theta)^{2} \tag{6}$$

the estimator of $\theta$ is obtained by minimizing the expected loss $E[L(\hat{\theta}; \theta)]$. this expectation is taken with respect to the posterior density of $\theta$, denoted here by $h(\theta \mid x)$. thus, the expected loss under the bayes generalized self is as follows:

$$E[L(\hat{\theta}; \theta)] = \int_{0}^{\infty} L(\hat{\theta}; \theta)\, h(\theta \mid x)\, d\theta$$

$$\begin{aligned} E[L(\hat{\theta}; \theta)] ={}& \alpha_0 \hat{\theta}^{2} - 2\alpha_0 \hat{\theta}\, E(\theta \mid x) + \alpha_0 E(\theta^{2} \mid x) + \alpha_1 \hat{\theta}^{2} E(\theta \mid x) - 2\alpha_1 \hat{\theta}\, E(\theta^{2} \mid x) + \alpha_1 E(\theta^{3} \mid x) \\ &+ \cdots + \alpha_k \hat{\theta}^{2} E(\theta^{k} \mid x) - 2\alpha_k \hat{\theta}\, E(\theta^{k+1} \mid x) + \alpha_k E(\theta^{k+2} \mid x). \end{aligned} \tag{7}$$

to obtain the estimate of $\theta$ under the bayesian generalized self method, eq. (7) is differentiated with respect to $\hat{\theta}$ and set equal to zero, so that:

$$\hat{\theta}_{B} = \frac{\alpha_0 E(\theta \mid x) + \alpha_1 E(\theta^{2} \mid x) + \cdots + \alpha_k E(\theta^{k+1} \mid x)}{\alpha_0 + \alpha_1 E(\theta \mid x) + \cdots + \alpha_k E(\theta^{k} \mid x)} \tag{8}$$

jeffreys’ prior as non-informative prior

the most widely used noninformative prior in bayesian analysis is jeffreys’ prior.
this method is also attractive because it is proper under mild conditions and requires no elicitation of hyperparameters. jeffreys’ rule is derived from the likelihood function and takes the prior distribution to be the square root of the determinant of the fisher information matrix, denoted by $f(\theta) \propto \sqrt{I(\theta)}$. fisher's information for the parameter $\theta$ is defined as [13], [14]

$$I(\theta) = -n\, E\left(\frac{\partial^{2} \ln f(x_i; \theta)}{\partial \theta^{2}}\right)$$

let $b$ be a constant; thus:

$$f(\theta) \propto \sqrt{I(\theta)} = b \sqrt{-n\, E\left(\frac{\partial^{2} \ln f(x_i; \theta)}{\partial \theta^{2}}\right)} \tag{9}$$

for the inverse rayleigh distribution, $\dfrac{\partial^{2} \ln f(x_i; \theta)}{\partial \theta^{2}} = -\dfrac{1}{\theta^{2}}$. thus,

$$E\left(\frac{\partial^{2} \ln f(x_i; \theta)}{\partial \theta^{2}}\right) = -\frac{1}{\theta^{2}} \tag{10}$$

substituting eq. (10) into eq. (9) gives

$$f(\theta) = \frac{b}{\theta} \sqrt{n}, \quad \theta > 0 \tag{11}$$

combining this jeffrey’s prior with the likelihood function yields the following posterior distribution:

$$h_1(\theta \mid x_1, x_2, \ldots, x_n) = \frac{T^{n}\, \theta^{n-1} \exp(-\theta T)}{\Gamma(n)} \tag{12}$$

the posterior distribution in eq. (12) has the same form as a gamma distribution with scale parameter $\frac{1}{T}$ and shape parameter $n$, written as $\theta \mid x \sim Gamma\left(\frac{1}{T}, n\right)$.

exponential distribution as conjugate prior

we also derive the parameter estimate of $\theta$ based on bayesian generalized self with the exponential distribution as prior. the probability density function of a random variable $\theta$ that has an exponential distribution with scale parameter $\lambda$, written as $\theta \sim Exp(\lambda)$, is formulated as follows:

$$g(\theta) = \frac{1}{\lambda} \exp\left(-\frac{\theta}{\lambda}\right), \quad \theta > 0,\ \lambda > 0 \tag{13}$$

eq. (13) is then combined with the likelihood function in eq. (3) to give the posterior distribution:

$$h_2(\theta \mid x_1, x_2, \ldots, x_n) = \frac{\left(T + \frac{1}{\lambda}\right)^{n+1} \theta^{n} \exp\left(-\theta \left(T + \frac{1}{\lambda}\right)\right)}{\Gamma(n+1)} \tag{14}$$

this posterior distribution has the same form as a gamma distribution with scale parameter $\frac{1}{T + \frac{1}{\lambda}}$ and shape parameter $n+1$, written as $\theta \mid x \sim Gamma\left(\frac{1}{T + \frac{1}{\lambda}}, n+1\right)$ or $\theta \mid x \sim Gamma\left(\frac{1}{P}, n+1\right)$ with $P = T + \frac{1}{\lambda}$.
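because both posteriors are gamma distributions, the posterior moments in eq. (8) have a closed form, $E(\theta^{m} \mid x) = \Gamma(a+m) / (\Gamma(a)\, r^{m})$ for shape $a$ and rate $r$ (the reciprocal of the scale). the sketch below evaluates eq. (8) under that formula; the function names are illustrative assumptions, and the numerical inputs ($n = 72$, $T = 138.99$, $\lambda = 0.8$ and the $\alpha_j$ values) are the ones reported for the flood data in the results section.

```python
def gamma_posterior_moment(shape, rate, m):
    """E(theta^m | x) for a Gamma posterior with the given shape and rate.

    Equals Gamma(shape + m) / (Gamma(shape) * rate**m), written as a product
    so that no large gamma-function values need to be evaluated.
    """
    moment = 1.0
    for i in range(m):
        moment *= (shape + i) / rate
    return moment


def generalized_self_estimate(alphas, shape, rate):
    """Bayes estimate under generalized SELF, following eq. (8)."""
    numerator = sum(a * gamma_posterior_moment(shape, rate, j + 1)
                    for j, a in enumerate(alphas))
    denominator = sum(a * gamma_posterior_moment(shape, rate, j)
                      for j, a in enumerate(alphas))
    return numerator / denominator


n, T, lam = 72, 138.99, 0.8       # values reported for the flood data
alphas = [100, 50, 10, 8, 7]      # alpha_0 .. alpha_4 used in the study

# Jeffreys prior: shape n, rate T. Exponential prior: shape n + 1, rate T + 1/lam.
theta_j1 = generalized_self_estimate(alphas[:2], n, T)
theta_j2 = generalized_self_estimate(alphas[:3], n, T)
theta_e1 = generalized_self_estimate(alphas[:2], n + 1, T + 1 / lam)
print(round(theta_j1, 4), round(theta_j2, 4), round(theta_e1, 4))
```

with these inputs the sketch gives values close to 0.5195, 0.5198 and 0.5220, in line with the first- and second-polynomial estimates reported in table 1.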
criteria model selection

the akaike information criterion (aic), which is widely used for statistical inference, is an estimator of out-of-sample prediction error and thereby of the relative quality of statistical models for a given set of data [15]. given several models for the data, aic estimates the quality of each model relative to each of the other models. thus, aic provides a means for model selection. when a statistical model is used to represent the process that generated the data, the representation will almost never be exact, so some information will be lost by using the model to represent the process. aic estimates the relative amount of information lost by a given model: the less information a model loses, the higher the quality of that model. in estimating the amount of information lost by a model, aic deals with the trade-off between the goodness of fit of the model and the simplicity of the model. in other words, aic deals with both the risk of overfitting and the risk of underfitting. suppose that we have a statistical model of some data. let $k$ be the number of estimated parameters in the model. let $\hat{L}$ be the maximum value of the likelihood function for the model. then the aic value of the model is the following [15]:

$$AIC = 2k - 2\ln(\hat{L})$$

when $\frac{n}{k} < 40$, with $n$ the number of data points, it is suggested to use the corrected akaike information criterion (aicc):

$$AIC_c = AIC + \frac{2k(k+1)}{n - k - 1}$$

another method for estimating the quality of each model relative to the other models is the bayesian information criterion (bic), which is given by:

$$BIC = k \ln(n) - 2\ln(\hat{L})$$

given a set of candidate models for the data, the preferred model is the one with the minimum aic, aicc and bic values.

results and discussion

in this study, we employ the bayesian generalized self under a non-informative prior, namely jeffreys' prior, to estimate the scale parameter of the inverse rayleigh distribution.
we also consider the bayesian self under an informative prior, namely the exponential distribution. both methods, as well as mle, are then applied to the empirical data. the most suitable method is determined based on the smallest values of aic, aicc and bic.

bayesian generalized self under jeffrey’s prior

based on eq. (12), $\theta \mid x \sim Gamma\left(\frac{1}{T}, n\right)$. it follows that $E(\theta \mid x) = \frac{n}{T}$ and, in general, $E(\theta^{m} \mid x) = \dfrac{\Gamma(n+m)}{\Gamma(n)\, T^{m}}$. these expected values are then substituted into eq. (8) to derive the formula for estimating $\theta$ under jeffrey’s prior, denoted here by $\hat{\theta}_{J}$:

$$\hat{\theta}_{J} = \frac{\alpha_0 \left(\frac{n}{T}\right) + \alpha_1 \left(\frac{(n+1)n}{T^{2}}\right) + \cdots + \alpha_k \left(\frac{(n+k)(n+k-1)\cdots(n+1)n}{T^{k+1}}\right)}{\alpha_0 + \alpha_1 \left(\frac{n}{T}\right) + \cdots + \alpha_k \left(\frac{(n+k-1)(n+k-2)\cdots(n+1)n}{T^{k}}\right)}$$

in this study, we apply the first to fourth polynomials to estimate $\theta$:

$$\hat{\theta}_{J1} = \frac{\alpha_0 \left(\frac{n}{T}\right) + \alpha_1 \left(\frac{(n+1)n}{T^{2}}\right)}{\alpha_0 + \alpha_1 \left(\frac{n}{T}\right)} \tag{15}$$

$$\hat{\theta}_{J2} = \frac{\alpha_0 \left(\frac{n}{T}\right) + \alpha_1 \left(\frac{(n+1)n}{T^{2}}\right) + \alpha_2 \left(\frac{(n+2)(n+1)n}{T^{3}}\right)}{\alpha_0 + \alpha_1 \left(\frac{n}{T}\right) + \alpha_2 \left(\frac{(n+1)n}{T^{2}}\right)} \tag{16}$$

$$\hat{\theta}_{J3} = \frac{\alpha_0 \left(\frac{n}{T}\right) + \alpha_1 \left(\frac{(n+1)n}{T^{2}}\right) + \cdots + \alpha_3 \left(\frac{(n+3)(n+2)(n+1)n}{T^{4}}\right)}{\alpha_0 + \alpha_1 \left(\frac{n}{T}\right) + \cdots + \alpha_3 \left(\frac{(n+2)(n+1)n}{T^{3}}\right)} \tag{17}$$

$$\hat{\theta}_{J4} = \frac{\alpha_0 \left(\frac{n}{T}\right) + \alpha_1 \left(\frac{(n+1)n}{T^{2}}\right) + \cdots + \alpha_4 \left(\frac{(n+4)(n+3)\cdots(n+1)n}{T^{5}}\right)}{\alpha_0 + \alpha_1 \left(\frac{n}{T}\right) + \cdots + \alpha_4 \left(\frac{(n+3)(n+2)(n+1)n}{T^{4}}\right)} \tag{18}$$

bayesian generalized self under exponential distribution

it has been shown that $\theta \mid x \sim Gamma\left(\frac{1}{T + \frac{1}{\lambda}}, n+1\right)$ or $\theta \mid x \sim Gamma\left(\frac{1}{P}, n+1\right)$ with $P = T + \frac{1}{\lambda}$.
Then, it can be proved that

$$E(\theta^m \mid x) = \frac{\Gamma(n+1+m)}{\Gamma(n+1)\,P^m}. \tag{19}$$

By substituting $m = 1, 2, \ldots, k$ into Eq. (19), we derive the estimator of the scale parameter $\theta$ under the exponential prior, denoted here by $\hat{\theta}_E$:

$$\hat{\theta}_E = \frac{\alpha_0\left(\frac{n+1}{P}\right) + \alpha_1\left(\frac{(n+2)(n+1)}{P^2}\right) + \cdots + \alpha_k\left(\frac{(n+k+1)\cdots(n+1)}{P^{k+1}}\right)}{\alpha_0 + \alpha_1\left(\frac{n+1}{P}\right) + \cdots + \alpha_k\left(\frac{(n+k)(n+k-1)\cdots(n+1)}{P^{k}}\right)}. \tag{20}$$

In this study, we use the first-degree through fourth-degree polynomials based on Eq. (20) to estimate $\theta$:

$$\hat{\theta}_{E1} = \frac{\alpha_0\left(\frac{n+1}{P}\right) + \alpha_1\left(\frac{(n+2)(n+1)}{P^2}\right)}{\alpha_0 + \alpha_1\left(\frac{n+1}{P}\right)} \tag{21}$$

$$\hat{\theta}_{E2} = \frac{\alpha_0\left(\frac{n+1}{P}\right) + \alpha_1\left(\frac{(n+2)(n+1)}{P^2}\right) + \alpha_2\left(\frac{(n+3)(n+2)(n+1)}{P^3}\right)}{\alpha_0 + \alpha_1\left(\frac{n+1}{P}\right) + \alpha_2\left(\frac{(n+2)(n+1)}{P^2}\right)} \tag{22}$$

$$\hat{\theta}_{E3} = \frac{\alpha_0\left(\frac{n+1}{P}\right) + \alpha_1\left(\frac{(n+2)(n+1)}{P^2}\right) + \cdots + \alpha_3\left(\frac{(n+4)\cdots(n+1)}{P^4}\right)}{\alpha_0 + \alpha_1\left(\frac{n+1}{P}\right) + \cdots + \alpha_3\left(\frac{(n+3)(n+2)(n+1)}{P^3}\right)} \tag{23}$$

$$\hat{\theta}_{E4} = \frac{\alpha_0\left(\frac{n+1}{P}\right) + \alpha_1\left(\frac{(n+2)(n+1)}{P^2}\right) + \cdots + \alpha_4\left(\frac{(n+5)\cdots(n+1)}{P^5}\right)}{\alpha_0 + \alpha_1\left(\frac{n+1}{P}\right) + \cdots + \alpha_4\left(\frac{(n+4)\cdots(n+1)}{P^4}\right)} \tag{24}$$

Implementation of proposed method to real data

The results of the analytical study above are then applied to real data. The real data set represents the 72 exceedances for the years 1958–1984 (rounded to one decimal place) of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada [16]. The data are as follows:

1.7 2.2 14.4 1.1 0.4 20.6 5.3 0.7 1.9 13.0 12.0 9.3 1.4 18.7 8.5 25.5 11.6 14.1 22.1 1.1 2.5 14.4 1.7 37.6 0.6 2.2 39.0 0.3 15.0 11.0 7.3 22.9 1.7 0.1 1.1 0.6 9.0 1.7 7.0 20.1 0.4 2.8 14.1 9.9 10.4 10.7 30.0 3.6 5.6 30.8 13.3 4.2 25.5 3.4 11.9 21.5 27.6 36.4 2.7 64.0 1.5 2.5 27.4 1.0 27.1 20.2 16.8 5.3 9.7 27.5 2.5 27.0

In this study, we fix $\alpha_0 = 100$, $\alpha_1 = 50$, $\alpha_2 = 10$, $\alpha_3 = 8$, $\alpha_4 = 7$ and $\lambda = 0.8$ to estimate the scale parameter $\theta$.
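The estimators in Eqs. (15)–(24) and the selection criteria can be evaluated directly from the data listed above. The following is a minimal sketch, not the author's implementation: the function names are hypothetical, and the log-likelihood assumes the inverse Rayleigh density $f(x;\theta) = (2\theta/x^3)\exp(-\theta/x^2)$, which is not restated in this section. The printed values can be compared with those reported in Table 1.

```python
import math

# Wheaton River flood-peak exceedances (m^3/s), copied from the list above.
data = [
    1.7, 2.2, 14.4, 1.1, 0.4, 20.6, 5.3, 0.7, 1.9, 13.0,
    12.0, 9.3, 1.4, 18.7, 8.5, 25.5, 11.6, 14.1, 22.1, 1.1,
    2.5, 14.4, 1.7, 37.6, 0.6, 2.2, 39.0, 0.3, 15.0, 11.0,
    7.3, 22.9, 1.7, 0.1, 1.1, 0.6, 9.0, 1.7, 7.0, 20.1,
    0.4, 2.8, 14.1, 9.9, 10.4, 10.7, 30.0, 3.6, 5.6, 30.8,
    13.3, 4.2, 25.5, 3.4, 11.9, 21.5, 27.6, 36.4, 2.7, 64.0,
    1.5, 2.5, 27.4, 1.0, 27.1, 20.2, 16.8, 5.3, 9.7, 27.5,
    2.5, 27.0,
]

ALPHAS = [100, 50, 10, 8, 7]  # alpha_0 .. alpha_4 as fixed in the text
LAM = 0.8                     # parameter lambda of the exponential prior


def posterior_moment(shape, rate, m):
    """E(theta^m | x) = Gamma(shape + m) / (Gamma(shape) * rate^m)."""
    value = 1.0
    for j in range(m):
        value *= (shape + j) / rate
    return value


def generalized_self(shape, rate, alphas):
    """Ratio of weighted posterior moments, as in Eqs. (15)-(24)."""
    num = sum(a * posterior_moment(shape, rate, j + 1) for j, a in enumerate(alphas))
    den = sum(a * posterior_moment(shape, rate, j) for j, a in enumerate(alphas))
    return num / den


def criteria(theta, xs, k=1):
    """AIC, AICc and BIC at theta.

    ASSUMPTION: inverse Rayleigh density (2*theta/x^3) * exp(-theta/x^2).
    """
    n = len(xs)
    loglik = sum(math.log(2 * theta) - 3 * math.log(x) - theta / x**2 for x in xs)
    aic = 2 * k - 2 * loglik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = k * math.log(n) - 2 * loglik
    return aic, aicc, bic


n = len(data)
T = sum(1 / x**2 for x in data)  # sum of 1/x_i^2
P = T + 1 / LAM

estimates = {"MLE": n / T}  # MLE under the assumed density
for order in range(1, 5):
    estimates[f"J{order}"] = generalized_self(n, T, ALPHAS[: order + 1])
    estimates[f"E{order}"] = generalized_self(n + 1, P, ALPHAS[: order + 1])

for name, theta in estimates.items():
    aic, aicc, bic = criteria(theta, data)
    print(f"{name}: theta={theta:.4f} AIC={aic:.4f} AICc={aicc:.4f} BIC={bic:.4f}")
```

Computing the posterior moments as a running product avoids evaluating large Gamma functions, which keeps the sketch numerically stable for $n = 72$.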
Based on this real data, we calculate that

$$T = \sum_{i=1}^{72} \frac{1}{x_i^2} = 138.99, \qquad \sum_{i=1}^{72} \ln\left(\frac{1}{x_i^3}\right) = -388.38, \qquad P = \sum_{i=1}^{72} \frac{1}{x_i^2} + \frac{1}{\lambda} = 140.24.$$

We then apply both proposed methods and MLE to this empirical data. The model selection criteria for the three methods are compared in Table 1.

Table 1. Estimated values for AIC, AICc, and BIC

Prior               Estimator             Mean     AIC        AICc       BIC
MLE                                       0.5180   917.6820   917.7391   919.9587
Jeffreys' prior     $\hat{\theta}_{J1}$   0.5195   917.6778   917.7349   919.9545
                    $\hat{\theta}_{J2}$   0.5198   917.6748   917.7319   919.9515
                    $\hat{\theta}_{J3}$   0.5200   917.6728   917.7299   919.9495
                    $\hat{\theta}_{J4}$   0.5201   917.6718   917.7289   919.9485
Exponential prior   $\hat{\theta}_{E1}$   0.5220   917.6816   917.7387   919.9583
                    $\hat{\theta}_{E2}$   0.5223   917.6786   917.7357   919.9553
                    $\hat{\theta}_{E3}$   0.5225   917.6766   917.7337   919.9533
                    $\hat{\theta}_{E4}$   0.5226   917.6756   917.7327   919.9523

Table 1 shows that the estimated means are nearly identical for all three methods (third column). Based on the model selection criteria, Jeffreys' prior, as a non-informative prior, tends to yield smaller values than MLE and the exponential prior for all four polynomials. The smallest values of these criteria occur for Jeffreys' prior with the fourth-degree polynomial ($\hat{\theta}_{J4}$). These results indicate that estimating the scale parameter of the inverse Rayleigh distribution using the Bayesian generalized SELF under Jeffreys' prior tends to give better values than MLE and the Bayesian generalized SELF under the exponential prior. This study showed this by applying all proposed methods to real data with a moderate sample size, n = 72.

Conclusions

This study employed the MLE, the Bayesian generalized SELF under Jeffreys' prior and the Bayesian generalized SELF under the exponential prior to estimate the scale parameter of the inverse Rayleigh distribution for real data. All 72 observations of the flood peak data in Canada are used in this study.
These real data follow the inverse Rayleigh distribution. This study found that the estimated means of the scale parameter of the inverse Rayleigh distribution based on MLE, the Bayesian generalized SELF under Jeffreys' prior and the Bayesian generalized SELF under the exponential prior tend to be similar. Based on the model selection criteria, this study showed that the Bayesian generalized SELF under Jeffreys' prior tends to yield the smallest values of AIC, AICc and BIC.

Acknowledgments

This research was partly funded by DRPM, the Deputy for Strengthening Research and Development of the Ministry of Research and Technology / National Research and Innovation Agency of Indonesia, in accordance with contract number 123/SP2H/AMD/LT/DRPM/2020.

References

[1] A. Rasheed, “Reliability estimation in inverse Rayleigh distribution using precautionary loss function,” Mathematics and Statistics Journal, vol. 2, no. 3, pp. 9–15, 2016.
[2] A. A. Soliman and F. M. Al-Aboud, “Bayesian inference using record values from Rayleigh model with application,” European Journal of Operational Research, vol. 185, no. 2, pp. 659–672, Mar. 2008, doi: 10.1016/j.ejor.2007.01.023.
[3] M. Aslam and C.-H. Jun, “A group acceptance sampling plans for truncated life tests based on the inverse Rayleigh and log-logistic distributions,” Pakistan Journal of Statistics, vol. 25, no. 2, pp. 107–119, 2009.
[4] A. Soliman, E. A. Amin, and A. A. A.-E. Aziz, “Estimation and prediction from inverse Rayleigh distribution based on lower record values,” Applied Mathematical Sciences, vol. 4, no. 62, pp. 3057–3066, 2010.
[5] S. Ali, “Mixture of the inverse Rayleigh distribution: properties and estimation in a Bayesian framework,” Applied Mathematical Modelling, vol. 39, no. 2, pp. 515–530, Jan. 2015, doi: 10.1016/j.apm.2014.05.039.
[6] S. Dey and T.
Dey, “Bayesian estimation and prediction on inverse Rayleigh distribution,” International Journal of Information and Management Sciences, vol. 22, pp. 1–15, 2011.
[7] A. H. Yousef and A. Lafta, “Comparison between methods of the scale parameter,” FES: Finance, Economy, Strategy, vol. 58, no. 9, pp. 22–30, 2012.
[8] S. Dey, “Bayesian estimation and prediction on inverse Rayleigh distribution,” Malaysian Journal of Mathematical Sciences, vol. 6, no. 1, pp. 113–124, 2012.
[9] H. A. Rasheed, “Comparison of Bayes estimators for parameter and reliability function for inverse Rayleigh distribution by using generalized square error loss function,” MJS, vol. 28, no. 2, pp. 162–168, 2017, doi: 10.23851/mjs.v28i2.512.
[10] A. Ahmad, S. P. Ahmad, and A. Ahmed, “Transmuted inverse Rayleigh distribution: a generalization of the inverse Rayleigh distribution,” Mathematical Theory and Modeling, vol. 4, no. 7, pp. 90–98, 2014.
[11] F. Yanuar, “The health status model in urban and rural society in West Sumatera, Indonesia: an approach of structural equation modeling,” Indian Journal of Science and Technology, vol. 9, p. 8, 2016.
[12] F. Yanuar, K. Ibrahim, and A. A. Jemain, “Bayesian structural equation modeling for the health index,” Journal of Applied Statistics, vol. 40, no. 6, pp. 1254–1269, Jun. 2013, doi: 10.1080/02664763.2013.785491.
[13] F. Yanuar, H. Yozza, and R. V. Rescha, “Comparison of two priors in Bayesian estimation for parameter of Weibull distribution,” Sci. Technol. Indones., vol. 4, no. 3, p. 82, Jul. 2019, doi: 10.26554/sti.2019.4.3.82-87.
[14] C. Muharisa, F. Yanuar, and D. Devianto, “Simulation study the using of Bayesian quantile regression in nonnormal error,” CAUCHY, vol. 5, no. 3, p. 121, Dec. 2018, doi: 10.18860/ca.v5i3.5633.
[15] K. P. Burnham and D. R. Anderson, Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2. ed., [4. printing]. New York, NY: Springer, 2002.
[16] K. Fatima and S. P.
Ahmad, “Weighted inverse Rayleigh distribution,” International Journal of Statistics and Systems, vol. 12, no. 1, pp. 119–137, 2017.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 7(1) (2021), Pages 73-83
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: July 05, 2021 Reviewed: October 15, 2021 Accepted: October 30, 2021
DOI: https://doi.org/10.18860/ca.v7i1.12848

The Confidence Interval for the Periodic Intensity Function in the Presence of Power Function Trend on the Nonhomogeneous Poisson Process

Ikhsan Maulidi1, Bonno Andri Wibowo2, Nina Valentika3, Muhammad Syazali4, Vina Apriliani5

1Department of Mathematics, Syiah Kuala University, Banda Aceh, Indonesia
2Department of Mathematics, IPB University, Bogor, Indonesia
3Department of Mathematics, Pamulang University, Tangerang Selatan, Indonesia
4Department of Mathematics, Universitas Pertahanan, Bogor, Indonesia
5Department of Mathematics Education, UIN Ar-Raniry, Banda Aceh, Indonesia

Email: ikhsanmaulidi@unsyiah.ac.id, bonno1818@gmail.com, dosen02339@unpam.ac.id, muhamad.syazali@idu.ac.id, vina.apriliani@ar-raniry.ac.id

Abstract

The nonhomogeneous Poisson process is one of the most widely applied stochastic processes. In this article, we provide a confidence interval based on the estimator of an intensity that is the product of a periodic function and a power function trend. This confidence interval is an application of the asymptotic distribution of the estimator formulated in previous studies. Using the asymptotic theorems, the distribution is turned into a confidence interval for the intensity function. In addition, a constructive proof of the convergence in probability is provided for all power functions. The results of this study contribute to the statistical analysis of the estimators that have been formulated previously.
Keywords: asymptotic distribution; confidence interval; intensity function; Poisson process.

Introduction

Many events in nature can be modeled by stochastic processes. A stochastic process is a set of random variables that map the sample space to a state space. One such process is the counting process, which gives the number of events in a time interval. A counting process in which the number of events has a Poisson distribution is called a Poisson process. Some basic theory related to the Poisson process can be found in [1]–[3].

Based on the intensity function, Poisson processes are divided into two categories, namely the homogeneous Poisson process and the nonhomogeneous Poisson process. A homogeneous Poisson process has a constant intensity function (independent of time), while a nonhomogeneous Poisson process has a time-dependent intensity function. The nonhomogeneous Poisson process is widely applied to real phenomena, such as earthquakes [4], traffic accidents [5], and radio burst rates [6].

The nonhomogeneous Poisson process with a periodic intensity function has also been studied in recent years. Ref. [7] studied the estimation of the intensity function in a nonhomogeneous Poisson process by including a trend component in the periodic intensity function. The trend component was first taken to be an additive linear function. The study was then continued with a trend in the form of a multiplicative linear function [8].
The estimation of the intensity function was carried out using the general kernel function approach, and [9] examined this Poisson process with a uniform kernel approach. Other related studies can be found in [10]–[13]. Ref. [14] studied the estimation of the intensity function of the periodic Poisson process with a power function trend using a general kernel. Furthermore, the statistical properties of these estimators have also been proven: [15] established strong consistency of these estimators, and the asymptotic normality of the estimator has been formulated together with a numerical simulation of its consistency [16]. The result obtained there is that the estimator of the periodic component converges to the normal distribution under certain conditions.

As an application of the asymptotic normality, the confidence interval of the estimator for the periodic component can be determined. This study provides theorems on the confidence interval for the intensity function parameters, together with their proofs. The contribution of this study is to characterize the estimator, especially in terms of accuracy. For a given number of samples (interval length), one can determine how accurately the estimator predicts the value of the parameter in the form of a confidence interval.

Methods

The estimator for periodic component of the intensity function

Suppose that $\{N(t), t \ge 0\}$ is a nonhomogeneous Poisson process with an unknown, locally integrable intensity function $\lambda$. Suppose also that $\lambda$ is a periodic function multiplied by a power function trend, so that $\lambda$, as a function of the time variable $s$, can be expressed as

$$\lambda(s) = \lambda_c^*(s)\, a s^b. \tag{1}$$

The values of the constants $a$ and $b$ are assumed to be known, so that the only unknown is the periodic component of the intensity function, namely $\lambda_c^*$. Equation (1) can also be stated as

$$\lambda(s) = \lambda_c(s)\, s^b, \tag{2}$$

with $\lambda_c(s) = a\lambda_c^*(s)$.
Ref. [14] gave a kernel-type estimator for $\lambda_c(s)$ using general kernel functions. The estimator for the periodic component of the intensity is

$$\hat{\lambda}_{c,n,K}(s) = \frac{\tau}{n} \sum_{k=0}^{\infty} \frac{1}{h_n (s+k\tau)^b} \int_0^n K\left(\frac{x-(s+k\tau)}{h_n}\right) N(dx). \tag{3}$$

In equation (3), the constant $\tau$ is the period of the intensity function, which satisfies $\lambda_c(s+k\tau) = \lambda_c(s)$ for $k \in \mathbb{Z}$, and $n$ is the length of the time interval used. In this case, since the Poisson process is a discrete stochastic process, it is clear that $n$ is a natural number. The function $K$ is called a kernel function if it satisfies the following properties: (K1) $K$ is a probability density function, (K2) $K$ is bounded, and (K3) $K$ is defined on $[-1,1]$ [17].

The asymptotic normality of the estimator

Theorem 1 (The asymptotic normal distribution for $\hat{\lambda}_{c,n,K}(s)$, $0 < b < 1$)
Suppose that the intensity $\lambda$ satisfies (1) and is locally integrable. Suppose that the kernel function $K$ satisfies (K1), (K2), (K3), $\lambda_c$ has a bounded second derivative around $s$, $0 < b < 1$, $n^{1-b}h_n \to 0$, $n^{b+1}h_n \to \infty$, and $h_n \downarrow 0$ as $n \to \infty$.
a) If $(n^{1+b}h_n^5)^{1/2} \to 0$, then

$$(n^{1+b}h_n)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(0, \sigma^2) \tag{4}$$

as $n \to \infty$, with $\sigma^2 = \frac{\tau\lambda_c(s)}{1-b}\int_{-1}^{1} K^2(z)\,dz$.
b) If $(n^{1+b}h_n^5)^{1/2} \to 1$, then

$$(n^{1+b}h_n)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(\mu, \sigma^2) \tag{5}$$

as $n \to \infty$, with $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$ and $\sigma^2 = \frac{\tau\lambda_c(s)}{1-b}\int_{-1}^{1} K^2(z)\,dz$.

Theorem 2 (The asymptotic normal distribution for $\hat{\lambda}_{c,n,K}(s)$, $b = 1$)
Suppose that the intensity $\lambda$ satisfies (1) and is locally integrable. Suppose that the kernel function $K$ satisfies (K1), (K2), (K3), $\lambda_c$ has a bounded second derivative around $s$, $b = 1$, $\ln(n)h_n \to 0$, $\frac{n^2h_n}{\ln(n)} \to \infty$, and $h_n \downarrow 0$ as $n \to \infty$.
a) If $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 0$, then

$$\left(\frac{n^2h_n}{\ln(n)}\right)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(0, \sigma^2) \tag{6}$$

as $n \to \infty$, with $\sigma^2 = \tau\lambda_c(s)\int_{-1}^{1} K^2(z)\,dz$.
b) If $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 1$, then

$$\left(\frac{n^2h_n}{\ln(n)}\right)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(\mu, \sigma^2) \tag{7}$$

as $n \to \infty$, with $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$ and $\sigma^2 = \tau\lambda_c(s)\int_{-1}^{1} K^2(z)\,dz$.

Theorem 3 (The asymptotic normal distribution for $\hat{\lambda}_{c,n,K}(s)$, $b > 1$)
Suppose that the intensity $\lambda$ satisfies (1) and is locally integrable. Suppose that the kernel function $K$ satisfies (K1), (K2), (K3), $\lambda_c$ has a bounded second derivative around $s$, $b > 1$, $n^2h_n \to \infty$, and $h_n \downarrow 0$ as $n \to \infty$.
a) If $(n^2h_n^5)^{1/2} \to 0$, then

$$(n^2h_n)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(0, \sigma^2) \tag{8}$$

as $n \to \infty$, with $\sigma^2 = \tau^{2-b}\lambda_c(s)\zeta(b)\int_{-1}^{1} K^2(z)\,dz$ and $\zeta(b) = \lim_{n\to\infty}\left(\sum_{k=1}^{\infty}\frac{1}{k^b} I(y+s+k\tau \in [0,n])\right)$.
b) If $(n^2h_n^5)^{1/2} \to 1$, then

$$(n^2h_n)^{1/2}\left(\hat{\lambda}_{c,n,K}(s) - \lambda_c(s)\right) \xrightarrow{d} \mathrm{Normal}(\mu, \sigma^2) \tag{9}$$

as $n \to \infty$, with $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$, $\sigma^2 = \tau^{2-b}\lambda_c(s)\zeta(b)\int_{-1}^{1} K^2(z)\,dz$, and $\zeta(b) = \lim_{n\to\infty}\left(\sum_{k=1}^{\infty}\frac{1}{k^b} I(y+s+k\tau \in [0,n])\right)$.

Theorem 1, Theorem 2, and Theorem 3 above can be proved through the analysis in [18]. The basic theory needed to prove these theorems can be found in [19]–[21].

Results and discussion

Let $\Phi$ denote the standard normal distribution function and $\Phi^{-1}$ its inverse. Based on Theorem 1, Theorem 2, and Theorem 3 above, confidence intervals for $\lambda_c$ with confidence level $1-\alpha$ can be given as follows.

Corollary 1 (The confidence interval for $\lambda_c$ for $0 < b < 1$)
Suppose that all conditions of Theorem 1 are satisfied. Then, for a significance level $\alpha$ with $0 < \alpha < 1$, the confidence interval for $\lambda_c$ for $0 < b < 1$ is given as follows:
a) If $(n^{1+b}h_n^5)^{1/2} \to 0$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}},\ \hat{\lambda}_{c,n,K}(s) + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}}\right),$$

where $\sigma^2 = \frac{\tau\lambda_c(s)}{1-b}\int_{-1}^{1} K^2(z)\,dz$.
b) If $(n^{1+b}h_n^5)^{1/2} \to 1$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^{1+b}h_n}},\ \hat{\lambda}_{c,n,K}(s) + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^{1+b}h_n}}\right),$$

where $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$ and $\sigma^2 = \frac{\tau\lambda_c(s)}{1-b}\int_{-1}^{1} K^2(z)\,dz$.

Corollary 2 (The confidence interval for $\lambda_c$ for $b = 1$)
Suppose that all conditions of Theorem 2 are satisfied. Then, for a significance level $\alpha$ with $0 < \alpha < 1$, the confidence interval for $\lambda_c$ for $b = 1$ is given as follows:
a) If $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 0$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\frac{\alpha}{2}\right),\ \hat{\lambda}_{c,n,K}(s) + \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right),$$

where $\sigma^2 = \tau\lambda_c(s)\int_{-1}^{1} K^2(z)\,dz$.
b) If $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 1$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \left(\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}},\ \hat{\lambda}_{c,n,K}(s) + \left(\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}}\right),$$

where $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$ and $\sigma^2 = \tau\lambda_c(s)\int_{-1}^{1} K^2(z)\,dz$.

Corollary 3 (The confidence interval for $\lambda_c$ for $b > 1$)
Suppose that all conditions of Theorem 3 are satisfied. Then, for a significance level $\alpha$ with $0 < \alpha < 1$, the confidence interval for $\lambda_c$ for $b > 1$ is given as follows:
a) If $(n^2h_n^5)^{1/2} \to 0$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}},\ \hat{\lambda}_{c,n,K}(s) + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}}\right),$$

where $\sigma^2 = \tau^{2-b}\lambda_c(s)\zeta(b)\int_{-1}^{1} K^2(z)\,dz$.
b) If $(n^2h_n^5)^{1/2} \to 1$, then

$$I_{\lambda_c} = \left(\hat{\lambda}_{c,n,K}(s) - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^2h_n}},\ \hat{\lambda}_{c,n,K}(s) + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^2h_n}}\right),$$

where $\mu = \frac{\lambda_c''(s)}{2}\int_{-1}^{1} z^2 K(z)\,dz$, $\sigma^2 = \tau^{2-b}\lambda_c(s)\zeta(b)\int_{-1}^{1} K^2(z)\,dz$, and $\zeta(b) = \lim_{n\to\infty}\left(\sum_{k=1}^{\infty}\frac{1}{k^b} I(y+s+k\tau \in [0,n])\right)$.

To justify the confidence intervals above, convergence in probability theorems are given for these confidence intervals.

Theorem 4.
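Convergence in probability of the confidence interval for $\lambda_c$ and $0 < b < 1$
Let $\hat{\lambda}_{c,n,K}$ be the estimator of the periodic component of the intensity function given in equation (3), and let $I_{\lambda_c}$ be the confidence interval given in Corollary 1. Then, for $0 < b < 1$, $P(\lambda_c(s) \in I_{\lambda_c}) = 1 - \alpha + o(1)$ as $n \to \infty$.

The proof of Theorem 4:
Case (a). Assumption $(n^{1+b}h_n^5)^{1/2} \to 0$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}} \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}} \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}} \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \sqrt{n^{1+b}h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).
\end{aligned}$$

Let $Y = \sqrt{n^{1+b}h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 1(a), $Y$ is asymptotically $\mathrm{Normal}(0, \sigma^2)$, and standardizing gives $Z = \frac{Y}{\sigma}$, which is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) = P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - P\left(Z < -\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

Since the normal distribution is symmetric, $P\left(Z < -\Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right) = P\left(Z > \Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right) = 1 - P\left(Z \le \Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right)$, so

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - 1 + P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= \Phi\left(\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - 1 + \Phi\left(\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= 1 - \tfrac{\alpha}{2} - 1 + 1 - \tfrac{\alpha}{2} = 1 - \alpha
\end{aligned}$$

as $n \to \infty$.

Case (b).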
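Assumption $(n^{1+b}h_n^5)^{1/2} \to 1$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^{1+b}h_n}} \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^{1+b}h_n}} \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^{1+b}h_n}} \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^{1+b}h_n}}\right) \\
&= P\left(-\left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right) \le \sqrt{n^{1+b}h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right).
\end{aligned}$$

Suppose that $Y = \sqrt{n^{1+b}h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 1(b), $Y$ is asymptotically $\mathrm{Normal}(\mu, \sigma^2)$, and $Z = \frac{Y-\mu}{\sigma}$ is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) = P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - P\left(Z < -\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

Since the normal distribution is symmetric, $P\left(Z < -\Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right) = P\left(Z > \Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right) = 1 - P\left(Z \le \Phi^{-1}\left(1-\frac{\alpha}{2}\right)\right)$, so

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - 1 + P\left(Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= \Phi\left(\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) - 1 + \Phi\left(\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= 1 - \tfrac{\alpha}{2} - 1 + 1 - \tfrac{\alpha}{2} = 1 - \alpha
\end{aligned}$$

as $n \to \infty$.

Theorem 5. Convergence in probability of the confidence interval for $\lambda_c$ and $b = 1$
Let $\hat{\lambda}_{c,n,K}$ be the estimator of the periodic component of the intensity function given in equation (3), and let $I_{\lambda_c}$ be the confidence interval given in Corollary 2. Then, for $b = 1$, $P(\lambda_c(s) \in I_{\lambda_c}) = 1 - \alpha + o(1)$ as $n \to \infty$.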
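As a numerical illustration of the intervals whose coverage Theorems 4 and 5 address, the case (a) intervals of Corollary 1 and Corollary 2 can be evaluated as in the following minimal sketch. It is not part of the original study: the function name and the input values are hypothetical, the unknown $\lambda_c(s)$ in $\sigma^2$ is replaced by its estimate (a plug-in assumption), and the default $\int_{-1}^{1}K^2(z)\,dz = 0.5$ corresponds to the uniform kernel $K(z) = 1/2$ on $[-1,1]$.

```python
import math
from statistics import NormalDist


def ci_case_a(lam_hat, n, h_n, b, tau, alpha=0.05, kernel_sq_int=0.5):
    """Case (a) interval of Corollary 1 (0 < b < 1) or Corollary 2 (b = 1).

    ASSUMPTIONS (not from the paper): lambda_c(s) inside sigma^2 is replaced
    by the estimate lam_hat, and kernel_sq_int is the integral of K^2 over
    [-1, 1] (0.5 for the uniform kernel).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # Phi^{-1}(1 - alpha/2)
    if 0 < b < 1:
        sigma = math.sqrt(tau * lam_hat / (1 - b) * kernel_sq_int)
        scale = math.sqrt(n ** (1 + b) * h_n)
    elif b == 1:
        sigma = math.sqrt(tau * lam_hat * kernel_sq_int)
        scale = math.sqrt(n**2 * h_n / math.log(n))
    else:
        raise ValueError("this sketch only covers 0 < b <= 1")
    half_width = sigma * z / scale
    return lam_hat - half_width, lam_hat + half_width


# Hypothetical inputs: estimate 2.0 at the point s, interval length n = 1000,
# bandwidth h_n = 0.05, period tau = 1.
print(ci_case_a(2.0, 1000, 0.05, b=0.5, tau=1.0))
print(ci_case_a(2.0, 1000, 0.05, b=1.0, tau=1.0))
```

The bandwidth used here is only a placeholder; in practice $h_n$ would have to be chosen so that the conditions of the corresponding theorem hold.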
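The proof of Theorem 5
Case (a). Assumption $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 0$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= P\left(-\sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= P\left(-\sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \sigma\sqrt{\frac{\ln(n)}{n^2h_n}}\,\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right) \\
&= P\left(-\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \sqrt{\frac{n^2h_n}{\ln(n)}}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).
\end{aligned}$$

Let $Y = \sqrt{\frac{n^2h_n}{\ln(n)}}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 2(a), $Y$ is asymptotically $\mathrm{Normal}(0, \sigma^2)$, and standardizing gives $Z = \frac{Y}{\sigma}$, which is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

By the same arguments as before, it is obtained that $P(\lambda_c(s) \in I_{\lambda_c}) \to 1 - \alpha$ as $n \to \infty$.

Case (b). Assumption $\left(\frac{n^2h_n^5}{\ln(n)}\right)^{1/2} \to 1$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}} \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}}\right) \\
&= P\left(-\left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}} \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}}\right) \\
&= P\left(-\left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}} \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right)\sqrt{\frac{\ln(n)}{n^2h_n}}\right) \\
&= P\left(-\left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right) \le \sqrt{\frac{n^2h_n}{\ln(n)}}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right).
\end{aligned}$$

Suppose that $Y = \sqrt{\frac{n^2h_n}{\ln(n)}}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 2(b), $Y$ is asymptotically $\mathrm{Normal}(\mu, \sigma^2)$ and $Z = \frac{Y-\mu}{\sigma}$ is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

The same arguments give $P(\lambda_c(s) \in I_{\lambda_c}) \to 1 - \alpha$ as $n \to \infty$.

Theorem 6. Convergence in probability of the confidence interval for $\lambda_c$ and $b > 1$
Let $\hat{\lambda}_{c,n,K}$ be the estimator of the periodic component of the intensity function given in equation (3).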
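Also, let $I_{\lambda_c}$ be the confidence interval given in Corollary 3. Then, for $b > 1$, $P(\lambda_c(s) \in I_{\lambda_c}) = 1 - \alpha + o(1)$ as $n \to \infty$.

The proof of Theorem 6
Case (a). Assumption $(n^2h_n^5)^{1/2} \to 0$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}} \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}} \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}} \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right)}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le \sqrt{n^2h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).
\end{aligned}$$

Let $Y = \sqrt{n^2h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 3(a), $Y$ is asymptotically $\mathrm{Normal}(0, \sigma^2)$ and $Z = \frac{Y}{\sigma}$ is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

The same arguments give $P(\lambda_c(s) \in I_{\lambda_c}) \to 1 - \alpha$ as $n \to \infty$.

Case (b). Assumption $(n^2h_n^5)^{1/2} \to 1$

$$\begin{aligned}
P(\lambda_c(s) \in I_{\lambda_c}) &= P\left(\hat{\lambda}_{c,n,K} - \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^2h_n}} \le \lambda_c(s) \le \hat{\lambda}_{c,n,K} + \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^2h_n}} \le \lambda_c(s) - \hat{\lambda}_{c,n,K} \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) - \mu}{\sqrt{n^2h_n}} \le \hat{\lambda}_{c,n,K} - \lambda_c(s) \le \frac{\sigma\Phi^{-1}\left(1-\frac{\alpha}{2}\right) + \mu}{\sqrt{n^2h_n}}\right) \\
&= P\left(-\left(\sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) - \mu\right) \le \sqrt{n^2h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right) \le \sigma\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) + \mu\right).
\end{aligned}$$

Suppose that $Y = \sqrt{n^2h_n}\left(\hat{\lambda}_{c,n,K} - \lambda_c(s)\right)$. Then, by Theorem 3(b), $Y$ is asymptotically $\mathrm{Normal}(\mu, \sigma^2)$ and $Z = \frac{Y-\mu}{\sigma}$ is asymptotically $\mathrm{Normal}(0,1)$. Therefore

$$P(\lambda_c(s) \in I_{\lambda_c}) = P\left(-\Phi^{-1}\left(1-\tfrac{\alpha}{2}\right) \le Z \le \Phi^{-1}\left(1-\tfrac{\alpha}{2}\right)\right).$$

By the same arguments, it is obtained that $P(\lambda_c(s) \in I_{\lambda_c}) \to 1 - \alpha$ as $n \to \infty$.

Conclusions

From the results of this study, formulas for the confidence interval of the periodic component of the intensity of a nonhomogeneous Poisson process, where the intensity is a periodic function multiplied by a power function trend, have been obtained.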
These confidence intervals are given separately for each case of the value of b, because previous studies show that the variance of the estimator takes a different form in each case. The coverage probability of these confidence intervals has been proved to converge to $1 - \alpha$. A recommendation for further research is to provide numerical simulations for each of the 6 confidence interval cases. The simulation can start by choosing a bandwidth function $h_n$ that satisfies all the conditions of the given case and then determining the probability that the parameter lies in the confidence interval.

References

[1] S. Ghahramani, Fundamentals of Probability: With Stochastic Processes, third edition. New Jersey: Pearson Prentice Hall, 2005.
[2] D. J. Daley and D. Vere-Jones, “Basic properties of the Poisson process,” An Introd. to Theory Point Process. Vol. I Elem. Theory Methods, pp. 19–40, 2003.
[3] G. Last and M. Penrose, Lectures on the Poisson Process (vol. 7). Cambridge University Press, 2017.
[4] J. Geng, W. Shi, and G. Hu, “Bayesian nonparametric nonhomogeneous Poisson process with applications to USGS earthquake data,” arXiv preprint arXiv:1907.03186, 2019.
[5] F. Grabski, “Nonhomogeneous Poisson process and compound Poisson process in the modelling of random processes related to road accidents,” J. KONES, vol. 26, no. 1, pp. 39–46, 2019.
[6] E. Lawrence, S. Vander Wiel, C. Law, S. B. Spolaor, and G. C. Bower, “The nonhomogeneous Poisson process for fast radio burst rates,” Astron. J., vol. 154, no. 3, p. 117, 2017.
[7] R. Helmers and I. W. Mangku, “Estimating the intensity of a cyclic Poisson process in the presence of linear trend,” Ann. Inst. Stat. Math., vol. 61, no. 3, pp. 599–628, 2009.
[8] I. W.
Mangku, “Estimating the intensity obtained as the product of a periodic function with the linear trend of a non-homogeneous Poisson process,” Far East J. Math. Sci., vol. 51, pp. 141–150, 2011.
[9] W. Ismayulia, I. W. Mangku, and S. Siswandi, “Pendugaan komponen periodik fungsi intensitas berbentuk fungsi periodik kali tren linear suatu proses Poisson non-homogen,” J. Math. Its Appl., vol. 12, no. 1, pp. 49–62, 2013.
[10] I. W. Mangku, R. Budiarti, Taslim, and Casman, “Estimating the intensity obtained as the product of a periodic function with the quadratic trend of a nonhomogeneous Poisson process,” Far East J. Math. Sci., vol. 82, no. 1, pp. 33–44, 2013.
[11] I. W. Mangku, Siswadi, and R. Budiarti, “Consistency of a kernel-type estimator of the intensity of the cyclic Poisson process with the linear trend,” J. Indones. Math. Soc., vol. 15, no. 1, pp. 37–48, 2009.
[12] R. Helmers and I. W. Mangku, “Predicting a cyclic Poisson process,” Ann. Inst. Stat. Math., vol. 64, no. 6, pp. 1261–1279, 2012.
[13] N. Leonenko, E. Scalas, and M. Trinh, “The fractional non-homogeneous Poisson process,” Stat. Probab. Lett., vol. 120, pp. 147–156, 2017.
[14] W. Erliana, “Pendugaan tipe kernel umum untuk intensitas berupa fungsi periodik kali tren fungsi pangkat proses Poisson nonhomogen,” Institut Pertanian Bogor, 2014.
[15] I. Maulidi, I. W. Mangku, and H. Sumarno, “Strong consistency of kernel-type estimator for the intensity obtained as the product of a periodic function with the power function trend of non-homogeneous Poisson process,” Br. J. Appl. Sci. Technol., vol. 9, no. 4, pp. 383–387, 2015.
[16] I. Maulidi, M. Ihsan, and V. Apriliani, “The numerical simulation for asymptotic normality of the intensity obtained as a product of a periodic function with the power trend function of a nonhomogeneous Poisson process,” Desimal J. Mat., vol. 3, no. 3, pp. 271–278, 2020.
[17] R. Helmers, I. W. Mangku, and R.
Zitikis, “Consistent estimation of the intensity function of a cyclic Poisson process,” J. Multivar. Anal., vol. 84, no. 1, pp. 19–39, 2003.
[18] N. Valentika, I. W. Mangku, and W. Erliana, “Strong consistency and asymptotic distribution of estimator for the intensity function having form of periodic function multiplied by power function trend of a Poisson process,” Int. J. Eng. Manag. Res., vol. 8, no. 2, pp. 232–236, 2018.
[19] R. J. Serfling, Approximation Theorems of Mathematical Statistics. John Wiley & Sons, 1980.
[20] R. V. Hogg, J. McKean, and A. T. Craig, Introduction to Mathematical Statistics (6th edition). Pearson Education, 2005.
[21] R. M. Dudley, Real Analysis and Probability. Wadsworth & Brooks, 1989.

CAUCHY – Jurnal Matematika Murni dan Aplikasi
Volume 6(4) (2021), Pages 227-237
p-ISSN: 2086-0382; e-ISSN: 2477-3344
Submitted: November 17, 2020 Reviewed: February 11, 2021 Accepted: March 16, 2021
DOI: http://dx.doi.org/10.18860/ca.v6i4.10827

Dynamical of Ratio-Dependent Eco-Epidemical Model with Prey Refuge

Adin Lazuardy Firdiansyah

Department of Tadris Matematika, STAI Muhammadiyah Probolinggo
Jl. Soekarno Hatta 94B, Sukabumi, Mayangan, Probolinggo, Indonesia

Email: adin.lazuardy@gmail.com

Abstract

This paper discusses the dynamic analysis of three species in an eco-epidemiology model by considering a ratio-dependent function and prey refuge. The prey refuge is applied based on the fact that infected prey have protective instincts that allow them to reduce predation risk. Here, we establish boundedness and obtain three equilibrium points, which exist under certain conditions. In the model, three equilibrium points are locally asymptotically stable and one of the equilibrium points is globally asymptotically stable. We find that the system undergoes a Hopf bifurcation around the interior equilibrium point when the prey refuge is chosen as the bifurcation parameter.
we also find a condition for uniform persistence. finally, several numerical simulations are performed, both to illustrate the analytical results and to show the effect of the prey refuge.

keywords: eco-epidemiology model; global stability; hopf bifurcation; local stability; persistence

introduction
one of the natural phenomena that describes the interaction between one species and another is the prey-predator interaction. this interaction is classified by whether its effects are beneficial or detrimental. in the real world, the prey-predator interaction can also be influenced by infectious diseases, which affect the population sizes. the combination of epidemiology and ecology has therefore become an important issue that is discussed by many researchers. mathematical studies address this issue with eco-epidemiology models that contain classes of susceptible and infective populations. currently, several studies focus on the spread of disease in prey only, e.g. [1]–[3]. it is well known that predators prefer to capture infected prey because they are easier to catch than susceptible prey. however, the predator can become infected after eating them. therefore, several researchers have investigated the spread of disease not only in prey but also in predators, e.g. [4], [5]. moreover, some studies have considered the spread of disease in both populations, e.g. [6]. based on several experiments, the spread of infectious disease is an important factor in the regulation of population density [7].

in this paper, we focus on the situation where predators can eat infected prey only. this is consistent with the fact that infected prey tends to change its behavior: infected prey tends to live in areas that are accessible to predators [8]. moreover, infected prey is less agile than healthy prey and is easily caught by predators [1]. hudson et al.
[9] have observed that predators selectively capture heavily infected red grouse.

the consumption of infected prey by predators is described by a response function, which is an important component of an eco-epidemiology model. researchers commonly use the holling types as response functions. according to [10], the holling types are divided into three types, namely holling type i, holling type ii, and holling type iii. holling type i means that the consumption rate of the predator increases linearly with the density of prey and becomes constant once predators are satiated, e.g. [5], [6]. holling type ii means that the consumption rate increases with the density of prey at a decelerating rate and saturates at high prey density, e.g. [1]–[4]. meanwhile, holling type iii means that the consumption rate is low when the density of prey is low, rises quickly as the density of prey grows, and then saturates, e.g. [11]. in holling type iii, predators easily switch from one prey to another, or they concentrate on the location where prey is most abundant [12].

the holling response functions depend only on the density of prey. this is unrealistic because it ignores the effects of predator interference. experiments indicate that the density of predators can influence the consumption rate. a consumption rate that depends only on the density of prey cannot describe the dynamics when the density of predators influences the system [12]. for this reason, several researchers have considered the density of both populations through a ratio-dependent function, which depends on the ratio of prey density to predator density [13]. according to [14], the ratio-dependent model gives more reasonable dynamics than models that depend on prey density alone.
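to make the comparison concrete, the following python sketch evaluates the three holling types and a ratio-dependent response side by side. the functional forms are the standard textbook ones and are an assumption here, since the paper does not write them out; the names `rate`, `cap`, `half_sat`, and the default values are illustrative only.

```python
def holling_1(prey, rate=1.0, cap=2.0):
    """Type I: linear in prey density, constant once predators are satiated."""
    return min(rate * prey, cap)


def holling_2(prey, rate=2.0, half_sat=1.0):
    """Type II: decelerating rise that saturates at `rate`."""
    return rate * prey / (half_sat + prey)


def holling_3(prey, rate=2.0, half_sat=1.0):
    """Type III: sigmoidal, slow at low prey density, saturating at `rate`."""
    return rate * prey ** 2 / (half_sat ** 2 + prey ** 2)


def ratio_dependent(prey, predator, rate=2.0, xi=1.0):
    """Ratio-dependent: prey and predator enter only through prey / predator."""
    return rate * prey / (prey + xi * predator)


if __name__ == "__main__":
    for prey in (0.1, 1.0, 10.0):
        print(prey,
              holling_1(prey), holling_2(prey), holling_3(prey),
              ratio_dependent(prey, 1.0), ratio_dependent(prey, 5.0))
```

the last two columns differ only in predator density, which is exactly the interference effect that the prey-dependent holling forms cannot express.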
one of the phenomena that reduce predation risk is the prey refuge. it can prevent the extinction of prey and influence the stability of the dynamics [15]. according to [3], the prey refuge incorporated in a model is of two types, namely a refuge for a constant number of prey and a refuge for a constant proportion of prey. it is well known that a refuge for a constant number of prey has a stronger stabilizing effect than a refuge for a constant proportion of prey [14]. incorporating a prey refuge therefore makes the model more realistic, because it limits the amount of prey that is accessible to the predator.

in this study, we modify the eco-epidemiology model from [1] by changing holling type ii into a ratio-dependent function, and we observe the effect of the prey refuge on the system. the results are presented in the form of a model analysis. we also show that a hopf bifurcation occurs around the positive equilibrium point and give a condition for uniform persistence. finally, several simulations are performed to illustrate the analytical results.

methods
we modify and analyze the eco-epidemiology model from [1] in the following steps.
1. reviewing and studying the eco-epidemiology model from previous literature.
2. modifying the eco-epidemiology model by changing the holling type ii function into the ratio-dependent function.
3. investigating the boundedness, equilibrium points, and dynamical behavior of the modified model.
4. investigating hopf bifurcation and persistence in the modified model.
5. performing numerical simulations with the 5th-order predictor-corrector method to support the analytical results.

results and discussion

the mathematical model
in this paper, the eco-epidemiology model consists of three populations, namely susceptible prey, infected prey, and predator.
let $X_S(t)$, $X_I(t)$, and $Y(t)$ denote the densities of susceptible prey, infected prey, and predator at time $t$, respectively. according to [1], the general eco-epidemiology model is

$$
\begin{aligned}
\dot{X}_S &= R - \beta X_S X_I - \delta X_S,\\
\dot{X}_I &= \beta X_S X_I - f(X_I, Y)\,Y - \eta X_I,\\
\dot{Y} &= e\,f(X_I, Y)\,Y - \gamma Y,
\end{aligned}
\tag{1}
$$

with $X_S \ge 0$, $X_I \ge 0$, $Y \ge 0$. the first equation of system (1) expresses that, in the absence of disease, the prey population grows according to

$$\dot{X}_S = R - \delta X_S,$$

where $R$ is the recruitment rate of the prey population (immigrants and newborns) and $\delta$ is the natural death rate of susceptible prey. the growth of the prey population is also affected by disease. the spread of disease is described by the bilinear incidence rate $\beta X_S X_I$, where $\beta$ is the transmission rate. we assume that the disease is not inherited but acquired from other sources, that it spreads to susceptible prey only, and that infected prey neither recover nor reproduce.

the second equation of system (1) describes the development of the infected prey population, which is reduced by natural death at rate $\eta$ and by predation. predation is described by the response function $f(X_I, Y)$. we assume that infected prey can hide, and that hiding protects a constant amount $m$ of infected prey from predation. predators capture infected prey according to a ratio-dependent function as in [14],

$$
f(X_I, Y) =
\begin{cases}
0, & \text{if } 0 \le X_I \le m,\\[4pt]
\dfrac{a (X_I - m)}{X_I - m + \xi Y}, & \text{if } X_I > m,
\end{cases}
$$

where $a$ is the predation rate and $\xi$ is the half capturing saturation constant. according to [1], if the density of infected prey is below the constant $m$, then predators cannot eat them and die out exponentially. meanwhile, if the density of infected prey is above $m$, then predators can eat them.
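a minimal python sketch of system (1) with this response function is given below. it is only an illustration: the function names `response` and `rhs` and the dictionary `P8` are hypothetical, and the parameter values are the ones the paper lists later in (8). the second call shows the case $X_I \le m$, in which predation vanishes and the predator equation reduces to $\dot{Y} = -\gamma Y$.

```python
def response(x_i, y, a, xi, m):
    """Ratio-dependent response f(X_I, Y) with a constant refuge m."""
    if x_i <= m:
        return 0.0  # all infected prey are hidden in the refuge
    return a * (x_i - m) / (x_i - m + xi * y)


def rhs(state, p):
    """Right-hand side of system (1) for state = (X_S, X_I, Y)."""
    x_s, x_i, y = state
    f = response(x_i, y, p["a"], p["xi"], p["m"])
    dx_s = p["R"] - p["beta"] * x_s * x_i - p["delta"] * x_s
    dx_i = p["beta"] * x_s * x_i - f * y - p["eta"] * x_i
    dy = p["e"] * f * y - p["gamma"] * y
    return (dx_s, dx_i, dy)


# parameter set (8) of the paper
P8 = dict(R=2.0, beta=1.0, delta=1.0, eta=0.5, gamma=0.5,
          xi=1.0, a=2.0, e=0.75, m=0.5)

if __name__ == "__main__":
    print(rhs((1.0, 1.0, 1.0), P8))  # infected prey above the refuge level
    print(rhs((1.0, 0.4, 1.0), P8))  # infected prey inside the refuge
```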
thus, when $X_I > m$, system (1) becomes

$$
\begin{aligned}
\dot{X}_S &= R - \beta X_S X_I - \delta X_S,\\
\dot{X}_I &= \beta X_S X_I - \frac{a (X_I - m) Y}{X_I - m + \xi Y} - \eta X_I,\\
\dot{Y} &= \frac{a e (X_I - m) Y}{X_I - m + \xi Y} - \gamma Y,
\end{aligned}
\tag{2}
$$

with $X_S(0) \ge 0$, $X_I(0) \ge 0$, $Y(0) \ge 0$. the last equation represents the behavior of the predator population. predators consume only infected prey and do not consume susceptible prey. when the infected prey population is absent, the predator population declines at the natural death rate $\gamma$. it is assumed that the disease does not spread from infected prey to predators. all parameters of the model are positive. the meanings of the parameters and their units are summarized in table 1.

table 1. units and the meaning of parameters in system (2)

parameter   biological meaning                                   units
$X_S$       the number of susceptible prey                       number
$X_I$       the number of infected prey                          number
$Y$         the number of predators                              number
$R$         the level of recruitment in prey                     time^-1
$\beta$     the level of infection of disease                    mass^-1 time^-1
$\delta$    the level of natural mortality in susceptible prey   time^-1
$\eta$      the level of natural mortality in infected prey      time^-1
$\gamma$    the level of natural mortality in predators          time^-1
$\xi$       half saturation constant                             number
$a$         the level of predation in predators                  mass^-1 time^-1
$e$         the level of conversion from prey into predators     time^-1
$m$         the measure of prey in the refuge                    number

the boundedness
to show that the model is biologically valid, we prove the boundedness of system (2).

theorem 1. all solutions of system (2) are uniformly bounded.

proof: we define $W = X_S + X_I + \frac{1}{e} Y$. differentiating $W$ with respect to $t$, we get

$$\frac{dW}{dt} = \frac{dX_S}{dt} + \frac{dX_I}{dt} + \frac{1}{e}\frac{dY}{dt}.$$

substituting system (2), the infection and predation terms cancel and we get

$$\frac{dW}{dt} = R - \left(\delta X_S + \eta X_I + \gamma \frac{Y}{e}\right).$$

next, we choose $q = \min\{\delta, \eta, \gamma\}$. thus, we get

$$\frac{dW}{dt} \le R - qW.$$

by the theory of differential inequalities, we get

$$W(t) \le \frac{R}{q} + C e^{-qt},$$

where $C$ is a constant determined by the initial condition. for $t \to \infty$, we get

$$\limsup_{t \to \infty} W(t) \le \frac{R}{q}.$$
thus, all solutions of system (2) enter the region

$$\Omega = \left\{ (X_S, X_I, Y) \in \mathbb{R}^3_+ : W \le \frac{R}{q} \right\}.$$

the equilibrium points
by setting $\dot{X}_S = \dot{X}_I = \dot{Y} = 0$, we get the following possible equilibrium points.
1. the axial equilibrium $E_0 = \left(\frac{R}{\delta}, 0, 0\right)$ always exists.
2. the planar equilibrium $E_1 = \left(\frac{\eta}{\beta}, \frac{\beta R - \delta\eta}{\eta\beta}, 0\right)$ exists when $\beta R > \delta\eta$.
3. the interior equilibrium $E^* = (X_S^*, X_I^*, Y^*)$, where

$$X_S^* = \frac{R}{\beta X_I^* + \delta}, \qquad Y^* = \frac{(ae - \gamma)(X_I^* - m)}{\gamma\xi},$$

and $X_I^*$ is a positive root of the quadratic equation

$$A_1 (X_I^*)^2 + A_2 X_I^* + A_3 = 0, \tag{3}$$

with

$$
\begin{aligned}
A_1 &= -\beta(ae - \gamma + e\eta\xi),\\
A_2 &= \beta m(ae - \gamma) - \delta(ae - \gamma + e\eta\xi) + eR\beta\xi,\\
A_3 &= \delta m(ae - \gamma).
\end{aligned}
$$

the point $E^*$ exists when $ae > \gamma$ and $X_I^* > m$. under $ae > \gamma$ we have $A_1 < 0$ and $A_3 > 0$, so the discriminant of equation (3) satisfies $D = A_2^2 - 4 A_1 A_3 > 0$ and the product of the two roots, $A_3 / A_1$, is negative. equation (3) therefore has exactly one positive root, whatever the sign of $A_2$, namely

$$X_I^* = \frac{-A_2 - \sqrt{D}}{2 A_1}.$$

dynamical behaviour
to investigate the stability of system (2), we have to determine the eigenvalues of the jacobian matrix. the jacobian matrix at a point $E(X_S, X_I, Y)$ is

$$
J(E) =
\begin{bmatrix}
-\beta X_I - \delta & -\beta X_S & 0\\[4pt]
\beta X_I & \beta X_S - \dfrac{a\xi Y^2}{(X_I - m + \xi Y)^2} - \eta & -\dfrac{a (X_I - m)^2}{(X_I - m + \xi Y)^2}\\[8pt]
0 & \dfrac{a e \xi Y^2}{(X_I - m + \xi Y)^2} & \dfrac{a e (X_I - m)^2}{(X_I - m + \xi Y)^2} - \gamma
\end{bmatrix}.
\tag{4}
$$

to check the stability of $E_0 = \left(\frac{R}{\delta}, 0, 0\right)$, we evaluate the matrix (4) at $E_0$. the eigenvalues of the resulting matrix are $\lambda_1 = -\delta$, $\lambda_2 = \frac{\beta R}{\delta} - \eta$, and $\lambda_3 = ae - \gamma$.
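as a numerical cross-check of the equilibrium formulas, the python sketch below solves the quadratic (3) for the parameter values that the paper lists later in (8) and recovers the remaining coordinates of $E^*$. the function name `interior_equilibrium` is hypothetical, and the code takes the positive root of (3) with the standard quadratic formula, i.e. with the discriminant $A_2^2 - 4A_1A_3$.

```python
import math


def interior_equilibrium(R, beta, delta, eta, gamma, xi, a, e, m):
    """Return (X_S*, X_I*, Y*) from the quadratic (3), or None if E* does not exist."""
    if a * e <= gamma:
        return None
    a1 = -beta * (a * e - gamma + e * eta * xi)
    a2 = (beta * m * (a * e - gamma)
          - delta * (a * e - gamma + e * eta * xi)
          + e * R * beta * xi)
    a3 = delta * m * (a * e - gamma)
    disc = a2 * a2 - 4.0 * a1 * a3              # positive because a1 < 0 <= a3
    x_i = (-a2 - math.sqrt(disc)) / (2.0 * a1)  # the non-negative root
    if x_i <= m:
        return None                              # E* requires X_I* > m
    x_s = R / (beta * x_i + delta)
    y = (a * e - gamma) * (x_i - m) / (gamma * xi)
    return x_s, x_i, y


if __name__ == "__main__":
    # parameter set (8)
    print(interior_equilibrium(R=2, beta=1, delta=1, eta=0.5, gamma=0.5,
                               xi=1, a=2, e=0.75, m=0.5))
```

for the parameter set (8) the result rounds to the point $E^*(1.0685, 0.8717, 0.7434)$ reported in simulation 1.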
the point $E_0$ is locally asymptotically stable when $ae < \gamma$ and $\beta R < \delta\eta$.

to investigate the stability of $E_1 = \left(\frac{\eta}{\beta}, \frac{\beta R - \delta\eta}{\eta\beta}, 0\right)$, we evaluate the matrix (4) at $E_1$. one eigenvalue is $\lambda_1 = ae - \gamma$, and the other eigenvalues are the roots of the quadratic equation $\lambda^2 + \varphi_1\lambda + \varphi_2 = 0$ with $\varphi_1 = \frac{\beta R - \delta\eta}{\eta} + \delta$ and $\varphi_2 = \beta R - \delta\eta$. by the routh-hurwitz criteria, these roots have negative real parts when $\beta R > \delta\eta$. hence, the point $E_1$ is locally asymptotically stable when $ae < \gamma$ and $\beta R > \delta\eta$. from the above discussion, we get the following theorem.

theorem 2. the axial equilibrium $E_0$ is locally asymptotically stable when $ae < \gamma$ and $\beta R < \delta\eta$, and the planar equilibrium $E_1$ is locally asymptotically stable when $ae < \gamma$ and $\beta R > \delta\eta$.

theorem 2 means that if the natural mortality of the predator is greater than a certain value ($\gamma > ae$) and the level of infection is lower than a certain value ($\beta R < \delta\eta$), then the point $E_0$ is stable. meanwhile, if the predator mortality satisfies the same condition and the level of infection is greater than that value ($\beta R > \delta\eta$), then the point $E_1$ is stable.

now, we investigate the stability of $E^*(X_S^*, X_I^*, Y^*)$. evaluating the matrix (4) at $E^*$, we get the jacobian matrix

$$
J(E^*) =
\begin{bmatrix}
u_{11} & u_{12} & 0\\
u_{21} & u_{22} & u_{23}\\
0 & u_{32} & u_{33}
\end{bmatrix},
$$

where $u_{ij}$ is the entry in row $i$ and column $j$, with

$$
\begin{aligned}
u_{11} &= -\beta X_I^* - \delta, \qquad u_{12} = -\beta X_S^*, \qquad u_{21} = \beta X_I^*,\\
u_{22} &= \beta X_S^* - \frac{a\xi (Y^*)^2}{(X_I^* - m + \xi Y^*)^2} - \eta, \qquad
u_{23} = -\frac{a (X_I^* - m)^2}{(X_I^* - m + \xi Y^*)^2},\\
u_{32} &= \frac{a e \xi (Y^*)^2}{(X_I^* - m + \xi Y^*)^2}, \qquad
u_{33} = -\frac{a e \xi Y^* (X_I^* - m)}{(X_I^* - m + \xi Y^*)^2}.
\end{aligned}
$$

hence, we obtain the characteristic equation at $E^*$, namely

$$\lambda^3 + \mu_1\lambda^2 + \mu_2\lambda + \mu_3 = 0, \tag{5}$$

with

$$
\begin{aligned}
\mu_1 &= -(u_{11} + u_{22} + u_{33}),\\
\mu_2 &= u_{11}u_{22} + u_{11}u_{33} + u_{22}u_{33} - u_{12}u_{21} - u_{23}u_{32},\\
\mu_3 &= -u_{11}u_{22}u_{33} + u_{11}u_{23}u_{32} + u_{12}u_{21}u_{33}.
\end{aligned}
$$
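the coefficients $\mu_1$, $\mu_2$, $\mu_3$ are easy to evaluate numerically. the python sketch below does so for the parameter set (8) given later in the paper and for the equilibrium $E^*(1.0685, 0.8717, 0.7434)$ reported there, and then tests the routh-hurwitz conditions for a cubic ($\mu_1 > 0$, $\mu_3 > 0$, $\mu_1\mu_2 > \mu_3$). the function names are hypothetical, and $u_{33}$ is taken directly from the jacobian (4) rather than from its simplified equilibrium form.

```python
def jacobian_entries(x_s, x_i, y, p):
    """Nonzero entries of the Jacobian (4) at the point (X_S, X_I, Y)."""
    d2 = (x_i - p["m"] + p["xi"] * y) ** 2
    return {
        "u11": -p["beta"] * x_i - p["delta"],
        "u12": -p["beta"] * x_s,
        "u21": p["beta"] * x_i,
        "u22": p["beta"] * x_s - p["a"] * p["xi"] * y ** 2 / d2 - p["eta"],
        "u23": -p["a"] * (x_i - p["m"]) ** 2 / d2,
        "u32": p["a"] * p["e"] * p["xi"] * y ** 2 / d2,
        "u33": p["a"] * p["e"] * (x_i - p["m"]) ** 2 / d2 - p["gamma"],
    }


def routh_hurwitz(u):
    """Coefficients of the cubic (5) and the Routh-Hurwitz stability test."""
    mu1 = -(u["u11"] + u["u22"] + u["u33"])
    mu2 = (u["u11"] * u["u22"] + u["u11"] * u["u33"] + u["u22"] * u["u33"]
           - u["u12"] * u["u21"] - u["u23"] * u["u32"])
    mu3 = (-u["u11"] * u["u22"] * u["u33"] + u["u11"] * u["u23"] * u["u32"]
           + u["u12"] * u["u21"] * u["u33"])
    stable = mu1 > 0 and mu3 > 0 and mu1 * mu2 > mu3
    return mu1, mu2, mu3, stable


if __name__ == "__main__":
    # parameter set (8) and the equilibrium reported in simulation 1
    p8 = dict(R=2.0, beta=1.0, delta=1.0, eta=0.5, gamma=0.5,
              xi=1.0, a=2.0, e=0.75, m=0.5)
    print(routh_hurwitz(jacobian_entries(1.0685, 0.8717, 0.7434, p8)))
```

with these inputs all three conditions hold, in agreement with the stability observed in simulation 1.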
by the routh-hurwitz criteria, the roots of (5) have negative real parts when $\mu_1 > 0$, $\mu_3 > 0$, and $\mu_1\mu_2 > \mu_3$. thus, the point $E^*$ is locally asymptotically stable under these conditions, and we get the following theorem.

theorem 3. the interior equilibrium $E^*$ is locally asymptotically stable when $\mu_1 > 0$, $\mu_3 > 0$, and $\mu_1\mu_2 > \mu_3$.

in the next theorem, we prove that the point $E_1$ is globally asymptotically stable in the $X_S$-$X_I$ plane.

theorem 4. the point $E_1$ is globally asymptotically stable in the $X_S$-$X_I$ plane.

proof: we apply the dulac function $H(X_S, X_I) = \frac{1}{X_I}$, which is positive in the interior of the $X_S$-$X_I$ plane, to

$$h_1(X_S, X_I) = R - \beta X_S X_I - \delta X_S, \qquad h_2(X_S, X_I) = \beta X_S X_I - \eta X_I.$$

thus, we get

$$\Delta(X_S, X_I) = \frac{\partial}{\partial X_S}(h_1 H) + \frac{\partial}{\partial X_I}(h_2 H) = -\beta - \frac{\delta}{X_I} < 0.$$

based on the bendixson-dulac criterion, there is no limit cycle in the $X_S$-$X_I$ plane. thus, the point $E_1$ is globally asymptotically stable in the $X_S$-$X_I$ plane.

hopf bifurcation
in this section, we investigate the hopf bifurcation around $E^*(X_S^*, X_I^*, Y^*)$. a hopf bifurcation gives rise to a limit cycle around $E^*$. here, we choose the constant $m$ as the bifurcation parameter. the hopf bifurcation around $E^*$ is presented in the following theorem.

theorem 5. system (2) undergoes a hopf bifurcation around $E^*(X_S^*, X_I^*, Y^*)$ when $m$ passes through a critical value $m = m_c$.

proof: we consider equation (5) as the characteristic equation at $E^*$. we choose $m = m_c$ such that $\mu_1\mu_2 = \mu_3$, where $\mu_1, \mu_2, \mu_3 > 0$. equation (5) then becomes

$$(\lambda^2 + \mu_2)(\lambda + \mu_1) = 0, \tag{6}$$

with the roots $\lambda_{1,2} = \pm i\sqrt{\mu_2}$ and $\lambda_3 = -\mu_1$. for general $m$, the roots have the form $\lambda_{1,2} = v_1(m) \pm i v_2(m)$ and $\lambda_3 = -\mu_1(m)$, where $v_1(m)$ and $v_2(m)$ are real. next, we prove the transversality condition

$$\left.\frac{d\left(\operatorname{Re}\lambda_j(m)\right)}{dm}\right|_{m = m_c} \ne 0, \qquad j = 1, 2.$$
by substituting $\lambda_1 = v_1(m) + i v_2(m)$ into equation (5) and differentiating with respect to $m$, we obtain

$$K\dot{v}_1 - L\dot{v}_2 + M = 0, \qquad L\dot{v}_1 + K\dot{v}_2 + N = 0, \tag{7}$$

where the dot denotes differentiation with respect to $m$ and

$$
\begin{aligned}
K &= 3(v_1^2 - v_2^2) + \mu_2 + 2 v_1\mu_1, & L &= 6 v_1 v_2 + 2 v_2\mu_1,\\
M &= \dot{\mu}_1 (v_1^2 - v_2^2) + v_1\dot{\mu}_2 + \dot{\mu}_3, & N &= 2 v_1 v_2\dot{\mu}_1 + v_2\dot{\mu}_2.
\end{aligned}
$$

solving equation (7), we have

$$\dot{v}_1 = -\frac{KM + LN}{K^2 + L^2}.$$

since $KM + LN \ne 0$ and $\dot{\mu}_1 \ne 0$, we obtain

$$\left.\frac{d\left(\operatorname{Re}\lambda_j(m)\right)}{dm}\right|_{m = m_c} = \left.-\frac{KM + LN}{K^2 + L^2}\right|_{m = m_c} \ne 0, \qquad j = 1, 2,$$

and $\lambda_3(m_c) = -\mu_1(m_c) < 0$. thus, system (2) undergoes a hopf bifurcation when $m$ passes through the critical value $m = m_c$.

persistence
to show that all species remain present and do not become extinct in the future, we prove that system (2) is uniformly persistent.

theorem 6. let the assumption of theorem 4 hold. if the inequalities $ae > \gamma$ and $\beta R > \delta\eta$ hold, then system (2) is uniformly persistent.

proof: we consider the average lyapunov function $\sigma(X) = X_S^{r_1} X_I^{r_2} Y^{r_3}$ with $r_1, r_2, r_3 > 0$. here, $\sigma(X)$ is a nonnegative $C^1$ function on $\mathbb{R}^3_+$. thus, we have

$$
\begin{aligned}
\vartheta(X) = \frac{1}{\sigma}\frac{d\sigma}{dt}
&= r_1\frac{\dot{X}_S}{X_S} + r_2\frac{\dot{X}_I}{X_I} + r_3\frac{\dot{Y}}{Y}\\
&= r_1\left(\frac{R}{X_S} - \beta X_I - \delta\right)
 + r_2\left(\beta X_S - \frac{a (X_I - m) Y}{X_I (X_I - m + \xi Y)} - \eta\right)
 + r_3\left(\frac{a e (X_I - m)}{X_I - m + \xi Y} - \gamma\right).
\end{aligned}
$$

by theorem 4, there is no limit cycle in the $X_S$-$X_I$ plane, so it is enough to prove that $\vartheta(X) > 0$ at every equilibrium point $X$ on the boundary of $\mathbb{R}^3_+$. we get

$$\vartheta(E_0) = r_2\left(\frac{\beta R - \delta\eta}{\delta}\right) + r_3(ae - \gamma) > 0, \qquad \vartheta(E_1) = r_3(ae - \gamma) > 0.$$

we note that if the inequalities $ae > \gamma$ and $\beta R > \delta\eta$ hold, then $\vartheta(E_0) > 0$ holds. meanwhile, if the inequality $ae > \gamma$ holds, then $\vartheta(E_1) > 0$ holds. therefore, theorem 6 expresses the instability of the points $E_0$ and $E_1$.

numerical solutions
the analytical results obtained are incomplete without numerical investigation. in this section, we present numerical solutions computed with the 5th-order predictor-corrector method at $\Delta t = 0.01$.
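the paper does not specify which 5th-order predictor-corrector scheme it uses, so the python sketch below substitutes a different, well-known scheme: the 4th-order adams-bashforth-moulton predictor-corrector, started with three classical runge-kutta steps. it integrates system (2) with $\Delta t = 0.01$ and the parameter set (8) that the paper gives next. the initial state $(1.1, 0.9, 0.8)$, the final time $t = 100$, and all function names are assumptions made for this illustration.

```python
def rhs(s, p):
    """Right-hand side of system (2); predation is switched off when X_I <= m."""
    x_s, x_i, y = s
    if x_i > p["m"]:
        g = p["a"] * (x_i - p["m"]) * y / (x_i - p["m"] + p["xi"] * y)
    else:
        g = 0.0
    return (p["R"] - p["beta"] * x_s * x_i - p["delta"] * x_s,
            p["beta"] * x_s * x_i - g - p["eta"] * x_i,
            p["e"] * g - p["gamma"] * y)


def axpy(s, h, k):
    """Return s + h * k for 3-vectors stored as tuples."""
    return tuple(si + h * ki for si, ki in zip(s, k))


def rk4_step(s, h, p):
    """One classical Runge-Kutta step, used only to start the multistep scheme."""
    k1 = rhs(s, p)
    k2 = rhs(axpy(s, h / 2, k1), p)
    k3 = rhs(axpy(s, h / 2, k2), p)
    k4 = rhs(axpy(s, h, k3), p)
    return tuple(si + h / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
                 for si, c1, c2, c3, c4 in zip(s, k1, k2, k3, k4))


def abm4(s0, h, n_steps, p):
    """Adams-Bashforth predictor with Adams-Moulton corrector (PECE), 4th order."""
    states = [s0]
    for _ in range(3):
        states.append(rk4_step(states[-1], h, p))
    f = [rhs(s, p) for s in states]  # derivative history, oldest first
    for _ in range(3, n_steps):
        s = states[-1]
        pred = tuple(si + h / 24 * (55 * f3 - 59 * f2 + 37 * f1 - 9 * f0)
                     for si, f0, f1, f2, f3 in zip(s, f[-4], f[-3], f[-2], f[-1]))
        fp = rhs(pred, p)
        corr = tuple(si + h / 24 * (9 * fn + 19 * f3 - 5 * f2 + f1)
                     for si, f1, f2, f3, fn in zip(s, f[-3], f[-2], f[-1], fp))
        states.append(corr)
        f.append(rhs(corr, p))
    return states


if __name__ == "__main__":
    p8 = dict(R=2.0, beta=1.0, delta=1.0, eta=0.5, gamma=0.5,
              xi=1.0, a=2.0, e=0.75, m=0.5)
    trajectory = abm4((1.1, 0.9, 0.8), 0.01, 10000, p8)  # integrate to t = 100
    print(trajectory[-1])
```

with these choices the trajectory settles near the interior equilibrium reported in simulation 1; other initial states and other values of $m$ are not covered by this check.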
here, we give four simulations to verify our analytical results and to demonstrate the effect of the prey refuge. we choose the parameter values in (8); their units are given in table 1.

$$R = 2,\ \beta = 1,\ \delta = 1,\ \eta = 0.5,\ \gamma = 0.5,\ \xi = 1,\ a = 2,\ e = 0.75,\ m = 0.5. \tag{8}$$

simulation 1. with the parameters (8), system (2) has the interior equilibrium $E^*(1.0685, 0.8717, 0.7434)$. in simulation 1, the conditions of theorem 3 and theorem 6 are satisfied. we observe that all solutions converge to the point $E^*$, see figure 1(a). thus, the point is locally asymptotically stable, see figure 1(b), and system (2) is uniformly persistent.

figure 1. the dynamics of system (2) with $m = 0.5$ and other parameters as in (8); panels (a) and (b).

simulation 2. we replace $m = 0.5$ with $m = 0.0002$. system (2) then has the equilibrium point $E^*(1.8305, 0.0926, 0.1848)$. in simulation 2, the conditions of theorem 3 are not satisfied but those of theorem 6 are. therefore, system (2) is uniformly persistent but the equilibrium point $E^*$ is unstable, see figure 2.

figure 2. the instability of system (2); panels (a) and (b).

simulation 3. to see the hopf bifurcation in system (2), we take $m$ as the bifurcation parameter, with critical value $m_c = 0.0004$. at $m = m_c$, the equilibrium point of system (2) is $E^*(1.8277, 0.0943, 0.1878)$. if we choose $m = 0.005 > m_c = 0.0004$, then all solutions of system (2) converge to $E^*(1.7795, 0.1239, 0.2378)$, see figure 3(a). the phase portrait of system (2) with $m = 0.005$ in figure 3(b) also shows that the point $E^*$ is stable. furthermore, if we choose $m = 0.0001 < m_c = 0.0004$, then the point $E^*(1.8319, 0.0918, 0.1833)$ is unstable, see figure 4(a). the phase portrait of system (2) with $m = 0.0001$ in figure 4(b) shows that the solution of system (2) enters a limit cycle around $E^*$. thus, theorem 5 is confirmed.

simulation 4.
to see the effect of the prey refuge on system (2), we use several values of the prey refuge, namely $m_1 = 0.1$, $m_2 = 0.65$, and $m_3 = 1.3$. figure 5 shows the time series of system (2) for these values of $m$. we find that all populations persist no matter how large $m$ is, provided that $m < X_I$. the prey refuge makes system (2) stabilize rapidly, and no extinction occurs. it is worth noting that when the constant $m$ satisfies $m < X_I$, the size of the prey refuge does not lead to predator extinction.

figure 3. the dynamics of system (2) with $m = 0.005$ and other parameters as in (8); panels (a) and (b).

figure 4. the dynamics of system (2) with $m = 0.0001$ and other parameters as in (8); panels (a) and (b).

figure 5. the effect of the prey refuge for different values of $m$.

conclusions
in this study, we have analyzed a three-species eco-epidemiology model with a ratio-dependent function and a prey refuge. we obtain three equilibrium points, i.e. the axial, planar, and interior equilibria, each of which is locally asymptotically stable under certain conditions. moreover, the planar equilibrium point of system (2) is globally asymptotically stable in the $X_S$-$X_I$ plane. next, we find that a hopf bifurcation occurs around the interior equilibrium when the constant $m$ is taken as the bifurcation parameter. when $m > m_c = 0.0004$, the interior equilibrium of system (2) is stable; when $m < m_c = 0.0004$, it is unstable. furthermore, we find a condition for uniform persistence: if the natural mortality of predators is lower than a certain value and the level of infection is greater than a certain value, then all species persist. finally, with the prey refuge, all populations persist no matter how large $m$ is, provided that $m < X_I$. we conclude that the refuge makes system (2) stabilize faster and that there is no extinction.

references
[1] c. maji, d. kesh, and d.
mukherjee, “bifurcation and global stability in an eco-epidemic model with refuge,” energy, ecol. environ., vol. 4, no. 3, pp. 103–115, 2019, doi: 10.1007/s40974-019-00117-6.
[2] a. k. pal and g. p. samanta, “stability analysis of an eco-epidemiological model incorporating a prey refuge,” nonlinear anal. model. control, vol. 2014, 2014, doi: 10.1155/2014/978758.
[3] b. mukhopadhyay and r. bhattacharyya, “effects of deterministic and random refuge in a prey-predator model with parasite infection,” math. biosci., vol. 239, no. 1, pp. 124–130, 2012, doi: 10.1016/j.mbs.2012.04.007.
[4] c. huang, h. zhang, j. cao, and h. hu, “stability and hopf bifurcation of a delayed prey-predator model with disease in the predator,” int. j. bifurc. chaos, vol. 29, no. 7, p. 23, 2019, doi: 10.1142/s0218127419500913.
[5] s. a. wuhaib and y. a. b. u. hasan, “predator-prey interactions with harvesting of predator with prey in refuge,” commun. math. biol. neurosci., vol. 2013, no. 1, pp. 1–19, 2013.
[6] s. p. bera, a. maiti, and g. p. samanta, “a prey-predator model with infection in both prey and predator,” filomat, vol. 29, no. 8, pp. 1753–1767, 2015, doi: 10.2298/fil1508753b.
[7] r. k. upadhyay and p. roy, “spread of a disease and its effect on population dynamics in an eco-epidemiological system,” commun. nonlinear sci. numer. simul., vol. 19, no. 12, pp. 4170–4184, 2014, doi: 10.1016/j.cnsns.2014.04.016.
[8] d. mukherjee, “hopf bifurcation in an eco-epidemic model,” appl. math. comput., vol. 217, no. 5, pp. 2118–2124, 2010, doi: 10.1016/j.amc.2010.07.010.
[9] p. j. hudson, a. p. dobson, and d. newborn, “do parasites make prey vulnerable to predation? red grouse and parasites,” j. anim. ecol., vol. 61, no. 3, p. 681, 1992, doi: 10.2307/5623.
[10] n. apreutesei and g. dimitriu, “on a prey-predator reaction-diffusion system with holling type iii functional response,” j. comput. appl.
math., vol. 235, no. 2, pp. 366–379, 2010, doi: 10.1016/j.cam.2010.05.040.
[11] a. l. firdiansyah, “effect of prey refuge and harvesting on dynamics of eco-epidemiological model with holling type iii,” vol. 3, no. 1, pp. 16–25, 2021.
[12] a. k. misra and b. dubey, “a ratio-dependent predator-prey model with delay and harvesting,” j. biol. syst., vol. 18, no. 2, pp. 437–453, 2010, doi: 10.1142/s021833901000341x.
[13] r. arditi and l. r. ginzburg, “coupling in predator-prey dynamics: ratio-dependence,” j. theor. biol., vol. 139, pp. 311–326, 1989.
[14] m. verma and a. k. misra, “modeling the effect of prey refuge on a ratio-dependent predator–prey system with the allee effect,” bull. math. biol., vol. 80, no. 3, pp. 626–656, 2018, doi: 10.1007/s11538-018-0394-6.
[15] s. wang, z. ma, and w. wang, “dynamical behavior of a generalized eco-epidemiological system with prey refuge,” adv. differ. equations, vol. 2018, no. 1, pp. 1–20, 2018, doi: 10.1186/s13662-018-1704-x.

a left-symmetric structure on the semi-direct sum real frobenius lie algebra of dimension 8
cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 267-280
p-issn: 2086-0382; e-issn: 2477-3344
submitted: september 30, 2021; reviewed: january 22, 2022; accepted: january 26, 2021
doi: http://dx.doi.org/10.18860/ca.v7i1.13462

edi kurniadi*, nurul gusriani, betty subartini
department of mathematics of fmipa of universitas padjadjaran, indonesia
*corresponding author
email: edi.kurniadi@unpad.ac.id, nurul.gusriani@unpad.ac.id, betty.subartini@unpad.ac.id

abstract
this paper relates quasi-associative algebras, or koszul algebras, of finite-dimensional matrix lie algebras to finite-dimensional frobenius lie algebras that are written as semi-direct sums. let $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ be the lie algebra given by the semi-direct sum of the real vector space $\mathrm{M}_2(\mathbb{R})$ and the lie algebra $\mathfrak{gl}_2(\mathbb{R})$ of all $2 \times 2$ real matrices.
the research aims to give explicit formulas of left-symmetric algebra structures on $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$. in this paper, a frobenius functional is constructed in order to show that the lie algebra $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ is a real frobenius lie algebra of dimension 8. moreover, the bilinear form corresponding to this frobenius functional is symplectic. the obtained symplectic bilinear form then induces left-symmetric algebra structures on $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$. in other words, we show that the lie algebra $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ is a left-symmetric algebra, and we give the formulas of its left-symmetric algebra structure explicitly. left-symmetric algebra structures for higher-dimensional lie algebras of this type are still an open problem to be investigated.

keywords: left-symmetric algebra; frobenius lie algebra; frobenius functional; semi-direct sum; symplectic form.

introduction
let $\mathbb{R}$ be the field of real numbers and $\mathrm{M}_2(\mathbb{R})$ be the $\mathbb{R}$-vector space of all $2 \times 2$ real matrices. the space $\mathrm{M}_2(\mathbb{R})$ is an $\mathbb{R}$-algebra under the usual addition and multiplication of matrices. in addition, $\mathrm{M}_2(\mathbb{R})$ is a lie algebra under the lie bracket given by the matrix commutator $[x, y] := xy - yx$ for all matrices $x, y \in \mathrm{M}_2(\mathbb{R})$. we denote by $\mathfrak{gl}_2(\mathbb{R})$ the space $\mathrm{M}_2(\mathbb{R})$ regarded as a lie algebra with this bracket. moreover, we let $\mathfrak{h}_2 := \mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ denote the semi-direct sum of the vector space $\mathrm{M}_2(\mathbb{R})$ and the lie algebra $\mathfrak{gl}_2(\mathbb{R})$; it is a lie algebra of dimension 8. furthermore, the lie algebra $\mathfrak{h}_2$ has the following matrix realization:

$$\sigma := \sigma(U, X) = \begin{pmatrix} X & U \\ 0 & 0 \end{pmatrix} \in \mathfrak{h}_2 \subset \mathrm{M}_4(\mathbb{R}) \tag{1}$$

with $U \in \mathrm{M}_2(\mathbb{R})$ and $X \in \mathfrak{gl}_2(\mathbb{R})$. let $\mathfrak{h}_2^*$ be the dual vector space of $\mathfrak{h}_2$.
we also realize its elements as matrices

$$\sigma^* := \sigma^*(\alpha, \beta) = \begin{pmatrix} \beta & 0 \\ \alpha & 0 \end{pmatrix} \in \mathfrak{h}_2^* \subset \mathrm{M}_4(\mathbb{R}) \tag{2}$$

with $\alpha \in \mathrm{M}_2(\mathbb{R})$ and $\beta \in \mathfrak{gl}_2(\mathbb{R})$. the value of a linear functional $\sigma^* \in \mathfrak{h}_2^*$ on a matrix $\sigma \in \mathfrak{h}_2$ is

$$\langle \sigma^*, \sigma \rangle = \operatorname{tr}(U\alpha) + \operatorname{tr}(X\beta) = \langle \alpha, U \rangle + \langle \beta, X \rangle, \tag{3}$$

where $\operatorname{tr}$ denotes the matrix trace.

the notion of the lie algebra $\mathfrak{h}_2$ comes originally from the lie algebra $\mathfrak{h}_{n,p} := \mathrm{M}_{n,p}(\mathbb{K}) \rtimes \mathfrak{gl}_n(\mathbb{K})$, where $\mathrm{M}_{n,p}(\mathbb{K})$ is the vector space of $n \times p$ matrices and $\mathfrak{gl}_n(\mathbb{K})$ is the lie algebra of $n \times n$ matrices over a field $\mathbb{K}$ of characteristic 0. this lie algebra was introduced by [1] in the context of coadjoint representations of the affine lie group $\mathrm{G}_{n,p} := \mathrm{M}_{n,p}(\mathbb{K}) \rtimes \mathrm{GL}_n(\mathbb{K})$, where $\mathrm{GL}_n(\mathbb{K})$ is the lie group of all invertible $n \times n$ matrices, whose lie algebra is $\mathfrak{gl}_n(\mathbb{K})$. in our case, we take $n = p = 2$ and $\mathbb{K} = \mathbb{R}$. we write $\mathrm{M}_2(\mathbb{R}) := \mathrm{M}_{2,2}(\mathbb{R})$ and we consider the lie algebra $\mathfrak{h}_2$ of the lie group $\mathrm{G}_2$. in addition, the case $p = 1$ gives the affine lie algebra, denoted by $\mathfrak{aff}(n)$, of dimension $n(n+1)$. more precisely, this affine lie algebra $\mathfrak{aff}(n)$ is realized in the form $\mathrm{M}_{n,1}(\mathbb{R}) \rtimes \mathfrak{gl}_n(\mathbb{R})$, where $\mathrm{M}_{n,1}(\mathbb{R})$ is isomorphic to the space $\mathbb{R}^n$. interestingly, the lie algebra $\mathfrak{aff}(n)$ is frobenius since it has a trivial stabilizer at a certain linear functional. in other words, the lie group $\mathrm{Aff}(n)$ of $\mathfrak{aff}(n)$ has an open coadjoint orbit [2]. moreover, the lie algebra $\mathfrak{aff}(n)$ is non-solvable.

the notion of a frobenius lie algebra was introduced by ooms in order to answer professor jacobson’s question on the properties of finite-dimensional lie algebras whose universal enveloping algebras have an exact simple module ([3],[4]). the answer is that the universal enveloping algebra has an exact simple module if its finite-dimensional lie algebra is frobenius. furthermore, if a lie algebra $\mathfrak{g}$ is frobenius, then there exists a linear functional $\psi$ defined on $\mathfrak{g}$ such that the associated alternating bilinear form $(x, y) \mapsto \psi([x, y])$ is non-degenerate.
namely, it is a symplectic form, and such a linear functional $\psi$ is called a frobenius functional [5]. the notion of the frobenius functional has some important implications. firstly, since the alternating bilinear form corresponding to the frobenius functional is symplectic, the dimension of a frobenius lie algebra is always even. secondly, since the stabilizer of a frobenius lie algebra corresponding to the frobenius functional is trivial, a frobenius lie algebra is not nilpotent [6]. frobenius lie algebras are significant in many different areas, for instance in the study of bounded homogeneous domains, simple hypersurface singularities, and solutions of the yang-baxter equation [7].

in ([5],[8]), the notion of a left-symmetric algebra structure on a lie algebra was introduced, together with its bilinear product. it was proved that an $n$-dimensional lie algebra $\mathfrak{g}$ has left-symmetric algebra structures if there exists a $\mathfrak{g}$-module $K$ of dimension $n$ such that the space of 1-cocycles contains a bijective 1-cocycle. indeed, not every lie algebra admits an affine structure of this kind. these structures are important because they arise in many areas of mathematics, such as convex homogeneous cones, affine manifolds, and vertex algebras. in connection with frobenius functionals, left-symmetric algebra structures can be induced by the symplectic structure corresponding to a frobenius functional [9].

in the present work, we are interested in studying the frobenius and left-symmetric structures of the lie algebra $\mathfrak{h}_2$ of the lie group $\mathrm{G}_2$. in this paper, we calculate a frobenius functional of $\mathfrak{h}_2$ corresponding to its pfaffian and we prove that $\mathfrak{h}_2$ is an 8-dimensional frobenius lie algebra.
the obtained frobenius functional of $\mathfrak{h}_2$ is applied to construct a symplectic structure on $\mathfrak{h}_2$, which induces a left-symmetric algebra structure on $\mathfrak{h}_2$. the research aims to give explicit formulas of left-symmetric algebra structures on $\mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$. furthermore, we calculate the left-symmetric algebra structures explicitly with respect to a basis of $\mathfrak{h}_2$. let $\varepsilon_{ij}$, $1 \le i, j \le 4$, be elements of $\mathfrak{h}_2 \subset \mathrm{M}_4(\mathbb{R})$ of the form in equation (1).

we organize the paper as follows. section 1 explains the background of this research, the state of the art, the aim of the research, the statement of the main results, and some basic notions. section 2 is devoted to the research method. section 3 discusses our main results, which prove that $\mathfrak{h}_2 := \mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ is an 8-dimensional frobenius lie algebra and that $\mathfrak{h}_2$ has left-symmetric structures. in that section, we also give explicit formulas for the left-symmetric structures of $\mathfrak{h}_2$ and discuss future research related to our main results. the conclusion is given at the end of the paper.

methods
our main object arises from the affine lie group $\mathrm{G}_{n,p} := \mathrm{M}_{n,p}(\mathbb{K}) \rtimes \mathrm{GL}_n(\mathbb{K})$, which was studied in the context of coadjoint representations. we consider the special case $n = p = 2$ and obtain the lie algebra $\mathfrak{h}_2 := \mathrm{M}_2(\mathbb{R}) \rtimes \mathfrak{gl}_2(\mathbb{R})$ of the lie group $\mathrm{G}_2 := \mathrm{M}_2(\mathbb{R}) \rtimes \mathrm{GL}_2(\mathbb{R})$. we studied related papers on frobenius lie algebras, particularly the work of ooms [4], and we offer an alternative way, based on the pfaffian, of showing whether a lie algebra is frobenius. on the other hand, burde introduced the notion of left-symmetric algebra structures on a lie group and a lie algebra [8]. we prove the existence of left-symmetric algebra structures on $\mathfrak{h}_2$ and list these structures explicitly. our constructions are based on a symplectic form, which induces the left-symmetric algebra structures.
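to illustrate the kind of check involved, the python sketch below builds the eight basis matrices of $\mathfrak{h}_2$ in the realization (1), forms the matrix of the alternating form $(x, y) \mapsto \langle \sigma^*, [x, y] \rangle$ with the pairing (3), and computes its determinant exactly. the functional used, $\alpha = I_2$ and $\beta = 0$, is an assumption chosen for this example and is not necessarily the frobenius functional constructed in the paper; all helper names are hypothetical. a nonzero determinant means that the form is non-degenerate.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]


def unit(i, j):
    """4x4 matrix unit with a single 1 in row i, column j."""
    e = [[0] * 4 for _ in range(4)]
    e[i][j] = 1
    return e


# realization (1): X is the upper-left 2x2 block, U the upper-right block
BASIS = [unit(i, j) for i in range(2) for j in range(4)]


def pairing(sigma, alpha, beta):
    """<sigma*, sigma> = tr(U alpha) + tr(X beta), as in (3)."""
    x = [row[:2] for row in sigma[:2]]
    u = [row[2:] for row in sigma[:2]]

    def trace_of_product(a, b):
        return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

    return trace_of_product(u, alpha) + trace_of_product(x, beta)


def gram(alpha, beta):
    """Matrix of the alternating form in the basis BASIS."""
    return [[pairing(bracket(x, y), alpha, beta) for y in BASIS] for x in BASIS]


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n, d = len(m), Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if m[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            m[r] = [m[r][k] - factor * m[c][k] for k in range(n)]
    return d


if __name__ == "__main__":
    identity, zero = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
    print(det(gram(identity, zero)))  # alpha = I, beta = 0
    print(det(gram(zero, identity)))  # alpha = 0, beta = I
```

since the matrix is skew-symmetric, its determinant is the square of its pfaffian, so the determinant test and the pfaffian test agree. the second functional pairs only with traces of commutators, which vanish, so it gives a degenerate form.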
in the next section, we shall briefly review some basic notions of lie algebras, frobenius lie algebras and their properties, the pfaffian of a frobenius lie algebra, and left-symmetric structures on a lie algebra. in the section results and discussion, we shall complete the proof of our main results.

1.1 frobenius lie algebra

we first introduce the notion of a lie algebra as follows.

definition 1 [10]. let 𝔤 be a vector space. a lie bracket on 𝔤 is a bilinear map 𝔤×𝔤 → 𝔤, usually denoted by [ , ], which satisfies:
1. [𝑥,𝑥] = 0 for all 𝑥 ∈ 𝔤,
2. the jacobi identity, that is,
[𝑥,[𝑦,𝑧]] = [[𝑧,𝑥],𝑦] + [[𝑥,𝑦],𝑧]. (4)
a vector space 𝔤 together with a lie bracket [ , ] is called a lie algebra.

let 𝔤 be a lie algebra and let 𝔤∗ be its dual vector space, consisting of the linear functionals on 𝔤. we denote by 𝔤𝔩(𝔤) the lie algebra of endomorphisms of 𝔤. the adjoint representation of 𝔤 on itself is given by the map
ad ∶ 𝔤 ∋ 𝑥 ↦ ad(𝑥) ∈ 𝔤𝔩(𝔤). (5)
furthermore, the corresponding representation of 𝔤 on the space 𝔤∗ is dual to ad and is given by the formula
〈ad∗(𝑥)𝜎∗,𝑦〉 = 〈𝜎∗,−ad(𝑥)𝑦〉 = 〈𝜎∗,−[𝑥,𝑦]〉, (6)
where 𝑥,𝑦 ∈ 𝔤 and 𝜎∗ ∈ 𝔤∗. let 𝜎∗ be an element of 𝔤∗ and let $B_{\sigma^*}$ be the alternating bilinear form corresponding to 𝜎∗, defined by
$B_{\sigma^*} : \mathfrak{g} \times \mathfrak{g} \ni (x, y) \mapsto B_{\sigma^*}(x, y) := \langle \sigma^*, [x, y] \rangle \in \mathbb{R}$. (7)
the kernel of $B_{\sigma^*}$ is given by
$\ker(B_{\sigma^*}) = \{x \in \mathfrak{g} \,;\, \langle \sigma^*, [x, y] \rangle = 0, \ \forall y \in \mathfrak{g}\}$
$= \{x \in \mathfrak{g} \,;\, \langle \mathrm{ad}^*(x)\sigma^*, y \rangle = 0, \ \forall y \in \mathfrak{g}\}$
$= \{x \in \mathfrak{g} \,;\, \mathrm{ad}^*(x)\sigma^* = 0\}$.
the latter set is nothing but the stabilizer of 𝔤 at 𝜎∗, which is denoted by $\mathfrak{g}^{\sigma^*}$.

definition 2 [11]. let 𝔤 be a lie algebra. the lie algebra 𝔤 is said to be frobenius if there exists a linear functional 𝜎∗ ∈ 𝔤∗ such that the alternating bilinear form in the equation (7) is non-degenerate.
such a 𝜎∗, for which the form in the equation (7) is non-degenerate, is called a frobenius functional.

indeed, the non-commutative lie algebra of dimension 2 is always frobenius. it is well known that the affine lie algebra 𝔞𝔣𝔣(1) ≔ ℝ⋊ℝ is a 2-dimensional frobenius lie algebra. frobenius lie algebras form a large class of lie algebras. for example, some parabolic and seaweed subalgebras of semi-simple lie algebras, 𝑗-algebras, and borel subalgebras of simple lie algebras are frobenius lie algebras [7].

now, let 𝑇 ≔ {𝑥1,𝑥2,…,𝑥𝑛} be a basis for a frobenius lie algebra 𝔤 of even dimension 𝑛. we denote by 𝑀𝔤(ℝ) the 𝑛×𝑛 matrix whose entries are the brackets of the basis elements, $M_{\mathfrak{g}}(\mathbb{R}) := ([x_j, x_k])_{j,k=1}^{n}$. moreover, for 𝜎∗ ∈ 𝔤∗ we denote by 𝑀𝔤(ℝ)(𝜎∗) the 𝑛×𝑛 matrix whose entries are defined by
$M_{\mathfrak{g}}(\mathbb{R})(\sigma^*) := (\langle \sigma^*, [x_j, x_k] \rangle)_{j,k=1}^{n}$. (8)
we get the following theorem.

theorem 1 [4]. let 𝔤 be a lie algebra of even dimension with a basis 𝑇 ≔ {𝑥1,𝑥2,…,𝑥𝑛}. the lie algebra 𝔤 is frobenius if one of the following equivalent conditions is satisfied:
1. the stabilizer $\mathfrak{g}^{\sigma^*}$ = {0} for some 𝜎∗ ∈ 𝔤∗.
2. the determinant of the matrix 𝑀𝔤(ℝ) is not equal to zero.
3. the determinant of the matrix 𝑀𝔤(ℝ)(𝜎∗) is not equal to zero for some 𝜎∗ ∈ 𝔤∗.

the relation between an alternating bilinear form and a linear functional on 𝔤 in terms of a frobenius lie algebra can be explained as follows. in general, let 𝜓 be an alternating bilinear form on a lie algebra 𝔤. if 𝜓 is a non-degenerate closed 2-form, then 𝔤 is said to be a quasi-frobenius lie algebra. furthermore, if there exists a linear functional 𝜎∗ ∈ 𝔤∗ such that
𝜓(𝛼,𝛽) = d𝜎∗(𝛼,𝛽) = −〈𝜎∗,[𝛼,𝛽]〉 (𝛼,𝛽 ∈ 𝔤), (9)
where d𝜎∗ denotes the differential of 𝜎∗, then 𝔤 is also called a frobenius lie algebra [5]. indeed, this statement is equivalent to the definition of a frobenius lie algebra.

example 1.
nice examples of low-dimensional frobenius lie algebras are the 4-dimensional frobenius lie algebras over a field of characteristic ≠ 2 and the 6-dimensional frobenius lie algebras over an algebraically closed field, classified in [6]. on the other hand, the (2𝑛+1)-dimensional heisenberg lie algebra is not a frobenius lie algebra.

since we shall relate a frobenius functional to the pfaffian, we need to recall the notion of the pfaffian of a square matrix $A := (A_{jk})_{j,k=1}^{2n}$, where 𝐴 is an alternating matrix [12]. the pfaffian is given by
$\mathrm{Pf}(A) := \frac{1}{2^n n!} \sum_{\tau \in S(2n)} \mathrm{sgn}(\tau) \prod_{i=1}^{n} A_{\tau(2i-1)\,\tau(2i)}$. (10)
the pfaffian of the lie algebra 𝔤 is defined as the pfaffian of the matrix 𝑀𝔤(ℝ).

remark 1 [12]. let 𝐴 be an alternating square matrix of even size. then det(𝐴) = pf(𝐴)².

example 2. let 𝔞𝔣𝔣(1) be the 2-dimensional affine lie algebra whose only non-zero bracket is [𝑎,𝑏] = 𝑏. we obtain pf(𝔞𝔣𝔣(1)) = 𝑏. moreover, the 4-dimensional frobenius lie algebra σ with the non-zero brackets [6]
[𝑑,𝑎] = 𝑎, [𝑐,𝑏] = 𝑎, [𝑑,𝑏] = ½𝑏, [𝑑,𝑐] = ½𝑐
has pfaffian pf(σ) = 𝑎² [13].

the notion of the pfaffian is very important in the representation theory of lie groups. for instance, for square-integrable representations, the duflo-moore operator can be associated with the pfaffian [13]. in this paper, we shall relate the pfaffian to the construction of a frobenius functional.

1.2 left-symmetric algebra structure

the notion of a left-symmetric algebra first arose in the theory of lie groups 𝐺 endowed with a left-invariant affine structure [14]. let 𝐴, 𝐵, and 𝐶 be left-invariant vector fields on 𝐺, that is, sections of the projection 𝜓 ∶ 𝑇𝐺 → 𝐺, where 𝑇𝐺 is the tangent bundle of 𝐺. let ∇ be a connection on 𝑇𝐺 whose curvature and torsion are both zero, namely
$[A, B] = \nabla_A B - \nabla_B A, \qquad \nabla_{[A,B]} C = \nabla_A \nabla_B C - \nabla_B \nabla_A C$. (11)
let 𝑎, 𝑏, and 𝑐 be elements of the lie algebra 𝔤 of 𝐺. first, we define $ab := a * b = \nabla_A B$.
let ∆(𝑎,𝑏,𝑐) be the associator of 𝑎, 𝑏, and 𝑐, given by
∆(𝑎,𝑏,𝑐) ≔ (𝑎𝑏)𝑐 − 𝑎(𝑏𝑐). (12)
a left-symmetric (in general non-associative) algebra structure on the lie algebra 𝔤 is then characterized by the formula
∆(𝑎,𝑏,𝑐) = ∆(𝑏,𝑎,𝑐). (13)
we start with some basic definitions of left-symmetric algebra structures which shall be needed in the next section.

definition 3 [8]. a non-associative algebra 𝐿 is called a left-symmetric algebra (lsa) if its bilinear product 𝐿×𝐿 ∋ (𝑎,𝑏) ↦ 𝑎𝑏 ≔ 𝑎∗𝑏 ∈ 𝐿 satisfies the equation (13) for all 𝑎,𝑏,𝑐 ∈ 𝐿. the commutator
[𝑎,𝑏] = 𝑎𝑏 − 𝑏𝑎 (14)
then defines a lie bracket on 𝐿. in other words, the algebra 𝐿 is a lie algebra, denoted by 𝔤 = 𝔤(𝐿), with respect to the lie bracket in the equation (14). we also call a left-symmetric algebra a lie-admissible algebra or a vinberg algebra [8].

example 3 [5]. let 𝔞𝔣𝔣(1) be the 2-dimensional affine lie algebra with basis 𝐵 ≔ {𝑎,𝑏} and non-zero bracket [𝑎,𝑏] = 𝑏. a left-symmetric algebra structure on 𝔞𝔣𝔣(1) is given by
𝑎² = −𝑎, 𝑎𝑏 = 0, 𝑏𝑎 = −𝑏, 𝑏² = 0. (15)
indeed, 𝔞𝔣𝔣(1) is an lsa. furthermore, the products in the equation (15) satisfy the equation (14), since 𝑎𝑏 − 𝑏𝑎 = 0 − (−𝑏) = 𝑏 = [𝑎,𝑏].

remark 2 [8]. a lie algebra 𝔤 has an affine structure if it admits a product satisfying the equations (13) and (14).

we mention here that not every lie algebra has a left-symmetric algebra structure. for example, certain filiform lie algebras 𝔤 of dimension 10 ≤ dim 𝔤 ≤ 13 do not have a left-symmetric algebra structure ([8],[14]).

let 𝔤 be a frobenius lie algebra whose frobenius functional is 𝜎∗. we define the alternating bilinear form $B_{\sigma^*}$ as in the equation (7), whose representation matrix is given in the equation (8). for all 𝑎,𝑏,𝑐 ∈ 𝔤, writing 𝑎𝑏 ≔ 𝑎∗𝑏 for the product, the symplectic form $B_{\sigma^*}$ induces a left-symmetric algebra structure ([5],[9]) defined by
$B_{\sigma^*}(ab, c) := -B_{\sigma^*}(b, [a, c]) = -\langle \sigma^*, [b, [a, c]] \rangle$.
(16)
the non-degeneracy of $B_{\sigma^*}$ and the jacobi identity in the equation (4) guarantee that the product defined by the equation (16) satisfies the equations (13) and (14). in other words, the symplectic form $B_{\sigma^*}$ induces a left-symmetric algebra structure. we also observe that the determinant of a representation matrix of $B_{\sigma^*}$ is not equal to zero since 𝔤 is a frobenius lie algebra. furthermore, we shall apply the equation (16) to find the left-symmetric algebra structure explicitly.

results and discussion

in this section, we prove our main result, which we summarize as follows.

proposition 1. let 𝑆 = {𝜀11,𝜀12,𝜀21,𝜀22,𝜀13,𝜀14,𝜀23,𝜀24} be a basis for the lie algebra 𝔥2. then 𝔥2 is an 8-dimensional frobenius lie algebra whose frobenius functional $\sigma_0^* := \varepsilon_{14}^* + \varepsilon_{23}^* \in \mathfrak{h}_2^*$ is obtained from the pfaffian of 𝔥2, denoted by pf(𝔥2). the pfaffian is written in the form
pf(𝔥2) = (𝜀13𝜀24 − 𝜀14𝜀23)² ∈ s(𝔥2), (17)
which is an element of degree 4 of the symmetric algebra s(𝔥2). furthermore, the lie algebra 𝔥2 is a left-symmetric algebra whose products are given by

𝜀11² = −𝜀11, 𝜀11𝜀12 = 0, 𝜀11𝜀21 = −𝜀21, 𝜀11𝜀22 = 0, 𝜀11𝜀13 = 𝜀13, 𝜀11𝜀14 = 0, 𝜀11𝜀23 = 0, 𝜀11𝜀24 = −𝜀24,
𝜀12𝜀11 = −𝜀12, 𝜀12² = 0, 𝜀12𝜀21 = −𝜀22, 𝜀12𝜀22 = 0, 𝜀12𝜀13 = 0, 𝜀12𝜀14 = −𝜀13, 𝜀12𝜀23 = 𝜀13, 𝜀12𝜀24 = 𝜀14 − 𝜀23,
𝜀21𝜀11 = 0, 𝜀21𝜀12 = −𝜀11, 𝜀21² = 0, 𝜀21𝜀22 = −𝜀21, 𝜀21𝜀13 = 𝜀23 − 𝜀14, 𝜀21𝜀14 = 𝜀24, 𝜀21𝜀23 = −𝜀24, 𝜀21𝜀24 = 0,
𝜀22𝜀11 = 0, 𝜀22𝜀12 = −𝜀12, 𝜀22𝜀21 = 0, 𝜀22² = −𝜀22, 𝜀22𝜀13 = −𝜀13, 𝜀22𝜀14 = 0, 𝜀22𝜀23 = 0, 𝜀22𝜀24 = 𝜀24,
𝜀13𝜀11 = 0, 𝜀13𝜀12 = 0, 𝜀13𝜀21 = −𝜀14, 𝜀13𝜀22 = −𝜀13, 𝜀13² = 0, 𝜀13𝜀14 = 0, 𝜀13𝜀23 = 0, 𝜀13𝜀24 = 0,
𝜀14𝜀11 = −𝜀14, 𝜀14𝜀12 = −𝜀13, 𝜀14𝜀21 = 0, 𝜀14𝜀22 = 0, 𝜀14𝜀13 = 0, 𝜀14² = 0, 𝜀14𝜀23 = 0, 𝜀14𝜀24 = 0,
𝜀23𝜀11 = 0, 𝜀23𝜀12 = 0, 𝜀23𝜀21 = −𝜀24, 𝜀23𝜀22 = −𝜀23, 𝜀23𝜀13 = 0, 𝜀23𝜀14 = 0, 𝜀23² = 0, 𝜀23𝜀24 = 0,
𝜀24𝜀11 = −𝜀24, 𝜀24𝜀12 = −𝜀23, 𝜀24𝜀21 = 0, 𝜀24𝜀22 = 0, 𝜀24𝜀13 = 0, 𝜀24𝜀14 = 0, 𝜀24𝜀23 = 0, 𝜀24² = 0,
where the product on 𝔥2 is defined by
𝔥2×𝔥2 ∋ (𝜀𝑖𝑗,𝜀𝑘𝑙) ↦ 𝜀𝑖𝑗𝜀𝑘𝑙 ≔ 𝜀𝑖𝑗∗𝜀𝑘𝑙 ∈ 𝔥2. (18)

proof. let 𝜀𝑗𝑘 be the elements of 𝔥2 ⊂ m4(ℝ) with 1 ≤ 𝑗,𝑘 ≤ 4 corresponding to the form in the equation (1). we let the set 𝑆 = {𝜀11,𝜀12,𝜀21,𝜀22,𝜀13,𝜀14,𝜀23,𝜀24} be a basis for 𝔥2. the lie algebra 𝔥2 is endowed with the lie bracket given by the matrix commutator
[𝜀𝑗𝑘,𝜀𝑖𝑙] = 𝜀𝑗𝑘𝜀𝑖𝑙 − 𝜀𝑖𝑙𝜀𝑗𝑘, (19)
where 1 ≤ 𝑗,𝑘,𝑖,𝑙 ≤ 4 and the products on the right-hand side are matrix products. the non-zero brackets of 𝔥2 are
[𝜀12,𝜀21] = 𝜀11 − 𝜀22, [𝜀11,𝜀21] = −𝜀21,
[𝜀11,𝜀12] = 𝜀12, [𝜀21,𝜀22] = −𝜀21,
[𝜀12,𝜀22] = 𝜀12, [𝜀11,𝜀13] = 𝜀13,
[𝜀11,𝜀14] = 𝜀14, [𝜀12,𝜀23] = 𝜀13,
[𝜀12,𝜀24] = 𝜀14, [𝜀21,𝜀13] = 𝜀23,
[𝜀21,𝜀14] = 𝜀24, [𝜀22,𝜀23] = 𝜀23,
[𝜀22,𝜀24] = 𝜀24. (20)

first, in order to show that 𝔥2 is a frobenius lie algebra, we compute the pfaffian of the matrix $M_{\mathfrak{h}_2}(\mathbb{R}) := ([\varepsilon_{kj}, \varepsilon_{il}])_{j,k,i,l=1}^{4}$ defined before with respect to the basis 𝑆. since the determinant
$\det M_{\mathfrak{h}_2}(\mathbb{R}) = (\varepsilon_{13}\varepsilon_{24} - \varepsilon_{14}\varepsilon_{23})^4$ (21)
is not equal to zero, 𝔥2 is a frobenius lie algebra as desired. in the next step, using theorem 1, we can also see that 𝔥2 is a frobenius lie algebra by constructing a frobenius functional such that the stabilizer of 𝔥2 at that point is trivial.

second, to prove that 𝔥2 is a left-symmetric algebra, we need a frobenius functional corresponding to the pfaffian of 𝔥2. using remark 1, the pfaffian of 𝔥2 can be written in the form
$\mathrm{Pf}(\mathfrak{h}_2) = (\varepsilon_{13}\varepsilon_{24} - \varepsilon_{14}\varepsilon_{23})^2$, (22)
which is an element of degree 4 of the symmetric algebra 𝑆(𝔥2). since 𝔥2 is a frobenius lie algebra, the existence of frobenius functionals is guaranteed. from the equation (22), we first claim that $\sigma_{mn}^* := \varepsilon_{14}^* + \varepsilon_{23}^*$, which is contained in the dual space 𝔥2∗ of the vector space 𝔥2, is a frobenius functional.
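the claim that 𝜀14∗ + 𝜀23∗ is a frobenius functional can be checked mechanically. the following python sketch is not part of the original proof. it encodes each matrix unit 𝜀𝑖𝑗 as an index pair, uses the commutation rule for matrix units [𝜀𝑗𝑘,𝜀𝑖𝑙] = 𝛿𝑘𝑖𝜀𝑗𝑙 − 𝛿𝑙𝑗𝜀𝑖𝑘 (which reproduces the brackets in the equation (20)), and evaluates the matrix of the equation (8) at this functional. the names `S`, `SIGMA`, `bracket`, `evaluate`, and `det` are illustrative choices, and exact rational arithmetic is assumed so that the determinant is free of rounding error.

```python
from fractions import Fraction

# ordered basis S of h2; the pair (i, j) stands for the matrix unit e_ij
S = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 3), (1, 4), (2, 3), (2, 4)]

# candidate functional sigma = e14* + e23*, stored as {basis pair: value}
SIGMA = {(1, 4): 1, (2, 3): 1}


def bracket(x, y):
    """Commutator of matrix units: [e_jk, e_il] = d_ki e_jl - d_lj e_ik."""
    (j, k), (i, l) = x, y
    out = {}
    if k == i:
        out[(j, l)] = out.get((j, l), 0) + 1
    if l == j:
        out[(i, k)] = out.get((i, k), 0) - 1
    return {pair: c for pair, c in out.items() if c != 0}


def evaluate(functional, element):
    """Pairing <functional, element> for an element given as {pair: coeff}."""
    return sum(functional.get(pair, 0) * c for pair, c in element.items())


def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n, result = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result  # a row swap flips the sign
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[c])]
    return result


# entry (x, y) is <sigma, [x, y]>, with rows and columns ordered as in S
M = [[evaluate(SIGMA, bracket(x, y)) for y in S] for x in S]

for row in M:
    print(row)
print("det =", det(M))
```

a non-zero determinant is exactly condition 3 of theorem 1. replacing `SIGMA` by another functional gives a quick way to test other candidates under the same assumptions.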
we recall that the dual basis element $\varepsilon_{mn}^*$ is defined by $\langle \varepsilon_{mn}^*, \varepsilon_{jk} \rangle = 1$ for 𝑚 = 𝑗, 𝑛 = 𝑘 and 0 otherwise, so that $\langle \sigma_{mn}^*, \varepsilon_{jk} \rangle = 1$ for 𝜀𝑗𝑘 ∈ {𝜀14,𝜀23} and 0 for the other elements of 𝑆. let 𝑀𝔥2(ℝ)(𝜎∗) be the matrix defined in the equation (8). namely, we have the 8×8 matrix $M_{\mathfrak{h}_2}(\mathbb{R})(\sigma_{mn}^*) := (\langle \sigma_{mn}^*, [\varepsilon_{kj}, \varepsilon_{il}] \rangle)_{j,k,i,l=1}^{4}$, which takes the form
$M_{\mathfrak{h}_2}(\mathbb{R})(\sigma_{mn}^*) = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$. (23)
since the determinant of the matrix $M_{\mathfrak{h}_2}(\mathbb{R})(\sigma_{mn}^*)$ is equal to 1, 𝔥2 is again seen to be a frobenius lie algebra. in other words, we have shown that our claim is true. thus, $\sigma_{mn}^*$ is a frobenius functional. indeed, one can also check that the stabilizer of 𝔥2 at $\sigma_{mn}^*$ is trivial, which again shows that $\sigma_{mn}^*$ is a frobenius functional.

corresponding to the frobenius functional $\sigma_{mn}^*$, we construct the symplectic form $B_{\sigma_{mn}^*}$ as defined in the equation (7). we shall show that $B_{\sigma_{mn}^*}$ induces a left-symmetric algebra structure, that is, that the equations (13) and (14) hold with respect to the product 𝔥2×𝔥2 ∋ (𝑎,𝑏) ↦ 𝑎𝑏 ≔ 𝑎∗𝑏 ∈ 𝔥2 defined by the equation (16). let 𝑎, 𝑏, 𝑐, and 𝛼 be elements of 𝔥2. first, we show the equation (13). namely, we show that
(𝑎𝑏)𝑐 − (𝑏𝑎)𝑐 = 𝑎(𝑏𝑐) − 𝑏(𝑎𝑐). (24)
applying the equation (16) in the second step, we observe that
$B_{\sigma_{mn}^*}((ba)c - (ab)c, \alpha) = B_{\sigma_{mn}^*}((ba)c, \alpha) - B_{\sigma_{mn}^*}((ab)c, \alpha)$
$= B_{\sigma_{mn}^*}(c, [ab, \alpha]) - B_{\sigma_{mn}^*}(c, [ba, \alpha])$
$= \langle \sigma_{mn}^*, [c, [ab, \alpha]] \rangle - \langle \sigma_{mn}^*, [c, [ba, \alpha]] \rangle$
$= \langle \sigma_{mn}^*, [c, [ab, \alpha]] - [c, [ba, \alpha]] \rangle$
$= \langle \sigma_{mn}^*, [c, [ab, \alpha] - [ba, \alpha]] \rangle$
$= \langle \sigma_{mn}^*, [c, [ab - ba, \alpha]] \rangle$
$= \langle \sigma_{mn}^*, [c, [[a, b], \alpha]] \rangle$
$= B_{\sigma_{mn}^*}(c, [[a, b], \alpha])$.
therefore, we get the formula
$B_{\sigma_{mn}^*}((ba)c - (ab)c, \alpha) = B_{\sigma_{mn}^*}(c, [[a, b], \alpha])$. (25)
in a similar way, using the jacobi identity, we have
$B_{\sigma_{mn}^*}(b(ac) - a(bc), \alpha) = B_{\sigma_{mn}^*}(c, [[a, b], \alpha])$. (26)
subtracting the equation (26) from the equation (25), we have
$B_{\sigma_{mn}^*}((ba)c - (ab)c + a(bc) - b(ac), \alpha) = 0$.
(27)
the non-degeneracy of the bilinear form $B_{\sigma_{mn}^*}$ then guarantees that (𝑏𝑎)𝑐 − (𝑎𝑏)𝑐 + 𝑎(𝑏𝑐) − 𝑏(𝑎𝑐) = 0. in other words, the equation (13) holds for all 𝑎,𝑏,𝑐 ∈ 𝔥2. therefore, the symplectic bilinear form $B_{\sigma_{mn}^*}$ corresponding to $\sigma_{mn}^*$ induces a left-symmetric product on the frobenius lie algebra 𝔥2.

second, we show that the equation (14) holds. let us observe that
$B_{\sigma_{mn}^*}(ab - ba - [a, b], \alpha) = B_{\sigma_{mn}^*}(ab, \alpha) - B_{\sigma_{mn}^*}(ba, \alpha) - B_{\sigma_{mn}^*}([a, b], \alpha)$
$= B_{\sigma_{mn}^*}(a, [b, \alpha]) - B_{\sigma_{mn}^*}(b, [a, \alpha]) - B_{\sigma_{mn}^*}([a, b], \alpha)$
$= B_{\sigma_{mn}^*}(a, [b, \alpha]) + B_{\sigma_{mn}^*}(b, [\alpha, a]) + B_{\sigma_{mn}^*}(\alpha, [a, b])$
$= \langle \sigma_{mn}^*, [a, [b, \alpha]] \rangle + \langle \sigma_{mn}^*, [b, [\alpha, a]] \rangle + \langle \sigma_{mn}^*, [\alpha, [a, b]] \rangle$
$= \langle \sigma_{mn}^*, [a, [b, \alpha]] + [b, [\alpha, a]] + [\alpha, [a, b]] \rangle$
$= \langle \sigma_{mn}^*, 0 \rangle = 0$.
thus, we obtain
$B_{\sigma_{mn}^*}(ab - ba - [a, b], \alpha) = 0$. (28)
by the same non-degeneracy argument for $B_{\sigma_{mn}^*}$, we get 𝑎𝑏 − 𝑏𝑎 − [𝑎,𝑏] = 0. thus, the equation (14) holds. we have proved that the alternating bilinear form $B_{\sigma_{mn}^*}$ induces a left-symmetric structure on 𝔥2.

more specifically, for 𝑎,𝑏 ∈ 𝔥2, there exist scalars 𝛼𝑘𝑗 and 𝛽𝑖𝑙, where 1 ≤ 𝑗,𝑘,𝑖,𝑙 ≤ 4, such that
$a := \sum_{1 \le j,k \le 4} \alpha_{kj}\varepsilon_{kj}, \qquad b := \sum_{1 \le i,l \le 4} \beta_{il}\varepsilon_{il}$. (29)
moreover, since 𝑎𝑏 ≔ 𝑎∗𝑏 ∈ 𝔥2, there also exist scalars φ𝑠𝑡, where 1 ≤ 𝑠,𝑡 ≤ 4, such that
$ab := \sum_{1 \le s,t \le 4} \varphi_{st}\varepsilon_{st}$. (30)
we shall calculate the scalars φ𝑠𝑡, written in terms of products of the scalars 𝛼𝑘𝑗 and 𝛽𝑖𝑙, where 1 ≤ 𝑗,𝑘,𝑖,𝑙,𝑠,𝑡 ≤ 4, such that the equations (13) and (14) are satisfied. applying the left-symmetric structure induced by the symplectic form $B_{\sigma_{mn}^*}$ corresponding to the frobenius functional $\sigma_{mn}^* := \varepsilon_{14}^* + \varepsilon_{23}^*$, we find that each φ𝑠𝑡 is a linear combination of the products 𝛼𝑘𝑗𝛽𝑖𝑙,
$\varphi_{st} = \sum_{1 \le j,k,i,l \le 4} c^{st}_{kj,il}\,\alpha_{kj}\beta_{il}$, (31)
where the constants $c^{st}_{kj,il}$ are determined by the equation (16). therefore, by choosing suitable scalars 𝛼𝑘𝑗 and 𝛽𝑖𝑙, we obtain the products 𝜀𝑗𝑘𝜀𝑖𝑙. indeed, these products satisfy the equations (13) and (14) because they are induced by the symplectic form on 𝔥2. thus, 𝔥2 is a left-symmetric algebra. we apply the formulas in the equations (16), (29), (30), and (31) to find the explicit left-symmetric structures on 𝔥2.
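the recipe in the equations (16) and (29) to (31) amounts to solving one linear system per product. the python sketch below is an illustration rather than the computation used by the author. for basis elements 𝑎 and 𝑏 it solves $B(ab, c) = -B(b, [a, c])$ for the coefficients φ𝑠𝑡, with 𝑐 running over the basis 𝑆 and 𝐵 the form attached to 𝜀14∗ + 𝜀23∗. the helper names `bracket`, `form`, `solve`, `product`, and `show` are hypothetical, and the bracket is again taken to be the commutator of matrix units.

```python
from fractions import Fraction

# ordered basis S of h2; the pair (i, j) stands for the matrix unit e_ij
S = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 3), (1, 4), (2, 3), (2, 4)]
SIGMA = {(1, 4): 1, (2, 3): 1}  # sigma = e14* + e23*


def bracket(u, v):
    """Bilinear extension of [e_jk, e_il] = d_ki e_jl - d_lj e_ik."""
    out = {}
    for (j, k), cu in u.items():
        for (i, l), cv in v.items():
            if k == i:
                out[(j, l)] = out.get((j, l), 0) + cu * cv
            if l == j:
                out[(i, k)] = out.get((i, k), 0) - cu * cv
    return {pair: c for pair, c in out.items() if c != 0}


def form(u, v):
    """B(u, v) = <sigma, [u, v]>, as in equation (7)."""
    return sum(SIGMA.get(pair, 0) * c for pair, c in bracket(u, v).items())


def solve(a, rhs):
    """Gauss-Jordan elimination over the rationals (a must be invertible)."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, rhs)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def product(a, b):
    """Left-symmetric product ab defined by B(ab, c) = -B(b, [a, c])."""
    units = [{s: 1} for s in S]
    # one equation per basis element c: sum_s phi_s B(e_s, c) = -B(b, [a, c])
    lhs = [[form(e_s, c) for e_s in units] for c in units]
    rhs = [-form(b, bracket(a, c)) for c in units]
    phi = solve(lhs, rhs)
    return {s: coeff for s, coeff in zip(S, phi) if coeff != 0}


def show(u):
    """Readable form of an element, e.g. (1)*e14 + (-1)*e23."""
    return " + ".join(f"({c})*e{i}{j}" for (i, j), c in u.items()) or "0"


e = {s: {s: 1} for s in S}
print("e11*e11 =", show(product(e[(1, 1)], e[(1, 1)])))
print("e12*e24 =", show(product(e[(1, 2)], e[(2, 4)])))
```

looping `product` over all pairs of basis elements gives a full multiplication table, which can be used to cross-check individual entries of proposition 1 and the identity 𝑎𝑏 − 𝑏𝑎 = [𝑎,𝑏] of the equation (14).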
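we compute these left-symmetric structures with respect to the basis 𝑆 as mentioned above. we obtain
$B_{\sigma_{mn}^*}(ab, \varepsilon_{kj}) = -B_{\sigma_{mn}^*}(b, [a, \varepsilon_{kj}])$
$= -\langle \sigma_{mn}^*, [b, [a, \varepsilon_{kj}]] \rangle$
$= -\left\langle \sigma_{mn}^*, \left[ \sum_{1 \le i,l \le 4} \beta_{il}\varepsilon_{il}, \left[ \sum_{1 \le p,q \le 4} \alpha_{pq}\varepsilon_{pq}, \varepsilon_{kj} \right] \right] \right\rangle$, (32)
where 𝜀𝑘𝑗 ∈ 𝑆 ⊂ 𝔥2 for 1 ≤ 𝑗,𝑘 ≤ 4. on the other hand, we have
$B_{\sigma_{mn}^*}(ab, \varepsilon_{kj}) = B_{\sigma_{mn}^*}\left( \sum_{1 \le s,t \le 4} \varphi_{st}\varepsilon_{st}, \varepsilon_{kj} \right) = \left\langle \sigma_{mn}^*, \left[ \sum_{1 \le s,t \le 4} \varphi_{st}\varepsilon_{st}, \varepsilon_{kj} \right] \right\rangle$, (33)
where 𝜀𝑘𝑗 ∈ 𝑆 ⊂ 𝔥2 for 1 ≤ 𝑗,𝑘 ≤ 4. in what follows, 𝛼1,…,𝛼8 and 𝛽1,…,𝛽8 denote the coefficients of 𝑎 and 𝑏 with respect to the ordered basis 𝑆 = {𝜀11,𝜀12,𝜀21,𝜀22,𝜀13,𝜀14,𝜀23,𝜀24}. let 𝜀𝑘𝑗 = 𝜀11. then the equation (32) can be computed with respect to the lie brackets in the equation (20) as follows:
$B_{\sigma_{mn}^*}(ab, \varepsilon_{11}) = -\langle \sigma_{mn}^*, [b, [a, \varepsilon_{11}]] \rangle = -\beta_8\alpha_2 + \beta_5\alpha_3 + \beta_3\alpha_5 + \beta_1\alpha_6$. (34)
on the other hand, we obtain from the equation (33) that
$B_{\sigma_{mn}^*}(ab, \varepsilon_{11}) = \left\langle \sigma_{mn}^*, \left[ \sum_{1 \le s,t \le 4} \varphi_{st}\varepsilon_{st}, \varepsilon_{11} \right] \right\rangle = -\varphi_{14}$. (35)
therefore, we get
$\varphi_{14} = \beta_8\alpha_2 - \beta_5\alpha_3 - \beta_3\alpha_5 - \beta_1\alpha_6$. (36)
in a similar way, we obtain the following formulas using the equations (20), (32), and (33) simultaneously:
$\varphi_{11} = -(\beta_1\alpha_1 + \beta_2\alpha_3)$,
$\varphi_{12} = -(\beta_1\alpha_2 + \beta_2\alpha_4)$,
$\varphi_{21} = -(\beta_3\alpha_1 + \beta_4\alpha_3)$,
$\varphi_{22} = -(\beta_3\alpha_2 + \beta_4\alpha_4)$,
$\varphi_{13} = \beta_5\alpha_1 - \beta_6\alpha_2 + \beta_7\alpha_2 - \beta_5\alpha_4 - \beta_4\alpha_5 - \beta_2\alpha_6$,
$\varphi_{23} = \beta_5\alpha_3 - \beta_8\alpha_2 - \beta_4\alpha_7 - \beta_2\alpha_8$,
$\varphi_{24} = \beta_6\alpha_3 - \beta_8\alpha_1 - \beta_7\alpha_3 + \beta_8\alpha_4 - \beta_3\alpha_7 - \beta_1\alpha_8$. (37)
therefore, the product
$ab = \left( \sum_{1 \le j,k \le 4} \alpha_{kj}\varepsilon_{kj} \right)\left( \sum_{1 \le i,l \le 4} \beta_{il}\varepsilon_{il} \right) = \sum_{1 \le s,t \le 4} \varphi_{st}\varepsilon_{st}$
is determined by the equations (36) and (37). by choosing suitable 𝛼𝑘𝑗 and 𝛽𝑖𝑙, we obtain the left-symmetric structure as stated in proposition 1. for example, if we fix 𝛼1 = 1 and 𝛽1 = 1 and set all other coefficients to zero, then we have the product 𝜀11∗𝜀11 = 𝜀11² = −𝜀11. ∎

as a point of discussion, there is still an open problem concerning the generalization of the left-symmetric structure of 𝔥2 to a left-symmetric structure of 𝔥𝑛,𝑝 ≔ m𝑛,𝑝(ℝ)⋊𝔤𝔩𝑛(ℝ). the lie algebra 𝔥𝑛,𝑝 is a frobenius lie algebra whenever 𝑝 is a factor of 𝑛 [1]. we offered an alternative proof that 𝔥𝑛,𝑝 ≔ m𝑛,𝑝(ℝ)⋊𝔤𝔩𝑛(ℝ) is frobenius for 𝑛 = 𝑝 = 2, as explained before. since 𝔥𝑛,𝑝 is a frobenius lie algebra, there exists a frobenius functional.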
therefore, we can construct a symplectic form which induces a left-symmetric structure on 𝔥𝑛,𝑝. thus, we consider the following conjecture.

conjecture 1. the lie algebra 𝔥𝑛,𝑝 ≔ m𝑛,𝑝(ℝ)⋊𝔤𝔩𝑛(ℝ), where 𝑝 divides 𝑛, has left-symmetric structures. these structures are induced by a symplectic form on 𝔥𝑛,𝑝.

one of the interesting problems is how to find a frobenius functional for 𝔥𝑛,𝑝 corresponding to the pfaffian of 𝔥𝑛,𝑝 in order to give explicit formulas for the left-symmetric structure of 𝔥𝑛,𝑝. furthermore, if conjecture 1 is true, then the next question concerns left-symmetric structures for general frobenius lie algebras. we also proved that all 4-dimensional frobenius lie algebras are left-symmetric algebras. as mentioned before, not all lie algebras have left-symmetric structures, but we expect that frobenius lie algebras in general do. in other words, in the more general case we consider the following conjecture.

conjecture 2. let 𝔤 be a finite-dimensional frobenius lie algebra. then 𝔤 is a left-symmetric algebra whose left-symmetric structures are induced by a symplectic form corresponding to a frobenius functional of 𝔤.

it would be interesting to study the completeness of the left-symmetric algebra 𝔥𝑛,𝑝 whenever the frobenius lie algebra is equal to its radical and, in general, the completeness of any finite-dimensional frobenius lie algebra.

conclusions

we showed that 𝔥2 ≔ m2(ℝ)⋊𝔤𝔩2(ℝ) is an 8-dimensional frobenius lie algebra. furthermore, we proved the existence of left-symmetric structures on the frobenius lie algebra 𝔥2 and we listed explicit formulas for these structures. therefore, 𝔥2 is a left-symmetric algebra. our construction is based on the symplectic form corresponding to a frobenius functional of 𝔥2, which induces the left-symmetric structures on 𝔥2.
our result can motivate the study of left-symmetric structures for 𝔥𝑛,𝑝 ≔ m𝑛,𝑝(ℝ)⋊𝔤𝔩𝑛(ℝ) and for general frobenius lie algebras. for future research, we stated conjecture 1 and conjecture 2, which are still open problems to be investigated. it would be very interesting if conjecture 1 and conjecture 2 were true, because we could then study the radical of 𝔥𝑛,𝑝, denoted by rad(𝔥𝑛,𝑝), and if 𝔥𝑛,𝑝 = rad(𝔥𝑛,𝑝), we would come to the notion of completeness of the left-symmetric algebra 𝔥𝑛,𝑝. moreover, left-symmetric algebras are related to affine structures, and it is interesting to investigate the lie group case [15].

acknowledgments

we thank universitas padjadjaran, which funded this work through riset percepatan lektor kepala (rplk) in the year 2021 with the contract number 1959/un6.3.1/pt.00/2021.

references

[1] m. rais, “la representation du groupe affine,” ann. inst. fourier, grenoble, vol. 26, pp. 207–237, 1978.
[2] e. kurniadi, “harmonic analysis for finite dimensional real frobenius lie algebras,” nagoya university, 2019.
[3] ooms, “on lie algebras with primitive envelopes, supplements,” proc. amer. math. soc., vol. 58, pp. 67–72, 1976.
[4] a. i. ooms, “on frobenius lie algebras,” comm. algebra, vol. 8, pp. 13–52, 1980.
[5] a. diatta and b. manga, “on properties of principal elements of frobenius lie algebras,” j. lie theory, vol. 24, no. 3, pp. 849–864, 2014.
[6] b. csikós and l. verhóczki, “classification of frobenius lie algebras of dimension ≤ 6,” publ. math., vol. 70, no. 3–4, pp. 427–451, 2007.
[7] a. i. ooms, “computing invariants and semi-invariants by means of frobenius lie algebras,” j. algebra, vol. 321, pp. 1293–1312, 2009.
[8] d. burde, “left-symmetric algebras, or pre-lie algebras in geometry and physics,” arxiv:math-ph/0509016v2, 2015.
[9] a. diatta, b. manga, and a.
mbaye, “on systems of commuting matrices, frobenius lie algebras and gerstenhaber’s theorem,” arxiv:2002.08737, 2020.
[10] j. hilgert and k.-h. neeb, structure and geometry of lie groups. new york: springer monographs in mathematics, springer, 2012.
[11] d. n. pham, “g-quasi-frobenius lie algebras,” arch. math., vol. 52, no. 4, pp. 233–262, 2016.
[12] i. satake, linear algebra, pure and applied mathematics: a series of monographs and textbooks. new york: marcel dekker, inc., 1975.
[13] e. kurniadi and h. ishi, “harmonic analysis for 4-dimensional real frobenius lie algebras,” in springer proceedings in mathematics & statistics, 2019.
[14] d. burde, “simple left-symmetric algebras with solvable lie algebra,” manuscripta math., vol. 95, pp. 397–411, 1998.
[15] v. ayala, a. da silva, and m. ferreira, “affine and bilinear systems on lie groups,” syst. & control lett., vol. 117, pp. 23–29, 2018.

hybrid model of singular spectrum analysis and arima for seasonal time series data

cauchy – jurnal matematika murni dan aplikasi, volume 7(2) (2022), pages 302-315
p-issn: 2086-0382; e-issn: 2477-3344
submitted: december 01, 2021; reviewed: december 10, 2021; accepted: december 23, 2021
doi: http://dx.doi.org/10.18860/ca.v7i1.14136

gumgum darmawan1,2,*, dedi rosadi1, budi n ruchjana2
1gadjah mada university, yogyakarta, indonesia
2padjadjaran university, bandung, indonesia
*corresponding author
email: gumgum.darmawan@gmail.ugm.ac.id*, dedirosadi@gadjahmada.edu, budi.nurani@unpad.ac.id

abstract

hybrid models combining singular spectrum analysis (ssa) and the autoregressive integrated moving average (arima) model have been developed by several researchers. in the ssa-arima hybrid model, ssa is used in the decomposition and reconstruction process, while forecasting is done through the arima model. in this paper, hybrid ssa-arima uses two auto grouping models.
the purpose of this paper is to analyze seasonal data using the ssa-arima hybrid with auto grouping. the first method is the alexandrov method and the second is an alternative auto grouping with a long memory approach. the two hybrid models were tested on two types of seasonal pattern, namely multiplicative and additive seasonal time series data. both methods give accurate results: the mape of the forecasts for the next 12 observations is below 5%. for the additive seasonal pattern, the hybrid ssa-arima method with alexandrov auto grouping is more accurate (mape = 0.13%) than the hybrid ssa-arima method with the alternative method, but for the multiplicative seasonal pattern the hybrid ssa-arima with alternative auto grouping is more accurate (mape = 3.63%) than the hybrid ssa-arima method with the alexandrov method.

keywords: arima; automatic grouping; long memory effect; seasonal pattern; singular spectrum analysis

introduction

singular spectrum analysis (ssa) is a relatively new non-parametric method that has proved its capability for various types of time series. solving all these problems corresponds to the so-called basic capabilities of ssa. besides, the method has several extensions. first, the multivariate version of the method permits the simultaneous expansion of several time series; see, for example, [1]. second, the ssa ideas lead to several forecasting procedures for time series; see [2]. third, ssa has been utilized for change-point detection in time series. fourth, the ssa technique has been used as a filtering method in [3]. fifth, a family of causality tests based on the multivariate ssa technique has been introduced in [4]. sixth, ssa can be applied to missing value imputation [5]. ssa can be applied in various disciplines, from mathematics and physics to economics and financial mathematics, from meteorology and oceanography to the social sciences.
for instance, it has been applied in climatology ([6], [7], [8]) and in biomedical time series analysis [9].

hybrid modeling with ssa for time series data has been carried out by many researchers. a hybrid model is built so that the advantages of two or more models contribute positively to the forecasting results. ssa has been combined with other time series models, including arima, neural networks, arimax, par, varimax, and others. [10] performed hybrid ssa with a neural network. [11] performed a hybrid of ssa, the firefly algorithm, and a bp neural network. [12] carried out a hybrid ssa model with armax. [13] combined the ssa model with par(p) and applied this model to wind speed data. [14] built the ssa-varimax hybrid model and used it for climate data. the arima model is often used as a comparison for the ssa model. for example, [15] compared ssa, arima, and other time series models for tourism data from various countries in europe. the result indicated that no single time series model is best for all tourism data. [16] compared ssa and arima for predicting ambulance demand. the ssa-arima hybrid model studied by [17] was applied to annual runoff data. in [18], the ssa-arima hybrid model was compared with the basic ssa and arima models, and the result showed that the ssa-arima hybrid model was the most accurate. however, many of these papers do not discuss specific data patterns (e.g., seasonal patterns), so we consider it necessary to examine this hybrid model for seasonal data.

in this study, ssa and arima were employed jointly to forecast two types of time series data. both models are run so as to obtain fast and accurate computation. in ssa, there are two methods of automatic grouping (alexandrov and alternative).
the forecasting performance of the hybrid ssa-arima model was compared between the two methods (alternative vs. alexandrov). this paper contributes to the analysis of seasonal patterns (additive and multiplicative) by the ssa-arima hybrid. the purpose of this paper is to analyze seasonal data using the ssa-arima hybrid with auto grouping for two types of seasonal patterns.

this paper is organized as follows. the current section is an introduction where we briefly outline the use of ssa and introduce our study. in the next section, the methods section, we describe the detailed methodology of ssa and arima and briefly outline forecasting using a linear recurrent formula, identification of the fractional differencing parameter, identification of hidden periodicities based on the periodogram, automatic grouping by the alexandrov method ([19], [20]), and the alternative automatic grouping [21]. that section also includes a proposed algorithm for automatic hybrid ssa-arima. in the results and discussion section, we demonstrate the abilities of hybrid ssa-arima on real time series data. in that part, we also investigate three types of time series data: seasonal with no trend, multiplicative seasonal with trend, and additive seasonal with trend. that section also discusses the comparison between hybrid ssa-arima with the alexandrov method and hybrid ssa-arima with the alternative method for real data analysis.

methods

singular spectrum analysis

the (non-parametric) ssa method has received a fair amount of attention in the literature. the first phase of ssa is the decomposition, where the time series is broken down into four components: trend, seasonal, cyclical, and noise. this phase consists of the embedding and singular value decomposition steps. the second phase, namely the reconstruction phase, consists of the grouping and diagonal averaging steps.
the forecasting process can be done once the four stages have been completed. for completeness, we present all phases of the ssa algorithm in the following sections.

embedding

the embedding step transforms the one-dimensional time series $X = (x_1, x_2, \ldots, x_T)$ into the multi-dimensional series $X_1, X_2, \ldots, X_K$ with vectors $X_i = (x_i, x_{i+1}, x_{i+2}, \ldots, x_{i+L-1})^T \in \mathbb{R}^L$, where 𝑖 = 1,2,…,𝐾 and 𝐾 = 𝑇 − 𝐿 + 1. the window length 𝐿 is the parameter that defines the embedding, where 2 ≤ 𝐿 ≤ 𝑇 − 1 [22]. if we need to emphasize the size (dimension) of the vectors $X_i$, then we shall call them 𝐿-lagged vectors. the 𝐿-trajectory matrix (or simply the trajectory matrix) of the series 𝑋 is defined as
$X = [X_1 : X_2 : \cdots : X_K] = \begin{pmatrix} x_1 & x_2 & \cdots & x_K \\ x_2 & x_3 & \cdots & x_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_L & x_{L+1} & \cdots & x_T \end{pmatrix}$. (1)
the lagged vectors $X_i$ are the columns of the trajectory matrix. both the rows and the columns of the trajectory matrix are sub-series of the original series. its (𝑖,𝑗) element is $x_{ij} = x_{i+j-1}$, which means that the matrix has equal elements on the antidiagonals 𝑖 + 𝑗 = const. hence the trajectory matrix is a hankel matrix.

singular value decomposition

the second step, the svd step, computes the singular value decomposition of the trajectory matrix 𝑋 and represents it as a sum of rank-one bi-orthogonal elementary matrices. set $S = XX^T$ and denote by $\lambda_1, \lambda_2, \ldots, \lambda_L$ the eigenvalues of 𝑆 taken in decreasing order of magnitude ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L \ge 0$) and by $U_1, U_2, \ldots, U_L$ the orthonormal system of eigenvectors of the matrix 𝑆 corresponding to these eigenvalues. let $d = \max\{i : \lambda_i > 0\} = \mathrm{rank}\, X$. if we denote $V_i = X^T U_i / \sqrt{\lambda_i}$, then the svd of the trajectory matrix can be written as
$X = X_1 + X_2 + \cdots + X_d$, (2)
where $X_i = \sqrt{\lambda_i}\, U_i V_i^T$ is the 𝑖-th elementary matrix. the collection of the three elements $(\sqrt{\lambda_i}, U_i, V_i)$ is called the 𝑖-th eigentriple of the svd.
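the embedding step and the matrix 𝑆 = 𝑋𝑋ᵀ can be sketched in a few lines of python. this is a minimal illustration rather than the implementation used in the paper: the function names `trajectory_matrix` and `lag_covariance` and the toy series are assumptions made for the example, and the eigen-decomposition of 𝑆 is deliberately left out, since in practice it would be delegated to a linear algebra library.

```python
def trajectory_matrix(series, window):
    """Return the L x K trajectory matrix of a series, with K = T - L + 1."""
    T = len(series)
    if not 2 <= window <= T - 1:
        raise ValueError("window length L must satisfy 2 <= L <= T - 1")
    K = T - window + 1
    # 0-based entry (i, j) is series[i + j], i.e. x_{i+j-1} in 1-based notation,
    # so the columns are the L-lagged vectors X_1, ..., X_K
    return [[series[i + j] for j in range(K)] for i in range(window)]


def lag_covariance(X):
    """Return S = X X^T as an L x L list of lists."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in X]
            for row_i in X]


series = [1, 2, 3, 4, 5, 6]  # toy data, not taken from the paper
X = trajectory_matrix(series, window=3)
S = lag_covariance(X)

for row in X:
    print(row)
```

with 𝑇 = 6 and 𝐿 = 3 the sketch produces a 3×4 matrix whose antidiagonals are constant, which is the hankel property stated above. the matrix `S` is symmetric by construction, so its eigenvalues are real and can be ordered as required by the svd step.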
grouping

the purpose of this step is to appropriately identify the trend, the oscillatory components with different periods, and the noise. this step can be skipped if one does not want to extract hidden information by regrouping and filtering components precisely. the grouping procedure partitions the set of indices {1,2,…,𝐿} into 𝑚 disjoint subsets $I_1, I_2, \ldots, I_m$, so that the elementary matrices in equation (2) are regrouped into 𝑚 groups. let $I = \{i_1, i_2, \ldots, i_p\}$. then the resultant matrix $X_I$ corresponding to the group 𝐼 is defined as $X_I = X_{i_1} + X_{i_2} + \cdots + X_{i_p}$. the matrices are computed for $I = I_1, I_2, \ldots, I_m$ and substituted into equation (2) to obtain the new expansion
$X = X_{I_1} + X_{I_2} + \cdots + X_{I_m}$. (3)
the grouping process is the phase in which the elementary 𝐿×𝐾 matrices are grouped into several sub-groups, namely trend patterns, seasonal or periodic patterns, and noise patterns. in this paper, the patterns are identified by fourier series analysis and long-memory analysis. fourier series analysis is used to identify a seasonal pattern, and long-memory analysis is used to identify the differencing parameter of the data. we use the gph method [23] to identify the differencing parameter of the time series.

diagonal averaging

the next step in basic ssa transforms each resultant matrix of the grouped decomposition (3) into a new one-dimensional series of length 𝑇 and is called diagonal averaging. let 𝑌 denote a matrix of order 𝐿×𝐾 with elements $y_{ij}$, 1 ≤ 𝑖 ≤ 𝐿, 1 ≤ 𝑗 ≤ 𝐾, and define 𝐿∗ = min(𝐿,𝐾), 𝐾∗ = max(𝐿,𝐾), and 𝑇 = 𝐿 + 𝐾 − 1. let $y_{ij}^* = y_{ij}$ if l