Science and Technology, 6 (2001) 81-87
© 2001 Sultan Qaboos University

Approximating the Tail Probability of the t-Distribution: A Bayesian Approach

Mohammad Fraiwan Al-Saleh

Department of Mathematics and Statistics, College of Science, Sultan Qaboos University, P.O. Box 36, Al Khod 123, Muscat, Sultanate of Oman.

ABSTRACT: A Bayesian technique is used to approximate the tail probability of the t-distribution. A set of upper and lower bounds is obtained for this probability. Because they are both simple and accurate, these bounds are well suited to practical use. Some members of this set are compared with existing approximations. The possibility of using the procedure for other distributions is explored.

KEYWORDS: Mill's Ratio, Normal Distribution, t-Distribution and Tail Probability.

1. Introduction

If $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution with mean $\theta$ and variance $\sigma^2$, both finite but unknown, and if $\bar{X} = \sum X_i / n$ and $S^2 = \sum (X_i - \bar{X})^2 / (n-1)$, then the statistic $\sqrt{n}(\bar{X} - \theta)/S$ has a t-distribution. This statistic is very useful in constructing tests and confidence intervals for $\theta$. For a brief recent description of this distribution and its properties, see Stuart and Ord (1994) and Johnson et al. (1995).
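As a small illustration of the statistic $\sqrt{n}(\bar{X} - \theta)/S$ defined above, the following Python sketch computes it for a sample. The function name `t_statistic` and the sample values are hypothetical and serve only as an example.

```python
import math
import statistics


def t_statistic(sample, theta):
    """Return sqrt(n) * (mean - theta) / S for a hypothesized mean theta.

    S is the sample standard deviation computed with divisor n - 1.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # uses divisor n - 1
    return math.sqrt(n) * (x_bar - theta) / s


if __name__ == "__main__":
    data = [2.0, 4.0, 6.0]  # hypothetical sample
    print(t_statistic(data, theta=3.0))
```

The tail probability of the t-distribution evaluated at this value is what the approximations discussed in the paper are meant to supply.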
The importance of approximating the tail probability of this distribution is due to the fact that this probability is frequently used in constructing confidence intervals and in finding the p-values of statistical tests. There has been intensive work on approximating the t-distribution, which has produced approximations of very high accuracy, though sometimes very complicated ones. Fisher (1935) gave a direct expansion of the probability density function, and hence of $F(t; v) = P(t_v \le t)$, as a series in $v^{-1}$, where $v$ is the number of degrees of freedom. Elfving (1955) suggested the following approximation:

$$P(t_v \le t) = \Phi(\sigma t) + \frac{5}{96}\, v^{-2}\, t^{5}\, (1 + .5\, t^{2} v^{-1})^{-.5(v+4)}\, \phi(\sigma t),$$

where

$$\sigma = \left(\frac{v - .5}{v + .5\, t^{2}}\right)^{.5},$$

and $\Phi$ and $\phi$ are, respectively, the cumulative distribution function and the density function of the standard normal distribution. Cucconi (1962) obtained the following approximations:

$$t_{v,.975} \approx 1.96\, v\, (v^{2} - 3.185 v + 1.696)^{-.5} \quad \text{for } v > 1,$$

$$t_{v,.995} \approx 2.5758\, v\, (v^{2} - 3.185 v + 4.212)^{-.5} \quad \text{for } v > 2,$$

where $t_{v,\alpha}$ is a number with tail area (probability) $(1 - \alpha)$. Pinkham and Wilk (1954) suggested the use of the expansion:

$$\int_t^{\infty} (1 + y^{2} v^{-1})^{-.5(v+1)}\, dy = \sum_{i=1}^{m} w_i + R_m(t; v), \quad m < .5(v+1),$$

where

$$w_1 = v (v-1)^{-1} t^{-1} (1 + t^{2} v^{-1})^{-.5(v-1)}; \quad w_{i+1} = w_i\, (1 + v t^{-2})\, \frac{2i - 1}{2i - v + 1}, \quad i = 1, 2, \ldots, m-1,$$

and $R_m$ is the remainder term.

Abu-Dayyeh and Ahmed (1993) considered a similar problem. They provided an upper bound for Mill's ratio $R(x) = (1 - \Phi(x))/\phi(x)$, expressed in closed form through $\max(\cdot,\cdot)$, $x$ and $\pi$ (the formula itself is not legible in this copy). Al-Saleh (1994) initiated a Bayesian approach to approximate Mill's ratio and hence the tail probability of the standard normal distribution. He obtained a sequence of upper bounds and a sequence of lower bounds for $R(x)$, each converging to $R(x)$.

As mentioned by Johnson et al. (1995), the available tables of the t-distribution are more than sufficient for almost all applications. However, a major concern raised by these authors is how to evaluate the tail probability quickly.
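The Pinkham and Wilk expansion quoted above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the multiplication by the normalizing constant of the t density (which turns the integral into a tail probability) is a step added here from the definition of the density rather than taken from the text. The recursion is written as $w_{i+1} = w_i (1 + v t^{-2})(2i-1)/(2i - v + 1)$, the form obtained by repeated integration by parts.

```python
import math


def t_norm_const(v):
    """Normalizing constant of the t density with v degrees of freedom."""
    return math.gamma(0.5 * (v + 1)) / (math.gamma(0.5 * v) * math.sqrt(math.pi * v))


def pinkham_wilk_terms(t, v, m):
    """Return [w_1, ..., w_m] of the expansion; requires t > 0 and 1 <= m < .5(v+1)."""
    if not (t > 0 and 1 <= m < 0.5 * (v + 1)):
        raise ValueError("need t > 0 and 1 <= m < .5(v+1)")
    w = v / (v - 1) / t * (1.0 + t * t / v) ** (-0.5 * (v - 1))
    terms = [w]
    for i in range(1, m):
        # w_{i+1} = w_i (1 + v t^-2) (2i - 1) / (2i - v + 1)
        w = w * (1.0 + v / (t * t)) * (2 * i - 1) / (2 * i - v + 1)
        terms.append(w)
    return terms


def tail_pinkham_wilk(t, v, m):
    """Approximate P(t_v > t) by the first m terms, ignoring the remainder R_m."""
    return t_norm_const(v) * sum(pinkham_wilk_terms(t, v, m))


if __name__ == "__main__":
    for m in (1, 2):
        print(m, tail_pinkham_wilk(1.812, 10, m))
```

For $v = 10$ and $t = 1.812$ (exact tail probability .050), the one-term and two-term sums fall on opposite sides of the exact value, which suggests the expansion is best suited to larger values of $t$.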
It is well known that the t-distribution converges to the normal distribution as $v$ goes to infinity. Thus, for $v \ge 30$, if $F(x)$ is the distribution function of the t-distribution, then $F(x) \approx \Phi(x)$. However, this approximation is not very accurate for small $v$. Recently, Li and Moor (1999) suggested approximating $F(x)$ by $\Phi(\lambda x)$, where $\lambda$ is a shrinkage factor given by

$$\lambda = \frac{4v + x^{2} - 1}{4v + 2x^{2}}.$$

This approximation has a much simpler form and is very accurate when compared with many of the approximations listed in Johnson et al. (1995). A weak point of this approximation is that it is written in terms of $\Phi$, which has no closed form and has to be obtained from tables. For other approximations of the t-distribution see Johnson et al. (1995) and Gleason (2000).

In this paper we use the Bayesian approach introduced by Al-Saleh (1994) to obtain new approximations of the tail probability of the t-distribution. These approximations, which turn out to have a simple form, can give very accurate values. The possibility of applying this approach to some other distributions is discussed.

2. Derivation of the Bounds

Assume that $X$ is a random variable that has a t-density with parameters $\theta$ and $v$. Then the density of $X$ is

$$g(x; \theta, v) = \frac{\Gamma(.5(v+1))}{\Gamma(.5v)\sqrt{v\pi}} \left(1 + \frac{(x-\theta)^{2}}{v}\right)^{-.5(v+1)},$$

where $v$ is a positive number and $x$ is any real number. Assume further that the median $\theta$ is positive. Let $\pi(\theta)$ be an improper uniform prior of $\theta$, defined by $\pi(\theta) = 1$ for $\theta > 0$ and zero otherwise. Then the posterior density of $\theta$ for given $x$ can be written as

$$\pi(\theta \mid x) = \frac{f(x-\theta)}{\int_0^{\infty} f(x-\theta)\, d\theta} = \frac{f(x-\theta)}{F(x)},$$

where $f(\cdot)$ stands for the t density with parameter $v$ and zero median, and $F$ stands for the corresponding cumulative distribution. The main object of this paper is to approximate the tail probability, $1 - F(x)$.
Now, if $v = 1$, the distribution function of the t-distribution is the same as that of the Cauchy distribution, which has a closed form. For $v \ge 2$, the posterior expected value of $\theta^{k}$ is finite for $k = 1, \ldots, v-1$ and is given by

$$E(\theta^{k} \mid x) = \frac{1}{F(x)} \int_0^{\infty} \theta^{k} f(x-\theta)\, d\theta = \frac{1}{F(x)} \int_{-\infty}^{x} (x-t)^{k} f(t)\, dt,$$

and hence, replacing $x$ by $-x$ and using the symmetry of $f$,

$$E(\theta^{k} \mid -x) = \frac{(-1)^{k}}{1 - F(x)} \int_x^{\infty} (x-t)^{k} f(t)\, dt. \quad (1)$$

Since $\theta > 0$, we have $E(\theta^{k} \mid -x) > 0$ for all values of $x$. Thus,

$$\int_x^{\infty} (x-t)^{k} f(t)\, dt > 0 \ \text{for even values of } k, \quad \text{and} \quad \int_x^{\infty} (x-t)^{k} f(t)\, dt < 0 \ \text{for odd values of } k.$$

Now, the last integral can be written as

$$\int_x^{\infty} (x-t)^{k} f(t)\, dt = \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} x^{k-i} \mu_i(x), \quad (2)$$

where $\mu_i(x) = \int_x^{\infty} t^{i} f(t)\, dt$. Integrating by parts, it can be shown that for $i = 2, \ldots, k$; $k \le v-1$ and $v \ge 2$ we have

$$\mu_i(x) = \frac{v-1}{v-i}\, x^{i-1} \mu_1(x) + \frac{v(i-1)}{v-i}\, \mu_{i-2}(x), \quad (3)$$

where

$$\mu_0(x) = 1 - F(x); \quad \mu_1(x) = \frac{v}{v-1}\left(1 + \frac{x^{2}}{v}\right) f(x).$$

Thus, using (3), upper and lower bounds can be obtained for the tail probability of the standard t-distribution, i.e. for the quantity $\mu_0(x) = 1 - F(x)$. For $i$ even, $\mu_i$ can be written as $\mu_i(x) = A_i(x)\mu_1(x) + B_i \mu_0(x)$, where

$$A_i(x) = a_i(x) + \sum_{k=2}^{i/2} \Big( a_{2k-2}(x) \prod_{j=k}^{i/2} b_{2j} \Big); \quad B_i = \prod_{j=1}^{i/2} b_{2j};$$

$$a_i(x) = \frac{v-1}{v-i}\, x^{i-1} \quad \text{and} \quad b_i = \frac{(i-1)v}{v-i}.$$

For $i$ odd, $\mu_i(x)$ can be written as $\mu_i(x) = C_i(x)\mu_1(x)$, where

$$C_i(x) = a_i(x) + \prod_{j=1}^{(i-1)/2} b_{2j+1} + \sum_{k=1}^{(i-3)/2} \Big( a_{2k+1}(x) \prod_{j=k}^{(i-3)/2} b_{2j+3} \Big).$$

Thus, for $k < v$, with $B_0 = C_1 = 1$ and $A_0 = 0$, we have

$$\int_x^{\infty} (x-t)^{k} f(t)\, dt = \mu_0(x) \sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} B_i - \mu_1(x) \Big( \sum_{i\ \mathrm{odd}} \binom{k}{i} x^{k-i} C_i(x) - \sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} A_i(x) \Big).$$

Hence, for $k$ even we have:

$$\mu_0(x) = 1 - F(x) \ge L_k(x) = \mu_1(x)\, \frac{\sum_{i\ \mathrm{odd}} \binom{k}{i} x^{k-i} C_i(x) - \sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} A_i(x)}{\sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} B_i}, \quad (4)$$

while for $k$ odd (and $x > 0$, so that the sum multiplying $\mu_0(x)$ is positive) we have

$$\mu_0(x) = 1 - F(x) \le U_k(x) = \mu_1(x)\, \frac{\sum_{i\ \mathrm{odd}} \binom{k}{i} x^{k-i} C_i(x) - \sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} A_i(x)}{\sum_{i\ \mathrm{even}} \binom{k}{i} x^{k-i} B_i}, \quad (5)$$

where $L_k(x)$ is the $k$th lower bound of $1 - F(x)$ and $U_k(x)$ is the $k$th upper bound of $1 - F(x)$.

3. Numerical Calculations of $L_k(x)$ and $U_k(x)$

To see how accurate $L_k(x)$ and $U_k(x)$ are, the two bounds have been obtained for some values of $v$ and $k$. For $v = 10$ and $k = 3, 4$, the two consecutive bounds are:

$$L_4(x) = \frac{.8929 x^{3} + 5.8036 x}{x^{4} + 7.5 x^{2} + 6.25}\, \mu_1(x); \quad U_3(x) = \frac{.9107 x^{2} + 2.8571}{x^{3} + 3.75 x}\, \mu_1(x).$$

For $v = 15$, the two bounds are:

$$L_4(x) = \frac{.9324 x^{3} + 5.5944 x}{x^{4} + 6.9231 x^{2} + 4.7203}\, \mu_1(x); \quad U_3(x) = \frac{.9359 x^{2} + 2.5}{x^{3} + 3.4615 x}\, \mu_1(x).$$

And for $v = 20$, we have:

$$L_4(x) = \frac{.9498 x^{3} + 5.4534 x}{x^{4} + 6.6667 x^{2} + 4.1667}\, \mu_1(x); \quad U_3(x) = \frac{.9510 x^{2} + 2.3529}{x^{3} + 3.3333 x}\, \mu_1(x).$$

Here,

$$\mu_1(x) = f(x)\left(1 + \frac{x^{2}}{v}\right)\frac{v}{v-1} \quad \text{and} \quad f(x) = \frac{\Gamma(.5(v+1))}{\Gamma(.5v)\sqrt{v\pi}} \left(1 + \frac{x^{2}}{v}\right)^{-.5(v+1)}.$$

For a given $v$ and suitable $k$, we take the average of the two bounds as an approximation of the tail probability $1 - F(x)$, i.e. for even $k$

$$1 - F(x) \approx \alpha_k^{*}(x) = \frac{L_k(x) + U_{k-1}(x)}{2}, \quad (6)$$

and for odd $k$ we have

$$1 - F(x) \approx \alpha_k^{*}(x) = \frac{L_{k-1}(x) + U_k(x)}{2}. \quad (7)$$

Table 1: Values of $\alpha_4^{*}(x)$, $\alpha_1$, $\alpha_2$ and $\alpha_3$

 v      x     Exact α   α*_4(x)    α_1      α_2      α_3
10    1.812    .050      .0505    .0583    .0639    .0500
      2.228    .025      .0252    .0288    .0309    .0249
      2.764    .010      .0100    .0093    .0123    .0098
      3.169    .005      .0051    .0041    .0063    .0048
15    1.753    .050      .0506    .0595    .0609    .0500
      2.131    .025      .0252    .0310    .0289    .0250
      2.602    .010      .0101    .0132    .0111    .0100
      2.947    .005      .0050    .0054    .0054    .0050
20    1.725    .050      .0505    .0599    .0610    .0500
      2.086    .025      .0252    .0318    .0288    .0250
      2.528    .010      .0101    .0136    .0111    .0100
      2.845    .005      .0050    .0063    .0054    .0050

$\alpha_4^{*}(x)$ is compared with the approximations provided by Elfving (1955), Pinkham and Wilk (1954), and Li and Moor (1999), denoted by $\alpha_1$, $\alpha_2$ and $\alpha_3$ respectively. Table 1 contains the values of $\alpha_4^{*}(x)$, $\alpha_1$, $\alpha_2$ and $\alpha_3$ for selected values of $v$ and $x$. The values of $x$ are those used frequently in applications, i.e. values that correspond to exact tail values of .0500, .0250, .0100, and .0050.
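The bounds (4) and (5) and the averages (6) and (7) can be reproduced numerically. The following is a minimal Python sketch, not the author's own program: rather than expanding $A_i$, $B_i$ and $C_i$ as closed sums, it carries the coefficients of $\mu_1$ and $\mu_0$ forward with recursion (3), which gives the same quantities. All function names are hypothetical. `li_moor_tail` implements the Li and Moor (1999) approximation $1 - \Phi(\lambda x)$, with $\lambda = (4v + x^2 - 1)/(4v + 2x^2)$, for comparison with the $\alpha_3$ column.

```python
import math
from math import comb


def t_density(x, v):
    """Density f(x) of the t-distribution with v degrees of freedom and zero median."""
    c = math.gamma(0.5 * (v + 1)) / (math.gamma(0.5 * v) * math.sqrt(math.pi * v))
    return c * (1.0 + x * x / v) ** (-0.5 * (v + 1))


def mu1(x, v):
    """mu_1(x) = v/(v-1) * (1 + x^2/v) * f(x)."""
    return v / (v - 1) * (1.0 + x * x / v) * t_density(x, v)


def bound(x, v, k):
    """Return L_k(x) if k is even, U_k(x) if k is odd; requires x > 0 and 1 <= k < v."""
    if not (x > 0 and 1 <= k < v):
        raise ValueError("need x > 0 and 1 <= k < v")
    # Write mu_i = p[i] * mu_1 + q[i] * mu_0 and apply recursion (3).
    # For even i, p[i] = A_i and q[i] = B_i; for odd i, p[i] = C_i and q[i] = 0.
    p = [0.0, 1.0]
    q = [1.0, 0.0]
    for i in range(2, k + 1):
        a_i = (v - 1) / (v - i) * x ** (i - 1)
        b_i = (i - 1) * v / (v - i)
        p.append(a_i + b_i * p[i - 2])
        q.append(b_i * q[i - 2])
    # Odd i enter the numerator with +C_i, even i with -A_i.
    num = sum((-1) ** (i + 1) * comb(k, i) * x ** (k - i) * p[i] for i in range(1, k + 1))
    den = sum(comb(k, i) * x ** (k - i) * q[i] for i in range(0, k + 1, 2))
    return num / den * mu1(x, v)


def alpha_star(x, v, k):
    """Average of the two consecutive bounds of orders k and k-1, as in (6) and (7)."""
    if k < 2:
        raise ValueError("need k >= 2")
    return 0.5 * (bound(x, v, k) + bound(x, v, k - 1))


def li_moor_tail(x, v):
    """Li and Moor (1999) approximation of the tail: 1 - Phi(lambda * x)."""
    lam = (4 * v + x * x - 1) / (4 * v + 2 * x * x)
    return 0.5 * math.erfc(lam * x / math.sqrt(2.0))


if __name__ == "__main__":
    for v, x in [(10, 1.812), (15, 1.753), (20, 1.725)]:
        print(v, x, bound(x, v, 4), bound(x, v, 3), alpha_star(x, v, 4), li_moor_tail(x, v))
```

For the first row of Table 1 ($v = 10$, $x = 1.812$), the printed $L_4$ and $U_3$ should lie on either side of .050 and their average should be close to the tabulated .0505. Only the standard library is used, so the exact tail probability is taken from the table rather than computed.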
It can be seen from this table that the value of $\alpha_4^{*}(x)$ is very accurate and closer to the exact value than the first two approximations. Furthermore, the values of $\alpha_4^{*}(x)$ are almost as accurate as the values of $\alpha_3$. Note that more accurate bounds can be obtained using higher values of $k$. For example, if we take $k = 5$ (with $v = 10$), then

$$U_5(x) = \frac{.9071 x^{4} + 10.7321 x^{2} + 22.8571}{x^{5} + 12.5 x^{3} + 31.25 x}\, \mu_1(x).$$

Table 2 contains the values of $\alpha_5^{*}(x)$, $\alpha_1$, $\alpha_2$ and $\alpha_3$ for selected values of $x$ when $v = 10$. It can be concluded from this table that the values of $\alpha_5^{*}(x)$ are even more accurate than those of $\alpha_3$.

Table 2: Values of $\alpha_5^{*}(x)$, $\alpha_1$, $\alpha_2$ and $\alpha_3$

 v      x     Exact α   α*_5(x)    α_1      α_2      α_3
10    1.812    .050      .0501    .0583    .0639    .0500
      2.228    .025      .0250    .0288    .0309    .0249
      2.764    .010      .0100    .0093    .0123    .0098
      3.169    .005      .0050    .0041    .0063    .0048

4. Other Applications of the Technique

The Bayesian approach, which is used in this paper to approximate the t-distribution, was used by the author to approximate the normal distribution. An inspection of the procedure reveals that it can be applied to some other distributions. If $X$ has a density $f(x)$ that is symmetric around zero and if we let $Y = X + \theta$, where $\theta$ is a location parameter, then the density of $Y$ is $f(y - \theta)$. If we impose a uniform prior on $\theta$ of the type $\pi(\theta) = 1$ for $\theta \ge 0$ and zero otherwise, then the posterior density of $\theta$ given $x$ is

$$\pi(\theta \mid x) = \frac{f(x-\theta)}{\int_0^{\infty} f(x-\theta)\, d\theta} = \frac{f(x-\theta)}{F(x)}.$$

All moments of this density are nonnegative and hence, as in Section 2, it can be shown that

$$\int_x^{\infty} (x-t)^{k} f(t)\, dt = \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} x^{k-i} \mu_i(x),$$

where $\mu_i(x) = \int_x^{\infty} t^{i} f(t)\, dt$. Now, depending on the functional form of $f(x)$, it may be possible to obtain a recursive formula for $\mu_i(x)$ like the one in equation (3). We believe that some distributions, such as the lognormal, the non-central t and other location-type distributions, can benefit from this procedure.
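To illustrate how the procedure carries over to another symmetric density, the following Python sketch applies it to the standard normal distribution. It is an illustration under stated assumptions and is not taken from Al-Saleh (1994): the recursion $\mu_i(x) = x^{i-1}\phi(x) + (i-1)\mu_{i-2}(x)$ with $\mu_1(x) = \phi(x)$ is obtained here by integrating by parts, and the function names are hypothetical.

```python
import math
from math import comb


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_tail(x):
    """Exact tail probability 1 - Phi(x), for comparison."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normal_tail_bound(x, k):
    """Lower bound of 1 - Phi(x) for even k, upper bound for odd k; requires x > 0, k >= 1."""
    if not (x > 0 and k >= 1):
        raise ValueError("need x > 0 and k >= 1")
    # Write mu_i = p[i] * phi(x) + q[i] * mu_0, using
    # mu_i = x^(i-1) * phi(x) + (i-1) * mu_{i-2}.
    p = [0.0, 1.0]
    q = [1.0, 0.0]
    for i in range(2, k + 1):
        p.append(x ** (i - 1) + (i - 1) * p[i - 2])
        q.append((i - 1) * q[i - 2])
    num = sum((-1) ** (i + 1) * comb(k, i) * x ** (k - i) * p[i] for i in range(1, k + 1))
    den = sum(comb(k, i) * x ** (k - i) * q[i] for i in range(0, k + 1, 2))
    return num / den * phi(x)


if __name__ == "__main__":
    x = 2.0
    print(normal_tail(x), [normal_tail_bound(x, k) for k in range(1, 5)])
```

With $k = 1$ and $k = 2$ the sketch reduces to the classical bounds $x\phi(x)/(x^2 + 1) \le 1 - \Phi(x) \le \phi(x)/x$ for $x > 0$, which is a useful check that the recursion has been set up correctly.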
Another useful application of the procedure is for estimating the cumulative distribution of the bivariate normal and other bivariate distributions.

5. Concluding Remarks

There has been considerable work on approximating the tail probability of the t-distribution. Simplicity and accuracy are both important factors in assessing the value of an approximation. In this paper, we use a Bayesian approach to provide a set of upper and lower bounds for this probability; the set consists of $[v-1]$ members. Any member of the set, or a combination of members, can serve as an approximation. Taking the average of two consecutive lower and upper bounds can be a good choice. This approach turns out to be suitable for providing simple and accurate approximations and can be used for similar problems. Unlike many other approximations, the current procedure does not depend on $\Phi(x)$.

6. Acknowledgment

I wish to thank the referees for their constructive comments and suggestions.

References

ABU-DAYYEH, W. and AHMED, M. 1993. Some new bounds on the tail probability of standard normal distribution. Journal of Information and Optimization Sciences 14: 155-159.
AL-SALEH, M. FRAIWAN. 1994. Mill's ratio: a Bayesian approach. Pakistan Journal of Statistics 10: 629-632.
CUCCONI, O. 1962. On simple relation between the number of degrees of freedom and the critical value of student-t. Memorie Academia Patavina 74: 179-187.
ELFVING, G. 1955. An expansion principle for distribution functions with application to statistics. Annals Academiae Scientiarum Fennicae, series A 204: 1-8.
FISHER, R.A. 1935. The mathematical distributions used in the common tests of significance. Econometrica 3: 353-365.
GLEASON, J.R. 2000. A note on a proposed student t approximation. Computational Statistics and Data Analysis 34: 63-66.
JOHNSON, N., KOTZ, S. and BALAKRISHNAN, N. 1995. Continuous Univariate Distributions.
John Wiley and Sons, New York.
LI, B. and MOOR, B. 1999. A corrected normal approximation for the student t distribution. Computational Statistics and Data Analysis 29: 213-216.
PINKHAM, R. and WILK, M. 1954. Tail areas of the t-distribution from a Mills-ratio-like expansion. Annals of Mathematical Statistics 34: 335-337.
STUART, A. and ORD, J. 1994. Kendall's Advanced Theory of Statistics. Edward Arnold, London.

Received 5 January 2000
Accepted 22 January 2001