Int. J. Anal. Appl. (2022), 20:19 Approximating the Mode of the Non-Central Chi-Squared Distribution V. Ananiev∗, A. L. Read Department of Physics, University of Oslo, Norway ∗Corresponding author: victor.ananyev@gmail.com Abstract. In this paper we consider the probability density function (pdf) of the non-central χ2 dis- tribution with arbitrary number of degrees of freedom and non-centrality. For this function we find the approximate location of the maximum and discuss related edge cases of 1 and 2 degrees of free- dom. We also use this expression to demonstrate the improved performance of the C++ Boost’s implementation of the non-central χ2 and extend the domain of its applicability. 1. Introduction Properties of the non-central χ2 distribution were described before in literature [6–8]. However, the topic of the mode of the non-central χ2 was significantly underrepresented. We would like to focus on the mode specifically in this paper. Let X1,X2, ...,Xn be normally distributed random variables with unit variance and means µ1,µ2, ...,µn. The sum X21 + X 2 2 + ... + X 2 n follows the non-central χ 2 distribution with k = n degrees of freedom and non-centrality λ = µ21 + µ 2 2 + ... + µ 2 n. The probability density function of this distribution has a closed form expression: fk,λ(x) = 1 2 exp− x+λ 2 (x λ )k−2 4 Ik−2 2 ( √ λx) , (1.1) where Iν(x) is a modified Bessel function of the first kind. We are interested in the value of xmode that maximizes fk,λ(x). Typical shapes of the pdf of the non-central χ2 distribution are shown in Fig. 1. Received: Feb. 8, 2022. 2010 Mathematics Subject Classification. 41-02, 33C10, 62-04. Key words and phrases. non-central chi-squared; mode; linear approximation; Boost C++; performance. https://doi.org/10.28924/2291-8639-20-2022-19 ISSN: 2291-8639 © 2022 the author(s). https://doi.org/10.28924/2291-8639-20-2022-19 2 Int. J. Anal. Appl. (2022), 20:19 0 20 40 60 80 100 x 0.00 0.01 0.02 0.03 0.04 no n- ce nt ra l χ 2 de ns ity k=3, λ=20 k=10, λ=20 k=20, λ=20 k=50, λ=20 (a) Dependency on the number of d.o.f. k. 0 20 40 60 80 100 x 0.00 0.02 0.04 0.06 0.08 no n- ce nt ra l χ 2 de ns ity k=3, λ=5 k=3, λ=10 k=3, λ=30 k=3, λ=50 (b) Dependency on the non- centrality λ. Figure 1. Non-central χ2 distributions and behavior of the mode. When the number of degrees of freedom k is fixed, we can plot the dependency of the maximum of the pdf as a function of the non-centrality parameter λ, see Fig. 2. 0 2 4 6 8 10 non-centrality, λ 2 4 6 8 10 12 14 16 18 m od e ncχ2, k=10 ncχ2, k=5 ncχ2, k=3 Figure 2. Mode of the non-central χ2 as a function of the non-centrality parameter λ. We observe that the bigger λ is the better the mode appears to be approximated with a straight line. The derivation of the line parameters together with the analysis of the edge cases of small number of degrees of freedom, where the mode does not exist, constitute the main results of the paper. 2. Derivation 2.1. Master equation. In this section we obtain the transcendental equation (Eq. 2.2) that deter- mines the mode of the non-central χ2 distribution. We reduce it to the ordinary differential equation (Eq. 2.5), where the non-centrality parameter λ is the argument, and the number of degrees of free- dom k is a parameter. Finally, we solve the ODE approximately with a Taylor expansion (Eq. 2.10) and investigate edge cases of 1 and 2 degrees of freedom (Sec. 2.3). Int. J. Anal. Appl. (2022), 20:19 3 We start by setting the derivative of the density of the non-central χ2 (Eq. 2.1) to zero. This leads us to the transcendental equation (Eq. 2.2) that determines the mode of the distribution: d dx χ2k,λ(x) = 1 2 χ2d,λ(x) ·  −1 + k − 2 2x + √ λ x I′k−2 2 ( √ λx) Ik−2 2 ( √ λx)   , (2.1) d dx χ2k,λ(x) = 0 ⇒ √ λxI′k−2 2 ( √ λx) = (x − k − 2 2 )Ik−2 2 ( √ λx) . (2.2) We can eliminate the derivative in Eq. 2.2 by using the differential equation for the modified Bessel function [1, Eq. 10.25.1]: t2 d2 dt2 Iν(t) + t d dt Iν(t) − (t2 + ν2)Id,λ(t) = 0 . (2.3) To make use of Eq. 2.3, we need the expression for I′′k−2 2 , therefore, we differentiate Eq. 2.2 by λ. Since the mode depends on the non-centrality λ, we should remember that x = x(λ), thus dx dλ = x′. The resulting expression for I′′k−2 2 is as follows: √ λxI′′k−2 2 ( √ λx) = (x − k 2 )I′k−2 2 ( √ λx) + 2 √ λxx′ x + λx′ Ik−2 2 ( √ λx) . (2.4) We substitute I′k−2 2 (Eq. 2.2) and I′′k−2 2 (Eq. 2.4) into the differential equation for the modified Bessel function (Eq. 2.3). We then use the property [1, Eq. 10.29.4] to decrease the order of the derivatives of the modified Bessel functions. Assuming that the Bessel function itself is non-zero at the mode, we arrive to the following differential equation for the mode as a function of the non-centrality parameter λ: λx′(x −k −λ + 4) + x(x −k −λ + 2) = 0 . (2.5) 2.2. Approximate solution. We observed that the linear approximation works better with growing λ, thus we introduce the asymptotic parameter t = k λ << 1 to build the expansion. We expect the solution to be linear in λ, however the asymptotic expansion of x(t) = C0 + C1t + ... won’t provide us with a solution linear in λ. Therefore, we reparametrize x(t) with a new function y(t) = tx(t): t = k λ , (2.6) y(t) = tx(t) . (2.7) We obtain the following equation after the reparametrization: − (y ′t −y)(y −kt −k + 4t) + y(y −kt −k + 2t) = 0 . (2.8) To solve Eq. 2.8, we expand y(t) into the Taylor series by the scale parameter t = k λ . We would like to find the linear solution and one extra term that estimates the error. Thus, we cut the series at 4 Int. J. Anal. Appl. (2022), 20:19 the third power of t in order to account for the derivative. After solving algebraic equations for the coefficients near each power of t, we arrive to the resulting approximate expression for the mode: y(t) = C0 + C1t + C2t 2 + C3t 3 + O(t4) , (2.9) C0 = k, C1 = k − 3, C2 = k − 3 2k , (2.10) xmode = λ + k − 3 + k − 3 2λ + O ( k2 λ2 ) . (2.11) We plot the linear approximation Eq. 2.11 together with the precise numerical solution Fig. 2 in order to verify the approximation is correct, see Fig. 3. 0 2 4 6 8 10 non-centrality, λ 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 m od e k + λ - 3 ncχ2, k=10 ncχ2, k=5 ncχ2, k=3 Figure 3. Linear approximation to the mode of the non-central χ2 compared to the more precise numerical solution as a function of the non-centrality parameter λ. 2.3. Small number of degrees of freedom. 2.3.1. Case k < 2. The asymptotic behavior of the modified Bessel function at x → 0 [1, Eq. 10.30.1] shows that the pdf of the non-central χ2 diverges, thus it doesn’t have a mode: χ2k,λ(x) → 1 2Γ(k 2 ) 1 (2λ) k−2 2 e− λ 2 (√ λx )k−2 , x → 0 . (2.12) 2.3.2. Case k = 2. In this case, the pdf at x = 0 is finite. If the derivative at x = 0 is positive, then the maximum is not there. The expression for the derivative (Eq. 2.13) and its asymptotic behavior at x → 0 (Eq. 2.14) are shown below: d dx χ2k,λ(x) = 1 2 χ2d,λ(x) · [ −1 + √ λ x I−1( √ λx) I0( √ λx) ] , (2.13) d dx χ2k,λ(x) → 1 2 χ2d,λ(x) · [ −1 + λ 2 ] , x → 0 . (2.14) We observe that when λ > 2, the pdf of the non-central χ2 doesn’t have its maximum at x = 0. In the region λ < 2, the asymptotic scale t = k λ > 1, hence our approximation is inapplicable in this region and we refrain from analysing it. Int. J. Anal. Appl. (2022), 20:19 5 3. Application There exist a number of numerical procedures for finding the mode of a distribution [2, Ch. 10]. Some of them require the search region to be specified. For example, the widely used C++ library Boost [3] identifies the search region based on an initial guess for the mode x0. Boost iteratively checks regions of the form [x0/2, 2x0], [x0/22, 22x0], etc. When the value of the pdf at both ends of the region becomes smaller than the value at the initial guess point x0, the algorithm initiates the search for the maximum inside of the region. At the time of writing, Boost used x0 = k + 1 as the initial guess. We already know, based on the approximate solution (Eq. 2.11), that the chosen guess will undershoot at large non-centrality values λ. Let’s estimate λ above which the method will require the second iteration for the region to cover the mode. For this we compare the linear estimate for the location of the mode (Eq. 2.11) to the initial guess x0 used by Boost: k + λ− 3 > 2 · (k + 1) , (3.1) λ > k + 5 . (3.2) With Eq. 3.2, for any number of d.o.f. k we are able to specify the threshold α, defined by λ = αk, at which the original initial guess starts undershooting: α > 5 k + 1 . (3.3) We see that large k corresponds to small thresholds α. The most conservative estimate for the threshold would be at the smallest k possible: k = 2. Thus, α = 3.5 is the threshold that approximately works for k = 2 and is the overestimated threshold for bigger values of k. The threshold α (Eq. 3.3) is closely related to the asymptotic scale t (Eq. 2.6) that we used for finding the approximate solution, specifically: t = k λ = 1 α . For example, the conservative threshold α ≈ 3.5 corresponds to the asymptotic scale t ≈ 0.25 < 1. It means that the region where the original guess of Boost undershoots, is, at the same time, the region where our approximate solution for the mode becomes applicable and can be used as a corrected initial guess. However, the fact that we use the conservative threshold may lead to the situation where the original method has already started undershooting but λ is not yet big enough to turn on the corrected regime. 3.1. Dependency on λ. We fix the threshold to the conservative value k λ = 0.25. We then plot the dependency of the run time on the non-centrality λ for a set of d.o.f. k: 2, 15, 50, see Fig. 4. For benchmarking we use the Google benchmark library [4]. The benchmarking script itself became a part of the Boost.Math [5]. Using this script we measure the run time 100 times and use the mean as a central value. The error bar is computed as a standard deviation. We add noise with standard deviation σ = 10−6 to parameters k and λ to avoid caching effects. The vertical line on the plots shows the threshold where the original initial guess for smaller λ is switched to the corrected value at 6 Int. J. Anal. Appl. (2022), 20:19 bigger λ. Therefore, we expect that both lines coincide below the threshold and the improved solution would lie lower above the threshold. One can notice missing values on the curve representing the original initial guess. The reason for this is the numerical instability of the algorithm in Boost, that has been resolved after we corrected the initial guess. 101 102 103 λ 60 80 100 120 140 160 180 200 tim e, μ s baseline new (a) 101 102 103 λ 40 60 80 100 120 140 160 tim e, μ s baseline new (b) 101 102 103 λ 60 80 100 120 140 160 tim e, μ s baseline new (c) Figure 4. Run time as a function of the non-centrality λ for d.o.f. k = 2 (4a), k = 15 (4b), k = 50 (4c). Vertical line shows the threshold at which the corrected expression replaces the original initial guess. 3.2. Dependency on d.o.f. (k). In the set of plots in Fig. 5, we fix the asymptotic scale to values k λ = 0.25, 0.15, 0.05 and investigate the dependency of the run time on the number of d.o.f. Since the threshold is fixed, the difference in the run time is caused by the actual position where the original initial guess starts to undershoot, the non-conservative threshold. The farther the fixed threshold is from the non-conservative threshold, the more significant the effect of undershooting at the test point will be. Therefore, we expect the difference in the run time to grow with number of d.o.f, as follows from Eq. 3.3. For each value of the asymptotic scale, in addition to the full plot, we also show a zoomed version that shows the region where both original and improved methods were able to converge (Fig. 5). 4. Conclusion In this paper we present an approximate expression for the mode of the non-central χ2 distribution: xmode ≈ k +λ−3, where k is the number of degrees of freedom and λ is the non-centrality parameter. The approximation is based on an asymptotic expansion and is valid in the region where the scale parameter k λ << 1 and where the mode exists k > 2. The approximate formula can be used as the initial guess for iterative procedures searching for a precise solution. Run time performance and the domain of applicability of the Boost implementation of the mode search was improved using the presented approximate expression. The improvement became a part of the Boost.Math library [5]. Int. J. Anal. Appl. (2022), 20:19 7 101 102 0 50 100 150 200 250 tim e, μ s baseline new 101 102 10μ k 100 200 μ00 400 500 tim e, μ s baseline new (a) 101 102 0 50 100 150 200 250 tim e, μ s baseline new 101 102 10μ k 200 400 600 tim e, μ s baseline new (b) 101 102 0 50 100 150 200 tim e, μ s baseline new 101 102 10μ k 200 400 600 800 1000 tim e, μ s baseline new (c) Figure 5. Run time as a function of the number of d.o.f. (k) for the asymptotic scale values k λ = 0.25 (5a), k λ = 0.15 (5b), k λ = 0.05 (5c). The upper plot in each pair shows the zoomed version, focused on the region where both original and improved methods were able to converge. 5. Acknowledgements We would like to thank Mykola Semenyakin for the numerous fruitful and motivating discussions. We also would like to acknowledge the support of the Boost community that allowed the contribution to become a part of the Boost Library. This research was supported by the European Unions Framework Programme for Research and Innovation Horizon 2020 (2014-2021) under the Marie Sklodowska-Curie Grant Agreement No.765710. References [1] F.W.J. Olver et al., NIST Digital Library of Mathematical Functions, (2021). http://dlmf.nist.gov/. [2] W.H. Press, S.A. Teukolsky, W.T. Vetterling, B.P. Flannery, Numerical Recipes 3rd Edition, (2007). http:// numerical.recipes/. [3] Boost C++ Libraries, v1.76.0, (2021). https://www.boost.org/. [4] Google benchmark, (2021). https://github.com/google/benchmark. [5] Boost Math, (2021). https://github.com/boostorg/math/pull/645. [6] S. András, Á. Baricz, Properties of the Probability Density Function of the Non-Central Chi-Squared Distribution, J. Math. Anal. Appl. 346 (2008), 395–402. https://doi.org/10.1016/j.jmaa.2008.05.074. [7] D. Horgan, C.C. Murphy, On the Convergence of the Chi Square and Noncentral Chi Square Distributions to the Normal Distribution, IEEE Commun. Lett. 17 (2013), 2233–2236. https://doi.org/10.1109/LCOMM.2013. 111113.131879. [8] L. Saulis, Asymptotic Expansion for the Distribution and Density Functions of the Quadratic Form of a Stationary Gaussian Process in the Large Deviation Cramer Zone, Nonlinear Anal.: Model. Control. 6 (2001), 87–101. https: //doi.org/10.15388/NA.2001.6.1.15218. [9] S.S. Sawant, D.A. Levin, V. Theofilis, Analytical Prediction of Low-Frequency Fluctuations Inside a One- Dimensional Shock, arXiv:physics.flu-dyn (2021). https://arxiv.org/abs/2101.00664. [10] V. Pereyra, Iterated Deferred Corrections for Nonlinear Operator Equations, Numer. Math. 10 (1967), 316–323. https://doi.org/10.1007/BF02162030. http://dlmf.nist.gov/ http://numerical.recipes/ http://numerical.recipes/ https://www.boost.org/ https://github.com/google/benchmark https://github.com/boostorg/math/pull/645 https://doi.org/10.1016/j.jmaa.2008.05.074 https://doi.org/10.1109/LCOMM.2013.111113.131879 https://doi.org/10.1109/LCOMM.2013.111113.131879 https://doi.org/10.15388/NA.2001.6.1.15218 https://doi.org/10.15388/NA.2001.6.1.15218 https://arxiv.org/abs/2101.00664 https://doi.org/10.1007/BF02162030 1. Introduction 2. Derivation 2.1. Master equation 2.2. Approximate solution 2.3. Small number of degrees of freedom 3. Application 3.1. Dependency on 3.2. Dependency on d.o.f. (k) 4. Conclusion 5. Acknowledgements References