Choosing the number of points in a low discrepancy sequence
The goal of this page is to present a method which allows to select an appropriate number of points given the properties of a low discrepancy sequence.
Consider the Halton sequence in s dimensions. This sequence makes use of the first s prime numbers. Hence this is not a (t,m,s)-sequence in any base b, because the base changes. Anyway, the Halton sequence has equidistribution properties which can be used to select a proper number of points, depending on the dimension of the problem and the properties of the low dimensional projections.
Consider the Halton sequence in s dimensions and consider its one dimensional projections. The one dimensional projection which is the hardest to fill is the one associated with the largest prime number, i.e. the s-th prime number p(s). Therefore let us focus on this projection. We consider elementary intervals of length 1/p(s). If the number of points N is equal to p(s), then each elementary interval of the s-th dimensional projection has one point. Since the first s-1 one dimensional projections are associated with smaller primes, they must contain more that one point. In other words, in order to get good equidistribution of the one dimensional projections of the Halton sequence, we consider a number of points N equal to the prime number of the largest dimension, i.e. p(s).
The following script computes the first 100 first prime numbers.
The previous script prints the following output. Hence, in dimension s=6, if we consider N=13 points, therefore there is one point in the elementary intervals of length L=1/13 for the 6-th dimension. The dimensions 1 to 5 have more points.
1. 2. 2. 3. 3. 5. 4. 7. 5. 11. 6. 13. 7. 17. 8. 19. 9. 23. 10. 29. 11. 31. 12. 37. 13. 41. 14. 43. 15. 47. 16. 53. 17. 59. 18. 61. 19. 67. 20. 71. ...
For example, consider the Halton sequence in s=5 dimensions. Since the 5-th prime number is 11, we must consider at least N=11 points so that the elementary intervals of the 5-th dimension have one point.
s=5; p=number_primes100(); N=p(s); u=lowdisc_ldgen(N,s,"halton"); // scf(); plot(u(:,s-1),u(:,s),"bo") lowdisc_plotelembox([p(s-1) p(s)],[1 1]) xlabel(msprintf("X(%d)",s-1)); ylabel(msprintf("X(%d)",s)); | ![]() | ![]() |
The previous script produces the following output. We see that the elementary intervals of the one dimensional projection X(5) have one point, although the elementary intervals of X(4) have 1 to 2 points.
Consider the Halton sequence in s dimensions and consider its two dimensional projections. If we consider N=p(s-1)*p(s) points, i.e. the product of the two largest primes used in the sequence, then the elementary intervals of area A=1/N have one point.
The following script prints the number of points necessary to have more than one point in the elementary intervals of the two dimensional projections of the Halton sequence.
p=number_primes100(); for s=2:20 N=p(s-1)*p(s); A=1/N; mprintf("s=%d, N=%d, A=%.2e\n",s,N,A) end | ![]() | ![]() |
This produces the following output.
s=2, N=6, A=1.67e-001 s=3, N=15, A=6.67e-002 s=4, N=35, A=2.86e-002 s=5, N=77, A=1.30e-002 s=6, N=143, A=6.99e-003 s=7, N=221, A=4.52e-003 s=8, N=323, A=3.10e-003 s=9, N=437, A=2.29e-003 s=10, N=667, A=1.50e-003 s=11, N=899, A=1.11e-003 s=12, N=1147, A=8.72e-004 s=13, N=1517, A=6.59e-004 s=14, N=1763, A=5.67e-004 s=15, N=2021, A=4.95e-004 s=16, N=2491, A=4.01e-004 s=17, N=3127, A=3.20e-004 s=18, N=3599, A=2.78e-004 s=19, N=4087, A=2.45e-004 s=20, N=4757, A=2.10e-004
In the following script we plot N=77 points of the Halton sequence in s=5 dimensions and plot the elementary intervals of the projection (X(4),X(5)).
s=5; p=number_primes100(); N=prod(p(s-1)*p(s)); u=lowdisc_ldgen(N,s,"halton"); // scf(); plot(u(:,s-1),u(:,s),"bo") lowdisc_plotelembox([p(s-1) p(s)],[1 1]) xlabel(msprintf("X(%d)",s-1)); ylabel(msprintf("X(%d)",s)); | ![]() | ![]() |
The previous script produces the following output. Since N=p(s-1)*p(s), each elementary interval of the projection (X(4),X(5)) contain more than one point, although other the elementary intervals of other projections may contain more than one point.
Given a number of dimensions s and a number of points N, what is the maximum dimension d of the projection for which the elementary intervals have at least one point ? The following script considers N=1000 points of the Halton sequence, and computes d for increasing values of s.
N=1000; p=number_primes100(); for s=1:12 for d=1:s if (prod(p(s-d+1:s))>N) then break end end if (d<s) then d=d-1; end mprintf("s=%d, d=%d\n",s,d) end | ![]() | ![]() |
This produces the following output. For example, for N=1000 Halton points in s=8 dimensions, the elementary intervals of the two dimensional projections have one point. For a number of dimensions larger than s=11, only the one dimensional projections are well equidistributed.
s=1, d=1 s=2, d=2 s=3, d=3 s=4, d=4 s=5, d=3 s=6, d=2 s=7, d=2 s=8, d=2 s=9, d=2 s=10, d=2 s=11, d=2 s=12, d=1
Consider the Faure sequence in s dimensions. It is a (0,s)-net in base b, where b is the smallest prime number larger than s. This implies that the number of points in a Faure sequence should be chosen depending on b, so that the sequence has good low dimensional projections. In order to assess the quality of these projections, we consider here the volume of the elementary intervals in base b.
The one dimensional projections of the Faure sequence are well distributed provided that the number of points is a multiple of b. Consider elementary intervals in one dimension, with length 1/b. If the number of points is N=b, therefore there is one point in each elementary interval. The following script prints the number of points N=b such that each elementary of length L=1/b has one point.
p=number_primes100(); for b=p(1:30) N=b; L=1/b; mprintf("b=%d, N=%d, L=%.2e\n",b,N,L) end | ![]() | ![]() |
The previous script prints the following output. This table shows that, for example, if s=30, the chosen base is b=31, so that N=31 points can be chosen, where the length of the elementary intervals of dimension one is equal to 3.23e-2.
b=2, N=2, L=5.00e-001 b=3, N=3, L=3.33e-001 b=5, N=5, L=2.00e-001 b=7, N=7, L=1.43e-001 b=11, N=11, L=9.09e-002 b=13, N=13, L=7.69e-002 b=17, N=17, L=5.88e-002 b=19, N=19, L=5.26e-002 b=23, N=23, L=4.35e-002 b=29, N=29, L=3.45e-002 b=31, N=31, L=3.23e-002 b=37, N=37, L=2.70e-002 b=41, N=41, L=2.44e-002 b=43, N=43, L=2.33e-002 b=47, N=47, L=2.13e-002 b=53, N=53, L=1.89e-002 b=59, N=59, L=1.69e-002 b=61, N=61, L=1.64e-002 b=67, N=67, L=1.49e-002 b=71, N=71, L=1.41e-002 b=73, N=73, L=1.37e-002 b=79, N=79, L=1.27e-002 b=83, N=83, L=1.20e-002 b=89, N=89, L=1.12e-002 b=97, N=97, L=1.03e-002 b=101, N=101, L=9.90e-003 b=103, N=103, L=9.71e-003 b=107, N=107, L=9.35e-003 b=109, N=109, L=9.17e-003 b=113, N=113, L=8.85e-003
More points can be added to the sequence, keeping the good distribution of the points already generated. Moreover, the length of the elementary intervals can be made smaller, provided that the number of points N is increased. For example, if N=b*b-b extra points are computed, then the length of the elementary intervals is reduced down to L=1/(b*b).
Let us now consider the two dimensional projections of the Faure sequence. We must consider at least N=b*b points such that all two dimensional elementary intervals of area A=1/(b*b) have at least one point. More points fill smaller elementary intervals, but less points do not allow to let the sequence fill all the two dimensional intervals with one point.
The following script prints the number of points N so that all two dimensional elementary intervals with volume 1/(b*b) contain at least one point.
p=number_primes100(); for b=p(1:30) N=b^2; A=1/(b^2); mprintf("b=%d, N=%d, A=%.2e\n",b,N,A) end | ![]() | ![]() |
The previous script prints the following output. For example, if s=30, the chosen base is b=31, so that N=1681 points can be chosen, where the area of the elementary intervals of dimension two is equal to 1.04e-3.
b=2, N=4, A=2.50e-001 b=3, N=9, A=1.11e-001 b=5, N=25, A=4.00e-002 b=7, N=49, A=2.04e-002 b=11, N=121, A=8.26e-003 b=13, N=169, A=5.92e-003 b=17, N=289, A=3.46e-003 b=19, N=361, A=2.77e-003 b=23, N=529, A=1.89e-003 b=29, N=841, A=1.19e-003 b=31, N=961, A=1.04e-003 b=37, N=1369, A=7.30e-004 b=41, N=1681, A=5.95e-004 b=43, N=1849, A=5.41e-004 b=47, N=2209, A=4.53e-004 b=53, N=2809, A=3.56e-004 b=59, N=3481, A=2.87e-004 b=61, N=3721, A=2.69e-004 b=67, N=4489, A=2.23e-004 b=71, N=5041, A=1.98e-004 b=73, N=5329, A=1.88e-004 b=79, N=6241, A=1.60e-004 b=83, N=6889, A=1.45e-004 b=89, N=7921, A=1.26e-004 b=97, N=9409, A=1.06e-004 b=101, N=10201, A=9.80e-005 b=103, N=10609, A=9.43e-005 b=107, N=11449, A=8.73e-005 b=109, N=11881, A=8.42e-005 b=113, N=12769, A=7.83e-005
The following script shows how the N=121 first points of the Faure sequence in dimension s=10 are distributed on the two dimensional projection (X(9),X(10)). Since the base is b=11, we plot the elementary intervals of area 1/(11*11)=1/121. Each elementary interval has exactly one point.
p=number_primes100(); s=10; k=find(p>=s,1); b=p(k); u=lowdisc_ldgen ( b^2-1 , s , "faure" ); scf(); lowdisc_proj2d ( u , [s-1 s] ); lowdisc_plotelembox(b,[1 1]) e=gce(); e.children($).children.mark_size=4; xlabel(msprintf("X(%d)",s-1)); ylabel(msprintf("X(%d)",s)); | ![]() | ![]() |
The following script prints the number of points N so that all three dimensional elementary intervals with volume 1/(b*b*b) contain at least one point.
p=number_primes100(); for b=p(1:30) N=b^3; V=1/(b^3); mprintf("b=%d, N=%d, V=%.2e\n",b,N,V) end | ![]() | ![]() |
The previous script prints the following output. For example, if s=40, the chosen base is b=41, so that N=68921 points can be chosen, where the volume of the elementary intervals of dimension three is equal to 1.45e-5.
b=2, N=8, V=1.25e-001 b=3, N=27, V=3.70e-002 b=5, N=125, V=8.00e-003 b=7, N=343, V=2.92e-003 b=11, N=1331, V=7.51e-004 b=13, N=2197, V=4.55e-004 b=17, N=4913, V=2.04e-004 b=19, N=6859, V=1.46e-004 b=23, N=12167, V=8.22e-005 b=29, N=24389, V=4.10e-005 b=31, N=29791, V=3.36e-005 b=37, N=50653, V=1.97e-005 b=41, N=68921, V=1.45e-005 b=43, N=79507, V=1.26e-005 b=47, N=103823, V=9.63e-006 b=53, N=148877, V=6.72e-006 b=59, N=205379, V=4.87e-006 b=61, N=226981, V=4.41e-006 b=67, N=300763, V=3.32e-006 b=71, N=357911, V=2.79e-006 b=73, N=389017, V=2.57e-006 b=79, N=493039, V=2.03e-006 b=83, N=571787, V=1.75e-006 b=89, N=704969, V=1.42e-006 b=97, N=912673, V=1.10e-006 b=101, N=1030301, V=9.71e-007 b=103, N=1092727, V=9.15e-007 b=107, N=1225043, V=8.16e-007 b=109, N=1295029, V=7.72e-007 b=113, N=1442897, V=6.93e-007
We can reverse the reasoning: with a given number of dimensions s and a given number of points N, what is the maximum number d of low dimensional projections that can be well distributed with a Faure sequence ? We have
which implies
The following script considers N=1000 points, and considers increasing dimensions, i.e. increasing prime numbers. It prints the maximum number of dimensions d for which each elementary interval of volume 1/b^d contain more than one point.
The previous script prints the following output. For example, if s=15, the chosen base is b=17, so that N=1000 points can have good up to d=2, i.e. two dimensional equidistribution properties, but not more. Actually, with N=1000 points, we cannot have good equidistribution of two dimensional projections for s greater than 31.
b=2, d=9 b=3, d=6 b=5, d=4 b=7, d=3 b=11, d=2 b=13, d=2 b=17, d=2 b=19, d=2 b=23, d=2 b=29, d=2 b=31, d=2 b=37, d=1