<< Efficiency Tutorials Choosing the number of points >>

Low Discrepancy >> Low Discrepancy > Tutorials > The Van Der Corput sequence

The Van Der Corput sequence

Introduction to Van Der Corput sequence

Purpose

The goal of this page is to present elements of the theory of the Van Der Corput sequence (1935).

Van Der Corput Sequence

Let k≥0 be an integer and let b be a prime number. We decompose k into base b:

where r is the number of digits required in the decomposition.

We can also write this decomposition as the sum:

In order to write the decomposition of an integer in base b, some authors write:

The radical inverse function is

This can be expanded in:

This can also be written:

In this function, the low order digits of k (with low values of i) are in the high order digits of the radical inverse (with large values of powers of 1/b).

The base-b Van Der Corput sequence is

Notice that

Van Der Corput sequence in base 2

The Van Der Corput sequence in base 2 is just the first component of the Halton sequence. The following script considers the numbers k from zero to 9, decompose them in base 2 and compute the k-th element of the Van Der Corput sequence in base 2.

b=2;
n=b^4;
u=lowdisc_ldgen(n,1,"halton");
for k=1:n
    d=number_tobary(k,b);
    dstr=strcat(string(d),"");
    mprintf("k=%d=(%s)_%d, r=(0.%s)_%d=%f\n",..
    k,dstr,b,dstr($:-1:1),b,u(k,1))
end

The previous script produces the following output.

      
k=1=(1)_2, r=(0.1)_2=0.500000
k=2=(10)_2, r=(0.01)_2=0.250000
k=3=(11)_2, r=(0.11)_2=0.750000
k=4=(100)_2, r=(0.001)_2=0.125000
k=5=(101)_2, r=(0.101)_2=0.625000
k=6=(110)_2, r=(0.011)_2=0.375000
k=7=(111)_2, r=(0.111)_2=0.875000
k=8=(1000)_2, r=(0.0001)_2=0.062500
k=9=(1001)_2, r=(0.1001)_2=0.562500
k=10=(1010)_2, r=(0.0101)_2=0.312500
k=11=(1011)_2, r=(0.1101)_2=0.812500
k=12=(1100)_2, r=(0.0011)_2=0.187500
k=13=(1101)_2, r=(0.1011)_2=0.687500
k=14=(1110)_2, r=(0.0111)_2=0.437500
k=15=(1111)_2, r=(0.1111)_2=0.937500
k=16=(10000)_2, r=(0.00001)_2=0.031250
   
    

For example, the integer k=6 decomposes as 110 in base 2, because 2^2+2^1=4+2=6. Its radical inverse is 0.011 in base 2, which is associated to the radical inverse value 0/2+1/4+1/8=0.375.

The following figure plots the points of the Van Der Corput sequence in base 2.

In base 2, the lowest order digits are alternatively 0, 1, 0, etc... The most significant digit of the radical inverse function is alternatively 0, 1, 0, etc... Hence, the radical inverse in base 2 is alternatively lower and greater than 1/2. This can be seen in the plot, where the even points are on the left of the figure, while the odd points are on the right of the figure.

Another property which is made clear from the figure is that the interval [0,1] is progressively filled block-by-block, where each block of points has an increasing size which is a power of 2. Indeed, the first block has size 1 (the point 1), the second block has size 2 (the points 2 and 3), the third block has size 4 (the points 4 to 7), and so on. Therefore, when we use a point set which is a power of two, we have a regular discretization of the [0,1] interval.

Van Der Corput sequence in base 3

To get the Van Der Corput sequence in base 3, we just use the second dimension of the Halton sequence in two dimensions.

b=3;
n=b^3;
u=lowdisc_ldgen(n,2,"halton");
for k=1:n
    d=number_tobary(k,b);
    dstr=strcat(string(d),"");
    drev=strcat(string(d($:-1:1)),"");
    mprintf("k=%d=(%s)_%d, r=(0.%s)_%d=%f\n",..
    k,dstr,b,drev,b,u(k,2))
end

The previous script produces the following output.

      
k=1=(1)_3, r=(0.1)_3=0.333333
k=2=(2)_3, r=(0.2)_3=0.666667
k=3=(10)_3, r=(0.01)_3=0.111111
k=4=(11)_3, r=(0.11)_3=0.444444
k=5=(12)_3, r=(0.21)_3=0.777778
k=6=(20)_3, r=(0.02)_3=0.222222
k=7=(21)_3, r=(0.12)_3=0.555556
k=8=(22)_3, r=(0.22)_3=0.888889
k=9=(100)_3, r=(0.001)_3=0.037037
k=10=(101)_3, r=(0.101)_3=0.370370
k=11=(102)_3, r=(0.201)_3=0.703704
k=12=(110)_3, r=(0.011)_3=0.148148
k=13=(111)_3, r=(0.111)_3=0.481481
k=14=(112)_3, r=(0.211)_3=0.814815
k=15=(120)_3, r=(0.021)_3=0.259259
k=16=(121)_3, r=(0.121)_3=0.592593
k=17=(122)_3, r=(0.221)_3=0.925926
k=18=(200)_3, r=(0.002)_3=0.074074
k=19=(201)_3, r=(0.102)_3=0.407407
k=20=(202)_3, r=(0.202)_3=0.740741
k=21=(210)_3, r=(0.012)_3=0.185185
k=22=(211)_3, r=(0.112)_3=0.518519
k=23=(212)_3, r=(0.212)_3=0.851852
k=24=(220)_3, r=(0.022)_3=0.296296
k=25=(221)_3, r=(0.122)_3=0.629630
k=26=(222)_3, r=(0.222)_3=0.962963
k=27=(1000)_3, r=(0.0001)_3=0.012346
   
    

For example, the integer k=6 decomposes as 20 in base 3. Its radical inverse is 0.02 in base 3, which is associated to the radical inverse value 2/9=0.2222...

The following figure plots the points of the Van Der Corput sequence in base 3.

We see that the [0,1] interval is regularily discretized when we consider a number of points which is either 3-1=2, 9-1=8 or 27-1=26 points. In other words, when the number of points is a power of the base (which is here equal to 3), then the interval is filled more regularly.

The following figure plots the points of the Van Der Corput sequence in base 5.

When the base increases, the number of points before the [0,1] interval is regularly filled increases rapidly (as a power of b).

Moreover, consider the points 5 to 9, then 10 to 14 : they progressively go from left to right, taking b=5 iterations to make a partial circle. More generally, the subsequences take more iterations to go from lower values (near 0) to larger values (near 1). This is because it takes b iterations for the the least significant digits to go from 0 to b-1.

Halton Sequence

The Hammersley sequence (1960) uses the Van Der Corput sequence in order to create a low discrepancy point set. However, the number of points in the Hammersley sequence must be know in advance, which is sometimes an issue. The Halton sequence (1964) uses the Van Der Corput sequence without this limitation, and this is why it is used more often.

The Halton sequence in s dimensions uses s different bases which must be relatively primes. We usually use the s first prime numbers: 2, 3, 5, 7, 11, etc...

Denote by p(i) the i-th prime number. The dimension i of the Halton sequence is the Van Der Corput in base p(i). Hence, the k-th element of the Halton sequence is

for k≥0.

The following script produces 10 points of the Halton sequence in 4 dimensions. The prime number used here are: 2, 3, 5, 7.

      
-->u=lowdisc_ldgen(9,4,"halton")
 u  =
    0.5       0.3333333    0.2     0.1428571  
    0.25      0.6666667    0.4     0.2857143  
    0.75      0.1111111    0.6     0.4285714  
    0.125     0.4444444    0.8     0.5714286  
    0.625     0.7777778    0.04    0.7142857  
    0.375     0.2222222    0.24    0.8571429  
    0.875     0.5555556    0.44    0.0204082  
    0.0625    0.8888889    0.64    0.1632653  
    0.5625    0.0370370    0.84    0.3061224  
   
    

Scrambled Halton Sequence

In high dimensions, the Halton sequence poorly fills the space, as can be seen in low dimensionnal projections. In order to improve this, Braaten and Weller (1979) suggest to scramble the digits in the Halton sequence, in order to break correlations.

The permuted radical inverse function is

where σ is a given permutation.

For a given base b, the permutation does not change the values of the points in the sequence: only the ordering of the points changes. However, since each dimension has a specific base b, the multidimensionnal point set of a scrambled sequence is, in general, different from the unscrambled point set.

The following function computes the scrambled Van Der Corput radical inverse function.

function r=scrambleVDC(index, b, sigma)
    current = index
    ib = 1.0 / b
    r = 0.0
    while (current>0)
        digit = modulo ( current , b )
        digit=sigma(digit+1)
        current = int ( current / b )
        r = r + digit * ib
        ib = ib / b
    end
endfunction

For example, the base-3 Van Der Corput sequence can be associated with the permutation [0,2,1]. The following script computes the first ten points of this scrambled sequence.

sigma=[0,2,1];
b=3;
for index=1:10
    r = scrambleVDC(index,b,sigma);
    mprintf("index=%d, r=%f\n",index,r)
end

The previous script produces the following output.

      
index=1, r=0.666667
index=2, r=0.333333
index=3, r=0.222222
index=4, r=0.888889
index=5, r=0.555556
index=6, r=0.111111
index=7, r=0.777778
index=8, r=0.444444
index=9, r=0.074074
index=10, r=0.740741
   
    

In their paper, Braaten and Weller provide the permutations for dimensions up to 16.

      
2 (0 1)
3 (0 2 1)
5 (0 3 1 4 2)
7 (0 4 2 6 1 5 3)
11 (0 5 8 2 10 3 6 1 9 7 4)
13 (0 6 10 2 8 4 12 1 9 5 11 3 7)
etc...
   
    

The issue here is that the limited size of the table provided by Braaten and Weller limit the practical use of the algorithm. Tuffin (1996) provides improved algorithms to expand this tables (and to improve the permutations as well), but this works has two issues. First, the permutations are computed by taking all dimensions into account, and not dimension-by-dimension as in Braaten and Weller. This has the advantage of improving the permutations, but has the drawback of making the tables larger. Secondly, Tuffin published the algorithm, but not the permutations.

Kocis and Whiten (1979) provide a simple algorithm to generate permutations for the Halton sequence. Their RR2 algorithm uses the Van Der Corput sequence in base 2 and reverses the digits, removing the integers that are too large. A more detailed description is presented in Lemieux (2009). This algorithm has no strong theoretical basis, but works efficiently.

The following function computes the RR2 scrambling for the i-th dimension of a point set with s dimensions, using the bases in the matrix b.

function sigma=RR2Scrambling(s, i, b)
    ns=ceil(log(b(s))/log(2))
    sigma=[];
    u=lowdisc_ldgen(2^ns,1,"halton")
    u=[0;u]
    for k=1:2^ns
        sigmak=u(k)*2^ns
        if (sigmak<b(i)) then
            sigma($+1)=sigmak;
        end
    end
endfunction

The following script prints the RR2 scrambling used for a point set in 3 dimensions.

s=3;
primemat=number_primes100();
b=primemat(1:s);
for i=1:s
    sigma=RR2Scrambling(s,i,b);
    sigmastr=strcat(string(sigma),",");
    mprintf("i=%d, [%s]\n",i,sigmastr);
end

The previous script produces the following output.

      
i=1, [0,1]
i=2, [0,2,1]
i=3, [0,4,2,1,3]
   
    

This feature is provided by the "-scrambling" option of the lowdisc_configure function.

Bibliography

"Monte-Carlo methods in Financial Engineering", Paul Glasserman, Springer, 2003

"Algorithm 247: Radical-inverse quasi-random point sequence", J. H. Halton, 1964. Commun. ACM 7, 12 (Dec. 1964), 701-702

"Computational investigations of low-discrepancy sequences", Kocis, L. and Whiten, W. J. 1997. ACM Trans. Math. Softw. 23, 2 (Jun. 1997), 266-294.

"Monte Carlo methods for solving multivariate problems", Annals of the New York Academy of Sciences, 86:844-874, J.M. Hammersley (1960).

J. G. van der Corput, Verteilungsfunktionen. Proc. Ned. Akad. v. Wet., 38:813-821, 1935

Braaten, E., and G. Weller. An Improved Low-Discrepancy Sequence for Multidimensional Quasi-Monte Carlo Integration. Journal of Computational Physics, Vol. 33, 1979, pp. 249-258.

Monte Carlo and Quasi-Monte Carlo Sampling (Springer Series in Statistics), Christiane Lemieux, 2009


Report an issue
<< Efficiency Tutorials Choosing the number of points >>