<< Gallery Tutorials (t,m,s)-nets >>

Low Discrepancy >> Low Discrepancy > Tutorials > Projections

Projections

An overview of bad 2D projections.

Purpose

The goal of this page is to gather experiments showing bad 2D projections for low disrepancy sequences. These tests suggest that the Sobol sequence is better for small dimensions (s<25), the Faure sequence is better for large dimensions (s>25) and that the scrambled Halton sequence is generally better than a classical Halton sequence.

Halton Sequence

In this section, we present some bad 2D projections of the Halton sequence. The Halton sequence is known for having bad 2D projections and several examples are presented in the bibliography.

In the following example, we consider the projection of dimensions (28,29) as in "On the optimal Halton sequence", Chi, Mascagnib, Warnock Mathematics and Computers in Simulation 70 (2005) 9?21 The paper contains an error : the faulty projection is (28,29) (primes 107,109) and not (27,28) (primes 103,107).

u=lowdisc_ldgen ( 4096 , 29 , "halton" );
scf(); 
lowdisc_proj2d ( u , [28 29] );
// Compute the correlation coefficient for dimensions (28,29)
corrcoef ( [u(:,28) u(:,29)] )

The previous script produces the following output.

      
-->corrcoef ( [u(:,28) u(:,29)] )
 ans  =
    1.         - 0.1210675  
  - 0.1210675    1.         
   
    

The previous script also produces the following graphics.

See our Leaped - Halton sequence. There is no obvious problem anymore.

u=lowdisc_ldgen ( 4096 , 29 , "halton-leaped");
scf(); 
lowdisc_proj2d ( u , [28 29] );

In the following example, we consider the projection of Halton sequence in dimensions (39,40) as in "Computational Investigations of Low-Discrepancy Sequences", Kocis, Whiten ACM Transactions on Mathematical Software, Vol. 23, No. 2, June 1997, Pages 266-294.

u=lowdisc_ldgen ( 2000 , 40 , "halton" );
scf();
lowdisc_proj2d ( u , [39 40] );
// Compute the correlation coefficient for dimensions (39,40)
corrcoef ( [u(:,39) u(:,40)] )

The previous script produces the following output.

      
-->corrcoef ( [u(:,39) u(:,40)] )
 ans  =
    1.           0.1048947  
    0.1048947    1.         
   
    

The previous script also produces the following figure.

See our Leaped-Halton.

u=lowdisc_ldgen ( 2000 , 40 , "halton-leaped");
scf();
lowdisc_proj2d ( u , [39 40] );

In the following example, we search the worst Halton projection for 2000 points in dimensions 40. In order to evaluate the projection, we compute the correlation coefficients. The projections which have the largest non-diagonal correlation coefficients are the worst.

u=lowdisc_ldgen ( 2000 , 40 , "halton" );
// Compute correlation coefficients
c = corrcoef ( u );
// Get the maximum correlation coefficient 
// but remove the diagonal (unity) terms
[m,k] = max(abs(c-eye(c)))
// This is projection (35,36) :
k
// Get the correlation coefficient for this projection
c(k(1),k(2))
// Print this projection
scf();
lowdisc_proj2d ( u , k );

Faure Sequence

In this section, we present some bad 2D projections of the Faure sequence.

In the following example, we search for the worst projection of 2000 points of the Faure sequence in dimension 40. The worst projection is relatively good, which indicates how Faure's sequence is much better than Halton's sequence.

u=lowdisc_ldgen ( 2000 , 40 , "faure" );
// Compute correlation coefficients
c = corrcoef ( u );
// Get the maximum correlation coefficient 
// but remove the diagonal (unity) terms
[m,k] = max(abs(c-eye(c)))
// This is projection (20,21) :
k
// Get the correlation coefficient for this projection
c(k(1),k(2))
// Print this projection
scf();
lowdisc_proj2d ( u , k );

This is consistent with the fact that, in s=40 dimensions, the Faure sequence uses the prime number b=41 as the base. Moreover, the Faure sequence in dimension s=40 is a (0,40)-sequence in base b=41. Consider now the projection (i,j), for i different from j. Consider elementary intervals from 41 subdivisions in dimension i, 41 subdivisions in dimension j, and 1 subdivision in the other dimensions. There are 41*41=1681 intervals, each with volume 1/1681. Hence, if we consider a set of 1681 points, each such constructed elementary interval has one point. Since the previous figure has 2000>1681 points, all 2D projections of the Faure sequence in dimension s=40 are quite well distributed.

In the following example, we consider 300 points of the Faure sequence in dimension 40, and consider the projection (39,40) as in "Computational Investigations of Low-Discrepancy Sequences", Kocis, Whiten, ACM Transactions on Mathematical Software, Vol. 23, No. 2, June 1997, Pages 266-294.

u=lowdisc_ldgen ( 300 , 40 , "faure" );
scf();
lowdisc_proj2d ( u , [39 40] );

Since the number of points is 300 which is much smaller than 41*41=1681, the Faure sequence cannot evenly distribute the points in the space.

We check what is the behavior with more favorable parameters.

dim = 40;
nsimmin = 300;
lds = lowdisc_new("faure");
lds = lowdisc_configure(lds,"-dimension",dim);
base = lowdisc_get(lds,"-faureprime");
[nsim,skip,leap] = lowdisc_fauresuggest ( dim , base , nsimmin );
lds = lowdisc_configure(lds,"-skip",skip);
lds = lowdisc_configure(lds,"-leap",leap);
lds = lowdisc_startup (lds);
[lds,u]=lowdisc_next(lds,nsim);
lds = lowdisc_destroy(lds);
scf();
lowdisc_proj2d ( u , [39 40] );

We see that the number of experiments has increased from 300 to 68921, which is significantly more.

      
-->size(u)
 ans  =
    68921.    40.  
   
  

The following figure shows that there the correlation problem is greatly reduced, although the points seems to be located on lines.

The number of points 68921=41*41*41, which guarantees good 3D projections.

In order to keep the number of experiments constant, we might be interested in using the leap option. Following Kocis and Whithen (Computational investigations of low-discrepancy sequences", Kocis, L. and Whiten, W. J. 1997. ACM Trans. Math. Softw. 23, 2, 266-294.), we may try a leaped Faure sequence. We try the leap L=5.

lds = lowdisc_new("faure");
lds = lowdisc_configure(lds,"-dimension",40);
lds = lowdisc_configure(lds,"-leap",5);
lds = lowdisc_startup (lds);
[lds,u] = lowdisc_next (lds,300);
lds = lowdisc_destroy(lds);
scf();
lowdisc_proj2d ( u , [39 40] );

Sobol Sequence

In this section, we present some bad 2D projections of the Sobol sequence.

In the following example, we consider 1000 points of the Sobol sequence in 23 dimensions and check the projection (16,23).

u=lowdisc_ldgen(1000,30,"sobol");
scf();
lowdisc_proj2d(u,[16 23])
xtitle("Sobol : 1000 points","X16","X23")

In the following example, we consider 1000 points of the Sobol sequence in 36 dimensions and check the projection (11,36).

u=lowdisc_ldgen(1000,36,"sobol");
scf();
lowdisc_proj2d(u,[11 36])
xtitle("Sobol : 1000 points","X11","X36")

Niederreiter (base 2) Sequence

In this section, we consider 1000 points of the Niederreiter (base 2) sequence in 26 dimensions, and consider the projection (25,26).

u=lowdisc_ldgen(1000,30,"niederreiter");
scf();
lowdisc_proj2d(u,[25 26])

Two-dimensionnal correlations with increasing dimensions

In this section, we present how the linear correlation coefficient between dimensions of the sequence deteriorates when the number of dimensions increases. Given a number of dimensions and 1000 points, we are interested in the worst 2D projection, as defined by the largest correlation coefficient.

The following function test for low discrepancy sequences for dimensions s from 2 to smax. It generates a low discrepancy point set for n points and computes the correlation coefficient between columns of the matrix. Then the diagonal terms are set to zero and the largest entry is computed. We assume here that this identifies the worst 2D projection.

function m=worstCorrelation(n, smax, seqname)
    for s=2:smax
        u=lowdisc_ldgen(n,s,seqname);
        c = corrcoef ( u );
        m(s-1) = max(abs(c-eye(c)));
    end
endfunction

The following script tests the Halton, Sobol, Faure and Niederreiter (optimal base) sequences.

n=1000;
smax=50;
scf();
m=worstCorrelation(n,smax,"halton");
plot(2:smax,m,"r-")
m=worstCorrelation(n,smax,"sobol");
plot(2:smax,m,"b-")
m=worstCorrelation(n,smax,"faure");
plot(2:smax,m,"g-")
m=worstCorrelation(n,smax,"niederreiter");
plot(2:smax,m,"k-")
legend(["Halton","Sobol","Faure",..
"Niederreiter"]);
a=gca();
a.children(1).legend_location="in_upper_left";
xtitle("1000 Points","Number of dimensions",..
"Maximum Correlation Coefficient")

The previous script produces following figure. This shows that, according to this test, the Niederreiter sequence in base 2 performs well for s<15, but rapidely deteriorates for s≥15. Notice that that optimal base of Niederreiter sequence is known by our implementation only for s<13. As expected, the Halton sequence has poor 2D projections, especially for s≥20 where the correlation coefficient becomes larger than 0.1. The Sobol sequence is better than Halton, but can have poor 2D projections, especially for s≥35. However, the projections for s<25 are the best of this test. The Faure sequence appears to be the most robust sequence in this test for high dimensions.

The following script tests various Halton sequences.

n=1000;
smax=50;
scf();
m=worstCorrelation(n,smax,"halton");
plot(2:smax,m,"r-")
m=worstCorrelation(n,smax,"sobol");
plot(2:smax,m,"b-")
m=worstCorrelation(n,smax,"faure");
plot(2:smax,m,"g-")
m=worstCorrelation(n,smax,"niederreiter");
plot(2:smax,m,"k-")
legend(["Halton","Sobol","Faure",..
"Niederreiter"]);
a=gca();
a.children(1).legend_location="in_upper_left";
xtitle("1000 Points","Number of dimensions",..
"Maximum Correlation Coefficient")

The previous script produces following figure. This shows that, according to this test, the Reverse Halton sequence has almost the same poor 2D projections as the Halton sequences (the two red lines are almost the same). Our Leaped Halton sequence generally has better projections than Halton, but can also have larger correlations (s=16, 24, 42). The Scrambled (Kocis-Whiten) Halton sequence is the most robust sequence in this test, at the exception of s=25, for which the correlation is slightly larger. These graphics suggest that the scrambled Halton sequence is the best of this set of tests.


Report an issue
<< Gallery Tutorials (t,m,s)-nets >>