### Table 2: L1 regularization produces sparser solutions and requires fewer training iterations than L2 regularization.

2008

"... In PAGE 6: ... To handle the discontinuinty of the gradient, we used the orthant-wise limited- memory quasi-Newton algorithm of [24]. Table2 shows that while there is no significant performance difference in models trained with L1 or L2 regularization, there is significant difference in the number of training iterations and the sparsity of the parameter vector. L1 regularization leads to extremely sparse parameter vectors (96% of the parameters are zero in the 16 subcategory case), while no parameter value becomes exactly zero with L2 regularization.... ..."

Cited by 2
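The snippet's observation, that L1 drives parameters exactly to zero while L2 only shrinks them, follows from the shape of the two penalties' update rules. A minimal sketch (not from the cited paper; the function names are illustrative): the proximal step for an L1 penalty is soft-thresholding, which maps any coefficient with magnitude below the threshold exactly to zero, whereas the closed-form L2 shrinkage merely rescales coefficients, so none ever becomes exactly zero.

```python
import numpy as np

def prox_l1(w, lam):
    # Soft-thresholding: coefficients with |w| <= lam are set exactly to zero.
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def shrink_l2(w, lam):
    # L2 (ridge-style) shrinkage: every coefficient is rescaled, never zeroed.
    return w / (1.0 + lam)

w = np.array([0.05, -0.3, 1.2, -0.01])
print(int(np.sum(prox_l1(w, 0.1) == 0)))   # small coefficients driven exactly to zero
print(int(np.sum(shrink_l2(w, 0.1) == 0))) # L2 never produces an exact zero
```

Iterating the soft-thresholding step inside a gradient method (as the orthant-wise quasi-Newton algorithm does, in spirit) is what yields the extreme sparsity reported in the table.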

### Table 15: L1 norm errors using regular grid points

in Recent Developments in the Dual Reciprocity Method Using Compactly Supported Radial Basis Functions

"... In PAGE 34: ... Next, we use the 400 regular grid points as interpolation points and compute the L1 norm at 400 random points. In Table15 , we see that there is an improvement in accuracy for the case a = 1:6,... ..."

### Table 5. L1 convergence rates on regular and sparse grids for CFL number = 1/2, sine example.

### Table 2: Error for Problem 1, regular sparse grid (columns: level n, points, L1-error, quotient, L2-error, quotient, pointwise error, quotient)

"... In PAGE 19: ...irst we consider the case of regular, i.e. non-adapted sparse grids. Problem 1: We rst consider the Poisson equation ? u = f in = (0; 1)2 with Dirichlet boundary conditions and the exact solution u(x; y) = sinh( (1 ? x)) sin( y)= sinh( ). The results for di erent levels of discretization are given in Table2 . Here and in the following, we show the number of points involved in the... In PAGE 21: ... Therefore, no substantial di erence to the case of the regular sparse grid can be expected. From Table 6 we see that, due to the adap- tivity, even slightly more grid points are needed to reach the same absolute error size as for the regular sparse grid case of Table2 . With respect to the ratio relative error reduction versus number of grid points they are comparable.... ..."

### Table 3: Experimental results for problem Glass for fitness function l1 with two types of regularization and with neuron markers (1: Model Fitness Coefficient, 2: Complexity Fitness Coefficient, F: Composite Fitness Function)

"... In PAGE 5: ... We believe that with this tness function together with the encoding of training epochs the impact of over tting is reduced, as the tness measure emphasizes classi cation performance in contrast to modeling the validation set data (sub){optimally. Table3 displays the results for tness function l1 with the two complexity regularization terms. While the Hinton term improves classi cation accuracy with very small complexity reduction, the Rumelhart term helps to nd less complex nets without improving classi cation accuracy.... ..."

### Table 5: Error for Problem 4, regular sparse grid (columns: level, points, L1-error, quotient, L2-error, quotient, pointwise error, quotient)

"... In PAGE 21: ...Table5... In PAGE 22: ...01 5:60+0 1.98 In comparison to the results of Table5 (also upwind discretization and moderate convection but on a regular sparse grid without any adaptivity), now, a much better relative accuracy of the solution is obtained with less grid points, i.e.... ..."

### Table 4: Error for Problem 3, regular sparse grid (columns: level n, points, L1-error, quotient, L2-error, quotient, pointwise error, quotient)

### Table 1: Escape Time Problem Errors

"... zero on the regions of strong regularity, in part a consequence of the fact that the value function is linear in those regions. Additionally, the optimal controls appear to converge in L1 on the entire domain. Without detailed assumptions about the structure of the regions of strong regularity, our results do not necessarily predict that type of convergence. However, since we know for the present problem that the complement of the regions of strong regularity has Lebesgue measure zero, convergence in L1 on the entire domain is, in fact, expected. Finally, similar error values are indicated in the second part of Table 1 for the escape time problem on the unit cube in ℝ³."

### Table 1: Stability and convergence rates of rotational schemes

"... In PAGE 33: ... In short, for all three classes of projection schemes, their rotational versions should always be preferred over the stan- dard versions. We now summarize in Table1 the main results related to the rotational forms of the pressure-correction methods, velocity-correction methods, and consistent splitting methods. References [1] Y.... ..."

### Table 2: The bipartite-regular hypermaps on the sphere.

"... In PAGE 14: ...Table 2: The bipartite-regular hypermaps on the sphere. Based on the knowledge of regular hypermaps on the sphere, we display in Table2 all the possible values (up to duality) for the bipartite-type of the bipartite-regular hypermaps on the sphere and the unique hypermap (up to isomorphism) with such a bipartite-type. Notice that the map of bipartite-type (1; n; 2; 2n) can be constructed from Dn either via a Wal transformation W al(D(02)(Dn)) or via a Pin transformation P in(D(12)(Dn)).... In PAGE 17: ...nd n give rise to a spherical type (cf. Table 1). If not we choose the second greatest, the third greatest and so forth. For each bipartite-regular hypermap K in Table2 , where (l1; l2; m; n) is our (r; s; u; v), taking the greatest values for the triple (l; m; n) we get a spherical type. To check if such triple determines a hypermap covered by K we take a half-turn in the middle of each hyperedge of K; these half-turns determine a covering K 7! K .... ..."