***linear_solver#

Description#

This keyword specifies the solver used to solve the linear system of equations involved in the global step of the Newton-Raphson algorithm. Both direct and iterative solvers are implemented in Z-set.

Syntax#

***linear_solver type

Where type is a solver type, which can take the following values.

frontal

is a direct frontal solver based on a Cholesky factorization, so matrices have to be symmetric, definite and positive. This solver is the default solver.

sparse_direct

is a direct solver using sparse storage, i.e. only non-zero terms of the factorized matrix are stored. So it is less memory consuming than the frontal solver. This solver is based on a Crout factorization, so matrices have to be symmetric, definite, but not necessary positive.

sparse_dscpack

is an optimized direct solver using sparse storage. It is based on a multifrontal algorithm using a Cholesky factorization, so matrices have to be symmetric definite and positive. To be very efficient, this solver uses the BLAS optimized mathematical library working on full matrices, so it is locally more memory consuming than the sparse_direct solver. This solver is developed by Padma Raghavan of the Pennsylvania State University, and used by permission. Information on the DSCPACK solver is available on Web at the link : http://www.cse.psu.edu/~raghavan/Dscpack/dscpack.ps

mumps

is an interface to the well known MUltifrontal Massively Parallel Solver (http://mumps.enseeiht.fr/ ). It is based on a multifrontal algorithm. The current interface only uses the multithreading parallelism of MUMPS. Like sparse_dscpack, this solver uses the BLAS optimized mathematical library working on full matrices. It seems that MUMPS provides the best (one-core) performance in terms of computational time and the worst in terms of memory consumption. This solver can be used for symmetric and unsymmetric systems. Depending on the system to be solved, MUMPS use a Crout or LU factorization. mumps can solve semi-definite systems and compute associated kernel, and accepts multiple right-and-side. So mumps (like dissection) are preferred local solvers for domain decomposition methods. More information about mumps can be found in section ***linear_solver mumps page (TODO).

dissection

is an interface to the Dissection solver developed by François-Xavier Roux (Onera, Laboratoire Jacques-Louis Lions) and Atsushi Suzuki (Laboratoire Jacques-Louis Lions). It is based on a nested dissection algorithm. The Dissection Solver use two levels of shared memory parallelism. The first level is the multithreading of the optimized BLAS. The second one is controlled via pthreads, it drives the nested dissections. This solver can be used for symmetric and unsymmetric systems. For the moment, the resolution for unsymmetric matrices is not fully optimized, you may prefer mumps in those cases. Thanks to this two-level parallelism, dissection provides a much better scalability More information about dissection can be found in section ***linear_solver dissection page . Like mumps, dissection can solve semi-definite systems and compute associated kernel, and accepts multiple right-and-side. So dissection (like mumps) are preferred local solvers for domain decomposition methods.

sparse_iterative

includes all the available iterative solvers. These solvers are less memory consuming than direct ones, because matrices have never to be assembled, but if matrices have a bad condition number, it can be very hard to achieve convergence. Two iterative solvers can be used in Z-set, a Conjugate Gradient algorithm (cg), or a Global Minimum RESidual one (gmres). The main difference between these two solvers, is the assurance to achieve convergence if matrices are non-positive using the GMRES algorithm. This solvers type needs some complementary subkeywords described in section ***linear_solver sparse_iterative page .

Example#

The following table gives memory needed, total CPU time and number of iterations to solve using different solvers a traction problem for a cube meshed with 13824 linear elements corresponding to 46875 dofs.

Total CPU time

Memory needed (Mb)

Number of iterations

Frontal

48.22 mm

730 Mb

Sparse_direct

27.82 mn

489 Mb

Mumps

xx mn

xx Mb

Dissection

xx mn

xx Mb

Sparse_dscpack

3.52 mn

538 Mb

CG + lumped

1.92 mn

146 Mb

238

CG + Cholesky

2.17 mn

160 Mb

94

GMRES + lumped

5.9 mm

161 Mb

750

GMRES + Cholesky

3.12 mm

175 Mb

169

***linear_solver dissection#

Description#

This keyword specifies that the dissection solver is used to solve the linear system of equations involved in the global step of the Newton-Raphson algorithm. All parameters below are optional. Default values are already set and satisfactory in most cases.

Syntax#

[ **scaling scaling_type ] [ **ordering ordering_type ] [ **pivoting_threshold threshold ] [ **kernel_detection_all] [ **number_iterations int ] [ **minimal_nodes_per_leaf int ] [ **dim_aug_kern dim for kernel detection type ] [ **check_solution ] [ **dump_operator ]

Where

**scaling

specifies the scaling used to increase the accuracy of the solution. scaling can take these three values none (i.e. scaling disabled), diagonal (i.e. scaling using diagonal coefficients of the matrix) or kkt (i.e. scaling based on infinite norm of rows and columns). Default value is diagonal.

**ordering

specifies the renumbering used to reduce fill in. ordering can take these two values tridiag (i.e. Cuthill-McKee algorithm), scotch (i.e. ordering using SCOTCH software). Default value is scotch.

**pivoting_threshold

specifies the threshold used to determine null pivots. Default value is set to \(10^{-2}\).

**kernel_detection_all

activate the full detection of kernel. Not needed for semi-definite matrices. Must be enabled for indefinite matrices. Disabled by default.

**number_iterations

specifies the number of nested dissections. It is by default set to \(-1\) (i.e. automatic choice).

**minimal_nodes_per_leaf

specifies the minimal size of dissected parts. It is by default set to \(128\).

**dim_aug_kern

specifies the size of Schur complements involved for null pivots detection. Default value is \(4\) (i.e \(2 \times 2\) Schur complements).

**check_solution

checks the residual. Only for debugging purpose.

**dump_operator

dumps the matrix. Only for debugging purpose.

Example#

***linear_solver dissection
  **scaling kkt
  **ordering scotch
  **pivoting_threshold 1.0e-2

***linear_solver sparse_iterative#

Description#

This keyword specifies the iterative solver used to solve the linear system of equations involved in the global step of the Newton-Raphson algorithm.

Syntax#

[ **precond type ] [ **full_output ] [ **solver solver ]

Where

**precond

specifies the type of pre-conditioning used to accelerate convergence. The two available types are lumped (e.g. diagonal) and cholesky. Default value is lumped. Using cholesky preconditioner can significantly reduce the number of iterations even if it needs to evaluate and factorize an supplementary matrix, unlike lumped preconditioner.

**full_output

enables to print convergence informations. Default value is FALSE. If FALSE, nothing is said about solver iterations.

**solver

specifies the type of iterative solver used. solver can take these two values cg (i.e. Conjugate Gradient) or gmres (i.e. Global Minimum RESidual). Default value is cg. According to the chosen solver, the following keywords are different.

For the Conjugate Gradient algorithm, syntax is the following:

[ *max_iteration max_iter ] [ *precision eps ] [ *output_every_iter nb_iter ] [ *output_to_file file ] [ *keep_direction dir ] [ *max_standing max ] [ *min_iter min_iter ] [ *reprojection ]

*max_iteration max_iter

where max_iter is the maximum number of allowed iterations when solving the problem. Default value is 1000.

*precision eps

where eps is a real value defining the relative precision required for convergence when solving the problem with the CG method. Default value is 1e-08. For a system of equation:

(27)#\[\bf K \bf q = \bf F\]

this relative ratio is defined by:

(28)#\[ratio = \frac{||\bf F - \bf K \bf q||}{||\bf F||}\]

and convergence occurs when:

(29)#\[ratio < eps\]
*output_every_iter iter

where iter is the frequency of the iteration information output. This option is only active if the keyword ***full_output is TRUE. Default value is 10.

*output_to_file file

where file is a character string specifying the file where iteration information are written. This option is only active if the keyword ***full_output is TRUE. Default value is iterative_solver_it.

*keep_direction dir

where dir is the the integer value specifying the number of orthogonal descent directions retained during the CG iterations. Increasing dir leads to faster convergence but is more memory consuming. Default value is max_iter+2.

*max_standing max

where max is an integer specifying the maximum number of CG iterations allowed without any significant decrease of the convergence ratio. Default value is 50.

*min_iter min_iter

where min_iter is the minimum iterations when solving the problem. Default value is 1.

*reprojection

This subcommand can significantly reduce the number of CG iterations, when used in conjunction with quasi-Newton schemes of tangent matrix update (such as eeeeee or p1p1p1, see the **algorithm command). With this option the descent directions calculated during previous load increments are reused, leading to convergence in just a few iterations when the tangent matrix and the load increment stay constant over several Newton increments.

For the Global Minimum RESidual algorithm, syntax is the following :

[ *max_iteration max_iter ] [ *precision eps ] [ *output_every_iter nb_iter ] [ *output_to_file file ] [ *krylov_space krylov_dim ]

Where *max_iteration, *precision, *output_every_iter, *output_to_file are the same that for CG solver.

*krylov_space krylov_dim

where krylov_dim is an integer value specifying the dimension of the krylov_space built for each cycle of the GMRES algorithm. Increasing this value leads to faster convergence.

Example#

***linear_solver sparse_iterative
  **full_output
  **precond cholesky
  **solver cg
   *output_to_file iterations.hist
   *precision 1.e-12
   *output_every_iter 1
   *max_iteration 1000

***linear_solver rigid#

Description#

This keyword specifies a ”wrapper“ solver which encapsulate a real solver. It behaves exactly like the underlying solver, except that the kernel of the operator is computed. If requested, a boundary condition is automatically added to fix rigid body motions.

The way rigid body motions are computed is explained in the theory manual, at the linear solver chapter.

Syntax#

[ **local_solver type ] \(~\,~\,\) options for the local solver [ **create_bc ] [ **verbose solver ]

Where

**verbose

asks the wrapper to print detailed informations about rigid body motions found,

**local_solver

allow to choose the underlying linear solver. Any linear solver may be used, but note that iterative one sometimes may exhibit weird behavior (especially depending on the convergence ratio used for these solvers),

**create_bc

specifies that a boundary condition has to be added to fix all body motions. Note also that in sequential computations the presence of body motions often means that there is an error in the input file.