TAF: Transformation of Algorithms in Fortran

Overview of Applications:


Model | Model Reference | Lines | Language | Main Loop | TLM | ADM | comment | HES | AD References
NSC2KE | Mohamadi (1994) | 2,500 | F77 | steady | 2.4 | 3.4 | iteration directive | 9.8 | Giering et al. (2005)
NS-Solver | Hinze (1999) | 1,000 | F77 | steady | - | 2.0 | flow directives | - | Hinze and Slawig (2002)
IMBETHY | Knorr (2001) | 5,000 | F90 | evolving | 1.5 | 3.6 | 2-level checkpointing | 12.7/5 | Rayner et al. (2001)
MOM3 | Pacanowski and Griffes (1999) | 50,000 | F77 | evolving | yes | 4.6 | 2-level checkpointing | - | Galanti et al. (2001)
MITGCM | Marshall et al. (1995) | 100,000 | F77 | evolving | 1.8 | 5.5 | 3-level checkpointing | 11.0 | Marotzke et al. (1999), Stammer (2000), Stammer et al. (2002), Heimbach et al. (2002), Stammer et al. (2002), Online Manual (2002)
Biomag code | Buecker et al. (2001) | 83 | F77 | | 548/1098 | 3.1 | - | - | Bischof et al. (2001)
NASA-DAO | Lin and Rood (1996), Lin and Rood (1997), Lin (1997) | 87,000 | F90 | evolving | 2.7 | 8.4 | 2-level checkpointing | - | Todling et al. (2003), Giering et al. (2005)
HB_AIRFOIL | Thomas et al. (2002), Thomas et al. (2002a), Hall et al. (2002a) | 8,000 | F90 | steady+unsteady | yes | 3 | - | - | Thomas et al. (2003), Hall et al. (2002)
EULSOLDO | Cusdin and Müller (2003) | 423 | F77 | - | 2-3 | 2-3 | AD restricted to kernel | - | Cusdin and Müller (2003), Cusdin and Müller (2003), Cusdin (2003)
3D N-S CFD code | Moinier et al. (2002) | 699 | F77 | steady | 1.2-3.4 | 1.2-3.4 | flow and iteration directives | - | Cusdin and Müller (2003)
Rot-Disk | Beeson (2001), Timoshenko and Goodier (1970) | 289 | F77 | - | 7/6 | 7/6 | iteration directive (yet faster with structured AD) | - | Akkaram et al. (2003)
ARPS | Xue et al. (2000, 2001) | 40,000 | F90 | evolving | 2 | 11 | 2-level checkpointing | - | Xiao et al. (2004)
NAST2D | Griebel et al. (1998) | 2,700 | F90 | steady | 90/289 | 1.8 | iteration directive | - | -
NAST2D in 3D | Griebel et al. (1998) | 3,500 | F90 | steady | 1.4 | 1.8 | iteration directive | - | Othmer et al. (2006), Kaminski et al. (2006)
NIRE-CTM | Taguchi (1993, 1996) | 860 | F77 | evolving | 1.0 | 1.5 | - | - | Taguchi (2005)
NIES-ATTM | Maksyutov (2008), Maksyutov et al. (2008) | 8,600 | F90 | evolving | 1.1 | 2.6 | 2-level checkpointing | - | -
FLOWer | Raddatz and Fassbender (2005) | 166,000 | F77 | evolving or steady | 2-3 | 6-10 | 2-level checkpointing or iteration directive | - | Giering et al. (2009)
MUGRIDO | Reimer and Hesse (2006) | 25,000 | F77 | - | 2.2/15 | - | - | - | Giering et al. (2009)
Planet Simulator (Plasim) | Fraedrich et al. (2005), Lunkeit et al. (2007) | 13,700 | F90 | evolving | 2.5 | 5.1 | 2-level checkpointing | 23.8/11 | -
JULES 2.0 | Clark and Harris (2007), Clark et al. (2011), Best et al. (2011) | 32,100 | F90 | evolving | 1.4 | 4.4 | - | - | Jupp (2010)
Semidiscrete Model | Gobron et al. (1997) | 990 | F77 | - | 1.4 | 3.8 | - | - | Lewis et al. (2011)
NAOSIM | Kauker et al. (2003) | 45,000 | F77 | - | 8/7 | 4.5 | - | - | Kauker et al. (2009a, 2009b)
PO4-DOP | Parekh et al. (2005) | 125 | F90 | unsteady | - | yes | - | - | Piwonski and Slawig (2010)
NPZD | Oschlies and Garcon (1999) | 2,000 | F90 | unsteady | 5/12 | - | - | yes | Rueckelt et al. (2010)

Lines
Number of Fortran source code lines, with comments and blank lines removed.

Main Loop
Describes the main time integration loop, if any.
  • steady: the integration converges to a steady state
  • evolving: the integration steps a system forward in time
TLM
The TLM column gives the CPU time needed to evaluate the function (the model) together with its first derivative in forward mode (tangent linear model, TLM), expressed as a multiple of the CPU time needed to evaluate the function alone (the CPU time ratio).

In most of the applications the TLM computes the product of the Jacobian with a single vector. Wherever more than one vector is propagated, the entry reads: CPU time ratio / number of vectors.

The entry "yes" means that a TLM exists but no performance figure is available, whereas "-" means that there is no TLM.
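To make the TLM column more concrete, here is a minimal hand-written sketch in Python. It is not TAF output: the toy function `model` and the names `model_tlm`, `dx` and `jacobian` are assumptions chosen for illustration. Each statement of the function is paired with its linearisation, so one call returns the function value together with one Jacobian-times-vector product.

```python
import math

def model(x):
    """Toy 'model' with two inputs and two outputs (hypothetical example)."""
    a = x[0] * x[1]
    b = math.sin(x[0])
    return [a + b, a * b]

def model_tlm(x, dx):
    """Tangent linear of model: returns (y, dy) with dy = J(x) * dx."""
    a = x[0] * x[1]
    da = dx[0] * x[1] + x[0] * dx[1]      # linearisation of a = x0 * x1
    b = math.sin(x[0])
    db = math.cos(x[0]) * dx[0]           # linearisation of b = sin(x0)
    y = [a + b, a * b]
    dy = [da + db, da * b + a * db]       # linearisation of the two outputs
    return y, dy

def jacobian(x):
    """Full Jacobian, built from one TLM call per input direction."""
    n = len(x)
    cols = []
    for i in range(n):
        unit = [1.0 if j == i else 0.0 for j in range(n)]
        cols.append(model_tlm(x, unit)[1])    # column i of the Jacobian
    # Transpose the list of columns into a list of rows.
    return [[cols[i][r] for i in range(n)] for r in range(len(cols[0]))]
```

Propagating several vectors means several such derivative evaluations per function evaluation, which is why entries for more than one vector carry the vector count after the slash.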

ADM
The ADM column gives the CPU time needed to evaluate the function (the model) together with its first derivative in reverse mode (adjoint model, ADM), expressed as a multiple of the CPU time needed to evaluate the function alone (the CPU time ratio).

In all examples the CPU time given is for the derivative of a scalar-valued function.

Note that a 2-level checkpointing scheme (see, e.g., Giering and Kaminski, 2002) costs the CPU time of about one additional function evaluation. For example, the adjoint of IMBETHY has a CPU time ratio of 3.6 with checkpointing, so for short integrations, which do not require a checkpointing scheme, the ratio is about 3.6 - 1 = 2.6. A 3-level checkpointing scheme costs two additional function evaluations.
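As an illustration of what the ADM column measures, the following Python sketch hand-codes the adjoint of a toy scalar-valued function. It is not TAF output; `cost`, `cost_adm` and the `ad`-prefixed variable names are assumptions made for this example. A forward sweep evaluates the function, and a reverse sweep then visits the statements in reverse order to accumulate the full gradient in a single pass.

```python
import math

def cost(x):
    """Toy scalar-valued function of two inputs (hypothetical example)."""
    a = x[0] * x[1]
    b = math.sin(x[0])
    return a * b + a

def cost_adm(x):
    """Adjoint of cost: returns (J, gradient) from one forward and one reverse sweep."""
    # Forward sweep: evaluate the function and keep the intermediates.
    a = x[0] * x[1]
    b = math.sin(x[0])
    J = a * b + a
    # Reverse sweep: adjoint statements in reverse order of the forward sweep.
    adJ = 1.0
    ada = adJ * (b + 1.0)              # adjoint of J = a * b + a with respect to a
    adb = adJ * a                      # adjoint of J = a * b + a with respect to b
    adx = [0.0, 0.0]
    adx[0] += adb * math.cos(x[0])     # adjoint of b = sin(x0)
    adx[0] += ada * x[1]               # adjoint of a = x0 * x1
    adx[1] += ada * x[0]
    return J, adx
```

Applying the checkpointing rule stated above to the table, the MITGCM ratio of 5.5 with 3-level checkpointing would correspond to about 5.5 - 2 = 3.5 for an integration short enough to need no checkpointing.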

comment
Comments on the strategy selected to increase performance:
  • iteration directive: arranges memory-efficient handling of integrations that converge to a steady state.
  • flow directive: provides information about routines whose source code is either not available or is not to be differentiated. TAF generates the interfaces to derivative code provided by the user. Self-adjoint routines are an example where providing the derivative (the routine itself) is more efficient than having TAF generate the adjoint.
  • checkpointing: a scheme that allows tape space to be reused at the cost of additional function evaluations. Generation of such a scheme can be triggered by TAF directives (see, e.g., Giering and Kaminski, 2002).
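The trade-off behind checkpointing can be sketched in a few lines of Python. This is an illustrative hand-written scheme, not the code TAF generates; the toy time step `step`, its adjoint `step_adj` and the function `gradient_two_level` are assumptions. The first forward run stores only one state per outer interval. Each interval is then recomputed once to fill a short tape, which is reused for every interval.

```python
def step(x):
    """One time step of a toy nonlinear model (hypothetical)."""
    return x + 0.1 * x * (1.0 - x)

def step_adj(x, adx_next):
    """Adjoint of step; needs the state x that the forward step started from."""
    return adx_next * (1.0 + 0.1 * (1.0 - 2.0 * x))

def run(x0, n_steps):
    """Plain forward integration over n_steps."""
    x = x0
    for _ in range(n_steps):
        x = step(x)
    return x

def gradient_two_level(x0, n_outer, n_inner):
    """Derivative of the final state with respect to x0, using two-level checkpointing.

    Returns (derivative, number of forward steps executed)."""
    forward_steps = 0
    # Level 1: forward run that stores only the state at the start of each outer interval.
    checkpoints = []
    x = x0
    for _ in range(n_outer):
        checkpoints.append(x)
        for _ in range(n_inner):
            x = step(x)
            forward_steps += 1
    adx = 1.0  # seed: derivative of the final state with respect to itself
    # Level 2: walk the outer intervals backwards, recomputing each one from its checkpoint.
    for x_start in reversed(checkpoints):
        tape = []                      # holds at most n_inner states, reused per interval
        x = x_start
        for _ in range(n_inner):
            tape.append(x)
            x = step(x)
            forward_steps += 1
        for x_k in reversed(tape):     # reverse sweep over this interval
            adx = step_adj(x_k, adx)
    return adx, forward_steps
```

With `n_outer = 4` and `n_inner = 5` the scheme executes 40 forward steps for a 20-step integration, that is, one additional function evaluation, while holding 4 checkpoints plus a 5-entry tape instead of all 20 states.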
HES
The HES column gives the CPU time needed to evaluate columns of the Hessian (second derivative), expressed as a multiple of the CPU time needed to evaluate the function (the CPU time ratio).

If the number of columns is not equal to one, the entry reads: CPU time ratio / number of columns.

"-" means that there is no Hessian code.
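One common way to obtain Hessian columns is to differentiate gradient code once more in forward mode, so that each call yields one Hessian-times-vector product. The Python sketch below illustrates that idea on a toy function; it is an assumption for illustration, not a description of how the Hessian codes listed in the table were produced, and the names `grad`, `grad_tlm` and `hessian_columns` are hypothetical.

```python
import math

def grad(x):
    """Gradient of the toy function J = x0**2 * x1 + sin(x1), as an adjoint model would return it."""
    return [2.0 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]

def grad_tlm(x, dx):
    """Tangent linear of grad: returns H(x) * dx, a Hessian-times-vector product."""
    return [2.0 * dx[0] * x[1] + 2.0 * x[0] * dx[1],
            2.0 * x[0] * dx[0] - math.sin(x[1]) * dx[1]]

def hessian_columns(x, k):
    """First k columns of the Hessian, one grad_tlm call per column."""
    n = len(x)
    cols = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(n)]
        cols.append(grad_tlm(x, unit))
    return cols
```

Each requested column costs one further derivative evaluation, which is why entries such as 12.7/5 state the number of columns alongside the CPU time ratio.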

Copyright © FastOpt - all rights reserved