Friday, March 7, 2014

Gaussian 09 for Intel Compiler with Intel MKL

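Compiling Gaussian 09 with the PGI Compiler is trivial, but compiling it with the Intel Compiler takes a little extra work.
I am writing this down as a memo so that I do not forget the steps the next time I have to build a new revision.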

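There are two key points:

1. Edit the Makefile
    When installing on an x86_64 system, the Makefile to edit is i386.make. Below is the file I used for my build;
     the parts in red are the key changes, and the parts in blue must be commented out.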

#
# Makefile for Gaussian 09.
#
#     Copyright (c) 1988,1990,1992,1993,1995,1998,2003,2009,2012,
#                Gaussian, Inc.  All Rights Reserved.
#
#     This is part of the Gaussian(R) 09 program.  It is based on
#     the Gaussian(R) 03 system (copyright 2003, Gaussian, Inc.),
#     the Gaussian(R) 98 system (copyright 1998 Gaussian, Inc.),
#     the Gaussian(R) 94 system (copyright 1995 Gaussian, Inc.),
#     the Gaussian 92(TM) system (copyright 1992 Gaussian, Inc.),
#     the Gaussian 90(TM) system (copyright 1990 Gaussian, Inc.),
#     the Gaussian 88(TM) system (copyright 1988 Gaussian, Inc.),
#     the Gaussian 86(TM) system (copyright 1986 Carnegie Mellon
#     University), and the Gaussian 82(TM) system (copyright 1983
#     Carnegie Mellon University). Gaussian is a federally registered
#     trademark of Gaussian, Inc.
#
#     This software contains proprietary and confidential information,
#     including trade secrets, belonging to Gaussian, Inc.
#
#     This software is provided under written license and may be
#     used, copied, transmitted, or stored only in accord with that
#     written license.
#
#     The following legend is applicable only to US Government contracts
#     under DFARS:
#
#                        RESTRICTED RIGHTS LEGEND
#
#     Use, duplication or disclosure by the US Government is subject to
#     restrictions as set forth in subparagraph (c)(1)(ii) of the Rights
#     in Technical Data and Computer Software clause at DFARS
#     252.227-7013.
#
#     Gaussian, Inc., 340 Quinnipiac St., Bldg. 40, Wallingford CT 06492
#
#     The following legend is applicable only to US Government contracts
#     under FAR:
#
#                        RESTRICTED RIGHTS LEGEND
#
#     Use, reproduction and disclosure by the US Government is subject
#     to restrictions as set forth in subparagraph (c) of the Commercial
#     Computer Software - Restricted Rights clause at FAR 52.227-19.
#
#     Gaussian, Inc., 340 Quinnipiac St., Bldg. 40, Wallingford CT 06492
#
# Where to find this file when making executables:
#
BSDDIR = bsd
MAKELOC = -f $(BSDDIR)/g09.make
BSDDIR1 = ../bsd
MAKELOC1 = -f $(BSDDIR1)/g09.make
UTILDIR = ..
HLIBDIR = $(g09root)/hermes/lib
HSRC = $(g09root)/hermes/gxinterface
HINC = $(g09root)/hermes/include
HLIBS = $(HLIBDIR)/libdbapi.a $(HLIBDIR)/libsupp.a $(HLIBDIR)/libisam.a \
        $(HLIBDIR)/libcbt.a $(HLIBDIR)/libutils.a $(HLIBDIR)/libgxchm.a
HFLAGS = $(CFLAGS) -I$(HINC) -D_POSIX_SOURCE
#
# The utility library:
#
#BLAS = blas-opt.a blas-f2c.a
#BLASL = -Wl,blas-opt.a -Wl,blas-f2c.a
#
MKLPATH= /opt/intel/composerxe-2011.1.107/mkl/lib/intel64
BLASL = -Wl,$(BSDDIR)/libf77blass-ia32.a -Wl,$(BSDDIR)/libatlass-ia32.a
BLAS1 = $(BSDDIR)/libf77blass-ia32.a
BLAS2 = $(BSDDIR)/libatlas-ia32.a
BLAS = ${MKLPATH}/libmkl_intel_ilp64.a ${MKLPATH}/libmkl_intel_thread.a ${MKLPATH}/libmkl_core.a
#BLAS = $(BLAS1) $(BLAS2)
GAULIBA = util.a
GAULIBU = util.so
GAULIB = $(GAULIBU) $(BLAS)
LINDALIBS = $(GAULIB) $(BLAS)
#
# Directory pointers only used for linking the profiling version:
#
GSDIR = .
GDIR = ../g09
#
# Standard dimensioning definitions.
PCMDIM = -DDEFMXTS=2500 -DDEFMXBOND=12 -DDEFMXSPH=250 -DDEFMXINV=2500  -DDEFMXSLPAR=300 -DDEFMXSATYP=4
#CSIZE = 524288
#CSIZEW = 64

INCDIR =
INCDIRG = -I$(g09root)/g09
PARMETH = -D_OPENMP_ -D_OPENMP_MM_
FPARFLAG = -openmp
PARFLAG = -DGAUSS_PAR -DGAUSS_THPAR $(PARMETH)
BLASFLAG = -DCA1_DGEMM -DCA2_DGEMM -DCAB_DGEMM -DLV_DSP
DEBUGP = -DCHECK_ARG_OVERLAP
I8CPP1 = -DI64
I8CPP2 = -DP64
I8CPP3 = -DPACK64
I8CPP4 = -DUSE_I2
I8CPP = $(I8CPP1) $(I8CPP2) $(I8CPP3) $(I8CPP4)
GAUDIM = 2500
GAUDIMA = $(GAUDIM)00
GAUDIMR = $(GAUDIM)0
GAUDIMS = $(GAUDIM)
CTDEBUG = -DDEFICTDBG=0
PROCTYPE =
NISEC = -DDEFISEC=16
NJSEC = -DDEFJSEC=128
NKSEC = -DDEFKSEC=128
X86TYPE =
DIMENSX = $(INCDIR) $(INCDIRG) -DDEFMAXRES=$(GAUDIMR) -DDEFMAXSEC=$(GAUDIMS) $(I8CPP) $(PARFLAG) $(DEBUGP) -DDEFMAXSHL=$(GAUDIMA) -DDEFMAXATM=$(GAUDIMA) $(PROCTYPE) -DNO_SBRK $(X86TYPE) \
  -DDEFMAXNZ=$(GAUDIMA) -DDEFNVDIM=257 -DR4ETIME \
  -DDEFARCREC=1024 -DMERGE_LOOPS -D_I386_ -DLITTLE_END -DUSING_F2C -DSTUPID_ATLAS \
  -DDEFMAXXCVAR=40 -DDEFMAXIOP=200 -DDEFMAXCOORDINFO=32 -DDEFMAXSUB=80 -DDEFMAXCHR=1024 -DDEFMOMEGA=5 -DDEFNOMEGA=6 -DDEFMAXXCNAME=25 -DDEFLMAX=13 -DDEFMINB1P=100000000 -DDEFXGN3MIN=1 $(NISEC) $(NJSEC) $(NKSEC) -DDEFN3MIN=10 -DDEFNBOMAXBAS=10000 -DDEFMAXHEV=2000 -DDEFCACHE=128 \
  -DDEFMAXLECP=10 -DDEFMAXFUNIT=5 -DDEFMAXFFILE=10000 -DDEFMAXFPS=1300 -DDEFMAXINFO=200 \
  -DDEFMAXOP=384 -DDEFMAXTIT=100 -DDEFMAXRTE=4000 -DDEFMAXREDTYPE=3 -DDEFMAXREDINDEX=4 -DDEFMAXOV=500 -DDEFMXDNXC=8 -DDEFMXTYXC=10 $(CTDEBUG) -D_ALIGN_CORE_ \
  $(BLASFLAG) -DO_BKSPEF -DSETCDMP_OK $(PCMDIM) -DGCONJG=DCONJG -DGCMPLX=DCmplx -DGREAL=DREAL -DGIMAG=DIMAG -DEXT_LSEEK -DAPPEND_ACC
#
# These commands are converted to "on machine command" for remote-
# control compilation.
#
RUNF2C = f2c -kr -T. -R -Nx400 -Nn1604 -NL800
RUNCC = icc -openmp -axAVX -static-intel -static-libgcc
RUNAR  = ar
RUNRAN = gau-ranlib
RUNCPP = gau-cpp
RUNFSP = gau-fsplit
RUNMAKE = make
#TIME = -Mreentrant -Mrecursive -Mnosave -Minfo -Mneginfo -time
#VECTOR4 = ,prefetch,sse -fastsse -Mscalarsse
#VECTOR = -Mvect=assoc,recog,cachesize:$(CSIZE)$(VECTOR4)
#MACHTY = p7-32
#MACH = -tp $(MACHTY) $(TIME)
#OPTOI = -m32 -march=i486 -malign-double

OPTFLAGO = -O3 -unroll
# Flags for portland compiler.
#
I8FLAG = -i8
R8FLAG = -r8
MMODEL = -mcmodel=medium
#PGISTATIC = -Bstatic_pgi
#RUNF77 = pgf77 $(PGISTATIC) $(I8FLAG) $(R8FLAG) $(MMODEL) $(DEBUGF) $(SPECFLAG)
RUNF77 = ifort $(I8FLAG) $(R8FLAG) $(MMODEL) -auto -axAVX -static-intel -static-libgcc -no-prec-div -fpp3 -ftz -pad  -mkl
F2CLIB =
SYSLIBS = -lpthread -lm -lc
NUMALIB =
LIBS = $(NUMALIB) $(SYSLIBS)
UNROLL  = -unroll
TWOH =
PC64 = -pc64
DIMENS = $(DIMENSX) $(TWOH)
FNOOPT = $(FPARFLAG) $(PROFFLAG) -O0 $(MACH) -g
FNOOPT64 = $(FNOOPT) $(PC64)
FOPT1 = $(FPARFLAG) $(PROFFLAG) -O1 $(MACH)
FOPT2 = $(FPARFLAG) $(PROFFLAG) -O2 $(MACH)
FOPT2UN = $(FOPT2) $(UNROLL)
FOPT2VC = $(FOPT2) $(VECTOR)
OPTFLAG = -O2 $(UNROLL) $(VECTOR)
LINK1 =
LINK2 =
EXTCFLAGS =
FFLAGS = $(FPARFLAG) $(PROFFLAG) $(MACH) $(OPTFLAG) $(LINK1) $(LINK2)
CFLAGS = $(DIMENS) $(OPTFLAGO) $(PROFFLAG) $(EXTCFLAGS)
LFLAGS = $(FFLAGS)
EXTOBJ1 =
EXTOBJ2 =
EXTOBJ = $(EXTOBJ1) $(EXTOBJ2)
TESTRTO =

--- The rest of the file is unchanged and omitted ---

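Note that CSIZE and CSIZEW have a large effect on the performance of the compiled program. These two parameters describe the size of the CPU cache memory, in kW (kilowords). Here I comment them out and set -DDEFCACHE to 128 [originally -DDEFCACHE=$(CSIZEW)].
For maximum performance, DEFCACHE must be half of the cache size available to each CPU core. In the formulas below, the cache size in KB is divided by 8 (bytes per word), by the number of cores, and by 2.
For example:
Intel Xeon E5-2650L (8 cores/20MB cache): (20*1024)/(8*8*2)=160
Intel Xeon L5640 (6 cores/12MB cache): (12*1024)/(8*6*2)=128
So for these two examples, use "-DDEFCACHE=128" on the L5640 machine and "-DDEFCACHE=160" on the E5-2650L machine.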
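
The arithmetic above can be wrapped in a small helper for other CPUs. This is only a sketch of the rule of thumb in this post, not anything shipped with Gaussian: the function name `defcache_kwords` is made up for illustration, and it assumes 8-byte words and a cache shared evenly among all cores.

```python
def defcache_kwords(total_cache_mb: int, cores: int, bytes_per_word: int = 8) -> int:
    """Return half of the per-core cache share, in kilowords.

    Hypothetical helper: assumes the cache is shared evenly among all
    cores and that one word is 8 bytes, as in the formulas in this post.
    """
    cache_kb = total_cache_mb * 1024              # MB -> KB
    return cache_kb // (bytes_per_word * cores * 2)


if __name__ == "__main__":
    # The two CPUs used as examples in this post.
    for name, cache_mb, cores in [("Xeon E5-2650L", 20, 8), ("Xeon L5640", 12, 6)]:
        print(f"{name}: -DDEFCACHE={defcache_kwords(cache_mb, cores)}")
```

The printed value is what would go into the -DDEFCACHE entry of DIMENSX in the Makefile above.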

2. Modify mdutil.c

This file is located under bsd and needs a small fix.
In the g09.d01 release, change it to look like this:
#ifdef __x86_64
#define MAX_IO (2000*1024*1024)
#endif
#include
#define NEED_AND
#define NEED_ISHFT
#define NEED_GSR48
#define NEED_PUTENV
#endif /* sun */
#include

That is, add one extra "#define NEED_PUTENV" at around line 324.

Once both the Makefile and mdutil.c are ready, all that is left is to compile.

1 comment:

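  1. Hello, may I ask what you based these modifications on?
    The g09 I compiled fails to give an energy for some systems. Have you run into anything similar?
    For example, this one: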
    %NProcShared=1
    %Mem=16GB
    #P B3LYP/cc-pVDZ Opt

    Methanol

    0 1
    6 -0.046483 0.665862 0.000000
    8 -0.046483 -0.757085 0.000000
    1 -1.088371 0.979555 0.000000
    1 0.438659 1.076028 0.890757
    1 0.438659 1.076028 -0.890757
    1 0.861809 -1.070107 0.000000
