Efficient Multipole Model and Linear Scaling of NDDO-Based Methods

Content-type: text/HTML

A.M. Tokmachev and A.L. Tchougréeff
Karpov Institute of Physical Chemistry
10 Vorontsovo pole, 105064 Moscow, Russia

Abstract

Fast growth of computational costs with that of the system's size is a bottleneck for the applications of traditional methods of quantum chemistry to polyatomic molecular systems. This problem is addressed by the development of linearly (or almost linearly) scaling methods. In the semiempirical domain it is typically achieved by a series of approximations to the SCF solution. By contrast, we propose a route to the linear scalability by modifying the trial wave function itself. Our approach is based on variationally determined strictly local one-electron states and a geminal representation of chemical bonds and lone pairs. A serious obstacle previously faced on this route were numerous transformations of the two-center repulsion integrals characteristic for the NDDO methods. We pass it by replacing the fictitious charge configurations usual for the NDDO scheme by atomic multipoles interacting through semiempirical potentials. It ensures invariance of these integrals and improves the computational efficiency of the whole method. We discuss possible schemes for evaluating the integrals as well as their numerical values. The method proposed is implemented for the most popular MNDO, AM1, and PM3 parameterization schemes of the NDDO family. Our calculations unequivocally show that the proposed scheme provides almost linear scaling of computational costs with the system's size. The numerical results on molecular properties certify that our method is superior with respect to its SCF-based ancestors.

Introduction

The perspectives of quantum chemistry as of a valuable tool in chemical research are generally attributed to its ability to predict the electronic structures and properties of molecular systems with high degree of accuracy for reasonable time intervals. Nowadays, modern ab initio methods allow to obtain very accurate results for small molecules []. The price for this accuracy is a very fast growth of the required computational resources with that of the system's size (N⁴¸N⁷, where N is the dimension of the basis set). It limits the area of applicability of these methods by small and medium-sized molecules. Although the development of computer technology pushes forward our computational abilities the problem of large-scale calculations remains actual, especially taking into account growing needs of biological chemistry and nanochemistry.

There are two solutions of this problem proposed in the literature. The first one is the construction of hybrid schemes where different parts of a molecule are treated using different methods [,]: typically, a reaction center is described by some quantum chemical methods while the environment is accounted by classical force fields. The second solution is based on the construction of schemes with linear or almost linear dependence of the required computational resources on the system's size (O(N)-methods) [,]. The localization of electronic degrees of freedom due to the exponential decay of the one-electron density matrix elements in the coordinate space [] and the "principle of nearsightedness" by Kohn [] provide physical grounds for these schemes. It is important that most of the O(N)-methods are targeted to the tight-binding model and the required scalability is achieved by a loss of accuracy.

In the semiempirical domain the growth of computational costs with the system's size scales as N³ due to necessity to diagonalize the Fock matrix. A series of approaches has been proposed to avoid this step: the divide-and-conquer scheme [,], the density matrix search method [] and methods based on the localized molecular orbitals in orthogonal [] and non-orthogonal [] formulations. The possibility to devise semiempirical O(N)-methods opens an access to the development of the QM/QM hybrid schemes, where the environment region is treated by semiempirical methods of quantum chemistry []. It allows to solve the problem of the intersubsystem junction in these methods by a sequential derivation based on the perturbation expansions [].

The main purpose of semiempirical methods is to provide a reliable quantum description for large and ultra-large molecular systems. The development perspectives for the semiempirical domain are usually considered [] from the viewpoint of improving the existing parameterization schemes, adding classical terms responsible for the dispersion interactions and introducing sophisticated orthogonalization correstions [] and effective core potentials [,]. These modifications are devised within the basic one-determinant approximation. An alternative route is based on the account of electron correlation given by several techniques employing perturbation expansions [], effective Hamiltonians [], valence bonds [], and multi-reference configuration interaction []. In the present paper we pursue the goal to improve both the wave function's structure and the scalability properties in the framework of the NDDO parameterization.

In a series of papers [,,] we proposed and developed a family of variational semiempirical methods based on the trial wave function in the form of the antisymmetrized product of strictly local geminals (SLG). It takes into account the intrabond ("left-right") electron correlation while the interbond electron transfers are totally neglected due to the strictly local (atom-centered) character of one-electron states. These methods (quite surprisingly) required only a minor re-parameterization of the standard parameters' sets and provide the numerical estimates of heats of formation and eqilibrium geometries [,], as well as vertical ionization potentials [], which are superior by accuracy with respect to the corresponding SCF procedures. It was also shown that the SLG methods have a great potential in derivation of classical force fields [] and development of hybrid quantum/classical schemes [,].

When the MINDO/3 semiempirical Hamiltonian [] is used, the SLG scheme leads to almost linear scaling of computational costs with the system's size (the quadratic term becomes dominating only for very large molecular systems) []. The NDDO type of the Hamiltonian assumes more detailed account of the two-center Coulomb interactions. In combination with the SLG wave function it leads to the necessity of numerous integral transformations induced by basis set transformations. This step dominates the whole procedure and predetermines the quadratic scalability of the SLG-NDDO methods [] with substantial coefficients at the quadratic terms. In the present paper we consider long-range Coulomb interactions in the standard NDDO methods (MNDO [], AM1 [], and PM3 []) and demonstrate that the scheme previously adopted for estimates of the two-center repulsion integrals is not optimal for a local description of electronic structure. In the next Section we describe the SLG-NDDO approximation for the electronic structures of molecules and show that a modified form of molecular integrals leads to significant simplification of the whole procedure. Then, we compare different schemes for the estimates of the two-center repulsion integrals and present numerical results demonstrating the scalability properties and the quality of the calculated heats of formation for the modified SLG-NDDO procedure. Finally, we draw several conclusions on the perspectives of fast semiempirical methods with a local description of both electron correlation and one-electron states.

Method

Strictly Local Geminals

In this Section we consider a sequence of basic steps leading to semiempirical NDDO methods employing a trial wave function in the form of the antisymmetrized product of strictly local geminals []. The first step in the wave function construction consists of orthogonal 4×4 transformations of atomic orbitals (AOs) for each atom A with an {sp} basis set producing hybrid orbitals (HOs) t_m:

t_ms⁺ =
å
i Î A
h_mi^Aa_is⁺.
(0)
Each transformation matrix h^A is parameterized by six Jacobi angles. For hydrogen atoms, the only HO coincides with the 1s-orbital.

The HOs are assigned to geminals representing chemical bonds and lone pairs: each chemical bond is spanned by two orbitals r and l for the right and left ends of the bond, each lone pair is described by only one HO (r). The m-th geminal representing a lone pair is given by one configuration:

g_m⁺ = r_ma⁺r_mb⁺
(0)
with both electrons residing on the same HO, while the same geminal representing a chemical bond is a linear combination of three singlet configurations:

g_m⁺ = u_mr_ma⁺r_mb⁺+v_ml_ma⁺l_mb⁺+w_m(r_ma⁺l_mb⁺+l_ma⁺r_mb⁺),
(0)
where the first two configurations are ionic and the last one is covalent. The geminals are mutually orthogonal and satisfy the normalization condition:

á 0| g_mg_m⁺| 0 ñ = u_m²+v_m²+2w_m² = 1.
(0)
Thus constructed geminals are strictly local because their carrier spaces are formed by strictly local (one-center) orbitals. The total wave function is constructed as their antisymmetrized product:

| Y ñ =
Õ
m
g_m⁺| 0 ñ .
(0)

The electronic structure of geminals can be characterized by the elements of one- and two-electron density matrices:

P_m^{tt^¢} =
å
s
á 0| g_mt_ms⁺t_ms^¢g_m⁺| 0 ñ , G_m^{tt^¢} = á 0| g_mt_ms⁺t_m-s^¢+t_m-s^¢t_msg_m⁺| 0 ñ ,

P_m^rr = 2(u_m²+w_m²), P_m^ll = 2(v_m²+w_m²), P_m^rl = P_m^lr = 2(u_m+v_m)w_m,

G_m^rr = u_m², G_m^ll = v_m², G_m^rl = G_m^lr = w_m².

(0)
(Notation here differs from that adopted in Refs. [,], namely by coefficient 2 at P_m^{tt^¢}.)

The total energy is the sum of the electronic energy and the core-core repulsion. The form of the last term adopted in the MNDO, AM1, and PM3 methods is given in respective Refs. [,,]. The electronic energy can be expressed through the density matrices and molecular integrals. The transformation of the one-electron basis set induces the transformation of all molecular integrals (those of resonance, core attraction, intraatomic and interatomic Coulomb repulsion). The electronic energy is a sum of five terms according to types of interactions in the Hamiltonian:

E = E_c-attr+E_oc-rep+E_res+E_tc-rep+E_m-b.
(0)
The contribution from the attraction of electrons to cores is:

E_c-attr =
å
A

å
t_m Î A
æ
è U_{t_mt_m}^A+
å
B ¹ A
V_{t_mt_m,B}^A ö
ø P_m^tt.
(0)
The intraatomic contribution due to the electron repulsion is a sum of contributions involving one and two HOs:

E_oc-rep =
å
A
[
å
t_m Î A
(t_mt_m|t_mt_m)^AG_m^tt+

å
[(t_m,t_n^¢ Î A) || (m ¹ n)]
( 2(t_mt_m|t_n¢t_n¢)^A-(t_mt_n¢|t_n¢t_m)^A) P_m^ttP_n^{t^¢t^¢}].

(0)
The resonance interaction (electron transfer) is a sum of intrabond contributions:

E_res = -2
å
A < B

å
m Î AB
b_{r_ml_m}^ABP_m^rl,
(0)
where the notation m Î AB means that the m-th bond is one between atoms A and B. The two-center electron repulsion can be written as:

E_tc-rep =
å
A < B

å
t_m Î A

å
t_n^¢ Î B
(t_mt_m|t_n¢t_n¢)^AB×

×[(1-d_mn)P_m^ttP_n^{t^¢t^¢}+2d_mnG_m^rl].

(0)
The last contribution describes NDDO-specific interactions of single bonds constituting one multiple bond:

E_m-b = -
å
[(t_mt_n^¢ Î A) || (m < n)]
(t_mt_n^¢| ~
t

¢
n
~
t

m
)^ABP_m^rlP_n^rl,
(0)
where [(t)\tilde] = l for t = r and [(t)\tilde] = r for t = l.

The total energy is a function of two classes of electronic structure parameters (transformation matrices h^A and geminal amplitudes u_m, v_m, and w_m) which are all determined variationally. It is important that the total number of these parameters is proportional to the system's size (for the traditional SCF scheme the number of electronic structure parameters - MO LCAO coefficients or elements of one-electron density scales as N²) and they describe local fragments of electronic structure. It is a necessary prerequisite for constructing a linearly scaling method. The limiting factor in calculations are summations over the two-center repulsion integrals, i.e., transformations of these integrals from the basis of AOs to the basis of HOs. Here we consider a direct way to accelerate this step and to construct a linearly scaling procedure for NDDO-calculations of molecular energies.

Multipole Expansions

The energy of the Coulomb interaction for two non-bonded atoms A and B can be extracted from the total SLG energy and re-written in a simple form:

E_coul^AB = Z^AZ^B(ss|ss)^AB+
å
m Î A

å
n Î B
P_m^AP_n^B(mm|nn)^AB-

Z^A
å
n Î B
P_n^B(ss|nn)^AB-Z^B
å
m Î A
P_m^A(mm|ss)^AB,

(0)
where we use the simplified notation m for t_m and P_m^A is the total electron density on this HO. It is important that E_coul includes only the standard Coulomb part of the core-core repulsion. The two-center repulsion integrals entering Eq. (13) can be readily re-written in the basis of AOs:

(ss|nn)^AB =
å
kl Î B
h_kn^Bh_ln^B(ss|kl)^AB

(mm|ss)^AB =
å
ij Î A
h_im^Ah_jm^A(ij|ss)^AB

(mm|nn)^AB =
å
ij Î A

å
kl Î B
h_im^Ah_jm^Ah_kn^Bh_ln^B(ij|kl)^AB.

(0)
These integrals have the tensor properties of corresponding multipole-multipole interactions. It can be used to construct effective schemes for their approximate evaluation []. The standard NDDO schemes (MNDO, AM1, and PM3) replace the multipoles by charge configurations []. A point dipole is replaced by two point charges of opposite sign, while the diagonal and non-diagonal components of a quadrupole moment are modeled by three and four point charges, respectively. It is assumed that these point charges interact through semiempirical potentials, producing necessary expressions for two-center molecular integrals. This formulation does not allow to get simple expressions for the energy of the Coulomb interaction between non-bonded atoms and numerous summations in Eqs. (13) and (14) must be explicitly performed making the whole procedure very time consuming.

The Coulomb interaction between non-bonded atoms given by Eq. (13) can be formally re-written in the form:

E_coul^AB = é
ë -Z_A(ss|^A+
å
m Î A
P_m^A(mm|^A ù
û ·

é
ë -Z_B|ss)^B+
å
n Î B
P_n^B|nn)^B ù
û ,

(0)
where · denotes interaction between two charge distributions. The strictly local character of one-electron basis functions forming carrier spaces for geminals allows to introduce point multipoles describing charge distributions for atoms (given by the expressions in square parentheses in Eq. (15)) with definite densities on the HOs. For convenience we write them in the units -e. The atomic charge (monopole) is then:

q^A =
å
m Î A
P_m^A-Z^A.
(0)
The characteristic lengths defining the dipole moment for an (sp)-distribution and the quadrupole moment for a (pp)-distribution depend on the principal quantum number n for the sp-shell and the orbital (Slater) exponents:

D₁ = 2n+1
Ö3
(4z_nsz_np)^n+1/2
(z_ns+z_np)²ⁿ⁺²
,

D₂ = æ
ú
Ö

(2n+1)(2n+2)
20

z_np^-1.

(0)
Using these parameters and the representation of a hybrid orbital in the quaternion form (s_m,[(v)\vec]_m) [], where s_m and three components of [(v)\vec]_m are respectively the coefficients of the s- and p_x-, p_y-, p_z-AOs in the expansion of HOs explicitly referring to their tensor properties, i.e. to the fact that the coefficients s_m are invariant under rotations of either molecule itself or the laboratory coordinate frame whereas the coefficents forming [(v)\vec]_m transform as components of a 3-vector under similar rotations, we obtain the dipole moment of the charge distribution on atom A to be:

®
m

A

= 2D₁^A
å
m Î A
P_m^As_m ®
v

A
m
.
(0)
The standard theory of multipole moments defines the quadrupole moment for a system of charges as:

Q_ab = å
q(3x_a x_b -r²d_ab).
(0)
This tensor is defined as a traceless one because the trace enters only the energy contributions proportional to Df (where f is a potential, and D is a sum of second derivatives with respect to cartesian coordinates, i.e. the Laplacian), which vanishes exactly for the Coulomb potential 1/R.

The Poisson equation Df = 0 is, however, not valid for semiempirical potentials f. Therefore, in the semiempirical context it is more convenient to introduce a second order tensor with a non-vanishing trace instead of the usual quadrupole moment:

^
S

A

= 1
2
å
qx_a x_b = (D₂^A)²
å
m Î A
P_m^A ®
v

A
m
Ä ®
v

A
m
.
(0)
It describes the distribution of charges but does not imply any assumptions about the properties of their interaction potential. To get back to the usual quadrupole moment we introduce the total density on the p-orbitals of atom A as P_p^A = å\limits_{m Î A}P_m^A[ 1-(s_m)²] . It leads to a compact expression for the true quadrupole moment on atom A:

^
Q

A

= 6 ^
S

A

-2(D₂^A)²P_p^A ^
I

.
(0)

The potentials acting between the multipoles are not those known from electrostatics since they are based not on the Coulomb interaction of the form of 1/R but rather on a semiempirical interaction potential approximating the effective values of integrals in the region of intermediate interatomic distances. The most popular semiempirical potential adopted in the NDDO methods is that proposed by Dewar, Sabelli, and Klopman [,]:

f_l₁l₂(R) = [ R²+(r_l₁+r_l₂)²] ^-1/2.
(0)
It depends on the type of interaction (indices l₁ and l₂ correspond to the 2^l₁- and 2^l₂-poles located on atoms A and B, respectively). The semiempirical approach to molecular integrals prompts two types of formulae for interactions including quadrupoles: those based on the tensor [^(S)] and those based on an additional assumption of validity of the Poisson equation and, thus, expressible through the usual quadrupole moment tensors [^(Q)]. In the latter case semiempirical potential is treated as if it satisfied the condition Df = 0 simplifying the formulae but mathematically incorrect. General expression for the interaction energy can be re-written in a general form with the components transforming as tensors:

E_coul^AB = q^AG⁰⁰q^B+q^AG_a⁰¹ ®
m

B
a
- ®
m

A
a
G_a¹⁰q^B-

®
m

A
a
G_ab¹¹ ®
m

B
b
+q^AG_ab⁰² ^
S

B
ab
+ ^
S

A
ab
G_ab²⁰q^B-

®
m

A
a
G_abg¹² ^
S

B
bg
+ ^
S

A
ab
G_abg²¹ ®
m

B
g
+ ^
S

A
ab
G_abgd²² ^
S

B
gd
,

(0)
where

G_ab...m^l₁l₂ = Ñ_aÑ_b...Ñ_mf_l₁l₂(R)
(0)
and Ñ_a, Ñ_b,... denote the Cartesian components of the gradient operator. The Coulomb interactions between two non-bonded atoms A and B are classified according to the quantum numbers l₁ and l₂ of the AOs involved in the potential Eq. (22). The following cases can be specified (we assume that the values of all functions are estimated at R_AB and write R instead of R_AB for brevity).

The charge-charge, {l₁l₂} = {00}, contribution reads:

E₀₀^AB = q^Aq^Bf₀₀.
(0)

The charge-dipole, {l₁l₂} = {01} or {10}, interaction gives two contributions:

E₀₁^AB = q^Af₀₁¢( ®
n

AB
, ®
m

B

),

E₁₀^AB = q^Bf₁₀¢( ®
n

BA
, ®
m

A

),

(0)
where [(n)\vec]_AB is the unit vector directed from atom A to atom B.

The dipole-dipole, {l₁l₂} = {11}, interaction can be written as a sum of contributions proportional to the scalar product of dipole moments and the product of their projections on the line connecting atoms A and B:

E₁₁^AB = - f₁₁¢
R
( ®
m

A
, ®
m

B
)- æ
ç
è f₁₁^¢¢- f₁₁¢
R
ö
÷
ø ( ®
m

A
, ®
n

AB
)( ®
m

B
, ®
n

AB
).
(0)

The charge-quadrupole, {l₁l₂} = {02} or {20}, interactions for semiempirical potentials are the following:

E₀₂^AB = q^A é
ê
ë f₀₂¢
R
tr ^
S

B

+ æ
ç
è f₀₂^¢¢- f₀₂¢
R
ö
÷
ø ( ®
n

AB
, ^
S

B

, ®
n

AB
) ù
ú
û ,

E₂₀^AB = q^B é
ê
ë f₂₀¢
R
tr ^
S

A

+ æ
ç
è f₂₀^¢¢- f₂₀¢
R
ö
÷
ø ( ®
n

AB
, ^
S

A

, ®
n

AB
) ù
ú
û .

(0)
Under the additional assumption Df = 0 one obtains:

E₀₂^AB = q^A
6
æ
ç
è f₀₂^¢¢- f₀₂¢
R
ö
÷
ø ( ®
n

AB
, ^
Q

B

, ®
n

AB
),

E₂₀^AB = q^B
6
æ
ç
è f₂₀^¢¢- f₂₀¢
R
ö
÷
ø ( ®
n

AB
, ^
Q

A

, ®
n

AB
).

(0)

The dipole-quadrupole, {l₁l₂} = {12} or {21}, interaction in general formulation can be written as:

E₁₂^AB = f₁₂^¢¢R-f₁₂¢
R²
é
ë ( ®
m

A

, ®
n

BA
)tr ^
S

B

+2( ®
m

A

, ^
S

B

, ®
n

BA
) ù
û +

f₁₂^¢¢¢R²-3f₁₂^¢¢R+3f₁₂¢
R²
( ®
m

A

, ®
n

BA
)( ®
n

AB
, ^
S

B

, ®
n

AB
),

E₂₁^AB = f₂₁^¢¢R-f₂₁¢
R²
é
ë ( ®
m

B

, ®
n

AB
)tr ^
S

A

+2( ®
m

B

, ^
S

A

, ®
n

AB
) ù
û +

f₂₁^¢¢¢R²-3f₂₁^¢¢R+3f₂₁¢
R²
( ®
m

B

, ®
n

AB
)( ®
n

AB
, ^
S

A

, ®
n

AB
),

(0)
while in the case of the additional assumption Df = 0 one gets:

E₁₂^AB = f₁₂^¢¢R-f₁₂¢
3R²
( ®
m

A

, ^
Q

B

, ®
n

BA
)+

f₁₂^¢¢¢R²-3f₁₂^¢¢R+3f₁₂¢
6R²
( ®
m

A

, ®
n

BA
)( ®
n

AB
, ^
Q

B

, ®
n

AB
),

E₂₁^AB = f₂₁^¢¢R-f₂₁¢
3R²
( ®
m

B

, ^
Q

A

, ®
n

AB
)+

f₂₁^¢¢¢R²-3f₂₁^¢¢R+3f₂₁¢
6R²
( ®
m

B

, ®
n

AB
)( ®
n

AB
, ^
Q

A

, ®
n

AB
).

(0)

The quadrupole-quadrupole, {l₁l₂} = {22}, interaction includes derivatives of a semiempirical potential up to the fourth order:

E₂₂^AB = f₂₂^¢¢R-f₂₂¢
R³
é
ë tr ^
S

A

tr ^
S

B

+2tr( ^
S

A

^
S

B

) ù
û +

f₂₂^¢¢¢R²-3f₂₂^¢¢R+3f₂₂¢
R³
( ®
n

AB
,{ ^
S

A

tr ^
S

B

+ ^
S

B

tr ^
S

A

+4 ^
S

A

^
S

B

}, ®
n

AB
)+

f₂₂^IVR³-6f₂₂^¢¢¢R²+15f₂₂^¢¢R-15f₂₂¢
R³
( ®
n

AB
, ^
S

A

, ®
n

AB
)( ®
n

AB
, ^
S

B

, ®
n

AB
).

(0)
Under the additional assumption Df = 0 it yields:

E₂₂^AB = f₂₂^¢¢R-f₂₂¢
18R³
tr( ^
Q

A

^
Q

B

)+

f₂₂^¢¢¢R²-3f₂₂^¢¢R+3f₂₂¢
9R³
( ®
n

AB
, ^
Q

A

^
Q

B

, ®
n

AB
)+

f₂₂^IVR³-6f₂₂^¢¢¢R²+15f₂₂^¢¢R-15f₂₂¢
36R³
( ®
n

AB
, ^
Q

A

, ®
n

AB
)( ®
n

AB
, ^
Q

B

, ®
n

AB
).

(0)

The formulae given above allow for relatively fast estimates of the two-center contribution to the total energy. Moreover, derivatives of the total energy with respect to molecular geometry parameters can be easily calculated and used for an efficient optimization. We implemented the SLG-NDDO method employing the true multipole scheme for the two-center repulsion integrals. In the next Section we give some numerical results illustrating its potential in semiempirical electronic structure calculations.

Results and Discussion

In the present paper we develop a semiempirical method based on the SLG wave function. Special attention is paid to the two-center repulsion integrals and interactions between non-bonded atoms. These integrals are the most important part of the NDDO scheme. It has been shown more than 50 years ago [,] that multipole expansions can adequately reproduce analytical electron repulsion integrals for a wide range of interatomic distances. These ideas are used in the standard NDDO methods where point multipoles are modeled by appropriate charge configurations. The last approximation makes the integrals non-invariant with respect to rotations of the coordinate frame because the charge configurations have non-vanishing higher multipole moments. In actual calculations this non-invariance is masked by performing them in the diatomic coordinate frame (rotations of the coordinate axes induce rotation of the diatomic coordinate frame and the integrals are calculated identically). The formulae given in the previous Section restore the invariance because they are written in a tensor form. The change of the computation scheme leads to changes in numerical estimates although both schemes have the same asymptotic behaviour at R® ¥. We analyze the consequences of these modifications for the molecular integrals and the total energy.

First, we consider the two-center repulsion integrals on the example of two carbon atoms. Eq. (22) shows that the Dewar-Sabelli-Klopman semiempirical potential depends on three parameters (r₀, r₁, and r₂). In the standard formulation [] these parameters are determined from the condition that at the limit R® 0 the one-center integrals describing the interaction between two monopoles (g_ss), two dipoles (h_sp), and two quadrupoles (h_pp) must be reproduced. This condition is of course arbitrary and, for example, the one-center limit for the integral (ss|zz) is not reproduced by this scheme. The reason is simple: in the sp-basis there are five independent one-center integrals - Slater-Condon parameters F⁰, F², G¹ - which in general case can not be reproduced by only three parameters r. Moreover, these formulae are designed for interatomic distances not approaching zero. In the case of small interatomic distances the difference between integrals evaluated using the formulae based on the charge configurations and based on the true multipoles becomes significant. Therefore, when we change the computation scheme the one-center limit values also change. Taking this into account we can propose two different recipes: the first one is based on the standard values of r's while the second one uses modified values of r's reproducing the one-center limits (g_ss, h_sp, and h_pp). In the latter case the value of r₀ is not modified ( = 2g_ss^-1). The parameters r₁ and r₂ in the standard scheme are determined numerically. When we use the point multipoles and their interactions we can obtain analytical expressions for these parameters. For example:

r₁ = 1
2
æ
ç
è h_sp
D₁²
ö
÷
ø 1/3

.
(0)
It leads to the estimate r₁=0.9699827 a.u. for carbon which can be compared with the standard value of 0.8130983 a.u. [].

Figs. 1-3 illustrate the dependence of the integrals (ss|sz), (sz|sz), and (sx|sx) (z axis connects two atoms in the diatomic coordinate frame) for the scheme based on the charge configurations and for that based on the true multipoles with the standard and modified parameters r₁. These figures show that the curves differ significantly in the region of small interatomic distances. By contrast, in the case of R > 2 Å all the curves are quite close, so that only the closest neighbors are touched by the modification of the procedure. The convergence rate strongly depends on the integral type. In general, the curves with the adjusted value of r₁ seem to be a better approximation to the standard ones than those based on the point multipoles with r₁ unchanged. At the same time, in the case of the integral (sz|sz) the better coincidence of the curve based on modified r₁ with the standard curve in the region of small R leads to a worse agreement for larger values of R.

We tested the performance of two types of procedures tentatively possible for evaluating interactions involving quadrupoles. We considered the (ss|zz) and (ss|xx) integrals. The interaction of charges gives the main contribution to these integrals. It is not affected by our modifications of the computation scheme. Therefore, we analyze only the contribution having the form of the charge-quadrupole interactions (or, equivalently, the differences (ss|zz)-(ss|ss) and (ss|xx)-(ss|ss)). These interactions constitute about 5% of the integrals at the region of characteristic interatomic distances between bonded atoms and their contribution decreases as R^-2 for larger R. Figs. 4 and 5 represent the dependence of these differences on the interatomic distance calculated using the standard formulae and also formulae based on Eqs. (28) and (29). For comparison we also plot these contributions for the usual Coulomb law 1/R. As expected the expressions which do not use the assumption of validity of the Poisson equation for the potential give a better approximation to the standard values (especially for small R). The difference between two approximations based on the multipole-multipole interactions converges to zero in both cases as R^-2. In the case of the charge-quadrupole interactions given by Eq. (29) the one-center limit is exactly zero ((ss|pp) and (ss|ss) are equal). It illustrates the fact that all one-center integrals cannot be reproduced as limiting cases of the corresponding two-center ones. In other cases (standard scheme and that based on [^(S)]) the one-center limit for the difference between these integrals (which is equal to -0.76 eV for the one-center integrals used in the MNDO scheme) is significantly overestimated.

It is important to test how the proposed modification of the computation scheme affects the performance of the SLG-NDDO approach. The first step in this direction is to analyze the computation time as a function of the scheme and the system's size. As test objects we consider regular hydrocarbon chains C_nH_2n+2. Figs. 1-5 demonstrate that at larger R the change of the scheme can only slightly affect the values of the integrals. Therefore, if the scheme based on the atomic multipoles is applied only to well separated pairs of atoms, the modified SLG-NDDO method can be used even with the parameters given in Ref. [], i.e. without any additional re-parameterization. In subsequent calculations we adopt the atomic multipole scheme for R > 3 Å , while the short-range repulsion integrals are calculated using the charge configurations scheme. To further specify the computation scheme we use the MNDO semiempirical Hamiltonian with the standard values of the r parameters and the formulae based on the [^(S)] tensor for the interactions involving quadrupoles.

Fig. 6 demonstrates the ratio of computation times for two SLG-MNDO schemes (based on the charge configurations and multipoles, respectively) as a function of the number of carbon atoms n in C_nH_2n+2. It can be seen that the dependence has two regions: it is almost linear for smaller n's and it is close to a constant for larger n. It can be readily understood taking into account that in the case of the charge configurations scheme a qudratic contribution to the dependence of computation costs on the system's size dominates over linear contribution even for small molecules. At the same time the scheme proposed above significantly reduces namely the quadratic component of this dependence and for relatively small molecules a linear contribution dominates. It is important that in the limit of large n, where a quadratic contribution is again dominant, the true multipole scheme leads to more than 30-ple acceleration of SLG-NDDO calculations.

It is clear that simple replacement of the charge configurations by the atomic multipoles can not affect the scalability of the method. Indeed, even very fast estimation of the two-center Coulomb interactions leads to a quadratic dependence of computation costs on the system's size for larger molecules. To make the scheme truly linearly scaling it is necessary to neglect interactions between very distant atoms. Cut-off procedures of that sort are well justified only for local states. In the case of the SLG method it is particularly simple because one-electron states forming carrier spaces are atom-centered. Fig. 7 shows the dependence of the required computation time on the system's size n for the modified SLG-NDDO method where all interactions between atoms separated by more than 20 Å are totally neglected. This figure unequivocally demonstrates that the method belongs to the family of O(N)-methods. It is important that the cut-off procedure leads to very small modification of the calculated heats of formation (less than 0.03 kcal/mol per CH₂ fragment). Of course, in the case of more polarized molecules with significant effective atomic charges the charge-charge interactions beyond 20 Å should be explicitly considered to obtain the same accuracy. We compare our results (9000 atoms for about 100 sec. on a 0.7 MHz Pentium-III computer) with those obtained in the framework of the LocalSCF method (120000 atoms for about 16000 sec. on a 2.4GHz Pentium-IV computer) [] and deduce that the modified SLG-NDDO method is about 2 orders of magnitude faster than the LocalSCF one.

A detailed discussion of the quality of the SLG-NDDO numerical estimates as compared with the SCF-NDDO ones is given in Ref. []. In the present paper we consider only the heats of formation on the same set of molecules as given in Ref. []. The difference between calculated and experimental values can be found for each of the molecules and for the calculation methods. This error can be considered as a random variable and empirical distribution functions of errors for both methods can be readily constructed. Figs. 8 and 9 represent these functions in the coordinates linearizing the normal distribution plotted for the SLG-MNDO and SCF-MNDO methods, respectively, as well as linear fits for both distributions. The assumption of a normal distribution law for the errors seems to be valid for both methods because the sets of points are close to the corresponding linear fits (values of R² are 0.967 and 0.983, respectively). The abscissa for the crossing of a linear fit and the x axis gives the value of the a parameter (average of the error's distribution), while the slope of a linear fit is s^-1. Our analysis shows that the methods have equal values of s (about 7.6 kcal/mol) but the values of the a parameter differ significantly. In the case of the SCF-MNDO method this value is -2.3 kcal/mol and certifies that heats of formation are significantly underestimated in this method while in the case of the SLG-MNDO method the average is positive and its magnitude is more than 5 times smaller demonstrating the practical absense of the systematic error in the SLG-MNDO calculations. According to the Student's criterion the average errors for these methods are statistically different with a probability larger than 90% .

Conclusions

In the present paper we further develop semiempirical methods based on the strictly local geminals form of the trial wave function contrasting to the traditional SCF-based semiempirical quantum chemistry. The main result is that using invariant multipolar forms of the two-center repulsion integrals leads to a significant (more than 30 times) acceleration of the computation procedure. As a consequence we propose the modified SLG-NDDO methods for electronic structure calculations of large molecular systems. These methods can be easily transformed into a linear scaling form by the applying a cut-off procedures for the two-center Coulomb interactions. It is particularly important that such a modification of calculation procedure does not imply any re-parameterization. The SLG-NDDO methods provide high-quality estimates for the heats of formation of molecules which are well described by two-electron two-center chemical bonds and by lone pairs.

An important component of the scheme proposed is the possibility to sequentially define and use in calculation the atomic multipoles. They can be further used to calculate electrostatic potentials inside and outside molecules []. The tentative prospects of this scheme can be considered from somewhat different point of view. The SLG-MINDO/3 method allowed to construct a route from semiempirical quantum chemistry to classical force fields []. A multipole scheme for two-center interactions opens an access to construction of an NDDO-based deductive molecular mechanics (DMM). The construction of a DMM procedure is only one of many ways to further develop the computation schemes of the SLG-NDDO method. It is possible to construct a method where all two-center repulsion interactions (including those between bonded atoms) are calculated using the multipole scheme. At the same time this modification will require some re-parameterization of the Hamiltonian. An important way to generalize this treatment is to implement a scheme with local electron groups of arbitrary form (numbers of electrons and orbitals). A step in this direction has been made in Ref. [], where the trial wave function in the form of the antisymmetrized product of strictly local geminals and molecular orbitals has been proposed and analyzed. On the other hand, it is clear that the SLG wave function totally neglects interbond electron transfers and dispersion interactions. They can be taken into consideration using perturbation expansions developed for the geminal-type wave functions [,]. These interactions are short-range and, therefore, their account will not destroy the linear scalability of the SLG-NDDO methods.

Acknowledgment

This work has been completed during the stay of A.M.T. at RWTH Aachen in the frame of the Alexander von Humboldt Postdoctoral Fellowship which is gratefully acknowledged as well as the kind hospitality of Prof. Dr. R. Dronskowski. This work was financially supported through the RFBR grants No. 04-03-32146, 05-03-08010, 05-03-08074, 05-03-33118, 05-07-90067.

References

[]: Pople, J. A. Angew. Chem. Int. Ed. 1999, 38, 1894-1902.
[]: Sherwood, P. In Modern Methods and Algorithms of Quantum Chemistry; Grotendorst, J., Ed.; John von Neumann Institute for Computing: Jülich, 2000.
[]: Tchougréeff, A. L.; Tokmachev, A. M. In Advanced Topics in Theoretical Chemical Physics; Maruani, J.; Lefebvre, R.; Brändas, E., Eds.; Kluwer: Dordrecht, 2003.
[]: Goedecker, S. Rev. Mod. Phys. 1999, 71, 1085-1123.
[]: Wu, S. Y.; Jayanthi, C. S. Physics Reports 2002, 358, 1-74.
[]: Goedecker, S. Phys. Rev. B 1998, 58, 3501-3502.
[]: Kohn, W. Phys. Rev. Lett. 1996, 76, 3168-3171.
[]: Dixon, S. L.; Merz, K. M., Jr. J. Chem. Phys. 1997, 107, 879-893.
[]: York, D. M.; Lee, T. S.; Yang, W. T. Phys. Rev. Lett. 1998, 80, 5011-5014.
[]: Daniels, A. D.; Millam, J. M.; Scuseria, G. E. J. Chem. Phys. 1997, 107, 425-431.
[]: Stewart, J. J. P. Int. J. Quantum Chem. 1996, 58, 133-146.
[]: Anikin, N. A.; Anisimov, V. M.; Bugaenko, V. L.; Bobrikov, V. V.; Andreyev, A. M. J. Chem. Phys. 2004, 121, 1266-1270.
[]: Cui, Q.; Guo, H.; Karplus, M. J. Chem. Phys. 2002, 117, 5617-5631.
[]: Winget, P.; Selçuki, C.; Horn, A. H. C.; Martin, B.; Clark, T. Theor. Chem. Acc. 2003, 110, 254-266.
[]: Weber, W.; Thiel, W. Theor. Chem. Acc. 2000, 103, 495-506.
[]: Ahlswede, B.; Jug, K. J. Comput. Chem. 1999, 20, 563-571.
[]: Thiel, W. J. Amer. Chem. Soc. 1981, 103, 1413-1420.
[]: Soudackov, A. V.; Tchougréeff, A. L.; Misurkin, I. A. Theor. Chim. Acta 1992, 83, 389-416.
[]: Cullen, J. M. Int. J. Quantum Chem. 1995, 56, 97-113.
[]: Koslowski, A.; Beck, M. E.; Thiel, W. J. Comput. Chem. 2003, 24, 714-726.
[]: Tokmachev, A. M.; Tchougréeff, A. L. J. Comput. Chem. 2001, 22, 752-764.
[]: Tokmachev, A. M.; Tchougréeff, A. L. Int. J. Quantum Chem. 2001, 85, 109-117.
[]: Tokmachev, A. M.; Tchougréeff, A. L. J. Phys. Chem. A 2003, 107, 358-365.
[]: Tchougréeff, A. L.; Tokmachev, A. M. Int. J. Quantum Chem. 2004, 96, 175-184.
[]: Tokmachev, A. M.; Tchougréeff, A. L.; Misurkin, I. A. J. Mol. Struct. (Theochem) 2000, 506, 17-34.
[]: Bingham, R. C.; Dewar, M. J. S.; Lo, D. H. J. Amer. Chem. Soc. 1975, 97, 1302-1311.
[]: Dewar, M. J. S.; Thiel, W. J. Amer. Chem. Soc. 1977, 99, 4899-4907.
[]: Dewar, M. J. S.; Zoebisch, E. G.; Healy, E.; Stewart, J. J. P. J. Amer. Chem. Soc. 1985, 107, 3902-3909.
[]: Stewart, J. J. P. J. Comp. Chem. 1989, 10, 209-220.
[]: Dewar, M. J. S.; Thiel, W. Theor. Chim. Acta (Berl.) 1977, 46, 89-104.
[]: Dewar, M. J. S.; Hojvat (Sabelli), N. L. J. Chem. Phys. 1961, 34, 1232-1236.
[]: Klopman, G. J. Amer. Chem. Soc. 1964, 86, 4550-4557.
[]: Mulligan, J. F. J. Chem. Phys. 1951, 19, 347-362.
[]: Parr, R. G.; Taylor, G. R. J. Chem. Phys. 1951, 19, 497-501.
[]: Tsiper, E. V.; Burke, K. J. Chem. Phys. 2004, 120, 1153-1156.
[]: Tokmachev, A. M.; Tchougréeff, A. L. J. Comput. Chem. 2005, 26, 491-505.
[]: Tokmachev, A. M.; Tchougréeff, A. L. Int. J. Quantum Chem., in press.
[]: Rosta, E.; Surján, P. R. J. Chem. Phys. 2002, 116, 878-890.
[]: Rassolov, V. A.; Xu, F.; Garashchuk, S. J. Chem. Phys. 2004, 120, 10385-10394.

Figure 0: Dependence of the (ss|sz)^CC integral on the interatomic distance Figure

Figure 0: Dependence of the (sz|sz)^CC integral on the interatomic distance Figure

Figure 0: Dependence of the (sx|sx)^CC integral on the interatomic distance Figure

Figure 0: Dependence of the difference (ss|zz)^CC-(ss|ss)^CC on the interatomic distance Figure

Figure 0: Dependence of the difference (ss|xx)^CC-(ss|ss)^CC on the interatomic distance Figure

Figure 0: Ratio of computation times for the SLG-MNDO method with different schemes for the two-center repulsion integrals Figure

Figure 0: Dependence of computation times on the system's size Figure

Figure 0: Empirical distribution of errors for heats of formation as obtained by the SLG-MNDO method Figure

Figure 0: Empirical distribution of errors for heats of formation as obtained by the SCF-MNDO method Figure

File translated from T_EX by T_TH, version 2.67.
On 18 Jun 2005, 14:42.

A.M. Tokmachev and A.L. Tchougréeff Karpov Institute of Physical Chemistry 10 Vorontsovo pole, 105064 Moscow, Russia

Efficient Multipole Model and Linear Scaling of NDDO-Based Methods

Abstract

Introduction

Method

Strictly Local Geminals

Multipole Expansions

Results and Discussion

Conclusions

Acknowledgment

References

A.M. Tokmachev and A.L. Tchougréeff
Karpov Institute of Physical Chemistry
10 Vorontsovo pole, 105064 Moscow, Russia