Applying GSATto Non-Clausal Formulas

R. Sebastiani

Volume 1, 1994

Links to Full Text:

Abstract:
In this paper we describe how to modify GSAT so that it can be applied to non-clausal formulas. The idea is to use a particular ``score'' function which gives the number of clauses of the CNF conversion of a formula which are false under a given truth assignment. Its value is computed in linear time, without constructing the CNF conversion itself. The proposed methodology applies to most of the variants of GSAT proposed so far.

Extracted Text


Journal of Artificial Intelligence Research 1 (1994) 309-314 Submitted 1/94;
published 6/94 Research Note

Applying GSAT to Non-Clausal Formulas

Roberto Sebastiani rseba@dist.unige.it Mechanized Reasoning Group, DIST,
viale Causa 13, 16145 Genova, Italy. Mechanized Reasoning Group, IRST, loc.
Pant`e, 38050 Povo, Trento, Italy.

Abstract In this paper we describe how to modify GSAT so that it can be applied
to non-clausal formulas. The idea is to use a particular "score" function
which gives the number of clauses of the CNF conversion of a formula which
are false under a given truth assignment. Its value is computed in linear
time, without constructing the CNF conversion itself. The proposed methodology
applies to most of the variants of GSAT proposed so far.

1. Introduction GSAT (Selman, Levesque, & Mitchell, 1992; Selman & Kautz,
1993) is an incomplete model-finding algorithm for clausal propositional
formulas which performs a randomized local search. GSAT has been shown to
solve many "hard" problems much more efficiently than other traditional algorithms
like, e.g., DP (Davis & Putnam, 1960). Since GSAT applies only to clausal
formulas, using it to find models for ordinary propositional formulas requires
some previous clausal-form conversion. This requires extra computation (which
can be extremely heavy if the "standard" clausal conversion is used). Much
worse, clausal-form conversion causes either a large increase in the size
of the input formula or an enlargement of the search space.

In this paper we describe how to modify GSAT so that it can be applied to
non-clausal formulas directly, i.e., with no previous clausal form conversion.
An extended version of the paper (Sebastiani, 1994) provides the proofs of
the theorems and a detailed description of the algorithm introduced.

This achievement could enlarge GSAT's application domain. Selman et al. (1992)
suggest that some traditional AI problems can be formulated as model-finding
tasks; e.g., visual interpretation (Reiter & Mackworth, 1989), planning (Kautz
& Selman, 1992), generation of "vivid" knowledge representation (Levesque,
1986). It is often the case that non-clausal representations are more compact
for such problems. For instance, each rule in the form "Vi OEi oe " gives
rise to several distinct clauses if some OEi are disjuncts or  is a conjunct.
In automated theorem proving (a.t.p.) some applications of model-finding
have been proposed (see, e.g., (Artosi & Governatori, 1994; Klingerbeck,
1994)). For instance, some decision procedures for decidable subclasses of
first-order logic iteratively perform nonclausal model-finding for propositional
instances of the input formulas (Jeroslow, 1988). More generally, some model-guided
techniques for proof search, like goal deletion (Ballantyne & Bledsoe, 1982),
false preference, or semantic resolution (Slaney, 1993), seem to be applicable
to non-clausal a.t.p. as well.

cfl1994 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.

Sebastiani procedure GSAT(OE)

for j := 1 to Max-tries do

T := initial(OE) for k := 1 to Max-flips do

if T j= OE then return T else Poss-flips := hill-climb(OE; T )

V := pick(Poss-flips) T := flip(V,T) UpdateScores(OE; V ) end end return
"no satisfying assignment found".

Figure 1: A general schema for GSAT.

2. GSAT

If OE is a clausal propositional formula and T is a truth assignment for
the variables of OE, then the number of clauses of OE which are falsified
by T is called the score of T for OE (score(T; OE)). T satisfies OE iff score(T;
OE) = 0. The notion of score plays a key role in GSAT, as it is considered
as the "distance" from a truth assignment to a satisfying one.

The schema of Figure 2 describes GSAT as well as many of its possible variants.
We use the notation from (Gent & Walsh, 1993). GSAT performs an iterative
search for a satisfying truth assignment for OE, starting from a random assignment
provided by initial(). At each step, the successive assignment is obtained
by flipping (inverting) the truth value of one single variable V in T . V
is chosen to minimize the score. Let Ti be the assignment obtained from T
by flipping its i-th variable Vi. hill-climb() returns the set Poss-flips
of the variables Vr which minimize score(Tr; OE). If the current values of
Delta si = score(Ti; OE) Gamma  score(T; OE) are stored for every variable
Vi, then hill-climb() simply returns the set of the variables Vr with the
best Delta sr. pick() chooses randomly one of such variables. flip() returns
T with V 's value flipped. After each flipping, UpdateScores() updates the
values of Delta si, for all i.

This paper exploits the observation that the functions initial(), hill-climb(),
pick() and flip() do not depend on the structure of the input formula OE,
and that the computation of the scores is the only step where the input formula
OE is required to be in clausal form. The idea is thus to find a suitable
notion of score for non-clausal formulas, and an efficient algorithm computing
it.

3. An extended notion of score Let cnf(') be the result of converting a propositional
formula ' into clausal form by the standard method (i.e., by applying the
rules of De Morgan). Then the following definition extends the notion of
score to all propositional formulas.

Definition 3.1 The score of a truth assignment T for a propositional formula
' is the number of the clauses of cnf(') which are falsified by T .

310

Applying GSAT to Non-Clausal Formulas ...

...

...

...

.

... ... ...

... .

...

...

...

...

.

... ... ...

... .

...

...

...

...

.

... ... ... ...

.

...

...

...

...

.

... ... ...

... .

...

...

...

...

.

... ... ...

... .

...

...

...

...

.

... ...

... ...

.

...

...

...

...

.

... ... ... ...

.

...

...

...

...

.

... ... ... ...

.

...

...

...

...

.

... ... ... ...

. m

Delta Delta 

Delta 

AA

A

[1,-]

E -C F[1,-][0,-] [0,-]

...

...

...

...

.

... ...

... ...

.

m m m m

m m

m

m

m m m m

m m

Delta Delta  Delta 

AA

AA

SS

S Omega Omega 

Omega 

Theta Theta  Theta 

CC

C

""

"""

.

'' '

Theta Theta  Theta 

BB

B Delta Delta  Delta 

JJ

J

AA

A

Theta Theta  Theta 

BB

B

bb

bb

,, ,,

EE

E

Delta Delta  Delta 

AA

A

## ##

SS

SS

@@

@

Theta Theta  Theta 

AA

A

iiii

iiii

i PPPP

PPPP

PP

...

...

...

...

....

...

...

...

.

...

...

...

...

....

...

...

...

.

...

...

...

...

....

...

...

...

. ..

...

...

...

......

...

...

...

Delta Delta  Delta 

AA

A ..................

...

....

...

.....

... ... ... ... ... ...................................... ... ... ... ...
....

....

....

....

....

..

-D -B [1,-][1,-]

[1,-] [0,-] [1,-] [1,-] [1 , 0][0,1][1 , 0] [0,1] [0,1] [0,-] [1,-]

[2,-]

[4,-]

[7,-]

[14,-]

[2,-] [2,0]

[2,-] [2,-]

[2,-]

[1,-] [2,-]

-C[1,-]

[1,-]

A[1,-]

[1,-] -A -B C -E -F -D A -E C F D -A -F D -B[1,-] [0,-] [1,-]

[0,1] [1,-]

[1,-][0,-] [0,-]

[1,-]

-E B

-F D

Figure 2: The computation tree of s(T; '). cnf() represents the "natural"
clausal form conversion. cnf(') has the same number of propositional variables
as ' and it is logically equivalent to '. The problem with cnf() is the exponential
size growth of cnf('), that is, jcnf (')j = O(2j'j). Definition 3.1 overcomes
such a problem, for it is possible to introduce a linear-time computable
function s(T; ') which gives the score of T for a formula '. This is done
directly, i.e., without converting ' into clausal form. We define s(T; ')
recursively as follows: 1

' s(T; ') sGamma (T; ') ' literal ( 0 if T j= '1 otherwise ( 1 if T j= '0
otherwise :'1 sGamma (T; '1) s(T; '1)V

k 'k Pk s(T; 'k) Qk sGamma (T; 'k)W k 'k Qk s(T; 'k) Pk sGamma (T; 'k)
'1 oe '2 sGamma (T; '1) Delta  s(T; '2) s(T; '1) + sGamma (T; '2)

'1 j '2 s

Gamma (T; '1) Delta  s(T; '2)+

s(T; '1) Delta  sGamma (T; '2)

(s(T; '1) + sGamma (T; '2))Delta 

(sGamma (T; '1) + s(T; '2))

sGamma (T; 'k) is s(T; :'k). The distinction between s(T; 'k) and sGamma
(T; 'k) is due to the polarity of the current subformula 'k. During the computation
of s(T; '), a call to the function s(T; 'j) [sGamma (T; 'j)] is invoked
iff 'j is a positive [negative] subformula of '.

Example 3.1 Figure 2 represents the computation tree of the score of a truth
assignment T for the formula ' :

(((:A ^ :B ^ C) . :D . (:E ^ :F )) ^ :C ^ ((:D ^ A ^ :E) j (C ^ F ))). (D
^ :E ^ B) . (((D ^ :A) . (:F ^ D ^ :B) . :F ) ^ A ^ ((E ^ :C ^ F ) . :B)):

T assigns "true" to all the variables of '. The information in square brackets
associated to any subformula 'j represents [s(T; 'j); sGamma (T; 'j)]. For
instance, if we consider the small subtree in the left of Figure 2, then
the score is computed in the following way:

1. Notice that the definition of s(T; ') can be easily extended to formulas
involving other connectives (e.g.,

nand, nor, xor, if-then-else, : : : ) or more complicate boolean functions.

311

Sebastiani s(T; (:A ^ :B ^ C) . :D . (:E ^ :F ) ) = ; s(T; Wk 'k) = Qk s(T;
'k) s(T; :A ^ :B ^ C) Delta  s(T; :D) Delta  s(T; :E ^ :F ) = ; s(T; Vk
'k) = Pk s(T; 'k) (s(T; :A) + s(T; :B) + s(T; C)) Delta  s(T; :D) Delta
(s(T; :E) + s(T; :F )) = ; literals (1 + 1 + 0) Delta  1 Delta  (1 + 1)
= 4:

Notice that cnf (') is 360 clauses long. 2

Theorem 3.1 Let ' be a propositional formula and T a truth assignment for
the variables of '. Then the function s(T; ') gives the score of T for '.

The proof follows from the consideration that, for any truth assignment T
, the set of the false clauses of cnf('1 . '2) is the cross product between
the two sets of the false clauses of cnf('1) and cnf('2).

Theorem 3.2 Let ' be a propositional formula and T a truth assignment for
the variables of '. Then the number of operations required for calculating
s(T; ') grows linearly with the size of '.

The proof follows from the fact that, if T ime(sSigma ('i; T )) is the number
of operations required for computing both s(T; 'i) and sGamma (T; 'i), and
if T ime(sSigma ('i; T )) ^ ai Delta  j'ij + bi, then T ime(sSigma ('1
Pi  '2; T )) ^ maxi(ai) Delta  j'1 Pi  '2j + 2 Delta  maxi(bi) + 6, for
any Pi  2 f^; .; oe; j g.

The number of operations required for computing the score of an assignment
T for a clausal formula OE is O(jOEj). If OE = cnf ('), then jOEj = O(2j'j).
Thus the standard computation of the score of T for OE requires O(2j'j) operations,
while s(T; ') performs the same result directly in linear time.

4. GSAT for non-clausal formulas It follows from Sections 2, 3 that we can
extend GSAT to non-clausal formulas ' by simply using the extended notion
of score of Definition 3.1. Let NC-GSAT (non-clausal GSAT) be a new version
of GSAT in which the scores are computed by some implementation of the function
s(). Then it follows from Theorem 3.1 that in NC-GSAT(') the function hillclimb()
always returns the same sets of variables as in GSAT(cnf(')), so that NC-GSAT(')
performs the same flips and returns the same result as GSAT(cnf(')). Theorem
3.2 ensures that every score computation is performed in linear time.

The current implementation of GSAT (Selman & Kautz, 1993) provides a highlyoptimized
implementation of Updatescores(OE; V ), which analyzes only the clauses which
the last-flipped variable V occurs in. This allows a strong reduction in
computational cost. In (Sebastiani, 1994) we describe in detail an analogous
optimized version of the updating procedure for NC-GSAT, called NC-Updatescores(';
V ), and prove the following properties:

(i) if ' is in clausal form, i.e., ' = cnf ('), then NC-UpdateScores('; V
) has the same

complexity as UpdateScores('; V );

(ii) if OE = cnf ('), then NC-UpdateScores('; V ) is O(j'j). UpdateScores(OE;
V ) is O(2j'j).

The latter mirrors the complexity issues presented in Section 3.

312

Applying GSAT to Non-Clausal Formulas The idea introduced in this paper can
be applied to most variants of GSAT. In "CSAT" (Cautious SAT) hill-climb()
returns all the variables which cause a decrease of the score; in "DSAT"
(Deterministic SAT) the function pick() performs a deterministic choice;
in "RSAT" (Random walk SAT) the variable is picked randomly among all the
variables; in "MSAT" (Memory SAT) pick() remembers the last flipped variable
and avoids picking it. All these variants, proposed in (Gent & Walsh, 1992,
1993), can be transposed into NCGSAT as well, as they are independent of
the structure of the input formula. Selman and Kautz (1993) suggest some
variants which improve the performance and overcome some problems, such as
that of escaping local minima. The strategy "Averaging in" suggests a different
implementation of the function initial(): instead of a random assignment,
initial() returns a bitwise average of the best assignments of the two latest
cycles. This is independent of the form of the input formula. In the strategy
"random walk " the sequence hill-climb() - pick() is substituted with probability
p by a simpler choice function: "choose randomly a variable occurring in
some unsatisfied clause". This idea can be transposed into NC-GSAT as well:
"choose randomly a branch passing only for nodes whose score is different
from zero, and pick the variable at the leaf".

One final observation is worth making. In order to overcome the exponential
growth of CNF formulas, some algorithms have been proposed (Plaisted & Greenbaum,
1986; de la Tour, 1990) which convert propositional formulas ' into polynomial-size
clausal formulas . Such methods are based on the introduction of new variables,
each representing a subformula of the original input '. Unfortunately, the
issue of size-polynomiality is valid only if no "j" occurs in ', as the number
of clauses of  grows exponentially with the number of "j" in '. Even worse,
the introduction of k new variables enlarges the search space by a 2k factor
and reduces strongly the solution ratio. In fact, any model for  is also
a model for ', but for any model of ' we only know that one of its 2k extensions
is a model of  (Plaisted & Greenbaum, 1986).

Acknowledgements Fausto Giunchiglia and Enrico Giunchiglia have given substantial
and continuous feedback during the whole development of this paper. Toby
Walsh provided important feedback about a previous version of this paper.
Aaron Noble, Paolo Pecchiari, and Luciano Serafini helped with the final
revision. Bart Selman and Henry Kautz are thanked for assistance with the
GSAT code.

References Artosi, A., & Governatori, G. (1994). Labelled Model Modal Logic.
In Proc. of CADE12

Workshop on Automated Model Building.

Ballantyne, M., & Bledsoe, W. (1982). On Generating and Using Examples in
Proof Discovery. In Michie, D. (Ed.), Machines intelligence, Vol. 10, pp.
3-39. Halsted Press.

Davis, M., & Putnam, H. (1960). A computing procedure for quantification
theory. Journal

of the ACM, 7, 201-215.

313

Sebastiani de la Tour, T. B. (1990). Minimizing the Number of Clauses by
Renaming. In Proc. of the

10th Conference on Automated Deduction, pp. 558-572. Springer-Verlag.

Gent, I. P., & Walsh, T. (1992). The Enigma of SAT Hill-climbing Procedures.
Tech. rep.

605, University of Edinburgh, Dept. of Artificial Intelligence.

Gent, I. P., & Walsh, T. (1993). Towards an Understanding of Hill-climbing
Procedures for

SAT. In Proc. of the 11th National Conference on Artificial Intelligence,
pp. 28-33.

Jeroslow, R. (1988). Computation-Oriented Reduction of Predicate to Propositional
Logic.

Decision Support System, 4, 183-197.

Kautz, H., & Selman, B. (1992). Planning as Satisfiability. In Proc. 10th
European Conference on Artificial Intelligence, pp. 359-363.

Klingerbeck, S. (1994). Generating Finite Counter Examples with Semantic
Tableaux and

Interpretation Revision. In Proc. of CADE12 Workshop on Automated Model Building.

Levesque, H. (1986). Making believers out of computers. Artificial Intelligence.,
30, 81-108. Plaisted, D., & Greenbaum, S. (1986). A Structure-preserving
Clause Form Translation.

Journal of Symbolic Computation, 2, 293-304.

Reiter, R., & Mackworth, A. (1989). A logical framework for depiction and
image interpretation. Artificial Intelligence., 41 (2), 125-155.

Sebastiani, R. (1994). Applying GSAT to Non-Clausal Formulas. Tech. rep.
94-0018,

DIST, University of Genova, Italy. Available via anonimous ftp from mrg.dist.unige.it,
/pub/mrg-ftp/.

Selman, B., & Kautz, H. (1993). Domain-Independent Extension to GSAT: Solving
Large

Structured Satisfiability Problems. In Proc. of the 13th International Joint
Conference on Artificial Intelligence, pp. 290-295.

Selman, B., Levesque, H., & Mitchell, D. (1992). A New Method for Solving
Hard Satisfiability Problems. In Proc. of the 10th National Conference on
Artificial Intelligence, pp. 440-446.

Slaney, J. (1993). SCOTT: A Model-Guided Theorem Prover. In Proc. of the
13th International Joint Conference on Artificial Intelligence, pp. 109-114.
Morgan Kaufmann.

314