A Couple of Meta-interpreters in Prolog
Motivation
Informally, an interpreter is a program that evaluates programs.
Interpretation is pervasive in computer science both from a
theoretical and practical perspective, and many programs are
interpreters for domain-specific languages. For example, a
program reading settings from a configuration file and adjusting
itself accordingly is interpreting this "configuration language".
An interpreter for a language similar or identical to its own
implementation language is
called meta-interpreter (MI). An interpreter that can
interpret itself is called meta-circular. Prolog is
well-suited for writing MIs: First and most importantly, Prolog
programs can be naturally represented as Prolog terms and are
easily inspected and manipulated using built-in
mechanisms. Second, Prolog's implicit computation strategy and
all-solutions predicates can be used in interpreters, allowing for
concise specifications. Third, variables from the object-level
(the program to be interpreted) can be treated as variables on the
meta-level (the interpreter); therefore, an interpreter can
delegate handling of the interpreted program's binding environment
to the underlying Prolog engine.
Consider a Prolog program:
natnum1(0).
natnum1(s(X)) :-
natnum1(X).
Prolog can evaluate Prolog code dynamically:
?- Goal = natnum1(X), Goal.
Goal = natnum1(0)
X = 0 ;
Goal = natnum1(s(0))
X = s(0)
Yes
You can make the call explicit using the built-in call/1
predicate, yielding the equivalent query "?- Goal = natnum1(X),
call(Goal).". The call/n (n > 1) family of
predicates lets you append additional arguments to Goal
before it is called.
This mechanism ("meta-call") is used in predicates
like sublist/3, maplist/3, if/3 etc., and can
be seen as interpretation in its coarsest sense. Implicitly using
features of the underlying engine is called absorption.
Making features explicit is called reification. By
using call/1, we absorb backtracking, unification,
handling of conjunctions, the call stack etc. We can make all
these features explicit and subsequently adjust and extend them at
will.
Vanilla MIs
The definition of natnum1/1 consists of two clauses, the
first of which degenerated into a fact that is implicitly
treated as if it were stated as
natnum1(0) :- true.
The Prolog built-in clause/2 allows inspection of the
predicate's definition:
?- clause(natnum1(Z), Body).
Z = 0
Body = true ;
Z = s(_G254)
Body = natnum1(_G254) ;
No
The two solutions found on backtracking correspond to the two
clauses of the predicate's definition.
Another example:
complicated_clause(A) :-
goal1(A),
goal2(A),
goal3(A).
?- clause(complicated_clause(Z), Body).
Z = _G187
Body = goal1(_G187), goal2(_G187), goal3(_G187) ;
No
The body of complicated_clause/1 is represented by a
compound term with functor "," and arity 2, whose
arguments are either goals or compound terms of the same
structure. Here is a Prolog "type" definition of a clause body in
the default representation:
body(true).
body((A,B)) :-
body(A),
body(B).
body(G) :- % ambiguous, also matches "true" and "(_,_)"
goal(G).
goal(_ = _).
goal(call(_)).
% ... etc.
We can thus define an interpreter for pure Prolog programs:
mi1(true).
mi1((A,B)) :-
mi1(A),
mi1(B).
mi1(Goal) :-
Goal \= true,
Goal \= (_,_),
clause(Goal, Body),
mi1(Body).
This is often called vanilla MI because there's nothing
special to it. All MIs that add no ability to pure Prolog are
sometimes called vanilla MIs. They serve as skeletons for
extensions.
We can use this MI to run our example program:
?- mi1(natnum1(X)).
X = 0 ;
X = s(0) ;
X = s(s(0))
Yes
We had to guard the third clause from matching
true/0 or conjunctions to prevent run-time errors resulting
from calling clause/2 with these arguments. This is ugly,
and we can make it uglier still by using cuts (!/0) in the
first two clauses. That would at least remove unnecessary
choice-points, though typically not prevent their creation.
Instead of recognising goals by indication of what they are, we
had to introduce a "default" case and specify what they aren't.
Such representations are called "defaulty". To fix this, we
equip goals with the (arbitrary) functor g:
body(true).
body((A,B)) :-
body(A),
body(B).
body(g(G)) :-
goal(G).
We rewrite our example according to this specification:
natnum2(0).
natnum2(s(X)) :-
g(natnum2(X)).
While a query like "?- natnum2(X)." would yield a runtime error
(there is no predicate g/1), we can use an MI to
interpret the program:
mi2(true).
mi2((A,B)) :-
mi2(A),
mi2(B).
mi2(g(G)) :-
clause(G, Body),
mi2(Body).
In addition to being shorter and cleaner, this version is
typically more efficient (no choice-points due to first-argument
indexing, and less inferences due to less tests):
?- number_term(10^5, T), time(mi1(natnum1(T))).
% 400,005 inferences, 0.80 CPU in 0.83 seconds (96% CPU, 500006 Lips)
?- number_term(10^5, T), time(mi2(g(natnum2(T)))).
% 200,003 inferences, 0.71 CPU in 0.79 seconds (90% CPU, 281694 Lips)
In the following, mi_clause/2 denotes a predicate that is
similar to clause/2, except that it supplies us with an
"appropriate" (MI-dependent) representation of clause bodies. For
the MI at hand:
mi_clause(G, Body) :-
clause(G, B),
defaulty_better(B, Body).
defaulty_better(true, true).
defaulty_better((A,B), (BA,BB)) :-
defaulty_better(A, BA),
defaulty_better(B, BB).
defaulty_better(G, g(G)) :-
G \= true,
G \= (_,_).
This predicate lets us interpret a subset of normal Prolog code
with the second MI, and it can also be used for static
conversion. Alternatively, mi_clause/2 can be defined by
facts. For this MI:
mi_clause(natnum(0), true).
mi_clause(natnum(s(X)), g(natnum(X)).
The 2 MIs presented so far reify
conjunction. Through clause/2, they absorb
unification and backtracking. Having made handling of
conjunctions explicit, we can reverse the default execution order
of goals:
mi3(true).
mi3((A,B)) :-
mi3(B), % first B, than A
mi3(A).
mi3(g(G)) :-
mi_clause(G, Body),
mi3(Body).
From a logical point of view, this changes nothing (conjunction is
commutative). Procedurally, there's a difference: With minor
adjustments to mi_clause/2 (adding the fact
"defaulty_better(fail, fail)." and guarding G against the
atom fail/0), and after adding the additional clause
mi3(fail) :-
fail.
this MI can be used to prove that the query
"?- declarative_fail." cannot succeed given the predicate's
definition:
declarative_fail :-
declarative_fail,
fail.
This can't be derived with the standard execution order.
We can ask the user before we execute goals (tracing),
print out the execution history or restrict access to safe goals
by adjusting the third clause:
mi2(g(G)) :-
( safe_goal(G) ->
mi_clause(G, Body),
mi2(Body)
;
format("Sorry, you can't execute ~w\n", [G]),
fail
).
We can obtain a simpler MI by using lists of goals to represent
conjunctions. In this representation, true/0 becomes the
empty list:
natnum_list(0, []).
natnum_list(s(X), [natnum_list(X)]).
Again, the conversion between runnable Prolog code and this
representation can be performed automatically:
mi_clause(G, Ls) :-
clause(G, Body),
phrase(body_list(Body), Ls).
body_list(true) --> [].
body_list((A,B)) -->
body_list(A),
body_list(B).
body_list(G) -->
{ G \= true },
{ G \= (_,_) },
[G].
An obvious MI for this representation consists of only 2 clauses:
mi_list1([]).
mi_list1([G|Gs]) :-
mi_clause(G, Body),
mi_list1(Body),
mi_list1(Gs).
We can make it tail-recursive:
mi_list2([]).
mi_list2([G0|Gs0]) :-
mi_clause(G0, Body),
append(Body, Gs0, Gs),
mi_list2(Gs).
The difference:
always_infinite :-
always_infinite.
?- mi_list1([always_infinite]).
ERROR: Out of local stack
?- mi_list2([always_infinite]). % loops, constant stack space
Using list differences in our clause representation, appending the
yet-to-be-proved goals can be fused with expanding a goal:
mi_ldclause(natnum(0), Rest, Rest).
mi_ldclause(natnum(s(X)), [natnum(X)|Rest], Rest).
mi_list3([]).
mi_list3([G0|Gs0]) :-
mi_ldclause(G0, Remaining, Gs0),
mi_list3(Remaining).
Example query:
?- mi_list3([natnum(X)]).
X = 0 ;
X = s(0) ;
Yes
Here is an MI that can handle the built-in
predicates clause/2 and \=/2:
mi_circ(true).
mi_circ((A,B)) :-
mi_circ(A),
mi_circ(B).
mi_circ(clause(A,B)) :-
clause(A,B).
mi_circ(A \= B) :-
A \= B.
mi_circ(G) :-
G \= true,
G \= (_,_),
G \= (_\=_),
G \= clause(_,_),
clause(G, Body),
mi_circ(Body).
It can interpret itself and is thus a meta-circular
interpreter:
?- mi_circ(mi_circ(natnum1(X))).
X = 0 ;
X = s(0)
Yes
To generalize this pattern to all built-in predicates, we
can use predicate_property/2 to identify them as such for
calling them directly:
provable(true, _) :- !.
provable((G1,G2), Defs) :- !,
provable(G1, Defs),
provable(G2, Defs).
provable(BI, _) :-
predicate_property(BI, built_in),
!,
call(BI).
provable(Goal, Defs) :-
member(Def, Defs),
copy_term(Def, Goal-Body),
provable(Body, Defs).
provable(Goal, Defs) is true if Goal is provable
with respect to the definitions Defs, which is a list of
clauses represented as terms of the form Head-Body. How
the built-in predicate !/0 is interpreted by this MI
does not match its intended meaning, and building an MI that
handles cuts correctly requires more work.
With the following additional definitions, we can use this MI to
identify redundant facts in some predicate definitions:
redundant(Functor/Arity, Reds) :-
functor(Term, Functor, Arity),
findall(Term-Body, clause(Term, Body), Defs),
setof(Red, Defs^redundant_(Defs, Red), Reds).
redundant_(Defs, Fact) :-
select(Fact-true, Defs, Rest),
once(provable(Fact, Rest)).
Given the definitions
as([]).
as([a]). % redundant
as([a,a]). % redundant
as([A|As]) :-
A = a, % test built-in =/2
5 is 2 + 3, % test built-in is/2
1 > 0, % test built-in >/2
as(As).
we can ask for facts which are deducible from all (respective)
remaining clauses and hence redundant:
?- redundant(as/1, Reds).
Reds = [as([a]), as([a, a])]
Let us reify backtracking now. We need to make explicit all
alternative clause bodies that are normally found on backtracking,
collecting them deterministically using findall/3. A single
alternative is represented as a list of goals, and the branches of
computation that have yet to be explored as a list of
alternatives. The interface predicate, mi_backtrack/1,
takes a goal as its argument and creates the representation of the
initial state of computation: A list, consisting of a single
alternative, [G0]. Actually, the representation is
[G0]-G0, and G0 is also passed as the second
argument to the worker predicate for reasons that will become
clear shortly.
mi_backtrack(G0) :- mi_backtrack_([[G0]-G0], G0).
To perform a single step of the computation, we first collect all
clause bodies whose heads unify with the first goal of the first
alternative. To all found clause bodies, the remaining goals of
that first alternative are appended, thus obtaining new
alternatives that we prepend to the other alternatives to give the
new state of computation. Using the clause representation that
makes use of list differences, and findall/4 to append
existing alternatives, this becomes:
resstep_([A|As0], As) :-
findall(Gs-G, (A = [G0|Rest]-G,mi_ldclause(G0,Gs,Rest)), As, As0).
Leaves of the computation, i.e., alternatives that we are done
with, automatically vanish from the list of alternatives as there
is no goal to be proved for them any more. The unification "A =
[G0|Rest]-G" thus fails and only the other alternatives remain.
The worker predicate:
mi_backtrack_([[]-G|_], G).
mi_backtrack_(Alts0, G) :-
resstep_(Alts0, Alts1),
mi_backtrack_(Alts1, G).
If no goals remain to be proved for an alternative, a solution for
the initial query is found and we report the bindings to the user.
This is why we needed to pass around the user's query. The second
clause describes how the computation is carried on: The list of
alternatives is transformed as described above, and the process
continues.
Representing all alternatives explicitly allows us to guide the
inference process by reordering alternatives, implement fair
execution strategies (by appending new alternatives) and so
on. Also, the MI shows what is needed to implement Prolog
in languages that lack built-in backtracking.
Extending Prolog
If you want a feature that plain Prolog does not provide, you can
add it to a vanilla MI. Here is an MI that behaves like standard
pure Prolog and builds a proof-tree in parallel that makes
explicit the inferences that lead to the proof:
:- op(750, xfy, =>).
mi_tree(true, true).
mi_tree((A,B), (TA,TB)) :-
mi_tree(A, TA),
mi_tree(B, TB).
mi_tree(g(G), TBody => G) :-
mi_clause(G, Body),
mi_tree(Body, TBody).
Example query:
?- mi_tree(g(natnum1(X)), T).
X = 0
T = true=>natnum1(0) ;
X = s(0)
T = (true=>natnum1(0))=>natnum1(s(0)) ;
X = s(s(0))
T = ((true=>natnum1(0))=>natnum1(s(0)))=>natnum1(s(s(0)))
Yes
Another group of extensions aims to improve the incomplete default
computation strategy. We start from an MI that limits the depth of
the search tree:
mi_limit(Goal, Max) :-
mi_limit(Goal, Max, _).
mi_limit(true, N, N).
mi_limit((A,B), N0, N) :-
mi_limit(A, N0, N1),
mi_limit(B, N1, N).
mi_limit(g(G), N0, N) :-
N0 > 0,
mi_clause(G, Body),
N1 is N0 - 1,
mi_limit(Body, N1, N).
Example query:
?- mi_limit(g(natnum1(X)), 3).
X = 0 ;
X = s(0) ;
X = s(s(0)) ;
No
As expected, the number of solutions coincides with the maximal
depth of the search tree in the case of natnum1/1. Based
on this MI, we can implement a complete search
strategy, iterative deepening:
mi_id(Goal) :-
between(0, infinite, N),
mi_limit(Goal, N).
Consider the program:
edge(a, b).
edge(b, a).
edge(b, c).
path(A, A, []).
path(A, C, [e(A,B)|Es]) :-
edge(A, B),
path(B, C, Es).
And the queries:
?- path(a, c, Es).
ERROR: Out of local stack
?- mi_id(g(path(a, c, Es))).
Es = [e(a, b), e(b, c)]
In contrast to depth-first search, iterative deepening finds a
solution.
Omission of the occurs check in the default unification
algorithms of common Prolog implementations can lead to unsound
inference:
occ(X, f(X)).
Without occurs check, the query
?- occ(A, A).
succeeds. We can use occurs check for unification of clause heads
in an MI:
mi_occ(true).
mi_occ((A,B)) :-
mi_occ(A),
mi_occ(B).
mi_occ(g(G)) :-
functor(G, F, Arity),
functor(H, F, Arity),
mi_clause(H, Body),
unify_with_occurs_check(G, H),
mi_occ(Body).
We get:
?- mi_occ(g(occ(A,A))).
No
You can use an MI similar to this one to reify the binding
environment of variables: Thread the bindings through and add
a term of the form "unify(G,H)" to the set of bindings in the
third clause. Use numbervars/3 to get rid of variables if
you want to reify unification.
As a consequence of Prolog's computation strategy, parsing with
left-recursive grammars is problematic. Let us define an MI for
DCGs that can handle left-recursion. Consider a simple grammar:
dcgnumber(0).
dcgnumber(1).
expr(N) --> [N], { dcgnumber(N) }.
expr(A+B) --> expr(A), [(+)], expr(B).
This grammar can be used to (unfairly) enumerate an arbitrary
number of strings it describes:
?- phrase(expr(E), Ss).
E = 0
Ss = [0] ;
E = 1
Ss = [1] ;
E = 0+0
Ss = [0, +, 0]
Yes
Parsing ground strings leads to problems:
?- phrase(expr(E), [1,+,1]).
E = 1+1 ;
ERROR: Out of local stack
We first convert the grammar into a more suitable representation:
dcg_clause(expr(N), [t(N),{dcgnumber(N)}]).
dcg_clause(expr(A+B), [l,nt(expr(A)),t(+),nt(expr(B))]).
The atom l in the body of the second clause is used to
capture that for this clause to apply, there must be at least one
(therefore, one l) token left in the string to be
parsed. This is used in the MI to prune the search tree if we run
out of tokens. The other terms are: t/1 for terminals, nt/1 for
non-terminals and {}/1 for goals.
The interface to the MI:
mi_dcg(NT, String) :-
length(String, L),
length(Rest0, L),
mi_dcg_([nt(NT)], Rest0, _, String, []).
The worker predicates:
mi_dcg(t(T), Rest, Rest, [T|Ts], Ts).
mi_dcg({Goal}, Rest, Rest, Ts, Ts) :-
call(Goal).
mi_dcg(nt(NT), Rest0, Rest, Ts0, Ts) :-
dcg_clause(NT, Body),
mi_dcg_(Body, Rest0, Rest, Ts0, Ts).
mi_dcg(l, [_|Rest], Rest, Ts, Ts).
mi_dcg_([], Rest, Rest, Ts, Ts).
mi_dcg_([G|Gs], Rest0, Rest, Ts0, Ts) :-
mi_dcg(G, Rest0, Rest1, Ts0, Ts1),
mi_dcg_(Gs, Rest1, Rest, Ts1, Ts).
We can now use the left-recursive grammar also for parsing:
?- mi_dcg(expr(E), [1,+,1,+,1]).
E = 1+ (1+1) ;
E = 1+1+1 ;
No
Other possible extensions are module systems, delayed goals,
checking for various kinds of infinite loops, profiling,
debugging, type systems, constraint solving etc. The overhead
incurred by implementing these things using MIs can be compiled
away using partial evaluation techniques. For instance, we can
(mechanically) derive the following partially evaluated version of
the DCG example:
pe_expr(Expr, String) :-
length(String, L),
length(Rest0, L),
pe_expr(Expr, Rest0, _, String, []).
pe_expr(N, Rest, Rest, Ts0, Ts) :-
Ts0 = [N|Ts],
dcgnumber(N).
pe_expr(A+B, [_|Rest0], Rest, Ts0, Ts) :-
pe_expr(A, Rest0, Rest1, Ts0, Ts1),
Ts1 = [+|Ts2],
pe_expr(B, Rest1, Rest, Ts2, Ts).
Comparison:
?- sum_of_ones(10^3, Ss), time(mi_dcg(expr(Sum), Ss)).
% 525,516 inferences, 0.68 CPU in 0.68 seconds (100% CPU, 772818 Lips)
?- sum_of_ones(10^3, Ss), time(pe_expr(Sum, Ss)).
% 6,008 inferences, 0.01 CPU in 0.01 seconds (186% CPU, 600800 Lips)
MIs are also used in abstract interpretation to derive type
and mode information of predicates. An example from Codish and
Sondergaard 2002, "Meta-Circular Abstract Interpretation in
Prolog" is in the source file. By fixpoint computation, we derive
non-trivial facts about Ackermann's function over an abstract
parity domain:
?- ack_fixpoint(As).
As = [ack(odd, odd, odd), ack(odd, even, odd), ack(odd, one, odd), ack(even, odd, odd),
ack(odd, zero, odd), ack(even, even, odd), ack(even, one, odd), ack(one, odd, odd),
ack(even, zero, odd), ack(one, even, even), ack(one, one, odd), ack(one, zero, even),
ack(zero, zero, one), ack(zero, one, even), ack(zero, even, odd), ack(zero, odd, even)] ;
It follows in particular that ackermann(i, j) is odd and greater
than 1 for all i > 1.
Source file containing all examples: acomip.pl.
Written Sept. 14th 2005
Main page