From: "Andrew Wozniewicz"
Newsgroups: sdforum.want
Sent: Wednesday, November 07, 2007 3:04 AM
Subject: Execution Runtime [LONG]
It's been pretty quiet here for a while, so here is what I am currently
working on, albeit not as much as I would like to...
I am working to design and implement a Runtime Execution Engine, hereinafter
referred to as the WANT 2.0 Runtime (W2R), or simply the "Runtime". I am
making steady progress there, but I could certainly use some feedback. So
here is an initial description of how I envision the Runtime to work. The
details are a conglomerate of my currently half-cooked implementation, and
the remaining "vision" I have for the Runtime. Please, bear with me as I
cover some fundamental ground - I needed to write this to clarify some
things for myself, anyway. Sorry, I didn't have the time to make it any
shorter.
INTRODUCTION
The Runtime executes XML or, more precisely, an in-memory DOM-like structure
constructed out of nodes - implementors of the INode interface, with XML being
just one of the possible representations of the DOM (DML being another, for
example). Here is the essence of INode and its associated INodeCollection
interfaces:
  INodeCollection = interface
    ...
    function GetItem(AIndex: Integer): INode;
    function GetCount: Integer;
    procedure SetItem(AIndex: Integer; AValue: INode);
    //
    function Add(ANode: INode): Integer;
    procedure Insert(AIndex: Integer; ANode: INode);
    procedure Delete(AIndex: Integer);
    procedure Clear;
    function FindByAttr(const AAttrName, AAttrValue: String;
      CaseSensitive: Boolean = False): INode;
    //
    property Item[Index: Integer]: INode
      read GetItem write SetItem; default;
    property Count: Integer read GetCount;
    ...
  end;
  INode = interface
    ...
    function GetNodeName: String;
    procedure SetNodeName(const AValue: String);
    function GetAttrValue(AName: String): String;
    procedure SetAttrValue(AName: String; const Value: String);
    function GetChildren: INodeCollection;
    function GetXML: String;
    //
    property NodeName: String read GetNodeName write SetNodeName;
    property Children: INodeCollection read GetChildren;
    property AttrValue[Name: String]: String
      read GetAttrValue write SetAttrValue; default;
    property XML: String read GetXML;
    ...
  end;
So, essentially, a Runtime DOM node - an INode - has (String) attributes,
and (optionally) other INode children presented as INodeCollection. With a
nod to the needs of XML generation, it also has a "NodeName" property that
corresponds to the tag of its XML representation (which is not to be
confused with the "Name" attribute, which it may, or may not have).
Notice that - unlike most other DOM nodes I've seen - the INode does NOT
define a unique "parent", which actually makes it quite powerful and
universal (not only is it possible to construct "trees", but also more
generic DAGs - Directed Acyclic Graphs). Also note that this DOM is a
different implementation of a DOM from that in WANT 1.0 - the INode
interface, my TNode implementation instead of Juanca's, and an XML
parser/generator, are just a few of the most obvious differences.
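To illustrate the "no unique parent" point, here is a minimal sketch of a node that is reachable through two different parents. ISketchNode and TSketchNode are hypothetical, cut-down stand-ins invented for this example; they are not the actual INode/TNode code.

```pascal
program SharedNodeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical stand-in for INode: a name plus children,
  // and deliberately no Parent reference.
  ISketchNode = interface
    function GetNodeName: String;
    function GetChildCount: Integer;
    function GetChild(AIndex: Integer): ISketchNode;
    procedure AddChild(ANode: ISketchNode);
  end;

  TSketchNode = class(TInterfacedObject, ISketchNode)
  private
    FNodeName: String;
    FChildren: array of ISketchNode;
  public
    constructor Create(const ANodeName: String);
    function GetNodeName: String;
    function GetChildCount: Integer;
    function GetChild(AIndex: Integer): ISketchNode;
    procedure AddChild(ANode: ISketchNode);
  end;

constructor TSketchNode.Create(const ANodeName: String);
begin
  inherited Create;
  FNodeName := ANodeName;
end;

function TSketchNode.GetNodeName: String;
begin
  Result := FNodeName;
end;

function TSketchNode.GetChildCount: Integer;
begin
  Result := Length(FChildren);
end;

function TSketchNode.GetChild(AIndex: Integer): ISketchNode;
begin
  Result := FChildren[AIndex];
end;

procedure TSketchNode.AddChild(ANode: ISketchNode);
begin
  // The child is only referenced here; it never learns who its parents are.
  SetLength(FChildren, Length(FChildren) + 1);
  FChildren[High(FChildren)] := ANode;
end;

var
  Shared, Left, Right, Root: ISketchNode;
begin
  Shared := TSketchNode.Create('shared');
  Left := TSketchNode.Create('left');
  Right := TSketchNode.Create('right');
  Root := TSketchNode.Create('root');
  Left.AddChild(Shared);
  Right.AddChild(Shared);  // the same node under a second parent: a DAG
  Root.AddChild(Left);
  Root.AddChild(Right);
  if Left.GetChild(0) = Right.GetChild(0) then
    WriteLn('"', Shared.GetNodeName, '" is reachable through both parents');
end.
```

Because a child never points back at a parent, interface reference counting cleans the graph up without any cycle handling, at least as long as the graph really stays acyclic.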
The actual executable-DOM (or the "Abstract Syntax Tree" that the
Runtime understands, which in this case is not a tree but a DAG) is
structured very differently from the current "executable" DOM of WANT: it's
much more detailed (verbose!) and is thus not very suitable for manual human
input at all (not that the original WANT XML script *was* suitable, but
still, it was much more amenable).
The WANT 2.0 Runtime (W2R) is more of a generic-script-executable-DOM,
designed explicitly to support all features I wanted to see in Modula7, yet
it is a language-agnostic DOM, much like the CLR of .NET. It is much more
than WANT by itself potentially needs, but I am deliberately building it in
such a way that WANT features will just fall into place naturally - and then
the power will be there to extend it into as yet unknown directions. The
fact that the Runtime is language-agnostic allows me to sidestep - for now -
the issue of the scripting language design altogether, and to concentrate on
the under-the-hood run-time functionality common to all possible scripting
languages that could potentially run on top of it.
In short, the new DOM, when represented as XML, does not resemble the old
WANT XML script at all. I call this new XML format "XML eXecutable", or
XMLX. I am currently forced to use XMLX files created by hand to exercise my
Runtime as I am building it, which is a bit of a pain given XMLX's
verbosity, but of course, in the future the DOM may and will be generated
directly by any capable script parser and XMLX files will not be
necessary at all. For now, one can think of them as precompiled "object"
files, or "intermediate code" that can be loaded into the Runtime engine
to execute.
To pre-empt the protests by performance buffs, the implementation is
String-based, i.e. all literal values are ultimately (huge) Delphi Strings.
This means that yes, the integers are represented as strings, and that yes,
there needs to be a StrToInt+IntToStr conversion whenever an actual
computation is taking place. This is by design, and deliberate. I'll explain
the rationale at some other time, but just be advised that if you are
implementing the avionics for a supersonic aircraft, you should select
something other than a W2R-based script for your implementation. I still
think it's good-enough for WANT.
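As a rough illustration of what the String-based design costs, an integer addition over string-valued operands might look like the sketch below. AddAsStrings is a hypothetical helper made up for this example, not a Runtime function.

```pascal
program StringValueSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper: both operands and the result are Strings, so every
// computation pays for two StrToInt calls and one IntToStr call.
function AddAsStrings(const ALeft, ARight: String): String;
begin
  Result := IntToStr(StrToInt(ALeft) + StrToInt(ARight));
end;

begin
  WriteLn(AddAsStrings('40', '2'));  // prints 42
end.
```

StrToInt raises EConvertError on non-numeric input, so a real implementation would presumably need to decide where such errors surface in the script.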
MODULE = DATA + CODE
Now, the fundamental concept in the Runtime is that of a MODULE. A module,
at its simplest, is a combination of data and code, i.e. DATA + CODE =
MODULE (Niklaus Wirth's shadow looming large here). A module is also an
embodiment of an Abstract Data Type, with DATA + OPERATIONS on that data.
When you "invoke" a module, you run its "code" (a designated method) which
operates on its "data". Examples of modules include classes, and methods.
By analogy, the Delphi concept of a unit is just an example of a static
module, while a class is a module that can (typically) be instantiated
(non-static). Less obvious, perhaps, is a realization that a subroutine (a
method, a procedure, a function) is also a module. Unlike Delphi (or most
other programming environments) W2R does NOT make the distinction among the
different kinds of modules and treats them pretty much uniformly: class,
"unit" (actually called "module" in M7), method, procedure, function, etc.,
are all essentially the same thing to the Runtime.
The fundamental characteristic of a module is that it is a recursive
concept, i.e. a module definition may contain other module definitions
inside it, i.e. Module = (Data + Code) + Other_Modules. Thus a class
definition contains methods, and a method definition may contain classes, ad
infinitum. As far as the runtime is concerned there are only classes and
their methods, nested in one another to an arbitrary depth, and the two are
just specialized kinds of modules.
Looking at it a bit closer, the following are the constituent
components/sections of a module (these are immediate child nodes of the
module, each potentially containing other nodes):
Module:
- static data
- static initialization (class constructor)
- static finalization (class destructor)
- static methods
- parameters
- local data
- initialization (default constructor)
- finalization (default destructor)
- non-static methods (embedded/contained method modules)
- module types (embedded/contained non-method modules)
- non-static code (for direct invocation)
Each of these components is optional, and different kinds of modules (for
example a method versus a class) will typically have a different mix of
them, but all are allowed in every module.
The code sections of a module - which include both static and non-static
code, initialization, and finalization sections - when non-empty, contain
code statements, such as assignments, and function calls (they are, you
guessed it, also nodes of the DOM). These code sections are invoked via the
IExecutable interface's Execute call:
  IExecutable = interface
    ...
    function Execute: INode;
  end;
This means that the "executable" code sections within the DOM must implement
the IExecutable interface, in addition to the INode interface that makes them
part of the DOM in the first place.
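A minimal sketch of that dual role follows: a code section that is a node and also runs whichever of its children are executable. All the types here (ISketchNode, ISketchExecutable, TNamedNode, TEchoNode, TBlockNode) and the GUIDs are hypothetical and invented for the example; returning the last statement's result is likewise just an assumption.

```pascal
program ExecutableNodeSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  // Hypothetical, cut-down stand-ins for INode and IExecutable.
  ISketchNode = interface
    ['{6F1B2C3D-4E5F-4A6B-8C7D-9E0F1A2B3C4D}']
    function GetNodeName: String;
  end;

  ISketchExecutable = interface
    ['{0A1B2C3D-4E5F-4061-8273-8495A6B7C8D9}']
    function Execute: ISketchNode;
  end;

  // A plain data node: part of the DOM, but not executable.
  TNamedNode = class(TInterfacedObject, ISketchNode)
  private
    FNodeName: String;
  public
    constructor Create(const ANodeName: String);
    function GetNodeName: String;
  end;

  // A statement node: prints its name and returns itself as the result.
  TEchoNode = class(TNamedNode, ISketchExecutable)
  public
    function Execute: ISketchNode;
  end;

  // A code section: runs its executable children in order.
  TBlockNode = class(TNamedNode, ISketchExecutable)
  private
    FChildren: array of ISketchNode;
  public
    procedure Add(ANode: ISketchNode);
    function Execute: ISketchNode;
  end;

constructor TNamedNode.Create(const ANodeName: String);
begin
  inherited Create;
  FNodeName := ANodeName;
end;

function TNamedNode.GetNodeName: String;
begin
  Result := FNodeName;
end;

function TEchoNode.Execute: ISketchNode;
begin
  WriteLn(FNodeName);
  Result := Self;
end;

procedure TBlockNode.Add(ANode: ISketchNode);
begin
  SetLength(FChildren, Length(FChildren) + 1);
  FChildren[High(FChildren)] := ANode;
end;

function TBlockNode.Execute: ISketchNode;
var
  I: Integer;
  Exec: ISketchExecutable;
begin
  Result := nil;
  for I := 0 to High(FChildren) do
    // Children that are only nodes (not executable) are skipped.
    if Supports(FChildren[I], ISketchExecutable, Exec) then
      Result := Exec.Execute;
end;

var
  BlockObj: TBlockNode;
  Block: ISketchExecutable;
  Last: ISketchNode;
begin
  BlockObj := TBlockNode.Create('block');
  Block := BlockObj;  // the interface reference now owns the lifetime
  BlockObj.Add(TNamedNode.Create('data'));
  BlockObj.Add(TEchoNode.Create('first'));
  BlockObj.Add(TEchoNode.Create('second'));
  Last := Block.Execute;
  WriteLn('last result: ', Last.GetNodeName);
end.
```

Running it prints "first", "second", and then "last result: second"; the 'data' node is never executed because it does not support the executable interface.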
Alternatively, a method can be implemented externally to the Runtime, by
being mapped to a Delphi class that implements the IExecutable interface.
Note that a Runtime method implementation is thus a Delphi class instance
that can be dynamically registered with the Runtime.
An externally-implemented method (externally with respect to the Runtime)
has a reference to the external class instance instead of executable script
nodes as its code section. The external class is instantiated upon the
loading of the code block, and is available throughout the Runtime's
execution thereafter.
EXTERNAL METHODS
So, the short of it is that in order to implement an (external) method, one
has to define (in Delphi) a class that implements IExecutable. This also
applies if one needs to make a native (e.g. Windows API) function available
to the script - a wrapper class must be implemented that exposes the
parameters of the external function as script parameters, and enables the
Runtime to marshal data between the two. Since there are potentially lots
of such methods that one might eventually want to implement, numerous
classes will need to be defined.
This is why I want to make it (relatively) easy to implement such externally
defined function-wrapper-classes. These classes must be able to provide some
metadata to the Runtime for it to know how to invoke the method correctly,
but I don't want the task of generating the metadata to become tedious for
the method-implementation-class writer, i.e. the programmer such as you.
Enter the RTTI.
An external method implementation would be defined along the following lines
(I'll use a wrapper for the system Copy function in this example):
  type
    TSystem_Copy_Method = class(TMethodParent,IExecutable)
    protected
      function GetResult: String;
      procedure SetS(AValue: String);
      procedure SetIndex(AValue: Integer);
      procedure SetCount(AValue: Integer);
    published
      property Result: String read GetResult;
      property S: String write SetS;
      property Index: Integer write SetIndex;
      property Count: Integer write SetCount;
    public
      procedure Execute;
    end;
The implementation of Execute could then be as easy as this simple
wrapper:
  procedure TSystem_Copy_Method.Execute;
  begin
    Result := System.Copy(S, Index, Count);
  end;
The reason I would like to do it this way is that it makes it possible to
use the RTTI to gather the metadata about the method call from the
properties of the implementing class. The Runtime could use this info to
detect the parameters, and it makes it (relatively) easy to implement the
actual code of the method.
In the example above, the wrapper declares three input parameters to the
method (write-only properties of the implementing class), and one
output-only parameter (the read-only property called the Result).
Given the above declaration of TSystem_Copy_Method, it would just take a
call to
RegisterStaticMethod('System.Copy',TSystem_Copy_Method);
for the Runtime to be able to gather all the information it needs about the
method to be able to call it with appropriate parameters. The net effect of
these declarations corresponds to the following script function header, if
it were to be represented in Pascal:
function Copy(S: String; Index, Count: Integer): String;
The RTTI embedded in the TSystem_Copy_Method class declaration is sufficient
for the Runtime to determine the signature of the method in question: three
input parameters (one String and two Integers), and a String result. The
read and write access of
each property tells the Runtime whether the parameter is IN (write-only),
OUT (read-only), or both (read+write).
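For what it's worth, here is a minimal sketch of how the parameter directions could be recovered from published properties with the standard TypInfo unit. TCopySketch is a hypothetical stand-in for TSystem_Copy_Method (field-backed properties, no TMethodParent), and DescribeParameters is an invented helper, not a Runtime routine.

```pascal
program RttiSignatureSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  Classes, TypInfo;

type
  // Hypothetical stand-in for TSystem_Copy_Method. TPersistent is compiled
  // with {$M+}, so published properties of descendants carry RTTI.
  TCopySketch = class(TPersistent)
  private
    FS: String;
    FIndex: Integer;
    FCount: Integer;
    FResult: String;
  published
    property Result: String read FResult;
    property S: String write FS;
    property Index: Integer write FIndex;
    property Count: Integer write FCount;
  end;

// Write-only means IN, read-only means OUT, read+write means IN-OUT.
function DirectionOf(AInfo: PPropInfo): String;
begin
  if Assigned(AInfo^.GetProc) and Assigned(AInfo^.SetProc) then
    Result := 'IN-OUT'
  else if Assigned(AInfo^.SetProc) then
    Result := 'IN'
  else
    Result := 'OUT';
end;

// Invented helper: lists every published property as a parameter.
procedure DescribeParameters(AClass: TClass);
var
  PropList: PPropList;
  PropCount, I: Integer;
  Info: PPropInfo;
begin
  PropCount := GetPropList(PTypeInfo(AClass.ClassInfo), PropList);
  try
    for I := 0 to PropCount - 1 do
    begin
      Info := PropList^[I];
      WriteLn(DirectionOf(Info), ' ', Info^.Name, ': ',
        Info^.PropType^.Name);
    end;
  finally
    // GetPropList only allocates the list when there are properties.
    if PropCount > 0 then
      FreeMem(PropList);
  end;
end;

begin
  DescribeParameters(TCopySketch);
end.
```

The type names that get printed depend on the compiler (the String type is reported differently by Delphi and Free Pascal), so the sketch should be read as showing the mechanism rather than an exact listing.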
It is also worth noting that a script (or external) "method" is more like a
stored procedure in a relational database than a function or procedure in a
high-level language: it may have an arbitrary number of IN parameters, an
arbitrary number of OUT parameters, and an arbitrary number of IN-OUT
parameters, each of which can be optional (with a default value).
There is also support for variable argument lists within the Runtime. This
feature allows for the implementation of wrappers for Delphi's System.Write()
and System.WriteLn() procedures, for example. An argument within the variable
list is nameless. To mark the start of a variable argument list in an
external method implementation, define a published property "___" (three
underscores) of type INode. These arguments will be assigned unmarshalled,
so the implementor of the external Execute method will have to access
them via the INode interface.
Here is an example of an external method that supports a variable argument
list:
  type
    TSystem_String_Format_Method = class(TMethodParent,IExecutable)
    protected
      function GetResult: String;
      procedure SetFormatStr(AValue: String);
      procedure Set___(AValue: INode);
    published
      property Result: String read GetResult;
      property FormatStr: String write SetFormatStr;
      property ___: INode write Set___;
    public
      procedure Execute;
    end;
The equivalent Modula7 header would look like:
function Format(FormatStr: String; ... ): String;
As an example, it can be called (in M7 script), like this:
S := System.String.Format("%d %s %n", IntVar, StrVar, NumericVar);
IEXECUTABLE VERSUS INODE
A method implementation external to the Runtime is exposed as an instance of
IExecutable, which in itself is also an INode and is placed in the method's
code section.
Each code section potentially implements IExecutable in its own way. When an
external method is invoked, the Runtime marshals the parameters into the
external implementation by assigning the implementation class's published,
write-only properties, calls the implementation's Execute method, and then
marshals any results (OUT parameters) back into the node representation (I
am very tempted to call it the "managed" representation here).
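The marshalling round trip described above could be sketched roughly as follows, using the standard TypInfo accessors. Everything named here is hypothetical: TSketchMethod, TCopySketch and InvokeWithArgs are invented for the example, the name=value string lists merely stand in for the node representation, and only Integer and String parameters are handled.

```pascal
program MarshalSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes, TypInfo;

type
  // Hypothetical base class; TPersistent gives descendants published RTTI.
  TSketchMethod = class(TPersistent)
  public
    procedure Execute; virtual; abstract;
  end;

  // Hypothetical wrapper around System.Copy.
  TCopySketch = class(TSketchMethod)
  private
    FS: String;
    FIndex: Integer;
    FCount: Integer;
    FResult: String;
  published
    property Result: String read FResult;
    property S: String write FS;
    property Index: Integer write FIndex;
    property Count: Integer write FCount;
  public
    procedure Execute; override;
  end;

procedure TCopySketch.Execute;
begin
  FResult := Copy(FS, FIndex, FCount);
end;

// Assumption: every published property is either an Integer or a String.
procedure InvokeWithArgs(AMethod: TSketchMethod; AArgs, AResults: TStrings);
var
  PropList: PPropList;
  PropCount, I: Integer;
  Info: PPropInfo;
  PropName: String;
begin
  PropCount := GetPropList(PTypeInfo(AMethod.ClassInfo), PropList);
  try
    // Marshal in: assign every writable property that has an argument.
    for I := 0 to PropCount - 1 do
    begin
      Info := PropList^[I];
      PropName := String(Info^.Name);
      if Assigned(Info^.SetProc) and (AArgs.IndexOfName(PropName) >= 0) then
      begin
        if Info^.PropType^.Kind = tkInteger then
          SetOrdProp(AMethod, Info, StrToInt(AArgs.Values[PropName]))
        else
          SetStrProp(AMethod, Info, AArgs.Values[PropName]);
      end;
    end;

    AMethod.Execute;

    // Marshal out: read every readable property back as a String.
    for I := 0 to PropCount - 1 do
    begin
      Info := PropList^[I];
      PropName := String(Info^.Name);
      if Assigned(Info^.GetProc) then
      begin
        if Info^.PropType^.Kind = tkInteger then
          AResults.Values[PropName] := IntToStr(GetOrdProp(AMethod, Info))
        else
          AResults.Values[PropName] := GetStrProp(AMethod, Info);
      end;
    end;
  finally
    if PropCount > 0 then
      FreeMem(PropList);
  end;
end;

var
  Method: TCopySketch;
  Args, Results: TStringList;
begin
  Method := TCopySketch.Create;
  Args := TStringList.Create;
  Results := TStringList.Create;
  try
    Args.Values['S'] := 'Runtime';
    Args.Values['Index'] := '1';
    Args.Values['Count'] := '3';
    InvokeWithArgs(Method, Args, Results);
    WriteLn(Results.Values['Result']);  // prints Run
  finally
    Results.Free;
    Args.Free;
    Method.Free;
  end;
end.
```

Keeping the arguments as Strings on the script side and converting only at the property boundary is consistent with the String-based design mentioned earlier, though the real Runtime may well marshal differently.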
When an external method is registered with the Runtime via a call to
RegisterXXXMethod(), the method-implementation class is instantiated as the
"code" object.
RUNTIME FOR WANT
There is a natural mapping between the concept of an external method of the
Runtime and the old concept of a WANT task, namely, a WANT task IS
simply an external method.
So, a WANT task writer implements the task as an external method, i.e. a Delphi
class, and implements the Execute method of that class to use the published
properties as parameters.
This is mostly how it currently works in WANT.1, anyway. WANT.2 just
clarifies and formalizes the usage of published properties of the task as
being the parameters of a (script)method call, making it generic and
universally extensible.
Note that all published properties of an external implementation of a script
method are treated as parameter definitions, so only those that are
intended as parameters should be published. The existing WANT tasks will
require some cleanup of their published sections.
-Andrew