Yacas Under the Hood

Yacas Under the Hood

This part of the manual is a somewhat in-depth explanation of the Yacas programming language and environment. It assumes that you have worked through the introductory tutorial. You should consult the function reference about how to use the various Yacas functions mentioned here.


The Yacas Architecture

Yacas is designed as a small core engine that interprets a library of scripts. The core engine provides the syntax parser and a number of hard-wired functions, such as Set() or MathExp() which cannot be redefined by the user. The script library resides in the scripts directory "scripts" and contains higher-level definitions of functions and constants. The library scripts are on equal footing with any code the user executes interactively or any files the user loads.

Generally, all core functions have plain names and almost all are not 'bodied' or infix operators. The file "src/yacasapi.cc" in the source tree lists declarations of all kernel functions; consult it for reference. For many of the core functions, the script library already provides convenient aliases. For instance, the addition operator "+" is defined in the script scripts/standard while the actual addition of numbers is performed through the built-in function MathAdd.


Startup and the .def files

When Yacas is first started or restarted, it executes the script "yacasinit" in the scripts directory. This script may load some other scripts. In order to start up quickly, Yacas does not execute all other library scripts at first run or at restart. It only executes the file yacasinit and all .def files in the scripts and addons directories. These .def files tell the system where it can find definitions for various functions. For example, the function ArcTan is mentioned in the file "stdfuncs.def", therefore Yacas will load the file "stdfuncs" when the user invokes ArcTan. This way Yacas knows where to look for any given function without actually loading the file where the function is defined.


Object types

Yacas supports two basic kinds of objects: atoms and compounds. Atoms are (integer or real, arbitrary-precision) numbers such as 2.71828, symbolic variables such as A3 and character strings. Compounds include functions and expressions, e.g. Cos(a-b) and lists, e.g. {1+a,2+b,3+c}.

The type of an object is returned by the built-in function Type, for example:
In> Type(a);
Out> "";
In> Type(F(x));
Out> "F";
In> Type(x+y);
Out> "+";
In> Type({1,2,3});
Out> "List";
Internally, atoms are stored as strings and compounds as lists. The functions String() and Atom() convert between atoms and strings. A Yacas list {1,2,3} is internally a list (List 1 2 3) which is the same as a function call List(1,2,3) and for this reason the "type" of a list is the string "List". During evaluation, atoms can be interpreted as numbers, or as variables that may be bound to some value, while compounds are interpreted as function calls.

Note that atoms that result from an Atom() call may be invalid and never evaluate to anything. For example, Atom(3X) is an atom with string representation "3X" but with no other properties.

Currently, no other lowest-level objects besides strings and lists are provided by the core engine. There is however a possibility to link externally compiled code that will provide additional types of objects. Those will be available in Yacas as "generic objects."


Evaluation Scheme

Evaluation of an object is performed either explicitly by the built-in command Eval() or implicitly when assigning variables or calling functions with the object as argument (except when a function does not evaluate that argument). Evaluation of an object can be explicitly inhibited using Hold(). To make a function not evaluate one of its arguments, a HoldArg(funcname, argname) must be declared for that function.

Evaluation when performed on an atom goes as follows: if the atom is bound locally as a variable, the object it is bound to is returned, otherwise, if it is bound as a global variable that is returned. Otherwise, the atom is returned unevaluated.

Lists of atoms are generally interpreted in the following way: the first atom of the list is some command, and the atoms following in the list are considered the arguments. The engine first tries to find out if it is a built-in command (core function). In that case, the function is executed. Otherwise, it could be a user-defined function (with a "rule database"), and in that case the rules from the database are applied to it. If none of the rules are applicable, or if no rules are defined for it, the object is returned unevaluated.

The main properties of this scheme are the following. When objects are assigned to variables, they generally are evaluated (except if you are using the Hold() function) because assignment var := value is really a function call to Set(var, value) and this function evaluates its second argument (but not its first argument). When referencing that variable again, the object which is its value will not be re-evaluated. Also, the default behaviour of the engine is to return the original expression if it couldn't be evaluated. This is a desired behaviour if evaluation is used for simplifying expressions.

One major design flaw in Yacas (one that other functional languages like LISP also have) is that when some expression is re-evaluated in another environment, the local variables contained in the expression to be evaluated might have a different meaning. In this case it might be useful to use the functions LocalSymbols and TemplateFunction. Calling
LocalSymbols(a,b)
a+b;
results in "a" and "b" in the addition being substituted with unique symbols that can not clash with other variables being used elsewhere. Use TemplateFunction instead of Function to define a function whose parameters should be treaded as unique symbols also.


Rules

Rules are special properties of functions that are applied when the function object is being evaluated. A function could have just one rule bound to it; this is similar to having a "function body" in normal procedural languages. However, Yacas function objects can also have several rules bound to them. This is analogous of having several alternative "function bodies" that are executed under different circumstances. This design is more suitable for symbolic manipulations.

A function is identified by its name as returned by Type and the number of arguments, or "arity". The same name can be used with different arities to define different functions: f(x) is said to 'have arity 1' and f(x,y) has arity 2. Each of these functions may possess its own set of specific rules, which we shall call a "rule database" of a function.

Each function should be first declared with the built-in command RuleBase as follows:
RuleBase("FunctionName",{argument list});
So, a new (and empty) rule database for f(x,y) could be created by typing RuleBase("f",{x,y}); The names for the arguments "x" and "y" here are arbitrary, but they will be globally stored and must be later used in descriptions of particular rules for the function "f". After the new rulebase declaration, the evaluation engine of Yacas will begin to really recognize "f" as a function, even though no function body or equivalently no rules have been defined for it yet. Here is the difference in how a function object "g(x)" behaves before and after a RuleBase declaration:
In> f(a):=g(a)+1;
Out> True;
In> f(B);
Out> g(a)+1;
In> g(1+1);
Out> g(1+1);
In> RuleBase("g",{x});
Out> True;
In> f(B);
Out> g(B)+1;
In> g(1+1);
Out> g(2);
Without a RuleBase call, a function object behaves basically like an atom, never evaluated to anything else but its literal representation, because the formal function parameters (e.g. "a" in the definition of "f(a)") will never be bound to the actual arguments.

The shorthand operator := for creating user functions that we illustrated in the tutorial is actually defined in the scripts and it makes the requisite call to the RuleBase function. After a RuleBase call you can specify parsing properties for the function if you like; for example, you could make it an infix or bodied operator.

Now we can add some rules to the rule database for a function. A rule simply states that if a specific function object with a specific arity is encountered in an expression and if a certain predicate is true, then Yacas should replace this function with some other expression. To tell Yacas about a new rule you can use the built-in Rule command. This command is what does the real work for the somewhat more aesthetically pleasing ... # ... <-- ... ; construct we have seen in the tutorial. Incidentally, you do not have to call RuleBase explicitly if you use that construct.

Here is the general syntax for a Rule call:
Rule("foo", arity, precedence, predicate) body;
This specifies that for function foo with given arity (foo(a,b) has arity 2), there is a rule that if predicate is true, then body should be evaluated, and the original expression replaced by the result. Predicate and body can use the symbolic names of arguments that were declared in the RuleBase call.

All rules for a given function can be erased with a call to TryRetract(funcname, arity). This is useful, for instance, when too many rules have been entered in the interactive mode. This call undefines the function and also invalidates the RuleBase declaration.

You can specify that function arguments are not evaluated before they are bound to the parameter: HoldArg("foo",a); would then declare that the a arguments in both foo(a) and foo(a,b) should not be evaluated before bound to "a". Here the argument name "a" should be the same as that used in the RuleBase call when declaring these functions. Inhibiting evaluation of certain arguments is useful for procedures performing actions based partly on a variable in the expression, like integration, differentiation, looping, etc., and will be typically used for functions that are algorithmic and procedural by nature.

Rule-based programming normally makes heavy use of recursion and it is important to control the order in which replacement rules are to be applied. For this purpose, each rule is given a precedence. Precedences go from low to high, so all rules with precedence 0 will be tried before any rule with precedence 1.

You can assign several rules to one and the same function, as long as some of the predicates differ. If none of the predicates is true, the function with its arguments evaluated is returned.

This scheme is slightly slower for ordinary functions that just have one rule (with the predicate True), but it is a desired behaviour for symbolic manipulation. You can slowly build up your own functions, incrementally testing their properties.


Examples of using rules

As a simple example, here are the actual RuleBase and Rule calls needed to define the factorial function:
In> RuleBase("f",{n});

Out> True;

In> Rule("f", 1, 10, n=0) 1;

Out> True;

In> Rule("f", 1, 20, IsInteger(n) And n>0) n*f(n-1);

Out> True;
This definition is entirely equivalent to the one we gave in the tutorial. f(4) should now return 24, while f(a) should return just f(a) if "a" is not bound to any value.

The Rule commands in this example specified two rules for function "f" with arity 1: one rule with precedence 10 and predicate n=0, and another with precedence 20 and the predicate that returns True only if "n" is a positive integer. Rules with lowest precedence get evaluated first, so the rule with precedence 10 will be tried before the rule with precedence 20. Note that the predicates and the body use the name "n" declared by the RuleBase call.

After declaring RuleBase for a function, you could tell the parser to treat this function as a postfix operator:
In> Postfix("f");

Out> True;

In> 4 f;

Out> 24;
There is already a function Function defined in the standard scripts that allows you to construct simple functions. An example would be
Function ("FirstOf", {list})  list[ 1 ] ; 
which simply returns the first element of a list. This could also have been written as
Function("FirstOf", {list})

[

list[1] ;

];
As mentioned before, the [ and ] brackets are also used to combine multiple operations to be performed one after the other. The result of the last performed action is returned.

Finally, the function FirstOf could also have been defined by typing
FirstOf(list):=list[1] ;


Structured programming

Some functions useful for control flow are already defined in Yacas's standard library. Let's look at a possible definition of a looping function ForEach. We shall here consider a somewhat simple-minded definition, while the actual ForEach as defined in the standard script "controlflow" is a little more sophisticated.
Function("ForEach",{foreachitem,foreachlist,foreachbody})

[

Local(foreachi,foreachlen);

foreachlen:=Length(foreachlist);

foreachi:=0;

While (foreachi < foreachlen)

[

foreachi++;

MacroLocal(foreachitem);

MacroSet(foreachitem,foreachlist[foreachi]);

Eval(foreachbody);

];

];

Bodied("ForEach");

UnFence("ForEach",3);

HoldArg("ForEach",foreachitem);

HoldArg("ForEach",foreachbody);
Functions like this should probably be defined in a separate file. You can load such a file with the command Load("file")This is an example of a macro-like function. Let's look at the last few lines first. There is a Bodied(...) , which states that a for command is to be entered as ForEach(item,{list}) body; . That is, the last argument to the command ForEach should be outside its brackets.UnFence(...) states that this function can use the local variables of the calling function. This is necessary, since the body to be evaluated for each item will probably use some local variables from that surrounding.

Finally, HoldArg("function",argument) specifies that the argument argument should not be evaluated before being bound to that variable. This holds for foreachitem and foreachbody , since foreachitem specifies a variable to be set to that value, and foreachbody is the expression that should be evaluated after that variable is set.

Inside the body of the function definition there are calls to Local(...). 'Local' declares some local variable that will only be visible within a block [ ... ]. The command MacroLocal works almost the same. The difference is that it evaluates its arguments before performing the action on it. This is needed in this case, because the variable foreachitem is bound to the variable to be used, and it is the variable it is bound to we want to make local, not "foreachitem" itself. MacroSet works similarly, it does the same as Set, except that it also first evaluates the first argument, thus setting the variable requested by the user of this function. The Macro... functions in the built-in functions generally perform the same action as their non-macro versions, apart from evaluating an argument it would otherwise not evaluate.

To see the function in action, you could type:
ForEach(i,{1,2,3}) [Write(i);NewLine();];
This should obviously write 1, 2 and 3, each on a new line.

Note: the variable names "foreach..." have been chosen so they won't get confused with normal variables you use. This is a major design flaw in this language. Suppose there was a local variable foreachitem, defined in the calling function, and used in foreachbody. These two would collide, and the interpreter would use only the last defined version. In general, when writing a function that calls Eval , it is a good idea to use variable names that can not easily be mistaken. This is generally the single largest cause of bugs when writing programs in Yacas. This issue should be addressed in the future.


Additional Syntactic Sugar

The parser is extended slightly to allow for fancier constructs.


Scope Of Variable Bindings

When setting variables or retrieving variable values, variables are automatically bound global by default. You can explicitly specify variables to be local. When invoking a function, a new stack frame is pushed for the locals, so the code inside a function body doesn't see the locals of the caller. A statement block can have its own locals that are not visible outside the block (blocks have the form [statement1(); statement2(); ] etc.).

You can tell the interpreter that a function can see the local variables from the calling environment by declaring UnFence(funcname, arity) on the specified function.