Contents:
Introduction
The Starting Point
Extra Definitions Required for Integration
Making Leaves for the Parse Tree
Operating on Leaves
Assigning to Vectors
Counting Vectors
Making Operators
Using These Definitions
Summary
Source Files
Vec3.h
Vec3Defs.in
Vec3.cpp
Introduction
This tutorial shows how to integrate a simple class representing 3-element vectors into PETE. The source files for this example are included in the examples/Vec3 directory of the PETE distribution. These files are:
The starting point for this tutorial is the 3-element vector class defined in Vec3.h. Most of this class's declaration is unremarkable. Each instance of the class contains a 3-element array of integers; the class's default constructor initializes their values to 0, while a non-default constructor can be used to give them particular initial values. A copy constructor is also provided, as are assignment operators taking either scalar or vector values. Finally, two versions of operator[] are provided, so that both constant and non-constant vectors can be indexed, and a print() method is defined for operator<< and other I/O routines to use. operator<< is overloaded further down, on lines 115-119.
If you do not understand all of the definitions on lines 22-58 and 78-94 of Vec3.h, you may wish to become more familiar with C++ before proceeding with these tutorials.
In order to understand the particulars of this tutorial, it is necessary to know about some of the indirection classes that PETE uses. The most important of these is a general wrapper called Expression<>. It exists to wrap UnaryNode<>, BinaryNode<> and TrinaryNode<> so that they can all be captured during template expansion by a single type, namely Expression<T>. As we shall see below, the Expression<> template also serves to distinguish PETE expressions from other expressions, so that the compiler will not inadvertently mix PETE expressions with normal arithmetic.
PETE's second indirection class is called
MakeReturn<>. For any type T,
MakeReturn<T>::Expression_t is a typedef that
produces the type of value returned by expressions on T. In
the general case, this is simply Expression<T>,
i.e. MakeReturn<> just wraps the expression produced by
an operator so that it can be used inside other operators. The POOMA library overrides
MakeReturn<T> (by specializing it explicitly) so that
expressions involving POOMA arrays generate new arrays. This
technique is similar to the standard C++ idiom of having a framework
define one or more virtual methods with empty bodies, and call them at
specified times, so that users can derive from the framework classes
and override those virtual methods to insert their own code in the
framework's processing stream.
Extra Definitions Required for Integration
In order to use Vec3 with PETE, we must provide three things:
In addition, this example shows how to add new capabilities to PETE
by creating a mechanism for counting the number of instances of
Vec3 involved in an expression. The same kind of mechanism
can be used to (for example) check that all of the vectors in an
expression have the same length before starting evaluation of that
expression.
Making Leaves for the Parse Tree
PETE uses a traits class called CreateLeaf<> to record information about the leaf types on which it operates. Each specialization of CreateLeaf<> must be able to answer two questions:
The first question is answered by providing a typedef for the name Leaf_t. In our example, we want to have access to Vec3's member functions. However, Vec3 has deep copy semantics so we don't want store actual Vec3 objects at the leaves, thereby making copies and negating most of the benefit of expression templates. Instead, we store a Vec3&. This is accomplished by wrapping the Vec3 class in a Reference<> wrapper. By default, PETE stores leaves by value, which is appropriate for leaves that hold iterators. In this case we would not have to make use of the Reference<> wrapper. Once we have done this, the rest of PETE can be written using expressions like CreateLeaf<T>::Leaf_t. This allows PETE to work with classes that are added later, in the same way that making a function virtual allows programs that use a library to add new functionality without re-writing old code.
Making a leaf is a little bit trickier. Every specialization of CreateLeaf<> must define a static method called make() that takes something of the specialization's input type as an argument, and returns something of its leaf type. This method must be static so that it can be called without an instance of CreateLeaf<Vec3> ever having been created; as with most traits classes, the specializations of CreateLeaf<> exist only to answer questions. In this example, the input to make() is a constant reference to a Vec3, which the function simply returns wrapped in a Reference<Vec3> object. In the case of an STL list, the argument would have type List<T>, but the return type Leaf_t might be an iterator type.
103 template<> 104 struct CreateLeaf<Vec3> 105 { 106 typedef Reference<Vec3> Leaf_t; 107 inline static 108 Leaf_t make(const Vec3 &a) { return Leaf_t(a); } 109 };
Our next task is to provide a way for PETE expressions to extract values from Vec3s. This has to be done in a generic way, so that (for example) scalars can return the same value each time they are queried, while STL lists are accessed through bidirectional iterators and Vec3s are accessed by integer indexing.
PETE's solution to this problem is to require programmers to specialize the traits class LeafFunctor<> for the combination of their leaf class and a tag class called EvalLeaf1. EvalLeaf1 is a simple class defined by PETE, whose only purpose is to contain the single index that PETE wants to evaluate an expression at. (EvalLeaf2 signals that a doubly-indexed expression is being evaluated, and so on up to EvalLeaf7.)
The specialization of LeafFunctor<>, shown below, does two things. First, it defines the type of the result of the expression as Type_t. Second, it defines a static method called apply(), which uses the index stored in its EvalLeaf1 argument and returns the corresponding element of the Vec3:
127 template<> 128 struct LeafFunctor<Vec3, EvalLeaf1> 129 { 130 typedef int Type_t; 131 static Type_t apply(const Vec3 &a, const EvalLeaf1 &f) 132 { return a[f.val1()]; } 133 };
The last step in making Vec3 PETE-compatible is to provide a way for PETE to assign to a Vec3 from an arbitrary expression. This is done by overloading operator= to take a PETE expression as input, and copy values into its owner:
064 template<class RHS> 065 Vec3 &operator=(const Expression<RHS> &rhs) 066 { 067 d[0] = forEach(rhs, EvalLeaf1(0), OpCombine()); 068 d[1] = forEach(rhs, EvalLeaf1(1), OpCombine()); 069 d[2] = forEach(rhs, EvalLeaf1(2), OpCombine()); 070 071 return *this; 072 }
The first thing to notice about this method is that it is templated on an arbitrary class RHS, but its single formal parameter has type Expression<RHS>. This combination means that the compiler can match it against anything that is wrapped in the generic PETE template Expression<>, and only against things that are wrapped in that way. The compiler cannot match against int, complex<short>, or GreatAuntJane_t, since these do not have the form Expression<RHS> for some type RHS.
The forEach function is used to traverse expression trees.
The first argument is the expression. The second argument is the leaf
tag denoting the operation applied at the leaves.
The third argument is a combiner tag,
which is used to combine results at non-leaf nodes. By passing
EvalLeaf1(0) in line 67, we are indicating that we want the
Vec3s at the leaves to return the element at index 0. The
LeafFunctor<Scalar<T>, EvalLeaf1> (defined
inside of PETE) ensures that scalars return their value no matter the
index. While EvalLeaf1 obtains values from the leaves,
OpCombine takes these values and combines them according to the
operators present at the non-leaf nodes. The result is that line 67
evaluates the expression on the right side of the assignment operator at
index 0. Line 68 does this at index 1, and so on. Once evaluation is complete,
operator= returns the Vec3 to which values have been
assigned, in keeping with normal C++ conventions.
Counting Vectors
We could stop at this point, but in order to show off PETE's flexibility, we will finish by defining a new leaf tag that counts the number of Vec3s in an arbitrary expression. The required definitions, on lines 152-168 of Vec3.h, are:
141 struct CountLeaf { }; 142 143 template<> 144 struct LeafFunctor<Vec3, CountLeaf> 145 { 146 typedef int Type_t; 147 static Type_t apply(const Vec3 &, const CountLeaf &) 148 { return 1; } 149 }; 150 151 template<class T> 152 struct LeafFunctor<T, CountLeaf> 153 { 154 typedef int Type_t; 155 static Type_t apply(const T &a, const CountLeaf &) 156 { return 0; } 157 };
CountLeaf is an empty tag class, whose only purpose is
to identify the operation we wish to carry out.
LeafFunctor<> is then specialized separately for
CountLeaf on both Vec3 and the generic type
T. Applying the first specialization returns 1, since it
wraps a single Vec3. Applying the second specialization
returns 0, since nothing else counts as a vector.
Making Operators
We have now provided almost everything that PETE needs in order to operate on our Vec3 class. All that remains is several thousand lines of templated operator and function definitions. Luckily, these can be generated automatically from a few lines of information.
The file Vec3Defs.in stores exactly that information, and is used by PETE's MakeOperators tool to generate the 3740 lines of Vec3Operators.h. The special notation "[n]" is replaced by the digits 1, 2, and so on to distinguish multiple uses of the same argument. Thus, "class T[n]" becomes "class T1", "class T2", and so on as needed. The entire specification file is:
001 classes 002 ----- 003 ARG = "" 004 CLASS = "Vec3"
Two values are specified for each of the classes for which definitions are to be generated:
The command used to build an operator definition file from this input specification is:
MakeOperators --classes Vec3Defs.in --guard VEC3OPS_H --o Vec3Operators.h
Vec3Defs.in is the file shown above. The symbol VEC3OPS_H is copied into the output to guard against multiple inclusion, so that the output file has the form:
#ifndef VEC3OPS_H #define VEC3OPS_H // ...contents of file... #endif // VEC3OPS_H
Further information about MakeOperators and its command-line
argument can be found on its man page.
Using These Definitions
With all this out of the way, we can now write arithmetic expressions that use Vec3, and rely on PETE to optimize them for us. The file Vec3.cpp shows some of the possibilities. The simplest example is straightforward addition and assignment:
013 a = b + c;
which would automatically be translated into something equivalent to:
a[0] = b[0] + c[0]; a[1] = b[1] + c[1]; a[2] = b[2] + c[2];
This snippet would make use of the overloaded operator+() generated by the MakeOperators tool in the Vec3Operators.h file and the assignment operator defined above.
This expression could be made much more complex, and PETE would still eliminate redundant temporaries or loops. One such expression is:
a = sqrt(b*b + c*c);
which would automatically be translated into something equivalent to:
a[0] = sqrt(b[0]*b[0] + c[0]*c[0]); a[1] = sqrt(b[1]*b[1] + c[1]*c[1]); a[2] = sqrt(b[2]*b[2] + c[2]*c[2]);
since PETE provides appropriately-templated overloadings of the standard mathematical functions like sqrt() and acos() as well as overloadings of unary and binary operators.
The next two examples in Vec3.cpp make use of an expression (in this case, the addition of b and c) that has been recorded for delayed evaluation. In order to do this, the programmer must explicitly specify the type of the expression being stored, but once this has been done, that expression can be re-used any number of times. The statement that creates the expression is:
018 const Expression<BinaryNode<OpAdd, Vec3, Vec3> > &expr1 = b + c;
Its first use is as a source for assignment:
019 d = expr1;
It can also be passed to PETE's explicit evaluation function, called forEach(), along with the CountLeaf defined earlier, and PETE's built-in SumCombine tag class, in order to count the number of instances of Vec3 that appear in the expression:
022 int num = forEach(expr1, CountLeaf(), SumCombine());
Note the parentheses after CountLeaf and SumCombine. C++ does not allow raw type names to be used to instantiate templates; instead, the program must create an unnamed instance of each tag class by invoking their default constructors. Since these classes contain no data, and their instances are not used inside forEach(), the compiler optimizes away all of the associated code.
The remaining examples in Vec3.cpp
use CountLeaf to inspect more complicated expressions.
Summary
This tutorial has shown how to integrate a simple class into PETE's expression template framework, so that compilers can optimize expressions involving instances of that class. The steps required are:
In addition, this tutorial showed how to extend PETE's leaf and
combiner tags to calculate values on expression trees during compilation.
The next tutorial will look at how to extend the PETE framework itself
to synthesize new types.
Source Files
Vec3.h
001 #ifndef PETE_EXAMPLES_VEC3_VEC3_H 002 #define PETE_EXAMPLES_VEC3_VEC3_H 003 004 //----------------------------------------------------------------------------- 005 // Include files 006 //----------------------------------------------------------------------------- 007 008 #include "PETE/PETE.h" 009 010 #include <iostream.h> 011 012 //----------------------------------------------------------------------------- 013 // 014 // CLASS NAME 015 // Vec3 016 // 017 // DESCRIPTION 018 // A "tiny" three-element expression-template (ET) array class. 019 // 020 //----------------------------------------------------------------------------- 021 022 class Vec3 023 { 024 public: 025 026 //--------------------------------------------------------------------------- 027 // Constructors and Destructor 028 //--------------------------------------------------------------------------- 029 030 Vec3() { d[0] = d[1] = d[2] = 0; } 031 032 Vec3(int i, int j, int k) 033 { 034 d[0] = i; d[1] = j; d[2] = k; 035 } 036 037 Vec3(const Vec3 &v) 038 { 039 d[0] = v.d[0]; d[1] = v.d[1]; d[2] = v.d[2]; 040 } 041 042 ~Vec3() {} 043 044 //--------------------------------------------------------------------------- 045 // Vec3 and scalar assigment operators 046 //--------------------------------------------------------------------------- 047 048 Vec3 &operator=(const Vec3 &v) 049 { 050 d[0] = v.d[0]; d[1] = v.d[1]; d[2] = v.d[2]; 051 return *this; 052 } 053 054 Vec3 &operator=(int i) 055 { 056 d[0] = d[1] = d[2] = i; 057 return *this; 058 } 059 060 //--------------------------------------------------------------------------- 061 // Assignment operator taking expression: 062 //--------------------------------------------------------------------------- 063 064 template<class RHS> 065 Vec3 &operator=(const Expression<RHS> &rhs) 066 { 067 d[0] = forEach(rhs, EvalLeaf1(0), OpCombine()); 068 d[1] = forEach(rhs, EvalLeaf1(1), OpCombine()); 069 d[2] = forEach(rhs, EvalLeaf1(2), OpCombine()); 070 071 return *this; 072 } 073 074 //--------------------------------------------------------------------------- 075 // Indexing operators 076 //--------------------------------------------------------------------------- 077 078 int &operator[](int i) { return d[i]; } 079 int operator[](int i) const { return d[i]; } 080 081 //--------------------------------------------------------------------------- 082 // Print method used by operator<< free function. 083 //--------------------------------------------------------------------------- 084 085 void print(ostream &os) const 086 { 087 os << "{" << d[0] << "," << d[1] << "," << d[2] << "}"; 088 } 089 090 private: 091 092 // The underlying complicated data structure 093 094 int d[3]; 095 096 }; 097 098 //----------------------------------------------------------------------------- 099 // We need to specialize CreateLeaf<T> for our class, so that operators 100 // know what to stick in the leaves of the expression tree. 101 //----------------------------------------------------------------------------- 102 103 template<> 104 struct CreateLeaf<Vec3> 105 { 106 typedef Reference<Vec3> Leaf_t; 107 inline static 108 Leaf_t make(const Vec3 &a) { return Leaf_t(a); } 109 }; 110 111 //----------------------------------------------------------------------------- 112 // ostream inserter for Vec3s 113 //----------------------------------------------------------------------------- 114 115 ostream &operator<<(ostream &os, const Vec3 &a) 116 { 117 a.print(os); 118 return os; 119 } 120 121 //----------------------------------------------------------------------------- 122 // Specialization of LeafFunctor class for applying the EvalLeaf1 123 // tag to a Vec3. The apply method simply returns the array 124 // evaluated at the point. 125 //----------------------------------------------------------------------------- 126 127 template<> 128 struct LeafFunctor<Vec3, EvalLeaf1> 129 { 130 typedef int Type_t; 131 static Type_t apply(const Vec3 &a, const EvalLeaf1 &f) 132 { return a[f.val1()]; } 133 }; 134 135 //----------------------------------------------------------------------------- 136 // Specialization of LeafFunctor class for applying the CountLeaf 137 // tag to a Vec3. The apply method simply returns 1 for a Vec3 and 0 for 138 // anything else. 139 //----------------------------------------------------------------------------- 140 141 struct CountLeaf { }; 142 143 template<> 144 struct LeafFunctor<Vec3, CountLeaf> 145 { 146 typedef int Type_t; 147 static Type_t apply(const Vec3 &, const CountLeaf &) 148 { return 1; } 149 }; 150 151 template<class T> 152 struct LeafFunctor<T, CountLeaf> 153 { 154 typedef int Type_t; 155 static Type_t apply(const T &a, const CountLeaf &) 156 { return 0; } 157 }; 158 159 // We put this include at the end because 160 // the operators can't be defined until after Vec3 and 161 // CreateLeaf<Vec3> have been defined. 162 // (Since Vec3 isn't templated the operators aren't just 163 // templates.) 164 165 #include "Vec3Operators.h" 166 167 #endif // PETE_EXAMPLES_VEC3_VEC3_H
001 classes 002 ----- 003 ARG = "" 004 CLASS = "Vec3"
001 #include "Vec3.h" 002 003 int main() 004 { 005 Vec3 a, b, c; 006 007 c = 4; 008 009 b[0] = -1; 010 b[1] = -2; 011 b[2] = -3; 012 013 a = b + c; 014 015 cout << a << endl; 016 017 Vec3 d; 018 const Expression<BinaryNode<OpAdd, Vec3, Vec3> > &expr1 = b + c; 019 d = expr1; 020 cout << d << endl; 021 022 int num = forEach(expr1, CountLeaf(), SumCombine()); 023 cout << num << endl; 024 025 const Expression<BinaryNode<OpAdd, Vec3, 026 BinaryNode<OpMultiply, Scalar<int>, Vec3> > > &expr2 = b + 3 * c; 027 num = forEach(expr2, CountLeaf(), SumCombine()); 028 cout << num << endl; 029 030 const Expression<BinaryNode<OpAdd, Vec3, 031 BinaryNode<OpMultiply, Vec3, Vec3> > > &expr3 = b + c * d; 032 num = forEach(expr3, CountLeaf(), SumCombine()); 033 cout << num << endl; 034 }
[Prev] | [Home] | [Next] |