PETE Tutorial 1
Incorporating a Simple Vector Class

Contents:
    Introduction
    The Starting Point
    Extra Definitions Required for Integration
        Making Leaves for the Parse Tree
        Operating on Leaves
        Assigning to Vectors
        Counting Vectors
    Making Operators
    Using These Definitions
    Summary
    Source Files
        Vec3.h
        Vec3Defs.in
        Vec3.cpp

Introduction

This tutorial shows how to integrate a simple class representing 3-element vectors into PETE. The source files for this example are included in the examples/Vec3 directory of the PETE distribution. These files are:

The Starting Point

The starting point for this tutorial is the 3-element vector class defined in Vec3.h. Most of this class's declaration is unremarkable. Each instance of the class contains a 3-element array of integers; the class's default constructor initializes their values to 0, while a non-default constructor can be used to give them particular initial values. A copy constructor is also provided, as are assignment operators taking either scalar or vector values. Finally, two versions of operator[] are provided, so that both constant and non-constant vectors can be indexed, and a print() method is defined for operator<< and other I/O routines to use. operator<< is overloaded further down, on lines 115-119.

If you do not understand all of the definitions on lines 22-58 and 78-94 of Vec3.h, you may wish to become more familiar with C++ before proceeding with these tutorials.

In order to understand the particulars of this tutorial, it is necessary to know about some of the indirection classes that PETE uses. The most important of these is a general wrapper called Expression<>. It exists to wrap UnaryNode<>, BinaryNode<> and TrinaryNode<> so that they can all be captured during template expansion by a single type, namely Expression<T>. As we shall see below, the Expression<> template also serves to distinguish PETE expressions from other expressions, so that the compiler will not inadvertently mix PETE expressions with normal arithmetic.

PETE's second indirection class is called MakeReturn<>. For any type T, MakeReturn<T>::Expression_t is a typedef that produces the type of value returned by expressions on T. In the general case, this is simply Expression<T>, i.e. MakeReturn<> just wraps the expression produced by an operator so that it can be used inside other operators. The POOMA library overrides MakeReturn<T> (by specializing it explicitly) so that expressions involving POOMA arrays generate new arrays. This technique is similar to the standard C++ idiom of having a framework define one or more virtual methods with empty bodies, and call them at specified times, so that users can derive from the framework classes and override those virtual methods to insert their own code in the framework's processing stream.

Extra Definitions Required for Integration

In order to use Vec3 with PETE, we must provide three things:

  1. A description of the Vec3 class.
  2. A way to extract values from instances of Vec3.
  3. A way to assign to a Vec3 from a PETE expression.
These three issues are discussed in order below.

In addition, this example shows how to add new capabilities to PETE by creating a mechanism for counting the number of instances of Vec3 involved in an expression. The same kind of mechanism can be used to (for example) check that all of the vectors in an expression have the same length before starting evaluation of that expression.

Making Leaves for the Parse Tree

PETE uses a traits class called CreateLeaf<> to record information about the leaf types on which it operates. Each specialization of CreateLeaf<> must be able to answer two questions:

  1. What is the type of the leaf?
  2. How can the program make an instance of this leaf?

The first question is answered by providing a typedef for the name Leaf_t. In our example, we want to have access to Vec3's member functions. However, Vec3 has deep copy semantics so we don't want store actual Vec3 objects at the leaves, thereby making copies and negating most of the benefit of expression templates. Instead, we store a Vec3&. This is accomplished by wrapping the Vec3 class in a Reference<> wrapper. By default, PETE stores leaves by value, which is appropriate for leaves that hold iterators. In this case we would not have to make use of the Reference<> wrapper. Once we have done this, the rest of PETE can be written using expressions like CreateLeaf<T>::Leaf_t. This allows PETE to work with classes that are added later, in the same way that making a function virtual allows programs that use a library to add new functionality without re-writing old code.

Making a leaf is a little bit trickier. Every specialization of CreateLeaf<> must define a static method called make() that takes something of the specialization's input type as an argument, and returns something of its leaf type. This method must be static so that it can be called without an instance of CreateLeaf<Vec3> ever having been created; as with most traits classes, the specializations of CreateLeaf<> exist only to answer questions. In this example, the input to make() is a constant reference to a Vec3, which the function simply returns wrapped in a Reference<Vec3> object. In the case of an STL list, the argument would have type List<T>, but the return type Leaf_t might be an iterator type.

103  template<>
104  struct CreateLeaf<Vec3>
105  {
106    typedef Reference<Vec3> Leaf_t;
107    inline static
108    Leaf_t make(const Vec3 &a) { return Leaf_t(a); }
109  };

Operating on Leaves

Our next task is to provide a way for PETE expressions to extract values from Vec3s. This has to be done in a generic way, so that (for example) scalars can return the same value each time they are queried, while STL lists are accessed through bidirectional iterators and Vec3s are accessed by integer indexing.

PETE's solution to this problem is to require programmers to specialize the traits class LeafFunctor<> for the combination of their leaf class and a tag class called EvalLeaf1. EvalLeaf1 is a simple class defined by PETE, whose only purpose is to contain the single index that PETE wants to evaluate an expression at. (EvalLeaf2 signals that a doubly-indexed expression is being evaluated, and so on up to EvalLeaf7.)

The specialization of LeafFunctor<>, shown below, does two things. First, it defines the type of the result of the expression as Type_t. Second, it defines a static method called apply(), which uses the index stored in its EvalLeaf1 argument and returns the corresponding element of the Vec3:

127  template<>
128  struct LeafFunctor<Vec3, EvalLeaf1>
129  {
130    typedef int Type_t;
131    static Type_t apply(const Vec3 &a, const EvalLeaf1 &f)
132      { return a[f.val1()]; }
133  };

Assigning to Vectors

The last step in making Vec3 PETE-compatible is to provide a way for PETE to assign to a Vec3 from an arbitrary expression. This is done by overloading operator= to take a PETE expression as input, and copy values into its owner:

064    template<class RHS>
065    Vec3 &operator=(const Expression<RHS> &rhs)
066    {
067      d[0] = forEach(rhs, EvalLeaf1(0), OpCombine());
068      d[1] = forEach(rhs, EvalLeaf1(1), OpCombine());
069      d[2] = forEach(rhs, EvalLeaf1(2), OpCombine());
070  
071      return *this;
072    }

The first thing to notice about this method is that it is templated on an arbitrary class RHS, but its single formal parameter has type Expression<RHS>. This combination means that the compiler can match it against anything that is wrapped in the generic PETE template Expression<>, and only against things that are wrapped in that way. The compiler cannot match against int, complex<short>, or GreatAuntJane_t, since these do not have the form Expression<RHS> for some type RHS.

The forEach function is used to traverse expression trees. The first argument is the expression. The second argument is the leaf tag denoting the operation applied at the leaves. The third argument is a combiner tag, which is used to combine results at non-leaf nodes. By passing EvalLeaf1(0) in line 67, we are indicating that we want the Vec3s at the leaves to return the element at index 0. The LeafFunctor<Scalar<T>, EvalLeaf1> (defined inside of PETE) ensures that scalars return their value no matter the index. While EvalLeaf1 obtains values from the leaves, OpCombine takes these values and combines them according to the operators present at the non-leaf nodes. The result is that line 67 evaluates the expression on the right side of the assignment operator at index 0. Line 68 does this at index 1, and so on. Once evaluation is complete, operator= returns the Vec3 to which values have been assigned, in keeping with normal C++ conventions.

Counting Vectors

We could stop at this point, but in order to show off PETE's flexibility, we will finish by defining a new leaf tag that counts the number of Vec3s in an arbitrary expression. The required definitions, on lines 152-168 of Vec3.h, are:

141  struct CountLeaf { };
142  
143  template<>
144  struct LeafFunctor<Vec3, CountLeaf>
145  {
146    typedef int Type_t;
147    static Type_t apply(const Vec3 &, const CountLeaf &)
148      { return 1; }
149  };
150  
151  template<class T>
152  struct LeafFunctor<T, CountLeaf>
153  {
154    typedef int Type_t;
155    static Type_t apply(const T &a, const CountLeaf &)
156      { return 0; }
157  };

CountLeaf is an empty tag class, whose only purpose is to identify the operation we wish to carry out. LeafFunctor<> is then specialized separately for CountLeaf on both Vec3 and the generic type T. Applying the first specialization returns 1, since it wraps a single Vec3. Applying the second specialization returns 0, since nothing else counts as a vector.

Making Operators

We have now provided almost everything that PETE needs in order to operate on our Vec3 class. All that remains is several thousand lines of templated operator and function definitions. Luckily, these can be generated automatically from a few lines of information.

The file Vec3Defs.in stores exactly that information, and is used by PETE's MakeOperators tool to generate the 3740 lines of Vec3Operators.h. The special notation "[n]" is replaced by the digits 1, 2, and so on to distinguish multiple uses of the same argument. Thus, "class T[n]" becomes "class T1", "class T2", and so on as needed. The entire specification file is:

001  classes
002  -----
003    ARG   = ""
004    CLASS = "Vec3"

Two values are specified for each of the classes for which definitions are to be generated:

The command used to build an operator definition file from this input specification is:

MakeOperators --classes Vec3Defs.in --guard VEC3OPS_H --o Vec3Operators.h

Vec3Defs.in is the file shown above. The symbol VEC3OPS_H is copied into the output to guard against multiple inclusion, so that the output file has the form:

#ifndef VEC3OPS_H
#define VEC3OPS_H

// ...contents of file...

#endif // VEC3OPS_H

Further information about MakeOperators and its command-line argument can be found on its man page.

Using These Definitions

With all this out of the way, we can now write arithmetic expressions that use Vec3, and rely on PETE to optimize them for us. The file Vec3.cpp shows some of the possibilities. The simplest example is straightforward addition and assignment:

013    a = b + c;

which would automatically be translated into something equivalent to:

a[0] = b[0] + c[0];
a[1] = b[1] + c[1];
a[2] = b[2] + c[2];

This snippet would make use of the overloaded operator+() generated by the MakeOperators tool in the Vec3Operators.h file and the assignment operator defined above.

This expression could be made much more complex, and PETE would still eliminate redundant temporaries or loops. One such expression is:

a = sqrt(b*b + c*c);

which would automatically be translated into something equivalent to:

a[0] = sqrt(b[0]*b[0] + c[0]*c[0]);
a[1] = sqrt(b[1]*b[1] + c[1]*c[1]);
a[2] = sqrt(b[2]*b[2] + c[2]*c[2]);

since PETE provides appropriately-templated overloadings of the standard mathematical functions like sqrt() and acos() as well as overloadings of unary and binary operators.

The next two examples in Vec3.cpp make use of an expression (in this case, the addition of b and c) that has been recorded for delayed evaluation. In order to do this, the programmer must explicitly specify the type of the expression being stored, but once this has been done, that expression can be re-used any number of times. The statement that creates the expression is:

018    const Expression<BinaryNode<OpAdd, Vec3, Vec3> > &expr1 = b + c;

Its first use is as a source for assignment:

019    d = expr1;

It can also be passed to PETE's explicit evaluation function, called forEach(), along with the CountLeaf defined earlier, and PETE's built-in SumCombine tag class, in order to count the number of instances of Vec3 that appear in the expression:

022    int num = forEach(expr1, CountLeaf(), SumCombine());

Note the parentheses after CountLeaf and SumCombine. C++ does not allow raw type names to be used to instantiate templates; instead, the program must create an unnamed instance of each tag class by invoking their default constructors. Since these classes contain no data, and their instances are not used inside forEach(), the compiler optimizes away all of the associated code.

The remaining examples in Vec3.cpp use CountLeaf to inspect more complicated expressions.

Summary

This tutorial has shown how to integrate a simple class into PETE's expression template framework, so that compilers can optimize expressions involving instances of that class. The steps required are:

In addition, this tutorial showed how to extend PETE's leaf and combiner tags to calculate values on expression trees during compilation. The next tutorial will look at how to extend the PETE framework itself to synthesize new types.

Source Files

Vec3.h


001  #ifndef PETE_EXAMPLES_VEC3_VEC3_H
002  #define PETE_EXAMPLES_VEC3_VEC3_H
003  
004  //-----------------------------------------------------------------------------
005  // Include files
006  //-----------------------------------------------------------------------------
007  
008  #include "PETE/PETE.h"
009  
010  #include <iostream.h>
011  
012  //-----------------------------------------------------------------------------
013  //
014  // CLASS NAME 
015  //   Vec3
016  //
017  // DESCRIPTION
018  //   A "tiny" three-element expression-template (ET) array class. 
019  //
020  //-----------------------------------------------------------------------------
021  
022  class Vec3
023  {
024  public:
025  
026    //---------------------------------------------------------------------------
027    // Constructors and Destructor
028    //---------------------------------------------------------------------------
029  
030    Vec3() { d[0] = d[1] = d[2] = 0; }
031  
032    Vec3(int i, int j, int k) 
033    {
034      d[0] = i; d[1] = j; d[2] = k;
035    }
036  
037    Vec3(const Vec3 &v) 
038    {
039      d[0] = v.d[0];  d[1] = v.d[1];  d[2] = v.d[2];
040    }
041  
042    ~Vec3() {}
043  
044    //---------------------------------------------------------------------------
045    // Vec3 and scalar assigment operators
046    //---------------------------------------------------------------------------
047  
048    Vec3 &operator=(const Vec3 &v) 
049    {
050      d[0] = v.d[0];  d[1] = v.d[1];  d[2] = v.d[2];
051      return *this;
052    }
053  
054    Vec3 &operator=(int i) 
055    {
056      d[0] = d[1] = d[2] = i;
057      return *this;
058    }
059  
060    //---------------------------------------------------------------------------
061    // Assignment operator taking expression:
062    //---------------------------------------------------------------------------
063  
064    template<class RHS>
065    Vec3 &operator=(const Expression<RHS> &rhs)
066    {
067      d[0] = forEach(rhs, EvalLeaf1(0), OpCombine());
068      d[1] = forEach(rhs, EvalLeaf1(1), OpCombine());
069      d[2] = forEach(rhs, EvalLeaf1(2), OpCombine());
070  
071      return *this;
072    }
073  
074    //---------------------------------------------------------------------------
075    // Indexing operators
076    //---------------------------------------------------------------------------
077  
078    int &operator[](int i)      { return d[i]; }
079    int operator[](int i) const { return d[i]; }
080  
081    //---------------------------------------------------------------------------
082    // Print method used by operator<< free function.
083    //---------------------------------------------------------------------------
084  
085    void print(ostream &os) const 
086    { 
087      os << "{" << d[0] << "," << d[1] << "," << d[2] << "}";
088    }
089  
090  private:
091  
092    // The underlying complicated data structure
093  
094    int d[3];
095  
096  };
097  
098  //-----------------------------------------------------------------------------
099  // We need to specialize CreateLeaf<T> for our class, so that operators
100  // know what to stick in the leaves of the expression tree.
101  //-----------------------------------------------------------------------------
102  
103  template<>
104  struct CreateLeaf<Vec3>
105  {
106    typedef Reference<Vec3> Leaf_t;
107    inline static
108    Leaf_t make(const Vec3 &a) { return Leaf_t(a); }
109  };
110  
111  //-----------------------------------------------------------------------------
112  // ostream inserter for Vec3s
113  //-----------------------------------------------------------------------------
114  
115  ostream &operator<<(ostream &os, const Vec3 &a)
116  {
117    a.print(os);
118    return os;
119  }
120  
121  //-----------------------------------------------------------------------------
122  // Specialization of LeafFunctor class for applying the EvalLeaf1
123  // tag to a Vec3. The apply method simply returns the array
124  // evaluated at the point.
125  //-----------------------------------------------------------------------------
126  
127  template<>
128  struct LeafFunctor<Vec3, EvalLeaf1>
129  {
130    typedef int Type_t;
131    static Type_t apply(const Vec3 &a, const EvalLeaf1 &f)
132      { return a[f.val1()]; }
133  };
134  
135  //-----------------------------------------------------------------------------
136  // Specialization of LeafFunctor class for applying the CountLeaf
137  // tag to a Vec3. The apply method simply returns 1 for a Vec3 and 0 for
138  // anything else.
139  //-----------------------------------------------------------------------------
140  
141  struct CountLeaf { };
142  
143  template<>
144  struct LeafFunctor<Vec3, CountLeaf>
145  {
146    typedef int Type_t;
147    static Type_t apply(const Vec3 &, const CountLeaf &)
148      { return 1; }
149  };
150  
151  template<class T>
152  struct LeafFunctor<T, CountLeaf>
153  {
154    typedef int Type_t;
155    static Type_t apply(const T &a, const CountLeaf &)
156      { return 0; }
157  };
158  
159  // We put this include at the end because
160  // the operators can't be defined until after Vec3 and
161  // CreateLeaf<Vec3> have been defined.
162  // (Since Vec3 isn't templated the operators aren't just
163  // templates.)
164  
165  #include "Vec3Operators.h"
166  
167  #endif // PETE_EXAMPLES_VEC3_VEC3_H

Vec3Defs.in


001  classes
002  -----
003    ARG   = ""
004    CLASS = "Vec3"

Vec3.cpp


001  #include "Vec3.h"
002  
003  int main()
004  {
005    Vec3 a, b, c;
006  
007    c = 4;
008  
009    b[0] = -1;
010    b[1] = -2;
011    b[2] = -3;
012  
013    a = b + c;
014  
015    cout << a << endl;
016  
017    Vec3 d;
018    const Expression<BinaryNode<OpAdd, Vec3, Vec3> > &expr1 = b + c;
019    d = expr1;
020    cout << d << endl;
021    
022    int num = forEach(expr1, CountLeaf(), SumCombine());
023    cout << num << endl;
024  
025    const Expression<BinaryNode<OpAdd, Vec3, 
026      BinaryNode<OpMultiply, Scalar<int>, Vec3> > > &expr2 = b + 3 * c;
027    num = forEach(expr2, CountLeaf(), SumCombine());
028    cout << num << endl;
029    
030    const Expression<BinaryNode<OpAdd, Vec3, 
031      BinaryNode<OpMultiply, Vec3, Vec3> > > &expr3 = b + c * d;
032    num = forEach(expr3, CountLeaf(), SumCombine());
033    cout << num << endl;
034  }


[Prev] [Home] [Next]
Copyright © Los Alamos National Laboratory 1999