In my previous entry I showed how to iterate the (heterogeneous) list of the members of a class in D. The natural evolution of this is to recursively traverse the whole tree given by an object: member’s members, arrays etc. This in fact would make the algorithm much more interesting and useful for a number of cases, serialization and debugging just to name a couple. It’s not even too hard to do, so I will show the improved version, and a couple of use cases.

Recursive Iteration

To keep it generic enough we can split the code in two parts: a pure data-walking function, and a visitor object which will perform the desired computation on every instance.

When adding recursion there are some things to decide:

  • Should an object be analyzed before or after its members, after, or both?
  • Do we have to walk the entire tree, or just a part ? In the latter case how we signal that we want to stop?
  • How do we handle cycles in the structure ? In other words how do we avoid infinite loops ?

For the first point, it’s probably best to allow both, and this can be achieved by requiring the visitor to have two methods, one to be run before any member of the class, and one after. For the other two points above, we can require the first of the two methods to return a value that says if the walker should stop or go ahead; with some work every implementation can decide how and when to stop.

The visitor interface is quite simple:

interface Visitor
{
public:
	bool begin(P,T)(P parent, T t, string id);
	void end(P,T)(P parent, T t, string id);
}

We pass not only the object to be used, but also the object that contains it, and the name of the field (if there is one). The code of the walker is instead as follows

string[] getMembers(T)()
{
	return [__traits(allMembers,T)]
		.filter!( a => a != "Monitor" )
		.filter!( a => a != "factory" )
		.array();
}

string eachFieldWorker(string klass, string var, string expr,string[] members)
{
	string result = "";
	foreach (string m; members)
	{
		string condition = format("!__traits(isVirtualFunction,%s.%s)
		 && !__traits(isFinalFunction,%s.%s)
		 && !__traits(isVirtualMethod,%s.%s) ", klass, m, klass, m, klass, m);
		result ~= (format( "static if (%s) \n" );

	}
	return result;
}

void eachField(T,string expr, U...)(T value, U args)
{
	mixin(eachFieldWorker(__traits(identifier,T),"value", expr, getMembers!T()));
}


void walkObject(P,T,V)(P parent, T t, V v, string fieldName = null)
{
	static if (isBasicType!T) 
	{
		v.begin(parent, t, fieldName);
	}
	else static if (isArray!T) 
	{
		bool cont = v.begin(parent, t, fieldName);
		if (cont)
		{
			foreach( u; t)
			{
				walkObject!(T,typeof(u),V)(t,u,v,null);
			}
		}
		v.end(parent,t, fieldName);
	}
	else static if (isCallable!T) 
	{
		assert(false,"Cannot visit callables");
	}
	else static if (isBasicType!T == false &&
			   isArray!T == false &&
			   isCallable!T == false ) 
	{
		bool cont = v.begin(parent, t, fieldName);
		if (cont)
		{
			eachField!(T,"((a) => walkObject!(typeof(args[0]),typeof(a), typeof(args[1])) (args[0],a,args[1],memberName))")(t,parent,v);
		}
		v.end(parent, t, fieldName);
	}
	else
	{
		assert(false,"Unknown object type " ~ to!string(T));
	}
}

The part iterating over the members is almost the same junk I wrote last time: ugly, convoluted but effective; the walker itself is luckily much simpler, and goes as follows:

  • Execute the begin on the object
  • If the object has members and begin returned true, go on with the members
  • Execute the end on the object

An example: serialization

The walker can be used to implement serialization of arbitrary objects. As an example, a wrote a class to generate XML output:

class XmlSerializer : Visitor
{
public:
	this(string filename)
	{
		_out = File(filename,"w");
	}

	bool begin(P,T)(P parent, T t, string id)
	{
		static if (is(T == string))
		{
			writeMargin();
			_out.write(format("<string id='%s'>%s</string>\n",encode(id),encode(t)));
			return false;
		}
		else static if (isArray!T)
		{
			writeMargin();
			_out.write(format("<array id='%s' class='%s'>\n", encode(id), encode(fullyQualifiedName!T)));
			moreMargin();
		}
		else static if (isBasicType!T)
		{
			writeMargin();
			_out.write(format("<%s id='%s'>%s</%s>\n", fullyQualifiedName!T, id, t, fullyQualifiedName!T));
		}
		else
		{
			writeMargin();
			_out.write(format("<class type='%s' id='%s'>\n", encode(t.classinfo.name), encode(id)));
			moreMargin();
		}

		return true;
	 }

	void end(P,T)(P parent, T t, string id)
	{
		static if (is(T == string))
		{
			return;
		}
		else static if (isArray!T)
		{
			lessMargin();
			writeMargin();
			_out.write("</array>\n");

		}
		else static if (isBasicType!T)
		{
		}
		else
		{
			lessMargin();
			writeMargin();
			_out.write("</class>\n");
		}
	}

private:

	void writeMargin()
	{
		for (int i=0; i<_margin; ++i)
			_out.write(' ');
	}

	void moreMargin()
	{
		_margin += MARGIN_DELTA;
	}

	void lessMargin()
	{
		_margin -= MARGIN_DELTA;
		assert(_margin >= 0, "Negative margin, call to moreMargin() missing?");
	}

	File _out;
	int _margin = 0;
	static immutable int MARGIN_DELTA=4;
};

I hadn’t time to find (and learn) a XML library for D, so I’m writing from scratch, but it should quite easy to understand: primitive types are written “as they are”; objects and array instead becomes the nodes of the XML tree. With this, with less than 200 lines of code we have a fairly functional serializing library. Something is still missing (handling of pointers for example), but beside that it can be used to serialize arbitrary data, without need to implement interfaces or to decorate any member in any class. Not too bad ! (Of course, one would need to write the deserializing function… I may write a post about this later).

Let try it with an example. The following code

class Foo
{
	int x,y;
	void setZ(float z)
	{
		this.z = z;
	}

	string text;
	int [] values = [1,2,3];

	float getZ(){ return z; }

	Foo[] foos = [];

private:
	float z;
};

void test()
{
  Foo f = new Foo();
  f.x = 1;
  f.y = 2;
  f.setZ(3.14);
  f.text= "foo <> &bar ";
  f.foos ~= new Foo();
  f.foos ~= new Foo();
  auto dw = new XmlSerializer("test.xml");
  walkObject!(Foo, Foo,XmlSerializer)(null, f,dw);
}

Generates the following output:

<class type='serializable.Foo' id=''>
	<int id='x'>1</int>
	<int id='y'>2</int>
	<string id='text'>foo &lt;&gt; &amp;bar </string>
	<array id='values' class='int[]'>
		<int id=''>1</int>
		<int id=''>2</int>
		<int id=''>3</int>
	</array>
	<array id='foos' class='serializable.Foo[]'>
		<class type='serializable.Foo' id=''>
			<int id='x'>0</int>
			<int id='y'>0</int>
			<string id='text'></string>
			<array id='values' class='int[]'>
				<int id=''>1</int>
				<int id=''>2</int>
				<int id=''>3</int>
			</array>
			<array id='foos' class='serializable.Foo[]'>
			</array>
			<float id='z'>nan</float>
		</class>
		<class type='serializable.Foo' id=''>
			<int id='x'>0</int>
			<int id='y'>0</int>
			<string id='text'></string>
			<array id='values' class='int[]'>
				<int id=''>1</int>
				<int id=''>2</int>
				<int id=''>3</int>
			</array>
			<array id='foos' class='serializable.Foo[]'>
			</array>
			<float id='z'>nan</float>
		</class>
	</array>
	<float id='z'>3.14</float>
</class>

Of course this is just an example of what could be achieved, a proper output should be a bit different and cleaner.

Issues

The walker could make for an interesting library, and probably I’ll make it, but there are still some problems.

Incomplete

Not all cases are handled, pointers are a glaring example; shouldn’t be too difficult, but I have not decided yet how to better approach them.

Code size

The code generate is really a lot. For this simple example we are talking about 500KB just for the object code (file .o). The fact that walkObject has three template parameters probably doesn’t help, but it seems a bit too much to me.

Awful code

The code is functional but not at all beautiful and difficult to mantain. I complained enough about the mixin code last time, so I’m not going to repeat myself :), but to that I’d also add having to use the visitor as a template parameter: since begin and end are template methods, using just the interface Visitor as argument won’t work, and the compiler will complain about not being able to find the code when linking.

Cycles cycles cycles […]

Cycles are not handled, the visitor must handle this by itself.

Custom behaviour

It’s not easy to customize the behaviour for a specific inspected class, since it seems one can’t override a method for specific template arguments, I’ll have to think about alternative solutions.

More rants about the compiler…

Sometimes it is hard to get right the generated code; I can print the code that I mixin, but it is without context, so bugs are harder to find. A compiler flag to output the expanded code would be much appreciated.

Also, it seems that the compiler (DMD 2.063) already does something similar, seeing that when I use mixins the error lines are usually messed up (which adds to the confusion when debugging, I must add), so maybe it is not an impossible task.

When abusing metaprogramming, sometimes the compiler crashes (or triggers an assert). That doesn’t give complete confidence on the generated code. And now ?

I hope the walker code will be useful to someone (not many people reading me I guess… but who knows ? :) ). In time I’ll try to make the it more complete, while fixing the above issues.

The serializer by itself could be a nice library, once implemented properly (the deserializer too). In particular I aim to implement a version with binary serialization, since it could prove useful for a couple of ideas I’m working on.