Sankey Generator v0.3 and 0.4
Fourth time’s a charm! It took a while and a lot of mistakes, but I have my Sankey Generator working with a theoretically infinite number of XML nodes. Here’s how it went down.
While version 0.2 looked exactly how I wanted it to, it didn’t work exactly how I wanted it to. It ran through the values, generated all the arrays I needed for the drawing, and plotted them accordingly. However, it didn’t keep track of values like where each branch ends, which are very important if I want to plot a sub-level (which I did).
I knew that I would need some sort of recursion to handle infinitely nested XML values, so I decided to rewrite my code from scratch to make it completely recursive. This was version 0.3. It read through the XML file as it is written, line by line, checking for any child nodes before moving on to a sibling node. This made quick work of handling an infinite amount of data, but I encountered a new problem: because it went to children before going to the next sibling, I again could not keep track of the end position of each branch. Once I had run through all the children of a node and needed to draw the next sibling, I had to go back and get the coordinates of the end of the parent node.
My code quickly became cluttered with a ton of offsets and confusing numbers, driving me crazy! I am no fan of ugly code, for one, but most importantly, this wasn’t working. I solved this by creating a middle ground, a combination of recursive and level-by-level XML handling.
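For reference, here is a minimal Python sketch of the children-before-siblings order that version 0.3 used. It is not the generator's actual code: the `flow` element name, the `name` attribute, the sample XML and the `visit_depth_first` function are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real XML layout is not shown in this post.
SAMPLE = """
<flow name="total">
  <flow name="a">
    <flow name="a1"/>
    <flow name="a2"/>
  </flow>
  <flow name="b"/>
</flow>
""".strip()

def visit_depth_first(node, depth=0, order=None):
    """Record each node, then all of its descendants, before its next sibling."""
    if order is None:
        order = []
    order.append((depth, node.get("name")))
    for child in node:
        # Descend immediately: grandchildren are visited before the next sibling.
        visit_depth_first(child, depth + 1, order)
    return order

order = visit_depth_first(ET.fromstring(SAMPLE))
for depth, name in order:
    print("  " * depth + name)
```

Sibling `b` is only reached after everything under `a` has been visited, which is why the position where the parent branch ended has to be recovered at that point.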
In version 0.4, I created a class called BranchGroup. BranchGroup takes three parameters:
- The XML element containing the siblings that need to be drawn, plus any potential child nodes, their children, etc.
- How deep in the XML the node is, for calculating the curve’s X position
- The Y position of the end of the branch from which this BranchGroup branches. (Got it?)
It is also responsible for performing two functions:
- Drawing the curves for all the children (and only the children, not grandchildren or anything deeper). This works much the same way version 0.2 does.
- Going back through the children and seeing if they have any children. If they do, I send that child element into a new BranchGroup, which repeats the same process until there are no more children. This works much the same way version 0.3 does.
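The two responsibilities above can be sketched roughly like this in Python. This is only a sketch under assumptions, not my actual class: the `node` element, the `name` and `value` attributes, the `COLUMN_WIDTH` constant and the simple stacked Y layout are hypothetical, and "drawing" is reduced to recording each curve in a list.

```python
import xml.etree.ElementTree as ET

class BranchGroup:
    COLUMN_WIDTH = 100  # hypothetical horizontal spacing per XML level

    def __init__(self, element, depth, start_y, curves=None):
        self.element = element    # XML element whose children get drawn
        self.depth = depth        # how deep in the XML this element is
        self.start_y = start_y    # Y where the parent branch ended
        self.curves = curves if curves is not None else []
        end_positions = self.draw_children()
        self.recurse(end_positions)

    def draw_children(self):
        """Draw only the direct children and remember where each one ends."""
        x = self.depth * self.COLUMN_WIDTH
        y = self.start_y
        ends = []
        for child in self.element:
            value = float(child.get("value"))
            # Stand-in for real drawing: record the curve's geometry.
            self.curves.append({"name": child.get("name"), "x": x,
                                "from_y": self.start_y, "to_y": y,
                                "width": value})
            ends.append(y)
            y += value  # assumed layout: stack siblings by their value
        return ends

    def recurse(self, end_positions):
        """Hand every child that has children to a new BranchGroup."""
        for child, end_y in zip(self.element, end_positions):
            if len(child):  # len() of an Element is its number of children
                BranchGroup(child, self.depth + 1, end_y, self.curves)

SAMPLE = """
<node name="total" value="10">
  <node name="a" value="6">
    <node name="a1" value="4"/>
    <node name="a2" value="2"/>
  </node>
  <node name="b" value="4"/>
</node>
""".strip()

group = BranchGroup(ET.fromstring(SAMPLE), 0, 0.0)
for curve in group.curves:
    print(curve)
```

Because a whole level is drawn before any recursion happens, each child's end position is already known when its own BranchGroup is created, so no offsets need to be reconstructed afterwards.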
And voila! Now an infinite number of values with an infinite number of children can be drawn in beautiful Sankey form. My next task is to go through some data (I have chosen Obama’s economic stimulus bill, which so far has been a thrilling read), put it into XML and see how my program handles it. I anticipate having to make a few tweaks to make sure that branches don’t cross over and node labels show up properly. That’s all for now!


1 Comment so far
sick nasty. programming rules
By greg on 03.01.09 7:36 pm