Friday 6 September 2013

OwnedPtr and AssocPtr - UML in C++

My recent post about “Overdoing the References”, coupled with another recent post from an ex-colleague (Carl Gibbs) titled “Divide in C++ Resource Management” caused me to remember an idea we tossed about around the turn of the millennium for representing UML ownership semantics in C++...

Back then general-purpose smart pointers, and in particular reference-counted smart pointers, were still fairly cutting edge as Boost was in its infancy (if you were even aware of its existence). Around the same time UML was also gaining traction, which I personally latched onto as I found the visualisation of OO class hierarchies jolly useful [1]. What I found hard though was translating the Aggregation and Association relationships from UML into C++ when holding objects by reference. This was because a bald (raw) pointer or reference conveys nothing about its ownership semantics by default. References at least had the convention that you don’t tend to delete through them (if you exclude my earlier reference-obsessed phase), but that wasn’t true for pointers.

Unique Ownership

Reference-counted smart pointers like std::shared_ptr<> are the Swiss-Army knife of modern C++. The common advice of not using std::auto_ptr<> with containers is probably what led to their adoption for managing memory everywhere - irrespective of whether the ownership was actually shared or logically owned by a single container, such as std::vector<>. My overly literal side didn’t like this “abuse” - I wanted ownership to be conveyed more obviously. Also, places where shared ownership even occurred were very rare then because there was always an acyclic graph of objects all the way down from the root “app” object, which meant lifetimes were deterministic.

UML in C++

The canonical example in UML of where both forms of ownership crop up is probably a tree structure, such as the nodes in an XML document. A node is a parent to zero or more children and the relationship is commonly bidirectional too. A node owns its children such that if you delete a node all its children, grand-children, etc. get deleted too.

Using bald pointers you might choose to represent this class like so:-

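#include <vector>

template<typename T>
class Node
{
private:
  T                  m_value;    // payload stored at this node
  Node*              m_parent;   // back pointer to the parent node
  std::vector<Node*> m_children; // pointers to the child nodes
};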

However you could argue there is a difference in ownership semantics between the two Node* based members (m_parent and m_children). The child nodes are owned by the std::vector<> container, whereas the parent node pointer is just a back reference. Naive use of reference-counted smart pointers for both relationships can lead to memory leaks, because parent and child then hold references to each other and neither count ever reaches zero. By keeping the child => parent side of the link simple we avoid this.
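To make the cyclic reference problem concrete, here is a minimal sketch using std::shared_ptr<>. The CyclicNode and TreeNode types, the g_liveNodes counter and the two helper functions are all invented for this illustration and are not from the code we had back then.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical instrumentation: counts the nodes currently alive.
static int g_liveNodes = 0;

// Both directions of the relationship share ownership.
struct CyclicNode
{
    CyclicNode()  { ++g_liveNodes; }
    ~CyclicNode() { --g_liveNodes; }

    std::shared_ptr<CyclicNode>              m_parent;   // owning back link: forms a cycle
    std::vector<std::shared_ptr<CyclicNode>> m_children;
};

// Only the parent => child direction owns anything.
struct TreeNode
{
    TreeNode()  { ++g_liveNodes; }
    ~TreeNode() { --g_liveNodes; }

    TreeNode*                              m_parent = nullptr; // non-owning back link
    std::vector<std::shared_ptr<TreeNode>> m_children;
};

// Returns how many nodes are still alive after the last external
// reference to a parent/child pair has gone away.
inline int leakedWithSharedBackLink()
{
    const int before = g_liveNodes;
    {
        auto parent = std::make_shared<CyclicNode>();
        auto child  = std::make_shared<CyclicNode>();
        child->m_parent = parent;            // child keeps the parent alive...
        parent->m_children.push_back(child); // ...and the parent keeps the child alive
    }
    return g_liveNodes - before;
}

inline int leakedWithRawBackLink()
{
    const int before = g_liveNodes;
    {
        auto parent = std::make_shared<TreeNode>();
        auto child  = std::make_shared<TreeNode>();
        child->m_parent = parent.get();      // plain back reference, no ownership
        parent->m_children.push_back(child);
    }
    return g_liveNodes - before;
}
```

With the shared back link both nodes outlive the scope that created them, whereas with the bald back pointer destroying the parent takes the child with it.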

The Smarter Pointer

So, to deal with the ownership of the children we came up with a std::auto_ptr<> like type called OwnedPtr<>. The idea was that it would behave much like what we now have in the std::unique_ptr<> type, i.e. std::auto_ptr<> like non-shared ownership, but without the container problems inherent in auto_ptr<>.

  Node*                       m_parent; 
  std::vector<OwnedPtr<Node>> m_children;
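As a rough modern-day sketch of how the two members above might sit in a complete class, the following stands in std::unique_ptr<> for OwnedPtr<> via an alias. That alias is purely an assumption for illustration (the original type pre-dated C++11), as are the addChild(), parent() and liveCount() members and the buildAndDestroyTree() helper.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Assumption: OwnedPtr<> is modelled as std::unique_ptr<>; the real class
// had its own implementation, which is not shown in this post.
template<typename T>
using OwnedPtr = std::unique_ptr<T>;

template<typename T>
class Node
{
public:
    explicit Node(T value, Node* parent = nullptr)
        : m_value(std::move(value)), m_parent(parent)
    { ++s_live; }

    ~Node() { --s_live; }

    // Unique ownership of the children means a node cannot be copied.
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;

    // Creates a child owned by this node; the returned pointer is non-owning.
    Node* addChild(T value)
    {
        m_children.push_back(OwnedPtr<Node>(new Node(std::move(value), this)));
        return m_children.back().get();
    }

    Node* parent() const { return m_parent; }

    // Hypothetical instrumentation: number of Node<T> objects alive.
    static int liveCount() { return s_live; }

private:
    T                           m_value;
    Node*                       m_parent;   // back reference, not owned
    std::vector<OwnedPtr<Node>> m_children; // children die with this node

    static int s_live;
};

template<typename T>
int Node<T>::s_live = 0;

// Builds a three-level tree and returns how many nodes remain alive
// once the root has gone out of scope.
inline int buildAndDestroyTree()
{
    {
        Node<int> root(1);
        Node<int>* child = root.addChild(2);
        child->addChild(3);
        assert(Node<int>::liveCount() == 3);
        assert(child->parent() == &root);
    }
    return Node<int>::liveCount();
}
```

Destroying the root is enough to tear down the children and grand-children, which is the delete-the-node-and-everything-below-it behaviour described earlier.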

The Dumbest Pointer

Whilst we could have left the child => parent pointer bald, this meant that it would be hard to tell whether we were looking at legacy code that had yet to be analysed, new code that mistakenly didn’t adhere to the new idiom, or code that was analysed and correct. The solution we came up with was called AssocPtr<> which was nothing more than a trivial wrapper around a bald pointer! Whilst it was functionally identical to a raw pointer, the name told you that the object pointed to was not owned by the holder.

  AssocPtr<Node>              m_parent; 
  std::vector<OwnedPtr<Node>> m_children;
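For a flavour of just how trivial such a wrapper can be, here is a hypothetical reconstruction. The original AssocPtr<> source isn't shown here, so the exact set of members below is an assumption, and the Widget type and readThroughAssoc() helper exist only to exercise it.

```cpp
#include <cassert>

// Hypothetical reconstruction of a non-owning pointer wrapper; the member
// set is an assumption, not the original class.
template<typename T>
class AssocPtr
{
public:
    AssocPtr() : m_ptr(nullptr) {}
    AssocPtr(T* ptr) : m_ptr(ptr) {}  // implicit, so it drops in where a T* was used

    T* get() const        { return m_ptr; }
    T& operator*() const  { return *m_ptr; }
    T* operator->() const { return m_ptr; }

    // Deliberately no destructor: the holder never deletes the pointee.

private:
    T* m_ptr;
};

// Illustrative type only.
struct Widget
{
    int id;
};

// Reads a value through the wrapper exactly as you would through a raw pointer.
inline int readThroughAssoc()
{
    Widget w{42};
    AssocPtr<Widget> assoc(&w);
    assert(assoc.get() == &w);
    return assoc->id;
}
```

The wrapper adds no behaviour at all; its only job is to say “association, not ownership” in the declaration of the member.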

Exit UML / Enter shared_ptr

In the end this idea became just another failed experiment. The OwnedPtr<> type was pretty indistinguishable from a classic reference-counted smart pointer and ultimately it was easier to just share ownership by default rather than decide who was the ultimate owner and who was just a lurker. Once Boost showed up with its various (thread-safe) smart pointer classes the need to crank one’s own variants pretty much evaporated.

[1] I also thought the Use Case, Deployment and Sequence diagrams were neat. While I still find value in the latter two I got disillusioned with the Use Case aspect pretty quickly.
