COM aggregation and ref counting woes

Why are we talking about the Component Object Model (COM)? Isn't that old, dead technology? Well... no. So many COM objects are still in use today, across so many projects, that you will run into them sooner or later. As a software engineer you might even have to fix bugs in these components. Today I want to draw attention to the ref counting bugs that can creep in when these objects use aggregation.

COM objects use reference counting to control their lifetime. Each and every object achieves this by implementing the IUnknown interface. This interface contains (quick - what are the first three v-table entries?) IUnknown::QueryInterface, IUnknown::AddRef and IUnknown::Release. After you add a reference to an object with AddRef, you are expected to call Release when you are finished with it. Reference counting bugs crop up in object clients when someone forgets this rule, and they are usually a real pain to find. The difficulty is compounded even further if the object itself messes up its implementation of AddRef or Release.
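The AddRef/Release contract can be sketched in portable C++. This is a minimal illustration, not the real Windows declarations: `IRefCounted`, `Widget`, `balancedUse` and `forgottenRelease` are hypothetical names made up for this example, and QueryInterface is left out to keep it short.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the lifetime half of IUnknown (not the Windows SDK type).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IRefCounted() = default;
};

// Hypothetical object that tracks how many instances are alive.
class Widget : public IRefCounted {
public:
    static int live;
    Widget() { ++live; }  // starts with one reference, owned by the creator
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;  // last reference gone: the object destroys itself
        return left;
    }
private:
    ~Widget() override { --live; }  // private, so only Release can destroy the object
    std::atomic<unsigned long> refs_{1};
};
int Widget::live = 0;

// Every AddRef is paired with a Release; returns the number of live Widgets afterwards.
int balancedUse() {
    Widget* w = new Widget();  // count 1
    assert(Widget::live == 1);
    w->AddRef();               // count 2
    w->Release();              // count 1
    w->Release();              // count 0, object destroyed
    return Widget::live;
}

// Returns how many Widgets are still alive while one Release is outstanding.
int forgottenRelease() {
    Widget* w = new Widget();  // count 1
    w->AddRef();               // count 2, a second client keeps the pointer
    w->Release();              // count 1, the first client is done
    int aliveWhileLeaked = Widget::live;  // the object cannot die yet
    w->Release();              // the call that is easy to forget
    return aliveWhileLeaked;
}
```

If the final Release in `forgottenRelease` were omitted, the object would simply never be destroyed, which is the kind of leak that is hard to track down from the client side.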

The implementation of AddRef is typically simple (just increment an internal counter). However, when the object aggregates other objects in its internal implementation (another form of object reuse, as opposed to containment), it becomes more complex. In these cases you have to ensure that AddRef and Release calls operate on the correct object.

Let's take the example of a hypothetical AIRPLANE object. An AIRPLANE is implemented by aggregating WING and ENGINE objects. WING implements IWing and IUnknown; ENGINE implements IEngine and IUnknown. Clients of the AIRPLANE object would expect IWing::AddRef and IEngine::AddRef to both control the lifetime of the outer object (the component doing the reusing, in this case AIRPLANE). This is fair and reasonable, but it can only happen if the inner objects (WING and ENGINE) are aware of the outer object. So when WING and ENGINE are created, AIRPLANE passes a pointer to its IUnknown implementation (called the controlling unknown) down to the inner objects. If they support aggregation, they will use this pointer to handle any IUnknown calls that come in through IWing and IEngine. If they do not support aggregation, they will return CLASS_E_NOAGGREGATION and fail creation. However, ENGINE and WING must not delegate to the controlling unknown for any AddRef and Release calls that come in through their own IUnknown (the non-delegating unknown), as this is the interface the outer object uses to control their lifetime.
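The delegation rules above can be sketched in portable C++ as follows. This is a simplified model under stated assumptions, not real COM: `IUnknownLike`, `Iid`, `Result`, `FlapAngle`, `NonDelegating` and `demoAggregation` are hypothetical names, interface IDs are a plain enum instead of GUIDs, only WING is shown (ENGINE would follow the same pattern), and the CLASS_E_NOAGGREGATION path is omitted.

```cpp
#include <cassert>

// Portable stand-ins for COM types (assumptions, not the Windows SDK declarations).
enum class Iid { Unknown, Wing };
enum Result { kOk, kNoInterface };

struct IUnknownLike {
    virtual Result QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() = default;
};
struct IWing : IUnknownLike { virtual int FlapAngle() = 0; };  // hypothetical method

class Wing : public IWing {
public:
    // The outer object passes its controlling unknown; nullptr means standalone use.
    explicit Wing(IUnknownLike* outer)
        : inner_(this), controlling_(outer ? outer : &inner_) {}

    // IUnknown as reached through IWing: always delegate to the controlling unknown.
    Result QueryInterface(Iid iid, void** out) override {
        return controlling_->QueryInterface(iid, out);
    }
    unsigned long AddRef() override { return controlling_->AddRef(); }
    unsigned long Release() override { return controlling_->Release(); }
    int FlapAngle() override { return 0; }

    // Non-delegating unknown: counts references to the Wing itself, never delegates.
    struct Inner : IUnknownLike {
        explicit Inner(Wing* w) : wing(w) {}
        Result QueryInterface(Iid iid, void** out) override {
            *out = nullptr;
            if (iid == Iid::Unknown) { *out = static_cast<IUnknownLike*>(this); ++refs; }
            if (iid == Iid::Wing) { *out = static_cast<IWing*>(wing); wing->AddRef(); }
            return *out ? kOk : kNoInterface;
        }
        unsigned long AddRef() override { return ++refs; }
        unsigned long Release() override {
            unsigned long left = --refs;
            if (left == 0) delete wing;
            return left;
        }
        Wing* wing;
        unsigned long refs = 1;  // the creator's reference
    };
    IUnknownLike* NonDelegating() { return &inner_; }

private:
    ~Wing() override = default;
    Inner inner_;
    IUnknownLike* controlling_;
};

class Airplane : public IUnknownLike {
public:
    static int live;  // instances alive, for observation only
    Airplane() : wing_((new Wing(this))->NonDelegating()) { ++live; }
    Result QueryInterface(Iid iid, void** out) override {
        if (iid != Iid::Unknown) return wing_->QueryInterface(iid, out);  // inner interfaces
        *out = static_cast<IUnknownLike*>(this);
        AddRef();
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
private:
    ~Airplane() override { wing_->Release(); --live; }
    IUnknownLike* wing_;  // the inner object's non-delegating unknown
    unsigned long refs_ = 1;
};
int Airplane::live = 0;

// Returns the outer reference count seen right after AddRef through IWing.
unsigned long demoAggregation() {
    Airplane* plane = new Airplane();        // outer count: 1
    void* raw = nullptr;
    plane->QueryInterface(Iid::Wing, &raw);  // outer count: 2
    assert(raw != nullptr);
    IWing* wing = static_cast<IWing*>(raw);
    unsigned long seen = wing->AddRef();     // outer count: 3
    wing->Release();                         // 2
    wing->Release();                         // 1
    plane->Release();                        // 0: Airplane dies and releases its Wing
    return seen;
}
```

Note that the inner object stores the controlling unknown without calling AddRef on it. Holding a counted reference there would create a cycle between the outer and inner objects, and neither would ever be destroyed.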

These are just the basic rules of aggregation; when applied, they ensure that object lifetime is still managed correctly. Common problems arise when the inner objects forget to delegate the AddRef and Release calls to the controlling unknown, or delegate in the wrong case (i.e. when called through their own IUnknown::AddRef). When that happens, the client of AIRPLANE may see what appears to be a ref counting bug on their side, when it is actually an internal issue with the aggregation.
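To make the symptom concrete, here is a deliberately stripped-down model of the "forgot to delegate" bug. It only tracks counters and never deletes anything, and the names `Plane`, `CorrectWing`, `BuggyWing` and `outerCountWhileWingHeld` are hypothetical, chosen for this sketch.

```cpp
#include <cassert>

// Outer object reduced to its reference count.
struct Plane {
    unsigned long refs = 1;  // the client's initial reference
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() { return --refs; }  // a real object would delete itself at 0
};

// Inner object that follows the rules: interface calls go to the outer object.
struct CorrectWing {
    explicit CorrectWing(Plane* p) : outer(p) {}
    unsigned long AddRef() { return outer->AddRef(); }
    unsigned long Release() { return outer->Release(); }
    Plane* outer;
};

// Inner object that counts on its own instead of delegating.
struct BuggyWing {
    explicit BuggyWing(Plane* p) : outer(p) {}
    unsigned long AddRef() { return ++own; }   // BUG: should be outer->AddRef()
    unsigned long Release() { return --own; }  // BUG: should be outer->Release()
    Plane* outer;
    unsigned long own = 1;
};

// The client takes a reference through the wing interface, then drops its
// original plane pointer. Returns the outer count at that moment.
template <typename WingT>
unsigned long outerCountWhileWingHeld() {
    Plane plane;
    assert(plane.refs == 1);
    WingT wing(&plane);
    wing.AddRef();     // client keeps an IWing-style pointer
    plane.Release();   // client lets go of its AIRPLANE pointer
    return plane.refs; // 0 here would mean the outer object is already destroyed
}
```

With `CorrectWing` the outer count is still 1 while the client holds the wing pointer. With `BuggyWing` it has dropped to 0, so in a real component the AIRPLANE would already be gone and the client would be left with a dangling interface pointer, even though its own AddRef and Release calls were perfectly balanced.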

... no wonder people like managed code and .NET. Ref counting bugs can get tricky.

 

Submitted by Armin on Thu, 2006-11-23 07:28

Justin Mann (not verified) | Wed, 2006-12-06 05:43

I just want to stand up for the minority opinion and support the COM ref counting method. Ref counting is the simplest way to allow more than one piece of code to control the lifetime of the same object. It makes it easy to pass an object to a function that then caches a pointer to it, without the callee having to worry about the caller deleting the underlying object out from under it. Unfortunately, it is difficult to follow the ref counting rules consistently in a large project. That is why helper classes like those in ATL are so useful.

Why is ref counting better than the garbage collected happy fun world? Simply put, it is all about performance. When trying to make a performant desktop application, the most important aspect is reducing memory usage. Garbage collection requires the system to maintain more information about each pointer, and usually adds another pointer redirection for each object access (sequential dependent memory reads are evil). A gen-2 collection also pulls every pointer back into main system memory, effectively nullifying the virtual memory management system. On top of this, all of the GC-based systems include a runtime type system (System.Reflection) that stores a lot of information about every class type and, for complex objects, about each instance, thus using up more of our precious memory resources.

Many people will dismiss these concerns as silly because computers keep getting more RAM. Unfortunately, having more RAM simply reduces hard drive access; it does not make reading RAM any faster, and thus has little effect on the performance of your application unless your system is already paging a lot. CPUs counter this with bigger caches, up to L3 on desktop computers. These caches help, but they also increase the importance of code that needs very little memory to perform its work. Only the L1 cache actually runs at CPU speed; every other level of cache is going to put your CPU into no-op mode.

Ref counting allows us to manage objects without needing to use a garbage collector. If we use the right helper tools then the danger of misusing ref counting is mitigated.
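A rough sketch of the kind of helper meant here: a small RAII wrapper that pairs every AddRef with a Release automatically. This is not ATL's actual API; `RefPtr`, `Probe` and `demoRefPtr` are hypothetical names for illustration, and the test object only counts references rather than deleting itself.

```cpp
#include <cassert>
#include <utility>

// Hypothetical smart pointer for any type with AddRef()/Release().
template <typename T>
class RefPtr {
public:
    RefPtr() = default;
    explicit RefPtr(T* p) : p_(p) { if (p_) p_->AddRef(); }
    RefPtr(const RefPtr& other) : p_(other.p_) { if (p_) p_->AddRef(); }
    RefPtr(RefPtr&& other) noexcept : p_(std::exchange(other.p_, nullptr)) {}
    RefPtr& operator=(RefPtr other) noexcept {  // copy-and-swap handles copy and move
        std::swap(p_, other.p_);
        return *this;
    }
    ~RefPtr() { if (p_) p_->Release(); }
    T* operator->() const { return p_; }
    T* get() const { return p_; }
private:
    T* p_ = nullptr;
};

// Counter-only test object; a real COM object would delete itself at zero.
struct Probe {
    unsigned long refs = 1;  // stands for the creator's reference
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() { return --refs; }
};

// Returns the count left after all wrappers have gone out of scope.
unsigned long demoRefPtr() {
    Probe probe;
    {
        RefPtr<Probe> a(&probe);         // AddRef: 2
        RefPtr<Probe> b = a;             // copy, AddRef: 3
        RefPtr<Probe> c = std::move(a);  // move, no extra AddRef: still 3
        assert(probe.refs == 3);
    }                                    // b and c Release; a is empty
    return probe.refs;
}
```

Because the destructor issues the Release, early returns and exceptions cannot skip it, which removes the most common way of breaking the ref counting rules by hand.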


Armin | Sat, 2006-12-16 16:32

Ref counting certainly has its advantages. I can't disagree with you at all. It's also important to weigh advantages against disadvantages, case by case, based on your requirements. I personally like it (I hope this isn't a minority opinion at all). Ref counting is an effective garbage collection method. I would still define it as garbage collection in the sense that when you Release a ref counted object, the reference count is decremented but the consumer code does not specifically control deallocation of memory/resources (there may be other consumers). So in a sense the object's memory, soon to be 'garbage', is deallocated only when the object determines that the reference count allows it. On the other end of the scale, tracing garbage collection has many disadvantages, especially when it comes to perf, as you pointed out.


Duncan Bayne | Fri, 2006-11-24 23:23

... if it's merely an abstraction built upon existing unmanaged code - because from time to time, the abstraction leaks.

Case in point - the .NET methods to interact with the Windows clipboard care about the COM threading model under which they're running.

Another example (with which I've been battling all week at work) is the .NET web browser control throwing AccessViolationExceptions (because it's really a COM object under the hood).

So yes, ref counting bugs can be tricky. And I'd much rather work in .NET than MFC. But you still have to know & understand (albeit to a rudimentary level) COM in order to use .NET.

 


Armin | Sat, 2006-11-25 01:25

I completely agree. We sometimes make abstractions to simplify components; ideally, however, everything would be like a transparent white box, withholding no secrets from you. The more you know, the better you can deal with the component.

