COM Interop Not Fundamentally Flawed But Hard

I have been working on and thinking about .NET COM Interop a lot for the past 14 months. I have spent that time working with millions of lines of COM code and its Interop with the managed world at three companies, at least one of which had COM code bases in the millions of lines, as well as helping many people with their Interop problems. Last July I had initially come to the conclusion that COM Interop was fundamentally flawed. Since then, I have done a lot of thinking on this issue and talked with a lot of people, and I have refined my position: it is not fundamentally flawed, but it is much harder than it appears to be, and it is perhaps all that could have been done given the conditions.

I want to give special thanks to Don Box for helping me to clarify my thoughts in a series of meetings on this in Redmond and elsewhere.

To recap, my contention had been that COM Interop, as it is implemented today, wreaks havoc on non-trivial COM architectures interoperating with managed code, because Runtime Callable Wrappers (RCWs) are garbage-collected non-deterministically, so releases of the underlying COM interfaces also happen non-deterministically. This forces managed developers who want to Interop with existing COM code to sprinkle their code with calls to Marshal.ReleaseComObject (the Marshal class lives in the System.Runtime.InteropServices namespace). Obviously this can be done to some degree, but my contention was that Microsoft could have and should have implemented IDisposable in both RCWs and COM Callable Wrappers (CCWs), giving managed developers a standard way to do cleanup that would make all these ReleaseComObject calls for them. I had pointed out that for many companies I had worked with that had huge, non-trivial COM architectures, this lack caused immense suffering and pain.

Of course, this approach to COM Interop has roots in both the VB6 and the Microsoft MSJVM teams and efforts, and the issue has existed for a long time in the MSJVM world. Yes, it's common knowledge to use ReleaseComObject to clean up. The problem is that for most "real-world" scenarios this is not nearly enough: it has to be done multiple times and in multiple places to get things to work without either crashing or failing to work altogether. One classic example is connection points. To alleviate all this, I think it's pretty well accepted that Microsoft could have and should have leveraged IDisposable instead of forcing Interop developers to implement their own cleanup mechanisms in perhaps dozens and dozens of Interop assemblies (as was the case for at least one company I worked with).

The discussions with Don helped me see this: Microsoft could have implemented IDisposable in all RCWs, and it would have helped in most cases, but in CCWs it would not work in a lot of cases and could actually be harmful. Why? Remember that a single RCW is created per COM object, whereas a single CCW is shared by multiple COM clients "talking" to a single .NET object. Unless ReleaseComObject is called, the RCW does not release its AddRef'd interface pointers to the underlying COM object until the RCW is finalized (which is non-deterministic). The CCW, for its part, holds a rooted reference to the underlying CLR object and prevents that object from being GC'd as long as there is at least one outstanding COM interface pointer.

So the example goes like this. Take a system in which there is a single RCW. Would Microsoft implementing IDisposable here help? Clearly, it would. Microsoft could have put a "hook" in tlbimp or some similar tool to read the COM typelib and generate the Interop Assembly with IDisposable implemented, or at least stubbed in. This would have handled this situation. Now take a CCW. Would it help here? No, because the CCW is "shared" by multiple COM clients for that single .NET object and has a rooted reference. Moreover, take the situation where both an RCW and a CCW are involved. Microsoft implementing a general IDisposable in both the RCW and the CCW would not help, and could be disastrous, in a system where a call is made out through the RCW from .NET to COM and COM then calls back into .NET via the CCW.

Actually, this is a problem that we have had for years and years in COM itself: if your design contains cycles in an object graph, they must be broken somehow. COM Interop just carries the problem forward. If cycles in your object graph contain CCWs or RCWs, you need some mechanism to break the cycle. This is done by calling Marshal.ReleaseComObject for the RCW, and by converting the rooted reference in a CCW to a weak reference by calling Marshal.ChangeWrapperHandleStrength.
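
To make the cycle problem concrete, here is a minimal sketch in plain C++ of COM-style reference counting. It is not real COM and not Interop code: Node, SetPeer and LiveAfterDroppingCycle are hypothetical names I made up for illustration, and the class only models AddRef/Release counting. It shows two objects that hold counted references to each other staying alive after every outside reference is gone, unless one side of the cycle is explicitly broken first.

```cpp
#include <cassert>

// Hypothetical model of a COM-style reference-counted object.
// This only imitates AddRef/Release counting; it is not IUnknown.
static int g_live = 0;  // number of Node objects currently alive

class Node {
public:
    Node() : refs_(1), peer_(nullptr) { ++g_live; }  // creator owns one reference

    void AddRef() { ++refs_; }

    void Release() {
        assert(refs_ > 0);
        if (--refs_ == 0) delete this;  // last reference gone: destroy
    }

    // Store a counted reference to another node, like caching an
    // interface pointer. Passing nullptr drops the current reference.
    void SetPeer(Node* p) {
        if (p) p->AddRef();
        if (peer_) peer_->Release();
        peer_ = p;
    }

private:
    ~Node() {
        if (peer_) peer_->Release();  // give up the reference we were holding
        --g_live;
    }

    int refs_;
    Node* peer_;
};

// Builds a two-node cycle, optionally breaks it, drops the external
// references, and returns how many of the two nodes are still alive.
// With breakCycleFirst == false the nodes are leaked on purpose.
int LiveAfterDroppingCycle(bool breakCycleFirst) {
    int before = g_live;
    Node* a = new Node();
    Node* b = new Node();
    a->SetPeer(b);  // a holds b
    b->SetPeer(a);  // b holds a: the graph now contains a cycle

    if (breakCycleFirst) a->SetPeer(nullptr);  // explicit cycle break

    a->Release();  // drop the external reference to a
    b->Release();  // drop the external reference to b
    return g_live - before;
}
```

In this sketch, LiveAfterDroppingCycle(false) leaves both nodes alive because each keeps the other's count above zero, while LiveAfterDroppingCycle(true) frees both. The explicit SetPeer(nullptr) call plays the role that Marshal.ReleaseComObject or Marshal.ChangeWrapperHandleStrength plays when the cycle runs through an RCW or a CCW.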

Cycle problems in component object architectures were hard and are still hard. My belief now is that Microsoft designed a general solution that is about the best one can do for the general case. This does not make COM Interop fundamentally flawed, but it does make it quite difficult, and certainly not the simple area that some make it out to be.