Introduction to .NET COM Interop

There is no doubt by now that .NET is making a big splash as the new and preferred way to create components, applications and distributed applications. The benefits of the runtime are very compelling: garbage collection that eliminates many memory issues, side-by-side versioning, one class library available to all languages, language interop and more. But what should the many companies that have invested heavily in COM do with the tons of perfectly good, reusable COM components they have, not to mention all those hard hours and endless cups of coffee?

For those of us who program COM for a living, and thus live by the "COM is Love" mantra, there is good news: .NET managed applications can talk to and work with our COM components. Microsoft could never have gotten away with abandoning what has been the most widely successful component reuse and distributed technology on the planet, with over a billion dollars a year of investment by companies.

Classic COM components can interoperate with the .NET runtime through an interop layer that handles all the plumbing between the unmanaged and managed worlds. You can also go the other way: if you build components targeted for the CLR, you can consume these managed .NET components from COM. COM Interop will generate the wrapper and let you party on.

Why COM Interop? Well, as you probably know by now, the programming models of COM and .NET differ greatly. The differences are too great to cover in detail right here, but compare the two models and some very apparent differences stand out. COM deals primarily with interfaces; specifically three kinds of interfaces:

Custom Interfaces. These are IUnknown-only interfaces and don't support IDispatch.
Dispinterfaces. Interfaces whose members can be called via IDispatch.
Dual Interfaces. Interfaces that can be called either directly or through IDispatch.
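
To make the difference between the three kinds concrete, here is a rough sketch in portable C++. It does not use the real COM headers: ICalculator, IDispatchLike, Calculator and the two helper functions are hypothetical stand-ins invented for illustration. The first interface is called directly through the v-table (like a custom interface), the second is called by looking up a member name and invoking by id (roughly what IDispatch offers), and the class that implements both plays the role of a dual interface.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a custom (v-table only) interface.
struct ICalculator {
    virtual int Add(int a, int b) = 0;
    virtual ~ICalculator() = default;
};

// Hypothetical stand-in for IDispatch-style late binding:
// resolve a member name to an id, then invoke by that id.
struct IDispatchLike {
    virtual int GetIdOfName(const std::string& name) const = 0;
    virtual int Invoke(int dispId, int a, int b) = 0;
    virtual ~IDispatchLike() = default;
};

// Supports both calling styles, in the spirit of a dual interface.
class Calculator : public ICalculator, public IDispatchLike {
public:
    int Add(int a, int b) override { return a + b; }

    int GetIdOfName(const std::string& name) const override {
        if (name == "Add") return 1;
        throw std::out_of_range("unknown member: " + name);
    }

    int Invoke(int dispId, int a, int b) override {
        if (dispId == 1) return Add(a, b);
        throw std::out_of_range("unknown dispatch id");
    }
};

// Early binding: the compiler knows the method slot.
int call_early_bound(ICalculator& c) { return c.Add(2, 3); }

// Late binding: the member is found by name at run time.
int call_late_bound(IDispatchLike& d) {
    return d.Invoke(d.GetIdOfName("Add"), 2, 3);
}
```

Both helpers reach the same Add method; only the route differs, which is the trade-off a dual interface lets the caller choose.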

On the other hand, the .NET Framework offers a full OO-based model that does not require interface implementation. Classes get to have all sorts of methods: static, instance, virtual and even overloaded. Classes can have fields, events and properties as first-class citizens directly supported by the CLR. Also, members can have all sorts of visibility. Then there is type information. Type information in COM is a hack. Yes, there are Type Libraries, but they are not mandatory, nor do they fully describe the types and methods. Actually, a type library exposes whatever the author feels like exposing to you!

.NET, on the other hand, is all about type information, which sits right at the core of the Common Type System and the CLR. Since all CLS-compliant compilers emit rich metadata that fully describes a type, and since that metadata is mandatory, you always have it in one form. The CLR doesn't need to query for an interface or ask an object to construct itself. It just inspects and consumes the metadata. Further, you can do all sorts of late binding through this type data and .NET's System.Reflection classes.

So given that, it's pretty obvious that some form of "difference manager" is needed. That's where COM Interop comes into the picture. COM Interop and the CLR mask these differences and in the vast majority of cases make interop seamless. Not only does a COM object become usable from .NET, but you use it in exactly the same way as you would any .NET object! You don't call CoCreateInstance and such; you new a .NET object and catch exceptions. You use .NET data types and code. The interop layer makes it look just like .NET to you.
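
One piece of that masking is error handling: COM methods report failure through an HRESULT return code, while .NET code expects exceptions. The sketch below shows the general idea in portable C++, not the actual interop implementation. The names raw_divide, divide and ComErrorSketch are hypothetical; the two constants mirror the values of S_OK and E_FAIL.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Values mirror COM's S_OK and E_FAIL (0x80004005); failure codes are negative.
constexpr std::int32_t kOk = 0;
constexpr std::int32_t kFail = -2147467259;

// Hypothetical exception type carrying the original status code.
struct ComErrorSketch : std::runtime_error {
    std::int32_t code;
    explicit ComErrorSketch(std::int32_t hr)
        : std::runtime_error("COM call failed"), code(hr) {}
};

// COM-style method: status code as the return value, result via out parameter.
std::int32_t raw_divide(int a, int b, int* result) {
    if (b == 0) return kFail;
    *result = a / b;
    return kOk;
}

// Wrapper in the spirit of the interop layer: the caller gets a plain
// return value, and a failure code surfaces as an exception instead.
int divide(int a, int b) {
    int result = 0;
    const std::int32_t hr = raw_divide(a, b, &result);
    if (hr < 0) throw ComErrorSketch(hr);
    return result;
}
```

A caller of divide never sees a status code; it either gets the quotient or catches ComErrorSketch and can read the original code from it.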

On the other side, .NET objects can be called from COM and it looks just like COM to the caller.

Of course, if you are not satisfied with the COM Interop marshaller, you can "override" just about every aspect of it through the very large and useful System::Runtime::InteropServices namespace.

Using COM from .NET Applications.

A .NET application cannot speak directly to a COM component. As I described above, the programming and binary models are totally different. .NET needs some metadata to figure out what the COM component is all about. Metadata is an important and pervasive technology throughout .NET. If we create some sort of metadata "layer" around our COM component, then the runtime can "interrogate" this metadata at runtime using System.Reflection and produce a Runtime Callable Wrapper (RCW). The RCW gives .NET applications the illusion that they are talking to another .NET component. Each COM instance is wrapped by a single RCW to maintain identity. The RCW acts like a proxy to the COM object and looks just like any other .NET object to .NET code. The important thing is that the RCW abstracts away the COM world from the .NET application. You do .NET stuff as you would for any other object. In other words, you new an object rather than calling CoCreateInstance, QueryInterface, etc.

The RCW:

handles activation of the COM object
handles marshaling requirements when dealing with .NET
manages COM object identity
manages COM object lifetime
manages COM interface caching

Of course, object lifetime is a critical issue because .NET uses garbage collection, and objects can be moved around and even collected. The RCW implements the Proxy pattern: to .NET it looks like yet another managed object, and to COM it looks as if the component is being called by another unmanaged COM client. The behavior and the creation of the RCW depend on whether you are using early or late binding. Under the hood, the RCW is doing all the hard work and thunking down to the corresponding v-table calls in the COM component in unmanaged land.
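
The activation and lifetime duties can be sketched in portable C++. This is only an analogy, not how the CLR implements the RCW: ComLikeObject and WrapperSketch are hypothetical names, and C++ scope exit stands in for the moment the garbage collector would finalize the wrapper. The wrapper creates the reference-counted object, forwards calls to it, and gives back its one reference when the wrapper itself goes away.

```cpp
#include <cassert>

// Hypothetical COM-like object: reference counted, deletes itself at zero.
struct ComLikeObject {
    static int live;            // how many objects currently exist
    unsigned long refs = 1;     // creation hands out one reference

    ComLikeObject() { ++live; }
    ~ComLikeObject() { --live; }

    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        const unsigned long remaining = --refs;
        if (remaining == 0) delete this;
        return remaining;
    }

    int Answer() const { return 42; }
};
int ComLikeObject::live = 0;

// Proxy in the spirit of an RCW: owns exactly one reference.
class WrapperSketch {
public:
    WrapperSketch() : obj_(new ComLikeObject()) {}   // "activation"
    ~WrapperSketch() { obj_->Release(); }            // stands in for finalization
    WrapperSketch(const WrapperSketch&) = delete;
    WrapperSketch& operator=(const WrapperSketch&) = delete;

    int Answer() const { return obj_->Answer(); }    // forwarded call

private:
    ComLikeObject* obj_;
};

// Returns the number of live objects after a wrapper has come and gone.
int live_after_scope() {
    {
        WrapperSketch w;
        assert(ComLikeObject::live == 1);
        assert(w.Answer() == 42);
    }
    return ComLikeObject::live;
}
```

The caller never touches AddRef or Release; the wrapper's own lifetime decides when the underlying object is let go.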

To generate the metadata wrapper for a COM component, there is a command-line tool, the Type Library Importer (Tlbimp.exe). The tool reads in a type library and converts the coclasses and interfaces it contains to metadata that has all the type information of the COM component. The key is that the result is in .NET format; in other words, it uses CLR and CTS types, not COM types. The tool creates an interop assembly and namespace for the type information automatically. After the metadata of a class is available, managed clients can create instances of the COM type and call its methods, just as if it were a .NET instance. Tlbimp.exe converts an entire type library to metadata at once and cannot generate type information for a subset of the types defined in a type library.

Calling .NET From COM: The COM Callable Wrapper (CCW)

We have the same issue on the other side. The world of COM doesn't know a thing about the world of .NET. The COM Callable Wrapper (CCW) acts as a proxy to the managed object and looks just like any other COM object. Yes, it even has all those ugly registry settings! The CCW's job is to forward and translate calls from COM to the original managed object. The CCW is not a managed class and is not controlled by .NET's garbage collector. This is very important. It is reference counted and obeys COM's rules for lifetime management.
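
The lifetime rule can be pictured with another portable C++ sketch. Again this is an analogy under stated assumptions rather than the CLR's implementation: ManagedLike and CcwSketch are hypothetical names, and a std::shared_ptr stands in for the reference that keeps the managed object reachable. The wrapper is reference counted like any COM object, forwards calls to its target, and lets go of the target only when its own count reaches zero.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a managed object.
struct ManagedLike {
    std::string Name() const { return "managed"; }
};

// Wrapper in the spirit of a CCW: reference counted, not garbage collected.
class CcwSketch {
public:
    explicit CcwSketch(std::shared_ptr<ManagedLike> target)
        : target_(std::move(target)) {}

    unsigned long AddRef() { return ++refs_; }
    unsigned long Release() {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;   // COM-style self destruction
        return remaining;
    }

    // Forwarded call: the COM-side caller only ever sees the wrapper.
    std::string Name() const { return target_->Name(); }

private:
    ~CcwSketch() = default;                // destroyed only through Release
    std::shared_ptr<ManagedLike> target_;  // keeps the target alive
    unsigned long refs_ = 1;               // creator holds the first reference
};
```

As long as some caller still holds a reference to the wrapper, the target stays alive even if nothing else refers to it; the final Release drops both.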

New and Vital Books.

Adam Nathan, an all-around nice and helpful guy at Microsoft, is publishing a MUST HAVE COM Interop book: .NET and COM: The Complete Interoperability Guide