Mixing Objective-C, C++ and Objective-C++: an Updated Summary
Originally published on 25th May 2012, updated on 15th July 2012.
Quite some time ago, I ran into the situation of including a C++
library in an Objective-C project. I failed to find any sensible
documentation on the subject, so I came up with a solution myself and
eventually wrote it up in an article.
That article went on to become something of a sleeper hit (by my modest
standards anyway) and is to this day one of the highest-ranked results
for Objective-C++ and related keywords on Google.
Since then, Apple has switched to the LLVM-based clang
as the primary compiler for Mac and iOS development. One of the effects
of this has been an accelerated pace of changes to the Objective-C
language, compared to the rather more glacial rate of change under the
GCC regime. One particular change has caused my old article to no longer
be up-to-date. This, along with the steady stream of clarification
questions I receive about it, has prompted me to write this new article.
Recap of the problem
To save you going through the old article, here's the issue: let's
say you have some existing C++ code, a library perhaps, and you want to
use it in an Objective-C application. Typically, your C++ code will
define some class you'd like to use. You could switch your
whole project to Objective-C++ by renaming all the .m files to .mm, and
freely mix C++ and Objective-C. That's certainly an option, but the two
worlds are quite different, so such "deep" mixing can become awkward.
So usually you'll want to wrap the C++ types and functions with
Objective-C equivalents that you can use in the rest of your project.
Let's say you have a C++ class called CppObject, defined in CppObject.h:
#include <string>
class CppObject
{
public:
  void ExampleMethod(const std::string& str);
  // constructor, destructor, other members, etc.
};
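You can have C++-typed members in an Objective-C class, so the typical first attempt is to do this with your wrapper class, ObjcObject - in ObjcObject.h:
#import <Foundation/Foundation.h>
#import "CppObject.h"
@interface ObjcObject : NSObject {
  CppObject wrapped;
}
- (void)exampleMethodWithString:(NSString*)str;
// other wrapped methods and properties
@end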
And then implementing the methods in ObjcObject.mm. Many are then
surprised to get preprocessor and compile errors in ObjcObject.h and
CppObject.h when they #import "ObjcObject.h" from a pure
Objective-C (.m) file directly or indirectly via another header (.h)
file. The thing to bear in mind is that the preprocessor basically just
does text substitution, so #include and #import
directives are essentially equivalent to recursively copy-and-pasting
the contents of the file in question into the location of the directive.
So in this example, if you #import "ObjcObject.h" you're essentially inserting the following code:
// [lots and lots of Objective-C code from Foundation/Foundation.h]
// [fail to include <string>] as that header is not in the include path outside of C++ mode
class CppObject
{
public:
  void ExampleMethod(const std::string& str);
  // constructor, destructor, other members, etc.
};
@interface ObjcObject : NSObject {
  CppObject wrapped;
}
- (void)exampleMethodWithString:(NSString*)str;
// other wrapped methods and properties
@end
The compiler will get enormously confused by class CppObject and the block following it, as that's simply not valid Objective-C syntax. The error will typically be something like
Unknown type name 'class'; did you mean 'Class'?
as there is no class keyword in Objective-C. So to be
compatible with Objective-C, our Objective-C++ class's header file must
contain only Objective-C code, absolutely no C++ - this mainly affects
types (like the CppObject class type here).
Keeping your headers clean
In the old article,
I talked through a few solutions to this, so I won't reiterate them
here. The nicest one at the time was the PIMPL idiom. It continues to
work well today, and is still the best solution to the opposite problem of
wrapping Objective-C with C++ (more on that later on). However, with
clang, there is a new way to keep C++ out of your Objective-C headers: ivars in class extensions.
Class extensions (not to be confused with categories) have
existed in Objective-C for a while: they let you declare additional
parts of the class's interface outside the public header before the @implementation block. As such, the only sensible place to put them is just above said block, e.g. ObjcObject.mm:
#import "ObjcObject.h"
@interface ObjcObject () // note the empty parentheses
- (void)methodWeDontWantInTheHeaderFile;
@end
@implementation ObjcObject
// etc.
This much already worked with GCC, but with clang, you can also add
an ivar block to it. This means we can declare any instance variables
with C++ types in the extension, or at the start of the @implementation block. In our case, we can reduce the ObjcObject.h file to this:
#import <Foundation/Foundation.h>
@interface ObjcObject : NSObject
- (void)exampleMethodWithString:(NSString*)str;
// other wrapped methods and properties
@end
The missing parts all move to the class extension in the implementation file (ObjcObject.mm):
#import "ObjcObject.h"
#import "CppObject.h"
@interface ObjcObject () {
  CppObject wrapped;
}
@end
@implementation ObjcObject
- (void)exampleMethodWithString:(NSString*)str
{
  // NOTE: if str is nil this will produce an empty C++ string
  // instead of dereferencing the NULL pointer from UTF8String.
  std::string cpp_str([str UTF8String], [str lengthOfBytesUsingEncoding:NSUTF8StringEncoding]);
  wrapped.ExampleMethod(cpp_str);
}
@end
Alternatively, if we don't need the interface extension to declare
any extra properties and methods, the ivar block can also live at the
start of the @implementation:
#import "ObjcObject.h"
#import "CppObject.h"
@implementation ObjcObject {
  CppObject wrapped;
}
- (void)exampleMethodWithString:(NSString*)str
{
  // NOTE: if str is nil this will produce an empty C++ string
  // instead of dereferencing the NULL pointer from UTF8String.
  std::string cpp_str([str UTF8String], [str lengthOfBytesUsingEncoding:NSUTF8StringEncoding]);
  wrapped.ExampleMethod(cpp_str);
}
@end
Either way, we can now #import "ObjcObject.h" to our heart's content and use ObjcObject like any other Objective-C class. The CppObject instance for the wrapped ivar will be constructed using the default constructor when you alloc (not init) an ObjcObject, and its destructor will be called on dealloc.
This often isn't what you want, particularly if there isn't a (public)
default constructor at all, in which case the code will fail to compile.
Managing the wrapped C++ object's lifecycle
The solution is to manually trigger construction via the new keyword, e.g.
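@interface ObjcObject () {
  CppObject* wrapped; // Pointer! Will be initialised to NULL by alloc.
}
@end
@implementation ObjcObject
- (id)initWithSize:(int)size
{
  self = [super init];
  if (self)
  {
    // NOTE: plain new reports failure by throwing std::bad_alloc rather than
    // returning NULL, so the check below only matters with new(std::nothrow).
    wrapped = new CppObject(size);
    if (!wrapped) self = nil;
  }
  return self;
}
//...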
If using C++ exceptions, you may want to wrap the construction in a try {...} catch (...) {...} block and handle any construction errors. With explicit construction, we also need to explicitly destroy the wrapped object:
- (void)dealloc
{
  delete wrapped;
  [super dealloc]; // omit if using ARC
}
Note that the extra level of indirection involves an extra memory
allocation. Objective-C heavily allocates and frees memory all over the
place, so this one extra allocation shouldn't be a big deal. If it is,
you can use placement new instead, and reserve memory within the Objective-C object via an extra char wrapped_mem[sizeof(CppObject)]; ivar, creating the instance using wrapped = new(wrapped_mem) CppObject(); and destroying it via an explicit destructor call: if (wrapped) wrapped->~CppObject();. The buffer must be suitably aligned for CppObject, which a plain char array does not guarantee by itself. As with any use of placement new, though, you'd better have a good reason for it.
Placement new returns a pointer to the constructed object. I would personally keep that (typed) pointer in an ivar just as with regular new. The address will normally coincide with the start of the char array, so you could get away with casting that instead.
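The placement new variant can be sketched in plain C++ so that it compiles on its own. This is only an illustration under assumptions: Holder stands in for the Objective-C wrapper (its constructor plays the role of -init, its destructor that of -dealloc), and the int constructor and size() accessor on CppObject are made up for the example.

```cpp
#include <cassert>
#include <new>      // placement new

// Hypothetical stand-in for the wrapped C++ class.
class CppObject
{
public:
    explicit CppObject(int size) : size_(size) {}
    int size() const { return size_; }
private:
    int size_;
};

// Stand-in for the Objective-C wrapper object.
class Holder
{
public:
    explicit Holder(int size)
        : wrapped(new (wrapped_mem) CppObject(size)) {}  // construct in place, no heap allocation

    ~Holder()
    {
        if (wrapped) wrapped->~CppObject();  // explicit destructor call instead of delete
    }

    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

    int size() const
    {
        assert(wrapped != nullptr);
        return wrapped->size();
    }

private:
    // Raw storage reserved inside the holder, aligned for CppObject.
    alignas(CppObject) unsigned char wrapped_mem[sizeof(CppObject)];
    CppObject* wrapped;  // the typed pointer returned by placement new
};
```

Keeping the typed pointer costs one extra pointer per object, but avoids casting the raw buffer at every use.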
Wrapping up
Now you'll probably want to wrap a bunch of member functions with
Objective-C methods, and public fields with properties whose getters and
setters forward to the C++ object. Make sure that your wrapper methods
only return and take parameters with C or Objective-C types. You may
need to do some conversions or wrap some more C++ types. Don't forget
that Objective-C's special nil semantics don't exist in C++: NULL pointers
must not be dereferenced.
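As a small illustration of the NULL-pointer caveat, here is a sketch of two conversion helpers a wrapper might use at the language boundary. Both function names are hypothetical; the first deals with the NULL that -UTF8String yields when sent to nil, the second with pointer parameters on the C++ side.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts a possibly-NULL C string into a std::string.
inline std::string StringFromCString(const char* utf8)
{
    // std::string's const char* constructor must not be given a null pointer.
    return utf8 ? std::string(utf8) : std::string();
}

// Hypothetical helper: C++ has no "message to nil", so a pointer has to be
// checked explicitly before it is used.
inline std::size_t LengthOrZero(const std::string* str)
{
    return str ? str->size() : 0;
}

// Usage sketch.
inline void ConversionExample()
{
    assert(StringFromCString(nullptr).empty());
    assert(StringFromCString("abc") == "abc");
    assert(LengthOrZero(nullptr) == 0);
}
```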
The reverse: using Objective-C classes from C++ code
I've had some email regarding the opposite: calling into Objective-C
from C++. Again the problem lies with header files. You don't want to
pollute the C++ header with Objective-C types, or it can't be #included from pure C++. Let's say we want to wrap the Objective-C class ABCWidget, declared in ABCWidget.h:
A pure C++ compiler will trip over the code in Foundation.h and eventually the @interface block for ABCWidget.
Some things never change: PIMPL
There's no such thing as a class extension in C++, so that trick
won't work. PIMPL, on the other hand, works just fine and is actually
quite commonly used in plain C++ anyway. In our case, we reduce the C++
class to its bare minimum:
This is mostly self-explanatory; the reason it works is that a
forward declaration of a struct or class suffices for declaring
variables or members as pointers to such struct or class objects. We only dereference the impl pointer inside Widget.mm after we fully define the WidgetImpl struct type.
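The shape of the idiom can be sketched in plain C++. Treat this as an assumption-laden illustration rather than the actual Widget sources: the Objective-C member is replaced by an int counter so the sketch compiles without the Objective-C runtime, and ReticulationCount() is a hypothetical accessor added for demonstration.

```cpp
#include <cassert>

// --- Widget.h: all that C++ clients get to see ---
namespace abc
{
    struct WidgetImpl;        // forward declaration only

    class Widget
    {
        WidgetImpl* impl;     // a pointer to an incomplete type is fine
    public:
        Widget();
        ~Widget();
        Widget(const Widget&) = delete;
        Widget& operator=(const Widget&) = delete;
        void Reticulate();
        int ReticulationCount() const;   // hypothetical, for demonstration
    };
}

// --- Widget.mm: compiled as plain C++ for this sketch ---
namespace abc
{
    struct WidgetImpl
    {
        // In the real .mm file this would hold the Objective-C object,
        // e.g. ABCWidget* wrapped; a counter stands in for it here.
        int reticulations = 0;
    };

    Widget::Widget() : impl(new WidgetImpl()) {}
    Widget::~Widget() { delete impl; }   // the real destructor would also release the wrapped object
    void Widget::Reticulate() { ++impl->reticulations; }
    int Widget::ReticulationCount() const { return impl->reticulations; }
}

// Usage sketch: client code never needs the definition of WidgetImpl.
inline void WidgetExample()
{
    abc::Widget widget;
    widget.Reticulate();
    assert(widget.ReticulationCount() == 1);
}
```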
Notice that I release the wrapped object in the
destructor. Even if you use ARC in your project, I recommend you disable
it for C++-heavy Objective-C++ files like this one. You can make your C++ code behave itself even with ARC, but it'll often be more work than just putting in the release and retain
calls. You can disable ARC for individual files in Xcode under the
'Build Phases' tab in the build target's properties. Fold out the
'Compile Sources' section and add -fno-objc-arc to the compiler flags for the file(s) in question.
A shortcut for wrapping Objective-C objects in C++
You may have noticed that the PIMPL solution uses two levels
of indirection. If the wrapper is as thin as the one in this example,
that's probably overkill. Although Objective-C types can generally not
be used in plain C++, there are a few types that are actually defined in
C. The id type is one of them, and it's declared in the <objc/objc-runtime.h>
header. You lose what little type safety Objective-C gives you, but it
does mean you can place your object pointer directly into the C++ class
definition:
#include <objc/objc-runtime.h>
namespace abc
{
  class Widget
  {
    id /* ABCWidget* */ wrapped;
  public:
    Widget();
    ~Widget();
    void Reticulate();
  };
}
Sending messages to id isn't really advisable, as you lose a lot of
the compiler's checking mechanism, particularly in the presence of
ambiguities between differently-typed methods with the same selector
(name) in different classes. So:
So, if this header is #imported in a .mm file, the compiler is fully aware of the specific class type. If #included in pure C++ mode, ABCWidget* is identical to the id type: id is defined as typedef struct objc_object* id;. The #ifdef block can of course be further tidied up into a reusable macro:
#ifdef __OBJC__
#define OBJC_CLASS(name) @class name
#else
#define OBJC_CLASS(name) typedef struct objc_object name
#endif
We can now forward-declare Objective-C classes in headers usable by all 4 languages:
OBJC_CLASS(ABCWidget);
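Putting the pieces together, this is roughly what such a header looks like when it is seen by a pure C++ compiler. It is a sketch under assumptions: the inline constructor and the IsEmpty() accessor are stand-ins added so the example compiles by itself; in the real project the member functions would be defined in Widget.mm.

```cpp
#include <cassert>

// In Objective-C(++) the compiler defines __OBJC__ and sees a real class;
// in pure C++ the name becomes an alias for the runtime's object struct,
// so ABCWidget* is the same type as id.
#ifdef __OBJC__
#define OBJC_CLASS(name) @class name
#else
#define OBJC_CLASS(name) typedef struct objc_object name
#endif

OBJC_CLASS(ABCWidget);

namespace abc
{
    class Widget
    {
        ABCWidget* wrapped;   // typed in Objective-C++, opaque in plain C++
    public:
        Widget() : wrapped(nullptr) {}   // stand-in; normally defined in Widget.mm
        bool IsEmpty() const { return wrapped == nullptr; }   // hypothetical accessor
    };
}

// Usage sketch from the pure C++ side.
inline void HeaderExample()
{
    abc::Widget widget;
    assert(widget.IsEmpty());
}
```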
Acknowledgements
Many thanks to Christopher Atlan, Uli Kusterer and Jedd Haberstro for their suggestions and corrections after reading drafts of this article.
Thanks to Rick Mann for making a suggestion that prompted me to come up with the final version for wrapping Objective-C classes with C++.
Strategies for Using C++ in Objective-C Projects (and vice versa)
Update (May 2012): while nothing in this article is incorrect, there have been some changes to Objective-C since clang became Apple's primary compiler. This means there is now an easier way to combine C++ and Objective-C than the techniques proposed here, as long as you're using clang and don't need to maintain GCC compatibility. I have written about the new technique and the feature enabling it in a new article. For a more extensive explanation and some alternative solutions, all of which still work with GCC, read on.
If you're in a hurry and want to get straight to the solution of embedding C++ objects in Objective-C classes without tainting the header files so they can still be included from plain Objective-C, you can skip straight to the conclusion showing the solution to use in ~95% of cases. The rest of the article contains deeper analysis of the issue at hand and alternative approaches to solving it.
Why mix Objective-C with C++?
When using Objective-C for whatever reason, typically for iOS or Mac
development, I've often encountered situations where I wanted to
incorporate C++ in the project in some way. Sometimes the best library
for the job happens to be written in C++. Sometimes the solution to a
problem is more succinctly implemented using C++. The most obvious example
is C++ templates, which can save you from repeating boilerplate code.
Maybe less obviously, I find that Objective-C is sometimes too
object oriented. This is obviously heresy among the "everything is an
object" folks, but for non-trivial data structures, I often find
classical object orientation unwieldy, and C's structs just a bit too
weak. C++'s model is a continuum.
Objective-C also is quite assertive about memory management, which
can get in the way, at least in the absence of garbage collection. The
STL (and its newer shared_ptr extension) often lets you
forget about that issue altogether, or to concentrate it in constructors
and destructors, rather than littering your code with retain and release.
It is of course a matter of taste and applies differently to different
situations; automating memory management tends to be most helpful in
code with complex data structures, or code which is heavily algorithmic.
Another good reason for mixing Objective-C with C++ is the opposite
situation: the need to use Objective-C libraries, such as those for the
Apple platforms, from a C++ project. One common scenario for this is
porting a game or engine to those platforms, and most of the following
techniques can be applied in those cases too.
Finally, you might want to use C++ for performance reasons. The
flexibility of Objective-C messaging adds some overhead compared to most
implementations of C++ virtual functions, even with the method caching
techniques used in modern runtimes. Objective-C objects don't have an
equivalent of C++'s non-virtual functions, which are faster still. This
can be relevant for optimising performance hotspots.
The lowest common denominator: C
One option for using the two languages in the same project is to
separate them completely. You only allow them to communicate via a pure C
interface, thus avoiding mixing the languages altogether. The code
using the C++ library goes in a .cpp file, the code calling it is pure
Objective-C (.m), the interface is declared in a C header, and the C++
side implements its interface with extern "C" functions.
This will work quite well in simple cases, but more likely than not
you'll be writing quite a bit of wrapper code. Anyone with experience
writing dynamically loadable C++ libraries via a public C interface
knows this all too well. [1]
Virtually all Objective-C seems to be compiled with GCC or clang nowadays. Both compilers support Objective-C++, usually a better means for mixing the languages.
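To give an idea of the wrapper code involved, here is a minimal sketch of such a C interface around the CppObject class used earlier. The handle type, the function names and the last_length member are all hypothetical; header and implementation are shown in one listing so the sketch compiles as a single C++ file.

```cpp
#include <cassert>
#include <stddef.h>
#include <string>

// --- CppObjectC.h: plain C header, safe to #import from .m files ---
#ifdef __cplusplus
extern "C" {
#endif

typedef struct CppObjectHandle CppObjectHandle;   // opaque to C callers

CppObjectHandle* CppObjectCreate(void);
void CppObjectDestroy(CppObjectHandle* handle);
void CppObjectExampleMethod(CppObjectHandle* handle, const char* str);
size_t CppObjectLastLength(const CppObjectHandle* handle);

#ifdef __cplusplus
}
#endif

// --- CppObjectC.cpp: the C++ side ---
class CppObject
{
public:
    void ExampleMethod(const std::string& str) { last_length = str.size(); }
    size_t last_length = 0;   // hypothetical state, for demonstration
};

struct CppObjectHandle { CppObject object; };

extern "C" CppObjectHandle* CppObjectCreate(void)
{
    // Exceptions must not propagate into C or Objective-C callers.
    try { return new CppObjectHandle(); } catch (...) { return nullptr; }
}

extern "C" void CppObjectDestroy(CppObjectHandle* handle)
{
    delete handle;
}

extern "C" void CppObjectExampleMethod(CppObjectHandle* handle, const char* str)
{
    if (!handle || !str) return;
    try { handle->object.ExampleMethod(str); } catch (...) {}
}

extern "C" size_t CppObjectLastLength(const CppObjectHandle* handle)
{
    return handle ? handle->object.last_length : 0;
}

// Usage sketch, written the way a C or Objective-C caller would use it.
inline void CInterfaceExample()
{
    CppObjectHandle* handle = CppObjectCreate();
    assert(handle != nullptr);
    CppObjectExampleMethod(handle, "hello");
    assert(CppObjectLastLength(handle) == 5);
    CppObjectDestroy(handle);
}
```

Every member function needs its own forwarding function like this, which is where the bulk of the wrapper code comes from.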
Objective-C++ and the trouble with header files
At first glance, using the Objective-C++
dialect looks like a straightforward approach. It is the result of
mashing C++ and Objective-C together in the same compiler, and robust
implementations exist in GCC and now clang.
Considering just how different the details of Objective-C and C++ are,
the GCC hackers have done a great job of it. But as you start renaming
your .m files to .mm to introduce chunks of C++, you quickly realise
it's not quite so simple.
Header files and the C preprocessor in general have caused headaches
for C, C++ and Objective-C programmers for decades. It gets worse when
you try to mix the languages. Say you wanted to use the STL's map
in an Objective-C class in your project. Apple's Foundation libraries
to my knowledge don't contain a sorted, tree-based map; one of our StyleKit Components needs exactly that, for example. So we simply create an instance variable for the map in our class and away we go:
However, std::map<int, id>[2] only makes sense to a C++-aware compiler, and only after an #include <map>[3], so this header now can only be #imported
from Objective-C++ files. Any code using this class now needs to be
converted to Objective-C++ itself, and importing from other headers
leads to a cascade effect that quickly encompasses the whole project.
In some cases, this may be acceptable. However, switching a whole
project or large parts of it across just to introduce a library which is
used in one location is not only excessive; if you're the only one who
knows C++ on a project with multiple Objective-C programmers, you might
find this to be an unpopular idea. It might also cause issues in the
same way that compiling pure C code with a C++ compiler rarely is
completely hassle-free. Moreover, it means that code isn't automatically
reusable in other Objective-C projects.
In most cases, using Objective-C++ only where necessary is the way to
go, keeping the majority of the code in pure Objective-C or C++, but
the best way of doing so was not immediately apparent to me. I also
found remarkably little mention of the issue on the web.
Shooting roughly in the direction of your foot: void*
Given these problems, the objective is to remove all trace of C++,
mainly member types, from header files. The typical C way of hiding
types is to use a pointer to void. This will certainly work here, too.
@interface MyClass : NSObject {
 @private
  // is actually a std::map<int, id>*
  void* lookupTable;
}
// ...
@end
In the code using the table, we always have to cast to the correct type using static_cast<std::map<int, id>*>(lookupTable) or ((std::map<int, id>*)lookupTable),
which is annoying at best. If the actual type of the member ends up
changing, all the casts must be changed manually - an error-prone
process. With a growing number of members, keeping track of the correct
types becomes infeasible. You really are getting the worst of both
worlds from static and dynamic typing. If you use this approach when
dealing with objects in a class hierarchy, you're dicing with death
outright due to the possibility that an A* and a B* to the same object don't have identical void* representations.[4]
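The class hierarchy hazard can be illustrated with a small sketch. The types A, B and C are hypothetical; the point is that a void* must be cast back to exactly the type that was stored in it before any conversion to a base class takes place.

```cpp
#include <cassert>

// Hypothetical hierarchy with two non-empty base classes.
struct A { int a; virtual ~A() {} };
struct B { int b; virtual ~B() {} };
struct C : A, B { int c; };

// Correct recovery: first back to the stored type (C*), then let the
// compiler perform the C* -> B* conversion, which may adjust the address.
inline B* RecoverB(void* stored_as_c)
{
    C* object = static_cast<C*>(stored_as_c);
    return object;
}

// Usage sketch.
inline void VoidPointerExample()
{
    C object;
    object.b = 7;
    void* opaque = &object;   // stored as a C*

    B* good = RecoverB(opaque);
    assert(good == static_cast<B*>(&object));
    assert(good->b == 7);

    // static_cast<B*>(opaque) would also compile, but on common ABIs it
    // yields the address of the whole C object rather than of its B
    // subobject, and using the result is undefined behaviour.
}
```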
Suffice to say, we can do better.
Conditional compilation
Losing type information sucks, but since we can only use the C++
typed fields from Objective-C++ code, and the pure Objective-C compiler
only needs to be aware of their presence for the purposes of memory
layout, we can provide 2 different versions of the code. The
preprocessor symbol __cplusplus is defined in Objective-C++ mode, so how about this:
It's not pretty, but it's much easier to work with. The C++
standard probably doesn't guarantee that a class-pointer and a
void-pointer have the same memory properties, but Objective-C++ is a
non-standard GNU/Apple thing anyway. I've only found pointers to virtual
member functions to be a problem when converting to void* in practice,
and the compiler will complain loudly if you attempt this. If you're
worried, use static_cast<> instead of C-style casts.
Still, C happily casts void* to other pointer types
implicitly, so it may be preferable to replace the C part of the #ifdef
with a pointer to an opaque struct with a unique and recognisable name,
e.g. struct MyPrefix_std_map_int_id. You can even define a
macro which expands to the correct type depending on the compiler's
language. You can't auto-mangle templated types in a C macro though, and
you'll struggle with multiple levels of namespace nesting.
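A sketch of the macro idea follows, with several assumptions: the macro name is invented, MyClassFields is a plain struct standing in for the ivar block of MyClass, and const char* replaces id as the map's value type so that the example compiles as ordinary C++.

```cpp
#include <cassert>

#ifdef __cplusplus
#include <map>
// (Objective-)C++ sees the real type.
#define MYPREFIX_LOOKUP_TABLE std::map<int, const char*>
#else
// C and plain Objective-C see an opaque, uniquely named struct.
#define MYPREFIX_LOOKUP_TABLE struct MyPrefix_std_map_int_id
#endif

// Stand-in for the instance variable block of MyClass.
struct MyClassFields
{
    MYPREFIX_LOOKUP_TABLE* lookupTable;
};

// Usage sketch from the C++ side: no casts are needed.
inline void ConditionalExample()
{
    MyClassFields fields = { new std::map<int, const char*>() };
    (*fields.lookupTable)[1] = "one";
    assert(fields.lookupTable->size() == 1);
    delete fields.lookupTable;
    fields.lookupTable = nullptr;
}
```

Unlike void*, the opaque struct pointer does not convert implicitly to other pointer types on the C side, so accidental misuse is caught by the compiler.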
You can't avoid conditionally including C++ headers with this method,
and you may confuse/upset those who don't know/like C++, and it's all
rather ugly. Luckily, there are other options.
Abstract classes, interfaces and protocols
As a C++ programmer, you're probably familiar with pure virtual
functions and, as a consequence, abstract classes. Other languages such
as Java and C# have an explicit "interface" concept which deliberately
leaves out any implementation details. Hiding the implementation is
exactly what we're trying to do, so can we use a similar pattern here?
Recent versions of Objective-C do support protocols, which are similar to Java/C# interfaces in spirit, if not in syntax. [5]
We could specify our class's public methods in a protocol in the header
and specify and implement the concrete class conforming to said
protocol in private code. This works well for instance methods, but
there's clearly no way to directly create class instances via the
protocol. You therefore need to delegate the task of allocating and
initialising new objects to a factory object or a free C function.
Worse, protocols are orthogonal to the class hierarchy, so reference
declarations will look different than those for other types:
id<MyProtocol> ref;
Instead of the expected
MyClass* ref;
Workable, but probably not ideal.
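For comparison, this is the C++ counterpart of the pattern: an interface made of pure virtual functions plus a free factory function, with the concrete class hidden in the implementation file. All names are hypothetical and the two files are shown as one listing.

```cpp
#include <cassert>
#include <map>      // only the private implementation file would need this
#include <memory>

// --- public header: interface and factory function only ---
class LookupTable
{
public:
    virtual ~LookupTable() {}
    virtual void Insert(int key, int value) = 0;
    virtual bool Contains(int key) const = 0;
};

// Clients cannot name the concrete class, so creation goes through this.
std::unique_ptr<LookupTable> MakeLookupTable();

// --- private implementation file ---
namespace
{
    class MapLookupTable : public LookupTable
    {
        std::map<int, int> table;
    public:
        void Insert(int key, int value) override { table[key] = value; }
        bool Contains(int key) const override { return table.count(key) != 0; }
    };
}

std::unique_ptr<LookupTable> MakeLookupTable()
{
    return std::unique_ptr<LookupTable>(new MapLookupTable());
}

// Usage sketch.
inline void InterfaceExample()
{
    std::unique_ptr<LookupTable> table = MakeLookupTable();
    table->Insert(3, 30);
    assert(table->Contains(3));
    assert(!table->Contains(4));
}
```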
True abstract classes in Objective-C
So what about abstract classes? There's no direct, idiomatic support for them in the language, but even the very prominent NSString
is abstract, and you wouldn't know it just by using it. It turns out
that documentation on the subject is scarce. One option is to leave out
the method implementations in the abstract base altogether and live with
the compiler warning about the incomplete class. At runtime, attempts
to call unspecified methods will raise exceptions. More helpfully, but
also much more laboriously, you can create dummy implementations which
do nothing but raise an exception explaining the situation.
In most languages you need to know the concrete class when creating
instances, or delegate this task to a factory. Interestingly, you can
alloc and init NSString directly and receive subclass instances! As far as I know, the init methods return a different object than the self they are given, or they're internally doing funky things with the object's type. Alternatively, NSString's alloc class method could be overridden to call its NSCFString counterpart, NSString
thus acting as a factory for its own subclasses. If you do this
yourself, you'll also need to define all the init* methods the concrete
class uses on the abstract class, too, or they won't be visible to users
of your class.
So far, this is definitely the cleanest solution for the header file
and for users of the class, but it's also by far the most laborious,
requiring an extra class, dummy/abstract methods and complicated init
wrangling.
However, C++ programmers have found an elegant solution to similar
problems. Exploding compile times due to seemingly infinite header
dependencies are a serious issue in large C++ projects, and hiding class
internals from library users is also often desired. As it happens, the
solution to these can be applied to the Objective-C/C++ conundrum.
Pimpl
Short for "pointer to implementation" or "private implementation,"
this idiom is less unpleasant than its name might suggest at first. It's
well-documented in the C++ literature. It's also quite simple. In the public header, add a forward declaration of an implementation struct, typically using the public class's name suffixed with "Impl" or some such convention. This struct
will hold all the members we want to hide from the public class's
header. Add a pointer to the struct as a class instance variable, and
define the struct's members in the .cpp file (or in our case, the .mm),
not the header. On construction (here: -init*), construct an instance of the struct using the new operator and set the instance variable to it, and ensure delete is called on destruction (here: in -dealloc).
In MyClass.h:
This works because forward declarations of structs are valid C code,
even if the struct later turns out to have implicit or explicit C++
constructors, or even base classes. The public class's methods access
the contents of the struct via the pointer, possibly via member
functions of the struct. Construction and destruction of members is
handled by the new and delete operators as long as they are called correctly.
It's more or less up to you whether the functionality lives in member
functions of the public class or the implementation class, or both. You
might gain some efficiency by avoiding Objective-C messages in some
cases, but it can get messy if the C++ methods need to call the
Objective-C class's.
It should be noted that instead of single-member implementation structs, the implementation class can actually derive from the type of the member instead, thus avoiding the indirection on method calls. For example, in MyClass.h:
This becomes less practical with larger numbers of members due to the
many new/delete pairs required. Locality of reference may also be
worse.
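The derivation variant can be sketched as follows. As before, MyClassFields is a hypothetical stand-in for the Objective-C class's instance variables, and const char* replaces id as the value type so the sketch compiles as plain C++.

```cpp
#include <cassert>
#include <map>

// --- MyClass.h: the forward declaration is valid C and Objective-C ---
struct MyClassImpl;

// Stand-in for the ivar block of MyClass.
struct MyClassFields
{
    struct MyClassImpl* impl;
};

// --- MyClass.mm: the implementation struct *is* the map ---
struct MyClassImpl : std::map<int, const char*> {};

// Usage sketch: map operations are available directly on impl,
// without going through an intermediate member.
inline void DerivedImplExample()
{
    MyClassFields fields = { new MyClassImpl() };
    (*fields.impl)[3] = "three";
    assert(fields.impl->count(3) == 1);
    delete fields.impl;   // deleted through its own type, not through the base
    fields.impl = nullptr;
}
```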
Limitations
In pure C++, you may of course declare the implementation as a class,
though this clearly won't work in our case where the forward
declaration must be valid Objective-C. In the C++ literature, you may
also find recommendations to use shared_ptr<> or auto_ptr<>
to handle automatic deletion of the implementation object, or even to
use a mix-in (templated base class) to provide the functionality.
Neither will work with Objective-C headers. Even in runtimes that
support correct construction/destruction of C++ members, the
implementation member must be a pointer, as the struct type is incomplete in the header and reserving the correct amount of memory in the class will fail.
Because the implementation's definition is private, accessing members
from a derived class, i.e. protected members, isn't directly possible.
You can however move the definition of the implementation struct to a
semi-private header, which only needs to be included by subclasses
requiring direct access to it. Those subclasses must therefore be
written in Objective-C++ themselves. Extending the implementation by
subclassing it will be tricky as the pointer will still have the
superclass type; you might prefer to create a second, separate struct
instead.
Final thoughts
Nevertheless, I've found the Pimpl idiom to be the best choice for
embedding C++ in Objective-C in almost all cases. Even when the struct
only contains one member, the lack of casts easily makes up for the
indirection. No efficiency is lost as long as the struct member isn't a
pointer itself. For the reverse case of embedding Objective-C in a C++
class, use a similar approach. Define the public interface of the C++
class and a forward declaration for the implementation class in the
header and place its definition with Objective-C member types in the
corresponding .mm file.
A real-world example of embedding C++ can be found in my Objective-C wrapper for the Open-VCDiff decoder.
An Update (May 2012)
Since I first wrote this article in November 2010, Apple has switched
to the clang compiler and made some changes to the Objective-C
language. One of the new features, the class extension, allows you to
declare instance variables outside the header file, opening up new
possibilities for language mixing. I have written a new article explaining the class extension technique - follow the link to read more on the subject.
Acknowledgements
Thanks to Björn Knafla for the
valuable suggestions after reading drafts of this article and for the
Twitter discussion which ultimately led to writing this article in the
first place.
Thanks also go to Markus Prinz for reading drafts of this article and making helpful suggestions.
References and Footnotes
[1] e.g. Chapter 7, Matthew Wilson, Imperfect C++; 2005 Addison-Wesley
[2] Note that in earlier versions of Objective-C++,
lookupTable would have had to be a pointer and the map would need to be
constructed explicitly with new, as the STL map has a
non-trivial constructor. Clean C++ member construction and destruction
in the Objective-C object lifecycle is a recent addition to Apple's
implementation. I haven't tested it on other implementations of the
runtime.
[3] You can forward declare std::map in theory but lose the ability to use the defaults for the third and fourth template parameters by doing so.
[4] I have so far only encountered this in
connection with multiple inheritance. Still, class hierarchies can
change, and memory layouts tend to be implementation-defined.
[5] In line with the rest of Objective-C, protocols
are also "looser" and more dynamic than interfaces in C#/Java, but that
doesn't change anything in this case.