
Phusion Blog

Objective-C for Ruby developers, un not-so-petit interlude (1/2)

By Jean Pierre Hernandez on March 24th, 2010

Bonjour les amis! Welcome back to this second installment of our tour de MacRuby! In the previous article, we went over the basics of Xcode and Interface Builder. With this preliminary knowledge, we were quickly able to write our very first Cocoa application using MacRuby as well as understand the importance of Camembert!

Indeed, that was quite an interesting experience as it showed not only how easy it was, but hopefully, it also showed how fun this was as well. If I’ve been unable to convince you of this in the last article, I certainly hope I will be able to do so in this article as we will compare the MacRuby way with the traditional “Objective-C” way.

Assuming you are already well familiar with Ruby, it’s important to go over some basics of Objective-C seeing as Cocoa and its documentation assume this language. Basic understanding of Objective-C is therefore a must when it comes to developing Cocoa applications.

At first glance, Objective-C and Ruby couldn’t be more different from each other. Where Ruby is beautiful and elegant, Objective-C seems to be cluttered with square brackets. Looking past the syntactic differences though, you’ll be able to see that they really don’t differ that much from one another.

For instance, both Objective-C and Ruby are dynamically typed object oriented programming languages. They both support dynamic dispatching via message dispatching, i.e. objects respond to messages, and the code that corresponds to a message is looked up and run at runtime.

Realizing these similarities, Laurent Sansonetti — black-belt Patrick Hernandez imitator at Apple Inc. — set out to unify these two worlds, resulting in MacRuby. With MacRuby one can access the Cocoa library as if it were an integral part of Ruby itself. This in contrast to RubyCocoa which acts as a bridge between Ruby and Cocoa.

At its core, Objective-C is a strict superset of C, meaning I’m using fancy words to say that at its core, it’s still C. More specifically, this means that any C program should compile with an Objective-C compiler, but not the other way around.

Seeing as Objective-C can pretty much be considered a thin layer on top of C, you’ll still have to deal with tedious things like forward declarations (separation of declaration and definition), macros, memory pointers and, prior to Objective-C 2.0, manual memory management. Let’s see… in Ruby, you have to deal with pretty much none of these. With the “Ruby is better than X 😉 ” doctrine in mind, let’s go over some of the things that make up Objective-C.

Selectors

In Cocoa jargon, selectors denote method names/identifiers. In Objective-C, a method call on an NSMutableDictionary instance (think Ruby Hash) to set an object for a particular key would look something like this:

[myMutableDictionary setObject:foo forKey:bar];

As you’ve undoubtedly noticed from looking at this code, method names in Objective-C look pretty different from Ruby, to say the least. Indeed, unlike Ruby, Objective-C allows the method name to accommodate the parameters in such a way that when we invoke a method, it looks like we have to provide the parameter names too, not only their values. This allows the Cocoa API to be very descriptive based on the method name alone.

In particular, the signature for the above method is actually:

-(void)setObject:(id)anObject forKey:(id)aKey;

As mentioned before, this is very different from how you would define a method like this in Ruby, which would look something like:

def set_object_for_key(anObject, aKey)
	#...
end

Or overload an operator should you be so inclined to impress the ladies:

def []=(key, object_value)
        #...
end

In the Objective-C signature, we see that the method signature starts with a minus symbol denoting that this is indeed an instance method. If we were to use a plus here instead, it would denote a class method, i.e. a method that would operate on a class-level as opposed to an instance level.

So to drive the point home:

@implementation SomeClass
+(id)foo {
    //...
}
@end

would mentally translate to the following in Ruby:

class SomeClass
    def self.foo
        #...
    end
end

In addition to the above, we also had to provide type information for the return value and the parameters of the function in Objective-C. Here, the type following the minus sign denotes the return type. In the setObject:forKey: case, we have void, which means that our method returns “nothing”. The (id) in the parameter list signifies that it’s an object of “an arbitrary type”. In C parlance, we could compare this to a void pointer, but in Objective-C we use (id) mainly for Objective-C objects.

Drawing our attention back to the aforementioned syntax comparison (i.e. setObject:forKey: and set_object_for_key), this is basically what RubyCocoa does when bridging the APIs, and it shouldn’t be too hard to realize that this becomes quite hard to read once the argument list grows. In particular, one could easily get confused about which argument in the list is responsible for what.

In order to retain the descriptive nature of Objective-C methods for arguments, MacRuby had to introduce a similar mechanism to keep Cocoa as readable as it would be in Objective-C.

In MacRuby, the above method invocation would look like this:

myMutableDictionary.setObject(foo, forKey: bar)

As you can see, it is still clear that the object being set is foo and that bar is the key.
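To make the mapping between such a call and the Objective-C selector a bit more tangible, here is a small sketch in plain Ruby. The helper selector_for is purely hypothetical (it is not part of MacRuby); it only illustrates how the method name and the argument labels combine into the selector name.

```ruby
# Hypothetical helper, not a MacRuby API: builds the Objective-C selector
# name that a MacRuby-style call would correspond to.
def selector_for(method_name, *positional, **labeled)
  # The first argument belongs to the method name itself; every labeled
  # argument contributes its own "label:" part.
  parts = positional.empty? ? [method_name.to_s] : ["#{method_name}:"]
  labeled.each_key { |label| parts << "#{label}:" }
  parts.join
end

puts selector_for(:setObject, :foo, forKey: :bar)  # setObject:forKey:
puts selector_for(:eat, :camembert)                # eat:
puts selector_for(:someOtherOperation)             # someOtherOperation
```

Note how a method without arguments has no colon at all, which is why someOtherOperation and eat: are spelled differently.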

Seeing as selectors identify a method, they can be used in conjunction with message dispatching to vary method invocation during runtime. Indeed, this is very similar to Ruby’s send method.

In particular, Objective-C’s:

[apple performSelector:@selector(eat:)
			withObject:camembert]; // Equivalent to [apple eat:camembert];

Would be equivalent to Ruby’s:

apple.send(:eat, camembert)

Note that in the Objective-C example, we’ve made use of the @selector directive to refer to a method by name. In Ruby, we could simply use a string or a symbol.
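As a quick illustration on the Ruby side, the sketch below sends the same message by symbol, by string, and through a name that is only picked at runtime. The Apple class is a made-up stand-in for this example.

```ruby
# Made-up class for illustration purposes.
class Apple
  def eat(topping)
    "apple with #{topping}"
  end
end

apple = Apple.new

# A symbol and a string name the same method.
by_symbol = apple.send(:eat, "camembert")
by_string = apple.send("eat", "camembert")

# The method name can also be decided at runtime.
action = [:eat].first
by_variable = apple.public_send(action, "camembert")

puts by_symbol
puts by_symbol == by_string && by_string == by_variable
```

public_send behaves like send but refuses to call private methods, which makes it the safer default when the name comes from elsewhere.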

The alert reader will have noticed that the same mechanism can be used to set up the target-action mechanism we’ve seen in Xcode and Interface Builder. Instead of dispatching an action of a button via Interface Builder to an action method, we could also write:

[myButton setAction:@selector(myAction:)];
[myButton setTarget:myController];

In case of a click event, this will perform the myAction: selector on the myController object.
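The same idea can be mimicked in a few lines of plain Ruby. This is only a sketch: Button and Controller are hypothetical stand-ins, not the Cocoa classes, and click simulates what the framework would do when the user clicks.

```ruby
# Hypothetical stand-ins for NSButton and a controller; not Cocoa classes.
class Button
  attr_accessor :target, :action

  # Simulates a click: send the stored action to the stored target,
  # passing the button itself as the sender.
  def click
    target.send(action, self) if target && action
  end
end

class Controller
  attr_reader :clicks

  def initialize
    @clicks = 0
  end

  # The action method; by convention it receives the sender.
  def my_action(sender)
    @clicks += 1
  end
end

my_button = Button.new
my_controller = Controller.new
my_button.action = :my_action
my_button.target = my_controller

my_button.click
puts my_controller.clicks
```

The button knows nothing about the controller's class; it only stores a receiver and a method name, which is exactly what makes the wiring changeable at runtime.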

Declaration, Implementation, Linking

Earlier on, we made mention that Objective-C could be considered as a thin layer on top of C, and in a more formal sense, that Objective-C should be considered as a strict superset to C. This should become a little bit more apparent too when we consider some of the traits it has inherited from C in order to work cooperatively with it. We elaborate on this for the purpose of allowing you to get a deeper understanding when looking into existing Cocoa/Objective-C code and applying this to MacRuby.

Like in C, any C function or variable you use in Objective-C needs to be declared prior to its use. These declarations are required for the compiler to build up a so-called symbol table, which it uses to determine scope, type and sometimes the location in memory of the symbol. Indeed, tedious yackshaving stuff we don’t need to bother ourselves over in Ruby! Objective-C method invocations, on the other hand, are dynamically dispatched through message dispatching and it is therefore not strictly necessary to declare the receiver’s type upfront. For example:

id foo = [[Foo alloc] init];
[foo someArbitraryMethod];

The example above this line will compile just fine as it is syntactically correct, but can break during runtime if foo is unable to respond to someArbitraryMethod. Compare that to the situation when we provide the actual type upfront too:

Foo *foo = [[Foo alloc] init];
[foo someMethod];

By doing this, we’re saying that foo points to an object in memory of type Foo. With this information, the compiler can now check during compile time whether someMethod is declared in Foo, basically making sure foo can respond to the method. It is for this reason that I’d recommend using the actual type rather than id wherever you can, as the compiler may then be able to provide additional assistance.
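Ruby always behaves like the id case: nothing checks the call ahead of time. The sketch below, using a made-up Foo class, shows the failure surfacing only at runtime, and how asking the object first avoids it.

```ruby
# Made-up class for illustration purposes.
class Foo
  def some_method
    "handled"
  end
end

foo = Foo.new

# Nothing checks this call ahead of time; it only fails once it runs.
begin
  foo.some_arbitrary_method
  outcome = "no error"
rescue NoMethodError
  outcome = "NoMethodError"
end
puts outcome

# Asking the object first avoids the failure.
safe = foo.respond_to?(:some_arbitrary_method) ? foo.some_arbitrary_method : "skipped"
puts safe
```

Objective-C offers the same kind of runtime question through respondsToSelector:.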

The declaration of a class is put in so-called header files (.h files, e.g. MyClass.h), whereas the implementation itself is put into implementation files (.m files, e.g. MyClass.m). The implementation file is then expected to #include (or #import) its header file to get its declarations. As the @interface directive denotes in a typical Objective-C header file, we concern ourselves with exposing the interface of the class within this file by declaring it for use for others. When we compile a class’ implementation file (and its included header file) in this manner, we end up with a so called object file. This object file contains our compiled class code.

Other files depending on this class are expected to include its header file (thus obtaining its interface) and let the linker hook up the compiled implementation (object file) to it during linking. The ultimate end result should be the executable itself.

How different this process is indeed from Ruby, where the interpreter does not need us to do this kind of yackshaving at all. Of course, a compiled variation usually performs considerably better than an interpreted version, but the latter no longer strictly applies with MacRuby’s ahead-of-time and/or just-in-time compilation. Indeed, we can just compile our Ruby code to native and get similar performance to a compiled Cocoa/Objective-C application, sans the yackshaving! C’est parfait!

To drive the point home, a declaration for an Objective-C class in a Cocoa application usually looks something like this:

#import <Cocoa/Cocoa.h>
#import "Cheese.h"

@interface Camembert : Cheese {
	int age;
	Smell *smell;
}

@property (readwrite, assign) int age;
@property (readwrite, retain) Smell *smell; 

-(id)initWithAge:(int)age smell:(Smell *)smell;
-(void)someOtherOperation;

@end

First off, we see an #import pre-processor directive which will include the Cocoa header file. Angle brackets here denote that the compiler will look for this file in the system directories (which are also configurable through compiler flags).

Also note that the #import directive differs from the #include directive in that #import will take care of including the provided file no more than once. This will prevent multiple declarations for the same symbols to occur, which normally would result in compiler errors. So called include guards would normally be necessary in the header files to prevent this from happening in the case of #include.

Then, we see the interface declaration of the class Camembert, in particular, it being declared as a subclass of Cheese. For the latter, we had to import the header for Cheese too. Note here that the use of normal quotes denotes that the compiler should look for this header file within the user specified directory. By default, this includes the current working directory, but we can specify these directories with compiler flags if we would really need to.

Within the curly braces, we are expected to declare our instance variables. As any good Camembert, we expect it to have a distinct smell and age. Thus, we declare an age of type integer and a smell of type Smell as our instance variables. The asterisk at smell denotes that this is a pointer, which we’ll elaborate on at a later stage.

Then, we see the following two property directives:

@property (readwrite, assign) int age;
@property (readwrite, retain) Smell *smell;

Even though scary looking, you should already be familiar with something like this within Ruby, be it in the form of attr_accessor, attr_reader and attr_writer.
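As a refresher on the Ruby side, the sketch below puts a generated accessor pair next to a hand-written one. Both class names are made up for this comparison.

```ruby
# Getter and setter generated for us.
class GeneratedCamembert
  attr_accessor :age
end

# The same pair written out by hand.
class HandWrittenCamembert
  def age
    @age
  end

  def age=(a)
    @age = a
  end
end

[GeneratedCamembert, HandWrittenCamembert].each do |klass|
  cheese = klass.new
  cheese.age = 3
  puts "#{klass}: #{cheese.age}"
end
```

Both classes respond to exactly the same messages; attr_accessor merely saves the typing.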

Like the aforementioned Ruby attr_* family of methods, the @property directive can generate a getter and/or setter for a particular instance variable of our class. Consider:

@property (readwrite, assign) int age;

Here, we say we want our instance variable age to be available with read and write access (getter and setter), and we want the argument to be assigned to the property. This should eventually expand in code like the following if we were to use the “@synthesize age;” directive in the implementation file:

-(void)setAge:(int)a {
	 age = a;
}

-(int)age {
	return age;
}

A similar process takes place for the smell property, provided we use “@synthesize smell;” in the implementation file:

-(void)setSmell:(Smell *)s {
	if(smell != s) {
		[smell release];
		smell = [s retain];	
	}
}
	
-(Smell *)smell {
	return smell;
}

You will have undoubtedly noticed the use of release and retain. Even though Objective-C 2.0 comes with garbage collection, we can opt to not use it, giving up convenience in favor of performance and efficiency. Platforms like the iPhone, for example, do not support GC yet at the time of this writing, and so we have to fall back on a manual memory management technique called manual reference counting for these platforms. Release and retain are our means to this. Under a garbage collected environment like MacRuby’s, however, these methods do nothing as the GC will take care of memory management.

The concepts behind manual reference counting are actually very similar to those of garbage collection: an object becomes unreachable if we have no references to it, i.e. the reference count of an object is 0. We say that that object is no longer live and is considered to be garbage. As such, we can get rid of that object. MacRuby utilizes a garbage collector called AutoZone to achieve this, the same one that we can use in Objective-C 2.0.

If we don’t use a garbage collector in Objective-C however, we have to manually determine what objects are live and what objects are considered to be garbage. For Objective-C objects that are allocated on the heap, i.e. objects that would otherwise persist in memory throughout the lifetime of the application if not explicitly deallocated, we employ a technique called manual reference counting: every time we need a reference to an object, we call retain on it to bump up the reference count of the object by one. When we no longer need it, we decrement the reference count by one with a corresponding release call on the object. When the retain count reaches zero, the object is no longer referenced and will be discarded.

In the case of setSmell, we see that we first try to determine whether or not we’re dealing with the same smell objects. If we have two distinct objects here, we first discard the current smell and then retain the new smell. If we were not to release the current smell, the current smell object’s reference count would not be decremented and could possibly never reach 0. This would result in the object not being released from memory, causing a memory leak. Conversely, not retaining the new smell object could cause the new smell to prematurely reach a reference count of 0 and get released too soon from memory.
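To see the bookkeeping in action, here is a toy model of manual reference counting in plain Ruby. It is only a sketch under stated assumptions: retain, release, retain_count and deallocated are hand-rolled here for illustration (real Objective-C objects inherit retain and release from NSObject), and the setter mirrors the synthesized setSmell: shown earlier.

```ruby
# Toy model of manual reference counting, for illustration only.
class Smell
  attr_reader :retain_count, :deallocated

  def initialize
    @retain_count = 1     # the creator owns the object, as after alloc/init
    @deallocated = false
  end

  def retain
    @retain_count += 1
    self
  end

  def release
    @retain_count -= 1
    @deallocated = true if @retain_count.zero?
  end
end

class Camembert
  attr_reader :smell

  # Mirrors the retain-style setter: release the old value, retain the new.
  def smell=(s)
    return if @smell.equal?(s)
    @smell.release if @smell   # Objective-C would silently ignore nil here
    @smell = s && s.retain
  end
end

old_smell = Smell.new
new_smell = Smell.new
cheese = Camembert.new

cheese.smell = old_smell   # count 2: creator + cheese
old_smell.release          # creator gives up ownership: count 1
cheese.smell = new_smell   # cheese releases old (count 0), retains new

puts old_smell.deallocated    # true
puts new_smell.retain_count   # 2
```

Dropping the release line in the setter would leave old_smell stuck at a count of 1, which is the memory leak described above.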

Not everything in memory needs to be maintained in such a tedious manner though. Stack allocated objects, i.e. values allocated on the stack that only persist through the current scope, will get popped automatically once their scope ceases to exist. For example:

-(void)foo {
	int myStackAllocatedVariable = 6;
	// Do some processing...
	
	// Do some more processing...
	
	// myStackAllocatedVariable should be popped.
}

Here, myStackAllocatedVariable will get popped from the stack after it reaches the end of the function.

Besides retain and assign, we also have copy, which should be pretty self explanatory by now. Think Object#dup in Ruby ;-). That was undoubtedly quite a bit to soak up and should hopefully underline the strengths of MacRuby for not putting you up with this in the first place!
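For the curious, a copy-style setter can be sketched in plain Ruby with dup. The label attribute below is made up for this example; the point is that the object stores its own duplicate, so later changes to the caller's value do not leak in.

```ruby
class Camembert
  attr_reader :label

  # Sketch of a "copy" style setter: store a duplicate, not the argument.
  def label=(value)
    @label = value.dup
  end
end

name = String.new("young")   # an explicitly mutable string
cheese = Camembert.new
cheese.label = name

name << " and mild"          # mutate the caller's string afterwards

puts cheese.label            # young
puts name                    # young and mild
```

Keep in mind that dup makes a shallow copy: objects nested inside the duplicate are still shared with the original.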

Finally, after the curly braces, we see the declarations for our methods. In particular, we see:

-(id)initWithAge:(int)age smell:(Smell *)smell;
-(void)someOtherOperation;

Where the second method is just a plain method, we consider the semantics of a method prefixed with init to be a constructor by convention.

After allocating memory for the object via the alloc method, we need to initialize the object to the desired state through a constructor. The process to achieve this in Objective-C is very similar to that of Ruby actually, even though you might not realize that at a first glance.

In Ruby, we allocate and initialize an object through calling its “new” factory class method. The workings of the latter can be described in pseudo code as the following:

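class Object
	def self.new(*args)
		obj = self.alloc        # plain Ruby spells this allocate
		obj.initialize(*args)   # initialize is private in real Ruby
		obj                     # new returns the object, not initialize's result
	end
end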

Here we allocate an object using its alloc class method and initialize it by calling initialize on the object, passing it the arguments we got through the “new” invocation.
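In plain Ruby, the two steps can be tried out for real: the allocation method is spelled allocate, and initialize is private, so it has to be invoked through send. The build method below is a hypothetical hand-rolled equivalent of new, written only to make the two steps visible.

```ruby
class Camembert
  attr_reader :age

  def initialize(age)
    @age = age
  end

  # Hypothetical hand-rolled equivalent of Class#new.
  def self.build(*args)
    obj = allocate                  # memory only; initialize has not run
    obj.send(:initialize, *args)    # initialize is private, hence send
    obj
  end
end

raw = Camembert.allocate
puts raw.age.inspect               # nil

puts Camembert.build(3).age        # 3
puts Camembert.new(3).age          # 3
```

The raw object shows why the second step matters: an allocated but uninitialized object exists, yet none of its state has been set up.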

In Objective-C, we would do the following to allocate the object and invoke its default constructor:

Camembert *camembert = [[Camembert alloc] init];

To utilize our custom constructor, we would do:

Camembert *camembert = [[Camembert alloc] initWithAge:3 smell:someSmell];

Instead of having a “new” class method for every possible initialization, it is common practice in Objective-C to just chain the calls like this instead. One could consider it as a way of overloading constructors in a more distinct and descriptive way.

Now that we’ve covered the declarative part, let’s discuss a possible implementation for our Camembert class:

#import "Camembert.h"

@implementation Camembert

@synthesize age;
@synthesize smell;

-(id)initWithAge:(int)a smell:(Smell *)s {
	if(self = [super init]) {
		[self setAge:a];
		[self setSmell:s];
	}
	return self;
}

-(void)someOtherOperation {
	NSLog(@"I really should have a better implementation.");
}
@end

As we’ve already discussed in the previous section, the @synthesize directives will expand our @property directives to setters and getters.

In the class definition, we’ve also given a sample implementation of the constructor initWithAge:smell:. This should be pretty self explanatory, save perhaps the if(self = [super init]) part which is idiomatic in Objective-C. This is the part where we make sure that the super constructor gets invoked prior to initializing the current subclass as we’d normally expect in object oriented languages.

A constructor in Objective-C is not a special method however that takes care of this for us implicitly like in e.g. Java: from a code point of view, it’s just a regular method with special semantics. It is for this reason that it’s usually a good idea to invoke the super constructor before doing anything, seeing as an alloc will only allocate the memory for the object, but makes no guarantees as to how it will initialize it.
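Ruby is in the same boat: initialize is a regular method, and the superclass version only runs if we ask for it. A minimal sketch, with a made-up origin attribute on Cheese:

```ruby
class Cheese
  attr_reader :origin

  def initialize
    @origin = "France"
  end
end

class Camembert < Cheese
  attr_reader :age

  def initialize(age)
    super()      # explicit parentheses: call Cheese#initialize without arguments
    @age = age
  end
end

# Same subclass, but the super call was forgotten.
class ForgetfulCamembert < Cheese
  def initialize(age)
    @age = age
  end
end

puts Camembert.new(3).origin.inspect            # "France"
puts ForgetfulCamembert.new(3).origin.inspect   # nil
```

A bare super without parentheses would forward age to Cheese#initialize, which takes no arguments and would therefore raise an ArgumentError.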

The following code invokes our synthesized setters, and as you can see, their selectors are prefixed with set (the getters simply carry the name of the property, without a get prefix).

[self setAge:a];
[self setSmell:s];

Finally, we see:

-(void)someOtherOperation {
	NSLog(@"I really should have a better implementation.");
}

You should be able to tell what the signature is about by now, but I wanted to highlight this part in particular as it shows a very useful function in the form of NSLog which allows you to write messages to the console in a convenient way. This is of course very useful for debugging if you don’t want to dive into GDB.

MacRuby<3

I think you’ll agree with me that the above is very similar compared to the MacRuby way, minus the yackshaving of header and implementation files:

class Camembert < Cheese
    attr_accessor :age
    attr_accessor :smell

    def initialize(age, smell)
        # If you're subclassing a Cocoa/Objective-C super class,
        # you may want to do a super invocation here etc... too.
        # Can you guess why?
        @age    = age
        @smell = smell
    end

    def someOtherOperation
        NSLog("Born!")
    end

end

Pointers

When we store data in memory, e.g. an object, we may want to retrieve it at some later point. Luckily for us, memory is addressed, meaning we can access certain parts in memory via their address provided the operating system allows us to. Here, pointers are used to store these addresses of data in memory of a certain type.

To drive the point home with an analogy, we use the example of a book (in particular, the MacRuby book Matt Aimonetti is currently writing for O'Reilly which yours truly under the pseudonym of "Ninh Bui" will review 😉 ).

The pages of the aforementioned book contain the actual data on the topic of MacRuby and are assumed to be numbered. Here, the page numbers can be considered to be the memory addresses of the data. Now suppose a friend of yours would like to know more about Cocoa bindings. You now either have the option to copy parts of the book (copying data) or to hand over the book along with the number of the page that elaborates on the topic. As with computers, copying data is often unnecessary and I think you'll agree that that is definitely the case in this situation. It is for this reason that instinctively, you'd want to pass a page number and book instead of copying parts of the text by hand.

If we were to use a piece of paper to write down a specific page number for a friend, that piece of paper could be considered to be a pointer and in particular, it's a pointer to information on Cocoa bindings. Your friend can now look up the information from your book using the given page number, basically engaging in the process of dereferencing the pointer by looking up the actual data the pointer references. On computers, it's pretty much the same deal: given a pointer, you can look up the actual data in memory that the pointer references.

If this was hard for you to follow, just remember that this is really not that different from a reference in Ruby. Instead of dealing directly with memory addresses, we just get the object that was stored there instead.
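A few lines of Ruby show the same distinction between handing over a reference and copying the data. The variable names are of course made up for this sketch.

```ruby
cheese = { name: "Camembert" }
same_cheese = cheese          # copies the reference, not the Hash
other_cheese = cheese.dup     # copies the data

same_cheese[:age] = 3         # visible through both names

puts cheese.equal?(same_cheese)    # true
puts cheese.equal?(other_cheese)   # false
puts cheese[:age]                  # 3
puts other_cheese.key?(:age)       # false
```

equal? compares identity rather than content, so it answers the question "is this the very same page in the very same book?".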

For MacRuby, it's important to understand that when you see an Objective-C method like for example:

-(void)foo:(NSMutableDictionary *)dict {
    // Do stuff with dict
}

You are expected to pass it an object of type NSMutableDictionary. In MacRuby that would be its Hash counterpart:

foo({:haiz => :baiz})

There is a small exception though in MacRuby that allows you to get pointers to primitives but considering the unlikeliness that you'll need it, I'd kindly like to refer you to the Pointer class for more information on that.

Stay tuned for part 2

That was undoubtedly quite a lot of information to deal with in one blog posting indeed! And we're not even done yet, as we still haven't even discussed Categories and Protocols. These are fundamental for understanding the concept of delegates in Cocoa.

MOAR!

These blog posts are just a small example of the knowledge that is available at Phusion. If you're interested in learning more about topics like e.g. Ruby, Passenger, Scaling, C/C++, Databases and so on in an in-depth manner, feel free to contact us for information on consultancy, training and speaking engagements.