I am currently working on a dialect of Ruby well-suited for programming embedded systems. My goals are to make it fast, flexible and predictable without giving up the convenience that Ruby offers.
This article is primarily targeted at software engineers working on embedded systems and other low-level software, and does not require any prior knowledge of Ruby. Everyone else is welcome as well.
Ruby is a very expressive language which is built on a foundation of several simple rules:
- Everything is an object.
- Objects have internal state (instance variables).
- Objects send messages to each other. An object never directly accesses state of another object.
- Behavior of objects is defined through classes. If an object is an instance of class A, then A defines which messages the object will accept. A class can inherit from one other class and override its behavior.
- Messages can be processed programmatically and forwarded to other objects. This is how messages differ from method calls.
- Classes themselves are objects, too. Everything related to classes can be changed programmatically: new classes created, existing methods altered, new methods defined.
The last clause is especially important. Let me explain this with an example.
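A minimal sketch of the kind of class this refers to, with every accessor written out by hand. The attribute names (name, frequency) and the sample values are assumptions made for illustration only.

```ruby
# Hypothetical example class; the attributes are illustrative only.
class Microcontroller
  def initialize(name, frequency)
    @name      = name
    @frequency = frequency
  end

  # Reader: handles the message `name`.
  def name
    @name
  end

  # Writer: handles the message `name=`.
  def name=(value)
    @name = value
  end

  # Reader: handles the message `frequency`.
  def frequency
    @frequency
  end

  # Writer: handles the message `frequency=`.
  def frequency=(value)
    @frequency = value
  end
end

mcu = Microcontroller.new("example-mcu", 72_000_000)
mcu.frequency = 48_000_000
puts "#{mcu.name} runs at #{mcu.frequency} Hz"
```

Every accessor body has the same shape, which is exactly the repetition that metaprogramming can remove.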
Don’t you think that explicitly writing bodies of accessor methods again and again is a little verbose? It would be really convenient to declare each accessor in a single line and have the exact same accessor methods defined automatically.
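The shorthand could look like the sketch below. It uses the attr_reader and attr_writer methods that standard Ruby already provides; the attribute names are the same illustrative assumptions as before.

```ruby
# Same hypothetical class, with the accessors declared instead of hand-written.
class Microcontroller
  attr_reader "name"
  attr_writer "name"

  attr_reader "frequency"
  attr_writer "frequency"

  def initialize(name, frequency)
    @name      = name
    @frequency = frequency
  end
end

mcu = Microcontroller.new("example-mcu", 72_000_000)
mcu.name = "renamed-mcu"
puts mcu.name
```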
Let’s look closer at the example. What does attr_reader "name" in the context of class Microcontroller mean? It means that a message named attr_reader is sent to the class Microcontroller, which is itself an object, and is an instance of class Class. We can handle that message!
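A minimal sketch of handling such a message ourselves. The names my_attr_reader and my_attr_writer are hypothetical, chosen so that the sketch does not clobber the built-in attr_* methods; define_method, instance_variable_get and instance_variable_set are standard Ruby.

```ruby
# Reopen Class: every class is an instance of Class, so methods defined
# here can be called inside any class body.
class Class
  # Hypothetical stand-in for attr_reader.
  def my_attr_reader(name)
    # Define an instance method called `name` that returns @name.
    define_method(name) do
      instance_variable_get("@#{name}")
    end
  end

  # Hypothetical stand-in for attr_writer.
  def my_attr_writer(name)
    # Define an instance method called `name=` that assigns @name.
    define_method("#{name}=") do |value|
      instance_variable_set("@#{name}", value)
    end
  end
end

class Microcontroller
  my_attr_reader "name"
  my_attr_writer "name"

  def initialize(name)
    @name = name
  end
end

mcu = Microcontroller.new("example-mcu")
mcu.name = "renamed-mcu"
puts mcu.name
```

The accessor methods are generated while the class body is being evaluated, by ordinary Ruby code.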
In Ruby, this is called metaprogramming: creating programs which themselves write other programs. This is incredibly convenient. (And, by the way, the Ruby standard library already covers the attr_* methods and much more.)
This is a solution for the same problem which the C preprocessor and C++ templates solve. Unlike those two, which implement their own languages that do not cooperate nicely with the rest of the code, Ruby metaprogramming consists of just plain Ruby code. You get more expressive power with none of the complexity!
But Ruby is slow
If you know just one thing about Ruby, chances are that it has something to do with performance. Frankly, Ruby was never known for its stellar execution speed; more like the opposite, with execution times 200x higher than those of the corresponding C program.
But why does that happen? Ruby implements dynamic typing, which means that the runtime type of an expression is generally not known until the expression is evaluated. This implies that methods must be late bound: again, the particular method body to be executed generally cannot be known until the moment of calling that method.
Or, more demonstrably:
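A minimal sketch of this effect in plain Ruby. It uses Integer, since current Ruby versions have merged Fixnum into Integer; the helper name with_redefined_plus is hypothetical, and it restores the built-in method afterwards so the rest of the program keeps working.

```ruby
# Temporarily replace Integer#+ for the duration of the block.
def with_redefined_plus
  Integer.class_eval do
    alias_method :original_plus, :+          # keep the built-in addition
    define_method(:+) { |other| "surprise" } # all classes are open
  end
  yield
ensure
  Integer.class_eval do
    alias_method :+, :original_plus          # put the built-in back
    remove_method :original_plus
  end
end

surprising = with_redefined_plus { 1 + 1 }
ordinary   = 1 + 1

puts surprising.class
puts ordinary.class
```

The same source text, 1 + 1, yields a String in one place and an Integer in the other, so its type cannot be fixed by looking at the expression alone.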
As all classes are open, methods can be freely redefined at runtime. You can see that not even the resulting type of a seemingly constant arithmetic expression can be known for sure, much less the types of more complex expressions.
(Yes, this example does not contain code you will ever want to write. Nevertheless, redefining operators even on built-in types can have valid applications; think of overriding Fixnum#/ to return a precise rational value instead of a rounded integer.)
This property poses a huge problem to Ruby implementations, as Ruby code consists almost entirely of method calls, and absolutely nothing is known about method calls in advance. Thus, an implementation either performs method lookup each time the method is invoked, or does sophisticated runtime profiling to determine which method will probably be called at a given call site, and optimizes accordingly.
Runtime method lookup itself isn’t very fast, but what’s worse is that it completely prevents one of the most powerful optimizations, inlining, from happening. Doing runtime profiling and just-in-time optimization yields much better results, but it is inherently unpredictable (a compiler may interrupt your computation at virtually any moment), incredibly complex (thus error-prone) and, finally, it consumes a vast amount of resources, especially RAM. This can be fine on large-scale servers, but doesn’t really work for most embedded devices.
The only solution to this problem is to make method calls eagerly bound, or, in other words, resolve them at compile-time.
One does not simply compile Ruby
Unfortunately, statically figuring out types in a general Ruby program is impossible. For example, all Ruby arrays can hold elements of any type, and in fact it is not possible to restrict them to holding one particular type of element even if you wanted to. This is both a strength and a weakness of Ruby.
I’ve solved this problem by adding static runtime typing to Ruby. Conceptually, this changes very little: Ruby retains its object-oriented semantics completely, metaprogramming is not affected as long as it does not happen at runtime (which is quite a bad idea anyway), and so on. In fact, this change is little more than a convenient syntactic sugar for adding appropriate type conversions; further, type inference ensures that you won’t even need to declare types explicitly in a lot of cases.
What’s interesting is that this change enables me to extend Ruby semantics in completely new ways. Most importantly, I’ve added generics (configurable types), which allowed me to adapt the standard library for more low-level applications.
What features does it have?
Quite a few.
Arbitrary precision integers
unsigned long? That’s too many identifiers to remember. Integer types are instantiated (or reified) as simply as Int32 = Integer.reify(32, :signed) or UInt16 = Integer.reify(16, :unsigned). Yes, if your DSP has only 24-bit and 36-bit registers, arithmetic will be just as optimal as 32-bit arithmetic on common ARMs.
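To illustrate the idea of a method that returns a freshly configured class, here is a rough emulation in standard Ruby. Everything in it is an assumption: the FixedWidth module and its wrap method are hypothetical stand-ins, not the dialect’s Integer.reify, and the wrap-around behaviour is simply one plausible choice for a fixed-width type.

```ruby
# Hypothetical plain-Ruby stand-in for reifying an integer type.
module FixedWidth
  # Returns a new class describing an integer of the given width.
  def self.reify(bits, signedness)
    Class.new do
      define_singleton_method(:bits) { bits }

      # Wrap an arbitrary Ruby integer into this type's range
      # (two's complement for signed types).
      define_singleton_method(:wrap) do |value|
        wrapped = value & ((1 << bits) - 1)
        if signedness == :signed && wrapped >= (1 << (bits - 1))
          wrapped -= (1 << bits)
        end
        wrapped
      end
    end
  end
end

Int24  = FixedWidth.reify(24, :signed)
UInt16 = FixedWidth.reify(16, :unsigned)

puts UInt16.wrap(65_536)
puts Int24.wrap(0x80_0000)
```

The width is an ordinary argument, so a 24-bit type is no more special than a 32-bit one.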
If you’re decompressing an Ogg-encoded audio file and need an array to store raw data, you can instantiate it as simply as @data = Array.reify(UInt16).new. As integers have value semantics (i.e. cannot be modified in-place), the array will store them directly, facilitating efficient memory use.
All containers used at runtime must have a reified type, but (if you already know Ruby) don’t worry: both intermediate arrays and hashes used to pass keyword arguments are handled by the compiler and don’t require any special treatment.
Complete control over memory layout of objects
Do you want a class representing an IP packet header to have the same in-memory structure as the actual header? OK, you can do that. Also, bitfields actually work: they are translated to memory accesses with correct alignment and take just a few lines to define.
Fast method calls
Method calls are either translated directly to a machine call instruction or use a vtable mechanism for the cases where subclasses can be passed where base classes are expected. The Liskov substitution principle still applies.
Constant folding, inlining, strength reduction, loop-invariant code motion, you name it. The machine code is generated by LLVM, which is well known for its efficiency and extensibility, and the compiler itself does a fair amount of analysis as well.
Automatic memory management
This decision may be controversial, but I’ve decided to avoid manual memory management. The focus is on automatic reference counting, with garbage collection as a possible option in the future. Yes, reference loops are bad, but GC delays can be even worse.
This nicely object-oriented code working with Arrays is actually faster than the naïve implementation which uses a plain old indexed loop.
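A sketch of the two styles being compared, written in standard Ruby. The method names sum_each and sum_indexed are assumptions, and the speed claim applies to the compiler described here, not necessarily to existing Ruby interpreters.

```ruby
# Idiomatic version: the block passed to `each` is a closure over `result`.
def sum_each(numbers)
  result = 0
  numbers.each do |number|
    result += number
  end
  result
end

# Naive version: an explicit index, with one length query and one
# bounds-checked element access per iteration.
def sum_indexed(numbers)
  result = 0
  index = 0
  while index < numbers.length
    result += numbers[index]
    index += 1
  end
  result
end

puts sum_each([1, 2, 3, 4])
puts sum_indexed([1, 2, 3, 4])
```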
Why? Because that eliminates bounds checking as well as a repetitive method call, thus retaining safety yet making the code easier to read. The compiler checks whether the closure will live after its containing scope is destroyed, and only allocates it on the heap if that’s actually needed. In this particular case, the closure will actually be inlined into the enclosing scope.
The compiler performs escape analysis and, if a certain object doesn’t leave its enclosing scope, it will be automatically marked as stack-allocated, thus decreasing heap traffic.
Lightweight coroutines and generators
Language-provided coroutines eliminate the need for the simplest RTOSes and allow the use of safe multithreading patterns such as the Actor model. Generators are a useful abstraction tool, and with a fair amount of static analysis there is no need to allocate an entire stack just to pass a trivial sequence generator around.
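For readers unfamiliar with these constructs, here is how a generator and a coroutine look in standard Ruby, using the built-in Enumerator and Fiber classes. The method names fibonacci and blinker are illustrative assumptions; how the dialect itself spells these constructs is not shown here.

```ruby
# A generator: values are produced lazily, one per request.
def fibonacci
  Enumerator.new do |yielder|
    a, b = 0, 1
    loop do
      yielder << a
      a, b = b, a + b
    end
  end
end

# A coroutine: runs until Fiber.yield, then hands control back to the caller.
def blinker
  Fiber.new do
    led_on = false
    loop do
      led_on = !led_on
      Fiber.yield led_on
    end
  end
end

puts fibonacci.take(5).inspect

task = blinker
3.times { puts task.resume }
```

Each call to resume runs the coroutine up to its next yield, which is the cooperative scheduling that a very simple RTOS would otherwise provide.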
You retain the ability to use legacy C code by defining its interface through FFI; if the entire project uses LLVM, inter-language optimizations are perfectly possible.
Direct memory access
A fair number of low-level programming tasks require direct memory access at specific addresses, most notably to work with hardware registers. Nothing prevents a high-level notation from compiling to an efficient read-modify-write cycle.
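As a reminder of what a read-modify-write cycle involves, here is a simulation in standard Ruby. The Register class, its method names, the 32-bit width and the address are all assumptions made for illustration, with a Hash standing in for the memory bus; none of this is the dialect’s actual notation.

```ruby
# Hypothetical memory-mapped register, simulated on top of a Hash.
class Register
  WORD_MASK = 0xFFFF_FFFF # assume 32-bit registers

  def initialize(memory, address)
    @memory  = memory
    @address = address
  end

  def read
    @memory.fetch(@address, 0)
  end

  def write(value)
    @memory[@address] = value & WORD_MASK
  end

  # Read-modify-write: read the word, set the masked bits, write it back.
  def set_bits(mask)
    write(read | mask)
  end

  # Read-modify-write: read the word, clear the masked bits, write it back.
  def clear_bits(mask)
    write(read & ~mask)
  end
end

memory = { 0x4000_0000 => 0b0001 } # arbitrary address and initial value
port   = Register.new(memory, 0x4000_0000)
port.set_bits(0b0100)
port.clear_bits(0b0001)
puts port.read.to_s(2)
```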
Support for interrupts is provided out of the box. No weird assembly hacks required.
Well-defined memory model
The C++11 memory model is used as a reference to provide efficient atomic operations with well-defined ordering characteristics.
Microcontrollers are well known for their unusual features, which commonly don’t map directly to languages specified at a lower level. The Ruby language does not specify anything hardware-related itself, and thus nothing prevents a sophisticated compiler from taking advantage of features like bit-band regions on Cortex-M3 chips.
Completely written in Ruby
The whole compiler is written in Ruby, and all of the standard library is written in this very dialect of Ruby.
Board support packages
A compiler, as good as it may be, is useless without a matching BSP for your microcontroller. That’s what you get, too.
What does it look like?
Pretty much like plain Ruby, but method definitions are sometimes annotated with types:
Note how the type of the variable result isn’t specified. It is inferred automatically at the point of first assignment and enforced later.
The return type is specified explicitly, but it could be omitted here, as it can be inferred from the type of the variable result.
In fact, this whole function can be written without any explicitly specified types at all! The compiler is clever enough to infer the type of its sole argument from the call sites where the function is referenced, and it will create a version of the function for each distinct set of argument types. This way, duck typing is still possible in a completely statically typed language.
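Without annotations, such a function is just plain Ruby. The sketch below assumes the function in question is a simple sum over an array, as the surrounding text suggests; the call sites are made up. Under the scheme described above, each of the two call sites would presumably get its own specialized version.

```ruby
# No type annotations: the argument type is determined by the call sites.
def sum(numbers)
  result = 0
  numbers.each do |number|
    result += number
  end
  result
end

puts sum([1, 2, 3])   # called with an array of integers
puts sum([0.5, 1.5])  # called with an array of floats
```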
How does it work?
This section requires Ruby knowledge.
With a set of builtin functions and some amount of syntactic expansion.
In a method definition, a type specification of Array[Fixnum] is equivalent to Array.reify(Fixnum), which is just a class method which returns another class. On the other hand, when the compiler needs to ensure that a value is of a certain type, it inserts a call to coerce: the method sum actually returns the value of Fixnum.coerce(result) (which is most certainly a no-op).
The standard library or user-defined classes can then use these two methods to implement reification or conversion semantics.
This is how the Array class could implement genericity:
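A rough emulation of the idea in standard Ruby, under stated assumptions: TypedArray is a hypothetical class used instead of reopening the built-in Array, and a simple is_a? check stands in for the coerce call described above.

```ruby
# Hypothetical generic container; not the dialect's real Array.
class TypedArray
  REIFIED = {} # cache: element type => specialized class

  class << self
    attr_reader :element_type

    # Returns (and memoizes) a subclass specialized for `type`.
    def reify(type)
      REIFIED[type] ||= Class.new(TypedArray) do
        @element_type = type
      end
    end

    # Lets TypedArray[Integer] act as shorthand for TypedArray.reify(Integer).
    alias_method :[], :reify
  end

  def initialize
    @items = []
  end

  # Appends a value after checking it against the element type.
  def <<(value)
    type = self.class.element_type
    unless value.is_a?(type)
      raise TypeError, "expected #{type}, got #{value.class}"
    end
    @items << value
    self
  end

  def to_a
    @items.dup
  end
end

ints = TypedArray[Integer].new
ints << 1 << 2
puts ints.to_a.inspect
```

Because reify is an ordinary class method, the type parameter is handled by the same object model as everything else.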
Overall, the compilation process can be described as using Ruby as a domain-specific language to define semantics of a Ruby dialect, which itself is translated to efficient machine code.
Can I try it?
No. The project is not ready for public beta yet. Expect it in a few months.
Is it free? Is it open-source?
This is a commercial project. I believe that I would not be able to make this product as good as it should be if it were non-commercial; on the other hand, I consider open-source a vastly superior development model. Stay tuned.
How does it compare to language X?
I don’t consider myself knowledgeable enough on the very diverse topic of programming languages to write such comparisons; furthermore, I do not aim at “replacing” any particular programming language. My goal is to make embedded development easier and more efficient than before. If you think my project would help you achieve that goal, you’re welcome!
Why are you posting this now?
Because, as I’ve stated before, programming languages are a very wide topic, and so is embedded development. It is better to figure out where I am wrong earlier than to trip over it later. Feedback is very welcome, both positive and negative.
Thank you for reading.