

FSMLabs Lean POSIX for RTLinux™

Victor Yodaiken1
Finite State Machine Labs Inc.
©Finite State Machine Labs 1999,2000. All rights reserved

Abstract:

RTLinux is a small, fast operating system that follows the POSIX 1003.13 "minimal realtime operating system" standard. The key design objective of RTLinux is to produce a peaceful resolution to an apparently insoluble contradiction: the need for simplicity and ``bare metal" performance in a hard realtime operating system, and the need for all sorts of complex software, such as networking and graphical user interfaces, in modern realtime application code. RTLinux was originally released as a research project in 1995 and provided a simple API that was quite successful, but since 1999 RTLinux has been transforming into an industrial-strength version based on a POSIX threads API. In this article, I'll discuss what's important about ``hard realtime", how RTLinux works, ``why POSIX", and how we reconciled RTLinux with a standard while keeping it small and efficient.

Introduction: RTLinux, realtime and POSIX

RTLinux is a hard realtime operating system. That is, RTLinux is designed to support applications that have real, serious, non-negotiable deadlines. A rocket engine emergency shutdown sequence must complete before the rocket explodes; a servo motor controller must send commands exactly when the motor is in position; a video editing system must capture the next frame and not permit it to be overwritten; a data acquisition system can't miss any samples. All these systems are hard realtime systems, and for these systems worst case timing is critical. In non-realtime applications, average or typical times are more important. If a file system averages 100Mbytes/second in data transfer, we don't care if it stops now and then to shuffle buffers. Similarly, if a word processor is delayed a few milliseconds, nobody will notice; if X-windows runs fast almost all the time, a hitch now and then is no problem. It's even reasonable to allow a consumer video player to drop frames now and then, or to live with an occasional pop or drop-out in a soft CD player. Systems that are realtime, but where missing a deadline now and then is acceptable, are called soft realtime systems. But shutting down the rocket before it explodes only usually, or putting a chip on an automated assembly line into the right socket only most of the time, is not good enough. A delay of a few microseconds can cause a servo-motor to get a degree out of position, and you can't compensate for being too late once or twice by being early a few times.

Twenty years ago, applications needing hard realtime were simple and were usually placed on dedicated, custom, isolated hardware. Modern realtime applications must control systems such as factory floors connected to supply databases, telescopes connected to the Internet, cell phones generating graphical displays, routers, and telephone switches. These applications run on platforms that have non-realtime responsibilities as well as realtime ones, must work on commodity hardware, and need to be networked. In other words, these applications are not simple and do not run on dedicated, custom, or isolated hardware. Furthermore, as speed and quality requirements increase, applications that were not realtime change into hard realtime applications. For example, serious use of networked video and audio requires software that can maintain streams of data with hard realtime guarantees. As another example, variations in network transmit times for loosely coupled stand-alone servers on relatively slow networks have not been a problem, but a server cluster on 10G ethernet running tightly coupled distributed applications is a very different situation, one where realtime networking capabilities become an issue.

The problem here is that to deliver the tight worst case timing needed for hard realtime, operating systems need to be simple, small, predictable - and optimized to minimize worst case latency. But sophisticated services and powerful applications run best on complex operating systems that are optimized to deliver strong average case performance. TCP network stacks, for example, use algorithms that trade latency for throughput. Software that tries to optimize for two contradictory goals will deliver neither. In fact, the Mars Pathfinder lander, running VxWorks, had precisely this problem when a standard non-realtime inter-process communication mechanism caused a critical realtime process to hang up and time out. RTLinux offers an alternative to the traditional choices of ``make a general purpose operating system kind-of-realtime" (IRIX) and ``keep adding nonrealtime features to a realtime operating system" (VxWorks).

Instead of attempting to balance worst case time against average case time, RTLinux decouples the realtime and non-realtime systems. We put realtime components in a single, multithreaded, realtime process running on a bare machine. Applications are threads and signal handlers. This realtime process is simple, fast, predictable, and optimized to minimize worst case latency. A general purpose operating system is run as the lowest priority thread. In RTLinux this thread is Linux. A patented "virtual interrupt controller" prevents the low priority thread from blocking interrupts aimed at realtime signal handlers. Communication between realtime components and the general purpose OS thread is designed to minimize interference -- the realtime components are never forced to wait for operations of the non-realtime thread. At the same time, the communication mechanisms are designed to help programmers move all the non-realtime functionality into the non-realtime environment.

A realtime application may consist of a periodic thread running in the base realtime process and collaborating with control software running under Linux that makes use of, for example, an Oracle database, the KDE desktop environment, and a Perl networking script. Linux and its processes can never block the realtime process's scheduling or event handling, and the realtime process does not need to offer virtual memory, a journaling file system, or a TCP stack, as these are all available within Linux. The classic RTLinux application consists of a periodic thread collecting data from an A/D device and driving a D/A output, with data flowing to a Linux process that logs results and commands coming from a Linux process that presents the operator with buttons and a display. Of course, the challenge with this model is to make sure that hardware efficiency does not come at the expense of programmer efficiency, and this brings us to the API.

API development

The first widely used version of RTLinux was released in 1995. This "V1" system was intended to run on low-end x86-based computers and provided a spartan API and programming environment. Faster machines (and more complex applications), multiprocessing, multiple architectures, and a much larger programmer community exposed limitations of the original API. We realized that repairing and extending our simple API would transform it into a mess. We also realized that with over 100 commercial RTOSes on the market, the world was not exactly crying out for another new RTOS API. POSIX offered several advantages: POSIX is a real standard, not an effort to lock customers into a proprietary API, and POSIX is widely known and well documented. But POSIX seemed inherently slow because it incorporates too many operations that are impossible to provide in our lean programming environment. For example, consider how very far from fast, simple, predictable, and efficient it is to parse a POSIX file name through links, hard links, mounted file systems, and layers of directories. Fortunately, we discovered that the POSIX 1003.13 standard was the solution we needed.

POSIX 1003.13 provides a beautiful2 model for fitting RTLinux tasks and interrupt handlers into the POSIX world. The POSIX standard identifies, in section 6, a "Minimal Realtime System Profile" (PSE51) intended for hard realtime systems like RTLinux. The 1003.13 ``minimal realtime system" application environment consists of a single, multithreaded POSIX process running on a bare machine. In this model, RTLinux tasks became POSIX threads and RTLinux interrupt handlers became signal handlers. Linux runs as the lowest priority thread. In an SMP or other multi-processor system we decided to provide a single realtime process on each processor.

The "rationale" given in the standards document is that "the POSIX.1c Threads model (with all options enabled, but without a file system) best reflected current industry practice in certain embedded realtime areas. Instead of full file system support, basic device I/O (read, write, open, close, control) is considered sufficient for kernels of this size. Systems of this size frequently do not include process isolation hardware or software; therefore, multiple processes (as opposed to threads) may not be supported."

POSIX 1003.13 escapes from the POSIX file system by providing a bare-bones interpretation of file path names. The standard RTLinux open will open /dev/x for a fixed set of devices and will not support any other path names.

That's not to say that the Pthreads API is perfect. For example, the Pthreads API has quite clumsy handling of periodic threads, the entire cancel system is terrible, and signals are sometimes confusing. But the API is usable and can be made better by the addition of some POSIX compatible extensions.
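One such extension addresses periodic threads directly. The sketch below shows the general shape of the RTLinux non-portable calls for this purpose, pthread_make_periodic_np and pthread_wait_np; the header names and exact signatures are assumptions that vary between RTLinux releases, so treat this as an illustration rather than a reference.

#include <rtl_sched.h>   /* scheduler API plus the _np extensions (assumed header) */
#include <rtl_time.h>    /* gethrtime() and hrtime_t nanosecond times (assumed header) */

#define PERIOD_NS 500000 /* 500 microseconds, matching the example below */

void *periodic_code(void *arg)
{
     /* Ask the scheduler to release this thread every PERIOD_NS,
        starting one period from now. */
     pthread_make_periodic_np(pthread_self(),
                              gethrtime() + PERIOD_NS,
                              PERIOD_NS);
     while (1) {
          /* ... do the realtime work for this cycle ... */
          pthread_wait_np();  /* block until the next release point */
     }
     return NULL;
}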

In the next section, I will explain how an RTLinux program can be constructed using the POSIX API. But it's worth noting that there is an alternative high level interface to RTLinux from Linux user code via an added Linux system call. The call, rtlinux_sigaction, designates some function in the user program as a signal handler to be run in realtime mode, either periodically or in response to some interrupt (as a user interrupt handler). The signal handler code runs in the address space of the process, so, with appropriate use of data structures, data can be shared between the non-realtime main process and the realtime handlers. The rtlinux_sigaction interface provides a method for writing realtime code within Linux processes. This method is exceptionally simple to use, but it is less flexible than the RTLinux pthreads API and suffers a performance penalty of a couple of microseconds. For more details, see the section on the Programmable Signal Control (PSC) modules in the RTLinux documentation.

An example

RTLinux makes use of the Linux kernel module facility to load realtime components. Linux boots up in non-realtime mode. To switch over to realtime operation, we insert a collection of kernel modules providing RTLinux functionality. RTLinux is provided with a command, insrtl, that loads the full set of modules, but the modules have been designed so that users can select a minimal subset providing only the services their applications need. The RTLinux modules include rtl (basic services), rtl_timer (to control hardware clocks), rtl_sched (the scheduler) and rtl_posixio (a read/write/open layer for drivers). Once the RTLinux modules are loaded, RTLinux applications can be loaded. The standard Linux GDB debugger can be used to debug RTLinux threads. It's also possible to use the RTLinux system purely from user mode via the realtime signal handler facility described above.

RTL Threads

In order for the realtime system to be able to use Linux, we provide three standard interfaces between realtime tasks and Linux. The first is a device interface called RTfifos (illustrated below), the second is shared memory, and the third uses the pthread signal mechanism to allow realtime threads to generate soft interrupts for Linux. A simple A/D application might consist of a single periodic thread that collects data from an A/D device every 500 microseconds and dumps the data down a fifo for Linux to log or otherwise process.

A thread function returns an untyped pointer (in "C" a void pointer can point to any type of data structure) and takes a single argument that is also an untyped pointer. In our case, we don't need either. The code below declares a POSIX timespec that holds the next wakeup time, and a structure D which will hold data from the A/D device. The thread calls the POSIX clock_gettime to read the current time from the RTL scheduling clock into the timespec structure. Each cycle, the thread collects data from the A/D device, writes the data down a fifo to a Linux process, adds its period in nanoseconds to the timespec structure, and calls the POSIX clock_nanosleep to sleep until that absolute time.

void *my_code(void *arg){ // argument is not used in this example.
     struct timespec t;
     struct mydata D;
     clock_gettime(CLOCK_RTL_SCHED,&t);
     while(!stop){
          copy_device_data(&D.d);
          /* ignore write fails, we just drop the data */
          write(fd,&D,sizeof(D));
          timespec_add_ns(&t,DELAY_NS);
          clock_nanosleep(CLOCK_RTL_SCHED, 
                          TIMER_ABSTIME, 
                          &t, 
                          NULL);
     }
     return (void *)&stop;
}

This code is placed inside an ordinary Linux kernel module which has the format:

INCLUDES 
GLOBAL VARIABLES AND DEFINITIONS
INITIALIZE CODE
CLEANUP CODE
REALTIME THREAD CODE

The init_module code for our example opens (and, via O_CREAT, creates) the fifo used to send data to the user process and then creates the thread. The cleanup_module asks the thread to shut down, waits for it, and then closes the fifo.

pthread_t T;
int fd;
int stop = 0;
#define DELAY_NS 500000 // 500 microseconds

int init_module(void){
     if ((fd=open("/dev/rtf0", 
                   O_WRONLY | 
                   O_NONBLOCK | 
                   O_CREAT )) 
         < 0)
     {
          rtl_printf("Example cannot open fifo\n");
          rtl_printf("Error number is %d\n",errno);
          return -1;
     }
     if( pthread_create(&T,NULL,my_code,NULL))
     {
          close(fd);
          rtl_printf("Cannot create thread\n");
          return -1;
     }
     return 0;
}

void cleanup_module(void){ 
     stop = 1;
     pthread_join(T,NULL);
     close(fd);
}

If you just wanted to log the data produced by this code, the Linux shell command cat < /dev/rtf0 > logfile would do the trick. A two-line awk script could convert the data into a format that gnuplot could display. Alternatively, a simple C program could take the data and do some analysis, or a Perl script could send the data over a network. By adding a second fifo, we could have the user interface control the period or other properties of the sampling program. The RTLinux front end RTiC Lab automates much of this.
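For the ``simple C program" route, the following minimal sketch of a Linux-side consumer would read records from the fifo device and print them; it assumes the struct mydata definition is shared with the realtime module through a common header (a one-field placeholder stands in for it here).

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

struct mydata { int d; };   /* placeholder; in practice share the module's definition */

int main(void)
{
     struct mydata D;
     int fd = open("/dev/rtf0", O_RDONLY);
     if (fd < 0) {
          perror("open /dev/rtf0");
          return 1;
     }
     /* Read one record at a time from the RT-fifo and print it. */
     while (read(fd, &D, sizeof(D)) == sizeof(D))
          printf("%d\n", D.d);
     close(fd);
     return 0;
}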

Full working code can be found at http://www.rtlinux.com/RTLinuxExamples .

Discussion

On a standard x86 PC, RTLinux interrupt latencies are 20 microseconds, worst case. On hardware with better timers and interrupt controllers we do much better. For example, the AMD Elan520, a 133MHz processor, shows worst case interrupt latencies of under 7 microseconds, and on the Motorola M8260 the worst case seems to be no more than 4 microseconds. Of course, the PC architecture was not created with realtime in mind, and badly behaved hardware will cause long delays. For example, some low-end video cards will hold the PCI bus for millisecond intervals if their internal fifos are filled! But, with sensible hardware choices, RTLinux pushes the boundaries of interrupt response down to a level where commodity microprocessors can replace special purpose devices.

One design principle that six years of development have validated is that anything that can go into the general purpose operating system should go into the general purpose operating system. Realtime applications should be ruthlessly separated into the part that must be realtime and the part that does not have hard time constraints. This is a design constraint that conflicts with the programming model in IRIX, VxWorks, and other traditional ``realtime" operating systems, and it requires an adjustment on the part of the programmer. But our decoupled design is at the heart of why RTLinux is reliable and fast. We require programmers to identify which sections of their code are realtime, and our operating system takes advantage of that information. In practice, with tens of thousands of applications, this seems to be a natural way to design modular applications. There are some difficulties porting legacy code, but the user mode RTLinux signal capability reduces them. In fact, ordinary Linux provides better realtime performance than many ``realtime" operating systems, and we have found that turning a non-Linux realtime application into a multithreaded Linux application with RTLinux signal handlers is reasonably simple.

Since RTLinux was introduced, a number of ``realtime" Linuxes have been developed and marketed, and advocates for those approaches often claim to be escaping from the rigors of our decoupled model. Lineo provides a system called RTAI which started as a variant of RTLinux and has since grown considerably. MontaVista is attempting to introduce some techniques from the IRIX operating system into Linux with a ``fully preemptable kernel". There is, however, no reason to suppose that re-creating VxWorks in Linux (Lineo) or re-creating IRIX in Linux (MontaVista) will produce improvements on the originals. Still, the open-source development model means that good work will enter the mainstream of code development. There are several companies and researchers working on making Linux itself lower latency, and this will extend the range of RTLinux applications in which a soft realtime Linux component can be connected to hard realtime RTLinux threads. And RTAI, whose main developer is an independent university researcher, is free to experiment with APIs. The RTLinux user mode realtime signals were developed after a far more ambitious mechanism in RTAI proved to be useful. It is more important for us to retain a small, POSIX-compatible API, so RTLinux is much more conservative about adding new features and much more likely to push new functionality into the existing framework.

As an example of how new functionality can be accommodated within the POSIX framework, consider RTLinux/SMP. POSIX does not substantially address the issue of multiple processors. In RTLinux, each processor of an SMP system runs its own copy of the RTLinux realtime process (sharing minimal data). Allocation of threads and signal handlers to processors is completely determined by the application. RTLinux adds a processor identifier to the POSIX thread attribute, and when a thread is created the application can assign it to a processor. Efficient cross-processor control is a difficult problem, and we were reluctant to introduce hidden synchronization points or operations that would cause significant cache disruption -- e.g. by migrating tasks between processors. By sending a suspend signal to the general purpose OS thread on a particular processor, it is possible to dedicate that processor to RTLinux. When the Linux thread on the target processor receives the signal, it will run an idle loop. As a result, the realtime applications on that processor can generally run within the processor cache, greatly reducing latency. All this is done with no additions to the POSIX threads API.
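As a concrete sketch of that processor identifier, the fragment below creates the example thread from the previous section bound to processor 1 by way of a thread attribute. The attribute call shown, pthread_attr_setcpu_np, is assumed here for illustration; the exact name and signature may differ between RTLinux releases.

#include <rtl_sched.h>   /* pthreads plus the RTLinux _np extensions (assumed header) */

pthread_t T;

int init_module(void)
{
     pthread_attr_t attr;

     pthread_attr_init(&attr);
     /* Non-portable extension: assign the new thread to processor 1.
        (Name and signature assumed; check the headers of your release.) */
     pthread_attr_setcpu_np(&attr, 1);
     return pthread_create(&T, &attr, my_code, NULL);
}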


Copyright: 2001 FSM Labs, Inc.

Footnotes

... Yodaiken1
yodaiken@fsmlabs.com
... beautiful2
A word not usually associated with POSIX
