For participants only. Not for public distribution.

Note #14
QNX development hints

John Nagle
Last revised April 4, 2003.

Notes on how to develop under QNX

See also: Note #9, QNX hints.

This note will be added to as time progresses.

Compiling

All code is compiled with QCC, the QNX front end to the "gcc" compiler. QCC knows where all the C++ STL and Neutrino headers are. Note that some of the options are a bit different than for "gcc"; this reflects cross-platform capabilities we're not using. Annoyingly, "-M", which generates makefile dependencies, doesn't work with QCC; you have to use "gcc" for that.

All code must be in ".cpp" files, and compiled with C++, even if it could be compiled as C. Due to a horrible mistake in the GNU compiler development process, "bool" is one byte in C and four bytes in C++ using the gcc 2.95 compiler provided with QNX. This produces structure incompatibilities without error messages. (Later versions of "gcc" fix this, but QNX runs a bit behind on compilers.)

Compile with the options "-Vgcc_ntox86 -Wall -g". "-V" selects the gcc x86 Neutrino target, "-Wall" turns on all warnings, and "-g" generates debug info.

Code repository

We will be using CVS, with a repository somewhere. More on this later.

Programming conventions

In general, use C++ classes. Use the STL.

In general, code indicates serious trouble by throwing an exception. See our errnoexception.h in our qnxlib in the repository. Generally, errors are indicated by throwing an "errno exception" containing a Posix error number. This provides something to catch that can immediately be turned into an error code.
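A minimal sketch of the idea follows. It is not the contents of errnoexception.h, which is not reproduced in this note. The class name ErrnoException, its derivation from std::runtime_error, and the helpers open_or_throw and try_open are all assumptions made for illustration.

```cpp
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for the class in errnoexception.h.
// Carries a POSIX error number so a catcher can turn it straight
// back into an error code.
class ErrnoException : public std::runtime_error {
public:
    explicit ErrnoException(int err)
        : std::runtime_error(std::strerror(err)), m_errno(err) {}

    // The POSIX error number this exception was built from.
    int error() const { return m_errno; }

private:
    int m_errno;
};

// Open a file read-only, throwing on failure instead of returning -1.
int open_or_throw(const char* path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) throw ErrnoException(errno);
    return fd;
}

// Call site: the exception converts immediately into an error code.
// Returns 0 on success, or the errno value from the failed open.
int try_open(const char* path)
{
    try {
        int fd = open_or_throw(path);
        close(fd);
        return 0;
    } catch (const ErrnoException& e) {
        return e.error();
    }
}
```

Calling try_open on a path that does not exist returns ENOENT, and e.what() gives the matching text for the log.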

Avoid explicit deallocation. Use auto_ptr and similar approaches. This avoids memory leaks after exceptions.
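To show why this matters, here is a small sketch. ScopedPtr is a hypothetical, stripped-down owner written out so the mechanism is visible; std::auto_ptr from <memory> gives the same cleanup on scope exit. Scan, process_scan and the live-object counter are made up for the example.

```cpp
#include <stdexcept>

// Counts live Scan objects so the example can show that nothing leaks.
static int g_live_scans = 0;

struct Scan {
    Scan()  { ++g_live_scans; }
    ~Scan() { --g_live_scans; }
};

// Hypothetical minimal owner: deletes its object when it goes out of scope.
template <class T>
class ScopedPtr {
public:
    explicit ScopedPtr(T* p) : m_ptr(p) {}
    ~ScopedPtr() { delete m_ptr; }
    T* get() const { return m_ptr; }
    T& operator*() const { return *m_ptr; }
    T* operator->() const { return m_ptr; }

private:
    ScopedPtr(const ScopedPtr&);             // not copyable
    ScopedPtr& operator=(const ScopedPtr&);  // not assignable
    T* m_ptr;
};

// Allocates a Scan and optionally fails. There is no explicit delete.
void process_scan(bool fail)
{
    ScopedPtr<Scan> scan(new Scan);
    if (fail) throw std::runtime_error("sensor dropout");
    // ... normal processing would go here ...
}

// Returns the number of Scan objects still alive after a failed call.
int live_scans_after_failure()
{
    try {
        process_scan(true);
    } catch (const std::runtime_error&) {
        // Stack unwinding has already run ~ScopedPtr, which deleted the Scan.
    }
    return g_live_scans;
}
```

With a raw pointer and a delete at the end of process_scan, the throw would skip the delete and the Scan would leak.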

Avoid explicit locking and unlocking. Use our mutexlock.h classes for locking and for bounded buffers. These are exception-safe.
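The sketch below shows the scoped-lock pattern in general terms. It is not the mutexlock.h interface, which is not shown in this note; ScopedLock, set_speed and the globals are hypothetical names, and plain POSIX mutexes are assumed underneath.

```cpp
#include <pthread.h>
#include <stdexcept>

// Hypothetical scoped lock: locks in the constructor, unlocks in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t& m) : m_mutex(m)
    {
        pthread_mutex_lock(&m_mutex);
    }
    ~ScopedLock() { pthread_mutex_unlock(&m_mutex); }

private:
    ScopedLock(const ScopedLock&);             // not copyable
    ScopedLock& operator=(const ScopedLock&);  // not assignable
    pthread_mutex_t& m_mutex;
};

static pthread_mutex_t g_state_mutex = PTHREAD_MUTEX_INITIALIZER;
static int g_speed = 0;

// Updates shared state. May throw while holding the lock.
void set_speed(int speed)
{
    ScopedLock lock(g_state_mutex);
    if (speed < 0) throw std::invalid_argument("negative speed");
    g_speed = speed;
}   // the mutex is released here, on normal return or on exception

// Reads shared state under the same lock.
int get_speed()
{
    ScopedLock lock(g_state_mutex);
    return g_speed;
}

// A failed update followed by a good one. The second call would block
// forever if the first had left the mutex locked.
int speed_after_failed_update()
{
    try {
        set_speed(-1);
    } catch (const std::invalid_argument&) {
        // Nothing to unlock by hand; the destructor already did it.
    }
    set_speed(5);
    return get_speed();
}
```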

Printing to standard output is the usual way to generate information to be logged. In the vehicle system, anything printed by a production process will be prefixed by the name of the program that generated it and the time, and logged by the parent process that starts the production programs. Don't overdo it; we have to store two days of that stuff. Also bear in mind that printing can block a task, impacting real-time response. (I've run into this already.) We may need to write a "lossy" print function that discards output rather than blocking.
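One possible shape for such a "lossy" print is sketched below, assuming the output descriptor can be put into non-blocking mode. The names make_nonblocking, lossy_print and messages_before_first_drop are hypothetical, and nothing like this exists in our library yet. Two caveats: O_NONBLOCK is set on the open file description, so it also affects any other process sharing that descriptor, and a message that is only partly accepted leaves a truncated line in the log.

```cpp
#include <cerrno>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Put a descriptor into non-blocking mode. Returns 0 or an errno value.
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0) return errno;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) return errno;
    return 0;
}

// Hypothetical "lossy" print: writes msg if the descriptor can take it
// right now, otherwise drops it. Never waits. Returns true only if the
// whole message was written.
bool lossy_print(int fd, const char* msg)
{
    size_t len = std::strlen(msg);
    ssize_t n = write(fd, msg, len);   // fails with EAGAIN when fd is full
    return n == static_cast<ssize_t>(len);
}

// Demo: fill a pipe that nobody reads and count the messages accepted
// before the first one is dropped. Returns -1 if setup fails.
int messages_before_first_drop()
{
    int fds[2];
    if (pipe(fds) != 0) return -1;
    if (make_nonblocking(fds[1]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    int accepted = 0;
    while (accepted < 1000000 && lossy_print(fds[1], "status: ok\n"))
        ++accepted;
    close(fds[0]);
    close(fds[1]);
    return accepted;
}
```

The demo returns as soon as the pipe is full, where a blocking write would have stalled the task.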

Dynamic allocation is OK, but memory leaks must be avoided. In production, programs will probably be given memory and stack quotas (QNX supports this), so that single programs in trouble abort themselves, rather than taking the whole system down.

Don't use virtual memory. (QNX has virtual memory if configured and enabled for a process, but we won't be using it in the vehicle.)

Little machines and big machines

We will have two types of computers in the vehicle - "little machines" and "big machines". All are IA-32 (x86) machines running QNX.

"Brick" computer - a potential "big machine"

A pure compute element. 2GHz or so each.

These should be rugged self-sufficient modules, preferably with no slots and no connector to a backplane.

Big machines will generally be 1-2GHz machines with substantial memory and hard disk, but will usually lack a keyboard and screen. Big machines do all their communications over Ethernet and FireWire. Big machines will generally not close servo loops requiring response times below 250ms or so. The big machines will all live in a cooled industrial enclosure in the truck cab.

We'll need 2 to 6 of these, depending mainly on how much vision processing we do.

We need some good alternatives in this area. Most ruggedized computers lack serious compute power, or plug into backplanes with many connector pins. We might even be better off just using a stack of laptops.

PC/104 PC in "CanTainer" - a potential "little machine"

Little machines will be much smaller, perhaps only 100MHz, probably ruggedized PC/104 form factor or similar. They will not have hard disks, but will have a reasonable amount of flash memory and RAM. They will all have Ethernet, and may have other interfaces. Little machines will be in sealed boxes near whatever they are tied to electrically.

We'll probably use 5 or 6 of these: two for the vehicle interface, two for the bumper sensors (short-range radars), one for each laser rangefinder, and one for the E-stop system.

If at all possible, all little machines and all big machines will be identical and interchangeable.

Resource managers ("device drivers")

Code which talks directly to hardware should be written as a QNX resource manager. This encapsulates the device with the usual open/close/read/write functions. It also makes it accessible over the network, so the application which deals with the device need not be on the same machine.

If you write a resource manager, make sure that it cannot prevent a program from terminating. Resource managers have the privilege of blocking a calling process against kills. A program hung in a resource manager won't be killable, and the watchdog process will hang waiting for it to die. This will cause the whole machine to reboot. To avoid this, implement io_unblock handlers in the resource manager, so that when the process dies, you cancel the outstanding I/O operation.

Programs that aren't resource managers shouldn't assume where they will run. QNX allows interprocess communication over the net. We don't have to decide until very late where a program will run.

Process priorities

This is a hard real-time system, but that only works if everybody is careful about process priorities. We will have a table of tasks and priority numbers later.

Watchdog system

Each production process has to send a message to its parent, the watchdog process, every so often, or it will be terminated and restarted. More on this later.

The big constraint this imposes is that all programs must be restartable. Assume that any file you write to and read back may be corrupted or totally unreadable. Essential files should be opened read-only. Don't use lock files. QNX has good locking mechanisms that deal properly with termination.

After a QNX reboot without file recovery, files that were open for writing are marked as locked and cannot be opened. (They can be deleted, and there's a way to clear them to the empty state.) This differs from UNIX and Microsoft practice. The idea is that you know it's broken, so you don't work on garbage. Programs that put important data in files must deal with this. We may have a little database with some redundancy to handle this issue, if we need it.

All this will get tested; we'll have a program that randomly kills processes, and the software has to keep working.

Code text conventions

C++ files are ".cpp", header files are ".h".

All source files should be in ASCII. QNX understands Unicode, as does the QNX editor, ped, but not all the GNU tools do.

Use "astyle --style=ansi" to reformat and indent code. I've ported astyle to QNX and will put it on our web site.

Types and classes should have names beginning in upper case, or names of the form "typename_t" in lower case. Data fields of classes should have the form "m_fieldname". Avoid leading and trailing underscores. It's not worth reworking old code to do this, but new code should follow these conventions if possible.

Every function should have a comment at its head, describing generally what it does. A comment on every line is suggested, but not mandatory.
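As an illustration of the naming and commenting conventions above, here is a short made-up class. RangeReading and range_mm_t are hypothetical names, not part of our code.

```cpp
#include <cstddef>

// Plain type named in the lower-case "typename_t" form.
typedef unsigned int range_mm_t;

// Class name begins in upper case; data fields use the "m_" prefix.
class RangeReading {
public:
    // Construct a reading from a beam index and a measured range.
    RangeReading(std::size_t beam, range_mm_t range)
        : m_beam(beam), m_range(range) {}

    // Return the index of the beam that produced this reading.
    std::size_t beam() const { return m_beam; }

    // Return the measured range in millimetres.
    range_mm_t range() const { return m_range; }

private:
    std::size_t m_beam;    // beam index within the scan
    range_mm_t  m_range;   // measured range, millimetres
};
```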

In general, try to express error conditions by returning an error code chosen from "errno.h".

Exceptions

Exception handling is enabled. Running out of memory will throw an exception from "new". So programs should be prepared for exceptions. However, if a restart of the program is good enough, you can just let the watchdog manager handle the problem.
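A program that wants to survive an allocation failure, rather than be restarted, can catch std::bad_alloc. The sketch below is one way to do that; try_allocate is a hypothetical helper, and mapping the failure to ENOMEM is our choice here, not something the compiler does.

```cpp
#include <cerrno>
#include <cstddef>
#include <new>

// Try to allocate a working buffer of the given size. Returns 0 on
// success or ENOMEM if the allocation threw. The buffer is released
// again before returning; real code would hand it to an owner object.
int try_allocate(std::size_t bytes)
{
    try {
        void* p = ::operator new(bytes);   // throws std::bad_alloc on failure
        ::operator delete(p);
        return 0;
    } catch (const std::bad_alloc&) {
        return ENOMEM;
    }
}
```

If there is no sensible way to continue without the memory, leave the exception uncaught and let the program be restarted.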

Exceptions you intend to catch should either be standard exceptions (from <exception>) or error number exceptions (from our "errnoexception.h").

Timestamps

Incoming sensor data should be timestamped, using a QNX timespec structure, which contains time in seconds and nanoseconds. CLOCK_REALTIME should be used for timestamps. Timestamps should reflect the actual time at which sensor data was collected. This is not for logging. Because the vehicle is moving and some of the sensors and processing take time, we will have to adjust for lag when adding data to a map. Processing modules should thus pass the data timestamp through into their own output data.
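A sketch of the pass-through rule, using clock_gettime with CLOCK_REALTIME. The structures RangeSample and Obstacle and the functions around them are hypothetical; only the timespec handling is the point.

```cpp
#include <time.h>

// Hypothetical raw sensor sample, stamped at acquisition time.
struct RangeSample {
    timespec m_stamp;      // CLOCK_REALTIME at the moment of measurement
    double   m_range_m;    // measured range, metres
};

// Hypothetical processed output. Carries the input's timestamp unchanged.
struct Obstacle {
    timespec m_stamp;      // time the underlying data was collected
    double   m_distance_m; // distance to the obstacle, metres
};

// Stamp a reading as close to the moment of acquisition as possible.
RangeSample stamp_sample(double range_m)
{
    RangeSample s;
    clock_gettime(CLOCK_REALTIME, &s.m_stamp);
    s.m_range_m = range_m;
    return s;
}

// Processing passes the data timestamp through. It does not re-stamp
// with the time at which processing happened.
Obstacle to_obstacle(const RangeSample& s)
{
    Obstacle o;
    o.m_stamp = s.m_stamp;
    o.m_distance_m = s.m_range_m;
    return o;
}

// Lag between a data timestamp and now, in seconds. A map builder can
// use this to adjust for vehicle motion since the data was collected.
double age_seconds(const timespec& stamp)
{
    timespec now;
    clock_gettime(CLOCK_REALTIME, &now);
    return double(now.tv_sec - stamp.tv_sec)
         + double(now.tv_nsec - stamp.tv_nsec) * 1e-9;
}
```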

We will try to synchronize CLOCK_REALTIME between all the machines in the vehicle.