MMU and heap contiguity - Embedded Software

By Colin Walls

A while ago I did a Webinar looking at C++ for embedded applications. It was well attended and well received and there were lots of questions and comments, which is always very satisfying. I observed that a number of people were specifically interested in dynamic memory allocation in C and C++ and the challenges that are presented to embedded and real time programmers. So I developed a further Webinar specifically looking at dynamic memory allocation and fragmentation. Both of these were recorded and available as archives to view on demand.

I was interested in investigating how to avoid dynamic memory [heap] fragmentation …

The issue comes about because the arbitrary allocation and deallocation of chunks of memory can result in something of a mess. The free space is not contiguous – i.e. it is fragmented – and an allocation can fail, even though enough free space is actually available.

In this example, there is 6K of free memory, but a request for a 4K block would fail.
An obvious solution would be to de-fragment the memory, but this is not a possibility for 2 reasons:

The application code uses pointers to the allocated memory; changing the address of allocated chunks would break that code. This is possible in languages like Java and Visual Basic, where there are no direct pointers.
Such “garbage collection” would be intrinsically non-deterministic, which would not be acceptable in a real time system.

The only realistic solution, that I could propose, is to use the block [partition] memory allocation provided by most real time operating systems. A series of partition pools are created, with partition sizes in a geometric series [e.g. 32, 64, 128, 256 bytes]. Allocations are then made from the pool that has partitions [just] large enough. This solves the problem, as a partition pool cannot get fragmented and its usage is most likely to be highly deterministic.

But I had another idea: would creative use of a memory management unit [MMU] be another way to apparently de-fragment the heap? The concept is simple: the MMU manages the free space and maps it so that it always appears to be contiguous. Of course, when memory is allocated, the MMU would also need to maintain the apparent contiguity of the allocated chunk.

To be frank, my experience in the use of MMUs is limited. So my question is: would this actually work? If you have any insight, please comment or email me.

Comments

0 thoughts about “MMU and heap contiguity”

MacOS pre-X had many good ideas. They were doing a lot in little memory. One idea was instead of malloc(), they used createHandle(). Handles were pointers to pointers. You locked them when you were using them, and unlocked them so that memory manager could defragment the data. As the Metronome GC shows, you can get deterministic behavior by defragmenting in response to time rather than memory pressure.

I’m surprised the concept hasn’t really been adopted by the embedded community as far as I can tell. Perhaps you can fix that.

That is good input Paul. Thank you.

I would agree that such an approach would seem usable for embedded. I would be curious to see how it worked in more detail. For example, how was defragmentation done with some memory chunks locked?

I’m not familiar with the implementation details except to say that memory manager was able to defragment memory even in the presence of chunks that are locked, and eventually move those chunks when they are unlocked.

Thus far, I’ve been lucky and fragmentation hasn’t been an issue in my job. The main attraction for me is that when memory regions are explicitly locked/unlocked for use, they can be checked for heap corruption, and similarly software emulators can break on reads/writes to unlocked handles.

Yes, an MMU can, indeed, make arbitrary memory blocks look contiguous. However, it relies on page faults and code to handle these fault “interrupts” in order to do this. So it works much like the memory manager of the Java Virtual Machine, except that the hardware MMU makes things faster and allows the scheme to work with languages (like C and C++) which compile down to the bare silicon. So it’s theoretically feasible, but still with indeterminate timing.

Another killer, for perhaps most embedded systems, is the large granularity. An optimal page size for a 32-bit MMU is about 4kbytes, which is quite large compared with many of the typical memory allocations in an embedded program. ARM has recognised this (and maybe others too) and has used some clever tricks to allow variable page sizes, both bigger and smaller then 4k, but it complicates the design somewhat and the indeterminate timing aspects remain.

I may be an old diehard, but I still maintain that, for most of us, fixed-length-block allocation is the only viable approach.

Actually, I take back the bit about page faults; they’re only for virtual Memory handling really. But memory allocation still requires code to reprogram the MMU to make blocks look contiguous, and the time it takes to run, on each occasion it’s needed, depends heavily upon the degree of fragmentation of the underlying physical memory. Still no free lunch here.

Comments

Leave a Reply Cancel reply