Designed and programmed by David C. Steere
The Recoverable Dynamic Storage (RDS) library provides a heap similar to that of the C library, although with persistence, for RVM segments. Additional provisions for convenient initialization and reloading of the heap after shutdown or crash are included.
RDS uses the Quick Fit method ] for allocation. In this method, separate free lists are maintained for a specified number of sizes. Most programs tend to repeatedly allocate a small number of distinct sized blocks. Quick Fit avoids overhead by retaining those block sizes as long as possible in the expectation that they will be re-allocated when freed. Coalescing of adjacent free blocks is deferred until a request cannot be satisfied.
The free list sizes are integer multiples of the
chunk size, which is specified when the recoverable heap is created
and cannot be modified thereafter except by re-initializing the heap.
The chunk size must be an integral multiple of sizeof (char *)
, and
be at least RDS_MIN_CHUNK_SIZE
.
The free lists are maintained in sizes of one chunk to n chunks,
where n is the maximum number of free lists. The
number of lists is also specified at heap creation and cannot be
modified thereafter.
When an allocation request is made, the requested size is rounded up to an integral multiple of the chunk size. The free list for that size is checked first, and if not empty, the first block is returned. Otherwise, the next larger free list is checked, and if not empty, the first block is selected, the requested size split off, and the remainder placed on the appropriate free list. If the free list is empty, the next is searched. If no specific-sized free list can allocate the block, the request is taken from the single list of larger blocks by splitting the first block on the list. The remainder is returned to the appropriate free list. Sizes larger than n times the chunksize are also allocated from the large blocks list on a first-fit basis. This list is always created in addition to the fixed size lists.
When a block is freed, it is simply placed on the appropriate free list with no attempt coalesce it with adjacent free blocks. Coalescing is deferred until an allocation request can not be satisfied.
The allocation and deallocation functions, rds_malloc and rds_free, resemble malloc and free, although they are modified for transactions. To help prevent fragmentation, RDS also provides for pre-allocation of pools of block sizes that the application will repeatedly allocate with the function rds_prealloc.
The function rds_load_heap provides a convenient way to use the RVM segment loader to load the persistent heap and a statically allocated region. rds_load_heap will make the necessary RVM mapping calls and initialize RDS. When it completes, the allocations persistent data is ready for use.
Initialization of an RDS heap is done with the function rds_init_heap. A second function, rds_zap_heap, which calls rds_init_heap and rvm_create_segment, can be used to prepare a segment for loading via rds_load_heap. However, heap initialization is usually done with the utility rdsinit, which will prompt for the necessary parameters to call rds_zap_heap.
To assist in detecting buffer overruns and other memory clobbers, RDS puts special guard constants at the beginning and end of each block. These are not within the space allocated to satisfy an allocation request and will never be addressed by a correctly operating application. The rounding up of allocation request sizes to integral multiples of the chunk size automatically includes space for the guard constants.
If memory damage is detected by the corruption of one of the guards when a block is deallocated, or when damage to a free list is detected during allocation, the error code ECORRUPT is returned. If this error is detected during deallocation, the block is not put on a free list, but is left unmodified for analysis of how the overwrite occurred.
RDS generally returns error indications in a parameter whose address
must be specified in the call. However, a few functions return the
error code as their value. RDS error codes are all defined as
negative integers in rds.h. Because RDS must perform its actions
via transactions, the RVM error codes can also be returned in the error
code parameter or as function values. These are defined in rvm.h
,
and are all positive values. Success is always indicated by the code
SUCCESS, which is defined to be zero.
The header file giving all necessary definitions for RDS is included in Appendix of C declarations.
All functions in RDS use RVM transactions. Some, particularly rds_malloc and rds_free, must interact with the applications transactions, requiring some care from the caller. RDS does not hold its lock until after commit so that greater concurrency can possibly be achieved in Coda. This is not generally recommended because it permits uncommitted state to be used by other than the calling transaction. See the notes on good programming practice with RVM, Section of Transactions in Critical Sections.
Since the lock is not held until commit, some additional complexity must be introduced to insure that allocation and deallocation actions are independent. There are two cases that must be recognized: if the caller guarantees that no context swaps among application threads will occur before transaction commit, and if such swaps must be allowed.
If context swapsare avoided, the application can supply the transaction to be used by RDS in the allocation/deallocation calls and be sure that if the transaction must later be aborted, all will be restored correctly. The transaction must be started in restore mode, which is required for all RDS calls. The transaction will not be committed or aborted by RDS.
If context swaps must be allowed (frequently the case), a null
transaction parameter must be specified. This will force an internal
transaction to be created by RDS.
This transaction will be committed by RDS before the function returns.
In this case, if rds_malloc
is called and
the application transaction requesting the allocation must abort, the
block must be explicitly deallocated by a call to rds_free.
Otherwise, recoverable heap space will be lost.
When the application deallocates blocks in a transaction
that could later abort and context swaps are possible, two special
functions are provided,
rds_fake_free and rds_do_free. These functions manage an
intention list that records the intended deallocations, but defers
the actual deallocation until the application transaction is ready to
commit. If the transaction aborts, the intention list is discarded.
The application must call rds_fake_free at the point where
the block would logically be freed, and then
call rds_do_free
immediately before committing the application transaction.
The application must provide the intention list header, of type
intentionList_t
; RDS will manage the actual list.
Note that these methods are safe only in a coroutine thread environment. If pre-emptive threads, or a multiprocessor, are required, RDS cannot safely be used.
NAME
rds_malloc - allocate from recoverable heap
SYNOPSIS
#include "rds.h" int rds_malloc (size, tid, err); unsigned long size; /* length of allocation request */ rvm_tid_t *tid; /* optional pointer to transaction identifier */ int *err; /* pointer to error return location */
DESCRIPTION
rds_malloc
allocates from the recoverable heap a block large
enough to hold the request, specified in bytes by the size
parameter.
The address of the allocated block is returned as type
int
and must be cast to the desired type.
If the request cannot be satisfied, zero is returned, and the err
parameter is set to the appropriate error code.
Because allocation actions in the recoverable heap must be done
via transactions, rds_malloc
offers two choices for the transaction.
In the first case, rds_malloc
can be instructed to use an existing
transaction begun in restore
mode, by passing the address of a
valid rvm_tid_t
record in the tid
parameter.
This avoids extra transaction start and commit
overhead and provides automatic deallocation if the transaction must
later abort.
However, no context swap can be permitted between allocation and
transaction commit or abort since the
modified, but uncommitted, free lists are visible to other RDS actions.
If context swaps cannot be prohibited, or the available transaction was
started in no_restore
mode, the tid
parameter should be
set to null, instructing rds_malloc
to start an internal
transaction.
This transaction will be committed in no_flush
mode if the
allocation is made, and aborted otherwise.
In this case, an explicit rds_free
must be done if the allocating
transaction later aborts.
DIAGNOSTICS
success
RDS heap exhausted
RDS heap damaged
RDS not initialized
RVM return codes
SEE ALSO
rds_free (3)
AUTHOR
David C. Steere
BUGS
The internal synchronization is not valid with pre-emptive threads.
rds_free - return a block to the recoverable heap
SYNOPSIS
#include "rds.h" int rds_free (addr, tid, err) char *addr; /* address of block to be freed */ rvm_tid_t *tid; /* optional pointer to transaction identifier */ int *err; /* pointer to error return location */
DESCRIPTION
rds_free
deallocates recoverable storage allocated by rds_malloc
.
The block to be deallocated is passed in the addr
parameter. This
block must not already be free, or the EFREED_TWICE
error
code will be returned in err
.
Because deallocation actions in the recoverable heap must be done via
transactions, rds_free
offers two choices for the transaction.
In the first case, rds_free
can be instructed to use an existing
transaction by passing the address of a valid rvm_tid_t
record in
the tid
parameter. This avoids extra transaction start and commit
overhead and provides automatic reallocation if the transaction must
later abort.
If the available transaction was started in no_restore
mode, the
tid
parameter should be set to null, instructing rds_free
to
start an internal transaction.
This transaction will be committed in no_flush
mode if the
allocation is made, and aborted otherwise.
In neither case can context swaps be permitted between deallocation and
transaction commit or abort since the modified, but uncommitted, free lists are visible to other RDS actions.
If this condition cannot be met, or if there is a
possibility that the transaction will abort, the functions
rds_fake_free
and rds_do_free
must be used.
DIAGNOSTICS
success
block is already free
RDS heap damaged
block is not on 4 byte boundary
RDS not initialized
RVM return codes
SEE ALSO
rds_malloc (3)
, rds_fake_free (3)
, rds_do_free (3)
AUTHOR
David C. Steere
BUGS
The internal synchronization is not valid with pre-emptive threads.
rds_fake_free - deferred return of blocks to the recoverable heap
SYNOPSIS
#include "rds.h" typedef struct intlist { unsigned long size; unsigned long count; char **table; } intentionList_t; int rds_fake_free(addr, list); char *addr; /* address of block to be freed */ intentionList_t *list; /* pointer to intention list */ int rds_do_free(addr, list); intentionList_t *list; /* pointer to intention list */ rvm_mode_t mode; /* transaction commit mode */
DESCRIPTION
In cases where the application must free recoverable storage without
knowing whether the transaction will later abort, rds_free
should
not be used since the internal free list lock is not held until commit
or abort.
To handle these cases, RDS allows the creation of an intention
list for the blocks to be freed. At the point where the block would
logically be freed, rds_fake_free
is called to place the block on
the intention list.
To actually free the blocks on the intention list, rds_do_free
is
called. This must be done immediately before calling
rvm_end_transaction
for the deallocating transaction, and must not
be done if
rvm_abort_transaction
is called. Also, no context swaps can be
allowed between the calls on rds_do_free
and rvm_end_transaction
.
The intention list header (type intentionList_t
) must be provided
by the application. The table
field must be null on the first call to rds_fake_free
. RDS will
allocate space as necessary for the blocks to be freed.
rds_do_free
will deallocate the table
vector. The application
is responsible for deallocation of table
if the transaction is
aborted. Deallocation of the intentionList_t
itself is always the
responsibility of the application.
RDS does not internally synchronize access to the intention list.
If more than one thread is involved in the transaction,
synchronization of access to the intention list is the responsibility
of the application.
The block to be deallocated is passed in the addr
parameter to
rds_fake_free
. This
block must not already be free, or the EFREED_TWICE
error
code will be returned.
The ECORRUPT
error return is given by rds_fake_free
if the
block guards are damaged. rds_do_free
returns this code if
internal damage to the free lists is found.
These functions return error codes as the value of the function rather than in a parameter.
DIAGNOSTICS
success
block is already free
block or heap damaged
block is not on 4 byte boundary
RDS not initialized
RVM return codes
SEE ALSO
rds_free (3)
, rds_malloc (3)
AUTHOR
David C. Steere
rds_init_heap - initialize a recoverable heap
SYNOPSIS
#include "rds.h" int rds_init_heap (base, length, chunkSize, nlists, tid, err) char *base; /* base address of heap */ unsigned long length; /* length of heap */ unsigned long chunkSize; /* size of allocation unit */ unsigned long nlists; /* number of allocation lists */ rvm_tid_t *tid; /* pointer to transaction identifier */ int *err; /* pointer to error return location */
DESCRIPTION
rds_init_heap
initializes a recoverable heap in the previously
mapped memory region specified by the address base
and length
parameters.
Allocation requests will be rounded up to an integral number of
allocation units chunkSize
, which is specified in bytes, and must
be an integral multiple of sizeof (char *)
, and be at least
RDS_MIN_CHUNK_SIZE
.
For rapid allocation, RDS maintains separate allocation lists for
blocks of different sizes. The sizes are integrals of chunkSize
,
beginning with 1 and extending to the number of lists specified by
nlists
, which must be at least RDS_MIN_FREE_LISTS
. These
lists are initially empty and the entire heap is
placed on the large block list as a single block. rds_malloc
will
split this block as required.
If a number of blocks of specific sizes are required by the
application, the function rds_prealloc
is provided for efficient
pre-allocation. It should be called immediately after initializing
the heap.
An active transaction must be specified in the tid
parameter; RDS
will not create an internal transaction for this function.
This transaction must be committed by the application for the
initialization to be permanent.
Since this function is called only to initialize the heap, RDS assumes there is no concurrent access and does no internal locking.
DIAGNOSTICS
success
heap length not long enough for heap structures
the chunk size is not large enough or is not a multiple of sizeof(char *)
the number of free lists is not at least RDS_MIN_FREE_LISTS
RVM return codes
SEE ALSO
rds_zap_heap (3)
, rds_prealloc (3)
AUTHOR
David C. Steere
rds_load_heap, rds_start_heap - map recoverable storage
SYNOPSIS
#include "rds.h" int rds_load_heap (DevName, DevLength, staticAddr, err); char *DevName; /* name of heap segment */ rvm_length_t DevLength; /* length of heap raw partition */ char **staticAddr; /* address of static region (out)*/ int *err; /* pointer to error return location */ int rds_start_heap (staticAddr, err); char **staticAddr; /* address of static region (out)*/ int *err; /* pointer to error return location */
DESCRIPTION
rds_load_heap
provides a convenient method of mapping the heap
segment and initializing RDS. The name of the file or partition
containing the recoverable heap is specified by DevName
, and the
length of the partition is specified in DevLength
. If the heap
is in a file, DevLength
should be zero.
rds_load_heap
calls rvm_load_segment
to map the heap. This
requires that the heap file or partition be created with
rds_zap_heap
or rvm_create_segment
, which is most conveniently
done with the rdsinit
utility.
After the heap is mapped, the address of the static region is returned
in staticAddr
. rds_start_heap
is automatically called to
initialize RDS.
If the actual mapping of the heap and static regions is done by the
application, rds_start_heap
should be used to initialize RDS. In this
case, the address of the static region must be specified to RDS in the
staticAddr
parameter.
With either function, the version of RDS that the heap was built with
is compared with the version of the currently linked RDS library.
If there is a mismatch, the error code EHEAP_VERSION_SKEW
is returned.
Since these functions are called only to initialize the application, RDS assumes there is no concurrent access and does no internal locking.
DIAGNOSTICS
success
RDS heap and library have different versions
RVM return codes
SEE ALSO
rvm_load_segment (3)
, rvm_map (3)
, rdsinit (1)
AUTHOR
David C. Steere
rds_zap_heap - initialize an RVM segment as an RDS heap
SYNOPSIS
#include "rds.h" int rds_zap_heap (DevName, DevLength, startAddr, staticLength, heapLength, nlists, chunkSize, err) char *DevName; /* name of heap segment */ rvm_length_t DevLength; /* length of heap raw partition */ char *startAddr;/* base address of heap */ rvm_length_t staticLength; /* length of static region */ rvm_length_t heapLength;/* length of heap region */ unsigned long nlist; /* number of allocation lists */ unsigned long chunkSize; /* size of allocation unit */ int *err; /* pointer to error return location */
DESCRIPTION
rds_zap_heap
prepares a recoverable heap for loading by
rds_load_heap
. Two regions are created in the heap segment named
by DevName
, one for the applications statically allocated region
and one for the recoverable heap.
The lengths of the regions, specified by staticLength
and
heapLength
, must be integrals of the system page size. The heap
region is allocated at *startAddr
in virtual memory, which must be
on a page
boundary. This address is the permanent address of the heap. The
static region starts at *startAddr
+ heapLength
and is also
permanent. One additional region is created at the beginning of the
segment for the segment header, and is one page long. This region is
only temporarily mapped by rds_load_heap
. The heap region follows
the header in the segment, and the static region follows the heap.
If the segment is represented by a raw partition, the maximum usable
length of the partition must be specified in DevLength
. For
files, this length should be zero.
The region of virtual memory for the heap and static regions should
not be allocated by the application. rds_zap_heap
uses the RVM
segment utilities to build the segment header and map the regions.
When rds_zap_heap
is complete, the memory will be mapped.
rds_zap_heap
uses rvm_create_segment
to build the segment, and
then calls rds_init_heap
to initialize the free
lists as specified by the nlists
and chunkSize
parameters.
chunkSize
is specified in bytes, and must
be an integral multiple of sizeof (char *)
, and be at least
RDS_MIN_CHUNK_SIZE
. The number of free lists must be at least
RDS_MIN_FREE_LISTS
.
The transaction required by rds_init_heap
is created and
terminated internally.
If a number of blocks of specific sizes are required by the
application, the function rds_prealloc
is provided for efficient
pre-allocation. It should be called immediately after initializing
the heap.
The utility rdsinit
is usually used to create recoverable heap
segments. It prompts for the parameters and calls rds_zap_heap
.
Since this function is called only to initialize the heap, RDS assumes there is no concurrent access and does no internal locking.
DIAGNOSTICS
success
heap length not long enough for header
the chunk size is not large enough or is not a
muliple of sizeof(char *)
the number of free lists is not at least
RDS_MIN_FREE_LISTS
RVM return codes
SEE ALSO
rds_init_heap (3)
, rds_load_heap (3)
, rvm_create_segment (3)
,
rvm_load_segment (3)
, rdsinit (1)
, rds_prealloc (3)
AUTHOR
David C. Steere
rds_prealloc - initialize allocation pools
SYNOPSIS
#include "rds.h" int rds_prealloc (size, nblocks, tid, err) unsigned long size; /* size of allocation request */ unsigned long nblocks; /* length of allocation request */ rvm_tid_t *tid; /* pointer to transaction identifer */ int *err; /* pointer to error return location */
DESCRIPTION
rds_prealloc
assists applications in minimizing heap fragmentation
by pre-allocating pools of blocks in sizes that the application is
known to use in relatively large numbers.
rds_prealloc
is best called immediately after the heap is
initialized, although it can be called anytime.
When pools of several block sizes are pre-allocated, they should be
allocated in increasing block size order so that the larger blocks are
not split to make the smaller.
The size
parameter specifies the size, in bytes, of the blocks to
be pre-allocated, and nblocks
controls the number of blocks
pre-allocated.
Because allocation actions in the recoverable heap must be done via
transactions, rds_prealloc
offers two choices for the transaction.
In the first case, rds_prealloc
can be instructed to use an existing
transaction begun in restore
mode, by passing the address of a
valid rvm_tid_t
record in the tid
parameter.
This avoids extra transaction start and commit
overhead and provides automatic deallocation if the transaction must
later abort.
However, no context swap can be permitted between allocation and
transaction commit or abort since the uncommitted state is visible to
other allocations.
If context swaps cannot be prohibited, or the available transaction was
started in no_restore
mode, the tid
parameter should be
set to null, instructing rds_prealloc
to start an internal
transaction.
This transaction will be committed in no_flush
mode if the
allocation is made, and aborted otherwise.
In this case, if the allocating
transaction later aborts, the pre-allocated blocks remain allocated.
DIAGNOSTICS
success
RDS heap exhausted
RDS heap damaged
RDS not initialized
RVM return codes
SEE ALSO
rds_malloc(3)
, rds_init_heap(3)
, rds_zap_heap(3)
AUTHOR
David C. Steere
BUGS
The internal synchronization is not valid with pre-emptive threads.
rds_get_stats, rds_clear_stats, rds_print_stats - RDS statistics functions
SYNOPSIS
#include "rds.h" int rds_get_stats (stats); rds_stats_t *stats; /* pointer to RDS statistics record */ int rds_clear_stats (err); int *err; /* pointer to error return location */ int rds_print_stats();
DESCRIPTION
RDS maintains simple statistics about the recoverable heap and the allocations performed. The statistics area is kept in the recoverable heap region and is initialized when the heap is initialized.
The statistics can be read by rds_get_stats
into an rds_stats_t
record for analysis by the application, or printed on stdout
by
rds_print_stats
.
rds_get_stats (stats)
and rds_print_stats ()
return error codes as the
value of the function rather than in a parameter.
After heap initialization, the statistics area can be cleared by
calling rds_clear_stats
.
rds_clear_stats
simply zeroes the statistics area via an internal
transaction. The space
allocated and free counts will not be accurate if it is called after
initializing the RDS heap. Note that rds_clear_stats
requires an
error return code as a parameter, while the others return the error
status as the function value.
RDS performs no internal synchronization for these calls, so if there is a possibility of concurrent access to the statistics, the accesses must be serialized by the application.
DIAGNOSTICS
success
RDS not initialized
null RDS statistics record pointer
AUTHOR
David C. Steere
rdsinit - RDS heap initialization utility
SYNOPSIS
rdsinit log data_seg
rdsinit log data_seg datalen saddr hlen slen nl chunk
rdsinit -f log data_seg datalen saddr hlen slen nl chunk
DESCRIPTION
rdsinit
is a utility that constructs an initialized RDS heap in an
RVM segment. It is intended to create a structure that can be loaded
by rds_load_heap
.
There are three different ways of using rdsinit. General users are expected to use first two interactive modes, where users may supply parameters for the rds heap interactively or on command line arguments. However, in both cases, users will be asked interactively to confirm their choice of parameters before rdsinit goes ahead to make any permanent change. These are the preferred modes of using rdsinit. Script writers, however, may prefer to supply all the parameters on the command line and no confirmation required for those parameters. This is accomodate in the third mode where an additional switch of -f (firm) is supplied on the command line.
In any case, two command-line parameters are always
required: log
and data_seg
. The former is the name of the
RVM log, which must have previously been initialized by
rvmutl
; the latter is the name of the data segment that will
be initialized with an RDS heap. If either is missing, a command line
error is printed. If the log has not been initialized, an RVM error
will result. A short message indicating RVM initialization succeeded
is then printed. Both the log and data segment can be regular files or
raw partitions.
After the name of log and data segment, there are six other numeric parameters required. They are summarized here and will be explained one by one in the following paragraphs:
Length of data segment
Starting address of rvm
Heap region length
Static region length
Number of lists of free block
Chunk size
While entering these six numeric parameters, either on command line on
via the interactive prompt, users may use numeric number in
hexadecimal, decimal or even octal notation. Hexadecimal numbers are
preceeded by Ox
, decimal numbers are preceeded by nothing and
octal numbers are preceded by 0
.
Special note for long time rdsinit user: the old rdsinit automatically
assumed saddr
, hlen
and slen
parameters supplied on
command lines are in hexidecimal and did not require the prefix
0x
. This is no longer true with this version of rdsinit.
Users specify the length of the data segment with the parameter datalen. Again, old version of rdsinit did not require this parameter if the data segment was a regular file and it existed already at the time of running rdsinit. This special case is eliminated: length of data segment must to be supplied in all circumstances.
Starting address of rvm, or saddr, is where heap and static region will be mapped into virtual memory. Heap region is located right at the starting address, while static region is located at starting address plus heap region length. Users may need to have some knowledges of the overall layout of the virtual memory use by the system before they can make the right choice of starting address. For example, the starting address of rvm must be much larger than the largest possible break point of your application, and it should not be in conflict other uses of virtual memory (such as use by shared libraries). It must also be on a page boundary. In CMU, we use 0x20000000 (536870912) with Linux and BSD44, or 0x70000000 (1879048192) with Mach. It is possible to choose other values, but you have to choose them carefully.
Length of regions of heap and static are specified by the parameter hlen and slen respectively. They both must be integral multiple of pagesize of the system. Also, the combined length of the two regions must be smaller than the length of data segment minus one extra page.
Note that the above three parameters: saddr, hlen, slen, are permanent. They can't be changed without re-initizing (and brain-wiping) the data segment.
The next two parameters: nl and chunk are related to
underlying structure of management of the heap. RDS uses the Quick
Fit method for heap allocation. In this method, free blocks are
maintained by a number of free lists, each list for one particular
size of free blocks. Specifically, there will be nl
free lists,
and each of them will have free blocks of size 1..nl
chunk
respectively.
Chunk size must be integral multiple of sizeof(char *)
, and be at
least RDS_MIN_CHUNK_SIZE
. Number of lists must be at least
RDS_MIN_FREE_LISTS
. For example, a reasonable choice is to have
100 free list with chunk size 32 bytes.
Once all the parameters are chosen, rdsinit will ask user for confirmation before it goes ahead and make permanent change on the log and data segment. Note in the following example that those numerical arguments are presented in both hexidecimal and decimal (in bracket). It is safe to quit at this point and no permanent changes will be made.
The following parameters are chosen: length of data segment: 0xf5000 ( 1003520) starting address of rvm: 0x20000000 ( 536870912) heap len: 0xf0000 ( 983040) static len: 0x4000 ( 16384) nlists: 0x64 ( 100) chunk size: 0x20 ( 32) Do you agree with these parameters ? (y|n|q) y
If user supplied the -f (firm) switch on command line, this last confirmation will not show up.
SEE ALSO
rds_init_heap (3)
, rds_load_heap (3)
, rds_zap_heap (3)
,
rvm_create_segment (3)
, rvm_load_segment (3)
, rvmutl (1)
AUTHOR
David C. Steere, created man page
Yui W. Lee, modified (Sept 1997)