NCollection allocators

The previous post has introduced a C++ standard compliant allocator that wraps Open CASCADE allocator interface (Standard::Allocate(), ::Free(), ::Reallocate()) and thus can be used in any containers or other template classes that accept such allocator.

Before we move forward to consider another use case of standard allocator, let's make a step aside and review another allocator flavors offered by Open CASCADE.

Those of you who happened to use templates from NCollection or just attentively read the source code employing it might notice NCollection_BaseAllocator base class. NCollection containers sort of tried to mimic interfaces of STL containers and as such introduced allocator in its inception in early 2000es. However, unlike STL where allocator is part of the type (e.g. std::list is a different type than std::list) NCollection allows to dynamically supply an allocator, for instance:

NCollection_List aVec1, aVec2 (new NCollection_IncAllocator);

There are three types of allocator in NCollection:
-    NCollection_BaseAllocator, which is a base class that also implements a wrapper over Standard::Allocate() and ::Free(). There is a singletone returned by NCollection_BaseAllocator::CommonBaseAllocator().
-    NCollection_HeapAllocator, which wraps standard OS allocator (malloc/free). Similar to a base class there is a singletone returned by NCollection_HeapAllocator::GlobalHeapAllocator().
-    NCollection_IncAllocator, which implements a pool allocator.

The former two are straightforward and probably only NCollection_IncAllocator deserves some details.

Pool allocators (see for instance Wikipedia) are most helpful to manage temporary objects, especially when their destruction can be deferred to some common point in time, e.g. completion of an algorithm that created them. Implementation of a pool allocator is very straightforward – it preallocates a large chunk of memory (e.g. from a system heap or another allocator), and whenever there is a new temporary object to be allocated, just takes a required size from that large chunk's head pointer and shifts that pointer. Deallocation is void, i.e. no memory gets really freed after the object destructor has been called. Real deallocation of large chunk(s) happens upon own pool allocator destruction.

This simple implementation allows to greatly improve performance when dealing with numerous temporary objects due to avoiding fragmentation and/or complicated techniques to manage individual smaller memory chunks.

Here is a usage example which creates a temporary hash table:
class PartnerShapeHasher
    static Standard_Integer HashCode (const TopoDS_Shape& theKey, const Standard_Integer theUpper)
        return ::HashCode (theKey.TShape(), theUpper);
    static Standard_Boolean IsEqual(const TopoDS_Shape& theKey1, const TopoDS_Shape& theKey2)
        return theKey1.IsPartner (theKey2);

//! Returns a number of partner subshapes of a given type
/*! A partner shape is a shared instance regardless of a location (transformation matrix).*/
static size_t NumberOfPartnerSubshapes (const TopoDS_Shape& theShape, const TopAbs_ShapeEnum theType)
    NCollection_Map aMap (1 /*default number of buckets*/, new NCollection_IncAllocator);
    for (TopExp_Explorer anExp (theShape, theType); anExp.More(); anExp.Next()) {
        const TopoDS_Shape& aSubshape = anExp.Current();
        aMap.Add (aSubshape);
    return aMap.Size();

If your temporary objects' life span is not necessarily aligned with the end of the algorithm scope (e.g. some objects are destroyed earlier) then you could either have a separate pool allocator for such objects or just accept a trade-off of temporary larger peak memory footprint.

NCollection_IncAllocator's constructor accepts a parameter theBlockSize which specifies a size of large memory chunk(s) preallocated from the heap. I ended up with having a few predefined sizes via the following enumeration:

enum NCollection_IncAllocator_Size {
    NCollection_IncAllocator_Tiny = 1000,
    NCollection_IncAllocator_Small = 4000,
    NCollection_IncAllocator_Medium = 8000,
    NCollection_IncAllocator_Large = 16000,
    NCollection_IncAllocator_Huge = 32000

and use respective value depending on the most probable case, e.g.:
Handle(NCollection_BaseAllocator) anAlloc = new NCollection_IncAllocator (NCollection_IncAllocator_Tiny);
NCollection_List aList (anAlloc);

This should help avoid extra fragmentatoin in the OS allocator due to reuse of the same size blocks over and over again, yet providing a hint instead of default constructor parameter (around 26K).

If the allocator is requested to allocate an object of a larger size than the constructor's theBlockSize parameter then a new larger chunk is allocated. Whenever a new large chunk is allocated then the remaining unused free bytes in a previous chunk are wasted. So some attention should be made to minimize unnecessary waste, if it can be substantial.

Another point to remember is that NCollection_IncAllocator is not thread-safe (though reentrant), so access to the same instance should be synchronized outside, if it is used from multiple threads. For thread-safe scalable implementations you might want to check Intel Threading Building Blocks memory pools.

Now with the review of NCollection allocator completed, let's get back to the standard interface.

No comments:

Post a Comment