A C application I wrote needed random access to in-memory data the application created. I wanted search routines for C that would do what the C++ map and set STL datatypes do. A base requirement (for me) is that the routines be open source, not proprietary. A few minutes search turned up hsearch and tsearch. Both are in the Single Unix Specification and both are part of GNU libc. I am sure there are other similar routines on the web.
hsearch is pretty useless because one must provide a fixed size for the table (which cannot be increased). And there is no way to empty the table. I won't mention hsearch again.
tsearch is much better. It grows as necessary automatically and the GNU glibc version has a tdestroy() function to clean out all memory used. It is a binary tree representation so performance is quite reasonable. The cutest part is that nothing in the library has any dependency on the data involved. So the same code (see the example code) works fine as either a map or a set.
The algorithms implemented are binary tree, binary tree with Eppinger delete, balanced binary tree, red black tree, and a hashing (non-tree) version. (See the references in the README and README.md in the repository). The function interfaces implemented include tsearch(), tdelete(), tdestroy(), tfind(), and twalk().
The implementation of tsearch (in various closely related versions) which has full create and destroy functionality is available in the source repository. Clone the entire tree with
git clone https://github.com/davea42/tsearch-code
Releases are available on github at
Here is a fully worked out example in C calling dwarf_tsearch functions.
GNU libc is pretty complete too (if you use build define _GNU_SOURCE) but traditional libc tsearch is not sufficient. The source implements the standard tsearch interfaces that are hinted at below. See the README.md in the distribution for additional detail.
What I found surprising was that in some cases the values are essentially 'void *root' and in the most-used cases they are 'void ** rootp'. It is hard, now, to quite understand how this inconsistency is any benefit. IMO in all cases it should have been void**rootp.
When you see something like the following pointer indirection in the source you are dealing with one of the indirect pointers.
struct my_tentry *m = *(struct my_tentry **)mt_data;
Another surprise was the difficulty in using tdelete() so that memory was not leaked. Similarly, using tsearch without leaking was tricky.
This work is licensed under a Creative Commons Attribution 4.0 International License.