|Freedom of Speech, Privacy, Science|
I have not received any National Security Letter.
Please join the Electronic Frontier Foundation ( EFF.org ) and the fight for your rights on the Internet.
Please join the Union of Concerned Scientists ( uscusa.org ) in bringing science into improving all our lives (everyone is welcome to join).
Public Domain works are a vital part of any culture and there are repeated attempts to erode the Public Domain. For more information see the Center for the Study of the Public Domain at Duke University.
One application I wrote has both C and C++ versions. I wanted search routines for C that would do what the C++ map and set STL datatypes do. A base requirement (for me) is that the routines be open source, not proprietary. A few minutes search turned up hsearch and tsearch. Both are in the Single Unix Specification and both are part of GNU libc. I am sure there are other similar routines on the web.
hsearch is pretty useless because one must provide a fixed size for the table (which cannot be increased). And there is no way to empty the table. I won't mention hsearch again.
tsearch is much better. It grows as necessary automatically and the GNU glibc version has a tdestroy() function to clean out all memory used. It is a binary tree representation so performance is quite reasonable. The cutest part is that nothing in the library has any dependency on the data involved. So the same code (see the example code) works fine as either a map or a set.
An implementation of tsearch (in various closely related versions) which has full create and destroy functionality is available in the libdwarf source code in the tsearch directory. The source implements the standard tsearch interfaces that are described below.
What I found surprising and bizarre was that in some cases the values are void *pointers and in some cases they are pointers-to-pointers, but all the function interfaces simply say void *. I found existing documentation (including the Single Unix Specification) to be unclear on where the values were actually pointers-to-pointers. I found an example on the web which failed to account for this and which therefore simply did not entirely work on Linux.
I do not currently (2011) have access to IRIX, HP-UX, Solaris or any other Unix so I do not know whether this characteristic is solely a Linux issue.
When you see something like the following pointer indirection in the source you are dealing with one of the indirect pointers.
struct my_tentry *m = *(struct my_tentry **)mt_data;
The C example assumes you have a GNU libc in that it calls tdestroy(), which is a GNU only function. If you have a non-GNU libc just comment out the call to tdestroy() and compile with cc (it should work).
Another surprise was the difficulty in using tdelete() so that memory was not leaked. Similarly, using tsearch without leaking was tricky. I discovered and fixed these problems in the tsearch.c usage-example (see the Download tsearch C example link below) on or about September 8, 2011.
Download tsearch C example
An example of a build and execution of the code on Linux Ubuntu 9.10 for X86 follows:
q2 1206: gcc -Wall tsearch.c q2 1207: ./a.out insert ok 0 insert ok 1 insert ok 2 found ok 0 found ok 1 found ok 2 Fail TSRCH on 3 is ok Fail TSRCH on 4 is ok <0>Walk on node preorder 1 data for 1 <1>Walk on node leaf 0 data for 0 <0>Walk on node postorder 1 data for 1 <1>Walk on node leaf 2 data for 2 <0>Walk on node endorder 1 data for 1 tdelete returned parent: 2 data for 2 <0>Walk on node preorder 2 data for 2 <1>Walk on node leaf 0 data for 0 <0>Walk on node postorder 2 data for 2 <0>Walk on node endorder 2 data for 2 PASS tsearch test.
This work is licensed under a Creative Commons Attribution 4.0 International License.