PicoStorage

Lightweight Structured Storage

Why and when should you use PicoStorage instead of alternatives such as a classical filesystem?

The distinctive feature of PicoStorage is its efficient handling of large numbers of small files. For example, creating one million (1M) empty files in a single directory takes less than 10s, and the resulting storage file is less than 7MB in size. Doing something similar on a traditional filesystem is either impractical (on FAT or Ext2, because directories become too slow at 1M entries) or takes up much more disk space. An open PicoStorage file takes up only 8 bytes of RAM (on 32-bit architectures), so you can keep a large number of files open at the same time without problems. For example, keeping 1M open files in an in-memory vector is fine. Doing this on a traditional filesystem would cause serious trouble, because the number of open files per process is limited (sometimes to as few as 256) and because each open file uses far more RAM.
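To put these numbers in perspective, here is a small back-of-the-envelope sketch in Python. It is not part of PicoStorage: the helper names are made up for illustration, the 8-byte handle size and the 1M file count are simply taken from the paragraph above, and the limit query relies on the standard `resource` module, which is only available on Unix-like systems.

```python
import resource  # standard library, Unix-only

HANDLE_BYTES = 8        # RAM per open PicoStorage file on 32-bit (figure from the text)
FILE_COUNT = 1_000_000  # the 1M-file example from the text


def handles_ram_bytes(count, handle_bytes=HANDLE_BYTES):
    """RAM needed to keep `count` files open at `handle_bytes` each."""
    return count * handle_bytes


def os_open_file_limit():
    """Soft per-process limit on open file descriptors for this process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    print("RAM for", FILE_COUNT, "open handles:", handles_ram_bytes(FILE_COUNT), "bytes")
    print("OS soft limit on open files:", os_open_file_limit())
```

With the figures quoted above, 1M open handles come to 8,000,000 bytes (about 8MB) of RAM. The second value depends on the machine and its configuration, and it is the limit that an approach based on ordinary OS file handles would run into first.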

PicoStorage is lightweight. Because of its simplicity, the library is very easy to use. It is open source, and the source code is relatively small (~5k LOC) and easy to read and understand, in case somebody is curious enough to take a look.

PicoStorage is an embedded library, which means it is linked into the process that uses it. Only a single process can have write access to a given storage at any given moment (i.e., a storage can't be accessed simultaneously by multiple processes). This is in contrast to the client/server model, where multiple clients can access a server simultaneously. This distinction (embedded vs. client/server) is common in the database field (e.g., SQLite and BerkeleyDB are embedded, while MySQL and PostgreSQL are client/server).

The library supports a limited form of transactions. At any moment exactly one transaction is active (i.e., there can't be multiple simultaneous transactions). Both metadata and file data are covered by transactions. The use of transactions ensures that there will be no data corruption in the case of a power failure or an application or OS crash. After such an event, the storage automatically returns to the last committed state. I'd say PicoStorage's transactions provide the Atomicity, Consistency and Durability of ACID (Isolation is meaningless because there's just one transaction).
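The "return to the last committed state" behaviour can be pictured with a toy in-memory model in Python. This is only a sketch of the semantics described above, under the assumption of a single implicit transaction. The `ToyStorage` class and all of its methods are hypothetical: they are not PicoStorage's API and say nothing about how the library implements transactions on disk.

```python
class ToyStorage:
    """Hypothetical in-memory model of single-transaction semantics.

    NOT PicoStorage's real API; it only mimics the observable behaviour.
    """

    def __init__(self):
        self._committed = {}  # state as of the last commit
        self._working = {}    # state inside the single active transaction

    def write(self, name, data):
        """Create or overwrite a file inside the active transaction."""
        self._working[name] = data

    def read(self, name):
        """Read a file as seen by the active transaction."""
        return self._working[name]

    def names(self):
        """Sorted list of file names visible in the active transaction."""
        return sorted(self._working)

    def commit(self):
        """Make all changes since the previous commit permanent."""
        self._committed = dict(self._working)

    def crash_and_reopen(self):
        """Simulate a crash: uncommitted changes are lost on reopen."""
        self._working = dict(self._committed)


if __name__ == "__main__":
    s = ToyStorage()
    s.write("a.txt", b"first")
    s.commit()
    s.write("a.txt", b"second")  # uncommitted overwrite
    s.write("b.txt", b"new")     # uncommitted new file
    s.crash_and_reopen()
    print(s.names(), s.read("a.txt"))
```

After the simulated crash, only the data written before `commit()` is visible again. Both the uncommitted overwrite and the uncommitted new file are gone, which is the all-or-nothing behaviour the paragraph above describes.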