PicoStorage : C++ Lightweight Structured Storage Library
Introduction
Structured Storage allows you to have a hierarhical organisation,
similar to a file system, inside a single file.
Example applications of structured storage are:
-
A game that is shipped with only one big data file.
Inside this file, the game keeps its audio and graphic files, organized
in directories. Possible advantages of such a
filesystem in a file are:
it prevents the user from tempering with the game's files,
and can provide data encryption, authentication and compresion.
If the game uses a large number of very small files, this can also save
disk space and access time.
-
A word processing application (e.g. Word) can store inside a single file
the multiple elements that compose a document
(such as text, embeded images, embeded sound,
meta information about the document).
The structured storage allows easy access and modification of
these elements, while preserving the
single document == single file idiom.
-
A WWW search engine needs to store a huge number (tens of millions)
of small documents (e.g. 10KB each).
Putting this number of small documents in a traditional filesystem
poses some problems: too many files inside a directory slows down
the access, too much disk space is wasted, etc.
As an alternative solution, Google for example stores
many individual web pages together in big files (hundreds of MB in size)
which are better handled by the filesystem.
Structured storage offers another alternative, offering the familiar
view of a file system while allowing for big number of small documents,
eventually transparently offering compression, too.
-
Various application have different needs for their persistent data storage.
For tabular data, applications typically use an SQL database.
For hierarchical data, application can use either the filesystem
(putting the data in directories and files),
or store the data in the form of an XML document; or they can use
structured storage,
thus preserving the hierarchical nature of data while having it
in a single file, and at the same time get better performance.
What is PicoStorage : C++ Lightweight Structured Storage
PicoStorage is an implementation of the Structured Storage concept.
PicoStorage allows you to organize and access data
that is stored in a single filesystem file hirarchically,
in directories and files.
The distinctive traits of PicoStorage are:
- Lightweight The PicoStorage library is small and efficient. It
uses little memory & CPU resources. The library interface is conceptually
simple and straightforward. The source code is small and clear.
-
C++ PicoStorage uses to its advantage the features
of the C++ programming language.
The interface is Object Oriented: it offers elegance, simplicity
and ease of extension; all this while preserving the performance.
But this also means that you can only use the PicoStorage library from
the C++ language.
But the specification of the underlying storage format is
language independent,
so it is possible for independent developers to produce similar
libraries for other languages.
-
Platform independent
PicoStorage was designed to be used in many OS environments.
The platform dependencies are reduced and isolated in a single place,
making it very easy to adapt the library to a new platform.
The main operating systems that are supported are:
Linux, Windows, FreeBSD. Support is envisaged for MacOS.
-
Open Source
You have access to full source code: you can see how things are done;
you can play with it and try out your ideas as you like.
A main goal is the simplicity and elegance of the design,
of the interface and of the implementation;
the source code is understandable and readable.
It will be our pleasure if anybody
uses the source code in an educative fashion.
Interface Overview
Example 1: here we open a storage from an OS file,
and get a reference to its root directory;
we create a new file in an existing directory,
and write a text to the file.
#include "Storage.hpp"
int main() {
Dir root = Storage::openFile("store.lss", EXISTING_OR_CREATE_INIT);
Dir dir1 = root->openDir("dir1", EXISTING);
File myFile = root->openFile("myFile", CREATE);
const char txt[] = "hello world";
myFile->write(txt, sizeof(txt));
return 0;
}
Note: The main concepts are Dir and File.
Dir represents a directory.
You use it for opening / creating, copying and moving
files or sub-directories. You use a File for
reading and writing the file, and for obtaining/setting the file size
and the file pointer.
As you see in the example, you don't need to use
new or delete. Dir and File are
smart pointers: they behave mostly like ordinary pointers,
but you don't have to deallocate (delete) them.
Example 2: enumerating the content of a directory.
#include "Storage.hpp"
#include <stdio.h>
int main() {
Dir root = Storage::openFile("store.lss", EXISTING_OR_CREATE_INIT);
Dir someDir = root->path("foo/bar/myDir", EXISTING);
for (DirIt it = someDir.begin(); !it->atEnd(); it->next()) {
fprintf("%s\n", it->getName());
}
return 0;
}
Interface
typedef std::vector<std::string> SegmentPath;
class BaseDir;
typedef SmartPtr<BaseDir> Dir;
class BaseDirIt;
typedef SmartPtr<BaseDirIt> DirIt;
class BaseFile;
typedef SmartPtr<BaseFile> File;
class BaseDir : public RefCount {
public:
virtual File openFile(const std::string &name,
OpenMode mode) = 0;
virtual Dir openDir(const std::string &name,
OpenMode mode) = 0;
Dir path(const std::string &pathName, OpenMode mode);
Dir path(const SegmentPath &pathName, OpenMode mode);
void copy(Dir dstDir);
void move(Dir dstDir);
bool copy(const std::string &srcName, Dir dstDir, const std::string &dstName);
bool move(const std::string &srcName, Dir dstDir, const std::string &dstName);
virtual void remove(const std::string &name) = 0;
virtual DirIt begin() = 0;
};
class BaseFile : public RefCount {
public:
static const uint64 kMaxFileSize = 1LL << 48;
virtual uint32 read(void *buf, uint32 nBytes) = 0;
virtual uint32 pread(void *buf, uint32 nBytes, uint64 pos) = 0;
virtual void write(const void *buf, uint32 nBytes) = 0;
virtual void pwrite(const void *buf, uint32 nBytes, uint64 pos) = 0;
virtual uint64 seek(int64 pos, int whence = S_SET) = 0;
virtual uint64 tell() = 0;
virtual uint64 getSize() = 0;
virtual void setSize(uint64 size) = 0;
uint64 copyRange(uint64 nBytes, uint64 srcPos, uint64 dstPos);
uint64 copyRange(File dst, uint64 nBytes,
uint64 srcPos, uint64 dstPos);
void copyTo(File dst);
};
class BaseDirIt : public RefCount {
public:
bool operator==(DirIt it);
virtual const std::string &getName() = 0;
virtual bool isDir() = 0;
virtual void next() = 0;
virtual bool atEnd() = 0;
};
Design and Implementation
- The most important design goal is: simplicity; keep everything
as simple as possible. The elegance of the design, which provides conceptual
simplicity, ease of use and ease of extension, has precedence over performance.
Performance tuning (when it will be done done) will be based on profiling
data.
-
The library is multithread-safe.
-
The library operations incur low overhead:
opening a file or a directory is a fast operation
(compared to the corresponding OS filesystem operations).
An open file or directory has low resource overhead, so it's
ok for an application to keep
hundreds of thousands of files or directories open at once.
(In the case of the OS filesystem, the number of
open files per process is limited to a few hundreds / thousands).
-
PicoStorage handles efficiently very small files
(files which are only tens of bytes in size).
PicoStorage handles efficiently a large number (millions) of directories and files.
PicoStorage supports big files.
The idea is to have a simple and minimalist
but efficient hiearchical storage layer.
More complex functionality
(e.g. Unix-like access rights, MS properties,
different categories of files, etc) can later
be built upon this layer.
The persistent file format
This is the specification of the file format used by PicoStorage.
This is how PicoStorage organizes data in the single filesystem file in
order to support multiple files and directories inside it.
The storage file format is platform-independent and indianess-independent.
This allows to move the storage file between different computers.
The numerical values are stored as big-endian (aka. network byte order).
|
Header
|
8 bytes. Contains a magic (signature) number (4 bytes),
and the file format version number (4 bytes).
|
|
Inodes table size
|
The number of entries in the inodes table. (4 bytes).
|
|
Inodes table
|
A sequence of inode entries. The number of entries is given
by the Inodes table size (previous field).
Each inode entry 12 bytes.
The first inode entry (with index 0) is never used.
See below for details.
|
|
Allocated areas and empty areas (slices).
|
Each allocated slice is referenced by exactly one inode entry.
The empty slices are not referenced.
|
An inode entry identifies a slice. A slice is a contigous sequence
of bytes in the storage file. Slices can't overlap. A slice must start
after the inodes table, and must be completely contained in the storage
file.
An inode entry:
|
Start offset
|
the offset inside the storage file of the first byte of the slice. (8 bytes)
|
|
Len
|
the length in bytes of the slice. (4 bytes). (The slice is a contigous range of bytes)
|
PicoStorage supports big files (>2GB), in the sense that the storage file
(the underlying filesystem file) can be larger than 2GB.
The Start offset of a slice is 8bytes in order to allow
slices to be allocated after the 2GB or 4GB offset.
But the length of a slice uses 4bytes, which means that the length of
a file inside the storage is limited to 4GB.
We use the notation (<offset>, <len>) for an inode entry and to denote a
slice.
The value (0, 0) for an inode means that the inode is not allocated.
It means that the inode is not in use and can be allocated when somebody
requests an inode.
The value (1, 0) for an inode is special, and means an inode that is
allocated, but points to a slice with 0 lenght.
Any slice that has 0 lenght must use this special value. Note that this
value (1, 0) does not conflict with any ordinary inode value, because it
has a start offset of 1, which is before the inodes table; but all
ordinary slices start after the end of the inodes table.
So all normal slices have length of at least 1.
They can't overlap. Two normal slices can not start at the same offset.
We call inode number the index of an inode in the inode table.
The inode number 0 is never used.
The size of the inode table is limited to 2^31, so we can have at most
2^31 - 1 inodes (because 0 isn't used).
File
A file is associated with a slice. A file is identified by a
reference to the storage and the inode number of the slice.
Thus the intrinsec memory overhead of an open file is mainly:
a pointer to the storage class, the inode number (4bytes),
and the file pointer position (8bytes).
Because a file corresponds to a slice, and a slice is contigous,
a file is always contigous. There is no file fragmentation,
because a file is always composed of a unique contigous fragment.
When a file needs to grow (typically, when writing data at the end,
or past the end of the file) its slice needs to grow. If there's no
empty (available) space just after the current end of the slice,
the slice is moved to a new position where it has the space needed
to grow.
This slice move implies a copy in the underlying storage file
of a lenght equal to the current lenght of the file. So, growing a big
file can potentially be slow.
I did this tradeoff: prefer simplicity (a file is contigous) against
potentially slow grow for large files. Here's why: typically,
the files inside the storage are not very large. File which are
of an order of 100MB in size are
more likely to be stored as filesystem files. In storage we more often
find e.g. graphic sprites a few KB in size. So the thradeoff is not a
problem for the typical use.
Also, it's ok to use large files as long as they don't grow often.
Considering for example the case of a big log file, where data is
always appended at the end. A possible solution is to pre-allocate
the space by growing the file in a few big leaps rather than
in many small increments.
Directory
A directory works above a file. It imposes a particular structure
to the file content.
A directory is organized as a sequence of entries, possibly with
empty space in between.
The entries are variable size.
Between successive directory entries there can be empty space.
The empty space bytes are always zeroed.
An entry has this structure:
|
Name lenght
|
(1 byte)
|
|
Name
|
Variable size. Contains name lenght bytes.
The name is 8bit clean (e.g. the null byte is legal in name).
|
|
Inode number and type.
|
(4 bytes). The most significant bit is set to 1 if the referenced
inode corresponds to a directory.
Otherwise (most significant bit is 0) the referenced inode
corresponds to a file.
The rest of the bits contain the inode number.
(hence the limit of 2^31 inodes)
|
The names of the directory entries are limited to 255 bytes.
They are 8bit-clean. They need not be 0-terminated
(in fact, 0 is a legal byte inside the name).
This allows to use any encoding for the entry names
(e.g. it's possible to use UTF8, UTF16, ISO8859-2 --
it's up to the application).
The directory structure, by using variable-lenght entries,
is space-efficient: the maximum name lenght is 255bytes,
but if the names are short, the directory will take up less space.
The root directory has inode number 1.
Download Lightweight Structured Storage Library
PicoStorage is available under the terms of the
GNU General Public License (GPL).
The source code compilation was tested with
-
GCC3 on Linux
-
Mingw - GCC3 on Windows
-
cl (Microsoft (R) 32-bit C/C++ Standard Compiler) 13.00.9466 on Windows.
You can use these tools:
Scons a great Make replacement.
Mingw GCC for Windows.
Copyright (C) 2003 Mihai Preda.
PicoStorage : C++ Lightweight Structured
Storage Library