Three modules comprise the object persistence interface.
dbm (anydbm in Python 2.X)
Key-based string-only storage files.
pickle (and cPickle in Python 2.X)
Serializes an in-memory object to/from file streams.
shelve
Key-based persistent object stores: pickles objects to/from dbm files.
The shelve module implements persistent object stores. shelve in turn uses the pickle module to convert (serialize) in-memory Python objects to byte-stream strings, and the dbm module to store those serialized byte-stream strings in access-by-key files.
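The layering described above can be observed directly. The following is a minimal sketch, assuming Python 3.X and a scratch directory made with tempfile (the path and the key name are illustrative only): an object is stored through shelve, and the same file is then read through dbm and pickle by hand.

```python
import dbm
import os
import pickle
import shelve
import tempfile

# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

# Store an object through shelve (pickle + dbm under the hood).
shelf = shelve.open(path)              # flag defaults to 'c'
shelf["point"] = {"x": 1, "y": [2, 3]}
shelf.close()

# Read the same file through dbm: the stored value is a pickled byte string.
db = dbm.open(path, "r")
raw = db["point"]
db.close()

# Unpickling the raw bytes by hand rebuilds the object.
restored = pickle.loads(raw)
```

Here raw is the byte string that shelve handed to dbm, and restored is equal to the dictionary that was stored.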
In Python 2.X, dbm is named anydbm, and the cPickle module is an optimized version of pickle that may be imported directly and is used automatically by shelve, if present. In Python 3.0, cPickle is renamed _pickle and is used automatically by pickle if present; it need not be imported directly, and shelve acquires it through pickle.
Also note that in Python 3.0 the Berkeley DB (a.k.a. bsddb) interface for dbm is no longer shipped with Python itself, but is a third-party open source extension which must be installed separately (see the Python 3.0 Library Reference for resources).
dbm is an access-by-key filesystem: strings are stored and fetched by their string keys. The dbm module selects the keyed-access file implementation in your Python interpreter and presents a dictionary-like API for scripts. A persistent object shelve is used like a simple dbm file, except that the dbm module is replaced by shelve, and the stored value can be almost any kind of Python object (though keys are still strings). In most respects, dbm files and shelves work like dictionaries that must be opened before use, and closed after making changes; all mapping operations and some dictionary methods work.
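As a brief illustration of the string-only dbm side, here is a minimal sketch assuming Python 3.X, where fetched values come back as bytes strings. The file path and key are made up for the example.

```python
import dbm
import os
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), "demo_dbm")

db = dbm.open(path, "c")     # create the file if it does not exist
db["name"] = "Bob"           # keys and values are strings
fetched = db["name"]         # returned as a bytes string in 3.X
db.close()

name = fetched.decode("utf-8")   # back to a text string
```

Storing anything richer than a string calls for shelve, which is opened and indexed the same way.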
import shelve
import dbm
Gets whatever dbm support library is available: dbm.bsd, dbm.gnu, dbm.ndbm, or dbm.dumb.
open(…)
file = shelve.open(filename [, flag='c' [, protocol=None [, writeback=False]]])
file = dbm.open(filename [, flag='r' [, mode]])
Creates a new dbm file or opens an existing one.
flag is the same in shelve and dbm (shelve passes it on to dbm). It can be 'r' to open an existing database for reading only (dbm default); 'w' to open an existing database for reading and writing; 'c' to create the database if it doesn’t exist (shelve default); or 'n', which will always create a new empty database. The dbm.dumb module (used by default in 3.0 if no other library is installed) ignores flag: the database is always opened for update and is created if it doesn’t exist.
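The flag values can be tried out with a short session. This is a sketch only, assuming Python 3.X and a hypothetical scratch path; it exercises 'c', 'r', and 'n' through shelve, which passes the flag on to dbm.

```python
import os
import shelve
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), "flags_demo")

# 'c': create the database if needed, then open it for update.
shelf = shelve.open(path, flag="c")
shelf["a"] = 1
shelf.close()

# 'r': open the existing database for reading only.
shelf = shelve.open(path, flag="r")
value = shelf["a"]
shelf.close()

# 'n': always start over with a new, empty database.
shelf = shelve.open(path, flag="n")
remaining = len(shelf)
shelf.close()
```

After the final open, the entry stored earlier is gone, because 'n' discards any existing contents.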
For dbm, the optional mode argument is the Unix mode of the file, used only when the database has to be created. It defaults to octal 0o666.
For shelve, the protocol argument is passed on from shelve to pickle. It gives the pickling protocol number (described ahead) used to store shelved objects; it defaults to 0 in Python 2.6, and to 2 in Python 3.0. By default, changes to objects fetched from shelves are not automatically written back to disk. If the optional writeback parameter is set to True, all entries accessed are cached in memory and written back at close time. This makes it easier to mutate mutable entries in the shelve, but the cache can consume memory, and the close operation can be slow because all accessed entries are written back.
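The effect of writeback is easiest to see with an in-place change to a mutable entry. The following is a minimal sketch, assuming Python 3.X; the path and key are illustrative.

```python
import os
import shelve
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), "writeback_demo")

# Default (writeback=False): the append changes a temporary copy only.
shelf = shelve.open(path)
shelf["langs"] = ["Python"]
shelf["langs"].append("C")          # not written back to the file
without_writeback = shelf["langs"]
shelf.close()

# writeback=True: fetched entries are cached and saved at close time.
shelf = shelve.open(path, writeback=True)
shelf["langs"].append("C")
shelf.close()

shelf = shelve.open(path)
with_writeback = shelf["langs"]
shelf.close()
```

Without writeback, the usual alternative is to fetch the entry into a variable, change it, and assign it back to the same key.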
file['key'] = value
Store: creates or changes the entry for 'key'. The value is a string for dbm, or an arbitrary object for shelve.
value = file['key']
Fetch: loads the value for the 'key' entry. For shelve, reconstructs the object in memory.
count = len(file)
Size: returns the number of entries stored.
index = file.keys()
Index: fetches the stored keys (can be used in a for loop or other iteration context).
found = 'key' in file (or has_key() in 2.X only)
Query: checks whether there is an entry for 'key'.
del file['key']
Delete: removes the entry for 'key'.
file.close()
Manual close; required to flush updates to disk for some underlying dbm interfaces.
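The operations listed above combine into a short session like the following sketch, which assumes Python 3.X and uses made-up keys and values on a scratch shelve file.

```python
import os
import shelve
import tempfile

# Hypothetical scratch file for the example.
path = os.path.join(tempfile.mkdtemp(), "ops_demo")

file = shelve.open(path)
file["bob"] = {"age": 40}            # store
file["sue"] = {"age": 35}

value = file["bob"]                  # fetch: object is rebuilt in memory
count = len(file)                    # size
index = sorted(file.keys())          # index (sorted here for a stable order)
found = "sue" in file                # query

del file["sue"]                      # delete
after_delete = sorted(file.keys())

file.close()                         # flush updates to disk
```

The same calls apply to a file opened with dbm.open, provided the stored values are strings.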
The pickle interface converts nearly arbitrary in-memory Python objects to/from serialized byte-streams. These byte-streams can be directed to any file-like object that has the expected read/write methods. Unpickling re-creates the original in-memory object (with the same value, but a new identity [address]).
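The "same value, new identity" point can be checked with a round trip through an in-memory stream. This is a small sketch, assuming Python 3.X; io.BytesIO stands in for a real file.

```python
import io
import pickle

original = {"name": "Bob", "jobs": ["dev", "mgr"]}

stream = io.BytesIO()            # file-like object with read/write methods
pickle.dump(original, stream)    # serialize to the stream
stream.seek(0)                   # rewind before reading
copy = pickle.load(stream)       # re-create the object

same_value = (copy == original)
same_object = (copy is original)
```

The copy compares equal to the original but is a distinct object, so later changes to one do not affect the other.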
See the prior note about Python 2.X’s cPickle and Python 3.0’s _pickle optimized modules. Also see the makefile method of socket objects for shipping serialized objects over networks.
P = pickle.Pickler(fileobject [, protocol=None])
Makes a new pickler, for saving to an output file object.
P.dump(object)
Writes an object onto the pickler’s file/stream.
pickle.dump(object, fileobject [, protocol=None])
Combination of the previous two: pickles object onto file.
string = pickle.dumps(object [, protocol=None])
Returns pickled representation of object as a string (a bytes string in Python 3.0).
U = pickle.Unpickler(fileobject, encoding="ASCII", errors="strict")
Makes unpickler, for loading from input file object.
object = U.load()
Reads object from the unpickler’s file/stream.
object = pickle.load(fileobject, encoding="ASCII", errors="strict")
Combination of the previous two: unpickles object from file.
object = pickle.loads(string, encoding="ASCII", errors="strict")
Reads object from a character string (a bytes string in Python 3.0).
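The calls above can be sketched as follows, assuming Python 3.X. An in-memory io.BytesIO object plays the role of the file, and the sample values are arbitrary.

```python
import io
import pickle

stream = io.BytesIO()

# Pickler/Unpickler objects: several objects written to one stream.
P = pickle.Pickler(stream)
P.dump([1, 2, 3])
P.dump({"a": 1})

stream.seek(0)                    # rewind before loading
U = pickle.Unpickler(stream)
first = U.load()                  # objects come back in the order written
second = U.load()

# Convenience functions that work on strings instead of files.
string = pickle.dumps(("x", 2))   # a bytes string in 3.X
obj = pickle.loads(string)
```

pickle.dump and pickle.load are the one-call equivalents of making a Pickler or Unpickler and calling its dump or load method once.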
In Python 3.0, files used to store pickled objects should always be opened in binary mode for all protocols, because the pickler produces bytes strings, and text mode files do not support writing bytes (text mode files encode and decode Unicode text in 3.0).
In Python 2.6, files used to store pickled objects must be opened in binary mode for all pickle protocols >= 1, to suppress line-end translations in binary pickled data. Protocol 0 is ASCII-based, so its files may be opened in either text or binary mode, as long as they are done so consistently.
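For Python 3.X, the binary-mode requirement looks like this in practice. The sketch below uses a hypothetical scratch file and sample record, and also shows that writing a pickle to a text-mode file fails.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file and sample data.
path = os.path.join(tempfile.mkdtemp(), "record.pkl")
record = {"name": "Bob", "pay": 40000}

# Binary mode for both writing and reading.
with open(path, "wb") as out:
    pickle.dump(record, out)
with open(path, "rb") as inp:
    loaded = pickle.load(inp)

# Text mode: the file's write method rejects the pickler's bytes.
text_mode_failed = False
try:
    with open(path, "w") as out:
        pickle.dump(record, out)
except TypeError:
    text_mode_failed = True
```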
fileobject is an open file object, or any object that implements file object attributes called by the interface. Pickler calls the file write method with a string argument. Unpickler calls the file read method with a byte-count and readline without arguments.
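Because only these methods are required, simple classes can stand in for files. The following is a sketch under the assumption of Python 3.X; ChunkCollector and ByteReader are hypothetical names invented for this example, not part of the pickle module.

```python
import io
import pickle

class ChunkCollector:
    """Hypothetical output object: supplies only write."""
    def __init__(self):
        self.chunks = []
    def write(self, data):
        self.chunks.append(bytes(data))   # keep a copy of each chunk
        return len(data)

class ByteReader:
    """Hypothetical input object: supplies only read and readline."""
    def __init__(self, data):
        self._buffer = io.BytesIO(data)
    def read(self, count):
        return self._buffer.read(count)
    def readline(self):
        return self._buffer.readline()

collector = ChunkCollector()
pickle.Pickler(collector).dump({"spam": [1, 2, 3]})

pickled = b"".join(collector.chunks)
result = pickle.Unpickler(ByteReader(pickled)).load()
```

The same idea lets pickled data be routed to sockets, in-memory buffers, or any wrapper that provides these methods.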
protocol is an optional argument that selects a format for pickled data, available in both the Pickler constructor and the module’s dump and dumps convenience functions. This argument takes a value 0...3, where higher protocol numbers are generally more efficient, but may also be incompatible with unpicklers in earlier Python releases. The default protocol number in Python 3.0 is 3, which cannot be unpickled by Python 2.X. The default protocol in Python 2.6 is 0, which is less efficient but most portable. Protocol -1 automatically uses the highest protocol supported. When unpickling, protocol is implied by pickled data contents.
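A short sketch of protocol selection follows, assuming Python 3.X (later releases support protocol numbers beyond 3, so pickle.HIGHEST_PROTOCOL is used rather than a fixed number). The sample data is arbitrary.

```python
import pickle

data = [1, "two", 3.0]

portable = pickle.dumps(data, protocol=0)    # ASCII-based, most portable
highest = pickle.dumps(data, protocol=-1)    # highest protocol supported
explicit = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# No protocol is given when loading: it is implied by the data.
from_portable = pickle.loads(portable)
from_highest = pickle.loads(highest)
```

Protocol 0 output can be decoded as ASCII text, while the higher protocols produce binary data.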
The unpickler’s encoding and errors optional keyword-only arguments are available in Python 3.0 only. They are used to decode 8-bit string instances pickled by Python 2.X. These default to 'ASCII' and 'strict', respectively.
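The following sketch shows these arguments at work in Python 3.X. The byte string below is written by hand as an assumption of what Python 2.X would produce with protocol 0 for the 8-bit string 'caf\xe9'; it was not generated by a 2.X interpreter here.

```python
import pickle

# Assumed Python 2.X protocol-0 pickle of the 8-bit str 'caf\xe9'.
py2_pickle = b"S'caf\\xe9'\np0\n."

# Decode the 8-bit string as Latin-1 text.
as_text = pickle.loads(py2_pickle, encoding="latin-1")

# The special value "bytes" keeps the data as a bytes object.
as_bytes = pickle.loads(py2_pickle, encoding="bytes")

# The defaults ('ASCII', 'strict') reject the non-ASCII byte.
default_failed = False
try:
    pickle.loads(py2_pickle)
except UnicodeDecodeError:
    default_failed = True
```

Which encoding is right depends on how the original 2.X program produced its strings; Latin-1 is used here only because it maps every byte to a character.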
Pickler and Unpickler are exported classes that may be customized by subclassing. See the Python Library Reference for available methods.
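As one possible customization, the sketch below overrides the persistent_id and persistent_load hooks documented in the Python 3.X Library Reference, so that certain objects are pickled as references instead of by value. The Connection class, REGISTRY table, and subclass names are hypothetical, invented for this example.

```python
import io
import pickle

class Connection:
    """Hypothetical external resource that should not be pickled by value."""
    def __init__(self, name):
        self.name = name

# Hypothetical table of live resources, looked up again at load time.
REGISTRY = {"main": Connection("main")}

class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Connection):
            return ("Connection", obj.name)   # store a reference only
        return None                           # pickle everything else normally

class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        kind, name = pid
        if kind == "Connection":
            return REGISTRY[name]             # resolve the reference
        raise pickle.UnpicklingError("unsupported persistent id")

stream = io.BytesIO()
RefPickler(stream).dump({"conn": REGISTRY["main"], "n": 1})
stream.seek(0)
loaded = RefUnpickler(stream).load()
```

After loading, the "conn" entry is the very object held in REGISTRY rather than a copy.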