TextDB

class dbetto.textdb.TextDB(path, lazy=False, hidden=False)

A simple text file database.

The database is represented on disk by a collection of text files arbitrarily scattered in a filesystem. Subdirectories are also TextDB objects. In memory, the database is represented as an AttrsDict.

Currently supported file formats are JSON and YAML.

Tip

For large databases, a basic “lazy” mode is available. In this mode, no global scan of the filesystem is performed at initialization time. Instead, a file is read the first time it is queried and then cached in the internal store for faster access. This option is intended for advanced use (see the warning below).

Warning

In lazy mode, a manual call to scan() is needed before most class methods (e.g. iterating over the database files) can be used properly.
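The lazy behaviour described above can be pictured with a small, self-contained sketch. The LazyJsonStore class below is hypothetical and is not dbetto's implementation; it only assumes that nothing is read at construction, that a queried file is cached, and that scan() fills the cache with everything found on disk.

```python
import json
import tempfile
from pathlib import Path


class LazyJsonStore:
    """Hypothetical illustration of lazy loading (not dbetto's code)."""

    def __init__(self, path):
        self.path = Path(path)
        self.cache = {}  # nothing is read from disk at construction time

    def __getitem__(self, name):
        # read the file on first access, then serve it from the cache
        if name not in self.cache:
            with (self.path / f"{name}.json").open() as handle:
                self.cache[name] = json.load(handle)
        return self.cache[name]

    def scan(self):
        # populate the cache with every JSON file in the directory
        for file in sorted(self.path.glob("*.json")):
            self[file.stem]

    def __iter__(self):
        # iteration only sees what has been cached so far
        return iter(self.cache)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("file1", "file2"):
        (Path(tmp) / f"{name}.json").write_text(json.dumps({"name": name}))

    store = LazyJsonStore(tmp)
    before_scan = list(store)   # no file has been loaded yet
    first = store["file1"]      # loads and caches file1 only
    after_query = list(store)
    store.scan()                # now every file is known
    after_scan = sorted(store)
```

In this sketch, iterating before scan() silently misses files that were never queried, which is the kind of surprise the warning is about.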

Examples

>>> from dbetto import TextDB
>>> jdb = TextDB("path/to/dir")
>>> jdb["file1.json"]  # is a dict
>>> jdb["file1.yaml"]  # is a dict
>>> jdb["file1"]  # also works
>>> jdb["dir1"]  # TextDB instance
>>> jdb["dir1"]["file1"]  # nested file
>>> jdb["dir1/file1"]  # also works
>>> jdb.dir1.file1  # keys can be accessed as attributes
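Parameters:

  • path – path to the directory containing the database files.

  • lazy – if True, skip the filesystem scan at initialization time (see the tip above).

  • hidden

group(label)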

Group dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise groupings cannot be created.

Parameters:

label (str)

Return type:

AttrsDict
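As a rough picture of what grouping by a label means, here is a plain-dict sketch. The helper group_by_label and the sample data are hypothetical, and the assumption that entries sharing a label value are collected under that value (keyed by their original names) is an interpretation of the description above, not a statement about the library's internals.

```python
from collections import defaultdict


def group_by_label(db, label):
    """Collect entries that share the same value of `label` (illustrative only)."""
    groups = defaultdict(dict)
    for key, entry in db.items():
        # entries without the label are skipped in this sketch
        if isinstance(entry, dict) and label in entry:
            groups[entry[label]][key] = entry
    return dict(groups)


# hypothetical database content
entries = {
    "a": {"type": "A", "data": 1},
    "b": {"type": "A", "data": 2},
    "c": {"type": "B", "data": 3},
}

by_type = group_by_label(entries, "type")
```

Here by_type has one key per distinct value of "type", and each value is a dictionary of the original entries carrying that value.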

map(label, unique=True)

Remap dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise mappings cannot be created.

Parameters:

  • label (str)

  • unique (bool)

Return type:

AttrsDict
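The idea of remapping can be sketched with plain dictionaries. The helper remap_by_label and the sample data are hypothetical; the sketch assumes that with unique=True each label value must occur only once, and it does not attempt to reproduce what the library does when unique=False.

```python
def remap_by_label(db, label):
    """Re-key entries by the value of `label` (illustrative, unique labels only)."""
    remapped = {}
    for entry in db.values():
        value = entry[label]
        if value in remapped:
            # assumption: duplicate label values are rejected when uniqueness is required
            raise ValueError(f"label {label!r} is not unique: {value!r}")
        remapped[value] = entry
    return remapped


# hypothetical database content
entries = {
    "a": {"id": 1, "data": "x"},
    "b": {"id": 2, "data": "y"},
}

by_id = remap_by_label(entries, "id")
```

After remapping, entries are looked up by their "id" value instead of their original name, for example by_id[2] instead of entries["b"].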

on(timestamp, pattern=None, system='all')

Query the database by time and, optionally, by file pattern and system.

Exactly one validity file (YAML, JSON, JSONL, or another supported file type) must exist in the directory to specify the validity mapping. This functionality relies on the catalog.Catalog class.

The YAML specification is documented at this link.

The special $_ string is expanded to the directory containing the text files.

Parameters:
  • timestamp (str | datetime) – a datetime object or a string matching the pattern YYYYmmddTHHMMSSZ.

  • pattern (str | None) – query by filename pattern.

  • system (str) – query only a data-taking “system” (e.g. ‘all’, ‘phy’, ‘cal’, ‘lar’, …).

Return type:

AttrsDict | list
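The string form of the timestamp argument, YYYYmmddTHHMMSSZ, corresponds to the strptime format "%Y%m%dT%H%M%SZ". The sketch below uses only the standard library; the helper names are hypothetical, and treating the trailing Z as UTC is an assumption.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y%m%dT%H%M%SZ"  # YYYYmmddTHHMMSSZ


def parse_timestamp(text):
    """Parse a YYYYmmddTHHMMSSZ string (assumed to be UTC)."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


def format_timestamp(moment):
    """Format a timezone-aware datetime as YYYYmmddTHHMMSSZ."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


parsed = parse_timestamp("20230101T000000Z")
round_trip = format_timestamp(parsed)
```

A string in another layout, such as "2023-01-01", makes strptime raise ValueError, so checking a timestamp this way before passing it on gives an early, readable failure.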

reset(rescan=True)

Reset this database instance.

Reinstantiates the internal AttrsDict store and, if the database is non-lazy, re-scans it. Useful if the database state changes at runtime.

Parameters:

rescan (bool)

Return type:

None

scan(recursive=True, subdir='.')

Populate the database by walking the filesystem.

Parameters:
  • recursive (bool) – if True, recurse subdirectories.

  • subdir (str) – restrict scan to path relative to the database location.

Return type:

None
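To make the effect of the recursive and subdir parameters concrete, here is a standard-library sketch of a filesystem walk. The function find_db_files is hypothetical and is not how scan() is implemented; the set of suffixes is an assumption based on the supported formats (JSON and YAML), with ".yml" added as a guess.

```python
import tempfile
from pathlib import Path

SUFFIXES = {".json", ".yaml", ".yml"}  # assumption: ".yml" is a guess


def find_db_files(root, recursive=True, subdir="."):
    """List candidate database files under root/subdir, relative to root."""
    base = Path(root) / subdir
    pattern = "**/*" if recursive else "*"
    return sorted(
        path.relative_to(root).as_posix()
        for path in base.glob(pattern)
        if path.is_file() and path.suffix in SUFFIXES
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dir1").mkdir()
    (root / "file1.json").write_text("{}")
    (root / "dir1" / "file2.yaml").write_text("key: value\n")
    (root / "notes.txt").write_text("not a database file")

    all_files = find_db_files(root)                   # whole tree
    top_only = find_db_files(root, recursive=False)   # top level only
    dir1_only = find_db_files(root, subdir="dir1")    # restricted to dir1
```

With recursive=False only the top-level files are listed, and with subdir="dir1" the walk starts inside that subdirectory while paths stay relative to the database root.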