dbetto package

class dbetto.AttrsDict(value=None, readonly=False)

Bases: dict

Access dictionary items as attributes.

Examples

>>> d = AttrsDict({"key1": {"key2": 1}})
>>> d.key1.key2
1
>>> d1 = AttrsDict()
>>> d1["a"] = 1
>>> d1.a
1
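The attribute-style access shown above can be approximated with a small dict subclass. This is only a minimal sketch under stated assumptions: the class name AttrView is hypothetical, and it does not reproduce the readonly option or any caching of the real AttrsDict.

```python
class AttrView(dict):
    """Hypothetical stand-in for illustration, not the dbetto implementation."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails,
        # so dict methods such as keys() and items() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts on the fly so that chained access works.
        return AttrView(value) if isinstance(value, dict) else value


d = AttrView({"key1": {"key2": 1}})
print(d.key1.key2)

d1 = AttrView()
d1["a"] = 1
print(d1.a)
```

In a design like this, a key that collides with a dict method name (for example "keys") is still reachable only through item access, because the method is found first by normal attribute lookup.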
group(label)

Group dictionary according to a label.

This is equivalent to map() with unique set to False.

Parameters:

label (str) – name (key) at which the new label can be found. If nested in dictionaries, use . to separate levels, e.g. level1.level2.label.

Return type:

AttrsDict

Examples

>>> d = AttrsDict({
...   "a": {
...     "type": "A",
...     "data": 1
...   },
...   "b": {
...     "type": "A",
...     "data": 2
...   },
...   "c": {
...     "type": "B",
...     "data": 3
...   },
... })
>>> d.group("type").keys()
dict_keys(['A', 'B'])
>>> d.group("type").A.values()
dict_values([{'type': 'A', 'data': 1}, {'type': 'A', 'data': 2}])
>>> d.group("type").B.values()
dict_values([{'type': 'B', 'data': 3}])
>>> d.group("type").A.map("data")[1]
{'type': 'A', 'data': 1}

See also

map

map(label, unique=True)

Remap dictionary according to an alternative unique label.

Loop over the first-level keys and search each value for a key named label. If label is found and its value newid is unique, create a mapping between newid and the first-level dictionary obj. If label has the form key.label, label is searched for in the dictionary keyed by key. If the label values are unique, a dictionary of dictionaries is returned. If they are not unique and unique is False, the returned dictionary maps each label value to a dictionary of dictionaries keyed by an arbitrary integer.

Parameters:
  • label (str) – name (key) at which the new label can be found. If nested in dictionaries, use . to separate levels, e.g. level1.level2.label.

  • unique (bool) – whether only unique keys are allowed. If True, an error is raised when the specified key is not unique.

Return type:

AttrsDict

Examples

>>> d = AttrsDict({
...   "a": {
...     "id": 1,
...     "group": {
...       "id": 3,
...     },
...     "data": "x"
...   },
...   "b": {
...     "id": 2,
...     "group": {
...       "id": 4,
...     },
...     "data": "y"
...   },
... })
>>> d.map("id")[1].data == "x"
True
>>> d.map("group.id")[4].data == "y"
True

Note

No copy is performed: the returned dictionary is made of references to the original objects.

Warning

The result is cached internally for fast access after the first call. If the dictionary is modified, the cache gets cleared.
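The remapping described for map() and group() can be sketched on plain dictionaries. The helper name remap is hypothetical, the sketch omits the caching mentioned above, and the choice of ValueError for duplicates is an assumption rather than the library's documented exception.

```python
def remap(data, label, unique=True):
    """Hypothetical helper mirroring the described behaviour on plain dicts."""
    path = label.split(".")
    result = {}
    for obj in data.values():
        # Follow the dotted path inside each first-level value.
        node = obj
        try:
            for key in path:
                node = node[key]
        except (KeyError, TypeError):
            continue  # label not present in this entry
        newid = node
        if unique:
            if newid in result:
                msg = f"label value {newid!r} is not unique"
                raise ValueError(msg)
            result[newid] = obj  # stored by reference, no copy
        else:
            bucket = result.setdefault(newid, {})
            bucket[len(bucket)] = obj  # arbitrary integer key
    return result


d = {
    "a": {"id": 1, "group": {"id": 3}, "type": "A"},
    "b": {"id": 2, "group": {"id": 4}, "type": "A"},
}
by_id = remap(d, "id")
by_group = remap(d, "group.id")
by_type = remap(d, "type", unique=False)
```

Because values are stored by reference, by_id[1] and d["a"] are the same object, which matches the note above about no copy being performed.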

reset()

Reset this instance by removing all cached data.

Return type:

None

to_dict()

Return a plain dict representation of the object.

Nested AttrsDict instances and lists are recursively converted to built-in containers to ensure the result is fully serialisable by callers expecting standard dictionaries.

Return type:

dict
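The recursive conversion described for to_dict() can be illustrated with a short sketch. The names Wrapper and to_plain are hypothetical; Wrapper merely stands in for a dict subclass such as AttrsDict, and the actual implementation may differ.

```python
import json


class Wrapper(dict):
    """Hypothetical dict subclass standing in for AttrsDict."""


def to_plain(obj):
    """Recursively rebuild dict subclasses and lists as built-in containers."""
    if isinstance(obj, dict):
        return {key: to_plain(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [to_plain(item) for item in obj]
    return obj


nested = Wrapper({"a": Wrapper({"b": [Wrapper({"c": 1}), 2]})})
plain = to_plain(nested)
text = json.dumps(plain)
```

After the conversion, every container in plain is exactly a dict or a list, so code that checks types strictly no longer sees the subclass.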

class dbetto.Props

Bases: object

Class to handle overwriting of dictionaries in cascade order

static add_to(props_a, props_b)
static read_from(sources, subst_pathvar=False, trim_null=False)
static subst_vars(props, var_values=None, ignore_missing=False)
static trim_null(props_a)
static write_to(file_name, obj, ftype=None)
Parameters:

ftype (str | None)
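As an illustration of what "overwriting of dictionaries in cascade order" can mean, the following sketch merges one dictionary into another so that later values win and nested dictionaries are merged key by key. The function name cascade_merge is hypothetical, and the sketch is an assumption about the general idea behind add_to(), not a copy of it.

```python
import copy


def cascade_merge(base, override):
    """Merge override into base in place; values from override take precedence."""
    for key, value in override.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            # Both sides are dictionaries: descend and merge key by key.
            cascade_merge(base[key], value)
        else:
            # Otherwise the later value replaces the earlier one.
            base[key] = copy.deepcopy(value)
    return base


defaults = {"daq": {"rate": 10, "mode": "phy"}, "name": "default"}
run_specific = {"daq": {"rate": 25}, "name": "run42"}
merged = cascade_merge(copy.deepcopy(defaults), run_specific)
```

Copying defaults before merging keeps the original dictionary untouched, which is convenient when the same defaults feed several cascades.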

class dbetto.TextDB(path, lazy=False, hidden=False)

Bases: object

A simple text file database.

The database is represented on disk by a collection of text files arbitrarily scattered in a filesystem. Subdirectories are also TextDB objects. In memory, the database is represented as an AttrsDict.

Currently supported file formats are JSON and YAML.

Tip

For large databases, a basic “lazy” mode is available. In this mode, no global scan of the filesystem is performed at initialization time. Once a file is queried, it is cached in the internal store for faster access. Caution: this option is for advanced use (see the warning below).

Warning

A manual call to scan() is needed before most class methods (e.g. iterating on the database files) can be properly used.

Examples

>>> from dbetto import TextDB
>>> jdb = TextDB("path/to/dir")
>>> jdb["file1.json"]  # is a dict
>>> jdb["file1.yaml"]  # is a dict
>>> jdb["file1"]  # also works
>>> jdb["dir1"]  # TextDB instance
>>> jdb["dir1"]["file1"]  # nested file
>>> jdb["dir1/file1"]  # also works
>>> jdb.dir1.file # keys can be accessed as attributes
group(label)

Group dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise groupings cannot be created.

Parameters:

label (str)

Return type:

AttrsDict

items()
Return type:

Iterator[str, TextDB | AttrsDict | list]

keys()
Return type:

list[str]

map(label, unique=True)

Remap dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise mappings cannot be created.

Return type:

AttrsDict

on(timestamp, pattern=None, system='all')

Query database in time[, file pattern, system].

Exactly one validity file (YAML, JSON, JSONL and other file types are supported) must exist in the directory to specify a validity mapping. This functionality relies on the catalog.Catalog class.

The YAML specification is documented at this link.

The special $_ string is expanded to the directory containing the text files. Paths may be relative to this directory or absolute; only absolute paths will expand wildcards and environment variables.

Note that the same object will be returned for multiple timestamps if it is valid for all of them; modifications to the returned object will propagate to all of these (unless a deepcopy is explicitly performed).

Parameters:
  • timestamp (str | datetime) – a datetime object or a string matching the pattern YYYYmmddTHHMMSSZ.

  • pattern (str | None) – query by filename pattern.

  • system (str) – query only a data taking “system” (e.g. ‘all’, ‘phy’, ‘cal’, ‘lar’, …)

Return type:

AttrsDict | list
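The note about shared objects can be demonstrated with plain dictionaries. In this sketch the lookup table is a hypothetical stand-in for two on() queries that resolve to the same object; only copy.deepcopy from the standard library is real API here.

```python
import copy

shared = {"threshold": 5}
# Hypothetical stand-in for two timestamps covered by the same validity entry.
lookup = {"20230101T000000Z": shared, "20230102T000000Z": shared}

first = lookup["20230101T000000Z"]
second = lookup["20230102T000000Z"]

# A deep copy can be modified without affecting the shared object.
safe = copy.deepcopy(first)
safe["threshold"] = 99

# Modifying the object itself is visible through every reference to it.
first["threshold"] = 7
```

Taking the deep copy before modifying anything is the simplest way to keep results for other timestamps unchanged.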

reset(rescan=True)

Reset this database instance.

Reinstantiates the internal AttrsDict store and re-scans the database, if non-lazy. Useful if the database state changes at runtime.

Parameters:

rescan (bool)

Return type:

None

scan(recursive=True, subdir='.')

Populate the database by walking the filesystem.

Parameters:
  • recursive (bool) – if True, recurse subdirectories.

  • subdir (str) – restrict scan to path relative to the database location.

Return type:

None
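To make the idea of populating a store by walking the filesystem concrete, here is a minimal sketch limited to JSON files. The function scan_tree is hypothetical and is not how TextDB is implemented; it only assumes, as in the examples above, that files are keyed by their name without extension and that subdirectories become nested stores.

```python
import json
import tempfile
from pathlib import Path


def scan_tree(root, recursive=True):
    """Load every JSON file under root into a nested dictionary."""
    store = {}
    for path in sorted(Path(root).iterdir()):
        if path.is_dir():
            if recursive:
                store[path.name] = scan_tree(path, recursive=True)
        elif path.suffix == ".json":
            store[path.stem] = json.loads(path.read_text())
    return store


# Build a throwaway directory tree to scan.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dir1").mkdir()
    (root / "file1.json").write_text(json.dumps({"a": 1}))
    (root / "dir1" / "file2.json").write_text(json.dumps({"b": 2}))
    db = scan_tree(root)
    flat = scan_tree(root, recursive=False)
```

With recursive=False the sketch skips subdirectories entirely, so flat contains only the top-level file.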

dbetto.load_attrs_dict(fname, ftype=None)

Load a text file as an AttrsDict.

Return type:

AttrsDict

dbetto.load_dict(fname, ftype=None)

Load a text file as a Python dict.

Return type:

dict

dbetto.str_to_datetime(value)

Convert a string in the format %Y%m%dT%H%M%SZ to datetime.datetime.

Submodules

dbetto.attrsdict module

class dbetto.attrsdict.AttrsDict(value=None, readonly=False)

Bases: dict

Access dictionary items as attributes.

Examples

>>> d = AttrsDict({"key1": {"key2": 1}})
>>> d.key1.key2
1
>>> d1 = AttrsDict()
>>> d1["a"] = 1
>>> d1.a
1
group(label)

Group dictionary according to a label.

This is equivalent to map() with unique set to False.

Parameters:

label (str) – name (key) at which the new label can be found. If nested in dictionaries, use . to separate levels, e.g. level1.level2.label.

Return type:

AttrsDict

Examples

>>> d = AttrsDict({
...   "a": {
...     "type": "A",
...     "data": 1
...   },
...   "b": {
...     "type": "A",
...     "data": 2
...   },
...   "c": {
...     "type": "B",
...     "data": 3
...   },
... })
>>> d.group("type").keys()
dict_keys(['A', 'B'])
>>> d.group("type").A.values()
dict_values([{'type': 'A', 'data': 1}, {'type': 'A', 'data': 2}])
>>> d.group("type").B.values()
dict_values([{'type': 'B', 'data': 3}])
>>> d.group("type").A.map("data")[1]
{'type': 'A', 'data': 1}

See also

map

map(label, unique=True)

Remap dictionary according to an alternative unique label.

Loop over the first-level keys and search each value for a key named label. If label is found and its value newid is unique, create a mapping between newid and the first-level dictionary obj. If label has the form key.label, label is searched for in the dictionary keyed by key. If the label values are unique, a dictionary of dictionaries is returned. If they are not unique and unique is False, the returned dictionary maps each label value to a dictionary of dictionaries keyed by an arbitrary integer.

Parameters:
  • label (str) – name (key) at which the new label can be found. If nested in dictionaries, use . to separate levels, e.g. level1.level2.label.

  • unique (bool) – whether only unique keys are allowed. If True, an error is raised when the specified key is not unique.

Return type:

AttrsDict

Examples

>>> d = AttrsDict({
...   "a": {
...     "id": 1,
...     "group": {
...       "id": 3,
...     },
...     "data": "x"
...   },
...   "b": {
...     "id": 2,
...     "group": {
...       "id": 4,
...     },
...     "data": "y"
...   },
... })
>>> d.map("id")[1].data == "x"
True
>>> d.map("group.id")[4].data == "y"
True

Note

No copy is performed: the returned dictionary is made of references to the original objects.

Warning

The result is cached internally for fast access after the first call. If the dictionary is modified, the cache gets cleared.

reset()

Reset this instance by removing all cached data.

Return type:

None

to_dict()

Return a plain dict representation of the object.

Nested AttrsDict instances and lists are recursively converted to built-in containers to ensure the result is fully serialisable by callers expecting standard dictionaries.

Return type:

dict

dbetto.catalog module

class dbetto.catalog.Catalog(entries)

Bases: Catalog

Implementation of the YAML metadata validity specification.

The legacy JSONL specification is also supported.

class Entry(valid_from, file)

Bases: Entry

An entry in the validity file.

asdict()
save_format(system='all')
Parameters:

system (str)

static build_catalog(propstream, mode_default='append', suppress_duplicate_check=False)

Build a Catalog object from a validity file/stream

Return type:

Catalog

static get(value)
get_dict_format()
Return type:

list

static get_files(catalog_file, timestamp, category='all')

Helper function to get the files for a given timestamp and category

Return type:

list

static read_from(file_name)

Read from a validity file and build a Catalog object

Parameters:

file_name (str | Path)

Return type:

Catalog

valid_for(timestamp, system='all', allow_none=False)

Get the valid entries for a given timestamp and system

Return type:

list
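A time-based validity lookup of this kind can be sketched with a sorted list and a binary search. Everything below is a simplified, hypothetical model: the entry layout, the file names and the function files_valid_at are assumptions, each entry simply replaces the previous one, and the modes hinted at by mode_default as well as the system selection are ignored.

```python
from bisect import bisect_right

# Hypothetical entries, sorted by valid_from. Fixed-width timestamps in the
# YYYYmmddTHHMMSSZ format sort correctly as plain strings.
entries = [
    ("20220101T000000Z", ["config_v1.yaml"]),
    ("20230101T000000Z", ["config_v2.yaml"]),
]


def files_valid_at(entries, timestamp):
    """Return the files of the latest entry whose valid_from <= timestamp."""
    starts = [valid_from for valid_from, _ in entries]
    index = bisect_right(starts, timestamp)
    if index == 0:
        msg = f"no entry is valid at {timestamp}"
        raise LookupError(msg)
    return entries[index - 1][1]


early = files_valid_at(entries, "20220615T120000Z")
late = files_valid_at(entries, "20230101T000000Z")
```

Using bisect_right means a timestamp exactly equal to a valid_from value already selects that entry.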

write_to(file_name)

Write a Catalog object to a validity file

Parameters:

file_name (str | Path)

Return type:

None

class dbetto.catalog.Props

Bases: object

Class to handle overwriting of dictionaries in cascade order

static add_to(props_a, props_b)
static read_from(sources, subst_pathvar=False, trim_null=False)
static subst_vars(props, var_values=None, ignore_missing=False)
static trim_null(props_a)
static write_to(file_name, obj, ftype=None)
Parameters:

ftype (str | None)

class dbetto.catalog.PropsStream

Bases: object

Simple class to control loading of validity files

static get(value)
Parameters:

value (str | Path | list | Generator)

Return type:

Generator[dict, None, None]

static read_from(file_name)
Parameters:

file_name (str | Path)

Return type:

Generator[dict, None, None]

static yield_list(value)
Parameters:

value (list)

Return type:

Generator[dict, None, None]

dbetto.textdb module

class dbetto.textdb.TextDB(path, lazy=False, hidden=False)

Bases: object

A simple text file database.

The database is represented on disk by a collection of text files arbitrarily scattered in a filesystem. Subdirectories are also TextDB objects. In memory, the database is represented as an AttrsDict.

Currently supported file formats are JSON and YAML.

Tip

For large databases, a basic “lazy” mode is available. In this mode, no global scan of the filesystem is performed at initialization time. Once a file is queried, it is cached in the internal store for faster access. Caution: this option is for advanced use (see the warning below).

Warning

A manual call to scan() is needed before most class methods (e.g. iterating on the database files) can be properly used.

Examples

>>> from dbetto import TextDB
>>> jdb = TextDB("path/to/dir")
>>> jdb["file1.json"]  # is a dict
>>> jdb["file1.yaml"]  # is a dict
>>> jdb["file1"]  # also works
>>> jdb["dir1"]  # TextDB instance
>>> jdb["dir1"]["file1"]  # nested file
>>> jdb["dir1/file1"]  # also works
>>> jdb.dir1.file # keys can be accessed as attributes
group(label)

Group dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise groupings cannot be created.

Parameters:

label (str)

Return type:

AttrsDict

items()
Return type:

Iterator[str, TextDB | AttrsDict | list]

keys()
Return type:

list[str]

map(label, unique=True)

Remap dictionary according to a second unique label.

Warning

If the database is lazy, you must call scan() in advance to populate it, otherwise mappings cannot be created.

Return type:

AttrsDict

on(timestamp, pattern=None, system='all')

Query database in time[, file pattern, system].

Exactly one validity file (YAML, JSON, JSONL and other file types are supported) must exist in the directory to specify a validity mapping. This functionality relies on the catalog.Catalog class.

The YAML specification is documented at this link.

The special $_ string is expanded to the directory containing the text files. Paths may be relative to this directory or absolute; only absolute paths will expand wildcards and environment variables.

Note that the same object will be returned for multiple timestamps if it is valid for all of them; modifications to the returned object will propagate to all of these (unless a deepcopy is explicitly performed).

Parameters:
  • timestamp (str | datetime) – a datetime object or a string matching the pattern YYYYmmddTHHMMSSZ.

  • pattern (str | None) – query by filename pattern.

  • system (str) – query only a data taking “system” (e.g. ‘all’, ‘phy’, ‘cal’, ‘lar’, …)

Return type:

AttrsDict | list

reset(rescan=True)

Reset this database instance.

Reinstantiates the internal AttrsDict store and re-scans the database, if non-lazy. Useful if the database state changes at runtime.

Parameters:

rescan (bool)

Return type:

None

scan(recursive=True, subdir='.')

Populate the database by walking the filesystem.

Parameters:
  • recursive (bool) – if True, recurse subdirectories.

  • subdir (str) – restrict scan to path relative to the database location.

Return type:

None

dbetto.time module

dbetto.time.datetime_to_str(value)

Convert a datetime.datetime object to a string in the format %Y%m%dT%H%M%SZ.

dbetto.time.str_to_datetime(value)

Convert a string in the format %Y%m%dT%H%M%SZ to datetime.datetime.

dbetto.time.unix_time(value)

Convert a string in the format %Y%m%dT%H%M%SZ or a datetime.datetime object to a Unix time value.
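The three conversions in this module can be reproduced with the standard library alone. The helper names to_str, from_str and to_unix are hypothetical, and treating the trailing Z as UTC is an assumption of this sketch rather than something stated above.

```python
from datetime import datetime, timezone

FMT = "%Y%m%dT%H%M%SZ"


def to_str(value):
    """Format a datetime using the timestamp pattern."""
    return value.strftime(FMT)


def from_str(value):
    """Parse a timestamp string; assumes the trailing Z means UTC."""
    return datetime.strptime(value, FMT).replace(tzinfo=timezone.utc)


def to_unix(value):
    """Accept either a timestamp string or a datetime object."""
    if isinstance(value, str):
        value = from_str(value)
    return value.timestamp()


stamp = "20230101T000000Z"
parsed = from_str(stamp)
seconds = to_unix(stamp)
```

Note that datetime.timestamp() interprets a naive datetime as local time, so passing timezone-aware objects avoids surprises in a sketch like this.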

dbetto.utils module

dbetto.utils.float_representer(dumper, value)
dbetto.utils.load_attrs_dict(fname, ftype=None)

Load a text file as an AttrsDict.

Return type:

AttrsDict

dbetto.utils.load_dict(fname, ftype=None)

Load a text file as a Python dict.

Return type:

dict

dbetto.utils.write_dict(obj, fname, ftype=None)

Write a Python dict to a text file.

Return type:

None