Version: Main/Unreleased


TrainingCache Objects

class TrainingCache(abc.ABC)

Stores training results in a persistent cache.

Used to minimize retraining when the data and config did not change between training runs.


def cache_output(fingerprint_key: Text, output: Any, output_fingerprint: Text,
model_storage: ModelStorage) -> None

Adds the output to the cache.

If the output is of type Cacheable, it is persisted to disk in addition to its fingerprint.


  • fingerprint_key - The fingerprint key serves as the key for the cache. Graph components can use their fingerprint key to look up fingerprints of previous training runs.
  • output - The output to cache. It is only persisted to disk if it is of type Cacheable.
  • output_fingerprint - The fingerprint of the output. This can be used to look up potentially persisted outputs on disk.
  • model_storage - Required for caching Resource instances. For example, Resources use it to copy data from the model storage to the cache.


def get_cached_output_fingerprint(fingerprint_key: Text) -> Optional[Text]

Retrieves the fingerprint of an output based on its fingerprint key.


  • fingerprint_key - The fingerprint key serves as the key for looking up output fingerprints.


Returns the fingerprint of a matching output, or None if no cache entry was found for the given fingerprint key.


def get_cached_result(output_fingerprint_key: Text, node_name: Text,
model_storage: ModelStorage) -> Optional[Cacheable]

Returns a potentially cached output result.


  • output_fingerprint_key - The fingerprint key of the output serves as the lookup key for a potentially cached version of this output.
  • node_name - The name of the graph node which wants to use this cached result.
  • model_storage - The current model storage (e.g. used when restoring Resource objects so that they can fill the model storage with data).


Returns the restored Cacheable, or None if no matching result was found.
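The interplay of the three methods above can be illustrated with a small, self-contained sketch. It does not use the real classes: ModelStorage, TextOutput and ToyTrainingCache below are hypothetical stand-ins, and keeping the fingerprint index in a dictionary is an assumption made only to keep the example short.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional


class ModelStorage:
    """Hypothetical stand-in for the real model storage; unused in this sketch."""


class TextOutput:
    """Hypothetical Cacheable-style output that persists a single string."""

    def __init__(self, text: str) -> None:
        self.text = text

    def to_cache(self, directory: Path, model_storage: ModelStorage) -> None:
        (directory / "output.txt").write_text(self.text)

    @classmethod
    def from_cache(cls, node_name: str, directory: Path,
                   model_storage: ModelStorage,
                   output_fingerprint: str) -> "TextOutput":
        return cls((directory / "output.txt").read_text())


class ToyTrainingCache:
    """Toy cache: fingerprints live in a dict, Cacheable outputs on disk."""

    def __init__(self, root: Path) -> None:
        self._root = root
        # fingerprint key -> output fingerprint
        self._fingerprints: Dict[str, str] = {}
        # output fingerprint -> class that knows how to restore the output
        self._result_types: Dict[str, type] = {}

    def cache_output(self, fingerprint_key: str, output: Any,
                     output_fingerprint: str,
                     model_storage: ModelStorage) -> None:
        # The fingerprint is always stored, whatever the output type is.
        self._fingerprints[fingerprint_key] = output_fingerprint
        # Only outputs offering to_cache / from_cache are written to disk.
        if hasattr(output, "to_cache") and hasattr(type(output), "from_cache"):
            directory = self._root / output_fingerprint
            directory.mkdir(parents=True, exist_ok=True)
            output.to_cache(directory, model_storage)
            self._result_types[output_fingerprint] = type(output)

    def get_cached_output_fingerprint(self, fingerprint_key: str) -> Optional[str]:
        return self._fingerprints.get(fingerprint_key)

    def get_cached_result(self, output_fingerprint_key: str, node_name: str,
                          model_storage: ModelStorage) -> Optional[Any]:
        result_type = self._result_types.get(output_fingerprint_key)
        if result_type is None:
            return None
        return result_type.from_cache(
            node_name, self._root / output_fingerprint_key,
            model_storage, output_fingerprint_key,
        )


storage = ModelStorage()
cache = ToyTrainingCache(Path(tempfile.mkdtemp()))
# A Cacheable-style output: fingerprint and content are both stored.
cache.cache_output("key-1", TextOutput("trained"), "fp-1", storage)
# A plain dict: only its fingerprint is stored.
cache.cache_output("key-2", {"plain": "dict"}, "fp-2", storage)
```

In this sketch, looking up "key-1" yields a fingerprint that can then be passed to get_cached_result to restore the output, whereas "key-2" yields a fingerprint for which no result can be restored.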

Cacheable Objects

class Cacheable(Protocol)

Protocol for cacheable graph component outputs.

Only graph component outputs that are Cacheable are cached. For everything else, only the output fingerprint is stored.


def to_cache(directory: Path, model_storage: ModelStorage) -> None

Persists Cacheable to disk.


  • directory - The directory to which the Cacheable can persist itself.
  • model_storage - The current model storage (e.g. used when caching Resource objects).


def from_cache(cls, node_name: Text, directory: Path,
model_storage: ModelStorage,
output_fingerprint: Text) -> Cacheable

Loads Cacheable from cache.


  • node_name - The name of the graph node which wants to use this cached result.
  • directory - Directory containing the persisted Cacheable.
  • model_storage - The current model storage (e.g. used when restoring Resource objects so that they can fill the model storage with data).
  • output_fingerprint - The fingerprint of the cached result (e.g. used when restoring Resource objects, as the fingerprint cannot be easily calculated from the object itself).


Returns the instantiated Cacheable.
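As an illustration of the protocol, the following sketch shows one way a graph component output could implement to_cache and from_cache. The protocol is re-declared locally so the example runs on its own. ModelStorage is a hypothetical stand-in, and WordCounts, the counts.json file name and the use of runtime_checkable are assumptions made for this example only.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Protocol, runtime_checkable


class ModelStorage:
    """Hypothetical stand-in for the real model storage; unused in this sketch."""


@runtime_checkable
class Cacheable(Protocol):
    """Local re-declaration of the protocol described above (illustration only)."""

    def to_cache(self, directory: Path, model_storage: ModelStorage) -> None:
        ...

    @classmethod
    def from_cache(cls, node_name: str, directory: Path,
                   model_storage: ModelStorage,
                   output_fingerprint: str) -> "Cacheable":
        ...


class WordCounts:
    """Hypothetical output that satisfies the protocol structurally."""

    def __init__(self, counts: Dict[str, int]) -> None:
        self.counts = counts

    def to_cache(self, directory: Path, model_storage: ModelStorage) -> None:
        # Persist everything needed to rebuild the object into the directory.
        (directory / "counts.json").write_text(json.dumps(self.counts))

    @classmethod
    def from_cache(cls, node_name: str, directory: Path,
                   model_storage: ModelStorage,
                   output_fingerprint: str) -> "WordCounts":
        # node_name and output_fingerprint are not needed for this simple type.
        return cls(json.loads((directory / "counts.json").read_text()))


cache_dir = Path(tempfile.mkdtemp())
original = WordCounts({"hello": 2, "world": 1})
original.to_cache(cache_dir, ModelStorage())
restored = WordCounts.from_cache("count_words", cache_dir, ModelStorage(), "fp-123")
```

Because the protocol is structural, WordCounts does not need to inherit from Cacheable. A plain dictionary has neither method and would therefore have only its fingerprint stored.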

LocalTrainingCache Objects

class LocalTrainingCache(TrainingCache)

Caches training results on local disk (see parent class for full docstring).

CacheEntry Objects

class CacheEntry(Base)

Stores metadata about a single cache entry.


def __init__() -> None

Creates the cache.

The cache settings can be configured via environment variables.
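The general pattern of reading cache settings from environment variables can be sketched as follows. The variable names EXAMPLE_CACHE_DIRECTORY and EXAMPLE_MAX_CACHE_SIZE_MB, the default values, and the CacheSettings and load_cache_settings helpers are all hypothetical. They are not the names used by LocalTrainingCache, which this section does not list.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class CacheSettings:
    """Hypothetical container for the settings a local cache might need."""

    directory: Path
    max_size_mb: float


def load_cache_settings(env: Mapping[str, str] = os.environ) -> CacheSettings:
    """Reads settings from the environment, falling back to defaults.

    Both variable names and both defaults are assumptions for illustration.
    """
    directory = Path(env.get("EXAMPLE_CACHE_DIRECTORY", ".example_cache"))
    max_size_mb = float(env.get("EXAMPLE_MAX_CACHE_SIZE_MB", "1000"))
    return CacheSettings(directory=directory, max_size_mb=max_size_mb)


# Passing an explicit mapping keeps the example independent of the real environment.
defaults = load_cache_settings({})
custom = load_cache_settings(
    {"EXAMPLE_CACHE_DIRECTORY": "/tmp/my_cache", "EXAMPLE_MAX_CACHE_SIZE_MB": "250"}
)
```

Accepting the environment as a mapping parameter makes the configuration logic easy to test without modifying os.environ.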


def cache_output(fingerprint_key: Text, output: Any, output_fingerprint: Text,
model_storage: ModelStorage) -> None

Adds the output to the cache (see parent class for full docstring).


def get_cached_output_fingerprint(fingerprint_key: Text) -> Optional[Text]

Returns cached output fingerprint (see parent class for full docstring).


def get_cached_result(output_fingerprint_key: Text, node_name: Text,
model_storage: ModelStorage) -> Optional[Cacheable]

Returns a potentially cached output (see parent class for full docstring).