DataStore

class DataStore

Data-layer access interface.

DataStore exposes a generic interface for storing and reading the data used internally by the identification engine. Logically, the engine needs access to two kinds of data: the raw fingerprint data and the indexed fingerprint data. The DataStore interface abstracts both, regardless of how they are implemented. Clients must provide the concrete implementation that physically accesses these structures. How the structures are organised in the data store is irrelevant to the identification engine, as long as they are returned in the specified formats.

Public Functions

virtual void OnIndexerStart() = 0

This method is called by the indexer to notify the data store that an indexing session is starting. It can be used to perform specific tasks in the data store before the indexer starts emitting index chunks.

virtual void OnIndexerEnd() = 0

This method is called by the indexer to notify the data store that an indexing session has ended. It can be used to perform specific tasks in the data store after the indexer has finished its job.

virtual void OnIndexerFlushStart() = 0

This method is called by the indexer to signal that the data stored in the cache is being flushed (that is, processed) and that index (list) chunks are being emitted. It is triggered when the cache size exceeds the internal default limit (or the limit set using Indexer::SetCacheLimit()), or as a result of calling Indexer::Flush(). It can be used to perform specific tasks according to the chosen indexing strategy (see [link] for more details).

virtual void OnIndexerFlushEnd() = 0

This method is called by the indexer to signal that the data stored in the cache has been completely processed and emitted.
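One possible use of the four session callbacks is to group the writes of each flush into a batch that is committed when the flush ends. The sketch below only counts writes instead of touching real storage. `FlushBatcher` and `RecordWrite()` are hypothetical names, not part of the library, and the sketch assumes that chunk handlers are only invoked between OnIndexerFlushStart() and OnIndexerFlushEnd().

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical helper (not part of the library): mirrors the four session
// callbacks and treats each flush as one batch of writes.
class FlushBatcher {
public:
    void OnIndexerStart() {
        session_open_ = true;
        committed_batches_ = 0;
        committed_writes_ = 0;
    }

    // A flush begins: open an empty batch.
    void OnIndexerFlushStart() {
        batch_open_ = true;
        pending_writes_ = 0;
    }

    // Would be called from the chunk handlers (assumption: they only run
    // while a flush is in progress).
    void RecordWrite() {
        assert(batch_open_);
        ++pending_writes_;
    }

    // The flush is complete: "commit" the batch.
    void OnIndexerFlushEnd() {
        committed_writes_ += pending_writes_;
        pending_writes_ = 0;
        ++committed_batches_;
        batch_open_ = false;
    }

    void OnIndexerEnd() { session_open_ = false; }

    bool session_open() const { return session_open_; }
    std::size_t committed_batches() const { return committed_batches_; }
    std::size_t committed_writes() const { return committed_writes_; }

private:
    bool session_open_ = false;
    bool batch_open_ = false;
    std::size_t pending_writes_ = 0;
    std::size_t committed_batches_ = 0;
    std::size_t committed_writes_ = 0;
};

}  // namespace sketch
```

In a database-backed store the same hooks could open and commit a transaction, so that a flush is either stored completely or not at all.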

virtual PListHeader OnIndexerListHeader(int lid) = 0

This method is called by the indexer during the indexing stage in order to build the search lists. It shall return the header of the specified list. Headers must be returned exactly as they were emitted by the indexer, so if the data store has transformed them in any way, the original layout must be restored before they are returned.

Return

The header of the specified list.

Note

If the list does not exist, the method shall return a null header (a zero-initialized header).

Parameters
  • [in] lid: The identifier of the list whose header has to be retrieved.

virtual PListBlockHeader OnIndexerBlockHeader(int lid, int bid) = 0

This method is called by the indexer during the indexing stage in order to emit the chunked blocks. It shall return the header of the specified block in the specified list. Headers must be returned exactly as they were emitted by the indexer, so if the data store has transformed them in any way, the inverse transformation must be applied to restore the original layout.

Return

The header of the specified block in the specified list.

Note

If the list or block does not exist, the method shall return a null header (a zero-initialized header).

Parameters
  • [in] lid: The identifier of the list where the block can be found.

  • [in] bid: The identifier of the block whose header has to be retrieved.
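As an illustration of the "null header" rule for both header getters, here is a minimal in-memory sketch. The `PListHeader` and `PListBlockHeader` structs are stand-ins with hypothetical fields (the real definitions come from the library headers and may differ), and `HeaderTable` with its `Put...` methods is a made-up helper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

namespace sketch {

// Stand-in types with hypothetical fields, for illustration only.
struct PListHeader { std::uint32_t BlockCount; };
struct PListBlockHeader { std::uint32_t ID; std::uint32_t BodySize; };

// Hypothetical in-memory header table.
class HeaderTable {
public:
    void PutListHeader(int lid, const PListHeader& h) { lists_[lid] = h; }

    void PutBlockHeader(int lid, int bid, const PListBlockHeader& h) {
        blocks_[{lid, bid}] = h;
    }

    // Returns the stored header, or a zero-initialized one if the list
    // is unknown.
    PListHeader OnIndexerListHeader(int lid) {
        auto it = lists_.find(lid);
        return it != lists_.end() ? it->second : PListHeader{};
    }

    // Returns the stored header, or a zero-initialized one if the list
    // or the block is unknown.
    PListBlockHeader OnIndexerBlockHeader(int lid, int bid) {
        auto it = blocks_.find({lid, bid});
        return it != blocks_.end() ? it->second : PListBlockHeader{};
    }

private:
    std::map<int, PListHeader> lists_;
    std::map<std::pair<int, int>, PListBlockHeader> blocks_;
};

}  // namespace sketch
```

Value-initializing the struct with `{}` sets every field to zero, which matches the zero-initialized header described in the notes above.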

virtual void OnIndexerChunk(int lid, PListHeader &lhdr, PListBlockHeader &hdr, uint8_t *chunk, size_t chunk_size) = 0

This handler is called by the indexer whenever a new block chunk is produced. When the indexer’s cache is flushed, its contents are processed and block chunks are emitted for the data store to process. Logically, each chunk is appended to the specified list as part of the specified block (the last block, also referred to as the “append block”), although the data store implementation is free to organize the data as it likes. However, when blocks are requested by the engine, they must be returned exactly as they were emitted. See [link] for more details. Because the list and the block being appended to are modified, their updated headers are emitted as well. Clients should use them to update the data store accordingly.

Parameters
  • [in] lid: The identifier of the list to which to append the chunk.

  • [in] lhdr: The list’s header.

  • [in] hdr: The header of the block to which to append the chunk. This is actually always the last block in the list (the “append block”).

  • [in] chunk: Pointer to the chunk’s data. Clients must neither delete nor retain this pointer.

  • [in] chunk_size: The size in bytes of the chunk’s data.
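The append semantics can be sketched with an in-memory store, as shown below. Because the chunk pointer must not be retained, the bytes are copied before the handler returns. The header structs are stand-ins with hypothetical fields, `ChunkAppender` and `AppendBlockSize()` are made-up names, and creating a block on the fly when the list is still empty is a defensive assumption rather than documented behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

namespace sketch {

// Stand-in types with hypothetical fields, for illustration only.
struct PListHeader { std::uint32_t BlockCount; };
struct PListBlockHeader { std::uint32_t ID; std::uint32_t BodySize; };

// Hypothetical in-memory store: each list is a sequence of blocks and
// chunks always go to the last one (the "append block").
class ChunkAppender {
public:
    void OnIndexerChunk(int lid, PListHeader& lhdr, PListBlockHeader& hdr,
                        std::uint8_t* chunk, std::size_t chunk_size) {
        assert(chunk != nullptr || chunk_size == 0);
        List& list = lists_[lid];
        list.header = lhdr;  // keep the updated list header
        if (list.blocks.empty()) {
            list.blocks.emplace_back();  // defensive: no block opened yet
        }
        Block& append_block = list.blocks.back();
        append_block.header = hdr;  // keep the updated block header
        // Copy the bytes: the pointer is only valid during this call.
        append_block.body.insert(append_block.body.end(),
                                 chunk, chunk + chunk_size);
    }

    // Size in bytes of the append block's body, or 0 for an unknown list.
    std::size_t AppendBlockSize(int lid) const {
        auto it = lists_.find(lid);
        if (it == lists_.end() || it->second.blocks.empty()) return 0;
        return it->second.blocks.back().body.size();
    }

private:
    struct Block {
        PListBlockHeader header{};
        std::vector<std::uint8_t> body;
    };
    struct List {
        PListHeader header{};
        std::vector<Block> blocks;
    };
    std::map<int, List> lists_;
};

}  // namespace sketch
```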

virtual void OnIndexerNewBlock(int lid, PListHeader &lhdr, PListBlockHeader &hdr, uint8_t *chunk, size_t chunk_size) = 0

This handler is called by the indexer whenever a new block is produced. It signals to the data store that the last block (the “append block”) in the specified list has reached the set size limit and that the chunk must be put in a new, empty block. See [link] for more details.

Parameters
  • [in] lid: The identifier of the list to which to append the chunk.

  • [in] lhdr: The list’s header.

  • [in] hdr: The block’s header.

  • [in] chunk: Pointer to the chunk’s data. Clients must neither delete nor retain this pointer.

  • [in] chunk_size: The size in bytes of the chunk’s data.
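In contrast to a plain chunk append, this handler opens a new block whose body starts with the given chunk. A minimal in-memory sketch follows. The header structs are stand-ins with hypothetical fields, and `BlockOpener`, `BlockCount()` and `LastBlockSize()` are made-up names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

namespace sketch {

// Stand-in types with hypothetical fields, for illustration only.
struct PListHeader { std::uint32_t BlockCount; };
struct PListBlockHeader { std::uint32_t ID; std::uint32_t BodySize; };

// Hypothetical in-memory store: each list is a sequence of blocks.
class BlockOpener {
public:
    void OnIndexerNewBlock(int lid, PListHeader& lhdr, PListBlockHeader& hdr,
                           std::uint8_t* chunk, std::size_t chunk_size) {
        assert(chunk != nullptr || chunk_size == 0);
        List& list = lists_[lid];
        list.header = lhdr;  // keep the updated list header
        // Open a new block; its body is a copy of the chunk, since the
        // pointer must not be retained.
        Block block;
        block.header = hdr;
        block.body.assign(chunk, chunk + chunk_size);
        list.blocks.push_back(block);
    }

    // Number of blocks stored for the list, or 0 for an unknown list.
    std::size_t BlockCount(int lid) const {
        auto it = lists_.find(lid);
        return it != lists_.end() ? it->second.blocks.size() : 0;
    }

    // Size in bytes of the newest block's body, or 0 if there is none.
    std::size_t LastBlockSize(int lid) const {
        auto it = lists_.find(lid);
        if (it == lists_.end() || it->second.blocks.empty()) return 0;
        return it->second.blocks.back().body.size();
    }

private:
    struct Block {
        PListBlockHeader header{};
        std::vector<std::uint8_t> body;
    };
    struct List {
        PListHeader header{};
        std::vector<Block> blocks;
    };
    std::map<int, List> lists_;
};

}  // namespace sketch
```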

virtual void OnIndexerFingerprint(uint32_t FID, uint8_t *data, size_t data_size) = 0

As part of the indexing process, a fingerprint is emitted for each processed audio recording. This method is called by the indexer to signal the creation of the fingerprint. Clients that do not need the original fingerprints may leave the implementation empty (the method is pure virtual, so an override must still be provided). For example, if MMS is always set to zero, the only match level used by the algorithm does not use the original fingerprint data.

Parameters
  • [in] FID: The fingerprint’s unique identifier.

  • [in] data: Pointer to the memory location containing the fingerprint’s data.

  • [in] data_size: The size in bytes of the fingerprint’s data.

virtual const uint8_t *GetPListBlock(int lid, int bid, size_t &data_size, bool headers = false) = 0

This method is called by the engine during the identification stage. It shall return the specified block of the specified list from the fingerprint index. Although clients are free to implement their own storage solutions and layouts, index blocks, along with their headers, must be returned exactly as they were emitted during the indexing stage. So, if the data store implementation applies some sort of transformation to the block data, the inverse transformation must be applied before returning the block’s data to the engine.

Return

A pointer to the memory location containing the block’s data. Clients must ensure that the returned pointer remains valid after the method returns, for example by storing the data into a session-bound buffer. See [link] for more details.

Note

A null pointer and zero size shall be returned if the block is not found.

Parameters
  • [in] lid: The identifier of the list from which to retrieve the block.

  • [in] bid: The identifier of the block to be retrieved.

  • [out] data_size: The size in bytes of the block’s data.

  • [in] headers: Flag specifying whether to include the block’s header in the returned data. If true, the block’s header must be returned prepended to the block’s body.
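The pointer-lifetime and not-found rules can be sketched as follows, with a member buffer that outlives the call. The sketch assumes one reader object per identification session, so the returned pointer stays valid until the next call on the same object. The header struct is a stand-in with hypothetical fields, `BlockReader` and `Put()` are made-up names, and prepending the header as its raw in-memory bytes is an assumption about the expected layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in type with hypothetical fields, for illustration only.
struct PListBlockHeader { std::uint32_t ID; std::uint32_t BodySize; };

// Hypothetical reader, assumed to be owned by a single session.
class BlockReader {
public:
    void Put(int lid, int bid, const PListBlockHeader& hdr,
             std::vector<std::uint8_t> body) {
        blocks_[{lid, bid}] = Stored{hdr, std::move(body)};
    }

    const std::uint8_t* GetPListBlock(int lid, int bid,
                                      std::size_t& data_size,
                                      bool headers = false) {
        data_size = 0;
        auto it = blocks_.find({lid, bid});
        if (it == blocks_.end()) return nullptr;  // not found: null, size 0

        const Stored& s = it->second;
        buffer_.clear();
        if (headers) {
            // Assumption: the header is prepended as its raw bytes.
            const auto* h = reinterpret_cast<const std::uint8_t*>(&s.header);
            buffer_.insert(buffer_.end(), h, h + sizeof(s.header));
        }
        buffer_.insert(buffer_.end(), s.body.begin(), s.body.end());
        data_size = buffer_.size();
        return buffer_.data();  // valid until the next call on this reader
    }

private:
    struct Stored {
        PListBlockHeader header;
        std::vector<std::uint8_t> body;
    };
    std::map<std::pair<int, int>, Stored> blocks_;
    std::vector<std::uint8_t> buffer_;  // session-bound buffer
};

}  // namespace sketch
```

A store that compresses or otherwise transforms blocks would decode them into the same buffer before returning.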

virtual size_t GetFingerprintSize(uint32_t FID) = 0

Get the size of the specified fingerprint.

Return

The size in bytes of the specified fingerprint.

Parameters
  • [in] FID: The fingerprint’s unique identifier.

virtual const uint8_t *GetFingerprint(uint32_t FID, size_t &read, size_t nbytes = 0, uint32_t bo = 0) = 0

Get fingerprint data from the data store. The match algorithm does not need entire fingerprints for recognition, only small chunks of them, so this method is flexible about the amount of data retrieved: it can return the whole fingerprint or just a portion of it. This method is never called if the MMS parameter is set to 0, because the fingerprint data is not needed in that case.

Return

A pointer to the memory location containing the fingerprint’s data. The engine does not take ownership of this pointer, which must remain valid after the method returns. This can be achieved by storing the data into a session-bound buffer (see [link] for more details).

Parameters
  • [in] FID: The fingerprint’s unique identifier.

  • [out] read: The size in bytes of the data actually read.

  • [in] nbytes: The size in bytes of the data to be read. If it’s set to zero, the whole fingerprint starting at bo shall be returned.

  • [in] bo: The offset in bytes within the fingerprint at which to start reading.
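The nbytes and bo semantics can be sketched with an in-memory fingerprint store, shown below. `FingerprintStore` is a made-up name. Returning a null pointer with read set to zero for an unknown FID or an out-of-range offset, and clamping nbytes to the available data, are assumptions, since the behaviour in those cases is not specified here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

namespace sketch {

// Hypothetical in-memory store, assumed to be owned by a single session.
class FingerprintStore {
public:
    void OnIndexerFingerprint(std::uint32_t FID, std::uint8_t* data,
                              std::size_t data_size) {
        assert(data != nullptr || data_size == 0);
        fingerprints_[FID].assign(data, data + data_size);  // copy the bytes
    }

    std::size_t GetFingerprintSize(std::uint32_t FID) {
        auto it = fingerprints_.find(FID);
        return it != fingerprints_.end() ? it->second.size() : 0;
    }

    const std::uint8_t* GetFingerprint(std::uint32_t FID, std::size_t& read,
                                       std::size_t nbytes = 0,
                                       std::uint32_t bo = 0) {
        read = 0;
        auto it = fingerprints_.find(FID);
        if (it == fingerprints_.end()) return nullptr;  // assumption
        const std::vector<std::uint8_t>& fp = it->second;
        if (bo >= fp.size()) return nullptr;            // assumption

        const std::size_t available = fp.size() - bo;
        // nbytes == 0 means "everything from bo to the end".
        const std::size_t count =
            (nbytes == 0) ? available : std::min(nbytes, available);

        const auto first = fp.begin() + static_cast<std::ptrdiff_t>(bo);
        buffer_.assign(first, first + static_cast<std::ptrdiff_t>(count));
        read = count;
        return buffer_.data();  // valid until the next call on this store
    }

private:
    std::map<std::uint32_t, std::vector<std::uint8_t>> fingerprints_;
    std::vector<std::uint8_t> buffer_;  // session-bound buffer
};

}  // namespace sketch
```

A disk or database backed store would read only the requested byte range into the buffer instead of copying from memory.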