Recognizer

class Recognizer

Interface to access the engine’s core functionality.

The Recognizer class exposes the part of the API that deals with audio identification. It performs fingerprinting of the given audio, matching, and classification. Clients use this interface to get the results of the recognition.

Public Functions

virtual void SetMatchType(eMatchType type) = 0

Set the type of matching algorithm to be used.

Warning

This value must match the one set in the Indexer when the fingerprints were produced.

Parameters
  • [in] type: The type of matching algorithm.

virtual void SetMMS(float value) = 0

The engine uses a multi-level matching system (MMS) to find the best candidate fingerprints. This approach adapts the accuracy by adjusting the amount of evidence collected, and clients can tune this adaptive behaviour to suit their application. The parameter controls how strongly the match algorithm adapts to the current state of the recognition. Values are in the range [0,1]:

  • A value of ‘0’ turns the adaptive behaviour off (the engine uses only the current evidence to find a match).
  • A value of ‘1’ makes the engine collect the maximum possible amount of evidence.
  • Any value in between affects the engine’s decision of whether to use additional evidence or not (adaptive behaviour).

The closer the value is to 1, the more evidence the system collects and the more processing power it needs, and vice versa.

Parameters
  • [in] value: The MMS parameter in the range [0,1].
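The range described above can be guarded on the client side. The following is a minimal sketch, not part of the engine: `ClampMMS`, `ApplyMMS` and `MmsStub` are hypothetical names introduced here, and only `SetMMS()`/`GetMMS()` come from the interface. How the engine itself reacts to out-of-range values is not stated in this reference.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in exposing only the two MMS methods documented above,
// so the sketch compiles without the engine's headers.
struct MmsStub {
    float mms = 0.0f;  // placeholder initial value, not the engine's default
    void  SetMMS(float value) { mms = value; }
    float GetMMS() const { return mms; }
};

// Clamp a requested MMS value into the documented range [0,1].
inline float ClampMMS(float requested) {
    return std::min(1.0f, std::max(0.0f, requested));
}

// Forward a clamped MMS value to any object that has a SetMMS(float) method.
template <class RecognizerT>
void ApplyMMS(RecognizerT& rec, float requested) {
    const float value = ClampMMS(requested);
    assert(value >= 0.0f && value <= 1.0f);
    rec.SetMMS(value);
}
```

With a real recognizer instance, the same helper would be called as `ApplyMMS(*recognizer, 0.75f)`.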

virtual void SetIdentificationType(eIdentificationType type) = 0

Set the type of identification to be used.

Parameters
  • [in] type: The type of identification.

virtual void SetIdentificationMode(eIdentificationMode mode) = 0

Set the classification mode used by the fuzzy classifier.

Parameters
  • [in] mode: The classification mode.

virtual void SetBinaryIdThreshold(float value) = 0

Set the threshold for the binary identification mode. The value of the threshold is in the range [0.5, 1] and the optimal value is highly application-dependent, so you need to experiment to find the right setting. The threshold is applied to an internal measure of “confidence of match”: once the confidence rises above it, the current best match is considered identified. See Audioneex::eIdentificationType for more details.

Parameters
  • [in] value: The value of the threshold in the range [0.5, 1].

virtual void SetBinaryIdMinTime(float value) = 0

Set the minimum identification time for the binary identification mode. This is the minimum period of time that must elapse before results are returned when an identification occurs; it can be used to increase the confidence of the match. By default it is set to zero, i.e. there is no minimum time and the results are returned as soon as the confidence reaches the set threshold.

Parameters
  • [in] value: The minimum identification time in seconds. Valid values are in the range [0, 20].
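As an illustration of the two binary identification settings, the sketch below checks the documented ranges ([0.5, 1] for the threshold, [0, 20] seconds for the minimum time) before applying them. `ConfigureBinaryId`, the two validators and `BinaryIdStub` are hypothetical helpers written for this example; only the setter and getter names are taken from the interface.

```cpp
#include <cassert>

// Documented ranges: threshold in [0.5, 1], minimum time in [0, 20] seconds.
inline bool IsValidBinaryIdThreshold(float v) { return v >= 0.5f && v <= 1.0f; }
inline bool IsValidBinaryIdMinTime(float s)   { return s >= 0.0f && s <= 20.0f; }

// Hypothetical stand-in for the recognizer. The initial threshold is a
// placeholder; the zero minimum time mirrors the documented default.
struct BinaryIdStub {
    float threshold = 0.5f;
    float minTime   = 0.0f;
    void  SetBinaryIdThreshold(float v) { assert(IsValidBinaryIdThreshold(v)); threshold = v; }
    void  SetBinaryIdMinTime(float s)   { assert(IsValidBinaryIdMinTime(s));   minTime = s; }
    float GetBinaryIdThreshold() const  { return threshold; }
    float GetBinaryIdMinTime() const    { return minTime; }
};

// Apply both settings only if both are within range; otherwise leave the
// recognizer untouched and report failure.
template <class RecognizerT>
bool ConfigureBinaryId(RecognizerT& rec, float threshold, float minTimeSec) {
    if (!IsValidBinaryIdThreshold(threshold) || !IsValidBinaryIdMinTime(minTimeSec))
        return false;
    rec.SetBinaryIdThreshold(threshold);
    rec.SetBinaryIdMinTime(minTimeSec);
    return true;
}
```

Rejecting invalid input instead of clamping it is a design choice of this sketch; it makes a misconfiguration visible to the caller.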

virtual void SetMaxRecordingDuration(size_t duration) = 0

Set the maximum recording duration (or its expected value) in the dataset to be fingerprinted. This value is used internally to optimize some of the data structures used during the matching process. Setting it is not mandatory, as the default value may be sufficient for most applications. However, if you see frequently recurring warning messages about reallocations occurring in the matcher, you can use this value to mitigate or avoid the issue (note that sporadic reallocations are normal).

Parameters
  • [in] duration: The max duration in seconds.
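One way to obtain the value, assuming the durations of the recordings are known in seconds, is to take the longest one and round it up to a whole second. `MaxDurationSeconds` is a hypothetical helper, not part of the API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Longest recording duration in the dataset, rounded up to whole seconds.
// Returns 0 for an empty dataset.
inline std::size_t MaxDurationSeconds(const std::vector<double>& durationsSec) {
    double longest = 0.0;
    for (double d : durationsSec) {
        assert(d >= 0.0);  // a duration cannot be negative
        longest = std::max(longest, d);
    }
    return static_cast<std::size_t>(std::ceil(longest));
}
```

The result can then be passed to `SetMaxRecordingDuration()`.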

virtual eMatchType GetMatchType() const = 0

Get the currently set match type.

Return

The currently set match type.

virtual float GetMMS() const = 0

Get the currently set MMS value.

Return

The currently set MMS.

virtual eIdentificationType GetIdentificationType() const = 0

Get the currently set identification type.

Return

The currently set identification type.

virtual eIdentificationMode GetIdentificationMode() const = 0

Get the currently set identification mode.

Return

The currently set identification mode.

virtual float GetBinaryIdThreshold() const = 0

Get the currently set binary id threshold.

Return

The currently set binary id threshold.

virtual float GetBinaryIdMinTime() const = 0

Get the currently set binary id minimum identification time.

Return

The currently set binary id minimum identification time.

virtual void Identify(const float *audio, size_t nsamples) = 0

This method is the heart of the recognition engine. Given an audio clip, it tries to match it against the reference fingerprints in the database to find the best match. It is designed and optimized for real-time audio identification, so it must be fed with short chunks of audio, generally 1-2 seconds long. If longer chunks are used, a buffer overflow with data loss will occur. Snippets shorter than 500 ms won’t be processed. The audio must be 16-bit, normalized to [-1,1], mono, and sampled at 11025 Hz. Note that this call is synchronous (i.e. blocking).

Parameters
  • [in] audio: Pointer to the buffer containing the audio samples. The engine does not take ownership of the pointer.

  • [in] nsamples: Number of samples in the buffer.
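The input requirements above can be sketched as two small helpers: one that converts signed 16-bit PCM to normalized floats, and one that feeds the audio in one-second chunks. Everything here except the `Identify(const float*, size_t)` signature is an assumption made for illustration: the names `ToNormalizedFloat`, `FeedInChunks` and `CountingStub` are hypothetical, and dividing by 32768 is just one common normalization convention, not something this reference prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSampleRate   = 11025;        // Hz, as required by Identify()
constexpr std::size_t kChunkSamples = kSampleRate;  // one second per call

// Convert signed 16-bit PCM samples to floats in [-1, 1).
// Scaling by 32768 is an assumed convention.
inline std::vector<float> ToNormalizedFloat(const std::vector<std::int16_t>& pcm) {
    std::vector<float> out;
    out.reserve(pcm.size());
    for (std::int16_t s : pcm)
        out.push_back(static_cast<float>(s) / 32768.0f);
    return out;
}

// Hypothetical stand-in for the recognizer that only counts what it receives.
struct CountingStub {
    std::size_t totalSamples = 0;
    void Identify(const float* audio, std::size_t nsamples) {
        assert(audio != nullptr);
        assert(nsamples <= 2 * kSampleRate);  // chunks should not exceed 2 s
        totalSamples += nsamples;
    }
};

// Feed mono 11025 Hz audio in chunks of at most one second.
// Returns the number of Identify() calls made.
template <class RecognizerT>
std::size_t FeedInChunks(RecognizerT& rec, const std::vector<float>& audio) {
    std::size_t calls = 0;
    for (std::size_t pos = 0; pos < audio.size(); pos += kChunkSamples) {
        const std::size_t n = std::min(kChunkSamples, audio.size() - pos);
        rec.Identify(audio.data() + pos, n);
        ++calls;
    }
    return calls;
}
```

Note that the final chunk may be shorter than 500 ms, in which case the engine will not process it on its own.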

virtual const IdMatch *GetResults() = 0

Call this method to check the current state of the identification. Usually this is done right after calling Identify().

Return

A pointer to an array of Audioneex::IdMatch structures representing the best match(es). Usually there is only one match, although ties may occur. The end of the array is marked by a ‘null’ element, which is an IdMatch structure set to zero (you can use Audioneex::IsNull(IdMatch) to check for that), so a valid array always contains at least one element. If no identification could be made, the results set contains only the null element. Once a set period of time has elapsed, the recognizer always returns a results set, which is either an array with the best match(es) or an empty set if no match was found. If the identification cannot be completed (e.g. because of insufficient audio), the returned pointer is null, so clients should always check its validity.

Note

The returned pointer is owned by the identification engine and must be neither deleted nor retained by clients.
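The three possible outcomes described above (null pointer, array holding only the null element, array with one or more matches) can be told apart with a short walk over the array. The sketch below is generic on purpose: `FakeMatch` and `FakeIsNull` are hypothetical stand-ins for Audioneex::IdMatch and Audioneex::IsNull, whose exact definitions are not given in this section, and `ResultState`, `ClassifyResults` and `CountMatches` are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>

enum class ResultState { Pending, NoMatch, Matched };

// Classify a results pointer:
//   null pointer          -> identification not completed
//   only the null element -> no match found
//   anything else         -> at least one match
template <class MatchT, class IsNullFn>
ResultState ClassifyResults(const MatchT* results, IsNullFn isNull) {
    if (results == nullptr) return ResultState::Pending;
    return isNull(results[0]) ? ResultState::NoMatch : ResultState::Matched;
}

// Count the matches that precede the terminating null element.
template <class MatchT, class IsNullFn>
std::size_t CountMatches(const MatchT* results, IsNullFn isNull) {
    if (results == nullptr) return 0;
    std::size_t n = 0;
    while (!isNull(results[n])) ++n;
    return n;
}

// Hypothetical stand-ins used only to exercise the helpers.
struct FakeMatch { unsigned id; };
inline bool FakeIsNull(const FakeMatch& m) { return m.id == 0; }
```

The array is only read, never stored or freed, in line with the ownership note above.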

virtual double GetIdentificationTime() const = 0

Get the identification time. This is the duration of the audio fed to the engine up to the point where a response is given (whether positive or negative).

Return

The time taken to perform the identification.

virtual void Flush() = 0

Flush the internal buffers. This method is mostly useful when performing off-line identifications (i.e. on fixed streams, such as files), where the length of the audio stream is finite. In this case, if the audio data is exhausted before the identification engine gives a response, this method can be called to flush any residual data in the internal buffers, after which the results can be checked again. Using this method for identifications performed on indefinitely long streams (i.e. live streams) is not recommended. The method does nothing if the recognizer has already given a response.
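The off-line flow described above might look like the following sketch: feed the stream chunk by chunk, stop as soon as a response is available, and flush only if the audio runs out first. `IdentifyOffline`, `OfflineStub` and `OfflineMatch` are hypothetical; the stub merely imitates an engine that responds after a given number of samples or after a flush, which is an assumption made so the example can run without the engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct OfflineMatch { unsigned id; };  // hypothetical stand-in for IdMatch

// Hypothetical stand-in for the recognizer: it "responds" once it has
// received `needed` samples, or once Flush() has been called.
struct OfflineStub {
    std::size_t  needed;
    std::size_t  received = 0;
    bool         flushed  = false;
    OfflineMatch results[2] = {{42}, {0}};  // one match plus the null element

    explicit OfflineStub(std::size_t n) : needed(n) {}
    void Identify(const float*, std::size_t n) { received += n; }
    void Flush() { flushed = true; }
    const OfflineMatch* GetResults() const {
        return (received >= needed || flushed) ? results : nullptr;
    }
};

// Feed a finite stream in chunks. Return as soon as results are available;
// if the audio is exhausted first, flush and check one more time.
// The returned pointer may still be null and must be checked by the caller.
template <class RecognizerT>
auto IdentifyOffline(RecognizerT& rec, const std::vector<float>& audio,
                     std::size_t chunk) -> decltype(rec.GetResults()) {
    assert(chunk > 0);
    for (std::size_t pos = 0; pos < audio.size(); pos += chunk) {
        const std::size_t n = std::min(chunk, audio.size() - pos);
        rec.Identify(audio.data() + pos, n);
        if (auto results = rec.GetResults()) return results;
    }
    rec.Flush();
    return rec.GetResults();
}
```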

virtual void Reset() = 0

Reset the recognizer’s internal state. After identification results are produced the recognizer must be reset to start a new identification session using the same instance.

Warning

Always reset the recognizer when the same instance is reused for different recognitions. Not doing so may produce undefined behavior.
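One way to make the reset hard to forget is a small scope guard that calls `Reset()` when an identification session ends. This is a sketch of a client-side pattern, not something the library provides: `SessionGuard` and `ResetStub` are hypothetical names, and the guard assumes that `Reset()` does not throw.

```cpp
#include <cassert>

// Calls Reset() on the wrapped recognizer when the guard goes out of scope,
// so every session leaves the instance ready for the next one.
template <class RecognizerT>
class SessionGuard {
public:
    explicit SessionGuard(RecognizerT& rec) : rec_(rec) {}
    ~SessionGuard() { rec_.Reset(); }

    SessionGuard(const SessionGuard&) = delete;
    SessionGuard& operator=(const SessionGuard&) = delete;

private:
    RecognizerT& rec_;
};

// Hypothetical stand-in that only counts how often it was reset.
struct ResetStub {
    int resets = 0;
    void Reset() { ++resets; }
};
```

A session would then be wrapped in a block that creates a `SessionGuard` first and performs the `Identify()`/`GetResults()` calls inside it.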

virtual void SetDataStore(DataStore *dstore) = 0

Set the data store to be used for data I/O.

Parameters
  • [in] dstore: A pointer to a data store implementation.

virtual DataStore *GetDataStore() const = 0

Get the currently set data store.

Return

The currently set data store.

Public Static Functions

static Recognizer *Create()

Create an instance of the recognizer.