music21.audioSearch

Base routines used throughout audio searching and score-following.

Requires numpy, scipy, and matplotlib.

Functions

music21.audioSearch.autocorrelationFunction(recordedSignal, recordSampleRate)

Converts a signal from the time domain to the frequency domain. To do so, it uses the autocorrelation function, which finds periodicities in the time-domain signal and, from them, estimates the frequency at each instant of time.

>>> import wave
>>> import os
>>> import numpy  # you need to have numpy, scipy, and matplotlib installed to use this
>>> wv = wave.open(common.getSourceFilePath() + os.path.sep + 'audioSearch' + os.path.sep + 'test_audio.wav', 'r')
>>> data = wv.readframes(1024)
>>> samps = numpy.frombuffer(data, dtype=numpy.int16)
>>> finalResult = audioSearch.autocorrelationFunction(samps, 44100)
>>> wv.close()
>>> print(finalResult)
143.6276...
music21.audioSearch.decisionProcess(partsList, notePrediction, beginningData, lastNotePosition, countdown, firstNotePage=None, lastNotePage=None)

It decides which of the given parts of the score best matches the recorded part of the song. If no part of the score has a high probability of being the correct one, it starts a “countdown” in order to stop the score following if the poor matching persists. In that case, it does not match the recorded part of the song with any part of the score.

Inputs: partsList contains all the candidate parts of the score, sorted from the highest probability of being the best match to the lowest. notePrediction is the position in the score at which the next note should start. beginningData is a list of the beginnings of the score fragments used to find the best match. lastNotePosition is the position in the score at which the last matched fragment ends. countdown is a counter of consecutive errors in the matching process.

Outputs: it returns the beginning of the best-matching fragment of the score and the countdown.

>>> scNotes = corpus.parse('luca/gloria').parts[0].flat.notes
>>> scoreStream = scNotes
>>> freqFromAQList = audioSearch.getFrequenciesFromAudioFile(waveFilename='test_audio.wav')
>>> detectedPitchesFreq = audioSearch.detectPitchFrequencies(freqFromAQList, useScale=scale.ChromaticScale('C4'))
>>> detectedPitchesFreq = audioSearch.smoothFrequencies(detectedPitchesFreq)
>>> (detectedPitchObjects, listplot) = audioSearch.pitchFrequenciesToObjects(detectedPitchesFreq, useScale=scale.ChromaticScale('C4'))
>>> (notesList, durationList) = audioSearch.joinConsecutiveIdenticalPitches(detectedPitchObjects)
>>> transcribedScore, qle = audioSearch.notesAndDurationsToStream(notesList, durationList, scNotes=scNotes, qle=None)
>>> hop = 6
>>> tn_recording = 24
>>> totScores = []
>>> beginningData = []
>>> lengthData = []
>>> for i in range(4):
...     scNotes = scoreStream[i * hop + 1:i * hop + tn_recording + 1]
...     name = "%d" % i
...     beginningData.append(i * hop + 1)
...     lengthData.append(tn_recording)
...     scNotes.id = name
...     totScores.append(scNotes)
>>> listOfParts = search.approximateNoteSearch(transcribedScore.flat.notes, totScores)
>>> notePrediction = 0
>>> lastNotePosition = 0
>>> countdown = 0
>>> positionInList, countdown = audioSearch.decisionProcess(listOfParts, notePrediction, beginningData, lastNotePosition, countdown)
>>> print(positionInList)
0
>>> print(countdown) # the result is 1 because the song used is completely different from the score!!
1
music21.audioSearch.detectPitchFrequencies(freqFromAQList, useScale=None)

It detects the pitches of the notes from a list of frequencies, using thresholds that depend on the useScale option. If useScale is None, the default is the major scale beginning on C4.

>>> freqFromAQList = [143.627689055, 99.0835452019, 211.004784689, 4700.31347962, 2197.9431119]
>>> pitchesList = audioSearch.detectPitchFrequencies(freqFromAQList, useScale=scale.MajorScale('C4'))
>>> for i in range(5):
...     print(int(round(pitchesList[i])))
147
98
220
4699
2093
music21.audioSearch.getFrequenciesFromAudioFile(waveFilename='xmas.wav')

gets a list of frequencies from a complete audio file.

>>> import os
>>> readPath = common.getSourceFilePath() + os.path.sep + 'audioSearch' + os.path.sep + 'test_audio.wav'
>>> freq = audioSearch.getFrequenciesFromAudioFile(waveFilename=readPath)
>>> print(freq)
[143.627689055..., 99.083545201..., 211.004784688..., 4700.313479623..., ...]
music21.audioSearch.getFrequenciesFromMicrophone(length=10.0, storeWaveFilename=None)

records a set of frequencies from the microphone for length seconds.

If storeWaveFilename is not None, then it will store the recording on disk in a wave file.

Returns a list of frequencies detected.

TODO – find a way to test... or at least demo
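
A minimal usage sketch (not run as a doctest, since it needs a working microphone and the optional audio-recording dependencies; 'capture.wav' is just an arbitrary example file name):

>>> # assumes a working microphone; 'capture.wav' is an arbitrary example path
>>> freqs = audioSearch.getFrequenciesFromMicrophone(length=5.0, storeWaveFilename='capture.wav')  # doctest: +SKIP
>>> len(freqs) > 0  # doctest: +SKIP
True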

music21.audioSearch.getFrequenciesFromPartialAudioFile(waveFilenameOrHandle='temp', length=10.0, startSample=0)

It calculates the fundamental frequency at every instant of time for an audio signal taken either from the microphone or from an already recorded song. The analysis covers a period of time defined by the variable “length”, in seconds.

It returns a list with the frequencies, a variable with the file descriptor, and the end sample position.

>>> readFile = 'pachelbel.wav'
>>> frequencyList, pachelbelFileHandle, currentSample  = audioSearch.getFrequenciesFromPartialAudioFile(readFile, length=1.0)
>>> for i in range(5):
...     print(frequencyList[i])
143.627689055
99.0835452019
211.004784689
4700.31347962
767.827403482
>>> print(currentSample)  # should be near 44100, but probably not exact
44032

Now read the next 1 second...

>>> frequencyList, pachelbelFileHandle, currentSample  = audioSearch.getFrequenciesFromPartialAudioFile(pachelbelFileHandle, length=1.0, startSample = currentSample)
>>> for i in range(5):
...     print(frequencyList[i])
187.798213268
238.263483185
409.700397349
149.958733396
101.989786226
>>> print(currentSample)  # should be exactly double the previous
88064
music21.audioSearch.histogram(data, bins)

Partitions the list in data into the number of bins given by bins and returns the number of elements in each bin, along with a list of bins + 1 boundary values in which the first element (0) is the start of the first bin, the last element (-1) is the end of the last bin, and every remaining element (i) is the dividing point between one bin and the next.

>>> data = [1, 1, 4, 5, 6, 0, 8, 8, 8, 8, 8]
>>> outputData, bins = audioSearch.histogram(data, 8)
>>> print(outputData)
[3, 0, 0, 1, 1, 1, 0, 5]
>>> print([int(b) for b in bins])
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> outputData, bins = audioSearch.histogram(data, 4)
>>> print(outputData)
[3, 1, 2, 5]
>>> print([int(b) for b in bins])
[0, 2, 4, 6, 8]
music21.audioSearch.interpolation(correlation, peak)

Interpolation for estimating the true position of an inter-sample maximum when nearby samples are known.

Correlation is a vector and peak is an index for that vector.

Returns the x coordinate of the vertex of the parabola fit through the peak and its neighboring samples.

>>> import numpy
>>> f = [2, 3, 1, 6, 4, 2, 3, 1]
>>> audioSearch.interpolation(f, numpy.argmax(f))
3.21428571...
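
The value above is consistent with the standard three-point parabolic peak-interpolation formula; the lines below only illustrate that formula and are not necessarily the exact internal implementation:

>>> peak = int(numpy.argmax(f))
>>> # vertex of the parabola through (peak-1, f[peak-1]), (peak, f[peak]), (peak+1, f[peak+1])
>>> peak + 0.5 * (f[peak - 1] - f[peak + 1]) / (f[peak - 1] - 2 * f[peak] + f[peak + 1])
3.21428571...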
music21.audioSearch.joinConsecutiveIdenticalPitches(detectedPitchObjects)

takes a list of equally-spaced Pitch objects and returns a tuple of two lists: the first is a list of Note or Rest objects (each of quarterLength 1.0), and the second is a list of how many consecutive identical pitches were joined together to make each of those objects.

N.B. the returned list is NOT a Stream.

>>> import os
>>> readPath = common.getSourceFilePath() + os.path.sep + 'audioSearch' + os.path.sep + 'test_audio.wav'
>>> freqFromAQList = audioSearch.getFrequenciesFromAudioFile(waveFilename=readPath)
>>> detectedPitchesFreq = audioSearch.detectPitchFrequencies(freqFromAQList, useScale=scale.ChromaticScale('C4'))
>>> detectedPitchesFreq = audioSearch.smoothFrequencies(detectedPitchesFreq)
>>> (detectedPitchObjects, listplot) = audioSearch.pitchFrequenciesToObjects(detectedPitchesFreq, useScale=scale.ChromaticScale('C4'))
>>> (notesList, durationList) = audioSearch.joinConsecutiveIdenticalPitches(detectedPitchObjects)
>>> print(notesList)
[<music21.note.Rest rest>, <music21.note.Note C>, <music21.note.Note C>, <music21.note.Note D>, <music21.note.Note E>, <music21.note.Note F>, <music21.note.Note G>, <music21.note.Note A>, <music21.note.Note B>, <music21.note.Note C>, ...]
>>> print(durationList)
[71, 6, 14, 23, 34, 40, 27, 36, 35, 15, 17, 15, 6, 33, 22, 13, 16, 39, 35, 38, 27, 27, 26, 8]
music21.audioSearch.normalizeInputFrequency(inputPitchFrequency, thresholds=None, pitches=None)

Takes in an inputFrequency, a set of threshold values, and a set of allowable pitches (given by prepareThresholds) and returns a tuple of the normalized frequency and the pitch detected (as a Pitch object).

It will convert the frequency to be within the range of the default frequencies (usually C4 to C5) but the pitch object will have the correct octave.

>>> audioSearch.normalizeInputFrequency(441.72)
(440.0, <music21.pitch.Pitch A4>)

If you will be doing this often, it’s best to cache your thresholds and pitches by running prepareThresholds once first:

>>> thresholds, pitches = audioSearch.prepareThresholds(scale.ChromaticScale('C4'))
>>> for fq in [450, 510, 550, 600]:
...      print(audioSearch.normalizeInputFrequency(fq, thresholds, pitches))
(440.0, <music21.pitch.Pitch A4>)
(523.25113..., <music21.pitch.Pitch C5>)
(277.18263..., <music21.pitch.Pitch C#5>)
(293.66476..., <music21.pitch.Pitch D5>)
music21.audioSearch.notesAndDurationsToStream(notesList, durationList, scNotes=None, removeRestsAtBeginning=True, qle=None)

takes a list of Note or Rest objects and an equally long list of how long each one lasts in terms of samples, and returns a Stream using the information from quarterLengthEstimation and quantizeDuration.

returns a Score object, containing a metadata object and a single Part object, which in turn contains the notes, etc. Does not run makeNotation() on the Score.

>>> durationList = [20, 19, 10, 30, 6, 21]
>>> n = note.Note
>>> noteList = [n('C#4'), n('D5'), n('B4'), n('F#5'), n('C5'), note.Rest()]
>>> s,lengthPart = audioSearch.notesAndDurationsToStream(noteList, durationList)
>>> s.show('text')
{0.0} <music21.metadata.Metadata object at ...>
{0.0} <music21.stream.Part ...>
    {0.0} <music21.note.Note C#>
    {1.0} <music21.note.Note D>
    {2.0} <music21.note.Note B>
    {2.5} <music21.note.Note F#>
    {4.0} <music21.note.Note C>
    {4.25} <music21.note.Rest rest>
music21.audioSearch.pitchFrequenciesToObjects(detectedPitchesFreq, useScale=None)

Takes in a list of detected pitch frequencies and returns a tuple where the first element is a list of music21.pitch.Pitch objects that best match these frequencies and the second element is a list of the frequencies of those objects, which can be plotted with matplotlib.

To-do: only return the former. The latter can be generated in other ways.

>>> import os
>>> readPath = common.getSourceFilePath() + os.path.sep + 'audioSearch' + os.path.sep + 'test_audio.wav'
>>> freqFromAQList = audioSearch.getFrequenciesFromAudioFile(waveFilename=readPath)
>>> detectedPitchesFreq = audioSearch.detectPitchFrequencies(freqFromAQList, useScale=scale.ChromaticScale('C4'))
>>> detectedPitchesFreq = audioSearch.smoothFrequencies(detectedPitchesFreq)
>>> (detectedPitchObjects, listplot) = audioSearch.pitchFrequenciesToObjects(detectedPitchesFreq, useScale=scale.ChromaticScale('C4'))
>>> [str(p) for p in detectedPitchObjects]
['A5', 'A5', 'A6', 'D6', 'D4', 'B4', 'A4', 'F4', 'E-4', 'C#3', 'B3', 'B3', 'B3', 'A3', 'G3',...]
music21.audioSearch.prepareThresholds(useScale=None)

returns two elements. The first is a list of threshold values for one octave of a given scale, useScale, including the octave repetition (the default is a ChromaticScale). The second is the list of pitches of the scale.

A threshold value is the fractional part of the log-base-2 value of the frequency.

For instance, if A = 440 and B-flat = 460, then the threshold between A and B-flat will be 450. Notes below 450 should be considered As and those above 450 should be considered B-flats.

Thus the list returned has one fewer element than the number of notes in the scale plus the octave repetition. If useScale is a ChromaticScale, prepareThresholds will return a 12-element list. If it's a diatonic scale, it'll have 7 elements.
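
As a quick illustration of the “fractional part of the log-base-2 value” idea (standard-library math only; this snippet is not part of prepareThresholds itself), frequencies an octave apart share the same fractional part, which is why one octave of thresholds suffices:

>>> import math
>>> math.log2(440.0) % 1.0  # A4
0.78135...
>>> math.log2(880.0) % 1.0  # A5, one octave higher: same fractional part
0.78135...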

>>> l, p = audioSearch.prepareThresholds(scale.MajorScale('A3'))
>>> for i in range(len(l)):
...    print("%s < %.2f < %s" % (p[i], l[i], p[i+1]))
A3 < 0.86 < B3
B3 < 0.53 < C#4
C#4 < 0.16 < D4
D4 < 0.28 < E4
E4 < 0.45 < F#4
F#4 < 0.61 < G#4
G#4 < 1.24 < A4
music21.audioSearch.quantizeDuration(length)

round an approximately transcribed quarterLength to a better one in music21.

Should be replaced by a full-featured routine in midi or stream.

See quantize() for more information on the standard music21 methodology.

>>> audioSearch.quantizeDuration(1.01)
1.0
>>> audioSearch.quantizeDuration(1.70)
1.5
music21.audioSearch.quarterLengthEstimation(durationList, mostRepeatedQuarterLength=1.0)

takes a list of lengths of notes (measured in audio samples) and tries to estimate what the length of a quarter note should be in this list.

If mostRepeatedQuarterLength is another number, it still returns the estimated length of a quarter note, but chooses it so that the most common duration in durationList corresponds to a note of that quarterLength instead. See example 2:

Returns a float – and not an int.

>>> durationList = [20, 19, 10, 30, 6, 21]
>>> audioSearch.quarterLengthEstimation(durationList)
20.625

Example 2: suppose these are the inputted durations for a score where most of the notes are half notes. Show how long a quarter note should be:

>>> audioSearch.quarterLengthEstimation(durationList, mostRepeatedQuarterLength = 2.0)
10.3125
music21.audioSearch.smoothFrequencies(detectedPitchesFreq, smoothLevels=7, inPlace=True)

It smooths the shape of the signal in order to avoid false detections of the fundamental frequency.

>>> inputPitches=[440, 440, 440, 440, 442, 443, 441, 470, 440, 441, 440, 442, 440, 440, 440, 397, 440, 440, 440, 442, 443, 441, 440, 440, 440, 440, 440, 442, 443, 441, 440, 440]
>>> result = audioSearch.smoothFrequencies(inputPitches)
>>> print(result)
[441, 441, 441, 441, 446, 446, 446, 447, 443, 443, 442, 441, 435, 434, 432, 431, 437, 438, 439, 440, 440, 440, 440, 440, 440, 441, 441, 441, 441, 441, 441, 441]