Returns a tuple containing ALL currently implemented feature extractors. The first element of the tuple holds the jSymbolic vectors, and the second the native vectors. Vectors are NOT nested.

streamInput can be a Stream, DataInstance, or path to a corpus or local file.

>>> s = converter.parse('tinynotation: 4/4 c4 d e2')
>>> f = features.allFeaturesAsList(s)
>>> f[1][0:3]
[[1], [0.689999...], [2]]
>>> len(f[0]) > 65
True
>>> len(f[1]) > 20
True
music21.features.base.extractorById(idOrList, library=('jSymbolic', 'native'))

Get the first feature matched by extractorsById().

>>> s = stream.Stream()
>>> s.append(note.Note('A4'))
>>> fe = features.extractorById('p20')(s) # call class
>>> fe.extract().vector
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
music21.features.base.extractorsById(idOrList, library=('jSymbolic', 'native'))

Given one or more FeatureExtractor ids, return the appropriate subclass. An optional library argument can be added to define which module is used. Current options are jSymbolic and native.

>>> features.extractorsById('p20')
[<class 'music21.features.jSymbolic.PitchClassDistributionFeature'>]
>>> [x.id for x in features.extractorsById('p20')]
['P20']
>>> [x.id for x in features.extractorsById(['p19', 'p20'])]
['P19', 'P20']

Normalizes case...

>>> [x.id for x in features.extractorsById(['r31', 'r32', 'r33', 'r34', 'r35', 'p1', 'p2'])]
['R31', 'R32', 'R33', 'R34', 'R35', 'P1', 'P2']

Get all feature extractors from all libraries

>>> y = [x.id for x in features.extractorsById('all')]
>>> y[0:3], y[-3:-1]
(['M1', 'M2', 'M3'], ['MD1', 'MC1'])
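
The lookup behaviour shown in these examples (a single id or a list of ids, case-insensitive matching, and the special value 'all') can be sketched in plain Python. The registry and the placeholder classes below are hypothetical stand-ins chosen for illustration; this is not music21's implementation.

```python
# Hypothetical stand-ins for extractor classes; each carries an upper-case id.
class P19:
    id = 'P19'


class P20:
    id = 'P20'


class R31:
    id = 'R31'


# Assumed registry and ordering, for illustration only.
REGISTRY = [P19, P20, R31]


def lookup_by_id(id_or_list):
    """Return the registered classes whose id matches, ignoring case."""
    if id_or_list == 'all':
        return list(REGISTRY)
    if isinstance(id_or_list, str):
        id_or_list = [id_or_list]
    wanted = [item.upper() for item in id_or_list]
    # Results follow registry order, not the order of the request.
    return [cls for cls in REGISTRY if cls.id in wanted]


print([cls.id for cls in lookup_by_id('p20')])           # ['P20']
print([cls.id for cls in lookup_by_id(['r31', 'p19'])])  # ['P19', 'R31']
```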
music21.features.base.getIndex(featureString, extractorType=None)

Returns the list index of the given feature extractor and the feature extractor category (jsymbolic or native). If the feature extractor string is not found among either the jsymbolic or the native feature extractors, returns None.

Optionally include the extractorType ('jsymbolic' or 'native') if it is known; this makes the search more efficient.

>>> features.getIndex('Range')
(59, 'jsymbolic')
>>> features.getIndex('Ends With Landini Melodic Contour')
(19, 'native')
>>> features.getIndex('abrandnewfeature!')
>>> features.getIndex('Fifths Pitch Histogram', 'jsymbolic')
(68, 'jsymbolic')
>>> features.getIndex('Tonal Certainty', 'native')
(1, 'native')
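
A minimal sketch of why passing the extractor type helps: when the library is known, only one list has to be scanned. The shortened name lists and the helper name get_index are assumptions made for this example, so the indices differ from the real ones above.

```python
# Hypothetical, shortened name lists; the real libraries are much longer.
LIBRARIES = {
    'jsymbolic': ['Range', 'Fifths Pitch Histogram'],
    'native': ['Quality', 'Tonal Certainty'],
}


def get_index(feature_name, extractor_type=None):
    """Return (index, library) for feature_name, or None if it is unknown."""
    # With a known library, only that one list is searched.
    to_search = [extractor_type] if extractor_type else list(LIBRARIES)
    for library in to_search:
        names = LIBRARIES[library]
        if feature_name in names:
            return names.index(feature_name), library
    return None


print(get_index('Tonal Certainty'))     # (1, 'native')
print(get_index('Range', 'jsymbolic'))  # (0, 'jsymbolic')
print(get_index('abrandnewfeature!'))   # None
```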
music21.features.base.vectorById(streamObj, vectorId, library=('jSymbolic', 'native'))

Utility function to get a vector from an extractor

>>> s = stream.Stream()
>>> s.append(note.Note('A4'))
>>> features.vectorById(s, 'p20')
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]


class music21.features.base.FeatureExtractor(dataOrStream=None, *arguments, **keywords)

A model of a process that extracts a feature from a music21 Stream. The main public interface is the extract() method.

The extractor can be passed a Stream or a reference to a DataInstance. All Streams are internally converted to a DataInstance if necessary. Usage of a DataInstance offers significant performance advantages, as common forms of the Stream are cached for easy processing.

FeatureExtractor methods


Extract the feature and return the result.


Return a list of strings in a form that is appropriate for data storage.

>>> fe = features.jSymbolic.AmountOfArpeggiationFeature()
>>> fe.getAttributeLabels()
['Amount_of_Arpeggiation']
>>> fe = features.jSymbolic.FifthsPitchHistogramFeature()
>>> fe.getAttributeLabels()
['Fifths_Pitch_Histogram_0', 'Fifths_Pitch_Histogram_1', 'Fifths_Pitch_Histogram_2',
 'Fifths_Pitch_Histogram_3', 'Fifths_Pitch_Histogram_4', 'Fifths_Pitch_Histogram_5',
 'Fifths_Pitch_Histogram_6', 'Fifths_Pitch_Histogram_7', 'Fifths_Pitch_Histogram_8',
 'Fifths_Pitch_Histogram_9', 'Fifths_Pitch_Histogram_10', 'Fifths_Pitch_Histogram_11']
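
The label pattern visible in these outputs (spaces replaced by underscores, and a numeric suffix for multi-dimensional features) can be reproduced with a few lines of Python. This is a sketch inferred from the examples in this document; the function name attribute_labels and its parameters are hypothetical.

```python
def attribute_labels(name, dimensions):
    """Build storage-friendly labels from a feature name and its dimension count."""
    base = name.replace(' ', '_')
    if dimensions == 1:
        # One-dimensional features get a single, unsuffixed label.
        return [base]
    return [f'{base}_{i}' for i in range(dimensions)]


print(attribute_labels('Changes of Meter', 1))
# ['Changes_of_Meter']
print(attribute_labels('Fifths Pitch Histogram', 12)[:2])
# ['Fifths_Pitch_Histogram_0', 'Fifths_Pitch_Histogram_1']
```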

Return a properly configured plain feature as a placeholder.

>>> from music21 import features
>>> fe = features.jSymbolic.InitialTimeSignatureFeature()
>>> fe.getBlankFeature().vector
[0, 0]

Set the data that this FeatureExtractor will process. Either a Stream or a DataInstance object can be provided.


class music21.features.base.DataInstance(streamObj=None, id=None)

A data instance for analysis. This object prepares a Stream (by stripping ties, etc.) and stores multiple commonly-used stream representations once, providing rapid processing.

DataInstance methods

DataInstance.setClassLabel(classLabel, classValue=None)

Set the class label, as well as the class value if known. The class label is the attribute name used to define the class of this data instance.

>>> s = corpus.parse('bwv66.6')
>>> di = features.DataInstance(s)
>>> di.setClassLabel('Composer', 'Bach')


class music21.features.base.DataSet(classLabel=None, featureExtractors=())

A set of features, as well as a collection of data to operate on.

A DataSet comprises multiple DataInstance objects, a FeatureSet, and an OutputFormat.

>>> ds = features.DataSet(classLabel='Composer')
>>> f = [features.jSymbolic.PitchClassDistributionFeature,
...      features.jSymbolic.ChangesOfMeterFeature,
...      features.jSymbolic.InitialTimeSignatureFeature]
>>> ds.addFeatureExtractors(f)
>>> ds.addData('bwv66.6', classValue='Bach')
>>> ds.addData('bach/bwv324.xml', classValue='Bach')
>>> ds.process()
>>> ds.getFeaturesAsList()[0]
['bwv66.6', 0.0, 1.0, 0.375, 0.03125, 0.5, 0.1875, 0.90625, 0.0, 0.4375,
 0.6875, 0.09375, 0.875, 0, 4, 4, 'Bach']
>>> ds.getFeaturesAsList()[1]
['bach/bwv324.xml', 0.12, 0.0, 1.0, 0.12, 0.56..., 0.0, ..., 0.52...,
 0.0, 0.68..., 0.0, 0.56..., 0, 4, 4, 'Bach']
>>> dsString = ds.getString()

By default, all exceptions are caught and printed if debug mode is on.

Set ds.failFast = True to not catch them.

Set ds.quiet = False to print them regardless of debug mode.
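
The two switches can be pictured with a small processing loop. This is only a sketch of the general pattern: the names process_all, fail_fast and quiet are hypothetical, the debug-mode check is left out, and storing None for a failed item is an assumption made for the example.

```python
def process_all(items, extract, fail_fast=False, quiet=True):
    """Apply extract to every item, handling errors according to the flags."""
    results = []
    for item in items:
        try:
            results.append(extract(item))
        except Exception as error:
            if fail_fast:
                # Do not swallow the error: let the caller see it at once.
                raise
            if not quiet:
                print(f'extraction failed for {item!r}: {error}')
            # Assumed placeholder for a failed item.
            results.append(None)
    return results


values = process_all([1, 0, 2], lambda x: 1 / x)
print(values)  # [1.0, None, 0.5]
```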

DataSet methods

DataSet.addData(dataOrStreamOrPath, classValue=None, id=None)

Add a Stream, DataInstance, or path to a corpus or local file to this data set.

The class value passed here is assumed to be the same as the classLabel assigned at startup.


Add one or more FeatureExtractor objects, either as a list or as an individual object.

DataSet.getAttributeLabels(includeClassLabel=True, includeId=True)

Return a list of all attribute labels. Optionally add a class label field and/or an id field.

>>> f = [features.jSymbolic.PitchClassDistributionFeature,
...      features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet(classLabel='Composer', featureExtractors=f)
>>> ds.getAttributeLabels(includeId=False)

Return column labels indicating which column holds the class definition.

>>> f = [features.jSymbolic.PitchClassDistributionFeature,
...      features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet(classLabel='Composer', featureExtractors=f)
>>> ds.getClassPositionLabels()
[None, False, False, False, False, False, False, False, False,
 False, False, False, False, False, True]
DataSet.getDiscreteLabels(includeClassLabel=True, includeId=True)

Return column labels for discrete status.

>>> f = [features.jSymbolic.PitchClassDistributionFeature,
...      features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet(classLabel='Composer', featureExtractors=f)
>>> ds.getDiscreteLabels()
[None, False, False, False, False, False, False, False, False, False,
 False, False, False, True, True]
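
Both label lists above have one entry per output column: None for the identifier, one flag per feature dimension, and a final entry for the class column. A minimal sketch that reproduces those two outputs follows; the tuple layout and the function names are hypothetical, not the library's data model.

```python
# (label, dimensions, is_discrete) for each extractor; assumed layout.
EXTRACTORS = [
    ('Pitch_Class_Distribution', 12, False),
    ('Changes_of_Meter', 1, True),
]


def discrete_labels(extractors, include_id=True, include_class=True):
    """One flag per column: is the column discrete?"""
    flags = [None] if include_id else []
    for _name, dimensions, discrete in extractors:
        flags.extend([discrete] * dimensions)
    if include_class:
        flags.append(True)  # the class column is treated as discrete
    return flags


def class_position_labels(extractors, include_id=True):
    """One flag per column: is this the class column?"""
    flags = [None] if include_id else []
    for _name, dimensions, _discrete in extractors:
        flags.extend([False] * dimensions)
    flags.append(True)
    return flags


print(discrete_labels(EXTRACTORS))
print(class_position_labels(EXTRACTORS))
```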
DataSet.getFeaturesAsList(includeClassLabel=True, includeId=True, concatenateLists=True)

Get processed data as a list of lists, merging any sub-lists in multi-dimensional features.
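
Merging sub-lists means that each multi-dimensional vector is spliced into the row rather than kept as a nested list. The sketch below shows the idea on hand-made values; features_as_row and its parameters are hypothetical names, and the numbers are illustrative rather than real extractor output.

```python
def features_as_row(identifier, vectors, class_value=None, concatenate=True):
    """Build one output row from an id, per-extractor vectors and a class value."""
    row = [identifier]
    if concatenate:
        for vector in vectors:
            row.extend(vector)  # splice each vector into the flat row
    else:
        row.extend(vectors)     # keep each vector as a nested list
    if class_value is not None:
        row.append(class_value)
    return row


flat = features_as_row('bwv66.6', [[0.0, 1.0], [0], [4, 4]], 'Bach')
nested = features_as_row('bwv66.6', [[0.0, 1.0], [0], [4, 4]], 'Bach',
                         concatenate=False)
print(flat)    # ['bwv66.6', 0.0, 1.0, 0, 4, 4, 'Bach']
print(nested)  # ['bwv66.6', [0.0, 1.0], [0], [4, 4], 'Bach']
```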


Get a string representation of the data set in a specific format.


Return a list of unique class values.


Process all Data with all FeatureExtractors. Processed data is stored internally as numerous Feature objects.

DataSet.write(fp=None, format=None, includeClassLabel=True)

Set the output format object.


class music21.features.base.Feature

An object representation of a feature, capable of presentation in a variety of formats, and returned from FeatureExtractor objects.

Feature objects are simple. It is FeatureExtractors that store all metadata and processing routines for creating Feature objects.

Feature methods


Normalize the vector between 0 and 1, assuming there is more than one value.
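
One plausible reading of this description is scaling by the largest value, which maps non-negative vectors into the range 0 to 1 and leaves single-value vectors untouched. The sketch below implements that reading only; it is an assumption and may not match the library's exact method.

```python
def normalize(vector):
    """Scale a non-negative vector so that its largest value becomes 1.0."""
    if len(vector) < 2:
        # A single value carries no relative information; leave it alone.
        return list(vector)
    peak = max(vector)
    if peak == 0:
        # Avoid dividing by zero for an all-zero vector.
        return list(vector)
    return [value / peak for value in vector]


print(normalize([2, 4, 8]))  # [0.25, 0.5, 1.0]
print(normalize([7]))        # [7]
```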


Prepare the vector stored in this feature.


class music21.features.base.OutputARFF(dataSet=None)

An ARFF (Attribute-Relation File Format) file.

See for more details

>>> oa = features.OutputARFF()
>>> oa._ext
'.arff'

OutputARFF bases

OutputARFF methods

OutputARFF.getHeaderLines(includeClassLabel=True, includeId=True)

Get the header as a list of lines.

>>> f = [features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet(classLabel='Composer')
>>> ds.addFeatureExtractors(f)
>>> of = features.OutputARFF(ds)
>>> for x in of.getHeaderLines(): print(x)
@RELATION Composer
@ATTRIBUTE class {}
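
For orientation, an ARFF header in general consists of a relation line, one attribute line per column, and a data marker. The sketch below assembles such a header by hand; the function arff_header and its arguments are hypothetical, and it illustrates the file format rather than the exact lines OutputARFF emits.

```python
def arff_header(relation, attributes, class_values):
    """Return ARFF header lines for the given (name, type) attribute pairs."""
    lines = [f'@RELATION {relation}']
    for name, kind in attributes:
        lines.append(f'@ATTRIBUTE {name} {kind}')
    # Nominal class attribute: the allowed values are listed in braces.
    lines.append('@ATTRIBUTE class {' + ','.join(class_values) + '}')
    lines.append('@DATA')
    return lines


header = arff_header('Composer', [('Changes_of_Meter', 'NUMERIC')], ['Bach'])
for line in header:
    print(line)
```

With an empty list of class values the class line becomes `@ATTRIBUTE class {}`, as in the doctest above, where no data has been added yet.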
OutputARFF.getString(includeClassLabel=True, includeId=True, lineBreak=None)

Methods inherited from OutputFormat:


class music21.features.base.OutputCSV(dataSet=None)

Comma-separated value list.

OutputCSV bases

OutputCSV methods

OutputCSV.getHeaderLines(includeClassLabel=True, includeId=True)

Get the header as a list of lines.

>>> f = [features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet(classLabel='Composer')
>>> ds.addFeatureExtractors(f)
>>> of = features.OutputCSV(ds)
>>> of.getHeaderLines()[0]
['Identifier', 'Changes_of_Meter', 'Composer']
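
A header row like this one, followed by one row per data instance, is all a CSV file needs. As a sketch of that layout using only the standard library (the helper rows_to_csv and the sample data row are assumptions, not output from music21):

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows to a CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


text = rows_to_csv(
    ['Identifier', 'Changes_of_Meter', 'Composer'],
    [['bwv66.6', 0, 'Bach']],  # illustrative values
)
print(text)
```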
OutputCSV.getString(includeClassLabel=True, includeId=True, lineBreak=None)

Methods inherited from OutputFormat:


class music21.features.base.OutputFormat(dataSet=None)

Provide output for a DataSet, passed as an initial argument.

OutputFormat methods


Get the header as a list of lines.

OutputFormat.write(fp=None, includeClassLabel=True, includeId=True)

Write the file. If no file path is given, a temporary file will be written.
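
The fallback to a temporary file can be sketched with the standard tempfile module. The function write_text, its suffix parameter, and the choice to return the path are assumptions for this example and do not describe the library's internals.

```python
import os
import tempfile


def write_text(text, fp=None, suffix='.txt'):
    """Write text to fp, or to a new temporary file if fp is None."""
    if fp is None:
        # mkstemp returns an open OS-level handle and the path; close the
        # handle so the file can be reopened normally below.
        handle, fp = tempfile.mkstemp(suffix=suffix)
        os.close(handle)
    with open(fp, 'w', encoding='utf-8') as out:
        out.write(text)
    return fp


path = write_text('Identifier,Composer\n', suffix='.csv')
print(path.endswith('.csv'))  # True
```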


class music21.features.base.OutputTabOrange(dataSet=None)

Tab-delimited file format used with Orange.

For more information, see:

OutputTabOrange bases

OutputTabOrange methods

OutputTabOrange.getHeaderLines(includeClassLabel=True, includeId=True)

Get the header as a list of lines.

>>> f = [features.jSymbolic.ChangesOfMeterFeature]
>>> ds = features.DataSet()
>>> ds.addFeatureExtractors(f)
>>> of = features.OutputTabOrange(ds)
>>> for x in of.getHeaderLines(): print(x)
['Identifier', 'Changes_of_Meter']
['string', 'discrete']
['meta', '']
>>> ds = features.DataSet(classLabel='Composer')
>>> ds.addFeatureExtractors(f)
>>> of = features.OutputTabOrange(ds)
>>> for x in of.getHeaderLines(): print(x)
['Identifier', 'Changes_of_Meter', 'Composer']
['string', 'discrete', 'discrete']
['meta', '', 'class']
OutputTabOrange.getString(includeClassLabel=True, includeId=True, lineBreak=None)

Get the complete DataSet as a string with the appropriate headers.
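
Given the three header rows shown above (names, types, and flags), the complete string is simply every row joined with tabs, one row per line. A minimal sketch, with a hypothetical helper name and an illustrative data row:

```python
def tab_string(header_rows, data_rows, line_break='\n'):
    """Join header and data rows into one tab-delimited string."""
    lines = []
    for row in header_rows + data_rows:
        lines.append('\t'.join(str(cell) for cell in row))
    return line_break.join(lines)


header_rows = [
    ['Identifier', 'Changes_of_Meter', 'Composer'],
    ['string', 'discrete', 'discrete'],
    ['meta', '', 'class'],
]
data_rows = [['bwv66.6', 0, 'Bach']]  # illustrative values
output = tab_string(header_rows, data_rows)
print(output)
```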

Methods inherited from OutputFormat:


class music21.features.base.StreamForms(streamObj, prepareStream=True)

A dictionary-like wrapper of a Stream, providing numerous representations, generated on-demand, and cached.

A single StreamForms object can be created for an entire Score, as well as one for each Part and/or Voice.

A DataSet object manages one or more StreamForms objects, and exposes them to FeatureExtractors for usage.
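
The "generated on-demand, and cached" behaviour amounts to a dictionary-like object that builds a representation the first time a key is requested and reuses it afterwards. The class below is a generic sketch of that pattern on a plain list; CachedForms, its builders argument, and the 'sorted' key are hypothetical and unrelated to the real keys StreamForms supports.

```python
class CachedForms:
    """Dictionary-like wrapper that builds each form once, on first access."""

    def __init__(self, source, builders):
        self.source = source
        self._builders = builders  # maps key -> function(source)
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            # First request: build the form and remember it.
            self._cache[key] = self._builders[key](self.source)
        return self._cache[key]


calls = []


def sorted_form(data):
    calls.append('sorted')  # record every time the builder actually runs
    return sorted(data)


forms = CachedForms([3, 1, 2], {'sorted': sorted_form})
first = forms['sorted']
second = forms['sorted']
print(first, len(calls))  # [1, 2, 3] 1
```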

StreamForms methods