music21.corpus.corpora

Corpus

class music21.corpus.corpora.Corpus

Abstract base class of all corpora subclasses.

Corpus read-only properties

Corpus.cacheName
Corpus.directoryInformation

Returns a tuple of DirectoryInformation objects, one for each directory in self._directoryInformation.

>>> from music21 import corpus
>>> core = corpus.corpora.CoreCorpus()
>>> diBrief = core.directoryInformation[0:4]
>>> diBrief
(<music21.corpus.work.DirectoryInformation airdsAirs>,
 <music21.corpus.work.DirectoryInformation bach>, 
 <music21.corpus.work.DirectoryInformation beethoven>, 
 <music21.corpus.work.DirectoryInformation ciconia>)
>>> diBrief[3].directoryTitle
'Johannes Ciconia'

Corpus.metadataBundle

The metadata bundle for a corpus:

>>> from music21 import corpus
>>> corpus.corpora.CoreCorpus().metadataBundle
<music21.metadata.bundles.MetadataBundle 'core': {140... entries}>

As a technical aside, the metadata bundle for a corpus is actually stored in corpus.manager, so that it can be cached effectively across multiple calls. There might be good reasons to eventually move the bundles to each Corpus object, so long as they remain cached across instances of the class.
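The caching idea in this aside can be sketched without music21. This is only an illustration under assumptions: `_BUNDLE_CACHE`, `BUILD_CALLS`, `_build_bundle`, and `SketchCorpus` are hypothetical names, and the sketch does not reflect how corpus.manager actually stores bundles.

```python
# Hypothetical sketch: one module-level cache shared by every instance,
# so the expensive bundle is built once per corpus name.
_BUNDLE_CACHE = {}
BUILD_CALLS = []  # records each time the expensive build step runs


def _build_bundle(name):
    """Stand-in for reading a metadata cache from disk."""
    BUILD_CALLS.append(name)
    return {'corpus': name, 'entries': []}


class SketchCorpus:
    def __init__(self, name):
        self.name = name

    @property
    def metadataBundle(self):
        # Look up by corpus name; build only on the first request.
        if self.name not in _BUNDLE_CACHE:
            _BUNDLE_CACHE[self.name] = _build_bundle(self.name)
        return _BUNDLE_CACHE[self.name]


first = SketchCorpus('core').metadataBundle
second = SketchCorpus('core').metadataBundle
print(first is second, len(BUILD_CALLS))  # True 1
```

Because the cache is keyed by name rather than held on the instance, two separately constructed objects for the same corpus share one bundle.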

Corpus.name

The name of a given corpus.

Corpus methods

Corpus.getComposer(composerName, fileExtensions=None)

Return all filenames in the corpus that match a composer’s or a collection’s name. If fileExtensions is provided, it defines which extensions are returned. A fileExtensions value of None (the default) returns all extensions.

Note that xml and mxl are treated equivalently.

>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> a = coreCorpus.getComposer('bach')
>>> len(a) > 100
True
>>> a = coreCorpus.getComposer('bach', 'krn')
>>> len(a) < 10
True
>>> a = coreCorpus.getComposer('bach', 'xml')
>>> len(a) > 10
True

Corpus.getPaths()

The paths of the files in a given corpus.

Corpus.getWorkList(workName, movementNumber=None, fileExtensions=None)

Search the corpus and return the filenames of matching works, always as a list.

If no matches are found, an empty list is returned.

>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()

The following returns 1 even though there is also a ‘.mus’ file, which cannot be read:

>>> len(coreCorpus.getWorkList('cpebach/h186'))
1
>>> len(coreCorpus.getWorkList('cpebach/h186', None, '.xml'))
1
>>> len(coreCorpus.getWorkList('schumann_clara/opus17', 3))
1
>>> len(coreCorpus.getWorkList('schumann_clara/opus17', 2))
0

Make sure that ‘verdi’ just gets the single Verdi piece and not the Monteverdi pieces:

>>> len(coreCorpus.getWorkList('verdi'))
1
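One way to get the ‘verdi’ behavior described above is to compare whole path components instead of substrings. The following is a minimal sketch under that assumption; `match_work` and the sample paths are hypothetical and are not taken from the music21 implementation.

```python
import os


def match_work(paths, workName):
    """Match workName against whole path components, not substrings."""
    wanted = workName.lower().split('/')
    hits = []
    for path in paths:
        parts = path.lower().split('/')
        # Drop the file extension from the final component.
        parts[-1] = os.path.splitext(parts[-1])[0]
        # Look for `wanted` as a contiguous run of components.
        for i in range(len(parts) - len(wanted) + 1):
            if parts[i:i + len(wanted)] == wanted:
                hits.append(path)
                break
    return hits


# Made-up sample paths for illustration only.
paths = [
    'corpus/verdi/laDonnaEMobile.mxl',
    'corpus/monteverdi/madrigal.3.1.mxl',
    'corpus/cpebach/h186.mxl',
]

print(match_work(paths, 'verdi'))              # ['corpus/verdi/laDonnaEMobile.mxl']
print(len([p for p in paths if 'verdi' in p]))  # 2 (naive substring test)
print(match_work(paths, 'junk'))               # []
```

The naive substring test also picks up the ‘monteverdi’ path, which is the case the doctest above guards against.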
Corpus.getWorkReferences()

Return reference data for all works in this corpus, as a list of corpus.work.DirectoryInformation objects, one for each directory. A ‘works’ dictionary for each composer provides references to dictionaries for all associated works.

This is used in the generation of corpus documentation.

>>> workRefs = corpus.corpora.CoreCorpus().getWorkReferences()
>>> workRefs[1:3]
[<music21.corpus.work.DirectoryInformation bach>, 
 <music21.corpus.work.DirectoryInformation beethoven>]

Corpus.search(query, field=None, fileExtensions=None)

Search this corpus for metadata entries, returning a MetadataBundle.

>>> corpus.corpora.CoreCorpus().search('3/4')
<music21.metadata.bundles.MetadataBundle {1842 entries}>
>>> corpus.corpora.CoreCorpus().search(
...      'bach',
...      field='composer',
...      )
<music21.metadata.bundles.MetadataBundle {22 entries}>
>>> predicate = lambda noteCount: noteCount < 20
>>> corpus.corpora.CoreCorpus().search(
...     predicate,
...     field='noteCount',
...     )
<music21.metadata.bundles.MetadataBundle {132 entries}>
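The two query styles shown above (a string, or a callable applied to one field) can be sketched over plain dictionaries. This is a rough illustration, not the MetadataBundle implementation: the `search` function here, the case-insensitive substring rule, and the sample entries are all assumptions.

```python
def search(entries, query, field=None):
    """Return the entries whose metadata matches `query`.

    `query` is a string (case-insensitive substring match) or a callable
    predicate. A predicate is assumed to be used together with `field`.
    """
    results = []
    for entry in entries:
        # With no field given, test every field of the entry.
        names = [field] if field is not None else list(entry)
        for name in names:
            if name not in entry:
                continue
            value = entry[name]
            if callable(query):
                matched = query(value)
            else:
                matched = str(query).lower() in str(value).lower()
            if matched:
                results.append(entry)
                break
    return results


# Made-up metadata entries for illustration only.
entries = [
    {'composer': 'J.S. Bach', 'timeSignature': '3/4', 'noteCount': 165},
    {'composer': 'Josquin', 'timeSignature': '4/4', 'noteCount': 12},
    {'composer': 'C.P.E. Bach', 'timeSignature': '3/4', 'noteCount': 18},
]

print(len(search(entries, '3/4')))                                  # 2
print(len(search(entries, 'bach', field='composer')))               # 2
print(len(search(entries, lambda n: n < 20, field='noteCount')))    # 2
```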

CoreCorpus

class music21.corpus.corpora.CoreCorpus

A model of the core corpus.

>>> coreCorpus = corpus.corpora.CoreCorpus()

CoreCorpus bases

CoreCorpus read-only properties

CoreCorpus.cacheName
CoreCorpus.name
CoreCorpus.noCorpus

Return True if this is a noCorpus distribution of music21 (one installed without the core corpus), and False if the core corpus is included.

>>> from music21 import corpus
>>> corpus.corpora.CoreCorpus().noCorpus
False

Read-only properties inherited from Corpus:

CoreCorpus read/write properties

CoreCorpus.manualCoreCorpusPath

Set music21’s core corpus to a directory, and save that information in the user settings.

This is specifically for use with “no corpus” music21 packages, where the core corpus was not included with the rest of the package functionality, and had to be installed separately.

Set it to a directory:

>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> coreCorpus.manualCoreCorpusPath = '~/Desktop'

Unset it:

>>> coreCorpus.manualCoreCorpusPath = None
>>> coreCorpus.manualCoreCorpusPath is None
True

CoreCorpus methods

CoreCorpus.getBachChorales(fileExtensions='xml')

Return the file names of all Bach chorales.

By default, only Bach Chorales in xml format are returned, because the quality of the encoding and our parsing of those is superior.

N.B. Look at the module corpus.chorales for many better ways to work with the chorales.

>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> a = coreCorpus.getBachChorales()
>>> len(a) > 400
True
>>> a = coreCorpus.getBachChorales('krn')
>>> len(a) > 10
False
>>> a = coreCorpus.getBachChorales('xml')
>>> len(a) > 400
True
>>> a[0]
'/Users/cuthbert/Documents/music21/corpus/bach/bwv1.6.mxl'

CoreCorpus.getComposerDirectoryPath(composerName)

DEPRECATED

Given the name of a composer, get the path to the top-level directory of that composer:

>>> import os
>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> a = coreCorpus.getComposerDirectoryPath('ciconia')
>>> a.endswith(os.path.join('corpus', 'ciconia'))
True
>>> a = coreCorpus.getComposerDirectoryPath('bach')
>>> a.endswith(os.path.join('corpus', 'bach'))
True
>>> a = coreCorpus.getComposerDirectoryPath('handel')
>>> a.endswith(os.path.join('corpus', 'handel'))
True

CoreCorpus.getMonteverdiMadrigals(fileExtensions='xml')

Return a list of the filenames of all Monteverdi madrigals.

>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> a = coreCorpus.getMonteverdiMadrigals()
>>> len(a) > 40
True

CoreCorpus.getPaths(fileExtensions=None, expandExtensions=True)

Get all paths in the core corpus that match a known extension, or an extension provided by an argument.

If expandExtensions is True, each extension is expanded to all known input extensions for its format, including related extensions.

This is convenient when a single input format is stored under multiple extensions.

>>> from music21 import corpus
>>> coreCorpus = corpus.corpora.CoreCorpus()
>>> corpusFilePaths = coreCorpus.getPaths()
>>> 2500 < len(corpusFilePaths) < 2600
True
>>> kernFilePaths = coreCorpus.getPaths('krn')
>>> len(kernFilePaths) >= 500
True
>>> abcFilePaths = coreCorpus.getPaths('abc')
>>> len(abcFilePaths) >= 100
True
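Extension expansion of this kind can be sketched as a lookup from one extension to every extension of the same format. The table and function names below (`FORMAT_EXTENSIONS`, `expand`, `filter_paths`) are hypothetical, and the grouping is an assumption based only on the note that xml and mxl are treated equivalently.

```python
import os

# Hypothetical table: format name -> extensions that belong to it.
FORMAT_EXTENSIONS = {
    'musicxml': ('.xml', '.mxl'),
    'humdrum': ('.krn',),
    'abc': ('.abc',),
}


def expand(fileExtensions, expandExtensions=True):
    """Return the set of dotted extensions to match."""
    if isinstance(fileExtensions, str):
        fileExtensions = [fileExtensions]
    result = set()
    for ext in fileExtensions:
        ext = ext if ext.startswith('.') else '.' + ext
        related = (ext,)
        if expandExtensions:
            # Swap the extension for every extension of its format.
            for group in FORMAT_EXTENSIONS.values():
                if ext in group:
                    related = group
        result.update(related)
    return result


def filter_paths(paths, fileExtensions=None, expandExtensions=True):
    if fileExtensions is None:
        # Assumption: None means "every known extension".
        wanted = {e for group in FORMAT_EXTENSIONS.values() for e in group}
    else:
        wanted = expand(fileExtensions, expandExtensions)
    return [p for p in paths if os.path.splitext(p)[1] in wanted]


# Made-up sample paths for illustration only.
paths = ['bach/bwv1.6.mxl', 'bach/bwv1.6.xml', 'bach/bwv773.krn', 'notes.txt']

print(sorted(expand('xml')))                          # ['.mxl', '.xml']
print(sorted(expand('xml', expandExtensions=False)))  # ['.xml']
print(filter_paths(paths, 'xml'))  # ['bach/bwv1.6.mxl', 'bach/bwv1.6.xml']
```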

Methods inherited from Corpus:

LocalCorpus

class music21.corpus.corpora.LocalCorpus(name=None)

A model of a local corpus.

>>> localCorpus = corpus.corpora.LocalCorpus()

The default local corpus is unnamed (or called “local” or None), but an arbitrary number of independent, named local corpora can be defined and persisted:

>>> namedLocalCorpus = corpus.corpora.LocalCorpus('with a name')

LocalCorpus bases

LocalCorpus read-only properties

LocalCorpus.cacheName
LocalCorpus.directoryPaths

The directory paths in use by a given local corpus.

LocalCorpus.existsInSettings

True if this local corpus has a corresponding entry in music21’s user settings, otherwise False.

LocalCorpus.name

The name of a given local corpus.

>>> from music21 import corpus
>>> corpus.corpora.LocalCorpus().name
'local'
>>> corpus.corpora.LocalCorpus(name='Bach Chorales').name
'Bach Chorales'

Read-only properties inherited from Corpus:

LocalCorpus methods

LocalCorpus.addPath(directoryPath)

Add a directory path to a local corpus:

>>> localCorpus = corpus.corpora.LocalCorpus('a new corpus')
>>> localCorpus.addPath('~/Desktop')

Paths added in this way will not be persisted from session to session unless explicitly saved by a call to LocalCorpus.save().

LocalCorpus.delete()

Delete a non-default local corpus from the user settings.

LocalCorpus.getPaths(fileExtensions=None, expandExtensions=True)

Access files in additional directories supplied by the user and defined in environment settings in the ‘localCorpusSettings’ list.

If additional paths are added on a per-session basis with the addPath() method, these paths are also returned by this method.

LocalCorpus.removePath(directoryPath)

Remove a directory path from a local corpus.

If that path is included in the list of persisted paths for the given corpus, it will be removed permanently.

LocalCorpus.save()

Save the current list of directory paths in use by a given corpus in the user settings.
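The difference between session-only paths and saved paths can be sketched with an in-memory stand-in for the user settings. Everything here is an assumption made for illustration: `SETTINGS` and `SketchLocalCorpus` are hypothetical, and the real class stores its paths through music21’s environment settings rather than a dictionary.

```python
# Hypothetical stand-in for the user settings file, kept in memory.
SETTINGS = {}


class SketchLocalCorpus:
    def __init__(self, name=None):
        self.name = 'local' if name is None else name
        # Start from whatever was saved earlier under this name.
        self._paths = list(SETTINGS.get(self.name, []))

    @property
    def directoryPaths(self):
        return tuple(self._paths)

    @property
    def existsInSettings(self):
        return self.name in SETTINGS

    def addPath(self, directoryPath):
        if directoryPath not in self._paths:
            self._paths.append(directoryPath)  # session only until save()

    def removePath(self, directoryPath):
        if directoryPath in self._paths:
            self._paths.remove(directoryPath)
        # Assumed behavior: a persisted copy of the path is removed too.
        if directoryPath in SETTINGS.get(self.name, []):
            SETTINGS[self.name].remove(directoryPath)

    def save(self):
        SETTINGS[self.name] = list(self._paths)

    def delete(self):
        SETTINGS.pop(self.name, None)


a = SketchLocalCorpus('a new corpus')
a.addPath('~/Desktop')
unsaved = SketchLocalCorpus('a new corpus').directoryPaths  # not persisted yet
a.save()
saved = SketchLocalCorpus('a new corpus').directoryPaths
print(unsaved, saved)  # () ('~/Desktop',)
```

A fresh object with the same name sees the added path only after save() has been called, which mirrors the persistence rule described for addPath().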

Methods inherited from Corpus:

VirtualCorpus

class music21.corpus.corpora.VirtualCorpus

A model of the virtual corpus, which stays online.

>>> virtualCorpus = corpus.corpora.VirtualCorpus()

VirtualCorpus bases

VirtualCorpus read-only properties

VirtualCorpus.cacheName
VirtualCorpus.name

The name of the virtual corpus:

>>> corpus.corpora.VirtualCorpus().name
'virtual'

Read-only properties inherited from Corpus:

VirtualCorpus methods

VirtualCorpus.getPaths(fileExtensions=None, expandExtensions=True)

Get all paths in the virtual corpus that match a known extension.

An extension of None will return all known extensions.

>>> len(corpus.corpora.VirtualCorpus().getPaths()) > 6
True

VirtualCorpus.getWorkList(workName, movementNumber=None, fileExtensions=None)

Given a work name, search all virtual works and return a list of URLs for any matches.

>>> virtualCorpus = corpus.corpora.VirtualCorpus()
>>> virtualCorpus.getWorkList('bach/bwv1007/prelude')
['http://kern.ccarh.org/cgi-bin/ksdata?l=cc/bach/cello&file=bwv1007-01.krn&f=xml']
>>> virtualCorpus.getWorkList('junk')
[]

Methods inherited from Corpus: