.. _usersGuide_53_advancedCorpus:

.. WARNING: DO NOT EDIT THIS FILE:
    AUTOMATICALLY GENERATED.
    PLEASE EDIT THE .py FILE DIRECTLY.

User’s Guide, Chapter 53: Advanced Corpus and Metadata Searching
================================================================

We saw in :ref:`Chapter 11` some ways to work with and search through
the “core” corpus. Not everything is in the core corpus, of course, so
the ``converter.parse()`` function is a great way of getting files from
a local hard drive or the internet. But the “core” corpus also has many
great search functions, and these can be helpful for working with your
own files and files on the web as well.

In this chapter, we’ll introduce the other “Corpora” in addition to the
“core” corpus and how they might be used.

The Default Local Corpus
------------------------

.. code:: ipython3

    from music21 import *
    localCorpus = corpus.corpora.LocalCorpus()
    localCorpus

You can add and remove paths from a *local* corpus with the
``addPath()`` and ``removePath()`` methods. The next section shows
``addPath()`` in use.

Creating multiple corpus repositories via local corpora
-------------------------------------------------------

In addition to the default local corpus, music21 allows users to create
and save as many named local corpora as they like, which will persist
from session to session.

Let’s create a new *local* corpus, give it a directory to find music
files in, and then save it:

.. code:: ipython3

    from music21 import *
    aNewLocalCorpus = corpus.corpora.LocalCorpus('newCorpus')
    aNewLocalCorpus.existsInSettings

.. parsed-literal::
    :class: ipython-result

    False

.. code:: ipython3

    aNewLocalCorpus.addPath('~/Desktop')
    #_DOCS_SHOW aNewLocalCorpus.directoryPaths
    print("('/Users/josiah/Desktop',)") #_DOCS_HIDE

.. parsed-literal::
    :class: ipython-result

    ('/Users/josiah/Desktop',)

.. code:: ipython3

    aNewLocalCorpus.save()
    aNewLocalCorpus.existsInSettings

.. parsed-literal::
    :class: ipython-result

    /Users/cuthbert/git/music21base/music21/corpus/corpora.py: WARNING: newCorpus metadata cache: starting processing of paths: 0
    /Users/cuthbert/git/music21base/music21/corpus/corpora.py: WARNING: cache: filename: /var/folders/qg/klchy5t14bb2ty9pswk6c2bw0000gn/T/music21/local-newCorpus.p.gz
    metadata.bundles: WARNING: MetadataBundle Modification Time: 1686508943.60752
    metadata.bundles: WARNING: Skipped 0 sources already in cache.
    /Users/cuthbert/git/music21base/music21/corpus/corpora.py: WARNING: cache: writing time: 0.202 md items: 0
    /Users/cuthbert/git/music21base/music21/corpus/corpora.py: WARNING: cache: filename: /var/folders/qg/klchy5t14bb2ty9pswk6c2bw0000gn/T/music21/local-newCorpus.p.gz

.. parsed-literal::
    :class: ipython-result

    True

We can see that our new *local* corpus is saved by checking the names of
all saved *local* corpora with ``corpus.manager.listLocalCorporaNames()``:

.. code:: ipython3

    #_DOCS_SHOW corpus.manager.listLocalCorporaNames()
    print("[None, 'funk', 'newCorpus', 'bach']") #_DOCS_HIDE

.. parsed-literal::
    :class: ipython-result

    [None, 'funk', 'newCorpus', 'bach']

.. note::

    When running ``listLocalCorporaNames()``, you will see ``None``,
    which indicates the default *local* corpus, along with the names of
    any non-default *local* corpora you've created yourself. In the
    above example, a number of other corpora have already been created.

Finally, we can delete the *local* corpus we previously created like
this:

.. code:: ipython3

    aNewLocalCorpus.delete()
    aNewLocalCorpus.existsInSettings

.. parsed-literal::
    :class: ipython-result

    False

Inspecting metadata bundle search results
-----------------------------------------

Let’s take a closer look at some search results:

.. code:: ipython3

    bachBundle = corpus.corpora.CoreCorpus().search('bach', 'composer')
    bachBundle

.. code:: ipython3

    bachBundle[0]

.. code:: ipython3

    bachBundle[0].sourcePath

.. parsed-literal::
    :class: ipython-result

    PosixPath('bach/bwv10.7.mxl')

.. code:: ipython3

    bachBundle[0].metadata

.. code:: ipython3

    bachBundle[0].metadata.all()

.. parsed-literal::
    :class: ipython-result

    (('ambitus',
      AmbitusShort(semitones=34, diatonic='m7', pitchLowest='G2', pitchHighest='F5')),
     ('composer', 'J.S. Bach'),
     ('fileFormat', 'musicxml'),
     ('filePath',
      '/Users/cuthbert/git/music21base/music21/corpus/bach/bwv10.7.mxl'),
     ('keySignatureFirst', -2),
     ('keySignatures', [-2]),
     ('movementName', 'bwv10.7.mxl'),
     ('noteCount', 214),
     ('numberOfParts', 4),
     ('pitchHighest', 'F5'),
     ('pitchLowest', 'G2'),
     ('quarterLength', 88.0),
     ('software', 'MuseScore 2.1.0'),
     ('software', 'music21 v.6.0.0a'),
     ('software', 'music21 v.9.1.0'),
     ('sourcePath', 'bach/bwv10.7.mxl'),
     ('tempoFirst', None),
     ('tempos', []),
     ('timeSignatureFirst', '4/4'),
     ('timeSignatures', ['4/4']))

.. code:: ipython3

    mdpl = bachBundle[0].metadata
    mdpl.noteCount

.. parsed-literal::
    :class: ipython-result

    214

.. code:: ipython3

    bachAnalysis0 = bachBundle[0].parse()
    bachAnalysis0.show()

.. image:: usersGuide_53_advancedCorpus_20_0.png
   :width: 787px
   :height: 681px

.. image:: usersGuide_53_advancedCorpus_20_2.png
   :width: 787px
   :height: 927px

Manipulating multiple metadata bundles
--------------------------------------

Another useful feature of ``music21``\ ’s metadata bundles is that they
can be operated on as though they were sets, allowing you to take the
union, intersection and difference of multiple metadata bundles and so
build more complex search results:

.. code:: ipython3

    corelliBundle = corpus.search('corelli', field='composer')
    corelliBundle

.. code:: ipython3

    bachBundle.union(corelliBundle)

Consult the API for :class:`~music21.metadata.bundles.MetadataBundle`
for a more in-depth look at how this works.

Getting a metadata bundle
-------------------------

In music21, metadata is information *about* a score, such as its
composer, title, initial key signature or ambitus. A metadata *bundle*
is a collection of metadata pulled from an arbitrarily large group of
different scores. Users can search through metadata bundles to find
scores with certain qualities, such as all scores in a given corpus with
a time signature of ``6/8``, or all scores composed by Monteverdi.

There are a number of different ways to acquire a metadata bundle. The
easiest way to get the metadataBundle for the core corpus is simply to
download music21: we include a pre-made metadataBundle (in
``corpus/metadataCache/core.json``), so building one yourself is
unnecessary for the core corpus unless you’re contributing to the
project. But you may want to create metadata bundles for your own local
corpora. Access the ``metadataBundle`` attribute of any ``Corpus``
instance to get its corresponding metadata bundle:

.. code:: ipython3

    coreCorpus = corpus.corpora.CoreCorpus()
    coreCorpus.metadataBundle

The same attribute gives you the metadata bundle for the *core* corpus,
the default *local* corpus, or any named *local* corpus:

.. code:: ipython3

    coreBundle = corpus.corpora.CoreCorpus().metadataBundle
    localBundle = corpus.corpora.LocalCorpus().metadataBundle
    otherLocalBundle = corpus.corpora.LocalCorpus('blah').metadataBundle

Really advanced users can also make metadata bundles manually, by
passing in either the name of the corpus the bundle should refer to or,
equivalently, an actual ``Corpus`` instance:

.. code:: ipython3

    coreBundle = metadata.bundles.MetadataBundle('core')
    coreBundle = metadata.bundles.MetadataBundle(corpus.corpora.CoreCorpus())

However, you’ll need to read the bundle’s saved data from disk before
you can do anything useful with the bundle. Bundles don’t read their
associated cache files automatically when they’re manually instantiated.

.. code:: ipython3

    coreBundle

.. code:: ipython3

    coreBundle.read()

Creating persistent metadata bundles
------------------------------------

Metadata bundles can take a long time to create. So it’d be nice if they
could be written to and read from disk. Unfortunately we never got
around to…nah, just kidding. Of course you can. Just call ``.write()``
on one:

.. code:: ipython3

    coreBundle = metadata.bundles.MetadataBundle('core')
    coreBundle.read()

.. code:: ipython3

    #_DOCS_SHOW coreBundle.write()

Bundles can also be completely rebuilt, as you will want to do for local
corpora. To add information to a bundle, use the ``addFromPaths()``
method, which returns a list of the paths that could not be processed:

.. code:: ipython3

    newBundle = metadata.bundles.MetadataBundle()
    paths = corpus.corpora.CoreCorpus().search('corelli')
    #_DOCS_SHOW failedPaths = newBundle.addFromPaths(paths)
    failedPaths = [] #_DOCS_HIDE
    failedPaths

.. parsed-literal::
    :class: ipython-result

    []

Then call ``.write()`` to save the bundle to disk.

.. code:: ipython3

    #_DOCS_SHOW newBundle
    print("") # did not actually run addFromPaths... #_DOCS_HIDE

.. note::

    Building metadata information can be an incredibly intensive
    process. For example, building the *core* metadata bundle can easily
    take as long as an hour, even though the building process uses
    multiple cores. Please use caution, and be patient, when building
    metadata bundles from large corpora.

    To monitor the corpus-building progress, make sure to set
    ``'debug'`` to ``True`` in your user settings:

    >>> #_DOCS_SHOW environment.UserSettings()['debug'] = True

You can delete, rebuild and save a metadata bundle in one go with the
``rebuildMetadataCache()`` method:

.. code:: ipython3

    localBundle = corpus.corpora.LocalCorpus().metadataBundle
    #_DOCS_SHOW localBundle.rebuildMetadataCache()

Rebuilding stores the file as it goes (for safety), so there is no need
to call ``.write()`` at the end.

To delete a metadata bundle’s cached-to-disk file, use the ``delete()``
method:

.. code:: ipython3

    #_DOCS_SHOW localBundle.delete()

Deleting a metadata bundle’s cache file won’t empty the in-memory
contents of that bundle. For that, use ``clear()``:

.. code:: ipython3

    localBundle.clear()

With local corpora you can build your own collections of pieces to
analyze, use them as the basis of a new self-contained project, and
index and search through them. But what if some of your music is in a
file format that ``music21`` does not yet support? Maybe it’s time to
write your own converter or notation format. To learn how to do that, go
to the next chapter,
:ref:`Chapter 54: Extending Converter with New Formats`.
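
As a recap of the difference between ``delete()`` and ``clear()``
described above, here is a minimal toy model in plain Python. It does
not use ``music21`` at all: the ``ToyBundle`` class, its ``entries``
attribute and the JSON file it writes are hypothetical stand-ins
invented for this illustration, and the real ``MetadataBundle`` may well
be implemented differently. The sketch only shows the idea that removing
the cache file on disk and emptying the in-memory contents are two
separate steps.

.. code:: ipython3

    import json
    import tempfile
    from pathlib import Path


    class ToyBundle:
        '''Hypothetical stand-in for a metadata bundle (NOT the music21 class).'''

        def __init__(self, cachePath):
            self.cachePath = Path(cachePath)  # where the cache file lives on disk
            self.entries = {}                 # the in-memory contents

        def write(self):
            # Save the in-memory contents to the cache file.
            self.cachePath.write_text(json.dumps(self.entries))

        def delete(self):
            # Remove only the file on disk; the in-memory contents stay put.
            if self.cachePath.exists():
                self.cachePath.unlink()

        def clear(self):
            # Empty only the in-memory contents; any file on disk stays put.
            self.entries.clear()


    with tempfile.TemporaryDirectory() as tempDir:
        toy = ToyBundle(Path(tempDir) / 'toy-cache.json')
        toy.entries['bach/bwv10.7.mxl'] = {'noteCount': 214}
        toy.write()

        toy.delete()
        fileExistsAfterDelete = toy.cachePath.exists()
        entriesAfterDelete = len(toy.entries)

        toy.clear()
        entriesAfterClear = len(toy.entries)

    # prints: False 1 0
    print(fileExistsAfterDelete, entriesAfterDelete, entriesAfterClear)

After ``delete()`` the file is gone but the single entry is still in
memory; only ``clear()`` brings the entry count down to zero.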