Abstract

Aligning versions of the same source material has been a persistent challenge in the field of digital libraries for musicology, and a barrier to progress. The growing body of publicly accessible symbolic datasets (of scores, analyses, and more) increasingly covers multiple versions of the same works. As creators, curators, and representatives of many such datasets and encoding standards, we came together in this project to coordinate platform-neutral interoperability for combining and comparing different sources, reliably and automatically. Here, we outline the main challenges and propose solutions centred on the 'measure map': a lightweight format for representing symbolic bar information alone. We offer new code for producing this representation from various formats, diagnosing differences, and even resolving those differences by modifying sources in place. While we cannot resolve every possible discrepancy, we do provide a corpus-scale demonstration; and while we focus on symbolic data, we consider the measure map also a useful basis for aligning audio, manuscripts, and any source for which bar-relative location data provides a useful point of reference.

Details