Basic Functionality
Nearly all functionality is provided by the class MTCFeatureLoader. An MTCFeatureLoader object takes as its source a .jsonl file: a text file (optionally gzipped) in which each line holds a JSON object representing a melody. A melody object contains metadata fields and several sequences of feature values.
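To make the file format concrete, the sketch below reads such a file with the Python standard library only. It is not the loader's own code: the function name iter_melodies is hypothetical, and the sketch assumes that a gzipped file can be recognised by its .gz extension.

```python
import gzip
import json
from pathlib import Path


def iter_melodies(path):
    """Yield one melody object (a dict) per non-empty line of a .jsonl or .jsonl.gz file."""
    path = Path(path)
    # Assumption: gzipped files end in ".gz"; both openers are used in text mode.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because the function is a generator, melodies are parsed one at a time and the whole file never needs to be held in memory.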
Several .jsonl files are provided with the module:
MTC-ANN-2.0.1 : Feature sequences for the melodies of MTC-ANN-2.0.1.
MTC-FS-INST-2.0 : Feature sequences for the melodies of MTC-FS-INST-2.0.
ESSEN : Feature sequences for the melodies in the ESSEN Folksong Collection.
The MTCFeatureLoader can be initialized either with one of these, or with a user provided .jsonl or .jsonl.gz file:
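from MTCFeatures import MTCFeatureLoader
# Load one of the bundled datasets by name
fl = MTCFeatureLoader('MTC-ANN-2.0.1')
fl = MTCFeatureLoader('MTC-FS-INST-2.0')
# Or load a user-provided file (gzipped or plain), by relative or absolute path
fl = MTCFeatureLoader('../path/to/my/file.jsonl.gz')
fl = MTCFeatureLoader('/path/to/my/file.jsonl')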
The MTCFeatureLoader class provides various functionalities:
Melody filtering : select melodies according to given criteria
Feature selection : keep a subset of the features
Feature extraction : compute a new feature from existing features and add it to the object
None replacement : replace undefined feature values (null in JSON, None in Python) with sensible fallback values
Operations can be chained. All feature extractors, feature selectors, object filters, and the None replacer return an iterator over the resulting sequences. Each has a parameter seq_iter. If seq_iter is None (the default), the .jsonl file is taken as the data source and a new iterator is created. Otherwise, the provided iterator is taken as the data source. The method applyFilters takes a list of filter names and applies them in the given order.
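The chaining convention described above can be illustrated with plain generators. The sketch below is a stand-alone illustration of the pattern, not the MTCFeatureLoader implementation: default_source, keep_vocal, keep_first_voice, FILTERS and apply_filters are hypothetical names, and the melody field "type" is an assumption made for the toy data.

```python
def default_source():
    """Stand-in for reading the .jsonl file; yields toy melody dicts."""
    yield {"id": "NLB000001_01", "type": "vocal"}
    yield {"id": "NLB000001_02", "type": "vocal"}
    yield {"id": "NLB000002_01", "type": "instrumental"}


def keep_vocal(seq_iter=None):
    # No iterator given: start a new one from the data source.
    if seq_iter is None:
        seq_iter = default_source()
    return (m for m in seq_iter if m["type"] == "vocal")


def keep_first_voice(seq_iter=None):
    if seq_iter is None:
        seq_iter = default_source()
    return (m for m in seq_iter if m["id"].endswith("_01"))


# Hypothetical registry mapping filter names to functions.
FILTERS = {"vocal": keep_vocal, "firstvoice": keep_first_voice}


def apply_filters(names, seq_iter=None):
    """Apply the named filters in the given order, each consuming the previous result."""
    if seq_iter is None:
        seq_iter = default_source()
    for name in names:
        seq_iter = FILTERS[name](seq_iter)
    return seq_iter


# Chaining by hand and chaining by name select the same melodies.
by_hand = [m["id"] for m in keep_first_voice(keep_vocal())]
by_name = [m["id"] for m in apply_filters(["vocal", "firstvoice"])]
```

Since every step returns a lazy iterator, no melody is read until the final result is consumed.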
The following filters are registered in class MTCFeatureLoader:
vocal : Only keep vocal melodies
instrumental : Only keep instrumental melodies
firstvoice : Only keep first voices/stanzas (i.e. identifier ending with _01)
ann_bgcorpus : Only keep melodies unrelated to MTC-ANN (only applicable to MTC-FS-INST)
labeled : Only keep melodies with a tune family label
unlabeled : Only keep melodies without a tune family label
afteryear(year) : Only keep melodies in sources dated later than year (year not included)
beforeyear(year) : Only keep melodies in sources dated before year (year not included)
betweenyears(year1, year2) : Only keep melodies in sources dated between year1 and year2 (both not included)
inOGL : Only keep melodies that are part of Onder de Groene Linde
inNLBIDs(id_list) : Only keep melodies with given identifiers in id_list
inTuneFamilies(tf_list) : Only keep melodies in given tune families in tf_list
inInstTest : Only keep melodies that are in cINST.
origin(location) : Only keep melodies if location occurs somewhere in the origin meta data field (only for Essen).
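The year filters exclude their bounds, and origin matches a substring. The sketch below shows these semantics on toy data. It is an illustration only: the function names, the fields "year" and "origin", and the choice to drop melodies without a date are assumptions, not the library's documented behaviour.

```python
def after_year(melodies, year):
    """Keep melodies dated strictly later than `year`."""
    return (m for m in melodies if m.get("year") is not None and m["year"] > year)


def before_year(melodies, year):
    """Keep melodies dated strictly before `year`."""
    return (m for m in melodies if m.get("year") is not None and m["year"] < year)


def between_years(melodies, year1, year2):
    """Keep melodies dated strictly between `year1` and `year2`."""
    return (m for m in melodies
            if m.get("year") is not None and year1 < m["year"] < year2)


def origin_contains(melodies, location):
    """Keep melodies whose origin field contains `location` as a substring."""
    return (m for m in melodies if location in (m.get("origin") or ""))


# Toy data with hypothetical field values.
toy = [
    {"id": "a", "year": 1850, "origin": "Deutschland, Sachsen"},
    {"id": "b", "year": 1875, "origin": "Polska"},
    {"id": "c", "year": 1900, "origin": "Deutschland"},
    {"id": "d", "year": None, "origin": None},
]

# Only "b" lies strictly between the bounds; 1850 and 1900 themselves are excluded.
selected = [m["id"] for m in between_years(toy, 1850, 1900)]
```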
Available as separate functions:
minClassSizeFilter : Keep only melodies in tune families with >= minsize members.
maxClassSizeFilter : Keep only melodies in tune families with <= maxsize members.
head : Keep only first n melodies.
tail : Keep only last n melodies.
randomSel : Take a random sample of n melodies.
replaceNone : Replace undefined feature values (None) with sensible fall back values.
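One way such functions could operate on an iterator of melodies is sketched below with the standard library. The names, signatures, and the "tunefamily" field are hypothetical and do not reproduce the module's own functions. Note that a class-size filter has to see all melodies before it can count family sizes.

```python
import random
from collections import Counter, deque
from itertools import islice


def min_class_size(melodies, minsize):
    """Keep melodies whose tune family has at least `minsize` members."""
    melodies = list(melodies)  # two passes needed: count first, then filter
    sizes = Counter(m["tunefamily"] for m in melodies)
    return (m for m in melodies if sizes[m["tunefamily"]] >= minsize)


def head(melodies, n):
    """Keep only the first `n` melodies, without reading the rest."""
    return islice(melodies, n)


def tail(melodies, n):
    """Keep only the last `n` melodies; a bounded deque discards older items."""
    return iter(deque(melodies, maxlen=n))


def random_sel(melodies, n, seed=None):
    """Take a random sample of at most `n` melodies, without replacement."""
    pool = list(melodies)
    rng = random.Random(seed)
    return iter(rng.sample(pool, min(n, len(pool))))


# Toy data: family "A" has three members, family "B" has one.
toy = [
    {"id": "m1", "tunefamily": "A"},
    {"id": "m2", "tunefamily": "B"},
    {"id": "m3", "tunefamily": "A"},
    {"id": "m4", "tunefamily": "A"},
]
large_families = [m["id"] for m in min_class_size(toy, 2)]
```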
For replacement of the None values, a separate rule is included for each of the relevant features. The following rules are included:
metriccontour : None -> ‘=’ if all items in the sequence are None. None -> ‘+’ if only the first item is None.
imacontour : First note: None -> “+”
contour3 : First note: None -> “=”
contour5 : First note: None -> “=”
IOR : First, and possibly last notes: None -> 1.0
IOR_frac : First, and possibly last notes: None -> “1”
durationcontour : First note: None -> “=”
restduration_frac : None -> “0”
diatonicinterval : First note: None -> 0
chromaticinterval : First note: None -> 0
nextisrest : Last note: None -> True
beatfraction : None -> “0”
beatinsong : None -> “0”
beatinphrase : None -> “0”
beatinphrase_end : None -> “0”
beatstrength : None -> 1.0
beat_str : None -> “1”
beat_fraction_str : None -> “0”
beat : None -> 0.0
timesignature : None -> “0/0”
lyrics : None -> “”
noncontentword : None -> False
wordend : None -> False
phoneme : None -> ‘’
rhymes : None -> False
rhymescontentwords : None -> False
wordstress : None -> False
For the different models from the literature (LBDM, GTTM, IR) no None-replacers are included.
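A rule-based replacement of this kind can be sketched as follows. Only a handful of the rules listed above are encoded, and the sketch assumes that the features of one melody are available as a dict mapping feature names to equal-length lists. That layout and the names EVERYWHERE, FIRST_NOTE and replace_none are assumptions for illustration, not the module's API.

```python
# Rules that apply at any position in the sequence (subset of the list above).
EVERYWHERE = {"beatstrength": 1.0, "restduration_frac": "0",
              "timesignature": "0/0", "lyrics": ""}
# Rules that apply to the first note only (subset of the list above).
FIRST_NOTE = {"imacontour": "+", "contour3": "=", "diatonicinterval": 0}


def replace_none(features):
    """Return a copy of `features` with None values replaced per the rules."""
    out = {name: list(values) for name, values in features.items()}
    for name, fallback in EVERYWHERE.items():
        if name in out:
            out[name] = [fallback if v is None else v for v in out[name]]
    for name, fallback in FIRST_NOTE.items():
        if out.get(name) and out[name][0] is None:
            out[name][0] = fallback
    # metriccontour has two cases: all items None, or only the first item None.
    mc = out.get("metriccontour")
    if mc:
        if all(v is None for v in mc):
            out["metriccontour"] = ["="] * len(mc)
        elif mc[0] is None and all(v is not None for v in mc[1:]):
            mc[0] = "+"
    return out


example = {
    "pitch": [60, 62, 64],
    "contour3": [None, "+", "+"],
    "beatstrength": [1.0, None, 0.5],
    "metriccontour": [None, "-", "+"],
}
cleaned = replace_none(example)
```

Features without a rule, such as pitch in this example, pass through unchanged, and the input dict is left unmodified.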