Prof. Dr. Sabine Schulte im Walde

Compositionality Ratings

Compositionality ratings are human ratings on the degree of compositionality of compounds. There are two basic versions of compositionality ratings:

1. Humans are asked to rate the degree of compositionality of a compound as a whole, i.e., without explicitly referring to the constituents.
Example: On a scale between 0 (definitely opaque) and 6 (definitely transparent), how compositional is the compound Löwenzahn 'dandelion' (lit. 'lion's tooth')?
2. Humans are asked to rate the degree of compositionality of a compound with regard to one or more constituents.
Example: On a scale between 0 (definitely opaque) and 6 (definitely transparent), how compositional is the compound Löwenzahn 'dandelion' (lit. 'lion's tooth') with regard to the modifier noun Löwe 'lion'?

We have collected several datasets of compositionality ratings for German compounds. See here on how to obtain the data.

Compositionality Ratings for German Noun Compounds

CONCRETE-NN, collected by Susanne Borgwaldt and Sabine Schulte im Walde:
von der Heide and Borgwaldt (2009) created a set of 450 concrete, depictable German noun compounds and collected human ratings on compositionality for all their 450 compounds. The compounds were distributed over 5 lists, and 270 participants judged the degree of compositionality of the compounds with respect to their first as well as their second constituent, on a scale between 1 (definitely opaque) and 7 (definitely transparent). For each compound-constituent pair, they collected judgements from 30 participants, and calculated the rating mean and the standard deviation.
We disregarded noun compounds with more than two constituents (in some cases, the modifier or the head was itself complex) as well as compounds whose modifiers were not nouns, thus arriving at a subset of 244 two-part noun-noun compounds out of the original 450. A second experiment collected human ratings on compositionality for our subset. In this case, we asked the participants to provide a single score for each compound as a whole, again on a scale between 1 and 7. The collection was performed via Amazon Mechanical Turk (AMT) and resulted in 27-34 ratings per target compound. For each of the compounds we calculated the rating mean and the standard deviation.
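The aggregation step described above (mean and standard deviation per compound-constituent pair) can be sketched as follows. This is a minimal illustration, not the procedure used for the dataset: the ratings are invented, Löwenzahn is borrowed from the introductory example and is not claimed to be part of CONCRETE-NN, and the use of the sample standard deviation is an assumption, since the description does not say which variant was computed.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical raw judgements: (compound, constituent role, rating on the 1-7 scale).
# The values are invented for illustration and are not taken from the dataset.
judgements = [
    ("Löwenzahn", "modifier", 2),
    ("Löwenzahn", "modifier", 1),
    ("Löwenzahn", "modifier", 3),
    ("Löwenzahn", "head", 2),
    ("Löwenzahn", "head", 4),
    ("Löwenzahn", "head", 3),
]


def summarise(rows):
    """Group ratings by (compound, role) and return (mean, standard deviation)."""
    grouped = defaultdict(list)
    for compound, role, rating in rows:
        grouped[(compound, role)].append(rating)
    # Assumption: sample standard deviation (needs at least two ratings per pair).
    return {key: (mean(vals), stdev(vals)) for key, vals in grouped.items()}


summary = summarise(judgements)
for (compound, role), (avg, sd) in summary.items():
    print(f"{compound} / {role}: mean={avg:.2f}, sd={sd:.2f}")
```

For whole-compound ratings, as in the second experiment, the same grouping works with the compound alone as the key.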
Ghost-NN, collected by Sabine Schulte im Walde, Anna Hätty, Stefan Bott and Nana Khvtisavrishvili:
Ghost-NN is a gold standard of 868 German noun-noun compounds annotated with corpus frequencies of the compounds and their constituents, productivity and ambiguity of the constituents, semantic relations between the constituents, and compositionality ratings of compound-constituent pairs. Moreover, a subset of 180 compounds is balanced for the productivity of the modifiers (distinguishing low/mid/high productivity) and the ambiguity of the heads (distinguishing between heads with 1, 2 and >2 senses).
The resource comprises three parts:
1. a set of 154,960 noun-noun candidate compounds and their constituents, accompanied by corpus frequency, productivity and degree of ambiguity;
2. the final gold standard Ghost-NN of 868 noun-noun compounds and their constituents, accompanied by corpus frequency, productivity, ambiguity, and annotated with semantic relations and compositionality ratings;
3. the carefully balanced Ghost-NN subsets of 20x9 and 5x9 compounds and their constituents, categorised according to 9 criteria combinations for modifier productivity and head ambiguity.
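The 9 criteria combinations behind the balanced subsets follow from crossing three modifier-productivity levels with three head-ambiguity levels. The sketch below only enumerates that grid and checks the subset sizes; the variable and function names are illustrative, and how compounds were actually assigned to or sampled from each cell is not specified here.

```python
from itertools import product

# Category labels follow the description above; the names are illustrative.
MODIFIER_PRODUCTIVITY = ["low", "mid", "high"]
HEAD_AMBIGUITY = ["1 sense", "2 senses", ">2 senses"]

# Every combination of modifier productivity and head ambiguity is one cell.
cells = list(product(MODIFIER_PRODUCTIVITY, HEAD_AMBIGUITY))


def ambiguity_category(n_senses):
    """Map a head's number of senses to one of the three ambiguity categories."""
    if n_senses < 1:
        raise ValueError("a head needs at least one sense")
    if n_senses == 1:
        return "1 sense"
    if n_senses == 2:
        return "2 senses"
    return ">2 senses"


def subset_size(per_cell):
    """Total number of compounds when each cell holds `per_cell` compounds."""
    return per_cell * len(cells)


print(len(cells))        # number of criteria combinations
print(subset_size(20))   # size of the 20x9 subset
print(subset_size(5))    # size of the 5x9 subset
```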
Feature-NN, collected by Sabine Schulte im Walde:
Feature-NN is a novel collection of compositionality ratings for 1,099 German noun compounds where, in contrast to previous related work, we asked the human judges to provide (a) paraphrases of the compounds’ meanings, (b) constituent features contributing to the compounds’ meanings, (c) judgements on the hypernymy relations between the compounds and their head constituents, and (d) judgements on the concreteness of the compounds and constituents, before they provided their judgements on the compounds’ degree of compositionality with regard to the respective constituents. This detailed information enables us to relate compositionality judgements to a range of compound and constituent properties.

References:

Sabine Schulte im Walde (2023)
Collecting and Investigating Features of Compositionality Ratings
In: Voula Giouli / Verginica Barbu Mititelu (eds), Multiword Expressions in Lexical Resources. Linguistic, Lexicographic and Computational Perspectives. Berlin: Language Science Press, "Phraseology and Multiword Expressions".

Sabine Schulte im Walde, Anna Hätty, Stefan Bott, Nana Khvtisavrishvili (2016)
Ghost-NN: A Representative Gold Standard of German Noun-Noun Compounds
In: Proceedings of the 10th Conference on Language Resources and Evaluation (LREC). Portorož, Slovenia.

Sabine Schulte im Walde, Stefan Müller, Stephen Roller (2013)
Exploring Vector Space Models to Predict the Compositionality of German Noun-Noun Compounds
In: Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (*SEM). Atlanta, GA.

Claudia von der Heide, Susanne Borgwaldt (2009)
Assoziationen zu Unter-, Basis- und Oberbegriffen. Eine explorative Studie (in German)
In: Proceedings of the 9th Norddeutsches Linguistisches Kolloquium.

Compositionality Ratings for German Particle Verbs

Over the years, we developed two gold standards with compositionality ratings for German particle verbs (PVs). Each of them contains PVs across different particles and was annotated by humans for the degree of compositionality.

PV-99, collected by Silvana Hartmann and Sabine Schulte im Walde:
Hartmann (2008) describes the collection of compositionality judgements for 99 German particle verbs across 11 different preposition particles, and across 8 frequency bands, plus one manually chosen verb per particle (to make sure that interesting ambiguous verbs were included). Four independent judges rated the degree of compositionality of the selected particle verbs between 1 (definitely opaque) and 10 (definitely transparent).
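The size of PV-99 can be reconstructed from the selection design. The short check below assumes that exactly one verb was selected for every combination of particle and frequency band; that reading is an inference from the numbers above, not something stated explicitly.

```python
# Counts taken from the description above.
N_PARTICLES = 11
N_FREQUENCY_BANDS = 8

# Assumption: one verb per particle/frequency-band combination.
sampled = N_PARTICLES * N_FREQUENCY_BANDS

# One additional manually chosen verb per particle.
manual = N_PARTICLES

total = sampled + manual
print(sampled, manual, total)
```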
Ghost-PV, collected by Stefan Bott, Nana Khvtisavrishvili and Sabine Schulte im Walde:
Ghost-PV is a gold standard of 400 randomly selected German particle verbs. It is balanced across several particle types and three frequency bands, and accompanied by human ratings on the degree of semantic compositionality.
Bott and Schulte im Walde (2015) used two preliminary versions of Ghost-PV, containing 354 and 150 PVs.

References:

Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, Sabine Schulte im Walde (2016)
Ghost-PV: A Representative Gold Standard of German Particle Verbs
In: Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex). Osaka, Japan.

Stefan Bott, Sabine Schulte im Walde (2015)
Exploiting Fine-grained Syntactic Transfer Features to Predict the Compositionality of German Particle Verbs
In: Proceedings of the 11th Conference on Computational Semantics (IWCS). London, UK.

Silvana Hartmann (2008)
Einfluss syntaktischer und semantischer Subkategorisierung auf die Kompositionalität von Partikelverben (in German)
Studienarbeit, Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart.
