End-weight effects in verse and language

Lev Blumenfeld*

Studia Metrica et Poetica 3.1, 2016, 7–32
doi: http://dx.doi.org/10.12697/smp.2016.3.1.01

* Author’s address: Lev Blumenfeld, School of Linguistics and Language Studies, Carleton University, SLaLS, 1125 Colonel By Drive, Ottawa, ON, K1S 5B6, email: lev.blumenfeld@carleton.ca.

Abstract: Weight ordering preferences appear to function in opposite directions in verse and language. While linguistic expressions, in both syntax and phonology, typically display a “long-last” effect (Cooper and Ross 1975), stanza forms often show the opposite, “short-last” structure. This effect has been called “saliency” in previous literature (Hayes and MacEachern 1996; Kiparsky 2006). In this paper I address this apparent discrepancy between the behaviour of verse and language. I argue that “saliency” is not a primitive in the theory, but can be derived from more basic mechanisms that allow grouping structure to be signalled, and show that “short-last” structures are optimal under the conditions of metrical verse that possesses parallelism.

Keywords: weight, prosody, stanza

“Why do you always say Joan and Margery yet never Margery and Joan? Do you prefer Joan to her twin sister?” “Not at all, it just sounds smoother”. (Jakobson 1960)

1. Introduction

When linguists study the structure of verse, they typically focus on its properties that are homologous to the properties of language. This interest is natural in the context of the Jakobsonian and generative approaches, which foreground the intimate connection between the rule-governed behaviour of writers, readers, and performers of verse texts with their rule-governed behaviour as speakers of their native language. The claimed connection between verse and language may be superficial, such as the observation that phonological categories relevant to a language’s meter – quantity, weight, stress, or tone – are also the ones relevant to its phonology, or may take a deeper and more abstract view, such as the claim that the building blocks for rules and constraints governing verse are the same as for language.

In this paper I will investigate an effect which at first blush appears to behave in the opposite way in verse and language: the relationship between ordering and weight. Consider the examples in (1). There are many fixed expressions in English such as (1a), where the triad is arranged such that the longest element is last, but not many expressions with a different order (1b). Conversely, a 43 couplet such as (1c) sounds better than the couplet in (1d), rearranged to make it 34.¹

(1) a. the good, the bad, and the ugly
    b. # the ugly, the bad, and the good
    c. Amazing grace! How sweet the sound / that saved a wretch like me.
    d. # Amazing grace! This sound / has saved a bitter wretch like me.

Herein lies the paradox: why does language favour long-last structures, while verse favours short-last structures? In what follows, I will investigate this paradox in more detail, offering a resolution in terms of the grammar of grouping preferences which links short-last effects to meter and parallelism in verse, thus deriving it from more basic notions.

In sum, the argument runs as follows. Sequences of lines form constituents like couplets and stanzas, which we can generally refer to as “groups”. The structure of those groups – i.e. the location of the boundaries between constituents like couplets and stanzas – can be signalled overtly, using strategies well-established in the study of music (Lerdahl and Jackendoff 1983). These strategies reflect constituent structure iconically, using proximity and similarity between elements (elements that are closer together are in a constituent; elements that are more similar are in a constituent).
As I will argue, either short-last or long-last groupings of lines can manifest their boundaries using proximity and similarity. However, only short-last structures are possible in texts set to a fixed metrical template which imposes the additional requirement of parallelism from line to line.

This argument will be unpacked in the following sections. The main difference between the present proposal and the discussion of the short-last effect in previous works (Hayes and MacEachern 1996; Kiparsky 2006) is that the effect is not stipulated but derived from more basic grouping preferences, which are themselves expressions of iconicity. Furthermore, the claim will delimit the empirical range of short-last effects in verse, and will show that the paradox in (1) is a false one: long-last and short-last are distinct effects.

¹ The expression “43 couplet” means ‘a pair of lines such that the first line contains 4 surface beats and the second line contains 3 surface beats’.

This paper is organized as follows. I begin in §2 and §3 by rehearsing the evidence for the paradox exemplified in (1). In §4, I report an experiment probing the relationship between end weight and metricality, confirming the effect from a new angle. In §5, I offer an explanation of short-last in terms of grouping preferences, and explore its consequences in §6.

2. Long-last in language

The long-last effect is known by many names, among them Pāṇini’s Principle, Behaghel’s Fourth Law, and Law of Increasing Terms. It has been observed in a wide variety of circumstances: in idiomatic expressions, in syntactic ordering, and in phonological rhythmic preferences.

Cooper and Ross (1975) produced the first detailed investigation of the effect in English fixed expressions, such as those in (2). They observed that ordering by weight, while in competition with other preferences, often overrides them. A ready example is men and women vs.
ladies and gentlemen, where the gender-based ordering – whatever it is – yields to Pāṇini’s principle. The significance of weight-based ordering was confirmed more recently by Benor and Levy (2006) in a more sophisticated statistical setting. The effect shows up both in the count of the syllables (2a), and in the vowel quality (2b), with low and back vowels acting as longer and heavier than high and front vowels.

(2) a. vim and vigor; hot and heavy; hale and hearty; Tom, Dick, and Harry; free and easy; bag and baggage; ladies and gentlemen; men and women; bread and butter
    b. zig-zag; bric-a-brac; tic-tac-toe; hip-hop; riffraff; King Kong; ding-dong; ping-pong; mishmash

A long-last effect has been known to psychologists of rhythm for more than a century (Bolton 1894; Woodrow 1909, 1951; Fraisse 1974). Stimuli that differ in intensity are grouped as Strong-Weak, while stimuli that differ in length are grouped as Short-Long. In the linguistics context, the phenomenon has been dubbed the “iambic-trochaic law” (Hayes 1985, 1995; Hay and Diehl 2007), where the long-last preference applies at the level of the phonological foot.

On the syntactic side, long-last effects have been thoroughly explored under the name of “Heavy NP shift” in English and other languages (Wasow 1997a, b; Wasow and Arnold 2003; Wasow 2002; Anttila et al. 2010). A variety of syntactic constructions cater to Pāṇini’s principle, such as the dative construction in English (3), where the acceptability of ordering the theme NP after the goal PP increases as the theme NP becomes larger.

(3) a. * I gave [to Kim] [it].
    b. ? I gave [to Kim] [the book].
    c. I gave [to Kim] [the most riveting and whimsical piece of creative fiction I have ever read in my entire life].

This wide variety of long-last effects refers to a diverse set of notions of what counts as ‘long’: vowel quality, vowel length, syllable count, number of words, and syntactic complexity have all been shown to play a role.
The diversity of the surface manifestations of an apparently general effect calls for a general explanation. One possibility is that placing elements in the order of increasing length puts their beginnings as early as possible. There may be an advantage to such arrangements, as beginnings are psycholinguistically important and perceptually salient, and thus long-last structures can efficiently signal constituency. This is true in the general case, but other factors may intervene, such as minimizing the distance between heads of constituents (Hawkins 1990), which sometimes results in long-last syntax.

A further complication is that the connection between such a processing explanation and the effect may be more or less direct depending on the context. Fixed expressions in particular, being memorized wholesale, do not obviously benefit from a better signalling of their constituent structure. The mechanism by which expressions in (2) acquire their long-last structure must thus also appeal to diachrony.

That said, the long-last effect is extremely well-established and robust across a wide variety of circumstances and languages, making each example of the opposite effect an important topic of investigation, to which I turn in the next sections.

3. Short-last in poetry

The preference for short-last in verse can be illustrated statistically by looking at corpora of quatrains. Consider the data for two sets of quatrains in Russian and English: 873 quatrains from Blok’s “Book Three” (1907–1916), and 1023 quatrains by Emily Dickinson. Table (4) shows the mean length of lines (in syllables) in the quatrain, and the masculine/feminine line ending ratio for each line position in the quatrain.² In both corpora, lines II and IV are shorter and more likely to have a masculine ending than lines I and III. This suggests a long-short-long-short pattern for the quatrain.
(4)
            line     I      II     III    IV
Blok        length   9.45   8.11   9.47   8.1
            M/F      0.204  3.198  0.211  4.105
Dickinson   length   7.6    5.96   7.8    5.9
            M/F      1.85   7.1    1.568  3.355

² I have performed these counts automatically. The dictionaries used to generate stress and syllable count information for English and Russian were Hammond (2012) and an electronic copy of Zalizniak (1977), respectively.

The same effect has been found in two studies investigating the structure of quatrains in folk verse, Hayes and MacEachern (1996) and Kiparsky (2006). Both works observe that there is a near-absolute preference for couplets such that the second line is either equal to or shorter than the first, and quatrains whose second couplet has either equal lines, or a long-short structure. The short-last desideratum has been dubbed Saliency by these authors, and is a key component in explaining the distribution of quatrain types.

So far, the short-last effect has been observed only in actual verse. Next I turn to a confirmation of the effect outside of verse corpora.

4. Confirming the phenomenon

There appears to be an interaction between end-weight preferences and metricality: metrical structures favour short-last, while non-metrical structures favour long-last. Cooper and Ross (1975: 78) have noticed that there are some short-last sequences like hickory-dickory-dock, clackety-clack, blankety-blank, and hint that their structure might have something to do with the regular distribution of stressed and unstressed syllables, but do not investigate that effect any further. In order to test whether short-last structures are judged as more well-formed when they are rhythmical, I conducted a rating study, on which I report here.

I constructed stimuli using two cross-cutting parameters: short-last vs. long-last, and rhythmic vs. unrhythmic, according to two different schemas that are illustrated in tables (5) and (6).
For each schema, ‘rhythmic’ tokens such as mát-balloón-laságna or veránda-Frída-sítter had a regular alternation between stressed and unstressed syllables, while ‘unrhythmic’ tokens like trúck-brunétte-médicine or Lárry-génder-abándon had either a stress clash or a stress lapse or both. In the examples in tables (5) and (6), the clashes and lapses are underlined.

There were 10 examples in each condition, for a total of 40 stimuli per trial (i.e. per table (5) and (6)) presented in random order to 26 and 22 native English speaker subjects in the two trials, respectively. Participants were asked to rate on a scale of 1–7 each stimulus based on how smooth or fluent it sounds, or “how it rolls off the tongue”. They were told that the phrases are meant to be meaningless sequences of words, and were instructed to ignore the semantics. The responses were transformed into z-scores by subject. The data from the two trials were pooled together.³

³ When tested individually, the trials do not have relevant differences, despite some differences in stimuli construction (stress clash vs. stress lapse). Also, in trial 2 but not trial 1, I avoided consonant clusters across word boundaries, so words followed by other words ended in syllabic segments, but this did not seem to have an effect on the ratings.

The main results are shown in Figure 1a. The effect of rhythm on score is significant as determined by ANOVA: sequences with ‘good’ rhythm get better scores (F(1,1918) = 112.3; p < 0.0001). More surprisingly, there is a general short-last effect: sequences ending in shorter words get higher scores (F(1,1918) = 60.873; p < 0.0001). The hypothesis, however, is about the interaction of the two factors: does short-last give greater benefit to rhythmic than to unrhythmic lines? The interaction plot is shown in Figure 1b. Two-factor ANOVA shows that the interaction was significant (F(1,1916) = 10.23; p < 0.0015).

Figure 1. Main results.
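As a rough illustration of the per-subject z-transformation mentioned above, the following sketch uses only the Python standard library. It is not the analysis script used in the study: the ratings are invented, and the choice of the sample standard deviation is an assumption, since the text does not say which estimator was used. The last lines check that the degrees of freedom reported for the F statistics are consistent with 40 stimuli rated by 26 + 22 subjects.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_subject(ratings):
    """Convert raw ratings to z-scores computed separately for each subject.

    `ratings` is a list of (subject, stimulus, score) tuples.
    Assumption: the sample standard deviation (n - 1) is used.
    """
    scores = defaultdict(list)
    for subject, _stimulus, score in ratings:
        scores[subject].append(score)
    # Per-subject mean and standard deviation.
    stats = {s: (mean(v), stdev(v)) for s, v in scores.items()}
    return [
        (subject, stimulus, (score - stats[subject][0]) / stats[subject][1])
        for subject, stimulus, score in ratings
    ]


# Invented ratings on the 1-7 scale: subject A uses the whole scale,
# subject B only its upper end.
raw = [
    ("A", "s1", 1), ("A", "s2", 4), ("A", "s3", 7),
    ("B", "s1", 5), ("B", "s2", 6), ("B", "s3", 7),
]
z = zscore_by_subject(raw)
z_values = [round(value, 6) for _, _, value in z]

# Degrees of freedom implied by the design described in the text.
n_observations = 40 * (26 + 22)        # 40 stimuli per subject, 48 subjects
df_single_factor = n_observations - 2  # one two-level factor
df_interaction = n_observations - 4    # two two-level factors plus interaction
```

After the transformation the two invented subjects receive identical scores (-1, 0, 1), which is the point of normalizing by subject. The residual degrees of freedom come out as 1918 and 1916, matching the values in the reported F statistics.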
In other words, when a well-formed rhythmic structure induces the perception of a metrical template, subjects appear to prefer stimuli with final empty beats, and/or with final short words. This effect can be thought of as equivalent to Saliency observed in verse corpora if each word is an analog of a line.

5. Resolving the paradox: couplets

5.1. Introduction

The opposite directions of weight preferences in verse and language call for an explanation. In previous literature, Saliency was either stipulated, or explained in terms of phonetic effects. For example, Hayes and MacEachern (1996: 483) relate saliency to the tendency to slow down at the ends of prosodic constituents. Such prosodic final lengthening (Beckman and Edwards 1990) makes constituent-final vowels, syllables, and words longer than medial ones. Shorter lines, Hayes and MacEachern argue, allow the extension of the final syllable over a longer portion of the metrical template, inducing a perception of final lengthening, which, they claim, “is itself a cue to phrasehood, that is, a kind of constituency marker”. The amount of final empty space at the end of the line that allows for this mimicry of final lengthening is what makes lines appear cadential, and thus optimal for couplet- and quatrain-final position.

This explanation in terms of final lengthening is incomplete, because if the facts were the other way around – i.e. final lines were longer than non-final lines – final lengthening could also explain the same phenomenon. A longer final line could serve as “a cue to phrasehood” and a “constituency marker” just as a short final line.
Furthermore, the explanation in fact does not depend on the verse text being metrical, or having parallel rhythmic structure from line to line: any pair of lines such that the second one is shorter falls under the explanation, because in such cases just as in the metrical cases the final syllable can be extended over a longer time span in imitation of phonetic phrase-final lengthening.

I will suggest below that while a key component of Hayes and MacEachern’s explanation is on the right track, their proposal is incomplete. My own proposal will attempt to connect short-last with two key properties of verse texts: grouping and parallelism. This will not only derive Saliency from more basic notions, but will limit the empirical domain in which it is expected to be observed.

Metrical verse possesses two layers of structure: the linguistic representation and the metrical constituency. The latter includes the structure both within the line (feet and metrical positions), and larger constituents such as lines, couplets, and stanzas. Let us assume that there is a desideratum to signal constituent structure at these higher levels – that is, to signal the location of stanza boundaries in some way. Stanzas consist of sequences of lines, and represent groupings of such lines into higher-level units.

One area where grouping preferences have been worked out is the generative theory of music (GTTM; Lerdahl and Jackendoff 1983), where grouping well-formedness is part of a set of constraints governing the structure of a musical representation. While these grouping preferences are designed for a theory of music, they are more general, and can be harnessed to describe grouping perception in stanzas.

5.2. How grouping is signalled in music

Grouping can be signalled in several independent ways, of which the most important are proximity and similarity.
Such preferences have been defined in GTTM as follows (GPR stands for “Grouping preference rule”).

(7) GPR 2 (Proximity)
    Consider a sequence of four notes n₁n₂n₃n₄. All else being equal, the transition n₂ − n₃ may be heard as a group boundary if
    a. (Slur/Rest) the interval of time from the end of n₂ to the beginning of n₃ is greater than that from the end of n₁ to the beginning of n₂ and that from the end of n₃ to the beginning of n₄, or if
    b. (Attack-point) the interval of time between the attack points of n₂ and n₃ is greater than that between the attack points of n₁ and n₂ and that between the attack points of n₃ and n₄.
    (Lerdahl and Jackendoff 1983: 44)

The preference rule in (7) expresses the common-sense notion that grouping is iconic: elements that are closer together are perceived as a group, and a break in a sequence of elements can induce the perception of a grouping boundary. Proximity can be established from the end of the last element to the beginning of the next (7a), or from the beginning of the last to the beginning of the next (7b). Lerdahl and Jackendoff show that both ‘end-to-beginning’ (EB) and ‘beginning-to-beginning’ (BB) proximity play a role in musical grammar.

Proximity is not the only possible way of signalling grouping – the other is similarity. Again the following formal statement (8) expresses the iconic nature of grouping: pairs of elements separated by a boundary are more different than pairs of elements within a group.

(8) GPR 3 (Change), adapted⁴
    Consider a sequence of notes n₁n₂n₃n₄. All else being equal, the transition n₂ − n₃ may be heard as a group boundary if it involves a greater change than the transitions n₁ − n₂ or n₃ − n₄.
    (Lerdahl and Jackendoff 1983: 46)

The “change” in (8) can be defined in terms of any characteristics such as dynamics, register, duration, etc.

GPR 3, like other grouping preferences, does not fully determine the grouping structure.
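For concreteness, the preference rules in (7) and (8) can be sketched as simple predicates over four notes. This is only an illustrative reading of the rules, not an implementation taken from GTTM: the `Note` type, its field names, and the use of a single scalar `value` to stand in for “any characteristic” are assumptions made for the example.

```python
from typing import NamedTuple


class Note(NamedTuple):
    onset: float        # attack point
    offset: float       # time at which the note stops sounding
    value: float = 0.0  # any scalar property (pitch, loudness, ...)


def gpr2a(n1, n2, n3, n4):
    """Slur/rest: the silent gap between n2 and n3 exceeds both neighbouring gaps."""
    def gap(a, b):
        return b.onset - a.offset
    return gap(n2, n3) > gap(n1, n2) and gap(n2, n3) > gap(n3, n4)


def gpr2b(n1, n2, n3, n4):
    """Attack-point: the onset-to-onset interval n2-n3 exceeds both neighbours."""
    def interval(a, b):
        return b.onset - a.onset
    return interval(n2, n3) > interval(n1, n2) and interval(n2, n3) > interval(n3, n4)


def gpr3(n1, n2, n3, n4):
    """Change, in the adapted form (8): greater than either neighbouring transition."""
    def change(a, b):
        return abs(b.value - a.value)
    return change(n2, n3) > change(n1, n2) or change(n2, n3) > change(n3, n4)


# A rest after n2: both versions of proximity detect the boundary.
with_rest = [Note(0, 1), Note(1, 2), Note(3, 4), Note(4, 5)]
# A long n2 and no rest: only attack-point proximity detects it.
long_note = [Note(0, 1), Note(1, 3), Note(3, 4), Note(4, 5)]
# Equal timing, but a jump in `value` between n2 and n3.
jump = [Note(0, 1, 1), Note(1, 2, 1), Note(2, 3, 5), Note(3, 4, 5)]
```

Calling `gpr2a(*long_note)` and `gpr2b(*long_note)` shows how the two versions of proximity can disagree on the same sequence, which is the distinction that matters for verse lines later on.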
Consider a sequence like aaabaaa, where a and b are notes. Excluding the grouping (aaa)(b)(aaa), where one group consists of a single element (specifically precluded by GPR 1, Lerdahl and Jackendoff 1983: 43), two possible groupings are consistent with GPR 3: (aaa)(baaa), and (aaab)(aaa), depending on whether b initiates the second group or concludes the first. Lerdahl and Jackendoff note that the choice of grouping depends on the nature of the differences between b and a, and suggest that a fuller theory might contain a ranking of different versions of GPRs.

A relevant factor is the location of the difference between b and a. If the difference is perceptible at the onset of b, as would be the case with pitch or dynamics, the information on boundary would be immediately available to the listener. In such a case b could be perceived as initiating a new group, and a boundary would thus be perceived before b. On the other hand, the difference between a full note and a staccato note is not available at the onset of b, and that might induce a perception of the boundary after b rather than before it.

Proximity and Change are very general grouping preferences – they simply express iconicity, the notion that elements within a group are closer to each other than elements outside of the group, and more similar to each other than elements outside of the group. These general ideas can be applied to verse lines, to which I turn in the following sections.

5.3. Proximity in verse lines

Verse lines can be thought of as analogs to notes, and thus can be subject to the preferences like those expressed in (7) and (8).

⁴ This GPR was adapted by removing some music-specific detail, and replacing “and” with “or” in the last clause.
Without this change, the preference rule will not literally apply to groupings like (aaab)(aaa), on which see below: the transition across the grouping boundary, ba, is not necessarily greater than the transition between the penultimate and final elements of the first group, ab.

First consider proximity. As in (7), we might entertain two versions: end-to-beginning (slur/rest), and beginning-to-beginning (attack-point). Which version of proximity applies depends on the realization of the lines. To appreciate how this works more specifically, we need to visualize the possible rhythmic scenarios.

I will use the following schematic representation to illustrate the ways that various possible configurations signal or fail to signal couplet boundaries. Each circle represents a beat. A filled circle (•) stands for a realized beat; an empty one (∘) stands for a beat present in the template but not realized – in musical terms, a pause. So, a four-beat line can be represented as ••••, a three-beat line (with a final empty beat) as •••∘, etc.

Now let me illustrate how proximity applies to sequences of lines, starting with end-to-beginning proximity. Consider a sequence of 43 couplets, where the second line has an empty final beat, as follows. “Amazing grace” (1c) can serve as an example, repeated below.⁵

⁵ Of course, this hymn is written in quatrains, not couplets. Quatrains, however, are not sequences of four lines, but of two couplets, and thus analysis at the level of the couplet is legitimate, as was assumed in previous work (Hayes and MacEachern 1996; Kiparsky 2006).

Now consider the distance, in beats, from the end of each line to the beginning of the following line. Clearly, the end-to-beginning distance within a couplet (in line pairs 1–2, 3–4, 5–6, 7–8) is shorter than across a couplet boundary (in line pairs 2–3, 4–5, 6–7).
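The comparison of distances can be checked mechanically. The sketch below is an illustration only: it writes a realized beat (•) as "x" and an empty beat (∘) as "o", and it adopts one possible convention for end-to-beginning distance, namely the number of empty beats between the last realized beat of one line and the first realized beat of the next. The function names are hypothetical.

```python
def eb_distance(line, next_line):
    """End-to-beginning distance: empty beats between the last realized beat
    of `line` and the first realized beat of `next_line`."""
    trailing = len(line) - len(line.rstrip("o"))
    leading = len(next_line) - len(next_line.lstrip("o"))
    return trailing + leading


def bb_distance(line):
    """Beginning-to-beginning distance to the next line, assuming every line
    starts on a realized beat: simply the template length of `line`."""
    return len(line)


def signals_couplets(distances):
    """True if every distance across a couplet boundary exceeds every distance
    within a couplet. Pairs 1-2, 3-4, ... (even indices) are couplet-internal."""
    within = distances[0::2]
    across = distances[1::2]
    return min(across) > max(within)


# Eight lines in 43 couplets: "x" is a realized beat, "o" an empty one.
lines = ["xxxx", "xxxo"] * 4
eb = [eb_distance(a, b) for a, b in zip(lines, lines[1:])]
bb = [bb_distance(a) for a in lines[:-1]]
```

For this sequence `eb` alternates between 0 (within a couplet) and 1 (across a boundary), while `bb` is 4 throughout, so under these assumptions only the end-to-beginning measure distinguishes the couplet boundaries.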
This means that the couplet boundaries in (9) are signalled by end-to-beginning proximity (7a): the interval across a boundary is greater than within a couplet.

To see the effect of beginning-to-beginning proximity, consider a sequence of 34 couplets, without empty beats. (Such a sequence is implausible in sung verse, because it does not respect parallelism, on which see below; here it simply serves to illustrate the application of proximity.)

Let us compute the distance, in beats, from the beginning of line n to the beginning of line n+1. For line pairs within a couplet (pairs 1–2, 3–4, 5–6, 7–8), this distance is 3, equal to the length of the first of the two lines in the pair. For line pairs across a couplet boundary (pairs 2–3, 4–5, 6–7), this distance is 4. This means that the couplet boundaries in (12) are signalled by beginning-to-beginning proximity (7b).

Note that which version of proximity applies to which case depends on the structure of the examples: end-to-beginning proximity signals the boundary in (9) but not (12), and vice versa for beginning-to-beginning proximity.

5.4. Change in verse lines

Now consider how Change (8) might apply to sequences of verse lines. A general consequence of (8) is that lines that are in some respect different – for example, shorter – than other lines might induce the perception of a boundary in their vicinity.

Recall from the discussion of (8) in §5.2 that a sequence like aaabaaa is compatible with two groupings, (aaa)(baaa) and (aaab)(aaa). Suppose a and b are verse lines of different length, e.g. a could be •••• and b could be •••∘. Then the sequence aaabaaa can be represented as follows, with | showing line boundaries, for clarity.

Clearly, in such a sequence the shorter line b (•••∘) can only conclude, not initiate a group of lines, i.e. only (aaab)(aaa) and not (aaa)(baaa) is possible.
The reason is that the information on b’s difference from a is not available until the end of b. Thus, a structure like (aaa)(baaa), with the shorter line •••∘ initiating the second stanza, would create a processing garden path: a listener would not know that b initiated a group until the end of b, and would have to repair an incorrectly perceived constituent structure.

For this reason, in the discussion below I will only consider differences at ends of lines, and in structures of the type aaabaaa, Change will be interpreted to induce only the grouping (aaab)(aaa).⁶

⁶ For some reason, poetry generally prefers to mark ends rather than beginnings of groups. For example, there is a near-universal tendency for line endings to be metrically stricter than line beginnings. See also Smith (1968) for a broader perspective on endings in poetry. This curious asymmetry between beginnings and ends, while obviously related to the topic at hand, is beyond the scope of this study.

In the case of the particular example (14), it appears that Change is in itself an explanation for short-last effects: placing the shorter element (•••∘) first in a constituent creates a garden path, unlike placing it last. This, however, is not yet a full explanation of the effect, because in other configurations, the opposite holds. Consider the following structure, where a is •••, and b is ••••.

Here, Change once again is only compatible with (aaab)(aaa), because just as in (14), in (15) the information on the difference between b and a is not available until the end of the lines, and thus the same garden path argument applies. This time, however, the structure it creates is a long-last one, not a short-last one.

An explanation of the short-last effect thus cannot be found in Change alone. In the next section I turn to a more complete view of the logically possible line couplets, and the application of Proximity and Change to them.

5.5. Typology of couplets

In the preceding two sections I showed how the grouping structure of a sequence of verse lines can be signalled through the general strategies of Proximity and Change. Next I take a more systematic look at all the logical possibilities of beat arrangement in couplets, and illustrate how the grouping structure is or is not signalled in them.

Consider couplets consisting of two lines, where each line is a sequence of beats. The beats can either be filled (•) or unfilled (∘). Lines can differ in their length in terms of filled or unfilled beats. A line like •••∘, consisting of four beats only three of which are filled, can be said to have four beats in the “template”, and three beats in the realization.

There are thus the following possibilities: the second line is longer in its template (16a); second line is longer in its surface realization (16b); second line is shorter in realization (16c); second line is shorter in its template (16d). There is yet another possibility that combines the preceding ones, viz. where the second line’s template is longer than the first, but its realization is shorter (16e). Also, it is possible for the second line to be longer than the first by virtue of the first containing an empty final beat (16f). Finally, the two lines can be identical both in their surface and underlying structures, whether or not they have a final empty beat (16g, 16h). I consider impossible structures where the realization is longer than the template, e.g. five surface beats filling a four-beat template.

The illustrations below show two couplets in sequence, to demonstrate both couplet-internal and between-couplet boundaries. In the following, a vertical line (|) represents a line boundary; two vertical lines (∥) represent a couplet boundary. Thus, a 44 couplet can be represented as ∥••••|••••∥, a 43 couplet with a final empty beat as ∥••••|•••∘∥, etc.

The following table summarizes these options.
The numbers refer to the number of beats in the final lines. Structures in shaded cells are impossible.

Each of the types in (16) signals the couplet boundary in a different way. The types where the second line’s template is longer than the first (16a, 16b, 16e) satisfy beginning-to-beginning proximity: the initial beats of their couplet-final lines are further from the initial beats of the following couplet-initial lines than are initial beats of couplet-internal line pairs. The types where an empty beat separates the second line of a couplet from the following initial line (16a, 16c, 16e) satisfy end-to-beginning proximity. The types where the second line has a different number of realized beats (16b, 16c, 16d, 16e) satisfy Change.

The types (16d) and (16f) fail to satisfy either of the versions of Proximity. Moreover, in (16d) beginning-to-beginning proximity prevents the perception of the grouping boundary in the desired place, because the initial beats of couplet-final lines are closer to the initial beats of following lines than in the case of couplet-internal line pairs. The same is true of end-to-beginning proximity in (16f).

In other words, both short-last and long-last arrangements are useful strategies of signalling constituency boundaries in terms of proximity. Thus, on its face it is not possible to determine which of these types is ‘optimal’. Lerdahl and Jackendoff point out that a fuller theory of grouping might impose a ranking between the different desiderata. However, no such theory has been worked out, and in any case it is not clear that the ranking that works for music would carry over to verse lines.

5.6. Parallelism

Fortunately, one of the types in (16) can be selected once we take into account another desideratum of metrical quatrains: Parallelism. Lines prefer to have identical metrical structures.
Parallelism can be stated at the underlying level of the metrical or musical template, requiring the metrical space (e.g. number of beats) that is available to each line to be identical, regardless of how a line actually occupies that space (18a). Or, parallelism can be required of surface structures, preferring those where all lines have the same number of surface metrical or musical beats (18b). The ‘domain’ in the definitions below in our case is the couplet.⁷

(18) a. UNDERLYING PARALLELISM (UP): The metrical/musical templates of all lines within some domain are identical.
     b. SURFACE PARALLELISM (SP): The surface realizations of all lines within some domain occupy an identical number of beats.

The perfectly parallel examples are (16g, 16h), where the two lines are equal in surface and underlying realization. All of the other cases in (16) have imperfect parallelism, either because their lines have a different number of surface beats (16c, 16f), or because the metrical space occupied by the second line is longer or shorter than that occupied by the first (16a), or both (16b, 16d, 16e).

The following table summarizes how each of the three couplet boundary types in (16) fares on each of the desiderata – proximity, change, and parallelism. In the column labels, EB, BB stand for end-to-beginning and beginning-to-beginning proximity; Ch stands for Change; UP and SP stand for underlying and surface parallelism. A check mark (✓) in a cell means that the column’s desideratum is satisfied by the row’s structure, i.e. that the second line of the couplet is treated as couplet-final by that preference. A blank cell means that the desideratum does not treat the line boundary across couplets differently from the line boundaries within couplets; in such cases, the preference contributes nothing to grouping perception.
Finally, a cross (×) means that the desideratum selects the boundary incorrectly; this happens only with the pathological behaviour of BB proximity in the (16d) type, and with the 34 type (16f).

7 Metrical parallelism loosely corresponds to Lerdahl and Jackendoff’s GPR 6 (1983: 51), which calls for parallel grouping structures of parallel musical passages.

The desiderata expressed by columns can be thought of as grammatical preferences, or constraints: a structure is less marked, or more harmonic to use Optimality-Theoretic jargon (Prince and Smolensky 2004 [1993]), the more checkmarks it has in its row. None of the types in the table (19) are perfect. In fact, the preferences are incompatible: one cannot satisfy Change and Surface Parallelism at the same time, for example. If table (19) is treated as an OT tableau, with each column containing a constraint and each row a candidate, then the outcome will depend on the ranking of the constraints, and any candidate except (16b), (16d), and (16f) is a potential winner.8

However, the table above is not an OT tableau, because in the context of sung verse, not all preferences are equal. In particular, underlying parallelism in a musical setting is non-negotiable: the musical meter normally contains a fixed number of beats. There are four types in (19) that satisfy UP: the 43 short-last type (16c), the 34 long-last type (16f), and the two fully parallel types, 44 and 33 (16g, 16h). Of these, only one – the short-last type – satisfies Proximity. For the other types, Proximity is either silent or selects the wrong grouping boundary. Thus, with a privileged view of underlying parallelism, the short-last type is the only one that signals constituent boundaries via Proximity. To put it another way, while both long-last and short-last structures can satisfy Proximity, only short-last also satisfies parallelism.
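The two claims above, that only (16b), (16d), and (16f) can never win and that (16c) is the only underlyingly parallel type whose boundary Proximity signals, can be checked mechanically. The following Python sketch is an illustration rather than part of the original analysis. The satisfaction profiles are transcribed from the prose description of the types, not from the table itself; the scoring (+1 for a check mark, 0 for a blank, −1 for a cross) and all names (`profile`, `bounds`, `TABLE`) are assumptions introduced here. Only simple harmonic bounding by a single competitor is tested.

```python
# Illustrative sketch: profiles are transcribed from the prose, and the
# numeric scoring is an assumption (+1 check mark, 0 blank, -1 cross).
CONSTRAINTS = ("EB", "BB", "Ch", "UP", "SP")


def profile(checks, crosses=()):
    """Build a constraint -> score mapping for one couplet type."""
    return {c: 1 if c in checks else -1 if c in crosses else 0
            for c in CONSTRAINTS}


# (16f) is left unmarked for Change because the prose list of types
# satisfying Change omits it; marking it would not alter the sets below.
TABLE = {
    "16a": profile({"EB", "BB", "SP"}),
    "16b": profile({"BB", "Ch"}),
    "16c": profile({"EB", "Ch", "UP"}),
    "16d": profile({"Ch"}, crosses={"BB"}),
    "16e": profile({"EB", "BB", "Ch"}),
    "16f": profile({"UP"}, crosses={"EB"}),
    "16g": profile({"UP", "SP"}),
    "16h": profile({"UP", "SP"}),
}


def bounds(y, x):
    """True if y is at least as good as x everywhere and better somewhere."""
    return (all(y[c] >= x[c] for c in CONSTRAINTS)
            and any(y[c] > x[c] for c in CONSTRAINTS))


# Candidates that lose to some single competitor under every ranking.
bounded = {x for x in TABLE
           if any(bounds(TABLE[y], TABLE[x]) for y in TABLE if y != x)}
potential_winners = sorted(set(TABLE) - bounded)

# Treat underlying parallelism as non-negotiable, then ask which of the
# remaining types have their boundary signalled by either Proximity version.
up_types = [t for t, p in TABLE.items() if p["UP"] == 1]
signalled = [t for t in up_types
             if TABLE[t]["EB"] == 1 or TABLE[t]["BB"] == 1]

print(sorted(bounded))
print(potential_winners)
print(signalled)
```

Under these assumptions the bounded set comes out as (16b), (16d), and (16f), and the only type that survives the UP filter with a Proximity check mark is (16c).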
8 The long-last candidate (16b) is harmonically bounded by (16e); the short-last, no-empty-beat candidate (16d) is harmonically bounded by (16b), (16c), and (16e); the 34 candidate (16f) is harmonically bounded by (16c). All other candidates can win under some ranking. (16a) wins if {EB,BB,SP} ≫ {CH,UP}; (16c) wins if {EB,CH,UP} ≫ {BB,SP}; (16e) wins under the ranking shown in (19); and either (16g) or (16h) wins if both of the parallelism constraints are undominated. In this evaluation, violations are interpreted positively rather than negatively.

Saliency has thus been reduced to the more basic strategies of signalling constituency.

However, there is yet another line of attack. If the two constraints on proximity and the two constraints on parallelism are interpreted as disjunctions of a single constraint, then the picture becomes much simpler.9

(20) a. PROXIMITY: Either BB or EB proximity is satisfied.
     b. PARALLELISM: Either SP or UP is satisfied.

The following table implements this unification of proximity and parallelism into single constraints. It repeats (19), placing a mark in the unified column if there is a checkmark in either of the subcolumns of (19). In this new table (21) one structure stands out: the short-last option (16c) is the only one that contains a checkmark in every column – the only one that satisfies all three preferences.

Now, which of the types examined above are actually attested? According to both Hayes and MacEachern as well as Kiparsky, only 44, 43, and 33 occur with any significant frequency as couplet components of quatrains. These three types form a natural class in (19): they are the types where underlying parallelism is satisfied, and where there are no ×s in any cells. In other words, of the underlyingly parallel couplets, only those are attested where the grouping boundary is not mis-signalled by Proximity, i.e.
where the structure does not suggest that there is a boundary between the first and second lines of a couplet. Once again, Saliency falls out of something more basic.

9 Although, as I said above, the tables here are not OT tableaux, it is worth noting that such a disjunctive definition is not strange to standard OT constraints; cf. FtBin, which is normally defined as foot binarity at either syllabic or moraic levels (e.g. Kager 1999: 156).

In the next subsection I will apply these ideas to the structure of quatrains.

5.7. Quatrains

Quatrains are pairs of couplets; the grouping structure of the four lines of a quatrain is [[Line1 Line2][Line3 Line4]]. The boundary between the second and third lines is a couplet boundary, while the boundary after the last line of a quatrain and the first line of the following is both a couplet and a quatrain boundary. Thus, if the grouping structure in a quatrain sequence is to be signalled efficiently, it must be the case that the inter-quatrain boundary (call it the 4–1 boundary) is signalled more strongly than the intra-quatrain boundary (the 2–3 boundary).

(22) PROXIMITY: is there a 4–1 boundary, and if so, is it stronger than the 2–3 boundary?

Given the three attested couplet types (44, 43, and 33), there are nine quatrain types. They are listed below, along with their performance on the two preferences. A checkmark (✓) means that the 4–1 boundary is present; a plus sign means that it is also stronger than the 2–3 boundary. A blank cell means that the preference says nothing about the presence of a group boundary. A cross mark (×) indicates that the preference places the boundary in a wrong place. In the table, the hand (☞) before a quatrain type indicates that it is attested in the corpora examined in the literature.

Underlying parallelism is assumed to be satisfied. There is also a quatrain-level surface parallelism preference, defined below.
(23) Q-PARALLELISM: The two couplets of a quatrain have identical surface metrical structure.

Here, the attested types form a natural class: they are those without an ×, i.e. those where Proximity does not mis-signal the quatrain boundary. Another way of summarizing the facts is that the only attested non-parallel quatrains are those with a + in the Proximity column: parallelism between couplets in a quatrain is broken only if there is something to gain for signalling the quatrain boundary.

Once again, I emphasize that the tables (19), (21), and (24) are not OT tableaux. Adding other reasonable constraints, such as RealizeBeat (or *∘) will generate a different factorial typology and a different set of outcomes. For example, with {*∘, Prox} undominated, the undesired long-last candidate (16b) can win: it satisfies BB proximity without missing any beats. Rather, the logic of the above discussion is that Saliency, or short-last, can be made sense of in terms of the more general desiderata that have to do with making the grouping structure explicit, i.e. Saliency can be stated in terms of those patterns.

Short-last structures of a particular type – with a final missing beat – are an efficient way of signalling grouping structure while satisfying metrical parallelism. There are other desiderata which they are not efficient at satisfying – e.g. the preference not to have empty beats. But it is not incumbent on this discussion to explain why out of the universe of possible preferences, it is Proximity and Parallelism that seem to have such sway in chanted verse. The goal here was more narrow: to reduce Saliency to more basic notions. Those notions were the grouping structure possessed by metrical verse. The paradox exemplified in (1) is thus an illusion: long-last in language and short-last in verse operate at different levels, the former on syntax and phonology, and the latter on meter.

6.
Additional evidence

The account of the short-last effect, which sees it as signalling grouping structure while preserving parallelism, empirically links the effect to two concrete properties of texts: grouping and parallelism. The account makes a prediction: the short-last effect should be absent when either grouping or parallelism is not at issue. This prediction is not made by simply stipulating a ‘saliency’ advantage to short-last structures without any connection with other properties. For example, the suggestion of Hayes and MacEachern (1996) that shorter final lines imitate phonetic final lengthening does not make such a connection to specific properties of texts.

As both anonymous reviewers of this paper point out, there are many examples of stanzas that have longer final lines – the Spenserian stanza is one well-known case, where the last line contains six rather than five beats; there are many others in various traditions in the world. There is, however, a strong tendency toward short-last in texts where parallelism is overt – that is, where the template is realized in the musical beat, and this agrees with the specific prediction of the present proposal that short-last is linked with metrical parallelism. The same reviewer argues convincingly that the corpus of the world’s metrical texts is too heterogeneous and too complex to be examined efficiently with respect to the predictions of my proposal, and thus the typological work is left for another day. A general prediction, however, is clear: short-last structures should be the more likely the more overt parallel metrical structure is. It should be strongest in sung verse, then perhaps in children’s verse which is often recited in a metrically explicit way; it should be weaker in more cultivated art verse.
The converse of the prediction is that texts that are not metrical at all, and thus where there is no notion of missing ‘beats’, and where the issue of parallelism does not arise, should not display short-last effects. As an example of non-metrical poetry, I examined the long-line poems from The Pocket Book of Ogden Nash (1962). There were 83 such poems. I counted the syllables of the final line pairs in all such poems. The second line of the pair is on average 3.59 syllables longer than the first (mean length difference significantly differs from 0; t(82) = 2.082; p < 0.05). Thus, Ogden Nash displays a long-last rather than a short-last effect in final line pairs.

Another source of evidence is verse texts that are metrical, but whose meter does not impose a periodic template. In such cases, there is a notion of ‘empty beat’, but no coherent notion of parallelism. I considered A. S. Griboedov’s Gore ot uma (Woe from wit), a play written in rhymed iambic lines whose length varies unpredictably between 1 and 6 feet. I counted the average difference between two lines such that the second line but not the first line ends a sentence, and there is no sentence break within the second line. Testing the first 62 line pairs of the play meeting these conditions, the mean difference between the foot count of the two lines was 0.0323, which is not significantly different from 0 (t(61) = 0.1859; p > 0.85). There were 20 long-last, 19 short-last, and 23 equal pairs.

I have also counted all scene-final line pairs, comparing two final lines of each scene of the play (54 line pairs). Here, the numbers are not as equal, but still do not show a clear short-last effect. The mean difference in the number of feet was −0.259, which is not statistically different from 0 (t(53) = −1.459; p > 0.15). There were 13 long-last, 23 short-last, and 18 equal line pairs; the difference between 13 and 23 is not significant on a binomial test (p > 0.13).
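The binomial comparison of 13 long-last against 23 short-last pairs can be reproduced from the counts alone. The sketch below is an illustration, not the author's own computation: it assumes a two-sided exact test against equal probabilities for the two outcomes, with the 18 equal pairs set aside, and the function name is hypothetical.

```python
from math import comb


def two_sided_sign_test(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials, p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided value is
    twice the probability of the smaller tail, capped at 1.
    """
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Scene-final pairs in the play: 13 long-last and 23 short-last
# (the 18 equal pairs are excluded from the comparison).
long_last, short_last = 13, 23
p_value = two_sided_sign_test(long_last, long_last + short_last)
print(round(p_value, 4))
```

Under these assumptions the value comes out just above 0.13, in line with the reported p > 0.13.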
Finally, I considered the converse situation: a text with a periodic structure, but without a metrical structure that involves the notion of ‘beat’. W. H. Auden’s Age of anxiety is written in 9-syllable lines in two alliterating halves, usually 4+5 or 5+4, occasionally 6+3 or 3+6. Thus, while the lines have a repeating template, it is not based on metrical beats. If the lines display a short-last effect, we would expect their second halves to be shorter than the first. In the first 128 lines, 67 are short-long, and 61 long-short lines – a non-significant difference. The second half of each line is 0.031 syllables longer than the first half (also not significant; t(127) = 0.253; p > 0.8).

The three cases where a short-last effect is absent are exactly what one would expect if the effect is a consequence of grouping and parallelism. The evidence is, of course, negative – nothing precludes a future discovery of a non-metrical text with a short-last effect, and that would weaken my proposal, because a second source of Saliency will have to be sought outside of parallelism and grouping. The additional arguments in this section thus must be taken with the usual caveat that accompanies such negative evidence.10 However, the discussion in this paper makes it clear that a careful look at the typology of short-last and long-last structures is needed.

7. Conclusion

I started with the observation that with respect to weight and ordering, verse and language show opposite behaviour: language prefers long-last structures while verse prefers short-last structures. In the paper I argued that the short-last effect in verse is a means of signalling the additional layer of constituency that verse possesses compared to language. Thus, the mismatch between verse and language appears to be a false paradox.

I conclude by mentioning another relevant set of data that shows the complex interaction that grouping preferences can enter into.
The so-called Butz triads form a familiar rhythmic scheme, with two short items and a longer one. Sugar, and spice, and all things nice; Slugs, and snails, and puppy-dogs’ tails are some of the more innocuous ones; the reader is invited to cull Morgan (1983) for less appropriate examples.11 Although these figures are called “triads”, their common structure also resembles a quatrain, if the last element of the triad is thought of as a pair. In terms of beats, the rhythm can be thought of as 1121, e.g. Slugs (1) and snails (1) and puppy-dogs’ (2) tails (1). In this way, the Butz triad resembles the short-meter quatrain (3343).

10 The argument also comes with the caution that the evidence comes from statistics over a corpus.

11 The pattern is not limited to English; cf. two common Russian triads: i švéc, i žnéc, i na dudé igréc ‘a sewer, a reaper, and a player on the pipe’; kefír, zefír, i tjóplyj sortír ‘kefir, zefir (a confectionary), and a warm loo’.

Morgan (1983: 50) pointed out that these triads combine apparently incompatible rhythms: “It is the reconciliation of a triple pattern to a duple pattern… It establishes the three-beat pattern, and extends with a fourth beat to accommodate the duple rhythm”.

However, the two properties that are combined here are not in fact incompatible. Rather, the rhythmic perfection of these triads lies in the fact that they satisfy both kinds of end-weight preferences at the same time. At the level of the syntax, they follow the long-last pattern ([Slugs][and snails][and puppy-dogs’ tails]), but at the level of rhythm, they are short-last, 1121. This embedding of a short-last structure inside a long-last one shows that the two end-weight preferences are distinct, and not in fact incompatible.

Finally, I end with an example that shows an even more complex rhythmic embedding.
Below is the poem “Majolica Lament, or ‘Australopithecus’ ” by Linda Kunhardt.12 The quatrains are in 3343 short meter. The third line of each stanza contains six syllables with this stress profile: σ́(#)σ́́#σ́σ́σ́σ́, e.g. Óx chíp gastrólogỳ. They are distributed among four beats as follows:

This structure is reminiscent of a Butz triad: short-long in its linguistic shape with a long-short rhythmic cadence. In this poem, short-last sits inside long-last which sits inside short-last.13

The farmer in the dell
The farmer in the dell
Ox chip gastrology
The farmer in the dell

The farmer takes a wife
The farmer takes a wife
Pupa reconnaissance
The farmer takes a wife

The wife takes a child
The wife takes a child
Sweetbread electrolyte
The wife takes a child

The child takes a nurse
The child takes a nurse
Cheese futz habitual
The child takes a nurse

The nurse takes a cow
The nurse takes a cow
Flatworm collateral
The nurse takes a cow

12 Poetry 196(2): 110, available at https://www.poetryfoundation.org/poetrymagazine/poems/detail/53553.

13 Thanks for comments to Boris Maslov and Tatiana Nikitina, to the audience at the Prosody Today workshop at the University of Chicago in March of 2015, to my colleagues at Carleton, and to two anonymous reviewers of Studia metrica et poetica.

References

Anttila, Arto; Adams, Matthew; Speriosu, Michael 2010. The role of prosody in the English dative alternation. In: Language and Cognitive Processes 25(7/9), 946–981.
Beckman, Mary; Edwards, Jan 1990. Lengthenings and shortenings and the nature of prosodic constituency. In: Kingston, John; Beckman, Mary (eds.), Papers in Laboratory Phonology 1: Between the Grammar and Physics of Speech. Cambridge: Cambridge University Press, 152–178.
Benor, Sarah Bunin; Levy, Roger 2006. The chicken or the egg?
A probabilistic analysis of English binomials. In: Language 82(2), 233–278.
Bolton, Thaddeus L. 1894. Rhythm. In: American Journal of Psychology 6, 145–238.
Cooper, William E.; Ross, John R. 1975. World order. In: Grossman, Robin E.; San, L. James; Vance, Timothy J. (eds.), Papers from the Parasession on Functionalism. Chicago: Chicago Linguistics Society, 63–111.
Fraisse, Paul 1974. Psychologie du rythme. Paris: Presses Universitaires de France.
Hammond, Michael 2012. A Searchable Dictionary of English. http://www.lexicon.arizona.edu/hammond/newdic.html.
Hawkins, John A. 1990. A parsing theory of word order universals. In: Linguistic Inquiry 21(2), 223–261.
Hay, Jessica S. F.; Diehl, Randy L. 2007. Perception of rhythmic grouping: testing the iambic/trochaic law. In: Perception and Psychophysics 69(1), 113–122.
Hayes, Bruce 1985. Iambic and trochaic rhythm in stress rules. In: Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, 429–446.
Hayes, Bruce 1995. Metrical Stress Theory: Principles and Case Studies. Chicago: University of Chicago Press.
Hayes, Bruce; MacEachern, Margaret 1996. Quatrain form in English folk verse. In: Language 74, 473–507.
Jakobson, Roman 1960. Linguistics and poetics. In: Sebeok, Thomas A. (ed.), Style in Language. Cambridge, MA: The MIT Press, 350–377.
Kager, René 1999. Optimality Theory. Cambridge: Cambridge University Press.
Kiparsky, Paul 2006. A modular metrics for folk verse. In: Dresher, B. Elan; Friedberg, Nila (eds.), Formal Approaches to Poetry. Berlin: Mouton de Gruyter, 7–49.
Lerdahl, Fred; Jackendoff, Ray 1983. A Generative Theory of Tonal Music. Cambridge, MA: The MIT Press.
Morgan, Gareth 1983. Butz triads: toward a grammar of folk poetry. In: Folklore 94, 44–56.
Prince, Alan; Smolensky, Paul 2004 [1993].
Optimality Theory: Constraint Interaction in Generative Grammar. Malden: Blackwell. http://roa.rutgers.edu/files/537-0802/537-0802-PRINCE-0-0.PDF.
Smith, Barbara Herrnstein 1968. Poetic Closure: A Study of How Poems End. Chicago: University of Chicago Press.
Wasow, Thomas 1997a. End-weight from the speaker’s perspective. In: Journal of Psycholinguistic Research 26(3), 347–361.
Wasow, Thomas 1997b. Remarks on grammatical weight. In: Language Variation and Change 9, 81–105.
Wasow, Thomas 2002. Postverbal Behavior. Stanford: CSLI.
Wasow, Thomas; Arnold, Jennifer 2003. Post-verbal constituent ordering in English. In: Rohdenburg, Günter; Mondorf, Britta (eds.), Determinants of Grammatical Variation in English. Berlin: Mouton, 119–154.
Woodrow, Herbert 1909. A quantitative study of rhythm: the effect of variations in intensity, rate, and duration. In: Archives of Psychology 14, 1–66.
Woodrow, Herbert 1951. Time perception. In: Stevens, Stanley Smith (ed.), Handbook of Experimental Psychology. New York: Wiley, 1224–1236.
Zalizniak, Andrei A. 1977. Grammaticheskii slovar’ russkogo iazyka: Slovoizmenenie. Moscow: Russkii Iazyk.