1. The backness alternation in the lexicon

While the Brazilian Portuguese plural is regular and predictable for the majority of nouns and adjectives in the language, the focus of this paper is on [w]-final stems, where the plural is irregular. These [w]-final words allow two kinds of plural. Most words change the stem-final [w] to [j], and an [s] is added, e.g., [lẽȷ̃ˈsɔw ~ lẽȷ̃ˈsɔjs] ‘sheet’. A fair number of [w]-final words, however, allow the stem to surface faithfully in the plural, e.g., [muˈzew ~ muˈzews] ‘museum’. In addition to these two robust patterns, a few nouns have a [w ~ l] alternation, e.g., [mɛw ~ mɛlis] ‘honey’.

The transcriptions above, and throughout this paper, reflect the dialect(s) of São Paulo, where our research was conducted. The present characterization of the plural morphology is more broadly applicable to most Brazilian dialects, and specifically to any dialect in which the historical word-final lateral is fully merged with [w]. Dialects that retain the final lateral, as in many European Portuguese dialects, are beyond the scope of this work (for those, see Freitas 2001; Mateus & d’Andrade 2000, a.o). For a fuller description and analysis of Brazilian Portuguese plural morphology, the reader is referred to Mattoso Câmara (1953), Abaurre (1983), Morales-Front & Holt (1997), Huback (2007), Gomes & Manoel (2010), among others.

Becker et al. (2017) survey the [w]-final nouns and adjectives in the lexicon, with a focus on two factors, as seen in Table 1: the laxness of the final vowel and monosyllabicity of the stem. The alternation is common in polysyllables and in words whose final vowel is one of the lax vowels [a ɛ ɔ], but much less common in monosyllables and in words whose final vowel is tense [e o i u].

Table 1

Lexicon study: the plural [w ~ j] alternation is more frequent in polysyllables and following a lax vowel.

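| Size | Vowel | n | % [j] | example |
| --- | --- | --- | --- | --- |
| Monosyllabic | tense | 12 | 8% | ˈɡow ~ ˈɡows ‘goal’ |
| Monosyllabic | lax | 20 | 41% | ˈsɔw ~ ˈsɔjs ‘sun’ |
| Polysyllabic | tense | 45 | 57% | muˈzew ~ muˈzews ‘museum’ |
| Polysyllabic | lax | 263 | 92% | lẽȷ̃ˈsɔw ~ lẽȷ̃ˈsɔjs ‘sheet’ |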
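The two trends in Table 1 can be made explicit by pooling the four cells along each factor. The sketch below is only an illustration: it weights each cell's percentage by its item count, and because the percentages in the table are rounded, the pooled values are approximate. The names `TABLE1` and `pooled_rate` are ours.

```python
# (size, vowel, n, percent of [j]-plurals), copied from Table 1.
TABLE1 = [
    ("mono", "tense", 12, 8),
    ("mono", "lax", 20, 41),
    ("poly", "tense", 45, 57),
    ("poly", "lax", 263, 92),
]


def pooled_rate(rows):
    """Type-weighted percentage of [j]-plurals over the given rows."""
    total = sum(n for _, _, n, _ in rows)
    return sum(n * pct for _, _, n, pct in rows) / total


# Pool over vowel quality to compare sizes, and over size to compare vowels.
by_size = {s: pooled_rate([r for r in TABLE1 if r[0] == s]) for s in ("mono", "poly")}
by_vowel = {v: pooled_rate([r for r in TABLE1 if r[1] == v]) for v in ("tense", "lax")}

print({k: round(v, 1) for k, v in by_size.items()})
print({k: round(v, 1) for k, v in by_vowel.items()})
```

Under this pooling, polysyllables alternate far more often than monosyllables, and lax-vowel items far more often than tense-vowel items, in line with the description above.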

This paper surveys the acquisition of these two factors, using data from a nonce word study (“wug test”, Berko 1958) with 115 child participants and 43 adult participants as a control group. We show that speakers’ sensitivity to monosyllabicity is evidenced among the youngest children surveyed (49 children ages 7–9), while the role of vowel laxness is learned later (66 children ages 10–13). This distinction between monosyllabicity and laxness cannot be detected in the pattern of responses of the adult control group, which is equally sensitive to both factors. We claim that both factors are expressions of universal phonological pressures: the protection of initial syllables from alternations, and a preference for vertically dispersed diphthongs. Sensitivity to vowel laxness is acquired later, however, because it is specific to the plural morphology, and it contradicts the majority of phonotactic preferences in the lexicon.

The paper starts with a constraint-based analysis of the plural in section 2. The results of a nonce word task in section 3 show the earlier sensitivity to monosyllabicity. We conclude and suggest directions for further study in section 4.

2. Phonological trends in the plural alternations

This section offers an analysis of the Brazilian Portuguese plural that generalizes over individual words, modeling them in terms of their violations of universal phonological constraints. The weights of these constraints are fitted to the participants’ responses, with the increasing weights meant to model the development of the idealized language acquirer. The resulting model can be applied to the treatment of nonce words.

For most [w]-final lexical items, the plural is based on the historical source of the [w], usually as reflected in the orthography as ⟨l⟩ vs. ⟨u⟩. Speakers, however, generalize the phonological properties of these lexical items and apply these properties productively to novel words, even in the absence of orthography. The generalization of the phonological trends in the lexicon shows that speakers not only store the items in their lexicon but also compute the phonological patterning of these items. An early constraint-based implementation of this duality was offered in the USELISTED framework (Zuraw 2000). Computation over lexical items, and in particular, computation over lexical exceptions in addition to storage was proposed in the rule-based approach of Albright & Hayes (2002; 2003; 2006). Our paper follows these precedents, using the existing, stored lexical items to compute a grammar that productively derives novel items.

2.1. Independence from history and orthography

Historically, the [w ~ j] plural alternation is related to a stem-final lateral that deleted in the plural (de Kolovrat 1923). For example, the current paradigm [ɐˈnɛw ~ ɐˈnɛjs] ‘ring’ used to be [ɐˈnɛl ~ ɐˈnɛles]. Deletion of intervocalic laterals in Galician-Portuguese (beginning in the 9th century) removed the lateral from the plural, with subsequent raising of [e] to [i] and hiatus resolution creating the glide in [ɐˈnɛl ~ ɐˈnɛjs], which is still the current form in many European Portuguese dialects and some southern Brazilian dialects. In most of Brazil, however, and throughout the state of São Paulo, vocalization of coda [l] to [w] yielded [ɐˈnɛw ~ ɐˈnɛjs], with no remnant of the lateral. This last sound change completely merged historical syllable-final laterals with existing syllable-final glides.

While in many cases of historical mergers, evidence for the original phonological source is maintained paradigmatically, this is not the case here. The final lateral is completely lost from both singulars and plurals. One cannot rely on derivational suffixes to reconstruct the lateral, either: while a lateral still appears before the suffix [ejɾ] in [ʃɐˈpɛw ~ ʃɐpeˈlejɾ-ə] ‘hat ~ hat-maker’, putatively suggesting an underlying representation such as */ʃɐpɛl/, the same lateral appears in [ˈʃa ~ ʃɐˈlejɾ-ə] ‘tea ~ tea-pot’. The underlying representation /ʃal/ for the root of ‘tea’ would be implausible, since no process would delete the lateral in the base [ˈʃa].

Although language-internal phonological evidence is missing, the orthography still maintains the stem-final lateral in most cases. The plural alternations, however, are independent from this orthography. Tellingly, the [w ~ j] alternation has extended to nouns that do not have a historical lateral, and are not spelled with ⟨l⟩, provided that they are polysyllabic with a final lax vowel, e.g., [deˈɡɾaw ~ deˈɡɾajs] ‘stair’, normatively [deˈɡɾaws]; [j]-plurals are similarly innovated for [ʃɐˈpɛw] ‘hat’, [tɾoˈfɛw] ‘trophy’, and so forth. Conversely, the alternation is blocked and [w]-plurals are innovated for nouns that do have a historical lateral and are written with an ⟨l⟩, provided that they are monosyllables, e.g., [ˈmɛw ~ ˈmɛws] ‘honey’, normatively [ˈmɛjs] or [ˈmɛlis]. Going beyond the plural morphology, speakers often misspell coda [w], e.g., writing ⟨auto-falantes⟩ ‘loudspeakers’ instead of ⟨alto-falantes⟩, providing further evidence that this phonological neutralization is unencumbered by orthographic representations.

Loanword adaptation is similarly based on phonological rather than orthographic cues. Looking at lateral-final source words, the alternation is blocked in the monosyllabic [ˈɡow ~ ˈɡows] ‘goal’, yet it applies to the polysyllabic [kokiˈtɛw ~ kokiˈtɛjs] ‘cocktail’, further aided by its final lax vowel. Both words have the same final orthographic ⟨l⟩, as well as the same lateral in the English source.

The alternation, then, cannot be reduced to orthography or diachrony, nor can it be plausibly encoded as the presence of an underlying lateral in the phonological representation. We suggest that the alternation is imposed by the regular grammar, but plural forms that are known to the speaker are stored and protected by the constraint USELISTED, which mediates between the stored lexicon and the productive grammar.

In terms of type frequency, items that have a historical lateral are much more frequent in the language (Huback 2007), and therefore the [w ~ j] alternation is more frequent than faithful [w] paradigms. This could explain why polysyllabic nouns such as [deˈɡɾaw] ‘stair’ are attracted to the [w ~ j] alternation. As we note above, however, monosyllables tend to abandon the [w ~ j] alternation, showing the primacy of phonological factors over the effect of frequency. Foreshadowing the results of the experiment in section 3, we see that nonce words, which do not have any lexical frequency at all, are subject to the aforementioned phonological factors of monosyllabicity and vowel tenseness.

The analysis offered here for São Paulo Portuguese is notably different from the analysis of dialects that maintain final laterals (as most European dialects do). First, these dialects supply overt evidence for the plural-specific deletion of final laterals, which speakers will have to learn, whereas São Paulo speakers have no final lateral to delete. Second, in European dialects the plural is completely predictable: language acquirers need to learn to add [ʃ] after all [w]-final nouns such as [muˈzew], and delete all final laterals, regardless of monosyllabicity or preceding vowel quality. What both types of dialects share is the uneven distribution of diphthongs: [ɛj] and [ɔj] are underrepresented generally in the language, yet preferred in the plural (see Section 2.3). This represents a different set of challenges to European Portuguese learners despite the regularity of the lateral deletion process.

2.2. Protection of initial syllables

The [w ~ j] alternation is disfavored in monosyllables, which, following Becker et al. (2017), we attribute to the protection of initial syllables (Beckman 1997 et seq.); in monosyllables, the affected glide is in the initial syllable. When the alternation impacts a polysyllabic noun, as in [lẽȷ̃ˈsɔw ~ lẽȷ̃ˈsɔjs] ‘sheet’, the initial syllable [lẽȷ̃] remains unchanged – the change impacts the non-initial syllable [sɔw]. In a monosyllable, however, the alternation impacts the initial (and only) syllable of the stem, as in [ˈsɔw ~ ˈsɔjs] ‘sun’.

There is good reason to believe that this protection of monosyllables is universal. First, the same tendency is observed in a variety of languages, such as French (Becker et al. 2017), Turkish (Becker et al. 2011), Russian (Becker & Gouskova 2016), and others. In Turkish, for example, stem-final voiceless stops often voice when a vowel-initial suffix is added, e.g., [ɡuɾup] ‘group’ ~ [ɡuɾubu] ‘group.ACC’. This voicing alternation affects the majority of polysyllables but only a minority of monosyllables, which Becker et al. (2011) attribute to the protection of initial syllables. Second, Becker et al. (2012) show that the effect is asymmetrical, based on evidence from English. In the English lexicon, monosyllables are more often impacted by the [f ~ v] alternation than polysyllables, e.g., violators of initial syllable faithfulness such as [loʊf ~ loʊvz] ‘loaf’ are common, while non-violators such as [pæɹəɡɹæf ~ pæɹəɡɹævz] ‘paragraph’ are considerably less acceptable – the opposite of the trend observed in Brazilian Portuguese, Turkish, and other languages. Yet when given nonce words, English-speaking participants apply the alternation equally to monosyllables and polysyllables, essentially refusing to learn that monosyllables are impacted more than polysyllables. In artificial language tasks, English speakers prefer the Turkish-like universal pattern where alternations impact polysyllables more strongly, despite the language-internal evidence.

In the cases cited above, initial syllable faithfulness constraints protect monosyllables. The protective effect of initial syllable faithfulness within polysyllables is observed in Shona (Beckman 1997). In this language, mid vowels contrast with high vowels only in the initial syllable, e.g., [ɡondwa] ‘become replete with water’ vs. [huna] ‘search intently’. The mid/high distinction is predictable in non-initial syllables, e.g., [dokonja] ‘be very talkative’, *[dokunja], showing the effect of greater faithfulness to initial syllables in polysyllabic words. Although Brazilian Portuguese does not provide direct evidence to distinguish between faithfulness to initial syllables and faithfulness to monosyllables, in terms of modeling we prefer the broad typological coverage of initial syllable faithfulness.

For the Brazilian Portuguese speaker, the lexicon provides evidence that the [w ~ j] alternation applies more strongly to polysyllables, as shown in Table 1. Additional evidence for the protection of monosyllables comes from nasal diphthong alternations: the [ɐ̃w̃ ~ õȷ̃s] alternation similarly applies to most polysyllables, e.g., [boˈtɐ̃w̃ ~ boˈtõȷ̃s] ‘button’, but never to monosyllables.

The effect of initial syllable faithfulness is demonstrated in the tableaux in Tables 2 and 3. We assume that the plural suffix is underlyingly /is/ for such forms, and show two candidates: one where the [i] deletes, violating MAX, and one where the [w] fronts to [j] and fuses with the suffixal [i]. Fronting the stem glide in monosyllabic [ɡow] causes a violation of IDENT-σ1(back), since the glide is in the initial syllable of the word. No violation of IDENT-σ1(back) is incurred in Table 3, since the fronted glide is in the second syllable, and the initial syllable [lẽȷ̃] is not affected.

Table 2

Monosyllabic [ˈɡow] protected from alternation.

| /ˈɡow₁ + i₂s/ | IDENT-σ1(back) | MAX |
| --- | --- | --- |
| a. ˈɡow₁s | | * |
| b. ˈɡoj₁,₂s | *! | |

Table 3

Polysyllabic [lẽȷ̃ˈsɔw] fuses with the suffixal [i].

| /lẽȷ̃ˈsɔw₁ + i₂s/ | IDENT-σ1(back) | MAX |
| --- | --- | --- |
| a. lẽȷ̃ˈsɔw₁s | | *! |
| b. lẽȷ̃ˈsɔj₁,₂s | | |

We note that our choice of /is/ as the underlying representation of the plural suffix for such forms is not uncontroversial. The suffix surfaces as [-s] after vowels, but after consonants, depending on the lexical item, the suffix varies between [-s] and [-{i,e}s], e.g., [floɾ ~ floɾ{i,e}s] ‘flower’ vs. [tuɾ ~ tuɾs] ‘tour’. This added vowel is analyzed as thematic by Mattoso Câmara (1967; 1984) and Bermudez-Otero (2013). Further such lexically-specific allomorphy can be found in [s]-final words, cf. the zero-marked [ˈtɔɾəks ~ ˈtɔɾəks] ‘thorax’ vs. the [is]-marked [ˈfeniks ~ ˈfeniksis] ‘phoenix’. In sum, analyzing this vowel as the fully predictable result of epenthesis is untenable, and thus we opt for its treatment as a postconsonantal allomorph. The appearance of a front vowel or glide before the plural [s/ʃ] also varies by dialect, e.g. it appears with [l]-final stems in European Portuguese, and with finally-stressed vocoid-final stems in Rio de Janeiro. This dialectal variation could be incorporated into the analysis with a family of context-sensitive constraints that would replace the monolithic effects of MAX. These constraints would still interact with IDENT-σ1(back) just as shown in Tables 2 and 3.

Returning to the two tableaux in Tables 2 and 3, they depict a dichotomous grammar: one that protects all monosyllables and does not protect any polysyllables. In section 2.4, we develop a weighted version of this grammar to more closely model speakers’ treatment of such forms across developmental age groups.
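The dichotomous evaluation in Tables 2 and 3 can be sketched as a strict-ranking comparison, in which candidates are compared constraint by constraint from the highest-ranked down. This is a minimal illustration rather than an implementation used in the study: the function name `optimal` and the dictionary encoding of the tableaux are assumptions made here, and the violation marks are copied from the two tables.

```python
# Strict ranking from Tables 2 and 3: IDENT-σ1(back) dominates MAX.
RANKING = ["IDENT-σ1(back)", "MAX"]

# Violation profiles copied from the tableaux (unlisted constraints = 0 violations).
TABLE2 = {
    "ˈɡows": {"MAX": 1},
    "ˈɡojs": {"IDENT-σ1(back)": 1},
}
TABLE3 = {
    "lẽȷ̃ˈsɔws": {"MAX": 1},
    "lẽȷ̃ˈsɔjs": {},
}


def optimal(candidates, ranking):
    """Return the candidate whose violation vector is lexicographically smallest."""
    def profile(form):
        return tuple(candidates[form].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


print(optimal(TABLE2, RANKING))  # the faithful [w]-plural wins for the monosyllable
print(optimal(TABLE3, RANKING))  # the fused [j]-plural wins for the polysyllable
```

Because tuples are compared left to right, a single violation of the higher-ranked constraint outweighs any number of violations lower down, which is what makes this grammar categorical rather than probabilistic.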

2.3. Blocking shallow diphthongs

The [w ~ j] alternation is disfavored following the tense vowels [e o i u]; the alternation preferably creates an optimal set of diphthongs [aj], [ɛj], [ɔj], as in the innovative plurals [deˈɡɾaw ~ deˈɡɾajs] ‘stair’, [tɾoˈfɛw ~ tɾoˈfɛjs] ‘trophy’, and so forth. The alternation does not spread to new items with tense vowels, thus avoiding the diphthongs [ej], [oj], [uj] (note that [ij] is absent from the language). In Nevins (2012), this tendency in Brazilian Portuguese is attributed to a preference for diphthong-internal height dispersion, which is phonetically grounded and observed in language typology (see also Kubozono 2001). The typologically most common diphthongs combine a low vowel with a glide, as in [aj, aw], as these maximize the height differences within the two halves of the diphthong. The least common diphthongs are those that have no height distinction at all, such as [iw] or [uj]. Following Becker et al. (2017), the effect is operationalized as the markedness constraint *SHALLOWDIPH, which penalizes tense vowel+glide combinations such as [ej], [oj], [uj].

The preference for the low-initial [aj] over the high-initial [uj] is evident throughout the lexicon; a search in the Mac-Morpho corpus (Fonseca et al. 2015) suggests that the type frequency of [aj] is about four times greater than that of [uj] (930 vs. 239). The mid vowels, however, do not follow the universal height preference: [oj] is more frequent than [ɔj] (221 vs. 52 types). This reversal may be due to a different universal tendency: tensing before vocoids. In American English, for example, lax vowels may not be followed by another vowel (Donegan 1978), e.g., [θiəɾɹ] ‘theater’ *[θɪəɾɹ], obeying the constraint *[–tense][–consonantal]. A parallel restriction applies in French (Storme 2017). Word-internally in Brazilian Portuguese, the lax vowels [ɛ, ɔ] are rare preceding low and mid vowels ([ɛa], [ɛo], etc), and are dispreferred even before high vowels and glides (Wetzels 2011: 346 mentions an “assimilatory effect of the glide that yields upper-mid-vowels”). The constraint *[–tense][–consonantal] is not fully respected in Brazilian Portuguese, but it exerts a gradient pressure as a part of a family of constraints that reduce the acceptability of lax vowels before vocoids.

The speaker is thus presented with conflicting evidence: [ɛj] and [ɔj] are preferred in the plural, and supported by the optimizing diphthong-internal height differences, but the same diphthongs are strongly under-attested generally in the lexicon, due to the conflicting requirement to tense before a vocoid (on the idea that constraints can directly conflict, see Coetzee & Pretorius 2010; Zsiga et al. 2011, among others). Instances of singular [ɛj] and [ɔj] are primarily limited to Latinate forms (e.g., [eˈɾɔj] ‘hero’), loanwords (e.g., [ˈbɔj] from English ‘boy’), and other marginal forms (e.g., colloquial [vɛj] from [vɛʎu] ‘old’). In contrast, [ɛw] and [ɔw] are more common due to the vocalization of the historical lateral (e.g., [ˈmɛw] ‘honey’, [ˈsɔw] ‘sun’), but not as common as the tense [ew], [ow].

Restrictions on monomorphemic forms are known as Morpheme Structure Constraints in Chomsky & Halle (1968). In Brazilian Portuguese, these are soft constraints, since [ɛj] and [ɔj] are attested in monomorphemic forms (e.g., [eˈɾɔj] ‘hero’); these diphthongs are only under-attested, and not outright ungrammatical. Overall in the language, however, [oj] is preferred, and Bonilha (2000) shows that children acquire [oj] before [ɔj].

The morphologically limited effect of *SHALLOWDIPH to plural formation is thus formalized in terms of Comparative Markedness (McCarthy 2003). The shallow diphthongs [ej, oj] are tolerated and surface faithfully when they already exist in underlying forms, but the creation of new instances is blocked. The constraint N*SHALLOWDIPH penalizes only newly created shallow diphthongs, or more technically, it penalizes only shallow diphthongs that do not appear in the fully faithful candidate. In Table 4, both candidates have optimal diphthongs, and thus MAX prefers the creation of [j]. In Table 5, both [ew] and [ej] are shallow, but [ej] is newly created and thus violates N*SHALLOWDIPH. The [ew] that appears in the fully faithful candidate is penalized by O*SHALLOWDIPH, but the ranking of this constraint below MAX renders it inactive. We include O*SHALLOWDIPH for the sake of completeness only, following McCarthy’s (2003) assumption that both old and new versions of constraints are present in the grammar. In sum, the alternation is disfavored for tense nuclei, as the newly created diphthong violates N*SHALLOWDIPH. Existing non-alternating forms are ‘grandfathered in’ by O*SHALLOWDIPH.

Table 4

A lax stem vowel allows the creation of a vertically optimal diphthong.

| /pɾiˈzɛw₁ + i₂s/ | IDENT-σ1(back) | N*SHALLOWDIPH | MAX | O*SHALLOWDIPH |
| --- | --- | --- | --- | --- |
| a. pɾiˈzɛw₁s | | | *! | |
| b. pɾiˈzɛj₁,₂s | | | | |

Table 5

A tense stem vowel blocks the selection of a shallow diphthong.

| /suˈpew₁ + i₂s/ | IDENT-σ1(back) | N*SHALLOWDIPH | MAX | O*SHALLOWDIPH |
| --- | --- | --- | --- | --- |
| a. suˈpew₁s | | | * | * |
| b. suˈpej₁,₂s | | *! | | |

The grammar depicted in Tables 4 and 5 is idealized; it allows the [w ~ j] alternation in all words with lax vowels and prevents it in all words with tense vowels. We turn in section 2.4 below to a more fine-grained view.
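The split between new and old violations in Tables 4 and 5 can be illustrated with a small sketch. It is a deliberate simplification and not McCarthy's (2003) full definition: diphthongs are matched as two-character strings rather than through correspondence relations, stress marks are omitted, and the fully faithful candidate is assumed here to be the unreduced string (e.g., `supewis`). The function names are ours.

```python
TENSE = set("eoiu")   # tense vowels, as listed in the text
GLIDES = set("jw")


def shallow_diphthongs(form):
    """Return the tense vowel + glide sequences found in a transcription string."""
    return [form[i:i + 2] for i in range(len(form) - 1)
            if form[i] in TENSE and form[i + 1] in GLIDES]


def comparative_violations(candidate, faithful):
    """Count shallow diphthongs as 'old' if the faithful candidate has them too."""
    old_pool = shallow_diphthongs(faithful)
    new = old = 0
    for diphthong in shallow_diphthongs(candidate):
        if diphthong in old_pool:
            old_pool.remove(diphthong)   # each faithful diphthong licenses one match
            old += 1
        else:
            new += 1
    return {"N*SHALLOWDIPH": new, "O*SHALLOWDIPH": old}


# Table 5: tense stem vowel, assumed faithful candidate "supewis".
print(comparative_violations("supews", "supewis"))
print(comparative_violations("supejs", "supewis"))
# Table 4: lax stem vowel, so neither plural contains a shallow diphthong.
print(comparative_violations("pɾizɛjs", "pɾizɛwis"))
```

The retained [ew] counts only against O*SHALLOWDIPH, while the derived [ej] counts against N*SHALLOWDIPH, matching the marks in Table 5.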

2.4. Acquisition as weight update

Speakers’ judgments of [w]-final nonce words as a whole are probabilistic. The faithful [w]-plural is usually favored with monosyllables and with words with a tense vowel in their final syllables, while the [w ~ j] alternation is usually preferred in polysyllables and words with a lax vowel in their final syllables.

To model this probability distribution over responses, we employed MaxEnt grammars (Maximum Entropy, Goldwater & Johnson 2003; Smolensky & Legendre 2006, a.o). In MaxEnt, constraints are weighted, and violation marks are multiplied by the weight of the violated constraint. Each candidate’s sum of weighted violations is the candidate’s harmony (ℋ). In each tableau, the harmonies are exponentiated and then divided by the sum of the exponentiated harmonies, to yield each candidate’s predicted probability (p).
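The computation just described can be written compactly. The notation is ours, introduced for illustration: $w_k$ is the weight of constraint $k$, $v_k(c)$ is the number of times candidate $c$ violates it, and $c'$ ranges over all candidates in the tableau. The negative sign is an assumption made so that harmonies match the negative values shown in Tables 7–9.

```latex
\mathcal{H}(c) = -\sum_{k} w_k \, v_k(c),
\qquad
p(c) = \frac{e^{\mathcal{H}(c)}}{\sum_{c'} e^{\mathcal{H}(c')}}
```

When every $w_k$ is zero, every harmony is zero, and each of $n$ candidates receives probability $1/n$.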

When the constraints have no weight, violations have no effect on harmony, all candidates have a harmony of zero, and all candidates are equally likely. This is shown in Table 6, and represents an idealized initial state, the grammar of a child who hasn’t yet learned the plural morphology.

Table 6

Weightless constraints in the initial state; all candidates equally probable.

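| /ˈɡɾew + is/ | MAX | IDENT-σ1(back) | N*SHALLOWDIPH | ℋ | p |
| --- | --- | --- | --- | --- | --- |
| | w = 0 | w = 0 | w = 0 | | |
| a. ˈɡɾews | * | | | 0 | 0.50 |
| b. ˈɡɾejs | | * | * | 0 | 0.50 |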

Anticipating the experimental results in Section 3, the youngest children tested, who were 7–9 years old, already departed from the initial state. To simplify presentation, we divided the children into two roughly equally sized groups, with the children who are 7–9 representing the younger half of our participants. Table 7 shows the grammar obtained from fitting weights to the experimental results of this group. The weight of the initial syllable faithfulness constraint IDENT-σ1(back) is highest, showing an early sensitivity to monosyllabicity, compared to a much smaller weight of N*SHALLOWDIPH.

Table 7

Children 7–9 y/o: a preference for the faithful plural in monosyllables.

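| /ˈɡɾew + is/ | MAX | IDENT-σ1(back) | N*SHALLOWDIPH | ℋ | p |
| --- | --- | --- | --- | --- | --- |
| | w = 0.12 | w = 0.36 | w = 0.16 | | |
| a. ˈɡɾews | * | | | –0.12 | 0.60 |
| b. ˈɡɾejs | | * | * | –0.52 | 0.40 |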

The older children, 10–13 years old, exhibit a stronger sensitivity to vowel tenseness, as seen in Table 8. The weight of IDENT-σ1(back) shows a modest increase compared to a dramatic increase in the weight of N*SHALLOWDIPH.

Table 8

Children 10–13 y/o: the monosyllabicity and the vowel tenseness effects increase in magnitude.

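| /ˈɡɾew + is/ | MAX | IDENT-σ1(back) | N*SHALLOWDIPH | ℋ | p |
| --- | --- | --- | --- | --- | --- |
| | w = 0.50 | w = 0.43 | w = 0.55 | | |
| a. ˈɡɾews | * | | | –0.50 | 0.62 |
| b. ˈɡɾejs | | * | * | –0.98 | 0.38 |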

Finally, the adults (ages 19–24) respond more strongly than the children both to monosyllabicity and to vowel tenseness, as evidenced by the larger values of weights in Table 9. Future research may probe the 14–18 age group, to find whether the transition from late childhood to early adulthood is linear (in which case it is likely due to gradual increase in exposure to the ambient language), or whether it is more abrupt and tracks a sharp transition around adolescence.

Table 9

The grammar with adult weights.

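| /ˈɡɾew + is/ | MAX | IDENT-σ1(back) | N*SHALLOWDIPH | ℋ | p |
| --- | --- | --- | --- | --- | --- |
| | w = 1.25 | w = 1.05 | w = 0.92 | | |
| a. ˈɡɾews | * | | | –1.25 | 0.67 |
| b. ˈɡɾejs | | * | * | –1.97 | 0.33 |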

In sum, the acquisition trajectory of the plural morphology is expressed here with increasing weights of the constraints from zero in the initial state until they reach the adult weights. Table 10 summarizes the hypothesized gradual increase in constraint weighting. All weights were calculated using the MaxEnt Grammar Tool (Hayes & Wilson 2008) based on the judgments of our participants (see section 3).

Table 10

The plural acquisition path as gradual constraint weight increase.

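| Group | MAX | IDENT-σ1(back) | N*SHALLOWDIPH |
| --- | --- | --- | --- |
| Initial state | 0 | 0 | 0 |
| Children 7–9 | 0.12 | 0.36 | 0.16 |
| Children 10–13 | 0.50 | 0.43 | 0.55 |
| Adults | 1.25 | 1.05 | 0.92 |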

Binning the participants into two groups of children and a group of adults was performed to simplify the presentation of the analysis, though we assume that constraint weights may increase gradually throughout the development of each speaker’s grammar. The statistical analysis in section 3 in fact uses age as a scalar predictor, with no division of the children into age groups (see also Figure 3).

Table 10, a summary of these three groups, shows that the youngest children tested, aged 7–9, are already sensitive to monosyllabicity, and consequently, the weight of IDENT-σ1(back) is already considerable, and similar to the weight assigned by the older 10–13 year olds. By contrast, the weight of N*SHALLOWDIPH starts much closer to zero and rises sharply from early to late childhood. The weight of MAX rises as well, thereby allowing polysyllables and words with final lax vowels to alternate more often, as a highly weighted MAX increases the probability of [j]-plurals.
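The probabilities in Tables 6–9 can be recomputed from the weights in Table 10. The following is a minimal sketch, not the MaxEnt Grammar Tool itself: it applies already-fitted weights and does no fitting. The function names are ours, and harmony is assumed to be the negative weighted sum of violations, as the negative values in the tableaux suggest.

```python
import math

# Constraint weights copied from Table 10.
WEIGHTS = {
    "initial state": {"MAX": 0.0, "IDENT-σ1(back)": 0.0, "N*SHALLOWDIPH": 0.0},
    "children 7-9": {"MAX": 0.12, "IDENT-σ1(back)": 0.36, "N*SHALLOWDIPH": 0.16},
    "children 10-13": {"MAX": 0.50, "IDENT-σ1(back)": 0.43, "N*SHALLOWDIPH": 0.55},
    "adults": {"MAX": 1.25, "IDENT-σ1(back)": 1.05, "N*SHALLOWDIPH": 0.92},
}

# Violation profiles for the monosyllabic, tense-vowel item in Tables 6-9.
GREW = {
    "ˈɡɾews": {"MAX": 1},
    "ˈɡɾejs": {"IDENT-σ1(back)": 1, "N*SHALLOWDIPH": 1},
}


def harmony(violations, weights):
    """Negative weighted sum of a candidate's violations."""
    return -sum(weights[name] * count for name, count in violations.items())


def maxent(candidates, weights):
    """Exponentiate each harmony and normalize over the candidate set."""
    scores = {form: math.exp(harmony(v, weights)) for form, v in candidates.items()}
    total = sum(scores.values())
    return {form: score / total for form, score in scores.items()}


for group, weights in WEIGHTS.items():
    probs = maxent(GREW, weights)
    print(group, {form: round(p, 2) for form, p in probs.items()})
```

The same two functions can be applied to other violation profiles, for instance a polysyllabic item with a lax vowel, whose [j]-plural violates none of the three constraints.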

2.5. Summary

We cast the analysis in terms of two universal grammatical constraints that disprefer the [w ~ j] plural alternations: initial syllable faithfulness, which penalizes the alternation in monosyllables, and a constraint requiring diphthong-internal height differences, which penalizes the alternation following tense vowels. Both factors are construed as weighted constraints that reduce the probability of generalizing the [w ~ j] alternation to novel words.

The two constraints differ in their general applicability to Brazilian Portuguese: initial syllable faithfulness is unproblematically available to the speaker and applicable throughout the lexicon. The constraint that blocks the alternation following a tense vowel, however, directly conflicts with the lexicon-wide preference for vowel tensing before vocoids, providing the speaker with two competing forces that need to be recognized and reconciled: tensing before a vocoid is phonotactically favored throughout the lexicon, while the preference for lax vowels diphthong-internally is limited to plural morphology. The distinction is expressed here with Comparative Markedness (McCarthy 2003), where shallow diphthongs are allowed to surface if they appear in an underlying representation, but the creation of new shallow diphthongs is blocked. By hypothesis, this more nuanced constraint interaction takes longer to learn.

3. Experiment: pluralizing nonce words

To test the development of the grammar that underlies the treatment of plurals in Brazilian Portuguese, we carried out a nonce word task (Wug test, Berko 1958). Children, along with an adult control group, were asked to choose between two possible plural forms of novel nouns, varying the factors of monosyllabicity and vowel laxness. While speakers of all ages were sensitive to monosyllabicity, the older 10–13 year olds were significantly more sensitive to vowel laxness than the younger 7–9 year old children.

3.1. Participants

Two groups of participants were recruited: 43 adults, serving as the control and baseline group, and 115 children.

The adults were recruited online via word of mouth among the undergraduate students at UNICAMP (State University of Campinas). The participants volunteered their time and effort. We analyzed data from the 43 adult participants who completed the task and self-identified as being at least 18 years old and from the state of São Paulo. Of these, 30 identified as female, 10 as male, and 3 did not say. The average reported age was 21 (range 19–24, median 21).

The children were recruited from four schools: Escola Comunitária de Campinas (private) in the city of Campinas, and Escola Estadual Professor José Calvitti Filho (public), Centro Educacional Objetivo ABC (private), and Escola Estadual Jardim Prado II (public) in the city of São Paulo. We obtained the consent of each child’s parents, as well as the children themselves; the school administrations allowed us to approach the parents, and in some schools, also run the experiment on school premises. Ethical approval was obtained according to local institutional protocols. The children’s age and gender are shown in Table 11. All children were enrolled in age-appropriate grades.

Table 11

Age and gender distribution of children participants.

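| age | female | male | total |
| --- | --- | --- | --- |
| 7 | 1 | 1 | 2 |
| 8 | 5 | 8 | 13 |
| 9 | 19 | 15 | 34 |
| 10 | 17 | 8 | 25 |
| 11 | 13 | 9 | 22 |
| 12 | 7 | 10 | 17 |
| 13 | 0 | 2 | 2 |
| total | 62 | 53 | 115 |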
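The sizes of the two age bins used in the presentation (7–9 and 10–13) follow directly from the per-age totals in Table 11, as this short tally shows. The variable names are ours.

```python
# Total number of children per age, copied from Table 11.
CHILDREN_BY_AGE = {7: 2, 8: 13, 9: 34, 10: 25, 11: 22, 12: 17, 13: 2}

younger = sum(n for age, n in CHILDREN_BY_AGE.items() if age <= 9)
older = sum(n for age, n in CHILDREN_BY_AGE.items() if age >= 10)

print(younger, older, younger + older)
```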

On average, the participants took 5.8 minutes to complete the experiment (median 5.2, range 2.9–26.5). There was no correlation to speak of between age and completion time (Spearman’s ρ = –0.04, p > 0.1).

3.2. Materials

A total of 70 nominal paradigms were created, of which 69 were nonce. Each paradigm included a [w]-final singular and two plurals, one faithful and one alternating, e.g., [ku.ˈtaw ~ ku.ˈtaws, ku.ˈtajs]. In addition to the nonce words, the existing noun [ʃaˈpɛw] ‘hat’ was used as a training item, with its normative plural [ʃaˈpɛws] and its innovative plural [ʃaˈpɛjs].

The 69 nonce items are listed in Appendix A. These included 21 monosyllables with a tense vowel [e, i, o, u] and 17 monosyllables with a lax vowel [a, ɛ, ɔ], for a total of 38 monosyllables. Iambs (disyllables with final stress) included 12 items with a tense stressed vowel and 9 items with a lax stressed vowel, for a total of 21 iambs. In addition to these, 10 fillers were created with a stressed [ɐ̃w̃], 5 monosyllables and 5 iambs.

Each item was recorded inside a frame sentence. Singulars were recorded in the frame ⟨isso é um_____⟩ (‘this is a_____’), with the masculine indefinite article ⟨um_____⟩ serving to mark the nonce word as a singular masculine noun. Plurals were recorded in the frame ⟨esses são dois_____⟩ (‘these are two_____’). The 210 (=70*3) sentences, written in IPA, were presented in three random orders to a 19-year-old phonetically trained female native speaker from Campinas, who was recorded on a computer in a sound-attenuated room. The best of the three tokens of each sentence was selected and converted to mp3 format.

As referents for the items, we used pictures of aliens (“gorps”), kindly provided by van de Vijver & Baer-Henney (2011), which were randomly paired with the nonce words.

3.3. Procedure

The experiment was run using Experigen (Becker & Levine 2015). The server executed a random selection of 24 items for each participant: 16 target items and 8 fillers. The target items were counterbalanced for monosyllabicity and vowel laxness, with 4 items in each combination. The fillers were similarly balanced for monosyllabicity.
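The per-participant selection can be sketched as follows. This is not Experigen's own code: the pool sizes come from section 3.2, but the item labels are hypothetical placeholders, the split of the 8 fillers into 4 monosyllables and 4 iambs is our reading of "balanced for monosyllabicity", and the final shuffle of trial order is an assumption.

```python
import random

# Pool sizes follow section 3.2; labels are placeholders, not the Appendix A items.
TARGET_POOLS = {
    ("mono", "tense"): [f"target_mono_tense_{i}" for i in range(21)],
    ("mono", "lax"): [f"target_mono_lax_{i}" for i in range(17)],
    ("iamb", "tense"): [f"target_iamb_tense_{i}" for i in range(12)],
    ("iamb", "lax"): [f"target_iamb_lax_{i}" for i in range(9)],
}
FILLER_POOLS = {
    "mono": [f"filler_mono_{i}" for i in range(5)],
    "iamb": [f"filler_iamb_{i}" for i in range(5)],
}


def draw_items(rng):
    """Draw 4 targets per size-by-vowel cell and 4 fillers per size, then shuffle."""
    targets = [item for pool in TARGET_POOLS.values() for item in rng.sample(pool, 4)]
    fillers = [item for pool in FILLER_POOLS.values() for item in rng.sample(pool, 4)]
    trials = targets + fillers
    rng.shuffle(trials)  # assumed: trial order randomized per participant
    return trials


trials = draw_items(random.Random(0))
print(len(trials))
```

Sampling without replacement within each cell guarantees that no participant sees the same item twice while keeping the design balanced.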

Each nonce item was presented as schematized in Figure 1, without any written instructions. First, a single picture was displayed on the left with a button. When the button was pressed, the singular nonce noun was played in its frame sentence. When the recording was finished, the two pictures on the left were displayed, with a second button. When pressed, one randomly chosen plural was played in its frame sentence, and when done, the third button was displayed. Once pressed, the other plural frame was played, and two additional buttons appeared, each one below one of the plural buttons. When one of these was pressed, the next item appeared.

Figure 1

Sample item.

The experiment began with a consent form (filled by an experimenter upon receipt of a signed parental consent form), followed by the training item [ʃaˈpɛw] ‘hat’, which was presented with a picture of a hat and otherwise followed the format in Figure 1. Subsequently, the 24 trials were presented.

For the adult control group, an additional form at the end solicited demographic information such as age, sex, and so forth. A pilot run confirmed that the task was easy to understand without any written instructions. Children were engaged and happy to complete the task.

3.4. Results

The aggregated results are presented in Figure 2. In the interest of clarity, the child data are binned into two groups of roughly similar size, ages 7–9 (49 children) and ages 10–13 (66 children), and shown alongside the 43 adults. In our statistical analysis, however, all children are analyzed as one group, with their age as a scalar variable. The full, raw results can be found at: http://becker.phonologist.org/projects/bpchild/.

Figure 2

Development of unfaithful [w ~ j] plurals: children 7–9 years old are sensitive to monosyllabicity but not to vowel laxness.

Starting with the laxness effect in the left panel of Figure 2, we see that items with a stressed tense vowel [e, i, o, u] usually elicited the choice of faithful plurals; unfaithful plurals were chosen in about 45% of all trials in all age groups for these vowels. Items with stressed lax vowels [a, ɛ, ɔ] elicited significantly more unfaithful responses from the adults and the older children, but not from the younger children.

Turning to the monosyllabicity effect in the right panel of Figure 2, we see that monosyllables elicited the choice of more faithful plurals, while polysyllables elicited significantly more unfaithful responses from all age groups, including the youngest children. The effect does strengthen with age, becoming more pronounced for the adults. Our results, then, show that the acquisition of the plural continues throughout late childhood and does not reach adult levels even for the 12-year-olds. Similar morphophonological differences between 11- and 12-year-olds and adults are reported by Vogel & Raimy (2002) and others.

A closer look at the results is shown in Figure 3, where for each participant we show the proportion of items that elicited the choice of an unfaithful plural. The size of dots corresponds to the number of participants, with a LOWESS moving average line in blue. The laxness effect shows a steady increase, starting at zero (no difference in treatment of tense and lax vowels) for the youngest children, and increasing with age. In contrast, the monosyllabicity effect is fairly stable for the children, with polysyllables eliciting more unfaithful plural choices across all age groups.

Figure 3

Experimental results by participant (n = 158).

To assess the reliability of the effects, and specifically to test the increase in unfaithful responses with age, we fitted a mixed-effects logistic regression model to the child data only, using lme4 (Bates et al. 2015) in R (R Core Team 2016), with the choice of unfaithful plural as the dependent variable; this model is reported in Table 12. The predictors were lax, a binary predictor that contrasted tense vowels with lax vowels; monosyllabicity, a binary predictor that contrasted monosyllables with polysyllables; age, the child’s age in years; and all of their interactions.2 The main effects of lax and monosyllabicity are about equally strong and both highly significant, while the main effect of age is not significant. Importantly, the interaction of age with monosyllabicity is not significant, meaning that younger children do not treat monosyllabicity significantly differently from older children. In contrast, the interaction of age with lax is significant, as the selection of unfaithful plurals with lax vowels increases significantly with age.

Table 12

Regression model for the child data.

Predictor                  β       SE(β)   z       p-value
(Intercept)                –0.06   0.09    –0.69
lax                        0.44    0.11    4.20    <0.001
monosyllabicity            0.45    0.11    4.29    <0.001
age                        –0.05   0.40    –0.13   >0.1
lax:monosyllabicity        0.22    0.21    1.05    >0.1
lax:age                    0.98    0.45    2.19    <0.05
monosyllabicity:age        0.30    0.45    0.66    >0.1
lax:monosyllabicity:age    2.76    0.90    3.09    <0.005
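The predictor coding described in footnote 2 (mean of zero, range of one) can be illustrated with a minimal Python sketch. The function name and the list of ages are hypothetical and serve only to show the arithmetic; the actual analysis was carried out in R.

```python
def center_and_scale(values):
    """Shift values to a mean of zero and divide by their range."""
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    return [(v - mean) / span for v in values]

# A balanced binary predictor (0 = tense, 1 = lax) becomes -0.5 / +0.5.
lax_coded = center_and_scale([0, 1, 0, 1])

# Hypothetical ages in years, not the participants' actual ages.
# When ages are not evenly distributed, the endpoints are no longer
# symmetric around zero, but the range of the coded values is still one.
hypothetical_ages = [7, 7, 8, 9, 10, 11, 12, 13]
age_coded = center_and_scale(hypothetical_ages)

print(lax_coded)
print(min(age_coded), max(age_coded))
```

Under this coding, a coefficient for a binary predictor estimates the full difference between its two levels, and the age coefficient estimates the change across the whole age span of the child sample, which is why the age-related coefficients in Table 12 are large relative to a per-year effect.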

The model described above is limited to the child data, treating age as a linear, scalar variable. The lack of participants aged 14 to 18 makes a linear age predictor discontinuous, and the model fails to converge if the adult participants are included. An additional model that binned participants into three age groups (as shown in Figure 2) showed essentially the same result as in Table 12, with monosyllabicity as a main effect only, but lax having a significant interaction with age.

4. Conclusion

This article discusses the [w ~ j] alternations in the plural morphology of Brazilian Portuguese, focusing on dialects that lack final laterals. We have shown that the alternation applies rather freely to polysyllables that have one of the lax vowels [a ɛ ɔ] in their final syllables, but the alternation is dispreferred in monosyllables and following a tense vowel [e o i u].

A nonce word task showed that both considerations are strongly present in the judgments of older children (ages 10–13) and an adult control group (ages 19–24), but younger children (ages 7–9) are only sensitive to monosyllabicity and not to vowel laxness. The proposed analysis relies on the difference in generality of two grammatical pressures. Monosyllables are less likely to alternate due to initial syllable faithfulness, a universal constraint family that prevents changing the initial (and only) syllable of a monosyllable; this grammatical pressure is generally applicable throughout the lexicon. Tense vowels block this alternation because they create diphthongs with poor height differences, e.g., [ej, oj], as opposed to the better dispersed [ɛj, ɔj]. These same lax vowel+glide diphthongs, however, are under-attested generally in the language, plausibly as a part of a general preference for tense vowels before vocoids. Encoding this conflict grammatically, we argue, is responsible for the longer acquisition path. We formalized the conflicting evidence using Comparative Markedness (McCarthy 2003), where [ej, oj] are tolerated when they appear in underlying representations, but not when created by the plural [w ~ j] alternation.

To model the experimental results, we employed a grammar with weighted constraints, where the weights are hypothesized to start at zero, at which point [w]-plurals and [j]-plurals are equally probable. The constraint against shallow diphthongs shows an apparent time upward trajectory, starting close to zero for the youngest children we tested, increasing considerably for the older children, and then again for the adults, thus tracing a full acquisition path from children through late childhood and adulthood. In contrast, our initial syllable faithfulness constraint starts with a higher weight, and only increases slightly for the older children. The observed early sensitivity to monosyllabicity may indicate that either initial syllable faithfulness constraints increase in weight more readily given relatively little support from the ambient language, or that they might have innate non-zero initial weights. We hope that future research with younger children will shed light on this issue.
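The role of the zero starting point can be made concrete with a small sketch of the standard MaxEnt probability computation, in which each candidate's probability is proportional to the exponential of its negative weighted violation sum. The constraint label, the candidate labels, and the weight of 1.0 below are placeholders chosen for illustration; they are not the constraint set or the fitted weights of the present analysis.

```python
import math

def maxent_probs(violations, weights):
    """Return candidate probabilities under a MaxEnt grammar.

    violations: {candidate: {constraint: violation count}}
    weights:    {constraint: non-negative weight}
    """
    harmonies = {
        cand: -sum(weights.get(c, 0.0) * n for c, n in viols.items())
        for cand, viols in violations.items()
    }
    total = sum(math.exp(h) for h in harmonies.values())
    return {cand: math.exp(h) / total for cand, h in harmonies.items()}

# Hypothetical tableau for a nonce item with a tense final vowel:
# only the [j]-plural creates a shallow diphthong.
tense_item = {
    "w-plural": {},                         # faithful plural
    "j-plural": {"*ShallowDiphthong": 1},   # unfaithful plural
}

# With a weight of zero, both plurals are equally probable.
p_start = maxent_probs(tense_item, {"*ShallowDiphthong": 0.0})

# Raising the weight lowers the probability of the penalized candidate.
p_later = maxent_probs(tense_item, {"*ShallowDiphthong": 1.0})

print(p_start["j-plural"], p_later["j-plural"])
```

In this toy setting, an increase in a single constraint weight is enough to move the [j]-plural away from chance, which is the kind of trajectory the text attributes to the constraint against shallow diphthongs across the age groups.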

Further language-internal support for the privileged status of monosyllables may come from plural marker dropping. Plural marking in Brazilian Portuguese is optional on nouns (Cristófaro Silva 2012; Miranda Soares 2013; da Silva 2010, a.o.), e.g., both [ɐs ˈpɛɾnɐs] and [ɐs ˈpɛɾnɐ] ‘the legs’ are grammatical, with the second option sounding more informal. In highly informal speech, the plural [s] is droppable from all nouns, be they polysyllabic or monosyllabic, e.g., [us ˈpɛ] ‘the feet’ is acceptable. Yet our intuition is that dropping the [s] from a monosyllabic noun requires more informality, i.e., there is an intermediate level of informality that allows [ɐs ˈpɛɾnɐ] but not [us ˈpɛ] (although Scherre 1988 attributes the contrast to final stress rather than monosyllabicity). Such differences in [s] dropping may supply further cues for the protection of monosyllables from phonological and morphological processes. We intend to seek more systematic evidence for the differential application of plural marker dropping in future research, using corpora and experimental work.

This paper adds another element to the large body of literature on the distribution of tense and lax vowels in the language (see Kenstowicz & Sandalo 2016; Wetzels 1995 for a recent review), connecting the limited distribution of the lax vowel diphthongs [ɛj] and [ɔj] to the more widely recognized limitation on lax vowels before other vowels, which are mostly limited to proper names, e.g., [ˈkɛop(i)s] ‘Cheops’.

Nonce word studies such as the present one uncover the way in which speakers generalize over the existing words of their language and organize their lexicon according to grammatical principles that are productively extended to new forms. Speakers accumulate lexical items, storing them in memory, but at the same time compute phonological grammars that are generalized from these lexical items.

In this paper, we provided evidence for two factors in the formation of the plural in Brazilian Portuguese [w]-final stems: monosyllabicity and vowel laxness. For adult speakers, the two factors are equally strong, but the children show that they follow different acquisition schedules: sensitivity to monosyllabicity is observed in the youngest children we tested, while sensitivity to vowel laxness is acquired later, and continues to develop throughout childhood. The experimental results are modeled grammatically using gradually increasing weights, as is also done in gradual learning algorithms (e.g., Boersma & Hayes 2001). In the current model, we fit the experimental data with changes in constraint weights, demonstrating the applicability of MaxEnt approaches to phonology within the context of acquisition.

Additional File

The additional file for this article can be found as follows:

Appendix A

Experimental results by item. DOI: https://doi.org/10.5334/jpl.189.s1


  1. The mid lax vowels [ɛ ɔ] are limited to the stressed syllable in São Paulo Portuguese.
  2. All three predictors were centered and scaled such that their mean was zero and their range was one. This resulted in values of –0.5 and +0.5 for monosyllabicity and lax, and values ranging from –0.49 to +0.51 for age. The model includes random intercepts for item and participant, but no random slopes, as their introduction resulted in lack of convergence.


This project was supported by a FAPESP grant no. 2012/17869-7 awarded to Filomena Sandalo.

Competing Interests

The authors have no competing interests to declare.


Abaurre, B. (1983). Alguns casos de formação de plural em português: uma abordagem natural [A few cases of plural formation in Portuguese: a natural approach]. Cadernos de Estudos Lingüisticos, 5, 127–156.

Albright, A., & Hayes, B. (2002). Modeling English past tense intuitions with minimal generalization. In M. Maxwell (ed.), Proceedings of the sixth meeting of the ACL special interest group in computational phonology (pp. 97–108). Philadelphia: ACL. DOI:  http://doi.org/10.3115/1118647.1118654

Albright, A., & Hayes, B. (2003). Rules vs. Analogy in English past tenses: a computational/experimental study. Cognition, 90, 119–161. DOI:  http://doi.org/10.1016/S0010-0277(03)00146-X

Albright, A., & Hayes, B. (2006). Modeling productivity with the gradual learning algorithm: The problem of accidentally exceptionless generalizations. In G. Fanselow, C. Féry, M. Schlesewsky, & R. Vogel (eds.), Gradience in Grammar (pp. 185–204). Oxford: Oxford University Press.

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48. R package version 0.999999-4. DOI:  http://doi.org/10.18637/jss.v067.i01

Becker, M., Clemens, L. E., & Nevins, A. (2017). Generalization of French and Portuguese plural alternations and initial syllable protection. Natural Language and Linguistic Theory, 35, 299–345. DOI:  http://doi.org/10.1007/s11049-016-9343-y

Becker, M., & Gouskova, M. (2016). Source-oriented generalizations as grammar inference in Russian vowel deletion. Linguistic Inquiry, 47, 391–425. DOI:  http://doi.org/10.1162/LING_a_00217

Becker, M., Ketrez, N., & Nevins, A. (2011). The surfeit of the stimulus: Analytic biases filter lexical statistics in Turkish laryngeal alternations. Language, 87, 84–125. DOI:  http://doi.org/10.1353/lan.2011.0016

Becker, M., & Levine, J. (2015). Experigen – an online experiment platform. Available at: https://github.com/tlozoot/experigen.

Becker, M., Nevins, A., & Levine, J. (2012). Asymmetries in generalizing alternations to and from initial syllables. Language, 88, 231–268. DOI:  http://doi.org/10.1353/lan.2012.0049

Beckman, J. (1997). Positional faithfulness, positional neutralisation and Shona vowel harmony. Phonology, 14, 1–46. DOI:  http://doi.org/10.1017/S0952675797003308

Berko, J. (1958). The child’s learning of English morphology. Word, 14, 150–177. DOI:  http://doi.org/10.1080/00437956.1958.11659661

Bermudez-Otero, R. (2013). The Spanish lexicon stores stems with theme vowels, not roots with inflectional class features. Probus, 25, 3–103.

Boersma, P., & Hayes, B. (2001). Empirical tests of the gradual learning algorithm. Linguistic Inquiry, 32, 45–86. DOI:  http://doi.org/10.1162/002438901554586

Bonilha, G. F. C. (2000). Aquisição dos ditongos orais decrescentes: uma análise à luz da teoria da otimidade (master’s thesis) [The acquisition of falling oral diphthongs: an Optimality Theoretic analysis]. Universidade Católica de Pelotas.

Chomsky, N., & Halle, M. (1968). The sound pattern of English. Cambridge, MA: MIT Press.

Coetzee, A. W., & Pretorius, R. (2010). Phonetically grounded phonology and sound change: the case of Tswana labial plosives. Journal of Phonetics, 38, 404–421. DOI:  http://doi.org/10.1016/j.wocn.2010.03.004

Cristófaro Silva, T. (2012). Organização fonológica de marcas de plural no português brasileiro: uma abordagem multirrepresentacional [The phonological organization of plural markers in Brazilian Portuguese: a multi-representational approach]. Revista da Abralin, 11, 273–306.

Donegan, P. J. (1978). On the natural phonology of vowels (doctoral dissertation). Ohio State University.

Fonseca, E. R., Rosa, J. L. G., & Aluísio, S. M. (2015). Evaluating word embeddings and a revised corpus for part-of-speech tagging in Portuguese. Journal of the Brazilian Computer Society, 21. DOI:  http://doi.org/10.1186/s13173-014-0020-x

Freitas, M. J. (2001). Sons de ataque: segmentos complexos, grupos segmentais e representações fonológicas na aquisição do português europeu [Onset sounds: complex segments, segmental groups, and phonological representations in the acquisition of European Portuguese]. Letras de Hoje, 36, 67–84.

Goldwater, S., & Johnson, M. (2003). Learning OT constraint rankings using a maximum entropy model. In J. Spenader, A. Eriksson, & O. Dahl (eds.), Proceedings of the Stockholm Workshop on Variation within Optimality Theory (pp. 111–120).

Gomes, C. A., & Manoel, C. G. (2010). Flexão de número na gramática de criança e na gramática do adulto [Number inflection in child grammar and adult grammar]. Veredas, 14, 122–134.

Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39, 379–440. DOI:  http://doi.org/10.1162/ling.2008.39.3.379

Huback, A. P. da Silva. (2007). Efeitos de freqüência nas representações mentais (doctoral dissertation) [Frequency effects in mental representations]. Universidade Federal de Minas Gerais, Belo Horizonte.

Kenstowicz, M., & Sandalo, F. (2016). Pretonic vowel reduction in Brazilian Portuguese: Harmony and dispersion. Journal of Portuguese Linguistics, 15, 1–19.

de Kolovrat, G. (1923). Etude sur la vocalisation de la consonne l dans les langues romanes [A study of l-vocalization in Romance languages]. Paris: Jouve et cie.

Kubozono, H. (2001). On the markedness of diphthongs. Kobe papers in linguistics, 3, 60–73.

Mateus, M. H., & d’Andrade, E. (2000). The phonology of Portuguese. Oxford: Oxford University Press.

Mattoso Câmara, J. (1953). Para o estudo da Fonêmica Portuguesa [Towards a study of Portuguese phonemics]. Rio de Janeiro: Simões.

Mattoso Câmara, J. (1967). A note on Portuguese noun morphology. In To honor Roman Jakobson, vol. 2 (pp. 1311–1314). Mouton de Gruyter.

Mattoso Câmara, J. (1984). Estrutura da língua portuguesa [The structure of Portuguese]. Vozes.

McCarthy, J. J. (2003). Comparative markedness. Theoretical Linguistics, 29, 1–51. DOI:  http://doi.org/10.1515/thli.29.1-2.1

Miranda Soares, L. (2013). A influência de variáveis linguísticas e sociais na ausência de concordância nominal no português falado no Brasil [The role of linguistic and social variables in the lack of nominal agreement in Brazilian Portuguese]. Anais do SILEL, 3.

Morales-Front, A., & Holt, D. E. (1997). On the interplay of morphology, prosody and faithfulness in Portuguese pluralization. In F. Martínez-Gil, & A. Morales-Front (eds.), Issues in the phonology and morphology of the major Iberian languages (pp. 393–437). Georgetown University Press.

Nevins, A. (2012). Vowel lenition and fortition in Brazilian Portuguese. Letras de Hoje, 47, 228–233.

R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria. (http://www.r-project.org).

Scherre, M. M. P. (1988). Reanálise da concordância nominal em português (doctoral dissertation) [The reanalysis of nominal agreement in Portuguese]. Universidade Federal do Rio de Janeiro.

da Silva, C. (2010). A variação na marcação de plural nos sintagmas nominais (sns) na fala de informantes de duas comunidades tocantinenses [The variation in plural marking in noun phrases in the speech of two communities in Tocantins]. Talk given at XXIII Jornada Nacional de Estudos Linguísticos do Nordeste – GELNE.

Smolensky, P., & Legendre, G. (2006). The harmonic mind: from neural computation to Optimality-Theoretic Grammar. Cambridge, MA: MIT Press.

Storme, B. (2017). Contrast enhancement motivates closed-syllable laxing and open-syllable tensing. Unpublished manuscript, MIT, Cambridge, USA, available as lingbuzz/003700.

van de Vijver, R., & Baer-Henney, D. (2011). Acquisition of voicing and vowel alternations in German. In Proceedings of BUCLD 35, vol. 2 (pp. 603–615).

Vogel, I., & Raimy, E. (2002). The acquisition of compound vs. phrasal stress: the role of prosodic constituents. Journal of Child Language, 29, 225–250. DOI:  http://doi.org/10.1017/S0305000902005020

Wetzels, L. (1995). Mid-vowel alternations in the Brazilian Portuguese verb. Phonology, 12, 281–304. DOI:  http://doi.org/10.1017/S0952675700002505

Wetzels, L. (2011). The representation of vowel height and vowel height neutralization in Brazilian Portuguese (southern dialects). In J. A. Goldsmith, E. Hume, & L. Wetzels (eds.), Tones and features (pp. 331–360). Berlin: De Gruyter. DOI:  http://doi.org/10.1515/9783110246223.331

Zsiga, E., Gouskova, M., & Tlale, O. (2011). Grounded constraints and the consonants of Setswana. Lingua, 121, 2120–2152. DOI:  http://doi.org/10.1016/j.lingua.2011.09.003

Zuraw, K. (2000). Patterned exceptions in phonology (doctoral dissertation). UCLA.