The vowel inventories of the Western Romance languages descend from the seven-vowel system of Proto-Romance (Hall 1950) that distinguished between open and close mid vowels: /i, u, e, o, ɛ, ɔ, a/. In the modern languages the four-way height contrast has been reduced by two principal forces: neutralization of the open-close distinction in the mid vowels and regressive height harmony. Reduction to five vowels /i, u, e, o, a/ is found in Spanish and in the unstressed syllables of Italian and Brazilian Portuguese. A further reduction to just three vowels with a binary height contrast /i, u, ɐ/ occurs in the unstressed syllables of certain Southern Italian dialects as well as in the word-final syllables of Brazilian Portuguese (BP). European Portuguese (EP) extends the binary [±high] reduced vowel system to all unstressed syllables (Mateus and d’Andrade 2000:18).2 These languages also exhibit height harmony in various contexts such as Portuguese verbal inflection (Harris 1974, Wetzels 1995) and Italian metaphony (Calabrese 2011).
In this paper we discuss another instance of reduction and height harmony that to the best of our knowledge has not been experimentally studied before: the behavior of pretonic vowels in contemporary Brazilian Portuguese.3 As noted above, BP is commonly described as having preserved the seven-vowel system of Proto-Romance in stressed syllables, with a loss of the open-close mid vowel contrast in unstressed nonfinal syllables, and further reduction to just a binary height distinction in unstressed word-final syllables (1) (Mattoso Camara 1972, Mateus and d’Andrade 2000, Barbosa and Albano 2004).
|(1)||tonic||pretonic and posttonic||word-final|
|i, u, e, o, ɛ, ɔ, a||i, u, e, o, a||i, u, ɐ|
Some examples of phonological alternations illustrating the neutralization of the open and close mid vowels appear in (2).
Two alternative conceptions of the motivation for such reduction have appeared in the recent Optimality-Theoretic (OT) generative literature. One sees reduction as the enhancement of contrasts in prosodic structure. Higher-sonority, more open vowels are favored in prominent, stressed positions and dispreferred in nonprominent, unstressed positions. See Crosswhite (2004), Kenstowicz (2010), Walker (2011), and Wetzels (2011), among others, for expressions of this view. An alternative suggested by Flemming (2004) gives explicit recognition to constraints on dispersion, formalizing some ideas of Lindblom (1986) in the OT framework. In this view, vowel inventories are the outcome of three types of constraints. Minimal Distance constraints require vowels to be maximally dispersed along a phonetic dimension such as F1 vowel height. A countervailing Maximize Contrast constraint seeks to segment the dimension in order to create more phonological categories to encode lexical contrasts. Finally, constraints minimizing articulatory effort can introduce sounds that violate the Minimal Distance threshold in particular contexts. When this happens, a typical response is the loss of contrasts and hence a violation of the Maximize Contrast constraint that determines the overall size of the segment inventory found in other contexts such as the stressed syllable.
Flemming (2004) illustrates this conception of neutralization with material drawn from the literature on the phonetics of certain Italian dialects as well as Brazilian Portuguese. Specifically, he interprets the raising of the low vowel to [ɐ] or [ə] in unstressed syllables as a response to the constraint *Short Low Vowel that prohibits the rapid tongue body movement that would be required to articulate an [a] in the reduced time available in unstressed syllables: *ă. When this articulatory effort constraint predominates, the raised [ɐ] or [ə] encroaches on the space of the mid vowels and thus introduces a violation of the Minimal Distance requirement on F1 contrasts. Loss of the open vs. close mid vowel contrast restores the required minimal F1 distance between the remaining vowels, but at the cost of reducing the number of vowels that are available to encode lexical contrasts in the phonetic output.
In this paper, we address the phonetic assumptions underlying the mid vowel open-close reduction from the perspective of BP. First, we review the previous phonetic investigations of BP vowels that Flemming (2004) relied on in developing his dispersion-theoretic account of vowel reduction. We also review an earlier analysis of height harmony between the tonic and pretonic syllables. We then turn to our study, describing the methods of data collection and the results obtained. To preview, our first major finding is that the duration of the low vowel provides only a partial explanation for the sites of mid-vowel neutralization in BP: it is shorter and raised in the posttonic and word-final contexts compared to the pretonic, but neutralization of the open-close lexical contrast seen in (2) occurs in all three nontonic positions. Furthermore, the reduction to a binary height contrast in the final syllable is better attributed to a difference in intensity rather than duration. Our second major finding is that pretonic mid vowels harmonize for height with the following tonic mid vowel, creating a four-height, seven-vowel phonetic inventory in pretonic position that mirrors the stressed vowel inventory. The paper concludes with a discussion of the results and their implications for the dispersion account of BP vowel reduction and the status of the open-close mid vowel contrast in BP.
We are aware of two previous phonetic studies that explicitly compare BP vowels in stressed and unstressed positions.4 Major (1986, 1992) investigated the phonetic correlates of metrical stress in trisyllabic paroxytone (penultimate stress) words using a reiterant speech paradigm in which the real word was pronounced followed by a mimicking CVCVCV schema: e.g. repita a palavra BATATA de novo, repita a palavra LALALA de novo (‘repeat the word potato again, repeat the word LALALA again’).5 Findings for three speakers are reported where the mean durations of the pretonic and post-tonic syllables of the paroxytone LALALA occurred in the ratios .65 vs. .45, respectively, compared to the 1.0 tonic. Major does not indicate how many words were recorded nor whether the duration measures were normalized. Fails and Clegg (1992) present the results of an investigation of the recordings of 10 male speakers from five regions in Brazil. They do not tell how many words were analyzed. The recordings were analyzed with a digital sonagraph: “The nucleic vowel formants were measured with a calibrated hand ruler and were recorded. F1 and F2 were subsequently plotted on Koenig graph charts for visual facility” (p. 35). The table in (3) shows the first and second formant measures obtained.
|(3)||BP first and second formant averages in Hz (Fails and Clegg 1992)|
The authors note (p. 38) that the low vowel is “considerably raised” in posttonic and final positions while the high and mid vowels are “moving toward neutralization.” They make the standard assumption (Ladefoged 1982) that F1 primarily correlates with articulatory vowel height (higher vowels, lower F1) while F2 reflects lip rounding/protrusion and tongue body retraction (more backing/rounding, lower F2). In particular, the pretonic and medial posttonic mid vowels in nonfinal positions are realized at heights very near to the tonic close mid vowels. The first and second formants for the [i, u, ɐ] word-final system are also provided and show minimal difference from the word-medial ones, except that, somewhat surprisingly, word-final [ɐ] has a greater F1 compared to posttonic position. Nevertheless, in BP the mid vowels are merged phonologically with high vowels in this context. No vowel duration measures are provided in Fails and Clegg (1992).
Flemming (2004) relies on these two BP studies (as well as a study of Italian) to develop his dispersion theoretic account of the neutralization of vowel height contrasts. In particular, Major’s finding that the duration magnitude is ordered tonic > pretonic > post-tonic (1.0 > .65 > .45) is applied to the Fails and Clegg’s data on /a/ in (3), where there is a significant difference in the height of /a/ in pretonic as opposed to (medial) post-tonic position. As indicated above, Flemming suggests that the shorter duration of the post-tonic and final syllables results in a raising of the low vowel, which compresses the vowel space and leads to the neutralization of the mid vowel contrasts. In pretonic position vowels are long enough to allow the low articulatory target of /a/ to be reached. Flemming does not comment on why the open-close mid vowel contrast is nevertheless neutralized in this context as well (recall the data in (2)) even though the F1 distance requirement can evidently be satisfied.
In a sociolinguistic investigation of the Gaúcho dialect of the southern Brazilian state of Rio Grande do Sul, Bisol (1989) documented a variable height harmony process in which the mid vowels /e/ and /o/ are raised to high in pretonic syllables when the following stressed vowel is high: pepino [pepínu] ≈ [pipínu] ‘cucumber’; coruja [korúʒa] ≈ [kurúʒa] ‘owl’; formiga [formíga] ≈ [furmíga] ‘ant’. A particularly interesting asymmetry in the contexts exhibiting the process was observed. While stressed /i/ raised both pretonic mid vowels, stressed /u/ raised only pretonic /o/. An /e-u/ sequence such as in veludo ‘velvet’ was largely unchanged. Bisol conjectured that the difference between /e-u/ vs. /o-u/ may lie in the relative heights of the front vs. back vowels in phonetic space. In particular, she speculates that “being less high, [u] does not exert as great an attractive force on [e], because changing the latter to [i] would mean causing a higher articulation than [u] itself” (Bisol 1989:186). Acoustic phonetic studies of BP reveal that, as in many other languages, the front vowels are higher than the corresponding back vowels (as reflected in F1). This is true of the Fails and Clegg (1992) data in (3) as well as for Escudero et al.’s (2009) data on BP stressed vowels. A parallel asymmetry in the behavior of front vs. back high vowels in triggering height harmony is found in some Bantu languages where suffixal /i/ is lowered to /e/ when the root contains either mid vowel /e/ or /o/, while suffixal /u/ is lowered to /o/ after /o/ but not after /e/ (Hyman 1998).
Our study was prompted by the observations of Abaurre and Sandalo (2009) that in the speech of many BP speakers, the pretonic vowels in such pairs as levéza ‘lightness’ vs. lelɛ́ca ‘nickname for Amelia’ as well as jocóso ‘joyful’ vs. lorɔ́ta ‘fib’ harmonize as [e-e] vs. [ɛ-ɛ] and as [o-o] vs. [ɔ-ɔ]. This looks like an extension of the height harmony noted by Bisol to the tonic mid vowels. If height harmony is triggered by the mid vowels as well then the four levels of vowel height found in the stressed syllable will be reproduced pretonically. Such harmony, if it occurs, would offer an alternative explanation for the different behavior of the low vowel in pretonic vs. posttonic position. Instead of the relative difference in duration between pretonic and posttonic vowels being the motivation for raising the posttonic but not the pretonic low vowel, maintaining a minimal F1 distance between neighboring vowels in the pretonic four-height inventory would block raising of the low vowel even though it is found in a shorter context than the stressed syllable. Thus, both explanations appeal to vowel dispersion, but they differ in the reason why dispersion is called into play: retention of phonetic contrasts vs. articulatory effort.6
More generally, there are several reasons to reexamine the behavior of pre- and post-tonic vowels in BP. The studies of Major (1986, 1992) and Fails and Clegg (1992) focused on different topics, with different speakers: the first on duration, the second on vowel reduction. If vowel reduction really is tied to duration, it would be desirable to demonstrate this point in a single experiment with the same set of speakers. Also, Fails and Clegg (1992) do not comment on the pretonic harmony noted by Bisol (1989), presenting average formant values for the underlying vowel phonemes in the various syllable positions without regard to the surrounding segmental context. On the other hand, Bisol’s study provides phonetic transcriptions rather than acoustic measures of the pretonic vowel system. We sought to investigate the BP pretonic vowels experimentally with regard to the following questions. What is the relative duration of the stressed vowel to the vowels in pretonic and post-tonic positions? Is the height of the low vowel in these positions related to duration? Does the height of the tonic vowel help to determine the height of the pretonic mid vowel? Has the height harmony noted by Bisol been extended to the mid vowels? If so, what is the effect of the resultant four-height pretonic inventory on the vowel dispersion? Finally, for the mid-vowel harmony seen in [e-e] lev[é]za vs. [ɛ-ɛ] lel[ɛ́]ca and [o-o] joc[ó]so vs. [ɔ-ɔ] lor[ɔ́]ta, are there front-back asymmetries comparable to Bisol’s [e-ú]?
We collected two sets of data (see appendix). Set A consists of 23 proparoxytone (antepenultimate stress) and 23 paroxytone (penultimate stress) nouns with CVCVCV syllable shape. This data set was used to determine the vowel duration, intensity, and timbre differences among three unstressed syllable positions in comparison to the tonic: pretonic, posttonic, and final. For the crucial pretonic vs. posttonic comparison, each of the five unstressed phonemes was represented three to five times in both positions. Set B consists of 170 CVCVCV paroxytone nouns in which the tonic penultimate syllable was varied among the seven BP vowel phonemes and the pretonic syllable was varied among the five unstressed vowel phonemes. This set was used to compare the tonic and pretonic vowel inventories and in particular to see if the height of the pretonic mid vowels was influenced by the height of the tonic vowel. The data were recorded in a soundproof booth at the first author’s university by five BP native speakers (two males and three females) who reflect the dialects in (4). Written informed consent of each subject was obtained following the study protocol approved by the MIT Committee on the Use of Human Experimental Subjects (COUHES 0410000939). Each word was recorded in the frame sentence: Ela disse X devagar ‘she said X slowly’. The recordings were made with a head-mounted Shure SM10A Unidirectional Head-Worn Dynamic Microphone and a USB Pre 2 Preamp at a sampling rate of 44.1 kHz, 16 bits.
|(4)||BP1||F||Santa Maria, Rio Grande do Sul (South)|
|BP2||F||Campinas, São Paulo (South)|
|BP3||F||Belo Horizonte, Minas Gerais (Central)|
|BP4||M||Belo Horizonte, Minas Gerais (Central)|
|BP5||M||Recife, Pernambuco (North)|
Our analysis was conducted with Praat (Boersma and Weenink 1992–2011). Target words were segmented and annotated with two text tiers: one for the entire word and the other for the vowels of each syllable. Segmentation was based on a comparison of the waveform and the formant displays in the spectrograms. Duration measures for the entire target word and each constituent vowel as well as formant values (F1 and F2) from the vowel midpoint were collected by a Praat script (Kitahara 2010). Following the recommendations in the Praat Manual, the Praat formant tracking algorithm was set to its default standard values: five formants, method Burg, pre-emphasis from 50 Hz, and an analysis range of 50 to 5,500 Hz for the female speakers and 50 to 5,000 Hz for the males. All of the measurements were then checked by hand and corrections were made where the script made an error (c. three percent of the items, chiefly due to the failure of the Praat formant tracking algorithm to separate F1 and F2 in back round vowels: 22 errors with [u] and 12 with [o]). In these cases the formants were resolved by changing the number of formants from five to four in the tracking algorithm. We also collected intensity measures of the vowels in the four prosodic positions (tonic, pretonic, posttonic, and final) with the help of another Praat script (Kawahara 2010). They were made across the entire vowel in the text grid (view range 40–100 dB, averaging method mean energy). Average and maximum F0 measures for the tonic and pretonic syllables were also taken. Mixed effects regression analyses were run and plots were produced in R version 2.11.1 (Bates and Maechler 2010, R Development Core Team 2011) utilizing the lmer function and employing both random intercepts and random slopes for speaker and word. P values were estimated by dropping the random slopes and utilizing the MCMC function. In order to accommodate speech rate differences among our speakers, duration measures were normalized as Z-scores.
Following the procedure discussed in Wang & Van Heuven (2006), vowel formant measures were first converted to the bark scale using the formula due to Traunmüller (1990) and then normalized as Z-scores.
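The two-step formant normalization just described can be sketched as follows. This is a minimal illustration and not the script used in the study: the bark conversion is the standard Traunmüller (1990) approximation without its low- and high-frequency corrections, while the per-speaker grouping, the use of the sample standard deviation, and the example F1 values are assumptions made for the sketch.

```python
from statistics import mean, stdev


def hz_to_bark(f_hz: float) -> float:
    """Traunmüller (1990) approximation: z = 26.81 * f / (1960 + f) - 0.53."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def z_scores(values):
    """Center on the mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def normalize_by_speaker(f1_hz_by_speaker):
    """Convert each speaker's formant values to bark, then z-score them
    within that speaker (assumption: normalization is done per speaker)."""
    return {
        speaker: z_scores([hz_to_bark(f) for f in values])
        for speaker, values in f1_hz_by_speaker.items()
    }


# Hypothetical F1 midpoint values in Hz; not data from the study.
example_f1 = {
    "S1": [300.0, 450.0, 600.0, 800.0],
    "S2": [280.0, 400.0, 550.0, 700.0],
}
normalized = normalize_by_speaker(example_f1)
```

Because the bark transform is monotonic, the ordering of vowel heights within a speaker is preserved; the Z-scoring then puts speakers with different vocal tract sizes on a common scale.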
4.1 Duration and Intensity
The barplots in Figure 1 below in (5) show the duration distributions for the syllable positions of interest. The penult and antepenult plots show the distribution of the stressed vowels in paroxytones and proparoxytones; final syllables are also distinguished by the paroxytone and proparoxytone stress types. The Y axis is the normalized vowel duration.
As expected, the stressed (tonic) syllable vowels are longer than the pretonic, which in turn are longer than the vowels in posttonic and final positions. The mean of the stressed penultimate vowel of the paroxytones was greater than the mean of the stressed antepenultimate vowel of proparoxytones: penultimate 1.25 (141 ms), antepenultimate 0.99 (131 ms). Simple linear regression found this difference to be significant (F = 6.312 on 1 and 218 DF, t = –2.512, p = 0.0127).7 But the t value dropped to –1.39 under mixed effects regression with random intercepts and random slopes for word and speaker suggesting that this difference may not be entirely reliable.
Another relevant point is that the pretonic syllable was higher in F0 than the stressed penult for four of our five BP speakers; our northern speaker BP5 had the opposite tendency. A paired t-test for the paroxytones’ pretonic and tonic syllables found significant differences for the log-transformed F0 maxima (t = 6.6371, df = 94, p < 0.001) and F0 averages (t = 9.313, df = 94, p < 0.001). This result coincides with the finding of Vigário and Frota (2003) concerning the alignment of the declarative HL tonal melody in European Portuguese, where the H occurs on the pretonic syllable and L on the stressed syllable. Frota and Moraes (submitted) report that this tonal contour is not found in prenuclear accent positions in BP and may reflect a more general bias of this pitch contour for the nuclear accent, as pointed out by the authors. H+L* is also frequently found on BP focused words (see also Fernandes 2007, Truckenbrodt, Sandalo and Abaurre 2009, and Toneli 2014).8
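For readers who want to see what the paired comparison of log-transformed F0 involves, here is a minimal sketch of the test statistic. The F0 values are hypothetical placeholders rather than measurements from the study, and the function returns only t and the degrees of freedom, not a p value.

```python
from math import log, sqrt
from statistics import mean, stdev


def paired_t(xs, ys):
    """Paired t statistic: mean of the pairwise differences divided by
    their standard error. Returns (t, degrees of freedom)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical F0 maxima in Hz for the same five words (not study data).
pretonic_f0 = [210.0, 198.0, 225.0, 240.0, 205.0]
tonic_f0 = [190.0, 185.0, 200.0, 228.0, 196.0]

# Log-transform before testing, as in the comparison described above.
t_stat, df = paired_t(
    [log(v) for v in pretonic_f0],
    [log(v) for v in tonic_f0],
)
```

A positive t here corresponds to a higher pretonic F0; with 95 word pairs the degrees of freedom would be 94, as in the reported tests.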
For unstressed syllables in our BP data, the pretonic vowel is longer on average than the posttonic vowel. Recall that Major (1986, 1992) found that the mean pretonic-tonic-posttonic duration ratios for his reiterant paroxytone LALALA paradigm were .65: 1.0: .45. Our paroxytone data showed a smaller difference between the two unstressed positions relative to the tonic (raw scores): .55: 1.0: .46. However, some of the discrepancy may lie in the fact that Major’s data are restricted to the low vowel. As reported below, when the low vowel is separated out in our data, the pretonic-tonic-posttonic duration ratios align quite well with Major’s findings.
In order to assess the significance of the syllable-position vowel-duration differences, we ran a mixed effects linear regression analysis following the procedure indicated above. Normalized vowel duration was the dependent variable and syllable position was the predictor. The durations of the two tonic syllable types (penultimate and antepenultimate) and two final syllable types (final antepenultimate and final penultimate) were combined. The syllable position (tonic, pretonic, posttonic, final) was backwards difference coded to allow us to assess the significance of the mean differences between each successive level of this factor. The model returned the following results for the fixed effects.9
|(6)||Mixed effects regression model of duration differences between successive syllables in the assumed prosodic hierarchy.|
|Estimate||Std. Error||t value||Pr(>|t|)|
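Backward difference coding is what makes each coefficient in such a model interpretable as the difference between one level of the factor and the level before it. The sketch below builds the contrast matrix for a four-level factor and checks this property on cell means. The function names and the mean values are hypothetical, chosen only for illustration; the analysis itself was run in R.

```python
from fractions import Fraction


def backward_difference_contrasts(k):
    """Return the k x (k-1) backward difference contrast matrix.
    Column j (0-based) compares level j+1 with level j."""
    return [
        [
            Fraction(-(k - (j + 1)), k) if i <= j else Fraction(j + 1, k)
            for j in range(k - 1)
        ]
        for i in range(k)
    ]


def cell_means_from_coefficients(intercept, coefs):
    """Rebuild the level means from an intercept and contrast coefficients."""
    contrasts = backward_difference_contrasts(len(coefs) + 1)
    return [
        intercept + sum(c * b for c, b in zip(row, coefs))
        for row in contrasts
    ]


levels = ["tonic", "pretonic", "posttonic", "final"]
# Hypothetical level means (e.g. durations in ms); not values from the study.
means = [Fraction(150), Fraction(95), Fraction(65), Fraction(66)]

# Under this coding the intercept is the mean of the level means and each
# coefficient is the difference between successive levels.
intercept = sum(means) / len(means)
coefs = [later - earlier for earlier, later in zip(means, means[1:])]

recovered = cell_means_from_coefficients(intercept, coefs)
```

Since `recovered` reproduces `means` exactly, the three coefficients carry the pretonic minus tonic, posttonic minus pretonic, and final minus posttonic differences, which is the sense in which each row of a fixed-effects table tests one step in the prosodic hierarchy.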
The difference between the pretonic and posttonic vowel durations is significant while the difference between the posttonic and final is not. The lack of an appreciable duration difference between the posttonic and final syllables was also found by Moraes (1998). Part of this might be attributed to the fact that some of our speakers tended to insert a pause between the target word and the following word in the frame Ela disse X devagar, possibly resulting in prepausal lengthening of the word-final syllable in the target word. Recall that the final syllable is the site of an additional vowel reduction in BP, where the distinction between mid and high vowels is eliminated. As Flemming observes (2004: 275), if decreased duration is the motivation for the phonological vowel reduction (rather than the opposite) then the phonetic grammar must abstract away from the effects of pause, since pausal lengthening never undoes the mid-vowel – high-vowel neutralization. He speculates that the mid-high reduction in BP’s final syllable might reflect a difference in intensity.
We followed up on this conjecture by comparing the average intensity values among the vowels in the four syllable positions in set A. The barcharts in Figure 2 below in (7) show the distribution.
A mixed effects linear regression was run to test the significance of this distribution. The syllable position (tonic, pretonic, posttonic, final) was backwards difference coded to assess the significance of the mean differences between each successive level of this factor. The regression analysis returned the following results for the fixed effects.
|(8)||Mixed effects regression model of intensity differences between successive syllables in the assumed prosodic hierarchy.|
|Estimate||Std. Error||t value||Pr(>|t|)|
We see that each syllable position is significantly different from the preceding one in the tonic, pretonic, posttonic, word-final hierarchy. Thus, for our data intensity does a better job of differentiating the prominence hierarchy than duration does. A JPL reviewer remarked that the prosodic weight of a syllable can be expected to involve the integration of several factors including duration, intensity, and possibly F0. See Gordon (2004) for one approach to the formalization of these factors.
4.2 The Low Vowel
As noted earlier, Fails and Clegg (1992) found that the low vowel in BP is raised from [a] to [ɐ] in unstressed, post-tonic positions. We investigated whether a similar effect was found for our speakers and to what extent it correlated with vowel duration. The barcharts in Figure 3 below in (9) taken from data set A show the normalized values for F1 and for vowel duration of the low vowel for our five BP speakers with respect to the four prosodic positions of interest: tonic, pretonic, post-tonic, and final. There is very little difference in F1 between the pretonic and tonic positions, while the post-tonic and final syllables show a large F1 difference relative to the tonic. In addition, there is a difference between the posttonic and final syllables, with the latter having a lower F1 value indicating a higher tongue body position. Thus our data replicate the major finding of Fails and Clegg (1992). The tonic/pretonic vs. post-tonic/final division for the low vowel seen in (9) mimics the overall vowel duration differences noted in (5), where the posttonic and final syllables are significantly shorter than the pretonic and tonic. But our data differ from Fails and Clegg (1992) in showing a greater F1 difference between the posttonic and word-final positions. This makes sense in terms of dispersion since there is only a binary height contrast in word-final position.
To what extent are the F1 differences for the low vowel a function of duration vs. syllable position? As suggested by the barcharts in (9), syllable position draws a three-way distinction of tonic, pretonic vs. posttonic vs. final while normalized duration partitions as tonic vs. pretonic vs. posttonic, final. To address this question we ran another regression where normalized F1 was the dependent variable and syllable position (backwards difference coded), normalized duration, and their interactions were the predictors. As shown in (10), syllable position accounted for more of the variance than duration. As would be expected, the interactions between position and duration are significant. Separate regressions with a single predictor each confirmed this result, with the position-only model showing the better fit: AIC 134 vs. 436. Thus, syllable position does a better job of differentiating among the heights of the low vowel compared to normalized duration.
|(10)||Mixed effects regression model of low vowel F1 as predicted by syllable position and duration.|
|Estimate||Std. Error||t value||Pr(>|t|)|
|tonic vs. pretonic||–0.29077||0.09038||–3.217||0.0015|
|pretonic vs. posttonic||–1.79929||0.12005||–14.988||0.0000|
|posttonic vs. final||0.02868||0.11113||0.258||0.7966|
One might interpret the data as showing that there is a duration threshold somewhere between the average durations for the pretonic and posttonic positions such that below this cutoff point, the speaker does not have enough time to comfortably reach the low F1 tongue position target required for [a], and so the low vowel is raised to [ɐ]. This would be the duration threshold where Flemming’s (2004) *ă constraint against a short low vowel would be activated in the phonetic grammar. The average durations and standard deviations for the low vowel as a function of syllable position for our five BP speakers are as follows: tonic 155 (39) ms, pretonic 98 (27) ms, posttonic 64 (18) ms, final 66 (16) ms. The hypothetical duration threshold would thus fall between 98 and 64 ms. Specifically for paroxytones, they are tonic 151 (15) ms, pretonic 98 (27) ms, and final 63 (15) ms. The resultant 1.0, .64, .41 tonic-pretonic-final duration ratios for the low vowel in our paroxytone data thus align very closely with the 1.0, .65, .45 ratios Major (1986) reports for his reiterant LALALA data mentioned earlier.
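The ratio computation behind this comparison is simple enough to spell out. The sketch below divides each mean duration by the tonic mean, using the paroxytone low-vowel means quoted above; the function name and dictionary layout are illustrative assumptions.

```python
def duration_ratios(means_ms, reference="tonic"):
    """Express each position's mean duration as a proportion of the
    reference position's mean duration."""
    ref = means_ms[reference]
    return {position: value / ref for position, value in means_ms.items()}


# Mean low-vowel durations (ms) for paroxytones, as quoted in the text.
paroxytone_low_vowel_ms = {"tonic": 151.0, "pretonic": 98.0, "final": 63.0}

ratios = duration_ratios(paroxytone_low_vowel_ms)
for position, ratio in ratios.items():
    print(f"{position}: {ratio:.3f}")
```

The resulting proportions fall within a few hundredths of the ratios Major reports, which is the sense in which the two data sets agree.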
4.3 Pretonic Vowel Height
The chart in Figure 4 below in (11) depicts the spacing of the tonic and pretonic BP vowels in the normalized F1/F2 vowel space based on the analysis of the words in Set B. We removed items whose vowels were adjacent to a nasal consonant since contextual nasalization may affect F1.
The tonic vowels separate into four clearly defined heights. The pretonic high vowels are lower than their stressed counterparts while the pretonic mid vowels fall between their open and close stressed counterparts but nearer to the latter. Pretonic [a] has essentially the same location as its stressed counterpart and is much lower than the posttonic low [ɐ] depicted in (9). In the F2 dimension the pretonic back vowels are somewhat more centralized than their stressed counterparts. In sum, the pretonic vowels approximate the location of the corresponding tonic vowels in the vowel space and thus our results agree with the statement of Barbosa and Albano (2004: 229) concerning BP that “In pre-stressed position … the quality of the corresponding stressed vowel is roughly preserved.”
Recall that one of the motivations for our study was the observation of Abaurre and Sandalo (2009) that the pretonic mid vowels in BP take on the height of the tonic mid vowels. In order to investigate this point phonetically, we separated out the pretonic mid vowels from set B and assessed their normalized F1 values as a function of the height of the following stressed vowel. The tonic vowels were assigned to four levels: high, close, open, and low. The plot in Figure 5 below in (12) shows that the pretonic mid vowels separate into three levels as a function of the height of the tonic.
In order to assess the significance of these differences a mixed effects linear regression was run with the normalized F1 of the pretonic vowel as the dependent variable and phonological height of the tonic vowel as a predictor variable. Height was backwards difference coded in terms of the height of the tonic: high, close, open, low. We also checked whether the sonorant vs. obstruent nature of the intervening consonant had an effect. Height assimilation sometimes crosses a sonorant consonant but blocks at an obstruent (Uffmann 2007). We also examined whether the front pretonic [e] behaved differently from the back pretonic [o], prompted by the difference noted by Bisol (1989) for high vowels. The regression analysis returned the results in (13) below. They suggest that there is a significant difference in the height of the pretonic vowel as the tonic is changed from high to close and close to open. The change from open to low is not significant nor is the change from an intervening obstruent to a sonorant consonant or the change from a back to a front pretonic vowel. We conclude that at least for these BP speakers there is a regressive height harmony effect between the tonic and pretonic open and close mid vowels in BP.
|(13)||Mixed effects regression model of pretonic mid vowel’s F1|
|Estimate||Std. Error||t value||Pr(>|t|)|
We might ask whether the height difference in the pretonic vowel can be explained as phonetic coarticulation with the tonic vowel. One way of getting at this issue is to see to what extent the height of the pretonic vowel is correlated with the height of the tonic vowel in general. If there is phonological height harmony singling out the mid vowels, then we expect the correlation to be higher for these vowels as compared to pretonic high and low vowels. This in fact is what the data reveal. In terms of simple correlation between the F1bark of the pretonic and tonic syllables, mid vowels had an r = .62, while for high vowels r = .25, and for the low vowel r = .31. Also, mixed effects linear regression predicting the normalized F1 value of the pretonic vowel from the normalized F1 value of the tonic returned the results in (14). They show that while there is a significant connection between the height of the pretonic mid vowel and that of the tonic vowel, the connections for the pretonic high and low vowels are not significant. This difference makes sense if there is a phonological process harmonizing the height of the pretonic mid vowel with the height of the tonic vowel.
|(14)||Mixed effects regression models of pretonic vowel’s F1 as predicted by tonic vowel’s F1.|
|Estimate||Std. Error||t value||Pr(>|t|)|
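The simple correlations reported for the three pretonic vowel classes can be computed as in the following sketch. The Pearson formula is standard; the bark-scaled F1 pairs are hypothetical placeholders and do not reproduce the r values given in the text.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (pretonic F1, tonic F1) values in bark, grouped by the
# height class of the pretonic vowel. These are not data from the study.
f1_pairs = {
    "high": ([3.1, 3.4, 3.0, 3.3], [3.0, 6.5, 5.2, 4.1]),
    "mid": ([4.2, 4.9, 5.6, 4.5], [4.0, 5.1, 6.2, 4.6]),
    "low": ([7.0, 7.3, 6.9, 7.1], [3.2, 6.8, 5.0, 4.4]),
}

r_by_class = {
    height: pearson_r(pretonic, tonic)
    for height, (pretonic, tonic) in f1_pairs.items()
}
```

Comparing the entries of `r_by_class` across height classes mirrors the reasoning in the text: a markedly stronger correlation for the mid vowels than for the high and low vowels is what a harmony process restricted to mid vowels would predict, over and above general coarticulation.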
We recall that in her investigation of the assimilation of the pretonic mid vowels to tonic high vowels in the Rio Grande do Sul dialect, Bisol noted an asymmetry where /e/ raised to /i/ before /i/ but not before /u/. We investigated whether there were any such front-back disparities in our data with respect to the effect of the tonic mid vowels on the pretonic mid vowels. The barchart in Figure 6 below in (15) shows the eight possible combinations of the open and close tonic vowels with a front vs. back pretonic vowel. Recall from the chart in (11) that the normalized F1 of tonic [é] was somewhat lower than tonic [ó] (–.45 vs. –.22) while the open tonic vowels had comparable heights ([ɛ́] .56 vs. [ɔ́] .60). We therefore might expect tonic [é] to have a greater vowel raising effect than tonic [ó]. This is true for [eCé] vs. [eCó] but not for [oCé] vs. [oCó], where tonic [é] is associated with greater F1 values in the pretonic vowel compared to tonic [ó]. Rather, the charts suggest a coupling effect. For the open tonic vowels, when the tonic and pretonic vowels agree in backness, the pretonic vowels are associated with greater F1 compared to when they disagree in backness: [eCɛ́] > [eCɔ́] and [oCɔ́] > [oCɛ́]. Likewise for the close tonic vowels, pretonic vowels are associated with lower F1 when they agree in backness compared to when they disagree in backness: [eCé] < [eCó] and [oCó] < [oCé].
However, due to the large variances evident in (15), regression modeling with multiple comparisons (Tukey) did not find these differences to be significant: in the first case of open vowels z = 2.174 (p = .13) and in the second of close vowels z = –1.76 (p = .29). We conclude that, at least for our data, there are no significant effects of backness or identity between the target and the trigger of harmony. All that matters is the open vs. close nature of the tonic.
The motivation for this study was to explore the phonetic assumptions underlying two alternative accounts of Brazilian Portuguese vowel reduction. The first sees reduction as reflecting abstract prosodic structure while Flemming’s (2004) alternative dispersion-theoretic analysis attributes the reduction to the effort constraint against a low vowel in the decreased duration available in posttonic and word-final syllables. On the prosodic account, the proparoxytone and paroxytone words could be analyzed as having the prominence structures depicted in the metrical grids (Halle and Vergnaud 1987) of (16). There is a lexical contrast in the location of the main stress. Metrical structure would reflect the parameter settings/constraint rankings enforcing exhaustive parsing into optimally binary left-headed constituents with final syllable extrametricality for line 0 and right-headed constituents for line 1.
These structures align rather naturally with the intensity and duration measures discussed in section 4.1 as well as the contexts where the open-close mid-vowel contrast is licensed. The height of the columns of grid marks readily distinguishes tonic, pretonic and posttonic syllables; the final syllable stands out as being unparsed and this is the syllable with the lowest intensity and least F1 in our experimental results. Furthermore, the durational difference between the stressed penult and antepenult noted in 4.1 correlates with whether or not the foot contains one or two syllables. The lexical open vs. close mid vowel contrast is licensed in syllables associated with a level-2 grid mark. The posttonic syllables would reduce in quality and duration in virtue of their flat prosodic structure. Crosswhite (2004) analyzes these syllables as nonmoraic. Finally, the greater raising of the low vowel in word-final position and the concomitant loss of the mid vs. high vowel contrast could be associated with the final syllable’s unparsed grid mark. As noted earlier, the H+L* pitch contour imposed by the majority of our speakers suggests that the words were associated with a nuclear sentence accent, perhaps induced by focus (Fernades 2009 and Frota et al. 2015). An interesting follow-up study would be to try to explicitly manipulate nuclear and prenuclear accents to see what effect this factor would have on the duration and intensity correlates of the various prosodic positions, in particular pretonic vs. posttonic.
In the alternative dispersion-theoretic account the basic thesis was that due to the reduced time available in unstressed syllables, the low vowel is raised for reasons of articulatory effort. This in turn violates the minimum distance requirement along the F1/vowel height dimension. The phonological response is to neutralize the contrast between the open and close mid vowels so that the minimum distance requirement is maintained among the remaining three vowel heights. Our study of five BP speakers found that the low vowel is raised to [ɐ] in posttonic and word-final syllables, corroborating one of the assumptions of this analysis (and replicating the finding of Fails and Clegg (1992) upon which it was based). But we also found that in pretonic position the low vowel is not markedly raised and assumes a height comparable to that of the stressed vowel. The duration (and intensity) of the low vowel is significantly greater in pretonic compared to posttonic position and so we may infer that vowel duration must fall below 96 ms before the constraint *ă that motivates raising the low vowel is activated in the phonetic grammar. What remains unexplained in the dispersion account is why the pretonic syllable is a site for the neutralization of the close-open contrast, as seen in the alternations in (2), even though the Minimum Distance constraint can evidently be satisfied.
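The logic of the dispersion argument can be illustrated with a toy calculation. The sketch below is a deliberate simplification rather than Flemming's (2004) constraint-based formalization: it assumes evenly spaced vowel heights on an arbitrary scale, and the function name `max_height_levels`, the scale, and the threshold value are all hypothetical, not measurements from this study.

```python
import math


def max_height_levels(high_f1, low_f1, min_distance):
    """Largest number of evenly spaced vowel heights between the high and
    low vowel whose neighbors are at least min_distance apart."""
    if min_distance <= 0:
        raise ValueError("min_distance must be positive")
    span = low_f1 - high_f1
    if span < 0:
        raise ValueError("the low vowel must not have a smaller F1 than the high vowel")
    # n evenly spaced levels leave a gap of span / (n - 1) between neighbors,
    # so the gap stays >= min_distance as long as n - 1 <= span / min_distance.
    return math.floor(span / min_distance) + 1


# Arbitrary illustrative units (invented values, NOT measured F1 data).
full_space = max_height_levels(0.0, 3.0, 1.0)     # low vowel not raised
reduced_space = max_height_levels(0.0, 2.4, 1.0)  # low vowel raised
```

With these invented values the unreduced space accommodates four heights and the compressed space only three, which is the step from a four-way to a three-way height contrast that the dispersion account relies on.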
According to Brandão de Carvalho (1988–92: 7), the Portuguese of c. 1500 AD had a raised low vowel [ɐ] in pretonic as well as posttonic position, as does present-day European Portuguese.10 If this is true then the pretonic [a] seen in the present-day language must be an innovation of the Brazilian variety. The pretonic neutralizations of the open vs. close mid vowels seen in (2) may reflect this earlier stage of the language. In current BP, the pretonic realization of the low vowel as [a] entails that there is enough space available to support a four-way height distinction in this position that is comparable to the tonic syllable. This in turn would permit the pretonic mid vowels to split into close and open allophones as a function of the tonic vowel. As far as we know, this pretonic harmony is not found in the European variety of Portuguese. In addition to the vowel harmony, pretonic open mid vowels also appear in BP in the analogical extensions of the open mid vowels in diminutives such as flɛ́cha, flɛchínha ‘arrow’ and bɔ́la, bɔlínha ‘ball’ studied by Ferreira (2005).
Turning to the tonic vowels, Wetzels (2011) observes that for many BP speakers the close mid vowels are being replaced by open vowels in various derived contexts such as before the prestressing –ic suffix (cf. esquel[é]to ‘skeleton’, but esquel[ɛ́]tico ‘skeletal’) as well as in loanwords such as m[ɔ́]vel ‘mobile’ and c[ɔ́]dex ‘codex’. Kenstowicz (2010) discusses parallel examples of this phenomenon in Standard Italian. Chitoran (2002) analyzes the ə≈á, e≈eá, o≈oá alternations of Romanian as vowel lowering under stress. What could be the motivation for the preference of open over close mid vowels in tonic syllables? One possibility is that the open vowels are longer and hence more optimal bearers of stress, given that duration is a primary cue for stress in Portuguese as well as Italian. Indeed, Escudero et al. (2009) observe that the duration ratio between high and low stressed vowels is relatively large in Portuguese (1.33) compared to Iberian Spanish (1.14) or Continental French (1.13). Moreover, stressed vowels are longer in BP compared to European (Lisbon) Portuguese. For example, for female speakers Escudero et al. (2009) found that BP high vowels averaged 99.5 ms vs. 144 ms for [a] while for EP the differences were 93 ms (high) vs. 122 ms for [a].11 Also the duration of the BP open mid vowels was very near the duration of the low vowel: [ɛ] 141 ms, [ɔ] 139 ms, [a] 144 ms. Escudero et al. (2009: 1390) suggest that “Portuguese has turned duration into a language-specific (minor) cue for phonological vowel identity.” But dispersion in the vowel space can also motivate the preference for open over close mid vowels in stressed position. As indicated by our chart in (11), the close mid vowels are nearer to the corresponding high vowels, while the open mid vowels overlap minimally with the low vowel in F1 and virtually not at all in F2.
Thus, if a choice is to be made between the open and close mid vowels, the open vowels are a better option on grounds of dispersion and this factor could motivate the changes from close to open in the stressed syllable discussed by Wetzels (2011). In this regard, the loanword adaptations in Slovene reported by Jurgec (2010) become important. This language also contrasts open and close mid vowels. But in Slovene open mid vowels are adapted as close in loanwords, the opposite choice from Italian and BP: cf. r[ɔ]ck > [rok] ‘rock music’, [ɛ]cstasy > [ˈekstazi] ‘ecstasy’. However, as noted by Kenstowicz (2010), Slovene differs from Italian (and BP) in having a central vowel /ə/ phoneme that is phonetically closer to the open /ɛ/ and /ɔ/ than to the close /e/ and /o/. Thus, for Slovene the close mid vowels are a better choice than open mid vowels in terms of dispersion.
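The female-speaker durations cited from Escudero et al. (2009) earlier in this discussion can be turned into explicit ratios. The sketch below is plain arithmetic on the means quoted in the text; the dictionary layout and the function names `ratio_to_high` and `ratio_to_low` are assumptions made for illustration, and the resulting values should not be equated with the overall 1.33 figure, which is not restricted to these female-speaker means.

```python
# Mean stressed-vowel durations in ms for female speakers, as quoted in the
# text from Escudero et al. (2009). Keys are hypothetical labels.
FEMALE_DURATIONS_MS = {
    "BP": {"high": 99.5, "low": 144.0, "open_e": 141.0, "open_o": 139.0},
    "EP": {"high": 93.0, "low": 122.0},
}


def ratio_to_high(variety, vowel="low"):
    """Duration of the given vowel relative to the high-vowel mean."""
    durations = FEMALE_DURATIONS_MS[variety]
    return durations[vowel] / durations["high"]


def ratio_to_low(variety, vowel):
    """Duration of the given vowel relative to the low vowel [a]."""
    durations = FEMALE_DURATIONS_MS[variety]
    return durations[vowel] / durations["low"]
```

On these figures the low-to-high ratio comes out at roughly 1.45 for BP and 1.31 for EP, and the BP open mid vowels reach about 98% ([ɛ]) and 97% ([ɔ]) of the duration of [a], consistent with the observation that open mid vowels pattern with the low vowel in length.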
6. Summary and Conclusion
In this paper we reported the results of the phonetic analysis of two data sets designed to assess the degree of vowel reduction in Brazilian Portuguese across four prosodic contexts (tonic-pretonic-posttonic-final) with respect to vowel duration, intensity, and locus in F1-F2 space. Our most immediate goal was to reconfirm previous findings from Major’s (1986) study of duration and Fails and Clegg’s (1992) study of vowel spacing with data gathered from a single set of speakers recorded in a comparable laboratory setting using modern analytic tools. Our results replicated Fails and Clegg’s findings on the different positions of the BP low vowel relative to the tonic: pretonically the vowel is comparable to the tonic in quality, while posttonically it is raised dramatically to [ɐ] or [ə]. We found that the F1 realization of the low vowel was better predicted by abstract prosodic position (tonic > pretonic > posttonic) than by phonetic duration. Phonetic intensity also aligned well with the tonic-pretonic-posttonic-final hierarchy. We also studied in detail the realization of the mid vowel phonemes in pretonic position and found a correlation with the phonological height of the tonic vowel. In particular, the mid vowels separated into open and close variants that mimic the location of open and close tonic vowels. This confirms experimentally the claims of Abaurre and Sandalo (2009) and Freitas (2010) that the height harmony induced by tonic high vowels documented by Bisol (1989) also occurs with the tonic mid vowels. Our general conclusion is that the absence of low-vowel raising to [ɐ] in pretonic position probably represents an innovation in the Brazilian variety of Portuguese (Brandão de Carvalho 1988–92) that is correlated with if not caused by the harmony that splits the mid vowel phonemes into open and close variants that anticipate the height of the following tonic syllable. Another contributing factor is the analogical extension of the open vs. close contrast to the pretonic syllables preceding the diminutive suffix studied by Ferreira (2005). The resultant seven-vowel pretonic inventory parallels the spacing found in tonic syllables. An interesting follow-up study would be to investigate whether the BP speaker can predict the height of the downstream tonic vowel on the basis of the pretonic and if so whether this information gives an advantage in lexical access.