On the nature of sC-clusters in European Portuguese

Research on the nature of word-initial sC-clusters in different languages led scholars to propose several hypotheses on the syllabification of the segments involved in this sequence. Elaborating on the analyses available in the literature for Portuguese and for other languages, we will focus on the segmental and syllabic properties of the members of this cluster in European Portuguese (EP). We will provide empirical evidence from EP dialectal variation and from EP L1 acquisition to argue for the Coda status of the initial fricative and for the presence of a left-adjacent empty Nucleus (Andrade & Rodrigues, 1998; Mateus & Andrade, 2000).


Introduction
The reports available in the literature accounting for the behaviour of word-initial fricative plus obstruent clusters (sC-clusters) show that the syllabification of these clusters follows different patterns in the languages exhibiting this sequence and that similar syllabifications may display different phonetic shapes. It is traditionally assumed that Romance languages tend to present a vowel at the left-edge of the cluster, while Germanic languages do not show this vowel in overt output forms (as in English star [sta ] and Dutch straat [stra :t]). However, not all Romance languages present the same strategies concerning word-initial sC-clusters, as we may observe in the forms for 'mirror' in the following Romance languages: (i) some languages exhibit a vowel in word-initial position in overt output forms (Spanish [espe xo], Brazilian Portuguese [ispéu], Galician [espe o], Catalan [espi ]); (ii) other languages tend to use word-initial sC-clusters (Italian [sp o], Corsican [sp ccu] and Rheto-Romance [pi agal]) 2. Some languages exhibit different solutions for words that formerly had sC-clusters (see French 3 for [skalje ] escalier 'stairs', [ek l] école 'school', [spesifi k] spécifique 'specific').
In European Portuguese (EP), as will be shown in this paper, this type of cluster generally surfaces without a left-adjacent vowel, although a word-initial vowel is possible in some contexts, and the deletion of the fricative is not an option in the language.
In this paper, empirical evidence from EP dialectal variation and from EP L1 acquisition will be used to discuss the syllable structure assigned to sC(C)-clusters in this language; based on this evidence, we will argue that, for words like [k l] escola 'school', the syllabification is as follows: (i) the initial fricative of the cluster associates to a coda position with a left-adjacent empty nucleus and (ii) the following consonant(s) associate to an onset position. The structure of the paper is as follows: in section 2, we will summarise the hypotheses presented in the literature for the analysis of sC(C)-clusters in several languages; in section 3, we will supply information on the nature of the data reported in this study; in section 4, we will describe the types of sC-clusters in EP, following Andrade & Rodrigues (1998), Rodrigues (1999, 2000) and Mateus & Andrade (2000); finally, we will provide empirical evidence from dialectal variation (section 5) and from L1 acquisition (section 6) to discuss the hypotheses listed in section 2.

The hypotheses for the analysis of sC-clusters
The structure under analysis in this paper is problematic since it may violate universal principles of syllabic constituency, namely the Sonority Sequencing Principle, the Dissimilarity Condition 4 and the requirement that constituents in the grammar be maximally binary 5. The problematic nature of sC-clusters has been a frequent topic of discussion for various languages. Based on the different output forms displayed by languages that show sC-clusters, scholars have proposed a considerable number of analyses for these clusters. In (1), we summarise the hypotheses that have been put forward in the literature and that, in our opinion, may be relevant for the analysis of this structure in EP:

The dialectal variation data
The data concerning EP dialectal variation discussed in this paper belong to a larger corpus (CPE-Var) 7 that includes 180 interviews with native speakers of the dialectal varieties of EP spoken in Lisbon and Braga. The interviews were held in 1996-97. The same interviewer made all the recordings, in places that were familiar to the interviewee. The sociolinguistic interview has the following structure: formal speech, word-list reading, sentence-list reading, text reading and informal speech. The reading tests were designed to cover several types of linguistic variables within the word and between words. The recordings lasted between 45 and 90 minutes, depending on the speaker's reading ability and profile. Data include male and female speakers (normally three per cell) from four educational groups and five age groups. The educational groups are the following: 0 - no formal education; 1 - nine years of school attendance at most; 2 - twelve years of school attendance at most; 3 - speakers having academic degrees. The following age groups were tested: 1 - 13 to 19 years old; 2 - 20 to 25 years old; 3 - 26 to 39 years old; 4 - 40 to 55 years old; 5 - above 55 years old. A sample of the data was phonetically transcribed (on an auditory basis), inserted in a database (using Access, developed for this purpose) and linguistically encoded, according to the relevant linguistic phenomena exhibited and the speaker's profile. This sample includes over 53000 utterances of full content words with reference to speech style or sentence context. For this paper we use all the utterances containing sC-clusters in the database.

The L1 acquisition data
The EP acquisition data come from a corpus 8 containing longitudinal cross-sectional data from 7 monolingual Portuguese children aged 0;10 to 3;07; all the children were acquiring the Lisbon variety of the language. The children were videotaped monthly for 1 year (one of the children was videotaped for 2 years). The sessions took place in the children's homes, with the mother and the researcher present, and lasted between 30 and 60 minutes. The database format used to analyse the children's productions is the CHILDPHON database, an application of the 4th Dimension software for Macintosh, developed at the Max Planck Institute for Psycholinguistics, Nijmegen, and first used in Levelt (1994) and Fikkert (1994). The database for EP contains 18 654 spontaneous utterances. For this paper, we examined all utterances containing target words with word-initial sC-clusters.

The sC-clusters in European Portuguese
Andrade & Rodrigues (1998) propose a clear-cut distinction between four consonantal structures involving the sC-clusters under discussion in this paper. We illustrate these four structures in (2):
(2) EP standard forms (groups a. to d.)
According to the data in (2), it is possible to observe that not all sC-clusters behave similarly in the standard variety of EP: group a. does not surface with a word-initial vowel and presents a word-initial [-anterior] coronal fricative; the words in group b. obligatorily display a [+anterior] coronal fricative and a left-edge vowel is not attested; group c. presents a word-initial [i] vowel and the fricative is always [-anterior]; finally, group d. shows an alternation pattern, i.e., variants without a word-initial vowel and with an [] initial diphthong are possible, the fricative being exclusively [-anterior]. Section 5 of this paper will provide information on how the typology in (2) is supported by dialectal variation data concerning the behaviour of Lisbon and Braga speakers during the performance of several formal and informal verbal tasks.
Voicing is an important aspect in the data under analysis: as we may observe in (2), the [-anterior] fricative assimilates the voicing properties of the right-adjacent consonant ([t ] 'star' versus [go tu] 'drain'). As will be discussed in sections 5 and 6, this behaviour matches the one displayed by word-medial and word-final coda fricatives in EP. Different syllabifications for the sC-clusters in (2) are assumed in the EP grammar (their presentation in this section follows the analyses in Andrade & Rodrigues, 1998; Rodrigues, 1999, 2000; and Mateus & Andrade, 2000).
It is generally stated that the standard forms of words from group a. (escola [k l] 'school') do not exhibit a word-initial vowel (on this topic, see the description in sections 5 and 6). This leaves us with a problematic word-initial structure: (i) are all the consonants of the cluster part of the same word-initial onset (according to Hypotheses A and B)? If this is true, how do we account for word-initial tri-positional onsets and for the violation of the Sonority Sequencing Principle and the Dissimilarity Condition within the complex onset, as in escravo [ka vu] 'slave'? (ii) are the fricative and the following consonant(s) associated to two different onsets under two different syllabic nodes (see Hypothesis D)? (iii) is the syllabic status of the fricative different from that of the following consonant(s) (see Hypotheses C and E)? Some important aspects argue for the coda status of the fricative and for the presence of a word-initial empty nucleus, as stated in Andrade & Rodrigues, 1998 (see Hypothesis E2): (a) the fricative in group a. always displays voicing assimilation from the right-adjacent consonant; (b) in case of vowel epenthesis in word-initial position, not all vowels are possible; (c) the behaviour of the sC-clusters under external sandhi conditions and in derived forms shows that there is an empty nucleus available at the left-edge of the cluster. These aspects concerning the behaviour of sC-clusters from group a. will be discussed further on in this paper (see sections 5 and 6).
Group b. is the one showing an obligatory [+anterior] voiceless fricative in word-initial position. This argument is used to discriminate words like stress from the other three types of words in (2) and it shows that the fricative in these words does not behave as a coda fricative in EP. As we will see in section 5, the analysis of these word-initial sC-clusters matches the one assumed for other problematic clusters in the system, as in pneu [pnéw] 'tyre' or afta [áft] 'ulcer': the two consonants are onsets of different syllable nodes; the word-initial /s/ is the onset of a syllable with an empty nucleus (Mateus & Andrade, 2000). The arguments used to support this analysis in words from group b. are: (i) the fricative does not assimilate voicing from the following consonant; it surfaces as [+anterior], the unmarked value for C-place; (ii) a branching onset with the /sC/ shape would violate the Sonority Sequencing Principle and the Dissimilarity Condition; (iii) it is possible to find [] epenthesis between the two members of the cluster (stress [st s], as will be demonstrated in section 5, below).
As for the words in group c., the fact that there is a word-initial [i] vowel in the citation form of the words and that the fricative undergoes the voicing assimilation process shows that the nucleus of the first syllable is lexically associated to a root node and that the fricative is a coda of this initial syllable.
Since these words tend to occur with [i], unlike group a. words, they have /iS/ in the structure.
Finally, it is assumed that group d. allows two possible standard overt output forms: (i) the fricative is produced in word-initial position ([p su] 'express'); (ii) a word-initial diphthong occurs ([p su] 'express'). In section 5, we will see how frequent the two variants are and whether those are the only possible output forms for the words in group d.; moreover, we will discuss the segmental nature of this phonological nucleus: is it an empty category, as in words from group a., or is it lexically filled with a root node? If this last scenario is right, what is the segmental nature of the vowel associated to this word-initial nucleus?

The dialectal variation data
Andrade & Rodrigues (1998) elaborate on the results from a sample extracted from the CPE-Var, which includes 21 male and female speakers, aged 26 to 39, either graduates or with less than nine years of school education. The main results reported in this paper allow us to observe that the four groups of words in (2) exhibit different variation patterns, even though groups a., c. and d. all share the initial []C or []C-variants.
Words in a. show different variation patterns in Lisbon and Braga. Lisbon speakers from age group 3, independently of their educational group, almost exclusively produce the initial []C or []C-variants in spontaneous speech (98% for female speakers and 99% for male speakers; other variants only occur in reading tests). Similar results are attested for Lisbon speakers with a high educational level in reading tests. However, if we consider data from word-list reading tests performed by females with a low educational level, productions with a word-initial vowel correspond to 100% in the forms that have a voiced obstruent as the second consonant of the cluster (C2), and to 40% if the C2 is voiceless.
As for Braga speakers, they produce [i]C, []C and [e]C-variants, in addition to the []C-variant (this last variant is the most frequent one for all kinds of speakers, in all the tasks performed). Women with a high educational degree are the only speakers to use [e]C, a variant exclusive to the word-list reading test (2%). These residual forms, then, may be due to written-form processing or to misinterpretation of the standard form, since this type of speaker from the non-standard variety has shown a good command of the standard forms for other variable phenomena 9. This behaviour of women from the non-standard variety argues in favour of the marked status of the cluster. Men and women from Braga use [i]C and []C variants in all three tasks performed (word-list reading, text reading and spontaneous speech), as reported in Andrade & Rodrigues (1998). Variants with [i] at the left-edge of the fricative correspond to 32% in word-list reading, to 57% in text reading and to 6% in spontaneous speech. Variants having [] in word-initial position, although present in all speech styles, surface more often in spontaneous speech than in the two other tasks. Their results are always under 10% of the utterances with sC-clusters. Words having a voiced consonant as C2 surface more often with a vowel at the left-edge of the fricative than those having a voiceless C2, both in the reading tests and in informal talk. The lexicon of EP exhibits voiceless consonants in C2 position more often than voiced ones for these clusters. Since the fricative assimilates voice from C2, and since the auditory perception of a vowel at the left-edge of a voiced consonant is more likely than the perception of a vowel at the left-edge of a voiceless consonant, these results may be explained by co-articulation factors and phonetic/phonological processing.
In conclusion, the speakers from Braga produce the words of type a. with a word-initial vowel much more often than Lisbon speakers; the production with a word-initial vowel in these words is virtually absent in Lisbon, although some residual forms appear in reading tests.
In words from type a., speakers with a high educational level use the variant with no initial vowel more often than other speakers from both varieties. Male speakers use vowel-initial surface forms of these words more often than women.
Words in b. (snob) never show up with initial [] or []; therefore, they have to be distinguished in structural terms from the three other groups of words. This type of word may surface either with [s] ([snui mw ] snobismo (essa) 'snobbery', Braga speaker 117) or with [s] before C2 in EP ([sn ] snob 'snob', Braga speaker 164). As the fricative does not assimilate the voicing from C2 (unlike the fricative of the words in a.), it is assumed that it must be specified as [-voice]. Similarly, since it is always [+anterior], it does not show the regular behaviour of coda fricatives in EP (see examples in (3)), i.e., a fricative in coda shows up as [-anterior] (festas [f t] 'parties').
It is assumed that words of group c. (isqueiro 'lighter') allow [i] and [], although the production with vowel deletion is rare. In the spontaneous speech from the CPE-Var this structure never occurs, suggesting that this is a marked structure in EP. Word-list reading tests present 100% of [i] for the only lexical item with this structure in the data, Israel 10.
The words of group d. are the only ones licensing the variants [ej]/[ej] and [j]/[j] (see (4) below), which is used as empirical evidence to assume the structure /eS/C in these clusters. This group of words shares [e] or [e] variants with other types of words. Word-initial [] or [] variants in the reading tests from the CPE-Var correspond to 51% for Lisbon speakers, while for spontaneous speech data these variants correspond to 91%. The same variants correspond to 8.8% in the reading tests from Braga speakers, and to 40.4% in spontaneous speech. This means that the standard variety uses the empty nuclei surface forms for /eS/C-clusters more often than the non-standard variety (Braga).
In short, although the words in a., c. and d. may all be produced without an initial vowel, the presence of the vowel occurs much more often in type c. and type d. than in type a. words.
In accordance with the data presented above, Andrade & Rodrigues (1998) claimed that in words from group a. (escola 'school') the fricative syllabifies in the coda of the first syllable, within a rhyme with an empty nucleus (see Hypothesis E2). In these words, the second consonant of the cluster is under the domain of the onset of the following syllable; in case there is a right-adjacent liquid, the /obstruent+liquid/ sequence is syllabified in the branching onset of the second syllable (escravo [ka vu] 'slave'). This analysis is consistent with the behaviour of coda fricatives in EP. Since this fricative assimilates the voicing specification of the following consonant and surfaces as [-anterior] before a consonant, it is assumed that it is unspecified for voice and C-Place (therefore it is represented as /S/), as is the case with the C-place of the other coda segments (Mateus & Andrade, 2000). This analysis accounts for the EP variation data described above. In fact, words like escola 'school' allow only the typical variants ([i] and []) to fill empty nucleus positions in EP, although they do not surface in spontaneous speech for Lisbon speakers; group a. words (with the empty nucleus) do not exhibit all the surface forms that words like experiência 'experience' do, namely [ej]/[ej] and [j]/[j], the ones used to argue for the specification of the nucleus (/eS(C)/). The analysis of the fricative as a coda with a left-adjacent empty nucleus avoids the violation of the Sonority Sequencing Principle in the onset, as would very often be the case if the fricative and the following consonant were under the domain of the same branching onset. Furthermore, it implies no violation of the binarity of constituents in the case of sCC-clusters (esgrima 'fencing'), i.e., the fricative is a coda and the /obstruent-liquid/ cluster is a branching onset; therefore, the onset is bi-positional. Moreover, this analysis makes a clear distinction between type a. and type b. words: words in a. have an unspecified coda fricative, /S/, while group b. words have the specified fricative /s/.
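As a rough illustration of this underspecification analysis, the sketch below derives the surface form of coda /S/ from the voicing of the right-adjacent segment. The function name `realize_coda_S` and the two segment inventories are assumptions introduced here for illustration (a simplified, ASCII-friendly subset of EP consonants); they are not taken from the analyses cited above.

```python
# Illustrative sketch only: the segment classes below are a simplified,
# hypothetical subset of EP consonants ("r" stands for the tap).
VOICELESS = {"p", "t", "k", "f", "s"}
VOICED = {"b", "d", "g", "v", "z", "m", "n", "l", "r"}


def realize_coda_S(following=None):
    """Return the surface form of underspecified coda /S/.

    /S/ has no lexical values for voice or C-Place; in coda it surfaces as
    [-anterior] and takes its voicing from the right-adjacent consonant.
    `following=None` stands for a pause (assumed here to pattern as voiceless).
    """
    if following is None or following in VOICELESS:
        return "ʃ"  # voiceless [-anterior] fricative
    if following in VOICED:
        return "ʒ"  # voiced [-anterior] fricative
    raise ValueError(f"unknown segment: {following!r}")


print(realize_coda_S("k"))  # escola 'school': /S/ before voiceless /k/
print(realize_coda_S("g"))  # esgoto 'drain': /S/ before voiced /g/
```

Group b. words such as stress fall outside this function, since their fricative is analysed as a lexically specified /s/ rather than /S/.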
The analysis of group a. words with an empty nucleus in initial position also accounts for the fact that, when the prefix /iN/ is added, they behave as words beginning with a vowel. For instance, inesperado 'unexpected' (/iN+Sp+ad+o/) surfaces as [in ] (Braga speaker 110, word-list reading) with a filled empty nucleus or as [in ] (which contrasts with inspirado 'inspired') (Braga speaker 157, word-list reading) in EP, but, noticeably, with an [n] in onset position and not with the nasalization of the preceding vowel, as we would expect if the following onset were filled. Notice that the behaviour of /N/ 11 in derived words in EP is as follows: if the following onset is empty, /N/ is attached to the onset position ([inábi], Lisbon speaker 33, word-list reading, from /iN+abil/ inábil 'unable'); if the onset is filled, it shows up on the previous nucleus ([i li ], speaker 33, word-list reading; [i li ], Lisbon speaker 8, word-list reading, from /iN+feliS/ infeliz 'unhappy'). The presence of an epenthetic [] in derived words from group a. (inesperado) shows that they have a word-initial empty nucleus. The facts mentioned above argue against Hypotheses A, B and D, since they show that the fricative is not an onset.
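The behaviour of /N/ just described can be sketched as a simple decision on whether the onset following the prefix is filled. The `Syllable` class and the function `prefix_nasal_outcome` are hypothetical names used only for this illustration, and an empty nucleus is modelled as `None`; this is a minimal sketch of the generalisation, not an implementation of any published algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Syllable:
    onset: str               # "" models an empty onset
    nucleus: Optional[str]   # None models an empty nucleus
    coda: str = ""


def prefix_nasal_outcome(first: Syllable) -> str:
    """Say where the /N/ of the prefix /iN/ surfaces, given the stem's first syllable."""
    if first.onset == "":
        return "[n] in onset"             # as in inábil and inesperado
    return "nasalized prefix vowel"       # as in infeliz


# Group a. stem under Hypothesis E2: empty onset, empty nucleus, /S/ in coda.
esperado = Syllable(onset="", nucleus=None, coda="S")
feliz = Syllable(onset="f", nucleus="e")

print(prefix_nasal_outcome(esperado))
print(prefix_nasal_outcome(feliz))
```

Under this sketch, a group a. stem patterns with vowel-initial stems simply because its first onset is empty, which is the point made by the coda analysis.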
Words in b., then, have a specified /s/ (not an /S/) in initial position and, therefore, the fricative belongs to the onset of the first syllable (which also contains an empty nucleus), while the next consonant belongs to the following onset. This is true for these loanwords because the fricative in EP is always produced as [s], independently of the production of the nucleus. Moreover, if the nucleus is produced, it shows up on the right side of the fricative, never on its left, as happens in Brazilian Portuguese (BP). In BP, then, the structure of the loanword has been interpreted with /S/ in coda (stress 'stress' may surface as [itr si] in BP, contrary to EP). This interpretation of words like snob matches the one of words like pneu [pne w]/[pne w] 'tyre' and amnésia [ ]/[ ] 'amnesia', where the two consonants belong to two different onsets (again, the first syllable has an empty nucleus). This interpretation is consistent with the base syllabification algorithm proposed by Mateus & Andrade (2000), which prevents consonants with the same degree of sonority and/or similar articulator features from syllabifying in the same onset in Portuguese. This algorithm establishes that only liquids may be considered the second segment in a branching onset, since syllables (and their onsets) must show an increase of sonority from the beginning towards the nucleus and, simultaneously, the consonants of the onset must show a certain degree of dissimilarity 12. Branching onsets are binary constituents. The analysis proposed for stress and pneu 'tyre' is similar, since, whenever the empty nucleus position is filled, the vowel shows up after the first consonant as [] or [i], the only vowels that perform this task in Portuguese (EP speakers use either [] or [i] for this task, while BP speakers use [i]).
Words in c. (isqueiro 'lighter') have an /iS/ structure; therefore, the fricative is a coda of the initial syllable. The next consonant belongs to the following onset. Sporadically, these words surface with deletion of the initial vowel. The normal production of these words, contrary to words such as escola 'school', displays [i] initially. The fact that isqueiro 'lighter' and escola 'school' behave differently is, then, due to their different initial nucleus specification.
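The onset restriction described above can be approximated by a small check: two consonants may share an onset only if the second is a liquid and sonority rises from the first to the second. The sonority values, the ASCII segment symbols and the function name `can_share_onset` are assumptions made for this sketch; it captures only a necessary condition and does not model the Dissimilarity Condition in full.

```python
# Hypothetical, simplified sonority scale (higher = more sonorous).
# ASCII symbols only; "r" stands for the tap.
SONORITY = {
    **{c: 1 for c in "pbtdkg"},  # stops
    **{c: 2 for c in "fvsz"},    # fricatives
    **{c: 3 for c in "mn"},      # nasals
    **{c: 4 for c in "lr"},      # liquids
}
LIQUIDS = {"l", "r"}


def can_share_onset(c1: str, c2: str) -> bool:
    """Necessary condition for c1+c2 to form a branching onset.

    The second member must be a liquid and sonority must rise from c1 to c2.
    """
    return c2 in LIQUIDS and SONORITY[c1] < SONORITY[c2]


print(can_share_onset("g", "r"))  # /gr/ as in esgrima: may branch
print(can_share_onset("s", "t"))  # /st/ as in stress: must be split
print(can_share_onset("p", "n"))  # /pn/ as in pneu: must be split
```

Sequences rejected by this check are the ones that, in the analysis above, are split over two onsets separated by an empty nucleus.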
Words from group d. have /eS/C-clusters and, therefore, may vary according to the general pattern of <ex-> prefixed words. This includes the productions with vowel deletion, [], [i], [e], [ej] and [j] (and their voiced counterparts, as far as the fricative is concerned), some of which are disallowed by the other groups of words. [i] in word-initial position results from the reduction of unstressed /e/, a regular lexical process of EP in this position. Surface forms including diphthongs were further explained in Rodrigues (2000). The author relates the glide insertion in this context to the insertion of the same glide between /e/ and [-anterior] Coronal consonants, also found in EP in words such as 13. Note that the insertion of the glide does not occur in words like [se tu] cesto 'basket'. To account for these apparent exceptions, Rodrigues (2001) proposes that the /S/ in texto and similar words must be specified as Coronal, unlike the /S/ of cesto, which lacks specification for C-Place as a whole. Notice also that in EP every time /S/ is syllabified in coda position it receives the [-anterior] specification, the non-default value for C-Place. If this unspecified consonant must surface before a vowel, either in the middle of a word or between words, the /S/ is syllabified in onset position and receives default values for C-Place (e.g., it surfaces as Coronal [+anterior], [z] by regressive assimilation of voice from the following vowel, as in festas alegres [f tzl g] 'happy parties').
Rodrigues (1999) and Rodrigues (2000) report more data on type a. words (these data belong to 113 speakers of the CPE-Var). In Rodrigues (1999), the author concludes that younger speakers use forms with no initial vowel more often than older ones: three age groups were compared (groups 1, 3 and 5 of the CPE-Var) and group 1, the group with the youngest speakers, only produces forms without a vowel for the empty nucleus position. The data also confirm the patterns of variation found in the three speech style materials in Andrade & Rodrigues (1998) for age group 3. Results discussed in Rodrigues (2000) also include speakers from the fourth age group of the CPE-Var. These results are compatible with the explanation given above for all the categories of words, in both varieties of the language. This indicates that no-vowel word-initial variants are gaining ground over the variants with empty nuclei filling in EP. We may conclude, then, that initial empty nucleus words (e.g. escola 'school') are subject to an ongoing change process in EP. Although they still admit production with empty nucleus filling in some varieties of the language (namely in Braga), in some speakers (men more than women, low educational level more than high educational level, old speakers more than young speakers) and in some speech styles (more in reading tasks than in informal talk), the standard variety of the language tends to present no filling of initial empty-nucleus positions before /S/. If the empty nucleus is filled, [i] or [] tend to appear, since they are unmarked surface vowels in initial position in EP varieties.
Words from the other groups discussed, except for group b. words, seem to be affected by two sorts of processes. Words like isqueiro 'lighter' may be reduced in the first syllable, i.e., they can be produced with deletion of the initial /i/. Words like experiência 'experience' may surface either with [i] (instead of the expected [e]) or with the insertion of [j] before a consonant specified as Coronal in the base form (a dissimilation process). These last forms may undergo a post-lexical centralization process of /e/, surfacing as [], after the glide insertion (experiência 'experience' [  ], Lisbon speaker 32, word-list reading).
EP adult production for words involving sC-clusters has been explained on the basis of an empty nucleus structure plus a fricative in coda position (confirming Hypothesis E2), rather than using analyses that mention: (i) complex segments in onset (against Hypothesis B); (ii) extra-syllabic segments (against Hypothesis C, rejected because EP does not exhibit invariable C-place and voice values for the appendix, as we would expect from an appendix); (iii) more than two consonantal roots in onset, as would be the case for words such as esgrima 'fencing' (against Hypothesis A). The analysis assumed in this paper, then, distinguishes words of the sort of stress 'stress' and amnésia 'amnesia' from the other types of words. The former have the consonants syllabified in two onsets and the latter have the /S/ in the coda of a syllable (against Hypothesis D). The coda fricative occurs in words from groups a., c. and d.; in group a. it follows an empty nucleus, which is confirmed by the vowel epenthesis exhibited by several varieties of the language (against Hypothesis E1 and favouring Hypothesis E2), while in words from groups c. and d. it follows a specified nucleus (/i/ for group c. words or /e/ for group d.).

The L1 acquisition data
In this section we will observe Portuguese children's productions of target word-initial sC-clusters and compare them with the mastery of this phonological cluster in the acquisition of other target systems, namely Dutch. We will consider the hypotheses listed in section 2 and discuss them based on data from the acquisition of EP. Our assumption is that the study of children's productions may provide empirical evidence to evaluate the adequacy of the analyses proposed for the target grammar. As mentioned in section 3, the results reported for the acquisition of EP refer to all utterances containing target words with word-initial sC-clusters in the database used for this purpose.

The problem
The Portuguese children are faced with three types of phonological consonant clusters 14.
Of the three types of consonant clusters mentioned above, the /obstruent+liquid/ clusters in (5) are the most frequent (Andrade & Viana, 1993; Vigário & Falé, 1993); it is assumed that they are under the domain of a branching onset and, as in other languages, they do not violate universal principles in the grammar, namely the Sonority Sequencing Principle and the Dissimilarity Condition. As shown in sections 4 and 5, the /s+obstruent (+liquid)/ clusters in (6) are problematic syllabic structures since they violate the Sonority Sequencing Principle and the Dissimilarity Condition; the same happens with the consonant clusters mentioned in (7). The group in (7) corresponds to marked and infrequent clusters in the system (Mateus & Andrade, 2000). Considering the topic under analysis in this paper, the acquisition problem raised by this variety of consonant clusters in the input is as follows: do Portuguese children process /S+obstruent (+liquid)/ clusters as they process the other two types of consonant clusters?
In this section, our basic concern is to identify the syllabic status of the initial fricative of the /s+obstruent (+liquid)/ sequence in the process of acquisition of EP. We will show that Portuguese children clearly discriminate the three types of structures illustrated in (5)-(7). Moreover, we will compare the production strategies of Portuguese children with those exhibited by children acquiring other languages, namely Dutch; this cross-linguistic comparison will allow us to observe different production strategies for apparently identical consonant sequences. Finally, we will provide empirical evidence from the acquisition of EP to discuss the syllabic analysis of the /S+obstruent (+liquid)/ sequence proposed in Andrade & Rodrigues (1998) for the target system.

The lexical selection of words with consonant clusters
Before we focus on the description of the phonological properties of Portuguese children's words containing word-initial sC-clusters, we will briefly report the way the children observed select lexical targets with the three types of clusters mentioned in ( 5)-( 7).The corpus under analysis contains 2 124 lexical targets; each one of these lexical items was produced several times in the 18 655 utterances of the database referred in section 2. If we consider this lexicon of 2 124 items, we may observe that the distribution of words from the three types of clusters mentioned above is as follows: ( The results in (8) and in (9) reveal a contrast in the use of the three types of clusters: /obstruent+liquid/ clusters are the most frequent consonant sequences, both in the lexicon and in the children's productions.Both /obstruent+liquid/ and /s+obstruent (+liquid)/ clusters are differently represented in the lexicon (10% versus 3% in ( 8)), and differently productive in the child's system (see the contrast 6.6% versus 2.7% in ( 9)).The consonant clusters referred in (7) are clearly the less frequent clusters, both in the lexicon and in the production data (see (8.c) and (9.c)).
Although differently represented in the lexicon, targets with /obstruent+liquid/ and with /s+obstruent (+liquid)/ clusters are present in the Portuguese children's productions much earlier than words with the consonant clusters mentioned in (7).
Based on the contrast observed in the Portuguese children's lexical selection of target words with the three consonant clusters described above, we will now focus on /s+obstruent (+liquid)/ sequences and compare the results obtained for this structure with those concerning /obstruent+liquid/ clusters 17.

The production strategies for sC-clusters
As argued in sections 4 and 5 of this paper, there are four types of word-initial /s+obstruent (+liquid)/ sequences in EP:
At Stage II, the production of segmental material at the left-edge of the second obstruent of the cluster occurs; at this stage, it is possible to observe the co-occurrence of the four following production strategies: (i) production of
17 For a description of the acquisition of /obstruent+liquid/ clusters in EP, see Freitas (1997) and (2003). As for the type of clusters mentioned in (11), the data available in our corpus are not enough to make generalizations about the acquisition of this structure in EP.
18 The only words from this group present in the lexicon of the children observed are história 'story' and isqueiro 'lighter'; the first item was produced 26 times and the second one only once. Since história 'story' is normally produced as [t j], we will consider that this is the format of the children's target; therefore, we included it in the subset (11.a).
19 The fact that the Portuguese children almost exclusively select targets from the subset in (11.a) may argue for the less marked nature of the structure in (11.a), when compared with the structures in (11.b-d). Notice that, as mentioned in section 5 of this paper, some items from paradigm (11.c)

The acquisition data discussion
Let us return to the 5 hypotheses for the processing of word-initial sC-clusters presented in section 2:
(15)
A. The fricative and the following consonant are under the domain of the same branching onset (following Booij, 1996, for Dutch);
B. The fricative and the following consonant form a complex segment in the domain of an onset (Fudge, 1969; Selkirk, 1982, for English);
C. The fricative is an appendix to the syllable, therefore it lies outside the onset that dominates the following consonant (Trommelen, 1983, for Dutch; Giegerich, 1992, for English);
D. The fricative and the following consonant are associated to two different onsets; the fricative is the onset of a word-initial syllable with an empty nucleus (following the analysis of Mateus & Andrade, 2000, for other problematic clusters in EP);
E. The fricative is the coda of the first syllable and the following consonant is associated to the onset of the second syllable (following Andrade & Rodrigues, 1998, and Mateus & Andrade, 2000, for EP); under this analysis, the fricative is either (1) a coda of a rhyme with a /e/ vowel in the nucleus or (2) a coda of a rhyme with an empty nucleus (Andrade & Rodrigues, 1998).
We will now observe Portuguese children's behaviour in order to test the adequacy of these hypotheses to account for the acquisition facts concerning sC-clusters.
The Portuguese children's data presented in (12)-(14) show that they are not processing the /s+obstruent/ sequence as a complex segment (against Hypothesis B):
(i) if /s+obstruent/ structures were complex segments, one would expect children to produce, at Stage I, either the fricative or the following obstruent, for the target segment would have two competing manners of articulation under the same skeletal position. However, this is not attested: at Stage I, Portuguese children delete the fricative and exclusively produce the right-adjacent obstruent (see examples in (12)). Notice that, for the target complex consonants /kʷ/ and /gʷ/ in the adult grammar, the same children start by producing either [k]/[g] or [w] (Freitas, 1997 and 2000); we would therefore expect the same behaviour for sC-clusters, if they were complex segments in the target system: (16)
(ii) at Stage II, the children are producing segmental material at the left-edge of the obstruent (a vowel [], a vowel [] plus a fricative or only a fricative; these segments are sometimes lengthened or followed by a pause; see (13)). This material may be interpreted as a set of phonetic cues showing that children are processing the fricative under the domain of a syllabic node different from the one hosting the following obstruent (therefore, against hypotheses A and B).
Notice that the insertion of the vowel [] at Stage II also argues against Hypothesis D: if the fricative were the onset of a syllable with an empty nucleus, we would expect the vowel [] not to be produced in word-initial position but rather between the two consonants of the cluster (/SC/ -> [.C]). This is not attested in the Portuguese children's data observed. Moreover, if sC-clusters were similar to the clusters in (11) (i.e., the two consonants are onsets of different syllables, the first one having an empty nucleus, as in pneu 'tyre' and afta 'ulcer'), then we would expect words with sC-clusters to be selected as possible targets late in development: (i) as we have seen in (10c), words like pneu 'tyre' emerge very late in the child's lexicon; (ii) however, the data in (12) show that words with sC-clusters are selected by children early in development. These acquisition facts argue against Hypothesis D.
As we have seen in sections 4 and 5, Hypothesis A raises questions concerning the syllabic licensing of /s+obstruent (+liquid)/ sequences as branching onsets, for these structures violate the Sonority Sequencing Principle (in the case of both /S+obstruent/ and /S+obstruent+liquid/), the Dissimilarity Condition and the requirement that constituents be maximally binary (in the case of /s+obstruent+liquid/). If Portuguese children were processing /s+obstruent (+liquid)/ clusters as /obstruent+liquid/ branching onsets, one would expect both types of clusters to emerge simultaneously. Moreover, one would find the same production strategies in the mastering of both structures. Neither prediction is borne out:
(i) By the time Portuguese children are producing /s+obstruent (+liquid)/ clusters, they are still unable to produce /obstruent+liquid/ branching onsets according to the target grammar:
(ii) The first strategy used by Portuguese children to deal with target /obstruent (C1) + liquid (C2)/ branching onsets is to reduce the cluster to its left-edge member 21 (see Freitas, 2003):
(18) The acquisition of /obstruent+liquid/ branching onsets in EP
If sC-clusters were branching onsets, the first consonant would be C1, the second one would be C2 and the first production strategy used by the children would be /C1C2/ -> [C1]. This would result in the production of the target /S/ and the deletion of the right-adjacent obstruent (/s+C/ -> [s]). As we may observe in (12)-(14), this strategy is not attested in the acquisition of EP sC-clusters 22. On the contrary, the examples in (12) show that the first productions of sC-clusters match the pattern [C]. The deletion of the fricative at Stage I shows that this segment is not interpreted as C1 of a branching onset; therefore, it is not under the scope of the syllabic node dominating the obstruent.
(iii) Another productive strategy exhibited by Portuguese children in the acquisition of branching onsets is the use of an epenthetic vowel [] between C1 and C2. This occurs before the setting of branching onsets in the child's
As we have seen before, the fact that the use of an epenthetic vowel between the two members of the sC-clusters is not attested also argues against an analysis where the left-edge fricative is under the domain of a non-branching onset [.CV]: Portuguese children never process this fricative as an onset (against Hypothesis D). Moreover, sC-clusters are acquired earlier than /obstruent+liquid/ clusters and the two clusters do not exhibit the same production strategies. This argues against the hypothesis that Portuguese children are processing both types of clusters as branching onsets (against Hypothesis A).
Summarising, the arguments listed above allow us to assume that (i) the fricative of the sC-clusters is not under the domain of an onset and (ii) the fricative and the obstruent are not hosted by the same syllabic node; we therefore reject hypotheses A, B and D. This leaves us with hypotheses C and E. According to Hypothesis C, the fricative is an appendix to the syllable. This makes sC-clusters marked structures, which allows us to predict that they will emerge late in the child's system, after /obstruent+liquid/ clusters, for this last type of cluster does not violate principles of syllabic licensing. As we have seen in (17), this is not borne out by the data: by the time Portuguese children produce sC-clusters, they are still unable to produce /obstruent+liquid/ clusters. If we compare the results from the acquisition of EP with those from the acquisition of Dutch (Fikkert, 1994; Fikkert & Freitas, 1999), we observe that languages with similar overt structures may be acquired differently. Fikkert & Freitas (1999)
(21) sC-clusters in Dutch (Fikkert, 1994)
(Fikkert & Freitas, 1999). Similar results are reported for English in Kirk & Demuth (2003): the authors compare accuracy on word-initial /s/+stop with accuracy on word-final stop+/s/; their results show that stop+/s/ coda clusters are mastered before /s/+stop onset clusters.
If Hypothesis C were right for EP (the fricative is an appendix), then Portuguese and Dutch children should show a similar developmental behaviour concerning the structure under analysis. The fact that the two groups of children exhibit different production strategies in development may be interpreted as a consequence of the different nature of sC-clusters in the two languages. If the Dutch children's acquisition path is the one matching the interpretation of the fricative as an appendix, then the Portuguese children's behaviour corresponds to a different processing of the apparently similar sC-clusters in the target system (therefore, against Hypothesis C).
Concluding, the arguments listed so far in this section allow us to assume that: (i) the fricative and the obstruent are under different syllabic nodes (against hypotheses A and B); (ii) the syllabic status assigned to the fricative is different from the one of the following obstruent, i.e., the fricative is not under an onset domain (against hypotheses A, B and D); (iii) the fricative is not an appendix (against hypothesis C).
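The elimination reasoning summarised above can be laid out as a small sketch. It is only an illustration of the argument's structure: the dictionary keys and the function name are hypothetical labels introduced here, and the pairing of each acquisition fact with the hypotheses it argues against simply restates the claims made in this section.

```python
# Hypotheses on word-initial sC-clusters, labelled as in (15).
HYPOTHESES = {
    "A": "fricative and obstruent share a branching onset",
    "B": "fricative and obstruent form a complex segment",
    "C": "fricative is an appendix to the syllable",
    "D": "fricative is the onset of a syllable with an empty nucleus",
    "E": "fricative is the coda of a word-initial syllable",
}

# Acquisition facts reported in this section (hypothetical key names),
# each mapped to the hypotheses it is taken to argue against.
EVIDENCE = {
    "stage_I_only_the_fricative_is_deleted": {"A", "B"},
    "stage_II_material_at_left_edge_of_obstruent": {"A", "B"},
    "no_epenthetic_vowel_between_the_two_consonants": {"A", "D"},
    "sC_targets_selected_early_unlike_pneu_type_words": {"D"},
    "sC_produced_before_obstruent_liquid_onsets": {"A", "C"},
    "strategies_differ_from_those_of_dutch_children": {"C"},
}


def surviving_hypotheses(hypotheses, evidence):
    """Return the labels of the hypotheses that no listed fact argues against."""
    rejected = set().union(*evidence.values())
    return sorted(label for label in hypotheses if label not in rejected)


if __name__ == "__main__":
    for label in surviving_hypotheses(HYPOTHESES, EVIDENCE):
        print(label, "-", HYPOTHESES[label])
```

Under these assumptions, the only label that is never rejected is E, which matches the conclusion drawn in the text; the sketch does not distinguish between the two variants E1 and E2.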
According to Hypothesis E, the fricative is a coda of a word-initial syllabic node 24 (either with a left-adjacent /e/ (Hypothesis E1) or with a left-adjacent empty nucleus (Hypothesis E2)). The deletion of the fricative at Stage I and the segmental material produced at the left-edge of the obstruent at Stage II (see examples in (12) and (13)) provide empirical evidence for the autonomous syllabic status of the fricative in the sC-clusters. The behaviour illustrated in (13) may be interpreted as the result of the fact that children are processing the fricative as a coda under the domain of a word-initial syllabic node different from the one hosting the obstruent (thus confirming Hypothesis E). This is cued by (i) the presence of the vowel [], either with production or deletion of the fricative, (ii) the use of lengthened segments left-adjacent to the second obstruent, or (iii) the use of pauses that may be interpreted as cues of a syllabic boundary. An additional empirical argument for the coda status of the fricative comes from the fact that the fricative of the sC-clusters emerges by the time coda fricatives are mastered in Portuguese children's phonological development. In early developmental stages, constraints ruling syllabic constituency disallow codas in production (see (18a)); later on in development, the coda becomes available (see (18b)) and coda fricatives are faithfully produced; this is when the fricative of the sC-clusters emerges (see (18b)). This allows us to assume that Portuguese children are interpreting the fricative as a coda.
Notice that the fricative of the sC-cluster in Dutch is not produced by the time coda fricatives are mastered in the child's system (Fikkert 1994, Fikkert & Freitas 1999).
We have argued that the fricative of the sC-clusters may be interpreted by the Portuguese children as a coda, thus confirming Hypothesis E; we will now refer to another aspect of Hypothesis E, i.e., the fricative is a coda (1) of a rhyme with a /e/ vowel in the nucleus (Hypothesis E1) or (2) of a rhyme with an empty nucleus (Hypothesis E2). In general, the vowel [] in Portuguese children's productions may be the result of two different processes: (i) it is the output of vowel reduction affecting /e,/ in unstressed position, in the target system (as in seda ['sed] 'silk' and sedoso [s'dozu] 'silk-like'; anel ['n] 'ring' and anelinho [n'liu] 'small ring'); (ii) it is used to fill empty prosodic positions, either at the level of the prosodic word, the foot or the syllable 26 (this vowel is used in the target system for similar purposes: mar
Assuming that the word-initial fricative of the sC-clusters is a coda, the remaining question is whether the left-adjacent nucleus is an empty category or is associated to a phonological /e/ vowel, which is normally reduced to []
, 2001), which means that one single lexical target may have as many entries in the child's lexicon as overt output forms. The data on the acquisition of sC-clusters discussed in this paper showed that children go through a stage (Stage II) where a vowel at the left-edge of the fricative is produced; this behaviour was interpreted as evidence for the processing of a word-initial empty nucleus. If children are processing this information, it is part of the lexical representation of the word; children are therefore not simply storing surface information but rather reconstructing more abstract phonological information. Notice that they could have interpreted the fricative of the sC-clusters as Dutch children do (i.e., as an appendix). On the contrary, they seem to pick up all the information available in the system (morphological information in derived contexts, external sandhi information, voicing assimilation) in order to build up an abstract lexical representation, where an empty nucleus is left-adjacent to the coda fricative, under the same rhyme domain. If the hypothesis of lexical storage of all variants were right, then children should never exhibit a vowel in word-initial position, since this vowel is not available in word-initial position in the target system, as shown in section 5.

Final remarks
In this paper, we provided empirical evidence from EP dialectal variation and from EP L1 acquisition to discuss the syllabification of sC(C)-clusters in this language. The hypotheses on the syllabification of word-initial sC(C)-clusters listed in section 2 were evaluated both in section 5 and in section 6; arguments were listed based on the evidence from the two types of data discussed in the paper (dialectal and acquisition data) and on the phonological properties of the words described. It was, therefore, possible to confirm the discrimination of four groups of words displaying sC(C)-clusters (following the proposal in Andrade & Rodrigues, 1998):
to assume a fully specified nucleus at the left-edge of the fricative (/eS/C).
Finally, it was possible to show that the dialectal variation data provides empirical arguments to discriminate the two dialects observed (Lisbon and Braga) based on the behaviour of word-initial sC(C)-clusters.
As for the acquisition data, it was observed that group a. is the only lexically productive set of words in the Portuguese children's system. Moreover, we have shown that Portuguese children clearly discriminate sC(C)-clusters from other clusters in the target system, namely /obstruent+liquid/ clusters and problematic clusters such as the ones in pneu 'tyre' and afta 'ulcer'. Moreover, their behaviour is consistent with the production of sC(C)-clusters exhibited by Lisbon speakers in the spontaneous speech context (their input variety of the language):
Assuming that children process overt output forms in order to build their abstract phonological representations, the data on the acquisition of sC(C)-clusters have shown that Portuguese children are able to process empty categories; in the processing of the clusters discussed in this paper, children provide empirical evidence for a word-initial empty nucleus in their lexical representation of the words displaying this structure. It seems, then, that children are not just storing surface information but rather reconstructing more abstract phonological information. Our hypothesis is that they are picking up information from the system (morphological information in derived contexts, external sandhi information, voicing assimilation) in order to build up an abstract lexical representation, where an empty nucleus is left-adjacent to the coda fricative. As we mentioned before, if the hypothesis of lexical storage of all variants were on the right track (Bybee, 2001), Portuguese children would not exhibit a vowel in word-initial position, since, as we have shown in the description of the Lisbon variety, this vowel is not available in word-initial position as a possible target structure in the adult system.
listed the differences between the acquisition of sC-clusters in the two languages: (i) The production strategies used by Dutch and Portuguese children at Stage II are different: unlike Portuguese children, Dutch children (a) never produce a word-initial vowel, (b) never lengthen word-initial segments and (c) never use a pause between the fricative and the following obstruent. See the examples from Dutch in (21); on the other hand, as mentioned before, Portuguese children never delete the second obstruent of the cluster:
(i) in words from group a. ([k l] escola 'school'), (a) the initial fricative of the cluster is assigned to a coda position with a left-adjacent empty nucleus and (b) the following consonant(s) to an onset position;
(ii) in words from group b. ([st s] stress), the fricative is a word-initial onset in the domain of an initial syllable with an empty nucleus; the following consonant(s) associate to the onset of the second syllable;
(iii) in words from group c. ([ikj u] isqueiro 'lighter'), the fricative is a coda with a left-adjacent specified nucleus (/iS/);
(iv) words from group d. ([plika ]/[jplika ] explicar 'to explain') display several types of variants ([ej]/[ej]; [j]/[j]; [] or []; [i] or [i]; [] or []; [e] or [e]), which is used as empirical evidence
(i) at the last stage of development (Stage III), they exclusively use the []C(C)-variant, reproducing the Lisbon speakers' behaviour;
(ii) at Stage II, the production strategies selected by Portuguese children (word-initial [] epenthesis, with or without deletion of the fricative, segmental lengthening and insertion of a pause at the left-edge of the second obstruent) show that they are processing a word-initial empty nucleus in the lexical representation of words with sC(C)-clusters;
(iii) the emergence of the fricative in production is licensed by the time coda fricatives are available in the child's system, which argues for the coda status of the fricative.

Production of words from group d.
expõem-se '(they) are exhibited' [epo  je ] - Lisbon speaker 44
expõem '(they) exhibit' [ejpo  j j] - Lisbon speaker 57; [ipo  j j] - Lisbon speaker 32
os exclusivos do 'Det.m.pl. exclusive Prep+Det.' [ukluzi vudu] - Lisbon speaker 35
in EP (cf. Mateus & Andrade, 2000):
The acquisition of the complex /kʷ/ and /gʷ/ in European Portuguese
quadro /'kʷadu / -> ['kalu]/['kalo] (Marta: 1;7.18) '(the) painting'
If Portuguese children were interpreting sC-clusters and /obstruent+liquid/ clusters similarly, i.e., under the domain of a branching onset, they should productively use an epenthetic [] between the fricative and the following obstruent ([.CV]). This behaviour is not attested, which again argues against Hypotheses A and D. Notice that this epenthetic vowel occurs in words where the sC-cluster is already produced but the /obstruent+liquid/ branching onset is still problematic:
, which again reveals the different behaviour of Dutch and Portuguese children when faced with apparently similar sC-clusters (see the examples below):
and deleted in spontaneous speech (as in vestido [v'tidu] -> [v'tidu] 'dress' or cidades [si'dad] -> [si'dad] 'cities'). Notice, however, that the regular /e/ -> [] vowel reduction process in unstressed position does not apply to word-initial /e/ nuclei (Mateus & Andrade, 2000), as we may observe in the contrast [e u] erro 'error' vs. [ia ]/[ea ] errar 'to make an error' (*[a ] is not a valid form). This allows us to expect children to produce word-initial [i] or [e], the two possible instances of word-initial /e/, in case they are processing an /e/ nucleus in the structure. To discuss these two competing analyses in the acquisition of EP, let us briefly refer to the Portuguese children's behaviour concerning the use of the two functions of []