The Change in Clitic Placement from Classical to Modern European Portuguese: Results from the Tycho Brahe Corpus

0. Introduction
In the history of Portuguese, one of the most salient syntactic features that change over time is clitic placement. As clitic placement can be considered one of the major grammatical indicators, changes in this domain constitute an important key to the grammatical history of a language. In this paper, we present the results of new research on this topic, which aims at accounting for one of the grammatical changes Portuguese underwent, and at locating this change in time. We started out from a much debated point in the literature: when does Modern European Portuguese start? In previous research, two different proposals have been made, based on the evolution of clitic placement in contexts of enclisis/proclisis variation (namely, non-dependent affirmative sentences XP-V, XP being a [+referential] phrase). In these contexts, the predominance of proclisis typical of 16th century texts gives way to the generalization of enclisis, which became obligatory, verb-clitic being the grammatical order in Modern European Portuguese (henceforth EP). On the one hand, Martins (1994) claims that the new grammar starts in the 17th century; on the other hand, Galves and Galves (1995) and Galves et al. (1998) claim that the change occurs only at the end of the 18th century. The empirical grounds for the proposal in Martins (1994) were the patterns of enclisis versus proclisis variation in nine texts from the 16th to the 19th century. These included two 17th century texts: the letters by Francisco Manuel de Melo (1608-1666), with predominant proclisis (7,7% enclisis


III. The methodology
The present paper was guided by the following methodological criteria:

The organization of the data
The procedure in organizing the data was as follows. In a first stage, all occurrences of enclisis and proclisis in a given text were separated, regardless of the syntactic context in which they occur. Next, we worked on the totality of the occurrences, classifying them according to sentence type and clause-initial element, thus obtaining a global picture of the distribution of the data. Last, the data were separated into varying and categorical contexts; only the contexts in which variation has been registered, within a text or between different texts, are considered in the analysis. The totality of occurrences is, however, readily retrievable from the initial archives. We comment below on the criteria followed to isolate relevant contexts, and mention some ensuing problems.
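The three stages just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`placement`, `sentence_type`, `initial`) and the function name `organize` are hypothetical and do not reflect the actual format of the archives used in this study.

```python
from collections import Counter

def organize(occurrences):
    """Hypothetical sketch of the three-stage organization of the data.

    Each occurrence is a dict with the (assumed) keys
    'placement' ('enclisis' or 'proclisis'), 'sentence_type' and 'initial'.
    """
    # Stage 1: separate all occurrences by placement, regardless of context.
    by_placement = {"enclisis": [], "proclisis": []}
    for occ in occurrences:
        by_placement[occ["placement"]].append(occ)

    # Stage 2: classify by sentence type and clause-initial element.
    counts = Counter(
        (occ["sentence_type"], occ["initial"], occ["placement"])
        for occ in occurrences
    )

    # Stage 3: a context is 'varying' if both placements are attested in it;
    # pooling the texts covers variation within a text and between texts.
    contexts = {(stype, initial) for (stype, initial, _) in counts}
    varying = {
        ctx for ctx in contexts
        if counts[(ctx[0], ctx[1], "enclisis")] > 0
        and counts[(ctx[0], ctx[1], "proclisis")] > 0
    }
    return by_placement, counts, varying
```

Because the full list of occurrences is kept in `by_placement`, the categorical contexts remain retrievable even though only the varying ones enter the analysis.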

Sentence type and quantification of the data
In V1 sentences, enclisis is categorically attested in all texts. In negative sentences, proclisis is the only option. These sentence types are therefore not considered in our study.
Proclisis is also highly predominant in subordinate clauses, from Old Portuguese to Modern European Portuguese texts. Some enclitic relative and completive clauses appear in 17th and 18th century texts, but as these occurrences are numerically marginal, we do not consider them here and exclude subordinate clauses from our variation set.
Coordinate clauses are expected to follow the pattern of matrix clauses as far as clitic placement variation is concerned, since the connectives are not counted as clause-initial elements but as constituents outside clause limits. However, this generalization fails in one context: coordinate clauses in which there is no constituent between the connective and the verb/clitic sequence. Although those sentences could be considered V1 clauses (since, as mentioned, the connective is outside clause limits), research has shown that they present variation in clitic placement (cf. for instance Martins 1994 and Ribeiro 1995), which is also confirmed in our data.
Preliminary research has shown that:
- The frequency of enclisis and proclisis according to the pre-verbal element is constant across matrix clauses and second conjuncts of coordinate clauses.
- There is a discrepancy between, on the one hand, matrix clauses and coordinate clauses in which the verb is not in first position after the connective and, on the other hand, coordinate clauses in which the verb immediately follows the conjunction (from now on referred to as V1 coordinate structures).
In all the authors considered, the relative frequency of enclisis and proclisis in V1 coordinate structures is appreciably different from what we observe in V2 constructions, both in matrix and coordinate clauses. The variation between authors is much greater with V1 coordinate structures than with matrix and V2 coordinate structures. We shall therefore compute all the V2 constructions together, and keep only the V1 coordinate clauses apart.
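The grouping decision above (V2 matrix and V2 coordinate clauses computed together, V1 coordinate clauses kept apart) can be illustrated with a small sketch. The clause-type labels (`matrix`, `coord_v2`, `coord_v1`) and the function name are assumptions made for the example, not categories taken from the corpus annotation.

```python
from collections import defaultdict

# Assumed labels: matrix and V2 coordinate clauses are pooled as "V2";
# V1 coordinate clauses form a group of their own.
GROUP = {"matrix": "V2", "coord_v2": "V2", "coord_v1": "V1 coordinate"}

def enclisis_rates(occurrences):
    """Relative frequency of enclisis per (author, group).

    occurrences: iterable of (author, clause_type, placement) triples,
    with placement being 'enclisis' or 'proclisis'.
    """
    tallies = defaultdict(lambda: [0, 0])  # [enclisis count, total count]
    for author, clause_type, placement in occurrences:
        key = (author, GROUP[clause_type])
        tallies[key][1] += 1
        if placement == "enclisis":
            tallies[key][0] += 1
    return {key: enc / total for key, (enc, total) in tallies.items()}
```

Given such rates, the two groups can be plotted or compared separately for each author, which is the comparison the pictures in Section IV rely on.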

2 Clause-initial elements
As shown in Table 1, within each V2 affirmative sentence-type group, clauses were separated according to the initial elements with which variation was registered in at least one text of the set considered, namely subjects, adverbs, prepositional phrases and dependent clauses.
Proclisis was registered categorically, in all texts, in sentences introduced by explicitly focalized phrases and by interrogative phrases. It is also almost categorical with quantified noun phrases. However, some quantifiers, like todos (all), alguns (some) and muitos (many), present some cases of enclitic placement. At this point of our research, we did not take this variation into account: the order cl-V was computed as categorical proclisis, and the cases of enclisis were ignored.
Sentences with the adverbs bem, mal, já, sempre, também, and ainda in pre-verbal position have also been excluded from the variation set, since they never occurred with enclisis. We also excluded the adverb assim, although some cases of enclisis appeared. In this case, the picture of the variation is more complicated, since there are two different uses of assim [2]. One is still categorically proclitic in Modern European Portuguese. The other yields enclisis. It is for the latter that variation in Classical Portuguese would have to be computed. As it is much less frequent than the other in the texts, we have eliminated both from our data up to now.
Last, there are cases in which more than one constituent precedes the verb. In such cases, we keep track of this fact in the data, but for classification and statistical purposes, we consider the phrase which immediately precedes the verb. It must be noted, however, that when the second phrase clearly modifies the first one, they are counted as a single constituent. The two following examples illustrate this point. In the first one, from Aires, the relative clause is part of the subject, which is considered to be the relevant pre-verbal phrase, while in the second one, from Maria do Céu, in which the pre-verbal PP modifies the verb, it is this PP which counts as the pre-verbal element.
As for the coordinate structures, we have found variation with the coordinating conjunctions e "and", mas "but", and porem "however", but not with ou "or", pois "because", and the explicative que, also tagged as a conjunction in the corpus. We also excluded from our computation the clauses introduced by porque, because of the ambiguity between the causal and the explicative reading. Although these clauses do present variation, we leave them for future research.
Finally, we have found a few occurrences of other pre-verbal phrases, which are computed, up to now, as "others". These are essentially vocatives, some dislocated or topicalized NPs, and some other fronted elements, mainly adjectives.

Further categorization
The 'variation' contexts considered here constitute broad classes in a preliminary organization of the data. Within each class, more specific groupings were made when relevant, for example heavy vs. short phrases. We believe that this organization in general classes, although not exhaustive, can facilitate further research on the data, allowing for more specific classifications where this proves relevant. One consequence of this option for a broad classification of clause-initial elements is that elements which are not explicitly focalized or topicalized were not considered as separate groups. In other words, the syntactic categories topic/focus were not separated a priori in the classification. Given the complexity involved in identifying focalization and topicalization operations in written texts, we have preferred to keep to broader syntactic categories, postponing the interpretation of the status of each element as focus or topic to the stage of the analysis.
As can be inferred from what is said above, we adopt a new methodology for the description of the variation. What counts as a variation context is not defined a priori on the basis of previous knowledge alone, but also on the basis of what we find in the texts. The variation contexts are therefore defined as the ones in which we find optionality in clitic placement either within one text or across texts of the period. By contrast, categorical contexts are by definition those in which none of the surveyed texts shows optionality in placement. One consequence of this methodology is that the group of 'variation contexts' may change as work proceeds from one text to another. As a result, it should be pointed out that "variation context" is an open category, in that the potential registration of variation in a newly researched text would force all the previous data to be reviewed, in order to include the new syntactic environment as a variation context. It must be stressed, then, that the data presented below describe the present state of research, as the inclusion of further data from other texts can force the variation set to be revised. This does not mean that we do not use our knowledge to evaluate the relevance of marginal data to the overall picture. For instance, in what follows, we did not take the variation in subordinate clauses into consideration in the total quantification of the data. The reason is that, since enclisis in this context is at most very marginal, it would enormously increase the final percentage of proclisis for all the authors, hiding the relevant quantitative contrasts.
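The open character of the category "variation context" can be made concrete with a short sketch. The class below is purely illustrative, with hypothetical names; it only shows how adding a new text may turn a previously categorical context into a variation context, which then has to be reviewed in all earlier texts.

```python
class VariationSet:
    """Illustrative model of 'variation context' as an open category."""

    def __init__(self):
        # context -> set of placements attested in any text surveyed so far
        self.attested = {}

    def variation_contexts(self):
        # A context shows variation if both placements are attested,
        # whether within one text or across texts.
        return {ctx for ctx, seen in self.attested.items() if len(seen) > 1}

    def add_text(self, occurrences):
        """Add one text, given as (context, placement) pairs.

        Returns the contexts that became variation contexts because of
        this text, i.e. those to be reviewed in the earlier data.
        """
        before = self.variation_contexts()
        for context, placement in occurrences:
            self.attested.setdefault(context, set()).add(placement)
        return self.variation_contexts() - before
```

A non-empty return value from `add_text` corresponds to the situation described above, in which the previous data have to be revisited to include the new syntactic environment.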
We believe, however, that at the end of the process a fair picture of clitic-placement variation can be achieved. This method has the advantage of permitting a qualitative approach to the variation, as shown in the analysis below, which reveals that the variation in clitic placement throughout the period increases not only in absolute numeric terms, but also in terms of the contexts in which it can be attested.
One last detail of the methodology should be pointed out. As can be seen in the examples listed in the analysis below, some sentences include more than one occurrence of enclisis/proclisis. In separating the data, each sentence was taken as a unit, but each occurrence was counted separately. Thus a sentence which shows a subject-initial clause with a clitic, followed by a coordinate clause with a clitic, for example, is listed twice, once in each pertinent context. The aim of this procedure was to give the analysis access to the broader discursive context, which proved pertinent, for example, in identifying topicalization constructions. The numbers in the tables refer to proclisis/enclisis occurrences.
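The counting convention just described (the sentence is the unit of listing, the occurrence is the unit of counting) can be sketched as follows. The data layout and the names `index_sentences`, `id` and `clitics` are assumptions made for the illustration.

```python
from collections import defaultdict

def index_sentences(sentences):
    """List whole sentences under each context; count each occurrence.

    sentences: list of dicts with the (assumed) keys 'id' and 'clitics',
    where 'clitics' is a list of (context, placement) pairs.
    """
    listing = defaultdict(list)  # context -> ids of the sentences listed there
    counts = defaultdict(lambda: {"proclisis": 0, "enclisis": 0})
    for sentence in sentences:
        for context, placement in sentence["clitics"]:
            # The full sentence is listed once per occurrence, so that the
            # broader discursive context stays available in every listing.
            listing[context].append(sentence["id"])
            counts[context][placement] += 1
    return listing, counts
```

With this layout, a sentence containing a subject-initial clause and a coordinate clause, each with a clitic, appears in two listings but contributes exactly one occurrence to each count.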

IV. The data
Applying the methodology presented in Section III to our corpus, and ordering the authors according to their birth date [3], we obtained the following results.
Let's call V2 sentences the sentences in which the verb is immediately preceded by either a subject or a PP or an adverb.
Picture 1 shows the variation between enclisis and proclisis in this context. Picture 2 and Picture 3 respectively show this variation in sentences in which the verb is immediately preceded by a clause, and in V1 second coordinate clauses (that is, when the verb immediately follows the coordinating conjunction).

Picture 1: the variation between enclisis and proclisis in V2 sentences
Picture 2: the variation between enclisis and proclisis in sentences in which the verb is immediately preceded by a clause

Picture 3: the variation between enclisis and proclisis in V1 coordinate sentences
By comparing these three pictures, we immediately observe two important facts. First, the mean frequency of enclisis is much higher when a clause or a coordinating conjunction immediately precedes the verb than in the other contexts. Second, in all these graphs, the data allow us to define two periods, with a border line around 1700.
In the remainder of this article, we shall argue that these two periods can be characterized as follows:
- The variation observed in the first period, in which enclisis ranges from 0 to 15%, with few exceptions, is produced by a single grammar, which we shall call Classical Portuguese, in which enclisis is a marked option since it arises in structures in which the pre-verbal phrase is outside the boundaries of the clause, and therefore the verb is in first position, as represented below:
Enclisis in Period 1:

XP[ V-cl
- The variation observed in the second period, in which we observe an inversion in frequency between proclisis and enclisis, is due to the competition in texts between two grammars, Classical Portuguese and Modern European Portuguese. We shall present some evidence that in the latter, enclisis is no longer a V1 phenomenon.

VI. Enclisis is a V1 phenomenon in Classical Portuguese

Enclisis and contrast in Vieira's sermons
As was already pointed out by Britto (1999) In both cases, the subject is separated from the verb by some phrase adjoined to the clause: a sentential adverbial PP in 4., and a vocative in 5.
- Finally, the only case in which we find enclisis with an adverb is the following:
This exception (1 case in 28) is interesting because it appears in a discursive context in which the adverb "aqui" is contrasted with "lá" in the previous sentence, with the repetition of the same verb: vemos/vêem. We shall see that it is in this context that enclisis is systematically found in the sermons.
Let's now look at enclisis in the sermons in detail. The most striking difference from what we find in the letters is the large number of cases of enclisis with subjects and with PPs. Let's compare the sentences with enclisis and the sentences with proclisis.
The examples below illustrate the fact that, in all the cases of enclisis with pre-verbal subjects, without exception, these subjects are contrasted with another phrase, generally a subject too. In many cases, the opposition between the two phrases is explicitly given in the immediately preceding sentence.

Deus/os homens
It must be observed that the contrast between the pre-verbal phrases is reinforced by explicit oppositions inside the sentences they precede. In many cases, the verb is repeated in both sentences but some other aspect explicitly marks a contrast: affirmative vs. negative form ("porque a revelação não me póde salvar sem boas obras; e as boas obras pódem-me salvar sem revelação"; "As pégadas estão manifestas e vêem-se; as raizes estão escondidas, e não se vêem"), or lexical oppositions ("As outras prophecias cumprem-se a seu tempo, esta do dia do Juiso tem o seu cumprimento antes de tempo; porque as figuras vão-se, e o theatro fica."). Observe that the first example combines negation with the exact inversion of the terms in the sentences.
We find exactly the same system of contrasts with other phrases:
In conclusion, we see that in Vieira's sermons enclisis appears consistently when two terms are contrasted. In other words, the pre-verbal phrases in enclitic constructions can be characterized as contrastive topics. Non-contrastive topics appear with proclisis. The high rate of enclisis in Vieira's sermons can therefore be explained by discursive reasons: the sermons are masterpieces of the baroque style, which uses oppositions between terms as a fundamental stylistic resource. This view is consistent with the hypothesis defended by many authors (see for instance Benincà (1994), Galves and Galves (1995), Galves (1997, 2000), Salvi (1990)) that in Classical Portuguese enclisis always corresponds to a V1 configuration. This means that when some phrase precedes the verb, it is outside the sentence.
We can now straightforwardly explain why the letters, which are not pieces of baroque literature but narrative and argumentative texts, display much less enclisis. Moreover, the cases of enclisis in the letters support the analysis of clitic placement in the sermons. Indeed, as we saw above, enclisis arises with subjects and PPs when these are clearly dislocated. Cf. examples 1 and 2, which are clitic left dislocation constructions; examples 3-5, in which we find the string Subject X V-cl, X being a clausal adjunct; and, last but not least, a case of clear contrastive effect between the adverbs aqui (here) and lá (there).
We shall now compare these results for Vieira with the data concerning his contemporaries.

Enclisis in the 17th century authors
Let's observe the enclitic constructions in each of these authors.
As for the subjects, we also observe the contrasts found in Vieira's sermons.

Um leaõ/a rapoza
But, contrary to Vieira's sermons, this is not a systematic characteristic of enclisis with subjects. However, these constructions all share an interesting property. They all appear in passive se-constructions (in proclitic clauses, se appears in only 50% of the clauses (7/15)) [4].
Moreover, it must be noted that in most of the cases (to be quantified at a later stage), these subjects are long, as in (44) and (45), or separated from the verb by a clause, as in (46) and (47).

45.
E/CONJ o/D caso/N presente/ADJ-G da/P+D-F maneira/N que/WPRO o/CL resolvemos/VB-P ,/, ainda/ADV que/C naõ/NEG está/ET-P na/P+D-F Ordenaçaõ/NPR deste/P+D Reyno/NPR ,/, do/P+D Direito/NPR Civil/ADJ-G ,/, e/CONJ está/ET-P determinado/VB-AN por/P Acursio/NPR ,/, Bartholo/NPR ,/, e/CONJ os/D-P Doutores/NPR-P ,/, e/CONJ admittido/VB-AN ,/, e/CONJ praticado/VB-AN em/P Portugal/NPR ,/, e/CONJ muitos/Q-P outros/ADJ-P Reynos/NPR-P ,/, como/CONJS mostrámos/VB-P ./.

Summarizing, enclisis in Costa can be characterized, as in Vieira, as deriving from structures in which the pre-verbal phrase is clearly external to the clause, functioning as a marked topic. However, the texts differ with respect to how frequently this kind of structure appears, and to the types that are attested. In Vieira's sermons, enclisis results exhaustively from the system of oppositions constitutive of the baroque style. In Costa, we find the same stylistic factor, but it is less systematic. However, another conditioning factor for enclisis appears: the frequent use of passive se-constructions. Finally, the fact that in both authors pre-verbal clauses very frequently trigger enclisis supports the claim that this placement of the clitic does correspond to a V1 structure, since clauses are likely to be adjoined to the maximal projection of the clause.
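The word/TAG annotation visible in example 45 lends itself to simple automatic extraction of clitic-verb sequences. The sketch below is only an illustration: the function names are hypothetical, and the list of verbal tag prefixes (VB, ET, TR) is an assumption based solely on the tags that appear in the examples of this paper, not on the full tagset of the corpus. It detects only the proclitic order (a token tagged CL immediately followed by a verb); how enclitic forms are tagged is not shown here and is therefore left out.

```python
# Assumption: verbal tags begin with one of these prefixes (VB-P, ET-P, TR-P...).
VERB_PREFIXES = ("VB", "ET", "TR")

def parse_tagged(line):
    """Split a 'word/TAG word/TAG ...' line into (word, tag) pairs."""
    tokens = []
    for item in line.split():
        word, sep, tag = item.rpartition("/")
        if not sep:
            # Untagged token (e.g. in a truncated example): keep it, no tag.
            word, tag = item, ""
        tokens.append((word, tag))
    return tokens

def proclitic_pairs(tokens):
    """Return (clitic, verb) pairs where a CL token directly precedes a verb."""
    pairs = []
    for (word, tag), (next_word, next_tag) in zip(tokens, tokens[1:]):
        if tag == "CL" and next_tag.startswith(VERB_PREFIXES):
            pairs.append((word, next_word))
    return pairs
```

Applied to the beginning of example 45, `proclitic_pairs(parse_tagged(...))` picks out the sequence o/CL resolvemos/VB-P inside the relative clause, which is a categorical proclisis context.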

Melo
As is easily seen in Picture 1, enclisis in Melo is very limited. Like Vieira in his letters, he never uses it with pre-verbal PPs and adverbs, and the only context in which we find a substantial rate of enclisis is when the pre-verbal phrase is a clause.
However, Melo does display some cases of enclisis with pre-verbal subjects. We transcribe below all 7 cases at stake:
Leaving aside the first example, which is a clear case of topicalization since the verb "parecer" takes no subject, only in (49) and (50) can the contrast between two terms be the origin of the enclitic positioning of the pronoun. There is, however, a common feature among the sentences exemplified in 50-53: they all instantiate the first person. Can this fact explain enclisis in a way coherent with what has been said so far? With respect to Melo, the question is almost moot, given the very low frequency of enclisis.

Chagas
In Chagas, the proportion of enclisis with subjects is comparable to the one found in Melo, and the cases of enclisis with adverbs and PPs pattern with what is found in Vieira's letters. He is therefore a proclitic author. Below, we list all the cases of enclisis with pre-verbal subjects, adverbs and PPs.
54. Um/D-UM mosquito/N não/NEG tem/TR-P ombros/N-P para/P um/D-UM monte/N ,/, uma/D-UM-F ervinha/N ,/, débil
A careful reading of all the examples of pre-verbal subjects with enclisis listed above shows that the analysis proposed so far is also applicable to Chagas.

Maria do Céu
In Maria do Céu, proclisis shows up in its extreme form. Only in V1 coordinate structures, and only in four sentences with pre-verbal subjects, do we find enclisis. The remarkable fact is that there is no case of V-cl order with pre-verbal clauses, a context which strongly favours enclisis in all the other authors. In all these cases we find again that the marked pattern is intended to have a contrastive effect, in the sense that a distinctive topic is introduced. Indeed, in all the sentences, it corresponds to an abrupt change of topic.
In conclusion, the authors of the 17th century show a very consistent pattern in the distribution of clitic placement. Enclisis is clearly a minority, marked pattern, associated with emphasis or contrast. This characterization is fully compatible with the hypothesis that enclisis in this period corresponds to a V1 structure, with some phrase adjoined to the sentence, producing an apparent V2 order.
We shall now see that the changes occurring in the distribution of enclisis from 1700 on support the analysis proposed so far, and show that not only does proclisis decline during the 18th century, but enclisis also ceases to correspond to a V1 structure.

VI. The competition of grammars
Picture 1 shows considerable variation among authors in the first quarter of the 18th century. We observe that the frequency of enclisis in V2 sentences ranges from 0.15 (Garção) to 0.68 (Verney). Figure 4 below shows this variation with pre-verbal subjects:
We shall see below that, in contrast with what we found at the beginning of the 17th century, this variation does not correspond to clear stylistic features of the sentences or texts. The exact point of change is, however, still difficult to locate precisely by combining a qualitative and a quantitative analysis. For instance, how is the 36% frequency of enclisis with subjects in Aires to be interpreted? Is it the result of the competition of grammars, or a stylistic effect due to the nature of his text? Judging by the picture, the former interpretation is favoured, since Aires' point lies on an ascending curve. But if we consider the sentences, we find the same kind of oppositions as in Vieira's sermons in most of the cases.

Nos primeiros anos/depois
As we shall now see, the variation between enclisis and proclisis in the following authors is more easily interpretable as the result of the competition of two grammars, since no clear pattern emerges from the distribution of each form.

Garção
Correia Garção (born in 1724) presents little enclisis. The 7 cases with subjects, and the only case with a PP, are transcribed below. In these examples, no clear pattern is recoverable. Note that in the last sentence, the pre-verbal PP is clearly anaphoric. This is also the case for the subject in (85). This is not a context for enclisis in the preceding authors. This could be a piece of evidence that, although he uses very little enclisis, Garção no longer assigns to this construction a structure in which the pre-verbal phrase is outside the boundaries of the clause. In other words, we would already be at the beginning of the period in which the variation between enclisis and proclisis reflects a competition of grammars, proclisis being the choice of the old grammar, still by far the majority option at this time.
The two preceding authors, Verney and Antonio da Costa, born in 1713 and 1714 respectively, reinforce this hypothesis, since they both present a very high rate of enclisis for their time. It is interesting to note that Verney can be considered a very atypical author since, as he himself claims, his mother tongue is not Portuguese. But this is not the case of Antonio da Costa, and their use of enclisis with pre-verbal subjects is exactly the same. The difference between them is that, if we consider all V2 sentences (cf. Figure 1), Verney is consistently more enclitic, while Costa falls back in line with the general trend. It is also interesting to note that Antonio da Costa is the first author in our corpus in whom the high frequency of enclisis does not go together with a high frequency of the clitic SE, as shown in Figure 5 below:

Marquesa de Alorna
Alorna (born in 1750) displays a frequency of enclisis comparable with the one found in Vieira's sermons. 51% of her subjects are followed by enclisis. However, in contrast with Vieira, no clear pattern is found in the distribution of enclisis and proclisis.
In sum, unlike in Vieira, the quantitative importance of enclitic constructions in Alorna can be taken as evidence that the grammar has changed, and that the occurrences of proclitic constructions are the effect of the use of the old grammar, in a situation of competition of grammars. The results found in the following authors, in whom the enclitic pattern goes on increasing, reinforce this conclusion. Finally, it must be noted that Alorna displays an almost categorical enclitic pattern in V1 coordinate sentences.

Almeida Garrett
In Almeida Garrett (born in 1799), the only context in which proclisis remains predominant is when the pre-verbal phrase is an adverb. A possible explanation for this discrepancy is the possibility of focalization of adverbs, as illustrated in (93). Although, from the point of view of the model of grammar competition, we need no explanation for these two cases, it is worth noting that (93) is clearly a fixed expression. As for (92), the indefinite determiner uma can be analyzed as an existential quantifier requiring proclisis.

VI. Enclisis ceases to be a V1 phenomenon from the 18th century on
So far, we have shown that up to the beginning of the 18th century, we consistently find in authors the expression of a grammar in which enclisis corresponds to a V1 structure. This ceases to be the case from Correia Garção on, which, together with the increase in the rate of enclisis, authorizes us to claim that the variation observed from this author on is produced by the competition of two grammars, Classical Portuguese and Modern European Portuguese, in the latter of which enclisis is categorical in the former variation contexts.
The texts of the Tycho Brahe Corpus provide us with further evidence that the change observed in the recent history of European Portuguese affects the very nature of enclisis. We have shown that in Classical Portuguese, enclisis is marginal in variation contexts and corresponds to structures in which the pre-verbal phrase is outside the boundaries of the clause. In other words, enclisis in Classical Portuguese is correlated with the Tobler-Mussafia Law, which prevents an unstressed item from appearing in absolute first position in an intonational phrase. This is consistent with the fact, shown by Picture 2 above, that the variation is much greater when the phrase which precedes the verb is a clause. Indeed, clauses are more likely to form their own intonational phrase, and therefore the verb is more likely to be treated as the first element of the main clause. Furthermore, as pointed out to us by Tony Kroch [5], we expect that the longer the pre-verbal clause is, the higher the probability that enclisis will be chosen.