## 1. Introduction

The literature on postcolonial, nativizing African varieties of Portuguese (AVPs) spoken in Angola (AP), Mozambique (MP), and São Tomé and Príncipe (STP) reports a more widespread, non-standard use of the locative preposition em ‘in’ to select the Goal argument of directed motion verbs (e.g., Adriano, 2014; Avelar, 2017; Chavagne, 2005; Mingas, 2000; P. Gonçalves & Chimbutane, 2004; P. Gonçalves, 2010; R. Gonçalves, 2010). This contrasts with the standard directional prepositions a ‘to’ and para ‘to, toward’ used in European Portuguese (EP), the official standard in the former Portuguese colonies in Africa. In the case of AP and MP, the role of language contact with Bantu languages has been at the forefront of the descriptive and theoretical explanations regarding these patterns (e.g., Avelar & Álvarez López, 2018; Marques, 1985; Mingas, 2000; P. Gonçalves & Chimbutane, 2004).

Because these previous claims regarding the expression of Goal arguments are typically based neither on quantified data nor on comparisons across AVPs, we aim to provide a corpus-based assessment of the expression of Goal arguments of two high-frequency verbs of inherently directed motion (cf. Levin, 1993), ir ‘to go’ and chegar ‘to arrive’, in the three varieties at stake, in order to discuss the factors that drive language variation in this domain. Since language contact has been a primary explanation, the point of departure will be a cross-comparative approach between the three AVPs and their main contact languages in the urban areas where the corpora were collected: the Bantu languages Kimbundu (KB) for AP (Luanda) and Changana (CH) for MP (Maputo), and the creole language Forro (Santome) for STP (São Tomé). As will be shown, the contact-induced hypothesis does not offer an encompassing explanation; we therefore shift our focus to an analysis that takes into account semantic features of the constructions under discussion.

The paper is organized as follows. Section 2 sets the stage with respect to the three varieties at stake and the corpus-based data used in this paper; section 3 discusses the expression of Goal arguments in the three AVPs with motion verbs ir ‘to go’ and chegar ‘to arrive’ based on quantified data, showing the patterns of variation within and across these varieties; section 4 provides an overview of the properties of Goal arguments in the main contact languages of the AVPs, namely Kimbundu, Changana, and Forro; section 5 relates and discusses the findings of sections 3 and 4 from the perspective of language contact, arguing that this factor lacks explanatory power to account for the observed patterns; finally, in section 6, we offer an alternative explanation for the data at stake that takes into consideration the semantics of the predications of the verbs ir and chegar and of the complements of the prepositions, focusing on features such as durativity and the type of location.

## 2. Background and methodology

Portuguese in Africa has historically been acquired as an L2, a phenomenon that essentially has its roots in the late 19th and 20th century when the former Portuguese colonies were effectively colonized (e.g., Hagemeijer, 2016; P. Gonçalves, 2013). In addition, Portuguese in Angola, Mozambique, and São Tomé and Príncipe also shows an increasing tendency toward nativization since their independence in 1975, boosted by the choice of Portuguese as the only official language and its widespread democratization in all these postcolonial societies, thereby reinforcing its role as a lingua franca. Portuguese is by far the most spoken language in São Tomé and Príncipe (98,4%), whereas the predominant creole, Forro, is nowadays only spoken by 36,2% of the population (INE, 2013); in Angola, 71,15% of the population indicates that Portuguese is the most spoken language at home, especially in urban areas, whereas Kimbundu, one of the main Bantu languages which traditionally competes with Portuguese in the capital Luanda, is only spoken by 7,82% (INE, 2016); in Mozambique, Portuguese is spoken by around half of the population, which includes 16,6% of L1 speakers (INE, 2019), also with prevalence in urban areas. Despite the lack of direct information on L1 and L2, it can be inferred that Portuguese in Luanda and São Tomé is generally L1 and/or the primary language, with increasing monolingualism, especially among the younger generations, which make up a substantial part of the total population.

This case study is based on spoken urban corpora of MP, AP, and STP that were prepared within the project Possession and Location: microvariation in African varieties of Portuguese (PALMA) (Hagemeijer et al., 2021; Miguel et al., 2021; R. Gonçalves et al., 2021). The corpora were collected in the capitals Maputo, Luanda, and São Tomé between 2008 and 2020 by several researchers of the Center of Linguistics of the University of Lisbon, as well as doctoral students, and were balanced as far as possible for level of education, age, and gender (cf. Hagemeijer et al., 2022). The interviews that make up the corpora are predominantly semi-structured. Portuguese is the L1 or primary language of the majority of the informants who contributed to the corpora, especially in the case of urban AP and STP. Table 1 summarizes the basic information of the corpora used.

Table 1

Profile of the PALMA corpora.

| | Interviews | Hours | Tokens | Years of recording |
|---|---|---|---|---|
| Angola | 58 | 34 | 393,745 | 2012, 2013, 2019 |
| Mozambique | 70 | 42 | 380,958 | 2010, 2020 |
| São Tomé and Príncipe | 77 | 32 | 322,999 | 2008, 2011, 2012 |
| Total | 205 | 108 | 1,097,702 | |

For the data set used in this paper, we extracted all the occurrences of ir ‘to go’ and chegar ‘to arrive’ from the three corpora on the CQPweb platform and then proceeded to exclude the following contexts:

• (i)   contexts without a locative argument.

• (ii)  periphrastic verbal constructions of the type ir/chegar (constituent) + main verb.

• (iii) preposed locative arguments (antecedents of relative clauses, topics, focused locatives, etc.), which are more prone to preposition deletion.

• (iv)  prepositioned locative arguments that were separated from the verb by locative adverbs, such as ‘there’ and ali ‘there’.

• (v)   lexicalized constructions involving ir and chegar.

• (vi)  unclear contexts, for example due to hesitations, reformulations, etc.

• (vii) contexts produced by two interviewers who are native speakers of AP and MP.

The remaining constructions were annotated with respect to (i) the preposition selecting the locative argument (V PPLoc) and cases without preposition (V NPLoc), (ii) the physical or non-physical nature of the propositions described by the predications, and (iii) the nature of the NP (area or container) of PPs headed by locative em.
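The exclusion and annotation steps above can be pictured as a filter followed by a tagging scheme. The sketch below is purely illustrative: the class, field, and label names are hypothetical and do not reflect the actual PALMA annotation format or the CQPweb export, and the `physical` values in the sample are assumptions made for the sake of the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Exclusion(Enum):
    """Hypothetical labels for exclusion criteria (i)-(vii)."""
    NO_LOCATIVE = "i"
    PERIPHRASTIC = "ii"
    PREPOSED_LOCATIVE = "iii"
    ADVERB_INTERVENES = "iv"
    LEXICALIZED = "v"
    UNCLEAR = "vi"
    INTERVIEWER = "vii"


@dataclass
class Occurrence:
    variety: str                       # "AP", "MP" or "STP"
    verb: str                          # "ir" or "chegar"
    preposition: Optional[str]         # "a", "para", "em", ... or None (V NPLoc)
    physical: bool                     # annotation (ii): physical vs. non-physical
    np_type: Optional[str] = None      # annotation (iii): "area"/"container", em only
    exclusion: Optional[Exclusion] = None


def keep(occurrences):
    """Return only the occurrences that survive all exclusion criteria."""
    return [o for o in occurrences if o.exclusion is None]


def frame(occurrence):
    """Syntactic frame of an occurrence: V PPLoc or V NPLoc."""
    return "V NPLoc" if occurrence.preposition is None else "V PPLoc"


# Toy sample: two attested patterns plus one hypothetical excluded context.
sample = [
    Occurrence("AP", "chegar", "em", physical=True),   # cf. 'cheguei no aeroporto'
    Occurrence("STP", "ir", "a", physical=True),       # cf. 'fui a Santa Catarina'
    Occurrence("MP", "ir", None, physical=True, exclusion=Exclusion.UNCLEAR),
]

kept = keep(sample)
print([(o.variety, o.verb, frame(o)) for o in kept])
```

Keeping the exclusion reason on each record, rather than deleting excluded contexts outright, would make it possible to report how many occurrences each criterion removed.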

## 3. Goal arguments in AVPs

The available literature on AP, MP, and STP reports the use of a non-standard prepositional strategy with locative em ‘in’ to introduce arguments with the thematic role of Goal1 (e.g., Adriano, 2014, pp. 334–340; Cabral, 2005; Chavagne, 2005, pp. 223–227; Mingas, 2000, pp. 75–77; P. Gonçalves & Chimbutane, 2004; P. Gonçalves, 2010; R. Gonçalves, 2010) instead of the standard prepositions a and para. Chavagne (2005, p. 227) mentions that the use of em with verbs of movement represents a strong tendency, comparing it with Brazilian Portuguese (BP). Based on a smaller spoken corpus, R. Gonçalves (2010, p. 48) shows that in STP there are two main tendencies: ir preferentially selects PPs headed by para (even in contexts where EP selects a), whereas chegar shows a tendency toward em. Our data show that Goal arguments of ir and chegar (among other verbs) headed by em are indeed attested in all three AVPs, as illustrated in (1) below.2

(1a) Logo  que   cheguei    no      aeroporto…  (AP, corpus data)
     soon  that  I.arrived  in.the  airport
     ‘As soon as I arrived at the airport…’

(1b) Por isso,  muitos  […]      negros  vão  em  Europa.  (MP, corpus data)
     therefore  many    players  black   go   in  Europe
     ‘Therefore, many black players go to Europe.’

(1c) Eles  aconselham  que   é      para  ir  no      hospital.
     they  […]         that  it.is  to    go  in.the  hospital
     ‘Now they advise to go to the hospital.’ (STP, corpus data)

In our corpora of the three AVPs, however, Goals introduced by prepositions a or para, the standard strategies in EP, overall represent the most common strategy, which can be seen in examples (2–3).

(2a) Quando  uma  pessoa  chega    à       escola…  (AP, corpus data)
     when    a    person  arrives  to.the  school
     ‘When a person arrives at school…’

(2b) conseguiu   ir  à       África do Sul…  (MP, corpus data)
     he.managed  go  to.the  South Africa
     ‘he managed to go to South Africa…’

(2c) fui     a   Santa Catarina.  (STP, corpus data)
     I.went  to  Santa Catarina
     ‘I’ve been to Santa Catarina.’

(3a) A    filha     dele    foi   para  […]
     the  daughter  of.him  went  to    university
     ‘His daughter went to university.’

(3b) Amanhã    de  manhã    vou   logo        para  a    escola.  (MP, corpus data)
     tomorrow  of  morning  I.go  right away  to    the  school
     ‘Tomorrow morning I’m going to school right away.’

(3c) Fui     para  um  outro  meio   em  que   tinha  cabo-verdianos.
     I.went  to    a   other  place  in  that  […]    Cabo Verdeans
     ‘I went to another place where there were only Cabo Verdeans.’ (STP, corpus data)

The overall results for the three AVPs with respect to the use of a, para, em, até, Ø (null preposition), and other prepositioned contexts introducing Goal complements of the verb ir in our corpora are presented in Table 2 below as percentages, with the corresponding raw numbers in parentheses.

Table 2

Selection of Goal arguments of ir ‘to go’ in three AVPs.

| Ir | PPa | PPpara | PPem | PPaté | Ø (NP) | Other | Total |
|---|---|---|---|---|---|---|---|
| AP | 27,90% (60) | 38,14% (82) | 29,30% (63) | 2,33% (5) | 1,86% (4) | 0,47% (1) | 100% (215) |
| MP | 51,53% (252) | 32,32% (158) | 9% (44) | 3,47% (17) | 1,02% (5) | 2,66% (13) | 100% (489) |
| STP | 30,64% (129) | 61,52% (259) | 1,90% (8) | 4,28% (18) | 0,47% (2) | 1,19% (5) | 100% (421) |
| Total | 39,2% (441) | 44,36% (499) | 10,22% (115) | 3,55% (40) | 0,98% (11) | 1,69% (19) | 100% (1125) |

The overall results for the three AVPs with respect to the use of a, em, até, Ø, and other prepositioned contexts introducing Goal complements of the verb chegar are shown in Table 3 below.

Table 3

Selection of Goal arguments of chegar ‘to arrive’ in three AVPs.

| Chegar | PPa | PPem | PPaté | Ø (NP) | Other | Total |
|---|---|---|---|---|---|---|
| AP | 43,4% (23) | 39,62% (21) | 7,55% (4) | 1,88% (1) | 7,55% (4) | 100% (53) |
| MP | 25,79% (33) | 67,19% (86) | 2,34% (3) | 2,34% (3) | 2,34% (3) | 100% (128) |
| STP | 37% (37) | 37% (37) | 4% (4) | 22% (22) | 0% (0) | 100% (100) |
| Total | 33,10% (93) | 51,25% (144) | 3,91% (11) | 9,25% (26) | 2,49% (7) | 100% (281) |

The tables show in the first place that the varieties exhibit substantial disparity with respect to the patterns selected by ir and chegar, both within and across varieties. It also follows that the complements of ir and chegar are predominantly PPs and not NPs. STP, however, is exceptional in that it exhibits a rather substantial number of cases where chegar selects an NP argument (22/22%).
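The proportions cited in this section can be recomputed from the raw counts. The following minimal sketch copies the counts from Tables 2 and 3; the dictionary layout and the helper name `share` are our own illustrative choices, not part of the original analysis.

```python
# Raw counts copied from Tables 2 and 3.
IR = {
    "AP":  {"a": 60,  "para": 82,  "em": 63, "até": 5,  "Ø": 4, "other": 1},
    "MP":  {"a": 252, "para": 158, "em": 44, "até": 17, "Ø": 5, "other": 13},
    "STP": {"a": 129, "para": 259, "em": 8,  "até": 18, "Ø": 2, "other": 5},
}
CHEGAR = {
    "AP":  {"a": 23, "em": 21, "até": 4, "Ø": 1,  "other": 4},
    "MP":  {"a": 33, "em": 86, "até": 3, "Ø": 3,  "other": 3},
    "STP": {"a": 37, "em": 37, "até": 4, "Ø": 22, "other": 0},
}


def share(table, strategies, variety=None):
    """Return (hits, total, percentage) for the given strategies.

    If `variety` is None, the counts are pooled over all three varieties.
    """
    rows = [table[variety]] if variety else list(table.values())
    hits = sum(row[s] for row in rows for s in strategies)
    total = sum(sum(row.values()) for row in rows)
    return hits, total, 100 * hits / total


for label, result in [
    ("ir, a + para, all varieties", share(IR, ["a", "para"])),
    ("ir, em, AP", share(IR, ["em"], "AP")),
    ("chegar, em, all varieties", share(CHEGAR, ["em"])),
    ("chegar, Ø, STP", share(CHEGAR, ["Ø"], "STP")),
]:
    hits, total, pct = result
    print(f"{label}: {hits}/{total} = {pct:.2f}%")
```

Pooling over varieties in this way reproduces the totals row of each table, for instance 940 of 1125 occurrences of ir with a or para.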

In the specific case of ir (cf. Table 2), the standard EP strategies with a and para are most frequent (940/83,56%), with a strong preference for para in the case of STP (259/61,52%) in comparison to AP and MP.3 With respect to the use of the non-standard strategy with em, which only represents a relatively small number of overall cases in these varieties (115/10,22%), AP is the variety that stands out for its more widespread use of this preposition (63/29,30%). The application of Fisher’s test4 (Eddington, 2016) to the data in Table 2 shows that significant differences can be observed between varieties, since the p-value is much lower than 0.05 in all cases: The pairs AP/MP, AP/STP, and MP/STP all have a p < .001.

In the case of chegar (cf. Table 3), the variation in the AVPs mainly occurs between two prepositional strategies, either with a or with em. In contrast to ir, the non-standard strategy with em, corresponding to 144/51,25% of the occurrences, is more widespread than the canonical strategy, which accounts for 93/33,10% of the cases. While the strategy with em is dominant in MP (86/67,19%), it is also common in AP (21/39,62%) and STP (37/37%). Unlike AP and MP, as mentioned above, STP also shows a fair number of cases where chegar selects an NP (22/22%), a tendency that was not observed with regard to ir (2/0,47%). Applying Fisher’s test to the data in Table 3 shows, again, that significant differences between varieties can be observed: the pair AP/MP has a p-value = 0.006181; the pair AP/STP a p = 0.004861; and, most notably, the pair MP/STP a p < .001.

All in all, the statistical tests suggest significant differences between the three varieties with respect to the selection of Goal complements with both ir and chegar.
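To make the logic of such pairwise comparisons concrete, the sketch below implements a two-sided Fisher exact test for a 2×2 table using only the Python standard library. It is a deliberate simplification and not the procedure reported above: each variety's row of Table 2 is collapsed into em versus all other strategies, so the resulting p-values are not expected to match those obtained from the full tables. The function and variable names are our own.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# (em, all other strategies) with ir, collapsed from Table 2.
IR_EM = {"AP": (63, 152), "MP": (44, 445), "STP": (8, 413)}

p_values = {}
for v1, v2 in [("AP", "MP"), ("AP", "STP"), ("MP", "STP")]:
    p_values[(v1, v2)] = fisher_exact_2x2(*IR_EM[v1], *IR_EM[v2])
    print(f"{v1}/{v2}: p = {p_values[(v1, v2)]:.3g}")
```

Even under this coarse collapse, each pair of varieties differs in its rate of em with ir at p < .001, which points in the same direction as the results reported for the full tables.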

## 4. Location in the main contact languages

In order to assess the role of language contact, we will now turn to the strategies used to express Goal arguments in the main contact languages of the three urban AVPs at stake, which are the Bantu languages Kimbundu (Mbundu) and Changana, for AP and MP respectively, and the creole language Forro for STP. If contact plays a role, as has been claimed by several authors, the role of these languages is expected to be both direct and indirect. Direct, in the sense that there are speech communities and/or bilinguals of these languages within the urban environment and among our informants, especially in the case of MP; indirect, in the sense that these languages – or sometimes languages with a similar typology spoken by migrants moving to urban areas – have been historically in contact with Portuguese, even if speakers do not necessarily speak them (fluently or actively).

For Proto-Bantu, the ancient ancestor of the contemporary Bantu languages, three locative noun class prefixes have been reconstructed, corresponding to classes 16, 17, 18 (*pà-, *kù-, *mù-), which have a reflex in many contemporary Bantu languages but have been lost or only occur as vestiges in other ones (e.g., Zeller, in press).

Kimbundu, a western Bantu language, preserves reflexes of the three Proto-Bantu locative noun classes (e.g., Araújo, 2013; Chatelain, 1888–1889; Diarra, 1990; Miguel, 2019).5 The marker that is standardly used to introduce the Goal argument of verbs of directed motion is ku, which stems from class 17, as illustrated in (4–5).

(4) Ngiya      ku   {ngeleja/kalunga/Lisboa}.  (Kimbundu, Afonso Miguel, p.c.)
    1SG.go.FV  LOC  {church/sea/Lisbon}
    ‘I’m going to {the church/the sea/Lisbon}.’

(5) Nzwa  watula             {ku  Wambu /   ku   ngeleja}.
    Nzwa  3SG.PST.arrive.FV  LOC  Huambo /  LOC  church
    ‘Nzwa arrived in Huambo / at the church.’ (Kimbundu, Afonso Miguel, p.c.)

Unlike in Proto-Bantu, however, the locative markers in Kimbundu exhibit properties of free morphemes, i.e., prepositional items, instead of prefixes, as shown in (4–5). The prepositional use of locative morphemes has been demonstrated for a number of Bantu languages in different areas (e.g., Marten, 2010; Zeller, in press). Although the prepositional status of locative markers in Kimbundu has been assumed in different studies without being substantiated with linguistic evidence (e.g., Diarra, 1990, p. 59), it is, for instance, supported by the fact that the locative marker does not trigger locative agreement in object position. In (6), it is the noun ngeleja ‘church’ which triggers class 9 agreement with demonstratives and possessives and not the locative related to class 17.

(6) Ngiya      ku       {ngeleja  yiyi   / yami}.  (Kimbundu, Afonso Miguel, p.c.)
    1SG.go.FV  LOC(17)  9.church  9.DEM  / 9.POSS
    ‘I’m going to {this/my} church.’

Furthermore, when the locative marker selects an animate object, a connective morpheme -a is required, as illustrated in (7).6 This feature is not compatible with a prefixal status of ku because the intervention of this type of morphological material between a noun class prefix and the nominal root is not attested in Bantu languages.

(7) Ngiya      {ku  kalunga  / kua       Phetele}.  (Kimbundu, Afonso Miguel, p.c.)
    1SG.go.FV  LOC  sea      / LOC.CONN  Phetele
    ‘I went to the sea / to Phetele’s (place).’

The expression of location in Changana, a southeastern Bantu language, is quite distinct from the facts observed for Kimbundu, since the language only exhibits remnants of the three locative classes reconstructed for Proto-Bantu (Chimbutane, 2002; Ngunga & Simbine, 2012; Sitoe, 2001). Instead, Goals are characterized by other morphological strategies (Chimbutane, 2002; Sitoe, 2001), typically the attachment of the suffix -ini (or its allomorph -eni) to the NP, accompanied by optional prefixation of á- or é-.

(8) Bilá  átáya     áxikólwéni      múndzuku
    Bila  1.FUT.go  LOC.school.LOC  tomorrow
    ‘Bila will go to school tomorrow…’

Sitoe (2001, pp. 6–7) mentions that Changana locatives are generally vague with respect to their localization. He illustrates this claim with the example nambzeni (nambu ‘river’ + -eni), which “can refer to a location inside the river, on the river, at the bank of the river, or in the general area of the river.”

Changana further displays a number of inherently locative nouns, which includes borrowings, that lack any overt marking, as mánánga ‘desert’ in (9a), or can be prefixed with á- to reinforce their locativity, as ákáyá ‘home’ in (9b).

(9a) Makamela  máyé     mánánga
     6.camels  6.go.PT  desert
     ‘The camels went to the desert.’

(9b) Podíná  ámúka         ákáyá.
     Podina  1.go:back.PR  LOC.home
     ‘Podina is going back home.’

(9c) Bilá  áfikélé           káyá.  (Changana, F. Chimbutane, p.c.)
     Bila  1.arrive.APPL.PT  home
     ‘Bila arrived home.’

For endogenous toponyms, on the other hand, prefix ká- is employed (10a), whereas toponyms of foreign origin are not marked or are optionally marked by prefix á- (Sitoe, 2001, p. 10) as shown in (10b).

(10a) Bilá  áya      káTembe.
      Bila  1.go.PR  LOC.Tembe
      ‘Bila goes to Catembe.’

(10b) Bilá  áya      áJoní.
      Bila  1.go.PR  LOC.Johannesburg
      ‘Bila goes to Johannesburg.’

(10c) Bilá  áfikélé           áMapútu.
      Bila  1.arrive.APPL.PT  LOC.Maputo
      ‘Bila arrived precisely at Maputo.’

Finally, Goal arguments in Forro, the main creole language of São Tomé, are uniformly expressed as NPs (e.g., Hagemeijer, 2004).

(11a) Kuma  n    ga   ba  fesa   sun alê?
      how   1SG  FUT  go  party  Mr. king
      ‘How will I go to the party of the king?’

(11b) N    ba  glêza   ku    inen.
      1SG  go  church  with  3PL
      ‘I went to church with them.’

(11c) Oze    n    ga   ba  […]
      today  1SG  FUT  go  […]
      ‘Today I will go to (the town of) Trindade.’

(12a) San     xiga    palaxu.
      3S.FRM  arrive  palace
      ‘She arrived at the palace.’

(12b) Flolensa  xiga    poson.
      Florença  arrive  city of São Tomé
      ‘Florença arrived at the city of São Tomé.’

(12c) Tudu  ngê     xiga    misa.  (Forro, elicited)
      all   people  arrive  mass
      ‘Everybody arrived at the mass.’

In sum, the three main contact languages of the urban AVPs analyzed exhibit fundamentally different solutions to encode Goal arguments with verbs of directed motion: Kimbundu exhibits PPs headed by ku; Changana uses a strategy in which NPs are morphologically marked; in Forro, Goals are also NPs but lack any (overt) morphological marking.

## 5. Discussion

A cross-comparison between the patterns available in the AVPs and their respective main contact languages shows that the contact-induced hypothesis runs into several problems.

First, the AVPs in this study predominantly use a PP strategy to introduce Goals of the directed motion verbs ir and chegar, whereas two of the contact languages, Changana and Forro, exhibit an NP strategy, respectively with or without overt morphology. Second, while the prepositions heading PPs in the AVPs show considerable variation, a clear-cut distinction arises between ir, where standard a and para constitute the dominant pattern, and chegar, which shows a preference for the non-canonical strategy with em. The main contact languages (Kimbundu, Changana, and Forro), on the other hand, use a language-internally uniform strategy to express the complement of these and other directed motion verbs.7

The contact-induced hypothesis is also unable to explain why ir in STP shows a strong preference for para and hardly any cases of em, and chegar a primary pattern with em, since Forro systematically uses an NP argument in both cases. In AP, however, the more pronounced overall presence of em, although far from being the exclusive pattern, would be more consistent with a more emphatic role of language contact, since Kimbundu also displays a generalized and uniform PP strategy with locative ku, which could further facilitate syntactic transfer and convergence.8 This line of argumentation arguably cannot hold for MP though, since it was shown that its main contact language Changana exhibits morphological strategies.

To overcome this problem in the case of MP, P. Gonçalves and Chimbutane (2004) and P. Gonçalves (2010) propose an analysis, based on a scenario of L1 Bantu speakers who are L2 learners/acquirers of Portuguese, whereby preposition em is reanalyzed as an NP-internal locative Case marker, [em-NP], i.e., as a prefix, inspired by the Bantu noun class tradition. This hypothesis, they argue, would then be able to account for examples like (13a-b), where preposition no in (13a) and em in (13b) would be incorporated into the NP as follows: [NP no-encontro da igreja] and [PP para [NP em-casa]].

(13a) vou   no      encontro  da      igreja  (MP, corpus data)
      I.go  in.the  meeting   of.the  church
      ‘I’m going to the church meeting’

(13b) leva      os   meus  filhos    para  em  casa   do      outro  homem
      you.take  the  my    children  to    in  house  of.the  other  man
      ‘you take my children to the other man’s place’ (MP, corpus data)

We concur with P. Gonçalves and Chimbutane (2004) that examples of co-occurring prepositions, as in (13b), in particular para and em, are rare. In our data, we identified only 5 such occurrences, produced by 4 different speakers. In 4 of these occurrences, the locative NP was casa ‘house’. The scarcity of these constructions would therefore not only weaken the hypothesis that em is a prefix, but would also require further explanations as to the workings of the grammaticalization of a preposition into a bound morpheme in MP. As shown in section 4, the primary locative strategy in Changana relies on a suffix (-ini or -eni), and not on a prefix. Moreover, there is no evidence that other Bantu affixes, for example negation prefixes or verbal extensions (applicatives, etc.), induce reanalysis of Portuguese morphemes as bound morphology in MP (or AP). This raises the more general issue as to whether or how morphology in an L1 is processed and transferred into an L2 target language in contact situations. In the case of Changana-Portuguese it would be a costly operation, requiring both a categorial and syntactic reanalysis of em.

Furthermore, Bantu locative morphemes are typically prefixed to the head noun, as in the case of Changana’s ká- and á- described in section 4, which means that in a strong version of the contact hypothesis the strategies with em are not expected to contract with definite articles9 or demonstratives nor to be separated from the head noun by possessives, numerals, prenominal adjectives, etc., contrary to fact (e.g., há quem chega no meu salão ‘some people arrive at my beauty salon’, MP corpus data). This type of noun-modifying material is generally postnominal in Bantu, but in those languages where it occurs prenominally, the locative marker is a free morpheme (preposition).10

The contact hypothesis for MP was also adopted by Avelar and Álvarez López (2018) for the (L2) Cabinda variety of AP, which is primarily in contact with Ibinda, a western Bantu language belonging to the Kongo cluster, which exhibits prenominal locative morphemes. While this would be syntactically consistent with the contact-induced hypothesis above, it still faces the problems mentioned in the previous paragraph regarding the NP-internal syntactic differences between Bantu and Portuguese.

Finally, the NP-internal prefixation hypothesis proposed originally by P. Gonçalves and Chimbutane (2004) is unable to account for AP, since it was shown that Kimbundu locative markers introducing Goals are free morphemes (prepositions). Since Kimbundu and Portuguese both exhibit prepositions, it would be counterintuitive to claim that Ps incorporate in AP with verbs of directed motion. In fact, as expected, we did not find any instance of double prepositions in our AP data.11

The gradual withdrawal of the contact languages caused by shift, especially in the urban environment in Angola and São Tomé, where Portuguese is not only widely spoken but also frequently the native or primary language of the speakers, with increasing monolingualism in Portuguese, especially among the younger generations, constitutes yet another argument that increasingly weakens the role of language contact with respect to the observed patterns. This tendency has become stronger since the independences in 1975 and can be correlated with the massification of education in the official language and, concomitantly, higher educational attainment. Even though the spoken local (non-native) varieties constitute the main input (e.g., Stroud, 1997), the role of education is expected to raise the speakers’ linguistic awareness and to promote wider use of patterns found in the standard language.12 While the historical relevance of language contact cannot be ignored, the contemporary urban environments have become the playground of thriving language shift toward Portuguese, diminishing the role of transfer of substrate patterns.

Finally, compared to EP, the AVPs appear to privilege prepositions para and em over a with the verbs at stake, even though ultimately a finer-grained analysis is required to assess the use of para and a, for example with regard to implications concerning a longer or a shorter stay at the Goal, as in the difference between ir para Luanda ‘to go to Luanda’ vs. ir a Luanda ‘to go to Luanda’, where the former implies a longer stay in Luanda than the latter. The partial loss of a is an expected outcome in contact situations where L2 acquisition plays or played a prominent role. Phonetically weak and semantically opaque functional items, such as directional preposition a, definite articles, or accusative clitics, are prone to loss and/or partial replacement by other, often less functional material. In contexts of ambiguous and insufficiently dense input (e.g., P. Gonçalves, 2002, 2004), it is expected that directional para or locative em have the upper hand in comparison to a. For instance, the latter, unlike the former two, did not make it into the Portuguese-related creoles in West-Africa belonging to the Upper Guinea and Gulf of Guinea clusters. Moreover, in (spoken) BP, which also has a history of substantial L2 acquisition and partial restructuring (e.g., Lucchesi, 2001), directional preposition a has given rise to variation, favoring in particular em in the structures at stake, and does not occur in Afro-Brazilian subvarieties which underwent substantial restructuring (see Avelar, 2017, for discussion).

In sum, the empirical evidence does not support a major direct role for language contact regarding the selection properties of directed motion verbs ir and chegar, even though there may be – on a limited scale – individual grammars that privilege the grammatical properties of the contact languages. It was also shown that a morphological analysis based on language contact, proposed initially for MP, cannot be applied across the board, primarily because it is not consistent with the data from the contact languages and the AVPs.

In light of the above, we aim to develop a different, semantic-oriented approach to the data, as an alternative (and complement) to the contact-induced hypothesis.

## 6. A semantic hypothesis

In the remainder of this paper, we discuss Goal PPs introduced by em, a non-standard strategy in EP, focusing primarily on the lexical semantics of verbs and their objects and on the semantics of predications.

### 6.1. Goals expressed by locative phrases

Goals are not always expressed by Goal prepositions, such as a and para in EP. In English, for instance, locative phrases can be understood as goals and this interpretation seems to arise irrespective of the semantics of the verb, that is, irrespective of being a directed motion verb or a manner of motion verb.

(14a) John walked in the room.

(14b) Kim jumped on the bed. (Beavers, Levin & Tham, 2010, p. 33)

Although both in and on in (14) are locative prepositions, besides the locative interpretation, the PPs they introduce also have directional (Goal) interpretations, which are easier to achieve in the right context, namely if the entity referred to is standing outside the room, in (14a), or next to the bed, in (14b). Levin, Beavers and Tham (2009) argue that this strategy of combining motion verbs with locative PPs is one of the ways of expressing directionality with motion verbs in English – the other one is combining motion verbs with Goal-marking PPs – but there are differences in their productivity, as the former strategy shows acceptability variation whereas the latter is widespread.

This possibility of Goal readings of locative phrases is also available in other languages, such as Afrikaans, Ancient Greek, Dutch, Norwegian, Russian, and Ukrainian (cf., e.g., Beavers, Levin & Tham, 2010; Nikitina & Maslov, 2013), as well as Romance languages, such as French, Spanish, or Italian (e.g., Nikitina, 2008; Ursini, 2013).

(15) Rick  sprong  in  het  meer.  (Dutch, Gehrke, 2007, p. 248)
     Rick  jumped  in  the  lake
     ‘Rick jumped in the lake.’ (locative / directional)

(16) Max a couru dans sa chambre.
     ‘Max ran into his room.’

In fact, Nikitina (2008, 2017) points out that in many languages, including Romance languages, (static) Locations and Goals tend to be expressed by the same linguistic forms (e.g., prepositions, case), with the distinction between the two readings being context-dependent. Also, Pantcheva (2010) surveys typologically distinct languages and concludes that, despite the general tendency toward a distinct marking of Locations, Goals and Sources (60% of the languages under scrutiny use differential marking), there is also a considerable number of languages (approximately 30%) exhibiting the same strategies for the expression of both Locations and Goals.

If we turn to the two main Portuguese varieties, the situation is twofold. On the one hand, in BP, the locative preposition em (‘in’, ‘on’) is used with directional interpretation in events of movement that typically include verbs of directed motion.13

(17a) {Fui/cheguei/vim}      no      cinema.  (BP; *EP)
      I {went/arrived/came}  in.the  cinema

(17b) Fui/cheguei/vim        ao      cinema.  (BP; EP)
      I {went/arrived/came}  to.the  cinema

Both: ‘I {went to/arrived at/came to} the cinema.’

In EP, on the other hand, only a small number of verbs denoting some sort of change of location can appear with preposition em introducing Goal complements: this short list includes entrar (‘to enter’) (cf. (18a)), aparecer (‘to appear’), esconder (‘to hide’), penetrar (‘to penetrate’) and pôr (‘to put’) (cf., e.g., Raposo & Xavier, 2013, pp. 1543–1544). The preposition em conveying a directional meaning cannot be used with other directed motion verbs (nor manner of motion verbs), as illustrated in (18b-c).

(18) (a) Entrei em casa. (EP)
I.entered in house
‘I entered the house.’

(b) Subi no quarto. (EP)
I.went.up in.the room
‘I went up in the room.’ / *‘I went up to the room.’

(c) Caminhei em casa. (EP)
I.walked in house
‘I walked inside the house.’ / *‘I walked to the house.’

All in all, the occurrence of locative prepositions with a Goal meaning is crosslinguistically widespread, EP being an exception.14 In the next section, we review some hypotheses for this directional reading of locative prepositions.

### 6.2. Hypotheses for locative phrases with directional meaning

#### 6.2.1. Brazilian Portuguese preposition em

There are a few sociolinguistic studies on BP geographic varieties that try to explain the directional meaning of em in terms of pragmatic and/or semantic factors (e.g., Mollica, 1996; Vieira, 2009; Wiedemer, 2008). These studies scrutinize the factors underlying the use of em in directional contexts with certain verbs of directed motion and converge in postulating that directional em is favored when its complement denotes a closed space, like a house, whereas the directional prepositions a ‘to’ and até ‘up to’ typically introduce open spaces, such as a beach.

More recently, Rammé (2017) and Ferreira and Basso (2019) discuss the use of directional em in BP using a nanosyntactic framework. In both proposals, the authors assume that prepositions can be either locative or directional and, when a locative preposition has a directional interpretation, this interpretation depends on the verb (that is, there is a structural effect). Using this point of departure, they postulate that em is always a locative preposition, and that the PP only has a locative meaning, i.e., em only codifies location, not direction, even in those cases where a directional interpretation arises. The directional interpretations of em depend exclusively on the verb’s meaning, which explains why directional em only occurs with verbs that codify directed motion. Therefore, both studies conclude that the directional interpretation of BP em is a consequence of structural ambiguity, or false syncretism (cf. Gehrke, 2008). Additionally, Rammé (2017) posits that em in BP has not changed its semantic value in the past two centuries (contrary to what is usually assumed in the literature), whereas Ferreira and Basso (2019) propose that em is used in those contexts where the speaker wants to ensure that, at the end of the event, the Figure is in the interior of the Goal (cf. the abovementioned pragmatic studies of BP em).

#### 6.2.2. English preposition in

Although there are proposals explaining the occurrence of locative PPs with directional meaning that follow a lexical approach (e.g., Alonge, 1997; Fábregas, 2007; Folli & Ramchand, 2005), Levin, Beavers and Tham (2009) and Beavers, Levin and Tham (2010) convincingly argue that the occurrence of these locative PPs with directional interpretation can be explained taking into consideration semantic and pragmatic factors, following Nikitina (2008). In general, the research on these matters points out two strategies for the encoding of the directional meaning: (i) to encode the directional meaning in a specialized satellite, if available; this kind of Goal combines with all types of verbs, including manner verbs; (ii) to encode the directional meaning in a specialized class of inherently directional verbs (typically manner verbs are excluded). However, Nikitina (2008) suggests a third option to describe directed motion. In fact, languages lacking specialized means of encoding directionality outside the verb (by specialized satellites) do not always restrict the expression of directed motion to a subset of inherently directional verbs; instead, the directional meaning is not expressed overtly, that is, the strategy of expressing directed motion relies on contextual inference rather than lexical encoding (“zero” encoding strategy; cf. Kopecka, 2009, p. 55 for a similar claim).

Nikitina’s proposal relies on a compositional view of directional meaning. Thus, she assumes that more than one element in the sentence can contribute to the directional interpretation and that the directional meaning does not have to be encoded by specialized lexical material to be inferred. Furthermore, the choice of a strategy partly depends on factors that have to do with alternative ways of conceptualizing motion events. One of those factors is context. As Beavers, Levin and Tham (2010, p. 33) posit, “… locative phrases are understood as goals precisely in those contexts that allow a reader or hearer to infer that a goal interpretation is intended.” In (19), repeated from (14a), “in the room” is interpreted as a Goal phrase only if John is standing near the door of the room; otherwise, the phrase is interpreted as locative (the place where the event of John walking occurred).

(19) John walked in the room.

The verb’s lexical semantics is also an important factor in Nikitina’s proposal. Locative PPs with Goal interpretation are found more often with verbs that project non-durative predications and tend not to arise with durative verbs; therefore, they usually combine with directed motion verbs (usually punctual, transitional verbs), but not with manner of motion verbs (usually durative verbs). Similar conclusions grounded on English, Dutch, and French data can be found in Gehrke (2007),15 Thomas (2004), and Kopecka (2009).

The last relevant factor underlying the expression of Goals in Nikitina (2008) is the semantics of the complement of the preposition. The author argues that locations with well-defined boundaries (“containers”16), that is, locations that are not surrounded by a transitional zone,17 are used more often in locative PPs with directional interpretation; on the contrary, locations that do not have well-defined boundaries (“areas”18) are excluded from these constructions (cf. also Kopecka, 2009, for French).

Although Nikitina’s proposal concerns English data, there is evidence that the semantic/pragmatic account of locative Goals can be extended to Romance languages. For instance, Kopecka (2009) analyses some manner of motion verbs with locative prepositions in French and concludes that the factors that favor the directional readings of locative prepositions in English can be extended to account for French data. In what concerns BP em, the studies of Mollica (1996), Wiedemer (2008), and Vieira (2009) also adopt the same analysis, emphasizing the influence of the semantics of the complement of the preposition, while Rammé (2017) and Ferreira and Basso (2019) highlight the importance of the meaning of the verb. In the next section it will be shown that certain tendencies observed with respect to directional readings of em in the AVPs can be accounted for by Nikitina’s hypothesis.

### 6.3. A semantic approach to directed motion in AVPs

#### 6.3.1. Introducing a semantic feature hypothesis

Following Nikitina’s (2008) semantic/pragmatic approach to directional locatives, we argue that the directional readings of preposition em in AVPs are favored:

1. when the speaker wants to describe an event of movement, that is, an event in which there is an entity (the Figure) that moves toward a reference object (the Ground), and this movement can be described by a series of spatial coordinates.

2. when the reference object has well-defined boundaries.

3. when the predication describes a non-durative change of place.

Concerning 1., one of the key aspects of the semantic/pragmatic account of the directional readings of locative phrases is that it involves actual movement in the “real world”. In other words, locative phrases are interpreted as Goal phrases if, during the event, the Figure changes its location and, at the end of the event, the Figure is typically located inside that location. Therefore, one assumes that the Figure must be located by spatial coordinates. However, verbs of movement can also be used to describe events that do not correspond to actual movement in the real world, or situations that do not involve movement at all, as in (20) and (21).

(20) The price of gasoline rose from $2.6 to $3.

(21) The road goes from the village to the mountain.

In (20), there is an event, and, during this event, there is a change in a scale (associated with the word “price”), and not a change in the physical location of the entity denoted by “the price of gasoline”. As for (21), the predication corresponds to a state describing the (static) location of the entity denoted by “the road”. In the cognitive literature, sentences like (20) are typically treated as metaphors, whereas sentences like (21), that correspond to a description of a static situation, are treated as “fictive motion” (Talmy, 2000), “subjective motion” (Langacker, 1986; Matsumoto, 1996) or “non-actual motion” (Blomberg & Zlatev, 2014).

This difference between, on the one hand, events that correspond to a change in the physical location of the Figure (as the entity denoted by “the man” in “the man went to the village”) and, on the other hand, events that do not correspond to a change in the physical location, as in (20), or do not correspond to a change at all, as in (21), seems to be relevant to the analysis of predications with verbs of movement. Take, for instance, the verb ir. This verb can cooccur with three Goal prepositions, a, para and até (a),19 and one can change one preposition for another if the predication involves physical motion, as in (22), although with subtle changes in the meaning of the sentence. However, if the predication describes a non-physical motion, these prepositions are not interchangeable. See (23), where ir para o FC Porto means ‘to start working at FC Porto’ (cf. Leal, Oliveira & Silvano, 2017).

(22) O rapaz foi {a / para / até} casa. (EP)
the boy went {to / to / up.to} house
‘The boy went home.’

(23) O rapaz foi {*ao / para o / *até ao} FC Porto. (EP)
the boy went {to.the / to the / up.to to.the} FC Porto
‘The boy joined FC Porto.’

Therefore, in the analysis of the data of AVPs, we took into consideration this difference between (i) predications expressing events that denote physical motion (cf. (24)) and (ii) predications that express events that do not denote physical motion (cf. (25)) or that express locations (that is, fictive motion).

(24) fui à província de Lubango (AP, corpus data)
I.went to.the province of Lubango
‘I went to Lubango province’

(25) fui à tropa (AP, corpus data)
I.went to.the army
‘I joined the army’

Concerning 2., that is, the semantics of the complement of the preposition em, we consider, as in Nikitina (2008), that the directional readings of preposition em are favored when the reference object corresponds to a region with well-defined boundaries, and when there is no transition zone between the region that corresponds to the reference object and all the regions that do not correspond to the reference object. In other words, the directional readings of preposition em will be favored if the preposition introduces NPs denoting containers, not areas. We therefore distinguished three kinds of NP complements:

• (i)   areas: locations that do not have well-defined boundaries, that are surrounded by a transitional zone (some examples from the corpora: lavra ‘farmland’, município ‘municipality’, praia ‘beach’, …).

• (ii)  containers: locations with well-defined boundaries, that is, which are not surrounded by a transitional zone (some examples from the corpora: casa ‘house’, bloco ‘operating room’, loja ‘store’, …).

• (iii) undefined NP: this third category is used in those cases that are not adequate for any of Nikitina’s classes (some examples from the corpora: porta ‘door’, árvore ‘tree’, …), or if one cannot classify the relevant NPs, for instance, due to lack of context (some examples from the corpora: sítio ‘place’, médico ‘doctor’, curandeiro ‘traditional healer’, …).
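The three-way coding of NP complements can be pictured as a simple lookup. The sketch below is purely illustrative: the names `NPType`, `NP_LEXICON`, and `classify_np` are hypothetical, the lexicon contains only the corpus examples cited above, and the fallback to “undefined” for unlisted nouns is an assumption, not a description of how the corpus was actually annotated.

```python
from enum import Enum


class NPType(Enum):
    AREA = "area"            # no well-defined boundaries, transitional zone
    CONTAINER = "container"  # well-defined boundaries, no transitional zone
    UNDEFINED = "undefined"  # fits neither class, or context is insufficient


# Hypothetical lexicon: only the nouns quoted in the text as corpus examples.
NP_LEXICON = {
    "lavra": NPType.AREA,
    "município": NPType.AREA,
    "praia": NPType.AREA,
    "casa": NPType.CONTAINER,
    "bloco": NPType.CONTAINER,
    "loja": NPType.CONTAINER,
    "porta": NPType.UNDEFINED,
    "árvore": NPType.UNDEFINED,
    "sítio": NPType.UNDEFINED,
    "médico": NPType.UNDEFINED,
    "curandeiro": NPType.UNDEFINED,
}


def classify_np(noun: str) -> NPType:
    """Return the coded NP type; unlisted nouns default to UNDEFINED (assumption)."""
    return NP_LEXICON.get(noun.lower(), NPType.UNDEFINED)


for noun in ("casa", "praia", "sítio"):
    print(noun, "->", classify_np(noun).value)
```

In practice the classification depends on context (a given noun may denote a container in one utterance and remain unclassifiable in another), so a lookup of this kind could at best serve as a first pass before manual checking.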

Concerning 3., in Nikitina’s proposal, the lexical information exhibited by the verb form is a relevant factor, because locative PPs with Goal interpretation are found more often with verbs that project non-durative predications (culminations, in Moens’ (1987) terminology) than with durative predications (processes or culminated processes), and they are not expected to arise alongside durative verbs. We therefore tested the Aktionsart of the verbs chegar and ir, namely their durative vs. non-durative nature, using the tests usually found in the literature (cf. e.g., Dowty, 1979; Vendler, 1957; for EP, Leal, 2009), and concluded that chegar is a non-durative verb, whereas ir is ambiguous between durative and non-durative readings (see Leal, Oliveira & Silvano, 2018).

The following examples, based on the judgement of EP speakers, illustrate these conclusions. Example (26) shows that, with the adverbial “in x time”, the sentence with ir entails the truth of the same sentence in the progressive during the same time, which is a diagnostic for durativity. The same test applied to chegar, in (27), shows that the predication is non-durative.

(26) O rapaz foi para a escola em 10 minutos.
the boy went to the school in 10 minutes
‘The boy went to school in 10 minutes.’
O rapaz esteve a ir para a escola durante esses 10 minutos.
‘The boy had been going to school for 10 minutes.’

(27) O rapaz chegou à escola em 10 minutos.
the boy arrived at.the school in 10 minutes
‘The boy arrived at school in 10 minutes.’
*O rapaz esteve a chegar à escola durante esses 10 minutos.
‘The boy had been arriving at school for 10 minutes.’

Another test for durativity is the occurrence of predications with “at x time” adverbials. In (28), the adverbial às 14 horas ‘at two o’clock’ only locates the beginning of the event, which indicates that this event is durative. However, in (29), with chegar, the adverbial locates the whole event, which indicates that it is non-durative.

(28) O rapaz foi para a escola às 14 horas.
the boy went to the school at.the 14 hours
‘The boy went to school at 2 pm.’

(29) O rapaz chegou à escola às 14 horas.
the boy arrived at.the school at.the 14 hours
‘The boy arrived at school at 2 pm.’

However, when combining predications with ir with “for x time” adverbials, apparently contradictory results arise. In (30), if the predication is true, then two distinct entailments arise: the boy was going to school during those ten minutes (cf. 30a) or the boy was at school for 10 minutes (cf. 30b). In other words, the adverbial “for x time” can measure the preparatory process (cf. Moens, 1987) of the event (30a) or the consequent state of the event (30b). In the former case, the entailment is like those that arise with typical durative events (cf. (31)); in the latter case, the entailment is like those that arise with non-durative events (cf. (32)). One can thus conclude that predications with ir can have both durative and non-durative readings (see also Leal, Oliveira & Silvano, 2018).

(30) O rapaz foi para a escola durante 10 minutos.
the boy went to the school for 10 minutes

(a) O rapaz esteve a ir para a escola durante 10 minutos.
‘The boy had been going to school for 10 minutes.’

(b) O rapaz esteve na escola durante 10 minutos.
‘The boy was at school for 10 minutes.’

(31) O rapaz leu o livro durante 5 minutos.
the boy read the book for 5 minutes
O rapaz esteve a ler o livro durante 5 minutos.
‘The boy was reading the book for 5 minutes.’ (Leal, Oliveira & Silvano, 2018, p. 341)

(32) O rapaz desmaiou durante 5 minutos.
the boy passed.out for 5 minutes
O rapaz esteve desmaiado durante 5 minutos.
‘The boy was passed out for 5 minutes.’ (Leal, Oliveira & Silvano, 2018, p. 341)

Before we turn our attention to the analysis of AVP data concerning these parameters, some remarks on Nikitina’s (2008) proposal are in order. Among the relevant parameters favoring the occurrence of directional readings of locative in, one also finds co-text. In Nikitina’s study of English, a Goal interpretation of in is impossible if there is also a PP indicating the Source of the movement (irrespective of the verb). We therefore checked the occurrences of directional em, searching for Source phrases. However, there is only one occurrence of a Source phrase in the analyzed data (see (33)), which suggests that these verbs (ir and chegar) do not easily occur with this type of phrase. Notice that, in (33), the Goal phrase is headed by preposition a, and not by em.

(33) da Suazilândia fui à África do Sul (MP, corpus data)
from.the Swaziland I.went to.the South Africa
‘from Swaziland I went to South Africa’

A quick verification of the data available in the PALMA corpora shows that Source phrases can occur with other directed motion verbs and are able to cooccur with Goal phrases headed by para, as shown in (34–35).

(34) os pais vieram de Cabo Verde para […], […] (STP, corpus data)
the parents came from Cabo Verde to here right
‘his parents came from Cabo Verde to here, isn’t it’

(35) vieram da província para […] (AP, corpus data)
they.came from.the province to city
‘they came from the province to the city’

So, crucially, what seems to be ruled out is not the cooccurrence of a Source phrase with a Goal phrase, but the cooccurrence of a Source phrase with a Goal phrase headed by em, which seems to support Nikitina’s proposal. However, a further analysis based on more data is required to confirm this hypothesis.

#### 6.3.2. Physical motion vs. non-physical motion readings

We start the analysis with the distinction “physical” vs. “non-physical motion”. Examples (36–41) present these two readings with each preposition.

(36) quando eu cheguei na oitava classe (STP, corpus data, non-physical motion)
when I arrived in.the 8th grade
‘when I got to the eighth grade’

(37) quando ela chegou na […] (STP, corpus data, physical motion)
when she arrived in.the town
‘when she arrived in town’

(38) uma notícia que chega a alguém (AP, corpus data, non-physical motion)
a news that arrives to someone
‘news that reaches someone’

(39) e logo que cheguei ao aeroporto (AP, corpus data, physical motion)
and soon that I.arrived to.the airport
‘and as soon as I arrived at the airport’

(40) vamos agora para gastronomia e culinária angolana (AP, corpus data, non-physical motion)
let’s.go now to gastronomy and cuisine Angolan
‘let’s move on now to Angolan gastronomy and cuisine’

(41) quando eu fui para o norte do país (MP, corpus data, physical motion)
when I went to the north of.the country
‘when I went to the north of the country’

Tables 4 and 5 below show the results for each of the two verbs, by preposition, and by AVP.

Table 4

physical vs. non-physical motion with ir.

| ir | PPa: non-physical motion | PPa: physical motion | PPpara: non-physical motion | PPpara: physical motion | PPem: non-physical motion | PPem: physical motion | total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AP | 6,83%(14) | 22,44%(46) | 12,2%(25) | 27,8%(57) | 1,95%(4) | 28,78%(59) | 100%(205) |
| MP | 12,53%(57) | 42,86%(195) | 5,93%(27) | 28,79%(131) | 1,32%(6) | 8,57%(39) | 100%(455) |
| STP | 7,58%(30) | 25%(99) | 19,44%(77) | 45,96%(182) | 0,25%(1) | 1,77%(7) | 100%(396) |
| Total nr. | 101 | 340 | 129 | 370 | 11 | 105 | 1056 |

Table 5

physical vs. non-physical motion with chegar.

| chegar | PPa: non-physical motion | PPa: physical motion | PPem: non-physical motion | PPem: physical motion | total |
| --- | --- | --- | --- | --- | --- |
| AP | 26,67%(12) | 24,44%(11) | 11,11%(5) | 35,56%(16) | 100%(44) |
| MP | 14,29%(17) | 13,44%(16) | 5,88%(7) | 66,39%(79) | 100%(119) |
| STP | 27,03%(20) | 22,97%(17) | 5,41%(4) | 44,59%(33) | 100%(74) |
| Total nr. | 49 | 44 | 16 | 128 | 237 |

The main conclusions can be summarized as follows. In what concerns the occurrences with the verb ir, the results are not clear-cut, but:

• (i)   both with prepositions a and para, the AVPs show a regular pattern, as non-physical readings correspond to approximately 1/3 of the physical readings (with a: 101 non-physical readings vs. 340 physical readings; with para: 129 non-physical readings vs. 370 physical readings); this proportion seems to be related to the physical/non-physical asymmetry, and not to the prepositions.

• (ii)  with em, there is a sharp contrast, as non-physical readings correspond to approximately 1/10 of the physical readings (11 non-physical readings vs. 105 physical readings).

With the verb chegar, the results are slightly different from the ones with ir, because, as in EP and BP, preposition para does not occur with chegar20; the comparison between prepositions therefore only regards a and em:

• (i)   with preposition a, non-physical readings correspond to approximately the same number of physical readings (a tendency shared among the AVPs): 49 non-physical readings vs. 44 physical readings.

• (ii)  with em there is again a sharp contrast, as non-physical readings correspond to 1/8 of the physical readings: 16 non-physical readings vs. 128 physical readings.
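The proportions quoted in the two lists above follow directly from the “Total nr.” rows of Tables 4 and 5. As a minimal sketch (the names `totals` and `nonphysical_ratio` are ours and purely illustrative), the ratio of non-physical to physical readings can be recomputed per verb and preposition:

```python
# "Total nr." rows of Tables 4 and 5: (non-physical, physical) readings.
totals = {
    ("ir", "a"): (101, 340),
    ("ir", "para"): (129, 370),
    ("ir", "em"): (11, 105),
    ("chegar", "a"): (49, 44),
    ("chegar", "em"): (16, 128),
}


def nonphysical_ratio(non_physical: int, physical: int) -> float:
    """Number of non-physical readings per physical reading."""
    return non_physical / physical


ratios = {key: nonphysical_ratio(*counts) for key, counts in totals.items()}

for (verb, prep), ratio in ratios.items():
    print(f"{verb} + {prep}: {ratio:.3f}")
```

With both verbs, the ratio for em is the lowest of the set, which is the contrast described in the text.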

The detailed results of Tables 4 and 5 are summarized in Table 6, which does not take into account the results by AVP nor by verb.

Table 6

physical vs. non-physical motion by preposition in the AVPs under analysis.

| | a | para | em | total |
| --- | --- | --- | --- | --- |
| non-physical motion | 49,02%(150) | 42,16%(129) | 8,82%(27) | 100%(306) |
| physical motion | 38,90%(384) | 37,49%(370) | 23,61%(233) | 100%(987) |

As expected, given the variation that characterizes the AVPs, the results are not absolutely clear-cut, but they show a tendency: the use of preposition em in predications describing non-physical motion events is substantially reduced compared to the other two prepositions (27 vs. 233 occurrences). On the other hand, prepositions a and para seem to be chosen irrespective of the distinction “physical/non-physical” motion. Fisher’s test applied to the data in Table 6 shows a statistically significant difference for the pairs em/a (p < .001) and em/para (p < .001). In contrast, we found no statistically significant difference between para/a (p = 0.440824) regarding the parameter “physical vs. non-physical motion”. Our data therefore seem to confirm Nikitina’s hypothesis: although there is no threshold dividing different uses of prepositions, the predications with directional em are typically those that correspond to events describing some change in the location of an entity that can be identified by spatial coordinates. On the contrary, if predications express non-physical motion (that is, events of change along a scale, or states denoting a location), prepositions a and para are typically used.
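The pairwise comparisons can be retraced with a small self-contained implementation of Fisher’s exact test for 2×2 tables. This is only a sketch under stated assumptions: the function names `fisher_exact_two_sided` and `compare` are ours, the counts are taken from Table 6, and the two-sided p-value is computed by summing the probabilities of all tables that are no more probable than the observed one, which may differ from the convention of the software actually used for the reported figures.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Table 6: (non-physical, physical) occurrences per preposition.
counts = {"a": (150, 384), "para": (129, 370), "em": (27, 233)}


def compare(prep1: str, prep2: str) -> float:
    """p-value for the 2x2 table formed by two prepositions."""
    return fisher_exact_two_sided(*counts[prep1], *counts[prep2])


for pair in (("em", "a"), ("em", "para"), ("para", "a")):
    print(pair, compare(*pair))
```

Under these assumptions, the em/a and em/para comparisons should fall well below the .001 level, whereas para/a should not reach significance.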

#### 6.3.3. Semantics of the preposition’s complement

In this section, we present the results concerning the semantic type of NP, that is, whether it is a container or an area. The data concerning the cases marked as “undefined” (see 6.3.1 above) were removed, because they do not add any relevant information to our analysis. Moreover, we only analyzed predications marked as “physical” motion (cf. previous section). Corpus-based examples (42–45), with em, illustrate the different combinations.

(42) fui em casa de meu pai (MP, corpus data, ir + container)
I.went in house of my father
‘I went to my father’s house’

(43) quimbanda é aquela pessoa que vai na mata (MP, corpus data, ir + area)
quimbanda is that person that goes in.the woods
‘Quimbanda is that person who goes in the woods’

(44) quando chego no hospital (STP, corpus data, chegar + container)
when I.arrive in.the hospital
‘when I arrive at the hospital’

(45) estávamos a chegar na zona (MP, corpus data, chegar + area)
we.were at arrive in.the area
‘we were arriving in the area’

Tables 7, 8, and 9 present the results according to preposition (em, a, para), except for the combination preposition para + verb chegar for which we only found one occurrence (cf. fn. 20 above), which we did not include in Table 9.

Table 7

Verbs ir & chegar + preposition em.

| ir & chegar, PPem | Container | Area | total |
| --- | --- | --- | --- |
| AP | 52,54%(31) | 47,46%(28) | 100%(59) |
| MP | 72,38%(76) | 27,62%(29) | 100%(105) |
| STP | 77,50%(31) | 22,50%(9) | 100%(40) |
| Total nr. | 138 | 66 | 204 |

Table 8

Verbs ir & chegar + preposition a.

| ir & chegar, PPa | Container | Area | total |
| --- | --- | --- | --- |
| AP | 53,33%(24) | 46,67%(21) | 100%(45) |
| MP | 62,43%(113) | 37,57%(68) | 100%(181) |
| STP | 36,54%(38) | 63,46%(66) | 100%(104) |
| Total nr. | 175 | 155 | 330 |

Table 9

Verb ir + preposition para.

| ir, PPpara | Container | Area | total |
| --- | --- | --- | --- |
| AP | 33,96%(18) | 66,04%(35) | 100%(53) |
| MP | 48,76%(59) | 51,24%(62) | 100%(121) |
| STP | 50,31%(81) | 49,69%(80) | 100%(161) |
| Total nr. | 158 | 177 | 335 |

The results are not clear-cut, but they show that preposition em (Table 7) is more prone to occur with containers than prepositions a and para (Tables 8 and 9), particularly in MP (72,38%) and STP (77,50%) but less so in AP (52,54%).

With preposition a, in Table 8, the overall difference between containers and areas is small (175 vs. 155 occurrences): MP confirms its preference for containers with this preposition (62,43%); STP, on the other hand, shows a preference for areas (63,46%). Finally, with preposition para (and verb ir only), in Table 9, the overall difference between containers and areas is, once again, small, but in this case the preference leans slightly toward areas (158 vs. 177 occurrences), a tendency which is clearer in AP (66,04%).

The detailed results of Tables 7, 8 and 9 are summarized in Table 10, which does not take into account the results by AVP.

Table 10

Containers and areas by preposition in the AVPs under analysis.

| | em | a | para | total |
| --- | --- | --- | --- | --- |
| container | 29,30%(138) | 37,15%(175) | 33,55%(158) | 100%(471) |
| area | 16,58%(66) | 38,95%(155) | 44,47%(177) | 100%(398) |

This table shows that Goal phrases headed by em are better represented among containers than among areas in the AVPs (29,30% of the containers vs. 16,58% of the areas). With respect to PPs headed by the prepositions a and para, the difference between these two types of NP is not clear-cut, with a slight preference of para toward areas (44,47%). In fact, regarding the parameter “container vs. area” in Table 10, Fisher’s test shows the same tendencies noticed before (concerning the parameter “physical vs. non-physical motion”, in Table 6), i.e., statistically significant differences can be observed between the pairs em/a (p = 0.001) and em/para (p < .001), but not between para/a (p = 0.14082). In the next section it will be shown that the tendencies pinpointed in this section are not identical for the verbs under analysis.

#### 6.3.4. Semantics of the verb

Recall that, in Nikitina’s (2008) proposal, the lexical information exhibited by the verb form is a relevant factor: locative PPs with Goal interpretation are found more often with non-durative verbs than with durative ones. We mentioned previously that Portuguese chegar behaves as a non-durative verb, whereas ir is ambiguous between durative and non-durative readings. This difference between these two verbs seems to explain the different results described earlier: since chegar is a non-durative verb, it combines more often with directional em phrases in all three varieties at stake. But the fact that ir is ambiguous between durative and non-durative readings does not favor the directional readings of Goals headed by em. These conclusions seem to be reinforced when we look at the data that consider the distribution of examples using the parameter “type of NP”, shown in Tables 11 and 12 below.

Table 11

NP containers and areas with the verb ir.

| ir | PPem: Container | PPem: Area | PPa: Container | PPa: Area | PPpara: Container | PPpara: Area | total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AP | 15,38%(20) | 17,69%(23) | 13,85%(18) | 12,31%(16) | 13,85%(18) | 26,92%(35) | 100%(130) |
| MP | 7,50%(24) | 2,81%(9) | 31,88%(102) | 20,00%(64) | 18,44%(59) | 19,37%(62) | 100%(320) |
| STP | 1,16%(3) | 1,55%(4) | 11,63%(30) | 23,26%(60) | 31,39%(81) | 31,01%(80) | 100%(258) |
| Total nr. | 47 | 36 | 150 | 140 | 158 | 177 | 708 |

Table 12

NP containers and areas with the verb chegar.

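| chegar | PPem: Container | PPem: Area | PPa: Container | PPa: Area | total |
| --- | --- | --- | --- | --- | --- |
| AP | 40,74%(11) | 18,52%(5) | 22,22%(6) | 18,52%(5) | 100%(27) |
| MP | 59,77%(52) | 22,99%(20) | 12,64%(11) | 4,60%(4) | 100%(87) |
| STP | 59,57%(28) | 10,64%(5) | 17,02%(8) | 12,77%(6) | 100%(47) |
| Total nr. | 91 | 30 | 25 | 15 | 161 |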

The detailed results of Tables 11 and 12 are summarized in Table 13, which does not take into account the results by AVP.

Table 13

Verbs ir and chegar by preposition in the AVPs under analysis.

| | em | a | para | total |
| --- | --- | --- | --- | --- |
| ir | 11,72%(83) | 40,96%(290) | 47,32%(335) | 100%(708) |
| chegar | 75,16%(121) | 24,84%(40) | 0%(0) | 100%(161) |

These results clearly show that, with chegar, a non-durative verb, Goal phrases are typically introduced by em (75,16%). A sharp contrast emerges with ir, a verb that lexically allows both durative and non-durative readings, where the occurrences of Goal phrases introduced by em are substantially reduced (11,72%).21 This does not mean that the type of NP is not relevant: on the contrary, it was shown that with both verbs there are more containers than areas occurring in Goal phrases introduced by em. However, these results show that the variable type of NP appears to be less influential than the variable type of verb. Since only two verbs were analyzed, these conclusions must be confirmed by an analysis that encompasses other verbs of movement.

Despite these limitations, these preliminary results lend overall support to Nikitina’s proposal that one cannot point out one single factor underlying the occurrence of Goal phrases headed by em. Instead, it seems that there are several factors with a different weight that contribute to the observed tendencies: (i) the semantic nature of the NP complement is a relevant factor (containers favor the occurrence of directional em phrases), but (ii) the semantics of the verb is more relevant than the semantics of the NP (chegar, a non-durative verb, favors the occurrence of directional em phrases).

In fact, the application of the “Joint Entropy” (JE) and “Mutual Information” (MI) functions to the data consistently reveals this correlation, as shown in Table 14. These functions (Manning & Schütze, 1999) allow us to quantify the degree of association between two variables. For example, if we have two variables potentially contributing to the ranking/classification of a third one, we can measure which one contributes the most (being the most relevant). JE and MI point in opposite directions. JE measures the level of irrelevance of one variable with respect to another: the lower the value, the more dependence/order exists. MI, on the contrary, measures how far the joint behavior of the two variables departs from randomness, that is, how much information the observed association contains, measured in Shannon bits.22 In this case, the higher the MI, the greater the degree of association between the variables. The application of both functions to the data (including the three prepositions at stake) converges toward the same result, as shown in Table 14: the most relevant parameter is the verb (the highest MI and the lowest JE) and the least relevant parameter is the opposition “container vs. area” (the lowest MI and the highest JE), with the parameter “physical vs. non-physical motion” being in between.

Table 14

Parameter relevance using “Joint Entropy” and “Mutual Information”.

| Parameter | Values | Joint Entropy | Mutual Information |
| --- | --- | --- | --- |
| Verb | ir / chegar | 1,98261 | 0,2602872 |
| Motion | non-physical / physical | 2,29107 | 0,0207105 |
| NP Goal complement | container / area | 2,52796 | 0,0183608 |
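As a minimal sketch of how values of this kind can be obtained, the Python code below computes joint entropy and mutual information (in Shannon bits) from a contingency table of counts. The function names and the counts in `example_counts` are hypothetical and chosen only for illustration; they are not the corpus figures, and the sketch does not claim to reproduce the procedure used for Table 14.

```python
from math import log2


def joint_entropy(table):
    """Joint entropy H(X, Y) in bits for a contingency table (list of rows of counts)."""
    total = sum(sum(row) for row in table)
    entropy = 0.0
    for row in table:
        for count in row:
            if count > 0:  # empty cells contribute nothing (0 * log 0 is taken as 0)
                p_xy = count / total
                entropy -= p_xy * log2(p_xy)
    return entropy


def mutual_information(table):
    """Mutual information I(X; Y) in bits for the same kind of contingency table."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                p_xy = count / total
                p_x = row_totals[i] / total
                p_y = col_totals[j] / total
                mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


# HYPOTHETICAL counts (not the corpus data):
# rows = verb (ir, chegar); columns = preposition (a, para, em)
example_counts = [
    [60, 30, 10],
    [20, 0, 80],
]

print("JE:", joint_entropy(example_counts))
print("MI:", mutual_information(example_counts))
```

Under this reading, each parameter (verb, type of motion, type of NP) would be cross-tabulated against the preposition, and the parameter with the highest MI would be the one most strongly associated with the choice of preposition.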

### 6.4. Summary

To conclude, the data we presented confirm Nikitina’s (2008) hypothesis. Although there is no threshold dividing different uses of prepositions, the following general picture emerges. The use of Goal phrases headed by em in AVPs is favored by semantic factors with different weights:

• (i)   The lexical semantics of the verb: non-durative verb chegar favors em Goal phrases (contrary to ir, which licenses both durative and non-durative readings).

• (ii)  The type of eventuality described: “physical” motion events favor em Goal phrases (instead of metaphorical motion events or fictive motion situations).

• (iii) The type of NP complement of the preposition: containers favor em Goal phrases (instead of areas).

According to our data, (i) appears to play a primary role, (iii) is the least relevant factor, whereas (ii) is somewhere in between. However, further studies that include other verbs of movement are required to confirm this claim.

All in all, we conclude that, typically, with em Goal phrases, the predications correspond to non-durative physical events of movement, i.e., events describing some change in the location of an entity (a change that can be identified by spatial coordinates), and, at the end of the events, the entity is typically located inside the region associated with the entity denoted by the NP complement of the preposition. A semantic account of the occurrence of Goal em phrases could therefore explain the AVP data in the same way that Nikitina (2008) and Kopecka (2009), for instance, account for the English and French data. Furthermore, this analysis is aligned with the analyses of BP data referred to previously. In fact, there is nothing peculiar about the fact that AVPs allow Goal phrases introduced by em. Since this is a crosslinguistically widespread pattern, AVPs behave just like many other languages. EP, on the contrary, constitutes an exception to this regularity, since this pattern is restricted to a few verbs only.23 In sum, we argue that general semantic (and pragmatic) factors must be taken into consideration in the analysis of AVPs, namely when scrutinizing the importance of language contact to explain the occurrence of Goal phrases introduced by em.

## 7. Conclusions

The main goal of this paper was to assess the use of preposition em introducing Goal phrases in urban AVPs, in order to test the hypothesis of language contact as the major cause for this use. To do so, we analyzed examples extracted from spoken corpora of three AVPs (AP, MP, and STP) with two inherently directed motion verbs ir ‘to go’ and chegar ‘to arrive’ and we compared the strategies to express Goal phrases in these AVPs with the ones used in the main contact languages (Kimbundu for AP; Changana for MP; and Forro for STP).

In general, we found that AVPs typically use PPs as Goal phrases and that there is a difference between the verbs: ir generally occurs with the standard EP strategies (a and para), whereas chegar exhibits strong variation between the standard EP strategy with a and the non-standard EP strategy with em. In addition, STP stands out in comparison to AP and MP because of a strong tendency for ir to select para and due to the fact that this variety also exhibits a fair number of cases where chegar selects an NP argument.

It was further shown that the language contact hypothesis lacks explanatory power with respect to the use of prepositions with the verbs ir and chegar across AVPs. The three main contact languages exhibit internally uniform patterns for the verbs ‘to go’ and ‘to arrive’ (PPs headed by ku in Kimbundu; morphological strategies in Changana; and NPs in Forro), which do not match the patterns in the respective AVPs, where PPs are the dominant pattern and ir and chegar exhibit different patterns of variation. Although the semantic hypothesis was not explicitly applied to the contact languages discussed in section 4, they lack the kind of sensitivity to the semantics of the predicate and to the nature of the Goal (area or container) observed in the AVPs with regard to the verbs of directed motion at stake. At best, there is only a very mild correlation between some of the patterns that occur in the AVPs and in the contact languages. In addition, we argued in section 5 that the contact-induced morphological incorporation hypothesis of preposition em, inspired by the Bantu locative classes and originally proposed for MP, lacks empirical motivation.

Given the weaknesses of the contact-induced hypothesis, we put forward a different hypothesis. We assumed, as in Nikitina (2008), that the use of locative prepositions heading Goal phrases can be explained taking into consideration semantic/pragmatic factors. To test this hypothesis, we analyzed the data using the following parameters: the semantic nature of the predication, i.e., whether it describes an event of movement that can be traced using spatial coordinates; the semantic nature of the complement of the preposition, i.e., whether the NP denotes an entity with well-defined boundaries; and the lexical semantics of both verbs, i.e., whether the verb describes a durative or non-durative change of place.

The main findings of this approach are as follows. In the first place, and as expected, there are no clear-cut rules regulating the use of em in AVPs. Instead, directional readings of preposition em in AVPs are favored by the following properties, where (i) was shown to be the most prominent and (iii) the least relevant:

• (i)   the predication corresponds to a non-durative change of place.

• (ii)  the speaker wants to describe an event in which there is an entity that moves toward a reference object, and this movement can be described by a series of spatial coordinates (i.e., a movement in the “real world”).

• (iii) the reference object describing the final location of the mover has well-defined boundaries (containers).

The main conclusion that emerges is that, in the case of the contemporary, urban AVPs, the contact-induced hypothesis is seriously undermined by the empirical data, whereas the occurrence of Goal phrases headed by locative prepositions (such as em) is attested in different, typologically unrelated languages. In other words, we are dealing with a cross-linguistic phenomenon that relies on the mechanisms available in each language to express motion and on general semantic and pragmatic principles.

There are, however, some shortcomings of our analysis that require future research, in particular the number of verbs analyzed (only two). Different types of verbs of movement (durative and non-durative verbs; manner of motion and inherently directed motion verbs) will have to be added to the sample in order to test whether our general conclusions hold regarding the durative vs. non-durative nature of (the lexical semantics of) the verb and its relative weight in the licensing of Goal phrases headed by em. Other aspects of Nikitina’s proposal, namely the hypothesis that Goal-em phrases cannot co-occur in the same predication with Source phrases or with verbs that describe highly specific manners of motion (e.g., to crawl), will be difficult to evaluate, because such contexts occur only rarely in the spoken corpora. A workaround would be to carry out experimental tasks with native speakers of the AVPs. The increase in the number of contexts will also allow us to further assess whether the parameter “type of NP” is indeed the least influential factor in determining the occurrence of Goal phrases headed by em in AVPs.

## Abbreviations

AP = Angolan Portuguese; APPL = applicative; AVP = African variety of Portuguese; BP = Brazilian Portuguese; CONN = connective; DEM = demonstrative; EP = European Portuguese; FRM = formal; FUT = future; LOC = locative; MP = Mozambican Portuguese; NP = noun phrase; POSS = possessive; PP = prepositional phrase; PR = present; PT = past tense; STP = Santomean Portuguese; V = verb; VF = final vowel.

## Notes

1. Other uses of em, especially in AP, are also mentioned by these authors, for example, to introduce Sources and Recipients. In a preliminary survey on Goal and Source arguments of verbs of movement based on a subpart of the same corpus, Hagemeijer et al. (2019) noted that Source arguments are quite consistently introduced by preposition de in all three AVPs. [^]
2. Em contracts with masculine and feminine definite articles (and with other determiners and pronouns), yielding, respectively, no and na (or their plural counterparts nos and nas). [^]
3. The use and distribution of PPs headed by a and para when selected by ir (and other verbs of directed motion) in these varieties is a topic for future research. [^]
4. Fisher’s exact test is a statistical test used to determine whether nonrandom associations hold between two categorical variables, for example between AP and STP. A decision is made based upon the p-value threshold (usually .05). A p < .05 leads to rejecting the null hypothesis of no association, i.e., a significant difference is detected. [^]
5. Other well-known Bantu languages in Angola whose speakers have migrated to the capital Luanda, such as Umbundu and Kikongo, also exhibit three similar locative classes. [^]
6. In fact, Kimbundu exhibits differential object marking, which further follows from the fact that locative marker ku also introduces arguments with the semantic role of Source and Recipient, requiring the presence of morpheme -a when objects are [+ANIM]. [^]
7. Source arguments of directed motion verbs constitute yet another argument against the contact-induced hypothesis, since the contact languages use similar strategies for the selection of Goals and Sources. However, as mentioned in footnote 1, Source arguments in the AVPs are typically headed by preposition de, thereby converging with the pattern in European Portuguese. [^]
8. In addition, the overlap between KB ku and AP em can also be observed in the marking of dative objects (Recipients), which constitutes an additional argument for a greater role of the contact-induced hypothesis (cf. R. Gonçalves, Duarte & Hagemeijer, 2022). [^]
9. Our MP corpus displays no occurrences of para + no(s)/na(s). [^]
10. See, for instance, Chimbutane (2002, pp. 135–142) on the distinction in Changana between locative prefix - and free morpheme ka, which exhibit different morphological, syntactic, and semantic properties. [^]
11. Avelar (2017, p. 27) also mentions that no instances of double prepositions were found in Cabinda Portuguese. [^]
12. In a study carried out by Firmino (2002, apud P. Gonçalves, 2005), it is mentioned that Mozambicans have positive attitudes toward the EP grammar, except toward its pronunciation, which is considered snobbish. [^]
13. Although this clashes with the BP standard (cf., e.g., Avelar, 2017; Farias, 2006; Vieira, 2009). [^]
14. The directional use of em is not a feature of EP. In addition to the fact that this use is not mentioned in reference works (grammars and dictionaries) on EP, it is not found in corpora, such as the Reference Corpus of Contemporary Portuguese (CRPC) or the CETEMPúblico newspaper corpus, and native speakers consider it ungrammatical. On the other hand, EP apparently exhibits the opposite phenomenon, that is, locative uses of the directional preposition para, as in O João está para o Algarve (‘John is (somewhere) in the Algarve’). See Oliveira et al. (2021) for a description of the locative readings of preposition para with different copula verbs (ser, estar, and ficar). [^]
15. “The ambiguity of sentences with in and on between a locative and a directional reading is not observed with all instances of these Ps, though. In particular, only certain verbs such as kick, non-iterative jump, throw, put, fall, among others, henceforth kick-verbs, can trigger a directional reading (…). With other motion verbs like dance, crawl, walk, swim, among others, henceforth swim-verbs, these prepositions only get a locative reading.” (Gehrke, 2007, p. 247) [^]
16. Some examples: rooms, buildings, boxes, cars, water, ground. [^]
17. A transitional zone is “perceived as neither contained by the location nor located outside of it.” (Nikitina, 2008, p. 186) [^]
18. Some examples: cities, mountains, countries, space, forests, neighborhoods. [^]
19. Preposition até alternates with prepositional locution até a, formed by the combination of até and preposition a, according to the nature of the complement: até a is used whenever the NP complement bears a definite article; otherwise, simple preposition até is used (e.g., Raposo & Xavier, 2013, pp. 1504–1505, 1556). Examples (i) and (ii) below show this difference. The noun Lisboa is used without article and therefore only até can occur. The noun Porto, however, requires the use of the masculine, definite article and, concomitantly, the use of até a.
    (i)  Fui    {até   Lisboa / *até   a  Lisboa}.
         I.went {up.to Lisbon / *up.to to Lisbon}
         ‘I went to Lisbon.’

    (ii) Fui    {*até   o   Porto / até   ao     Porto}.
         I.went {*up.to the Porto / up.to to.the Porto}
         ‘I went to Porto (Oporto).’
[^]
20. Only one occurrence in the data. [^]
21. Applying Fisher’s test to the data in Table 13 shows that statistically significant differences can be observed between all pairs of prepositions. Note, however, that the statistical difference between para and a is biased by the fact that only the latter preposition occurs with the verb chegar. [^]
22. The Shannon bit is a unit of information related to the conventional bit (used in computers). While in the latter the value is always an integer (e.g., 2 bits, 64 bits), in the former the value tends not to be (e.g., 0.26 bits, 3.124 bits), since it results from a calculation using the base 2 logarithm. A given number of Shannon bits represents the combined number of bits necessary to encode a number of random events, given the probability of each event. [^]
23. Although the scope of this paper is not EP, this language belongs typologically to the group of languages in which Location and Goal receive differentiated (prepositional) marking, whereas in (spoken) AVPs, as well as in BP, these two semantic roles can receive the same marking. [^]

