Slavic Languages in Psycholinguistics

Tanja Anstatt; Anja Gattnar; Christina Clasmeier

2016

978-3-8233-7969-0

Gunter Narr Verlag

Psycholinguistics explores the anchoring of language in cognition. The Slavic languages are an attractive topic for psycholinguistic studies since their structural characteristics offer excellent starting points for research on speech processing. Research on these languages with experimental methods is, however, still in its infancy. This book provides an insight into current research in this field. On the one hand, a central topic is the question of how Slavic languages can contribute to psycholinguistic findings. On the other hand, all chapters introduce their respective psycholinguistic method and discuss its usefulness and transferability to the Slavic languages. The languages investigated are mainly Russian and Czech; however, other languages (e.g., Polish, Belarusian or Bulgarian) are touched upon as well. The main topics are the characteristics of the mental lexicon, multilingualism, word recognition, and sentence comprehension. Furthermore, several contributions address verbal aspect and aktionsarten as well as other grammatical categories.

Tübinger Beiträge zur Linguistik (TBL) 554, edited by Gunter Narr

Tanja Anstatt / Anja Gattnar / Christina Clasmeier (eds.)

Slavic Languages in Psycholinguistics
Chances and Challenges for Empirical and Experimental Research

Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

This work, including all of its parts, is protected by copyright. Any use outside the narrow limits of copyright law without the consent of the publisher is prohibited and punishable by law. This applies in particular to reproduction, translation, microfilming, and storage and processing in electronic systems.

Printed on acid-free and ageing-resistant paper.

© 2016 · Narr Francke Attempto Verlag GmbH + Co. KG
Dischingerweg 5 · D-72070 Tübingen
Internet: www.narr.de
E-Mail: info@narr.de
Printed in Germany
ISSN 0564-7959
ISBN 978-3-8233-6969-1

Contents

Introduction

The use of experimental methods in linguistic research: advantages, problems and possible pitfalls
Barbara Mertins

How to investigate interpretation in Slavic experimentally?
Roumyana Slabakova

Does language-as-used fit a self-paced reading paradigm? (The answer may well depend on how you model the data.)
Dagmar Divjak, Antti Arppe & Harald Baayen

One experiment – different languages: A challenge for the transfer of experimental designs. Examples from cross-linguistic and inner-Slavic research
Anja Gattnar

Variation in Russian verbal prefixes and psycholinguistic experiments
Anastasia Makarova

Reaction time methodology in psycholinguistic research: An overview of studies on Czech morphology
Denisa Bordag

Some “cases of doubt” in Russian grammar from different methodical perspectives
Elena Dieser

How to study spoken word recognition: evidence from Russian
Julija Nigmatulina, Olga Raeva, Elena Riechakajnen, Natalija Slepokurova & Anatolij Vencov

Are Schalter and šapka good competitors? Searching for stimuli for an investigation of the Russian-German bilingual mental lexicon
Christina Clasmeier, Tanja Anstatt, Jessica Ernst & Eva Belke

Measuring lexical proficiency in Slavic heritage languages: A comparison of different experimental approaches
Bernhard Brehmer, Tatjana Kurbangulova & Martin Winski

Psycholinguistic aspects of Belarusian-Russian language contact. An ERP study on code-switching between closely related languages
Jan Patrick Zeller, Gerd Hentschel & Esther Ruigendijk

Influence of spatial language on the non-linguistic spatial reasoning of sign language users. A comparison between Czech Sign Language users and Czech non-signers
Jakub Jehlička

Index

Notes on Contributors

Introduction

The volume at hand is an outcome of the workshop on empirical psycholinguistic methods “Slavic Languages in the Black Box”, which took place from September 24 to 26, 2014, at the University of Tübingen and was organized by Slavists from the Universities of Bochum and Tübingen [1]. The workshop brought together psycholinguists and Slavists from the Czech Republic, Germany, Great Britain, Norway and Russia. The idea for this workshop was rooted in our shared experience while conducting psycholinguistic research on Slavic languages.
On the one hand, we all face particular methodological problems that seem to be at least partially specific to the experimental investigation of Slavic languages. On the other hand, regardless of these difficulties, we appreciate the benefit of this research, not only for Slavic linguistics but also for psycholinguistics in general. Thus, the aim of the workshop was to discuss these and similar issues with other experts in this field. The multifaceted projects and studies presented at the workshop and our intense discussions of methodological problems attested to the high demand for this type of scientific exchange.

[1] The workshop was organized in the context of the SFB 833: “The Construction of Meaning,” project C2 “Verbal Aspect in Text: Contextual Dynamization vs. Grammar” at the University of Tübingen and received financial support from the Deutsche Forschungsgemeinschaft (DFG). Funding for editing of the volume was provided by the chair of Slavic Linguistics at the Institute for Slavic Studies at the Ruhr University of Bochum.

Research on Slavic languages from a psycholinguistic point of view comes from at least two directions: first, from genuine psycholinguists, who deal with Slavic languages as their material, and second, from Slavists, who “discover” new perspectives on their subject and research methods. Thus, the researchers working in this field approach the topic from different starting points.

From the purely psycholinguistic point of view, the focus is on the general cognitive abilities of humans, which are analyzed using the linguistic material at hand, usually the language by which the respective scientist is surrounded. An example of this point of view is the psycholinguistic controversy about whether syntactic and semantic information is processed sequentially or in parallel during sentence comprehension. Important contributions to this issue were made by Friederici and colleagues for German (cf. Friederici, 2002; Hagoort & van den Brink, 2004); however, they did not intend to obtain specific results for the German language. The development of psycholinguistics as a discipline in Slavic countries was rigidly isolated from that in Western countries for a long time, and the research questions and methods of both differed (cf. Sappok, 1999). However, important contributions to the development of the discipline were made in the 19th and early 20th centuries by researchers such as J. Baudouin de Courtenay, L. S. Vygotskij and A. N. Leont’ev, and later by A. A. Leont’ev and R. M. Frumkina in the Soviet Union and I. Kurcz in Poland, to mention just a few names. In the last few decades, there has been a growing exchange between the different traditions of psycholinguistics, documented in the work of T. V. Černigovskaja and her laboratory in St. Petersburg and by B. Bokus and her chair in Warsaw.

In contrast, psycholinguistic research from the perspective of a Slavist (or, more generally, that of a linguist) tries to gain new insights into a specific language by investigating its processing in the human mind, aiming at findings of a new kind with regard to classical questions and re-analyzing “traditional” issues from the viewpoint of psycholinguistics. A good example is the psycholinguistic perspective on the verbal aspect, which recently has received increasing attention. Roussakova and colleagues examined whether members of Russian aspectual pairs are stored and processed as separate lexemes or as forms of one lexeme (Roussakova et al., 2002, p. 306), obviously motivated by the old controversy over how to describe the relationship between two verbs such as the perfective pomoč’ and the imperfective pomogat’, both meaning “to help.” Presumably, most of the psycholinguistic studies on Slavic languages are derived from the linguistic point of view.
Upon closer examination, we can again observe two different types of studies, mentioned by Sekerina (2006, pp. 20-21). The first and, in the opinion of Sekerina, “easier” way is “to take an existing line of research in English (and other languages) and modify it to accommodate Slavic data” (2006, p. 20). An example is the investigation of the mental lexicon by Feldman (1994). She applied a primed lexical decision task to Serbo-Croatian material to find out whether derived and inflected word forms differ in the way they are represented in the mental lexicon. This “line of research” has a long tradition in psycholinguistic research on morphology and, until then (the end of the nineties), had been applied predominantly to English. The second and, as Sekerina (2006, p. 20) calls it, “more challenging” approach is “to take a phenomenon specific to Slavic and to try to work out the psycholinguistic analysis for it, including choosing a new hypothesis or technique” (2006, pp. 20-21). Examples of this type of work are the doctoral theses by Makavčik (2004) and Clasmeier (2015). Both scholars applied new techniques to obtain adequate insights into the psycholinguistics of the Russian verbal aspect. Applying a new technique is challenging because, in contrast to established experimental designs, many things are unclear in the beginning. The verbal aspect is a striking example of the difficulties the researcher confronts in applying classical questions and methodologies to Slavic languages, and a considerable number of the contributions to this volume deal with this grammatical category.

In general, in the last few decades linguistics has experienced an empirical turn, corresponding to the growing technical possibilities. This turn was further accelerated by the increase in studies on multilingualism, which became an important issue especially for Slavic languages.
Thus, the last few years have witnessed a growing body of research on Slavic languages in migration contexts, trying to find answers to the question of how and why multilingual speakers use their languages in this or that way and what the specifics of processing more than one language look like. The close relationship between multilingualism research and psycholinguistics is demonstrated by Grosjean and Li (2013). A closely related topic that has long been a focus of empirical psycholinguistic research is second language (L2) acquisition, which is again gaining additional relevance in migration contexts. Thus, another large portion of the contributions deal with issues in the field of multilingualism (cf. the work of Kira Gor, e.g., Gor, 2007; Gor, Cook, Malyushenkova, & Vdovina, 2010).

For the workshop and this volume, it is essential to define psycholinguistics not only by its major research questions on language knowledge, processing and acquisition (Rickheit, Sichelschmidt, & Strohner, 2007, p. 15) but also as a discipline based upon empirical and experimental research. An initial overview of cognitive-oriented works in the field of Slavic studies was provided by Irina Sekerina in 2006. Since then, a number of volumes have been published covering the question of what studies on Slavic languages contribute to the investigation of language in the human mind, cf. Cognitive Paths into the Slavic Domain, edited by Divjak and Kochańska (2007), Slavic Linguistics in a Cognitive Framework, edited by Grygiel and Janda (2011) and Die slavischen Sprachen im Licht der kognitiven Linguistik (The Slavic Languages in Light of Cognitive Linguistics), edited by Anstatt and Norman (2010). These volumes, as is clear from their titles, acknowledge that they are based in the research field of cognitive linguistics. However, they contain theoretical as well as empirical work.
Distinguishing between psycho- and cognitive linguistics is, especially in Slavonic studies, anything but a simple task. Both terms are often used in parallel and are not clearly distinguished from each other. Cognitive linguistics became a brand name in the 1980s and 1990s. The term seems to be connotatively clear; however, it has remained without sharp denotative outlines (Knobloch, 2003, p. 26). Therefore, not least in Slavonic studies, the number of surveys assigning themselves to cognitive linguistics is high and eclectic. Thus, for the workshop and this volume, we decided to refrain from using the term cognitive linguistics. Instead, we assigned our work to psycholinguistics, determining our discipline by methodology. We propose that investigations of “human experience or behavior concerning language” (Rickheit et al., 2007, p. 13) which use empirical methods belong to psycholinguistics. Corresponding to this definition, the volume contains classical psycholinguistic studies that draw upon the measurement of behavioral data during language processing, as well as neurolinguistic investigations that study the physiological mechanisms by which the brain processes linguistic information. However, since this volume is dedicated to methodological issues, scientific work dealing with language and cognition by purely theoretical consideration and modeling goes beyond the aims of this book.

Therefore, methodological issues and the peculiarities of psycholinguistic investigations, particularly on Slavic languages, are the recurring themes in the articles in this volume. Each work considers at least some of the following questions:

- What about the research questions, methods and/or results is specific to Slavic languages?
- What are the advantages and the disadvantages of the chosen research method? Which problems does the researcher have to cope with in this respect?
- Did the specific properties of Slavic languages influence the selection of the method? How suitable is the selected method for the specific Slavic research question?
- How do the results fit into general psycholinguistic research? Is there a specific contribution of the Slavic languages?

All contributions in this volume take into account Slavic languages, but the linguistic subareas the articles focus upon differ considerably.

In the opening contribution, Barbara Mertins presents a classification of experimental methods and reveals the benefits and difficulties of offline and online methods in experimental linguistic research on various Slavic languages. With regard to three concrete experimental settings, Mertins discusses the methods of elicitation, eye tracking, memory tasks, and preference judgment tasks for research on language production. Native speakers of various Slavic languages, including Slavic native speakers using a foreign language, make up the main group of participants in her experiments, the focus of which is the effect of grammatical aspect on cognition. Mertins concludes that, with respect to experiment planning, calculating intercoder reliability is, among other things, necessary when coding linguistic data. Furthermore, she contends that for valid results, quantitative analyses are required.

Roumyana Slabakova focuses on difficulties in the experimental investigation of sentence interpretation in Slavic languages. She presents two case studies with Russian native speakers that examine the interpretation of bare plural and mass objects in sentences with perfective or imperfective verbs (case study 1) and the acceptability of fronted objects when they are topics or foci (case study 2). Both studies observe an unexpectedly high variability in the interpretation of the sentences and thus reveal that a multitude of factors affect the participants’ judgments. Finally, Slabakova introduces a specific concept of grammar, which provides an explanation for these findings.
The contribution by Dagmar Divjak, Antti Arppe and Harald Baayen highlights the effects of tense, aspect and mood (TAM) marking for processing in Russian. The authors test experimentally the predictions of a corpus-based model and ask how TAM markers on verbs affect processing in Russian. They illustrate the modeling of experimental data from a self-paced reading experiment on TAM marking on six near-synonymous Russian verbs expressing “try.” In addition, they demonstrate appropriate data analysis by contrasting the linear generalized linear mixed model with the nonlinear generalized additive mixed model.

The article by Anja Gattnar discusses the pros and cons of the adaptation of target sentences for eye tracking and self-paced reading studies for inter-Slavic and cross-linguistic experimental research on the processing of the verbal aspect. Examples from her own studies on the processing of the verbal aspect in languages with (Russian and Czech) and without the grammatical aspect (German) illustrate that a one-to-one translation of target sentences is nearly impossible without syntactic or semantic changes. Gattnar proposes ways out of this dilemma.

The contribution of Anastasia Makarova is concerned with Russian Aktionsarten. Based on two empirical studies on the Russian attenuative and semelfactive Aktionsart, preceded by a corpus study, she analyzes the influence of the factors frequency, morphology and context on speakers’ preference for one of two functionally equivalent morphemes. Makarova thus offers a technique that yields reliable results in dealing with a very typical phenomenon of Slavic languages, namely, morphological variation.

The article by Denisa Bordag concerns a range of reaction time experiments on Czech morphology. Her investigations deal with the perspective of language comprehension and psycholinguistic paradigms.
Using a lexical decision task, morphological repetition priming and two picture-word interference paradigms, Bordag investigates the representation of Czech prefixed verbs, grammatical gender and gender processing in Czech, as well as the representation of the declensional and conjugational classes.

Elena Dieser, similar to Makarova, focuses on variation in Russian grammar. However, while Makarova highlights variation in standard language, Dieser explores “cases of doubt,” fringe phenomena of Russian, which seldom occur and thus allow analysis of how native speakers relate to the periphery of grammar. She shows that in the mental grammar there is no sharp border between “wrong” and “right.” While Dieser explores these phenomena based on grammaticality judgments, the core of the contribution is the comparison of different judgment techniques, from simple questionnaires to the more sophisticated method of thermometer judgment.

Julija Nigmatulina, Olga Raeva, Elena Riechakajnen, Natalija Slepokurova and Anatolij Vencov assume in their contribution that the perceptual system of a listener depends on the phonological system of the language. They investigate Russian spontaneous speech in everyday communication, discuss the method of a dictation task experiment to study the recognition of Russian reduced word forms and point out the role of the context for the interpretation of such reduced word forms. The authors show that for their experimental design the use of spontaneous speech corpora is essential.

The contribution by Christina Clasmeier, Tanja Anstatt, Jessica Ernst and Eva Belke is situated in the context of bilingualism research. The authors present problems they faced in the preparation of stimuli for a study of coactivation phenomena in the bilingual mental lexicon.
As one-word stimuli that matched strict criteria had to be found in two languages, the authors discuss in detail the issues of phonetic similarity, measuring frequency and specifics of the picture-word relations, thus pointing to general problems that must be considered in the preparation of bilingual stimuli.

Bernhard Brehmer, Tatjana Kurbangulova and Martin Winski discuss a closely related issue, which also concerns the bilingual mental lexicon. They present the results of four lexical tasks (picture naming, semantic mapping, translation and verbal fluency) conducted with bilingual Russian-German and Polish-German adolescents. The contribution is thus devoted to the important issue of methodological comparison, as in bilingualism research usually only one test is conducted. While the authors found significant medium to strong correlations for the non-dominant heritage language, no significant correlations were found for the dominant environmental language.

Jan Patrick Zeller, Gerd Hentschel and Esther Ruigendijk make a third contribution to the psycholinguistic investigation of bilingualism. In their article, they discuss the specifics of the language contact situation in Belarus and then report an ERP study on code-switching between the two closely related languages Russian and Belarusian. The researchers built Russian and Belarusian sentences that did or did not contain code switches to the other language, presented them auditorily to 36 young Belarusians and measured the participants’ event-related electrophysiological potentials. The authors found some components (N400) to show similarity with the processing of code-switching between less closely related languages, but other components showed differences that might reflect the specifics of the language contact situation.

The last paper in this volume, by Jakub Jehlička, is situated in research on the interaction between language and cognition (linguistic relativity).
Jehlička examines the influence of spatial language on the non-linguistic spatial reasoning of Czech sign language users. He presents his study, which set out to investigate the influence of different factors, such as the subjects’ gender and their competence in Czech sign language, on accuracy in a mental rotation task. He reports interim results on the group of Czech hearing subjects (without competence in Czech sign language) and compares them to the findings of the seminal study by Emmorey et al. (1998). Finally, he discusses several challenges in developing an appropriate design for this type of experimental work.

This volume was realized with the help and support of many people. First of all, we would like to thank all of the workshop participants for their commitment and seminal discussions of our common topic. Most of them have contributed a chapter to this volume. We are grateful for the authors’ promptness and flexibility that made working on this volume a pleasure. Sincere thanks are due to Joshua Bebout for the English proofreading. Special thanks go to Anke Luislampe and Natalie Müller for their thorough work and tireless commitment in formatting this volume. Our gratitude is also extended to Tillmann Bub of Narr Francke Attempto Publishing House for his support and professional advice.

Tanja Anstatt, Christina Clasmeier & Anja Gattnar

References

Anstatt, T., & Norman, B. (Eds.). (2010). Die slavischen Sprachen im Licht der kognitiven Linguistik. Wiesbaden: Harrassowitz.
Clasmeier, C. (2015). Die mentale Repräsentation von Aspektpartnerschaften russischer Verben. Leipzig: BiblionMedia.
Divjak, D., & Kochańska, A. (Eds.). (2007). Cognitive paths into the Slavic domain. Berlin: De Gruyter.
Emmorey, K., Klima, E. S., & Hickok, G. (1998). Mental rotation within linguistic and nonlinguistic domains in users of American Sign Language. Cognition, 68, 221-246.
Feldman, L. B. (1994). Beyond orthography and phonology: Differences between inflections and derivations. Journal of Memory and Language, 33, 442-470.
Fernández, E. M., & Smith Cairns, H. (2011). Fundamentals of psycholinguistics. Chichester: Wiley-Blackwell.
Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6(2), 78-84.
Gor, K. (2007). Experimental study of first and second language morphological processing. In M. Gonzalez-Marquez, I. Mittelberg, S. Coulson, & M. J. Spivey (Eds.), Methods in cognitive linguistics (pp. 367-398). Ithaca, NY: Benjamins.
Gor, K., Cook, S. V., Malyushenkova, V., & Vdovina, T. (2010). Russian verbs of motion: Second language acquisition and cognitive linguistics perspectives. In V. Driagina-Hasko & R. Perelmutter (Eds.), Multiple perspectives on Slavic verbs of motion (pp. 361-381). Amsterdam: Benjamins.
Grosjean, F., & Li, P. (2013). The psycholinguistics of bilingualism. Chichester: Wiley-Blackwell.
Grygiel, M., & Janda, L. A. (Eds.). (2011). Slavic linguistics in a cognitive framework. Frankfurt am Main: Lang.
Hagoort, P., & van den Brink, D. (2004). The influence of semantic and syntactic context constraints on lexical selection and integration in spoken-word comprehension as revealed by ERPs. Journal of Cognitive Neuroscience, 16(6), 1068-1084.
Knobloch, C. (2003). Geschichte der Psycholinguistik. In G. Rickheit & T. Herrmann (Eds.), Psycholinguistik. Ein internationales Handbuch Bd. 24. Handbücher zur Sprach- und Kommunikationswissenschaft (pp. 15-33). Berlin: De Gruyter.
Makavčik, V. O. (2004). Vido-vremennaja kategorizacija russkogo glagola v jazykovom soznanii nositelej russkogo jazyka. Tomsk: Avtoreferat.
Rickheit, G., Sichelschmidt, L., & Strohner, H. (2007). Psycholinguistik. Tübingen: Stauffenburg.
Roussakova, M., Sai, S., Bogomolova, S., Guerassimov, D., Tangisheva, T., & Zaika, N. (2002). On the mental representation of Russian aspect relations. In S. Bendjaballah, W. U. Dressler, O. E. Pfeiffer, & M. D. Voeikova (Eds.), Morphology 2000. Selected papers from the 9th Morphology Meeting, Vienna, 24-28 February 2000 (pp. 305-312). Amsterdam: Benjamins.
Sappok, C. (1999). Zur Psycholinguistik in Russland. In H. Jachnow (Ed.), Handbuch der sprachwissenschaftlichen Russistik (pp. 1191-1214). Wiesbaden: Harrassowitz.
Sekerina, I. (2006). Building bridges: Slavic linguistics going cognitive. In S. Franks, E. Andrews, R. Feldstein, & G. Fowler (Eds.), Slavic Linguistics 2000: The future of Slavic Linguistics in America. Glossos, 8.

The use of experimental methods in linguistic research: advantages, problems and possible pitfalls

Barbara Mertins

Abstract: In the present paper I will present and discuss several experimental methods used inside and outside psycholinguistic research. The overall focus will be on language production. The methods presented include different elicitation techniques, eye tracking, memory tasks and preference judgment tasks. On the basis of my own experimental data, I will describe the main features of these methods, comment on their suitability for various linguistic research questions, and explore some of their advantages, shortcomings and limitations. The paper addresses general methodological issues and challenges, going beyond the research conducted on Slavic languages. However, all the studies presented and discussed here are based on data collected from native speakers of various Slavic languages. In addition, two studies address language production of Slavic native speakers in a foreign language. The paper concludes with general remarks on the use of experimental methods and statistics in linguistic research.

1 Introduction

The use of experimentally based methods and techniques has become fairly popular in linguistics.
Researchers from different linguistic areas employ methods originating in experimental linguistics and psycholinguistics to test various linguistic theories, models and concrete research hypotheses. The choice of a method or a methodological approach always means commitment to a particular experimental design. This choice in turn puts specific requirements on the stimulus material or the selection of participants and has, in the end, consequences for data coding and analysis. It is therefore essential to have some knowledge of the advantages and disadvantages of a particular method before employing it in an experiment.

This paper is structured as follows: Section 2 presents a classification of experimental methods and discusses their advantages and disadvantages. Section 3 focuses on selected aspects of language production research and introduces methods employed in my own research. Section 4 comprises three studies chosen from my own research on the basis of which different methods are explained and evaluated. This section also includes relevant details concerning the design of an experiment. The article ends with a set of conclusions.

2 A classification of psycholinguistic methods

Experimental methods can be classified in different ways (cf. Höhle, 2010; Müller, 2013; Vanpatten & Jegerski, 2014). In this paper three method types are distinguished (cf. Schmiedtová & Flanderková, 2012): (1) offline methods; (2) online methods; (3) true online methods. I will concisely describe the different types and provide examples for each of them. Then I will elaborate on some pros and cons of the methods and make some general remarks on their suitability for linguistic research. The terms offline/online relate to the degree to which a given method reflects the underlying mental and/or neuronal process under study. The offline methods focus on speakers’ linguistic competence, whereas the online methods concentrate more on speakers’ performance.
(1) The offline methods have no direct access to a mental process and reflect conscious decision-making. The tasks are solved with a delay in time. A good example of an offline task is a paper-and-pencil questionnaire (which can also be administered in a more modern manner as a web-based task) or object naming, a method frequently used with special participant groups, such as aphasic patients. (2) The online methods, the second group, are characterized by mediated access to underlying mental processes. These processes are more automatized and unconscious. The participants have to solve an experimental task with only a short time delay. Examples of these methods are reaction time experiments [1] or eye-tracking, both methods frequently used in psycholinguistic research. (3) The last method type comprises the true online methods. These methods have immediate access [2] to the relevant process and can assess highly automatized and unconscious mental and neuronal processes. Functional magnetic resonance imaging (fMRI) or electroencephalography (EEG) with the measurement of event-related potentials (ERP) are examples of these methods.

[1] In some other classifications reaction time experiments are considered an offline method. These classifications do not differentiate between online and true online methods. Instead they collapse all behavioral methods into one method type (offline) and keep the online method type only for electrophysiological and neuroimaging methods.

[2] Researchers in cognitive sciences and neurolinguistics assume that, even in the case of true online methods, the access to the relevant neuronal processes is delayed. This is perhaps true, but not relevant for the purpose of this paper.

[Figure 1: Methods in relation to time]

In the following paragraphs I will describe the advantages and disadvantages of the three method types.

Offline methods: A problem when using offline methods is that the researcher has no or little control over the course of the data collection. This is especially true when the data are collected via an internet-based questionnaire. However, when this approach is employed, it is usually possible to measure the time the participants need to finish the task. When an excessive amount of time is used, participants can be excluded from the data set. A big advantage of the offline methods is that (very) large amounts of data can be collected at once. Their use also means easy logistics and almost no costs. The offline methods aim at testing participants’ competence and are suitable for getting (first) insights into participants’ linguistic preferences or grammatical judgments. The data from an offline task can help to generate specific hypotheses for an experiment. [3]

[3] Offline methods can also be used for testing a specific linguistic hypothesis. This depends on the aim and the research question of the individual study.

Online methods: One disadvantage of the online methods is a certain slowdown, due, for example, to the speed of hand movements, when measuring reaction times in a lexical decision task. But also in eye-tracking there is a slowdown
Online methods make relatively high organizational demands (one person per recording session), which results in smaller samples. Considering that one needs about twenty participants to run a proper statistical analysis, this is not a trivial point, especially in the case of cross-linguistic research. In contrast to the offline methods, the researcher has more control over the experimental procedure and the task execution. The online methods are suitable for testing unconscious and automatized mental processes, with a focus on participants’ performance. These methods are a good tool for testing concrete linguistic hypotheses.

True online methods: These methods make considerable organizational and financial demands. This is why studies using true online techniques are usually based on data from only a few subjects. Another disadvantage of these methods is a rather heavy dependency on “hidden” statistical procedures and calculations. This means that various tools used for data analysis and visualization (e.g., Brain Voyager for functional and structural MRI data sets) operate on many preset defaults that are nearly impossible for the researcher (let alone a layperson) to retrace and understand. Moreover, because of the complexity of the entire experimental protocol, only simplified experimental designs can be employed. Additionally, there are serious technical restrictions on task execution (e.g., on free language production). An advantage of these methods is the possibility to study highly automatized and unconscious mental and neuronal processes. Depending on the type, these methods are suitable for investigating the time course of language processing (e.g., ERP in EEG) or the localization of language skills (e.g., PET, fMRI). In my opinion, the true online methods should only be used when behavioral methods can no longer provide answers.
3 Language production research

Before going into a more detailed description of the various methods I use in my own research, I would like to mention a number of points related to the study of language production in general. This research area has a long tradition in linguistic research (for an overview see Carroll, 2008). It deals with all phenomena linked to the production of spoken and written language in different populations. There are some challenges to be dealt with, especially when studying spoken language: The logistical and technical requirements are quite high since participants must be tested individually. As mentioned above, a general rule is to have data from at least twenty subjects to carry out a good statistical analysis. However, experience shows that because of unexpected technical problems and other sources of data loss, one must record about 30% more participants than the minimal number required for a proper statistical analysis.

Another point is the comparability of experimental settings: The entire experiment is almost never run by one and the same person. But even if only one person is in charge, it is very important to keep the experimental protocol as consistent as possible across individual recordings and to minimize variations in the experimental procedure, including the instruction and the interactions between participants and investigator.

As in any other research, the creation of a good stimulus set is an essential prerequisite for a well-designed study. Going into all the aspects that need consideration when creating stimulus material would go beyond the scope of the present paper. Also, different research questions and experimental designs call for different stimulus material.
But in general, the following points should always be taken into account in order to avoid undesirable side effects and biases: a sufficient number of fillers (a rule of thumb: twice as many fillers as test/critical items), control over word frequency and word length (optionally also the number of syllables), control over the degree of concreteness/abstractness, and awareness of intercultural aspects.

Last but not least: The transcription of spoken language (audio data) is very time-consuming. This needs to be taken into consideration when planning a language production study.

The methods and tasks I use in my own research are: elicitation, memory tasks, eye-tracking, the measurement of speech onset times (SOT), and preference and grammatical judgment tasks. Depending on the study, the methods are used either alone or in combination. Due to the scope of this paper I cannot explain the SOT method and grammaticality judgment tasks. As for elicitation, I will describe this method in more depth because it is widely used in experimental linguistics.

(1) Elicitation 4 is, in very general terms, the act of obtaining specific language data from another person. Depending on the time constraint, it qualifies either as an offline or an online method. There are many areas of linguistic research for which this method is fitting. For example: (a) elicitation of a particular linguistic structure (e.g., case; cf. Dąbrowska et al., 2006); (b) elicitation of a phenomenon rarely occurring in spontaneous speech (e.g., simultaneity marking; cf. Schmiedtová, 2004); (c) testing of a specific hypothesis (e.g., determiners before tense marking in child acquisition; cf. Wittek & Tomasello, 2002).

4 In the present paper, the term elicitation is confined to eliciting linguistic structures.

Data can be elicited in different ways, for example by context restrictions, stimulus manipulations, or creation of minimal pairs.
Depending on the research question and the population to be studied, pictures, picture books, audio recordings, written texts or video clips can be employed. Pictures are often used to elicit children’s language (cf. Clark, 2009). The well-known picture book “The Frog story” has been a popular elicitation tool over the past decades in many different contexts and all kinds of populations (cf. the CHILDES database: http://childes.psy.cmu.edu). Audio recordings serve as stimuli in phonetics and phonology research. Written texts can be employed, for instance, to study production in association tasks (cf. Glucksberg & Danks, 2013 for an overview). I use video clips for eliciting spoken language in adult native and second language speakers.

(2) Another method I employ in my own research is the memory task. Memory tasks are well suited to collecting non-linguistic data, an important complement to linguistic data when examining linguistic relativity (for more detail see section 4). The memory task in my own research was administered to the participants with a time delay, which is why it qualifies as an offline method.

(3) Eye-tracking is an online method that makes it possible to study eye movements and to test hypotheses concerned with the allocation of visual attention (e.g., for testing the effects of language on cognition). I combine eye-tracking with (4) measurements of speech onset times (SOT) 5, another online method providing an insight into the planning processes taking place just before a participant begins to speak.

Two additional methods can be found in my own research: (5) the preference judgment task and the grammatical judgment task 6. Both tasks belong to the offline group and are suitable for testing a particular linguistic phenomenon in larger or specific populations.
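The planning rules of thumb mentioned in this section (data from at least twenty participants, about 30% additional recordings to absorb data loss, and twice as many fillers as critical items) amount to simple arithmetic. The following Python sketch is only an illustration: the function names, the default values and the decision to round up are assumptions made here, not part of the procedures used in the studies.

```python
import math


def recordings_needed(min_analysable: int = 20, extra_percent: int = 30) -> int:
    """Number of participants to record so that, after an expected data loss,
    about `min_analysable` usable data sets remain.

    `extra_percent` is the rule-of-thumb surplus (30% in the text).
    Rounding up is an assumption of this sketch.
    """
    return math.ceil(min_analysable * (100 + extra_percent) / 100)


def fillers_needed(n_critical: int, ratio: int = 2) -> int:
    """Number of fillers under the rule of thumb 'twice as many fillers
    as critical items' (ratio = 2)."""
    return ratio * n_critical


if __name__ == "__main__":
    print(recordings_needed(20))  # 26
    print(fillers_needed(10))     # 20
```

Such a calculation only gives a lower bound for planning; actual data loss depends on the equipment and the population tested.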
5 For more detail on the measurement of speech onset times see Schmiedtová (2011a).

6 Grammatical judgment tasks are commonly used in adult native as well as L2 speakers for testing grammatical acceptability. In my research, this task was employed for testing grammatical knowledge of patients with Broca aphasia (Flanderková, Mertins, et al., 2014).

4 Examples from my own research: research questions and the use of different methods

In this section, I will present and discuss a number of studies carried out either by myself or in cooperation with colleagues. The focus will be on how selected methods introduced in section 3 are applied in concrete experimental settings. I will critically discuss different aspects linked to the planning of an experiment (e.g., choice of participants, experimental procedure, determining level of proficiency in an L2, and creation of stimuli). When relevant, I will point out possible problems and pitfalls.

Study 1: Elicitation (Schmiedtová & Sahonenko, 2008)

Elicitation has been employed frequently in my own research (cf. Schmiedtová & Sahonenko, 2008; Schmiedtová, 2011, 2011a; v. Stutterheim et al., 2012; Schmiedtová, 2012, 2013, 2013a). In the majority of these studies, elicitation was used in combination with other tasks (for more detail see below). In the article by Schmiedtová & Sahonenko (2008) elicitation was the only method employed. Because of this, I will present and discuss this study in more detail.

The focus of Schmiedtová & Sahonenko (2008) was to examine the role of grammatical aspect and tense in the encoding of goal-oriented motion in adult native speakers (L1) of Czech, Russian, German and very advanced second language speakers (L2+ 7) of German with L1 Czech or Russian. Based on previous work on German, English, French, and Italian (e.g., Carroll & v. Stutterheim, 2002; v. Stutterheim & Carroll, 2003; v.
Stutterheim & Lambert, 2005), the research question was how and to what extent core grammatical categories determine how information is selected and structured in dynamic contexts. The related L2 research question was concerned with the restructuring of conceptual knowledge 8, i.e. with the question of to what degree near-native L2+ speakers are able to learn to reorganize conceptual knowledge (e.g., encoding of motion events) towards the target language pattern.

7 The abbreviation L2+ refers to second language speakers who speak the target language as their third, fourth or even fifth language. It reflects the fact that European L2 speakers (and participants in our studies) are often multilingual and German is not always their second foreign language.

8 The explanation and discussion of the terms conceptual restructuring and conceptual knowledge can be found in Schmiedtová (2011a, 2013).

We used 40 short video clips depicting different goal-oriented motion events (critical items) and homogenous activities serving as fillers (distractors) for the elicitation of spoken data. The length of the clips varied. The stimulus material appeared automatically on a laptop screen in random order with a five-second blank in between. The participants’ task was to start to speak as soon as they knew what was happening in the clip. The question in the instruction was presented in the present tense (German: Was passiert?; Czech: Co se děje?; Russian: Čto proischodit?). In order to ensure comparable conditions across recordings, a standard experimental procedure was developed and set down in written text to be repeated in every session. We also controlled for the effect of language mode 9 (cf. Grosjean, 1998). To make sure that participants were exposed only to the tested language during the recording, only a native speaker of this language (Czech, German, Russian) was present at the recording and interacted with the participant 10.
The audio data were digitally recorded, transcribed and coded by the investigators. The coding scheme comprised the coding of grammatical aspect, tense, and reference to endpoints. In order to calculate intercoder reliability 11 for data from each language, we asked another linguist (who was also a native speaker of that language) to code large parts of the data. This way, there were three coders for each language (the authors of the study and an additional linguist). For the data analyses we used a combination of qualitative and quantitative (statistical) tools.

Thirty native speakers for each L1 as well as 30 advanced L2+ speakers of German (15 with Czech L1, 15 with Russian L1) were recruited for this study. All participants were comparable in terms of socio-economic and educational background 12. The native speaker data were collected in the respective countries. All native speakers were students. All native and L2+ speakers were between 20 and 30 years old (average age 24.6 years). The L2+ speaker data were collected in Germany (Russian L2+ speakers of German living in Heidelberg) and in the Czech Republic (Czech speakers of L2 German living in Prague 13). All L2+ speakers were either students of German in higher semesters or professionals (e.g., interpreters, translators, German language teachers).

9 A number of previous studies have shown that the choice of language mode can have a great impact on language processing in bi- and multilingual speakers (cf. Soares & Grosjean, 1984; Cenoz et al., 2001; van Hell & Dijkstra, 2002).

10 The control of language mode is very important. It is, however, a question to what extent (and if at all) one can make bi- and multilingual speakers “switch off” the language(s) that is/are not being actively used at a particular moment. For example, in two eye-tracking experiments Marian & Spivey (2003) have demonstrated that the non-active language affects spoken language processing in bilingual speakers.

11 The calculation of intercoder reliability (or intercoder agreement) is in my opinion an absolutely essential prerequisite for any study of linguistic data. I will elaborate on this point in the concluding part of this paper.

12 To ensure the comparability of these variables and to create homogenous participant groups we developed a biographical questionnaire that participants had to fill out before the experiment.

13 At that point in time, we were unable to find enough very advanced L2 speakers of German with L1 Czech living in Heidelberg or nearby.

Since our study dealt with language production of advanced and very advanced L2+ speakers, we had to ensure that the proficiency level in German was comparable across participants. Assessing the degree of proficiency in advanced L2+ speakers is certainly a challenge. To my knowledge, only a few studies dealing with topics concerning near-native L2+ speakers (e.g., ultimate attainment issues) have made an effort to lay out their procedures for determining advancedness 14. I think that this is an unfortunate situation that needs to be changed, as such studies should make sure that the L2(+) participants are near-native in the target language. In our study, we used a combination of linguistic and extra-linguistic criteria for establishing the advancedness of an L2+ speaker.

(1) Excellent language knowledge: This parameter was established on the basis of a warm-up interview that was recorded and later transcribed. We qualified only those speakers as advanced who made no grammatical errors in agreement, word order and inversion. Some article errors were tolerated. On the basis of this criterion we excluded three participants from the study.
(2) Active use of German in everyday life: We only included speakers who indicated in the biographical questionnaire that they use German as their dominant language in daily life. Dominant was defined as use in at least 70% of all everyday situations (also for the Czech participants). We did not exclude any participants on the basis of this criterion.

(3) An early onset of acquisition: More than 60% of the L2+ speakers in our study started to learn German as a foreign language in primary school, i.e. around the age of 10.

(4) A longer stay in a German-speaking country: At the time of the experiment, all L2+ speakers with L1 Russian had been living in Germany for at least four years. For the Czech L2+ speaker group the criterion was a minimum two-year sojourn in a German-speaking country.

(5) Highly tutored L2 acquisition: All L2+ participants learned German in school at a certain point in their life (the average length of school tutoring was 4.7 years).

Applying these five criteria we were able to put together two very homogenous and comparable L2+ groups with perfect or near-native command of German. The majority of the participants (80%) were female.

In summary, the online elicitation task with video clips serving as stimuli was a suitable method to study the research questions examined in Schmiedtová & Sahonenko (2008). Furthermore, this study clearly demonstrated that the widespread notion of “Slavic aspect”, which often only includes the Russian system, must be further differentiated. This is further supported by the next study discussed in the present paper (v. Stutterheim et al., 2012), which shows that these differences exist not only in the linguistic but also in the underlying conceptual system.

14 In some studies self-assessment is used as the only measure of language proficiency. I find this problematic since self-assessment is a very subjective and culturally dependent measurement (cf.
MacIntyre et al., 1997 discussing biases in self-rating of language proficiency and the role of anxiety).

Despite all these positives, I would like to point out several problems linked to the design of the study: The choice of stimulus material was suboptimal since the video clips were not controlled for length, type of protagonist (person, animal, object, vehicle, etc.), the direction from which the protagonist appears (left vs. right), or intercultural aspects (e.g., a clip with a typical yellow German mailbox was used which was not immediately recognized by speakers of languages other than German). This clearly disadvantaged those speakers. The number of fillers was too low: Only about a third of the stimulus set consisted of distractors.

Yet another difficulty emerged with the instruction text used. As mentioned above, the participants were asked to say what was happening in the clip. We did not, however, instruct them explicitly to concentrate on the event. This imprecision led to some participants producing descriptions of the protagonists, the surrounding environment, etc., rather than the event depicted in the clip. These texts had to be excluded from the analysis because they did not follow the posed quaestio (v. Stutterheim & Klein, 1987). A helpful workaround would have been to pilot the instruction before the experiment and adapt it accordingly.

Another problematic point was that the two L2+ groups differed in the amount of exposure to German. However, this factor was taken into account when comparing the groups statistically. With respect to the analyzed categories (number of endpoints and the use of tense) no relevant between-group differences were found in terms of the country of residence (Prague vs. Heidelberg).

The last point of criticism concerns the number of L2+ speakers in Schmiedtová & Sahonenko (2008).
Because it was not possible at the time of the study to recruit more than fifteen L2+ speakers per group, there were not enough data to perform all statistical analyses. This problem, of course, will always come up when studying participant groups, such as atypical populations (e.g., SLI children) or near-native L2+ speakers, for which there is no “endless” pool of possible subjects one can recruit from (as is the case for native speakers). Nevertheless, as mentioned already, fifteen subjects is not a sufficient number of speakers to do a thorough statistical analysis.

In a follow-up study 15 (v. Stutterheim et al., 2012) all these shortcomings were removed and the design was improved. These improvements are explained below.

15 The problems of the low number of L2+ participants and different country of residence at the time of testing were removed in another set of studies (Schmiedtová, 2011, 2013) in which elicitation was used either alone or in combination with eye-tracking and a memory task to study language production of near-native L2+ speakers.

Study 2: Non-linguistic Tasks - Memory Task & Eye-Tracking (v. Stutterheim et al., 2012)

The study by v. Stutterheim et al. (2012) combined the elicitation of spoken data with a simultaneous recording of eye movements and a subsequent memory task. In contrast to the previous study (Schmiedtová & Sahonenko, 2008), this paper examined only native speakers. The general research question was concerned with the effects of language on cognition, i.e. with testing the thinking-for-speaking hypothesis (Slobin, 1996) and the seeing-for-speaking hypothesis (Carroll et al., 2004; Schmiedtová et al., 2011). The study focused on the encoding of endpoints in goal-oriented motion events in Czech, Dutch, English, German, Russian, Spanish, and Modern Standard Arabic.
Compared to Schmiedtová & Sahonenko (2008), the stimulus material had been improved with respect to the following aspects: the number of fillers, standardized video clip length, control of type and appearance of the protagonist, and intercultural usability. In total, 60 short video clips, comprising 10 critical items, 10 control items and 40 fillers, were filmed 16 for the purpose of this study. The critical clips showed goal-oriented motion events in which a potential endpoint was not reached within the duration of the clip (e.g., two persons walking on a pathway, with a building in the background). The control items depicted goal-oriented motion events with an endpoint reached before the end of the clip (e.g., a vehicle going along a street, turning and disappearing into a garage). The fillers showed 30 activities with causative events (e.g., a person making a necklace) and 10 static scenes (e.g., a candle burning). The video clips were six seconds long. The number of clips depicting people, animals and vehicles was comparable. The direction of the appearance of the protagonist (left vs. right) was equally distributed across all critical and control items. In addition, all videos were piloted before the experiment with about 100 students with different language and cultural backgrounds to ensure their intercultural transferability.

16 The video clips were filmed and cut over the course of three months by members of a research group at the University of Heidelberg. Twenty clips in total were made and they were all piloted and pretested. Only ten were selected for the experimental stimulus set.

The task for the participants was to verbalize what was happening in the clip. As in the other study (Schmiedtová & Sahonenko, 2008), the emphasis was on depicting the event and the question was posed in the present tense. The instruction text was improved and included an explicit request not to verbalize any descriptions and to concentrate solely on the event. The text was translated by native speakers into all languages and presented to the participants first orally and then in written form. The experimenter was a native speaker of the language tested (cf. control of language mode). Each experimental session was preceded by six test items covering all test categories. The experimental items were presented automatically on a computer screen, in a pseudo-randomized order, with an eight-second interval in between to give participants sufficient time to finish their verbalization 17. The elicitation and eye-tracking data were recorded simultaneously. Each recording session took approximately 15 minutes. After that, participants were asked to fill out a biographical questionnaire designed on the basis of the questionnaire used in Schmiedtová & Sahonenko (2008). Subsequently, and without prior announcement, a memory task was administered to the participants (see below for more detail on the design of the memory task). This task took between two and five minutes to finish.

For this study, we recorded data from twenty subjects per language, i.e. from 140 participants in total 18. For logistical reasons all data had to be collected in Heidelberg. The speakers of Arabic, Czech, Dutch, English, Russian and Spanish were participants in a summer school at the University of Heidelberg, and had no or very little knowledge of German. To minimize the exposure to German, the subjects were recorded in the first five days of their stay in Heidelberg. Every effort was made to ensure that all speakers were as “monolingual” as possible, with English being the only foreign language all participants were able to speak (at different proficiency levels). All participating subjects, including native speakers of German, were matched in terms of socio-economic background and were students or postgraduates, aged 20-35 (average age 26.7 years). The groups were balanced for gender and all participants had normal or corrected vision.
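The text does not state which constraints defined the pseudo-randomized presentation order. A common convention, assumed here purely for illustration, is that two critical items never follow each other directly. The Python sketch below builds a list with the item counts given above (10 critical, 10 control, 40 fillers) and reshuffles it until that constraint holds; all names and the constraint itself are hypothetical and do not reproduce the authors' procedure.

```python
import random


def build_stimulus_list():
    """Item labels with the counts reported for the study:
    10 critical, 10 control and 40 filler clips (60 in total)."""
    critical = [("critical", i) for i in range(10)]
    control = [("control", i) for i in range(10)]
    fillers = [("filler", i) for i in range(40)]
    return critical + control + fillers


def pseudo_randomize(items, is_critical, seed=None, max_tries=10_000):
    """Shuffle `items` until no two critical items are adjacent.

    The adjacency constraint is an assumption of this sketch.
    A seed makes the resulting order reproducible.
    """
    rng = random.Random(seed)
    order = list(items)
    for _ in range(max_tries):
        rng.shuffle(order)
        adjacent = any(
            is_critical(a) and is_critical(b) for a, b in zip(order, order[1:])
        )
        if not adjacent:
            return order
    raise RuntimeError("no admissible order found; relax the constraint")


if __name__ == "__main__":
    order = pseudo_randomize(
        build_stimulus_list(), lambda item: item[0] == "critical", seed=1
    )
    print(len(order))  # 60
```

Simple reshuffling is sufficient here because fillers clearly outnumber critical items; with stricter constraints a constructive procedure would be preferable.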
The elicited linguistic data and the memory data were transcribed and coded by respective native speakers. The linguistic analyses included the coding for temporal/aspectual categories and reference to endpoints. The transcriptions and the coding schemes for these two tasks were checked for consistency by a second researcher.

17 The length of the interval between clips had also been tested in a pilot study. In a study by Schmiedtová (2013b) the elicitation of spoken data was performed under time pressure, so the blank between the presented clips was reduced to three seconds. Such a design aims at eliciting highly automatized responses and presents another good method for studying participants’ performance.

18 This number refers to subjects whose data were used for the analysis. The actual number of participants recorded was much higher (see above for a detailed discussion of data loss).

The eye-tracking data included the following measurements: the total fixation count within the area of interest 19, the total fixation duration, and the number of first and second periods of fixation. All eye-tracking analyses were run with average measures across participants as well as averages over items.

The memory task comprised fifteen color screenshots in which a specific part was cut off. There were ten critical items in which the endpoint was removed and five control items where a random object was missing. The control items were used to control for general memory performance. The task for the participants was to write down as fast as possible and in only one or a few words what exactly was cut out.

Before evaluating the experimental design and the suitability of the chosen methods in v. Stutterheim et al. (2012), I would like to make several general comments on the use of non-linguistic methods and tasks for testing the linguistic relativity hypothesis.
A question to raise here is: What counts as an effect of language on thought/cognition 20? In other words, how can it be ensured that the observed effects reflect the influence of language on thought? It is not uncommon in linguistic and anthropological research, from which the linguistic relativity theory has emerged, to assume (or even to claim) cognitive differences solely on the basis of variations in linguistic data. Differences in linguistic form, for instance, are very relevant and may lead to finding differences in cognition, but not necessarily (cf. Lucy, 1996, an excellent article with relevant methodological thoughts and hints for the study of the relation between language and thought). When linguistic differences are found, usually by means of various behavioral (offline or online) methods, one must employ yet another method to make sure that diversity in language leads to differences in thinking. To this end, a number of methods (e.g., eye-tracking, memory tasks as used in v. Stutterheim et al., 2012) and non-linguistic tasks can be employed (e.g., sorting, matching, classification, or categorization tasks as used in Lucy, 1992 or Levinson et al., 2002). I believe that combining data from behavioral tasks with data from non-linguistic tasks and methods is the only way to (a) show actual effects of language (or grammatical structure) on cognition; and thus (b) escape the argumentative tautology of claiming language effects on cognition based only on linguistic differences.

19 Area of interest (AoI) and critical region are key terms from eye-tracking research referring to the part of the stimulus where the eye movement (or gaze movement) is recorded.

20 For a definition and discussion of the terms language, cognition, thought, see Schmiedtová, 2011a.

Overall, the results of v. Stutterheim et al.
(2012) have shown that the chosen methods and tasks, especially in their combination, proved to be excellent for testing the effects of language on thought. Compared to the previous study (Schmiedtová & Sahonenko, 2008), the experimental design, including stimulus material, instruction text, and intercultural transferability, was improved and yielded reliable data. The only two points to comment on are the recording of native speakers outside their native country and the absence of an intercoder reliability calculation. It is obvious that one should opt for collecting data from native speakers in their respective native countries. However, considering the size of the data sample in v. Stutterheim et al. (2012) and the logistics of making recordings of eye-tracking data in seven different countries, it would have been nearly impossible to satisfy this point. The other point of criticism is more serious: Although in v. Stutterheim et al. (2012) a second researcher was asked to check the transcripts and the coding, I am of the opinion that the only way to develop an objective “waterproof” coding scheme is to have the data coded by at least two other coders (optimally a mix of linguists and “naïve” native speakers). Based on the coding of several independent “blind” coders, intercoder reliability can be calculated and, if necessary, the coding schemes adjusted.

Study 3: Preference judgment task (Schmiedtová, 2013a)

Preference judgment tasks have been successfully employed in linguistics for a long time. As pointed out in section 2, they represent a powerful tool to gather large data sets with relatively little effort. However, caution should be exercised when designing these tasks since the selection of the right stimulus material is neither trivial nor easy. To demonstrate a possible way to design a preference judgment task I selected a study of my own (Schmiedtová, 2013a).
In this study an extensive judgment task was designed to test preferences in aspect use in Czech native speakers. The underlying hypothesis was that in contemporary spoken Czech the usage of the present perfective form (e.g., vy-pije PF “she/he drinks up”) has been extended (perhaps under the influence of German; Schmiedtová, 2012, 2012a) from a future to a here-and-now reading. To test this hypothesis, a questionnaire was developed comprising 35 scenarios, 15 critical and 20 fillers, all presented in present tense contexts. The fillers were motion verbs embedded in goal-oriented motion events with a potential endpoint (e.g., somebody riding a bike on a pathway, with the beginning of a forest in the background). The critical items were verbs depicting a situation with a resultant state (e.g., somebody drinking a cup of coffee, somebody throwing garbage into a trash can). Czech verbs are classified into five different conjugation classes. In order to test whether a particular verb class allows the use of the present perfective form in a here-and-now reading, three verbs from each class were selected. In order to avoid priming effects, the target verb did not appear in the prestory. So for instance, in the critical scene “throwing away garbage into a trash can” (the target verb, vyhodit PF / vyhazovat IMPF “to throw away”) the wording was as follows: “Imagine a situation in which you see a man standing next to garbage containers doing something. He is nearly finished with the activity he has been involved in. How would you most likely describe such a situation?” After reading this text participants could choose from five options in which the target verb appeared in five different tense/aspect combinations (i.e., present imperfective, past imperfective, present perfective, past perfective, secondary imperfective) [21]. Except for the difference in tense/aspect, these options were identical in wording.
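A questionnaire of this shape (15 critical scenarios among 20 fillers) is typically presented in a constrained random order. The sketch below shows one possible way to generate such an order in Python, assuming the constraint that no two critical items are adjacent. The source does not state which constraints were actually used in Schmiedtová (2013a), and the function name `pseudo_randomize` and the item labels are hypothetical.

```python
import random


def pseudo_randomize(critical, fillers, seed=0):
    """Return a shuffled item order in which no two critical items are adjacent."""
    if len(critical) > len(fillers) + 1:
        raise ValueError("not enough fillers to separate all critical items")
    rng = random.Random(seed)  # fixed seed makes the order reproducible
    fillers = rng.sample(fillers, len(fillers))      # shuffled copy
    critical = rng.sample(critical, len(critical))   # shuffled copy
    # The fillers leave len(fillers) + 1 gaps (before, between, after);
    # each critical item is placed in its own gap.
    gaps = set(rng.sample(range(len(fillers) + 1), len(critical)))
    remaining_critical = iter(critical)
    order = []
    for gap in range(len(fillers) + 1):
        if gap in gaps:
            order.append(next(remaining_critical))
        if gap < len(fillers):
            order.append(fillers[gap])
    return order


# Hypothetical item labels matching the 15 + 20 design described above.
critical_items = [f"critical_{i:02d}" for i in range(1, 16)]
filler_items = [f"filler_{i:02d}" for i in range(1, 21)]

order = pseudo_randomize(critical_items, filler_items, seed=42)
print(order[:6])
```

Placing each critical item in its own gap between fillers guarantees the separation constraint by construction, so no rejection-and-retry loop is needed. Other constraints (for example on verb class or on the position of the first critical item) would have to be added separately.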
The participants’ task was to check off the most preferred description of a given situation and, if needed, also indicate their second-best preference. The 35 scenarios were presented in a pseudo-randomized order, in the form of a paper-and-pencil questionnaire, and administered to 256 participants. The questionnaire was piloted with ten native speakers of Czech. Educational level and age of the participants were taken as factors for ensuring homogeneity of the participant group in terms of age and socio-economic background. The subjects were either pupils in the last year of high school or first-semester university students (age range 17-30; average 19.3). Gender was not controlled. The questionnaires were filled out in regular classes with a standardized instruction given to the participants orally by their teacher [22]. The participants had twenty minutes to finish the task. To investigate a possible influence of dialectal variation on the use of the present perfective, data were collected in five different regions of the Czech Republic. The questionnaires were anonymous and included only information regarding gender, native language, and the origin of the participants. Two subjects were excluded from the sample because they grew up in bilingual families.

Despite the fact that preference judgment tasks come with some downsides (e.g., the participants may not indicate their real preferences because they find the task odd or boring and make their choices randomly), the task was suitable for the investigation of the questions studied in Schmiedtová (2013a). One may suggest performing a corpus analysis instead, which would perhaps yield (even) more data points.

[21] These are all possible combinations in Czech.
[22] Several colleagues of mine kindly collected the data. They received detailed instructions on how to proceed, so that the procedure was comparable across sites.

The problem with a corpus
study would have been to control the context in which the tested form was presented (as it had to be present tense).

Before concluding the current paper, I would like to point out several aspects that should be considered when planning a judgment task.
(1) Stimuli: In addition to the more formal points listed in section 3, it should be taken into account that the language material in a questionnaire should be natural and not grammatically odd. Also, testing linguistic preferences or grammatical acceptability on isolated items (i.e., without any context) is highly problematic.
(2) Fillers and presentation: For a test in written language, the use of a large number of fitting fillers is absolutely essential. In addition, one has to control for the presentation order since participants have a tendency to connect individual items in a “meaningful way” (e.g., creating some kind of a story) or to select items repeatedly from only one place on the page (e.g., the first choice from the left).
(3) General: A lengthy questionnaire will not yield good data due to the limited attention and interest span of the participants. I would recommend shorter tasks tested on a larger number of participants.

5 Final Remarks

There are many different experimental methods suitable for linguistic research. The current paper focused on offline and online methods. When planning an experiment, a number of aspects must be taken into consideration in order to come up with a good experimental design and thus usable data. The relevant aspects include the selection of stimulus material, the experimental protocol, the recruitment of participants, as well as the coding and analysis of data. Because of ecological validity the calculation of intercoder reliability for the coding of linguistic data is indispensable. For the data analysis, I would always opt for the use of inferential statistics.
A prerequisite for this is a proper experimental design and a sufficient number of data points. In my opinion, basing a study only on qualitative analyses does not lead to meaningful and generally valid results, except for case studies in language pathology and child acquisition research. Last but not least: Although the use of experimental methods is crucial for doing linguistic studies, the research cannot be done without a good linguistic theory, yielding interesting and challenging research questions.

References

Carroll, D.W. (2008). Psychology of language. Belmont: Cengage Learning.
Carroll, M., & Stutterheim, C. (2002). Typology and information organisation. Perspective taking and language-specific effects in the construction of events. In A. Ramat (Ed.), Typology and Second Language Acquisition (pp. 365-402). Berlin: de Gruyter.
Carroll, M., Stutterheim, C.v., & Nüse, R. (2004). The language and thought debate. A psycholinguistic approach. In C. Habel, & T. Pechmann (Eds.), Approaches to Language Production (pp. 183-218). Berlin: de Gruyter.
Cenoz, J., Hufeisen, B., & Jessner, U. (2001). Cross-linguistic influence in third language acquisition: Psycholinguistic perspectives. Clevedon, UK: Multilingual Matters.
Clark, E. (2009). First language acquisition. Cambridge: Cambridge University Press.
Dąbrowska, E., & Szczerbinski, M. (2006). Polish children’s productivity with case marking: the role of regularity, type frequency, and phonological diversity. Journal of Child Language, 33(3), 559-597.
Flanderková, E., Mertins, B., Bezdíček, O., Baborová, E., & Černá, M. (2014). Posuzování gramatičnosti v Brocově afázii příklad dvou pacientů. Česká a slovenská neurologie a neurochirurgie, 77/110(2), 202-209.
Glucksberg, S., & Danks, J.H. (2013). Experimental Psycholinguistics (PLE: Psycholinguistics): An Introduction. Hoboken: Psychology Press.
Grosjean, F. (1998). Transfer and language mode. Bilingualism: Language and Cognition, 1(3), 175-176.
Hell, J.v., & Dijkstra, T. (2002). Foreign language knowledge can influence native language performance in exclusively native contexts. Psychonomic Bulletin & Review, 9(4), 780-789.
Höhle, B. (Ed.) (2010). Psycholinguistik. Berlin: Akademie Verlag.
Levinson, S., Kita, S., Haun, D., & Rasch, B. (2002). Returning the tables: Language affects spatial reasoning. Cognition, 84, 155-188.
Lucy, J. (1992). Grammatical categories and cognition: A case study of the linguistic relativity hypothesis. Cambridge: Cambridge University Press.
Lucy, J. (1996). The scope of linguistic relativity. In J.J. Gumperz & S.C. Levinson (Eds.), Rethinking linguistic relativity (pp. 37-69). Cambridge: Cambridge University Press.
MacIntyre, P.D., Noels, A.K., & Clément, R. (1997). Biases in Self-Ratings of Second Language Proficiency: The Role of Language Anxiety. Language Learning, 47(2), 265-287.
Marian, V., & Spivey, M. (2003). Competing activation in bilingual language processing: Within- and between-language competition. Bilingualism: Language and Cognition, 6(2), 97-115.
Müller, N. (2013). Transfer in bilingual first language acquisition. Bilingualism: Language and Cognition, 1(3), 151-171.
Schmiedtová, B. (2004). At the same time: The expression of simultaneity in learner varieties. Berlin: de Gruyter.
Schmiedtová, B. (2011). Do L2 speakers think in the L1 when speaking in the L2? International Journal of Applied Linguistics, 8, 97-122.
Schmiedtová, B. (2011a). Wie Sprache unser Denken formt - psycholinguistische Hintergründe. In S. Schulte (Ed.), Ohne Wort keine Vernunft - keine Welt: Bestimmt Sprache Denken? (pp. 97-128). Münster: Wachsmann.
Schmiedtová, B. (2012). Vergleich von deutschen und tschechischen kunsthistorischen Texten. In S. Höhne, I. Fiala-Fürst, R. Mikuláš, & B. Schmiedtová (Eds.), Brücken 2011. Germanistisches Jahrbuch Tschechien - Slowakei; thematischer Schwerpunkt - Sprachwissenschaft (pp. 221-240). Praha: Lidové Noviny.
Schmiedtová, B. (2012a). Untersuchung zu Sprache und Kognition am Beispiel von Ereigniskonzeptualisierung und Textkohärenz im Deutschen und Tschechischen. [Unveröffentlichte Habilitationsschrift]. Ruprecht-Karls Universität Heidelberg.
Schmiedtová, B. (2013). Traces of L1-patterns in the event construal of Czech advanced speakers of L2-English and L2-German. In C.v. Stutterheim, M. Flecken, & M. Carroll (Eds.), IRAL (51), 87-116.
Schmiedtová, B. (2013a). Zur Verwendung der perfektiven Präsensform im heutigen Tschechisch. Journal for Central European Studies, (2), 125-164.
Schmiedtová, B. (2013b). Zum Einfluss des Deutschen auf das Tschechische: Die Effekte des Zeitdrucks auf die Sprachproduktion. In M. Nekula, K. Šíchová, & J. Valdrová (Eds.), Bilingualer Sprachvergleich und Typologie (pp. 177-206). Tübingen: Julius Groos Verlag.
Schmiedtová, B., Stutterheim, C.v., & Carroll, M. (2011). Implications of language-specific patterns in event construal of advanced L2 speakers. In A. Pavlenko (Ed.), Thinking and Speaking in two languages (pp. 66-107). Clevedon, UK: Multilingual Matters.
Schmiedtová, B., & Flanderková, E. (2012). Neurolingvistika: předmět, historie, metody. Slovo a Slovesnost, 73, 46-62.
Schmiedtová, B., & Sahonenko, N. (2008). Die Rolle des grammatischen Aspekts in Ereignis-Enkodierung: Ein Vergleich zwischen Tschechischen und Russischen Lernern des Deutschen. In P. Gommes, & M. Walter (Eds.), Fortgeschrittene Lernervarietäten: Korpuslinguistik und Zweitspracherwerbforschung (pp. 45-71). Tübingen: Max Niemeyer.
Slobin, D. (1996). From “thought to language” to “thinking for speaking”. In J.J. Gumperz, & S.C. Levinson (Eds.), Rethinking linguistic relativity (pp. 70-96). Cambridge: Cambridge University Press.
Soares, C., & Grosjean, F. (1984). Bilinguals in a monolingual and a bilingual speech mode: The effect on lexical access. Memory & Cognition, 12(4), 380-386.
Stutterheim, C.v., Andermann, M., Carroll, M., Flecken, M., & Schmiedtová, B. (2012). How grammaticized concepts shape event conceptualization in language production: Insights from linguistic analysis, eye tracking data and memory performance. Linguistics, 4, 833-867.
Stutterheim, C.v., & Carroll, M. (2003). Typology and information organisation: perspective taking and language-specific effects in the construal of events. In A. Ramat (Ed.), Typology and Second Language Acquisition (pp. 365-402). Berlin: de Gruyter.
Stutterheim, C.v., & Klein, W. (1987). Quaestio und referentielle Bewegung in Erzählungen. Linguistische Berichte, 108, 163-183.
Stutterheim, C.v., & Lambert, M. (2005). Crosslinguistic analysis of temporal perspective in text production. In H. Hendricks (Ed.), The structure of learner varieties (pp. 1-19). Berlin: de Gruyter.
Vanpatten, B., & Jegerski, J. (Eds.). (2014). Research methods in second language psycholinguistics. New York, NY: Routledge.
Wittek, A., & Tomasello, M. (2002). German children’s productivity with tense morphology: the Perfekt (present perfect). Journal of Child Language, 29, 567-589.

How to investigate interpretation in Slavic experimentally?

Roumyana Slabakova

Abstract: This chapter raises the issue of high and unpredicted variability in the performance of native Russian (Slavic) speakers. Two case studies are discussed in which many more categorical contrasts were expected, but experimental findings attested highly variable interpretations. One case study, Slabakova (2004), investigated the interpretation of bare plural and mass objects in perfective sentences. The second case study, Cho and Slabakova (2014), looked at the acceptability of fronted objects when they are Topics or Foci. In both studies, native speaker participants revealed complex patterns of acceptability, sensitive to ambiguity permitted by the grammar.
A view of the grammar (Ramchand and Svenonius, 2008) that provides an explanation of the variable findings is discussed. Implications of this situation for psycholinguistic experiments are considered.

1 Introduction

In the investigation of meaning that speakers attribute to linguistic strings, Slavic languages present a curious empirical and methodological challenge. They often allow grammatical meanings (such as definiteness, specificity or quantization of the object) to be expressed without morphological marking on the noun phrase. Instead, the meanings under discussion are signaled by the perfective marking on the verb, by information structure (Topic, Focus) or by the word order of the entire sentence. From the outset, it is important to make the distinction between a semantic or grammatical category expressed by some language, say Number, and the linguistic expression, or exponent, of that category, for example the functional morphemes -s (as in cat-s) and -en (as in ox-en) in English. In general, mapping one category (meaning) to many exponents is not an unusual situation across languages of the world: variability in exponents abounds while the semantic and grammatical categories are arguably universal. Many factors, including lexical and phonological considerations, may influence which linguistic exponent appears where.

There is a possibly parallel situation in psycholinguistics. Linguistic theory often makes categorical claims or behavior predictions about the availability of certain interpretations of strings, while native speaker judgments reveal a lot more intra- and inter-personal variability than predicted. Very often, this situation is a result of several linguistic factors affecting the judgments. As a result, we researchers have a problem on our hands: how to make sure that speakers really have the interpretations that we think they have for a certain string.
Furthermore, how can we make sure that the interpretation depends upon the linguistic factors that we think it depends upon? How can we control for the effect of various factors on the interpretation? In this article, I will address this important issue from the perspective of Slavic languages. I will show results from two psycholinguistic experiments involving Russian native speakers that reveal a lot of unpredicted variability in interpretation. At the same time, this variability appears exactly with judgments where more than one grammatical factor influences the interpretation. To preview the answer, I will conclude that experimental psycholinguistic research needs to consider in detail a multitude of factors: grammatical, contextual, lexical, but also psycholinguistic variables such as the type of task, presentation, order of tasks, fillers, etc. The take-home message of this discussion is that if we are aware of the pitfalls of our experimental research, we are halfway to meaningful solutions.

2 Case Study 1: Quantization and Perfectivity

The linguistic process of quantization (Krifka, 1989) has proven relevant to the proper characterization of verb phrase (VP) telicity and count/mass nouns. I start by defining the relevant terms, since terminological confusion abounds in this literature. Telicity refers to the property of sentences to present events as bounded or unbounded in time (Vendler, 1957; Krifka, 1989; Filip, 2001). For example, to eat an apple is a telic VP because the apple-eating event has a potential endpoint with the end of the apple, while to eat apples or to like apples are atelic VPs because there is no such potential endpoint to the event or state. A quantized nominal expression is such that, whenever it is true of some entity, it is not true of any proper subparts of that entity. Count nouns are quantized; mass and bare plural nouns are not. Let us see why with an example. If something is an apple, then no proper subpart of that thing is an apple.
If something is water, then many of its subparts will also be water. Hence, an apple is quantized, while water is not quantized.

In the literature on aspect, two major mechanisms of “composing” telicity at the level of the VP have been identified (Krifka, 1989, 1998; Verkuyl 1972, 1993, 1999). One mechanism is to combine a non-stative (dynamic) verb with an object that is marked as exhaustively countable or measurable (a quantized object, in Krifka’s terminology; a specific quantity object, in Verkuyl’s terminology). English and other Germanic languages use this object-marking mechanism in (most) accomplishment and activity predicates. For example:

(1) Claire ate an apple / the apple / three apples / a bag of popcorn. (telic)
(2) Claire ate apples / popcorn. (atelic)

Dowty (1991) introduced a theta role, most often mapped onto objects, called Incremental Theme. The objects in (1) are such Incremental Themes because the progress of the event can be measured by looking at the affected participant. In an apple-eating event, for example, we know how close to its end the event is by looking at how much of the apple is eaten. In English, quantized nominal arguments linked to the Incremental Theme theta role combined with dynamic verbs bring forward a telic interpretation as in (1); cumulative Incremental Theme objects contribute to an atelic interpretation as in (2). This relationship is known as the Event-Object Homomorphism: when the object is completely affected, the event is over. Notice that quantization is orthogonal to definiteness, since both the indefinite nominal argument an apple and the definite the apple are quantized. However, the homomorphism only holds for Incremental Theme objects (Dowty, 1991), those objects that are created (effected) or consumed (affected) by the verbal action. Compare (3) and (4):

(3) Mike drove a red car. (atelic)
(4) Mike made a red car. (telic)

The difference in interpretation is due to the two verbs.
Verbs of creation (make, write) and verbs of consumption (eat, drink), among others, are unified in having Incremental Theme objects. These objects are affected by the event in a special way, and according to three recent theoretical accounts, “measure out” the progress of the event (Tenny, 1994), their discrete parts map to parts of the event (Krifka, 1989), or serve as an “event odometer” (Verkuyl, 1993). On the other hand, a verb such as drive does not take an Incremental Theme object, and even if the object is quantized as in (3), it does not make the whole VP telic. In the rest of this chapter, we will be dealing with Incremental Theme objects and their verbs.

Now, marking the same meaning, telicity or boundedness of the event, happens in a completely different way in Slavic languages. (Same meaning, different exponents!) In Russian (as well as in Czech and Polish, languages without articles), the verbal form carries the quantization information, while the objects are overtly unmarked in this respect (Wierzbicka, 1967; Forsyth, 1970; Krifka, 1998; Filip, 1993, 2001; Di Sciullo and Slabakova, 2005). Compare the sentences in (5) and (6).

(5) Ja el gruši / tort (atelic)
    I eat-PAST.1SG pears-ACC / cake-ACC
    “I was eating (some) pears/cake.”
(6) Ja s”-el gruši / tort (telic)
    I PERF-eat-PAST.1SG pears-ACC / cake-ACC
    “I ate all the pears/the whole cake.”

These two sentences differ only in the verbal form: in (5) the verb is imperfective while in (6) the verb is perfective, marked by the prefix s-, which carries completive meaning. Note that the objects gruši/tort “pears/cake” are a bare plural and a mass noun in both sentences. If these non-quantized nominals had any effect on the VP telicity, the objects would have made both VPs atelic, contrary to fact.
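The contrast between the two marking mechanisms in (1)-(6) can be summarized as a small decision rule. The Python sketch below is only a toy restatement of the analysis assumed in this chapter, not an implementation of any published formal system: the class `VP`, its fields, and the function names are hypothetical, all verbs are assumed to be dynamic, and the "Russian" rule is deliberately restricted to verbs with Incremental Theme objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VP:
    label: str
    incremental_theme: bool   # object is created or consumed by the event
    object_quantized: bool    # object is morphologically quantized ("an apple")
    perfective: bool = False  # verb carries perfective marking (Slavic pattern)


def telic_english(vp: VP) -> bool:
    # Object-marking pattern: a quantized Incremental Theme object yields telicity.
    return vp.incremental_theme and vp.object_quantized


def telic_russian(vp: VP) -> bool:
    # Verb-marking pattern: perfective marking yields telicity, whatever the
    # morphological shape of the Incremental Theme object.
    return vp.incremental_theme and vp.perfective


def object_reading_russian(vp: VP) -> str:
    # Under the assumed analysis, a telic VP imposes a quantized reading
    # on a bare plural or mass object.
    return "quantized" if telic_russian(vp) else "unspecified quantity"


english = [
    VP("(1) ate an apple", True, True),
    VP("(2) ate apples", True, False),
    VP("(3) drove a red car", False, True),
    VP("(4) made a red car", True, True),
]
russian = [
    VP("(5) el gruši", True, False, perfective=False),
    VP('(6) s"el gruši', True, False, perfective=True),
]

english_telic = [telic_english(vp) for vp in english]
russian_telic = [telic_russian(vp) for vp in russian]

for vp in english:
    print(vp.label, "telic" if telic_english(vp) else "atelic")
for vp in russian:
    print(vp.label, "telic" if telic_russian(vp) else "atelic",
          "/ object:", object_reading_russian(vp))
```

Note that the object in (5) and (6) has the same field values in both entries; only the `perfective` flag differs, which mirrors the point that the verbal form, not the noun phrase, carries the quantization information.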
In fact, it is the verbs that actually change the interpretation of the NP objects: in (5), where the verb is imperfective, we are talking about an unspecified quantity of pears or cake that I used to eat, or was eating, while in (6), where the verb is perfective, all the pears and all the cake involved in the eating event are understood to be completely consumed. Note that the analysis assumed here relies on Krifka and Filip’s work, as well as on Di Sciullo and Slabakova (2005). There are other linguistic analyses (e.g., Borik, 2006) where perfective prefixes are not treated as telicity-marking morphemes. However, it would take us too far away from the main point of this chapter to discuss these different analyses, which I leave for further research.

To summarize the linguistic facts according to the analysis I assume here, verbs and their Incremental Theme objects are in an Event-Object Homomorphism. In order to signal telicity, the potential endpoint of the event, either the verb (Slavic) or the object (English) can be marked for quantization. Because the VP with a perfective verb denotes a telic event in Slavic, the object in such sentences takes on a quantized interpretation. When such a sentence is additionally marked as past, the interpretation of the object is as completely affected (consumed or produced). I will not spend time here on ascertaining that the English part of the claim holds: I have tested this in my dissertation work and so I point the interested reader to Slabakova (2001). The other part of the claim is more interesting for our purposes here. In other words, we want to find out what interpretation Russian speakers attribute to the objects in perfective and imperfective sentences such as (5) and (6). In order to do that, we have to check construals (interpretations) in the absence of context and word order variations. Why is this necessary?
If there are many factors affecting an interpretation, we want to keep the number of variables in an experiment at a minimum. Otherwise, we will not know which factor is producing the experimental effect that we see.

Slabakova (2004) attempted to check the claims in the literature as to the event-object homomorphism and how Russian native speakers compute the telicity and completion of an event. The participants, 45 monolingual native speakers of Russian living in Russia, were asked to read some experimental sentences and choose the possible continuation from the two continuations spelled out below. There was also a third choice, that both continuations were possible.

(7) Petja pro-čital ėtot roman,
    Petja PERF-read this novel
    a) no eščë ne zakončil čitat’.
       but yet not finished reading
    b) i uže zakončil čitat’ do konca. ⇐ the only possible answer
       and already finished reading to end
    c) oba A i B vozmožny.
       both A and B possible
(8) Petja Ø-čital ėtot roman,
    Petja IMP-read this novel
    a) no eščë ne zakončil čitat’. ⇐ also possible
       but yet not finished reading
    b) i uže zakončil čitat’ do konca.
       and already finished reading to end
    c) oba A i B vozmožny. ⇐ best answer
       both A and B possible

The judgments expected by the researcher, as marked in (7) and (8), are in keeping with the long-established claim that in Slavic, imperfectivity is unmarked, hence fluid, while perfectivity is the marked value in the opposition. Note that if this is true, continuations (7a) and (7c) are logical contradictions, and the only expected choice is (7b). In the case of (8), the incomplete interpretation (8a) was expected, as well as the ambiguous construal (8c); however, (8b) on its own would be a contradictory choice.

The experiment had three conditions (with six tokens in each condition) in which Perfective and Imperfective verbs were crossed with three types of objects:
a) quantized objects with demonstrative pronouns, this novel as in the examples (7) and (8);
b) objects with quantifiers (two sweaters, a glass of beer);
c) non-quantized mass or bare plural objects (beer, tea).

Recall that the type of object is not supposed to have any effect on the telicity (completion) of the event; only the perfective prefix can change telicity values. Figure 1 presents pooled results from the three conditions described above.

[Figure 1: Percentage of expected choices made by Russian native speakers in interpreting perfective and imperfective sentences. Bar chart, y-axis 0-100; bars for non-quantized object, quantized object, and demonstrative object, grouped by imperfective verb and perfective verb. Data labels legible in the source: 94, 89.2, 94, 93.3, 93.3, 85.9.]

The distinctions between the conditions were not significant, that is, as expected, there was no statistically significant effect of type of object. These findings confirm the Event-Object Homomorphism, since the native speakers of Russian interpreted the sentences with perfective verbs in the past to mean that the whole event was complete, no matter the type of the object. These findings also validate the test instrument as capable of eliciting the expected behavior.

The last experimental condition (again n=6 tokens) looked at the interpretation of the object, not of the event.
There is supposed to be a homomorphism here: if one changes, the other changes. The objects in this condition were all mass or bare plurals, non-marked morphologically.

(9) Anja po-stirala odeždu…,
    Anja PERF-washed (the) clothes
    a) voobšče odeždu.
       in general clothes
    b) vsju odeždu kotoraja nuždalas’ v stirke. ⇐ expected
       all clothes which needed washing
    c) oba A i B vozmožny.
       both A and B possible
(10) Anja Ø-stirala odeždu…,
     Anja IMP-washed (the) clothes
     a) voobšče odeždu. ⇐ also possible
        in general clothes
     b) vsju odeždu kotoraja nuždalas’ v stirke.
        all clothes which needed washing
     c) oba A i B vozmožny. ⇐ best answer
        both A and B possible

The choice in (9a) - (10a) was intended to convey the generic interpretation of clothes in general; the choice in (9b) - (10b) was intended to refer to a specific set of objects: all the clothes that needed washing. The speakers’ choices of construal were very different from the ones in the previous three conditions. The patterns of choices of perfective and imperfective verb sentences are roughly the same, but they should not be. Figure 2 presents the speakers’ expected and unexpected choices in this last condition.

[Figure 2: Object construal choices made by Russian native speakers in perfective and imperfective sentences. Bar chart, y-axis 0-100; bars for Expected, Not expected, and Both interpretations, grouped by imperfective verb and perfective verb.]

What should we be seeing in Figure 2? For the imperfective verbs, looking at the first group of columns, one would expect speakers to choose that the object can be generic, such as “clothes in general,” or that the object can be specific, such as “all the clothes that needed washing in the family, that day.” In this group of columns, (10a), a possible choice of interpretation, was chosen in about 44% of answers, while (10c), the ambiguous construal in the black column, was chosen about 47% of the time. So the speakers are performing very much as expected.

On the other hand, choices of object construal with perfective sentences go against the predictions of the theory. Recall that in the perfective verb sentences, non-quantized objects should be interpreted as specific, or quantized, because the event is completed. The construal in (9b), “all the clothes that needed washing in the family, that day,” was the only expected answer. These choices, 52%, are represented in the checkered column. Generic objects “clothes in general” should not appear with perfective verbs, and as the grey column shows, these choices are only 4%. However, speakers choose “both interpretations are possible,” the black column, almost 49% of the time. While speakers are not choosing the wrong answer in this condition, they are signaling that the expected construal is not the only one possible for them.

So, what is going on in these data? To reiterate, Russian native speakers behave in the expected way in interpreting the telicity, or completion, of the events in simple sentences, based on the perfectivity of the verb. They choose completed construals for perfective sentences, with roughly 8% optional choices (both interpretations are possible), which is the wrong choice for perfective sentences. However, this relatively high accuracy, which supports the effectiveness of the test, is not replicated for the object construal condition. The speakers do not significantly demonstrate that they interpret bare plural or mass objects as specific quantity/quantized, depending on the perfectivity of the verb. Importantly, their behaviors on the event construal and on the object construal diverge. Why would this be the case?

If we think in universal terms, the behavior of the Russian speakers seems even odder. Compare the following English sentences, attempting to create equivalents of the Russian sentences in (9) and (10).

(11) She washed clothes (for a living).
(12) ??She finished washing clothes.

The construal of (11), containing a generic object, is habitual, such as She washed clothes for a living, or every day.
When we impose a finished one-time event interpretation for (12), the generic object sounds unnatural. So why are Russian speakers going for that unnatural interpretation? We have to look to the other factors affecting object interpretation to answer that question.

In Russian, the relatively free word order, or scrambling, gives rise to different discourse information structures. The preverbal position is normally related to Topic, or old information, and the postverbal position is related to Focus, or new information (see Yokoyama, 2006; Holloway King, 1995; Bailyn, 1995 for more discussion). Let me illustrate with some examples from Ionin (2003, pp. 111-112):

(13) Košk-a v-beža-la v komnatu
     cat-NOM.FEM PERF-run-PAST.FEM into room-ACC.FEM
     “The cat ran into the room.”
(14) V komnatu v-beža-la košk-a
     into room-ACC.FEM PERF-run-PAST.FEM cat-NOM.FEM
     “A cat ran into the room.”
(15) Lena pro-č-la (kakuju-to) knig-u. Ja ne znaju kakuju.
     Lena PERF-read-PAST (some) book-ACC.FEM I not know which
     “Lena read some book. I don’t know what.”

In (13), the preverbal subject is interpreted as Topic, hence known, contextually unique or definite. The same subject is postverbal in (14), and it is interpreted as Focused, an unknown cat. Indefinite non-specific objects also appear postverbally, as in (15). Is it possible that in the Slabakova (2004) experiment, there was a clash between perfectivity and word order? All sentences testing the Event-Object Homomorphism had SVO word order. On the one hand, the mass and bare plural objects were in the scope of a perfective prefix, which would purportedly give rise to a quantized interpretation. On the other hand, the objects were in postverbal position, which would normally lead to indefinite interpretations, specific as well as non-specific, depending on the context.
It is perhaps this clash of two sources of semantic information that makes Russian native speakers accept both quantized and non-quantized object construals in perfective sentences. Of course, these are post-hoc speculations. The point worth retaining from this discussion is that if variation in native judgments is uncovered, it is likely that two or more factors have contributed to these judgments.

3 Case Study 2: Definiteness, Topicalization and Word Order

Word order and Topicalization are involved in the second case study I am going to discuss here, this time in relation to definiteness. Definiteness is not a simple concept: it consists of a number of semantic components such as familiarity, presupposition of existence, and uniqueness (Heim, 1991). In the study whose partial results I will use here, Cho and Slabakova (2014), we assumed an informal definition of definiteness based on presupposition: a nominal is definite when there is a presupposition of its referent being unique in the domain of discourse, where uniqueness can be established through previous mention or world knowledge. This definition is valid for singular nouns only, while for plural nouns there is a presupposition of maximality, that is, all members of a specified set are included. Consider the examples in (16)-(18) from Holloway King (1995, p. 78). It looks like there is another homomorphism, this time between definiteness and preverbal position, and between indefiniteness and postverbal position. But that of course is because preverbal DPs are Topic, and postverbal DPs are Focus (marked T/F in the examples).

(16) Na stole [+def/T] stojala lamp-a [-def/F].
     on desk stand-PAST.FEM lamp-NOM.FEM
     “A lamp was on the desk/there was a lamp on the desk.”
(17) Lamp-a [+def/T] stojala na stole [-def/F].
     lamp-NOM.FEM stand-PAST.FEM on desk
     “The lamp was on a/the desk.”
(18) Na stole [+def/T] lamp-a [+def/T] stojala (a ne leža-la).
     on desk lamp-NOM.FEM stand-PAST.FEM (but not lie-PAST.FEM)
     “The lamp was standing on the desk (it was not lying).”

However, Geist (2010) showed that if the familiarity condition is met, a DP receives a definite interpretation regardless of word order position.

(19) Na tom stole leža-la knig-a i gazet-a.
     on that table lay-PAST book-NOM and newspaper-NOM
     Anja vzjala knigu.
     Anja took book-ACC
     “A book and a newspaper were lying on that table. Anja took the book.”

The referent kniga “book” in the first sentence is in the postverbal position and there is no presupposition that the referent is familiar to the hearer as well as the speaker; hence, kniga “book-NOM” receives an indefinite reading from being in the postverbal (Focus) position. The referent knigu “book-ACC” in the second sentence is also postverbal; however, it is definite since the referent kniga “book-NOM” was introduced in the previous discourse, which established familiarity.

Corpus data confirm linguists’ intuitions and the fact that there is just a tendency, not a 100% correlation, between word order and information structure in Russian. Sirotinina (1965/2003) offers counts of VO and OV word orders in various Russian registers related to information structure. She reports that objects are preverbal in 7-9% of the cases in scientific speech, 10-12% in literary texts and up to 60% in colloquial speech. Note that these counts are compatible with the ones cited in Bailyn (2012, p. 249): OSV in 4% and OVS in 11% of written sentences. Sirotinina’s distribution, given in table 1 following Slioussar (2007), shows a rough division of 40% to 60% for the word order-information structure correlation.
Word order   Given object/Topic   New object/Focus
VO           166 (39%)            206 (59.7%)
OV           259 (60.9%)          139 (40.3%)

Table 1: Correlation between word order and Given-New status of the object in Russian, from Sirotinina (1965/2003)

The Cho and Slabakova (2014) study looked at how second language learners interpret DPs in Russian in terms of definiteness, in two different constructions. One had to do with the type of adjectival and nominal possessors (not discussed here), and the other had to do with word order. The study investigated the second language acquisition of Russian definiteness by English and Korean native speakers; however, I will only look at some results from the native Russian speakers here. Fifty-seven native speakers of Russian participated, all tested in Moscow. This time we gave the participants contexts that clearly identified Topic and Focus. Example (20) is a test item from one of our experimental conditions. The object sup “soup” is definitely given, or Topic, so it should be acceptable in preverbal position (Kovtunova, 1976, 1980).

(20) [+def]/Topic object in preverbal position (OVS should be accepted)
Oleg and his brothers Sergei and Aleksei always help their mom make dinner. Today they made mushroom soup, baked potatoes and beet salad. When their dad came home and tried the soup, he asked:
Q: Kto svaril takoj vkusnyj sup? “Who made such delicious soup?”
A: Sup svaril Oleg.
   1 2 3 4 5 (expected)
   “soup boiled Oleg”

In (21), on the other hand, sup “soup” is the answer to the wh-question, so it is new information and should be unacceptable at the beginning of the clause.

(21) [−def]/Focused object in preverbal position (OVS should be rejected)
I was watching TV when Aunt Galja called. She wanted to talk to Mom. I told her that Mom was busy cooking. Aunt Galja asked:
Q: Čto gotovit tvoja mama? “What is your mom cooking?”
A: Sup gotovit mama.
   1 (expected) 2 3 4 5
   “soup cooks Mom”

We asked Russian native speakers to rate these answers in terms of acceptability in the context, on a scale of 1 to 5, where 1 is unacceptable and 5 is perfectly acceptable. The mean ratings on the test sentences (n=6) are presented in figure 3.

[Figure 3: Russian native speakers’ mean ratings (out of 5) of two word order-context combinations. Topic Object in OVS: 4.44; Focus Object in OVS: 3.24.]

Again, as in case study 1, this was an unexpected finding, since the focused (indefinite) DPs should have been rejected much more categorically in preverbal position (see example 21 and the right-hand column in figure 3), but native speakers rate them above the middle of the scale, that is, 3. There is absolute clarity with respect to familiarity here. The object that the Russian speakers accepted in preverbal position was an answer to a “what” question, so clearly Focus, that is, unfamiliar, new information. The group results showed that there was a significant difference between the ratings of the purportedly acceptable and unacceptable sentences. A paired samples t-test indicated that the rate difference is statistically highly significant (t=9.532, p<0.0001). However, on an individual level, only 58.93% of all 57 participants made a statistically significant distinction. In addition, there is quite a lot of variation in the Russian native ratings. Nine native speakers of Russian incorrectly accepted the OVS order in the contexts with a Focused object, giving an average rating of 4 or higher. Seven native speakers unexpectedly rejected the OVS order in the contexts with a topicalized object. The standard deviations of the two mean ratings in figure 3 are .67 and .75, respectively. I try to make sense of these findings in the next section.

4 Discussion

Let us first take stock of the findings in case study 1 and case study 2. In study 1, speakers were asked about construal without context, and they did not show that they obeyed the Event-Object Homomorphism. Perfective verbs did not impose a quantized construal on Incremental Theme objects. In study 2, speakers were asked about the acceptability of preverbal objects (OVS) in a clearly focused context (the answer to a wh-question). The expectation was that the focused object would go at the end of the sentence. Again, there was a lot of variation, and over-acceptance of Focused objects in preverbal OVS positions. Is it the case that anything goes in Russian grammar? Obviously not, but we have just seen that some categorical claims in the literature, like the Event-Object Homomorphism and the discourse mapping of Topic>Focus, are not observed in 100% of cases, and that variation is pervasive. Again, we have corpus support reflected in table 1. What kind of concept of the grammar, or more specifically, of the syntax-semantics interface, do findings such as these support? And how is Russian grammar different from English grammar in this respect?
In order to make sense of this situation, I will refer to a concept of the grammar that is proffered and supported in Ramchand and Svenonius (2008). In this view of the grammar, all languages have the same formal syntax (which the authors dub syn/sem) and Conceptual-Intentional systems (Chomsky, 2004), or Conceptual Structure (Jackendoff, 2002). Consequently, all languages can express all (grammatical) meanings. Some languages express those meanings morphologically or syntactically, and some languages express those meanings postsyntactically through context. Thus, language variation lies only in the way languages express the universal meanings. However, Ramchand and Svenonius (2008, p. 225) also argue against identical syn/sem (or LF) representations in all languages. They argue that some meanings are underspecified, and thus it is left to the context to fix them, or fill in additional meanings. To be more concrete, let us take an example from Northern Sámi first person plural and dual pronouns, in comparison with English nominative pronouns.

(22) Northern Sámi and English first person plural and dual pronouns

Language         Syn/Sem   Conceptual-Intentional
Northern Sámi:   mii       “I and others”
                 moai      “I and one other”
English:         we        “I and others”
                 we        “I and one other”

Northern Sámi has a plural we and a dual we. One can think of the Sámi and English pronouns as represented in (22), where English has two homophonous pronouns we, one with the meaning “I and one other” and another with the meaning “I and others.” We can also think of the same situation in another way, as represented in (23).

(23) Underspecified English system:

Language         Syn/Sem   Conceptual-Intentional
Northern Sámi:   mii       “I and others”
                 moai      “I and one other”
English:         we        “I and one or more others”

Context will make it clear whether “we” refers to two or more individuals, including the speaker, or whether it truly does not matter. The grammar (syn/sem) underspecifies that feature.
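The contrast between (22) and (23) can be pictured as a toy data structure. The Python fragment below is only an illustrative sketch, not Ramchand and Svenonius's formalism: the two lexicon dictionaries, the `interpret` function and the cap on group size are hypothetical, and mii is assumed here to cover groups of three or more.

```python
# Toy lexicons (hypothetical): each pronoun form maps to the set of group
# sizes, speaker included, that the grammar (syn/sem) itself allows.
DUAL = frozenset({2})
PLURAL = frozenset(range(3, 10))  # "three or more", capped at 9 for the toy example

NORTHERN_SAMI = {"moai": DUAL, "mii": PLURAL}   # number fixed by the form
ENGLISH = {"we": DUAL | PLURAL}                 # underspecified, as in (23)


def interpret(lexicon, form, context_size=None):
    """Return the group sizes still possible once grammar and context apply.

    context_size stands in for whatever the discourse context reveals;
    None means the context says nothing about group size.
    """
    allowed = lexicon[form]
    if context_size is None:
        return allowed
    return allowed & {context_size}


# English "we" stays open until context narrows it down.
open_reading = interpret(ENGLISH, "we")
narrowed_reading = interpret(ENGLISH, "we", context_size=2)
# Northern Sámi "moai" is already fixed by the grammar; a clashing context
# leaves no reading at all.
clash = interpret(NORTHERN_SAMI, "moai", context_size=3)
```

On this sketch, the work done by the lexicon in Northern Sámi is done by the `context_size` argument in English, which is the sense in which the feature is left to the Conceptual-Intentional systems.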
To come back to the definiteness discussion, Ramchand and Svenonius (2008) capitalize on the fact that some semantic features are left underspecified by the syntax and interpreted by the C-I systems, for example tracking of referents in the discourse. Their approach forces all languages to have a DP projection so that nominals can be interpreted as arguments; however, some languages have overt morphophonological material in the D head while others have null D heads. English has two distinct D elements (a, the) of type <<e, t>, e>, making the whole DP of type <e> (mapping a predicate to an individual), each of which carries different information as to the familiarity of the NP referent. Russian has an underspecified null D whose concrete interpretation is filled in each discourse situation by the C-I system. As mentioned above, in table 2 “syn/sem” stands for the meaning being fixed by the syntax at LF, while “C-I” stands for the interpretation being resolved by the Conceptual-Intentional systems.

Meanings            Norwegian   English   Lillooet Salish   Russian
Argumenthood        syn/sem     syn/sem   syn/sem           syn/sem
Definiteness        syn/sem     syn/sem   C-I               C-I
Specificity         syn/sem     C-I       syn/sem           C-I
Argument tracking   C-I         C-I       C-I               C-I

Table 2: Parametric variation in encoding nominal meanings across languages

Table 2 is cited from Ramchand and Svenonius (2008, p. 227) and, in my opinion, omits some meaning expressions while trying to make a bigger point. For example, the table incorrectly asserts that Russian does not mark specificity overtly, while in fact Russian has a range of indefinite pronouns and determiners derived from them (koe-kakoj, kakoj-to, kakoj-nibud’), which do mark specificity and/or scope, and whose acquisition we investigate in other work. The bigger point, even if not all attributions in the various cells of table 2 are completely accurate, is still a valuable insight.
It is not only possible, but also profitable to account for some language variation in this way. Ramchand and Svenonius’s (2008) proposal may be speculative at this point and in need of significant empirical support; however, it bears important implications for language acquisition and language use. If all languages have the same universal syntactic/semantic system and parametric variation captures the way universal meanings are encoded (i.e. morphologically or contextually), then speakers of some languages may be more attuned to, and as a result more sensitive to, variation, ambiguity and fuzzy meanings ultimately fixed by context. Inevitably, higher standard deviations will be attested in ratings or evaluations of meanings and strings in these languages. In other words, linguistic judgments will be more flexible. To be sure, this parametric variation may come in addition to other, more categorical, parametric variation in features. How can this idea of the language architecture explain what we observed in the two case studies? In study 1, speakers evaluated sentences without the benefit of context, and they signaled that the Event-Object Homomorphism may be violated if context pushes against it. Note that the speakers are not offering some wild judgment, such as that only generic objects are possible in perfective sentences (see figure 2), but rather that generic and definite specific object construals are both possible, if supported by context. In study 2, the speakers allow Focused objects in preverbal positions, although they do distinguish them from Topicalized objects in acceptability. If we look at table 1 again, we can see why: about 40% of new-information objects occur in preverbal position. The Topic-Focus distinction is not exclusively marked through word order. One can think of at least one other factor that may play a role in this marking, namely intonation.
It is in fact possible that the native speakers who accept the OVS word order are attributing some Focus intonation to the fronted object in silent pronunciation. If this concept of the grammar is on the right track, the implication for psycholinguistic methodology is clear. If a lot of grammatical meanings in Russian are underspecified, hence dependent on context, lexical items, word order, Information Structure, intonation, and possibly other factors to fix the exact meaning in every case, then our elicitation methods will have to take this fact into account. For example, the newer methods such as ERPs and eye tracking might not be able to get a good reading of speaker construals, because speakers are aware of a multitude of possible meanings and strings. ERPs and eye tracking depend on sharp contrasts and minimally distinct baseline and experimental test items. Thus these methods would not be appropriate for testing fluid, flexible meanings. In conclusion, it may be the case that some languages leave certain linguistic meanings to be fixed by context, or by subtle combinations of multiple factors, while other languages fix grammatical meanings more often through functional morphology. Slavic languages may be among the former with respect to aspect, definiteness, word order, information structure, etc. Hence, experimental psycholinguistic research in Slavic languages needs to pay a lot more attention to a multitude of linguistic factors: grammatical, contextual and lexical. This variation is in addition to psycholinguistic variables such as the type of task, presentation, order of tasks, fillers, etc. and sociolinguistic variables such as the level of proficiency of native speakers, exposure to type of language, education levels and socio-economic status. If we are aware of the pitfalls to our experimental research, we are halfway to meaningful solutions.
At the same time, we should not be so petrified by the factors that bring variance into our experiments that we stop running experiments altogether.

References

Bailyn, J. (1995). A Configurational Approach to Russian “Free” Word Order. Unpublished Ph.D. dissertation, Cornell University.
Bailyn, J. (2012). The Syntax of Russian. Cambridge University Press.
Borik, O. (2006). Aspect and Reference Time. Oxford University Press.
Cho, J. & Slabakova, R. (2014). Interpreting definiteness in a second language without articles: The case of L2 Russian. Second Language Research, 30(2), 159-190.
Chomsky, N. (2004). Beyond explanatory adequacy. In A. Belletti (Ed.), Structures and beyond: The cartography of syntactic structures Vol. 3 (pp. 104-131). Oxford: Oxford University Press.
Di Sciullo, A. M. & Slabakova, R. (2005). Quantification and Aspect. In H. Verkuyl, H. de Swart, & A. van Hout (Eds.), Perspectives on Aspect (pp. 61-80). Dordrecht: Springer.
Dowty, D. (1991). Thematic proto-Roles and Argument Selection. Language, 67, 547-619.
Filip, H. (1993). Aspect, Situation Type and Nominal reference. PhD dissertation, UC Berkeley [Published as 1999, Aspect, Eventuality Types, and Noun Phrase Semantics. New York/London: Garland Publishing].
Filip, H. (2001). Nominal and Verbal Semantic Structure: Analogies and Interactions. Language Sciences, 23, 453-501.
Forsyth, J. (1970). A Grammar of Aspect: Usage and Meaning in the Russian Verb. Cambridge University Press.
Geist, L. (2010). Bare singular NPs in argument positions: restrictions on indefiniteness. International Review of Pragmatics, 2, 191-227.
Heim, I. (1991). Artikel und Definitheit. In A. von Stechow & D. Wunderlich (Eds.), Semantics: An International Handbook of Contemporary Research (pp. 487-535). Berlin: De Gruyter.
Holloway King, T. (1995). Configuring Topic and Focus in Russian. Stanford, CA: Rand Corporation Publication.
Ionin, T. (2003). Article Semantics in Second Language Acquisition. Unpublished PhD thesis, MIT.
Jackendoff, R. (2002). Foundations of Language. Oxford: Oxford University Press.
Kovtunova, I. (1976). Sovremennyj russkij jazyk: Porjadok slov i aktual’noe členenie predloženija. Moscow: Prosveščenie.
Kovtunova, I. (1980). Porjadok Slov. In N. Ju. Švedova et al. (Ed.), Russkaja grammatika. Moscow: Nauka.
Krifka, M. (1989). Nominal Reference, Temporal Constitution, and Quantification in Event Semantics. In J. van Benthem, R. Bartsch, & P. van Emde Boas (Eds.), Semantics and Contextual Expressions (pp. 75-115). Dordrecht: Foris.
Krifka, M. (1998). The Origins of Telicity. In S. Rothstein (Ed.), Events and Grammar (pp. 197-235). Dordrecht: Kluwer.
Ramchand, G., & Svenonius, P. (2008). Mapping a parochial lexicon onto a universal semantics. In M. Biberauer (Ed.), Limits of Syntactic Variation (pp. 219-45). Amsterdam: John Benjamins.
Sirotinina, O. (1965/2003). Porjadok slov v russkom jazyke. Moskva: Editorial, URSS.
Slabakova, R. (2001). Telicity in the Second Language. Amsterdam: John Benjamins.
Slabakova, R. (2004). Effect of Perfective Prefixes on Object Interpretation: A Theoretical and Empirical Issue. In Cahiers linguistiques d’Ottawa, 32, June, 122-142.
Slioussar, N. (2007). Grammar and information structure: A study with reference to Russian. Unpublished doctoral dissertation, Utrecht Institute of Linguistics, Utrecht, The Netherlands.
Tenny, C. (1994). Aspectual Roles and the Syntax-Semantics Interface. Berlin: Springer.
Vendler, Z. (1957). Verbs and Times. The Philosophical Review, 66, 2, 143-160.
Verkuyl, H. (1972). On the compositional nature of the aspects. Dordrecht: Reidel.
Verkuyl, H. (1993). A Theory of Aspectuality: The interaction between temporal and atemporal structure. Cambridge University Press.
Verkuyl, H. (1999). Tense, aspect, and aspectual composition. In M. Dimitrova-Vulchanova & L. Hellan (Eds.), Topics in South Slavic Syntax and Semantics (pp. 125-162). Amsterdam: John Benjamins.
Wierzbicka, A. (1967). On the semantics of the verbal aspect in Polish. In To Honor Roman Jakobson: Essays on the Occasion of His Seventieth Birthday. Volume 3. Janua Linguarum Series Maior 33 (pp. 2231-2249). The Hague, Paris: Mouton.
Yokoyama, O. (2006). Word order in spoken discourse. In K. Brown (Ed.), Encyclopaedia of Language and Linguistics, 2nd ed., 12, 88-95. Oxford: Elsevier Science.

Does language-as-used fit a self-paced reading paradigm? (The answer may well depend on how you model the data.)

Dagmar Divjak¹, Antti Arppe & Harald Baayen

Abstract: We report on a self-paced reading experiment that was run to ascertain whether differential tense, aspect and mood (henceforth TAM) marking on verbs would affect processing. TAM properties were identified as the strongest predictors for the choice between 6 near synonyms meaning TRY in Russian on the basis of regression models fit to manually annotated corpus data (Divjak, 2010; Divjak & Arppe, 2013). We will discuss how we used a Generalized Linear Mixed Model to account for the fact that we deviated from the traditional set-up for self-paced reading in two ways: we used an imbalanced design and ran the task with actually attested sentences rather than artificially created ones. These deviations were motivated by the need to accommodate the natural restrictions on TAM combinations and to respect the lack of a strict word order, which are both typical for Russian. We will also describe how we used a Generalized Additive Model to handle the non-linearities that we encountered in the reading times data.

1 Introduction: From text to model to mind?

Over the past 15 years probabilistic statistical classification models have become established as the de facto methodological standard for predicting the choice between lexical or constructional alternatives in usage-based linguistics. It is a method widely applied in semantics (e.g.
Arppe & Järvikivi, 2007; Arppe, 2008; Divjak, 2010; Divjak & Arppe, 2013), syntax (e.g. Gries, 2003; Bresnan, 2007; Bresnan et al., 2007; Bresnan & Ford, 2010; Kendall et al., 2011; Klavan, 2012), morphology (e.g. Antić, 2012; Baayen et al., 2013), phonetics and phonology (e.g. Erker & Guy, 2012; Raymond & Brown, 2012) and in areas as diverse as sociolinguistics (e.g. Grondelaers & Speelman, 2007), historical linguistics (e.g. Gries & Hilpert, 2010; Szmrecsanyi, 2013; Wolk et al., 2013) and language acquisition (e.g. Ambridge et al., 2012).

¹ Corresponding author: Dagmar Divjak. Author contributions: Dagmar Divjak and Antti Arppe designed the self-paced reading experiment; Dagmar Divjak ran the experiment; Dagmar Divjak, Antti Arppe and Harald Baayen analyzed the data with comments from Petar Milin; Dagmar Divjak wrote the paper using comments and suggestions from Antti Arppe and Harald Baayen. The PsychoPy script for self-paced reading was written by Lily FitzGibbon; participants were recruited and scheduled by Daria Satyukova. The experiment received ethical approval from the University of Sheffield, School of Languages & Cultures. The financial support of the Prokhorov Foundation and the logistic support of the Saint Petersburg branch of the Russian Academy of Sciences are gratefully acknowledged.

We are currently experiencing another shift (see Klavan & Divjak, 2016), i.e. towards providing experimental “validation” for such models (cf. a series of papers by Arppe & Järvikivi, 2007; Ford & Bresnan, 2013a, 2013b; Divjak et al., 2016; Klavan, 2012). Bearing in mind the age-old adage that “[n]ot everything that counts can be counted and not everything that can be counted counts” we indeed need to ask the question of what is real about such statistical classification models. There are two aspects to this question.
On the one hand, it addresses one of the main problems that corpus linguists face when annotating datasets for their research, i.e. the decision on the level of granularity: which level of annotation and which annotation scheme yield the best prediction. On the other hand, it targets a key concern for cognitive corpus linguists: there is an abundance of patterns that can be detected in usage, but what the analyst detects may well be different from what the speaker detects and uses. Is the model that we propose cognitively realistic? Can we by means of textual data analysis get at what drives speakers?

In this chapter, we focus on the choice between near-synonymous verbs expressing TRY in Russian that, among an impressive list of other synonymous words, have been the subject of extensive study by linguists from the Moscow Semantic School. In 20th-century Western linguistics, by contrast, synonymy was rather neglected: part of the reason for this might be that a graded, lexical phenomenon like near-synonymy does not fit in well with the theoretical frameworks that dominated Western linguistics during the second half of the 20th century. During that time, synonymy was reserved for lexicographers, who often worked in a corpus-illustrated fashion (Tognini-Bonelli, 2001). Early studies focused on pairs of synonyms, e.g. Geeraerts (1985) on vernietigen and vernielen (destruct/destroy) in Dutch, Church et al. (1991, 1994) on strong versus powerful, Mondry & Taylor (1992) on lying in Russian (lgat’ versus vrat’), Schmid (1993) on start versus begin, Taylor (2003) on high versus tall, Kjellmer (2003) on almost and nearly. Biber et al. (1998) studied a group of 3 synonyms: big, large and great, while Gries (2003) compared similar adjectives ending in -ic and -ical. Divjak (2004, 2010) attempted to put the study of lexical synonymy on sounder footing, thereby testing assumptions from usage-based theory in general and from cognitive linguistics in particular. At the same time, Arppe (2008) approached synonymy from a theoretically agnostic, quantitative perspective, while a primarily computational linguistic approach is presented in long-standing, comprehensive work by Hirst and collaborators (e.g. Edmonds & Hirst, 2002; Inkpen & Hirst, 2006).

After an introduction to synonymy (section 1), we review the corpus-based analysis of TRY verbs in Russian (section 2) before moving on to new data from a self-paced reading task that was run to assess whether the effect of the strongest predictors would be felt during processing (section 3). In section 4 we reflect on the linguistic insights that were gleaned from working on data from a morphologically rich language.

2 What is synonymy?

Traditionally, two words are considered synonymous in a sentence or linguistic context if the substitution of one for the other does not alter the truth value of the sentence. Two lexical units would be absolute synonyms if, and only if, all their contextual relations were identical. For this reason, it is commonly asserted that absolute, perfect or full synonyms do not exist. Synonyms, then, are defined as lexical items whose senses are identical in respect of “central” semantic traits, but differ in respect of so-called “minor” or “peripheral” traits. Within the Western tradition (Cruse, 2000), synonyms are defined contextually by means of diagnostic frames. For cognitive synonyms such as die, pass away and kick the bucket, which differ only in expressive traits, it is impossible to state *He kicked the bucket but he did not die. Yet plesionyms differ in more than just expressive traits, so two plesionyms can be united in one sentence such as He was killed, but I can assure you he was not murdered. In the Russian tradition, the decompositional approach prevails and synonyms are analyzed by means of a semantic metalanguage. Apresjan et al. (1995, p. 60; 2000, p. XL) define the constitutive characteristic of “synonyms” as “the presence in their meaning of a sufficiently big overlapping part”. To define the “sufficiency” of “big overlapping”, the meanings of words are reformulated with the help of a special metalanguage. The strict formulation prescriptions and the limited inventory of lexical primitives of this metalanguage facilitate comparison of meanings. The overlap has to be bigger than the sum of the differences for two lexemes, or at least equal to the sum of the differences in case of three or more lexemes. Apart from that, the overlap has to relate to the assertion of the definition that contains “genera proxima”, the syntactic main word of which coincides.

On both accounts (for a detailed discussion of the pros and cons of these approaches and an alternative proposal see Divjak, 2010), the three verbs that are in focus in this chapter (probovat’, pytat’sja, starat’sja “try”) would qualify as near-synonyms, and indeed they constitute a separate entry in Apresjan et al. (1999). Yet, as explained in Divjak (2010, pp. 1-14), the verbs were in fact selected on the basis of a distributional analysis in the tradition of Harris (1954) and Firth (1957), with meaning in the Wittgensteinian sense construed as contextual. Synonymy was operationalized as mutual substitutability (i.e., interchangeability) within a set of constructions, i.e. a shared constructional network. On a Construction Grammar approach to language, both constructions and lexemes have meaning; as a consequence, the lexeme’s meaning has to be compatible with the meaning of the construction in which it occurs and of the constructional slot it occupies to yield a felicitous combination. Therefore, the range of constructions a given verb is used in and the meaning of each of those constructions are revealing of the coarse-grained meaning contours of that verb.
The results can then be used to delineate groups of near-synonymous verbs. On this approach, near-synonyms share constructional properties, even though the extent to which a construction is typical for a given verb may vary and the individual lexemes differ as to how they are used within the shared constructional frames.

3 Fitting a polytomous regression model to corpus data on TRY verbs in Russian

For an exhaustive overview of corpus-based work on Russian TRY verbs, we refer to Divjak (2010, pp. 177-193). Here we will focus on the corpus research that inspired the hypothesis tested using self-paced reading. We build on earlier work by Divjak (2004/2010), who constructed a database containing 1585 tokens for 9 Russian verbs that mean TRY if combined with an infinitive: probovat’, pytat’sja, starat’sja, silit’sja, norovit’, poryvat’sja, pyžit’sja, tščit’sja, tužit’sja. The last 3, however, occur too infrequently to yield reliable estimates in a regression model and were therefore omitted. The data were drawn from the Amsterdam Corpus, which contains written literary texts, supplemented with data from the Russian National Corpus. About 250 extractions per verb were analysed in detail, except for poryvat’sja, which is rare and for which only half that number of examples could be found. Samples of equal size were chosen for two reasons: 1) interest was in the contextual properties that would favour the choice of one verb over another, and by fixing the sample size, frequency was controlled; 2) the difference in frequency of occurrence between these 9 verbs is so large that manually annotating a sample in which the verbs would be represented proportionally would be prohibitively expensive.
The extractions were manually annotated for a variety of morphological, semantic and syntactic properties, using the annotation scheme initially proposed in Divjak (2003, 2004) and later described under the name Behavioral Profiling (BP) in a number of publications (Divjak, 2006; Divjak & Gries, 2006). Divjak’s (2003, 2004) BP bears resemblance to annotation schemata used in Gries (2006) and Arppe (2008), and the name can be traced back to Hanks (1996), whose profiles were, however, restricted to complementation patterns and semantic roles. BPs chart the behaviour of X across N contexts (where context = “natural” unit of expression, i.e. sentence or clause) for a multitude of parameters (incl. grammatical information). The net is cast wide because it is not known what does (not) convey meaning. The tagging scheme (for details see Divjak, 2010, pp. 119-129) was built up incrementally and bottom-up, starting from the grammatical and lexical-conceptual elements that were attested in the data. This scheme captures virtually all information provided at the clause level (in case of complex sentences) or sentence level (for simplex sentences) by tagging morphological properties of the finite verb and the infinitive, syntactic properties of the sentences and semantic properties of the subject and infinitive as well as the optional elements. All annotation is “naïve” (different from the Bresnan dative studies), meaning that only such linguistic labels are used for which linguistically naïve native speakers can reasonably be expected to have a matching category. For example, we do not expect native speakers to be able to identify an inanimate subject or a past tense, but we do expect them to know whether something is alive or whether an event has already happened. There were a total of 14 multiple-category variables amounting to 87 distinct variable levels or contextual properties (listed in table 1), and yielding a set of 137,895 manually coded data points.
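As a minimal illustration of what such a behavioural profile looks like as data, the sketch below codes each annotated token as a row of 0/1 values, one per variable level. The level names and the three toy tokens are hypothetical and are not taken from the annotated sample; only the figures 1585 and 87 come from the text. The reported total is consistent with every token being coded for every level, since 1585 × 87 = 137,895.

```python
def profile_matrix(annotations, levels):
    """Return one row of 0/1 codes per token, one column per variable level."""
    return [[1 if level in token_tags else 0 for level in levels]
            for token_tags in annotations]


# Hypothetical toy data: three annotated tokens, five variable levels.
levels = ["tense.past", "tense.present", "aspect.perfective",
          "clause.main", "subject.human"]
annotations = [
    {"tense.past", "aspect.perfective", "clause.main", "subject.human"},
    {"tense.present", "clause.main"},
    {"tense.past", "subject.human"},
]

matrix = profile_matrix(annotations, levels)
toy_data_points = len(matrix) * len(levels)   # tokens x levels

# The same product for the figures given in the text.
corpus_data_points = 1585 * 87

print(matrix)
print(toy_data_points, corpus_data_points)
```

Each column of such a matrix can then serve as a candidate binary predictor in a regression model.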
Divjak and Arppe (2013) used this dataset to train a polytomous logistic regression model (Arppe, 2008, 2013a, 2013b), predicting the choice for one of the verbs. As this model underlies the self-paced reading task that is the focus of this chapter, we will describe the regression modelling in some detail.

Type of variable / Variable name: Variable level name

morphological
  tense: future, present, past, not applicable
  mode: infinitive, indicative, imperative, participle, gerund, conditional
  aspect (of both finite and infinite verb): imperfective vs. perfective

syntactic
  subject structure: nominative to the tentative verb, nominative to the preceding verb, accusative to the preceding verb, dative to the preceding “personal” verb, dative to the preceding “impersonal” verb, dative to the tentative verb, the subject is the infinitive tentative verb, the infinitive tentative verb modifies a noun
  sentence type: declarative, interrogative, imperative, exclamation
  clause type: main clause, subordinate clause

semantic
  semantic type of subject: concrete vs. abstract, animate (human, animal) vs. inanimate (event, phenomenon of nature, body part, organization/institution, speech/text) etc.
  properties of the process denoted by the verb: physical, physical involving another, physical exchange/transfer, physical motion, physical motion involving another, physical figurative, physical figurative involving another, figurative physical exchange/transfer, figurative physical motion, figurative physical motion involving another, perceptual, perceptual active, communication/interaction, mental, emotional
  controllability of the infinitive action: high vs. medium vs. no controllability
  adverbial specification: duration (dolgo “long”, dolgoe vremja “a long time”…), durative repetition (vsë “all (the time)”, vsë vremja “all the time”…), repetition (… raz “(…) times”), intensity (očen’ “very”, izo vsech sil “with all one’s might”…), vainness/futility (zrja, naprasno, tščetno “in vain”…), intensity & vainness (kak ni/ne … “however”)
  particles: exhortation (davaj … “let’s, come on”), permission (pust’ … “let”), restriction (tol’ko … “only, just”), permission & restriction (pust’ tol’ko … “let … only”), intensification (daže … “even”), untimely halt (bylo)
  connectors: external opposition (no, a, i ne), internal opposition (no, a, i ne), introducing a čtoby “in order to” clause, in a čtoby “in order to” clause
  negation: present vs. absent; to the tentative verb, to the infinitive

Table 1: Variables used in the annotation of the corpus sample

As a rule of thumb, the number of distinct variable combinations that allow for a reliable fitting of a (polytomous) logistic regression model should not exceed 1/10 of the frequency of the least frequent outcome (Arppe, 2008, p. 116). In this case, the least frequent verb occurs about 150 times; hence the maximum number of variable categories should be approximately 15. The selection strategy we adopted (out of many possible ones) was to retain variables with a broad dispersion among the 6 TRY verbs. This ensured focus on the interaction of variables in determining the expected probability in context rather than allowing individual distinctive variables, linked to only one of the verbs, to determine the choice alone. As selection criteria we required the overall frequency of the variable in the data to be at least 45 and the variable to occur at least twice (i.e. not just a single chance occurrence) with all 6 TRY verbs. Additional technical restrictions excluded one variable for each fully mutually complementary case (e.g.
the aspect of verb form: if a verb form is imperfective it cannot at the same time be perfective and vice versa) as well as variables with a mutual pair-wise Uncertainty Coefficient (UC) value (a measure of nominal category association; Theil, 1970) larger than 0.5 (i.e. one variable reduces more than ½ of the uncertainty concerning the other). Altogether 18 variable values were retained (11 semantic and 7 structural), belonging to 7 different variable types. These are listed in table 2. The model specification thus by and large consists of TAM markings on the verbs and semantic properties of the infinitive.

Property
1 declarative sentence
2 human agent
3 try verb in main clause
4 try verb in perfective aspect
5 try verb in indicative mood
6 try verb in gerund
7 try verb in past tense
8 subordinate verb in imperfective aspect
9 subordinate verb involves high control
10 infinitive designates an act of communication
11 infinitive designates an act of exchange
12 infinitive designates a physical action involving self
13 infinitive designates a physical action involving another participant
14 infinitive designates motion involving self
15 infinitive designates motion involving another participant
16 infinitive designates metaphorical motion
17 infinitive designates metaphorical exchange
18 infinitive designates metaphorical action involving another participant

Table 2: Predictors used by the Divjak and Arppe (2013) model

Using the values of these variables as calculated on the basis of the data in the sample, the model predicts the expected probability for each verb in each sentence. More interesting from a linguistic perspective, the model tells us how strongly each property individually is associated with each verb (e.g.
norovit’ and especially poryvat’sja are strongly preferred when the infinitive describes a motion event while pytat’sja, starat’sja and silit’sja are dispreferred in this context; probovat’ does not have a preference one way or the other). This enables us to characterize each verb’s preference(s) (Divjak, 2010; Arppe & Divjak, 2013; Arppe, 2013b).

Assuming that the model “chooses” the verb with the highest predicted probability (though strictly speaking a logistic regression model is attempting to represent the proportions of possible alternative choices in the long run), its overall accuracy is 51.7% (50.3% when tested on unseen data) and resampling techniques confirm this. This is well above chance: since there are six verbs, chance performance would have been at 16.7%. Verb-wise model predictions are provided in table 3 (rows: original verb; columns: predicted verb): the highest values are on the diagonal, i.e. each verb is most often predicted as itself.

              norovit’  poryvat’sja  probovat’  pytat’sja  silit’sja  starat’sja  [original]
norovit’           143           32          4         36         17          18         250
poryvat’sja         22           57          1         19          8          12         119
probovat’            8            8        189         16          5          20         246
pytat’sja           44           21         47         73         35          27         247
silit’sja           23           22          0         30        152          14         241
starat’sja          34           13         45         26         45          85         248
[predicted]        274          153        286        200        262         176        1351

Table 3: Model accuracy

Table 4 summarizes the verb-specific odds per property for all six Russian verbs (details can be found in Divjak & Arppe, 2013). Cells with a “+” signal significantly positive odds, i.e. in favour of the occurrence of a lexeme; cells with a “0” are neutral, i.e. do not favour or disfavour a specific verb, whereas cells with a “-” indicate odds for properties significantly against a lexeme. Overall, infinitival semantics play a significant role for norovit’ and to a lesser extent for poryvat’sja, but are much less relevant for the other four verbs: they play hardly any role for probovat’ and starat’sja and some seem to be repelling pytat’sja and silit’sja.
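The overall accuracy and chance level quoted for table 3 can be recomputed directly from the counts in that table. The short sketch below does so; the variable names are ours, and the per-verb proportion it prints (correct predictions divided by the number of original occurrences) is an additional illustration rather than a figure reported in the chapter.

```python
verbs = ["norovit'", "poryvat'sja", "probovat'",
         "pytat'sja", "silit'sja", "starat'sja"]

# Rows: original verb; columns: predicted verb (counts from table 3).
confusion = [
    [143, 32,   4, 36,  17, 18],
    [ 22, 57,   1, 19,   8, 12],
    [  8,  8, 189, 16,   5, 20],
    [ 44, 21,  47, 73,  35, 27],
    [ 23, 22,   0, 30, 152, 14],
    [ 34, 13,  45, 26,  45, 85],
]

correct = sum(confusion[i][i] for i in range(len(verbs)))  # diagonal
total = sum(sum(row) for row in confusion)
accuracy = correct / total
chance = 1 / len(verbs)

# Share of each original verb that the model predicts as itself.
per_verb = {verb: confusion[i][i] / sum(confusion[i])
            for i, verb in enumerate(verbs)}

print(f"accuracy = {accuracy:.3f}, chance = {chance:.3f}")
for verb, share in per_verb.items():
    print(f"{verb}: {share:.2f}")
```

The diagonal sums to 699 out of 1351 sentences, which gives the 51.7% stated in the text.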
If we take a specific property, such as main clause (CLAUSE.MAIN), we see that it has significant positive odds in favor of probovat’, neutral ones for silit’sja, starat’sja, norovit’ and poryvat’sja, and significant odds against pytat’sja. Moreover, the odds of having a perfective finite verb (FINITE.ASPECT_PERFECTIVE) in favor of probovat’ may stand out: this is due to the fact that probovat’ is one of only three verbs that have a perfective counterpart, and the verb that occurs most frequently in the perfective aspect in the data.

Despite the relatively good fit we achieved, we have to face the inconvenient truth that “[w]henever we make a model [...], we are trying to force the ugly stepsister’s foot into Cinderella’s pretty glass slipper. It doesn’t fit without cutting off some essential parts.” (Derman, 2011). This realization has prompted a series of experimental studies that were designed to compare different aspects of the corpus analysis to different types of human behaviour.

4 Testing the predictions of the corpus-based model experimentally

Without going into detail we can say that, overall, the corpus-based models did well and mimicked subjects’ behavior on a range of tasks so “there must be something to them”.² Yet, the fact that the “resulting” states in model and speaker yield comparable results does not imply that they were arrived at by (exactly) the same means: the properties that play a role in capturing off-line knowledge of a phenomenon need not be the same as those guiding on-line processing of that same phenomenon. For this reason, we set out to capture time-bounded effects on sentence processing tasks such as reading (cf. Bresnan & Ford, 2010): are the effects of the corpus-based predictors that seem to play a role in off-line studies also active on-line, while processing language?
² See Divjak & Gries (2008) for gap-filling and sorting data on the clustered lexical model proposed in Divjak (2003) and Divjak & Gries (2006) and see Divjak et al. (2016) for forced choice and acceptability ratings data on the regression models described in Divjak (2010) and Divjak & Arppe (2013).

Regression models fit to corpus data (Divjak, 2010; Divjak & Arppe, 2013; summarized in section 3) show that Tense, Aspect and Mood (TAM) markers, often overlooked in lexical semantic studies, are strong predictors of choice when faced with 6 near-synonymous verbs expressing TRY. Unlike semantic properties, which seem to define 3 out of 6 verbs rather well (norovit’, poryvat’sja and probovat’), the 3 more frequent verbs (probovat’, pytat’sja, starat’sja, but also silit’sja) are defined by a combination of preferred and dispreferred TAM markers, as table 4 above shows. The effect of TAM marking came out as stronger for predicting the choice of TRY verbs than semantic properties of subject and infinitive action: using just TAM predictors vs. non-TAM predictors from the original model with 6 verbs, McFadden’s pseudo $R_L^2$ (the relative reduction in deviance (based on log-likelihood) gained by the model, in comparison to a null model) is substantially better for a model with TAM predictors at 0.219 vs. 0.129 for a model without, and the same applies for accuracy with 0.429 for a model with TAM predictors vs. 0.363 for a model without. That TAM marking would be important is, however, at the same time surprising and expected: TAM marking is not typically used to tell synonyms apart, but it is a reliable predictor as TAM marking is obligatory: the presence of TAM markers on every verb form increases the frequency with which these properties are encountered.
This is especially likely in Russian and other morphologically rich languages, for which it may be (more) cognitively unrealistic (than for morphologically poorer languages) to track words at the lexeme level rather than at the inflected/declined level. Sinclair (2001) advances the argument that collocations are also active at the word-form level, not only at the lemma level, and may indeed differ for various forms of the same lemma. Newman (2008) discusses support for low-level generalizations (studying linguistic behavior at the inflected level of words, as opposed to generalizing linguistic behavior at the lemma level) in corpus-based studies on language acquisition (not all word forms are acquired simultaneously), grammaticalization studies (grammaticalization can affect particular inflected forms only, e.g. the use of going to as progressive marker in English) and stylistics (where inflectional differences are typical for different genres). Psycholinguistic experimentation has confirmed that not all inflected forms of a lemma are associated with one and the same reaction time (Baayen et al. (1997) report storage for high-frequency noun plurals; Kostić and Havelka (2002) discuss different reaction times for different person and number forms of Serbian verbs in the future tense; Kostić and Mirković (2002) discuss the impact of inflectional forms of Serbian noun paradigms on reaction times).

4.1 Self-paced reading

To explore whether the factors identified on the basis of corpus analysis also play an active role in processing, and in particular whether a formal variable such as TAM, which has been rather neglected in lexical semantics, deserves more attention, we ran a self-paced reading task. In the self-paced reading task, participants are presented with a sentence one word at a time on a computer screen and must press a button to proceed to the next word; the exact timings of the button-presses are recorded.
An example stimulus is given in (1):

(1) И не пробуйте понимать
    I ne probujte ponimat’
    and not try.IMPF.IMPER-PL understand.IMPF-INF

    чужого счастья ― не поймете
    čužogo sčast’ja ― ne pojmete
    another’s.GEN.SG happiness.GEN.SG ― not understand.PF-IND.NON-PAST.2PL

    “Don’t try to understand someone else’s happiness: you can’t.”

The dominant interpretation (Kaiser, 2013) of what reading times reflect would have us expect that the verb with more probable TAM marking would require fewer resources during reading, so that processing complexity during reading would decrease on predicted high-probability TAM markings for a specific verb, resulting in quicker reading speeds. On this interpretation, we expect to find a negative correlation between probability of occurrence and reading times for TAM combinations, with more typical TAM markings leading to quicker reading times because they require less processing. However, there is an alternative interpretation of reading times which attributes a slowdown in reading speed to a sudden drop in parsing uncertainty (Hale, 2003; Levy, 2008).

Participants

We recruited 39 (17 male, 22 female) adult native speakers of Russian, aged between 18 and 31 (mean 23.6, s.d. 3.3) and currently living in St. Petersburg. The subjects were not linguists, philologists or language students and, except for one subject, had never before participated in a (psycho-)linguistic experiment of any kind.

Materials

The 3 verbs used, probovat’, pytat’sja and starat’sja, are the most frequent and neutral ones. Of all TRY verbs, these three are the most similar to each other (Apresjan, 1999; Divjak, 2003; Divjak & Gries, 2006), so the differences between the verbs are very small. Preceding corpus research and experimental validation had provided a rich knowledge base, and this was used when selecting stimuli in which there were no known confounds. The following procedure was followed to select stimuli:

1. A full polytomous logistic regression model was run for the 3 verbs of interest, probovat’, pytat’sja and starat’sja.
2. We checked whether certain types of subjects or infinitives increased the preference for one of the 3 verbs and if they did, these subjects or infinitives were avoided in the experimental items. We found that:
   a. physical activities increase the chances of probovat’ being chosen
   b. mental activity, metaphorical motion activity, motion activity involving another participant, physical action involving another participant reduce the chances of starat’sja being chosen
   c. there was no effect of subject on any of the 3 verbs: all three verbs were neutral towards being combined with human animate subjects
3. We selected experimental sentences in the following way:
   a. we ran a model with TAM-related variables for the TRY verb and semantics for the infinitive: including infinitive semantics in the model gives us more precise information about the reading speed to expect since every sentence will include an infinitive. The infinitive semantics does not affect the probabilities significantly, but does tweak them; without the infinitive, all probabilities would be the same for one specific TAM combination.
   b. for each of the 9 existing TAM combinations we selected the top sentences in terms of probability estimates for all three verbs: we took those sentences that keep us closest to what the probabilities would be without us knowing what the infinitive is like, which controls for the effect of infinitive semantics.
      i. imperfective indicative past
      ii. imperfective indicative present
      iii. imperfective gerund present
      iv. imperfective indicative future
      v. perfective indicative past
      vi. perfective gerund past (not attested in our database)
      vii. perfective indicative future
      viii. imperfective imperative
      ix. perfective imperative
4. The list of stimuli was compiled as follows:
   a. We selected 3 examples for each of the 3 verbs for all 8 TAM combinations that were attested in our dataset. Although 9 combinations exist, for the perfective past gerund no cases were attested in our data. This means that no context and hence no probability estimates were available and this TAM combination was excluded from the experiment. Some combinations did not have sufficiently many attestations in our data, and for those we consulted the RNC; in all, 18 RNC examples with the same contextual properties as specified by the model were added to the dataset while 54 stem from the annotated corpus sample.
   b. These examples were divided over 3 experimental sets: set 1 gets 1st examples; set 2 gets 2nd examples; set 3 gets 3rd examples. We ensured that the imperfective future and the infinitive semantics were evenly distributed over all three sets. A third of all sentences was followed by a yes/no question that the subjects had to answer.
   c. Every participant was presented with 1 example for each verb-by-TAM combination. These examples were interspersed with 24 filler items containing verbs of perception. The set was preceded by 5 practice sentences and randomized automatically for each subject.

This set-up deviates from the traditional approach to self-paced reading experiments in three important ways:

1. We used an imbalanced design to accommodate the natural restrictions on TAM combinations, e.g. no present perfectives exist and none were created for the experiment.
2. We ran the task with actually attested sentences rather than artificially created ones. Because we used authentic examples, all stimuli were possible, albeit more or less likely.
3. Working with authentic sentences also introduced variation in the position the TRY verb occupied in the sentence. This is exacerbated by the fact that Russian, like other Slavonic languages, lacks a strict word order.
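The arithmetic behind the stimulus list in step 4 above can be checked with a short sketch. It only enumerates the design cells (3 verbs, the 8 attested TAM combinations, 3 examples per cell) and splits them into the three sets by example rank; the labels and variable names are ours, and the sketch does not model how the actual sentences were chosen or balanced.

```python
from itertools import product

verbs = ["probovat'", "pytat'sja", "starat'sja"]
tam_combinations = [
    "imperfective indicative past",
    "imperfective indicative present",
    "imperfective gerund present",
    "imperfective indicative future",
    "perfective indicative past",
    "perfective indicative future",
    "imperfective imperative",
    "perfective imperative",
]  # perfective gerund past is unattested and therefore left out
examples_per_cell = 3

# One item per verb x TAM combination x example rank (1st, 2nd, 3rd).
items = [(verb, tam, rank)
         for verb, tam in product(verbs, tam_combinations)
         for rank in range(1, examples_per_cell + 1)]

# Set k receives the k-th example of every verb-by-TAM cell.
sets = {k: [(verb, tam) for verb, tam, rank in items if rank == k]
        for k in range(1, examples_per_cell + 1)}

n_fillers, n_practice = 24, 5
trials_per_participant = len(sets[1]) + n_fillers

print(len(items), {k: len(v) for k, v in sets.items()})
print(trials_per_participant, "trials after", n_practice, "practice sentences")
```

The 72 items match the 18 RNC examples plus the 54 examples from the annotated sample, and each participant sees 24 experimental sentences mixed with 24 fillers.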
Rather than controlling for the variation introduced by relying on authentic data, which runs the risk of observing how atypical language is processed, we preferred to embrace this variation and incorporate it as an integral element into the analysis by using regression modelling techniques, as explained in sections 4.2 and 4.3.

Procedure

The experiments were run in a quiet room at the Institute for Linguistic Studies of the St Petersburg branch of the Russian Academy of Sciences. Participants provided personal information prior to attending using Google forms. They had also been sent information sheets and consent forms and were offered the opportunity to read those again at the testing location, where the documents were also signed and handed over to the experimenter. The self-paced reading task was programmed in PsychoPy. A Cedrus response pad was connected to a Windows 7 laptop (Intel i5 core) with Nvidia graphics card; all subjects completed the task individually on the same laptop. The presentation used a word-by-word template without placeholders. The self-paced reading task was preceded by a serial reaction time task and followed by a digit span task that are not described in this chapter. All subjects were debriefed after the session.

4.2 Data analysis: mixed effects linear regression model

A first set of models to explain the time spent reading the TRY verb was run using Generalized Linear Mixed effects Regression Modelling (GLMM) (e.g. Baayen et al., 2008), using the R package lme4 (Bates et al., 2015). Generalized Linear Mixed effects Regression Modelling is an extension of Generalized Linear Modelling in which the predictor effects are divided into fixed and random effects. Fixed effects represent the variables and their interactions about which we are interested in making inferences beyond our sample to the entire population.
In contrast, random effects are variables which we presume to represent the effects of gathering the (random) sample we have (such as individual differences between speakers, experimental factors that may not be representative of the entire population of speakers or of the phenomenon of interest in its entirety), and the distorting impacts of which we want to minimize when drawing inferences about the fixed effects.

During the data preparation stage, some observations were excluded for the following reasons: data from 2 female subjects had to be discarded because the software crashed half-way through the reading task; 3 stimuli were excluded because they used a periphrastic future, which removed the tense marking from the TRY verb. Following standard procedure, all responses that took less than 0.05 seconds or were more than 2 standard deviations removed from the mean were excluded. Baayen & Milin (2010) discuss some of the implications of this approach, and we mention explicitly here that our findings change with the more cautious trimming of datapoints advocated in Baayen & Milin (2010). Numerical variables were log-transformed.

Probability of occurrence, as specified by the regression model described in section 3, was used to predict either the (logarithm of the) reading time on the TRY verb, the (logarithm of the) reading time on the word following the TRY verb (i.e. the infinitive) or the residualized (logarithm of the) reading time on the infinitive; the latter cases can be considered as “spill-over” effects of reading the TRY verb. Control variables were length of the word, position of the word (in the sentence) and (except in the residualized model) reading time on the previous word. Two variants were run of each model: one in which control variables were included in the fixed effects structure, and one whereby they were included in the random effects structure.
The latter approach gives us an idea of what factors in general seem to affect reading time, whereas the former tries to focus on the effect of prescribed factors in the apparent immediate textual, linguistic context, with the effect of the random factors “neutralized”; below we show the results of the latter models. The random effects structure always included at least subject, item and index.

The base model with probability of occurrence predicting reading time on the TRY verb showed a negative correlation between reading time and probability, with higher probability TAM combinations facilitating reading. In other words, TRY verbs were read slightly more quickly if encountered in their expected form. Yet, as soon as variables were entered that are typically used as control variables in the analysis of reading times data, in particular length of the TRY verb and position of the TRY verb in the sentence, the inverse correlation effect between reading time and probability, though still seemingly present, became overshadowed by the effects of the control variables and therefore non-significant.

Linear mixed model fit by REML
Formula: log(RT) ~ Probability + (1 | Participant) + (1 | TRY verb) + (1 | Position of TRY verb in sentence) + (1 | Length of TRY verb) + (1 | Position of sentence in experiment)

Random effects
Groups             Name         Variance    Std.Dev.
Participant        (Intercept)  1.0302e-01  0.32097289
Sentence position  (Intercept)  2.6211e-03  0.05119630
Verb position      (Intercept)  2.7466e-03  0.05240846
Verb Length        (Intercept)  3.6840e-04  0.01919366
TRY verb           (Intercept)  3.0275e-07  0.00055022
Residual                        4.1120e-02  0.20277984

Fixed effects
             Estimate  Std. Error  t value
(Intercept)  -0.31577  0.05756     -5.486
Probability  -0.01277  0.03232     -0.395

There was some evidence of a spill-over effect as there was a stronger negative correlation with reading time on the word following the TRY verb, which is typically the infinitive.
Although a stronger effect on the infinitive would be expected on linguistic grounds (cf. Divjak 2004/2010, who showed that the distribution of TRY verbs is distinct from that of typical main verbs and instead resembles “light” verbs such as modals and phasals), this effect too failed to reach significance.

Linear mixed model fit by REML
Formula: log(RTspillOver) ~ Probability + (1 | Participant) + (1 | TRY verb) + (1 | Position of TRY verb in sentence) + (1 | Length of TRY verb) + (1 | Position of sentence in experiment)

Random effects
Groups             Name         Variance    Std.Dev.
Participant        (Intercept)  0.11943454  0.345593
Sentence position  (Intercept)  0.00331632  0.057588
Verb position      (Intercept)  0.00068924  0.026253
Verb Length        (Intercept)  0.00000000  0.000000
TRY verb           (Intercept)  0.00000000  0.000000
Residual                        0.04449880  0.210947

Fixed effects
             Estimate  Std. Error  t value
(Intercept)  -0.31709  0.06000     -5.285
Probability  -0.02039  0.03169     -0.643

Overall, control variables such as the reading time of the previous word, the length of the TRY verb and the position of the sentence in the experiment all explain more about the speed with which the TRY verbs are read than does the probability of TRY verb occurrence given the TAM marking.

In a third set of linear models residualization was performed because of autocorrelation issues: the (logarithm of the) reading time on the word preceding the TRY verb and the position of the TRY verb in the sentence account for 72% of the variance in the time spent reading the TRY verb. For the residualization, we used the (logarithm of the) (first pass) reading time on the word preceding the TRY verb and the position of the verb in the sentence as predictors.
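The residualization step described above can be sketched as an ordinary least-squares regression of the log reading time on the two nuisance predictors, keeping what is left over as the new response. The analyses in this chapter were run in R; the Python version below is only an illustration under our own assumptions: the function names are hypothetical, and the data are simulated rather than taken from the experiment.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def residualize(y, predictors):
    """Return y minus its least-squares fit on an intercept plus the predictors."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


# Simulated (hypothetical) data: log RT on the previous word, verb position,
# and a log RT on the TRY verb that depends on both.
random.seed(1)
n = 200
log_rt_prev = [random.gauss(-0.3, 0.3) for _ in range(n)]
position = [float(random.randint(2, 12)) for _ in range(n)]
log_rt = [0.8 * p + 0.01 * s + random.gauss(0, 0.1)
          for p, s in zip(log_rt_prev, position)]

log_rt_resid = residualize(log_rt, [log_rt_prev, position])
print(sum(log_rt_resid) / n)
```

By construction the residuals average zero and are uncorrelated with both nuisance predictors, so any remaining effect of probability cannot be attributed to them. Observations without a reading time for the preceding word have to be dropped before this step.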
Note that this dataset contained slightly fewer observations, 745 instead of 825, because in some cases the TRY verb was the first word in the sentence, meaning there is no preceding word with reading time, or because the reading time for the previous word was missing due to various reasons, e.g. having been skipped on first pass.

Linear mixed model fit by REML
Formula: logRTresid ~ Probability + (1 | Participant) + (1 | TRY verb) + (1 | Position of TRY verb in sentence) + (1 | Length of TRY verb) + (1 | Position of sentence in experiment)

Random effects
Groups             Name         Variance    Std.Dev.
Participant        (Intercept)  0.00850915  0.092245
Sentence position  (Intercept)  0.00010236  0.010117
Verb position      (Intercept)  0.00086404  0.029395
Verb Length        (Intercept)  0.00193519  0.043991
TRY verb           (Intercept)  0.00000000  0.000000
Residual                        0.05340367  0.231092

Fixed effects
             Estimate    Std. Error  t value
(Intercept)  -0.0004919  0.0293385   -0.017
Probability  -0.0299761  0.0386436   -0.776

There are a number of possible explanations for the lack of significance where one would expect it on the basis of previous research. These explanations concern the phenomenon and the stimulus selection, the corpus data annotation, the experimental paradigm, the sample size and the assumptions underlying linear regression. We will address each of these points in turn.

First of all, different from other experimental work, the task our subjects faced was virtually “impossible”: of the nine TRY verbs at our disposal, the three verbs we were targeting all belong to the same cluster (cf. Divjak, 2003; Divjak & Gries, 2006, 2008), meaning they are the most similar from among a group of 9 synonymous verbs. This makes it very hard to find properties that distinguish between these verbs.
The already subtle differences were further obscured because we decided to work with authentic experimental items: all stimuli were attested in our corpus, meaning that all contexts we provided “fit” the verbs: some fit better, some fit worse (in comparison to the other two TRY verbs in that same context, or in comparison to all other contexts for that same TRY verb) but all are possible and had been genuinely produced by a speaker/writer. Secondly, the corpus model that produced the predicted probabilities did not know (or care) about verb length and position of the verb in the sentence; including this information might have changed the calculated probabilities. However, it could be argued that in the corpus sources which we used the authors had ample time to consider the composition of the sentences they wrote, thus being able to take into consideration the “linguistic goodness” of the entire context, i.e. the full sentence they were writing, including that part of the context following the TRY verb. Thirdly, a word-by-word self-paced reading paradigm may be too mechanistic to pick up the subtle differences in reading times we are expecting; it is possible for subjects to fall into a pattern whereby the button presses guide reading speed rather than measure reading speed. It could therefore be suggested that eye-tracking would be a more suitable experimental paradigm. Self-paced reading is, however, considered to be a robust technique, and reading latencies from both tasks correlate.³ Fourthly, given the subtlety of the effect, our sample size may have been too small: simulations show that we would need 100 times more data for the effect of probability on reading time to reach significance.
This ties in with Jones and Tukey’s (2000) criticism of null-hypothesis testing: given that eventually, we are likely to observe an effect, the question really becomes whether or not the effect is apparent enough in the data we have, and whether that effect is meaningful in terms of theory-based predictions and common sense. A final explanation for the lack of a significant effect where one would expect it on the basis of previous research relates to the assumptions underlying the statistical modelling technique we used. The assumption underlying a GLMM is that the effects are linear and continuous; a linear model can fail to pick up significant effects that are non-linear in nature. In the next section, we explore how such non-linearities in the data can be dealt with and modelled in more detail.

3 Miwa et al. (2014) report that in case of multiple fixations, typically, the later fixations tend to be more similar to lexical decision latencies.

4.3 GAMM Mixed effects additive models

In a second modelling round we considered a non-linear treatment of reading times using Generalized Additive Mixed Models (GAMM). GAMMs are an extension of the linear mixed model that make it possible to model a response variable as a nonlinear function of one or more predictor variables, using, e.g., thin plate regression splines. GAMMs have recently been applied successfully to linguistic and psycholinguistic data ranging from dialectometry (Wieling et al., 2011) to electromagnetic articulography (Tomaschek et al., 2014) and from EEG data (Kryuchkova et al., 2012) to pitch contours (Koesling et al., 2012). The software available in the mgcv package for R by Wood (Wood, 2006; Wood, 2011) offers a wide range of statistical tools for the modelling of fixed-effect factors, random effects, covariates, and their interactions.
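The point made at the end of the previous section, that a linear model can miss an effect that is non-linear in nature, can be illustrated with a toy example. The data below are synthetic, and a quadratic polynomial is used as a simple stand-in for the spline smooths that mgcv fits; neither is taken from the study.

```python
import numpy as np

# Synthetic U-shaped relation between a predictor and a response (an assumption).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
y = x**2 + rng.normal(0.0, 0.05, x.size)

def r_squared(observed, fitted):
    """Proportion of variance in `observed` accounted for by `fitted`."""
    ss_res = np.sum((observed - fitted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# A straight line versus a curve that is allowed to bend.
linear_fit = np.polyval(np.polyfit(x, y, 1), x)
quadratic_fit = np.polyval(np.polyfit(x, y, 2), x)

r2_linear = r_squared(y, linear_fit)
r2_quadratic = r_squared(y, quadratic_fit)
print(f"linear R^2: {r2_linear:.3f}, quadratic R^2: {r2_quadratic:.3f}")
```

Because the predictor is symmetric around zero, the straight line has a slope close to zero and explains almost nothing, although the predictor determines the response almost entirely. A smooth term addresses the same problem without fixing the shape of the curve in advance.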
Whereas the linear mixed model allows for the specification of a model in which a regression line Y = a + bX is modulated by Gaussian uncertainty for intercept and slope for a grouping factor F, effectively calibrating the regression line for each level of F, GAMMs offer the possibility to include a main effect of Y as a potentially nonlinear function of X, together with ‘random nonlinear curves’ for each level of F that are shrunk towards zero, under the constraint that these random curves have the same smoothing parameter. This is especially useful for modelling subject-specific variation in how participants perform in the course of an experiment. We ran the GAMM using package mgcv 1.8-5 (Wood, 2015) in R 3.1.3 (March 2015). The specification for the best model, using the same dataset as for the GLMM models, is given below. It includes the length of the TRY verb as parametric coefficient, a smooth for the position of the sentence in the experiment, a factorial smooth for participant by position of the sentence in the experiment as well as an interaction between the probability of the verb given the TAM marking and the rank of the sentence in the experiment as tensor product. All terms contribute significantly to an explanation of the time it takes subjects to read the TRY verb in question.

Formula: log(RT) ~ s(Position) + te(Probability, trial_order) + s(trial_order, participant, bs="fs", m=1) + CriticalLength

Parametric coefficients
                Estimate   Std. Error  t value  Pr(>|t|)
(Intercept)     -0.411802  0.068319    -6.028   2.64e-09 ***
CriticalLength   0.011211  0.004865     2.304   0.0215 *

Approximate significance of smooth terms
                             edf     Ref.df   F      p-value
s(Position)                  4.420   5.296    5.517  3.96e-05 ***
te(Probability,trial_order)  4.224   4.649    7.007  5.32e-06 ***
s(trial_order,participant)   84.040  332.000  6.651  <2e-16 ***

R-sq.(adj) = 0.737   Deviance explained = 76.7%   fREML = 54.623   Scale est. = 0.037683   n = 825

The significance of the interaction between probability and position of the sentence in the experiment confirms that TAM marking is picked up by native speakers and plays a role in on-line processes as captured by a self-paced reading task: words are read more quickly if presented with their distinctive TAM marking. Yet, the three-way interaction also signals that the readers’ reaction to probability is not uniform throughout the task. In order to understand what is happening, we plot the interaction. In Figure 1 the tensor interaction is plotted with probability on the X axis and the rank of the sentence in the experiment on the Y axis. The colours on the graph represent reading times and shade from pale pink in the bottom right corner over deep pink to yellow and green in the top right corner. The values on the isolines are the logarithms of the reading times. The value on the isoline around the white area in the bottom right corner being -0.38 (the logarithm of a reading time of 0.68 seconds) and the value on the isoline around the green area in the top right corner being -0.56 (the logarithm of a reading time of 0.57 seconds), white signals longer reading latencies and green shorter reading latencies.
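The isoline values quoted above can be converted back to reading times with a few lines of code. The sketch assumes natural logarithms and reading times in seconds, which is consistent with the two pairs of figures given in the text; the function name is hypothetical.

```python
import math

def log_rt_to_seconds(log_rt):
    """Back-transform a natural-log reading time to seconds."""
    return math.exp(log_rt)

# Isoline values mentioned in the description of Figure 1.
for log_rt in (-0.38, -0.56):
    print(f"{log_rt:+.2f} -> {log_rt_to_seconds(log_rt):.2f} s")
```

The difference between the two isolines is thus a little over a tenth of a second per word, which gives a sense of how small the effects under discussion are.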
The subjects’ behaviour changes half-way through the experiment: while they start out reading slowly and in fact reading verbs with expected TAM marking slightly slower than verbs with unexpected TAM marking, they end the experiment reading quickly and reading verbs with expected TAM marking most quickly, as predicted. Although it is typical for subjects to change pace during an experiment (some become faster, others slower) the pattern exhibited by their reading behaviour does not display the U-shape often seen. We have pointed out that there are two competing interpretations of reaction times. How do these account for the effect we observe? On the standard interpretation (Kaiser, 2013) longer reading latencies signal processing difficulty, while shorter reading latencies signal processing ease. This would mean that our subjects found verbs with highly likely TAM marking at first slightly more difficult to process than verbs with unlikely TAM markings, which goes against the majority of findings in this area.

Figure 1: Interaction of probability of TRY verb given TAM marking and rank of the sentence in the experiment in the GAM Model

Given that there were three different sets of experimental items and that the order of the sentences was randomized for each subject, this effect cannot have been caused by inaccurate predictions of the regression model or unfortunate lexical side-effects in the stimuli used at the beginning of the experiment. Then why did our subjects slow down when encountering TRY verbs in their preferred form? The alternative interpretation of what reading times show suggests that a slowdown in reading speed signals a large change in surprisal (Hale, 2003; Levy, 2008). This would mean that our subjects were able to make sense of the sentence they were reading only when they reached a familiar item, i.e. the TRY verb in its preferred form. This seems like a more plausible interpretation: subjects were reading authentic literary sentences, but without wider context. They did not know anything about the background of the situation described in the sentence they were reading and may well have clicked through the first words in the sentence, collecting information, until a familiar word appeared and they were in a position to start integrating that information. Since the subjects knew that some sentences were followed by a yes/no question that they needed to answer, it is plausible to assume that they will have paused to integrate information and to prepare for the question. The novelty of reading literary sentences without further context and the preparation in anticipation of what turned out to be relatively straightforward questions will have worn off over the course of the experiment and about half-way through (to be precise: after 5 training sentences, 12 filler items and 12 experimental items), subjects would have learnt to expect some kind of light verb expressing attempt or perception.
This would have made it possible for them to start reading in a more natural way, skipping more quickly over expected words. This delay in effect may have been exacerbated by the fact that our subjects came from a prescriptive linguistic tradition and from an instruction-based rather than inquiry-based educational system. Both factors contribute to a desire to do well in test situations.

5 Looking back, looking forward and looking outward

The impetus for the research question discussed in this chapter comes from the morphological richness typical of Slavic languages. The fact that the bulk of corpus- and psycholinguistic research is done on English, which is a morphologically poor language, has led researchers to assume that the lemma level suffices as a guide for annotation and for claims about representation. Data from a morphologically rich language like Russian with abundant inflectional markings show that this finding is a side-effect of the properties of the language studied: if inflectional markings are present, they are detected and used by speakers in on-line processing, as witnessed by an increase in reading speed on TRY verbs carrying their distinctive TAM marking. This finding highlights the extent of the knowledge speakers have of distributional patterns typical of their language and encourages linguists to think of language less in terms of inventories of items that can be freely combined in an unlimited number of ways, and more in terms of prefabricated chunks that are more or less expected given the context. The recognition of the existence of degrees of meaning difference and the interest in pinpointing the source of these fine-grained differences is clearly reminiscent of the Russian semantic tradition. This interest underlies the corpus analysis on which the on-line experiment was built. Advantages of basing experiments on extensive corpus-linguistic research include having access to a very rich knowledge base that increases one’s chances of avoiding confounds, because one knows what to expect and has a good theoretically motivated and empirically supported idea of why that should be so. Admittedly, a corpus-based approach of the type described in this chapter is a labour-intensive endeavour that furthermore complicates the experimental design and makes the use of advanced statistical methods necessary. Yet, at the same time, it is economical in that it captures subjects’ reactions to a much wider range of possible contexts (i.e. variable combinations) than is typically the case in an experiment, and it provides a very high level of control, not only over the frequency with which the words in the sentences occur, but also, and crucially, over the likelihood of having words co-occur with each other. That being said, one piece of information that corpus linguists should consider including in their models is the position of the word of interest in the sentence: information may well be structured differently for different verbs. Considering the relation between a verb and the way in which information is structured is especially important if the preceding context will be cut out in the experimental presentation, such as in a linear reading experiment with piecemeal exposure. The word-by-word presentation is not natural and the clicking might pace reading speed rather than record it. Nevertheless, the task is relatively intuitive for subjects and avoids uncertainty about what is causing longer reading time when chunks are presented (looking back/forward).
Although self-paced reading is a robust task yielding reaction times, which are a type of data about which much is known and which are strongly correlated with (first pass) reading times from eye-tracking, the ecological validity of on-line tasks would benefit from including information structure in experimental stimuli and presenting the words of interest in the position in the sentence which seems most natural for them and in a larger context: even if the sentences themselves are authentic and extracted from a corpus, having a TRY verb without preceding context may well be unnatural as there is no information on who is trying, why, and what (cf. Roland & Jurafsky, 2002). On a practical level, our data shows that subjects’ prior experience and cultural background need to be taken into account when running experiments: subjects who are not accustomed to (psycholinguistic) experiments, who come from prescriptive linguistic traditions and/or from instruction- rather than inquiry-based educational systems may need to be given a longer time to practice to overcome subconscious barriers and start to show natural behaviour. Unless we address these issues, we risk continuing to measure what people do when things are not as they normally are and we might miss an effect that is indeed present in the data. In our case, we needed a powerful statistical technique, generalized additive mixed effects regression modelling, to detect the expected relation between probability and reading time. Yet, the algorithms underlying standard statistical classifiers such as regression techniques were not designed to mimic human learning. Although they show good prediction accuracy, the drawback is that they yield cognitively unrealistic models that are of limited interest to usage-based linguists from a theoretical point of view. Are probabilities the proper constructs to capture the processes that are at work? Research in progress (Milin et al.)
re-models this same data using a biologically and cognitively plausible model of learning, the Naïve Discriminative Learner (Baayen et al., 2011), to answer a number of questions raised in this chapter. First, is it truly the probability of an abstract semantic category that is driving the behaviour that we see in this self-paced reading task, or is it the distinctive cues in the orthographic input that support TAM? Second, if it is a probability, how would the probability be learned? Could an approach based on discrimination learning shed light on this process? Third, why is there an interaction between probability and position of the sentence in the experiment? Doesn’t this suggest learning in the course of the experiment? And if so, what are our subjects learning? And finally, how can we obtain further insight into and evidence for the Hale (2003)/Levy (2008) interpretation of a slow-down in reading latencies that our data seems to be supporting, but that has not been the dominant interpretation in the literature? We leave these questions to future research.

References

Ambridge, B., Pine, J.M., Rowland, C.F., & Chang, F. (2012). The roles of verb semantics, entrenchment, and morphophonology in the retreat from dative argument-structure overgeneralization errors. Language, 88(1), 45-81.
Antić, E. (2012). Relative frequency effects in Russian morphology. In S.Th. Gries & D. Divjak (Eds.), Frequency effects in language learning and processing (Vol. 1) (pp. 83-108). Berlin, Boston: De Gruyter Mouton.
Apresjan, J.D. (1995 [1974]). Izbrannye trudy. Tom I. Leksičeskaja semantika: sinonimičeskie sredstva jazyka. Moskva, Rossija: Škola “Jazyki Russkoj Kul’tury”.
Apresjan, J.D. et al. (1999). Novyj ob’’jasnitel’nyj slovar’ sinonimov russkogo jazyka (Vol. 1). Moskva, Rossija: Škola “Jazyki Russkoj Kul’tury”.
Arppe, A. (2008). Univariate, Bivariate and Multivariate Methods in Corpus-Based Lexicography ― A Study of Synonymy (PhD Dissertation). Retrieved May 28, 2015 from https://helda.helsinki.fi/handle/10138/19274
Arppe, A. (2013a). Package polytomous: Polytomous logistic regression for fixed and mixed effects. R package version 0.1.6. Retrieved from http://CRAN.R-project.org/package=polytomous
Arppe, A. (2013b). Extracting exemplars and prototypes: R vignette to accompany Divjak & Arppe. Retrieved from http://cran.r-project.org/web/packages/polytomous/vignettes/exemplars2prototypes.pdf
Arppe, A., & Järvikivi, J. (2007). Every method counts: Combining corpus-based and experimental evidence in the study of synonymy. Corpus Linguistics and Linguistic Theory, 3(2), 131-159.
Baayen, R.H., Davidson, D.J., & Bates, D.M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.
Baayen, R.H., Dijkstra, T., & Schreuder, R. (1997). Singulars and plurals in Dutch: Evidence for a parallel dual route model. Journal of Memory and Language, 36, 94-117.
Baayen, R.H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3(2), 12-28.
Baayen, R.H., Milin, P., Đurđević, D.F., Hendrix, P., & Marelli, M. (2011). An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118(3), 438-481.
Baayen, R.H., Janda, L.A., Nesset, T., Dickey, S., Endresen, A., & Makarova, A. (2013). Making choices in Russian: Pros and cons of statistical methods for rival forms. Russian Linguistics, 37, 253-291.
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). lme4: Linear mixed-effects models using Eigen and S4: R package version 1.1-8. Retrieved from http://CRAN.R-project.org/package=lme4
Backhaus, K., Erichson, B., Plinke, W., & Weiber, R. (1996). Multivariate Analysemethoden: eine anwendungsorientierte Einführung (8th ed.). Berlin, Heidelberg, New York: Springer.
Biber, D., Conrad, S., & Reppen, R. (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Bresnan, J. (2007). Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In S. Featherston, & W. Sternefeld (Eds.), Roots: Linguistics in Search of Its Evidential Base (pp. 77-96). Berlin: Mouton de Gruyter.
Bresnan, J., Cueni, A., Nikitina, T., & Baayen, R.H. (2007). Predicting the Dative Alternation. In G. Bouma, I. Krämer & J. Zwarts (Eds.), Cognitive Foundations of Interpretation (pp. 69-94). Amsterdam: Royal Netherlands Academy of Science.
Bresnan, J., & Ford, M. (2010). Predicting Syntax: Processing Dative Constructions in American and Australian Varieties of English. Language, 86(1), 186-213.
Church, K.W., Gale, W., Hanks, P., & Hindle, D. (1991). Using statistics in lexical analysis. In U. Zernik (Ed.), Lexical acquisition: Exploiting on-line resources to build a lexicon (pp. 115-164). Hillsdale, NJ: Lawrence Erlbaum.
Church, K.W., Gale, W., Hanks, P., Hindle, D., & Moon, R. (1994). Lexical substitutability. In B.T.S. Atkins, & A. Zampolli (Eds.), Computational approaches to the lexicon (pp. 153-177). Oxford, New York: Oxford University Press.
Cruse, D.A. (2000). Meaning in Language: An Introduction to Semantics and Pragmatics. Oxford: Oxford University Press.
Derman, E. (2011). Models. Behaving. Badly: Why Confusing Illusion with Reality Can Lead to Disaster, on Wall Street and in Life. New York: Free Press.
Divjak, D. (2003). On trying in Russian: a tentative network model for near(er) synonyms. Slavica Gandensia, 30, 25-58.
Divjak, D. (2004). Degrees of Verb Integration. Conceptualizing and Categorizing Events in Russian (PhD Dissertation). Retrieved from Dept. of Oriental & Slavic Studies.
Divjak, D. (2006). Ways of Intending: Delineating and Structuring Near-Synonyms. In S.Th. Gries, & A. Stefanowitsch (Eds.), Corpora in cognitive linguistics: Corpus-based Approaches to Syntax and Lexis (pp. 19-56). Berlin, New York: Mouton de Gruyter.
Divjak, D. (2010). Structuring the lexicon: A clustered model for near-synonymy. Berlin, New York: Mouton de Gruyter.
Divjak, D., & Arppe, A. (2013). Extracting prototypes from exemplars. What can corpus data tell us about concept representation? Cognitive Linguistics, 24(2), 221-274.
Divjak, D., Dąbrowska, E., & Arppe, A. (2016). Machine meets man: Evaluating the psychological reality of corpus-based probabilistic models. Cognitive Linguistics, 27(1).
Divjak, D., & Gries, S.Th. (2006). Ways of Trying in Russian. Clustering Behavioral Profiles. Journal of Corpus Linguistics and Linguistic Theory, 2(1), 23-60.
Divjak, D., & Gries, S.Th. (2008). Clusters in the Mind? Converging evidence from near-synonymy in Russian. The Mental Lexicon, 3(2), 188-213.
Edmonds, Ph., & Hirst, G. (2002). Near-synonymy and Lexical Choice. Computational Linguistics, 28(2), 105-144.
Erker, D., & Guy, G.R. (2012). The role of lexical frequency in syntactic variability: Variable subject personal pronoun expression in Spanish. Language, 88(3), 526-557.
Firth, J.R. (1957). Papers in Linguistics 1934-1951. London: Oxford University Press.
Ford, M., & Bresnan, J. (2013a). 'They whispered me the answer' in Australia and the US: A comparative experimental study. In T.H. King, & V. de Paiva (Eds.), From Quirky Case to Representing Space: Papers in Honor of Annie Zaenen (pp. 95-107). Stanford, CA: CSLI Publications. Retrieved January 22, 2015 from http://web.stanford.edu/group/cslipublications/cslipublications/Online/azfest-final.pdf
Ford, M., & Bresnan, J. (2013b). Using convergent evidence from psycholinguistics and usage. In M. Krug, & J. Schlüter (Eds.), Research Methods in Language Variation and Change (pp. 295-312). Cambridge: Cambridge University Press.
Geeraerts, D. (1985). Preponderantieverschillen bij bijna-synoniemen. De nieuwe taalgids, 78, 18-27.
Gries, S.Th. (2003). Multifactorial analysis in corpus linguistics: a study of Particle Placement. London, New York: Continuum Press.
Gries, S.Th. (2006). Corpus-based methods and cognitive semantics: the many meanings of to run. In S.Th. Gries, & A. Stefanowitsch (Eds.), Corpora in cognitive linguistics: corpus-based approaches to syntax and lexis (pp. 57-99). Berlin, New York: Mouton de Gruyter.
Gries, S.Th., & Hilpert, M. (2010). Modeling diachronic change in the third person singular: A multifactorial, verb- and author-specific exploratory approach. English Language and Linguistics, 14(3), 293-320.
Grondelaers, S., & Speelman, D. (2007). A variationist account of constituent ordering in presentative sentences in Belgian Dutch. Corpus Linguistics and Linguistic Theory, 3(2), 161-193.
Hale, J. (2003). The information conveyed by words in sentences. Journal of Psycholinguistic Research, 32, 101-123.
Hanks, P. (1996). Contextual Dependency and Lexical Sets. International Journal of Corpus Linguistics, 1(1), 75-98.
Harris, Z. (1954). Distributional structure. Word, 10(2-3), 146-162.
Inkpen, D., & Hirst, G. (2006). Building and Using a Lexical Knowledge-Base of Near-Synonym Differences. Computational Linguistics, 32(2), 223-262.
Jones, L.V., & Tukey, J.W. (2000). A sensible formulation of the significance test. Psychological Methods, 5(4), 411-414.
Kaiser, E. (2013). Experimental paradigms in psycholinguistics. In R.J. Podesva, & D. Sharma (Eds.), Research Methods in Linguistics (pp. 135-168). Cambridge: Cambridge University Press.
Kendall, T., Bresnan, J., & Van Herk, G. (2011). The dative alternation in African American English: Researching syntactic variation and change across sociolinguistic datasets. Corpus Linguistics and Linguistic Theory, 7(2), 229-244.
Kjellmer, G. (2003). Synonymy and corpus work: on almost and nearly. ICAME Journal, 27, 19-27.
Klavan, J. (2012). Evidence in linguistics: corpus-linguistic and experimental methods for studying grammatical synonymy (Dissertation). Tartu: University of Tartu Press.
Klavan, J., & Divjak, D. (in press for 2016). Review article: The Cognitive Plausibility of Statistical Classification Models: Comparing Textual and Behavioral Evidence. In M. Hilpert, K. Krawczak, & M. Fabiszak (Eds.), Folia Linguistica [Special issue].
Koesling, K., Kunter, G., Baayen, R., & Plag, I. (2012). Prominence in triconstituent compounds: Pitch contours and linguistic theory. Language and Speech, 56(4), 529-554.
Kostić, A., & Havelka, J. (2002). Processing of verb tense. Psihologija, 35(3-4), 299-316.
Kostić, A., & Mirković, J. (2002). Processing of inflected nouns and levels of cognitive sensitivity. Psihologija, 35(3-4), 287-297.
Kryuchkova, T., Tucker, B.V., Wurm, L., & Baayen, R.H. (2012). Danger and usefulness in auditory lexical processing: evidence from electroencephalography. Brain and Language, 122, 81-91.
Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106, 1126-1177.
Milin, P., Divjak, D., & Baayen, R.H. (in progress). Reading Russian Synonyms. (working title).
Miwa, K., Libben, G., Dijkstra, T., & Baayen, R.H. (2014). The time-course of lexical activation in Japanese morphographic word recognition: Evidence for a character-driven processing model. The Quarterly Journal of Experimental Psychology, 67, 79-113.
Mondry, H., & Taylor, J. (1992). On lying in Russian. Language & Communication, 12(2), 133-143.
Newman, J. (2008). Aiming low in linguistics: Low-level generalizations in corpus-based research. Retrieved June 6, 2013 from http://www.johnnewm.org/downloads/
Raymond, W.D., & Brown, E.L. (2012). Are effects of word frequency effects of context of use? An analysis of initial fricative reduction in Spanish. In S.Th. Gries, & D. Divjak (Eds.), Frequency effects in language learning and processing (pp. 35-52). Berlin: Mouton de Gruyter.
Roland, D., & Jurafsky, D. (2002). Verb Sense and Verb Subcategorization Probabilities. In S. Stevenson, & P. Merlo (Eds.), The Lexical Basis of Sentence Processing: Formal, Computational, and Experimental Issues (pp. 325-346). Amsterdam, Philadelphia: John Benjamins.
Schmid, H.-J. (1993). Cottage und Co., idea, start vs. begin: Die Kategorisierung als Grundprinzip einer differenzierten Bedeutungsbeschreibung. Tübingen, Germany: Niemeyer.
Sinclair, J. (2001). Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Szmrecsanyi, B. (2013). Diachronic Probabilistic Grammar. English Language and Linguistics, 1(3), 41-68.
Taylor, J. (2003). Near synonyms as co-extensive categories: ‘high’ and ‘tall’ revisited. Language Sciences, 25, 263-284.
Theil, H. (1970). On the Estimation of Relationships Involving Qualitative Variables. The American Journal of Sociology, 76(1), 103-154.
Tognini-Bonelli, E. (2001). Corpus Linguistics at Work. Amsterdam, Philadelphia: John Benjamins.
Tomaschek, F., Tucker, B., Wieling, M., & Baayen, H. (2014). Vowel articulation affected by word frequency. Proceedings of 10th ISSP, 429-432.
Wieling, M., Nerbonne, J., & Baayen, H. (2011). Quantitative social dialectology: Explaining linguistic variation geographically and socially. PLoS ONE, 6(9).
Wolk, Ch., Bresnan, J., Rosenbach, A., & Szmrecsanyi, B. (2013). Dative and genitive variability in Late Modern English: Exploring cross-constructional variation and change. Diachronica, 30(3), 382-419.
Wood, S. (2006). Generalized Additive Models. An Introduction with R. Boca Raton, FL: CRC Press.
Wood, S. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B), 73, 3-36.
Wood, S. (2015). Package mgcv. R package version 1.8-5.
Retrieved from https://cran.r-project.org/web/packages/mgcv/index.html

One experiment — different languages: A challenge for the transfer of experimental designs. Examples from cross-linguistic and inner-Slavic research

Anja Gattnar

Abstract: This paper¹ is the first contribution to the field of cross-linguistic experimental research that tries to analyze the difficulties arising from translating an existing linguistic experimental design into other languages. It investigates the comparability of cross-linguistic experimental research. The paper presents a range of difficulties that may arise when we transfer experimental target sentences into various languages due to their respective specific language properties. A cross-linguistic investigation of grammatical and lexical aspects, particularly the comparison of Russian and Czech, and the comparison of Russian with a non-Slavic language like German, reveals problems with word order, the absence of articles, or a different use of the verbal aspect. The analysis of various experimental methods, like eye tracking or self-paced reading, shows how each method interacts with cross-linguistic differences.

1 Acknowledgments: The author would like to thank Tilman Berger and Stefan Heck for comments on earlier versions of the paper. Thanks also go to Oliver Bott. I would also like to thank Johanna Heininger, Tereza Navrkalová, and Anna Schmid for translating the items of the experiments. This research was funded by the German Research Foundation within project C2 of the Collaborative Research Center SFB 833: The Construction of Meaning.

1 Introduction

In empirical linguistic research, it is a common practice to adopt an experimental design or translate target items from one language to another. This is quite advantageous for the investigator if he or she wants to investigate similar linguistic phenomena, verify similar hypotheses, or compare two or more languages. The investigator who finds a good original experiment for his or her assumptions and linguistic research questions saves valuable time and can rely on items that have already been tested. However, many mistakes can be made regarding the stimulus material of cross-linguistic experiments. Cross-linguistic research deals with either the similarity or the dissimilarity of the compared languages. The aim of this paper is to identify some problems that can arise when linguists explore these cross-linguistic differences or similarities by using “identical” experimental designs. As Abubakar (2015, p. 230) mentioned, an experimental design can be adapted or transferred, or the items can be translated. If we opt for an adaptation, we need a “systematic evaluation of all aspects of a measurement instrument, and to modify it where needed in order to make it more suitable to the context.” In a linguistically driven adaptation, “the adaptations are carried out to accommodate differences in the language structure of the original test” (ibid.). This paper concentrates on the target items and the design of the particular experiments. Therefore, the question will be whether to translate or adapt the target material. Translating target items means, in my view, a literal word-by-word transfer of the sentence. Adapting target items means considering the grammatical or syntactic differences of the languages under investigation. Additionally, Au (1992, p. 372) asked: “How can a researcher be sure that the stimulus material in a cross-linguistic study is comparable in all of the investigated languages if the researcher is not fluent in those languages?” Other questions concern the data of cross-linguistic experiments: Can we ensure the comparability of the data? What kind of experimental requirements must be met to achieve comparable cross-linguistic data? This paper presents several approaches to answering these questions.
Selected experiments that investigate the use and processing of aspectual features in German, Russian, and Czech are discussed. The methodological difficulties described in this paper originated in my own research, which deals with the differences in aspectual use and processing between Russian and Czech on the one hand, and the differences in the processing of aspectual features between German and Russian on the other hand. First, the paper provides a very brief overview of the concept of verbal aspect in German, English, and Russian, in order to underline the complexity of the topic. Second, examples from the research literature show what kind of difficulties one may have to tackle during the transfer from one language to another. The examples deal with the differentiation of tense and aspect, problems with bare nominals in Russian due to the absent article, and finally, inner-Slavic differences in the verbal phrase and in aspectual functions and meaning. Third, the paper offers some practical examples of cross-linguistic experiments that examined the use and processing of the verbal aspect, viewed from the perspective of transferring an experimental design into several languages. In this paper, only online experiments are presented, because their data are particularly sensitive with respect to validity and reliability. Finally, the last section summarizes the findings and discusses a way out of the dilemma.

2 A tricky object

The verbal aspect is a universal phenomenon that exists in many languages with different parameter values. It expresses the properties of an event, action, or state in relation to time. In some languages, aspect can even be a grammatical category. Languages without a grammaticalized aspect, like German, determine aspectual meaning by context, for example, with adverbials or participles. Languages with a grammaticalized aspect have different realizations of this aspect.
In English, for example, a certain aspectual feature can be expressed by the progressive aspect. It is grammatically realized by the auxiliary be + -ing participle and “is used in a large variety of contexts” (De Wit & Brisard, 2014, p. 49).² In Russian and other Slavic languages, verbal aspect is a grammatical category. The verbal aspect is described as a grammatical distinction between the perfective (PV) and the imperfective (IPV) aspect. This distinction is reflected by morphologically paired verbs with different functions. These verb pairs are derived by both prefixes and suffixes:³ pisat’ “to write” IPV : napisat’ “to write” PV; otkryt’ “to open” PV : otkryvat’ “to open” IPV. In accordance with the Russian linguistic tradition (Jakobson, 1971), the PV aspect is the marked verbal aspect and the IPV aspect is the unmarked opposite. The PV aspect has some features that the IPV aspect lacks. Therefore, the PV aspect expresses completeness, telicity, and the limitation of an event, while the IPV aspect is used for incomplete, atelic, and unlimited situations, like processes or states (Zaliznjak & Šmelev, 2000). Lehmann (1999) summarized the basic canonical sentence functions for IPV as progressive, iterative, stative, and general-factual. The IPV aspect refers to progressive episodic situations as well as to non-episodic (for example, iterated or habitual) events or processes. The PV aspect, however, is used to indicate episodic events and holds a concrete-factual meaning.⁴

² For further discussion on the English progressive, see Comrie (1976), Dowty (1977), Palmer (1989), Leech (2004), and Declerck et al. (2006).
³ See also Vinogradov (1938), Švedova (1980), Bondarko (1983), Čertkova (1996), and Zaliznjak and Šmelev (2000).
⁴ Since the aim of this paper is to show the difficulties of adapting experimental material, I will not explain the Slavic aspect in more detail. It is covered in a vast literature on Slavic aspect. In this paper, I will mention only a very small (Russian-oriented) selection of works: Anstatt (2003), Avilova (1976), Bondarko (1971), Breu (1980, 2000), Dickey (2000), Galton (1976), Lehmann (1999), Maslov (1984), Mehlig (1981), Padučeva (1996), Petruchina (2000), Rassudova (1982), etc.

All of the aspectual characteristics tell us something about the nature of the event that is or was going on. When we do cross-linguistic experimental research, we must keep in mind how aspectual information is implemented in the particular languages investigated, be it through morphological, temporal, or contextual features. The following cross-linguistic example illustrates my assertion.

(1) The boy was reading [progressive] the book. vs. The boy read the book.
(2) Le garçon lisait [imparfait] le livre. vs. Le garçon lut [passé simple] le livre. or Le garçon a lu [passé composé] le livre.

As (1) and (2) show, the progressive is marked by different temporal forms in English and in French, another aspect language. Whereas in English the past progressive is used (1), in French we use the imparfait to express an ongoing event in the past (2).⁵ In Slavic languages, as mentioned above, we differentiate the IPV and PV aspects morphologically by using aspectual affixes to mark either the PV or IPV aspect (3).

(3a) Mal’čik čital knigu.
     Boy read[PAST.IPV.3SG] book.
(3b) Mal’čik pročital knigu.
     Boy read[PAST.PV.3SG] book.

The German translations turn out to be completely different (4a and 4b). The aspectual information is packed into the context of the whole sentence, or more precisely, into the object marking, and not into the verbal form:

(4a) Der Junge las in dem Buch.
     The boy read[PAST.3SG] in the book.
(4b) Der Junge las das Buch durch.
     The boy read[PAST.3SG] the book through.

In (4a), the event is described as an atelic, uncompleted action in progress, whereas in (4b), the event is described as completed.
Expressing the Russian IPV aspect in German demands an atelic interpretation phrased by the construction “in the book.” Rendering the PV aspect in German, on the other hand, requires a telic situation, and the verb itself is translated only with the past tense. This leads us to an additional challenge that is connected with the functions of the IPV and PV aspects in Russian compared with the aspectual features in the other languages. In the case of Russian-German, we have to deal with a very complex transfer or disambiguation, because in most cases one German verb corresponds to two Russian verbs. German does not have morphologically marked aspectual features. Telicity in German is connected with the structure of the event and can only be limited by lexical or syntactic features, for example, adverbials (Smith & Staudinger, 1997). How can we construct equivalent experimental items for languages with and without grammaticalized verbal aspects? As I will show, we always have to keep these differences in mind when we conduct cross-linguistic research in the field of verbal aspect because they may have an influence on the experimental data and results.

⁵ In contrast, the passé simple in the French literary language is used to express a completed event in the past.

The rich variety of different expressions of aspectual information makes cross-linguistic investigation complicated. As we have seen, aspectual differences pose many challenges. In addition, there are supplementary requirements concerning the experimental design. The main task for cross-linguistic experimental research is to guarantee the comparability of the data. To achieve this, it is necessary to ensure that the experiments are analogous, or even identical. This especially concerns the target items, but the experimental arrangements also have to be similar.
Do we have any chance of fulfilling all of the requirements that are postulated in cross-linguistic experimental research?

3 A bag full of problems?

The following example sentences from the research literature illustrate the purely linguistic problems with which we have to cope. These problems arise when sentence items are translated to illustrate the research question or to argue for or against hypotheses. There are many examples of such cases. In this paper, I have concentrated only on those cases that are linked to aspectual features. Some problems with cross-linguistic research in the field of verbal aspect are already well known from non-experimental research, for example, from investigations with parallel corpora. The following paragraphs provide a short overview of some inconsistencies in the Russian-English non-experimental research literature to demonstrate how difficult it is to translate aspectual features from one language to another.

3.1 Translating Russian into English

Concerning the translation of Russian verb semantics into English, I concentrated my attention on two main problems: first, the inconsistency of the translation of IPV and PV verb forms in the past tense, and second, the problem that arises in the case of bare nominals and aspect use in Russian.

3.1.1 Russian IPV and the English progressive

A good example of the difficulties in cross-linguistic research is the equation of the Russian IPV aspect with the English progressive in the past tense. As examples (5) and (6) show, the Russian IPV preterite verb čital can be translated into English with a past progressive (6a) or a present perfect (6b). The translation depends on the interpretation of (5). In (6a), the past IPV is read as a process and therefore the English translation needs the past progressive. In (6b), the present perfect is used, which leads to a general-factual reading of the Russian past IPV.

(5) Ja uže čital Krepost’.
    I already read[PAST.IPV.1SG] The Fortress.
(6a) “I was already reading [1.P.SG: PAST PROGRESSIVE] The Fortress once.”
(6b) “I have already read [1.P.SG: PRES. PERF.] The Fortress once.”

Assuming we are planning an experiment on the processing of the Russian IPV aspect compared with the processing of the English progressive, we have to decide whether

1. The Russian sentence should be translated with a progressive (6a) in any case, because there is a PAST.IPV verb, or
2. The English native speaker prefers the present perfect (6b) and the speaker says only that the book was read (not mentioning whether every page was read, because that information is not important), or
3. We should use the present perfect and an additional phrase that makes sure that the reading process was not completed insofar as the reader did not read every page of the book but only some of them (6c):

(6c) “I have already read [1.P.SG: PRES. PERF.] at least some of The Fortress once.”

Examples (7) and (8) are even more complicated. Here, the English translation of the Russian sentence is the same despite the use of different aspects. According to Altshuler (2010), the Russian sentence with the IPV verb priezžat’ (7a) is translated with the simple past (8a) or the past perfect (8b):

(7a) K nam priezžal otec, no vskore uechal.
     To us come[PAST.IPV.3SG.M] father but soon go away[PAST.PV.3SG.M].
(8a) “Father came [SIMPLE PAST] to see us, but he went away again soon.”
(8b) “Father had come [PAST PERFECT] to see us, but he went away again soon.”

The sentence with the PV verb form priechal (7b) instead of the IPV form priezžal is translated with the same two English tenses, the simple past (8a) or the past perfect (8b). Altshuler (2010) turned the aspectual difference in Russian into a temporal distinction in English.

(7b) K nam priechal otec, no vskore uechal.
     To us come[PAST.PV.3SG.M] father but soon go away[PAST.PV.3SG.M].

I have to mention that I do not quite agree with the translations in (8a) and (8b).
In my opinion, we should distinguish between “to visit” for priezžat’ (IPV) and “to come” for priechat’ (PV); that is, we have to deal with the lexical meaning of the aspect, not with temporal differences. Another example, also from Altshuler (2010), suggests that the semantics of (10a) and (10b) are identical. However, in my opinion, an aspect-induced semantic difference between these two sentences persists. The progressive in (10a) denotes that the event is in progress and describes a “dynamic eventuality” (de Swart, 2000). In (10b), the event is reported as a state, similar to the original Russian IPV sentence (Altshuler, 2010).

(9) On ničego ne delal [IPV], tol’ko ležal i kuril.
    He nothing do[PAST.IPV.3SG.M] only lie[PAST.IPV.3SG.M] and smoke[PAST.IPV.3SG.M].
(10a) “He wasn’t doing anything, just lying (around) and smoking.”
(10b) “He didn’t do anything, only lay (around) and smoked.”

De Swart (2000) suggested that in English, aspectual operators are optional and their use depends on the interpretation of the situation as a process (10a) or as a state (10b). As the examples show, it is questionable whether the different functions of the verbal aspects were adequately translated and whether the translations allow the correct interpretation of the sentences. The English translation may not always take into account the distinctive meaning of the Russian verbal aspects.

3.1.2 Definiteness and indefiniteness

As Russian is an articleless language, we only have a singular/plural distinction of nominals. “As a consequence, unmarked nominals are always singular in meaning, and bare singulars as well as bare plurals are underspecified for definiteness” (de Swart, 2013, p. 13). The contrast between definite and indefinite nominals can be marked by aspect. The IPV aspect is usually combined with an undetermined quantity or unspecified object and the PV aspect with a determined object.
To get the correct translation in other languages, one must pay attention to this peculiarity. It makes quite a big difference whether

a. Ivan ate some bread (11a), with bread as a mass noun, or
b. he ate all of the bread (11b), or
c. Ivan ate bread and nothing else.

(11a) Ivan el chleb.
      Ivan eat[PAST.IPV.3SG.M] bread[ACC.SG]
(11b) Ivan s”el chleb.
      Ivan eat[PAST.PV.3SG.M] bread[ACC.SG]
(Sonnenhauser, 2008)

The aspectual effect is reflected not only with bare mass nouns, but also with bare plurals:

(12a) Petja čital stat’i.
      Peter read[PAST.IPV.3SG.M] article[ACC.PL]
      “Peter was reading articles/the articles.”
(12b) Petja pro-čital stat’i.
      Peter read[PAST.PV.3SG] article[ACC.PL]
      “Peter read the articles.”

In (12a), the interpretation for the bare plural noun stat’i may be definite or indefinite; in (12b), the PV verb allows only a definite interpretation and blocks the indefinite interpretation because of its telic meaning.⁶ Slabakova (2004) suggested that in PV sentences, articleless bare plurals and mass objects are interpreted as denoting a specific quantity.

⁶ Filip (2007), Gehrke (2008), and many others have correctly combined the telicity effect with the Russian prefixes, which is one way of deriving aspectual verb pairs.

The correct translation of the sentence is even more complicated in the present tense. In the next example (13), we find a bare plural nominal. The sentence with an IPV present tense verb and a bare plural noun can be interpreted as an episodic (14a) or habitual event (14b), as we can see in the English translation. For the German translation, we need additional context (15a, 15b) to express these distinctions.

(13) Petja čitaet lekcii v universitete.
     Petja read[PRES.IPV.3SG] lectures at university.
(14a) “Peter is giving lectures at the university.”
(14b) “Peter gives lectures at the university.”
(15a) Peter hält dieses Jahr Vorlesungen an der Universität.
      Peter give[PRES.3SG] THIS YEAR lecture[ACC.PL] at the university.
      “Peter gives lectures this year at the university.”
(15b) Peter hält üblicherweise Vorlesungen an der Universität.
      Peter give[PRES.3SG] USUALLY lecture[ACC.PL] at the university.
      “Peter usually gives lectures at the university.”

The present tense IPV can have either a present or a future interpretation in Russian.

(16) Petja zavtra čitaet lekcii v universitete.
     Peter tomorrow read[PRES.IPV.3SG] lectures at university.
     “Tomorrow, Peter is giving (will give) a lecture at the university.”

The bare plural in (16) can only refer to an episodic event. Forsyth (1970, p. 92) also considered the object and the verb in an IPV verbal phrase (VP) as a “coalesced unit, in which the object has no specific reference,” whereas in the PV VP the object is specific. Filip (1993) argued that in Slavic languages without articles, the PV aspect induces quantized readings on incremental theme bare plurals and mass arguments. As a direct consequence, aspect use can change the content of the sentence and thereby influence the reading behavior of the research participants. This risk can be prevented by avoiding contexts with mass nouns or plural markers on the direct object.

3.2 Differences in aspectual use in Czech and Russian

Before I turn to my experimental experience with cross-linguistic comparison, the difference in aspectual use in Czech and Russian should be explained briefly. In principle, the main aspectual functions and main aspectual meanings of IPV and PV verbs do not differ. For a long time, the Russian system of grammatical aspects was incorrectly assumed to be equally valid for all other Slavic languages. Ivančev (1971) and Galton (1976) made some effort to show aspectual differences between the Slavic languages. Stunová (1991, 1993) described the differences between Russian and Czech in more detail.
These differences were elaborated by Dickey (2000), who established a “geography” of Slavic aspects with “a broad east-west isogloss dividing Slavic into an eastern group (Ru, Uk, Br and Bg) and an extreme western group (Sor, Cz, Sk and Sn). There are also two transitional zones in the north and south (Pol and SC respectively), which share some properties with each group”⁷ (Dickey, 2000, p. 4). For Dickey, the aspectual differences between Russian and Czech are based on the meaning of the PV aspect. In the western languages, the meaning of PV is totality, whereas in the eastern languages, the central meaning of PV is temporal definiteness (Dickey, 2000, p. 48).

⁷ Ru=Russian, Uk=Ukrainian, Br=Belarusian, Bg=Bulgarian, Sor=Sorbian, Cz=Czech, Sk=Slovak, Sn=Slovenian, Pol=Polish, SC=Serbo-Croatian.

My recent corpus studies have shown that the distinction in aspectual use can also be explained by the degree of grammaticalization. In Russian, aspectual meaning and function are grammaticalized to a higher degree, because the use of one or the other aspect is fixed, except in some special contexts. In Czech, the degree of grammaticalization is lower, because aspectual use is more flexible. This means that in Czech, both verbal aspects compete in more contexts than in Russian. However, there are additional differences in aspectual use, as the following examples show.

(17) Liš’ tol’ko načinal zvenet’ telefon, Varucha bral trubku […].
     Only just begin[PAST.IPV.3SG.M] ring[INF.IPV] phone, Varucha take[PAST.IPV.3SG.M] handset […].
(18) Sotva za-zvonil telefon, Varucha zvedl sluchátko.
     Only just begin-ring[PAST.PV.3SG.M] phone, Varucha lift[PAST.PV.3SG.M] handset.
     “When the phone began to ring, Varucha lifted the handset.”

Both sentences describe an iterated situation. In Russian (17), the IPV aspect is the default, in contrast to Czech (18), where the PV aspect is usually used to describe an iterative situation.
In addition, the Slavic languages offer different constructions to express the beginning of an action. In the Russian example, the analytical construction with “begin to” is used, whereas in Czech, the prefix za- expresses the ingressive aktionsart. In this case, it can be said that in Russian, the focus of the utterance is on a beginning event, whereas in Czech, the focus lies on the moment of the beginning, a particular point in time. In sentence (19), all verbs are IPV in the Russian version and all verbs are PV in the Czech version. Whereas in Russian, the IPV aspect in an iterative context is obligatory, except for the so-called summarizing function (summarnoe značenie) of the PV in bounded iteration with adverbial count-quantifiers,⁸ in Czech, the use of the PV aspect is quite common. Nevertheless, we have to point out that the sentences are not really equivalent. In Czech, the focus is on the moment of the beginning of the event, whereas in Russian, the emphasis is on the ringing phone and a repetitive moment (see (19) and (20)):

⁸ See Bondarko (1971).

(19) Dvaždy protjagivala ona ruku za kurtkoj i dvaždy otdergivala ee.
     Twice put out[PAST.IPV.3SG.F] she hand of jacket and twice refrain[PAST.IPV.3SG.F] her.
(20) Dvakrát po něm vztáhla ruku a dvakrát toho zase nechala.
     Twice after it put out[PAST.PV.3SG.F] hand and twice it again refrain[PAST.PV.3SG.F].
     “She put out the hand twice out of the jacket and refrained her twice.”

The action is repeated, but each time it is annulled again. In Russian, the phenomenon of annulirovannost’ rezul’tata (“annulment of resultative actions”) is combined with the IPV aspect, in contrast to Czech, where the PV aspect is used (19) and this type of aspect function does not exist. The Czech translation (22) corresponds in its aspect use to the Russian sentence (21), because the PV aspect is used for iteration in Czech anyway. The iterated or repeated actions are summed up into one single event.
(21) Mr Prosser paru raz otkryl i zakryl rot [...].
     Mr Prosser a couple of times open[PAST.PV.3SG.M] and close[PAST.PV.3SG.M] mouth [...].
(22) Mr Prosser několikrát otevřel a zase sklapl ústa [...].
     Mr Prosser several times open[PAST.PV.3SG.M] and again close[PAST.PV.3SG.M] mouth [...].
     “Mr Prosser opened and closed the mouth several times.”

In my opinion, the different functions of verbal aspect and its different usage can be explained by the fact that Russian has the already mentioned higher degree of aspectual grammaticalization, whereas Czech has a lower degree. This implies that inner-Slavic research on verbal aspect is in some respects more difficult than other cross-linguistic research, because we have to focus our attention on the aspectual distinctions and ask ourselves carefully whether we are really investigating identical aspectual features. We will not stumble into the same pitfall in the German-Russian experiment. Regarding comparative studies in the field of reported iterated events, we have to keep in mind that repetition is a function of the IPV aspect in Russian but not in Czech. The PV aspect in Russian is only accepted with a summarizing function (23). On the other hand, the IPV aspect in Czech is only accepted when the iterated action has failed (24):

(23) On ešče neskol’ko raz pobuždal sebja spustit’sja vniz, no ne smog — nervy sdali.
     He yet several times urge[PAST.IPV.3SG.M] himself go down, but not could — nerves passed.
     “He urged himself several times to go down, but he could not — the nerves passed.”
(24) Ještě několikrát se nutil seběhnout dolů, selhaly nervy.
     Yet several times urge[PAST.IPV.3SG.M] himself go down, passed nerves.

There are plenty of difficulties that are accompanied by different defaults or preferences, or differences in the function of the aspect.
The success of cross-linguistic experimental research on the verbal aspect depends on the target sentences and on how the missing aspectual verbal functions are substituted in the particular translation (for example, with additional contextual information).

4 Experimental experience

I conducted three cross-linguistic experiments on the use and processing of verbal aspect. In the first example, I introduce a cross-linguistic eye tracking study that revealed the different processing of aspectual features in German and Russian. The second example is from a self-paced reading experiment that investigated the use of verbal aspect in an iteration context in Russian and Czech. Finally, the third example describes a forced choice experiment regarding the acceptability of the PV aspect in performatives in Russian and Czech.

4.1 German-Russian

The first cross-linguistic study to which I will refer is an eye tracking study on the time course of aspectual interpretation in German and Russian that my colleague Oliver Bott and I initiated (Bott & Gattnar, 2015). Eye tracking means that the eye movements of the participants are recorded while they perceive and examine visual stimuli, in our case, German or Russian phrases. While the participant perceives the stimulus, either the point of gaze (where one is looking) or the motion of an eye is measured. In our case, eye movement entailed the visual processing of a written text. When reading a text or a sentence, the eyes do not move continuously along a line, but make short, rapid movements (saccades) interrupted by short stops (fixations).⁹ This method provides us with the possibility of identifying and pinpointing mismatch effects at a specific location.

⁹ For further explanation, see Collewijn (1999), Richardson, Spivey, and Wnek (2008), and Pollatsek, Rayner, and Collins (1984).

It is important to mention that the German version of this experiment was conducted without considering the Russian data (Bott, 2010, 2013).
The Russian version was done later, when the German data had already been collected and analyzed. Our research question was the following: Does the grammatical system of a language shape the way in which semantic interpretation proceeds during comprehension? To answer this question, we compared the course of the comprehension of aspectual features in Russian, a language with grammatical aspect, and in German, a language without grammatical aspect. In our research, we assumed that semantic interpretation proceeds word by word, that is, incrementally (Bott, 2010). In our opinion, aspectual processing is a very good object for the investigation of language processing. In Russian, the verb is marked for aspect and thus unambiguously encodes a particular situation type, whereas in a non-aspect language, like German, aspectual composition depends on a larger linguistic context involving the semantic properties of the arguments. Because of this, the increment size (number of words until aspectual information is processed) in aspectual processing should vary between the two languages.¹⁰ In Russian, we predicted mismatch effects between a PV verb and a durative adverbial independently of the presence or absence of the verbal arguments; in German, however, mismatch effects should be delayed until the semantic processor has encountered the complete predication, which minimizes the risk of aspectual reanalysis (Bott, 2013). The hypothesis concerns the point at which the mismatch is detected, which gives information about sentence processing in German and Russian. The mismatch in Russian should be detected immediately at the PV verb, whereas in German the interpretation takes place only at the end of the sentence when all needed information is given.
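The prediction can be restated as a small rule: a mismatch becomes detectable at the first region by which all the information a language requires has been read. The following Python sketch is only an illustration under that reading of the hypothesis. The region labels (DurAdv, V, S, O), the REQUIRED table, and the function name are hypothetical and do not come from the original experiments.

```python
# Hypothetical sketch of the predicted mismatch locus; all names and region
# labels are illustrative assumptions, not part of the original studies.

# Elements that must have been read before an aspectual mismatch between an
# achievement verb and a durative adverbial can be detected.
REQUIRED = {
    "russian": {"V", "DurAdv"},            # PV verb plus adverbial suffice
    "german": {"V", "S", "O", "DurAdv"},   # complete predication plus adverbial
}


def predicted_mismatch_region(language, word_order):
    """Return the first region at which all required elements have been read."""
    needed = set(REQUIRED[language])
    for region in word_order:
        needed.discard(region)
        if not needed:
            return region
    return None  # the sentence never supplies all required elements


# Adverbial-first orders: the loci differ between the two languages.
print(predicted_mismatch_region("russian", ["DurAdv", "V", "O", "S"]))  # V
print(predicted_mismatch_region("german", ["DurAdv", "V", "S", "O"]))   # O
# Adverbial-final order: both languages converge on the adverbial.
print(predicted_mismatch_region("german", ["S", "V", "O", "DurAdv"]))   # DurAdv
```

Under these assumptions, only the adverbial-first orders separate the two languages, which is why a word order manipulation is informative for this kind of hypothesis.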
We employed test sentences with transitive achievement verbs (in Russian: PV verbs) modified by aspectually mismatching durative adverbials in German and Russian, and we manipulated the word order in such a way that the aspectual mismatch occurred before or after the predication was complete. The data strongly suggest that German readers showed mismatch effects only after the complete predication, whereas Russian readers immediately noticed the aspectual mismatch independently of whether the verb preceded or followed its arguments (Bott, 2013; Bott & Gattnar, 2015). Based on the sentences with aspectual mismatches, I will describe the difficulties we had in transforming or translating the original German items of Oliver Bott’s experiment on incrementality (Bott, 2010, 2013) into Russian.

¹⁰ See also Altmann and Kamide (1999), Brennan and Pylkkänen (2008), Ferretti et al. (2007), and Todorova et al. (2000).

The targets are adaptations of the original German items. The original test sentences were constructed with achievement verbs and time-span (25) or time-frame adverbials (26):

(25) *Der Ringer gewann den Kampf zehn Minuten lang.
     “*The wrestler won the fight for ten minutes.”
(26) Der Ringer gewann den Kampf in zehn Minuten.
     “The wrestler won the fight in ten minutes.”

We assumed that the German reader needs all of the information in the sentence to decide whether it is well formed or not, while the Russian reader decides this question more or less immediately after perceiving the verbal phrase. This is connected to the fact that the usage of a certain verbal aspect determines additional sentence arguments. To address our research question, we had to tackle the task of translating the sentences of the original German experiment into Russian, keeping the usage of verbal aspect in mind. As in the original German experiment, we constructed two mismatch conditions.
The first condition can be seen below. The durative adverbial stands at the end of the critical sequence.

The original German sentence:

(27) *Der Boxer gewann das Turnier ganze drei Stunden, und das Publikum freute sich.
     The boxer win[PAST.SG] the championship for three hours, and the crowd was happy.

The Russian transformation:

(28a) *Znamenitaja i opytnaja boksёrša vyigrala turnir celych tri časa i zriteli radovalis’.
      Famous and experienced female boxer win[PAST.PV.3SG.F] championship for three hours and crowd was happy.

Example (28a) reveals that German sentence (27) was not translated but adapted into Russian. (28a) contains three important changes:

1. Changing the gender of the subject of the sentence;
2. Adding attributes; and
3. Changing the word order.

The necessity of these three changes will be explained below. But first, we had to introduce a variation in the experimental design.

Difference 1: The extension of the experimental items: the control condition with the IPV aspect in the Russian version of the experiment

To verify our hypothesis, we had to replace the German achievements with Russian PV verbs in order to obtain optimal data and reliable mismatch sentences. Nevertheless, it was necessary to exclude the possibility that the IPV aspect leads to the same restrictions as the PV aspect. Therefore, the Russian experiment was extended by an IPV variant:

(28b) ?Znamenityj i opytnyj boksёr vyigryval turnir celych tri časa i zriteli radovalis’.
      Famous and experienced boxer win[PAST.IPV.3SG.M] championship for three hours and crowd was happy.

Because the item set was extended by the IPV control items, we had to deal with a new problem: the number of syllables.

Difference 2: Alignment of syllables

A deviation in the number of syllables in one area of interest can change the data and might render the comparability of the cross-linguistic data impossible. Therefore, we focused on keeping the number of syllables constant.
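A simple way to check such an alignment is to count vowel letters in the transliterated verb forms. The sketch below is a rough approximation, assuming that each vowel letter in the transliteration used here corresponds to one syllable nucleus. The function name and the vowel inventory are illustrative assumptions; the verb forms are those of (28a) and (28b), plus the feminine IPV past form vyigryvala for contrast.

```python
# Rough syllable check for transliterated Russian verb forms.
# Assumption: every vowel letter of the transliteration is one syllable nucleus.
VOWELS = set("aeiouyёè")


def count_syllables(word):
    """Count vowel letters as a proxy for the number of syllables."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


pv_feminine = "vyigrala"      # PAST.PV.3SG.F, as in (28a)
ipv_masculine = "vyigryval"   # PAST.IPV.3SG.M, as in (28b)
ipv_feminine = "vyigryvala"   # PAST.IPV.3SG.F, same subject gender as (28a)

for form in (pv_feminine, ipv_masculine, ipv_feminine):
    print(form, count_syllables(form))

# The feminine PV form and the masculine IPV form are matched in length,
# while keeping the feminine subject in the IPV item would add a syllable.
print(count_syllables(pv_feminine) == count_syllables(ipv_masculine))  # True
print(count_syllables(pv_feminine) == count_syllables(ipv_feminine))   # False
```

A check of this kind only compares letter counts and says nothing about stress or reading time, so it can at most flag items whose verb regions differ in length.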
Why is the number of syllables a problem for our experimental design? The answer lies in the system of the aspect opposition and the aspect derivation in Russian. As a rule, aspectual pairs in Russian are mainly formed by prefixation or suffixation of the word stem. Because of this, verbal derivation is accompanied by changes in the number of syllables. The choice of verbs was determined by the German version of the experiment. Due to the derivation of the verbs with affixes, it was impossible to find verbal pairs in which the PV had the same number of syllables as the IPV past form. To solve the problem with the syllables in the IPV control sentences, we decided to manipulate the gender of the subject.¹¹ In the end, we successfully constructed the items in such a way that in the mismatch condition, as well as in the control condition, the interest area for the “verb” consisted of the same number of syllables.

¹¹ The past tense morpheme for masculine is -l, for feminine it is -la, so we get one more syllable, which we need to balance the PV prefix morph.

Difference 3: Word order

As you can see in (29) and (30), the word order is different:¹²

(29) *Ganze drei Stunden gewann der Boxer das Turnier, obwohl es viele Konkurrenten gab.
     For three hours won the boxer the championship, although there many competitors were.
(30) *Celych tri časa vyigrala turnir znamenitaja i opytnaja boksёrša i zriteli radovalis’.
     Whole three hours win[PAST.PV.SG.F] championship famous and experienced female boxer and crowd was happy.

On the one hand, Russian has a more flexible word order than German; on the other hand, word order in Russian interacts with the specific and definite interpretations of the arguments and reflects discourse functions. Russian is a language in which discourse and context play a large role.
The preverbal position is normally related to topic, or old information, and the postverbal position is related to focus, or new information (see King, 1993 and Bailyn, 1995 for further information). In the target items of our experiment, the word order in the German sentences is always DurAdv – V – S – O. In the Russian sentences, the word order changes to DurAdv – V – O – S. The Russian durative adverbial always heads the sentence, and the verb follows it directly. The favored word order in Russian is DurAdv – S – V – O (31):

(31) *Celych tri časa znamenitaja i opytnaja boksёrša vyigrala turnir…
Whole three hours famous and experienced female boxer win PAST.PV.3SG.F championship…

However, constructing the critical sentence with the favored word order would have meant separating the adverbial from the achievement verb and would have moved the verb into another region of interest. This had to be avoided, because we needed to be able to compare the results with the German data. According to Russian native speakers, the second-best word order was V – O – S, and not V – S – O. We decided to use this more natural word order and to give up the correspondence with the German word order, because we expected that a marked word order in Russian would itself cause a measurable effect. The area of interest was identical in both experiments. Possible effects on other areas should not affect our research interest.

¹² We changed the second part of the Russian sentence to have contexts that are neither connected with the prior action nor with rating it.

Difference 4: Expression of definiteness/indefiniteness of the noun: bare nominal and aspect

Another difference between Russian and German concerns the grammatical definiteness of the noun.
Russian has no indefinite or definite articles, but, as mentioned before, the definiteness or indefiniteness of the noun can be expressed by the telic or atelic meaning of the verbal aspect (IPV vs. PV). That means that the type of aspect can express the definiteness or indefiniteness of the nominal. In all of the original German items, the subject is definite and marked with a German definite article: der, die, or das, “the”. To make sure that the Russian IPV sentences have a definite subject as well and that the IPV aspect has no influence on the data, we tried to manipulate the sentences in such a way that only the definite reading was available. The best approach seemed to be to expand the noun phrase (NP) by adding an attribute (32). For the PV sentences, the expansion of the subject was not important. The definiteness had to be marked in the IPV sentences that served as controls for the PV test items.

(32) Znamenitaja i opytnaja boksёrša vyigryvala turnir celych tri časa i zriteli radovalis’.
Famous and experienced female boxer DEF win PAST.IPV.3SG.F championship whole three hours and crowd was happy.

The transfer problems can be summarized as follows:

a) Due to the two verbal aspects, the number of target items increases by one more control sentence for each target sentence;
b) For derivational reasons, the syllable numbers of the Russian verbs differ;
c) Although the Russian language possesses free word order, there is a more or less natural word order that influences the assignment of the topic and focus information; and
d) The expression of definiteness or indefiniteness of the bare nouns is linked to aspect usage.

4.2 Russian-Czech

We have already started to show the problems in the field of inner-Slavic experimental work.
The next two examples are part of our investigation of the use of the verbal aspect in iteration contexts with count quantifiers (examples 35 and 36; see Gattnar, 2013) and of the use of the verbal aspect in performative utterances in Russian and Czech (examples 37 and 38). Both experiments were originally designed for Russian. The Russian version of the experiment on iteration with count quantifiers has already been conducted (Gattnar, 2013); the Czech version is currently being conducted. The experiment on performative utterances, a forced choice experiment regarding the acceptability of the PV aspect in performatives, was pretested for Russian (Gattnar, 2015), but not for Czech.

As mentioned before, Dickey (2000) and Stunová (1993) pointed out that there are many commonalities in the function and use of the verbal aspect, but also several differences. For our investigation, two aspectual differences between Russian and Czech were especially important: iterated events (as described in Stunová, 1993) and the coincidence of aspect in performatives (as described in Dickey, 2000). It seemed worthwhile to transfer a previously conducted Russian experiment into Czech. Our main research questions were:

1. Can we show reverse effects in the processing of aspect in cases where the use of aspect differs (IPV in Russian = PV in Czech and vice versa)?
2. Is the different level of grammaticalization of aspect reflected in the processing of aspect, irrespective of its type?

We predicted less marked effects in Czech than in Russian in cases of aspectual coincidence.

Iterated events

One of these distinctions is the different use of the IPV and PV aspects in the context of iterated events. A corpus investigation (Gattnar, 2013) of counted iterated situations in Russian led to the assumption that the summarizing function of the PV aspect is more common than the research literature has described so far (see e.g., Dickey, 2000).
For our research interest, this special function of the PV aspect in Russian is particularly valuable. It gives us the possibility of investigating a competitive situation in which both aspects match in iterated contexts with counted quantification. In our first online experiment, we verified the results of the corpus investigation and also addressed the question of whether the position of the quantifier affects the choice of aspect. We decided to test our hypotheses with a self-paced reading experiment, which should provide more information on the processing of aspect in counted iteration. First, we postulated that there would be different reading times for the IPV and PV sentences because of the default character of the IPV aspect in an iterated context. Second, we postulated that the position of the count quantifier before or after the verb would influence reading times due to information structure (incrementality). The test items were built with count adverbials before or after the verbal phrase in the past tense. After all of our investigations, we concluded that the dominant position of the IPV in iterated contexts is weakened in favor of the PV if the number of repetitions is announced by a count quantifier. This triggers the summative function of the PV, which cannot be fulfilled by the IPV (Gattnar, 2013).

A corpus investigation in the Czech National Corpus (ČNK) confirmed the assumption that in Czech, the PV aspect dominates in an iteration context, and that this dominance is more pronounced when the count quantifier is in the pre-verb position than when it is in the post-verb position (Dübbers, 2015). Intriguingly, the ČNK data show that the IPV aspect occurs in a counted iteration context as well. Therefore, it seemed interesting to adapt the Russian experiment on iteration contexts into Czech.
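The self-paced reading design described above crosses two two-level factors, verbal aspect and position of the count quantifier, which yields four conditions per item. A minimal Python sketch of this crossing is given below. It is an illustration only, not the software used in the study; the names ASPECTS, QUANTIFIER_POSITIONS, and build_conditions are hypothetical, and the assignment of the labels a to d assumes the order of the example set (35a)-(35d).

```python
from itertools import product

# The two manipulated factors of the design (labels are assumptions).
ASPECTS = ("PV", "IPV")
QUANTIFIER_POSITIONS = ("before verb", "after verb")


def build_conditions():
    """Cross quantifier position and aspect into four labelled conditions."""
    conditions = {}
    crossed = product(QUANTIFIER_POSITIONS, ASPECTS)
    for label, (position, aspect) in zip("abcd", crossed):
        conditions[label] = {"aspect": aspect, "quantifier_position": position}
    return conditions


for label, condition in build_conditions().items():
    print(label, condition["aspect"], condition["quantifier_position"])
```

Listing the conditions in this way makes it easy to verify that every item exists in all four versions before an experiment is transferred to another language.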
As the PV is the expected aspect in Czech and the IPV is expected to be less common, we assumed that PV sentences are read faster than IPV sentences, which would mean an inversion of the Russian data. With regard to the research on aspect usage in iteration contexts, these differences may strongly affect the item settings and make it difficult to translate the item sentences word by word, since the meaning of the critical items can change, or the same verbal aspect can even lead to a mismatch in one of the languages, as (33) and (34) show, here with a non-counted adverbial quantifier:

(33) On menja často ubeždal/*ubedil.
He me often convince PAST.IPV.3SG.M / *PAST.PV.3SG.M

(34) Často mě přesvědčoval/přesvědčil.
Often me convince PAST.IPV.3SG.M / PAST.PV.3SG.M

In the Russian aspectual system, only the IPV aspect in (33) is grammatical. The IPV aspect is the dominant form in unlimited iterative contexts (Stunová, 1993), denoted by the adverb “often” in this example. In the Czech aspectual system, iteration is not functionally limited to the IPV aspect but can also be expressed by the PV aspect, because Czech prefers to point out the individual events that are subsumed by the iteration (34) (Stunová, 1993). This explains the use of the IPV aspect in Russian and the preferred use of the PV aspect in Czech.

Regarding the aim of this paper, the problem that we have to solve is the following: which differences are to be transferred and which are not? And of the differences in lexical and grammatical aspect, is one kind easier to translate than the other? It would be no problem to test the preferred or compatible aspects in an iteration context independently of each other. The difficulties arise when we investigate the processing of an aspect, because in Russian, we would only test the processing of the IPV, whereas in Czech, both aspects are possible.
Simply constructing a new hypothesis does not really solve the difficulty. The problem is more complicated, because in Czech the other aspect would not cause a mismatch in the presented case, but, according to Stunová (1993), another meaning. Other experimental methods are better suited to verifying the meaning of a sentence item. At this point, some experimental methods reach their limits. If the distinction between the PV and IPV aspects in iteration contexts were unambiguous in both languages, for example because the use of a certain verbal aspect is obligatory in both languages, there would be fewer problems for cross-linguistic research. In the worst case, the two aspects would be used in each language but with different meanings and functions. We must avoid this pitfall. In iteration contexts, the Russian sentences with the IPV aspect should be transferred to Czech by changing the Russian IPV verb into a PV verb in Czech. But in this case, we have to reconsider our research question and/or hypothesis. “The substitution of the Czech PVs by their IPV counterparts would cause various ‘undesired’ effects” (Stunová, 1993, p. 18). The problem is to control these effects in such a manner that the data do not lose validity.

We are now asking ourselves whether the processing of aspect is similar in Russian and Czech. We started with the translation of the Russian self-paced reading experiment on the processing of aspect in a counted iteration context into Czech (see examples 35-36). The Czech variant of the experiment was conducted to verify the hypothesis that the reading effects should disappear due to the lack of such competition in Czech (Dübbers, 2015).

(35a) Ivan neskol’ko raz / dva raza sdelal Irine predloženie, a ėto bylo očen’ romantično.
Ivan several times / twice make PAST.PV.3SG.M Irina marriage proposal, and that was very romantic.
(36a) Miroslav několikrát / dvakrát požádal Lenku o ruku a bylo to velmi romantické.
Miroslav several times / twice ask PAST.PV.3SG.M Lena for hand and was this very romantic.

(35b) Ivan neskol’ko raz / dva raza delal Irine predloženie, a ėto bylo očen’ romantično.
… make PAST.IPV.3SG.M …

(36b) ?Miroslav několikrát / dvakrát žádal Lenku o ruku a bylo to velmi romantické.
… ask for PAST.IPV.3SG.M …

(35c) Ivan sdelal Irine predloženie neskol’ko raz / dva raza, a ėto bylo očen’ romantično.
… make PAST.PV.3SG.M …

(36c) Miroslav požádal Lenku několikrát / dvakrát o ruku a bylo to velmi romantické.
… ask for PAST.PV.3SG.M …

(35d) Ivan delal Irine predloženie neskol’ko raz / dva raza, a ėto bylo očen’ romantično.
… make PAST.IPV.3SG.M …

(36d) ?Miroslav žádal Lenku několikrát / dvakrát o ruku a bylo to velmi romantické.
… ask for PAST.IPV.3SG.M …

At first glance, translating the test items seemed unproblematic. Nevertheless, there are some differences.

Difference 1: First names

We changed the first names of the subject and the object to maintain the naturalness of the sentence. Uncommon names might influence reading times or sentence judgments.

Difference 2: Other verbal construction

In the original Russian experiment, one of the test items was sentence (35). The Russian sdelat’ PV predloženie “to make PV a marriage proposal” is translated into Czech with the phrase požádat PV o ruku “to ask PV for the hand.” This means that we would test different verbs and completely different constructions, which may cause differences in processing that are connected to the whole construction. We had to decide whether to remove the sentence or not. In our case, we kept the sentence for the following reason: what matters is the event as such; the verb used in the phrase is not semantically important, and only its aspectual form is of interest to us.
The special verb type in Czech, where a verbum dicendi is used to verbalize the event, has no influence on the use of a verbal aspect in this context.

Difference 3: Aspect use

A more difficult problem was the fact that in Czech, the IPV verb is not entirely acceptable in iteration contexts with count quantifiers. Czech native speakers strongly prefer the PV aspect (35a-36d) (Dübbers, 2015). This means that the Czech experiment was based on different conditions than the original Russian version, where both aspects were equally accepted. When analyzing the data, we need to take into account the different assumptions regarding the use of the IPV aspect in Czech iterative contexts. In Czech, the data for the IPV sentences should differ significantly from the PV condition. Thus, the hypothesis has to be reformulated: whereas in Russian we expect similar reading times, the reading times for the Czech items will differ for the two aspects.

Performatives

The same difficulties exist in the case of performatives. The use of the verbal aspect in performative utterances differs between Russian and Czech in how frequently the PV aspect occurs. Performatives are utterances in which utterance and action coincide and happen simultaneously. The utterance is part of the action (Austin, 1962) and performs it. In Russian, performatives are expressed by default with the IPV present, but to some degree a PV present form is used. Dickey (2000) called this phenomenon the temporal coincidence of a situation that is referred to by a PV present form in the moment of utterance; it was also described this way by Bondarko (1971) and Galton (1976). There are different opinions on the use of the Russian PV aspect in performative utterances. The research literature describes a few exceptional cases in which the PV aspect can be used. In these cases, the PV performative verb carries additional pragmatic information (Dickey, 2000; Israeli, 2001).
The situation in Czech is different. In Czech, Dickey (2000) observed a higher degree of coincidence in the performatives. Again, as in the case of iteration contexts, the data of my corpus investigation provide a different picture. In contrast to Dickey (2000), and for the purposes of this paper, it is worth mentioning that the analysis of the Russian National Corpus (NKR)¹³ and the ČNK¹⁴ led to the assumptions that, on the one hand, coincidence in performatives in Russian is more common and has a higher degree than Dickey (2000) ascertained (Gattnar, 2015), and on the other hand, the differences between Russian and Czech seem to be smaller than Dickey described (Berger & Heck, 2014). Therefore, I conducted a forced choice experiment regarding the acceptability of the PV aspect in performatives in Russian to verify my objections against Dickey (2000). If it is not possible or even wrong to use the PV aspect in performatives, the acceptability ratings should reflect this mismatch in the form of very poor judgments. In the acceptability test, the informants had to rate performative utterances with different verbal aspects. We conducted two versions of the experiment: one in which a short context preceded the test item, and one without such a context, in which only the test item was presented (Gattnar, 2015). The results of this study showed that the PV aspect is less accepted than the IPV aspect, but that this difference is not significant; the differences between the versions with and without context are not significant either (Gattnar, 2015).

We think it is worthwhile to look at the Czech data as well. One way of comparing the Russian data with the Czech data seems to lie in translating the Russian study. But can this approach be successful? As we have mentioned before, the degree of coincidence differs between the two languages. This fact leads to a further comparability problem.
It is not practical to compare the same type of aspect when, in one language, the given aspect is used significantly more or less often in a given performative context, since such differences can cause effects that are not predictable. In (37) and (38), the IPV sentences are better than the PV sentences, but the degree of mismatch is different. In this case, adapting the Russian experimental sentences for the Czech experiment was not advisable with respect to our research question.

¹³ www.ruscorpora.ru
¹⁴ www.korpus.cz

(37) Blagodarim / ?Poblagodarim vsech prisutstvujuščich za vnimanie, koncert zakončen.
thank-PRES.IPV.1PL / ?PRES.PV.1PL all of the audience for the attention, the concert is finished.

(38) Děkujeme / ?Poděkujeme všem přítomným za pozornost, koncert končí.
thank-PRES.IPV.1PL / ?PRES.PV.1PL all of the audience for the attention, the concert is finished.
“Thanks are given to all of the audience for the attention, the concert is finished.”

Thus, the main problem in inner-Slavic experimental research on the verbal aspect is how to deal with the differing realizations of the grammatical category of aspect. Whichever way out of this dilemma is chosen, we argue that hypotheses should be formulated to be as generally valid as possible. Basically, there are two possible ways out: 1) transforming the items according to the inherent differences, while maintaining the conditions and reformulating the hypotheses; or 2) maintaining the experimental design, reformulating the items with regard to the new experimental conditions without translating from one language to another, and adapting the hypotheses.

In the first case, the data can be analyzed in the same way. A disadvantage lies in the effort required to take into account the three differences we have to deal with. Only then is the naturalness of the items guaranteed.
In the second case, the structure and the hypotheses do not change, but the test items are formulated only on the basis of the aspectual function of the particular language. The advantage is that the special characteristics of aspect use in the particular language can be picked up and considered when we determine the experimental sentence conditions. A disadvantage, however, is the lack of statistically comparable data. We can only make statements about the use of aspect in the particular language, and not about its behavior in a common context.

Summary and discussion

The preceding discussion supports the idea that core differences between languages are easier to cope with than marginal differences. It seems to be easier to compare languages that have different ways to express the same thing than languages with similar or equal qualities. Because German is a non-aspect language and Russian is an aspect language, the particular aspectual information is located at different sentence positions. We can use this kind of core difference to show fundamental differences in sentence processing. As long as the relevant sequences are in the same position, the order of the remaining elements has no influence on the area of interest and is insignificant. In the case of inner-Slavic investigations of the verbal aspect, the differences and overlaps in aspectual meaning and function complicate the implementation and analysis of the cross-linguistic experiments mentioned above. These problems are particularly acute when an existing experimental design is to be transferred from one Slavic language to another. The examples described also reveal that a one-to-one translation of experimental items that investigate the verbal aspect is nearly infeasible.
Different word order, different word frequencies, different connotations, the lack of a grammatical category, or marginal differences in the meaning or function of verbal aspect: attention should be paid to all of these difficulties. If one does not pay attention to these differences, there can be hidden differences in the experimental conditions that compromise the validity of the data. In the worst case, one does not even notice them because one is not aware of them. The more complicated the items, the more difficult it is to translate or adapt them into other languages. There should therefore be as few factors and variables as possible. Before the implementation of a cross-linguistic experiment is completed, it is absolutely necessary to check the frequency of the relevant target words for each language.

The problems that arise in cross-linguistic research do not concern all forms of experimental methods. In my opinion, the problems differ in magnitude. The difficulties in cross-linguistic research depend on the participants’ task. The more the participants have to produce or decide, the fewer challenges we have in cross-linguistic research. In my opinion, the most complicated case in this field is when we measure times (reading, reaction, regression, etc.) and when we design sentence items with or without further context.

Ideal cross-linguistic experimental work has to be planned from the beginning. It is more profitable to generate experimental items simultaneously for several languages and not successively, but this depends on the research question and the linguistic phenomenon under investigation. In our German-Russian experiment on language processing, in which we dealt with aspectual mismatch and investigated languages that are completely different in this regard, the language-immanent structural and grammatical differences could be balanced very well without compromising the validity of the data.
Regarding the inner-Slavic comparative studies of aspect usage or processing, we have to mention that it pays off to generate experimental items simultaneously. It is very important to differentiate the factors that influence aspectual use in the respective languages. These are not methodological problems, but problems that deal with the language-specific characteristics of aspect use. It is also important to investigate the differences and not the similarities, and to formulate a hypothesis by comparing the diversity. Only in this case can true equivalence in the experimental design be reached and valid data obtained. There are well-proven experiments for testing the use and processing of the verbal aspect. We only have to be careful in translating or adapting them for the specific Slavic research question.

References

Abubakar, A. (2015). Equivalence and transfer problems in cross-cultural research. International Encyclopedia of the Social & Behavioral Sciences, Second Edition, 929-933.
Altmann, G. T., & Kamide, Y. (1999). Incremental interpretation at verbs: Restricting the domain of subsequent reference. Cognition, 73, 247-264.
Altshuler, D. (2010). Aspect in English and Russian: Flashback discourses. In A. Grønn et al. (Eds.), Russian in contrast. Oslo Studies in Language 2(1), 75-107.
Anstatt, T. (2003). Aspekt, Argumente und Verbklassen im Russischen (Habilitationsschrift, Universität Tübingen).
Au, T. K. (1992). Cross-linguistic research on language and cognition: Methodological challenges. In H.-C. Chen & O. J. L. Tzeng (Eds.), Language processing in Chinese (pp. 367-381). Amsterdam: North-Holland.
Austin, J. L. (1962). How to do things with words. Cambridge: Harvard University Press.
Avilova, N. S. (1976). Vid glagola i semantika glagol’nogo slova. Moskva: Izd. Nauka.
Bailyn, J. F. (1995). A configurational approach to Russian “free” word order (Dissertation, Cornell University, Ithaca, NY).
Berger, T. (2015, September). Upotreblenie glagol’nogo vida v “hedged performatives”. Talk at the meeting of the Grammatic Committee in Moscow.
Berger, T., & Heck, S. (2014, September). Performativní užívání dokonavého prézentu v češtině ve srovnání s jinými slovanskými jazyky. Talk at the Korpusová slavistika conference in Prague.
Bondarko, A. V. (1971). Vid i vremja russkogo glagola. Značenie i upotreblenie. Moskva: Prosveščenie.
Bondarko, A. V. (1983). Principy funkcional’noj grammatiki i voprosy aspektologii. Leningrad: Nauka.
Bott, O. (2010). The processing of events. Amsterdam, Philadelphia: John Benjamins Publishing Co.
Bott, O. (2013). Processing domain of aspectual interpretation. In B. Arsenijević, B. Gehrke, & R. Martín (Eds.), Studies in the composition and decomposition of event predicates (pp. 195-230). Dordrecht, Heidelberg, New York, London.
Bott, O., & Gattnar, A. (2015). The cross-linguistic processing of aspect – An eyetracking study on the time-course of aspectual interpretation in German and Russian. Language, Cognition and Neuroscience, 877-898.
Brennan, J., & Pylkkänen, L. (2008). Processing events: Behavioral and neuromagnetic correlates of aspectual coercion. Brain and Language, 106, 132-143.
Breu, W. (Ed.) (2000). Probleme der Interaktion von Lexik und Aspekt (ILA). Tübingen: De Gruyter.
Čertkova, M. J. (1996). Grammatičeskaja kategorija vida v sovremennom russkom jazyke. Moskva: Moskovskij Gosudarstvennyj Universitet.
Collewijn, H. (1999). Eye movement recording. In R. H. S. Carpenter & J. G. Robson (Eds.), Vision research: A practical guide to laboratory methods (pp. 245-285). Oxford: Oxford.
Comrie, B. (1976). Aspect: An introduction to the study of verbal aspect and related problems. Cambridge: Cambridge University Press.
Declerck, R., Reed, S., & Cappelle, B. (2006). The grammar of the English verb phrase, vol. 1: The grammar of the English tense system. Berlin: Mouton de Gruyter.
de Swart, H. (2000). Tense, aspect and coercion in a cross-linguistic perspective. In M. Butt & T. Holloway King (Eds.), Proceedings of the Berkeley Formal Grammar Conference, University of California, Berkeley.
de Swart, H. (2013). Telicity features of bare nominals. Draft, Universiteit Utrecht. Retrieved from http://www.hum.uu.nl/medewerkers/b.s.w.lebruyn/telicity.pdf
de Wit, A., & Brisard, F. (2014). A cognitive grammar account of the semantics of the English present progressive. Linguistics, 50, 49-90.
Dickey, S. (2000). Parameters of Slavic aspect: A cognitive approach. Stanford.
Dowty, D. R. (1977). Toward a semantic analysis of verb aspect and the English imperfective progressive. Linguistics and Philosophy, 1, 45-77.
Dübbers, V. (2015). Factors for aspect choice with iterative situations in Czech. In R. Benacchio (Ed.), Glagol’nyj vid: Grammatičeskoe značenie i kontekst / Verbal aspect: Grammatical meaning and context (pp. 79-91). [Die Welt der Slaven, Sammelbände. Sborniki, 56] München, Berlin, Washington/D.C.: Otto Sagner Verlag.
Ferretti, T. R., Kutas, M., & McRae, K. (2007). Verb aspect and the activation of event knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(1), 182-196.
Filip, H. (1993). Verbal Aspect and Object Case Marking: A Comparison between Czech and Finnish. In J. A. Nevis & V. Samiian (Eds.), Proceedings of the Western Conference on Linguistics (WECOL) 22 (pp. 43-59). Fresno: California State University.
Filip, H. (2007). Events and maximalization: The case of telicity and perfectivity. In S. Rothstein (Ed.), Theoretical and crosslinguistic approaches to the semantics of aspect (pp. 217-256). Amsterdam: John Benjamins.
Forsyth, J. (1970). A grammar of aspect: usage and meaning in the Russian verb. Cambridge: Univ. Press.
Galton, H. (1976). The main functions of the Slavic verbal aspect. Skopje.
Gattnar, A. (2013). Konkurencija vidov glagola v povtorjajuščichsja kontekstach v zavisimosti ot tipa i pozicii kvantifikatora. Voprosy jazykoznanija, 2, 52-68.
Gattnar, A. (2015). Die Produktion und Perzeption expliziter performativer Sprechakte im Russischen. Talk held at the Deutscher Slavistentag in Gießen 2015.
Gehrke, B. (2008). Goals and sources are aspectually equal: Evidence from Czech and Russian prefixes. Lingua, 118, 1664-1689.
Israeli, A. (2001). The choice of aspect in Russian verbs of communication: Pragmatic contract. Journal of Slavic Linguistics (JSlL), 9(1), 49-98.
Ivančev, S. (1971). Problemi na aspektualnostta v slavjanskite ezici. Sofia, Izdatelstvo na Balgarskata ezik 8.4/5, 363-86.
Jakobson, R. (1971). Zur Struktur des russischen Verbums. In Selected writings (Vol. II, pp. 3-15). The Hague-Paris: Mouton.
King, T. H. (1993). Configuring topic and focus in Russian (Dissertation, Stanford University).
Leech, G. (2004). Meaning and the English verb. Harlow: Pearson Education.
Lehmann, V. (1999). Aspekt. In H. Jachnow (Ed.), Handbuch der sprachwissenschaftlichen Russistik und ihrer Grenzdisziplinen (pp. 214-242). Wiesbaden: Harrassowitz.
Maslov, J. S. (1984). Očerki po aspektologii. Leningrad: Izdat. Leningradskogo Universiteta.
Mehlig, H. R. (1981). Satzsemantik und Aspektsemantik im Russischen (zur Verbalklassifikation von Zeno Vendler). Slavistische Linguistik, 1980, 95-151.
Padučeva, E. V. (1996). Semantičeskie issledovanija: Semantika vremeni i vida v russkom jazyke. Semantika narrativa. Moskva: Škola Jazyki Russkoj Kul’tury.
Palmer, F. R. (1989). The English verb. London: Longman.
Petruchina, E. V. (1978). O funkcionirovanii vidovogo protivopostavlenija v russkom jazyke v sopostavlenii s češskim (pri oboznačenii povtorjajušichsja dejstvij). Russkij jazyk za rubežom, 1, 57-60.
Petruchina, E. V. (2000). Aspektual’nye kategorii glagola v russkom jazyke v sopostavlenii s češskim, slovackim, pol’skim i bolgarskim jazykami. Moskva: Izdat. Moskovskogo Universiteta.
Pollatsek, A. A., Rayner, K. K., & Collins, W. E. (1984). Integrating pictorial information across eye movements. Journal of Experimental Psychology: General, 113(3), 426-442.
Rassudova, O. P. (1968/1982). Upotreblenie vidov glagola v sovremennom russkom jazyke. Moskva: Izd. Russkij jazyk.
Richardson, D. C., Spivey, M. J., & Wnek, G. (2008). Eye-tracking: Characteristics and methods. Eye-tracking: Research areas and applications. The Pennsylvania State University CiteSeerX Archives. Retrieved from http://psych.ucsc.edu/eyethink/publicationsassets/EyeTrackingEBBE.pdf
Slabakova, R. (2004). Effect of PV prefixes on object interpretation: A theoretical and empirical Issue. Cahiers linguistiques d’Ottawa, 32, 122-142, Department of Linguistics, University of Ottawa.
Smith, N., & Staudinger, B. (1997). Telizität im Deutschen. Sborník Prací Filozofické Brněnské Univerzity, A 45, 185-196.
Sonnenhauser, B. (2008). Aspect interpretation in Russian – a pragmatic account. Journal of Pragmatics, 40, 2077-2099.
Stunová, A. (1991). In defense of language-specific invariant meanings of aspect in Russian and Czech. Studies in West Slavonic and Baltic Linguistics, 291-313.
Stunová, A. (1993). A contrastive study of Russian and Czech aspect. Amsterdam.
Švedova, N. J. (1980). Russkaja grammatika, tom I. Moscow: Nauka.
Todorova, M., Straub, K., Badecker, W., & Frank, R. (2000). Aspectual coercion and the online computation of sentential aspect. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of CogSci 2000 (pp. 523-528). Philadelphia: Cognitive Science Society.
Vinogradov, V. V. (1938). Sovremennyj russkij jazyk. Grammatičeskoe učenie o slove. Moskva: Učpedgiz.
Zaliznjak, A. A., & Šmelev, A. D. (2000). Vvedenie v russkuju aspektologiju. Moskva: Jazyki russkoj kul’tury.

Variation in Russian verbal prefixes and psycholinguistic experiments

Anastasia Makarova

Abstract: The present article reports on two psycholinguistic experiments that concerned Russian aspectual morphology.
Revealing native speakers’ preferences in the use of various morphemes, the two studies shed light on the distribution of, and the motivation behind the choice of, prefixes and suffixes associated with attenuative (e.g. priotkryt’ “open slightly”, podpravit’ “correct slightly”) and semelfactive (e.g. kriknut’ “shout once”, sglupit’ “do something silly once”) Aktionsarten in Russian. The experiments provided additional support for hypotheses based on corpus data, and the results made it possible to identify statistically robust tendencies and to analyze the complex distributions of morphemes in more detail. The present article focuses on methodological questions, with special emphasis on challenges, particularly those related to peculiarities of Russian morphology, the strengths and weaknesses of the experiment design, and the interpretation of results. Both experiments included cloze-test tasks with existing and nonce verbs as stimuli, and the design differed slightly due to the different types of morphological variation. I demonstrate that one and the same general research question often requires tailored experimental techniques even when applied to seemingly similar types of language data.

1 Introduction. Studying rival forms: why and how?

Language users make a choice whenever two or more forms express a similar meaning. Such choices need to be made for forms of different types and complexity, from syntactic constructions and lexical synonyms to morphemes. Factors that influence the choice include meaning, environment (phonological, morphological and syntactic) and frequency effects (Baayen et al., 2013). The different ways of expressing closely related semantics stand in different types of relationships: rival forms can be in free variation (they can be used in one and the same context) or in complementary distribution (only one of the forms can be used in a given context).
Studying the behavior of rival forms is essential for our understanding of the form-meaning relationship in general. Modern linguistics has the necessary tools to study variation: large electronic corpora and experimental methods. Language corpora serve as a source of data, facilitate both qualitative and quantitative analyses, and enable us to advance hypotheses. Subsequent experiments test the predictions of these hypotheses. Slavic languages in general and Russian in particular have a rich system of verbal morphology that includes variation and thus serves as a good source of specific research questions, such as variation in the morphological marking of Aktionsarten, with which this paper deals. Furthermore, Russian has a valuable data resource, the Russian National Corpus (hereafter RNC, www.ruscorpora.ru).

The present paper focuses on morphological variation in the Russian aspectual system. Recent research indicates that morphological variation in Russian verbs is not free, since the choice between the available perfectivizing prefixes is motivated (Janda et al., 2013; Sokolova, 2012). The studies reported in the present paper extend the hypothesis about motivated prefix choice to another fragment of the aspectual system in Russian, namely Aktionsarten. I use the term Aktionsarten for what in the Russian aspectual tradition is referred to as sposoby dejstvija; cf. the detailed description in Zaliznjak and Šmelev’s (2000) authoritative book on Russian aspect (see footnote 1). Aktionsarten modify verbal meaning by adding quantitative and/or qualitative characteristics (e.g. spat’ “sleep” - pospat’ “sleep for a while”). Aktionsarten represent productive models in both written and spoken Russian and are described in the reference grammars of Russian (cf. Švedova et al., 1980). In most cases, one and the same Aktionsart can be expressed by a variety of prefixes and suffixes.
Furthermore, we find semantic and morphological overlap between different Aktionsarten. Whether the choice of a morpheme is motivated in each given case, and if so, how, remains an open question. Thus, Aktionsarten provide data for studying the distribution of variants in language.

Corpus analysis reveals a considerable degree of variation in the morphological marking of Russian Aktionsarten. In many cases different combinations of verbal stems and prefixes are equally legitimate (e.g. the attenuatives privorovyvat’ and podvorovyvat’, both meaning “steal a little bit from time to time”), suggesting that the formation of Aktionsarten is not merely a matter of a speaker’s lexical knowledge. Although corpus analysis enables us to observe some tendencies in the use of different morphological means and to make assumptions about the factors that determine language users’ choices, it cannot be considered conclusive. One important complicating factor is that corpus data include contexts where all factors (environment and frequency effects) are present together and where it is not possible to evaluate the contribution of each factor separately. Psycholinguistic experiments make it possible to control for different factors and to tease them apart; in other words, experiments allow us to measure the impact of different factors separately and in interaction, and further enable us to make assumptions about individual speakers’ competence.

Footnote 1: Aktionsarten facilitate a more detailed description of event types than, for example, Vendler’s four actional classes, which are arguably less revealing for Russian data. For Russian, such Aktionsarten as ingressive, delimitative, cumulative, semelfactive, iterative, distributive and others are identified. For more detail, see Zaliznjak and Šmelev (2000, pp. 104-127).
The focus of the present article is primarily on methodology. After a brief introduction to Russian Aktionsarten and the relevant hypotheses in section 2, I focus on one experiment that targeted the Russian attenuative Aktionsart and address practical questions of experiment design, choice of stimuli etc. in sections 3 and 4. In section 5 I describe a similar experiment on the Russian semelfactive Aktionsart with a slightly different design, and in section 6 I discuss the advantages and challenges of the two experiments. Section 7 summarizes and provides general conclusions.

2 Variation in Russian attenuative Aktionsart. Background

Attenuatives refer to events that are characterized by lower intensity than the events to which they are related; they describe events that are often secondary and incomplete (Isačenko, 1982; Zaliznjak and Šmelev, 2000). Examples include priotkryt’ “open slightly” and podpravit’ “amend slightly”. In the pair otkryt’ - priotkryt’ “open - open slightly”, the unprefixed otkryt’ describes a telic event with a natural result, whereby something, for instance a door or a window, ends up in a (fully) open position. The event described by priotkryt’ is a part of the event described by otkryt’, since it means “open slightly”, i.e. “open to a lesser degree”. Although the resulting verb priotkryt’ is also telic, it is not clear what degree “slightly” refers to.

Russian attenuatives are formed via prefixation from verbs with varying semantics and shape. Attenuative morphemes can be attached to prefixed or unprefixed bases, and to both perfective and imperfective verbs. The most common morphological markers of attenuatives include the prefixes pri- and pod- (prikupit’ “buy some”, podmorozit’ “get slightly frosty”) as well as combinations of the prefixes pri-, pod- and po- with the suffixes -iva-/-va-/-a- and -nu-/-anu- (priskulivat’ “whimper along”, poddaknut’ “say ditto to” and pokurivat’ “smoke a little bit from time to time”).
As follows from the examples above, Russian attenuative morphology represents a case of morphological rivalry, since several prefixes and prefix-suffix combinations essentially perform the same task: they form attenuative verbs. As suggested in a recent corpus study, the prefixes pri- and pod- are not randomly distributed across verbal stems (Makarova, 2014). The dataset, based on the modern subcorpus of the RNC (texts from 1950 and later), includes 605 semantically tagged verbs (a total of 255,430 attestations) that form attenuatives with pri- and pod- (see footnote 2). These include 219 verbs (35,924 attestations) that are attested only with the attenuative pri-, 205 verbs (15,625 attestations) that are attested only with the attenuative pod-, and 181 verbs (203,881 attestations) that are attested with both attenuative prefixes. Makarova (2014, chapter 6) demonstrates that the choice between pri- and pod- is based on the type of semantic interaction between the stem and the prefix. Although similar in their attenuative uses, the two prefixes are polysemous entities with different semantics. The attenuative submeaning is integrated in the network of other meanings of each prefix; the networks of the two prefixes are different and have different spatial prototypes (see footnote 3). The analysis of the RNC sample revealed three types of stem-prefix interaction: 1) cases of semantic overlap and strong attraction of one of the attenuative prefixes, 2) cases of no semantic overlap and contrastive use of the prefixes, and 3) cases of semantic overlap with both prefixes and no semantic contrast in their use. The dataset consisted of both high- and low-frequency verbs and, due to the nature of corpus data, might include idiosyncrasies.

The questions that remain unanswered after the corpus analysis are: What is the relationship between corpus data and native speakers’ competence? Is the behavior of the native speakers parallel to corpus data?
What is the psycholinguistic status of prefix variation in attenuatives? Although a corpus gives a broad picture of language use in a speech community, it does not necessarily mirror the mental grammars of individual speakers (their competence). Hence, in order to answer the questions above, a psycholinguistic experiment was developed and carried out.

Footnote 2: The dataset is made available through the Tromsø Repository of Language and Linguistics (http://opendata.uit.no/dvn) at http://hdl.handle.net/10037.1/10046.

Footnote 3: The relationship between different submeanings of the prefixes is, however, beyond the scope of the present article; for discussion see Dobrušina and Paillard (2001), Endresen et al. (2012), Jakunina (2001), Janda et al. (2013), Kagan (2012, 2013), Makarova (2014), Plungian (2001), and Viimaranta (2012 a, b).

3 Russian attenuatives. Experiment design

Since the main objective of the experiment was to see whether the responses of native speakers mimic the data obtained from the RNC, contexts attested in the RNC served as the most suitable basis for the experiment. The stimulus set included 59 Russian sentences culled from the RNC. Verbal prefixes in all sentences were replaced with gaps, and the participants were asked to fill in the gaps with the most appropriate prefix. This was a cloze-test type of task; in other words, no options were provided:

(1) a. Original sentence in the RNC:
       Deduška otkryvaet vorota i, priderživaja nogoj, propuskaet menja.
       “My grandfather opens the gate and, holding it with his foot, lets me in.”
    b. As presented in the experiment:
       Deduška __kryvaet vorota i, __derživaja nogoj, __puskaet menja.

In total, the informants had to fill in 164 prefixes, of which 59 were attenuatives in pri- and pod- in the original examples from the RNC (targets), while the rest were non-attenuatives in the original sentences (controls).
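The gap-creation step illustrated in (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name make_cloze_item, the verb-to-prefix mapping and the gap marker "__" are hypothetical and are not taken from the materials of the experiment, where the stimuli may well have been prepared by hand.

```python
def make_cloze_item(sentence, prefixed_verbs, gap="__"):
    """Replace the prefix of each listed verb form with a gap marker.

    `prefixed_verbs` maps a verb form to its prefix. Both the mapping and the
    gap marker are illustrative assumptions, not the author's actual tooling.
    """
    gapped = sentence
    for verb, prefix in prefixed_verbs.items():
        if not verb.startswith(prefix):
            raise ValueError(f"{verb!r} does not start with prefix {prefix!r}")
        if verb not in gapped:
            raise ValueError(f"{verb!r} not found in the sentence")
        # Keep the stem and put the gap where the prefix used to be.
        gapped = gapped.replace(verb, gap + verb[len(prefix):])
    return gapped


# Sentence (1a) from the text, with the three prefixed verb forms it contains.
original = "Deduška otkryvaet vorota i, priderživaja nogoj, propuskaet menja."
verbs = {"otkryvaet": "ot", "priderživaja": "pri", "propuskaet": "pro"}

item = make_cloze_item(original, verbs)
print(item)
```

Applied to sentence (1a), the sketch yields the gapped version shown in (1b). Because no answer options are attached to the gaps, the item remains an open cloze task rather than a forced choice.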
In (1) above, priderživaja “holding slightly” is the target, while otkryvaet “opens” and propuskaet “lets in” are controls. Since the experiment targeted the representation of Aktionsarten in the mental grammars of the speakers and not lexical access, no reaction times were measured.

The experiment was carried out in St Petersburg in fall 2012. A total of 122 informants, all native speakers of Russian living in Russia and differing in age, education and occupation, took part in the experiment (see footnote 4). This yielded 20,008 responses: 7,198 for targets and 12,810 for controls. Of these, 14,062 responses (70%) matched the original prefixes used in the RNC examples. As for the 59 target verbs, 69% of the responses returned for them had either the pri- or the pod- prefix. This is a high performance rate for a type of test in which the informants were not restricted to a limited set of prefixes. In cloze tests the proportion of responses of one type is referred to as the cloze value or cloze probability. Cloze values above 40% are considered high (Coulson et al., 2006). Thus, the rate of 69% registered for target stimuli in my experiment is very high. The overall results show that the informants understood the task well and that the experiment was successful.

Footnote 4: The results of the experiment, as well as the description of the stimuli, the R script for analysis and general information about the participants, are available at http://hdl.handle.net/10037.1/10046.

Three types of verbs were included as targets in the experimental tasks: verbs that in the RNC are attested only with the prefix pri- in attenuatives (10 verbs), verbs that in the RNC are attested only with pod- (11 verbs), and verbs that are attested with both prefixes in the RNC (38 verbs). Lists of verbs are provided in the Appendix at http://hdl.handle.net/10037.1/10046. In each group, as well as for the controls, verbs of different frequencies were included.
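The response counts reported above follow directly from the design figures, and the overall match rate can be read as a cloze value. The following sketch recomputes them; the constant and function names are illustrative, and only the numbers themselves come from the text.

```python
# Design figures reported in the text.
N_INFORMANTS = 122
N_GAPS = 164                      # prefixes to be filled in across all sentences
N_TARGETS = 59                    # gaps with attenuative pri-/pod- in the RNC original
N_CONTROLS = N_GAPS - N_TARGETS   # remaining gaps (non-attenuative originals)

MATCHING_RESPONSES = 14062        # responses identical to the RNC prefix (from the text)
HIGH_CLOZE_THRESHOLD = 0.40       # threshold cited from Coulson et al. (2006)


def cloze_value(n_of_type, n_total):
    """Proportion of responses of one type among all responses."""
    return n_of_type / n_total


total_responses = N_INFORMANTS * N_GAPS
target_responses = N_INFORMANTS * N_TARGETS
control_responses = N_INFORMANTS * N_CONTROLS
overall_match = cloze_value(MATCHING_RESPONSES, total_responses)

print(total_responses, target_responses, control_responses)
print(f"overall match rate: {overall_match:.1%}")
print("above the high-cloze threshold:", overall_match > HIGH_CLOZE_THRESHOLD)
```

Each informant contributes one response per gap, so the totals are simple products of the number of informants and the number of gaps of each type.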
In order to establish the frequencies, the latest frequency dictionary for Russian was consulted (Lyashevskaya & Sharoff, 2009). In order to divide the verbs into three groups according to their frequency, the method of quartiles was applied (Rasinger, 2008, pp. 121-123; Wieling et al., 2011). Quartiles divide a list of numbers into four portions: the first quartile (Q1) cuts off the lowest 25% of the data, the second quartile (Q2) is the median, which cuts the data into two equal parts, and the third quartile (Q3) cuts off the upper 25% of the data. Establishing the quartiles enabled us to see which verbs have a low frequency (the lowest 25%), which have a high frequency (the upper 25%) and which have an average frequency (the remaining 50% of the data, between Q1 and Q3). The procedure is described in more detail in the Appendix (http://hdl.handle.net/10037.1/10046).

The final version of the experiment was preceded by two pilot studies that were carried out in order to detect problematic contexts. These were eliminated in the final version.

4 Russian attenuatives. Experiment results

In this section I present the major results of the experiment and show that the distribution of the Russian attenuative prefixes pri- and pod- in the experimental data is not random. Since we are only interested in the distribution of pri- and pod-, in the following I will ignore cases where the informants selected other prefixes. For the present purposes, it is convenient to distinguish between “matching responses” and “non-matching responses”. Matching responses are those that match the prefixes in the original sentences in the RNC, while non-matching responses are those where the informants selected the opposite prefix. We expect a high number of matching responses, since such a result indicates that the informants use the prefixes in the same way as they are used in the RNC. Let us take a look at the results for two groups of data.
First, consider the numbers of matching and non-matching responses for verbs that in the original contexts were attested exclusively with pri- or exclusively with pod-:

                 Matching responses   Non-matching responses   Total (pri- and pod-)
pri- in RNC      889 (97.7%)          21 (2.3%)                910 (100%)
pod- in RNC      1044 (99.4%)         6 (0.6%)                 1050 (100%)

Table 1: Responses for verbs that in the RNC are used with pri- or pod- exclusively

The high proportions of matching responses (first column of table 1) indicate that the distribution of the prefixes in the experimental data is very similar to the distribution in the RNC data. For verbs that in the RNC are attested with both prefixes, we expect more variation in the responses. Here is what we find in the experimental data:

                        pri- responses   pod- responses   Total (pri- and pod-)
pri- in RNC context     981 (67.7%)      467 (32.3%)      1448 (100%)
pod- in RNC context     417 (27.65%)     1091 (72.35%)    1508 (100%)

Table 2: Responses for verbs that in the RNC are used with both prefixes

As follows from table 2, the number of responses that match the prefixes used in the original contexts is higher than the number of non-matching responses. Not surprisingly, the proportion of matching responses for this group of verbs is much lower than that observed for verbs with clear preferences presented in table 1. However, the number of matching responses is still very high given that each verb could theoretically be used with either of the prefixes. Moreover, the distribution of the prefixes is not random (χ² = 474.8; df = 1; p-value < 2.2e-16; Cramer’s V = 0.4).

If we zoom in further on verbs that are attested in the RNC with both prefixes, we find that they do not represent a homogeneous group. Among the verbs in this group we find verbs for which the use of the prefixes is contrastive, i.e. pri- and pod- are associated with (slightly) different meanings, and verbs for which it is non-contrastive, i.e. for which it was not possible to find contexts with a clear semantic contrast:

                   Matching responses   Non-matching responses   Total (pri- and pod-)
contrastive        1286 (77.4%)         376 (22.6%)              1662 (100%)
non-contrastive    786 (60.74%)         508 (39.26%)             1294 (100%)

Table 3: Contrastive vs. non-contrastive use of the prefixes

As shown in table 3, verbs with contrastive use of the prefixes were marked with the matching prefix more often than verbs with non-contrastive use of the prefixes (χ² = 95.2; df = 1; p-value < 2.2e-16; Cramer’s V = 0.18). This distribution suggests that the two groups are indeed different.

As described in section 3, the verbs that were used as stimuli in the experiment belong to three different groups depending on their frequency. I distinguish between verbs of high (14 verbs), average (22 verbs) and low (23 verbs) frequency. Does frequency have an effect on the native speakers’ choice of the prefix? Consider the following table.

Frequency   Matching responses   Non-matching responses   Total (pri- and pod-)
low         1162 (76%)           364 (24%)                1526 (100%)
average     1518 (76%)           477 (24%)                1995 (100%)
high        1325 (95%)           70 (5%)                  1395 (100%)

Table 4: Matching and non-matching responses for verbs of different frequency

We see that the numbers of matching responses are high for all three groups. However, the informants treated verbs from the different groups differently. While low and average frequency verbs were marked with matching prefixes in 76% of the cases, high frequency verbs received as much as 95% matching responses. Thus, the analysis of verbal frequency revealed that speakers are sensitive to frequency, since high frequency verbs receive more matching responses than other verbs. At the same time, even for low frequency verbs the proportion of matching responses reaches 76%.
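The test statistics reported for tables 2 and 3 can be recomputed from the counts in the tables. The sketch below is a minimal illustration, not the author's R script: it assumes a 2×2 chi-square test with Yates' continuity correction (the default for 2×2 tables in R's chisq.test), an assumption that the text does not state explicitly but that appears consistent with the reported values. The function names are hypothetical.

```python
import math


def chi2_yates_2x2(a, b, c, d):
    """Chi-square statistic with Yates' continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for 2x2 tables; the correction subtracts n/2 from |ad - bc|.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


def cramers_v_2x2(chi2, n):
    """Cramer's V for a 2x2 table, where min(rows, columns) - 1 equals 1."""
    return math.sqrt(chi2 / n)


# Counts copied from table 2 (pri-/pod- responses) and table 3 (contrastive use).
table2 = (981, 467, 417, 1091)
table3 = (1286, 376, 786, 508)

chi2_t2 = chi2_yates_2x2(*table2)
chi2_t3 = chi2_yates_2x2(*table3)
v_t2 = cramers_v_2x2(chi2_t2, sum(table2))
v_t3 = cramers_v_2x2(chi2_t3, sum(table3))

print(f"table 2: chi2 = {chi2_t2:.2f}, V = {v_t2:.2f}")
print(f"table 3: chi2 = {chi2_t3:.2f}, V = {v_t3:.2f}")
```

Under this assumption the sketch returns values close to the reported χ² of 474.8 and 95.2, with Cramer's V of about 0.40 and 0.18. The effect size is worth reporting alongside the p-value, because with almost 3,000 responses even a weak association would reach statistical significance.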
The fact that even low frequency verbs receive 76% matching responses strongly suggests that we are not dealing with the extraction of prefabricated chunks (Dąbrowska, 2004, pp. 18-22) from the memory of the speakers, but rather that there is a psychologically realistic model of attenuative formation behind the choice of the prefixes made by the informants.

We can conclude that the main goal of the experiment has been achieved and that the hypothesis about the non-random distribution of rival prefixes is supported. However, the experiment can shed more light on Russian attenuatives. In particular, it enables us to situate prefix variation in a broader context and to address a pivotal question for a general theory of attenuatives: when do speakers use attenuatives, and what are the factors that induce them to do so? In the following, I only summarize the main points of the analysis described in Makarova (2014, chapter 8).

In total, the proportion of attenuative responses for target stimuli is 75.5%, while for non-target controls (prefixes other than attenuative in the original contexts), only 5% of the returned responses were attenuative. This suggests that there is something in the context that motivates the use of attenuatives for target stimuli. Three possible factors were identified: lexical triggers in the context, frequency, and morphology. First, the effects of each of these three factors were analyzed separately.

Let us consider lexical context first. The participants in the experiment were not provided with contexts larger than separate sentences, and thus were not able to make reliable assumptions regarding the discourse function of the sentences. Furthermore, in all stimulus sentences, the use of the attenuatives was not obligatory (unlike aspect, Aktionsarten in Russian are rarely obligatory); other prefixes, including neutral perfectivizing prefixes, were plausible responses.
Therefore, in order to avoid unnecessary speculation about the semantics of the sentence as a whole, the analysis of the results focused on what could be observed and measured, i.e. specific lexical items in the context (adverbs meaning “slightly”, politeness markers and others). Since redundancy is an important aspect of human language, it is reasonable to expect attenuatives in contexts where the meaning of “slightly, a little bit” is already expressed (see footnote 5). To evaluate the role of lexical context, examples with confounding morphological factors were eliminated for the test. Of the remaining 25 examples, eleven included attenuating adverbs that can trigger the use of attenuatives. The numbers of attenuative and non-attenuative responses for the 25 sentences are presented in table 5.

Footnote 5: Redundancy is well-attested in Russian. Consider examples like my delaem “we do”, where the person is expressed in both the pronoun and the verbal ending; doechat’ do goroda “reach the city”, where the meanings of the preposition and the prefix overlap; or napisat’ “write (perfective)”, where the semantic contribution of the prefix na- “on” is virtually invisible because it overlaps with the meaning of the unprefixed verb, since writing presupposes a surface.

                             Examples with lexical support   Examples without lexical support   Total
Number of attenuatives       782                             897                                1679
Number of other responses    560                             811                                1371

Table 5: Attenuative vs. other responses in sentences with and without lexical support for the use of attenuatives

No indications of a strong relationship between the presence of certain lexical items in the context and the choice of an attenuative prefix were found: although the test is statistically significant (χ² = 9.8; df = 1; p-value = 0.001), Cramer’s V is only 0.057, which is too low to indicate a reportable effect.

When it comes to frequency, verbs whose attenuatives have a high frequency in the RNC returned more attenuative responses than verbs whose attenuatives have a low or average frequency in the RNC.
Consider table 6:

Frequency   Attenuative     Other          Total
high        1414 (82.8%)    294 (17.2%)    1708 (100%)
average     2192 (81.7%)    492 (18.3%)    2684 (100%)
low         1826 (65.1%)    980 (34.9%)    2806 (100%)

Table 6: Attenuative vs. other responses for verbs with different frequencies

There is also a positive and statistically significant correlation between the number of returned attenuatives and the relative frequency of attenuatives compared to other prefixed verbs formed from the same stem. For each sentence, the number of attenuative responses was collected along with the frequency of the attenuative verb in the RNC divided by the frequency of the whole cluster of prefixed verbs with the same root (the attenuative ratio). These two measures were then analyzed with Pearson’s correlation test, which measures the strength of the relationship between two variables. The value returned by the statistical software ranges from -1.0 to 1.0. The null hypothesis for a correlation test is that there is no correlation between the variables (r = 0). In the case under scrutiny, however, the statistical test revealed a positive correlation of 0.62 (t = 6.002; df = 57; p-value < 0.01). Thus, the two variables, i.e. the number of attenuative responses and the attenuative ratio, are strongly correlated. We can therefore conclude that the choice of prefixes and frequency are not independent of one another.

As for the presence of morphological triggers, Townsend (1975, pp. 129-130), Isačenko (1960, pp. 238-294), and Zaliznjak and Šmelev (2000, pp. 119-127) have pointed out that attenuative prefixes tend to co-occur with certain other affixes (other prefixes, e.g. priotkryt’ “open slightly”; the semelfactive suffix, e.g. prikriknut’ “shout slightly once”; and secondary imperfective suffixes, e.g. podgljadyvat’ “peep, peek”).
This suggests that the presence of these affixes in the target stimuli can be an important factor influencing the choice of speakers in an experimental setting. While for the RNC data we cannot say whether it was the semelfactive suffix that motivated the use of the attenuative prefix or vice versa, in the experimental setting the other affixes are already present in the stimuli. The speakers are therefore choosing between attenuative and other prefixes, and we can see whether the use of other affixes favors the choice of attenuatives. Consider table 7:

                             Examples with morphological triggers   Examples without morphological triggers
Number of attenuatives       1618 (88.5%)                           1134 (58%)
Number of other responses    212 (11.5%)                            818 (42%)
Total                        1830 (100%)                            1952 (100%)

Table 7: The number of attenuative responses for stimuli with and without morphological triggers

As follows from the table, the number of attenuatives returned by the informants is higher for contexts with morphological triggers. In the experiment, the presence of morphological triggers thus favored the use of attenuative prefixes as opposed to other prefixes, and this result is supported statistically (χ² = 436.65; df = 1; p-value < 2.2e-16; Cramer’s V = 0.34). Furthermore, among the morphemes that can have an effect on the choice of attenuative vs. other prefix, the presence of secondary imperfective suffixes is the strongest predictor of the use of the attenuative.

In order to arrive at a better understanding of the data and to see whether and how the three factors interact, I used the statistical software R (R Development Core Team, 2011) and applied two statistical models to the data: classification trees (ctree) and random forests (cforest). Statistical models such as classification trees and random forests can assess variable importance and provide important insights into the interactions of the factors.
The random forest model measures the relative importance of variables in the choice between X and Y. Classification trees group the predictors and visualize how well the different combinations of predictors account for the choice between X and Y. Random forests and classification trees in many ways resemble logistic regression. However, unlike logistic regression, random forests and classification trees do not require a series of trial-and-error steps in order to arrive at the optimal model, as the models choose the optimal combinations of factors on their own (Strobl et al., 2009). Classification trees, random forests and regression models were compared on linguistic data in Baayen et al. (2013), where it was shown that trees and forests produce results comparable to those of regression models and represent a reliable tool for analyzing linguistic data sets.

By applying these two statistical methods, I obtain data that enable me to measure the role of frequency, context and morphology in the use of attenuatives. The results of the cforest indicate that, when analyzed in interaction, all factors are important, and the ranking of the factors is as follows: morphology > frequency > lexical triggers. More details can be found in the Appendix (available at http://hdl.handle.net/10037.1/10046).

Lexical context belongs to the cues that are syntagmatically related to the target stimuli. Syntagmatic relations are about possible combinations of linguistic units and concern positioning in a context. As we have seen, lexical context is not a very strong predictor of the use of attenuatives. Hence, the informants are not very dependent on syntagmatic relations. Morphological cues, on the other hand, are paradigmatically related to the target stimuli. Paradigmatic relations are about functional contrasts, differentiation and substitution. Unlike syntagmatically related entities, paradigmatically related entities are absent from the contexts.
As follows from the analysis above, morphological factors are strongly associated with the use of attenuatives; therefore, paradigmatic relations seem to represent an important cue for the speakers.

The ctree analysis brings the factors into groups of different size and shows how the different groupings enable us to sort the data. What the model does is simply present the results in an intuitively clear way, returning a tree where each branch represents a grouping of factors. Figure 1 presents the ctree returned by R:

[Figure 1 (tree diagram, not reproduced): the root node 1 splits on Morphology (p < 0.001); the inner nodes 2, 7, 12 and 15 split on Frequency (p < 0.001, except node 12 with p = 0.009), and the inner nodes 4, 8 and 11 split on LexicalTriggers (p < 0.001). Each terminal node shows the proportions of attenuative (“atten”) and other responses on a scale from 0 to 1: node 3 (n = 1952), node 5 (n = 1342), node 6 (n = 610), node 9 (n = 1098), node 10 (n = 610), node 13 (n = 488), node 14 (n = 244), node 16 (n = 610), node 17 (n = 244).]

Figure 1: Recursive partitioning tree for the factors influencing the choice of attenuative prefix

The conditional inference tree in figure 1 guides us through the complex interaction of the factors and, although not all nodes can be easily explained, the tree facilitates several observations. Crucially, we see that the factors do indeed interact: only factors in combinations sort the data. This is an important finding, which can explain the deficiencies of the analyses where the factors were studied in isolation.
The number of attenuative and other responses is visualized in the bottom portion of the figure by bars where light gray represents number of attenuative responses and dark gray stands for other responses. The tree in figure 1 indicates that there are several conditions in which attenuative responses are highly possible. First of all, the chances of getting attenuative responses are highest (see the leftmost bar with the largest portion of attenuative responses) for cases where there are morphological conditions and where the frequency in the RNC is high and low (in the figure, it is the following sequence of nodes: 1-2-3). The same holds for verbs with morphological triggers, with average frequency and where there are lexical triggers in the context (nodes 1-2-4-5). The number of attenuative responses is slightly lower for cases where there are morphological triggers, the frequency is low and there is no lexical support in the context (nodes 1-2-4-6). Other combinations that give high numbers of attenuative responses include: no morphological triggers, average or high frequency and lexical triggers in the context (nodes 1-7-11-12-13 and 14). The lowest numbers of attenuative responses are obtained for combinations of no morphological triggers and low frequency (nodes 1-7-8-9 and 10). From figure 1 we see that morphology is on the very top of the tree, thus being the best predictor for the choice of attenuative prefixes. We also see that lexical triggers come into play only after morphology and frequency. These findings can shed light on general questions of language competence of the speakers of Russian. On the one hand, the participants in the experiment seem to be more sensitive to the properties of the words themselves than to the syntagmatically related words (lexical context). On the other hand, they are very sensitive to paradigmatic relations. The attenuative ratio, i.e. 
the ratio of the frequency of attenuatives to the frequency of other prefixed verbs, involves paradigmatic relations. Attenuatives are in paradigmatic relationships to other prefixed verbs within one verbal cluster. The analyses presented above suggest that paradigmatic relationships are more important than syntagmatic relationships, because, as we have seen, frequency is more important than lexical context.

Summing up this section, the experiment has confirmed the hypothesis, since the choices of morphological variants are not random and the distribution of pri- and pod- mimics that in the RNC. The results are statistically significant (chi-squares with effect sizes). Furthermore, the experiment enabled us to measure the effect of several factors on the choice of attenuative prefixes as opposed to other prefixes, as well as the relative importance of the factors (ctree, cforest).

5 Other Aktionsart ― same methodology?

An important methodological question is whether we can use the same experiment design and analysis to investigate other Aktionsarten that involve morphological rivalry. In the following, I briefly present a case study of Russian semelfactives and argue that the methodology used in the attenuative experiment is not universal and needs to be adjusted in order to be applicable to other Aktionsarten.⁶

Russian semelfactives denote single actions; they select a single cycle in a repeatable series of events, e.g. machnut’ “wave once”, skripnut’ “squeak once” (Townsend, 1968; Švedova et al., 1980; Zaliznjak and Šmelev, 2000; Janda, 2007; Nesset, 2013). Semelfactives are primarily formed from verbs denoting simple physical actions, acoustical or optical phenomena by suffixation (-nu-/-anu-) and prefixation (s-). A corpus study by Dickey and Janda (2009) suggests that -nu- and s- have a near-complementary distribution, and their distribution is motivated by both semantics and morphology.
Semantically, sound and impact verbs prefer the suffix -nu-, while verbs that describe movement and behavior use the prefix s- more often. When it comes to morphology, verbs belonging to the non-productive first conjugation tend to combine with the -nu- suffix, while -*ej verbs are only attested with the prefix s-. Strong tendencies are also attested for other morphological classes.

⁶ For details, see Makarova (2009), available at http://munin.uit.no/handle/10037/2377.

The corpus findings enable us to advance a hypothesis about the non-random distribution of the rival morphemes. The hypothesis predicts that in a psycholinguistic experiment we will observe a non-random distribution of the affixes under scrutiny. As in the case of attenuatives, we are dealing with variation in morphological marking; however, the nature of this variation is different. Instead of prefix vs. prefix rivalry, in semelfactives we find prefix vs. suffix rivalry. This presents a challenge if we want to use the same design as in the attenuative experiment: leaving gaps in both prefix and suffix positions for each verb may bias the responses. Hence, it is preferable that the participants produce the whole form in the experiment instead of simply adding a morpheme to a given base. Furthermore, using existing verbs and contexts from the RNC is problematic, since for existing verbs semelfactives may be directly retrieved from memory rather than spontaneously produced. Also, for such verbs frequency, morphological class and semantics co-exist and their effects cannot be measured separately. In order to control for semantics and morphology as independent factors, nonce-verbs and constructed contexts were used in the experiment. This further enabled us to eliminate frequency as a factor altogether.
Nonce-verbs were presented in finite and non-finite forms in a way that made their morphological class clear and in contexts that provided informants with some hints about the possible meaning of the verb. In addition, there was contextual motivation for the use of semelfactives (Makarova, 2009):

(2) Ptička tlikala vsë reže i reže, …………….. v poslednij raz i zamolčala.
“The bird was tlicking less and less, …………… for the last time and was silent again.”

In (2), the verb that the participant is supposed to use (tlikala) is underlined, and the phrase v poslednij raz “for the last time” triggers the use of the semelfactive instead of the blanks. Participants were not specifically asked to produce semelfactives; rather, they were required to provide “the best suitable form of a given verb”. Existing verbs were used as controls. In total, the experiment included 32 target forms and 12 controls. The order of presentation was pseudo-randomized and differed for all participants.⁷ There were 63 participants and 2954 responses obtained for further analysis. Among 2233 responses for target verbs, 1239 (53%) were semelfactives. For convenience of statistical analysis, in cases where several responses were provided per stimulus, only the first response was considered. The hypothesis was confirmed, and both semantics and morphology appear to be relevant factors for the choice of the semelfactive morpheme (t-tests, p < 0.001). The -aj class was easiest to form semelfactives from, while the -*ej class was the hardest. Furthermore, participants avoided using semelfactives and returned more other forms in problematic contexts, i.e. contexts where morphology and semantics as motivating factors were in conflict. The experiment further suggests that suffixation is the default pattern for language users.

⁷ The MS Excel function RAND was used for randomization of the experimental contexts.
The ordering was then manually checked and sequences of similar contexts were changed so that the participants would not develop a pattern based on the order of presentation of the stimuli.

The analysis of the results presented an additional challenge, since -nu- and s- are not mutually exclusive and forms like struchnut’ “act cowardly once” with both the prefix and the suffix are attested. This complicates statistical analysis. My solution was to study the behavior of semantic and morphological classes with respect to -nu- and s- separately, and then compare the different semantic and morphological classes. The behavior of each semantic and morphological class with respect to the choice between -nu- and s- was compared using ANOVA. ANOVA (analysis of variance) analyzes the differences among group means, the null hypothesis being that all groups under scrutiny are random samples of the same population. Low p-values suggest rejecting the null hypothesis. In the present study, the obtained low p-values indicate that different classes of verbs behave differently when it comes to the choice of the semelfactive marker. For a detailed analysis of the results of the experiment, see Makarova (2009).

To summarize, the general question pertaining to the nature of the distribution of rival forms in the semelfactive experiment is the same as in the attenuative experiment. Furthermore, along with attenuatives, semelfactives are an Aktionsart and arguably represent a similar phenomenon. However, due to the differences between the two Aktionsarten, and primarily the differences in the nature of affix variation in the two Aktionsarten, the experiment design varied in the type and the presentation of target stimuli, while the analysis varied in the use of statistical methods.

6 Two experiments. Methodological advantages and challenges

In general, the selected method proved to be effective.
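As a brief aside on the ANOVA logic described at the end of section 5, the comparison of group means can be made concrete with a minimal sketch of a one-way ANOVA F statistic. The three groups, their labels and all values are hypothetical (invented proportions of -nu- responses per participant for three unnamed verb classes), and the function name is an assumption of this sketch; the p-value, which would require the F distribution, is deliberately left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical proportions of -nu- responses (invented, not the study's data).
class_a = [0.8, 0.7, 0.9, 0.8]
class_b = [0.5, 0.6, 0.4, 0.5]
class_c = [0.2, 0.3, 0.1, 0.2]

f_stat, df_b, df_w = one_way_anova_f([class_a, class_b, class_c])
print(round(f_stat, 2), df_b, df_w)
```

A large F value means that the group means differ by much more than the spread within the groups would lead one to expect, which is the situation in which the null hypothesis of a single underlying population is rejected.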
First, the experiments were easy to run, easy to participate in, and the results were relatively easy to analyze. Both experiments provided additional support to the relevant hypotheses; moreover, the data obtained enabled us to see the complex distributions of morphemes in more detail. Second, the selected method allows the researcher to control for such factors as semantics (both experiments), morphological class of the verb (both experiments), its frequency (only the first experiment enabled control for frequency, since frequency cannot be established for nonce-verbs), context (both experiments, albeit to a lesser degree for contexts extracted from the RNC), and to estimate the effect of each of these factors separately as well as in interaction. Third, the results of the experiments indicate that corpus data can serve as a valid starting point for an analysis of linguistic variation, and that a large and balanced corpus, such as the Russian National Corpus, to a large extent mimics the use of the variants by native speakers.

Although the applied experimental methods are fairly straightforward, they present certain challenges. The most challenging part of experiment design is the selection of the stimuli. Cloze-test tasks return a virtually unlimited number of varying responses; hence, in order to provide analyzable output, the responses need to be primed in some way. While the use of grammatical categories can be easily triggered in many cases (e.g. English past tense can be triggered by the word “yesterday” in the context), Aktionsarten do not belong to the categories that are obligatory in particular contexts. Thus, for the experiments described above, it was crucial to select or compose contexts where the use of Aktionsart affixes was the natural and preferred option. The rich verbal morphology of Russian presents an additional challenge.
Russian verbs can be prefixed with up to sixteen different prefixes, and most prefixes that express Aktionsarten can be replaced by neutral aspectual prefixes or even omitted altogether without any risk of ungrammaticality. In practice, this means that the number of acceptable responses in many contexts is at least three (two affixes in question and a neutral perfectivizer).⁸ Such variation in plausible responses is challenging for many statistical models. Another challenge for statistical analysis is that in the case of semelfactive markers, the use of the suffix and the use of the prefix are not mutually exclusive. Even in the case of attenuative prefixes, the use of several prefixes on one verb is possible. For attenuatives, this problem can be partially solved by stating explicitly in the instructions that the participants should only use one prefix; for semelfactives, however, this issue remains problematic.

⁸ Perfective bases, although few in the sample, represent a slightly special case. For prefixed perfective bases like otkryt’ “open” a neutral perfectivizing prefix was not among the available options. However, for unprefixed perfective bases more than three prefixes were available, e.g. chvatit’ “grab” can be used with the prefixes vy-, s-, pri-, pod-, u-, za-, ot-, ob-, and pere-.

The important difference between the two experiments lies in the choice of stimuli. There is no doubt that the processing of attested and nonce-verbs is different, but do differences in the processing of attested vs. nonce-verbs by the native speakers influence the results of the experiment? Not necessarily. As demonstrated by Kuznetsova (2015), native speakers are very good at dealing with semantically problematic verbs even in non-experimental settings. Kuznetsova studied the so-called “pro-verbs”, i.e. verbs that usually are highly colloquial, cannot be characterized with separate meanings and have various meanings depending on constructions they
occur in. Consider, for instance, figačit’, which, depending on the construction it is used in, can mean “produce”, “move”, “occur”, “make sound” etc. When exposed to contexts with such verbs, native speakers easily identify the construction and deduce the meaning of the pro-verb in each given case (Kuznetsova, 2015, pp. 103-106). Given this general ability of native speakers to identify constructional semantics and given the fact that the contexts in which the targeted forms occurred in the experiments described above were grammatical and intelligible, we can argue that the differences in processing of attested and nonce-verbs were not crucial for the results of our experiments.

As demonstrated above, even though the phenomena under scrutiny seem similar, the experiments needed to be tailored for different Aktionsarten. The fact that the methodology cannot be applied across the board to all Aktionsarten even within a single language, let alone cross-linguistically, may either be regarded as a disadvantage of the method, or as a general warning for comparative studies where the same methodologies are applied to different types of data.

Conclusion

The two simple experiments described above addressed Russian Aktionsarten and had very specific and manageable research questions. The experiments have shown that variation in aspectual affixation in Russian is not random, but is motivated by identifiable factors, such as semantics, morphology and frequency. The two reported studies shed new light on the variation in morphological marking of Russian Aktionsarten and more generally on the status of Aktionsarten in the mental grammars of the native speakers of Russian. They further contribute to psycholinguistic studies of form-meaning relationships. Slavic languages have elaborate morphological systems that provide a great source of experimental data.
Furthermore, Slavic languages are relatively well studied and have large grammatically tagged electronic corpora, which facilitate putting forward hypotheses. The hypotheses and their predictions can then be tested experimentally. Adopted by researchers of Slavic and adapted to Slavic data, existing psycholinguistic methods open the way to important generalizations relevant for Slavic and beyond. As demonstrated in the present article, even within one and the same language system, even when dealing with closely related grammatical phenomena, and even when using simple methodologies, we need tailored experiments in order to obtain meaningful results. Therefore, it is crucial that we do not blindly apply the existing methodology to a new research question or a new set of data from the same or a different language, but rather study the phenomena in question thoroughly before we proceed to designing the experiments.

References

Baayen, R. H., Endresen, A., Janda, L. A., Makarova, A., & Nesset, T. (2013). Making choices in Russian: pros and cons of statistical methods for rival forms. Russian Linguistics, 37(3), 253-291.
Coulson, S., Urbach, Th. P., & Kutas, M. (2006). Looking back: Joke comprehension and the space structuring model. Humor: International Journal of Humor Research, 19(3), 229-250.
Dąbrowska, E. (2004). Language, Mind and Brain: Some Psychological and Neurological Constraints on Theories of Grammar. Edinburgh: Edinburgh University Press, Georgetown: Georgetown University Press.
Dickey, S. M., & Janda, L. A. (2009). Chochotnul, schitril: the relationship between semelfactives formed with -nu- and s- in Russian. Russian Linguistics, 33, 229-248.
Dobrušina, E. R., & Paillard, D. (2001). Pristavka VY-, ili model’nyj rezultat. In E. R. Dobrušina, E. A. Mellina & D. Paillard (Eds.), Russkie pristavki: mnogoznačnost’ i semantičeskoe edinstvo (pp. 65-70). Moskva: Russkie slovari.
Endresen, A., Janda, L. A., Kuznetsova, Ju., Lyashevskaya, O., Makarova, A., Nesset, T., & Sokolova, S. (2012). Russian ‘purely aspectual’ prefixes: Not so ‘empty’ after all? Scando-Slavica, 58(2), 229-290.
Isačenko, A. V. (1960). Grammatičeskij stroj russkogo jazyka v sopostavlenii s slovackim. Morfologija (Čast’ vtoraja). Bratislava: Izdat. Slovackoj Akad. Nauk.
Isačenko, A. V. (1982). Die Russische Sprache der Gegenwart. Formenlehre. München: Max Hueber Verlag.
Jakunina, D. V. (2001). Pristavka pri-: postroenie semantičeskoj seti. Moskovskij Lingvističeskij Žurnal, 5(1), 125-160.
Janda, L. A. (2007). Aspectual clusters of Russian verbs. Studies in Language, 31(3), 607-648.
Janda, L. A., Endresen, A., Kuznetsova, Ju., Lyashevskaya, O., Makarova, A., Nesset, T., & Sokolova, S. (2013). Why Russian aspectual prefixes aren’t empty: prefixes as verb classifiers. Bloomington, IN: Slavica Publishers.
Kagan, O. (2012). Degree Semantics for Russian Verbal Prefixes: The Case of pod- and do-. Oslo Studies in Language, 4, 207-243.
Kagan, O. (2013). Scalarity in the Domain of Verbal Prefixes. Natural Language and Linguistic Theory, 31, 483-516.
Kuznetsova, J. (2015). Linguistic profiles. Going from Form to Meaning via Statistics. Berlin, Boston: De Gruyter Mouton.
Lyashevskaya, O. N., & Sharoff, S. A. (2009). Častotnyj slovar’ sovremennogo russkogo jazyka. Moskva: Azbukovnik.
Makarova, A. (2009). Psicholingvističeskie dannye ob allomorfii v russkich semelfaktivach (Psycholinguistic Evidence for Allomorphy in Russian Semelfactives). Unpublished MA thesis, available at: http://munin.uit.no/handle/10037/2377
Makarova, A. (2014). Rethinking diminutives: a case study of Russian verbs. PhD dissertation. University of Tromsø.
Nesset, T. (2013). The History of the Russian Semelfactive: The Development of a Radial Category. Journal of Slavic Linguistics, 21(1), 123-169.
Plungian, V. A. (2001). Pristavka pod- v russkom jazyke: k opisaniju semantičeskoj seti. Moskovskij lingvističeskij žurnal, 5(1), 95-124.
R Development Core Team (2011). R: a language and environment for statistical computing. R foundation for statistical computing. Vienna. Retrieved from http://www.R-project.org
Rasinger, S. M. (2008). Quantitative research in linguistics. London: Continuum International Publishing Group.
Sokolova, S. (2012). Asymmetries in Linguistic Construal. Russian Prefixes and the Locative Alternation. PhD dissertation. University of Tromsø.
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14(4), 323-348.
Švedova, N. Ju. et al. (1980). Russkaja grammatika (Vol. 1). Moskva: Nauka.
Townsend, Ch. (1975). Russian Word Formation. Columbus, OH: Slavica Publishers.
Viimaranta, J. (2012a). The metaphors and metonymies of domination: explaining the different meanings of the Russian prefix pod-. Russian Linguistics, 36(2), 157-174.
Viimaranta, J. (2012b). Analogy or Conceptual Metaphor? Coming Concretely and Abstractly Close in Uses of the Russian prefix pod-. SKY Journal of Linguistics, 25, 205-232. Available at: http://www.linguistics.fi/julkaisut/SKY2012/Viimaranta.pdf
Wieling, M., Nerbonne, J., & Baayen, R. H. (2011). Quantitative Social Dialectology: Exploring Linguistic Variation Geographically and Socially. PLoS ONE 6(9): e23613. doi: 10.1371/journal.pone.0023613
Zaliznjak, A. A., & Šmelev, A. D. (2000). Vvedenie v russkuju aspektologiju. Moskva: Jazyki russkoj kul’tury.

Reaction time methodology in psycholinguistic research: An overview of studies on Czech morphology

Denisa Bordag

Abstract: The present article offers an overview of studies exploring the processing of various morphological phenomena in a highly inflective West Slavic language, namely Czech.¹
Its focus is twofold: On the one hand it aims to present several experimental paradigms based on reaction time measurements, on the other it intends to demonstrate how specific properties of a particular language can be employed to advance our knowledge about mental structures and processing in general.

1 Introduction

Reaction time measurements have been applied in a variety of behavioural tasks both in comprehension and production research. Measuring participants’ response latencies to language stimuli is relatively simple and typically does not require expensive equipment. A favourite, simple procedure, frequently employed in comprehension research, is lexical decision (or judgement), introduced by Meyer and Schvaneveldt in the early 1970s. In this task, participants are asked to press a yes- or no-button depending on whether they encountered an actual word in a given language, or a nonword. The stimuli are presented either visually or auditorily. The task provides evidence about how readily different types of stimuli are recognized and thus offers insights about the organisation of the mental lexicon or complexity of processing.

¹ Parts of this text come from previous publications of the author that are reported in this article, in particular from Bordag and Pechmann (2008), Bordag and Pechmann (2009) and Julínková and Bordag (2015).

2 Employing lexical decision task to explore the representation of the Czech prefixed verbs

Julínková and Bordag (in press) conducted a simple lexical decision experiment exploring the influence of semantic transparency on the processing of morphologically complex words in Czech. In the critical condition they compared response latencies to transparent prefixed verbs, whose meaning can be derived from their constituents (e.g.
kreslit “draw” - vykreslit “colour in”) and opaque prefixed verbs, whose overall meaning is not composed from the meanings of their constituents (e.g. kreslit “draw” - zkreslit “distort”). Despite the fact that the absolute reaction times to the opaque (644.9 ms) and transparent (638.1 ms) prefixed verbs were statistically the same, the experiment did deliver evidence for slower processing of the opaque prefixed verbs, since their frequency was almost five times higher than the frequency of the transparent verbs and thus considerably faster reaction times were expected. The results are thus in line with e.g. the dual-route activation model proposed by Schreuder and Baayen (1995). The model includes a mechanism of activation feedback between the lemma nodes (linked with a syntactic and semantic layer) and the access representations of constituents that are fully present in the stimulus. This mechanism allows cumulative frequency effects for transparent complex words, but not for opaque ones. Gradually, the activation feedback tunes the system towards an advantage for the decomposition route (as opposed to the direct route, which maps a full-form access representation onto the corresponding lemma node), which results in a processing benefit for the transparent, but not the opaque words. Although the experiment by Julínková and Bordag (in press) provided evidence that is in accordance with previous research and is valuable, especially for demonstrating semantic transparency effects in the processing of prefixed verb forms (the previous research in this area focussed on compounds and noun suffixes), it also demonstrates problems associated with simple decision tasks. Frequency and word length effects crucially determine the response latencies to language stimuli, and comparing reaction times of two groups of different stimuli thus makes it desirable that their elements be matched with respect to these two factors.
However, such restrictions on the experimental material can render the experiment impossible. Compromises that do not represent optimal solutions are thus unavoidable, like comparing the response latencies relative to the frequency of the two groups of stimuli as in Julínková and Bordag (in press).

3 Employing morphological repetition priming to test the split morphology hypothesis

A more suitable solution (which is however not always implementable) is represented by procedures in which participants respond to the same stimuli in all critical conditions. Such methods can be, for example, lexical decision combined with priming. The stimulus on which participants make the lexical decision (target) is preceded by another word (prime) that can be in various ways related (or unrelated) to the target and affect the speed of its recognition. In another experiment of the same study, Julínková and Bordag (in press) employed morphological repetition priming with lexical decision to investigate whether Czech derived and inflected word forms are stored in the mental lexicon in the same or in a different manner and thus to test the Split Morphology Hypothesis (Anderson, 1977, 1982, 1988; Perlmutter, 1988; Scalise, 1984, 1988) in Czech for the first time. The targets in their experiment were always either verbs in 1st pers. sg. (e.g. plavu “I swim”) or nouns in nom. sg. (e.g. hmat “a touch”). The prime was either identical, or inflectionally (plaveš, 2nd pers. sg.; hmatem, instr. sg.), or derivationally (plavec “a swimmer”; hmatáš “you touch”) related to its corresponding target. Thus, participants performed lexical decision always on the same target that was preceded either by an identical, an inflectional, or a derivational prime. The results revealed a significantly smaller priming effect in the derivational condition compared to the inflectional and identical conditions (which were statistically the same) for both verbs and nouns.
The pattern of results provides evidence for distinct representations of inflection and derivation and thus supports the relevance of the Split Morphology Hypothesis also in the highly inflective Czech language (see also Feldman, 1994 for Serbian). The results further indicate that during the processing of inflectionally related forms, the same lexical entry is accessed. On the other hand, the processing of a base and derived form either involves more complex processing (e.g. through activating more lexical nodes) or two different lexical entries. To our knowledge, the two experiments described above are almost the only studies exploring Czech morphology from the perspective of language comprehension and employing psycholinguistic paradigms. With the exception of Lukavský & Smolík (2009) and Smolík (2010) they are most likely the only experimental studies on Czech comprehension in general.

4 Employing the picture-word interference-paradigm to explore the representation of grammatical gender

The picture-word interference (or picture-word distractor) paradigm can be seen as a production variant of the priming paradigm with a lexical decision. Participants name pictures in the presence of word distractors that can be in various types of relationships to the picture names and can affect their naming latencies. In addition, stimulus onset asynchrony (SOA), i.e. the time interval between the presentations of the picture and the distractor, can be manipulated as well. The paradigm has a long tradition in experimental psychology (cf. the Stroop task) and has become popular among psycholinguists in the last 20 years.
The method crucially extended the empirical basis traditionally available in support of the notion that the production of a word occurs in two fairly distinct stages: a) the retrieval of the word’s semantic and syntactic information and b) the retrieval of the corresponding lexical-phonological information (Bock, 1986; Caramazza, 1997; Dell, 1986; Levelt, 1989). The first evidence of this sort was provided in a seminal study by Schriefers, Meyer, & Levelt (1990) conducted in Dutch. The authors were the first ones to describe the so-called semantic interference effect, i.e. longer picture naming latencies (e.g. lion) when the picture and the distractor were semantically related (e.g. picture lion - distractor tiger) than when they were unrelated (e.g. picture lion - distractor table), and the so-called phonological facilitation effect, i.e. shorter naming latencies when the target and the distractor were phonologically related (e.g. dog/fog). Crucially, the semantic interference effect was observed when the distractor was presented shortly before or at the same time as the picture (SOA = -150/0 ms), but disappeared when it was presented after the picture (SOA = +150 ms). On the other hand, the phonological facilitation effect appeared when the distractor was presented after the target. These results demonstrate that the paradigm can also provide useful means for investigating the time course of activation of various types of lexical information. Shortly after, the picture-word interference paradigm started to be applied to explore the representation and processing of grammatical features. This line of research was initiated by experiments addressing the processing of grammatical gender in Germanic and later Romance languages. Schriefers (1993) observed slower reaction times when the picture target and the distractor word had different genders than when their genders were congruent (only in production of gender marked nominal phrases).
In his interpretation, the results reflect a competition for selection in the gender-incongruent condition: The activated gender feature of the distractor interferes with the gender feature of the to-be-selected name of the picture. In the gender-congruent condition this is not the case, because the distractor's gender is identical with that of the target noun. This interpretation, locating the effect at the level of grammatical encoding, was later challenged by Schiller and Caramazza (2002) who located the effect at the level of phonological encoding, claiming that the phonological forms of the determiners and not the abstract gender features compete for selection. They arrived at this conclusion based on a picture-word interference experiment in which singular and plural NPs in German were produced: The interference effect appeared only in singular, where the determiners are overtly marked for gender (der, die, das), but not in plural, where the definite article form is invariant for all three genders (die). Miozzo and Caramazza (1999) later specified that the effect is present only in the so-called early selection languages, in which the determiner form depends solely on the grammatical gender of the head noun (as is the case in German and Dutch). On the other hand, the effect was not observed in the so-called late selection languages, in which the determiner form depends also on the phonological context (e.g. a giraffe vs. an elephant, ma table vs. mon ampoule in French) and where the determiner selection can thus occur rather late in the NP production process (Romance languages). The third controversial issue in the domain of grammatical gender processing concerns the scope of the gender congruency effect. Does the competition for selection apply only to freestanding morphemes like determiners, or is the retrieval of gender-marked inflections governed by the same principles?
Schriefers (1993) obtained a congruency effect when subjects produced NPs in the form of a definite article + adjective + noun as well as in the form adjective + noun. This implies that either the gender features compete for selection or there is a competition for selection of the bound morphemes associated with the gender inflection of the adjective. However, Schiller & Caramazza (2003) failed to replicate these results in both German and Dutch. Costa, Kovacic, Fedorenko & Caramazza (2003) could not observe the effect in adjective + noun phrases in Croatian either, though they found the effect with personal pronouns in accusative case, that is with free morphemes. In a series of experiments, Bordag and colleagues addressed the question of gender processing in Czech, taking advantage of specific properties of the Czech morphological system in order to contribute to the resolution of the controversial issues. In the first experiment Bordag and Pechmann (2008) attempted to replicate the gender/determiner congruency effect in Czech. As mentioned earlier, the gender congruency effect was observed only in some languages and it is not certain whether all constraints determining the emergence of the congruency effect have already been discovered. According to the late/early selection distinction between languages, Czech should belong to the latter because the choice of a modifier form is not affected by its phonological context. Since there are no articles in Czech, an overtly gender marked demonstrative pronoun ten (m.), ta (f.), to (n.) “this” was chosen to be produced in the naming task together with the target noun, e.g. ten ananas, ta kytara, to auto. The SOA was 0. The analyses clearly showed a robust congruency effect in Czech, when pictures were named with a gender marked NP consisting of a demonstrative pronoun and a noun.
They also provide further evidence that there is a competition for selection either between the gender features or between the demonstrative pronoun forms in early selection languages. In the next two experiments, Bordag and Pechmann (2008) investigated the scope and the locus of the gender congruency effect in more detail focusing on two questions: Is the congruency effect in Czech only present when freestanding gender marked morphemes are produced, or does it also appear in the production of NPs, in which the gender is marked on an inflectional suffix? Does the inflection have to be formally marked for gender, or will the congruency effect also be observable with a formally invariant inflection? To answer these questions the authors took advantage of the fact that there are two types of adjectival declension in Czech, the so-called hard and soft declension. Both these classes do inflect for gender, but in some cases they take an invariant inflection for all three genders. E.g. in nominative singular, they differ in such a way that in the hard declension there are three different endings for the three genders (-ý, -á, -é for masculines, feminines and neuters respectively), whereas there is only one invariant ending -í for all three genders in the soft declension. On the other hand, in e.g. genitive singular both hard and soft adjectives take gender marked endings, which is not the case for e.g. genitive plural, where both hard and soft adjectives have one and the same gender invariant ending. Not only adjectives, but also some ordinal numerals belong to the adjectival declension. The ordinals druh-ý, -á, -é “second”, čtvrt-ý, -á, -é “fourth”, pát-ý, -á, -é “fifth” are declined according to the hard declension, the ordinals prvn-í “first” and třet-í “third” according to the soft declension. 
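The nominative singular pattern just described can be written down as a small lookup table. The following Python sketch is only an illustration: the function names `inflect_nom_sg` and `is_gender_marked` and the gender labels `m`, `f`, `n` are assumptions introduced here, and the table covers only the five ordinals and the endings mentioned in the text, not the full Czech paradigm.

```python
# Nominative singular endings of the two adjectival declensions,
# as listed in the text above.
NOM_SG_ENDINGS = {
    "hard": {"m": "ý", "f": "á", "n": "é"},  # three gender-marked endings
    "soft": {"m": "í", "f": "í", "n": "í"},  # one invariant ending
}

# Ordinal stems named in the text, with their declension type.
ORDINAL_STEMS = {
    "druh": "hard",   # "second"
    "čtvrt": "hard",  # "fourth"
    "pát": "hard",    # "fifth"
    "prvn": "soft",   # "first"
    "třet": "soft",   # "third"
}


def inflect_nom_sg(stem: str, gender: str) -> str:
    """Return the nominative singular form of an ordinal for the given gender."""
    declension = ORDINAL_STEMS[stem]
    return stem + NOM_SG_ENDINGS[declension][gender]


def is_gender_marked(stem: str) -> bool:
    """True if the nominative singular ending differs across the genders."""
    endings = NOM_SG_ENDINGS[ORDINAL_STEMS[stem]]
    return len(set(endings.values())) > 1
```

With this table, `inflect_nom_sg("druh", "f")` gives druhá, whereas `inflect_nom_sg("třet", g)` gives třetí for every gender, which is the contrast the experiments exploit.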
Color adjectives, which have been repeatedly used in other experiments on similar topics, all follow the hard declension (or are indeclinable) in Czech and were therefore unsuitable for the planned comparisons. In the experiments participants therefore named pictures with an NP comprising either an ordinal from the hard adjectival declension (druhý “second” or pátý “fifth” + a noun) or an ordinal from the soft adjectival declension (první “first” or třetí “third” + a noun). The appropriate ordinals were elicited with the help of frames consisting of one, two, three, or five lines that were presented around the picture and the distractor. In order to prevent confusion caused by counting the frame lines, the presentation of the stimuli was blocked and only two ordinals (and thus only two types of frames) were used in any one block, so that participants could name the less complex frame with the lower ordinal and the more complex frame with the higher ordinal. In the first experiment with the ordinals, the ordinals from one declension (hard vs. soft) were always in one block and the ordinals from the other declension in the other. In the second experiment, there was always one ordinal from the soft and one from the hard declension in each block, to test whether the way the conditions were blocked affected processing. The results in both experiments were the same: 1) The naming latencies were shorter in the soft than in the hard condition, and 2) they were shorter in the gender-congruent than in the gender-incongruent condition, but only in the hard condition, where the inflections were overtly gender-marked. In the soft condition, with one invariant ending for all three genders, the reaction times were statistically the same in the gender-congruent and the gender-incongruent condition.
The presence of the gender congruency effect in the experiments reported by Bordag and Pechmann provides evidence that the competition (either at the level of grammatical or phonological encoding) can also take place when the gender feature is manifested on inflectional endings, i.e. bound morphemes, and that the scope of the effect is thus not limited to free morphemes. The results are furthermore in line with the data of Schiller & Caramazza (2002) and Costa et al. (2003) and speak in favour of the hypothesis that it is the phonological forms of the inflections (for Schiller and Caramazza, of the determiners) that compete for selection, not the abstract gender features of the head nouns as argued by Schriefers (1993). In addition, if the locus of the congruency effect is at the level of phonological encoding, as the data suggest, then cascaded processing must be involved. Discrete serial processing, as proposed by Levelt (1989) and Levelt, Roelofs & Meyer (1999), does not allow phonological forms other than the selected one to be activated. Clearly, if the two gender-marked forms (inflections or determiners of the picture and of the distractor) can compete for selection, they must both be activated at the same time.

However, Bordag and Pechmann (2008) are cautious about interpreting the results as unambiguous evidence against competition at the level of grammatical encoding. They propose two alternative hypotheses to explain why the congruency effect only emerges when overtly gender-marked forms are produced, while at the same time maintaining the idea of gender feature selection. According to one hypothesis, the gender feature in the soft condition is not selected at all, because it has no consequences for the selection of the appropriate inflection in the nominative singular.
This assumption is supported by the observation that naming with soft ordinals was faster than naming with hard ordinals in both the gender-incongruent and the gender-congruent condition. In the soft condition, gender does not affect further encoding; thus it is bypassed, which results in short naming latencies. In the hard condition, gender selection is obligatory because it is relevant for the selection of the proper adjectival inflection. When the gender of the distractor and that of the target noun coincide (congruent condition), the selection proceeds quickly, though still more slowly than in the soft condition. The hard incongruent condition yields the longest naming latencies, indicating that competition for selection between the gender feature of the distractor and that of the target noun takes place. Bordag and Pechmann (2008) then consider the so-called Hierarchical Feature Selection Hypothesis, which explains how the production system “knows” when the selection of the gender feature can be bypassed because it has no consequences for further encoding. This hypothesis is compatible with discrete serial models of processing like that of Levelt (1989), but has been criticized by e.g. Schiller and Caramazza (2003) as ad hoc and rather unlikely. It will be discussed in more detail later.

Another hypothesis that could explain the present results, while maintaining the claim that they reflect competition for selection between abstract gender features, assumes cascaded processing and feedback between the levels of phonological and grammatical encoding: The activated (but not yet selected) gender feature(s) send(s) activation to the corresponding phonological form(s). If the activation converges on just one inflection, information is sent back through feedback to the level of grammatical encoding, signaling that the selection of e.g. grammatical gender can be bypassed, because all activation coming from that source converges on just one phonological form anyway.
Such a mechanism would be compatible e.g. with the Interactive Activation Model of Dell (1986). Obviously, more empirical evidence is necessary to decide whether the observed congruency effect is located at the level of grammatical feature selection or at the level where phonological forms are selected. Another study by Bordag and Pechmann (2009) sheds more light on this issue.

5 Employing the picture-word interference paradigm to explore the representation of declensional and conjugational class

Bordag and Pechmann (2009) employed the picture-word interference paradigm to explore two other lexically specified grammatical features, namely the declensional class of nouns and the conjugational class of verbs. The processing of these two features had not been an object of psycholinguistic research previously, and the existing models of speech production did not make any explicit assumptions concerning their encoding either. One of the reasons for this neglect is the scope of psycholinguistic interests, which have mostly focused on Germanic and Romance languages, where these two features either do not exist at all or may pose difficulties as a subject of psycholinguistic research in speech production (e.g. Spanish conjugation classes). On the other hand, in highly inflected Slavic languages both declension and conjugation play an important role in language processing.

Czech nouns are declined according to grammatical case (nominative, genitive, dative, accusative, locative, instrumental, and vocative), number (singular and plural), and the declensional class (DC) to which they belong. Each noun, moreover, belongs to one of the following genders: masculine animate (MA), masculine inanimate (MI), feminine (F), or neuter (N); with the latter two, the opposition of animacy is not grammatically marked.
Unlike DC, the grammatical gender of nouns has an important syntactic function, because most adjectives, pronouns, numerals, and some verb forms must agree in gender (and number and case) with the head noun. The number of declensional classes varies depending on the criteria applied by individual grammarians (some list deviating forms as exceptions, others define a new declensional class), but usually ca. 14 classes are listed. Nouns from one DC take the same set of inflectional endings.

The Czech verbal system is governed by the following features: person (first, second, third), number (singular, plural), tense (present, past, future), mood (indicative, conditional, imperative), voice (active, passive), aspect (perfective, imperfective), and conjugational class (and paradigm). There are five main conjugational classes. Verbs from each class take the same set of final inflectional suffixes (endings) and have other common features, which are not relevant for the present study. Within each conjugational class (CC), one or more paradigms are usually distinguished, which differ e.g. in the stem suffix (a special suffix that is added between the root and the ending) in some of the verb forms, morphophonological alternations, etc. (see Komárek & Petr, 1986, p. 415-416).

Condition   | Prep. | Masculine: Picture | Masculine: Distractor | Feminine: Picture | Feminine: Distractor | Neuter: Picture | Neuter: Distractor
Congruent   | Kvůli | hrad-u             | nos (nos-u)           | žen-ě             | hlav-a (hlav-ě)      | měst-u          | oko (ok-u)
Incongruent | Kvůli | hrad-u             | stroj (stroj-i)       | žen-ě             | růž-e (růž-i)        | měst-u          | moře (moř-i)

* Prep. = preposition; kvůli “because of”, “due to”; hrad “castle”; nos “nose”; žena “woman”; hlava “head”; město “town”; oko “eye”; stroj “machine”; růže “rose”; moře “sea”

Table 1: Overview of the DCs and critical conditions used in experiment 1 (the column “Picture” gives the produced dative form, the column “Distractor” the presented nominative form with the dative form in brackets)

Condition   | I: Picture | I: Distractor   | IV: Picture | IV: Distractor  | V: Picture | V: Distractor
Congruent   | nes-e      | plést (plet-e)  | pros-í      | žehlit (žehl-í) | zpív-á     | foukat (fouk-á)
Incongruent | nes-e      | žehlit (žehl-í) | pros-í      | foukat (fouk-á) | zpív-á     | nést (nes-e)

* nést “to carry”, plést “to knit”, prosit “to ask for”, žehlit “to iron”, zpívat “to sing”, foukat “to blow”

Table 2: Overview of the CCs (I, IV, and V) and critical conditions used in experiment 2 (the column “Picture” gives the produced 3.p.sg. form, the column “Distractor” the presented infinitive form with the 3.p.sg. form in brackets)

Bordag and Pechmann (2009) point out both parallels and differences between grammatical gender and DC/CC. On the one hand, they all divide nouns and verbs into abstract classes, membership in which has implications for the production of particular word forms. Understanding the mechanisms of encoding of one such feature may help to clarify the processing of other features with a similar function. On the other hand, grammatical gender and DC/CC differ in some important respects, which may play a crucial role in speech production. Whereas the grammatical gender of nouns affects syntactic processing (agreement), DC/CC has implications only for the morphological encoding of the nouns/verbs themselves. Parallel research on different grammatical features may thus help not only to specify the common processes involved in their encoding, but also to reveal which differences, e.g. in function, are relevant for the organization and functioning of the production system and which have no consequences for it.
Bordag and Pechmann (2009) designed their experiments exploring DC and CC along the same rationale as the previous gender experiments (cf. Table 1). In the two critical conditions of the DC experiment, participants named pictures in the dative singular in the presence of either DC-congruent or DC-incongruent distractors that were presented in the nominative singular. To avoid confounding DC with grammatical gender, the picture name and the distractor word always had the same gender value in both the congruent and the incongruent condition. In the critical conditions of the CC experiment, participants named pictures of actions with verbs in the third person singular of the present tense in the presence of either CC-congruent or CC-incongruent distractors in the infinitive (cf. Table 2). The selection criteria (depictable action, etc.) did not allow the selection of verbs of one CC from one paradigm only. In the congruent condition, the target verb and the distractor were always from the same CC and the same paradigm. In the incongruent condition, the target verb and the distractor were from different CCs (and consequently also paradigms).

The results of the DC and the CC experiments mirrored each other: Naming latencies were significantly longer in the incongruent than in the congruent condition in both experiments. These results thus converged with the evidence from similarly designed experiments exploring the processing of grammatical gender. This suggests that there are parallels between the processing of DC, CC, and grammatical gender in speech production. Bordag and Pechmann (2009) postulated that DC and CC are both represented at the grammatical level of encoding in the form of generic DC and CC nodes. All nouns of a given DC and all verbs of a given CC are linked to one DC or CC node, which mediates the connection to the appropriate ending (or word form) at the level of phonological encoding.
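In picture-word interference studies of this kind, a congruency effect is commonly expressed as the difference between the mean naming latency in the incongruent condition and that in the congruent condition. The following Python sketch shows that computation only. The trial format (pairs of a condition label and a latency in milliseconds) and the function name `congruency_effect` are assumptions made for illustration; they are not taken from Bordag and Pechmann (2009), and the inferential statistics behind “significantly longer” are not reproduced here.

```python
from statistics import mean


def congruency_effect(trials):
    """Mean latency (ms) in the incongruent minus the congruent condition.

    `trials` is an iterable of (condition, latency_ms) pairs, where condition
    is "congruent" or "incongruent" (a hypothetical format for this sketch).
    A positive value means slower naming with incongruent distractors.
    """
    by_condition = {"congruent": [], "incongruent": []}
    for condition, latency_ms in trials:
        by_condition[condition].append(latency_ms)
    if not by_condition["congruent"] or not by_condition["incongruent"]:
        raise ValueError("need at least one trial per condition")
    return mean(by_condition["incongruent"]) - mean(by_condition["congruent"])
```

The same function could be applied separately to the DC and the CC data, or to any subset of trials (e.g. per declensional class), since it makes no assumption about which feature defines congruency.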
In their last experiment, Bordag and Pechmann (2009) addressed the question of the locus of the congruency effect. The experiment paralleled the one with ordinal numerals from the soft and hard declension in the series about grammatical gender. Since the organization of the Czech conjugation system does not allow the design of an equivalent experiment with CCs, Bordag and Pechmann (2009) focussed on DC. Participants again named pictures with nouns from six DCs in the congruent and the incongruent condition. In addition, the target nouns were named either in the genitive or in the instrumental singular. The DCs and the cases were chosen in such a way that in the incongruent condition, the inflectional endings of the target noun and the distractor were formally different in the genitive singular (as was the case with the dative singular in the previous DC experiment) and formally identical (homonymous) in the instrumental singular. The responses in the particular cases were elicited with the help of the sentential fragments sedím vedle “I am sitting next to”, requiring the genitive case, and sedím před “I am sitting in front of”, requiring the instrumental case. Participants produced the sentential fragment while the fixation point was displayed, prior to the presentation of the picture and the distractor (in the nominative). The genitive and the instrumental conditions were blocked; however, several endings (including those of fillers) were produced within one and the same block. If the competition for selection took place between the abstract DC features, the DC congruency effect should be observed in both the genitive and the instrumental conditions, irrespective of whether the different DCs are manifested overtly through different endings or not.
On the other hand, if the competition took place between the endings at the phonological level, the congruency effect would be expected only in the genitive singular, where the endings are formally different in the incongruent condition, and not in the instrumental, where they are homonymous (though the target noun and the distractor come from different DCs as well). The results revealed DC congruency effects in both the genitive and the instrumental conditions, and they were stable across all DCs. This finding crucially differed from the results of the gender experiments, in which the gender congruency effect was observed only when an overtly gender-marked NP was produced.

In their attempt to reconcile the results of the gender and the DC experiments, Bordag and Pechmann (2009) address the distinction between externally and internally driven feature selection (cf. intrinsic vs. extrinsic features in Caramazza, 1997, and Schiller & Caramazza, 2002) and the notion of dispensability of a feature selection. The approach re-addresses the Hierarchical Feature Selection Hypothesis mentioned above and the question of how the production system could know which features should be selected first and which can be selected later or bypassed completely.

Congruent | Masculine: Picture | Masculine: Distractor | Feminine: Picture | Feminine: Distractor | Neuter: Picture | Neuter: Distractor
Gen. Sg.  | hrad-u             | nos (nos-u)           | čelist-i          | kost (kost-i)        | měst-a          | oko (ok-a)
Inst. Sg. | hrad-em            | nos (nos-em)          | čelist-í          | kost (kost-í)        | měst-em         | oko (ok-em)

Incongruent | Masculine: Picture | Masculine: Distractor | Feminine: Picture | Feminine: Distractor | Neuter: Picture | Neuter: Distractor
Gen. Sg.    | hrad-u             | stroj (stroj-e)       | růž-e             | kost (kost-i)        | měst-a          | moře (moř-e)
Inst. Sg.   | hrad-em            | stroj (stroj-em)      | růž-í             | kost (kost-í)        | měst-em         | moře (moř-em)

* hrad “castle”; nos “nose”; stroj “machine”; čelist “jaw”; kost “bone”; růže “rose”; město “town”; oko “eye”; moře “sea”

Table 3: Overview of the material used in the congruent and incongruent condition in experiment 3 (the column “Picture” gives the produced genitive/instrumental form, the column “Distractor” the presented nominative form with the genitive/instrumental form in brackets)

An internal feature value only becomes available when the specific lemma is activated. It is a strictly lemma-specific piece of information (e.g. DCs or grammatical gender of Czech nouns). The external feature value is not lemma-specific; it can already become available before lemma activation and is typically variable (e.g. grammatical case, number). Consequently, externally specified features can become available before the internally specified features, opening up the possibility that the selection of the internal features can be bypassed if the information about the external parameters is sufficient for the unique specification of the inflected form. Both grammatical gender and DC are internal features of noun lemmas (and CC of verb lemmas). Contrary to DC, grammatical gender is, however, not indispensable for the morphological specification of the noun form in Czech (and other languages). It has primarily syntactic implications for the agreeing elements (adjectives, pronouns, some numerals, and verb forms in Czech), which inherit the gender value from their controlling head noun.
Bordag and Pechmann (2009) review the previous results and their own on grammatical gender and DC production and conclude that the Hierarchical Feature Selection mechanism, based on the internal/external feature distinction and the notion of indispensability, represents an approach that can explain the variety of the results without dispensing with the competition between the abstract grammatical features at the lemma level. If the external feature specification is sufficient for the selection of an appropriate unit (an ending or a free morpheme) at the phonological level, the selection of internal feature(s) can be bypassed (e.g. grammatical gender in the nominative plural in German). On the other hand, a lexically internal feature that is indispensable for further encoding can never be bypassed (e.g. DC of nouns in Czech). The mechanism can not only explain the absence of the gender congruency effect reported by Schiller and Caramazza (2002) and Bordag and Pechmann (2008) in the “gender-invariant condition” (nominative plural in German, nominative of the soft adjectival declension in Czech), but can also account for the difference between the gender and the DC experiments. Contrary to the grammatical gender of modifiers (and also nouns), the information about the DC of nouns is indispensable for the selection of the appropriate noun inflectional ending: The noun ending in Czech cannot be selected on the basis of the external parameters (case and number) only; the information about its DC is necessary as well. Consequently, whereas gender selection can be bypassed in some cases (when not needed for further encoding), the DC value must be set in some way whenever an inflected noun form is produced.
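The bypass condition at the heart of this proposal can be stated procedurally: in a given external context (case and number), an internal feature is dispensable if every value of that feature leads to one and the same phonological form. The Python sketch below is one way to express that check; it is an illustration under stated assumptions, not the authors' model. The function name `is_dispensable` and the dictionary layout are introduced here, and the endings are limited to those cited in this chapter (the nominative singular adjectival endings and the instrumental singular noun endings listed in Table 3).

```python
def is_dispensable(forms_by_internal_value):
    """True if all values of the internal feature map to the same form,
    so that the external specification alone picks out the form."""
    return len(set(forms_by_internal_value.values())) == 1


# Nominative singular adjectival endings, keyed by gender (internal feature).
hard_adjective = {"m": "-ý", "f": "-á", "n": "-é"}
soft_adjective = {"m": "-í", "f": "-í", "n": "-í"}

# Instrumental singular noun endings, keyed by an example noun for each
# declensional class used in the third experiment (forms from Table 3).
instrumental_sg = {
    "hrad": "-em", "stroj": "-em",  # masculine classes
    "kost": "-í", "růže": "-í",     # feminine classes
    "město": "-em", "moře": "-em",  # neuter classes
}

print(is_dispensable(soft_adjective))   # True: gender can be bypassed
print(is_dispensable(hard_adjective))   # False: gender must be selected
print(is_dispensable(instrumental_sg))  # False: case and number do not suffice
```

On this reading, the instrumental singular pairs hrad-em/stroj-em are homonymous, yet the ending is not fixed by case and number across the classes in the set, which is in line with the claim that information about the noun's class cannot be skipped. The sketch deliberately ignores the time course of activation and says nothing about how competition itself unfolds.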
Such an approach can explain why no gender congruency effect was observed if there were no gender-marked elements, and why the DC congruency effect was observed in an analogously designed experiment in Bordag and Pechmann (2009): The crucial point is not whether there are any phonological forms that can compete for selection, but whether the given feature is dispensable or indispensable for further encoding.

The merits of Bordag and Pechmann's studies on grammatical gender and DC/CC production in Czech are threefold. First, they expanded the experimental testing domain by exploring a language for which there were no L1 psycholinguistic data prior to their studies. As previous research in many psycholinguistic subfields has shown, results from one language cannot be readily generalized to other languages, and verifying existing hypotheses on varied linguistic material is a sine qua non for formulating reliable theories about language processing. Second, in their research on DC and CC, Bordag and Pechmann provided the first evidence for the psycholinguistic reality of these features and offered first proposals concerning their modelling. While exploring DCs and CCs may have limited direct value for the languages most frequently explored in psycholinguistics (especially Germanic and Romance), the two features represent an important organisational principle of the nominal and verbal systems in all Slavic languages. Conversely, phenomena frequently explored in e.g. German and English, like the distinction between regular and irregular verbs, may not have direct relevance for the Slavic languages. That said, it must also be added that research on related phenomena in various languages may shed light on their processing and mental representation in general, as the results on DC production showed with respect to the possible modelling of grammatical gender production.
This leads to the third merit of the studies, namely their theoretical relevance for the modelling of speech production. In addition to providing results with obvious language-specific value, the data also contribute to the discussions about more general theoretical issues. The most significant result in this context seems to be the finding that there can be a competition for selection even when formally identical endings are produced (the last DC experiment), indicating that competition for selection between abstract grammatical features at the lemma level cannot easily be dispensed with in favour of competition at the level of phonological encoding. This indeed necessitates a revision of the previous conclusions about modelling, which were based on data about gender production only, and brings the Levelt model of speech production “back into play”.

Conclusion

The review of the studies using reaction time methodology to explore Czech morphology is unfortunately complete. It must be noted that the review of research on the Czech language using this methodology is unfortunately complete as well and that there is only a handful of other (mostly offline) psycholinguistic studies that explore the processing of adult native speakers of Czech. Experimental psycholinguistic research has only a very short tradition in the Czech Republic and, except for the studies reviewed in this article, there are only a few studies performed by Filip Smolík and his colleagues on child language and by Barbara Mertins and Denisa Bordag on Czech as an L2 of adult learners (see Bordag, 2015, for a review).
Nevertheless, we hope that the review showed that even with a relatively simple, inexpensive methodology, it is possible to achieve results that can advance our knowledge about language processing and mental representation, and that language-specific properties can be meaningfully employed to conduct research that has the potential to improve more general psycholinguistic models and theories.

References

Anderson, S. R. (1977). On the formal description of inflection. Proceedings of Chicago Linguistic Society, 13, 15-44.
Anderson, S. R. (1982). Where’s Morphology? Linguistic Inquiry, 13, 571-612.
Anderson, S. R. (1988). Morphological Theory. In F. J. Newmeyer (Ed.), Linguistics: The Cambridge survey I. Linguistic Theory: Foundations (pp. 146-191). Cambridge: Cambridge University Press.
Bock, J. K. (1986). Meaning, sound, and syntax: Lexical priming in sentence production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 575-586.
Bordag, D. & Pechmann, T. (2008). Grammatical gender in speech production: Evidence from Czech. Journal of Psycholinguistic Research, 37(2), 69-85.
Bordag, D. & Pechmann, T. (2009). Externality, internality, and (in-)dispensability of grammatical features in speech production: Evidence from Czech declension and conjugation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(2), 446-465.
Bordag, D. (2015). Experimentální psycholingvistika v českém kontextu [Experimental psycholinguistics in Czech context]. In O. Uličný et al. (Ed.), Preliminária k moderní mluvnici češtiny (pp. 55-66). Olomouc: Univerzita Palackého.
Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14, 177-208.
Costa, A., Kovacic, D., Fedorenko, E., & Caramazza, A. (2003). The gender congruency effect and the selection of free-standing and bound morphemes: Evidence from Croatian. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1270-1282.
Dell, G. S. (1986). A spreading-activation model of retrieval in sentence production. Psychological Review, 93, 231-241.
Feldman, L. B. (1994). Beyond orthography and phonology: Differences between inflections and derivations. Journal of Memory and Language, 33, 442-470.
Julínková, R. & Bordag, D. (2015). Processing of Different Types of Affixes in Czech. Studies in Applied Linguistics, 2, 52-76.
Komárek, M., & Petr, J. (Eds.). (1986). Mluvnice češtiny. 2, Tvarosloví. Ústav pro Jazyk Český ČSAV. Prague, Czech Republic: Academia.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge: MIT Press.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1-75.
Lukavský, J. & Smolík, F. (2009). Word Order and Case Inflection in Czech: Online Sentence Comprehension in Children and Adults. In N. A. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st Annual Conference of the Cognitive Science Society (pp. 1358-1363). Austin, TX: Cognitive Science Society.
Meyer, D. E., & Schvaneveldt, R. W. (1971). Facilitation in recognizing pairs of words: Evidence of a dependence between retrieval operations. Journal of Experimental Psychology, 90, 227-234.
Miozzo, M., & Caramazza, A. (1999). The selection of determiners in noun phrase production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 907-922.
Perlmutter, D. (1988). The Split Morphology Hypothesis: Evidence from Yiddish. In M. Hammond & M. Noonan (Eds.), Theoretical Morphology: Approaches in Modern Linguistics (pp. 79-100). San Diego, CA: Academic Press.
Scalise, S. (1984). Generative Morphology. Studies in Generative Grammar, 18. Dordrecht: Foris.
Scalise, S. (1988). Inflection and derivation. Linguistics, 26, 561-582.
Schiller, N. O., & Caramazza, A. (2002). The selection of grammatical features in word production: The case of plural nouns in German. Brain and Language, 81, 342-357.
Schiller, N. O., & Caramazza, A. (2003). Grammatical feature selection in noun phrase production: Evidence from German and Dutch. Journal of Memory and Language, 48, 169-194.
Schreuder, R., & Baayen, R. H. (1995). Modelling morphological processing. In L. Feldman (Ed.), Morphological aspects of language processing (pp. 131-154). Hillsdale, NJ: Lawrence Erlbaum.
Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time course of lexical access in language production: picture-word interference studies. Journal of Memory and Language, 29, 86-102.
Schriefers, H. (1993). Syntactic processes in the production of noun phrases. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 841-850.
Smolík, F. (2010). Inflectional Suffix Priming in Czech Verbs and Nouns. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society (pp. 1667-1672). Austin, TX: Cognitive Science Society.

Some “cases of doubt” in Russian grammar from different methodical perspectives

Elena Dieser

Abstract: The present paper¹ investigates grammatical variations in the Russian language using various experimental methods, such as different types of grammaticality and acceptability judgments (with and without endpoints and with or without additional tasks) and questionnaires. Overall, more than 350 respondents participated in the experiments. The goal is to clarify which diverging or corresponding results can be achieved with the different methods, whether additional tasks have an effect on the numerical judgment of forms and constructions, and whether the presence of endpoints on the scale influences the respondents in their decisions. The results of this study show that the combination of different methods can be seen as a considerable enrichment.
It helps the researcher recognize the limits of each individual method and balance them out. Even though there were no statistically significant differences between the results for the test with and without endpoints and for the test with or without additional tasks, this comparison yielded many additional findings. The additional tasks, for example, help clarify which phenomena the respondents saw as divergent and show that some of the respondents completely accepted the modified form and saw the codified form as a deviation.

¹ Some of the results discussed in this article are presented in Dieser (2010), Dieser (2013), and Dieser (2015) in German or in Russian.

1 Introduction

Modern linguists employ a large number of experimental designs; nevertheless, most linguistic investigation areas are associated with classical methods. In many instances, the investigation of first language acquisition, for example, used to be based on longitudinal studies, and analyses of the periphery of the linguistic norm built on grammaticality judgments (cf. Bornkessel-Schlesewsky & Schlesewsky, 2007, “a new judgement-driven imperialism”). Although these methods are most efficient for the investigation of the respective object (cf. Köpke & Schmid, 2004), they also have their limits. The collection of numerical grammaticality judgments does not always lead to unambiguously interpretable results (Hundt, 2005; Kaufmann, 2005). The reasons for this include the often-discussed unnatural test situation and the performance of several successive tasks of the same kind. The disadvantages of an experimental method can be compensated for at least partially by combining several methods (Anstatt, 2011; Dieser, 2009). “The combination of multiple data sources and multiple methods as evidence” within one study is a new phenomenon (cf. Arppe & Järviki, 2007, p. 131).
Until the mid-1990s, the two main sources of evidence in linguistics were introspection and corpus data, which stood in strong opposition to each other (Kepser & Reis, 2005, cited in Arppe & Järvikivi, 2007, p. 131). However, during the last several years, many studies have appeared that use different types of empirical data and methods (cf. Arppe & Järvikivi, 2007; Bermel & Knittl, 2012; Hundt, 2005; Kempen & Harbusch, 2008). Combining several methods helps the researcher understand the advantages and disadvantages of each method. It makes it possible to shed light on various aspects of a linguistic phenomenon, but in some instances, it also raises puzzling new questions when the discrepancies between the results of different methods are too large (cf. Kempen & Harbusch, 2008).

In the present study, some “cases of doubt” in Russian grammar are analyzed on a broad empirical basis. Tests with different designs form the experimental basis. The first test type is the pure collection of grammaticality and acceptability judgments, in which the participants express their appraisal of the grammaticality of test sentences numerically. Two versions of this experiment are used: in one version, the scale used by the informants has no endpoints (cf. the “thermometer judgments” method, Featherston, 2006); in the other, the endpoints are given. The second type consists of tests with additional tasks: in addition to judging the grammatical acceptability of the sentences (cf. Dieser, 2010), the participants are asked to improve the forms or constructions they consider ungrammatical. Moreover, the informants are asked to report on the situation or style of speech in which these deviating forms would be allowed (cf. Hundt, 2005). The third test type is a questionnaire, in which the respondents are asked to build the correct word forms from words given in their initial form.

Several studies (cf.
Dieser, 2010; Featherston, 2006) have shown during the last few years that native speakers perceive the grammatical correctness of linguistic units as a graduated value; that is, between the grammatically correct and the ungrammatical phenomena, there is a wide transitional area.

The present study examines whether certain modifications of the grammaticality experiments influence the respondents in their decisions. Thus, whether a grammaticality judgment on a scale depends on the presence of endpoints will be discussed. Another question is what happens when a respondent has to not only judge the test sentences but also fulfill additional tasks: Does this lead to an increase in attention and perhaps to a stricter judgment of sentences? Furthermore, contrasting experimental data with data from the questionnaires sheds light on the way in which a grammatical variant’s frequency of use is related to the judgment of the variant’s grammaticality. In addition, other factors that may influence the grammaticality judgment will be examined. A further question is the replacement of the codified standard with an imaginary norm when the former is no longer applied by the majority of native speakers (cf. Belikov, 2009, p. 44).

To my knowledge, this is the first investigation of this extent regarding Russian (however, publications have analyzed the relationship between corpus data and acceptability judgments in other Slavic languages, cf. Bermel & Knittl, 2012, for Czech). Some types of experiments, for example, thermometer judgments (Featherston, 2006, p. 4) and tests with more extensive tasks, are to my knowledge virtually unheard of for Russian.

The chapter is structured as follows: In the second section, the term linguistic “case of doubt” is explained and the linguistic phenomena investigated in this study are introduced.
The third section presents the empirical data and analysis. In the fourth section, the analysis of the results based on selected examples follows. The fifth section provides a short summary.

2 A linguistic “case of doubt”

A linguistic “case of doubt,” as examined in this study, is understood as a linguistic unit (word, word form, or sentence) with (at least) two variants (a, b) about which a competent speaker might be unsure which form is correct, or more correct, in the standard language (cf. linguistic variation, double form, doublet; Klein, 2003/2004, p. 1). A grammatical case of doubt from Russian or, in other words, from the periphery of Russian grammar, is given in example 1:

(1) a. Na kočke my zametili dvuch ljagušek.
       “On the tussock we noticed two GEN-ACC² frogs GEN-PL.”
    b. Na kočke my zametili dve ljaguški.
       “On the tussock we noticed two NOM-ACC frogs GEN-SG.”

The coexistence of the parallel constructions (a) and (b) in (1) shows that the category of animacy is not applied consistently in Russian constructions with numerals (cf. Grannes, 1986; Mel’čuk, 1985). This situation is reflected in the partial codification (Švedova, 1980) of constructions with the non-applied category of animacy (see 1b).

How could a “case of doubt” appear? An important reason is the pressure from the language system. The language system is understood as the totality of the possible ways and means of expression in every national language (Krysin, 2007, p. 6). However, not all possible ways and means of expression are realized in the norm. In the system of the Russian language, for example, it is possible to decline borrowed neuter nouns ending in -o (e.g., kino “cinema” or pal’to “coat”) according to the model of corresponding Russian nouns (e.g., okno “window”), although the codified norm of the last century rejects this.
Thus, it is not surprising that in those forms of the national language that do not undergo codification (for example, in sloppy colloquial language) the declined variants result from the pressure of the language system. Divergences from the norm attributable to the pressure of the language system seem especially common in children’s language, because children acquire the language system earlier than the linguistic norm (cf. Coseriu, 1970). Such offenses against the norm help linguists investigate the basic linguistic rules of the respective language. With respect to the language system, such “grammatical errors” are like “windows to the faculty of language,” cf. Reis (2005, p. 101).

Variations or deviations from the norm often appear when the rules of the language system come into conflict or when the scopes of various rules overlap. For example, in German gender assignment of nouns, the morphological rules dominate the semantic ones, but colloquially, this hierarchy is not always observed. Thus, the noun das Mädchen “girl” is morphologically neuter, but as an anaphoric personal pronoun, colloquially the semantically congruent feminine pronoun sie “she” is often used instead of es “it”.

² The Russian noun features the category of animacy, also called subgender. The category of animacy is marked in the accusative case, namely, in the accusative singular of masculine nouns ending in a consonant and in the accusative plural of all three genders. The accusative form of animate nouns is homonymous with their genitive form (GEN-ACC), and the accusative form of inanimate nouns is homonymous with their nominative form (NOM-ACC).

Other reasons for the origin of “cases of doubt” are the influence of dialect or sociolect and disturbances in the mechanisms of linguistic processing (slips of the tongue) (cf. Reis, 2005, p.
102; Cejtlin, 2009, pp. 14-15).

The present study analyzes mainly grammatical variation in the categories of case and animacy. Most variants have emerged because of a conflict between different linguistic rules or because of analogies. The two categories are, like the morphological system as a whole, stable (cf. Glovinskaja, 2008). The variants often arise where several different rules come into conflict, for example, when a verb and a numeral (example 1) or a preposition and a noun (example 2) require a different case. However, there are also “cases of doubt” that are not explained by these reasons, for example, the confusion of the genitive and the prepositive with nouns (example 3). M. Glovinskaja, who has collected several hundred examples of divergences from the norm in spoken and written Russian, states that during the last few years, above all in spoken language (including in the public sphere), genitive forms frequently appear instead of prepositive forms and vice versa, cf. Glovinskaja (1996, p. 267). One possible reason for the emergence of these variants is the great formal similarity: The adjective endings of both cases are identical, and the noun endings are very similar.

Another fertile source of grammatical variants is the declension of compound numerals. In colloquial language, the forms of the nominative and the genitive expand at the expense of other cases, for example, of the instrumental. In the declension of compound cardinal numbers in Russian, a depletion of or change in the case paradigm has been observed for more than one hundred years (cf. Glovinskaja, 2008, p. 250ff.; Poljakova, 2009, p. 15). The adaptation of the form to another case regularly takes place only in the first part of a compound, for example, *semistami “seven GEN hundred INSTR” instead of sem’justami “seven INSTR hundred INSTR” (cf. Glovinskaja, 2008, p. 252).
This trend was noted in the middle and the second half of the 20th century by Vinogradov (1947), Panov (1968), and Mel’čuk (1985); Panov (1968) explains it by the strengthening of the analytical component of Russian morphology.

Non-codified grammatical variants that are used regularly by native speakers are referred to herein as marginal forms or deviations. Forms and structures that were invented by the author for the experiment and are not used by native speakers are referred to as “ungrammatical” hereafter.

3 Data sources (Method)

3.1 Thermometer judgments

The first experiment I carried out in this connection used thermometer judgments (Featherston, 2006). This method permits introspective graduated grammaticality judgments within the scope of a web experiment. “Graduated” means that the informants do not choose between “grammatical” and “ungrammatical” but instead are asked to evaluate the grammaticality of test sentences numerically. Moreover, every judgment is delivered relative to two sentences for which authoritative values are set by the experimenter, as well as relative to the informants’ own previous judgments. The proponents of the thermometer judgment method argue in favor of using two authoritative sentences because it “has turned out that informants can deliver no proportional judgements. Speakers have no intuition whether a structure ‘twice’ or ‘half’ is as good as another as the Magnitude Estimation asks” (Featherston, 2006, p. 52).

In detail, the experiment is carried out as follows: The informants are asked to judge test sentences that are (according to normative grammar) grammatical, marginal, or ungrammatical. Every screen page shows one test sentence. In addition, two authoritative (reference) sentences are presented on every page, for example, a completely well-formed sentence that receives an authoritative value of 30 and a colloquial sentence with an authoritative value of 20.
The question the informant has to answer is the following: If sentence A has the authoritative value 30 and sentence B the authoritative value 20, how many points do you give to sentence C (the test sentence)? The scale the informants use has no endpoints; that is, any value may be assigned to the test sentence.

In the experimental material, content and form are controlled; that is, the test sentences should be semantically clear, contain only one divergence (here, a non-codified form) per sentence, and have a similar length (for other points, see Featherston, 2006, p. 54). To avoid the observer’s paradox, every test set has a high number of distractors, i.e., sentences that show a divergence in areas other than the one to be examined. The aim of using distractors is to deflect the informants’ attention away from the target stimulus.

In carrying out the investigation, the following conditions should be fulfilled to ensure that the test has statistical validity:

- At least 25 informants should take part;
- Two practice phases should be conducted to prepare informants for the test and to filter out “unreliable” respondents;
- Metadata should be collected;
- Response times should be measured.

In addition, statistical procedures play an important role in the evaluation of the data. Any numerical value can be assigned to the test sentences within the scope of the thermometer judgment method. As a result, informants use different scales: While one informant expresses differences in well-formedness in a very fine-grained way and uses the range between 0 and 40, another picks values between 20 and 25. To be able to compare the results of different respondents with each other, they must be converted to the same unit. Such standardization is possible with the z-transformation.
The resulting values are referred to as z-standardized or normalized.

Within the scope of the study, a total of four thermometer judgment experiments were carried out. The participants were 96 native speakers of Russian; the average age was 24.7 years. Per test, an average of 16 sentences were evaluated (four grammatical, four ungrammatical, and eight marginal with divergences in different linguistic areas). The tested phenomena were taken partly from other investigations (Glovinskaja, 1996, 2000, 2008) and were partly defined on the basis of the author’s own observations of present-day Russian. They were derived mainly from the grammatical areas of case and animacy (see section 2 of this chapter). To check whether the low acceptance of a sentence by the informants was the result of a marginal phenomenon contained in the sentence, a control sentence (without a marginal linguistic phenomenon) was added to the test in each instance in an additional experiment. The control function for the sentence Na kočke my zametili dve ljaguški. “On the tussock we noticed two NOM-ACC frogs GEN-SG.” is fulfilled, for example, by the sentence Na kočke my zametili dvuch ljagušek. “On the tussock we noticed two GEN-ACC frogs GEN-ACC.”

Altogether, the experiments within the scope of the study were carried out mainly with the thermometer judgment method. Nevertheless, because the applied method was itself an object of investigation in the study, some requirements were deliberately violated or modified, for example:

- A different number of informants participated in the tests (tests 1 to 3 had 28 to 29 informants; test 4 had only 12). This should shed light on whether the small size of the group has a statistical effect.
- Several sentences were repeated in two tests. The aim was to examine whether there are significant differences between their assessment in different surroundings and by different informant groups.
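
The z-transformation mentioned in section 3.1 can be illustrated with a minimal sketch: each informant's raw ratings are centered on that informant's own mean and divided by that informant's own standard deviation, z = (x - mean) / sd. The ratings below are hypothetical and only mirror the two scale ranges named in the text (0 to 40 and 20 to 25); the function name `z_standardize` and the use of the population standard deviation are assumptions, since the chapter does not state which variant was applied.

```python
from statistics import mean, pstdev


def z_standardize(ratings):
    """Convert one informant's raw ratings to z-scores.

    Assumption: the population standard deviation (pstdev) is used;
    the chapter does not say which variant was applied.
    """
    m = mean(ratings)
    sd = pstdev(ratings)
    if sd == 0:
        raise ValueError("all ratings are identical; z-scores are undefined")
    return [(r - m) / sd for r in ratings]


# Hypothetical ratings of the same five sentences (not data from the study):
# one informant spreads judgments over 0-40, another stays within 20-25.
wide_scale = [0, 10, 20, 30, 40]
narrow_scale = [20.0, 21.25, 22.5, 23.75, 25.0]

z_wide = z_standardize(wide_scale)
z_narrow = z_standardize(narrow_scale)

# After standardization, the two informants' judgments are directly comparable.
for a, b in zip(z_wide, z_narrow):
    print(f"{a:+.3f}  {b:+.3f}")
```

In this constructed case the two informants rank and space the sentences identically, so their z-scores coincide even though their raw values differ widely; with real data the standardized values would differ, but they would be expressed in the same unit.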

3.2 Grammaticality judgments: The scale with endpoints

Although the thermometer judgment method provides very extensive results and advantages for researchers, it was decided that a simplified version would be used for further tests. This decision was triggered by the participants in the thermometer judgment experiment themselves. Several wrote in the comments that the experiment was too complicated. In their view, it would have been much simpler if the informants had been asked to evaluate the example sentences using the Russian school grading system, in which the best sentences receive a grade of 5 and the worst a grade of 1. Such a scale has been used in some linguistic studies on Russian (e.g., Norman, 2007).

As a result, a new set of experiments emerged. In most points, it agreed with the thermometer judgment experiment, while in others it differed from it. The most important differences from the thermometer judgment experiment are as follows:

- A scale with endpoints: from 1 (very bad) to 5 (very good);
- No reference sentences;
- No practice phase.

It was considered especially advantageous that the two reference sentences were no longer needed, as the values assigned by the experimenter could have influenced the participants.

The new set of experiments consists of three tests. The first test type concerns the pure collection of grammaticality and acceptability judgments, in which the participants express their appraisal of the naturalness of the test sentences numerically. In the second test type, the participants were asked, in addition to judging the grammatical acceptability of sentences, to improve the forms or constructions they considered wrong or ungrammatical. Moreover, the informants were asked to report on the situation or style of speech in which these deviating forms would be allowed (cf. Hundt, 2005). The test sentences in the first and second tests were identical.
In terms of design, the third test was similar to the second and contained additional questions. However, it used other test sentences; some were control sentences for the sentences with a modification that were tested in tests 1 and 2.

Each test had 45 participants, all of whom were native speakers of Russian and students. The average age was 19 years. The tests were distributed during lessons and were collected 10 minutes later.

The results of these tests were compared to each other as well as with the results of the thermometer judgment experiments. The values awarded by the respondents were z-standardized. The aim was to examine whether requiring the respondents to perform additional tasks has an effect on the numerical judgment of the forms and constructions (and if so, whether the difference is statistically significant) and whether the presence of endpoints on the scale influences the respondents in their decisions.

3.3 Questionnaires

In addition to the judgments, the questionnaire method was used. Material obtained in this way is close to natural written language or to “found” linguistic data. The purpose of these additional experiments was to clarify how the assessment of a linguistic phenomenon corresponds to its use in speech. It was assumed that, on the one hand, the divergences that native speakers do not evaluate as absolutely wrong occur in the questionnaire and, on the other hand, the divergences that often occur in questionnaires receive relatively “good grades” on the tests.

The main task of the questionnaire was to put the words given in brackets in their initial form into the form required by the grammatical construction. The material employed in the questionnaire corresponded to the material employed in the judgment tasks.
Eight groups of university students and vocational school pupils filled out the questionnaire; each group comprised 25 to 32 respondents. They were from St. Petersburg, Samara (Russia), Minsk (Belarus), and Almetyevsk (Russia, Republic of Tatarstan). The informants from St. Petersburg and Samara were monolingual native speakers of Russian. The participants from Minsk and Almetyevsk were Russian-Belarusian and Russian-Tatar bilingual speakers, respectively.

4 Results

4.1 Questionnaires

The discussion of the results begins with the evaluation of the questionnaires. The questionnaires indicate whether the marginal grammatical phenomena evaluated in the different types of grammaticality judgments are used by native speakers at all.

First, we address the divergences in the category of case. Examples 2 and 3 illustrate the confusion of the genitive and prepositive cases. In both instances, the standard-language variant is listed under a), and the variant involving case confusion is listed under b):

(2) a. V bol’šinstve slučaev prepodavatel’ otvečal na voprosy studentov ves’ma podrobno.
       “In the majority of cases GEN the instructor answered the students’ questions in great detail.”
    b. *V bol’šinstve slučajach prepodavatel’ otvečal na voprosy studentov ves’ma podrobno.
       “In the majority of cases PREP the instructor answered the students’ questions in great detail.”

(3) a. Ob ėtich chudožnikach uže pisali.
       “About these artists PREP was already written.”
    b. *Ob ėtich chudožnikov uže pisali.
       “About these artists GEN was already written.”

In the quantitative nominal phrase v bol’šinstve slučaev “in the majority of cases GEN” (2a), the genitive ending of the noun slučaev becomes unstable. A form in the prepositive, *v bol’šinstve slučajach “in the majority of cases PREP”, appears under (2b).
In place of the prepositive form ob ėtich chudožnikach “about these artists PREP”, a genitive form *ob ėtich chudožnikov “about these artists GEN” appears under (3b).

Figure 1³: v bol’šinstve slučaev “in the majority of cases GEN” / *v bol’šinstve slučajach “in the majority of cases PREP” (Questionnaires)

Figure 2: ob ėtich chudožnikach “about these artists PREP” / *ob ėtich chudožnikov “about these artists GEN” (Questionnaires)

³ The number on the bars in figures 1 to 4 gives the number of the respondents who used this form.

Figures 1 and 2 show what percentage of the respondents use the standard forms. It is the majority. However, divergences from the norm also appear in almost every group. The most frequent deviation from the norm is to confuse the genitive and prepositive forms, which Glovinskaja noted in her study. The construction *v bol’šinstve slučajach “in the majority of cases PREP” was used by the informants from St. Petersburg (1x), Samara (3x), and Almetyevsk (4x) and the construction *ob ėtich chudožnikov “about these artists GEN” by the respondents from St. Petersburg (4x), Samara (4x), and Minsk (3x). The highest deviation rate was ascertained for Samara.

We now turn to the declension of the compound cardinal numbers. For the present investigation, the informants should produce the instrumental form of the numeral 673. Figure 3 gives the percentage of respondents who solved the task appropriate for the norm. For all participant groups, this percentage was the lowest in the whole test. The results confirm therefore that the use of the codified declension of the compound numerals is continuing its downward spiral. Only the students from St. Petersburg, who also displayed the fewest divergencies with the other tasks, used the form required by the standard language at a rate of 50%. The percentage of the other groups was distinctly low. Of the divergent forms, the form košelek s *šestisot *semidesjat’ju tremja rubljami was used most frequently by all groups.

Figure 3: šest’justami sem’judesjat’ju tremja; the instrumental form of the numeral 673 (Questionnaires)

We now turn to the declension of feminine animal names in the plural in combination with the numerals from 2 to 4. Figure 4 shows how often the informants use the grammatically preferred construction with the numeral in the GEN-ACC and the noun in the GEN-PL, with consideration of the category of animacy. The majority of divergent forms was the heterogeneous construction (1b) (the numeral in the NOM-ACC and the noun in the GEN-SG) with the non-applied category of animacy. In comparison to the constructions *v bol’šinstve slučajach “in the majority of cases PREP” *ob ėtich chudožnikov “about these artists GEN” (deviating from the standard norm), this construction seems to be distinctly frequent, which is compatible with the codification. Pupils from St. Petersburg and from Samara as well as a student from Minsk use it more often than the GEN-ACC form.

Figure 4: dvuch ljagušek. “two GEN-ACC frogs GEN-PL” / dve ljaguški. “two NOM-ACC frogs GEN-SG” (Questionnaires)

The questionnaire results confirmed that the described (many non-codified) variations are used by native speakers. In contrast, the ungrammatical forms that were used as distractors in grammaticality judgments do not appear on the questionnaires.

4.2 Thermometer judgments/ grammaticality judgments: The scale with endpoints

As anticipated by many linguists, the data compiled in the study shows that native speakers divide linguistic phenomena not categorically into right and wrong but perceive them in view of grammaticality/ non-grammaticality in a highly sophisticated way (cf. fig. 5).

Figure 5:
codified forms:
1: v bol’šinstve slučaev “in the majority of cases GEN” (results from the scale without endpoints)
2: v bol’šinstve slučaev “in the majority of cases GEN” (results from the scale with endpoints)
marginal forms:
3: v bol’šinstve slučajach “in the majority of cases PREP” (results from the scale without endpoints)
4: v bol’šinstve slučajach “in the majority of cases PREP” (results from the scale with endpoints)
5: v bol’šinstve slučajach “in the majority of cases PREP” (results from the scale with endpoints and additional tasks)
ungrammatical forms:
6: v bol’šinstve slučajav (results from the scale with endpoints)

Figures 5 and 7 indicate that in the judgment of different sentences, a categorical allocation of “right” and “wrong,” was not possible. In addition, in this transitional area, graduations between more and less satisfactory forms and constructions can be observed. The concept of prototypicality (Rosch, 1975) that is so productive in linguistics can also apparently be applied to the perception of grammaticality.

Figure 6:
codified forms:
1: ob ėtich chudožnikach “about these artists PREP” (results from the scale with endpoints)
marginal forms:
2: *ob ėtich chudožnikov “about these artists GEN” (results from the scale with endpoints)
3: *ob ėtich chudožnikov “about these artists GEN” (results from the scale with endpoints and additional tasks)

Different groups of informants evaluated the same sentence, even in different surroundings, similarly. This statement is supported by the statistical analysis: The analysis of variance (ANOVA) confirms that no significant differences are perceptible between the judgments of the same sentences by the informants from most test sets. Moreover, the size of the group did not influence the results. At the same time, there was a significant difference between the judgment of the grammatical and ungrammatical sentences (figs. 5-9). The distance between the assessment of marginal sentences and “perfect” sentences on one hand and between marginal sentences and “absolutely” ungrammatical sentences on the other hand was not as large. The transition is very smooth.

Now to the relationship between the results from the questionnaires and the results from the grammaticality judgments. Is it true that the divergences evaluated by native speakers as not absolutely false occur on the questionnaire and that the divergences that often appear on the questionnaires get “good grades” on the tests?

These assumptions can be confirmed based on many examples (figs. 5, 7, and 8). In these instances, the results of the grammaticality judgment experiments support the questionnaire results.

Figure 7:
1: dvuch ljagušek. “two GEN-ACC frogs GEN-PL” (results from the scale with endpoints)
2: dve ljaguški. “two NOM-ACC frogs GEN-SG” (results from the scale without endpoints)
3: dve ljaguški. “two NOM-ACC frogs GEN-SG” (results from the scale with endpoints)
4: dve ljaguški. “two NOM-ACC frogs GEN-SG” (results from the scale with endpoints and additional tasks)

For example, the construction *V bol’šinstve slučajach “in the majority of cases PREP” (fig. 5: boxplots 3, 4, and 5), which is used by native speakers, contains a modification. Here, the noun slučajach “cases PREP” stands not in the genitive as required by bol’šinstvo “the majority of”, but in the prepositive. The divergent choice of the case is perhaps due to the influence of the government of the preposition v “in”, cf. v slučajach “in cases PREP” (for other explanation possibilities, see Glovinskaja, 1996). On average, the participants of three test sets gave a value between 0.5 and -0.5 to this sentence in all instances. This means that although the sentence does not sound “natural” to informants, it seems to them to be not absolutely impossible.

Figure 8:
codified forms:
1: šest’justami sem’judesjat’ju tremja; the instrumental form of the numeral 673 (results from the scale with endpoints)
marginal forms:
2: *šestisot semidesjat’ju tremja (results from the scale with endpoints)
3: *šestisot semidesjat’ju tremja (results from the scale with endpoints and additional tasks)

There are also instances where discrepancies between the results can be observed. The sentence Viktor načal *napisat’ kursovuju rabotu v konce semestra “Viktor started to *write PERF at the end of the semester the homework” is not used at all on the questionnaires. The use of the perfective aspect in combination with the phase verbs is viewed by more than 20 orally interviewed Russian native speakers as ungrammatical. This combination was invented by the author for the experiment and did not occur on the questionnaires. The judgment of this sentence within the scope of both grammaticality judgment experiments does not appear all that negative (fig. 10). The distribution with the second test (fig. 10, boxplot: 3) is as follows: the highest value, 5, was awarded eleven times, 4 eight times, 3 two times, 2 two times, and 1 22 times.

Figure 9:
codified forms:
1: čitat’ knigi “to read books ACC” (results from the scale without endpoints)
2: čitat’ knigi “to read books ACC” (results from the scale with endpoints)
ungrammatical forms:
3: čitat’ knigach “to read books PREP” (results from the scale without endpoints)
4: čitat’ knigach “to read books PREP” (results from the scale with endpoints)

With the third test (with the additional tasks), 5 was awarded eight times, 4 three times, 3 three times, 2 two times, and 1 28 times. Many studies report the phenomenon that the single constructions which practically never appear in the corpus are not classified as “ungrammatical” by the native speakers (Kempen & Harbusch, 2008, the grammaticality-frequency gap; Bermel & Knittl, 2012; Featherston […] by Bermel and Knittl (2012, p. […] regarding the data introduced […] sentence is due to a lack of […] sentences, in contrast to […] (in this instance, 3). The […] very bad and very good. […] construction was genuinely seen […] I therefore suppose that, in […] attention was responsible. […] and Schlesewsky (2007) that […] factors.

Figure 10:
codified forms:
1: načal pisat’ “started to write […]
ungrammatical forms:
2: načal *napisat’ “started to *[…]
3: načal *napisat’ “started to *[…]
4: načal *napisat’ “started to *[…] additional tasks)
2 d he atten he c ssm ever n by nsta That t thi IMPF write write write amm ton 241 re, ntio case men rthe y a q anc t is w is ty ” (r e PERF e PERF e PERF mar f n, 20 ) as I as on. O e of nt of eless qua es o why ype esu F ” (r F ” (r F ” (r from 005) s “th ssum One do f thi s, o arter of a y I of lts f resu resu resu m dif ). Th he l me e co ubt is se one r of ssig agr dat from ults ults ults ffere his limi that ould ts — ente can f all gnin ree w ta is m th from from from ent m spe its o t a d see — ha ence n ha l res ng t with s no he sc m th m th m th meth ecifi of t pos e th ardl e is ardly spon the h h B ot im cale he s he s he s hodi ic fe this sitiv his i ly h bas y be nde hig Born mmu wit scale scale scale ical p eatu pre ve ju in th have sica elie ents hes nkes une th e e wi e wi e wi pers ure edic udg he f e a lly eve s as st va ssele to endp itho ith e ith e spec is r ctab gme fact mid spli tha “ve alue -Sch per poin out e end end tives refe bility ent f tha ddle it be t th ery e, a hles rfor nts) end poin poin es erred y.” for at th e m etw his c goo lac sew rma dpoin nts) nts 169 d to Rethis hese mark ween conod.” k of wsky ance nts) ) and 9 o s e k n - ” f y e ) d Elena Dieser 170 One can observe another discrepancy between the results of the questionnaires and of the grammaticality judgment experiments in the construction *ob ėtich chudožnikov “about these artists GEN ” (figs. 2 and 6). This form is used by native speakers almost as often as *v bol’šinstve slučajach “in the majority of cases PREP ” (cf. figs. 1 and 5). The use of the genitive instead of the prepositive in *ob ėtich chudožnikov “about these artists GEN ” in the grammaticality judgment experiments is regarded as bad (fig. 6). The construction *V bol’šinstve slučajach “in the majority of cases PREP ”, in contrast, represents the imaginary norm for several respondents and therefore pertains to linguistic competence. 
The reason could lie in the system-internal conflict between different case governments (bol’šinstvo “majority” requires the genitive and v “in” requires the prepositional). The construction *ob ėtich chudožnikov “about these artists GEN”, which is supported by no other factors, is viewed by most native speakers in the graphematic representation as a grammatical error.

Now we come to the question of whether the different types of grammaticality judgment experiments used in this study yield more insight than a single type. If we compare the following boxplots with each other (in fig. 5: boxplots 4 and 5; in fig. 6: boxplots 2 and 3; in fig. 7: boxplots 3 and 4; in fig. 8: boxplots 2 and 3), we see that the difference between the results from these tests was not significant in any instance; that is, the additional tasks do not have a decisive effect on the numerical judgment of forms and constructions. Nevertheless, the answers to the additional tasks were interesting within the scope of a qualitative analysis. Several informants would allow constructions or forms with modifications, e.g., *v bol’šinstve slučajach “in the majority of cases PREP” or my zametili dve ljaguški “we noticed two NOM-ACC frogs NOM-ACC”, in colloquial language. In some instances, however, the codified forms were classified as colloquial and the modified ones as standard. This was the case, for example, with *v bol’šinstve slučajach “in the majority of cases PREP”: of the 45 informants, six classified v bol’šinstve slučaev “in the majority of cases GEN” as colloquial and instead recommended *v bol’šinstve slučajach “in the majority of cases PREP”.

If one now compares the judgments of sentences on the scale with endpoints and on the scale without endpoints (in fig. 5: boxplots 1 and 2, boxplots 3 and 4/5; in fig. 7: boxplots 2 and 3/4; in fig. 9: boxplots 1 and 2, boxplots 3 and 4; in fig. 10: boxplots 2 and 3/4), one sees that the judgments of the modified sentences are more alike than the judgments of the sentences without modification. Significant differences emerge almost exclusively with the sentences without modification. What could be the reason? It could possibly be a technical problem. Sentences without modification have a lower dispersion with a school grading system because in most instances the best grade of 5 is assigned to them. With a scale with endpoints, however, the native speakers perceive grammaticality as a graded phenomenon.

Conclusion

The results of this study show that the combination of different methods is a considerable enrichment. It helps researchers recognize the limits of the individual methods and compensate for them. Even though there were no statistically significant differences between the results from the tests with and without endpoints, or with and without additional tasks, this comparison yielded many additional findings. The additional tasks, for example, help clarify which phenomena were seen by the respondents as deviant. This weakens the criticism of grammaticality judgments made by Lehmann (2004), namely that the director of the experiment cannot isolate precisely which phenomena the respondents are judging. In addition, it is only thanks to the additional tasks (the participants were asked to improve the forms or constructions they considered ungrammatical and to report on the situation or style of speech in which these deviating forms would be allowed) that one can see that some of the respondents completely accept the modified form and view the codified one as a deviation. A further comparison with the results from the questionnaires made it possible to better recognize the weaknesses of grammaticality judgments.
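One such weakness can be made concrete with the grade counts reported for the sentence with načal *napisat’ in the second test (5 awarded eleven times, 4 eight times, 3 twice, 2 twice, and 1 22 times). The following Python sketch is an illustration added here and not part of the study's own analysis; the function name summarize and the choice of summary measures are assumptions. It shows that a mean near the middle of the scale can hide a distribution that is split between the two extremes.

```python
from statistics import mean, median

# Grade counts reported in the chapter for the second test (scale 1-5).
second_test = {5: 11, 4: 8, 3: 2, 2: 2, 1: 22}


def expand(counts):
    """Turn {grade: frequency} into a sorted list of individual grades."""
    grades = []
    for grade, frequency in counts.items():
        grades.extend([grade] * frequency)
    return sorted(grades)


def summarize(counts):
    """Summary measures for one distribution of grades (hypothetical helper)."""
    grades = expand(counts)
    total = len(grades)
    extremes = counts.get(1, 0) + counts.get(5, 0)
    return {
        "n": total,
        "mean": round(mean(grades), 2),
        "median": median(grades),
        # Share of respondents who chose the middle mark (3).
        "middle_share": round(counts.get(3, 0) / total, 2),
        # Share of respondents who chose either end of the scale (1 or 5).
        "extreme_share": round(extremes / total, 2),
    }


if __name__ == "__main__":
    print(summarize(second_test))
```

Under these assumptions, most of the answers fall on the two endpoints and very few on the middle mark, which is the pattern the chapter attributes to a lack of attention rather than to genuine acceptance.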
Some of the respondents presumably evaluate the sentences imprecisely due to a lack of attentiveness. Nevertheless, the values given still provide us with an idea about which judgments were made “consciously” and which “accidentally.” Using a scale with or without endpoints hardly had any influence on the results of the study.

Only by combining different methods can one determine that a high frequency of use does not necessarily lead to the classification of the forms in question as grammatically correct. The constructions that conform to the norm received the “best grades” for their grammaticality from most participants, even if their frequency of use was very low (e.g., the instrumental forms of the compound numerals). Therefore, the data discussed for the examined areas do not support the assumption that an imaginary norm exists that differs substantially from the standard norm.

The usefulness of combining different methods for linguistic investigation is universal, irrespective of the language. In this investigation, we used these advantages for typical grammatical categories of Slavic languages, and of Russian in particular (the case and animacy categories).

References

Anstatt, T. (2011). Sprachattrition. Abbau der Erstsprache bei russisch-deutschen Jugendlichen. Wiener Slawistischer Almanach, 67, 7-31.
Arppe, A., & Järvikivi, J. (2007). Every method counts: Combining corpus-based and experimental evidence in the study of synonymy. Corpus Linguistics and Linguistic Theory, 3-2, 131-159.
Belikov, V. I. (2009). Stereotipy v ponimanii literaturnoj normy. In Russkij jazyk v uslovijach kul’turnoj i jazykovoj polifonii (Die Welt der Slaven, Sammelbände 38) (pp. 43-59). München: Sagner.
Bermel, N., & Knittl, L. (2012). Corpus frequency and acceptability judgements: A study of morphosyntactic variants in Czech. Corpus Linguistics and Linguistic Theory, 8-2, 241-275.
Bornkessel-Schlesewsky, I., & Schlesewsky, M. (2007). The wolf in sheep’s clothing: Against a new judgement-driven imperialism. Theoretical Linguistics, 33(3), 319-333.
Cejtlin, S. N. (2009). Rečevye ošibki i ich predupreždenie. Na materiale ošibok škol’nika. Moscow, Russia: Librokom.
Coseriu, E. (1970). System, Norm, Rede. In E. Coseriu (Ed.), Sprache - Strukturen und Funktionen (pp. 193-212). Tübingen, Germany: Gunter Narr Verlag.
Dieser, E. (2009). Genuserwerb im Russischen und Deutschen. Korpusgestützte Studie zu ein- und zweisprachigen Kindern und Erwachsenen. München: Sagner.
Dieser, E. (2010). Schwankungen von Sprecherurteilen aus kognitiver Sicht. In T. Anstatt & B. Norman (Eds.), Die slavischen Sprachen im Licht der kognitiven Linguistik (pp. 141-157). Wiesbaden, Germany: Harrassowitz Verlag.
Dieser, E. (2013). Konkurencija form padežnogo upravlenija v russkom jazyke. In S. Kempgen, M. Wingender, N. Franz, & M. Jakiša (Eds.), Deutsche Beiträge zum 15. Internationalen Slavistenkongress, Minsk 2013 (pp. 99-108). München: Sagner.
Dieser, E. (2015). Grammatische Varianten im Russischen: Gebrauch und Sprecherurteile. In E. Dieser (Ed.), Linguistische Beiträge zur Slavistik: XX. JungslavistInnen-Treffen in Würzburg, 22.-24. September 2011 (pp. 47-70). München: Sagner (Specimina Philologiae Slavicae).
Featherston, S. (2005). Magnitude estimation and what it can do for your syntax: Some wh-constraints in German. Lingua, 115(11), 1525-1550.
Featherston, S. (2006). Experimentell erhobene Grammatikalitätsurteile und ihre Bedeutung für die Syntaxtheorie. In W. Kallmeyer & G. Zifonun (Eds.), Sprachkorpora - Datenmengen und Erkenntnisfortschritt (pp. 49-69). Berlin: de Gruyter.
Glovinskaja, M. J. (1996). Aktivnye processy v grammatike. In E. A. Zemskaja (Ed.), Russkij jazyk konca XX stoletija (1985-1995) (pp. 237-302). Moskva: Jazyki Russkoi Kul’tury.
Glovinskaja, M. J. (2000). Aspektwandel im Russischen. In L. Zybatow (Ed.), Sprachwandel in der Slavia. Die slavischen Sprachen an der Schwelle zum 21. Jahrhundert. Ein internationales Handbuch. Bd. 1 (pp. 159-179). Frankfurt am Main, Germany: Lang.
Glovinskaja, M. J. (2008). Aktivnye processy v grammatike. In L. P. Krysin (Ed.), Sovremennyj russkij jazyk: Aktivnye processy na rubeže XX-XXI vv. (pp. 187-267). Moskva: Jazyki slavjanskich kul’tur.
Grannes, A. (1986). Rodi mne tri syna. Animacy in Russian numerals norm and usage. In G. Hummel et al. (Eds.), Festschrift für W. Gesemann (pp. 108-117). Neuried: Hieronymus Verlag.
Hundt, M. (2005). Grammatikalität - Akzeptabilität - Sprachnorm. Zum Verhältnis von Korpuslinguistik und Grammatikalitätsurteilen. In F. Lenz & S. Schierholz (Eds.), Corpuslinguistik in Lexik und Grammatik (pp. 15-40). Tübingen, Germany: Stauffenberg.
Kaufmann, G. (2005). Der eigensinnige Informant: Ärgernis bei der Datenerhebung oder Chance zum analytischen Mehrwert? In F. Lenz & S. Schierholz (Eds.), Corpuslinguistik in Lexik und Grammatik (pp. 61-95). Tübingen, Germany: Stauffenberg.
Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgements and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: de Gruyter.
Klein, W.-P. (2003/2004). Sprachliche Zweifelsfälle als linguistischer Gegenstand. Zur Einführung in ein vergessenes Thema der Sprachwissenschaft. Linguistik online 16 (2003/4). Retrieved from http://www.linguistik-online.org/16_03/index.html
Köpke, B., & Schmid, M. S. (2004). Language attrition. The next phase. In M. Schmid, B. Köpke, M. Keijzer, & L. Weilemar (Eds.), First language attrition. Interdisciplinary perspectives on methodological issues (pp. 1-43). Amsterdam: Benjamins.
Krysin, L. P. (2007). Russkaja literaturnaja norma i sovremennaja rečevaja praktika. In Russkij jazyk v naučnom osveščenii. Nr. 2 (14) (pp. 5-17). Moskva: Institut russkogo jazyka im. V. V. Vinogradova RAN.
Lehmann, C. (2004). Data in linguistics. The Linguistic Review, 21(3/4), 275-310.
Mel’čuk, I. A. (1985). Poverchnostnyj sintaksis russkich čislovych vyraženij. Wien: Wiener Slawistischer Almanach.
Norman, B. J. (2007). Jazykovye pravila: vybor varianta jazykovoj edinicy. In Russkij jazyk i literatura. № 2 (pp. 43-48). Minsk: Adukacija i vychavanne.
Panov, M. V. (Ed.). (1968). Russkij jazyk i sovetskoe obščestvo. Moskva: Nauka.
Poljakova, S. V. (2009). Slovoizmenenie količestvennych čislitel’nych v sovremennoj russkoj reči: ėksperimental’noe issledovanie. Tomsk, Russia: Vestnik Tomskogo gosudarstvennogo universiteta.
Reis, M. (2005). “Wer brauchen ohne zu gebraucht...”: zu systemgerechten “Verstößen” im Gegenwartsdeutschen. Cahiers d'Etudes Germaniques, 48, 101-114.
Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology: General, 104, 192-233.
Švedova, N. J. (Ed.). (1980). Russkaja grammatika. Moskva: Institut russkogo jazyka (Akademija nauk SSSR).
Vinogradov, V. V. (1947). Russkij jazyk (grammatičeskoe učenie o slove). Moskva: Ministerstvo prosveščenija RSFSR.

How to study spoken word recognition: evidence from Russian

Julija Nigmatulina, Olga Raeva, Elena Riechakajnen, Natalija Slepokurova & Anatolij Vencov

Abstract: One of the ways to understand and model the recognition and processing of spoken words is to study the spontaneous speech we must cope with in our everyday communication. This paper addresses several problems that we encountered while analyzing spontaneous Russian, among them the annotation of spontaneous speech corpora and the methods of phonetic annotation (since automatic transcription produces too many errors). Reviewing the experimental methods used in our study of spoken word recognition (such as dictation tasks, cloze tests, estimation of naturalness of speech, etc.)
and assessing the results received, we propose several methodological principles that should be followed in the course of psycholinguistic research.

1 Introduction

In studying spoken word recognition, we attempted to develop a functional model of the process. “Virtually all current models of spoken word recognition share the assumption that the perception of spoken words involves two fundamental processes: activation and competition” (Luce and McLennan, 2005, p. 595). This presumably means that a speech signal perceived by a listener activates a number of candidates in his or her mental lexicon, from which that listener chooses the most suitable in a given context.

The human ear analyzes a continuous acoustic signal. Therefore, the model of spoken word recognition must contain a special mechanism for mapping the acoustic speech signal onto its inner discrete representations in a listener’s mind, thus enabling higher-level processing (semantic, syntactic, etc.). In order to describe this mechanism, a researcher should know exactly what a listener hears. As there is no way to map the acoustic signal directly onto orthography, we must use an intermediate representation of data in an acoustic-phonetic form. Therefore, a particular low level of the mental lexicon can be postulated, called the “perceptual lexicon”, which contains all word forms in acoustic-phonetic encoding, with the units of this encoding corresponding to those used by a human at the initial stages of acoustic signal processing.

For many years, psycholinguists used isolated words or phrases read by trained speakers as stimuli for their experiments on spoken word recognition. The results of such experiments allowed them to introduce possible methods of mapping the acoustic speech signal onto its discrete representations and different algorithms of spoken word segmentation (see Vencov and Kasevič 2003 and Luce and McLennan 2005 for overviews of the models).
In the majority of experiments, English words and phrases were used as stimuli. Slavic languages are particularly useful as psycholinguistic research material because they possess rich inflectional morphology. The Russian language appears to have been the first Slavic language to be used in assessing the models of spoken word recognition. In Kassevich, Ventsov and Jagounova (2000), the Cohort model (Marslen-Wilson, 1987) was simulated for the automatic segmentation of written texts. A computer model that used the “word segmentation by identification mechanism”, based on the principles of the Cohort model, was tasked with segmenting several “Russian printed texts […] where all the blank spaces between words have been deleted”, in order “to imitate its running continuous counterpart in ordinary oral speech” (Kassevich et al., 2000, p. 50). A frequency word list consisting of 46,000 word forms was created from the same corpus of texts from which the texts for the experiment were taken, and was used as an analogue of the mental lexicon of a listener. In Russian, word forms that belong to one lexeme can differ significantly in form, accentual pattern, and frequency of usage. Therefore, word forms (not lexemes or lemmas) were regarded as entries of the perceptual lexicon. The results were fairly encouraging: the average percentage of aborted segmentation was only 1.4%; in 0.36% of cases the segmentation was grammatically invalid (Kassevich et al., 2000, p. 54). A similar percentage of correct answers was obtained when the model was given the same texts in their “ideal phonetic transcription” (which was the result of an automatic transcription of written texts and imitated the ideal pronunciation without any omissions of sounds), although the mistakes were not always the same (Vencov, Kasevič & Jagunova, 2003). However, in our everyday communication, we must deal with a speech signal that is normally fairly different from ideal enunciation (Ščerba, 1957/1915, pp. 22-23).
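The segmentation task described here, recovering word boundaries from a text with all blank spaces deleted by consulting a list of word forms, can be illustrated with a minimal Python sketch. It is not the algorithm of Kassevich et al. (2000): the longest-match preference with backtracking is a simplifying assumption, word frequencies are ignored, and the function name segment and the toy lexicon are hypothetical. A return value of None stands for an aborted segmentation.

```python
from functools import lru_cache


def segment(text, lexicon):
    """Split `text` (no spaces) into word forms from `lexicon`.

    Returns a list of word forms, or None if no complete segmentation
    exists (an "aborted" segmentation in the sense used above).
    """
    max_len = max((len(form) for form in lexicon), default=0)

    @lru_cache(maxsize=None)
    def parse(start):
        if start == len(text):
            return ()
        # Try the longest candidate first, then shorter ones (backtracking).
        for end in range(min(len(text), start + max_len), start, -1):
            candidate = text[start:end]
            if candidate in lexicon:
                rest = parse(end)
                if rest is not None:
                    return (candidate,) + rest
        return None

    result = parse(0)
    return list(result) if result is not None else None


if __name__ == "__main__":
    # Toy lexicon of transliterated word forms (illustrative only).
    toy_lexicon = {"mama", "myla", "ramu", "my", "la"}
    print(segment("mamamylaramu", toy_lexicon))
```

Because the entries are word forms rather than lemmas, every inflected form must be listed separately, which is the reason given in the text for treating word forms as the entries of the perceptual lexicon.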
Thus, any model that claims to imitate the process of natural spoken word recognition should be evaluated using a spontaneous speech signal. The crucial question to answer while developing the functional model of spontaneous speech recognition is: what helps a listener to identify a word form in the context if it is phonetically reduced? That is, how does a listener associate each part of the speech signal with the only possible entry of the mental lexicon?

In this paper, we share our experience in solving two main methodological problems that every researcher must encounter while attempting to answer this question. The first of these concerns the research material, whereas the second regards the experimental methods to be used.

2 Material for spoken word recognition research

The research group of M. Ernestus was probably among the first to create a corpus of spontaneous speech (Ernestus, 2000), and to use it as material for psycholinguistic experiments on spoken word recognition (Ernestus, Baayen & Schreuder, 2002; Kemps, Ernestus, Schreuder & Baayen, 2004, etc.). Since that time, quite a number of spoken corpora have appeared (Furui, 2003; Corpus Gesproken Nederlands, 2004; Grønnum, 2009, etc.). However, not all of these are applicable to spoken word recognition research. The corpora of spoken Russian, being quite diverse in both size and material¹, have only orthographic annotation (sometimes together with pause markers), and thus cannot be used as material for spoken word recognition modeling. We are developing our own corpus of spontaneous Russian that includes orthographic and acoustic-phonetic transcriptions.
The existing means of automatic transcription are not satisfactory, and the results of their application require manual verification and correction (see, for example, the description of the phonetic transcription in Corpus Gesproken Nederlands, version 1, broad phonetic transcription section). Thus, the acoustic-phonetic transcription in our corpus was performed manually. We began by dividing each speech signal into interpausal intervals (as minimal pronunciation units), and provided these intervals with an orthographic annotation. We then proceeded to the acoustic-phonetic transcription, which was performed by trained phoneticians who listened to asemantic, syllable-long fragments, so that morphological and lexical information did not influence the results of the analysis. Auditory analysis was supported by the visual representation of dynamic spectrograms. Dispersion in the experts’ decisions concerning vowel identification can be significant, while their decisions about consonants mostly coincide. A catalogue of typical templates of formant frequencies and trajectories for vowels of different quality was created as a reference for further transcription. It contained spectrograms and the results of their identification by listeners. This reference is required to facilitate identification in ambiguous situations. However, although the transcription method used minimized the influence of the experts’ lexico-grammatical knowledge, the “orthography” still affects their decisions, since they know how the sequence should be pronounced.

¹ Consider the ORD speech corpus of Russian everyday communication “One Speaker’s Day” (Asinovskij et al., 2009); the Multimedia Subcorpus of the Russian National Corpus (Grišina, 2011), and the corpora of the project “Rasskazy o snovidenijach i drugie korpusa zvučaščej reči” (n.d.) as examples.
The set of symbols used for the transcription is partially similar to the X-SAMPA computer-readable description of phonetic features of a speech signal and consists of Roman alphabet characters with a minimal use of upper case, that is, capital letters. We assumed that the perceptual system of a listener is primarily based on the phonological system of the language he or she speaks and listens to. However, we differentiated vowels that were between hard consonants, after or between soft consonants, and before soft consonants, because, as Bondarko, Verbickaja and Gordina (2000, pp. 95-96) and Bogdanova (2001, p. 77) showed, a listener considers these types of vowels to be different in Russian. For strongly reduced vowels that cannot reliably be identified we used the schwa ([ə]).

At present, our corpus of spontaneous Russian texts provided with an orthographic and acoustic-phonetic annotation includes 115 minutes of radio interviews and TV talk shows (an orthographic and acoustic-phonetic transcription of these recordings is available at http://www.narusco.ru/). However, even such a limited amount of spontaneous texts with a detailed annotation provides us with new facts that are relevant to spoken word recognition modeling.

1. Pauses can occur almost anywhere in a phrase and easily break semantic-syntactic units (clauses). One semantic-syntactic unit can even be broken by several pauses (up to seven in our material). In one of the texts from our corpus (27 min 12 sec, 3,037 items), 30% of all clauses were broken by “irregular” pauses (Nigmatulina, Pal’čikov & Riechakajnen, 2014, pp. 36, 40). However, it was found that a significant number of clauses are spoken without any pauses in between.
These two phenomena show that the internal borders on different levels of spontaneous text programming do not necessarily coincide, and thus make us think of how a listener addresses such an ambiguity in the process of spoken word recognition, and, more precisely, of the role that prosodic cues play in the process.

2. According to the acoustic-phonetic transcription, vowels can undergo all kinds of phonetic reduction (both qualitative and quantitative) far more often than consonants. This fact makes us suppose that consonants are more important perceptive cues than vowels. Although the idea of consonants being more stable than vowels in spontaneous Russian has previously been discussed (Zemskaja, 1973, pp. 40-41; Svetozarova, 1988, pp. 55-56, etc.), our data provide new examples of reduced vowels, which show that all stressed vowels can be reduced, as well as those that are unstressed (see figure 1).

Figure 1: The spectrogram of the word form ljublju “love 1SG”, pronounced as [lʲub’lʲi]

3. A frequency word list of all word forms from our corpus was automatically created, on the basis of both orthographic and acoustic-phonetic descriptions. Every entry in the list is a combination of both descriptions of a word form, along with the number of occurrences of such a pronunciation in the corpus. It allowed us to identify the number of variants each word form can have, and to estimate the frequency of every realization. The data analysis showed that up to 20% of all the word forms were subject to quantitative and qualitative reduction, that is, had at least one omitted or substituted sound. Some word forms had up to 10-15 different realizations, although it was only possible to estimate the frequency of realizations for high-frequency word forms, due to a limited number of occurrences of each low-frequency word form in the material. The word list also allowed us to find linguistic factors that presumably influence word reduction (such as number of syllables and part of speech) (Riechakajnen, 2013).

The resulting frequency word list can be used for the modeling of automatic speech recognition systems. It also provided us with the material for further experimental study aimed at resolving the controversial question as to whether the mental lexicon of a listener includes all reduced variants he or she has ever heard, or only the canonical realizations (see, for example, Aleksandrov and Gejl’man (1986) and Ernestus et al. (2002) for the discussion of this question).

Assumptions regarding the process of spoken word recognition based on a detailed analysis of spontaneous signals must then be verified in psycholinguistic experiments.
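The structure of such a frequency word list, in which each entry pairs an orthographic word form with one acoustic-phonetic realization and its number of occurrences, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the toy transcriptions are invented, and counting word forms with more than one attested realization is a simplification that is not the same as the authors' criterion of at least one omitted or substituted sound.

```python
from collections import Counter, defaultdict


def variant_table(tokens):
    """Map each orthographic word form to a Counter of its realizations.

    `tokens` is an iterable of (orthographic, phonetic) pairs, one pair
    per occurrence in the corpus.
    """
    table = defaultdict(Counter)
    for orthographic, phonetic in tokens:
        table[orthographic][phonetic] += 1
    return table


def realization_frequencies(table, word_form):
    """Relative frequency of every attested realization of one word form."""
    counts = table.get(word_form, Counter())
    total = sum(counts.values())
    return {phonetic: n / total for phonetic, n in counts.items()}


def share_with_variation(table):
    """Share of word forms attested with more than one realization."""
    varied = sum(1 for counts in table.values() if len(counts) > 1)
    return varied / len(table)


if __name__ == "__main__":
    # Invented toy data; the transcriptions are placeholders.
    toy_tokens = [
        ("nravjatsja", "nravitsa"),
        ("nravjatsja", "nrets"),
        ("nravjatsja", "nrets"),
        ("mne", "mne"),
        ("očen'", "otin"),
    ]
    table = variant_table(toy_tokens)
    print(len(table["nravjatsja"]))
    print(realization_frequencies(table, "nravjatsja"))
    print(share_with_variation(table))
```

With so few occurrences per word form, the relative frequencies in such a table are reliable only for frequent items, which matches the limitation noted for low-frequency word forms.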
3 Experimental study of spoken word recognition: challenges and solutions

The psycholinguistics of everyday speech is associated with its own particular challenges for designing a scientifically rigorous experimental model. As Vencov and Kasevič (2003) showed, the results of classical psychoacoustic experiments, in which a test set normally consists of a restricted list of similar stimuli and participants are asked to choose one variant from several given responses, cannot be used in spoken word recognition modeling. It is evident that such a task is not similar to the one a listener must undertake while perceiving spontaneous speech. Therefore, the algorithm a participant may use in an experiment can drastically differ from the general mechanism of spoken word recognition.

At present, a variety of experimental paradigms are used to study spoken word recognition, from traditional offline tasks, such as dictation tasks (Taft, 1984; Ernestus et al., 2002), visual cloze tests (Janse & Ernestus, 2011), or shadowing (Brouwer, Mitterer & Huettig, 2010; Mitterer & Ernestus, 2008), to online methods where the priming effect is measured (e.g., the auditory lexical decision experiments described by van de Ven, Tucker & Ernestus 2011). In recent research aimed at studying the process of reduced word form recognition and the role of the context within this, there has also been a tendency to use the visual world paradigm (Brouwer, Mitterer & Huettig, 2012, 2013; Viebahn, Ernestus & McQueen, 2015). An example is the research conducted by Brouwer et al. (2012), in which the authors presented study participants with reduced word forms via headphones and asked them to decide what word form they had heard by selecting it from four possible items shown on a screen.
The response options were a target word (written in orthography), a word form that sounded similar, but not identical, to the canonical pronunciation of the target word (canonical form competitor), a word form that sounded similar to the reduced pronunciation of the target word (reduced form competitor), and a distractor. In accordance with the visual world paradigm, a computer-aided eye-tracker registered the participant’s eye movements, along with the time of fixation on every given word. The fixation times on the canonical form competitor and the reduced form competitor were similar to the fixation time on the distractor (Brouwer et al., 2012, p. 554), meaning that a listener regarded all three variants as distractors. The method thus revealed the algorithm that the participants used while choosing one of four given answers, whereas in everyday communication listeners have far more competitors from which to choose. Therefore, although eye-tracking is generally fairly useful for studying written word recognition, that is, the process of reading, a researcher should be very careful when discussing what new information the obtained results can provide about the process of natural speech recognition.

Dictation task experiment

The method we used most frequently in our research was a dictation task, in which a listener hears a short list of stimuli, none of which are repeated, and writes down or repeats what he/she has just heard. Although this is an offline task, and therefore provides information only about the result and not the process of recognition, such a simple procedure means that the experimental conditions are fairly close to the natural communication situation, in which a listener is not provided with any variants from which to choose.
The experiments we performed using the dictation task method, and the methods introduced further below, were described in detail by Riechakajnen (2010), Nigmatulina and Riechakajnen (2011), Raeva (2012), Apuškina, Vencov and Slepokurova (2014), and Raeva and Riechakajnen (2015). In this paper, we provide an overview of these, focusing on the methodological issues and the implications of the obtained results for the functional model of spoken word recognition.

In a dictation task we use either semantic or asemantic stimuli, depending on the goal of the experiment. The study of the recognition of Russian reduced word forms is an example of the application of this experimental procedure to semantic stimuli. The research included several experiments that differed in stimuli but were designed in one and the same way, that is, using the dictation task paradigm. In the first experiments, 82 participants were asked to listen to 24 isolated reduced word forms extracted from spontaneous dialogues in Russian, as well as to the same word forms in their phrase context, and to write down what they heard (for example: mne očen’ nravjatsja [nrɛts] situacii “I very much enjoy the situations”). The results are similar to those obtained for Dutch (Ernestus et al., 2002) and show that listeners cannot identify isolated reduced word forms, whereas context facilitates the recognition of a reduced word form. On the basis of these results, we can assume that only canonical realizations of word forms are stored in the mental lexicon of a listener, whereas reduced realizations must be reconstructed to the canonical realization using contextual information. However, further experimental research demonstrated that some reduced realizations can be stored in the mental lexicon of a listener.
If a word form has a typical reduced realization that is used more often than all other variants of this word form (including the canonical variant), then it is accurately recognized by listeners, even if presented in isolation. The most vivid example in Russian spontaneous speech is the realization [ɕæs] of the word form sejčas “now”, which was correctly recognized by all participants in the dictation task experiment.

In order to further study the role of the context and to check whether one and the same reduced word form can be interpreted differently depending on the context, we instrumentally transferred the reduced word forms from the first experiment into new contexts (fragments of speech of the same speaker). The results showed that the contextual information was crucial for the interpretation of a reduced word form. For example, the reduced form [p r os j t] from the original context, prosjat [p r os j t] prislat’ fotografii “ask 3PL.PRS to send photographs”, was interpreted as pust’ “let (he/she/it)” by 57.7% of the participants in the context [p r os j t] živët, where the last word means “live 3SG”. The answer prosjat was not given by any participant, as such an interpretation of the reduced word form is impossible in the context [p r os j t] živët.

Even though context plays the crucial role in spoken word recognition, the value of phonetic information in the process still needs to be estimated. The analysis of the answers we received to the isolated reduced word forms in the first experiment allowed us to provide additional evidence that consonants are perceptually more important than vowels. The participants tended to misperceive the number of syllables, as well as the quality of vowels, more often than the number and quality of consonants. The initial stop consonants were better perceived than all other elements of the word forms. However, there was one important limitation to this aspect of the research: the study of the prelexical interpretation of specific acoustic features (such as consonants, vowels, and number of syllables) is acceptable only for asemantic answers, which show that a participant did not recognize a word form and so attempted to write down what had been heard as accurately as possible. Otherwise, the full word provided may merely be a lexical association with the stimulus and not a transcription.

Figure 2: The stimulus SO S TI from the context so stichov “from poems”: a dynamic spectrogram and the results of the identification of the vowels

The results of the dictation task experiment on asemantic stimuli were presented by Apuškina et al. (2014), where every sequence of unstressed syllables was provided with a dynamic spectrogram and the results of the identification of the vowels in the string by listeners (see figure 2). Such results show whether similar acoustic features are identically interpreted by listeners and, if not, allow us to propose hypotheses on the reasons for the differences in interpretation.

Asemantic stimuli have both advantages and disadvantages. They allow us to avoid the influence of an individual’s lexical background, and to compare the estimations we obtain in an experiment to the results of instrumental analysis.
However, asemantic fragments are usually rather short and cannot be adequately interpreted if presented only once; therefore, individuals must listen to short stimuli two or three times consecutively, or the fragments must be instrumentally lengthened. Moreover, listeners normally attempt to find meaning in every speech fragment that they hear (Kasevič, 2006/1988, p. 593), and participants in an experiment can do the same, even if they know that the stimuli are meant to be asemantic.

Limitations of the dictation task experiment

Although it is used to address a variety of research tasks, the dictation task, like every other experimental procedure, cannot be the only method applied in studying spoken word recognition. Our study of the recognition of Russian homophones can serve as an illustration of this point. The aim of the experiment was to identify what influences a listener’s choice of interpretation of a homophone when the context cannot disambiguate the homophonic speech signal. Both lexical and grammatical Russian homophones were used in the experiment: 1) nouns that end in voiceless consonants in the nominative singular: [plot] (plod “fruit” - plot “raft”), 2) verb forms (present tense, third person singular and plural) with unstressed endings that are pronounced identically in spontaneous speech: [΄voz ji t] (vozit “drive 3SG.PRS” - vozjat “drive 3PL.PRS”), 3) verb forms (past tense, singular, neuter and feminine) with an unstressed ending [ə]: [΄stoilə] “cost” (stoilo (n.) - stoila (fem.)), and 4) word forms that may be interpreted as adverbs and neuter or feminine forms of short adjectives: [u’ʒasnə] (užasno “terribly”, “terrible N” - užasna “terrible F”).

Two versions of the experiment were used. In the first of these, 50 participants performed a classical dictation task in which they were asked to listen to isolated words and to write down what they had heard.
The results generally confirmed the hypothesis and showed that the frequency of a word form is the crucial factor when the context cannot disambiguate a homophone. However, this does not guarantee that each participant wrote what he/she intended. Orthographic mistakes or lapses can be quite frequent in a dictation task and cannot be controlled for, as participants are not asked to provide any context for the answers they give. Moreover, such a simple and straightforward task can lead a participant to adopt a single strategy for its completion (e.g., to write, for all stimuli, the variant whose spelling is closer to the pronunciation: plot and not plod for [plot], etc.), a strategy that reveals nothing about the behavior of a listener in natural speech. Therefore, in the second version of the experiment, 35 participants were asked to create and write down a phrase using the given word and so clarify its interpretation. This method is more likely to prevent participants from merely choosing the variant that most closely matches the pronunciation.

The results of the experiment are also an example of the specific contribution that Russian can make to research into the interpretation of ambiguous speech signals. As Russian has both lexical and grammatical homophones, we can compare the strategies listeners use in interpreting lexically and grammatically ambiguous speech signals. Although word-form frequency is the crucial factor in choosing the interpretation of a lexical homophone, for grammatical homophones the overall frequency of the same grammatical forms in the corpus (the so-called “type frequency”) has emerged as being more important. Moreover, the experiment once again showed that word-form frequency is more important than lexeme (lemma) frequency. For example, 88.6% of participants interpreted [rot] as rot “mouth”.
The frequency of this word form was higher than that of rod “race, gender” (also [rot]), whereas the comparison of lexeme frequencies showed the opposite tendency.

How to study the role of the context in spoken word recognition

The dictation task experiments provide evidence for the importance of context in the recognition of reduced word forms in spontaneous speech. However, they do not show how and when it is involved in the process. To answer these questions, new methods are required. This part of our research is now in progress, and we are attempting to incorporate both new experimental and modeling methods.

Reduction is often believed to affect highly predictable word forms (Jurafsky, Bell, Gregory & Raymond, 2000). In order to check this assumption we conducted a cloze test experiment. The participants were given 20 fragments of spontaneous Russian dialogues in orthographic transcription with a gap at the position of a reduced word form, and were asked to fill in the gaps (for example: ja choču ________ odnu očen’ vešč’ prostuju “I want __________ one very simple thing”). The results obtained from 20 participants do not support the hypothesis. The context allowed the participants to correctly determine the grammatical features of the word form to be used (at least its part of speech) and the semantic field to which the content words belong. However, the answers themselves were fairly diverse; between three and 14 different word forms were given as answers to each of the stimuli used in the experiment. As reduced word forms do not always appear in the most predictable contexts, the mechanism by which the context exerts its influence must be more complex and may differ for different reduced word forms. Therefore, we decided to model the influence of the context for every reduced word form in our material, analyzing the interpausal interval in which it is used.
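The tally behind a cloze test of this kind (how many different word forms were offered for a gap, and how often the word actually spoken was among them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the study: the function name `cloze_summary` is hypothetical and the sample answers are invented.

```python
from collections import Counter


def cloze_summary(answers, target):
    """Summarize the answers given for one gap in a cloze test.

    answers: word forms written by the participants (hypothetical data).
    target:  the word form that was actually spoken at the gap position.
    Returns the number of distinct answers and the cloze probability of the
    target, i.e. the share of participants who supplied exactly that form.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    # Ignore case and surrounding whitespace when comparing answers.
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    cloze_probability = counts[target.strip().lower()] / len(normalized)
    return {"distinct_answers": len(counts), "cloze_probability": cloze_probability}


# Invented answers for one gap; they are NOT data from the study.
sample_answers = ["skazat'", "skazat'", "sprosit'", "rasskazat'", "Skazat'"]
summary = cloze_summary(sample_answers, target="skazat'")
print(summary)
```

A low cloze probability combined with a large number of distinct answers corresponds to the situation reported in the text, in which the context constrains the part of speech and the semantic field but not the specific word form.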
Due to free word order in Russian, the role of intonation should be considered in the process of natural speech recognition. We regard intonation as an essential part of the context. As already mentioned in section 2, pauses in spontaneous texts can occur almost anywhere in the phrase, so that semantic-syntactic units are “broken” by pauses. The context thus proves to be organized differently on the semantic-syntactic and intonation levels, which makes it essential to know how a listener uses suprasegmental information while recognizing spontaneous speech. In such experiments, it is more important to obtain a listener’s subjective opinion than an interpretation of the lexical structure of a stimulus. Therefore, the task should be either to judge whether or not a stimulus sounds natural, or to describe a possible wider discourse context (or even the discourse situation) for a given stimulus.

A total of 42 individuals participated in an experiment that aimed to check whether the intonation contour unites semantic-syntactic units broken by pauses in spontaneous speech. To create the stimuli for the experiment, we instrumentally deleted pauses that occurred inside 12 semantic-syntactic units (e.g., učitel’ dolžen (pause) imet’ (pause) dochod “a teacher should (pause) have (pause) an income”). The participants were asked to indicate whether or not each phrase sounded natural, marking their answers with “+” for natural, “-” for unnatural, and “0” if they could not choose between the first two answers. The majority of the participants considered that all but one of the stimuli sounded natural, although the dominance of the “+” answer was statistically significant² for only four stimuli out of 12. Therefore, not all the semantic-syntactic units broken by pauses had a smooth pitch contour.
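One way to check whether the “+” answer dominates for a given stimulus is to compute a confidence interval for the proportion of “+” answers and see whether it lies entirely above 0.5. The sketch below uses the Wilson score interval; this choice is an assumption made for illustration, since the text states only that confidence intervals were used, and the counts are invented.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def plus_dominates(plus_count, n, z=1.96):
    """True if the whole interval for the share of "+" answers exceeds 0.5."""
    lower, _ = wilson_interval(plus_count, n, z)
    return lower > 0.5


# Hypothetical counts of "+" answers out of 42 participants.
print(plus_dominates(30, 42))
print(plus_dominates(22, 42))
```

With 42 participants, a simple majority of “+” answers is not enough for the interval to clear 0.5, which illustrates how most stimuli can be judged natural by a majority while the dominance is significant for only a few of them.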
Thus, the intonation pattern did not always appear to provide a listener with information as to whether a fragment between pauses is complete or is part of a bigger semantic-syntactic unit. Semantic and syntactic contexts therefore prove to be more important factors, and further research in the field should focus on these aspects of contextual information.

² To compare the results, the method of confidence intervals was used.

Conclusions

Spontaneous speech corpora must be created and enriched in order to provide adequate material for spoken word recognition modeling. The annotation of texts in such corpora must include a detailed acoustic-phonetic transcription. Our corpus reveals a number of phonetic phenomena that appear to be important for spoken word recognition modeling.

Researchers should not look for a single method of spoken word recognition research. Psycholinguists must select (or design) a method every time they begin a new study, following some basic methodological principles. While planning an experiment, a researcher should:
- hypothesize about the ways in which a participant may behave in the experimental conditions and whether his/her behavior will match natural communication conditions,
- use recordings of spontaneous speech as experimental material,
- choose or develop methods that yield an experimental setting matching that of natural communication as closely as possible,
- use different methods depending on the goal of the experiment, verify these in pilot experiments, and conduct a comparative analysis of the results obtained under different experimental conditions.

As the evidence on spoken word recognition from the Slavic languages is not extensive at present, we consider that our research method contributes to the psycholinguistic study of these languages.
It also contributes to the design of general psycholinguistic research, as the methodological problems we uncovered while studying spontaneous Russian do not appear to be language-specific and should thus be considered in any psycholinguistic research based on spoken word material.

Acknowledgements

The work has been supported by two research grants from the Russian Foundation for Basic Research: 09-06-00244 (2009-2011) and 13-06-00374 (2013-2015). We are very grateful to an anonymous peer-reviewer and to the editors for constructive comments on an earlier version of the article.

References

Aleksandrov, L., & Gejl’man, N. (1986). Nužno li učit’ fonetike častych slov? In Sluch i reč’ v norme i patologii (pp. 20-26). Leningrad: Leningradskij vosstanovitel’nyj centr VOG.
Apuškina, I., Vencov, A., & Slepokurova, N. (2014). Albom dinamičeskich spektrogramm bezudarnych dvuslogov, vydelennych iz spontannoj reči, i rezultatov identifikacii nositeljami russkogo jazyka fonemnogo kačestva glasnych v ich sostave. Retrieved from http://www.narusco.ru/ALBUM01.
Asinovskij, A., Bogdanova, N., Rusakova, M., Ryko, A., Stepanova, S., & Šerstinova, T. (2009). The ORD speech corpus of Russian everyday communication “One Speaker’s Day”: creation principles and annotation. In V. Matoušek & P. Mautner (Eds.), Text, Speech and Dialogue (Lecture Notes in Computer Science 5729) (pp. 250-257). Berlin, Heidelberg: Springer.
Bogdanova, N. (2001). Živye fonetičeskie processy russkoj reči. St. Petersburg: Filologičeskij fakultet SPbGU.
Bondarko, L., Verbickaja, L., & Gordina, M. (2000). Osnovy obščej fonetiki (4th ed.). St. Petersburg: Filologičeskij fakul’tet SPbGU.
Brouwer, S., Mitterer, H., & Huettig, F. (2013). Discourse context and the recognition of reduced and canonical spoken words. Applied Psycholinguistics, 34, 519-539.
Brouwer, S., Mitterer, H., & Huettig, F. (2010). Shadowing reduced speech and alignment.
Journal of the Acoustical Society of America, 128, EL32-EL37.
Brouwer, S., Mitterer, H., & Huettig, F. (2012). Speech reductions change the dynamics in competition during spoken word recognition. Language and Cognitive Processes, 27(4), 539-571.
Corpus Gesproken Nederlands. (2004). Retrieved September 30, 2015 from http://lands.let.ru.nl/cgn/doc_English/topics/version_1.0/annot/phonetics/info.htm.
Ernestus, M. (2000). Voice Assimilation and Segment Reduction in Casual Dutch: A Corpus-Based Study of the Phonology-Phonetics Interface. Utrecht: Landelijke Onderzoekschool Taalwetenschap.
Ernestus, M., Baayen, H., & Schreuder, R. (2002). The Recognition of Reduced Word Forms. Brain and Language, 81(1-3), 162-173.
Furui, S. (2003). Recent advances in spontaneous speech recognition and understanding. In Proceedings of ISCA and IEEE Workshop on Spontaneous Speech Processing and Recognition. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106.7305&rep=rep1&type=pdf
Grišina, E. (2011). Multimedijnyj russkij korpus: sovremennoe sostojanie i perspektivy razvitija. In Proceedings of the International Conference “Corpus Linguistics-2011” (27-29 June 2011) (pp. 138-144). St. Petersburg: St. Petersburg State University.
Grønnum, N. (2009). DanPASS - A Danish phonetically annotated spontaneous speech corpus. Speech Communication, 51, 594-603.
Janse, E., & Ernestus, M. (2011). The roles of bottom-up and top-down information in the recognition of reduced speech: Evidence from listeners with normal and impaired hearing. Journal of Phonetics, 39, 330-343.
Jurafsky, D., Bell, A., Gregory, M., & Raymond, W. (2000). Probabilistic Relations between Words: Evidence from Reduction in Lexical Production. In J. Bybee & P. Hopper (Eds.), Frequency and the Emergence of Linguistic Structure (pp. 229-254). Philadelphia PA: John Benjamins.
Kasevič, V. (2006/1988). Semantika. Sintaksis. Morfologija.
In Trudy po jazykoznaniju (Vol. 1) (pp. 373-612). St. Petersburg: Filologičeskij fakultet SPbGU.
Kassevich, V., Ventsov, A., & Yagounova, E. (2000). The simulation of continuous text perceptual segmentation: A model for automatic segmentation of written text. Language and Language Behavior, 3(2), 48-59.
Kemps, R., Ernestus, M., Schreuder, R., & Baayen, H. (2004). Processing Reduced Word Forms: The Suffix Restoration Effect. Brain and Language, 90, 117-127.
Luce, P., & McLennan, C. (2005). Spoken Word Recognition: The Challenge of Variation. In D. B. Pisoni & R. E. Remez (Eds.), The handbook of speech perception (pp. 592-609). Berlin, Oxford: Blackwell Publishing Ltd.
Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition. Cognition, 25(1), 71-102.
Mitterer, H., & Ernestus, M. (2008). The link between speech perception and production is phonological and abstract: Evidence from the shadowing task. Cognition, 109, 168-173.
Nigmatulina, Ju., Palčikov, G., & Riechakajnen, E. (2014). Pauzacija i sintaksičeskie svjazi v russkoj spontannoj reči. In Problemy poroždenija i vosprijatija reči: materialy XII vyezdnoj školy-seminara (29-30 nojabrja 2013 g., Čerepovec) (pp. 35-40). Čerepovec: ChGU.
Nigmatulina, Ju., & Riechakajnen, E. (2011). Segmentacija spontannoj reči: vosprijatie staženij glasnych na styke slovoform. In E. Erofeeva (Ed.), Problemy socio- i psicholingvistiki (Vol. 15) (pp. 31-38). Retrieved from http://splr.psu.ru/
Raeva, O. (2012). Strategija raspoznavanija reducirovannych variantov vysokočastotnych slovoform. In Problemy jazyka: Sbornik statej po materialam Pervoj konferencii-školy “Problemy jazyka: vzgljad molodych učënych” (pp. 241-252). Moscow: Institut jazykoznanija RAN.
Raeva, O., & Riechakajnen, E. (2015). Prostranstvo russkogo spontannogo teksta s točki zrenija slušajuščego. Socio- i psicholingvističeskije issledovanija, 3, 67-70.
Rasskazy o snovidenijach i drugie korpusa zvučaščej reči. (n.d.).
Retrieved September 30, 2015 from http://spokencorpora.ru.
Riechakajnen, E. (2010). Vzaimodejstvie kontekstnoj predskazuemosti i častotnosti v processe vosprijatija spontannoj reči (na materiale russkogo jazyka). [Unpublished PhD thesis]. St. Petersburg: St. Petersburg State University.
Riekhakaynen, E. (2013). Reduction in spontaneous speech: How to survive. In J. Heegård & P. J. Henrichsen (Eds.), New perspectives on speech in action. Proceedings of the 2nd SJUSK conference on contemporary speech habits (pp. 153-166). Frederiksberg: Samfundslitteratur Press.
Svetozarova, N. (Ed.). (1988). Fonetika spontannoj reči. Leningrad: Leningrad State University.
Ščerba, L. (1957/1915). O raznych stiljach proiznošenija i ob ideal’nom fonetičeskom sostave slov. In Izbrannyje raboty po russkomu jazyku (pp. 21-26). Moscow: Učpedgiz.
Taft, M. (1984). Exploring the Mental Lexicon. Australian Journal of Psychology, 36, 35-46.
Van de Ven, M., Tucker, B., & Ernestus, M. (2011). Semantic context effects in the comprehension of reduced pronunciation variants. Memory & Cognition, 39, 1301-1316.
Vencov, A., & Kasevič, V. (2003). Problemy vosprijatija reči (2nd ed.). Moscow: Editorial URSS.
Vencov, A., Kasevič, V., & Jagunova, E. (2003). Orfografičeskij tekst i transkripcija s točki zrenija vosprijatija reči. In Materialy XXXI mežvuzovskoj naučno-metodičeskoj konferencii prepodavatelej i aspirantov (11-16 marta 2002 g.), Vypusk 22, Sekcija obščego jazykoznanija, č 2 (pp. 3-6). St. Petersburg: Publishing house of St. Petersburg State University.
Viebahn, M., Ernestus, M., & McQueen, J. (2015). Syntactic Predictability in the Recognition of Carefully and Casually Produced Speech. Journal of Experimental Psychology-Learning Memory and Cognition, 41, 1684-1702.
Zemskaja, E. (Ed.). (1973). Russkaja razgovornaja reč’. Moscow: Nauka.

Are Schalter and šapka good competitors?
Searching for stimuli for an investigation of the Russian-German bilingual mental lexicon

Christina Clasmeier, Tanja Anstatt, Jessica Ernst & Eva Belke

Abstract: Journal articles on language processing rarely comment on the difficulties and obstacles involved in the construction of materials for experimental investigations. This is remarkable, since the compilation of appropriate (i.e., valid and well-controlled) linguistic stimuli is one of the biggest challenges in psycholinguistic research. This is particularly true for languages other than English. This paper addresses a number of methodological issues that arose during the preparation of material for an eye-tracking investigation of the Russian-German bilingual mental lexicon. As we intended to study intra- and interlingual co-activation, quadruples of two Russian and two German words with a phonological overlap in the onset were needed. Thus, we discuss the concept of phonological overlap and consider the phonetic and phonological differences between Russian and German. Furthermore, word frequency is known to be a crucial factor in language processing. However, several problems are associated with the concept of frequency in general and with comparing frequency data from two languages. We broach this issue in the final section of this paper, presenting and discussing frequency data of different types. Lastly, we discuss how we ensured that the pictures we used were reliably associated with the intended object names in each language, which is a critical precondition for materials intended to be used in the visual-world paradigm.

1 Introduction

Psycholinguistics is a genuinely empirical discipline, involving the examination of language from the perspective of its users. Key research questions in the field concern how language is represented in the brain and which access procedures are in place to retrieve these representations when encoding and decoding utterances.
The choice of linguistic material is critical to the validity of psycholinguistic research, and the success of a study is highly dependent on the quality of the linguistic items that participants are confronted with. Fortunately, most authors provide a list of the materials they used in the Appendix, which offers readers insights into how the experimental design was implemented and gives them the opportunity to replicate studies exactly. However, only a minority of studies mention the challenges and difficulties that had to be overcome while compiling these stimuli. The stimuli must meet a multitude of different selection criteria, which often results in a very narrow section of the language being tested in the experiment. On the other hand, most studies wish to draw general conclusions from their results, so the items must be sufficiently representative of the language(s) under investigation.

Stimulus selection is far more straightforward if researchers have the opportunity to use specific databases such as the MRC Psycholinguistic Database (Coltheart, 1981; Wilson, 1988) or Celex2 (Baayen, Piepenbrock, & Gulikers, 1995). Most of these databases only work well with Germanic languages, and they are especially oriented towards the English language.¹ One of the specific problems facing psycholinguistic studies involving Slavic languages is the lack of databases containing metalinguistic information regarding the necessary words, such as subjective frequency, graphemic and phonological neighborhood, syllable structure, etc. However, a major step forward has recently been achieved with the database of 375 action pictures and Russian verbs built by Akinina et al. (2015), as well as the database of Russian norms for different parameters of the colorized version of Snodgrass and Vanderwart’s pictures (Tsaparina, Bonin, & Méot, 2011).
In this article, we address some of the problems we had to deal with during the preparation of a visual-world eye-tracking study that we designed to investigate the mental lexicon of Russian-German bilingual speakers. In the following section, we will first describe this study and its background. After that, we will focus on the material and the specific conditions it has to meet. Word frequency is a crucial parameter in spoken language processing, but is corpus frequency an appropriate dimension for modeling psycholinguistic reality? In the third section, we address this problem as well as the relationship between the corpus frequencies for Russian and German words. Further, we present subjective frequency measurements as an alternative method. A second requirement of the visual-world paradigm is that each of the selected object names² has to match a picture that illustrates the denoted object as unambiguously as possible. We discuss the problems associated with this in section four, and we report the results of a pre-test of the material for the visual-world study. Finally, in implementing the experimental design of our study, we had to select Russian and German object names with phonologically overlapping onsets, and so we discuss the question of how to operationalize phonological overlap given the phonetic and phonological differences between the two languages. The material from our pre-tests is available for further use at the Tromsø Repository of Language and Linguistics (TROLLing: http://hdl.handle.net/10037.1/10281).

¹ The American Speech-Language-Hearing Association provides an impressive selection of links to (predominantly US-American) lexical databases for experiment preparation: http://www.asha.org/research/researcher-tools/databases/
² We use the term object name in the sense of “lexical item”, which is common in psycholinguistics (and different from the linguistic meaning of “proper name”).
2 The long-term objective: An investigation of the Russian-German bilingual lexicon³

Our long-term objective is to investigate how the bilingual mental lexicon is structured and, more specifically, whether there are differences between bilinguals with early vs. late age of onset in their two languages. There is a multitude of fundamentally different definitions associated with the term “bilingual”. These can be broadly divided into usage-based and competence-based definitions. We opt for a usage-based definition because of the obvious difficulties in defining and measuring competence levels within the competence-based framework. For our research, we intend to adapt a well-established definition by Weinreich (1953), which takes bilingualism to be “the practice of alternately using two languages [in everyday life]” (p. 1). We have added “in everyday life” to Weinreich’s (1953) original definition in order to separate bilinguals from foreign language learners.

Bilingualism is often a consequence of migration. This also holds true for the current Russian-German bilingualism, which was caused by four waves of immigration from the Soviet Union or its successor states to Germany during the twentieth century. How the languages of a bilingual influence each other, in which situations one language in particular is used, and to what extent a language can be acquired or preserved depend on a multitude of parameters. An individual’s age at the time of immigration has been established as one of the most important factors influencing bilinguals’ proficiency in their languages (Köpke & Schmid, 2004; Romaine, 1995).
From the biographical perspective, adults who have immigrated to another country are referred to as the first immigrant generation, while children who have immigrated together with their parents or who are born in the new country form the second immigrant generation. The latter group is often referred to as “heritage speakers” (cf. Polinsky & Kagan, 2007). While members of the first immigrant generation often preserve their first language as their dominant language, it is typical for heritage speakers to feel more comfortable communicating in the majority language of the society they live in rather than in their family language. Usually, the environmental language is their dominant one. Moreover, heritage speakers often show phenomena of incomplete acquisition or attrition in L1 (Benmamoun et al., 2013).

³ The project is conducted at the Ruhr-University Bochum (Germany) as part of an interdisciplinary collaboration between the Chair of Slavic Linguistics (Tanja Anstatt, Christina Clasmeier) and the Chair of Psycholinguistics (Eva Belke, Jessica Ernst, and Sebastian Sauppe).

From the perspective of bilingualism research, a speaker’s linguistic biography is important for differentiating individual types of bilinguals. Adolescence is assumed to have a crucial impact on language development, such that people who started to acquire their second language after puberty are treated as late bilinguals, while those who were exposed to a second language during childhood are assigned to the group of early bilinguals (McLaughlin, 1984). Is there a reliable difference in the way the languages are stored in and processed by the bilingual mental lexicon? The objective of our study is to reveal such differences, should they exist, between the two groups of bilinguals.
To this end, we used the visual-world paradigm to examine the degree of co-activation of one language while auditory stimuli in the other language are being processed (cf. Huettig, Rommers, & Meyer, 2011). In this paradigm, participants listen to object names while looking at four or more objects on the screen. These include the target object (i.e., the object that corresponds to the name the participants hear) and one or more objects, referred to henceforth as competitors, whose names are phonologically related to the target name. Lexical co-activation is measured by tracking the proportion of fixations to the target and the competitors during spoken word recognition.

Using the eye-tracking technique with English monolinguals, Allopenna, Magnuson and Tanenhaus (1998) found that participants were more likely to look at an onset-related competitor than at a non-competitor. Following up on this finding, Spivey and Marian (1999) and Marian and Spivey (2003a, 2003b) applied the same paradigm to Russian-English bilinguals, assessing whether co-activation occurs across languages as well. In addition to the within-language competitor condition, they presented their participants with a between-language competitor condition of the type speaker - spički (Russian: “matches”). They replicated the results of Allopenna et al. (1998) regarding within-language competition. More importantly, Spivey and Marian (1999) and Marian and Spivey (2003a) found a significant main effect of between-language relatedness for both languages (i.e., for an English target and a Russian competitor and vice versa). These latter findings indicate that, for balanced bilinguals such as those tested by Marian and Spivey, lexical competition was not restricted to the target language but instead involved the lexicons of both languages. However, the magnitude of the effect seems to depend on various circumstances, e.g., the language mode (Marian & Spivey, 2003b; Grosjean, 2013).
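As a rough illustration of the dependent measure in this paradigm, the proportion of fixations can be computed per time bin from gaze samples that have already been coded for the type of object being looked at. The sketch below is a minimal example under stated assumptions: the sample format, the role labels (`target`, `competitor`, `filler`), the function name, and the 100 ms bin width are hypothetical choices made for illustration and are not taken from the studies cited above.

```python
from collections import defaultdict


def fixation_proportions(samples, bin_ms=100):
    """Return {bin_start_ms: {role: proportion}} for coded gaze samples.

    samples: iterable of (time_ms, role) pairs pooled over trials, where
    role is a hypothetical label such as "target", "competitor" or "filler".
    """
    counts = defaultdict(lambda: defaultdict(int))
    for time_ms, role in samples:
        bin_start = (time_ms // bin_ms) * bin_ms  # left edge of the time bin
        counts[bin_start][role] += 1

    proportions = {}
    for bin_start in sorted(counts):
        total = sum(counts[bin_start].values())
        proportions[bin_start] = {
            role: n / total for role, n in counts[bin_start].items()
        }
    return proportions


# Hypothetical samples: (time since word onset in ms, type of object fixated)
example_samples = [
    (20, "filler"), (60, "competitor"),
    (120, "target"), (150, "competitor"), (180, "target"), (199, "filler"),
]
example_result = fixation_proportions(example_samples)
print(example_result)
```

In such a summary, a competition effect of the kind reported above would appear as a higher proportion of looks to the competitor than to the fillers in the bins following word onset.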
Furthermore, Canseco-Gonzalez et al. (2010) examined English-Spanish early bilinguals and second language learners of Spanish and English within the same paradigm. They observed a small between-language competition effect depending on the age at the time of Spanish acquisition, which indicated that age of acquisition is a crucial factor in cross-linguistic activation. Their experiment, however, was restricted to English targets and within-language (English) and between-language (Spanish) competitors. Co-activation in the other direction (English competitors for Spanish targets) was not taken into account.

In our own study, we investigate whether the findings of Spivey and Marian (1999) and Marian and Spivey (2003a, 2003b) extend to non-balanced bilinguals, and we include both the first language, Russian, and the second language, German. Late bilinguals should exhibit more co-activation of Russian words in the German setting than of German words in the Russian setting, while for early bilinguals it should be the exact opposite. We expect that both groups exhibit stronger within-language competition in the environmental language of their childhood and adolescence. To test these predictions, we created within- and between-language target-competitor quadruples whose phonological forms overlapped at the onset. For instance, in the German within-language target-competitor pair, the target Schalter “light switch” was paired with the competitor Schaffner “conductor”, which shared the initial sounds [ʃa]. Accordingly, the Russian target šapka “woolen hat” was combined with the Russian competitor šaška “saber”. Finally, when pairing the targets and competitors between languages, the targets Schalter and šapka were combined with the competitors šaška and Schaffner, respectively. Note that with quadruples such as this one, competitors can also function as targets and vice versa.
That is, in another variant of the experiment, Schaffner and šaška can function as targets with Schalter and šapka being the competitors. In the following, we want to discuss the challenges we faced when compiling quadruples like these.

3 Characterization of the required linguistic material

It is clear from the example of Schalter - Schaffner - šapka - šaška that it is far from simple to find such quadruples. First, we had to ensure that the quadruples consisted of names of concrete objects, whose pictorial representations would be associated with the desired names. Within- and between-language competition can only be induced if the competing name is evoked by its picture. This implied that the object names and the pictures had to match specific criteria. A few pictures could be retrieved from the corpus of Snodgrass and Vanderwart (1980), but the vast majority had to be drawn especially for our experiment⁴ and so required careful pre-testing.

Regarding the phonetic/phonological forms of the object names, all of the quadruple members would have to be stressed on the first syllable, as this is the predominant stress pattern in German. Furthermore, it was vital that the translation equivalents did not overlap phonologically with the respective object names and were not cognates of them. For example, Welle “wave” would not have been an appropriate German quadruple member because the onset of its Russian translation equivalent volna overlaps with Welle in the first sound [v]. This criterion is crucial because such a phonological overlap between translation equivalents undermines the validity of the eye-tracking measurements.
For instance, if we used a picture labelled Welle/volna as a competitor in the Russian between-language condition along with the target word vetka “sprig”, and if the participants gazed at the picture of Welle/volna, we would not be able to interpret this effect as being due to between-language competition, since the within-language overlap of vetka and volna could also account for the participants’ behavior. Critically, establishing whether cross-language target-competitor pairs are phonologically similar or not required the careful consideration of cross-linguistic phonetic and phonological differences.

⁴ We would like to thank Matthias Brunnert (Bochum) for his excellent drawings and his commitment throughout the whole creative process.

At the beginning of the quadruple collection process, we also tried to take into account the syllable structure (disyllabic words only) and the sound pattern of the words’ onset (consonant + vowel (CV) only). However, it soon became clear that following these restrictions would leave us unable to identify enough words that match the obligatory criteria mentioned above.

Finally, we aimed to match the names included in a quadruple with respect to their relative frequency. It is well known that frequency is one of the most important factors related to the processing of words (cf., e.g., Balota et al., 2004; Brysbaert & New, 2009, p. 977; an overview is given by Prestin, 2003). Broadly speaking, the word frequency effect means that more frequent linguistic units will be processed faster and more accurately. Various authors presume that frequency is represented in the mental lexicon (cf., e.g., Schmitt & Dunham, 1999). Interestingly, Balota et al. (2004) showed that the word frequency effect has a greater impact on word recognition than on word production, a finding that, if anything, increases the relevance of word frequency for our study.
However, there are two critical challenges that one faces when dealing with frequency. First, it is questionable whether the frequency count of a word obtained on the basis of corpora adequately represents the information stored in the mental lexicon. Second, it is extremely difficult to balance the frequency of the members of a quadruple. We will discuss these issues in the following section, before turning our attention to the pre-test.

3.1 Measuring word frequency

The most commonly used method of measuring frequency is to count the number of occurrences of a given linguistic unit in a given body of texts (corpus frequency), which is usually reported in instances per million tokens (ipm). With respect to Russian, Marian and Spivey (2003a, 2003b) relied on the frequency data of two sources, namely Lönngren (1993) and Zasorina (1977), while for English they made use of Zeno et al. (1995). They established the respective corpus frequencies of the Russian stimuli and their English translations, as well as those of the English stimuli and their Russian translations. They did not balance individual quadruples for frequency. Instead, they assessed all four stimulus sets (Russian and English targets and Russian and English competitors) with regard to whether they differed significantly in terms of frequency. Since there were no significant differences, the authors regarded the frequency of their stimuli as balanced. Even though this is an accepted means of demonstrating the similarity between stimulus sets, it cannot be ruled out that the statistical comparisons were non-significant due to a lack of statistical power (there were only ten items per group in Marian and Spivey’s analyses). Even if there was an overall balance of frequencies, there may be strong imbalances within groups and between the members of a target-competitor pair.
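The ipm normalization itself is simple arithmetic, and a small sketch may help to show why estimates from small corpora are coarse. The function names and the raw counts below are hypothetical and serve only as an illustration; the corpus sizes are of the order discussed in this section.

```python
def instances_per_million(occurrences, corpus_tokens):
    """Convert a raw occurrence count into instances per million tokens (ipm)."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return occurrences * 1_000_000 / corpus_tokens


def ipm_step(corpus_tokens):
    """ipm contributed by a single occurrence in a corpus of the given size."""
    return instances_per_million(1, corpus_tokens)


# Hypothetical counts, chosen only to illustrate the arithmetic
print(instances_per_million(32, 1_000_000))  # 32 occurrences in 1 million tokens
print(ipm_step(400_000))                     # one occurrence in 0.4 million tokens
print(ipm_step(100_000_000))                 # one occurrence in 100 million tokens
```

On this reading, a single additional occurrence shifts the estimate by 2.5 ipm in a corpus of 0.4 million tokens but by only 0.01 ipm in a corpus of 100 million tokens, which is one reason why values from small frequency lists are unstable for less frequent words.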
Example (1) below, which is taken from the material used by Marian and Spivey (2003a), presents a target whose Russian and English competitors differ considerably in terms of frequency. The numbers indicate the instances per million tokens. It was clear from the outset that for our study, too, it would be impossible to find target and competitor pairs matched for frequency, since the set of object names available to choose from was restricted in the first place. However, one way to control for this problem is to include frequency as a covariate in the statistical analysis of the results of the eye-tracking experiment. This way, frequency is accounted for individually for each target-competitor pair.

Yet, the situation is made worse by the fact that different frequency lists may provide different results for the same item. Examples (2) and (3), which are again taken from the material used by Marian and Spivey (2003a), illustrate such difficulties. In example (2), the frequency scores obtained from the two sources for Russian differ from each other. Finally, example (3) presents an item whose English and Russian names diverge strongly in terms of frequency.

(1) Russian target busy “necklace”: ipm 6, interlinguistic English competitor book: ipm 301, intralinguistic Russian competitor buben “tambourine”: ipm 3.
(2) Russian target spički “matches”: ipm 32 (Lönngren, 1993) vs. ipm 83 (Zasorina, 1977).
(3) English target boot: ipm 8, Russian translation sapog: ipm 106.

Surely, such discrepant results from various sources of frequency measurements illustrate a fundamental problem, as they touch upon the very concept of frequency and its measurement. After all, the representation of the frequency of a linguistic unit is a mere construct, since the “real” frequency can only be measured theoretically.
All we can count are the occurrences of a certain form in a given corpus of texts, and even this is associated with unresolved problems. One of the most crucial issues is the representativeness of the corpus. Recent analyses have suggested that it depends on two central factors: corpus size and proportion of spoken language (cf. Baayen et al., 2006; Brysbaert & New, 2009). Brysbaert and New (2009) determined a corpus basis of at least 16-30 million tokens as necessary for frequencies that allow reliable predictions of reaction times in lexical decision tasks. For English, large data sources are available; however, the situation is not that easy for Russian (and German as well). When Marian and Spivey (2003a, 2003b) conducted their experiments for Russian, only frequency lists with a very small corpus basis were available (Lönngren (1993) is based on 1 million tokens, while Zasorina (1977) is based on only 0.4 million), and the divergences of the ipm values between both sources (as in the above quoted example spički “matches”: 32 vs. 83 ipm) are due to differences in the textual basis. Recently, the Novyj Častotnyj Slovar’ (New Frequency Dictionary of Russian; Ljashevskaja & Sharov, 2009) has brought about substantial improvements, as it is based on a representative sample of 100 million tokens from the continuously growing Nacional’nyj korpus russkogo jazyka (National Corpus of Russian). However, the included proportion of spoken language is not yet optimal, since it accounts for only about 5% (for more detail, cf. Anstatt & Clasmeier, 2012). For German, we can rely on the dlexDB (Kliegl & Hanneforth, 2011), which operates on a similarly sized corpus of 100 million tokens.

When comparing two languages, a further potential difficulty has to be noted.
The corpus frequencies of a Russian word and its German equivalent diverge substantially in a considerable number of cases, as was noted with respect to the English and Russian stimuli of Marian and Spivey (2003a) (cf. example 3 above). Some examples from our body of stimuli are given in table 1. A whole range of factors could explain the differences. Very often, the obvious reason is a divergence in the semantic structures of the two languages. Thus, Russian maslo does not only mean “butter”, but also “(edible) oil”, as opposed to German Butter (4a). Of course, intercultural differences also play a role: valenok is a culture-specific piece of clothing that is barely known in Germany (cf. example 5a). Moreover, it is striking that in most cases the Russian corpus frequency is higher. This could be due to structural features of the languages. German has a lot of compounds that count as single lemmas, whereas Russian makes more use of other structures such as adjectives and participles. Hence, Regenpfütze “puddle of rain” would be a lemma on its own in German, whereas the Russian equivalent doždevaja luža counts toward luža “puddle” and doždevoj “rain- (adj.)”. Lastly, we cannot exclude the possibility that differing techniques of measuring corpus frequency may have an influence.

We therefore do not have reliable information regarding the representativeness of the corpus frequency with respect to the mental lexicon of the speakers. As opposed to English, for Russian (and for German) there is no database available containing results from previous investigations by means of which the representativeness could be controlled. The problem is compounded by the fact that our participants, as bilinguals, may differ substantially from monolingual speakers, especially in the case of Russian.
Quantitatively, they have less input in Russian, and qualitatively, their Russian input differs, since the range of linguistic registers they have contact with is considerably narrower than that of monolinguals. Lastly, we do not know the extent to which there are cross-linguistic interferences with respect to meaning and frequency.⁵ For all the above reasons, we cannot assume that corpus frequency adequately represents the frequency stored in the mental lexicon of our bilinguals.

⁵ The study by Pavlenko and Malt (2010) shows that slight semantic influences occur even in everyday, familiar concepts such as words for drinking vessels.

Example   Russian word   Russian corpus frequency (ipm)   German word    German corpus frequency (ipm)⁶   Distance of corpus frequencies⁷   Meaning
(4a)      maslo          73.0                             Butter         18.6                             0.42                              “butter”
(5a)      valenok        15.2                             Filzstiefel    0.2                              1.38                              “felt boot”
(6a)      luža           22.2                             Pfütze         2.5                              0.84                              “puddle”

Table 1: Examples of the diverging corpus frequencies of Russian and German equivalents

3.2 Subjective frequency

Another way of obtaining word frequency information is to ask speakers of the respective languages to estimate the frequencies of words, a technique that is known as subjective or estimated frequency (for an overview and discussion of the technique cf. Anstatt & Clasmeier, 2012; for a review of the method cf. Anstatt, in press). It has been applied in many psycholinguistic and psychological studies, and it was shown to be a stable and reliable interpersonal measurement (cf. Ellis, 2002, for an overview) with a high predictive power for results in word processing tasks, normally better in fact than corpus frequency (Balota et al., 2001; Brysbaert & New, 2009; Reid & Marslen-Wilson, 2003).
Recently, Brysbaert and Cortese (2011) showed that corpus frequency is on a par with subjective frequency only in those cases where the former was obtained on the basis of very advanced, large, and well-balanced corpora. Thus, in order to obtain more adequate frequency information, one step in the pre-test involved the collection of data on subjective frequency for our Russian and German material.

⁶ Logarithmized numbers more adequately represent the relation of frequencies; however, the differences in the examples are large, and we refrain from presenting the logarithmized numbers in order to avoid further complicating the explanations.

⁷ For the calculation of the distance, the corpus frequencies were first logarithmized and z-transformed. Then, the absolute values of the differences were calculated. This technique makes the distance comparable to the distance of subjective frequencies (see table 9, section 4.4).

As explained below (cf. section 4), the participants in the Russian part of the pre-tests were late bilinguals. Our assumption was that their data would best represent the bilingual mental lexicon of the participant group we intended to test in the visual-world study, and it would thus give us as realistic a picture of the relevant frequency data as we could hope for.

4 Pre-tests

We carried out pre-tests designed to establish a) the extent to which speakers matched the pictures with their intended names, and b) the subjective frequency of the object names. We tested the stimuli in the language that they were selected from for the eye-tracking experiment as well as their translation equivalents. Thus, besides Schalter “light switch”, the Russian translation vključatel’ was tested as well, even though it would not feature as a target name or competitor name in the experiment. All tasks administered in the pre-tests were paper-and-pencil tasks.
In the following, the pre-test pertaining to the Russian material will be referred to as R for short, while the pre-test pertaining to the German material will be referred to as G. Based on the results of the first pre-test (pre-test 1), some stimuli had to be changed, and they were then pre-tested again (pre-test 2).

4.1 Method

Participants

54 adult Russian-German bilinguals from the first immigrant generation, 15 of them male, took part in pre-test R1, which featured the Russian material. Their average age was 34 years (SD=9.6). In pre-test G1, which featured the German material, 36 adult German native speakers participated, nine of them male, with an average age of 28 years (SD=7.4). For pre-test R2, 18 Russian-German late bilinguals (four male), with an average age of 35 years (SD=10.4) participated, while for pre-test G2, 18 German native speakers (five male) with an average age of 27 years (SD=3.7) were recruited.

Material

We collected 36 quadruples of object names according to the criteria listed above. Furthermore, we added seven non-competitor (filler) objects to each quadruple, the names of which did not exhibit a phonological overlap with their corresponding quadruple names in either Russian or German. Accordingly, 396 pictures of these objects (36 × (4 + 7)) were included in pre-test 1. As there were Russian and German versions of the pre-test, and as each object was tested for its German and its Russian name, 792 object names (396 × 2) featured in pre-tests R1 and G1, while 53 were presented in pre-tests R2 and G2.

Design and Procedure

The participants were asked to perform three tasks. In the naming task, they were confronted with the object pictures and instructed to write down the best Russian (pre-test R) or German (pre-test G) name for each object in the empty column to the right of the picture.
In the picture-word matching task, pictures were presented together with their corresponding names in Russian (pre-test R) or German (pre-test G), respectively, and the participants were asked to rate the match between the picture and its name on a scale ranging from 1 (“the word does not match the picture at all”) to 7 (“the word matches the picture completely”). In the subjective frequency rating task, the object names were presented in the leftmost column of a blank table in Russian (pre-test R) and German (pre-test G), respectively, and the participants were asked to rate the frequency of each word’s appearance in everyday life on a scale ranging from 1 (“I never encounter this word”) to 7 (“I encounter this word constantly”). The assignment of stimuli to the tasks was counterbalanced across participants so that all stimuli were tested in all tasks, although a given participant would only ever see each stimulus once. Because each participant thus contributed to only one of the three tasks for a given stimulus, and because there were two rounds of pre-tests, we have 18 values for each tested unit from pre-test R1 (a third of the 54 participants), 12 values from pre-test G1, and six values each from pre-tests R2 and G2. All instructions were given in the appropriate language of the respective test (i.e., Russian in pre-test R and German in pre-test G).

4.2 Data analysis and results I: The naming task

The participants’ reactions were categorized as follows:

- Expected target name.
- Different but appropriate name that does not pose a problem for our study (this category was predominantly assigned when participants had provided the diminutive form for non-diminutive targets or the other way round, for example, list “leaf” instead of listik “small leaf”, or in the case of juxtaposition, for example, podlodka instead of podvodnaja lodka “submarine”).
- Different and inappropriate name (including hyponyms/hypernyms, for example, vorobej “sparrow” instead of ptica “bird”, and semantically related words, for example, ležak instead of kušetka, both “couch”).
- Unknown name.
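The category scheme above lends itself to a simple tally per picture. The following sketch assumes, in line with the quantitative analysis reported for this task, that the first two categories are collapsed into “named as expected”. The category labels, the function names, and the band labels (modelled on the rows of table 2) are illustrative assumptions and not a description of the actual scoring procedure.

```python
# Hypothetical labels for the four response categories of the naming task
EXPECTED = "expected"
APPROPRIATE = "appropriate"      # e.g. diminutive instead of base form
INAPPROPRIATE = "inappropriate"  # e.g. hyponym, hypernym, related word
UNKNOWN = "unknown"

NAMED_AS_EXPECTED = {EXPECTED, APPROPRIATE}


def percent_named_as_expected(categories):
    """Percentage of responses that fall into the collapsed category."""
    if not categories:
        raise ValueError("at least one response is required")
    hits = sum(1 for category in categories if category in NAMED_AS_EXPECTED)
    return 100.0 * hits / len(categories)


def agreement_band(percent):
    """Assign a percentage to one of three bands (thresholds as in table 2)."""
    if percent >= 90:
        return ">= 90%"
    if percent >= 50:
        return ">= 50% but < 90%"
    return "< 50%"


# 15 of 18 responses as expected (the proportion reported for varezka);
# the category of the remaining three responses is made up for illustration.
responses = [EXPECTED] * 15 + [INAPPROPRIATE] * 3
share = percent_named_as_expected(responses)
print(round(share), agreement_band(share))
```

A picture with this response pattern would fall into the middle band, that is, it would be named as expected by a majority but not by nearly all participants.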
For the quantitative analysis, the first two categories were collapsed to a “named as expected” category and the percentage of responses of this type was assessed. Table 2 shows that in both the German and the Russian naming tasks, most of the items were named as expected by the majority of participants. However, nearly one fifth of all pictures evoked the expected name in less than 50% of cases (cf. the third row of table 2).

Percentage of items named as expected     Russian                 German
                                          Absolute   Percentage   Absolute   Percentage
Items named as expected ≥ 90%             175        44.2%        213        53.8%
Items named as expected ≥ 50% but < 90%   143        36.1%        103        26.0%
Items named as expected < 50%             78         19.7%        80         20.2%
Total                                     396        100%         396        100%

Table 2: Number and proportion of items named as expected in the Russian and German versions of the naming task

Table 3 provides four examples of the most striking cases. These were pictures that were never named as expected. The pictures in the first line pose a problem for our study because nasmork was supposed to feature as a Russian target and competitor. Likewise, Salbe was intended to be a German target and competitor. Instead of the target names, typical names mentioned by the participants were bol’noj (čelovek) “ill (man)”, bolezn’ “illness”, gripp “flu”, prostuda “cold”, prostudivšijsja/prostužennyj “having a cold” instead of nasmork “cold, chill” and Zahnpasta “toothpaste”, Paste “paste”, Creme “cream” or even Inhalt “content” instead of Salbe “ointment”, respectively. As a consequence of these results, those particular stimuli were excluded from the target/competitor set. Clearly, the pictures were not suitable for the elicitation of the intended object names, implying that there was no chance they would induce co-activation or be co-activated when combined with the intended partner names (e.g., Samen “seeds” and nart “dog sledge”, respectively). In contrast, in the cases of nasos “pump” and Fausthandschuhe “mittens”, the fact that no participants named the picture as expected did not pose a problem for our study because the pictures were supposed to function as target/competitor items in the other language. The pump occurred as a German target and competitor, while the mittens occurred as a Russian target and competitor. A very relevant result from the pre-test was that none of the alternative names provided by the participants overlapped phonologically with German Pumpe or Russian varežka. Indeed, in Russian, naming the mittens worked out much better, since almost all of the bilinguals named them varežka (83% or 15 out of 18 cases).

Russian                               German
Picture     Expected name             Picture     Expected name
[picture]   nasmork “cold, chill”     [picture]   Salbe “ointment”
[picture]   nasos “pump”              [picture]   Fausthandschuhe “mittens”

Table 3: Examples of pictures labelled with unexpected names in 100% of cases

As a result of pre-test 1, some of our stimuli had to be excluded from the final stimuli set. We substituted them for new ones and then conducted a second pre-test. In doing so, we also amended a number of minor lapses that had occurred during the administration of pre-test 1, and the corresponding items were tested again. The results are given in table 4.

Percentage of items named as expected     Russian                 German
                                          Absolute   Percentage   Absolute   Percentage
Items named as expected ≥ 90%             14         26.4%        19         35.8%
Items named as expected ≥ 50% but < 90%   13         24.5%        13         24.5%
Items named as expected < 50%             26         49.1%        21         39.6%
Total                                     53         100%         53         100%

Table 4: Results of the naming task in pre-test 2

As table 4 clearly shows, many of the substitutes did not provide the expected results either. As a consequence, a certain number of stimuli had to be excluded for good from the final word set. However, there were also some cases where the substitute for a problematic item yielded very good results, for example, the picture for the German Sackhüpfen (cf. figure 1), which was the substitute for Salbe. This picture was named Sackhüpfen by 100% of the German monolingual participants.

Figure 1: Picture for Sackhüpfen (German) and pryžki v meške (Russian) “sack race”

We determined from the results of the naming task that many pictures did elicit the intended name, although there were clearly many pictures that elicited a variety of names. It is noteworthy, however, that in the visual world paradigm participants listen to a given word while viewing pictures, that is, they do not need to generate any names for the pictures. Therefore, the results of the picture-word matching task are the most relevant when evaluating the data from the pre-test. As long as the participants judge the match between a name and the corresponding picture to be good, the fact that they did not generate the correct name in the naming task is not particularly problematic. However, if there are items that are not named with the expected label and that also yield low scores in the picture-word matching task, these items should be removed from the stimulus set and replaced, if possible.

4.3 Data analysis and results II: The picture-word matching task

As mentioned in section 4.1, in the picture-word matching task pictures were presented together with their corresponding names, and the participants were asked to rate the level of matching on a scale ranging from 1 (“the word does not match the picture at all”) to 7 (“the word matches the picture completely”). For each item, the mean score and the standard deviation of the judgments were calculated, and the mean scores were subdivided into four bands as given in table 5. This shows how the mean values were distributed over the four bands.
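The per-item summary described above can be sketched as follows. The band limits are those of table 5, and the threshold of 4.0 is the exclusion criterion applied to the results of this task. The function names and the ratings are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which variant was computed.

```python
from statistics import mean, stdev

# Lower limits of the four bands used in table 5 (7-point scale)
BANDS = [
    (5.50, "7.00-5.50"),
    (4.00, "5.49-4.00"),
    (2.50, "3.99-2.50"),
    (1.00, "2.49-1.00"),
]
EXCLUSION_THRESHOLD = 4.0


def summarize_item(ratings):
    """Return mean, standard deviation, band and exclusion flag for one item."""
    if len(ratings) < 2:
        raise ValueError("at least two ratings are needed for a standard deviation")
    item_mean = mean(ratings)
    band = next(label for lower, label in BANDS if item_mean >= lower)
    return {
        "mean": item_mean,
        "sd": stdev(ratings),  # sample SD; an assumption, see the lead-in
        "band": band,
        "exclude": item_mean < EXCLUSION_THRESHOLD,
    }


# Hypothetical ratings for two picture-word pairs
good_item = summarize_item([7, 6, 7, 7, 6, 7])
poor_item = summarize_item([1, 2, 3, 7])
print(good_item["band"], poor_item["band"], poor_item["exclude"])
```

Note that a large standard deviation, as in the second hypothetical item, signals disagreement among the raters even when the mean alone would place the item in a middle band.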
| Mean value of picture-word matching between… | Russian (absolute) | Russian (percentage) | German (absolute) | German (percentage) |
|---|---|---|---|---|
| 7.00-5.50 | 370 | 93.4% | 125 | 31.6% |
| 5.49-4.00 | 14 | 3.5% | 251 | 63.4% |
| 3.99-2.50 | 9 | 2.3% | 20 | 5.1% |
| 2.49-1.00 | 3 | 0.8% | 0 | - |
| Total | 396 | 100% | 396 | 100% |

Table 5: Distribution of mean values for picture-word matching in pre-test 1

In both languages, for the vast majority of words the average picture-word matching score was four or higher. Only a very small number of picture-word combinations were judged to match each other badly. For example, the German Larve “grub” was judged to describe the picture in figure 2 rather poorly: the mean value of the judgments was 3.25 (SD 2.01). The Russian ličinka did not perform much better, with the mean value being 4.28 (SD 2.59). This is, presumably, because a grub is very difficult to differentiate from a caterpillar, especially when presented in the form of a simple line drawing. Moreover, in the strictest sense, a grub is not an entity as such, but rather a stage in the life of vermin and amphibians.

[Figure 2: Picture for Larve (German) and ličinka (Russian) “grub”]

As a consequence of the results of the picture-word matching task, all items judged as worse than 4.0 were excluded from the experimental set of the main experiment. However, they were set aside as possible filler items. As we detailed in section 4.2, we replaced some of the excluded items with new ones and then conducted a second pre-test. In this follow-up pre-test the substitutes for the excluded items worked out very well. As table 6 shows, no item was judged as worse than four on average.

| Mean value of picture-word matching between… | Russian (absolute) | Russian (percentage) | German (absolute) | German (percentage) |
|---|---|---|---|---|
| 7.00-5.50 | 47 | 88.7% | 48 | 90.6% |
| 5.49-4.00 | 6 | 11.3% | 5 | 9.4% |
| 3.99-2.50 | 0 | - | 0 | - |
| 2.49-1.00 | 0 | - | 0 | - |
| Total | 53 | 100% | 53 | 100% |

Table 6: Distribution of mean values for picture-word matching in pre-test 2

Among the substitutes, our substitute for the item grub, the picture of a bra (cf. figure 3), was judged to match its words in both Russian (lifčik) and German (BH) perfectly (7) by 100% of the participants.

[Figure 3: Picture for BH (German) and lifčik (Russian) “bra”]

4.4 Data analysis and results III: Subjective frequency rating task

In the following, we will present the results of our subjective frequency test, focusing on (a) the dispersion of the results, and (b) the correlations of different types of frequency data. Our presentation is limited to the stimuli from our eye-tracking study (i.e., the targets and the competitors). Thus, in this section we deal with 222 data points: 56 German target words and their Russian translations, and 55 Russian target words⁸ and their German translations.

⁸ For one stimulus, data are still missing.

As a measure of central tendency, we calculated the median of the grouped data. The agreement of the frequency judgments was calculated on the basis of the interquartile range and then classified according to the model presented by Krause (2002) (for more detail, see Anstatt & Clasmeier, 2012). The values of the levels of agreement are given in table 7. In sum, 73% of the Russian stimuli and 90% of the German ones exhibited an interquartile range below 1.11, and they were thus judged with at least mean agreement. More than half of the stimuli in Russian and two thirds in German were rated with good or unanimous agreement. The relatively high level of agreement among the judges underlines the stability of the subjective frequency. The differences in the degree of agreement between the German and Russian estimates could be due to differences between the test groups. Whereas the German judges were all monolinguals and students, the Russian judges were late bilinguals and of different ages and professions. As the agreement in German was mostly satisfactory, we decided not to exclude any stimuli for their low agreement in Russian.

With respect to the correlations of the corpus frequencies and subjective frequencies of the Russian and German stimuli, we note the following. First, all four groups of numbers exhibit highly significant correlations of at least medium strength (cf. table 8).
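For readers who wish to run this kind of comparison on their own frequency data, a rank correlation such as Spearman's coefficient can be computed as the Pearson correlation of the ranks. The following is only a sketch under stated assumptions: the function names are hypothetical, missing corpus frequencies are represented as None and dropped pairwise, and the example values are invented and are not taken from our data.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if a variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)

def spearman(x, y):
    """Return (coefficient, N) after dropping pairs with a missing value."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson(average_ranks(xs), average_ranks(ys)), len(pairs)

# Invented example: one item lacks a corpus frequency and is excluded.
corpus = [1.0, 2.5, None, 4.0, 3.0]
subjective = [10, 20, 30, 40, 30]
coefficient, n_items = spearman(corpus, subjective)
```

Dropping incomplete pairs in this way is why the number of items can differ from one pair of variables to the next.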
The most interesting result with respect to the methodological question of how to measure frequency is the fact that the strongest correlation is exhibited between the subjective frequency of the German stimuli and the subjective frequency of the Russian stimuli (r = .739).

| IQR | Level of agreement | Russian | Russian (%) | Russian (cum. %) | German | German (%) | German (cum. %) |
|---|---|---|---|---|---|---|---|
| 0.25-0.6 | Unanimous agreement | 11 | 9.9 | 9.9 | 23 | 20.7 | 20.7 |
| 0.61-0.9 | Good agreement | 46 | 41.5 | 51.4 | 51 | 46.0 | 66.7 |
| 0.91-1.1 | Mean agreement | 24 | 21.6 | 73.0 | 26 | 23.4 | 90.1 |
| 1.11-1.8 | Low agreement | 28 | 25.2 | 98.2 | 11 | 9.9 | 100 |
| 1.81-2.0 | Very low agreement | 2 | 1.8 | 100 | 0 | 0 | 0 |
| 2.01-2.5 | Bimodal distribution | 0 | 0 | 0 | 0 | 0 | 0 |
| Sum | | 111 | 100 | | 111 | 100 | |

Table 7: Dispersion of frequency judgments

| | Corpus frequency Russian | Subjective frequency German | Subjective frequency Russian |
|---|---|---|---|
| Corpus frequency German | .624 (p<.001) N=104 | .568 (p<.001) N=109 | .454 (p<.001) N=109 |
| Corpus frequency Russian | - | .439 (p<.001) N=106 | .614 (p<.001) N=106 |
| Subjective frequency German | - | - | .739 (p<.001) N=111 |

(Spearman’s r; N=111, smaller Ns are due to missing values of corpus frequency)

Table 8: Correlations of the corpus frequencies and subjective frequencies for the Russian and German stimuli

The convergence of the subjective frequency data is demonstrated by table 9 below, wherein the examples reveal that many of the striking corpus frequency differences between Russian and German are leveled in subjective frequency. For example, valenok “felt boot” is regarded by Russian bilinguals living in Germany as being only slightly more frequent than Filzstiefel by the German monolinguals, while belka “squirrel” is on a par with its German equivalent Eichhörnchen. However, maslo “butter” again has a higher frequency in comparison to Butter, which underlines the existence of certain interlinguistic differences. In some cases, the subjective frequency reveals a higher divergence than the corpus frequency. In sum, however, the mean distance between Russian and German in the subjective frequency is significantly lower than that in the corpus frequency.⁹ This is an indication of the higher suitability of subjective frequency in comparison to corpus frequency. We consider the higher correlation to be an indicator of the greater proximity of this value to the mentally stored frequency.

⁹ This calculation was performed on the basis of the z-transformed values (cf. Anstatt & Clasmeier, 2012 for details of the method). A t-test was run on the absolute number that showed a highly significant difference to the mean distance (.58 for corpus frequency and .51 for subjective frequency, p < .001).

5 What counts as phonetic or phonological overlap?

As explained above, our visual world study is based on the assumption that there will be competition between stimuli with phonologically similar onsets. Our aim is to test whether this competition only occurs within one language or between the two languages of a bilingual as well. Therefore, it is crucial that our experiment test words with phonological overlap in both languages, and thus what is meant by “phonological overlap” has to be clarified. Within one language, the solution seems to be relatively easy, since we can base our considerations on phonemes as abstract representational units. However, even within one language there are differences between allophones, and one has to deal with the question of whether and how phonetic differences influence the degree to which participants perceive similarities and differences between words. If we are dealing with two different languages, this question becomes much more complicated. It is difficult to apply the notion of a phoneme here, since this is an abstract category of a single language. In the case of two languages, we are dealing with similar sounds, but not (at least not necessarily) with the same phonemes.
Terminologically, in this case it would be more appropriate to refer to “phonetic overlap”. In general, a discussion of the phoneme correspondences between two languages in the framework of psycho-phonology is needed, although it goes beyond the scope of our paper. However, as we refer to the mental level, we use the term “phonological overlap”, but we wish to underline that this term involves a certain level of inaccuracy.

| Ex. | Russian word | Russian SF | QA | German word | German SF | QA | Distance SF 10 | Meaning |
|---|---|---|---|---|---|---|---|---|
| (4b) | maslo | 6.27 | 0.66 | Butter | 5.71 | 0.91 | 0.23 | “butter” |
| (5b) | valenok | 1.86 | 0.74 | Filzstiefel | 1.17 | 0.50 | 0.28 | “felt boot” |
| (6b) | luža | 3.57 | 1.29 | Pfütze | 3.57 | 0.87 | 0.18 | “puddle” |

Table 9: Examples of the diverging corpus frequencies of Russian and German equivalents (cf. table 1, examples (4a)-(6a) with numbers for corpus frequency) (SF = Subjective Frequency)

Marian and Spivey (2003a) applied a rather coarse definition of phonemes that did not take cross-linguistic phonetic differences into account. Based on this coarse definition, the authors considered the amount of overlap by counting overlapping phonemes between pairs of targets with their cross-linguistic competitors, as well as with their competitors in the same language. On average, in the material of Marian and Spivey (2003a), there are 1.8 overlapping phonemes between the pairs of English and Russian stimuli competing with each other, and 2.2 between the English targets and competitors. A statistical analysis showed that this difference was not significant, and the authors rated the phonetic and/or phonological similarity as satisfactory. However, there were various phonetic and phonological differences between their pairs of words, which most probably make the words immediately recognizable as English and Russian, respectively, for fluent bilingual participants.
Thus, we doubt that there was really the same chance for the Russian word čerepacha with its soft [tʃ’] and the non-stressed, strongly reduced vowel [ɪ] to compete on a par with the English chair, starting with a harder [tʃ] and continuing with an open [ɛ]. A somewhat more fine-grained analysis was conducted by Marian and Spivey (2003b), where the authors included some phonetic parameters. Again, however, relevant phonetic differences were not taken into account. For example, their analysis of the triplet chess - chair - čerepacha resulted in six shared phonetic features at onset for the intralingual English pair of chess and chair, and six shared phonetic features for the interlingual Russian-English pair of chair and čerepacha as well. Ju and Luce (2004) conducted an eye-tracking experiment, presenting Spanish-English bilinguals with Spanish stimuli having word-initial stop consonants with either English- or Spanish-appropriate voice onset times. Their results demonstrated that the participants were highly sensitive to the “Spanish” vs. the “English” phonetic characteristics of the word-initial consonants.

In light of these considerations, we decided to integrate this problem into our experiment. In the following paragraphs, we first provide a short overview of the phonetic and phonological differences between Russian and German in order to demonstrate the large contrast between the two sound systems. Subsequently, we discuss the techniques by which we try to cope with the problems.

5.1 Differences between the Russian and German sound systems

General characteristics. The sound systems of Russian and German exhibit some basic differences at the phonetic as well as the phonological level, which are described in detail in the literature on contrastive phonetics and phonology (cf. Böttger, 2008a, 2008b; Gabka, 1984; Potapova & Potapov, 2011; Wiede, 1981).
First, at the level of phonology, the number of phonemes is more or less the same, but they are distributed very differently. While German has 15-19 vowel phonemes, Russian has only five (or six). Conversely, German has 21-25 consonant phonemes, whereas Russian features 34-38 (following the most customary classifications). At the level of phonetics, an equally fundamental difference concerns the manner of articulation. German sounds exhibit a higher muscular tension and stability, as well as a greater breathing pressure (Gabka, 1984; Potapova & Potapov, 2011), which results in a whole range of articulatory and acoustic differences.

Consonantism. Due to the additional feature of palatalization (soft vs. hard consonants), Russian consonantism is phonologically more complex than German consonantism. Furthermore, each of the languages has some phonemes that do not exist in the other language. For instance, the sounds [tʃ], [ʒ], and [ʃː] have phonemic status in Russian but not in German, whereas [h], [ŋ], and [ʔ] are phonemes in German but not in Russian. At the level of phonetics, the difference in muscular tone leads to various differences in the articulation of consonants. For instance, in German sonorants have a higher tension than in Russian; voiceless stops in German are more aspirated than in Russian; German voiced stops exhibit a more intense noise than Russian ones; fricatives and affricates in German are formed with a narrower channel and consequently have a more audible noise than in Russian; and Russian consonants exhibit stronger assimilation phenomena than German consonants do (for more detail, cf. Böttger, 2008a; Gabka, 1984; Potapova & Potapov, 2011). Furthermore, in German the phonological feature of voice is clustered with the fortis/lenis contrast, whereas in Russian only the feature +/− voice is relevant (cf. Gabka, 1984, p. 65). Quite substantial phonetic differences are observed with respect to the liquids /r/ and /l/.
In Russian, /r/ is typically an apical trill [r], whereas in standard German it is a uvular [ʁ], with the apical [r] only occurring in some dialects and with fewer trills. German /l/ occupies an intermediate position between the strongly differing realizations of the corresponding Russian phonemes, that is, the hard /l/, realized as velarized [ɫ], and the palatalized /l’/. According to Gabka (1984), Russian consonants differ from German consonants in terms of pitch, which is higher in the case of palatalized sounds and deeper with non-palatalized ones. In general, German consonants, with the exception of /l/, are more similar to Russian non-palatalized consonants.

Vocalism. In contrast to consonantism, vocalism is more complex in German than in Russian. This is mostly due to the phonemic contrast of length, clustered with tension and openness, which does not exist in Russian. Furthermore, German has phonemic front rounded vowels (/ʏ/, /yː/, /œ/, /øː/) that are not known in Russian. Conversely, Russian has a central, closed, unrounded vowel [ɨ] (mostly regarded as an allophone of /i/), which in turn is not known in German. Native speakers of German tend to substitute this [ɨ] with [yː] or [iː] in Russian language production (Gabka, 1984). With respect to phonetics, Russian vowels have a gliding articulation, exhibiting different timbre in the phases of onglide and offglide. This phenomenon is especially strong in accented /o/, which shows a diphthongized or even triphthongized character [ᵘɔᵃ]. Furthermore, the gliding character is entangled with strong co-articulation phenomena, resulting in an [i]-like on- or offglide in the neighborhood of a soft consonant. Thus, accented /a/, for example, can occur as [a], [aⁱ], [ⁱa], or [ⁱaⁱ], depending on the surrounding consonants. With respect to length, Russian vowels normally have an intermediate position between German short and long vowels.
As for timbre, they may be more similar to one of them and dissimilar to the other. For example, the Russian stressed /u/ is more similar to the German long and tense /uː/ than to the short and lax /ʊ/, whereas the Russian /o/ is closer to the German short and lax /ɔ/ than to the long and tense /oː/. A further problem is the reduction of unstressed vowels, which occurs in both languages. In Russian, the strong reduction of unstressed vowels is a very prominent feature, affecting all vowels quantitatively and, in many cases, qualitatively as well. For example, unstressed /a/ in Russian is realized as [ʌ], [ə], or [ɪ], depending on the phonetic features of the environment, and thus it is phonetically indistinguishable from unstressed /o/.

5.2 Consequences for stimuli selection and preparation

As detailed previously, we needed sets of four stimuli for our study (each including two Russian and two German words), with a phonological overlap in the onset. In order to make our stimuli more phonetically homogeneous, we first adopted the following general rules:

1. All members of each quadruple should have an onset consisting of exactly two overlapping phonemes, as, for example, Schalter - Schaffner - šapka - šaška, all starting with the sounds [ʃ] and [a]. (This is opposed to the pairs used by Marian and Spivey, 2003a, 2003b, which exhibited between one and three overlapping sounds.)

2. The onset of all quadruples should be a consonant and a following vowel (CV), as this pattern is easy to decode and occurs quite often in both languages. Furthermore, if possible, the stimuli should consist of two syllables. However, due to several other constraints, it was not possible to follow these two requirements in all cases.

3. To avoid the distortion of vowel features by reduction, only stimuli with word-initial stress were included. This constraint, however, drastically reduced the number of fitting stimuli, as in Russian only a relatively small number of words have word-initial stress. For example, of the 57 Russian nouns that begin with ša- and consist of more than one syllable, only 14 (25%) bear initial stress in the nominative singular.¹⁰

¹⁰ The count is based on the bilingual dictionary of Bielfeldt (1982), which was one source used in the search for stimuli.

Notwithstanding constraints one to three, it was not possible to choose only quadruples with maximally similar onsets among all four stimuli. Phonological overlap, according to our working definition, is exhibited by allophones bearing a relatively significant acoustic similarity, whilst not necessarily being identical in terms of all relevant features. The phonological level served as a guideline for similarity, but asymmetries were unavoidable, since there may be no corresponding phoneme in the respective other language, or even more than one. The differences between the Russian and German phonological and phonetic systems discussed above imply that we were barely able to assume a full similarity between any pair of Russian and German sounds. Thus, we applied two further restrictions:

4. In order to control for the differences, we established a quantifying system that allows us to calculate the phonetic distances and include them in the statistical analysis. Moreover, some quadruples with a somewhat larger phonetic distance of the onset were included, so that the impact of this factor could be further analyzed. This system will be presented in detail in the next subsection.

5. The speaker of all stimuli recorded for the experiment, Russian as well as German, was a balanced fluent bilingual of both languages, who, according to native speakers of Russian and of German, has barely any accent in either language. In choosing a bilingual speaker, we intended to level off the phonetic differences. In fact, an auditory phonetic analysis of the recorded single-word stimuli showed that there was a slight phonetic approximation in both languages in the direction of the respective other. For example, the triphthongoid character of the stressed /o/ in Russian was less pronounced when compared to a monolingual speaker.

5.3 Calculation of phonetic distances

The phonetic distance was calculated for every possible combination of stimuli within every quadruple, which resulted in six combinations of stimuli in each quadruple, including intra- as well as interlinguistic pairs. We measured the phonetic distance in each pairing for the first as well as for the second sound, assigning points according to a fixed algorithm. The single steps of the calculation were:

1. Interlinguistic pairs of allophones were assigned one point each due to the principal differences of articulation. Thus, for example, the allophone pairs German /ʃ/ and Russian /ʃ/, as well as Russian /a/ and German /a/, in the stimuli Schaffner - šapka (cf. example 7 in table 10) were each assigned one point. With respect to the intralinguistic pairs, as in šaška - šapka or Schalter - Schaffner, respectively (examples 8 and 9 in table 10), no such difference was assumed, so these pairs obtained 0 points.

2. In a second step, large phonetic differences between allophone pairs were assigned one further point. We primarily considered differences of place or manner of articulation as “large”. Such “large” phonetic differences are exhibited, for instance, by both overlapping allophones in the interlinguistic stimuli pair of Birne and bitva (cf. example 10). Thus, the pair of German /b/ vs. Russian /b’/ was allocated 2 points, one for the principal phonetic difference and the other due to palatalization in Russian, which implies a secondary place of articulation at the palate. The vowel pair German /ɪ/ vs. Russian /i/ was given 2 points as well, the first again for the principal phonetic difference and the second with regard to the difference in length and place of articulation. Further examples of 2-point differences include German /v/ vs. Russian /v’/ in Wiege vs. vilka (13), the /r/ as well as the /o/ in Robbe and rošča (14), and German /a/ vs. Russian /a/, realized as [aⁱ], in Bagger vs. bantik (15).

3. Of course, phonetic differences have to be taken into account with respect to the intralinguistic pairs as well. Here, the phonological level was included. This is relevant, for example, for the pair of German Birne and Biene (11). The difference between /ɪ/ vs. /i/ has been assigned 2 points, as, first, the vowels exhibit a noticeable acoustic difference, which is, second, phonologically relevant in German. However, small phonetic differences were not taken into account, such as, for example, the allophones of /b/ in Birne and Biene. In contrast, co-articulation phenomena in Russian, caused by diverging palatalization of the neighboring consonants, were regarded as relevant. Thus, the allophone pair of /a/ in Russian bantik and banka ([aⁱ] vs. [a]) was assigned one point, as the difference is noticeable, but not phonologically relevant.

4. The values for the first and the second pair of allophones were added up. The interlinguistic pair of Birne and bitva, for example (cf. ex. 10), has been assigned a total of 4 points.

5. As mentioned above, the stimuli were recorded by a balanced bilingual speaker with slight phonetic interferences. In a last step, for all the interlinguistic pairs, one point was subtracted to take into account this slight leveling of the phonetic differences. This step resulted in an adjusted sum of phonetic difference (cf. the last column of table 10).
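The five steps can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the tool we used: the function names and the three difference grades ("none", "noticeable", "large") are hypothetical labels for the phonetic judgments described above, and the grade of each allophone pair still has to be assigned by the analyst.

```python
GRADES = ("none", "noticeable", "large")

def pair_points(interlinguistic, difference):
    """Points for one allophone pair (steps 1-3).

    difference is a hand-assigned grade: for intralinguistic pairs, "large"
    stands for a noticeable and phonologically relevant difference.
    """
    if difference not in GRADES:
        raise ValueError(f"unknown difference grade: {difference!r}")
    if interlinguistic:
        # Step 1: one point for the principal difference of articulation;
        # step 2: one further point for a large phonetic difference.
        return 1 + (1 if difference == "large" else 0)
    # Step 3: intralinguistic pairs are graded on their own scale.
    return {"none": 0, "noticeable": 1, "large": 2}[difference]

def phonetic_distance(interlinguistic, first_sound, second_sound):
    """Return (sum, adjusted sum) for the two onset sounds of a stimulus pair."""
    total = (pair_points(interlinguistic, first_sound)
             + pair_points(interlinguistic, second_sound))  # step 4
    # Step 5: subtract one point for interlinguistic pairs (bilingual speaker).
    adjusted = total - 1 if interlinguistic else total
    return total, adjusted

# Grades chosen to mirror examples (7), (10), (11) and (12).
schaffner_sapka = phonetic_distance(True, "none", "none")
birne_bitva = phonetic_distance(True, "large", "large")
birne_biene = phonetic_distance(False, "none", "large")
bantik_banka = phonetic_distance(False, "none", "noticeable")
```

With these grades the sketch yields the sums and adjusted sums reported for the corresponding examples, for instance 4 and 3 for Birne - bitva.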
The adjusted sum will be included in the statistical analysis of the results from the eye-tracking experiment for each single pair.

Of course, this calculation results in a relatively simplified model of phonetic differences; however, we believe that it at least allows us to take account of the broad lines of the problem. Table 11 provides an overview of the average phonetic distances. As might be expected, the average phonetic distance between interlinguistic onset pairs is clearly larger than that between the intralinguistic pairs, even taking into account the phonetic leveling of the bilingual speaker.

| Ex. | Stimulus 1 | Stimulus 2 | Allophone pair 1 | Phonetic distance (pair 1) | Allophone pair 2 | Phonetic distance (pair 2) | Sum | Adjusted sum |
|---|---|---|---|---|---|---|---|---|
| (7) | Schaffner (G) | šapka (R) | [ʃ] - [ʃ] | 1 | [a] - [a] | 1 | 2 | 1 |
| (8) | šaška (R) | šapka (R) | [ʃ] - [ʃ] | 0 | [a] - [a] | 0 | 0 | 0 |
| (9) | Schalter (G) | Schaffner (G) | [ʃ] - [ʃ] | 0 | [a] - [a] | 0 | 0 | 0 |
| (10) | Birne (G) | bitva (R) | [b] - [b’] | 2 | [ɪ] - [i] | 2 | 4 | 3 |
| (11) | Birne (G) | Biene (G) | [b] - [b] | 0 | [ɪ] - [i] | 2 | 2 | 2 |
| (12) | bantik (R) | banka (R) | [b] - [b] | 0 | [aⁱ] - [a] | 1 | 1 | 1 |
| (13) | Wiege (G) | vilka (R) | [v] - [v’] | 2 | [i] - [i] | 1 | 3 | 2 |
| (14) | Robbe (G) | rošča (R) | [ʀ] - [r] | 2 | [ɔ] - [ᵘɔⁱ] | 2 | 4 | 3 |
| (15) | Bagger (G) | bantik (R) | [b] - [b] | 1 | [a] - [aⁱ] | 2 | 3 | 2 |

Table 10: Examples of the calculation of phonetic distances (ranging from 0 to 3 in the adjusted sum, with 3 as the greatest phonetic distance)

Conclusion

Preparing stimuli for experimental research requires a thorough selection of material as well as controlling for various variables, which is a well-known fact in psycholinguistics. The more pre-tested data is available for the language in question, the easier it is to cope with these requirements.
However, as experimental psycholinguistic research involving languages other than English does not have a very long history, there is often not enough pre-tested material. The aim of our contribution is to discuss some of the difficulties in stimulus preparation and selection that we encountered when compiling stimuli for an experiment on the mental lexicon of Russian-German bilinguals. The goal of the study we are preparing is to test hypotheses regarding the co-activation phenomena in L1 and L2 of different types of bilinguals with an eye-tracking study using the visual world paradigm. Thus, we have to combine German and Russian stimuli, which poses some further difficulties we have to tackle. In this paper, we focus on three methodological problems: 1. word frequency, 2. matching of object names and pictures, and 3. phonetic and phonological differences between Russian and German.

| Pair type | Average phonetic distance: allophone pair 1 (consonants) | Average phonetic distance: allophone pair 2 (vowels) | Average sum of phonetic distance | Average adjusted sum of phonetic distance | SD of adjusted sum of phonetic distance |
|---|---|---|---|---|---|
| Interlinguistic pairs | 1.43 | 1.47 | 2.91 | 1.91 | 0.75 |
| Intralinguistic German pairs | 0.00 | 0.61 | 0.54 | 0.54 | 0.87 |
| Intralinguistic Russian pairs | 0.04 | 0.32 | 0.36 | 0.36 | 0.55 |

Table 11: Average values of the phonetic distances between stimuli

Section two presented the goal and method of the study under preparation in further detail. Section three discussed the parameter of frequency as a highly important variable to control for, the adequate handling of which poses further challenges. The first is the heterogeneity of the frequency of the stimuli occurring as competitors in the test. This can be solved by including the frequency of the items in the statistical analysis of the test results. The second problem is that it is difficult to obtain reliable information on the frequency that represents the knowledge stored in the mental lexicon of the participants.
We propose subjective frequency as a more suitable instrument in comparison to the less adequate corpus frequency. Section four outlined the pre-tests we conducted to check our stimuli. As no pre-tested pool of pictures exists, we had to have many pictures created especially for our experiment and then run pre-tests for them. In subsections 4.2 and 4.3, we presented these pre-tests as well as the results we obtained from them. Poor outcomes meant that we had to replace the given stimulus with a new one, which then had to be pre-tested in another round of pre-tests. Sub-section 4.4 is concerned with the results of the subjective frequency test. Finally, section five discusses the problem of fundamental differences between the sound systems of Russian and German. As these differences concern the phonetic as well as the phonological level, the extent to which we can assume a phonetical overlap, if any, has to be scrutinized. Our suggestion is to thoroughly analyze these differences and transpose them into a quantitative scale, allowing for a schematic calculation of the phonetic and phonological distance between the single stimuli that can be included in the statistical analysis of the experimental data. References Akinina, Y., Malyutina, S., Ivanova, M., Iskra, E., Mannova, E., & Dragoy, O. (2015). Russian normative data for 375 action pictures and verbs. Behavior Research Methods, 47, 691-707. Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38, 419-439. Anstatt, T. (in press). Subjektive Frequenz als Forschungsmethode. To appear in Wiener Slawistischer Almanach 2015. Anstatt, T., & Clasmeier, Ch. (2012). Wie häufig ist poplakat’? Subjektive Frequenz und russischer Verbalaspekt. Wiener Slawistischer Almanach, 70, 129-163. Baayen, R. H., Feldman, L. B., & Schreuder, R. (2006). 
Morphological influences on the recognition of monosyllabic monomorphemic words. Journal of Memory and Language, 55, 290-313. Baayen, R. H., Piepenbrock, R., & Gulikers, L. (1995). CELEX2 LDC96L14. Philadelphia, PA: Linguistic Data Consortium. Retrieved from https: / / catalog.ldc.upenn. edu/ LDC96L14 Are Schalter and šapka good competitors? Christina Clasmeier, Tanja Anstatt, Jessica Ernst & Eva Belke 222 Balota, D., Cortese, M., Sergent-Marshall, S., Spieler, D., & Yap, M. (2004). Visual word recognition for single-syllable words. Journal of Experimental Psychology: General, 133, 283-316. Balota, D. A., Pilotti, M., & Cortese, M. J. (2001). Subjective frequency estimates for 2.938 monosyllabic words. Memory & Cognition, 29, 639-647. Benmamoun, E., Montrul, S., & Polinsky, M. (2013). Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics, 39(3- 4), 129-181. Bielfeldt, H. H. (1982). Russisch-deutsches Wörterbuch. Berlin, Germany: Akad.-Verl. Böttger, K. (2008a). Negativer Transfer bei russischsprachigen Deutschlernern: Die häufigsten muttersprachlich bedingten Fehler vor dem Hintergrund eines strukturellen Vergleichs des Russischen mit dem Deutschen (Doctoral dissertation). Retrieved from http: / / www.sub.uni-hamburg.de/ opus/ frontdoor.php? source_opus=3622 Böttger, K. (2008b). Die häufigsten Fehler russischer Deutschlerner: Ein Handbuch für Lehrende. Münster, Germany: Waxmann. Brysbaert, M., & Cortese M. J. (2011). Do the effects of subjective frequency and age of acquisition survive better word frequency norms? The Quarterly Journal of Experimental Psychology, 64(3), 545-559. Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977-990. Canseco-Gonzalez, E., Brehm, L., Brick, C. 
A., Brown-Schmidt, S., Fischer, K., & Wagner, K. (2010). Carpet or Cárcel: The effect of age of acquisition and language mode on bilingual lexical access. Language and Cognitive Processes, 25(5), 669-705. Coltheart, M. (1981). The MRC Psycholinguistic Database. Quarterly Journal of Experimental Psychology, 33A, 497-505. Ellis, N. C. (2002). Frequency effects in language processing: A Review with Implications for Theories of Implicit and Explicit Language Acquisition. Studies in Second Language Acquisition, 24(2), 143-188. Gabka, K. (1984). Einführung in das Studium der russischen Sprache: Phonetik und Phonologie. Leipzig, Germany: Brücken-Verlag. Huettig, F., Rommers, J., & Meyer, A. S. (2011). Using the visual world paradigm to study language processing: A review and critical evaluation. Acta psychologica, 137(2), 151-171. Ju, M., & Luce, P. (2004) Falling on sensitive ears: Constraints on bilingual lexical activation. Psychological Science 15(5), 314-318. Kliegl, R., & Hanneforth, T. (2011). dlexDB. Retrieved from http: / / www.dlexdb.de/ Köpke, B., & Schmid, M. S. (2004). Language attrition. The next phase. In M. S. Schmid, B. Köpke, M. Keijzer & L. Weilemar (Eds.), First Language Attrition (pp. 1- 43). Amsterdam, Philadelphia: John Benjamins. Krause, M. (2002). Subjektive Bewertung von Vorkommenshäufigkeiten: Methode und Ergebnisse. Glottometrics, 2, 53-81. 223 Ljaševskaja, O. N., & Šarov, S. A. (2009). Častotnyj slovar’ sovremennogo russkogo jazyk: Na materialach Nacional’nogo korpusa russkogo jazyka. Retrieved from http: / / dict.ruslang.ru/ freq.php Lönngren, L. (Ed.). (1993). Častotnyj slovar’ sovremennogo russkogo jazyka [A frequency dictionary of Modern Russian]. Uppsala, Sweden: Almqvist & Wiksell. Marian, V., & Spivey, M. (2003a). Bilingual and monolingual processing of competing lexical items. Applied Psycholinguistics, 24, 173-193. Marian, V., & Spivey, M. (2003b). 
Competing activation in bilingual language processing: Within- and between-language competition. Bilingualism: Language and Cognition, 6, 97-115.
McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1-86.
McLaughlin, B. (1984). Second-Language Acquisition in Childhood: Preschool Children (Volume 1). Hillsdale, NJ: Erlbaum.
Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52, 189-234.
Pavlenko, A., & Malt, B. C. (2010). Kitchen Russian: Cross-linguistic differences and first-language object naming by Russian-English bilinguals. Bilingualism: Language and Cognition, 14, 19-45.
Polinsky, M., & Kagan, O. (2007). Heritage languages: In the ‘wild’ and in the classroom. Language and Linguistics Compass, 1(5), 368-395.
Potapova, R. K., & Potapov, V. (Eds.). (2011). Kommunikative Sprechtätigkeit: Russland und Deutschland im Vergleich. Köln: Böhlau.
Prestin, E. (2003). Theorien und Modelle der Sprachrezeption. In G. Rickheit, T. Strohner & W. Deutsch (Eds.), Psycholinguistik: Ein internationales Handbuch (pp. 491-505). Berlin: De Gruyter.
Reid, A. A., & Marslen-Wilson, W. D. (2003). Lexical representation of morphologically complex words: Evidence from Polish. In R. H. Baayen & R. Schreuder (Eds.), Morphological Structure in Language Processing (pp. 287-336). Berlin, Germany: Mouton de Gruyter.
Romaine, S. (1995). Bilingualism. Oxford: Blackwell Publishing.
Schmitt, N., & Dunham, B. (1999). Exploring native and non-native intuitions of word frequency. Second Language Research, 15, 389-411.
Snodgrass, J. G., & Vanderwart, M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6(2), 174-215.
Spivey, M. J., & Marian, V. (1999). Cross talk between native and second languages: Partial activation of an irrelevant lexicon.
Psychological Science, 10(3), 281-284.
Tsaparina, D., Bonin, P., & Méot, A. (2011). Russian norms for name agreement, image agreement for the colorized version of the Snodgrass and Vanderwart pictures and age of acquisition, conceptual familiarity, and imageability scores for modal object names. Behavior Research Methods, 43, 1085-1099.
Weinreich, U. (1953). Languages in Contact: Findings and Problems. New York, NY: Linguistic Circle.
Wiede, E. (1981). Phonologie und Artikulationsweise im Russischen und Deutschen: Eine konfrontierende Darstellung. Leipzig, Germany: Verlag Enzyklopädie.
Wilson, M. D. (1988). The MRC Psycholinguistic Database: Machine Readable Dictionary (version 2). Behavior Research Methods, Instruments, & Computers, 20(1), 6-11.
Zasorina, L. N. (1977). Častotnyj slovar’ russkogo jazyka. Moscow, Russia: Russkij Jazyk.
Zeno, S., Ivens, S., Millard, R., & Duvvuri, R. (1995). The Educator’s word frequency guide. Brewster, NY: Touchstone Applied Science Associates.

Measuring lexical proficiency in Slavic heritage languages: A comparison of different experimental approaches

Bernhard Brehmer, Tatjana Kurbangulova & Martin Winski

Abstract: The paper explores the effects of selected psycholinguistic experimental methods on measures of lexical proficiency in both languages of 40 Russian-German and Polish-German bilingual participants. For this purpose, four different task types were applied: (a) a picture naming task, (b) a semantic mapping task, (c) two translation tasks, and (d) verbal fluency tasks which target semantic (or category) fluency. The results are compared on the group level as well as on the level of the individual participants. The comparison reveals that both language groups show more or less the same average results in the individual tasks.
Furthermore, the results from all tests concerning the heritage language are statistically correlated with each other. In general, the translation tasks turn out to be the task type that is best suited for assessing lexical proficiency in the heritage language, since their results show the strongest correlation with the results obtained in the other psycholinguistic tasks.

1 Introduction

Migration has always been a factor that has shaped the history of mankind; it has resulted in the emergence of linguistic minorities in countries all over the world. Despite this, heritage languages have become a core topic in research on multilingualism only rather recently. The terms ‘heritage languages’ and ‘heritage speakers’ were first used in the North American context (Cummins, 2005, p. 585). This is certainly due to the fact that Canada and the USA have a long-standing history as a destination for immigrants from all over the world. Both terms, however, are also increasingly used in the European context and with regard to Slavic languages. Perhaps because the concept of heritage languages is so new, there is still no general consensus among researchers about the exact definition of the term ‘heritage speakers’.¹ In their survey article on heritage linguistics, Benmamoun, Montrul, and Polinsky (2013a) state that

a heritage speaker is an early bilingual who grew up hearing (and speaking) the heritage language (L1) and the majority language (L2) either simultaneously or sequentially in early childhood (that is, roughly up to age 5 […]), but for whom L2 became the primary language at some point during childhood (at, around, or after the onset of schooling). As a result of language shift, by early adulthood a heritage speaker can be strongly dominant in the majority language, while the heritage language will now be the weaker language (p. 133).
One crucial problem in defining a “typical” heritage speaker is the tremendous degree of variation in the levels of individual heritage language proficiency that can be observed within a single heritage speaker population. To address this issue, Polinsky and Kagan (2007) have proposed a continuum stretching from fluent acrolectal (more proficient) heritage speakers, who reach almost native-like levels in their heritage language, to basilectal (less proficient) heritage speakers, who possess mostly receptive and only limited productive skills in it. Given this vast range of variation, one essential challenge is the development of metrics that can help to place heritage speakers along the proposed continuum. The main methodological problem lies in the fact that less proficient heritage speakers are often quite limited in their ability to produce spontaneous utterances, which renders some standardized psycholinguistic tasks for assessing linguistic proficiency inapplicable. Furthermore, they often lack metalinguistic awareness regarding certain structures of their heritage language. Judgments on linguistic structures often presuppose a certain amount of training and experience, which is typically gathered in school. The problem is that many heritage speakers have never been exposed to formal schooling in their heritage language. This, again, reduces the range of possible tasks (e.g. grammaticality judgment tasks) that can be successfully applied to measure their proficiency. Polinsky and Kagan (2007, pp. 374-376) propose two diagnostic parameters for quickly and easily evaluating the linguistic abilities of heritage speakers: (a) speech rate (measured as the words-per-minute output in spontaneous production); (b) lexical knowledge in the heritage language. Both measures establish the proximity of each speaker to the (monolingual) baseline. Furthermore, both diagnostic parameters were shown to correlate strongly with the speaker’s grammatical knowledge.
This correlation proved to be valid for several heritage languages including Russian, Polish, Spanish, Arabic, Armenian, Korean, and Lithuanian, and has also been proposed for early child language (see Benmamoun et al., 2013a, p. 136 for details and references).² Polinsky and Kagan (2007, p. 376) therefore conclude that “lexical proficiency scores, which are relatively easy to obtain, can serve as a basis for the characterization and ranking of incomplete learners in terms of the proposed continuum model.” But even if lexical proficiency can count as a good indicator of overall skill levels in the heritage language, the question remains which experimental task should be used to measure lexical proficiency. Different psycholinguistic methods are frequently applied for assessing lexical proficiency in bilinguals (see section 2.2). Thus, the investigation of task-related effects in measuring lexical proficiency of heritage speakers also contributes to a critical evaluation of psycholinguistic experimental settings that are commonly used in studies on the bilingual mental lexicon.

¹ See Benmamoun, Montrul, and Polinsky (2013b, p. 259) for a discussion of different viewpoints and further references.

2 Experimental methods for assessing the lexical proficiency of monolinguals and bilinguals

2.1 Dimensions of lexical knowledge

Most researchers distinguish between receptive and productive (or: passive and active) word knowledge. The distinction builds on the difference between how well particular lexical items function in comprehension and production. Some linguists assume that the receptive-productive relationship is not a dichotomy but a continuum (Henriksen, 1999; Read, 2000). Waring (1997) states that the active vocabulary of a speaker comprises only about 50% of his/her passive vocabulary, and, therefore, productive vocabulary knowledge should be considered a subset of receptive knowledge.
There have also been attempts to describe different facets of vocabulary knowledge by using theoretical models such as the model of lexical space. In this model, lexical knowledge is described as a three-dimensional space. Each of the dimensions “represents an aspect of knowing a word” (Daller, Milton, & Treffers-Daller, 2007, p. 7). The dimensions are lexical breadth, depth of lexical knowledge, and lexical fluency. Lexical breadth refers to the quantity or size of a speaker’s vocabulary. Lexical depth is the quality of a speaker’s vocabulary and includes aspects such as collocations, pronunciation, grammatical functions, and register. The third dimension, fluency, concerns access to core lexical items (Meara, 2005).

² Comparable attempts to connect vocabulary knowledge with general language proficiency can be frequently encountered in the literature, e.g. in Cameron (2002) or Meara (1996).

Read (2000) proposes a three-dimensional model to describe the design of vocabulary tests. The first dimension concerns the construction of the instrument used to assess lexical knowledge. He distinguishes between discrete and embedded testing of vocabulary knowledge: a “discrete test takes vocabulary knowledge as a distinct construct, separated from other components of language competence” (ibid., p. 8), whereas an embedded vocabulary measure “forms part of the assessment of some other, larger construct” (ibid., p. 9). The second dimension covers the range of vocabulary included in the task. If specific items selected by the scholar are the focus of the assessment, Read considers these vocabulary tests to be a selective vocabulary measure (ibid., p. 10). Comprehensive measures of vocabulary, on the other hand, correspond to the concept of lexical richness.
This general term includes lexical features such as lexical diversity (type-token ratio), lexical sophistication (the number of low-frequency words) and lexical density (the ratio of lexical to grammatical words) (Daller et al., 2007, p. 13). Finally, the third dimension concerns the role of context in vocabulary tests. A context-independent vocabulary task presents lexical items in isolation without any context. A context-dependent vocabulary test, however, is “[a] vocabulary measure which assesses the test-takers’ ability to take account of contextual information in order to produce the expected response” (Read, 2000, p. 9).

2.2 Types of psycholinguistic tasks

2.2.1 Picture naming tasks

The picture naming task is a frequently used psycholinguistic tool for studying lexical access. Participants are asked to name objects presented in pictures as quickly as possible. The aim of this task is to measure the average reaction time for naming the depicted items. Another goal is to determine the level of picture-naming accuracy, which is measured as the percentage of correctly produced items. Two types of tasks can be distinguished according to whether the participants are granted a limited or an unlimited time span to deliver their answers. Picture naming tasks are generally easy to construct and to administer, and their results can be analyzed quickly. Another advantage is that they can be used with young children and heritage speakers who cannot read (yet), as they are not dependent on literacy skills. Some disadvantages have to be mentioned as well: in picture naming tasks, the range of testable items is limited to mostly concrete nouns which can be easily depicted. Furthermore, Schmid (2011, p.
145) observed a ceiling effect in applying picture naming tasks in studies on language attrition, since there were differences neither in accuracy nor in reaction times between the bilinguals and the monolingual control groups.

This task can measure passive as well as active vocabulary abilities. To measure passive vocabulary (picture-to-word matching task), the test instructor presents a picture and at the same time reads a word aloud. The test-takers need to decide whether the presented picture corresponds to the given word or not. Another possibility is to present four different pictures and to read a single word aloud; the task for the test-takers is then to select the correct picture.³ When measuring active vocabulary knowledge, test-takers themselves have to name the given items which are presented as pictures. Using Read’s (2000) definition of lexical test types, picture naming tasks can be classified as discrete, selective, and context-independent vocabulary measurement tools.

2.2.2 Translation tasks

In translation tasks,⁴ passive as well as active vocabulary knowledge can be assessed depending on the direction of translation. If test-takers have to show that they understand the meaning of a presented word, and their task is to translate it from L2 to L1, we are tapping into word recognition skills. If test-takers are asked to translate target words from L1 to L2, the response builds on their word memory (“recall”) (Read, 2000, p. 155).⁵ Item selection is very important when designing a translation task: Fitzpatrick (2007) used frequency bands to select a total of 60 items. Twenty items were chosen from the first, second, and third 1000-word frequency layers, respectively. She claims that with such a translation test it is possible to “quantify the number of words a subject has in their (sic!) L2” and that “this number is somehow meaningful in terms of overall proficiency” (ibid., p. 122).
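The frequency-band selection attributed to Fitzpatrick (2007) above (20 items from each of the first three 1000-word frequency layers, 60 items in total) can be illustrated with a short sketch. It is an illustration under stated assumptions, not a reconstruction of her procedure: the function name, the placeholder word list, and the use of random sampling within each band are hypothetical, since the text does not say how items were picked inside a band.

```python
import random


def sample_by_frequency_band(ranked_words, band_size=1000, n_bands=3,
                             per_band=20, seed=0):
    """Draw `per_band` items from each of the first `n_bands` frequency bands.

    `ranked_words` is assumed to be sorted from most to least frequent.
    Random sampling within a band is an assumption made for this sketch.
    """
    needed = band_size * n_bands
    if len(ranked_words) < needed:
        raise ValueError(f"need at least {needed} ranked words")
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    bands = []
    for band in range(n_bands):
        start = band * band_size
        band_words = ranked_words[start:start + band_size]
        bands.append(rng.sample(band_words, per_band))
    return bands


# Placeholder frequency list: "word0001" stands for the most frequent lemma.
ranked = [f"word{rank:04d}" for rank in range(1, 3001)]
bands = sample_by_frequency_band(ranked)
items = [word for band in bands for word in band]
print(len(items))  # 60
```

With a real frequency dictionary, `ranked` would be replaced by lemmas ordered by corpus frequency; the band boundaries and the number of items per band are parameters and can be changed.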
³ A picture naming task of this type which is very often applied in research on monolinguals and bilinguals is the standardized Peabody Picture Vocabulary Test (Dunn & Dunn, 1997). This test has been used as a reliable measure of receptive vocabulary since the 1950s and its English version has been adapted to several other languages.
⁴ On parallels between picture naming tasks and translation tasks see Snodgrass (1993).
⁵ This distinction relies on speakers who have only one L1 as their “mother tongue” and an L2 which clearly has the status of a foreign language and was acquired rather late. With regard to heritage speakers (including simultaneous as well as early successive bilinguals), the distinction between L1 and L2 is often not easy to draw. This holds for the researchers as well as for the heritage speakers themselves.

In another study, Polinsky (1997) employed a simple vocabulary translation task as a test of general heritage language ability. The participants had to orally translate a list of 100 basic vocabulary items from their primary language (English) into their weaker Russian heritage language. Reaction time was not measured. The 100 words were taken from the so-called Swadesh list, which was not compiled on the basis of word frequency lists (Swadesh, 1955). To calculate a score, Polinsky compared the answers of her test-takers “to the full language list” (Polinsky, 1997, p. 393). This list was obtained from dictionary translations of the selected items and then verified by “at least one full [i.e. native] speaker” (ibid.). For each correct translation, test-takers received one point; for each wrong or missing translation, one point was deducted. Also, half a point was deducted for each incorrect word form. The total score (max. 100) was taken as the test-taker’s lexical proficiency value. According to Polinsky (ibid., p.
394) the advantages of this method include its simplicity and the possibility of comparing test-takers to each other. Furthermore, the scores are relatively easy to obtain. It can be argued, however, that translation is a special skill which requires a certain amount of cognitive training; such training is normally received in foreign language lessons at school. Since our participants were regularly exposed to such translation tasks at school (although for most of them the tasks never included their heritage language), we expected no difficulties for our participants in solving this kind of task.

2.2.3 Verbal fluency tasks

The verbal fluency task (VFT) or controlled oral word association task (COWAT) is one of the most frequently applied experimental tests in neuropsychology (Lezak, 2004; Spreen & Strauss, 1998). It is commonly used as a clinical assessment tool for the diagnosis of neurodegenerative diseases such as Alzheimer’s or Parkinson’s disease. In psycholinguistics, this task was primarily used to examine the lexicosemantic abilities of monolinguals. It is also increasingly being used in studies on bilingualism (e.g., Roberts & Le Dorze, 1997; Rosselli et al., 2002; Luo, Luk & Bialystok, 2010) and language attrition (Schmid, 2011; Schmid & Jarvis, 2014). The aim of this task is to assess the spontaneous production of words from a given category within a limited period of time (usually 60 seconds). There are two different variants of this task: 1) phonemic or letter fluency tasks, where participants are asked to name as many words as possible that start with a particular letter (e.g., <F>, <A>, or <S>); the letters are normally presented to the participants on a computer screen; and 2) semantic or category fluency tasks, where participants have to produce words belonging to a certain semantic category, such as animals, fruits, colors, etc.
For the latter type, semantic categories that aim at eliciting nouns are preferred, whereas adjectives and especially verbs are rarely target word categories of the task. VFT is a very popular task because it is easy to construct and to administer. Furthermore, it can be adapted for different languages and is relatively easy to score: the test score is the sum of all unique correct items produced in the given period of time. Word productivity, however, depends to a considerable degree on the given letter or semantic category: Ardila, Ostrosky-Solís, and Bernal (2006, p. 326) cite an unpublished study by Lopera that investigated the word productivity of 3000 Spanish-speaking participants in 16 different semantic categories. Lopera found that three categories (body parts, things to eat, and animals) elicited by far the highest number of correct responses. Schmid (2011, p. 150) mentions another specific disadvantage of this task: often there is no correlation between VFT scores and scores obtained from other lexical tasks. These issues make studies that depend on VFT alone for measuring lexical proficiency rather problematic, at least with regard to the representativeness of the results. Furthermore, researchers have identified several problems in designing fluency tasks. It has been recommended for semantic VFT to choose clear, unambiguous categories which are not culture- or language-specific, since the lexical breadth of certain semantic fields may differ across languages (Ardila et al., 2006, p. 326). Culture-specific stimulus categories may induce code-switching, which “could trigger further L2 items and therefore distort the picture of overall productivity” (Schmid, 2011, p. 149). More general doubts regarding the validity of VFT as a tool for assessing verbal abilities were raised by Shao, Janse, Visser, and Meyer (2014).
They point to the fact that general cognitive forces such as executive control also come into play and determine fluency scores. Participants have to focus on the task since they need to suppress semantically irrelevant responses or (in case of bilinguals) they have to filter their responses for language. Furthermore, they must keep track of words they already mentioned in order to avoid repetition, which clearly involves executive control abilities (e.g., working memory capacities). It is this hybrid character of VFT that is argued to limit its value as a tool for screening the pure lexical abilities of participants. This criticism, however, can also be applied to other methods for measuring lexical abilities. For instance, executive control certainly plays a role in picture naming tasks as well, especially when they are conducted with bilingual participants.

3 Design and research question of the present study

3.1 Background

The data for the present study were gathered in a research project on the role of contextual factors for the maintenance and use of heritage languages among Russian-German and Polish-German bilingual teenagers.⁶ One main focus of the project deals with assessing their proficiency in the heritage language (Russian/Polish) and the majority language (German). In contrast to comparable projects, we adopt a holistic approach in documenting the language abilities of our participants. This means that we collect data for both languages concerning the main communicative competencies (listening and reading comprehension, speaking, writing) as well as data regarding proficiency on all linguistic levels (pronunciation, orthography, grammar, vocabulary).
The main linguistic goals of the project are (a) to determine the degree of inter- and intrapersonal variation concerning the language ability of our participants and (b) to evaluate the importance of parental input, language attitudes (mediated by family, friends, school) and other extralinguistic background variables for the observable variation of their bilingual proficiency. In the context of this holistic approach, we developed an extensive test battery that includes several psycholinguistic instruments which measure the lexical proficiency of our participants (see Brehmer & Mehlhorn, 2015 for details). Unlike previous research on Slavic heritage languages, we do not intend to assess our participants’ distance from the monolingual baseline (i.e. competent, age-matched monolingual speakers of the respective languages). Our main aim is to compare our individuals with each other, thus shedding light on recurrent features that characterize the whole group, but also on intra- and interpersonal differences with regard to language ability. The only baselines we apply in the project are the data obtained from the parents in each family. The parents’ data will be systematically compared to the data of their children in order to check whether or not, and in which domains, the children’s performance deviates from the performance of their parents.

⁶ This is a joint project of two research teams located at the University of Greifswald and the University of Leipzig. The project is supported by generous funding from the German Ministry of Education and Research (BMBF, project number: 01JM1302) to both local project leaders (Bernhard Brehmer/Greifswald and Grit Mehlhorn/Leipzig).

3.2 Participants

As a trade-off for the holistic approach, we are forced to limit our study to a rather modest sample size of 40 participants.
Twenty of these (13 females, 7 males) are Russian-German bilinguals living with their families in the German cities of Hamburg and Leipzig. The other twenty (10 females, 10 males) are Polish-German bilinguals from Berlin and Hamburg. In order to keep the sample as homogeneous as possible, we selected our participants according to a set of fixed extralinguistic criteria. Our participants are adolescents between the ages of 12 and 13 and live in families where preferably both parents are L1 speakers of Russian or Polish. The children were either born in Germany or moved with their families to Germany before they entered school (i.e. before age 6). In this respect, they represent “typical” profiles of heritage speakers, since they grew up in a Russian- or Polish-speaking family environment, but upon entering kindergarten their exposure to the language of the majority community increased significantly (see section 1). Half of our sample, however, has been receiving at least some formal schooling in the heritage language: they attend classes in Russian or Polish as a foreign or heritage language in German grammar schools (“Gymnasien”) or at so-called Saturday/Sunday schools of private, state or church institutions. With regard to their educational and social background, the participants are roughly comparable. For the current analysis, however, we will refrain from explicitly addressing the influence of sociolinguistic variables on the performance of our participants in the assessment tasks. Instead, we will focus on the issue of whether the four different psycholinguistic tasks employed in our study yield the same results regarding the lexical proficiency of our participants in the heritage language.

3.3 Research question

Our main research agenda was to find a reliable method of evaluating the lexical abilities of young Russian and Polish heritage speakers in Germany. To that end, we employed different psycholinguistic tests in our project (see section 3.4).
All of these tests represent discrete, selective and context-independent ways of measuring vocabulary knowledge according to Read’s typology. However, they tap into different dimensions of lexical proficiency such as lexical breadth, lexical depth and lexical fluency, as well as active and passive vocabulary knowledge (see section 2.1). The main research question we address in this study is whether the four tests lead to an identical ranking of our participants with regard to lexical knowledge in the heritage language. Most studies that deal with language attrition or heritage languages concentrate on one specific task to evaluate lexical proficiency in the supposed weaker language. Therefore, our aim was to look for possible task-dependent results that could pose problems for generalizing results that were obtained by the employment of one test only. The second dimension of our analysis is whether the types of lexical knowledge which are targeted by the individual psycholinguistic tests are related to each other or not. The last goal was to check whether the different tasks work essentially in the same way for both language groups investigated in our project (i.e., Russian and Polish heritage speakers). All employed testing methods are said to be applicable universally, i.e. without considering a specific target language. It must be empirically tested, however, whether or not these methods indeed function equally when used for different languages, and especially for Slavic languages in our case. Such a comparison might reveal psycholinguistic peculiarities related to (single) Slavic languages.

3.4 Methods

Due to the enormous amount of data that was collected from each participant in both of his/her languages, the data were gathered in five different sessions on five different days. As usual in psycholinguistic research on bilinguals, each session was devoted to data gathering in one language only (i.e.
German or the heritage language) in order to reduce the impact of crosslinguistic influence. The test instructors exclusively used the language that was being tested. Furthermore, German and Russian/Polish tests were administered by different instructors, all of them being native speakers of the language to be tested. The test battery included four tasks that either directly targeted lexical proficiency or could at least be used to (additionally) measure the vocabulary size of our participants in both of their languages. To prevent our participants from getting bored by testing the same linguistic domain multiple times in a row, the different tasks targeting lexical knowledge were distributed among different sessions. One problem for interpreting the data was the rather limited number of psycholinguistic studies on the mental lexicon of bilinguals involving Slavic languages. As a result, no standardized tests for Russian or Polish exist which could be used to assess the lexical proficiency of teenagers and provide a baseline for comparison with the data we obtained from our heritage speakers. This is why we decided to develop our own tasks. However, whenever possible we tried to adapt existing tasks to our needs.

3.4.1 Picture Naming Task (PNT)

We employed an untimed version of the picture naming task. The main aim of this task, however, was to investigate phonetic features in the speech of our informants, not lexical proficiency per se. This, of course, had an impact on the selection of items to be included in this task: they were systematically filtered for phonetic criteria, not for criteria that are relevant for checking vocabulary size, although the selected items were also controlled for frequency. All items were depicted either on drawings or on photographs. Photographs were used whenever they seemed more reliable to produce the expected response.
The pictures were scaled to fit into an identical frame and were shown to the participants on a laptop screen. Since the procedure was not timed, the participants themselves could control the speed of the task. As the aim was to elicit a certain name that contained the phonetic feature under focus, participants were encouraged to correct themselves if the picture had been interpreted wrongly (e.g., answers like Russian krysa “rat” instead of myš’ “mouse”) or if the answer did not produce the relevant phonetic feature (e.g., because the participant used the expected word in an inflected form instead of the base form). All items were recorded and later transcribed. If the participant was unable to name the depicted object, the test instructor named the item him/herself and asked the participant to repeat it. Of course, the latter case was coded as an error and excluded from the data analysis for our current study. Preference was given to short words (excluding compounds) which denote culturally unmarked concepts (cf. Bates et al., 2003; Glaser, 1992; Vaux & Cooper, 1999). As a matter of fact, only nouns were included in the sample, mostly concrete ones, but also some nouns that denote rather abstract entities (e.g., god “year” in the Russian sample). Since the underlying concept of this task was to check for language-specific phonetic features, the sample of items was different for the three languages. The German sample included 36 items, whereas the Russian and Polish versions were made up of 50 items each. Consequently, our PNT was shorter than average PNTs targeting lexical knowledge, which normally contain 60 to 100 items (Schmid, 2011, p. 142).

3.4.2 Semantic Mapping Task (SMT)

In order to assess word recognition skills among our heritage speakers, we employed an adapted version of a standardized subtest for checking vocabulary knowledge in German. This subtest is part of a larger screening set called CFT-20R (Weiß, 2007).
The original aim of the German test was to assess passive vocabulary knowledge of monolingual German teenagers and adolescents between the ages of 8 and 19. It includes not only basic vocabulary, but also less frequently used items which refer to central topics in young people’s lives. Each of the 30 target items is presented in a row together with five other items. The task for the participant is to find one item that semantically matches the target item as closely as possible (i.e., a synonym, a hyponym or an otherwise semantically related item). For example, the Polish target item bluzka “blouse” is presented together with the other assigned test items: koszula “shirt”; wiatr “wind”; garnitur “suit”; aparat “machine”; siła “power, strength”. For all items there is only one correct answer (in our case: koszula “shirt”). The total score of this task is thus the sum of target items that were correctly assigned to the semantically corresponding item. The task is performed in written form.⁷

A Russian version of this task also exists; it has already been applied in other studies on heritage speakers (see Böhmer, 2015, pp. 102 for details). The Russian version primarily consists of items which have been translated from the German version into Russian. It should be noted that the Russian version is not standardized. We used the adapted Russian version and added a Polish one. Furthermore, we changed the order of target items by sorting them according to their frequency. By doing so, we tried to prevent our participants from giving up the task at an early testing stage due to a lack of knowledge of low-frequency items.

Bernhard Brehmer, Tatjana Kurbangulova & Martin Winski 236

3.4.3 Translation Task (TT)

The vocabulary translation task was split up into two pieces, depending on the language of the stimuli words. In the first part, participants had to translate 50 German items into their heritage language.
The second part consisted of 50 words in the heritage language that the participants had to translate into German. The task was administered at the end of a German testing session, since for this task both languages have to be activated, i.e. the informants have to switch into a bilingual mode (cf. Soares & Grosjean, 1984). Each item was presented to the participants on a laptop screen and read out by the test instructor in case the participants were unable to read the items themselves. Again, reaction times were not measured and the participants could decide themselves when to introduce the next item. The oral answers were recorded and later transcribed.

⁷ If our Russian participants were unable to read the items due to lack of knowledge of the Cyrillic script, the test instructor would read all of the nouns aloud and the participant would name the noun he/she deemed semantically closest to the target noun.

Item selection was guided by several principles: first, we used a frequency-based measure, as usual in TT (cf. Hughes, 1989; Nation, 2007). Second, the selected items were intended to represent different semantic domains, different registers and different word classes (cf. Hirsch, 2010). Third, in order to make interlanguage comparisons possible, the test items needed to be identical for both Russian and Polish. These constraints raise a number of problems: language-specific polysemy as well as differing degrees of frequency cast doubt on the comparability of seemingly “equivalent” items in two or even more languages (cf. Bates et al., 2003). To tackle these problems, we decided to split the item selection process into two parts: approximately one third of the 100 test items (n=30) were selected exclusively on the basis of frequency measures.
We compiled lists of the most frequent items for all word classes in the three languages by using available dictionaries of word frequency for each language (Kurcz, 1990 for Polish; Ljaševskaja & Šarov, 2014 for Russian; Ruoff, 1981 for German).⁸ As a second step, mean frequency rankings across the three languages were calculated, again for all word classes separately. If, for example, a Polish verb ranked high on the Polish verb frequency list (e.g., in 4th position) and so did its equivalent on the Russian list (e.g., in 7th position), but the German equivalent was missing on the German verb frequency list altogether or occurred very low on it (lower than 99th position), the German verb was assigned a frequency measure of 100. Thus, in our example, the verb received a mean verb frequency value for all three languages of 37 ((4+7+100)/3). Calculating in this fashion, we established a list of the words that ranked highest on average across all three languages (i.e., those with the lowest mean rank values). These words were taken to form the first 30 test items for the German and the Russian/Polish task. All word classes were represented in this group according to their general share in the lexicon (i.e. nouns, verbs and adjectives were represented by more items than, e.g., prepositions).

The second group of items to be included (n=70) were selected in order to represent different semantic fields. The semantic fields were based on Tschirner’s (2010) compilation of words for German learners of Russian (job/education, leisure activities/travelling, etc.). These items included basic vocabulary as well as specialized vocabulary (cf. Fitzpatrick, 2007) and were again balanced according to word class. The 100 items on the resulting list were randomly ordered and assigned to one of the two language groups (i.e. German or heritage language as the language of stimulus). Participants scored one point for each item correctly translated into the other language.
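The mean-rank calculation used for the frequency-based items can be illustrated with a minimal sketch. The function names and the `RANK_CAP` constant are hypothetical labels introduced here; only the cap of 100 for missing or low-ranked words and the worked example (4, 7 and a missing German equivalent) come from the description above.

```python
from statistics import mean

# Assumption: ranks are 1-based list positions; the value 100 is assigned to a
# word that is missing from a list or ranked lower than 99th position.
RANK_CAP = 100


def capped_rank(rank):
    """Return the rank used for averaging; None means the word is not on the list."""
    if rank is None or rank > 99:
        return RANK_CAP
    return rank


def mean_frequency_rank(ranks):
    """Average the capped per-language ranks of one set of translation equivalents."""
    return mean(capped_rank(r) for r in ranks)


# Worked example from the text: Polish rank 4, Russian rank 7, German equivalent missing.
example = mean_frequency_rank([4, 7, None])
print(example)
```

A lower mean value corresponds to a word that is frequent in all three languages, so a single missing or rare equivalent pulls an item well down the combined list.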
‘Correct’ in this context refers to one possible target meaning of the item, irrespective of the grammatical form of the produced answer (cf. Hirsch, 2010). Participants scored half a point if the produced form was semantically correct, but did not reflect the expected word class (e.g. deverbal nouns like Russian čtenie “reading” as a response to the German infinitive verb form lesen “to read”).

⁸ The problem remains, however, that the dictionaries do not really provide comparable data: Ruoff (1981) establishes frequency measures based on (older) data for regional colloquial German, whereas Kurcz (1990) builds on data from written Polish. Ljaševskaja and Šarov (2014) used data from the Russian National Corpus.

3.4.4 Verbal Fluency Task (VFT)

We decided to use semantic (or category) fluency tasks (see section 2.2.3) only. The reasons for this decision were twofold. First, a semantic fluency task is much more natural, because it “resembles everyday production tasks, such as making a shopping list, so that participants can exploit existing links between related concepts (e.g., between the category label and the category members and among associated category members) to retrieve responses” (Shao et al., 2014, p. 2). With letter fluency tasks, participants are forced to “suppress the activation of semantically or associatively related words and must resort to novel retrieval strategies” (ibid.). This might be one of the reasons why, in studies on bilingual participants, semantic fluency tasks in particular revealed substantial differences between monolinguals and bilinguals, whereas both groups performed more similarly in letter fluency tasks (Gollan, Montoya & Werner, 2002). Second, it has been shown that “vocabulary knowledge and lexical access speed are somewhat more important determinants of category than of letter fluency” (ibid., p. 8). Letter fluency is more restricted by executive control abilities.
This fact is also supported by clinical and neuroimaging evidence (Shao et al., 2014, p. 2).

Another decision pertains to the selection of relevant semantic categories. As illustrated in section 2.2.3, most studies rely on categories such as animals, fruits or vegetables. Following the arguments put forward by Roberts and Le Dorze (1997), we decided to refrain from using the category animals because of its ambiguity, but we still retained the categories fruits and vegetables. Furthermore, our aim was also to check semantic fluency not only with regard to categories that would trigger nouns (as is done in most of the relevant research) but also with regard to categories that would trigger other word classes. With that goal in mind, we additionally selected the categories colors and human properties (which were supposed to deliver adjectives as responses) as well as verbs of motion and verbs denoting human activities at home as a trigger for eliciting verbs. Thus, we ended up with six trials.

For testing, we decided to split the VFT into two parts which were distributed among two different sessions. In the first session, the participants had to name words related to the categories fruits, colors and verbs of motion, whereas the second session included the trials on vegetables, human properties and verbs denoting human activities at home. At the beginning of each trial, the test instructor gave each participant the category label and also one category member as a demonstration, e.g., the Russian category label cveta “colors” and category member krasnyj “red”. For the category label verbs of motion, two category members were given: one simple verb (e.g., Russian idti “to go”) and a prefixed verb (e.g., perejti “to go over, cross”). This was done in order to explicitly hint at using a “clustering” strategy for scoring points. The given time frame was 60 seconds for each category.
The answers were recorded and later transcribed. For scoring, we considered the total number of correct words for each semantic category and for each participant, and then calculated a mean category score.

When analyzing the data, however, a problem arose concerning how to exactly determine the number of correct responses. As usual in research involving VFT, repetitions of the same word were excluded from the count. In our case, repetitions also included: (i) attributive modifications of the same word that did not result in a new category member (e.g., German gelber Paprika “yellow pepper”, roter Paprika “red pepper”), (ii) naming of the same word in different grammatical forms (e.g., Polish biegać “to run INF”, biegałem “(I) ran 1PS-SG.PAST-MASC”) and (iii) the addition of analytic negation markers (e.g., German klug “clever”, nicht klug “not clever”). However, if negation was expressed by prefixation (e.g., Russian sčastlivyj “happy” and nesčastlivyj “unhappy”) the answers were scored separately. The same applies to cases where the participants named hypernyms together with corresponding hyponyms (e.g., German Kohl “cabbage” and compounds denoting different varieties of cabbage, such as Blumenkohl “cauliflower”, Rosenkohl “Brussels sprouts”, etc.). If the participants used synonyms for the same category member (e.g., Russian tomat and pomidor “tomato”), these were also counted separately. Word fragments, neologisms, nonsense words and words that contained morphological or phonetic errors were, however, excluded from the final score (e.g., Russian *banana instead of banan “banana”, *plyvat’ instead of plyt’ “to swim”, Polish *pełzgać instead of pełzać “to creep” or German *untergebildet instead of ungebildet “uneducated”).
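The counting rules described so far can be sketched as follows. This is an illustration only: the `(lemma, is_error)` data layout and the function names are hypothetical, and the sketch presupposes that a human coder has already reduced each response to a lemma (so that inflected forms, attributive modifications and analytic negation collapse onto the same entry) and has flagged malformed responses.

```python
from statistics import mean


def count_correct(responses):
    """Count unique correct category members in one trial.

    `responses` is a list of (lemma, is_error) pairs prepared by the coder.
    Repetitions share a lemma and are counted once; flagged errors are skipped.
    """
    seen = set()
    for lemma, is_error in responses:
        if not is_error:
            seen.add(lemma)
    return len(seen)


def mean_category_score(trials):
    """Mean number of correct words across the trials of one participant."""
    return mean(count_correct(t) for t in trials)


# Illustrative German trial for "human properties", built from the examples in the text.
trial = [
    ("klug", False),           # "klug"
    ("klug", False),           # "nicht klug": analytic negation, counts as a repetition
    ("untergebildet", True),   # malformed word, excluded
    ("ungebildet", False),     # prefixed form, counted as a separate word
]
print(count_correct(trial))
```

Responses that are scored separately under the rules above (prefixal negation, hyponyms, synonyms) simply receive distinct lemmas in this layout.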
Furthermore, there were two additional error categories: (i) words that did not belong to the target language (e.g., nonce borrowings like Polish sprintować “to sprint” or loan translations like Russian delat’ krovat’, which replicates German das Bett machen and takes the place of the correct Russian equivalent stelit’ “to make up the bed”), and (ii) semantic intrusions representing not the category under focus or the expected word class (e.g., German Zitrone “lemon” as a response to the category label vegetables or the Russian noun durak “fool” in the trial targeting adjectives that denote human properties). Responses that fit the semantic category label from an everyday, but not from a strict scientific, point of view (like, e.g., Russian kukuruza “corn” as a response to the category label vegetables) were, however, included in the final score. To assure consistency in calculating the scores for each individual, the coding was done by one coder only.

4 Results and statistical analysis

4.1 Results of individual tasks

The Picture Naming Task (PNT) included a different number of target items for the heritage languages Russian and Polish (n=50) than it did for German (n=36). On average, the Polish group performed slightly worse (average score: 39.9, SD=8.8) than the Russian group (average score: 42.5, SD=6.2) in their heritage language. However, due to the phonetic focus of this task, different language-specific target items were selected (see section 3.4.1), which renders comparisons across language groups impossible (see section 4.2). In order to analyze the results on the level of individual participants, we defined clusters according to the share of correct responses for every participant. These clusters are meant to illustrate a categorization which is not content-based, but simply reflects numerical values, i.e. the performance scores that our participants achieved in the different tests.
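As a rough sketch of this purely numerical clustering, the function below maps a correctness rate (in percent) onto ten-point bands, with the lowest cluster left open-ended. The function name and the `n_clusters` parameter are hypothetical; the band boundaries (100-90, 89-80, and so on) follow the cluster tables in this section, and the default of six clusters matches the PNT clustering.

```python
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def cluster_for(rate_percent, n_clusters=6):
    """Return the cluster label for a correctness rate given in percent.

    Cluster I covers 90-100, cluster II 80-89, etc.; everything below the
    last band falls into the final, open-ended cluster.
    """
    if not 0 <= rate_percent <= 100:
        raise ValueError("rate must lie between 0 and 100")
    index = max(0, 9 - int(rate_percent // 10))  # 100 and 90-99 both map to 0
    index = min(index, n_clusters - 1)           # open-ended lowest cluster
    return ROMAN[index]


# Rates taken from the PNT clustering: 100 -> I, 89 -> II, 42 -> VI
for rate in (100, 89, 42):
    print(rate, cluster_for(rate))
```

Because the cut-off points are fixed, two participants who differ by a single percentage point can end up in different clusters, a limitation the comparison of results returns to.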
Table 1 gives an overview of such a clustering of participants for the PNT. Since we are focusing here on performance in the heritage language, the results for German will be neglected.⁹ As table 1 shows, almost half of the Russian group (n=9, 45%) falls into the cluster that received the highest scores (≥ 90% of the maximum score) in this task, whereas in the Polish group, the same class has seven participants (35%). The next cluster (80-89%) consists of five Russian (25%) and seven Polish (35%) participants. Six participants in each group (30%) scored 79% or less.

⁹ In section 4.2, the results for German on the group level will be listed for each task and language group separately to allow for comparisons regarding language dominance.

Cluster  Correctness rate (CR) (in %)  Group    Participants
I        100-90                        Russian  H06 (100)¹, H01 (96), H09 (96), H10 (96), L05 (96), L10 (95), L08 (93), H16 (91), L06 (91)
                                       Polish   B09 (96), H02 (94), H04 (94), H08 (94), H06 (92), H09 (90), H10 (90)
II       89-80                         Russian  L04 (89), H04 (86), H12 (86), H13 (84), L09 (84)
                                       Polish   B04 (88), B08 (88), H11 (88), H12 (88), B01 (86), H05 (86), H07 (84)
III      79-70                         Russian  L11 (76), L07 (75), L01 (71)
                                       Polish   B02 (72)
IV       69-60                         Russian  H08 (69), H15 (64)
                                       Polish   B07 (64), B06 (62)
V        59-50                         Russian  H11 (53)
                                       Polish   B10 (58)
VI       ≤ 49                          Russian  -
                                       Polish   B05 (42), B11 (38)

Table 1: Clustering of participants according to individual performance in the PNT (heritage language only)

On average, both groups showed lower correctness scores for their heritage language in the Semantic Mapping Task (SMT) than in the PNT. The Polish group performed slightly better (average score: 19.9, SD=5.8) than the Russian group (17.3, SD=5.9) in this task consisting of 30 items. On the individual level, the clusters are distributed as depicted in table 2. If compared to the PNT, the individuals are much more evenly distributed among the different clusters in this task. This holds for both language groups, with the Russian group having one individual who considerably lags behind the others (H11).
The Polish group shows a higher concentration of individuals in the cluster with the second highest ranking (80-89% of the maximum score).

¹ The abbreviations indicate the place of residence (H = Hamburg, L = Leipzig, B = Berlin) and ID of each participant. In round brackets, the relative correctness scores for each participant are given.

Since the Translation Task (TT) involves testing both languages simultaneously, both directions of translating the given items (n=50 for each direction) will be considered separately. The results for both groups are almost identical. Translating items from the heritage language into German is the easier task: the average scores are nearly the same for both groups (Russian group: 36.6, SD=4.6; Polish group: 36.2, SD=7.9). Variation, however, is higher for the Polish than for the Russian group, as indicated by different values for standard deviation (SD). If we look at the other direction of translation, we see again that both groups perform similarly, although average correctness scores are lower (Russian group: 30.2, SD=7.2; Polish group: 31.9, SD=6.8). On the individual level, the distribution as given in table 3 can be observed.
Cluster  CR (in %)  Group    Participants
I        100-90     Russian  H10 (100)
                    Polish   H10 (93)
II       89-80      Russian  H06 (83), L08 (83)
                    Polish   B09 (87), H09 (87), H11 (83), H04 (83), H12 (83), B08 (80), H08 (80)
III      79-70      Russian  L09 (73), L10 (73)
                    Polish   B10 (73), H06 (73)
IV       69-60      Russian  L06 (67), L11 (67), L05 (63), H01 (60), H13 (60)
                    Polish   B02 (67), H02 (67), B01 (63), H05 (63)
V        59-50      Russian  H15 (57), L04 (53), H16 (50)
                    Polish   B06 (53), B05 (50)
VI       49-40      Russian  H04 (47), L07 (47), H09 (40), L01 (40)
                    Polish   B07 (40), B11 (40)
VII      39-30      Russian  H12 (37), H08 (33)
                    Polish   H07 (33), B04 (30)
VIII     ≤ 29       Russian  H11 (17)
                    Polish   -

Table 2: Clustering of participants according to individual performance in the SMT (heritage language only)

The comparison of both directions of translation within the two groups reveals that, in the Russian group, more participants fall into higher clusters if they are asked to translate items from Russian into German (cf. three participants in cluster II or nine participants in cluster III) than from German into their heritage language (cf. only one participant in cluster II and three in cluster III). Consequently, cluster IV contains most participants for the direction German>Russian (n=8, 40%), whereas for the direction Russian>German most individuals are found in cluster III (n=9, 45%). In the Polish group, each cluster receives a rather similar number of participants if both directions of translation are compared to each other. For a comparison of the individual rankings of each participant depending on the task, see section 4.2 below.

Direction German > heritage language

Cl.  CR (in %)  Direction         Participants
I    ≥ 80       German > Russian  H10 (86), L05 (86), H06 (84), L06 (80)
                German > Polish   B09 (94)
II   79-70      German > Russian  L04 (72)
                German > Polish   H02 (78), H09 (78), B08 (74), H12 (72), H11 (70)
III  69-60      German > Russian  L08 (68), H01 (65), H09 (62)
                German > Polish   H10 (69), H06 (67), H04 (66), H05 (66), H08 (66), B02 (66)
IV   59-50      German > Russian  H12 (56), H13 (56), H16 (54), L11 (54), L07 (52), L10 (52), L09 (51), L01 (50)
                German > Polish   B04 (59), B06 (59), B01 (58), B07 (54), B10 (54)
V    49-40      German > Russian  H04 (48), H15 (44), H08 (44), H11 (42)
                German > Polish   H07 (47), B11 (45)
VI   <40        German > Russian  -
                German > Polish   B05 (32)

Direction heritage language > German

Cl.  CR (in %)  Direction         Participants
I    ≥ 90       Russian > German  H10 (94)
                Polish > German   B09 (98), H06 (92), H10 (90)
II   89-80      Russian > German  L06 (87), L04 (84), L05 (80)
                Polish > German   H09 (86), H08 (84), H11 (80)
III  79-70      Russian > German  L01 (78), L11 (77), H09 (76), H16 (75), L10 (75), H15 (74), H13 (74), H06 (73), H12 (72)
                Polish > German   H04 (78), B01 (76), B08 (76), B02 (74), H02 (74), B10 (71), H05 (70)
IV   69-60      Russian > German  L09 (69), H01 (67), H08 (66), L08 (64), H04 (63), L07 (62)
                Polish > German   H12 (67), B07 (66), B11 (65), H07 (61)
V    59-50      Russian > German  H11 (54)
                Polish > German   B04 (58), B06 (54)
VI   <50        Russian > German  -
                Polish > German   B05 (26)

Table 3: Clustering of participants according to individual performance in the TT (both directions)

The results of the Verbal Fluency Task (VFT) show that the highest amount of unique correct words in the heritage language was achieved in the semantic categories that triggered adjectives (colors and human properties) (cf. table 4). In particular, the category colors turned out to be very productive for our participants (average number of produced items: 13 for both groups).² Given the fact that category size has been identified as a predictor for productivity in VFTs (cf. Snodgrass & Tsivkin, 1995), we would have expected the category human properties to be more productive than colors, since we did not restrict our participants to a certain type of property (good vs. bad properties, physical vs. psychological properties, etc.).³ The average scores for this category lag behind the ones we obtained for colors, but once again: the scores are very similar for both language groups.
Considering performance in the other semantic categories, there are some subtle differences between the Russian and the Polish group. For the Russian group, both categories that aimed at triggering verbs yielded higher numbers of appropriate responses than the two categories which elicited concrete nouns. In the Polish group, the results show approximately the reverse picture. The trial on verbs denoting human activities at home, however, produced slightly more correct responses on average than the category vegetables.

Since no standardized values are available for monolingual Russian and Polish participants representing the same age group, we cannot judge how far the participants in our study approximate a monolingual baseline. But since the main aim of our study is to compare the participants with each other in order to place them on a continuum between acrolectal and basilectal heritage speakers, we will not discuss the possible effects of bilingualism on the received scores here.

Since there is no fixed maximum score to be achieved in the VFT, we decided to select the individual that obtained the highest score as a benchmark for comparing the results of the other participants. It must be said that, in the Polish group, there was one participant (B09) that clearly outperformed all the others. Therefore, in order to avoid outlier effects, we decided to take the results of the participant who scored second on the list (H12) to represent the benchmark for the Polish group. The clustering shown in table 5 is based on the total number of correct words that were produced by our participants when all trials were taken together (cf. last column in table 4, for Polish n=91).

² We can only speculate about the reasons for the high degree of productivity regarding the category color in our data. It might be that this category plays a very important role for the investigated age group, which is maybe due to its link with fashion or the arts. Gender-related issues might also be relevant here, since our sample, at least in the Russian group, contains more female participants.

³ For reasons of space we cannot elaborate on qualitative aspects of the responses given by our participants for each category. This will be the topic of a separate study.

                      Fruits  Vegetables  Colors  Human properties  Motion verbs  Activities at home
Polish group (n=20)
  x̄                   9.75    7.15        13.05   10.25             6.7           7.85
  SD                  4.29    2.99        3.93    3.65              4.1           4.65
  Min                 2       1           7       4                 1             1
  Max                 22      13          24      18                15            18
Russian group (n=20)
  x̄                   7.65    6.65        12.7    10.4              8.3           9.4
  SD                  2.17    2.01        3.44    4.24              3.42          3.43
  Min                 5       3           7       4                 3             3
  Max                 12      10          20      21                18            17

Table 4: Average score of correct words for each category in the VFT

The results show that there are more participants in the Russian group who fall into the first four clusters (n=16, 80%, Polish: n=8, 40%), whereas most of the Polish participants (n=12, 60%) produced only 59% or less of the total sum of correct examples that were named by the (second) most successful Polish participant (H12) in this task.

Cluster  Rate in relation to the maximum score (%)  Group    Participants
I        100-90                                     Russian  H10 (100), H06 (94)
                                                    Polish   B09 (100), H12 (100)
II       89-80                                      Russian  L08 (86), L11 (84), H09 (84), L04 (82)
                                                    Polish   H11 (82)
III      79-70                                      Russian  H12 (76), H04 (72), H13 (71), L05 (70)
                                                    Polish   H06 (76), B01 (75)
IV       69-60                                      Russian  H01 (68), L06 (68), L07 (68), L10 (63), L09 (62), H16 (60)
                                                    Polish   H04 (68), H07 (65), B08 (64)
V        59-50                                      Russian  L01 (56), H15 (51)
                                                    Polish   H08 (58), B04 (55), H09 (54), H10 (54), B02 (53), H05 (53), H02 (52)
VI       49-40                                      Russian  H11 (43)
                                                    Polish   B07 (44)
VII      39-30                                      Russian  H08 (38)
                                                    Polish   B10 (35), B11 (35), B06 (34), B05 (31)

Table 5: Clustering of participants according to individual performance in the VFT (heritage language only)

4.2 Comparison of results

Table 6 summarizes the mean scores (x̄), standard deviations (SD), and minimum (Min) and maximum (Max) scores achieved in the four tasks on the group level. We included here the scores that the participants received in the German versions of the tasks as well. The first two rows depict the results for the Russian and Polish groups separately; the third row combines the results for both heritage language groups. As was stated earlier, there is no joint score for the Picture Naming Task because different language-specific items were selected (see section 3.4.1). For the Verbal Fluency Task we included only the average scores of our participants regarding their performance in two semantic category labels, namely fruits and vegetables, as these two categories are more regularly used in VFTs if compared to the other categories that were tested in our study.

As can be seen from the results, our participants always obtained higher scores in the German tests than in the corresponding versions for the heritage language. For the tests that have a fixed maximum score, the difference between the average scores reached in the German and heritage language versions remains rather stable across all test types: PNT (G): 97.9% of the maximum score, PNT (HL): 79.7% (Polish) to 84.9% (Russian); SMT (G): 74.6%, SMT (HL): 62%; TT (HL>G): 72.8%, TT (G>HL): 62%. Accuracy scores are especially high in the PNT, which could be explained by the selection of items according to phonetic, but not lexical criteria. This resulted in a higher number of items from basic vocabulary if compared to the other tasks. Thus, the consistent picture emerges that German represents the dominant language for both groups of heritage speakers, at least on the lexical level. Second, performance in the heritage language does not substantially differ between the two investigated groups.
Differences are most visible in the Semantic Mapping Task where, on average, the Polish group achieved higher scores than the Russian group.⁴ But even here the difference between the two groups is not statistically significant (t=1.45, df=38, p=0.155).

In order to examine whether there is a correlation between the results obtained from the different vocabulary tests, the participants’ performance on each test was compared. For this purpose, data were analyzed for possible correlations using Pearson’s r. Table 7 reveals that, for the German data, the only results with a strong correlation were from the tasks testing translation of items from the heritage languages into German and vice versa. The results of the other vocabulary tasks on German show no statistically significant correlation, because there are significant differences between the results obtained from these tasks. There are, however, significant correlations between all four tests targeting linguistic proficiency in the heritage languages (see table 8). Strong correlations can be observed between the Translation Task (from German into the heritage language) and the other three tests on Russian/Polish. The results obtained from the other tests show only a medium correlation.⁵

⁴ Differences in the PNT are not meaningful in this respect, since the selected items differed for both language groups (see section 3.4.1).
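The reported group comparison for the Semantic Mapping Task can be checked against the group-level summary statistics (Polish: mean 19.95, SD 5.82; Russian: mean 17.25, SD 5.95; n=20 each, as listed in table 6). The sketch below assumes an independent-samples t-test with pooled variance, which is consistent with df=38 but is not stated explicitly in the text; the function name is hypothetical, and the p-value is not recomputed here because that would require the t distribution. The last lines show how much shared variance the strongest correlation reported in this section (r = .694) implies.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t statistic and degrees of freedom from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# SMT in the heritage language: Polish group vs. Russian group
t_value, df = t_from_summary(19.95, 5.82, 20, 17.25, 5.95, 20)
print(round(t_value, 2), df)

# Share of variance accounted for by the strongest correlation (r = .694)
r_squared = 0.694 ** 2
print(round(r_squared, 2))
```

Under these assumptions the sketch reproduces the reported t value; a shared variance of roughly one half is a reminder that even strongly correlated tasks are far from interchangeable.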
⁵ However, as one of our reviewers correctly pointed out, even in the case of our strongest correlation, this test accounts for only half of the variation of the other test (r² = .48).

                      PNT HL         PNT G          SMT HL         SMT G          TT G>HL        TT HL>G        VFT HL         VFT G
                      (MS 50)        (MS 36)        (MS 30)        (MS 30)        (MS 50)        (MS 50)
Polish group (n=20)
  x̄                   39.85 (79.7%)  35.2 (97.8%)   19.95 (66.5%)  22.85 (76.2%)  31.9 (63.8%)   36.2 (72.4%)   8.45 (48.3%)   10.65 (66.6%)
  SD                  8.79           1.15           5.82           3.3            6.79           7.91           3.64           2.94
  Min                 19             32             9              15             16             13             1.5            6
  Max                 48             36             28             29             47             49             17.5           16
Russian group (n=20)
  x̄                   42.45 (84.9%)  35.3 (98.1%)   17.25 (57.5%)  21.9 (73%)     30.15 (60.3%)  36.6 (73.2%)   7.15 (65%)     10.93 (68.3%)
  SD                  6.19           0.8            5.95           3.06           7.21           4.63           2.09           2.74
  Min                 27             33             5              16             21             27             4              5
  Max                 50             36             30             27             43             47             11             16
Σ HL group (n=40)
  x̄                   -              35.25 (97.9%)  18.6 (62%)     22.38 (74.6%)  31 (62%)       36.38 (72.8%)  7.8 (54.7%)    10.79 (67.4%)
  SD                  -              0.98           5.96           3.18           6.97           6.4            2.87           2.81
  Min                 -              32             5              15             16             13             2.75           5
  Max                 -              36             30             29             47             49             14.25          16

Table 6: Comparison of results for all vocabulary tests (PNT=Picture Naming Task; SMT=Semantic Mapping Task; TT=Translation Task; VFT=Verbal Fluency Task; MS=maximum score for each task; G=German; HL=heritage language)

          VFT G      PNT G      SMT G      TT G>HL        TT HL>G
VFT G     -          .294 n.s.  .291 n.s.  .120 n.s.      .194 n.s.
PNT G     .294 n.s.  -          .216 n.s.  .004 n.s.      .017 n.s.
SMT G     .291 n.s.  .216 n.s.  -          .024 n.s.      .117 n.s.
TT G>HL   .120 n.s.  .004 n.s.  .024 n.s.  -              .694 (p<.001)
TT HL>G   .194 n.s.  .017 n.s.  .117 n.s.  .694 (p<.001)  -

Table 7: Correlations (Pearson’s r) between test scores (German tests)

          VFT HL         SMT HL         TT G>HL        TT HL>G
VFT HL    -              .464 (p<.005)  .621 (p<.001)  .518 (p<.001)
SMT HL    .464 (p<.005)  -              .680 (p<.001)  .577 (p<.001)
TT G>HL   .621 (p<.001)  .680 (p<.001)  -              .694 (p<.001)
TT HL>G   .518 (p<.001)  .577 (p<.001)  .694 (p<.001)  -

Table 8: Correlations (Pearson’s r) between test scores (heritage language tests)

Table 9 summarizes for each participant which cluster he/she belongs to in every individual task. In section 4.1 we applied a rather fine-grained clustering where the cut-off points between the individual clusters were based on a regular 10% scale. This approach rather artificially separates participants who could legitimately belong to one cluster or the other, depending on just a few points. To reduce the impact of such arbitrary cut-off points, we considered only instances where participants fall into very different clusters if the cluster positions are compared between the tasks. In table 9 a deviation of two or more cluster positions from the average cluster position is marked in bold type for each participant. Bold type is also used to indicate participants who lack a dominant cluster position when all tasks are taken into account. Round brackets following the IDs of our participants show the average cluster score. Out of 40 participants included in our study, there are only two participants who lack a dominant cluster position when all tasks are compared with each other (H04 in the Russian group, H02 in the Polish sample).
Two participants belong to the same cluster in all five tasks (H10/Russian and H11/Polish), two show the same cluster positioning in four tasks (Russian sample: L04, Polish sample: B09), 14 fall into the same cluster for three tasks (Russian group: H01, H06, H11, H15, L07, L09; Polish group: B02, B05, B07, B08, H06, H09, H10, H12) whereas all other participants (eleven in the Russian group, nine in the Polish sample) represent the same cluster in two tasks. In most cases, the latter participants have at least one further task where they represent a cluster that is adjacent to their dominant cluster position.

Table 9 also reveals that there are rather few participants who can be presumably classified as acrolectal speakers of their heritage language, at least with regard to lexical proficiency.⁶ These are H10 and (to an already lesser extent) H06 for the Russian group, who reach the highest lexical scores and perform on an equally high level in all tasks. In the Polish sample, this type of heritage speaker is represented by just one participant (B09). The lower end of the proficiency scale (“basilectal speakers” in Polinsky and Kagan’s terms) is made up of rather few participants as well: H11 and H08 in the Russian group, B05, B11, B06 and B07 in the Polish sample. All the other participants comprise a wide range of what Polinsky and Kagan (2007) term “mesolectal speakers” of their heritage language since they occupy an intermediate position on this continuum of lexical proficiency, as evidenced by their performance in the different tasks.

⁶ As we have no monolingual data to compare (yet), we are well aware that the decision concerning which speakers approximate the monolingual baseline (cf. the definition of acrolectal heritage speakers in section 1) cannot be made with certainty (especially in tasks like the VFT). However, correctness scores between 90 and 100% can hardly be outperformed by monolingual controls.
The Semantic Mapping Task (SMT) seems to be most problematic for the Russian heritage speaker group in this respect, since half of the participants in the sample (n=10) show considerable deviations in their cluster position compared to the results of the other four tasks. They clearly perform worse on this task, which indicates that testing this kind of lexical knowledge (lexical depth) alone would provide a distorted picture of their overall lexical proficiency in the heritage language. Interestingly, the Polish group differs in this respect, although intuitively we would not expect language-specific effects to be connected to single tasks. We cannot offer a convincing explanation for this difference at the current stage of the data analysis process. For the Polish group, verbal fluency (VFT) seems to be the most difficult task: even if we do not take into account the participant who shows no clear dominant cluster position, there are still ten other participants who achieve considerably worse results in this task than in the others. The Picture Naming Task (PNT), by contrast, triggers results which are generally above the level of scores achieved in the other tasks. This holds for both language groups of our study. The main reason for this kind of deviation is certainly connected to the process of item selection, since this task was designed to target other types of knowledge than just lexical proficiency (the PNT primarily targets pronunciation). Thus, the Translation Tasks (TT) emerged as the most convenient and representative measure of (general) lexical proficiency in the heritage language for both groups in our study. This is also confirmed by the strong correlation between the results obtained from the TT (especially translations from the heritage language into German) and the results obtained from the other tasks.
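For readers who want to compute correlation matrices like those in Tables 7 and 8 for their own data, the following is a minimal sketch of Pearson's r between task scores. The scores are made-up values for six hypothetical participants, not data from this study, and significance testing (the p-values reported in the tables) is deliberately left out.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariation / (spread_x * spread_y)


def correlation_matrix(scores):
    """Pairwise correlations between all tasks in a {task: scores} mapping."""
    tasks = list(scores)
    return {a: {b: pearson_r(scores[a], scores[b]) for b in tasks} for a in tasks}


# Made-up percentage scores for six hypothetical participants.
hypothetical_scores = {
    "VFT HL": [40, 55, 62, 48, 70, 35],
    "SMT HL": [50, 60, 58, 45, 75, 42],
    "TT HL>G": [65, 72, 80, 60, 90, 55],
}
matrix = correlation_matrix(hypothetical_scores)
for task, row in matrix.items():
    print(task, {other: round(r, 3) for other, r in row.items()})
```

The matrix is symmetric with values of 1 on the diagonal, which is why the tables above show a dash there.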
Conclusion

Lexical proficiency has been identified as a useful general diagnostic category in assessing the degree of heritage language maintenance in different populations of heritage speakers. Previous research on heritage languages, including Slavic heritage languages, has applied different psycholinguistic experimental methods to determine the lexical proficiency of heritage speakers in their (supposedly) weaker language. Due to limitations of time and resources, researchers are normally only able to use one method of testing lexical knowledge per study. Examples of lexical proficiency tests include picture naming tasks, semantic mapping tasks, translation tasks and verbal fluency tasks. In applying these different types of psycholinguistic tasks individually, it remains unclear to what extent task-related effects obscure the results. After all, all of these commonly used methods investigate different components of lexical knowledge: active vs. passive vocabulary size, lexical breadth, lexical depth and lexical fluency. Furthermore, other general cognitive factors (such as executive control) exert an influence on the test results. To tackle these issues, we used different methods for testing lexical proficiency in our group of 20 heritage speakers of Russian and 20 heritage speakers of Polish.
The test battery included: (1) a picture naming task which was originally meant to target other domains of heritage language maintenance (phonetic skills); (2) a semantic mapping task where participants were asked to find one item that semantically matched the target item as closely as possible; (3) two translation tasks that contained words taken from different word classes which were selected according to frequency criteria and semantic domains; and (4) six verbal fluency tasks which included different semantic category labels that aimed at triggering nouns (vegetables, fruits), adjectives (colors, human properties) and verbs (motion verbs and verbs denoting human activities at home). Our main aim was to investigate whether exposure to different tests would result in identical participant rankings regarding lexical proficiency in the heritage language, and whether the two examined language groups would behave differently in these tasks. The results reveal that, for both groups, German is the dominant language. In all test types, participants achieved higher scores in the German versions than in the versions that targeted their heritage language. However, most of the results obtained in the different German tests did not show a significant correlation. This missing correlation suggests that the choice of test type for assessing lexical proficiency in the dominant language is a crucial problem. Fortunately, for the tests that targeted heritage language proficiency we did find a positive correlation between the results obtained from the different tasks. In particular, the translation tasks turned out to render the degree of variation quite consistently with regard to lexical knowledge in the heritage language. Their results were in accordance with the average results obtained by the other tasks. 
Language-specific effects were observed in the types of tasks that seemed to be most difficult for the investigated speakers: for the Russian group, the semantic mapping task posed the most severe problems, whereas in the Polish group the verbal fluency tasks generated scores that were lower than in the other tasks. Thus, given a limited amount of time, translation tasks seem to be the best solution for reliably assessing lexical proficiency in the heritage language. However, as was indicated by the task effects mentioned above, using multiple measures to investigate the same phenomenon (in our case: lexical proficiency in the heritage language) should be the optimal choice. This multiple-test approach makes a much finer-grained analysis possible. With more data at our disposal, it is easier to disentangle the effects caused by different types of lexical knowledge (such as lexical breadth or lexical fluency) and effects caused by general cognitive processes involved in lexical retrieval.

References

Ardila, A., Ostrosky-Solís, F., & Bernal, B. (2006). Cognitive testing toward the future: The example of Semantic Verbal Fluency (ANIMALS). International Journal of Psychology, 41, 324-332.
Bates, E., D’Amico, S., Jacobsen, T., Székely, A., Andonova, E., Devescovi, A., & Tzeng, O. (2003). Timed picture naming in seven languages. Psychonomic Bulletin & Review, 10(2), 344-380.
Benmamoun, E., Montrul, S., & Polinsky, M. (2013a). Heritage languages and their speakers: Opportunities and challenges for linguistics. Theoretical Linguistics, 39(3-4), 129-181.
Benmamoun, E., Montrul, S., & Polinsky, M. (2013b). Defining an “ideal” heritage speaker: Theoretical and methodological challenges. Reply to peer commentaries. Theoretical Linguistics, 39(3-4), 259-294.
Böhmer, J. (2015). Biliteralität. Eine Studie zu literaten Strukturen in Sprachproben von Jugendlichen im Deutschen und im Russischen. Münster, New York, NY: Waxmann.
Brehmer, B., & Mehlhorn, G. (2015). Russisch als Herkunftssprache in Deutschland. Ein holistischer Ansatz zur Erforschung des Potenzials von Herkunftssprachen. Zeitschrift für Fremdsprachenforschung, 26(1), 85-123.
Cameron, L. (2002). Measuring vocabulary size in English as an additional language. Language Teaching Research, 6, 145-173.
Cummins, J. (2005). A proposal for action: Strategies for recognizing heritage language competence as a learning resource within the mainstream classroom. The Modern Language Journal, 89, 585-592.
Daller, H., Milton, J., & Treffers-Daller, J. (2007). Editors’ introduction. In H. Daller, J. Milton, & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge (pp. 1-32). Cambridge: Cambridge University Press.
Dunn, L.M., & Dunn, L.M. (1997). Peabody Picture Vocabulary Test – III. Circle Pines, MN: American Guidance Service.
Fitzpatrick, T. (2007). Productive vocabulary tests and the search for concurrent validity. In H. Daller, J. Milton, & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge (pp. 116-132). Cambridge: Cambridge University Press.
Glaser, W.R. (1992). Picture naming. Cognition, 42, 61-105.
Gollan, T.H., Montoya, R.I., & Werner, G. (2002). Semantic and letter fluency in Spanish-English bilinguals. Neuropsychology, 16(4), 562-576.
Henriksen, B. (1999). Three dimensions of vocabulary development. Studies in Second Language Acquisition, 21, 303-317.
Hirsch, D. (2010). Researching vocabulary. In B. Paltridge, & A. Phakiti (Eds.), Continuum companion to research methods in applied linguistics (pp. 222-239). London: Continuum.
Hughes, A. (1989). Testing for language teachers. Glasgow: Bell & Brain.
Kurcz, I. (1990). Słownik frekwencyjny polszczyzny współczesnej. Kraków: PWN.
Lezak, M.D. (2004). Neuropsychological assessment (4th ed.). New York, NY: Oxford University Press.
Ljaševskaja, O.N., & Šarov, S.A. (2014, September 22nd). Častotnyj slovar’ sovremennogo russkogo jazyka. Retrieved from http://dict.ruslang.ru/freq.php
Luo, L., Luk, G., & Bialystok, E. (2010). Effect of language proficiency and executive control on verbal fluency performance in bilinguals. Cognition, 114, 29-41.
Meara, P. (1996). The dimensions of lexical competence. In G. Brown, K. Malmkjaer, & J. Williams (Eds.), Performance and competence in second language acquisition (pp. 35-53). Cambridge: Cambridge University Press.
Meara, P. (2005). Designing vocabulary tests for English, Spanish, and other languages. In C. Butler, S. Christopher, M.Á. Gómez González, & S.M. Doval-Suárez (Eds.), The dynamics of language use (pp. 271-285). Amsterdam: John Benjamins.
Nation, P. (2007). Fundamental issues in modelling and assessing vocabulary knowledge. In H. Daller, J. Milton, & J. Treffers-Daller (Eds.), Modelling and assessing vocabulary knowledge (pp. 35-44). Cambridge: Cambridge University Press.
Polinsky, M. (1997). American Russian: language loss meets language acquisition. In W. Brown, E. Dornisch, N. Kondrashova, & D. Zec (Eds.), Formal approaches to Slavic linguistics (pp. 370-407). Ann Arbor, MI: Michigan Slavic Publications.
Polinsky, M., & Kagan, O. (2007). Heritage languages: In the “wild” and in the classroom. Language and Linguistics Compass, 1, 368-395.
Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.
Roberts, P.M., & Le Dorze, G. (1997). Semantic organization, strategy use, and productivity in bilingual semantic verbal fluency. Brain and Language, 59, 412-449.
Rosselli, M., Ardila, A., Salvatierra, J., Marquez, M., Matos, L., & Weekes, V.A. (2002). A cross-linguistic comparison of verbal fluency tests. International Journal of Neuroscience, 112, 759-776.
Ruoff, A. (1981). Häufigkeitswörterbuch gesprochener Sprache. Tübingen: Niemeyer.
Schmid, M.S. (2011). Language attrition. Cambridge: Cambridge University Press.
Schmid, M.S., & Jarvis, S. (2014). Lexical access and lexical diversity in first language attrition. Bilingualism: Language and Cognition, 17, 729-748.
Shao, Z., Janse, E., Visser, K., & Meyer, A.S. (2014). What do verbal fluency tasks measure? Predictors of verbal fluency performance in older adults. Frontiers in Psychology, 5, 1-10. doi: 10.3389/fpsyg.2014.00772
Snodgrass, J.G. (1993). Translating versus picture naming: Similarities and differences. In R. Schreuder, & B. Weltens (Eds.), The bilingual lexicon (pp. 83-114). Amsterdam: John Benjamins.
Snodgrass, J.G., & Tsivkin, S. (1995). Organization of the bilingual lexicon: Categorical versus alphabetic cuing in Russian-English bilinguals. Journal of Psycholinguistic Research, 24(2), 145-163.
Soares, C., & Grosjean, F. (1984). Bilinguals in a monolingual and a bilingual speech mode: The effect on lexical access. Memory & Cognition, 12(4), 380-386.
Spreen, O., & Strauss, E. (1998). A compendium of neuropsychological tests (2nd ed.). New York, NY: Oxford University Press.
Swadesh, M. (1955). Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics, 21, 121-137.
Tschirner, E. (2010). Lextra - Russisch - Grund- und Aufbauwortschatz nach Themen: A1-B2 - Lernwörterbuch Grund- und Aufbauwortschatz. Berlin: Cornelsen.
Vaux, B., & Cooper, J. (1999). Introduction to linguistic field methods. München: Lincom.
Waring, R. (1997). A comparison of the receptive and productive vocabulary sizes of some second language learners. Immaculata: The Occasional Papers at Notre Dame Seishin University, 94-114.
Weiß, R.H. (2007). WS / ZF-R. Wortschatztest und Zahlenfolgentest - Revision. Ergänzungstests zum CFT-20R. Göttingen: Hogrefe.

Psycholinguistic aspects of Belarusian-Russian language contact.
An ERP study on code-switching between closely related languages

Jan Patrick Zeller, Gerd Hentschel & Esther Ruigendijk

Abstract: In this article, we discuss peculiarities of Belarusian-Russian language contact in Belarus and how psycholinguistic methods, more specifically the event-related potential (ERP) technique, can shed light on the mechanisms of processing code-switching (CS) between closely related languages, where CS is especially difficult to distinguish from code-mixing (CM). In spite of a vast interest in the phenomenon of bilinguals switching and mixing languages, its psycholinguistic nature is so far poorly understood. This is especially true for closely related languages. Moreover, psycholinguistic studies on bilingualism generally do not deal with the type of bilingualism that is present in today’s Belarus (but certainly not only there). The main topics addressed in the paper are theoretical issues of bilingualism and language categories, namely what the study of Belarusian and Russian can contribute to the psycholinguistic discussion of bilingualism, and the methodological difficulties that arise when we study Belarusian-Russian language contact using psycholinguistic methods. Although these issues might be of special importance in the case of Belarus, they are important for other instances of language contact as well.

1 Two languages in the mind

Terms like “code-mixing”, “code-switching”, “borrowing” or “interference” seem to presuppose the existence of two autonomous language systems as discrete entities within the bilingual mind. This so-called “monolingual bias”, that is, the conceptualization of the bilingual as a “double monolingual”, which influenced the investigation of bilingualism for decades, has been criticized for various reasons. Firstly, a clear assignment of linguistic elements to one of the two languages in question may be heuristically appropriate for a linguist.
But it does not need to correspond with the perception of the bilingual speakers. Their judgment as to whether a given linguistic item belongs to language A or B does not necessarily coincide with the judgment of the monolingual speaker and of the linguist (De Angelis, 2005). Moreover, especially for closely related languages, the question of what belongs to A or B is by no means an empirical question, but to a large degree a theoretical one (cf. Hentschel, 2014a, pp. 96-99). Secondly, the “languages” in bilingual talk are not the same as the languages an individual uses in monolingual talk. In particular with closely related languages, but not only there, compromise forms exist that belong to a new, third language system. In Kazakh-Russian CM, for example, speakers change the Russian gender- and number-marking system of modifiers even in cases where all overt morphology of the modifier and of the head of the NP stems from Russian. In the monolingual Russian variety of the same speakers, the Russian gender- and number-marking system remains intact (cf. Auer, 2007, pp. 333-335). Thirdly, the property of cognates to trigger CS (Clyne, 1987) shows their ambivalent linguistic affiliation. And finally, loan words and loan constructions can become a fixed and permanent part of a language. This mere fact implies that linguistic items can change their linguistic affiliation. The consequence of this is that most ‘codeswitching’ in corpora is not actual “codeswitching” in that literal sense (i.e. an intentional switch to the other language system), but rather a “monolingual” phenomenon: the selection of a newish variant (which happens to originate in the other language), at the expense of its original base language equivalent (Backus, 2009, pp. 315-316). The psycholinguistic nature of these transition phenomena has not been studied at all (as far as we are aware).
Although the bilingual mental lexicon is conceptualized as integrated and lexical access as non-selective, models of the bilingual mental lexicon (to name but a few: Dijkstra & van Heuven, 2002; Green, 1998; Kroll, Van Hell, Tokowicz & Green, 2010) suggest that any element can be attributed to one language only. The nature of this attribution is controversial. Green (1998) assumes a specific language tag, while Paradis (2004, p. 207) argues that the language-specific networks are a by-product of the simultaneous use of linguistic items. The subconscious process that leads the bilingual to the selection of the appropriate item is, in his view, the same as in a monolingual who chooses the contextually appropriate variant. On the syntactic level, syntactic interference and cross-linguistic priming are seen as evidence for the interaction of both languages. Hartsuiker, Pickering & Veltkamp (2004) propose an integrative model of bilingual language representation, with a shared lexicon and a shared syntax, arguing that verbs from different languages with the same syntactic characteristics (i.e. with respect to government) are connected with each other in the same way that verbs from the same language are.

However these models conceive of the separation or interaction and overlap of the languages in the bilingual mind, most studies deal with one of two prototypical constellations (cf. Matras, 2009, p. 63). The bilingual is either a child who grew up in a Western, urban and more or less monolingual environment and who was exposed to two languages because the native language of one (or both) of his/her parents differed from the surrounding language, or a second language learner, that is, a person who grew up monolingually and learned another language after some critical period.
In both cases, the two languages of the individual are normally standard languages with a high prestige and are not very closely related. They are differentiated functionally, by association either with one or several conversational partners or with certain situation types.

2 The type of “bilingualism” in Belarus

Both Belarusian and Russian are official languages in Belarus. In reality, Russian clearly dominates, while Belarusian, at least in its standard variety, is spoken only by a small number of people (cf. Hentschel & Kittel, 2011). However, Belarusian is present to some degree, not only in schools and media. Probably all Belarusians understand Belarusian and most of them are able to speak it to a certain degree (cf. Hentschel, Brüggemann, Geiger & Zeller, 2015, pp. 140-143). Belarusian-Russian bilingualism is nevertheless of a different kind than the above-mentioned “prototypical” types of bilingualism. Firstly, both languages are closely related and to a certain degree mutually intelligible. They are to a high degree structurally congruent, with practically no difference in the inventory of grammatical categories, and also a huge overlap in their morphonological representations. Secondly, as a consequence of the intensive language contact, large numbers of speakers in Belarus ‘mix’ elements and structures of both languages in their everyday speech. This Belarusian-Russian mixed speech (BRMS) is based partly on spontaneous mixing, partly on already stabilized patterns. As for the proportions of the two languages in BRMS, there is a clear hierarchy, ranging from phonic elements, where the Russian impact is weakest and Belarusian dominates, over inflectional endings, pronominal stems and function words to lexical stems (especially nouns) and discourse markers, where Russian clearly dominates (Hentschel, 2013a, 2014a, p. 117).
As a mass phenomenon, BRMS appeared after the Second World War, when masses of speakers of Belarusian dialects migrated into towns and cities where they had to adapt to the prestigious Russian (cf. Zaprudski, 2007). Contrary to the opinion of many Belarusian linguists, BRMS is by no means restricted to the uneducated but is characteristic of millions of people (cf. Hentschel & Kittel, 2011). Of course, some Belarusian scholars (e.g. Mečkovskaja, 2014) by definition restrict the scope of “Trasyanka” 1, as BRMS is widely called by laymen and scholars in the field, to the mixed speech of people who are unable to converse in an “acceptable way” in at least one of the donor languages. Hentschel (2013b, p. 72; 2014b, pp. 13-14) has criticised this point of view for two reasons. First, it is circular. Inhabitants of Belarus who have no acceptable competence in at least the Russian language belong without any doubt to the uneducated. So the subpopulations in Belarusian society for which the attributes “uneducated” and “unable to speak acceptable Russian (and/or Belarusian)” are true are extensionally the same. Consequently, the statement that Trasyanka is the speech of the uneducated has no analytical value. It is true, but only for the uneducated and thus trivial, because lack of education is in a camouflaged way part of the restrictive definition. Second, the mentioned definitional restriction blurs the scope of BRMS in Belarusian society. There is so far no evidence for a principal structural difference between the BRMS of the uneducated and the BRMS of the (more) educated, apart from the fact that the frequency of Russian elements in the BRMS of the latter group, due to their permanent exposure to Russian speech, may be somewhat higher (cf. Hentschel & Zeller, 2012, 2014), at least for certain topics. 2 Of course, there are always peculiarities in the speech of uneducated people, and of course, educated people may choose to avoid using a “low variety” for ideological reasons, especially when it is heavily stigmatised like the “Trasyanka” in Belarus. Casual (“uncultivated”) speech, the domain of BRMS in Belarus, and formal (“cultivated”) speech in standard languages are, as a matter of fact, often linked with social class differentiation, in that the former is more frequent with lower, uneducated classes and the latter with higher, more educated ones (the so-called Bell’s Style Axiom, cf. Schilling-Estes, 2007, pp. 384 ff.). 3

1 This term has a severe negative connotation. It suggests that Trasyanka is a way of speaking of simple, uncultivated and uneducated peasants who were driven to towns and cities by fate. In so far the term is a tool for stigmatizing mixed Belarusian-Russian speech (cf. Brüggemann’s discussion, 2014, pp. 162-163, of the conception of Trasyanka as a collective mental illness held by Zenon Paznjak, a former leader of the national(istic) opposition to president Lukašenka in Belarus).
2 Of course, there is a certain amount of topic variation in BRMS. The more sophisticated the topic, the higher the probability of Russian traits (especially lexical elements) showing up in speech. But topic variation is usually seen as weaker compared to variation depending on interlocutor or situation (cf. Schilling-Estes, 2007, p. 385).

Hentschel (2013a, pp. 60-62) points out that for many Belarusians, in particular the children of the village-town-migrants, BRMS was the first “language” of acquisition. These children did not grow up Belarusian-Russian bilingual in the very first years of their lives, but mixed-monolingual, that is, without a clear functional separation between both languages (cf. as well Suprun & Klimenka, 1982, p. 86, and Vyhonnaja, 1996, p. 10).
Some Belarusian scholars (such as the ones just mentioned) are aware of the fact that BRMS is spoken by many children. But they do not discuss its acquisition. In so far they ignore that acquisition of a language (whether we call it a language or not) always means systematization of the linguistic input by these children, even when the degree of free variation, here mainly of Belarusian or Russian based elements, is extremely high. They develop, so to say, a language competence that integrates elements and structures which a linguist may qualify as Belarusian or Russian. Their BRMS therefore is mixed only from a diachronic, i.e. from a purely descriptive point of view, without any psycholinguistic reality. 4 From a synchronic psycholinguistic point of view it is no mixed speech at all, but the realisation of the base language of these children in the sense of a collective repertoire of linguistic alternatives. The “standard” Russian (and, for those who do use it, Belarusian) of these second generation speakers can be described as a variant of this base language system, to be used in different situations or contexts of conversation. The variation of Belarusian and Russian traits in BRMS is thus comparable to style variation. As the acquisition of standard Russian and Belarusian and thus the differentiation between the two took place later, mainly in institutes of education, it can be conceptualized as the suppression of linguistic elements that did not fit into a monolingual context.

3 Violations of this axiom are known as well, as Schilling-Estes cites, in that middle class members sometimes are even more inclined to use formal speech than upper class ones. For the time being, there is no empirical data to discuss this for BRMS.
4 As for a diachronic or conventionalised impact, Belarusian scholars do realize the interference of the Belarusian substratum on the Russian language in Belarus (cf. Norman, 2008, Michnevič, 1985) and speak of a Belarusian “natiolect” of Russian. The latter, the most widespread standard language (or formal style of speaking) in Belarus, is thus itself mixed in a certain sense, but with far fewer Belarusian “ingredients” than BRMS. On the other hand, Belarusian scholars mostly deny that such a conventionalisation and thus systematisation could have happened (at least to a certain degree) in BRMS, the lower style. Some (cf. Cychun, 1998) even dare to call the Trasyanka a creolised variety but keep describing it as unsystematic (compare the discussion in Hentschel, 2014b, 2014c). In other words, they overlook that creolisation in the strict sense means systematisation. The mentioned conventionalised, stable interferences of Belarusian on Belarusian Russian and the corresponding conventionalised interferences of Russian in spoken or written Belarusian Standard can be seen as two tips of an iceberg called “Belarusian”, i.e. BRMS. Zaprudski (2014, pp. 120-131) describes how the sovieticised variant of spoken and written (printed) Belarusian, which was clearly influenced by Russian patterns in many structural and stylistic traits, was perceived as a form of Trasyanka as well.

Thirdly, there is some amount of subjective fuzziness of the borderline between Belarusian and Russian, that is, a meta-linguistic insecurity about linguistic norms and language boundaries (cf. Wexler, 1992). With Russian and Belarusian, there are not only two related standard varieties, but also different codifications of Belarusian: the so-called Taraškevica from 1918, directed towards divergence from Russian, and the so-called Narkamaŭka from 1934, today’s official codification, directed towards convergence with Russian (Sjameška, 1998).
Since the Taraškevica is used today only marginally, this is of only minor importance for our study. More importantly, even within the scope of the official norm, Belarusian linguists and dictionaries can tend towards divergence from or convergence with Russian and can differ widely in what they interpret as a russism and what they either recognize as an acceptable borrowing from Russian or do not perceive as a russism at all (Hentschel, 2008, p. 185). The situation is even more complex if one considers Belarusian dialects, which in some aspects may correspond more closely to the Russian than to the Belarusian standard language (Ramza, 2008, pp. 316-320). Such uncertainty and arbitrariness can also be found in the categorization of the languages and varieties as a whole. This can already be seen when we compare the results of the Belarusian Census from 2009 with the results of the survey reported in Hentschel and Kittel (2011). The census, asking for the language primarily used, gave Russian and Belarusian as possible responses. In the survey of Hentschel and Kittel, the category BRMS was added. While 26% declared Belarusian as the language primarily used in the census, less than 5% characterize their language as “pure Belarusian” or “Belarusian with some Russian words” in the survey of Hentschel and Kittel, while 41% classify it as BRMS. Many Belarusians are obviously inclined to call their BRMS Belarusian if they are forced to a decision. That they do not call BRMS Belarusian when they are offered a separate category for it testifies to the existence of a separate category for BRMS in the speakers’ minds. Still, it is unclear where the speakers set the borderlines. It can be assumed that the speakers’ mental categorization of the varieties differs considerably from the categorization assumed by linguists. Furthermore, these mental categories seem to differ in various groups of speakers.
It is often mentioned that deviations from Russian are not noticed by speakers of BRMS, that is, that they are sure to speak Russian (Sjameška, 1998, p. 41). However, this certainly holds only for older speakers with a low degree of education and linguistic consciousness. For this first generation of BRMS-speakers, BRMS can be interpreted as a kind of interlanguage, i.e. an incompletely learned Russian. For the younger generation, the second or even third generation of BRMS-speakers, who were exposed to Russian to a much higher degree and to Belarusian, in particular to Belarusian dialects, to a lesser degree, the reverse can be assumed, that is, deviations from Belarusian, particularly from Belarusian dialects, may go unnoticed.

3 The rationale behind this study

In our study we, of course, could not account for all these issues connected with the bilingual language situation in Belarus. To start with, we were interested in whether the processing of CS between closely related and structurally similar languages is similar to the processing of CS between more divergent languages. Structural convergence can promote CS and reduce the restrictions on CS (Muysken, 2000). In light of the above-mentioned differences in the “markedness” of lexical material from both languages, we were also interested in whether the direction of the switch plays a role. Therefore we began with an experiment replicating earlier studies on less closely related languages (Moreno, Federmeier & Kutas, 2002, on English-Spanish CS and in particular Ruigendijk, Hentschel & Zeller, 2015, on German-Russian CS), testing both CS directions.
4 Psycholinguistic studies on CS

Although many bilingual speakers alternate between their languages and bilingual listeners process this speech without apparent difficulties, experimental studies have shown that CS can bring processing costs with it which are manifested for example in longer reaction times or reading times (see Bobb & Wodniecka, 2013, for a discussion). Studies using the ERP-technique have shed some light on the specific processes that take place when processing CS. ERPs are changes in electrical voltage on the scalp caused by sensory and cognitive processes in the brain. They are obtained by recording the electroencephalogram (EEG) while a person perceives and processes a number of stimuli of the same type. By averaging the corresponding periods of the EEG-signal, parts of the EEG-signal that are not connected with the processing of the stimuli are averaged out. ERPs consist of different components that are associated with different cognitive and sensory (sub-)processes. The ERP-technique allows the differentiation of these processes with a high temporal resolution.

[Figure 1: ERPs for German-Russian CS vs. non-CS words in highly proficient and intermediate Russian second language learners of German (cf. Ruigendijk et al., 2015)]

ERP-studies on the processing of CS have been carried out by different research teams who focused on different aspects and used different test designs. The results are therefore not directly comparable. Differences exist in the direction of the code-switch, the difference between production and perception, whether participants were exposed to sentences or single words and in the modality (spoken vs. written). It is therefore no surprise that the studies observed different effects (see Kutas, Moreno & Wicha, 2009, p. 303). Still, most studies on the perception of CS in sentences identified two ERP components as relevant: an early negativity, followed by a late positivity. Both components are reminiscent of well-known language-related components. The early negativity found for CS is most often associated with the well-known N400, a mainly centro-parietal negative peak between 200 and 500 milliseconds after an event (such as a specific word). The N400 was observed for semantically irregular words in a sentence, but also for words that are less expected in a context and for low-frequency words in comparison to high frequency words (cf. Kutas & Federmeier, 2011). The N400 effect therefore has been argued to reflect the difficulty to retrieve conceptual knowledge which is connected to a word or another meaningful element from long-term memory. This difficulty depends on both the stored representation as well as on the cues that come with the context which of course makes sense for processing CS as well, as the lexical form in the less active lexicon has to be activated (Van der Meij, Cuetos, Carreiras & Barber, 2011, p. 52).
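The averaging principle described at the beginning of this section (activity that is not time-locked to the stimulus cancels out across trials) can be illustrated with a minimal sketch. It uses simulated numbers only, not data from any of the studies cited; the function name `average_epochs`, the toy response and the noise level are assumptions made for illustration.

```python
import random


def average_epochs(epochs):
    """Average stimulus-locked EEG epochs sample by sample.

    `epochs` is a list of equal-length lists of voltages, one per trial.
    Returns a list with the mean voltage at each time point.
    """
    n_trials = len(epochs)
    n_samples = len(epochs[0])
    return [sum(epoch[t] for epoch in epochs) / n_trials
            for t in range(n_samples)]


random.seed(0)  # fixed seed so the illustration is reproducible

# Hypothetical stimulus-locked response (microvolts): a negative deflection.
signal = [0.0, -1.0, -3.0, -1.0, 0.0]

# Each simulated trial is the same response plus independent background
# activity that is larger than the response itself.
trials = [[s + random.gauss(0.0, 5.0) for s in signal] for _ in range(2000)]

erp = average_epochs(trials)
print([round(v, 2) for v in erp])
```

With many trials the averaged curve approaches the underlying response, while a single trial is dominated by background activity. Real ERP studies average far fewer trials per condition and therefore need careful artefact handling in addition.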
For CS, the negativity is sometimes only observed or more pronounced for CS from the L1 into the L2 (Proverbio, Leoni & Zani, 2004), but van der Meij et al. (2011) and Ruigendijk et al. (2015) find an N400 also for CS from the L2 into the L1. Other studies found early negativities whose distribution differed from that of the classical N400 (Khamis-Dakwar & Froud, 2007; Moreno et al., 2002). Late positivities (called Late Positive Complex/LPC or P600) similar to the one observed for CS (Moreno et al., 2002; van der Meij et al., 2011; Khamis-Dakwar & Froud, 2007; Ruigendijk et al., 2015) are also found for sentences with syntactic violations, for syntactically complex sentences compared to simpler ones and for garden-path sentences (see van Petten & Luka, 2012, for an overview). The idea of the LPC as a reanalysis effect has also been used to explain the positivity in CS (van der Meij et al.). The LPC for CS is sensitive to language proficiency: It is differently distributed (more frontal, van der Meij et al.) or stronger (Moreno et al.; Ruigendijk et al.; cf. figure 1) for less competent L2 speakers, suggesting that the integration of a CS in the syntactic structure is easier for more competent speakers. [5] Alternatively, the LPC found for CS may be related to the P300, a positivity connected with the processing of stimuli with a surprising form. Studies examining CS at the word level, though, did not find an LPC (for an overview see Kutas, Moreno & Wicha, 2009). The only study on CS in closely related languages is Khamis-Dakwar and Froud (2007) (colloquial Palestinian Arabic and standard Arabic). Since the authors examined only five speakers, the results of this study need to be replicated. Nevertheless, their results suggest that similar effects, namely an early negativity (albeit fronto-central) and a late positivity, can be found in closely related languages as well.

[5] Proverbio et al. (2004) found no LPC.
This may be caused by the fact that highly competent bilinguals who were used to switching between languages were studied, but also by the fact that the CS was predictable (block design) and that their sentence material was not controlled and hence too varied.

4 The experiment

4.1 Material

The design of the study was similar to that of Moreno et al. (2002) and Ruigendijk et al. (2015). Stimuli were auditorily presented sentences, each following the pattern subject – predicate (transitive verb) – direct object – locative (prepositional phrase). 80 triplets of sentence contexts were constructed. Within each triplet, the verb was constant, and the nouns were semantically closely related. Target words were the last words in these sentences. These could be a semantically adequate word in the same language, a semantically odd word in the same language or the equivalent of the semantically adequate word in the other language. The latter could be interpreted as an instance of insertional CS, although against the background of the linguistic constellation in the country the borderline to CM is of course hard to draw. The target words were matched for number of syllables and frequency across conditions, using Mažėjka (2006) and Šarov (2001). [6]

[6] As Belarusian is very rarely practiced in Belarusian everyday life (see above) and the lexicon in BRMS is to a high degree Russian (especially “lexical” words in contrast to functional or grammatical words; cf. Hentschel, 2013b), it is however safe to assume that Russian vocabulary is generally much better known to Belarusians. This holds especially for non-cognates such as Blr. papera vs. Ru. bumaga, both “paper”. All in all, Russian words are thus at least expected to be activated somewhat more easily in the brains of our Belarusian participants in the tests.
Nevertheless, all of the participants have a rather good knowledge of Belarusian (though maybe not fluent active competence), since they are either (former) students of Belarusian philology or (former) students of other university subjects. At their universities, even the latter have to attend classes in Belarusian.

Table 1: Examples for the sentence material (Sem = semantically incongruent, Reg = regular, CS = code-switch; RU = Russian word, BR = Belarusian word)
Context | Sentence frame | List 1 | List 2 | List 3
Russian a | Trubočisty čistjat truby na ... "(The) chimney sweepers clean (the) chimneys on (the) ..." | kojkax, Sem, "beds" (RU) | kryšax, Reg, "roofs" (RU) | daxax, CS, "roofs" (BR)
Russian b | Rabočie čistjat antenny na ... "(The) workers clean (the) antennas on (the) ..." | kryšax, Reg, "roofs" (RU) | daxax, CS, "roofs" (BR) | kojkax, Sem, "beds" (RU)
Russian c | Brigady čistjat čerepicu na ... "(The) work brigades clean (the) tiles on (the) ..." | daxax, CS, "roofs" (BR) | kojkax, Sem, "beds" (RU) | kryšax, Reg, "roofs" (RU)
Belarusian a | Kaminary čyscjac’ truby na ... "(The) chimney sweepers clean (the) chimneys on (the) ..." | ložkax, Sem, "beds" (BR) | daxax, Reg, "roofs" (BR) | kryšax, CS, "roofs" (RU)
Belarusian b | Pracoŭnyja čyscjac’ antėny na ... "(The) workers clean (the) antennas on (the) ..." | daxax, Reg, "roofs" (BR) | kryšax, CS, "roofs" (RU) | ložkax, Sem, "beds" (BR)
Belarusian c | Bryhady čyscjac’ daxoŭku na ... "(The) work brigades clean (the) tiles on (the) ..." | kryšax, CS, "roofs" (RU) | ložkax, Sem, "beds" (BR) | daxax, Reg, "roofs" (BR)

For half of the triplets, code-switches were cognates in both languages, for the other half not. The semantically incongruent condition was realized only for half of the cognate triplets and half of the non-cognate triplets. To test both switch directions, all triplets were constructed both in Belarusian and in Russian. This way, we ended up with 10 conditions with 40 sentences each: 2 context languages x 5 target types (regular cognates, CS cognates, regular non-cognates, CS non-cognates, semantically incongruent words).
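The rotation of target types across lists that is visible in Table 1 corresponds to a Latin-square assignment. The following is a minimal sketch of that rotation for a single triplet in which all three target types are realized; the function name `build_lists` and the labels are assumptions for illustration and do not reproduce the authors' actual procedure for building the lists.

```python
CONDITIONS = ["Sem", "Reg", "CS"]   # target types as labelled in Table 1
CONTEXTS = ["a", "b", "c"]          # the three sentence contexts of a triplet


def build_lists(contexts, conditions):
    """Rotate conditions over contexts so that, across lists, every
    context is paired with every condition exactly once."""
    n = len(conditions)
    return [
        {ctx: conditions[(i + shift) % n] for i, ctx in enumerate(contexts)}
        for shift in range(n)
    ]


lists = build_lists(CONTEXTS, CONDITIONS)
for number, assignment in enumerate(lists, start=1):
    print(f"List {number}: {assignment}")
```

In this sketch each participant list contains every context once and every target type once per triplet, so that no listener hears the same sentence frame with two different endings.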
Three lists were created from this material, which differed in the combination of context sentence and target. In each list, there were 400 sentences (40 instances of each condition). These sentences were spoken by a female Belarusian-Russian bilingual who showed no or hardly any accent in either of the two languages. In this article, we only analyse non-cognate switches, since these are most comparable to the previous studies on CS. As briefly mentioned above, cognates might have a special psycholinguistic status. The semantically incongruent condition was (among other reasons) included as a control, in order to see whether we can find the classical N400 effect with our sentence material. Due to space constraints, we cannot go into details here, but this was indeed the case.

4.2 Methodological challenges

For the above-mentioned reasons, we could not count on intuitive judgments on the linguistic affiliation of the test items, but rather decided to rely on a dictionary that takes up an intermediate position between an extreme puristic and an extreme “russified” position, namely the BRS (2003). Still, the search for possible Belarusian-Russian word pairs denoting locations was difficult, due to the high number of lexical one-to-many correspondences: Although we could have used for example the Belarusian word praca “work” in a Russian sentence as a code-switch, we could not use the corresponding Russian word rabota in a Belarusian sentence as a code-switch, simply because the dictionary gives rabota as a Belarusian synonym of praca. Another problem concerned the participants in the experiment. We planned to compare two groups, namely persons with a good knowledge and persons with an intermediate knowledge of Belarusian. For time reasons we could not run a proficiency test but had to rely on the self-estimations of the participants in a questionnaire.
Probably due to the complex language situation in Belarus, no clear group division was possible on the basis of this information. Students of Belarusian philology estimated their proficiency in Belarusian as rather low, while other participants who were recommended to us as persons with a rather low competence claimed to have an excellent knowledge. Furthermore, in both groups there would have been active speakers and native speakers of BRMS and persons who at least claimed that they had never used BRMS. Instead of comparing two heterogeneous subject groups, we therefore decided to begin with a global analysis, not differentiating between groups. A finer analysis, taking into account the individual language biographies and testing whether the self-estimated linguistic competence still has predictive value, is planned for the future.

4.3 Procedure

The experiment (programmed in E-Prime 2.0) was carried out by the first author during a research stay at the Belarusian State University Minsk made possible through a grant by the DAAD. Participants were told that they would be listening to sentences containing Belarusian and/or Russian words. Oral instructions were in Russian, instructions on the screen were both in Belarusian and Russian. The experiment started with a small practice set. The sentences were presented via speakers. Before the presentation, a fixation cross appeared on the screen for 1000ms. After the sentence (2000ms after the onset of the last word), a question mark appeared for 500ms. Then a word followed that remained on the screen for 1500ms. Participants were asked to click the left mouse button when the word had appeared in the preceding sentence, in order to make sure they listened to the sentences. The words in this response task were always in the language of the context sentence, and never the target word. In 50% of the cases the correct response was “yes”, in 50% “no”.
After a pause of 1500ms the next trial started automatically. The experiment was divided into 8 blocks of 50 sentences each. After each block the participants could decide for themselves when they wanted to go on with the next block. Within each block, the language of the context sentence and therefore the switch direction was kept constant. EEG activity was recorded from 26 scalp electrodes: F7/8, F3/4, Fz, FC5/6, FC1/2, FCz, T7/8, C3/4, Cz, CP5/6, CP1/2, P7/8, P3/4, Pz, O1/2. Signals were referenced online to the left mastoid, amplified within a bandpass of 0.01 to 100 Hz and digitized at 250 Hz. Data were re-referenced offline to the average of the left and right mastoids. Electrode impedances were kept below 5 kΩ. A baseline correction was carried out using the 100ms prior to the onset of the critical word. Eye movements were monitored by the vertical electro-oculogram (VEOG) and the horizontal electro-oculogram (HEOG). Off-line, trials were rejected automatically whenever the standard deviation for a 200ms moving window exceeded 40 μV in the EOG channels. Trials containing artefacts were also manually rejected. A bandpass filter of 0.3-20 Hz was applied off-line. ERPs were computed by averaging the EEG per condition for each subject at each electrode site for a time window from 200ms prior to the onset to 1200ms after the onset. Only trials for which the word decision task was performed correctly were averaged.

36 young Belarusians participated, all of them right-handed, with normal or corrected-to-normal vision and no known brain damage or deficiencies. All of them had a high competence in Russian, but varying degrees of competence in Belarusian. Four persons were excluded due to a high number of artefacts or errors in the word decision task. Another participant was excluded because she did not grow up in Belarus.
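Two of the preprocessing steps reported above, the baseline correction over the 100ms before word onset and the automatic rejection based on the standard deviation in a 200ms moving window of the EOG, can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function names are hypothetical, the window is advanced one sample at a time, and the population standard deviation is used, none of which is specified in the text. Only the sampling rate, the window lengths and the 40 μV threshold are taken from the description.

```python
from statistics import mean, pstdev

SAMPLING_RATE_HZ = 250  # digitization rate reported in the text


def ms_to_samples(ms):
    """Convert a duration in milliseconds to a number of samples."""
    return int(round(ms * SAMPLING_RATE_HZ / 1000))


def baseline_correct(epoch, onset_index, baseline_ms=100):
    """Subtract the mean of the pre-onset baseline from every sample."""
    n = ms_to_samples(baseline_ms)
    baseline = mean(epoch[onset_index - n:onset_index])
    return [v - baseline for v in epoch]


def has_eog_artefact(eog, window_ms=200, threshold_uv=40.0):
    """True if the standard deviation in any moving window exceeds
    the threshold (window step of one sample is an assumption)."""
    w = ms_to_samples(window_ms)
    return any(pstdev(eog[start:start + w]) > threshold_uv
               for start in range(len(eog) - w + 1))


# Toy data: a flat EOG trace and one with a blink-like step of 200 microvolts.
flat_trace = [1.0] * 100
blink_trace = [0.0] * 60 + [200.0] * 60
print(has_eog_artefact(flat_trace), has_eog_artefact(blink_trace))
```

At 250 Hz the 200ms window spans 50 samples and the 100ms baseline spans 25 samples, which is what `ms_to_samples` returns for these durations.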
The final group consisted of 31 participants (19 female, 12 male) aged between 18 and 30 years (mean age: 21.7). The mean number of trials that entered the averaging process was 37.8 (regular condition) and 37.8 (CS) for Russian context sentences, and 37.0 (regular condition) and 37.6 (CS) for Belarusian context sentences.

5 Results

Since the focus of this article lies on theoretical and methodological considerations, we can only show some overall first results. Figure 2 shows the grand averages for the CS vs. non-CS comparisons. Figure 2 clearly shows both similarities and differences to the results found in Ruigendijk et al. (2015; see figure 1 above). There is a negativity with a peak around 400ms, but this negativity lasts longer and is also prominent in frontal electrodes, at least for CS from Russian to Belarusian. Moreover, there is no clear late positivity. For statistical analysis, we grouped the electrodes in six regions of interest: left anterior (F3, FC5, C3 and T7), central anterior (Fz, FC1, FCz, FC2), right anterior (F4, FC6, C4, T8), left posterior (CP5, P7, P3, O1), central posterior (Cz, CP1, CP2, Pz) and right posterior (CP6, P8, P4, O2), as in our study on Russian-German CS. We calculated a repeated measures ANOVA with four within-subject factors: 2 levels of context language (Russian vs. Belarusian), 2 levels of condition (regular vs. CS), 3 levels of laterality (left, central and right) and 2 levels of anteriority (anterior vs. posterior). Because of the obvious latency differences we had to use different time windows than in Ruigendijk et al. (2015). The results are given in table 2. We report only directly relevant effects, that is, those including condition or context language. Whenever Mauchly’s test indicated that the assumption of sphericity was violated, we report the Huynh-Feldt corrected values.
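The grouping of electrodes into regions of interest can be written down as a simple mapping from which one mean amplitude per region is computed. The sketch below is illustrative only: the electrode groups are those listed above, whereas the function name `roi_means` and the toy amplitudes are assumptions and do not represent the study's data.

```python
# Regions of interest as listed in the text (4 electrodes each).
# F7 and F8 were recorded but are not part of any region.
ROIS = {
    "left anterior": ["F3", "FC5", "C3", "T7"],
    "central anterior": ["Fz", "FC1", "FCz", "FC2"],
    "right anterior": ["F4", "FC6", "C4", "T8"],
    "left posterior": ["CP5", "P7", "P3", "O1"],
    "central posterior": ["Cz", "CP1", "CP2", "Pz"],
    "right posterior": ["CP6", "P8", "P4", "O2"],
}


def roi_means(amplitudes):
    """Average per-electrode mean amplitudes within each region.

    `amplitudes` maps electrode names to the mean voltage in one time
    window for one participant and one condition.
    """
    return {
        roi: sum(amplitudes[e] for e in electrodes) / len(electrodes)
        for roi, electrodes in ROIS.items()
    }


# Toy input (hypothetical values): a negativity at anterior sites only.
toy = {
    electrode: (-2.0 if "anterior" in roi else 0.5)
    for roi, electrodes in ROIS.items()
    for electrode in electrodes
}
means = roi_means(toy)

# Cells of the within-subject design: language x condition x laterality x anteriority
n_cells = 2 * 2 * 3 * 2
print(means, n_cells)
```

Each participant thus contributes one value per region for each combination of context language and condition, giving 24 cells in the four-factor design described above.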
[Figure 2: Grand averages of the regular and the CS condition (non-cognates, both switch directions) for selected electrodes]

Table 2: Results of repeated measures (*: p<.05; **: p<.01)
Effect | 250-650ms | 650-1000ms
Language | F(1,30) 13.13** | F(1,30) 11.29**
Language x Laterality | F(2,60) 1.89 | F(1.68,50.34) 3.56*
Language x Anteriority | F(1,30) .03 | F(1,30) 2.46
Language x Anteriority x Laterality | F(1.75,52.56) 2.55 (p=.10) | F(2,60) 2.93 (p=.06)
Condition | F(1,30) 18.03** | F(1,30) .16
Condition x Laterality | F(1,30) 2.04 | F(1.62,48.59) 1.93
Condition x Anteriority | F(1,30) 2.27 | F(1,30) 7.38*
Condition x Laterality x Anteriority | F(2,60) 2.84 (p=.07) | F(2,60) 11.19*
Language x Condition | F(1,30) .44 | F(1,30) .73
Language x Condition x Laterality | F(2,60) .64 | F(2,60) 1.31
Language x Condition x Anteriority | F(1,30) .12 | F(1,30) 3.33 (p=.08)
Language x Condition x Anteriority x Laterality | F(1.65,49.38) 3.05 (p=.07) | F(1.71,51.41) 3.57*

For both time windows we find a main effect for language, showing that the ERP is more negative for Russian context sentences. In the early time window, although we find a negativity for CS, this effect differs in some respects from the classical N400 that was found in Ruigendijk et al. (2015). We did not find an interaction of laterality and condition typical for the N400 (only a marginally significant interaction of laterality, anteriority and condition), but a main effect for condition, showing that CS sentences in comparison to non-CS sentences elicit a negativity at all electrode sites. In the late time window, we did not find a main effect for condition, but an interaction of condition and anteriority and an interaction of condition, laterality and anteriority.
This is most probably not caused by a posterior positivity (although the figure depicts slight differences) but rather by the enhanced frontal negativity most prominent on right anterior electrodes. The interaction of language, condition, anteriority and laterality confirms the impression that this negativity is stronger for the switch from Russian to Belarusian.

6 Discussion

The results show that sentences containing Belarusian-Russian or Russian-Belarusian CS are processed differently than monolingual sentences by young Belarusians. There is obviously a neurolinguistic processing distinction between the two languages, in spite of the diffuse language situation, a widespread mixed-monolingual language acquisition and the structural affinity of the involved languages. As in Ruigendijk et al. (2015) for CS between less closely related languages, an early negativity (N400) can be observed. Assuming that this negativity reflects activation costs of the word in the less activated language (van der Meij et al., 2011, p. 52), this shows that the Belarusian and the Russian lexicon constitute different subsystems in the participants’ mental lexicon, at least for non-cognates. An overall effect for language, as well as interactions with language and condition (and distribution), indicate that the direction of CS affects processing. This is plausible, because due to the dominance of Russian in Belarus and of the Russian element in the lexical inventory of BRMS, Belarusian lexical elements are clearly “marked” in the Belarusian linguistic situation. Still, the only difference that hints at a larger processing cost for Belarusian words in Russian sentences is the late right frontal negativity for Russian-Belarusian CS. As far as we are aware, this effect has never been observed in studies on CS; hence what it stands for and whether it is language-specific or rather connected with overall cognitive task-switching activities deserves further study.
Late frontal negativities have been found for the processing of concrete in comparison to abstract words (West & Holcomb, 2000) and have been interpreted as an index of the elicitation of a mental image connected with the concrete word. Although the parallelism to our findings is of course far from straightforward, one might speculate that Russian words (at least lexical ones) as the unmarked ones are of a more abstract meaning for our participants, while Belarusian words as the marked ones are connected with rather concrete, perhaps Belarus-specific mental images. A clear contrast to the studies of Moreno et al. (2002) and Ruigendijk et al. (2015) on switching between less closely related languages is the absence of an LPC. This absence may be caused by the structural affinity of both languages. Because of the overlap of grammatical categories in both languages, words are easily integrated syntactically in a matrix clause provided by the other language. In line with the model of bilingual language representation proposed in Hartsuiker et al. (2004), it may even be the case that both languages make use of a shared syntax, due to this structural affinity and to the “mixed-monolingual” language acquisition. Drawing on the alternative interpretation of the LPC as an index of surprise, one could argue that the absence of an LPC shows that CS is not a surprising event to the participants, probably because they are used to processing both languages not only in a “pure”, but also in a “mixed” form.

Summary and outlook

Belarusian-Russian bilingualism in Belarus differs from prototypical constellations of bilingualism. First, these languages are closely related. Secondly, many Belarusians did not grow up bilingually with Russian and Belarusian as two discrete languages, but rather “mixed-monolingually”, without a clear functional separation between elements from both languages.
For them, the (partial) segregation of language systems developed much later. Thirdly, the subjective borderlines between the varieties used in Belarus are often fuzzy. With the help of the ERP-technique we showed that processing Belarusian-Russian and Russian-Belarusian CS has some similarity with the processing of CS between less closely related languages in that both need a reinforced effort in lexical retrieval which is manifested in a larger N400 component. But there are also differences: While CS between less closely related languages normally elicits an LPC which is interpreted as an index of (syntactic) reanalysis, we did not find such an effect in our study. Once retrieved, words from the other language are obviously easy to integrate into the sentence structure for young Belarusians. Of special interest is the late right frontal negativity found for CS from Russian to Belarusian, but not in the opposite direction. This negativity might reflect a switch into the marked lexical register. Future analyses should address the question of how far these results are connected with the relatedness of the languages, with the specific type of first language acquisition and with the diffuseness of the linguistic situation in Belarus. It would therefore be useful to study the switching between other closely related languages without the existence of widespread (stabilized) CM (where Slavic languages offer a wide range of opportunities, for example Serbian and Croatian) and between varieties that are without doubt perceived as varieties of the same language, for example between Russian standard and Russian substandard prostorečie. We hope to have shown how the ERP-technique can help to understand the phenomenon of CS and to a certain degree (spontaneous) CM.
Moreover, we hope that we made clear what the kind of bilingualism present in today’s Belarus can add to the psycholinguistic discussion of language contact: We want to underline that the questions addressed in this article are not only important for Belarus. The type of language acquisition described here is certainly not restricted to this country. The question of subjective language categories and language borders is relevant elsewhere as well. Language contact situations can deviate in various aspects from the prototypes that most psycholinguistic studies deal with and the psycholinguistic consequences of these deviations are still unexplored.

Acknowledgements

We would like to thank Dr. N. Veremeeva and Prof. Dr. M. Pryhodzič, BSU Minsk, for helping recruit the participants, and Natallja Kraŭčanka and Alena Ljankevič for helping create the stimuli sentences.

References

Auer, P. (2007). The monolingual bias in bilingualism research, or: why bilingual talk is (still) a challenge for linguistics. In M. Heller (Ed.), Bilingualism: a social approach (pp. 319-339). Basingstoke: Palgrave Macmillan.
Backus, A. (2009). Codeswitching as one piece of the puzzle of language change: The case of Turkish yapmak. In L. Isurin, D. Winford, & K. de Bot (Eds.), Multidisciplinary approaches to code switching (pp. 307-336). Amsterdam: Benjamins.
Bobb, S., & Wodniecka, Z. (2013). Language switching in picture naming. What asymmetric switch costs (do not) tell us about inhibition in bilingual speech planning. Journal of Cognitive Psychology, 25(5), 568-585.
BRS (2003). Belaruska-ruski sloŭnik. Minsk: Belaruskaja Ėncyklapedyja.
Clyne, M. (1987). Constraints on code-switching: How universal are they? Linguistics, 25, 739-764.
Cychun, G. (1998). Creolic variety of the Belarussian Language (trasianka). Acta universitatis Lodziensis. Folia linguistica, 38, 3-10.
De Angelis, G. (2005).
Multilingualism and non-native lexical transfer: an identification problem. International Journal of Multilingualism, 2(1), 1-25.
Dijkstra, A., & Van Heuven, W. J. B. (2002). The architecture of the bilingual word recognition system: From identification to decision. Bilingualism: Language and Cognition, 5(3), 175-197.
Green, D. W. (1998). Mental control of the bilingual lexico-semantic system. Bilingualism: Language and Cognition, 1(2), 67-81.
Hartsuiker, R. J., Pickering, M. J., & Veltkamp, E. (2004). Is syntax separate or shared between languages? Cross-linguistic syntactic priming in Spanish/English bilinguals. Psychological Science, 15, 409-414.
Hentschel, G. (2008). Zur weißrussisch-russischen Hybridität in der weißrussischen „Trasjanka“. In P. Kosta, D. Weiss (Eds.), Slavistische Linguistik 2006/2007 (pp. 169-219). München: Sagner.
Hentschel, G. (2013a). Belorusskij, russkij i belorussko-russkaja smešannaja reč’. Voprosy Jazykoznanija (2013), 53-76.
Hentschel, G. (2013b). Zwischen Variabilität und Regularität, „Chaos“ und Usus: Zu Lautung und Lexik der weißrussisch-russischen gemischten Rede. In G. Hentschel (Ed.), Variation und Stabilität in Kontaktvarietäten: Beobachtungen zu gemischten Formen der Rede in Weißrussland, der Ukraine und Schlesien (pp. 63-99). Oldenburg: BIS. (= Studia Slavica Oldenburgensia 21).
Hentschel, G. (2014a). Belarusian and Russian in the Mixed Speech of Belarus. In J. Besters-Dilger, C. Dermarkar, S. Pfänder, & A. Rabus (Eds.), Congruence in contact-induced language change: Language families, typological resemblance, and perceived similarity (pp. 93-121). Berlin: de Gruyter.
Hentschel, G. (2014b). „Trasjanka“ und „Suržyk“ - zum Mischen von Sprachen in Weißrussland und der Ukraine. In G. Hentschel, O. Taranenko, & S. Zaprudski (Eds.), Trasjanka und Suržyk - gemischte weißrussisch-russische und ukrainisch-russische Rede. Sprachlicher Inzest in Weißrussland und der Ukraine? (pp. 1-26). Frankfurt/M.: Peter Lang.
Hentschel, G. (2014c).
On the systemicity of Belarusian-Russian Mixed Speech: the redistribution of Belarusian and Russian variants of functional words. In G. Hentschel, O. Taranenko, & S. Zaprudski (Eds.), Trasjanka und Suržyk - gemischte weißrussisch-russische und ukrainisch-russische Rede. Sprachlicher Inzest in Weißrussland und der Ukraine? (pp. 195-220). Frankfurt/M.: Peter Lang.
Hentschel, G., & Kittel, B. (2011). Weißrussische Dreisprachigkeit? Zur sprachlichen Situation in Weißrussland auf der Basis von Urteilen von Weißrussen über die Verbreitung „ihrer Sprachen“ im Lande. Wiener Slawistischer Almanach, 67, 107-135.
Hentschel, G., Brüggemann, M., Geiger, H., & Zeller, J. P. (2015). The linguistic and political orientation of young Belarusian adults between East and West or Russian and Belarusian. International Journal of the Sociology of Language, 2015/236, 133-154.
Hentschel, G., & Zeller, J. P. (2012). Gemischte Rede, gemischter Diskurs, Sprechertypen: Weißrussisch, Russisch und gemischte Rede in der Kommunikation weißrussischer Familien. Wiener Slawistischer Almanach, 70, 127-155.
Hentschel, G., & Zeller, J. P. (2014). Belarusians’ pronunciation: Belarusian or Russian? Evidence from Belarusian-Russian mixed speech. Russian Linguistics, 38(2), 229-255.
Khamis-Dakwar, R., & Froud, K. (2007). Lexical processing in two language varieties: An event related brain potential study of Arabic native speakers. In M. Mughazy (Ed.), Perspectives on Arabic linguistics XX (pp. 153-166). Amsterdam: Benjamins.
Kroll, J. F., Van Hell, J. G., Tokowicz, N., & Green, D. W. (2010). The Revised Hierarchical Model: A critical review and assessment. Bilingualism: Language and Cognition, 13(3), 373-381.
Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the N400 component of the event-related brain potential (ERP). Annual Review of Psychology, 62, 621-647.
Kutas, M., Moreno, E. M., & Wicha, N.
(2009). Code-switching and the brain. In B. E. Bullock & A. J. Toribio (Eds.), The Cambridge handbook of linguistic code-switching (pp. 289-306). Cambridge: Cambridge University Press.
Matras, Y. (2009). Language contact. Cambridge: Cambridge University Press.
Mažėjka, N. S. (2006). Častotny sloŭnik belaruskaj movy. Minsk: UP Zorki Gor.
Mečkovskaja, N. B. (2014). Die weißrussische Trasjanka und der ukrainische Suržyk: Quasiethische Substandards in der sprachlichen Situation. In G. Hentschel, O. Taranenko, & S. Zaprudski (Eds.), Trasjanka und Suržyk - gemischte weißrussisch-russische und ukrainisch-russische Rede. Sprachlicher Inzest in Weißrussland und der Ukraine? (pp. 53-90). Frankfurt a. M.: Peter Lang.
Michnevič, A. E. (1985). Osnovnye aspekty problemy. In A. E. Michnevič (Ed.), Russkij jazyk v Belorussii (pp. 3-12). Minsk: Nauka i Technika.
Moreno, E. M., Federmeier, K. D., & Kutas, M. (2002). Switching languages, switching palabras (words): an electrophysiological study of code switching. Brain and Language, 80, 188-207.
Muysken, P. (2000). Bilingual speech: A typology of code mixing. Cambridge: Cambridge Univ. Press.
Norman, B. (2008). Russkij jazyk v Belarusi segodnja. Die Welt der Slaven, LIII, 289-300.
Paradis, M. (2004). A neurolinguistic theory of bilingualism. Amsterdam: Benjamins.
Proverbio, A. M., Leoni, G., & Zani, A. (2004). Language switching mechanisms in simultaneous interpreters: An ERP study. Neuropsychologia, 42(12), 1636-1656.
Ramza, T. (2008). Die Evolution der Trasjanka in literarischen Texten. Zeitschrift für Slawistik, 53(3), 305-325.
Ruigendijk, E., Hentschel, G., & Zeller, J. P. (2015). How L2-learners’ brains react to code-switches. An ERP study with Russian learners of German. Second Language Research, Published online before print November 18, 2015, doi: 10.1177/0267658315614614
Schilling-Estes, N. (2007). Investigating stylistic variation. In J. K. Chambers, P. Trudgill, & N.
Schilling-Estes (Eds.), The handbook of language variation and change (pp. 375-401). Oxford: Blackwell.
Sjameška, L. I. (1998). Sacyjalinhvistyčnyja aspekty funkcyjanavannja belaruskaj litaraturnaj movy ŭ druhoj palove XX st. In A. Lukašanec, M. Prigodzič, L. Sjameška (Eds.), Belaruskaja mova (pp. 25-54). Opole: Uniw. Opolski.
Suprun, A. Ja., & Klimenka, H. P. (1982). Nekatoryja psichalinhvistyčnyja asablivasci belaruska-ruskaha dvuchmoŭja. In M. V. Biryla, A. Ja. Suprun (Eds.), Pytanni bilinhvizmu i ŭzaemadzejannja moŭ (pp. 76-105). Minsk: Navuka i tėchnika.
Šarov, S. A. (2001). Častotnyj slovar’. Retrieved from http://www.artint.ru/projects/frqlist.php
Van der Meij, M., Cuetos, F., Carreiras, M., & Barber, H. A. (2011). Electrophysiological correlates of language switching in second language learners. Psychophysiology, 48, 44-54.
Van Petten, C., & Luka, B. J. (2012). Prediction during language comprehension: Benefits, costs, and ERP components. International Journal of Psychophysiology, 83(2), 176-190.
Vyhonnaja, L. C. (1996). Psichalinhvistyčnyja aspekty belaruska-ruskaha bilinhvizmu. Belaruskaja linhvistyka, 45, 10-14.
West, W. C., & Holcomb, P. J. (2000). Imaginal, semantic, and surface-level processing of concrete and abstract words: An electrophysiological investigation. Journal of Cognitive Neuroscience, 12(6), 1024-1037.
Wexler, P. (1992). Diglossia et schizoglossia perpetua - The fate of the Belorussian language. Sociolinguistica, 6, 42-51.
Zaprudski, S. (2007). In the grip of replacive bilingualism. The Belarusian language in contact with Russian. International Journal of the Sociology of Language, 183, 97-118.
Zaprudski, S. (2014). Zur öffentlichen Diskussion der weißrussischen Sprachkultur, zum Aufkommen des Terminus „Trasjanka“ und zur modernen Trasjankaforschung. In G. Hentschel, O. Taranenko, & S.
Zaprudski (Eds.), Trasjanka und Suržyk - gemischte weißrussisch-russische und ukrainisch-russische Rede. Sprachlicher Inzest in Weißrussland und der Ukraine? (pp. 119-142). Frankfurt a. M.: Peter Lang. Influence of spatial language on the non-linguistic spatial reasoning of sign language users. A comparison between Czech Sign Language users and Czech non-signers Jakub Jehlička Abstract: Differences in spatial language in signed and spoken languages represent one of the great challenges for many linguistic fields, including psycholinguistic studies. Being articulated spatially, signed languages, compared to spoken languages, enable their users to express more fine-grained spatial information within a single segment, using simultaneous classifier constructions. In addition, signers project the spatial relations onto iconic signing space, and so they are required to maintain different perspectives that demand complex mental imagery. Possible effects of these modalityspecific aspects of spatial language on signers’ non-linguistic spatial cognition have been studied, especially with respect to the involvement of mental rotation during language processing. In the present paper, I review previous research on mental rotation skills in users of signed languages (especially American Sign Language) and present interim outcomes of an ongoing study with CzSL signers and hearing Czech speakers. 1 Introduction Since signed languages (SL) 1,2 have become an object of serious linguistic inquiry, that is, since SL were generally recognized as fully developed natu- 1 I use the term signed language instead of sign language. The reason for using this term is that it constitutes an opposition to spoken, indicating a system equality of the two language modalities. 
However, the term signed may also refer to the artificial sign systems derived from spoken languages, e.g., Signed English and Signed Czech, which should not be confused with genuine SLs, as they are mere “translation” of the spoken language structure into handshapes. Also, the phrase sign language is used in official English nomenclature of SL (e.g., American Sign Language, French Sign Language, etc.). 2 The abbreviations used in this text are as follows: AE — American English AoA — age of acquisition ASL — American Sign Language CzSL — Czech Sign Language DGS — German Sign Language (Deutsche Gebärdensprache) Jakub Jehlička 280 ral languages in the 1960s and 1970s 3 , there has been growing interest in the psychological processes underlying signed communication. Psycholinguistics of SL is not, in general, any different from psycholinguistics of spoken languages. It focuses on the very same topics — processes of language production and comprehension, the nature of mental lexicon, acquisition of SL as a first and other language, etc. Additionally, SL psycholinguistics has contributed to the search for the actual extent of cognitive diversity in crosslinguistic perspective, by unraveling the modality-specific aspects of language processing in SL. Thus, SL psycholinguistics has greatly contributed to the exploration of cognitive foundations of language as such. This paper 4 is concerned with one particular topic of SL psycholinguistics: the question of whether (and if so, how) the specific linguistic use of space and spatiality can affect the non-linguistic spatial cognition of SL users (in particular, their mental rotation abilities). First, I highlight the major spatial language differences between SL and spoken languages (section 2). Then, I focus on the research of mental rotation abilities in deaf and hearing signers and also in hearing non-signers. 
I review methods and results of a number of studies carried out with users of different SL in the course of the last three decades (section 3). In general, in mental rotation-based tasks, a lower mental rotation effect 5 is observed in deaf signer subject groups than in hearing groups. It remains unclear exactly which factors contribute to this cross-modal 6 cognitive difference. In the final parts of my paper, I report the interim results of my adaptation of the experimental design developed by Emmorey and her colleagues (1998), who compared mental rotation skills in a non-linguistic task between deaf American Sign Language signers and hearing non-signing American English (AE) speakers, finding significantly better mental rotation performance within the deaf group. In my study, I compared the mental rotation skills of deaf Czech Sign Language (CzSL) signers and hearing Czech non-signers. The preliminary results of the hearing group suggested HZJ — Croatian Sign Language (Hrvatski znakovni jezik) ISN — Nicaraguan Sign Language (Idioma de señas de Nicaragua) MR — mental rotation RT — reaction time SL — signed language TID — Turkish Sign Language (Türk İşaret Dili). 3 In particular, William Stokoe (1960) has played a crucial role in SL emancipation. 4 I would like to thank the anonymous reviewers for their valuable comments. 5 “Mental rotation effect” is an inhibition of a subject’s reaction time in a task requiring mental rotation. 6 I.e. involving the visuo-manual and audio-oral modality of language. Influence of spatial language on the non-linguistic spatial reasoning 281 that Czech non-signers score better in the mental rotation task than AE nonsigners. In the conclusion, I provide a possible explanation for why the results of my study differed from those of Emmorey and her colleagues and what that implies for studies of this kind in general. 2 Language of space in SL One of the prominent areas of SL psycholinguistic research 7 is the domain of space. 
The confluence of spatial language and spatial cognition in speakers of spoken languages has also been a prominent research area in the psycholinguistics of spoken languages (Levinson, 2004; Levinson & Wilkins, 2006). However, unlike spoken languages, for SL, space is crucial ex definitione: space, in this case, is not only one of the domains of the real world that is mediated and construed via language; space is where the language “happens.” Sign language is realized in physical space using space as a means of expression. Given this modality-specific spatial character of SL, the way signed and spoken languages express spatial relations, locations, motions, and so on (often referred to as “spatial language”) differs significantly. Let us review the major differences in spatial language for signed vs. spoken languages.

[7] For a general overview of SL psycholinguistic topics, specific issues, and representative studies see, e.g., Emmorey (2007), Wilcox and Morford (2007), or Morford, Nicodemus, and Wilkinson (2015).

(i) As stated above, SL use space as a medium: all signing is realized (i.e., produced and comprehended) within the signer’s signing space. The signing space (Bellugi, 1972) is delimited by the signer’s spread arms on the horizontal axis and by the apex of her or his head and chest on the vertical axis.

(ii) In SL, unlike in spoken languages, the primary means of expressing spatiality is not lexical but is rather characterized by the involvement of classifier[8] constructions. Classifier constructions in SL are “polycomponential verb” phrases that contain a classifier morpheme combined with a morpheme of motion, location, visual-geometric description, or handling (Schembri, 2003). A classifier morpheme is realized by a handshape that represents a class of entities (“class” in terms of size or shape). As such, classifier constructions represent real-world entities in a more or less iconic way (Emmorey, 2002).

[8] The term “classifier” used in SL linguistics has its origin in the category of classifier in spoken language typology (Allan, 1977). However, classifiers in SL and in spoken languages are not generally conceived as analogical phenomena (see discussion in Emmorey, 2003).

Image 1: Example of a classifier handshape in CzSL: BROAD CYLINDER-SHAPED OBJECT (e.g., column, tree trunk, mast) (Tikovská, 2006, p. 154)

“For sign languages, spatial language – the linguistic devices used to talk about space – primarily involves the use of classifier constructions, rather than prepositions or locative affixes. These constructions appear to be universal to sign languages […], and they exhibit some properties that may turn out to be typologically unique and arise from the visual-spatial modality.” (Emmorey, 2002, p. 73)

Although classifier constructions seem to be universal across standard SL, counter-evidence against their universality can be found in some village sign languages (the absence of whole-entity classifiers was reported in Ghanaian Adamorobe Sign Language, De Vos & Zeshan, 2012). Talmy (2003) argues that signed language can mark finer spatial distinctions with its inventory of more structural elements, more categories, and more elements per category. It represents many more of these distinctions in any particular expression. It also represents these distinctions independently in the expression, not bundled together into pre-packaged schemas. And its spatial representations are largely iconic with visible spatial characteristics (pp. 192-193).

Moreover, this capability of more complex representation of space compared to spoken language is based on the use of a classifier subsystem that, according to Talmy (ibid., p. 192), generally “seem[s] closer to the structural characteristics of visual parsing than the closed-class subsystems [of lexical grammatical properties of spatial representation dominant in spoken languages].”

(iii) As in spoken languages, when communicating spatial information in SL, signers can take various perspectives. The problem of perspective in spoken languages has been traditionally associated with the study of frames of reference (cf. Levinson, 1996). Three types of frames of reference that are recognized within spoken languages are: absolute, relative, and intrinsic. All three are also found in SL (Arik, 2008; Perniss, 2007; De Vos, 2012). There are other ways of perspectivization in SL, determined by the character of the iconic signing space and depending on the nature of the communication situation. Emmorey (2002) distinguishes two “spatial formats” used for the spatial description of an environment.
In the “viewer spatial format,” the experiencer of the described situation is the reference point in the signing space. The default frame of reference is in this case relative. The second type is a “diagrammatic spatial format” with “scheme-like” descriptions using an absolute frame of reference (the landmark-based type). In the viewer spatial format, narrators usually take the so-called “route perspective” (Linde & Labov, 1975), using a subjective perspective, for example, when directing someone[9]. In the diagrammatic spatial format, the “survey perspective” is usually taken. This is an objective perspective, as in general descriptions[10]. Emmorey notes that spatial formats and frames of reference do not refer to the same type of perspective-taking, “rather they are specific ways of structuring signing space within a discourse” for “signers can adopt an intrinsic, a relative, or an absolute frame of reference when using either diagrammatic or viewer space” (Emmorey, 2002, p. 97).

[9] E.g., “go straight, then turn to the left and you will see the church.”
[10] Such as “the church is one mile north of the hospital.”

When a signer and addressee communicate about a spatial situation that is physically present at the moment of conversation, the signing space becomes “shared” (Emmorey, 2002). That is, the location of particular segments of the signed description of a spatial situation (e.g., location of the classifier morphemes referring to the objects present) will correspond to the location of the objects that are referred to. The signer and addressee both share the same perspective and coordinate their understanding of the described situation using indexical signs (pointing). Shared space can be likened to a mix of intrinsic and relative frames of reference in spoken languages, with the point of reference shifting from the referent itself to one of the signers and back again.

Perspectivization strategies in SL vary cross-linguistically. As for static spatial descriptions, there is, according to Perniss (2007), a preference on the part of DGS signers for the use of a diagrammatic format[11] employing both relative and intrinsic frames of reference simultaneously (ibid., p. 291). In Turkish Sign Language (TID, Arik, 2008), when describing a static spatial scene, signers use a simultaneous mix of intrinsic and relative frames of reference, or intrinsic alone. There is, however, no evidence for the use of a purely relative or absolute frame of reference in SL.

[11] Perniss calls diagrammatic spatial format “observer perspective.”

When the object of the spatial description is not present, the way signers structure the signing space is different. Signers describe a spatial scene as if they were “recreating” the scene configuration (i.e., location and/or movement of objects) according to the image they visualize in their mind (Emmorey, 2002). The signer (“narrator”) can describe a scene to the addressee either from her or his own perspective or from the perspective of the addressee. In ASL, as well as in TID (Arik, 2008), German Sign Language (DGS, Perniss, 2007), Croatian Sign Language (HZJ, Arik & Milković, 2007), or CzSL (Tučková, 2013), the predominant perspective in this kind of communication situation is the narrator’s perspective (although not universally; e.g., in Kata Kolok, an absolute “geocentric” perspective is used instead, [De Vos, 2012]). The addressee thus perceives a sequence of signs that reflects information about a spatial scene that is rotated 180°. Yet, there seems to be no substantial comprehension problem for addressees in such situations (Emmorey, 1998). A question for psycholinguists arises as to whether the modality-specific exposure to reversed perspective leads to enhancement of spatial-cognitive skills in signers.
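The geometry of the 180° rotation mentioned above can be made concrete with a minimal Python sketch. The two-object layout, the object names, and the coordinate convention (positions measured from the center of the scene, x towards the narrator's right, y away from the narrator) are hypothetical and chosen purely for illustration; they are not taken from any of the studies cited.

```python
import math


def rotate_in_plane(point, degrees):
    """Rotate an (x, y) point counter-clockwise about the origin."""
    x, y = point
    theta = math.radians(degrees)
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    # Round away floating-point noise (e.g. 1.2e-16 instead of 0).
    return (round(x_rot, 9), round(y_rot, 9))


# Hypothetical layout as placed by the narrator (illustrative values only):
# x > 0 is the narrator's right, y > 0 is away from the narrator.
narrator_view = {"table": (1.0, 2.0), "lamp": (-1.0, 0.5)}

# The addressee faces the narrator, so the same layout, expressed in the
# addressee's own left/right and near/far terms, is rotated by 180 degrees.
addressee_view = {
    name: rotate_in_plane(position, 180)
    for name, position in narrator_view.items()
}

for name in narrator_view:
    print(name, narrator_view[name], "->", addressee_view[name])
```

Under these assumptions both coordinates change sign: an object placed to the narrator's right and far from the narrator lies to the addressee's left and close to the addressee.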
It has to be noted that the use of iconic manual gestures for static spatial descriptions in spoken languages does not fundamentally differ from that of signed spatial descriptions. Manual gestures accompanying a spoken description may exhibit a similar level of iconicity and simultaneousness as classifier constructions in SL, and the various ways of perspective-taking and orientation in “gesture space” (Haviland, 2000) are also analogous to SL, including the use of the narrator’s perspective. However, addressees of spoken spatial description do not have to rely solely on visual spatial information in order to decode the narrator’s description (as signers do).

3 Empirical studies of mental rotation

Since the late 1980s, there has been ongoing research dedicated to the possible involvement of mental rotation (MR) during SL processing. In general cognitive psychology, MR was first described by Shepard and Metzler (1971), who carried out an experiment based on the recognition of the identity of two objects rotated differently in three-dimensional space. They found that the subjects’ reaction times (RT, i.e., the time required for subjects to recognize whether or not two shapes are identical) increase in direct proportion to the angle of rotation of the two objects. Moreover, the subjects reported that they used imaginary rotation of the objects to recognize the differences between them. That led Shepard and Metzler to call this process “mental rotation in three-dimensional space” (Shepard & Metzler, 1971, p. 703). The average rotation speed was 60° per second (ibid.).

In this section, I will briefly review several studies on MR abilities associated with SL use, with respect to major factors affecting MR skills in deaf as well as hearing subjects. One factor should be mentioned beforehand: gender.
The fact that men in general perform better on MR tasks has been well documented in cognitive psychology (for a review see, e.g., Parsons et al., 2004). The exact reason for this male advantage in MR skills has not yet been clarified.

The first study investigating MR abilities in deaf subjects was carried out by McKee (1987). McKee’s task was based on Shepard and Metzler’s design: an identity judgment task using mirror images as stimuli. Comparing deaf ASL signers and hearing English-speaking non-signers, McKee found that the deaf performed significantly better than hearing subjects on the task. Emmorey, Kosslyn, and Bellugi (1993) tested deaf and hearing ASL signers as well as hearing non-signers. In their study, participants were asked to recognize whether two differently rotated images were mirror images of each other or were identical. The results revealed that the signers, regardless of whether they were deaf or hearing, outperformed hearing non-signers in mirror-image/identity tasks that require MR. Talbot and Haude (1993) tested MR skills in hearing ASL signers with various levels of ASL proficiency, and Chamberlain and Mayberry (1994) and Parasnis and colleagues (1996) compared deaf non-signers and hearing non-signers. The results of these studies revealed that ASL signers outperform non-signers regardless of the subjects’ deafness, and thus they lead to the conclusion that better performance in MR tasks or in identity/mirror-image recognition tasks that might require MR is, in fact, a result of ASL usage, not an effect of deafness. Moreover, the success rate appeared to increase with the level of ASL competence.

Emmorey, Klima, and Hickok (1998) conducted a series of experiments testing (i) MR abilities in deaf ASL signers during language comprehension tasks, and (ii) MR effects in deaf ASL signers and hearing (non-signing) American English (AE) speakers during non-linguistic tasks.
In the first part of the study, the authors investigated whether ASL signers have to perform MR in order to properly interpret the signed description of a spatial scene from the narrator’s perspective. Eighteen (12 female, 6 male) deaf ASL users were presented with a set of visual stimuli containing a sequence of images depicting various kinds of furniture arranged in a room, followed by a video of a signed description. The description either matched or did not match the preceding image, and half of the descriptions were from the perspective of the narrator. After viewing each sequence, participants were asked to decide whether or not the description matched the room. RT was not measured[12] in this study, only correct answer scores. The results were surprising: when MR was supposed to take place, that is, when the descriptions to be interpreted were presented from the narrator’s perspective, correct responses were given significantly more often than in the case of the addressee’s perspective. Such results imply that if any MR effect is present during comprehension of spatial descriptions, it is stronger with the less frequent type of perspectivization. Emmorey and her colleagues (1998) interpret the results as follows:

“These findings suggest that the advantage for processing the canonical (most frequent) linguistic expression overrides the difficulty imposed by mental rotation. The results further document the assertion that not only narrators but addressees as well prefer spatial descriptions from the narrator’s point of view, despite the mental rotation requirements for the addressee when this viewpoint is adopted” (p. 226).

[12] Since MR is defined by subjects’ RT in object identification tasks, the design of this study only allows for capturing the indirect effect of MR on spatial memory. Hence, by MR effect I mean here both MR effect in the original sense as well as this indirect MR effect.

The second part of the study contained another variant of the MR task. The task was based on manipulating real objects in order to reconstruct a spatial scene (objects arranged on the horizontal plane) presented as visual stimuli in the form of manual ASL signs (lexical signs or classifier constructions) or images of objects sequentially appearing on the screen[13]. In the SL condition, 15 (10 female, 5 male) deaf ASL signers were presented with a series of signed descriptions of the spatial arrangement of different types of objects in a room with a marked entrance (and thus marked narrative perspective). In half of the stimuli descriptions, the narrator’s perspective was taken; in the other half, the addressee’s perspective was adopted. After viewing each description, participants were asked to reconstruct the spatial scene with corresponding miniatures of objects on a desk representing a room from the top view. The position of the entrance was fixed (towards the subject), and so subjects were required to perform a 180° MR in order to place objects on the desk correctly in half of the reconstruction instances.

[13] I will address and discuss the design of this experiment in more detail below.

In the object condition, 15 deaf ASL signers (same group as in SL condition) and 15 (8 female, 7 male) hearing AE speakers without competence in ASL were presented with videos of sequentially appearing objects on the desk representing a room with a marked entrance. Half of the rooms were oriented with the entrance towards the subject and half with the entrance on the opposite side. Afterwards, subjects performed the reconstruction tasks as in the SL condition.
For both conditions, the object placement and orientation accuracy were measured. The results of the experiment in the SL condition did not reveal any significant MR effect. In the object condition, both subject groups performed worse when rotation was required. However, the deaf group significantly outperformed the hearing group. There was no significant gender effect in either experiment. These findings show differences in spatial reasoning associated with language processing and non-linguistic spatial cognition in deaf subjects: “signs in space are treated differently than objects in space. It appears that the habitual use of mental rotation when comprehending ASL leads to the attenuation or reversal of the normal mental rotation effect during language processing” (Emmorey et al., 1998, p. 240). However, it “only reduces, rather than eliminates, rotation difficulty within the non-linguistic domain” (ibid.). Nevertheless, deaf subjects did not report any kind of mental transformation of imaginary objects during the SL comprehension task. Therefore, Emmorey and colleagues offer another plausible explanation for the stronger MR effect when rotation was not required: it might be that the signers actually do not perform MR but instantly perceive a mirror image of the described spatial scene instead. SL speakers might be capable of automatic reversal of the mental image created during comprehension processes, transforming it into a production-like subjectivized mental image that fits the subjective signing space. ASL usage, thus, does not enhance MR skills as such but rather influences other spatial-cognitive processes, some of which may be associated with MR. Emmorey’s conclusions about the effects of mental rotation on ASL comprehension were supported by the findings from German Sign Language (DGS) by Perniss (2007). 
In her thorough study of space and iconicity in DGS, Perniss used a modification of the Man and Tree task[14] to elicit static spatial descriptions from 8 deaf DGS signers. The results of her study show, in accordance with Emmorey’s explanation, that “the transformation that takes place in interpretation [of a static spatial scene] is not a rotation of the representation itself (as a whole), but indeed a type of self-rotation or ‘comprehension by imagined doing’” (Perniss, 2007, p. 164). Although Perniss did not actually measure MR effects, she did observe that deaf signers, in a task that required reconstruction of a signed spatial description using physical objects, arranged the objects in the same manner as the narrators did when they were placing respective signs in the signing space, regardless of which perspective the narrators adopted.

[14] The Man and Tree task, being a part of so-called “Nijmegen Space Games,” was designed at Max Planck Institute for Psycholinguistics in Nijmegen as a tool for elicitation of frame of reference usage in spatial language (Senft, 2007).

Pyers and her colleagues (2010) compared two groups of native users of Nicaraguan Sign Language (ISN; 16 in total; subjects’ gender not reported), a developing SL that emerged in the 1970s after the establishment of the Nicaraguan deaf community under specific conditions, allowing SL linguists to observe the emergence and development of a new language in situ. The first group of subjects in Pyers’ study acquired early forms of ISN while the second group acquired ISN 10 years later, when the language had evolved into a generally more complex form, especially in regard to spatial grammar. Deaf subjects participated in a series of non-linguistic spatial cognition tasks involving the rotation of images. The results showed that the ISN signers from the second, “more developed” group were more precise when identifying whether or not two rotated images were the same.
In another study of MR abilities in users of ISN, Martin, Senghas, and Pyers (2013) focused on more specific aspects of the transformation of mental images. Since the majority of rotations of spatial scenes associated with SL comprehension take place on the horizontal plane, there is a question as to whether enhancement of non-linguistic MR skills in SL users is also axis-specific. Another feature of SL rotations that may influence the character of general MR abilities is the prominent role of rotation of human subjects in SL spatial language because of a so-called referential shift (hand/bodily motions marking the shift of vantage point from one subject to another in discourse; see, e.g., Emmorey, 2002). Another question thus arises as to whether or not the performance in MR tasks involving human figures differs from that of tasks with abstract objects. There is also a question of whether it is the age of onset of acquisition (AoA) or the length of exposure and practice of using SL that puts a more significant constraint on the enhancement of spatial cognition in SL users.

Martin and colleagues (2013) tested six groups of subjects: deaf ISN adults (n=13) and children (20), with both groups divided into subgroups based on AoA (before or after the age of 6), and Spanish-speaking hearing Nicaraguan adults (11) and children (5)[15].

[15] Gender distribution was not reported for any of the subject groups.

Participants carried out a range of canonical MR tests with objects (human figures/abstract shapes) on horizontal/vertical planes. Object identification accuracy and RT were recorded. In both the adult and children deaf groups, subjects with early AoA outperformed late learners. As predicted, both signer groups outperformed hearing non-signers in both experimental conditions. Every group performed better on a horizontal plane and with human figures.
As previously demonstrated by Sayeki (1981), there is a tendency towards better proficiency in MR tasks when human-like stimuli are used. Martin and colleagues also pointed out that it is possible that deaf subjects make use of their experience with “mental handling” of human figures when comprehending referential shifts: “The human figure may invite the viewer to embody the stimulus more readily, enabling a visual perspective-taking strategy” (Martin et al., 2013, p. 248).

In sum, the studies reported here provide evidence suggesting that deaf signers generally perform better than deaf or hearing non-signers and hearing signers in non-linguistic MR and/or mirror-image recognition tasks. This might be due to the effect of SL use, although it is not clear whether MR necessarily occurs during SL processing. Supported especially by the findings of Emmorey et al. (1998), it seems that it is not MR as such but a kind of instant mirror-image projection that is employed while comprehending (and maybe also producing) spatial descriptions from the narrator’s perspective (and during other perspectivization processes, such as referential shifts and perspective/frame of reference changes within a single signing space). This occurs in ASL and ISN (where there is a preference for the narrator’s perspective) use so frequently that it can actually have an effect on non-linguistic spatial reasoning, including MR abilities.

Although it is considered to be a major predictor of MR ability, the gender factor does not seem to play a very important role in the studies reviewed above. The possible correlation between gender and MR skills is not, surprisingly, stronger than the effect of language-related factors, or is not reported at all. The AoA of SL is, however, a significant factor as a higher success rate in MR tasks correlates, according to Martin and colleagues’ (2013) ISN data, with the early AoA of ISN.
This factor outweighs the overall length of exposure to SL, and thus the length of practice of MR skills. The absence of this effect is inconsistent with the research on practice effects on MR skills in the hearing population (Beaninger & Newcombe, 1989). This points to an SL-specific influence on the nature of mental imagery in native or near-to-native users of SL, regardless of whether they are deaf or hearing, with an early onset of language acquisition. Jakub Jehlička 290 4 Experimental study of mental rotation abilities in CzSL signers and Czech speakers Since the perspectivization strategies in CzSL do not differ markedly from ASL, and as there is a preference for the narrator’s perspective in spatial descriptions in both languages (Tučková, 2013), we might expect similar linguistic effects on non-linguistic MR skills in CzSL native signers (compared to non-signing hearing Czech speakers). As for the comparison of the MR skills of hearing AE speakers and hearing Czech speakers, there is no apparent reason for assuming languagebased differences in MR abilities between speakers of English and Czech. Although Germanic and Slavic languages employ various grammatical devices for static spatial descriptions (e.g., prepositional vs. case construction) and some spatial relations may be conceptualized differently, there is no reason to expect any effect on perspective-taking when describing a static spatial scene motivated by a different linguistic structure. Nevertheless, the use of non-verbal means should be taken into account, namely, the role of gesticulation in Czech and English 16 . In order to investigate the possible effect of CzSL use on MR abilities in deaf CzSL signers, I adapted 17 part of the experimental design used by Emmorey and colleagues (1998). Only the second part of the original study has been carried out, that is, research on MR in Czech hearing non-signers and deaf CzSL signers. 
The reason for choosing this study as a basis for my research is that (i) it takes into account stimulus variables (type of stimulus and axis of the object display), (ii) its design allows for an effortless replication involving subjects with different linguistic backgrounds, (iii) it uses various types of stimulus presentation (videos of SL production or objects), and (iv) it allows for further extension of the study to different subject groups (CzSL interpreters, CzSL learners). The present study is the first attempt to explore spatial-cognitive skills in users of CzSL.

4.1 Research question

The aim of the study was to assess possible correlations between subject score and subject gender, SL competence, rotation of the scene, object type, and the axis of the object arrangement. All subjects were expected to be less accurate in the rotation condition than in the non-rotation condition. In both conditions, male subjects were predicted to score better than female subjects. Deaf signers were expected to outperform hearing subjects in the rotation condition, and their accuracy was expected to be higher (i) with the animate stimuli compared to other stimulus types and (ii) when the stimuli were presented on the horizontal axis.

[16] The present study, however, does not deal with the influence of gesticulation on the MR abilities of hearing subjects.
[17] All modifications I made are mentioned below. Except for those modifications, the experimental design described here corresponds with the original study.

Influence of spatial language on the non-linguistic spatial reasoning 291

4.2 Subjects

The analyzed data were collected from 13 hearing native Czech speakers (10 women, 3 men; mean age 20.2 years; age range 19-24). Subjects in the hearing group were recruited via the Laboratory of Behavioral and Linguistic Studies in Prague [18], where the experiment took place. All hearing subjects reported no hearing impairments and no competence in CzSL or any other SL.
All the subjects were philology or psychology students at Charles University in Prague and received financial compensation for their participation in the experiment [19]. Deaf subjects were recruited individually. The collection of a sample comparable to that of the original study is still in progress. So far, data from only 4 deaf subjects (2 women, 2 men; mean age 34 years; age range 30-37; 2 subjects reported CzSL as their native language, 2 acquired CzSL before the age of 5) have been obtained. Thus, a full analysis of the data has not yet been carried out, and I will address only the interim results of the hearing group, taking into account only the general MR effect without respect to the stimulus factors.

[18] A joint project of the Institute of Psychology, Czech Academy of Sciences, and Faculty of Arts, Charles University in Prague (http://labels.ff.cuni.cz).
[19] Gender disproportion was due to the general prevalence of women in the gender distribution of the students of the particular study programs.

4.3 Procedure

After participants received their instructions (hearing participants in Czech; deaf participants in CzSL, with a CzSL interpreter at their disposal during the entire experiment; both groups were also given the instructions in writing as part of the informed consent form), they were seated in front of a computer screen. They were presented with 45 sequences of visual stimuli, each followed by the spatial memory tasks involving rotated objects.

The stimuli consisted of 45 sequences of photographs depicting various objects located on a desk representing a room with a marked entrance. The order of sequences was as follows:

1. Image of a desk with a marked entrance, either on the side closer to the viewer (fig. 1) or rotated 180˚ (fig. 2)
2. Image of a blank desk
3. Images of 2-4 objects appearing sequentially, one by one, on a horizontal or vertical axis of the desk without a marked entrance

The duration of the display of every image was set on the basis of the duration of the production of the respective signs by a native CzSL signer. Rotated and non-rotated as well as horizontal and vertical scenes were distributed equally but randomly in the tasks.

Figure 1: Sequence of stimuli: without rotation, vertical axis (“vchod”=entrance) (Jehlička, 2014, p. 45)

Figure 2: Sequence of stimuli: with rotation, horizontal axis. In the present study, the order of objects was reversed ([6]>[5]>[4]>[3]) (Jehlička, 2014, p. 46)

In the original study, the order of appearance of the objects was designed to imitate the canonical order of lexical signs or classifier handshapes in the signed description of the spatial arrangement of objects in a room from the narrator’s or addressee’s perspective. Objects were therefore “popping up” from left to right/front to back if from the addressee’s perspective, or the other way around if from the narrator’s perspective. In my adaptation, the order of objects was modified, remaining the same in both rotation conditions: left>right/front>back. When the direction of the objects’ display was the same in both conditions, participants had to rely on the first stimulus only. This modification was expected to make the task more difficult in order to avoid a possible ceiling effect.

Objects were divided into three groups: animate figures (human and animal), artifacts (e.g., miniature chair or toy wheelbarrow), and blocks (basic symmetric shapes, e.g., cone or triangle). After viewing a stimulus sequence, subjects were asked to reconstruct the object arrangements using an actual desk and the same objects that were used in the stimulus images. The desk was fixed so that the marked entrance was still in the same position, requiring participants to create a mirror image of half of the viewed spatial arrangements. There were 3 practice trials and 12 test trials in each object condition. The subjects’ success in the task was determined by the correct placement of an object on the desk and by the correct orientation of the object with respect to the entrance. The object orientation condition did not apply to the block type of objects.

4.4 Interim results

The percentage of the individual objects placed in the correct location and with the correct orientation was calculated for the hearing group. Mean object placement scores of the hearing group compared with the results of Emmorey and colleagues (1998) are presented in the tables below.

                        Without rotation   SE    Rotation   SE
Jehlička (2014)         91.6               2.5   89.4       1.4
Emmorey et al. (1998)   91.2               1.6   76.0       3.4

Table 1: Object location accuracy (%) (SE=standard error)

                        Without rotation   SE    Rotation   SE
Jehlička (2014)         80.9               4.2   74.1       3.7
Emmorey et al. (1998)   83.5               3.6   64.4       4.9

Table 2: Object orientation accuracy (%) (SE=standard error)

As expected, Czech hearing subjects scored better when no rotation was required in both object location (91.6 % of correct answers, only 0.4 percentage points better than the AE group in the original study) and object orientation tasks (Czech 80.9 % vs. AE 83.3 %). In the rotation condition, the average accuracy was lower; however, Czech hearing subjects were more accurate than AE subjects (by 13.4 percentage points in object location and 9.7 percentage points in object orientation).

5 Discussion

Apparently, when no rotation was required, hearing Czech speakers performed similarly to the AE group in the original study in both object placement and orientation. However, the effect of rotation was markedly weaker than in the original study, regardless of whether object placement or orientation was measured, or of any other variable. The seeming lack of an MR effect in comparison to Emmorey and colleagues’ results is very likely due to the modification of the experimental design and not a result of actual cross-linguistic differences. Contrary to what was assumed, the constant direction in which the objects appeared on the screen might have facilitated the responses and created a ceiling effect, diminishing the MR effect in the rotation condition. However, such results in the hearing group could also (and perhaps primarily) have been caused by the duration of the display of stimuli. Emmorey and colleagues (1998) do not report the exact timing of the stimuli display based on ASL production. The problem lies in the individual differences between the signers whose production served as the sample. Even when they were instructed to produce the description at a normal pace, the actual difference in the duration of the utterances might have been a few seconds.
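The weaker rotation effect noted at the start of this discussion can be made concrete as a "rotation cost", that is, the drop in mean accuracy from the non-rotation to the rotation condition. The following is a minimal Python sketch that assumes only the mean values from Tables 1 and 2; the dictionary layout, the group labels, and the function names are illustrative assumptions and not part of either study. It ignores the standard errors and is not a substitute for a statistical test.

```python
# Mean accuracy (%) copied from Tables 1 and 2. The keys and the nesting
# are hypothetical labels chosen for this illustration only.
ACCURACY = {
    "location": {
        "czech_hearing_2014": {"no_rotation": 91.6, "rotation": 89.4},
        "ae_hearing_1998": {"no_rotation": 91.2, "rotation": 76.0},
    },
    "orientation": {
        "czech_hearing_2014": {"no_rotation": 80.9, "rotation": 74.1},
        "ae_hearing_1998": {"no_rotation": 83.5, "rotation": 64.4},
    },
}


def rotation_cost(scores):
    """Drop in mean accuracy (percentage points) caused by rotation."""
    return round(scores["no_rotation"] - scores["rotation"], 1)


def rotation_costs(accuracy=ACCURACY):
    """Rotation cost for every measure and group in the accuracy table."""
    return {
        measure: {group: rotation_cost(s) for group, s in groups.items()}
        for measure, groups in accuracy.items()
    }


if __name__ == "__main__":
    for measure, groups in rotation_costs().items():
        for group, cost in groups.items():
            print(f"{measure:12s} {group:20s} {cost:5.1f}")
```

Under these assumptions, the rotation cost of the Czech hearing group is a few percentage points, whereas that of the AE group in the original study is considerably larger for both measures, which is the pattern the discussion describes as a weaker rotation effect.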
Such a difference has to be considered an important factor for the subsequent non-linguistic spatial tasks because a longer duration of visual stimuli presentation may facilitate the storage and retrieval of the visual information in working memory. As the interim results have indicated, the experimental design has several shortcomings and should be substantially revised. First, the order of the stimuli sequence should remain as in Emmorey and colleagues’ study, that is, reflecting the order of signs in the signed spatial description, to avoid any possible effect on MR performance in both subject groups. Second, the duration of visual stimuli presentation must be set based on a larger sample of spatial descriptions elicited from various native CzSL signers. At the moment, there is no appropriate corpus of natural interactions in CzSL available, so the descriptions have to be elicited from a relatively large number of signers in various communication situations. The average production time of the individual lexical signs or classifiers will then be used for the timing of the stimuli presentation in the object task. Also, the gender distribution of the sample has to be balanced: the number of male participants should be increased in order to measure possible effects of gender on subjects’ performance in MR tasks.

Conclusion

So far, empirical research on MR abilities has presented strong evidence for SL use-associated enhancement of MR skills in some deaf as well as hearing users of SLs. However, the conditions under which signers acquire those skills remain unclear. Other questions remain open, such as whether this phenomenon is gender-specific or how it is related to deafness. Experimental methods based on the classic MR task (i.e., behavioral tasks) are clearly not sufficient for resolving this problem.
Even so, it is meaningful to continue with the behavioral studies, one of the reasons being the importance of replication. First, replication studies may lead to establishing the methodological baselines that MR studies with deaf subjects seem to lack. Second, it is necessary to seek possible SL use-based MR effects cross-linguistically since the evidence from different SLs remains limited. Any future cross-linguistic research on the effects of CzSL use on non-linguistic spatial cognition would benefit greatly from the existence of a thorough description of spatial language in CzSL.

References

Allan, K. (1977). Classifiers. Language, 53, 285-311.
Arik, E., & Milković, M. (2007). Perspective taking strategies in Turkish Sign Language and Croatian Sign Language. LSO Working Papers in Linguistics 7: Proceedings of WIGL 2007, 17-31.
Arik, E. (2008). Locative constructions in Turkish Sign Language (TID). In R. M. de Quadros (Ed.), Sign languages: Spinning and unraveling the past, present, and future. TISLR9, the Theoretical Issues in Sign Languages Research Conference (pp. 15-31). Petropolis: Editora Arara Azul.
Beaninger, M., & Newcombe, N. (1989). The role of experience in spatial test performance. A meta-analysis. Sex Roles, 20, 327-344.
Bellugi, U. (1972). Studies in sign language. In T. J. O’Rourke (Ed.), Psycholinguistics and total communication: The state of the art (pp. 68-84). Washington, DC: American Annals.
Chamberlain, C., & Mayberry, R. (1994). Do the deaf ‘see’ better? Effects of deafness on visuospatial skills. Poster presented at TENNET V, Montreal, Quebec.
De Vos, C., & Zeshan, U. (2012). Introduction: Demographic, sociocultural, and linguistic variation across rural signing communities. In U. Zeshan & C. de Vos (Eds.), Sign Languages in village communities: Anthropological and linguistic insights (pp. 2-23). Berlin: Mouton De Gruyter.
De Vos, C. (2012). Sign-spatiality in Kata Kolok: How a village sign language in Bali inscribes its signing space (Unpublished doctoral dissertation). Radboud University, Nijmegen. Retrieved from http://hdl.handle.net/2066/99153
Emmorey, K., Kosslyn, S., & Bellugi, U. (1993). Visual imagery and visual-spatial language: Enhanced imagery abilities in deaf and hearing ASL signers. Cognition, 46, 139-181.
Emmorey, K., Klima, E. S., & Hickok, G. (1998). Mental rotation within linguistic and nonlinguistic domains in users of American Sign Language. Cognition, 68, 221-246.
Emmorey, K. (2002). Language, cognition, and the brain. Insights from sign language research. Mahwah, NJ: Lawrence Erlbaum and Associates.
Emmorey, K. (2003). Perspectives on classifier constructions in sign languages. Mahwah, NJ: Lawrence Erlbaum and Associates.
Emmorey, K. (2007). The psycholinguistics of signed and spoken languages: How biology affects processing. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 703-721). Oxford: Oxford University Press.
Haviland, D. (2000). Pointing, gesture spaces, and mental maps. In D. McNeill (Ed.), Language and gesture (pp. 1-46). Cambridge: Cambridge University Press.
Jehlička, J. (2014). Prostorová kognice mluvčích češtiny a českého znakového jazyka: Jak mezijazyková diverzita ovlivňuje nejazykovou kognici (Unpublished master’s thesis). Charles University, Prague. Retrieved from https://is.cuni.cz/webapps/zzp/download/120169738/?lang=en
Levinson, S. C. (1996). Frames of reference and Molyneux's question: Cross-linguistic evidence. In P. Bloom, M. Peterson, L. Nadel, & M. Garrett (Eds.), Language and space (pp. 109-169). Cambridge, MA: MIT Press.
Levinson, S. C. (2004). Space in language and cognition. Explorations in cognitive diversity. Cambridge: Cambridge University Press.
Levinson, S. C., & Wilkins, D. (Eds.) (2006). Grammars of space. Explorations in cognitive diversity. Cambridge: Cambridge University Press.
Linde, C., & Labov, W. (1975). Spatial networks as a site for the study of language and thought. Language, 51, 924-929.
Martin, A., Senghas, A., & Pyers, J. (2013). Age of acquisition effects on mental rotation: Evidence from Nicaraguan Sign Language. In S. Baiz, N. Goldman, & R. Hawkes (Eds.), BUCLD 37: Proceedings of the 37th Boston University Conference on Language Development (pp. 241-250). Boston, MA: Cascadilla Press.
McKee, D. (1987). An analysis of specialized cognitive functions in deaf and hearing signers (Unpublished doctoral dissertation). University of Pittsburgh.
Morford, J. P., Nicodemus, B., & Wilkinson, E. (2015). Research methods in psycholinguistic investigations of signed language processing. In E. Orfanidou, B. Woll, & G. Morgan (Eds.), Research methods in sign language studies: A practical guide (pp. 209-249). Hoboken, NJ: John Wiley & Sons.
Parasnis, I., Samar, V. J., Bettger, J. G., & Sathe, K. (1996). Does deafness lead to enhancement of visual spatial cognition in children? Negative evidence from deaf nonsigners. Journal of Deaf Studies & Deaf Education, 1, 145-152.
Parsons, T. D., Larson, P., Kratz, K., Thiebaux, M., Bluestein, B., Buckwalter, J. G., & Rizzo, A. A. (2004). Sex differences in mental rotation and spatial rotation in a virtual environment. Neuropsychologia, 42, 555-562.
Perniss, P. (2007). Space and iconicity in German Sign Language (DGS). MPI Series in Psycholinguistics 45. Nijmegen: Radboud University.
Pyers, J., Shusterman, A., Senghas, A., Spelke, E., & Emmorey, K. (2010). Evidence from an emerging sign language reveals that language supports spatial cognition. Proceedings of the National Academy of Sciences, 107(27), 12116-12120.
Sayeki, Y. (1981). “Body analogy” and the cognition of rotated figures. The Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 3(2), 36-40.
Schembri, A. (2003). Rethinking “classifier constructions” in signed languages. In K. Emmorey (Ed.), Perspectives on classifier constructions in sign languages (pp. 3-34). Mahwah, NJ: Lawrence Erlbaum and Associates.
Senft, G. (2007). The Nijmegen space games: Studying the interrelationship between language, culture and cognition. In J. Wassmann & K. Stockhaus (Eds.), Experiencing new worlds (pp. 224-244). New York, NY: Berghahn Books.
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701-703.
Stokoe, W. C. (1960). Sign language structure: An outline of the visual communication systems of the American Deaf. Studies in Linguistics Occasional Paper 8. New York: University of Buffalo.
Talbot, K. F., & Haude, R. H. (1993). The relationship between sign language skill and spatial visualization ability: Mental rotation of three-dimensional objects. Perceptual and Motor Skills, 77, 1387-1391.
Talmy, L. (2003). Spatial structure in spoken and signed language. In K. Emmorey (Ed.), Perspectives on classifier constructions in sign languages (pp. 169-195). Mahwah, NJ: Lawrence Erlbaum.
Tikovská, L. (2006). Klasifikátory českého znakového jazyka (Unpublished bachelor’s thesis). Charles University, Prague. Retrieved from https://is.cuni.cz/webapps/zzp/download/130026342/?lang=en
Tučková, D. (2013). Deixe a prostor v českém znakovém jazyce (Unpublished bachelor’s thesis). Charles University, Prague. Retrieved from https://is.cuni.cz/webapps/zzp/download/130113912/?lang=en
Wilcox, S., & Morford, J. P. (2007). Empirical methods in signed language research. In M. Gonzalez-Marquez et al. (Eds.), Methods in cognitive linguistics (pp. 171-200). Amsterdam, Philadelphia, PA: John Benjamins.
Index abstract 57, 78, 138, 140-141, 144- 145, 147-148, 212, 235, 273, 288- 289 word 273, 278 acceptability 11, 20, 30, 34, 47, 49, 62, 95, 101, 106, 151-153, 158, 172 judgment 151-153, 158 access 16, 77, 135, 191, 228 achievement 96, 99 acquisition 9, 19, 23, 30, 49, 79, 194- 195, 222-223, 255, 261, 279-280, 288, 297 first language 31, 151, 275 adaptation 11, 84, 106, 155, 280, 293 advancedness 23 adverbial 57 count-quantifier 93 durative 96-97, 99 quantifier 102 time-frame 97 age 22-23, 26, 29, 53, 78, 157, 159, 195, 201, 222-223, 226, 232-233, 244, 270, 279, 288, 291 age at the time of immigration 193 of acquisition (AoA) 279, 288- 289 of onset early vs. late 193 Aktionsarten 11, 113-115, 117, 121, 127, 129-131 American English 81, 222, 279-280, 285 animacy 142, 154-155, 157, 172 category of 154, 163 acquisition 9, 19, 23, 30, 151, 194- 195, 261-262, 280 Arabic 25-26, 227, 265, 277 articulation 82, 214-215, 217-218 gliding 215 aspect 11, 28-29, 35, 50, 52, 57-58, 84-86, 88-98, 107-108, 114, 121, 142, 183, 227 differentiation of tense and aspect 84 grammaticalized 85 imperfective 59 (see also imperfective) lexical 83 perfective 59, 62, 167 (see also perfective) processing of 84, 95, 101-103, 109 Slavic 23, 85, 92, 110 use 28, 88, 91, 94, 107, 109 verbal 8-9, 11, 51, 83-85, 87, 90, 92, 94-95, 97, 100-103, 105, 107- 110 see also grammatical aspect aspectual 8, 26, 51, 84-87, 89-92, 94- 96, 98, 101, 103, 105, 107-110, 113- 114, 130-132 feature 84-85, 87, 94, 96 grammaticalization 94 interpretation 95, 109 mismatch 96, 108 processing 96 Index 300 system 103, 114 bare nominals 84, 88, 110 bare singular 90 Belarusian 13, 257, 259, 261-262, 266, 268-270, 273-274, 276-278 Belarusian-Russian 257, 259, 261, 268, 273-274, 276-277 mixed speech (BRMS) 259-262, 266, 269, 273, 277, 317 bilingual 22, 29, 159, 191-194, 212- 213, 216, 218, 225, 231-232, 238, 257-259, 261, 263, 268, 274-275 early 194-195, 226 language representation 258, 274 late 194, 201, 209 mental lexicon 12, 
191, 193-194, 201, 227, 258 mode 236 type of 257 bilingualism 12, 193-194, 230, 244, 257, 259, 274-275, 277-278 case of doubt 153-155, 169 categorization 27, 240, 262 Český národní korpus (ČNK) 102, 106 classification trees 123 classifier 279, 281-284, 286, 293, 297-298 constructions 279, 282, 284, 286, 297-298 cloze test 113, 117, 130, 175, 180, co-activation 12, 194-195, 204, 220 intraand interlingual 191 code-mixing (CM) 257-258, 266, 275 code-switch, code-switching (CS) 13, 231, 257-258, 263-266, 268, 270-275, 277-278 coding scheme 22, 26, 28 cognate 196, 258, 268, 273 coincidence 101 degree of 106 comparability 19, 22, 83-84, 87, 98, 106, 237 comparative study 94, 108, 131 competition 31, 104, 137-141, 145, 147-148, 175, 194-196, 212 comprehension 7, 11, 14, 79, 81, 96, 132, 134, 136, 190, 227, 232, 278, 280, 284-285, 287-288 concrete 10, 15, 18, 20, 47-48, 57, 85, 228, 235, 244, 273, 278 names of concrete objects 195 word 274 conditions 21, 38-39, 45, 97, 105, 107-108, 126, 136, 140, 143-145, 156, 192, 266, 268, 287-288, 291, 293, 296 conjugational class 12, 142 consonants 177-178, 182, 184, 213- 215, 218 construction of materials 191 context 11-12, 14, 20, 30, 38, 43, 46- 47, 49-50, 54, 56, 58-59, 65, 68, 72, 75-77, 82, 84-86, 91, 93, 96, 99, 101-102, 105-108, 113, 119, 121- 122, 124, 126, 129-130, 138-139, 148, 175-176, 180, 182-186, 225, 228-229, 232-233, 237, 262, 264, 268-270, 272 contextual features 86 corpus 11, 29, 47, 52-55, 58, 62-64, 66, 71-72, 76-77, 92, 102, 106, 113- 115, 127, 129, 152-153, 168, 176- 179, 185, 187, 192, 196-200, 210- 211, 221, 295 of spontaneous Russian 177-178 cross-linguistic 11, 18, 98, 103, 108- 110, 131, 195-196, 199, 213, 234, 255, 258, 280, 283, 295-296 data 84, 98 difference 83-84, 295 experiment 11, 83-85, 87, 95, 108 phonetic differences 213 301 research 18, 87-88, 94, 103, 108, 296 Czech 5-7, 11, 13, 21-23, 25-26, 28- 29, 36, 83-84, 92-95, 101-103, 105, 134-136, 138-139, 142, 145, 147- 153, 279-280, 290-291, 294-295 
database 20, 55, 65, 192, 199 declension 142, 147, 149, 155, 162- 163 hard 139-140, 145 soft 139 declensional 12 class 142 definiteness 34, 36, 43, 45, 48, 50, 90, 100-101 definite 36, 43-44, 49, 90, 99-100, 138 derivation 98, 136, 150 distribution 44, 69, 113-114, 118- 119, 121, 127, 206, 208, 210, 242, 265, 273, 288, 291, 295 Dutch 25-26, 53, 79, 81, 137-138, 150, 182, 188 ecological 30, 77 elicitation 10, 15, 19-21, 23-26, 49, 203, 273, 287 encoding 21, 25, 48, 142, 144, 147- 148, 175, 191 English 8, 13, 21, 25-26, 34, 36-37, 42, 45, 47-48, 63, 76, 84-91, 130, 148, 176, 191-192, 194, 197-199, 213, 229-230, 263, 279, 285, 290, event 21, 24-25, 28, 35-38, 40, 42, 56- 57, 59, 85-87, 89, 91, 93-94, 101, 103, 105, 114-115, 127 Event Related Potential (ERP) 13, 16, 18, 257, 263-264, 272, 274-275, 277-278 executive control 231, 238, 253, 255 experiment 9, 11, 15-17, 19-20, 22, 23-25, 30, 38, 43, 52, 64-66, 68-71, 73-78, 83, 88, 94-99, 101-103, 105- 108, 115-118, 120-121, 123, 126- 130, 135-136, 138, 140, 144-145, 147-148, 152, 155-158, 168, 171, 176, 180-182, 184-187, 192, 195- 197, 201, 207, 212-213, 216, 218, 220-221, 263, 266, 268-269, 284, 286-287, 291 classical psychoacoustic 180 cloze test 185 dictation task 12, 175, 180-182, 184-185 forced choice 95, 101, 106 psycholinguistic 34-35, 113, 116, 127, 177, 180, 225, 227, 252 experimental 5-7, 9-13, 15-16, 18-21, 25-28, 34-35, 38, 40, 45, 53, 62, 64, 65-67, 71-72, 75, 83-85, 92, 101, 103, 106, 107-108, 113, 117-119, 128, 130-131, 134-137, 148, 151- 153, 175, 177, 180-182, 184-185, 191, 193, 207, 220-221, 225, 227, 230, 263, 289-290 design 12, 15, 18-19, 27-28, 30, 77, 83-84, 87, 98, 107, 108-109, 151, 192-193, 280, 290, 295 item 26, 65, 71, 75, 87, 98, 108 linguistics 15, 19 research 9, 35, 50, 83, 87, 107, 182, 220 sentence condition 107 setting 10, 19-20, 123, 130, 187 eye tracking 10-11, 15, 16, 19, 20, 22, 25-28, 49-50, 72, 83, 95, 181, 191-192, 194, 196-197, 201, 213, 218, 220 feature 48, 137, 140-141, 144-145, 
150, 169, 201, 203, 214-215, 235, 288 external 147 internal 147 filler 66, 76, 201, 207 fluency 12, 225, 227-228, 230-231, 238, 244, 246, 248, 252-255 Index Index 302 focus 7, 9-10, 15-16, 18, 20-21, 30, 53, 55-56, 58, 68, 93-94, 99, 101, 105, 115, 134, 187, 192, 220, 228, 231-233, 235, 239-240, 270, 280 frame 140, 235, 239, 283, 287, 289 frame of reference 283-284, 287, 289 French 21, 86, 138, 279 frequency 108, 117, 122, 173, 177, 185, 197-200, 210 concept of 191, 198 corpus 192, 197, 199-200, 211, 221 estimated 200 subjective 192, 200-202, 208, 210- 211, 221-222 word 19, 82, 191, 196-197, 200, 220, 222-224, 229-230, 237 word list 176, 179 functional model of spontaneous speech recognition 176 gender 12-13, 26, 29, 98, 137-142, 144-145, 147-149, 154, 185, 258, 285, 287-291, 295-296 German 6-7, 11-12, 21-22, 24-26, 28, 83-86, 91, 94-100, 107-108, 138, 147-148, 151, 154, 191-193, 195- 196, 198-211, 213-217, 220-221, 225, 232-242, 246-247, 249, 252- 253, 263, 270, 280, 284, 287 Germanic 36, 137, 142, 148, 192, 290 goal-oriented motion events 21, 25, 28 grammar 7, 11-12, 34, 47-49, 55, 114, 117, 131, 142, 151-152, 156, 232-233, 288 grammatical 9, 11-12, 17, 19-23, 27, 30, 34-35, 47, 49-50, 56, 83-85, 100, 107-108, 130-131, 137-138, 140-142, 144-145, 147, 151-160, 163-166, 168, 170, 178, 184, 186, 226, 228, 238-239, 259, 266, 274, 282, 290 aspect 11, 21, 22, 92, 96, 103 category of aspect 107 encoding 138, 140-141 judgment 17, 19-20 variations 151, 155 grammaticality judgment 12, 19, 151-153, 156, 160, 163-164, 166, 168, 170-171, 226 grammaticalization 63, 92, 101 degree of 92 grammaticalized 92 heritage speakers 194, 225-230, 232- 252 homomorphism 43 event-object 36, 38-39, 43, 47, 49 hypothesis 8, 17, 27-28, 55, 72, 96, 98, 103-105, 109, 122, 126-127, 129, 140-141, 145, 184, 186 seeing-for-speaking 25 split morphology 136 imaginary norm 153, 170-171 immigration 193 age at the time of immigration 193 imperfective 11, 29, 37-41, 57, 59, 61, 65-66, 85, 115, 123, 142 
increment size 96 incremental theme 36-37, 47, 91 incrementally 56, 96 indefinite 36, 43-44, 46, 49, 90, 100 indefiniteness 43, 51, 90, 100-101 information structure 34, 42, 44, 50, 51, 77, 102 inner-Slavic 83-84, 94, 101, 107-108 intercoder reliability 11, 22, 28, 30 interference 108, 137-138, 142, 128, 257-258, 261 picture-word 12, 137, 142, 150 semantic 108, 137-138, 257-258, 261 intonation 49, 186 intrinsic 145, 283-284 introspection 152 introspective gradated 156 303 Italian 21 item 19, 21, 25-28, 30, 45, 49-50, 54, 65-66, 68, 71, 75-76, 83-84, 96-100, 102-109, 121-122, 178, 181, 191- 192, 197-198, 203-208, 221, 227- 231, 235-238, 240-242, 244, 246- 247, 252-253, 257-258, 268 iterated 85, 93-94, 101-102 event 94, 101 iteration 94-95, 101-103, 105-106 bounded 93 judgment 20, 28-30, 49, 151, 158- 159, 165-166, 168-170, 257, 285 acceptablity 151-153, 158 thermometer 12, 152-153, 156- 159 see also grammaticality judgment Korean 45, 227 language 5-13, 24-25, 47-50, 66, 83- 87, 90, 92, 96, 102-103, 106-108, 138, 147-148, 171, 191-195, 199- 200, 207, 212, 214-216, 220, 227, 229, 231-232, 234-237, 240-241, 252, 257-263, 268, 273-275, 279- 283, 290, 296 acquisition 45, 49, 53, 63, 255, 273-275, 289 articleless 90 attrition 229-230, 233, 255 categories 257, 275 closely related 13, 257-258, 263, 265, 273-275 contact 13, 257, 259, 275 dominant 23, 194, 247, 253 mode 22, 26, 31, 195, 222 native 29, 31, 259, 291 second 9, 14, 20-21, 45, 50, 194- 195, 223, 255-256, 259, 278 signed 279-280, 282, 297-298 spatial 13, 279-282, 287-288, 296 spoken 18-20, 22, 155, 192, 198, 279-284, 297 system 131, 154, 257-258, 261, 274 Late Positive Complex (LPC) 265, 274 Levelt model 148 lexical 11-12, 34-35, 49-50, 52-54, 56, 62-63, 72, 75, 87, 89, 103, 113-114, 121-122, 124, 126, 177, 183-184, 186, 192, 194, 225-235, 247, 258- 260, 263, 265-266, 268, 273-274, 281-282, 286, 293, 295 access 117, 222, 228, 238, 258 aspect 83 decision 8, 12, 16-17, 72, 134-137, 180, 198 fluency 227, 233, 253-254 
proficiency 225, 227, 230-235, 250-253 lexicon 237, 265-266, 273 linguistic 7-8, 10, 13-20, 22-23, 25- 27, 30, 34-35, 37, 49-50, 54, 56, 59, 63-64, 68-69, 72-73, 76-77, 83-87, 95-96, 108, 124, 129, 148, 151-155, 157-160, 164, 170-171, 180, 191- 192, 194-199, 213, 225-226, 232, 234, 247, 257-258, 261-263, 266, 268-269, 273, 275, 279-280, 282, 286-287, 289-290, 296 norm 154, 262 rules 154, 155 theory 30, 81 marked 35, 37-38, 40, 43, 49, 85, 90, 96, 100-101, 120, 137-140, 142, 145, 147, 154, 249, 273, 275, 286- 287, 292, 294 markedness 263 mass 34-35, 37, 39-40, 42-43, 90-91, 259 argument 91 object 11, 34, 42, 91 matching 27, 56, 118-120, 206-208, 220, 229 Index Index 304 picture-word matching task 202, 206-207 memory 120, 128-229, 264, 286 working 231, 295 memory task 10, 15, 19-20, 24, 25- 27, 291 mental 12-14, 16, 18, 57, 65, 148- 149, 212, 260, 262, 286-290 grammar 12, 116-117, 131 lexicon 8, 134, 136, 175-177, 180, 182, 192, 196, 199, 220-221, 234, 258, 273, 280 lexicon of a listener 176, 180, 182 rotation (MR) 279-280, 284-291, 295-297 method 12, 15-17, 19-21, 23, 26-27, 52, 79, 83, 95, 118, 129, 131, 137, 151-152, 156-159, 172, 178, 181, 184, 186-187, 192, 200, 211, 230, 233, 252 measuring frequency 197 offline 16-18, 20 online 10, 16-20, 30, 180 problem 7, 109, 177, 187, 220, 226 true online 16, 18 mismatch 95-98, 102-103, 106 condition 97, 98 effect 95, 96 sentence 98 model 11, 52-53, 55-56, 58-60, 62, 64-69, 71-73, 75, 78-79, 121, 123- 124, 132, 135, 150, 154, 175-176, 180-181, 186, 209, 218, 227, 258, 274 monolingual bias 257, 275 morphologically marked 87 multilingualism 9, 225 N400 13, 264-265, 268, 272-274 natural communication conditions 187 near-complementary distribution 127 neuronal processes 16, 18 nonce-verbs 113, 128-131 non-cognates 266, 268, 271, 273 non-linguistic 13, 20, 27, 279-280, 285, 287-290, 295-296 method 27 task 27, 280, 285 non-random distribution 121, 127 Norwegian 48 numeral 155, 162-163, 167 object construal 41-42, 49 offline 10, 
16-20, 27, 30, 149, 181, 269 online 10, 16-20, 23, 27, 30, 84, 102, 112, 173, 180, 269, 277 P300 265 P600 265 palatalization 214, 217-218 pause 177, 186, 269 perceptual lexicon 175-176 perfective 8, 11, 28-29, 34, 37-43, 47, 49, 57-61, 65-66, 85, 115, 130, 142 perfectivity 38, 42, 43, 110 performance 16, 18, 26-27, 34, 60, 117, 152, 169, 232-233, 240-242, 244, 246-247, 250, 280, 285, 288, 295 performative utterances 101, 105 performatives 95, 101, 105, 109 periphery 12, 151, 153 periphery of the linguistic norm 151 perspective 8, 11, 32, 35, 54, 59, 110, 136, 191, 193-194, 280, 283-286, 288-290, 293 phoneme 212, 216 overlapping 213, 215 phonetic 12, 175-179, 182, 187, 212- 218, 220-221, 235, 239-240, 247, 253 and phonological differences 191, 193, 196, 212-214, 216-218, 220 305 distances 216-218 overlap 212 phonological 12, 34, 113, 138-141, 144-145, 147-148, 178, 189, 191- 193, 195-196, 212-217, 221 encoding 138, 140, 144, 148 facilitation 137 feature 214 overlap 191, 193, 196, 201, 212- 213, 215 picture 20, 106, 116, 140, 142, 144- 145, 150, 192, 196, 201-202, 204- 208, 244, 247, 275 matching of object names and pictures 220 picture naming 12, 137, 225, 228- 229, 231, 235, 252, 254-255, 275 Polish 12, 36, 92, 225, 227, 232-237, 239-242, 244-247, 250, 252-253 pre-tested 201, 220-221 priming 12, 29, 136-137, 149, 180, 258, 276 principle 92, 148, 217 basic methodological 187 probability 58-60, 64-65, 68, 70, 72- 75, 78, 117, 260 processing 8-13, 18, 22, 52, 54, 62,- 64, 74, 76, 84, 88, 95-96, 101-103, 105, 108-109, 130, 134-138, 140- 142, 144, 148, 155, 175, 191-192, 194, 196, 200, 257, 263-265, 273- 274, 284, 286-287, 289 cascaded 140-141 proficiency 21-23, 26, 50, 193, 225- 226, 232, 265, 268, 285 progressive 63, 85-86, 88-89, 110 psycholinguistic 7-12, 15-16, 34-35, 49-50, 73, 76-77, 113, 116, 127, 131, 134, 136, 142, 148-149, 175- 177, 180, 187, 191-192, 200, 220, 225-226, 228, 232-233, 234, 251- 252, 257-258, 261, 268, 275, 279, 281 quantifier 39, 93, 101, 102, 
105 quantization 34-37 quantized 35-37, 47, 91 object 36, 39, 41, 43 quartile 118 questionnaires 12, 29, 151, 153, 159, 160, 163, 166-167, 170-171 random forest 123, 133 randomized order 26, 29 reaction time 11, 16-17, 63, 67, 73-74, 77, 79, 117, 134-135, 137, 140, 149, 154-155, 198, 228, 236, 263-264, 280, 284 reading 11, 13, 29, 38, 44, 49, 52, 54-56, 62-70, 72, 74-78, 83, 88, 90, 92, 95, 100, 102-105, 108, 181, 232, 238, 263 effects 104 here-and-now 28 recognition of Russian reduced word forms 12, 181 reference 22, 26, 50-51, 91, 106, 109, 114, 158, 178, 297 absolute frame of 283-284 referential shift 288-289 regression 52, 55-56, 58, 60, 62, 64, 66-67, 71, 73, 75, 78, 108, 124, 133 relatedness of the languages 275 rival forms 79, 113, 129, 132 Romance 137-138, 142, 148 Russian 8, 11-13, 21-23, 25-26, 34-39, 41-49, 52-54, 60, 63-66, 76, 78, 83-108, 113-118, 121, 126-127, 129-131, 154-159, 168, 175-179, 182, 185-187, 191-211, 213-218, 220-221, 225, 227, 230, 232-247, 250, 252-253, 257-263, 266, 268-270, 272-275 aspectual system 103, 114 grammar 12, 47, 151-153 homophones 184 National Corpus (RNC) 55, 66, 106, 114, 116-119, 122-123, 126, 128-129, 177, 237 New Frequency Dictionary 198 Salish 48 Sámi 47, 48 semantic 7, 11-12, 14, 34, 43, 48-49, 54, 56-58, 62, 76, 78, 89, 114, 116, 119, 121, 129, 132-133, 135, 154, 173, 175, 181, 186, 199, 230, 237-239, 244, 246, 252-253, 276, 278 interference 137 interpretation 96 mapping 12, 225, 252-253 semantic-syntactic unit 178, 186 semelfactives 127-130, 132 Sign Language (SL) 279-291, 282, 284, 287-288, 296-297 signing space 279, 281, 283-284, 287-289, 296 Slavic 5-11, 23, 34-38, 50, 76, 83, 85-86, 91-93, 108, 114, 131, 134, 142, 148, 153, 176, 187, 192-193, 225, 232, 234, 252, 275, 290 Spanish 25-26, 80, 82, 142, 195, 213, 227, 231, 254-255, 263, 276, 288 spatial 13, 31, 116, 279-295 cognition 279, 280, 281, 287-288, 296-298 speaker 22-23, 26, 34-35, 44, 48-49, 53, 62, 72, 88, 114, 153, 182, 194, 216,
218, 226-227, 230, 257, 262 heritage 194, 225-226, 228-229, 233-235, 244, 247, 250, 252, 254 speech 12, 19, 20, 44, 57, 116, 142, 144, 148, 152, 158-159, 171, 175-178, 180-182, 184-187, 223, 226, 235, 259-261, 263 onset times 19-20 rate 226 speech recognition 180-181, 185-186, 188, 223 speech signals 185 ambiguous 185 spoken word recognition 175-178, 180-182, 184, 187-189, 194, 221 stimulus 15, 19-21, 24-25, 27-28, 30, 63, 71, 83, 95, 128, 135-137, 156, 183, 186, 197, 202, 206, 209, 220-221, 231, 237, 289-291, 293 asemantic 181, 184 material 15, 19, 21, 24-25, 28, 30, 83 set 19, 24-25, 197, 206 summarizing function 93-94, 102 TAM (Tense Aspect Mood) 11, 52, 58, 62-66, 68, 70, 73-78 task 9, 12-13, 16-18, 20-21, 23, 25, 27-30, 35, 50, 52, 54, 56, 63, 66-67, 71, 74, 77-78, 87, 97, 108, 115, 117, 134-135, 137, 159, 162, 180-181, 184, 186, 189, 206, 208, 227-230, 234-238, 240-242, 245, 249-250, 252-253, 269-270, 273, 280-281, 285-287, 293-296 grammatical judgment 19-20 naming 139, 202-203, 206, 228, 253 picture-word matching 202, 206-207 preference judgment 10, 15, 20, 28 subjective frequency rating 202 translation 225, 229, 236, 252-253 telicity 35-39, 42, 85, 91, 110 temporal 26, 32, 51, 86, 89, 92, 106, 264 definiteness 92 tense 11, 19, 21, 24-25, 28, 30, 33, 52, 56-57, 59, 63, 67, 81, 86, 88, 91, 98, 102, 110, 130, 142, 144, 184 test 11-12, 15, 20, 28, 30, 39, 42, 45-46, 50, 76, 81, 84, 96-97, 100, 102-107, 114, 121-122, 130, 136, 140, 151-152, 156-159, 162, 165-168, 171, 180, 192, 195, 197, 200-204, 206-208, 211-212, 220-221, 228-229, 231-232, 234-237, 239, 247, 249, 253, 264, 268, 270, 294, 296 cloze-test 113, 117 the role of the context in spoken word recognition 185 thermometer judgment 12, 152-153, 156-159 time course 18, 95, 137, 150, 221 topic 7, 9, 13, 84, 99, 101, 111, 225, 244, 260, 280 traditional offline tasks 180 transcription 19, 175-178, 183, 185 acoustic-phonetic 177-178, 187 transfer 57, 83-84, 87, 101, 109, 276 transforming 96, 107, 287
translation 11-12, 87-88, 90-91, 94-95, 103, 106, 108, 196, 198, 201, 225, 229, 236, 241-242, 247, 252-253, 279 task 225, 229, 236, 252-253 TRY verbs 54-55, 58, 62, 64, 68-71, 75-76 unmarked 37-38, 85, 90, 235, 274 validity 84, 103, 108, 156, 196, 231 ecological 30, 77 variables 22, 35, 38, 50, 56, 58-59, 65, 67-68, 70, 73, 108, 122, 220, 232-233, 290 relative importance of 124 variation 11-12, 43, 46-50, 66, 73, 98, 113-114, 116, 119, 121, 127, 129-131, 153, 160, 226, 232, 247, 253, 260-261, 296 verbal 8-9, 11-12, 36-37, 83-87, 90-92, 94-98, 100-103, 105, 107-114, 116, 120-121, 126, 130, 142, 148, 225, 230, 252-253, 290 fluency 12, 225, 230, 252-253 morphology 114, 130 visual word paradigm 180 vowel 177, 196, 213-217 well-formedness 157 word 8, 12, 54, 63, 66-67, 69-70, 72, 76-77, 99-102, 130, 134-137, 144, 152-153, 179-180, 182, 184-187, 199-200, 202, 205-208, 213, 216, 227, 237-239, 253, 264-266, 268-270 abstract 273, 278 association 230 concrete 274 form frequency 185 frequency 19, 82, 191, 196-197, 200, 220, 229-230, 237 memory 229 non-word 134 order 23, 34, 38, 42-46, 49-50, 52, 66, 83, 96, 98, 99-101, 108-109, 186 recognition 81, 175, 177, 180-181, 187, 196, 222, 229, 235, 276 word by 96, 102 working memory 231, 295 z-transformation 157

Notes on Contributors

Tanja Anstatt (tanja.anstatt@rub.de) is a full professor of Slavic Linguistics at Ruhr-University in Bochum, Germany. She obtained her PhD in Slavic Linguistics from the University of Hamburg in 1995, having undertaken a study of the expressions used for time in Slavic languages. Her research interests include various grammatical and lexical features of Slavic languages, mainly Russian and Polish, as well as multilingualism and multilingual language acquisition. A special focus of her work is on Slavic languages in Germany.
Her research subjects in this area include the acquisition of grammar, especially the verbal aspect, by bilingual children, language maintenance and loss in bilingual children and adolescents, and various questions on the multilingual lexicon.

Antti Arppe (arppe@ualberta.ca) is an assistant professor of Quantitative Linguistics at the University of Alberta and founder of the Alberta Language Technology Lab (ALTLab). His research applies and develops statistical and computational methods, as well as corpora and language technology, in modeling linguistic phenomena with an aim for cognitive plausibility and in contrasting evidence representing different modalities of language. Prior to his academic career, he worked in senior managerial positions for Lingsoft, a Finnish language technology company, responsible for the development and licensing of proofing tools for the majority Nordic languages. Currently, he is also engaged in the development of language technological tools and applications for supporting the revitalization and retention of Indigenous languages in North America, e.g. Plains Cree.

Harald Baayen (harald.baayen@uni-tuebingen.de) completed his PhD thesis with Geert Booij and Richard Gill in 1989 at the Free University, Amsterdam. In 1990, he joined the MPI for Psycholinguistics, Nijmegen. He became associate professor at the University of Nijmegen in 1998, leading a research group funded by the Dutch Research Council. In 2007, he accepted a full professorship in quantitative linguistics at the University of Alberta, Edmonton, Canada. In 2011, the Alexander von Humboldt Foundation awarded him a Humboldt chair at the University of Tuebingen, Germany, which enabled him to set up a large research group addressing the consequences of discrimination learning for language processing, and to explore in depth the potential of generalized additive mixed models for the understanding of experimental and corpus data.
In collaboration with many others, he has worked on Dutch, German, English, Italian, Russian, Hebrew, Chinese, Vietnamese, Japanese, Finnish, and Estonian.

Eva Belke (belke@linguistics.rub.de) is professor of Psycholinguistics at the Department of Linguistics at Ruhr-Universität Bochum. She graduated in Clinical Linguistics/Speech and Language Pathology from the University of Bielefeld (Germany), where she also obtained a PhD in Psycholinguistics. Before coming to Bochum, she spent three years as a postdoc and a lecturer in Birmingham (UK) and three years at the University of Bielefeld, where she completed her habilitation. Her work focuses on two lines of research. In the first, she works on lexical retrieval processes across the lifespan and in healthy and impaired speakers. In the second, she investigates, from a psycholinguistic perspective, language learning and methods of language teaching for children at pre-school and primary school age, particularly those who acquire German as a second language.

Denisa Bordag (denisav@uni-leipzig.de) is an associate professor at the Herder-Institute (German as a foreign language) at the University of Leipzig, Germany. She obtained her Mgr. in Czech and English linguistics and literature at the Charles University in Prague, Czech Republic. She completed her PhD and habilitation at the Philological Faculty at the University of Leipzig in the area of psycholinguistics. She is one of the founders of Czech psycholinguistics. The focus of her research has been the morphological representation and processing of Czech on the one hand, and second language acquisition on the other. In her recent research, she has explored incidental vocabulary acquisition during reading in both native and nonnative languages.

Bernhard Brehmer (brehmerb@uni-greifswald.de) is a full professor of Slavic Linguistics at the Institute of Slavic Studies of the University of Greifswald.
He received his PhD from the University of Tübingen in 2004 with a study on the expression of gratitude in Russian. He was a principal investigator in several national and international research projects on Polish-German and Russian-German bilingualism at the University of Hamburg and the University of Greifswald. Data collection for all of these projects involved psycholinguistic experimental methods. His main research interests include Slavic heritage languages in Germany, multilingual language acquisition, language contact, language variation, script linguistics, and pragmatics.

Christina Clasmeier (christina.clasmeier@rub.de) is an academic research assistant to the chair of Slavic linguistics at Ruhr-University in Bochum, Germany. Psycholinguistic research questions and methods run through her academic work, starting from her M.A. thesis on the verbal aspect in Russian-German bilinguals (University of Hamburg) and going on to her dissertation on the mental representation of Russian aspectual relations in monolinguals. She obtained her doctoral degree from Ruhr-University. Together with Tanja Anstatt, she has examined the subjective frequency of Russian verbs and its interaction with aspectual features. In her ongoing research, a joint project involving Slavic linguists and psycholinguists in Bochum, she investigates coactivation processes in the mental lexicon of Russian-German bilinguals (see the contribution in this volume).

Elena Dieser (elena.dieser@uni-wuerzburg.de) is an academic council member at the Department of Slavistics of the Institute of New Philology at the University of Wuerzburg, Germany. She completed her doctorate in 2008 with a dissertation on the acquisition of grammatical gender in Russian and German. After completing her doctorate, Elena Dieser worked as an academic research assistant at the chair of the Slavic Linguistics Department under the direction of Tanja Anstatt at Ruhr-University in Bochum, Germany.
Elena Dieser's fields of research include multilingualism (the acquisition of grammatical gender in German and Russian, code-mixing and linguistic separation, as well as lexical and grammatical development in multilingual children) along with the periphery of Russian grammar. Since 2015, she has been working on a habilitation project on the subject of “Grammatical variation in Russian”.

Dagmar Divjak (d.divjak@sheffield.ac.uk) obtained her PhD in Russian Linguistics from the KULeuven (Belgium) in 2004. She spent a year with Laura Janda at the UNC at Chapel Hill (USA) as a Francqui Foundation Fellow (BAEF) and a year with Oesten Dahl at the University of Stockholm (Sweden), while she held a Postdoctoral Fellowship of the Research Foundation (Flanders) associated with the Quantitative & Variational Linguistics research group of Dirk Geeraerts (KULeuven, Belgium). She joined the Department of Russian and Slavonic Studies at the University of Sheffield (UK) in September 2006 as a Lecturer in Slavonic Languages & Linguistics; currently she is a Reader there and directs the Centre for Linguistics Research. Dagmar is interested in usage-based linguistic theories of language and uses a combination of corpus and experimental methods to understand how distributional linguistic patterns are detected, extracted, processed and represented.

Jessica Ernst (jessica.ernst@rub.de) works as a research fellow in the Department of Linguistics at Ruhr-University in Bochum. She is interested in age-of-acquisition effects in language production, language processing in mono- and bilinguals, and the processes underlying written word production in children and adults. For her PhD project, she investigates orthography acquisition and written word production in children with German as a first and a second language.

Anja Gattnar (anja.gattnar@uni-tuebingen.de) received her PhD in Slavistics from the University of Tübingen in 2010.
Since 2009 she has been a research assistant in the Slavic project ‘Verbal aspect in context: Contextual Dynamization vs. Grammar’ at the SFB 833 ‘The Construction of Meaning - the Dynamics and Adaptivity of Linguistic Structures’ at the University of Tübingen. She is currently covering a parental leave as a Lecturer at the Institute of Slavic Languages and Literatures (Prof. Tilman Berger) at the University of Tübingen. Her research interest is in Slavic verbal aspect, more precisely the processing of aspect. In her research she uses psycholinguistic experimental methods such as self-paced reading and eye-tracking tasks.

Gerd Hentschel (gerd.hentschel@uni-oldenburg.de) holds a chair of Slavic linguistics at the Carl von Ossietzky University in Oldenburg, Germany. Throughout his academic career, he has been engaged in exploring phenomena of language contact, German-Slavic and Slavic-Slavic, from a synchronic and a diachronic point of view. Regardless of contact, another stable characteristic of his research has been variation, be it phonic, morphological or (morpho-)syntactic. Recent projects he has conducted with the assistance of Jan Patrick Zeller are concerned with contact-induced variation in Belarus and Ukraine, with Belarusian-Russian and Ukrainian-Russian mixed speech (Trasyanka and Suržyk, respectively). His approach could casually be described as “variationist socio-contact-corpus-linguistic”, applying quantitative and qualitative methods of research.

Jakub Jehlička (jakub.jehlicka@ff.cuni.cz) is a PhD student of general linguistics at Charles University in Prague. In his MA thesis, he focused on the effects of the use of Czech Sign Language on the spatial cognition of deaf and hearing signers and hearing non-signers. Currently, he is working on his doctoral dissertation “Word order, information structure, and gesture: a cross-linguistic study” on a one-year scholarship at the University of Sheffield.
The aim of the thesis is to compare how gesture is used as a marker of sentence focus in German and Czech and to explore focus constructions from the perspective of multimodal construction grammar.

Tatjana Kurbangulova (tatjana.kurbangulova@uni-greifswald.de) is a research assistant in the project “Russian and Polish heritage languages as a resource in school”, which is currently being conducted at the Institute of Slavic Studies of the University of Greifswald (Germany). She studied Slavic Linguistics and Pre- and Early History at the University of Hamburg. Her current PhD project examines the quality and quantity of parental language input and their role in language maintenance among Russian heritage speakers in Germany. Her main fields of scientific interest include language contact, bilingual language acquisition, and first language attrition.

Anastasia Makarova (anastasia.makarova@uit.no) received her education in General linguistics and Russian as a foreign language from the State University of St Petersburg, Russia. She also holds MA and PhD degrees in Russian cognitive linguistics from the University of Tromsø - The Arctic University of Norway. Her doctoral dissertation, entitled “Rethinking diminutives: a case study of Russian verbs”, proposed a unified account of diminutives in terms of a radial category. Since 2011 Anastasia has been an active member of the CLEAR (Cognitive Linguistics: Empirical Approaches to Russian) research group at the University of Tromsø. Her research interests range from Russian aspect, aktionsart and deictic words to language change and error analysis in spontaneous speech. In her research, she has used different types of data and applied a variety of methods. She has conducted comparative as well as diachronic studies and makes extensive use of quantitative methods and the statistical software R.
Barbara Mertins (barbara.mertins@tu-dortmund.de) is a full professor of German linguistics, with a focus on experimental linguistics and psycholinguistics, at the Technical University Dortmund. In her research she uses different psycholinguistic methods to investigate the effect of language (especially grammar) on cognition in language production of monolingual and bilingual speakers (adults and children). She is interested in conceptual learning and restructuring, the relation between memory, visual attention and language, language pathology, and the structure of the bilingual and multilingual mental lexicon.

Yulia Nigmatulina (julia.nigmatic@yandex.ru) is a postgraduate student in the Department of Phonetics. She received her M.A. degree in experimental linguistics and psycholinguistics, and her thesis was entitled “Sound contractions in Russian spontaneous and prepared speech” (2013). She is currently pursuing her research interests in examining word-external reduction in spontaneous Russian in order to understand how these reductions can be processed in speech recognition. She has been an assistant professor at the Faculty of Art of St. Petersburg State University of Aerospace Instrumentation since September 2014.

Olga Raeva (olgaspace@rambler.ru) was an engineer in the Maintenance Department for Academic Programs in Asian and African Studies, Arts, and Philology in 2012-2015. The subject of her M.A. thesis (2012) was the recognition of word forms of high frequency. In her current research, she is investigating the interaction of lexical and grammatical information in the process of spoken word recognition, together with Elena Riechakajnen.

Elena Riechakajnen (e.riehakajnen@spbu.ru) is a senior lecturer in the Department of General Linguistics, and psycholinguistics is one of her main research interests.
She has been studying the interaction of context and frequency in the process of spontaneous speech recognition for more than 10 years. The focus of her ongoing research is the structure of the mental lexicon of a Russian speaker, and the role of the lexical and grammatical context in spoken word processing.

Esther Ruigendijk (esther.ruigendijk@uni-oldenburg.de) has been a professor in the Department of Dutch Studies at the Carl von Ossietzky University in Oldenburg, Germany since 2005. She received her PhD in 2002 at the University of Groningen (the Netherlands). Her research interests include first and second language acquisition, language impairment and language processing. In these areas she concentrates on syntactic and lexical phenomena. Together with Gerd Hentschel and Jan Patrick Zeller she examines code switching from a psycholinguistic perspective. These studies involve code switching in second language learners (e.g. German-Russian switching in Russian learners of German) and also in more complicated language contact situations where mixed speech is a common phenomenon (e.g. Russian-Belarusian, see contribution in this volume).

Roumyana Slabakova (R.Slabakova@soton.ac.uk) is a professor and chair of Applied Linguistics at the University of Southampton, UK, and a professor emeritus at the University of Iowa, where she taught previously. Her research interest is in the second language acquisition of meaning, more specifically phrasal-semantic, discourse, and pragmatic meanings. Her monographs include Telicity in the Second Language (Benjamins 2001) and Meaning in the Second Language (Mouton de Gruyter 2008). She co-edits the journal Second Language Research (Sage). Her textbook entitled Second Language Acquisition will be published by Oxford University Press in March 2016.

Natalija Slepokurova (nataliars@inbox.ru) is an assistant professor in the Department of General Linguistics.
Her research interests include the phonetics of Russian, syntax, spontaneous speech processing, and general linguistics. For many years, Anatolij Vencov and Natalija Slepokurova have worked at the Pavlov Institute of Physiology, in the research group headed by the famous Russian speech physiologists Ljudmila Čistovič and Valerij Koževnikov, whose scientific approach A. Vencov and N. Slepokurova have developed in their current study of spontaneous speech processing.

Anatolij Vencov (av.ventsov@gmail.com) is a programmer in the Maintenance Department for Academic Programs in Asian and African Studies, Arts, and Philology. He is one of the best-known and most competent researchers of speech perception and spoken word recognition in Russia, and the author of more than 100 papers on psychoacoustics and perceptual linguistics (including a book on the problems of speech perception co-authored with Vadim Kasevič).

Martin Winski (marcin.winski@uni-greifswald.de) is a research assistant in the project “Russian and Polish heritage languages as a resource in school”, which is currently being conducted at the Institute of Slavic Studies of the University of Greifswald (Germany). Martin Winski studied Slavic and General Linguistics in Hamburg and Belgrade. His M.A. thesis deals with the expression of temporal boundaries in Polish verbs. For his dissertation, he is currently investigating Polish verbal aspect in Polish-German bilingual children. Further areas of his scientific interest include semantics, typology, and the theory of proper names.

Jan Patrick Zeller (j.p.zeller@uni-oldenburg.de) is an academic research assistant at the chair of Slavic linguistics at the Carl von Ossietzky University in Oldenburg, Germany. His research interests lie in the fields of language contact, sociolinguistics and language variation.
Most of the constellations he has been investigating involve the contact between closely related languages, such as Belarusian and Russian, Ukrainian and Russian and different contact situations of Polish. These he examines within a wide range of research questions applying different methods, drawing on corpus linguistics, acoustical phonetics, and psycholinguistics. He obtained his doctoral degree from the University of Oldenburg with a work on phonic variation in Belarusian-Russian mixed speech. Together with Esther Ruigendijk and Gerd Hentschel, he has investigated psycholinguistic activities in processing code switching in closely related and in less closely related languages.