LE 60: Demonstrative Anaphors in Hindi Newspaper Reportage: a corpus-based study

Srija Sinha
University of British Columbia

This study uses the methodology of corpus linguistics to investigate discourse anaphora in Hindi, focussing on demonstrative elements. It also evaluates the potential effectiveness of an annotation scheme for Hindi that is based on the distinctive-feature paradigm and was developed in this study, drawing largely on Botley’s (2000) annotation scheme for English.

The study begins with a literature review and an examination of approaches to demonstrative anaphors, presenting the key facts and points of interest concerning demonstrative forms in Hindi and reviewing an existing model (Botley, 2000). Botley’s work provides the blueprint for the annotation scheme for Hindi demonstratives used in this dissertation. A subsequent pilot study refines and alters the scheme so that it describes Hindi demonstratives adequately, and shows how the scheme enables demonstrative use to be categorized with a high degree of precision, while also highlighting key issues and limitations of the study. The study then gives a detailed report of the construction and annotation of a 100,000-word Hindi corpus composed of newspaper articles.

The annotated corpus is amenable to quantitative methods. Statistical tests are carried out on it, and the results are discussed in detail with regard to the comparison of features. The statistical results on anaphoric use in Hindi are also compared with the counts for a comparable corpus of English reported in Botley (2000). The primary goal is to develop an annotation scheme for Hindi that is both useful and usable for further linguistic research; to this end, some key theoretical points are examined and enhanced by a statistical perspective. The superiority of corpus-based research is demonstrated, and the inextricable link between theory and data is highlighted.

ISBN 9783895860348. Linguistics Edition 60. 153pp. 2007.

