'Academic fieldwork data collections are often unique and unrepeatable records of highly significant events collected at considerable expense of researcher time, effort and resources. While fieldworkers have been quick to take advantage of digital technologies to enable them to collect and organise their data, standards and workflows are only now beginning to emerge to assist researchers to submit their data for archiving and access. This collection of refereed papers from the conference of the same name held at the University of Sydney in December 2006 provides a record of recent research practice by fieldworkers in linguistics, botany and anthropology, and by archive and repository managers.' (Publication summary)
'Since the late 1990s, the technical group at the Max-Planck-Institute for Psycholinguistics has worked on solutions for several of the questions addressed in this PARADISEC meeting, in particular, how to guarantee long-term availability of digital research data for future research. The support for the well-known DOBES (Documentation of Endangered Languages) programme has greatly inspired and advanced this work, and led to the ongoing development of a whole suite of tools for annotating, cataloguing and archiving multi-media data. At the core of the LAT tools is the IMDI metadata schema, now being integrated into a larger network of digital resources in the European CLARIN project. The multi-media annotator ELAN (with its web-based cousin ANNEX) is now well known, and not only among documentary linguists. Other tools such as the lexical database tool LEXUS, the related knowledge-space builder VICOS and others are not yet widely used. With further development and integration with other tools they also have the potential to be useful for representing non-time-related linguistic data. We aim to present an overview of the solutions, both achieved and in development, for creating and exploiting sustainable digital data, in particular in the area of documenting languages and cultures, and their interfaces with other related developments.' (Publication abstract)
'The work described in this paper aims to outline some of the design aspects for a collaborative tool for typological research. This tool is designed to allow for the collation, from multiple contributors, of linguistic examples and their analysis with regard to an open set of variation dimensions of both onomasiological and semasiological nature. The resulting knowledge base combines linguistically relevant categories of human conceptualisation (e.g. in-group categories, such as ethnic or family group) together with their linguistic coding (e.g. in gender affixes, verbal agreement), all based on actual linguistic examples from diverse natural languages as its underlying data-driven foundation. The system is based on Semantic Web technology and hence can be queried in a flexible way that allows for combining any variation dimensions within a query (e.g. it can answer questions such as which languages exhibit joint attention marking by way of verbal suffixing). We will focus on design aspects relating to sustainable data. How can sustainable data for such a project be delimited? Surely, this encompasses commonly accepted aspects such as standards conformity, longevity, and accessibility, which we will address in the paper. Additionally and in particular, however, we will argue that user orientation and involvement are critical factors. Following on from this, the tool is designed so that it (i) does not require linguistic users to be trained extensively in system usage, (ii) allows linguists to deploy their standard methods of data entry (e.g. interlinear glossing), and (iii) provides contributors with immediate integration of their own with previously entered data and access to the resulting analysis (i.e. querying) and research potential. The paper will be structured roughly as follows: We will describe the background and aims of the project, and contextualise it in relation to other similar projects.
We will then concentrate on how sustainability is addressed, discussing a number of different facets of sustainability. This includes data storage formats, user interface and workflow modelling, knowledge base design, and system features (in particular system output). We will also outline some problems that have arisen so far and close with an outlook on future development.' (Publication abstract)
'Sign languages, or iltyem-iltyem angkety, are in daily use in Arandic speaking communities of Central Australia. They are a form of communication used alongside other semiotic systems, including speech, gesture and drawing practices. Whereas sign languages used in deaf communities operate without any connection to speech, these 'alternate' handsign languages are used in various contexts by people who also use spoken language. They are culturally valued and highly endangered, yet there has been little or no systematic documentation of Arandic sign since Kendon (1988). In this paper we describe a pilot program to record Arandic sign languages, conducted by a community language team, funded by the Maintenance of Indigenous Languages and Records (MILR) program and by the Endangered Languages Documentation Program (ELDP), and auspiced by the Batchelor Institute (BIITE). Research into various aspects of multimodal communication brings with it many theoretical and practical challenges. New technologies and the ever-expanding potentials of data annotation systems create a plethora of choices and huge volumes of recorded material. Whereas the use of film in language documentation has recently become de rigueur, at least in some circles, it is often only as an adjunct to studies of spoken language. When the visual is foregrounded, as it is in sign and gesture research, additional layers of complexity are added that impact on all aspects of the documentation process. How, for example, do we balance the desire for naturalistic visual data with the need for visually 'clean' images? What lessons can linguists learn from ethnocinematographers (Dimmendaal 2010)? What kinds of resources will benefit the community and a range of users (scholarly, archival, educational etc), as well as satisfying community aspirations for medium and long-term engagement with their audio-visual language materials? 
How do we ensure that our methodologies are robust enough to allow comparisons between primary sign language corpora and alternate sign language ones?
We discuss these issues and various others encountered in our research, including our field methodologies, annotation of film data, community consultations and ethical considerations, and issues that have arisen in designing an interactive sign language website for use as a teaching/learning resource in Arandic schools. Although the creation and management of digital archives for primary sign languages have been documented before (see Johnston & Schembri 2006), 'alternate' sign languages have received little attention.' (Publication abstract)
'Australian literary studies have, in the past decade, been greatly assisted by AustLit: The Australian Literature Resource (www.austlit.edu.au), a multi-institutional collaboration between researchers, librarians and software designers from ten universities and the National Library of Australia. Under the leadership of The University of Queensland, this collaboration has produced a web-based research environment that supports a wide range of projects and publications across a diverse array of fields in Australian literary and narrative cultures while also becoming a key resource for teaching and general information. AustLit has consistently worked to integrate the research output of associated projects and is currently planning to expand its position in the community with a new open access and open contribution model. A major innovation in data management and maintenance, the AustLit Research Community structure supports the study of Australian literary and story-making cultures by providing a web-based environment where segments of these cultures can be explored and presented as distinct topics within a larger knowledge framework. Scholars are able to build datasets, annotate, analyse and present that data in a range of ways, and publish scholarly interpretations of their findings in the form of peer-reviewed articles. The incorporation of these research-rich datasets into AustLit contributes to an overarching goal of building a comprehensive database of information about Australian writers, writing and print culture more broadly. With a recent decision to move from the current access model as a subscription service, available to relatively few users, to an open access and open contributions model incorporating content produced by a network of volunteers, AustLit is now facing a significant new challenge.
The Aus-e-Lit Project has delivered innovative tools and services that will enable AustLit users to engage more directly with AustLit data and to contribute to a Research Commons with collaborative annotations and richly described collections of internet resources. This paper will report on the implications that these innovations bring to current and future research practices. It will consider the successes and challenges that AustLit faces with its aim to be the definitive virtual research environment and information resource for Australian literary, print, and narrative culture, not only for scholars in the field but for students of all levels and the general public.' (Publication abstract)