Data retrieval

Most of the effort in the BSI project so far has been concentrated on recording as much data as possible. Given that data entry has proved an extremely labour-intensive process[*], the major thrust of the work was to achieve a critical mass of recorded data before setting about the obvious next phase, data retrieval[*].

Current limits

At the moment, there is no provision whatever to aid access to the data. (This is in accordance with the fact that the BSI archives are not yet open to the general public.) In the revised system, all of the informants' responses reside in the ANSWERS file, a standard FoxPro database file (see Table 4.2). It basically serves to tell which variant a given informant used in response to which item (who said what in response to what) - all enshrined in numbers. In order to find out what the numbers stand for, both as regards the variant and the item, one needs to link the ANSWERS table with the OPTIONS table to access the particular variant, as well as with the PROMPTS table to retrieve the prompt, i.e. the text of the card. The linking is done through the VL (`response') and the ITEM fields in the way shown in Figure 4.2.
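To illustrate the linking scheme, here is a minimal sketch in Python. The VL and ITEM field names are those described above; the layout of the stand-in tables (in particular, keying OPTIONS by the item and response codes together) is a hypothetical simplification of the actual FoxPro files.

    # Stand-ins for the three FoxPro tables; everything except the
    # VL and ITEM field names is a hypothetical simplification.
    PROMPTS = {101: "text of card 101"}                       # ITEM -> prompt
    OPTIONS = {(101, 1): "variant A", (101, 2): "variant B"}  # (ITEM, VL) -> variant
    ANSWERS = [{"informant": 7, "ITEM": 101, "VL": 2}]        # atomic observations

    def decode(record):
        """Resolve the numeric codes of an ANSWERS record into text."""
        prompt = PROMPTS[record["ITEM"]]
        variant = OPTIONS[(record["ITEM"], record["VL"])]
        return record["informant"], prompt, variant

    for rec in ANSWERS:
        informant, prompt, variant = decode(rec)
        print(f"informant {informant}: {prompt!r} -> {variant!r}")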

The lack of user-friendly tools to access the data is merely a temporary constraint and implies no inherent shortcoming in the present arrangement of the data. A genuine limitation of the current data model, however, is that at the moment the database comprises a collection of atomic observations of each informant's language use. In its present state it fails to reflect the fact that (1) the same test word occurs in many different tasks throughout the interview and (2) the same linguistic phenomenon (e.g. palatal assimilation, the -ik conjugation, foreign words) is investigated in a number of different words. In short, the isolated atomic facts need to be aggregated along (socio)linguistically significant generalisations, as sketched below. This task will be taken up briefly in 6.1.
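As an illustration of the kind of aggregation meant here, the following sketch groups the atomic observations by phenomenon. The mapping from item to phenomenon (PHENOMENON_OF) is a hypothetical placeholder; in the real database this information would have to be recorded alongside the prompts.

    from collections import defaultdict

    # Hypothetical mapping from test item to the phenomenon it probes.
    PHENOMENON_OF = {101: "palatal assimilation", 102: "-ik conjugation"}

    def aggregate(answers):
        """Count, per phenomenon, how often each response variant was used."""
        counts = defaultdict(lambda: defaultdict(int))
        for rec in answers:
            phenomenon = PHENOMENON_OF.get(rec["ITEM"], "other")
            counts[phenomenon][rec["VL"]] += 1
        return counts

    answers = [{"informant": 7, "ITEM": 101, "VL": 2},
               {"informant": 7, "ITEM": 102, "VL": 1}]
    for phenomenon, by_variant in aggregate(answers).items():
        print(phenomenon, dict(by_variant))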

A common multimedia interface

The first large-scale publication of BSI data was launched in late 1997: the full data plus the digitized sound of a complete BSI version 2 interview were published on a CD-ROM. This required the integration of the numerically coded data with the transcript in a unified system on the one hand, and the linking of the digitized sound files with the coded data on the other. As a result, it is now possible to access any part of the data very quickly and, by simply clicking on an item, to listen to it as well. Ten years ago, when the BSI project was in its initial phase, all this sounded like a utopian dream; it is no longer beyond the means of current software and hardware technology. The following sections give an overview of how this was achieved.

Digitizing the tape recordings

Limitations of the tape recordings

The use of tape recording was a revolutionary step in empirical linguistic analysis, and it has clearly established itself as a basic research tool (challenged perhaps only by the video recorder, where its use was appropriate and feasible). However, both media have certain limitations in terms of handling:

1. The master tape cannot be reproduced without loss of quality;
2. its quality is subject to deterioration even if kept in very stringent storage conditions[*];
3. positioning the tape to a precise location is a bit cumbersome[*].

Making a digital recording

All these limitations can be overcome with the use of digital technology: digital recordings do not decay over time, exact duplicates can be produced even in chain copying, and data can be looked up instantaneously. Ordinary (i.e. non-digital) tape recordings capture sound as an analogue electric signal, basically variations in voltage level. Digital recordings store sound as a stream of numbers which record the strength of the incoming signal measured at set intervals. In order to enjoy the benefits of digital technology, the original tape recordings had to be re-recorded digitally, a process referred to as digitization. This process no longer poses any challenge as regards either hardware or software: it only calls for a PC equipped with a sound card and a piece of sound-editing software. True, it requires a lot of storage space, but hard disk space and CD-ROM disks have become relatively inexpensive.
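The storage cost is easy to estimate: uncompressed digital sound grows linearly with the sampling frequency, the sample size and the number of channels. The figures in the sketch below (22.05 kHz, 16-bit, mono) are merely typical values of the period, not necessarily those used for the BSI recordings.

    def storage_bytes(seconds, sample_rate=22050, bytes_per_sample=2, channels=1):
        """Uncompressed size: samples per second x bytes per sample x channels."""
        return seconds * sample_rate * bytes_per_sample * channels

    # One hour of mono 16-bit sound at 22.05 kHz comes to roughly 151 MB:
    print(f"{storage_bytes(3600) / (1024 * 1024):.0f} MB per hour")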

The digital recording was made by playing the original tape recording into the input channel of the sound card, which measured the signal stream at the required sampling frequency[*] and produced the stream of numbers that was stored in a file. The file could then be played back, again with the help of the sound card, which converts the numbers into the analogue signals required by the speakers or earphones attached to the computer.
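Such a file can be handled with entirely standard tools; for instance, its basic parameters can be read with Python's wave module (the file name below is hypothetical):

    import wave

    # Open a digitized recording and report its basic parameters.
    with wave.open("item_101.wav", "rb") as snd:  # hypothetical file name
        rate = snd.getframerate()                 # samples per second
        frames = snd.getnframes()                 # total number of samples
        print(f"{frames / rate:.1f} seconds at {rate} Hz")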

Linking the data with the sound files

One major design issue that had to be faced was how to break up the original tape recording into separate digital sound files. The answer has a crucial bearing on the way the data is accessed. The important factor to consider is the way the digital sound is linked in with the data files. At the moment, it is unrealistic to expect a system where one could select any arbitrary stretch of the transcript and have the computer play back the corresponding bit of sound[*]. Instead, what is accessed with a click of the mouse button is a whole sound file. This means that every part of the interview which one would like to jump to needs to be put in a separate file. It is also possible to navigate within a sound file once it is retrieved, but that operation is serial, i.e. it requires winding and rewinding the data, in a virtual sense. In short, access is random across files, but serial within them.

Therefore, the digitized sound was broken up into files in the following way. With the exception of the judgement data and the reading of the passages, each item of the card-based part was recorded in a separate file. Note that making these items accessible as individual units of the data, in full acoustic and textual detail (as against a mere number, possibly with a transcriber's comment), represents a significant advance over the way the same data are treated in the BSI files.

Full-scale access to the sound of this part of the interview opens up the possibility of carrying out future research on aspects of the data that were not monitored, and therefore not preserved in any form, under the original BSI protocol.

The reading passages were not broken up into the isolated bits of data that were itemized in the database. Technologically, there was no problem in identifying and dissecting even individual sounds in the data stream. However, after a few tests it was obvious that an isolated recording of a word is insufficient for making judgements about the phonological and prosodic characteristics of the data. (Length, for example, is relative to overall speech tempo, to the length of other similar units, etc.; one needs to listen to a longer stretch to make reliable judgements about individual units.) Therefore, each reading passage was put into a separate file. Of course, the slow and fast readings of the same text were treated as different data.

As regards the guided conversations, each conversation module presented an obvious natural unit for a sound file. The only problem was with the occasional long module. Long modules generated huge files[*] which took a long time to load, and in a long file it also takes longer to find things through navigation (as against instant lookup). Therefore, it was decided that modules longer than two minutes would be broken up into a series of one-minute stretches.
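A minimal sketch of this splitting policy, using Python's standard wave module; the two-minute threshold and the one-minute stretches come from the text above, while the file names are hypothetical:

    import wave

    CHUNK_SECONDS = 60       # one-minute stretches
    THRESHOLD_SECONDS = 120  # only modules longer than two minutes are split

    def split_module(path):
        """Break a long conversation module into one-minute sound files."""
        with wave.open(path, "rb") as src:
            params = src.getparams()
            chunk = CHUNK_SECONDS * params.framerate
            if src.getnframes() <= THRESHOLD_SECONDS * params.framerate:
                return  # short modules stay in a single file
            for part in range(0, src.getnframes(), chunk):
                with wave.open(f"{path[:-4]}_{part // chunk:02d}.wav", "wb") as dst:
                    dst.setparams(params)
                    dst.writeframes(src.readframes(chunk))

    split_module("module_housing.wav")  # hypothetical module file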

Converting the data files

Processing the database files

As described in Tables 4.1 and 4.2, the data derived from the card-based tasks of the interview were coded numerically. The data tables contain numbers which are impossible to interpret without linking them to the corresponding records in the PROMPTS and OPTIONS tables in the way shown in Figure 4.2. The three tables can be linked and browsed with the help of a suitable database management program, but this is not what is needed here. What we need is a plain-text form of the answers given by a particular informant: all the records of the ANSWERS table that come from the given informant, in a form where the text of the corresponding record from the PROMPTS table is supplied, together with the textual form of the answer as looked up in the OPTIONS table.
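Reusing the decode() sketch given earlier, such a plain-text listing could be produced along the following lines (the output format is illustrative only, not the one used on the CD-ROM):

    def export_informant(answers, informant_id, out_path):
        """Write a plain-text listing of one informant's answers."""
        with open(out_path, "w", encoding="utf-8") as out:
            for rec in answers:
                if rec["informant"] != informant_id:
                    continue
                _, prompt, variant = decode(rec)
                out.write(f"{prompt}\n    answer: {variant}\n\n")

    export_informant(ANSWERS, 7, "informant_07.txt")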

Handling the reading passages

The reading passages required a slightly different treatment. First of all, to bring the database records more in line with the sound files, where the whole passage is heard continuously, the original passage is first shown in the same form as it was handed to the informant[*]. This version is followed by versions of the passage in which the standard forms are replaced by the variant forms actually used by the informant. However, only those items that were coded in the database were treated in this way; the rest of each passage appears in standard orthographic form regardless of how it was uttered by the informant.
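A sketch of the substitution step; the data layout, with each coded item pairing the standard form with the informant's variant, is a hypothetical simplification:

    # Hypothetical pairs of (standard form, variant actually used).
    coded_items = [("standard_form", "variant_form")]
    passage = "... standard_form ..."

    def informant_version(text, items):
        """Replace only the coded items; the rest stays in standard orthography."""
        for standard, variant in items:
            text = text.replace(standard, variant, 1)
        return text

    print(informant_version(passage, coded_items))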

Adjusting the transcript

The transcript of the conversation modules required only slight modifications. The original format of the transcript was designed so that each line could be identified when a concordance of the data was prepared. For the present purposes, however, the invariant part of the left margin, containing the ID of the informant and of the particular module, could be dispensed with entirely: it is either superfluous in the present context, as we are dealing with only one informant at a time, or its role is filled by other navigational devices. The tape counter settings on the right margin of the transcript could also be omitted as irrelevant.
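A sketch of this clean-up; the exact line layout assumed here (an ID field on the left margin, a tape counter on the right) is an illustrative guess:

    import re

    # Hypothetical line format: "B123/m04:  ...utterance...  0451"
    MARGINS = re.compile(r"^\S+:\s*(.*?)\s*\d*$")

    def strip_margins(line):
        """Drop the ID on the left and the tape counter on the right."""
        match = MARGINS.match(line)
        return match.group(1) if match else line

    print(strip_margins("B123/m04:  well, yes, that was how it went  0451"))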

Weaving everything together

Having discussed the individual components of the system, it is time to give an overview of how they were integrated into a common user interface. Here again, we are fortunate: thanks to the enormous development of software technology since the start of the BSI project - which dates back to the age when home computers like the Commodore 64 ruled the day and the IBM PC XT was just on the horizon - we now have readily available technology to handle text, sound and pictures together in a simple and intuitive manner. This is provided by hypertext technology, which has by now become an indispensable part of computer literacy, embodied as it is in the help system of any windows-based operating system and, more conspicuously, in Internet browsers like Netscape or Internet Explorer.

The basic insight of hypertext is as simple as it is ingenious. Ordinary text is basically a stream of characters, a flat, linear structure which is processed in a serial manner. A hypertext adds another dimension to it: it contains embedded links pointing to different parts of the same text or to other texts residing in different files (possibly on computers located on different continents, even), allowing the user to follow different threads in reading the text. It also makes it possible to hide details, notes etc. that would only clutter up the main plane of the text, and it is just as powerful in linking in related information as well.

This mechanism alone would have proved a blessing for our purposes, dealing as we are with such a richly structured set of data as the BSI. In addition, however, the technology has recently been developed to handle sound in the same way, i.e. links embedded in the text can evoke the sound corresponding to it[*].
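In a browser-based setup, such a sound link need be nothing more elaborate than an ordinary anchor pointing at a sound file, which the browser hands over to its sound player. The snippet below, which generates one, uses hypothetical file and label names:

    def sound_link(sound_file, label):
        """Emit an HTML anchor that plays the given sound file when clicked."""
        return f'<a href="{sound_file}">{label}</a>'

    print(sound_link("module_housing_00.wav", "beginning of the housing module"))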

In order to turn the BSI data into a hypertext system, we had to decide how to structure the data into separate units of reference (not necessarily residing in separate files but each accessible individually) and what navigational system to use to link them together. The structuring of the sound data has been discussed above in 5.2.1. The contents of the data files were structured into a menu system three levels deep. Figure 5.1 displays the main menu, which provides access to the full data in chronological order (the Teljes anyag `full interview' option) as well as through the major task types (the Irányított társalgás `guided conversation', Kártyás feladatok `card-based tasks', Olvasási feladatok `reading tasks' and Ítéletalkotás `judgement tasks' options).


  
Figure 5.1: The main menu of the system

If one clicks on the guided conversation option, the left-hand menu window is filled with the list of conversation modules that occurred in the particular interview, and clicking further on any of them takes us to the beginning of that module in the transcript. This is displayed in Figure 5.2.


  
Figure 5.2: The menu system and format of the guided conversation modules

The following three figures each display the menu system and the format of various parts of the interview. Figure 5.3 shows how the data tables are displayed, Figure 5.4 shows the same for the judgement data, and Figure 5.5 shows the menu and a sample page containing different versions of a reading passage.


  
Figure 5.3: The menu system and format of the data tables


 


  
Figure 5.4: The menu system and format of the judgement data


  
Figure 5.5: The menu system and format of the reading passages


Tamás Váradi
12/26/1997