Adding value to data in translation process research: the TransComp Asset Management System[1]
Susanne Göpferich
Abstract
This article focuses on how data collected in translation process research
can be made re-usable and how results can be made verifiable. These are
major issues in translation process research since, in the past, data
collected in investigations into translation processes have in most cases
not been made available to the scientific community, with the consequence
that findings cannot be reproduced and verified and that the data cannot
be re-used in other investigations.
In an attempt to solve this problem, a case will be made for using
Asset Management Systems to make translation process data accessible on
the Internet. Asset Management Systems are electronic systems for storing,
archiving, annotating, analysing and displaying digital resources of any
type. The advantages and functionality of these systems will be described
both from the perspective of research and of translation pedagogy. As an
example of an Asset Management System, the one developed for
TransComp with its specific functionality will be presented. One notable
feature of this AMS is the option of linking transcripts to the
corresponding sections in the sound or video files, so that, for example, a
section of the video file can be replayed by clicking on the corresponding
section in the transcript.
[1] TransComp is a longitudinal study which investigates the development of translation
competence in twelve students of translation over a period of three years and
compares it with that of ten professional translators with at least ten years of
professional experience in translation/interpretation. It is funded by the Austrian
Science Fund (FWF) as project No. P20908-G03 (September 2008–August 2011). For
a detailed project description, see Göpferich (2009).
1. The problem: data documentation and availability
The data collected in studies on translation processes are usually diverse
and voluminous. Apart from the source texts and the translation
assignments, which are of interest to researchers who wish to use them
with other subjects in comparative studies, they may include the target
texts produced and the evaluators’ comments on them, notes taken by the
subjects during the translation process, questionnaires filled in by them
after the translation, project files (e.g. those used in the key-logging
software Translog, Jakobsen 1999), log files, screen recordings, webcam
recordings, video recordings, eye-tracking data, sound files with verbal
data, and transcripts of what the subjects uttered and did during the
translation process.
Owing to restrictions of length, most publications do not include
these data, especially the complete transcripts of the translation processes
on which they are based. As a consequence, the results cannot be
reproduced and it is impossible to verify whether the categories used in the
analyses have been operationalized in such a way that objective or at least
inter-subjective results are obtained. This is particularly unfortunate in the
case of larger-scale studies such as Krings (1986, 2001), Jääskeläinen
(1999), Englund Dimitrova (2005), and Hansen (2006), all of which have
involved considerable effort in data collection and transcription. Another
disadvantage of not making the data accessible to the scientific community
is that often these data have only been analysed with regard to certain
criteria although they could form a useful corpus for other studies as well.
2. The solution: Asset Management Systems (AMS)
By making data accessible on the Internet, the drawbacks outlined above
can easily be overcome. Provided that adequate server capacity is
available, there are no length restrictions. In printed publications, reference
can be made to the Internet resources if they are stored in an appropriate
manner. Furthermore, process data that are stored online can be searched
easily if they are provided with detailed meta-data which can be used as
search criteria.
In an effort to solve the problems mentioned above, all materials
used in the longitudinal study TransComp, such as the source texts, the
translation assignments and the model translations, as well as all data
obtained in the experiments, such as the translation process protocols[2], the
log files, the screen records, and the subjects’ target texts without and with
evaluation mark-up will be made available to the scientific community in
an Asset Management System (AMS), an open-source-based storage,
administration and retrieval system for digital resources. For the translation
process protocols, XML-based transcription conventions have been
developed on the basis of the Guidelines for Electronic Text Encoding and
Interchange (TEI Consortium 2007) of the Text Encoding Initiative (TEI).
They are described in detail in Göpferich (forthcoming a). In this way the
problems pointed out by Englund Dimitrova (2005: 82 f.) are addressed.
She states that so far “no single, widely accepted model for coding and
analysis” has been developed and that “there does not yet seem to be an
established way of reporting protocol data”. The AMS will contribute to
the solution of this problem and facilitate future multi-centre studies, in
which, for instance, source texts and assignments can be downloaded from
the system and used with subjects from other translation-oriented
programmes and with other language combinations; these data can then
flow into the system and be compared with the ones from our own and
other studies.
Ideally, such an AMS could form part of an Internet portal that
provides access to one or several archives of data collected in
investigations into translation processes. These could be stored in such a
way that by applying certain search criteria, specific types of data could be
retrieved (e.g., all the data resulting from a specific project or from
experiments in which professional translators took part).[3] Translation
process researchers could then use these data as a corpus of reference with
which to compare their own data and findings, which could then also be
uploaded into the archive, thereby enabling several smaller-scale studies to
become extended into a larger-scale one on a cooperative basis.
[2] The term translation process protocol refers to the transcript of what has been said
(e.g. think aloud) but also of other actions that have occurred during the translation
process, such as the consultation of dictionaries and the refitting of the headset.
[3] For such an archive with regard to writing process research, cf. Strömqvist et al.
(2006: 71) and Sullivan/Lindgren (2006b: 210 f.).
Furthermore, such an Internet portal could also form a valuable
resource of material for translation pedagogy, in which a process-oriented
approach has now been advocated for about a decade (cf. Gile 1994, 2004;
Kußmaul 1995, 2000, 2007), especially in “[c]ourses for experienced
translators, who want to improve their methods rather than acquire basic
experience, which they already have” (Gile 1994: 112), in “[c]ourses
involving source and target languages that the teacher does not know”
(Gile 1994: 112), and in courses for students at the beginning of their
training in translation (Gile 2004). Samples from the process data stored in
the Internet portal could be used in the translation classroom to
give students insight not only into typical shortcomings in problem-solving
processes but also into strategies that have led to successful or particularly
creative translation solutions (cf. the numerous examples in Kußmaul
1995, 2000, and 2007).
Asset Management Systems provide the type of functionality needed
for these purposes. These electronic systems allow us to import and export
digital resources of any type, such as texts, graphics, videos, and sound
files (including format conversions), to annotate them using meta-data, to
retrieve data and analyse them, to view and play sound and video files, to
bundle files which belong together, and to archive and version them.
By providing easily accessible information on the Internet, an AMS
supports collaborative work beyond the boundaries of departments,
institutions, and even countries. Another advantage of an AMS is the
possibility of connecting files (e.g., transcripts and the screen-recording
files on which they are based) via hyperlinks and thus reducing the amount
of work involved in data transcription as will be explained below (cf. also
Stigler 2008). Additionally, Asset Management Systems allow various
display options. Whereas some researchers want to use specific
interpretation categories that have been tagged into a corpus of process
data, others prefer a clean corpus without such tagging and do not want to
be influenced by other researchers’ interpretation mark-up. As will be
shown, the display options in an AMS can be designed in such a way that
users can choose what types of information in the transcripts they wish to
display. Furthermore, Asset Management Systems also allow storing
documents in different versions, for example, in the form of ‘pure’
transcripts without interpretation mark-up as a resource for various types
of research, and as transcripts with such tagging for purposes of
verifiability.[4]
In the following, the AMS developed for the longitudinal study
TransComp (Göpferich et al. 2008 ff.) will be described.
3. The TransComp AMS
Figure 1. Start page of the TransComp Asset Management System
[4] Where a ‘clean’ version of a transcript is not available in an AMS, it can easily be
produced from an XML document by removing unwanted interpretation tags by
means of an editor’s search-and-replace option (search for the unwanted tags and
replace them with nothing).
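The search-and-replace step described in this note can also be scripted. The following Python sketch is an illustration only and not part of the TransComp tooling: it assumes that the interpretation mark-up to be removed consists of incident elements, as in the examples in Section 3.2, and the helper name strip_incident_tags is hypothetical.

```python
import re

# Matches an <incident ...> start tag or an </incident> end tag.
# Assumption: attribute values never contain the character ">".
INCIDENT_TAG = re.compile(r"</?incident\b[^>]*>")

def strip_incident_tags(xml_text):
    """Remove incident start and end tags but keep the text they enclose."""
    return INCIDENT_TAG.sub("", xml_text)

sample = (
    '<incident xml:id="1.2" type="problem" subtype="conditioning" '
    'start="00:06:47" end="00:06:54">ähm: ja das weiß ich immer noch nicht</incident>'
)
print(strip_incident_tags(sample))
```

Because only the tags are deleted, the transcribed utterances and all other mark-up remain in the document unchanged.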
164
Susanne Göpferich
The TransComp AMS has been implemented using Fedora open-source
repository software (cf. Fedora). Fig. 1, which shows its start page,
provides general information on the project. Project data and contact
information (see Fig. 2) can be obtained by clicking on the link “Project
Data & Contact” at the bottom of the page (not visible in Fig. 1).
Figure 2. The page “Project Data & Contact”
3.1 Accessing “Materials”
The materials used in TransComp, i.e. the source texts, translation
assignments and model translations, can be accessed via the link
“Materials” in the menu on the left.
Figure 3. Materials: source texts, translation assignments, and model
translations
Where useful, the materials are provided in different formats. As can be
seen from the screenshot in Fig. 3, the “Source Texts and Assignments” are
provided in two formats: as PDF documents and as Translog project files.
The PDF documents include both the translation assignment and the source
text in one document; the Translog project files include only the source
text to be displayed in Translog and the Translog project specification[5]
because the assignments are usually handed out to the subjects in printed
form. The files can either be opened directly by clicking on the respective
file designation in the main window or saved to a storage medium by
placing the mouse cursor on them, pressing the right mouse button and
selecting the option desired. Note that Translog 2006 files, which are XML
files, have to be saved as *.project if you wish to open them directly in
Translog.
[5] The Translog project specification describes how the source text will be displayed.
3.2 Accessing data
Accessing data
All data collected in the experiments can be accessed in two ways: either
via the “Experimental Waves”, in which the data were collected, or via the
“Subjects”, who took part in the experiments. Fig. 4 shows the list of all
experiments conducted in the first experimental wave (“Wave 1”) with the
student subjects.
Figure 4. Experiments conducted in the first experimental wave with the student
subjects
Fig. 5 below shows the list of all experiments in which the student subject
BKR has taken part so far.
Figure 5. Experiments in which the student subject BKR took part
To access the data collected in an experiment, its designation has to be
clicked on in the main window. Clicking on “Wave 2 / Source Text A5 /
Student BKR” in the screen shown in Fig. 5, for example, opens the view
shown in Fig. 6.
Figure 6. Access to the data collected in the translation experiment
“Wave 2 / Source Text A5 / Student BKR”
This display can be subdivided into three areas: (1) an area (“more”),
where the meta-data can be displayed; (2) the window on the left, where
the transcript can be displayed; and (3) the window on the right, where the
screen record can be replayed.
Figure 7. Access to meta-data
By clicking on “more”, the meta-data for the experiment will be displayed
as shown in Fig. 7. The meta-data include information about the subject,
the experimental setting (date of the experiment, duration in minutes,
description of the situation in the room where the experiment was
conducted, methods and software used for the experiment, a short
description of the task to be carried out, the documents that have resulted
from the experiment), the resources available to the subjects during the
experiment, and the persons responsible for the experiment
(responsibilities). The individual documents (the transcript of the
experiment with XML mark-up, the target text produced by the subject
both as a TXT-file and as a PDF-file, the target text with evaluation
comments again both as a PDF-file and as a DOC-file, and the Translog
recording as a PDF-file and as an XML-file for replay in Translog) can be
accessed by clicking on their designations. An example of a transcript with
XML-mark-up can be found in Appendix A. Appendix B shows an
example of a target text with evaluation comments. The coding of meta-data
in XML is described in Göpferich (forthcoming a).
Clicking on “more” again closes the meta-data window and brings
the user back to the display in Fig. 6. Clicking on one of the phases shown
on the left opens the transcript of that phase without any mark-up. To
display the mark-up, the respective box has to be ticked in the upper part
of the window. In Fig. 8 below, the main phase has been opened and the
category “problem” has been ticked, so that all passages in which the
subject is solving a translation problem in the main phase are displayed.
For identifying instances of such problems, we use an adapted version of
Krings’ (1986: 121) classification of problem indicators, which he
developed as a means of identifying translation problems in transcripts in a
consistent and inter-subjective way. Krings differentiates between primary
and secondary problem indicators. Primary problem indicators are clear
evidence of translation problems, whereas secondary problem indicators
only lead to the assumption that there might have been a problem in the
translation process. For this reason, Krings only counts those phenomena
as translation problems for which there is either one primary problem
indicator or for which there are at least two secondary problem indicators.
Like Krings we count the following phenomena as primary problem
indicators: (1) utterances by means of which subjects make clear that they
have a translation problem, e.g. “da weiß ich jetzt net was das genau
bedeutet” [here I don’t know what it means exactly]; (2) any consultation
of a source of reference (printed or online dictionary, parallel text, etc.);
and (3) gaps in the target text resulting from not knowing how to translate
certain source-text units. Krings’ list of secondary problem indicators had
to be adapted for our purposes (for the reasons, see Göpferich forthcoming
a). What we count as secondary problem indicators are: (1) alternative
tentative translation equivalents; (2) negative evaluations of target-text
units verbalized by the translator; (3) unfilled pauses of a duration of at
least three seconds; (4) certain vocalized non-lexical phenomena, such as
sighing; and (5) the inability to think of a primary equivalent association.
What is important here is that these are only secondary problem
indicators. This means that if any one of them does not occur in
combination with at least one other problem indicator, the respective
passage in the transcript is not counted as an instance of a translation
problem. There is one exception to this: if subjects take up a problem that
they have worked on previously, the earlier occurrence of the problem is
counted as one problem indicator. As a consequence, one additional
secondary problem indicator for the same item that caused the problem
suffices for counting that particular passage in the transcript as a recurring
instance of a translation problem.
Sections in which at least two secondary problem indicators occur
but where it is not clear what may have caused the potential problem are
not counted as instances of translation problems. Comments on the
difficulty of the text and on whether the subject liked it or not are not
considered problem indicators.
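The counting rule described above can be summarized as a small decision function. The following Python sketch is an illustration under stated assumptions and is not the procedure actually used for coding in TransComp, which is carried out by researchers: the class Passage and its field names are hypothetical, and it is assumed that a single primary indicator suffices even where the cause is not recorded, a case the description above does not address.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    primary: int = 0          # number of primary problem indicators observed
    secondary: int = 0        # number of secondary problem indicators observed
    recurrence: bool = False  # subject takes up a problem worked on previously
    cause_known: bool = True  # False if it is unclear what caused the potential problem

def is_translation_problem(passage):
    """Apply the rule: one primary indicator, or at least two other indicators."""
    if passage.primary >= 1:
        return True
    if not passage.cause_known:
        # Secondary indicators without an identifiable cause are not counted.
        return False
    # An earlier occurrence of the same problem counts as one indicator.
    indicators = passage.secondary + (1 if passage.recurrence else 0)
    return indicators >= 2

print(is_translation_problem(Passage(secondary=1)))
print(is_translation_problem(Passage(secondary=1, recurrence=True)))
```

The second call illustrates the exception for recurring problems: one additional secondary indicator is enough once the problem has occurred before.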
Problems may occur in any phase of the translation process, not only
in the main phase. Problem instances start when the subject becomes aware
of the problem. In the transcript, the start tag is placed immediately before
the utterance or action indicating this. A problem instance ends when the
subject has solved the problem or turns to something else. This is where
the end tag is placed.
In TransComp, the occurrence of a translation problem is transcribed
as follows:
<incident xml:id="1" type="problem" subtype="conditioning"
start="00:06:16" end="00:06:17">
</incident>
As can be seen from the example above, instances of interpretation
categories are enclosed in incident tags. They are specified as instances of
translation problems by means of the type attribute with the value problem.
The cause of the translation problem is indicated in a subtype attribute
whose value is the source-text element or any other indication of what
caused the translation problem. In the example above, the translation
problem was caused by the source-text item conditioning. All occurrences
of translation problems are provided with a running number. In the
example above, this is xml:id="1", which means that this is the first
translation problem in the experiment. Furthermore, the time is indicated
when the subject first showed signs of having a translation problem (start)
as well as the time when the subject had solved the problem or turned to
something else (end). If the subject does not solve a problem and returns to
it later, this is also indicated in the number. When addressing the problem
for the second time, the number that the problem had at its first occurrence
is retained, followed by “.2”; when it is taken up for the third time, the
same number is used, followed by “.3”, etc. In the example below, the
subject returns to the problem in the example above for the second time:
<incident xml:id="1.2" type="problem" subtype="conditioning"
start="00:06:47" end="00:06:54">ähm: ja das weiß ich immer noch nicht so
recht wie ich das übersetzen soll</incident>
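To illustrate how this encoding lends itself to automatic analysis, the following Python sketch reads the two incident elements quoted above and derives, for each, the problem number, the occurrence count and the duration in seconds. It is a minimal sketch under stated assumptions: the wrapper element body and the function names are hypothetical, and the sample omits the default namespace used in the TransComp corpus files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def to_seconds(hms):
    """Convert an HH:MM:SS time indication into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def read_problems(xml_text):
    """Return one record per incident element of type 'problem'."""
    root = ET.fromstring(xml_text)
    records = []
    for incident in root.iter("incident"):
        if incident.get("type") != "problem":
            continue
        # "1.2" means problem 1, second occurrence; "1" means first occurrence.
        base, _, occurrence = incident.get(XML_ID).partition(".")
        records.append({
            "problem": int(base),
            "occurrence": int(occurrence) if occurrence else 1,
            "cause": incident.get("subtype"),
            "duration_s": to_seconds(incident.get("end")) - to_seconds(incident.get("start")),
        })
    return records

# Hypothetical wrapper element around the two incidents quoted above.
sample = """<body>
<incident xml:id="1" type="problem" subtype="conditioning"
start="00:06:16" end="00:06:17"></incident>
<incident xml:id="1.2" type="problem" subtype="conditioning"
start="00:06:47" end="00:06:54">ähm: ja das weiß ich immer noch nicht so
recht wie ich das übersetzen soll</incident>
</body>"""

for record in read_problems(sample):
    print(record)
```

In a file that declares a default namespace, as the corpus files in Appendix A do, the element name passed to iter would have to be qualified with that namespace.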
Encoding such instances of interpretation categories in transcripts can be
much more difficult and time-consuming than “just” transcribing what has
been said and done, even if the latter also requires an interpretative effort
which should not be underestimated. Nevertheless, transcribing what has
been said and done can often be delegated to student assistants. However,
the encoding of instances of interpretation categories usually cannot be
delegated and has to be done by researchers themselves. Smagorinsky
(1994: 19) explains this as follows: “However, coding is an analytical and
recursive process. It is thus an integral part of the research process, and the
exact coding system must develop in interaction with the data.” (cf. also
Göpferich 2008: Section 3.9)
Encoding instances of translation problems with their beginning and
their end (start attribute and end attribute) has advantages in Asset
Management Systems. The time indications can be used for linking
sections of the transcript to the corresponding positions in the sound or
video file on which the transcript is based. As illustrated in Fig. 8 below, in
the TransComp AMS, transcripts and the corresponding screen-recording
files are linked in such a way that clicking on a translation problem in the
transcript starts the section of the video file where this problem is
documented. This feature may not only be useful in translation process
research, but also for illustrative purposes in the translation classroom.
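One conceivable way of turning the start and end attributes into a link to a section of a video file is sketched below in Python. This is an assumption for illustration and not the mechanism implemented in the TransComp AMS: the sketch uses the temporal syntax of W3C Media Fragments (a "#t=start,end" suffix with times in seconds), and the video file name is hypothetical.

```python
def hms_to_seconds(hms):
    """Convert an HH:MM:SS time indication into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def video_section_link(video_url, start, end):
    """Build a link addressing the section of the video between start and end."""
    return f"{video_url}#t={hms_to_seconds(start)},{hms_to_seconds(end)}"

# Hypothetical file name; the time values are taken from the second incident example.
print(video_section_link("t1_A1_Stud_SFR.mp4", "00:06:47", "00:06:54"))
```

A player that supports temporal fragments would start playback at the beginning of the problem passage and stop at its end.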
Figure 8. Linking passages of transcripts to the corresponding sections in the
screen record[6]
By linking transcripts to the corresponding screen-recording files, what is
encoded can be reduced to those phenomena which seem most relevant for
answering the research questions asked. However, other phenomena,
which have not been encoded because they seemed to be less important at
first, do not get lost but are retained in the screen-recording file and can be
encoded in the transcript later, if necessary. Furthermore, during the phase
of analysis, the transcripts and screen-recording files can be used in
parallel, which has the advantage that certain phenomena can be observed
much more precisely in the screen-recording file than they could ever have
been documented in a transcript. This is another advantage of making
process data accessible in an AMS.
[6] Problem No. 1 (“speaks vs. talks”) refers to the following passage of the source text,
which had to be translated from English into German: “Moja, together with a dozen
or so other chimps and one gorilla in the United States, talks. She doesn’t speak – she
talks. She communicates with her fingers in American Sign Language, devised for,
and used by, hundreds of thousands of deaf Americans.”
As mentioned above, in the transcripts, the problems are numbered
and given designations which specify the items in the source text that
caused the translation problems. These numbers and designations appear
not only in the transcripts, in the highlighted line at the top of the
passage in which the translation problem occurred, but also below the video
window on the right in Figs 6 and 8. If one clicks on them, the passage of
the screen record is replayed.[7] The duration of this passage is given in
minutes and seconds at the right end of the highlighted line at the top of
the problem box.
In this way, the transcripts are linked to the screen-recording files.
The size of the screen recording window can be increased by clicking on
the symbol marked with a circle in Fig. 8.
Apart from the problem passages described above (“problem”), the
following phenomena can be highlighted in the transcripts by ticking the
respective box at the top of the screen:
“utterance”. Utterances are verbalizations that the subject makes
without reading from a written document such as the source text, the
target text produced so far or any work of reference.
“self-dictate”. What the subjects dictate to themselves while typing is
marked as “self-dictate”.
“read”. “Read” indicates reading processes.
“consult”. “Consult” indicates the looking-up of items in external
resources.
“vocal”. “Vocal” indicates non-lexical phenomena, such as smacking
one’s lips, sniffing, and sighing.
“pause”. “Pause” indicates pauses where the subjects do not do or
verbalize anything.
“shift”. “Shift” indicates changes in voice quality such as lowering
one’s voice, speaking up, speaking in an approving or disapproving
tone.
“other”. Clicking on “other” displays the mark-up of actions such as
typing or clicking silently, refitting the head-set, drinking, etc.
A detailed description of all these categories and the way they are
transcribed is given in Göpferich (forthcoming a).
[7] This may take a few seconds because the screen recording file (in Flash format) has to
be downloaded first.
3.3 Additional information
The other links on the left of Fig. 1 provide access to the “Publications”
resulting from the research project TransComp, to the documentation of the
XML schema file developed for TransComp (“Schema documentation”; cf.
Göpferich forthcoming a), to the “Bibliographic Database TransPro”, which
contains an online bibliography on translation process research, to
information on a “Mailing List” for people interested in translation process
research, and to links to “Other Research Groups” active in the field of
translation process research.
The materials used and all the data collected in the TransComp
experiments conducted so far have been uploaded into the AMS described
above (Göpferich et al. 2008 ff.). At the moment, these materials and data
are password-protected because the source texts will also be used in future
test waves of TransComp, and we have to make sure that our subjects do
not have access to them until the last test wave has been completed (in
August 2011). After this, password protection will be removed and the
data can be accessed freely.
3.4 Uploading data
Data are uploaded with special client software, which is also
password-protected. The screenshot in Fig. 9 gives an impression of what the
user interface in this client looks like. For each experiment conducted, a
so-called ‘object’ has to be created with a special identification number
(the lines in the background of Fig. 9). For each of these objects, all the
individual files belonging to it can be uploaded in the overlaid window. As
a last step, each object has to be attributed to the ‘data containers’ to which
it belongs. For each experiment, these are the containers of the
experimental wave and of the subject involved. This ensures that the data
can be accessed in the AMS both via the link “Experimental Waves” and
via the link “Subjects”.
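The relationship between objects and data containers can be pictured with a small data structure. The Python sketch below is an illustration only and does not reproduce the Fedora repository or the upload client: the class Repository, its method names, the object identifier and the file names are all hypothetical.

```python
from collections import defaultdict

class Repository:
    """Toy model: each object is attributed to a wave container and a subject container."""

    def __init__(self):
        self.files = {}                     # object identifier -> list of file names
        self.containers = defaultdict(set)  # (kind, name) -> set of object identifiers

    def add_object(self, object_id, files, wave, subject):
        self.files[object_id] = list(files)
        self.containers[("wave", wave)].add(object_id)
        self.containers[("subject", subject)].add(object_id)

    def by_wave(self, wave):
        return sorted(self.containers.get(("wave", wave), set()))

    def by_subject(self, subject):
        return sorted(self.containers.get(("subject", subject), set()))

repo = Repository()
# Hypothetical identifier and file names for one experiment.
repo.add_object("t2_A5_Stud_BKR", ["transcript.xml", "target_text.txt"], wave=2, subject="BKR")
print(repo.by_wave(2))
print(repo.by_subject("BKR"))
```

Attributing every object to both kinds of container is what allows the same experiment to be reached along either access path.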
Figure 9. Client for data upload
4. Conclusion
This article has illustrated how data collected in translation process
research can be made available on the Internet in Asset Management
Systems (AMS). Furthermore, it has shown the added value that such
systems offer, for example, by allowing files to be connected in such a way that
they can be analysed in parallel. If an increasing number of researchers
made use of such systems, this could pave the way for collaborative work
beyond the boundaries of departments, institutions, and even countries, an
effort which is indispensable for collecting corpora of sizes which allow
generalizations.
References
Englund Dimitrova, B. 2005. Expertise and Explicitation in the Translation
Process. Amsterdam, Philadelphia: John Benjamins.
Fedora: Fedora Commons <http://www.fedora.info> (January 26, 2009).
Gile, D. 1994. The process-oriented approach in translation training. In C.
Dollerup & A. Lindegaard (eds). Teaching Translation and Interpreting 2.
Amsterdam, Philadelphia: John Benjamins. 107–112.
Gile, D. 2004. Integrated Problem and Decision Reporting as a translator training
tool. JosTrans 2: 2–20.
Göpferich, S. 2008. Translationsprozessforschung: Stand – Methoden –
Perspektiven. (Translationswissenschaft 4). Tübingen: Narr.
Göpferich, S. 2009. Towards a model of translation competence and its
acquisition: the longitudinal study ‘TransComp’. In S. Göpferich, A. L.
Jakobsen & I. M. Mees (eds): Behind the Mind: Methods, Models and
Results in Translation Process Research. (Copenhagen Studies in Language
37). Copenhagen: Samfundslitteratur. 11–37.
Göpferich, S. forthcoming a. Data documentation and data accessibility in
translation process research. The Translator.
Göpferich, S. forthcoming b. Anleitungen rezipieren, Anleitungen produzieren:
Empirische Befunde zu kognitiven Prozessen bei Übersetzungsnovizen und
Übersetzungsprofis. HERMES – Journal of Language and Communication
Studies.
Göpferich, S. et al. 2008 ff. TransComp – The Development of Translation
Competence. [Corpus and Asset Management System for the Longitudinal
Study TransComp]. Graz: Univ. of Graz. <http://gams.uni-graz.at/container:tc> (April 6, 2009).
Hansen, G. 2006. Erfolgreich übersetzen: Entdecken und Beheben von
Störquellen. (Translationswissenschaft 2). Tübingen: Narr.
Jääskeläinen, R. 1999. Tapping the Process: An Explorative Study of the
Cognitive and Affective Factors Involved in Translation. (University of
Joensuu Publications in the Humanities 22). Joensuu: Univ. of Joensuu,
Faculty of Arts.
Jakobsen, A. L. 1999. Logging target text production with Translog. In: G.
Hansen (ed.). Probing the Process in Translation: Methods and Results.
(Copenhagen Studies in Language 24). Copenhagen: Samfundslitteratur,
9–20.
Krings, H. P. 1986. Was in den Köpfen von Übersetzern vorgeht. Eine empirische
Untersuchung zur Struktur des Übersetzungsprozesses bei fortgeschrittenen
Französischlernern. Tübingen: Narr.
Krings, H. P. 2001. Repairing Texts: Empirical Investigations of Machine
Translation Post-Editing Processes. Kent (Ohio), London: Kent State Univ.
Press.
Kußmaul, P. 1995. Training the Translator. Amsterdam, Philadelphia: John
Benjamins.
Kußmaul, P. 2000. Kreatives Übersetzen. (Studien zur Translation 10). Tübingen:
Stauffenburg.
Kußmaul, P. 2007. Verstehen und Übersetzen: Ein Lehr- und Arbeitsbuch.
Tübingen: Narr.
Selting, M., et al. 1998. Gesprächsanalytisches Transkriptionssystem GAT.
Linguistische Berichte 173: 91–122. <www.fbls.uni-hannover.de/sdls/
schlobi/schrift/GAT/gat.pdf> (August 1, 2006).
Smagorinsky, P. 1994. Think-aloud protocol analysis beyond the black box. In P.
Smagorinsky (ed.). Speaking about Writing: Reflections on Research
Methodology. Thousand Oaks: Sage. 3–19.
Stigler, H. 2008. XML-Frameworks im Korpusmanagement. In B. Tošović (ed.).
Die Unterschiede zwischen dem Bosnischen/Bosniakischen, Kroatischen
und Serbischen. Münster: LIT. 617–629.
Strömqvist, S., et al. 2006. What keystroke logging can reveal about writing. In
K. P. H. Sullivan & E. Lindgren (2006a): 45–71.
Sullivan, K. P. H. and Lindgren, E. (eds) 2006a. Computer Keystroke Logging
and Writing: Methods and Applications. Oxford: Elsevier.
Sullivan, K. P. H. & Lindgren, E. 2006b. Supporting learning, exploring theory
and looking forward with keystroke logging. In Sullivan/Lindgren (2006a):
203–211.
TEI Consortium (eds) 2007. TEI P5: Guidelines for Electronic Text Encoding
and Interchange. [Version 1.0], [Last updated on October 28, 2007],
<http://www.tei-c.org/Guidelines/P5/index.xml> (March 28, 2008).
Appendix A
Example of a transcript following the TEI guidelines (containing
translation-process-specific deviations) from the corpus of the research
project TransComp. The example is an extract from the transcript of a
translation process from English into German. It illustrates the guidelines
followed in TransComp. The spoken text enclosed in tags is encoded
according to the GAT conventions (see Selting et al. 1998 and Göpferich
forthcoming a), i.e., the German text has been written without capitals
except for sounds that the subject emphasized, the lengthening of vowels is
indicated by one or more colons, etc.
<?xml version="1.0" encoding="UTF-8"?>
<teiCorpus xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://gams.uni-graz.at/transcomp/1.0 http://gams.uni-graz.at/schemas/transcomp/1.0/transcomp.xsd"
xmlns="http://gams.uni-graz.at/transcomp/1.0">
<teiHeader type="corpus">
<fileDesc>
<titleStmt>
<title>TransComp: The Development of Translation Competence</title>
<respStmt>
<resp>Leader</resp>
<name>Susanne Göpferich</name>
</respStmt>
<respStmt>
<resp>Participant</resp>
<name>Gerrit Bayer-Hohenwarter</name>
</respStmt>
<respStmt>
<resp>TEI Modeling</resp>
<name>Hubert Stigler</name>
</respStmt>
</titleStmt>
<publicationStmt>
<publisher>TransComp</publisher>
<pubPlace>Graz</pubPlace>
<date>2008</date>
</publicationStmt>
<notesStmt>
<note type="description">TransComp is a process-oriented longitudinal
study which explores the development of translation competence in 12
students of translation over a period of 3 years and compares it to that of
10 professional translators. The insight into the components which make
up translation competence and into its development gained in the project
will be utilized for translation pedagogy and the improvement of curricula
for translator training.</note>
</notesStmt>
<sourceDesc>
<p>Cf. sources of the individual transcripts.</p>
</sourceDesc>
</fileDesc>
</teiHeader>
<TEI>
<teiHeader type="text">
<fileDesc>
<titleStmt>
<title>TransComp: t1_A1_Stud_SFR</title>
<respStmt>
<resp>Experimenter</resp>
<name xml:id="SEG">Susanne Göpferich</name>
</respStmt>
<respStmt>
<resp>Experimenter</resp>
<name xml:id="SWI">Solvejg Wiedecke</name>
</respStmt>
<respStmt>
<resp>Transcriber</resp>
<name>Solvejg Wiedecke</name>
</respStmt>
<respStmt>
<resp>Proof-Reader</resp>
<name>Susanne Göpferich</name>
</respStmt>
<respStmt>
<resp>Proof-Reader</resp>
<name xml:id="GBH">Gerrit Bayer-Hohenwarter</name>
</respStmt>
</titleStmt>
<publicationStmt>
<publisher>TransComp</publisher>
<pubPlace>Graz</pubPlace>
<date>2008</date>
</publicationStmt>
<sourceDesc>
<recordingStmt>
<p>transcript of thinking-aloud from Camtasia Studio screen
recording with sound; additional hardware and software used:
Translog 2006 Academic Edition, Webcam</p>
<p>
<ptr type="source_of_transcript"
target="t1_A1_Stud_SFR.camrec"/>
</p>
</recordingStmt>
</sourceDesc>
</fileDesc>
<profileDesc>
<particDesc>
<person xml:id="SFR" role="student" sex="2" age="21">
<nationality>German</nationality>
<state type="competence_level">
<ab>t1</ab>
</state>
<state type="typing_skills">
<ab>touch typist</ab>
</state>
<langKnowledge>
<langKnown level="mother_tongue" tag="de-AT"/>
<langKnown level="L2" tag="en"/>
</langKnowledge>
</person>
</particDesc>
<settingDesc>
<setting>
<date when="2007-10-20T10:28:00" dur="PT69M"/>
<locale>LabCom.Doc</locale>
</setting>
<setting>
<ab type="description">quiet room, light on, experimenters present in
background</ab>
<ab type="methods_used">think-aloud, key logging, screen
recording, webcam recording, questionnaire</ab>
<ab type="software_used">Translog 2006 Academic Edition,
Camtasia Studio</ab>
<ab type="assignment">translation of a popular-science text from
English into German</ab>
<ab type="documents">
<ptr type="source_text" xml:lang="en" target="http://"/>
<ptr type="target_text" xml:lang="de" target="http://"/>
<ptr type="model_translation" xml:lang="de" target="http://"/>
<ptr type="assignment" xml:lang="de" target="http://"/>
</ab>
<ab type="resources">
<listBibl>
<bibl>Internet</bibl>
<bibl xml:id="Webster">Webster's Collegiate Dictionary</bibl>
<bibl xml:id="Wahrig">Wahrig Die deutsche
Rechtschreibung</bibl>
<bibl xml:id="LCE">Longman Dictionary of Contemporary
English</bibl>
<bibl xml:id="Duden-SSW">Duden Sinn- und sachverwandte
Wörter</bibl>
<bibl xml:id="Oxford-Duden">Oxford Duden German
Dictionary</bibl>
</listBibl>
</ab>
</setting>
</settingDesc>
</profileDesc>
</teiHeader>
<text>
<body>
<div type="pre-phase">
<u start="00:00:59" end="00:03:39">also dann les ich mir das jetzt mal
durch<vocal>sniffs</vocal> <vocal>swallows</vocal><incident
type="reads entire ST silently"/></u>
</div>
<div type="main-phase">
<u start="00:03:39" end="01:02:27">ok wa:s das en jetzt?<pause
dur="3"/>vorhin war schon besser.<vocal>sighs</vocal>
<incident type="reads ST"><shift loud="p">we don’t want a thing
because we found a reason for it we find reasons for it because we want
it</shift></incident><pause dur="2"/>ok
<incident type="self-dictates">wa:rum? men:schen rauchen
</incident><pause dur="7"/> <vocal>sniffs</vocal>wir wollen etwas
NIcht weil wir einen grund dafür gefunden haben<pause dur="1"/>wir
finden grün:de dafür weil wir<pause dur="1"/>es wollen<pause
dur="6"/>wir wollen ein ding nicht<pause dur="5"/>
<incident type="self-dictates">nicht, weil wir einen grund dafür
<vocal>sniffs</vocal>gefunden haben<pause dur="2"/>wir finden
gründe dafür, weil wir es wollen</incident>stimmt das?
<incident type="reads TT"><shift loud="p">wir wollen ein ding <pause
dur="1"/>nicht, weil wir einen grund dafür gefunden haben <pause
dur="1"/>wir finden gründe da:für,<pause dur="2"/>weil wir es
wollen<pause dur="3"/></shift></incident>ja, stimmt
<vocal>sniffs</vocal><pause dur="10"/>
<incident type="types"/><vocal>sniffs</vocal>
<incident type="reads ST"><shift loud="p">inspite of the importance of
psychological<pause dur="1"/>or conditioning factors in
addiction</shift></incident>
<incident xml:id="1" type="problem" subtype="conditioning"
start="00:06:16" end="00:06:17">häh:::? </incident>
<incident type="reads ST"><shift loud="p">it is the craving that <pause
dur="2"/>most often causes the addict to fail in any attempt to
quit</shift></incident><vocal>sighs</vocal>inspite <pause
dur="1"/>trotz der<vocal>smacks</vocal>importance ähm: ja:
wichtigkeit<pause dur="1"/>von psycho cholo:gischen: und conditioning
<incident xml:id="1.2" type="problem" subtype="conditioning"
start="00:06:47" end="00:06:54">ähm: ja das weiß ich immer noch nicht
so recht wie ich das übersetzen soll</incident><pause dur="1"/>
factors,<pause dur="1"/>hm:<pause dur="1"/>addiction abhängigkeit
und gewohnheit<pause dur="1"/>fü:r<pause dur="2"/>physi<pause
dur="1"/>physische<vocal>sniffs</vocal> ähm: substanzen <pause
dur="7"/>.h conditionin ja,<pause dur="2"/>ja, ok. .h trotz der
wichtigkeit<pause dur="1"/>von: <pause dur="2"/>psychologischen und
conditioning [...]
</u>
</div>
<div type="post-phase">
<u start="01:02:27" end="01:09:17">
<incident type="reads TT">warum menschen rauchen. .hhh wir wollen
ein ding nicht weil wir einen grund dafür gefunden haben wir finden
gründe dafür weil wir es wolln:</incident><pause dur="5"/>ja.
<incident type="self-dictates">will:</incident>[…]
</u>
</div>
</body>
</text>
</TEI>
</teiCorpus>
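Because the transcript is encoded as XML, its annotations can be evaluated automatically. The following Python sketch is not part of the TransComp system; it merely illustrates how the time stamps on the u elements and the pause and incident elements could be summarized per phase. The excerpt in SAMPLE is a shortened, hypothetical fragment modelled on the transcript above, the function names to_seconds and summarize are illustrative, and the sketch assumes that the dur values of pause elements are given in seconds.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Default namespace declared on <teiCorpus> in the transcript above.
NS = {"tc": "http://gams.uni-graz.at/transcomp/1.0"}

# Shortened, hypothetical excerpt modelled on the Appendix A transcript.
SAMPLE = """
<TEI xmlns="http://gams.uni-graz.at/transcomp/1.0">
  <text><body>
    <div type="pre-phase">
      <u start="00:00:59" end="00:03:39">also dann les ich mir das jetzt mal
        durch<incident type="reads entire ST silently"/></u>
    </div>
    <div type="main-phase">
      <u start="00:03:39" end="01:02:27">ok<pause dur="3"/>
        <incident type="reads ST">we don't want a thing</incident>
        <pause dur="2"/>
        <incident type="self-dictates">wa:rum? men:schen rauchen</incident>
        <pause dur="7"/></u>
    </div>
    <div type="post-phase">
      <u start="01:02:27" end="01:09:17">
        <incident type="reads TT">warum menschen rauchen</incident>
        <pause dur="5"/>ja.</u>
    </div>
  </body></text>
</TEI>
"""


def to_seconds(stamp: str) -> int:
    """Convert an hh:mm:ss time stamp into seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def summarize(xml_text: str) -> dict:
    """Return phase length, total pause time and incident counts per <div>."""
    root = ET.fromstring(xml_text)
    summary = {}
    for div in root.iterfind(".//tc:div", NS):
        length = 0
        pause_total = 0
        incidents = Counter()
        for u in div.iterfind("tc:u", NS):
            length += to_seconds(u.get("end")) - to_seconds(u.get("start"))
            # Assumption: pause durations are whole seconds.
            pause_total += sum(
                int(p.get("dur")) for p in u.iterfind(".//tc:pause", NS)
            )
            incidents.update(
                i.get("type") for i in u.iterfind(".//tc:incident", NS)
            )
        summary[div.get("type")] = {
            "seconds": length,
            "pause_seconds": pause_total,
            "incidents": dict(incidents),
        }
    return summary


if __name__ == "__main__":
    for phase, data in summarize(SAMPLE).items():
        print(phase, data)
```

The same start and end values could also serve as offsets for replaying the corresponding section of a recording; the sketch only covers the counting step.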
Appendix B
Example of an evaluated target text (Wave 1 / Source Text A3 / Student
SFR). For an explanation of the error classification used, see Göpferich
(forthcoming b).