Transcription

Revision as of 12:25, 15 February 2017

The transcription, which contains the actual text of the edition, is encoded within a <text> element in the XML/TEI file. The encoding takes into account the structure of the text, the source of the text, any required annotations and all other peculiarities.

For the transcription of the manuscripts, the aim is to stay as close to the original text as possible. We focus on the production of diplomatic editions. This means that the edition is based on only one source and that the text and all graphic information are displayed in accordance with this source. We do not aspire to a typographic imitation of the source, but aim for a functional reproduction of the text. Staying true to the text is key: deviations from standard spelling and grammar are copied; changes made during writing, immediately or later, are documented; and, where relevant, the physical structure of the source is reproduced.

Structure of the text

A text is usually not a ‘flat’ entity, but consists of different layers, each with its own hierarchy. These layers are reproduced within the XML file. The text of the edition is encoded in a <text> element, which contains a mandatory <body> element. The structure of a simple text would look as follows:

<text>
 <body>
  <p>...</p>
 </body>
</text>

Other layers of the text can be encoded with the following elements (optionally supplemented with a @type attribute to specify them further):

  • <p> Paragraph;
  • <ab> Anonymous block (an arbitrary unit of text, but without the semantic baggage of a paragraph);
  • <div> Section (i.e. chapter, rubric or another kind of section);
  • <head> Titles of chapters, headings etc.;
  • <group> To indicate different texts within the transcription.
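
As a rough sketch of how these elements could nest, the fragment below combines them in one transcription containing two texts. It assumes the usual TEI arrangement in which <group> takes the place of <body> and wraps several <text> elements. The @type values "chapter" and "note" are illustrative only and are not taken from the project schema.

```xml
<text>
 <group>
  <!-- first text in the transcription -->
  <text>
   <body>
    <!-- "chapter" is a hypothetical @type value -->
    <div type="chapter">
     <head>...</head>
     <p>...</p>
    </div>
   </body>
  </text>
  <!-- second text in the transcription -->
  <text>
   <body>
    <!-- "note" is a hypothetical @type value -->
    <ab type="note">...</ab>
   </body>
  </text>
 </group>
</text>
```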

Reproduction of the source

To reproduce all characteristics of a text and its medium in the digital edition we use a number of standard XML elements:

  • <add> Addition;
It is a <add>complete</add> text.
  • <damage> Damage or text loss;
<damage extent="whole leaf" agent="rubbing"> ... </damage>
  • <del> Deletion;
It is <del type="strikethrough">not</del> a complete text.
  • <hi> Highlight (for text that is visually marked in the source, with @rend specifying how it appears);
<hi rend="underline">This text is underlined</hi>
  • <restore> Used to mark an earlier deletion that is undone;
<restore seq="2"><del seq="1">cocktail</del></restore>
<del seq="2"><add seq="1">drink</add></del>
  • <space> Empty lines, using dim="vertical" and the unit and quantity attributes to indicate the number of lines;
<space dim="vertical" unit="lines" quantity="2"/>
  • <unclear> Uncertain reading of the text, for example due to damage;
A single <unclear>word</unclear> is hard to read.
  • @rend One of the global attributes in TEI, so it is allowed on a wide range of elements. It indicates how the element in question was rendered or presented in the source. The allowed values are defined in the schema;
  • @place Indicates the location of, for example, an addition (above, below, margin, etc.) or of the closer;
  • @type Used to classify or sort elements.
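
The elements and attributes above can also be combined in a single passage. The fragment below is only a sketch: the attribute values are reused from the examples on this page ("strikethrough", "underline") or from the locations listed for @place ("above"), and whether they are permitted depends on the schema of the project in question.

```xml
<p>
 <!-- a deleted word, followed by a word added above the line -->
 It is <del type="strikethrough">not</del> a
 <add place="above">complete</add> text, but a single
 <!-- a reading the transcriber is not sure of -->
 <unclear>word</unclear> is
 <!-- text that is underlined in the source -->
 <hi rend="underline">hard to read</hi>.
</p>
```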

Different sources have different peculiarities, so the XML elements used to encode them will vary to some extent per project.

See also