Difference between revisions of "Transcription (Mondrian)"

* '''<add>''' Addition
 
* '''<del>''' Deletion

* '''<hi>''' Highlight (if you need to highlight something and then specify how it appears in the source)
 
* '''<retrace>''' A letter or word retraced to clarify the intended word or letter
 
* '''<restore>''' Used to mark an earlier deletion that is undone
 
* '''@rend''' This is one of the global attributes in TEI, so it is allowed with a lot of elements. It indicates how the element in question was rendered or presented in the source. The allowed values are defined in the schema.
 
* '''@place''' Indicates the location of (e.g.) an addition (above, below, margin, etc) or the closer.
 
  
  
==Writing process and damage==
 
To encode different stages of changes, we use @seq, for example when Mondrian deleted a word and then added a new one.
 
<pre><restore seq="2"><del seq="1">cocktail</del></restore>
<del seq="2"><add seq="1">drink</add></del></pre>
 
 
  
 
A Sofortkorrektur is not embedded in a seg-element, because there is no need to show the different states of the text. If it is desirable to show the scope of an immediate correction by overwriting, we use add (with seq="0"):
 
We don't use @seq on retrace.  
 
  
When a text continues in the margin, that does not by itself make the margin text an addition (<add>). An addition is something added at a later stage. When Mondrian adds a sign to indicate where the text continues, we encode this sign as a metamark.
  
==Visual characteristics==
If you need multiple values (like underlined and superscript) just enter them separated by a space: <hi rend="underline super">.

If a paragraph is indented, use rend="indent" on its first line (<lb rend="indent">).
  
 
We use rend="blockletters" for block capitals.
 
 
are completely equivalent.  
 
  
==Transpositions==
 
 
Two (or more) pieces of text that have switched position are encoded using the md:transpose element. The transposed texts are written in their original order. The target attribute indicates where the text fragment is moved. Example:
 
 
<pre><md:transpose seq="1" xml:id="i1" target="#i2">development</md:transpose> <md:transpose seq="1" xml:id="i2" target="#i1">art</md:transpose></pre>
 
 
==Incorrect text==
 
An incorrect text can be encoded in <sic>. The corresponding correction is encoded in <corr>. Both elements go into <choice>, as in the example:
 
 
<pre><choice><sic type="grammar">Happely</sic><corr>Happily</corr></choice></pre>
 
 
==Unclear, illegible text==
 
If text is hard to read, either because it has been deleted or for another reason (bad handwriting), it is encoded as <unclear>. If, for instance, the word “removed” has been deleted, but we’re not sure about the last three letters, we encode:
 
 
<pre><del>remo<unclear>ved</unclear></del></pre>
 
 
When two readings are possible, we can use <choice> to group them. If the last letter in “free” could also be read as “i”, we encode that as:
 
 
<pre>Fre<choice><unclear>e</unclear><unclear>i</unclear></choice></pre>
 
 
But an <unclear> can also occur by itself:
 
 
<pre>A single <unclear>word</unclear> is hard to read.</pre>
 
 
If text is completely illegible and cannot be transcribed at all, the <gap> element is used. The size of the gap can be indicated using the unit and quantity attributes.
 
  
<pre>A single <gap quantity="1" unit="word"/> is illegible.</pre>
 
  
 
==Empty lines==
 

Revision as of 14:39, 26 April 2017


Writing process and damage

To encode different stages of changes, we use @seq, for example when Mondrian deleted a word and then added a new one. @seq assigns a sequence number reflecting the order in which the encoded features carrying this attribute are believed to have occurred.

<del seq="1">yellow</del>
<add seq="2" place="above">red</add>

You can also use seq="0" for immediate deletions (deletions made while writing, or Sofortkorrektur).

If a single act of modification requires multiple elements, these elements have the same seq attribute:

Ik <del seq="1">heb</del><add seq="1">lees</add> het boek<del seq="1"> gelezen</del>.

On the del element, we can use the rend attribute to indicate it has been overwritten, either by an add or in a Sofortkorrektur:

<del rend="overwritten">wel</del><add>niet</add>

A text fragment that has been modified is tagged as <seg> (segment), for the purpose of being able to display the multiple states of the text. It is up to the editor to choose meaningful segments. In the above example the sentence might be tagged as <seg>:

<seg>Ik <del seq="1">heb</del><add seq="1">lees</add> het
boek<del seq="1"> gelezen</del>.</seg>


If an earlier deletion is restored, we can encode this using <restore>. An example ('cocktail' is replaced by 'drink', which is then deleted while 'cocktail' is restored):

<restore seq="2"><del seq="1">cocktail</del></restore>
<del seq="2"><add seq="1">drink</add></del>

A Sofortkorrektur is not embedded in a seg-element, because there is no need to show the different states of the text. If it is desirable to show the scope of an immediate correction by overwriting, we use add (with seq="0"):

<del seq="0">F</del><add seq="0">V</add>ics

We don't use @seq on retrace.

When a text continues in the margin, that does not by itself make the margin text an addition (<add>).

Visual characteristics

If a paragraph is indented, use rend="indent" on its first line (<lb rend="indent">).
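
As a minimal sketch of the indentation rule, assuming (as in the address example in this guide) that each <lb/> precedes the line it starts; the paragraph text is invented for illustration:

```xml
<!-- Hypothetical paragraph: only the first line carries rend="indent" -->
<p><lb rend="indent"/>This paragraph starts with an indented line
<lb/>and continues flush left on the next line.</p>
```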

We use rend="blockletters" for block capitals.
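
For instance, a word written in block capitals might be tagged as follows. This is only a sketch: the sample words are invented, and transcribing the letters themselves in upper case is an assumption, not a rule stated here.

```xml
<!-- Hypothetical text; rend="blockletters" marks the block capitals -->
Please send it to <hi rend="blockletters">PARIS</hi> before Monday.
```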

There is technically no difference between using a rend-attribute on an existing element and using a hi-element with that rend-attribute within that existing element. So

<addrLine rend="underline">New York City</addrLine>

and

<addrLine><hi rend="underline">New York City</hi></addrLine>

are completely equivalent.


Empty lines

Encode using the <space> element, using dim="vertical" and the unit and quantity attributes to indicate the number of lines. For example:

<space dim="vertical" unit="lines" quantity="2"/>

Hyphenation and other dashes

<c type="wbh">-</c> is used to encode a hyphen that divides a word at the end of the line (and only if Mondrian uses it, not where he should have used it). Other kinds of hyphen are not encoded.

If Mondrian writes:

    normal-
ly

we encode (Don’t introduce whitespace!):

normal<c type="wbh">-</c><lb/>ly


If Mondrian writes:

    normal
ly

we encode:

normal<lb/>ly

If Mondrian writes:

     well-
known brands

we encode:

well-<lb/>known brands


The em dash (mdash) corresponds to Unicode code point U+2014. In Oxygen, it can be entered through the Symbol button (if needed, add the Symbol toolbar), or through the Edit menu, option Insert from Character Map.
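
If the character cannot be entered through the editor, a standard XML numeric character reference for the same code point is an alternative. A small sketch, with invented sample text:

```xml
<!-- &#x2014; is the numeric character reference for the em dash -->
<p>art&#x2014;and life</p>
```

A parser reads this as the dash character itself, so the result is equivalent to typing the dash directly.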

Notes in the text

Notes unrelated to the contents of the letter, possibly in another hand, we encode as <ab> (anonymous block). ‘Anonymous’ here refers not to the author being unknown, but to this being a block of text not identified as a paragraph, a list or another block-level element. We use the hand attribute to point to the probable writer and describe the hands in the document hands.xml.

For instance (this is an example of a changed address provided by an anonymous person):

<div type="envelope" xml:id="PD">
   <!-- envelope recto -->
   <pb n="envelope-r" xml:id="env-r" facs="#zone-env-r"/> 
   <div type="postalData">
      <md:postmark>Paris XIV … </md:postmark>
       <address type="receiver">
         …
       </address>
   </div>	
   <ab hand="hands.xml#anon">James Abbott // 20xx Newbold Eve // Bronx</ab>
…
</div>    

See annotations for notes as used to annotate the text.

Postscripts

A postscript is not necessarily indicated by P.S. A postscript is any text added as an afterthought after a letter has been signed. A postscript contains one or more paragraphs:

<postscript>
<p>Say hello to your mother.</p>
</postscript>

A letter can have multiple postscripts. Postscripts can be numbered (using the n-attribute) to indicate a logical sequence.
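
A sketch of two numbered postscripts; the text of the second one is invented for illustration:

```xml
<postscript n="1">
  <p>Say hello to your mother.</p>
</postscript>
<!-- Hypothetical second postscript, numbered to show the logical order -->
<postscript n="2">
  <p>And to your father.</p>
</postscript>
```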

Envelopes

Encode as <div type="envelope">. See SampleLetterWithEnvelope.xml. The addresses are encoded as divs with type="postalData". The address of the receiver goes on the front (recto) side of the envelope. Code the addresses as <address> with <addrLine>s. Give the <address> a type-attribute (receiver, sender). <addrLine>-elements are preceded by <lb>-elements if they begin on a new line. Short descriptive phrases (‘sent by’, ‘sender’, ‘To’) before the address go into <label>-elements.

An example of an address:

<address type="receiver">
    <lb/><addrLine rend="underline2">M<hi rend="super underline">r</hi>
        Harry Holtzman</addrLine>
    <lb/><addrLine>231 East 60<hi rend="super">th</hi> Street</addrLine>
    <lb/><addrLine rend="underline">New York City</addrLine>
</address>
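
A sender address introduced by a short phrase might look roughly like this. Placing the <label> directly before the <address> inside the postalData div is an assumption; check SampleLetterWithEnvelope.xml for the actual arrangement. The address lines are left elided.

```xml
<div type="postalData">
   <!-- Assumed placement: label immediately before the address it introduces -->
   <lb/><label>Sender</label>
   <address type="sender">
      <lb/><addrLine>…</addrLine>
   </address>
</div>
```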

See also