Difference between revisions of "Text structure (ePistolarium)"

Latest revision as of 13:40, 9 January 2018

Basic structure

The documents in the ePistolarium have the following basic structure:

<text>
  <body>
    <div xml:id="div-1" type="artifact" subtype="letter">
      ...
    </div>
    <div xml:id="div-2" type="para">
      ...
    </div>
    <div xml:id="div-3" type="notes">
      ...
    </div>
  </body>
</text>

The text and body elements serve as containers for the actual text. The TEI Guidelines allow front and back elements as siblings of body, but the ePistolarium does not use them.

The type attribute of the div element is used to distinguish between various parts of the text, the most important ones being artifact for the 'real' letter text and notes for the text of (editorial) notes.
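
As an illustration of how the type attribute can be used to pick out the parts of a document, here is a minimal Python sketch using the standard library. The sample string, the placeholder paragraph contents, and the function name divs_by_type are assumptions made for this example. The sketch also assumes the elements carry no TEI namespace, as in the examples on this page; real files may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mirroring the basic structure; the paragraph
# contents are placeholders, not text from a real document.
SAMPLE = """<text>
  <body>
    <div xml:id="div-1" type="artifact" subtype="letter"><p>Letter text</p></div>
    <div xml:id="div-2" type="para"><p>Paratext</p></div>
    <div xml:id="div-3" type="notes"><p>Editorial note</p></div>
  </body>
</text>"""


def divs_by_type(root):
    """Map each top-level div's type attribute to the list of matching divs."""
    result = {}
    for div in root.findall("./body/div"):
        result.setdefault(div.get("type"), []).append(div)
    return result


root = ET.fromstring(SAMPLE)
parts = divs_by_type(root)
# Text of the 'real' letter, ignoring markup.
letter_text = "".join(parts["artifact"][0].itertext())
```

Grouping into lists rather than single elements is a cautious choice: the page does not say whether a type value can occur more than once per document.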


Subsections

A div element may contain other div elements. Such nested divisions are often given a head element.

Example: https://correspondence.huygens.knaw.nl/documents/920e5785-62d4-4108-b126-8caa1df370b8 (subdivisions)

<div type="artifact" subtype="letter">
  <div type="section">
    <head>Propositio 5.</head>
    <p><figure><graphic url="huyg003oeuv01ill19.gif"/></figure></p>
    <p>Si il y a tant de gravitez qu'on veut ...</p>
  </div>
  <div type="section">
    <head>Propositio 6.</head>
    <p><figure><graphic url="huyg003oeuv01ill20.gif"/></figure></p>
    <p>Eadem methodo probatur si ...</p>
  </div>
  ...
</div>
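
The nesting of div elements can be traversed recursively. The following Python sketch builds a simple outline of divisions and their headings. The function name outline and the shortened sample (figures left out) are assumptions for illustration, and the elements are again assumed to have no namespace.

```python
import xml.etree.ElementTree as ET

# Shortened version of the subsection example (figures omitted).
SAMPLE = """<div type="artifact" subtype="letter">
  <div type="section">
    <head>Propositio 5.</head>
    <p>Si il y a tant de gravitez qu'on veut ...</p>
  </div>
  <div type="section">
    <head>Propositio 6.</head>
    <p>Eadem methodo probatur si ...</p>
  </div>
</div>"""


def outline(div, depth=0):
    """Return (depth, type, heading) for a div and all divs nested in it."""
    head = div.find("head")  # direct child only; None if there is no heading
    heading = "".join(head.itertext()).strip() if head is not None else None
    rows = [(depth, div.get("type"), heading)]
    for child in div.findall("div"):
        rows.extend(outline(child, depth + 1))
    return rows


rows = outline(ET.fromstring(SAMPLE))
```

For the sample, rows holds one entry for the outer artifact division (without heading) and one entry per section, each with its Propositio heading.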


Headings

The head element is used to encode any type of heading, for example the title of a section or the heading of a list.

Example: see the description of subsections above.


Paragraphs

Paragraphs are encoded with the p element. In the ePistolarium the p element occurs only in div elements, and p elements may not be nested: a p element inside another p element is not allowed. This is much stricter than the TEI Guidelines. The restriction exists because some analysis methods used by the ePistolarium take the paragraph as their unit of analysis.
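
The two paragraph rules lend themselves to an automatic check. The Python sketch below is one possible reading, not the ePistolarium's own validation: it assumes that "only occurs in div elements" means that every p is a direct child of a div, and it assumes elements without a namespace. The function name and both sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def paragraph_violations(root):
    """Return one message per p element that breaks a paragraph rule."""
    problems = []

    def visit(elem, parent, inside_p):
        if elem.tag == "p":
            if inside_p:
                problems.append("p nested inside another p")
            elif parent is None or parent.tag != "div":
                # Assumption: a p must be a direct child of a div.
                problems.append("p is not a direct child of div")
        for child in elem:
            visit(child, elem, inside_p or elem.tag == "p")

    visit(root, None, False)
    return problems


# Hypothetical inputs: GOOD follows both rules, BAD breaks each rule once.
GOOD = '<div type="artifact"><p>One</p><p>Two <hi rend="superscript">e</hi></p></div>'
BAD = "<body><p>Stray</p><div><p>Outer <seg><p>Inner</p></seg></p></div></body>"

good_problems = paragraph_violations(ET.fromstring(GOOD))
bad_problems = paragraph_violations(ET.fromstring(BAD))
```

A nested p is reported only as nested, not also as misplaced, so that each offending element yields a single message.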


Segments

The seg element is used to encode text segments. In the ePistolarium it serves various purposes, including attaching language assignments to parts of a paragraph and encoding inline notes.

Example: https://correspondence.huygens.knaw.nl/documents/f687c50b-6765-4519-bc5f-833c499ae453 (language assignment)

<div xml:lang="fr" type="artifact" subtype="letter">
  <p><seg xml:lang="nl">Op mijn brief over uw zoon schrijft de hertog van Bouillon mij terug:</seg> ‘Je vous supplie d'asseurer Mr. Junius que je me gouverneray tousjours aveq son fils comme je doibs, sans luy tesmoigner jamais aucune rigueur, ni mauvais traictement, n'en prenant à tesmoin que Mr. Meteren, qui a veu comme j'y ay procedé et comme je ne me suis jamais pleu à parler de son affaire. Mais les discours de raillerie qu'en faisoyent les enemis et le murmure de la garnison m'ont obligé d'en user comme j'ay faict, remettant le tout à S. Ex.<hi rend="superscript">e</hi>.’</p>
  <p xml:lang="nl">Het is dus beter, dezen edelman niet te verbitteren door zijn gedrag openlijk af te keuren; er wordt al te veel gebabbeld door kwaadwillige menschen. 28 Jull. 1634.</p>
</div>

Note that in this example "Op mijn brief ..." is explicitly marked as Dutch, whereas the text "‘Je vous supplie d'asseurer ..." is implicitly French because the div element carries the attribute xml:lang="fr".
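
The inheritance of xml:lang described here can be made explicit in code. The Python sketch below walks an element tree and pairs each run of text with its effective language: the nearest xml:lang on the element itself or on an ancestor. The function name text_runs and the abbreviated sample are assumptions for illustration, and no TEI namespace is assumed.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Abbreviated version of the language assignment example.
SAMPLE = """<div xml:lang="fr" type="artifact" subtype="letter">
<p><seg xml:lang="nl">Op mijn brief ...</seg> Je vous supplie ...</p>
<p xml:lang="nl">Het is dus beter ...</p>
</div>"""


def text_runs(elem, inherited=None):
    """Yield (language, text) pairs, inheriting xml:lang from ancestors."""
    lang = elem.get(XML_LANG, inherited)
    if elem.text and elem.text.strip():
        yield lang, elem.text.strip()
    for child in elem:
        yield from text_runs(child, lang)
        # Tail text follows the child but belongs to this element,
        # so it gets this element's language, not the child's.
        if child.tail and child.tail.strip():
            yield lang, child.tail.strip()


runs = list(text_runs(ET.fromstring(SAMPLE)))
```

For the sample, the seg text comes out as nl, the remainder of the first paragraph as fr (inherited from the div), and the second paragraph as nl.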


Example: https://correspondence.huygens.knaw.nl/documents/3ec37404-3be4-4f98-8bdc-02cf100da36d (inline notes)

<div type="para">
  <p><seg type="note">Adres:</seg> (A Mon)sieur Monsieur Grotius à Paris.</p>
  <p><seg type="note">In dorso schreef Grotius:</seg> 6 Jan. 1629 N. Reigersberg.</p>
</div>
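
Inline notes encoded this way can be separated from the surrounding text. The Python sketch below splits each paragraph into its inline notes and the remaining text. The function name split_inline_notes is hypothetical, and the sketch only looks at seg elements that are direct children of the paragraph, which is an assumption based on this example.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div type="para">
<p><seg type="note">Adres:</seg> (A Mon)sieur Monsieur Grotius à Paris.</p>
<p><seg type="note">In dorso schreef Grotius:</seg> 6 Jan. 1629 N. Reigersberg.</p>
</div>"""


def split_inline_notes(p):
    """Split a paragraph into (list of inline notes, remaining text)."""
    notes = []
    rest = [p.text or ""]
    for child in p:
        if child.tag == "seg" and child.get("type") == "note":
            notes.append("".join(child.itertext()).strip())
        else:
            # Other inline markup (for example hi) stays in the text.
            rest.append("".join(child.itertext()))
        # Text after the child belongs to the paragraph either way.
        rest.append(child.tail or "")
    return notes, "".join(rest).strip()


pairs = [split_inline_notes(p) for p in ET.fromstring(SAMPLE).findall("p")]
```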