About XML

Revision as of 14:13, 18 January 2017

XML generalities

Some generic rules of XML:

• <starting tag>...</ending tag> (note the position of the / in the ending tag), e.g.

<address>Prins Willem-Alexanderhof 5</address>

• <tag without content/>, e.g.

<lb/> linebreak

• An element can have an attribute. An attribute has a value, written between straight quotation marks (" "):

 	<hi rend="super">st</hi>
	hi 	is the element (highlight)
	rend 	is the attribute
	super 	is the value of the attribute.

• @xml:id is an attribute that gives an element a unique identifier. To point to that element from elsewhere, put # in front of the identifier.

• In an XML document, linebreaks (including multiple linebreaks) don't matter. If you want a linebreak in the text, you have to use <lb/>.

The line is too			→	The line is too short
short				

The line <lb/> is too short	→ 	The line
					is too short.

• XML documents are whitespace sensitive (whitespace includes spaces, tabs and newlines), so any whitespace next to a tag ends up in the text:

normal<lb/>ly		→ 	normally
normal <lb/> ly 	→ 	normal  ly
normal<lb/>
ly 			→ 	normal  ly

• Any comment (if you want to take a note on something for you or for other transcribers) is encoded in this way:

<!--  This is a comment  -->
<!-- This is a strange sign that I cannot read, it's better to leave it now and come back to it in a week or two -->
<!-- This encoding is for hyphenation -->
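
The rules above can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample strings and variable names are invented for illustration and do not come from the project files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: an element with an attribute, an empty element
# and a comment, all inside one paragraph.
sample = '<p>1<hi rend="super">st</hi> line<lb/>second line<!-- a note --></p>'
root = ET.fromstring(sample)

# Element, attribute and value.
hi = root.find("hi")
print(hi.tag)          # the element name
print(hi.get("rend"))  # the value of the rend attribute
print(hi.text)         # the content between the starting and ending tag

# A tag without content has no text of its own; the text that follows
# it is stored as its "tail".
lb = root.find("lb")
print(lb.text)
print(lb.tail)

# Whitespace next to a tag is kept as part of the text.
tight = "".join(ET.fromstring("<w>normal<lb/>ly</w>").itertext())
loose = "".join(ET.fromstring("<w>normal <lb/> ly</w>").itertext())
print(repr(tight))
print(repr(loose))
```

With its default settings this parser drops comments, so the note in the sample does not show up in the parsed text.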

XML namespace and schema

Namespaces

Elements in XML documents can be placed in so-called namespaces, which make it possible to distinguish different XML vocabularies. The default namespace for elements in this document is http://www.tei-c.org/ns/1.0 (the TEI namespace). For new elements (normally borrowed from DALF), the namespace is http://mondrian.huygens.knaw.nl/. We refer to this namespace with the prefix 'md', for example <md:addressee>.

Prefix and namespace are defined on the root element of the XML document, as follows:

<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:md="http://mondrian.huygens.knaw.nl/">
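
To see what these declarations mean for software that reads the file, here is a small sketch with Python's standard xml.etree.ElementTree module. The document string is a made-up fragment: where <md:addressee> sits and what it contains are assumptions for illustration only, and the "tei" prefix in the lookup table is a local choice, not something defined by the project.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
MD_NS = "http://mondrian.huygens.knaw.nl/"

# Hypothetical fragment; only the root declarations are taken from the text above.
doc = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" '
    'xmlns:md="http://mondrian.huygens.knaw.nl/">'
    "<text><md:addressee>Somebody</md:addressee></text>"
    "</TEI>"
)
root = ET.fromstring(doc)

# The parser stores each element name together with its namespace.
print(root.tag)

# For lookups we supply our own prefix-to-namespace table.
ns = {"tei": TEI_NS, "md": MD_NS}
addressee = root.find("tei:text/md:addressee", ns)
print(addressee.tag)
print(addressee.text)
```

The prefixes are only shorthand. What identifies an element is the combination of namespace and local name, which is why the unprefixed <text> still has to be looked up in the TEI namespace.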

Schema

[This probably applies only to Mondriaan as written; still to be made general.] A schema has been defined that specifies which elements and attributes are allowed. The schema is written in the RELAX NG schema language and is named MD.rng. oXygen uses the schema to validate the file and to suggest allowed elements and attributes. The schema is stored in the documentation folder in the repository. Our documents refer to the schema as follows:

<?xml-model href="../documentation/MD.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>

(Read the href attribute as: one folder up (the two dots), then down into the documentation folder)

For the writings, we use a separate schema: MDwritings.rng.
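
How the relative href in the xml-model line is resolved can be sketched in Python as follows. The document path used here is hypothetical (the actual folder layout of the repository is not described above), and the regular expression is just a quick way to read the instruction for this illustration, not a general parser.

```python
import posixpath
import re

# The processing instruction as it appears at the top of a document.
pi = (
    '<?xml-model href="../documentation/MD.rng" type="application/xml" '
    'schematypens="http://relaxng.org/ns/structure/1.0"?>'
)

# Read the name="value" pairs of the instruction into a dictionary.
attrs = dict(re.findall(r'(\w+)="([^"]*)"', pi))
href = attrs["href"]

# Hypothetical location of a transcription file inside the repository.
document_path = "repository/letters/letter-001.xml"

# Start in the folder of the document, go one folder up (the two dots),
# then down into the documentation folder.
schema_path = posixpath.normpath(
    posixpath.join(posixpath.dirname(document_path), href)
)
print(schema_path)
```

The href is resolved relative to the folder of the document that contains it, so a file stored at a different depth in the repository would need a different number of ".." steps.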