Changeset 3086:8ca262cc12d4 in livinglogic.python.xist

Timestamp:
12/28/07 17:24:59 (12 years ago)
Author:
Walter Doerwald <walter@…>
Branch:
default
Message:

Split the documentation into more files and move to the docs/ subdirectory. Finish howto update.

Location:
docs
Files:
4 added
2 moved

  • docs/Howto.xml

    r3083 r3086  
    4848<item><pyref module="ll.xist.xsc" class="Text"><class>Text</class></pyref> for text data;</item> 
    4949<item><pyref module="ll.xist.xsc" class="Frag"><class>Frag</class></pyref> for document fragments, 
    50 (a <pyref module="ll.xist.xsc" class="Frag"><class>Frag</class></pyref> object is simply a list 
    51 of nodes);</item> 
     50(a <class>Frag</class> object is simply a list of nodes);</item> 
    5251<item><pyref module="ll.xist.xsc" class="Comment"><class>Comment</class></pyref> for &xml; comments 
    5352(e.g. <markup>&lt;!-- the comment --&gt;</markup>);</item> 
     
    6564type. All the elements from different &xml; vocabularies known to &xist; are 
    6665defined in modules in the <pyref module="ll.xist.ns"><module>ll.xist.ns</module></pyref> 
    67 subpackage. (Of course it's possible to define additional namespaces for your 
     66subpackage. (Of course it's possible to define additional element classes for your 
    6867own &xml; vocabulary). The definition of &html; can be found in 
    6968<pyref module="ll.xist.ns.html"><module>ll.xist.ns.html</module></pyref> 
     
    113112node = html.div( 
    114113    "Hello world!", 
    115     { 
    116         "class_": "greeting", 
    117         "id": 42, 
    118         "title": "Greet the world" 
    119     } 
     114    dict(class_="greeting", id=42, title="Greet the world") 
    120115) 
    121116</prog> 
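The hunk above replaces a dict literal with a `dict(...)` call. As a plain-Python aside that needs no XIST at all, the sketch below checks that both spellings build the same mapping. The variable names are illustrative only and do not come from the howto.

```python
# Plain Python, no XIST required: both spellings build the same mapping.
attrs_literal = {
    "class_": "greeting",
    "id": 42,
    "title": "Greet the world",
}
attrs_call = dict(class_="greeting", id=42, title="Greet the world")

# The keyword form only works for keys that are valid Python identifiers,
# which is why the attribute is spelled "class_" rather than "class"
# ("class" is a reserved word).
print(attrs_literal == attrs_call)
```

The keyword form is shorter, but it cannot express keys that are not identifiers, so the literal form remains useful for such cases.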
     
    144139For this the module <pyref module="ll.xist.parsers"><module>ll.xist.parsers</module></pyref> 
    145140provides several functions:</par> 
    146 <prog> 
    147 def parsestring(data, base=None, encoding=None, **builderargs) 
    148 def parseiter(iterable, base=None, encoding=None, **builderargs) 
    149 def parsestream(stream, base=None, encoding=None, bufsize=8192, **builderargs) 
    150 def parsefile(filename, base=None, encoding=None, bufsize=8192, **builderargs) 
    151 def parseurl(name, base=None, encoding=None, bufsize=8192, headers=None, data=None, **builderargs) 
    152 def parseetree(tree, base=None, **builderargs) 
    153 </prog> 
     141<dlist> 
     142<term><lit>parsestring(data, base=None, encoding=None, **builderargs)</lit></term> 
     143<item>Parse the string <arg>data</arg> into an &xist; tree.</item> 
     144<term><lit>parseiter(iterable, base=None, encoding=None, **builderargs)</lit></term> 
     145<item>Parse the input from the iterable <arg>iterable</arg> (which must produce the 
     146input in chunks of bytes) into an &xist; tree.</item> 
     147<term><lit>parsestream(stream, base=None, encoding=None, bufsize=8192, **builderargs)</lit></term> 
     148<item>Parse &xml; from the stream <arg>stream</arg> into an &xist; tree.</item> 
     149<term><lit>parsefile(filename, base=None, encoding=None, bufsize=8192, **builderargs)</lit></term> 
     150<item>Parse &xml; input from the file named <arg>filename</arg>.</item> 
     151<term><lit>parseurl(name, base=None, encoding=None, bufsize=8192, headers=None, data=None, **builderargs)</lit></term> 
     152<item>Parse &xml; input from the &url; <arg>name</arg> into an &xist; tree.</item> 
     153<term><lit>parseetree(tree, base=None, **builderargs)</lit></term> 
     154<item>Parse &xml; input from the object <arg>tree</arg> which must support the 
     155<link href="http://effbot.org/zone/element-index.htm">ElementTree</link> &api;.</item> 
     156</dlist> 
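The description of `parseetree` above refers to the ElementTree API. The following minimal sketch uses only the standard library's `xml.etree.ElementTree` to build such a tree and read the members that API exposes (`tag`, `attrib`, `text`, `tail`, iteration over children). It deliberately does not call any XIST function; whether a standard-library `Element` is accepted unchanged by `parseetree` is an assumption, and the sample markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build a tree with the standard library's ElementTree implementation.
# The markup is an invented sample, not taken from the howto.
source = '<div class="greeting">Hello <em>world</em>!</div>'
root = ET.fromstring(source)

# Members that a consumer of the ElementTree API typically reads:
print(root.tag)      # element name
print(root.attrib)   # attributes as a dict
print(root.text)     # text before the first child
for child in root:
    # child.tail is the text that follows the child inside its parent
    print(child.tag, child.text, child.tail)
```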
    154157<par>For example, parsing a string can be done like this:</par> 
    155158<example><title>Parsing a string</title> 
     
    257260This converter is created once and is passed to all <method>convert</method> 
    258261calls. It is used to store parameters for the conversion process and it allows 
    259 elements to pass information to other nodes. You can also call 
     262<method>convert</method> methods to store additional information, so that it is 
     263available elsewhere during the conversion process. You can also call 
    260264<pyref module="ll.xist.xsc" class="Node" method="convert"><method>convert</method></pyref> 
    261265yourself, which would look like this:</par> 
     
    321325 
    322326<section><title>Attributes</title> 
    323 <par>Setting and accessing the attributes of an element works via 
    324 the dictionary interface:</par> 
     327<par>Setting and accessing the attributes of an element works either via 
     328a dictionary interface or by accessing the &xml; attributes as Python attributes 
     329of the element's <lit>attrs</lit> attribute:</par> 
    325330<example> 
    326331<tty> 
     
    328333<prompt>&gt;&gt;&gt; </prompt><input>print node[u"href"].bytes()</input> 
    329334href="http://www.python.org/" 
    330 <prompt>&gt;&gt;&gt; </prompt><input>del node[u"href"]</input> 
     335<prompt>&gt;&gt;&gt; </prompt><input>del node.attrs.href</input> 
    331336<prompt>&gt;&gt;&gt; </prompt><input>print node[u"href"].bytes()</input> 
    332337 
    333338<prompt>&gt;&gt;&gt; </prompt><input>node[u"href"] = u"http://www.python.org"</input> 
    334 <prompt>&gt;&gt;&gt; </prompt><input>print node[u"href"].bytes()</input> 
     339<prompt>&gt;&gt;&gt; </prompt><input>print node.attrs.href.bytes()</input> 
    335340href="http://www.python.org/" 
    336341</tty> 
     
    387392have a <lit>class</lit> attribute.</par> 
    388393 
    389 <par>Also &xml; attributes can be accessed as Python attributes of the 
    390 <lit>attrs</lit> object:</par> 
    391  
    392 <prog> 
    393 node = html.div(u"foo", class_="bar") 
    394 print node.attrs.class_ 
    395 </prog> 
    396  
    397394<section><title>Defining attributes</title> 
    398395 
     
    413410        node = xsc.Frag(self.content, u" is") 
    414411        if u"adj" in self.attrs: 
    415             node.append(u" ", html.em(self[u"adj"])) 
     412            node.append(u" ", html.em(self.attrs.adj)) 
    416413        node.append(u" cool!") 
    417414        return node.convert(converter) 
     
    475472</section> 
    476473 
    477 <section><title>Attribute value sets</title> 
     474<section><title>Allowed attribute values</title> 
    478475<par>It's possible to specify that an attribute has a fixed set of allowed 
    479476values. This can be done with the class attribute <lit>values</lit>. We could 
    480477extend our example to look like this:</par> 
    481478 
    482 <example><title>Defining attributes value sets</title> 
     479<example><title>Defining allowed attribute values</title> 
    483480<prog> 
    484481class cool(xsc.Element): 
     
    503500<tty> 
    504501<prompt>&gt;&gt;&gt; </prompt><input>s = '&lt;cool adj="pretty"&gt;&lt;python/&gt;&lt;/cool&gt;'</input> 
    505 <prompt>&gt;&gt;&gt; </prompt><input>node = parsers.parseString(s)</input> 
    506 /home/walter/pythonroot/ll/xist/xsc.py:1665: IllegalAttrValueWarning: Attribute value 'pretty' not allowed for __main__:cool.Attrs.adj. 
    507   warnings.warn(errors.IllegalAttrValueWarning(self)) 
     502<prompt>&gt;&gt;&gt; </prompt><input>node = parsers.parsestring(s)</input> 
     503/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/xsc.py:2006: IllegalAttrValueWarning: Attribute value u'pretty' not allowed for __main__:cool.Attrs.adj 
     504  warnings.warn(IllegalAttrValueWarning(self)) 
    508505</tty> 
    509506 
     
    516513<prompt>&gt;&gt;&gt; </prompt><input>node = cool(python(), adj=u"pretty")</input> 
    517514<prompt>&gt;&gt;&gt; </prompt><input>print node.bytes()</input> 
    518 /home/walter/pythonroot/ll/xist/xsc.py:1665: IllegalAttrValueWarning: Attribute value 'pretty' not allowed for __main__:cool.Attrs.adj. 
    519   warnings.warn(errors.IllegalAttrValueWarning(self)) 
     515/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/xsc.py:2006: IllegalAttrValueWarning: Attribute value u'pretty' not allowed for __main__:cool.Attrs.adj 
     516  warnings.warn(IllegalAttrValueWarning(self)) 
    520517&lt;cool adj="very"&gt;&lt;python /&gt;&lt;/cool&gt; 
    521518</tty> 
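The sessions above show illegal attribute values being reported through Python's warning framework rather than as exceptions. The sketch below shows, with the standard `warnings` module only, how such warnings can be collected or escalated to errors. `IllegalValueWarning`, `check_value` and the allowed values are hypothetical stand-ins written for this sketch; they are not XIST's classes or its actual value set.

```python
import warnings


class IllegalValueWarning(UserWarning):
    """Hypothetical stand-in for a library-specific warning class."""


def check_value(value, allowed):
    # Warn, but do not fail, when the value is outside the allowed set.
    if value not in allowed:
        warnings.warn(IllegalValueWarning("value %r not allowed" % value))
    return value


# Collect warnings in a list instead of printing them to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_value("pretty", ("very", "totally"))
messages = [str(w.message) for w in caught]

# Escalate the warning to an exception, e.g. inside a test suite.
with warnings.catch_warnings():
    warnings.simplefilter("error", IllegalValueWarning)
    try:
        check_value("pretty", ("very", "totally"))
        escalated = False
    except IllegalValueWarning:
        escalated = True

print(messages, escalated)
```

Filtering by warning class keeps unrelated warnings at their default behaviour while the chosen category becomes a hard failure.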
     
    616613<example><title>Parsing &xml;</title> 
    617614<prog><![CDATA[ 
    618 input = '<cool xmlns="http://xmlns.example.org/foo"><python/></cool>' 
    619  
    620 node = parsers.parsestring(input, pool=pool) 
     615s = '<cool xmlns="http://xmlns.example.org/foo"><python/></cool>' 
     616 
     617node = parsers.parsestring(s, pool=pool) 
    621618]]></prog> 
    622619</example> 
     
    627624<example><title>Parsing &xml; with predefined prefix mapping</title> 
    628625<prog><![CDATA[ 
    629 input = '<cool><python/></cool>' 
    630  
    631 node = parsers.parsestring(input, pool=pool, prefixes={None: "http://xmlns.example.org/foo"}) 
     626s = '<cool><python/></cool>' 
     627 
     628node = parsers.parsestring(s, pool=pool, prefixes={None: "http://xmlns.example.org/foo"}) 
    632629]]></prog> 
    633630</example> 
     
    648645 
    649646<section><title>Global attributes</title> 
    650 <par>You can define global attributes belonging to a certain namespace in the 
    651 same way as defining local attributes belonging to a certain element type: 
    652 Define a nested <class>Attrs</class> class inside the namespace class 
    653 (derived from <class>ll.xist.xsc.Namespace.Attrs</class>):</par> 
    654  
    655 <prog> 
    656 from ll.xist import xsc 
    657  
    658 class __ns__(xsc.Namespace): 
    659     xmlname = "foo" 
    660     xmlurl = "http://www.example.com/foo" 
    661  
    662     class Attrs(xsc.Namespace.Attrs): 
    663         class foo(xsc.TextAttr): pass 
    664 __ns__.makemod(vars()) 
    665 </prog> 
    666  
    667 <par>Setting and accessing such an attribute can be done like this:</par> 
     647<par>You can define global attributes belonging to a certain namespace by defining 
     648a global <class>Attrs</class> class and giving each attribute a namespace name 
     649via <lit>xmlns</lit>:</par> 
     650 
     651<prog> 
     652class Attrs(xsc.Attrs): 
     653    class foo(xsc.TextAttr): 
     654        xmlns = "http://www.example.com/foo" 
     655</prog> 
     656 
     657<par>To make this global attribute known to the parser, simply put 
     658the <class>Attrs</class> class into the pool used for parsing.</par> 
     659 
     660<par>Setting and accessing such an attribute can be done by using the 
     661attribute class instead of the attribute name like this:</par> 
    668662 
    669663<tty> 
    670664<prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    671 <prompt>&gt;&gt;&gt; </prompt><input>import foo</input> 
    672 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"foo", {(foo, u"foo"): u"bar")</input> 
    673 <prompt>&gt;&gt;&gt; </prompt><input>str(node[foo, u"foo"])</input> 
     665<prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"foo", {Attrs.foo: u"bar"})</input> 
     666<prompt>&gt;&gt;&gt; </prompt><input>str(node[Attrs.foo])</input> 
    674667'bar' 
    675668</tty> 
     
    680673<tty> 
    681674<prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    682 <prompt>&gt;&gt;&gt; </prompt><input>import foo</input> 
    683 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"foo", foo.Attrs(foo=u"baz"))</input> 
    684 <prompt>&gt;&gt;&gt; </prompt><input>str(node[foo, u"foo"])</input> 
     675<prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"foo", Attrs(foo=u"baz"))</input> 
     676<prompt>&gt;&gt;&gt; </prompt><input>str(node[Attrs.foo])</input> 
    685677'baz' 
    686678</tty> 
     
    688680</section> 
    689681 
    690 <section><title>Subclassing namespaces</title> 
    691 <par>Each element class that belongs to a namespace can access its 
    692 namespace via the class attribute <lit>__ns__</lit>. When you're subclassing 
    693 namespace classes, the elements in the base namespace will be automatically 
    694 subclassed too. Of course you can explicitly subclass an element class too. 
    695 The following example shows the usefulness of this feature. Define your base 
    696 namespace like this and put it into <filename>navns.py</filename>:</par> 
    697  
    698 <prog> 
    699 from ll.xist import xsc 
    700 from ll.xist.ns import html 
    701  
    702 languages = [ 
    703     (u"Python", u"http://www.python.org/"), 
    704     (u"Perl", u"http://www.perl.org/"), 
    705     (u"PHP", u"http://www.php.net/"), 
    706     (u"Java", u"http://java.sun.com/") 
    707 ] 
    708  
    709 class navigation(xsc.Element): 
    710     def convert(self, converter): 
    711         node = self.__ns__.links() 
    712         for (name, url) in languages: 
    713             node.append(self.__ns__.link(name, href=url)) 
    714         return node.convert(converter) 
    715  
    716 class links(xsc.Element): 
    717     def convert(self, converter): 
    718         node = self.content 
    719         return node.convert(converter) 
    720  
    721 class link(xsc.Element): 
    722     class Attrs(xsc.Element.Attrs): 
    723         class href(xsc.URLAttr): pass 
    724  
    725     def convert(self, converter): 
    726         node = html.div(html.a(self.content, href=self[u"href"])) 
    727         return node.convert(converter) 
    728  
    729 class __ns__(xsc.Namespace): 
    730     xmlname = "nav" 
    731     xmlurl = "http://www.example.com/nav" 
    732 __ns__.makemod(vars()) 
    733 </prog> 
    734  
    735 <par>This namespace defines a navigation element that generates <class>div</class>s 
    736 with links to various homepages for programming languages. We can use it like this:</par> 
    737 <tty> 
    738 <prompt>&gt;&gt;&gt; </prompt><input>import navns</input> 
    739 <prompt>&gt;&gt;&gt; </prompt><input>print navns.navigation().conv().bytes()</input> 
    740 &lt;div&gt;&lt;a href="http://www.python.org/"&gt;Python&lt;/a&gt;&lt;/div&gt; 
    741 &lt;div&gt;&lt;a href="http://www.perl.org/"&gt;Perl&lt;/a&gt;&lt;/div&gt; 
    742 &lt;div&gt;&lt;a href="http://www.php.net/"&gt;PHP&lt;/a&gt;&lt;/div&gt; 
    743 &lt;div&gt;&lt;a href="http://java.sun.com/"&gt;Java&lt;/a&gt;&lt;/div&gt; 
    744 </tty> 
    745 <par>(Of course the output will all be on one line.)</par> 
    746  
    747 <par>Now we can define a derived namespace (in the file <filename>nav2ns.py</filename>) 
    748 that overwrites the element classes <class>links</class> and <class>link</class> 
    749 to change how the navigation looks:</par> 
    750  
    751 <prog> 
    752 from ll.xist import xsc 
    753 from ll.xist.ns import html 
    754  
    755 import navns 
    756  
    757 class __ns__(navns): 
    758     class links(navns.links): 
    759         def convert(self, converter): 
    760             node = html.table( 
    761                 self.content, 
    762                 border=0, 
    763                 cellpadding=0, 
    764                 cellspacing=0, 
    765                 class_=u"navigation", 
    766             ) 
    767             return node.convert(converter) 
    768  
    769     class link(navns.link): 
    770         def convert(self, converter): 
    771             node = html.tr( 
    772                 html.td( 
    773                     html.a( 
    774                         self.content, 
    775                         href=self[u"href"], 
    776                     ) 
    777                 ) 
    778             ) 
    779             return node.convert(converter) 
    780 __ns__.makemod(vars()) 
    781 </prog> 
    782  
    783 <par>When we use the navigation element from the derived namespace we'll get 
    784 the following output:</par> 
    785  
    786 <tty> 
    787 <prompt>&gt;&gt;&gt; </prompt><input>import nav2ns</input> 
    788 <prompt>&gt;&gt;&gt; </prompt><input>print nav2ns.navigation().conv().bytes()</input> 
    789 &lt;table border="0" cellpadding="0" cellspacing="0" class="navigation"&gt; 
    790 &lt;tr&gt;&lt;td&gt;&lt;a href="http://www.python.org/"&gt;Python&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt; 
    791 &lt;tr&gt;&lt;td&gt;&lt;a href="http://www.perl.org/"&gt;Perl&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt; 
    792 &lt;tr&gt;&lt;td&gt;&lt;a href="http://www.php.net/"&gt;PHP&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt; 
    793 &lt;tr&gt;&lt;td&gt;&lt;a href="http://java.sun.com/"&gt;Java&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt; 
    794 &lt;/table&gt; 
    795 </tty> 
    796 <par>(again all on one line.)</par> 
    797  
    798 <par>Notice that we automatically got an element class <class>nav2ns.navigation</class>, 
    799 that this element class inherited the <method>convert</method> method from its 
    800 base class and that the call to <method>convert</method> on the derived class 
    801 did instantiate the link classes from the derived namespace.</par> 
    802  
    803 </section> 
    804  
    805 <section><title>Namespaces as conversion targets</title> 
    806  
    807 <par>The <arg>converter</arg> argument passed to the <method>convert</method> method 
    808 has an attribute <lit>target</lit> which is a namespace class and specifies the target 
    809 namespace to which <self/> should be converted.</par> 
    810  
    811 <par>You can check which conversion is wanted with <function>issubclass</function>. 
    812 Once this is determined you can use element classes from the target to create the 
    813 required &xml; object tree. This makes it possible to customize the conversion by 
    814 passing a derived namespace to the <method>convert</method> method. To demonstrate 
    815 this, we change our example namespace to use the conversion target like this:</par> 
    816  
    817 <prog> 
    818 import navns 
    819  
    820 class __ns__(navns): 
    821     class links(navns.links): 
    822         def convert(self, converter): 
    823             node = converter.target.table( 
    824                 self.content, 
    825                 border=0, 
    826                 cellpadding=0, 
    827                 cellspacing=0, 
    828                 class_=u"navigation", 
    829             ) 
    830             return node.convert(converter) 
    831  
    832     class link(navns.link): 
    833         def convert(self, converter): 
    834             target = converter.target 
    835             node = target.tr( 
    836                 target.td( 
    837                     target.a( 
    838                         self.content, 
    839                         href=self[u"href"], 
    840                     ) 
    841                 ) 
    842             ) 
    843             return node.convert(converter) 
    844 </prog> 
    845  
    846 <par>What we might want to do is have all links (i.e. all <class>ll.xist.ns.html.a</class> 
    847 elements) generated with an attribute <lit>target="_top"</lit>. For this we derive 
    848 a new namespace from <class>ll.xist.ns.html</class> and overwrite the <class>a</class> 
    849 element:</par> 
    850  
    851 <prog> 
    852 from ll.xist.ns import html 
    853  
    854 class __ns__(html): 
    855     class a(html.a): 
    856         def convert(self, converter): 
    857             node = html.a(self.content, self.attrs, target=u"_top") 
    858             return node.convert(converter) 
    859 </prog> 
    860  
    861 <par>Now we can pass this namespace as the conversion target and all links 
    862 will have a <lit>target="_top"</lit>.</par> 
    863 </section> 
    864  
    865 </section> 
    866  
    867  
    868 <section><title>Validation and content models</title> 
    869  
    870 <par>When generating &html; you might want to make sure that your generated 
    871 code doesn't contain any illegal tag nesting (i.e. something bad like 
    872 <markup>&lt;p&gt;&lt;p&gt;Foo&lt;/p&gt;&lt;/p&gt;</markup>). The module 
    873 <module>ll.xist.ns.html</module> does this automatically:</par> 
    874  
    875 <example> 
    876 <tty> 
    877 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    878 <prompt>&gt;&gt;&gt; </prompt><input>node = html.p(html.p(u"foo"))</input> 
    879 <prompt>&gt;&gt;&gt; </prompt><input>print node.bytes()</input> 
    880 /home/walter/pythonroot/ll/xist/sims.py:238: WrongElementWarning: element &lt;ll.xist.ns.html:p&gt; may not contain element &lt;ll.xist.ns.html:p&gt; 
    881   warnings.warn(WrongElementWarning(node, child, self.elements)) 
    882 &lt;p&gt;&lt;p&gt;foo&lt;/p&gt;&lt;/p&gt; 
    883 </tty> 
    884 </example> 
    885  
    886 <par>For your own elements you can specify the content model too. This is done 
    887 by setting the class attribute <lit>model</lit> inside the element class. 
    888 <lit>model</lit> must be an object that provides a <method>checkvalid</method> 
    889 method. This method will be called during parsing or publishing with the element 
    890 as an argument. When a validation violation is detected, the Python warning 
    891 framework should be used to issue a warning.</par> 
    892  
    893 <par>The module <module>ll.xist.sims</module> contains several classes that 
    894 provide simple validation methods: 
    895 <pyref module="ll.xist.sims" class="Empty"><class>Empty</class></pyref> 
    896 can be used to ensure that the element doesn't have any content (like 
    897 <markup>br</markup> and <markup>img</markup> in &html;). 
    898 <pyref module="ll.xist.sims" class="Any"><class>Any</class></pyref> 
    899 does allow any content. 
    900 <pyref module="ll.xist.sims" class="NoElements"><class>NoElements</class></pyref> 
    901 will warn about elements from the same namespace (elements from other namespaces 
    902 will be OK). 
    903 <pyref module="ll.xist.sims" class="NoElementsOrText"><class>NoElementsOrText</class></pyref> 
    904 will warn about elements from the same namespace and non-whitespace text content. 
    905 <pyref module="ll.xist.sims" class="Elements"><class>Elements</class></pyref> 
    906 will only allow the elements specified in the constructor. 
    907 <pyref module="ll.xist.sims" class="ElementsOrText"><class>ElementsOrText</class></pyref> 
    908 will only allow the elements specified in the constructor and text.</par> 
    909  
    910 <par>None of these classes will check the number of child elements or their 
    911 order.</par> 
    912  
    913 <par>For more info see the <pyref module="ll.xist.sims"><module>sims</module></pyref> 
    914 module.</par> 
    915 </section> 
    916  
    917  
    918682<section><title>Entities</title> 
    919683 
    920684<par>In the same way as defining new element types, you can define new entities. 
    921 But to be able to use the new entities in an &xml; file you have to use a parser 
    922 that supports reporting undefined entities to the application via 
    923 <method>skippedEntity</method> 
    924 (<pyref module="ll.xist.parsers" class="SGMLOPParser"><class>SGMLOPParser</class></pyref> 
    925 and <pyref module="ll.xist.parsers" class="ExpatParser"><class>ExpatParser</class></pyref> 
    926 in the module <pyref module="ll.xist.parsers"><module>ll.xist.parsers</module></pyref> 
    927 do that). The following example is from the module 
     685The following example is from the module 
    928686<pyref module="ll.xist.ns.abbr"><module>ll.xist.ns.abbr</module></pyref>:</par> 
    929687 
     
    986744(not &xml; compatible) format. For example <lit>ll.xist.ns.jsp.expression("foo")</lit> 
    987745will be published as <lit>&lt;%= foo %&gt;</lit>.</par> 
     746 
     747</section> 
    988748 
    989749</section> 
     
    1005765returns the complete 8-bit &xml; string.</par> 
    1006766 
     767<par>Writing a node to a file can be done with the method 
     768<pyref module="ll.xist.xsc" class="Node" method="write"><method>write</method></pyref>:</par> 
     769 
     770<tty> 
     771<prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
     772<prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"äöü", html.br(), u"ÄÖÜ")</input> 
     773<prompt>&gt;&gt;&gt; </prompt><input>node.write(open("foo.html", "wb"), encoding="ascii")</input> 
     774</tty> 
     775 
     776<par>All these methods use the method 
     777<pyref module="ll.xist.xsc" class="Node" method="publish"><method>publish</method></pyref> internally. 
     778<pyref module="ll.xist.xsc" class="Node" method="publish"><method>publish</method></pyref> gets passed 
     779an instance of <pyref module="ll.xist.publishers" class="Publisher"><class>ll.xist.publishers.Publisher</class></pyref>.</par> 
     780 
     781<section><title>Specifying an encoding</title> 
    1007782<par>You can specify the encoding with the parameter <arg>encoding</arg> 
    1008783(with the encoding specified in an &xml; declaration being the default, if there 
     
    1020795<prompt>&gt;&gt;&gt; </prompt><input>print node.bytes(encoding="iso-8859-1")</input> 
    1021796&lt;div&gt;Aä&amp;#937;&amp;#35486;&gt; 
    1022 <prompt>&gt;&gt;&gt; </prompt><input>print xsc.Comment(s).bytes()</input> 
     797<prompt>&gt;&gt;&gt; </prompt><input>print xsc.Comment(s).bytes(encoding="ascii")</input> 
    1023798Traceback (most recent call last): 
    1024   File "&lt;stdin&gt;", line 1, in ? 
    1025   File "/home/walter/pythonroot/ll/xist/xsc.py", line 600, in bytes 
    1026     publisher.publish(stream, self, base) 
    1027   File "/home/walter/pythonroot/ll/xist/publishers.py", line 205, in publish 
    1028     self.node.publish(self) 
    1029   File "/home/walter/pythonroot/ll/xist/xsc.py", line 1305, in publish 
    1030     publisher.write(self.content) 
    1031   File "/usr/local/lib/python2.3/codecs.py", line 178, in write 
    1032     data, consumed = self.encode(object, self.errors) 
    1033 UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-4: ordinal not in range(128) 
     799  File "&lt;stdin&gt;", line 1, in &lt;module&gt; 
     800  File "/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/xsc.py", line 934, in bytes 
     801    return "".join(self.iterbytes(base, publisher, **publishargs)) 
     802  File "/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/publishers.py", line 244, in publish 
     803    for part in self.node.publish(self): 
     804  File "/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/xsc.py", line 1798, in publish 
     805    yield publisher.encode(content) 
     806  File "/Users/walter/checkouts/LivingLogic.Python.xist/src/ll/xist/publishers.py", line 104, in encode 
     807    return self.encoder.encode(text) 
     808  File "/Users/walter/checkouts/LivingLogic.Python.core/src/ll/xml_codec.py", line 142, in encode 
     809    return self.encoder.encode(input, final) 
     810  File "/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/encodings/ascii.py", line 22, in encode 
     811    return codecs.ascii_encode(input, self.errors)[0] 
     812UnicodeEncodeError: 'ascii' codec can't encode characters in position 4-5: ordinal not in range(128) 
    1034813</tty> 
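The session above contrasts two outcomes: text content whose characters are missing from the target encoding is written as numeric character references, while a comment cannot use such references and fails with a `UnicodeEncodeError`. The sketch below reproduces both behaviours with Python's built-in codecs only. It is written for Python 3 and does not use XIST's publisher; the sample string is an assumption chosen to match the output shown.

```python
# Assumed sample text: "A", "a" with diaeresis, Greek capital omega,
# and one CJK character.
text = "A\u00e4\u03a9\u8a9e"

# With the "xmlcharrefreplace" error handler, characters missing from the
# target encoding become numeric character references.
as_latin1 = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(as_latin1)  # b'A\xe4&#937;&#35486;'

# With the default strict error handling the same text cannot be encoded
# as ASCII, which is the failure seen where no character reference is
# possible.
try:
    text.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True
print(failed)
```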
    1035814 
     
    1046825&lt;meta content="text/html; charset=iso-8859-15" http-equiv="Content-Type" /&gt; 
    1047826</tty> 
    1048  
     827</section> 
     828 
     829<section><title>&html; compatibility</title> 
    1049830<par>Another useful parameter is <arg>xhtml</arg>, 
    1050831it specifies whether you want pure &html; or &xhtml; as output:</par> 
     
    1063844<markup>&lt;br/&gt;</markup> or <markup>&lt;div/&gt;</markup>.</item> 
    1064845</dlist> 
    1065  
    1066 <par>Writing a node to a file can be done with the method 
    1067 <pyref module="ll.xist.xsc" class="Node" method="write"><method>write</method></pyref>:</par> 
    1068  
    1069 <tty> 
    1070 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1071 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(u"äöü", html.br(), u"ÄÖÜ")</input> 
    1072 <prompt>&gt;&gt;&gt; </prompt><input>node.write(open("foo.html", "wb"), encoding="ascii")</input> 
    1073 </tty> 
    1074  
    1075 <par>All these methods use the method 
    1076 <pyref module="ll.xist.xsc" class="Node" method="publish"><method>publish</method></pyref> internally. 
    1077 <pyref module="ll.xist.xsc" class="Node" method="publish"><method>publish</method></pyref> gets passed 
    1078 an instance of <pyref module="ll.xist.publishers" class="Publisher"><class>ll.xist.publisher.Publisher</class></pyref>.</par> 
    1079 </section> 
    1080  
    1081  
    1082 <section><title>Searching trees</title> 
    1083  
    1084 <par>There are two methods available for iterating through an &xml; tree and 
    1085 finding nodes in the tree: The <method>walk</method> method and XFind 
    1086 expressions.</par> 
    1087  
    1088 <section><title>The <method>walk</method> method</title> 
    1089 <par>The method <pyref module="ll.xist.xsc" class="Node" method="walk"><method>walk</method></pyref> 
    1090 is a generator. You pass a callable object to <method>walk</method> 
    1091 which is used for determining which part of the tree should be searched and 
    1092 which nodes should be returned.</par> 
    1093  
    1094 <par><module>ll.xist.xsc</module> provides several useful predefined classes for 
    1095 specifying what should be returned from <method>walk</method>: 
    1096 <pyref module="ll.xist.xsc" class="FindType"><class>FindType</class></pyref> 
    1097 will search only the first level of the tree and will return any node that is an 
    1098 instance of one of the classes passed to the constructor. So if you have an 
    1099 instance of <class>ll.xist.ns.html.ul</class> named <lit>node</lit> you could 
    1100 search for all <class>ll.xist.ns.html.li</class> elements inside with the 
    1101 following code:</par> 
    1102  
    1103 <example><title>Searching for <class>li</class> inside <class>ul</class> with <method>walk</method></title> 
    1104 <prog> 
    1105 for cursor in node.content.walk(xsc.FindType(html.li)): 
    1106     print unicode(cursor.node) 
    1107 </prog> 
    1108 </example> 
    1109  
    1110 <par><pyref module="ll.xist.xsc" class="FindTypeAll"><class>FindTypeAll</class></pyref> 
    1111 can be used when you want to search the complete tree. The following example 
    1112 extracts all the links on the 
    1113 <link href="http://www.python.org/">Python home page</link>:</par> 
    1114  
    1115 <example><title>Finding all links on the Python home page</title> 
    1116 <prog> 
    1117 from ll.xist import xsc, parsers 
     846</section> 
     847 
     848<section><title>Namespaces</title> 
     849 
     850<par>By default &xist; doesn't output any namespace declarations. The simplest 
     851way to change that is to pass <lit>True</lit> for the <arg>prefixdefault</arg> 
     852argument when publishing:</par> 
     853 
     854<example><title>Publishing namespace info</title> 
     855<prog> 
    1118856from ll.xist.ns import html 
    1119857 
    1120 node = parsers.parseURL("http://www.python.org/", tidy=True) 
    1121  
    1122 for cursor in node.walk(xsc.FindTypeAll(html.a)): 
    1123     print cursor.node[u"href"] 
    1124 </prog> 
    1125 </example> 
    1126  
    1127 <par>This gives the output:</par> 
    1128  
    1129 <tty> 
    1130 http://www.python.org/ 
    1131 http://www.python.org/search/ 
    1132 http://www.python.org/download/ 
    1133 http://www.python.org/doc/ 
    1134 http://www.python.org/Help.html 
    1135 http://www.python.org/dev/ 
    1136 <rep>...</rep> 
    1137 </tty> 
    1138  
    1139 <par>The following example will find all external links on the Python home 
    1140 page:</par> 
    1141  
    1142 <example><title>Finding external links on the Python home page</title> 
    1143 <prog> 
    1144 from ll.xist import xsc, parsers 
    1145 from ll.xist.ns import html 
    1146  
    1147 node = parsers.parseURL("http://www.python.org/", tidy=True) 
    1148  
    1149 def isextlink(cursor): 
    1150     if isinstance(cursor.node, html.a) and not unicode(cursor.node[u"href"]).startswith(u"http://www.python.org"): 
    1151         return (True, xsc.entercontent) 
    1152     return (xsc.entercontent,) 
    1153  
    1154 for cursor in node.walk(isextlink): 
    1155     print cursor.node[u"href"] 
    1156 </prog> 
    1157 </example> 
    1158  
    1159 <par>This gives the output:</par> 
    1160  
    1161 <tty> 
    1162 http://www.jython.org/ 
    1163 http://sourceforge.net/tracker/?atid=105470&amp;group%5fid=5470 
    1164 http://sourceforge.net/tracker/?atid=305470&amp;group%5fid=5470 
    1165 http://sourceforge.net/cvs/?group%5fid=5470 
    1166 http://www.python-in-business.org/ 
    1167 http://www.europython.org/ 
    1168 mailto:webmaster@python.org 
    1169 <rep>...</rep> 
    1170 </tty> 
    1171  
    1172 <par>The callable (<function>isextlink</function> in the example) will be called 
    1173 for each node visited. The <arg>cursor</arg> argument has an attribute <lit>node</lit> 
    1174 that is the node in question. For the other attributes see the 
    1175 <pyref module="ll.xist.xsc" class="Cursor"><class>Cursor</class> class</pyref>.</par> 
    1176  
    1177 <par>The callable must return a sequence with the following entries:</par> 
    1178  
    1179 <dlist> 
    1180 <term><lit>ll.xist.xsc.entercontent</lit></term><item>enter the content of this 
    1181 element and continue searching;</item> 
    1182 <term><lit>ll.xist.xsc.enterattrs</lit></term><item>enter the attributes of this 
    1183 element and continue searching;</item> 
    1184 <term>boolean value</term><item>If true, the node will be part of the result.</item> 
    1185 </dlist> 
    1186  
    1187 <par>The sequence will be <z>executed</z> in the order you specify. To change 
    1188 the top down traversal from our example to a bottom up traversal we could change 
    1189 <function>isextlink</function> to the following (note the swapped tuple entries):</par> 
    1190  
    1191 <example><title>Bottom up link traversal function</title> 
    1192 <prog> 
    1193 def isextlink(cursor): 
    1194     if isinstance(cursor.node, html.a) and not unicode(cursor.node[u"href"]).startswith(u"http://www.python.org"): 
    1195         return <em>(xsc.entercontent, True)</em> 
    1196     return (xsc.entercontent,) 
    1197 </prog> 
    1198 </example> 
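<par>The effect of the order of the tuple entries can be illustrated with a small
toy model. This is only a sketch of the protocol described above, not the actual
implementation of <method>walk</method>: the names <lit>ToyNode</lit>,
<lit>ENTERCONTENT</lit>, <lit>top_down</lit> and <lit>bottom_up</lit> are
hypothetical, the filter receives the node directly instead of a cursor, and the
code is written for Python 3.</par>

```python
# Toy model of the traversal protocol; NOT XIST's actual implementation.
ENTERCONTENT = object()  # hypothetical stand-in for xsc.entercontent


class ToyNode:
    def __init__(self, name, *children):
        self.name = name
        self.children = children


def walk(node, filter_):
    """Yield nodes, executing the filter's result sequence left to right."""
    for step in filter_(node):
        if step is ENTERCONTENT:
            # Descend into the children at this point of the sequence.
            for child in node.children:
                yield from walk(child, filter_)
        elif step:
            # A true value means: the node itself is part of the result.
            yield node


def top_down(node):
    return (True, ENTERCONTENT)   # node first, then its content


def bottom_up(node):
    return (ENTERCONTENT, True)   # content first, then the node


tree = ToyNode("a", ToyNode("b", ToyNode("c")), ToyNode("d"))
top_down_names = [n.name for n in walk(tree, top_down)]
bottom_up_names = [n.name for n in walk(tree, bottom_up)]
print(top_down_names)   # ['a', 'b', 'c', 'd']
print(bottom_up_names)  # ['c', 'b', 'd', 'a']
```

<par>Swapping the two entries is the only difference between the two filters,
which mirrors the change made to <function>isextlink</function>.</par>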
    1199  
    1200 <par>Note that the cursor yielded from <method>walk</method> will be reused by 
    1201 subsequent <method>next</method> calls, so you should not modify the cursor and 
    1202 you can't rely on attributes of the cursor after reentry to 
    1203 <method>walk</method>.</par> 
    1204  
    1205 </section> 
    1206  
    1207 <section><title>XFind expressions</title> 
    1208  
    1209 <par>A second method exists for iterating through a tree: XFind expressions. 
    1210 An XFind expression looks somewhat like an XPath expression, but is implemented 
    1211 as a pure Python expression (overloading the division operators).</par> 
    1212  
    1213 <par>Our example from above that searched for <class>li</class>s inside 
    1214 <class>ul</class>s can be rewritten as follows:</par> 
    1215  
    1216 <example><title>Searching for <class>li</class> inside <class>ul</class> with an XFind expression</title> 
    1217 <prog> 
    1218 for child in node/html.li: 
    1219     print unicode(child) 
    1220 </prog> 
    1221 </example> 
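<par>How a path-like syntax can be built from operator overloading may be easier
to see in a toy model. The following is a minimal sketch, not the &xist;
implementation: <lit>ToyNode</lit> and <lit>Path</lit> are hypothetical classes,
nodes are matched by a plain string instead of a class, and the code assumes
Python 3, where the division operator is implemented by
<method>__truediv__</method> rather than <method>__div__</method>.</par>

```python
# Toy model of path expressions built with "/"; NOT XIST's implementation.
class ToyNode:
    def __init__(self, kind, *children):
        self.kind = kind
        self.children = children

    def __truediv__(self, kind):
        # node / kind starts a path from a single node
        return Path([self]) / kind


class Path:
    """An iterable of nodes that can be extended with another "/" step."""

    def __init__(self, nodes):
        self.nodes = nodes

    def __iter__(self):
        return iter(self.nodes)

    def __truediv__(self, kind):
        # Lazily produce the children of the given kind for every node so far.
        return Path(
            child
            for node in self.nodes
            for child in node.children
            if child.kind == kind
        )


table = ToyNode(
    "table",
    ToyNode("tr", ToyNode("td"), ToyNode("td")),
    ToyNode("tr", ToyNode("td")),
)
cells = list(table / "tr" / "td")
print(len(cells))  # 3
```

<par>Because each step returns another iterable that supports the same operator,
steps can be chained to any depth.</par>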
    1222  
    1223 <par>An XFind expression returns an iterator for certain parts of the &xml; tree. 
    1224 In an XFind expression <lit><rep>a</rep>/<rep>b</rep></lit>, 
    1225 <lit><rep>a</rep></lit> must be either a node or an iterator producing nodes 
    1226 (note that an XFind expression itself is such an iterator, so 
    1227 <lit><rep>a</rep></lit> itself might be an XFind expression). 
    1228 <lit><rep>b</rep></lit> must be an XFind operator.</par> 
    1229  
    1230 <par>Every subclass of 
    1231 <pyref module="ll.xist.xsc" class="Node"><class>ll.xist.xsc.Node</class></pyref> 
    1232 is an XFind operator. If <lit><rep>b</rep></lit> is such a subclass, 
    1233 <lit><rep>a</rep>/<rep>b</rep></lit> will produce all child nodes of the nodes 
    1234 from <lit><rep>a</rep></lit> that are instances of <lit><rep>b</rep></lit>. 
    1235 If <lit><rep>b</rep></lit> is an attribute class, you will get attribute nodes 
    1236 instead of child nodes. Other XFind operators can be found in the module 
    1237 <pyref module="ll.xist.xfind"><module>ll.xist.xfind</module></pyref>. The 
    1238 <lit>all</lit> operator will produce every node in the tree (except for 
    1239 attributes):</par> 
    1240  
    1241 <prog> 
    1242 from ll.xist import xfind 
    1243 from ll.xist.ns import html 
    1244  
    1245 node = html.div( 
    1246     html.div( 
    1247         html.div(id=3), 
    1248         html.div(id=4), 
    1249         id=2, 
    1250     ), 
    1251     html.div( 
    1252         html.div(id=6), 
    1253         html.div(id=7), 
    1254         id=5, 
    1255     ), 
    1256     id=1 
    1257 ) 
    1258  
    1259 for child in node/xfind.all: 
    1260     print child["id"] 
    1261 </prog> 
    1262  
    1263 <par>The output of this is:</par> 
    1264  
    1265 <tty> 
    1266 1 
    1267 2 
    1268 3 
    1269 4 
    1270 5 
    1271 6 
    1272 7 
    1273 </tty> 
    1274  
    1275 <par>The following example demonstrates how to find all links on the Python 
    1276 homepage via an XFind expression:</par> 
    1277  
    1278 <prog> 
    1279 from ll.xist import xfind, parsers 
    1280 from ll.xist.ns import html 
    1281  
    1282 node = parsers.parseURL("http://www.python.org/", tidy=True) 
    1283 for link in node/xfind.all/html.a: 
    1284     print link["href"] 
    1285 </prog> 
    1286  
    1287 <par>An <lit>all</lit> operator in the middle of an XFind expression can be 
    1288 abbreviated. The XFind expression from the last example 
    1289 (<lit>node/xfind.all/html.a</lit>) can be rewritten like this: 
    1290 <lit>node//html.a</lit>.</par> 
    1291  
    1292 <par>Another XFind operator is 
    1293 <pyref module="ll.xist.xfind" class="contains"><class>contains</class></pyref>. 
    1294 It acts as a filter, i.e. the nodes produced by 
    1295 <lit><rep>a</rep>/xfind.contains(<rep>b</rep>)</lit> are a subset of the nodes 
    1296 produced by <lit><rep>a</rep></lit>, those that contain child nodes of type 
    1297 <lit><rep>b</rep></lit>. Searching for all links on the Python home page that contain 
    1298 images can be done like this:</par> 
    1299  
    1300 <tty> 
    1301 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist import xfind, parsers</input> 
    1302 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1303 <prompt>&gt;&gt;&gt; </prompt><input>node = parsers.parseURL("http://www.python.org/", tidy=True)</input> 
    1304 <prompt>&gt;&gt;&gt; </prompt><input>for link in node//html.a/xfind.contains(html.img):</input> 
    1305 <prompt>... </prompt><input>    print link["href"]</input> 
    1306 <prompt>... </prompt><input></input> 
    1307 http://www.python.org/ 
    1308 http://www.python.org/psf/donations.html 
    1309 http://www.opensource.org/ 
    1310 </tty> 
    1311  
    1312 <par>Note that using the <lit>all</lit> operator twice in an XFind expression 
    1313 currently won't give you the expected result, as nodes might be produced twice.</par> 
    1314  
    1315 <par>Calling <method>__getitem__</method> on an XFind operator gives you an 
    1316 item operator. Such an item operator only returns a specific item (or slice) of 
    1317 those nodes returned by the base iterator. An example:</par> 
    1318  
    1319 <prog> 
    1320 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1321 <prompt>&gt;&gt;&gt; </prompt><input>e = html.table(html.tr(html.td(j) for j in xrange(i, i+3)) for i in xrange(1, 10, 3))</input> 
    1322 <prompt>&gt;&gt;&gt; </prompt><input>print e.pretty().bytes()</input> 
    1323 &lt;table&gt; 
    1324     &lt;tr&gt; 
    1325         &lt;td&gt;1&lt;/td&gt; 
    1326         &lt;td&gt;2&lt;/td&gt; 
    1327         &lt;td&gt;3&lt;/td&gt; 
    1328     &lt;/tr&gt; 
    1329     &lt;tr&gt; 
    1330         &lt;td&gt;4&lt;/td&gt; 
    1331         &lt;td&gt;5&lt;/td&gt; 
    1332         &lt;td&gt;6&lt;/td&gt; 
    1333     &lt;/tr&gt; 
    1334     &lt;tr&gt; 
    1335         &lt;td&gt;7&lt;/td&gt; 
    1336         &lt;td&gt;8&lt;/td&gt; 
    1337         &lt;td&gt;9&lt;/td&gt; 
    1338     &lt;/tr&gt; 
    1339 &lt;/table&gt; 
    1340 <prompt>&gt;&gt;&gt; </prompt><input># Every cell</input> 
    1341 <prompt>&gt;&gt;&gt; </prompt><input>for td in e/html.tr/html.td:</input> 
    1342 <prompt>... </prompt><input>    print td</input> 
    1343 <prompt>... </prompt><input></input> 
    1344 1 
    1345 2 
    1346 3 
    1347 4 
    1348 5 
    1349 6 
    1350 7 
    1351 8 
    1352 9 
    1353 <prompt>&gt;&gt;&gt; </prompt><input># Every first cell in each row</input> 
    1354 <prompt>&gt;&gt;&gt; </prompt><input>for td in e/html.tr/html.td[0]:</input> 
    1355 <prompt>... </prompt><input>    print td</input> 
    1356 1 
    1357 4 
    1358 7 
    1359 <prompt>&gt;&gt;&gt; </prompt><input># Every cell in the first row</input> 
    1360 <prompt>&gt;&gt;&gt; </prompt><input>for td in e/html.tr[0]/html.td:</input> 
    1361 <prompt>... </prompt><input>    print td</input> 
    1362 1 
    1363 2 
    1364 3 
    1365 <prompt>&gt;&gt;&gt; </prompt><input># The first of all cells</input> 
    1366 <prompt>&gt;&gt;&gt; </prompt><input>for td in e/(html.tr/html.td)[0]:</input> 
    1367 <prompt>... </prompt><input>    print td</input> 
    1368 1 
    1369 </prog> 
    1370  
    1371 </section> 
    1372  
    1373 </section> 
    1374  
    1375  
    1376 <section><title>Manipulating trees</title> 
    1377 <par>&xist; provides many methods for manipulating an &xml; tree.</par> 
    1378  
    1379 <par>The method <pyref module="ll.xist.xsc" class="Frag" method="withsep"><method>withsep</method></pyref> 
    1380 can be used to put a separator node between the child nodes of an 
    1381 <pyref module="ll.xist.xsc" class="Element"><class>Element</class></pyref> 
    1382 or a <pyref module="ll.xist.xsc" class="Frag"><class>Frag</class></pyref>:</par> 
    1383  
    1384 <tty> 
    1385 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist import xsc</input> 
    1386 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1387 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(*xrange(10))</input> 
    1388 <prompt>&gt;&gt;&gt; </prompt><input>print node.withsep(", ").bytes()</input> 
    1389 &lt;div&gt;0, 1, 2, 3, 4, 5, 6, 7, 8, 9&lt;/div&gt; 
    1390 </tty> 
    1391  
    1392 <par>The method <pyref module="ll.xist.xsc" class="Frag" method="shuffled"><method>shuffled</method></pyref> 
    1393 returns a shuffled version of the <pyref module="ll.xist.xsc" class="Element"><class>Element</class></pyref> 
    1394 or <pyref module="ll.xist.xsc" class="Frag"><class>Frag</class></pyref>:</par> 
    1395  
    1396 <tty> 
    1397 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist import xsc</input> 
    1398 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1399 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(*xrange(10))</input> 
    1400 <prompt>&gt;&gt;&gt; </prompt><input>print node.shuffled().withsep(", ").bytes()</input> 
    1401 &lt;div&gt;8, 1, 3, 6, 7, 5, 2, 9, 4, 0&lt;/div&gt; 
    1402 </tty> 
    1403  
    1404 <par>There are methods named <pyref module="ll.xist.xsc" class="Frag" method="reversed"><method>reversed</method></pyref> 
    1405 and <pyref module="ll.xist.xsc" class="Frag" method="sorted"><method>sorted</method></pyref> that 
    1406 return a reversed or sorted version of an element or fragment:</par> 
    1407  
    1408 <tty> 
    1409 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist import xsc</input> 
    1410 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1411 <prompt>&gt;&gt;&gt; </prompt><input>def key(n):</input> 
    1412 <prompt>... </prompt><input>   return unicode(n)</input> 
    1413 <prompt>&gt;&gt;&gt; </prompt><input>node = html.div(8,4,2,1,9,6,3,0,7,5)</input> 
    1414 <prompt>&gt;&gt;&gt; </prompt><input>print node.sorted(key=key).reversed().withsep(",").bytes()</input> 
    1415 &lt;div&gt;9,8,7,6,5,4,3,2,1,0&lt;/div&gt; 
    1416 </tty> 
    1417  
    1418 <par>The method <pyref module="ll.xist.xsc" class="Node" method="mapped"><method>mapped</method></pyref> 
    1419 recursively walks the tree and generates a new tree, where all the nodes are mapped 
    1420 through a function. An example: To replace <lit>Python</lit> with <lit>Parrot</lit> 
    1421 in every text node on the <link href="http://www.python.org/">Python page</link>, do the following:</par> 
    1422  
    1423 <prog> 
    1424 from ll.xist import xsc, parsers 
    1425  
    1426 def p2p(node, converter): 
    1427     if isinstance(node, xsc.Text): 
    1428         node = node.replace(u"Python", u"Parrot") 
    1429         node = node.replace(u"python", u"parrot") 
    1430     return node 
    1431  
    1432 node = parsers.parseURL("http://www.python.org/", tidy=True) 
    1433 node = node.mapped(p2p) 
    1434 node.write(open("parrot_index.html", "wb")) 
    1435 </prog> 
    1436  
    1437 <par>The function must either return a new node, in which case this 
    1438 new node will be used instead of the old one, or return the 
    1439 old node to tell <pyref module="ll.xist.xsc" class="Node" method="mapped"><method>mapped</method></pyref> 
    1440 that it should recursively continue with the content of the node.</par> 
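<par>The contract between <method>mapped</method> and the mapping function can
be sketched with a small toy tree. This is an illustration under simplifying
assumptions, not the library code: <lit>Text</lit>, <lit>Element</lit> and the
module-level <function>mapped</function> below are hypothetical stand-ins, the
mapping function takes no <arg>converter</arg> argument, and the code targets
Python 3.</par>

```python
# Toy model of a "mapped" traversal; NOT XIST's actual implementation.
class Text(str):
    pass


class Element:
    def __init__(self, name, *content):
        self.name = name
        self.content = list(content)


def mapped(node, function):
    result = function(node)
    if result is not node:
        # The function returned a new node: use it and do not recurse.
        return result
    if isinstance(node, Element):
        # The function returned the old node: continue with its content.
        children = (mapped(child, function) for child in node.content)
        return Element(node.name, *children)
    return node


def p2p(node):
    if isinstance(node, Text):
        return Text(node.replace("Python", "Parrot"))
    return node


tree = Element("p", Text("Python "), Element("em", Text("rocks")))
new_tree = mapped(tree, p2p)
print(new_tree.content[0])  # Parrot 
print(tree.content[0])      # Python 
```

<par>The original tree is left untouched; a new tree is built from the results
of the mapping function.</par>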
    1441 </section> 
    1442  
    1443 <section><title>&url;s</title> 
    1444  
    1445 <par>For &url; handling &xist; uses the module 
    1446 <pyref module="ll.url"><module>ll.url</module></pyref>. Refer to its documentation 
    1447 for the basic functionality (especially regarding the methods 
    1448 <pyref module="ll.url" class="URL" method="__div__"><method>__div__</method></pyref> 
    1449 and <pyref module="ll.url" class="URL" method="relative"><method>relative</method></pyref>).</par> 
    1450  
    1451 <par>When &xist; parses an &xml; resource it uses a so-called <z>base</z> &url;. 
    1452 This base &url; can be passed to all parsing functions. If it isn't specified 
    1453 it defaults to the &url; of the resource being parsed. This base &url; will 
    1454 be prepended to all &url;s that are read during parsing:</par> 
    1455 <tty> 
    1456 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist import parsers</input> 
    1457 <prompt>&gt;&gt;&gt; </prompt><input>from ll.xist.ns import html</input> 
    1458 <prompt>&gt;&gt;&gt; </prompt><input>node = parsers.parseString('&lt;img src="eggs.png"/&gt;', base="root:spam/index.html")</input> 
    1459 <prompt>&gt;&gt;&gt; </prompt><input>print node.bytes()</input> 
    1460 &lt;img src="root:spam/eggs.png" /&gt; 
    1461 </tty> 
    1462  
    1463 <par>For publishing a base &url; can be specified too. &url;s will be published 
    1464 relative to this base &url; with the exception of relative &url;s in the tree. 
    1465 This means:</par> 
    1466 <ulist> 
    1467 <item>When you have a relative &url; (e.g. <lit>#top</lit>) generated by a <method>convert</method> 
    1468 call, this &url; will stay the same when publishing.</item> 
    1469 <item>Base &url;s for parsing should never be relative: Relative base 
    1470 &url;s will be prepended to all relative &url;s in the file, but this will not be 
    1471 reverted for publishing. In most cases the base &url; should be a 
    1472 <lit>root</lit> &url; when you parse local files.</item> 
    1473 <item>When you parse remote web pages you can either 
    1474 omit the <arg>base</arg> argument, in which case it defaults to the 
    1475 &url; being parsed and links, images, etc. on the page 
    1476 will still point back to their original location, or you 
    1477 can use the empty &url; <lit>URL()</lit> as the 
    1478 base, so you'll get all &url;s in the page as they are.</item> 
    1479 <item><par>When &xist; is used as a compiler for static pages, you're 
    1480 going to read source &xml; files, do a conversion and write the 
    1481 result to a new target file. In this case you should probably 
    1482 use the &url; of the target file for both parsing and 
    1483 publishing. Let's assume we have an &url; <lit>#top</lit> 
    1484 in the source file. When we use the <z>real</z> file names 
    1485 for parsing and publishing like this:</par> 
    1486 <prog> 
    1487 node = parsers.parseFile("spam.htmlxsc", base="root:spam.htmlxsc") 
    1488 node = node.conv() 
    1489 node.write(open("spam.html", "wb"), base="root:spam.html") 
    1490 </prog> 
    1491 <par>the following will happen: The &url; <lit>#top</lit> 
    1492 will be parsed as <lit>root:spam.htmlxsc#top</lit>. After 
    1493 conversion this will be written to <filename>spam.html</filename> 
    1494 relative to the &url; <lit>root:spam.html</lit>, which results 
    1495 in <lit>spam.html#top</lit>, which works, but is not what you 
    1496 want.</par> 
    1497 <par>When you use <lit>root:spam.html</lit> both for parsing 
    1498 and publishing, <lit>#top</lit> will be written to the target file 
    1499 as expected.</par></item> 
    1500 </ulist> 
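<par>The way a base &url; turns relative &url;s into absolute ones during
parsing can be tried out with the Python standard library. This sketch uses
<function>urljoin</function> from <module>urllib.parse</module> (Python 3)
instead of <module>ll.url</module>, and a made-up <lit>http</lit> base &url;,
because the standard library does not know the <lit>root</lit> scheme. It only
illustrates the resolution step, not what &xist; does internally.</par>

```python
from urllib.parse import urljoin

# Hypothetical base URL, chosen only for illustration.
base = "http://www.example.org/spam/index.html"

# A relative file name is resolved against the directory of the base URL.
resolved = urljoin(base, "eggs.png")

# A fragment-only URL is attached to the base URL itself.
fragment = urljoin(base, "#top")

print(resolved)  # http://www.example.org/spam/eggs.png
print(fragment)  # http://www.example.org/spam/index.html#top
```

<par>The second result shows why <lit>#top</lit> ends up carrying the file name
of the base &url; after parsing.</par>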
    1501  
    1502 </section> 
    1503  
    1504  
    1505 <section><title>Pretty printing &xml;</title> 
    1506 <par>The method <method>pretty</method> can be used for pretty printing &xml;. 
    1507 It returns a new version of the node, with additional white space between the 
    1508 elements:</par> 
    1509 <example> 
    1510 <prog> 
    1511 from ll.xist.ns import html 
    1512 node = html.html( 
     858e = html.html( 
    1513859    html.head( 
    1514         html.title(u"foo"), 
     860        html.title("The page") 
    1515861    ), 
    1516862    html.body( 
    1517         html.div( 
    1518             html.h1(u"The ", html.em(u"foo"), u" page!"), 
    1519             html.p(u"Welcome to the ", html.em(u"foo"), u" page."), 
    1520         ), 
    1521     ), 
     863        html.h1("The header"), 
     864        html.p("The content") 
     865    ) 
    1522866) 
    1523867 
    1524 print node.pretty().bytes() 
    1525 </prog> 
    1526 </example> 
    1527 <par>This will print:</par> 
    1528 <example> 
    1529 <tty> 
    1530 &lt;html&gt; 
    1531     &lt;head&gt; 
    1532         &lt;title&gt;foo&lt;/title&gt; 
    1533     &lt;/head&gt; 
    1534     &lt;body&gt; 
    1535         &lt;div&gt; 
    1536             &lt;h1&gt;The &lt;em&gt;foo&lt;/em&gt; page!&lt;/h1&gt; 
    1537             &lt;p&gt;Welcome to the &lt;em&gt;foo&lt;/em&gt; page.&lt;/p&gt; 
    1538         &lt;/div&gt; 
    1539     &lt;/body&gt; 
    1540 &lt;/html&gt; 
    1541 </tty> 
    1542 </example> 
    1543 <par>Element content will only be modified if it doesn't contain 
    1544 <class>Text</class> nodes, so mixed content will not be touched.</par> 
    1545 </section> 
    1546  
    1547  
    1548 <section><title>Automatic generation of image size attributes</title> 
    1549  
    1550 <par>The module <pyref module="ll.xist.ns.htmlspecials"><module>ll.xist.ns.htmlspecials</module></pyref> 
    1551 contains an element <pyref module="ll.xist.ns.htmlspecials" class="autoimg"><class>autoimg</class></pyref> 
    1552 that extends <pyref module="ll.xist.ns.html" class="img"><class>ll.xist.ns.html.img</class></pyref>. 
    1553 When converted to &html; via the <pyref module="ll.xist.xsc" class="Node" method="convert"><method>convert</method></pyref> 
    1554 method the size of the image will be determined and the <lit>height</lit> 
    1555 and <lit>width</lit> attributes will be set accordingly (if those attributes 
    1556 are not set already).</par> 
    1557  
    1558 </section> 
    1559  
    1560  
    1561 <section><title>Embedding Python code</title> 
    1562 <par>It's possible to embed Python code into &xist; &xml; files. For this 
    1563 &xist; supports two new processing instructions: 
    1564 <pyref module="ll.xist.ns.code" class="pyexec"><lit>pyexec</lit></pyref> 
    1565 and <pyref module="ll.xist.ns.code" class="pyeval"><lit>pyeval</lit></pyref> (in the module 
    1566 <pyref module="ll.xist.ns.code"><module>ll.xist.ns.code</module></pyref>). The content of 
    1567 <pyref module="ll.xist.ns.code" class="pyexec"><lit>pyexec</lit></pyref> will be 
    1568 executed when the processing instruction node is converted.</par> 
    1569  
    1570 <par>The result of a call to <pyref module="ll.xist.xsc" class="Node" method="convert"><method>convert</method></pyref> 
    1571 for a <pyref module="ll.xist.ns.code" class="pyeval"><lit>pyeval</lit></pyref> processing instruction is whatever the 
    1572 Python code in the content returns. The processing instruction content is treated as the body 
    1573 of a function, so you can put multiple return statements there. 
    1574 The converter is available as the parameter <arg>converter</arg> inside 
    1575 the processing instruction. For example, consider the following &xml; file:</par> 
    1576  
    1577 <prog> 
    1578 &lt;?pyexec 
    1579     # sum 
    1580     def gauss(top=100): 
    1581         sum = 0 
    1582         for i in xrange(top+1): 
    1583             sum += i 
    1584         return sum 
    1585 ?&gt; 
    1586 &lt;b&gt;&lt;?pyeval return gauss()?&gt;&lt;/b&gt; 
    1587 </prog> 
    1588  
    1589 <par>Parsing this file and calling 
    1590 <pyref module="ll.xist.xsc" class="Node" method="convert"><method>convert</method></pyref> 
    1591 results in the following:</par> 
    1592  
    1593 <tty> 
    1594 &lt;b&gt;5050&lt;/b&gt; 
    1595 </tty> 
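<par>Treating a piece of source code as the body of a function can be sketched
in a few lines of plain Python. The helper
<function>run_as_function_body</function> below is hypothetical and only
illustrates the idea; it is not how <lit>pyeval</lit> is implemented, and it
assumes Python 3.</par>

```python
import textwrap


def run_as_function_body(body, namespace, converter=None):
    """Wrap ``body`` in a function definition, execute it and call it."""
    indented = textwrap.indent(textwrap.dedent(body).strip("\n"), "    ")
    source = "def _pi_body(converter):\n" + indented
    exec(source, namespace)
    return namespace["_pi_body"](converter)


# Shared namespace, playing the role of the state set up by earlier code.
namespace = {}
exec(textwrap.dedent("""
    def gauss(top=100):
        return sum(range(top + 1))
"""), namespace)

result = run_as_function_body("return gauss()", namespace)
print(result)  # 5050
```

<par>Because the body is wrapped in a function, <lit>return</lit> statements
are legal in it and the <arg>converter</arg> parameter is visible inside.</par>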
    1596  
    1597 </section> 
     868print e.bytes(prefixdefault=True) 
     869</prog> 
     870</example> 
     871 
     872<par>Using <lit>True</lit> allows &xist; to choose its own prefixes. The code 
     873above will output (rewrapped for clarity):</par> 
     874<prog><![CDATA[ 
     875<ns:html xmlns:ns="http://www.w3.org/1999/xhtml"> 
     876<ns:head><ns:title>The page</ns:title></ns:head> 
     877<ns:body><ns:h1>The header</ns:h1><ns:p>The content</ns:p></ns:body> 
     878</ns:html> 
     879]]></prog> 
     880 
     881<par>You can also use a fixed prefix:</par> 
     882 
     883<prog> 
     884print e.bytes(prefixdefault="h") 
     885</prog> 
     886 
     887<par>This will output (again rewrapped):</par> 
     888<prog><![CDATA[ 
     889<h:html xmlns:h="http://www.w3.org/1999/xhtml"> 
     890<h:head><h:title>The page</h:title></h:head> 
     891<h:body><h:h1>The header</h:h1><h:p>The content</h:p></h:body> 
     892</h:html>]]></prog> 
     893 
     894<par>If you want the empty prefix you can use <lit>None</lit>:</par> 
     895 
     896<prog> 
     897print e.bytes(prefixdefault=None) 
     898</prog> 
     899 
     900<par>This will output (again rewrapped):</par> 
     901<prog><![CDATA[ 
     902<html xmlns="http://www.w3.org/1999/xhtml"> 
     903<head><title>The page</title></head> 
     904<body><h1>The header</h1><p>The content</p></body> 
     905</html>]]></prog> 
     906 
     907<par>When elements from more than one namespace are present in the tree, 
     908<arg>prefixdefault</arg> is unreliable. The first namespace encountered will 
     909get the prefix specified by <arg>prefixdefault</arg>, all others will get a 
     910different prefix. &xist; will never use the same prefix for different namespaces. 
     911&xist; will also refuse to use an empty prefix for global attributes:</par> 
     912 
     913<example><title>Publishing global attributes</title> 
     914<prog> 
     915from __future__ import with_statement 
     916from ll.xist import xsc 
     917from ll.xist.ns import html, xlink 
     918 
     919with html.html() as e: 
     920    with html.head(): 
     921        +html.title("The page") 
     922    with html.body(): 
     923        +html.h1("The header") 
     924        with html.p(): 
     925            +xsc.Text("The ") 
     926            +html.a( 
     927                "Python", 
     928                xlink.Attrs( 
     929                    href="http://www.python.org/", 
     930                    title="Python", 
     931                    type="simple" 
     932                ), 
     933                href="http://www.python.org/") 
     934            +xsc.Text(" homepage") 
     935 
     936print e.bytes(prefixdefault=None) 
     937</prog> 
     938</example> 
     939 
     940<par>This will output:</par> 
     941 
     942<prog><![CDATA[ 
     943<html xmlns="http://www.w3.org/1999/xhtml" xmlns:ns="http://www.w3.org/1999/xlink"> 
     944<head><title>The page</title></head> 
     945<body> 
     946<h1>The header</h1> 
     947<p>The <a ns:href="http://www.python.org/" ns:type="simple" ns:title="Python" href="http://www.python.org/">Python</a> homepage</p> 
     948</body> 
     949</html>]]> 
     950</prog> 
     951 
     952<par>In the case of multiple namespaces you can use the <arg>prefixes</arg> 
     953argument to specify an explicit prefix for each namespace. So we could change 
     954the publishing statement from our example above to:</par> 
     955 
     956<prog> 
     957print e.bytes(prefixes={"http://www.w3.org/1999/xhtml": None, "http://www.w3.org/1999/xlink": "xl"}) 
     958</prog> 
     959 
     960<par>which would give us the output:</par> 
     961 
     962<prog><![CDATA[ 
     963<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xl="http://www.w3.org/1999/xlink"> 
     964<head><title>The page</title></head> 
     965<body> 
     966<h1>The header</h1> 
     967<p>The <a xl:href="http://www.python.org/" xl:type="simple" xl:title="Python" href="http://www.python.org/">Python</a> homepage</p> 
     968</body> 
     969</html>]]> 
     970</prog> 
     971 
     972<par>Note that we can shorten the publishing call from above to:</par> 
     973 
     974<prog> 
     975print e.bytes(prefixes={html.xmlns: None, xlink.xmlns: "xl"}) 
     976</prog> 
     977 
     978<par>or even to:</par> 
     979 
     980<prog> 
     981print e.bytes(prefixes={html: None, xlink: "xl"}) 
     982</prog> 
     983 
     984<par>Finally it's possible to suppress output of namespace declarations 
     985for certain namespaces by using the <arg>hidexmlns</arg> argument:</par> 
     986</section> 
     987 
     988<prog> 
     989print e.bytes(prefixes={html: None, xlink: "xl"}, hidexmlns=[html, xlink]) 
     990</prog> 
     991 
     992<par>This will output:</par> 
     993<prog><![CDATA[ 
     994<html> 
     995<head><title>The page</title></head> 
     996<body> 
     997<h1>The header</h1> 
     998<p>The <a xl:href="http://www.python.org/" xl:type="simple" xl:title="Python" href="http://www.python.org/">Python</a> homepage</p> 
     999</body> 
     1000</html> 
     1001]]></prog> 
     1002 
     1003</section>