Object traversal is the process Twisted.Web2 uses to determine what object to use to render HTML for a particular URL. When an HTTP request comes in to the web server, the object publisher splits the URL into segments, and repeatedly calls methods which consume path segments and return objects which represent that path, until all segments have been consumed. At the core, the Web2 traversal API is very simple. However, it provides some higher level functionality layered on top of this to satisfy common use cases.
Object Traversal Basics
The root resource is the top-level object in the URL
      space; it conceptually represents the URI "/". The
      Twisted.Web2 object traversal and object
      publishing machinery uses only two methods to locate an
      object suitable for publishing and to generate the HTML from it;
      these methods are described in the
      interface twisted.web2.iweb.IResource:
class IResource(Interface): """ I am a web resource. """ def locateChild(request, segments): """Locate another object which can be adapted to IResource. return: A 2-tuple of (resource, remaining-path-segments), or a deferred which will fire the above. Causes the object publishing machinery to continue on with specified resource and segments, calling the appropriate method on the specified resource. If you return (self, L{server.StopTraversal}), this instructs web2 to immediately stop the lookup stage, and switch to the rendering stage, leaving the remaining path alone for your render function to handle. """ def renderHTTP(request): """Return an IResponse or a deferred which will fire an IResponse. This response will be written to the web browser which initiated the request. """
Let's examine what happens when object traversal occurs over a very simple root resource:
from twisted.web2 import iweb, http, stream class SimpleRoot(object): implements(iweb.IResource) def locateChild(self, request, segments): return self, () def renderHTTP(self, request): return http.Response(200, stream=stream.MemoryStream("Hello, world!"))
This resource, when passed as the root resource to
      server.Site or
      wsgi.createWSGIApplication, will
      immediately return itself, consuming all path segments. This
      means that for every URI a user visits on a web server which
      is serving this root resource, the text "Hello,
      world!" will be rendered. Let's examine the value of
      segments for various values of
      URI:
/foo/bar
  ('foo', 'bar')
/
  ('', )
/foo/bar/baz.html
  ('foo', 'bar', 'baz.html')
/foo/bar/directory/
  ('foo', 'bar', 'directory', '')
    So we see that Web2 does nothing more than split the URI on the string '/' and pass these path segments to our application for consumption. Armed with these two methods alone, we already have enough information to write applications which service any form of URL imaginable in any way we wish. However, there are some common URL handling patterns which Twisted.Web2 provides higher level support for.
locateChild in depth
One common URL handling pattern involves parents which only
      know about their direct children. For example,
      a Directory object may only
      know about the contents of a single directory, but if it
      contains other directories, it does not know about the contents
      of them. Let's examine a
      simple Directory object which
      can provide directory listings and serves up objects for child
      directories and files:
from twisted.web2 import resource class Directory(resource.Resource): def __init__(self, directory): self.directory = directory def renderHTTP(self, request): html = ['<ul>'] for child in os.listdir(self.directory): fullpath = os.path.join(self.directory, child) if os.path.isdir(fullpath): child += '/' html.extend(['<li><a href="', child, '">', child, '</a></li>']) html.append('</ul>') html = stream.MemoryStream(''.join(html)) return http.Response(200, stream=html) def locateChild(self, request, segments): name = segments[0] fullpath = os.path.join(self.directory, name) if not os.path.exists(fullpath): return None, () # 404 if os.path.isdir(fullpath): return Directory(fullpath), segments[1:] if os.path.isfile(fullpath): return static.File(fullpath), segments[1:]
Because this implementation of locateChild
      only consumed one segment and returned the rest of them
      (segments[1:]),
      the object traversal process will continue
      by calling locateChild
      on the returned resource and passing the partially-consumed
      segments. In this way, a directory structure of any depth can be
      traversed, and directory listings or file contents can be rendered for any
      existing directories and files.
So, let us examine what happens when the URI "/foo/bar/baz.html" is traversed, where "foo" and "bar" are directories, and "baz.html" is a file.
- Directory('/').locateChild(request, ('foo', 'bar', 'baz.html')) - Returns Directory('/foo'), ('bar', 'baz.html')
- Directory('/foo').locateChild(request, ('bar', 'baz.html')) - Returns Directory('/foo/bar'), ('baz.html, )
- Directory('/foo/bar').locateChild(request, ('baz.html')) - Returns File('/foo/bar/baz.html'), ()
No more segments to be
      consumed; File('/foo/bar/baz.html').renderHTTP(ctx) is
      called, and the result is sent to the browser.
childFactory method
Consuming one URI segment at a time by checking to see if a
      requested resource exists and returning a new object is a very
      common pattern. Web2's default implementation
      of twisted.web2.iweb.IResource, twisted.web2.resource.Resource, contains an
      implementation of locateChild which
      provides more convenient hooks for implementing object
      traversal. One of these hooks
      is childFactory. Let us imagine for
      the sake of example that we wished to render a tree of
      dictionaries. Our data structure might look something like this:
tree = dict( one=dict( foo=None, bar=None), two=dict( baz=dict( quux=None)))
Given this data structure, the valid URIs would be:
- /
- /one
- /one/foo
- /one/bar
- /two
- /two/baz
- /two/baz/quux
Let us construct
      a twisted.web2.resource.Resource
      subclass which uses the
      default locateChild implementation
      and overrides the childFactory hook
      instead:
from twisted.web2 import http, resource, stream class DictTree(resource.Resource): def __init__(self, dataDict): self.dataDict = dataDict def renderHTTP(self, request): if self.dataDict is None: content = "Leaf" else: html = ['<ul>'] for key in self.dataDict.keys(): html.extend(['<li><a href="', key, '">', key, '</a></li>']) html.append('</ul>') content = ''.join(html) return http.Response(200, stream=stream.MemoryStream(content)) def childFactory(self, request, name): if name not in self.dataDict: return None # 404 return DictTree(self.dataDict[name])
As you can see, the childFactory
    implementation is considerably shorter than the equivalent locateChild implementation would have been.
child_* methods and attributes
Often we may wish to have some hardcoded URLs which are not
      dynamically generated based on some data structure. For example,
      we might have an application which uses an external CSS
      stylesheet, an external JavaScript file, and a folder full of
      images. The twisted.web2.resource.ResourcelocateChild implementation provides a
      convenient way for us to express these relationships by using
      child_ prefixed methods:
from twisted.web2 import resource, http, static class Linker(resource.Resource): def renderHTTP(self, request): page = """<html> <head> <link href="css" rel="stylesheet" /> <script type="text/javascript" src="scripts" /> <body> <img src="images/logo.png" /> </body> </html>""" return http.Response(200, stream=stream.MemoryStream(page)) def child_css(self, request): return static.File('/Users/dp/styles.css') def child_scripts(self, request): return static.File('/Users/dp/scripts.js') def child_images(self, request): return static.File('/Users/dp/images/')
One thing you may have noticed is that all of the examples so
      far have returned new object instances whenever they were
      implementing a traversal API. However, there is no reason these
      instances cannot be shared. One could for example return a
      global resource instance, an instance which was previously
      inserted in a dict, or lazily create and cache dynamic resource
      instances on the fly. The resource.ResourcelocateChild implementation also provides a
      convenient way to express that one global resource instance
      should always be used for a particular url,
      the child-prefixed attribute:
class FasterLinker(Linker): child_css = static.File('/Users/dp/styles.css') child_scripts = static.File('/Users/dp/scripts.js') child_images = static.File('/Users/dp/images/')
Dots in child names
When a URL contains dots, which is quite common in normal URLs,
      it is simple enough to handle these URL segments
      in locateChild
      or childFactory one of the passed
      segments will simply be a string containing a dot. However, it
      is notimmediately obvious how one would express a URL segment
      with a dot in it when
      using child-prefixed methods. The
      solution is really quite simple:
class DotChildren(resource.Resource):
    def render(self, request):
        return http.Response(200, stream="""
  
    
  
""")
    If you only wish to add a child to specific instance of DotChildren then you should use the putChild method.
rsrc = DotChildren()
rsrc.putChild('child_scripts.js', static.File('/Users/dp/scripts.js'))
    However if you wish to add a class attribute you can use setattr like so.
setattr(DotChildren, 'child_scripts.js', static.File('/Users/dp/scripts.js'))
    The same technique could be used to install a child method with a dot in the name.
The default trailing slash handler
When a URI which is being handled ends in a slash, such as when
      the '/' URI is being rendered or when a directory-like URI is
      being rendered, the string '' appears in the path segments which
      will be traversed. Again, handling this case is trivial inside
      either locateChild
      or childFactory, but it may not be
      immediately obvious
      what child-prefixed method or
      attribute will be looked up. The method or attribute name which
      will be used is simply child with a
      single trailing underscore.
The resource.Resource class
      provides an implementation of this method which can work in two
      different ways. If the
      attribute addSlash is True, the
      default trailing slash handler will
      return self. In the case
      when addSlash is True, the
      default resource.Resource.renderHTTP
      implementation will simply perform a redirect which
      adds the missing slash to the URL.
The default trailing slash handler also returns self
      if addSlash is false, but emits a
      warning as it does so. This warning may become an exception at
      some point in the future.
IRequest.prepath and IRequest.postpath
During object traversal, it may be useful to discover which
      segments have already been handled and which segments are
      remaining to be handled.  In locateChild the remaining segments
      are given as the second argument.  However, since all object
      traversal APIs are also passed
      the request object, this information
      can also be obtained via
      the IRequest.prepath
      and IRequest.postpath
      attributes.
Conclusion
Twisted.web2 makes it easy to handle complex URL
      hierarchies. The most basic object traversal
      interface, twisted.web2.iweb.IResource.locateChild,
      provides powerful and flexible control over the entire object
      traversal process. Web2's
      canonical IResource
      implementation, resource.Resource, also includes the
      convenience hooks childFactory along
      with child-prefixed method and
      attribute semantics to simplify common use cases.