XSLTGen

An Automatic XSLT Stylesheet Generator

 

Brief Description

XSLTGen is an automatic XSLT stylesheet generator. Given a source XML document and a desired output HTML or XML document, it generates an XSLT stylesheet containing the rules needed to transform the source document into the desired output. The generated stylesheet can also be applied to other XML documents whose structure is similar to that of the source document.

The XSLTGen system consists of six main components: DOM Builder, Text Matching subsystem, Structure Matching subsystem, Sequence Checker, XSLT Stylesheet Constructor, and XSLT Stylesheet Refiner subsystem.

 

Input Requirements

XSLTGen requires its input XML and HTML documents to be well-formed.
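
A quick way to check this requirement before running XSLTGen is to parse each document with a standard XML parser. The following is a minimal sketch that uses only the JDK's built-in javax.xml.parsers API. The class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration and are not part of XSLTGen. Note that an HTML document passes only if it is written as well-formed XML, so every tag must be closed (for example '<br/>' rather than '<br>').

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of XSLTGen: reports whether markup is well-formed XML.
public class WellFormedCheck {

    /** Returns true if the given markup parses as well-formed XML. */
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps the parser
            // from printing its own diagnostics to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input: it is not well-formed.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>line<br/></p>")); // prints "true"
        System.out.println(isWellFormed("<p>line<br></p>"));  // prints "false"
    }
}
```

The second call fails because the 'br' element is never closed, which is the most common reason an HTML page is rejected as input.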

 

Current Version

The current version of XSLTGen cannot automatically generate XSLT stylesheets that use complex functions (e.g. conditional transformation, item numbering, sorting).

 

Source Code

The source code is divided into two zip files:

  1. XSLTGen.zip contains the source code for the first five components of the XSLTGen system, i.e. DOM Builder, Text Matching subsystem, Structure Matching subsystem, Sequence Checker, and XSLT Stylesheet Constructor.

  2. Refine.zip contains the source code for the final step of the XSLTGen system: the XSLT Stylesheet Refiner subsystem.

 

How To Run

To generate the XSLT stylesheet for an input XML document and a desired output HTML document (first five steps of XSLTGen):

  1. Unzip XSLTGen.zip
  2. Run 'make'
  3. Use the command 'java -jar XSLTGen.jar <xml_document> <html_document>' or 'java XSLTGen <xml_document> <html_document>'
  4. The resulting XSLT stylesheet is sent to 'stdout'

To generate the fixed XSLT stylesheet (final step of XSLTGen):

  1. Unzip Refine.zip
  2. Run 'make'
  3. Use the command 'java -cp "./xss4j.jar;" ImproveXSLT <xml_document> <html_document> <xslt_stylesheet_to_be_fixed>' (on Windows), or 'java -cp "./xss4j.jar:." ImproveXSLT <xml_document> <html_document> <xslt_stylesheet_to_be_fixed>' (on UNIX)
  4. When the program completes, the fixed XSLT stylesheet can be found in the <xslt_stylesheet_to_be_fixed> file, as well as on 'stdout'

Examples

The following are the datasets used in the XSLTGen paper:

  1. Books
  2. Itinerary
  3. Poem
  4. Soccer
  5. Chat Log

Each of these examples may contain the following files:

  1. a.xml - the XML document
  2. a.html - the original HTML document, i.e. the one generated based on the XML document 'a.xml'
  3. a.xsl - the XSLT stylesheet used to transform the XML document 'a.xml' to the original HTML document 'a.html'
  4. a-gen.xsl - the XSLT stylesheet produced by the XSLTGen system
  5. a-gen.html - the HTML document produced by applying the generated XSLT stylesheet 'a-gen.xsl' back to the XML document 'a.xml'
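
A file such as a-gen.html can be reproduced with any XSLT processor by applying a-gen.xsl to a.xml. The following is a minimal sketch that uses the JDK's built-in javax.xml.transform API. The class name ApplyStylesheet and its methods are hypothetical names chosen for illustration; they are not part of XSLTGen, and the datasets may have been produced with a different processor.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of XSLTGen: applies a stylesheet to an XML document.
public class ApplyStylesheet {

    /** Applies the stylesheet to the XML source and returns the serialized result. */
    public static String transform(Source xml, Source xslt) {
        try {
            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer(xslt);
            StringWriter out = new StringWriter();
            transformer.transform(xml, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Convenience overload for documents held in memory as strings. */
    public static String transformStrings(String xml, String xslt) {
        return transform(new StreamSource(new StringReader(xml)),
                         new StreamSource(new StringReader(xslt)));
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java ApplyStylesheet <xml_document> <xslt_stylesheet>");
            return;
        }
        // The result goes to stdout so it can be redirected to a file.
        System.out.print(transform(new StreamSource(new File(args[0])),
                                   new StreamSource(new File(args[1]))));
    }
}
```

Under these assumptions, 'java ApplyStylesheet a.xml a-gen.xsl > a-gen.html' would write the transformed document to a-gen.html, which can then be compared with the original a.html.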

The following is the dataset used to test the Refinement stage of XSLTGen:

  1. Music

This example contains the following files:

  1. a.xml - the XML document
  2. a.html - the original HTML document, i.e. the one generated based on the XML document 'a.xml'
  3. a.xsl - the XSLT stylesheet used to transform the XML document 'a.xml' to the original HTML document 'a.html'
  4. a-gen.xsl - the initial, erroneous XSLT stylesheet produced by the XSLTGen system
  5. a-gen.html - the HTML document produced by applying the generated XSLT stylesheet 'a-gen.xsl' back to the XML document 'a.xml'
  6. a-fixed.xsl - the fixed XSLT stylesheet produced by the Refinement algorithm of XSLTGen
  7. a-fixed.html - the HTML document produced by applying the fixed XSLT stylesheet 'a-fixed.xsl' back to the XML document 'a.xml'