16. XML projects

Directly from the Apache XML project website, its goals are:

The project homepage is located at http://xml.apache.org. It is an umbrella for a variety of subprojects.

16.1 Introduction to XML

This is a quick introduction to XML. To learn more about XML, a good starting point is http://www.xml.com. XML is a markup language (think HTML) for describing structured content using tags and attributes. Once content is separated from presentation, you can choose how to display it (cellphone, HTML, text) or exchange it. The XML standard only describes how the tags and attributes can be arranged, not their names or what they mean. Apache provides the tools described in the following sections.
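The point that XML constrains the arrangement of tags but not their names can be illustrated with a small check. This is a minimal sketch that uses the standard JAXP parser interfaces shipped with the JDK. The class name WellFormedSketch, the method isWellFormed and the sample documents are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any Apache project.
class WellFormedSketch {

    // Returns true if the text parses as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser setup failed", e);
        }
    }

    public static void main(String[] args) {
        // Invented tag and attribute names are fine as long as they nest properly.
        System.out.println(isWellFormed(
            "<recipe serves=\"2\"><step>Boil water</step></recipe>")); // prints true
        // Tags closed in the wrong order break the arrangement rules.
        System.out.println(isWellFormed(
            "<recipe><step>Boil water</recipe></step>")); // prints false
    }
}
```

The parser accepts any tag names. Deciding what a recipe or a step means is left to the application that reads the document.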

16.2 Xerces

The Xerces project provides XML parsers for a variety of languages, including Java, C++ and Perl. The Perl bindings are based on the C++ sources. There are Tcl bindings for Xerces in the 2.0 version of TclXML, by Steve Ball. This 2.0 version is available through the SourceForge project page. An XML parser is a tool used for programmatic access to XML documents. This is a description of the standards supported by Xerces:

The initial code base of the Xerces XML project was donated by IBM. You can find more information on the Xerces Java, Xerces C++ and Xerces Perl homepages.
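To give an idea of what programmatic access to an XML document looks like, here is a minimal sketch in Java. It relies only on the standard JAXP and DOM interfaces that ship with the JDK, not on any Xerces-specific class. The class name DomParseSketch, the method projectNames and the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; the document layout below is invented.
class DomParseSketch {

    static final String SAMPLE =
        "<projects>"
      + "<project name=\"Xerces\" language=\"Java\"/>"
      + "<project name=\"Xalan\" language=\"C++\"/>"
      + "</projects>";

    // Collects the "name" attribute of every <project> element.
    static List<String> projectNames(String xml) {
        try {
            // Parse the text into an in-memory DOM tree.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("project");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element project = (Element) nodes.item(i);
                names.add(project.getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(projectNames(SAMPLE)); // prints [Xerces, Xalan]
    }
}
```

The DOM approach loads the whole document into memory as a tree, which is convenient for small documents like this one.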

16.3 Xalan

Xalan is an XSLT processor available for Java and C++. XSL is a style sheet language for XML; the T stands for Transformation. XML is good at storing structured data (information), but we often need to display this data to the user or transform it in some other way. Xalan takes the original XML document, reads the transformation rules from a stylesheet and outputs HTML, plain text or another XML document. You can learn more about Xalan at the Xalan Java and Xalan C++ project homepages.
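The transformation step can be sketched in a few lines of Java. This example uses the standard javax.xml.transform interfaces from the JDK rather than Xalan-specific classes, and it produces plain text to keep the result easy to read. The class name XsltSketch, the sample document and the stylesheet are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; document and stylesheet are made up.
class XsltSketch {

    // The data: structured content with no presentation attached.
    static final String XML =
        "<greeting><to>world</to></greeting>";

    // The stylesheet: rules that turn the data into plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/greeting\">"
      + "Hello, <xsl:value-of select=\"to\"/>!"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the document and returns the output.
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, STYLESHEET)); // prints Hello, world!
    }
}
```

Changing the xsl:output method and the template body is enough to produce HTML or another XML document from the same input.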

16.4 FOP

From the website: "FOP is a Java application that reads a formatting object tree and then turns it into a PDF document." So FOP takes an XML document and outputs PDF, much as Xalan does with HTML or text. You can learn more about FOP at the project homepage.

16.5 Cocoon

Cocoon leverages other Apache XML technologies like Xerces, Xalan and FOP to provide a comprehensive publishing framework. Cocoon is based around XML and XSL and targeted at sites of medium to high complexity. It separates content, logic and presentation as described on the website:

You can learn more about Cocoon at the project homepage.

16.6 Xang

The goal of the Xang project is to make it easy for developers to build commercial-quality, XML-aware applications for the Web. The application logic is defined in a hierarchical XML file which can be scripted via JavaScript. This file defines how to access the data (which can be other XML files, Java plug-ins, etc.). The Xang engine takes care of mapping HTTP requests to the appropriate handlers. You can learn more about Xang at the project homepage.

16.7 SOAP

Apache SOAP ("Simple Object Access Protocol") is an implementation of the SOAP submission to W3C. It is based on, and supersedes, the IBM SOAP4J implementation.

From the draft W3C specification: SOAP is a lightweight protocol for exchange of information in a decentralized, distributed environment. It is an XML based protocol that consists of three parts:

Think of SOAP as an XML-based remote procedure call or CORBA system. It is based on HTTP and XML. On the one hand, this means it is verbose and slow compared to other systems. On the other hand, it eases interoperability, debugging and development of clients and servers for a variety of languages (C, Java, Perl, Python, Tcl, etc.), since most modern languages have HTTP and XML modules. You can learn more at the Apache SOAP homepage.
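To make the remote procedure call idea more concrete, the sketch below builds the XML message for a single call using only the DOM and transformation interfaces in the JDK. It does not use the Apache SOAP client API and it does not send anything over HTTP. The envelope namespace is the one defined by SOAP 1.1; the service namespace urn:example:quotes, the method getQuote, the parameter symbol and the class name SoapEnvelopeSketch are all hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; this is not the Apache SOAP API.
class SoapEnvelopeSketch {

    // Envelope namespace from the SOAP 1.1 specification.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Builds the XML text for a call to one method with one parameter.
    static String buildRequest(String serviceNs, String method,
                               String paramName, String paramValue) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Envelope > Body > method element > parameter element.
            Element envelope = doc.createElementNS(SOAP_NS, "SOAP-ENV:Envelope");
            Element body = doc.createElementNS(SOAP_NS, "SOAP-ENV:Body");
            Element call = doc.createElementNS(serviceNs, "m:" + method);
            Element param = doc.createElementNS(null, paramName);
            param.setTextContent(paramValue);

            doc.appendChild(envelope);
            envelope.appendChild(body);
            body.appendChild(call);
            call.appendChild(param);

            // An identity transformation serializes the tree to text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not build SOAP request", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stock quote service, for illustration only.
        System.out.println(
            buildRequest("urn:example:quotes", "getQuote", "symbol", "IBM"));
    }
}
```

A client would send this text as the body of an HTTP POST request and receive a similar envelope carrying the result. Because the message is plain XML, it can be inspected with any text tool, which is where the debugging advantage mentioned above comes from.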

16.8 Batik

Batik is a Java based toolkit for applications that want to use images in the Scalable Vector Graphics (SVG) format for various purposes, such as viewing, generation or manipulation.

It is XML-centric and compliant with the W3C specification. It differs from other Apache projects in that it provides a graphical component. Batik provides hooks to extend the framework through custom tags, and it allows conversion from SVG to other formats like JPEG or PNG.

You can learn more about Batik at the project homepage.

16.9 Crimson

Crimson is an alternative, Java-based XML parser with support for XML 1.0 through a variety of interfaces. It is the parser currently shipping in Sun products, and an intermediate step until version 2 of Xerces is released.

You can learn more about Crimson at the project homepage.

16.10 Other XML projects

There are other projects based on Apache and XML that do not live under the Apache XML umbrella.
