Java XML Parsing Tutorial - BunksAllowed




There are two commonly used types of XML parsers: (i) DOM (Document Object Model) and (ii) SAX (Simple API for XML). Their working principles are different.

Both SAX and DOM parsers have their advantages and disadvantages, so the choice of parser depends on the application.

Working principle of SAX and DOM parsers


A DOM parser builds a tree structure in memory from the input document and then waits for requests from the client. A SAX parser, in contrast, does not build any internal structure. Instead, it treats each component it encounters in the input document as an event and reports it to the client as it reads through the document. A DOM parser therefore always gives the client application access to the entire document, whereas a SAX parser only ever gives it one piece of the document at a time.
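The tree-in-memory model described above can be sketched with the JDK's built-in JAXP classes. This is only a minimal illustration, not a recommended design: the class name `DomSketch`, the sample XML, and the helper `titleOf` are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DomSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<library>"
      + "<book id=\"b1\"><title>First</title></book>"
      + "<book id=\"b2\"><title>Second</title></book>"
      + "</library>";

    // The whole input is parsed into an in-memory tree before this returns.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // The client asks the tree for what it wants, in any order it likes.
    static String titleOf(Document doc, int index) {
        NodeList books = doc.getElementsByTagName("book");
        Element book = (Element) books.item(index);
        return book.getElementsByTagName("title").item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // The second book can be read before the first: the tree is all there.
        System.out.println(titleOf(doc, 1)); // prints "Second"
        System.out.println(titleOf(doc, 0)); // prints "First"
    }
}
```

Because the complete tree stays in memory, the client is free to visit nodes in any order, which is exactly what costs memory on large inputs.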

With a DOM parser, the client application must make explicit method calls, which typically form a chain. With a SAX parser, certain methods are invoked automatically when the corresponding events occur.
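To make the event-driven style concrete, here is a minimal sketch using the JDK's SAX classes. The class name `SaxEventSketch` and the idea of logging events into a list are assumptions for illustration only. The callbacks `startElement`, `endElement`, and `characters` are never called by our code; the parser invokes them as it reads.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventSketch {
    // Returns the sequence of events the parser reported for the given XML.
    static List<String> events(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Note: a parser may deliver long text in several calls;
                // this sketch simply logs each non-blank piece it receives.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        // Each line printed corresponds to one callback made by the parser.
        for (String event : events("<note><to>Ann</to></note>")) {
            System.out.println(event);
        }
    }
}
```

Notice that nothing here holds the document: once an event has been reported, the parser moves on, and anything the application wants to keep it must store itself.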

How do we decide which parser to use?


Ideally, a good parser should be time-efficient, space-efficient, rich in functionality, and easy to use. In reality, no parser has all of these features at the same time.

A DOM parser is rich in functionality, but it is inefficient when the document is huge, because it must parse the whole document once before the client can use it. It also takes a little longer to learn how to work with it.

A SAX parser, on the other hand, is much more space-efficient for big input documents because it does not create an internal structure. In terms of functionality, however, it provides fewer functions, which means that users have to take care of more themselves, such as creating their own data structures.

When a SAX parser is preferable to a DOM parser

  • The input document is too big for available memory.
  • Processing can be done in small contiguous chunks of input, instead of an entire document.
  • You only want the parser to extract the information of interest, and all your computation will be based on data structures you create yourself.
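The last case in this list can be sketched as follows: the parser is used only to pull out what matters, and the result lives in a data structure the application owns. The class name `SaxCountSketch`, the choice of a `TreeMap`, and the sample XML are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountSketch {
    // Counts how often each element name occurs, without building a tree.
    static Map<String, Integer> countElements(String xml) {
        // Our own data structure: the only state kept while parsing.
        Map<String, Integer> counts = new TreeMap<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                counts.merge(qName, 1, Integer::sum);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String xml = "<order><item/><item/><item/><note/></order>";
        System.out.println(countElements(xml)); // prints {item=3, note=1, order=1}
    }
}
```

Memory use here grows with the number of distinct element names, not with the size of the document, which is why this style suits very large inputs.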

When a DOM parser is preferable to a SAX parser

  • The application needs to access widely separated parts of the document at the same time.
  • The application would otherwise need an internal data structure almost as complex as the document itself.
  • The document needs to be modified frequently.
  • The document has to be kept available for a long time.
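The cases in this list (touching distant parts of the document and modifying it in place) can be sketched with a DOM tree as below. The class name `DomEditSketch`, the `inventory`/`item` vocabulary, and the helper methods are hypothetical and exist only for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomEditSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Reads and writes the first and last items in one step: two widely
    // separated parts of the document are available at the same time.
    static void swapFirstAndLastQty(Document doc) {
        NodeList items = doc.getElementsByTagName("item");
        Element first = (Element) items.item(0);
        Element last = (Element) items.item(items.getLength() - 1);
        String tmp = first.getAttribute("qty");
        first.setAttribute("qty", last.getAttribute("qty"));
        last.setAttribute("qty", tmp);
    }

    // Modifies the tree in place by appending a new element to the root.
    static void appendItem(Document doc, String name, String qty) {
        Element item = doc.createElement("item");
        item.setAttribute("name", name);
        item.setAttribute("qty", qty);
        doc.getDocumentElement().appendChild(item);
    }

    // Writes the (possibly modified) tree back out as text.
    static String serialize(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<inventory><item name=\"a\" qty=\"1\"/><item name=\"b\" qty=\"5\"/></inventory>");
        swapFirstAndLastQty(doc);
        appendItem(doc, "c", "9");
        System.out.println(serialize(doc));
    }
}
```

Doing the same with SAX would require the application to buffer the items itself, which is the extra work the list above is pointing at.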

Happy Exploring!
