Version 3.0 of Jakarta POI, the venerable Java library for reading and writing certain Microsoft Office documents, has been released. This release adds support for MS Excel formulas, improved PowerPoint support, and image extraction for MS Word documents.
POI is split into a number of sub-projects, each handling a different aspect of these formats:
- POIFS: Access to the file structure of MS Office documents
- HSSF: MS Excel data access
- HWPF: MS Word data access
- HPSF: Property attributes for Office documents
- HSLF: PowerPoint data access
- POI-Ruby: Ruby bindings to gcj-compiled binary libraries
The programming paradigm is relatively simple. A developer uses POIFS to create or open the document as a stream, then connects that to the appropriate data access API (HSSF, HWPF, or HSLF) to interact with the content. Angsuman Chakraborty has written a concise piece on using POI to read Excel files that serves as a good introduction and quick start guide.
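As a rough illustration of that two-step pattern, the sketch below writes a one-cell workbook to memory with HSSF, then reopens the bytes through POIFS and hands the result back to HSSF to read the cell. The class name `PoiRoundTrip` and its helper methods are made up for this example, and it assumes the POI jar is on the classpath. Method signatures in the HSSF usermodel have shifted between POI releases (the `(short)` casts are there to suit older versions), so treat this as a starting point rather than a definitive recipe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical example class; not part of POI itself.
public class PoiRoundTrip {

    // Build a workbook with a single string cell at row 0, column 0
    // and return it as the raw bytes of an .xls file.
    public static byte[] buildWorkbookBytes(String text) throws IOException {
        HSSFWorkbook workbook = new HSSFWorkbook();
        HSSFSheet sheet = workbook.createSheet("Sheet1");
        HSSFRow row = sheet.createRow(0);
        HSSFCell cell = row.createCell((short) 0);
        cell.setCellValue(text);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        workbook.write(out);
        return out.toByteArray();
    }

    // Step 1: open the document structure with POIFS.
    // Step 2: pass it to HSSF to reach the spreadsheet content.
    public static String readFirstCell(byte[] xlsBytes) throws IOException {
        POIFSFileSystem fs = new POIFSFileSystem(new ByteArrayInputStream(xlsBytes));
        HSSFWorkbook workbook = new HSSFWorkbook(fs);
        HSSFSheet sheet = workbook.getSheetAt(0);
        HSSFRow row = sheet.getRow(0);
        HSSFCell cell = row.getCell((short) 0);
        return cell.getStringCellValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = buildWorkbookBytes("hello");
        System.out.println(readFirstCell(bytes));
    }
}
```

To read an existing file instead, a `FileInputStream` could be passed to `POIFSFileSystem` in place of the in-memory stream; real spreadsheets may have missing rows or cells, so `getRow` and `getCell` results should be checked for `null` before use.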
This is the last release of POI as a Jakarta subproject before its promotion to a top-level Apache project.