The Apache Commons IO library contains utility classes, stream implementations, file filters,
file comparators, endian transformation classes, and much more.
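As a rough illustration of what an "endian transformation" involves, the sketch below swaps byte order using only JDK classes. It does not use the Commons IO API; the class name `EndianSketch` and its method names are hypothetical and chosen only for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class for illustration; not part of Commons IO.
public class EndianSketch {

    // Reverse the byte order of a 32-bit value (big-endian <-> little-endian).
    static int swapInt(int value) {
        return Integer.reverseBytes(value);
    }

    // Interpret four bytes, starting at offset, as a little-endian 32-bit integer.
    static int readLittleEndianInt(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
    }

    public static void main(String[] args) {
        // 0x12345678 with its bytes reversed is 0x78563412.
        System.out.println(Integer.toHexString(swapInt(0x12345678)));

        // The same value stored least-significant byte first.
        byte[] littleEndian = {0x78, 0x56, 0x34, 0x12};
        System.out.println(Integer.toHexString(readLittleEndianInt(littleEndian, 0)));
    }
}
```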
Guava is a suite of core and expanded libraries that include
utility classes, Google's collections, I/O classes, and
much more.
Parent POM providing dependency and plugin management for applications built with Maven
The Apache PDFBox library is an open source Java tool for working with PDF documents.
This is the core Apache Tika™ toolkit library from which all other modules inherit functionality.
It also includes the core facades for the Tika API.
General data-binding functionality for Jackson: works on core streaming API
The Apache Commons Codec component contains encoders and decoders for
formats such as Base16, Base32, Base64, digest, and Hexadecimal. In addition to these
widely used encoders and decoders, the codec package also maintains a
collection of phonetic encoding utilities.
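To make the encoder/decoder idea concrete, here is a minimal sketch of hexadecimal and Base64 round-tripping. It relies only on the JDK (`java.util.Base64` plus a hand-written hex routine) and is not the Commons Codec API; the class `EncodingSketch` and its method names are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class for illustration; not part of Commons Codec.
public class EncodingSketch {

    // Encode bytes as lowercase hexadecimal, two characters per byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Decode a hexadecimal string back into bytes, rejecting malformed input.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string has odd length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Base64 encoding and decoding, delegated to the JDK.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] input = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(input));    // 68656c6c6f
        System.out.println(toBase64(input)); // aGVsbG8=
    }
}
```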
jsoup is a Java library that simplifies working with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS, and XPath selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers.
JSON is a lightweight, language-independent data interchange format.
See http://www.JSON.org/
The files in this package implement JSON encoders/decoders in Java.
It also includes the capability to convert between JSON and XML, HTTP
headers, Cookies, and CDL.
This is a reference implementation. There are a large number of JSON packages
in Java. Perhaps someday the Java community will standardize on one. Until
then, choose carefully.
Core Jackson processing abstractions (aka Streaming API), implementation for JSON
Core starter, including auto-configuration support, logging and YAML
Maven Surefire MOJO in maven-surefire-plugin.