A collection of tools for processing PDF files.


Version on this page:
LTS Haskell 22.26: 0.1.3
Stackage Nightly 2024-06-20: 0.1.3
Latest on Hackage: 0.1.3


BSD-3-Clause licensed by Yuras Shumovich
Maintained by Yuras Shumovich

Low level tools for processing PDF files.

Level of abstraction: cross reference, trailer, indirect object, object

The API is based on random-access input streams and is designed to be memory efficient: there is no need to parse the entire PDF file and hold it in memory when you need only a page or two. This usually improves time efficiency as well, but the library does not try to optimize performance by, e.g., maintaining an xref or object cache. Higher-level layers should take care of that.

The library is low level, so you may need to be familiar with PDF file internals to use it.
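
To give a feel for the kind of file internals involved, and for why random access helps, here is a minimal sketch that finds where the cross reference section starts by reading only the tail of a file. It does not use the pdf-toolbox-core API: `lastAfter`, `findStartXRef` and `readStartXRef` are hypothetical helpers built on `bytestring` and `System.IO`, and the 1024-byte tail size is an assumption chosen for illustration.

```haskell
import qualified Data.ByteString.Char8 as BC
import Data.Char (isSpace)
import System.IO

-- | Return the bytes that follow the last occurrence of the needle,
-- or Nothing if the needle does not occur at all.
lastAfter :: BC.ByteString -> BC.ByteString -> Maybe BC.ByteString
lastAfter needle = go Nothing
  where
    go found hay =
      let (_, rest) = BC.breakSubstring needle hay
      in if BC.null rest
           then found
           else let after = BC.drop (BC.length needle) rest
                in go (Just after) after

-- | Extract the byte offset written after the last "startxref" keyword
-- in a chunk taken from the end of a PDF file.
findStartXRef :: BC.ByteString -> Maybe Integer
findStartXRef chunk = do
  after <- lastAfter (BC.pack "startxref") chunk
  (offset, _) <- BC.readInteger (BC.dropWhile isSpace after)
  if offset >= 0 then Just offset else Nothing

-- | Seek to the end of the file and read only its tail, so the rest of
-- the document is never loaded into memory.
readStartXRef :: FilePath -> IO (Maybe Integer)
readStartXRef path =
  withBinaryFile path ReadMode $ \h -> do
    size <- hFileSize h
    let tailLen = min size 1024  -- assumed tail size, not taken from the library
    hSeek h SeekFromEnd (negate tailLen)
    chunk <- BC.hGet h (fromIntegral tailLen)
    pure (findStartXRef chunk)
```

A caller could then `hSeek` to the returned offset and parse the cross reference section from there, again touching only the bytes it needs. Caching the resulting table is the sort of optimization the description leaves to higher-level layers.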


  • fix compilation on ghc 7.4, 7.6 and 7.8

  • switch to errors-2.0

  • add upper bound to errors dependency
  • support ghc-7.10.1

  • support 1- and 2-digit escape sequences in literal strings

  • add Functor and Applicative instances to fix AMP warnings
  • fix attoparsec module deprecation warnings
  • add scientific dependency (the latest attoparsec uses it for numbers)