Welcome to a simple demonstration of the Ghostscript txtwrite device. This device attempts to turn any supplied PostScript or PDF file into text.
How well this works will depend on many factors, including:
- The exact makeup of the file in question (text sent as bitmaps or a series of graphics operations cannot be mapped back to text).
- The fonts in use (fonts without a suitable encoding can mean it is impossible to map back from the glyphs in the file to recognisable characters).
- The layout of the text on the page (while Ghostscript will do its best to cope with strange rendering orders, there is still scope for it to be confused).
Despite these limitations, it can often provide useful results.
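
As a rough illustration of how the device might be driven from a script, the sketch below builds a Ghostscript command line that selects txtwrite and writes the extracted text to a file. The function names, the file names, and the assumption that the executable is called "gs" (on Windows it is typically "gswin64c") are all hypothetical choices made for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_txtwrite_command(input_path, output_path, gs_executable="gs"):
    """Return the argument list for a txtwrite run (hypothetical helper)."""
    return [
        gs_executable,                  # assumed name of the Ghostscript binary
        "-q",                           # suppress startup messages
        "-dNOPAUSE",                    # do not pause between pages
        "-dBATCH",                      # exit once the input has been processed
        "-sDEVICE=txtwrite",            # select the text extraction device
        f"-sOutputFile={output_path}",  # where the extracted text is written
        str(input_path),
    ]


def extract_text(input_path, output_path, gs_executable="gs"):
    """Run Ghostscript and return the extracted text (hypothetical helper)."""
    if shutil.which(gs_executable) is None:
        raise FileNotFoundError(f"{gs_executable} was not found on PATH")
    command = build_txtwrite_command(input_path, output_path, gs_executable)
    subprocess.run(command, check=True)
    # Undecodable bytes are replaced rather than raising, since output
    # quality depends on the fonts and encodings in the source file.
    return Path(output_path).read_text(encoding="utf-8", errors="replace")
```

Calling extract_text("input.pdf", "output.txt") would then return whatever text the device managed to recover, subject to the limitations listed above.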