Who is it for?
pdf2xml is for anyone who wants to extract text from a PDF, generate PDF page
thumbnails, and/or convert a PDF to Flash. The main reason for its development was
frustration with the shortcomings of FlashPaper.
What is it?
pdf2xml is a free text-extraction and SWF-creation tool for PDFs. It is basically a front end that uses Acrobat COM objects to extract text and bookmarks from
a PDF and uses swftools to turn the PDF pages into SWF movies. It then bundles all of that up into a web-friendly Flash presentation.
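The page-to-SWF step can be sketched as follows. This is only an illustration of driving the swftools `pdf2swf` command line once per page, not pdf2xml's actual code: the function names, the `page<N>.swf` naming scheme and the assumption that `pdf2swf` is on the PATH are all hypothetical.

```python
import subprocess
from pathlib import Path


def build_pdf2swf_command(pdf_path, page, out_dir, exe="pdf2swf"):
    """Return the argument list that converts a single PDF page to its own SWF."""
    # Hypothetical naming scheme: one file per page, e.g. out/page3.swf
    out_file = Path(out_dir) / f"page{page}.swf"
    # -p selects the page (or page range), -o names the output file
    return [exe, "-p", str(page), str(pdf_path), "-o", str(out_file)]


def convert_pages(pdf_path, page_count, out_dir):
    """Convert pages 1..page_count, one SWF per page, so each can load on demand."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for page in range(1, page_count + 1):
        # check=True raises CalledProcessError if pdf2swf exits with an error
        subprocess.run(build_pdf2swf_command(pdf_path, page, out_dir), check=True)
```

Producing one SWF per page, rather than one SWF for the whole document, is what would let a viewer fetch only the pages the reader actually opens.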
The end result of running pdf2xml on a PDF is a searchable, scalable, load-on-demand (no waiting for the entire PDF, or the sum of all its pages, to download), seamless
presentation that displays in any browser with Flash Player 8 or higher. The Flash pages, text search, thumbnails and bookmarks are all XML driven.
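To give a rough idea of what "XML driven" could look like, here is a minimal sketch that writes a per-page manifest a viewer might read to find each page's SWF, thumbnail and searchable text. The element and attribute names (`document`, `page`, `swf`, `thumb`) are invented for illustration; the format pdf2xml actually emits is not described here.

```python
import xml.etree.ElementTree as ET


def build_manifest(pages):
    """Build a hypothetical XML manifest with one <page> entry per PDF page.

    `pages` is a list of dicts with the keys "swf", "thumb" and "text".
    """
    root = ET.Element("document", pageCount=str(len(pages)))
    for number, page in enumerate(pages, start=1):
        node = ET.SubElement(
            root, "page",
            number=str(number),
            swf=page["swf"],      # page movie, fetched only when the page is viewed
            thumb=page["thumb"],  # thumbnail image for the page strip
        )
        node.text = page["text"]  # extracted text, used for search
    return ET.tostring(root, encoding="unicode")


manifest = build_manifest([
    {"swf": "page1.swf", "thumb": "thumb1.jpg", "text": "Introduction"},
    {"swf": "page2.swf", "thumb": "thumb2.jpg", "text": "Requirements"},
])
```

Because the manifest carries the text and file names for every page, a viewer can search the whole document and jump to any page while downloading only the SWFs it needs.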
Why pdf2xml?
- Why not just use Macromedia FlashPaper?
pdf2xml actually came about because of frustration over attempts to use Macromedia FlashPaper. Not only were links from PDFs not being converted,
but multiple attempts to resolve the matter with Adobe support ended in a final email from support stating that FlashPaper was never intended to be
used on files larger than 5 MB.
1 - Sorry, but just about all of my PDFs are larger than 5 MB.
2 - I need the links from my PDFs to carry over into the Flash version.
3 - I'd rather not wait for a 300-page SWF to fully download before I can start interacting with the movie.
- Why not just publish the PDF online?
1 - Ever accidentally click on a link that turns out to be a PDF, and when you realize the PDF is gigantic and try to hit the back button or close the
page, your browser just hangs or outright crashes?
2 - Ever need to view a PDF and get tired of waiting for the entire PDF to download before you can get to a later part of the document?
3 - Ever need to dynamically link to a certain page in a PDF?
4 - Ever get sick and tired of Acrobat Reader?
Where can I get it?
Here....
Requirements
pdf2xml makes use of Acrobat COM objects and the
Acrobat SDK to extract text, bookmarks and thumbnails. It also uses
swftools for SWF creation. swftools is packaged and installed with pdf2xml, so the only prerequisites for installing and running pdf2xml
are Adobe Acrobat 7.0 Pro or higher and .NET 2.0 (this is a Windows-only application).
- Windows XP Pro, Server 2003 or Vista
- Adobe Acrobat 7.0 Pro or higher
- .NET 2.0 or higher (if not already on the system, it is installed during the pdf2xml install)