import java.io.File;
import com.asprise.util.pdf.PDFReader;

PDFReader reader = new PDFReader(new File("my.pdf"));
reader.open(); // open the file.
int pages = reader.getNumberOfPages();
for (int i = 0; i < pages; i++) {
    // extract the text of page i and print it.
    String text = reader.extractTextFromPage(i);
    System.out.println("Page " + i + ": " + text);
}
... // perform other operations on pages.
reader.close(); // finally, close the file.
For example, the following PDF file:
results in the following extracted text:
Java PDF Read Write Extract Text: Reader/Writer/Extract Text Library... http://localhost/web/Asprise.COM/product/javapdf/index.php 1 of 1 3/30/2007 5:22 PM Java PDF Reader/Writer/Text Extract Library/Component/API Asprise offers PDF writer and reader library (with text extact function) as valued add-on to our flagship products – Asprise OCR & JTwain. Portable Document Format (PDF) is a file format widely used for all kinds of documents. With Asprise Java PDF library, you can easily create, manipulate (read and write), disassemble PDF files easily. You can also use it to extract text and then index the text extracted for search. By using this library and Asprise OCR engine, now you can OCR images stored in PDF files. Download Asprise Java PDF Library Trial For FREE Asprise PDF Library for ALL PLATFORMS Developer's Guide to Asprise PDF Library
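The extracted text above mentions indexing extracted text for search. As a rough illustration only, the sketch below builds a small in-memory inverted index (word to page numbers) from page strings. The class PageTextIndex and its methods are hypothetical names that are not part of the Asprise PDF library, and the sample strings stand in for whatever reader.extractTextFromPage(i) would return.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of the Asprise PDF library.
public class PageTextIndex {
    // Maps each lower-cased word to the set of page indexes containing it.
    private final Map<String, Set<Integer>> index = new TreeMap<>();

    // Adds the text of one page (e.g. the result of extractTextFromPage) to the index.
    public void addPage(int pageIndex, String text) {
        if (text == null) {
            return;
        }
        // Split on anything that is not a letter or a digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                index.computeIfAbsent(token, k -> new TreeSet<>()).add(pageIndex);
            }
        }
    }

    // Returns the page indexes containing the word (case-insensitive); empty if none.
    public Set<Integer> search(String word) {
        Set<Integer> pages = index.get(word.toLowerCase(Locale.ROOT));
        return pages == null ? new TreeSet<>() : new TreeSet<>(pages);
    }

    public static void main(String[] args) {
        PageTextIndex idx = new PageTextIndex();
        // Sample strings standing in for extracted page text.
        idx.addPage(0, "Portable Document Format (PDF) is a file format widely used for all kinds of documents.");
        idx.addPage(1, "You can OCR images stored in PDF files.");
        System.out.println(idx.search("pdf")); // prints [0, 1]
        System.out.println(idx.search("ocr")); // prints [1]
    }
}
```

In practice, each call to addPage would sit inside the page loop shown earlier, receiving the loop counter and the extracted text. A dedicated search engine library would be a more likely choice for large document sets; the map-based index here only shows the idea.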
For more details on the Asprise PDF library, please read the Developer's Guide or view the Javadoc.