How to read an existing PDF file in Java using the iText jar?

To read an existing PDF file with iText, first download the iText jar files and add them to the application classpath.

Steps:

1. Create a PdfReader instance.
2. Get the number of pages in the PDF.
3. Iterate through the pages of the PDF (page numbers start at 1).
4. Extract the page content using PdfTextExtractor.
5. Print the page content to the console.
6. Close the PdfReader.

Example:

PDFReadExample.java
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

/**
 * This class is used to read an existing
 * PDF file using the iText jar.
 * @author javawithease
 */
public class PDFReadExample {
    public static void main(String[] args) {
        PdfReader pdfReader = null;
        try {
            //Create PdfReader instance.
            pdfReader = new PdfReader("D:\\testFile.pdf");

            //Get the number of pages in pdf.
            int pages = pdfReader.getNumberOfPages();

            //Iterate the pdf through pages (page numbers start at 1).
            for (int i = 1; i <= pages; i++) {
                //Extract the page content using PdfTextExtractor.
                String pageContent =
                        PdfTextExtractor.getTextFromPage(pdfReader, i);

                //Print the page content on console.
                System.out.println("Content on Page "
                        + i + ": " + pageContent);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            //Close the PdfReader, even if reading failed part way.
            if (pdfReader != null) {
                pdfReader.close();
            }
        }
    }
}

Output:

Content on Page 1: Hello world, this a test pdf file.
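
If the extracted text is needed as a value rather than printed directly, the page loop from the example can be pulled into a small helper. The following is only a sketch: PageTextCollector, PageTextSource and collect are hypothetical names introduced here for illustration and are not part of iText. The extractor is passed in as a function so the loop can be run and checked without a PDF file or the iText jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class PageTextCollector {

    /** Hypothetical stand-in for a per-page text extractor; not an iText type. */
    @FunctionalInterface
    public interface PageTextSource {
        String textOf(int pageNumber) throws IOException;
    }

    /**
     * Builds one "Content on Page N: ..." line per page.
     * Pages are numbered from 1, matching the loop in the example above.
     */
    public static String collect(int pageCount, PageTextSource source) {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= pageCount; i++) {
            try {
                out.append("Content on Page ").append(i).append(": ")
                   .append(source.textOf(i)).append('\n');
            } catch (IOException e) {
                // Wrap the checked exception so callers need no throws clause.
                throw new UncheckedIOException("Failed to read page " + i, e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Fake two-page document so the sketch runs without iText on the classpath.
        String[] fakePages = {"Hello world", "Second page"};
        System.out.print(collect(fakePages.length, page -> fakePages[page - 1]));
    }
}
```

With iText on the classpath, the call would presumably look like collect(pdfReader.getNumberOfPages(), i -> PdfTextExtractor.getTextFromPage(pdfReader, i)), with the PdfReader still opened and closed as in the example.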
