com.itextpdf.text.pdf.parser
Class PdfTextExtractor

java.lang.Object
  extended by com.itextpdf.text.pdf.parser.PdfTextExtractor

public final class PdfTextExtractor
extends Object

Extracts text from a PDF file.

Since:
2.1.4

Method Summary
static String getTextFromPage(PdfReader reader, int pageNumber)
          Extract text from a specified page using the default strategy.
static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy)
          Extract text from a specified page using an extraction strategy.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getTextFromPage

public static String getTextFromPage(PdfReader reader,
                                     int pageNumber,
                                     TextExtractionStrategy strategy)
                              throws IOException
Extract text from a specified page using an extraction strategy.

Parameters:
reader - the reader to extract text from
pageNumber - the number of the page to extract text from (page numbers start at 1)
strategy - the strategy to use for extracting text
Returns:
the extracted text
Throws:
IOException - if any operation fails while reading from the provided PdfReader
Since:
5.0.2

getTextFromPage

public static String getTextFromPage(PdfReader reader,
                                     int pageNumber)
                              throws IOException
Extract text from a specified page using the default strategy.

Note: the default strategy is subject to change. If a specific strategy is required, use getTextFromPage(PdfReader, int, TextExtractionStrategy) instead.

Parameters:
reader - the reader to extract text from
pageNumber - the number of the page to extract text from (page numbers start at 1)
Returns:
the extracted text
Throws:
IOException - if any operation fails while reading from the provided PdfReader
Since:
5.0.2
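
Example (a minimal sketch, not part of the library): both overloads work on one page at a time, so extracting a whole document means calling them once per page. The helper below shows one way to do that. The names PageTextJoiner, PageText and joinPages are hypothetical and exist only for illustration. The per-page call is passed in as a callback so that the sketch compiles without iText on the classpath. The comment inside the class shows how the two getTextFromPage overloads might be plugged in. It assumes PdfReader.getNumberOfPages(), PdfReader.close() and SimpleTextExtractionStrategy from the same library, none of which are documented on this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

class PageTextJoiner {

    /** Hypothetical callback that returns the text of a single page. */
    @FunctionalInterface
    interface PageText {
        String get(int pageNumber) throws IOException;
    }

    // With iText on the classpath, the callback could wrap either overload
    // (assumed usage, not compiled here; "input.pdf" is a placeholder):
    //
    //   PdfReader reader = new PdfReader("input.pdf");
    //   String viaDefault = joinPages(reader.getNumberOfPages(),
    //           p -> PdfTextExtractor.getTextFromPage(reader, p));
    //   String viaStrategy = joinPages(reader.getNumberOfPages(),
    //           p -> PdfTextExtractor.getTextFromPage(
    //                   reader, p, new SimpleTextExtractionStrategy()));
    //   reader.close();

    /**
     * Calls the source for pages 1 through pageCount and joins the
     * results with newlines. An IOException from the source is rethrown
     * as UncheckedIOException with the failing page number attached.
     */
    static String joinPages(int pageCount, PageText source) {
        StringBuilder all = new StringBuilder();
        for (int page = 1; page <= pageCount; page++) {
            if (page > 1) {
                all.append('\n'); // separator between pages
            }
            try {
                all.append(source.get(page));
            } catch (IOException e) {
                throw new UncheckedIOException("Failed to extract page " + page, e);
            }
        }
        return all.toString();
    }
}
```

In the commented usage, a new strategy object is created for every page. This is a cautious choice made for the sketch, so that no state is shared between pages, and not a requirement stated on this page.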


Copyright © 2013. All Rights Reserved.