Class PDFDocument

  • All Implemented Interfaces:
    AutoCloseable

    public class PDFDocument
    extends Object
    implements AutoCloseable
    Utility class to verify the content of a PDF document.
    Since:
    14.4.2, 14.5
    Version:
    $Id: f6072815d37b6ab5fb550f1a06994ecf5d7822e5 $
    • Constructor Detail

      • PDFDocument

        public PDFDocument(URL url)
                    throws IOException
        Fetches and parses a PDF document from a given URL.
        Parameters:
        url - where to fetch the PDF document from
        Throws:
        IOException - if fetching or parsing the PDF document fails
      • PDFDocument

        public PDFDocument(URL url,
                           String userName,
                           String password)
                    throws IOException
        Fetches and parses a PDF document from a given URL, using the given user name and password to access it.
        Parameters:
        url - where to fetch the PDF document from
        userName - the user name used to access the PDF document
        password - the password used to access the PDF document
        Throws:
        IOException - if fetching or parsing the PDF document fails
        Since:
        14.10
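
        Because the class implements AutoCloseable, a document can be opened in a
        try-with-resources block so that it is closed even when a check fails. The
        following is a minimal sketch of that pattern, not code from the library:
        Document and Opener are hypothetical stand-ins that let the example compile
        without the library on the classpath. In real code, Document would be
        PDFDocument and opener.open(url) would be new PDFDocument(url).

```java
import java.io.IOException;
import java.net.URL;

public class OpenPdfSketch {
    /** Hypothetical stand-in for PDFDocument: one documented method plus close(). */
    public interface Document extends AutoCloseable {
        int getNumberOfPages();
    }

    /** Hypothetical factory standing in for the PDFDocument(URL) constructor. */
    public interface Opener {
        Document open(URL url) throws IOException;
    }

    /** Opens the document, reads its page count, and always closes it afterwards. */
    public static int countPages(Opener opener, URL url) throws Exception {
        // try-with-resources calls close() on both normal and exceptional exit.
        try (Document pdf = opener.open(url)) {
            return pdf.getNumberOfPages();
        }
    }
}
```

        For a protected document, the same pattern should apply to the three-argument
        constructor (available since 14.10), passing userName and password after the URL.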
    • Method Detail

      • getNumberOfPages

        public int getNumberOfPages()
        Returns:
        the number of pages in this PDF document
      • getTextFromPage

        public String getTextFromPage(int pageNumber)
                               throws IOException
        Parameters:
        pageNumber - the page number
        Returns:
        the text from the specified page
        Throws:
        IOException - if extracting the text of the specified page fails
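
        getNumberOfPages and getTextFromPage can be combined to locate the pages on
        which a phrase appears. The sketch below is illustrative only: PageText is a
        hypothetical stand-in that mirrors the two documented methods, and the
        firstPage parameter exists because this reference does not state whether page
        numbers start at 0 or at 1.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PageTextSketch {
    /** Hypothetical stand-in mirroring two documented PDFDocument methods. */
    public interface PageText {
        int getNumberOfPages();

        String getTextFromPage(int pageNumber) throws IOException;
    }

    /**
     * Returns the page numbers whose text contains the given phrase.
     * ASSUMPTION: valid page numbers run from firstPage to
     * firstPage + getNumberOfPages() - 1; the caller chooses 0 or 1.
     */
    public static List<Integer> pagesContaining(PageText pdf, String phrase, int firstPage)
            throws IOException {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < pdf.getNumberOfPages(); i++) {
            int pageNumber = firstPage + i;
            String text = pdf.getTextFromPage(pageNumber);
            // Guard against a null result, since the reference does not rule it out.
            if (text != null && text.contains(phrase)) {
                matches.add(pageNumber);
            }
        }
        return matches;
    }
}
```

        The match is a plain substring test. Extracted PDF text often carries line
        breaks that differ from the visual layout, so short phrases tend to be more
        reliable than whole sentences.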
      • getText

        public String getText()
                       throws IOException
        Returns:
        the entire text from this PDF document
        Throws:
        IOException - if extracting the text fails
      • getLinks

        public Map<String, String> getLinks()
                                          throws IOException
        Returns:
        a mapping between link labels and link targets
        Throws:
        IOException - if extracting the links from this PDF document fails
      • getLinksFromPage

        public Map<String, String> getLinksFromPage(int pageNumber)
                                                  throws IOException
        Parameters:
        pageNumber - the page number
        Returns:
        a mapping between link labels and link targets for the specified page
        Throws:
        IOException - if extracting the links from the specified page fails
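
        The map returned by getLinks or getLinksFromPage can be checked with ordinary
        collection code. As a hedged illustration, the helper below (a hypothetical
        name, not part of the library) lists the labels whose targets do not start
        with an expected prefix, for example the base URL of the site that generated
        the PDF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class LinkCheckSketch {
    /**
     * Returns the labels of links whose target is missing or does not start with
     * the expected prefix, in the iteration order of the given map.
     */
    public static List<String> labelsWithUnexpectedTargets(Map<String, String> links,
            String expectedPrefix) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, String> link : links.entrySet()) {
            String target = link.getValue();
            if (target == null || !target.startsWith(expectedPrefix)) {
                offenders.add(link.getKey());
            }
        }
        return offenders;
    }
}
```

        Since the result is keyed by label, it can hold only one target per label, so
        a check like this cannot tell apart two links that share the same label.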
      • getImagesFromPage

        public List<PDFImage> getImagesFromPage(int pageNumber)
                                         throws IOException
        Parameters:
        pageNumber - the page number
        Returns:
        the images from the specified page
        Throws:
        IOException - if extracting the images from the specified page fails