org.faceless.pdf2
Class PageExtractor.Text

java.lang.Object
  extended by org.faceless.pdf2.PageExtractor.Text
All Implemented Interfaces:
Comparable
Enclosing class:
PageExtractor

public abstract class PageExtractor.Text
extends Object
implements Comparable

A class representing a piece of text extracted by a PageExtractor. Each Text object has a location on the page, a font size, a font name, a color and its text content.

Since:
2.6.2

Constructor Summary
PageExtractor.Text()
           
 
Method Summary
abstract  int compareTo(Object o)
           
 AnnotationMarkup createAnnotationMarkup(String type)
          Create a new AnnotationMarkup of the specified type to cover this text.
 float getAngle()
          Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock.
abstract  Paint getColor()
          Return the color of this text
 float[] getCorners()
          Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left.
abstract  String getFontName()
          Return the font name of this text
abstract  float getFontSize()
          Return the font size of this text in points
 float getLength()
          Return the length of this Text in points.
abstract  float getOffset(int pos)
          Given an offset into the text, return the start position of that letter.
 PDFPage getPage()
          Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.
 PageExtractor getPageExtractor()
          Return the PageExtractor this text was created from
abstract  PageExtractor.Text getRowNext()
          Return the next Text item in this row, or null if there are none
abstract  PageExtractor.Text getRowPrevious()
          Return the previous Text item in this row, or null if there are none
abstract  String getText()
          Return the text content of this text
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PageExtractor.Text

public PageExtractor.Text()

Method Detail

getLength

public float getLength()
Return the length of this Text in points. This method measures the baseline of the text, so for rotated text the value will always be positive regardless of the angle.

Returns:
the length of the text in points at its baseline

getCorners

public final float[] getCorners()
Return the four corners (x1,y1) (x2,y2) (x3,y3) (x4,y4) of the quadrilateral that encompasses the text, specified clockwise from bottom left. The text baseline runs from (x1,y1) to (x4,y4).
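
As an illustration of working with the returned array, the following sketch computes the axis-aligned bounding box of the quadrilateral. It assumes the array holds eight floats in the order x1, y1, x2, y2, x3, y3, x4, y4, which matches the description above but is not stated explicitly. The class name CornerBounds and the method boundsOf are hypothetical helpers, not part of the library.

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of org.faceless.pdf2.
final class CornerBounds {

    /**
     * Compute the axis-aligned bounding box of a corner array.
     * Assumption: corners = {x1, y1, x2, y2, x3, y3, x4, y4}.
     */
    static Rectangle2D.Float boundsOf(float[] corners) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: four (x,y) pairs");
        }
        float minX = corners[0], maxX = corners[0];
        float minY = corners[1], maxY = corners[1];
        // Visit the remaining three corners and widen the box as needed.
        for (int i = 2; i < 8; i += 2) {
            minX = Math.min(minX, corners[i]);
            maxX = Math.max(maxX, corners[i]);
            minY = Math.min(minY, corners[i + 1]);
            maxY = Math.max(maxY, corners[i + 1]);
        }
        return new Rectangle2D.Float(minX, minY, maxX - minX, maxY - minY);
    }
}
```

With a real Text object the call would presumably be CornerBounds.boundsOf(text.getCorners()). For rotated text the box is larger than the quadrilateral itself.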


createAnnotationMarkup

public AnnotationMarkup createAnnotationMarkup(String type)
Create a new AnnotationMarkup of the specified type to cover this text. The annotation is created but not added to the page.

Parameters:
type - the type of markup - "Highlight", "Underline" etc.
Since:
2.8

getAngle

public final float getAngle()
Return the angle of rotation of this text on the page, in degrees clockwise from 12 o'clock. Most text is not rotated and so will return 0.

Returns:
the angle of the text

getFontSize

public abstract float getFontSize()
Return the font size of this text in points


getOffset

public abstract float getOffset(int pos)
Given an offset into the text, return the start position of that letter. Because text may not be on a horizontal line, this value is returned as a float in the range 0 to 1 (0 being the start of the text, 1 being the end). For the common case where the text is horizontal, you can calculate its start position like so:
 float left = text.getCorners()[0] + (text.getOffset(pos) * text.getLength());
 

Parameters:
pos - the position of the letter in the Text to retrieve the position for, in the range 0 to getText().length() - 1
Since:
2.6.12
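
For text that is not horizontal, the same idea can be extended to two dimensions. The sketch below interpolates along the baseline, which getCorners() describes as running from (x1,y1) to (x4,y4). It assumes the corner array is laid out as x1, y1, x2, y2, x3, y3, x4, y4 and that the value from getOffset is a linear fraction of the baseline. BaselineGeometry and its methods are hypothetical helpers, not part of the library.

```java
// Hypothetical helper, not part of org.faceless.pdf2.
final class BaselineGeometry {

    /**
     * Return the {x, y} point that lies the given fraction (0 to 1)
     * along the baseline from (x1,y1) to (x4,y4).
     * Assumption: corners = {x1, y1, x2, y2, x3, y3, x4, y4}.
     */
    static float[] pointAt(float[] corners, float offset) {
        if (corners == null || corners.length != 8) {
            throw new IllegalArgumentException("expected 8 values: four (x,y) pairs");
        }
        float x1 = corners[0], y1 = corners[1];
        float x4 = corners[6], y4 = corners[7];
        // Linear interpolation between the two ends of the baseline.
        return new float[] { x1 + offset * (x4 - x1), y1 + offset * (y4 - y1) };
    }

    /** Distance between the two baseline ends; never negative, whatever the rotation. */
    static float baselineLength(float[] corners) {
        return (float) Math.hypot(corners[6] - corners[0], corners[7] - corners[1]);
    }
}
```

With a real Text object the call would presumably be BaselineGeometry.pointAt(text.getCorners(), text.getOffset(pos)). For horizontal text the x value reduces to the one-line formula shown above.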

getPage

public PDFPage getPage()
Return the PDFPage this text was found on - simply the page the parent PageExtractor was created from.

Since:
2.6.12

getPageExtractor

public PageExtractor getPageExtractor()
Return the PageExtractor this text was created from

Since:
2.10.3

getColor

public abstract Paint getColor()
Return the color of this text

Returns:
the color

getFontName

public abstract String getFontName()
Return the font name of this text

Returns:
the name of the font

getText

public abstract String getText()
Return the text content of this text

Returns:
the text

compareTo

public abstract int compareTo(Object o)
Specified by:
compareTo in interface Comparable

getRowNext

public abstract PageExtractor.Text getRowNext()
Return the next Text item in this row, or null if there are none

Since:
2.10.3

getRowPrevious

public abstract PageExtractor.Text getRowPrevious()
Return the previous Text item in this row, or null if there are none

Since:
2.10.3
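
The row links can be used to collect a whole row starting from any item in it. The sketch below shows the traversal pattern only: it rewinds with getRowPrevious() until null is returned, then walks forward with getRowNext(). So that it runs without the library, it uses a hypothetical stand-in interface RowItem that mirrors the three relevant methods, plus a small in-memory Node class for demonstration. Neither is part of org.faceless.pdf2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demonstration code, not part of org.faceless.pdf2.
final class RowWalk {

    /** Stand-in mirroring getRowNext, getRowPrevious and getText of PageExtractor.Text. */
    interface RowItem {
        RowItem getRowNext();
        RowItem getRowPrevious();
        String getText();
    }

    /** Minimal in-memory implementation, for demonstration only. */
    static final class Node implements RowItem {
        private final String text;
        private Node next, previous;

        Node(String text) { this.text = text; }

        public RowItem getRowNext() { return next; }
        public RowItem getRowPrevious() { return previous; }
        public String getText() { return text; }

        /** Link the given nodes, in order, into one row. */
        static void link(Node... nodes) {
            for (int i = 0; i + 1 < nodes.length; i++) {
                nodes[i].next = nodes[i + 1];
                nodes[i + 1].previous = nodes[i];
            }
        }
    }

    /** Return the text of every item in the row that contains the given item. */
    static List<String> rowTexts(RowItem item) {
        List<String> out = new ArrayList<>();
        if (item == null) {
            return out;
        }
        // Rewind to the first item of the row.
        RowItem first = item;
        while (first.getRowPrevious() != null) {
            first = first.getRowPrevious();
        }
        // Walk forward until the row ends.
        for (RowItem t = first; t != null; t = t.getRowNext()) {
            out.add(t.getText());
        }
        return out;
    }
}
```

The same two loops should apply unchanged to PageExtractor.Text objects, since both methods are documented to return null at the ends of the row.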


Copyright © 2001-2008 Big Faceless Organization