org.htmlparser.lexerapplications.thumbelina

Class Thumbelina

public class Thumbelina extends JPanel implements Runnable, ItemListener, ChangeListener, ListSelectionListener

View images behind thumbnails.
Field Summary
protected boolean mActive
Activity state.
protected JCheckBox mBackgroundToggle
Background thread checkbox in status bar.
protected String mCurrentURL
The URL currently being examined.
protected boolean mDiscardCGI
If true, does not follow links containing cgi calls.
protected boolean mDiscardQueries
If true, does not follow links containing queries (?).
protected JList mHistory
History list.
protected JScrollPane mHistoryScroller
Scroller for the history list.
protected JSplitPane mMainArea
Main panel in central area.
protected PicturePanel mPicturePanel
The central area for pictures.
protected JScrollPane mPicturePanelScroller
Scroller for the picture panel.
protected JPanel mPowerBar
Status bar.
protected PropertyChangeSupport mPropertySupport
Bound property support.
protected JProgressBar mQueueProgress
Image request queue monitor in status bar.
protected JLabel mQueueSize
URL queue size display in status bar.
protected JProgressBar mReadyProgress
Image ready queue monitor in status bar.
protected HashMap mRequested
Images requested.
protected JCheckBox mRunToggle
Sequencer thread toggle in status bar.
protected Sequencer mSequencer
The picture sequencer.
protected JSlider mSpeedSlider
Sequencer speed slider in status bar.
protected Thread mThread
Background thread.
protected HashMap mTracked
Images being tracked currently.
protected JTextField mUrlText
URL report in status bar.
protected HashMap mVisited
URLs visited.
protected JLabel mVisitedSize
URL visited count display in status bar.
protected static URL[][] NONE
Value returned when no links are discovered.
static String PROP_CURRENT_URL_PROPERTY
Property name for current URL binding.
static String PROP_URL_QUEUE_PROPERTY
Property name for queue size binding.
static String PROP_URL_VISITED_PROPERTY
Property name for visited URL size binding.
Constructor Summary
Thumbelina()
Creates a new instance of Thumbelina.
Thumbelina(String url)
Creates a new instance of Thumbelina.
Thumbelina(URL url)
Creates a new instance of Thumbelina.
Method Summary
void addHistory(String url)
Adds the given url to the history list.
void addPropertyChangeListener(PropertyChangeListener listener)
Add a PropertyChangeListener to the listener list.
void append(URL url)
Append the given URL to the queue.
void append(ArrayList list)
Append the given URLs to the queue.
protected URL[][] extractImageLinks(Lexer lexer, URL docbase)
Get the links of an element of a document.
protected void fetch(URL[] images)
Fetch images.
protected ArrayList filter(URL[] urls)
Filter URLs and add to queue.
boolean getBackgroundThreadActive()
Gets the state of the background thread.
String getCurrentURL()
Return the URL currently being examined.
boolean getHistoryListVisible()
Gets the state of history list visibility.
protected URL[][] getImageLinks(URL url)
Get the image links from the current URL.
PicturePanel getPicturePanel()
Get the picture panel object encapsulated by this Thumbelina.
ArrayList getQueue()
Getter for property queue.
int getQueueSize()
Getter for the size of the queue.
boolean getSequencerActive()
Gets the state of the sequencer thread.
int getSpeed()
Get the sequencer delay time.
boolean getStatusBarVisible()
Gets the state of status bar visibility.
protected static void help()
Provide command line help.
boolean isDiscardCGI()
Getter for property discardCGI.
boolean isDiscardQueries()
Getter for property discardQueries.
protected boolean isImage(String url)
Check if the url looks like an image.
void itemStateChanged(ItemEvent event)
Handle checkbox events from the status bar.
static void main(String[] args)
Mainline.
protected void memCheck()
Check for low memory situation.
void open(String ref)
Open a URL.
void removePropertyChangeListener(PropertyChangeListener listener)
Remove a PropertyChangeListener from the listener list.
void reset()
Reset this Thumbelina.
void run()
The main processing loop.
void setBackgroundThreadActive(boolean active)
Sets the state of the background thread activity.
protected void setCurrentURL(String url)
Set the current URL being examined.
void setDiscardCGI(boolean discard)
Setter for property discardCGI.
void setDiscardQueries(boolean discard)
Setter for property discardQueries.
void setHistoryListVisible(boolean visible)
Sets the history list visibility.
void setSequencerActive(boolean active)
Sets the sequencer activity state.
void setSpeed(int speed)
Set the sequencer delay time.
void setStatusBarVisible(boolean visible)
Sets the status bar visibility.
void stateChanged(ChangeEvent event)
Handles the speed slider events.
protected void updateQueueSize(int original, int current)
Apply a change in 'to be examined' URL list size.
protected void updateVisitedSize(int original, int current)
Apply a change in 'visited' URL list size.
void valueChanged(ListSelectionEvent event)
Handles the history list events.

Field Detail

mActive

protected boolean mActive
Activity state. true means URLs are being processed; false means they are not.

mBackgroundToggle

protected JCheckBox mBackgroundToggle
Background thread checkbox in status bar.

mCurrentURL

protected String mCurrentURL
The URL currently being examined.

mDiscardCGI

protected boolean mDiscardCGI
If true, does not follow links containing cgi calls.

mDiscardQueries

protected boolean mDiscardQueries
If true, does not follow links containing queries (?).

mHistory

protected JList mHistory
History list.

mHistoryScroller

protected JScrollPane mHistoryScroller
Scroller for the history list.

mMainArea

protected JSplitPane mMainArea
Main panel in central area.

mPicturePanel

protected PicturePanel mPicturePanel
The central area for pictures.

mPicturePanelScroller

protected JScrollPane mPicturePanelScroller
Scroller for the picture panel.

mPowerBar

protected JPanel mPowerBar
Status bar.

mPropertySupport

protected PropertyChangeSupport mPropertySupport
Bound property support.

mQueueProgress

protected JProgressBar mQueueProgress
Image request queue monitor in status bar.

mQueueSize

protected JLabel mQueueSize
URL queue size display in status bar.

mReadyProgress

protected JProgressBar mReadyProgress
Image ready queue monitor in status bar.

mRequested

protected HashMap mRequested
Images requested.

mRunToggle

protected JCheckBox mRunToggle
Sequencer thread toggle in status bar.

mSequencer

protected Sequencer mSequencer
The picture sequencer.

mSpeedSlider

protected JSlider mSpeedSlider
Sequencer speed slider in status bar.

mThread

protected Thread mThread
Background thread.

mTracked

protected HashMap mTracked
Images being tracked currently.

mUrlText

protected JTextField mUrlText
URL report in status bar.

mVisited

protected HashMap mVisited
URLs visited.

mVisitedSize

protected JLabel mVisitedSize
URL visited count display in status bar.

NONE

protected static final URL[][] NONE
Value returned when no links are discovered.

PROP_CURRENT_URL_PROPERTY

public static final String PROP_CURRENT_URL_PROPERTY
Property name for current URL binding.

PROP_URL_QUEUE_PROPERTY

public static final String PROP_URL_QUEUE_PROPERTY
Property name for queue size binding.

PROP_URL_VISITED_PROPERTY

public static final String PROP_URL_VISITED_PROPERTY
Property name for visited URL size binding.

Constructor Detail

Thumbelina

public Thumbelina()
Creates a new instance of Thumbelina.

Thumbelina

public Thumbelina(String url)
Creates a new instance of Thumbelina.

Parameters: url Single URL to enter into the 'to follow' list.

Throws: MalformedURLException If the url is malformed.

Thumbelina

public Thumbelina(URL url)
Creates a new instance of Thumbelina.

Parameters: url URL to enter into the 'to follow' list.

Method Detail

addHistory

public void addHistory(String url)
Adds the given url to the history list. Also puts the URL in the url text of the status bar.

Parameters: url The URL to add to the history list.

addPropertyChangeListener

public void addPropertyChangeListener(PropertyChangeListener listener)
Add a PropertyChangeListener to the listener list. The listener is registered for all properties.

Parameters: listener The PropertyChangeListener to be added.

append

public void append(URL url)
Append the given URL to the queue. Adds the url only if it isn't already in the queue, and notifies listeners about the addition.

Parameters: url The url to add.
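
The add-if-absent and notify behaviour described above can be illustrated with a minimal sketch. It is not the Thumbelina source: the class UrlQueueSketch, the property name "queueSize", and the use of String keys in place of URL objects are all assumptions made for illustration. Only java.beans.PropertyChangeSupport and its firePropertyChange(String, int, int) method are real JDK API.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.ArrayList;

/** Hypothetical stand-in for the URL queue; not the Thumbelina source. */
class UrlQueueSketch {
    /** Assumed name; the actual value of PROP_URL_QUEUE_PROPERTY is not given here. */
    static final String QUEUE_PROPERTY = "queueSize";

    private final ArrayList<String> queue = new ArrayList<>();
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);

    void addPropertyChangeListener(PropertyChangeListener listener) {
        support.addPropertyChangeListener(listener);
    }

    /** Appends the URL unless it is already queued; returns true if it was added. */
    boolean append(String url) {
        int original;
        int current;
        synchronized (queue) {
            if (queue.contains(url)) {
                return false; // duplicate: no change, no notification
            }
            original = queue.size();
            queue.add(url);
            current = queue.size();
        }
        // Notify outside the lock so a slow listener cannot block other callers.
        support.firePropertyChange(QUEUE_PROPERTY, original, current);
        return true;
    }

    int size() {
        synchronized (queue) {
            return queue.size();
        }
    }

    public static void main(String[] args) {
        UrlQueueSketch q = new UrlQueueSketch();
        q.addPropertyChangeListener(evt -> System.out.println(
                evt.getPropertyName() + ": " + evt.getOldValue() + " -> " + evt.getNewValue()));
        q.append("http://example.com/a.html");
        q.append("http://example.com/a.html"); // ignored, already queued
    }
}
```

Strings are compared here rather than java.net.URL objects because URL.equals may perform host name resolution; whether Thumbelina makes the same choice is not stated in this page.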

append

public void append(ArrayList list)
Append the given URLs to the queue.

Parameters: list The list of URL objects to add.

extractImageLinks

protected URL[][] extractImageLinks(Lexer lexer, URL docbase)
Get the links of an element of a document. Only gets the links on IMG elements that reference another image. The latter is based on suffix (.jpg, .gif and .png).

Parameters: lexer The fully conditioned lexer, ready to rock. docbase The url to read.

Returns: The URLs that are the targets of the IMG links.

Throws: IOException If the underlying infrastructure throws it. ParserException If there is a problem parsing the url.

fetch

protected void fetch(URL[] images)
Fetch images. Ask the toolkit to make the image from a URL, and add a tracker to handle it when it's received. Add details to the requested and tracked lists and update the status bar.

Parameters: images The list of images to fetch.

filter

protected ArrayList filter(URL[] urls)
Filter URLs and add to queue. Removes already visited links and appends the rest (if any) to the visit pending list.

Parameters: urls The list of URLs to add to the 'to visit' list.

Returns: The filtered list.
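
As a rough illustration of the filtering step, the sketch below drops URLs that are already in a visited map and appends the remainder to a pending list. The class VisitedFilterSketch, its markVisited and pending helpers, and the use of String keys are hypothetical; the page only states that visited links are removed and the rest are queued.

```java
import java.util.ArrayList;
import java.util.HashMap;

/** Hypothetical sketch of "remove visited, queue the rest"; not the Thumbelina source. */
class VisitedFilterSketch {
    private final HashMap<String, String> visited = new HashMap<>();
    private final ArrayList<String> pending = new ArrayList<>();

    /** Records a URL as already visited (illustrative helper). */
    void markVisited(String url) {
        visited.put(url, url);
    }

    /** Returns the URLs not yet visited, after appending them to the pending list. */
    ArrayList<String> filter(String[] urls) {
        ArrayList<String> kept = new ArrayList<>();
        for (String url : urls) {
            if (!visited.containsKey(url)) {
                kept.add(url);
            }
        }
        if (!kept.isEmpty()) {
            pending.addAll(kept);
        }
        return kept;
    }

    /** Returns a copy of the pending list. */
    ArrayList<String> pending() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        VisitedFilterSketch f = new VisitedFilterSketch();
        f.markVisited("http://example.com/a.html");
        System.out.println(f.filter(new String[] {
            "http://example.com/a.html", "http://example.com/b.html"}));
    }
}
```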

getBackgroundThreadActive

public boolean getBackgroundThreadActive()
Gets the state of the background thread.

Returns: true if the thread is examining web pages.

getCurrentURL

public String getCurrentURL()
Return the URL currently being examined. This is a bound property. Notifications are available via the PROP_CURRENT_URL_PROPERTY property.

Returns: The URL currently being examined.

getHistoryListVisible

public boolean getHistoryListVisible()
Gets the state of history list visibility.

Returns: true if the history list is visible.

getImageLinks

protected URL[][] getImageLinks(URL url)
Get the image links from the current URL.

Parameters: url The URL to get the links from

Returns: An array of two URL arrays, index 0 is a list of images, index 1 is a list of links to possibly follow.

getPicturePanel

public PicturePanel getPicturePanel()
Get the picture panel object encapsulated by this Thumbelina.

Returns: The picture panel.

getQueue

public ArrayList getQueue()
Getter for property queue.

Returns: List of URLs that are to be visited.

getQueueSize

public int getQueueSize()
Getter for the size of the queue. This is a bound property. Notifications are available via the PROP_URL_QUEUE_PROPERTY property.

Returns: The size of the list of URLs that are to be visited.

getSequencerActive

public boolean getSequencerActive()
Gets the state of the sequencer thread.

Returns: true if the thread is pumping images.

getSpeed

public int getSpeed()
Get the sequencer delay time.

Returns: The number of milliseconds between image additions to the panel.

getStatusBarVisible

public boolean getStatusBarVisible()
Gets the state of status bar visibility.

Returns: true if the status bar is visible.

help

protected static void help()
Provide command line help.

isDiscardCGI

public boolean isDiscardCGI()
Getter for property discardCGI.

Returns: Value of property discardCGI.

isDiscardQueries

public boolean isDiscardQueries()
Getter for property discardQueries.

Returns: Value of property discardQueries.

isImage

protected boolean isImage(String url)
Check if the url looks like an image.

Parameters: url The url to check for image characteristics.

Returns: true if the url ends in a recognized image extension.
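
A suffix test of this kind might look like the sketch below. The extensions .jpg, .gif and .png are the ones named under extractImageLinks; the class name ImageSuffixSketch and the case-insensitive comparison are assumptions, not documented behaviour.

```java
import java.util.Locale;

/** Hypothetical suffix check; not the Thumbelina source. */
class ImageSuffixSketch {
    private static final String[] SUFFIXES = {".jpg", ".gif", ".png"};

    /** Returns true if the URL text ends in one of the recognized image suffixes. */
    static boolean looksLikeImage(String url) {
        // Assumption: case is ignored, so "PIC.JPG" also matches.
        String lower = url.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeImage("http://example.com/pic.JPG"));
        System.out.println(looksLikeImage("http://example.com/page.html"));
    }
}
```

A pure suffix test treats a URL such as pic.png?size=2 as a non-image, because the text no longer ends in the extension.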

itemStateChanged

public void itemStateChanged(ItemEvent event)
Handle checkbox events from the status bar. Based on the thread toggles, activates or deactivates the background thread processes.

Parameters: event The event describing the checkbox event.

main

public static void main(String[] args)
Mainline.

Parameters: args the command line arguments. Can be one or more forms of -help to get command line help, or a URL to prime the program with. Checks for JDK 1.4 and if not found runs in crippled mode (no ThumbelinaFrame).

memCheck

protected void memCheck()
Check for low memory situation. Report to the user a bad situation.
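
This page does not say how the low memory condition is detected. One plausible approach, sketched here purely as an assumption, compares the memory the JVM can still obtain against a fraction of its maximum heap using java.lang.Runtime. The class MemCheckSketch and the threshold parameter are hypothetical.

```java
/** Hypothetical low-memory test; not the Thumbelina source. */
class MemCheckSketch {
    /**
     * Returns true when the memory still available to the JVM falls below
     * the given fraction of the maximum heap.
     */
    static boolean isLow(long free, long total, long max, double threshold) {
        // Unused part of the current heap plus the room the heap may still grow into.
        long available = free + (max - total);
        return available < max * threshold;
    }

    /** Applies the test to the running JVM. */
    static boolean isLowNow(double threshold) {
        Runtime rt = Runtime.getRuntime();
        return isLow(rt.freeMemory(), rt.totalMemory(), rt.maxMemory(), threshold);
    }

    public static void main(String[] args) {
        // The 10% threshold is an arbitrary illustrative value.
        if (isLowNow(0.10)) {
            System.err.println("Warning: memory is running low.");
        }
    }
}
```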

open

public void open(String ref)
Open a URL. Resets the urls list and appends the given url as the only item.

Parameters: ref The URL to add.

removePropertyChangeListener

public void removePropertyChangeListener(PropertyChangeListener listener)
Remove a PropertyChangeListener from the listener list. This removes a PropertyChangeListener that was registered for all properties.

Parameters: listener The PropertyChangeListener to be removed.

reset

public void reset()
Reset this Thumbelina. Clears the sequencer of pending images, resets the picture panel, and empties the 'to be examined' list of URLs.

run

public void run()
The main processing loop. Pulls suspect URLs off the queue one at a time, fetches and parses each one, requests its images and enqueues further links.
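
The shape of such a loop can be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions: the PageSource interface stands in for fetching and parsing a page, the two-element array mirrors the return layout documented for getImageLinks (index 0 images, index 1 links), and the maxPages limit is added only so the sketch always terminates. None of these names come from the Thumbelina source.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical crawl loop; not the Thumbelina source. */
class CrawlLoopSketch {
    /** Stand-in for fetch and parse: result[0] holds image URLs, result[1] holds links to follow. */
    interface PageSource {
        String[][] linksOf(String url);
    }

    /** Visits pages breadth-first from start and returns the image URLs found. */
    static List<String> crawl(String start, PageSource source, int maxPages) {
        ArrayDeque<String> queue = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        List<String> images = new ArrayList<>();
        queue.add(start);
        while (!queue.isEmpty() && visited.size() < maxPages) {
            String url = queue.removeFirst();      // pull one URL off the queue
            if (!visited.add(url)) {
                continue;                          // already examined
            }
            String[][] found = source.linksOf(url); // fetch and parse
            for (String image : found[0]) {
                images.add(image);                 // stands in for requesting the image
            }
            for (String link : found[1]) {
                if (!visited.contains(link) && !queue.contains(link)) {
                    queue.addLast(link);           // enqueue further links
                }
            }
        }
        return images;
    }

    public static void main(String[] args) {
        PageSource site = page -> page.equals("http://example.com/")
            ? new String[][] {{"http://example.com/a.jpg"}, {"http://example.com/next.html"}}
            : new String[][] {{"http://example.com/b.png"}, {"http://example.com/"}};
        System.out.println(crawl("http://example.com/", site, 10));
    }
}
```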

setBackgroundThreadActive

public void setBackgroundThreadActive(boolean active)
Sets the state of the background thread activity. The background thread is responsible for examining URLs that are on the queue for thumbnails, and starting the image fetch operation.

Parameters: active If true, the background thread will be turned on.

setCurrentURL

protected void setCurrentURL(String url)
Set the current URL being examined.

Parameters: url The url that is being examined.

setDiscardCGI

public void setDiscardCGI(boolean discard)
Setter for property discardCGI.

Parameters: discard New value of property discardCGI.

setDiscardQueries

public void setDiscardQueries(boolean discard)
Setter for property discardQueries.

Parameters: discard New value of property discardQueries.
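
The effect of the discardCGI and discardQueries properties on link following might be sketched as below. The page does not define what counts as a "cgi call", so matching on the substring "/cgi-bin/" is an assumption, as are the class LinkPolicySketch and the method shouldFollow. Treating a "?" in the URL as a query follows the wording of the field description.

```java
import java.util.Locale;

/** Hypothetical link policy; not the Thumbelina source. */
class LinkPolicySketch {
    private final boolean discardCgi;
    private final boolean discardQueries;

    LinkPolicySketch(boolean discardCgi, boolean discardQueries) {
        this.discardCgi = discardCgi;
        this.discardQueries = discardQueries;
    }

    /** Returns false for links that the current settings say should not be followed. */
    boolean shouldFollow(String url) {
        if (discardQueries && url.indexOf('?') >= 0) {
            return false; // link carries a query string
        }
        // Assumption: a "cgi call" is recognized by "/cgi-bin/" in the URL.
        if (discardCgi && url.toLowerCase(Locale.ROOT).contains("/cgi-bin/")) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        LinkPolicySketch strict = new LinkPolicySketch(true, true);
        System.out.println(strict.shouldFollow("http://example.com/search?q=1"));
        System.out.println(strict.shouldFollow("http://example.com/gallery/index.html"));
    }
}
```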

setHistoryListVisible

public void setHistoryListVisible(boolean visible)
Sets the history list visibility.

Parameters: visible The new visibility state. If true, the history list will be unhidden.

setSequencerActive

public void setSequencerActive(boolean active)
Sets the sequencer activity state. The sequencer is the thread that moves images from the pending list to the picture panel on a timed basis.

Parameters: active The new activity state. If true, the sequencer will be turned on. This may alter the speed setting if it is set to zero.

setSpeed

public void setSpeed(int speed)
Set the sequencer delay time. The sequencer is the thread that moves images from the pending list to the picture panel on a timed basis. This value sets the number of milliseconds it waits between pictures. Setting it to zero toggles the running state off.

Parameters: speed The sequencer delay in milliseconds.
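
The interaction between the speed and the activity state can be modelled with a small state holder. This sketch captures only the two documented rules: a speed of zero turns the sequencer off, and activating the sequencer while the speed is zero alters the speed. The class SequencerStateSketch and the fallback value DEFAULT_DELAY_MS are assumptions; the value Thumbelina actually substitutes is not stated in this page.

```java
/** Hypothetical model of the speed/active coupling; not the Thumbelina source. */
class SequencerStateSketch {
    /** Assumed fallback delay used when activating with a zero speed. */
    static final int DEFAULT_DELAY_MS = 500;

    private int delayMs = 0;
    private boolean active = false;

    /** Sets the delay between pictures; zero switches the sequencer off. */
    void setSpeed(int speed) {
        if (speed < 0) {
            throw new IllegalArgumentException("speed must not be negative");
        }
        delayMs = speed;
        if (speed == 0) {
            active = false;
        }
    }

    /** Turns the sequencer on or off; turning it on replaces a zero delay. */
    void setActive(boolean on) {
        if (on && delayMs == 0) {
            delayMs = DEFAULT_DELAY_MS;
        }
        active = on;
    }

    int getSpeed() {
        return delayMs;
    }

    boolean isActive() {
        return active;
    }

    public static void main(String[] args) {
        SequencerStateSketch s = new SequencerStateSketch();
        s.setActive(true);
        System.out.println(s.getSpeed() + " ms, active=" + s.isActive());
        s.setSpeed(0);
        System.out.println(s.getSpeed() + " ms, active=" + s.isActive());
    }
}
```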

setStatusBarVisible

public void setStatusBarVisible(boolean visible)
Sets the status bar visibility.

Parameters: visible The new visibility state. If true, the status bar will be unhidden.

stateChanged

public void stateChanged(ChangeEvent event)
Handles the speed slider events.

Parameters: event The event describing the slider activity.

updateQueueSize

protected void updateQueueSize(int original, int current)
Apply a change in 'to be examined' URL list size. Sends notification via the PROP_URL_QUEUE_PROPERTY property and updates the status bar.

Parameters: original The original size of the list. current The new size of the list.

updateVisitedSize

protected void updateVisitedSize(int original, int current)
Apply a change in 'visited' URL list size. Sends notification via the PROP_URL_VISITED_PROPERTY property and updates the status bar.

Parameters: original The original size of the list. current The new size of the list.

valueChanged

public void valueChanged(ListSelectionEvent event)
Handles the history list events.

Parameters: event The event describing the list activity.

HTML Parser is an open source library released under LGPL. SourceForge.net