
Concurrent Thread-based Web Crawler


Jack Parsons

06/05/2016

Supervised by David W Walker; Moderated by Paul L Rosin

In this project you will develop a multi-threaded program for crawling the Web. Each web page encountered will be processed in a way that depends on its content. The software will offer options for constraining the crawl, for example by restricting it to a single web site.

To do this project you must be familiar with Java and programming with threads.
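
As a rough illustration of the kind of program described above, the following is a minimal sketch only, not the project's actual design. It assumes a fixed-size thread pool, a shared concurrent set of visited URLs, and a single-host constraint. The names `CrawlerSketch`, `fetchLinks`, and `allowedHost` are hypothetical, and the page-processing step is stood in for by a caller-supplied function that returns the links found on a page, so the example runs against an in-memory map rather than the real Web.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;

public class CrawlerSketch {
    // URLs already claimed by some worker; add() is atomic, so each URL is crawled once.
    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    private final ExecutorService pool;
    // Tracks outstanding page tasks; the initial party is the thread calling crawl().
    private final Phaser pending = new Phaser(1);
    // Hypothetical stand-in for "download the page, process it, return its links".
    private final Function<String, List<String>> fetchLinks;
    // Hypothetical constraint: only URLs on this host are followed.
    private final String allowedHost;

    public CrawlerSketch(int threads, String allowedHost,
                         Function<String, List<String>> fetchLinks) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.allowedHost = allowedHost;
        this.fetchLinks = fetchLinks;
    }

    /** Crawls from the seed and returns every in-scope URL that was visited. */
    public Set<String> crawl(String seed) {
        submit(seed);
        pending.arriveAndAwaitAdvance(); // blocks until every page task has finished
        pool.shutdown();
        return Set.copyOf(visited);
    }

    private void submit(String url) {
        // Skip out-of-scope URLs and URLs another worker has already claimed.
        if (!inScope(url) || !visited.add(url)) {
            return;
        }
        pending.register(); // registered before the parent task arrives, so the count never hits zero early
        pool.execute(() -> {
            try {
                for (String link : fetchLinks.apply(url)) {
                    submit(link);
                }
            } finally {
                pending.arriveAndDeregister();
            }
        });
    }

    private boolean inScope(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null && host.equalsIgnoreCase(allowedHost);
        } catch (IllegalArgumentException e) {
            return false; // malformed URL
        }
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" so the sketch runs without network access.
        Map<String, List<String>> web = Map.of(
            "http://example.org/", List.of("http://example.org/a", "http://other.test/x"),
            "http://example.org/a", List.of("http://example.org/", "http://example.org/b"),
            "http://example.org/b", List.of());
        CrawlerSketch crawler = new CrawlerSketch(
            4, "example.org", url -> web.getOrDefault(url, List.of()));
        System.out.println(new TreeSet<>(crawler.crawl("http://example.org/")));
    }
}
```

In a real crawler the `fetchLinks` function would download and parse each page; keeping it as a parameter makes the threading and the site constraint testable on their own. Note that a `Phaser` supports a bounded number of registered parties, so a production design would likely track outstanding work differently.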


Initial Plan (31/01/2016) [Zip Archive]

Final Report (06/05/2016) [Zip Archive]

Publication Form