Java - Code Samples

 Sample 1. Web Crawler using crawler4j - Crawler Controller

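import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;

// Class name is illustrative; the original sample shows only the statements below.
public class CrawlerControllerSample {

    public static void main(String[] args) throws Exception {
        // Folder where crawler4j stores its intermediate crawl data
        String crawlStorageFolder = "/data/crawl/t2";

        // Number of concurrent crawler threads
        int numberOfCrawlers = 5;

        // Crawl configuration
        CrawlConfig config = new CrawlConfig();
        config.setCrawlStorageFolder(crawlStorageFolder);
        config.setMaxDepthOfCrawling(1);   // seed is depth 0, so only pages linked from the seed
        config.setMaxPagesToFetch(-1);     // -1 means no limit on the number of pages
        config.setUserAgentString("JavaIndex");

        // Instantiate the controller for this crawl
        PageFetcher pageFetcher = new PageFetcher(config);
        RobotstxtConfig robotstxtConfig = new RobotstxtConfig();
        RobotstxtServer robotstxtServer = new RobotstxtServer(robotstxtConfig, pageFetcher);
        CrawlController controller = new CrawlController(config, pageFetcher, robotstxtServer);

        // Seed URL where the crawl starts
        controller.addSeed("http://www.buggybread.com");

        // Start the crawler threads; this call blocks until the crawl finishes.
        // MyCrawler is the WebCrawler subclass to run; it is not shown in this sample.
        controller.start(MyCrawler.class, numberOfCrawlers);

        // Exit the JVM once the crawl has finished
        System.exit(0);
    }
}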
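
The sample passes MyCrawler.class to controller.start but does not show that class. In crawler4j it would extend WebCrawler, and its shouldVisit method usually decides which discovered links are followed. The block below is a minimal sketch of that decision as a stand-alone helper that needs only the JDK. The name SeedUrlFilter, the seed prefix, and the extension list are assumptions made for illustration; they come from neither crawler4j nor the original sample.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper, not part of crawler4j or the original sample.
final class SeedUrlFilter {

    // Assumed prefix, taken from the seed URL used in the sample.
    private static final String SEED_PREFIX = "http://www.buggybread.com";

    // Common static and binary resources that are usually not worth parsing.
    private static final Pattern SKIPPED_EXTENSIONS =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|mp3|mp4|zip|gz|pdf)$");

    private SeedUrlFilter() {
    }

    // Returns true when the URL starts with the seed prefix
    // and does not end in one of the skipped extensions.
    static boolean shouldVisit(String url) {
        String href = url.toLowerCase(Locale.ROOT);
        return href.startsWith(SEED_PREFIX)
                && !SKIPPED_EXTENSIONS.matcher(href).matches();
    }
}
```

A MyCrawler implementation could call SeedUrlFilter.shouldVisit with the link's URL string from inside its own shouldVisit override. The plain prefix check is a simplification; a stricter filter would compare the parsed host name instead.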
