260 pages. 2011. LAP Lambert Academic Publishing. This book analyses the shortcomings of existing methods for extracting information from web pages. Our analysis shows that existing methods make inefficient use of the high-level information in these pages, which ultimately degrades their objective performance. We develop a series of optimized extraction techniques that improve on the state of the art. Experimental tests show that our techniques can outperform existing techniques on a wide range of data records.