Hack 98. Pull the HTML Source Code from a Web Site

Integrate web data into your application.

"Use a Browser Inside Access" [Hack #97] shows you how to use the Microsoft Web Browser control to display a web page. This hack takes that functionality a step further and shows how to get to the source code. Being able to access the source code makes it possible to extract data from a web site.

Figure 10-8 shows a web site displayed in the browser control, with a message box showing the site's HTML.

Figure 10-8. Reading the HTML source from a web site


The Microsoft Web Browser control has an extensive programmatic model. Visit http://msdn.microsoft.com/library/default.asp?url=/workshop/browser/prog_browser_node_entry.asp for more information.

The HTML is returned with this line of code:

   MsgBox Me.WebBrowser1.Document.documentElement.innerHTML
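
The line above only returns something useful once the page has loaded. The following is a minimal sketch, not part of the original hack, of one way to wait for the control before reading the HTML. It assumes the control is named WebBrowser1, as in the line above, and that the code lives in the form's module. GetPageHtml and its timeoutSeconds argument are hypothetical names chosen for illustration.

```vba
' Sketch only. GetPageHtml is a hypothetical helper, not part of the hack.
' Returns the page's HTML, or an empty string if the page has not
' finished loading within timeoutSeconds.
Private Function GetPageHtml(Optional ByVal timeoutSeconds As Single = 10) As String
    Const READYSTATE_COMPLETE As Long = 4   ' the control's "fully loaded" state
    Dim startTime As Single

    startTime = Timer

    ' Let Access keep processing events while the control loads the page.
    Do While Me.WebBrowser1.ReadyState <> READYSTATE_COMPLETE
        If Timer - startTime > timeoutSeconds Then Exit Function   ' give up; returns ""
        DoEvents
    Loop

    ' Document is Nothing until a page has been loaded.
    If Not Me.WebBrowser1.Document Is Nothing Then
        GetPageHtml = Me.WebBrowser1.Document.documentElement.innerHTML
    End If
End Function
```

With a helper like this in place, a button's Click event could call MsgBox GetPageHtml() instead of reading the property directly.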

 

The programmatic model for the web browser control follows the Document Object Model (DOM). As the browser displays a web site, documentElement and its child nodes become available. In this example, the page's HTML is read with the innerHTML property of documentElement, which returns the markup contained in the page's root html element. Because the HTML is accessible as a string, you can pass it to any routine you want. For example, you can have a routine that looks for HTML tables from which to pull data, or one that searches through the HTML for keywords.
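
The two kinds of routines just mentioned might look something like the sketch below. It is an illustration under assumptions rather than code from the original hack: it assumes a page has already finished loading in WebBrowser1, and PageContains and DumpTables are hypothetical helper names. The table routine walks the DOM (getElementsByTagName, rows, cells, and innerText) instead of parsing the HTML string itself.

```vba
' Sketch only. Both helpers are hypothetical and assume a loaded page.

' Returns True if the page's HTML contains the keyword (case-insensitive).
Private Function PageContains(ByVal keyword As String) As Boolean
    Dim html As String
    html = Me.WebBrowser1.Document.documentElement.innerHTML
    PageContains = (InStr(1, html, keyword, vbTextCompare) > 0)
End Function

' Prints the table, row, and cell index plus the text of every table
' cell on the page to the Immediate window.
Private Sub DumpTables()
    Dim tables As Object, tbl As Object, rw As Object
    Dim t As Long, r As Long, c As Long

    Set tables = Me.WebBrowser1.Document.getElementsByTagName("table")

    For t = 0 To tables.length - 1
        Set tbl = tables.Item(t)
        For r = 0 To tbl.rows.length - 1
            Set rw = tbl.rows.Item(r)
            For c = 0 To rw.cells.length - 1
                Debug.Print t, r, c, rw.cells.Item(c).innerText
            Next c
        Next r
    Next t
End Sub
```

Note that getElementsByTagName also returns nested tables, so a cell that contains another table reports that inner table's text as part of its own. In a real application, you would likely write the cell values to an Access table rather than the Immediate window.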
