Yet Another Project

I have way too many projects as it is, but I keep adding to the pile.  For quite some time, I’ve been lamenting the fact that there is no good headless browser for the .NET runtime (IMO).  You can drive a full-fledged browser (WatiN), use a simple implementation (SimpleBrowser, etc.), or try to host a Java-based browser inside your .NET app (HtmlUnit).  I have written plenty of web automation using libraries like these, but they usually run into issues.  Using WatiN in a multi-threaded content crawler, for example, usually results in memory exhaustion or a bunch of COM/Interop exceptions when trying to walk 90,000 pages.

So this finally prompted me to take the plunge and try writing a purely .NET, headless web browser.  It sounds crazy, but one day in I am a lot further along than I expected.  The great thing about modern software development is that many of the components you need are usually already out there.  My goals:

  1. Develop a fully functional, fully compliant, purely .NET Level 3 Document Object Model.  I think I’ve managed to achieve Level 1 compliance in the first day, and I am attempting to add Level 2 and Level 3 compliance today.
  2. Fully support JavaScript and CSS.  JavaScript support is provided by the Jint library.
  3. Retain a small memory footprint.

After one day the browser is able to make simple requests and run simple JavaScript tasks (jQuery support is still a way off, though):

// Configure a headless browser instance.
var browser = new WebBrowser
{
    EnableJavascript = true,
    EnableDotNet = false,
    Timeout = 15000,
    UserAgent = new Firefox9Agent()
};

// Load the page, fill in the search box on the first form, and click the search button.
browser.NavigateTo("http://www.google.com/");
browser.CurrentDocument.Forms[0]
    .Populate(f => f.SetValue(f.FindById("q"), "KLF Web Browser"))
    .Click(e => e.id.Equals("btnG"));
