What is SimpleML?
SimpleML is a library for making asynchronous requests for raw data from web pages. It is designed for use with some examples from this book (see chapter 18).
The XML parsing functionality of this library is extremely minimal. For more sophisticated XML parsing, you should use either Processing’s XML library or proXML.
The library
Download the library here. Source code included in the zip! When all is said and done the directory structure should look like:
/Processing/libraries/simpleML/library/simpleML.jar
For more about installing libraries, visit the libraries tutorial.
With simpleML, you can:
Make HTMLRequest and XMLRequest objects:
HTMLRequest req = new HTMLRequest(this,"http://www.yahoo.com");
XMLRequest xreq = new XMLRequest(this,"http://rss.news.yahoo.com/rss/topstories");
Get the request process going:
req.makeRequest();
xreq.makeRequest();
Retrieve the raw data from HTML requests this way:
void netEvent(HTMLRequest ml) {
  String html = ml.readRawSource();
  println(html);
}
Retrieve data from XML feeds in three different ways, like so:
void netEvent(XMLRequest ml) {
  // Getting the text of one Element
  String element = ml.getElementText("elementTag");
  // Getting the text of one Attribute from one Element
  String attribute = ml.getElementAttributeText("elementTag","attributeTag");
  // Getting an array of all XML elements
  String[] headlines = ml.getElementArray("elementTag");
}
If you want to do more with XML, you’ll want to investigate Processing’s XML library or Christian Riekoff’s proXML library.
And now, for some further information and explanation. . . .
Asynchronous HTML requests
The loadStrings() function can be used for retrieving raw data from web pages. However, unless your sketch only needs to load the data once during setup(), you may have a problem. For example, consider an applet that grabs the price of AAPL stock from an XML feed every 5 minutes. Each time loadStrings() is called, the applet will pause while waiting to receive the data. Any animation will stutter. This is because loadStrings() is a “blocking” function; in other words, the applet will sit and wait at that line of code until loadStrings() completes its task. With a local text file, this is extremely fast. An HTTP request for a web page, however, depends on the network and on the web server, which can take its time getting back to you with the data requested. Who knows how long loadStrings() will take!
One way to get around this would be to write your own Thread that makes the request separately, allowing your program to multitask and continue animating while the Thread finishes its work. A brief tutorial on threading is available here. To keep things simple, however, I have created a library, entitled SimpleML, that makes HTTP requests asynchronously and without blocking.
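To make the threading idea concrete, here is a minimal sketch in plain Java (the language Processing is built on). It is not SimpleML's source code: the class BackgroundFetch and its methods start(), poll(), and await() are hypothetical names chosen for illustration, and the slow web request is simulated with a short sleep rather than a real HTTP call.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper (not part of SimpleML): runs a slow task on its own
// thread so the caller can keep animating while it waits.
class BackgroundFetch {
  private final AtomicReference<String> result = new AtomicReference<>(null);
  private final Thread worker;

  BackgroundFetch(Supplier<String> slowTask) {
    // The worker stores the task's return value once the task finishes.
    worker = new Thread(() -> result.set(slowTask.get()));
    worker.setDaemon(true);
  }

  // Begin the work; returns immediately.
  void start() {
    worker.start();
  }

  // Non-blocking check: null until the worker has finished.
  String poll() {
    return result.get();
  }

  // Blocking wait, useful at shutdown or in tests.
  String await() {
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return result.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // Assumption: a 300 ms sleep stands in for a slow web request.
    BackgroundFetch fetch = new BackgroundFetch(() -> {
      try {
        Thread.sleep(300);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return "<html>pretend page</html>";
    });
    fetch.start();

    int frames = 0;
    while (fetch.poll() == null) {
      frames++;          // stands in for one pass through draw()
      Thread.sleep(10);
    }
    System.out.println("Drew " + frames + " frames while waiting; got: " + fetch.poll());
  }
}
```

The loop keeps counting frames while the worker sleeps, which is the behavior a blocking call cannot give you. SimpleML is meant to spare you from writing this kind of bookkeeping yourself.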
This library functions in much the same way as the Processing Serial and Video libraries. To retrieve a web page, you must create an instance of an HTMLRequest object, passing in a reference to the parent applet, i.e. “this”, as well as a String containing the URL you want to request.
HTMLRequest req = new HTMLRequest(this,"http://www.yahoo.com");
The request will not begin, however, until you call makeRequest().
req.makeRequest();
Finally, to receive the data, you must implement netEvent(), which will execute as soon as the data becomes available. This function is known as a “callback.” It is the same as other callbacks in Processing, such as mousePressed(), keyPressed(), serialEvent(), etc. When the user clicks the mouse, the code inside of mousePressed() is executed. When an HTML Request is finished, the code inside of netEvent() is executed.
The data is retrieved as a String via the function readRawSource():
void netEvent(HTMLRequest ml) {
  String html = ml.readRawSource();
  println(html);
}
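How does a library find and call a function like netEvent() in your sketch? Processing libraries commonly look the method up by name on the parent object using Java reflection. Whether SimpleML does exactly this is an assumption; the plain-Java sketch below only illustrates the general pattern, and CallbackRequest, CallbackSketch, and fireNetEvent() are hypothetical names.

```java
import java.lang.reflect.Method;

// Stands in for a sketch: it "implements" the callback simply by
// declaring a method with the expected name and parameter type.
class CallbackSketch {
  String received = "";

  void netEvent(CallbackRequest r) {
    received = r.readRawSource();
  }

  public static void main(String[] args) {
    CallbackSketch sketch = new CallbackSketch();
    new CallbackRequest(sketch, "hello").fireNetEvent();
    System.out.println(sketch.received); // prints "hello"
  }
}

// Hypothetical stand-in for a request object; not SimpleML's source code.
class CallbackRequest {
  private final Object parent;
  private final String payload;

  CallbackRequest(Object parent, String payload) {
    this.parent = parent;
    this.payload = payload;
  }

  String readRawSource() {
    return payload;
  }

  // Look up parent.netEvent(CallbackRequest) by name and invoke it.
  // Returns false if the parent has no such method or the call fails.
  boolean fireNetEvent() {
    try {
      Method m = parent.getClass().getDeclaredMethod("netEvent", CallbackRequest.class);
      m.setAccessible(true);
      m.invoke(parent, this);
      return true;
    } catch (ReflectiveOperationException e) {
      return false;
    }
  }
}
```

In this pattern a parent that never declares netEvent() is simply skipped, so the callback is optional from the caller's point of view.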
Following is an example that retrieves Yahoo’s homepage every 5 seconds. (See: Example 18-7.)
import simpleML.*;

// A Request object, from the library
HTMLRequest htmlRequest;

int startTime;    // for the timer to make a request every N seconds
String html = ""; // String to hold data from request
int counter = 0;  // Counter to animate rectangle across window
int back = 255;   // Background brightness

void setup() {
  size(200,200);
  // Create and make an asynchronous request
  htmlRequest = new HTMLRequest(this,"http://www.yahoo.com");
  htmlRequest.makeRequest();
  startTime = millis();
  background(0);
}

void draw() {
  // Fill background
  background(back);
  // Every 5 seconds, make a new request
  int now = millis();
  if (now - startTime > 5000) {
    htmlRequest.makeRequest();
    println("Making request!");
    startTime = now;
  }
  // Draw some lines with colors based on characters from data retrieved
  for (int i = 0; i < width; i++) {
    if (i < html.length()) {
      int c = html.charAt(i);
      stroke(c,150);
      line(i,0,i,height);
    }
  }
  // Animate rectangle and dim background
  fill(255);
  noStroke();
  rect(counter,0,10,height);
  counter = (counter + 1) % width;
  back = constrain(back - 1,0,255);
}

// When a request is finished
void netEvent(HTMLRequest ml) {
  html = ml.readRawSource();     // Read the raw data
  back = 255;                    // Reset background
  println("Request completed!"); // Print message
}
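The timer in draw() compares the current millis() value against the time of the last request and resets itself whenever the interval has passed. That logic can be isolated as follows, in plain Java. IntervalTimer and isDue() are hypothetical names, and the time values are passed in by hand instead of coming from millis().

```java
// Hypothetical helper illustrating the millis()-based timer used in draw().
class IntervalTimer {
  private final int intervalMs;
  private int startTime;

  IntervalTimer(int intervalMs, int now) {
    this.intervalMs = intervalMs;
    this.startTime = now;
  }

  // True once more than intervalMs has elapsed since the last restart;
  // restarts the timer whenever it returns true.
  boolean isDue(int now) {
    if (now - startTime > intervalMs) {
      startTime = now;
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    IntervalTimer timer = new IntervalTimer(5000, 0);
    System.out.println(timer.isDue(3000)); // false
    System.out.println(timer.isDue(5001)); // true
    System.out.println(timer.isDue(6000)); // false
  }
}
```

Because the check runs once per frame and never waits, the animation keeps going between requests.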
Asynchronous XML requests
The examples in section 11.5 (of my book) demonstrate the process of manually searching through text for individual pieces of data. Retrieving the raw XML data from http://xml.weather.yahoo.com/forecastrss?p=USNY0996 and parsing it for temperature information was admittedly a bit silly. Sure, if you need information from an HTML page, manual means are required. Unfortunately, this is hard work: HTML pages are inconsistently formatted and difficult to reverse engineer and parse effectively. XML (Extensible Markup Language), however, is designed to facilitate the sharing of data across different systems.
XML organizes information in a tree structure.

Let's look at the XML for Yahoo Weather's RSS feed (only part of the source is shown, to simplify the discussion).
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <channel>
    <item>
      <title>Conditions for New York, NY at 3:51 pm EST</title>
      <geo:lat>40.67</geo:lat>
      <geo:long>-73.94</geo:long>
      <link>
        http://us.rd.yahoo.com/dailynews/rss/weather/New_York__NY/*http://xml.weather.yahoo.com/forecast/USNY0996_f.html
      </link>
      <pubDate>Mon, 20 Feb 2006 3:51 pm EST</pubDate>
      <yweather:condition text="Fair" code="34" temp="35" date="Mon, 20 Feb 2006 3:51 pm EST"/>
      <yweather:forecast day="Mon" date="20 Feb 2006" low="25" high="37" text="Clear" code="31"/>
      <guid isPermaLink="false">USNY0996_2006_02_20_15_51_EST</guid>
    </item>
  </channel>
</rss>
With the exception of the first line (which simply indicates that the document is XML formatted), this XML document contains a nested structure consisting of elements, each with a start tag, e.g. <channel>, and an end tag, e.g. </channel>. Some of these elements have content between the tags, for example:
<title>Conditions for New York, NY at 3:51 pm EST</title>
and some have attributes:
<yweather:forecast day="Mon" date="20 Feb 2006" low="25" high="37" text="Clear" code="31"/>
It should be fairly obvious how searching for information, such as the title of the page or the high temperature, will be significantly less painful than the tragically arbitrary process of parsing HTML. In fact, Java provides an API specifically for this purpose: http://java.sun.com/webservices/jaxp/index.jsp. There is also an XML library entitled "proXML" available for Processing. For advanced reading of XML documents, these libraries will be required. In this book, however, we will start small and use a very simple library with basic XML functionality. It will allow you to do three things:
- Retrieve the text from one XML element as a String.
- Retrieve the text from one attribute of an XML element as a String.
- Retrieve the text from many XML elements (with the same tag) as an array of Strings.
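For comparison, the same three operations can be sketched with Java's standard JAXP DOM classes. This is plain Java, not SimpleML: the class JaxpSketch and the helper names elementText(), attributeText(), and elementArray() are hypothetical, and the XML is a trimmed, hard-coded copy of the weather feed rather than a live request.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class JaxpSketch {
  // Trimmed copy of the weather feed, hard-coded so no network is needed.
  static final String XML =
      "<rss version=\"2.0\" xmlns:yweather=\"http://xml.weather.yahoo.com/ns/rss/1.0\""
    + " xmlns:geo=\"http://www.w3.org/2003/01/geo/wgs84_pos#\"><channel><item>"
    + "<title>Conditions for New York, NY at 3:51 pm EST</title>"
    + "<geo:lat>40.67</geo:lat>"
    + "<yweather:forecast day=\"Mon\" low=\"25\" high=\"37\" text=\"Clear\"/>"
    + "</item></channel></rss>";

  // Parse an XML string into a DOM tree.
  static Document parse(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse XML", e);
    }
  }

  // 1. Text of the first element with this tag, or null if there is none.
  static String elementText(Document doc, String tag) {
    NodeList nodes = doc.getElementsByTagName(tag);
    return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
  }

  // 2. One attribute of the first element with this tag, or null if there is none.
  static String attributeText(Document doc, String tag, String attr) {
    NodeList nodes = doc.getElementsByTagName(tag);
    return nodes.getLength() > 0 ? ((Element) nodes.item(0)).getAttribute(attr) : null;
  }

  // 3. Text of every element with this tag, as an array of Strings.
  static String[] elementArray(Document doc, String tag) {
    NodeList nodes = doc.getElementsByTagName(tag);
    String[] out = new String[nodes.getLength()];
    for (int i = 0; i < out.length; i++) {
      out[i] = nodes.item(i).getTextContent();
    }
    return out;
  }

  public static void main(String[] args) {
    Document doc = parse(XML);
    System.out.println(elementText(doc, "geo:lat"));                      // 40.67
    System.out.println(attributeText(doc, "yweather:forecast", "high")); // 37
    System.out.println(elementArray(doc, "title").length);                // 1
  }
}
```

Even this small amount of setup shows why a simpler wrapper is convenient when all you need is a handful of values from a feed.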
Requests for XML data are made via an XMLRequest object.
XMLRequest req = new XMLRequest(this,"http://xml.weather.yahoo.com/forecastrss?p=10003");
Again, the request will not begin until you call makeRequest().
req.makeRequest();
And yet again, to receive the data from the XML request, you must implement netEvent(), this time with an XMLRequest as its argument. To retrieve the data from one XML element, call getElementText(). For an attribute, call getElementAttributeText().
The XML data:
<geo:lat>40.67</geo:lat>
<yweather:forecast day="Mon" date="20 Feb 2006" low="25" high="37" text="Clear" code="31"/>
The code to grab the data inside the XML tags:
void netEvent(XMLRequest ml) {
  // Getting the text of one Element
  String lat = ml.getElementText("geo:lat");
  // Getting the text of one Attribute from one Element
  String temperature = ml.getElementAttributeText("yweather:forecast","high");
  println("Latitude is: " + lat);
  println("The high temperature is: " + temperature);
}
The method getElementArray() can also be called to retrieve XML elements that appear multiple times. The following example grabs all of the headlines from Yahoo's Top Stories XML Feed. (See: Example 18-8.)
import simpleML.*;

XMLRequest xmlRequest;

void setup() {
  size(200,200);
  // Creating and starting the request
  xmlRequest = new XMLRequest(this,"http://rss.news.yahoo.com/rss/topstories");
  xmlRequest.makeRequest();
}

void draw() {
  background(0);
}

// When the request is complete
void netEvent(XMLRequest ml) {
  // Retrieving an array of all XML elements inside "<title*>" tags
  String[] headlines = ml.getElementArray("title");
  for (int i = 0; i < headlines.length; i++) {
    println(headlines[i]);
  }
}








