11.1 RDF and C#

When Microsoft moved to its .NET architecture, one of the components released with it was the Common Language Runtime (CLR), a platform capable of supporting multiple programming languages. The first language released for it was C#, a hybrid of C++ and Java.

If you're running Linux, you don't need Microsoft's .NET to compile C# code; you can compile it with the C# compiler provided with Mono, an open source CLR alternative. Download Mono at Ximian's Mono site, http://www.go-mono.com/.

When I was looking around for application environments that support RDF/XML, I checked for a C# or .NET-based environment, not really expecting to find anything. However, I found more than one product, including an easy-to-install, lightweight C# parser named Drive.

Drive can be downloaded at http://www.daml.ri.cmu.edu/drive/news.html. According to a news release at the site, the API has been updated to the newest RDF specification.

Drive is a relatively uncomplicated API, providing three major classes:

Softagents.Drive.RDFEdge

Represents an edge (arc) within an RDF graph. Variables include m_Sourcenode and m_Destnode, representing the source and destination node of the arc, respectively.

Softagents.Drive.RDFGraph

Stores and manages the entire graph.

Softagents.Drive.RDFNode

Represents a node within an RDF graph. Variables include m_Edges, which holds all arcs associated with the node. Methods include GetEdges, GetIncomingEdges, GetOutgoingEdges, and so on.
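
To make the relationship among the three classes concrete, here is a minimal, self-contained sketch of a node/edge/graph model. It is not Drive's source code: the SketchNode, SketchEdge, and SketchGraph classes, their members, and the example.org URIs are all hypothetical names I've made up for illustration. The sketch only shows how a node can hold every edge that touches it and then filter that list by direction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT Drive's classes.
class SketchNode
{
    public readonly string Uri;
    // Every edge touching this node, whether incoming or outgoing.
    public readonly List<SketchEdge> Edges = new List<SketchEdge>();

    public SketchNode(string uri) { Uri = uri; }

    // Edges for which this node is the source (subject).
    public List<SketchEdge> GetOutgoingEdges()
    {
        return Edges.FindAll(e => e.Source == this);
    }

    // Edges for which this node is the destination (object).
    public List<SketchEdge> GetIncomingEdges()
    {
        return Edges.FindAll(e => e.Dest == this);
    }
}

class SketchEdge
{
    public readonly SketchNode Source;
    public readonly SketchNode Dest;
    public readonly string Predicate;

    public SketchEdge(SketchNode source, string predicate, SketchNode dest)
    {
        Source = source;
        Predicate = predicate;
        Dest = dest;
    }
}

class SketchGraph
{
    private readonly Dictionary<string, SketchNode> nodes =
        new Dictionary<string, SketchNode>();

    // Returns the node for a URI, creating it on first use.
    public SketchNode GetOrAddNode(string uri)
    {
        SketchNode node;
        if (!nodes.TryGetValue(uri, out node))
        {
            node = new SketchNode(uri);
            nodes[uri] = node;
        }
        return node;
    }

    // Records one statement and registers the edge on both endpoints.
    public SketchEdge AddEdge(string subjectUri, string predicate, string objectUri)
    {
        SketchNode s = GetOrAddNode(subjectUri);
        SketchNode o = GetOrAddNode(objectUri);
        SketchEdge edge = new SketchEdge(s, predicate, o);
        s.Edges.Add(edge);
        if (!ReferenceEquals(s, o)) o.Edges.Add(edge);
        return edge;
    }
}

class GraphSketchDemo
{
    static void Main()
    {
        SketchGraph g = new SketchGraph();
        // Illustrative statement; the URIs are placeholders.
        g.AddEdge("http://example.org/a", "http://example.org/p", "http://example.org/b");

        SketchNode a = g.GetOrAddNode("http://example.org/a");
        Console.WriteLine("outgoing: " + a.GetOutgoingEdges().Count);
        Console.WriteLine("incoming: " + a.GetIncomingEdges().Count);
    }
}
```

Storing each edge on both of its endpoints is one possible design choice; it makes direction-specific lookups a simple filter at the cost of keeping two references per edge.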

To work with a graph, first create an instance of RDFGraph, reading in an RDF/XML document. Once it is read in, you can then query information from the graph, such as accessing a node with a URI and then querying for the edges related to that node.

Example 11-1 shows a small application that pulls the URL for an RDF/XML document from the command line and then loads this document into a newly created RDFGraph object. Next, the RDFGraph method GetNode is called, passing in the URI for the resource and getting back an RDFNode object for that resource. The GetEdges method is then called on the node, returning an ArrayList of RDFEdge objects. The namespace and local name of each edge are accessed, concatenated, and printed out to the console. Finally, another RDFGraph method, PrintNTriples, is called to print out all of the N-Triples from the model.

Example 11-1. Printing out the edges for a given node using the Drive C# parser
/*****************************************************************************
 * PracticalRDF
 ******************************************************************************/
using System;
using System.Collections;
using Softagents.Drive;

namespace PracticalRDF
{
	/// <summary>
	/// Prints the edges for one node, then dumps the whole graph as N-Triples.
	/// </summary>
	public class PracticalRDF
	{
		[STAThread]
		static void Main(string[] args)
		{
			// check argument count
			if (args.Length < 1)
			{
				Console.WriteLine("Usage: PracticalRDF <inputfile.rdf>");
				return;
			}

			// read in RDF/XML document
			RDFGraph rg = new RDFGraph();
			rg.BuildRDFGraph(args[0]);

			// find specific node
			RDFNode rNode = rg.GetNode("http://burningbird.net/articles/monsters1.htm");
			ArrayList arrEdges = rNode.GetEdges();

			// print each edge's predicate as namespace + local name
			foreach (RDFEdge rEdge in arrEdges)
			{
				Console.WriteLine(rEdge.m_lpszNameSpace + rEdge.m_lpszEdgeLocalName);
			}

			// dump all N-Triples
			Console.WriteLine("\nN Triples\n");
			rg.PrintNTriples();
		}
	}
}

After compilation, the application is executed, passing in the URL of the RDF/XML document:

PracticalRDF http://burningbird.net/articles/monsters1.rdf

The parser returns a warning about a possible redefinition of a node ID for each of the major resources, but this doesn't affect processing:

Warning: Possible redefinition of Node ID=http://burningbird.net/articles/monsters1.htm! Ignoring.
Warning: Possible redefinition of Node ID=http://burningbird.net/articles/monsters2.htm! Ignoring.
Warning: Possible redefinition of Node ID=http://burningbird.net/articles/monsters3.htm! Ignoring.
Warning: Possible redefinition of Node ID=http://burningbird.net/articles/monsters4.htm! Ignoring.
Warning: Possible redefinition of Node ID=http://www.yasd.com/dynaearth/monsters1.htm! Ignoring.
Warning: Possible redefinition of Node ID=http://www.dynamicearth.com/articles/monsters1.htm! Ignoring.

All of the predicates directly attached to the top-level node within the document are found and returned:

http://burningbird.net/postcon/elements/1.0/relevancy
http://burningbird.net/postcon/elements/1.0/history
http://burningbird.net/postcon/elements/1.0/bio
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://burningbird.net/postcon/elements/1.0/related
http://burningbird.net/postcon/elements/1.0/related
http://burningbird.net/postcon/elements/1.0/related
http://burningbird.net/postcon/elements/1.0/presentation
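
Because a node can carry the same predicate more than once (related appears three times in this output), it can be handy to tally the predicates rather than just list them. The following is a small sketch of one way to do that, assuming the predicate URIs have been collected into an array instead of being written straight to the console. The PredicateTally class and its Count method are hypothetical names of my own, not part of Drive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Drive API.
class PredicateTally
{
    // Counts how often each predicate URI occurs, keeping first-seen order.
    public static List<KeyValuePair<string, int>> Count(IEnumerable<string> predicateUris)
    {
        List<string> order = new List<string>();
        Dictionary<string, int> counts = new Dictionary<string, int>();

        foreach (string uri in predicateUris)
        {
            if (!counts.ContainsKey(uri))
            {
                counts[uri] = 0;
                order.Add(uri);
            }
            counts[uri]++;
        }

        List<KeyValuePair<string, int>> result = new List<KeyValuePair<string, int>>();
        foreach (string uri in order)
        {
            result.Add(new KeyValuePair<string, int>(uri, counts[uri]));
        }
        return result;
    }

    static void Main()
    {
        // The eight predicate URIs from the output, in the order printed.
        string[] printed = {
            "http://burningbird.net/postcon/elements/1.0/relevancy",
            "http://burningbird.net/postcon/elements/1.0/history",
            "http://burningbird.net/postcon/elements/1.0/bio",
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
            "http://burningbird.net/postcon/elements/1.0/related",
            "http://burningbird.net/postcon/elements/1.0/related",
            "http://burningbird.net/postcon/elements/1.0/related",
            "http://burningbird.net/postcon/elements/1.0/presentation"
        };

        foreach (KeyValuePair<string, int> pair in Count(printed))
        {
            Console.WriteLine(pair.Value + "  " + pair.Key);
        }
    }
}
```

Run against the eight URIs, the tally reports six distinct predicates: related with a count of 3 and each of the others with a count of 1.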

Drive doesn't support query-style processing of the data with a query language such as RDQL. However, it does provide methods for adding edges to a node and nodes to a graph, if you're interested in building an RDF graph from scratch or modifying an existing one.