Simon Potter simon.potter@auckland.ac.nz
and Paul Murrell p.murrell@auckland.ac.nz
Department of Statistics, University of Auckland
November 12, 2012
Abstract: The gridSVG package exports grid images to an SVG format for viewing on the web. This article describes a new development in the way that gridSVG produces the SVG output. The result is greater flexibility in how the SVG output is produced and increased opportunities to manipulate the SVG output, which creates new possibilities for generating more complex and sophisticated dynamic and interactive R graphics for the web.
grid is an alternative graphics system to the traditional base graphics system provided by R [1]. Two key features of grid distinguish it from the base graphics system: graphics objects (grobs) and viewports.
Viewports are how grid defines a drawing context and plotting region. All drawing occurs relative to the coordinate system within a viewport. Viewports have a location and dimension and set scales on the horizontal and vertical axes. Crucially, they also have a name so we know how to refer to them.
Graphics objects (grobs) store the information necessary to describe how a particular object is to be drawn. For example, a grid circleGrob contains the information used to describe a circle, in particular its location and its radius. As with viewports, graphics objects also have names.
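As a small illustration of these two concepts, the sketch below creates a named viewport and a named circle grob without drawing anything. The names example-vp and example-circle, and the scale values, are chosen purely for illustration.

```r
library(grid)

# A viewport has a location, dimensions, axis scales and a name.
vp <- viewport(x = 0.5, y = 0.5, width = 0.8, height = 0.8,
               xscale = c(0, 10), yscale = c(0, 10),
               name = "example-vp")

# A grob stores a description of a shape (here the centre and radius
# of a circle) together with a name and, optionally, a viewport.
circ <- circleGrob(x = unit(5, "native"), y = unit(5, "native"),
                   r = unit(2, "native"),
                   name = "example-circle", vp = vp)

vp$name    # the name we can later use to refer to the viewport
circ$name  # the name we can later use to refer to the grob
```

Because both objects carry names, either can be referred to later, which is the property that gridSVG preserves in its SVG output.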
The task that gridSVG [2] performs is to translate viewports and graphics objects into SVG [3] equivalents. In particular, the exported SVG image retains the naming information on viewports and graphics objects. The advantage of this is we can still refer to the same information in grid and in SVG. In addition, we are able to annotate grid grobs to take advantage of SVG features such as hyperlinking and animation.
This document describes a new development in gridSVG that changes the mechanism used to convert grid grobs and viewports to an SVG representation.
In order to aid our explanation, a simple grid plot will be drawn using the code below.
R> library(grid)
R> grid.rect(width = unit(0.5, "npc"),
R+           height = unit(0.5, "npc"),
R+           name = "example-rect")
R> grid.ls()
example-rect
The output from grid.ls() shows the grid display list, which represents the list of grobs that have been plotted on the current graphics device. The display list shows that the rectangle has been drawn and that it is named example-rect. When gridSVG translates example-rect into SVG, the rectangle translates into the following markup:
<g id="example-rect">
  <rect id="example-rect.1" x="126" y="126" width="252" height="252"/>
</g>
Prior to the recent development, gridSVG would create this SVG by concatenating strings. The first step involved creating an SVG group (g). This group needs to have all of its appropriate attributes inserted, which always include an id attribute, but can also include attributes related to animation or hyperlinking, or custom attributes added by “garnishing”. In R, string concatenation is accomplished using the paste() function. A fragment of pseudo-code follows, which would generate the SVG group markup.
R> groupID <- "example-rect"
R> paste("<g",
R+       ' id="', groupID, '"',
R+       " ... ",
R+       ">\n", sep = "")
R> increaseIndent()
<g id="example-rect" ... >
In this case the ... represents the optional attributes applied to a group, e.g. hyperlinking. We can see already that the code to produce the SVG markup is reasonably complex compared to the markup itself. Note that we have also increased the level of indentation so that children of this group are clearly observed to be children of this particular group.
The next step is to add a child <rect /> element to this SVG group. We need to first indent to the correct position on a new line, and then draw the rectangle. The code that would be used to produce the rectangle is shown below.
R> paste(indent(),
R+       "<rect",
R+       ' id="', rectID, '"',
R+       ' x="', rectX, '"',
R+       ' y="', rectY, '"',
R+       ' width="', rectWidth, '"',
R+       ' height="', rectHeight, '"',
R+       ...,
R+       " />\n", sep = "")
<rect id="example-rect.1" x="126" y="126" width="252" height="252" ... />
We can clearly see how attribute values are inserted into the SVG output, in particular with our location and dimension attributes. Again, the ... represents other attributes that may be inserted (though not demonstrated). What is also being shown here is how we are applying the indentation. This is done by calling a function that returns a character vector with the correct number of spaces to indent our <rect /> element.
Once all children have been added to the SVG group, we can close the group so that all <rect /> elements are contained within it. Because we are closing an element, we need to decrease the level of indentation to preserve the hierarchical structure of the SVG markup. This means that when closing any element, we need to do something similar to the following code, which closes an SVG group.
R> decreaseIndent()
R> paste(indent(),
R+       "</g>\n", sep = "")
</g>
We have shown how SVG images are built using a series of concatenated strings. It is important to note that these strings are written directly to a file (specified when calling gridToSVG()). This means each time an SVG fragment is created using paste(), it is appended to a specified file name.
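The steps above can be put together in a runnable sketch. The helper functions indent(), increaseIndent() and decreaseIndent() are named in the pseudo-code, but the definitions given here are hypothetical stand-ins (as is the emit() wrapper and the temporary file), not gridSVG's actual implementation.

```r
# Hypothetical stand-ins for the indentation helpers named in the text;
# gridSVG's own (pre-XML) implementation may have differed.
state <- new.env()
state$level <- 0
indent <- function() strrep("  ", state$level)
increaseIndent <- function() state$level <- state$level + 1
decreaseIndent <- function() state$level <- state$level - 1

# Every fragment is appended straight to the output file.
svgFile <- tempfile(fileext = ".svg")
emit <- function(...) {
    cat(paste(..., sep = ""), file = svgFile, append = TRUE)
}

emit(indent(), '<g id="', "example-rect", '">\n')
increaseIndent()
emit(indent(), '<rect id="', "example-rect.1", '"',
     ' x="', 126, '" y="', 126, '"',
     ' width="', 252, '" height="', 252, '"/>\n')
decreaseIndent()
emit(indent(), "</g>\n")

# The markup can only be inspected by reading the file back in.
svgLines <- readLines(svgFile)
svgLines
```

Note that nothing in this sketch checks that every opened element is closed, or that the indentation level returns to zero; that bookkeeping is left entirely to the caller.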
This approach has a few limitations. For instance, we cannot guarantee that the output that is produced is valid SVG markup. We are also writing directly to a file, which means that we need to read the file to observe its contents; we do not retain the SVG content in resident memory. Finally, but less importantly, performance is a concern when generating output using repeated string concatenation as it is known to be a slow operation (this is less important because the drawing of the original image by grid, before export, is also slow).
To remedy these limitations a rewrite of the markup generating component of gridSVG was undertaken.
The rewrite of part of the gridSVG package was achieved by utilising the XML [4] package. The XML package is an R wrapper for the libxml2 [5] XML parser and generator. The key feature that the XML package provides us with is a way of representing an SVG image as a collection of SVG nodes (elements), instead of a long character vector. We simply need to define the structure of the document; the XML package will take care of how this will be exported to text.
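The difference between the two representations can be sketched with an attribute value that contains a double quote. The value used here is contrived for illustration; the point is only that paste() inserts it verbatim, whereas a node created with the XML package should have its attribute value escaped when the node is serialised.

```r
library(XML)

id <- 'a"b'   # contrived value containing a double quote

# String concatenation inserts the value verbatim, so the attribute
# is terminated early and the markup is not well-formed.
pasted <- paste('<g id="', id, '"/>', sep = "")
pastedOK <- tryCatch({
    xmlParse(pasted, asText = TRUE)
    TRUE
}, error = function(e) FALSE)

# A node stores the value itself; escaping happens on serialisation.
node <- newXMLNode("g", attrs = list(id = id))
serialised <- saveXML(node)

# Parsing the serialised node recovers the original attribute value.
roundTrip <- xmlGetAttr(xmlRoot(xmlParse(serialised, asText = TRUE)), "id")
```

Here pastedOK records whether the pasted string could be parsed at all, and roundTrip holds the attribute value recovered from the serialised node.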
To define the structure of an SVG image, we need to establish how elements relate to each other. In the case of gridSVG, the only relationship of importance is the parent/child relationship. The earlier example with the rectangle will be recreated using the XML package to demonstrate the differences between the two approaches. The code that creates an SVG group is shown below. Notice that when we print out the node itself, the markup is generated for us by the XML package.
R> g <- newXMLNode("g",
R+                 attrs = list(id = "example-rect"))
R> setParentNode(g)
R> g
<g id="example-rect"/>
The group is given the name of the grob that it is going to be representing. Because we wish to add children to this <g> element, we set it as the current parent node with a call to the setParentNode() function.
The next piece of code creates a <rect /> element. It is important to note in this code that the parent parameter is given as an argument the result of the function call getParentNode(). Earlier we set the current parent node to be the <g> element. This means that the <rect> element will be a child of the <g> element.
R> svgrect <- newXMLNode("rect", parent = getParentNode(),
R+                       attrs = list(id = rectID,
R+                                    x = rectX,
R+                                    y = rectY,
R+                                    width = rectWidth,
R+                                    height = rectHeight,
R+                                    fill = "none",
R+                                    stroke = "black"))
R> svgrect
<rect id="example-rect.1" x="126" y="126" width="252" height="252" fill="none" stroke="black"/>
R> g
<g id="example-rect">
  <rect id="example-rect.1" x="126" y="126" width="252" height="252" fill="none" stroke="black"/>
</g>
We can now see how the document is beginning to build up as the <rect /> is added to the <g>.
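The functions setParentNode() and getParentNode() are not part of the XML package. To make the example above self-contained, the sketch below supplies hypothetical stand-ins that simply remember the current parent in an environment; gridSVG's own helpers may be implemented differently.

```r
library(XML)

# Hypothetical stand-ins for gridSVG's parent-node helpers: the
# "current parent" is simply remembered in an environment.
svgState <- new.env()
setParentNode <- function(node) assign("parent", node, envir = svgState)
getParentNode <- function() get("parent", envir = svgState)

g <- newXMLNode("g", attrs = list(id = "example-rect"))
setParentNode(g)

# Supplying parent= attaches the new node as a child immediately.
svgrect <- newXMLNode("rect", parent = getParentNode(),
                      attrs = list(id = "example-rect.1",
                                   x = "126", y = "126",
                                   width = "252", height = "252"))

xmlSize(g)                    # number of children of <g>
xmlName(xmlParent(svgrect))   # name of the <rect>'s parent
```

Keeping track of a current parent in this way replaces the indentation bookkeeping of the string-based approach: nesting is expressed by the parent/child relationship rather than by whitespace.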
A complete SVG document must have a "root" <svg> element. This has been left out of the examples so far, but it is worth mentioning here because, with the XML approach, we include several namespace definitions in the <svg> element. This allows the XML package to ensure that we are producing valid SVG output.
R> svgroot <- newXMLNode("svg", namespaceDefinitions =
R+                       list("http://www.w3.org/2000/svg",
R+                            xlink = "http://www.w3.org/1999/xlink"),
R+                       attrs = list(width = svgWidth,
R+                                    height = svgHeight,
R+                                    version = "1.1"))
R> setParentNode(svgroot)
R> svgroot
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="504px" height="504px" version="1.1"/>
This <svg> element is made the parent node so that the <g> element we created earlier can be made a child of the root <svg> element.
R> addChildren(svgroot, g)
If we print out the <svg> node now we see the <g> and <rect> elements nested neatly within it.
R> svgroot
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="504px" height="504px" version="1.1">
  <g id="example-rect">
    <rect id="example-rect.1" x="126" y="126" width="252" height="252" fill="none" stroke="black"/>
  </g>
</svg>
As a final step, we can write out the root SVG node. This will be inserted directly into this document.
R> saveXML(getParentNode(), file = NULL)
This demonstrates how SVG images can be built up in a more reliable way than with simple string concatenation. It is clear that the way in which we define our SVG image is less prone to error in creating markup, and it also ensures that images are both well-formed (conform to XML syntax) and valid (conform to SVG syntax).
The node-based approach to SVG creation offers more advantages than just being a cleaner way of building up an image. We are saving the root node (and thus its descendants) after the image has been created. This means we can keep the image in memory until we want to save it to disk or send it to some other output. One example where this is useful is in producing this article: plots are written out directly within the HTML document as inline SVG (rather than having to create an external file and then link to that file from the HTML document).
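A minimal sketch of keeping the image in memory is given below. It builds a small SVG node by hand (the dimensions and the surrounding HTML are illustrative assumptions, not what gridSVG or this article's build process produces) and serialises it to a character string rather than to a file.

```r
library(XML)

svgroot <- newXMLNode("svg",
                      namespaceDefinitions =
                          list("http://www.w3.org/2000/svg"),
                      attrs = list(width = "100px", height = "100px",
                                   version = "1.1"))
newXMLNode("rect", parent = svgroot,
           attrs = list(x = "10", y = "10", width = "80", height = "80"))

# With file = NULL, saveXML() returns the markup as a string.
svgText <- saveXML(svgroot, file = NULL)

# The string can be embedded in other text; no file on disk is needed.
html <- paste("<html><body>", svgText, "</body></html>", sep = "\n")
```

Until saveXML() is called, the image exists only as a tree of nodes, so it can still be inspected or modified before any text is generated.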
Another advantage is that because we are dealing with XML nodes, we can manipulate those nodes using other powerful XML tools such as XPath [6]. For example, we can retrieve and add subsets within the SVG image.
We will demonstrate this idea using a ggplot2 [7] plot (the ggplot2 package uses grid for rendering so a ggplot2 plot consists of a large number of grid viewports and grobs).
R> library(ggplot2)
R> qplot(mpg, wt, data=mtcars, colour=cyl)
We can reduce this image by removing the legend, so that only the plot is shown. This code relies on standard functionality from the XML package for identifying and removing nodes; all we have to do is provide the XPath that describes the node that we want (in this case, a <g> element that has a specific id attribute).
R> svgdoc <- gridToSVG(name=NULL, "none", "none")$svg
R> legendNode <- getNodeSet(svgdoc,
R+                          "//svg:g[@id='layout::guide-box.3-5-3-5.1']",
R+                          c(svg="http://www.w3.org/2000/svg"))[[1]]
R> removeChildren(xmlParent(legendNode), legendNode)
R> saveXML(svgdoc, file = NULL)
Alternatively, we could extract just the legend from the plot and use it to create a new image.
R> svgdoc <- gridToSVG(name=NULL, "none", "none")$svg
R> legendNode <- getNodeSet(svgdoc,
R+                          "//svg:g[@id='layout::guide-box.3-5-3-5.1']",
R+                          c(svg="http://www.w3.org/2000/svg"))[[1]]
R> rootNode <- getNodeSet(svgdoc,
R+                        "/svg:svg/svg:g",
R+                        c(svg="http://www.w3.org/2000/svg"))[[1]]
R> removeChildren(rootNode, "g")
R> addChildren(rootNode, legendNode)
R> newsvg <- newXMLNode("svg", namespaceDefinitions =
R+                      list("http://www.w3.org/2000/svg",
R+                           xlink = "http://www.w3.org/1999/xlink"),
R+                      attrs = list(width = "50",
R+                                   height = "200",
R+                                   viewBox = "435 150 50 200",
R+                                   version = "1.1"))
R> addChildren(newsvg, rootNode)
R> saveXML(newsvg, file = NULL)
These simple examples demonstrate the basic idea of extracting and combining arbitrary subsets of an SVG image. More complex applications are possible, such as combining the contents of two or more plots together. It is also important to note that these manipulations are made more convenient because the SVG produced by gridSVG has a clear and labelled structure; these tasks would be considerably more difficult if we had to work with the SVG output from the standard R svg() device.
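The same XPath technique can be tried without ggplot2 or gridSVG installed. The sketch below parses a small hand-written SVG document that stands in for gridSVG output; the ids plot, panel and legend are hypothetical and much simpler than the ids that gridSVG generates.

```r
library(XML)

# A hand-written stand-in for gridSVG output; the ids are hypothetical.
svgText <- '<svg xmlns="http://www.w3.org/2000/svg" version="1.1">
  <g id="plot">
    <g id="panel"><rect x="0" y="0" width="80" height="80"/></g>
    <g id="legend"><rect x="90" y="0" width="10" height="40"/></g>
  </g>
</svg>'
svgdoc <- xmlParse(svgText, asText = TRUE)

# SVG elements live in the SVG namespace, so the XPath needs a prefix.
ns <- c(svg = "http://www.w3.org/2000/svg")

# Locate the legend group by its id and detach it from its parent.
legendNode <- getNodeSet(svgdoc, "//svg:g[@id='legend']",
                         namespaces = ns)[[1]]
removeChildren(xmlParent(legendNode), legendNode)

# Count what is left: the legend is gone, the panel is untouched.
nLegend <- length(getNodeSet(svgdoc, "//svg:g[@id='legend']",
                             namespaces = ns))
nPanel <- length(getNodeSet(svgdoc, "//svg:g[@id='panel']",
                            namespaces = ns))
```

Selecting by id only works because each group carries a meaningful, stable id, which is the labelled structure referred to above.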
Another advantage of the new approach is that when we create an XML node, it can then be inserted into the SVG document at any location. Previously, with the string concatenation approach, we were forced to simply append to the document. Now we have the option of inserting nodes at any point in the document.
A case where this is useful is within gridSVG itself. When gridToSVG() is called, there are three parameters of particular interest: filename, export.coords and export.js. The latter two parameters determine how JavaScript code is to be included within an SVG image, if at all. If we are going to be including JavaScript code, the SVG image is first generated. Once the image is created, we insert new <script> node(s) into the root <svg> element. This demonstrates the ability to insert nodes at any location: rather than being forced to append to the document, we are able to add the nodes as children of the root <svg> element.
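The idea of inserting a node after the rest of the image has been built can be sketched as follows. This is not the code that gridSVG uses; the script body is a placeholder, and the choice to place the script before the existing children (via the at argument of addChildren()) is an assumption made to show that the position is under our control.

```r
library(XML)

# Build the "image" first.
svgroot <- newXMLNode("svg", attrs = list(version = "1.1"))
newXMLNode("g", parent = svgroot, attrs = list(id = "example-rect"))

# Created after the drawing content; the script body is a placeholder.
script <- newXMLNode("script", "/* placeholder script body */",
                     attrs = list(type = "text/ecmascript"))

# at = 0 places the new node before the existing children.
addChildren(svgroot, script, at = 0)

childNames <- unname(names(xmlChildren(svgroot)))
childNames
```

With string concatenation, placing the script ahead of content that had already been written would have required rewriting the file.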
One particular case where the XML package gives us some advantages is when saving an XML document. The function saveXML() provides a boolean option, indent. This determines whether there is going to be any visual structure in the form of indentation and line breaks or none at all. An example of its effect is shown below.
R> a <- newXMLNode("a")
R> b <- newXMLNode("b", parent = a)
R> saveXML(a, file = NULL, indent = TRUE)
<a>
  <b/>
</a>
R> saveXML(a, file = NULL, indent = FALSE)
<a><b/></a>
We can see that the output without indentation present is much more compact. In complex SVG images, particularly those with deep hierarchical structure, this could reduce the size of the resulting file greatly, which would improve the delivery speed of gridSVG plots being sent over the web by reducing the amount of data that needs to be transferred.
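The saving can be measured directly by comparing the lengths of the two serialisations. The tree below is a toy example built only for this purpose; the actual saving for a gridSVG plot will depend on how deeply its elements are nested.

```r
library(XML)

# A small nested tree built purely for illustration.
a <- newXMLNode("a")
b <- newXMLNode("b", parent = a)
for (i in 1:3) newXMLNode("c", parent = b)

indented <- saveXML(a, file = NULL, indent = TRUE)
compact  <- saveXML(a, file = NULL, indent = FALSE)

# Number of characters spent purely on indentation and line breaks.
nchar(indented) - nchar(compact)
```

Every additional level of nesting adds more leading whitespace to each line of the indented form, so the difference grows with the depth of the document.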
Another case where removing indentation is useful is when manipulating the SVG image in the browser using JavaScript. When parsing the SVG DOM with indentation present, the whitespace used for indentation is counted as a “node”. This makes it difficult to traverse the DOM as it forces us to check whether the node that we have encountered is simply whitespace text or not. When indentation is removed, we no longer have this problem and can be certain that all nodes are either elements, or actual content within them.
This article describes changes to the mechanism used by the gridSVG package to convert grid viewports and grobs to SVG representations. Instead of pasting strings together to generate SVG code as text within an external file, the gridSVG package now uses the XML package to create XML nodes in resident memory. The advantages of this approach include: guaranteed validity of the SVG representation; greater flexibility in the production of the SVG representation; improved access to the SVG representation; and greater flexibility in the formatting of the SVG code. There are also possible speed benefits from these changes.
These advantages have been demonstrated through simple examples, but they also have an impact on much more complex scenarios. For example, if R is being used to serve web content to a browser, it is now possible for gridSVG to provide SVG fragments (rather than complete plots) and to supply them directly from resident memory (rather than having to generate an external file as an intermediate step).
This document is licensed under a Creative Commons Attribution 3.0 New Zealand License. The code is freely available under the GPL. The features described in this article were added to version 1.0-0 of gridSVG, which is available on R-Forge (if not on CRAN).