Student Project Ideas
-
Develop a set of graphical building blocks for producing a variety of
oceanographic plots (data and plot ideas from Sam McClatchie).
Started by Mark Emmett (Masters project); some elements
in Chapter 7 of "R Graphics".
-
R graphics devices: (Windows) metafile on Unix; Macromedia Flash/Shockwave
(note the existence of GPLFlash).
Sasha Goodman has discovered at least one approach (see below).
-
Importing graphics into R (see further notes).
See also the GIF support in package caTools (on CRAN).
Some good progress on this. See the grImport package on CRAN.
-
Investigate the development of useful cross-platform
building blocks for 3D and/or dynamic/interactive statistical graphics;
build on Gir?
Derek Law (PhD) is doing some work on this.
-
Develop some plots for presenting statistical results as per
Gelman, Pasarica, and Dodhia (2002),
"Let's practice what we preach: Turning tables into graphs",
The American Statistician, 56 (2), 121-130.
-
Internationalisation support for PostScript (and PDF) output:
(i) add support for embedding fonts and CID fonts in the existing
PostScript device,
(ii) add-on package that provides an alternative PostScript device
based on Lasi (and Pango).
Ei-ji Nakama (and others?) are working on this. Some support in
R version 2.3.0 (due April-ish 06). See R News volume 6/2.
-
Produce a CRAN Task View focused on graphics (i.e., only packages
focused on graphics, without it being a list of all packages on CRAN!).
Nicholas Lewin-Koh has done this.
-
Develop a variation on the sunflowerplot, using Tukey's tallying
symbols (see EDA page 16) instead of sunflowers to represent replications
(up to 10).
-
Create a simple GUI for interacting with grid output. Something
based on tcltk or Gtk. The idea would be to draw in a canvas and have a
tree-like menu (as per the typical Windows menu for selecting a file)
of all graphical objects in the current scene (with child nodes for
children of gTrees). Clicking in the menu would select the appropriate
element in the scene (draw it gold or something) and pressing a capture
button might return the relevant element as an R object. You could
go further and produce dialogs for editing various aspects
of the elements. I have an early tcltk attempt at something like this, that
might be useful to demonstrate the idea before being thrown away and
starting again (see ~paul/Research/Rstuff/Grid/tcltk).
-
Tidy/rewrite the graphicsQC package for regression testing R code that
produces graphical output. Needs generalising so that it can test more
than just examples in a package (e.g.,
test code in package "tests" directory, test code in an arbitrary file, ...).
See 'graphicsQC' package on CRAN.
Possible extensions: provide function to test for any differences
in a qcCompare*Result object to automate inclusion of graphicsQC
run in package testing; provide function to run example code just
given the name of an S4 generic (suggested by Tobias Verbeke).
-
Develop a "4-plot" function that provides a basic
graphical univariate summary of a variable: x[i] versus i,
x[i+1] versus x[i], histogram, and normal probability plot.
"4-plot" attributed to Filliben (1982) in "Elements of Computational
Statistics" (Gentle, 2002).
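
A minimal sketch of what such a function might look like, using
traditional graphics. The name fourPlot() and its arguments are
hypothetical, chosen only for illustration; the four panels follow the
description above.

```r
# fourPlot() is a hypothetical name, not an existing function.
fourPlot <- function(x, main = deparse(substitute(x))) {
    force(main)                      # capture the label before x is modified
    x <- x[!is.na(x)]
    n <- length(x)
    stopifnot(is.numeric(x), n >= 2)
    # 2 x 2 arrangement; restore the previous settings on exit
    opar <- par(mfrow = c(2, 2), oma = c(0, 0, 2, 0))
    on.exit(par(opar))
    # x[i] versus i
    plot(seq_len(n), x, type = "b", xlab = "i", ylab = "x[i]",
         main = "Run sequence")
    # x[i+1] versus x[i]
    plot(x[-n], x[-1], xlab = "x[i]", ylab = "x[i+1]", main = "Lag plot")
    hist(x, main = "Histogram", xlab = "x")
    qqnorm(x, main = "Normal probability plot")
    qqline(x)
    mtext(main, outer = TRUE, font = 2)
    # Return the plotted values in case the caller wants them
    invisible(list(index = seq_len(n),
                   lag = cbind(x = x[-n], xnext = x[-1])))
}
```

For example, fourPlot(rnorm(100)) draws the four panels on the current
device.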
-
Start developing some core 2D plot components using grid. And think
about how these should be managed so that other components can be
created and contributed.
-
Some command-line "zoom" control for graphics. See the R-help thread
titled (rather unfortunately) "R-help", initiated by Sam McClatchie
on 22/6/05. In particular, it describes the MATLAB interface.
(Some modifications for fit-to-window resizing on Windows
for R version 2.2.0 should make this a lot more feasible than it used to be.)
-
Implement Tukey and Tukey's Textured dotplots for R (Bellcore Tech Report).
Variation on jittering for solving overplotting. Pointed out by Ross Ihaka
so check that he is not going to do it first!
-
Produce an R-level interface for specifying general pattern fills
and gradient fills for R graphics; then graphics devices which
already support this (PDF, SVG) "would just go".
-
Create some "big plot" functions to draw plot arrangements that
would span more than one page; e.g., a scatterplot matrix
that spans four A4 pages rather than being squashed onto one. You'd
have to pin it on the wall to see it, but may have some applications.
[idea by James Curran]
-
Standard arc() function (user specifies centre-radius-angle).
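
One possible shape for this, as a sketch only: the argument names
(from, to, n) and the helper arcPoints() are assumptions, angles are
taken to be in radians, and the arc is approximated by a polyline. The
result is only circular on screen if the plot has an aspect ratio of 1.

```r
# arcPoints() and arc() are hypothetical names used for illustration.
arcPoints <- function(x, y, r, from = 0, to = pi / 2, n = 100) {
    # Points on the circle of radius r centred at (x, y),
    # for angles from 'from' to 'to' (radians, anticlockwise)
    theta <- seq(from, to, length.out = n)
    list(x = x + r * cos(theta), y = y + r * sin(theta))
}

arc <- function(x, y, r, from = 0, to = pi / 2, n = 100, ...) {
    pts <- arcPoints(x, y, r, from, to, n)
    lines(pts$x, pts$y, ...)         # add to the current plot
    invisible(pts)
}
```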
-
Expose graphics engine xspline code via traditional graphics function;
i.e., xspline() analogue of grid.xspline(). Done by BDR.
-
Redesign graphics devices so that they can take advantage of native features
such as PostScript's curveto and PDF's hyperlinks (not possible with the
current lowest-common-denominator, graphics-engine-bottleneck design).
Another example is the multiple-colourspace support in some devices.
Another example is embedding audio in PDFs.
-
An R package that interfaces with the 'swftools' library so that
it can produce vector-graphics animations by generating PDF with R
and converting to Flash animation using pdf2swf. Idea from Sasha Goodman.
-
Introduce some concept of "next viewport" in grid (analogous to the
"next plot" in traditional graphics). Some suggestions from Gabor
Grothendieck (guaranteeing viewport order). Ross Ihaka has suggested
a grid.newViewport()-like function analogous to grid.newpage().
-
A package of functions that draw tables (as graphical output as opposed
to LaTeX descriptions a la xtable).
-
Implement Venn diagrams for greater-than-two variables as mentioned in
R-help thread "Venn diagram" (2007-06-04). See also Roger Marshall's
work on this.
-
Write a grid.xaxis.Date() for grid (i.e., axis for date data).
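
A rough sketch of the idea, built on the existing xaxisGrob() and the
Date method for pretty(). The name xaxisDate() and its arguments are
hypothetical, and it assumes that the current viewport's xscale is in
days since 1970-01-01 (i.e., as.numeric() of the Date values).

```r
library(grid)

# xaxisDate() is a hypothetical name, not part of 'grid'.
xaxisDate <- function(dates, fmt = "%Y-%m-%d", n = 5, draw = TRUE) {
    stopifnot(inherits(dates, "Date"))
    # Choose "nice" tick locations, keeping only those within the data
    at <- pretty(dates, n = n)
    at <- at[at >= min(dates) & at <= max(dates)]
    # Tick locations are numeric; labels are the formatted dates
    axis <- xaxisGrob(at = as.numeric(at), label = format(at, format = fmt))
    if (draw) grid.draw(axis)
    invisible(axis)
}
```

This would be called after pushing a viewport such as
viewport(xscale = range(as.numeric(dates))).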
-
Extend text overlap resolution to allow something smarter than just
omitting overlapping text (e.g., jittering text until it all fits).
Hadley also suggested that, if you move more than a 'threshold' away
from original point, you could (optionally) draw a line (or arrow)
from the label to the original point.
(See the ggrepel package.)
-
Allow mixing of fonts within single piece of text.
-
Add general grid.element() to gridSVG package to allow insertion of
arbitrary element (e.g., audio).
-
Hershey fonts provide enough info to be able to do plotmath in R, it just
needs an implementation of a function to return metric info for
individual characters that the R graphics engine can hook into.
This would be C-level programming, but not very advanced.
-
Make the bounding box calculations for points grobs in grid
more "complete" (at the moment they are treated as zero-size).
C code required, but pretty straightforward.
-
Add support for raster images to the graphics engine.
Done for R 2.11.0
-
Add (direct) control over aspect ratio of grid viewports.
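
For comparison, the indirect route that is already available goes
through a one-cell layout with respect = TRUE. The sketch below wraps
that up; aspectViewport() is a hypothetical name, and 'aspect' is taken
to mean height divided by width.

```r
library(grid)

# aspectViewport() is a hypothetical helper, not part of 'grid'.
aspectViewport <- function(aspect = 1, ...) {
    # One cell with relative ("null") dimensions; respect = TRUE makes
    # grid preserve the width:height ratio of those dimensions
    lay <- grid.layout(1, 1,
                       widths = unit(1, "null"),
                       heights = unit(aspect, "null"),
                       respect = TRUE)
    # Parent holds the layout; child occupies the single cell
    vpStack(viewport(layout = lay, ...),
            viewport(layout.pos.row = 1, layout.pos.col = 1))
}
```

Pushing the result, e.g., pushViewport(aspectViewport(1)), leaves you in
a square viewport that is as large as will fit in the parent.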
-
Provide something like show.layout() for grid viewport trees.
Done in R 2.10.0
-
Write default widthDetails() and heightDetails() methods for gTrees that
calculate the bounding box for all children. Will rely on
grobX() and grobY() to do the calculation, so also requires default xDetails()
and yDetails() methods for gTrees to be properly general.
-
There are some useful extensions for the 'grImport' package
(see the doc/todo.txt file within that package).
-
The 'compare' package could be augmented to provide support for
comparing some more specific classes of object (e.g., 'lm' objects).
-
Develop code that creates tag clouds (see Gorjanc Gregor's starting
point on R-help, 2009-06-08).
-
Develop a package to read DWF files (AutoCAD) as a wrapper around
DWF Toolkit C++ library from Autodesk.
-
Use R to query DictService web service (via WSDL) to get list of
"i before e except after c" violations (as per Dave Mozealous' blog 2009-12-24)
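
The checking step, once a word list is in hand, might look something
like the sketch below. The function name ieViolations() is hypothetical,
'words' stands in for whatever the web service returns, and the reading
of the rule (a violation is "ei" not preceded by "c", or "ie" directly
after "c") is an assumption.

```r
# ieViolations() is a hypothetical helper; it does not query any service.
ieViolations <- function(words) {
    w <- tolower(words)
    # "ei" at the start of a word or after any letter other than "c",
    # or "ie" immediately after "c"
    bad <- grepl("(^|[^c])ei", w) | grepl("cie", w)
    words[bad]
}

ieViolations(c("believe", "receive", "weird", "science", "ceiling"))
# returns c("weird", "science")
```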
-
Use R to play with the SetiQuest data (public release of SETI data)
-
Add "ndc" coordinate system to 'grid' (to allow conversion of locations
in ANY viewport to common coordinate system).
-
The 'compare' package could be augmented to provide (brief) details about
where differences occur when objects are not the same
(e.g., which elements differ between two character vectors)
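
As a sketch of the sort of detail that could be reported for two
character vectors: whichDiffer() below is a hypothetical helper, not a
function in 'compare', and it treats a missing trailing element the
same as an NA.

```r
# whichDiffer() is a hypothetical name, not part of the 'compare' package.
whichDiffer <- function(a, b) {
    n <- max(length(a), length(b))
    # Pad the shorter vector with NA so that length mismatches show up
    length(a) <- n
    length(b) <- n
    # Differ if exactly one is NA, or both are non-NA and unequal
    differ <- is.na(a) != is.na(b) | (!is.na(a) & !is.na(b) & a != b)
    data.frame(index = which(differ), a = a[differ], b = b[differ],
               stringsAsFactors = FALSE)
}
```

For example, whichDiffer(c("a", "b", "c"), c("a", "x", "c", "d")) reports
positions 2 and 4.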
-
Develop grid back-end for Rgraphviz (so that the rendering of the graph is
done by 'grid'). There is a start in a couple of places (e.g., the
'gridGraphviz' package on R-Forge). Interesting sub-problem would be trying to
allow arbitrary 'grid' grob as graph node.
-
Add to 'grImport': XSL file to convert directly from SVG to RGML
(at least for simple cases); interface to PDF utility (xpdf/poppler?)
to allow direct PDF-to-RGML conversion? (or even just using R tools -
see pdf_info() and pdf_fonts() ?)
-
Add to 'gridSVG': how should missing data values be dealt with
in animations?
-
Add Mathematical Annotation to 'gridSVG' (using MathJax?).
A first draft of this is done. Some work required to poll
and summarise different levels of support in browsers.
Another problem to fix is the potential conflict between Unicode
entities in MathML in SVG and the encoding of the overall SVG
(e.g., on Windows, the encoding is likely to be ISOLatin1).
-
Start building some useful "templates" for adding interactivity to
"standard" plots with 'gridSVG'? Write some standard JavaScript?
-
Add ellipse() primitive to 'grid'.
-
Implement more rational text justification in R graphics.
Need to make use of metric info to determine position and calculate
size of text (and plotmath) output. The metric info is there,
just the current API does not allow enough control (e.g., strheight()
ignores descenders and uses "M" height rather than ascent
if only one line of text).
-
Expand the 'rdataviewer' package (and work on a better GUI!).
See the reviewers' comments from when I tried submitting this to JSS.
-
Produce some more "model" 'grid' versions of plots (or plot
components), with nice modularity, reusability, and grob naming
schemes. (as per the 'lattice' naming scheme and the forestplot()
summer project work.)
-
Investigate combining 'gridSVG' with a dynamic and interactive
JavaScript library like d3
(suggested by Sasha Goodman).
-
Adapt 'partykit' or write new plotting code from scratch for output from
the 'Cubist' package? (see R-help 2012-07-26 from Dominik Bruhn)
-
A smart stacking algorithm to prevent overprinting in dot plots that
deals well with close points, not just exact repetitions (from Chris
W).
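
A naive greedy version, just to make the idea concrete: each point, in
order of x, goes at the lowest stack level that has no already-placed
point within distance d. The name stackDots() and the threshold d are
assumptions, missing values are not handled, and this is quadratic in
the number of points, so it is a starting point rather than the "smart"
algorithm asked for.

```r
# stackDots() is a hypothetical name used for illustration.
stackDots <- function(x, d) {
    ord <- order(x)
    level <- integer(length(x))
    for (k in seq_along(ord)) {
        i <- ord[k]
        placed <- ord[seq_len(k - 1)]
        # Already-placed points that would overlap with point i
        near <- placed[abs(x[placed] - x[i]) < d]
        # Lowest level not used by any of those points
        lv <- 1L
        while (lv %in% level[near]) lv <- lv + 1L
        level[i] <- lv
    }
    level
}

stackDots(c(1, 1.05, 1.5, 1.02), d = 0.1)
# returns c(1, 3, 1, 2)
```

The returned levels can be used as the y-values of the dot plot.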
-
A Flow-Based Programming package for R?
-
An OpenShift cartridge for R?
-
Design a naming scheme for 'ggplot2'
-
Implement something like d3.layout.pack with gridSVG
(see http://quantifyingmemory.blogspot.co.nz/2013/11/d3-without-javascript.html
and http://bl.ocks.org/mbostock/4063530)
-
Invertible reproducible documents: how to get (e.g., annotations/comments)
back from processed document to literate document source? (from
discussion with Finlay Thompson from DragonFly)
See technical report from Eric Lim's BSc Hons project.
-
"Grammar of Tables"; system for describing make up of a table based on
table "components" (perhaps a geom_table or geom_column/geom_row within the
ggplot2 infrastructure?)
-
The 'Gviz' package produces some really nice plots (and allows nice
combinations of plot components) BUT it does not appear to label its viewports
or its grobs, so post-hoc customisations are not possible
(see, e.g., R-help 2014-02-18 from sh. chunxuan)
-
A naming scheme for 'ggvis' (if it needs one)?
-
Implement missing features for 'gridGraphics' (e.g., C-code-heavy functions
like persp() and filled.contour(); recordGraphics() is another tough one).
-
Permanently reproducible documents: how hard is it to produce an article/report
that you are confident will be totally reproducible in 10/20/50 years?
(just an excuse to play with and draw together technologies for
reproducible documents, data archiving, licencing, virtualisation, and
provisioning)
-
Accessible graphics: investigate ways to create SVG graphics that can be
interacted with via embossed images on a "touch pad"
(work with Jonathon Godfrey, author of BrailleR package; also
involves working with ViewPlus, manufacturers of embossers and touch pads).
Sub-project to write function that takes a grob and returns logical
indicating which shapes are actually drawn by the graphics engine.
-
Create a package of functions that mimic Adobe Illustrator tools for
manipulating images (so that people who create a plot in R, then export
to AI to touch it up manually, can get the same result by writing code!)
-
Visual displays of model fits; mini images of effects as lines plus
confidence bands to show size and significance of effects at a glance.
-
Big data graphics; parallelising graphics computations/output
(see Byron Ellis' "Hadoop for Statistical Analysis and Exploration").
-
Expand animation support in 'gridSVG' (e.g., to other animation elements
beyond <animate>, not to mention support for all attributes
in <animate>)
-
Implement proper Bezier curve support for the graphics engine?
-
Implement a 'grid' version of Textures.js (as suggested in
a Tweet by Hadley Wickham). Also see this.
-
Improve font support (in graphics) in R?
See here and here and here and here for examples
of problems and solutions.
-
Profile 'grid' to look for opportunities to improve (speed) performance.
-
Take a look at the interactive graphics (and mapping) packages, like
ggvis and leaflet and ggmap, and make sure that there is a non-RStudio
way of creating and viewing the results.
-
How would you create teaching materials so that they are as reusable
as possible? e.g., can extract snippet of video that relates to small
component of material (i.e., very modular and marked up)
-
Continue development of the 'safemode' package
(see "Future work" in the safemode technical report).