by Paul Murrell http://orcid.org/0000-0002-3224-8858
Version 2: Tuesday 14 January 2020
Version 1: original publication
Version 2: update pdf.js code (for displaying PDFs)
This document by Paul Murrell is licensed under a Creative Commons Attribution 4.0 International License.
This report describes an R package called 'dvir' that aims to use TeX as a layout engine, but performs all rendering within R. The package reads DVI files that are produced from TeX files and renders the content using the R package 'grid'.
The image below shows a 'lattice' (Sarkar, 2008) line plot of the Standard Normal probability density function, with a text annotation showing the general form of the Gaussian function. This image was drawn with R (R Core Team, 2018) using the "plotmath" feature that makes it possible to annotate a plot with mathematical equations (Murrell and Ihaka, 2000).
The basic text-drawing functions in R graphics all accept, in addition to a simple character value, an R expression. An R expression is interpreted as a mathematical equation, with certain symbols, such as mu and sigma, converted to Greek characters, and certain functions, such as frac and sqrt, treated as layout instructions similar to the \frac and \sqrt operators in TeX mathematical expressions (Knuth, 1986). The following R code provides a simple example and the result is shown below the code.
expr <- expression(bgroup("(", frac(x - mu, sigma), ")"))
library(grid)
grid.text(expr)
The algorithm used to draw the mathematical equations in R attempts to mimic the algorithm used by TeX, but unfortunately the result is nowhere near the quality of the real thing.
One significant difference arises from the fact that R does not use the TeX math fonts, but it is possible to make use of the TeX fonts, with the 'fontcm' extension (Chang et al., 2014) to the 'extrafont' package (Chang, 2014), as shown below.
library(extrafont)
font_install('fontcm')
loadfonts("pdf")
pdf("fontcm.pdf", width=1, height=1)
grid.text(expr, gp=gpar(fontfamily="CM Roman"))
dev.off()
embed_fonts("fontcm.pdf", outfile="fontcm-embed.pdf")
While this shows a small improvement (the greek symbols are TeX's math italic variants), it is still some distance from the TeX result, which is shown below.
A different approach to including mathematical equations within R plots is to use the 'tikzDevice' package. This allows us to specify an equation using TeX syntax within character values. For example, we can write code like the following.
grid.text("$\\frac{x - \\mu}{\\sigma}$")
The following code reproduces the plot from the start of this section with the full Gaussian function annotation.
library(lattice)
library(tikzDevice)
options(tikzDocumentDeclaration = "\\documentclass[12pt]{article}")
tikz("tikz.tex", standAlone=TRUE, height=4)
# The exponent is squared, as in the general form of the Gaussian function
tex <- "$g(x) = \\frac{1}{\\sigma\\sqrt{2\\pi}}e^{-\\frac{1}{2}(\\frac{x - \\mu}{\\sigma})^2}$"
# x and y are assumed to hold the curve from the first plot
xyplot(y ~ x, type="l", ylim=c(0, .6),
       panel=function(...) {
           panel.xyplot(...)
           ltext(0, .5, tex)
       })
dev.off()
This produces a full-quality TeX version of the mathematical equation because the 'tikzDevice' package generates a TeX version (actually a PGF/TikZ version) of the entire plot. This is evident in the fact that the axis and tick labels on the plot are also rendered using TeX fonts.
TeX fonts everywhere is a nice feature if the plot is to be used within a TeX document, but it can be undesirable if all we want is the equation in TeX format.
This report introduces an R package called 'dvir' that allows the plot to be normal R graphics with just the equation rendered in full-quality TeX layout and fonts. The package is in the early stages of development, but it can reproduce plain LaTeX output within R, on a range of graphics devices, on Linux.
The next section describes the convenient high-level interface that the 'dvir' package provides for rendering LaTeX equations in R graphics. Subsequent sections document the lower-level interface and internal design of the 'dvir' package.
The simplest interface provided by the 'dvir' package is the grid.latex function. The first argument to this function is a character value, which is interpreted as LaTeX code. This can be just plain text, but it can also contain, for example, TeX mathematical expressions. The following code provides a simple demonstration.
library(dvir)
grid.latex("$x - \\mu$")
It is also possible to use standard LaTeX commands, as in the following example.
grid.latex("plain, {\\it italic}, and {\\bf bold}")
The following code shows how the grid.latex function can be used to generate a complete plot with the Gaussian function annotation (the LaTeX string tex was defined in the 'tikzDevice' example above).
xyplot(y ~ x, type="l", ylim=c(0, .6),
       panel=function(...) {
           panel.xyplot(...)
           grid.latex(tex, 0, .5, default.units="native")
       })
In the following example, we use the 'xtable' package (Dahl, 2016) to generate LaTeX code for a table and then 'dvir' to draw the table within a 'lattice' plot.
library(xtable)
xyplot(mpg ~ disp, mtcars,
       panel=function(...) {
           panel.xyplot(...)
           tex <- print(xtable(head(mtcars[1:3])), floating=FALSE)
           grid.latex(tex,
                      x=unit(1, "npc") - unit(2, "mm"),
                      y=unit(1, "npc") - unit(2, "mm"),
                      just=c("right", "top"))
       })
The grid.latex function is just a thin wrapper around a call to LaTeX, which generates a DVI file, followed by calls to functions that read the DVI file and render its contents in R. These functions that read and render DVI files make up the real heart of the 'dvir' package and provide the focus for the remainder of this report.
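The wrapper idea can be sketched roughly as follows. This is only an illustration, not the actual grid.latex implementation: the helper names texDocument and writeTeX are hypothetical, and the use of the 'standalone' document class is an assumption borrowed from the simple.tex example later in this report. The steps that need a TeX installation and the 'dvir' package are left as comments.

```r
# Wrap a LaTeX fragment in a minimal document
# (the 'standalone' class is an assumption, not confirmed 'dvir' behaviour).
texDocument <- function(tex) {
    c("\\documentclass[12pt]{standalone}",
      "\\begin{document}",
      tex,
      "\\end{document}")
}

# Hypothetical helper: write the document to a .tex file and return its path.
writeTeX <- function(tex, dir = tempdir(), name = "fragment") {
    texFile <- file.path(dir, paste0(name, ".tex"))
    writeLines(texDocument(tex), texFile)
    texFile
}

texFile <- writeTeX("$x - \\mu$")
dviFile <- sub("[.]tex$", ".dvi", texFile)

# The remaining steps require 'latex' and 'dvir', so they are not run here:
# system(paste0("latex -output-directory=", dirname(texFile), " ", texFile))
# grid.dvi(readDVI(dviFile))
```

Keeping the file-writing step separate from the rendering step makes it easy to inspect the generated LaTeX when the output is not what was expected.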
A standard LaTeX workflow consists of writing a source file containing LaTeX text and code (suffix .tex) and then running pdflatex on that file to produce a PDF document. An alternative is to run latex, which produces an intermediate DVI file (suffix .dvi), and then dvips or dvisvgm (or even dvipdfm) to produce a PostScript or SVG (or PDF) document from the DVI file.
A DVI file is a device-independent description of the placement of individual characters on a page. It contains instructions that move the current location across or up and down, define fonts, select fonts, place characters at the current location, and draw vertical and horizontal rectangles.
The 'dvir' package provides a function readDVI to read a DVI file into R. For example, the following LaTeX code describes a very simple document that just contains the word "Hello". This code has been saved in a text file called simple.tex.
\documentclass[12pt]{standalone}
\begin{document}
Hello
\end{document}
If we run latex on that file ...
system("latex simple.tex")
... we get a DVI file called simple.dvi and the following code reads that DVI file into R.
dvi <- readDVI("simple.dvi")
dvi
pre version=2, num=25400000, den=473628672, mag=1000, comment= TeX output 2020.01.14:1936
bop counters=1 0 0 0 0 0 0 0 0 0, p=-1
down4 a=1069005536
push
y4 a=-1073741823
down4 a=1073741823
push
right3 b=-4736287
y0
down3 a=546132
push
push
push
fnt_def_1 fontnum=14, checksum=1487622411, scale=786432, design=786432, fontname=cmr12
fnt_num_14
set_char_72 'H'
set_char_101 'e'
set_char_108 'l'
set_char_108 'l'
set_char_111 'o'
pop
pop
pop
pop
pop
eop
post
fnt_def_1 fontnum=14, checksum=1487622411, scale=786432, design=786432, fontname=cmr12
post_post
The DVI format is a binary format (Knuth, 1995), so 'dvir' uses the 'hexView' package to define memory blocks for each possible DVI operation and to read those memory blocks from the DVI file. The result of readDVI is a "DVI" object, which is a list of 'hexView' "rawFormat" objects ...
class(dvi[[1]])
[1] "flatRawFormat" "rawFormat"
dvi[[1]]
=======op.opcode
0 : f7 | 247
=======op.opparams.i
1 : 02 | 2
=======op.opparams.num
2 : 01 83 92 c0 | 25400000
=======op.opparams.den
6 : 1c 3b 00 00 | 473628672
=======op.opparams.mag
10 : 00 00 03 e8 | 1000
=======op.opparams.comment.length
14 : 1b | 27
=======op.opparams.comment.string
15 : 20 54 65 58 20 6f 75 74 70 75 74 20 32 30 32 30 2e 30 31 2e 31 34 | TeX output 2020.01.14
37 : 3a 31 39 33 36 | :1936
... and it is easy to run through the list of DVI operations simply by calling lapply (or sapply) on this list. For example, the following code generates a numeric vector containing all of the operation codes from the DVI file.
sapply(dvi, function(op) { hexView::blockValue(op$blocks$op.opcode) })
 [1] 247 139 160 141 165 160 141 145 161 239 239 159 141 141 141 243 185  72 101 108 108 111 142 142
[25] 142 142 142 140 248 243 249
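To make these numbers easier to read, a small lookup can translate some of them back into operation names. The sketch below is illustrative only: opName is a hypothetical helper (not part of 'dvir'), and it covers just a handful of the operations defined by the DVI format, using the standard opcode numbering.

```r
# Partial opcode-to-name lookup based on the DVI format's opcode numbering.
# 'opName' is a hypothetical helper for illustration, not a 'dvir' function.
opName <- function(code) {
    fixed <- c("139" = "bop", "140" = "eop", "141" = "push",
               "142" = "pop", "247" = "pre", "248" = "post",
               "249" = "post_post")
    key <- as.character(code)
    if (code >= 0 && code <= 127) {
        # Opcodes 0 to 127 typeset the character with that number
        paste0("set_char_", code)
    } else if (code >= 171 && code <= 234) {
        # Opcodes 171 to 234 select fonts 0 to 63
        paste0("fnt_num_", code - 171)
    } else if (key %in% names(fixed)) {
        unname(fixed[key])
    } else {
        "other"
    }
}

codes <- c(247, 139, 141, 185, 72, 101, 108, 108, 111, 142, 140, 248, 249)
sapply(codes, opName)
```

For example, opcode 185 is 171 + 14, which matches the fnt_num_14 operation in the printed DVI object.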
The grid.dvi function renders a "DVI" object by converting the DVI instructions from a DVI file into 'grid' drawing on an R graphics device.
grid.dvi(dvi)
The essential steps in faithfully rendering the DVI file are as follows:
The DVI coordinate system has (0, 0) at top-left and the scale of locations and distances is defined in the first "preamble" operation in the DVI file.
pre version=2, num=25400000, den=473628672, mag=1000, comment= TeX output 2020.01.14:1936
We multiply a location or distance by the numerator (num), divide by the denominator (den), multiply by the magnification (mag), and divide by 1000 to get a value in units of 10^(-7) m (that is, 10^(-4) mm).
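As a worked example of this scaling, the following sketch converts a DVI distance to millimetres. It assumes the DVI format's convention that num/den converts DVI units to units of 10^(-7) m; the function name dviToMM is hypothetical.

```r
# Scale factors from the preamble of simple.dvi
num <- 25400000
den <- 473628672
mag <- 1000

# Convert DVI units to mm, assuming that x * num / den * mag / 1000
# is in units of 10^-7 m (which is 10^-4 mm).
dviToMM <- function(x) {
    x * (num / den) * (mag / 1000) * 1e-4
}

# The cmr12 font definition has scale=786432 DVI units,
# which corresponds to 12 TeX points.
dviToMM(786432)
```

The result agrees with 12 TeX points expressed in mm (12 * 25.4 / 72.27), which is a useful sanity check on the units.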
The 'grid' package can specify locations and dimensions in mm via the unit function. What the 'dvir' package actually does is calculate a bounding box from the DVI operations, create a 'grid' viewport based on the size of that bounding box (in mm), with an x-scale and a y-scale that encompass the DVI operations, and render the DVI operations using "native" coordinates within the viewport.
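A minimal sketch of the viewport idea is shown below. The bounding box values are made up for illustration, the conversion to mm assumes the preamble values from simple.dvi, and the real 'dvir' code may construct its viewport differently.

```r
library(grid)

# Hypothetical bounding box in DVI units; DVI y values increase DOWN the page.
left <- 0
right <- 1800000
top <- -560000
bottom <- 0

# Conversion implied by the preamble (assumes base units of 10^-7 m)
dviToMM <- function(x) x * (25400000 / 473628672) * 1e-4

# The viewport is sized in mm and its scales are the raw DVI extents.
# The y-scale runs from 'bottom' to 'top' so that larger DVI y values
# are drawn lower on the page.
vp <- viewport(width = unit(dviToMM(right - left), "mm"),
               height = unit(dviToMM(bottom - top), "mm"),
               xscale = c(left, right),
               yscale = c(bottom, top))

# DVI locations could then be used directly as "native" coordinates, e.g.:
# pushViewport(vp)
# grid.text("H", x = unit(left, "native"), y = unit(bottom, "native"),
#           just = c("left", "bottom"))
# popViewport()
```

Using the DVI extents as the viewport scales means that no per-character unit conversion is needed; only the viewport size has to be expressed in mm.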
The most important part of a DVI font definition is the fontname. In our simple example, this name is cmr12 (a Computer Modern Roman serif font at 12pt size).
fnt_def_1 fontnum=14, checksum=1487622411, scale=786432, design=786432, fontname=cmr12
We must generate an R graphics font specification from just this font name, a task that is complicated by the fact that font specifications are different for different graphics devices in R.
In the case of the pdf graphics device, we specify a font by giving the name of a Type 1 Font definition and we define a Type 1 Font by specifying a path to an AFM (Adobe Font Metrics) file. We also need to find a path to a PFB (Printer Font Binary) file so that we can embed the actual font within the final PDF file.
The 'dvir' package uses the kpsewhich program to first find the font mapping file pdftex.map, which contains information on mappings from font names (as seen in the DVI file) to actual font files, and then kpsewhich again to find the actual font files.
Some typical results on an Ubuntu system with TeX Live are shown below.
First, we have the location of the pdftex.map file ...
mapfile <- system("kpsewhich pdftex.map", intern=TRUE)
mapfile
[1] "/var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map"
... and the line in this file for the cmr12 DVI font name shows the name of an actual font file at the end of the line ...
system("grep ^cmr12 $(kpsewhich pdftex.map)", intern=TRUE)
[1] "cmr12 CMR12 <cmr12.pfb"
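The font file name can be pulled out of a map line like this with a little string processing. The following is a simplified sketch (real pdftex.map lines can carry extra fields, such as encoding files or font transformations, which are ignored here); the variable names are for illustration only.

```r
# One line from pdftex.map, as shown in the grep output
mapLine <- "cmr12 CMR12 <cmr12.pfb"

# Split on white space; the first field is the DVI font name and
# the field that starts with "<" names the font file to embed.
fields <- strsplit(mapLine, "[[:space:]]+")[[1]]
dviName <- fields[1]
fontFile <- sub("^<+", "", grep("^<", fields, value = TRUE))

dviName
fontFile
```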
We can get the location of this PFB file ...
pfbfile <- system("kpsewhich cmr12.pfb", intern=TRUE)
pfbfile
[1] "/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm/cmr12.pfb"
... and the location of the corresponding AFM file ...
afmfile <- system("kpsewhich cmr12.afm", intern=TRUE)
afmfile
[1] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
This gives us enough information to create a Type 1 Font definition called "cmr12" in R ...
Type1Font("cmr12", rep(afmfile, 4))
$family
[1] "cmr12"

$metrics
[1] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[2] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[3] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[4] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[5] "Symbol.afm"

$encoding
[1] "default"

attr(,"class")
[1] "Type1Font"
... and we can then draw text with 'grid' using that font (on a pdf device) by specifying "cmr12" as the font family ...
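# Register the font definition so the pdf device can find the "cmr12" family
pdfFonts(cmr12=Type1Font("cmr12", rep(afmfile, 4)))
pdf("test.pdf", width=1, height=.5)
grid.text("Test", gp=gpar(fontfamily="cmr12"))
dev.off()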
For the resulting PDF file to display properly, it is best to embed the fonts in the PDF file with the embedFonts function. This requires us to specify the locations of the PFB files to embed.
embedFonts("test.pdf", outfile="test-embed.pdf", options=paste0("-sFONTPATH=", pfbfile))
The 'dvir' package provides a fontPaths function that can generate the information needed to embed fonts. It works with a grob that is created by grid.dvi. In the code below, rather than drawing DVI output, we just create a grob from a DVI object and use fontPaths to generate the locations of fonts within that DVI object.
fontPaths(dviGrob(dvi, device="pdf"))
[1] "/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm"
The postscript graphics device also uses Type 1 fonts, so it is handled in exactly the same way as the pdf device.
However, Cairo-based devices (Packard et al., 2018) (e.g., x11(type="cairo") and cairo_pdf) require a different approach to font mapping. Specifying a font on these devices is more convenient in R because we just have to give a font family name, like "Times". Unfortunately, that makes it harder to map a DVI font name to an R font specification for these devices.
The Cairo devices find an actual font from a font family name using a program called fontconfig. For example, when we specify "Times" in R, the actual font that gets used is whatever fontconfig finds as the best match.
system("fc-match Times", intern=TRUE)
[1] "texgyretermes-regular.otf: \"TeX Gyre Termes\" \"Regular\""
Because R only gives fontconfig a font family, the first step is to map the DVI font name to a font family. The 'dvir' package does this by looking at the AFM file (found above) and extracting the FamilyName field.
head(readLines(afmfile))
[1] "StartFontMetrics 2.0"
[2] "Comment Creation Date: Mon Jul 13 16:17:00 2009"
[3] "FontName CMR12"
[4] "FullName CMR12"
[5] "FamilyName Computer Modern"
[6] "Weight Medium"
This font family by itself is not specific enough to ensure that fontconfig will find the same font that was used to create the DVI file. However, it is possible to create configuration files for fontconfig that influence how it matches family names to actual fonts.
The 'dvir' package performs several steps to force fontconfig to find the right font.
First, we append the FullName from the AFM file to the family name (e.g., in the case above, the family name becomes Computer Modern CMR12).
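This first step can be sketched in a few lines of R. The header lines below are copied from the AFM output shown earlier; afmField is a hypothetical helper written for this illustration, not a function from 'dvir'.

```r
# The first lines of the cmr12 AFM file, as shown earlier
afmHeader <- c("StartFontMetrics 2.0",
               "Comment Creation Date: Mon Jul 13 16:17:00 2009",
               "FontName CMR12",
               "FullName CMR12",
               "FamilyName Computer Modern",
               "Weight Medium")

# Hypothetical helper: return the value that follows a given AFM keyword
afmField <- function(lines, key) {
    line <- grep(paste0("^", key, " "), lines, value = TRUE)[1]
    sub(paste0("^", key, " +"), "", line)
}

family <- afmField(afmHeader, "FamilyName")
fullname <- afmField(afmHeader, "FullName")

# The more specific family name that is passed on to fontconfig
paste(family, fullname)
```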
Next, we generate a fontconfig configuration entry that expands this family name to a more detailed font specification. The configuration also ensures that fontconfig will search in the directory that contains the font we want to match. An example is shown below and the effect of this configuration is that when fontconfig attempts to match a font with "family" name Computer Modern CMR12, the match is modified to find a font with "family" Computer Modern and "postscriptname" CMR12.
<dir>/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm</dir>
<match target="pattern">
  <test name="family" mode="eq">
    <string>Computer Modern CMR12</string>
  </test>
  <edit name="family" mode="assign" binding="strong">
    <string>Computer Modern</string>
  </edit>
  <edit name="postscriptname" mode="assign" binding="strong">
    <string>CMR12</string>
  </edit>
</match>
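An entry of this form is straightforward to generate from the font family, the font name, and the font directory. The sketch below is an assumption about how such text might be produced (fontconfigEntry is a hypothetical helper, and it does no XML escaping); it is not taken from the 'dvir' source.

```r
# Hypothetical helper that fills in a fontconfig entry like the one above.
# No XML escaping is done, so this is only suitable for simple names.
fontconfigEntry <- function(family, fullname, dir) {
    c(sprintf("<dir>%s</dir>", dir),
      "<match target=\"pattern\">",
      "  <test name=\"family\" mode=\"eq\">",
      sprintf("    <string>%s %s</string>", family, fullname),
      "  </test>",
      "  <edit name=\"family\" mode=\"assign\" binding=\"strong\">",
      sprintf("    <string>%s</string>", family),
      "  </edit>",
      "  <edit name=\"postscriptname\" mode=\"assign\" binding=\"strong\">",
      sprintf("    <string>%s</string>", fullname),
      "  </edit>",
      "</match>")
}

entry <- fontconfigEntry(
    "Computer Modern", "CMR12",
    "/usr/share/texlive/texmf-dist/fonts/type1/public/amsfonts/cm")
writeLines(entry)
```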
The 'dvir' package generates these font configurations on the fly, as it reads a DVI file, which means that it must force fontconfig to load these new font configurations. This is done via a fork of the 'gdtools' package, which provides a fontconfig_reinit function (see the Technical requirements Section).
As well as mapping to the correct font file, so that text is drawn correctly, 'dvir' must access the correct font metric information (because when a DVI file says to draw a character we must move the current position forward to the end of that character, which requires knowing the exact character width).
For the pdf and postscript devices, this font metric information is taken from the AFM file. For Cairo-based devices, an off-screen pdf device is created and the AFM files are used with that device (because the Cairo and PangoCairo font metric information that is provided by Cairo-based devices did not prove to be consistently accurate).
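To illustrate how a character width might be derived from AFM information, the sketch below extracts the WX value for 'H' from a cmr12 AFM line and scales it by the font size. It relies on the AFM convention that widths are given in 1/1000ths of the font size; the calculation is an approximation for illustration and is not necessarily how 'dvir' obtains its metrics.

```r
# AFM character metrics for 'H' in cmr12
metric <- "C 72 ; WX 734 ; N H ; B 41 0 692 683 ;"

# Extract the WX (width) value, which is in 1/1000ths of the font size
wx <- as.numeric(sub(".*WX +([0-9.]+) *;.*", "\\1", metric))

# The DVI font definition for cmr12 has scale=786432 DVI units
scale <- 786432

# Approximate distance (in DVI units) to move right after drawing 'H'
advance <- wx / 1000 * scale
advance
```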
The final piece required for properly rendering a DVI file in R is to correctly generate character values from set_char operations.
set_char_72 'H'
set_char_101 'e'
set_char_108 'l'
set_char_108 'l'
set_char_111 'o'
As we can see, the DVI file contains instructions such as "typeset character 72". How do we interpret the number 72 as a character? This number means the seventy-second character within the current font, so the answer depends on which font has been selected.
For example, if we look within the AFM files for two different fonts, cmr12 (for normal English text) and cmmi12 (for mathematical equations), we can see that, while the seventy-second character in both fonts is 'H', the thirty-third character is an exclamation mark in cmr12, but it is the Greek character omega in cmmi12.
StartFontMetrics 2.0
Comment Creation Date: Mon Jul 13 16:17:00 2009
FontName CMR12
FullName CMR12
FamilyName Computer Modern
Weight Medium
...
C 33 ; WX 271 ; N exclam ; B 86 0 184 715 ;
...
C 72 ; WX 734 ; N H ; B 41 0 692 683 ;

StartFontMetrics 2.0
Comment Creation Date: Mon Jul 13 16:17:00 2009
FontName CMMI12
FullName CMMI12
FamilyName Computer Modern
Weight Medium
...
C 33 ; WX 610 ; N omega ; B 12 -10 594 442 ;
...
C 72 ; WX 811 ; N H ; B 46 0 858 683 ;
We saw in the previous section that we can specify the current DVI font for R graphics devices, but we still need to select the correct character from the current font.
To make things more complicated, when we draw text within R, we must provide a character value. This means that we must somehow choose a character value that corresponds to the seventy-second character within the current font. This is the inverse of the first problem: we now want to go from a character to a number.
Fortunately, for the limited set of 128 ASCII characters, we can rely on a consistent mapping between characters and numbers. For example, the character value "H" will be converted to the number 72 and the character value "!" will be converted to the number 33 (and vice versa).
For the pdf and postscript graphics devices, we now have almost everything we need. We can take the character number in the DVI file and convert it to an ASCII character value ...
rawToChar(as.raw(72))
[1] "H"
... and we can rely on the character value "H" being converted to the number 72 to find the seventy-second character in the current font.
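The round trip between numbers and ASCII character values can be checked directly in base R. The following sketch uses the character numbers from the set_char operations for "Hello".

```r
# DVI character numbers from the set_char operations for "Hello"
dviChars <- c(72, 101, 108, 108, 111)

# Number to character value ...
chars <- rawToChar(as.raw(dviChars), multiple = TRUE)
chars

# ... and character value back to number
utf8ToInt(paste(chars, collapse = ""))
```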
However, one final detail is required. For Type 1 Fonts on the pdf and postscript graphics devices, there is another conversion to worry about. The character number is converted to a character name to find the actual "glyph" (letter shape) that will be drawn by the font. We can see this correspondence in the lines of the AFM files above. For example, in the cmr12 font, character 33 (C 33) corresponds to the character name exclam (N exclam). In the cmmi12 font, character 33 corresponds to the character name omega. These character names are used to identify the glyphs in the PFB files (which are the shapes that are actually drawn by the graphics device).
This means that we must set up the correct mapping between character numbers and character names for each Type 1 Font that we use.
The 'dvir' package generates this "character encoding" for each font from the font AFM file, based on the order of the characters within the file, and the character encoding is then specified as part of the Type 1 Font that we create. For example, the first few characters in the cmr12 font are shown below ...
StartFontMetrics 2.0
Comment Creation Date: Mon Jul 13 16:17:00 2009
FontName CMR12
FullName CMR12
FamilyName Computer Modern
Weight Medium
...
C 0 ; WX 611 ; N Gamma ; B 41 0 570 681 ;
C 1 ; WX 815 ; N Delta ; B 46 0 769 714 ;
C 2 ; WX 761 ; N Theta ; B 54 -21 707 704 ;
C 3 ; WX 679 ; N Lambda ; B 31 0 648 714 ;
C 4 ; WX 652 ; N Xi ; B 41 0 611 678 ;
... and this produces an encoding file that begins like this ...
/cmr12Encoding [ /Gamma /Delta /Theta /Lambda /Xi
... and this encoding file, along with the AFM file, is used to define a Type 1 Font with the correct encoding ...
$family
[1] "cmr12"

$metrics
[1] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[2] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[3] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[4] "/usr/share/texlive/texmf-dist/fonts/afm/public/amsfonts/cm/cmr12.afm"
[5] "Symbol.afm"

$encoding
[1] "/tmp/RtmpOV7AHy/cmr12.enc"

attr(,"class")
[1] "Type1Font"
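The step from AFM character metrics to an encoding can be sketched as follows. This is a simplified illustration using only the five AFM lines shown above; the exact layout that 'dvir' writes to its encoding file is an assumption here (a complete encoding would list a name for every character code, typically padding unused codes with /.notdef).

```r
# The first few character metric lines from the cmr12 AFM file
afmChars <- c("C 0 ; WX 611 ; N Gamma ; B 41 0 570 681 ;",
              "C 1 ; WX 815 ; N Delta ; B 46 0 769 714 ;",
              "C 2 ; WX 761 ; N Theta ; B 54 -21 707 704 ;",
              "C 3 ; WX 679 ; N Lambda ; B 31 0 648 714 ;",
              "C 4 ; WX 652 ; N Xi ; B 41 0 611 678 ;")

# Extract the character code (C) and the glyph name (N) from each line
code <- as.integer(sub("^C +(-?[0-9]+) *;.*", "\\1", afmChars))
glyph <- sub(".*; *N +([^ ;]+) *;.*", "\\1", afmChars)

# Assumed encoding layout: a named array with one "/glyphname"
# per character code, in code order
encoding <- c("/cmr12Encoding [",
              paste0("/", glyph[order(code)]),
              "] def")
writeLines(encoding)
```

Ordering the glyph names by character code is what lets the graphics device go from a character number to the glyph name at that position.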
In the case of characters in the English alphabet, on the pdf and postscript graphics devices, things are relatively straightforward. For example, the DVI character number 72 is converted to the R character value "H". That will be converted to the number 72 and the seventy-second character name in the encoding file for the font will be H, which will produce a glyph that draws an 'H'.
For characters outside the English alphabet, things can get more complicated. For example, the DVI character number 33 will be converted to the R character value "!", which will be converted to the number 33. For the font cmr12, the encoding converts 33 to the character name exclam, which will draw an exclamation glyph, but for the font cmmi12, the encoding converts 33 to the character name omega, which will draw a glyph representing the Greek character omega.
To produce a summation sign within a mathematical equation, the DVI font is cmex10, the DVI character number is 80, the R character value is "P", which is converted back to 80, and the encoding converts 80 to the character name summationtext, which will draw a summation glyph like the one below.
grid.latex("$\\sum$")
Unfortunately, the situation is completely different for Cairo-based graphics devices.
One advantage of using the Cairo-based devices is that we are able to use UNICODE UTF-8 character values when specifying text to draw, which means that we can specify any character in an R character value. For example, we can specify a Greek omega character with "\U03C9".
"\U03C9"
[1] "ω"
The downside of the Cairo-based devices is that, from R, we cannot control the details of the conversions between numbers and characters, such as character encodings, as we did for the pdf device. This means that we have to make sure that the R character value that we feed into a Cairo-based graphics device is something that will select the correct glyph from the current font.
The approach taken by the 'dvir' package is to convert DVI character numbers into UNICODE character values and rely on Cairo to map the UNICODE values to the correct font glyphs. For example, 72 is converted into "\U0048" (the UNICODE code point for "H") and Cairo maps that UNICODE code point to an 'H' glyph in the current font.
This approach still requires a conversion from DVI character numbers to UNICODE character values. As we noted at the start of this section, the DVI character number i specifies the ith character within the current font, so the conversion will depend on which font we are currently using.
The 'dvir' package currently specifies these conversions through hard-coded tables and supports UNICODE conversions for several basic TeX fonts: Computer Modern Roman (e.g., cmr12), Computer Modern Math Italic (e.g., cmmi12), Computer Modern Math Symbols (e.g., cmsy12), and Computer Modern Math Extensions (e.g., cmex10).
dvir:::rawToUTF8(as.raw(33), "CMR")
[1] "!"
dvir:::rawToUTF8(as.raw(33), "CMMI")
[1] "ω"
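The idea of a hard-coded conversion table can be illustrated with a tiny example. The tables below contain only the entries discussed in the text, and dviToUTF8 is a hypothetical stand-in for the conversion step; the internal tables and function signatures in 'dvir' may differ.

```r
# Tiny illustrative tables: DVI character number -> UNICODE code point.
# Only the entries mentioned in the text are included.
unicodeTables <- list(CMR  = c("33" = 0x0021, "72" = 0x0048),
                      CMMI = c("33" = 0x03C9, "72" = 0x0048))

# Hypothetical stand-in for the conversion step
dviToUTF8 <- function(n, font) {
    intToUtf8(unicodeTables[[font]][[as.character(n)]])
}

dviToUTF8(33, "CMR")
dviToUTF8(33, "CMMI")
```

The same DVI character number produces a different UNICODE character depending on the font, which is why the lookup has to be keyed on both.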
The cmex10 font is an exceptional case. This font contains symbols for mathematical equations such as large brackets (character name parenleftbig). The problem is that some of these symbols have no counterpart in UNICODE, which means that it is not possible to specify a UNICODE character value that will select these glyphs from a font.
The solution to this problem in 'dvir' is to provide a customised font called cmexunicode10. This is the same as the cmex10 font, but all of the glyphs in the font have been renamed to use the same character names as cmr12. The conversion from DVI number to UTF-8 that is used for cmex10 is the same as the conversion that is used for cmr12, but the glyphs in the two fonts are very different.
This customised cmexunicode10 font is included as part of the 'dvir' package.
In the case of characters in the English alphabet, on Cairo-based graphics devices, things are again relatively straightforward. For example, the DVI character number 72 is converted to the R character value "\U0048", which maps to a character name H, which produces a glyph that draws an 'H' (in most fonts).
For characters outside the English alphabet, things are a little more complicated. For example, the DVI character number 33 with font cmr12 is converted to "\U0021", which maps to a character name exclam, which draws an exclamation glyph, but with font cmmi12 it is converted to "\U03C9", which maps to a character name omega, which draws a Greek omega glyph.
To produce a summation sign within a mathematical equation, the DVI font is cmex10, which gets switched to cmexunicode10, the DVI character number is 80, the R character value is "\U0050" (UNICODE for "P"), which maps to a character name P, which draws a summation sign glyph (because the character names have all been modified in the cmexunicode10 font).
The 'dvir' package is in an early development stage. It has only been tested on Ubuntu systems, it only works with a subset of R graphics devices, and it only works with plain LaTeX documents.
The 'dvir' package is also very slow. It draws each character (and each filled rectangle) as a separate 'grid' grob.
Nevertheless, for this limited set of conditions, and assuming time is not critical, the 'dvir' package does produce full-quality TeX layout and fonts.
There are many ways in which the basic functionality of the 'dvir' package could be extended.
Getting the package working on Windows would be a big step forward, as would support for a wider range of graphics devices, especially the svg device.
Extending the range of supported fonts would also be useful, particularly to fonts outside of the standard TeX Computer Modern fonts and/or to other font types like TrueType and OpenType fonts.
It would be interesting to look at extending the package to support multi-page DVI files and to design support for xxxi operations (TeX \special commands). For example, it might be possible to support the LaTeX xcolor package or the graphicx package for included graphics.
Another interesting question would be how to combine the drawing of DVI output with normal 'grid' text output, with particular attention to aligning baselines.
Apart from the "plotmath" facility and the 'tikzDevice' package mentioned in the Mathematical Equations in R Section, another TeX-related R package is the 'texPreview' package (Sidi and Polhamus, 2018). This package automates the generation of PDF, PNG, or SVG images from LaTeX code. It is then possible to import the resulting image into R and draw it, but the quality of the result is unlikely to be ideal because of scaling of raster images (in the case of PNG) or because of conversion to outlines rather than real fonts (in the case of vector images).
Another package that works explicitly with DVI files is Duncan Murdoch's 'patchDVI' (Murdoch, 2015). The aim of that package is to assist with editing and debugging 'Sweave' (Leisch, 2002) and 'knitr' (Xie, 2015) documents by modifying the DVI file to allow links between the source document and the final output. It does not attempt to render the DVI file.
As demonstrated in one of the early examples, packages that generate LaTeX code are potentially useful for providing input to the grid.latex function in 'dvir'. Examples include 'xtable' and the latex function from the 'Hmisc' (Harrell Jr, 2017) package.
Outside of R, there is a module for the Python library matplotlib (Hunter, 2007) called dviread (The Matplotlib development team, 2018). This appears to take a similar approach to the 'dvir' package. It reads the raw DVI file and attempts to access the exact fonts and encodings to produce high-quality TeX output, though with similar limitations to 'dvir'. In some ways this is more like the 'tikzDevice' approach because the effect is applied to an entire plot at once, although it is closer to 'dvir' in the sense that the effect only applies to text labels within the plot.
The examples and discussion in this document relate to version 0.1-1 of the 'dvir' package, version 0.3-4 of the 'hexView' package, and version 0.1.7.9001 of the 'gdtools' package (which is a fork of the original package).
This report was generated within a Docker container (see Resources section below).
Murrell, P. (2018). "Revisiting Mathematical Equations in R: the 'dvir' package" Technical Report 2018-08, Department of Statistics, The University of Auckland.