
For some time, I have wanted to stop copy-pasting my R results into Word, but climbing the LaTeX mountain seemed too much to be worth it. Recently, I discovered LyX, a layman's solution for people like me who do not wish to code their text, but do wish to combine R analysis with text.

However, I found there is very little up-to-date documentation about LyX+R, which leads me to ask:

  1. What tools do you combine in your R+LyX workflow? (do you combine text editors on top of LyX? which? and why?)
  2. What is your common folder structure for an analysis project?
  3. What is the order of steps you take for constructing your analysis? What code do you keep in file.r? and what in other types of files? (images, .RData backups, .TeX, and so on)
  4. Do you use different work strategies for different types of projects (due to size of project or size of dataset)?
  5. What R packages do you combine in your work? and how?
  6. Recommended links?
Tal Galili
  • duplicate: http://stackoverflow.com/questions/7464220/how-to-setup-r-with-lyx – richiemorrisroe Sep 19 '11 at 10:16
  • Tal, given that you can use R, you really shouldn't have any problem learning LaTeX. For my money, the best solution is Emacs + AUCTeX, which gives some really handy shortcuts and highlighting of both R and LaTeX code. I found that LyX just got in my way when I tried it. – richiemorrisroe Sep 19 '11 at 10:18
  • Hi Richie - 1) Thank you for the vote of confidence. I am sure I can learn LaTeX, but from reading what other people say, I suspect that LyX could make it faster. 2) This question is (IMHO) not a duplicate. There is a difference between setting the system up and having a workflow. BTW - what is your workflow on this? – Tal Galili Sep 19 '11 at 10:24
  • @Tal I used to write my code in TeXnicCenter (with no highlighting), then call Sweave(file.choose()), rinse, repeat. Now I just use Emacs: I write my document in one buffer, have R open in another to test, then call Sweave from the .Rnw buffer and alter my Sweave file based on what happened (any errors or unwanted results). The major advantage of Emacs is that R is running right there, so you can replicate the query workflow common in the R GUI, and then copy the working code over to a Sweave file. Then do all the formatting with LaTeX (again in Emacs) and you're done. – richiemorrisroe Sep 19 '11 at 10:29
  • I'll put this as a comment, since I can't answer most of your question. I'm a techie guy and long-time UNIX user, but have simply never liked Emacs. My personal workflow uses R and BBEdit (on a Mac) for keeping track of my calculations, with LyX for creating a final report. For very simple reports that are aimed at a non-technical audience, I'll just use Pages. LyX has its pluses and minuses: it makes many things much easier than raw LaTeX and helps with things like hiding footnotes or graphs if you want, but if you want to go against the LaTeX grain (hard anyhow) it makes that harder. – Wayne Sep 19 '11 at 13:04

3 Answers


Tal, I also jumped on the LyX-Sweave bandwagon and actually started off writing my Psychology Masters thesis using LyX.

However, I abandoned LyX due to various problems (e.g. adhering to the new APA 6th standard and other journal specifics, correct formatting of references etc.) and converted to straight tex. I found it difficult to achieve what I wanted in LyX - I had to resort to evil red text (pure tex) all the time, and realized that just sticking to evil red text throughout (i.e. plain tex) gave me much more flexibility and more opportunities.

With regard to actual publication, there are also very few journals within my field (psychology) that will accept submissions in tex or pdf - and converting from tex to Word is a pain, especially with tables...

A third issue was collaboration: very few of my colleagues used tex, even fewer used LyX, and the university offered no support for installing or maintaining installations, meaning people had different versions, missing packages etc.

What I have resorted to now is to use R (RStudio) for all my analyses with documentation, but only to produce tables and figures. I then write my papers in Word and include the pdf tables (xtable is excellent) and figures from R in those. I find most journals allow you to upload tables and figures in pdf format.
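As a rough sketch of that "R only for tables and figures" approach (the folder and file names are made up for illustration, and the built-in cars data stands in for a real analysis): the figure goes straight to a PDF with base R, and the table is written as a LaTeX fragment with xtable if that package is installed. The fragment would still need to be compiled to get a pdf table.

```r
# Fit a small model on a built-in dataset (stand-in for a real analysis).
fit <- lm(dist ~ speed, data = cars)

# Hypothetical output folder; any writable directory would do.
out_dir <- file.path(tempdir(), "paper-output")
dir.create(out_dir, showWarnings = FALSE)

# Figure: write a scatterplot with the fitted line to a PDF file.
fig_file <- file.path(out_dir, "speed-vs-dist.pdf")
pdf(fig_file, width = 5, height = 4)
plot(dist ~ speed, data = cars)
abline(fit)
dev.off()

# Table: write the coefficient table as a LaTeX fragment,
# but only if the xtable package is available.
tab_file <- file.path(out_dir, "coefficients.tex")
if (requireNamespace("xtable", quietly = TRUE)) {
  coef_table <- xtable::xtable(summary(fit)$coefficients)
  print(coef_table, file = tab_file)
}
```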

Another possibility is to use RStudio for writing the analysis. You need very few lines of LaTeX code to produce a sweaveable .Rnw file:

\documentclass{article}
\begin{document}
<<>>=
# your R code goes here
@
\end{document}

Use this file to conduct your analyses, then hit "Compile PDF" in RStudio (or run R CMD Sweave filename.Rnw at a command prompt) to make the tex file (and the PDF, if you used RStudio). Open this tex file with an editor like TeXnicCenter or TeXworks to enter and edit the surrounding text. These editors give you shortcuts for commands like bold and italics, heading levels etc., and save you from learning the actual code. When you get more advanced, using \Sexpr{} to insert results directly into the text is not going to be hard! If you get errors relating to a missing Sweave.sty, just make a copy of this file (found in the R directory r\r-version\share\texmf) and place it in the same directory as the file you are currently trying to "sweave".
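To make the \Sexpr{} part concrete, here is a minimal sketch that can be run from a plain R session. It writes a tiny .Rnw file to a temporary folder and weaves it with Sweave() from the utils package. The file name demo.Rnw, the chunk label and the toy data are assumptions for illustration only.

```r
# Contents of a tiny .Rnw file: one code chunk plus one \Sexpr{} in the text.
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<toy-mean>>=",
  "x <- c(2, 4, 6, 8)",
  "m <- mean(x)",
  "@",
  "The mean of x is \\Sexpr{m}.",
  "\\end{document}"
)

# Hypothetical working folder for the demo.
work_dir <- tempfile("sweave-demo-")
dir.create(work_dir)
rnw_file <- file.path(work_dir, "demo.Rnw")
writeLines(rnw_lines, rnw_file)

# Sweave() writes demo.tex into the current working directory,
# so switch there for the call and switch back afterwards.
old_wd <- setwd(work_dir)
utils::Sweave(rnw_file)
setwd(old_wd)

# In the woven file, \Sexpr{m} has been replaced by the computed value.
tex_text <- readLines(file.path(work_dir, "demo.tex"))
sexpr_line <- grep("The mean of x is", tex_text, value = TRUE)
print(sexpr_line)
```

The resulting demo.tex is an ordinary LaTeX file, so it can be opened in any of the editors mentioned above and compiled as usual.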

I have seen your skills in R on your blog and in your answers around here, so I know learning the tiny bit of tex required to use sweave will be no match for you!

Good luck.

Tormod
  • Indeed, the inability of psychology journals to accept anything but Word is a pain. oolatex and htlatex from the mk4ht package are very useful in this situation though. – richiemorrisroe Sep 20 '11 at 08:47
  • You can use a combination of `R CMD Sweave` + `texi2pdf` to compile your document, without having to worry about `Sweave.sty`. – chl Feb 15 '12 at 10:43

Here's my take on your questions. Your mileage may vary.

1) For tools, everything except R + LyX is icing on the cake. In my case, I use Emacs + ESS + AUCTeX, Org-Mode, the terminal, and RStudio. Again, R + LyX will get you by in a pinch.

2) LyX gets rid of (read: hides) a lot of the extra folders/crap of which you need to keep track. I just opened one of my projects and I just had a root directory and an /img folder, for holding those images that aren't generated by Sweave. Everything else is, well, icing on the cake.

3) With LyX, you just get up and go! The point is you don't have to fiddle with things like documentclass or anything else. Just start writing, and you can polish everything else later. Yes, sometimes I will run a lengthy computation and save it in an .RData, which I load later. I don't fiddle with the R code (i.e. .r file), because I can tangle that later. (At least, you used to be able to).

4) If it's a very small project, I use RStudio. If it's a medium-sized project, I open LyX and get started. If it's a huge project with lots of coding, I'll usually use Emacs/ESS and copy-paste to LyX later. If it's a REALLY huge project, I've used LyX but more recently Emacs Org Mode.

5) I use the same R packages that I use everywhere else, and LyX is not a term in that equation.

6) Yihui Xie has a lot of great stuff.
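On the "tangle that later" remark in point 3: with a plain Sweave file (outside LyX), the R code can be pulled back out with Stangle() from the utils package. The sketch below is only an illustration under that assumption; the file name demo.Rnw and the toy chunk are invented for the example.

```r
# A tiny .Rnw file with one code chunk (made up for this example).
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<toy-mean>>=",
  "x <- c(2, 4, 6, 8)",
  "m <- mean(x)",
  "@",
  "Some surrounding text.",
  "\\end{document}"
)

# Hypothetical working folder for the demo.
work_dir <- tempfile("stangle-demo-")
dir.create(work_dir)
rnw_file <- file.path(work_dir, "demo.Rnw")
writeLines(rnw_lines, rnw_file)

# Stangle() writes demo.R into the current working directory,
# so switch there for the call and switch back afterwards.
old_wd <- setwd(work_dir)
utils::Stangle(rnw_file)
setwd(old_wd)

# The extracted script contains only the R code from the chunks.
r_code <- readLines(file.path(work_dir, "demo.R"))
print(r_code)
```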

Some general comments to help your decision to keep/abandon LyX:

a) LyX has a knowledgeable community to help you, and they are responsive.

b) I've worked on projects large and small with LyX. It's really powerful for 1) something quick or 2) something huge, where you need to label, do an index, and/or bibliographies. This functionality exists for Emacs but with LyX it's out-of-the-box, follow-your-nose.

c) LyX isn't so good for instant-preview of your LaTeX and/or Sweave code (which can be a real PITA for Sweave figures). I've found that Emacs Org-Mode does both, while AUCTeX does the former.

d) and BTW: now that I think about it, I HAVE used text editors (Emacs/ESS or RStudio) to run/polish code before pasting in the .lyx file, simply because when your project is big it is unproductive to compile the whole thing just for a few lines. When the project is small, it doesn't matter either way.

e) and BBTW, @Tormod is right that collaboration is an issue, but it always is and will continue to be for the foreseeable future. With LyX you can export to Sweave/LaTeX/HTML/OpenDocument, and if none of your buddies use anything on that list then you need to find different buddies. :-)

f) B^{3}TW: the siren song which initially lured me to LyX was its automatic handling of figures and tables - they couldn't be simpler. You can mix and match whatever figure filetypes you like, and LyX knows which packages to load and conversions to do such that it just works. This is a pain that I am enduring again as I do other projects with Org Mode, AUCTeX, and RStudio.


I suppose I should probably give a full answer, so here it is.

I don't use LyX, I use vanilla LaTeX. I tried using LyX, but it confused me, and I actually found plain LaTeX to be much more understandable. For my money, the interface to LyX was too much like a word processor and hid the code from me (although I didn't use it for that long, so there may be ways around it). I use Emacs with Emacs Speaks Statistics (ESS) and AUCTeX for my Sweave and LaTeX files. This has the benefits of extremely good documentation, cross-platform support and syntax highlighting for both LaTeX and R within an Rnw file. As stated in my comment, it also makes it much easier to go from interactive analysis in the R buffer (an Emacs term) to the Sweave file that I intend to use for my report/thesis/paper.

I keep absolutely everything I've done in the Sweave file (as bitter experience tells me that the one thing you don't put in the file will be what you have a problem with).

I normally have one folder per project/paper, and this folder contains all of the input data files, the sweave files and all the output graphics/data files. This folder tends to become quite crowded over time, so I normally create a subfolder for my final document and rerun the analysis from there.
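For what it's worth, a one-folder-per-project layout like this is easy to script. The sketch below is just one possible variation: the folder names are invented, and splitting inputs and outputs into their own subfolders goes a bit beyond the single crowded folder described above.

```r
# Hypothetical project root; in practice this would be a real path.
project_dir <- file.path(tempdir(), "my-analysis")

# Assumed subfolders: raw inputs, generated output, and the final document.
sub_dirs <- c("data", "output", "final")
for (d in sub_dirs) {
  dir.create(file.path(project_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# An empty Sweave file at the top level to hold the whole analysis.
rnw_file <- file.path(project_dir, "analysis.Rnw")
file.create(rnw_file)

print(list.files(project_dir))
```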

I put everything in the Sweave file as stated above, to ensure that all the analysis can be rerun. I typically call rm(list=ls()) in my R buffer before regenerating a Sweave file, just to be sure. Rda files are dangerous, as the objects within may not be fully reproducible. That being said, if you have a computation that takes a really long time, then after you're sure that it works, you could change the Sweave chunk to eval=FALSE and then load that object from an Rda file. That's a last resort though (this advice shamelessly stolen from Andrew Gelman).
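In plain R, the last-resort caching idea looks roughly like the sketch below. The file name slow_fit.Rda is an assumption, and a quick lm() on the built-in cars data stands in for the long-running computation. In a Sweave file the computing part would sit in the chunk that gets switched to eval=FALSE, with the load() call in a separate chunk.

```r
# Hypothetical cache location for the expensive result.
cache_file <- file.path(tempdir(), "slow_fit.Rda")

if (file.exists(cache_file)) {
  # Reuse the saved object instead of recomputing it.
  load(cache_file)
} else {
  # Stand-in for a computation that takes a really long time.
  slow_fit <- lm(dist ~ speed, data = cars)
  save(slow_fit, file = cache_file)
}

print(coef(slow_fit))
```

Deleting the Rda file forces a full recomputation, which is a cheap way to check that the cached object can still be reproduced from the code.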

I try to keep the same strategy for all my analyses, but that may change if I start doing more Bayesian work (due to the large amounts of computation involved).

The package I use most for LaTeX is xtable, which suits my needs. I don't tend to use cacheSweave, as I have a morbid fear that I won't be able to replicate my analyses afterwards.

Recommended links: Sweave manual; Emacs for Windows with ESS and AUCTeX; TeX for the Impatient

To be honest, I found that the best way to learn LaTeX is to start hacking away at it. There are many Sweave templates floating around the net, get one and start coding. Googling of error messages will probably sort you out, unless you want to do fancy stuff. There's a guide to sweave from a psychologist also floating around which is also pretty useful.

Finally, LaTeX in general works way, way better on either a Mac or Linux (Ubuntu is always nice), and Emacs is much easier to extend on Linux. If you do need to stay on Windows, install both R and LaTeX (MiKTeX is good; select the "install packages on the fly" option) into a location with no spaces in the path (for example, c:\bin rather than c:\Program Files). This will save you a lot of hassle in the long run (but really, install Linux: R is faster, LaTeX works better and the command line tools are to die for).

richiemorrisroe
  • Hi Richie - thank you for the detailed response! You are actually making me consider letting go of LyX. I won't run over to Emacs or Linux in the near future (I think LaTeX will be enough for me at this point). I do consider using RStudio... – Tal Galili Sep 19 '11 at 12:30
  • @TalGalili come to the dark side - we have cookies. Incidentally, if you're ever doing experiments with millisecond response time recording, then Linux is about the only way you can go (that's what happened to me). – richiemorrisroe Sep 19 '11 at 12:51