If it's already plotted but you want to redraw it rather than plot beside it, you'll need a new plot. I presume that what you really need is to update the calculations by which the plot is drawn without having all the data available at one time.
I'll discuss an approach as a general algorithm, with a couple of parenthetical hints specific to an R implementation; similar considerations will apply in many other languages.
If you can make multiple passes through those portions of data (or at least know beforehand good bounds on the variable over all portions), then certainly something can be done.
Let's ignore issues like anti-aliasing and imagine we plot purely in monochrome; then our device is limited to some resolution. In practice we can probably go considerably coarser than that, since the eye is unlikely to discern more than some moderate number of positions.
Either way, let's say we would like to have a resolution of $M$ (e.g. 1000) "pixels". (Incidentally, 1000 is probably about four times as many as would serve any practical purpose.)
Since the maximum and minimum are represented on the plot, we need a scale that will include them. So we would need one pass through the data to find (or some other way to bound) the maximum and minimum, from which we would then compute the boundaries of our scale.
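As a rough illustration of that first pass, here is a minimal Python sketch (Python rather than R, purely to keep it language-neutral). The names `running_range` and `bin_edges` are hypothetical, each data portion is assumed to be a plain list of numbers, and the overall maximum is assumed to be strictly greater than the minimum.

```python
import math

def running_range(chunks):
    """Return (lo, hi) over an iterable of data portions, seen one at a time."""
    lo, hi = math.inf, -math.inf
    for chunk in chunks:
        if len(chunk) > 0:  # skip empty portions
            lo = min(lo, min(chunk))
            hi = max(hi, max(chunk))
    return lo, hi

def bin_edges(lo, hi, m=1000):
    """Return the m + 1 equally spaced bin boundaries spanning [lo, hi]."""
    width = (hi - lo) / m
    # Use hi itself as the last edge so rounding cannot push it below the maximum.
    return [lo + i * width for i in range(m)] + [hi]

# Toy data arriving in three portions (illustrative values only).
chunks = [[2.5, 0.1, 7.3], [4.4, 9.9], [-1.2, 3.0]]
lo, hi = running_range(chunks)
edges = bin_edges(lo, hi, m=10)
```

Only the running pair `(lo, hi)` has to be kept between portions, so memory use does not grow with the size of the data.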
We then create an integer vector with the required number of bins (one per notional "pixel" position) and pass through the data constructing a histogram (i.e. for each data value we add 1 to the count of the bin whose boundaries contain it).
(In R you could actually use `hist` with `plot = FALSE` to do this, by passing the precomputed bin boundaries (`breaks`) to it along with the given data portion, then accumulating the bin-counts for that portion into the overall bin-counts.)
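The accumulation step might look something like the following Python sketch. The function name `add_to_counts` is hypothetical, the bounds `lo` and `hi` are assumed to be known already, and bins are taken to be closed on the left, with the maximum folded into the last bin; other conventions would work equally well.

```python
def add_to_counts(counts, chunk, lo, hi):
    """Add one data portion to the running bin counts (modifies counts in place)."""
    m = len(counts)
    width = (hi - lo) / m
    for x in chunk:
        if x < lo or x > hi:
            raise ValueError(f"value {x} lies outside the scale [{lo}, {hi}]")
        # Index of the bin containing x; x == hi is folded into the last bin.
        i = min(int((x - lo) / width), m - 1)
        counts[i] += 1
    return counts

lo, hi, m = 0.0, 10.0, 5        # bounds assumed known from an earlier pass
counts = [0] * m                # one counter per notional "pixel"
for chunk in ([0.0, 1.9, 2.0], [9.99, 10.0], [5.5, 4.1]):
    add_to_counts(counts, chunk, lo, hi)

print(counts)  # [2, 1, 2, 0, 2]
```

Each portion can be discarded as soon as it has been counted; only the vector of $M$ counts persists.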
From the histogram we can then identify the bins within which each required statistic (like the median and the hinges) is located and then from those find the fences, whisker ends, points outside the fences and so on, by treating all the observations in a bin as occurring at the bin-center.
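To make that step concrete, here is a hedged Python sketch that recovers the boxplot quantities from the bin counts alone. The function name and the returned dictionary keys are hypothetical. The hinges are computed as Tukey-style hinges from order statistics (my understanding is that R's `fivenum` does the same), and the fences use the conventional 1.5 times the hinge spread; both are assumptions rather than the only possible choices.

```python
import math
from bisect import bisect_left
from itertools import accumulate

def boxplot_stats_from_counts(counts, lo, hi, coef=1.5):
    """Approximate boxplot statistics, placing every observation at its bin center."""
    m = len(counts)
    width = (hi - lo) / m
    centers = [lo + (i + 0.5) * width for i in range(m)]
    cum = list(accumulate(counts))   # cum[i] = number of observations in bins 0..i
    n = cum[-1]

    def value_at_rank(k):
        # Center of the bin holding the k-th smallest observation (k = 1..n).
        return centers[bisect_left(cum, k)]

    def at(d):
        # Average the two neighbouring order statistics when d is fractional.
        return 0.5 * (value_at_rank(math.floor(d)) + value_at_rank(math.ceil(d)))

    n4 = math.floor((n + 3) / 2) / 2            # depth of the hinges
    lower_hinge, median, upper_hinge = at(n4), at((n + 1) / 2), at(n + 1 - n4)

    step = coef * (upper_hinge - lower_hinge)
    lower_fence, upper_fence = lower_hinge - step, upper_hinge + step

    occupied = [(c, k) for c, k in zip(centers, counts) if k > 0]
    inside = [c for c, _ in occupied if lower_fence <= c <= upper_fence]
    outside = [(c, k) for c, k in occupied if not lower_fence <= c <= upper_fence]
    return {
        "n": n,
        "median": median,
        "hinges": (lower_hinge, upper_hinge),
        "fences": (lower_fence, upper_fence),
        "whiskers": (min(inside), max(inside)),  # most extreme points within the fences
        "outliers": outside,                     # (bin center, count) pairs beyond the fences
    }

# Ten unit-width bins on [0, 10]; illustrative counts only.
stats = boxplot_stats_from_counts([1, 2, 4, 2, 1, 0, 0, 0, 0, 1], 0.0, 10.0)
```

For these counts the median comes out at 2.5, the hinges at 2.0 and 3.5, the whisker ends at 0.5 and 4.5, and the single observation in the last bin (center 9.5) is flagged as lying beyond the upper fence. Every value is accurate only to within about half a bin width, which is the price of not holding the data.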
(If the variable is actually on a discrete lattice, you may be able to use many fewer bins, centered at the possible values.)
We can then plot the resulting boxplot. This could work with almost any size of data set as long as you can count the number of observations in a bin.
(In R, an integer count can go up to 2147483647L = $2^{31}-1$, but you could store the counts as double-precision floating point if you needed really big numbers. Note that in R `boxplot.default` is the default boxplot function; it in turn calls `boxplot.stats` to compute the quantities described above and `bxp` to actually produce the plot. One could simply adapt that default boxplot code to work through the data portions in lieu of the call to `boxplot.stats`.)