One thing that is quite tedious about producing pgfplots figures is preparing the raw data, specifically making the data set smaller to avoid TeX's memory cap.
Generate some dummy data in R:
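library(data.table)  # provides fwrite()
nPoints <- 10^6
# Column 1: index 1..nPoints; column 2: cumulative sum of uniform noise
df <- data.frame(seq(nPoints), cumsum(runif(nPoints, 0, 1)))
# Space-separated, no header row, so pgfplots can address columns by index
fwrite(x=df, file="data.dat", sep=" ", col.names=FALSE)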
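For comparison, the manual thinning step I would like to avoid could look roughly like the sketch below. It assumes that keeping every 10000th row is acceptable (the same stride as in the filter attempted further down) and writes to a hypothetical file name, data_thin.dat. Neither choice is prescribed by pgfplots.

```r
library(data.table)  # provides fwrite()

# Regenerate the dummy data so this sketch runs on its own
nPoints <- 10^6
df <- data.frame(seq(nPoints), cumsum(runif(nPoints, 0, 1)))

# Assumption: a stride of 10000 is coarse enough for the plot.
# Keeps rows 1, 10001, 20001, ..., i.e. the rows where a
# zero-based row counter is divisible by the stride.
step <- 10000
keep <- seq(1, nrow(df), by = step)
dfThin <- df[keep, ]

# data_thin.dat is a hypothetical output name, not used elsewhere in this post
fwrite(x = dfThin, file = "data_thin.dat", sep = " ", col.names = FALSE)
```

The thinned file has only 100 rows, but the thinning has to be redone outside of TeX whenever the data or the desired stride changes.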
Plotting data.dat directly results in a capacity exceeded error on my machine:
TeX capacity exceeded, sorry [pool size=6177416].
I thought it would be possible to filter the input data directly by doing something like this:
\documentclass[preview]{standalone}
\usepackage{pgfplots}
\pgfplotsset{compat=1.16}
\begin{document}
\begin{tikzpicture}
\begin{axis}[]
\addplot+[only marks] table [
    x index={0},
    y expr={ifthenelse(mod(\coordindex, 10000) == 0, \thisrowno{1}, NaN)},
    unbounded coords=jump,
] {data.dat};
\end{axis}
\end{tikzpicture}
\end{document}
However, this just loads the whole data file and then displays the specified points, instead of loading only the specified points. The pgfplotstable package offers \pgfplotstabletypeset[every nth row={integer}[shift]{options}], which looks useful. However, it isn't clear what the options should be in order to delete rows from the read data, or whether the typesetting happens during or after the data is read.
Is it possible to read only selected lines of a file with pgfplotstable, and if so, how?