Getting Gnuplot Working in J


I have been trying to get Gnuplot working in J under linux. It should be easy, as there are at least two packages in J for using Gnuplot. However, it appears both of these packages presume that the user is operating the language under windows.

The motivator for this is having a way of visualising data in J. The plot package is what I would usually use, but the plot package seems to be incompatible with the i3 window manager in linux. In JQt when I try to plot something the plot window seems to be generated and then immediately vanishes. I have not been able to work out why. At least in gnuplot the window is generated and stays in place until dismissed. Using, gnuplot, although more work, also generates better-looking plots.

My particular interest is in loading, processing and plotting data files. These are usually in tab-delimited or in comma-delimited form, so let’s make an example file using the following data that gnuplot can easily read.

For this example we will look at comparing two models for the drag on a spherical object travelling at supersonic speed, such as a meteor. If you want to know more about drag on objects travelling at high speed you might want to look at this paper. The drag on a sphere is usually characterised in terms of a non-dimensional quantity known as the drag coefficient, which is the quantity we have tabulated in our data file. The definition of the drag coefficient is

C_D = \frac{D}{0.5 \pi \rho u^2 r^2}

where \(C_D\) indicates the drag coefficient, \(\rho\) is the gas density, \(u\) is the gas velocity and \(r\) is the diameter of the sphere.

This data is saved in the comma-delimited text file ‘DragDataSpheres.csv’. The first column contains the Mach number of the flow, and the second column contains the drag coefficient.

0.22703679008802968, 0.4754504865676933
0.4289222926495784, 0.5029886144003781
0.612450731328545, 0.52777191664627
0.8164462412329196, 0.693557725580885
0.8726990369926486, 0.779240903758345
0.984149624840694, 0.8814834195623001
1.1685642666036478, 0.9643307478710026
1.4622350885781084, 1.0056430036376527
1.7918182355274597, 0.9999307917591553
2.084687254711646, 0.9887089287070719
2.651941628756867, 0.9552155161499962
2.890034857321303, 0.9550838516918039
3.0728458935037084, 0.9328629423630395
3.201345340681785, 0.9521467214705908
3.347653249833309, 0.9382409290784333
3.494045559278546, 0.9298650439303529
3.6039347416929006, 0.9298042757188796
3.7687685153144335, 0.9297131234016695
3.842365571432188, 0.951792240236996
3.97018981626056, 0.9268367613919297
4.097971860942075, 0.8991163289248247
4.372905817712245, 0.9127891765063343
4.83077741110539, 0.9125359756251952
5.05042917549353, 0.9041195783361327
5.251766076145945, 0.8957133090823157
5.892955107483774, 0.9064186423368754
6.5339753382341765, 0.9060641611032807
6.9369445405670005, 0.908606297949917
7.339913742899824, 0.9111484347965532
7.852476726619007, 0.8942751280774458
8.786872378315877, 0.9158782272562309
9.556138855363217, 0.918217803397956
9.775537418870218, 0.8932116843766618

There is an empirical formula to fit this data, and it is given by a sum of two exponentials:

C_D (M) = 2.1 e^{-1.2(M+0.35)}-8.9e^{-2.2(M+0.35)} + 0.92

In hypersonic theory there is an approximate method for calculating drag, due to the great Lester Lees, called the modified Newtonian method. There’s an interesting story to Newtonian methods. As the name suggests, the model was first postulated by Isaac Newton as a way of measuring drag forces at subsonic speeds. The fluid was considered as a series of particles that hit the object like bullets fired from a gun. When the particles collide with a surface they slip along the surface, only imparting the perpendicular component of their momentum to the object. This model never really took off because it was very inaccurate at subsonic speeds. However the approximation becomes pretty good at hypersonic speeds, and so is often used as a simple way of determining the drag force, because it has the advantage of being simple to determine analytically. So I guess that proves that Newton really was ahead of his time. Lees took Newton’s idea and made a modifying calibration factor that removes an offset inherent to the model by fitting the relationship to the pressure coefficient at the stagnation point of the object, which makes it quite accurate. For a sphere, the drag can be computed using the Modified Newtonian method using the equation

C_D = \frac{1}{2} \left( \frac{2}{\gamma M^2} \left( \left[ \frac{(\gamma + 1)^2 M^2}{4 \gamma M^2 -2(\gamma-1)} \right]^{\frac{\gamma}{\gamma-1}} \left[ \frac{1-\gamma + 2 \gamma M^2}{\gamma+1} \right] -1 \right) \right)

as explained in this paper.

We would like to prepare the data in J, then plot it and save it to a graphical file using Gnuplot. First, let’s just plot the first set of data from the file. The following J code sets the directory and then loads the CSV file columns into the variables M_data and CD_data.

1!:44 '/home/sean/ownCloud/Org/blog/Gnuplot/'
load 'trig plot files csv'

NB. Read the data
a =: 5 }. readcsv 'DragDataSpheres.csv'
'M_data CD_data' =: 0 ". each |:(0,1){"0 1 1 a

text =: 0 : 0
set term qt
set style data points
set datafile separator ","
set pointsize 2
plot "DragDataSpheres.csv" using 1:2 pt 6
set ylabel 'Drag coefficient'
set xlabel 'Mach number'
set nokey
pause 5
set term png
set out 'testplot.png'
set out ''
set term qt

text fwrites 'testplot.gnu'
2!:0 'gnuplot < testplot.gnu'


Figure 1: Drag coefficient plot

We put all of our gnuplot instructions in an adjective definition called ‘text’. The ‘pause 5’ instruction pauses for 5 seconds to give us enough time to examine the plot before it goes away. Now that we have this, we can plot our two functions

NB. C_D (M) = 2.1 e^{-1.2(M+0.35)}-8.9e^{-2.2(M+0.35)} + 0.92
CD_Fit =: (2.1 *  (^_1.2*(M_data+0.35))) - (8.9* (^_2.2*(M_data+0.35)))) + 0.92

gamma =: 1.4
term1 =. 2%gamma * *: M_data
term2 =. (((*: M_data)*(*:gamma+1))%((4*gamma* *: M_data)-2*gamma-1))^gamma%gamma-1
term3 =. ((1-gamma) + 2*gamma* *: M_data)%(gamma+1)

CD_Mod_Newt =: 0.5*term1*(term2*term3)-1

(|:M_data,CD_data,CD_Fit,:CD_Mod_Newt) writecsv 'Compar.csv'

Note that I split the more complex modified Newtonian form into three separate terms before combining them. This eases debugging and also makes the probability of incorrectly parsing the right-to-left evaluation of the J code lower.

The three plots can be plotted together using the following gnuplot command…

text =: 0 : 0
set term qt
set style data points
set datafile separator ","
set pointsize 2
plot "Compar.csv" using 1:2 pt 6 t 'data', '' using 1:3 with lines lw 2 t 'empirical fit', '' using 1:4 with lines lt 4 lw 2 dashtype 2 t 'modified Newtonian'
set ylabel 'Drag coefficient'
set xlabel 'Mach number'
set yrange [0:1.2]
pause 10
set term pngcairo
set out 'Compar.png'
set out ''
set term qt


Figure 2: Drag coefficient plot

Note that it’s better to use the pngcairo terminal type than png. For some reason the latter will not do dashed lines, whereas the former will.

Physically, for those interested, it shows that the modified Newtonian theory is not as good a fit as the empirical curve fit at low Mach number, but it fits the behaviour well at Mach number of 5 or greater. The curve fit won’t help you at all if the ratio of specific heats \(\gamma\) is not 1.4, but the modified Newtonian method can account for that, which makes it very useful in predicting drag.

To summarise: by loading data, doing the appropriate calculations and saving the data to a text file, we can plot the data from J using a system call to gnuplot, which will plot the raw data and any manipulations we choose to perform, combining the power of J for data analysis with the rich plotting functionality of gnuplot.

NB. Here is the full J source code

Org-babel for J


As part of my emacs org-mode work flow, I have been using org-babel for a while. This allows you to insert code blocks into org buffers and have those blocks be executed when your file is compiled. This is a really handy method for doing reproduceable research. For example, you can call the source code function in R to do the statistical calculations for data in a table. If the data in the table changes, so will the calculation of the output data change. This prevents the perennial problem of having data in one file (typically a spreadsheet) and not knowing whether the document you generated for a paper used the 12th of September or the 15th of September version of the spreadsheet. By having explicit links to data and to the algorithm that manipulates that data, you can explicitly record the calculations you used to produce your data. And so can anyone else if they want to. This is very important for producing believeable data.

Org-babel is built into emacs org-mode, and supports an amazing array of programming languages, from compiled languages like C to interpreted languages like python or MATLAB, to specialised scripting languages like awk or gnuplot.

The best feature for me is that org-mode can read from or write to org tables, allowing a seamless integration between code and document. However, this capability differs between programming languages. Some languages, like python and common lisp, seem to be very well catered for in this regard. However my favourite programming language, J, is rather less well catered for. In particular, there does not seem to be a built-in way to pass variables to and from the code block. Instead, you can run your code as if it were a script, and the source block will provide the last calculated value as an output. For example,

#+BEGIN_SRC j :exports both
 NB. The square root of the sum of the squares of the numbers
 NB. between 1 and 10
	    [a =: %: +/ *: 1 + i.10
: 19.6214

The output, as stated by the comment, produces the Euclidean norm of the integers between 1 and 10 inclusive, which is 19.6214, and displays it as the result from evaluating the source block. However, for other programming languages one could supply a variable argument using the :var command in the header, to pass a variable argument to a function. So, for example, the 10 in the example above could be replaced by each of the values in the column of a table.

Like most things in emacs, the code for executing commands in code blocks is available as elisp. So, in theory, it should be possible to modify the existing elisp export code to pass variables, including rows and columns in tables, to a J function. At the moment though, my understanding of elisp is not sufficiently good to be able to work out how to do this, but it sounds like a very useful thing to do, and necessary if J is to be seriously used from within org-mode. If anyone has managed to do this, I’d be very interested to know how it’s done. If not, I’ll need to learn some more elisp and try to reverse engineer how it’s already been done for MATLAB code to see if I can do something equivalent for J.

Oh, and happy new year for 2021. I wanted to get one more blog entry done before the end of 2020, as an old-year’s resolution…

Marking pdf Assignments in Emacs


Most of the undergraduate student assignments I need to mark are numerical or analytical in nature.  The assignments are for small classes, and are often written on paper, and even if submitted electronically they are relatively easy to mark up on a tablet, because the answer is either correct or it isn’t.  But I recently had to mark a large number of assignments, for a course on the social context of engineering, where answers to questions are in paragraph form, and are fairly open-ended.  This requires explanation on my part for why marks were not allocated and suggestions on what would have made the answers better.

First time I did this I used a tablet and pen, and although it worked, it was very time consuming and messy to add comments.  Using Adobe Acrobat under Windows was not better, as I find the comment interface unwieldy when I have to use it a lot.  The most tedious aspect of this was that many of the comments were the same for large numbers of students, requiring you to type the same thing over and over again: this year I had around 75 students, with 6 pages and on average around 6 individual comments per page – a total of 2700 comments! Many of the mistakes were common to several students in the class.

In an attempt to improve the experience of marking these assignments, I looked at doing the marking in emacs, using two very simple tools: the pdf-tools package maintained by Andreas Politz, and the built-in registers of the editor.  If you do much marking up of pdfs, this may be of interest to you.

If you use emacs, and don’t currently use pdf-tools, you should.  It’s an easy install from the melpa archive, and it is a lot faster than the default pdf viewer DocView.  Most importantly, it has commands that are useful for marking up and annotating pdf files.  I have added the following commands to my init.el file to allow single-key commands in pdf-tools such as ‘h’ for highlight text, ‘s’ for strikeout and ‘t’ for text comment, as well as ‘d’ to delete existing comments.  I also have changed the default strikeout line from red to green, because red is so judgemental…

configuration of pdf-tools

Configuration file settings for pdf-tools

When this is loaded, opening the pdf file provides a view just like that of a normal pdf-viewer, scaled to fit in the emacs window.  If I select text and hit ‘h’ I highlight that text in the pdf and get to edit a comment on the highlighted text in the minibuffer.  Pressing C-c C-c exits from the minibuffer and the comment can be seen when the cursor rolls over the highlighted text in the pdf.  Pressing ‘t’ does the same, but produces an icon in the pdf instead of the highlighted text.

Annotating in pdf-tools

Annotating in pdf-tools

Once mastered, this is a very quick way of adding comments or deleting text is very fast, as only a few keystrokes are required to add annotations, which are automatically timestamped and user stamped.  Note that I changed my user designation to anonymous here, as sometimes (eg if using for annotating review papers) it’s not desirable to identify.  By default it’s your username.

As I hinted previously, what makes this even more useful is the effectiveness in combination with emacs’ registers.  I had not really used these much previously, as I had no idea what one was supposed to do with them, but populating many pdfs with identical comments is one example of where they can be useful.

While editing a comment in the minibuffer, you can select it and type C-x-r-s-1 and it will save that text to register 1.  Then the next time you want to insert that text in a comment, you type C-x-r-i-1 and it will insert the text.  The box below gives the general idea.

Insert comment from register

Using emacs registers to recycle comments

The process can then be repeated as many times as necessary when needed in a comment.  The usual C-x C-s will save the pdf for you.  Another handy command is C-c C-a l, which lists the annotations.

This procedure is not flawless: there is a bug that comes up every so often (but not always) when editing existing annotations with pdf-tools, where the link to the animation is lost and the updated animation cannot be edited.  If this happens, kill the annotation minibuffer and make and delete another comment, then try again.  This works for me.  The only other thing I’d like to see from the excellent pdf-tools package for marking up documents is a carat symbol for insertion of text.  But I can use it perfectly well in its current form.

You can find out more about registers at https://www.emacswiki.org/emacs/Registers.

The source for pdf-tools is currently at https://github.com/politza/pdf-tools.



Research notes


One of the problems I have been engaged with for many years now is the idea of how to keep a consistent record of my research that I can come back to years later and pick up where I left off.

Having thought a lot about this, I have some important criteria for such a record.  Lab notes must be

  1. Future-proof (not dependent on proprietary software);
  2. Based fundamentally on ASCII text;
  3. Flexible enough to cope with a variety of experiment structures (one-off experiments, repetitive experiments, numerical experiments etc);
  4. Easy to input and maintain;
  5. Able to time-stamp the state of an experiment at the time it was performed;
  6. Able to incorporate the source code used to analyse the data;
  7. Able to facilitate collaboration with my group members and external colleagues;
  8. Easy to transform into outputs like papers and presentations;

I have tried many different notetaking systems to achieve this, including commercial systems like LabArchives, Microsoft OneNote, Microsoft SharePoint, My own TikiWiki webpage, Zim desktop wiki, Tiddlywiki and others I have forgotten, but none of them could give me what I need according to the above 8 points.  I believe I now have a system that works the way I need it to for lab notes, though I know it is not for everyone: emacs org-mode.  I hope to write up a little more about org-mode in the future, and why I think it’s a great way to produce consistently high-quality, self-documenting research outputs.