scimax-jupyter

/Users/jkitchin/Dropbox/emacs/scimax/scimax-jupyter.png

There are two approaches to using jupyter in scimax.

The first one is based on https://github.com/gregsexton/ob-ipython which uses a web-server to communicate with the kernel. This is currently ([2021-11-02 Tue]) the default in scimax.

The second one is based on https://github.com/nnicandro/emacs-jupyter which uses a compiled module to communicate directly with a kernel using zeromq. This requires an Emacs with support for compiled modules, and that you are able to compile the module.

I have been trying to move away from ob-ipython. The web-server really limits things like completion and inspection, and also is sometimes problematic in debugging. Using zeromq to interface with the kernel solves some of these problems. This document shows how to use it.

It is not the default (yet) because you have to be able to compile the zmq library, and at least on my Mac, this requires some manual intervention. I am not sure how easy it is to do that in Windows.

(require 'scimax-jupyter)

Customization in scimax

Overall, emacs-jupyter is an improvement on ob-ipython. There are a few things I want that don’t come out of the box with emacs-jupyter. Here are a few customizations I have done in scimax.

default header args

These are the settings that work well for me.

org-babel-default-header-args:jupyter-python

Some of these are scimax-specific. For example :results . "both" captures both printed and returned values, which is most consistent with Jupyter notebooks. I set :pandoc . "t" to convert outputs like html to org format.

buffer specific kernels that close when you kill the buffer

I find it confusing to have one kernel shared among many files.

  1. It is easy to mess up the state if you use similar variables in different files
  2. I often assume the CWD is the directory of the file I am working in, but the kernel's CWD is the directory it was started in, which is often the directory of a different org-file
  3. I want the kernel to shut down and close when I close the buffer because I don’t need it after that.
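Point 2 is easy to check from inside a src block. This is a minimal sketch using only the Python standard library; `kernel_cwd` and `use_directory` are hypothetical helper names, not part of scimax or emacs-jupyter.

```python
import os
from pathlib import Path


def kernel_cwd():
    """Return the directory the kernel process is currently working in."""
    return Path.cwd()


def use_directory(path):
    """Change the kernel's working directory and return the new location."""
    os.chdir(path)
    return Path.cwd()


# Report where relative paths such as './test.png' will resolve.
print(kernel_cwd())
```

With a shared kernel, a `use_directory` call made from one org-file silently changes where relative paths resolve for every other file using that kernel, which is one more argument for buffer-specific kernels.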

You can set a buffer specific kernel with yasnippet: <jps

scimax closes kernels when you close their buffer.

:results raw seemed to be broken in emacs-jupyter

It works in scimax.

elisp:(scimax-jupyter-advise t) this un-advises emacs-jupyter

Not raw

for i in range(3):
    print(i)
for i in range(3):
    print(i)

elisp:(scimax-jupyter-advise) this advises emacs-jupyter the scimax way

emacs-jupyter in scimax is more consistent with org-babel

Out of the box, :results value gives you a mix of output and value, which is not quite right. On the one hand, that is consistent with what you would get in a terminal. On the other hand, it is not fully consistent with org-babel.

I modified this to be closer to org-babel behavior. Note, however, that if you have any code that uses IPython display (e.g. plots, rich outputs, etc.), you will not get what you expect. The display results always come last, and it is not clear that you can order them to get the right last line.

  • :results value returns the last line.
print(5)

3-5

If you choose output for results, that is all you get; nothing is returned for the last line.

print(5)


[9 + 9, 4]
3-5

scimax provides “both” to get the original behavior. This is also the default setting in scimax.

print(5)

[9 + 9, 4]
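The difference between the three settings can be illustrated outside of Emacs. The sketch below is not how emacs-jupyter or scimax implement it; `run_cell` is a hypothetical helper that separates what a cell prints from the value of its last expression, using only the standard library.

```python
import ast
import contextlib
import io


def run_cell(source):
    """Run source and return (printed_text, value_of_last_expression)."""
    tree = ast.parse(source)
    last = None
    # If the cell ends with a bare expression, split it off to evaluate it.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last = ast.Expression(tree.body.pop().value)
    namespace = {}
    buffer = io.StringIO()
    value = None
    with contextlib.redirect_stdout(buffer):
        exec(compile(tree, "<cell>", "exec"), namespace)
        if last is not None:
            value = eval(compile(last, "<cell>", "eval"), namespace)
    return buffer.getvalue(), value


printed, value = run_cell("print(5)\n\n[9 + 9, 4]")
# printed is "5\n" (what "output" shows); value is [18, 4] (what "value" shows).
# "both" corresponds to showing printed followed by value.
```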

scimax jupyter src-block hydra

Try it: elisp:scimax-jupyter-org-hydra/body

Easy access to:

  • inspect (M-i)
  • completion (M-tab)
  • editing functions
  • kernel management

Examples of usage

Getting help

import numpy as np

?np.linspace

np.linspace
??np.linspace

If you have your cursor on linspace, type M-i or f12-/ to inspect it.

np.linspace
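The `?` and `??` forms are IPython syntax. For reference, here is a rough plain-Python equivalent using the standard `inspect` module. It uses `textwrap.dedent` as the target only so the example has no third-party dependency; the same calls work on `np.linspace`.

```python
import inspect
import textwrap

# Rough equivalent of `?textwrap.dedent`: signature plus docstring.
print(inspect.signature(textwrap.dedent))
print(inspect.getdoc(textwrap.dedent))

# Rough equivalent of `??textwrap.dedent`: the source code, when available.
source = inspect.getsource(textwrap.dedent)
print(source)
```

`inspect.getsource` raises `TypeError` for functions implemented in C, where no Python source is available.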

Completion

Use M-tab to complete the thing at point. Sometimes you have to type it more than once.

np.geomspace

Plotting with matplotlib

Figures work like you expect.

import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 20 * np.pi, 350)
x = np.exp(-0.1 * t) * np.sin(t)
y = np.exp(-0.1 * t) * np.cos(t)

plt.plot(x, y)
plt.axis('equal')

plt.figure()
plt.plot(y, x)

plt.axis('equal')

print('Length of t = {}'.format(len(t)))
print('x .dot. y = {}'.format(x @ y))

plotly

Emacs still does not natively render html or interactive javascript. Until that happens, I have monkey-patched plotly to capture a static image and save the interactive html so you can still use it in a browser.

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()

<<44d53136-5dc5-45ca-b851-56c64248b5ce>>

from pycse.plotly import *

import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()
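The monkey-patching idea in this section can be sketched without plotly installed. This is only an illustration of the pattern, and pycse.plotly may do it differently. `DemoFigure` and `patch_show` are hypothetical names; the stand-in class mimics the `write_html`, `write_image` and `show` methods that plotly figures provide.

```python
import functools


class DemoFigure:
    """Hypothetical stand-in for a figure class; it only records what was written."""

    def __init__(self):
        self.written = []

    def write_html(self, path):
        self.written.append(path)

    def write_image(self, path):
        self.written.append(path)

    def show(self):
        return "interactive output"


def patch_show(cls, basename):
    """Replace cls.show with a version that saves files instead of rendering."""
    original_show = cls.show

    @functools.wraps(original_show)
    def show(self):
        self.write_html(f"{basename}.html")  # keep the interactive version
        self.write_image(f"{basename}.png")  # static image Emacs can display
        return f"{basename}.png"

    cls.show = show
    return original_show  # returned so the patch can be undone


original = patch_show(DemoFigure, "iris")
fig = DemoFigure()
image_path = fig.show()
```

Returning the original method makes it possible to restore the class later with `DemoFigure.show = original`.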

pandas

Using the :pandoc "t" header generally makes pandas behave well with org-mode. Turn that off if you want with an empty header like :pandoc

import pandas as pd

f = pd.DataFrame([['a', 'b'], [1, 2]])
display(f)
|   | 0 | 1 |
|---+---+---|
| 0 | a | b |
| 1 | 1 | 2 |

Figures and Tables with captions, names, attributes

pycse.orgmode defines several helpful classes to make org figures and tables with attributes.

from pycse.orgmode import *

Table([['x', 'y'],
       [1, 2],
       [3, 4]],
      headers='firstrow',
      name='org-data',
      caption='The information about the table',
      attributes=[('latex', ':environment longtable :align |l||l|')])
| x | y |
|---+---|
| 1 | 2 |
| 3 | 4 |

See Table ref:org-data.
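To make the idea concrete, here is a minimal sketch of how rows plus a name and caption can be turned into org table text. It is not the pycse.orgmode implementation; `org_table` is a hypothetical function and it ignores the attributes argument.

```python
def org_table(rows, name=None, caption=None):
    """Format rows as org-mode table text; the first row is the header."""
    lines = []
    if name:
        lines.append(f"#+name: {name}")
    if caption:
        lines.append(f"#+caption: {caption}")
    header, *body = rows
    lines.append("| " + " | ".join(str(cell) for cell in header) + " |")
    # Horizontal rule separating the header from the body.
    lines.append("|" + "+".join("-" * (len(str(cell)) + 2) for cell in header) + "|")
    for row in body:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


text = org_table([["x", "y"], [1, 2], [3, 4]],
                 name="org-data",
                 caption="The information about the table")
print(text)
```

The `#+name:` line is what makes a reference such as ref:org-data resolvable.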

import matplotlib.pyplot as plt

f = './test.png'
plt.plot([1, 4, 17])
plt.savefig(f)
plt.close() # you need this to not see two figures.
Figure(f, name='org-fig', caption='a line plot',
       attributes=[('org', ':width 300'),
                   ('latex', ':placement [H]')])

ref:org-fig

import matplotlib.pyplot as plt
import numpy as np

t = np.linspace(0, 20 * np.pi, 350)
x = np.exp(-0.1 * t) * np.sin(t)
y = np.exp(-0.1 * t) * np.cos(t)

plt.plot(x, y)
plt.axis('equal')
plt.savefig('fig-1.png')
plt.close()

plt.figure()
plt.plot(y, x)
plt.axis('equal')
plt.savefig('fig-2.png')
plt.close()

print('Length of t = {}'.format(len(t)))
print('x .dot. y = {}'.format(x @ y))

from pycse.orgmode import Figure, Org

display(Org("\n\n"),
        Figure('./fig-1.png', name='clock',
               caption='a clockwise line plot'),
        Org("\n\n"),
        Figure('./fig-2.png', name='counterclock',
               caption='a counter-clockwise line plot'))

./fig-1.png

./fig-2.png

import pandas as pd

Table(pd.DataFrame([['a', 'b'],
                    [1, 2],
                    [5, 6]]),
      headers='firstrow',
      name='pd-data',
      caption='A table from a dataframe')
| 0 | a | b |
|---+---+---|
| 1 | 1 | 2 |
| 2 | 5 | 6 |

There is also a keyword.

Keyword('name', 'fig-1')    

and a comment.

Heading('An example of a heading from code', 3)
Comment('A comment for orgmode')

Exceptions

Exceptions go in the results. Type f12 e to jump to the exception in the src block.

print(5)


a = 5




for j in range(5):
    1 / 0



print(54)

print(z)

ZeroDivisionError: division by zero

Select rich outputs with :display

The priority for display is:

  • text/org
  • image/svg+xml, image/jpeg, image/png
  • text/html
  • text/markdown
  • text/latex
  • text/plain
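The priority order above amounts to a first-match lookup over the representations a result offers. emacs-jupyter does its selection in Emacs Lisp; the Python sketch below only illustrates the logic, and `PRIORITY` and `pick_mimetype` are hypothetical names.

```python
PRIORITY = [
    "text/org",
    "image/svg+xml",
    "image/jpeg",
    "image/png",
    "text/html",
    "text/markdown",
    "text/latex",
    "text/plain",
]


def pick_mimetype(bundle, priority=PRIORITY):
    """Return the first mimetype in priority order that the bundle offers."""
    for mimetype in priority:
        if mimetype in bundle:
            return mimetype
    return None


# A result that offers LaTeX, a png and plain text is shown as the png.
chosen = pick_mimetype({"text/plain": "x", "text/latex": "$x$", "image/png": b""})
```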

LaTeX is automatically rendered to a png

from sympy import *
init_printing()
x, y, z = symbols('x y z')

display(Integral(sqrt(1 / x), x))

To get the actual LaTeX, use the :display header argument.

from sympy import *
init_printing()
x, y, z = symbols('x y z')

display(Integral(sqrt(1 / x), x))

and to get it in plain text:

from sympy import *
init_printing()
x, y, z = symbols('x y z')

display(Integral(sqrt(1 / x), x))

Rich displays mostly work

These get converted to org syntax by pandoc, I think. Note that emacs-jupyter and/or pandoc seem to put some \ in the converted results. I use the function scimax-rm-backslashes in a hook to remove these.

from IPython.display import FileLink, Image, display

display(FileLink('scimax.png'))
display(Image('test.png'))
display(FileLink('scimax.png'), Image('test.png'))

Not every type is easily converted to org-mode; pandoc doesn’t know everything.

from IPython.display import Audio

audio = Audio(filename='/Users/jkitchin/Dropbox/emacs/scimax/2021-06-04-19-48-38.mp3')

display(audio)

We can “orgify” these like this.

from pycse.orgmode import *

ip = get_ipython()

orgf = ip.display_formatter.formatters['text/org']
orgf.for_type_by_name('IPython.lib.display', 'Audio', lambda O: f'[[{O.filename}]]')


audio = Audio(filename='./2021-06-04-19-48-38.mp3')
audio
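For classes you write yourself, an alternative to registering a formatter is to have the object describe its own representations. IPython looks for a `_repr_mimebundle_` method on displayed objects. Whether the front end then picks up the `text/org` entry is an assumption based on the priority list in this document. `OrgFileLink` is a hypothetical class.

```python
class OrgFileLink:
    """Hypothetical object that offers itself as an org link or plain text."""

    def __init__(self, filename):
        self.filename = filename

    def _repr_mimebundle_(self, include=None, exclude=None):
        # IPython calls this method, when present, to collect representations.
        return {
            "text/org": f"[[{self.filename}]]",
            "text/plain": self.filename,
        }


link = OrgFileLink("./2021-06-04-19-48-38.mp3")
bundle = link._repr_mimebundle_()
```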

Some of these are already orgified, e.g. YouTubeVideo.

from IPython.display import YouTubeVideo

YouTubeVideo('ZXSaLcFSOsU')

scratch space and the REPL

The buffer is a great scratch space, but there is also a separate Jupyter scratch buffer. Use it to try out ideas, check values, etc.


Each kernel has a REPL associated with it. Type C-c C-v C-z or f12-z to get to it. It is like an IPython shell! You can explore things there, make plots, etc…

REPL like interaction mode in src blocks

print(3) 
3 + 4  # highlight region, C-M-x to run it.

a = 5  # Run C-x C-e here
5 + a  # Then, M-i here to inspect a

debugging with the REPL

Put a breakpoint in a function. Define it, then go to the REPL (f12 z) to step through it.

def f(x):
    breakpoint()
    return 1 / x

Learn more about PDB at https://realpython.com/python-debugging-pdb/#getting-started-printing-a-variables-value.
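Once you are done stepping through a function, you may want to run it again without deleting the breakpoint() call. This is a minimal sketch, assuming Python 3.7 or later, where breakpoint() calls `sys.breakpointhook`. `skip_breakpoints` is a hypothetical helper name.

```python
import sys


def f(x):
    breakpoint()  # calls sys.breakpointhook()
    return 1 / x


def skip_breakpoints(*args, **kwargs):
    """Do nothing, so breakpoint() returns immediately."""
    return None


previous_hook = sys.breakpointhook
sys.breakpointhook = skip_breakpoints  # disable stops for now
result = f(4)
sys.breakpointhook = previous_hook     # restore the debugger hook
```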

Export to ipynb

See ox-ipynb. This org-file is not ideal for this export: it has some links that are not supported, and I marked the Known issues section as noexport because it has src-blocks with variables in it.

#+ox-ipynb-language: jupyter-python

(setq  org-export-with-broken-links t)
(ox-ipynb-export-to-ipynb-file-and-open)

Other languages

Julia seems to work

./scimax-jupyter-julia.org

R

./scimax-jupyter-r.org

Known issues

display order is not always respected

See emacs-jupyter/jupyter#351

When using pandoc, it takes time to convert the display, and this often messes up the display order. scimax overrides this behavior to try to avoid this. The root of the issue seems to be a process filter that handles data in the order it is received, though, so I cannot guarantee the order will always be correct. For now, what we do works here.

from IPython.display import HTML, Markdown, Latex

print(1)
display(HTML('<b>bold</b>'),
        Latex('\\bf{lbold}'),
        Markdown('**mbold**'))
print(2)

This works now for making Figures.

from IPython.display import Image, Markdown, HTML

print(1)
display(HTML('''#+attr_org: :width 400<br>
#+name: fig-one<br>
#+caption: <b>bold</b> text.'''),
  Image('test.png'))

using jupyter-python blocks as input to other blocks was broken in emacs-jupyter and is sort of better in scimax

It works in scimax, sort of. Raw strings get passed around, which isn’t great. One day I will figure out the issue with that. It seems to be a feature of emacs-jupyter though (https://github.com/nnicandro/emacs-jupyter#standard-output-displayed-data-and-code-block-results). It has something to do with org-babel-insert-result.

a = 9 + 9
a
(/ d 2)

example with a table

import pandas as pd
data = [[1, 2], [34, 4]]
pd.DataFrame(data, columns=["Foo", "Bar"])
|   | Foo | Bar |
|---+-----+-----|
| 0 |   1 |   2 |
| 1 |  34 |   4 |
(with-temp-buffer (insert dd) (org-babel-read-table))
|   | Foo | Bar |
|---+-----+-----|
| 0 |   1 |   2 |
| 1 |  34 |   4 |

see nb:scimax::elpa/org-9.5/ob-emacs-lisp.el::c2254 I think it has something to do with this.

'(("" Foo Bar) hline (0 1 2) (1 3 4))
|   | Foo | Bar |
|---+-----+-----|
| 0 |   1 |   2 |
| 1 |   3 |   4 |
d
return [[1, 2, 3], [3, 4, 6]]
d

widgets do not seem to work

In theory, emacs-jupyter supports widgets if you build them in the emacs-jupyter src directory. I did that and did not see any obvious issues, but widgets still do not work. I am not likely to spend time fixing this anytime soon.

(let ((default-directory (file-name-directory (locate-library "jupyter"))))
  (shell-command-to-string "make widgets"))

This at least outputs something, but I think it should open a browser.

import ipywidgets as widgets

w = widgets.VBox([widgets.Text('#+attr_org: :width 300'),
                  widgets.Text('#+name: fig-data'),
                  widgets.Text('#+caption: something here.')])
display(w)

This code does not run correctly. I am not sure why. I don’t think it is related to my changes. See emacs-jupyter/jupyter#333; I am not sure widgets still work.

This just hangs, and does not do anything.

widgets.Image(value=open("test.png", "rb").read(),  width=400)

Wishlist

handle long outputs

Sometimes you get long outputs from things, and especially when it is something that needs fontification, this makes Emacs hard to use. I would like to have a way to truncate long outputs, and maybe write them to a file where you could look at them.
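Until something like that exists on the Emacs side, one workaround is to truncate on the Python side before printing. This is a sketch under that assumption, not a scimax feature; `truncate_output` is a hypothetical helper and the 20-line default is arbitrary.

```python
import tempfile


def truncate_output(text, max_lines=20):
    """Return (shown_text, path); path is None when nothing was cut."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text, None
    # Keep the full text on disk so it can still be inspected.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(text)
        path = handle.name
    shown = "\n".join(lines[:max_lines])
    shown += f"\n... {len(lines) - max_lines} more lines in {path}"
    return shown, path


long_text = "\n".join(str(i) for i in range(100))
shown, path = truncate_output(long_text, max_lines=5)
print(shown)
```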

Jump to definition of variable or function

It would be awesome to do this. Probably this could build on ./scimax-literate-programming.el and ./scimax-ob-flycheck.el.

inspect variables in function calls

This does not always work when variables are inside a call. I usually see help for the function then.

a = 5
print(a + 5)  # inspect a here, I usually see print documentation