
Interactive GUIs for containerized applications... without HTML

Check out the code to support this post

Let’s talk about graphical user interfaces for server-side software. Say you’ve developed a backend service in Python. Maybe this is software for training neural networks. In my case, it’s the server portion of IRL, our game engine for physical location-based games.

You decide you need a GUI to interact with your running service. You’ve got some choices:

  • Write it in HTML5/JavaScript: Even if you like JavaScript, you’re not going to enjoy switching languages every few seconds. Not to mention that your GUI and your server don’t share state! You’ll need to synchronize the two with a fast bi-directional data flow between server and browser. This is starting to sound like a project.
  • Write a command-line interface. You probably have one of these already. But there are some limitations to what you can accomplish with a CLI. Sure, ncurses will get you something “graphical-ish,” but is a slider you can move with your mouse too much to ask for? You still have the problem of no shared state.
  • Use a GUI toolkit like tkinter (a venerable Python built-in) or the more beautiful (but un-free) PyQt. If your GUI is just a thread (or greenlet) running in your server, then you’ve solved the problem of shared state.
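To make the shared-state point concrete, here is a minimal sketch of the third option. The names ServerState, run_gui and start_gui are hypothetical and are not taken from IRL. The idea is only that the GUI thread reads the very same Python object the server mutates, so nothing has to be serialized. It assumes all Tk calls stay inside the GUI thread and that a reachable $DISPLAY exists when start_gui is called.

```python
import threading


class ServerState:
    """Hypothetical stand-in for whatever your service keeps in memory."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.connected_players = 0

    def player_joined(self) -> None:
        # Called by the server's own code paths.
        with self._lock:
            self.connected_players += 1

    def snapshot(self) -> str:
        # Called by the GUI; same object, no serialization step.
        with self._lock:
            return f"players: {self.connected_players}"


def run_gui(state: ServerState, refresh_ms: int = 250) -> None:
    """Build a window and poll the shared state. Needs a reachable $DISPLAY."""
    import tkinter as tk  # lazy import so the server can start without a display

    root = tk.Tk()
    label = tk.Label(root, text=state.snapshot())
    label.pack(padx=20, pady=20)

    def refresh() -> None:
        label.config(text=state.snapshot())
        root.after(refresh_ms, refresh)  # reschedule on Tk's own event loop

    refresh()
    root.mainloop()


def start_gui(state: ServerState) -> threading.Thread:
    """Run the GUI in a daemon thread so it never blocks the server."""
    thread = threading.Thread(target=run_gui, args=(state,), daemon=True)
    thread.start()
    return thread
```

The server would create one ServerState, call start_gui(state) once at startup, and then keep calling methods like player_joined() as usual.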

Though we desperately needed one, it took me a long time to add a real GUI to IRL’s backend. The shortcomings of the HTML5/JS and CLI options weren’t worth the effort. tkinter and the like expect an attached monitor and keyboard. Our backend runs in a Docker container, always in a data-center, sometimes in cloud-based hosting. Not a monitor or keyboard for miles.

But all is not lost! Containerized server software runs in a Linux environment. Most graphics software in Unix (tk included) is logically separated into a backend and frontend, with the backend communicating with the X Windows server (X11) using the “X Windows Protocol.” Crucially, the backend and frontend need not even be on the same computer! The trick of “setting your $DISPLAY variable” has long allowed brave Linux-on-the-Desktop nerds to ssh into a remote computer and launch a graphical application that appears on their local desktop.

My first tentative steps were to use X11.app on my Mac. A throwback to OSX’s Unix roots, this software allows a Mac to behave like an X Windows server, talk the X Windows protocol, and host graphical interfaces.

It worked! My laptop needed an IP reachable from the server’s container. I put its IP in the backend container’s $DISPLAY environment variable. Bam! Interactive tkinter window on my laptop, produced by a Python program running inside a remote Docker container.
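For reference, a $DISPLAY value has the shape host:display.screen, and by convention an X server for display N listens on TCP port 6000 + N. A small sketch of setting it from Python before any Tk window is created follows. The helper names and the example address 192.0.2.10 are placeholders of mine, and the X server on the other end still has to be configured to accept connections from the container.

```python
import os

X11_BASE_PORT = 6000  # by convention, display N listens on TCP port 6000 + N


def display_value(host: str, display: int = 0, screen: int = 0) -> str:
    """Format a $DISPLAY string such as '192.0.2.10:0.0'."""
    return f"{host}:{display}.{screen}"


def x11_tcp_port(display: int = 0) -> int:
    """TCP port the X server for this display is expected to listen on."""
    return X11_BASE_PORT + display


def point_gui_at(host: str, display: int = 0) -> str:
    """Set $DISPLAY for this process; call before creating any Tk window."""
    value = display_value(host, display)
    os.environ["DISPLAY"] = value
    return value
```

In practice you would more likely set DISPLAY in the container's environment than in code. The function form just makes the format explicit.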

Alas, it was only half a victory. The connection between the backend and X11 on my Mac was established when the server started up, but once it was closed it could never be re-opened. Also, X11.app is… functional but weird. The Windows equivalent is no better.

I needed a more modern way for authorized users to connect to and disconnect from the GUI at any time.

Enter x11vnc. This ingenious piece of code is half X11 server, half VNC server. With the right combination of command-line switches, it can create a virtual X11 screen, and re-share that virtual screen over VNC. VNC is the original cross-platform remote desktop protocol. Your Mac has a VNC client built in, and there are plenty of good options for Windows as well.

What’s even better is that x11vnc can be run in a Docker container of its own, adjacent to the backend’s container. (I use docker-compose for stuff like this.) The backend’s $DISPLAY variable must be set to the internal Docker hostname of the x11vnc container. When the backend starts, it establishes its X Windows Protocol connection to the virtual screen. A VNC client (your laptop) can connect to and disconnect from the VNC server at any time on the exposed VNC port.
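Because the backend connects to the virtual screen at startup, container start order matters. One way to handle that, sketched here under assumptions, is to wait until the X port answers before building any windows. The function name wait_for_display and the hostname x11vnc are hypothetical, and the sketch assumes the virtual X server accepts TCP connections on the conventional port 6000 + display number.

```python
import socket
import time


def wait_for_display(host, display=0, attempts=30, delay=1.0,
                     connect=socket.create_connection):
    """Return True once the X server's TCP port accepts a connection.

    `connect` is injectable so the retry logic can be exercised without
    a real X server; by default it is socket.create_connection.
    """
    port = 6000 + display  # conventional X11 TCP port for this display
    for _ in range(attempts):
        try:
            with connect((host, port), timeout=2.0):
                return True  # something is listening; safe to start the GUI
        except OSError:
            time.sleep(delay)  # not up yet; try again shortly
    return False
```

A backend might call wait_for_display("x11vnc") and only start its GUI thread if that returns True, carrying on headless otherwise.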

The virtual screen published by x11vnc isn’t much to look at: a full-screen GUI window with no bells or whistles, not even a title bar.

I still found this solution to be somewhat unreliable. Peculiar VNC protocol mismatches between my client and x11vnc meant the connection sometimes couldn’t be established.

Today’s browsers are powerful indeed, but I was still surprised and delighted to discover noVNC. noVNC is a client/server tool that makes it possible to put a VNC client in a web browser.

The noVNC server is written in Node.js. For me, noVNC runs in another Docker container and functions as a VNC client to x11vnc’s VNC server. It proxies the VNC protocol over websockets to the noVNC HTML5/JS user-facing app, which runs in a browser.

I hide noVNC behind an nginx ingress reverse proxy so I can do HTTP Basic Authentication. The nginx server is the only port that’s exposed to the Internet. Everything else is on the internal Docker network.

This is my final solution for how I added an always-available interactive GUI to my remote containerized Python backend:

Python application & tkinter
<-- X Windows protocol -->
x11vnc server
<-- VNC protocol -->
noVNC server
<-- nginx proxy -->
<-- HTTP/websocket -->
Browser (noVNC client)

There are quite a few layers there, but I find the interface to be snappy. The beauty of containerization is that it’s easy to package up complex systems. Check out my GitHub repo for a sample implementation of this that you can incorporate into your own stack.

Having a GUI that lives in the same process as my server has been more useful than I ever imagined. Because I don’t have to worry about serializing application state, I don’t think twice about exposing state for visualization or tweaking.
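As an illustration of how little ceremony this takes, here is a sketch of a mouse-draggable slider wired straight to a live parameter. Tunables, spawn_rate and build_controls are hypothetical names, not anything from IRL. The slider's callback writes directly to the object the server reads.

```python
class Tunables:
    """Hypothetical live parameters that the server reads as it runs."""

    def __init__(self) -> None:
        self.spawn_rate = 1.0

    def set_spawn_rate(self, value) -> None:
        # Tk's Scale hands its command callback the new value as a string.
        self.spawn_rate = float(value)


def build_controls(root, tunables: Tunables):
    """Add a slider to an existing Tk root window. Needs a reachable $DISPLAY."""
    import tkinter as tk  # lazy import so this module loads without a display

    scale = tk.Scale(
        root,
        from_=0.0,
        to=10.0,
        resolution=0.1,
        orient=tk.HORIZONTAL,
        label="spawn rate",
        command=tunables.set_spawn_rate,  # fires whenever the slider moves
    )
    scale.set(tunables.spawn_rate)  # start the slider at the current value
    scale.pack(fill=tk.X, padx=20, pady=20)
    return scale
```

The server code keeps reading tunables.spawn_rate wherever it already did. Dragging the slider changes the value it sees on its next read.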

End Notes

If you’ve never worked with tkinter before then you’re in for a shock. It’s got a gritty 90s UNIX feel to it that fills me with nostalgia. Python isn’t known for its GUI tools; you should feel lucky to have anything at all. You might want to check out PyQt if its licensing isn’t a turn-off.

I really wanted to use pyimgui because I’ve used imgui extensively in C++ and it’s the bomb. Unfortunately, it relies on OpenGL (or SDL, etc.) for rendering. That kind of hardware-accelerated rendering can’t be forwarded over the X Windows protocol. I tried. It ended in tears.

If you disagree with the premise that “you might want a GUI” and believe there’s nothing a good command-line interface won’t solve, then this article isn’t for you. But then, it probably doesn’t render in lynx anyway.

The process of getting the X11 server up and running properly is based on the excellent work here