"If treating LXD containers like a Cloud is hard, maybe I shouldn't be thinking of it like one? Maybe SSH & cloud-init are the wrong answer?"
Hmmm. Well, time to rethink this whole thing.
Back to the Documentation
Clearly, the lxc command line tool can execute commands inside containers; I knew this. Running "lxc exec <container> bash" is the standard way to get a shell inside a container. What I hadn't done is link that idea to the REST API; lo and behold, a check of the documentation shows me that the API provides a method to execute a command in a container. Even more obviously (in hindsight) the Hyperkit Gem documentation shows an execute_command method, as well as methods to download & upload files. "Wait a minute, those are precisely the three things a Transport plugin does!" and finally the lightbulb switches on.
Foresight & Hindsight are well balanced
When I originally designed & built Cyclid I was very careful to layer everything:
- Builders deal with creating & destroying build hosts
- Transports deal with executing commands in build hosts
- Provisioners deal with setup & configuration of build hosts
This means that everything is decoupled, and nothing makes any assumptions about how the build host has been created, how it's been configured or how to execute commands: that's all handled by the different plugins, and (in theory) is completely opaque to the other layers.
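To make the layering concrete, here is a minimal sketch of what "a Transport does three things" could look like in Ruby. The class and method names (SketchTransport, InMemoryTransport, run_step) are hypothetical and are not Cyclid's actual plugin API; the in-memory transport only stands in for a real one such as SSH or LXD.

```ruby
require 'stringio'

# Hypothetical base class (not Cyclid's real API): the three operations
# a transport provides, whatever sits underneath it.
class SketchTransport
  def exec(_command)
    raise NotImplementedError
  end

  def upload(_io, _path)
    raise NotImplementedError
  end

  def download(_io, _path)
    raise NotImplementedError
  end
end

# A toy in-memory transport, standing in for SSH, WinRM or LXD.
class InMemoryTransport < SketchTransport
  attr_reader :commands

  def initialize
    @files = {}
    @commands = []
  end

  # Record the command and pretend it succeeded.
  def exec(command)
    @commands << command
    true
  end

  # Copy the contents of an IO object to a "remote" path.
  def upload(io, path)
    @files[path] = io.read
  end

  # Copy a "remote" file into an IO object.
  def download(io, path)
    io.write(@files.fetch(path))
  end
end

# The code that runs a job only ever talks to the transport interface,
# so it neither knows nor cares how the build host is reached.
def run_step(transport, script)
  transport.upload(StringIO.new(script), '/tmp/step.sh')
  transport.exec('sh /tmp/step.sh')
end
```

Swapping SSH for the LXD REST API then only means providing another class with the same three methods; nothing that calls run_step has to change.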
I originally designed things this way with something like a WinRM Transport plugin, or even an "agent" type transport (similar to how Jenkins works), in mind. It turns out that decision was pure 20:20 foresight. All I need to do is write an LXD Transport plugin that uses the LXD REST API to execute commands on LXD build hosts, instead of SSH!
Gotchas
You didn't think it was that easy, did you?
Okay, it wasn't that bad, but there were some gotchas I had to contend with. The biggest issue was that actually using the REST API to execute a command is not particularly well documented. In fact, the WebSockets that are created as part of the process are, as far as I can tell, entirely undocumented. I had to fall back to reading the command line client source code (which is written in Go: I don't actually know Go) to piece it together. If you're wondering:
- There are two options that are relevant, "wait-for-websocket" and "interactive". These are both documented, but the documentation may not make it immediately obvious what the implications are.
- Depending on the "interactive" flag, you either get one WebSocket which combines STDIN, STDOUT & STDERR, or three separate WebSockets (one each).
- There is always a "control" WebSocket. The actual function of the control WebSocket is undocumented.
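The points above can be sketched roughly as follows. This only builds the JSON body for an exec request and lists which WebSockets to expect; it does not talk to LXD. The "wait-for-websocket" and "interactive" names come from the text above, while the "command" and "environment" keys and the socket names "0", "1", "2" and "control" are my assumptions about the API and should be checked against the LXD documentation. The helper names are hypothetical.

```ruby
require 'json'

# Build the JSON body for an exec request.
# "command" and "environment" are assumed key names; the two flags are
# the options discussed in the list above.
def exec_request_body(command, interactive: false)
  JSON.generate(
    'command' => command,
    'environment' => {},
    'wait-for-websocket' => true,
    'interactive' => interactive
  )
end

# Names of the WebSockets to expect for a given "interactive" setting
# (assumed naming): one combined socket or three separate ones for
# STDIN, STDOUT & STDERR, plus the "control" socket in both cases.
def expected_sockets(interactive)
  data = interactive ? ['0'] : %w[0 1 2]
  data + ['control']
end
```

For a build job, which has no terminal attached, the non-interactive form with separate STDOUT & STDERR sockets is probably the more natural fit, for example `exec_request_body(['sh', '-c', 'make test'])`.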
It turns out the "control" WebSocket is a red herring. From reading the source, it is used to send two different types of message: "signal" and "window-size". The messages are simple JSON objects with the message type & associated parameters. We don't need to deal with signals, and certainly don't need to set the terminal size, so we can ignore the control WebSocket.
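For the curious, a control message of the kind described above might be built like this. It is a sketch only: the "command" key used to carry the message type, and the parameter names in the usage line below, are assumptions rather than something taken from the LXD documentation, and control_message is a hypothetical helper.

```ruby
require 'json'

# Build a control message: a JSON object holding the message type and
# its associated parameters. The "command" key is an assumed name.
def control_message(type, params = {})
  JSON.generate({ 'command' => type }.merge(params))
end
```

A signal message would then be something like `control_message('signal', 'signal' => 15)`, although as noted the Transport never needs to send one.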
It turned out that the Hyperkit Gem didn't actually expose the "wait-for-websocket" or "interactive" flags, so I had to submit patches to enable that functionality. While I was there I also extended the "put file" and "get file" functionality to better fit how Cyclid passes IO objects around, rather than reading and writing files on-disk.
Last but not least, I had to learn enough about WebSockets to be dangerous. In fact I'm not entirely sure if I've got it right still; it does seem to work, but some of the semantics seem odd. I have no idea if this is by design of WebSockets, by design of LXD, or an artifact of the WebSockets client library I'm using: if you know more about WebSockets than I do (it may not be hard) please let me know so that I can pick your brains!
It lives!
Given those relatively minor caveats, the good news is that it all works: the Cyclid LXD plugin can create build instances in a machine running LXD, connect to those instances and run jobs. This is a really significant milestone; with Cyclid & LXD you can now run an entirely Open Source Continuous Integration system, on-premise, with a price tag of 0.
And if that isn't exciting enough for you then I hope the Grinch steals your Christmas!