Interactive jobs: connecting to DNAnexus with SSH

To make app development on the DNAnexus platform easier and more interactive, we are excited to announce a new feature today: SSH connections to compute jobs. Bioinformaticians and Linux developers will be familiar with the ssh command, which is used to connect to remote computers over the network. Jobs running on the DNAnexus platform can now optionally be configured to allow SSH connections to their execution environment. This makes it easier to monitor DNAnexus jobs, debug them if something goes wrong, or use DNAnexus workers as powerful interactive workstations in the cloud.

By default, DNAnexus jobs have always been firewalled from the Internet, and only have network access to the DNAnexus API. This default will remain, but now three new command-line options are available when launching your job:

  • Running dx run <executable> --allow-ssh will configure your job to open the SSH port for network connections from IP ranges that you specify.
  • Running dx run <executable> --ssh will do the same as above, but also immediately connect to the job as soon as it starts running.
  • Running dx run <executable> --debug-on <error-type> will configure your job’s execution environment to set a breakpoint, so that if the job encounters an error, you can connect to it over SSH and examine what went wrong.
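
As a rough illustration of the three options above, the sketch below assembles each launch command for a given executable and prints it rather than running it, so it can be read without a DNAnexus login. The executable name app-my_tool and the helper dx_run_cmd are hypothetical, and the default error type All is an assumption; check dx run --help for the error types your version of the dx toolkit accepts.

```shell
#!/usr/bin/env bash
# Sketch only: print the dx command line for each SSH-related launch mode.
# "app-my_tool" is a hypothetical executable name, not a real app.

dx_run_cmd() {
  executable="$1"
  mode="$2"
  case "$mode" in
    # Open the SSH port on the job, but do not connect yet.
    allow)   echo "dx run $executable --allow-ssh" ;;
    # Open the SSH port and connect as soon as the job starts running.
    connect) echo "dx run $executable --ssh" ;;
    # Hold the job for inspection if it hits the given error type
    # (assumed default: All).
    debug)   echo "dx run $executable --debug-on ${3:-All}" ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

dx_run_cmd app-my_tool allow
dx_run_cmd app-my_tool connect
dx_run_cmd app-my_tool debug AppError
```

To launch a job for real, copy one of the printed lines into a terminal where you are logged in to the platform.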

As before, outbound access by jobs can be configured using network access permissions.  Inbound access is restricted to SSH connectivity only, and must be enabled explicitly by the user at run time using the options above.

A one-time setup of your user account is required before you can use SSH connections. Run dx ssh_config to perform this setup. It generates a new SSH key pair, which you can protect with a passphrase, and configures your account with the public key. Only you and the job you’re connecting to can see the public key; the private key remains on the computer where you ran dx ssh_config.
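
A minimal sketch of the setup-then-connect sequence might look like the following. It assumes that the dx toolkit's dx ssh command is used to connect to a job that was launched with SSH allowed. The job ID job-xxxx is a placeholder, and the maybe_run helper is hypothetical: it only prints each command unless RUN=1 is set in the environment.

```shell
#!/usr/bin/env bash
# Sketch only: one-time key setup followed by a connection to a running job.
# "job-xxxx" is a placeholder; substitute the ID printed by dx run.

# Print the command by default; execute it only when RUN=1 is set.
maybe_run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "+ $*"
  fi
}

JOB_ID="${JOB_ID:-job-xxxx}"

# Generate the key pair and register the public key (needed once per account).
maybe_run dx ssh_config

# Connect to the job's execution environment (assumes SSH was allowed at launch).
maybe_run dx ssh "$JOB_ID"
```

Running the script as-is prints the two commands; running it with RUN=1 and a real JOB_ID executes them.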

When you log in, the system will automatically start the byobu window manager running the tmux terminal multiplexer, so that you can use multiple terminals to monitor the job and do other tasks, and can resume where you left off if you get disconnected. Further information on the state of the job and on how to use the terminal is presented in a banner when you log in.

We have been using this feature for the past few weeks, and it has already proven to be a great tool for debugging and understanding the performance of our genomics tools in the cloud. Detailed documentation is available on the DNAnexus wiki. We have also provided a tutorial on deploying an interactive “Cloud Workstation” to explore and manipulate data stored on DNAnexus as you would on a local Linux machine.

If you have any questions or comments, don’t hesitate to ask them on DNAnexus Answers!