HDInsight and WebSSH Security Issue

HDInsight, Microsoft Azure

Background

This post relates to an unpublished ‘feature’ of Microsoft Azure HDInsight Linux clusters. It is misconfigured such that it allows users to obtain root access to a cluster via a web console, without knowing the ‘admin’ account name or password.

I originally raised this with Microsoft Support around the end of October / beginning of November 2016. Initially, support informed me that they had discussed it with the product team and that the security issue that I was reporting was not a security issue because:

  • The security boundary of HDInsight is the Virtual Network (VNET) and
  • The clusters are only intended for single user tenancy (ironically a MSFT Cloud Data Solution Architect recently said to me that HDInsight fully supports multiple users – which I guess is sort of true now with secure clusters being in preview).

Eventually they agreed that it was indeed an issue and disabled the feature on all new clusters as an interim measure.

 

What is the issue?

An Azure HDInsight Linux cluster consists of head, worker and zookeeper nodes. These nodes are Azure VMs; although the VMs are not visible in the Azure Portal and cannot be managed individually there, you can SSH to the cluster nodes.

When you provision a cluster you are prompted to set two credentials:

  • One that will be used for the Ambari web interface, which you can log in to over HTTPS at the <cluster name>.azurehdinsight.net domain.
  • The other for a local account that will be created on ALL nodes in the cluster, which you can then use to SSH to the cluster: ssh <user>@<cluster name>-ssh.azurehdinsight.net

The SSH account by default has passwordless sudo – that is you can run sudo su and become root without being prompted for your password.

One of the packages installed when you provision an HDInsight cluster is hdinsight-webssh. Running apt-cache show hdinsight-webssh shows that it is a Microsoft package (there are other Microsoft HDInsight packages, all prefixed with hdinsight-).

Running netstat you can see that there is a nodejs based web terminal running and listening on TCPv6 port 3000.
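One way to confirm a listener like this on a node is sketched below. It assumes the net-tools netstat command is installed; the function name listeners_on_port is made up for illustration and is not part of any HDInsight tooling.

```shell
# Sketch: list TCP listeners bound to a given port (3000 is the port named in this post).
# Assumes the net-tools "netstat" command is available on the node.
listeners_on_port() {
    port="${1:-3000}"
    # -t: TCP sockets, -l: listening only, -n: numeric addresses,
    # -p: owning process (only shown for all sockets when run as root)
    netstat -tlnp 2>/dev/null |
        awk -v suffix=":${port}" '
            # Column 4 is the local address, e.g. ":::3000" for an IPv6 wildcard bind.
            # Keep the line only if that address ends with ":<port>".
            $1 ~ /^tcp/ && substr($4, length($4) - length(suffix) + 1) == suffix
        '
}
```

Running `listeners_on_port 3000` as root on a node prints any matching socket together with the owning process. Empty output means nothing is listening on that port.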

Listing the running processes shows the web terminal process (which incidentally also runs as root!).

The configuration for the service/application is here:

/etc/websshd/conf.json

It looks like a number of Python scripts are run when you provision a cluster to start Ambari, configure Hive etc.; one of them, /opt/startup_scripts/startup_webssh.py, starts this websshd service.

Impact of the issue

The issue cannot be easily exploited by an external attacker, i.e. one that does not already have access to infrastructure in the Azure Virtual Network (VNET) that the HDInsight cluster resides in. Such an attacker would first need to gain access (it doesn’t need to be a privileged account) to any other system hosted in the same VNET. From that point they can easily gain root access on the HDInsight cluster by simply browsing to http://<clusternodeipaddress>:3000, which automatically gives them a web based shell as the user that has passwordless sudo, without entering any username or password.

However, the default NSG rules allow connectivity within a VNET (as opposed to a default deny that requires all traffic to be explicitly allowed), which makes it easier for an attacker to extend their reach.

Alternatively, an external attacker would need to find a vulnerability in the proxy servers and/or the various web interfaces that are accessible via the proxies.
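If you administer a cluster, a quick way to see whether this matters in your own VNET is to check from another VM in the same VNET whether the nodes answer plain HTTP on port 3000. The sketch below assumes curl is installed; the function names and the 5 second timeout are illustrative choices, not part of any Azure tooling.

```shell
# Sketch: report whether a node answers plain HTTP on the web terminal port.
# The port (3000) comes from this post; the 5 second timeout is an arbitrary choice.
webssh_reachable() {
    host="$1"
    port="${2:-3000}"
    # --silent: no progress meter, --output /dev/null: discard the response body,
    # --max-time 5: give up after 5 seconds.
    # curl exits with status 0 only if it received an HTTP response.
    curl --silent --output /dev/null --max-time 5 "http://${host}:${port}/"
}

check_nodes() {
    # Reads one node address per line on stdin and prints a status line for each.
    while read -r node; do
        [ -n "$node" ] || continue
        if webssh_reachable "$node"; then
            echo "$node: port 3000 answers HTTP (investigate)"
        else
            echo "$node: no HTTP response on port 3000"
        fi
    done
}
```

For example, `printf '10.0.0.5\n10.0.0.6\n' | check_nodes` checks two hypothetical node addresses. A response only shows that something is serving HTTP on that port, so it is a prompt to look at the node rather than proof that the web terminal is running.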

In the case of a malicious user who has authorised access to say an application or web server, they would be able to take advantage of the misconfiguration to obtain root access to the HDInsight cluster as described above.

In either case an external attacker or malicious user can then use the root access to exfiltrate data, plant malicious software etc.

Summary

Microsoft have since disabled the service, although when I last checked in December 2016 the package was still installed; the service was not running, nor was there a systemd unit file installed.

Microsoft didn’t explain why the package is installed in the first place but I can only assume it was added as a convenience when the product team were developing or testing.

Browser based terminals are problematic when it comes to security, but it’s worse when the endpoint:

  1. Is unencrypted
  2. Performs no authentication
  3. Drops you in as a user that has passwordless sudo

As an added measure you can disable passwordless sudo for the admin account – which probably shouldn’t be enabled anyway.
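To find where passwordless sudo is granted, you can search the sudoers files for NOPASSWD entries. This is a minimal sketch assuming the standard /etc/sudoers and /etc/sudoers.d locations; the function name nopasswd_rules is made up for illustration, and where HDInsight actually places its rule is not something this post establishes.

```shell
# Sketch: list sudoers lines that grant NOPASSWD so they can be reviewed.
# With no arguments it searches the standard sudoers locations (needs root to read them).
nopasswd_rules() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/sudoers /etc/sudoers.d
    fi
    # -r: recurse into directories, -H: prefix each match with its file name,
    # -E: extended regular expressions.
    # The pattern only matches NOPASSWD when no "#" precedes it, so comments are skipped.
    grep -rHE '^[^#]*NOPASSWD' "$@" 2>/dev/null
}
```

Each line of output is prefixed with the file that contains the rule. Edit that file with `visudo -f <file>` rather than a plain editor, so that a syntax error does not lock you out of sudo.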

KVM Automation

KVM

Introduction

This blog post describes one option for automating the build of a KVM guest.

There are alternative ways to automate the build, but the method described here uses the ability to pass a kickstart file to the virt-install command when creating a new VM. A kickstart file contains the answers to all the normal questions that an interactive installer would ask during installation. The kickstart script installs software packages, configures SELinux, auditd, rsyslog etc.

Using virt-install with a kickstart file

The virt-install command is used to create new virtual machines / guests; it supports a parameter that allows you to specify the path to an Anaconda Kickstart file.

An example virt-install command is provided below:
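The command itself is not reproduced here. As a rough sketch, assuming the parameters discussed in this section are --initrd-inject, --extra-args and --noautoconsole (all standard virt-install options, but an assumption about what the original command used), a wrapper might look like the following. The function name create_guest and every value in it (memory, disk size, OS variant, console argument) are placeholders.

```shell
# Sketch only: the sizes and OS variant below are placeholders, and the choice of
# flags is an assumption about the command that this post originally showed.
create_guest() {
    name="$1"        # libvirt domain name
    ks_path="$2"     # kickstart file on the host
    tree_url="$3"    # installation tree URL (--initrd-inject requires --location)

    # --initrd-inject copies the kickstart file into the root of the installer initrd,
    # --extra-args tells the installer where to find that file inside the guest,
    # --noautoconsole returns to the calling script instead of attaching to a console.
    virt-install \
        --name "$name" \
        --memory 2048 \
        --vcpus 2 \
        --disk size=20 \
        --location "$tree_url" \
        --os-variant centos7.0 \
        --graphics none \
        --initrd-inject "$ks_path" \
        --extra-args "inst.ks=file:/$(basename "$ks_path") console=ttyS0" \
        --noautoconsole
}
```

A call would look like `create_guest myguest /root/ks.cfg <install tree URL>`. Older installers use ks= rather than inst.ks= as the boot argument, so adjust that to match the distribution being installed.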

The key parameters for automating the install are:

  • One parameter specifies the path to the kickstart file on the host machine.
  • A second parameter then specifies where the kickstart file is on the VM.
  • A third parameter specifies not to enter a console, which is the default behaviour. We disable this because entering a console requires manual intervention to exit from it after the kickstart installation completes, before the rest of the script for building an encrypted VM can continue.

 

The Kickstart file

The format of the Kickstart file will not be covered in detail here; however, the configuration lines that are important for automating the KVM VM build are described below:

  • One line specifies that a text based installation should be performed.

  • One line specifies that the VM should be shut down after the kickstart installation completes. This is important as we use it to detect when the installation and configuration is complete before we move on to encrypting the VM operating system disk.

  • One line specifies the URL for the package repository and that it can be reached via the proxy 192.168.0.20 (if you have direct internet access then this line is not required; if you are using an internal repo then the URL should be modified accordingly).

  • One line specifies the location of an additional repo, in this case the EPEL repo, and that it can be accessed via the proxy.
  • One line sets the password for the root user. It is stored in hashed form; you can generate the hash by running the command below:
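The original lines and command are not reproduced here, so the following is only a sketch of what they could look like. It assumes OpenSSL 1.1.1 or later (for `passwd -6`), a proxy port of 3128 and placeholder repository URLs; none of these come from the original kickstart file. The directives used (text, shutdown, url, repo, rootpw --iscrypted) are standard kickstart commands.

```shell
# Sketch: hash a root password and print kickstart lines matching the list above.
# Assumptions: OpenSSL 1.1.1+, proxy port 3128, and the placeholder repository URLs.
hash_password() {
    # -6 selects SHA-512 crypt; -stdin keeps the password off the command line.
    printf '%s\n' "$1" | openssl passwd -6 -stdin
}

kickstart_header() {
    root_hash="$1"
    proxy="http://192.168.0.20:3128"   # port is hypothetical; the post only gives the address
    cat <<EOF
text
shutdown
url --url="http://mirror.example.com/centos/7/os/x86_64/" --proxy="${proxy}"
repo --name="epel" --baseurl="http://mirror.example.com/epel/7/x86_64/" --proxy="${proxy}"
rootpw --iscrypted ${root_hash}
EOF
}
```

For example, `kickstart_header "$(hash_password 'changeme')" > ks-header.cfg` writes the fragment to a file. The hash is salted randomly, so it differs on every run even for the same password.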

There was also a requirement to configure auditing and logging. Some of the configuration files were quite long, so it was too unwieldy to hardcode their entire contents into the kickstart file and use heredocs to write them out on the guest. Instead, I compressed each file with gzip and base64 encoded the result.

The Kickstart file includes blocks of code such as the example below:

This command decodes a base64 encoded string, decompresses it and writes it to a file; the string contains the code for a shell script. This is a convenient way to include scripts without embedding their entire code using heredocs.

To create the base64 encoded and compressed script enter the script as is into a file, then run the command:
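The exact commands are not reproduced here, but the round trip can be sketched with standard gzip and base64 tools. The function names pack_script and unpack_script are made up for illustration.

```shell
# Sketch: pack a script into a single base64 line, and unpack it again.
pack_script() {
    # $1: path of the script to encode.
    # gzip -c writes the compressed stream to stdout; tr removes base64 line wrapping
    # so the result can be pasted into the kickstart file as one string.
    gzip -c "$1" | base64 | tr -d '\n'
}

unpack_script() {
    # $1: base64 string, $2: destination file.
    # This mirrors what the kickstart file does: decode, decompress, write to a file.
    printf '%s' "$1" | base64 -d | gunzip > "$2"
}
```

On the build host, `pack_script audit-setup.sh` (a hypothetical file name) prints the string to embed. On the guest, the decode pipeline from unpack_script recreates the original file byte for byte.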

 

Detecting completion of the kickstart script

After the virt-install command is run, the virtual machine build script virt-create-guest.sh waits for the VM to enter the shut off state (recall that the kickstart file specified that the machine should be shut down after installation). It does this by running virsh domstate <guest vm name> and checking whether it returns “shut off”.
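The snippet from virt-create-guest.sh is not reproduced here. A minimal polling loop along the same lines could look like this; the function name, the poll interval and the timeout are assumptions rather than values taken from the original script.

```shell
# Sketch: poll "virsh domstate" until the guest reports "shut off".
# The default interval and timeout are arbitrary, not taken from the original script.
wait_for_shutoff() {
    guest="$1"
    interval="${2:-10}"    # seconds between polls
    timeout="${3:-3600}"   # give up after roughly this many seconds
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Command substitution strips the trailing blank line that virsh prints.
        state="$(virsh domstate "$guest" 2>/dev/null)"
        if [ "$state" = "shut off" ]; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "timed out waiting for $guest to shut off" >&2
    return 1
}
```

The build script can then gate the next step on the result, for example `wait_for_shutoff myguest && echo "install finished"`, before moving on to encrypting the disk. The timeout stops a failed installation from hanging the script forever.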