This post describes a way of creating user-specific databases on HDInsight standard. It uses a technique similar to the one described in a previous post.
Overview
The script creates databases using beeline, taking the list of database names from a CSV file. Since we are creating user-specific databases, each database name should match the corresponding username.
- First create a CSV file
  - The first line should contain the header name dbname
  - The subsequent lines should contain the database names, one per line
- Store the CSV file on the default Azure Storage account
- Attach the Storage account to the HDInsight cluster
- Deploy the cluster with an ARM template that uses a custom script
- The script
  - Determines the cluster name
  - Looks for a file named <clustername>-user-db-list.csv on the storage account, based on the cluster name
  - Copies the file to the node and iterates through its lines, creating a database for each entry
The script is available on GitHub: https://github.com/vijayjt/AzureHDInsight/blob/master/script-actions/create-user-hive-dbs.sh
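To make the iteration step more concrete, here is a minimal sketch of how the CSV could be turned into Hive statements. This is not the code from the linked script. The function name `generate_create_statements`, the file paths and the beeline connection string in the comments are assumptions made for illustration only.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads a CSV whose first line is the header "dbname"
# and prints one CREATE DATABASE statement for each remaining non-empty line.
generate_create_statements() {
    local csv_file="$1"
    local dbname
    # tail -n +2 skips the header line; tr strips Windows line endings.
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    tail -n +2 "$csv_file" | tr -d '\r' |
    while IFS=, read -r dbname _ || [ -n "$dbname" ]; do
        if [ -z "$dbname" ]; then
            continue
        fi
        printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$dbname"
    done
}

# Possible usage on a cluster node (paths and JDBC URL are assumptions and
# will depend on how the cluster is set up):
#   generate_create_statements /tmp/mycluster-user-db-list.csv > /tmp/create-dbs.hql
#   beeline -u 'jdbc:hive2://headnodehost:10001/;transportMode=http' -f /tmp/create-dbs.hql
```

Writing the statements to a file first and then passing it to beeline with `-f` means only one connection is opened, rather than one per database.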
Future improvements
If we wanted to create user-specific databases but use a different name for each database, the CSV and script could be modified to use two columns: the first for the database name and the second for the owner of the database.
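A rough sketch of what the two-column variant might look like is shown below. It assumes a CSV with the header `dbname,owner` and uses Hive's `ALTER DATABASE ... SET OWNER USER` statement to assign the owner. The function name `generate_statements_with_owner` is hypothetical and the existing script does not contain it.

```shell
#!/usr/bin/env bash

# Hypothetical helper for a two-column CSV (header: dbname,owner).
# Prints a CREATE DATABASE and an ALTER DATABASE ... SET OWNER statement
# for every row that has both fields; incomplete rows are skipped.
generate_statements_with_owner() {
    local csv_file="$1"
    local dbname owner
    tail -n +2 "$csv_file" | tr -d '\r' |
    while IFS=, read -r dbname owner || [ -n "$dbname" ]; do
        if [ -z "$dbname" ] || [ -z "$owner" ]; then
            continue
        fi
        printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$dbname"
        printf 'ALTER DATABASE %s SET OWNER USER %s;\n' "$dbname" "$owner"
    done
}
```

For example, a row `sales,alice` would produce a database named sales owned by the user alice.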
The script assumes that the name of the storage account holding the CSV file contains the string artifacts; it could and should be updated to take the storage account and container name as parameters.
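One way the parameterised version could work is sketched below. The function name `build_csv_uri` and the argument order are assumptions, as is the idea that the CSV sits at the root of the container. The result uses the standard `wasb://<container>@<account>.blob.core.windows.net/<path>` form for Azure Storage paths.

```shell
#!/usr/bin/env bash

# Hypothetical helper: builds the location of the CSV file from the storage
# account, container and cluster name instead of guessing the account by name.
build_csv_uri() {
    if [ "$#" -ne 3 ]; then
        echo "usage: build_csv_uri <storage-account> <container> <cluster-name>" >&2
        return 1
    fi
    local account="$1" container="$2" cluster="$3"
    printf 'wasb://%s@%s.blob.core.windows.net/%s-user-db-list.csv\n' \
        "$container" "$account" "$cluster"
}

# Possible usage inside the script action (the local target path is an assumption):
#   csv_uri=$(build_csv_uri "$1" "$2" "$cluster_name") || exit 1
#   hdfs dfs -copyToLocal "$csv_uri" /tmp/
```

The storage account and container would then be passed as script action parameters from the ARM template, so the same script could be reused across environments with differently named accounts.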