Installing Acceldata Torch

Acceldata Torch is deployed and runs on Kubernetes. Deployment is driven by Replicated Kots, which provides a single-click installation experience to end customers. Torch can be deployed both on a managed cloud Kubernetes environment and on on-premise machines. In an on-premise environment, Replicated Kots installs Kubernetes on the provided nodes and ensures that Torch is deployed in that environment.

Once you sign up for Torch, you will be provided with a license file that needs to be used during the installation process.

To install Acceldata Torch, you must provide either a cloud-managed Kubernetes environment or on-premise nodes on which Kubernetes will be installed.

Once the Kubernetes environment is ready, the process of configuring and installing Torch is the same for both environments.

Minimum Hardware Recommendation

The recommended hardware configuration depends on the following factors:

  • Amount of data to be processed
  • Type of Spark deployment used
Data Volume            | Spark deployment mode           | K8S Cluster Configuration
Low (< 10GB)           | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
Low (< 10GB)           | Spark on Kubernetes             | 1 master and 4 worker nodes (4 cores, 8GB+ memory each)
Medium (10GB to 100GB) | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
Medium (10GB to 100GB) | Spark on Kubernetes             | 1 master and 6 worker nodes (4 cores, 8GB+ memory each)
High (100GB+)          | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
High (100GB+)          | Spark on Kubernetes             | 1 master and 8 worker nodes (4 cores, 16GB+ memory each)

On-premise software installation

The first step in the process is to install a Kubernetes cluster on the nodes.

  • SSH into the master node.
  • Execute the following command:
curl -sSL https://k8s.kurl.sh/torch-db-kots | sudo bash
  • Follow the guided procedure step by step.
  • If you are prompted with the statement This application is incompatible with memory swapping enabled. Disable swap to continue? (Y/n), press Y.
  • The firewall must be disabled (a sketch of the commands is given after this list).
  • Select the network interface, if prompted.
  • Finally, Kots installs the components required for the Kubernetes master.
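
The exact commands for disabling swap and the firewall depend on the operating system. A minimal sketch for a systemd-based host that uses firewalld; the fstab edit is only needed if you want swap to stay off after a reboot:

# Turn off swap now and comment out swap entries so it stays off after reboot
sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
# Stop and disable firewalld
sudo systemctl disable --now firewalld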

Once the Kots components are installed, copy the content printed at the end of the installation and store it for future reference. Example output:

Installation
Complete ✔
The UIs of Prometheus, Grafana and Alertmanager are exposed on NodePorts 30900, 30902 and 30903 respectively.
To access Grafana use the generated user:password of admin:xxxxxxxxx .
Kotsadm: http://xxx.xxx.xxx.xxx:8800
Login with password (will not be shown again): xxxxxxxxx
To access the cluster with kubectl, copy kubeconfig to your home directory:
cp /etc/kubernetes/admin.conf ~/.kube/config
chown -R root ~/.kube
echo unset KUBECONFIG >> ~/.profile
bash -l
You need to use sudo to copy and chown admin.conf.
Node join commands expire after 24 hours.
To generate new node join commands, run curl -sSL https://kurl.sh/torch-db-kots/tasks.sh | sudo bash -s join_token on this node.
To add worker nodes to this installation, run the following script on your other nodes:
curl -sSL https://kurl.sh/torch-db-kots/join.sh | sudo bash -s kubernetes-master-address=xxx.xxx.xxx.xxx:6443 kubeadm-token=v5atbd.gvkm08e0lx3t8iks kubeadm-token-ca-hash=sha256:a0fe34a1f1fc7ea9d4adb76e58c6264b555302f12b07d7fb9352d58abd2d1731 kubernetes-version=1.19.2 docker-registry-ip=xxx.xxx.xxx.xxx

Next, log in to the worker nodes and execute the join command shown at the end of the master installation output.

curl -sSL https://kurl.sh/torch-db-kots/join.sh | sudo bash -s kubernetes-master-address=xxx.xxx.xxx.xxx:6443 kubeadm-token=v5atbd.gvkm08e0lx3t8iks kubeadm-token-ca-hash=sha256:a0fe34a1f1fc7ea9d4adb76e58c6264b555302f12b07d7fb9352d58abd2d1731 kubernetes-version=1.19.2 docker-registry-ip=xxx.xxx.xxx.xxx

Follow the instructions; when the installation completes, the worker nodes are joined to the cluster.

To check whether the nodes are ready, execute the following command on the master node.

kubectl get nodes
NAME STATUS ROLES AGE VERSION
xxxxxxxxxxxxxx Ready master 45m v1.19.2
xxxxxxxxxxxxxx Ready worker 45m v1.19.2

Managed Cloud Kubernetes Installation

In a managed Kubernetes cluster, the nodes are managed by the cloud provider, so only Kots needs to be installed.

Execute the following commands in an environment where kubectl is configured to point to the cluster.

curl https://kots.io/install | bash
kubectl kots install torch/db-kots

The above commands install Kots, and the system is ready for Torch deployment.
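
If you want to confirm that kubectl is pointing at the intended cluster before deploying, a quick check using standard kubectl commands:

kubectl config current-context
kubectl get nodes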

Configure and Install Torch

  • Open any browser and go to the following URL: http://master-node:8800 to open the Replicated admin console. Click the Continue to Setup button.

  • If a pop-up warning appears stating that the connection is not private, proceed by adding an exception.

  • In the following window, click Skip & continue to bypass setting an SSL certificate for the Admin console.


  • The password window is displayed.


Enter the Kots admin password that was displayed when the installation completed. Look for the following lines in the installation output:

Kotsadm: http://xxx.xxx.xxx.xxx:8800
Login with password (will not be shown again): xxxxxxxxx
  • Upload the license file provided by Acceldata in the window that is displayed next.


  • Next, provide the configurations described in the Configurations section below.

  • Click Continue.

  • In a few minutes, Torch installation will complete and the Kubernetes artifacts will be deployed.

Configurations

Torch version

Displays the current Acceldata Torch version that is to be installed. This is a read-only configuration for reference.


Hive Configuration

Click Enable hive support if Hive support is required. Upload the hive-site.xml file at the specified location. If enabled, also provide the configurations under Other Hadoop Configuration.
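
If the required XML files need to be pulled from an existing Hadoop cluster, one way is to copy them from a cluster node. The host name below is a placeholder, and the paths (/etc/hive/conf, /etc/hadoop/conf) are typical defaults that may differ in your distribution; core-site.xml and hdfs-site.xml are used by the Other Hadoop Configuration section below:

# Copy Hive and Hadoop client configuration from an existing cluster node
scp user@hadoop-edge-node:/etc/hive/conf/hive-site.xml .
scp user@hadoop-edge-node:/etc/hadoop/conf/core-site.xml .
scp user@hadoop-edge-node:/etc/hadoop/conf/hdfs-site.xml .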


Other Hadoop Configuration

If you have enabled Hive support, core-site.xml and hdfs-site.xml must be provided. This configuration is also required if job results are to be saved in HDFS.


Job result persistence configuration

Torch stores job results in a distributed file system. Currently, it can store results in HDFS or AWS S3.

Select one of the two options given below:

  • Use HDFS file system
  • Use AWS S3 file system

HDFS configuration:


Inputs required:

  • Directory: HDFS directory where job results will be stored (Default: /tmp/ad/torch_results)
note

Other Hadoop configurations are also required for this option. Refer to the Other Hadoop Configuration section above.
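
If you prefer to pre-create the results directory, a minimal sketch using the HDFS CLI. The path matches the default above; the permission shown is an assumption, so tighten it to your policy:

# Create the results directory and open it up for the Torch job user
hdfs dfs -mkdir -p /tmp/ad/torch_results
hdfs dfs -chmod 1777 /tmp/ad/torch_results
hdfs dfs -ls /tmp/ad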

AWS S3 configuration:


Inputs required:

  1. AWS S3 Access key: Access Key for the bucket

  2. AWS S3 Secret key: Secret Key for the bucket

  3. AWS S3 Bucket name: Bucket where the job results are to be stored.

    note

    The bucket name should contain only alphanumeric characters.
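
Before entering the keys, you may want to confirm that they can actually list the bucket. A quick check with the AWS CLI; the keys, bucket name, and region below are placeholders:

# Verify the credentials can access the bucket
AWS_ACCESS_KEY_ID=<access-key> AWS_SECRET_ACCESS_KEY=<secret-key> \
  aws s3 ls s3://<bucket-name> --region us-east-1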

Spark Support

Torch uses Apache Spark for running jobs. Currently, Torch supports three modes of deployment.

Use Embedded Spark


In this mode, Torch runs jobs locally inside a service. No separate installation or configuration is required.

note

This should only be used for testing.

Use Existing Spark cluster


If there is an existing Hadoop cluster with Apache Spark installed, Torch can run the jobs inside that cluster. Apache Livy must also be installed on the cluster. Torch connects to Livy over HTTP and submits the Spark jobs.

Inputs required:

  1. Apache Livy URL: HTTP endpoint for Livy
  2. Apache Livy Queue: The queue name to which the jobs are submitted
  3. Number of executors: Number of executors that are spawned for each job
  4. Number of CPU cores: Number of CPU cores per executor
  5. Memory per executor: Amount of memory to be allocated to each executor
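
A quick way to confirm the Livy URL is reachable from the environment where Torch runs is to query Livy's REST API; 8998 is Livy's default port and the host below is a placeholder:

# List active Livy sessions; an HTTP 200 response confirms connectivity
curl http://<livy-host>:8998/sessions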

Deploy Spark on Kubernetes


In this mode, the installer deploys Spark on Kubernetes and uses it for running the jobs.

Inputs required:

  1. Number of executors: Number of executors to be spawned for each job
  2. Number of CPU cores: Number of CPU cores per executor
  3. Memory per executor: Amount of memory to be allocated to each executor
note

This is the preferred option.

Notification Configuration

Click Enable notification if notification support is required. When enabled, Torch sends emails or Slack messages for various events occurring in the system.


Inputs required:

  1. Default email ID: The default email ID from which mails are sent
  2. Default Slack webhook URL: The default webhook URL used to post Slack messages
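
To verify the Slack webhook before saving the configuration, a simple test post using the standard Slack incoming-webhook API (the webhook URL below is a placeholder):

curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"Torch notification test"}' \
  https://hooks.slack.com/services/XXX/YYY/ZZZ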

After configuring the system, click the Deploy button on the next screen. Deployment takes a few minutes to complete.


Verify installation

After a few minutes, the following services should be visible.

☁ ~ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ad-analysis-service ClusterIP 10.96.0.58 <none> 19021/TCP 11d
ad-catalog ClusterIP 10.96.2.176 <none> 8888/TCP 11d
ad-catalog-auth-db ClusterIP 10.96.1.240 <none> 27017/TCP 11d
ad-catalog-db ClusterIP 10.96.3.72 <none> 5432/TCP 11d
ad-catalog-ui ClusterIP 10.96.3.158 <none> 4000/TCP 11d
ad-torch-auth ClusterIP 10.96.0.106 <none> 9090/TCP 11d
ad-torch-ml ClusterIP 10.96.3.107 <none> 19035/TCP 11d
kotsadm ClusterIP 10.96.2.102 <none> 3000/TCP 11d
kotsadm-postgres ClusterIP 10.96.3.30 <none> 5432/TCP 11d
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 11d
kurl-proxy-kotsadm NodePort 10.96.3.113 <none> 8800:8800/TCP 11d
torch-api-gateway NodePort 10.96.2.129 <none> 80:80/TCP,443:443/TCP 10d
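
To confirm that the workloads themselves came up, and to see which node port the API gateway was assigned, you can also run the following standard kubectl commands:

kubectl get pods
kubectl get svc torch-api-gateway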

Accessing the Torch Application

For the torch-api-gateway service shown above, the system assigns node port 80.

The Torch UI can be accessed from port 80 of the Kubernetes master node.

For example: Open http://xxx.xxx.xxx.xxx to start the Torch UI, where xxx.xxx.xxx.xxx is the K8S master node IP or hostname.
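
A quick reachability check can be run from any machine that can reach the master node; the IP below is a placeholder:

curl -I http://xxx.xxx.xxx.xxx/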