Installing Acceldata Torch

Acceldata Torch is deployed and runs on Kubernetes. Deployment is driven by Replicated Kots, which provides a single-click installation experience to end customers. Torch can be deployed both on a managed cloud Kubernetes environment and on on-premise machines. In an on-premise environment, Replicated Kots installs Kubernetes on the provided nodes and ensures that Torch is deployed in that environment.

Once you sign up for Torch, you will be provided with a license file that needs to be used during the installation process.

To install Acceldata Torch, you must provide either a cloud-managed Kubernetes environment or on-premise nodes on which Kubernetes will be installed.

Once the Kubernetes environment is ready, the process of configuring and installing Torch is the same for both environments.

Minimum Hardware Recommendation

The recommended hardware configuration depends on the following factors:

  • Amount of data to be processed
  • Type of Spark deployment used
Data Volume            | Spark deployment mode           | K8S Cluster Configuration
Low (< 10GB)           | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
Low (< 10GB)           | Spark on Kubernetes             | 1 master and 4 worker nodes (4 cores, 8GB+ memory each)
Medium (10GB to 100GB) | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
Medium (10GB to 100GB) | Spark on Kubernetes             | 1 master and 6 worker nodes (4 cores, 8GB+ memory each)
High (100GB+)          | External Spark (Hadoop cluster) | 1 master and 2 worker nodes (2 cores, 8GB+ memory each)
High (100GB+)          | Spark on Kubernetes             | 1 master and 8 worker nodes (4 cores, 16GB+ memory each)

On-premise software installation

The first step in the process is to install a Kubernetes cluster on the nodes.

  • SSH into the master node.
  • Execute the following command:
curl -sSL https://k8s.kurl.sh/torch-db-kots | sudo bash
  • Follow the guided procedure step by step.
  • If you are prompted with the statement This application is incompatible with memory swapping enabled. Disable swap to continue? (Y/n), press Y.
  • The firewall must be disabled (a sketch of the commands is given after this list).
  • Select the network interface, if prompted.
  • Finally, Kots installs the components required for the Kubernetes master.
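
The exact commands for disabling swap and the firewall depend on the operating system. A minimal sketch for a systemd-based host that uses firewalld; the fstab edit is only needed if you want swap to stay off after a reboot:

# Turn off swap now and comment out swap entries so it stays off after reboot
sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
# Stop and disable firewalld
sudo systemctl disable --now firewalld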

Once the Kots components are installed, copy the content printed at the end of the installation and store it for future reference. Example output:

Installation
Complete ✔
The UIs of Prometheus, Grafana and Alertmanager are exposed on NodePorts 30900, 30902 and 30903 respectively.
To access Grafana use the generated user:password of admin:xxxxxxxxx .
Kotsadm: http://xxx.xxx.xxx.xxx:8800
Login with password (will not be shown again): xxxxxxxxx
To access the cluster with kubectl, copy kubeconfig to your home directory:
cp /etc/kubernetes/admin.conf ~/.kube/config
chown -R root ~/.kube
echo unset KUBECONFIG >> ~/.profile
bash -l
You need to use sudo to copy and chown admin.conf.
Node join commands expire after 24 hours.
To generate new node join commands, run curl -sSL https://kurl.sh/torch-db-kots/tasks.sh | sudo bash -s join_token on this node.
To add worker nodes to this installation, run the following script on your other nodes:
curl -sSL https://kurl.sh/torch-db-kots/join.sh | sudo bash -s kubernetes-master-address=xxx.xxx.xxx.xxx:6443 kubeadm-token=v5atbd.gvkm08e0lx3t8iks kubeadm-token-ca-hash=sha256:a0fe34a1f1fc7ea9d4adb76e58c6264b555302f12b07d7fb9352d58abd2d1731 kubernetes-version=1.19.2 docker-registry-ip=xxx.xxx.xxx.xxx

Next, log in to the worker nodes and execute the join command shown at the end of the master installation output.

curl -sSL https://kurl.sh/torch-db-kots/join.sh | sudo bash -s kubernetes-master-address=xxx.xxx.xxx.xxx:6443 kubeadm-token=v5atbd.gvkm08e0lx3t8iks kubeadm-token-ca-hash=sha256:a0fe34a1f1fc7ea9d4adb76e58c6264b555302f12b07d7fb9352d58abd2d1731 kubernetes-version=1.19.2 docker-registry-ip=xxx.xxx.xxx.xxx

Follow the instructions; when the installation completes, the worker nodes are joined to the cluster.

To check whether the nodes are ready, execute the following command on the master node.

kubectl get nodes
NAME STATUS ROLES AGE VERSION
xxxxxxxxxxxxxx Ready master 45m v1.19.2
xxxxxxxxxxxxxx Ready worker 45m v1.19.2

Managed Cloud Kubernetes Installation

In a managed Kubernetes cluster, the nodes are managed by the cloud provider, so only Kots needs to be installed.

Execute the following commands in an environment where kubectl is configured to point to the cluster.

curl https://kots.io/install | bash
kubectl kots install torch/db-kots

The above commands install Kots, and the system is ready for Torch deployment.
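
If you want to confirm that kubectl is pointing at the intended cluster before deploying, a quick check using standard kubectl commands:

kubectl config current-context
kubectl get nodes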

Configure and Install Torch

  • Open any browser and go to the following URL: http://master-node:8800 to open the Replicated admin console. Click the Continue to Setup button.

  • If a pop-up warning appears stating that the connection is not private, proceed by adding an exception.

  • In the following window, click Skip & continue to bypass setting an SSL certificate for the Admin console.


  • The password window is displayed.


Enter the Kots admin password that was displayed when the installation completed. Look for the following lines in the installation output:

Kotsadm: http://xxx.xxx.xxx.xxx:8800
Login with password (will not be shown again): xxxxxxxxx
  • Upload the license file provided by Acceldata in the window that is displayed next.


  • Next, provide the configurations described in the Configurations section below.

  • Click Continue.

  • In a few minutes, Torch installation will complete and the Kubernetes artifacts will be deployed.

Configurations

Torch version

Displays the current Acceldata Torch version that is to be installed. This is a read-only configuration for reference.


Hive Configuration

Click Enable hive support if Hive support is required. Upload the hive-site.xml file at the specified location. If enabled, also provide the configurations under Other Hadoop Configuration.
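
If the required XML files need to be pulled from an existing Hadoop cluster, one way is to copy them from a cluster node. The host name below is a placeholder, and the paths (/etc/hive/conf, /etc/hadoop/conf) are typical defaults that may differ in your distribution; core-site.xml and hdfs-site.xml are used by the Other Hadoop Configuration section below:

# Copy Hive and Hadoop client configuration from an existing cluster node
scp user@hadoop-edge-node:/etc/hive/conf/hive-site.xml .
scp user@hadoop-edge-node:/etc/hadoop/conf/core-site.xml .
scp user@hadoop-edge-node:/etc/hadoop/conf/hdfs-site.xml .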


Other Hadoop Configuration

If you have enabled Hive support, core-site.xml and hdfs-site.xml must be provided. This configuration is also required if job results are to be saved in HDFS.


Job result persistence configuration

Torch stores job results in a distributed file system. Currently, it can store results in HDFS or AWS S3.

Select one of the two options given below:

  • Use HDFS file system
  • Use AWS S3 file system

HDFS configuration:


Inputs required:

  • Directory: HDFS directory where job results will be stored (Default: /tmp/ad/torch_results)
note

Other Hadoop configurations are also required for this option. Refer to the Other Hadoop Configuration section above.
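
If you prefer to pre-create the results directory, a minimal sketch using the HDFS CLI. The path matches the default above; the permission shown is an assumption, so tighten it to your policy:

# Create the results directory and open it up for the Torch job user
hdfs dfs -mkdir -p /tmp/ad/torch_results
hdfs dfs -chmod 1777 /tmp/ad/torch_results
hdfs dfs -ls /tmp/ad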

AWS S3 configuration:


Inputs required:

  1. AWS S3 Access key: Access Key for the bucket

  2. AWS S3 Secret key: Secret Key for the bucket

  3. AWS S3 Bucket name: Bucket where the job results are to be stored.

    note

    The bucket name should contain only alphanumeric characters.
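
Before entering the keys, you may want to confirm that they can actually list the bucket. A quick check with the AWS CLI; the keys, bucket name, and region below are placeholders:

# Verify the credentials can access the bucket
AWS_ACCESS_KEY_ID=<access-key> AWS_SECRET_ACCESS_KEY=<secret-key> \
  aws s3 ls s3://<bucket-name> --region us-east-1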

Spark Support

Torch uses Apache Spark for running jobs. Currently, Torch supports three modes of deployment.

Use Embedded Spark


In this mode, Torch runs jobs locally inside a service. No separate installation or configuration is required.

note

This should only be used for testing.

Use Existing Spark cluster


If there is an existing Hadoop cluster with Apache Spark installed, Torch can run the jobs inside that cluster. Apache Livy must also be installed on the cluster. Torch connects to Livy over HTTP and submits the Spark jobs.

Inputs required:

  1. Apache Livy URL: HTTP endpoint for Livy
  2. Apache Livy Queue: The queue name to which the jobs are submitted
  3. Number of executors: Number of executors that are spawned for each job
  4. Number of CPU cores: Number of CPU cores per executor
  5. Memory per executor: Amount of memory to be allocated to each executor
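
A quick way to confirm the Livy URL is reachable from the environment where Torch runs is to query Livy's REST API; 8998 is Livy's default port and the host below is a placeholder:

# List active Livy sessions; an HTTP 200 response confirms connectivity
curl http://<livy-host>:8998/sessions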

Deploy Spark on Kubernetes


In this mode, the installer deploys Spark on Kubernetes and uses it for running the jobs.

Inputs required:

  1. Number of executors: Number of executors to be spawned for each job
  2. Number of CPU cores: Number of CPU cores per executor
  3. Memory per executor: Amount of memory to be allocated to each executor
note

This is the preferred option.

Notification Configuration

Click Enable notification if notification support is required. When enabled, Torch sends emails or Slack messages for various events occurring in the system.


Inputs required:

  1. Default email ID: The default email ID from which mails are sent
  2. Default Slack webhook URL: The default webhook URL used to post Slack messages
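
To verify the Slack webhook before saving the configuration, a simple test post using the standard Slack incoming-webhook API (the webhook URL below is a placeholder):

curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"Torch notification test"}' \
  https://hooks.slack.com/services/XXX/YYY/ZZZ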

After configuring the system, click the Deploy button on the next screen. Deployment takes a few minutes to complete.


Verify installation

After a few minutes, the following services should be visible.

☁ ~ kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ad-analysis-service ClusterIP 10.96.0.58 <none> 19021/TCP 11d
ad-catalog ClusterIP 10.96.2.176 <none> 8888/TCP 11d
ad-catalog-auth-db ClusterIP 10.96.1.240 <none> 27017/TCP 11d
ad-catalog-db ClusterIP 10.96.3.72 <none> 5432/TCP 11d
ad-catalog-ui ClusterIP 10.96.3.158 <none> 4000/TCP 11d
ad-torch-auth ClusterIP 10.96.0.106 <none> 9090/TCP 11d
ad-torch-ml ClusterIP 10.96.3.107 <none> 19035/TCP 11d
kotsadm ClusterIP 10.96.2.102 <none> 3000/TCP 11d
kotsadm-postgres ClusterIP 10.96.3.30 <none> 5432/TCP 11d
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 11d
kurl-proxy-kotsadm NodePort 10.96.3.113 <none> 8800:8800/TCP 11d
torch-api-gateway NodePort 10.96.2.129 <none> 80:80/TCP,443:443/TCP 10d
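
To confirm that the workloads themselves came up, and to see which node port the API gateway was assigned, you can also run the following standard kubectl commands:

kubectl get pods
kubectl get svc torch-api-gateway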

Accessing the Torch Application

For the torch-api-gateway service shown above, the system assigns node port 80.

The Torch UI can be accessed from port 80 of the Kubernetes master node.

For example: Open http://xxx.xxx.xxx.xxx to start the Torch UI, where xxx.xxx.xxx.xxx is the K8S master node IP or hostname.
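
A quick reachability check can be run from any machine that can reach the master node; the IP below is a placeholder:

curl -I http://xxx.xxx.xxx.xxx/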