Many organizations choose Apache Ambari to simplify their Hadoop operations. Ambari provides an easy way to configure and manage a Hadoop platform and its services. Because DataTorrent RTS is a Hadoop-native platform, it is a natural candidate for management through Ambari. The DataTorrent RTS platform helps you quickly build real-time stream and batch data analytics applications using the open source Apache Apex platform, and it is a popular choice among developers writing streaming applications for real-time analysis of big data. To ease adoption, DataTorrent has added support for installing and managing the platform with Apache Ambari and Apache BigTop. In this blog, we’ll look at how to use ambari-DataTorrent-service to install and manage the DataTorrent RTS platform using Ambari.

Note that service installation has been tested under the HDP stack only. You can use either a physical Ambari-managed cluster or a virtual cluster built with Vagrant or Docker. To learn how to set up an Ambari-managed cluster of virtual machines, see the blog post How to Build a Hadoop VM With Ambari and Vagrant. Ready-made Ambari Docker instances are available at Ambari on Docker.

Deploy using Ambari UI

Prerequisites:

• Ambari 2.2.0.0 or above.
• DataTorrent RTS 3.4.0 or above.

1. Connect via ssh to the server where Ambari is installed.

2. Download the Ambari service for DataTorrent RTS into the HDP stack folder by running the following commands:

VERSION=$(hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/')

sudo git clone https://github.com/DataTorrent/ambari-DataTorrent-service.git /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/DataTorrent

Note: The ambari-DataTorrent-service code is hosted in DataTorrent’s GitHub repository.
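To illustrate what the sed expression in step 2 extracts, here is a minimal sketch run against a sample line instead of a live hdp-select call. The sample line and its version number are assumptions made for illustration; the actual output depends on your cluster.

```shell
# Sample line in the "hadoop-client - <version>" format the sed expression expects.
# The version number here is illustrative only, not taken from a real cluster.
sample='hadoop-client - 2.3.4.0-3485'

# Keep only the major.minor part, which names the HDP stack folder.
VERSION=$(printf '%s\n' "$sample" | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/')

echo "Stack folder: /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services"
```

If VERSION comes out empty or still contains the full line on your system, the hdp-select output format probably differs from the sample, and the pattern would need adjusting before running git clone.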

3. Run the command that matches your environment to restart the Ambari service:

#sandbox

service ambari restart

#non sandbox

sudo service ambari-server restart

4. In the bottom-left corner of the Ambari dashboard, open the Actions list and click Add Service.

5. Select DataTorrent RTS, and click Next.

6. Modify the appropriate configuration properties, and click Next.

Configuration Properties:

a. Config File Path: Full path to an existing dt-site.xml file to use for the new installation. Overrides the default and any previous dt-site.xml.

b. Environmental Expression: Adds export <expr> to the custom-env.sh file. Used to set or override environment variables. Common examples include:

-E JAVA_HOME=/usr/bin/java (Java used by dtgateway and the apex CLI)

-E DT_LOG_DIR=/var/log/DataTorrent (directory for dtgateway logs)

-E DT_RUN_DIR=/var/run/DataTorrent (directory for dtgateway pid files)

c. Environment File Path: Full path to an existing custom-env.sh file to use for the new installation. Overrides the default and any previous custom-env.sh.

d. Gateway Address: DataTorrent Gateway listen address. The port is required, but the IP address is optional. Default: 0.0.0.0:9090

e. GroupName: Use the <group> group for installation. Default: dtadmin

f. Hadoop Home: Use <path> as the location of the hadoop executable. Overrides the defaults of HADOOP_PREFIX and PATH.

g. Install Dir: Use <path> as the base installation directory. Must be an absolute path. Default: /opt/DataTorrent

h. Username: Use the <user> user account for installation. Default: dtadmin

7. Click Deploy
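As a rough sketch of what the Environmental Expression property in step 6 amounts to, each expression becomes an export line in custom-env.sh. The lines below reuse the example values listed under that property; the exact layout of the generated file is an assumption, not something confirmed here.

```shell
# Assumed contents of custom-env.sh after supplying the three example expressions.
# Values are the examples from the property description, not recommended settings.
export JAVA_HOME=/usr/bin/java
export DT_LOG_DIR=/var/log/DataTorrent
export DT_RUN_DIR=/var/run/DataTorrent
```

A file like this could also be prepared ahead of time and supplied through the Environment File Path property instead of entering each expression separately.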

Manage using Ambari UI

Upon successful deployment, the DataTorrent RTS platform service appears as part of the Ambari stack and can be managed from the Ambari management console. To start or stop the service, go to the ‘Service Actions’ tab and choose the appropriate action.

Note: Screenshots in this blog are from Ambari version 2.2.0

Manage using Ambari REST APIs

Another benefit of wrapping DataTorrent RTS in an Ambari service is that you can monitor and manage the service remotely via REST APIs. Use the following commands for monitoring and management:

export SERVICE=DataTorrent-RTS

export PASSWORD=admin

export AMBARI_HOST=localhost

#detect name of cluster

output=$(curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/clusters)

CLUSTER=$(echo $output | sed -n 's/.*"cluster_name" : "\([^"]*\)".*/\1/p')

#get service status

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#start service

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

#stop service

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
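The cluster-name extraction and the start/stop request bodies can be tried offline before pointing them at a real server. The sketch below is only an illustration: build_state_payload is a hypothetical helper (not part of Ambari), and the sample response is a trimmed, made-up fragment in the shape the sed expression expects.

```shell
# Hypothetical helper, not part of Ambari: builds the JSON body for a state change.
# $1 = service name, $2 = verb used in the context label, $3 = target state
build_state_payload() {
  printf '{"RequestInfo": {"context": "%s %s via REST"}, "Body": {"ServiceInfo": {"state": "%s"}}}' "$2" "$1" "$3"
}

# Made-up response fragment for illustration; a real response carries more fields.
sample_response='{ "items" : [ { "Clusters" : { "cluster_name" : "Sandbox", "version" : "HDP-2.3" } } ] }'

# Same sed expression as in the cluster detection step, applied to the sample.
CLUSTER=$(printf '%s\n' "$sample_response" | sed -n 's/.*"cluster_name" : "\([^"]*\)".*/\1/p')
echo "cluster: $CLUSTER"

# Body for stopping the service (Ambari treats INSTALLED as the stopped state).
payload=$(build_state_payload "DataTorrent-RTS" "Stop" "INSTALLED")
echo "$payload"

# Against a live server, the payload could be sent the same way as above:
# curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d "$payload" \
#   http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
```

Building the body in a helper keeps the service name out of single-quoted strings, where the shell would otherwise leave $SERVICE unexpanded in the context label.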