Many organizations choose Apache Ambari to simplify their Hadoop operations. Ambari provides an easy way to configure and manage a Hadoop platform and its services. DataTorrent RTS, being a Hadoop-native platform, is a good candidate for management through Ambari, so DataTorrent has added an alpha version of Ambari support. This feature is not yet available as part of the product; the Ambari-DataTorrent-service example gives users guidance on installing and managing the DataTorrent RTS platform using Ambari.

The DataTorrent RTS platform helps you quickly build real-time stream and batch data analytics applications using the open source Apache Apex platform. It is a favored option among developers and application writers for writing streaming applications that analyze big data in real time. For easy adoption of the platform, DataTorrent has added support to install and manage the platform using Apache Ambari and Apache BigTop.

In this blog, we illustrate how a user can use the alpha version of Ambari-DataTorrent-Service to install and manage the DataTorrent RTS platform using Ambari. This is not a supported product feature and is meant as an example. Contact us if you would like advice on integrating DataTorrent RTS with Ambari.

Note that service installation has been tested under the HDP stack only. You can use either a physical Ambari-managed cluster or a virtual cluster running on Vagrant or Docker.

Deploy using Ambari UI

Prerequisites:
  • Ambari 2.2.0.0 and above
  • DataTorrent RTS 3.4.0 and above
  1. Connect to the server where Ambari is installed using ssh
  2. Download the Ambari service for DataTorrent RTS into the HDP stack folder by running the following commands:

    VERSION=`hdp-select status hadoop-client | sed 's/hadoop-client - \([0-9]\.[0-9]\).*/\1/'`

    sudo git clone https://github.com/DataTorrent/ambari-DataTorrent-service.git /var/lib/ambari-server/resources/stacks/HDP/$VERSION/services/DataTorrent

    Note: the ambari-DataTorrent-service code is hosted on DataTorrent's GitHub repository

  3. Run the following commands to restart Ambari service:

    #sandbox

    service ambari restart

    #non sandbox

    sudo service ambari-server restart

  4. In the bottom-left corner of the Ambari dashboard, click Add Services from the Actions list
  5. Select DataTorrent RTS, and click Next
  6. Modify the appropriate configuration properties, and click Next

    Configuration Properties:
    1. Config File Path: Full path to existing dt-site.xml file to use for new installation. Overrides default and previous dt-site.xml
    2. Environmental Expression: Adds export <expr> to the custom-env.sh file. Used to set or override environment variables. Common examples include:
      -E JAVA_HOME=/usr/bin/java (Java used by dtgateway and apex cli)

      -E DT_LOG_DIR=/var/log/DataTorrent (directory for dtgateway logs)

      -E DT_RUN_DIR=/var/run/DataTorrent (directory for dtgateway pid files)

    3. Environment File Path: Full path to existing custom-env.sh file to use for new installation. Overrides default and previous custom-env.sh
    4. Gateway Address: DataTorrent Gateway listen address. The port is required; the IP address is optional. Default: 0.0.0.0:9090
    5. GroupName: Use <group> group for installation. Default: dtadmin
    6. Hadoop Home: Use <path> as the location of the hadoop executable. Overrides defaults of HADOOP_PREFIX and PATH
    7. Install Dir: Use <path> as base installation directory. Must be an absolute path. Default: /opt/DataTorrent
    8. Username: Use <user> user account for installation. Default: dtadmin
  7. Click Deploy
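
The version detection in step 2 can be checked before anything is cloned. The following is a minimal sketch, not part of the service itself: the function names hdp_stack_version and service_dir are hypothetical, and the string "hadoop-client - 2.4.0.0-169" mentioned in the comments only illustrates the output format that the sed expression in step 2 expects from hdp-select.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight helpers; the function names are hypothetical.

# Extract the major.minor HDP version from one line of
# `hdp-select status hadoop-client` output,
# e.g. "hadoop-client - 2.4.0.0-169" gives "2.4".
# Prints nothing when the line does not match.
hdp_stack_version() {
  printf '%s\n' "$1" |
    sed -n 's/^hadoop-client - \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Build the stack folder used as the clone target in step 2.
service_dir() {
  printf '/var/lib/ambari-server/resources/stacks/HDP/%s/services/DataTorrent\n' "$1"
}

# Only query hdp-select when it is actually installed on this host.
if command -v hdp-select >/dev/null 2>&1; then
  VERSION=$(hdp_stack_version "$(hdp-select status hadoop-client)")
  if [ -n "$VERSION" ]; then
    echo "Service would be cloned into: $(service_dir "$VERSION")"
  else
    echo "Could not detect the HDP version" >&2
  fi
fi
```

On a host without hdp-select the script only defines the two functions and prints nothing. An empty result from hdp_stack_version is a sign that the clone in step 2 would land in the wrong folder, so it is worth checking before running it with sudo.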

Manage using Ambari UI

Upon successful deployment, the DataTorrent RTS platform service appears as part of the Ambari stack and can be managed using the Ambari management console. To start or stop this service, go to the 'Service Actions' tab and choose the appropriate action.

Note: Screenshots in this blog are from Ambari version 2.2.0

Manage using Ambari REST APIs

Another benefit of wrapping DataTorrent RTS in an Ambari service is that you can now monitor and manage it remotely via REST APIs. Use the following commands for monitoring and management:

export SERVICE=DataTorrent-RTS

export PASSWORD=admin

export AMBARI_HOST=localhost

# detect name of cluster

output=`curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' http://$AMBARI_HOST:8080/api/v1/clusters`

CLUSTER=`echo "$output" | sed -n 's/.*"cluster_name" : "\([^"]*\)".*/\1/p'`

# get service status

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X GET http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

# start service (the quoting around $SERVICE lets the shell expand it inside the JSON body)

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Start '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE

# stop service

curl -u admin:$PASSWORD -i -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo": {"context": "Stop '"$SERVICE"' via REST"}, "Body": {"ServiceInfo": {"state": "INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE
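
The start and stop requests differ only in the context message and the target state, so they can be folded into small helpers. This is a sketch under assumptions rather than a supported tool: the function names cluster_name_from_json, state_payload and set_service_state are hypothetical, and the helpers rely on the AMBARI_HOST, PASSWORD, CLUSTER and SERVICE variables set by the commands above.

```shell
#!/usr/bin/env bash
# Illustrative helpers around the Ambari REST calls; names are hypothetical.

# Pull a cluster_name value out of the JSON text returned by /api/v1/clusters.
# If the response lists several clusters, the last one wins.
cluster_name_from_json() {
  printf '%s' "$1" | tr -d '\n' |
    sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Build the request body that asks Ambari to move a service to a state.
# $1 = context message, $2 = desired state (STARTED or INSTALLED)
state_payload() {
  printf '{"RequestInfo": {"context": "%s"}, "Body": {"ServiceInfo": {"state": "%s"}}}' "$1" "$2"
}

# Send the state change. Nothing calls this automatically; it needs
# AMBARI_HOST, PASSWORD, CLUSTER and SERVICE to be set beforehand.
# $1 = desired state (STARTED or INSTALLED)
set_service_state() {
  curl -u "admin:$PASSWORD" -i -H 'X-Requested-By: ambari' -X PUT \
    -d "$(state_payload "Set $SERVICE to $1 via REST" "$1")" \
    "http://$AMBARI_HOST:8080/api/v1/clusters/$CLUSTER/services/$SERVICE"
}
```

With these defined, `set_service_state STARTED` starts the service and `set_service_state INSTALLED` stops it. Building the body with printf avoids the quoting pitfalls of embedding shell variables in a single-quoted JSON string.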

This blog provides a guide on how to integrate DataTorrent RTS with Ambari. This is not a supported feature. Please contact DataTorrent if you need help integrating RTS with Ambari.