August 1, 2015

Talend Hadoop Tutorial – Talend Open Studio and Kerberos HDFS connection – Hortonworks

In this tutorial I will demonstrate Kerberos HDFS connectivity using Talend Open Studio.

This is a simple, step-by-step explanation of Kerberos Hadoop connectivity, with very little theory. I do a lot of practical work in my job, and I explain the problems I have faced there so that others can benefit from my blog. Let’s get started.

Prerequisites

  • Talend Open Studio
  • Hortonworks or any other Hadoop distribution.

Open Talend Studio, create a new Job, and drag a tHDFSOutput component from the Palette.

I am going to explain the connectivity options here.

This is my Talend job. In my case I am reading data from a Cassandra table.




As an alternative, you can use tMysqlInput or any other input component to read the data.

Step 1 - Let’s explore the tHDFSOutput component properties.

Distribution – Hortonworks

Hadoop Version – I have an HDP 2.1 cluster, so I select Hortonworks Data Platform V2.1.0

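Step 2 - Namenode URI: pass the namenode address in double quotes, for example "hdfs://localhost:8020". Use the NameNode RPC port from fs.defaultFS (commonly 8020 on HDP), not 50070, which is the NameNode web UI port in Hadoop 2.x. In my case I have created context variables for these values.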
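If you want to sanity-check the Namenode URI before pasting it into the component, a small plain-Java check like the one below can help. This is only a sketch: the class name NameNodeUriCheck and its check method are made up for this post and are not part of Talend or Hadoop. It assumes that the URI must use the hdfs scheme and that 50070 is the NameNode web UI port, which is the Hadoop 2.x default.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of Talend or Hadoop.
class NameNodeUriCheck {

    // Returns a description of the problem, or null if the URI looks plausible.
    static String check(String nameNodeUri) {
        URI uri;
        try {
            uri = new URI(nameNodeUri);
        } catch (URISyntaxException e) {
            return "not a valid URI: " + e.getMessage();
        }
        if (!"hdfs".equals(uri.getScheme())) {
            return "scheme should be hdfs, got " + uri.getScheme();
        }
        if (uri.getHost() == null) {
            return "missing host name";
        }
        // Assumption: 50070 is the NameNode web UI (HTTP) port, the Hadoop 2.x default.
        if (uri.getPort() == 50070) {
            return "port 50070 is normally the web UI, not the RPC port";
        }
        return null;
    }

    public static void main(String[] args) {
        // Pass your own URI as the first argument, or fall back to an example value.
        String candidate = args.length > 0 ? args[0] : "hdfs://localhost:8020";
        String problem = check(candidate);
        System.out.println(problem == null ? candidate + " looks fine" : candidate + ": " + problem);
    }
}
```

The check does not contact the cluster. It only catches the typical copy-and-paste mistakes, such as a wrong scheme or the web UI port.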

Step 3 - Enter the namenode principal that matches your cluster.

In my case localhost is the hostname, TEST.LOCAL is the domain name (the Kerberos realm), and nn is the namenode service name in a Hortonworks Hadoop cluster, so the principal takes the form "nn/localhost@TEST.LOCAL".
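To make the three parts of the principal easier to see, here is a small plain-Java sketch that splits a service principal into service name, hostname and realm. It assumes the usual Kerberos layout service/host@REALM, so with the values above the input would be nn/localhost@TEST.LOCAL. The class KerberosPrincipalParts is hypothetical and only for illustration. It is not a Talend or Hadoop API.

```java
// Hypothetical helper for illustration; not part of Talend or Hadoop.
class KerberosPrincipalParts {
    final String service;
    final String host;
    final String realm;

    KerberosPrincipalParts(String service, String host, String realm) {
        this.service = service;
        this.host = host;
        this.realm = realm;
    }

    // Splits a service principal of the assumed form service/host@REALM.
    // User principals without a host part (user@REALM) are rejected by this sketch.
    static KerberosPrincipalParts parse(String principal) {
        int slash = principal.indexOf('/');
        int at = principal.lastIndexOf('@');
        boolean wellFormed = slash > 0                 // non-empty service name
                && at > slash + 1                      // non-empty host name
                && at < principal.length() - 1;        // non-empty realm
        if (!wellFormed) {
            throw new IllegalArgumentException(
                    "expected service/host@REALM but got: " + principal);
        }
        return new KerberosPrincipalParts(
                principal.substring(0, slash),
                principal.substring(slash + 1, at),
                principal.substring(at + 1));
    }

    public static void main(String[] args) {
        KerberosPrincipalParts p = parse("nn/localhost@TEST.LOCAL");
        System.out.println("service = " + p.service);
        System.out.println("host    = " + p.host);
        System.out.println("realm   = " + p.realm);
    }
}
```

If the parse fails for the value you typed into the component, the principal is probably missing the hostname or the realm.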

Step 4 - Specify the principal and the keytab of that principal.

Here I am using the hdfs user, so I need to point to the hdfs keytab file.
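A Kerberos login can fail simply because the keytab path is wrong or the file is not readable by the user who runs the job. The sketch below is a plain-Java pre-check for that. The class name KeytabPreflight and the path /path/to/hdfs.keytab are placeholders I made up for illustration, so replace the path with the location of your own keytab.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Talend or Hadoop.
class KeytabPreflight {

    // True if the path points to an existing regular file that the current user can read.
    static boolean isUsable(String keytabPath) {
        Path path = Paths.get(keytabPath);
        return Files.isRegularFile(path) && Files.isReadable(path);
    }

    public static void main(String[] args) {
        // Placeholder path; pass the real keytab location as the first argument.
        String keytab = args.length > 0 ? args[0] : "/path/to/hdfs.keytab";
        if (isUsable(keytab)) {
            System.out.println("Keytab found and readable: " + keytab);
        } else {
            System.out.println("Keytab missing or not readable: " + keytab);
        }
    }
}
```

This only checks the file on disk. It does not verify that the keytab actually contains the principal you entered in the component.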

Here is my complete screen.

If you have any doubts, please drop a comment.
