Talend Hadoop Tutorial – Talend Open Studio and Kerberos HDFS connection – Hortonworks
In this tutorial I will demonstrate how to connect to a Kerberos-secured HDFS using Talend Open Studio.
This is a simple, step-by-step explanation of Kerberos Hadoop connectivity, with very little theory. I do a lot of practical work in my job, and I am writing up the problems I have faced there so that others can benefit from this blog. Let’s get started.
You will need:
- Talend Open Studio
- Hortonworks or any other Hadoop distribution
Open Talend Studio, create a new Job, and drag a tHDFSOutput component from the Palette.
I will explain the connection options of this component here.
This is my Talend job; in my case I am reading data from a Cassandra table.
Alternatively, you can use tMysqlInput or any other input component to read the data.
Step 1 - Let’s explore the tHDFSOutput component properties.
Distribution – Hortonworks
Hadoop Version – Hortonworks Data Platform v2.1.0 (my cluster runs HDP 2.1)
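Step 2 - NameNode URI: pass the NameNode address in double quotes, for example "hdfs://localhost:8020". Use the NameNode RPC port here (8020 by default on HDP), not the web UI port 50070. In my case I have created context variables for this value.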
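A wrong scheme or port in the NameNode URI is an easy mistake to make, so here is a minimal sketch of a sanity check in plain Java. The class `NameNodeUriCheck` and its method are hypothetical helpers of mine, not part of Talend or Hadoop, and the check assumes that 50070 is the NameNode web UI port, which is the Hadoop 2.x default.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not part of Talend or Hadoop): sanity-checks a NameNode URI.
final class NameNodeUriCheck {

    // Returns null when the URI looks usable, otherwise a short description of the problem.
    static String problem(String nameNodeUri) {
        final URI uri;
        try {
            uri = new URI(nameNodeUri);
        } catch (URISyntaxException e) {
            return "not a valid URI: " + e.getMessage();
        }
        if (!"hdfs".equalsIgnoreCase(uri.getScheme())) {
            return "scheme should be hdfs";
        }
        if (uri.getHost() == null) {
            return "host is missing";
        }
        // Assumption: 50070 is the default NameNode web UI (HTTP) port on Hadoop 2.x.
        if (uri.getPort() == 50070) {
            return "port 50070 is usually the web UI, not the RPC port";
        }
        return null;
    }

    public static void main(String[] args) {
        String uri = args.length > 0 ? args[0] : "hdfs://localhost:8020";
        String problem = problem(uri);
        System.out.println(uri + " -> " + (problem == null ? "looks fine" : problem));
    }
}
```

You can run it against the value of your context variable before putting that value into the job.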
Step 3 - Enter the NameNode principal that matches your cluster, in double quotes. Mine has the form nn/localhost@TEST.LOCAL.
Here nn is the NameNode service name in a Hortonworks Hadoop cluster, localhost is my hostname, and TEST.LOCAL is my Kerberos realm (my domain name).
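To make the three parts of a principal concrete, here is a small sketch that splits one into service name, host, and realm. The class `KerberosPrincipalParts` is a hypothetical helper written for this post, and the sample value nn/localhost@TEST.LOCAL is simply assembled from the parts described above.

```java
// Hypothetical helper: splits a principal of the form primary/instance@REALM.
// The instance part is optional, so user principals such as hdfs@TEST.LOCAL also parse.
final class KerberosPrincipalParts {
    final String primary;   // service or user name, e.g. nn
    final String instance;  // host part, empty when the principal has none
    final String realm;     // Kerberos realm, e.g. TEST.LOCAL

    private KerberosPrincipalParts(String primary, String instance, String realm) {
        this.primary = primary;
        this.instance = instance;
        this.realm = realm;
    }

    static KerberosPrincipalParts parse(String principal) {
        int at = principal.lastIndexOf('@');
        if (at <= 0 || at == principal.length() - 1) {
            throw new IllegalArgumentException("expected name@REALM: " + principal);
        }
        String name = principal.substring(0, at);
        String realm = principal.substring(at + 1);
        int slash = name.indexOf('/');
        if (slash < 0) {
            return new KerberosPrincipalParts(name, "", realm);
        }
        if (slash == 0 || slash == name.length() - 1) {
            throw new IllegalArgumentException("empty primary or instance: " + principal);
        }
        return new KerberosPrincipalParts(
                name.substring(0, slash), name.substring(slash + 1), realm);
    }

    public static void main(String[] args) {
        KerberosPrincipalParts p = parse("nn/localhost@TEST.LOCAL");
        System.out.println(p.primary + " | " + p.instance + " | " + p.realm);
    }
}
```

If the host or realm printed here does not match what your cluster uses, the connection will most likely be rejected.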
Step 4 - Specify the principal and the keytab of that principal.
Here I am using the hdfs user, so I need to point to the keytab of the hdfs user. For any queries here, please drop a comment.
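For readers who want to see what these fields roughly correspond to outside the Talend UI, here is a sketch that collects the same values as standard Hadoop property names and checks that the keytab file is readable. The class `KerberosHdfsSettings`, the principal values, and the keytab path are all assumptions for illustration; I am not claiming this is what Talend generates internally.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper: gathers the values entered in the component as Hadoop properties.
final class KerberosHdfsSettings {

    // Builds the client-side properties for a Kerberos-secured HDFS.
    static Properties hadoopProperties(String nameNodeUri, String nameNodePrincipal) {
        Properties props = new Properties();
        props.setProperty("fs.defaultFS", nameNodeUri);
        props.setProperty("hadoop.security.authentication", "kerberos");
        props.setProperty("dfs.namenode.kerberos.principal", nameNodePrincipal);
        return props;
    }

    // True when the keytab exists as a regular file that the current OS user can read.
    static boolean keytabReadable(String keytabPath) {
        Path path = Paths.get(keytabPath);
        return Files.isRegularFile(path) && Files.isReadable(path);
    }

    public static void main(String[] args) {
        // Assumed example values; replace them with those of your own cluster.
        String keytab = "/etc/security/keytabs/hdfs.headless.keytab";
        Properties props = hadoopProperties("hdfs://localhost:8020", "nn/localhost@TEST.LOCAL");
        props.forEach((key, value) -> System.out.println(key + " = " + value));
        System.out.println("keytab readable: " + keytabReadable(keytab));
        // With the Hadoop client libraries on the classpath, the login itself is typically done with
        // UserGroupInformation.loginUserFromKeytab(principal, keytab).
    }
}
```

The keytab check is worth doing first: the file must be readable by the OS user that runs the Talend job, which is not necessarily the hdfs user.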
Here is my complete screen.
If you have any doubts please drop a comment.