Latest Cloudera Hadoop CDH 5.4.x – Part 5 Apache Sentry security

By | July 18, 2015

In Part 4, we discussed the Cloudera Search workflow and its features. In this tutorial I will introduce Apache Sentry and explain the important role it plays in Cloudera Hadoop.

  • Sentry is a security framework used mainly to provide access control over Hadoop data.
  • Sentry integrates easily with other Hadoop components such as Pig, Hive and Impala.
  • Sentry enables multi-user access control over large amounts of data stored in Hadoop.
  • Using Sentry, you can define roles that govern user access.

Some of the key functionality in Cloudera Sentry:

  • It grants access, through roles, to particular users on particular data.
  • Data can be divided into categories, such as important data and regular data.
  • Users can be divided into groups, and you can grant a specific role to an entire group.
  • No access is provided by default, so you must grant a privilege to a user's role before that user can access data.
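
The points above (users belong to groups, groups are granted roles, roles carry privileges, and nothing is allowed by default) can be sketched in a few lines of Python. This is only an illustrative model, not Sentry's API: every name in it (`USER_GROUPS`, `GROUP_ROLES`, `ROLE_PRIVILEGES`, `is_allowed`, and the sample users and tables) is an assumption made up for the example.

```python
# Illustrative model only; these names are hypothetical and not part of Sentry.
USER_GROUPS = {"alice": {"analysts"}, "bob": {"interns"}}
GROUP_ROLES = {"analysts": {"sales_reader"}}
ROLE_PRIVILEGES = {"sales_reader": {("sales.orders", "select")}}


def is_allowed(user, resource, action):
    """Return True only if one of the user's groups holds a role with this privilege."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            if (resource, action) in ROLE_PRIVILEGES.get(role, set()):
                return True
    # Default deny: no matching grant means no access.
    return False
```

In this sketch `alice` can run a `select` on `sales.orders` because her group has the `sales_reader` role, while `bob`, whose group has no role, is denied everything.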

 

Diagram components:

 

Policy engine
  • This is the engine for policy evaluation. It gets the requested privileges from the binding component and the required privileges from the provider layer.
  • The policy engine decides whether the requested action should be allowed.
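
As a rough sketch of that decision, the Python below compares requested privileges against granted ones. It assumes privileges are written as `server=...->db=...->table=...->action=...` strings and uses a deliberately simplified matching rule (a grant covers everything beneath the scope it names). The function names `parse_privilege`, `implies` and `decide` are hypothetical, and the real engine's rules are richer than this.

```python
def parse_privilege(text):
    """Split 'server=s1->db=sales->action=select' into a dict of its parts."""
    return dict(part.split("=", 1) for part in text.split("->"))


def implies(granted, requested):
    """Simplified rule: a grant covers a request in the same scope or below it."""
    g, r = parse_privilege(granted), parse_privilege(requested)
    g_action = g.pop("action", "all")  # a grant without an action allows all actions
    r_action = r.pop("action", "all")
    # Every scope element named in the grant must match the request.
    scope_ok = all(r.get(key) == value for key, value in g.items())
    action_ok = g_action in ("all", "*") or g_action == r_action
    return scope_ok and action_ok


def decide(requested_privileges, granted_privileges):
    """Allow the action only if every requested privilege is covered by some grant."""
    return all(
        any(implies(grant, request) for grant in granted_privileges)
        for request in requested_privileges
    )
```

With a grant of `server=server1->db=sales->action=select`, a `select` on any table in `sales` is allowed, while an `insert` on the same table or a `select` in another database is refused.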

 

Policy agent
  • The policy agent is responsible for managing the metadata required for policy execution.
  • Whenever needed, it fetches the required policy metadata from a main repository such as HDFS or a database.

 

File based provider
  • It stores policy metadata in an ini or conf format file.
  • This file can reside on the local file system or on HDFS.
  • The policy file maps each group to its respective roles.
  • Each role is in turn mapped to a set of privileges.
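
To make the group-to-role and role-to-privilege mappings concrete, here is a minimal Python sketch that reads a small policy file with a `[groups]` section and a `[roles]` section. The group names, role names, server name and tables are invented for illustration, and `load_policy` and `privileges_for_group` are hypothetical helpers, not part of Sentry.

```python
import configparser

# Sample policy text; all group, role and object names are made up.
POLICY_TEXT = """
[groups]
analysts = analyst_role
admins = admin_role, analyst_role

[roles]
analyst_role = server=server1->db=sales->table=orders->action=select
admin_role = server=server1
"""


def load_policy(text):
    """Parse the policy text into group->roles and role->privileges mappings."""
    # Split only on the first '=' so privilege strings keep their own '=' signs.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep group and role names case-sensitive
    parser.read_string(text)
    groups = {g: [r.strip() for r in v.split(",")] for g, v in parser["groups"].items()}
    roles = {r: [p.strip() for p in v.split(",")] for r, v in parser["roles"].items()}
    return groups, roles


def privileges_for_group(group, groups, roles):
    """Resolve a group to the privileges of all roles mapped to it."""
    return [p for role in groups.get(group, []) for p in roles.get(role, [])]
```

Calling `load_policy(POLICY_TEXT)` and then `privileges_for_group("analysts", groups, roles)` resolves the group through its role to the single `select` privilege on `orders`. A group that is not listed resolves to no privileges at all.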

 

DB based provider
  • The Sentry policy store keeps the Sentry service and policy mappings in a relational database such as MySQL or Postgres.
  • The DB based provider offers programmatic access through an API to create, query, update and delete policies. It uses an ORM library.
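
The create, query, update and delete operations can be pictured with a small relational sketch. The Python below uses an in-memory SQLite database purely for illustration. The table layout and the functions `open_store`, `grant`, `privileges_of`, `replace_privilege` and `revoke` are assumptions invented for this example and do not reflect Sentry's actual schema, API or ORM layer.

```python
import sqlite3


def open_store():
    """Create an in-memory store with one hypothetical role-to-privilege table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE role_privileges ("
        "role TEXT NOT NULL, privilege TEXT NOT NULL, "
        "PRIMARY KEY (role, privilege))"
    )
    return conn


def grant(conn, role, privilege):
    """Create: add a privilege to a role (granting twice has no extra effect)."""
    conn.execute("INSERT OR IGNORE INTO role_privileges VALUES (?, ?)", (role, privilege))


def privileges_of(conn, role):
    """Query: list the privileges currently held by a role."""
    rows = conn.execute(
        "SELECT privilege FROM role_privileges WHERE role = ? ORDER BY privilege", (role,)
    )
    return [row[0] for row in rows]


def replace_privilege(conn, role, old, new):
    """Update: swap one privilege of a role for another."""
    conn.execute(
        "UPDATE role_privileges SET privilege = ? WHERE role = ? AND privilege = ?",
        (new, role, old),
    )


def revoke(conn, role, privilege):
    """Delete: remove a privilege from a role."""
    conn.execute(
        "DELETE FROM role_privileges WHERE role = ? AND privilege = ?", (role, privilege)
    )
```

Keeping policies in a database rather than a file means a grant or revoke takes effect through a single statement, without editing and redistributing a policy file.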

 

Bindings
  • The binding layer is the bridge between data tools such as Hive and Impala and Sentry authorization.
  • The binding layer takes an authorization request from the tool and converts it into a form that the Sentry policy engine can handle.
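
A binding can be thought of as a small translator. The Python sketch below turns a tool-side operation into a privilege string of the form `server=...->db=...->table=...->action=...`. The operation names, the `OPERATION_TO_ACTION` table and the function `to_authorization_request` are all hypothetical and serve only to show the idea of converting a native request into a common format.

```python
# Hypothetical mapping from a tool's operation names to generic actions.
OPERATION_TO_ACTION = {"QUERY": "select", "LOAD": "insert"}


def to_authorization_request(server, database, table, operation):
    """Translate a tool-native operation into a common privilege string."""
    try:
        action = OPERATION_TO_ACTION[operation]
    except KeyError:
        raise ValueError(f"unsupported operation: {operation}") from None
    return f"server={server}->db={database}->table={table}->action={action}"
```

For example, `to_authorization_request("server1", "sales", "orders", "QUERY")` yields a request for the `select` action on the `orders` table, which a policy engine can then compare against the granted privileges.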
