Hive tutorial – What is Beeline in Hive 2?
Before introducing Beeline and HiveServer2, let's look at the Hive CLI and HiveServer1 to see why Beeline came into the picture, along with some background on Hive 1 and Hive 2.
- Support for SQL to run ad-hoc queries
- Support for MapReduce, custom mappers and reducers, and UDFs (user-defined functions in Hive)
- Limitation: supports only a single user at a time
- No authentication support
- The Hive CLI is a simple and widely used interface, and it is still in production use
- HiveServer1 is a client-server service
- Users can connect through the Hive CLI or through a Thrift client
- Remote client connections are supported, but only one client can connect at a time
- No session management support
- No concurrency control, a limitation of its Thrift API
Now let's take a look at HiveServer2.
- HiveServer2 is also a client-server service
- It allows many different kinds of clients to connect, such as Thrift clients
- HiveServer2 supports multiple clients connected at the same time
- Authentication is much better, with support for Kerberos
- Support for JDBC and ODBC driver connections
- The Beeline CLI is used to connect to HiveServer2
- Beeline is a command line interface for HiveServer2
- It is based on the SQLLine CLI
- It gives better support for JDBC/ODBC
- It is not compatible with the old HiveServer1
- Connecting Beeline to HiveServer2 requires some extra configuration
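As a rough illustration of what connecting Beeline to HiveServer2 looks like in practice, here is a minimal Python sketch that assembles a Beeline command line. The `-u` (JDBC URL), `-n` (user name) and `-e` (query) options are standard Beeline options, and `;principal=` is the URL parameter used on Kerberos-secured clusters. The helper name `beeline_command`, the host `hs2.example.com`, the user `alice` and the principal are assumptions made up for this example.

```python
import shlex


def beeline_command(jdbc_url, user=None, query=None, kerberos_principal=None):
    """Build the argument list for a Beeline invocation (hypothetical helper)."""
    if kerberos_principal:
        # On a Kerberos-secured cluster the HiveServer2 service principal
        # is appended to the JDBC URL as a parameter.
        jdbc_url = f"{jdbc_url};principal={kerberos_principal}"
    args = ["beeline", "-u", jdbc_url]
    if user:
        args += ["-n", user]   # user name to connect as
    if query:
        args += ["-e", query]  # run a single query and exit
    return args


# Example values only: hs2.example.com and alice are placeholders.
cmd = beeline_command(
    "jdbc:hive2://hs2.example.com:10000/default",
    user="alice",
    query="SHOW TABLES;",
)
print(shlex.join(cmd))  # shell-quoted form that could be pasted into a terminal
```

The list form can be handed to `subprocess.run` on a machine where Beeline is installed; the sketch only prints the command so it runs anywhere.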
One more point to remember about Beeline is that it runs in two different modes: embedded mode and remote mode.
For more on embedded mode and remote mode, you can refer to my other article.
Embedded mode – In this mode the Beeline CLI starts an embedded Hive, similar to the regular Hive CLI (remember that the Hive CLI and HiveServer1 are two different things).
Remote mode – In remote mode you connect over the Thrift API to a HiveServer2 process running on a separate machine/server.
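The difference between the two modes shows up in the JDBC URL that Beeline is given. The sketch below, under the usual HiveServer2 conventions, returns `jdbc:hive2://` (no host) for embedded mode and `jdbc:hive2://<host>:<port>/<database>` for remote mode, with 10000 as the default HiveServer2 port. The function name `hive2_jdbc_url` and the host name are hypothetical, chosen only for illustration.

```python
def hive2_jdbc_url(host=None, port=10000, database="default"):
    """Return a HiveServer2 JDBC URL for embedded or remote mode (hypothetical helper)."""
    if host is None:
        # Embedded mode: no host or port, Beeline starts Hive in-process.
        return "jdbc:hive2://"
    # Remote mode: connect to a HiveServer2 process on another machine.
    return f"jdbc:hive2://{host}:{port}/{database}"


print(hive2_jdbc_url())                    # embedded mode URL
print(hive2_jdbc_url("hs2.example.com"))   # remote mode URL (placeholder host)
```

Inside the Beeline shell, either URL would be passed to the `!connect` command.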
In the next tutorial we will cover some practical examples using the Beeline CLI for Hive 2.