If you’re already familiar with SQL then you may well be thinking about how to add Hadoop skills to your toolbelt as an option for data processing.
From a querying perspective, using Apache Hive provides a familiar interface to data held in a Hadoop cluster and is a great way to get started. Apache Hive is data warehouse infrastructure built on top of Apache Hadoop for providing data summarization, ad-hoc query, and analysis of large datasets. It provides a mechanism to project structure onto the data in Hadoop and to query that data using a SQL-like language called HiveQL (HQL).
If you really want to get to grips with Hive, then take a look at the
full language manual.
Retrieving Information
Function |
MySQL |
Hive |
Retrieving Information (General) |
SELECT from_columns FROM table WHERE conditions; |
SELECT from_columns FROM table WHERE conditions; |
Retrieving All Values |
SELECT * FROM table; |
SELECT * FROM table; |
Retrieving Some Values |
SELECT * FROM table WHERE rec_name = "value"; |
SELECT * FROM table WHERE rec_name = "value"; |
Retrieving With Multiple Criteria |
SELECT * FROM TABLE WHERE rec1 = "value1" AND rec2 = "value2"; |
SELECT * FROM TABLE WHERE rec1 = "value1" AND rec2 = "value2"; |
Retrieving Specific Columns |
SELECT column_name FROM table; |
SELECT column_name FROM table; |
Retrieving Unique Output |
SELECT DISTINCT column_name FROM table; |
SELECT DISTINCT column_name FROM table; |
Sorting |
SELECT col1, col2 FROM table ORDER BY col2; |
SELECT col1, col2 FROM table ORDER BY col2; |
Sorting Reverse |
SELECT col1, col2 FROM table ORDER BY col2 DESC; |
SELECT col1, col2 FROM table ORDER BY col2 DESC; |
Counting Rows |
SELECT COUNT(*) FROM table; |
SELECT COUNT(*) FROM table; |
Grouping With Counting |
SELECT owner, COUNT(*) FROM table GROUP BY owner; |
SELECT owner, COUNT(*) FROM table GROUP BY owner; |
Maximum Value |
SELECT MAX(col_name) AS label FROM table; |
SELECT MAX(col_name) AS label FROM table; |
Selecting from multiple tables (Join same table using alias w/”AS”) |
SELECT pet.name, comment FROM pet, event WHERE pet.name = event.name; |
SELECT pet.name, comment FROM pet JOIN event ON (pet.name = event.name) |
Metadata
Function |
MySQL |
Hive |
Selecting a database |
USE database; |
USE database; |
Listing databases |
SHOW DATABASES; |
SHOW DATABASES; |
Listing tables in a database |
SHOW TABLES; |
SHOW TABLES; |
Describing the format of a table |
DESCRIBE table; |
DESCRIBE (FORMATTED|EXTENDED) table; |
Creating a database |
CREATE DATABASE db_name; |
CREATE DATABASE db_name; |
Dropping a database |
DROP DATABASE db_name; |
DROP DATABASE db_name (CASCADE); |
Current SQL Compatibility
Command Line
Function |
Hive |
Run Query |
hive -e 'select a.col from tab1 a' |
Run Query Silent Mode |
hive -S -e 'select a.col from tab1 a' |
Set Hive Config Variables |
hive -e 'select a.col from tab1 a' -hiveconf hive.root.logger=DEBUG,console |
Use Initialization Script |
hive -i initialize.sql |
Run Non-Interactive Script |
hive -f script.sql |
No comments:
Post a Comment