参考官网:https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
HiveQL DDL statements are documented here, including:
1 | - CREATE |
Create/Drop/Alter/Use Database
Create Database
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] database_name
[COMMENT database_comment]
[LOCATION hdfs_path]
[WITH DBPROPERTIES (property_name=property_value, [...)];](https://cwiki.apache.org/confluence/display/Hive/...)];)
- CREATE DATABASE myhive;
- 通过location字段指定存放路径
CREATE DATABASE myhive
location '/myhive/directory';
Q1:我们创建的database存在哪里
- DESC DATABASE myhive;
OK
d6 hdfs://hadoop001:8020/user/hive/warehouse/d6.db hadoop USER
Time taken: 0.954 seconds, Fetched: 1 row(s)
hdfs://hadoop001:8020--HDFS目录
/user/hive/warehouse/d6.db--默认的hive存储在hdfs上的目录
在Q2的配置上,我们可以查看到hive.metastore.warehouse.dir的默认路径是/user/hive/warehouse
Q2:hive的配置文件可以到哪里去进行查看
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
Q3:hive的配置文件要怎么修改
- 直接配置(当前session生效)
hive> set hive.metastore.warehouse.dir;
hive.metastore.warehouse.dir=/user/hive/warehouse
set key [value]:如果只有key则获取信息,如果有value进行设置
- 修改hive.site.xml(全局修改)
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>Drop Database
1
2DROP (DATABASE|SCHEMA) [IF EXISTS] database_name [RESTRICT|CASCADE];
droup database myhive;Alter Database
1
2
3ALTER (DATABASE|SCHEMA) database_name SET DBPROPERTIES (property_name=property_value, [...); ](https://cwiki.apache.org/confluence/display/Hive/...);%C2%A0%C2%A0) -- (Note: SCHEMA added in Hive ``[0.14](https://cwiki.apache.org/confluence/display/Hive/0.14)``.``0``)
ALTER (DATABASE|SCHEMA) database_name SET OWNER [USER|ROLE] user_or_role; -- (Note: Hive ``[0.13](https://cwiki.apache.org/confluence/display/Hive/0.13)``.``0` `and later; SCHEMA added in Hive ``[0.14](https://cwiki.apache.org/confluence/display/Hive/0.14)``.``0``)
ALTER (DATABASE|SCHEMA) database_name SET LOCATION hdfs_path; -- (Note: Hive ``2.2``.``1``, ``2.4``.``0` `and later)Use Database
1
2USE database_name;
USE myhive;MySQL中存储的database元数据信息
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19mysql> use hiveDB
mysql> select * from DBS \G;
*************************** 1. row ***************************
DB_ID: 1
DESC: Default Hive database
DB_LOCATION_URI: hdfs://hadoop001:8020/user/hive/warehouse
NAME: default
OWNER_NAME: public
OWNER_TYPE: ROLE
*************************** 2. row ***************************
DB_ID: 6
DESC: NULL
DB_LOCATION_URI: hdfs://hadoop001:8020/user/hive/warehouse/d6.db
NAME: d6
OWNER_NAME: hadoop
OWNER_TYPE: USER
- 我们的database的元数据信息全部存储在DBS这张表里
- 可以看到default数据库默认的hdfs存储路径就是/user/hive/warehouse
Create/Drop/Truncate Table
测试数据emp.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
157369 SMITH CLERK 7902 1980-12-17 800.00 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.00 300.00 30
7521 WARD SALESMAN 7698 1981-2-22 1250.00 500.00 30
7566 JONES MANAGER 7839 1981-4-2 2975.00 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.00 1400.00 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.00 30
7782 CLARK MANAGER 7839 1981-6-9 2450.00 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.00 20
7839 KING PRESIDENT 1981-11-17 5000.00 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.00 0.00 30
7876 ADAMS CLERK 7788 1987-5-23 1100.00 20
7900 JAMES CLERK 7698 1981-12-3 950.00 30
7902 FORD ANALYST 7566 1981-12-3 3000.00 20
7934 MILLER CLERK 7782 1982-1-23 1300.00 10
8888 HIVE PROGRAM 7839 1988-1-23 10300.00相关配置,输出表名和表头
1
2set hive.cli.print.current.db=true;
set hive.cli.print.header=true;数据类型
1
2
3
4数值类型: int bigint float double DECIMAL
字符串:string
** 时间不要使用date类型,string即可
** 数据类型的选用适量为准Create Table
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33row format
两大分隔符:行与行、列与列(默认\001)
file format
file格式:行式\列式
CREATE [TEMPORARY] [EXTERNAL] TABLE table_name
[(col_name data_type [COMMENT col_comment]]
[(col_name data_type [COMMENT col_comment]]
[COMMENT table_comment]
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '<hdfs_location>';
- 普通的建表语句
create table emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
- 加载数据
LOAD DATA LOCAL INPATH '/home/hadoop/data/emp.txt' OVERWRITE INTO TABLE emp ;
- LIKE:复制表结构,不复制数据
CREATE TABLE emp2 LIKE emp;
- AS 查询:复制表结构和数据
CREATE TABLE emp3 AS SELECT * FROM emp;Drop Table
1
DROP TABLE emp;
Truncate Table:清空表
1
TRUNCATE TABLE emp;
DESC
DESC table_name;
1
2
3
4
5
6
7
8
9
10
11
12hive (myhive)> DESC emp3;
OK
col_name data_type comment
empno int
ename string
job string
mgr int
hiredate string
sal double
comm double
deptno int
Time taken: 0.108 seconds, Fetched: 8 row(s)DESC FORMATTED table_name;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42hive (myhive)> DESC FORMATTED emp3;
OK
col_name data_type comment
# col_name data_type comment
empno int
ename string
job string
mgr int
hiredate string
sal double
comm double
deptno int
# Detailed Table Information
Database: myhive
Owner: hadoop
CreateTime: Thu Mar 07 03:27:15 PST 2019
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://hadoop001:8020/user/hive/warehouse/myhive.db/emp3
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE false
numFiles 1
numRows -1
rawDataSize -1
totalSize 708
transient_lastDdlTime 1551958114
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
Time taken: 0.098 seconds, Fetched: 38 row(s)