0001 ---
0002 layout: global
0003 title: LOAD DATA
0004 displayTitle: LOAD DATA
0005 license: |
0006 Licensed to the Apache Software Foundation (ASF) under one or more
0007 contributor license agreements. See the NOTICE file distributed with
0008 this work for additional information regarding copyright ownership.
0009 The ASF licenses this file to You under the Apache License, Version 2.0
0010 (the "License"); you may not use this file except in compliance with
0011 the License. You may obtain a copy of the License at
0012
0013 http://www.apache.org/licenses/LICENSE-2.0
0014
0015 Unless required by applicable law or agreed to in writing, software
0016 distributed under the License is distributed on an "AS IS" BASIS,
0017 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
0018 See the License for the specific language governing permissions and
0019 limitations under the License.
0020 ---
0021
0022 ### Description
0023
0024 `LOAD DATA` statement loads the data into a Hive serde table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the target table.
0025
0026 ### Syntax
0027
0028 ```sql
0029 LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ partition_spec ]
0030 ```
0031
0032 ### Parameters
0033
0034 * **path**
0035
0036 Path of the file system. It can be either an absolute or a relative path.
0037
0038 * **table_identifier**
0039
0040 Specifies a table name, which may be optionally qualified with a database name.
0041
0042 **Syntax:** `[ database_name. ] table_name`
0043
0044 * **partition_spec**
0045
0046 An optional parameter that specifies a comma separated list of key and value pairs
0047 for partitions.
0048
0049 **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`
0050
0051 * **LOCAL**
0052
0053 If specified, it causes the `INPATH` to be resolved against the local file system, instead of the default file system, which is typically a distributed storage.
0054
0055 * **OVERWRITE**
0056
0057 By default, new data is appended to the table. If `OVERWRITE` is used, the table is instead overwritten with new data.
0058
0059 ### Examples
0060
0061 ```sql
0062 -- Example without partition specification.
0063 -- Assuming the students table has already been created and populated.
0064 SELECT * FROM students;
0065 +---------+----------------------+----------+
0066 | name| address|student_id|
0067 +---------+----------------------+----------+
0068 |Amy Smith|123 Park Ave, San Jose| 111111|
0069 +---------+----------------------+----------+
0070
0071 CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT) USING HIVE;
0072
0073 -- Assuming the students table is in '/user/hive/warehouse/'
0074 LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load;
0075
0076 SELECT * FROM test_load;
0077 +---------+----------------------+----------+
0078 | name| address|student_id|
0079 +---------+----------------------+----------+
0080 |Amy Smith|123 Park Ave, San Jose| 111111|
0081 +---------+----------------------+----------+
0082
0083 -- Example with partition specification.
0084 CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) PARTITIONED BY (c2, c3);
0085
0086 INSERT INTO test_partition PARTITION (c2 = 2, c3 = 3) VALUES (1);
0087
0088 INSERT INTO test_partition PARTITION (c2 = 5, c3 = 6) VALUES (4);
0089
0090 INSERT INTO test_partition PARTITION (c2 = 8, c3 = 9) VALUES (7);
0091
0092 SELECT * FROM test_partition;
0093 +---+---+---+
0094 | c1| c2| c3|
0095 +---+---+---+
0096 | 1| 2| 3|
0097 | 4| 5| 6|
0098 | 7| 8| 9|
0099 +---+---+---+
0100
0101 CREATE TABLE test_load_partition (c1 INT, c2 INT, c3 INT) USING HIVE PARTITIONED BY (c2, c3);
0102
0103 -- Assuming the test_partition table is in '/user/hive/warehouse/'
0104 LOAD DATA LOCAL INPATH '/user/hive/warehouse/test_partition/c2=2/c3=3'
0105 OVERWRITE INTO TABLE test_load_partition PARTITION (c2=2, c3=3);
0106
0107 SELECT * FROM test_load_partition;
0108 +---+---+---+
0109 | c1| c2| c3|
0110 +---+---+---+
0111 | 1| 2| 3|
0112 +---+---+---+
0113 ```