0001 ---
0002 layout: global
0003 title: CREATE FUNCTION
0004 displayTitle: CREATE FUNCTION
0005 license: |
0006 Licensed to the Apache Software Foundation (ASF) under one or more
0007 contributor license agreements. See the NOTICE file distributed with
0008 this work for additional information regarding copyright ownership.
0009 The ASF licenses this file to You under the Apache License, Version 2.0
0010 (the "License"); you may not use this file except in compliance with
0011 the License. You may obtain a copy of the License at
0012
0013 http://www.apache.org/licenses/LICENSE-2.0
0014
0015 Unless required by applicable law or agreed to in writing, software
0016 distributed under the License is distributed on an "AS IS" BASIS,
0017 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
0018 See the License for the specific language governing permissions and
0019 limitations under the License.
0020 ---
0021
0022 ### Description
0023
0024 The `CREATE FUNCTION` statement is used to create a temporary or permanent function
0025 in Spark. Temporary functions are scoped at a session level where as permanent
0026 functions are created in the persistent catalog and are made available to
0027 all sessions. The resources specified in the `USING` clause are made available
0028 to all executors when they are executed for the first time. In addition to the
0029 SQL interface, spark allows users to create custom user defined scalar and
0030 aggregate functions using Scala, Python and Java APIs. Please refer to
0031 [Scalar UDFs](sql-ref-functions-udf-scalar.html) and
0032 [UDAFs](sql-ref-functions-udf-aggregate.html) for more information.
0033
0034 ### Syntax
0035
0036 ```sql
0037 CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ]
0038 function_name AS class_name [ resource_locations ]
0039 ```
0040
0041 ### Parameters
0042
0043 * **OR REPLACE**
0044
0045 If specified, the resources for the function are reloaded. This is mainly useful
0046 to pick up any changes made to the implementation of the function. This
0047 parameter is mutually exclusive to `IF NOT EXISTS` and can not
0048 be specified together.
0049
0050 * **TEMPORARY**
0051
0052 Indicates the scope of function being created. When `TEMPORARY` is specified, the
0053 created function is valid and visible in the current session. No persistent
0054 entry is made in the catalog for these kind of functions.
0055
0056 * **IF NOT EXISTS**
0057
0058 If specified, creates the function only when it does not exist. The creation
0059 of function succeeds (no error is thrown) if the specified function already
0060 exists in the system. This parameter is mutually exclusive to `OR REPLACE`
0061 and can not be specified together.
0062
0063 * **function_name**
0064
0065 Specifies a name of function to be created. The function name may be optionally qualified with a database name.
0066
0067 **Syntax:** `[ database_name. ] function_name`
0068
0069 * **class_name**
0070
0071 Specifies the name of the class that provides the implementation for function to be created.
0072 The implementing class should extend one of the base classes as follows:
0073
0074 * Should extend `UDF` or `UDAF` in `org.apache.hadoop.hive.ql.exec` package.
0075 * Should extend `AbstractGenericUDAFResolver`, `GenericUDF`, or
0076 `GenericUDTF` in `org.apache.hadoop.hive.ql.udf.generic` package.
0077 * Should extend `UserDefinedAggregateFunction` in `org.apache.spark.sql.expressions` package.
0078
0079 * **resource_locations**
0080
0081 Specifies the list of resources that contain the implementation of the function
0082 along with its dependencies.
0083
0084 **Syntax:** `USING { { (JAR | FILE ) resource_uri } , ... }`
0085
0086 ### Examples
0087
0088 ```sql
0089 -- 1. Create a simple UDF `SimpleUdf` that increments the supplied integral value by 10.
0090 -- import org.apache.hadoop.hive.ql.exec.UDF;
0091 -- public class SimpleUdf extends UDF {
0092 -- public int evaluate(int value) {
0093 -- return value + 10;
0094 -- }
0095 -- }
0096 -- 2. Compile and place it in a JAR file called `SimpleUdf.jar` in /tmp.
0097
0098 -- Create a table called `test` and insert two rows.
0099 CREATE TABLE test(c1 INT);
0100 INSERT INTO test VALUES (1), (2);
0101
0102 -- Create a permanent function called `simple_udf`.
0103 CREATE FUNCTION simple_udf AS 'SimpleUdf'
0104 USING JAR '/tmp/SimpleUdf.jar';
0105
0106 -- Verify that the function is in the registry.
0107 SHOW USER FUNCTIONS;
0108 +------------------+
0109 | function|
0110 +------------------+
0111 |default.simple_udf|
0112 +------------------+
0113
0114 -- Invoke the function. Every selected value should be incremented by 10.
0115 SELECT simple_udf(c1) AS function_return_value FROM t1;
0116 +---------------------+
0117 |function_return_value|
0118 +---------------------+
0119 | 11|
0120 | 12|
0121 +---------------------+
0122
0123 -- Created a temporary function.
0124 CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf'
0125 USING JAR '/tmp/SimpleUdf.jar';
0126
0127 -- Verify that the newly created temporary function is in the registry.
0128 -- Please note that the temporary function does not have a qualified
0129 -- database associated with it.
0130 SHOW USER FUNCTIONS;
0131 +------------------+
0132 | function|
0133 +------------------+
0134 |default.simple_udf|
0135 | simple_temp_udf|
0136 +------------------+
0137
0138 -- 1. Modify `SimpleUdf`'s implementation to add supplied integral value by 20.
0139 -- import org.apache.hadoop.hive.ql.exec.UDF;
0140
0141 -- public class SimpleUdfR extends UDF {
0142 -- public int evaluate(int value) {
0143 -- return value + 20;
0144 -- }
0145 -- }
0146 -- 2. Compile and place it in a jar file called `SimpleUdfR.jar` in /tmp.
0147
0148 -- Replace the implementation of `simple_udf`
0149 CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR'
0150 USING JAR '/tmp/SimpleUdfR.jar';
0151
0152 -- Invoke the function. Every selected value should be incremented by 20.
0153 SELECT simple_udf(c1) AS function_return_value FROM t1;
0154 +---------------------+
0155 |function_return_value|
0156 +---------------------+
0157 | 21|
0158 | 22|
0159 +---------------------+
0160 ```
0161
0162 ### Related Statements
0163
0164 * [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html)
0165 * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html)
0166 * [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html)