---
layout: global
title: ANSI Compliance
displayTitle: ANSI Compliance
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements.  See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
---

Since Spark 3.0, Spark SQL provides two experimental options for complying with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (see the table below for details).

When `spark.sql.ansi.enabled` is set to `true`, Spark SQL follows the standard in basic behaviours (e.g., arithmetic operations, type conversion, SQL functions and SQL parsing).
Moreover, Spark SQL has an independent option to control implicit casting behaviours when inserting rows into a table.
These casting behaviours are defined as store assignment rules in the standard.

When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while `spark.sql.ansi.enabled` is disabled by default.

|Property Name|Default|Meaning|Since Version|
|-------------|-------|-------|-------------|
|`spark.sql.ansi.enabled`|false|(Experimental) When true, Spark tries to conform to the ANSI SQL specification: <br/> 1. Spark will throw a runtime exception if an overflow occurs in any operation on an integral or decimal field. <br/> 2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.|3.0.0|
|`spark.sql.storeAssignmentPolicy`|ANSI|(Experimental) When inserting a value into a column with a different data type, Spark performs type coercion. Currently, three policies are supported for the type coercion rules: ANSI, legacy and strict. With the ANSI policy, Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions such as converting string to int or double to boolean. With the legacy policy, Spark allows the type coercion as long as it is a valid Cast, which is very loose, e.g. converting string to int or double to boolean is allowed. It is also the only behavior in Spark 2.x and it is compatible with Hive. With the strict policy, Spark doesn't allow any possible precision loss or data truncation in type coercion, e.g. converting double to int or decimal to double is not allowed.|3.0.0|

The following subsections present behaviour changes in arithmetic operations, type conversions, and SQL parsing when ANSI mode is enabled.

### Arithmetic Operations

In Spark SQL, arithmetic operations performed on numeric types (with the exception of decimal) are not checked for overflows by default.
This means that if an operation overflows, the result is the same as that of the corresponding operation in a Java/Scala program (e.g., if the sum of two integers is higher than the maximum representable value, the result is a negative number).
Decimal overflows, on the other hand, return null.
When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in a numeric or interval arithmetic operation, Spark SQL throws an arithmetic exception at runtime.

```sql
-- `spark.sql.ansi.enabled=true`
SELECT 2147483647 + 1;
java.lang.ArithmeticException: integer overflow

-- `spark.sql.ansi.enabled=false`
SELECT 2147483647 + 1;
+----------------+
|(2147483647 + 1)|
+----------------+
|     -2147483648|
+----------------+
```
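
The two overflow behaviours for 32-bit integers can be modelled outside Spark. The following is a minimal Python sketch, not Spark's implementation: the function names `add_wrapping` and `add_checked` are hypothetical, and Python's `ArithmeticError` stands in for `java.lang.ArithmeticException`.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 32-bit signed integer


def add_wrapping(a: int, b: int) -> int:
    """Unchecked addition: keep the low 32 bits, as a Java/Scala `int` does."""
    total = (a + b) & 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as a signed value.
    return total - 2**32 if total > INT_MAX else total


def add_checked(a: int, b: int) -> int:
    """Checked addition: raise instead of silently wrapping around."""
    total = a + b
    if not INT_MIN <= total <= INT_MAX:
        raise ArithmeticError("integer overflow")
    return total


print(add_wrapping(2147483647, 1))  # -2147483648
try:
    add_checked(2147483647, 1)
except ArithmeticError as exc:
    print(f"ArithmeticError: {exc}")  # ArithmeticError: integer overflow
```

`add_wrapping` corresponds to the default behaviour described above and `add_checked` to the behaviour with `spark.sql.ansi.enabled=true`.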

### Type Conversion

Spark SQL has three kinds of type conversions: explicit casting, type coercion, and store assignment casting.
When `spark.sql.ansi.enabled` is set to `true`, explicit casting with the `CAST` syntax throws a runtime exception for illegal cast patterns defined in the standard, e.g. casts from a string to an integer.
`INSERT INTO`, on the other hand, throws an analysis exception when the ANSI store assignment policy is enabled via `spark.sql.storeAssignmentPolicy=ANSI`.

Currently, ANSI mode affects explicit casting and assignment casting only.
In future releases, the behaviour of type coercion might change along with the other two type conversion rules.

```sql
-- Examples of explicit casting

-- `spark.sql.ansi.enabled=true`
SELECT CAST('a' AS INT);
java.lang.NumberFormatException: invalid input syntax for type numeric: a

SELECT CAST(2147483648L AS INT);
java.lang.ArithmeticException: Casting 2147483648 to int causes overflow

-- `spark.sql.ansi.enabled=false` (This is a default behaviour)
SELECT CAST('a' AS INT);
+--------------+
|CAST(a AS INT)|
+--------------+
|          null|
+--------------+

SELECT CAST(2147483648L AS INT);
+-----------------------+
|CAST(2147483648 AS INT)|
+-----------------------+
|            -2147483648|
+-----------------------+

-- Examples of store assignment rules
CREATE TABLE t (v INT);

-- `spark.sql.storeAssignmentPolicy=ANSI`
INSERT INTO t VALUES ('1');
org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`default`.`t`':
- Cannot safely cast 'v': string to int;

-- `spark.sql.storeAssignmentPolicy=LEGACY` (This is a legacy behaviour until Spark 2.x)
INSERT INTO t VALUES ('1');
SELECT * FROM t;
+---+
|  v|
+---+
|  1|
+---+
```
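
The explicit-casting examples above can be read as two small decision rules. The sketch below models them in plain Python under stated assumptions: `cast_string_to_int` and `cast_long_to_int` are hypothetical names, Python's `int()` stands in for Spark's string parser (the two do not accept exactly the same inputs), range checking of parsed strings is left out, and Python exceptions stand in for the Java ones.

```python
from typing import Optional

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a 32-bit signed integer


def cast_string_to_int(value: str, ansi_enabled: bool) -> Optional[int]:
    """Invalid input raises in ANSI mode and yields None (SQL null) otherwise."""
    try:
        return int(value)
    except ValueError:
        if ansi_enabled:
            raise ValueError(
                f"invalid input syntax for type numeric: {value}"
            ) from None
        return None


def cast_long_to_int(value: int, ansi_enabled: bool) -> int:
    """Out-of-range input raises in ANSI mode and wraps to 32 bits otherwise."""
    if INT_MIN <= value <= INT_MAX:
        return value
    if ansi_enabled:
        raise ArithmeticError(f"Casting {value} to int causes overflow")
    wrapped = value & 0xFFFFFFFF  # keep the low 32 bits
    return wrapped - 2**32 if wrapped > INT_MAX else wrapped


print(cast_string_to_int("a", ansi_enabled=False))       # None
print(cast_long_to_int(2147483648, ansi_enabled=False))  # -2147483648
```

With `ansi_enabled=True`, the same two calls raise `ValueError` and `ArithmeticError` respectively, mirroring the exceptions shown in the SQL examples.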

### SQL Functions

The behavior of some SQL functions can differ under ANSI mode (`spark.sql.ansi.enabled=true`).
  - `size`: This function returns null for null input under ANSI mode.
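
As a rough illustration of the `size` rule, the Python sketch below models the null-input case. It rests on one assumption that this page does not state: with default settings, `size` returns `-1` for null input, as described in the function's own documentation. The function here is a stand-in written for this example, not Spark code.

```python
from typing import Optional, Sized


def size(collection: Optional[Sized], ansi_enabled: bool) -> Optional[int]:
    """Model of `size`: None plays the role of SQL null."""
    if collection is None:
        # Assumption: -1 is the non-ANSI result under default settings.
        return None if ansi_enabled else -1
    return len(collection)


print(size(None, ansi_enabled=True))       # None
print(size(None, ansi_enabled=False))      # -1
print(size([1, 2, 3], ansi_enabled=True))  # 3
```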

### SQL Keywords

When `spark.sql.ansi.enabled` is true, Spark SQL uses the ANSI mode parser.
In this mode, Spark SQL has two kinds of keywords:
* Reserved keywords: Keywords that are reserved and can't be used as identifiers for table, view, column, function, alias, etc.
* Non-reserved keywords: Keywords that have a special meaning only in particular contexts and can be used as identifiers in other contexts. For example, `EXPLAIN SELECT ...` is a command, but `EXPLAIN` can be used as an identifier in other places.

When ANSI mode is disabled, Spark SQL has two kinds of keywords:
* Non-reserved keywords: Same definition as when ANSI mode is enabled.
* Strict-non-reserved keywords: A strict version of non-reserved keywords, which cannot be used as a table alias.

By default, `spark.sql.ansi.enabled` is false.
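
The keyword rules above amount to a lookup followed by a small decision. The Python sketch below illustrates this with three rows copied from the keyword table in this section; the `KEYWORDS` dictionary and the `usable_as_identifier` function are hypothetical names for this example and say nothing about how Spark's parser is implemented.

```python
# (ANSI mode, default mode) classification for three keywords from the table.
KEYWORDS = {
    "SELECT": ("reserved", "non-reserved"),
    "ANTI": ("reserved", "strict-non-reserved"),
    "EXPLAIN": ("non-reserved", "non-reserved"),
}


def usable_as_identifier(
    word: str, ansi_enabled: bool, as_table_alias: bool = False
) -> bool:
    """Apply the reserved / non-reserved / strict-non-reserved rules."""
    entry = KEYWORDS.get(word.upper())
    if entry is None:
        return True  # not a keyword in this three-row excerpt
    kind = entry[0] if ansi_enabled else entry[1]
    if kind == "reserved":
        return False  # never usable as an identifier
    if kind == "strict-non-reserved":
        return not as_table_alias  # usable, except as a table alias
    return True  # non-reserved


print(usable_as_identifier("select", ansi_enabled=True))    # False
print(usable_as_identifier("select", ansi_enabled=False))   # True
print(usable_as_identifier("anti", ansi_enabled=False,
                           as_table_alias=True))            # False
```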

Below is a list of all the keywords in Spark SQL.

|Keyword|Spark SQL<br/>ANSI Mode|Spark SQL<br/>Default Mode|SQL-2011|
|-------|----------------------|-------------------------|--------|
|ADD|non-reserved|non-reserved|non-reserved|
|AFTER|non-reserved|non-reserved|non-reserved|
|ALL|reserved|non-reserved|reserved|
|ALTER|non-reserved|non-reserved|reserved|
|ANALYZE|non-reserved|non-reserved|non-reserved|
|AND|reserved|non-reserved|reserved|
|ANTI|reserved|strict-non-reserved|non-reserved|
|ANY|reserved|non-reserved|reserved|
|ARCHIVE|non-reserved|non-reserved|non-reserved|
|ARRAY|non-reserved|non-reserved|reserved|
|AS|reserved|non-reserved|reserved|
|ASC|non-reserved|non-reserved|non-reserved|
|AT|non-reserved|non-reserved|reserved|
|AUTHORIZATION|reserved|non-reserved|reserved|
|BETWEEN|non-reserved|non-reserved|reserved|
|BOTH|reserved|non-reserved|reserved|
|BUCKET|non-reserved|non-reserved|non-reserved|
|BUCKETS|non-reserved|non-reserved|non-reserved|
|BY|non-reserved|non-reserved|reserved|
|CACHE|non-reserved|non-reserved|non-reserved|
|CASCADE|non-reserved|non-reserved|reserved|
|CASE|reserved|non-reserved|reserved|
|CAST|reserved|non-reserved|reserved|
|CHANGE|non-reserved|non-reserved|non-reserved|
|CHECK|reserved|non-reserved|reserved|
|CLEAR|non-reserved|non-reserved|non-reserved|
|CLUSTER|non-reserved|non-reserved|non-reserved|
|CLUSTERED|non-reserved|non-reserved|non-reserved|
|CODEGEN|non-reserved|non-reserved|non-reserved|
|COLLATE|reserved|non-reserved|reserved|
|COLLECTION|non-reserved|non-reserved|non-reserved|
|COLUMN|reserved|non-reserved|reserved|
|COLUMNS|non-reserved|non-reserved|non-reserved|
|COMMENT|non-reserved|non-reserved|non-reserved|
|COMMIT|non-reserved|non-reserved|reserved|
|COMPACT|non-reserved|non-reserved|non-reserved|
|COMPACTIONS|non-reserved|non-reserved|non-reserved|
|COMPUTE|non-reserved|non-reserved|non-reserved|
|CONCATENATE|non-reserved|non-reserved|non-reserved|
|CONSTRAINT|reserved|non-reserved|reserved|
|COST|non-reserved|non-reserved|non-reserved|
|CREATE|reserved|non-reserved|reserved|
|CROSS|reserved|strict-non-reserved|reserved|
|CUBE|non-reserved|non-reserved|reserved|
|CURRENT|non-reserved|non-reserved|reserved|
|CURRENT_DATE|reserved|non-reserved|reserved|
|CURRENT_TIME|reserved|non-reserved|reserved|
|CURRENT_TIMESTAMP|reserved|non-reserved|reserved|
|CURRENT_USER|reserved|non-reserved|reserved|
|DATA|non-reserved|non-reserved|non-reserved|
|DATABASE|non-reserved|non-reserved|non-reserved|
|DATABASES|non-reserved|non-reserved|non-reserved|
|DAY|reserved|non-reserved|reserved|
|DBPROPERTIES|non-reserved|non-reserved|non-reserved|
|DEFINED|non-reserved|non-reserved|non-reserved|
|DELETE|non-reserved|non-reserved|reserved|
|DELIMITED|non-reserved|non-reserved|non-reserved|
|DESC|non-reserved|non-reserved|non-reserved|
|DESCRIBE|non-reserved|non-reserved|reserved|
|DFS|non-reserved|non-reserved|non-reserved|
|DIRECTORIES|non-reserved|non-reserved|non-reserved|
|DIRECTORY|non-reserved|non-reserved|non-reserved|
|DISTINCT|reserved|non-reserved|reserved|
|DISTRIBUTE|non-reserved|non-reserved|non-reserved|
|DIV|non-reserved|non-reserved|non-reserved|
|DROP|non-reserved|non-reserved|reserved|
|ELSE|reserved|non-reserved|reserved|
|END|reserved|non-reserved|reserved|
|ESCAPE|reserved|non-reserved|reserved|
|ESCAPED|non-reserved|non-reserved|non-reserved|
|EXCEPT|reserved|strict-non-reserved|reserved|
|EXCHANGE|non-reserved|non-reserved|non-reserved|
|EXISTS|non-reserved|non-reserved|reserved|
|EXPLAIN|non-reserved|non-reserved|non-reserved|
|EXPORT|non-reserved|non-reserved|non-reserved|
|EXTENDED|non-reserved|non-reserved|non-reserved|
|EXTERNAL|non-reserved|non-reserved|reserved|
|EXTRACT|non-reserved|non-reserved|reserved|
|FALSE|reserved|non-reserved|reserved|
|FETCH|reserved|non-reserved|reserved|
|FIELDS|non-reserved|non-reserved|non-reserved|
|FILTER|reserved|non-reserved|reserved|
|FILEFORMAT|non-reserved|non-reserved|non-reserved|
|FIRST|non-reserved|non-reserved|non-reserved|
|FOLLOWING|non-reserved|non-reserved|non-reserved|
|FOR|reserved|non-reserved|reserved|
|FOREIGN|reserved|non-reserved|reserved|
|FORMAT|non-reserved|non-reserved|non-reserved|
|FORMATTED|non-reserved|non-reserved|non-reserved|
|FROM|reserved|non-reserved|reserved|
|FULL|reserved|strict-non-reserved|reserved|
|FUNCTION|non-reserved|non-reserved|reserved|
|FUNCTIONS|non-reserved|non-reserved|non-reserved|
|GLOBAL|non-reserved|non-reserved|reserved|
|GRANT|reserved|non-reserved|reserved|
|GROUP|reserved|non-reserved|reserved|
|GROUPING|non-reserved|non-reserved|reserved|
|HAVING|reserved|non-reserved|reserved|
|HOUR|reserved|non-reserved|reserved|
|IF|non-reserved|non-reserved|reserved|
|IGNORE|non-reserved|non-reserved|non-reserved|
|IMPORT|non-reserved|non-reserved|non-reserved|
|IN|reserved|non-reserved|reserved|
|INDEX|non-reserved|non-reserved|non-reserved|
|INDEXES|non-reserved|non-reserved|non-reserved|
|INNER|reserved|strict-non-reserved|reserved|
|INPATH|non-reserved|non-reserved|non-reserved|
|INPUTFORMAT|non-reserved|non-reserved|non-reserved|
|INSERT|non-reserved|non-reserved|reserved|
|INTERSECT|reserved|strict-non-reserved|reserved|
|INTERVAL|non-reserved|non-reserved|reserved|
|INTO|reserved|non-reserved|reserved|
|IS|reserved|non-reserved|reserved|
|ITEMS|non-reserved|non-reserved|non-reserved|
|JOIN|reserved|strict-non-reserved|reserved|
|KEYS|non-reserved|non-reserved|non-reserved|
|LAST|non-reserved|non-reserved|non-reserved|
|LATERAL|non-reserved|non-reserved|reserved|
|LAZY|non-reserved|non-reserved|non-reserved|
|LEADING|reserved|non-reserved|reserved|
|LEFT|reserved|strict-non-reserved|reserved|
|LIKE|non-reserved|non-reserved|reserved|
|LIMIT|non-reserved|non-reserved|non-reserved|
|LINES|non-reserved|non-reserved|non-reserved|
|LIST|non-reserved|non-reserved|non-reserved|
|LOAD|non-reserved|non-reserved|non-reserved|
|LOCAL|non-reserved|non-reserved|reserved|
|LOCATION|non-reserved|non-reserved|non-reserved|
|LOCK|non-reserved|non-reserved|non-reserved|
|LOCKS|non-reserved|non-reserved|non-reserved|
|LOGICAL|non-reserved|non-reserved|non-reserved|
|MACRO|non-reserved|non-reserved|non-reserved|
|MAP|non-reserved|non-reserved|non-reserved|
|MATCHED|non-reserved|non-reserved|non-reserved|
|MERGE|non-reserved|non-reserved|non-reserved|
|MINUS|reserved|strict-non-reserved|non-reserved|
|MINUTE|reserved|non-reserved|reserved|
|MONTH|reserved|non-reserved|reserved|
|MSCK|non-reserved|non-reserved|non-reserved|
|NAMESPACE|non-reserved|non-reserved|non-reserved|
|NAMESPACES|non-reserved|non-reserved|non-reserved|
|NATURAL|reserved|strict-non-reserved|reserved|
|NO|non-reserved|non-reserved|reserved|
|NOT|reserved|non-reserved|reserved|
|NULL|reserved|non-reserved|reserved|
|NULLS|non-reserved|non-reserved|non-reserved|
|OF|non-reserved|non-reserved|reserved|
|ON|reserved|strict-non-reserved|reserved|
|ONLY|reserved|non-reserved|reserved|
|OPTION|non-reserved|non-reserved|non-reserved|
|OPTIONS|non-reserved|non-reserved|non-reserved|
|OR|reserved|non-reserved|reserved|
|ORDER|reserved|non-reserved|reserved|
|OUT|non-reserved|non-reserved|reserved|
|OUTER|reserved|non-reserved|reserved|
|OUTPUTFORMAT|non-reserved|non-reserved|non-reserved|
|OVER|non-reserved|non-reserved|non-reserved|
|OVERLAPS|reserved|non-reserved|reserved|
|OVERLAY|non-reserved|non-reserved|non-reserved|
|OVERWRITE|non-reserved|non-reserved|non-reserved|
|PARTITION|non-reserved|non-reserved|reserved|
|PARTITIONED|non-reserved|non-reserved|non-reserved|
|PARTITIONS|non-reserved|non-reserved|non-reserved|
|PERCENT|non-reserved|non-reserved|non-reserved|
|PIVOT|non-reserved|non-reserved|non-reserved|
|PLACING|non-reserved|non-reserved|non-reserved|
|POSITION|non-reserved|non-reserved|reserved|
|PRECEDING|non-reserved|non-reserved|non-reserved|
|PRIMARY|reserved|non-reserved|reserved|
|PRINCIPALS|non-reserved|non-reserved|non-reserved|
|PROPERTIES|non-reserved|non-reserved|non-reserved|
|PURGE|non-reserved|non-reserved|non-reserved|
|QUERY|non-reserved|non-reserved|non-reserved|
|RECORDREADER|non-reserved|non-reserved|non-reserved|
|RECORDWRITER|non-reserved|non-reserved|non-reserved|
|RECOVER|non-reserved|non-reserved|non-reserved|
|REDUCE|non-reserved|non-reserved|non-reserved|
|REFERENCES|reserved|non-reserved|reserved|
|REFRESH|non-reserved|non-reserved|non-reserved|
|RENAME|non-reserved|non-reserved|non-reserved|
|REPAIR|non-reserved|non-reserved|non-reserved|
|REPLACE|non-reserved|non-reserved|non-reserved|
|RESET|non-reserved|non-reserved|non-reserved|
|RESTRICT|non-reserved|non-reserved|non-reserved|
|REVOKE|non-reserved|non-reserved|reserved|
|RIGHT|reserved|strict-non-reserved|reserved|
|RLIKE|non-reserved|non-reserved|non-reserved|
|ROLE|non-reserved|non-reserved|non-reserved|
|ROLES|non-reserved|non-reserved|non-reserved|
|ROLLBACK|non-reserved|non-reserved|reserved|
|ROLLUP|non-reserved|non-reserved|reserved|
|ROW|non-reserved|non-reserved|reserved|
|ROWS|non-reserved|non-reserved|reserved|
|SCHEMA|non-reserved|non-reserved|non-reserved|
|SECOND|reserved|non-reserved|reserved|
|SELECT|reserved|non-reserved|reserved|
|SEMI|reserved|strict-non-reserved|non-reserved|
|SEPARATED|non-reserved|non-reserved|non-reserved|
|SERDE|non-reserved|non-reserved|non-reserved|
|SERDEPROPERTIES|non-reserved|non-reserved|non-reserved|
|SESSION_USER|reserved|non-reserved|reserved|
|SET|non-reserved|non-reserved|reserved|
|SETS|non-reserved|non-reserved|non-reserved|
|SHOW|non-reserved|non-reserved|non-reserved|
|SKEWED|non-reserved|non-reserved|non-reserved|
|SOME|reserved|non-reserved|reserved|
|SORT|non-reserved|non-reserved|non-reserved|
|SORTED|non-reserved|non-reserved|non-reserved|
|START|non-reserved|non-reserved|reserved|
|STATISTICS|non-reserved|non-reserved|non-reserved|
|STORED|non-reserved|non-reserved|non-reserved|
|STRATIFY|non-reserved|non-reserved|non-reserved|
|STRUCT|non-reserved|non-reserved|non-reserved|
|SUBSTR|non-reserved|non-reserved|non-reserved|
|SUBSTRING|non-reserved|non-reserved|non-reserved|
|TABLE|reserved|non-reserved|reserved|
|TABLES|non-reserved|non-reserved|non-reserved|
|TABLESAMPLE|non-reserved|non-reserved|reserved|
|TBLPROPERTIES|non-reserved|non-reserved|non-reserved|
|TEMPORARY|non-reserved|non-reserved|non-reserved|
|TERMINATED|non-reserved|non-reserved|non-reserved|
|THEN|reserved|non-reserved|reserved|
|TO|reserved|non-reserved|reserved|
|TOUCH|non-reserved|non-reserved|non-reserved|
|TRAILING|reserved|non-reserved|reserved|
|TRANSACTION|non-reserved|non-reserved|non-reserved|
|TRANSACTIONS|non-reserved|non-reserved|non-reserved|
|TRANSFORM|non-reserved|non-reserved|non-reserved|
|TRIM|non-reserved|non-reserved|non-reserved|
|TRUE|non-reserved|non-reserved|reserved|
|TRUNCATE|non-reserved|non-reserved|reserved|
|UNARCHIVE|non-reserved|non-reserved|non-reserved|
|UNBOUNDED|non-reserved|non-reserved|non-reserved|
|UNCACHE|non-reserved|non-reserved|non-reserved|
|UNION|reserved|strict-non-reserved|reserved|
|UNIQUE|reserved|non-reserved|reserved|
|UNKNOWN|reserved|non-reserved|reserved|
|UNLOCK|non-reserved|non-reserved|non-reserved|
|UNSET|non-reserved|non-reserved|non-reserved|
|UPDATE|non-reserved|non-reserved|reserved|
|USE|non-reserved|non-reserved|non-reserved|
|USER|reserved|non-reserved|reserved|
|USING|reserved|strict-non-reserved|reserved|
|VALUES|non-reserved|non-reserved|reserved|
|VIEW|non-reserved|non-reserved|non-reserved|
|VIEWS|non-reserved|non-reserved|non-reserved|
|WHEN|reserved|non-reserved|reserved|
|WHERE|reserved|non-reserved|reserved|
|WINDOW|non-reserved|non-reserved|reserved|
|WITH|reserved|non-reserved|reserved|
|YEAR|reserved|non-reserved|reserved|