Huawei Cloud User Manual

MapReduce Service (MRS) - Regular Expression Functions: Overview

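Overview

All regular expression functions use Java pattern syntax, with the following exceptions:

- In multi-line mode (enabled with the (?m) flag), only \n is recognized as a line terminator. The (?d) flag is not supported and must not be used.
- Case-insensitive matching (enabled with the (?i) flag) is always performed in a Unicode-aware manner. Context-sensitive and locale-sensitive matching are not supported. The (?u) flag is not supported.
- Surrogate pairs are not supported. For example, \uD800\uDC00 is not treated as U+10000 and must be specified as \x{10000}.
- Boundaries (\b) are handled incorrectly for a non-spacing mark without a base character.
- \Q and \E are not supported inside character classes (such as [A-Z123]) and are treated as literals.
- Unicode character classes (\p{prop}) are supported, with the following differences:
  - All underscores in names must be removed. For example, use OldItalic instead of Old_Italic.
  - Scripts must be specified directly, without the Is, script= or sc= prefix. Example: \p{Hiragana}
  - Blocks must be specified with the In prefix. The block= and blk= prefixes are not supported. Example: \p{InMongolian}
  - Categories must be specified directly, without the Is, general_category= or gc= prefix. Example: \p{L}
  - Binary properties must be specified directly, without the Is prefix. Example: \p{NoncharacterCodePoint}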

MapReduce Service (MRS)
MapReduce Service (MRS) - ALTER TABLE: Limitations

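Limitations

EXCHANGE PARTITION:
- The partition or partitions being moved must already exist before the move, must belong to the source table, and must not exist in the target table.
- The tables involved must have identical column definitions and identical partition keys.
- The operation fails if a table contains an index.
- EXCHANGE PARTITION is not allowed when either the source table or the target table is a transactional table.
- For the target table, the partitions moved in one operation either all succeed or all fail. For the source table, all moved partitions are released once the operation succeeds.

ALTER TABLE ... CHANGE COLUMN does not support tables in ORC format.

The ALTER TABLE table_name ADD | DROP col_name command is available only for non-partitioned tables stored in ORC or PARQUET format.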
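The EXCHANGE PARTITION rules above can be read as a set of precondition checks followed by an all-or-nothing move. The following Python sketch models that reading only; the dictionary layout and the function name `exchange_partitions` are hypothetical and are not part of HetuEngine.

```python
def exchange_partitions(source, target, partition_keys):
    """Model of EXCHANGE PARTITION: validate everything, then move everything.

    Tables are hypothetical dicts with the keys "columns", "partition_by",
    "transactional", "has_index" and "partitions" (partition key -> rows).
    """
    if source["transactional"] or target["transactional"]:
        raise ValueError("transactional tables cannot exchange partitions")
    if source["has_index"] or target["has_index"]:
        raise ValueError("tables with an index cannot exchange partitions")
    if source["columns"] != target["columns"]:
        raise ValueError("column definitions differ")
    if source["partition_by"] != target["partition_by"]:
        raise ValueError("partition keys differ")
    for key in partition_keys:
        if key not in source["partitions"]:
            raise ValueError(f"partition {key!r} does not exist in the source table")
        if key in target["partitions"]:
            raise ValueError(f"partition {key!r} already exists in the target table")
    # Every check passed before any change was made, so the move is all-or-nothing.
    for key in partition_keys:
        target["partitions"][key] = source["partitions"].pop(key)


def make_table(partitions):
    """Build a hypothetical non-transactional table without indexes."""
    return {
        "columns": ["a string", "b string"],
        "partition_by": ["ds string"],
        "transactional": False,
        "has_index": False,
        "partitions": dict(partitions),
    }
```

Validating the whole batch before moving anything is one simple way to obtain the "all succeed or all fail" behavior described for the target table.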

MapReduce Service (MRS) - ALTER TABLE: Examples

Examples

Rename table users to people:

    ALTER TABLE users RENAME TO people;

Add a column named zip to table users:

    ALTER TABLE users ADD COLUMN zip varchar;

Drop the column named zip from table users:

    ALTER TABLE users DROP COLUMN zip;

Rename column id to user_id in table users:

    ALTER TABLE users RENAME COLUMN id TO user_id;

Partition operations:

    -- Create two partitioned tables
    CREATE TABLE IF NOT EXISTS hetu_int_table5 (eid int, name String, salary String, destination String, dept String, yoj int) COMMENT 'Employee Names' partitioned by (dt timestamp, country String, year int, bonus decimal(10,3)) STORED AS TEXTFILE;

    CREATE TABLE IF NOT EXISTS hetu_int_table6 (eid int, name String, salary String, destination String, dept String, yoj int) COMMENT 'Employee Names' partitioned by (dt timestamp, country String, year int, bonus decimal(10,3)) STORED AS TEXTFILE;

    -- Add partitions
    ALTER TABLE hetu_int_table5 ADD IF NOT EXISTS PARTITION (dt='2008-08-08 10:20:30.0', country='IN', year=2001, bonus=500.23) PARTITION (dt='2008-08-09 10:20:30.0', country='IN', year=2001, bonus=100.50);

    -- View partitions
    show partitions hetu_int_table5;
     dt                      | country | year | bonus
    -------------------------|---------|------|---------
     2008-08-09 10:20:30.000 | IN      | 2001 | 100.500
     2008-08-08 10:20:30.000 | IN      | 2001 | 500.230
    (2 rows)

    -- Drop a partition
    ALTER TABLE hetu_int_table5 DROP IF EXISTS PARTITION (dt=timestamp '2008-08-08 10:20:30.0', country='IN', year=2001, bonus=500.23);

    -- View partitions
    show partitions hetu_int_table5;
     dt                      | country | year | bonus
    -------------------------|---------|------|---------
     2008-08-09 10:20:30.000 | IN      | 2001 | 100.500
    (1 row)

    -- Partition exchange example
    CREATE SCHEMA part_test;
    CREATE TABLE hetu_exchange_partition1 (a string, b string) PARTITIONED BY (ds string);
    CREATE TABLE part_test.hetu_exchange_partition2 (a string, b string) PARTITIONED BY (ds string);
    ALTER TABLE hetu_exchange_partition1 ADD PARTITION (ds='1');

    -- View partitions
    show partitions hetu_exchange_partition1;
     ds
    ----
     1
    (1 row)

    show partitions part_test.hetu_exchange_partition2;
     ds
    ----
    (0 rows)

    -- Move the partition from T1 (hetu_exchange_partition1) to T2 (part_test.hetu_exchange_partition2)
    ALTER TABLE part_test.hetu_exchange_partition2 EXCHANGE PARTITION (ds='1') WITH TABLE hetu_exchange_partition1;

    -- View the partitions again; the partition has been moved
    show partitions hetu_exchange_partition1;
     ds
    ----
    (0 rows)

    show partitions part_test.hetu_exchange_partition2;
     ds
    ----
     1
    (1 row)

    -- Rename a partition
    CREATE TABLE IF NOT EXISTS hetu_rename_table (eid int, name String, salary String, destination String, dept String, yoj int) COMMENT 'Employee details' partitioned by (year int) STORED AS TEXTFILE;

    ALTER TABLE hetu_rename_table ADD IF NOT EXISTS PARTITION (year=2001);

    SHOW PARTITIONS hetu_rename_table;
     year
    ------
     2001
    (1 row)

    ALTER TABLE hetu_rename_table PARTITION (year=2001) rename to partition (year=2020);

    SHOW PARTITIONS hetu_rename_table;
     year
    ------
     2020
    (1 row)

    -- Modify a partitioned table
    create table altercolumn4(a integer, b string) partitioned by (c integer);

    -- Change the file format of the table
    alter table altercolumn4 SET FILEFORMAT textfile;

    insert into altercolumn4 values (100, 'Daya', 500);

    alter table altercolumn4 partition (c=500) change column b empname string comment 'changed column name to empname' first;

    -- Change the storage location of a partition (the directory must first be created on HDFS; after the statement runs, the previously inserted row can no longer be queried)
    alter table altercolumn4 partition (c=500) set Location '/user/hive/warehouse/c500';

    -- Rename column b to name and change its type from integer to string
    create table altercolumn1(a integer, b integer) stored as textfile;

    alter table altercolumn1 change column b name string;

    -- Change the storage properties of altercolumn1
    ALTER TABLE altercolumn1 CLUSTERED BY(a, name) SORTED BY(name) INTO 25 BUCKETS;

    -- View the properties of altercolumn1
    describe formatted altercolumn1;
                               Describe Formatted Table
    ----------------------------------------------------------------------------------------
     # col_name      data_type      comment
     a               integer
     name            varchar

     # Detailed Table Information
     Database:       default
     Owner:          admintest
     LastAccessTime: 0
     Location:       hdfs://hacluster/user/hive/warehouse/altercolumn1
     Table Type:     MANAGED_TABLE

     # Table Parameters:
       STATS_GENERATED_VIA_STATS_TASK  workaround for potential lack of HIVE-12730
       numFiles                        0
       numRows                         0
       orc.compress.size               262144
       orc.compression.codec           GZIP
       orc.row.index.stride            10000
       orc.stripe.size                 67108864
       presto_query_id                 20210325_025238_00034_f63xj@default@HetuEngine
       presto_version
       rawDataSize                     0
       totalSize                       0
       transient_lastDdlTime           1616640758

     # Storage Information
     SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
     InputFormat:    org.apache.hadoop.mapred.TextInputFormat
     OutputFormat:   org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
     Compressed:     No
     Num Buckets:    25
     Bucket Columns: [a, name]
     Sort Columns:   [SortingColumn{columnName=name, order=ASCENDING}]
     Storage Desc Params:
       serialization.format            1
    (1 row)

    Query 20210325_090522_00091_f63xj@default@HetuEngine, FINISHED, 1 node
    Splits: 1 total, 1 done (100.00%)
    0:00 [0 rows, 0B] [0 rows/s, 0B/s]

MapReduce Service (MRS) - CREATE TABLE LIKE: Syntax

Syntax

    CREATE TABLE [ IF NOT EXISTS ] table_name (
        { column_name data_type [ COMMENT comment ] [ WITH ( property_name = expression [, ...] ) ]
        | LIKE existing_table_name [ { INCLUDING | EXCLUDING } PROPERTIES ] }
        [, ...]
    )
    [ COMMENT table_comment ]
    [ WITH ( property_name = expression [, ...] ) ]

MapReduce Service (MRS) - CREATE TABLE LIKE: Examples

Examples

Create the base tables order01 and order02:

    CREATE TABLE order01(id int, name string, tel string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;
    CREATE TABLE order02(sku int, sku_name string, sku_describe string);

Create table orders_like01, which contains the columns and table properties defined by table order01:

    CREATE TABLE orders_like01 like order01 INCLUDING PROPERTIES;

Create table orders_like02, which contains the columns defined by table order02, and set the storage format of the table to 'TEXTFILE':

    CREATE TABLE orders_like02 like order02 STORED AS TEXTFILE;

Create table orders_like03, which contains the columns and table properties defined by order01, the columns defined by order02, and the additional columns c1 and c2:

    CREATE TABLE orders_like03 (c1 int, c2 float, LIKE order01 INCLUDING PROPERTIES, LIKE order02);

Create tables orders_like04 and orders_like05. Both contain the definition of the same table order_partition, but orders_like04 does not include the partition key information, whereas orders_like05 does:

    CREATE TABLE order_partition(id int, name string, tel string) PARTITIONED BY (sku int);
    CREATE TABLE orders_like04 (like order_partition);
    CREATE TABLE orders_like05 like order_partition;

    DESC orders_like04;
     Column | Type    | Extra | Comment
    --------|---------|-------|---------
     id     | integer |       |
     name   | varchar |       |
     tel    | varchar |       |
     sku    | integer |       |
    (4 rows)

    DESC orders_like05;
     Column | Type    | Extra         | Comment
    --------|---------|---------------|---------
     id     | integer |               |
     name   | varchar |               |
     tel    | varchar |               |
     sku    | integer | partition key |
    (4 rows)

MapReduce Service (MRS) - CREATE TABLE LIKE: Description

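Description

The LIKE clause includes all column definitions of an existing table in a new table. Multiple LIKE clauses can be used to copy the columns of several tables. If INCLUDING PROPERTIES is specified, all table properties are also copied to the new table; this option can take effect for at most one table. Properties copied from a table can be overridden by naming them in the WITH clause. The default is EXCLUDING PROPERTIES. For a partitioned table, if the LIKE clause is wrapped in parentheses, the copied column definitions do not include the partition key information.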

MapReduce Service (MRS) - CREATE TABLE AS: Examples

Examples

Create a new table orders_column_aliased from the query result, with the specified column names:

    CREATE TABLE orders_column_aliased (order_date, total_price) AS SELECT orderdate, totalprice FROM orders;

Create a new table orders_by_date from an aggregation of table orders:

    CREATE TABLE orders_by_date COMMENT 'Summary of orders by date' WITH (format = 'ORC') AS SELECT orderdate, sum(totalprice) AS price FROM orders GROUP BY orderdate;

Create table orders_by_date only if it does not already exist:

    CREATE TABLE IF NOT EXISTS orders_by_date AS SELECT orderdate, sum(totalprice) AS price FROM orders GROUP BY orderdate;

Create a new table empty_orders with the same schema as table orders, but without data:

    CREATE TABLE empty_orders AS SELECT * FROM orders WITH NO DATA;

To create a table using VALUES, see VALUES.

Partitioned table examples:

    CREATE EXTERNAL TABLE hetu_copy(corderkey, corderstatus, ctotalprice, corderdate, cds) PARTITIONED BY(cds) SORT BY (corderkey, corderstatus) COMMENT 'test' STORED AS orc LOCATION '/user/hetuserver/tmp' TBLPROPERTIES (orc_bloom_filter_fpp = 0.3, orc_compress = 'SNAPPY', orc_compress_size = 6710422, orc_bloom_filter_columns = 'corderstatus,ctotalprice') as select * from hetu_test;

    CREATE TABLE hetu_copy1(corderkey, corderstatus, ctotalprice, corderdate, cds) WITH (partitioned_by = ARRAY['cds'], bucketed_by = ARRAY['corderkey', 'corderstatus'], sorted_by = ARRAY['corderkey', 'corderstatus'], bucket_count = 16, orc_compress = 'SNAPPY', orc_compress_size = 6710422, orc_bloom_filter_columns = ARRAY['corderstatus', 'ctotalprice'], external = true, format = 'orc', location = '/user/hetuserver/tmp') as select * from hetu_test;

MapReduce Service (MRS) - CREATE TABLE AS: Syntax

Syntax

    CREATE [EXTERNAL]① TABLE [IF NOT EXISTS] [catalog_name.][db_name.]table_name [ ( column_alias, ... ) ]
    [[PARTITIONED BY ①(col_name, ....)] [SORT BY① ([column [, column ...]])] ]①
    [COMMENT 'table_comment']
    [ WITH ( property_name = expression [, ...] ) ]②
    [[STORED AS file_format]① [LOCATION 'hdfs_path']① [TBLPROPERTIES (orc_table_property = value [, ...] ) ] ]①
    AS query
    [ WITH [ NO ] DATA ]②

MapReduce Service (MRS) - UPDATE: Example

Example

    -- Create a transactional table
    create table upd_tb(col1 int, col2 string) with (format='orc', transactional=true);

    -- Insert data
    insert into upd_tb values (3,'A'),(4,'B');

    -- Update the rows where col1 = 4
    update upd_tb set col1=5 where col1=4;

    -- Query the table; the row with col1=4 has been updated
    select * from upd_tb;
     col1 | col2
    ------|------
        5 | B
        3 | A

MapReduce Service (MRS) HetuEngine DML SQL Syntax
MapReduce Service (MRS) - Array Functions and Operators: Concatenation Operator: ||

Concatenation Operator: ||

The || operator concatenates an array with another array, or with a single element, of the same type.

    SELECT ARRAY[1] || ARRAY[2];
     _col0
    --------
     [1, 2]
    (1 row)

    SELECT ARRAY[1] || 2;
     _col0
    --------
     [1, 2]
    (1 row)

    SELECT 2 || ARRAY[1];
     _col0
    --------
     [2, 1]
    (1 row)

MapReduce Service (MRS) - TABLESAMPLE: BERNOULLI

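BERNOULLI

Each row is selected into the sample according to the specified sampling percentage. When a table is sampled with the Bernoulli method, all physical blocks of the table are scanned and some rows are skipped, based on a comparison between the sampling percentage and a random value computed at runtime. The probability that a row is included in the result is independent of every other row. This method does not reduce the time needed to read the sampled table from disk. It can still affect the total query time if the sampled output is processed further.

    SELECT * FROM users TABLESAMPLE BERNOULLI (50);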
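The row-level decision described above can be sketched in Python as follows. This is only a model of the sampling rule, not the engine's implementation; the function name `bernoulli_sample` and the `seed` parameter are assumptions added for illustration.

```python
import random


def bernoulli_sample(rows, percentage, seed=None):
    """Keep each row independently with probability percentage / 100.

    Every row is still visited, which mirrors the statement that the whole
    table is scanned even though only part of it is returned.
    """
    rng = random.Random(seed)  # seed is only for reproducible illustration
    threshold = percentage / 100.0
    return [row for row in rows if rng.random() < threshold]


users = list(range(1000))
sample = bernoulli_sample(users, 50, seed=7)
print(len(sample))  # close to 500, but not exactly the same for every seed
```

Because each row is decided independently, the sample size varies around the requested percentage rather than matching it exactly.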

MapReduce Service (MRS) - TABLESAMPLE: SYSTEM

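SYSTEM

This sampling method divides the table into logical segments of data and samples the table at that granularity. It either selects all rows from a given segment or skips the segment entirely, based on a comparison between the sampling percentage and a random value computed at runtime. Which rows are selected depends on the connector in use. With a Hive data source, for example, it depends on how the data is laid out on HDFS. This method does not guarantee independent sampling probabilities.

    SELECT * FROM users TABLESAMPLE SYSTEM (75);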
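For contrast with row-level sampling, segment-level sampling can be modeled as below. The sketch assumes the table is already split into segments (here plain Python lists); how a real connector forms segments is not specified in this section, and `system_sample` is a hypothetical name.

```python
import random


def system_sample(segments, percentage, seed=None):
    """Keep or skip whole segments with probability percentage / 100."""
    rng = random.Random(seed)  # seed is only for reproducible illustration
    threshold = percentage / 100.0
    sampled = []
    for segment in segments:
        # One random draw per segment: all of its rows are kept, or none.
        if rng.random() < threshold:
            sampled.extend(segment)
    return sampled


segments = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
print(system_sample(segments, 75, seed=3))
```

Rows in the same segment share a single draw, which is why their inclusion probabilities are not independent.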

MapReduce Service (MRS) - EXPLAIN: Examples

Examples

LOGICAL:

    CREATE TABLE testTable (regionkey int, name varchar);
    EXPLAIN SELECT regionkey, count(*) FROM testTable GROUP BY 1;
                                   Query Plan
    -------------------------------------------------------------------------------------------------------------------------------------
     Output[regionkey, _col1]
     │   Layout: [regionkey:integer, count:bigint]
     │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
     │   _col1 := count
     └─ RemoteExchange[GATHER]
        │   Layout: [regionkey:integer, count:bigint]
        │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
        └─ Project[]
           │   Layout: [regionkey:integer, count:bigint]
           │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
           └─ Aggregate(FINAL)[regionkey][$hashvalue]
              │   Layout: [regionkey:integer, $hashvalue:bigint, count:bigint]
              │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
              │   count := count("count_8")
              └─ LocalExchange[HASH][$hashvalue] ("regionkey")
                 │   Layout: [regionkey:integer, count_8:bigint, $hashvalue:bigint]
                 │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
                 └─ RemoteExchange[REPARTITION][$hashvalue_9]
                    │   Layout: [regionkey:integer, count_8:bigint, $hashvalue_9:bigint]
                    │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
                    └─ Aggregate(PARTIAL)[regionkey][$hashvalue_10]
                       │   Layout: [regionkey:integer, $hashvalue_10:bigint, count_8:bigint]
                       │   count_8 := count(*)
                       └─ ScanProject[table = hive:default:testtable]
                              Layout: [regionkey:integer, $hashvalue_10:bigint]
                              Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}/{rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
                              $hashvalue_10 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey"), 0))
                              regionkey := regionkey:int:0:REGULAR

DISTRIBUTED:

    EXPLAIN (type DISTRIBUTED) SELECT regionkey, count(*) FROM testTable GROUP BY 1;
                                   Query Plan
    -----------------------------------------------------------------------------------------------------------------------
     Fragment 0 [SINGLE]
         Output layout: [regionkey, count]
         Output partitioning: SINGLE []
         Stage Execution Strategy: UNGROUPED_EXECUTION
         Output[regionkey, _col1]
         │   Layout: [regionkey:integer, count:bigint]
         │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
         │   _col1 := count
         └─ RemoteSource[1]
                Layout: [regionkey:integer, count:bigint]

     Fragment 1 [HASH]
         Output layout: [regionkey, count]
         Output partitioning: SINGLE []
         Stage Execution Strategy: UNGROUPED_EXECUTION
         Project[]
         │   Layout: [regionkey:integer, count:bigint]
         │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
         └─ Aggregate(FINAL)[regionkey][$hashvalue]
            │   Layout: [regionkey:integer, $hashvalue:bigint, count:bigint]
            │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
            │   count := count("count_8")
            └─ LocalExchange[HASH][$hashvalue] ("regionkey")
               │   Layout: [regionkey:integer, count_8:bigint, $hashvalue:bigint]
               │   Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?}
               └─ RemoteSource[2]
                      Layout: [regionkey:integer, count_8:bigint, $hashvalue_9:bigint]

     Fragment 2 [SOURCE]
         Output layout: [regionkey, count_8, $hashvalue_10]
         Output partitioning: HASH [regionkey][$hashvalue_10]
         Stage Execution Strategy: UNGROUPED_EXECUTION
         Aggregate(PARTIAL)[regionkey][$hashvalue_10]
         │   Layout: [regionkey:integer, $hashvalue_10:bigint, count_8:bigint]
         │   count_8 := count(*)
         └─ ScanProject[table = hive:default:testtable, grouped = false]
                Layout: [regionkey:integer, $hashvalue_10:bigint]
                Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}/{rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}
                $hashvalue_10 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey"), 0))
                regionkey := regionkey:int:0:REGULAR

VALIDATE:

    EXPLAIN (TYPE VALIDATE) SELECT regionkey, count(*) FROM testTable GROUP BY 1;
     Valid
    -------
     true

IO:

    EXPLAIN (TYPE IO, FORMAT JSON) SELECT regionkey, count(*) FROM testTable GROUP BY 1;
                Query Plan
    ---------------------------------
     {
       "inputTableColumnInfos" : [ {
         "table" : {
           "catalog" : "hive",
           "schemaTable" : {
             "schema" : "default",
             "table" : "testtable"
           }
         },
         "columnConstraints" : [ ]
       } ]
     }

MapReduce Service (MRS) - EXPLAIN: Description

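Description

Shows the logical or distributed execution plan of a statement. It can also be used to validate an SQL statement or to analyze its I/O. The TYPE DISTRIBUTED option shows the fragmented plan. Each fragment is executed by one or more nodes. Fragment separation indicates that data is exchanged between nodes. The fragment type specifies how a fragment is executed and how data is distributed between fragments.

SINGLE
    The fragment is executed on a single node.
HASH
    The fragment is executed on a fixed number of nodes, with the input data distributed by a hash function.
ROUND_ROBIN
    The fragment is executed on a fixed number of nodes, with the input data distributed in round-robin fashion.
BROADCAST
    The fragment is executed on a fixed number of nodes, with the input data broadcast to all nodes.
SOURCE
    The fragment is executed on the nodes where the input splits are accessed.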
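The difference between the HASH, ROUND_ROBIN and BROADCAST distributions can be illustrated with a small Python model. It is a sketch of the three distribution rules only; the function `distribute`, the use of CRC32 as the hash, and the sample rows are all assumptions made for the illustration and say nothing about how the engine hashes data.

```python
import zlib


def distribute(rows, num_nodes, mode, key=None):
    """Assign rows to a fixed number of nodes according to the fragment type."""
    nodes = [[] for _ in range(num_nodes)]
    for position, row in enumerate(rows):
        if mode == "HASH":
            # Rows with the same key always land on the same node.
            digest = zlib.crc32(repr(key(row)).encode("utf-8"))
            nodes[digest % num_nodes].append(row)
        elif mode == "ROUND_ROBIN":
            # Rows are dealt out in turn, regardless of their content.
            nodes[position % num_nodes].append(row)
        elif mode == "BROADCAST":
            # Every node receives a copy of every row.
            for node in nodes:
                node.append(row)
        else:
            raise ValueError(f"unsupported mode: {mode}")
    return nodes


rows = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
print(distribute(rows, 3, "HASH", key=lambda r: r[0]))
print(distribute(rows, 3, "ROUND_ROBIN"))
print(distribute(rows, 2, "BROADCAST"))
```

Hash distribution is what makes the final aggregation in Fragment 1 of the DISTRIBUTED example possible: all partial counts for one regionkey arrive at the same node.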

MapReduce Service (MRS) - UNNEST: Using Multiple Columns

Using Multiple Columns

    SELECT numbers, animals, n, a
    FROM (
      VALUES
        (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
        (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
    ) AS x (numbers, animals)
    CROSS JOIN UNNEST(numbers, animals) AS t (n, a);
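The query above gives no output, so the following Python sketch models what a multi-column UNNEST produces. It assumes HetuEngine follows the usual Presto behavior, in which the arrays are expanded side by side and the shorter one is padded with NULL (None here); the function name `cross_join_unnest` is hypothetical.

```python
from itertools import zip_longest


def cross_join_unnest(rows):
    """Expand two array columns in parallel, padding the shorter with None."""
    result = []
    for numbers, animals in rows:
        for n, a in zip_longest(numbers, animals, fillvalue=None):
            result.append((numbers, animals, n, a))
    return result


rows = [
    ([2, 5], ["dog", "cat", "bird"]),
    ([7, 8, 9], ["cow", "pig"]),
]
for row in cross_join_unnest(rows):
    print(row)
```

Under that assumption each input row yields as many output rows as its longest array, so this input produces six rows in total.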

MapReduce Service (MRS) - SHOW TABLE/PARTITION EXTENDED: Examples

Examples

    -- Prepare the demo data
    create schema show_schema;
    use show_schema;
    create table show_table1(a int, b string);
    create table show_table2(a int, b string);
    create table from_table1(a int, b string);
    create table in_table1(a int, b string);

    -- Show details of the tables whose names start with "show"
    show table extended like 'show*';
                                  tab_name
    --------------------------------------------------------------------------
     tableName:show_table1
     owner:admintest
     location:hdfs://hacluster/user/hive/warehouse/show_schema.db/show_table1
     InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns:struct columns {int a,string b}
     partitioned:false
     partitionColumns:
     totalNumberFiles:0
     totalFileSize:0

     tableName:show_table2
     owner:admintest
     location:hdfs://hacluster/user/hive/warehouse/show_schema.db/show_table2
     InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns:struct columns {int a,string b}
     partitioned:false
     partitionColumns:
     totalNumberFiles:0
     totalFileSize:0
    (1 row)

    -- Show details of the tables whose names start with "from" or "show"
    show table extended like 'from*|show*';
                                  tab_name
    ----------------------------------------------------------------------
     tableName show_table1
     owner admintest
     location hdfs://hacluster/user/hive/warehouse/show_table1
     InputFormat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns struct columns {int a,string b}
     partitioned false
     partitionColumns
     totalNumberFiles 0
     totalFileSize null

     tableName from_table1
     owner admintest
     location hdfs://hacluster/user/hive/warehouse/from_table1
     InputFormat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns struct columns {int a,string b}
     partitioned false
     partitionColumns
     totalNumberFiles 0
     totalFileSize null

     tableName show_table2
     owner admintest
     location hdfs://hacluster/user/hive/warehouse/show_table2
     InputFormat org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns struct columns {int a,string b}
     partitioned false
     partitionColumns
     totalNumberFiles 0
     totalFileSize null
    (1 row)

    -- Show extended information for the page_views table in the web schema
    show table extended from web like 'page*';
                                  tab_name
    -----------------------------------------------------------------------------
     tableName:page_views
     owner:admintest
     location:hdfs://hacluster/user/web.db/page_views
     InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     columns:struct columns {timestamp view_time,bigint user_id,string page_url}
     partitioned:true
     partitionColumns: struct partition_columns {date ds,string country}
     totalNumberFiles:0
     totalFileSize:0
    (1 row)

MapReduce Service (MRS) - SHOW TABLE/PARTITION EXTENDED: Parameters

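Parameters

IN | FROM schema_name
    Specifies the schema name. If it is omitted, the current schema is used.

LIKE 'identifier_with_wildcards'
    identifier_with_wildcards supports only matching expressions that contain "*" and "|". "*" matches one or more characters. "|" separates several matching expressions and matches when any one of them matches. Leading and trailing spaces of a matching expression are ignored during matching.

partition_spec
    An optional parameter that specifies a list of partitions as key-value pairs separated by commas. Note that when a partition is specified, the table name does not support fuzzy matching.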
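The matching rule for identifier_with_wildcards can be approximated in Python as shown below. This is a sketch under one explicit assumption: "*" is translated to the regular expression `.*`, as Hive does, which also lets it match an empty string, whereas the text above only promises one or more characters. The function name `matches_table_pattern` is hypothetical.

```python
import re


def matches_table_pattern(pattern, table_name):
    """Return True if table_name matches any '|'-separated alternative."""
    for alternative in pattern.split("|"):
        # Leading and trailing spaces of each expression are ignored.
        alternative = alternative.strip()
        # Escape everything except '*', which becomes '.*' (an assumption).
        regex = ".*".join(re.escape(part) for part in alternative.split("*"))
        if re.fullmatch(regex, table_name):
            return True
    return False


tables = ["show_table1", "show_table2", "from_table1", "in_table1"]
print([t for t in tables if matches_table_pattern("from*|show*", t)])
```

With the demo tables from the examples section, the pattern 'from*|show*' selects the two show tables and from_table1, and leaves in_table1 out.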

MapReduce Service (MRS) - HyperLogLog Functions: Data Structures

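Data Structures

HyperLogLog (hll) is an algorithm for estimating cardinality. It does not store how many times each element occurs. Instead, it is a probabilistic algorithm that estimates the number of elements from the position of the first 1 bit in each element's 32-bit hash value. There are two storage layouts, sparse and dense. An hll is created in the sparse layout and is converted to the dense layout when that becomes more efficient. P4HyperLogLog, by contrast, uses the dense layout for its entire lifetime. If necessary, an explicit conversion can be made with cast(hll as P4HyperLogLog). In the current engine implementation, the hll data sketch is stored as a set of 32-bit buckets that hold the corresponding maximum hash.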
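To make the "position of the first 1 bit" idea concrete, here is a minimal dense HyperLogLog sketch in Python. It follows the textbook algorithm rather than the engine's actual implementation: the class name, the choice of SHA-1 truncated to 32 bits as the hash, the default of 1024 buckets, and the bias constant (valid for 128 or more buckets) are all assumptions made for illustration.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Textbook dense HyperLogLog over a 32-bit hash (illustrative only)."""

    def __init__(self, index_bits=10):
        self.index_bits = index_bits
        self.num_buckets = 1 << index_bits
        self.buckets = [0] * self.num_buckets

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        hashed = int.from_bytes(digest[:4], "big")  # 32-bit hash value
        rest_bits = 32 - self.index_bits
        index = hashed >> rest_bits                 # leading bits pick a bucket
        rest = hashed & ((1 << rest_bits) - 1)
        # Position of the first 1 bit in the remaining bits (1-based).
        rank = rest_bits - rest.bit_length() + 1
        # Each bucket remembers only the largest rank it has seen.
        if rank > self.buckets[index]:
            self.buckets[index] = rank

    def estimate(self):
        m = self.num_buckets
        alpha = 0.7213 / (1 + 1.079 / m)  # bias correction, assumes m >= 128
        raw = alpha * m * m / sum(2.0 ** -b for b in self.buckets)
        zeros = self.buckets.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


sketch = HyperLogLogSketch()
for i in range(10000):
    sketch.add(f"user-{i}")
print(round(sketch.estimate()))  # an approximation of 10000, not an exact count
```

Adding the same element again never changes a bucket, which is why the structure counts distinct values without storing the values or their frequencies.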

MapReduce Service (MRS) - String Functions and Operators: String Functions

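String Functions

These functions assume that the input strings contain valid UTF-8 encoded Unicode code points. The validity of the UTF-8 data is not checked explicitly, and the functions may return incorrect results for invalid UTF-8 data. Use from_utf8 to correct invalid UTF-8 data. In addition, these functions operate on Unicode code points, not on user-visible characters (grapheme clusters). Some languages combine several code points into a single user-perceived character, the basic unit of the writing system, but the functions treat each code point as a separate unit. The lower and upper functions do not perform the locale-sensitive, context-sensitive, or one-to-many mappings that some languages require.

chr(n) → varchar
    Description: Returns the character whose Unicode code point is n.
    select chr(100); -- d

char_length(string) → bigint
    See length(string).

character_length(string) → bigint
    See length(string).

codepoint(string) → integer
    Description: Returns the Unicode code point of a single character.
    select codepoint('d'); -- 100

concat(string1, string2) → varchar
    Description: Concatenates strings.
    select concat('hello','world'); -- helloworld

concat_ws(string0, string1, ..., stringN) → varchar
    Description: Concatenates string1, string2, ..., stringN into one string, using string0 as the separator. If string0 is null, the result is null. NULL arguments after the separator are skipped.
    select concat_ws(',','hello','world'); -- hello,world
    select concat_ws(NULL,'def'); -- NULL
    select concat_ws(',','hello',NULL,'world'); -- hello,world
    select concat_ws(',','hello','','world'); -- hello,,world

concat_ws(string0, array(varchar)) → varchar
    Description: Concatenates the elements of the array, using string0 as the separator. If string0 is null, the result is null. Null values in the array are skipped.
    select concat_ws(NULL,ARRAY['abc']); -- NULL
    select concat_ws(',',ARRAY['abc',NULL,NULL,'xyz']); -- abc,xyz
    select concat_ws(',',ARRAY['hello','world']); -- hello,world

decode(binary bin, string charset) → varchar
    Description: Decodes the first argument into a string using the given character set. Supported character sets are 'UTF-8', 'UTF-16BE', 'UTF-16LE' and 'UTF-16'. Returns null if the first argument is null.
    select decode(X'70 61 6e 64 61','UTF-8');
     _col0
    -------
     panda
    (1 row)

    select decode(X'00 70 00 61 00 6e 00 64 00 61','UTF-16BE');
     _col0
    -------
     panda
    (1 row)

encode(string str, string charset) → binary
    Description: Encodes the string using the given character set.
    select encode('panda','UTF-8');
     _col0
    ----------------
     70 61 6e 64 61
    (1 row)

find_in_set(string str, string strList) → int
    Description: Returns the position of the first occurrence of str in the comma-separated strList. Returns null if any argument is null.
    select find_in_set('ab', 'abc,b,ab,c,def'); -- 3

format_number(number x, int d) → string
    Description: Formats the number x as '#,###,###.##' with d decimal places and returns the result as a string.
    select format_number(541211.212,2); -- 541,211.21

format(format, args...) → varchar
    Description: See Format.

locate(string substr, string str[, int pos]) → int
    Description: Returns the position of the first occurrence of substr in str at or after position pos. Returns 0 if there is no such occurrence.
    select locate('aaa','bbaaaaa',6); -- 0
    select locate('aaa','bbaaaaa',1); -- 3
    select locate('aaa','bbaaaaa',4); -- 4

length(string) → bigint
    Description: Returns the length of the string.
    select length('hello'); -- 5

levenshtein_distance(string1, string2) → bigint
    Description: Computes the Levenshtein distance between string1 and string2, that is, the minimum number of single-character edits (insertions, deletions or substitutions) needed to change string1 into string2.
    select levenshtein_distance('helo word','hello,world'); -- 3

hamming_distance(string1, string2) → bigint
    Description: Returns the Hamming distance between string1 and string2, that is, the number of positions at which the corresponding characters differ. Note that the two strings must have the same length.
    select hamming_distance('abcde','edcba'); -- 4

instr(string, substring) → bigint
    Description: Returns the position of the first occurrence of substring in string.
    select instr('abcde', 'cd'); -- 3

levenshtein(string1, string2) → bigint
    See levenshtein_distance(string1, string2).

levenshtein_distance(string1, string2) → bigint
    Description: Returns the Levenshtein edit distance between string1 and string2, that is, the minimum number of single-character edits (insertions, deletions or substitutions) needed to change string1 into string2.
    select levenshtein_distance('apple','epplea'); -- 2

lower(string) → varchar
    Description: Converts the string to lowercase.
    select lower('HELLo!'); -- hello!

lcase(string A) → varchar
    Description: Same as lower(string).

ltrim(string) → varchar
    Description: Removes leading spaces from the string.
    select ltrim(' hello'); -- hello

lpad(string, size, padstring) → varchar
    Description: Left-pads the string with padstring to a length of size characters. If size is less than the length of the string, the result is truncated to size characters. size must not be negative, and padstring must not be empty.
    select lpad('myk',5,'dog'); -- domyk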
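The two distance functions are easy to confuse, so the following Python sketch implements both definitions and reproduces the results quoted in the examples above. It is a reference model of the definitions, not the engine's code.

```python
def levenshtein_distance(s1, s2):
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(s2) + 1))  # distances from "" to prefixes of s2
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # distance from s1[:i] to ""
        for j, c2 in enumerate(s2, start=1):
            cost = 0 if c1 == c2 else 1
            current.append(min(
                previous[j] + 1,         # delete c1
                current[j - 1] + 1,      # insert c2
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def hamming_distance(s1, s2):
    """Number of positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("the two strings must have the same length")
    return sum(a != b for a, b in zip(s1, s2))


print(levenshtein_distance("helo word", "hello,world"))  # 3
print(levenshtein_distance("apple", "epplea"))           # 2
print(hamming_distance("abcde", "edcba"))                # 4
```

Levenshtein distance allows the strings to differ in length because it counts insertions and deletions, whereas Hamming distance only compares characters position by position.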

MapReduce Service (MRS) - Modifying Kafka Topic Configurations: Scenario

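Scenario

Users can modify the configuration of Kafka topics from the cluster client as their services require. In a cluster with Kerberos authentication enabled, the user must have permission to manage Kafka topics.

Topic Configs can also be modified through KafkaUI. In security mode, the KafkaUI login user must belong to the "kafkaadmin" user group, or must be granted the corresponding operation permission individually, before modifying Topic Configs; otherwise authentication fails. In non-security mode, KafkaUI does not authenticate any operation.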

MapReduce Service (MRS) Managing Kafka Topics
MapReduce Service (MRS) - DESCRIBE DATABASE | SCHEMA: Example

Example

    CREATE SCHEMA web;
    DESCRIBE SCHEMA web;
                             Describe Schema
    -------------------------------------------------------------------------
     web hdfs://hacluster/user/hive/warehouse/web.db admintest USER
    (1 row)

MapReduce Service (MRS) - DESCRIBE FORMATTED COLUMNS: Example

Example

    describe formatted show_table1 a;
      Describe Formatted Column
    ------------------------------
     col_name        a
     data_type       integer
     min
     max
     num_nulls
     distinct_count  0
     avg_col_len
     max_col_len
     num_trues
     num_falses
     comment
    (1 row)

MapReduce Service (MRS) - SHOW STATS: Description

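Description

Returns approximate statistics for a table, with one row of statistics per column.

    Column                | Description
    ----------------------|--------------------------------------------------------------
    column_name           | Column name (NULL for the summary row)
    data_size             | Total size of all values in the column, in bytes
    distinct_values_count | Number of distinct values in the column
    nulls_fraction        | Fraction of the values in the column that are NULL
    row_count             | Number of rows (returned only for the summary row)
    low_value             | Lowest value found in the column (only for some types)
    high_value            | Highest value found in the column (only for some types)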
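As a rough guide to what the per-column fields mean, the Python sketch below computes exact versions of four of them for an in-memory list of values. The engine returns approximate statistics, so this is only a model of the definitions; the function name `column_stats` and the sample data are hypothetical.

```python
def column_stats(values):
    """Compute exact per-column statistics, with None standing in for NULL."""
    non_null = [v for v in values if v is not None]
    return {
        "distinct_values_count": float(len(set(non_null))),
        "nulls_fraction": (len(values) - len(non_null)) / len(values) if values else None,
        "low_value": min(non_null) if non_null else None,
        "high_value": max(non_null) if non_null else None,
    }


# Hypothetical column with six rows, one of them NULL.
print(column_stats([0, 2, None, 2, 0, 2]))
```

row_count belongs to the table as a whole rather than to a single column, which is why it appears only in the summary row.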

MapReduce Service (MRS) - SHOW STATS: Examples

Examples

    SHOW STATS FOR orders;
    SHOW STATS FOR (SELECT * FROM orders);

Before table nation is analyzed:

    SHOW STATS FOR nation;
     column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
    -------------|-----------|-----------------------|----------------|-----------|-----------|------------
     name        | NULL      | NULL                  | NULL           | NULL      | NULL      | NULL
     regionkey   | NULL      | NULL                  | NULL           | NULL      | NULL      | NULL
     NULL        | NULL      | NULL                  | NULL           | 6.0       | NULL      | NULL
    (3 rows)

After table nation is analyzed:

    Analyze nation;
    ANALYZE: 6 rows

    -- Query the statistics after analysis
    SHOW STATS FOR nation;
     column_name | data_size | distinct_values_count | nulls_fraction | row_count | low_value | high_value
    -------------|-----------|-----------------------|----------------|-----------|-----------|------------
     name        | 45.0      | 5.0                   | 0.0            | NULL      | NULL      | NULL
     regionkey   | NULL      | 2.0                   | 0.0            | NULL      | 0         | 2
     NULL        | NULL      | NULL                  | NULL           | 6.0       | NULL      | NULL
    (3 rows)

MapReduce Service (MRS) - CREATE TABLE: Examples

Examples

Create a new table orders, using the WITH clause to specify the storage format, the storage location, and whether the table is external. The "auto.purge" parameter specifies whether the related data is purged by operations that remove data (such as DROP, DELETE, INSERT OVERWRITE and TRUNCATE TABLE):

- "auto.purge"='true': the metadata and the data files are purged.
- "auto.purge"='false': only the metadata is purged, and the data files are moved to the HDFS recycle bin. The default value is "false". Changing this property is not recommended, because deleted data could then no longer be recovered.

    CREATE TABLE orders (
      orderkey bigint,
      orderstatus varchar,
      totalprice double,
      orderdate date
    )
    WITH (format = 'ORC', location='/user', orc_compress='ZLIB', external=true, "auto.purge"=false);

    -- The DESC FORMATTED statement shows the details of the created table
    desc formatted orders;
                               Describe Formatted Table
    ------------------------------------------------------------------------------
     # col_name      data_type      comment
     orderkey        bigint
     orderstatus     varchar
     totalprice      double
     orderdate       date

     # Detailed Table Information
     Database:       default
     Owner:          admintest
     LastAccessTime: 0
     Location:       hdfs://hacluster/user
     Table Type:     EXTERNAL_TABLE

     # Table Parameters:
       EXTERNAL                TRUE
       auto.purge              false
       orc.compress.size       262144
       orc.compression.codec   ZLIB
       orc.row.index.stride    10000
       orc.stripe.size         67108864
       presto_query_id         20220812_084110_00050_srknk@default@HetuEngine
       presto_version          1.2.0-h0.cbu.mrs.320.r1-SNAPSHOT
       transient_lastDdlTime   1660293670

     # Storage Information
     SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
     InputFormat:    org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
     OutputFormat:   org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
     Compressed:     No
     Num Buckets:    -1
     Bucket Columns: []
     Sort Columns:   []
     Storage Desc Params:
       serialization.format    1
    (1 row)

Create a new table and specify the row format:

    -- Use ',' as the field delimiter of the table (for an external table, the fields of each record in the data files must be separated by commas)
    CREATE TABLE student(
      id string, birthday string,
      grade int,
      memo string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

    -- Use '\t' as the field delimiter and '\n' as the line terminator
    CREATE TABLE test(
      id int,
      name string,
      tel string)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

Create table orders if it does not already exist, and add a table comment and a column comment:

    CREATE TABLE IF NOT EXISTS orders (
      orderkey bigint,
      orderstatus varchar,
      totalprice double COMMENT 'Price in cents.',
      orderdate date
    )
    COMMENT 'A table to keep track of orders.';

    insert into orders values
      (202011181113,'online',9527,date '2020-11-11'),
      (202011181114,'online',666,date '2020-11-11'),
      (202011181115,'online',443,date '2020-11-11'),
      (202011181115,'offline',2896,date '2020-11-11');

Create table bigger_orders using the column definitions of table orders:

    CREATE TABLE bigger_orders (
      another_orderkey bigint,
      LIKE orders,
      another_orderdate date
    );

    SHOW CREATE TABLE bigger_orders;
                                Create Table
    ---------------------------------------------------------------------
     CREATE TABLE hive.default.bigger_orders (
        another_orderkey bigint,
        orderkey bigint,
        orderstatus varchar,
        totalprice double,
        orderdate date,
        another_orderdate date
     )
     WITH (
        external = false,
        format = 'ORC',
        location = 'hdfs://hacluster/user/hive/warehouse/bigger_orders',
        orc_compress = 'GZIP',
        orc_compress_size = 262144,
        orc_row_index_stride = 10000,
        orc_stripe_size = 67108864
     )
    (1 row)

Table creation example for syntax ①:

    CREATE EXTERNAL TABLE hetu_test (orderkey bigint, orderstatus varchar, totalprice double, orderdate date) PARTITIONED BY(ds int) SORT BY (orderkey, orderstatus) COMMENT 'test' STORED AS ORC LOCATION '/user' TBLPROPERTIES (orc_compress = 'SNAPPY', orc_compress_size = 6710422, orc_bloom_filter_columns = 'orderstatus,totalprice');

Table creation example for syntax ②:

    CREATE EXTERNAL TABLE hetu_test1 (orderkey bigint, orderstatus varchar, totalprice double, orderdate date) COMMENT 'test' PARTITIONED BY(ds int) CLUSTERED BY (orderkey, orderstatus) SORTED BY (orderkey, orderstatus) INTO 16 BUCKETS STORED AS ORC LOCATION '/user' TBLPROPERTIES (orc_compress = 'SNAPPY', orc_compress_size = 6710422, orc_bloom_filter_columns = 'orderstatus,totalprice');

Table creation example for syntax ③:

    CREATE TABLE hetu_test2 (orderkey bigint, orderstatus varchar, totalprice double, orderdate date, ds int) COMMENT 'This table is in Hetu syntax' WITH (partitioned_by = ARRAY['ds'], bucketed_by = ARRAY['orderkey', 'orderstatus'], sorted_by = ARRAY['orderkey', 'orderstatus'], bucket_count = 16, orc_compress = 'SNAPPY', orc_compress_size = 6710422, orc_bloom_filter_columns = ARRAY['orderstatus', 'totalprice'], external = true, format = 'orc', location = '/user');

MapReduce Service (MRS) - CREATE TABLE: Description

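Description

CREATE TABLE creates a new, empty table with the specified columns. Use CREATE TABLE AS to create a table with data. With the optional IF NOT EXISTS clause, no error is raised if the table already exists.

The WITH clause sets properties on the newly created table or on individual columns, such as the storage location of the table (location) and whether it is an external table (external).

The LIKE clause includes all column definitions of an existing table in the new table. Multiple LIKE clauses can be specified, which allows columns to be copied from several tables. If INCLUDING PROPERTIES is specified, all table properties are copied to the new table. If the WITH clause specifies a property with the same name as a copied property, the value from the WITH clause is used. The default is EXCLUDING PROPERTIES, and INCLUDING PROPERTIES can be specified for at most one table.

PARTITIONED BY specifies the partition columns. CLUSTERED BY specifies the bucketing columns. SORT BY and SORTED BY sort the specified bucketing columns. BUCKETS specifies the number of buckets. EXTERNAL creates an external table. STORED AS specifies the file storage format. LOCATION specifies the storage path on HDFS.

To see which column properties are supported, run the following command. It lists the column properties supported by each connected catalog.

    SELECT * FROM system.metadata.column_properties;

To see which table properties are supported, run the following command:

    SELECT * FROM system.metadata.table_properties;

The following table shows the query result when the catalog is hive.

    SELECT * FROM system.metadata.table_properties where catalog_name = 'hive';

    catalog_name | property_name                   | default_value | type           | description
    -------------|---------------------------------|---------------|----------------|-------------
    hive         | auto_purge                      | false         | boolean        | Skip trash when table or partition is deleted
    hive         | avro_schema_url                 | -             | varchar        | URI pointing to Avro schema for the table
    hive         | bucket_count                    | 0             | integer        | Number of buckets
    hive         | bucketed_by                     | []            | array(varchar) | Bucketing columns
    hive         | bucketing_version               | -             | integer        | Bucketing version
    hive         | csv_escape                      | -             | varchar        | CSV escape character
    hive         | csv_quote                       | -             | varchar        | CSV quote character
    hive         | csv_separator                   | -             | varchar        | CSV separator character
    hive         | external_location               | -             | varchar        | File system location URI for external table
    hive         | format                          | ORC           | varchar        | Hive storage format for the table. Possible values: [ORC, PARQUET, AVRO, RCBINARY, RCTEXT, SEQUENCEFILE, JSON, TEXTFILE, TEXTFILE_MULTIDELIM, CSV]
    hive         | orc_compress                    | GZIP          | varchar        | Compression codec used. Possible values: [NONE, SNAPPY, LZ4, ZSTD, GZIP, ZLIB]
    hive         | orc_compress_size               | 262144        | bigint         | orc compression size
    hive         | orc_row_index_stride            | 10000         | integer        | no. of row index strides
    hive         | orc_stripe_size                 | 67108864      | bigint         | orc stripe size
    hive         | orc_bloom_filter_columns        | []            | array(varchar) | ORC Bloom filter index columns
    hive         | orc_bloom_filter_fpp            | 0.05          | double         | ORC Bloom filter false positive probability
    hive         | partitioned_by                  | []            | array(varchar) | Partition columns
    hive         | sorted_by                       | []            | array(varchar) | Bucket sorting columns
    hive         | textfile_skip_footer_line_count | -             | integer        | Number of footer lines
    hive         | textfile_skip_header_line_count | -             | integer        | Number of header lines
    hive         | transactional                   | false         | boolean        | Is transactional property enabled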

MapReduce Service (MRS) - CREATE TABLE: Limitations

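Limitations

- The session property bucket_count can be set. Its default value is -1, which means that it is not set. When a partitioned table is created, the default value 16 is used if bucket_count is -1 and the CREATE TABLE statement does not set buckets.
- The default storage location of an external table is /user/hive/warehouse/{schema_name}/{table_name}, where {schema_name} is the schema used when the table is created and {table_name} is the table name.
- Specifying the property "transactional=true" gives the table transactional write capability with atomicity, consistency, isolation and durability. Once a table has been defined as transactional, however, it cannot be downgraded to a non-transactional table by setting "transactional=false".
- Values such as transactional='true' or transactional='0' are not type-converted during execution, so they raise an exception:
      Cannot convert ['true'] to boolean
      Cannot convert ['0'] to boolean
- By default, inserting data into non-managed tables (table property external = true) is not allowed. To use this feature, see the precautions and add the custom Hive property hive.non-managed-table-writes-enabled=true.
- MPPDB limits database identifiers to a maximum length of 63. An identifier that exceeds the maximum length is truncated automatically, and only the leading part up to the maximum length is kept.
- Table creation is not supported in cross-domain scenarios.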

MapReduce Service (MRS) - CREATE TABLE: Syntax

Syntax

①

    CREATE TABLE [ IF NOT EXISTS ] [catalog_name.][db_name.]table_name (
        { column_name data_type [ NOT NULL ] [ COMMENT col_comment ] [ WITH ( property_name = expression [, ...] ) ]
        | LIKE existing_table_name [ { INCLUDING | EXCLUDING } PROPERTIES ] }
        [, ...]
    )
    [ COMMENT table_comment ]
    [ WITH ( property_name = expression [, ...] ) ]

②

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [catalog_name.][db_name.]table_name (
        { column_name data_type [ NOT NULL ] [ COMMENT comment ] [ WITH ( property_name = expression [, ...] ) ]
        | LIKE existing_table_name [ { INCLUDING | EXCLUDING } PROPERTIES ] }
        [, ...]
    )
    [COMMENT 'table_comment']
    [PARTITIONED BY(col_name data_type, ....)]
    [CLUSTERED BY (col_name, col_name, ...) [SORTED BY (col_name, col_name, ...)] INTO num_buckets BUCKETS]
    [ROW FORMAT row_format]
    [STORED AS file_format]
    [LOCATION 'hdfs_path']
    [TBLPROPERTIES (orc_table_property = value [, ...] ) ]

③

    CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [catalog_name.][db_name.]table_name (
        { column_name data_type [ NOT NULL ] [ COMMENT comment ] [ WITH ( property_name = expression [, ...] ) ]
        | LIKE existing_table_name [ { INCLUDING | EXCLUDING } PROPERTIES ] }
        [, ...]
    )
    [PARTITIONED BY(col_name data_type, ....)]
    [SORT BY ([column [, column ...]])]
    [COMMENT 'table_comment']
    [ROW FORMAT row_format]
    [STORED AS file_format]
    [LOCATION 'hdfs_path']
    [TBLPROPERTIES (orc_table_property = value [, ...] ) ]

MapReduce Service (MRS) - SHOW TBLPROPERTIES TABLE|VIEW: Examples

Examples

--View all table properties of show_table1
SHOW TBLPROPERTIES show_table1;
                              SHOW TBLPROPERTIES
-----------------------------------------------------------------------------
 STATS_GENERATED_VIA_STATS_TASK 'workaround for potential lack of HIVE-12730'
 auto.purge 'false'
 numFiles '0'
 numRows '0'
 orc.compress.size '262144'
 orc.compression.codec 'GZIP'
 orc.row.index.stride '10000'
 orc.stripe.size '67108864'
 presto_query_id '20230909_095107_00042_2hwbg@default@HetuEngine'
 presto_version '399'
 rawDataSize '0'
 totalSize '0'
 transient_lastDdlTime '1694253067'
(1 row)

--View the compression codec of show_table1
SHOW TBLPROPERTIES show_table1('orc.compression.codec');
 SHOW TBLPROPERTIES
---------------------
 GZIP
(1 row)

MapReduce Service (MRS) - Binary Functions and Operators: Binary Functions

Binary Functions

length(binary) → bigint
Returns the length of binary in bytes.
select length(x'00141f'); -- 3

concat(binary1, ..., binaryN) → varbinary
Concatenates binary1, binary2, ..., binaryN. This function provides the same functionality as the SQL-standard concatenation operator ||.
select concat(X'32335F',x'00141f'); -- 32 33 5f 00 14 1f

to_base64(binary) → varchar
Encodes binary into a base64 string representation.
select to_base64(CAST('hello world' as binary)); -- aGVsbG8gd29ybGQ=

from_base64(string) → varbinary
Decodes the base64-encoded string into varbinary.
select from_base64('helloworld'); -- 85 e9 65 a3 0a 2b 95

unbase64(string) → varbinary
Decodes the base64-encoded string into varbinary.
select unbase64('helloworld'); -- 85 e9 65 a3 0a 2b 95

to_base64url(binary) → varchar
Encodes binary into a base64 string representation using the URL-safe alphabet.
select to_base64url(x'555555'); -- VVVV

from_base64url(string) → varbinary
Decodes the base64-encoded string into binary data using the URL-safe alphabet.
select from_base64url('helloworld'); -- 85 e9 65 a3 0a 2b 95

to_hex(binary) → varchar
Encodes binary into a hexadecimal string representation.
select to_hex(x'15245F'); -- 15245F

from_hex(string) → varbinary
Decodes the hexadecimal-encoded string into binary data.
select from_hex('FFFF'); -- ff ff

to_big_endian_64(bigint) → varbinary
Encodes a bigint value in 64-bit big-endian two's complement format.
select to_big_endian_64(1234);
          _col0
-------------------------
 00 00 00 00 00 00 04 d2
(1 row)

from_big_endian_64(binary) → bigint
Decodes binary in 64-bit big-endian two's complement format into a bigint value.
select from_big_endian_64(x'00 00 00 00 00 00 04 d2');
 _col0
-------
 1234
(1 row)

to_big_endian_32(integer) → varbinary
Encodes an integer value in 32-bit big-endian two's complement format.
select to_big_endian_32(1999);
    _col0
-------------
 00 00 07 cf
(1 row)

from_big_endian_32(binary) → integer
Decodes binary in 32-bit big-endian two's complement format into an integer value.
select from_big_endian_32(x'00 00 07 cf');
 _col0
-------
 1999
(1 row)

to_ieee754_32(real) → varbinary
Encodes a single-precision floating-point number as a 32-bit big-endian binary block according to IEEE 754.
select to_ieee754_32(3.14);
    _col0
-------------
 40 48 f5 c3
(1 row)

from_ieee754_32(binary) → real
Decodes a 32-bit big-endian binary in IEEE 754 single-precision floating-point format.
select from_ieee754_32(x'40 48 f5 c3');
 _col0
-------
 3.14
(1 row)

to_ieee754_64(double) → varbinary
Encodes a double-precision floating-point number as a 64-bit big-endian binary block according to IEEE 754.
select to_ieee754_64(3.14);
          _col0
-------------------------
 40 09 1e b8 51 eb 85 1f
(1 row)

from_ieee754_64(binary) → double
Decodes a 64-bit big-endian binary in IEEE 754 double-precision floating-point format.
select from_ieee754_64(X'40 09 1e b8 51 eb 85 1f');
 _col0
-------
 3.14
(1 row)

lpad(binary, size, padbinary) → varbinary
Left-pads binary to size bytes with padbinary. If size is less than the length of binary, the result is truncated to size bytes. size must not be negative, and padbinary must not be empty.
select lpad(x'15245F', 11,x'15487F'); -- 15 48 7f 15 48 7f 15 48 15 24 5f

rpad(binary, size, padbinary) → varbinary
Right-pads binary to size bytes with padbinary. If size is less than the length of binary, the result is truncated to size bytes. size must not be negative, and padbinary must not be empty.
SELECT rpad(x'15245F', 11,x'15487F'); -- 15 24 5f 15 48 7f 15 48 7f 15 48

crc32(binary) → bigint
Computes the CRC32 value of binary.

md5(binary) → varbinary
Computes the MD5 hash of binary.

sha1(binary) → varbinary
Computes the SHA-1 hash of binary.

sha2(string, integer) → string
Secure Hash Algorithm 2 (SHA-2) is a family of standard cryptographic hash functions. The output length can be 224, 256, 384, or 512 bits, corresponding to SHA-224, SHA-256, SHA-384, and SHA-512 respectively.

sha256(binary) → varbinary
Computes the SHA-256 hash of binary.

sha512(binary) → varbinary
Computes the SHA-512 hash of binary.

xxhash64(binary) → varbinary
Computes the xxHash64 hash of binary.

spooky_hash_v2_32(binary) → varbinary
Computes the 32-bit SpookyHashV2 hash of binary.

spooky_hash_v2_64(binary) → varbinary
Computes the 64-bit SpookyHashV2 hash of binary.

hmac_md5(binary, key) → varbinary
Computes the HMAC of binary with the given key, using MD5.

hmac_sha1(binary, key) → varbinary
Computes the HMAC of binary with the given key, using SHA-1.

hmac_sha256(binary, key) → varbinary
Computes the HMAC of binary with the given key, using SHA-256.

hmac_sha512(binary, key) → varbinary
Computes the HMAC of binary with the given key, using SHA-512.

The CRC32, MD5, and SHA1 algorithms have been broken by attackers in cryptographic scenarios and are not recommended for use where cryptographic security is required.
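The encode and decode functions above come in pairs, and the hash functions have no examples of their own, so the following sketch shows how they might be combined. It only uses functions listed in this section; the key value 'secret' is hypothetical, and passing the HMAC key as binary is an assumption, since the signature above does not state the key type. No expected hash output is shown.

```sql
-- Round trips: the inner call encodes, the outer call decodes back
-- to the original value.
select from_hex(to_hex(x'15245F'));
select from_base64(to_base64(x'15245F'));
select from_base64url(to_base64url(x'555555'));
select from_big_endian_64(to_big_endian_64(1234));
select from_ieee754_64(to_ieee754_64(3.14));

-- Hash functions return varbinary; wrapping them in to_hex yields a
-- hexadecimal string that is easier to compare or store as text.
select to_hex(sha256(CAST('hello world' as binary)));
select to_hex(xxhash64(CAST('hello world' as binary)));

-- HMAC with a hypothetical key; the key is assumed to be binary here.
select to_hex(hmac_sha256(CAST('hello world' as binary), CAST('secret' as binary)));

-- sha2 takes a string and the desired output length in bits.
select sha2('hello world', 256);
```

Given the note above about CRC32, MD5, and SHA1, the sketch uses SHA-256 based functions for anything security related.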
