Hive Functions

2024-03-08

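1. Hive Function Categories

By input and output, Hive functions fall into three categories: standard functions, aggregate functions, and table-generating functions.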

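Standard functions: take one or more columns of a single row as input and return a single value.

A standard function returns exactly one value, whose type is either a primitive or a complex data type, for example cast().

Aggregate functions: take zero or more columns from multiple rows as input and return a single value.

Aggregate functions are often combined with a group by clause, for example sum(), count(), and max().

Table-generating functions: accept zero or more inputs and produce multiple columns or multiple rows of output, for example explode().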
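
The three categories can be told apart by how many rows go in and come out. A minimal sketch, assuming a table named table_name with a column col (both are placeholders, not real objects):

```sql
-- Standard function: one input row in, one value out
select upper('hive');

-- Aggregate function: many rows in, one value out
-- (table_name and col are placeholder names)
select count(col) from table_name;

-- Table-generating function: one input in, several rows out
select explode(array(1, 2, 3)) as n;
```

The last query returns three rows, one per array element.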

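1.1 Viewing functions
- show functions lists every function loaded in the current Hive session, both built-in and user-defined.
- desc function function_name and desc function extended function_name show the description of the named function; the extended keyword prints more detail.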
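
As a short illustration of these commands, using upper only as an arbitrary built-in function to look up:

```sql
-- List all functions available in the current session
show functions;

-- Short description of one function
desc function upper;

-- Longer description of the same function
desc function extended upper;
```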
  
1.2 Calling functions

A function is called by writing its name in a query and passing arguments. Calls can appear in both the select and where clauses. Three typical cases:

(1) select concat(cola, colb) as x from table_name;

(2) select concat('abc', 'def');
  
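(3) select * from table_name where length(col) > 0;

2.1 String functions

instr

Returns the 1-based position of the first occurrence of a substring in a string: instr(string str, string substr)
```
  hive> select instr('abcd','a');
  OK
  1
```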
length

Returns the length of a string: length(string a)
```
  select length('abc');
  3
```
trim

Removes leading and trailing spaces from a string, like the trim method in Java.
```
  -- Result: sfssf sdf sdfds
  select trim(' sfssf sdf sdfds '); 
```
upper

Converts all letters in a string to uppercase: upper(string a)
```
  select upper(concat_ws('', customer_fname, customer_lname)) as fullname from customers limit 10;
```
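lower

Converts all letters in a string to lowercase: lower(string a)

substr

Returns the substring that starts at the given position and has the given length. The length is optional; by default the substring runs to the end of the string.

substr(string a, int start, [int length ])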
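
A few literal calls make the behaviour of lower and substr concrete. Note that substr counts positions from 1; the string literals are arbitrary sample values:

```sql
select lower('Hive SQL');        -- hive sql
select substr('abcdef', 2, 3);   -- bcd  (start at position 2, take 3 characters)
select substr('abcdef', 3);      -- cdef (no length: take everything to the end)
```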
  
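2.2 Type conversion functions
- cast(expr as type) converts the data type of expr to type and returns null if the conversion fails. Common target types:
```
  TINYINT / SMALLINT / INT / BIGINT   integer types
  FLOAT / DOUBLE                      floating-point types
  DECIMAL(p, s)                       fixed-precision decimal
  STRING / VARCHAR(n) / CHAR(n)       character types
  DATE                                date
  TIMESTAMP                           date and time
```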
```
  hive> select cast(round(9/3) as int);
  OK
  3
```
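
The null-on-failure behaviour of cast is easiest to see with string literals. A small sketch with arbitrary sample values:

```sql
select cast('123' as int) + 1;       -- 124: the string is parsed as an integer
select cast('abc' as int);           -- NULL: the string is not a number
select cast('2020-09-10' as date);   -- 2020-09-10 as a DATE value
```

Because a failed cast yields null rather than an error, bad input can pass silently into later calculations, so it is worth checking for nulls after casting untrusted columns.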
2.3 Aggregate functions

An aggregate function computes over a group of rows and returns a single value. Commonly used aggregate functions:

count() sum() max() min() avg()
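
A hedged sketch of these functions used together with group by. The order_items table and its columns follow the order example used elsewhere in this article; the exact schema is an assumption:

```sql
-- One output row per order, summarising its line items
-- (order_items, order_item_order_id and order_item_subtotal are assumed names)
select
  order_item_order_id,
  count(*)                                   as item_cnt,
  sum(cast(order_item_subtotal as double))   as total_amount,
  max(cast(order_item_subtotal as double))   as max_item,
  min(cast(order_item_subtotal as double))   as min_item,
  avg(cast(order_item_subtotal as double))   as avg_item
from order_items
group by order_item_order_id
limit 10;
```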
    
2.4 Math functions

round

round(double a) returns a rounded to the nearest integer (BIGINT) value.

round(double a, int d) returns a rounded to d decimal places.
```
  select round(4/3),round(4/3,2);
```
ceil

Returns the smallest integer that is not less than the given number, that is, rounds up.

ceil(double a), ceiling(double a)
```
  select ceil(4/3),ceiling(4/3);
  2	2
```
floor

Rounds the given number down to the nearest integer.

floor(double a)
```
  select floor(4/3);
```
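
The difference between ceil and floor shows most clearly on negative numbers, where "up" means towards zero and "down" means away from it. A small sketch with arbitrary literals:

```sql
select ceil(1.2),  floor(1.2);    -- 2, 1
select ceil(-1.2), floor(-1.2);   -- -1, -2
```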
Example: round the total amount of each order to two decimal places.
```
  select orders.order_id, round(sum(cast(order_items.order_item_subtotal as float)),2)
  from orders join order_items on orders.order_id = order_items.order_item_order_id
  group by orders.order_id limit 10;
```
2.5 Date functions

from_unixtime

from_unixtime(bigint unixtime[, string format])

Converts a Unix timestamp in seconds to a string in the given format, for example "yyyy-MM-dd HH:mm:ss", "yyyy-MM-dd HH", or "yyyy-MM-dd HH:mm". HH is the 24-hour clock; hh is the 12-hour clock.
```
  select from_unixtime(1599898989,'yyyy-MM-dd') as dt
```
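unix_timestamp

unix_timestamp(): returns the current Unix timestamp in seconds

unix_timestamp(string date): returns the Unix timestamp of the given time string (default format yyyy-MM-dd HH:mm:ss)

It can be combined with from_unixtime to reformat dates, or used to compute the difference between two times.
```
  select 
   unix_timestamp() as cur_ts, -- current timestamp
   unix_timestamp('2020-09-01 12:03:22') as special_ts, -- timestamp of the given time
   from_unixtime(unix_timestamp(),'yyyy-MM-dd')  as cur_date -- current date
```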
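
The time-difference use mentioned above can be sketched by subtracting two timestamps. The two time strings are arbitrary sample values and the column aliases are illustrative:

```sql
-- Difference between two times on the same day, in seconds and in minutes
select
  unix_timestamp('2020-09-01 12:03:22')
    - unix_timestamp('2020-09-01 12:00:00')         as diff_seconds,   -- 202
  (unix_timestamp('2020-09-01 12:03:22')
    - unix_timestamp('2020-09-01 12:00:00')) / 60   as diff_minutes;
```

Both strings use the default yyyy-MM-dd HH:mm:ss format, so no format argument is needed.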
to_date

to_date(string timestamp)

Returns the date part of a time string.
```
  -- Result: 2020-09-10
  select to_date('2020-09-10 10:31:31') 
```
year

year(string date)

Returns the year part of a time string.
```
  -- Result: 2020
  select year('2020-09-02')
```
month

month(string date)

Returns the month part of a time string as an integer.
```
  -- Result: 9
  select month('2020-09-10')
```
day

day(string date)

Returns the day-of-month part of a time string.
```
  -- Result: 10
  select day('2002-09-10')
```
date_add

date_add(string startdate, int days)

Returns the date that is days days after startdate.
```
  -- the date one week from today
  select date_add(current_date, 7);
  -- the date one week ago
  select date_add(current_date, -7);
```
date_sub

date_sub(string startdate, int days)

Returns the date that is days days before startdate.
```
  -- the date one week from today
  select date_sub(current_date, -7);
  -- the date one week ago
  select date_sub(current_date, 7);
```
Example: count orders per month.
```
  select from_unixtime(unix_timestamp(order_date), "yyyy-MM") as year_month,
  count(order_id) from orders 
  group by from_unixtime(unix_timestamp(order_date), "yyyy-MM")
```
2.6 Conditional functions

if

if(boolean testCondition, T valueTrue, T valueFalseOrNull): a simple conditional.

Returns valueTrue when testCondition is true, otherwise valueFalseOrNull.
```
  -- flag whether the row belongs to user1
  select 
    distinct user_id,
    if(user_id='user1',true,false) as flag
  from wedw_tmp.tmp_url_info 
```
case when

CASE a WHEN b THEN c [WHEN d THEN e] [ELSE f] END

Returns c if a = b, e if a = d, and f otherwise. For example, CASE 4 WHEN 5 THEN 5 WHEN 4 THEN 4 ELSE 3 END returns 4.

Compared with if, I personally prefer case when.
```
  -- same example as for if above
  select 
    distinct user_id,
    case when user_id='user1' then 'true'
       when user_id='user2' then 'test'
    else 'false' end  as flag
  from wedw_tmp.tmp_url_info 
```
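
The syntax line above shows the simple form (CASE a WHEN b ...), while the user_id query uses the searched form (CASE WHEN condition ...). A short sketch showing that the two forms give the same result for the article's own literal example:

```sql
-- Simple form: compare one expression against several values
select case 4 when 5 then 5 when 4 then 4 else 3 end;          -- 4

-- Searched form: each branch carries its own boolean condition
select case when 4 = 5 then 5 when 4 = 4 then 4 else 3 end;    -- 4
```

The searched form is the more general one, since each branch can test a different column or use a range comparison.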
coalesce

COALESCE(T v1, T v2, …)

Returns the first non-null value, or NULL if all arguments are NULL.
```
  -- Combined with lead or lag, this function maps well onto real business needs; here lead fetches the value from 3 rows later as the fallback for the current row
  select 
    user_id,
    visit_time,
    rn,
    lead_time,
    coalesce(visit_time,lead_time) as has_time
  from 
  (
    select
    user_id,
    visit_time,
    visit_cnt,
    row_number() over(partition by user_id order by visit_date desc) as rn,
    lead(visit_time,3) over(partition by user_id order by visit_date desc) as lead_time
    from  wedw_tmp.tmp_url_info
    order by user_id
  )t;
```
```
  hive> select coalesce(null,'aa');
  OK
  aa
```
Example: classify products into 3 levels by price (0~100, 100~200, and above 200) and count the products in each level.
```
  select level, count(*) from (select *, case when product_price
```
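
The query above is cut off in the source. As a minimal sketch of what a complete version could look like, assuming a products table with a numeric product_price column: the table name, the level labels, and the choice to put a price of exactly 100 or 200 into the higher level are all assumptions, not the author's original code.

```sql
-- Bucket each product by price, then count products per bucket
select level, count(*) as product_cnt
from (
  select
    case
      when product_price < 100 then '0~100'
      when product_price < 200 then '100~200'
      else '200+'              -- also catches NULL prices
    end as level
  from products                -- assumed table name
) t
group by level;
```

If product_price is stored as a string, it would need a cast to a numeric type before the comparisons.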
