云社区 博客 博客详情
云社区 博客 博客详情

DWS日期时间类型格式设置及不同格式时间数据入库指导

DWS_Jack 发表于 2020-06-23 08:51:46 06-23 08:51
DWS_Jack 发表于 2020-06-23 08:51:46 2020/06/23
1
0

【摘要】 DWS日期时间类型支持不同的格式,通过设置DateStyle参数可以修改日期时间数据显示格式,提供了不同格式日期时间类型数据入库方法,以及利用to_char对日期时间类型数据的格式化输出。

使用DWS过程中,经常遇到不同格式的时间类型数据入库失败, DWS默认的日期时间显示格式是yyyy-mm-dd hh24:mi:ss.us,其他格式的日期时间数据入库会导致报错, DWS支持不同的日期时间显示风格,需要对数据做格式化处理才能正常入库。

对于日期时间类型的显示,需要了解DWS中的配置参数DateStyle

postgres=# show DateStyle;
 DateStyle 
-----------
 ISO, MDY
(1 row)

DateStyle参数详解

DateStyle参数中,分为两部分,一部分是控制日期/时间的输出格式,一部分是控制day/month/year的显示顺序。

PostgreSQL官方文档中描述如下:

The output format of the date/time types can be set to one of the four styles ISO 8601,SQL(Ingres), traditional POSTGRES (Unix date format), or German. The default is theISOformat. (TheSQLstandard requires the use of the ISO 8601 format. The name of the "SQL" output format is a historical accident.) Table 8-14 shows examples of each output style. The output of the date and time types is generally only the date or time part in accordance with the given examples. However, the POSTGRES style outputs date-only values inISOformat.

Table 8-14. Date/Time Output Styles

Style Specification

Description

Example

ISO

ISO   8601, SQL standard

1997-12-17 07:37:16-08

SQL

traditional   style

12/17/1997 07:37:16.00 PST

Postgres

original   style

Wed Dec 17 07:37:16 1997 PST

German

regional   style

17.12.1997 07:37:16.00 PST

Note: ISO 8601 specifies the use of uppercase letter T to separate the date and time. PostgreSQL accepts that format on input, but on output it uses a space rather than T, as shown above. This is for readability and for consistency with RFC 3339 as well as some other database systems.

In theSQLand POSTGRES styles, day appears before month if DMY field ordering has been specified, otherwise month appears before day. (See Section 8.5.1 for how this setting also affects interpretation of input values.) Table 8-15 shows examples.

Table 8-15. Date Order Conventions

datestyle    Setting

Input    Ordering

Example    Output

SQL, DMY

day/month/year

17/12/1997 15:37:16.00 CET

SQL, MDY

month/day/year

12/17/1997 07:37:16.00 PST

Postgres, DMY

day/month/year

Wed 17 Dec 07:37:16 1997 PST

通过一些探索性的测试验证,发现一些更为细节的东西,比如day/month/year的顺序只能为YMD/DMY/MDY month/day的顺序还可以使用EuropeanUS来控制。详细情况测试如下:

默认DateStyle设置下的行为如下:

postgres=# show DateStyle;
 DateStyle 
-----------
 ISO, MDY
(1 row)
 
postgres=# select current_timestamp;
        pg_systimestamp        
-------------------------------
 2020-06-22 20:07:51.148858+08
(1 row)

ISO格式的day/month/year顺序是不受MDY参数控制,也就是说无论怎么设置,顺序都是yyyy-mm-dd

而同样,在GERMAN格式下,格式始终是dd.mm.yyyy

postgres=# set DateStyle to 'GERMAN, MDY';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 22.06.2020 20:18:45.135697 CST
(1 row)

Postgres格式下,不管怎么设置,year始终在最后,只能变换monthday的位置。

postgres=# set DateStyle to 'Postgres, DMY';
SET
postgres=# select current_timestamp;
           pg_systimestamp           
-------------------------------------
 Mon 22 Jun 20:10:01.270965 2020 CST
(1 row)
postgres=# set DateStyle to 'Postgres, YMD';
SET
postgres=# select current_timestamp;
           pg_systimestamp           
-------------------------------------
 Mon Jun 22 20:14:13.967454 2020 CST
(1 row)
 
postgres=# set DateStyle to 'Postgres, MDY';
SET
postgres=# select current_timestamp;
           pg_systimestamp           
-------------------------------------
 Mon Jun 22 20:29:37.870996 2020 CST
(1 row)

对于SQL格式来说,跟Postgres格式类似,不管怎么设置,year始终在最后,只能变换monthday的位置。

postgres=# set DateStyle to 'SQL, YMD';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 06/22/2020 20:26:01.411167 CST
(1 row)
 
postgres=# set DateStyle to 'SQL, DMY';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 22/06/2020 20:26:11.314706 CST
(1 row)
 
postgres=# set DateStyle to 'SQL, MDY';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 06/22/2020 20:27:23.056398 CST
(1 row)

SQLPostgres格式下,可以通过设置EuropeanUS来设置monthday的位置,European代表day/monthUS代表month/day

postgres=# set DateStyle to 'SQL, European';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 22/06/2020 20:31:06.073123 CST
(1 row)
 
postgres=# set DateStyle to 'SQL, US';
SET
postgres=# select current_timestamp;
        pg_systimestamp         
--------------------------------
 06/22/2020 20:31:12.113224 CST
(1 row)
 
postgres=# set DateStyle to 'Postgres, US';
SET
postgres=# select current_timestamp;
          pg_systimestamp           
------------------------------------
 Mon Jun 22 20:31:34.13653 2020 CST
(1 row)
 
postgres=# set DateStyle to 'Postgres, European';
SET
postgres=# select current_timestamp;
           pg_systimestamp           
-------------------------------------
 Mon 22 Jun 20:31:40.816756 2020 CST
(1 row)

另外,NonEuropeanUS一致,还可以用NonEuroEuro缩写来进行设置。

不同格式时间数据入库

假如有时间类型数据无法入库,例如

cat test.dat
22/06/2020 20:57:20

导入时报错

postgres=# copy test from '/home/jack/test.dat';
ERROR:  date/time field value out of range: "22/06/2020 20:57:20"
HINT:  Perhaps you need a different "datestyle" setting.
CONTEXT:  COPY test, line 1, column time: "22/06/2020 20:57:20"

第一种方案可以在session中临时修改DateStyle,然后再执行入库:

postgres=# set DateStyle to 'SQL, European';
SET
postgres=# copy test from '/home/jack/test.dat';
COPY 1
postgres=# select * from test;
        time         
---------------------
 22/06/2020 20:57:20
(1 row)

还有一种方案,需要借助于copy以及GDS/OBS导入外表的参数DATE_FORMAT|TIME_FORMAT|TIMESTAMP_FORMAT| SMALLDATETIME_FORMAT,例如timestamp类型的列,通过指定TIMESTAMP_FORMAT格式,执行入库。

postgres=# copy test from '/home/jack/test.dat' with(timestamp_format 'DD/MM/YYYY HH24:MI:SS');
COPY 1
postgres=# select * from test;
        time         
---------------------
 2020-06-22 20:57:20
(1 row)

时间数据类型格式化

在不更改DateStyle参数的情况下,如果需要调整日期时间显示格式,可以借助于to_char格式化命令。

postgres=# select to_char(sysdate,'Dy DD Mon YYYY HH24:MI:SS');
         to_char          
--------------------------
 Mon 22 Jun 2020 20:56:07
(1 row)
 
postgres=# select to_char(sysdate,'DD.MM.YYYY HH24:MI:SS');
       to_char       
---------------------
 22.06.2020 20:57:00
(1 row)
 
postgres=# select to_char(sysdate,'DD/MM/YYYY HH24:MI:SS');
       to_char       
---------------------
 22/06/2020 20:57:20
(1 row)

对于需要带微秒、时区、十二小时制的时间格式化来说,举例如下:

postgres=# select to_char(current_timestamp, 'Dy DD Mon YYYY HH:MI:SS.US TZ AM');
                to_char                 
----------------------------------------
 Mon 22 Jun 2020 09:45:36.404909 CST PM
(1 row)

【注意】由于DWS默认时区为UTC,与国内的东八区有差异,建议在使用时间类型数据时关注时区,尤其是时间类型数据作为分区表的分区字段的场景,DWS支持的时间类型支持timestamp with time zone(timestamptz),可以避免客户端服务端不同时区导致的数据差异。

PostgreSQL官方文档中,to_char支持的格式化Patterns很多,有兴趣可以研究一下:

In a to_char output template string, there are certain patterns that are recognized and replaced with appropriately-formatted data based on the given value. Any text that is not a template pattern is simply copied verbatim. Similarly, in an input template string (for the other functions), template patterns identify the values to be supplied by the input data string.

Table 9-22 shows the template patterns available for formatting date and time values.

Table 9-22. Template Patterns for Date/Time Formatting

Pattern

Description

HH

hour   of day (01-12)

HH12

hour   of day (01-12)

HH24

hour   of day (00-23)

MI

minute   (00-59)

SS

second   (00-59)

MS

millisecond   (000-999)

US

microsecond   (000000-999999)

SSSS

seconds   past midnight (0-86399)

AM, am, PM   or pm

meridiem   indicator (without periods)

A.M., a.m., P.M.   or p.m.

meridiem   indicator (with periods)

Y,YYY

year   (4 or more digits) with comma

YYYY

year   (4 or more digits)

YYY

last   3 digits of year

YY

last   2 digits of year

Y

last   digit of year

IYYY

ISO   8601 week-numbering year (4 or more digits)

IYY

last   3 digits of ISO 8601 week-numbering year

IY

last   2 digits of ISO 8601 week-numbering year

I

last   digit of ISO 8601 week-numbering year

BC, bc, AD   or ad

era   indicator (without periods)

B.C., b.c., A.D.   or a.d.

era   indicator (with periods)

MONTH

full   upper case month name (blank-padded to 9 chars)

Month

full   capitalized month name (blank-padded to 9 chars)

month

full   lower case month name (blank-padded to 9 chars)

MON

abbreviated   upper case month name (3 chars in English, localized lengths vary)

Mon

abbreviated   capitalized month name (3 chars in English, localized lengths vary)

mon

abbreviated   lower case month name (3 chars in English, localized lengths vary)

MM

month   number (01-12)

DAY

full   upper case day name (blank-padded to 9 chars)

Day

full   capitalized day name (blank-padded to 9 chars)

day

full   lower case day name (blank-padded to 9 chars)

DY

abbreviated   upper case day name (3 chars in English, localized lengths vary)

Dy

abbreviated   capitalized day name (3 chars in English, localized lengths vary)

dy

abbreviated   lower case day name (3 chars in English, localized lengths vary)

DDD

day   of year (001-366)

IDDD

day   of ISO 8601 week-numbering year (001-371; day 1 of the year is Monday of the   first ISO week)

DD

day   of month (01-31)

D

day   of the week, Sunday (1) to Saturday (7)

ID

ISO   8601 day of the week, Monday (1) to Sunday (7)

W

week   of month (1-5) (the first week starts on the first day of the month)

WW

week   number of year (1-53) (the first week starts on the first day of the year)

IW

week   number of ISO 8601 week-numbering year (01-53; the first Thursday of the year   is in week 1)

CC

century   (2 digits) (the twenty-first century starts on 2001-01-01)

J

Julian   Day (integer days since November 24, 4714 BC at midnight UTC)

Q

quarter   (ignored by to_date and to_timestamp)

RM

month   in upper case Roman numerals (I-XII; I=January)

rm

month   in lower case Roman numerals (i-xii; i=January)

TZ

upper   case time-zone abbreviation (only supported in to_char)

tz

lower   case time-zone abbreviation (only supported in to_char)

OF

time-zone   offset from UTC (only supported in to_char)

对于时间类型格式化这块,DWS保持了PostgreSQL的兼容性,更详细的格式化Patterns以及使用方法,可以参考PostgreSQL文档。

更多精彩内容,请关注华为云开发者社区论坛:   https://bbs.huaweicloud.com/forum/forum-598-1.html     及微信公众号华为云AI”。

登录后可下载附件,请登录或者注册

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区),文章链接,文章作者等基本信息,否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件至:huaweicloud.bbs@huawei.com进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容。
评论文章 //点赞 收藏 0
点赞
分享文章到微博
分享文章到朋友圈

评论 (1)


彩虹上的水瓶座

1楼2020-06-23 11:14:00
写的很详细,很实用
0/1000
评论

登录后可评论,请 登录注册

评论

您还没有写博客的权限!

温馨提示

您确认删除评论吗?

确定
取消
温馨提示

您确认删除评论吗?

删除操作无法恢复,请谨慎操作。

确定
取消
温馨提示

您确认删除博客吗?

确定
取消

确认删除

您确认删除博客吗?

确认删除

您确认删除评论吗?

温馨提示

登录超时或用户已下线,请重新登录!!!

确定
取消