Getting the year from an unknown date format with Python

I'm querying a server for specific data and need to extract the year from the date field it returns, but the format of that field varies, for example:

2009
2009-10-8
2009-10
2017-10-22
2017-10

The obvious approach is to split the date into a list and take the maximum (but this has a problem):

year = max(d.split('-'))

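For some reason this gives wrong results: for 2017-10-22, 22 is treated as larger than 2017. Also, if a future call to the server returns a date stored as "2019/10/20", splitting on '-' will break as well.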

死肥仔

The problem is that, while 2017 > 22, '2017' < '22' because it's a string comparison. You could do this to resolve that:

year = max(map(int, d.split('-')))
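To make the difference concrete, here is a minimal sketch. The helper name max_numeric_part is made up for illustration, and splitting on both '-' and '/' is an assumption based on the "2019/10/20" case mentioned in the question. It also assumes the year is always the largest number in the string, which holds for four-digit years but not in general.

```python
import re

def max_numeric_part(d):
    """Return the largest integer among the parts of d split on '-' or '/'.

    Hypothetical helper: assumes the year is the largest number present.
    """
    return max(int(part) for part in re.split(r"[-/]", d))

d = "2017-10-22"

# Strings compare character by character, so '22' beats '2017'
# ('2' == '2', then '2' > '0').
print(max(d.split("-")))

# After converting to int the comparison is numeric.
print(max_numeric_part(d))
print(max_numeric_part("2019/10/20"))
```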

But instead, if you don't mind being frowned upon by the Long Now Foundation, consider using a regular expression to extract any 4-digit number:

import re

# Find the first standalone run of exactly four digits
match = re.search(r'\b\d{4}\b', d)
if match:
    year = int(match.group(0))
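
Wrapped in a function, the regex approach might look like the sketch below. The name extract_year and the choice to return None when nothing matches are assumptions for illustration, not part of the original question. The sample values are the ones from the question, plus the "2019/10/20" variant.

```python
import re

# Exactly four digits with a word boundary on each side
YEAR_RE = re.compile(r"\b\d{4}\b")

def extract_year(d):
    """Return the first standalone 4-digit number in d as an int, or None."""
    match = YEAR_RE.search(d)
    return int(match.group(0)) if match else None

samples = ["2009", "2009-10-8", "2009-10", "2017-10-22", "2017-10", "2019/10/20"]
for s in samples:
    print(s, "->", extract_year(s))
```

Because the pattern does not care about the separator, it works for both '-' and '/' without any splitting. It takes the first 4-digit run it finds, so a format with the year at the end (say, 22-10-2017) would still work, but a string with no separators at all (say, 20171022) would return None.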