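A while ago I had the following requirement:
Group data by time, for example into 5-minute or 30-minute groups. The start time is rounded down first: if its minute is 30 or greater, set it to 30; if it is less than 30, set it to 0 (more complex logic is also possible).
Of course there are also some statistics, analysis, and calculations in between, which we will ignore for now.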
Getting the start time
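from datetime import datetime
from time import mktime


def get_start_date(start_date):
    # start_date is a timestamp in milliseconds; convert it to a local datetime
    time = datetime.fromtimestamp(int(start_date) / 1000)
    # Round the minute down to 0 or 30
    if time.minute >= 30:
        minute = 30
    else:
        minute = 0
    start_date = datetime(time.year, time.month, time.day, time.hour, minute, 0)
    # Convert the datetime back to a timestamp in milliseconds
    start_date = mktime(start_date.timetuple()) * 1000
    return start_date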
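Getting the start time is simple. The one thing to watch is setting the minute: the timestamp has to be converted to a datetime, the minute is set, and then the datetime is converted back to a timestamp: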
time = datetime.fromtimestamp(int(start_date) / 1000)
start_date = datetime(time.year, time.month, time.day, time.hour, minute, 0)
start_date = mktime(start_date.timetuple()) * 1000
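The "more complex logic" mentioned at the start could, for instance, round down to any step instead of only 0 or 30. Here is a minimal sketch of that idea. The name floor_to_minutes and the step_minutes parameter are my own for illustration, and it assumes the step divides 60 evenly and that timestamps are in milliseconds and interpreted in local time, as in get_start_date.

```python
from datetime import datetime


def floor_to_minutes(timestamp_ms, step_minutes=30):
    """Round a millisecond timestamp down to a step_minutes boundary (local time).

    Hypothetical helper, not part of the original code. Assumes step_minutes
    divides 60 evenly (5, 10, 15, 30, ...).
    """
    dt = datetime.fromtimestamp(int(timestamp_ms) / 1000)
    # Drop the remainder of the minute and clear seconds and microseconds
    floored = dt.replace(minute=dt.minute - dt.minute % step_minutes,
                         second=0, microsecond=0)
    # datetime.timestamp() treats a naive datetime as local time, like mktime
    return int(floored.timestamp() * 1000)
```

With step_minutes=30 this should behave like get_start_date, except that it returns an int rather than a float.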
Grouping by time
Here is the source code directly:
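def gen_date_group(start_date, datas, interval):
    # datas is a sequence of records carrying a timestamp ('date'), in time order
    # interval is the group length, in the same unit as the timestamps (milliseconds)
    interval = int(interval)
    # Get the start time
    start_date = get_start_date(start_date)
    end_date = start_date + interval
    count = 0
    type_count = 0
    for data in datas:
        if data['date'] >= end_date:
            # The current group is complete; get_count_average is defined elsewhere
            yield (start_date, get_count_average(type_count, count))
            # Some statistics and calculations: start accumulating the next group
            type_count = data['type_count']
            count = data['count']
            start_date = end_date
            end_date = start_date + interval
            # Note: even if datas has no records in some interval,
            # keep looping so that the empty group is still emitted
            while data['date'] >= end_date:
                yield (start_date, 0)
                start_date = end_date
                end_date = start_date + interval
        else:
            type_count += data['type_count']
            count += data['count']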
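To see the grouping in action, here is a small self-contained sketch of the same idea that can be run as is. It is a variant, not the original function: the name group_by_interval is mine, get_count_average is a hypothetical stand-in (the real helper is not shown in this post), the start time is assumed to be already aligned, and the records are assumed to be sorted by 'date'. It also adds a final yield for the group that is still open when the data runs out, which the loop alone does not emit.

```python
def get_count_average(type_count, count):
    # Hypothetical stand-in for the helper that is not shown in the post
    return type_count / count if count else 0


def group_by_interval(start_ms, datas, interval_ms):
    """Yield (group_start, average) pairs; groups without data yield 0."""
    group_start = start_ms
    group_end = group_start + interval_ms
    type_count = count = 0
    for data in datas:  # assumed sorted by 'date', ascending
        # Close every group that ends at or before this record, empty or not
        while data['date'] >= group_end:
            yield (group_start, get_count_average(type_count, count))
            type_count = count = 0
            group_start, group_end = group_end, group_end + interval_ms
        type_count += data['type_count']
        count += data['count']
    # Emit the group that was still open when the data ran out
    yield (group_start, get_count_average(type_count, count))


sample = [
    {'date': 60000, 'type_count': 4, 'count': 2},
    {'date': 120000, 'type_count': 2, 'count': 1},
    {'date': 1000000, 'type_count': 9, 'count': 3},
]
result = list(group_by_interval(0, sample, 5 * 60 * 1000))
print(result)
# [(0, 2.0), (300000, 0), (600000, 0), (900000, 3.0)]
```

The two groups starting at 300000 and 600000 contain no records but are still emitted with 0, which is the behaviour the inner while loop in gen_date_group is there to guarantee.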