Improve duplicate-event handling and TBD management v3.7

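Main improvements:
- Improve TBD match ID generation: TBD vs TBD matches use a timestamp to generate a unique ID, avoiding duplicates
- Automatically delete superseded TBD events: remove the placeholder once the teams are confirmed
- Clean up duplicate matches: keep the completed match in preference and delete unfinished duplicates
- Strengthen duplicate detection: group events into 30-minute time windows and automatically remove duplicates at the same time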
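
The ID rule and the 30-minute grouping described in the list above can be sketched in isolation. This is a simplified illustration, not the code from this commit: `match_id` and `half_hour_bucket` are hypothetical stand-alone helpers, "Example Cup" is made-up data, and the real method also falls back to the date when no tournament is present.

```python
import hashlib
from datetime import datetime, timezone


def match_id(match):
    """Build a 16-character ID for a match dict (hypothetical helper)."""
    parts = []
    if match.get("team1") == "TBD" and match.get("team2") == "TBD":
        # Placeholder vs placeholder: the start time is the only thing
        # that tells two such matches apart, so it goes into the ID.
        parts.append(str(match["datetime"]))
    else:
        # Known teams: leave the time out so a reschedule keeps the same ID.
        parts.extend([match["team1"], match["team2"]])
    parts.append(match.get("tournament", ""))
    return hashlib.md5("_".join(parts).encode()).hexdigest()[:16]


def half_hour_bucket(dt):
    """Key that is equal for events starting in the same half-hour slot."""
    return (dt.year, dt.month, dt.day, dt.hour, dt.minute // 30)


final_a = {"team1": "TBD", "team2": "TBD", "tournament": "Example Cup",
           "datetime": datetime(2025, 9, 8, 10, 0, tzinfo=timezone.utc)}
final_b = dict(final_a, datetime=datetime(2025, 9, 8, 13, 0, tzinfo=timezone.utc))
print(match_id(final_a) != match_id(final_b))  # True: different start times

print(half_hour_bucket(datetime(2025, 9, 8, 10, 5)) ==
      half_hour_bucket(datetime(2025, 9, 8, 10, 25)))  # True: same slot
print(half_hour_bucket(datetime(2025, 9, 8, 10, 29)) ==
      half_hour_bucket(datetime(2025, 9, 8, 10, 31)))  # False: slot boundary
```

Note that `minute // 30` produces a fixed half-hour grid rather than a sliding window, so two events only two minutes apart can still land in different groups when they straddle :00 or :30.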

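Issues fixed:
- Fixed duplicate placeholder events such as XG vs TBD
- Fixed duplicate records for matches such as NGX vs Liquid
- Improved the cleanup logic for expired TBD events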
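
The duplicate-record fix keeps one event per pairing and prefers a completed one. A minimal sketch of that selection follows, assuming completed events carry a "✓" marker in their summary (inferred from the score-stripping regex in the diff, not confirmed elsewhere). `split_keeper` and the sample events are hypothetical.

```python
def split_keeper(events, done_marker="✓"):
    """Return (event_to_keep, events_to_delete) for one group of duplicates."""
    def sort_key(event):
        completed = done_marker in event.get("summary", "")
        start = event["start"].get("dateTime", event["start"].get("date"))
        # False sorts before True, so completed events come first;
        # ties are broken by the earlier start string.
        return (not completed, start)

    ordered = sorted(events, key=sort_key)
    return ordered[0], ordered[1:]


duplicates = [
    {"id": "a", "summary": "NGX vs Liquid [Example Cup]",
     "start": {"dateTime": "2025-09-07T10:00:00Z"}},
    {"id": "b", "summary": "✓ 2-1 NGX vs Liquid [Example Cup]",
     "start": {"dateTime": "2025-09-07T12:00:00Z"}},
]
keeper, to_delete = split_keeper(duplicates)
print(keeper["id"], [e["id"] for e in to_delete])  # b ['a']
```

Sorting on a tuple keeps the preference order explicit: completion status decides first, and start time only matters among events with the same status.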

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Ching L 2025-09-08 10:50:06 +08:00
parent 56a79c8f9d
commit 3efc50deee
3 changed files with 202 additions and 44 deletions


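@@ -1,5 +1,25 @@
 # Changelog
+## v3.7 - 2025-09-08 - Improved duplicate-event handling and TBD management
+- **Improved TBD match ID generation**
+  - TBD vs TBD matches now use a timestamp to generate a unique ID, which avoids creating duplicates
+  - Regular matches still build the ID from the team + tournament combination
+- **Automatic deletion of superseded TBD events**
+  - When a confirmed match exists in the same time slot, the corresponding TBD placeholder is deleted automatically
+  - For example, once XG vs Tundra is confirmed, the XG vs TBD event is deleted automatically
+  - Events at nearby times are matched using a 30-minute time window
+- **Duplicate match cleanup**
+  - Automatically detects duplicate matches between the same teams on the same day
+  - Keeps the completed match in preference and deletes unfinished duplicate events
+  - Fixes duplicate entries for matches such as NGX vs Liquid
+- **Enhanced duplicate-event detection**
+  - Groups events into 30-minute time windows
+  - Automatically deletes duplicate TBD events at the same time
+  - Improves the cleanup logic for expired TBD events
+- **Improved TBD event update logic**
+  - Only attempts to update a TBD event when the teams are confirmed, so TBD vs TBD entries are not matched against each other
+  - Keeps the 1-hour time window for TBD event matching
 ## v3.6 - Improved handling of TBD matches
 - **Relaxed time-matching condition for TBD matches**
   - Widened the time window for TBD event matching from 5 minutes to 1 hour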
@@ -110,6 +130,8 @@
 | v3.3 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
 | v3.4 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
 | v3.5 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
+| v3.6 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
+| v3.7 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
 ## Usage recommendations


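@@ -1,6 +1,6 @@
-# Dota 2 Calendar Sync v3.6
+# Dota 2 Calendar Sync v3.7
-Automatically fetches Dota 2 Tier 1 match information from Liquipedia and syncs it to Google Calendar, with automatic updates for match results and time changes and smart management of TBD placeholder events
+Automatically fetches Dota 2 Tier 1 match information from Liquipedia and syncs it to Google Calendar, with automatic updates for match results and time changes, smart management of TBD placeholder events, and automatic cleanup of duplicate matches
 ## Features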
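@@ -9,7 +9,8 @@
 - Automatically creates Google Calendar events
 - **Automatically updates the results and scores of completed matches**
 - **Detects and updates match time changes** (synced automatically when the schedule is adjusted)
-- **Smart management of TBD placeholder events** (updates team information automatically, deletes expired events)
+- **Smart management of TBD placeholder events** (updates team information automatically, deletes expired and superseded events)
+- **Automatic cleanup of duplicate matches** (keeps the completed match record in preference)
 - Avoids re-adding matches that already exist
 - Supports a dry-run mode for testing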
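@@ -83,7 +84,9 @@ python sync_dota2_matches.py --dry-run
    - Extracts the match format (Bo1, Bo3, Bo5)
    - **Smart deduplication**: TBD matches at the same time and in the same round keep only one representative
    - **TBD match protection**: ensures TBD vs TBD matches are never wrongly marked as completed
-   - **Improved TBD matching**: time-matching window relaxed to 1 hour to better handle schedule adjustments
+   - **Improved TBD matching**: matches within a 1-hour time window to better handle schedule adjustments
+   - **Duplicate match cleanup**: automatically detects and deletes duplicate events for the same teams
+   - **Automatic TBD event deletion**: deletes the corresponding TBD placeholder once the teams are confirmed
 2. **Calendar event management**
    - Sets the match duration automatically (estimated from the Bo format)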


@@ -294,16 +294,25 @@ class Dota2CalendarSync:
         # Use teams and tournament for ID (not datetime to handle reschedules)
         id_parts = []
-        if 'team1' in match_data:
-            id_parts.append(match_data['team1'])
-        if 'team2' in match_data:
-            id_parts.append(match_data['team2'])
-        if 'tournament' in match_data:
-            id_parts.append(match_data['tournament'])
-        else:
-            # Fall back to date if no tournament
+        # For TBD vs TBD matches, include datetime to make them unique
+        if match_data.get('team1') == 'TBD' and match_data.get('team2') == 'TBD':
+            # Include datetime for TBD matches to avoid duplicates
             if 'datetime' in match_data:
-                id_parts.append(str(match_data['datetime'].date()))
+                id_parts.append(str(match_data['datetime']))
+            if 'tournament' in match_data:
+                id_parts.append(match_data['tournament'])
+        else:
+            # Normal matches: use teams and tournament
+            if 'team1' in match_data:
+                id_parts.append(match_data['team1'])
+            if 'team2' in match_data:
+                id_parts.append(match_data['team2'])
+            if 'tournament' in match_data:
+                id_parts.append(match_data['tournament'])
+            else:
+                # Fall back to date if no tournament
+                if 'datetime' in match_data:
+                    id_parts.append(str(match_data['datetime'].date()))
         unique_string = '_'.join(id_parts)
         return hashlib.md5(unique_string.encode()).hexdigest()[:16]
@@ -780,18 +789,21 @@ class Dota2CalendarSync:
             # Special handling for TBD matches that might have been updated
             # Look for TBD events at the same time and tournament
             if not existing_event and '_by_match' in existing_events:
-                # Check if this match used to be TBD
-                for event_key, event in existing_events['_by_match'].items():
-                    if 'TBD_TBD' in event_key and tournament in event_key:
-                        # Check if time matches
-                        event_start = event['start'].get('dateTime', event['start'].get('date'))
-                        event_dt = datetime.fromisoformat(event_start.replace('Z', '+00:00'))
-                        # Relaxed time matching: within 1 hour (3600 seconds)
-                        if abs((event_dt - match_time).total_seconds()) < 3600:  # Within 1 hour
-                            existing_event = event
-                            print(f" → Found TBD match to update: {team1} vs {team2}")
-                            print(f" Time difference: {abs((event_dt - match_time).total_seconds())/60:.0f} minutes")
-                            break
+                # Only look for TBD to update if current match is NOT TBD vs TBD
+                # (we don't want to match TBD vs TBD with other TBD vs TBD)
+                if not (team1 == 'TBD' and team2 == 'TBD'):
+                    # Check if this match used to be TBD
+                    for event_key, event in existing_events['_by_match'].items():
+                        if 'TBD_TBD' in event_key and tournament in event_key:
+                            # Check if time matches (within 1 hour)
+                            event_start = event['start'].get('dateTime', event['start'].get('date'))
+                            event_dt = datetime.fromisoformat(event_start.replace('Z', '+00:00'))
+                            # Relaxed time matching: within 1 hour (3600 seconds)
+                            if abs((event_dt - match_time).total_seconds()) < 3600:  # Within 1 hour
+                                existing_event = event
+                                print(f" → Found TBD match to update: {team1} vs {team2}")
+                                print(f" Time difference: {abs((event_dt - match_time).total_seconds())/60:.0f} minutes")
+                                break
             if existing_event:
                 # Check if this is a TBD match that now has team names
@@ -961,40 +973,161 @@ class Dota2CalendarSync:
                     error_count += 1
         # Delete old TBD events that are past and not updated
+        # Also check for duplicate TBD events at the same time
+        # Also delete TBD events when a confirmed match exists at the same time
         if delete_old_tbd and not dry_run:
-            print("\nChecking for expired TBD events to delete...")
+            print("\nChecking for expired, duplicate, and superseded TBD events to delete...")
             print("-" * 30)
-            # Get all TBD events again to check which ones to delete
+            # Group all events by time to find duplicates and superseded TBD events
+            events_by_time = {}
+            tbd_by_time = {}
+            # Get all events and group by time
             for key, event in existing_events.items():
                 if key == '_by_match':
                     continue
                 summary = event.get('summary', '')
-                if 'TBD vs TBD' in summary:
-                    event_id = event['id']
+                event_id = event['id']
+                # Skip if this event was updated
+                if event_id in updated_tbd_events:
+                    continue
+                # Get event time
+                event_start = event['start'].get('dateTime', event['start'].get('date'))
+                event_dt = datetime.fromisoformat(event_start.replace('Z', '+00:00'))
+                # Use 30-minute window for "same time"
+                time_key = (event_dt.year, event_dt.month, event_dt.day,
+                            event_dt.hour, event_dt.minute // 30)
+                if time_key not in events_by_time:
+                    events_by_time[time_key] = {'tbd': [], 'confirmed': []}
+                # Categorize events
+                if 'vs TBD' in summary or 'TBD vs' in summary:
+                    events_by_time[time_key]['tbd'].append(event)
-                    # Skip if this event was updated
-                    if event_id in updated_tbd_events:
-                        continue
+                    # Also check if it's expired (for TBD vs TBD only)
+                    if 'TBD vs TBD' in summary and event_dt < now - timedelta(hours=2):
+                        if self.delete_calendar_event(event_id):
+                            print(f"🗑️ Deleted expired TBD event: {summary} ({event_dt.strftime('%Y-%m-%d %H:%M UTC')})")
+                            deleted_tbd_count += 1
+                            time.sleep(0.2)
+                        else:
+                            print(f"✗ Failed to delete TBD event: {summary}")
+                            error_count += 1
+                        continue  # Don't process this event further
-                    # Check if event is in the past
+                    # Track non-expired TBD events
+                    if 'TBD vs TBD' in summary:
+                        simple_time_key = event_dt.strftime('%Y-%m-%d %H:%M')
+                        if simple_time_key not in tbd_by_time:
+                            tbd_by_time[simple_time_key] = []
+                        tbd_by_time[simple_time_key].append(event)
+                else:
+                    events_by_time[time_key]['confirmed'].append(event)
+            # Delete TBD events that have been superseded by confirmed matches
+            for time_key, events in events_by_time.items():
+                if events['confirmed'] and events['tbd']:
+                    # We have both confirmed and TBD events at the same time
+                    for tbd_event in events['tbd']:
+                        tbd_summary = tbd_event.get('summary', '')
+                        # Extract team from TBD event
+                        team_match = re.search(r'(\w+)\s+vs\s+TBD|TBD\s+vs\s+(\w+)', tbd_summary)
+                        if team_match:
+                            team_in_tbd = team_match.group(1) or team_match.group(2)
+                            # Check if this team has a confirmed match
+                            for confirmed_event in events['confirmed']:
+                                confirmed_summary = confirmed_event.get('summary', '')
+                                if team_in_tbd and team_in_tbd in confirmed_summary:
+                                    # This TBD event has been superseded
+                                    if self.delete_calendar_event(tbd_event['id']):
+                                        print(f"🗑️ Deleted superseded TBD event: {tbd_summary}")
+                                        print(f" Replaced by: {confirmed_summary}")
+                                        deleted_tbd_count += 1
+                                        time.sleep(0.2)
+                                    else:
+                                        print(f"✗ Failed to delete TBD event: {tbd_summary}")
+                                        error_count += 1
+                                    break
+            # Delete duplicate TBD vs TBD events at the same time
+            for time_key, events in tbd_by_time.items():
+                if len(events) > 1:
+                    print(f"Found {len(events)} duplicate TBD events at {time_key}")
+                    # Keep the first one, delete the rest
+                    for event in events[1:]:
+                        if self.delete_calendar_event(event['id']):
+                            print(f"🗑️ Deleted duplicate TBD event: {event['summary']}")
+                            deleted_tbd_count += 1
+                            time.sleep(0.2)
+                        else:
+                            print(f"✗ Failed to delete duplicate TBD event: {event['summary']}")
+                            error_count += 1
+            # Also check for duplicate matches with different completion states
+            # Group matches by teams and date (not exact time)
+            matches_by_teams_date = {}
+            for key, event in existing_events.items():
+                if key == '_by_match':
+                    continue
+                summary = event.get('summary', '')
+                # Skip TBD matches
+                if 'TBD' in summary:
+                    continue
+                # Extract teams from summary
+                # Remove completion markers and scores
+                clean_summary = re.sub(r'^✓\s*\d+[-:]\d+\s*', '', summary)
+                clean_summary = re.sub(r'^\d+[-:]\d+\s*', '', clean_summary)
+                # Extract teams
+                teams_match = re.search(r'([\w\s]+)\s+vs\s+([\w\s]+)\s*\[', clean_summary)
+                if teams_match:
+                    team1 = teams_match.group(1).strip()
+                    team2 = teams_match.group(2).strip()
+                    # Get date
                     event_start = event['start'].get('dateTime', event['start'].get('date'))
                     event_dt = datetime.fromisoformat(event_start.replace('Z', '+00:00'))
+                    date_key = event_dt.strftime('%Y-%m-%d')
-                    # If event is more than 2 hours in the past, delete it
-                    if event_dt < now - timedelta(hours=2):
-                        if dry_run:
-                            print(f"◯ Would delete expired TBD event: {summary} ({event_dt.strftime('%Y-%m-%d %H:%M UTC')})")
+                    # Create key for this match
+                    match_key = f"{min(team1, team2)}_vs_{max(team1, team2)}_{date_key}"
+                    if match_key not in matches_by_teams_date:
+                        matches_by_teams_date[match_key] = []
+                    matches_by_teams_date[match_key].append(event)
+            # Delete duplicates, keeping the completed one
+            for match_key, events in matches_by_teams_date.items():
+                if len(events) > 1:
+                    # Sort by completion status (completed first) and time
+                    def sort_key(e):
+                        summary = e.get('summary', '')
+                        is_completed = '' in summary
+                        event_start = e['start'].get('dateTime', e['start'].get('date'))
+                        return (not is_completed, event_start)  # Completed first, then by time
+                    sorted_events = sorted(events, key=sort_key)
+                    # Keep the first (preferably completed) event
+                    print(f"Found {len(events)} duplicate matches: {sorted_events[0]['summary']}")
+                    for event in sorted_events[1:]:
+                        if self.delete_calendar_event(event['id']):
+                            print(f"🗑️ Deleted duplicate match: {event['summary']}")
                             deleted_tbd_count += 1
+                            time.sleep(0.2)
                         else:
-                            if self.delete_calendar_event(event_id):
-                                print(f"🗑️ Deleted expired TBD event: {summary} ({event_dt.strftime('%Y-%m-%d %H:%M UTC')})")
-                                deleted_tbd_count += 1
-                                time.sleep(0.2)
-                            else:
-                                print(f"✗ Failed to delete TBD event: {summary}")
-                                error_count += 1
+                            print(f"✗ Failed to delete duplicate: {event['summary']}")
+                            error_count += 1
         # Summary
         print("\n" + "="*50)