Commit dfe40ac by Evgueni Poloukarov · 1 Parent(s): e5f4fec

fix: adjust run_date to ensure future data exists in dataset


- Changed run_date calculation in smoke_test.py and full_inference.py
- smoke_test.py: run_date = max_date - 168 hours (7-day forecast)
- full_inference.py: run_date = max_date - 336 hours (14-day forecast)
- Ensures forecast window (Sept 24-30 for smoke_test.py, Sept 17-30 for full_inference.py) has data in dataset
- Fixes empty future_df bug that caused smoke test failure

Note: This is for smoke test validation within Sept data.
Later: proper Oct holdout validation (run_date=Sept 30, forecast Oct 1-14)
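
The run_date arithmetic described above can be checked with a minimal sketch. It assumes only what the commit states: hourly data ending at 2025-09-30 23:00, and a forecast window running from run_date + 1 hour to run_date + prediction_hours. The helper names `latest_run_date` and `forecast_range` are hypothetical and do not appear in the repository.

```python
from datetime import datetime, timedelta


def latest_run_date(max_date: datetime, prediction_hours: int) -> datetime:
    """Latest run_date whose forecast window still ends inside the dataset."""
    return max_date - timedelta(hours=prediction_hours)


def forecast_range(run_date: datetime, prediction_hours: int):
    """First and last forecast timestamps (D+1 hour through the full horizon)."""
    first = run_date + timedelta(hours=1)
    last = run_date + timedelta(hours=prediction_hours)
    return first, last


# Dataset end as stated in the commit comments.
max_date = datetime(2025, 9, 30, 23, 0)

for script, hours in [("smoke_test.py", 168), ("full_inference.py", 336)]:
    run_date = latest_run_date(max_date, hours)
    first, last = forecast_range(run_date, hours)
    print(f"{script}: run_date={run_date}, forecast {first} to {last}")
```

Under these assumptions the 168-hour window covers Sept 24-30 and the 336-hour window covers Sept 17-30, both ending exactly at the last timestamp in the dataset.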

Files changed (2)
  1. full_inference.py +7 -1
  2. smoke_test.py +8 -3
full_inference.py CHANGED
@@ -81,13 +81,19 @@ print(f" Borders: {', '.join(borders[:5])}... (showing first 5)")
 
 # Step 3: Prepare forecast parameters
 print("\n[3/7] Setting up forecast parameters...")
-run_date = df['timestamp'].max()
+# Use a date that has 14 days of future data available
+# Dataset ends at 2025-09-30 23:00, so we need run_date such that
+# forecast ends at most at 2025-09-30 23:00
+# For 336 hours (14 days), run_date should be at most 2025-09-16 23:00
 context_hours = 512
 prediction_hours = 336 # 14 days (fixed)
+max_date = df['timestamp'].max()
+run_date = max_date - timedelta(hours=prediction_hours)
 
 print(f" Run date: {run_date}")
 print(f" Context window: {context_hours} hours")
 print(f" Prediction horizon: {prediction_hours} hours (14 days, D+1 to D+14)")
+print(f" Forecast range: {run_date + timedelta(hours=1)} to {run_date + timedelta(hours=prediction_hours)}")
 
 # Initialize DynamicForecast once for all borders
 forecaster = DynamicForecast(
smoke_test.py CHANGED
@@ -82,14 +82,19 @@ print(f"[*] Test border: {test_border}")
 
 # Step 3: Prepare test data with DynamicForecast
 print("\n[3/6] Preparing test data...")
-# Use last available date as forecast date (Sept 30, 23:00)
-run_date = df['timestamp'].max()
-context_hours = 512
+# Use a date that has 7 days of future data available
+# Dataset ends at 2025-09-30 23:00, so we need run_date such that
+# forecast ends at most at 2025-09-30 23:00
+# For 168 hours (7 days), run_date should be at most 2025-09-23 23:00
 prediction_hours = 168 # 7 days
+max_date = df['timestamp'].max()
+run_date = max_date - timedelta(hours=prediction_hours)
+context_hours = 512
 
 print(f" Run date: {run_date}")
 print(f" Context: {context_hours} hours (historical)")
 print(f" Forecast: {prediction_hours} hours (7 days, D+1 to D+7)")
+print(f" Forecast range: {run_date + timedelta(hours=1)} to {run_date + timedelta(hours=prediction_hours)}")
 
 # Initialize DynamicForecast
 forecaster = DynamicForecast(
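
The empty future_df failure named in the commit message can be illustrated with a small guard. This is a sketch under assumptions, since the diff does not show how DynamicForecast builds its future frame: it assumes future rows are those with run_date < timestamp <= run_date + prediction_hours on an hourly index. The helpers `count_future_rows` and `check_future_coverage` are hypothetical, and the timestamps are synthetic.

```python
from datetime import datetime, timedelta


def count_future_rows(timestamps, run_date, prediction_hours):
    """Count timestamps in the half-open window (run_date, run_date + horizon]."""
    end = run_date + timedelta(hours=prediction_hours)
    return sum(1 for ts in timestamps if run_date < ts <= end)


def check_future_coverage(timestamps, run_date, prediction_hours):
    """Raise early if the dataset does not cover the whole forecast window."""
    found = count_future_rows(timestamps, run_date, prediction_hours)
    if found < prediction_hours:
        raise ValueError(
            f"forecast window needs {prediction_hours} hourly rows after "
            f"{run_date}, but only {found} exist"
        )
    return found


# Synthetic hourly index for September 2025 (assumed, for illustration only).
start = datetime(2025, 9, 1, 0, 0)
timestamps = [start + timedelta(hours=h) for h in range(30 * 24)]
max_date = max(timestamps)

# Old behaviour: run_date at the last timestamp leaves no future rows.
old_rows = count_future_rows(timestamps, max_date, 168)

# New behaviour: shifting run_date back by the horizon covers the full window.
new_rows = check_future_coverage(timestamps, max_date - timedelta(hours=168), 168)

print(f"old run_date: {old_rows} future rows, new run_date: {new_rows} future rows")
```

A check of this kind fails with an explicit message when run_date is too late, instead of letting an empty frame propagate into the forecaster.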