2023-04-25
§
|
15:22 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update |
[production] |
15:22 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update |
[production] |
15:22 |
<claime> |
Datacenter Service Switchback concluded - T335015 |
[production] |
15:21 |
<cgoubert@deploy1002> |
Synchronized README: check the deployment server after switchback - T335015 (duration: 19m 55s) |
[production] |
15:19 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors |
[production] |
15:19 |
<cgoubert@cumin2002> |
START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors |
[production] |
15:19 |
<cgoubert@cumin2002> |
START - Cookbook sre.discovery.service-route depool restbase-async in eqiad: T335015 |
[production] |
15:18 |
<claime> |
Restoring restbase-async to codfw only - T335015 |
[production] |
15:18 |
<cgoubert@deploy1002> |
Finished deploy [restbase/deploy@a08f56d]: (no justification provided) (duration: 13m 06s) |
[production] |
15:08 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS bullseye |
[production] |
15:05 |
<cgoubert@deploy1002> |
Started deploy [restbase/deploy@a08f56d]: (no justification provided) |
[production] |
15:02 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
15:02 |
<inflatador> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
15:02 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) |
[production] |
15:02 |
<btullis@cumin1001> |
Added views for new wiki: vewikimedia T330704 |
[production] |
15:01 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
15:01 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) |
[production] |
15:01 |
<btullis@cumin1001> |
Added views for new wiki: ckbwiktionary T331834 |
[production] |
15:01 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
15:00 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) |
[production] |
15:00 |
<btullis@cumin1001> |
Added views for new wiki: fatwiki T335018 |
[production] |
15:00 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
15:00 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) |
[production] |
15:00 |
<btullis@cumin1001> |
Added views for new wiki: kcgwiktionary T334739 |
[production] |
15:00 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
14:59 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) |
[production] |
14:59 |
<btullis@cumin1001> |
Added views for new wiki: guwwikinews T334408 |
[production] |
14:59 |
<btullis@cumin1001> |
START - Cookbook sre.wikireplicas.add-wiki |
[production] |
14:58 |
<bking@deploy1002> |
Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 07m 38s) |
[production] |
14:54 |
<cgoubert@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: Datacenter Service Switchback - T335015 (duration: 81m 19s) |
[production] |
14:51 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage |
[production] |
14:50 |
<bking@deploy1002> |
Started deploy [wdqs/wdqs@0e051d8]: 0.3.123 |
[production] |
14:48 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage |
[production] |
14:45 |
<claime> |
Running authdns-update - T335015 |
[production] |
14:45 |
<inflatador> |
[WDQS Deploy] Gearing up for deploy of wdqs `0.3.123`. Pre-deploy tests passing on canary `wdqs1003` |
[production] |
14:44 |
<claime> |
Switch deployment server back to eqiad - T335015 |
[production] |
14:43 |
<claime> |
All active/active services repooled in codfw - T335015 |
[production] |
14:43 |
<cgoubert@cumin1001> |
END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in codfw: Datacenter Services Switchback - T335015 |
[production] |
14:36 |
<herron@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS bullseye |
[production] |
14:35 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS bullseye |
[production] |
14:26 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: Datacenter Services Switchback - T335015 |
[production] |
14:26 |
<claime> |
All services pooled in eqiad, all depooled in codfw, proceeding with repooling active/active services in codfw - T335015 |
[production] |
14:25 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None |
[production] |
14:25 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.datacenter status all services in all: None - None |
[production] |
14:24 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Services Switchback - T335015 |
[production] |
14:19 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015 |
[production] |
14:19 |
<herron@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage |
[production] |
14:18 |
<cgoubert@cumin1001> |
END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) depool all services in codfw: Datacenter Services Switchback - T335015 |
[production] |
14:16 |
<herron@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage |
[production] |
14:04 |
<cgoubert@cumin1001> |
START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015 |
[production] |