2023-03-01
15:06 |
<hashar> |
Restarting Apache on Gerrit host |
[production] |
15:04 |
<root@cumin1001> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1005'] |
[production] |
15:02 |
<hnowlan@deploy2002> |
helmfile [staging] START helmfile.d/services/thumbor: apply |
[production] |
14:57 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:aqs-eqiad |
[production] |
14:52 |
<dcaro@cumin1001> |
END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd1005 |
[production] |
14:45 |
<jmm@cumin2002> |
START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs-eqiad |
[production] |
14:45 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:aqs-codfw |
[production] |
14:45 |
<dcaro@cumin1001> |
START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd1005 |
[production] |
14:34 |
<filippo@cumin1001> |
conftool action : set/pooled=no; selector: name=thanos-fe2002.codfw.wmnet,service=thanos-web |
[production] |
14:33 |
<elukey@cumin2002> |
START - Cookbook sre.hosts.reimage for host ml-serve1006.eqiad.wmnet with OS bullseye |
[production] |
14:32 |
<jmm@cumin2002> |
START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs-codfw |
[production] |
14:30 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.aqs.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:aqs-canary |
[production] |
14:30 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1008.eqiad.wmnet with OS bullseye |
[production] |
14:30 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1006.eqiad.wmnet with OS bullseye |
[production] |
14:29 |
<jmm@cumin2002> |
START - Cookbook sre.aqs.roll-restart-reboot rolling restart_daemons on A:aqs-canary |
[production] |
14:27 |
<taavi> |
re-start persistRevisionThreadItems.php on itwiki from P44912 after DC switchover T315510 |
[production] |
14:27 |
<claime> |
End mediawiki datacenter switchover - T327920 |
[production] |
14:26 |
<cgoubert@deploy2002> |
Finished scap: Backport for [[gerrit:892428|debug.json: List primary DC servers first (T327920)]] (duration: 07m 54s) |
[production] |
14:20 |
<cgoubert@deploy2002> |
cgoubert: Backport for [[gerrit:892428|debug.json: List primary DC servers first (T327920)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet |
[production] |
14:18 |
<cgoubert@deploy2002> |
Started scap: Backport for [[gerrit:892428|debug.json: List primary DC servers first (T327920)]] |
[production] |
14:16 |
<claime> |
Removing scap lock - T327920 |
[production] |
14:15 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) |
[production] |
14:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Reduce db2122 weight', diff saved to https://phabricator.wikimedia.org/P44913 and previous config saved to /var/cache/conftool/dbconfig/20230301-141414-marostegui.json |
[production] |
14:10 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters |
[production] |
14:09 |
<claime> |
Phase 9.5 DNS records for new database masters updated - T327920 |
[production] |
14:08 |
<claime> |
Phase 9.5 Update DNS records for new database masters - T327920 |
[production] |
14:07 |
<taavi> |
test |
[production] |
14:06 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) |
[production] |
14:05 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.09-restore-ttl |
[production] |
14:05 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) |
[production] |
14:03 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-start-maintenance |
[production] |
14:02 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners (exit_code=0) |
[production] |
14:02 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners |
[production] |
14:02 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) |
[production] |
14:02 |
<cgoubert@cumin1001> |
MediaWiki read-only period ends at: 2023-03-01 14:02:09.272468 |
[production] |
14:02 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.07-set-readwrite |
[production] |
14:02 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) |
[production] |
14:01 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite |
[production] |
14:01 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) |
[production] |
14:01 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki |
[production] |
14:01 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) |
[production] |
14:00 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly |
[production] |
14:00 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) |
[production] |
14:00 |
<cgoubert@cumin1001> |
MediaWiki read-only period starts at: 2023-03-01 14:00:10.075167 |
[production] |
14:00 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.02-set-readonly |
[production] |
13:56 |
<elukey@cumin1001> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1007.eqiad.wmnet with OS bullseye |
[production] |
13:52 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reimage for host ml-serve1007.eqiad.wmnet with OS bullseye |
[production] |
13:52 |
<cgoubert@cumin1001> |
END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) |
[production] |
13:51 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1007.eqiad.wmnet with OS bullseye |
[production] |
13:51 |
<cgoubert@cumin1001> |
START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance |
[production] |