2021-03-09
|
23:59 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1002.eqiad.wmnet with reason: REIMAGE |
[production] |
23:58 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1001.eqiad.wmnet with reason: REIMAGE |
[production] |
22:04 |
<mutante> |
phab1001 - manually running phab public task dump script after making changes to redirect stdout |
[production] |
20:42 |
<elukey> |
reimaged an-worker1091 to buster |
[production] |
20:41 |
<bstorm> |
depooled labsdb1009 T276980 |
[production] |
20:25 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1091.eqiad.wmnet with reason: REIMAGE |
[production] |
20:25 |
<bstorm> |
downtimed labsdb1009 so it doesn't keep paging T276980 |
[production] |
20:23 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1091.eqiad.wmnet with reason: REIMAGE |
[production] |
20:09 |
<brennen> |
train status: 1.36.0-wmf.34 (T274938) on group0 at 20:06:32 UTC; logs initially quiet. |
[production] |
20:06 |
<brennen@deploy1002> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.34 |
[production] |
19:05 |
<brennen@deploy1002> |
Pruned MediaWiki: 1.36.0-wmf.31 (duration: 03m 34s) |
[production] |
19:04 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
18:59 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
18:54 |
<brennen@deploy1002> |
Finished scap: testwikis wikis to 1.36.0-wmf.34 (duration: 47m 25s) |
[production] |
18:52 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1087.eqiad.wmnet with reason: REIMAGE |
[production] |
18:49 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1087.eqiad.wmnet with reason: REIMAGE |
[production] |
18:47 |
<dcausse> |
re-pool wdqs1004 |
[production] |
18:37 |
<mbsantos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
18:35 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
18:34 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
18:29 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
18:26 |
<elukey> |
reimage an-worker1087 to buster |
[production] |
18:16 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
18:13 |
<mbsantos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . |
[production] |
18:12 |
<brennen@deploy1002> |
Started scap: testwikis wikis to 1.36.0-wmf.34 |
[production] |
18:10 |
<mbsantos@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
18:05 |
<mbsantos@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
18:03 |
<mbsantos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
18:02 |
<marxarelli> |
deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) |
[production] |
18:02 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1085.eqiad.wmnet with reason: REIMAGE |
[production] |
18:00 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1085.eqiad.wmnet with reason: REIMAGE |
[production] |
17:50 |
<papaul> |
rebooting db2073 for firmware upgrade |
[production] |
17:01 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on analytics1077.eqiad.wmnet with reason: REIMAGE |
[production] |
17:00 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 3119d7a703a38b328fa634db64b2929d54829884: sqwiki: Fix deployment of Growth features (duration: 01m 00s) |
[production] |
16:59 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on analytics1077.eqiad.wmnet with reason: REIMAGE |
[production] |
16:46 |
<pt1979@cumin2001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:41 |
<pt1979@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
16:40 |
<elukey> |
reimage analytics1077 to buster |
[production] |
16:33 |
<aborrero@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1027.eqiad.wmnet |
[production] |
16:32 |
<jayme@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
16:31 |
<jayme@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
16:31 |
<brennen> |
1.36.0-wmf.34 was branched at e175899921535f83e168145cbe942489475607db for T274938 |
[production] |
16:27 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host cloudvirt1027.eqiad.wmnet |
[production] |
16:21 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P14708 and previous config saved to /var/cache/conftool/dbconfig/20210309-162116-root.json |
[production] |
16:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1175 (re)pooling @ 80%: 10', diff saved to https://phabricator.wikimedia.org/P14707 and previous config saved to /var/cache/conftool/dbconfig/20210309-160613-root.json |
[production] |
15:56 |
<moritzm> |
imported prometheus-ircd-exporter 0.2 to apt.wikimedia.org T224579 |
[production] |
15:51 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1175 (re)pooling @ 60%: 10', diff saved to https://phabricator.wikimedia.org/P14706 and previous config saved to /var/cache/conftool/dbconfig/20210309-155109-root.json |
[production] |
15:45 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on analytics1072.eqiad.wmnet with reason: REIMAGE |
[production] |
15:43 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on analytics1072.eqiad.wmnet with reason: REIMAGE |
[production] |
15:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repooling db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P14705 and previous config saved to /var/cache/conftool/dbconfig/20210309-153715-root.json |
[production] |