2021-01-21
18:34 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2373.codfw.wmnet with reason: REIMAGE [production]
18:33 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2371.codfw.wmnet with reason: REIMAGE [production]
18:21 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:20 <Urbanecm> Start StewardBot and SULWatcher again, maintenance over (T269609) [tools.stewardbots]
18:14 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
18:14 <Urbanecm> Stopping StewardBot and SULWatcher for maintenance (T269609) [tools.stewardbots]
18:12 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:08 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
18:08 <pt1979@cumin2001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]
18:02 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
17:42 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
17:36 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
17:35 <ryankemper> [wdqs] Depooled `wdqs1013` to allow it to catch up on lag [production]
16:49 <ottomata> installed libsnappy-dev and python3-snappy on webperf1001 [analytics]
16:27 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main'. [production]
16:14 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1001.eqiad.wmnet [production]
16:09 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host krb1001.eqiad.wmnet [production]
16:05 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2001.codfw.wmnet [production]
15:59 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host krb2001.codfw.wmnet [production]
15:29 <bstorm> pushed the maintain-kubeusers:beta tag with the new code to the docker repo T271847 [toolsbeta]
15:17 <joal> Kill mediawiki-wikitext-history-wf-2020-12 as it was stuck and failed [analytics]
15:13 <moritzm> installing cairo security updates on stretch [production]
15:12 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main'. [production]
15:11 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main'. [production]
14:22 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5001.wikimedia.org [production]
14:17 <godog> roll-restart swift-object in eqiad to apply new concurrency [production]
14:14 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host bast5001.wikimedia.org [production]
14:13 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4002.wikimedia.org [production]
14:08 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host bast4002.wikimedia.org [production]
14:06 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3004.wikimedia.org [production]
13:54 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host bast3004.wikimedia.org [production]
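The jmm@cumin2001 entries above show a rolling reboot: each bastion's sre.hosts.reboot-single run reaches END (PASS) before the next one starts, so only one bastion is ever down at a time. A minimal dry-run sketch of that sequencing, using the hostnames from the log — the `cookbook` command line printed here is illustrative, not the exact CLI:

```shell
# Dry-run sketch: print each reboot-single invocation in the order logged.
# Hostnames are the real ones from the log; the command syntax is illustrative.
for h in bast3004.wikimedia.org bast4002.wikimedia.org bast5001.wikimedia.org; do
  echo "cookbook sre.hosts.reboot-single ${h}"
  # In the log, each host reaches END (PASS) before the next START,
  # keeping the other bastions reachable throughout.
done
```

Running the hosts strictly in series trades speed for safety: a reboot that hangs stops the sequence instead of taking out several bastions at once.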
13:38 <XioNoX> put eqiad/esams lumen link back in service [production]
12:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db1085 (re)pooling @ 100%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13872 and previous config saved to /var/cache/conftool/dbconfig/20210121-122043-root.json [production]
12:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db1085 (re)pooling @ 75%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13871 and previous config saved to /var/cache/conftool/dbconfig/20210121-120540-root.json [production]
11:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db1085 (re)pooling @ 50%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13870 and previous config saved to /var/cache/conftool/dbconfig/20210121-115036-root.json [production]
11:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db1085 (re)pooling @ 25%: After moving wikireplicas to another host', diff saved to https://phabricator.wikimedia.org/P13868 and previous config saved to /var/cache/conftool/dbconfig/20210121-113533-root.json [production]
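The four dbctl commits above walk db1085 back into the pool in stages (25% → 50% → 75% → 100%, roughly fifteen minutes apart), so replication lag or query errors surface before the host takes full traffic. A hedged dry-run sketch of that pattern — dbctl is Wikimedia's real conftool frontend, but the flags echoed below are illustrative only:

```shell
# Dry-run sketch of a staged repool; commands are echoed, not executed.
# Host, percentages, and reason string are taken from the logged commits.
host="db1085"
reason="After moving wikireplicas to another host"
for pct in 25 50 75 100; do
  echo "dbctl instance ${host} pool -p ${pct}"
  echo "dbctl config commit -m '${host} (re)pooling @ ${pct}%: ${reason}'"
  # The log shows ~15 minutes between steps, time to watch lag and errors.
done
```

Each commit also saves a diff and the previous config (the Phabricator paste and /var/cache/conftool paths in the log), so any step can be rolled back.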
11:35 <arturo> merging core router firewall changes https://gerrit.wikimedia.org/r/c/operations/homer/public/+/657439 (T209082) [admin]
11:30 <arturo> merging core router firewall changes https://gerrit.wikimedia.org/r/c/operations/homer/public/+/657358 (T272486, T209082) [admin]
11:29 <marostegui> Stop replication on db1085 to move wiki replicas under the other sanitarium host [production]
11:28 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P13867 and previous config saved to /var/cache/conftool/dbconfig/20210121-112849-marostegui.json [production]
11:19 <elukey> block UA with 'python-requests.*' hitting AQS via Varnish [analytics]
11:12 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main'. [production]
11:12 <hnowlan@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main'. [production]
09:44 <hoo> Updated the Wikidata property suggester with data from the 2021-01-11 JSON dump and applied the T132839 workarounds [production]
09:00 <marostegui> m1 master restart - T271540 [production]
08:51 <jynus> stopping puppet and bacula for backup1001 T271540 [production]
08:43 <godog> swift codfw-prod: more weight to ms-be20[58-61] - T269337 [production]
08:37 <marostegui> Silence m1 hosts in preparation for the restart T271540 [production]
08:34 <godog> roll-restart swift-object in codfw to apply new concurrency [production]
07:21 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1099:3318', diff saved to https://phabricator.wikimedia.org/P13864 and previous config saved to /var/cache/conftool/dbconfig/20210121-072101-marostegui.json [production]