2021-02-18
15:30 |
<moritzm> |
installing PHP 7.3 security updates on buster |
[production] |
15:06 |
<godog> |
swift codfw-prod decrease HDD weight for ms-be20[16-27] - T272837 |
[production] |
14:50 |
<arturo> |
rebooting cloudnet1004 for T271058 |
[admin] |
14:35 |
<moritzm> |
installing libzstd security updates on Buster |
[production] |
13:59 |
<moritzm> |
installing intel-microcode security updates on buster |
[production] |
13:49 |
<jynus> |
restart db1150 T271913 |
[production] |
13:10 |
<elukey> |
failover analytics-hive to an-coord1001 after maintenance (DNS change) |
[analytics] |
12:24 |
<arturo> |
delete couple of VMs no longer in use (arturo-puppetmaster, arturo-cloudgw-test) |
[testlabs] |
12:20 |
<jynus> |
restart db1140 T271913 |
[production] |
12:01 |
<urbanecm@deploy1001> |
Synchronized php-1.36.0-wmf.31/includes/HookContainer/DeprecatedHooks.php: 28aa8718549b76c88e9757a273e0c602479b8d8b: Silent deprecate ProtectionForm::buildForm (T274889) (duration: 01m 14s) |
[production] |
11:49 |
<jynus> |
restart db1102 T271913 |
[production] |
11:32 |
<elukey> |
restart hive daemons on an-coord1001 to pick up new parquet settings |
[analytics] |
11:13 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool pc1009 (duration: 01m 09s) |
[production] |
11:04 |
<marostegui> |
Upgrade and reboot pc1009 |
[production] |
11:03 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Depool pc1009 (duration: 01m 08s) |
[production] |
10:47 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: 33ab68f3d54dcb411c47b03fa8e283fa3077ea85: Add https://seer.ufrgs.br to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T270962) (duration: 01m 09s) |
[production] |
10:45 |
<urbanecm@deploy1001> |
Synchronized static/images: d1db3005144c1c6fc212bde49127ea13627857be: Revert "Temporarily add cswiki-black-ribbon.png as a static resource" (duration: 01m 09s) |
[production] |
10:41 |
<jynus> |
restarting dbprov* hosts T271913 |
[production] |
10:34 |
<dcaro@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudmetrics1001.eqiad.wmnet |
[production] |
10:30 |
<oblivian@deploy1001> |
Synchronized wmf-config/ProductionServices.php: Switch restbase calls to envoy (duration: 01m 15s) |
[production] |
10:27 |
<dcaro@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host cloudmetrics1001.eqiad.wmnet |
[production] |
10:25 |
<dcaro> |
Rebooting cloudmetrics1001 to apply new kernel (T275116) |
[admin] |
10:16 |
<dcaro> |
Rebooting cloudmetrics1002 to apply new kernel (T275116) |
[admin] |
10:14 |
<dcaro> |
Upgrading grafana on cloudmetrics1002 (T275116) |
[admin] |
10:12 |
<dcaro> |
Upgrading grafana on cloudmetrics1001 (T275116) |
[admin] |
10:07 |
<elukey> |
hive failover to an-coord1002 to apply new hive settings to an-coord1001 |
[analytics] |
10:00 |
<elukey> |
restart hive daemons on an-coord1002 (standby coord) to pick up new default parquet file format change |
[analytics] |
09:48 |
<jynus> |
restarting backup* hosts T271913 |
[production] |
09:46 |
<elukey> |
upgrade presto to 0.246-wmf on an-coord1001, an-presto*, stat100x |
[analytics] |
09:46 |
<elukey> |
upgrade presto to 0.246-wmf on an-coord1001, an-presto*, stat100x |
[production] |
08:59 |
<Majavah> |
restart both stewardbot and sulwatcher manually |
[tools.stewardbots] |
08:56 |
<dcaro> |
canary instances seem to be stuck, looking (T275111) |
[cloudvirt-canary] |
08:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1090:3312, db1090:3317 T274333', diff saved to https://phabricator.wikimedia.org/P14408 and previous config saved to /var/cache/conftool/dbconfig/20210218-084758-marostegui.json |
[production] |
08:31 |
<marostegui> |
Upgrade kernel on db1154 and db1155 (sanitarium running buster hosts) |
[production] |
08:23 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1003.eqiad.wmnet with reason: REIMAGE |
[production] |
08:21 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1003.eqiad.wmnet with reason: REIMAGE |
[production] |
08:01 |
<godog> |
upgrade grafana* to 7.4.2 - T263747 |
[production] |
07:59 |
<marostegui> |
Reboot es2029, es2030, es2031, es2032, es2033, es2034 for kernel upgrade |
[production] |
07:32 |
<marostegui> |
Reboot es2026, es2027, es2028 for kernel upgrade |
[production] |
06:56 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host archiva1002.wikimedia.org |
[production] |
06:54 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host archiva1002.wikimedia.org |
[production] |
06:53 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet |
[production] |
06:49 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet |
[production] |
06:25 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1075.eqiad.wmnet |
[production] |
06:10 |
<marostegui> |
Reboot dbproxy1014 for kernel upgrade |
[production] |
01:56 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: fe646957eb9b09377b07545ff194a726fd0cc6c7: hewikisource: Allow sysops to grant/revoke reviewer (T274796) (duration: 01m 07s) |
[production] |
01:50 |
<Urbanecm> |
Kill stuck beta-scap-eqiad job and start a new one to sync beta |
[releng] |
01:38 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
01:32 |
<robh@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
00:58 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |