2020-09-04
22:15 |
<ryankemper> |
wdqs deploy complete, service is healthy |
[production] |
21:54 |
<ryankemper> |
`sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 60 && systemctl restart wdqs-categories && sleep 30 && pool'` |
[production] |
21:52 |
<ryankemper> |
`sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
21:49 |
<ryankemper@deploy1001> |
Finished deploy [wdqs/wdqs@c7e6b35]: 0.3.47 (duration: 12m 55s) |
[production] |
21:37 |
<ryankemper> |
Tests on canary `wdqs1003` passing, beginning full wdqs deploy |
[production] |
21:36 |
<ryankemper@deploy1001> |
Started deploy [wdqs/wdqs@c7e6b35]: 0.3.47 |
[production] |
21:31 |
<ryankemper> |
`ryankemper@wdqs2002:~$ sudo systemctl restart wdqs-blazegraph` |
[production] |
21:06 |
<mutante> |
apt1001 - removed all libnginx-mod* packages except libnginx-mod-http-echo ; sudo apt-get autoremove ; run puppet ; restarted nginx - apt.wikimedia.org switched to nginx-light (T261962) |
[production] |
21:02 |
<mutante> |
apt1001 - remove all libnginx-mod* packages except libnginx-mod-http-echo |
[production] |
20:59 |
<mutante> |
apt2001 - sudo apt-get autoremove |
[production] |
20:51 |
<mutante> |
apt2001 - apt-get remove --purge libnginx* and run puppet to replace nginx-full with nginx-light (T261962) |
[production] |
20:43 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:41 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:39 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:38 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:38 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:36 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:36 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
20:35 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
20:34 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:32 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:31 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:31 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:30 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:30 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:05 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:04 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
20:03 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:01 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
20:01 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
20:00 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
19:59 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
19:59 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
19:57 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
19:57 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
19:22 |
<mutante> |
Icinga - ACKing with sticky - alerts on test and dev hosts |
[production] |
18:10 |
<milimetric@deploy1001> |
Finished deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing (duration: 07m 35s) |
[production] |
18:02 |
<milimetric@deploy1001> |
Started deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing |
[production] |
10:31 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) |
[production] |
10:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1087 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12492 and previous config saved to /var/cache/conftool/dbconfig/20200904-102955-marostegui.json |
[production] |
10:28 |
<marostegui> |
Deploy MCR schema change on db1087 (sanitarium master), this will generate lag (probably a few days) on s8 labsdb hosts T238966 |
[production] |
09:48 |
<marostegui> |
Restart prometheus-mysqld-exporter on db2125 |
[production] |
09:11 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers |
[production] |
08:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) |
[production] |
08:31 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers |
[production] |
08:29 |
<elukey> |
roll restart of the hadoop workers (test and analytics cluster) for openjdk upgrades |
[production] |
08:08 |
<moritzm> |
installing 4.19.132 kernel on buster systems (only installing the deb, reboots separately) |
[production] |
07:30 |
<moritzm> |
installing 4.9.228 kernel on stretch systems (only installing the deb, reboots separately) |
[production] |
05:13 |
<marostegui> |
Deploy MCR schema change on s4 eqiad master T238966 |
[production] |
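
Note on the 21:54 entry: `cumin -b 1` runs the command with a batch size of one, so each host goes through depool, wait, restart, wait, pool before the next host starts. A minimal sketch of that ordering follows. The host names and the `run_on_host` helper are hypothetical stand-ins that only echo what would run; nothing here calls cumin, and cumin's own failure handling may differ from the simple stop-on-first-failure shown.

```shell
#!/bin/sh
# Hypothetical stand-in for executing a command on a remote host.
# It only prints the host and the command; nothing is actually run remotely.
run_on_host() {
    host="$1"; shift
    echo "[$host] $*"
}

# Hypothetical host names, for illustration only (not taken from the log).
HOSTS="wdqs-example-a wdqs-example-b"

# One host at a time (batch size 1): each step must succeed before the next,
# mirroring the '&&' chain in the logged command.
rolling_restart() {
    for host in $HOSTS; do
        run_on_host "$host" depool &&
        run_on_host "$host" sleep 60 &&
        run_on_host "$host" systemctl restart wdqs-categories &&
        run_on_host "$host" sleep 30 &&
        run_on_host "$host" pool || return 1
    done
}

rolling_restart
```

The 60-second pause after `depool` presumably lets in-flight queries drain, and the 30-second pause before `pool` gives the restarted service time to come up; both readings are assumptions about intent, not something the log states.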