2020-06-25
|
10:00 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
10:00 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
10:00 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
10:00 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
09:59 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
09:58 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
09:57 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
09:57 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
09:53 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
09:37 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
09:34 |
<volans@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
09:28 |
<akosiaris> |
schedule downtime for eqiad wikifeeds as it's flapping too much without yet knowing why. T256358 |
[production] |
09:28 |
<godog> |
extend lv on thanos-fe2001 and restart thanos-compact |
[production] |
09:21 |
<vgutierrez> |
rolling restart of ncredir instances to catch up on kernel updates |
[production] |
09:13 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@4aba370] (thin): Analytics fix over weekly train THIN [analytics/refinery@4aba370] (duration: 00m 10s) |
[production] |
09:13 |
<joal@deploy1001> |
Started deploy [analytics/refinery@4aba370] (thin): Analytics fix over weekly train THIN [analytics/refinery@4aba370] |
[production] |
09:13 |
<joal@deploy1001> |
Finished deploy [analytics/refinery@4aba370]: Analytics fix over weekly train [analytics/refinery@4aba370] (duration: 16m 27s) |
[production] |
09:01 |
<vgutierrez> |
restarting acme-chief instances to catch up on kernel updates |
[production] |
08:56 |
<joal@deploy1001> |
Started deploy [analytics/refinery@4aba370]: Analytics fix over weekly train [analytics/refinery@4aba370] |
[production] |
08:42 |
<hashar> |
releases2002: restarted bacula-fd to take into account the puppet provided configuration # T247652 |
[production] |
08:14 |
<jynus> |
restarting bacula-dir on backup1001 |
[production] |
08:09 |
<akosiaris> |
restart etherpad-lite on etherpad1002 |
[production] |
08:03 |
<marostegui> |
Failover m1 from db1135 to db1097 - T254556 |
[production] |
07:52 |
<jynus> |
stop bacula-director on backup1001 for db maintenance T254556 |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:49 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:49 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:49 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:48 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:48 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) |
[production] |
07:47 |
<akosiaris@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
07:36 |
<elukey> |
reboot an-launcher1001 for kernel upgrades |
[production] |
07:18 |
<elukey> |
reboot kafkamon* vms for kernel upgrades |
[production] |
07:08 |
<marostegui> |
Start pre switchover steps on m1 T254556 |
[production] |
06:40 |
<elukey> |
reboot matomo1002 for kernel upgrades |
[production] |
06:35 |
<elukey> |
reboot archiva1002 (new vm, not yet in service) for kernel upgrades |
[production] |
06:34 |
<elukey> |
reboot archiva for kernel upgrades |
[production] |
06:31 |
<elukey> |
force puppet run on ores1003/1005 to restore celery (killed by the oom) |
[production] |
06:24 |
<elukey> |
reboot an-tool* vms for kernel upgrades |
[production] |
06:23 |
<elukey> |
reboot analytics-tool1004 for kernel upgrades (Superset host) |
[production] |
06:22 |
<elukey> |
reboot analytics-tool1001 for kernel upgrades |
[production] |
06:19 |
<elukey> |
execute ip addr flush ens5 on an-airflow1001 to clear RTNETLINK answers: File exists (error from ifup@ens5.service) |
[production] |
06:03 |
<elukey> |
reboot an-airflow1001 for kernel upgrades |
[production] |
04:26 |
<marostegui> |
Remove triggers from db2095:3312 - T238966 |
[production] |
04:25 |
<marostegui> |
Deploy schema change on s2 codfw - T238966 |
[production] |
00:48 |
<twentyafterfour> |
restart php-fpm on phab1001 to fix T256343 |
[production] |
00:12 |
<twentyafterfour> |
phabricator updated, all seems normal |
[production] |
00:11 |
<twentyafterfour> |
updating phabricator to release/2020-06-25/1, momentary (<1 minute) downtime expected. |
[production] |