2020-06-16
|
18:18 |
<mutante> |
mw2293 - scap pull (because Icinga reports mismatched MW versions) |
[production] |
18:01 |
<crusnov@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |
17:55 |
<dzahn@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
17:52 |
<crusnov@cumin2001> |
START - Cookbook sre.ganeti.makevm |
[production] |
17:44 |
<ebernhardson@deploy1001> |
Finished deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff (duration: 01m 35s) |
[production] |
17:42 |
<ebernhardson@deploy1001> |
Started deploy [wikimedia/discovery/analytics@f4f5d7b]: airflow: adjust glent legal cutoff |
[production] |
17:32 |
<dzahn@cumin1001> |
START - Cookbook sre.ganeti.makevm |
[production] |
17:03 |
<herron> |
performing rolling reboots of kafka-main hosts for security updates T254990 |
[production] |
16:26 |
<hnowlan@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'changeprop' for release 'production' . |
[production] |
16:26 |
<hnowlan> |
Updating changeprop to new container version with updated dependencies |
[production] |
16:07 |
<hnowlan@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'changeprop' for release 'production' . |
[production] |
16:04 |
<hnowlan@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
16:02 |
<elukey> |
reboot kafka-jumbo1008 for kernel upgrades |
[production] |
15:58 |
<hnowlan@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . |
[production] |
15:49 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1076', diff saved to https://phabricator.wikimedia.org/P11543 and previous config saved to /var/cache/conftool/dbconfig/20200616-154924-marostegui.json |
[production] |
15:45 |
<ebernhardson@deploy1001> |
Finished deploy [wikimedia/discovery/analytics@7d4458c]: Reduce glent maximum yarn resource usage to reasonable levels (duration: 00m 41s) |
[production] |
15:44 |
<ebernhardson@deploy1001> |
Started deploy [wikimedia/discovery/analytics@7d4458c]: Reduce glent maximum yarn resource usage to reasonable levels |
[production] |
15:26 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@c652f62] (thin): Regular analytics weekly THIN train [analytics/refinery@c652f62] (duration: 00m 08s) |
[production] |
15:25 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@c652f62] (thin): Regular analytics weekly THIN train [analytics/refinery@c652f62] |
[production] |
15:23 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@c652f62]: Regular analytics weekly train [analytics/refinery@c652f62] (duration: 07m 56s) |
[production] |
15:20 |
<elukey> |
reboot kafka-jumbo1007 for kernel upgrades |
[production] |
15:15 |
<moritzm> |
upgrading intel-microcode on jessie hosts |
[production] |
15:15 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@c652f62]: Regular analytics weekly train [analytics/refinery@c652f62] |
[production] |
15:06 |
<elukey> |
reboot an-coord1001 for kernel upgrades |
[production] |
14:49 |
<hnowlan@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . |
[production] |
14:49 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:45 |
<moritzm> |
rebooting scandium for kernel security update |
[production] |
14:45 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:43 |
<cdanis> |
repool eqiad T243080 |
[production] |
14:40 |
<papaul> |
power off ms-be2018 for BBU replacement |
[production] |
14:33 |
<cdanis> |
eqiad router upgrades completed! T243080 |
[production] |
14:33 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:31 |
<elukey> |
reboot druid100[7,8] for kernel upgrades |
[production] |
14:28 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:25 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:22 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:15 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1076', diff saved to https://phabricator.wikimedia.org/P11541 and previous config saved to /var/cache/conftool/dbconfig/20200616-141540-marostegui.json |
[production] |
14:14 |
<cdanis> |
T243080 cdanis@re1.cr2-eqiad> request chassis routing-engine master switch |
[production] |
14:10 |
<moritzm> |
removing stray nginx packages from mw canaries (mw1261-mw1265 and mw1276-mw1283) T255565 |
[production] |
14:06 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:03 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:03 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
14:03 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:03 |
<akosiaris@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
14:03 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
13:56 |
<cdanis> |
T243080 cdanis@re0.cr2-eqiad> request chassis routing-engine master switch |
[production] |
13:50 |
<cdanis> |
cr2-eqiad: rebooting RE1 [backup] with new junos version T243080 |
[production] |
13:39 |
<cdanis> |
cr2-eqiad: disable transit/peering BGP & bump fr MED T243080 |
[production] |
13:32 |
<marostegui@cumin2001> |
dbctl commit (dc=all): 'Repool db2092 T254462', diff saved to https://phabricator.wikimedia.org/P11535 and previous config saved to /var/cache/conftool/dbconfig/20200616-133241-marostegui.json |
[production] |
13:17 |
<XioNoX> |
pfw3-eqiad rollback MED to cr1 to 0 - T243080 |
[production] |