2021-01-14
|
20:17 |
<mutante> |
ACKing all unhandled crit alerts about systemd on clouddb hosts - notifications are disabled but this cleans up Icinga web UI noise - T267090 |
[production] |
20:15 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: REIMAGE |
[production] |
20:05 |
<razzi@cumin1001> |
START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes - razzi@cumin1001 |
[production] |
19:31 |
<urbanecm@deploy1001> |
Synchronized dblists/closed.dblist: d3e274e9b953f5edda07fa3a016b7291a451ceb2: Close lrcwiki (T272041) (duration: 00m 58s) |
[production] |
19:03 |
<mutante> |
mc1024 - attempting to power on via mgmt; host went down and is powered down |
[production] |
18:45 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2258.codfw.wmnet with reason: REIMAGE |
[production] |
18:43 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2255.codfw.wmnet with reason: REIMAGE |
[production] |
18:41 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2242.codfw.wmnet with reason: REIMAGE |
[production] |
18:41 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2258.codfw.wmnet with reason: REIMAGE |
[production] |
18:40 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2255.codfw.wmnet with reason: REIMAGE |
[production] |
18:39 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2241.codfw.wmnet with reason: REIMAGE |
[production] |
18:38 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2242.codfw.wmnet with reason: REIMAGE |
[production] |
18:38 |
<Amir1> |
started mass deletion of lrcwiki (T272041) - https://w.wiki/uPV |
[production] |
18:37 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2241.codfw.wmnet with reason: REIMAGE |
[production] |
18:36 |
<jynus> |
restarting backup1002, backup2002 T271913 |
[production] |
18:05 |
<jynus> |
restarting backup1001, backup2001 T271913 |
[production] |
16:47 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 10 hosts with reason: upgrading openstack |
[production] |
16:47 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 10 hosts with reason: upgrading openstack |
[production] |
16:47 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 93 hosts with reason: upgrading openstack |
[production] |
16:46 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 93 hosts with reason: upgrading openstack |
[production] |
16:32 |
<moritzm> |
installing php-pear updates on stretch |
[production] |
16:03 |
<moritzm> |
installing tomcat8 security updates |
[production] |
15:40 |
<moritzm> |
installing sqlite3 security updates on stretch |
[production] |
15:30 |
<papaul> |
power down ms-be2022 for maintenance |
[production] |
15:19 |
<otto@deploy1001> |
Finished deploy [analytics/refinery@1117f45]: Explicitly set timeout in banner_activity-druid-monthly-coord - T264358 (duration: 02m 16s) |
[production] |
15:16 |
<otto@deploy1001> |
Started deploy [analytics/refinery@1117f45]: Explicitly set timeout in banner_activity-druid-monthly-coord - T264358 |
[production] |
15:11 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) |
[production] |
15:00 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 93 hosts with reason: upgrading openstack |
[production] |
14:59 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 93 hosts with reason: upgrading openstack |
[production] |
14:59 |
<andrew@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 10 hosts with reason: upgrading openstack |
[production] |
14:59 |
<andrew@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on 10 hosts with reason: upgrading openstack |
[production] |
14:56 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:30 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:28 |
<arturo> |
running homer in asw-b-codfw* (T271519) |
[production] |
14:26 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:24 |
<arturo> |
running homer in asw-b-codfw* (T271519) |
[production] |
14:10 |
<liw@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.26 |
[production] |
14:07 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) |
[production] |
14:06 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |
14:06 |
<hashar@deploy1001> |
Synchronized php-1.36.0-wmf.26/skins/CologneBlue/includes/CologneBlueHooks.php: Edit link may not be present, avoid undefined index notice T271978 (duration: 01m 07s) |
[production] |
13:56 |
<aborrero@cumin2001> |
END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) |
[production] |
13:47 |
<marostegui> |
Restart mysql on db2094 for openssl upgrades test |
[production] |
13:42 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
13:23 |
<moritzm> |
restarting mw canaries for openssl update |
[production] |
13:22 |
<jmm@cumin2001> |
START - Cookbook sre.ganeti.makevm |
[production] |
13:22 |
<aborrero@cumin2001> |
START - Cookbook sre.dns.netbox |
[production] |
13:17 |
<moritzm> |
installing openssl1.0 security updates on stretch |
[production] |
13:15 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |
13:11 |
<moritzm> |
installing xerces-c security updates on stretch |
[production] |
12:50 |
<volans> |
upgraded python3-pynetbox to 5.3.0-1 on all affected hosts - T266487 |
[production] |