2019-05-16
|
16:22 |
<XioNoX> |
add BGP session to Hetzner in AMS-IX |
[production] |
16:19 |
<akosiaris> |
switch all etcd* kubestagetcd* servers from "drbd" ganeti disk template to "plain" ganeti disk template |
[production] |
16:17 |
<jbond42> |
reboot ores2001-2002 |
[production] |
16:16 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
16:16 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
15:59 |
<akosiaris> |
build service-checker OCI container 0.0.2 with 0.1.5 service-checker version T220401 |
[production] |
15:49 |
<jforrester@deploy1001> |
Synchronized php-1.34.0-wmf.5/extensions/CirrusSearch/includes/InterwikiSearcher.php: Hot-deploy CirrusSearch interwiki no result UBN T223449 (duration: 00m 49s) |
[production] |
15:45 |
<marostegui> |
Drop the following databases from tendril to recreate them with the right user: db1127, db1129, db1130, db1131, db1137, db1138 |
[production] |
15:35 |
<jforrester@deploy1001> |
Synchronized php-1.34.0-wmf.5/includes/specials/pagers/ContribsPager.php: Hot-deploy Contribs getNamespaceInfo UBN fix T223440 (duration: 00m 53s) |
[production] |
15:25 |
<aborrero@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=labweb1001.wikimedia.org,service=labweb |
[production] |
15:02 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
15:02 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
15:02 |
<jbond42> |
rebooting aqs1009 |
[production] |
14:54 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:54 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:54 |
<jbond42> |
rebooting aqs1008 |
[production] |
14:45 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:45 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:45 |
<jbond42> |
rebooting aqs1007 |
[production] |
14:34 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:34 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:34 |
<jbond42> |
rebooting aqs1006 |
[production] |
14:28 |
<jbond42> |
rebooting aqs1005 |
[production] |
14:21 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:21 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:18 |
<moritzm> |
powercycling mw2199, stuck during reboot |
[production] |
14:08 |
<jbond@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:08 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:07 |
<jbond@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
14:07 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:07 |
<jbond@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
14:07 |
<jbond@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:57 |
<marostegui> |
and recreate the following hosts in tendril: db2103,db2104,db2105,db2106,db2107,db2108,db2109,db2110,db2111,db2112,db2113,db2115,db2116,db2117,db2119 T222772 |
[production] |
13:50 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:50 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:39 |
<cmjohnson1> |
replacing pdu in rack B5 eqiad |
[production] |
13:04 |
<hashar@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.34.0-wmf.5 |
[production] |
13:00 |
<arturo> |
labweb1001 depooled |
[production] |
12:59 |
<mobrovac> |
bootstrap restbase1020-c - T219404 |
[production] |
12:58 |
<aborrero@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=labweb1001.wikimedia.org,service=labweb |
[production] |
12:21 |
<godog> |
stop swift and rsync on ms-be10[16,17,18,32,33] for eqiad B5 pdu replacement - T223126 |
[production] |
12:02 |
<jynus> |
stop and shutdown db1098,db1131,db1139 T223126 |
[production] |
11:56 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
11:55 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
11:54 |
<moritzm> |
rebooting mw app servers in codfw for kernel update |
[production] |
11:32 |
<hoo@deploy1001> |
Synchronized wmf-config/extension-list: Add EntitySchema to extension-list (T221650) (duration: 00m 56s) |
[production] |
11:22 |
<jynus@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Depool db1098 & db1131 for maintenance (duration: 00m 57s) |
[production] |
11:00 |
<arturo> |
T223148 downtime cloudvirt[1014,1028].eqiad.wmnet and labweb1001.wikimedia.org for 8 hours |
[production] |
11:00 |
<aborrero@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
11:00 |
<aborrero@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |