2019-05-16
14:45 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:45 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:45 <jbond42> rebooting aqs1007 [production]
14:34 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:34 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:34 <jbond42> rebooting aqs1006 [production]
14:28 <jbond42> rebooting aqs1005 [production]
14:21 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:21 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:18 <moritzm> powercycling mw2199, stuck during reboot [production]
14:09 <elukey> restart the webrequest-druid-hourly-coord coordinator with the analytics user [analytics]
14:08 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:08 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:08 <elukey> restart the webrequest-druid-daily-coord coordinator with the analytics user [analytics]
14:07 <jbond@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:07 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:07 <jbond@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:07 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:57 <marostegui> and recreate the following hosts in tendril: db2103,db2104,db2105,db2106,db2107,db2108,db2109,db2110,db2111,db2112,db2113,db2115,db2116,db2117,db2119 T222772 [production]
13:57 <elukey> start webrequest-load-bundle from hour 12:00 [analytics]
13:50 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:50 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
13:39 <cmjohnson1> replacing pdu in rack B5 eqiad [production]
13:27 <elukey> chown -R analytics:analytics /user/hive/warehouse/wmf_raw.db on HDFS [analytics]
13:23 <elukey> chown -R analytics:analytics /wmf/data/raw/webrequests_faulty_hosts on HDFS [analytics]
13:07 <elukey> chown -R analytics:analytics /wmf/data/raw/webrequests_data_loss on HDFS [analytics]
13:04 <hashar@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.34.0-wmf.5 [production]
13:00 <arturo> labweb1001 depooled [production]
12:59 <mobrovac> bootstrap restbase1020-c - T219404 [production]
12:58 <aborrero@puppetmaster1001> conftool action : set/pooled=no; selector: name=labweb1001.wikimedia.org,service=labweb [production]
12:57 <elukey> chown -R analytics:analytics-privatedata-users /wmf/data/wmf/webrequest on HDFS [analytics]
12:53 <elukey> kill the webrequest-load-bundle in hue - prep step to migrate the webrequest bundle to the analytics user [analytics]
12:49 <elukey> kill webrequest-load-coord-upload from hue - prep step to migrate the webrequest bundle to the analytics user [analytics]
12:21 <godog> stop swift and rsync on ms-be10[16,17,18,32,33] for eqiad B5 pdu replacement - T223126 [production]
12:02 <jynus> stop and shutdown db1098,db1131,db1139 T223126 [production]
11:56 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:55 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
11:54 <moritzm> rebooting mw app servers in codfw for kernel update [production]
11:32 <hoo@deploy1001> Synchronized wmf-config/extension-list: Add EntitySchema to extension-list (T221650) (duration: 00m 56s) [production]
11:22 <chicocvenancio> PAWS: restart hub to get new configured announcement [tools]
11:22 <jynus@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1098 & db1131 for maintenance (duration: 00m 57s) [production]
11:05 <chicocvenancio> PAWS: change configmap to reference WMHACK 2019 as busiest time [tools]
11:00 <arturo> T223148 downtime cloudvirt[1014,1028].eqiad.wmnet and labweb1001.wikimedia.org for 8 hours [production]
11:00 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:00 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:00 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:00 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime [production]
11:00 <aborrero@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:00 <aborrero@cumin1001> START - Cookbook sre.hosts.downtime [production]
10:50 <godog> bootstrap restbase1020-b - T219404 [production]