2019-05-16
16:16 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:16 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:59 <akosiaris> build service-checker OCI container 0.0.2 with 0.1.5 service-checker version T220401 [production]
15:49 <jforrester@deploy1001> Synchronized php-1.34.0-wmf.5/extensions/CirrusSearch/includes/InterwikiSearcher.php: Hot-deploy CirrusSearch interwiki no result UBN T223449 (duration: 00m 49s) [production]
15:45 <marostegui> Drop the following databases from tendril to recreate them with the right user: db1127,db1129,db1130,db1131,db1137,db1138 [production]
15:35 <jforrester@deploy1001> Synchronized php-1.34.0-wmf.5/includes/specials/pagers/ContribsPager.php: Hot-deploy Contribs getNamespaceInfo UBN fix T223440 (duration: 00m 53s) [production]
15:25 <aborrero@puppetmaster1001> conftool action : set/pooled=yes; selector: name=labweb1001.wikimedia.org,service=labweb [production]
15:02 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:02 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:02 <jbond42> rebooting aqs1009 [production]
14:54 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:54 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:54 <jbond42> rebooting aqs1008 [production]
14:45 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:45 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:45 <jbond42> rebooting aqs1007 [production]
14:34 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:34 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:34 <jbond42> rebooting aqs1006 [production]
14:28 <jbond42> rebooting aqs1005 [production]
14:21 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:21 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:18 <moritzm> powercycling mw2199, stuck during reboot [production]
14:09 <elukey> restart the webrequest-druid-hourly-coord coordinator with the analytics user [analytics]
14:08 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:08 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:08 <elukey> restart the webrequest-druid-daily-coord coordinator with the analytics user [analytics]
14:07 <jbond@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:07 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:07 <jbond@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
14:07 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:57 <marostegui> and recreate the following hosts in tendril: db2103,db2104,db2105,db2106,db2107,db2108,db2109,db2110,db2111,db2112,db2113,db2115,db2116,db2117,db2119 T222772 [production]
13:57 <elukey> start webrequest-load-bundle from hour 12:00 [analytics]
13:50 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:50 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
13:39 <cmjohnson1> replacing pdu in rack B5 eqiad [production]
13:27 <elukey> chown -R analytics:analytics /user/hive/warehouse/wmf_raw.db on HDFS [analytics]
13:23 <elukey> chown -R analytics:analytics /wmf/data/raw/webrequests_faulty_hosts on HDFS [analytics]
13:07 <elukey> chown -R analytics:analytics /wmf/data/raw/webrequests_data_loss on HDFS [analytics]
13:04 <hashar@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.34.0-wmf.5 [production]
13:00 <arturo> labweb1001 depooled [production]
12:59 <mobrovac> bootstrap restbase1020-c - T219404 [production]
12:58 <aborrero@puppetmaster1001> conftool action : set/pooled=no; selector: name=labweb1001.wikimedia.org,service=labweb [production]
12:57 <elukey> chown -R analytics:analytics-privatedata-users /wmf/data/wmf/webrequest on HDFS [analytics]
12:53 <elukey> kill the webrequest-load-bundle in hue - prep step to migrate the webrequest bundle to the analytics user [analytics]
12:49 <elukey> kill webrequest-load-coord-upload from hue - prep step to migrate the webrequest bundle to the analytics user [analytics]
12:21 <godog> stop swift and rsync on ms-be10[16,17,18,32,33] for eqiad B5 pdu replacement - T223126 [production]
12:02 <jynus> stop and shutdown db1098,db1131,db1139 T223126 [production]
11:56 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
11:55 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]