2021-09-28
16:38 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
16:28 <mbsantos@deploy1002> Finished deploy [kartotherian/deploy@3e52e0a]: tegola: use global config var for load tests (duration: 00m 14s) [production]
16:28 <mbsantos@deploy1002> Started deploy [kartotherian/deploy@3e52e0a]: tegola: use global config var for load tests [production]
16:27 <bd808@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
16:26 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:21 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
16:19 <mbsantos@deploy1002> Finished deploy [kartotherian/deploy@f35571e] (eqiad): tegola: mirror kartotherian/eqiad traffic to codfw/tegola (duration: 00m 18s) [production]
16:19 <mbsantos@deploy1002> Started deploy [kartotherian/deploy@f35571e] (eqiad): tegola: mirror kartotherian/eqiad traffic to codfw/tegola [production]
16:16 <bd808@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
16:13 <bd808@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
16:12 <jgiannelos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [production]
16:10 <pt1979@cumin2002> END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host mw2412.codfw.wmnet [production]
16:09 <jgiannelos@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [production]
16:07 <jgiannelos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' . [production]
15:53 <pt1979@cumin2002> START - Cookbook sre.experimental.reimage for host mw2412.codfw.wmnet [production]
15:41 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
15:39 <_joe_> restarting pybal on lvs2010 [production]
15:38 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
15:31 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
14:51 <_joe_> restarting pybals in codfw again [production]
14:41 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
14:39 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
14:38 <marostegui> Remove flaggedimages from s5 T290340 [production]
14:36 <_joe_> restarting pybal on lvs2009 [production]
14:34 <_joe_> restarting pybal on lvs1015 [production]
14:33 <hnowlan@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=kartotherian,name=codfw [production]
14:32 <_joe_> restarting pybal on lvs2010 [production]
14:32 <arturo> add packages for buster-wikimedia|thirdparty/kubeadm-k8s-1-20 (T280402) [production]
14:31 <_joe_> restarting pybal on lvs1016 [production]
13:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2080 T290868', diff saved to https://phabricator.wikimedia.org/P17339 and previous config saved to /var/cache/conftool/dbconfig/20210928-134030-marostegui.json [production]
13:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2103 T290865', diff saved to https://phabricator.wikimedia.org/P17337 and previous config saved to /var/cache/conftool/dbconfig/20210928-134012-marostegui.json [production]
13:39 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on centrallog2002.codfw.wmnet with reason: REIMAGE [production]
13:37 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog2002.codfw.wmnet with reason: REIMAGE [production]
13:36 <marostegui@cumin1001> END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host db2103.codfw.wmnet [production]
13:36 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
13:36 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:33 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:33 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
13:30 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:03 <marostegui@cumin1001> START - Cookbook sre.experimental.reimage for host db2103.codfw.wmnet [production]
13:01 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@380d165] (duration: 07m 02s) [production]
12:54 <btullis@deploy1002> Started deploy [analytics/refinery@380d165] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@380d165] [production]
12:54 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165] (thin): Regular analytics weekly train THIN [analytics/refinery@380d165] (duration: 00m 07s) [production]
12:53 <btullis@deploy1002> Started deploy [analytics/refinery@380d165] (thin): Regular analytics weekly train THIN [analytics/refinery@380d165] [production]
12:53 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165]: Regular analytics weekly train [analytics/refinery@380d165] (duration: 17m 42s) [production]
12:35 <btullis@deploy1002> Started deploy [analytics/refinery@380d165]: Regular analytics weekly train [analytics/refinery@380d165] [production]
12:29 <akosiaris@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
12:27 <akosiaris@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
12:11 <urbanecm> [urbanecm@wtp1026 ~]$ sudo -i /usr/local/sbin/restart-php7.2-fpm [production]
12:10 <Lucas_WMDE> lucaswerkmeister-wmde@wtp1026:~$ sudo -u mwdeploy /usr/local/sbin/restart-php7.2-fpm # attempt to solve a recurrence of T290120, but it failed [production]