2021-09-28
14:51 <_joe_> restarting pybals in codfw again [production]
14:41 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' . [production]
14:39 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 [production]
14:38 <marostegui> Remove flaggedimages from s5 T290340 [production]
14:36 <_joe_> restarting pybal on lvs2009 [production]
14:34 <_joe_> restarting pybal on lvs1015 [production]
14:33 <hnowlan@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=kartotherian,name=codfw [production]
14:32 <_joe_> restarting pybal on lvs2010 [production]
14:32 <arturo> add packages for buster-wikimedia|thirdparty/kubeadm-k8s-1-20 (T280402) [production]
14:31 <_joe_> restarting pybal on lvs1016 [production]
13:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2080 T290868', diff saved to https://phabricator.wikimedia.org/P17339 and previous config saved to /var/cache/conftool/dbconfig/20210928-134030-marostegui.json [production]
13:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2103 T290865', diff saved to https://phabricator.wikimedia.org/P17337 and previous config saved to /var/cache/conftool/dbconfig/20210928-134012-marostegui.json [production]
13:39 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on centrallog2002.codfw.wmnet with reason: REIMAGE [production]
13:37 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog2002.codfw.wmnet with reason: REIMAGE [production]
13:36 <marostegui@cumin1001> END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host db2103.codfw.wmnet [production]
13:36 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
13:36 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:33 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:33 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
13:30 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
13:03 <marostegui@cumin1001> START - Cookbook sre.experimental.reimage for host db2103.codfw.wmnet [production]
13:01 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@380d165] (duration: 07m 02s) [production]
12:54 <btullis@deploy1002> Started deploy [analytics/refinery@380d165] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@380d165] [production]
12:54 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165] (thin): Regular analytics weekly train THIN [analytics/refinery@380d165] (duration: 00m 07s) [production]
12:53 <btullis@deploy1002> Started deploy [analytics/refinery@380d165] (thin): Regular analytics weekly train THIN [analytics/refinery@380d165] [production]
12:53 <btullis@deploy1002> Finished deploy [analytics/refinery@380d165]: Regular analytics weekly train [analytics/refinery@380d165] (duration: 17m 42s) [production]
12:35 <btullis@deploy1002> Started deploy [analytics/refinery@380d165]: Regular analytics weekly train [analytics/refinery@380d165] [production]
12:29 <akosiaris@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
12:27 <akosiaris@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . [production]
12:11 <urbanecm> [urbanecm@wtp1026 ~]$ sudo -i /usr/local/sbin/restart-php7.2-fpm [production]
12:10 <Lucas_WMDE> lucaswerkmeister-wmde@wtp1026:~$ sudo -u mwdeploy /usr/local/sbin/restart-php7.2-fpm # attempt to solve a recurrence of T290120, but it failed [production]
12:06 <marostegui> Remove flaggedimages from s7 T290340 [production]
12:03 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:00 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:57 <Lucas_WMDE> EU backport+config window done [production]
11:54 <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.38.0-wmf.1/extensions/Wikibase/repo/includes/Store/Sql/SqlSiteLinkConflictLookup.php: Backport: [[gerrit:724370|Use CONN_TRX_AUTOCOMMIT in SqlSiteLinkConflictLookup (T291377)]] (duration: 00m 57s) [production]
11:32 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet [production]
11:31 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:29 <marostegui> Deploy schema change on s3 codfw (lag will show up) T283499 [production]
11:29 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720982|Add support for SectionTranslationTargetLanguages (T290302, T290175)]] (duration: 00m 57s) [production]
11:29 <arturo> cleanup unused repo component buster-wikimedia|thirdparty/kubeadm-k8s-1-18 (T280402) [production]
11:28 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:27 <elukey@cumin1001> END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES codfw cluster: Roll restart of ORES's daemons. - elukey@cumin1001 [production]
11:25 <marostegui> Deploy schema change on s6 codfw (lag will show up) T283499 [production]
11:12 <jmm@cumin2002> START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet [production]
11:09 <ladsgroup@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:723211|Enable new dispatch via job approach on testwikidata and testwiki (T291610)]] (duration: 00m 57s) [production]
11:08 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet [production]
11:07 <elukey@cumin1001> START - Cookbook sre.ores.roll-restart-workers for ORES codfw cluster: Roll restart of ORES's daemons. - elukey@cumin1001 [production]
11:05 <effie> downgrading scap to 3.17.1 on deploy1002 - T291095 [production]
11:01 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1011.eqiad.wmnet [production]