2019-12-16
14:03 <cdanis@deploy1001> Synchronized wmf-config/etcd.php: enable dbctl for externalLoads 6dfb30c76 T229686 (duration: 00m 53s) [production]
13:59 <arturo> powering down `puppet-stretch-test` VM to test stuff related to T240851 [testlabs]
13:50 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:50 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:33 <ema> cp-ats: rolling ats-backend-restart to apply ram cache size changes T238494 [production]
13:33 <moritzm> restarting systemd-timesyncd on stat1005 [production]
12:56 <joal> Kill all oozie jobs after having dumped their statuses [analytics]
12:52 <elukey> shutdown of the Analytics Hadoop cluster to enable Kerberos [production]
12:26 <joal> Reference for killed backfilling mediarequest-per-file job: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0003296-191212123816836-oozie-oozi-C/ [analytics]
12:26 <joal> Reference for killed backfillin jo [analytics]
12:23 <joal> Kill backfilling job for mediarequest-per-file with 2017-07-0[2345] days not done [analytics]
12:22 <joal> Rerun cassandra-daily-wf-local_group_default_T_pageviews_per_article_flat-2019-12-15 [analytics]
12:17 <elukey> kill netflow realtime druid supervisor as prep step for kerberos [analytics]
12:16 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:15 <elukey@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:12 <Urbanecm> EU SWAT done [production]
12:11 <urbanecm@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: 026913d: Add no=>nb in $wgInterlanguageLinkCodeMap (T174160) (duration: 00m 53s) [production]
11:58 <jynus@cumin1001> dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json [production]
11:55 <hashar> Restarting Jenkins completely to flush out stall Gearman functions in Zuul [production]
11:41 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 52s) [production]
11:40 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 56s) [production]
11:14 <joal> Clean spark-shell drivers on cluster before kerberos [analytics]
10:57 <elukey> disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos [production]
10:46 <elukey> stop airflow-* on an-airflow1001 [analytics]
10:41 <XioNoX> delete virtual chassis ID on asw-d-codfw [production]
10:41 <elukey> stop jupyterhub on notebook100[3,4] as prep step for kerberos [analytics]
10:38 <elukey> kill Nuria's spark shell application masters in Yarn [analytics]
10:17 <elukey> stop hadoop-related timers on stat1007 [analytics]
10:14 <hashar> Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost) [production]
10:04 <joal> Killing user-app eating all cluster (application_1573208467349_190044) [analytics]
09:50 <marostegui> Stop replication in the same position in labsdb1010 and labsdb1012 - T238399 [production]
09:35 <hashar> doc1001: sudo -u doc-uploader rm -fR /srv/docroot/org/wikimedia/doc/DOCKER-mediawiki-core [releng]
09:24 <hashar> Reloading Jenkins CI [production]
09:14 <godog> upgrade hw raid firmware on ms-be2016 and reboot - T240798 [production]
09:14 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
09:13 <filippo@cumin1001> START - Cookbook sre.hosts.downtime [production]
09:05 <joal> Rerun webrequest-load-wf-text-2019-12-14-18 with updated error-checking parameters (all false positive) [analytics]
09:04 <Urbanecm> mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 (T240825) [production]
08:54 <ema> cp1077: ats-backend-restart to increase RAM cache size T238494 [production]
08:53 <moritzm> powercycling ms-be2016 T240798 [production]
08:49 <elukey> re-run webrequest-load 2019-12-14-13 and 2019-12-15-12 with higher mapreduce limits (modified version of refinery on hdfs /user/elukey with https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/557794/) [analytics]
08:36 <ema> cp1075: repool all services T240826 [production]
08:12 <ema> cp1075: wipe varnish-fe and ats-be caches due to missed purges T240826 [production]
08:08 <ema> cp1075: manually start vhtcpd.service T240826 [production]
07:52 <ema> cp1075: depool, vhtcpd not running [production]
07:38 <marostegui> Disable auto-learn on db21[03-35] T240823 [production]
07:27 <marostegui> Disable auto-learn on db[1126-1138].eqiad.wmnet T240823 [production]
07:22 <elukey> stop camus timers as prep step for maintenance (if we'll do it) [analytics]
07:13 <_joe_> restarting cpjobqueue on scb1001 to check if processing rate of recentChanges recovers T240518 [production]
07:11 <marostegui> Stop replication in the same position in labsdb1010 and labsdb1012 - T238399 [production]