__all__ SAL

2019-12-16
13:33	<ema>	cp-ats: rolling ats-backend-restart to apply ram cache size changes T238494	[production]
13:33	<moritzm>	restarting systemd-timesyncd on stat1005	[production]
12:56	<joal>	Kill all oozie jobs after having dumped their statuses	[analytics]
12:52	<elukey>	shutdown of the Analytics Hadoop cluster to enable Kerberos	[production]
12:26	<joal>	Reference for killed backfilling mediarequest-per-file job: https://hue.wikimedia.org/oozie/list_oozie_coordinator/0003296-191212123816836-oozie-oozi-C/	[analytics]
12:23	<joal>	Kill backfilling job for mediarequest-per-file with 2017-07-0[2345] days not done	[analytics]
12:22	<joal>	Rerun cassandra-daily-wf-local_group_default_T_pageviews_per_article_flat-2019-12-15	[analytics]
12:17	<elukey>	kill netflow realtime druid supervisor as prep step for kerberos	[analytics]
12:16	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
12:15	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
12:12	<Urbanecm>	EU SWAT done	[production]
12:11	<urbanecm@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: 026913d: Add no=>nb in $wgInterlanguageLinkCodeMap (T174160) (duration: 00m 53s)	[production]
11:58	<jynus@cumin1001>	dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json	[production]
11:55	<hashar>	Restarting Jenkins completely to flush out stall Gearman functions in Zuul	[production]
11:41	<jdrewniak@deploy1001>	Synchronized portals: Wikimedia Portals Update: [[gerrit:558017|Bumping portals to master (T128546)]] (duration: 00m 52s)	[production]
11:40	<jdrewniak@deploy1001>	Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017|Bumping portals to master (T128546)]] (duration: 00m 56s)	[production]
11:14	<joal>	Clean spark-shell drivers on cluster before kerberos	[analytics]
10:57	<elukey>	disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos	[production]
10:46	<elukey>	stop airflow-* on an-airflow1001	[analytics]
10:41	<XioNoX>	delete virtual chassis ID on asw-d-codfw	[production]
10:41	<elukey>	stop jupyterhub on notebook100[3,4] as prep step for kerberos	[analytics]
10:38	<elukey>	kill Nuria's spark shell application masters in Yarn	[analytics]
10:17	<elukey>	stop hadoop-related timers on stat1007	[analytics]
10:14	<hashar>	Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost)	[production]
10:04	<joal>	Killing user-app eating all cluster (application_1573208467349_190044)	[analytics]
09:50	<marostegui>	Stop replication in the same position in labsdb1010 and labsdb1012 - T238399	[production]
09:35	<hashar>	doc1001: sudo -u doc-uploader rm -fR /srv/docroot/org/wikimedia/doc/DOCKER-mediawiki-core	[releng]
09:24	<hashar>	Reloading Jenkins CI	[production]
09:14	<godog>	upgrade hw raid firmware on ms-be2016 and reboot - T240798	[production]
09:14	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
09:13	<filippo@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
09:05	<joal>	Rerun webrequest-load-wf-text-2019-12-14-18 with updated error-checking parameters (all false positive)	[analytics]
09:04	<Urbanecm>	mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 (T240825)	[production]
08:54	<ema>	cp1077: ats-backend-restart to increase RAM cache size T238494	[production]
08:53	<moritzm>	powercycling ms-be2016 T240798	[production]
08:49	<elukey>	re-run webrequest-load 2019-12-14-13 and 2019-12-15-12 with higher mapreduce limits (modified version of refinery on hdfs /user/elukey with https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/557794/)	[analytics]
08:36	<ema>	cp1075: repool all services T240826	[production]
08:12	<ema>	cp1075: wipe varnish-fe and ats-be caches due to missed purges T240826	[production]
08:08	<ema>	cp1075: manually start vhtcpd.service T240826	[production]
07:52	<ema>	cp1075: depool, vhtcpd not running	[production]
07:38	<marostegui>	Disable auto-learn on db21[03-35] T240823	[production]
07:27	<marostegui>	Disable auto-learn on db[1126-1138].eqiad.wmnet T240823	[production]
07:22	<elukey>	stop camus timers as prep step for maintenance (if we'll do it)	[analytics]
07:13	<_joe_>	restarting cpjobqueue on scb1001 to check if processing rate of recentChanges recovers T240518	[production]
07:11	<marostegui>	Stop replication in the same position in labsdb1010 and labsdb1012 - T238399	[production]
07:09	<onimisionipe>	depool maps2001 for postgres reinit - T239728	[production]
06:59	<onimisionipe>	pool maps2004. osm import is complete - T239728	[production]
06:58	<_joe_>	clearing apcu across multiple api servers to allow metrics to be collected again (task coming soon)	[production]
06:56	<marostegui>	Force re-learn cycle on db1130	[production]