production SAL

2019-12-16
14:39	<oblivian@deploy1001>	helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .	[production]
14:39	<cdanis@deploy1001>	Synchronized wmf-config/etcd.php: db-codfw: remove dbctl-obsoleted externalLoads section 519e37461 T229686 (duration: 00m 53s)	[production]
14:38	<oblivian@deploy1001>	helmfile [CODFW] Ran 'apply' command on namespace 'blubberoid' for release 'production' .	[production]
14:36	<oblivian@deploy1001>	helmfile [STAGING] Ran 'apply' command on namespace 'blubberoid' for release 'staging' .	[production]
14:35	<XioNoX>	delete virtual chassis ID on asw-a-codfw	[production]
14:34	<XioNoX>	delete virtual chassis ID on asw-b-codfw	[production]
14:32	<XioNoX>	delete virtual chassis ID on asw-c-codfw	[production]
14:30	<cdanis>	manual testing of I219711eb on mwdebug2001	[production]
14:11	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db1127 after testing', diff saved to https://phabricator.wikimedia.org/P9875 and previous config saved to /var/cache/conftool/dbconfig/20191216-141141-marostegui.json	[production]
14:09	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1127 from x1 for testing', diff saved to https://phabricator.wikimedia.org/P9874 and previous config saved to /var/cache/conftool/dbconfig/20191216-140951-marostegui.json	[production]
14:03	<cdanis@deploy1001>	Synchronized wmf-config/etcd.php: enable dbctl for externalLoads 6dfb30c76 T229686 (duration: 00m 53s)	[production]
13:50	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
13:50	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
13:33	<ema>	cp-ats: rolling ats-backend-restart to apply ram cache size changes T238494	[production]
13:33	<moritzm>	restarting systemd-timesyncd on stat1005	[production]
12:52	<elukey>	shutdown of the Analytics Hadoop cluster to enable Kerberos	[production]
12:16	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
12:15	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
12:12	<Urbanecm>	EU SWAT done	[production]
12:11	<urbanecm@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: 026913d: Add no=>nb in $wgInterlanguageLinkCodeMap (T174160) (duration: 00m 53s)	[production]
11:58	<jynus@cumin1001>	dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P9873 and previous config saved to /var/cache/conftool/dbconfig/20191216-115841-jynus.json	[production]
11:55	<hashar>	Restarting Jenkins completely to flush out stall Gearman functions in Zuul	[production]
11:41	<jdrewniak@deploy1001>	Synchronized portals: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 52s)	[production]
11:40	<jdrewniak@deploy1001>	Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:558017| Bumping portals to master (T128546)]] (duration: 00m 56s)	[production]
10:57	<elukey>	disable puppet on labstore100[6,7] and stop analytics-related systemd timers - prep step for Kerberos	[production]
10:41	<XioNoX>	delete virtual chassis ID on asw-d-codfw	[production]
10:14	<hashar>	Restarting CI Jenkins due to out of sync state between Zuul Gearman and what is actually running (some jobs got lost)	[production]
09:50	<marostegui>	Stop replication in the same position in labsdb1010 and labsdb1012 - T238399	[production]
09:24	<hashar>	Reloading Jenkins CI	[production]
09:14	<godog>	upgrade hw raid firmware on ms-be2016 and reboot - T240798	[production]
09:14	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
09:13	<filippo@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
09:04	<Urbanecm>	mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T240825 (T240825)	[production]
08:54	<ema>	cp1077: ats-backend-restart to increase RAM cache size T238494	[production]
08:53	<moritzm>	powercycling ms-be2016 T240798	[production]
08:36	<ema>	cp1075: repool all services T240826	[production]
08:12	<ema>	cp1075: wipe varnish-fe and ats-be caches due to missed purges T240826	[production]
08:08	<ema>	cp1075: manually start vhtcpd.service T240826	[production]
07:52	<ema>	cp1075: depool, vhtcpd not running	[production]
07:38	<marostegui>	Disable auto-learn on db21[03-35] T240823	[production]
07:27	<marostegui>	Disable auto-learn on db[1126-1138].eqiad.wmnet T240823	[production]
07:13	<_joe_>	restarting cpjobqueue on scb1001 to check if processing rate of recentChanges recovers T240518	[production]
07:11	<marostegui>	Stop replication in the same position in labsdb1010 and labsdb1012 - T238399	[production]
07:09	<onimisionipe>	depool maps2001 for postgres reinit - T239728	[production]
06:59	<onimisionipe>	pool maps2004. osm import is complete - T239728	[production]
06:58	<_joe_>	clearing apcu across multiple api servers to allow metrics to be collected again (task coming soon)	[production]
06:56	<marostegui>	Force re-learn cycle on db1130	[production]
06:42	<marostegui>	Depool labsdb1010 - T238399	[production]
06:39	<marostegui>	Recreate views on commonswiki,testcommonswiki for protected_titles on all labsdb hosts - T233135	[production]
06:29	<marostegui>	Remove triggers for ar_comment on db1125:3314 T234704	[production]