production SAL

2020-11-19 §
08:49	<elukey>	restart hadoop daemons on analytics1058 for openjdk upgrades (canary)	[production]
08:25	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-masters	[production]
08:19	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0)	[production]
08:19	<XioNoX>	eqiad row C: standardize interfaces config	[production]
07:55	<XioNoX>	eqiad row D: move Ganeti/LVS interfaces to individual terms	[production]
07:47	<XioNoX>	eqiad row D: standardize interfaces config	[production]
07:22	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-masters	[production]
07:21	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0)	[production]
07:05	<elukey>	roll restart java daemons on Hadoop test for openjdk upgrades	[production]
07:05	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-workers	[production]
06:22	<marostegui@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)	[production]
06:21	<marostegui>	Remove es1014 from tendril and zarcillo T268102	[production]
06:18	<marostegui@cumin1001>	START - Cookbook sre.hosts.decommission	[production]
06:08	<marostegui>	Stop mysql on db1125:3316 to clone clouddb1015 and clouddb1019, there will be lag on s6 on wikireplicas - T267090	[production]
02:41	<ryankemper@cumin1001>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
01:30	<ryankemper@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
2020-11-18 §
23:34	<mutante>	disabling puppet on memcache::mediawiki - deploying gerrit:637742	[production]
22:56	<dpifke@deploy1001>	Finished deploy [performance/arc-lamp@6bbac6d]: Fix for bytes/str issue after T267269 (duration: 00m 04s)	[production]
22:56	<dpifke@deploy1001>	Started deploy [performance/arc-lamp@6bbac6d]: Fix for bytes/str issue after T267269	[production]
22:24	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
22:22	<robh@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
22:19	<urbanecm@deploy1001>	Synchronized wmf-config/CommonSettings.php: Deploy GlobalWatchlist to beta (noop; T268181) (duration: 01m 04s)	[production]
22:11	<urbanecm@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalWatchlist extension: Prepare IS.php to know relevant variables (noop; T268181) (duration: 01m 06s)	[production]
22:05	<urbanecm@deploy1001>	Synchronized wmf-config/extension-list: Deploy GlobalWatchlist extension to beta: add it to extension-list (T268181) (duration: 01m 05s)	[production]
21:53	<mutante>	mwdebug1003 - restarting ferm because config was generated but service not restarted due to puppet dependency errors, breaking NRPE monitoring T267248	[production]
21:47	<mutante>	mwdebug1003 - scap pull - T267248	[production]
21:40	<mutante>	mw1317,mw1318 - back in action and all monitoring activated again	[production]
21:17	<dzahn@cumin1001>	conftool action : set/weight=10; selector: name=mw1318.eqiad.wmnet,cluster=videoscaler	[production]
21:08	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1317.eqiad.wmnet	[production]
21:08	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1318.eqiad.wmnet	[production]
21:02	<mutante>	mw1317,mw1318 - repooled=no after physical move to rack B	[production]
20:56	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1318.eqiad.wmnet	[production]
20:54	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1317.eqiad.wmnet	[production]
20:27	<mutante>	mw1317, mw1318 shutting down for physical move	[production]
20:21	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw1318.eqiad.wmnet	[production]
20:21	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw1317.eqiad.wmnet	[production]
20:15	<mutante>	mw1317,mw1318 - downtimed and depooled - they are physically moving from B7 to B5 (T266164)	[production]
20:13	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
20:13	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
20:13	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
20:13	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
20:11	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1317.eqiad.wmnet	[production]
20:11	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw1318.eqiad.wmnet	[production]
20:10	<dancy@deploy1001>	Synchronized php: group1 wikis to 1.36.0-wmf.18 (duration: 01m 03s)	[production]
20:09	<dancy@deploy1001>	rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.18	[production]
20:03	<akosiaris@cumin1001>	conftool action : set/pooled=inactive; selector: service=recommendation-api,name=kubernetes.*,dc=codfw	[production]
20:03	<akosiaris@cumin1001>	conftool action : set/weight=0; selector: service=recommendation-api,name=kubernetes.*,dc=codfw	[production]
19:53	<akosiaris@cumin1001>	conftool action : set/pooled=no; selector: service=recommendation-api,name=kubernetes.*,dc=codfw	[production]
19:48	<otto@deploy1001>	Synchronized php-1.36.0-wmf.16/extensions/EventLogging/modules/ext.eventLogging/core.js: EventLogging legacy events should use dt as server side receive time - T240460 (duration: 01m 06s)	[production]
19:45	<otto@deploy1001>	Synchronized php-1.36.0-wmf.18/extensions/EventLogging/modules/ext.eventLogging/core.js: EventLogging legacy events should use dt as server side receive time - T240460 (duration: 01m 07s)	[production]